-
lucifer
ah cool, waiting for confirmation on that then.
-
akshaaatt
But it won't be a part of prod anyway rn lucifer
-
lucifer
oh ok, releasing then.
-
akshaaatt
Great!
-
mayhem
lucifer: artist_similarity_1h and artist_similarity_2h have been computed using the same alg now.
-
atj
zas: OK, just pushed some updates
-
mayhem
I think the results look rather promising -- the biggest issue I see is that of popularity bias.
-
atj
I'm just using the netplan_configuration variable, no magic involved
-
mayhem
but calculating a popularity rating for each artist and then negatively weighting popular artists should take care of this, I think.
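
A minimal sketch of the weighting idea mayhem describes, not the code he actually ran. The input shape (`user_id` mapped to a set of artists), the function name `similar_artists` and the exponent `alpha` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def similar_artists(user_artists, alpha=0.5):
    """Score artist pairs by shared listeners, penalising popular artists.

    user_artists: dict of user_id -> set of artist names (hypothetical input).
    alpha: strength of the popularity penalty; 0 disables it.
    """
    popularity = Counter()  # number of users who listened to each artist
    together = Counter()    # number of users who listened to both artists
    for artists in user_artists.values():
        popularity.update(artists)
        for a, b in combinations(sorted(artists), 2):
            together[(a, b)] += 1

    scores = {}
    for (a, b), shared in together.items():
        # Dividing by popularity**alpha down-weights artists everyone plays.
        scores[(a, b)] = shared / ((popularity[a] * popularity[b]) ** alpha)
    return scores


listens = {
    1: {"artist_a", "artist_b", "popular"},
    2: {"artist_a", "artist_b", "popular"},
    3: {"popular", "artist_c"},
    4: {"popular", "artist_d"},
    5: {"popular"},
}
for pair, score in sorted(similar_artists(listens).items()):
    print(pair, round(score, 3))
```

With `alpha=0` the pairs (artist_a, artist_b) and (artist_a, popular) tie on raw shared listeners; with the penalty switched on, the pair that does not involve the popular artist ranks higher.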
-
zas
atj: aphex in hosts matches your ssh config right?
-
atj
yes
-
lucifer
mayhem: i see makes sense. this is from user_id right?
-
atj
try running "ansible-playbook bootstrap.yml -CD"
-
mayhem
yes, user_id.
-
lucifer
nice
-
mayhem
I'll try making a recording similarity one later today or next week.
-
I think we should keep working in python for a little while longer and feel comfortable with it. then move it to spark.
-
lucifer
makes sense
-
atj
zas: let me know when you're happy and I'll try applying the netplan configuration
-
looks like your run didn't error
-
zas
enp9s0 has no set-name in the netplan config
-
atj
reminds me of the Windows XP days, when you had to race to apply updates after install before it got hacked
-
yeah, I left that out intentionally, but I'm being over cautious. I'll add it in
-
OK, done and pushed
-
zas
we'll see, it is supposed to work, but it didn't work for me last time I tried...
-
atj
I'm going to apply the netplan config
-
zas
ok
-
please paste commands you run here
-
atj
doesn't fill me with confidence
-
I'm going to run "ansible-playbook bootstrap.yml -t netplan"
-
zas
ok
-
atj
normally we'd be adding a "-l aphex" to target the server specifically, but that's not needed
-
OK, it failed, but gracefully!
-
"Error in network definition: enp4s0: 'set-name:' requires 'match:' properties"
-
zas
ah yes
-
atj
that seems to contradict the man page
-
I'll add a match key
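
The rule behind the error quoted above can be illustrated with a small check. This is only a sketch and not netplan's own validator: `set_name_errors` is a hypothetical helper, `ext0` is an invented interface name, and the MAC address is a placeholder from the documentation range rather than one of the real ones.

```python
def set_name_errors(netplan):
    """Return a message for every ethernet that uses set-name without match."""
    errors = []
    ethernets = netplan.get("network", {}).get("ethernets", {})
    for ident, cfg in ethernets.items():
        if "set-name" in cfg and not cfg.get("match"):
            errors.append(f"{ident}: 'set-name:' requires 'match:' properties")
    return errors


# The configuration as a plain dict, i.e. what the YAML would load into.
broken = {
    "network": {
        "version": 2,
        "ethernets": {"enp4s0": {"set-name": "ext0"}},  # ext0 is hypothetical
    }
}
fixed = {
    "network": {
        "version": 2,
        "ethernets": {
            "enp4s0": {
                "match": {"macaddress": "00:00:5e:00:53:01"},  # placeholder MAC
                "set-name": "ext0",
            }
        },
    }
}

print(set_name_errors(broken))
print(set_name_errors(fixed))
```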
-
zas
we can store mac address in a variable
-
the ext one is a8:a1:59:8e:bc:5e
-
int one is 6c:b3:11:0f:a3:39
-
atj
alright
-
just pushed, can you sanity check?
-
zas
I don't think we need double quotes around mac addresses (do we?)
-
atj
I think the ":" might confuse the YAML parser
-
(possibly)
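
On the quoting question: one reason to keep the quotes is that YAML 1.1 has a base-60 integer form, so a colon-separated scalar made only of digits can load as a number instead of a string. The sketch below uses the sexagesimal integer pattern as I understand it from the YAML 1.1 type rules (PyYAML, which Ansible uses, appears to follow it). The helper name `loads_as_sexagesimal_int` is made up, and netplan's own parser may behave differently.

```python
import re

# Base-60 integer form from YAML 1.1, e.g. 190:20:30
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def loads_as_sexagesimal_int(scalar):
    """True if an unquoted scalar would match the YAML 1.1 base-60 int form."""
    return bool(SEXAGESIMAL_INT.match(scalar))


for mac in ("a8:a1:59:8e:bc:5e", "6c:b3:11:0f:a3:39", "12:34:56:18:20:12"):
    print(mac, loads_as_sexagesimal_int(mac))
```

The two addresses from this log contain hex letters, so they would stay strings even unquoted. An address that happens to consist only of digits would not, which makes quoting every MAC address the safer habit.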
-
zas
ok, apart from that, looks good
-
atj
ok, I think the network is down
-
zas
yup, let me see
-
atj
hopefully a reboot might bring it back
-
zas
let's try
-
if not, I'll execute rescue system
-
atj
I just tried using set-name on a VM, and I had a VLAN interface setup, which then confuses it because the macaddress matches 2 interfaces...
-
then I tried matching on name, which works fine the first time and then fails because the interface has been renamed
-
zas
this set-name feature doesn't look safe to me, I gave up on previous servers I configured with netplan
-
atj
yeah, I think you might be right, annoying
-
zas
so maybe we should just store the names of the interfaces in variables and use them
-
still doesn't ping after a reset
-
atj
gah
-
I'll start adding some firewall configuration while it resets
-
zas
I changed netplan/ansible.yml from rescue system, removed match/set-name parts, let's see if it is enough for it to reboot
-
atj
fingers crossed
-
zas
doesn't work...
-
ok, I'll chroot and ensure netplan cfg is correct
-
atj
sorry :(
-
zas
np, that's more or less expected at this stage
-
(plus, netplan sucks)
-
netplan generate --debug does not report any issues...
-
lucifer
mayhem: prod updated with listen timestamps PR.
-
atj
zas: if there were a syntax error the role would have caught it and not applied the configuration
-
so I'm wondering if the settings themselves are wrong in some way
-
zas
likely, but where?
-
I tried another reboot, no success either
-
atj
maybe try removing the IPv6 stuff
-
could be the default gateway configuration
-
I noticed in the original file it used gateway6 but not gateway4, even though both are apparently deprecated...
-
zas
here is what I have on aretha:
-
(local network is in another file)
-
I'll mix both, and try again
-
atj
I noticed the extra indentation under addresses etc, but I checked with a YAML parser and both are valid
-
I don't get why it uses the "on-link: true, to: 0.0.0.0/0, via: 138.201.203.1" route instead of default
-
zas
I did the changes, let's see if we can get a ping
-
I removed the part concerning the second interface for now
-
copied config from aretha, just replaced the IPs
-
chrooted + netplan generate, no error
-
atj
OK, I'll use netplan try once we have it working again, to work out what the issue is
-
zas
grrr, no ping, I wonder if the file is actually applied
-
atj
if you boot back into rescue and chroot, can you run "journalctl -b-1" and see if there is anything useful?
-
zas
I rebooted into rescue, I added your key
-
I'll mount /dev/md3 on /mnt
-
we can proceed to a fresh install if nothing works
-
atj
OK thanks
-
atj
-
heh
-
I think it's IPv6 related
-
zas
ok, found the issue
-
atj
don't tell me it's a typo
-
zas
interface name is incorrect in the last config
-
I reboot, let's see
-
atj
gateway6 is set to a LL address, which seems odd
-
actually, maybe not
-
it's working
-
zas
ok this one works
-
I left out the second interface for now
-
but we can use it as basis
-
atj
OK, I'll change the ansible config to match this
-
I've backed up the working configuration, and will use netplan try to manually test the new one
-
OK, we're good
-
zas
ok, push your changes, and we can continue
-
I guess the next step is shorewall
-
atj
I think we just run the entire playbook
-
I'm creating a physical_servers group, on the assumption they all have two NICs
-
is that reasonable at this stage?
-
zas
yes
-
atj
just about to push some commits then I'll run the entire playbook through
-
zas
ok
-
atj
pushed
-
I've created a host level network_interfaces variable, which is then used for the shorewall configuration in group_vars/physical_servers.yml
-
zas
can we use those variables in netplan_configuration ?
-
atj
I don't think so as keys don't get interpolated.
-
zas
ok
-
atj
annoying
-
there may be a better way, but it's iterative
-
zas
that's ok, let's keep it simple for now
-
atj
new failure: sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_generic_timeout: No such file or directory
-
I guess the conntrack modules aren't loaded
-
maybe sysctl need not be in bootstrap?
-
it's not really essential to get the system up and running
-
zas
nope, that's tuning, so it can be run after the bootstrap
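
The failure above comes from a sysctl key whose file under /proc/sys does not exist yet, presumably because the conntrack module is not loaded. A minimal sketch of how one could separate keys that can be applied now from keys that have to wait; `sysctl_path` and `split_applicable` are hypothetical helpers and not part of the playbook.

```python
import os


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file, e.g. net.ipv4.ip_forward."""
    # Simplification: keys that embed a dotted interface name are not handled.
    return os.path.join(root, *key.split("."))


def split_applicable(settings, root="/proc/sys"):
    """Return (settings whose file exists, sorted list of missing keys)."""
    present = {
        key: value
        for key, value in settings.items()
        if os.path.exists(sysctl_path(key, root))
    }
    missing = sorted(set(settings) - set(present))
    return present, missing


wanted = {"net.netfilter.nf_conntrack_generic_timeout": "120"}  # value is illustrative
present, missing = split_applicable(wanted)
print("apply now:", present)
print("defer until the module is loaded:", missing)
```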
-
atj
ok, pushed the change and re-running
-
woohoo, it completed successfully
-
you should be able to SSH in now as your normal user
-
works for me
-
I added a task to run inxi and download the output into servers/<nodename>/inxi.txt
-
zas
but root access is still possible, we usually disable it after users are set up
-
atj
just pushed that too
-
so you want "PermitRootLogin no" in sshd_config?
-
zas
nope; only by key
-
(initial setup can be done by key or by password, we want to ensure password access is disabled after initial setup)
-
atj
OK, the default value is "PermitRootLogin without-password"
-
but the default for "PasswordAuthentication" is yes
-
I'll disable password authentication globally
-
zas
good
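
A rough sketch of how the two settings discussed above combine. It assumes the defaults documented for current OpenSSH, where the PermitRootLogin default is spelled "prohibit-password" and "without-password" is a deprecated alias for it. `effective_settings` is a hypothetical helper that ignores Include files, Match blocks and the `key=value` spelling.

```python
# Assumed defaults, taken from the sshd_config documentation.
DEFAULTS = {
    "permitrootlogin": "prohibit-password",
    "passwordauthentication": "yes",
}


def effective_settings(config_text):
    """Return the global values sshd would use for the two keywords above."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        # sshd keeps the first value it obtains for each keyword.
        found.setdefault(key, value)

    result = {key: found.get(key, default) for key, default in DEFAULTS.items()}
    if result["permitrootlogin"] == "without-password":
        result["permitrootlogin"] = "prohibit-password"  # deprecated alias
    return result


print(effective_settings(""))
print(effective_settings("PasswordAuthentication no\n"))
```

With an empty configuration, root can log in by key only while ordinary users can still use passwords. Setting `PasswordAuthentication no` globally closes that gap.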