so yes, it called start_registrator, which removes the running container and starts a new one
2021-02-10 04113, 2021
alastairp
this is how I've always done it. is there a better way that doesn't try and restart all services?
2021-02-10 04124, 2021
zas
I just copy the run script (cp cage.sh cage_tmp.sh), remove unneeded lines to keep only the concerned service, and run it. If all is OK, I copy the changes to the real scripts, remove the tmp script, commit the changes, and push
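A rough sketch of the temporary-script workflow zas describes. It assumes the run script starts each service with a single `start_<service>` line; the real layout of cage.sh is not shown in this log, and `make_tmp_script` and the `start_jenkins` line are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes each service is started by one line such as
# "start_jenkins" in the run script. The real cage.sh may look different.

# make_tmp_script SRC SERVICE
# Copies the shebang, any "source"/"." lines, and the line that starts
# SERVICE into a sibling file with a "_tmp" suffix, then prints its name.
make_tmp_script() {
    local src="$1" service="$2"
    local tmp="${src%.sh}_tmp.sh"
    grep -E "^(#!|source |\. |start_${service}( |\$))" "$src" > "$tmp"
    chmod +x "$tmp"
    echo "$tmp"
}
```

Something like `tmp=$(make_tmp_script cage.sh jenkins)` would then leave a cage_tmp.sh to review and run by hand before the change is copied back to the real script.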
2021-02-10 04135, 2021
zas
but we could seriously improve this... ;)
2021-02-10 04159, 2021
alastairp
right
2021-02-10 04153, 2021
alastairp
I guess we could also source services.sh in the shell and just run the start_ method too
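A minimal sketch of that idea, assuming services.sh only defines `start_<service>` functions and does not start anything when sourced (the log does not confirm this). `run_one_service` is a hypothetical helper, not something from the real scripts.

```shell
#!/usr/bin/env bash
# Sketch only: the function names inside the real services.sh are not
# shown in this log, so "start_<service>" is an assumed naming pattern.

# run_one_service SERVICES_FILE SERVICE
# Sources SERVICES_FILE in a subshell and calls start_SERVICE if it exists.
run_one_service() {
    local services_file="$1" service="$2"
    (
        # The subshell keeps the sourced definitions out of the caller's shell.
        . "$services_file" || exit 1
        if ! command -v "start_${service}" > /dev/null; then
            echo "no start_${service} function in ${services_file}" >&2
            exit 1
        fi
        "start_${service}"
    )
}
```

For example, `run_one_service ./services.sh registrator` would restart just that one service, assuming a `start_registrator` function is defined there.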
[brainzutils-python] alastair merged pull request #45 (master…version-ranges): Use open-ended versions so that downstream projects can pin exact versions https://github.com/metabrainz/brainzutils-python/…
do you want to test LB and CB again with the latest master (check that it installs, check that stuff like CB database lookups work), and confirm with me?
2021-02-10 04159, 2021
alastairp
then we can merge the LB change, I'll schedule a CB merge party for Friday
2021-02-10 04107, 2021
_lucifer
CPU usage on cage spiked again
2021-02-10 04153, 2021
_lucifer
alastairp: yes that's for BU. sure, i'll check again with LB and then later for CB as well
2021-02-10 04127, 2021
_lucifer
zas: ^^
2021-02-10 04119, 2021
alastairp
let me turn jenkins workers down to 3
2021-02-10 04129, 2021
ruaok
and with CPU spikes come MB service degradations. boo.
2021-02-10 04114, 2021
ruaok
zas: do you have a machine at Hetzner dedicated to "processing" tasks that are not impacting services with sensitive response times?
2021-02-10 04135, 2021
zas
ruaok: best choice atm would be paco
2021-02-10 04108, 2021
zas
gateways-redis isn't used anymore (but I keep it for now in case keydb has issues)
2021-02-10 04119, 2021
ruaok looks
2021-02-10 04124, 2021
zas
and there's pg-williams (<-- why???)
2021-02-10 04132, 2021
ruaok
lol.
2021-02-10 04145, 2021
ruaok
yes, that used to be on.. williams. but the service was never renamed.
2021-02-10 04104, 2021
zas
services shouldn't be named after machines
2021-02-10 04111, 2021
ruaok
agreed.
2021-02-10 04111, 2021
zas
but nvm
2021-02-10 04147, 2021
ruaok
ok, looking at the weekly load graph, I see that paco would be pretty good.
2021-02-10 04114, 2021
ruaok
there are two load spikes that go to 2, and no time-sensitive services.
2021-02-10 04146, 2021
ruaok
alastairp: what is needed to move the service from cage to paco? just editing the nodes file and stopping/starting services or more than that?
ruaok: we need to copy the jenkins-data volume too: /var/lib/docker/volumes/jenkins-data/_data
2021-02-10 04123, 2021
ruaok
want me to do that?
2021-02-10 04137, 2021
alastairp
yes please, I don't really have any time left today to look at this
2021-02-10 04141, 2021
ruaok
ok.
2021-02-10 04152, 2021
ruaok
what is the procedure for migrating a volume?
2021-02-10 04157, 2021
alastairp
rsync :)
2021-02-10 04103, 2021
ruaok
take containers down, create new volume, rsync?
2021-02-10 04125, 2021
alastairp
note that ci.metabrainz.org is magic based on the service existing, so you'll have to shut down on cage before starting up on paco, then it should just magically work
2021-02-10 04132, 2021
alastairp
at least that's what happened when I migrated from williams
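The steps discussed above (stop on cage, create the volume on paco, rsync the data) could be sketched roughly as follows. This is a dry-run sketch under stated assumptions: the container name `jenkins` is hypothetical, ssh access from cage to paco is assumed, and the volume is assumed to live at the same path on both hosts. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands by default (DRY_RUN=1) so they can be
# reviewed before anything is run for real.
DRY_RUN="${DRY_RUN:-1}"
VOLUME="jenkins-data"
VOLUME_PATH="/var/lib/docker/volumes/${VOLUME}/_data"
DEST_HOST="paco"
CONTAINER="jenkins"   # hypothetical container name, not taken from the log

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_volume() {
    # 1. Stop the service on cage first, so the data stops changing and
    #    ci.metabrainz.org is never pointed at two instances.
    run docker stop "$CONTAINER"
    # 2. Create the empty volume on the destination host.
    run ssh "$DEST_HOST" docker volume create "$VOLUME"
    # 3. Copy the data; the trailing slashes copy the directory contents.
    run rsync -a "${VOLUME_PATH}/" "${DEST_HOST}:${VOLUME_PATH}/"
}
```

Calling `migrate_volume` from cage prints the three commands; running with `DRY_RUN=0` as a user that can read the volume directory would execute them. Starting the service on paco afterwards is left to the usual start script.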
2021-02-10 04136, 2021
ruaok
k
2021-02-10 04130, 2021
zas
I wonder why a cpu load on cage affects all mbs services
zas had shared the link and it was open in a tab. i was closing tabs when it just spiked, alastairp :)
2021-02-10 04158, 2021
alastairp
but there were also LB js tests running. oddly, `ps` hung while printing info about js processes :/
2021-02-10 04123, 2021
alastairp
I hope jest doesn't spawn a million threads for whatever reason
2021-02-10 04154, 2021
alastairp
Mr_Monkey: when my time frees up next month, I agree that we should sit down and try and improve python style guides. This should include instructions to make vscode do the right thing when you press return, consolidation of the tools that we have locally, in jenkins and squaking, and removal of stupid warnings that we don't want