will require a check that both base images have this support in run-consul-template
but if they do, then updating server scripts will be easy
ruaok
I would suggest that we do the following: 1) Pick a target date for the consul upgrade, a day when all of us can be present at the same time. 2) Each team works towards compliance, ideally restarting services before the deadline. 3) During the deadline, we're all on hand to help with whatever goes wrong while zas does the switchover.
zas
yup good idea
ruaok
when should that deadline be? in 30 days or so?
zas
first I need to deploy a consul network, we'll set a date for the switch after we are sure it is stable enough
but yes, 30 days is a max
ruaok
Feb 8?
alastairp
so we will configure all apps to hard-code host/port to the current consul server, new network can be set up in the background? (and both have the same config values) and then we can slowly move apps to the new one?
zas
before a weekend? I'd say tuesday before
ruaok
maybe 3 hours prior to meeting time? maybe with yvanzo leading for the MB team, so that bitmap doesn't have to get up early?
yvanzo
zas: Feb 8 is a Monday
zas
ah .... wrong month ;)
ruaok
tuesday is fine too, doesn't matter in pandemic times. it's just work from here until.... well. shit.
zas
ok then
alastairp: the idea is to have 2 consul networks, containing same key-values (for config)
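A minimal sketch of how the key-values of the two consul networks could be compared before moving apps over. It assumes each network's KV store has been exported as the JSON list that Consul's `/v1/kv/?recurse` endpoint returns (entries with a `Key` and a base64-encoded `Value`). The function names `decode_kv_export` and `diff_kv` are hypothetical, not part of any existing tooling.

```python
import base64
import json


def decode_kv_export(payload):
    """Turn a Consul KV JSON export into a {key: value} dict.

    Assumption: `payload` is the JSON text of a recursive KV read,
    i.e. a list of objects with "Key" and base64 "Value" fields.
    """
    entries = json.loads(payload)
    result = {}
    for entry in entries:
        raw = entry.get("Value")
        # A key stored without a value is treated as an empty string here.
        result[entry["Key"]] = base64.b64decode(raw).decode("utf-8") if raw else ""
    return result


def diff_kv(old, new):
    """Report keys missing on either side and keys whose values differ."""
    shared = set(old) & set(new)
    return {
        "missing_in_new": sorted(set(old) - set(new)),
        "missing_in_old": sorted(set(new) - set(old)),
        "different": sorted(k for k in shared if old[k] != new[k]),
    }
```

An empty result in all three lists would suggest the new network holds the same config as the old one; anything else points at keys to fix before the switch.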
ruaok
so, in theory we can move beta.lb.org to the new consul server. if everything works, then we move prod during a release. yes?
zas
yes
yvanzo
we should build images that could make use of the new consul network for all services by Feb 8, right?
zas
yup
alastairp
right, so the aim is that everyone should have been upgraded before the 8th, and we're all on-hand for when the old one is turned off just in case things break?
ruaok
alastairp: let's work on this when we meet up next. we can also make sure that BB is in good shape.
alastairp: that was my suggestion, yes.
alastairp
yeah, this should be straightforward for LB, BB, CB, AB
zas
I'll start to work on new consul deployment
alastairp
ruaok: got it. just making sure I understand everything clearly
ruaok
how do we ensure that everything gets updated?
we have so many containers running, I have lost track of everything.
zas
well, we can check the images' age
alastairp
we can grep the env flag in the startup script
zas
yes too
I don't that much an issue
think*
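A rough sketch of the "check image's age" idea, assuming the creation timestamp of each image has already been collected (for example from the `Created` field of `docker image inspect`) and that those timestamps are in UTC. The image names, dates and the helper names `parse_created` and `stale_images` are made up for illustration.

```python
from datetime import datetime, timezone


def parse_created(timestamp):
    """Parse a timestamp such as '2021-01-04T10:15:30.123456789Z'.

    Fractions of a second are ignored and UTC is assumed.
    """
    parsed = datetime.strptime(timestamp[:19], "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def stale_images(created_by_image, cutoff):
    """Return the names of images created before `cutoff`, oldest first."""
    stale = [
        (created, name)
        for name, created in created_by_image.items()
        if created < cutoff
    ]
    return [name for created, name in sorted(stale)]


# Hypothetical data: any image built before the upgraded base image
# was published still needs a rebuild.
images = {
    "lb-web": parse_created("2020-06-01T08:00:00.123456789Z"),
    "mbs": parse_created("2021-01-20T12:00:00Z"),
    "ab": parse_created("2019-11-05T09:30:00Z"),
}
cutoff = datetime(2021, 1, 11, tzinfo=timezone.utc)
print(stale_images(images, cutoff))
```

Age alone only says an image predates the cutoff; grepping the startup script for the env flag, as suggested above, would still be needed to confirm an image was actually rebuilt on the new base.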
ruaok
zas: can you make a spreadsheet that lists all the containers that need upgrading, broken down by team that owns the container?
yvanzo
basically it's mostly about upgrading the base image and building new images from that.
ruaok
then we have a clear goal with clear ownership.
not containers -- images.
zas
yvanzo: yup, then deploying new images
alastairp
one question - what's the requirement for upgrading the base image?
do consul-template and consul versions have to be released in parallel?
zas
nope
but we want to upgrade consul & consul-template to most recent versions
base image is versioned
alastairp
OK. that requires more work for all of the python apps I think
I was going to look at this task this week anyway, though
let me look now and see if I can recall the problem
alastairp: do you recall why this downgrade was needed?
alastairp
it was to do with the order of config files being rendered and services being started
zas
then it'll need more work, but upgrade is still required
alastairp
tell me if I'm wrong, but I think the issue is that in 0.16 the service is started at the same time that the file is rendered. It was possible that the service tries to read the file before it is there
also base image upgrade can lead to compat issues too
yvanzo
is it something possibly being addressed with recent version of consul?
alastairp
and in new versions, it renders the template and then starts the service. it also requires changes to the config file. we were seeing syntax errors from consul-template on startup due to differing config files
zas
yvanzo: yes, tons of bugs
and bitmap and I suspect consul to be responsible for the mbs failures we see sometimes
alastairp
the last time I looked at this was about the time that we made that commit, so things are fuzzy. I remember that downgrading base image fixed it. We definitely need to improve this anyway
zas
(when all containers break, and usually recover ... or not)
alastairp
so I'm definitely interested in looking at other apps that use 0.19 and see what they do differently
yvanzo
what is the version target to upgrade consul to? 1.2.0?
zas
the thing is that we were reluctant to upgrade to not break things, but now it is very needed, our stuff is getting very old
yvanzo: the most recent stable, whichever it is when we'll proceed
yes, to define signal handling, which is very needed
alastairp
I believe we determined that exec{} was required for 0.19, but not required for 0.16
so to upgrade to 0.25, we should first upgrade all python apps to 0.19
and keep using old consul server
zas
alastairp: handling signals coming from docker, through consul-template to pass them to underlying app is needed, else your app will misbehave when containers (or system) are shut down
alastairp
I guess we managed without it until now :)
can I meet with zas or yvanzo maybe later this week to try and prototype this upgrade?
yvanzo
on Thursday?
zas
ok for me, but I have something rather urgent to handle this week (upgrading kiki)
alastairp
fine
probably doesn't require 2 people, just 1 who understands this signals problem. I assume that some mbs devs understand it?
yvanzo
no :D
zas
bitmap does (I guess) ;)
alastairp
happy to push it to another week to give zas some more time
but I'd prefer at least 2 weeks time between that and 8 Feb
zas
well, nothing ordered yet for kiki's so thursday may work, I'll tell you
bitmap
I'm happy to help if needed
zas
alastairp: let's try this week
ok, let's close this meeting before the next one opens, I need a pause, feel free to ask questions or shoot ideas on channel during the next weeks
alastairp
bitmap: do you remember this signal issue?
bitmap
afraid not, do you have a link?
alastairp
mostly those two consul-template conf files that I linked + zas' comment that "this is very needed to define signal handling"
yvanzo
alastairp: it is truly resolved in mbs afaik
BrainzGit
[listenbrainz-server] MonkeyDo merged pull request #1211 (master…lb-579): LB-579: Improve pagination buttons on "Recent Listens" and "User Artists" page https://github.com/metabrainz/listenbrainz-serv...
alastairp
ok, no problem. let's pencil in Thursday. 1h before meeting time? If zas is available, he can help. if not, we can try and muddle through it?
bitmap
sounds good to me
alastairp
thanks!
bitmap
it's been a while since I worked on the exec mode stuff, I'll look over what we did again
alastairp
do you have explicit runit service files for mb services, or are they all triggered from consul template files?
bitmap
we have runit service files which invoke consul-template
and then the exec mode handles starting the underlying service
yvanzo
alastairp: sorry, it is _not_ truly resolved in mbs afaik
alastairp
ahh, great. OK, we'll require some significant changes in AB and LB, then
bitmap: do you have any services that run a different app using the same docker image based on an env variable?
bitmap
I don't believe so, we have separate images for each service in general
Freso
<BANG>
alastairp
ok, let's talk about it on Thurs., thanks
Freso
It’s International Parity at Work Monday!
(I didn't get to look up a song somehow related in advance, so go find one of your own.)
shivam-kapila
(need to be related?)
Freso
zas: Go!
yvanzo
maybe someone that did not need a break?
Freso
Oh, sure. yvanzo, go!
yvanzo
oops
hi everyone!
Freso
:p
CatQuest
he
h
yvanzo
I reviewed two-fifths of MBS PRs opened since last year.
reosarevok blushes
reosarevok
Thanks for that!
yvanzo
Total count of opened PRs decreased by 20 despite new PRs.
CatQuest
cute
yvanzo
I’d like to continue at least at the same rate during this week.
Also updated wikidocs transclusion and updated beta.mb.o.
A new MBS release has been published just this morning, see the blog.
Last, I worked on search documentation and further bugfixes.
Go reosarevok!
reosarevok
Hi!
zas
I'm back, I can go after
Freso
(Others up for reviews: Mr_Monkey, ruaok, bitmap, alastairp, yvanzo, Freso, zas, CatQuest, jmp_music, _lucifer, shivam-kapila – anyone else who wants to give a review, let me know ASAP!)