#metabrainz


      • bitmap
        yesterday's went until 19:23 so we'll surely have to stop them
      • yvanzo
        bitmap: I rearranged a bit the post release sequence to quickly produce a sample data dump at first, as it can help with debugging Docker Compose if something goes wrong during the last tests.
      • bitmap
        ok, makes sense
      • yvanzo
        Maybe we can even start the live data feed before starting a full data dump?
      • reosarevok
        yvanzo: I tweeted, wanna post elsewhere since it's probably faster for you given you have accessed them already? :)
      • marcelveldt joined the channel
      • bitmap
        yvanzo: we could do it in that order too, yes (it just requires waiting around for the packet to be generated)
      • reosarevok
        bitmap: I'm going to stop cron for sitemaps then
      • No need to stop the process in any specific way in advance, is there?
      • bitmap
        oh I already stopped the cron there, but I mean we can forcefully stop the container later
      • reosarevok
        Oh
      • bitmap
        stopping cron doesn't affect existing jobs that are running
      • reosarevok
        Sure, ok, I meant stop and remove the container :)
      • Sorry
      • I guess there's no reason to wait if we're reasonably sure it won't finish on time?
      • bitmap
        I just killed them
      • reosarevok
        Thanks
      • bitmap
        (you can run the command to stop and remove the container though)
      • reosarevok
        Done
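(For reference, the stop-and-remove step being discussed is presumably the usual Docker pair; a minimal sketch, with the container name as a placeholder since the sitemaps container is never named in the log.)

```sh
# Placeholder container name; stop the running container, then remove it.
docker stop sitemaps-prod && docker rm sitemaps-prod
```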
      • yvanzo: are you waiting for a specific time for "Check Weblate repository status" (again)? :)
      • yvanzo
        bitmap: I reordered the steps and made it clear that we should keep going.
      • bitmap
        👍
      • yvanzo
        reosarevok: Weblate just needed time to catch up with changes: It had “33 missing in the push branch”, now 0.
      • bitmap
        we can start the sample dump concurrently with the MB production cron
      • reosarevok
        lucifer: it's soon time "to stop LB cron jobs that depend on the MB DB"
      • yvanzo
        Ok, wasn't sure if it might be delaying the first packet by a while.
      • lucifer
        reosarevok: sure can do it right now
      • bitmap
        reosarevok: had you started the pg_dump of musicbrainz_db already?
      • reosarevok
        Nope
      • But I can
      • bitmap
        oh we probably should have had that first in the list
      • reosarevok
        Oops.
      • started it now
      • How long should this take?
      • yvanzo
        reosarevok: It seems that I’m not receiving any confirmation mail from Twitter, so I cannot see what you posted and use it for other networks.
      • reosarevok
        I just quoted the previous tweet with "Downtime in around an hour now. Finish your editing and wish us a quick release! - reo"
      • Not really around an hour anymore tho
      • bitmap
        the dump I made on the 7th is 27GB, so if it dumps at least 1GB per minute, hopefully under 30 minutes :)
      • it's at 2GB after 2 minutes...
      • (I moved it to the top of the list)
      • reosarevok
        Thanks
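(The dump being timed here is a plain pg_dump of musicbrainz_db; a minimal sketch of that kind of invocation, with the database role and output path as assumptions.)

```sh
# Custom-format archive so it can later be restored selectively with pg_restore;
# role name and output path are assumptions, not taken from the log.
pg_dump -U musicbrainz -Fc -f ~/musicbrainz_db.pre-upgrade.dump musicbrainz_db
```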
      • yvanzo
        reosarevok: Done.
      • bitmap, reosarevok: I also backed up MB web container logs just in case.
      • bitmap
        thanks, where are the backups?
      • yvanzo
        In my home directory.
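(One way such a log backup can be done, sketched with a placeholder container name and destination rather than what was actually run.)

```sh
# Placeholder container name and path; captures both stdout and stderr.
docker logs musicbrainz-website-prod > ~/musicbrainz-website-prod.log 2>&1
```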
      • bitmap
        👍
      • the dump won't finish by 17 utc
      • yvanzo
        I’m stopping SIR instances on rakim.
      • bitmap
        should we skip it or wait? this particular step was before we had barman and zfs snapshots
      • reosarevok
        bitmap: which dump?
      • the aretha one I started a while ago, on edit_data now
      • bitmap
        the pg_dump of musicbrainz_db you started at 16:27
      • yvanzo
        pg_dump is very easy to use, but if you are comfortable enough to restore with other methods, feel free to skip.
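(If the pg_dump route ever had to be used for recovery, restoring a custom-format archive looks roughly like this; the archive path and job count are assumptions.)

```sh
# Recreate the database, then restore the archive with a few parallel jobs.
createdb musicbrainz_db
pg_restore -d musicbrainz_db -j 4 ~/musicbrainz_db.pre-upgrade.dump
```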
      • reosarevok
        Oh, I thought you said 30 min or so?
      • How long do you think it'll take? We can wait 5 min extra, but
      • bitmap
        probably around 45 minutes in actuality, that was an early estimate
      • yvanzo
        It’s also okay to wait a bit more if needed.
      • reosarevok
        I think that's fine tbh
      • We can stop it if it seems like it'll take a lot longer than that by 17utc
      • bitmap
        but could be up to an hour, not 100% sure it will stay at the same rate
      • if something goes horribly wrong I assume we'd start with the zfs snapshot since it's taken right before the upgrade
      • yvanzo
        That gives us a bit more time to gather restore instructions with those other methods then.
      • bitmap
        it's almost half done, so, probably fine to wait a little
      • yvanzo
        👍
      • reosarevok
        Finished edit_data, at least
      • atj
        did you take the snapshot or have you not reached that point?
      • yvanzo
        not reached that point
      • atj
      • if something were to go wrong and you wanted to revert, it'd simply be a case of running `zfs rollback rpool/ROOT/ubuntu/srv/postgresql@preupgrade` and `zfs rollback rpool/ROOT/ubuntu/srv/postgresql/wal-12@preupgrade`
      • should be pretty much instantaneous and you'd be exactly where you were before you started
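(The matching step before the upgrade is taking those two snapshots; the dataset names are the ones from atj's rollback commands, the exact timing is an assumption.)

```sh
# Take the @preupgrade snapshots right before starting the PG upgrade,
# so the rollback commands above have something to roll back to.
zfs snapshot rpool/ROOT/ubuntu/srv/postgresql@preupgrade
zfs snapshot rpool/ROOT/ubuntu/srv/postgresql/wal-12@preupgrade
```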
      • mayhem is home and idle
      • mayhem
        if anyone needs any help, shout!
      • reosarevok puts on his robe and wizard hat
      • yvanzo
        hlep!
      • reosarevok
        oh, not what you were going for, mayhem, sorry
      • Sophist_UK
        atj: Ah, the wonders of ZFS snapshots. :-)
      • reosarevok
        The dump is on editor, hopefully recording/release won't take forever
      • bitmap
        12:15 is the new downtime ETA
      • sorry, 17:15
      • yvanzo
        12:15 am ;)
      • bitmap
        haha
      • atj
        👀
      • yvanzo
        Thanks atj, I’ve added these two commands to the roadmap docs.
      • I’m focused on mirrors’ code now.
      • bitmap
        https://docs.google.com/document/d/1o8h1z02YBWs... should be public (read-only) for others too
      • reosarevok
        dump is on track now
      • So hopefully it's on track
      • 🤣
      • yvanzo
        But when it gets to release it still won’t be finished
      • Actually, we have a second full mirror on wolf to test upgrading. :)
      • MatthewGlubb joined the channel
      • reosarevok
        hi MatthewGlubb! We didn't break anything yet surely?
      • mayhem
        lol
      • MatthewGlubb
        Haha! Hi reosarevok! No. Just checking in to see that all systems are nominal for launch. I haven't looked, but I wondered if mb-solr related changes have been tagged in git to make it easy for me to know what to upgrade?
      • (we are already running Solr 9, so should be simple for us)
      • reosarevok
        We're not releasing all the search stuff today in the end since it's a bit behind in testing but yvanzo can give you all the details :)
      • bitmap
        reosarevok: finished?
      • yvanzo
        MatthewGlubb: 👍 even though we won’t officially provide support for it until we have deployed it in production with dumps.
      • reosarevok
        bitmap: just did, yes!
      • Let's move on! :)
      • bitmap
        nice, just on time
      • reosarevok
        yvanzo: are you done with all the image building now?
      • MatthewGlubb
        No worries. I can just rebuild our indices for Solr 9 with what I've already got. I'll leave you all be. Good luck!
      • yvanzo
        reosarevok: No, I’m still building images for the test mirrors
      • bitmap
        I'll stop the PG things, can you do the rest?
      • yvanzo
        MatthewGlubb: Sure, it works the same for SIR and MBS anyway :)
      • reosarevok
        bitmap: did you do the zfs now?
      • bitmap
        no not yet
      • it's after the reboot
      • reosarevok
        Oh, ok
      • bitmap
        zas: you can reboot jimmy and hendrix
      • reosarevok
        bitmap: I'm stopping prod and beta
      • zas
        ok
      • bitmap
        are there any other hosts with non-redundant services we should prioritize rebooting? patton (MB redis store) comes to mind
      • zas
        rebooting hendrix
      • yes, any server we don't usually reboot
      • bitmap
        you can reboot patton too then
      • zas
        rebooting jimmy
      • reosarevok
        Don't do patton yet
      • I'm still stopping services
      • zas
        k
      • reosarevok
        I mean, unless it doesn't matter
      • zas
        reosarevok: tell me when
      • reosarevok
        bitmap: had you stopped the artwork indexer?
      • Error response from daemon: No such container: artwork-indexer
      • bitmap
        I don't think so
      • the command is wrong, should be artwork-indexer-prod
      • fixed it in the doc
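(With the corrected name, the stop-and-remove step in the doc presumably looks like the following sketch.)

```sh
# Corrected container name; stop it, then remove it.
docker stop artwork-indexer-prod && docker rm artwork-indexer-prod
```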
      • zas
        still waiting for hendrix & jimmy to come back
      • bitmap
        reosarevok: I stopped it
      • reosarevok
        So did I, heh
      • zas
        bitmap: hendrix is back
      • reosarevok
        I guess I don't have all the latest ssh configs
      • bitmap: can you run that one?
      • zas
        bitmap: jimmy is back
      • reosarevok
        Should have updated that earlier, oh well
      • bitmap
        ok
      • zas
        reosarevok: tell me when you're done with patton
      • reosarevok
        bitmap: I took prod-cron down again
      • bitmap
        thanks
      • reosarevok
        zas: I dunno if sir-prod runs on there; if it does, go ahead once bitmap stops it, if not, now
      • bitmap
        it doesn't
      • I'll proceed with the PG v16 upgrade
      • reosarevok
        Ok, zas, feel free to reboot then :)
      • bitmap
        (sir-prod is stopped now)
      • zas
        rebooting patton
      • bitmap
        pg upgrade is running
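(A very rough sketch of what an in-place 12-to-16 pg_upgrade involves; the old major version is inferred from the wal-12 dataset mentioned earlier, and all paths are placeholders rather than the real layout on the server.)

```sh
# Placeholder paths; --link hard-links data files instead of copying them,
# which is why the ZFS snapshot is the safety net for rolling back.
/usr/lib/postgresql/16/bin/pg_upgrade \
  --old-bindir=/usr/lib/postgresql/12/bin \
  --new-bindir=/usr/lib/postgresql/16/bin \
  --old-datadir=/srv/postgresql/12/main \
  --new-datadir=/srv/postgresql/16/main \
  --link
```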
      • zas
        patton's back