#metabrainz

      • reosarevok
        Thanks!
      • Has anyone started with "Test pulling latest HEAD for docker-server-configs clone on every needed host."?
      • bitmap
        not I
      • I think it's just to make sure there are no local changes that will interfere later
      • yvanzo[m] has quit
      • yvanzo
        Linked the time to the countdown, and published!
      • reosarevok: That can be done with a script.
      • reosarevok
        yvanzo: neat, are you running that?
      • yvanzo
        Ok, I should even have the command somewhere, will add it to the roadmap.
      • reosarevok
        That'd be great for next time too :)
      • yvanzo
        Last year I did unify deployment of docker-server-configs indeed and updated its documentation.
      • bitmap
        it'd be neat if we had a script in docker-server-configs to stop/restart some specified container(s) on all applicable hosts
      • yellowhatpro joined the channel
      • yvanzo
        All hosts are good except `quest`, which we don’t need to change for today.
      • reosarevok
        yvanzo: so do I understand correctly that in a few weeks all docker users will need to recreate all their search indexes?
      • yvanzo
        reosarevok: yes
      • reosarevok
        Ok, hopefully it's ok to ask that of our customers outside a schema change boundary
      • yvanzo
        bitmap: do you have some specific container(s) in mind?
      • bitmap
        for today, mainly the musicbrainz-web* containers and all the PG (haproxy, pgbouncer) containers
      • yvanzo
        reosarevok: It is what it is.
      • reosarevok
        Well, otherwise we'll have to do it in September :)
      • (which would be annoying but not the end of the world)
      • bitmap
        actually, something more generalized, like an --exec flag on list_nodes.sh, might be more useful (since it could also be used to run set_service_maintenance commands)
      • reosarevok
        what email is the CB twitter 2fa even supposed to go to? I don't recognize the pattern
      • bitmap
        i.e., run this command on all hosts that have this container
      • yvanzo
        bitmap: list_nodes.sh works well for MB containers (and most of our other services) but not so well with PG containers (which are prefixed with the hostname).
      • twodoorcoupe_ joined the channel
      • You can use it within a loop (or two if there is more than one container).
      • bitmap
        yeah, the postgres ones are, but fortunately none of the haproxy/pgbouncer ones have a hostname
      • right, it would basically just avoid having to write a loop by hand for everything :)
      • twodoorcoupe has quit
      • (and having to drop into bash, since I rarely write loops in fish and always forget the syntax...)
      • reosarevok
        yvanzo, bitmap: has either of you started "Perform a full cluster backup on aretha (inside a screen session):" or should I?
      • bitmap
        no I have not
      • minimal joined the channel
      • yvanzo
        bitmap: Actually `list_nodes.sh` doesn’t look at any container name, just at the start_* commands.
      • bitmap
        reosarevok: feel free to start it
      • yvanzo
        `collect_logs.sh` does look at running container names, that can be a better base for such a script.
      • reosarevok: me neither
      • reosarevok
        Ok, started
      • bitmap
        true, I forgot about that distinction. I guess either would suffice for my use case
      • twodoorcoupe_ has quit
      • I'll simplify some of these steps with bash for-loops for now though. it'll probably shave some minutes off the downtime
      • nvm you're already on it :)
      • atj
        bitmap: we should take a ZFS snapshot when PG is stopped and before the upgrade is started
      • yvanzo
        bitmap: See my changes under “Downtime for the PostgreSQL v16 upgrade”
      • bitmap: I assumed that `sshc` (see `scripts/functions.sh`) is defined in your `~/.bashrc`
      • bitmap
        atj: made a note of it. can I defer to you when the time comes to execute that or do you want to add the command to the doc?
      • atj
        i'll add it to the doc, it's not complex
      • bitmap
        ok, I highlighted the placeholder in red
      • yvanzo: thanks, I didn't have it defined yet, will do
      • yvanzo
        It’s very handy to wreak havoc.
      • bitmap feels very powerful all of a sudden
      • atj
        keeping the snapshots until next week's meeting seems reasonable to me
      • anyone want to volunteer to remember? :)
      • bitmap
        I set a reminder
      • btw do you have an idea of how long the snapshot will take?
      • reosarevok
        I'm still running the backup, no idea how long *that* will take
      • bitmap
        reosarevok: it will finish in time
      • reosarevok
        "Build new PG v16 images on jimmy, hendrix, and aretha (barman)." does that need to be done *after* this finishes?
      • bitmap
        no I'm doing that now
      • aerozol[m] has quit
      • reosarevok
        Ok :)
      • yvanzo
        bitmap: When should `"down"` be put to `"yes"`?
      • bitmap
        at 17 utc
      • yvanzo
        Oh I see it's in the drafted PR, not in the roadmap.
      • reosarevok
      • Done
      • yvanzo: how's "Draft a Docker Compose release with upgrade steps." going? :)
      • yvanzo
        reosarevok: Not there yet, but since we cannot test these beforehand, it is expected to take some time anyway.
      • atj
        bitmap: it's instant
      • reosarevok
        Ok, was asking because it's the only bit left in the "Getting the upgrade ready" bit :)
      • yvanzo
        bitmap: Replaced some instructions for each host in a list under “Uptime for MB services” with two new snippets.
      • bitmap
        atj: oh :) amazing
      • yvanzo
        reosarevok: At least, we won’t have to rebuild search indexes at the same time, so it should be considerably faster than otherwise.
      • bitmap
        yvanzo: thanks!
      • yvanzo
        (Tested with echoing the commands.)
      • bitmap
        I was replacing some other one-off "On $host:" steps with a list_nodes invocation, too (even if it only applies to a single node)
      • since it's easier to reuse the steps in future migration docs if the host names aren't hard-coded
      • yvanzo
        I guess that dependent services should be set "up" again just after having checked that MB main website has no obvious issues?
      • Yes, it seems much better to avoid host names as much as possible in instructions.
      • bitmap
        yeah, a PR needs to be provisioned for that
      • (I just saw you made some draft PRs for MB, thanks!)
      • yvanzo
        Ok, will do
      • Done.
      • reosarevok
        Should we start the "A couple hours before the upgrade" section soon?
      • bitmap: does that include replacing "On hendrix:" in that section?
      • bitmap
        those can stay since they're run by hand on the remote machine instead of via the local bash snippets
      • reosarevok
        Ok
      • Should I start with them?
      • Or do you want to wait as long as possible?
      • bitmap
        better to let MB cron run at least once more
      • reosarevok
        Ok
      • bitmap
        the JSON dumps run at :30 so it's fine to stop those at the same time as MB cron
      • yvanzo
        Merge directions seem to be a bit wild, will comment.
      • bitmap
        we did recently have some commits in the production branch that weren't in beta or master, but I believe they've been synced
      • yvanzo
        bitmap: I made some suggestions, so that it is always the same usual direction.
      • Previous years, we didn’t have the translations branch in the loop either.
      • bitmap
        that looks good to me now
      • perhaps we should reboot jimmy/hendrix once downtime begins but before starting the PG upgrade?
      • zas
        bitmap: yes
      • outsidecontext
        rdswift: hi, does running setup.py build pot or build po work for you on picard-docs? for me this fails with some strange error where it tries to access some file in one of the python dependencies, e.g. it wants to build translation for "idna-3.7.dist-info/LICENSE"
      • rdswift
        Give me a minute and I'll check. It worked fine last time I updated something. Back in a minute.
      • outsidecontext
        thanks
      • reosarevok
        bitmap: aphex has daily-sitemaps on S and BuildSitemaps on R among others - how long do those usually take?
      • bitmap
        I don't think they take very long
      • reosarevok
        It says start 11:05
      • rdswift
        outsidecontext, it worked fine here so I pushed the updates. I had an older version of jinja2, so updated it to the latest version as per the dependabot update and it still worked fine.
      • reosarevok
        So not sure about not long 😅
      • outsidecontext
        rdswift: it's strange. maybe it is because I'm using a venv?
      • rdswift: I always get something like the above. If I remove the idna package with pip the error changes to another package. I have no clue at all why it starts picking the files inside pip installed packages https://www.irccloud.com/pastebin/PC4K8AVk/
      • python 3.12 maybe could also be an issue
      • rdswift
        Could be. I haven't got a venv set up for that. I just try to keep my system up-to-date. (I know, that's not always the greatest idea.) I'm still on Python 3.10 on that machine.
      • outsidecontext
        I'll try to debug this more. Thanks for doing the file update
      • yvanzo
        bitmap, reosarevok: Drafted release notes for Docker Compose.
      • Do we have scheduled -1h posts for Bluesky/Mastodon/Twitter?
      • reosarevok
        No, I was planning to just do it myself, but that won't help with bluesky nor mastodon
      • yvanzo
        I mean there is a schedule feature to publish at -1h we can use now.
      • reosarevok
        Yeah, I know
      • But I'm not sure we have a way to do so in the other services, unless you have access
      • yvanzo
        We all do
      • reosarevok
        Oh. Is that in syswiki? I have never even tried to use them, so :) Ok, then we can do that
      • yvanzo
        I don’t see any schedule button in Bluesky/Mastodon indeed.
      • lucifer
        mayhem: yes makes sense, can discuss with zas and atj later this week
      • reosarevok
        Well, that's not the end of the world, since we're around :)
      • I was more worried we couldn't post at all
      • yvanzo, bitmap: json dump container seems fine to stop, any reason not to move ahead with that?
      • bitmap
        go ahead
      • reosarevok
        Same for musicbrainz-production-cron
      • dumps done, I assume (it'd be nice if the command gave any feedback at all)
      • prod-cron done
      • sitemaps is not ready:
      • Dunno whether we give up on it or not
      • yvanzo
        It shouldn’t take long.
      • bitmap
        well the one started at 13:30 which is odd
      • yvanzo
        You can check logs.
      • bitmap
        that one is still processing stuff apparently
      • reosarevok
        127.0.0.1 - - [13/May/2024:15:48:54 +0000] "GET /release/a484ad67-422f-4365-a4e5-652492cae05d/cover-art HTTP/1.1" 200 2382 "-" "musicbrainz-server/28 (localhost:5000) libwww-perl/6.72"
      • etc
      • still doing stuff
      • yvanzo
        Logs from previous runs should give an estimate too.
      • Meantime I’ll start with sawing the branches we are sitting on.
      • bitmap
        😲
      • reosarevok: the one from 13:30 just finished
      • reosarevok
        Yeah, I saw
      • Nothing else seems to be going on?
      • But we can wait a tiny bit
      • bitmap
        daily sitemaps are still running, but we could stop them if needed
      • reosarevok
        Yeah. they are not logging anything, wonder why
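
The "run this command on all hosts that have this container" idea discussed earlier in the log could look roughly like the sketch below. It is not the actual `list_nodes.sh` or `collect_logs.sh` logic. The host names, container names, inventory mapping, and the `build_remote_commands` helper are all hypothetical, and the script only prints the `ssh` invocations as a dry run (in the spirit of "tested with echoing the commands") instead of executing anything.

```python
import shlex


def build_remote_commands(host_containers, wanted_prefixes, command_template):
    """Return (host, ssh_argv) pairs for every container matching a wanted prefix.

    host_containers: dict mapping a host name to the container names on it
                     (hypothetical inventory, not read from any real script).
    wanted_prefixes: container name prefixes to match, e.g. "musicbrainz-web".
    command_template: remote command containing a "{container}" placeholder.
    """
    plans = []
    for host in sorted(host_containers):
        for container in host_containers[host]:
            if any(container.startswith(prefix) for prefix in wanted_prefixes):
                # Quote the container name so it is safe in the remote shell.
                remote = command_template.format(container=shlex.quote(container))
                plans.append((host, ["ssh", host, remote]))
    return plans


# Hypothetical inventory; real hosts and containers would come from elsewhere.
inventory = {
    "host-a": ["musicbrainz-website-prod", "pgbouncer-master"],
    "host-b": ["musicbrainz-webservice-prod"],
    "host-c": ["haproxy-master"],
}

plans = build_remote_commands(inventory, ["musicbrainz-web"], "docker stop {container}")

# Dry run: print the commands instead of running them.
for host, argv in plans:
    print(host, shlex.join(argv))
```

Keeping the host list out of the command itself matches the point made in the log that steps are easier to reuse in future migration docs when host names are not hard-coded.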