#metabrainz

      • lucifer
        yes makes sense.
      • bitmap
        lucifer: there is a setting to log disconnections, but it's not enabled
      • mayhem
      • Pratha-Fish
        Would you guys mind if I create a smol alternate MusicBrainz docker instance on wolf? If so, how do I do it without interfering with the existing instance?
      • mayhem
        from gaga in tmux, I run `ssh -L 55432:127.0.0.1:5432 robert@wolf`
      • bitmap
        Pratha-Fish: the container I made is running with --network host, so it can access any ports exposed on the host
      • I'm not sure which port the database you want to access is on, though
      • Pratha-Fish
        bitmap: it's on 5432
      • lucifer
        mayhem: docker container won't be able to access localhost ports i think.
      • not sure if we run with --network=host on prod.
      • Pratha-Fish
        *5432 until and unless I start another musicbrainz-docker instance different from the existing one
      • mayhem
        sorry forgot to mention I am not running on docker.
      • just a venv. ends up being easier for tunneling
      • lucifer
        ah cool then
      • bitmap
        Pratha-Fish: then try psql -h localhost -p 5432 -U musicbrainz -d musicbrainz_db
      • lucifer
        looks good to me
      • Pratha-Fish
        bitmap: yes, I've been using that one for a while now :D
      • mayhem
        forgot to fetch the right password.
      • bitmap
        Pratha-Fish: I mean it should work inside the python container too
      • Pratha-Fish
        whoops
      • mayhem
        query started. monitoring wolf.
      • bitmap
        Pratha-Fish: also, I mounted your home dir as /snaek inside the container
      • Pratha-Fish
        bitmap: thanks that worked!
      • thanks a ton 🥹
      • bitmap
        try `su snaek` in the container and then cd /snaek
      • I set the uid/gid to the same as the host
      • lucifer
        mayhem: uhh, i think we also lost the messybrainz table in that issue.
      • it's going to be another PITA to restore that table.
      • mayhem
        oh joy.
      • how so?
      • lucifer
        uuid are generated randomly and not from the source data.
      • mayhem
        but we back up all those tables with pg_dump, no?
      • Pratha-Fish
        bitmap: I can only see directory called "csp-errors" in /snaek
      • bitmap
        oh that might be my home dir haha. sorry, let me fix
      • lucifer
        mayhem: i just checked and no. msb table is not backed up.
      • mayhem
        O_O
      • wow.
      • we need to review our backups. this is bad. :(
      • lucifer
        we can still get back the data from the listens but yeah excruciating.
      • indeed
      • mayhem
        SELECT * INTO ?
      • well not *, but msids
      • bitmap
        Pratha-Fish: now it should be fine
      • lucifer
        yeah msids, and the 4-5 fields that go into. plus deduplicating with all the data that has accumulated in the past 4 days
      • mayhem
        ok, how about this:
      • 1. stop TS writer
      • 2. SELECT INTO
      • Pratha-Fish
        bitmap: working!
      • mayhem
        3. Restart TS writer
      • 4. Write script to fix up 4 days.
      • lucifer
        yup that's the plan.
      • mayhem
        ok, can we get started with 1-3 asap?
      • lucifer
        we could, but it's better to run in the morning. otherwise the ts writer will have to be stopped overnight.
      • mayhem
        ok, fair.
      • it's late enough as is.
      • lol
      • lucifer
        there's one more unbacked up table that is going to be hard to restore. mbid_manual_mapping.
      • mayhem
        I just ran the partial set of the mb-metadata cache data.
      • and sure enough data is now being returned, for those 3 artists. that's a good sign.
      • hard?
      • where would we get the data from?
      • lucifer
        it would be impossible. but we have it mixed in spark dumps.
      • mayhem
        🤯
      • this is going to be a fun week, isn't it?
      • lucifer
        compare the automapper-assigned mbid with the recording mbid present in the dumps; if it's different and not present in the actual listen submitted to LB, then it was a manually mapped one.
      • so have to compare three datasets to recover that.
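
A minimal sketch of the three-way comparison lucifer describes, in Python. The data layout, the field names (`msid`, `recording_mbid`) and the function name `find_manual_mappings` are assumptions for illustration only, not the actual ListenBrainz schema or recovery script.

```python
def find_manual_mappings(dump_rows, automapper, submitted):
    """Return {msid: recording_mbid} pairs that look manually mapped.

    dump_rows:  iterable of (msid, recording_mbid) pairs found in the dumps
    automapper: dict of msid -> recording_mbid assigned by the automapper
    submitted:  dict of msid -> recording_mbid sent with the original listen

    All three structures are hypothetical stand-ins for the real datasets.
    """
    manual = {}
    for msid, mbid in dump_rows:
        if mbid is None:
            continue  # nothing mapped at all
        if mbid == automapper.get(msid):
            continue  # same as the automatic mapping
        if mbid == submitted.get(msid):
            continue  # the user submitted this mbid with the listen
        manual[msid] = mbid  # differs from both, so likely a manual mapping
    return manual
```

A row only counts as manually mapped when its mbid matches neither of the other two sources, which is why all three datasets have to be available at once.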
      • mayhem
        🤯 🤯 🤯
      • lucifer
        i am amazed that postgres let that query delete all tables in the script but kills our cache queries before completing execution every time.
      • mayhem
        might be a level of effort optimization
      • bitmap
        lucifer: regarding the pgbouncer logs, I can see why we disabled them, since it logs an entry every 0.005 seconds or less (and usually the info is not useful)
      • lucifer
        LB-1392
      • BrainzBot
        LB-1392: Document and require all tables to be dumped https://tickets.metabrainz.org/browse/LB-1392
      • lucifer
        bitmap: i see, is it possible to change that setting without restart?
      • mayhem
        I've started on the PR to dump all tables.
      • bitmap
        yeah, I can make it temporarily log disconnections
      • but it will probably just say "client close request" without any other info
      • lucifer
        cool, let's get on that tomorrow.
      • can we identify the connection that closed down?
      • mayhem
        I'll start working on it now.
      • bitmap
        it shows something like musicbrainz_db/musicbrainz@172.17.0.1:45668 closing because: client close request (age=0s)
      • lucifer
        mayhem: yeah that is one part, other is to error in say CI if we add a new table and forget to dump it.
      • bitmap
        those docker ports are completely opaque afaik
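
For reference, a rough Python sketch of pulling the fields out of a disconnect message shaped like the one bitmap quotes. The regular expression and the name `parse_close_event` are assumptions; real pgbouncer log lines carry a prefix that this sketch simply skips, and only IPv4-style `host:port` addresses are handled.

```python
import re

# Pattern modeled on the quoted example only; not an official pgbouncer format spec.
CLOSE_RE = re.compile(
    r"(?P<database>[^/\s]+)/(?P<user>[^@\s]+)@(?P<host>[^:\s]+):(?P<port>\d+)"
    r" closing because: (?P<reason>.+?) \(age=(?P<age>\d+)s\)"
)


def parse_close_event(line):
    """Return a dict describing the disconnect, or None if the line does not match."""
    match = CLOSE_RE.search(line)
    if match is None:
        return None
    event = match.groupdict()
    event["port"] = int(event["port"])
    event["age"] = int(event["age"])
    return event
```

As bitmap notes, the source port of a docker-proxied connection does not by itself identify the client, so the parsed port would still need to be correlated with something else.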
      • lucifer
        i see yeah. :/
      • mayhem
        good idea
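
A minimal sketch of the CI check lucifer suggests, assuming the list of existing tables and the list of dumped tables are both available as plain names. The function names and the explicit exclusion list for generated data are hypothetical, not what PR #2620 implements.

```python
def find_undumped_tables(all_tables, dumped_tables, excluded_tables=()):
    """Return the sorted table names that are neither dumped nor explicitly excluded."""
    missing = set(all_tables) - set(dumped_tables) - set(excluded_tables)
    return sorted(missing)


def check_all_tables_dumped(all_tables, dumped_tables, excluded_tables=()):
    """Raise AssertionError (failing CI) when a table is missing from the dumps."""
    missing = find_undumped_tables(all_tables, dumped_tables, excluded_tables)
    if missing:
        raise AssertionError(
            "tables neither dumped nor explicitly excluded: " + ", ".join(missing)
        )
```

Requiring an explicit exclusion list means that skipping a table, such as generated data that can be rebuilt, becomes a visible decision rather than an oversight.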
      • lucifer
        thanks for checking that bitmap.
      • bitmap
        np
      • you are disabling the statement_timeout and pg is killing it for some other reason? (if pg is killing it, it should be in the pg logs, anyway)
      • mayhem
        lucifer: do we dump mapping.mb_metadata_cache ?
      • a lot of generated data to dump.
      • lucifer: while we're at it, we should switch to zstd compression. that is going to make dumps and moving data to spark a lot faster.
      • Pratha-Fish
        does tmux work with docker? 💀
      • bitmap
        Pratha-Fish: in what way?
      • Pratha-Fish
        I did the following:
      • lucifer
        mayhem: I would probably not dump generated data.
      • But rather make it reliable to generate
      • Pratha-Fish
        Logged into wolf > entered tmux > entered the python 3.11 container > su snaek > started_some_job
      • mayhem
        ok, then I see that we need to add "messybrainz.submissions" and "mbid_manual_mapping"
      • bitmap
        Pratha-Fish: yes that's fine
      • Pratha-Fish
        Then I exited tmux by pressing CTRL + B > D
      • lucifer
        Makes sense
      • Pratha-Fish
        Now, that threw me back to home dir as expected
      • but when I entered tmux again, I expected to go back inside the docker container and see updates on the job I was running
      • But entering tmux again just got me to the home dir
      • bitmap
        tbh I don't remember how to re-attach with tmux (I use screen)
      • Pratha-Fish
        NP, I'll try to figure something out and come back to you if it doesn't work :)
      • bitmap
        are you sure it's not just opening another session? can you switch sessions?
      • Pratha-Fish
        verifying
      • bitmap
        session or window I guess
      • BrainzGit
        [listenbrainz-server] mayhem opened pull request #2620 (master…LB-1392): LB-1392: Add missing tables to dumps and ensure that all tables are dumped https://github.com/metabrainz/listenbrainz-serv...
      • Pratha-Fish
        Yes, I connected to the same session :D
      • Still no luck
      • Looks like I'll have to install tmux inside the container
      • Hope I didn't kill docker on wolf https://usercontent.irccloud-cdn.com/file/TP0Uk...
      • bitmap
        no you just stopped the container
      • Pratha-Fish
        phew seems like it
      • Sorry for the silly question again, but how do I restart it?
      • bitmap
        docker start snaek-python
      • Pratha-Fish
        thanks 🤦
      • bitmap
        sorry I'm not sure about your tmux issue, I know the same works fine in screen. if you wanna try installing tmux in the container, go ahead
      • Pratha-Fish
        bitmap: just figured it out :D
      • as simple as installing tmux on the container
      • and attaching to the session after entering the container
      • bitmap
        good :)
      • Pratha-Fish
        :)
      • petitminion has quit
      • petitminion joined the channel
      • bitmap: looks like we have something tangible on our hands right now :)
      • Not to mention reosarevok
      • I haven't created any PRs yet because the whole process of figuring out what to even do in this project was pretty hectic. Will host the rest of the code on https://github.com/MetaBrainz/MusicBrainz-AreaBot/ ASAP.
      • atj
        `tmux a` re-attaches to an existing session btw
      • Pratha-Fish notes down
      • Just running tmux starts a new session
      • Pratha-Fish
        atj: thanks, I was making the same mistake lmao. now I just use tmux attach -t sesh_num
      • atj
      • Pratha-Fish
        E p i c
      • Lotheric has quit
      • BrainzGit
        [bookbrainz-data-js] kellnerd opened pull request #314 (master…repeatable-imports): Repeatable imports https://github.com/metabrainz/bookbrainz-data-j...
      • kellnerd
        My last PR for GSoC, repeated imports can now "overwrite" pending entities :)
      • Pratha-Fish
        kellnerd: congrats on the work! 🥳