#metabrainz

      • ruaok
        if we had the diskspace, I would say, screw it, just write a duplicate table.
      • 2021-07-05 18650, 2021

      • ruaok
        and then we can swap over in an instant.
      • 2021-07-05 18606, 2021

      • ruaok
        but I don't think we can get away with that.
      • 2021-07-05 18622, 2021

      • kinduff
        alrighty, thank you lucifer, reosarevok, ruaok
      • 2021-07-05 18638, 2021

      • alastairp
        and even then, I suspect that recreating the continuous aggregate on a new datetime column at the same time that we have the one on the timestamp column will cause disk problems too
      • 2021-07-05 18650, 2021

      • ruaok
        we wouldn't need that.
      • 2021-07-05 18600, 2021

      • ruaok
        the new table would not be accessed until the actual switchover.
      • 2021-07-05 18626, 2021

      • ruaok
        we could take the listenstore offline for an hour or so during the switchover, no big deal.
      • 2021-07-05 18644, 2021

      • alastairp
        for both reads and writes?
      • 2021-07-05 18600, 2021

      • alastairp
        and then during that time drop the existing aggregate and recreate it on the datetime column?
      • 2021-07-05 18631, 2021

      • ruaok
        yes
      • 2021-07-05 18612, 2021

      • alastairp
        I'm happy to try it, though this task has now ballooned in size a bit
      • 2021-07-05 18649, 2021

      • ruaok
        could we run a faily simple trial as a proof of concept?
      • 2021-07-05 18604, 2021

      • ruaok
        fairly == not fail-y
      • 2021-07-05 18631, 2021

      • ruaok
        create table, start copying old rows, monitor for 10% and extrapolate how much disk space it would really take.
      • 2021-07-05 18640, 2021

      • ruaok
        and only if it looks doable do we proceed with this approach.
      • 2021-07-05 18642, 2021
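
The trial described above (copy roughly 10% of the rows, then extrapolate) could be sketched as follows. This is a minimal sketch that assumes disk usage grows linearly with the number of rows copied; the function names and the 20% headroom default are hypothetical and not from the discussion.

```python
def extrapolate_disk_usage(bytes_used_by_sample: int, fraction_copied: float) -> int:
    """Linearly extrapolate the full table size from a partial copy."""
    if not 0 < fraction_copied <= 1:
        raise ValueError("fraction_copied must be in (0, 1]")
    return round(bytes_used_by_sample / fraction_copied)


def fits_on_disk(projected_bytes: int, free_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if the projected size leaves `headroom` of the free space unused."""
    return projected_bytes <= free_bytes * (1 - headroom)
```

Indexes and any continuous aggregate would add to the projected figure, so the result is best read as a lower bound.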

      • alastairp
        the only reason to do both columns at the same time would be to avoid 2 schema changes, are we worried about that? (I'm not)
      • 2021-07-05 18601, 2021

      • alastairp
        sure, let me put together a quick PR for that
      • 2021-07-05 18603, 2021

      • ruaok
        i'm not.
      • 2021-07-05 18618, 2021

      • ruaok
        but, I am worried about an exclusive table lock during the ALTER TABLE command.
      • 2021-07-05 18631, 2021

      • ruaok
        and we have no idea how long ALTER TABLE will run for
      • 2021-07-05 18639, 2021

      • alastairp
        I believe that add column default null in postgres now no longer needs a lock
      • 2021-07-05 18651, 2021

      • alastairp
        however, moving to not null may need one
      • 2021-07-05 18631, 2021

      • ruaok
        if we can avoid a table lock we should use your solution. clearly simpler.
      • 2021-07-05 18609, 2021

      • alastairp
        OK, I'll do the following: 1) PR for moving to user_id, 2) PR for testing the change to a date time field - by making a new table and copying 10%, 3) verify that adding a column doesn't need a lock and check the time that adding a not null constraint requires
      • 2021-07-05 18622, 2021

      • alastairp
        thanks for the discussion
      • 2021-07-05 18653, 2021

      • ruaok
        1) does not need a table lock either?
      • 2021-07-05 18606, 2021

      • alastairp
        for adding the column, no
      • 2021-07-05 18617, 2021

      • alastairp
        but now I'm doubting the change of the constraint
      • 2021-07-05 18631, 2021

      • ruaok
        let's do 3) first.
      • 2021-07-05 18644, 2021

      • ruaok
        because that will really inform 1 and 2
      • 2021-07-05 18647, 2021

      • alastairp
        perfect, let me finish my db import and I'll test that
      • 2021-07-05 18651, 2021

      • ruaok
        thx
      • 2021-07-05 18630, 2021
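
Step 3 of the plan above (check the locking behaviour and time the NOT NULL change) could be prepared with something like the sketch below. It only builds the SQL strings and times whatever executes them. It assumes PostgreSQL 12 or newer, where a validated `CHECK (col IS NOT NULL)` constraint lets `SET NOT NULL` skip its own table scan. Each `ALTER TABLE` still takes a lock, if only briefly, and whether any of this holds on a TimescaleDB hypertable is exactly what the test would need to confirm. The helper names and the constraint name are hypothetical.

```python
import time
from typing import Callable, List


def migration_statements(table: str, column: str, col_type: str) -> List[str]:
    """Return SQL for adding a nullable column and later enforcing NOT NULL.

    Assumes PostgreSQL 12+. Existing rows must be backfilled between the
    first and second statement.
    """
    check = f"{column}_not_null"
    return [
        # Nullable column without a default: no table rewrite is needed.
        f"ALTER TABLE {table} ADD COLUMN {column} {col_type}",
        # NOT VALID skips checking existing rows when the constraint is added.
        f"ALTER TABLE {table} ADD CONSTRAINT {check} CHECK ({column} IS NOT NULL) NOT VALID",
        # Validation scans the table, but under a weaker lock than a rewrite.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {check}",
        # With the validated CHECK in place, this can skip its own full scan.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL",
    ]


def timed(run: Callable[[str], None], statement: str) -> float:
    """Execute one statement via `run` and return the elapsed seconds."""
    start = time.monotonic()
    run(statement)
    return time.monotonic() - start
```

Here `run` would be whatever executes SQL against the test database, for example a cursor's `execute` method.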

      • [1997kB] has quit
      • 2021-07-05 18631, 2021

      • outsidecontext_
        alastairp: is this intentional or an oversight? https://tickets.metabrainz.org/projects/AB/issues…
      • 2021-07-05 18632, 2021

      • ruaok
        alastairp: what exactly is broken about the public LB dumps?
      • 2021-07-05 18632, 2021

      • BrainzBot
        AB-460: API: Missing feature tonal.chords_changes_rate
      • 2021-07-05 18639, 2021

      • ruaok
        I see data.
      • 2021-07-05 18631, 2021

      • ruaok
        wow. I just ran a query on timescale to extract spotify recording IDs from the new mapping.
      • 2021-07-05 18637, 2021

      • ruaok
        anyone wanna guess how many rows it has?
      • 2021-07-05 18643, 2021

      • reosarevok
        9387579!
      • 2021-07-05 18649, 2021

      • reosarevok
        (I might have generated a random number)
      • 2021-07-05 18601, 2021

      • ruaok
        you might be withing 20%
      • 2021-07-05 18603, 2021

      • ruaok
        -g
      • 2021-07-05 18611, 2021

      • ruaok
        11M rows!
      • 2021-07-05 18617, 2021

      • ruaok
        but, those are mapped against MSIDs.
      • 2021-07-05 18625, 2021

      • ruaok
        meaning it contains loads of dupes
      • 2021-07-05 18658, 2021

      • ruaok
        1.4M unique recordings.
      • 2021-07-05 18621, 2021

      • ruaok
        lucifer: alastairp : quick ponderance.... for the parquet based LB dumps intended to be imported into spark...
      • 2021-07-05 18606, 2021

      • ruaok
        those are mostly intended as internal use. does it make sense to spend all that time XZ compressing them just to move them to another server at the same datacenter?
      • 2021-07-05 18611, 2021

      • ruaok
        (or cluster of datacenters)
      • 2021-07-05 18618, 2021

      • ruaok
        I'm inclined to not compress at ALL.
      • 2021-07-05 18604, 2021

      • lucifer
        sure i think we can get away without compressing to xz.
      • 2021-07-05 18627, 2021

      • lucifer
        also, parquet files by default use "snappy" compression iirc so it might already be comparable to compressed json anyway.
      • 2021-07-05 18656, 2021

      • ruaok
        oh. well, that makes everything easier then.
      • 2021-07-05 18601, 2021

      • ruaok
        was it 64MB chunks?
      • 2021-07-05 18605, 2021

      • ruaok
        I wonder how to estimate that.
      • 2021-07-05 18635, 2021

      • lucifer
        128 MB chunks or a bit less than that.
      • 2021-07-05 18644, 2021

      • ruaok
        k.
      • 2021-07-05 18600, 2021

      • ruaok
        if there is compression in the mix, I'll have to play with it to see if I can get close without going over.
      • 2021-07-05 18635, 2021
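
One way to "get close without going over" is to write a small trial file, measure its compressed size on disk, and derive a row count per chunk from that. The sketch below is a rough estimate only: it assumes the compression ratio of the trial file is representative of the rest of the data, and the function name and the 5% safety margin are hypothetical.

```python
import math


def rows_per_chunk(sample_rows: int, sample_file_bytes: int,
                   target_bytes: int = 128 * 1024 * 1024,
                   safety_margin: float = 0.05) -> int:
    """Estimate how many rows fit in one file without exceeding target_bytes.

    sample_file_bytes is the on-disk (already compressed) size of a trial
    file holding sample_rows rows.
    """
    if sample_rows <= 0 or sample_file_bytes <= 0:
        raise ValueError("sample must be non-empty")
    bytes_per_row = sample_file_bytes / sample_rows
    # Leave a margin because the compression ratio varies between chunks.
    budget = target_bytes * (1 - safety_margin)
    return math.floor(budget / bytes_per_row)
```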

      • ruaok laughs at the thought that his first hard drive was 30MB large
      • 2021-07-05 18637, 2021

      • reosarevok
      • 2021-07-05 18638, 2021

      • BrainzBot
        MBS-11767: Track-level artists that differ from the release artist are no longer shown on multi-disc releases that aren't fully loaded
      • 2021-07-05 18648, 2021

      • reosarevok
        I took a quick look but I'm not sure why medium is not being detected as changed by useMemo
      • 2021-07-05 18641, 2021

      • Sophist-UK joined the channel
      • 2021-07-05 18641, 2021

      • Sophist-UK has quit
      • 2021-07-05 18641, 2021

      • Sophist-UK joined the channel
      • 2021-07-05 18638, 2021

      • Sophist_UK has quit
      • 2021-07-05 18643, 2021

      • lucifer
        ruaok: re lb public dumps, iiuc the `user` table schema of the public dumps is incorrect. we only import that table when there is no private dump so when we try to import public dump solely we get an error.
      • 2021-07-05 18605, 2021

      • ruaok
        ah
      • 2021-07-05 18657, 2021

      • BrainzGit
        [musicbrainz-android] 14akshaaatt opened pull request #81 (03master…patch-1): Update README.md https://github.com/metabrainz/musicbrainz-android…
      • 2021-07-05 18625, 2021

      • akshaaatt[m]
        lucifer: I updated the readme of the github project. Will add more changes soon but this seems like a good start.
      • 2021-07-05 18634, 2021

      • akshaaatt[m]
        We should add the website, topics and tags as well to the repository
      • 2021-07-05 18659, 2021

      • akshaaatt[m]
        <akshaaatt[m] "We should add the website, topic"> in the github about section
      • 2021-07-05 18616, 2021

      • ritiek joined the channel
      • 2021-07-05 18654, 2021

      • revi has quit
      • 2021-07-05 18606, 2021

      • revi joined the channel
      • 2021-07-05 18659, 2021

      • akashgp09 joined the channel
      • 2021-07-05 18621, 2021

      • lucifer
        akshaaatt[m]: we don't have a website for the app. adding topics and tags sounds good.
      • 2021-07-05 18638, 2021

      • akshaaatt[m]
      • 2021-07-05 18607, 2021

      • [1997kB] joined the channel
      • 2021-07-05 18620, 2021

      • ruaok
      • 2021-07-05 18614, 2021

      • alastairp
        ruaok: sorry, I had to pop out. did lucifer answer your question about public dumps?
      • 2021-07-05 18617, 2021

      • akshaaatt[m]
        <ruaok "https://juliareda.eu/2021/07/git"> This is really interesting!
      • 2021-07-05 18630, 2021

      • ruaok
        alastairp: y
      • 2021-07-05 18639, 2021

      • alastairp
        outsidecontext_: that's an oversight
      • 2021-07-05 18600, 2021

      • akshaaatt[m]
        I really wish it were a free plugin though. Anyway, open sourced plugins similar to this will float eventually.
      • 2021-07-05 18612, 2021

      • alastairp
        or more specifically, we made a list of things that we thought people might want to select, and that wasn't in our initial
      • 2021-07-05 18615, 2021

      • alastairp
        list
      • 2021-07-05 18622, 2021

      • lucifer
        akshaaatt[m]: i think we can use the MB android app page but that means we also have to maintain it at two places. let's finalize the details of the readme and see how we want to do it.
      • 2021-07-05 18639, 2021

      • akshaaatt[m]
        Okaaayyy boss!
      • 2021-07-05 18602, 2021

      • lucifer
        alastairp: sklearn training now takes ~5m after fixing the groundtruth path.
      • 2021-07-05 18613, 2021

      • alastairp
        🎉
      • 2021-07-05 18631, 2021

      • lucifer
        time to move to next step now :D
      • 2021-07-05 18645, 2021

      • alastairp
        did you find the model file?
      • 2021-07-05 18620, 2021

      • lucifer
        we have a lot of files in dataset directory of sklearn. checking for pkl file.
      • 2021-07-05 18645, 2021

      • lucifer
        yup we have it
      • 2021-07-05 18622, 2021

      • lucifer
        `/home/acousticbrainz/acousticbrainz-server/data/datasets/8f9c452b-6cef-4f36-a4c9-f2b29d4f167b/8f9c452b-6cef-4f36-a4c9-f2b29d4f167b/best_clf_model.pkl`
      • 2021-07-05 18640, 2021

      • alastairp
        great
      • 2021-07-05 18649, 2021

      • alastairp
        why is the uuid there twice?
      • 2021-07-05 18625, 2021

      • lucifer
        not sure, but that's part of the groundtruth path stuff. due to some reason, groundtruth path is used to calculate dataset_dir path.
      • 2021-07-05 18656, 2021

      • alastairp
        might be another error or perhaps an issue when selecting the groundtruth path, let's see if we can get rid of it
      • 2021-07-05 18605, 2021

      • lucifer
        i added a separate arg to avoid messing with it. i'll read through the code and work on simplifying it.
      • 2021-07-05 18619, 2021

      • alastairp
        ok, sounds great
      • 2021-07-05 18644, 2021

      • lucifer
        do we need to keep the standalone scripts?
      • 2021-07-05 18656, 2021

      • alastairp
        yes, I think they're useful to have
      • 2021-07-05 18604, 2021

      • lucifer
        👍
      • 2021-07-05 18659, 2021

      • lucifer
        another question unrelated to the PR, why is dataset eval page in react instead of jinja2?
      • 2021-07-05 18653, 2021

      • alastairp
        dataset editor is in react too
      • 2021-07-05 18609, 2021

      • alastairp
        having the editor in react was nice, as it made it interactive
      • 2021-07-05 18631, 2021

      • alastairp
        and so all of that part of the site is in react, as it was able to reuse code
      • 2021-07-05 18643, 2021

      • lucifer
        ah right. makes sense.
      • 2021-07-05 18653, 2021

      • lucifer
      • 2021-07-05 18608, 2021

      • alastairp
        nice!
      • 2021-07-05 18617, 2021

      • Lotheric_ joined the channel
      • 2021-07-05 18619, 2021

      • lucifer
        i just saw two different creation time formats and found one is from jinja2 and the other from react.
      • 2021-07-05 18627, 2021

      • alastairp
        ah, right
      • 2021-07-05 18637, 2021

      • alastairp
        yeah, there are a lot of react/data display tickets open
      • 2021-07-05 18601, 2021

      • lucifer
        ah! i see.
      • 2021-07-05 18614, 2021

      • lucifer
        i have also added the tool column here https://similarity.acousticbrainz.org/datasets/95…
      • 2021-07-05 18633, 2021

      • alastairp
        perfect
      • 2021-07-05 18612, 2021

      • lucifer
        also looked into failed status stuff, we have it already but are not catching all exceptions so sometimes it does not get updated.
      • 2021-07-05 18602, 2021

      • BrainzGit
        [acousticbrainz-server] 14alastair opened pull request #405 (03master…AB-460-chords_changes_rate): AB-460: Add tonal.chords_changes_rate to allowed lowlevel features https://github.com/metabrainz/acousticbrainz-serv…
      • 2021-07-05 18617, 2021

      • alastairp
        yes, I saw that. I think there's even a TODO saying to catch more exceptions, right? :)
      • 2021-07-05 18644, 2021

      • Lotheric has quit
      • 2021-07-05 18644, 2021

      • lucifer
        yup, how poetic.
      • 2021-07-05 18606, 2021
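
The status problem described above (a job is never marked as failed because some exceptions are not caught) is commonly handled with a broad handler around the whole job. The following is a minimal sketch, not the acousticbrainz-server code: the job dictionary, the status strings and the `evaluate` callable are all hypothetical.

```python
import traceback
from typing import Callable, Dict


def run_evaluation(job: Dict[str, str],
                   evaluate: Callable[[Dict[str, str]], None]) -> None:
    """Run one evaluation job and always record a final status on it."""
    job["status"] = "running"
    try:
        evaluate(job)
    except Exception:
        # Catch every ordinary exception so the job never stays "running".
        job["status"] = "failed"
        job["error"] = traceback.format_exc()
    else:
        job["status"] = "done"
```

Catching `Exception` rather than using a bare `except:` leaves `KeyboardInterrupt` and `SystemExit` alone, so the worker can still be stopped normally.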

      • outsidecontext_
        alastairp: thanks, makes sense. So I could submit a PR to add this
      • 2021-07-05 18614, 2021

      • alastairp
        outsidecontext_: ^ I just did :)
      • 2021-07-05 18613, 2021

      • outsidecontext_
        Thanks!
      • 2021-07-05 18639, 2021

      • alastairp
        I'm just finishing up a few other features that I hope to merge soon, so expect to see this available some time this week
      • 2021-07-05 18639, 2021

      • akshaaatt[m]
        <BrainzGit "[musicbrainz-android] akshaaatt "> lucifer: Changes made ✌️
      • 2021-07-05 18636, 2021

      • BrainzGit
        [musicbrainz-android] 14amCap1712 merged pull request #81 (03master…patch-1): Update README.md https://github.com/metabrainz/musicbrainz-android…
      • 2021-07-05 18606, 2021

      • lucifer
        akshaaatt[m]: thanks! looks much better now.
      • 2021-07-05 18609, 2021

      • lucifer
        !m akshaaatt[m]
      • 2021-07-05 18609, 2021

      • BrainzBot
        You're doing good work, akshaaatt[m]!
      • 2021-07-05 18611, 2021

      • akshaaatt[m]
        Sweet! 💯
      • 2021-07-05 18601, 2021

      • reosarevok
        bitmap: https://tickets.metabrainz.org/browse/MBS-11762 seems like another side effect of the assumption you had about the disc URLs
      • 2021-07-05 18601, 2021

      • BrainzBot
        MBS-11762: Medium toolbox missing on disc URLs
      • 2021-07-05 18615, 2021

      • reosarevok
        I'm not sure what's the best option with this
      • 2021-07-05 18647, 2021

      • ritiek has quit
      • 2021-07-05 18604, 2021

      • ritiek joined the channel
      • 2021-07-05 18614, 2021

      • lucifer
        meeting time? :)
      • 2021-07-05 18634, 2021

      • lucifer
        Freso: ping
      • 2021-07-05 18634, 2021

      • ruaok
        wanna take a stab at running the meeting until Freso appears, lucifer ?
      • 2021-07-05 18642, 2021

      • ruaok
        if not, I'm happy to kick things off.
      • 2021-07-05 18601, 2021

      • lucifer
        i haven't done that anytime before. better if you do it :D
      • 2021-07-05 18610, 2021

      • ruaok
        ok.
      • 2021-07-05 18614, 2021

      • lucifer
        thanks!
      • 2021-07-05 18614, 2021

      • ruaok
        <BANG>
      • 2021-07-05 18619, 2021

      • ruaok
        meeting time!