#metabrainz


      • khan____ joined the channel
      • 2017-05-24 14448, 2017

      • bitmap
        it dumps each entity type in a separate process right now, but I was thinking of making this smarter
      • 2017-05-24 14415, 2017

      • ruaok
        and all procs run in parallel?
      • 2017-05-24 14416, 2017

      • bitmap
        I mean, it finished all the other entity types within a few hours iirc
      • 2017-05-24 14424, 2017

      • bitmap
        correct
      • 2017-05-24 14426, 2017

      • ruaok
        that was my next question.
      • 2017-05-24 14441, 2017

      • ruaok
        so we only need to speed up recordings, not surprising.
      • 2017-05-24 14413, 2017

      • bitmap
        releases*
      • 2017-05-24 14423, 2017

      • ruaok
        ah yes. right.
      • 2017-05-24 14449, 2017

      • bitmap
        even less surprising since they include all the recordings :)
      • 2017-05-24 14455, 2017

      • ruaok
        that.
      • 2017-05-24 14401, 2017

      • ruaok
        which server does it run on?
      • 2017-05-24 14406, 2017

      • bitmap
        williams
      • 2017-05-24 14417, 2017

      • ruaok
        is compression done after the fact or does it happen streaming?
      • 2017-05-24 14425, 2017

      • bitmap
        after the fact
      • 2017-05-24 14448, 2017

      • ruaok
        what are your thoughts on making it smarter?
      • 2017-05-24 14421, 2017

      • ruaok
        divide table, run each section in thread, in the end glue files together, compress?
      • 2017-05-24 14431, 2017

      • bitmap
        well, at this point it's just a single process handling the entire release dump
      • 2017-05-24 14439, 2017

      • bitmap
        yeah, I guess something of that sort
      • 2017-05-24 14409, 2017

      • ruaok
        ok, let's let this process finish and get the first dump out there.
      • 2017-05-24 14417, 2017

      • bitmap
        though s/thread/process/ since perl doesn't do threads well or at all :/
      • 2017-05-24 14430, 2017

      • ruaok
        same diff.
      • 2017-05-24 14437, 2017
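
The "divide table, run each section in a process, glue files together, compress" idea discussed above could look roughly like the following. This is only a minimal sketch in Python, not the Perl dumper's actual code: `fetch_rows`, the file naming, and the JSON-lines layout are all hypothetical stand-ins, and the split assumes releases can be partitioned by a numeric ID range.

```python
import gzip
import json
import os
import tempfile
from multiprocessing import Pool


def split_ranges(min_id, max_id, parts):
    """Split the inclusive ID range [min_id, max_id] into up to `parts` contiguous chunks."""
    if max_id < min_id or parts < 1:
        return []
    total = max_id - min_id + 1
    size = -(-total // parts)  # ceiling division
    return [(lo, min(lo + size - 1, max_id)) for lo in range(min_id, max_id + 1, size)]


def fetch_rows(lo, hi):
    """Hypothetical stand-in for the database query that would load releases lo..hi."""
    return ({"id": i, "title": f"release {i}"} for i in range(lo, hi + 1))


def dump_chunk(args):
    """Worker process: write one ID range to its own JSON-lines file."""
    lo, hi, out_dir = args
    path = os.path.join(out_dir, f"release-{lo:012d}.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for row in fetch_rows(lo, hi):
            f.write(json.dumps(row) + "\n")
    return path


def parallel_dump(min_id, max_id, out_path, workers=4):
    """Dump chunks in separate processes, then glue them together in ID order and compress."""
    with tempfile.TemporaryDirectory() as tmp:
        jobs = [(lo, hi, tmp) for lo, hi in split_ranges(min_id, max_id, workers)]
        with Pool(workers) as pool:
            chunk_paths = pool.map(dump_chunk, jobs)  # map returns results in job order
        with gzip.open(out_path, "wt", encoding="utf-8") as out:
            for path in chunk_paths:
                with open(path, encoding="utf-8") as f:
                    for line in f:
                        out.write(line)
    return out_path
```

On platforms that start workers by spawning a fresh interpreter, `parallel_dump` should be called from under an `if __name__ == "__main__":` guard. Compression here still happens after the fact, as in the current setup; only the row fetching and serialization are spread across processes.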

      • alastairp
        that sounds like a nice big dump
      • 2017-05-24 14443, 2017

      • ruaok
        ewww
      • 2017-05-24 14447, 2017

      • alastairp
        hey
      • 2017-05-24 14453, 2017

      • alastairp
        your mind went there, not mine
      • 2017-05-24 14458, 2017

      • ruaok
        yup
      • 2017-05-24 14410, 2017

      • ruaok
        how are the incremental dumps coming?
      • 2017-05-24 14415, 2017

      • alastairp
        you should take a look at my big dump, etc
      • 2017-05-24 14444, 2017

      • ruaok
        bitmap: how are the incremental dumps coming?
      • 2017-05-24 14456, 2017

      • bitmap
        oh, sorry
      • 2017-05-24 14407, 2017

      • Zastai joined the channel
      • 2017-05-24 14429, 2017

      • bitmap
        those aren't actually running yet because it can't work until at least one full dump is generated
      • 2017-05-24 14440, 2017

      • ruaok
        but the code is ready?
      • 2017-05-24 14446, 2017

      • bitmap
        otherwise it has nothing to compare changes against
      • 2017-05-24 14459, 2017

      • bitmap
        yep, I have tests written for it and it's working there, at least
      • 2017-05-24 14405, 2017

      • ruaok
        ok, good.
      • 2017-05-24 14428, 2017

      • ruaok
        where is the code for the JSON dumper?
      • 2017-05-24 14435, 2017

      • ruaok
        I wanna have a quick look at it.
      • 2017-05-24 14405, 2017

      • bitmap
      • 2017-05-24 14426, 2017

      • ruaok
        thx
      • 2017-05-24 14449, 2017

      • bitmap
        the latest commit there has most of it, I need to split that commit up before I make a PR
      • 2017-05-24 14458, 2017

      • ruaok
        ok.
      • 2017-05-24 14421, 2017

      • ruaok
        finally, I think we need to make a few changes to the metabrainz site in order to serve the incremental dumps.
      • 2017-05-24 14432, 2017

      • ruaok
        can you please coordinate with Gentlecat to get that accomplished?
      • 2017-05-24 14437, 2017

      • ruaok
        if you haven't already...
      • 2017-05-24 14405, 2017

      • ruaok
        huh. postgres is the bottleneck in this.
      • 2017-05-24 14450, 2017

      • Gentlecat
        sure, I can help with that
      • 2017-05-24 14400, 2017

      • ruaok
        great, thanks, Gentlecat
      • 2017-05-24 14453, 2017

      • ruaok
        bitmap: a thought for later... I wonder if we can "replay the replication packets" on top of the last dump to generate the next dump.
      • 2017-05-24 14453, 2017

      • khan____ has quit
      • 2017-05-24 14408, 2017

      • bitmap
        it kinda does something like that, though not fully since it doesn't edit the dumps directly
      • 2017-05-24 14433, 2017

      • bitmap
        all the json is stored in postgres, and the packets just update the json there
      • 2017-05-24 14447, 2017

      • bitmap
        so when the next dump is generated, it'll get it all from pg instead of the WS
      • 2017-05-24 14448, 2017
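
As a rough illustration of what bitmap describes above (entity JSON kept in a table, packets updating that JSON, and the next dump read straight from the table), here is a minimal Python sketch. The `JsonCache` class and the packet shape, a list of `(operation, entity_type, entity_id, document)` tuples, are assumptions made for this example; real replication packets are structured differently and the real store is Postgres, not a dict.

```python
import json


class JsonCache:
    """In-memory stand-in for a table of pre-rendered entity JSON keyed by (type, id)."""

    def __init__(self):
        self.rows = {}

    def apply_packet(self, packet):
        """Apply one packet: a list of (op, entity_type, entity_id, document) changes."""
        for op, entity_type, entity_id, doc in packet:
            key = (entity_type, entity_id)
            if op == "upsert":
                self.rows[key] = doc
            elif op == "delete":
                self.rows.pop(key, None)
            else:
                raise ValueError(f"unknown operation: {op}")

    def dump(self, entity_type):
        """Return a JSON-lines dump for one entity type, read only from the cache."""
        keys = sorted(k for k in self.rows if k[0] == entity_type)
        return "".join(json.dumps(self.rows[k], sort_keys=True) + "\n" for k in keys)
```

The point of the design is that generating a later dump never re-renders unchanged entities: only the rows touched by a packet are rewritten, and `dump` is a plain read of the stored documents.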

      • ruaok
        I'm glad you picked williams to see how much of an impact this is.
      • 2017-05-24 14458, 2017

      • ruaok
        !m bitmap
      • 2017-05-24 14458, 2017

      • BrainzBot
        You're doing good work, bitmap!
      • 2017-05-24 14402, 2017

      • ruaok
        very good.
      • 2017-05-24 14410, 2017

      • ruaok
        so, really the first dump is our worst case.
      • 2017-05-24 14419, 2017

      • bitmap
        yeah, hopefully
      • 2017-05-24 14436, 2017

      • ruaok
        I think we ought to periodically re-do the dumps. if there are any sort of errors, they will propagate.
      • 2017-05-24 14405, 2017

      • alastairp
        does it actually hit a real webservice over http, or just call the equivalent method in the server code to get the data?
      • 2017-05-24 14416, 2017

      • bitmap
        the latter
      • 2017-05-24 14422, 2017

      • alastairp
        ah, cool :)
      • 2017-05-24 14435, 2017

      • bitmap
        ruaok: agreed
      • 2017-05-24 14416, 2017

      • ruaok
        well done! this is clearly in good hands. :)
      • 2017-05-24 14451, 2017

      • CatQuest
        sooo many more maori instruments than i expected @__@
      • 2017-05-24 14451, 2017

      • CatQuest
        i've got like 3-4 left now and I'll be dooone!
      • 2017-05-24 14414, 2017

      • samj1912 joined the channel
      • 2017-05-24 14419, 2017

      • gcilou joined the channel
      • 2017-05-24 14409, 2017

      • asdofindia has quit
      • 2017-05-24 14426, 2017

      • hibiscuskazeneko joined the channel
      • 2017-05-24 14429, 2017

      • asdofindia joined the channel
      • 2017-05-24 14448, 2017

      • hibiscuskazeneko has quit
      • 2017-05-24 14433, 2017

      • kyan has quit
      • 2017-05-24 14439, 2017

      • arbenina_ has quit
      • 2017-05-24 14429, 2017

      • ruaok
        FYI: this is the focus of a DMCA takedown request we've gotten: https://musicbrainz.org/collection/f20b0777-105e-…
      • 2017-05-24 14458, 2017

      • ruaok
        oh heh, we already got to it. very good!
      • 2017-05-24 14404, 2017

      • lazka joined the channel
      • 2017-05-24 14401, 2017

      • Leftmost has quit
      • 2017-05-24 14428, 2017

      • Leftmost joined the channel
      • 2017-05-24 14442, 2017

      • samj1912
        qt 5.9 RC was released
      • 2017-05-24 14458, 2017

      • alastairp
        gcilou: Rachel Bittner says hi
      • 2017-05-24 14408, 2017

      • alastairp
        We're having a drink in Barcelona
      • 2017-05-24 14418, 2017

      • agentsim
        ruaok: How can anything MB does be the target of a DMCA takedown... doesn't that require copyright infringement to exist?
      • 2017-05-24 14402, 2017

      • anthony25 has quit
      • 2017-05-24 14442, 2017

      • anthony25 joined the channel
      • 2017-05-24 14444, 2017

      • spuniun
        can anyone point me to a reasonable example of a systemd script for musicbrainz?
      • 2017-05-24 14419, 2017

      • spuniun
        I can't figure out how to get the @INC sources when run from systemd
      • 2017-05-24 14413, 2017

      • gcilou
        alastairp: oh wow I say hi back :)
      • 2017-05-24 14415, 2017

      • ZarkBit has quit
      • 2017-05-24 14431, 2017

      • CatQuest
        gcilou: hi!
      • 2017-05-24 14422, 2017

      • gcilou
        CatQuest: hey!
      • 2017-05-24 14444, 2017

      • hibiscuskazeneko joined the channel
      • 2017-05-24 14432, 2017

      • reosarevok
        agentsim: spammers posting links to illegal film and whatnot downloads or streams or something
      • 2017-05-24 14441, 2017

      • mselby joined the channel
      • 2017-05-24 14444, 2017

      • agentsim
        ahh :)
      • 2017-05-24 14457, 2017

      • mselby has quit
      • 2017-05-24 14456, 2017

      • UmkaDK has quit
      • 2017-05-24 14445, 2017

      • SothoTalker_
        ruaok: hello ^-^
      • 2017-05-24 14405, 2017

      • mselby joined the channel
      • 2017-05-24 14436, 2017

      • mselby has quit
      • 2017-05-24 14400, 2017

      • ZarkBit joined the channel
      • 2017-05-24 14439, 2017

      • danimal4 joined the channel
      • 2017-05-24 14422, 2017

      • UmkaDK joined the channel
      • 2017-05-24 14423, 2017

      • D4RK-PH0ENiX has quit
      • 2017-05-24 14455, 2017

      • D4RK-PH0ENiX joined the channel
      • 2017-05-24 14432, 2017

      • D4RK-PH0ENiX has quit
      • 2017-05-24 14400, 2017

      • D4RK-PH0ENiX joined the channel
      • 2017-05-24 14429, 2017

      • arbenina_ joined the channel
      • 2017-05-24 14427, 2017

      • D4RK-PH0ENiX has quit
      • 2017-05-24 14438, 2017

      • samj1912 has quit
      • 2017-05-24 14433, 2017

      • D4RK-PH0ENiX joined the channel
      • 2017-05-24 14448, 2017

      • arbenina_ has quit
      • 2017-05-24 14456, 2017

      • lazka has quit
      • 2017-05-24 14450, 2017

      • ruaok
        moin SothoTalker_
      • 2017-05-24 14455, 2017

      • ruaok
        what's news?
      • 2017-05-24 14407, 2017

      • ruaok
        reosarevok: I'll answer the latest support@ mail -- sounds cool.
      • 2017-05-24 14422, 2017

      • SothoTalker_
        ruaok: you did not read what i wrote yesterday evening? :)
      • 2017-05-24 14432, 2017

      • ruaok
        clearly not no. :(
      • 2017-05-24 14437, 2017

      • reosarevok
        ruaok: Oooooh NWR/CRI
      • 2017-05-24 14440, 2017

      • reosarevok
        awesome
      • 2017-05-24 14407, 2017

      • ruaok
        yeah.
      • 2017-05-24 14420, 2017

      • SothoTalker_
        <SothoTalker_> ruaok: some of those accounts with >500 chars in their profiles are definitely spammers, others are just... i don't know.
      • 2017-05-24 14422, 2017

      • ruaok
        time to erm...
      • 2017-05-24 14425, 2017

      • SothoTalker_
        <SothoTalker_> ruaok: should i report the few that are definitive spammers or will that be taken care of automagically later?
      • 2017-05-24 14427, 2017

      • ruaok
        write geordi++
      • 2017-05-24 14458, 2017

      • reosarevok
        "as efficiently as possible"
      • 2017-05-24 14403, 2017

      • reosarevok
        oh that sounds like not-excel-files!
      • 2017-05-24 14406, 2017

      • ruaok
        SothoTalker_: let yvanzo and bitmap take care of the first two/three batches of cleanups.
      • 2017-05-24 14414, 2017

      • ruaok
        then we'll regroup and have another look.
      • 2017-05-24 14440, 2017

      • SothoTalker_
        sure :)
      • 2017-05-24 14404, 2017

      • SothoTalker_
        those were the people with >500-char bios but no edits
      • 2017-05-24 14448, 2017

      • ruaok
        and a large chunk of those are registered to spammer domains, so it is easy to clean them up with a script and not waste human time on it.
      • 2017-05-24 14404, 2017
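
The triage described above (long bio, no edits, and optionally a known spammer email domain) can be sketched as a small filter. This is a hypothetical illustration in Python, not the cleanup script yvanzo and bitmap are running: the `Account` fields, the `SPAM_DOMAINS` set, and the three outcome labels are all assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    email: str
    bio: str
    edit_count: int


# Hypothetical blocklist; a real list of spammer domains would be maintained elsewhere.
SPAM_DOMAINS = {"spam.example"}


def email_domain(email):
    """Return the lowercased part of the address after the last '@'."""
    return email.rsplit("@", 1)[-1].lower()


def classify(account, bio_limit=500):
    """Return "delete", "review", or "keep" for one account."""
    # Accounts with edits, or with short bios, are outside this cleanup batch.
    if account.edit_count > 0 or len(account.bio) <= bio_limit:
        return "keep"
    # Long bio, no edits, registered at a known spammer domain: safe to script.
    if email_domain(account.email) in SPAM_DOMAINS:
        return "delete"
    # Long bio and no edits, but nothing else to go on: leave for a human.
    return "review"
```

Only the "delete" bucket would be handled automatically; the "review" bucket corresponds to the accounts that are "just... i don't know" and still need a person to look at them.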

      • SothoTalker_
        yup :)
      • 2017-05-24 14435, 2017

      • UmkaDK has quit
      • 2017-05-24 14408, 2017

      • sueastside joined the channel
      • 2017-05-24 14434, 2017

      • github joined the channel
      • 2017-05-24 14434, 2017

      • github
        [picard-plugins] MetaTunes closed pull request #94: New plugin - workparts (master...master) https://git.io/vS6CN
      • 2017-05-24 14434, 2017

      • github has left the channel
      • 2017-05-24 14404, 2017

      • gcilou has quit
      • 2017-05-24 14410, 2017

      • drsaunders joined the channel
      • 2017-05-24 14408, 2017

      • Zastai has quit
      • 2017-05-24 14424, 2017

      • danimal4 has left the channel
      • 2017-05-24 14458, 2017

      • rdswift has quit