#musicbrainz-devel

      • CallerNo6
        I expect the data to prove conclusively that there's only one Sting song.
      • ianmcorvidae
        haha
      • CallerNo6
        I mean, it's an okay song. Don't get me wrong.
      • Nyanko-sensei joined the channel
      • JesseW joined the channel
      • kepstin-laptop__ joined the channel
      • kepstin-laptop
        so, when running abzsubmit, I occasionally get sqlite database locking errors
      • kepstin-laptop opens https://github.com/MTG/acousticbrainz-client/issues/13
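A minimal sketch of one common way to tolerate SQLite "database is locked" errors: give the connection a busy timeout and retry the write a few times. How the client actually uses SQLite is not shown in this log, so the function name `execute_with_retry` and the retry parameters are assumptions for illustration only.

```python
import sqlite3
import time


def execute_with_retry(db_path, sql, params=(), retries=5, delay=0.1):
    """Run one write statement, retrying if SQLite reports the database as locked.

    Hypothetical helper; returns the number of attempts that were needed.
    """
    for attempt in range(retries):
        # timeout= makes SQLite wait up to that many seconds for a lock
        # before it raises OperationalError.
        conn = sqlite3.connect(db_path, timeout=10.0)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(sql, params)
            return attempt + 1
        except sqlite3.OperationalError as exc:
            # Re-raise anything that is not a lock error, or the final failure.
            if "locked" not in str(exc) or attempt == retries - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff
        finally:
            conn.close()
```

The busy timeout alone often covers short-lived contention; the retry loop only matters when another process holds the lock for longer than the timeout.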
      • MightyJay_ joined the channel
      • demonimin_ joined the channel
      • Mineo_ joined the channel
      • zas_ joined the channel
      • michiwend joined the channel
      • DjSlash_ joined the channel
      • michiwend_ joined the channel
      • nikki_ joined the channel
      • legoktm joined the channel
      • Gentlecat joined the channel
      • so, 92k unique recordings in abz now
      • JesseW joined the channel
      • ianmcorvidae
        yup
      • I should do those graphs, lossy stuff has been growing a lot more lately
      • kepstin-laptop
        my lossless stuff hasn't finished yet
      • this data set's gonna be a bit more weighted towards japanese pop than most, i think ;)
      • ianmcorvidae
        haha
      • probably got more estonian hip-hop than the average dataset just by my 6 CDs worth :P
      • kepstin-laptop
        well, it can only improve the results, right?
      • ianmcorvidae
        yup!
      • I was thinking I should write a crappy recommender to kick us off
      • something obviously terrible like levenshtein distance of the JSON :P
      • kepstin-laptop
        hmm, something with just the low-level data? Could do something silly like just match bpm and key
      • you like this song in C# major at 140bpm, so you'll obviously like this other one too!
      • ianmcorvidae
        that's far more sophisticated than I was thinking XD
      • I mean, I'm really thinking in the vein of making a truly terrible recommender that anyone can do better than, because I want to goad them into doing so :P
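The "match bpm and key" idea above could be sketched roughly as follows. This is a deliberately naive illustration, not anything the project shipped; the flat dictionary fields (`id`, `bpm`, `key`, `scale`) are hypothetical and do not reflect the real layout of the low-level JSON.

```python
def naive_recommend(seed, candidates, bpm_tolerance=5.0):
    """Return ids of candidates in the seed's key and scale with a similar tempo."""
    matches = []
    for track in candidates:
        same_key = (track["key"], track["scale"]) == (seed["key"], seed["scale"])
        close_bpm = abs(track["bpm"] - seed["bpm"]) <= bpm_tolerance
        # Skip the seed itself so a track never recommends itself.
        if same_key and close_bpm and track["id"] != seed["id"]:
            matches.append(track["id"])
    return matches


seed = {"id": "a", "bpm": 140.0, "key": "C#", "scale": "major"}
library = [
    seed,
    {"id": "b", "bpm": 142.0, "key": "C#", "scale": "major"},
    {"id": "c", "bpm": 140.0, "key": "A", "scale": "minor"},
    {"id": "d", "bpm": 90.0, "key": "C#", "scale": "major"},
]
recommended = naive_recommend(seed, library)
```

With this sample data only track `b` survives both filters, which is about as terrible a recommender as the conversation asks for.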
      • CallerNo6
        listeners who like songs with "satan" in the title will probably like other songs with "satan" in the title?
      • kepstin-laptop wonders if there's something really silly and easy you could do which would on average perform worse than random matching.
      • ianmcorvidae
        hah
      • CallerNo6
        I've been assured that nobody's smart enough to be wrong all the time. But it can't hurt to try?
      • kepstin-laptop
        doesn't have to be all the time
      • just on average :)
      • (if you actually got it wrong all the time, you could presumably just flip your rating and get something actually useful)
      • CallerNo6
        hence the expression :-)
      • KillDaBOB_ joined the channel
      • ijabz1 joined the channel
      • ianmcorvidae
        past 100k uniques! :D
      • ruaok
        \ø/
      • djp joined the channel
      • yeeeargh joined the channel
      • alastairp: do you have a sec to talk about jesus christ your lord and saviour?
      • er wait.
      • how about the schema for the highlevel table? :)
      • alastairp
        I can see how you might confuse them
      • ruaok
        in particular I'm thinking of what version info we should track.
      • alastairp
        they're both world-changing
      • ruaok
        heh. :)
      • alastairp
        are you at the lab, or will we do it here?
      • ruaok
        here. mom is in town and I only have half days while aleta baby-sits mom.
      • ruaok wishes he was in the lab
      • alastairp
        I don't know what features or algorithms high-level will be in the output
      • ruaok
        yeah, that too.
      • so, my inclination is to store: json, timestamp and essentia_git_sha
      • since, I am thinking that only the AB server should ever calculate high level stuff.
      • is that even a reasonable assumption?
      • alastairp
        split per algorithm?
      • ruaok
        ideally, but I just don't know if the essentia codebase is really ready for that.
      • I think we may just need to start with one version and get a move on.
      • the good thing is that we can re-calculate this at any time.
      • alastairp
        right. that'd be a good start then
      • ruaok
        ok, I'll get moving on that.
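A rough sketch of the table described above, with the three proposed columns (`json`, `timestamp`, `essentia_git_sha`). SQLite is used here only so the example is self-contained; the table name `highlevel`, the `mbid` key column, and both helper functions are assumptions, not the schema that was eventually committed.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE highlevel (
        mbid             TEXT NOT NULL,  -- hypothetical key for the recording
        json             TEXT NOT NULL,  -- high-level output as a JSON string
        timestamp        TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        essentia_git_sha TEXT NOT NULL   -- Essentia revision that produced it
    )
    """
)


def store_highlevel(conn, mbid, data, git_sha):
    """Insert one high-level result computed by the given Essentia revision."""
    with conn:
        conn.execute(
            "INSERT INTO highlevel (mbid, json, essentia_git_sha) VALUES (?, ?, ?)",
            (mbid, json.dumps(data), git_sha),
        )


def stale_rows(conn, current_sha):
    """List recordings whose data came from a different Essentia revision."""
    rows = conn.execute(
        "SELECT mbid FROM highlevel WHERE essentia_git_sha != ? ORDER BY mbid",
        (current_sha,),
    )
    return [row[0] for row in rows]
```

Recording the git SHA alongside each row is what makes the "we can re-calculate this at any time" point workable: anything returned by `stale_rows` is a candidate for recomputation.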
      • any signs of dima?
      • alastairp
        if there are many algorithms, there's no difference between 1 binary that spits out lots of bits of json, and many binaries that each spit out their own
      • no, but he normally does afternoons, I think
      • I'll try and grab him as soon as I can
      • ruaok returns from a mom interruption
      • I have to put out some ssl fires on freesound first, but back to this asap
      • ruaok
        ah yes.
      • LordSputnik joined the channel
      • ruaok_ joined the channel
      • Nyanko-sensei joined the channel
      • alastairp: got a moment for a quick sanity check on https://github.com/metabrainz/acousticbrainz-se... ?
      • all high level related stuff only.
      • alastairp
        ah, I see. that split is pretty cool
      • do you want to do anything about highlevel_json / raw_json table names?
      • ruaok
        unsure.
      • we are not likely to need the split and view as we do for the lowlevel stuff.
      • first question is if ianmcorvidae intended for all the json to go into one table.
      • my gut instinct says to use two tables.
      • for scalability.
      • and then deciding on the names.
      • alastairp
        right
      • ruaok
        but ianmcorvidae is sleeping, right now.
      • but assuming you're ok with the columns in said tables, I'll press on for now.
      • changing table names during the review phase is easy.
      • combining tables less so, but I think having two tables is desirable.
      • we're not losing anything having separate tables.
      • alastairp
        yes, I think 2 is a good idea
      • otherwise, fine
      • ruaok
        ok, I'll keep moving then.
      • not sure I can get a PR up for the high level stuff today, but I'll try.
      • hm.
      • I'll build no locking support into the highlevel stuff.
      • I'm going to assume that there will be one master program that looks at the DB, determines which highlevel data needs to be calculated, fires off a thread that will then calculate the highlevel data.
      • it then takes the finished threads and stores their data into the DB.
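The single-master design described above might look something like this sketch: worker threads only compute, and the master alone writes results, so the storage layer needs no locking of its own. `compute_highlevel` is a hypothetical stand-in for the real extractor, and `store` is whatever callable writes to the database.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def compute_highlevel(lowlevel):
    """Hypothetical stand-in for the real high-level calculation."""
    return {"n_features": len(lowlevel)}


def run_master(pending, store, max_workers=4):
    """Compute high-level data for every pending item and store each result.

    pending: dict mapping a recording id to its low-level data.
    store:   callable(recording_id, result), called only from this thread.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(compute_highlevel, data): rec_id
            for rec_id, data in pending.items()
        }
        for future in as_completed(futures):
            # Workers never touch the database; the master stores each
            # result as its thread finishes.
            store(futures[future], future.result())


results = {}
run_master({"a": {"bpm": 140, "key": "C#"}, "b": {"bpm": 90}}, results.__setitem__)
```

The trade-off is that this only holds while exactly one master process runs; a second master would reintroduce the need for locking.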
      • Nyanko-sensei joined the channel
      • ardoRic
        does the vm update the musicbrainz-server code automatically, or should I check it out again ?
      • ruaok
        just do a git pull on it.
      • it doesn't update automatically
      • KillDaBOB_ joined the channel
      • chirlu` joined the channel
      • KillDaBOB joined the channel
      • Nyanko-sensei joined the channel
      • ijabz1 joined the channel
      • kepstin-laptop
        so, >100k recordings now :)
      • alastairp
        this is great. 10% of our target in 5 days
      • at this rate that'll be ~400k by the end of the month, so if we get more people running it in the coming week I think 500k or more is really doable
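The projection above is a straight-line extrapolation, which can be checked with a few lines. The date is not given in the log, so the 15 remaining days used below is an assumption, chosen because it is what the ~400k figure implies at a constant rate.

```python
def project_total(current, days_elapsed, days_remaining):
    """Extrapolate a total, assuming the average daily rate so far continues."""
    rate_per_day = current / days_elapsed
    return current + rate_per_day * days_remaining


# 100k recordings in 5 days is 20k per day.
# Assumed: about 15 days left in the month (not stated in the log).
projected = project_total(100_000, 5, 15)
```

At double the submission rate for the same 15 days the total would be 700k, so the "500k or more" estimate needs only a modest increase in participants.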
      • kepstin-laptop
        I've just about hit all the music I have now, though.
      • keeping the rate up probably really requires getting more people to run the tool :)
      • alastairp
        right, but the only reason we've not opened this up wider is that the tools still have problems
      • rob is confident, and I agree with him, that we can dump this tool on 2-4x as many people immediately
      • which will keep up our submission speed
      • kepstin-laptop has started to run it on the stuff he only has in lossy formats now
      • kepstin-laptop
        (which is a bunch of touhou arranges, mostly)
      • Nyanko-sensei joined the channel
      • ruaok
        in fact, I think we should start tapping people on the shoulders quietly and ask them to jump in.
      • alastairp
        right
      • ruaok
        we need to get derwin in on this.
      • nikki is still working on her stuff
      • nikki
        although when I'll be able to actually run it on *all* of my music is another question
      • ijabz1
        if we can get either an osx or windows version available soon, it will be a lot easier to get more users
      • nikki
        (right now I can't do korean stuff, because apparently linux has a bug in its support for korean filenames on hfs filesystems)
      • JesseW joined the channel
      • ruaok
        ijabz1: that is our goal for friday, if at all possible
      • ijabz1
        great
      • jesus2099_ joined the channel
      • alastairp
        i wish
      • LordSputnik
        btw, have about 12k lossless tracks for scanning - are there instructions anywhere? :)
      • yeeeargh
      • LordSputnik
        ok, will see what I can do later :)
      • ruaok
        LordSputnik: sweet.
      • LordSputnik has left the channel
      • hawke1 joined the channel
      • kepstin-laptop__ joined the channel
      • drsaunde
        ruaok: Not sure what you guys are doing but i'd be happy to help whenever
      • ruaok
        got flacs?
      • drsaunde
        no
      • ruaok
        even if lossy, anything helps at this point.
      • got linux?