#musicbrainz-devel

      • Guest51157 joined the channel
      • 2014-10-08 28130, 2014

      • PanPipes joined the channel
      • 2014-10-08 28116, 2014

      • PanPipes
        I have just found two Anthony Hopkins, shouldn't there only be one? http://musicbrainz.org/artist/a7166c3d-68b0-4301-… and http://musicbrainz.org/artist/bdcc5ad0-076c-442c-…
      • 2014-10-08 28158, 2014

      • nikki
        there aren't two, there are just two mbids for the same one
      • 2014-10-08 28111, 2014

      • PanPipes
        :(
      • 2014-10-08 28118, 2014

      • PanPipes
        is this common?
      • 2014-10-08 28124, 2014

      • reosarevok
        Sure
      • 2014-10-08 28130, 2014

      • PanPipes
        righto
      • 2014-10-08 28134, 2014

      • PanPipes
        thanks
      • 2014-10-08 28135, 2014

      • reosarevok
        When they get merged, they redirect
      • 2014-10-08 28105, 2014

      • CallerNo6 joined the channel
      • 2014-10-08 28134, 2014

      • PanPipes
        thanks
      • 2014-10-08 28135, 2014

      • PanPipes
        how come you don't do a full 301?
      • 2014-10-08 28138, 2014

      • ruaok joined the channel
      • 2014-10-08 28156, 2014

      • nikki
        because we haven't got round to it yet :)
      • 2014-10-08 28105, 2014

      • nikki
        too much to do, not enough people to do it all :(
      • 2014-10-08 28107, 2014
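For readers following the redirect question above: one way the "full 301" PanPipes asks about could work is to resolve a merged MBID to its surviving MBID and answer with a permanent redirect. The sketch below is only an illustration under assumptions. The merge table, the function names and the placeholder MBIDs are hypothetical and do not describe how MusicBrainz actually stores merges or serves artist pages.

```python
from typing import Dict, Optional, Tuple

# Hypothetical merge table: old MBID -> MBID of the entity it was merged into.
MERGED_INTO: Dict[str, str] = {
    "old-mbid-1": "canonical-mbid",
    "old-mbid-2": "old-mbid-1",  # merged twice, so the chain has two hops
}


def canonical_mbid(mbid: str, merged_into: Dict[str, str]) -> str:
    """Follow merge links until reaching an MBID that was never merged away."""
    seen = set()
    while mbid in merged_into:
        if mbid in seen:
            raise ValueError("cycle in merge table")
        seen.add(mbid)
        mbid = merged_into[mbid]
    return mbid


def artist_response(mbid: str, merged_into: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location header) for a request to /artist/<mbid>."""
    target = canonical_mbid(mbid, merged_into)
    if target != mbid:
        # The requested MBID was merged: send the client to the surviving one.
        return 301, f"/artist/{target}"
    return 200, None
```

With a permanent redirect, both URLs from the question would end up on the same canonical address instead of rendering the same artist under two different URLs.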

      • alastairp
        anyone know of differences between pytaglib and tagpy?
      • 2014-10-08 28125, 2014

      • alastairp
        - both python bindings to taglib, not sure if anyone has used one or both before and prefers one?
      • 2014-10-08 28153, 2014

      • nikki
        ah, http://tickets.musicbrainz.org/browse/MBS-4937 is the ticket for redirecting
      • 2014-10-08 28121, 2014

      • Mineo usually uses whichever has the latest release on pypi in such cases
      • 2014-10-08 28153, 2014

      • alastairp
        pytaglib it is
      • 2014-10-08 28122, 2014

      • alastairp
        except ubuntu has a package for tagpy and one for py3 for taglib :-P
      • 2014-10-08 28123, 2014

      • alastairp
        sigh
      • 2014-10-08 28116, 2014

      • adhawkins joined the channel
      • 2014-10-08 28137, 2014

      • CallerNo6 joined the channel
      • 2014-10-08 28128, 2014

      • ruaok
        looks about right, alastairp :)
      • 2014-10-08 28136, 2014

      • ruaok
        let's put it to use tomorrow. :)
      • 2014-10-08 28130, 2014

      • alastairp
        perfect
      • 2014-10-08 28102, 2014

      • ijabz1 joined the channel
      • 2014-10-08 28116, 2014

      • luks
        alastairp: is the test data/code that you used for your thesis at mcgill available publicly somewhere?
      • 2014-10-08 28148, 2014

      • alastairp
        the eval code?
      • 2014-10-08 28108, 2014

      • luks
        yep
      • 2014-10-08 28130, 2014

      • luks
        it's not important, but I was just curious as I'm working on a new fp system
      • 2014-10-08 28155, 2014

      • alastairp
        sure
      • 2014-10-08 28119, 2014

      • alastairp
      • 2014-10-08 28131, 2014

      • alastairp
        it'd be interesting to see if you can get it working :)
      • 2014-10-08 28139, 2014

      • alastairp
        there's lots of implicit knowledge in the setup
      • 2014-10-08 28142, 2014

      • luks
        the sample songs are not public right?
      • 2014-10-08 28159, 2014

      • luks
        so that I can reproduce the results
      • 2014-10-08 28118, 2014

      • Gentlecat joined the channel
      • 2014-10-08 28102, 2014

      • Freso
        ruaok: Any new and interesting command line programs having come out in the last hours since I took off? ;)
      • 2014-10-08 28123, 2014

      • Freso
        Oh, is it that fp-eval thing?
      • 2014-10-08 28140, 2014

      • Freso
        Oh, I guess not.
      • 2014-10-08 28152, 2014

      • ijabz1 joined the channel
      • 2014-10-08 28124, 2014

      • alastairp
        luks: sorry, no. It was internal to my department
      • 2014-10-08 28155, 2014

      • alastairp
        There's a chance I could set up the infrastructure again there and eval it for you
      • 2014-10-08 28139, 2014

      • alastairp
        Freso: basic cli is finished, we'll test integration tomorrow
      • 2014-10-08 28100, 2014

      • luks
        no, that's fine
      • 2014-10-08 28124, 2014

      • Freso
        alastairp: Alright. I'm looking forward to being allowed to toss data at you. ;)
      • 2014-10-08 28150, 2014

      • luks
        regarding the acousticbrainz thing, I was actually planning something like that in the past
      • 2014-10-08 28140, 2014

      • luks
        I basically shipped the first usable version of chromaprint/acoustid with the goal of using it to bootstrap any new data set I'll start collecting
      • 2014-10-08 28141, 2014

      • luks
        is the plan to implement something like that? (so that you can submit features without having MBIDs in tags)
      • 2014-10-08 28102, 2014

      • nikki probably has way more files if she can submit stuff without tags too
      • 2014-10-08 28129, 2014

      • LordSputnik joined the channel
      • 2014-10-08 28134, 2014

      • LordSputnik has left the channel
      • 2014-10-08 28152, 2014

      • JonnyJD_ joined the channel
      • 2014-10-08 28100, 2014

      • CatQuest joined the channel
      • 2014-10-08 28113, 2014

      • CatQuest has left the channel
      • 2014-10-08 28145, 2014

      • ianmcorvidae
        ruaok: load spike on totoro, not sure if possibly related to search changes earlier today. Seems to be dropping now but I can't really tell what caused it :/
      • 2014-10-08 28152, 2014

      • chirlu` joined the channel
      • 2014-10-08 28122, 2014

      • ianmcorvidae
        looks like the main spike corresponds with release group indexing, possibly
      • 2014-10-08 28108, 2014

      • ianmcorvidae
      • 2014-10-08 28122, 2014

      • ianmcorvidae
        release group indexing having started at 20:42 and finished at 21:00
      • 2014-10-08 28111, 2014

      • ianmcorvidae
        afterwards moved to release, started temporary table building at 21:00 and started the real indexing at 21:14
      • 2014-10-08 28120, 2014

      • ianmcorvidae
        (which is where it is now, of course)
      • 2014-10-08 28141, 2014

      • ianmcorvidae
        totoro was also swapping during this process, kicked up from ~2.7GB in swap to ~5.8, since down to 4.2
      • 2014-10-08 28110, 2014

      • alastairp
        nikki: oh, sure. If I finish v1 with tags tomorrow, I'll do fingerprint lookup and submit next week
      • 2014-10-08 28126, 2014

      • ianmcorvidae finally remembers to file http://tickets.musicbrainz.org/browse/SEARCH-394
      • 2014-10-08 28136, 2014

      • ruaok
        ianmcorvidae: interesting. it's done releasegroup several times today, so it's strange for it to be a problem now
      • 2014-10-08 28106, 2014

      • ruaok
        I'm starting to get the feeling that we're right on the edge of going into swap nearly all of the time.
      • 2014-10-08 28114, 2014

      • ruaok
      • 2014-10-08 28123, 2014

      • ruaok
        I think our plan to double ram is a good one.
      • 2014-10-08 28115, 2014

      • alastairp
        ruaok: at was quick
      • 2014-10-08 28118, 2014

      • alastairp
        That
      • 2014-10-08 28109, 2014

      • ruaok
        what was?
      • 2014-10-08 28137, 2014

      • alastairp
        Well, I assume you're back
      • 2014-10-08 28140, 2014

      • alastairp
        Home
      • 2014-10-08 28147, 2014

      • ruaok
        I have been. :)
      • 2014-10-08 28159, 2014

      • ruaok
        I also live really close and have a fast bike. :)
      • 2014-10-08 28116, 2014

      • ruaok
        and I even took time to harass 'merican bros who were peeing on our bakery.
      • 2014-10-08 28152, 2014

      • alastairp
        True, close
      • 2014-10-08 28111, 2014

      • ruaok
        so, flask is very finicky about parsing POST content and mime-types.
      • 2014-10-08 28125, 2014

      • ruaok
        the mime-type determines how I have to access the data.
      • 2014-10-08 28138, 2014

      • ruaok
        and we decided on a non-standard POST body.
      • 2014-10-08 28155, 2014

      • ruaok
        so, this makes flask kinda twitchy
      • 2014-10-08 28157, 2014

      • alastairp
        hmm
      • 2014-10-08 28113, 2014

      • ruaok
        not sure if setting a json mime type will solve the problem.
      • 2014-10-08 28126, 2014

      • ruaok
        because the whole body isn't JSON.
      • 2014-10-08 28134, 2014

      • ruaok
        it's two times JSON.
      • 2014-10-08 28147, 2014

      • alastairp
        yeah. Well, we could always wrap it in a list and make it Jason
      • 2014-10-08 28151, 2014

      • alastairp
        Gah ios
      • 2014-10-08 28155, 2014

      • ruaok
        lol
      • 2014-10-08 28109, 2014

      • ruaok
        then we're back to parsing the whole content on the server.
      • 2014-10-08 28115, 2014

      • ruaok
        which is probably not a big deal.
      • 2014-10-08 28120, 2014

      • alastairp
        Ah true
      • 2014-10-08 28121, 2014

      • ruaok
        making a single json document is fine by me, really.
      • 2014-10-08 28135, 2014

      • ruaok
        I think it makes overall standards compliance a little easier.
      • 2014-10-08 28151, 2014

      • ruaok
        (which will result in fewer pedant jira tickets too)
      • 2014-10-08 28134, 2014

      • alastairp
        but are we really being non compliant here?
      • 2014-10-08 28137, 2014

      • ruaok
        and then I can enforce that we need the proper mime-type set and be done with it.
      • 2014-10-08 28150, 2014

      • ruaok
        non standard, certainly
      • 2014-10-08 28154, 2014

      • alastairp
        I would have thought that if you're not urlencoding it, there's no definition of what a post body should be
      • 2014-10-08 28117, 2014

      • ruaok
        there isn't, as far as I know.
      • 2014-10-08 28134, 2014

      • hawke joined the channel
      • 2014-10-08 28145, 2014

      • ruaok
        but there are pedant geeks who will think that two json documents in a post is heresy. :)
      • 2014-10-08 28109, 2014

      • ruaok
        let me see what happens when I give it a json mime type
      • 2014-10-08 28151, 2014

      • nikki
        wouldn't making it a single json document also make it easier for people to work with in general?
      • 2014-10-08 28101, 2014

      • ruaok
        that's my general gist, yes.
      • 2014-10-08 28110, 2014

      • nikki
        (and not just pleasing the pedants)
      • 2014-10-08 28146, 2014

      • ruaok
        I think examining the difference between those two is pedantic, but never mind me...
      • 2014-10-08 28150, 2014

      • ruaok
        ;)
      • 2014-10-08 28113, 2014

      • nikki
        this *is* mb :P
      • 2014-10-08 28120, 2014

      • ruaok
        exactly.
      • 2014-10-08 28100, 2014

      • ruaok
        alastairp: thoughts?
      • 2014-10-08 28119, 2014

      • ruaok
        we could add a top level key called "metadata" to the essentia generated data.
      • 2014-10-08 28123, 2014

      • ruaok
        and we nuke that before saving.
      • 2014-10-08 28117, 2014

      • alastairp
        yeah, it would make it easier before saving
      • 2014-10-08 28133, 2014

      • alastairp
        I mean, it would make it easier
      • 2014-10-08 28147, 2014

      • alastairp
        except at the expense of effort at the other end
      • 2014-10-08 28158, 2014

      • alastairp
        "you will generate data and then modify it and then submit it"
      • 2014-10-08 28113, 2014

      • ruaok
        yeah, kinda dumb too.
      • 2014-10-08 28123, 2014

      • alastairp
        I'm not sure that's any different than "you will generate data and then make some more data and then submit both of them together"
      • 2014-10-08 28128, 2014

      • ruaok
        but I suspect that it's a very minor expense.
      • 2014-10-08 28106, 2014

      • ruaok
        how about we put the extra things in to X-your-mom-was-header headers of the HTTP request?
      • 2014-10-08 28106, 2014

      • alastairp
        well for me, any one of: as we have it now, single json document, json list with 2 documents would be fine
      • 2014-10-08 28110, 2014

      • ruaok runs away fast
      • 2014-10-08 28126, 2014

      • alastairp
        that's not much better than our url ideas
      • 2014-10-08 28144, 2014

      • ruaok
        clearly not. :)
      • 2014-10-08 28100, 2014

      • alastairp
        though, if you're going back to parsing the request on the server side, the only additional info we need is the extractor hash
      • 2014-10-08 28113, 2014

      • ruaok
        wasn't there some advisory about top level lists in JSON being a security problem?
      • 2014-10-08 28135, 2014

      • ruaok
        and the lossless flag
      • 2014-10-08 28149, 2014

      • ruaok
        it can influence what query I run
      • 2014-10-08 28115, 2014

      • ruaok
        so, if it goes in the URL, we need MBID, lossless and build_sha1
      • 2014-10-08 28133, 2014

      • ruaok
        which isn't the end of the world, but it isn't pretty
      • 2014-10-08 28150, 2014

      • alastairp
        lossless is annoying. I could live with the other two
      • 2014-10-08 28107, 2014

      • alastairp
        but it's really data that should be linked with the features
      • 2014-10-08 28135, 2014

      • ruaok
        true, then we're back to parsing the json. :)
      • 2014-10-08 28151, 2014

      • alastairp
        ah, right
      • 2014-10-08 28135, 2014

      • ruaok
        so, examining the impact of parsing JSON vs memory issues that an overloaded server is going to run into, I suspect that we're prematurely optimizing this issue.
      • 2014-10-08 28140, 2014

      • ruaok
        we're parsing 60k of JSON.
      • 2014-10-08 28141, 2014

      • alastairp
        because by putting them into the url/header again we don't need to parse the doc again
      • 2014-10-08 28143, 2014

      • alastairp
        yep
      • 2014-10-08 28145, 2014

      • alastairp
        it's not big
      • 2014-10-08 28121, 2014

      • alastairp
        especially given computation time, even 1000 people submitting is only going to be 1000 requests every 30 seconds
      • 2014-10-08 28127, 2014

      • alastairp
        which is nothing
      • 2014-10-08 28137, 2014
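The back-of-the-envelope estimate in the messages above can be written out explicitly. This is a rough sketch using only the figures quoted in the chat: 1000 simultaneous submitters, about 30 seconds of computation per track, and documents of about 60k, which is assumed here to mean roughly 60,000 bytes. The function name is made up for illustration.

```python
def submission_load(clients: int, seconds_per_track: float, doc_bytes: int):
    """Estimate the steady-state request rate and JSON volume on the server.

    Each client is assumed to submit one document every `seconds_per_track`
    seconds, because it has to compute the features before it can submit.
    """
    requests_per_second = clients / seconds_per_track
    bytes_per_second = requests_per_second * doc_bytes
    return requests_per_second, bytes_per_second


# Figures from the discussion; 60_000 bytes is an assumed reading of "60k".
rps, bps = submission_load(clients=1000, seconds_per_track=30.0, doc_bytes=60_000)
print(f"{rps:.1f} requests/s, {bps / 1e6:.1f} MB/s of JSON to parse")
```

Under these assumptions that is about 33 requests per second and roughly 2 MB of JSON per second, which supports the point that parsing the whole document on the server is unlikely to be the bottleneck.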

      • derwin
        this looks like really interesting scrollback.
      • 2014-10-08 28144, 2014

      • ruaok
        that said, I would prefer the single JSON document that is the essentia generated document plus the top level metadata key.
      • 2014-10-08 28147, 2014
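The single-document option ruaok prefers here (the essentia output plus a top-level "metadata" key that is removed before saving) could be handled on the server roughly as follows. This is a minimal sketch, not the project's actual code. The function name and the metadata key names are assumptions: `mbid`, `lossless` and `build_sha1` are simply the three values the discussion says the server needs. In a Flask view the raw bytes could come from `request.get_data()`; the sketch uses only the standard library so that it runs on its own.

```python
import json
from typing import Any, Dict, Tuple

# Assumed key names, taken from the values the chat says the server needs.
REQUIRED_METADATA = ("mbid", "lossless", "build_sha1")


def split_submission(body: bytes) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Parse one JSON document and separate the submission metadata from it.

    Returns (metadata, features), where `features` is the document with the
    top-level "metadata" key removed, ready to be stored.
    """
    doc = json.loads(body.decode("utf-8"))
    if not isinstance(doc, dict):
        # Rejects top-level lists and scalars: exactly one object is expected.
        raise ValueError("submission must be a single JSON object")

    metadata = doc.pop("metadata", None)  # "nuke" the key before saving
    if not isinstance(metadata, dict):
        raise ValueError('missing top-level "metadata" object')

    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        raise ValueError("metadata is missing: " + ", ".join(missing))
    return metadata, doc
```

The cost is the one already discussed: the whole body is parsed on the server, and the client has to add one key to the generated document before submitting it.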

      • derwin
        (what data is being ingested as JSON blobs?)
      • 2014-10-08 28149, 2014

      • ruaok
        you ok with that alastairp ?
      • 2014-10-08 28102, 2014

      • ruaok
        one sec derwin