#musicbrainz-devel

      • alastairp
        So many things to do. Sign
      • High
      • Sigh
      • Any thoughts on hiding output of essentia in the submitter?
      • I feel it's useful for debugging
      • ,
      • Maybe log it?
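A minimal sketch of the "log it" idea, assuming the submitter is Python and launches the extractor as a subprocess. The logger name and the `run_extractor` helper are hypothetical and not taken from the actual client; the point is only that output can be captured and sent to a debug log rather than the terminal.

```python
import logging
import subprocess
import sys

# Hypothetical logger name; the real submitter may organise logging differently.
log = logging.getLogger("submitter.extractor")


def run_extractor(cmd):
    """Run a command, routing its output to the debug log instead of the terminal.

    Returns (return_code, captured_output) so the caller can still inspect
    the output when something goes wrong.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stream
        text=True,
    )
    for line in proc.stdout.splitlines():
        log.debug("extractor: %s", line)
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Show the captured lines only when debugging is switched on.
    logging.basicConfig(level=logging.DEBUG)
    # Stand-in command; the real extractor binary would go here.
    run_extractor([sys.executable, "-c", "print('extractor output')"])
```

With the log level left at the default, nothing is printed, but the output is still available for debugging when the level is lowered.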
      • zas
        ok, audioloader-codec branch has the "bit_rate"
      • alastairp
        zas: yep, just got added yesterday
      • Still waiting for the maintainer to accept it
      • ruaok_
        off to MHD a'dam
      • bai
      • alastairp
        see you soon
      • ianmcorvidae: why do you have quotes in your tag names?
      • ianmcorvidae
        crap data, I shouldn't
      • but it still shouldn't die :)
      • alastairp
        yeah, I agree we should escape keys
      • let me make a change to the json writer
      • ok. he's looking at the peak detection error now, but doesn't have good internet. He's back in bcn on 15th, so we might have to wait until then :/
      • ianmcorvidae: re: your replaygain question last night - there's some magic going on there
      • I don't understand everything it does, but there was some discussion about it a few weeks ago so I believe it's correct
      • ianmcorvidae
        right, just curiosity on my part anyway, not really important
      • ruaok: https://github.com/metabrainz/acousticbrainz-se... if you're around to glance at it
      • (others too if you'd like, but XD)
      • alastairp
        hmm
      • did you discuss this?
      • this is interesting, because I have no idea exactly how unique it's going to be
      • different build of ffmpeg, same file? maybe different
      • mp3, flac, definitely different
      • mp3, different mp3? maybe different
      • ianmcorvidae
        well, the notion is that if the data's completely the same it's not worth keeping both
      • if there's more that should go into that calculation then that's also fine
      • alastairp
        yes, true
      • ianmcorvidae
        just trying to do better than "keep the last 5 that happened to be submitted"
      • alastairp
        right
      • ianmcorvidae
        (or "keep only the first, or the first lossless")
      • alastairp
        it's just that "the same" in terms of features can be different
      • I agree that "exactly the same" is useless
      • ianmcorvidae
        well, sure, though this is on the whole JSON data
      • alastairp
        but it's a big change (and potentially computationally expensive) for just that
      • ianmcorvidae
        so it should also change for things like build differences etc., with what I have
      • alastairp
      • right, I suspect that this will only dedup the exact same person running it twice
      • yeah, it'll change on build
      • it makes me feel a little funny, so I'll wait for rob to weigh in
      • (thanks though!)
      • I'm just fixing the json exporter for you
      • ianmcorvidae
        just in terms of it not creating much uniqueness?
      • alastairp
        yeah
      • ianmcorvidae
        fair enough -- I think something like splitting things up a bit might make sense eventually -- such that smaller things are stored individually (such that each version thing only needs storing once, for example, but also if the whole lowlevel category comes out the same, or so) -- not sure exactly where to break it up
      • which is the way to make this catch things better, I think, isolate "the tags changed" from "the build changed" from "the features changed"
      • alastairp
        yeah, I understand
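The split-it-up idea discussed here could look roughly like the sketch below: hash each top-level section of the JSON document separately, so that "the tags changed" and "the build changed" no longer hide the fact that the features are identical. This is an illustration, not the code in the linked pull request; the `section_digests` helper and the `metadata` section name are assumptions (only `lowlevel` is named in the discussion).

```python
import hashlib
import json


def section_digests(doc):
    """Return a SHA-256 hex digest for each top-level section of a JSON document.

    Each section is serialised canonically (sorted keys, fixed separators)
    so that key order does not affect the digest.
    """
    digests = {}
    for section, value in doc.items():
        canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
        digests[section] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digests


if __name__ == "__main__":
    # Toy documents: same features, different (hypothetical) build metadata.
    first = {"lowlevel": {"a": 1.0, "b": 2.0}, "metadata": {"build": "x"}}
    second = {"metadata": {"build": "y"}, "lowlevel": {"b": 2.0, "a": 1.0}}
    d1, d2 = section_digests(first), section_digests(second)
    print("lowlevel identical:", d1["lowlevel"] == d2["lowlevel"])
    print("metadata identical:", d1["metadata"] == d2["metadata"])
```

Hashing the whole document, by contrast, would treat these two submissions as entirely different, which matches the concern that it would mostly catch the same person running the same build twice.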
      • ianmcorvidae
        up to 89 duplicates, each with exactly 2
      • heh
      • at least some of them are me and nikki, e.g. http://beta.musicbrainz.org/recording/5d9207fe-... :P
      • nikki
        haha
      • alastairp
        ianmcorvidae: please edit src/algorithms/io/yamloutput.cpp line 261 and add a call to escapeJsonString
      • and tell me if that fixes your bad key
      • * ianmcorvidae doesn't know enough C++ to know where to add that; what should the final line look like?
      • ianmcorvidae
        (wrap n->name, maybe?)
      • alastairp
        yes
      • escapeJsonString(n->name)
      • ianmcorvidae
        right
      • compiling, one sec
      • working
      • alastairp
        it fixed it?
      • ianmcorvidae
        yeah
      • alastairp
        cool
      • ianmcorvidae
      • (for one example)
      • alastairp
        I understand why he didn't escape keys - they're supposed to all come from internal code and you should never name a pool (essentia term for a key) with that
      • ianmcorvidae
        yeah, makes sense
      • alastairp
        but in the case of tags it just gets everything from taglib and dumps it there
      • ianmcorvidae
        I figured it was something like that XD
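To illustrate why an unescaped key breaks the output, here is a small Python analogue (not the essentia C++ code, and not a description of what `escapeJsonString` does internally). Both helper functions are hypothetical: one writes the key verbatim, the other escapes it the same way as a value.

```python
import json


def naive_pair(key, value):
    """Write the key verbatim, as a writer that trusts its keys would."""
    return '{"%s": %s}' % (key, json.dumps(value))


def escaped_pair(key, value):
    """Escape the key as well as the value before writing."""
    return "{%s: %s}" % (json.dumps(key), json.dumps(value))


if __name__ == "__main__":
    bad_key = 'my "quoted" tag'  # a tag name containing quotes, as in the report

    print(escaped_pair(bad_key, ["x"]))
    try:
        json.loads(naive_pair(bad_key, ["x"]))
    except ValueError as exc:
        print("naive output is not valid JSON:", exc)
```

Keys that come from internal code never hit this case, but tag names copied straight from a file can contain anything.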
      • alastairp
        ok, fixed in another branch, you can merge it if you want
      • I need to redo the branches, one for each of my fixes and one combining everything for us guys - it's because I want dmitry to be able to pull what he wants into master
      • unfortunately it might mean we end up with hashids in abz that are no longer in the tree. that'll be annoying
      • yeeeargh joined the channel
      • ianmcorvidae
      • alastairp
        next thing I want to do when we have more data is meta-stats over the mb database
      • how many complete albums, how much of an artist's collection
      • then meta-meta stats. how many pop albums as determined by lastfm tags (when are we getting genres?)
      • ijabz1
        alastairp: is the only MB data you are storing the mbrecordingid, or are you storing the acoustid as well?
      • in AcousticBrainz?
      • alastairp
        only recordingid
      • in the case where there is no recordingid tag in the file I want to do an acoustid lookup, it would be fine to add that as additional metadata
      • ijabz1
        I just wonder because one acoustid can match multiple recordingids, and when that is the case there is the chance that the mapping to recordingid is wrong
      • alastairp
        once I add that functionality we can also add an option to always submit acoustid if the person has it installed
      • right
      • ok, so if we find recordingid by fingerprinting it's a good idea to submit acoustid too
      • ijabz1
        having the acoustid would allow you to postcheck bad data at a later date
      • alastairp
        great, thanks for the heads-up
      • ijabz1
        k
      • zas
        imho, matching acoustid isn't a good idea, it would mean some kind of autotagging, which will lead to many errors (acoustid associated with incorrect recording, acoustid matching multiple recordings, etc...), imho you should just encourage people to tag their files using Picard (which is using acoustid, but user can check if correct)
      • ijabz1
        Picard does autotag; I don't think people are going to want to start retagging their collection in order to contribute to AcousticBrainz
      • I'm just saying that if their files already contain an acoustid, it's useful to send that, as it helps verify that the data is correct or indeed incorrect
      • alastairp
        agreed. I would propose matching with acoustid but marking the data as such
      • e.g. "I'm happy to deal with potentially bad files" or "I only want almost certain files"
      • ijabz1
        Maybe, that's not really what I'm saying though, I'll try again.
      • If the songs have an mbrecordingid then they have already been tagged by some method, be that Picard, SongKong, etc.
      • You just need the mbrecordingid to serve as the key, but if the user has additional metadata such as acoustid already in the file then they should send
      • that as well; this helps verify at some later stage if we see bad data
      • e.g., the Acoustid for that MBRecordingid matches many MBRecordingids: higher risk
      • or vice versa, none of the Acoustids known for that MBRecordingid match the one sent by the user: higher risk
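The two checks described here could be sketched as follows. This is only an illustration under stated assumptions: `risk_flags` is a hypothetical helper, and `acoustid_to_recordings` stands in for whatever acoustid-to-recording mapping would actually be looked up at check time; the identifiers in the example are made up.

```python
def risk_flags(recording_id, submitted_acoustid, acoustid_to_recordings,
               max_recordings=1):
    """Return a list of reasons why a submission's recording ID looks risky.

    acoustid_to_recordings maps an acoustid to the set of recording IDs
    linked to it (hypothetical data; in practice this would be looked up).
    """
    flags = []
    linked = acoustid_to_recordings.get(submitted_acoustid, set())
    # Case 1: the acoustid is linked to many recordings, so it is ambiguous.
    if len(linked) > max_recordings:
        flags.append("acoustid-matches-many-recordings")
    # Case 2: the submitted acoustid is not among those known for the recording.
    if recording_id not in linked:
        flags.append("acoustid-not-linked-to-recording")
    return flags


if __name__ == "__main__":
    # Made-up identifiers for illustration only.
    mapping = {
        "acoustid-A": {"rec-1"},
        "acoustid-B": {"rec-1", "rec-2", "rec-3"},
    }
    print(risk_flags("rec-1", "acoustid-A", mapping))  # no flags expected
    print(risk_flags("rec-1", "acoustid-B", mapping))  # ambiguous acoustid
    print(risk_flags("rec-9", "acoustid-A", mapping))  # mismatch
```

Nothing is rejected at submission time in this sketch; the flags only mark data that deserves a second look later.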
      • zas
        it reminds me of http://musicbrainz.org/release/09186fe9-18af-47... where acoustids on both discs are the same, second disc has tracks without main voices... ;) acoustids are totally messed up on this one, kinda expected
      • i wonder how track 1-1 and track 2-2 can share the same acoustid (now)
      • alastairp
        ijabz1: we send every tag that taglib finds
      • zas
        i mean track 1-2 and 2-1
      • alastairp
        if taglib tells us there is a tag for acoustid, it'll get sent (I'm not sure if this means that taglib needs to know how to parse an acoustid tag)
      • ijabz1
        oh okay I thought you only stored mbrecordingid
      • alastairp
        http://acousticbrainz.org/3a50c43e-f2f4-4b8e-86... e.g. see tags at the end of the file
      • man. who has such perfect tags. ianmcorvidae ?
      • ijabz1
        no acoustid on that one though
      • alastairp
        yeah
      • I don't know if it doesn't have a tag, or if taglib isn't reading it
      • let me put it on the list, and I'll see if taglib can understand it. If not, I'll look for the tag manually in the client
      • zas
        alastairp: any progress on fixing "ChordsDetection::chords: Could not push 1 value, output buffer is full" error ? this one occurs very often
      • alastairp
        please read scrollback
      • quick answer, no
      • yeeeargh
      • alastairp
        awesome!
      • ijabz1
        ah cool
      • Freso
        alastairp: Always storing AcoustIDs wouldn't be bad either, since recordings do sometimes need to be split up.
      • Also, caught up with back log: nvm. ;)
      • ijabz1 joined the channel
      • Nyanko-sensei joined the channel
      • Man. Those show stopper bugs are really annoying. :(
      • ianmcorvidae joined the channel
      • Leftmost joined the channel
      • tungol joined the channel
      • 21WABMWUW joined the channel
      • yeeeargh
        i'm a bit curious what kind of music you guys are scanning. i didn't encounter that chord-bug once yet. the only errors i got were a bunch of replaygain/silence bugs with tracks which were either literally silence or tracks with a large amount of silence between two songs (hidden tracks)
      • Freso
        I've scanned some hip hop, R&B, dancehall, Christmas music stuff, pop, folk/trad., ...
      • I have one thread hanging right now, but have had several bailing out on a "IOError: [Errno 2] No such file or directory: '/tmp/tmpy5hQly.json'"
      • alastairp
        yeah, that'll be because the extractor fails to write the file, and the submitter tries to blindly open it
      • Freso
        Yep.
      • alastairp
        bug fix for that will be coming in the weekend
      • Freso
      • And it's consistent for the files it happens to.
      • ...
      • Which, in retrospect, I should have probably collected somewhere for easy re-submission...
      • alastairp
        yeah, we have no way of marking a file as submitted, or bad for submitting
      • also, it'd be nice to know how the extractor failed on those ones
      • to report bugs if needed
      • Freso
        Yep. But I figure the ones I run into are the ones already reported last night, so I'll wait until those are sorted out before reporting new stuff. :)
      • It would also be nice if it would continue with the rest of the queued files and then report at the end which ones didn't work...
      • alastairp
        code. patch. etc
      • seriously, I have about 4 different things going on here
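A rough sketch of the behaviour discussed above: do not open the extractor's output file blindly, keep going when one file fails, and report the failures at the end. The function names and the `extract`/`submit` callables are hypothetical placeholders, not the actual submitter code.

```python
import json
import os


def load_features(json_path):
    """Parse the extractor's output file, raising IOError if it is missing."""
    if not os.path.exists(json_path):
        raise IOError("extractor did not write %s" % json_path)
    with open(json_path) as handle:
        return json.load(handle)


def process_queue(audio_paths, extract, submit):
    """Process every queued file and return a list of (path, reason) failures.

    extract(audio_path) is assumed to return the path of the JSON file the
    extractor was asked to write; submit(features) sends the parsed data.
    """
    failed = []
    for audio_path in audio_paths:
        try:
            features = load_features(extract(audio_path))
            submit(features)
        except (IOError, OSError, ValueError) as exc:
            # Remember the failure and carry on with the rest of the queue.
            failed.append((audio_path, str(exc)))
    return failed


def report(failed):
    """Print the files that could not be processed, for later re-submission."""
    for audio_path, reason in failed:
        print("failed: %s (%s)" % (audio_path, reason))
```

Keeping the reason alongside each path also gives a starting point for reporting how the extractor failed on a given file.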