#musicbrainz

      • djce sleeps
      • djce has quit
      • Knio
        Knio is now known as Knio-dinner
      • sbw joined the channel
      • Knio-dinner
        Knio-dinner is now known as Knio
      • orogor has quit
      • real
        real is now known as real|gone
      • real|gone
        real|gone is now known as real
      • melange has quit
      • canidae has quit
      • canidae joined the channel
      • sbw has quit
      • orogor joined the channel
      • sbw joined the channel
      • Knio1 joined the channel
      • Knio has quit
      • Knio joined the channel
      • Knio1 has quit
      • Knio1 joined the channel
      • Knio has quit
      • Knio1
        Knio1 is now known as Knio
      • elinenbe has quit
      • orogor has quit
      • Knio
        Knio is now known as Knio-sleep
      • Knio-sleep
        Knio-sleep is now known as knio-sleep
      • knio-sleep
        knio-sleep is now known as Knio-sleep
      • Knio-sleep has quit
      • Misirlou has quit
      • Misirlou joined the channel
      • orogor joined the channel
      • djce joined the channel
      • davide joined the channel
      • djce has quit
      • davide has quit
      • orogor_ joined the channel
      • rj_ joined the channel
      • orogor has quit
      • orogor_
        orogor_ is now known as orogor
      • djce joined the channel
      • ruaok
        hey!
      • djce
        hi
      • So, I was trawling over the MB/FreeDB interface yesterday
      • and Discid.pm
      • ruaok digs himself out from under a relaxing weekend
      • and I think there's a bug in the way we calculate the FreeDB ID from the TOC.
      • ruaok
        Uh oh.
      • djce
        causing, at best, false negatives.
      • at worst, false positives.
      • ruaok
        is it an edge case?
      • djce
        let me post a couple of files up so you can see what I'm talking about.
      • ruaok
        ok
      • djce
        edge?
      • ruaok
        it works for most cases, but not some special cases?
      • djce
        Not that special, no.
      • ok, go to mb.org/~dave
      • ruaok
        weird -- it seems to do the right thing most of the time.
      • djce
        uh, better idea: see the files I've dropped on grunt in ~dave
      • test.pl, Discid.pm and FreeDB.pm
      • now let me guide you through it...
      • djce fires up vim
      • ready?
      • ruaok
        almost
      • which file first?
      • djce
        FreeDB.pm
      • ruaok
        k
      • djce
        take a look at sub compute_discid
      • this is basically the code which FreeDB themselves publish as how to compute a disc id.
      • ruaok
        k
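For readers without access to FreeDB.pm, here is a minimal Python sketch of the disc ID calculation as FreeDB publishes it. It is not the `compute_discid` sub from the file being discussed. The function names and the input layout (track start offsets in frames, plus a separate lead-out offset) are assumptions made for illustration.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def freedb_discid(track_offsets: list[int], leadout_offset: int) -> str:
    """Return the FreeDB disc ID as 8 lowercase hex digits.

    track_offsets:  start of each track in frames (75 frames per second),
                    including the 150-frame lead-in.
    leadout_offset: start of the lead-out in frames.
    """
    # Floor each track's start POSITION to whole seconds, then sum the digits.
    n = sum(digit_sum(offset // 75) for offset in track_offsets)
    # Playing time: floored lead-out position minus floored first-track position.
    t = leadout_offset // 75 - track_offsets[0] // 75
    disc_id = ((n % 0xFF) << 24) | (t << 8) | len(track_offsets)
    return f"{disc_id:08x}"


# Made-up three-track TOC, for illustration only.
print(freedb_discid([150, 15000, 30000], 45000))  # 08025603
```

The point relevant to this discussion is that every division by 75 is applied to an absolute position on the disc, not to a track length.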
      • djce
        just above that is ConvertTOCToFreeDBID (which is itself part of Lookup),
      • i.e. how we calculate it.
      • basically I think the error is that the right way (AFAICT) is to use the rounded-down position of each track
      • hence, POSIX::floor in compute_discid
      • whereas what we've been doing is rounding down each track,
      • then summing it.
      • viz: $total_seconds += int($toc[$i + 1] / 75) - int($toc[$i] / 75);
      • These can give different results.
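To see why the order of rounding can matter, here is a small Python sketch. It takes one reading of "rounding down each track", namely flooring each track's length in frames before summing, and compares that with flooring each position first. The function names and the TOC values are made up for illustration; this is not the code from Discid.pm or FreeDB.pm.

```python
FRAMES_PER_SECOND = 75


def seconds_from_positions(offsets):
    """Floor every position to whole seconds, then add up the differences."""
    return sum(
        b // FRAMES_PER_SECOND - a // FRAMES_PER_SECOND
        for a, b in zip(offsets, offsets[1:])
    )


def seconds_from_lengths(offsets):
    """Floor every track LENGTH to whole seconds, then add them up."""
    return sum((b - a) // FRAMES_PER_SECOND for a, b in zip(offsets, offsets[1:]))


# Track start offsets in frames; the last entry is the lead-out.
offsets = [150, 15100, 30050, 45000]

print(seconds_from_positions(offsets))  # 598
print(seconds_from_lengths(offsets))    # 597
```

Each track here is 14950 frames long, which floors to 199 seconds, so the per-length total loses a second that the per-position total keeps. A one-second difference in the playing time changes the middle bytes of the disc ID.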
      • Now, if you run test.pl (it should just run straight off),
      • you'll see it trying out this theory.
      • it spits out approx 76 lines of output, twice.
      • ruaok
        mb.pm is missing
      • djce
        you can probably comment that out - or just point it towards your mb_server
      • The first set of lines represents one test album I chose. In this case, all the IDs match - i.e. our calculation yields the right result.
      • hence, the output lines end in "Y" (== a good match)
      • for the second set, the values don't match ("n")
      • So basically I think we've been doing it wrong for a while; the good news: we can easily fix the algorithm.
      • The bad news: has this introduced any bad data into the db?
      • So far, I don't know.
      • ruaok
        gimme a sec -- i need to move the files to a location where I can copy the modules
      • djce
        ok
      • (I'm done setting out my case now, so I'll go quiet now :-)
      • ruaok
        ok, I'll tinker and observe and think
      • djce
        thanks
      • (Just out of interest: this arose because I wanted to test Discid.pm. What I was trying to do was find a FreeDB entry, then work out what request to make to MB to cause it to fetch it from FreeDB)
      • (Straight away I found a "bad" case :-( with no match, when there should have been one)
      • ruaok
        ok, forget copying to another dir.
      • Ok
      • I see what you're saying.
      • I don't think it will have introduced bad data into MB.
      • If a bad id was calculated, then one of the following would've happened:
      • 1. The CD was not found.
      • 2. The FreeDB fuzzy matching algorithm matched it up anyways.
      • 3. A wrong CD was returned for moderation or the user.
      • Case #3 is the worst here, no doubt. But that means that a wrong CD may have been inserted, but the actual CD that was inserted is structurally ok.
      • I'd say fix the code, update the server and not worry about it more.
      • djce
        Hmmm.... I'm not sure if you've covered this case or not:
      • * user requests info using GetCDInfo
      • * not in db
      • * wrong one fetched from FreeDB
      • * wrong album added to MB,
      • Necronom has quit
      • then,
      • * the /right/ disc's MB discid is associated with the /wrong/ly imported album.
      • All under the guise of a "FreeDB moderation".
      • ruaok
        hmmmm
      • yes, the cdindex id would be a concern. But not the album itself.
      • djce
        Right.
      • Track lengths...? It depends which ID - MB or FreeDB - the track lengths are based on.
      • ruaok
        we could write a script to verify each of our discids.
      • djce
        Feed it through the bad algorithm and see which ones yield a wrong result?
      • ruaok
        yes, as step one.
      • and then for those that generated a bad id, fetch the album again.
      • do a sanity check (a coarse one since the data may have been edited) to see if the right album was fetched.
      • if we suspect a problem either flag it or import the correct album for the CD id.
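Step one of that plan could be sketched in Python roughly as follows. Everything here is hypothetical: the disc labels, the TOC layout (track start offsets in frames with the lead-out last) and the two stand-in calculations, which compare only the playing time and not a full disc ID. Re-fetching the album and the sanity check are left out because they depend on server code not shown in this log.

```python
def playing_time_old(toc):
    """Stand-in for the suspect calculation: floor each track length, then sum."""
    return sum((b - a) // 75 for a, b in zip(toc, toc[1:]))


def playing_time_new(toc):
    """Stand-in for the corrected calculation: floor positions, then subtract."""
    return toc[-1] // 75 - toc[0] // 75


def find_suspects(tocs, old, new):
    """Return the labels of all TOCs for which the two calculations disagree."""
    return [label for label, toc in tocs.items() if old(toc) != new(toc)]


# Hypothetical stored TOCs, keyed by a made-up label.
stored_tocs = {
    "disc-a": [150, 15150, 30150, 45150],
    "disc-b": [150, 15100, 30050, 45000],
}

suspects = find_suspects(stored_tocs, playing_time_old, playing_time_new)
print(suspects)  # ['disc-b']
```

Only the discs in `suspects` would then need the more expensive second step of fetching the album again and comparing it with what was imported.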
      • djce
        hmmm... ok sounds reasonable. I think we'll thrash out the details of this problem over the next few days (or however long it takes).
      • i.e. I may consult you for advice again :-)
      • ruaok
        ok. good detective work.
      • brb
      • djce is sorry to welcome ruaok back to the week with such a pain :-(
      • KyleB joined the channel
      • djce: no worries. :-)
      • KyleB
        how is everyone
      • ruaok
        KyleB: busy as usual :-)
      • ruaok is away: chauffeur
      • KyleB
        working on helix stuff?
      • ruaok
        that too :-)
      • ruaok is back (gone 00:31:28)
      • djce: you around?
      • djce
        yes
      • ruaok
        wendell asks:
      • KyleB
        ruaok, do you think it would create a lot of server strain if people were using it just for metadata queries like return all songs by artist x, but not doing that in the context of moding/tagging?
      • ruaok
        djce: What's the easiest way to tell if there is a rejected moderation to merge two artists?
      • he is looking at the moderation table.
      • djce
        direct SQL you mean?
      • ruaok
        general approach.
      • I'd say look for merge artist mods and then scan for artist you might be interested in.
      • djce
        Does he mean for two /particular/ artists? i.e. you know the IDs already?
      • ruaok
        the primary artist is listed in the artist col and the secondary artist(s) are in the new field, right?
      • djce
        yes
      • mod.artist = id of to-be-merged artist
      • ruaok
        He didn't specify..