#musicbrainz

      • navap
        You have 9 _windows_ open..so do you never close Opera?
      • nikki
        only when osx insists that I have to reboot
      • then I open it again :P
      • and when I run out of space for tabs in a window, I open a new window
      • brianfreud
        I just tend to keep them all in one window, but I hit around 350 tabs yesterday :(
      • luks
        what do you need the 100 tabs for? to replace bookmarks?
      • brianfreud
        typically just I've opened tabs and not bothered to close them :P
      • nikki
        well I'm usually in the middle of doing something
      • and then I get distracted :/
      • luks
        but you can't find anything in that many tabs
      • ruaok
        I'm more liable to have a mess of terminal windows.
      • each with tons of tabs.
      • luks
        unless opera has something like search in tab titles
      • nikki
        I can find things! just not instantly
      • ruaok
        sounds like an O(n) operation
      • nikki
        well I leave enough of the tab name visible so I just need to go through the windows quickly
      • the only time that fails is with mb edits, I can't remember which edit number was a particular edit :P
      • luks
        when I have about 15 tabs open in a single window, I'm getting lost
      • that's when I start bookmarking things I still need and close the tabs
      • Munger
        I have just been on the phone to a lovely lady called Tina at DeAgostini UK. She is doing her best to get me a list of Orbis/DeAgostini releases/catalog numbers in electronic format
      • creature
        I don't know the back story with this, but be careful. Label metadata is often way more messy than you'd expect.
      • Munger
        I know. But I would still say that it is a better source than ebay listings :-)
      • The series I'm researching was released in UK, Germany, Netherlands, Italy & France (and others). In France it was Les Genies du Blues
      • Catalog numbers & sequences are fubar, and only DeAgostini will know for sure the significance of the diverse catalog numbers
      • I once saw a prototype browser that displayed a graphical tree of your tabs/history, allowing you to see instantly how you got to a particular page
      • RifRaf
        luks hi, would you have any time to help a bit with some plugin issues over next day or 2? am trying but not getting where i want, getting somewhere though
      • xlotlu_ has quit
      • luks
        RifRaf: if you ping me tomorrow, I'll help
      • too busy to read/write any code at the moment
      • RifRaf
        ok cool, need to get away from code for a bit here too i think
      • xlotlu_ joined the channel
      • ruaok has quit
      • Tykling joined the channel
      • xlotlu__ joined the channel
      • sonium joined the channel
      • v6lur joined the channel
      • MightyJay joined the channel
      • xlotlu_ has quit
      • Munger
        I cannot believe my luck. When I called DeAgostini, I spoke to probably the only person in the world who considered it important to document their releases - catalog numbers in each country, release dates, track listings, source of the tracks. All typed with hand-written notes. She is scanning it all and sending it to me. Reckons it will take her a few days to scan
      • Wizzcat
        O.o
      • only downside is then you won't have the fun of hunting down cat#s and release dates :p
      • Munger
        She even knows which releases were taken from the original French series, "Les Genies du Blues"
      • brianfreud
        finally :) a working part number function
      • Munger: if you thought your regexp was nuts, try this one: http://musicbrainz.pastebin.com/mc455b11
      • Munger
        WTF kind of insanity is that? :-D
      • brianfreud
        that would be a functional PartNumberStyle routine :D
      • Munger
        I think the word you were looking for was 'dysfunctional'
      • brianfreud
        we allow way too much garbage to be passed in to GC :P
      • Munger
        What does that regex actually do?
      • luks
        seriously brian, this is very close to completely unmaintainable code
      • brianfreud
        3 || Part‐Handler, Parts 1, 2, 3 - all handled
      • luks: I know... yet it's also quite needed
      • luks
        I don't it
      • there is always a way to write readable code
      • *doubt
      • brianfreud
        what do you find unmaintainable about it?
      • luks
        the HUGE regular expression, for example
      • Munger thinks that any large list of such specific strings is conceptually flawed. Unless that regex can be easily re-generated to adapt to changes, it cannot be maintained
      • "do everything in a single place"
      • brianfreud
        it simplifies down quite a good bit, once you take out the Deseret hunk of it
      • luks
        maintainable code is usually "do only as little as possible in a single place"
      • and then you glue the pieces together
      • brianfreud
        Munger: find me a good way to access a range of plane 1 unicode with JS in a regexp, I'll quite happily swap out the 44 specifics
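One possible answer to the plane 1 range question, offered as a sketch rather than as what the Guess Case code does: the Deseret block is U+10400 to U+1044F, and in UTF-16 each of those code points is the high surrogate \uD801 followed by a low surrogate in \uDC00-\uDC4F, so a two-unit pattern covers the whole block. Engines that support the ES2015 `u` flag can also express the range on code points directly. The names `DESERET_SURROGATES`, `DESERET_CODE_POINTS` and `containsDeseret` are made up for illustration.

```javascript
// Deseret block: U+10400 to U+1044F (Unicode plane 1).
// In UTF-16, every code point in the block is encoded as the high
// surrogate \uD801 followed by a low surrogate in \uDC00-\uDC4F.
const DESERET_SURROGATES = /\uD801[\uDC00-\uDC4F]/;

// Assumption: the engine supports the ES2015 `u` flag, which lets the
// character class range over whole code points instead of UTF-16 units.
const DESERET_CODE_POINTS = /[\u{10400}-\u{1044F}]/u;

// Hypothetical helper: true if the text contains any Deseret character.
function containsDeseret(text) {
  return DESERET_SURROGATES.test(text);
}
```

The surrogate-pair form needs no special engine support, which is why it is the one wrapped in the helper.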
      • Munger
        If it's a search for this, replace with that regex, then a lookup table in a separate file is probably better
      • luks
        you need to write a lot of small functions, not a single huge piece of code
      • brianfreud disagrees
      • brianfreud
        the old code worked by using a lot of small functions. Did you really find it readable?
      • luks
        well, I can do nothing about that, but I'm sure I'm right
      • yes, I was able to fix it any time I needed
      • Munger
        I don't know what exactly you are trying to achieve, but that regex makes me shudder. Spider sense is tingling
      • brianfreud
        put it this way: I have yet to find a single problem result that I cannot pinpoint to an exact line of code, just using console.log and moving up or down in the code.
      • Munger
        In words of one syllable, what does that code actually do?
      • brianfreud
        the old GC code, on the other hand, was a total nightmare, "fixing" things by merely adding in yet another specific fix
      • luks
        I guarantee you that you or somebody else will find this code problematic for the reasons I said
      • brianfreud
        Munger: it takes text, finds all "part" words, and does all it can to split part name from part number, then makes a valid part-style string from it
      • luks
        it's the same reason why people are moving from php-like style of coding websites to the MVC model
      • brianfreud
        luks: I don't plan to go anywhere
      • Munger
        Do you have a sample of the data it is working on?
      • brianfreud
        munger: foo part 1 2 3
      • navap
        It handles 1-3 now?
      • brianfreud
        read PartNumberStyle... it's a pita in its permissiveness
      • Munger
        and what does it parse 'foo part 1 2 3' into?
      • brianfreud
        Foo, Parts 1, 2 & 3
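To make that input/output pair concrete, here is a minimal sketch that covers only the simplest case: a title, a "part" word, and a space- or comma-separated list of part tokens. It is not the Guess Case implementation. `PART_WORD` and `normalizeParts` are hypothetical names, the pattern is a simplified assumption, and ranges, brackets and the rest of PartNumberStyle are deliberately ignored.

```javascript
// Hypothetical sketch, NOT the real PartNumberStyle routine.
// Matches a "part" word (part, parts, pt, pts, optionally with a trailing
// dot) preceded by whitespace or an opening bracket.
const PART_WORD = /\s?,?[\s\[\(]p(?:ar)?ts?\.?\s+/i;

function normalizeParts(text) {
  const match = PART_WORD.exec(text);
  if (match === null) return text; // no "part" word: leave the text alone

  const title = text.slice(0, match.index).trim();
  const tokens = text
    .slice(match.index + match[0].length)
    .split(/[\s,&]+/)
    .filter((token) => token.length > 0);
  if (title.length === 0 || tokens.length === 0) return text;

  // "Part" for a single token, "Parts" with ", " and a final " & " otherwise.
  const label = tokens.length === 1 ? "Part" : "Parts";
  const list =
    tokens.length === 1
      ? tokens[0]
      : tokens.slice(0, -1).join(", ") + " & " + tokens[tokens.length - 1];

  const capitalised = title.charAt(0).toUpperCase() + title.slice(1);
  return `${capitalised}, ${label} ${list}`;
}

console.log(normalizeParts("foo part 1 2 3")); // Foo, Parts 1, 2 & 3
```

Text without a "part" word is returned unchanged, in the spirit of a style routine that touches only what it recognises.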
      • navap
        Oh, doh
      • Munger
        does 'foo part 1 2 3' appear on its own line in the input file?
      • Munger wants to see a dump of the input file
      • brianfreud
        but also "foo pt 1- 2, 5", etc
      • Munger: any potential user input in a form field... there is no "input file"
      • "Foo, Parts 1 & 2, & 5" as output, btw
      • Munger
        There is always an input file, even if it is post data
      • brianfreud
        and luks: I see nothing about a text-parser which is handled better by a MVC-model, vs a FSM-model
      • Munger
        Can there be multiple records on a single form submission, or is it just one?
      • What I am asking is if you ever see 'foo part 1 2 3 bar part 5-6 7'
      • brianfreud
        all I know is, any possible cat-corner case I have yet to find, using every single GC-bug *ever* (wiki and trac), plus every single raw text input from the .nfo's for 1000 torrents, it's taken me less than 5 minutes to spot the problem area, and less than 8 hours to fix
      • nikki
        brianfreud: you might not plan to go anywhere, but you can't guarantee life has no plans for you :P
      • brianfreud
        Munger: nope
      • luks
        brianfreud: do you have a test suite for this code?
      • brianfreud
        no, not at the moment
      • but the GC code is, unlike the old one, a true engine - it could easily be hooked in to one, if someone wrote one
      • luks
        so if you fix one problem, you don't know whether 5 old problems are broken again
      • brianfreud
        well, actually, yes, 99.5% yes
      • Wizzcat
        shouldn't we have a testsuite in place first, to make sure we don't regress from what we already have?
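A regression suite of the kind being asked for could start as a plain table of input/expected pairs. The sketch below is an assumption about how one might look, not an existing MusicBrainz test harness: `CASES` holds the two examples quoted earlier in the conversation, and `runRegression` is a hypothetical helper that takes whatever function is under test as its `guess` argument.

```javascript
// Input/expected pairs taken from the examples quoted in the conversation.
const CASES = [
  { input: "foo part 1 2 3", expected: "Foo, Parts 1, 2 & 3" },
  { input: "foo pt 1- 2, 5", expected: "Foo, Parts 1 & 2, & 5" },
];

// Hypothetical helper: runs every case through `guess` and returns the
// ones whose output differs from the expected string.
function runRegression(guess, cases) {
  const failures = [];
  for (const { input, expected } of cases) {
    const actual = guess(input);
    if (actual !== expected) {
      failures.push({ input, expected, actual });
    }
  }
  return failures;
}
```

Each fixed bug adds one row to the table, so a later change that breaks an old case shows up as a non-empty result.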
      • nikki
        luks: quick question, does ngs include album/track aliases?
      • brianfreud
        any particular thing is handled only by a single section of code.
      • luks
        nikki: not directly, but it has all functionality for trivially adding that
      • brianfreud
        there is no recursion, no looping, etc. It's a true FSM processor.
      • luks
        nikki: but of course you have the master track -> track, album -> release mapping
      • so that's very similar to aliases
      • brianfreud
        so any change to PartNumberStyle code *only* affects PartNumberStyle text. If it doesn't match that regexp, it doesn't change the text, period.
      • Munger
        OK. so you are always dealing with a single record in the format <foo> <["part"|"pt"] > [range] (, [range])...
      • nikki
        ah...
      • nikki doesn't entirely understand it, but will take your word for it
      • brianfreud
        Munger: s?\,?[\s\[\(]p(?:ar)?ts?\.?\b\s , but yes
      • * \s?\,?[\s\[\(]p(?:ar)?ts?\.?\b\s
      • luks
        nikki: well, for example you have a release group "Foo" with releases "Foo" and "Bar"
      • so if you search for the release group, you can use both "Foo" and "Bar" as aliases
      • Munger
        Whoa! Steady on with the regular expressions :-D
      • brianfreud
        (part (pt (parts (pts (part. (pt. (parts. (pts. part pt parts pts part. pt. parts. pts. etc
      • luks
        nikki: I even had the idea to not have track/release group titles manually entered
      • but to derive them from release/"tracklist track" titles
      • brianfreud
        luks: all I know is, the old GC code took weeks to figure out how it did *anything*. This GC code, at least to me, is 100% comprehensible immediately. You may not get quite how something works, off the bat, but you can always spot exactly where situation X is handled, and why results Y or Z occur.
      • Munger
        It seems to me that you are better off scanning the input data for number ranges and expanding them into a list of all numbers, and then re-assembling them into your required output. e.g. 'foo part 1-3 5 6' expands to 'foo part 1', 'foo part 2', 'foo part 3', 'foo part 5', 'foo part 6'.
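The expand-then-reassemble idea can be sketched as follows. This is an illustration of the suggestion only, under the assumption that ranges are written as digits joined by a hyphen. `expandParts` is a hypothetical name, non-numeric tokens are passed through untouched, and repeated part numbers are dropped.

```javascript
// Hypothetical sketch: expand numeric ranges in a part specification
// into individual part numbers, keeping first-seen order.
function expandParts(spec) {
  const seen = new Set();
  const parts = [];
  for (const token of spec.split(/[\s,]+/).filter(Boolean)) {
    const range = /^(\d+)-(\d+)$/.exec(token);
    const items = [];
    if (range !== null && Number(range[1]) <= Number(range[2])) {
      // Numeric range such as "1-3": list every number in it.
      for (let n = Number(range[1]); n <= Number(range[2]); n++) {
        items.push(String(n));
      }
    } else {
      // Anything else ("5", "a", "a-b", "foo") is kept as written.
      items.push(token);
    }
    for (const item of items) {
      if (!seen.has(item)) {
        seen.add(item);
        parts.push(item);
      }
    }
  }
  return parts;
}

console.log(expandParts("1-3 5 6")); // [ '1', '2', '3', '5', '6' ]
```

Because the expanded list is de-duplicated, an input such as "1-3 2" collapses to parts 1, 2 and 3.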
      • brianfreud was serious, btw, about the offer for a better way to do a range on Deseret unicode in JS
      • brianfreud
        Munger: you're assuming that part #'s are digits
      • part a, parts a-b, part foo, part bar, etc
      • nikki
        luks: hmm... would be nice if some things could be aliases without having another pseudo-release though, e.g. that "- Human" song which everyone tries to rename to "Human"
      • Munger
        I am going on the data you gave me, but no, I am assuming no such thing. You need to identify all the input part numbers and expand them into individual records however, so that you can manipulate them as you wish
      • navap
        There's a song called " - Human"? What's the deal with the dash?
      • nikki
      • navap
        hm
      • luks
        you mean aliases that would be usable by taggers?
      • Munger
        If the user input 'foo part 1-3 2', do you catch the duplicate?
      • brianfreud
        Munger: it's "Guess Case", not "Replacement for User's Brain" :D
      • nikki
        I'm not sure to be honest... it's hard to draw a line between 'search hint' and 'alias for people to use'
      • brianfreud
        it's already smarter than the old GC, as it is
      • luks
        I can't think of a way how to use track/release aliases for tagging
      • how you specify that you want to use this particular alias?
      • Munger
        Ah! If you had told me you were working on the Guess Case code when I asked, I would have had a better idea of the problem. Now I can think clearly about it
      • :-)
      • brianfreud
        ah, sorry, thought you knew :)
      • Munger
        No, but my question is still valid. To properly clean it, you need to catch duplicates
      • brianfreud
        yes, but it merely attempts to apply the guideline. It tries to catch stupidity, but catching that part piece 3 is within the range of part pieces 1 and 2... a bit more than I think it needs to try to do (more than it already does, at least)
      • whoever: the current GC code: http://musicbrainz.pastebin.com/m4b3775e5 and all known issues: http://brianfreud.info/gcissues.txt
      • nikki
        I guess if it were me, I'd show some sort of has-alternate-names-indicator which people could click on to show a list of alternatives
      • navap
        heh what do you know, I have " - Human" lol
      • Munger
        So if I understand this right, when parsing input records, some of them have the word 'Part', 'Pt', 'Pts' or whatever and 1 or more numbers following. You need to catch these patterns and normalize them?
      • brianfreud
        yes
      • luks
        brianfreud: http://www.youtube.com/view_play_list?p=BDAB2BA... - really good videos on this topic
      • Munger
        ...or Parts a-c
      • brianfreud
        think of the worst freedb data you've ever seen. now try to fix it.