#musicbrainz

      • cikkolata
        the best I could think up would be suggesting that someone transliterates using one system, but not forcing it
      • dupuy
        since otherwise you get the French and Germans arguing about Tchaikovsky
      • (the english transliteration is semi-french)
      • on the wiki page, I actually wrote:
      • o If the name is unrecognizable, e.g. it uses a non-roman character set (Cyrillic, Kanji etc.), use the standard transliteration into roman characters if there is one.
      • + If the artist's home country's language does not define a standard transliteration into roman characters (e.g. Pinyin, Japanese romaji), use the transliteration into English - transliterations into other languages may be different (e.g. Ч = Ch in english, but Ч = Tch in french).
      • dupuy
        + Note that common English spellings should be preferred, e.g. Tchaikovsky, not Chajkovskij.
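
A minimal sketch of why the target language matters, assuming Python 3 (not something used in the discussion). The table name `ROMANIZATION` and the function `romanize` are hypothetical; only the Ч entries come from the wiki text quoted above, and a real table would need every letter plus the "common English spelling" exceptions.

```python
# Hypothetical per-language romanization tables. Only the Ч mappings are taken
# from the discussion; everything else about this structure is an assumption.
ROMANIZATION = {
    "en": {"Ч": "Ch", "ч": "ch"},
    "fr": {"Ч": "Tch", "ч": "tch"},
}

def romanize(text, target):
    """Replace each character found in the target language's table; leave others as-is."""
    table = ROMANIZATION[target]
    return "".join(table.get(ch, ch) for ch in text)

print(romanize("Ч", "en"))  # Ch
print(romanize("Ч", "fr"))  # Tch
```

The same Cyrillic letter produces two different roman spellings, which is the disagreement the wiki rule tries to settle by picking one target language.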
      • cikkolata
        what about when there are several ways to transliterate into english? (that's mostly what I'm thinking about)
      • dupuy
        this was just a political choice, and has an English-centric bias
      • if there are several ways, pick the "official" one (based on country)
      • (again, political)
      • cikkolata
        does russian have an official one? I don't know
      • dupuy
        so a mainland Chinese artist would use Pinyin, a Taiwanese artist something else
      • note also that it would be different for Russia vs. Ukraine
      • it basically gets to the point that only experts could actually point to an authority
      • and everyone else just muddles along
      • me, I depend on google-fights
      • whichever variant gets the most hits...
      • cikkolata
        what if both have the same number? (I've seen that for obscure stuff)
      • dupuy
        then I go by the more impressive website :-)
      • wikipedia has a good article on Russian to English transliteration: http://en.wikipedia.org/wiki/Transliteration_of...
      • and of the (four?) standards they mention, I would lean toward the UN version (as they claim it is used in Russia) or maybe the ISO (although that is surely the least readable)
      • DJKC
        dupuy: Japan has several major transliteration systems and the official one isn't the most common
      • which is why I asked if there was a standard picked out of one of them
      • and with chinese artists, seems like 90% of the hk and tw ones have official english names instead of just romanized ones -_- Not sure if it's as prevalent on the mainland
      • DJKC
        is the wiki supposed to be in ISO-8859-1 or unicode? if the latter, how should I file a bug report to have someone fix it so the server properly adds <meta http-equiv="Content-Type" CONTENT="text/html; charset=UTF-8"> to the header?
      • since right now pages like http://wiki.musicbrainz.org/wiki.pl?JapaneseArt... are set as iso-8859-1 and pasted in kana/kanji are getting translated by browsers into the unicode reference numbers so the source displays stuff like <li> <a href="http://www.musicbrainz.org/showartist...; (raico) instead of the actual kanji
      • and for some people like mo apparently not being translated and just resulting in hosed characters
      • if you're creating lists of artists in foreign scripts it might be fairly important to get that set right soon, since it'll cause issues with some web browsers...
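
A minimal sketch of the effect described above, assuming Python 3 (not something used in the discussion). When text containing kanji has to fit into an ISO-8859-1 page, each character outside that charset ends up as a numeric character reference, which is what then shows up in the page source. The function names are hypothetical.

```python
text = "楽坊"  # two kanji that come up later in this log

def to_latin1_html(s):
    """Encode for an ISO-8859-1 page; unencodable characters become &#NNNN; references."""
    return s.encode("iso-8859-1", errors="xmlcharrefreplace")

def to_utf8_html(s):
    """Encode for a UTF-8 page; the kanji are stored as themselves."""
    return s.encode("utf-8")

print(to_latin1_html(text))                # numeric references instead of kanji
print(to_utf8_html(text).decode("utf-8"))  # the original kanji survive
```

Serving the page as UTF-8, as the meta tag above requests, avoids the conversion entirely.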
      • dupuy: http://musicbrainz.org/showmod.html?modid=1345272 out of curiosity, where did you find them referred to as that?
      • dupuy
        asahi 12 girls... honestly, I don't remember - I'd guess that came out of some systran type machine translator like babelfish.altavista.com, although it doesn't translate it that way (any more?)
      • DJKC
        well
      • if you stick asahi 12 girls in google with quotes there are no hits
      • dupuy
        yep - tried that first :-)
      • DJKC
        without them the top hit is for the Ueto Aya and Bishoujo Club 31 show Girls A Go Go
      • which is on TV Asahi
      • Asahi being either a giant conglomerate or a bunch of unrelated similarly named companies
      • I've never really checked
      • but there's TV Asahi, Asahi newspaper, and Asahi beer
      • dupuy
        does it make any sense as a translation for 楽坊?
      • DJKC
        didn't paste right, my settings are on shift-jis
      • just google it and paste the google link
      • but the Japanese reading is Joshi Juuni Gakubou
      • see the mailing list
      • dupuy
        yes, saw your mail
      • DJKC
        since I wrote a message about the whole mess dealing with them there
      • what did you paste there though? my encoding settings ate it
      • dupuy
        if it's a chinese group, i would use that as their primary name, and merge the english/japanese entries into a single artist
      • (just the last two chars of the artist name in japanese)
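
A minimal sketch of how a paste can get "eaten" like this, assuming Python 3 and assuming the sender's client transmitted UTF-8 (the log does not say which encoding was actually on the wire). The helper `read_as` is hypothetical.

```python
original = "楽坊"
wire_bytes = original.encode("utf-8")  # assumption: the sending client uses UTF-8

def read_as(data, codec):
    """Decode bytes with the receiver's codec, marking undecodable bytes instead of failing."""
    return data.decode(codec, errors="replace")

garbled = read_as(wire_bytes, "shift_jis")  # receiver set to Shift_JIS
restored = read_as(wire_bytes, "utf-8")     # receiver set to the sender's encoding

print(garbled)
print(restored)
```

The bytes are identical in both cases; only the receiver's assumption about the encoding differs.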
      • DJKC
        ah
      • dupuy
        (what I had as Yuefang)
      • DJKC
        that's the Chinese reading I think
      • although not enough syllables
      • dupuy
        I remember puzzling over that one quite a bit
      • DJKC
        12 Girls Band is a literal translation
      • and what they release as in English
      • The Twelve Muses is a better and more poetic translation
      • dupuy
        starting with some misencoded gibberish, trying to match it up with google hits, etc.
      • DJKC
        breaking it down it's Girls Twelve Music Priest
      • so Muse seems to fit best. But they went with a simple "Girls Band"
      • dupuy
        if they release in english as 12 Girls Band, that's the best sort name
      • DJKC
        dupuy: Twelve Girls Band, actually, but yeah
      • dupuy
        yeah - that's a whole other issue
      • DJKC
        any idea if there are any Chinese speakers on that mailing list? I'm fairly certain which is the Simplified chinese reading and that that's the one for mainland china
      • but not 100%
      • dupuy
        it's hard enough getting everyone to buy into the romanization of sortnames
      • i believe so, although scottt (Scott Tsai) really is more familiar with traditional chinese
      • lindestinel also reads/writes some chinese, I believe
      • DJKC
        well, I'm not entirely in favor of the romanizing either ^^; I wouldn't mind it except for the fact that it doesn't sort Japanese names in Japanese sort order
      • so I'd almost rather they were all listed at the end and hopefully sorted right there
      • dupuy
        well, the thing is, nothing I know of would do that either...
      • i think "hopefully" is the operative word
      • DJKC
        I think but am not certain that to some extent they're sorted in unicode order
      • dupuy
        yes, that's true
      • but unicode order != alphabetic order
      • for every alphabet it supports
      • DJKC
        " I think but am not certain that to some extent they're sorted in unicode in the correct Japanese order" <--- how that should have been
      • but yeah, would need to be able to label what language an artist's name was in
      • dupuy
        you would need some kind of crazy big collation locale to do it right for any non-roman script language
      • DJKC
        so that it knew how to sort it before you could have a reasonable expectation of correct sorting
      • dupuy
        I imagine that unicode is "mostly" in the correct order, except for the exceptions
      • just like Cyrillic or Greek
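
A minimal sketch of one such exception, assuming Python 3 (not something used in the discussion). In Unicode the Cyrillic letter Ё (U+0401) is placed before А (U+0410), although in the Russian alphabet it follows Е. The sample words and the helper `alphabet_key` are illustrative only.

```python
RUSSIAN_ALPHABET = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
RANK = {letter: i for i, letter in enumerate(RUSSIAN_ALPHABET)}

def alphabet_key(word):
    """Sort key that follows alphabet position; unknown characters sort last."""
    return [RANK.get(ch, len(RANK)) for ch in word.upper()]

words = ["Ёлка", "Дом", "Ель"]

by_code_point = sorted(words)                  # plain Unicode code point order
by_alphabet = sorted(words, key=alphabet_key)  # Russian alphabet order

print(by_code_point)  # ['Ёлка', 'Дом', 'Ель']
print(by_alphabet)    # ['Дом', 'Ель', 'Ёлка']
```

A hand-written rank table like this only covers one alphabet; a proper collation locale does the same job for a whole language, which is why it gets "crazy big" once many scripts are involved.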
      • DJKC
        but if you had it tagged as artist name in kanji plus romanized sort name and the whole tagged as Japanese
      • then you could have the db correctly sort those ones in a i u e o ka ki ku ke ko etc order
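
A minimal sketch of that idea, assuming Python 3 and a hypothetical `Artist` record (this is not the MusicBrainz schema). Romanized sort names tagged as Japanese are split into syllables and ranked in a i u e o ka ki ku ke ko order. The syllable table covers only the basic sounds, the sample names are made up for illustration, and voiced or combined sounds simply sort last here.

```python
from dataclasses import dataclass

# Basic gojuon order of romanized syllables (voiced and combined sounds omitted).
GOJUON = ("a i u e o ka ki ku ke ko sa shi su se so ta chi tsu te to "
          "na ni nu ne no ha hi fu he ho ma mi mu me mo ya yu yo "
          "ra ri ru re ro wa wo n").split()
RANK = {syllable: i for i, syllable in enumerate(GOJUON)}

@dataclass
class Artist:        # hypothetical record for this sketch
    sort_name: str   # romanized sort name
    language: str    # language tag, e.g. "ja"

def gojuon_key(romaji):
    """Greedy longest-match split into syllables, mapped to gojuon positions."""
    s = romaji.lower().replace(" ", "")
    key, i = [], 0
    while i < len(s):
        for size in (3, 2, 1):
            chunk = s[i:i + size]
            if chunk in RANK:
                key.append(RANK[chunk])
                i += len(chunk)
                break
        else:
            key.append(len(GOJUON))  # letter not in the table: sort last
            i += 1
    return key

def sort_artists(artists):
    """Japanese-tagged names in gojuon order; others alphabetically.
    Grouping the Japanese entries first is an arbitrary choice for this sketch."""
    def key(a):
        if a.language == "ja":
            return (0, gojuon_key(a.sort_name), a.sort_name)
        return (1, [], a.sort_name.lower())
    return sorted(artists, key=key)

names = ["Sato", "Ueto", "Kimura", "Aiko", "Ito"]  # made-up sample sort names
artists = [Artist(n, "ja") for n in names]
print([a.sort_name for a in sort_artists(artists)])
# ['Aiko', 'Ito', 'Ueto', 'Kimura', 'Sato']
```

Plain alphabetical sorting would put Ueto last; the language tag is what lets the sort pick the gojuon key instead.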
      • dupuy
        that could probably be done with a lot of smart coding and some custom collation locales
      • DJKC
        need to be able to tag artist names for language first
      • dupuy
        but what player would support that
      • yes, first things first.
      • DJKC
        dupuy: ? lilith sorts that way off the top of my head
      • for mp3 players
      • dupuy
        one of the things I want to do in the i18n wiki page is lay out a development plan
      • lilith = device? pc software?
      • DJKC
      • mp3 player program
      • dupuy
        hmm, that page crashed my browser... bad luck
      • DJKC
        don't have Japanese fonts installed?
      • dupuy
        anyhow, I'm sure they did quite a bit of work to make that happen, and don't expect many other non-jp s/w or players to do it
      • DJKC
        encoding is shift_jis
      • dupuy: well, that's for automatic sorting. if you have a romanized sort name it's pretty easy
      • dupuy
        I have unicode fonts, but some don't appear in mozilla, and some pages crash it (so I use konqueror for that)
      • so lilith sorts by pronunciation for kanji w/o romanization (or kana)?
      • DJKC
        most common reading
      • dupuy
        how does it do with the sorting of the cyrillic smiley faces? :-)
      • DJKC
        it may be taking advantage of some inherent order in shift_jis
      • dupuy
        perhaps
      • DJKC
        dupuy: I don't know, I've never tried. I rarely use it
      • I use foobar + search button
      • dupuy
        yeah, when there are enough entries, the alphabetic browsing gets slow
      • it's moderately useless already for MB
      • DJKC
        I sort of noticed that
      • when you have to jump through 150 pages to find the right one
      • dupuy
        i found it handy to find artists with bad sortnames, not much else
      • i just use the symbol page
      • DJKC
        but with music players browsing isn't hard if you use a decent file system layout and have it all sorted by path
      • dupuy
        check the first and last few pages
      • DJKC
        but if you have 20k+ songs in the playlist then searching is faster than browsing 90%+ of the time
      • dupuy
        or MB, with 200K+ albums, 2.5M+ tracks
      • DJKC
        yes, there's that too. But I don't expect most people to have 2.5M+ tracks on their network
      • don't know many with more than 100k and none with over 200k
      • so unless you're planning to go steal the last.fm servers or something it'd be hard to beat MB's track count for your personal music collection
      • dupuy: so to be absolutely clear - I should delete current mods merging English and Japanese versions of the same album and when I have the time can go on a spree adding all the exact same albums I already added in Japanese to the same artists romanized into English?
      • dupuy
        that's my opinion
      • DJKC
        dupuy: transliteration is preferred, correct? not translation.
      • dupuy: and should I add TRMs and disc IDs to those ones like I've been doing for the JP versions of them?
      • dupuy
        My feeling is that "Transliteration [Translation]" is sort of the best
      • but I would not object to en: versions of albums with only one or the other
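
A minimal sketch of that naming convention, assuming Python 3. The function `english_title` is hypothetical, and the sample strings are the reading and English name mentioned earlier in this log, used only as illustration.

```python
def english_title(transliteration=None, translation=None):
    """Build an English-version title as 'Transliteration [Translation]',
    falling back to whichever part is available."""
    if transliteration and translation:
        return f"{transliteration} [{translation}]"
    if transliteration or translation:
        return transliteration or translation
    raise ValueError("need a transliteration, a translation, or both")

print(english_title("Joshi Juuni Gakubou", "Twelve Girls Band"))
# Joshi Juuni Gakubou [Twelve Girls Band]
print(english_title(transliteration="Joshi Juuni Gakubou"))
# Joshi Juuni Gakubou
```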
      • DJKC
        well, mostly have been adding TRMs, just a few disc IDs. always a pain to go get out the actual CD for sticking it in the drive for 3 seconds
      • dupuy
        the idea is to get as much of the translation data as possible
      • DJKC
        dupuy: well, I'd be more likely to transliterate than translate
      • dupuy
        there's a trick for inserting disc ids from the track info
      • DJKC
        since a lot of things that sound fine in Japanese are absolutely horrible and corny in English
      • dupuy
        so true of every language
      • DJKC
        dupuy: well, I could make a cue sheet, mount that on a virtual drive, and then submit off that
      • dupuy
        one of the tricks of translation
      • DJKC
        but that seems like a bit more work than going and getting the actual disk
      • dupuy: well, you can name a Japanese song LOVE LOVE Shine
      • in English like that
      • and it's okay
      • but if you actually named an English song LOVE LOVE Shine it sounds bad...
      • dupuy
        (especially when it gets capitalized to Love Love Shine) - I've been meaning to write up another RFE for a CopyDiscID moderation
      • DJKC
        question on capitalization - Japanese tends to be very consistent with their capitalization abuse of English in titles across releases