the best I could think up would be suggesting that someone transliterates using one system, but not forcing it
dupuy
since otherwise you get the French and Germans arguing about Tchaikovsky
(the english transliteration is semi-french)
on the wiki page, I actually wrote:
o If the name is unrecognizable e.g. it uses a non-roman character set (Cyrillic, Kanji etc.) use the standard transliteration into roman characters if there is one.
+ If the artist's home country's language does not define a standard transliteration into roman characters (e.g. Pinyin, Japanese romaji), use the transliteration into English - transliterations into other languages may be different (e.g. Ч = Ch in English, but Ч = Tch in French).
+ Note that common English spellings should be preferred, e.g. Tchaikovsky, not Chajkovskij.
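The wiki rule quoted above (prefer a common English spelling, otherwise fall back to a letter-by-letter transliteration) could be sketched roughly like this. The tables are hypothetical and deliberately tiny: they cover only the letters of one name and are not any official standard.

```python
# Hypothetical letter table: just enough letters for the example, not a real standard.
LETTERS = {"Ч": "Ch", "а": "a", "й": "j", "к": "k", "о": "o",
           "в": "v", "с": "s", "и": "i"}

# Hypothetical exception list of well-known English spellings.
COMMON_ENGLISH = {"Чайковский": "Tchaikovsky"}

def letter_by_letter(name):
    """Transliterate one character at a time, passing unknown characters through."""
    return "".join(LETTERS.get(ch, ch) for ch in name)

def romanize(name):
    """Prefer the common English spelling; otherwise transliterate mechanically."""
    return COMMON_ENGLISH.get(name, letter_by_letter(name))

print(letter_by_letter("Чайковский"))  # Chajkovskij
print(romanize("Чайковский"))          # Tchaikovsky
```

A French-oriented table would simply map Ч to "Tch" instead, which is the disagreement described above.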
cikkolata
what about when there are several ways to transliterate into english? (that's mostly what I'm thinking about)
dupuy
this was just a political choice, and has an English-centric bias
if there are several ways, pick the "official" one (based on country)
(again, political)
cikkolata
does russian have an official one? I don't know
dupuy
so a mainland Chinese artist would use Pinyin, a Taiwanese artist something else
note also that it would be different for Russia vs. Ukraine
it basically gets to the point that only experts could actually point to an authority
and everyone else just muddles along
me, I depend on google-fights
whichever variant gets the most hits...
cikkolata
what if both have the same number? (I've seen that for obscure stuff)
and of the (four?) standards they mention, I would lean toward the UN version (as they claim it is used in Russia) or maybe the ISO (although that is surely the least readable)
DJKC
dupuy: Japan has several major transliteration systems and the official one isn't the most common
which is why I asked if there was a standard picked out of one of them
and with chinese artists, seems like 90% of the hk and tw ones have official english names instead of just romanized ones -_- Not sure if it's as prevalent on the mainland
is the wiki supposed to be in ISO-8859-1 or unicode? if the latter, how should I file a bug report to have someone fix it so the server properly adds <meta http-equiv="Content-Type" CONTENT="text/html; charset=UTF-8"> to the header?
and for some people like mo apparently not being translated and just resulting in hosed characters
if you're creating lists of artists in forum scripts might be fairly important to get that set right soon since it'll cause issues with some web browsers...
asahi 12 girls... honestly, I don't remember - I'd guess that came out of some systran type machine translator like babelfish.altavista.com, although it doesn't translate it that way (any more?)
DJKC
well
if you stick asahi 12 girls in google with quotes there are no hits
dupuy
yep - tried that first :-)
DJKC
without them the top hit is for the Ueto Aya and Bishoujo Club 31 show Girls A Go Go
which is on TV Asahi
Asahi being either a giant conglomerate or a bunch of unrelated, similarly named companies
I've never really checked
but there's TV Asahi, Asahi newspaper, and Asahi beer
dupuy
does it make any sense as a translation for 楽坊?
DJKC
didn't paste right, my settings are on shift-jis
just google it and paste the google link
but the Japanese reading is Joshi Juuni Gakubou
see the mailing list
dupuy
yes, saw your mail
DJKC
since I wrote a message about the whole mess dealing with them there
what did you paste there though? my encoding settings ate it
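The "encoding settings ate it" problem can be reproduced in a few lines. This is only a sketch, assuming the text was sent as UTF-8 and read by a client set to Shift_JIS; the actual clients and encodings in use here are not known.

```python
text = "楽坊"

# Bytes as a UTF-8 client would send them.
sent = text.encode("utf-8")

# A client configured for Shift_JIS interprets the same bytes differently;
# errors="replace" substitutes a marker for byte sequences it cannot decode.
garbled = sent.decode("shift_jis", errors="replace")

# Decoding with the encoding that was actually used recovers the text.
recovered = sent.decode("utf-8")

print(repr(garbled))
print(recovered)
```

The bytes are not damaged in transit, they are only misread, which is why pasting a link instead of the raw characters sidesteps the issue.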
dupuy
if it's a chinese group, i would use that as their primary name, and merge the english/japanese entries into a single artist
(just the last two chars of the artist name in japanese)
DJKC
ah
dupuy
(what I had as Yuefang)
DJKC
that's the Chinese reading I think
although not enough syllables
dupuy
I remember puzzling over that one quite a bit
DJKC
12 Girls Band is a literal translation
and what they release as in English
The Twelve Muses is a better and more poetic translation
dupuy
starting with some misencoded gibberish, trying to match it up with google hits, etc.
DJKC
breaking it down it's Girls Twelve Music Priest
so Muse seems to fit best. But they went with a simple "Girls Band"
if they release in english as 12 Girls Band, that's the best sort name
DJKC
dupuy: Twelve Girls Band, actually, but yeah
dupuy
yeah - that's a whole other issue
DJKC
any idea if there are any Chinese speakers on that mailing list? I'm fairly certain which is the Simplified chinese reading and that that's the one for mainland china
but not 100%
dupuy
it's hard enough getting everyone to buy into the romanization of sortnames
i believe so, although scottt (Scott Tsai) really is more familiar with traditional chinese
lindestinel also reads/writes some chinese, I believe
DJKC
well, I'm not entirely in favor of the romanizing either ^^; I wouldn't mind it except for the fact that it doesn't sort Japanese names in Japanese sort order
so I'd almost rather they were all listed at the end and hopefully sorted right there
dupuy
well, the thing is, nothing I know of would do that either...
i think "hopefully" is the operative word
DJKC
I think but am not certain that to some extent they're sorted in unicode order
dupuy
yes, that's true
but unicode order != alphabetic order
for every alphabet it supports
DJKC
" I think but am not certain that to some extent they're sorted in unicode in the correct Japanese order" <--- how that should have been
but yeah, would need to be able to label what language an artist's name was in
dupuy
you would need some kind of crazy big collation locale to do it right for any non-roman script language
DJKC
so that it knew how to sort it before you could have a reasonable expectation of correct sorting
dupuy
I imagine that unicode is "mostly" in the correct order, except for the exceptions
just like Cyrillic or Greek
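A small illustration of "unicode order != alphabetic order", using Cyrillic since it came up. Python's default string sort compares code points, and the letter ё sits outside the main а..я run. The alphabet-based key is a hand-rolled assumption for this sketch, not a real collation locale.

```python
# The 33-letter Russian alphabet in dictionary order.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

def russian_key(word):
    """Sort key by alphabet position; assumes every letter is in ALPHABET."""
    return [ALPHABET.index(ch) for ch in word.lower()]

words = ["яма", "ёж", "еда"]

by_code_point = sorted(words)                  # compares raw code points
by_alphabet = sorted(words, key=russian_key)   # compares alphabet positions

print(by_code_point)  # ['еда', 'яма', 'ёж']
print(by_alphabet)    # ['еда', 'ёж', 'яма']
```

A proper fix would use locale-aware collation rather than a per-language table like this one.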
DJKC
but if you had it tagged as artist name in kanji plus romanized sort name and the whole tagged as Japanese
then you could have the db correctly sort those ones in a i u e o ka ki ku ke ko etc order
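Sorting by a romanized sort name in a i u e o ka ki ku ke ko order could look roughly like the sketch below. It assumes sort names are plain romaji built from an optional consonant plus a vowel; digraphs such as "shi" or "tsu", long vowels, and voiced rows are not handled, and the names used are just sample strings.

```python
import re

# Consonant rows in gojuon order; the empty string is the bare-vowel row.
ROWS = ["", "k", "s", "t", "n", "h", "m", "y", "r", "w"]
VOWELS = "aiueo"
SYLLABLE = re.compile(r"([kstnhmyrw]?)([aiueo])")

def gojuon_key(sortname):
    """Turn a romaji sort name into a list of (row, vowel) positions."""
    return [(ROWS.index(c), VOWELS.index(v))
            for c, v in SYLLABLE.findall(sortname.lower())]

names = ["kokia", "ueto", "aiko", "inoue"]

latin_order = sorted(names)
gojuon_order = sorted(names, key=gojuon_key)

print(latin_order)   # ['aiko', 'inoue', 'kokia', 'ueto']
print(gojuon_order)  # ['aiko', 'inoue', 'ueto', 'kokia']
```

This only works once each artist is tagged with a language, so that the key is applied to Japanese sort names and nothing else.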
dupuy
that could probably be done with a lot of smart coding and some custom collation locales
DJKC
need to be able to tag artist names for language first
dupuy
but what player would support that
yes, first things first.
DJKC
dupuy: ? lilith sorts that way off the top of my head
for mp3 players
dupuy
one of the things I want to do in the i18n wiki page is lay out a development plan
anyhow, I'm sure they did quite a bit of work to make that happen, and don't expect many other non-jp s/w or players to do it
DJKC
encoding is shift_jis
dupuy: well, that's for automatic sorting. if you have a romanized sort name it's pretty easy
dupuy
I have unicode fonts, but some don't appear in mozilla, and some pages crash it (so I use konqueror for that)
so lilith sorts by pronunciation for kanji w/o romanization (or kana)?
DJKC
most common reading
dupuy
how does it do with the sorting of the cyrillic smiley faces? :-)
DJKC
it may be taking advantage of some inherent order in shift_jis
dupuy
perhaps
DJKC
dupuy: I don't know, I've never tried. I rarely use it
I use foobar + search button
dupuy
yeah, when there are enough entries, the alphabetic browsing gets slow
it's moderately useless already for MB
DJKC
I sort of noticed that
when you have to jump through 150 pages to find the right one
dupuy
i found it handy to find artists with bad sortnames, not much else
i just use the symbol page
DJKC
but with music players browsing isn't hard if you use a decent file system layout and have it all sorted by path
dupuy
check the first and last few pages
DJKC
but if you have 20k+ songs in the playlist then searching is faster than browsing 90%+ of the time
dupuy
or MB, with 200K+ albums, 2.5M+ tracks
DJKC
yes, there's that too. But I don't expect most people to have 2.5M+ tracks on their network
don't know many with more than 100k and none with over 200k
so unless you're planning to go steal the last.fm servers or something it'd be hard to beat MB's track count for your personal music collection
dupuy: so to be absolutely clear - I should delete current mods merging English and Japanese versions of the same album and when I have the time can go on a spree adding all the exact same albums I already added in Japanese to the same artists romanized into English?
dupuy
that's my opinion
DJKC
dupuy: transliteration is preferred, correct? not translation.
dupuy: and should I add TRMs and disc IDs to those ones like I've been doing for the JP versions of them?
dupuy
My feeling is that "Transliteration [Translation]" is sort of the best
but I would not object to en: versions of albums with only one or the other
DJKC
well, mostly have been adding TRMs, just a few disc IDs. always a pain to go get out the actual CD for sticking it in the drive for 3 seconds
dupuy
the idea is to get as much of the translation data as possible
DJKC
dupuy: well, I'd be more likely to transliterate than translate
dupuy
there's a trick for inserting disc ids from the track info
DJKC
since a lot of things that sound fine in Japanese are absolutely horrible and corny in English
dupuy
so true of every language
DJKC
dupuy: well, I could make a cue sheet, mount that on a virtual drive, and then submit off that
dupuy
one of the tricks of translation
DJKC
but that seems like a bit more work than going and getting the actual disk
dupuy: well, you can name a Japanese song LOVE LOVE Shine
in English like that
and it's okay
but if you actually named an English song LOVE LOVE Shine it sounds bad...
dupuy
(especially when it gets capitalized to Love Love Shine) - I've been meaning to write up another RFE for a CopyDiscID moderation
DJKC
question on capitalization - Japanese tends to be very consistent with their capitalization abuse of English in titles across releases