the best I could think up would be suggesting that someone transliterates using one system, but not forcing it
2004-12-01 33626, 2004
dupuy
since otherwise you get the French and Germans arguing about Tchaikovsky
2004-12-01 33643, 2004
dupuy
(the english transliteration is semi-french)
2004-12-01 33616, 2004
dupuy
on the wiki page, I actually wrote:
2004-12-01 33618, 2004
dupuy
o If the name is unrecognizable e.g. it uses a non-roman character set (Cyrillic, Kanji etc.) use the standard transliteration into roman characters if there is one.
2004-12-01 33620, 2004
dupuy
+ If the artist's home country's language does not define a standard transliteration into roman characters (e.g. Pinyin, Japanese romaji), use the transliteration into English - transliterations into other languages may be different (e.g. Ч = Ch in English, but Ч = Tch in French).
2004-12-01 33622, 2004
mkoebele
mkoebele is now known as mkoebele|zzz
2004-12-01 33628, 2004
dupuy
+ Note that common English spellings should be preferred, e.g. Tchaikovsky, not Chajkovskij.
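A rough sketch of why the target language matters for the rule quoted above. The tables are hypothetical and cover only the letters needed for one name; they are not any official transliteration standard, just the conventional English, French and German spellings of Чехов.

```python
# Hypothetical per-language tables covering only the letters used below.
TABLES = {
    "en": {"ч": "ch", "е": "e", "х": "kh", "о": "o", "в": "v"},
    "fr": {"ч": "tch", "е": "e", "х": "kh", "о": "o", "в": "v"},
    "de": {"ч": "tsch", "е": "e", "х": "ch", "о": "o", "в": "w"},
}

def transliterate(name: str, target: str) -> str:
    """Transliterate letter by letter using the table for the target language."""
    table = TABLES[target]
    out = []
    for ch in name:
        roman = table.get(ch.lower(), ch)  # unknown characters pass through
        out.append(roman.capitalize() if ch.isupper() else roman)
    return "".join(out)

for lang in ("en", "fr", "de"):
    print(lang, transliterate("Чехов", lang))
# en Chekhov
# fr Tchekhov
# de Tschechow
```

The same five Cyrillic letters give three different roman spellings, which is the argument for picking one target language in the guideline.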
2004-12-01 33612, 2004
cikkolata
what about when there are several ways to transliterate into english? (that's mostly what I'm thinking about)
2004-12-01 33620, 2004
dupuy
this was just a political choice, and has an English-centric bias
2004-12-01 33641, 2004
dupuy
if there are several ways, pick the "official" one (based on country)
2004-12-01 33648, 2004
dupuy
(again, political)
2004-12-01 33629, 2004
cikkolata
does russian have an official one? I don't know
2004-12-01 33632, 2004
dupuy
so a mainland Chinese artist would use Pinyin, a Taiwanese artist something else
2004-12-01 33654, 2004
dupuy
note also that it would be different for Russia vs. Ukraine
2004-12-01 33630, 2004
dupuy
it basically gets to the point that only experts could actually point to an authority
2004-12-01 33640, 2004
dupuy
and everyone else just muddles along
2004-12-01 33650, 2004
dupuy
me, I depend on google-fights
2004-12-01 33603, 2004
dupuy
whichever variant gets the most hits...
2004-12-01 33629, 2004
cikkolata
what if both have the same number? (I've seen that for obscure stuff)
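A minimal sketch of the "google-fight" rule, including the tie case raised here. It assumes the hit counts have already been collected by hand; the function name and the numbers are made up for illustration and are not real search results.

```python
def google_fight(hits: dict[str, int]) -> list[str]:
    """Return the variant(s) with the most hits; more than one entry means a tie."""
    if not hits:
        return []
    best = max(hits.values())
    return [variant for variant, count in hits.items() if count == best]

# Made-up counts, for illustration only.
counts = {"Tchaikovsky": 1000, "Chaikovsky": 200, "Chajkovskij": 50}
winners = google_fight(counts)
if len(winners) == 1:
    print("use", winners[0])
else:
    print("tie between", winners, "so some other rule has to decide")
```

Returning a list rather than a single name makes the tie visible, so that an obscure name with equal counts falls through to whatever secondary rule is agreed on.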
and of the (four?) standards they mention, I would lean toward the UN version (as they claim it is used in Russia) or maybe the ISO (although that is surely the least readable)
2004-12-01 33610, 2004
DJKC
dupuy: Japan has several major transliteration systems and the official one isn't the most common
2004-12-01 33631, 2004
DJKC
which is why I asked if there was a standard picked out of one of them
2004-12-01 33632, 2004
DJKC
and with chinese artists, seems like 90% of the hk and tw ones have official english names instead of just romanized ones -_- Not sure if it's as prevalent on main land
2004-12-01 33654, 2004
Knio
Knio is now known as Knio-dinner
2004-12-01 33631, 2004
Knio-dinner
Knio-dinner is now known as Knio
2004-12-01 33601, 2004
DJKC
is the wiki supposed to be in ISO-8859-1 or unicode? if the latter how should I file a bug report to have someone fix it so the server properly adds <meta http-equiv="Content-Type" CONTENT="text/html; charset=UTF-8"> to the header?
and for some people like mo apparently not being translated and just resulting in hosed characters
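A small sketch of what "hosed characters" means in practice, assuming the page really is sent as UTF-8 but the browser falls back to ISO-8859-1 because no charset is declared. The sample text is just the two characters discussed later in this log.

```python
text = "楽坊"
raw = text.encode("utf-8")  # the bytes a UTF-8 page sends over the wire

# A browser assuming ISO-8859-1 turns every byte into one Latin-1 character.
wrong = raw.decode("iso-8859-1")
# A browser told the page is UTF-8 recovers the original text.
right = raw.decode("utf-8")

print(len(text), len(wrong))  # 2 6
print(right == text)          # True
```

Each of the two characters takes three bytes in UTF-8, so the mis-decoded version shows six unrelated Latin-1 characters instead.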
2004-12-01 33625, 2004
DJKC
if you're creating lists of artists in forum scripts might be fairly important to get that set right soon since it'll cause issues with some web browsers...
asahi 12 girls... honestly, I don't remember - I'd guess that came out of some systran type machine translator like babelfish.altavista.com, although it doesn't translate it that way (any more?)
2004-12-01 33650, 2004
DJKC
well
2004-12-01 33601, 2004
DJKC
if you stick asahi 12 girls in google with quotes there are no hits
2004-12-01 33609, 2004
dupuy
yep - tried that first :-)
2004-12-01 33617, 2004
DJKC
without them the top hit is for the Ueto Aya and Bishoujo Club 31 show Girls A Go Go
2004-12-01 33621, 2004
DJKC
which is on TV Asahi
2004-12-01 33638, 2004
DJKC
Asahi being either a giant congomerate or a bunch of unrelated similarly named companies
2004-12-01 33641, 2004
DJKC
I've never really checked
2004-12-01 33656, 2004
DJKC
but there's TV Asahi, Asahi newspaper, and Asahi beer
2004-12-01 33602, 2004
DJKC
*conglomerate
2004-12-01 33606, 2004
dupuy
does it make any sense as a translation for 楽坊?
2004-12-01 33622, 2004
DJKC
didn't paste right, my settings are on shift-jis
2004-12-01 33628, 2004
DJKC
just google it and paste the google link
2004-12-01 33635, 2004
DJKC
but the Japanese reading is Joshi Juuni Gakubou
2004-12-01 33639, 2004
DJKC
see the mailing list
2004-12-01 33649, 2004
dupuy
yes, saw your mail
2004-12-01 33653, 2004
DJKC
since I wrote a message about the whole mess dealing with them there
2004-12-01 33643, 2004
DJKC
what did you paste there though? my encoding settings ate it
2004-12-01 33644, 2004
dupuy
if it's a chinese group, i would use that as their primary name, and merge the english/japanese entries into a single artist
2004-12-01 33600, 2004
dupuy
(just the last two chars of the artist name in japanese)
2004-12-01 33607, 2004
DJKC
ah
2004-12-01 33611, 2004
dupuy
(what I had as Yuefang)
2004-12-01 33621, 2004
DJKC
that's the Chinese reading I think
2004-12-01 33627, 2004
DJKC
although not enough syllables
2004-12-01 33628, 2004
dupuy
I remember puzzling over that one quite a bit
2004-12-01 33636, 2004
DJKC
12 Girls Band is a literal translation
2004-12-01 33641, 2004
DJKC
and what they release as in English
2004-12-01 33651, 2004
DJKC
The Twelve Muses is a better and more poetic translation
2004-12-01 33610, 2004
dupuy
starting with some misencoded gibberish, trying to match it up with google hits, etc.
2004-12-01 33611, 2004
DJKC
breaking it down it's Girl's Twelve Music Priest
2004-12-01 33621, 2004
DJKC
*Girls
2004-12-01 33649, 2004
DJKC
so Muse seems to fit best. But they went with a simple "Girls Band"
if they release in english as 12 Girls Band, that's the best sort name
2004-12-01 33657, 2004
DJKC
dupuy: Twelve Girls Band, actually, but yeah
2004-12-01 33619, 2004
dupuy
yeah - that's a whole other issue
2004-12-01 33636, 2004
DJKC
any idea if there are any Chinese speakers on that mailing list? I'm fairly certain which is the Simplified chinese reading and that that's the one for mainland china
2004-12-01 33638, 2004
DJKC
but not 100%
2004-12-01 33647, 2004
dupuy
it's hard enough getting everyone to buy into the romanization of sortnames
2004-12-01 33633, 2004
dupuy
i believe so, although scottt (Scott Tsai) really is more familiar with traditional chinese
2004-12-01 33645, 2004
dupuy
lindestinel also reads/writes some chinese, I believe
2004-12-01 33610, 2004
DJKC
well, I'm not entirely in favor of the romanizing either ^^; I wouldn't mind it except for the fact that it doesn't sort Japanese names in Japanese sort order
2004-12-01 33625, 2004
DJKC
so I'd almost rather they were all listed at the end and hopefully sorted right there
2004-12-01 33638, 2004
dupuy
well, the thing is, nothing I know of would do that either...
2004-12-01 33658, 2004
dupuy
i think "hopefully" is the operative word
2004-12-01 33620, 2004
DJKC
I think but am not certain that to some extent they're sorted in unicode order
2004-12-01 33640, 2004
dupuy
yes, that's true
2004-12-01 33650, 2004
dupuy
but unicode order != alphabetic order
2004-12-01 33658, 2004
dupuy
for every alphabet it supports
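To illustrate the point with Cyrillic: the letter ё sits outside the main а to я block in Unicode, so a plain code point sort puts it after я. The sketch below compares that with a sort keyed on an explicit alphabet string; the helper name is made up for this example.

```python
# Russian alphabet in dictionary order; ё comes right after е.
RUSSIAN = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
RANK = {letter: i for i, letter in enumerate(RUSSIAN)}

def russian_key(word: str) -> list[int]:
    # Letters outside the table sort after all known letters.
    return [RANK.get(ch, len(RUSSIAN)) for ch in word.lower()]

words = ["яблоко", "ёлка", "ель"]
by_code_point = sorted(words)                 # ё is U+0451, after я (U+044F)
by_alphabet = sorted(words, key=russian_key)

print(by_code_point)  # ['ель', 'яблоко', 'ёлка']
print(by_alphabet)    # ['ель', 'ёлка', 'яблоко']
```

A real system would use a proper collation library with a locale rather than a hand-written table; the table only shows where the two orders diverge.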
2004-12-01 33639, 2004
DJKC
" I think but am not certain that to some extent they're sorted in unicode in the correct Japanese order" <--- how that should have been
2004-12-01 33656, 2004
DJKC
but yeah, would need to be able to label what language an artist's name was in
2004-12-01 33602, 2004
dupuy
you would need some kind of crazy big collation locale to do it right for any non-roman script language
2004-12-01 33614, 2004
DJKC
so that it knew how to sort it before you could have a reasonable expectation of correct sorting
2004-12-01 33625, 2004
dupuy
I imagine that unicode is "mostly" in the correct order, except for the exceptions
2004-12-01 33634, 2004
dupuy
just like Cyrillic or Greek
2004-12-01 33649, 2004
DJKC
but if you had it tagged as artist name in kanji plus romanized sort name and the whole tagged as Japanese
2004-12-01 33604, 2004
DJKC
then you could have the db correctly sort those ones in a i u e o ka ki ku ke ko etc order
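A sketch of the idea, assuming each artist record carries a language tag and a kana reading supplied by an editor. The `reading` and `lang` fields are hypothetical, not existing database columns, and the table omits voiced and small kana to stay short.

```python
from dataclasses import dataclass

# Basic gojuon order: a i u e o, ka ki ku ke ko, ...
GOJUON = ("あいうえおかきくけこさしすせそたちつてと"
          "なにぬねのはひふへほまみむめもやゆよらりるれろわをん")
RANK = {kana: i for i, kana in enumerate(GOJUON)}

@dataclass
class Artist:
    name: str     # as credited, in kanji
    reading: str  # hypothetical field: kana reading
    lang: str     # hypothetical field: language tag

def kana_key(artist: Artist) -> list[int]:
    # Kana missing from the table sort after everything else.
    return [RANK.get(ch, len(GOJUON)) for ch in artist.reading]

artists = [
    Artist("中島美嘉", "なかしまみか", "ja"),
    Artist("倉木麻衣", "くらきまい", "ja"),
    Artist("上戸彩", "うえとあや", "ja"),
]

print(sorted(a.name for a in artists))                      # code point order of the kanji
print([a.name for a in sorted(artists, key=kana_key)])      # gojuon order of the readings
```

The first line prints 上戸彩, 中島美嘉, 倉木麻衣 and the second 上戸彩, 倉木麻衣, 中島美嘉, so the kanji code points alone do not give the a i u e o ka ki ku ke ko order.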
2004-12-01 33648, 2004
dupuy
that could probably be done with a lot of smart coding and some custom collation locales
2004-12-01 33612, 2004
DJKC
need to be able to tag artist names for language first
2004-12-01 33614, 2004
dupuy
but what player would support that
2004-12-01 33622, 2004
dupuy
yes, first things first.
2004-12-01 33637, 2004
DJKC
dupuy: ? lilith sorts that way off the top of my head
2004-12-01 33648, 2004
DJKC
for mp3 players
2004-12-01 33656, 2004
dupuy
one of the things I want to do in the i18n wiki page is lay out a development plan
anyhow, I'm sure they did quite a bit of work to make that happen, and don't expect many other non-jp s/w or players to do it
2004-12-01 33616, 2004
DJKC
encoding is shift_jis
2004-12-01 33641, 2004
DJKC
dupuy: well, that's for automatic sorting. if you have a romanized sort name it's pretty easy
2004-12-01 33648, 2004
dupuy
I have unicode fonts, but some don't appear in mozilla, and some pages crash it (so I use konqueror for that)
2004-12-01 33656, 2004
dupuy
so lilith sorts by pronunciation for kanji w/o romanization (or kana)?
2004-12-01 33612, 2004
DJKC
most common reading
2004-12-01 33644, 2004
dupuy
how does it do with the sorting of the cyrillic smiley faces? :-)
2004-12-01 33647, 2004
DJKC
it may be taking advantage of some inherent order in shift_jis
2004-12-01 33657, 2004
dupuy
perhaps
2004-12-01 33600, 2004
DJKC
dupuy: I don't know, I've never tried. I rarely use it
2004-12-01 33608, 2004
DJKC
I use foobar + search button
2004-12-01 33638, 2004
dupuy
yeah, when there are enough entries, the alphabetic browsing gets slow
2004-12-01 33651, 2004
dupuy
it's moderately useless already for MB
2004-12-01 33656, 2004
DJKC
I sort of noticed that
2004-12-01 33608, 2004
DJKC
when you have to jump through 150 pages to find the right one
2004-12-01 33611, 2004
dupuy
i found it handy to find artists with bad sortnames, not much else
2004-12-01 33632, 2004
dupuy
i just use the symbol page
2004-12-01 33646, 2004
DJKC
but with music players browsing isn't hard if you use a decent file system layout and have it all sorted by path
2004-12-01 33650, 2004
dupuy
check the first and last few pages
2004-12-01 33608, 2004
DJKC
but if you have 20k+ songs in the playlist then searching is faster than browsing 90%+ of the time
2004-12-01 33622, 2004
dupuy
or MB, with 200K+ albums, 2.5M+ tracks
2004-12-01 33645, 2004
DJKC
yes, there's that too. But I don't expect most people to have 2.5M+ tracks on their network
2004-12-01 33604, 2004
DJKC
don't know many with more than 100k and none with over 200k
2004-12-01 33631, 2004
DJKC
so unless you're planning to go steal the last.fm servers or something it'd be hard to beat MB's track count for your personal music collection
2004-12-01 33633, 2004
DJKC
dupuy: so to be absolutely clear - I should delete current mods merging English and Japanese versions of the same album and when I have the time can go on a spree adding all the exact same albums I already added in Japanese to the same artists romanized into English?
2004-12-01 33601, 2004
dupuy
that's my opinion
2004-12-01 33603, 2004
DJKC
dupuy: transliteration is preferred, correct? not translation.
2004-12-01 33619, 2004
]Thread[ joined the channel
2004-12-01 33626, 2004
DJKC
dupuy: and should I add TRMs and disc IDs to those ones like I've been doing for the JP versions of them?
2004-12-01 33635, 2004
dupuy
My feeling is that "Transliteration [Translation]" is sort of the best
2004-12-01 33655, 2004
dupuy
but I would not object to en: versions of albums with only one or the other
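A tiny sketch of the "Transliteration [Translation]" convention, falling back to whichever part is available. The function name is made up for illustration; the example title reuses the reading and English name mentioned earlier in this log.

```python
from typing import Optional

def en_title(transliteration: Optional[str], translation: Optional[str]) -> str:
    """Build an English-locale title, preferring 'Transliteration [Translation]'."""
    if transliteration and translation:
        return f"{transliteration} [{translation}]"
    if transliteration or translation:
        return transliteration or translation
    raise ValueError("need a transliteration or a translation")

print(en_title("Joshi Juuni Gakubou", "Twelve Girls Band"))
# Joshi Juuni Gakubou [Twelve Girls Band]
print(en_title("Joshi Juuni Gakubou", None))
# Joshi Juuni Gakubou
```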
2004-12-01 33607, 2004
DJKC
well, mostly have been adding TRMs, just a few disc IDs. always a pain to go get out the actual CD for sticking it in the drive for 3 seconds
2004-12-01 33611, 2004
dupuy
the idea is to get as much of the translation data as possible
2004-12-01 33623, 2004
DJKC
dupuy: well, I'd be more likely to transliterate than translate
2004-12-01 33638, 2004
dupuy
there's a trick for inserting disc ids from the track info
2004-12-01 33639, 2004
DJKC
since a lot of things that sound fine in Japanese are absolutely horrible and corny in English
2004-12-01 33652, 2004
dupuy
so true of every language
2004-12-01 33657, 2004
DJKC
dupuy: well, I could make a cue sheet, mount that on a virtual drive, and then submit off that
2004-12-01 33604, 2004
dupuy
one of the tricks of translation
2004-12-01 33609, 2004
DJKC
but that seems like a bit more work than going and getting the actual disk
2004-12-01 33627, 2004
DJKC
dupuy: well, you can name a Japanese song LOVE LOVE Shine
2004-12-01 33630, 2004
DJKC
in English like that
2004-12-01 33632, 2004
DJKC
and it's okay
2004-12-01 33648, 2004
DJKC
but if you actually named an English song LOVE LOVE Shine it sounds bad...
2004-12-01 33623, 2004
dupuy
(especially when it gets capitalized to Love Love Shine) - I've been meaning to write up another RFE for a CopyDiscID moderation
2004-12-01 33623, 2004
DJKC
question on capitalization - Japanese tends to be very consistent with their capitalization abuse of English in titles across releases