no comments? you wouldn't believe how pissed off I am right now with this. hundreds of tags fucked up because of some anal pointless change
2011-01-18 01849, 2011
Muz
Typographic characters in the DB are correct AFAIC, however, for the purpose of tagging, which isn't MB's only purpose, it should be the responsibility of tagging applications to normalise this data into something sane where desired via a user preference.
2011-01-18 01813, 2011
Muz
Justifying the behaviour of metadata tags behind catalog data on the site is a retarded thing to do.
2011-01-18 01809, 2011
luks
are you using Picard? I think nikki made a plugin to use the ASCII equivalents
2011-01-18 01802, 2011
mudcrow
i'm using picard and installed the plugin and it still doesn't recognise the hyphen that's replaced the standard keyboard hyphen that every bugger uses
2011-01-18 01835, 2011
Milosz joined the channel
2011-01-18 01845, 2011
Milosz joined the channel
2011-01-18 01851, 2011
srotta
mudcrow: Yeah, I'm not too fond of that change either 8)
And I don't think it's "responsibility of tagging applications". It might be, if we lived in a perfect world or in a bottle where there's only MB.
2011-01-18 01833, 2011
mudcrow
I really don't think we should be alienating users and potential users by using non-standard characters, especially when those characters cannot be recognised by Picard and other taggers, as tagging still is the main reason people use MB
2011-01-18 01851, 2011
srotta
I still don't know what's the purpose of using typographically correct characters. Yes, it's "correct", and no, I still don't see the benefit. It makes adding data more cumbersome, using data harder and all around gives a nerdy image ;)
2011-01-18 01822, 2011
Muz
Providing decent enough input/correction methods on site to cater for inserting that typographic data to begin with is a given too.
2011-01-18 01844, 2011
Mineo
"non-standard characters" really depends on the standard you're talking about :)
2011-01-18 01830, 2011
Muz
And I did say "should", so I don't know how else to parse that statement about a perfect world, other than it being entirely superfluous.
2011-01-18 01815, 2011
Muz
And given the money generated by people using Picard, vs a commercial license, I do wonder how you can claim that tagging is the main reason to use MB.
2011-01-18 01852, 2011
Muz
Again, the companies parsing and using this data may, or may not, need to normalise it. Given there are plenty of example libraries and pieces of code that do this already, I'm not sure what the problem is.
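As an illustration of the kind of normalisation Muz is describing, here is a minimal Python sketch that folds common typographic punctuation back to ASCII before tags are written. The mapping table and the function name `to_ascii_punctuation` are assumptions made for this example; this is not the code of the Picard plugin mentioned earlier, and a real tagger would probably want a longer table.

```python
# Hypothetical mapping from typographic punctuation to ASCII equivalents.
# The choice of characters is an assumption for illustration only.
ASCII_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / typographic apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2010": "-",    # hyphen (the non-keyboard one)
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# Build the translation table once; str.translate does the replacement.
_TABLE = str.maketrans(ASCII_PUNCTUATION)


def to_ascii_punctuation(text: str) -> str:
    """Replace typographic punctuation with ASCII; leave all other text alone."""
    return text.translate(_TABLE)
```

Only the listed punctuation is touched, so titles in other scripts (Japanese, for instance) pass through unchanged. That keeps the typographic data in the database while letting a user opt in to plain ASCII punctuation in their tags.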
2011-01-18 01825, 2011
srotta
Well, "should" still doesn't change the fact that at the moment nothing normalizes the data, so changing the guideline has immediate effect on every user.
2011-01-18 01809, 2011
srotta
So are the companies normalizing the content? Are they even using it? Do they prefer typographically correct punctuation? Did anyone ask? (No, I haven't read the discussion in style list. Probably should.)
2011-01-18 01853, 2011
srotta
Fact is, at the moment these kinds of changes make me think twice about running Picard on my files. And that makes me think twice about adding anything to MB.
2011-01-18 01837, 2011
reosarevok joined the channel
2011-01-18 01846, 2011
kamiccolo joined the channel
2011-01-18 01810, 2011
kamiccolo has left the channel
2011-01-18 01824, 2011
kepstin-laptop
Huh, and here I�m all happy that the text on the Latin-charset songs in my music library would gradually start looking nicer.
2011-01-18 01851, 2011
kepstin-laptop
of course, given that a rather large chunk of my music library is in Japanese, I�m relying on having software with full Unicode support anyways.
2011-01-18 01826, 2011
ddaydj
i don't get what the issue is
2011-01-18 01854, 2011
ddaydj
picard already has an option to only use ascii in filenames, so why else would it matter?
2011-01-18 01839, 2011
kepstin-laptop
It doesn�t even really matter in filesystems, that option�s really just to get rid of stuff like :"'/ which you can�t use in filenames on various systems.
2011-01-18 01819, 2011
ddaydj
i've had issues with non-ascii filenames when your system language isn't set to english
2011-01-18 01833, 2011
srotta
I don't really care (that much) about file names, because as pointed out, Picard does have the option of removing those.
2011-01-18 01821, 2011
ddaydj
so what's the issue?
2011-01-18 01824, 2011
srotta
The issue is with tag contents, which don't get touched by Picard.
2011-01-18 01824, 2011
kepstin-laptop
weird. Modern versions of Windows use Unicode internally in filenames, so it shouldn�t cause problems there, and most modern Unix variants store filenames in UTF-8.
2011-01-18 01844, 2011
srotta
kepstin-laptop: Except, of course, systems such as Linux.
2011-01-18 01854, 2011
kepstin-laptop uses Linux
2011-01-18 01806, 2011
srotta
Sure, and if you use UTF8 as system locale, it works ok.
2011-01-18 01812, 2011
ddaydj
last time i had my system set to not english was xp, i doubt it got changed in vista or 7
2011-01-18 01820, 2011
nikki wishes kepstin would send utf-8 on irc too :P
2011-01-18 01821, 2011
ddaydj
srotta, what do you mean tag contents?
2011-01-18 01836, 2011
kepstin-laptop
I do send UTF-8 on IRC, don�t I?
2011-01-18 01838, 2011
nikki
no
2011-01-18 01845, 2011
srotta
But at least with ext3 if the system locale is something else, also file names are stored as "something else".
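What srotta describes can be sketched in Python: a Linux filesystem such as ext3 stores file names as raw bytes, so the same name produces different bytes depending on the encoding the locale uses. The artist name, the two encodings and the helper `readable_as_utf8` are example choices, not anything taken from Picard.

```python
name = "Björk"

utf8_bytes = name.encode("utf-8")      # bytes a UTF-8 locale would write
latin1_bytes = name.encode("latin-1")  # bytes an ISO-8859-1 locale would write


def readable_as_utf8(raw: bytes) -> bool:
    """Return True if the raw file name bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

A program that assumes UTF-8 would read the first name fine and choke on the second, which is the "something else" case.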
2011-01-18 01846, 2011
kepstin-laptop
hmm. I should be...
2011-01-18 01847, 2011
nikki
I'm getting incompatible encoding every time you use an apostrophe
2011-01-18 01855, 2011
nikki
and the chatlogs show some weird character
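One plausible way to end up with that character, sketched in Python: if a client sends the typographic apostrophe in a single-byte encoding and the receiver decodes the bytes as UTF-8, the byte is invalid and gets swapped for U+FFFD, the replacement character. The log does not say which encoding kepstin-laptop's client was actually sending, so Windows-1252 here is an assumption.

```python
message = "I\u2019m happy"  # contains the typographic apostrophe, U+2019

# Assumed misconfigured client: encodes outgoing text as Windows-1252.
sent = message.encode("cp1252")

# Receiver (or logger) expects UTF-8 and substitutes U+FFFD for bad bytes.
received = sent.decode("utf-8", errors="replace")

# With both sides on UTF-8 the text round-trips unchanged.
fixed = message.encode("utf-8").decode("utf-8")
```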
2011-01-18 01800, 2011
srotta
ddaydj: The titles etc in tags.
2011-01-18 01824, 2011
srotta
And still, I'm wondering what's the benefit. Really.
2011-01-18 01833, 2011
ddaydj
so you want picard to use ascii in the tags too?
2011-01-18 01835, 2011
nikki
ah, the replacement character. why the hell does it show up as a greek character for me
2011-01-18 01835, 2011
pesto joined the channel
2011-01-18 01843, 2011
kepstin-laptop
hmm.
2011-01-18 01850, 2011
kepstin-laptop
This better: “”?
2011-01-18 01857, 2011
nikki
yep!
2011-01-18 01806, 2011
srotta
ddaydj: No, I'm perfectly happy to use UTF-8 where it's needed.
2011-01-18 01816, 2011
kepstin-laptop
Weird. Not sure why that got misconfigured.
2011-01-18 01849, 2011
srotta
ddaydj: I'm questioning whether it's worth it to use it in double quotes etc. when I'd say the majority of the world uses the non-typographically-correct versions.
2011-01-18 01845, 2011
kepstin-laptop
The majority of the world can’t spell their artist names correctly or capitalize song titles correctly either, if you go look at the Last.fm data :/
kepstin-laptop: Yep, so why make it even harder for them to enter it, when it's perfectly acceptable to use the regular versions of the characters.
2011-01-18 01805, 2011
ddaydj
ah, i think i see
2011-01-18 01831, 2011
kepstin-laptop
If they’re using a tagger program they wouldn’t have to manually enter the characters unless they’re editing the DB… In which case they don’t have to manually enter the characters, because it’s optional.
2011-01-18 01841, 2011
srotta
Do we really have enough active editors to make the editing harder for the next guy?
2011-01-18 01809, 2011
reosarevok
It is not really *harder*, as it is preferred, not mandatory, isn't it?
2011-01-18 01810, 2011
nikki
how is it harder? nobody *has* to use those characters, but now the people who do care are allowed to edit them
2011-01-18 01830, 2011
kepstin-laptop
I’ll repeat that: When you’re editing the DB, you don’t have to add those characters, you can let someone else who cares fix it later.
2011-01-18 01855, 2011
srotta
nikki: So we have multiple versions of the same title name, and everything's perfectly alright, so we end up with borked data like the last.fm example.
2011-01-18 01813, 2011
kepstin-laptop
Last.fm has lots of issues other than that ;)
2011-01-18 01817, 2011
ddaydj
^
2011-01-18 01824, 2011
reosarevok
hark!
2011-01-18 01830, 2011
kepstin-laptop
Let me know when they finally sort out how to tell apart different artists with the same name.
2011-01-18 01831, 2011
srotta
Sure, but that's just one concrete example of immediate result.
2011-01-18 01850, 2011
nikki
it's nothing new for last.fm though
2011-01-18 01832, 2011
kepstin-laptop
It’s even worse with Last.fm on the Japanese tracks because everyone uses different variants of quotes or spaces or double/single-width characters.
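For the width variants kepstin-laptop mentions, Unicode compatibility normalisation is one standard tool. Below is a small Python sketch using the standard library's `unicodedata.normalize`; the sample titles and the helper name `fold_width` are made up for illustration. NFKC only folds compatibility variants such as full-width letters and the ideographic space; it will not, for example, turn corner brackets into quotation marks.

```python
import unicodedata


def fold_width(title: str) -> str:
    """Fold full-width/half-width compatibility variants using NFKC."""
    return unicodedata.normalize("NFKC", title)


# Three spellings of the same (made-up) title with mixed character widths.
variants = ["ＬＯＶＥ！", "LOVE!", "ＬＯＶＥ!"]
folded = {fold_width(v) for v in variants}
```

After folding, the three spellings collapse into a single entry, which is the kind of merging a scrobbling service would need before counting tracks.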
2011-01-18 01833, 2011
adamlogan joined the channel
2011-01-18 01840, 2011
nikki looks at http://www.last.fm/music/三人祭/+tracks
2011-01-18 01841, 2011
adamlogan has left the channel
2011-01-18 01853, 2011
nikki
they only have three songs
2011-01-18 01854, 2011
kepstin-laptop
exactly :)
2011-01-18 01819, 2011
reosarevok
...
2011-01-18 01852, 2011
reosarevok
I guess there is a Japanese name there that isn't showing on my x-chat aqua?
srotta: fwiw, I've never agreed about how this style change was announced/rolled out. It was done in a pretty dick move, but the underlying intention is right IMO.
2011-01-18 01853, 2011
reosarevok
...
2011-01-18 01805, 2011
reosarevok
nikki: wtfffff
2011-01-18 01826, 2011
reosarevok
So all those are just 3 tracks?
2011-01-18 01829, 2011
nikki
yep
2011-01-18 01834, 2011
srotta
Muz: Sure, I don't have anything against using those, if there are tools to "incorrect" the data into something that the rest of the world uses.
2011-01-18 01819, 2011
reosarevok
Mainly the only issue is making Picard simplify it then
2011-01-18 01821, 2011
Muz
Ideally, even with that style change being "approved", it would not have been enforced immediately, to allow people to prepare for the forthcoming change, e.g. an update to Picard that has this included etc.
2011-01-18 01825, 2011
reosarevok
Isn't there a way for it either?
2011-01-18 01840, 2011
reosarevok
*either -> already, wtf
2011-01-18 01859, 2011
srotta
Muz: Not only Picard, but also the data feed users.
2011-01-18 01814, 2011
Muz
Quite.
2011-01-18 01814, 2011
srotta
Ah, Picard was only an example. Yeah 8)
2011-01-18 01829, 2011
_Tsk_ joined the channel
2011-01-18 01822, 2011
andreypopp joined the channel
2011-01-18 01844, 2011
andreypopp
Hello, how is NGS related to Music Ontology?
2011-01-18 01849, 2011
andreypopp
Is it modeled after it? I know about LinkedBrainz, which attempts to provide a mapping from NGS to MO, but I'm interested in NGS specifically.
2011-01-18 01809, 2011
andreypopp
And which metadata model do you think is richer: NGS or Musicbrainz?
2011-01-18 01824, 2011
kepstin-laptop
andreypopp, to be honest, they're not really related at all. The mapping tools will have to be updated to work with NGS.
2011-01-18 01828, 2011
nikki
ngs *is* musicbrainz
2011-01-18 01857, 2011
kepstin-laptop
(I think he wanted to compare NGS and Music Ontology)
2011-01-18 01858, 2011
andreypopp
nikki: yes, I know, sorry I'm not native english speaker
2011-01-18 01807, 2011
andreypopp
kepstin-laptop: yes, exactly
2011-01-18 01851, 2011
nikki
ngs wasn't based on MO, the schema has been in the works for years
2011-01-18 01803, 2011
andreypopp
kepstin-laptop: I see on the MO mailing list that MO 2.0 borrows concepts from NGS and Discogs.
2011-01-18 01829, 2011
kurtjx joined the channel
2011-01-18 01837, 2011
nikki
oh, hi kurtjx!
2011-01-18 01853, 2011
kurtjx
hi nikki :-)
2011-01-18 01816, 2011
nikki
you know more about MO than me, right? :P
2011-01-18 01831, 2011
kepstin-laptop
there's some things in NGS that Music Ontology doesn't have, and vice-versa. For example, NGS has some additional media types, a few additional relationships of various types; but NGS doesn't have anything relating to descriptions of the music itself, like BPM, or the file formats.
2011-01-18 01829, 2011
kepstin-laptop
hmm. looks like MO has some work done on a way to manage classical opus/movement type things.
2011-01-18 01816, 2011
andreypopp
yeah, I still think that NGS is more correct, probably it's because it's battle tested
2011-01-18 01848, 2011
kepstin-laptop
looks like MO has a lot more levels of granularity between the 'Work' and 'Recording' stage.
it looks like the NGS 'Work' concept is more or less the combination of mo:Composition and mo:MusicalWork, and NGS 'Recording's incorporate all of mo:Performance, mo:Sound, mo:Recording, mo:Signal, and mo:Record in one.
2011-01-18 01821, 2011
kepstin-laptop
which was done mostly to simplify the lives of the people who edit the information.
2011-01-18 01858, 2011
andreypopp
Sounds reasonable
2011-01-18 01801, 2011
kepstin-laptop
It doesn't look like MO has anything comparable to NGS's artist credit system, which allows (multiple) artists to be credited using different names on each recording/release.
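A rough Python sketch of the idea behind an artist credit as kepstin-laptop describes it: each release or recording carries an ordered list of artists, each with the name used on that particular release. The class, the field names and the identifiers are assumptions for illustration and are not the actual NGS schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CreditedName:
    artist_id: str         # hypothetical identifier of the artist entity
    credited_as: str       # name as it appears on this recording/release
    join_phrase: str = ""  # text that follows this name, e.g. " & "


def render_credit(names: List[CreditedName]) -> str:
    """Concatenate credited names and join phrases into a display string."""
    return "".join(n.credited_as + n.join_phrase for n in names)


# Two artists credited together on one (example) release.
credit = [
    CreditedName("artist-1", "Queen", " & "),
    CreditedName("artist-2", "David Bowie"),
]
```

Because the credited name is stored per credit rather than per artist, the same artist entity can appear under a different spelling or alias on another release without creating a duplicate artist.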
2011-01-18 01836, 2011
andreypopp
Yeah, I see, thanks for comments.
2011-01-18 01840, 2011
kurtjx
nikki: sorry i swapped screens - do you have a MO question?
2011-01-18 01854, 2011
nikki
no, but andreypopp was asking about it
2011-01-18 01855, 2011
andreypopp
Does NGS's link table allow storing plain attributes on entities, not just associations between pairs of entities? For example, Rovi has attributes and relationships modeled as EAV, but I see NGS only has the link table (which is an EAV-style relationship), and I can't see how to store arbitrary attributes for entities with NGS.
2011-01-18 01851, 2011
andreypopp
kurtjx: I was asking about how MO relates to NGS, but it seems my question is closed for now :-)
2011-01-18 01800, 2011
kurtjx
heh
2011-01-18 01812, 2011
kurtjx
yes that was the topic of my previous job andreypopp