luks: still alive? got another python-quiz for you :p
luks
yep, still here :)
canidae
need a workaround for this error "UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-12: ordinal not in range(128)"
apparently i'm getting some string with weird characters
luks
use unicode objects
yllona
canidae: so you're on a mission to: 1) keep everyone young; 2) give everyone a few grey hairs :)
luks
canidae: python uses only ascii for implicit conversions between str and unicode
canidae
yllona: i'm not quite sure what my mission is, although i suspect i won't have any hair left on my head if i don't get this working by the end of the week :p
luks
so you have to call e.g. something.encode('utf-8')
yllona
canidae: :)
yay! luks to the rescue ;)
canidae
luks: so... you're suggesting i do a "text.encode('utf-8')" then?
luks
what exactly are you doing?
canidae
well :)
luks
if you need to pass the string to some external library which expects UTF-8 then yes
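
A minimal sketch of the explicit encoding step discussed above. It is written for Python 3, where `str` corresponds to the `unicode` objects mentioned in this conversation and `bytes` corresponds to the old `str`. The artist name is a made-up example value, not something taken from the index.

```python
# Hypothetical example value containing one non-ASCII character.
text = "Mot\u00f6rhead"

# Encoding to ASCII fails: the character has no ASCII representation.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 yields bytes that a UTF-8 consumer can accept.
utf8_bytes = text.encode("utf-8")

# Decoding with the same codec restores the original text.
round_trip = utf8_bytes.decode("utf-8")
```

The point is that the conversion is spelled out with a named codec instead of being left to an implicit ASCII conversion.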
canidae
right
*breathe*
ok, so i got pylucene up & running after a lot of pain, i've made a cute index just below 2gb and i'm fetching fields from a document
it works well for most, but at least one artist didn't like when i tried to "print doc.get('artist')"
not quite sure what more i can say, it borks when i try to do that print :)
luks
print uses the encoding from your locale
unless you redirect the output to a file
then it uses ascii
canidae
it's a cgi-script, so i guess it outputs to apache :p
luks
yeah, then you probably need to encode it explicitly
okay, i'm off to bed
good night
or good morning :)
luks has quit
canidae
well that's the oddest...
oh wait, it isn't :)
이동준
no wonder it borked if it tried to display that as ascii
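
A rough sketch of what "encode it explicitly" could look like for CGI output, assuming Python 3 and a plain-text UTF-8 response. The function name `write_cgi_response` is hypothetical, and an in-memory buffer stands in for the stream connected to the web server.

```python
import io


def write_cgi_response(body, stream):
    """Write a minimal CGI response to a binary stream, encoding the body as UTF-8."""
    payload = body.encode("utf-8")
    # Declare the charset so the client decodes the bytes the same way.
    stream.write(b"Content-Type: text/plain; charset=utf-8\r\n\r\n")
    stream.write(payload)


# In a real CGI script the stream would typically be sys.stdout.buffer;
# a BytesIO buffer is used here so the result can be inspected.
buffer = io.BytesIO()
write_cgi_response("이동준", buffer)
output = buffer.getvalue()
```

Writing bytes directly avoids depending on whatever encoding the output stream happens to default to.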
tro has quit
tro joined the channel
CarlFK joined the channel
yllona has quit
Yurim joined the channel
Yurim
Hi folks!
Amblin- joined the channel
Amblin has quit
hawke has quit
toxickore joined the channel
hawke joined the channel
Yurim has quit
rpedro has quit
rpedro joined the channel
SoothingR joined the channel
MrQwerty joined the channel
toxickore has quit
FauxFaux has quit
FauxFaux joined the channel
rpedro has quit
rpedro joined the channel
dseomn has quit
sidd_ has quit
Aankhen`` has quit
yalaforge joined the channel
Freso joined the channel
luks joined the channel
luks has quit
luks joined the channel
sidd joined the channel
csp joined the channel
csp
Am I supposed to leave in spelling mistakes in track titles?
FauxFaux
I believe it's quoted as the artists' choice, check the covers?
csp
I do have the original CD, but I'm unsure if it's the artist's (Jim Croce) choice to name it "It does'nt have to be that way"
(while I'm not a native english speaker, I believe it should be "doesn't")
FauxFaux
Yeah, that looks horrible. :p
csp
I guess it's a stupid compilation.. but it is actually printed that way
I think I'll try to make it correct english. Other compilations also use the correct title
Synchro joined the channel
Rondom joined the channel
deadchip
i think it should become a separately listed track with the annotation of being an extraordinary curiosity
csp
I could add an annotation to the release mentioning the error, but should the title of the track be the original or the corrected one then?
yalaforge
if other releases use the correct title, then it's no artist intent and it should be corrected, IMHO
intgr
Freso: Ping?
Freso
intgr: Pong!
HairMetalAddict
Album cover mistakes are regularly corrected. We want correct info, not the cover designer's inability to spell...
intgr
I'm at work (= on Windows) and I'd like to bump Picard to 0.7.1 in my overlay. Could you test it for me?
Now this is odd. I could've sworn I saw a 0.7.1 release announcement on the blog this morning... Or was I still dreaming?
yalaforge
it was in mb-users, I think
the reason why it's not yet on the ftp server could be a permission problem
yalaforge doesn't have upload privs for his packages either
luks
actually, no. uploading picard to ftp.mb.org was a temporary solution
i'm not sure if i should upload it also there
intgr
Helixcommunity sucks.
luks
yeah, but it has nice stats :)
wait a sec, i'll upload it
intgr
Thanks. :)
I wouldn't mind Helixcommunity if one wouldn't have to update that download ID with every new release.
deadchip
csp: no it was just a joke
csp: i wouldn't bother with anything, except maybe making a note or something that on the printed cover of this particular CD the title was spelled wrong
csp: so people who own the exact same CD are not confused (they'll probably think that it's just a typo as well but this will just let them know they're not the only ones)
csp
deadchip: sorry the joke didn't work. I'm leaving it corrected now. The collection (out since 1989) isn't popular anyway, so why bother
in that screenshot it matched the titles using LD and some additional code of course
intgr
Are you using standard weights?
deadchip
i don't know much about the specific LD calculation implementation i've used
i'd have to check heh
intgr
Differences in punctuation and higher/lower case should probably have lower weights.
But what kind of matching algorithm does Picard currently use? I thought it was Levenshtein.
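
One way to picture the "lower weights" idea is a Levenshtein variant with per-operation costs. This is only an illustrative sketch: the 0.2 weight and the function names are assumptions, and it is neither Picard's code nor the implementation behind the screenshot.

```python
import string

CHEAP = 0.2  # assumed weight for case-only and punctuation edits
FULL = 1.0   # weight for every other edit


def indel_cost(ch):
    """Cost of inserting or deleting a single character."""
    return CHEAP if ch in string.punctuation else FULL


def substitution_cost(a, b):
    """Cost of replacing character a with character b."""
    if a == b:
        return 0.0
    if a.lower() == b.lower():
        return CHEAP  # differs only in case
    if a in string.punctuation and b in string.punctuation:
        return CHEAP  # one punctuation mark swapped for another
    return FULL


def weighted_levenshtein(s, t):
    """Edit distance between s and t using the costs defined above."""
    # prev[j] holds the cost of turning the processed prefix of s into t[:j].
    prev = [0.0]
    for ch in t:
        prev.append(prev[-1] + indel_cost(ch))
    for a in s:
        cur = [prev[0] + indel_cost(a)]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + indel_cost(a),                 # delete a
                cur[j - 1] + indel_cost(b),              # insert b
                prev[j - 1] + substitution_cost(a, b),   # substitute
            ))
        prev = cur
    return prev[-1]
```

With these weights, a pair of titles that differ only in an apostrophe or in capitalisation ends up much closer than a pair that differs in a letter.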
luks
levenshtein on normalized strings
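
A small sketch of what "levenshtein on normalized strings" could mean in practice. The normalization rules here (Unicode NFKC, lowercasing, dropping punctuation, collapsing whitespace) are assumptions for illustration, not Picard's actual rules, and the function names are hypothetical.

```python
import unicodedata


def normalize(title):
    """Assumed normalization: NFKC, lowercase, no punctuation, single spaces."""
    title = unicodedata.normalize("NFKC", title).lower()
    kept = "".join(ch for ch in title if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def levenshtein(s, t):
    """Classic unit-cost edit distance, computed row by row."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        cur = [i]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + 1,              # delete a
                cur[j - 1] + 1,           # insert b
                prev[j - 1] + (a != b),   # substitute, free on a match
            ))
        prev = cur
    return prev[-1]


def title_distance(a, b):
    """Edit distance between two titles after normalization."""
    return levenshtein(normalize(a), normalize(b))
```

Under these assumed rules, the misprinted title mentioned earlier in the log and its corrected spelling normalize to the same string, so their distance is zero.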
deadchip
oh i didn't know you have it already in
intgr
Freso: So, would you be OK with testing 0.7.1?
deadchip
but either way, i've tried to do various preprocessing to the strings before calculating the ld and it turned out leaving them like they are seems to be best
i'm not even normalizing them in any way (lowercasing or anything)
Freso
intgr: Sure. :)
Rondom has quit
intgr: But please prepend all messages directed to me with my nick so that my highlight goes off. :)
intgr
Freso: Ah, ok. :)
Freso: I suppose you can manage the version-bumping yourself?
Freso
intgr: Unmasking/-keywording or renaming the ebuild?
The former? Sure. The latter? Sure - but I'd prefer you to do it. :)