what I've not seen in this discussion is an evaluation about what a good result is
ruaok
but I think exposing anything is premature. we will have a lot of iterations.
but that can be in the cards for the future.
alastairp
regardless of what the process or end output is
ruaok
alastairp: a very good question. I've not thought about that yet.
my focus so far has been to build the data sets that allow people to make recommendations.
though, I think making a new DB and then allowing people to download dumps of it for our challenge in the fall makes a lot of sense.
alastairp: does the industry have a metric for measuring the performance of rec systems?
alastairp
from my point of view, that's really important. since pristine__'s work has "something" working, but I've not seen any structured analysis as to whether the results are actually good
Nyanko-sensei joined the channel
other than ruaok saying "well, that _does_ look like something that I'd want to listen to"
right, there are 2 broad options
ruaok
correct, I agree.
alastairp
playlist recommendation (e.g. https://recsys-challenge.spotify.com/) evaluates you by withholding part of the playlist, and seeing how many of the items that you recommend are on the withheld part
ruaok
and really the CF stuff doesn't generate things I want to listen to. CF needs more backup/mashup.
alastairp
otherwise you have subjective analysis. give it to someone and ask them how good it is
the first one is much easier to evaluate, but you end up basically only recommending people stuff that they already know
because there's no other way of knowing that a recommendation out of their known songs is good for them
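A minimal sketch of the withheld-items evaluation described above, assuming a playlist is just a list of track identifiers. The function names, the 30% withheld fraction, and the cutoff `k` are hypothetical choices for illustration, not anything taken from the challenge or from ListenBrainz code.

```python
import random


def split_playlist(tracks, withheld_fraction=0.3, seed=0):
    """Shuffle a playlist and split it into (visible, withheld) parts.

    The fraction and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(tracks)
    rng.shuffle(shuffled)
    n_withheld = max(1, int(len(shuffled) * withheld_fraction))
    return shuffled[n_withheld:], shuffled[:n_withheld]


def precision_recall_at_k(recommended, withheld, k=10):
    """Count how many of the top-k recommendations are in the withheld part."""
    withheld_set = set(withheld)
    top_k = recommended[:k]
    hits = sum(1 for track in top_k if track in withheld_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(withheld_set) if withheld_set else 0.0
    return precision, recall
```

Note that a recommendation only counts as a hit if the user already had it in the playlist, which is exactly the limitation raised above: a good but unfamiliar track scores the same as a bad one.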
D4RK-PH0ENiX has quit
so, alternatively, do something similar to what gentlecat and philip did for their master's projects: generate a playlist, give it to someone, and ask them to thumbs up/down recommendations
ruaok
I really only see the latter as being possible. since we don't have 1M playlists to begin with.
alastairp
(then you have to work out how to fold that feedback into the algorithm too)
you don't have playlists, but you have playback history
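One way playback history could stand in for playlists, sketched under assumptions: listens are `(user, track, timestamp)` tuples, and everything at or after a chosen cutoff is withheld. The tuple layout, the function name, and the cutoff are hypothetical and do not reflect the actual listen schema.

```python
from collections import defaultdict


def temporal_split(listens, cutoff):
    """Split listens into per-user track sets before and after a cutoff.

    listens: iterable of (user, track, timestamp) tuples (assumed layout).
    Returns (train, test), each a dict mapping user -> set of tracks.
    """
    train = defaultdict(set)
    test = defaultdict(set)
    for user, track, timestamp in listens:
        target = train if timestamp < cutoff else test
        target[user].add(track)
    return dict(train), dict(test)
```

The later listens then play the role of the withheld part of a playlist: recommendations are generated from the earlier listens and checked against what the user went on to play.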
Nyanko-sensei has quit
D4RK-PH0ENiX joined the channel
ruaok
the CF alg will need to have a candidate dataset to recommend into.
which we haven't quite sorted out how to create yet, but have some ideas.
but that obviously impacts what gets generated. and may limit the effectiveness of using listens as a way of measuring effectiveness.
alastairp
so you want to build a test playlist? that's not a terrible idea
but man, it's going to be so subjective
ruaok
it will be for sure.
but I think that anything else is beyond the scope for the summer.
alastairp
ruaok: btw, bulk queries _do_ get slower, but it seems to be the transfer time for larger and larger responses rather than the actual db lookup
so your nginx suggestion is good
sure, not much time left in the summer for that
ruaok
if we get a page on LB where a user can click "gimme a playlist" and one appears in a reasonable amount of time, I would be happy for the summit.
summer.
alastairp: great.
given how we're evolving all of this, this needs to be part of the roadmap for a challenge in the autumn.
but for summer, it is too much.
thanks for putting that on the radar, alastairp.
iliekcomputers: pristine__: thoughts on this discussion?
iliekcomputers
not so much, evaluation is definitely something we need to work on soon.
i'd been thinking of how we could get user feedback (thumbs up/down) into the cf algorithm. i guess it'd involve adding/subtracting values into the listen counts passed into the cf algorithm.
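A rough sketch of that idea, assuming listen counts are kept as a mapping from `(user, track)` to a count and feedback arrives as `(user, track, vote)` with `vote` being +1 or -1. The `boost` and `penalty` weights and the clamping at zero are made-up assumptions, not tuned values or existing ListenBrainz behaviour.

```python
def apply_feedback(listen_counts, feedback, boost=5, penalty=5):
    """Return a copy of listen_counts adjusted by thumbs up/down votes.

    listen_counts: dict mapping (user, track) -> listen count.
    feedback: iterable of (user, track, vote), vote > 0 for thumbs up.
    boost and penalty are hypothetical weights.
    """
    adjusted = dict(listen_counts)
    for user, track, vote in feedback:
        current = adjusted.get((user, track), 0)
        delta = boost if vote > 0 else -penalty
        # Clamp at zero (an assumption) so counts never go negative.
        adjusted[(user, track)] = max(0, current + delta)
    return adjusted
```

The adjusted counts would then be passed to the CF step in place of the raw ones. Whether that helps is an open question until there is some way of evaluating the output.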
ruaok
not sure if feeding back into CF is all that good to start with.
feeding back into the rec alg itself might be better or easier to start with.
or adjusting the candidate set.
iliekcomputers
hmm, yeah.
but no way of knowing that with no real evaluation so far. getting some recommendations into production with thumbs up / down should be priority for now, i guess.
ruaok
I also feel that if we get to the point where "I can't tell how much this decent recommendation is improving over time" then I'll be quite happy.
which of course means that we need to have a more qualitative approach to evaluating recommendations.
reosarevok
You mean giving them to someone with better quality taste than ruaok? Ok, me and zas are available :p
iliekcomputers
to be honest, we can't tell that right now either, really.
ruaok
both of you are right.
but I haven't seen anything that made me smile yet.
only things that I am convinced that I don't want to listen to.
reosarevok pats ruaok on the head
of course, we're also still early in the game.
reosarevok
Very much so
Qualitative evaluation is going to be very hard anyway, because it depends on having a lot of people with different tastes say "this, this is good shit"
ruaok
I guess if we can't please ourselves on a very basic level, then a more quantitative solution will only confirm what we already know.
reosarevok
We barely have a lot of people *submitting* yet :)
ruaok
(read: we suck)
yeah, that is another issue that I am grappling with.
reosarevok
Wait
"which of course means that we need to have a more qualitative approach to evaluating recommendations."
Did you mean quantitative?
ruaok
we keep releasing stuff and focusing on the next thing, but we need to work to get more users.
qualitative, I guess.
reosarevok
Oh, ok
ruaok
My brain is barely cohesive this morning. feh. jetlag gets worse as one ages.
reosarevok
If you can come up with some half-decent quantitative / programmatic way of knowing if stuff is kinda-sorta improving, that would be great, if only because for a human it's hard to tell, I feel
"Ok, I still hate this shit, but do I hate it LESS?"
ruaok
no arguments from me.
still, I'm happy we're facing these issues/questions.
clearly a sign of progress.
reosarevok
"Just how shit are we really?" "PROGRESS!"
:D
iliekcomputers
did we come to a conclusion about storing the data?
reosarevok
But yeah, I guess :)
ruaok
iliekcomputers: no
iliekcomputers
😂
reosarevok
iliekcomputers: you're a playground bully :p
You guys have more money than everyone else combined!
ferbncode
iliekcomputers: 😂
reosarevok
You'll still manage to lose to Pakistan somehow anyway, though, so it's ok
alastairp
iliekcomputers: I want to add a constant from somewhere in the code into the sphinx documentation so that it shows up in the api documentation. ever done that?
anyway. ruaok, I wanted to ask you a slightly off-topic question. how hard is it really to register, own, and maintain a *.cat domain? (seeing as you live in Barcelona now I thought you would know. don't you also have a *.cat website now?)
ruaok
easy in the grand scheme of things.
there is one caveat -- there needs to be some catalan content on the page.
CatQuest
exactly
but like, how strict are they?
ruaok
my mayhem.cat page has no text, except for "Benvinguts". so, welcome in Catalan. No one has ever come complaining.
CatQuest
if I translated reosarevok's "nokkloom" page into Estonian, will it suffice?