#metabrainz

9:43 AM
alastairp

what I've not seen in this discussion is an evaluation about what a good result is

2019-06-05 15609, 2019

9:43 AM
ruaok

but I think exposing anything is premature. we will have a lot of iterations.

2019-06-05 15616, 2019

9:43 AM
ruaok

but that can be in the cards for the future.

2019-06-05 15618, 2019

9:43 AM
alastairp

regardless of what the process or end output is

2019-06-05 15648, 2019

9:43 AM
ruaok

alastairp: a very good question. I've not thought about that yet.

2019-06-05 15604, 2019

9:44 AM
ruaok

my focus so far has been to build the data sets that allow people to make recommendations.

2019-06-05 15631, 2019

9:44 AM
ruaok

though, I think making a new DB and then allowing people to download dumps of it for our challenge in the fall makes a lot of sense.

2019-06-05 15657, 2019

9:44 AM
ruaok

alastairp: does the industry have a metric for measuring the performance of rec systems?

2019-06-05 15607, 2019

9:45 AM
alastairp

from my point of view, that's really important. since pristine__'s work has "something" working, but I've not seen any structured analysis as to whether the results are actually good

2019-06-05 15613, 2019

9:45 AM
Nyanko-sensei joined the channel

2019-06-05 15620, 2019

9:45 AM
alastairp

other than ruaok saying "well, that _does_ look like something that I'd want to listen to"

2019-06-05 15628, 2019

9:45 AM
alastairp

right, there are 2 broad options

2019-06-05 15631, 2019

9:45 AM
ruaok

correct, I agree.

2019-06-05 15609, 2019

9:46 AM
alastairp

playlist recommendation (e.g. https://recsys-challenge.spotify.com/) evaluates you by withholding part of the playlist, and seeing how many of the items that you recommend are on the withheld part

2019-06-05 15616, 2019
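
[editor's note] The withheld-playlist evaluation alastairp describes could be sketched roughly as below. This is a minimal illustration, not the metric used by the Spotify challenge: the function names, the 25% split, and the fake recommender output are all assumptions made for the example.

```python
import random


def split_playlist(tracks, withheld_fraction=0.25, seed=0):
    """Shuffle a copy of the playlist and split it into (visible, withheld)."""
    rng = random.Random(seed)
    shuffled = list(tracks)
    rng.shuffle(shuffled)
    # Always withhold at least one track so there is something to score against.
    n_withheld = max(1, int(len(shuffled) * withheld_fraction))
    return shuffled[n_withheld:], shuffled[:n_withheld]


def withheld_recall(recommended, withheld):
    """Fraction of the withheld tracks that appear among the recommendations."""
    withheld = set(withheld)
    if not withheld:
        return 0.0
    return len(withheld & set(recommended)) / len(withheld)


# Hypothetical playlist of eight tracks; two of them get withheld.
playlist = ["track_a", "track_b", "track_c", "track_d",
            "track_e", "track_f", "track_g", "track_h"]
visible, withheld = split_playlist(playlist)

# A real recommender would only be shown `visible`. Here its output is faked:
# it finds one of the two withheld tracks and suggests two unknown ones.
recommended = [withheld[0], "track_x", "track_y"]
score = withheld_recall(recommended, withheld)
```

Note that "track_x" and "track_y" score nothing here even if the listener would love them, which is the limitation raised later in the discussion.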

9:46 AM
ruaok

and really the CF stuff doesn't generate things I want to listen to. CF needs more backup/mashup.

2019-06-05 15627, 2019

9:46 AM
alastairp

otherwise you have subjective analysis. give it to someone and ask them how good it is

2019-06-05 15602, 2019

9:47 AM
alastairp

the first one is much easier to evaluate, but you end up basically only recommending people stuff that they already know

2019-06-05 15621, 2019

9:47 AM
alastairp

because there's no other way of knowing that a recommendation out of their known songs is good for them

2019-06-05 15658, 2019

9:47 AM
D4RK-PH0ENiX has quit

2019-06-05 15659, 2019

9:47 AM
alastairp

so, alternatively, do similar to what gentlecat and philip did for their masters projects, generate a playlist, give it to someone, and ask them to thumbs up/down recommendations

2019-06-05 15625, 2019

9:48 AM
ruaok

I really only see the latter as being possible. since we don't have 1M playlists to begin with.

2019-06-05 15628, 2019

9:48 AM
alastairp

(then you have to work out how to fold that feedback into the algorithm too)

2019-06-05 15643, 2019

9:48 AM
alastairp

you don't have playlists, but you have playback history

2019-06-05 15652, 2019

9:48 AM
Nyanko-sensei has quit

2019-06-05 15628, 2019

9:49 AM
D4RK-PH0ENiX joined the channel

2019-06-05 15650, 2019

9:49 AM
ruaok

the CF alg will need to have a candidate dataset to recommend into.

2019-06-05 15612, 2019

9:50 AM
ruaok

which we haven't quite sorted out how to create yet, but have some ideas.

2019-06-05 15653, 2019

9:50 AM
ruaok

but that obviously impacts what gets generated. and may limit the effectiveness of using listens as a way of measuring effectiveness.

2019-06-05 15654, 2019

9:50 AM
alastairp

so you want to build a test playlist? that's not a terrible idea

2019-06-05 15601, 2019

9:51 AM
alastairp

but man, it's going to be so subjective

2019-06-05 15618, 2019

9:51 AM
ruaok

it will be for sure.

2019-06-05 15633, 2019

9:51 AM
ruaok

but I think that anything else is beyond the scope for the summer.

2019-06-05 15645, 2019

9:51 AM
alastairp

ruaok: btw, bulk queries _do_ get slower, but it seems to be the transfer time for larger and larger responses rather than the actual db lookup

2019-06-05 15651, 2019

9:51 AM
alastairp

so your nginx suggestion is good

2019-06-05 15600, 2019

9:52 AM
alastairp

sure, not much time left in the summer for that

2019-06-05 15601, 2019

9:52 AM
ruaok

if we get a page on LB where a user can click "gimme a playlist" and one appears in a reasonable amount of time, I would be happy for the summit.

2019-06-05 15610, 2019

9:52 AM
ruaok

summer.

2019-06-05 15619, 2019

9:52 AM
ruaok

alastairp: great.

2019-06-05 15653, 2019

9:52 AM
ruaok

given how we're evolving all of this, this needs to be part of the roadmap for a challenge in the autumn.

2019-06-05 15604, 2019

9:53 AM
ruaok

but for summer, it is too much.

2019-06-05 15611, 2019

9:55 AM
ruaok

thanks for putting that on the radar, alastairp.

2019-06-05 15641, 2019

9:55 AM
ruaok

iliekcomputers: pristine__: thoughts on this discussion?

2019-06-05 15632, 2019

9:58 AM
iliekcomputers

not so much, evaluation is definitely something we need to work on soon.

2019-06-05 15604, 2019

9:59 AM
iliekcomputers

i'd been thinking of how we could get user feedback (thumbs up/down) into the cf algorithm. i guess it'd involve adding/subtracting values into the listen counts passed into the cf algorithm.

2019-06-05 15636, 2019
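
[editor's note] One way to read iliekcomputers' idea of adding/subtracting values in the listen counts is sketched below. The bonus and penalty weights, the `(user, track)` key layout, and the clamping at zero are assumptions for illustration only; nothing in the discussion fixes them.

```python
def apply_feedback(listen_counts, feedback, bonus=5, penalty=5):
    """Return a new {(user, track): score} dict with feedback folded in.

    `feedback` maps (user, track) to +1 (thumbs up) or -1 (thumbs down).
    Scores are clamped at zero so a thumbs down never goes negative.
    """
    adjusted = dict(listen_counts)  # leave the caller's data untouched
    for key, vote in feedback.items():
        if vote > 0:
            delta = bonus
        elif vote < 0:
            delta = -penalty
        else:
            delta = 0
        adjusted[key] = max(0, adjusted.get(key, 0) + delta)
    return adjusted


# Hypothetical listen counts and feedback for a single user.
listen_counts = {("alice", "track_a"): 12, ("alice", "track_b"): 3}
feedback = {("alice", "track_b"): -1, ("alice", "track_c"): +1}
adjusted = apply_feedback(listen_counts, feedback)
```

The adjusted dict would then be passed to the CF step in place of the raw counts. Whether that beats feeding the votes into a later stage is exactly the open question in the messages that follow.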

9:59 AM
ruaok

not sure if feeding back into CF is all that good to start with.

2019-06-05 15658, 2019

9:59 AM
ruaok

feeding back into the rec alg itself might be better or easier to start with.

2019-06-05 15606, 2019

10:00 AM
ruaok

or adjusting the candidate set.

2019-06-05 15633, 2019

10:01 AM
iliekcomputers

hmm, yeah.

2019-06-05 15633, 2019

10:01 AM
iliekcomputers

but no way of knowing that with no real evaluation so far. getting some recommendations into production with thumbs up / down should be priority for now, i guess.

2019-06-05 15657, 2019

10:02 AM
ruaok

I also feel that if we get to the point where "I can't tell how much this decent recommendation is improving over time" then I'll be quite happy.

2019-06-05 15621, 2019

10:03 AM
ruaok

which of course means that we need to have a more qualitative approach to evaluating recommendations.

2019-06-05 15650, 2019

10:03 AM
reosarevok

You mean giving them to someone with better quality taste than ruaok? Ok, me and zas are available :p

2019-06-05 15658, 2019

10:03 AM
iliekcomputers

to be honest, we can't tell that right now either, really.

2019-06-05 15615, 2019

10:04 AM
ruaok

both of you are right.

2019-06-05 15627, 2019

10:04 AM
ruaok

but I haven't seen anything that made me smile yet.

2019-06-05 15645, 2019

10:04 AM
ruaok

only things that I am convinced that I don't want to listen to.

2019-06-05 15648, 2019

10:04 AM
reosarevok pats ruaok on the head

2019-06-05 15617, 2019

10:05 AM
ruaok

of course, we're also still early in the game.

2019-06-05 15634, 2019

10:05 AM
reosarevok

Very much so

2019-06-05 15607, 2019

10:06 AM
reosarevok

Qualitative evaluation is going to be very hard anyway, because it depends on having a lot of people with different tastes say "this, this is good shit"

2019-06-05 15616, 2019

10:06 AM
ruaok

I guess if we can't please ourselves on a very basic level, then a more quantitative solution will only confirm what we already know.

2019-06-05 15617, 2019

10:06 AM
reosarevok

We barely have a lot of people *submitting* yet :)

2019-06-05 15619, 2019

10:06 AM
ruaok

(read: we suck)

2019-06-05 15634, 2019

10:06 AM
ruaok

yeah, that is another issue that I am grappling with.

2019-06-05 15636, 2019

10:06 AM
reosarevok

Wait

2019-06-05 15637, 2019

10:06 AM
reosarevok

"which of course means that we need to have a more qualitative approach to evaluating recommendations."

2019-06-05 15640, 2019

10:06 AM
reosarevok

Did you mean quantitative?

2019-06-05 15649, 2019

10:06 AM
ruaok

we keep releasing stuff and focusing on the next thing, but we need to work to get more users.

2019-06-05 15609, 2019

10:07 AM
ruaok

qualitative, I guess.

2019-06-05 15624, 2019

10:07 AM
reosarevok

Oh, ok

2019-06-05 15631, 2019

10:07 AM
ruaok

My brain is barely cohesive this morning. feh. jetlag gets worse as one ages.

2019-06-05 15600, 2019

10:08 AM
reosarevok

If you can come up with some half-decent quantitative / programmatic way of knowing if stuff is kinda-sorta improving, that would be great, if only because for a human it is hard to tell, I feel

2019-06-05 15609, 2019

10:08 AM
reosarevok

"Ok, I still hate this shit, but do I hate it LESS?"

2019-06-05 15618, 2019

10:08 AM
ruaok

no arguments from me.

2019-06-05 15653, 2019

10:08 AM
ruaok

still, I'm happy we're facing these issues/questions.

2019-06-05 15657, 2019

10:08 AM
ruaok

clearly a sign of progress.

2019-06-05 15650, 2019

10:09 AM
reosarevok

"Just how shit are we really?" "PROGRESS!"

2019-06-05 15651, 2019

10:09 AM
reosarevok

:D

2019-06-05 15651, 2019

10:09 AM
iliekcomputers

did we come to a conclusion about storing the data?

2019-06-05 15653, 2019

10:09 AM
reosarevok

But yeah, I guess :)

2019-06-05 15602, 2019

10:10 AM
ruaok

iliekcomputers: no

2019-06-05 15609, 2019

10:10 AM
iliekcomputers

😂

2019-06-05 15623, 2019

10:10 AM
reosarevok

iliekcomputers: you're a playground bully :p

2019-06-05 15631, 2019

10:10 AM
reosarevok

You guys have more money than everyone else combined!

2019-06-05 15645, 2019

10:10 AM
ferbncode

iliekcomputers: 😂

2019-06-05 15610, 2019

10:11 AM
reosarevok

You'll still manage to lose to Pakistan somehow anyway, though, so it's ok

2019-06-05 15639, 2019

10:12 AM
alastairp

iliekcomputers: I want to add a constant from somewhere in the code into a sphinx documentation so that it shows up in the api documentation. ever done that?

2019-06-05 15622, 2019

10:13 AM
alastairp

I guess LB does https://listenbrainz.readthedocs.io/en/production…

2019-06-05 15627, 2019

10:16 AM
pristine__

ruaok: hey. Sorry, I am a lil late, didn't know the time of the meeting. Phew.

2019-06-05 15645, 2019

10:16 AM
iliekcomputers

alastairp: i didn't write it but yeah

2019-06-05 15626, 2019

10:17 AM
iliekcomputers

reosarevok: https://github.com/metabrainz/listenbrainz-server…

2019-06-05 15633, 2019

10:17 AM
iliekcomputers

oh sorry, alastairp ^

2019-06-05 15651, 2019

10:17 AM
iliekcomputers

reosarevok: never lost to pakistan in a world cup :D

2019-06-05 15629, 2019

10:18 AM
pac23

but a 29 hour response time is just appalling

2019-06-05 15648, 2019

10:18 AM
pac23

iliekcomputers when is the match ?

2019-06-05 15654, 2019

10:18 AM
alastairp

iliekcomputers: cool, it's possible that one of these auto* methods can include the number directly into the docstring, I'll have a look

2019-06-05 15609, 2019
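
[editor's note] A rough sketch of getting a code constant into a docstring so that it shows up in Sphinx-built API docs. It relies on autodoc importing the module and reading `__doc__` at build time. The constant name, its value, and the function are hypothetical, and this is not necessarily how ListenBrainz or AcousticBrainz do it.

```python
MAX_ITEMS_PER_BULK_QUERY = 25  # hypothetical constant and value


def get_many_items(ids):
    """Return data for several items in one request.

    At most {max_items} items can be requested per call.
    """
    if len(ids) > MAX_ITEMS_PER_BULK_QUERY:
        raise ValueError("too many items requested")
    # Placeholder payload; a real endpoint would look the items up.
    return {item_id: None for item_id in ids}


# Fill in the placeholder once, at import time, so that anything reading
# __doc__ afterwards (including autodoc) sees the actual number.
get_many_items.__doc__ = get_many_items.__doc__.format(
    max_items=MAX_ITEMS_PER_BULK_QUERY
)
```

One caveat: docstrings are stripped when Python runs with `-OO`, in which case `__doc__` is `None` and the `format` call would fail.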

10:20 AM
BrainzGit

[musicbrainz-server] reosarevok merged pull request #1026 (master…MBS-10133): MBS-10133: Clarify "empty query" bad request error https://github.com/metabrainz/musicbrainz-server/…

2019-06-05 15610, 2019

10:20 AM
BrainzBot

MBS-10133: Error message when sending an empty query to the WS is unclear https://tickets.metabrainz.org/browse/MBS-10133

2019-06-05 15615, 2019

10:20 AM
iliekcomputers

ruaok: if we put everything in a different database, it'll be harder to access from LB, no joins etc.

2019-06-05 15615, 2019

10:20 AM
iliekcomputers

pac23: it is going on, SA 34/2 in 10 overs :D

2019-06-05 15601, 2019

10:22 AM
alastairp

not many good umpire emojis

2019-06-05 15606, 2019

10:22 AM
alastairp

\o/

2019-06-05 15608, 2019

10:22 AM
alastairp

\o

2019-06-05 15611, 2019

10:22 AM
alastairp

_o

2019-06-05 15640, 2019

10:22 AM
iliekcomputers

alastairp: nz demolished sri lanka a few days ago

2019-06-05 15641, 2019

10:22 AM
iliekcomputers

🎉

2019-06-05 15657, 2019

10:23 AM
iliekcomputers

alastairp: are you putting the ratelimit values inside the docs from sphinx?

2019-06-05 15631, 2019

10:24 AM
alastairp

I'm looking at the number of items per bulk query

2019-06-05 15647, 2019

10:24 AM
alastairp

ratelimit values would be nice, but those will come from consul now?

2019-06-05 15650, 2019

10:24 AM
iliekcomputers

ah.

2019-06-05 15601, 2019

10:25 AM
alastairp

and so won't be available when docs are built

2019-06-05 15611, 2019

10:27 AM
iliekcomputers

yeah, consul was the problem when i thought of putting it in there yesterday

2019-06-05 15636, 2019

10:27 AM
alastairp

can we set a default, and override it with config if set?

2019-06-05 15641, 2019

10:28 AM
iliekcomputers

that is what i did for now

2019-06-05 15637, 2019
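
[editor's note] The "set a default, and override it with config if set" approach could look something like the sketch below. The setting names and values are made up for the example, and the config is modelled as a plain dict rather than whatever consul actually provides.

```python
# Hypothetical built-in defaults, known when the docs are built.
DEFAULT_RATELIMIT_PER_IP = 30
DEFAULT_RATELIMIT_WINDOW_SECONDS = 10


def get_setting(config, name, default):
    """Return config[name] if it is set, otherwise the built-in default."""
    value = config.get(name)
    return default if value is None else value


# With no config available (e.g. during a docs build) the defaults apply.
docs_build_limit = get_setting({}, "RATELIMIT_PER_IP", DEFAULT_RATELIMIT_PER_IP)

# In production a value from the config takes precedence.
production_config = {"RATELIMIT_PER_IP": 100}
production_limit = get_setting(
    production_config, "RATELIMIT_PER_IP", DEFAULT_RATELIMIT_PER_IP
)
```

The trade-off is that the documented number is the default, which may differ from what a deployment is configured to use.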

10:29 AM
iliekcomputers

https://github.com/metabrainz/acousticbrainz-serv…

2019-06-05 15622, 2019

10:31 AM
alastairp

right, but that would set the limits to the BU defaults, I'm not sure if we want a specific AB defaults too

2019-06-05 15645, 2019

10:43 AM
BrainzGit

[musicbrainz-server] reosarevok merged pull request #1034 (master…MBS-8915): MBS-8915: Allow editors to choose delimiter in track parser https://github.com/metabrainz/musicbrainz-server/…

2019-06-05 15646, 2019

10:43 AM
BrainzBot

MBS-8915: Allow editors to choose delimiter in track parser https://tickets.metabrainz.org/browse/MBS-8915

2019-06-05 15659, 2019

10:47 AM
D4RK-PH0ENiX has quit

2019-06-05 15604, 2019

11:02 AM
D4RK-PH0ENiX joined the channel

2019-06-05 15640, 2019

11:06 AM
D4RK-PH0ENiX has quit

2019-06-05 15640, 2019

11:07 AM
ruaok

> if we put everything in a different database, it'll be harder to access from LB, no joins etc.

2019-06-05 15656, 2019

11:07 AM
ruaok

iliekcomputers: yes, exactly, but then again, what data exists in LB that needs to be joined?

2019-06-05 15604, 2019

11:08 AM
ruaok

the key data really lives in Influx.

2019-06-05 15620, 2019

11:09 AM
iliekcomputers

Select track from user join cf_recommendation on user.id

2019-06-05 15646, 2019

11:09 AM
iliekcomputers

To get user recommendations for a bunch of users

2019-06-05 15613, 2019
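
[editor's note] iliekcomputers' point about joins can be illustrated with a self-contained sketch using an in-memory SQLite database. The table and column names are guesses based on the pseudo-SQL above, not the real ListenBrainz schema, and SQLite stands in for whatever database LB uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both tables live in the same database, so a single query can join them.
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, musicbrainz_id TEXT);
    CREATE TABLE cf_recommendation (
        user_id INTEGER REFERENCES "user"(id),
        track   TEXT
    );
    INSERT INTO "user" VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO cf_recommendation VALUES
        (1, 'track_a'), (1, 'track_b'), (2, 'track_c');
""")


def recommendations_for(conn, user_names):
    """Fetch (user, track) pairs for several users in one joined query."""
    placeholders = ", ".join("?" for _ in user_names)
    query = f"""
        SELECT u.musicbrainz_id, r.track
          FROM "user" u
          JOIN cf_recommendation r ON r.user_id = u.id
         WHERE u.musicbrainz_id IN ({placeholders})
         ORDER BY u.musicbrainz_id, r.track
    """
    return conn.execute(query, list(user_names)).fetchall()


rows = recommendations_for(conn, ["alice", "bob"])
```

If the recommendation tables sat in a separate database, the same result would need two queries and a merge in application code.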

11:10 AM
ruaok

at the same time that limits recommendations to people who have LB accounts.

2019-06-05 15628, 2019

11:10 AM
ruaok

not sure if that is a relevant point.

2019-06-05 15653, 2019

11:10 AM
ruaok

I *think* adding a schema into the LB data is the right course of action for now.

2019-06-05 15606, 2019

11:12 AM
iliekcomputers

That sounds like a reasonable compromise to me for now.

2019-06-05 15618, 2019

11:12 AM
ruaok

what do we call it?

2019-06-05 15638, 2019

11:12 AM
ruaok

recommendation? recsys? (which is what the industry calls all this. not a fan, really).

2019-06-05 15602, 2019

11:14 AM
iliekcomputers

Recommendation

2019-06-05 15648, 2019

11:14 AM
Mr_Monkey

The WhyNot? Machine

2019-06-05 15653, 2019

11:14 AM
ruaok

recommendation.{track_track_relations|artist_artist_relations|cf_user_recommendation} ?

2019-06-05 15617, 2019

11:15 AM
ruaok

actually singhular on the first two.

2019-06-05 15650, 2019

11:15 AM
ruaok can't spel

2019-06-05 15643, 2019

11:16 AM
CatQuest

that's okaye

2019-06-05 15612, 2019

11:17 AM
ruaok

not a very catty comment from you, CatQuest...

2019-06-05 15649, 2019

11:17 AM
ruaok

hmmm. $17k invoices to send. delicious.

2019-06-05 15657, 2019

11:17 AM
CatQuest

anyway. ruaok, I wanted to ask you a slightly off-topic question. how hard is it really to register, own and maintain a *.cat domain? (seeing as you live in Barcelona now I thought you would know. don't you also have a *.cat website now?)

2019-06-05 15614, 2019

11:18 AM
ruaok

easy in the grand scheme of things.

2019-06-05 15628, 2019

11:18 AM
ruaok

there is one caveat -- there needs to be some Catalan content on the page.

2019-06-05 15632, 2019

11:18 AM
CatQuest

exactly

2019-06-05 15639, 2019

11:18 AM
CatQuest

but like, how strict are they?

2019-06-05 15655, 2019

11:18 AM
ruaok

my mayhem.cat page has no text, except for "Benvinguts". so, welcome in Catalan. No one has ever come complaining.

2019-06-05 15601, 2019

11:19 AM
CatQuest

if I translated reosarevok's "nokkloom" page into Estonian, will it suffice?

2019-06-05 15610, 2019

11:19 AM
ruaok

not sure, really.

2019-06-05 15610, 2019

11:19 AM
CatQuest

hmmmmmm

2019-06-05 15619, 2019

11:19 AM
ruaok

like I said, I have almost no text on my site.