#metabrainz

      • alastairp
        what I've not seen in this discussion is an evaluation about what a good result is
      • ruaok
        but I think exposing anything is premature. we will have a lot of iterations.
      • but that can be in the cards for the future.
      • alastairp
        regardless of what the process or end output is
      • ruaok
        alastairp: a very good question. I've not thought about that yet.
      • my focus so far has been to build the data sets that allow people to make recommendations.
      • though, I think making a new DB and then allowing people to download dumps of it for our challenge in the fall makes a lot of sense.
      • alastairp: does the industry have a metric for measuring the performance of rec systems?
      • alastairp
        from my point of view, that's really important. since pristine__'s work has "something" working, but I've not seen any structured analysis as to whether the results are actually good
      • Nyanko-sensei joined the channel
      • other than ruaok saying "well, that _does_ look like something that I'd want to listen to"
      • right, there are 2 broad options
      • ruaok
        correct, I agree.
      • alastairp
        playlist recommendation (e.g. https://recsys-challenge.spotify.com/) evaluates you by withholding part of the playlist, and seeing how many of the items that you recommend are on the withheld part
      • ruaok
        and really the CF stuff doesn't generate things I want to listen to. CF needs more backup/mashup.
      • alastairp
        otherwise you have subjective analysis. give it to someone and ask them how good it is
      • the first one is much easier to evaluate, but you end up basically only recommending people stuff that they already know
      • because there's no other way of knowing that a recommendation out of their known songs is good for them
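A minimal sketch of the withheld-items evaluation alastairp describes above, assuming each user's playlist (or listening history) is a plain list of track identifiers. The function names `split_holdout` and `recall_at_k`, the 25% holdout fraction, and the cutoff `k` are illustrative assumptions, not anything the project has settled on.

```python
import random


def split_holdout(tracks, holdout_fraction=0.25, seed=0):
    """Split a track list into a visible part and a withheld part.

    The fraction and the seed are arbitrary illustrative defaults.
    """
    rng = random.Random(seed)
    shuffled = list(tracks)
    rng.shuffle(shuffled)
    n_held = max(1, int(len(shuffled) * holdout_fraction))
    # First n_held shuffled items are withheld, the rest stay visible.
    return shuffled[n_held:], shuffled[:n_held]


def recall_at_k(recommended, withheld, k=10):
    """Fraction of withheld tracks that appear in the top-k recommendations."""
    withheld_set = set(withheld)
    if not withheld_set:
        return 0.0
    hits = sum(1 for track in recommended[:k] if track in withheld_set)
    return hits / len(withheld_set)
```

A recommender would be given only the visible part and scored with `recall_at_k` against the withheld part. As noted in the discussion, this can only reward recommending tracks the user already knows.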
      • D4RK-PH0ENiX has quit
      • so, alternatively, do similar to what gentlecat and philip did for their masters projects, generate a playlist, give it to someone, and ask them to thumbs up/down recommendations
      • ruaok
        I really only see the latter as being possible. since we don't have 1M playlists to begin with.
      • alastairp
        (then you have to work out how to fold that feedback into the algorithm too)
      • you don't have playlists, but you have playback history
      • Nyanko-sensei has quit
      • D4RK-PH0ENiX joined the channel
      • ruaok
        the CF alg will need to have a candidate dataset to recommend into.
      • which we haven't quite sorted out how to create yet, but have some ideas.
      • but that obviously impacts what gets generated. and may limit the effectiveness of using listens as a way of measuring effectiveness.
      • alastairp
        so you want to build a test playlist? that's not a terrible idea
      • but man, it's going to be so subjective
      • ruaok
        it will be for sure.
      • but I think that anything else is beyond the scope for the summer.
      • alastairp
        ruaok: btw, bulk queries _do_ get slower, but it seems to be the transfer time for larger and larger responses rather than the actual db lookup
      • so your nginx suggestion is good
      • sure, not much time left in the summer for that
      • ruaok
        if we get a page on LB where a user can click "gimme a playlist" and one appears in a reasonable amount of time, I would be happy for the summit.
      • summer.
      • alastairp: great.
      • given how we're evolving all of this, this needs to be part of the roadmap for a challenge in the autumn.
      • but for summer, it is too much.
      • thanks for putting that on the radar, alastairp.
      • iliekcomputers: pristine__: thoughts on this discussion?
      • iliekcomputers
        not so much, evaluation is definitely something we need to work on soon.
      • i'd been thinking of how we could get user feedback (thumbs up/down) into the cf algorithm. i guess it'd involve adding/subtracting values into the listen counts passed into the cf algorithm.
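One way to read iliekcomputers' idea, sketched under assumptions: listen counts are keyed by a `(user, recording)` pair, each thumbs up/down is a vote of `+1` or `-1`, and a vote shifts the count by a fixed `weight`. The function name `apply_feedback`, the weight of 5, and the clamp at zero are all hypothetical choices for illustration.

```python
def apply_feedback(listen_counts, feedback, weight=5):
    """Return new listen counts with thumbs up/down votes folded in.

    listen_counts: dict mapping (user, recording) -> listen count
    feedback:      dict mapping (user, recording) -> +1 (up) or -1 (down)
    weight:        how many listens one vote is worth (an arbitrary guess)
    """
    adjusted = dict(listen_counts)  # leave the input untouched
    for key, vote in feedback.items():
        # Clamp at zero so a thumbs-down never produces a negative count.
        adjusted[key] = max(0, adjusted.get(key, 0) + vote * weight)
    return adjusted
```

The adjusted counts would then be passed to the CF algorithm in place of the raw ones. Whether that beats adjusting the candidate set is exactly the open question in this discussion.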
      • ruaok
        not sure if feeding back into CF is all that good to start with.
      • feeding back into the rec alg itself might be better or easier to start with.
      • or adjusting the candidate set.
      • iliekcomputers
        hmm, yeah.
      • but no way of knowing that with no real evaluation so far. getting some recommendations into production with thumbs up / down should be priority for now, i guess.
      • ruaok
        I also feel that if we get to the point where "I can't tell how much this decent recommendation is improving over time" then I'll be quite happy.
      • which of course means that we need to have a more qualitative approach to evaluating recommendations.
      • reosarevok
        You mean giving them to someone with better quality taste than ruaok? Ok, me and zas are available :p
      • iliekcomputers
        to be honest, we can't tell that right now either, really.
      • ruaok
        both of you are right.
      • but I haven't seen anything that made me smile yet.
      • only things that I am convinced that I don't want to listen to.
      • reosarevok pats ruaok on the head
      • of course, we're also still early in the game.
      • reosarevok
        Very much so
      • Qualitative evaluation is going to be very hard anyway, because it depends on having a lot of people with different tastes say "this, this is good shit"
      • ruaok
        I guess if we can't please ourselves on a very basic level, then a more quantitative solution will only confirm what we already know.
      • reosarevok
        We barely have a lot of people *submitting* yet :)
      • ruaok
        (read: we suck)
      • yeah, that is another issue that I am grappling with.
      • reosarevok
        Wait
      • "which of course means that we need to have a more qualitative approach to evaluating recommendations."
      • Did you mean quantitative?
      • ruaok
        we keep releasing stuff and focusing on the next thing, but we need to work to get more users.
      • qualitative, I guess.
      • reosarevok
        Oh, ok
      • ruaok
        My brain is barely cohesive this morning. feh. jetlag gets worse as one ages.
      • reosarevok
        If you can come up with some half-decent quantitative / programmatic way of knowing if stuff is kinda-sorta improving, that would be great, if only because for a human it is hard to tell, I feel
      • "Ok, I still hate this shit, but do I hate it LESS?"
      • ruaok
        no arguments from me.
      • still, I'm happy we're facing these issues/questions.
      • clearly a sign of progress.
      • reosarevok
        "Just how shit are we really?" "PROGRESS!"
      • :D
      • iliekcomputers
        did we come to a conclusion about storing the data?
      • reosarevok
        But yeah, I guess :)
      • ruaok
        iliekcomputers: no
      • iliekcomputers
        😂
      • reosarevok
        iliekcomputers: you're a playground bully :p
      • You guys have more money than everyone else combined!
      • ferbncode
        iliekcomputers: 😂
      • reosarevok
        You'll still manage to lose to Pakistan somehow anyway, though, so it's ok
      • alastairp
        iliekcomputers: I want to add a constant from somewhere in the code into the sphinx documentation so that it shows up in the api documentation. ever done that?
      • pristine__
        ruaok: hey. Sorry, I am a lil late, didn't know the time of the meeting. Phew.
      • iliekcomputers
        alastairp: i didn't write it but yeah
      • oh sorry, alastairp ^
      • reosarevok: never lost to pakistan in a world cup :D
      • pac23
        but 29 hour response time is just appalling
      • iliekcomputers when is the match ?
      • alastairp
        iliekcomputers: cool, it's possible that one of these auto* methods can include the number directly into the docstring, I'll have a look
      • BrainzGit
        [musicbrainz-server] reosarevok merged pull request #1026 (master…MBS-10133): MBS-10133: Clarify "empty query" bad request error https://github.com/metabrainz/musicbrainz-serve...
      • BrainzBot
        MBS-10133: Error message when sending an empty query to the WS is unclear https://tickets.metabrainz.org/browse/MBS-10133
      • iliekcomputers
        ruaok: if we put everything in a different database, it'll be harder to access from LB, no joins etc.
      • pac23: it is going on, SA 34/2 in 10 overs :D
      • alastairp
        not many good umpire emojis
      • \o/
      • \o
      • _o
      • iliekcomputers
        alastairp: nz demolished sri lanka a few days ago
      • 🎉
      • alastairp: are you putting the ratelimit values inside the docs from sphinx?
      • alastairp
        I'm looking at the number of items per bulk query
      • ratelimit values would be nice, but those will come from consul now?
      • iliekcomputers
        ah.
      • alastairp
        and so won't be available when docs are built
      • iliekcomputers
        yeah, consul was the problem when i thought of putting it in there yesterday
      • alastairp
        can we set a default, and override it with config if set?
      • iliekcomputers
        that is what i did for now
      • alastairp
        right, but that would set the limits to the BU defaults, I'm not sure if we want specific AB defaults too
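The "default, overridden by config if set" pattern from this exchange could look roughly like the sketch below, with an optional application-level default layered between the library default and the config value. The names `LIBRARY_DEFAULT_LIMIT` and `resolve_limit`, the value 100, and the use of a plain dict for config are assumptions made for illustration. None of them come from the actual codebase or from consul.

```python
# Hypothetical library-wide default. Being a plain module constant, it is
# available at import time even when no runtime config store is reachable.
LIBRARY_DEFAULT_LIMIT = 100


def resolve_limit(config, key, app_default=None,
                  library_default=LIBRARY_DEFAULT_LIMIT):
    """Pick a limit: config value first, then app default, then library default."""
    value = config.get(key)
    if value is not None:
        return int(value)  # config values may arrive as strings
    if app_default is not None:
        return app_default
    return library_default
```

For example, `resolve_limit({}, "RATELIMIT_PER_IP", app_default=50)` falls back to the application default, while a value present in `config` always wins.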
      • BrainzGit
        [musicbrainz-server] reosarevok merged pull request #1034 (master…MBS-8915): MBS-8915: Allow editors to choose delimiter in track parser https://github.com/metabrainz/musicbrainz-serve...
      • BrainzBot
        MBS-8915: Allow editors to choose delimiter in track parser https://tickets.metabrainz.org/browse/MBS-8915
      • D4RK-PH0ENiX has quit
      • D4RK-PH0ENiX joined the channel
      • D4RK-PH0ENiX has quit
      • ruaok
        > if we put everything in a different database, it'll be harder to access from LB, no joins etc.
      • iliekcomputers: yes, exactly, but then again, what data exists in LB that needs to be joined?
      • the key data really lives in Influx.
      • iliekcomputers
        Select track from user join cf_recommendation on user.id
      • To get user recommendations for a bunch of users
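A small sketch of the kind of join iliekcomputers has in mind, using an in-memory SQLite database so that it runs standalone. The column names (`id`, `name`, `user_id`, `track`) and the helper `recommendations_for_users` are hypothetical; only the table names `user` and `cf_recommendation` come from the message above, and the real LB schema may differ.

```python
import sqlite3


def recommendations_for_users(conn, user_names):
    """Fetch (user name, track) rows for several users with a single join."""
    user_names = list(user_names)
    if not user_names:
        return []
    placeholders = ",".join("?" for _ in user_names)
    query = f"""
        SELECT u.name, r.track
          FROM "user" AS u
          JOIN cf_recommendation AS r ON r.user_id = u.id
         WHERE u.name IN ({placeholders})
         ORDER BY u.name, r.track
    """
    return conn.execute(query, user_names).fetchall()


def demo_connection():
    """Build a throwaway database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT)')
    conn.execute("CREATE TABLE cf_recommendation (user_id INTEGER, track TEXT)")
    conn.executemany('INSERT INTO "user" VALUES (?, ?)',
                     [(1, "alice"), (2, "bob")])
    conn.executemany("INSERT INTO cf_recommendation VALUES (?, ?)",
                     [(1, "t1"), (1, "t2"), (2, "t3")])
    return conn
```

A join like this only works when both tables live in the same database, which is the trade-off being weighed here.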
      • ruaok
        at the same time that limits recommendations to people who have LB accounts.
      • not sure if that is a relevant point.
      • I *think* adding a schema into the LB data is the right course of action for now.
      • iliekcomputers
        That sounds like a reasonable compromise to me for now.
      • ruaok
        what do we call it?
      • recommendation? recsys? (which is what the industry calls all this. not a fan, really).
      • iliekcomputers
        Recommendation
      • Mr_Monkey
        The WhyNot? Machine
      • ruaok
        recommendation.{track_track_relations|artist_artist_relations|cf_user_recommendation} ?
      • actually singhular on the first two.
      • ruaok can't spel
      • CatQuest
        that's okaye
      • ruaok
        not a very catty comment from you, CatQuest...
      • hmmm. $17k invoices to send. delicious.
      • CatQuest
        anyway. ruaok I wanted to ask you a slight off topic question. how hard is it really to register and own and maintain a *.cat domain (seeing as you live in barcelona now i thought you would know, don't you also have a *.cat website now?)
      • ruaok
        easy in the grand scheme of things.
      • there is one caveat -- there needs to be some catalan content on the page.
      • CatQuest
        exactly
      • but like, how strict are they?
      • ruaok
        my mayhem.cat page has no text, except for "Benvinguts". so, welcome in Catalan. No one has ever come complaining.
      • CatQuest
        if I translated reosarevok's "nokkloom" page into estonian, will it suffice
      • ruaok
        not sure, really.
      • CatQuest
        hmmmmmm
      • ruaok
        like I said, I have almost no text on my site.