#metabrainz

      • ruaok
        what should we build as an example of what we've gathered?
      • 2020-05-05 12655, 2020

      • ruaok
        yes, I think the latter is easier to achieve, but will not be as amazing.
      • 2020-05-05 12613, 2020

      • ruaok
        I see two possible projects coming out of this:
      • 2020-05-05 12622, 2020

      • alastairp
        mmm, Andrés was working on similar stuff to this too
      • 2020-05-05 12629, 2020

      • ruaok
        1. Find "similar" tracks and display them.
      • 2020-05-05 12633, 2020

      • alastairp
        did you see the recsys spotify challenge a few years ago?
      • 2020-05-05 12638, 2020

      • shivam-kapila has quit
      • 2020-05-05 12638, 2020

      • shivam-kapila joined the channel
      • 2020-05-05 12640, 2020

      • ruaok
        yes
      • 2020-05-05 12651, 2020

      • alastairp
      • 2020-05-05 12614, 2020

      • alastairp
        I like the idea of similar tracks
      • 2020-05-05 12624, 2020

      • ruaok
        2. MusicIP, back in the day, had a great tool on the desktop that allowed you to pick a seed track and then made a playlist from your own music collection
      • 2020-05-05 12629, 2020

      • alastairp
        Philip's working on some tools for this kind of visualisation for his phd, but I'm not sure how ready it is
      • 2020-05-05 12644, 2020

      • alastairp
        I think it's public, let me ping him (and hope that we don't take the server hosting it down ;)
      • 2020-05-05 12647, 2020

      • ruaok
        and it used no collaborative filtering, but automatic annotation and acoustic features.
      • 2020-05-05 12610, 2020

      • ruaok
        overall, it did a really good job.
      • 2020-05-05 12627, 2020

      • alastairp
        right, intuitively it seems like aidan's work would be the perfect base for that
      • 2020-05-05 12643, 2020

      • ruaok
        but even more than a goal for a specific tool, there is a desire to create a playground for recommendation work.
      • 2020-05-05 12613, 2020

      • shivam-kapila has quit
      • 2020-05-05 12625, 2020

      • shivam-kapila joined the channel
      • 2020-05-05 12628, 2020

      • ruaok
        if people can create building blocks (annoy, related artists, etc) and allow people to combine them in easy ways using boring ass python, we can reach a lot of people.
      • 2020-05-05 12645, 2020

      • ruaok
        and that premise is a good one for outreach.
      • 2020-05-05 12633, 2020

      • alastairp
        that's reminding me of some related work, one sec, let me find it
      • 2020-05-05 12638, 2020

      • ruaok
        so, if we can create a playground/toolkit that could implement similar tracks or "related tracks radio" or somesuch, that could be powerful.
      • 2020-05-05 12617, 2020

      • alastairp
      • 2020-05-05 12628, 2020

      • alastairp
        yeah, there are lots of "pull together blocks of components" frameworks
      • 2020-05-05 12601, 2020

      • ruaok
        oh yeah? got links to ones that are successful?
      • 2020-05-05 12607, 2020

      • ruaok
        anything truly open source?
      • 2020-05-05 12624, 2020

      • alastairp
        not anything specifically, I've just seen them around
      • 2020-05-05 12631, 2020

      • alastairp
        I mean, doesn't spark kind of do that?
      • 2020-05-05 12641, 2020

      • ruaok
        not really, no.
      • 2020-05-05 12659, 2020

      • ruaok
        all we're using of spark is the collaborative filtering, not much else.
      • 2020-05-05 12611, 2020

      • alastairp
      • 2020-05-05 12616, 2020

      • ruaok
        which of course, requires lots of massaging data beforehand.
      • 2020-05-05 12618, 2020

      • alastairp
        this kind of thing
      • 2020-05-05 12656, 2020

      • ruaok
        ah. I thought you meant a lot of them in terms of music recommendation.
      • 2020-05-05 12604, 2020

      • alastairp
        oh, no. just in general
      • 2020-05-05 12620, 2020

      • ruaok
        oh, then that is an implicit vote for the approach, i think.
      • 2020-05-05 12624, 2020

      • alastairp
        as a gut feel, I'm not sure where we should focus first - a framework or a demo
      • 2020-05-05 12653, 2020

      • ruaok
        I'd like to implement a demo by creating an MVP framework.
      • 2020-05-05 12613, 2020

      • ruaok
        just a demo makes it harder for people to jump in.
      • 2020-05-05 12626, 2020

      • alastairp
        a framework has the potential downside that we try and focus too much on making it all-singing all-dancing pluggable, and run out of time to make an actual demo
      • 2020-05-05 12635, 2020

      • ruaok
        but if a building block can be created by wrapping an API, that improves things for everyone.
      • 2020-05-05 12649, 2020

      • alastairp
        however, making 2 demos immediately, with the aim to make them pluggable, and then reverse engineer that into a framework
      • 2020-05-05 12653, 2020

      • ruaok
        yes, that is a key anti-pattern to avoid.
      • 2020-05-05 12659, 2020

      • alastairp
        means that we have a framework that works, and 2 demos
      • 2020-05-05 12606, 2020

      • alastairp
        I guess that's your MVP framework
      • 2020-05-05 12611, 2020

      • alastairp
        so it seems like we're on the same page
      • 2020-05-05 12631, 2020

      • ruaok
        yeah, lots of open questions still, but seemingly so.
      • 2020-05-05 12644, 2020

      • alastairp
        do you think it'd be interesting to look through a bunch of state of the art in terms of publications and challenges (like the recsys one) and see if we can reproduce them using our data?
      • 2020-05-05 12608, 2020

      • alastairp
        alternatively, we start really basic
      • 2020-05-05 12615, 2020

      • ruaok
        to me, making the perfect framework would include the qualitative evaluation bits that are needed to properly evaluate algorithms in an academic context.
      • 2020-05-05 12628, 2020

      • alastairp
        and not try and add a whole bunch of complexity that state of the art includes
      • 2020-05-05 12642, 2020

      • ruaok
        but I feel that that should come later. I want to bring in people with blank slates and raw energy.
      • 2020-05-05 12600, 2020

      • ruaok
        really basic.
      • 2020-05-05 12603, 2020

      • alastairp
        this week I'll have a talk with Dmitry and Andrés, they were working on some very similar projects last year
      • 2020-05-05 12604, 2020

      • ruaok
        absolutely.
      • 2020-05-05 12614, 2020

      • ruaok
        great.
      • 2020-05-05 12636, 2020

      • ruaok
        I personally suspect that this framework will need multiple refactorings as we grow it over time.
      • 2020-05-05 12646, 2020

      • ruaok
        just as MB did, and that is fine.
      • 2020-05-05 12647, 2020

      • alastairp
        I think we almost have enough data to do an "extend this playlist by x songs"
      • 2020-05-05 12601, 2020

      • alastairp
        where a "playlist" can be a section of someone's listening history
      • 2020-05-05 12614, 2020

      • ruaok
        that's pretty good as well!
      • 2020-05-05 12632, 2020

      • ruaok
        part of my motivation is that I really enjoyed trying to coax something useful out of the annoy indexes.
      • 2020-05-05 12651, 2020

      • alastairp
        to clarify - do you see that as different than collaborative filtering?
      • 2020-05-05 12654, 2020

      • ruaok
        and it was clear that more data, not from AB, was needed to select tracks and to shape them into something useful.
      • 2020-05-05 12603, 2020

      • ruaok
        I'd like to make playing around with these tools much easier.
      • 2020-05-05 12615, 2020

      • alastairp
        or, collaborative filtering is a key in the puzzle to that recommender?
      • 2020-05-05 12619, 2020

      • ruaok
        see what as different?
      • 2020-05-05 12624, 2020

      • ruaok
        ah.
      • 2020-05-05 12626, 2020

      • alastairp
        "extend this playlist"
      • 2020-05-05 12644, 2020

      • ruaok
        right now I am viewing CF as yet another thing to plug into this system.
      • 2020-05-05 12610, 2020

      • ruaok
        if we get any sort of traction with people playing, someone or perhaps ourselves will move the CF stuff along and plug that in.
      • 2020-05-05 12620, 2020

      • ruaok
        that should clearly be the overall goal.
      • 2020-05-05 12626, 2020

      • alastairp
        right agreed
      • 2020-05-05 12643, 2020

      • ruaok
        but CF may not be needed for an initial version of extend this playlist -- though I suspect it would be the winning approach.
      • 2020-05-05 12659, 2020

      • alastairp
        so it sounds like there are at least 2 possible demos for now - playlist extension, and your "similar songs in my collection" demo?
      • 2020-05-05 12659, 2020

      • ruaok
        but an initial approach could simply be to use the provided tracks as more context.
      • 2020-05-05 12629, 2020
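
A minimal sketch of "use the provided tracks as more context", with no CF involved: every track already in the playlist votes for the tracks it is similar to, and the most-voted candidates are appended. The `SIMILAR` table and the `extend_playlist` name are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical similarity lists; any data source could supply these.
SIMILAR = {
    "a": ["x", "y"],
    "b": ["y", "z"],
    "c": ["y", "x", "a"],
}


def extend_playlist(playlist, n):
    """Append the n tracks that are similar to the most tracks already in the playlist."""
    votes = Counter()
    for track in playlist:
        for candidate in SIMILAR.get(track, []):
            if candidate not in playlist:
                votes[candidate] += 1
    # Sort by vote count (descending), then by id so ties are deterministic.
    ranked = sorted(votes, key=lambda t: (-votes[t], t))
    return playlist + ranked[:n]


print(extend_playlist(["a", "b", "c"], 2))  # ['a', 'b', 'c', 'y', 'x']
```

Here "y" is suggested by all three playlist tracks and "x" by two, so they win over "z".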

      • ruaok
        three, likely.
      • 2020-05-05 12652, 2020

      • ruaok
        extend playlist, play me stuff based on this seed song (radio) and similar songs (any or in my collection)
      • 2020-05-05 12609, 2020

      • alastairp
        right
      • 2020-05-05 12624, 2020

      • alastairp
        extend playlist could be considered a special case of radio, just with more constraints, right?
      • 2020-05-05 12625, 2020

      • ruaok
        I get the feeling that our use cases should define the features that we should build into the MVP framework.
      • 2020-05-05 12639, 2020

      • ruaok
        more constraints and more context, yes.
      • 2020-05-05 12613, 2020

      • ruaok
        ok I think we're roughly on the same page from a high level perspective now.
      • 2020-05-05 12625, 2020

      • SomalRudra joined the channel
      • 2020-05-05 12631, 2020

      • ruaok
        shall we dive into more details as far as what baseline features such an MVP framework needs?
      • 2020-05-05 12610, 2020

      • alastairp
        yeah, let's do it
      • 2020-05-05 12630, 2020

      • alastairp
        btw, this is Philip's demo on grouping/clustering songs based on AB features: http://music-explore.upf.edu/
      • 2020-05-05 12633, 2020

      • ruaok
        ok, my base idea was nothing more than data sources, filters and data sinks.
      • 2020-05-05 12645, 2020

      • alastairp
        it's... a bit slow, but has some interesting stuff
      • 2020-05-05 12652, 2020

      • alastairp
        I've forgotten how it works
      • 2020-05-05 12658, 2020

      • ruaok
        ok, I'll digest the pile of links after this chat. thanks!
      • 2020-05-05 12639, 2020

      • ruaok
        data sources = given some input, find candidate tracks.
      • 2020-05-05 12654, 2020

      • ruaok
        filters = given candidate tracks, remove ones that do not meet filter requirements.
      • 2020-05-05 12611, 2020

      • ruaok
        data sinks = select final tracks, possibly order, output.
      • 2020-05-05 12646, 2020
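
The three roles just described could be sketched roughly as below. This is an illustration only, not the framework's actual design: `Track`, `run`, `CATALOGUE` and the toy source, filter and sink are all hypothetical names.

```python
from typing import Callable, Iterable, List

Track = dict  # hypothetical track record, e.g. {"id": "t1", "artist": "A"}
Source = Callable[[Track], List[Track]]        # input -> candidate tracks
Filter = Callable[[List[Track]], List[Track]]  # candidates -> fewer candidates
Sink = Callable[[List[Track]], List[Track]]    # candidates -> final, ordered output


def run(seed: Track, source: Source, filters: Iterable[Filter], sink: Sink) -> List[Track]:
    """Fetch candidates, apply each filter in turn, then let the sink pick the output."""
    candidates = source(seed)
    for f in filters:
        candidates = f(candidates)
    return sink(candidates)


# Toy catalogue standing in for a real data source.
CATALOGUE = [
    {"id": "t1", "artist": "A", "year": 1999},
    {"id": "t2", "artist": "B", "year": 2005},
    {"id": "t3", "artist": "A", "year": 2012},
]


def same_artist(seed):
    return [t for t in CATALOGUE if t["artist"] == seed["artist"] and t["id"] != seed["id"]]


def after_2000(tracks):
    return [t for t in tracks if t["year"] > 2000]


def first_n(n):
    return lambda tracks: tracks[:n]


playlist = run(CATALOGUE[0], same_artist, [after_2000], first_n(10))
print([t["id"] for t in playlist])  # ['t3']
```

Keeping each role a plain function means a new building block is just another function with the same shape.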

      • ruaok
        ANNOY is a simple data source. Given a track and a feature, get tracks.
      • 2020-05-05 12658, 2020
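
A sketch of the "given a track and a feature, get tracks" call shape. To stay self-contained it uses an exact brute-force search instead of Annoy's approximate index; with the Annoy library the same function would sit on top of an `AnnoyIndex` built per feature and queried with `get_nns_by_item`. The `FEATURES` table, the "mfcc" feature name and its vectors are made up for illustration.

```python
import math

# Hypothetical per-feature vectors keyed by track id; real ones would come from AB.
FEATURES = {
    "mfcc": {"t1": [0.0, 0.0], "t2": [0.1, 0.0], "t3": [5.0, 5.0], "t4": [0.0, 0.3]},
}


def similar_tracks(track_id, feature, n=2):
    """Return the n tracks closest to track_id for one feature (exact search)."""
    vectors = FEATURES[feature]
    query = vectors[track_id]
    others = [(math.dist(query, v), tid) for tid, v in vectors.items() if tid != track_id]
    return [tid for _, tid in sorted(others)[:n]]


print(similar_tracks("t1", "mfcc"))  # ['t2', 't4']
```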

      • ruaok
        CF is a great data source.
      • 2020-05-05 12627, 2020

      • alastairp
        yeah, exactly
      • 2020-05-05 12642, 2020

      • alastairp
        plain MusicBrainz metadata could be another interesting source
      • 2020-05-05 12654, 2020

      • alastairp
        "by the same producer", "solo work by a band member", etc
      • 2020-05-05 12600, 2020

      • ruaok
        yes.
      • 2020-05-05 12614, 2020

      • ruaok
        and there is a fine line between source and filter too.
      • 2020-05-05 12639, 2020

      • ruaok
        related artists could be used as a source to find more tracks or as a filter to reduce tracks in a candidate set.
      • 2020-05-05 12600, 2020
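
The source/filter duality could look like this: one related-artist lookup, used once to add candidates and once to prune them. The `RELATED` table, `TRACKS` list and both function names are hypothetical; a real version would query a related-artists service.

```python
# Hypothetical related-artist table and catalogue; a real version would query a service.
RELATED = {"Portishead": {"Massive Attack", "Tricky"}}
TRACKS = [
    {"id": "t1", "artist": "Massive Attack"},
    {"id": "t2", "artist": "Tricky"},
    {"id": "t3", "artist": "Metallica"},
]


def related_artist_source(seed_artist):
    """Source: expand the candidate set with tracks by related artists."""
    return [t for t in TRACKS if t["artist"] in RELATED.get(seed_artist, set())]


def related_artist_filter(seed_artist, candidates):
    """Filter: keep only candidates whose artist is the seed or related to it."""
    allowed = RELATED.get(seed_artist, set()) | {seed_artist}
    return [t for t in candidates if t["artist"] in allowed]


print([t["id"] for t in related_artist_source("Portishead")])          # ['t1', 't2']
print([t["id"] for t in related_artist_filter("Portishead", TRACKS)])  # ['t1', 't2']
```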

      • ruaok
        also, AB data bits. BPM, clearly.
      • 2020-05-05 12640, 2020

      • alastairp
        yes, I see what you mean
      • 2020-05-05 12658, 2020

      • ruaok
        fetch BPM for all candidates, filter out some, and then shape the playlist as per the example in the doc.
      • 2020-05-05 12604, 2020
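
A rough sketch of the BPM step. The `BPM` table is invented, and the "shape" shown here (slowest to fastest) is only one assumed possibility, since the example in the doc is not part of this log.

```python
# Hypothetical BPM lookup; in practice the values would be fetched from AB per track.
BPM = {"t1": 92.0, "t2": 128.0, "t3": 174.0, "t4": 118.0}


def bpm_filter(candidates, low, high):
    """Drop candidates with unknown BPM or BPM outside [low, high]."""
    return [t for t in candidates if t in BPM and low <= BPM[t] <= high]


def ramp_up(candidates):
    """One possible playlist shape: slowest track first, fastest last."""
    return sorted(candidates, key=lambda t: BPM[t])


kept = bpm_filter(["t1", "t2", "t3", "t4", "t5"], 90, 130)
print(ramp_up(kept))  # ['t1', 't4', 't2']
```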

      • alastairp
        it depends on if you want to use it to add more candidate songs, or get rid of some
      • 2020-05-05 12615, 2020

      • ruaok
        exactly that.
      • 2020-05-05 12651, 2020

      • ruaok
        and the power of writing 10 lines of python, running it and watching stats fly by and ending up with 1000 tracks or 0 is compelling.
      • 2020-05-05 12653, 2020

      • alastairp
        btw, it turns out that often just recommending the current top 40 is the best way to predict what people are likely to listen to next
      • 2020-05-05 12603, 2020

      • ruaok
        I know that I need to change filter/source, whatnot...
      • 2020-05-05 12605, 2020

      • alastairp
        right, so something that can run these modules and tell you at each stage what is coming out and going into the next
      • 2020-05-05 12609, 2020
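
Something along these lines could report what goes into and comes out of each stage. `run_stages` and the two toy stages are hypothetical; the point is only the per-stage counts.

```python
def run_stages(candidates, stages):
    """Apply named stages in order and record how many tracks enter and leave each."""
    report = []
    for name, stage in stages:
        before = len(candidates)
        candidates = stage(candidates)
        report.append((name, before, len(candidates)))
        print(f"{name}: {before} in -> {len(candidates)} out")
    return candidates, report


stages = [
    ("drop_odd_ids", lambda ts: [t for t in ts if t % 2 == 0]),
    ("first_three", lambda ts: ts[:3]),
]
result, report = run_stages(list(range(1000)), stages)
```

A stage that suddenly outputs 0 tracks shows up immediately in the report, which tells you which filter or source to change.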

      • alastairp
        sounds neat
      • 2020-05-05 12617, 2020

      • ruaok
        yes.
      • 2020-05-05 12626, 2020

      • alastairp
        being able to test individually will be a lifesaver too
      • 2020-05-05 12633, 2020

      • alastairp
        so, who's going to make that? :)
      • 2020-05-05 12646, 2020

      • ruaok
        I kinda want to take recommendations out of the academic environment that they sit in now and bring them closer to open source hackers.
      • 2020-05-05 12612, 2020

      • ruaok
        now, we will still need academics to build the higher-end bits, but this can bring people together.
      • 2020-05-05 12633, 2020

      • ruaok
        who? you, Mr_Monkey, and myself. and anyone else who wants to jump in.
      • 2020-05-05 12654, 2020

      • alastairp
        yes, doing a review of the kinds of things that these higher-end bits do will be a good start
      • 2020-05-05 12657, 2020

      • ruaok
        I think it would be natural for you to make ANNOY data sources and AB data sources.
      • 2020-05-05 12620, 2020

      • alastairp
        I'll have a look into some recent music recommendation publications
      • 2020-05-05 12623, 2020

      • ruaok
        and Mr_Monkey could work on the concept of making the output playable.
      • 2020-05-05 12646, 2020

      • ruaok
        once I finish the timescale migration, I can start building the core of the framework.
      • 2020-05-05 12618, 2020

      • ruaok
        which is just sets of data and calling sources, filters and output bits and joining the data.
      • 2020-05-05 12626, 2020
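
The "joining the data" part might be as simple as merging the candidate sets from several sources while remembering where each track came from. `join_sources` and the two placeholder sources below are hypothetical stand-ins, not real lookups.

```python
def join_sources(seed, sources):
    """Call every source and merge results, remembering which sources found each track."""
    found = {}
    for name, source in sources.items():
        for track_id in source(seed):
            found.setdefault(track_id, []).append(name)
    return found


sources = {
    "annoy": lambda seed: ["t2", "t3"],            # placeholder for an ANNOY lookup
    "related_artists": lambda seed: ["t3", "t4"],  # placeholder for an artist lookup
}
merged = join_sources("t1", sources)
print(merged)
```

Tracks found by more than one source (here "t3") are natural candidates to rank higher in a later step.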

      • ruaok
        not rocket science, but fairly tedious.
      • 2020-05-05 12625, 2020

      • alastairp
        right
      • 2020-05-05 12612, 2020

      • alastairp
        so from my perspective, I'll start focusing during the next few weeks to get similarity integrated in AB and running at the scale that we need it
      • 2020-05-05 12614, 2020

      • ruaok
        ok, let me mull all of this over, read the link you posted and see about designing some data pipelines for this project.
      • 2020-05-05 12637, 2020

      • Mr_Monkey
        Is there currently an API for similarity?
      • 2020-05-05 12639, 2020

      • alastairp
        and parallel to that, I'll dig into what other people are doing in this space and see how much of it we can reproduce ourselves
      • 2020-05-05 12641, 2020

      • ruaok
        ok, sounds good.
      • 2020-05-05 12647, 2020

      • alastairp
        Mr_Monkey: the one we used at the summit
      • 2020-05-05 12601, 2020

      • alastairp
        but I think I replaced that instance with something else for testing
      • 2020-05-05 12621, 2020

      • alastairp
        next week I'll focus on the next steps for deployment, and will get it back up on similarity.ab.org
      • 2020-05-05 12629, 2020

      • Mr_Monkey
        I'm assuming all the similarity stuff will live with AB, and so will that API then
      • 2020-05-05 12622, 2020

      • ruaok
        iliekcomputers nudged me about having a meeting to get everyone on board with what is going on with LB stuff.
      • 2020-05-05 12638, 2020

      • ruaok
        I need to do more planning and have the board meeting this week.
      • 2020-05-05 12654, 2020

      • ruaok
        I should be ready for a meeting next week to hopefully kick off this project in a more formal sort of way.
      • 2020-05-05 12629, 2020

      • ruaok
        I would propose 1 hour before the normal meeting time on Monday to hold this meeting. iliekcomputers, shivam-kapila, ishaanshah[m], alastairp, Mr_Monkey ?
      • 2020-05-05 12638, 2020

      • Freso
        I don’t know if it’s too early, but Zastai|2 or others from the Kodi community might want to keep close tabs on this. This would be absolutely killer for something like Kodi to have built-in.
      • 2020-05-05 12642, 2020

      • Mr_Monkey
        Good for me.
      • 2020-05-05 12658, 2020

      • alastairp
        Freso: that sounds like phase 2
      • 2020-05-05 12602, 2020

      • ruaok
        Freso: too early. but they are most certainly on my radar.
      • 2020-05-05 12605, 2020

      • alastairp
        get something working, get people to use it
      • 2020-05-05 12609, 2020

      • ruaok
        ding
      • 2020-05-05 12614, 2020

      • Freso nods
      • 2020-05-05 12627, 2020

      • alastairp
        yes, 6 on monday should be good for me
      • 2020-05-05 12639, 2020

      • iliekcomputers
        works for me too