2. MusicIP, back in the day, had a great tool on the desktop that allowed you to pick a seed track and then made a playlist from your own music collection
2020-05-05 12629, 2020
alastairp
Philip's working on some tools for this kind of visualisation for his phd, but I'm not sure how ready it is
2020-05-05 12644, 2020
alastairp
I think it's public, let me ping him (and hope that we don't take the server hosting it down ;)
2020-05-05 12647, 2020
ruaok
and it used no collaborative filtering, but automatic annotation and acoustic features.
2020-05-05 12610, 2020
ruaok
overall, it did a really good job.
2020-05-05 12627, 2020
alastairp
right, intuitively it seems like aidan's work would be the perfect base to that
2020-05-05 12643, 2020
ruaok
but even more than a goal for a specific tool, is a desire to create a playground for recommendation work.
2020-05-05 12613, 2020
shivam-kapila has quit
2020-05-05 12625, 2020
shivam-kapila joined the channel
2020-05-05 12628, 2020
ruaok
if people can create building blocks (annoy, related artists, etc) and allow people to combine them in easy ways using boring ass python, we can reach a lot of people.
2020-05-05 12645, 2020
ruaok
and that premise is a good one for outreach.
2020-05-05 12633, 2020
alastairp
that's reminding me of some related work, one sec, let me find it
2020-05-05 12638, 2020
ruaok
so, if we can create a playground/toolkit that could implement similar tracks or "related tracks radio" or somesuch, that could be powerful.
which of course, requires lots of massaging data beforehand.
2020-05-05 12618, 2020
alastairp
this kind of thing
2020-05-05 12656, 2020
ruaok
ah. I thought you meant a lot of them in terms of music recommendation.
2020-05-05 12604, 2020
alastairp
oh, no. just in general
2020-05-05 12620, 2020
ruaok
oh, then that is an implicit vote for the approach, i think.
2020-05-05 12624, 2020
alastairp
as a gut feel, I'm not sure where we should focus first - a framework or a demo
2020-05-05 12653, 2020
ruaok
I'd like to implement a demo by creating an MVP framework.
2020-05-05 12613, 2020
ruaok
just a demo makes it harder for people to jump in.
2020-05-05 12626, 2020
alastairp
a framework has the potential downside that we try and focus too much on making it all-singing all-dancing pluggable, and run out of time to make an actual demo
2020-05-05 12635, 2020
ruaok
but if a building block can be created by wrapping an API, that improves things for everyone.
2020-05-05 12649, 2020
alastairp
however, making 2 demos immediately, with the aim to make them pluggable, and then reverse engineer that into a framework
2020-05-05 12653, 2020
ruaok
yes, that is a key anti-pattern to avoid.
2020-05-05 12659, 2020
alastairp
means that we have a framework that works, and 2 demos
2020-05-05 12606, 2020
alastairp
I guess that's your MVP framework
2020-05-05 12611, 2020
alastairp
so it seems like we're on the same page
2020-05-05 12631, 2020
ruaok
yeah, lots of open questions still, but seemingly so.
2020-05-05 12644, 2020
alastairp
do you think it'd be interesting to look through a bunch of state of the art in terms of publications and challenges (like the recsys one) and see if we can reproduce them using our data?
2020-05-05 12608, 2020
alastairp
alternatively, we start really basic
2020-05-05 12615, 2020
ruaok
to me making the perfect framework would include the qualitative evaluation bits that are needed to properly evaluate algorithms in an academic context.
2020-05-05 12628, 2020
alastairp
and not try and add a whole bunch of complexity that state of the art includes
2020-05-05 12642, 2020
ruaok
but I feel that that should come later. I want to bring in people with blank slates and raw energy.
2020-05-05 12600, 2020
ruaok
really basic.
2020-05-05 12603, 2020
alastairp
this week I'll have a talk with Dmitry and Andrés, they were working on some very similar projects last year
2020-05-05 12604, 2020
ruaok
absolutely.
2020-05-05 12614, 2020
ruaok
great.
2020-05-05 12636, 2020
ruaok
I personally suspect that this framework will need multiple refactorings as we grow it over time.
2020-05-05 12646, 2020
ruaok
just as MB did, and that is fine.
2020-05-05 12647, 2020
alastairp
I think we almost have enough data to do an "extend this playlist by x songs"
2020-05-05 12601, 2020
alastairp
where a "playlist" can be a section of someone's listening history
2020-05-05 12614, 2020
ruaok
that's pretty good as well!
2020-05-05 12632, 2020
ruaok
part of my motivation is that I really enjoyed trying to coax something useful out of the annoy indexes.
2020-05-05 12651, 2020
alastairp
to clarify - do you see that as different than collaborative filtering?
2020-05-05 12654, 2020
ruaok
and it was clear that more data, not from AB, was needed to select tracks and to shape them into something useful.
2020-05-05 12603, 2020
ruaok
I'd like to make playing around with these tools much easier.
2020-05-05 12615, 2020
alastairp
or, collaborative filtering is a key in the puzzle to that recommender?
2020-05-05 12619, 2020
ruaok
see what as different?
2020-05-05 12624, 2020
ruaok
ah.
2020-05-05 12626, 2020
alastairp
"extend this playlist"
2020-05-05 12644, 2020
ruaok
right now I am viewing CF as yet another thing to plug into this system.
2020-05-05 12610, 2020
ruaok
if we get any sort of traction with people playing, someone or perhaps ourselves will move the CF stuff along and plug that in.
2020-05-05 12620, 2020
ruaok
that should clearly be the overall goal.
2020-05-05 12626, 2020
alastairp
right agreed
2020-05-05 12643, 2020
ruaok
but CF may not be needed for an initial version of extend this playlist -- I suspect it would be the winning approach.
2020-05-05 12659, 2020
alastairp
so it sounds like there are at least 2 possible demos for now - playlist extension, and your "similar songs in my collection" demo?
2020-05-05 12659, 2020
ruaok
but an initial approach could simply be to use the provided tracks as more context.
2020-05-05 12629, 2020
ruaok
three, likely.
2020-05-05 12652, 2020
ruaok
extend playlist, play me stuff based on this seed song (radio) and similar songs (any or in my collection)
2020-05-05 12609, 2020
alastairp
right
2020-05-05 12624, 2020
alastairp
extend playlist could be considered a special case of radio, just with more constraints, right?
2020-05-05 12625, 2020
ruaok
I get the feeling that our use cases should define the features that we should build into the MVP framework.
2020-05-05 12639, 2020
ruaok
more constraints and more context, yes.
2020-05-05 12613, 2020
ruaok
ok I think we're roughly on the same page from a high level perspective now.
2020-05-05 12625, 2020
SomalRudra joined the channel
2020-05-05 12631, 2020
ruaok
shall we dive into more details as far as what baseline features such an MVP framework needs?
ok, my base idea was nothing more than data sources, filters and data sinks.
2020-05-05 12645, 2020
alastairp
it's... a bit slow, but has some interesting stuff
2020-05-05 12652, 2020
alastairp
I've forgotten how it works
2020-05-05 12658, 2020
ruaok
ok, I'll digest the pile of links after this chat. thanks!
2020-05-05 12639, 2020
ruaok
data sources = given some input, find candidate tracks.
2020-05-05 12654, 2020
ruaok
filters = given candidate tracks, remove ones that do not meet filter requirements.
2020-05-05 12611, 2020
ruaok
data sinks = select final tracks, possibly order, output.
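A minimal sketch of the three roles described above (sources, filters, sinks), assuming tracks are plain dicts. The catalogue, the field names and every function name here are hypothetical and only illustrate the idea; none of this is an existing API.

```python
from typing import Any, Callable, Dict, List

Track = Dict[str, Any]  # hypothetical track record

# Hypothetical in-memory catalogue standing in for real data.
CATALOG: List[Track] = [
    {"id": 1, "artist": "X", "year": 1999},
    {"id": 2, "artist": "X", "year": 2005},
    {"id": 3, "artist": "X", "year": 2012},
    {"id": 4, "artist": "Y", "year": 2005},
]


def same_artist_source(seed: Track) -> List[Track]:
    """Source: given some input, find candidate tracks."""
    return [
        t for t in CATALOG
        if t["artist"] == seed["artist"] and t["id"] != seed["id"]
    ]


def after_2000_filter(candidates: List[Track]) -> List[Track]:
    """Filter: remove candidates that do not meet the requirement."""
    return [t for t in candidates if t["year"] >= 2000]


def newest_first_sink(candidates: List[Track]) -> List[Track]:
    """Sink: select the final tracks, order them, output."""
    return sorted(candidates, key=lambda t: t["year"], reverse=True)[:10]


def run_pipeline(
    seed: Track,
    sources: List[Callable[[Track], List[Track]]],
    filters: List[Callable[[List[Track]], List[Track]]],
    sink: Callable[[List[Track]], List[Track]],
) -> List[Track]:
    """Collect candidates from every source, apply each filter, then the sink."""
    candidates: List[Track] = []
    seen_ids = set()
    for source in sources:
        for track in source(seed):
            if track["id"] not in seen_ids:  # de-duplicate across sources
                seen_ids.add(track["id"])
                candidates.append(track)
    for keep in filters:
        candidates = keep(candidates)
    return sink(candidates)


playlist = run_pipeline(
    CATALOG[0], [same_artist_source], [after_2000_filter], newest_first_sink
)
```

Keeping each block a plain function means a new source or filter can be dropped into the lists without touching the runner.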
2020-05-05 12646, 2020
ruaok
ANNOY is a simple data source. Given a track and a feature, get tracks.
2020-05-05 12658, 2020
ruaok
CF filtering is a great data source.
2020-05-05 12627, 2020
alastairp
yeah, exactly
2020-05-05 12642, 2020
alastairp
plain musicbrainz metadata could be another interesting source
2020-05-05 12654, 2020
alastairp
"by the same producer", "solo work by a band member", etc
2020-05-05 12600, 2020
ruaok
yes.
2020-05-05 12614, 2020
ruaok
and there is a fine line between source and filter too.
2020-05-05 12639, 2020
ruaok
related artists could be used as a source to find more tracks or as a filter to reduce tracks in a candidate set.
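To illustrate that fine line, here is a small sketch in which the same related-artist data backs both a source and a filter. The `RELATED` map, the catalogue and the function names are made up for the example.

```python
from typing import Any, Dict, List, Set

Track = Dict[str, Any]  # hypothetical track record

# Hypothetical related-artist data and catalogue.
RELATED: Dict[str, Set[str]] = {"X": {"Y", "Z"}}
CATALOG: List[Track] = [
    {"id": 1, "artist": "X"},
    {"id": 2, "artist": "Y"},
    {"id": 3, "artist": "Z"},
    {"id": 4, "artist": "W"},
]


def related_artists_source(seed: Track) -> List[Track]:
    """Used as a source: find more tracks by artists related to the seed's artist."""
    related = RELATED.get(seed["artist"], set())
    return [t for t in CATALOG if t["artist"] in related]


def related_artists_filter(seed: Track, candidates: List[Track]) -> List[Track]:
    """Used as a filter: reduce a candidate set to tracks by related artists."""
    related = RELATED.get(seed["artist"], set())
    return [t for t in candidates if t["artist"] in related]


seed = CATALOG[0]
added = related_artists_source(seed)
kept = related_artists_filter(seed, [CATALOG[1], CATALOG[3]])
```

The lookup is identical in both functions; the only difference is whether it runs against the whole catalogue or against an existing candidate set.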
2020-05-05 12600, 2020
ruaok
also, AB data bits. BPM, clearly.
2020-05-05 12640, 2020
alastairp
yes, I see what you mean
2020-05-05 12658, 2020
ruaok
fetch BPM for all candidates, filter out some, and shape the playlist as per the example in the doc.
2020-05-05 12604, 2020
alastairp
it depends on if you want to use it to add more candidate songs, or get rid of some
2020-05-05 12615, 2020
ruaok
exactly that.
2020-05-05 12651, 2020
ruaok
and the power of writing 10 lines of python, running it and watching stats fly by and ending up with 1000 tracks or 0 is compelling.
2020-05-05 12653, 2020
alastairp
btw, it turns out that often just recommending the current top 40 is the best way to predict what people are likely to listen to next
2020-05-05 12603, 2020
ruaok
I know that I need to change filter/source, whatnot...
2020-05-05 12605, 2020
alastairp
right, so something that can run these modules and tell you at each stage what is coming out and going into the next
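A rough sketch of such a runner, assuming each stage is a function of `(seed, tracks)` that returns a new list of tracks. The stage names, the BPM values and the 10 BPM tolerance are invented for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Track = Dict[str, Any]  # hypothetical track record
Stage = Callable[[Track, List[Track]], List[Track]]

# Hypothetical catalogue with made-up BPM values.
CATALOG: List[Track] = [
    {"id": 1, "bpm": 120},
    {"id": 2, "bpm": 118},
    {"id": 3, "bpm": 90},
    {"id": 4, "bpm": 125},
    {"id": 5, "bpm": 128},
]


def all_tracks_source(seed: Track, tracks: List[Track]) -> List[Track]:
    """Source: add every catalogue track except the seed."""
    return tracks + [t for t in CATALOG if t["id"] != seed["id"]]


def bpm_filter(seed: Track, tracks: List[Track]) -> List[Track]:
    """Filter: keep tracks within 10 BPM of the seed (arbitrary tolerance)."""
    return [t for t in tracks if abs(t["bpm"] - seed["bpm"]) <= 10]


def closest_bpm_sink(seed: Track, tracks: List[Track]) -> List[Track]:
    """Sink: order by BPM distance to the seed and keep the two closest."""
    return sorted(tracks, key=lambda t: abs(t["bpm"] - seed["bpm"]))[:2]


def run_with_trace(
    seed: Track, stages: List[Tuple[str, Stage]]
) -> Tuple[List[Track], List[Tuple[str, int, int]]]:
    """Run the stages in order, recording how many tracks go in and come out."""
    tracks: List[Track] = []
    trace: List[Tuple[str, int, int]] = []
    for name, stage in stages:
        before = len(tracks)
        tracks = stage(seed, tracks)
        trace.append((name, before, len(tracks)))
        print(f"{name}: {before} in -> {len(tracks)} out")
    return tracks, trace


result, trace = run_with_trace(
    CATALOG[0],
    [
        ("all_tracks_source", all_tracks_source),
        ("bpm_filter", bpm_filter),
        ("closest_bpm_sink", closest_bpm_sink),
    ],
)
```

Because every stage has the same signature, each one can also be called on its own with a hand-made list of tracks, which keeps individual testing simple.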
2020-05-05 12609, 2020
alastairp
sounds neat
2020-05-05 12617, 2020
ruaok
yes.
2020-05-05 12626, 2020
alastairp
being able to test individually will be a lifesaver too
2020-05-05 12633, 2020
alastairp
so, who's going to make that? :)
2020-05-05 12646, 2020
ruaok
I kinda want to take recommendations out of the academic environment that they sit in now and bring them closer to open source hackers.
2020-05-05 12612, 2020
ruaok
now, we will still need academics to build the higher-end bits, but this can bring people together.
2020-05-05 12633, 2020
ruaok
who? you, Mr_Monkey, and myself. and anyone else who wants to jump in.
2020-05-05 12654, 2020
alastairp
yes, doing a review of the kinds of things that these higher-end bits do will be a good start
2020-05-05 12657, 2020
ruaok
I think it would be natural for you to make ANNOY data sources and AB data sources.
2020-05-05 12620, 2020
alastairp
I'll have a look into some recent music recommendation publications
2020-05-05 12623, 2020
ruaok
and Mr_Monkey could work on the concept of making the output playable.
2020-05-05 12646, 2020
ruaok
once I finish the timescale migration, I can start building the core of the framework.
2020-05-05 12618, 2020
ruaok
which is just sets of data and calling sources, filters and output bits and joining the data.
2020-05-05 12626, 2020
ruaok
not rocket science, but fairly tedious.
2020-05-05 12625, 2020
alastairp
right
2020-05-05 12612, 2020
alastairp
so from my perspective, I'll start focusing during the next few weeks to get similarity integrated in AB and running at the scale that we need it
2020-05-05 12614, 2020
ruaok
ok, let me mull all of this over, read the link you posted and see about designing some data pipelines for this project.
2020-05-05 12637, 2020
Mr_Monkey
Is there currently an API for similarity?
2020-05-05 12639, 2020
alastairp
and parallel to that, I'll dig into what other people are doing in this space and see how much of it we can reproduce ourselves
2020-05-05 12641, 2020
ruaok
ok, sounds good.
2020-05-05 12647, 2020
alastairp
Mr_Monkey: the one we used at the summit
2020-05-05 12601, 2020
alastairp
but I think I replaced that instance with something else for testing
2020-05-05 12621, 2020
alastairp
next week I'll focus on the next steps for deployment, and will get it back up on similarity.ab.org
2020-05-05 12629, 2020
Mr_Monkey
I'm assuming all the similarity stuff will live with AB, and so will that API then
2020-05-05 12622, 2020
ruaok
iliekcomputers nudged me about having a meeting to get everyone on board with what is going on with LB stuff.
2020-05-05 12638, 2020
ruaok
I need to do more planning and have the board meeting this week.
2020-05-05 12654, 2020
ruaok
I should be ready for a meeting next week to hopefully kick off this project in a more formal sort of way.
2020-05-05 12629, 2020
ruaok
I would propose 1 hour before the normal meeting time on monday to hold this meeting. iliekcomputers, shivam-kapila, ishaanshah[m], alastairp, Mr_Monkey ?
2020-05-05 12638, 2020
Freso
I don’t know if it’s too early, but Zastai|2 or others from the Kodi community might want to keep close tabs on this. This would be absolutely killer for something like Kodi to have built-in.
2020-05-05 12642, 2020
Mr_Monkey
Good for me.
2020-05-05 12658, 2020
alastairp
Freso: that sounds like phase 2
2020-05-05 12602, 2020
ruaok
Freso: too early. but they are most certainly on my radar.