Mr_Monkey: remember during the odyssey work we needed a script that could launch a playlist from the command line?
2020-05-11 13243, 2020
Mr_Monkey
The low hanging fruit I see are: better integration with spotify (and possibly other services like youtube) to be able to play content
2020-05-11 13253, 2020
ruaok
we need to dig back into that.
2020-05-11 13207, 2020
Mr_Monkey
Yes
2020-05-11 13221, 2020
shivam-kapila
(Youtube support is nice option)
2020-05-11 13230, 2020
ruaok
rdswift has some code I need to look at that submits a playlist to LB -- when those tracks are submitted to the LB player, it should start playing that list.
2020-05-11 13233, 2020
Mr_Monkey
We're assuming all the code lives in LB for now, right?
2020-05-11 13257, 2020
iliekcomputers
i would not want a new repository
2020-05-11 13221, 2020
ruaok
even for the recommendations toolkit, iliekcomputers?
2020-05-11 13233, 2020
ruaok
the playback stuff, is certainly part of LB, yes.
2020-05-11 13234, 2020
shivam-kapila
LB seems to be a good option for this feature.
2020-05-11 13235, 2020
alastairp
back
2020-05-11 13253, 2020
iliekcomputers
if it's in python, yes.
2020-05-11 13202, 2020
reosarevok
yvanzo: whatevs, works for me
2020-05-11 13218, 2020
ruaok
it is -- ok, then I'll make a branch for this stuff, rather than a new repo.
2020-05-11 13222, 2020
iliekcomputers
otherwise a lot of effort will go into setting up stuff like jenkins, jira, other github bots etc
2020-05-11 13223, 2020
ruaok
alastairp: thoughts so far?
2020-05-11 13230, 2020
alastairp
sure, a few things
2020-05-11 13203, 2020
iliekcomputers
ruaok: i'd just make a dir in master and work off master instead of a separate branch
2020-05-11 13217, 2020
reosarevok
> first: annoy data sources.
2020-05-11 13221, 2020
reosarevok
Oh, we do that all the time!
2020-05-11 13201, 2020
alastairp
1) I think the academic side of this kind of stuff is a really good side to look at to source ideas from. It's true that not all work coming out of here gets turned into concrete products, but most of the research is really solid
2020-05-11 13226, 2020
alastairp
and there are a lot of people publishing now based on internships that they've done, both at music companies and non-music companies.
2020-05-11 13234, 2020
ruaok
no argument against that #1, alastairp.
2020-05-11 13258, 2020
alastairp
our data is great because it's open from end-to-end, so these people are a great resource to tap too, if we can get data to them
2020-05-11 13232, 2020
ruaok
yep, ideally we'll host all of the datasets and have the toolkit only make web requests.
2020-05-11 13247, 2020
alastairp
in terms of where this lives... I feel like maybe the definition needs to be clarified a bit more to work out what parts we need
2020-05-11 13252, 2020
ruaok
the initial version will require musicbrainz-docker to be installed.
2020-05-11 13254, 2020
alastairp
100% I believe that the toolkit is a new repo
2020-05-11 13206, 2020
djwhitey has quit
2020-05-11 13217, 2020
alastairp
the public view of the results of this toolkit, this can definitely be LB
2020-05-11 13219, 2020
ruaok
alastairp will go against iliekcomputers in the cage match over the new repo or not!
2020-05-11 13244, 2020
iliekcomputers
will LB eventually use the toolkit?
2020-05-11 13252, 2020
ruaok
iliekcomputers: possibly not.
2020-05-11 13256, 2020
alastairp
sure, that's one of the things that might use it
2020-05-11 13201, 2020
alastairp
oh, wait
2020-05-11 13202, 2020
ruaok
it is entirely too early to tell.
2020-05-11 13210, 2020
alastairp
"use the toolkit", that's kind of what I mean
2020-05-11 13218, 2020
alastairp
LB is going to want results, not building blocks
2020-05-11 13225, 2020
ruaok
the thing is, the toolkit is the base for building new stuff and for dev experimentation.
2020-05-11 13237, 2020
ruaok
as far as deployment for use on our sites, that is far from clear right now.
2020-05-11 13258, 2020
iliekcomputers
i'm trying to avoid the workflow where i need to add a feature and i have to make two different pull requests in two different repositories
2020-05-11 13207, 2020
shivam-kapila
Are we gonna have a totally new project?
2020-05-11 13222, 2020
iliekcomputers
or the dependency problems we had with messybrainz and BU to some extent
2020-05-11 13224, 2020
ruaok
shivam-kapila: no not really.
2020-05-11 13253, 2020
ruaok
how about we defer this question until we look at some example code that doesn't currently live in LB.
2020-05-11 13203, 2020
ruaok
then we can see if we want to add or make a new repo.
2020-05-11 13215, 2020
ruaok
so far there is 0% shared code and 0% shared functionality.
2020-05-11 13251, 2020
ruaok
this toolkit is going to be hecka inefficient, but super hecka flexible.
2020-05-11 13253, 2020
BrainzGit joined the channel
2020-05-11 13254, 2020
Mr_Monkey
Apart from playing the playlists I guess
2020-05-11 13208, 2020
ruaok
generating a playlist will take seconds to minutes. not snappy.
2020-05-11 13226, 2020
shivam-kapila
Playlist on the go -- Exciting
2020-05-11 13227, 2020
ruaok
Mr_Monkey: yes, but that is LB functionality, not toolkit functionality.
2020-05-11 13206, 2020
ruaok
but, once we see which algorithms generate good playlists/similar tracks/whatnot, we can step back and look at how to implement that for production purposes.
2020-05-11 13223, 2020
ruaok
it's kinda hard to describe at this point.
2020-05-11 13224, 2020
Mr_Monkey
SGTM
2020-05-11 13233, 2020
ruaok
let's move on.
2020-05-11 13241, 2020
alastairp
maybe it's worth defining a few of our potential usecases/workflows, from the perspective of an algorithm developer or user
2020-05-11 13245, 2020
ruaok
alastairp: how much effort is it to get some annoy indexes up?
2020-05-11 13237, 2020
alastairp
we're already 90% there in that document I think, perhaps this could help us define a little more clearly where we _think_ things might end up living (with the potential to move in the future if we find something better)
2020-05-11 13203, 2020
alastairp
to get the indexes up that were running at the summit, less than a day
2020-05-11 13222, 2020
iliekcomputers
sorry, i have to drop off, i'll read the backlog.
2020-05-11 13224, 2020
ruaok
ohh, I'd like that!
2020-05-11 13235, 2020
alastairp
however, I get the impression that we were always running around making new apis and adding extra functionality
2020-05-11 13242, 2020
Mr_Monkey
^that
2020-05-11 13259, 2020
ruaok
and never finishing anything?
2020-05-11 13200, 2020
alastairp
so I don't know how much of what we have is actually useful for what we might need
2020-05-11 13203, 2020
Mr_Monkey
It's likely we'll need to adapt the api as the projects move along
2020-05-11 13210, 2020
alastairp
nah - I think we got stuff finished
2020-05-11 13219, 2020
ruaok
well, that's just where we've finally arrived!
2020-05-11 13228, 2020
ruaok
we finally get to tie all the pieces together.
2020-05-11 13232, 2020
alastairp
that is, the few odyssey endpoints that we made were working fine
2020-05-11 13249, 2020
alastairp
but I guess in effect that's what these blocks are
2020-05-11 13259, 2020
alastairp
taking the low-level similarity index, and building something more complex on top of it
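[Editor's sketch] A rough illustration of the "low-level similarity index" idea, under stated assumptions. It does not use annoy: it swaps in an exact brute-force cosine search over a tiny in-memory table, which is the lookup an annoy index would approximate at scale. The recording IDs, the vectors, and the function names are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbours(
    vectors: Dict[str, Sequence[float]], query_id: str, n: int
) -> List[str]:
    """Return the IDs of the n vectors closest to the query, excluding the query itself."""
    query = vectors[query_id]
    scored = [
        (cosine_distance(query, vec), rec_id)
        for rec_id, vec in vectors.items()
        if rec_id != query_id
    ]
    scored.sort()  # smallest distance first; ties broken by ID
    return [rec_id for _, rec_id in scored[:n]]


# Hypothetical feature vectors keyed by made-up recording IDs.
vectors = {
    "rec-a": [1.0, 0.0],
    "rec-b": [0.9, 0.1],
    "rec-c": [0.0, 1.0],
    "rec-d": [0.5, 0.5],
}

similar_to_a = nearest_neighbours(vectors, "rec-a", 2)
```

A higher-level block would then take a list like `similar_to_a` as its input rather than talking to the index directly.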
2020-05-11 13208, 2020
ruaok
what I have so far includes: artist-artist relations, some crappy track-track relations and mb metadata lookup. next up: annoy indexes and the MSB mapping.
2020-05-11 13231, 2020
ruaok
once the MSB mapping is in I can fetch the top artists per user from LB and then make playlists based on that data!
2020-05-11 13240, 2020
alastairp
this week we'll have a BPM and key endpoint in AB too
2020-05-11 13251, 2020
ruaok
alastairp: perfect!
2020-05-11 13204, 2020
ruaok
then we can make AB filters for those two.
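[Editor's sketch] One way the two AB filters could look, assuming each recording already carries a BPM and a key value. The `Recording` type, its field names, the key strings, and both filter functions are hypothetical; nothing here reflects the actual AB endpoint responses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Recording:
    mbid: str   # placeholder identifier, not a real MBID
    bpm: float
    key: str    # assumed representation, e.g. "C major"


def bpm_filter(recordings: List[Recording], low: float, high: float) -> List[Recording]:
    """Keep recordings whose BPM falls in the inclusive range [low, high]."""
    return [r for r in recordings if low <= r.bpm <= high]


def key_filter(recordings: List[Recording], key: str) -> List[Recording]:
    """Keep recordings in the given key (case-insensitive match)."""
    wanted = key.strip().lower()
    return [r for r in recordings if r.key.strip().lower() == wanted]


recordings = [
    Recording("rec-1", 120.0, "C major"),
    Recording("rec-2", 90.0, "A minor"),
    Recording("rec-3", 124.0, "c major"),
    Recording("rec-4", 128.0, "G major"),
]

# Filters compose by feeding the output of one into the next.
upbeat_c_major = key_filter(bpm_filter(recordings, 115.0, 130.0), "C major")
```

Because each filter takes a list and returns a list, they can be chained in any order.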
2020-05-11 13236, 2020
ruaok
the MSB mapping is key, because it lets me take any of the stats generated for LB users and use them as sources/filters in this toolkit.
2020-05-11 13212, 2020
Mr_Monkey
That's really powerful
2020-05-11 13231, 2020
ruaok
alastairp: shall we say that at the end of the week we'll have the annoy endpoints up and the BPM and key API endpoints?
2020-05-11 13200, 2020
ruaok
and I think Mr_Monkey and I can work out the playlist playback feature -- it seems to be mostly there.
2020-05-11 13230, 2020
alastairp
I can't commit to this week, I've got some other projects that I need to focus on, but next week should be fine
2020-05-11 13235, 2020
Mr_Monkey was just listening to LB recent page to check. Works well for me, but needs more testing
2020-05-11 13236, 2020
ruaok
Mr_Monkey: yes, combining all this stuff is really starting to open my eyes as to the possibilities for this. I mean, this has been the plan for years, but it's finally coming together.
2020-05-11 13256, 2020
ruaok
alastairp: what can you give me this week?
2020-05-11 13217, 2020
ruaok
BPM and key API endpoints by the end of this week is fine.
2020-05-11 13221, 2020
alastairp
I don't know. My current plan was to be on a different project Tuesday - Friday
2020-05-11 13222, 2020
ruaok
annoy early next week?
2020-05-11 13230, 2020
alastairp
of course, BPM and Key are already in the API, but I'm looking at getting the individual features PR merged, hopefully tonight
2020-05-11 13243, 2020
alastairp
that was supposed to happen this afternoon but the day ran away on me
2020-05-11 13254, 2020
alastairp
so to clarify -
2020-05-11 13237, 2020
alastairp
what are we working on now, how long is now (is this a specific 2-3 week dev session), and do we have a specific thing that we want to make by the end of it?
2020-05-11 13219, 2020
ruaok
alastairp: that is the point of this meeting. and my hope was to actually influence your current priorities.
2020-05-11 13249, 2020
alastairp
OK, cool
2020-05-11 13254, 2020
ruaok
the end goal, for starters for you, would be the two added API endpoints and annoy indexes being up.
2020-05-11 13218, 2020
ruaok
and playlist launching with Mr_Monkey, ideally all implemented by the end of this week.
2020-05-11 13254, 2020
alastairp
right. from that, I'll be focusing on features tonight, and I'll see if I can carve out an hour or 2 to poke the similarity API and get it back up again
2020-05-11 13201, 2020
Mr_Monkey
On my side that should work
2020-05-11 13207, 2020
ruaok
then this would allow us to play with the toolkit next week and learn what it can do. then make a wishlist of what to improve and to make a roadmap for the next month.
2020-05-11 13228, 2020
alastairp
playlist launching - does this just cover "take a playlist and make a player for it?"
2020-05-11 13230, 2020
ruaok
Mr_Monkey: great!
2020-05-11 13237, 2020
ruaok
alastairp: yes.
2020-05-11 13246, 2020
alastairp
or do you want to include an actual playlist generator in this bit of work as well?
2020-05-11 13210, 2020
ruaok
alastairp: is the similarity API the odyssey stuff? Really at this point I only need annoy index lookups.
2020-05-11 13222, 2020
alastairp
it's all the same codebase
2020-05-11 13233, 2020
ruaok
there is a playlist generator planned to be included in the toolkit.
2020-05-11 13252, 2020
ruaok
first cut: randomly select n playlists from the result of the pipeline.
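[Editor's sketch] A minimal version of that first cut, read here as picking n items at random from whatever the pipeline returns. The function name, the `seed` parameter, and the sample data are assumptions for illustration only.

```python
import random
from typing import List, Optional, Sequence


def first_cut_playlist(
    candidates: Sequence[str], n: int, seed: Optional[int] = None
) -> List[str]:
    """Pick up to n distinct items at random from the pipeline result."""
    rng = random.Random(seed)  # seed is only there to make runs repeatable
    k = max(0, min(n, len(candidates)))
    return rng.sample(list(candidates), k)


# Hypothetical pipeline output: made-up recording identifiers.
pipeline_result = ["rec-1", "rec-2", "rec-3", "rec-4", "rec-5", "rec-6"]

playlist = first_cut_playlist(pipeline_result, 3, seed=1)
```

Asking for more items than the pipeline produced simply returns all of them in random order.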
2020-05-11 13254, 2020
alastairp
likely just a matter of checking out the correct branch
2020-05-11 13223, 2020
ruaok
don't worry about the similarity API stuff -- just annoy index lookups for now.
2020-05-11 13237, 2020
ruaok
effectively this similarity API stuff can be replaced by the work from this toolkit.
2020-05-11 13235, 2020
ruaok
Mr_Monkey: shall we work on the playlist launching stuff tomorrow?
2020-05-11 13241, 2020
Mr_Monkey
Yes
2020-05-11 13203, 2020
alastairp
so it sounds like we've got a bit of features, some acoustic similarity stuff, and some metadata relations
2020-05-11 13204, 2020
ruaok
I'll need some time in the morning to get my stuff in gear, but I can have something for the afternoon.
2020-05-11 13229, 2020
ruaok
alastairp: plus the MSB mapping which allows us to pull in LB user stats.
2020-05-11 13249, 2020
ruaok
so, every new stat that ishaanshah[m] and iliekcomputers are kicking out translates into a new data source in the toolkit.
2020-05-11 13252, 2020
alastairp
right, so that should be able to turn into some basic CF stuff, right?
2020-05-11 13252, 2020
Mr_Monkey
and artist-artist similarity as well, no?
2020-05-11 13203, 2020
ruaok
Mr_Monkey: already baked in.
2020-05-11 13209, 2020
alastairp
Mr_Monkey: right, I kind of counted that as metadata relations
2020-05-11 13225, 2020
ruaok
alastairp: yes, now that pristine__ is back, we're going to work on putting CF candidate track sets into the toolkit as well.
2020-05-11 13248, 2020
alastairp
ruaok: can you make up a data source based on a stat?
2020-05-11 13255, 2020
alastairp
I mean
2020-05-11 13258, 2020
ruaok
we can now!
2020-05-11 13205, 2020
pristine__
:)
2020-05-11 13211, 2020
ruaok
each of the stats for a user is fetchable via the API.
2020-05-11 13219, 2020
alastairp
not "is it possible". I mean, can you give an example?
2020-05-11 13230, 2020
alastairp
what is a stat? (what kind of things are we computing?)
2020-05-11 13249, 2020
ruaok
so, we just wget the top artists for a user, for instance -- that is live now.
2020-05-11 13210, 2020
ruaok
filter that through the MSB mapping and you have a list of top artists MBIDs.
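[Editor's sketch] The two steps just described, with in-memory stand-ins so no network call is needed. The shape of the stats entries, the name-keyed lookup used in place of the MSB mapping, and the placeholder MBIDs are all assumptions and not the real LB or MSB formats.

```python
from typing import Dict, List, Tuple

# Hypothetical "top artists" stat for one user (assumed shape).
top_artists = [
    {"artist_name": "Artist A", "listen_count": 42},
    {"artist_name": "Unmapped Artist", "listen_count": 7},
    {"artist_name": "Artist B", "listen_count": 30},
]

# Hypothetical stand-in for the MSB mapping: normalised name -> placeholder MBID.
msb_mapping = {
    "artist a": "00000000-0000-0000-0000-000000000001",
    "artist b": "00000000-0000-0000-0000-000000000002",
}


def map_top_artists(
    stats: List[Dict[str, object]], mapping: Dict[str, str]
) -> List[Tuple[str, int]]:
    """Return (mbid, listen_count) pairs for artists found in the mapping, in rank order."""
    mapped = []
    for entry in stats:
        name = str(entry["artist_name"]).strip().lower()
        mbid = mapping.get(name)
        if mbid is not None:  # artists without a mapping are dropped
            mapped.append((mbid, int(entry["listen_count"])))
    return mapped


top_artist_mbids = map_top_artists(top_artists, msb_mapping)
```

The listen counts are kept alongside the MBIDs so a later step can use them as weights.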
2020-05-11 13232, 2020
alastairp
OK, so things like "top ranked artists for a user, with weights" is a feature that we have
2020-05-11 13234, 2020
ruaok
when the report supports that for the past week, we can generate a weekly playlist based on your past week's listens.
and any of the stats that we'll be adding to LB will be fetchable via the API, so each of those can become a datasource/filter.
2020-05-11 13235, 2020
ruaok
yes. we finally have enough datasets that we can pull all the pieces together.
2020-05-11 13237, 2020
alastairp
and candidate tracks, you specifically mean (all tracks) x (tracks listened to by a user) => (some new suggestions to listen to)
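[Editor's sketch] A toy illustration of "(all tracks) x (tracks listened to by a user) => (new suggestions)". This is a generic co-occurrence heuristic chosen for brevity, not the CF approach used in the project. The user names, track IDs, and function name are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def candidate_tracks(
    listens_by_user: Dict[str, Set[str]], user: str, top_n: int
) -> List[str]:
    """Suggest tracks the user has not heard, weighted by overlap with other listeners."""
    own = listens_by_user[user]
    scores: Counter = Counter()
    for other, tracks in listens_by_user.items():
        if other == user:
            continue
        overlap = len(own & tracks)
        if overlap == 0:
            continue  # listeners with nothing in common contribute nothing
        for track in tracks - own:
            scores[track] += overlap
    # Highest score first; ties broken by track ID for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:top_n]]


# Hypothetical listening history.
listens_by_user = {
    "alice": {"t1", "t2", "t3"},
    "bob": {"t2", "t3", "t4"},
    "carol": {"t3", "t5"},
    "dave": {"t6"},
}

suggestions = candidate_tracks(listens_by_user, "alice", 2)
```

Restricting `listens_by_user` to a time window before calling the function would give a per-period variant.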
2020-05-11 13256, 2020
ruaok
yes.
2020-05-11 13259, 2020
alastairp
great
2020-05-11 13200, 2020
pristine__
Yes
2020-05-11 13243, 2020
ruaok
so, once we have these bits of data and develop algs that really work, then we can figure out how to push these back to spark/whatever for a mass deployment.
2020-05-11 13243, 2020
alastairp
does this currently work with all tracks, or do we have the capability to break it down by time period?
2020-05-11 13257, 2020
ruaok
pristine__: ^^
2020-05-11 13259, 2020
alastairp
very briefly, do we have thoughts about evaluation?
2020-05-11 13224, 2020
ruaok
alastairp: only that it is something that we will need to work on before too long.