#metabrainz

      • Mr_Monkey
        :)
      • 2020-05-11 13241, 2020

      • ruaok
        Mr_Monkey: remember during the odyssey work we needed a script that could launch a playlist from the command line?
      • 2020-05-11 13243, 2020

      • Mr_Monkey
        The low hanging fruit I see are: better integration with spotify (and possibly other services like youtube) to be able to play content
      • 2020-05-11 13253, 2020

      • ruaok
        we need to dig back into that.
      • 2020-05-11 13207, 2020

      • Mr_Monkey
        Yes
      • 2020-05-11 13221, 2020

      • shivam-kapila
        (YouTube support is a nice option)
      • 2020-05-11 13230, 2020

      • ruaok
        rdswift has some code I need to look at that submits a playlist to LB -- when those tracks are submitted to the LB player, it should start playing that list.
      • 2020-05-11 13233, 2020

      • Mr_Monkey
        We're assuming all the code lives in LB for now, right?
      • 2020-05-11 13257, 2020

      • iliekcomputers
        i would not want a new repository
      • 2020-05-11 13221, 2020

      • ruaok
        even for the recommendations toolkit, iliekcomputers?
      • 2020-05-11 13233, 2020

      • ruaok
        the playback stuff is certainly part of LB, yes.
      • 2020-05-11 13234, 2020

      • shivam-kapila
        LB seems to be a good option for this feature.
      • 2020-05-11 13235, 2020

      • alastairp
        back
      • 2020-05-11 13253, 2020

      • iliekcomputers
        if it's in python, yes.
      • 2020-05-11 13202, 2020

      • reosarevok
        yvanzo: whatevs, works for me
      • 2020-05-11 13218, 2020

      • ruaok
        it is -- ok, then I'll make a branch for this stuff, rather than a new repo.
      • 2020-05-11 13222, 2020

      • iliekcomputers
        otherwise a lot of effort will go into setting up stuff like jenkins, jira, other github bots etc
      • 2020-05-11 13223, 2020

      • ruaok
        alastairp: thoughts so far?
      • 2020-05-11 13230, 2020

      • alastairp
        sure, a few things
      • 2020-05-11 13203, 2020

      • iliekcomputers
        ruaok: i'd just make a dir in master and work off master instead of a separate branch
      • 2020-05-11 13217, 2020

      • reosarevok
        > first: annoy data sources.
      • 2020-05-11 13221, 2020

      • reosarevok
        Oh, we do that all the time!
      • 2020-05-11 13201, 2020

      • alastairp
        1) I think the academic side of this kind of stuff is a really good side to look at to source ideas from. It's true that not all work coming out of here gets turned into concrete products, but most of the research is really solid
      • 2020-05-11 13226, 2020

      • alastairp
        and there are a lot of people publishing now based on internships that they've done, both at music companies and non-music companies.
      • 2020-05-11 13234, 2020

      • ruaok
        no argument against that #1, alastairp.
      • 2020-05-11 13258, 2020

      • alastairp
        our data is great because it's open from end-to-end, so these people are a great resource to tap too, if we can get data to them
      • 2020-05-11 13232, 2020

      • ruaok
        yep, ideally we'll host all of the datasets and have the toolkit only make web requests.
      • 2020-05-11 13247, 2020

      • alastairp
        in terms of where this lives... I feel like maybe the definition needs to be clarified a bit more to work out what parts we need
      • 2020-05-11 13252, 2020

      • ruaok
        the initial version will require musicbrainz-docker to be installed.
      • 2020-05-11 13254, 2020

      • alastairp
        100% I believe that the toolkit is a new repo
      • 2020-05-11 13206, 2020

      • djwhitey has quit
      • 2020-05-11 13217, 2020

      • alastairp
        the public view of the results of this toolkit, this can definitely be LB
      • 2020-05-11 13219, 2020

      • ruaok
        alastairp will go against iliekcomputers in the cage match over the new repo or not!
      • 2020-05-11 13244, 2020

      • iliekcomputers
        will LB eventually use the toolkit?
      • 2020-05-11 13252, 2020

      • ruaok
        iliekcomputers: possibly not.
      • 2020-05-11 13256, 2020

      • alastairp
        sure, that's one of the things that might use it
      • 2020-05-11 13201, 2020

      • alastairp
        oh, wait
      • 2020-05-11 13202, 2020

      • ruaok
        it is entirely too early to tell.
      • 2020-05-11 13210, 2020

      • alastairp
        "use the toolkit", that's kind of what I mean
      • 2020-05-11 13218, 2020

      • alastairp
        LB is going to want results, not building blocks
      • 2020-05-11 13225, 2020

      • ruaok
        the thing is, the toolkit is the base for building new stuff and for dev experimentation.
      • 2020-05-11 13237, 2020

      • ruaok
        as far as deployment for use on our sites, that is far from clear right now.
      • 2020-05-11 13258, 2020

      • iliekcomputers
        i'm trying to avoid the workflow where i need to add a feature and i have to make two different pull requests in two different repositories
      • 2020-05-11 13207, 2020

      • shivam-kapila
        Are we gonna have a totally new project?
      • 2020-05-11 13222, 2020

      • iliekcomputers
        or the dependency problems we had with messybrainz and BU to some extent
      • 2020-05-11 13224, 2020

      • ruaok
        shivam-kapila: no not really.
      • 2020-05-11 13253, 2020

      • ruaok
        how about we defer this question until we look at some example code that doesn't currently live in LB.
      • 2020-05-11 13203, 2020

      • ruaok
        then we can see if we want to add or make a new repo.
      • 2020-05-11 13215, 2020

      • ruaok
        so far there is 0% shared code and 0% shared functionality.
      • 2020-05-11 13251, 2020

      • ruaok
        this toolkit is going to be hecka inefficient, but super hecka flexible.
      • 2020-05-11 13253, 2020

      • BrainzGit joined the channel
      • 2020-05-11 13254, 2020

      • Mr_Monkey
        Apart from playing the playlists I guess
      • 2020-05-11 13208, 2020

      • ruaok
        generating a playlist will take seconds to minutes. not snappy.
      • 2020-05-11 13226, 2020

      • shivam-kapila
        Playlist on the go -- Exciting
      • 2020-05-11 13227, 2020

      • ruaok
        Mr_Monkey: yes, but that is LB functionality, not toolkit functionality.
      • 2020-05-11 13206, 2020

      • ruaok
        but, once we see which algorithms generate good playlists/similar tracks/whatnot, we can step back and look at how to implement that for production purposes.
      • 2020-05-11 13223, 2020

      • ruaok
        it's kinda hard to describe at this point.
      • 2020-05-11 13224, 2020

      • Mr_Monkey
        SGTM
      • 2020-05-11 13233, 2020

      • ruaok
        let's move on.
      • 2020-05-11 13241, 2020

      • alastairp
        maybe it's worth defining a few of our potential usecases/workflows, from the perspective of an algorithm developer or user
      • 2020-05-11 13245, 2020

      • ruaok
        alastairp: how much effort is it to get some annoy indexes up?
      • 2020-05-11 13237, 2020

      • alastairp
        we're already 90% there in that document I think, perhaps this could help us define a little more clearly where we _think_ things might end up living (with the potential to move in the future if we find something better)
      • 2020-05-11 13203, 2020

      • alastairp
        to get the indexes up that were running at the summit, less than a day
      • 2020-05-11 13222, 2020

      • iliekcomputers
        sorry, i have to drop off, i'll read the backlog.
      • 2020-05-11 13224, 2020

      • ruaok
        ohh, I'd like that!
      • 2020-05-11 13235, 2020

      • alastairp
        however, I get the impression that we were always running around making new apis and adding extra functionality
      • 2020-05-11 13242, 2020

      • Mr_Monkey
        ^that
      • 2020-05-11 13259, 2020

      • ruaok
        and never finishing anything?
      • 2020-05-11 13200, 2020

      • alastairp
        so I don't know how much of what we have is actually useful for what we might need
      • 2020-05-11 13203, 2020

      • Mr_Monkey
        It's likely we'll need to adapt the api as the projects move along
      • 2020-05-11 13210, 2020

      • alastairp
        nah - I think we got stuff finished
      • 2020-05-11 13219, 2020

      • ruaok
        well, that's just where we've finally arrived!
      • 2020-05-11 13228, 2020

      • ruaok
        we finally get to tie all the pieces together.
      • 2020-05-11 13232, 2020

      • alastairp
        that is, the few odyssey endpoints that we made were working fine
      • 2020-05-11 13249, 2020

      • alastairp
        but I guess in effect that's what these blocks are
      • 2020-05-11 13259, 2020

      • alastairp
        taking the low-level similarity index, and building something more complex on top of it
      • 2020-05-11 13208, 2020

      • ruaok
        what I have so far includes: artist-artist relations, some crappy track-track relations and mb metadata lookup. next up: annoy indexes and the MSB mapping.
      • 2020-05-11 13231, 2020

      • ruaok
        once the MSB mapping is in I can fetch the top artists per user from LB and then make playlists based on that data!
      • 2020-05-11 13240, 2020

      • alastairp
        this week we'll have a BPM and key endpoint in AB too
      • 2020-05-11 13251, 2020

      • ruaok
        alastairp: perfect!
      • 2020-05-11 13204, 2020

      • ruaok
        then we can make AB filters for those two.
      • 2020-05-11 13236, 2020
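
A minimal sketch of what such a BPM/key filter could look like. This is an illustration only: the function name and the track dictionaries with `bpm` and `key` fields are hypothetical and are not the actual AB response format or toolkit code.

```python
def filter_by_bpm_and_key(tracks, bpm_min=None, bpm_max=None, key=None):
    """Keep tracks whose BPM lies in [bpm_min, bpm_max] and whose key matches.

    `tracks` is a list of dicts with optional "bpm" and "key" entries
    (a hypothetical shape, assumed for this sketch).
    A bound left as None is not applied.
    """
    kept = []
    for track in tracks:
        bpm = track.get("bpm")
        # A track without BPM data cannot satisfy a BPM bound.
        if bpm_min is not None and (bpm is None or bpm < bpm_min):
            continue
        if bpm_max is not None and (bpm is None or bpm > bpm_max):
            continue
        if key is not None and track.get("key") != key:
            continue
        kept.append(track)
    return kept
```

Calling it with no bounds returns every track unchanged, so it can sit in a pipeline as a no-op until real constraints are supplied.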

      • ruaok
        the MSB mapping is key, because it lets me take any of the stats generated for LB users and use them as sources/filters in this toolkit.
      • 2020-05-11 13212, 2020

      • Mr_Monkey
        That's really powerful
      • 2020-05-11 13231, 2020

      • ruaok
        alastairp: shall we say that at the end of the week we'll have the annoy endpoints up and the BPM and key API endpoints?
      • 2020-05-11 13200, 2020

      • ruaok
        and I think Mr_Monkey and I can work out the playlist playback feature -- it seems to be mostly there.
      • 2020-05-11 13230, 2020

      • alastairp
        I can't commit to this week, I've got some other projects that I need to focus on, but next week should be fine
      • 2020-05-11 13235, 2020

      • Mr_Monkey was just listening to LB recent page to check. Works well for me, but needs more testing
      • 2020-05-11 13236, 2020

      • ruaok
        Mr_Monkey: yes, combining all this stuff is really starting to open my eyes as to the possibilities for this. I mean, this has been the plan for years, but it's finally coming together.
      • 2020-05-11 13256, 2020

      • ruaok
        alastairp: what can you give me this week?
      • 2020-05-11 13217, 2020

      • ruaok
        BPM and key API endpoints by the end of this week is fine.
      • 2020-05-11 13221, 2020

      • alastairp
        I don't know. My current plan was to be on a different project Tuesday - Friday
      • 2020-05-11 13222, 2020

      • ruaok
        annoy early next week?
      • 2020-05-11 13230, 2020

      • alastairp
        of course, BPM and Key are already in the API, but I'm looking at getting the individual features PR merged, hopefully tonight
      • 2020-05-11 13243, 2020

      • alastairp
        that was supposed to happen this afternoon but the day ran away on me
      • 2020-05-11 13254, 2020

      • alastairp
        so to clarify -
      • 2020-05-11 13237, 2020

      • alastairp
        what are we working on now, how long is now (is this a specific 2-3 week dev session), and do we have a specific thing that we want to make by the end of it?
      • 2020-05-11 13219, 2020

      • ruaok
        alastairp: that is the point of this meeting. and my hope was to actually influence your current priorities.
      • 2020-05-11 13249, 2020

      • alastairp
        OK, cool
      • 2020-05-11 13254, 2020

      • ruaok
        the end goal, for starters for you, would be the two added API endpoints and annoy indexes being up.
      • 2020-05-11 13218, 2020

      • ruaok
        and playlist launching with Mr_Monkey, ideally all implemented by the end of this week.
      • 2020-05-11 13254, 2020

      • alastairp
        right. from that, I'll be focusing on features tonight, and I'll see if I can carve out an hour or 2 to poke the similarity API and get it back up again
      • 2020-05-11 13201, 2020

      • Mr_Monkey
        On my side that should work
      • 2020-05-11 13207, 2020

      • ruaok
        then this would allow us to play with the toolkit next week and learn what it can do. then make a wishlist of what to improve and to make a roadmap for the next month.
      • 2020-05-11 13228, 2020

      • alastairp
        playlist launching - does this just cover "take a playlist and make a player for it?"
      • 2020-05-11 13230, 2020

      • ruaok
        Mr_Monkey: great!
      • 2020-05-11 13237, 2020

      • ruaok
        alastairp: yes.
      • 2020-05-11 13246, 2020

      • alastairp
        or do you want to include an actual playlist generator in this bit of work as well?
      • 2020-05-11 13210, 2020

      • ruaok
        alastairp: is the similarity API the odyssey stuff? Really at this point I only need annoy index lookups.
      • 2020-05-11 13222, 2020

      • alastairp
        it's all the same codebase
      • 2020-05-11 13233, 2020

      • ruaok
        there is a playlist generator planned to be included in the toolkit.
      • 2020-05-11 13252, 2020

      • ruaok
        first cut: randomly select n playlists from the result of the pipeline.
      • 2020-05-11 13254, 2020
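
A minimal sketch of that first cut, reading it as "randomly pick n items from whatever the pipeline returns". The function name, the `seed` parameter and the de-duplication step are assumptions made for illustration, not the toolkit's actual generator.

```python
import random


def pick_playlist(candidates, n, seed=None):
    """Randomly choose up to n distinct items from the pipeline output.

    `candidates` is any iterable of hashable track identifiers
    (hypothetical input shape). `seed` makes the choice repeatable.
    """
    rng = random.Random(seed)
    # Drop duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(candidates))
    # Never ask for more items than are available.
    return rng.sample(unique, min(n, len(unique)))
```

Capping `n` at the number of unique candidates avoids the `ValueError` that `random.sample` raises when asked for more items than the population holds.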

      • alastairp
        likely just a matter of checking out the correct branch
      • 2020-05-11 13223, 2020

      • ruaok
        don't worry about the similarity API stuff -- just annoy index lookups for now.
      • 2020-05-11 13237, 2020

      • ruaok
        effectively this similarity API stuff can be replaced by the work from this toolkit.
      • 2020-05-11 13235, 2020

      • ruaok
        Mr_Monkey: shall we work on the playlist launching stuff tomorrow?
      • 2020-05-11 13241, 2020

      • Mr_Monkey
        Yes
      • 2020-05-11 13203, 2020

      • alastairp
        so it sounds like we've got a bit of features, some acoustic similarity stuff, and some metadata relations
      • 2020-05-11 13204, 2020

      • ruaok
        I'll need some time in the morning to get my stuff in gear, but I can have something for the afternoon.
      • 2020-05-11 13229, 2020

      • ruaok
        alastairp: plus the MSB mapping which allows us to pull in LB user stats.
      • 2020-05-11 13249, 2020

      • ruaok
        so, every new stat that ishaanshah[m] and iliekcomputers are kicking out translates into a new data source in the toolkit.
      • 2020-05-11 13252, 2020

      • alastairp
        right, so that should be able to turn into some basic CF stuff, right?
      • 2020-05-11 13252, 2020

      • Mr_Monkey
        and artist-artist similarity as well, no?
      • 2020-05-11 13203, 2020

      • ruaok
        Mr_Monkey: already baked in.
      • 2020-05-11 13209, 2020

      • alastairp
        Mr_Monkey: right, I kind of counted that as metadata relations
      • 2020-05-11 13225, 2020

      • ruaok
        alastairp: yes, now that pristine__ is back, we're going to work on putting CF candidate track sets into the toolkit as well.
      • 2020-05-11 13248, 2020

      • alastairp
        ruaok: can you make up a data source based on a stat?
      • 2020-05-11 13255, 2020

      • alastairp
        I mean
      • 2020-05-11 13258, 2020

      • ruaok
        we can now!
      • 2020-05-11 13205, 2020

      • pristine__
        :)
      • 2020-05-11 13211, 2020

      • ruaok
        each of the stats for a user is fetchable via the API.
      • 2020-05-11 13219, 2020

      • alastairp
        not "is it possible". I mean, can you give an example?
      • 2020-05-11 13230, 2020

      • alastairp
        what is a stat? (what kind of things are we computing?)
      • 2020-05-11 13249, 2020

      • ruaok
        so, we just wget the top artists for a user, for instance -- that is live now.
      • 2020-05-11 13210, 2020

      • ruaok
        filter that through the MSB mapping and you have a list of top artists MBIDs.
      • 2020-05-11 13232, 2020
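
A rough sketch of the two steps just described: take a user's top-artists stats, then pass them through the MSB mapping to get artist MBIDs. Everything concrete here is an assumption for illustration: the payload shape, the field names `artist_name` and `listen_count`, and the mapping modelled as a plain name-to-MBID dictionary. The HTTP fetch itself is left out so the example stays self-contained.

```python
def top_artist_mbids(stats_payload, msb_mapping):
    """Map a user's top artists to (artist MBID, listen count) pairs.

    stats_payload: already-decoded JSON for a user's top artists
                   (hypothetical shape: {"payload": {"artists": [...]}}).
    msb_mapping:   dict from artist name to artist MBID, standing in
                   for the real MSB mapping (an assumption).
    Artists that the mapping does not know are skipped.
    """
    artists = stats_payload.get("payload", {}).get("artists", [])
    result = []
    for entry in artists:
        mbid = msb_mapping.get(entry.get("artist_name"))
        if mbid is not None:
            result.append((mbid, entry.get("listen_count", 0)))
    return result
```

The listen counts are carried along so that a later step can use them as weights.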

      • alastairp
        OK, so things like "top ranked artists for a user, with weights" is a feature that we have
      • 2020-05-11 13234, 2020

      • ruaok
        when the report supports that for the past week, we can generate a weekly playlist based on your past week's listens.
      • 2020-05-11 13245, 2020

      • ruaok
        yes.
      • 2020-05-11 13201, 2020

      • alastairp
        great, which turns into a CF task really well
      • 2020-05-11 13207, 2020

      • Mr_Monkey
      • 2020-05-11 13209, 2020

      • ruaok
        and any of the stats that we'll be adding to LB will be fetchable via the API, and thus can become a datasource/filter.
      • 2020-05-11 13235, 2020

      • ruaok
        yes. we finally have enough datasets that we can pull all the pieces together.
      • 2020-05-11 13237, 2020

      • alastairp
        and candidate tracks, you specifically mean (all tracks) x (tracks listened to by a user) => (some new suggestions to listen to)
      • 2020-05-11 13256, 2020
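
To make the "(all tracks) x (tracks listened to by a user) => (new suggestions)" idea concrete, here is a deliberately naive sketch. It is an overlap-weighted stand-in, not the CF model being discussed: the function name, the input shapes and the scoring rule are all assumptions.

```python
from collections import Counter


def candidate_tracks(user_tracks, other_users_tracks, top_n=10):
    """Suggest tracks the user has not heard, scored by listener overlap.

    user_tracks:        iterable of track ids the user has listened to.
    other_users_tracks: iterable of per-user track id collections.
    Each unheard track earns, from every other user who played it,
    as many points as that user shares tracks with our user.
    """
    listened = set(user_tracks)
    scores = Counter()
    for tracks in other_users_tracks:
        tracks = set(tracks)
        overlap = len(listened & tracks)
        if overlap == 0:
            continue  # nothing in common, so no evidence either way
        for track in tracks - listened:
            scores[track] += overlap
    return [track for track, _ in scores.most_common(top_n)]
```

Tracks already in the user's history are never returned, which matches the "new suggestions" part of the definition.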

      • ruaok
        yes.
      • 2020-05-11 13259, 2020

      • alastairp
        great
      • 2020-05-11 13200, 2020

      • pristine__
        Yes
      • 2020-05-11 13243, 2020

      • ruaok
        so, once we have these bits of data and develop algs that really work, then we can figure out how to push these back to spark/whatever for a mass deployment.
      • 2020-05-11 13243, 2020

      • alastairp
        does this currently work with all tracks, or do we have the capability to break it down by time period?
      • 2020-05-11 13257, 2020

      • ruaok
        pristine__: ^^
      • 2020-05-11 13259, 2020

      • alastairp
        very briefly, do we have thoughts about evaluation?
      • 2020-05-11 13224, 2020

      • ruaok
        alastairp: only that it is something that we will need to work on before too long.