#metabrainz

      • disruptek has quit
      • disruptek joined the channel
      • eharris has quit
      • eharris joined the channel
      • D4RK-PH0ENiX has quit
      • spellew
        ferbncode: I accidentally closed my brainzutils pull request, can you reopen it for me? Still going back and making changes to the code as I work on critiquebrainz
      • D4RK-PH0ENiX joined the channel
      • D4RK-PH0ENiX has quit
      • D4RK-PH0ENiX joined the channel
      • amCap1712
        CatQuest, KassOtsimine: the update is live
      • i noticed a bug just now: to test properly you need to log in first and then open collections, otherwise it doesn't work properly
      • i'll fix it in next release
      • disruptek has quit
      • disruptek joined the channel
      • Jay__ joined the channel
      • Jay__
        hey all i have a problem using acousticbrainz can someone help me, i have an image that perfectly describes it https://imgur.com/kxBfaIE
      • Jay__ has quit
      • pristine__
        ruaok: hey
      • ruaok
        Hey! Greetings from Florence.
      • pristine__
        Got few min?
      • reosarevok
        zarcade_droid: done!
      • ruaok
        I have no laptop on me. Just mobile. I should be available after 14h. But, try me.
      • pristine__
      • ruaok: have a look
      • which one do you relate to more?
      • ruaok
        Very interesting. Much faster, which is great. I recognize many more artists, which also seems good.
      • But Green Day, for instance, strays quite far from my tastes.
      • But, I need to get moving now. I can look again from the tram.
      • reosarevok
        "Do you have the time to listen to me whine / About nothing and everything all at once"
      • I dunno, sounds like ruaok to me!
      • kori joined the channel
      • pristine__
        ruaok: the first one is on a month's data
      • And the second on a year's data
      • Ping me when you're here. We can discuss.
      • ruaok
        Ok, that sounds good.
      • How did you calculate the candidate set?
      • pristine__
        First of all, I fetched top 50 artists of a user in the given timeframe. Then made a list of these 50 artists plus artists similar to them using the json you provided. Then I fetched tracks of these similar artists which was the final candidate set.
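The candidate-set procedure described above could be sketched roughly as follows. This is only an illustration, not the code pristine__ ran: the function name, the in-memory dictionaries standing in for the similar-artists JSON and the track lookup, and the choice to include tracks of the top artists themselves are all assumptions.

```python
from typing import Dict, List, Set


def build_candidate_set(
    top_artists: List[str],
    similar_artists: Dict[str, List[str]],
    tracks_by_artist: Dict[str, List[str]],
    limit: int = 50,
) -> Set[str]:
    """Collect tracks by a user's top artists and the artists similar to them.

    All three inputs are hypothetical stand-ins: `top_artists` is the user's
    ranked artist list for the timeframe, `similar_artists` mimics the
    similar-artists JSON, and `tracks_by_artist` mimics a track lookup.
    """
    seeds = top_artists[:limit]
    # Top artists plus everything similar to them, de-duplicated, order kept.
    expanded = list(
        dict.fromkeys(seeds + [s for a in seeds for s in similar_artists.get(a, [])])
    )
    # Assumption: tracks of the seed artists are kept in the set as well.
    return {t for artist in expanded for t in tracks_by_artist.get(artist, [])}
```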
      • ruaok
        Ok, totally makes sense. It would be nice to see the candidate set as well. I think that is something we need to review independently of the recommendations, what do you think?
      • yvanzo
        mo’’in’
      • pristine__
        ruaok: Yes. Totally makes sense. An HTML?
      • I just had this thought while working: we find the top x artists for each user from their past week's history and recommend songs each day. For the next day, we subtract the already recommended songs from the candidate set and then recommend. If the set is exhausted in the middle of the week, we find the top y artists starting from x+1 and repeat the procedure.
      • But the next top y artists can have similar artists in common with the top x, so we need to keep track of that and avoid recommending the same songs.
      • Also, is there a way we can group artists by genre, if we have such a table in the MB db?
      • Also, I was thinking about three playlists: 1. Songs from favorite artists (songs only of the top x artists) 2. Songs from similar artists (songs only of the artists similar to top x) 3. New artists (songs from the whole set minus the candidate set, in order to promote artists)
      • I am spamming the channel with thoughts I had in past two days 😆
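A minimal sketch of the daily scheme pristine__ outlines, assuming a ranked artist list and a hypothetical `candidate_tracks` callable that returns the candidate tracks for a slice of artists. It keeps one set of already recommended tracks, so tracks shared between the top x pool and the later top y pools are never recommended twice.

```python
from typing import Callable, List, Sequence, Set


def recommend_week(
    ranked_artists: Sequence[str],
    candidate_tracks: Callable[[Sequence[str]], List[str]],
    x: int,
    y: int,
    per_day: int,
    days: int = 7,
) -> List[List[str]]:
    """Return one list of recommended tracks per day, without repeats."""
    recommended: Set[str] = set()
    schedule: List[List[str]] = []
    start, end = 0, x
    # De-duplicate the pool while keeping its order.
    pool = list(dict.fromkeys(candidate_tracks(ranked_artists[start:end])))
    for _ in range(days):
        picks: List[str] = []
        while len(picks) < per_day:
            fresh = [t for t in pool if t not in recommended]
            if not fresh:
                if end >= len(ranked_artists):
                    break  # no artists left to expand with
                # Pool exhausted: move on to the next y artists.
                start, end = end, end + y
                pool = list(dict.fromkeys(candidate_tracks(ranked_artists[start:end])))
                continue
            for track in fresh[: per_day - len(picks)]:
                picks.append(track)
                recommended.add(track)
        schedule.append(picks)
    return schedule
```

In a real system the `recommended` set would have to be persisted between days, which is the kind of bookkeeping a dedicated schema would hold.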
      • CatQuest
        sorry amCap1712 I fell asleep. I'll check the application once I've had a shower/eaten breakfast etc
      • pristine__
        Also, we can group artists according to nationality, in addition to artist credit.
      • Nyanko-sensei joined the channel
      • amCap1712
        ok thanks CatQuest
      • D4RK-PH0ENiX has quit
      • ruaok
        pristine__: yeah, HTML should work fine.
      • For grouping artists, we have genres, but the data is not well populated.
      • And all those thoughts about recommendations and keeping track of what has been recommended, is great thinking. This is why I want a new schema inside the LB data.
      • To keep track of all that.
      • And yes, those three ideas are exactly what we can start working on when we have our underlying data sets ready.
      • I'm going to be working on a rudimentary msid <=> mbid mapping this week.
      • ferbncode
        spellew: You should see a "Reopen pull request" button here: https://github.com/metabrainz/brainzutils-pytho...
      • pristine__
        the mapping would refine similar artists list. Sounds good
      • travis-ci joined the channel
      • travis-ci
        Project bookbrainz-data-js build #1117: passed in 1 min 44 sec: https://travis-ci.org/bookbrainz/bookbrainz-dat...
      • travis-ci has left the channel
      • Nyanko-sensei has quit
      • D4RK-PH0ENiX joined the channel
      • gr0uch0mars joined the channel
      • chirlu has quit
      • D4RK-PH0ENiX has quit
      • yokel has quit
      • yokel joined the channel
      • amCap1712
        hi gr0uch0mars
      • i have merged the oauth pr and opened a pr on collections
      • there is still some work to be done on collections but most of it is ready
      • could you check it out?
      • also i have released an update to the app
      • i plan to release another update once some bugs i have identified are fixed and collections work gets completed
      • D4RK-PH0ENiX joined the channel
      • iliekcomputers
        alastairp: hi, will you have time to look at the ratelimit PR today?
      • BrainzGit
        [listenbrainz-recommendation-playground] paramsingh merged pull request #22 (popular-artist…popular-artist): queries for entities (artist, user) https://github.com/metabrainz/listenbrainz-reco...
      • akhilesh
        Mr_Monkey: Hi!
      • Mr_Monkey
        Hi akhilesh
      • BrainzGit
        [listenbrainz-recommendation-playground] paramsingh opened pull request #29 (master…popular-artist): Entity statistics https://github.com/metabrainz/listenbrainz-reco...
      • akhilesh
        Mr_Monkey: what should the output of `<entity>/<bbid>/relationships` be?
      • I mean, which information should it return?
      • Mr_Monkey
        akhilesh: An array of relationships, each containing: relationship type ('label' in the DB), direction, link phrase, other entity's type
      • I think that's the minimal information you need to reconstruct the relationship
      • Ah, and target entity bbid of course
      • The direction is whether the current entity is the source or target of the relationship
      • akhilesh
        ok
      • Mr_Monkey
        akhilesh: There are cases where the direction doesn't make sense (for example, Author A is married to Author B). Not sure what to do with those, possibly simply default to a 'forward' relationship
      • akhilesh
      • Mr_Monkey
        akhilesh: We also might want to publish the relationship type id along with the label
      • Now that I think of it
      • akhilesh
      • Mr_Monkey: Is it ok?
      • gr0uch0mars has quit
      • Mr_Monkey
        You won't need source and target, considering one or the other is the current entity bbid. So you'll only have 'target', and depending on the position of the current entity (in source_bbid or target_bbid), the direction is 'forward' or 'backward'.
      • akhilesh
        ok
      • Mr_Monkey
        I would opt for `relationshipType: {label:X, id: Y}`
      • 'name' instead of 'label' perhaps?
      • akhilesh
      • Mr_Monkey: ^
      • is It ok for now?
      • Mr_Monkey
        akhilesh: Yes, that seems fine for a first step. there might be more to add at a later date
      • akhilesh
        yes
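The relationship shape discussed above might look something like the following sketch. Only `source_bbid`, `target_bbid`, the 'forward'/'backward' directions, and `relationshipType: {label, id}` come from the conversation; the row keys and the remaining output field names are hypothetical, and symmetric relationships are not treated specially.

```python
from typing import Any, Dict


def serialize_relationship(row: Dict[str, Any], current_bbid: str) -> Dict[str, Any]:
    """Describe one relationship from the point of view of `current_bbid`."""
    if row["source_bbid"] == current_bbid:
        # Current entity is the source, so the other end is the target.
        direction, other = "forward", "target"
    elif row["target_bbid"] == current_bbid:
        # Current entity is the target, so the other end is the source.
        direction, other = "backward", "source"
    else:
        raise ValueError("relationship does not involve the current entity")
    return {
        "relationshipType": {"label": row["label"], "id": row["type_id"]},
        "direction": direction,
        "linkPhrase": row["link_phrase"],
        # Only the *other* entity is reported; the current one is implied.
        "targetBbid": row[f"{other}_bbid"],
        "targetEntityType": row[f"{other}_type"],
    }
```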
      • ruaok turns up at home
      • iliekcomputers
      • SothoTalKer
        hello :)
      • iliekcomputers
        ruaok: i opened a pr for exception catching in stats and the mlhd pr is ready for review
      • ruaok
        great!
      • I can start looking at those later today. if I can get used to being in a city again. :)
      • iliekcomputers
        what do you want to do with the spark-writer PR?
      • ruaok
        maybe just close it for now?
      • I'm still stuck on what to do there. the whole big data cluster is frustrating to me.
      • it's a chicken/egg problem. we won't know how many resources we need until we run stuff, but we need to plan before we write code.
      • iliekcomputers
        the spark-writer thing really seems like a problem incremental dumps could solve.
      • ruaok
        and we have two use cases: recommendations and user stats.
      • iliekcomputers
        wake up the cluster, download the dumps needed, import and run stats
      • ruaok
        YES!
      • that is a great insight!
      • let's do that.
      • iliekcomputers
        so how exactly would incremental dumps work, should we just start a series independent of the current full dumps? 1 (big large dump), 2, 3 and 4 and others smaller
      • ruaok
        ideally they would be similar/identical in structure to the full dumps.
      • if you start with a full dump and apply all the partial dumps between full dumps, you should end up with exactly the same data as the next full dump.
      • which means that we are dumping data "as we receive it", not in time sequence.
      • not sure that answers your question.
      • iliekcomputers
        we started storing influx insert timestamps a long time ago, so that hopefully won't be a problem.
      • ruaok
        indeed.
      • iliekcomputers
        so i guess the series would be 1 (full), 2, 3, 4, 5, 6 (full again maybe), 7, 8, 9 and so on?
      • not sure what i'm saying.
      • ruaok
        if you consider the partial dumps as marking the progress of time, then at periodic points, we also emit a full dump.
      • 1p, 2p, 3p, 4p, 5p & 5full, 6p, 7p, 8p, 9p, 10p/10full, 11p ....
      • iliekcomputers
        ah okayyy
      • that makes sense.
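The dump series and its invariant can be illustrated with a small sketch. It assumes each listen carries an insert timestamp and that dumps are cut on that timestamp ("as we receive it") rather than on listen time; the tuple layout and function names are made up for the example and are not the ListenBrainz dump code.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record: (listen_id, inserted_at), where inserted_at is the
# time the server stored the listen, not the time the user listened.
Listen = Tuple[int, int]


def full_dump(listens: Iterable[Listen], until: int) -> Set[int]:
    """Everything received up to and including `until`."""
    return {lid for lid, inserted_at in listens if inserted_at <= until}


def partial_dump(listens: Iterable[Listen], since: int, until: int) -> Set[int]:
    """Only what was received after `since`, up to and including `until`."""
    return {lid for lid, inserted_at in listens if since < inserted_at <= until}


def dump_schedule(count: int, full_every: int = 5) -> List[str]:
    """Label a series of partial dumps with a full dump at periodic points."""
    labels: List[str] = []
    for n in range(1, count + 1):
        labels.append(f"{n}p")
        if n % full_every == 0:
            labels.append(f"{n}full")
    return labels
```

Starting from `full_dump(listens, 5)` and taking the union with each `partial_dump(listens, t, t + 1)` for `t` from 5 to 9 should give exactly `full_dump(listens, 10)`, which is the property ruaok describes.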
      • ruaok
        ideally we would have to write very little new code, since the structure of both types of dumps is the same
      • thanks, iliekcomputers. you just solved a giant headache for me!
      • <3
      • iliekcomputers
        <3 <3
      • iliekcomputers had a good idea, who'd have thunk
      • ruaok
        because then it's just a matter of how often we wake up the cluster.
      • and if we feel that we need it more often, we can increase the frequency.
      • iliekcomputers
        yeah, exactly.
      • ruaok
        and if we feel that we need it all the time eventually, we go buy dedicated machines from hetzner.
      • and that means we use our azure credits and maybe then we know exactly what we want to do.
      • for a more long term solution.
      • k, so my goal after the board meeting and lawsuit resolution this week is to do a version of the MSID<->MBID mapping.
      • then I'll work on azure stuff to start/stop the cluster.
      • then I can loop back around and improve the MSID mapping.
      • I have another thought that I wanted to run past you...
      • iliekcomputers
        how are you doing the mappings?