#metabrainz

      • supersandro2000 has quit
      • supersandro2000 joined the channel
      • nelgin
        yvanzo, for some reason the setting to change my VM from 10 to 16 GB didn't take, hardware acceleration was off and video memory was at 4 MB. I've tweaked the settings back to something more reasonable so I'm going to give it another go.
      • Sophist-UK has quit
      • Sophist-UK joined the channel
      • 2020-09-04 01:12:29,450: Importing recording...
      • client_loop: send disconnect: Broken pipe
      • Very nice. My VM died.
      • sumedh joined the channel
      • ishaanshah
        Morning
      • shivam-kapila
        Morning
      • sumedh has quit
      • sumedh joined the channel
      • travis-ci joined the channel
      • travis-ci
        Project bookbrainz-site build #3406: passed in 5 min 8 sec: https://travis-ci.org/bookbrainz/bookbrainz-sit...
      • travis-ci has left the channel
      • sumedh has quit
      • BrainzGit
        [listenbrainz-server] paramsingh merged pull request #1072 (master…ishaan/shift_stats_cronjobs): Shift cronjobs by 12 hours https://github.com/metabrainz/listenbrainz-serv...
      • sumedh joined the channel
      • [listenbrainz-server] release v-2020-09-04.0 has been published by release-drafter[bot]: https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1064 (master…dependabot/pip/spotipy-2.14.0): Bump spotipy from 2.12.0 to 2.14.0 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1066 (master…dependabot/pip/flask-cors-3.0.9): Bump flask-cors from 3.0.8 to 3.0.9 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1067 (master…dependabot/pip/yattag-1.14.0): Bump yattag from 1.13.2 to 1.14.0 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1069 (master…dependabot/pip/pytest-cov-2.10.1): Bump pytest-cov from 2.10.0 to 2.10.1 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1068 (master…dependabot/pip/sphinx-3.2.1): Bump sphinx from 3.1.2 to 3.2.1 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh closed pull request #1065 (master…dependabot/pip/ujson-3.1.0): Bump ujson from 1.35 to 3.1.0 https://github.com/metabrainz/listenbrainz-serv...
      • pristine___
        ruaok: hey
      • jmp_music_
        Morning!
      • v6lur joined the channel
      • pristine___
      • iliekcomputers
      • ruaok: ^
      • put my thoughts down for later today.
      • reosarevok
        Fun!
      • alastairp
        iliekcomputers: cool. so this is mostly related to user-user interactions, rather than automated recommendations?
      • iliekcomputers
        alastairp: yes!
      • alastairp
        great. I started an overview document for the second one too, so a good combination
      • iliekcomputers
        This could eventually lead to data that we can feed into automated recommendations
      • But yeah, for now it is essentially a counterpart that's more human based
      • alastairp
        please feel free to add text or comments
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1677 (master…MBS-11065): MBS-11065: Only block smart links if they have a path https://github.com/metabrainz/musicbrainz-serve...
      • BrainzBot
        MBS-11065: Smart link blocks affecting legitimate links https://tickets.metabrainz.org/browse/MBS-11065
      • pristine___
      • alastairp: that day when you were talking about batches of 100, were you suggesting to input data of 100 users in here and get the recs of 100 altogether?
      • BrainzGit
        [listenbrainz-server] paramsingh merged pull request #1040 (master…document-time-range): Add a docstring for the time_range https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] paramsingh merged pull request #1051 (master…stats): LB-708: Merge the __init__.py and utils.py in stats directory. https://github.com/metabrainz/listenbrainz-serv...
      • BrainzBot
        LB-708: Merge the __init__.py and utils.py in stats directory. https://tickets.metabrainz.org/browse/LB-708
      • BrainzGit
        [listenbrainz-server] paramsingh merged pull request #1063 (master…pydantic-model-dataframes): Pydantic model for data returned from spark (create_dataframes.py) https://github.com/metabrainz/listenbrainz-serv...
      • alastairp
        pristine___: it was a hypothesis based on the number of joins you were doing
      • yes, get the recs of all 100, and so when you do the join to go from index -> recording/artist ids (I think that's what it was doing?) you're only doing 1 join per 100 users, not 100 joins
      • _lucifer
        alastairp: would like your review on this for further improvements https://docs.google.com/document/d/1vaLT5AXont6...
      • pristine___
        alastairp: I have collected recs of all the users and performed a single join. I have done this improvement in the new PR (1073). The join here is not the only bottleneck. The script is slow because we were generating recs for all users one at a time, i.e. we were calling the func I linked above for every user. So this morning I realised: what if we call that func (generate_rec) once for 100/1000 users? It will
      • drastically reduce the run time. Thanks for the hypothesis.
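
A minimal sketch of the batching idea discussed above, in plain Python rather than Spark. Every name here (`generate_recs_for_batch`, `index_to_mbid`, the batch size of 100) is a hypothetical stand-in and not the actual listenbrainz-server code; the point is only that the recommender is called once per batch and the index-to-MBID lookup is done once per batch instead of once per user.

```python
from typing import Dict, Iterator, List

call_count = 0  # counts how often the (hypothetical) recommender is invoked


def batched(items: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def generate_recs_for_batch(user_ids: List[int]) -> Dict[int, List[int]]:
    """Hypothetical recommender call: returns recording indexes per user.

    In the real system the fixed cost of a call would dominate, which is
    why one call per batch is cheaper than one call per user.
    """
    global call_count
    call_count += 1
    return {uid: [(uid + k) % 5 for k in range(3)] for uid in user_ids}


def join_mbids(recs: Dict[int, List[int]],
               index_to_mbid: Dict[int, str]) -> Dict[int, List[str]]:
    """Map recording indexes to MBIDs for a whole batch in one pass."""
    return {uid: [index_to_mbid[i] for i in idxs] for uid, idxs in recs.items()}


# Made-up lookup table and user list, for illustration only.
index_to_mbid = {i: f"mbid-{i}" for i in range(5)}
users = list(range(250))

recs_by_user: Dict[int, List[str]] = {}
for batch in batched(users, 100):
    recs = generate_recs_for_batch(batch)                   # one call per batch
    recs_by_user.update(join_mbids(recs, index_to_mbid))    # one join per batch
```

With 250 users and a batch size of 100, the recommender stand-in is invoked three times instead of 250.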
      • BrainzGit
        [listenbrainz-server] paramsingh opened pull request #1074 (master…param/move-python-tests-to-jenkins): [wip] move python tests over to jenkins from travis https://github.com/metabrainz/listenbrainz-serv...
      • iliekcomputers
        pristine___: there are a couple of tests failing in ^ that i think belong to you https://ci.metabrainz.org/job/listenbrainz-spar...
      • i'll skip them for now.
      • ruaok trundles in
      • ruaok
        moooin!
      • iliekcomputers
        morning.
      • i have today off and no plans, so am working on MeB stuff
      • well, technically working on MeB stuff is a plan
      • ruaok
        lol
      • I'm reading your doc and we both thought of completely different stuff.
      • in a really good way, from what I can see.
      • d4rkie joined the channel
      • iliekcomputers
        it's why i wrote it down!
      • to make sure we were in sync
      • ruaok
        <3
      • Nyanko-sensei has quit
      • let me throw my thoughts on the bottom of the paper -- it almost feels like sullying your clean plan, lol.
      • iliekcomputers
        sure
      • pristine___
        iliekcomputers: will have a look. Thanks
      • Gazooo794 has quit
      • Gazooo794 joined the channel
      • alastairp
        pristine___: oh, interesting. I'm not sure that I understand - do you mean that to predict items for 100 users is approximately as fast as predicting items for 1 user?
      • ruaok
        that would not surprise me given spark, really.
      • iliekcomputers
        me either
      • ruaok
        iliekcomputers: do you use google photos?
      • alastairp
        it makes a certain amount of sense. if that's the case, it's the same kind of speedup we got in AB highlevel too - the majority of the time was spent loading the models and getting them into a format in memory that is useful. actually passing data through the model to get an output is the fast part
      • iliekcomputers
        yes
      • ruaok
        alastairp: that
      • alastairp
        sweet, looking forward to a 100x speedup overnight, then
      • ruaok
        iliekcomputers: I love the scrollbar in the main photos timeline. it gives a good overview of your photos timeline.
      • imagine if FB had that.
      • alastairp
        FB? 😱
      • Rotab
        FB 😍
      • pristine___
        alastairp: yup. Runtime for 100 (or even 200) users ~ runtime for one user. I am limiting it to 100 users because I need to perform a join afterwards. The join can lead to OOM because it will be a cross join. If the join weren't there, I would have generated recs for all (10k users) at once. :p
      • alastairp
        pristine___: great. what exactly is the join for?
      • I'm not sure exactly how this storage part of spark works, but can you generate all at once, then generate a new data table/datastore/rrd/whatever for a smaller number of users and join against that?
      • pristine___
        So we get recording ids from the recommender. The join is to get the corresponding recording mbids.
      • ruaok
        iliekcomputers: dumped my thoughts. we can discuss now while things are fresh or we can let them simmer for a bit and discuss later....
      • iliekcomputers
        let me give it a read
      • alastairp
        pristine___: how many rows in the final recommendations?
      • I'm surprised that postgres can join a billion rows without breaking a sweat but spark runs out of memory
      • ruaok
        that reminds me, I want to try and read MLHD into timescale....
      • alastairp
        oh nice
      • pristine___: anyway, I'd seriously look into doing the predictions on a lot of users, but the join on a smaller subset
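
A rough sketch of that suggestion (predict for everyone in one go, then join in smaller pieces), again in plain Python and not Spark. `predict_all`, `join_chunk`, the chunk size, and the data are all hypothetical illustrations; whether the real join needs chunking at all is exactly what is still to be tested on the cluster.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, int]  # (user_id, recording_index)


def predict_all(user_ids: Iterable[int]) -> List[Row]:
    """Hypothetical single prediction call covering every user."""
    return [(uid, (uid * 7 + k) % 10) for uid in user_ids for k in range(2)]


def chunks(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows so each join stays bounded."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def join_chunk(chunk: List[Row], index_to_mbid: Dict[int, str]) -> List[Tuple[int, str]]:
    """Resolve recording indexes to MBIDs for one chunk of rows."""
    return [(uid, index_to_mbid[idx]) for uid, idx in chunk]


# Made-up lookup table, for illustration only.
index_to_mbid = {i: f"mbid-{i}" for i in range(10)}

rows = predict_all(range(1000))  # one prediction pass for all users
joined: List[Tuple[int, str]] = []
max_chunk = 0
for chunk in chunks(rows, 300):  # the join only ever sees 300 rows at a time
    max_chunk = max(max_chunk, len(chunk))
    joined.extend(join_chunk(chunk, index_to_mbid))
```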
      • iliekcomputers
        ruaok: read through it, left a few comments, mostly agreements, i think we're on the same page in terms of what we want to do
      • in the long term
      • ruaok
        I expected as much reading your stuff. let me read the comments.
      • pristine___
        alastairp: Given your postgres remark I was thinking to test the join for all users. I have hit OOM in the past, but maybe that was not because of the size of the dataset. It will be a nice thing to test on the cluster
      • iliekcomputers
        not sure if you have any comments on the features that I'm proposing to build specifically, maybe more comments will come up when i have a more technical design ;)
      • alastairp
        pristine___: yeah, right. this is definitely something that we should try, and only do it differently if it doesn't work
      • pristine___
        Rec for all and join for all. I am excited to see how much the runtime will be reduced.
      • alastairp: on a side note, I was just going through pyspark github. It is fairly easy to get all time recs for users. And many other things.
      • In case you want to have a look.
      • alastairp
        yeah, I saw your link yesterday. it looks good
      • I started looking into the spark stuff about a month ago but got distracted this month, I'll start looking at it again
      • ruaok
        iliekcomputers: I have two key things to discuss about your proposal. 1) recommending a track without giving any comment on the recommendation feels like it's missing something. somehow incomplete. 2) The interactions with BrainzPlayer need improvements for a better user experience
      • iliekcomputers
        hmm.
      • let's chat about 1
      • ruaok
        k.
      • iliekcomputers
        let's chat about 1 first, i mean
      • ruaok
        I fully understand that adding a comment option opens Pandora's box.
      • iliekcomputers
        i'm not totally against it. i was trying to keep things simple. if we do go that road, i assume the comment would ideally live in CB
      • reosarevok
        should the user recommending give the comment, or the one who gets recommended?
      • (or both)
      • ruaok
        hmmm. CB.
      • iliekcomputers
        the user recommending the track, i would say
      • ruaok
        another can of worms, really.
      • iliekcomputers
        i would be against allowing threads for now
      • so the users who see the recommendation wouldn't be able to comment
      • ruaok
        yes, no threads or comments. #dontreadthecomments
      • reosarevok
        haha
      • ruaok
        counter-recommend, not comment.
      • reosarevok
        One thing that would be cool is to be told whether the user you recommended it to loved/hated it
      • (you have buttons for that, right?)
      • ruaok
        that I see useful.
      • reosarevok
        So, you get *some* reaction, but without the comments
      • ruaok
        up/down voting.
      • reosarevok
        If you want to actually talk about it, send the user a message
      • iliekcomputers
        again, a good idea, not sure if we want it in the initial version.
      • reosarevok
        Via MB or whatever (that'd be easier once we had MeB accounts not mainly living in MB :p)
      • ruaok
        +1 to both iliekcomputers and reosarevok
      • I think limiting the features in the first run will give us a better idea as to how to continue better in the long run
      • reosarevok
        Yeah
      • iliekcomputers
        this project would essentially set up the base, allowing people to add friends
      • once that's done, there's a whole host of things we could do
      • reosarevok
        IMO as a start, just allowing people to send recommendations and that's it is enough
      • ruaok
        and recommend a recording, yes?
      • iliekcomputers
        yeah.
      • reosarevok
        I mean, that's what we sometimes do here, with zas dropping a bandcamp link or whatever
      • ruaok
        great. I think that is a very good first step.
      • reosarevok: THAT
      • this allows a vehicle for zas to do this. and I want to read his feed.
      • then let's move to point 2.
      • iliekcomputers
        wait
      • reosarevok
        Later you can decide whether you want a "timeline" with "your friend X got a BB badge!!"
      • ruaok waits