#metabrainz

      • pristine___
      • 2020-08-13 22652, 2020

      • pristine___
        This is where we perform the join on mapping.
      • 2020-08-13 22659, 2020

      • iliekcomputers
        yes. there might be people who've only connected spotify to listen to music.
      • 2020-08-13 22607, 2020

      • pristine___
        And listens.
      • 2020-08-13 22620, 2020

      • yvanzo
        reosarevok: gh:MBS#1610 is approved too
      • 2020-08-13 22621, 2020

      • BrainzBot
        MBS-10984 / MBS-10985: Convert Move / Remove Disc ID edit to React: https://github.com/metabrainz/musicbrainz-server/…
      • 2020-08-13 22630, 2020

      • iliekcomputers
        will have to look at the permissions we have in the permission column too.
      • 2020-08-13 22655, 2020

      • reosarevok
        Thanks!
      • 2020-08-13 22645, 2020

      • ruaok
        pristine___: yeah, confirmed. not a mapping problem. a missing data problem in MB -- not a glaring one, but still.
      • 2020-08-13 22608, 2020

      • pristine___
        Hmm... I like the idea of reporting it to the users
      • 2020-08-13 22620, 2020

      • ruaok
        the individual artists are there, but the collaborations have not been entered. that means it's #2 (as you said) and we should prepare a report so that users can go enter the data.
      • 2020-08-13 22642, 2020

      • ruaok
      • 2020-08-13 22607, 2020

      • ruaok
        the best we can do is seed the release editor with the data we already know -- and the link above explains how to do that.
      • 2020-08-13 22643, 2020

      • ruaok
        so, if we can write a report that has a pile of links on it that open the release editor and seed it with the data we have, then the process of adding the data to MB is made a wee bit easier. and we want exactly that.
      • 2020-08-13 22644, 2020
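
A minimal sketch of what one link on such a report could look like. The endpoint and the parameter names (`artist_credit.names.N.name`, `artist_credit.names.N.join_phrase`, `mediums.0.track.0.name`) are assumptions recalled from the release editor seeding documentation and should be checked against it. `seed_params`, `seed_link` and the track title "Example Track" are hypothetical, and splitting the artist string on commas is a naive stand-in for real artist credit parsing.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; verify against the seeding docs.
RELEASE_ADD_URL = "https://musicbrainz.org/release/add"


def seed_params(artist_credit, track_title):
    """Flatten the little we know from a listen into seed parameters."""
    # Naive split: breaks for artist names that themselves contain a comma.
    names = [n.strip() for n in artist_credit.split(",") if n.strip()]
    params = {}
    for i, name in enumerate(names):
        prefix = f"artist_credit.names.{i}"
        params[f"{prefix}.name"] = name
        if i < len(names) - 1:
            # Join phrase goes on every credited name except the last one.
            params[f"{prefix}.join_phrase"] = ", "
    params["mediums.0.track.0.name"] = track_title
    return params


def seed_link(artist_credit, track_title):
    """Build a link that opens the release editor pre-filled with the data."""
    return RELEASE_ADD_URL + "?" + urlencode(seed_params(artist_credit, track_title))


# Hypothetical track title; the artist credit is the one from the discussion.
link = seed_link("Zack Knight, Jasmin Walia", "Example Track")
print(link)
```

A report would then be a page of such links, one per unmatched listen. If the artists already exist in MB, their MBIDs could be seeded as well, assuming the seeding parameters allow it.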

      • pristine___
        With the data we already know?
      • 2020-08-13 22648, 2020

      • ruaok
        yes!
      • 2020-08-13 22650, 2020

      • ruaok
      • 2020-08-13 22641, 2020

      • ruaok
        iliekcomputers: are any of those perm combos we should not update the record_listens for?
      • 2020-08-13 22650, 2020

      • pristine___
        But we want to add the data that we don't have.
      • 2020-08-13 22652, 2020

      • iliekcomputers
        i think we can do it for anyone with the `user-read-recently-played` permission.
      • 2020-08-13 22601, 2020

      • pristine___
        I think I am not clear about it.
      • 2020-08-13 22609, 2020

      • ruaok
        > With the data we already know?
      • 2020-08-13 22618, 2020

      • ruaok
        with the data from LB, not the data from MB.
      • 2020-08-13 22653, 2020

      • pristine___
        Woops. So you mean the data we have spotted in LB but ain't available in MB
      • 2020-08-13 22655, 2020

      • ruaok
        so for artist `Zack Knight, Jasmin Walia` we already have those two artists in the MB db.
      • 2020-08-13 22658, 2020

      • pristine___
        ?
      • 2020-08-13 22603, 2020

      • ruaok
        exactly that.
      • 2020-08-13 22610, 2020

      • pristine___
        Oooooo. Right
      • 2020-08-13 22611, 2020

      • ruaok
        in as much as it is possible.
      • 2020-08-13 22639, 2020

      • pristine___
        I will have to think about the implementation. I get the basic idea though
      • 2020-08-13 22646, 2020

      • ruaok
        likely we are only going to have 1 track's worth of info -- but even that makes adding that track, especially in light of releases, easier.
      • 2020-08-13 22600, 2020

      • ruaok
        also, in a lot of cases we will have a spotify id, right?
      • 2020-08-13 22609, 2020

      • pristine___
        Hmm
      • 2020-08-13 22623, 2020

      • ruaok
        so, then we could fetch the spotify metadata for the release for that track and use it to seed the release editor.
      • 2020-08-13 22625, 2020

      • ruaok
        woooooo!
      • 2020-08-13 22611, 2020

      • sumedh has quit
      • 2020-08-13 22645, 2020

      • pristine___
        Yay. Also I am happy to know that the queries in candidate_sets are fine :)
      • 2020-08-13 22611, 2020

      • pristine___
        Cool. So this sums up the discussion on mapping.
      • 2020-08-13 22648, 2020

      • pristine___
        Lemme know whenever you review the artist-artist-relation code (the ticket)
      • 2020-08-13 22627, 2020

      • pristine___
        I don't have my laptop today but I will be online.
      • 2020-08-13 22624, 2020

      • ruaok
      • 2020-08-13 22650, 2020

      • ruaok
        once I finish this spotify perms thing, I'm on it.
      • 2020-08-13 22651, 2020

      • iliekcomputers
        lgtm
      • 2020-08-13 22602, 2020

      • iliekcomputers
        actually
      • 2020-08-13 22608, 2020

      • iliekcomputers
        for precaution's sake
      • 2020-08-13 22623, 2020

      • iliekcomputers
        could we first extract a list of users this would change?
      • 2020-08-13 22638, 2020

      • iliekcomputers
        so that we know which ones to revert in case it all goes kaput
      • 2020-08-13 22611, 2020

      • iliekcomputers
        select user_id from spotify_user where {same condition as in update query}
      • 2020-08-13 22623, 2020
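
The precaution described above can be sketched as follows: select the affected `user_id`s with the same condition the update uses, keep that list, and only then run the update. This is a sketch under stated assumptions, not the query that was actually run. The condition is hypothetical (the real one is elided in the log), the `permission` and `record_listens` column names are inferred from the conversation, and SQLite stands in for the real database so the example is self-contained.

```python
import sqlite3

SCOPE = "user-read-recently-played"
# Hypothetical condition; the real one is whatever the update query uses.
# Padding with spaces makes the LIKE match whole space-separated scopes only.
CONDITION = "record_listens = 0 AND (' ' || permission || ' ') LIKE ?"


def enable_with_snapshot(conn):
    """Save the affected user ids, then run the update with the same condition."""
    pattern = f"% {SCOPE} %"
    rows = conn.execute(
        f"SELECT user_id FROM spotify_user WHERE {CONDITION} ORDER BY user_id",
        (pattern,),
    ).fetchall()
    affected = [row[0] for row in rows]
    conn.execute(f"UPDATE spotify_user SET record_listens = 1 WHERE {CONDITION}", (pattern,))
    conn.commit()
    return affected


def revert(conn, user_ids):
    """Undo the update for exactly the users saved beforehand."""
    conn.executemany(
        "UPDATE spotify_user SET record_listens = 0 WHERE user_id = ?",
        [(user_id,) for user_id in user_ids],
    )
    conn.commit()


# Made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spotify_user (user_id INTEGER PRIMARY KEY, permission TEXT, record_listens INTEGER)"
)
conn.executemany(
    "INSERT INTO spotify_user VALUES (?, ?, ?)",
    [
        (1, "user-read-recently-played", 0),
        (2, "user-read-currently-playing streaming", 0),
        (3, "streaming user-read-recently-played", 1),
        (4, "streaming user-read-recently-played user-read-email", 0),
    ],
)
saved = enable_with_snapshot(conn)
print(saved)
```

Keeping the saved list outside the database (a file, for instance) means `revert` can still be run if the update turns out to cause problems.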

      • iliekcomputers
      • 2020-08-13 22607, 2020

      • iliekcomputers
        looking at the code, it does send an email to the user if it errors out (https://github.com/metabrainz/listenbrainz-server…)
      • 2020-08-13 22628, 2020

      • ruaok
        ok, user list saved.
      • 2020-08-13 22631, 2020

      • v6lur joined the channel
      • 2020-08-13 22639, 2020

      • ruaok
      • 2020-08-13 22615, 2020

      • iliekcomputers
        lgtm.
      • 2020-08-13 22612, 2020

      • ruaok
      • 2020-08-13 22652, 2020

      • iliekcomputers
        cool, cool, cool.
      • 2020-08-13 22601, 2020

      • iliekcomputers
        459 is a lot 🙈
      • 2020-08-13 22643, 2020

      • ruaok pops open grafana to see what that does to our queue.
      • 2020-08-13 22649, 2020

      • ruaok
        probably won't be visible.
      • 2020-08-13 22603, 2020

      • ruaok
        pristine___: the code to calculate artist credit similarities works fine, but the dump code is borked.
      • 2020-08-13 22637, 2020

      • pristine___
        Where at? Any link?
      • 2020-08-13 22619, 2020

      • pristine___
        Of the code
      • 2020-08-13 22613, 2020

      • ruaok
      • 2020-08-13 22637, 2020

      • ruaok
        here I am cobbling artist credit names together, when a fully assembled one is already in the DB.
      • 2020-08-13 22658, 2020

      • ruaok
        I just need to fetch it instead. but I think I need to find lunch first, can't concentrate.
      • 2020-08-13 22608, 2020

      • pristine___
        No hurry. I am happy that we figured out the problem. shivam-kapila will probably have a better similar artist playlist by the end of the day :)
      • 2020-08-13 22641, 2020

      • iliekcomputers
        ruaok: how many listens would you estimate we're adding every day
      • 2020-08-13 22603, 2020

      • iliekcomputers
        i think it's at least 800-900k
      • 2020-08-13 22647, 2020

      • ruaok
        I honestly have no idea.
      • 2020-08-13 22655, 2020

      • ruaok
        it will be much more now. :)
      • 2020-08-13 22616, 2020

      • ruaok
        but before we fixed these accounts, I think it was far less than that.
      • 2020-08-13 22629, 2020

      • ruaok
        we may fetch that many from spotify, but 95% are dups.
      • 2020-08-13 22625, 2020

      • ruaok
        yeah, 350M. I mean timescale knocked a few M of those out, but it's nowhere approaching 1M per day. sadly.
      • 2020-08-13 22635, 2020

      • ruaok
        but, I'd love for you to be right.
      • 2020-08-13 22606, 2020

      • iliekcomputers
        :D
      • 2020-08-13 22620, 2020

      • iliekcomputers
        i think we've been having a few good days at least.
      • 2020-08-13 22621, 2020

      • ishaanshah
        iliekcomputers: can we store the number of listens received that day in the DB at the end of the day?
      • 2020-08-13 22633, 2020

      • ishaanshah
        maybe we can make a graph for that?
      • 2020-08-13 22622, 2020

      • iliekcomputers
        yeah, we could.
      • 2020-08-13 22637, 2020

      • iliekcomputers
        i won't do it in the PR i have, but that's a good idea.
      • 2020-08-13 22658, 2020

      • iliekcomputers
        i'll follow up with a cron job that takes it from redis and stores it in pg
      • 2020-08-13 22646, 2020
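
A rough sketch of such an end-of-day job, assuming the day's listen count lives under one Redis key. The key format `listen_count:<date>`, the `daily_listen_count` table and the count value are all hypothetical. A plain dict stands in for the Redis client (both expose `get`) and SQLite stands in for Postgres so the example runs on its own.

```python
import sqlite3
from datetime import date


def counter_key(day):
    # Hypothetical key format; the real key name is not given in the log.
    return f"listen_count:{day.isoformat()}"


def store_daily_count(kv, conn, day):
    """Copy one day's counter from the key-value store into the database."""
    raw = kv.get(counter_key(day))  # bytes from a Redis client, None if missing
    count = int(raw) if raw is not None else 0
    # SQLite syntax; Postgres would use INSERT ... ON CONFLICT (day) DO UPDATE.
    conn.execute(
        "INSERT OR REPLACE INTO daily_listen_count (day, listens) VALUES (?, ?)",
        (day.isoformat(), count),
    )
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_listen_count (day TEXT PRIMARY KEY, listens INTEGER)")
kv = {"listen_count:2020-08-13": b"12345"}  # made-up value
stored = store_daily_count(kv, conn, date(2020, 8, 13))
print(stored)
```

Because the write replaces any existing row for that day, the cron job can be rerun safely, and a graph only needs to read `daily_listen_count` ordered by `day`.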

      • ishaanshah
        sounds good :D
      • 2020-08-13 22618, 2020

      • ishaanshah
        I'll look into the graph part later then
      • 2020-08-13 22644, 2020

      • ishaanshah
        would be a good addition to sitewide stats
      • 2020-08-13 22630, 2020

      • iliekcomputers
        ++
      • 2020-08-13 22609, 2020

      • SothoTalKer_ has quit
      • 2020-08-13 22639, 2020

      • SothoTalKer joined the channel
      • 2020-08-13 22634, 2020

      • pristine___
        > maybe we can make a graph for that?
      • 2020-08-13 22640, 2020

      • pristine___
        <3
      • 2020-08-13 22658, 2020

      • v6lur has quit
      • 2020-08-13 22618, 2020

      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #490 (UserCollection…delete-collection-from-ES): Delete collection from Elasticsearch index https://github.com/bookbrainz/bookbrainz-site/pul…
      • 2020-08-13 22637, 2020

      • travis-ci joined the channel
      • 2020-08-13 22637, 2020

      • travis-ci
        Project bookbrainz-site build #3331: failed in 3 min 41 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22637, 2020

      • travis-ci has left the channel
      • 2020-08-13 22647, 2020

      • ishaanshah
        pristine___: ping
      • 2020-08-13 22647, 2020

      • iliekcomputers
        ruaok: i addressed your comments on https://github.com/metabrainz/listenbrainz-server…
      • 2020-08-13 22637, 2020

      • travis-ci joined the channel
      • 2020-08-13 22637, 2020

      • travis-ci
        Project bookbrainz-site build #3331: passed in 3 min 15 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22637, 2020

      • travis-ci has left the channel
      • 2020-08-13 22626, 2020

      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #483 (UserCollection…test-add/remove-items): tests for `collection/add` and `collection/remove` https://github.com/bookbrainz/bookbrainz-site/pul…
      • 2020-08-13 22611, 2020

      • CatQuest
        a question to indians: is https://www.vegrecipesofindia.com/jeera-aloo-reci… a "legit" recipe? or how would you make it?
      • 2020-08-13 22629, 2020

      • travis-ci joined the channel
      • 2020-08-13 22629, 2020

      • travis-ci
        Project bookbrainz-site build #3333: passed in 3 min 26 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22629, 2020

      • travis-ci has left the channel
      • 2020-08-13 22605, 2020

      • travis-ci joined the channel
      • 2020-08-13 22605, 2020

      • travis-ci
        Project bookbrainz-site build #3276: passed in 4 min 45 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22605, 2020

      • travis-ci has left the channel
      • 2020-08-13 22630, 2020

      • ruaok
      • 2020-08-13 22653, 2020

      • ruaok
        do you know what is going on with those two SVG files? I didn't modify them.
      • 2020-08-13 22611, 2020

      • ruaok
        reset hard does not get rid of them; rm and checkout does not either.
      • 2020-08-13 22613, 2020

      • pristine___
        ishaanshah: yeah
      • 2020-08-13 22624, 2020

      • iliekcomputers
        not sure.
      • 2020-08-13 22627, 2020

      • ishaanshah
        Hi
      • 2020-08-13 22631, 2020

      • iliekcomputers
        first time i'm seeing it.
      • 2020-08-13 22642, 2020

      • yvanzo has quit
      • 2020-08-13 22603, 2020

      • ishaanshah
        I was using MSID MBID mapping to improve the results for stats
      • 2020-08-13 22637, 2020

      • ishaanshah
        I ran into outOfMemory error while using it
      • 2020-08-13 22617, 2020

      • ishaanshah
        I just wanted to ask: would it cause an issue on prod?
      • 2020-08-13 22631, 2020

      • ishaanshah
        I mean is there any optimisation that can be made
      • 2020-08-13 22632, 2020

      • ruaok
        pristine___: bug fixed, working on a new dump now. with up-to-date data even
      • 2020-08-13 22640, 2020

      • yvanzo joined the channel
      • 2020-08-13 22603, 2020

      • ishaanshah
        I'll link the query, just a sec
      • 2020-08-13 22606, 2020

      • pristine___
        ruaok: yay. Will you upload it on FTP?
      • 2020-08-13 22619, 2020

      • ishaanshah
      • 2020-08-13 22629, 2020

      • ruaok
        yes, pristine___
      • 2020-08-13 22633, 2020

      • ishaanshah
        line 75
      • 2020-08-13 22640, 2020

      • pristine___
        A sec
      • 2020-08-13 22631, 2020

      • pristine___
        Why do you want to do a left join?
      • 2020-08-13 22622, 2020

      • ishaanshah
        So that we don't skip artists which haven't been mapped
      • 2020-08-13 22632, 2020

      • pristine___
        Not related to optimization, was just curious.
      • 2020-08-13 22634, 2020

      • pristine___
        Ah
      • 2020-08-13 22639, 2020

      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #481 (master…snyk-upgrade-bee51cd3c6853a5923dea1777d5e2eb2): [Snyk] Upgrade express from 4.16.3 to 4.17.1 https://github.com/bookbrainz/bookbrainz-site/pul…
      • 2020-08-13 22644, 2020

      • ishaanshah
        inner would skip those right
      • 2020-08-13 22632, 2020

      • pristine___
        OOM is generally when two huge tables are joined
      • 2020-08-13 22648, 2020

      • pristine___
        > inner would skip those right
      • 2020-08-13 22608, 2020

      • pristine___
        Yes. Rec uses inner since we strictly need MBIDs
      • 2020-08-13 22628, 2020

      • pristine___
        Broadcast join is one of the options.
      • 2020-08-13 22627, 2020

      • ishaanshah
        broadcast joins generally work with one small and one large table right?
      • 2020-08-13 22627, 2020

      • travis-ci joined the channel
      • 2020-08-13 22627, 2020

      • travis-ci
        Project bookbrainz-site build #3334: failed in 4 min 12 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22627, 2020

      • travis-ci has left the channel
      • 2020-08-13 22657, 2020

      • pristine___
        I am not sure. But the basic idea is that each executor should have a copy of the table to minimize shuffling and stuff. I had OOMs back then, I played a lot with driver memory, executor memory and other configs and came down to configs we use now. They might need to be changed with data size.
      • 2020-08-13 22616, 2020
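
The idea behind a broadcast join, and the left versus inner question raised earlier, can be illustrated in plain Python. This simulates the technique and is not Spark code: the small table is turned into an in-memory lookup (the copy each executor would hold) and the large table is streamed past it without any shuffling. It only pays off when one side is small enough to fit in memory, which matches the "one small and one large table" intuition. The function name, column names and sample rows are hypothetical, and the sketch assumes join keys are unique in the small table.

```python
def broadcast_join(big_rows, small_rows, key, how="left"):
    """Hash-join a large row stream against a small in-memory table."""
    if how not in ("left", "inner"):
        raise ValueError("how must be 'left' or 'inner'")
    # The "broadcast" side: small enough to keep whole in memory.
    lookup = {row[key]: row for row in small_rows}
    extra_cols = {col for row in small_rows for col in row if col != key}
    for row in big_rows:
        match = lookup.get(row[key])
        if match is not None:
            yield {**row, **match}
        elif how == "left":
            # Left join keeps unmapped rows and fills the mapping columns with None.
            yield {**row, **{col: None for col in extra_cols}}
        # Inner join silently drops rows without a mapping.


# Made-up sample data: two listens, only one of them mapped.
listens = [
    {"msid": "msid-a", "artist_name": "Mapped Artist"},
    {"msid": "msid-b", "artist_name": "Unmapped Artist"},
]
mapping = [{"msid": "msid-a", "artist_mbid": "mbid-1"}]

left = list(broadcast_join(listens, mapping, "msid", how="left"))
inner = list(broadcast_join(listens, mapping, "msid", how="inner"))
print(len(left), len(inner))
```

With `how="left"` the unmapped artist still appears in the stats, with `artist_mbid` set to `None`; with `how="inner"` it is dropped, which is fine for recommendations that strictly need MBIDs.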

      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #487 (UserCollection…addEntityToCollection-CollectionPage): Feat: Adding Entity from Collection Page https://github.com/bookbrainz/bookbrainz-site/pul…
      • 2020-08-13 22611, 2020

      • pristine___
        I would like to have a look at the error if possible
      • 2020-08-13 22637, 2020

      • ishaanshah
        oops I closed the terminal window
      • 2020-08-13 22642, 2020

      • ishaanshah
        I will reproduce it
      • 2020-08-13 22647, 2020

      • ishaanshah
        give me 2 mins
      • 2020-08-13 22630, 2020

      • travis-ci joined the channel
      • 2020-08-13 22630, 2020

      • travis-ci
        Project bookbrainz-site build #3334: failed in 4 min 14 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-08-13 22630, 2020

      • travis-ci has left the channel