#metabrainz

      • Mr_Monkey
        I'm not exactly sure what iliekcomputers had in mind in terms of tests, but at least we should make sure that we are calling the Spotify API with the 'right' arguments. I would recommend starting by reading the code for the `searchForSpotifyTrack` method and understanding what happens in it.
      • 2020-09-21 26500, 2020

      • Mr_Monkey
        Then, you'll want to create a test that makes sure that if I call searchForSpotifyTrack with `("mySpotifyToken123", "a beautiful track name", "dope artist", null)`, the Spotify API in turn is called with the right arguments. You'll want to make sure a call is made to `https://api.spotify.com/v1/search?q=track:a%20beautiful%20track%20name%20artist:dope%20artist&type=track`, with the right spotify token passed in to the
      • 2020-09-21 26500, 2020

      • Mr_Monkey
        Authorization header.
      • 2020-09-21 26529, 2020
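For readers following along, the check Mr_Monkey describes could be sketched roughly as below. This is a Python stand-in, not the actual frontend code: `search_for_spotify_track` and the injected `fetch` callable are hypothetical, and the real `searchForSpotifyTrack` may build its request differently. Only the expected URL and the idea of passing the token in the Authorization header come from the messages above; the `Bearer` prefix and the handling of the fourth (release) argument are assumptions.

```python
from unittest.mock import Mock
from urllib.parse import quote


def search_for_spotify_track(fetch, spotify_token, track_name, artist_name, release_name=None):
    """Hypothetical stand-in for searchForSpotifyTrack; `fetch` is injected so it can be mocked."""
    if not spotify_token:
        # Mirrors the kind of `if (!spotifyToken)` guard mentioned in the chat.
        raise ValueError("missing Spotify token")
    query_parts = []
    if track_name:
        query_parts.append(f"track:{track_name}")
    if artist_name:
        query_parts.append(f"artist:{artist_name}")
    if release_name:
        # Assumption: a release name, when given, is sent as an album filter.
        query_parts.append(f"album:{release_name}")
    # safe=":" keeps the field separators as-is while spaces become %20.
    query = quote(" ".join(query_parts), safe=":")
    url = f"https://api.spotify.com/v1/search?q={query}&type=track"
    # Assumption: the token is sent as an OAuth bearer token.
    return fetch(url, headers={"Authorization": f"Bearer {spotify_token}"})


def test_calls_spotify_with_expected_arguments():
    fake_fetch = Mock(return_value={"tracks": {"items": []}})
    search_for_spotify_track(
        fake_fetch, "mySpotifyToken123", "a beautiful track name", "dope artist", None
    )
    fake_fetch.assert_called_once_with(
        "https://api.spotify.com/v1/search?q=track:a%20beautiful%20track%20name%20artist:dope%20artist&type=track",
        headers={"Authorization": "Bearer mySpotifyToken123"},
    )
```

Injecting `fetch` is just one way to make the outgoing call observable; in the real test suite the network call would more likely be mocked with whatever the frontend tests already use.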

      • pristine___
        _lucifer: one for normalizing input and other for normalizing score. Maybe merge both in one.
      • 2020-09-21 26530, 2020

      • Mr_Monkey
        searchForSpotifyTrack can be found here: https://github.com/metabrainz/listenbrainz-server…
      • 2020-09-21 26553, 2020

      • ishaanshah
        > ishaanshah: the best option is to use a bigger pool of data like mhld or something. People seldom want to search manually for tracks even if they have a recommend artist ig
      • 2020-09-21 26553, 2020

      • ishaanshah
        Hmm, makes sense, thanks for clarifying my doubts and good job on the recs :tada:
      • 2020-09-21 26519, 2020

      • ishaanshah
        iliekcomputers: did you have a look at the doc i posted yesterday?
      • 2020-09-21 26544, 2020

      • Mr_Monkey
        abhinavohri: And anywhere you see a condition in the code (like `if (!spotifyToken)`), you'll want to add a separate test to make sure everything works as it should.
      • 2020-09-21 26549, 2020

      • pristine___
        ishaanshah: did you get some recs of Lauren Jenkins in top artist playlist?
      • 2020-09-21 26520, 2020

      • ishaanshah
        Nope its mostlu Carly Rae Jepsen
      • 2020-09-21 26524, 2020

      • ishaanshah
        mostly*
      • 2020-09-21 26527, 2020

      • pristine___
        Link?
      • 2020-09-21 26529, 2020

      • iliekcomputers
        ishaanshah: didn't get a chance yet, can you post it again, I'll read it after work today
      • 2020-09-21 26533, 2020

      • ishaanshah
        ishaanshah: the best option is to use a bigger pool of data like mhld or something. People seldom want to search manually for tracks even if they have a recommend artist ig
      • 2020-09-21 26558, 2020

      • _lucifer
        pristine___: also what are your views on adding a fake user, which has a listen count of one for recording in mb?
      • 2020-09-21 26503, 2020

      • pristine___
        ishaanshah: maybe because she wasn't in the mapping, I will have a look
      • 2020-09-21 26520, 2020

      • _lucifer
        *all recordings
      • 2020-09-21 26536, 2020

      • ishaanshah
      • 2020-09-21 26538, 2020

      • pristine___
        _lucifer: that will help in normalization?
      • 2020-09-21 26542, 2020

      • shivam-kapila
        woah _lucifer. calm down
      • 2020-09-21 26548, 2020

      • ishaanshah
        its just a rough guideline rn
      • 2020-09-21 26556, 2020

      • ishaanshah
        Have a lot to flesh out yet
      • 2020-09-21 26504, 2020

      • _lucifer
        pristine___: no that may help in increasing diversity of recs
      • 2020-09-21 26506, 2020

      • ishaanshah
        just wanted to make sure we are on the same page
      • 2020-09-21 26518, 2020

      • pristine___
        Can you explain how?
      • 2020-09-21 26521, 2020

      • _lucifer
        shivam-kapila: was just joking :)
      • 2020-09-21 26525, 2020

      • bitmap
        yvanzo: no response that I can see
      • 2020-09-21 26533, 2020

      • yvanzo
        is CSP worth a separate ticket?
      • 2020-09-21 26539, 2020

      • iliekcomputers
        ishaanshah: thanks
      • 2020-09-21 26551, 2020

      • bitmap
        yeah, let me finish creating that
      • 2020-09-21 26551, 2020

      • _lucifer
        pristine___: that would ensure that all recs in mb are present in the source dataset
      • 2020-09-21 26508, 2020

      • ishaanshah
        > ishaanshah: maybe because she wasn't in the mapping, I will have a look
      • 2020-09-21 26508, 2020

      • ishaanshah
        yeah maybe, there's very little data abt her on MB
      • 2020-09-21 26517, 2020

      • pristine___
        _lucifer: I love this idea, maybe we have to tweak the listen count of the fake user. I mean it will be similar to all or maybe none, given the listen count will be the same for all recordings.
      • 2020-09-21 26556, 2020

      • _lucifer
        i am actually thinking once the data is normalized, the rating for all recordings of the fake user can be set to the mean value of the scale.
      • 2020-09-21 26504, 2020
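A rough sketch of the fake-user idea as discussed so far. Everything here is an assumption for illustration: the function names, the per-user max scaling used as the normalization step, the 0 to 1 rating scale, and reading "mean value of the scale" as its midpoint. The actual listenbrainz-spark data model is not shown in this log.

```python
from typing import Dict, Iterable, List, Tuple

Rating = Tuple[str, str, float]  # (user, recording id, rating)


def normalize_listen_counts(listens: Dict[str, Dict[str, int]]) -> List[Rating]:
    """Scale each user's listen counts into (0, 1] by dividing by that user's top count."""
    ratings: List[Rating] = []
    for user, counts in listens.items():
        if not counts:
            continue  # a user with no listens contributes nothing
        top = max(counts.values())
        for recording, count in counts.items():
            ratings.append((user, recording, count / top))
    return ratings


def add_fake_user(
    ratings: List[Rating],
    all_recordings: Iterable[str],
    scale_min: float = 0.0,
    scale_max: float = 1.0,
    fake_user: str = "__fake_user__",  # hypothetical reserved user name
) -> List[Rating]:
    """Append one rating per known recording for the fake user, set to the scale midpoint."""
    midpoint = (scale_min + scale_max) / 2
    return ratings + [(fake_user, recording, midpoint) for recording in all_recordings]


listens = {"alice": {"rec-a": 4, "rec-b": 2}, "bob": {"rec-a": 1}}
all_recordings = ["rec-a", "rec-b", "rec-c"]  # "rec-c" has no real listens
ratings = add_fake_user(normalize_listen_counts(listens), all_recordings)
covered = {recording for _, recording, _ in ratings}
```

With the fake user added, `covered` contains every recording in `all_recordings`, including the one nobody has listened to, which is the property _lucifer describes.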

      • _lucifer
        yeah pristine___ right
      • 2020-09-21 26522, 2020

      • pristine___
        We will have to do some trick like the one you mentioned _lucifer because in any case we will have to use LB listens, because that's the aim, user listening history.
      • 2020-09-21 26516, 2020

      • _lucifer
        yup, that's the rough idea. we can sketch the implementation details later
      • 2020-09-21 26557, 2020

      • pristine___
        _lucifer: right. Can we fix a meeting this week (weekend maybe) or later to get a plan for this.
      • 2020-09-21 26507, 2020

      • _lucifer
        sure pristine___ , let me know what time/day works for you
      • 2020-09-21 26503, 2020

      • _lucifer system is unable to handle android studio load any longer so he will work on other *brainz till he gets a new system
      • 2020-09-21 26509, 2020

      • pristine___
        _lucifer: weekend, preferably Saturday. I hope my fever goes away by then :(
      • 2020-09-21 26551, 2020

      • _lucifer
        works for me. get well soon :)
      • 2020-09-21 26514, 2020

      • pristine___
      • 2020-09-21 26520, 2020

      • pristine___
        Lauren not in the mapping :(
      • 2020-09-21 26557, 2020

      • pristine___
        _lucifer: yeah, and it will be great if you could have a basic idea of the general flow. You can ping me anytime for that
      • 2020-09-21 26519, 2020

      • _lucifer
        thanks!, i'll study the working of listenbrainz-spark before the weekend
      • 2020-09-21 26521, 2020

      • shivam-kapila
        pristine___: If you dont mind then I would also like to help with recs
      • 2020-09-21 26536, 2020

      • _lucifer
        join in shivam-kapila :D
      • 2020-09-21 26548, 2020

      • pristine___
        shivam-kapila: hey
      • 2020-09-21 26558, 2020

      • pristine___
        So do you know about missing musicbrainz data?
      • 2020-09-21 26500, 2020

      • pristine___
        Endpoint
      • 2020-09-21 26505, 2020

      • shivam-kapila
        yep
      • 2020-09-21 26528, 2020

      • shivam-kapila
        that shows data in LB thats not in MB. Right?
      • 2020-09-21 26539, 2020

      • pristine___
        Right. So it mostly (almost all the time) gives data which is in LB but not in the *mapping*
      • 2020-09-21 26500, 2020

      • pristine___
        We initially thought it will give us data that is in LB and not in MB
      • 2020-09-21 26513, 2020

      • pristine___
        But ishaanshah and I verified, that's not the case
      • 2020-09-21 26553, 2020

      • pristine___
        The recs rn aren't diverse because of the restricted mapping and data source, if we improve the mapping, it will be a great thing
      • 2020-09-21 26504, 2020

      • pristine___
        Every week the endpoint is updated with new data
      • 2020-09-21 26509, 2020

      • shivam-kapila
        hmm
      • 2020-09-21 26523, 2020

      • pristine___
        That is in LB but maybe not in mapping or maybe not in MB
      • 2020-09-21 26536, 2020

      • pristine___
        So I was thinking, if we could use it wisely to improve the mapping
      • 2020-09-21 26512, 2020

      • pristine___
      • 2020-09-21 26515, 2020

      • pristine___
        I think here.
      • 2020-09-21 26545, 2020

      • pristine___
        ruaok: knows in detail about it, but it is something I really want to do :)
      • 2020-09-21 26548, 2020

      • yvanzo
        bitmap: I sent him another mail just about crediting in blog post (won't disclose otherwise).
      • 2020-09-21 26521, 2020

      • shivam-kapila
        pristine___: and how should the missing data be used to curate the mapping?
      • 2020-09-21 26528, 2020

      • pristine___
        shivam-kapila: I have no idea about it rn, I was thinking to research on it this weekend but since you pinged I ......
      • 2020-09-21 26530, 2020

      • pristine___
        :p
      • 2020-09-21 26555, 2020

      • shivam-kapila
        gotcha
      • 2020-09-21 26549, 2020

      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1705 (master…MBS-11094): MBS-11094: Don't block editing on pre-existing too early format https://github.com/metabrainz/musicbrainz-server/…
      • 2020-09-21 26551, 2020

      • BrainzBot
        MBS-11094: Edit error message appears (and prevents update) unrelated to current edits https://tickets.metabrainz.org/browse/MBS-11094
      • 2020-09-21 26531, 2020

      • reosarevok
        ^ this one is a recently introduced bug (aaand my fault)
      • 2020-09-21 26504, 2020

      • reosarevok
        bitmap, yvanzo: not sure if important enough to put it out today already
      • 2020-09-21 26550, 2020

      • abhinavohri
        @Mr_Monkey ok thank you.
      • 2020-09-21 26504, 2020

      • Mr_Monkey
        Let me know if you run into issues getting set up :)
      • 2020-09-21 26541, 2020

      • abhinavohri
        Mr_Monkey: ok.
      • 2020-09-21 26500, 2020

      • ruaok
        pristine___: do you have some examples to hand of things that are not in the mapping but should be?
      • 2020-09-21 26516, 2020

      • ruaok
        if I had more concrete examples to work with I can take another stab at improving things
      • 2020-09-21 26546, 2020

      • yvanzo
        reosarevok: it doesn't show up at all, neither warning nor error
      • 2020-09-21 26518, 2020

      • yvanzo
        reosarevok: tested with setting release year to 1017 on http://localhost:5000/release/0ce274e3-3b89-4d15-…
      • 2020-09-21 26506, 2020

      • reosarevok
        yvanzo: we currently lack a year for 12" vinyl
      • 2020-09-21 26513, 2020

      • reosarevok
        Set it to CD, or just Vinyl
      • 2020-09-21 26529, 2020

      • reosarevok
        (another step we need to work on is finding years for stuff we're missing, but :) )
      • 2020-09-21 26558, 2020

      • yvanzo
        Thanks but how do you test it actually since creating wrong releases is not allowed?
      • 2020-09-21 26537, 2020

      • reosarevok
        There's one example in the ticket
      • 2020-09-21 26547, 2020

      • reosarevok
      • 2020-09-21 26529, 2020

      • yvanzo
        reosarevok: it's not in sample data
      • 2020-09-21 26535, 2020

      • reosarevok
        Oh
      • 2020-09-21 26555, 2020

      • reosarevok
        In that case, I'd just turn the error off, add the release, then readd it
      • 2020-09-21 26501, 2020

      • bitmap
        or you could update it in the DB directly
      • 2020-09-21 26506, 2020

      • reosarevok
        There's also a report for releases released too long ago
      • 2020-09-21 26514, 2020

      • reosarevok
        If you have some in there, they might fulfill the requirements
      • 2020-09-21 26527, 2020

      • yvanzo
        yup, just wonder how you did test it then.
      • 2020-09-21 26545, 2020

      • yvanzo
      • 2020-09-21 26510, 2020

      • pristine___
        ruaok: mbids corresponding to these msids should be in mapping
      • 2020-09-21 26512, 2020

      • pristine___
      • 2020-09-21 26525, 2020

      • pristine___
        That's the only data I have rn, what kind of data/format will help you
      • 2020-09-21 26553, 2020

      • pristine___
        Let me know, I will try to send it from spark
      • 2020-09-21 26516, 2020

      • ruaok
        that's a pretty solid answer, thank you. :)
      • 2020-09-21 26552, 2020

      • supersandro2000 has quit
      • 2020-09-21 26522, 2020

      • supersandro2000 joined the channel
      • 2020-09-21 26524, 2020

      • ruaok
        pristine___: I suspect a bug someplace
      • 2020-09-21 26525, 2020

      • ruaok
      • 2020-09-21 26538, 2020

      • ruaok
      • 2020-09-21 26528, 2020

      • reosarevok
        yvanzo: with pink :)
      • 2020-09-21 26512, 2020

      • reosarevok
        I can't submit edits, but I don't need to either
      • 2020-09-21 26515, 2020

      • pristine___
        ruaok: is the mapping used by bono and on FTP same? I see the mapping ok FTP was last updated on 30 June 2020 and that is what we are using in the spark cluster?
      • 2020-09-21 26523, 2020

      • pristine___
        On*
      • 2020-09-21 26533, 2020

      • bitmap
        yvanzo: I requested a few minor changes https://github.com/metabrainz/musicbrainz-server/…, it seems okay otherwise
      • 2020-09-21 26502, 2020

      • iliekcomputers
        ishaanshah: left a few comments on your doc
      • 2020-09-21 26547, 2020

      • pristine___
        ruaok: Also, bono checks the mapping using artist and recording name, and the script uses recording msid and artist msid to check that, I am not sure if it can be one of the reasons for the mismatch. Just thinking out loud.
      • 2020-09-21 26554, 2020

      • yvanzo
        reosarevok: tested and made comments
      • 2020-09-21 26508, 2020

      • reosarevok
        Thanks, will see
      • 2020-09-21 26539, 2020

      • reosarevok
        Oh damn, just saw it fails a test too :) Will fix in a bit
      • 2020-09-21 26519, 2020

      • bitmap
        yvanzo: maybe you can commit the change to hourly.sh to the PR too?
      • 2020-09-21 26558, 2020

      • bitmap
        or if it only takes 4 days never mind
      • 2020-09-21 26534, 2020

      • yvanzo
        yup, it only takes 4 days
      • 2020-09-21 26516, 2020

      • yvanzo
        (or it keeps timing out forever ;)
      • 2020-09-21 26523, 2020

      • bitmap
        was thinking in case we have to redeploy the container, but
      • 2020-09-21 26555, 2020

      • bitmap
        the e.id thing seems to make some of the queries run orders of magnitude faster
      • 2020-09-21 26511, 2020

      • yvanzo
        well, we could run it on an old replicated database to get those 230K edits
      • 2020-09-21 26537, 2020

      • yvanzo
        that can be made any time later on though
      • 2020-09-21 26547, 2020

      • ruaok
        > ruaok: mbids corresponding to these msids should be in mapping
      • 2020-09-21 26557, 2020

      • ruaok
        ahhh, yes. I've been waiting for today.
      • 2020-09-21 26502, 2020

      • ruaok
        pristine___: ^^
      • 2020-09-21 26522, 2020

      • ruaok
        I've been trying to explain to you that you should NOT be using MSIDs for mapping.
      • 2020-09-21 26539, 2020

      • ruaok
        but STRINGS. this is why I put MATCHABLE strings into the mapping.
      • 2020-09-21 26555, 2020

      • ruaok
        I've tried to explain this to you a number of times, but have never succeeded.
      • 2020-09-21 26547, 2020
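To illustrate the distinction ruaok is drawing, here is a minimal sketch of looking listens up in a mapping by matchable strings instead of by MSID. The normalization below (strip accents, lowercase, drop punctuation and whitespace) is an assumption; the rules actually used to build the matchable strings in the mapping are not shown in this log, and `make_matchable`, the field names, and the placeholder MBID are hypothetical.

```python
import re
import unicodedata


def make_matchable(text: str) -> str:
    """Hypothetical normalization: strip accents, lowercase, drop punctuation and spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[\W_]+", "", without_marks.lower())


def build_mapping_index(mapping_rows):
    """Index mapping rows by (matchable artist name, matchable recording name)."""
    return {
        (make_matchable(row["artist_name"]), make_matchable(row["recording_name"])): row["recording_mbid"]
        for row in mapping_rows
    }


def lookup_listen(index, listen):
    """Return the mapped MBID for a listen, or None when no matchable pair exists."""
    key = (make_matchable(listen["artist_name"]), make_matchable(listen["track_name"]))
    return index.get(key)


mapping_rows = [
    {"artist_name": "Beyoncé", "recording_name": "Halo", "recording_mbid": "placeholder-mbid-1"},
]
index = build_mapping_index(mapping_rows)
match = lookup_listen(index, {"artist_name": "BEYONCE", "track_name": " Halo!"})
miss = lookup_listen(index, {"artist_name": "Unknown Artist", "track_name": "Nothing"})
```

The point of the sketch is that differently spelled submissions of the same artist and track collapse to one key, whereas each spelling would get its own MSID and could miss an MSID-based join.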

      • yvanzo
        bitmap: thanks, kept it as a separate commit since it seems worth noticing it.
      • 2020-09-21 26532, 2020

      • pristine___
        ruaok: Yeah, but you didn't comment on the join when I opened the PR. That's more relatable. Things slip otherwise :(
      • 2020-09-21 26540, 2020

      • pristine___
        Anyway, I will open a PR for this
      • 2020-09-21 26518, 2020

      • ruaok
        remember my comment about not being able to follow the flow of data through the system?
      • 2020-09-21 26542, 2020

      • pristine___
        No.
      • 2020-09-21 26522, 2020

      • ruaok
        this was the result of that comment https://github.com/metabrainz/listenbrainz-server…
      • 2020-09-21 26549, 2020

      • thomasross joined the channel
      • 2020-09-21 26513, 2020

      • pristine___
        ruaok: Right. But things, if not said for long, are assumed I guess. My situation is also kinda similar. Things aren't that clear since a lot of data is involved, every day we find new bugs. But yeah, I guess I troubled you a lot in understanding things. Will keep in mind :)
      • 2020-09-21 26535, 2020

      • ruaok
        :)
      • 2020-09-21 26556, 2020

      • nelgin has quit
      • 2020-09-21 26544, 2020

      • nelgin joined the channel
      • 2020-09-21 26503, 2020

      • _lucifer
        pristine___: couldn't find any old ticket. (maybe i had forgotten to open one) so i opened this LB-725
      • 2020-09-21 26504, 2020

      • BrainzBot
        LB-725: Normalize recordings of input recordings dataset https://tickets.metabrainz.org/browse/LB-725
      • 2020-09-21 26547, 2020

      • pristine___
        _lucifer: oh! No problem :)
      • 2020-09-21 26548, 2020

      • _lucifer
        (title was incorrect, fixed now)
      • 2020-09-21 26503, 2020

      • pristine___
        Thanks:) and thank you for that fake user idea <3
      • 2020-09-21 26534, 2020

      • _lucifer
        np :D
      • 2020-09-21 26557, 2020

      • ruaok
        pristine___: one comment with some questions on community...
      • 2020-09-21 26514, 2020

      • Freso
        <BANG>
      • 2020-09-21 26515, 2020

      • Freso
        It’s International Monday of Peace!