#metabrainz

      • ruaok
        we use this to show who the artists are in a compound string like "Queen & David Bowie"
      • 2020-09-18 26249, 2020

      • MajorLurker has quit
      • 2020-09-18 26227, 2020

      • pristine___
        Right, but I still don't understand how an artist mbid maps to multiple credit ids.
      • 2020-09-18 26256, 2020

      • ruaok
        it doesn't
      • 2020-09-18 26200, 2020

      • ruaok
        it's the other way around.
      • 2020-09-18 26219, 2020

      • ruaok
        one artist credit id maps to multiple artist_mbids
      • 2020-09-18 26204, 2020

      • pristine___
      • 2020-09-18 26248, 2020

      • ruaok
        uhm. that doesn't seem right, lol.
      • 2020-09-18 26220, 2020

      • pristine___
        right. I was working on a feature
      • 2020-09-18 26225, 2020

      • pristine___
        saw this.
      • 2020-09-18 26234, 2020

      • pristine___
        > one artist credit id maps to multiple artist_mbids
      • 2020-09-18 26205, 2020

      • pristine___
        ruaok: do you mean this?
      • 2020-09-18 26207, 2020

      • pristine___
      • 2020-09-18 26222, 2020

      • ruaok
        yes
      • 2020-09-18 26233, 2020

      • pristine___
      • 2020-09-18 26248, 2020

      • pristine___
        ruaok: I thought it was this :p
      • 2020-09-18 26222, 2020

      • ruaok
        no wait.
      • 2020-09-18 26233, 2020

      • ruaok
        hang on, I'm trying to debug this query. let me finish that.
      • 2020-09-18 26243, 2020

      • pristine___
        cool.
      • 2020-09-18 26201, 2020

      • pristine___
        ping me when you are done with it :)
      • 2020-09-18 26210, 2020

      • pristine___ goes to mop the floor
      • 2020-09-18 26205, 2020

      • ruaok
        actually, that is ok.
      • 2020-09-18 26250, 2020

      • ruaok
        that query "finds all of the artist credits that a given artist is in". Which could be said as "it looks up all of the collaborations that the artist has been involved in"
      • 2020-09-18 26243, 2020
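
The relationship discussed above can be sketched in a few lines of Python. This is only an illustration: the `ARTIST_CREDITS` table, its integer IDs, and the `mbid-...` strings are hypothetical placeholders, not real MusicBrainz values or the actual query. Each artist credit ID maps to exactly one list of artist MBIDs, and the reverse lookup finds every credit (collaboration) a given artist appears in.

```python
from collections import defaultdict

# Hypothetical sample data: artist_credit_id -> list of artist MBIDs.
# IDs and MBID strings are placeholders, not real MusicBrainz values.
ARTIST_CREDITS = {
    1: ["mbid-queen"],
    2: ["mbid-david-bowie"],
    3: ["mbid-queen", "mbid-david-bowie"],  # e.g. "Queen & David Bowie"
}


def credits_by_artist(artist_credits):
    """Invert the mapping: artist MBID -> sorted artist_credit_ids it appears in."""
    index = defaultdict(list)
    for credit_id, mbids in artist_credits.items():
        for mbid in mbids:
            index[mbid].append(credit_id)
    return {mbid: sorted(ids) for mbid, ids in index.items()}


index = credits_by_artist(ARTIST_CREDITS)
print(index["mbid-queen"])
```

Here the forward direction (credit ID to MBID list) is one-to-one, while the reverse direction (MBID to credit IDs) is one-to-many, which is what the query in question returns.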

      • ruaok
      • 2020-09-18 26243, 2020

      • MajorLurker joined the channel
      • 2020-09-18 26249, 2020

      • v6lur_ has quit
      • 2020-09-18 26205, 2020

      • MajorLurker has quit
      • 2020-09-18 26221, 2020

      • iliekcomputers
        shivam-kapila: did you get a chance to work on the follower/following component at all?
      • 2020-09-18 26204, 2020

      • ishaanshah
        iliekcomputers: ping
      • 2020-09-18 26212, 2020

      • iliekcomputers
        pong
      • 2020-09-18 26222, 2020

      • ishaanshah
        \o
      • 2020-09-18 26227, 2020

      • iliekcomputers
        how goes it?
      • 2020-09-18 26243, 2020

      • ishaanshah
        Going good, finally got some time today to work on the tests
      • 2020-09-18 26250, 2020

      • iliekcomputers
        nice nice
      • 2020-09-18 26258, 2020

      • ishaanshah
        I was reading your comments on the PR
      • 2020-09-18 26238, 2020

      • ishaanshah
        rn what I am doing is, check if an incremental import with id > x (last full import) is imported
      • 2020-09-18 26244, 2020

      • ishaanshah
        if not import it
      • 2020-09-18 26254, 2020

      • ishaanshah
        this way suppose RC crashes today
      • 2020-09-18 26209, 2020

      • ishaanshah
        we restart it tomorrow, and it imports tomorrow's dump
      • 2020-09-18 26216, 2020

      • ishaanshah
        we won't skip today's
      • 2020-09-18 26251, 2020

      • iliekcomputers
        if have the "last full import" filter on it, it could import the same incremental dump multiple times
      • 2020-09-18 26257, 2020

      • iliekcomputers
        if you have*
      • 2020-09-18 26258, 2020

      • ishaanshah
        no it wont
      • 2020-09-18 26206, 2020

      • ishaanshah
        I am checking if its already imported
      • 2020-09-18 26210, 2020

      • ishaanshah
        in the table
      • 2020-09-18 26217, 2020

      • iliekcomputers
        ah
      • 2020-09-18 26231, 2020

      • iliekcomputers
        ok, then why not just take the last imported dump from the table
      • 2020-09-18 26237, 2020

      • iliekcomputers
        instead of doing the last full import thing
      • 2020-09-18 26213, 2020

      • ishaanshah
        in case we miss some dump and the next dump is imported, we miss this dump forever
      • 2020-09-18 26221, 2020

      • iliekcomputers
        what i'm trying to get at is a guarantee that the data in spark is never in an invalid state
      • 2020-09-18 26221, 2020

      • ishaanshah
        just a bit more general case
      • 2020-09-18 26243, 2020

      • iliekcomputers
        so essentially if we miss dump x, i'd enforce on request consumer that we do not import dump x+1 without importing x
      • 2020-09-18 26245, 2020

      • ishaanshah
        > what i'm trying to get at is a guarantee that the data in spark is never in an invalid state
      • 2020-09-18 26246, 2020

      • ishaanshah
        yep I am making sure of this
      • 2020-09-18 26219, 2020

      • iliekcomputers
        meaning that if request consumer crashes today, then tomorrow's import should import today's dump and tomorrow's dump.
      • 2020-09-18 26237, 2020

      • iliekcomputers
        and if request consumer tries to import just tomorrow's dump somehow, it would error out
      • 2020-09-18 26238, 2020

      • ishaanshah
        yes that will happen
      • 2020-09-18 26253, 2020

      • ishaanshah
        oh
      • 2020-09-18 26255, 2020

      • ishaanshah
        the error
      • 2020-09-18 26202, 2020

      • ishaanshah
        ah, I got it
      • 2020-09-18 26207, 2020

      • iliekcomputers
        right, sorry for not being clear
      • 2020-09-18 26212, 2020

      • ishaanshah
        so in case a dump doesn't have SHA
      • 2020-09-18 26216, 2020

      • ishaanshah
        and it fails
      • 2020-09-18 26225, 2020

      • ishaanshah
        we should just stop the import right?
      • 2020-09-18 26229, 2020

      • iliekcomputers
        there should be no way for request consumer to go from x to x+2 at all
      • 2020-09-18 26233, 2020

      • iliekcomputers
        yes
      • 2020-09-18 26243, 2020

      • ishaanshah
        yep yep got it
      • 2020-09-18 26257, 2020

      • ishaanshah
        Yeah that is missing, I will add that
      • 2020-09-18 26201, 2020

      • ishaanshah
        good catch
      • 2020-09-18 26235, 2020
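
A minimal sketch of the rule discussed above, assuming dump IDs are consecutive integers (the chat's "x", "x+1" wording suggests this, but it is an assumption). The names `dumps_to_import` and `DumpGapError` are hypothetical and do not come from the request consumer code. The idea is to catch up on every missed dump in order and to error out rather than jump from x to x+2.

```python
class DumpGapError(Exception):
    """Raised when an import would skip a dump that has not been imported."""


def dumps_to_import(last_imported_id, available_ids):
    """Return the dump IDs to import, in order, right after last_imported_id.

    Raises DumpGapError if an ID in the sequence is missing, so the caller
    never moves from dump x to dump x+2 without importing x+1.
    Assumes dump IDs are consecutive integers (an assumption of this sketch).
    """
    pending = sorted(i for i in available_ids if i > last_imported_id)
    expected = last_imported_id + 1
    for dump_id in pending:
        if dump_id != expected:
            raise DumpGapError(
                f"dump {expected} is missing; refusing to import {dump_id}"
            )
        expected += 1
    return pending


# If the consumer crashed "today" (dump 11), the next run imports 11 and 12.
print(dumps_to_import(10, [9, 10, 11, 12]))
```

A dump that fails validation (for example, one without a SHA file) could simply be left out of `available_ids`, in which case the gap check stops the import at that point.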

      • iliekcomputers
        for full dump imports, we just need to check that the id of the dump we're importing is greater than the id in spark.
      • 2020-09-18 26251, 2020

      • ishaanshah
        I was just importing the latest one
      • 2020-09-18 26205, 2020

      • ishaanshah
        it wont matter right?
      • 2020-09-18 26213, 2020

      • iliekcomputers
        mhmm, adding a check would still be good.
      • 2020-09-18 26224, 2020

      • iliekcomputers
        just to make sure we don't delete data :)
      • 2020-09-18 26245, 2020

      • ishaanshah
        ok, will do
      • 2020-09-18 26250, 2020

      • iliekcomputers
        like suppose if we tried to import a valid full dump with ID x - 3, when the cluster was at x
      • 2020-09-18 26208, 2020

      • ishaanshah
        yeah but for the next incremental import
      • 2020-09-18 26217, 2020

      • ishaanshah
        it will import x+1, x+2, x+3
      • 2020-09-18 26221, 2020

      • ishaanshah
        so wont be an issue
      • 2020-09-18 26254, 2020

      • iliekcomputers
        yeah, i guess that's true. but still keeping the dump IDs as linearly increasing as possible would be great
      • 2020-09-18 26255, 2020

      • ishaanshah
        sorry x-2, x-1, x
      • 2020-09-18 26222, 2020

      • iliekcomputers
        that way it's easy to reason about stuff
      • 2020-09-18 26231, 2020

      • ishaanshah
        yep, that would be better, I will update it
      • 2020-09-18 26206, 2020
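
The full-dump check suggested above could look roughly like this. The function name and the use of `None` for "nothing imported yet" are assumptions made for illustration, not the actual ListenBrainz code.

```python
def should_import_full_dump(candidate_id, current_id):
    """Return True only if the candidate full dump is newer than the loaded one.

    current_id is None when no full dump has been imported yet.
    """
    return current_id is None or candidate_id > current_id


x = 100
print(should_import_full_dump(x - 3, x))  # an older dump is rejected
print(should_import_full_dump(x + 1, x))  # a newer dump is accepted
```

Rejecting older IDs keeps the imported dump IDs monotonically increasing, which is the "easy to reason about" property mentioned above.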

      • ishaanshah
        About the yearly Reports
      • 2020-09-18 26216, 2020

      • ishaanshah
        I would be happy to work on that after this PR
      • 2020-09-18 26236, 2020

      • ishaanshah
        My half sem course is ending next week so I'll get more time too
      • 2020-09-18 26207, 2020

      • ishaanshah
        I will spend some time on this weekend writing a rough draft of what we should do and how we should do it
      • 2020-09-18 26255, 2020

      • iliekcomputers
        sounds good! That's exactly what i was gonna suggest :)
      • 2020-09-18 26213, 2020

      • ishaanshah
        :D
      • 2020-09-18 26222, 2020

      • shivam-kapila
        iliekcomputers: yep I had worked on it. Will get it completed by tomorrow
      • 2020-09-18 26241, 2020

      • iliekcomputers
        shivam-kapila: awesome, a PR tomorrow would be great!
      • 2020-09-18 26247, 2020

      • shivam-kapila
        iliekcomputers: I don't have as many transitions for the button as we have now
      • 2020-09-18 26201, 2020

      • shivam-kapila
        Simple purple buttons like we have now
      • 2020-09-18 26217, 2020

      • shivam-kapila
        Like I showed in figma***
      • 2020-09-18 26225, 2020

      • iliekcomputers
        the FollowButton component is pretty plug and play
      • 2020-09-18 26231, 2020

      • iliekcomputers
        we can just reuse that
      • 2020-09-18 26228, 2020

      • shivam-kapila
        Yeah using it
      • 2020-09-18 26236, 2020

      • shivam-kapila
        Just modified stylings
      • 2020-09-18 26213, 2020

      • iliekcomputers
        sounds good
      • 2020-09-18 26246, 2020

      • thomasross joined the channel
      • 2020-09-18 26207, 2020

      • abhinavohri joined the channel
      • 2020-09-18 26241, 2020

      • pristine___
      • 2020-09-18 26203, 2020

      • ruaok
        that is not valid.
      • 2020-09-18 26219, 2020

      • pristine___
        A sec
      • 2020-09-18 26234, 2020

      • pristine___
        > an artist_credit_id can map to a variety of artist_mbids
      • 2020-09-18 26240, 2020

      • ruaok
        an artist_credit_id will always map to one unique list of artist_mbids
      • 2020-09-18 26245, 2020

      • pristine___
        Right
      • 2020-09-18 26259, 2020

      • pristine___
        > also one artist_credit_id will always map to a single *artist_mbids* (note that i said artist_mbids meaning the list of mbids of length >=1)
      • 2020-09-18 26206, 2020

      • pristine___
        I framed this idea so badly
      • 2020-09-18 26208, 2020

      • pristine___
        :p
      • 2020-09-18 26224, 2020

      • pristine___
        Unique list, that was the word 😂
      • 2020-09-18 26240, 2020

      • pristine___
        ruaok: thanks
      • 2020-09-18 26259, 2020

      • ruaok
        np
      • 2020-09-18 26254, 2020

      • abhinavohri
        i want to work on LB-682. Can someone suggest how I should approach it?
      • 2020-09-18 26254, 2020

      • BrainzBot
        LB-682: Improve the process for importing listens into HDFS https://tickets.metabrainz.org/browse/LB-682
      • 2020-09-18 26238, 2020

      • ishaanshah
        abhinavohri: Hi, I think I fixed the first point in a PR before
      • 2020-09-18 26252, 2020

      • ishaanshah
        The other two points are still open though
      • 2020-09-18 26244, 2020

      • ishaanshah
      • 2020-09-18 26207, 2020

      • ishaanshah
        these are the files related to the ticket
      • 2020-09-18 26248, 2020

      • ishaanshah
      • 2020-09-18 26248, 2020

      • ishaanshah
        over here we should create the directory explicitly
      • 2020-09-18 26227, 2020

      • abhinavohri
        ishaanshah: ok
      • 2020-09-18 26231, 2020

      • MajorLurker joined the channel
      • 2020-09-18 26242, 2020

      • abhinavohri
        ishaanshah: Also, please suggest some other ticket for me related to Flask or React.
      • 2020-09-18 26205, 2020

      • ishaanshah
      • 2020-09-18 26206, 2020

      • BrainzBot
        LB-643: Improve the listening activity query to make it more scalable
      • 2020-09-18 26220, 2020

      • ishaanshah
        if you are interested in something more interesting after this you could take a look at this
      • 2020-09-18 26252, 2020

      • abhinavohri
        @ishaanshah ok thank you.
      • 2020-09-18 26256, 2020

      • ishaanshah
      • 2020-09-18 26257, 2020

      • BrainzBot
        LB-516: Rewrite the last.fm importer retry logic to be iterative.
      • 2020-09-18 26212, 2020

      • ishaanshah
        this can be a good starting point for React based tickets
      • 2020-09-18 26233, 2020

      • MajorLurker has quit
      • 2020-09-18 26201, 2020

      • pristine___
        ruaok: how often is the mapping dump updated? I'm asking to understand whether we should import the mapping into the spark cluster before generating recs.
      • 2020-09-18 26209, 2020

      • pristine___
        Every week
      • 2020-09-18 26252, 2020

      • ruaok
        0 4 * * 1,5
      • 2020-09-18 26202, 2020

      • ruaok is sure that pristine___ speaks crontab now
      • 2020-09-18 26203, 2020

      • ruaok
        :)
      • 2020-09-18 26241, 2020
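
For readers who do not speak crontab: the five fields of `0 4 * * 1,5` are minute, hour, day of month, month, and day of week, so the job runs at 04:00 on Mondays and Fridays, i.e. twice a week. The sketch below lists upcoming run times for that one expression. The helper `next_runs` is hypothetical and only handles this simple shape (fixed minute and hour, `*` for day of month and month, a comma-separated weekday list); it also ignores whatever time zone the server uses.

```python
from datetime import datetime, timedelta

CRON = "0 4 * * 1,5"  # minute hour day-of-month month day-of-week


def next_runs(start, count=3):
    """List the next `count` run times strictly after `start` for CRON.

    Handles only a fixed minute/hour with a comma-separated day-of-week list
    (cron convention: 0 or 7 = Sunday, 1 = Monday, ..., 5 = Friday).
    """
    minute, hour, _dom, _month, dow = CRON.split()
    cron_days = {int(d) % 7 for d in dow.split(",")}
    runs = []
    day = start.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    while len(runs) < count:
        # isoweekday(): Monday=1 ... Sunday=7; "% 7" maps Sunday to cron's 0.
        if day > start and day.isoweekday() % 7 in cron_days:
            runs.append(day)
        day += timedelta(days=1)
    return runs


# Starting from midday on Friday 2020-09-18, the date of this log.
for run in next_runs(datetime(2020, 9, 18, 12, 0)):
    print(run.strftime("%A %Y-%m-%d %H:%M"))
```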

      • alastairp
      • 2020-09-18 26244, 2020

      • chaban
      • 2020-09-18 26203, 2020

      • pristine___
        ruaok: thanks
      • 2020-09-18 26226, 2020

      • revi
        Looks like Vector is second-class here ~_~ :P https://usercontent.irccloud-cdn.com/file/65KRY2E…
      • 2020-09-18 26241, 2020

      • pristine___
        alastairp: thanks. Will have a look
      • 2020-09-18 26224, 2020

      • alastairp
        it looks like recsys is next week, and you'll have to pay to attend the (virtual) conference, but the paper will be available after
      • 2020-09-18 26218, 2020

      • Mr_Monkey
        Hi zas! Could you please talk me through how to back files up on prince? Up until now we were only generating public dumps for BB, but now we need private dumps too, and a way to store them somewhere. Needless to say, I currently know next to nothing about how the other projects do it…
      • 2020-09-18 26212, 2020

      • alastairp
        Mr_Monkey: that's done here: https://github.com/metabrainz/borg-backup
      • 2020-09-18 26233, 2020

      • pristine___
        shivam-kapila: did you open the rec?page= ticket?
      • 2020-09-18 26227, 2020

      • Mr_Monkey
        alastairp: Would that back up the entire node though?
      • 2020-09-18 26217, 2020

      • alastairp
        Mr_Monkey: see for example https://github.com/metabrainz/borg-backup/blob/ma…, there's a path of the thing that you want to back up (in this case /var/lib/docker/volumes/acousticbrainz-web-data-volume-prod)
      • 2020-09-18 26253, 2020

      • Mr_Monkey
        Merci, I'll have a look
      • 2020-09-18 26235, 2020

      • pristine___
        ruaok: if artist a and recording b are in MB, but not linked, what is the process to do that?