#metabrainz

      • ruaok
        create the new table, create indexes, and then in one transaction rename the tables, effectively swapping them into place. once that transaction finishes, drop the old table.
      • 2021-03-04 06321, 2021
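
The swap pattern described above, sketched in SQL (all table, index, and column names here are placeholders, not the actual schema):

```sql
-- build the replacement off to the side
CREATE TABLE similar_user_new (LIKE similar_user);
-- ... bulk-load the new data into similar_user_new ...
CREATE INDEX similar_user_new_user_id_ndx ON similar_user_new (user_id);

-- swap in one transaction so readers never see a missing table
BEGIN;
ALTER TABLE similar_user RENAME TO similar_user_old;
ALTER TABLE similar_user_new RENAME TO similar_user;
COMMIT;

DROP TABLE similar_user_old;
```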

      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1955 (master…MBS-8678): MBS-8678 / MBS-11421: Split artist improvements https://github.com/metabrainz/musicbrainz-server/…
      • 2021-03-04 06304, 2021

      • iliekcomputers
        I see.
      • 2021-03-04 06319, 2021

      • ruaok
        how can we go about not duplicating the table definition?
      • 2021-03-04 06326, 2021

      • Rohan_Pillai joined the channel
      • 2021-03-04 06304, 2021

      • ruaok
        I wonder if I can easily copy the table definition from an existing table. that would solve this problem
      • 2021-03-04 06337, 2021

      • iliekcomputers
        That seems possible
      • 2021-03-04 06317, 2021

      • Mineo
        create table foo like bar?
      • 2021-03-04 06346, 2021

      • ruaok
        no wai.
      • 2021-03-04 06349, 2021

      • iliekcomputers
        Yep
      • 2021-03-04 06351, 2021

      • ruaok
        that is exactly what I need. :) lol.
      • 2021-03-04 06357, 2021

      • ruaok
        thanks Mineo.
      • 2021-03-04 06320, 2021

      • ruaok
        well, still not perfect.
      • 2021-03-04 06349, 2021

      • ruaok
        I'll need to create indexes and constraints AFTER the data import.
      • 2021-03-04 06315, 2021

      • ruaok
        but, I can work with that.
      • 2021-03-04 06343, 2021

      • iliekcomputers
        It takes an "excluding" param as well
      • 2021-03-04 06302, 2021

      • iliekcomputers
        Create table foo like bar excluding indexes
      • 2021-03-04 06323, 2021
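
For reference, in Postgres the LIKE clause is parenthesized, and by default it copies only column definitions and NOT NULL constraints (no indexes, no defaults). A sketch, with `foo`/`bar` as in the messages above:

```sql
-- default: columns and NOT NULL constraints only
CREATE TABLE foo (LIKE bar);

-- or copy everything, then opt out of what will be rebuilt after the import
CREATE TABLE foo2 (LIKE bar INCLUDING ALL EXCLUDING INDEXES);
```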

      • ruaok
        I'll add comments to create_indexes saying that if changes are made to a table index there, this code needs updating as well.
      • 2021-03-04 06330, 2021

      • ruaok
        that should suffice for now.
      • 2021-03-04 06337, 2021

      • iliekcomputers
        For both the indexes and the constraints
      • 2021-03-04 06349, 2021

      • ruaok
        yeah
      • 2021-03-04 06351, 2021

      • iliekcomputers
        If you calculate data for a deleted user, insert it and then add the foreign key constraint, will there be a way to delete rows that break the constraint?
      • 2021-03-04 06305, 2021

      • iliekcomputers
        Or will it need to be done by hand
      • 2021-03-04 06350, 2021

      • ruaok
        not sure. let me see.
      • 2021-03-04 06339, 2021
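
One standard way to handle this in SQL rather than by hand is to delete the orphan rows first, then add the constraint (a sketch; table, column, and constraint names are illustrative):

```sql
-- remove rows calculated for users that no longer exist
DELETE FROM similar_user_new s
 WHERE NOT EXISTS (SELECT 1 FROM "user" u WHERE u.id = s.user_id);

-- now the constraint can be added without violations
ALTER TABLE similar_user_new
  ADD CONSTRAINT similar_user_new_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES "user" (id) ON DELETE CASCADE;
```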

      • CatQuest has left the channel
      • 2021-03-04 06359, 2021

      • CatQuest joined the channel
      • 2021-03-04 06353, 2021

      • Rohan_Pillai has quit
      • 2021-03-04 06321, 2021

      • Mr_Monkey
        iliekcomputers: Thoughts about this? https://chatlogs.metabrainz.org/brainzbot/metabra…
      • 2021-03-04 06330, 2021

      • iliekcomputers
        Mr_Monkey: sorry I missed that message, the types make sense to me
      • 2021-03-04 06328, 2021

      • Mr_Monkey
        OK, let me formalize that in the design doc
      • 2021-03-04 06316, 2021

      • alastairp
        sorry ruaok, I thought your question was "what should we name the indexes", and unfortunately you caught me just as I was heading out the door. It looks like you may have answered your question with iliekcomputers and Mineo?
      • 2021-03-04 06310, 2021

      • ruaok
        more or less, yes.
      • 2021-03-04 06333, 2021

      • ruaok
        except doing bulk inserts with sqlalchemy is.... alchemy.
      • 2021-03-04 06353, 2021

      • alastairp
        ah yeah. one sec, I have a pattern to do that
      • 2021-03-04 06355, 2021

      • ruaok
        i really want to be using psycopg2 in spark_reader, not sqlalchemy.
      • 2021-03-04 06305, 2021

      • alastairp
        let me see if I can remember where I did it
      • 2021-03-04 06332, 2021

      • Mr_Monkey
        iliekcomputers: Have you decided on the structure of the API endpoint to GET timeline events?
      • 2021-03-04 06341, 2021

      • iliekcomputers
        Mr_Monkey: it would probably take the same params as the existing feed/listens endpoint.
      • 2021-03-04 06314, 2021

      • iliekcomputers
        I think we'll just rename the feed/listens endpoint to be more generic and then return all the data from there.
      • 2021-03-04 06304, 2021

      • Mr_Monkey
        `GET /user/XXXX/timeline` ?
      • 2021-03-04 06342, 2021

      • Mr_Monkey
        or `GET /user/XXXX/feed`?
      • 2021-03-04 06343, 2021

      • Rohan_Pillai joined the channel
      • 2021-03-04 06314, 2021

      • iliekcomputers
        Either of those is ok with me, do you have a preference?
      • 2021-03-04 06303, 2021

      • Mr_Monkey
        I feel like 'feed' is a more standard description for what we're doing.
      • 2021-03-04 06326, 2021

      • Mr_Monkey
        Another option is `/1/user/$user_name/feed/events`, especially if we want endpoints for specific event types like `/1/user/$user_name/feed/events/recommendation`
      • 2021-03-04 06319, 2021

      • iliekcomputers
        I guess because it's json, we'll want it behind the api prefix, so /1/user/username/feed/events makes sense to me.
      • 2021-03-04 06305, 2021

      • iliekcomputers
        Do we want different endpoints for different types of events? I think having just the one endpoint would be easier for the front-end.
      • 2021-03-04 06348, 2021
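
A single endpoint with an optional type filter would keep the front-end simple. A minimal sketch of that filtering logic (function name, event shapes, and the query-parameter idea are hypothetical, not the actual ListenBrainz API):

```python
from typing import List, Optional


def get_feed_events(events: List[dict], event_type: Optional[str] = None) -> List[dict]:
    """Return all events, or only those matching event_type if one is given.

    With this shape, /1/user/<name>/feed/events serves everything, and a
    per-type view becomes a query param (?type=recommendation) instead of
    one route per event type.
    """
    if event_type is None:
        return events
    return [e for e in events if e.get("type") == event_type]


events = [
    {"type": "recommendation", "data": "..."},
    {"type": "follow", "data": "..."},
]
print(len(get_feed_events(events)))                    # 2
print(len(get_feed_events(events, "recommendation")))  # 1
```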

      • ruaok
        _lucifer: I found one problem with my spec. :(
      • 2021-03-04 06311, 2021

      • ruaok
        we should be using user_id (int), not user_name (str), for the data that gets returned from spark.
      • 2021-03-04 06316, 2021

      • ruaok
        sorry :(
      • 2021-03-04 06303, 2021

      • _lucifer
        that should only be a matter of deleting code, and it saves us two joins methinks :)
      • 2021-03-04 06330, 2021

      • _lucifer
        i'll change it once i get back
      • 2021-03-04 06341, 2021

      • ruaok
        sweet, thanks.
      • 2021-03-04 06341, 2021

      • iliekcomputers
        I'm not completely sure we have the same user IDs in spark and postgres
      • 2021-03-04 06343, 2021

      • alastairp
        urgh. no idea where I had that sqlalchemy-core bulk insert code
      • 2021-03-04 06348, 2021

      • ruaok
        makes the data smaller too.
      • 2021-03-04 06305, 2021

      • ruaok
        alastairp: used livegrep?
      • 2021-03-04 06308, 2021

      • alastairp
        yep
      • 2021-03-04 06315, 2021

      • _lucifer
        iliekcomputers: yeah, that's what i want to check
      • 2021-03-04 06320, 2021

      • Mr_Monkey
        iliekcomputers: I think the front-end only needs the single endpoint. I was thinking of possible future use-cases if we need more precision, but can't think of a great example.
      • 2021-03-04 06326, 2021

      • alastairp
        is it not just as simple as `connection.execute(query, [list, of,items,to,insert])` ?
      • 2021-03-04 06344, 2021

      • ruaok
        alastairp: that works, yes, but it's stupidly slow.
      • 2021-03-04 06350, 2021


      • alastairp
        ruaok: I recall that somewhere else you just got a raw engine from the connection and did bulk insert
      • 2021-03-04 06302, 2021

      • ruaok
        we can add yet another module to make it happen...
      • 2021-03-04 06313, 2021

      • ruaok
        that would be ideal.
      • 2021-03-04 06337, 2021

      • alastairp
        ruaok: ah, right. I have a funny feeling that there was a flag that you had to set to tell sql-a to do a bulk query in this case instead of multiple inserts
      • 2021-03-04 06314, 2021

      • ruaok
        I have this all working in mbid_mapping, which uses psycopg2.
      • 2021-03-04 06315, 2021

      • alastairp
        you want to make an sql query like `INSERT INTO x (a, b) VALUES (1,2), (3,4), (4,5)`, right?
      • 2021-03-04 06332, 2021

      • ruaok
        yes, with like 10k rows per statement.
      • 2021-03-04 06310, 2021
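
Multi-row VALUES statements like the one above can be built in batches. In practice psycopg2's `execute_values` does this (including proper escaping); the string-building below is an illustration of the chunking only and skips quoting, so it is safe for numeric values only:

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def build_insert(table, columns, rows):
    """Build one multi-row INSERT statement (numeric values only; no escaping)."""
    values = ", ".join("(" + ", ".join(str(v) for v in row) + ")" for row in rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


# 25k rows at 10k per statement gives three statements (10k + 10k + 5k)
rows = [(i, i * 2) for i in range(25000)]
statements = [build_insert("x", ("a", "b"), chunk) for chunk in chunked(rows, 10000)]
print(len(statements))  # 3
```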


      • ruaok
        the spark reader also runs flask for its logging purposes, which is odd. thus sqlalchemy.
      • 2021-03-04 06308, 2021

      • ruaok
        and the connection strings are not formatted right for connecting to psycopg2 directly.
      • 2021-03-04 06309, 2021


      • _lucifer
        ruaok: https://github.com/metabrainz/listenbrainz-server… it seems we only have the user_name in spark
      • 2021-03-04 06304, 2021

      • alastairp
        right - this will give you a psycopg2 connection which you can use execute_values with
      • 2021-03-04 06322, 2021
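
The approach alastairp describes, sketched below. It assumes an SQLAlchemy engine backed by a Postgres/psycopg2 URL; the function name and the table/column parameters are placeholders:

```python
def bulk_insert_rows(engine, table, columns, rows, page_size=10000):
    """Bulk-insert rows via psycopg2's execute_values, using the raw DBAPI
    connection underneath an SQLAlchemy engine."""
    # deferred import: only needed when actually inserting
    from psycopg2.extras import execute_values

    query = f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s"
    conn = engine.raw_connection()  # unwrap the psycopg2 connection
    try:
        with conn.cursor() as cur:
            # execute_values batches rows into multi-row VALUES statements
            execute_values(cur, query, rows, page_size=page_size)
        conn.commit()
    finally:
        conn.close()
```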

      • ruaok
        ok, I'll give that a shot.
      • 2021-03-04 06325, 2021

      • alastairp
        but goddamnit I solved this in sqlalchemy, I just can't remember where
      • 2021-03-04 06347, 2021

      • ruaok
        _lucifer: oh jeez, this simple problem keeps getting worse by the second. :(
      • 2021-03-04 06329, 2021

      • _lucifer
        yeah :(
      • 2021-03-04 06301, 2021

      • ruaok
        iliekcomputers: alastairp: _lucifer : can we change this table to use user_name instead of user_id ?
      • 2021-03-04 06301, 2021


      • alastairp
        that'll have an effect if a user changes their name
      • 2021-03-04 06328, 2021

      • ruaok
        otherwise I need to reprocess each of the rows in order to look up the name
      • 2021-03-04 06341, 2021

      • alastairp
        is it because spark uses a public dump that doesn't have row ids?
      • 2021-03-04 06344, 2021

      • _lucifer
        i think the user cannot change the username
      • 2021-03-04 06348, 2021

      • reosarevok
        zas, ruaok: https://blog.metabrainz.org/2021/03/01/musicbrain… did we have any known issues earlier today? I don't think I noticed any alerts at all today
      • 2021-03-04 06352, 2021

      • Rohan_Pillai has quit
      • 2021-03-04 06358, 2021

      • ruaok
        alastairp: yes, no user_ids in spark.
      • 2021-03-04 06306, 2021

      • _lucifer
        it's a long-standing open issue because listens are closely tied to usernames
      • 2021-03-04 06325, 2021

      • alastairp
        _lucifer: that was an issue with the old listen storage, but should be less of an issue now
      • 2021-03-04 06331, 2021

      • ruaok
        reosarevok: sounds like a block not a service issue. have them give us their IP
      • 2021-03-04 06359, 2021

      • ruaok
        the other issue that I need to deal with is fixing up the FKs.
      • 2021-03-04 06314, 2021

      • _lucifer
        LB-383
      • 2021-03-04 06315, 2021

      • BrainzBot
        LB-383: Allow updating usernames when they're changed in MusicBrainz https://tickets.metabrainz.org/browse/LB-383
      • 2021-03-04 06316, 2021

      • alastairp
        given the fact that recommendation.similar_user.user_id has an FK, I'm not convinced that we should change it to a username
      • 2021-03-04 06318, 2021

      • ruaok
        in case a user is deleted. I suppose I can handle both in one go, but that makes this table processing a lot worse.
      • 2021-03-04 06336, 2021

      • alastairp
        but it means we'd have to do a separate query/pass through the data to get that mapping?
      • 2021-03-04 06344, 2021

      • ruaok
        yep.
      • 2021-03-04 06347, 2021
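
That mapping pass is cheap if the user table is fetched once into a dict. A sketch (names and row shapes are hypothetical; real code would query the user table rather than take it as a list):

```python
def map_usernames_to_ids(rows, users):
    """Replace user_name with user_id in spark output rows, dropping rows
    whose user no longer exists (e.g. deleted accounts).

    rows:  [(user_name, payload), ...] as produced by spark
    users: [(user_id, user_name), ...] as read from the user table
    """
    name_to_id = {name: uid for uid, name in users}
    return [(name_to_id[name], payload) for name, payload in rows
            if name in name_to_id]


users = [(1, "alice"), (2, "bob")]
rows = [("alice", "data-a"), ("deleted_user", "data-x"), ("bob", "data-b")]
print(map_usernames_to_ids(rows, users))  # [(1, 'data-a'), (2, 'data-b')]
```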

      • reosarevok
        ruaok: thanks, done
      • 2021-03-04 06328, 2021

      • alastairp
        ruaok: and just to confirm, you're not doing _any_ processing on that data between when it comes in and when you throw it into execute?
      • 2021-03-04 06350, 2021

      • ruaok
        I wasn't planning on doing any. but it looks like I do now.
      • 2021-03-04 06337, 2021

      • nawcom_ joined the channel
      • 2021-03-04 06338, 2021

      • nawcom has quit
      • 2021-03-04 06338, 2021

      • nawcom_ has quit
      • 2021-03-04 06333, 2021

      • sampsyo has quit
      • 2021-03-04 06352, 2021

      • sampsyo joined the channel
      • 2021-03-04 06300, 2021

      • BrainzGit
        [listenbrainz-server] MonkeyDo opened pull request #1315 (master…timeline-ui): Create timeline view for UserFeed page https://github.com/metabrainz/listenbrainz-server…
      • 2021-03-04 06313, 2021

      • ruaok
        _lucifer: you had some test files uncommitted on leader. I stashed them.
      • 2021-03-04 06325, 2021


      • reosarevok
        Does this have a chance to make Tidal relevant?
      • 2021-03-04 06333, 2021

      • reosarevok
        (or: is Tidal relevant now and I missed the news)
      • 2021-03-04 06337, 2021

      • ruaok
        my question exactly.
      • 2021-03-04 06337, 2021

      • _lucifer
        ruaok: that was there before i tested yesterday. i was unable to remove those files
      • 2021-03-04 06334, 2021

      • ruaok
        request_consumer@newleader:~/listenbrainz-server$ sudo chown -R request_consumer:request_consumer .
      • 2021-03-04 06359, 2021

      • ruaok
        the docker container causes this; re-owning the files allows them to be deleted.
      • 2021-03-04 06359, 2021

      • _lucifer
        makes sense. thanks!
      • 2021-03-04 06310, 2021

      • ruaok
        request consumer deployed.
      • 2021-03-04 06345, 2021

      • ruaok
        now deploying a spark_writer and then I'll make some requests.
      • 2021-03-04 06357, 2021

      • _lucifer
        ruaok, you'll need to redeploy. i forgot to push one commit. i'll push it in 5 mins. sorry!
      • 2021-03-04 06314, 2021

      • ruaok hangs
      • 2021-03-04 06310, 2021

      • Mr_Monkey reboots ruaok
      • 2021-03-04 06324, 2021

      • Mr_Monkey
        I mean… it's been hanging for 10 minutes!
      • 2021-03-04 06349, 2021

      • shivam-kapila
        lol
      • 2021-03-04 06327, 2021

      • shivam-kapila
        Mr_Monkey: I was working on the semi-circle progress. Shall I create a component from scratch or use some library?
      • 2021-03-04 06334, 2021

      • MajorLurker joined the channel
      • 2021-03-04 06350, 2021

      • Mr_Monkey
        What's the context?
      • 2021-03-04 06341, 2021

      • Mr_Monkey
        There's some good pure css ones you can copy here: https://loading.io/css/
      • 2021-03-04 06320, 2021


      • shivam-kapila
        The green over grey score indicator
      • 2021-03-04 06313, 2021

      • ruaok
        _lucifer: ?
      • 2021-03-04 06312, 2021

      • MajorLurker has quit
      • 2021-03-04 06309, 2021

      • SothoTalKer has quit
      • 2021-03-04 06346, 2021

      • Mr_Monkey
        shivam-kapila: I think you'll find lightweight css solutions for that. A whole library seems like a bit much. An example: https://codeconvey.com/pure-css-radial-progress-b…
      • 2021-03-04 06322, 2021

      • shivam-kapila
        cool. thank you
      • 2021-03-04 06333, 2021

      • _lucifer
        ruaok: updated. apologies for the delay
      • 2021-03-04 06340, 2021

      • ruaok
        k
      • 2021-03-04 06302, 2021

      • SothoTalKer joined the channel
      • 2021-03-04 06323, 2021


      • shivam-kapila bookmarks these sites for future css references
      • 2021-03-04 06326, 2021

      • ruaok
        _lucifer: "listenbrainz_spark.exceptions.FileNotFetchedException: File could not be fetched from /data/listenbrainz/2021/3.parquet" is this the error you were seeing yesterday?
      • 2021-03-04 06346, 2021

      • _lucifer
        yes
      • 2021-03-04 06355, 2021

      • _lucifer
        the exception was different though
      • 2021-03-04 06315, 2021

      • ruaok
        should I request a full dump import and see what happens?
      • 2021-03-04 06321, 2021

      • _lucifer
        a possible reason could be that i messed up a path in the mapping. but for fetching the listens dump the path is still hardcoded and unchanged
      • 2021-03-04 06335, 2021

      • _lucifer
        yeah let's see how it goes
      • 2021-03-04 06338, 2021

      • ruaok
        k
      • 2021-03-04 06334, 2021


      • _lucifer
        did the import complete ruaok ?
      • 2021-03-04 06332, 2021

      • _lucifer
        the file seems to be present at the correct location