#metabrainz

      • ruaok
        create the new table, create indexes, and then in one transaction rename the tables, effectively swapping them into place. once that transaction finishes, drop the old table.
      • 2021-03-04 06321, 2021
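
The swap pattern described above, sketched in SQL (all table, index, and column names here are placeholders, not the actual schema):

```sql
-- build the replacement off to the side
CREATE TABLE similar_user_new (LIKE similar_user);
-- ... bulk-load the new data into similar_user_new ...
CREATE INDEX similar_user_new_user_id_ndx ON similar_user_new (user_id);

-- swap in one transaction so readers never see a missing table
BEGIN;
ALTER TABLE similar_user RENAME TO similar_user_old;
ALTER TABLE similar_user_new RENAME TO similar_user;
COMMIT;

DROP TABLE similar_user_old;
```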

      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1955 (master…MBS-8678): MBS-8678 / MBS-11421: Split artist improvements https://github.com/metabrainz/musicbrainz-server/…
      • 2021-03-04 06304, 2021

      • iliekcomputers
        I see.
      • 2021-03-04 06319, 2021

      • ruaok
        how can we go about not duplicating the table definition?
      • 2021-03-04 06326, 2021

      • Rohan_Pillai joined the channel
      • 2021-03-04 06304, 2021

      • ruaok
        I wonder if I can easily copy the table definition from an existing table. that would solve this problem
      • 2021-03-04 06337, 2021

      • iliekcomputers
        That seems possible
      • 2021-03-04 06317, 2021

      • Mineo
        create table foo like bar?
      • 2021-03-04 06346, 2021

      • ruaok
        no wai.
      • 2021-03-04 06349, 2021

      • iliekcomputers
        Yep
      • 2021-03-04 06351, 2021

      • ruaok
        that is exactly what I need. :) lol.
      • 2021-03-04 06357, 2021

      • ruaok
        thanks Mineo.
      • 2021-03-04 06320, 2021

      • ruaok
        well, still not perfect.
      • 2021-03-04 06349, 2021

      • ruaok
        I'll need to create indexes and constraints AFTER the data import.
      • 2021-03-04 06315, 2021

      • ruaok
        but, I can work with that.
      • 2021-03-04 06343, 2021

      • iliekcomputers
        It takes an "excluding" param as well
      • 2021-03-04 06302, 2021

      • iliekcomputers
        Create table foo like bar excluding indexes
      • 2021-03-04 06323, 2021
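
For reference, in Postgres the LIKE clause is parenthesized, and by default it copies only column definitions and NOT NULL constraints (no indexes, no defaults). A sketch, with `foo`/`bar` as in the messages above:

```sql
-- default: columns and NOT NULL constraints only
CREATE TABLE foo (LIKE bar);

-- or copy everything, then opt out of what will be rebuilt after the import
CREATE TABLE foo2 (LIKE bar INCLUDING ALL EXCLUDING INDEXES);
```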

      • ruaok
        I'll add comments to create_indexes saying that if changes are made to a table index there, this code needs updating as well.
      • 2021-03-04 06330, 2021

      • ruaok
        that should suffice for now.
      • 2021-03-04 06337, 2021

      • iliekcomputers
        For both the indexes and the constraints
      • 2021-03-04 06349, 2021

      • ruaok
        yeah
      • 2021-03-04 06351, 2021

      • iliekcomputers
        If you calculate data for a deleted user, insert it and then add the foreign key constraint, will there be a way to delete rows that break the constraint?
      • 2021-03-04 06305, 2021

      • iliekcomputers
        Or will it need to be done by hand
      • 2021-03-04 06350, 2021

      • ruaok
        not sure. let me see.
      • 2021-03-04 06339, 2021
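
One standard way to handle this in SQL rather than by hand is to delete the orphan rows first, then add the constraint (a sketch; table, column, and constraint names are illustrative):

```sql
-- remove rows calculated for users that no longer exist
DELETE FROM similar_user_new s
 WHERE NOT EXISTS (SELECT 1 FROM "user" u WHERE u.id = s.user_id);

-- now the constraint can be added without violations
ALTER TABLE similar_user_new
  ADD CONSTRAINT similar_user_new_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES "user" (id) ON DELETE CASCADE;
```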

      • CatQuest has left the channel
      • 2021-03-04 06359, 2021

      • CatQuest joined the channel
      • 2021-03-04 06353, 2021

      • Rohan_Pillai has quit
      • 2021-03-04 06321, 2021

      • Mr_Monkey
        iliekcomputers: Thoughts about this? https://chatlogs.metabrainz.org/brainzbot/metabra…
      • 2021-03-04 06330, 2021

      • iliekcomputers
        Mr_Monkey: sorry I missed that message, the types make sense to me
      • 2021-03-04 06328, 2021

      • Mr_Monkey
        OK, let me formalize that in the design doc
      • 2021-03-04 06316, 2021

      • alastairp
        sorry ruaok, I thought your question was "what should we name the indexes", and unfortunately you caught me just as I was heading out the door. It looks like you may have answered your question with iliekcomputers and Mineo?
      • 2021-03-04 06310, 2021

      • ruaok
        more or less, yes.
      • 2021-03-04 06333, 2021

      • ruaok
        except doing bulk inserts with sqlalchemy is.... alchemy.
      • 2021-03-04 06353, 2021

      • alastairp
        ah yeah. one sec, I have a pattern to do that
      • 2021-03-04 06355, 2021

      • ruaok
        i really want to be using psycopg2 in spark_reader, not sqlalchemy.
      • 2021-03-04 06305, 2021

      • alastairp
        let me see if I can remember where I did it
      • 2021-03-04 06332, 2021

      • Mr_Monkey
        iliekcomputers: Have you decided on the structure of the API endpoint to GET timeline events?
      • 2021-03-04 06341, 2021

      • iliekcomputers
        Mr_Monkey: it would probably take the same params as the existing feed/listens endpoint.
      • 2021-03-04 06314, 2021

      • iliekcomputers
        I think we'll just rename the feed/listens endpoint to be more generic and then return all the data from there.
      • 2021-03-04 06304, 2021

      • Mr_Monkey
        `GET /user/XXXX/timeline` ?
      • 2021-03-04 06342, 2021

      • Mr_Monkey
        or `GET /user/XXXX/feed`?
      • 2021-03-04 06343, 2021

      • Rohan_Pillai joined the channel
      • 2021-03-04 06314, 2021

      • iliekcomputers
        Either of those is ok with me, do you have a preference?
      • 2021-03-04 06303, 2021

      • Mr_Monkey
        I feel like 'feed' is a more standard description for what we're doing.
      • 2021-03-04 06326, 2021

      • Mr_Monkey
        Another option is `/1/user/$user_name/feed/events`, especially if we want endpoints for specific event types like `/1/user/$user_name/feed/events/recommendation`
      • 2021-03-04 06319, 2021

      • iliekcomputers
        I guess because it's json, we'll want it behind the api prefix, so /1/user/username/feed/events makes sense to me.
      • 2021-03-04 06305, 2021

      • iliekcomputers
        Do we want different endpoints for different types of events? I think having just the one endpoint would be easier for the front-end.
      • 2021-03-04 06348, 2021
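
A single endpoint with an optional type filter would keep the front-end simple. A minimal sketch of that filtering logic (function name, event shapes, and the query-parameter idea are hypothetical, not the actual ListenBrainz API):

```python
from typing import List, Optional


def get_feed_events(events: List[dict], event_type: Optional[str] = None) -> List[dict]:
    """Return all events, or only those matching event_type if one is given.

    With this shape, /1/user/<name>/feed/events serves everything, and a
    per-type view becomes a query param (?type=recommendation) instead of
    one route per event type.
    """
    if event_type is None:
        return events
    return [e for e in events if e.get("type") == event_type]


events = [
    {"type": "recommendation", "data": "..."},
    {"type": "follow", "data": "..."},
]
print(len(get_feed_events(events)))                    # 2
print(len(get_feed_events(events, "recommendation")))  # 1
```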

      • ruaok
        _lucifer: I found one problem with my spec. :(
      • 2021-03-04 06311, 2021

      • ruaok
        we should be using user_id (int), not user_name (str), for the data that gets returned from spark.
      • 2021-03-04 06316, 2021

      • ruaok
        sorry :(
      • 2021-03-04 06303, 2021

      • _lucifer
        that should only be a matter of deleting code, and it saves us two joins methinks :)
      • 2021-03-04 06330, 2021

      • _lucifer
        i'll change it once i get back
      • 2021-03-04 06341, 2021

      • ruaok
        sweet, thanks.
      • 2021-03-04 06341, 2021

      • iliekcomputers
        I'm not completely sure we have the same user IDs in spark and postgres
      • 2021-03-04 06343, 2021

      • alastairp
        urgh. no idea where I had that sqlalchemy-core bulk insert code
      • 2021-03-04 06348, 2021

      • ruaok
        makes the data smaller too.
      • 2021-03-04 06305, 2021

      • ruaok
        alastairp: used livegrep?
      • 2021-03-04 06308, 2021

      • alastairp
        yep
      • 2021-03-04 06315, 2021

      • _lucifer
        iliekcomputers: yeah, that's what i want to check
      • 2021-03-04 06320, 2021

      • Mr_Monkey
        iliekcomputers: I think the front-end only needs the single endpoint. I was thinking of possible future use-cases if we need more precision, but can't think of a great example.
      • 2021-03-04 06326, 2021

      • alastairp
        is it not just as simple as `connection.execute(query, [list, of,items,to,insert])` ?
      • 2021-03-04 06344, 2021

      • ruaok
        alastairp: that works, yes, but it's stupidly slow.
      • 2021-03-04 06350, 2021


      • alastairp
        ruaok: I recall that somewhere else you just got a raw engine from the connection and did bulk insert
      • 2021-03-04 06302, 2021

      • ruaok
        we can add yet another module to make it happen...
      • 2021-03-04 06313, 2021

      • ruaok
        that would be ideal.
      • 2021-03-04 06337, 2021

      • alastairp
        ruaok: ah, right. I have a funny feeling that there was a flag that you had to set to tell sql-a to do a bulk query in this case instead of multiple inserts
      • 2021-03-04 06314, 2021

      • ruaok
        I have this all working in mbid_mapping, which uses psycopg2.
      • 2021-03-04 06315, 2021

      • alastairp
        you want to make an sql query like `INSERT INTO x (a, b) VALUES (1,2), (3,4), (4,5)`, right?
      • 2021-03-04 06332, 2021

      • ruaok
        yes, with like 10k rows per statement.
      • 2021-03-04 06310, 2021
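
Multi-row VALUES statements like the one above can be built in batches. In practice psycopg2's `execute_values` does this (including proper escaping); the string-building below is an illustration of the chunking only and skips quoting, so it is safe for numeric values only:

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def build_insert(table, columns, rows):
    """Build one multi-row INSERT statement (numeric values only; no escaping)."""
    values = ", ".join("(" + ", ".join(str(v) for v in row) + ")" for row in rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


# 25k rows at 10k per statement gives three statements (10k + 10k + 5k)
rows = [(i, i * 2) for i in range(25000)]
statements = [build_insert("x", ("a", "b"), chunk) for chunk in chunked(rows, 10000)]
print(len(statements))  # 3
```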


      • ruaok
        the spark reader also runs flask for its logging purposes, which is odd. thus sqlalchemy.
      • 2021-03-04 06308, 2021

      • ruaok
        and the connection strings are not formatted right for connecting to psycopg2 directly.
      • 2021-03-04 06309, 2021


      • _lucifer
        ruaok: https://github.com/metabrainz/listenbrainz-server… it seems we only have the user_name in spark
      • 2021-03-04 06304, 2021

      • alastairp
        right - this will give you a psycopg2 connection which you can use execute_values with
      • 2021-03-04 06322, 2021
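
The approach alastairp describes, sketched below. It assumes an SQLAlchemy engine backed by a Postgres/psycopg2 URL; the function name and the table/column parameters are placeholders:

```python
def bulk_insert_rows(engine, table, columns, rows, page_size=10000):
    """Bulk-insert rows via psycopg2's execute_values, using the raw DBAPI
    connection underneath an SQLAlchemy engine."""
    # deferred import: only needed when actually inserting
    from psycopg2.extras import execute_values

    query = f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s"
    conn = engine.raw_connection()  # unwrap the psycopg2 connection
    try:
        with conn.cursor() as cur:
            # execute_values batches rows into multi-row VALUES statements
            execute_values(cur, query, rows, page_size=page_size)
        conn.commit()
    finally:
        conn.close()
```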

      • ruaok
        ok, I'll give that a shot.
      • 2021-03-04 06325, 2021

      • alastairp
        but goddamnit I solved this in sqlalchemy, I just can't remember where
      • 2021-03-04 06347, 2021

      • ruaok
        _lucifer: oh jeez, this simple problem keeps getting worse by the second. :(
      • 2021-03-04 06329, 2021

      • _lucifer
        yeah :(
      • 2021-03-04 06301, 2021

      • ruaok
        iliekcomputers: alastairp: _lucifer : can we change this table to use user_name instead of user_id ?
      • 2021-03-04 06301, 2021


      • alastairp
        that'll have an effect if a user changes their name
      • 2021-03-04 06328, 2021

      • ruaok
        otherwise I need to reprocess each of the rows in order to look up the name
      • 2021-03-04 06341, 2021

      • alastairp
        is it because spark uses a public dump that doesn't have row ids?
      • 2021-03-04 06344, 2021

      • _lucifer
        i think the user cannot change the username
      • 2021-03-04 06348, 2021

      • reosarevok
        zas, ruaok: https://blog.metabrainz.org/2021/03/01/musicbrain… did we have any known issues earlier today? I don't think I noticed any alerts at all today
      • 2021-03-04 06352, 2021

      • Rohan_Pillai has quit
      • 2021-03-04 06358, 2021

      • ruaok
        alastairp: yes, no user_ids in spark.
      • 2021-03-04 06306, 2021

      • _lucifer
        it's a long-standing open issue because listens are closely tied to usernames
      • 2021-03-04 06325, 2021

      • alastairp
        _lucifer: that was an issue with the old listen storage, but should be less of an issue now
      • 2021-03-04 06331, 2021

      • ruaok
        reosarevok: sounds like a block not a service issue. have them give us their IP
      • 2021-03-04 06359, 2021

      • ruaok
        the other issue that I need to deal with is fixing up the FKs.
      • 2021-03-04 06314, 2021

      • _lucifer
        LB-383
      • 2021-03-04 06315, 2021

      • BrainzBot
        LB-383: Allow updating usernames when they're changed in MusicBrainz https://tickets.metabrainz.org/browse/LB-383
      • 2021-03-04 06316, 2021

      • alastairp
        given the fact that recommendation.similar_user.user_id has an FK, I'm not convinced that we should change it to a username
      • 2021-03-04 06318, 2021

      • ruaok
        in case a user is deleted. I suppose I can handle both in one go, but that makes this table processing a lot worse.
      • 2021-03-04 06336, 2021

      • alastairp
        but it means we'd have to do a separate query/pass through the data to get that mapping?
      • 2021-03-04 06344, 2021

      • ruaok
        yep.
      • 2021-03-04 06347, 2021
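
That mapping pass is cheap if the user table is fetched once into a dict. A sketch (names and row shapes are hypothetical; real code would query the user table rather than take it as a list):

```python
def map_usernames_to_ids(rows, users):
    """Replace user_name with user_id in spark output rows, dropping rows
    whose user no longer exists (e.g. deleted accounts).

    rows:  [(user_name, payload), ...] as produced by spark
    users: [(user_id, user_name), ...] as read from the user table
    """
    name_to_id = {name: uid for uid, name in users}
    return [(name_to_id[name], payload) for name, payload in rows
            if name in name_to_id]


users = [(1, "alice"), (2, "bob")]
rows = [("alice", "data-a"), ("deleted_user", "data-x"), ("bob", "data-b")]
print(map_usernames_to_ids(rows, users))  # [(1, 'data-a'), (2, 'data-b')]
```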

      • reosarevok
        ruaok: thanks, done
      • 2021-03-04 06328, 2021

      • alastairp
        ruaok: and just to confirm, you're not doing _any_ processing on that data between when it comes in and when you throw it into execute?
      • 2021-03-04 06350, 2021

      • ruaok
        I wasn't planning on doing any. but it looks like I do now.
      • 2021-03-04 06337, 2021

      • nawcom_ joined the channel
      • 2021-03-04 06338, 2021

      • nawcom has quit
      • 2021-03-04 06338, 2021

      • nawcom_ has quit
      • 2021-03-04 06333, 2021

      • sampsyo has quit
      • 2021-03-04 06352, 2021

      • sampsyo joined the channel
      • 2021-03-04 06300, 2021

      • BrainzGit
        [listenbrainz-server] MonkeyDo opened pull request #1315 (master…timeline-ui): Create timeline view for UserFeed page https://github.com/metabrainz/listenbrainz-server…
      • 2021-03-04 06313, 2021

      • ruaok
        _lucifer: you had some test files uncommitted on leader. I stashed them.
      • 2021-03-04 06325, 2021


      • reosarevok
        Does this have a chance to make Tidal relevant?
      • 2021-03-04 06333, 2021

      • reosarevok
        (or: is Tidal relevant now and I missed the news)
      • 2021-03-04 06337, 2021

      • ruaok
        my question exactly.
      • 2021-03-04 06337, 2021

      • _lucifer
        ruaok: that was there before i tested yesterday. i was unable to remove those files
      • 2021-03-04 06334, 2021

      • ruaok
        request_consumer@newleader:~/listenbrainz-server$ sudo chown -R request_consumer:request_consumer .
      • 2021-03-04 06359, 2021

      • ruaok
        the docker container causes this; re-owning the files allows them to be deleted.
      • 2021-03-04 06359, 2021

      • _lucifer
        makes sense. thanks!
      • 2021-03-04 06310, 2021

      • ruaok
        request consumer deployed.
      • 2021-03-04 06345, 2021

      • ruaok
        now deploying a spark_writer and then I'll make some requests.
      • 2021-03-04 06357, 2021

      • _lucifer
        ruaok, you'll need to redeploy. i forgot to push one commit. i'll push it in 5 mins. sorry!
      • 2021-03-04 06314, 2021

      • ruaok hangs
      • 2021-03-04 06310, 2021

      • Mr_Monkey reboots ruaok
      • 2021-03-04 06324, 2021

      • Mr_Monkey
        I mean… it's been hanging for 10 minutes!
      • 2021-03-04 06349, 2021

      • shivam-kapila
        lol
      • 2021-03-04 06327, 2021

      • shivam-kapila
        Mr_Monkey: I was working on the semi-circle progress. Shall I create a component from scratch or use some library?
      • 2021-03-04 06334, 2021

      • MajorLurker joined the channel
      • 2021-03-04 06350, 2021

      • Mr_Monkey
        What's the context?
      • 2021-03-04 06341, 2021

      • Mr_Monkey
        There's some good pure css ones you can copy here: https://loading.io/css/
      • 2021-03-04 06320, 2021


      • shivam-kapila
        The green over grey score indicator
      • 2021-03-04 06313, 2021

      • ruaok
        _lucifer: ?
      • 2021-03-04 06312, 2021

      • MajorLurker has quit
      • 2021-03-04 06309, 2021

      • SothoTalKer has quit
      • 2021-03-04 06346, 2021

      • Mr_Monkey
        shivam-kapila: I think you'll find lightweight css solutions for that. A whole library seems like a bit much. An example: https://codeconvey.com/pure-css-radial-progress-b…
      • 2021-03-04 06322, 2021

      • shivam-kapila
        cool. thank you
      • 2021-03-04 06333, 2021

      • _lucifer
        ruaok: updated. apologies for the delay
      • 2021-03-04 06340, 2021

      • ruaok
        k
      • 2021-03-04 06302, 2021

      • SothoTalKer joined the channel
      • 2021-03-04 06323, 2021


      • shivam-kapila bookmarks these sites for future css references
      • 2021-03-04 06326, 2021

      • ruaok
        _lucifer: "listenbrainz_spark.exceptions.FileNotFetchedException: File could not be fetched from /data/listenbrainz/2021/3.parquet" is this the error you were seeing yesterday?
      • 2021-03-04 06346, 2021

      • _lucifer
        yes
      • 2021-03-04 06355, 2021

      • _lucifer
        the exception was different though
      • 2021-03-04 06315, 2021

      • ruaok
        should I request a full dump import and see what happens?
      • 2021-03-04 06321, 2021

      • _lucifer
        a possible reason could be that i messed up a path in the mapping. but for fetching the listens dump the path is still hardcoded and unchanged
      • 2021-03-04 06335, 2021

      • _lucifer
        yeah let's see how it goes
      • 2021-03-04 06338, 2021

      • ruaok
        k
      • 2021-03-04 06334, 2021


      • _lucifer
        did the import complete ruaok ?
      • 2021-03-04 06332, 2021

      • _lucifer
        the file seems to be present at the correct location