create the new table, create indexes, and then in one transaction rename the tables, effectively swapping them into place. once that transaction finishes, drop the old table.
how can we go about not duplicating the table definition?
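For reference, a minimal sketch of that swap pattern using psycopg2 (the connection string and the `listen`/`listen_new` table names are hypothetical placeholders, not the actual ListenBrainz schema):

```python
import psycopg2

# Hypothetical connection string and table names, for illustration only.
conn = psycopg2.connect("dbname=listenbrainz")

with conn, conn.cursor() as cur:
    # Build the replacement table outside the swap transaction.
    cur.execute("CREATE TABLE listen_new (LIKE listen INCLUDING DEFAULTS)")
    # ... bulk-import data into listen_new, then create its indexes ...

with conn, conn.cursor() as cur:
    # One transaction: rename the tables, swapping the new one into place.
    cur.execute("ALTER TABLE listen RENAME TO listen_old")
    cur.execute("ALTER TABLE listen_new RENAME TO listen")

with conn, conn.cursor() as cur:
    # Once the swap has committed, drop the old table.
    cur.execute("DROP TABLE listen_old")

conn.close()
```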
2021-03-04 06326, 2021
Rohan_Pillai joined the channel
2021-03-04 06304, 2021
ruaok
I wonder if I easily copy the table definition from an existing table. that would solve this problem
2021-03-04 06337, 2021
iliekcomputers
That seems possible
2021-03-04 06317, 2021
Mineo
create table foo like bar?
2021-03-04 06346, 2021
ruaok
no wai.
2021-03-04 06349, 2021
iliekcomputers
Yep
2021-03-04 06351, 2021
ruaok
that is exactly what I need. :) lol.
2021-03-04 06357, 2021
ruaok
thanks Mineo.
2021-03-04 06320, 2021
ruaok
well, still not perfect.
2021-03-04 06349, 2021
ruaok
I'll need to create indexes and constraints AFTER the data import.
2021-03-04 06315, 2021
ruaok
but, I can work with that.
2021-03-04 06343, 2021
iliekcomputers
It takes an "excluding" param as well
2021-03-04 06302, 2021
iliekcomputers
Create table foo like bar excluding indexes
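For the record, Postgres wants parentheses around the LIKE clause, and the EXCLUDING options let indexes and constraints be skipped so they can be created after the import; a sketch with placeholder table names:

```python
import psycopg2

# Placeholder connection string and table names, for illustration only.
with psycopg2.connect("dbname=listenbrainz") as conn, conn.cursor() as cur:
    # Copies bar's column definitions (plus defaults, comments, etc. via
    # INCLUDING ALL) but skips indexes and CHECK constraints, which can
    # then be created after the data import.
    cur.execute("""
        CREATE TABLE foo (
            LIKE bar INCLUDING ALL EXCLUDING INDEXES EXCLUDING CONSTRAINTS
        )
    """)
```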
2021-03-04 06323, 2021
ruaok
I'll add comments to create_indexes saying that if changes are made to a table's indexes there, this code needs updating as well.
2021-03-04 06330, 2021
ruaok
that should suffice for now.
2021-03-04 06337, 2021
iliekcomputers
For both the indexes and the constraints
2021-03-04 06349, 2021
ruaok
yeah
2021-03-04 06351, 2021
iliekcomputers
If you calculate data for a deleted user, insert it and then add the foreign key constraint, will there be a way to delete rows that break the constraint?
Mr_Monkey: sorry I missed that message, the types make sense to me
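On the constraint question above: one option, sketched here with hypothetical table/column names, is to delete the orphaned rows before adding the foreign key, since Postgres validates existing rows when the constraint is created:

```python
import psycopg2

# Hypothetical table and column names, for illustration only.
with psycopg2.connect("dbname=listenbrainz") as conn, conn.cursor() as cur:
    # Remove rows that reference users who no longer exist...
    cur.execute("""
        DELETE FROM user_stats
         WHERE user_id NOT IN (SELECT id FROM "user")
    """)
    # ...then add the constraint; Postgres checks the existing rows here
    # and the ALTER fails if any would violate it.
    cur.execute("""
        ALTER TABLE user_stats
          ADD CONSTRAINT user_stats_user_id_foreign
          FOREIGN KEY (user_id) REFERENCES "user" (id)
    """)
```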
2021-03-04 06328, 2021
Mr_Monkey
OK, let me formalize that in the design doc
2021-03-04 06316, 2021
alastairp
sorry ruaok, I thought your question was "what should we name the indexes", and unfortunately you caught me just as I was heading out the door. It looks like you may have answered your question with iliekcomputers and Mineo?
2021-03-04 06310, 2021
ruaok
more or less, yes.
2021-03-04 06333, 2021
ruaok
except doing bulk inserts with sqlalchemy is.... alchemy.
2021-03-04 06353, 2021
alastairp
ah yeah. one sec, I have a pattern to do that
2021-03-04 06355, 2021
ruaok
i really want to be using psycopg2 in spark_reader, not sqlalchemy.
2021-03-04 06305, 2021
alastairp
let me see if I can remember where I did it
2021-03-04 06332, 2021
Mr_Monkey
iliekcomputers: Have you decided on the structure of the API endpoint to GET timeline events?
2021-03-04 06341, 2021
iliekcomputers
Mr_Monkey: it would probably take the same params as the existing feed/listens endpoint.
2021-03-04 06314, 2021
iliekcomputers
I think we'll just rename the feed/listens endpoint to be more generic and then return all the data from there.
2021-03-04 06304, 2021
Mr_Monkey
`GET /user/XXXX/timeline` ?
2021-03-04 06342, 2021
Mr_Monkey
or `GET /user/XXXX/feed`?
2021-03-04 06343, 2021
Rohan_Pillai joined the channel
2021-03-04 06314, 2021
iliekcomputers
Either of those is ok with me, do you have a preference?
2021-03-04 06303, 2021
Mr_Monkey
I feel like 'feed' is a more standard description for what we're doing.
2021-03-04 06326, 2021
Mr_Monkey
Another option is `/1/user/$user_name/feed/events`, especially if we want endpoints for specific event types like `/1/user/$user_name/feed/events/recommendation`
2021-03-04 06319, 2021
iliekcomputers
I guess because it's json, we'll want it behind the api prefix, so /1/user/username/feed/events makes sense to me.
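A hypothetical Flask sketch of that shape (blueprint, function, and payload names are made up, not the actual ListenBrainz code):

```python
from flask import Blueprint, jsonify

# Hypothetical blueprint; assume it gets registered under the /1 prefix.
user_feed_bp = Blueprint("user_feed", __name__)

@user_feed_bp.route("/user/<user_name>/feed/events")
def get_feed_events(user_name):
    # A single generic endpoint returning every event type; per-type
    # sub-paths could be added later if the need arises.
    events = []  # fetch this user's timeline events here
    return jsonify({"payload": {"user_name": user_name, "events": events}})
```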
2021-03-04 06305, 2021
iliekcomputers
Do we want different endpoints for different types of events? I think having just the one endpoint would be easier for the front-end.
2021-03-04 06348, 2021
ruaok
_lucifer: I found one problem with my spec. :(
2021-03-04 06311, 2021
ruaok
we should be using user_id (int), not user_name (str), for the data that gets returned from spark.
2021-03-04 06316, 2021
ruaok
sorry :(
2021-03-04 06303, 2021
_lucifer
that should just be a matter of deleting code, and it saves us two joins methinks :)
2021-03-04 06330, 2021
_lucifer
i'll change it once i get back
2021-03-04 06341, 2021
ruaok
sweet, thanks.
2021-03-04 06341, 2021
iliekcomputers
I'm not completely sure we have the same user IDs in spark and postgres
2021-03-04 06343, 2021
alastairp
urgh. no idea where I had that sqlalchemy-core bulk insert code
2021-03-04 06348, 2021
ruaok
makes the data smaller too.
2021-03-04 06305, 2021
ruaok
alastairp: used livegrep?
2021-03-04 06308, 2021
alastairp
yep
2021-03-04 06315, 2021
_lucifer
iliekcomputers: yeah, that's what i want to check
2021-03-04 06320, 2021
Mr_Monkey
iliekcomputers: I think the front-end only needs the single endpoint. I was thinking of possible future use-cases if we need more precision, but can't think of a great example.
2021-03-04 06326, 2021
alastairp
is it not just as simple as `connection.execute(query, [list, of, items, to, insert])`?
ruaok: I recall that somewhere else you just got a raw engine from the connection and did bulk insert
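For reference, a sketch of that SQLAlchemy-core pattern (the engine URL and table definition are stand-ins): passing a list of parameter dicts to `execute()` turns the insert into an executemany-style bulk operation:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine

# Stand-in engine URL and table definition, for illustration only.
engine = create_engine("postgresql://localhost/listenbrainz")
metadata = MetaData()
x = Table("x", metadata, Column("a", Integer), Column("b", Integer))

rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 4, "b": 5}]
with engine.begin() as connection:
    # A list of dicts makes this an executemany-style bulk insert.
    connection.execute(x.insert(), rows)
```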
2021-03-04 06302, 2021
ruaok
we can add yet another module to make it happen...
2021-03-04 06313, 2021
ruaok
that would be ideal.
2021-03-04 06337, 2021
alastairp
ruaok: ah, right. I have a funny feeling that there was a flag that you had to set to tell sql-a to do a bulk query in this case instead of multiple inserts
2021-03-04 06314, 2021
ruaok
I have this all working in mbid_mapping, which uses psycopg2.
2021-03-04 06315, 2021
alastairp
you want to make an sql query like `INSERT INTO x (a, b) VALUES (1,2), (3,4), (4,5)`, right?
shivam-kapila bookmarks these sites for future css references
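With psycopg2, `psycopg2.extras.execute_values` builds exactly that kind of multi-row `VALUES` statement; a sketch with placeholder connection string and data:

```python
import psycopg2
from psycopg2.extras import execute_values

# Placeholder connection string, table, and data, for illustration only.
conn = psycopg2.connect("dbname=listenbrainz")
rows = [(1, 2), (3, 4), (4, 5)]
with conn, conn.cursor() as cur:
    # Expands %s into a single multi-row VALUES clause, i.e.
    # INSERT INTO x (a, b) VALUES (1,2),(3,4),(4,5)
    execute_values(cur, "INSERT INTO x (a, b) VALUES %s", rows)
conn.close()
```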
2021-03-04 06326, 2021
ruaok
_lucifer: "listenbrainz_spark.exceptions.FileNotFetchedException: File could not be fetched from /data/listenbrainz/2021/3.parquet" is this the error you were seeing yesterday?
2021-03-04 06346, 2021
_lucifer
yes
2021-03-04 06355, 2021
_lucifer
the exception was different though
2021-03-04 06315, 2021
ruaok
should I request a full dump import and see what happens?
2021-03-04 06321, 2021
_lucifer
a possible reason could be that i messed up a path in the mapping, but for fetching the listens dump the path is still hardcoded and unchanged