bitmap, yvanzo ^ not too big a deal, but a regression from this release nevertheless
reosarevok has quit
reosarevok_ joined the channel
revi has quit
Freso has quit
revi joined the channel
wilbur joined the channel
Freso joined the channel
_lucifer
alastairp: ruaok: in addition to the cache discussion, it would be nice to discuss datetime serialization as well today.
afaik we cannot store timestamp fields directly in jsonb, but for the youtube player we need that functionality in some places. so i was thinking to convert the date to isoformat and store that.
but would be nice to solve this in one place (maybe BU?) and use that everywhere else
sumedh has quit
alastairp
_lucifer: mmm, interesting idea.
so the field that you wan't isn't generic enough to go in a regular column?
is this something that the youtube api needs, or something that you need to keep track of to do an operation? does the youtube api take it as a timestamp or datetime formatted field>?
_lucifer
its the latest_listened_at field currently only used by spotify, youtube doesn't allow to records listens so it isn't needed.
for other music services, as well we don't know if it'll be needed.
yeah, the spotify reader uses it to track the last fetched listen i think
alastairp
mmm
I don't think that this should be in the oauth table
it's related to "import listens from external services", that should be another table
_lucifer
ah so two tables? one only for services that allow to import listens?
yvanzo
mo’’in’
alastairp
yeah, that's what I'd do
reosarevok_
moin!
alastairp
have (user_id, service_name, service_registration_id, last_imported, error_message) or something
_lucifer
interesting we don't need the jsonb field in oauth table.
alastairp
oh great
_lucifer
if we make two tables
alastairp
I thought you were storing additional fields in here that the oauth services returned
_lucifer
for now its all the excess data from the existing spotify table, i don't think we'll need to store anything else there
there's a permissions column but both youtube and spotify use it and other services will probably will as well so that can become a regular column
*permissions field
alastairp
right
_lucifer
cool sounds good to me. i'll rework the tables.
yvanzo
chaban: I checked, it doesn’t even match your editing periods. It is more likely caused either by other services load or by a few edits that trigger a lot of reindex messages (e.g. editing crowded areas or relationship types).
reosarevok_: most probably pg load caused this, it doesn't seem to be specifically related to adding a label.
reosarevok_, bitmap: reviewed, would be nice to update beta tonight :)
alastairp: is there any benefit in storing user_id and service again here instead of a FK to external_service_oauth
reosarevok_ is now known as reosarevok
ZaphodBeeblebrox is now known as CatQuest
alastairp
_lucifer: I think a user id is useful, and there should be a link to external_service_auth
_lucifer: depends on what we want to do with the auth row if there is a failure - our discussion about deleting spotify rows but then not being able to see an error message
so, if we want to delete the external_service_oauth row, we should have `service`, so that we can still report to the user if they had an error with spotify, etc
but if you want to keep an invalid auth row, perhaps service isn't needed
sumedh joined the channel
_lucifer
alastairp: makes sense, i don't see any value in keeping around invalid auth rows. so i would be fine with whatever's easier to implement, adding an invalid or deleting the row altogether.
alastairp
_lucifer: right, so in that case external_service_oauth_id would have to be nullable
_lucifer
yes, that's what i am thinking currently
alastairp
if a row exists in listen_importer with that value null, it means that there was an error (error should be set)
and then we can use user_id, service to link that to a user
yeah, this feels much better!
_lucifer
yup, just saw postgres has the `ON DELETE SET NULL` so its simple to implement as well
alastairp
yeah, exactly
_lucifer
also, i saw you pushed the helper file. should i add the remaining .finish files?
this is instead of the previous one? a bit more complex because it has 2 tables?
_lucifer
yes
alastairp
_lucifer: that's a beautiful query
_lucifer
yup, postgresql is indeed powerful :-DD
alastairp
CTEs are SQL standard, though
ruaok
indeed very nice query. I really dig crafting more complex PG queries.
alastairp
(but not suppored in all sql servers)
*cough* mysql
ruaok
yeah, that looks fine, _lucifer
alastairp
_lucifer: I see that you've still got a bunch of instances of the json column still in that PR, so I stopped reviewing it
_lucifer
alastairp, yes, just was going to add that comment. the python side is still not completely migrated. i wanted to get a review on the sql first before working on that.
alastairp
ah cool!
_lucifer
alastairp: i think it makes sense to add `last_updated` field as well to listens_importer table. what do you think?
both external_service_oauth and listens_importer will have a last updated field, for the first table it will denote last auth details update and for the second the last time we tried fetching that user's listens
sumedh joined the channel
alastairp
_lucifer: yes, good idea because that is not always going to be the same as last_listened_at
_lucifer
alastairp: 👍, further i think we can get rid of `record_listens` field as well, because we'll be inserting only users for which we can record listens in the listens_importer table.
alastairp
see how much simpler things get when you add another table!
_lucifer
yup! makes sense a lot sense as there is a clear distinction between roles of the two tables.
ruaok
alastairp: waiting for tests to run. what else should I look at next?
if you happen to be able to spot the problem, plz let me know.
ruaok dislikes mocks
alastairp
urg, yeah, that sounds like bugs with the mock
though looking at that test, my first gut feel is that it's wrong. we shouldn't be testing that logs are made, we should be testing some methods that return True or False
if you made this function return a tuple `max_rating > 1.0, min_rating < -1.0` then you could just assert the return values
ruaok
sounds sane, thanks.
alastairp
_lucifer: do you know how to evict things from the gh actions cache?
#1398 managed to build the full listenbrainz python image and cache it for js tests 🤦♂️
so now js tests restore those images on every test... takes another 20 seconds ;)
_lucifer
alastairp: i don't think it is currently possible to do that.
ah for that, we can change the cache key
alastairp
I don't think it's super important. hopefully it'll get evicted in a few days/weeks
_lucifer
iirc cache is reset weekly.
ah no! its like this
>A repository can have up to 5GB of caches. Once the 5GB limit is reached, older caches will be evicted based on when the cache was last accessed. Caches that are not accessed within the last week will also be evicted.
alastairp
but I guess the cache will be accessed every time we run a CI job