zas: there's not much info in pg logs other than that checkpoints were occurring too frequently (usually indicating heavy writes)
2023-10-06 27921, 2023
bitmap
however the load averages on wolf and the spark cluster nodes were higher than usual during the same time frame, so perhaps something heavy was being processed there
mayhem: we can do that yes but we won't be able to join that with any of our other data then.
2023-10-06 27934, 2023
lucifer
i guess it would be fine to store the mb_metadata_cache but it wouldn't lower the load on MB db because all the data is still read from there.
2023-10-06 27907, 2023
lucifer
imo we need to make the incremental updates foolproof so that we can totally get rid of the bulk generate from scratch.
2023-10-06 27912, 2023
bitmap
can you use unlogged tables? that would solve the wal accumulation issues. (you won't be able to query the tables from a standby though, you'll have to use the master)
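For reference, a minimal sketch of what the unlogged-table suggestion could look like, written as Python that only builds the SQL strings and does not touch a database. The table name `mapping_scratch` and its columns are hypothetical and not taken from the chat; `CREATE UNLOGGED TABLE` and `ALTER TABLE ... SET UNLOGGED` are standard PostgreSQL syntax. Unlogged tables skip the WAL, which is why they are not replicated to standbys, and they are truncated after a crash.

```python
def create_unlogged_sql(table, columns):
    """Build a CREATE UNLOGGED TABLE statement from (name, type) pairs."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE UNLOGGED TABLE {table} ({cols});"


def set_unlogged_sql(table):
    """Build the statement that switches an existing table to unlogged."""
    return f"ALTER TABLE {table} SET UNLOGGED;"


# Hypothetical scratch table, for illustration only.
print(create_unlogged_sql("mapping_scratch",
                          [("recording_mbid", "UUID"), ("score", "INTEGER")]))
print(set_unlogged_sql("mapping_scratch"))
```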
2023-10-06 27945, 2023
zas
also Jackson5 were clearly active during the time floyd was under load, so this has to be checked too
2023-10-06 27949, 2023
lucifer
bitmap: that should be possible i guess. will need to check why so much wal accumulated though. we don't write back much, just read a lot.
2023-10-06 27931, 2023
mayhem
lucifer: yes, to all that. the inability to do joins limits what we can do, but I think there are many datasets that we envision hosting with DSH-on-spark that could actually be handled by couchdb. mb metadata may not be the best example. but something like last_listened_at for all users could.
2023-10-06 27956, 2023
lucifer
mayhem: yes that makes sense
2023-10-06 27921, 2023
mayhem
and on making the cached data incremental updates work -- we'll have a chance to look at that real soon. joy.
mayhem: I cannot find mine D: did you find a random L shirt laying about in the office any of these days? 😅
2023-10-06 27948, 2023
reosarevok
(I will be at the office in 30)
2023-10-06 27950, 2023
mayhem makes the L gesture on his forehead
2023-10-06 27939, 2023
lusciouslover has quit
2023-10-06 27949, 2023
reosarevok
I mean, if that gets me a t-shirt again, I'll take the L
2023-10-06 27952, 2023
reosarevok hides
2023-10-06 27900, 2023
mayhem
outsidecontext: error handling has been fixed now on the DSH endpoint.
2023-10-06 27922, 2023
outsidecontext
mayhem: will check in a minute
2023-10-06 27929, 2023
mayhem
now to actually work on why some bits don't return data.
2023-10-06 27954, 2023
mayhem
yvanzo: ping
2023-10-06 27948, 2023
outsidecontext
curl -X POST -k -H 'Content-Type: application/json' -i 'https://labs.api.listenbrainz.org/mbid-mapping-release/json' --data '[{"[artist_credit_name]": "Paradise Lost", "[recording_name]": "Paradise Lost", "[release_name]": "Drown in Darkness \\u2013 The Early Demos"}, {"[artist_credit_name]": "Paradise Lost", "[recording_name]": "Paradise Lost (live)", "[release_name]": "Drown in Darkness \\u2013 The Early Demos"},
see the query that outsidecontext posted above. that is against the mapping with releases endpoint.
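The curl request above could be sketched in Python roughly as follows, for anyone wanting to reproduce it from a script. This is an illustration only: it uses the URL and the first row from the pasted command, builds the request without sending it, and does not replicate curl's `-k` (skip TLS verification). The helper name `build_request` is made up for this sketch.

```python
import json
import urllib.request

# Endpoint taken from the curl command pasted in the chat.
URL = "https://labs.api.listenbrainz.org/mbid-mapping-release/json"


def build_request(rows):
    """Build (but do not send) a JSON POST request for the given rows."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# First row of the pasted payload; recording name equals artist name.
rows = [{
    "[artist_credit_name]": "Paradise Lost",
    "[recording_name]": "Paradise Lost",
    "[release_name]": "Drown in Darkness \u2013 The Early Demos",
}]
req = build_request(rows)
# Passing req to urllib.request.urlopen would actually send it.
```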
2023-10-06 27908, 2023
mayhem
can you plz take a look at that if you have a moment?
2023-10-06 27906, 2023
monkey
mayhem: pushed, should be fixed now
2023-10-06 27912, 2023
outsidecontext
lucifer: all the requested tracks are supposed to yield a result as there is a recording for it. but recording name equals band name. I can reliably reproduce the issue with this endpoint
2023-10-06 27926, 2023
lucifer
mayhem, outsidecontext: sure looking into it
2023-10-06 27927, 2023
mayhem
thx monkey
2023-10-06 27932, 2023
mayhem
thx lucifer
2023-10-06 27933, 2023
lucifer
what branch is it running on?
2023-10-06 27951, 2023
mayhem
that's on labs, no? should be current prod more or less.
Both to keep us on track/as a handy reminder, and to keep the community and those who couldn’t make it in the loop ➰
2023-10-06 27928, 2023
monkey
mayhem: Looks like the error is related to the manifest file we write to disk (so that flask knows which javascript file to load). I don't think we changed anything, so that's surprising