alastairp: if my counts are correct then at max 60 laus relevant to us were deleted in the past month.
i'd be fine with incrementally updating the rest and updating link fully every week or so. maybe also try to optimize the full metadata cache generation to the extent possible. i have some ideas to try out in that direction.
alastairp
lucifer: awesome
BrainzGit
[troi-recommendation-playground] 14amCap1712 merged pull request #67 (03main…daily-jams-accept-day): Accept jam_date as optional argument in daily-jams patch https://github.com/metabrainz/troi-recommendati...
alastairp
lucifer: interesting that you said that the query to get a new set of artist rels is pretty quick too - but I think artist rels are part of an artist, not as a top level key in the recording json?
so we'd need to get a full artist blob for each recording, right? and then replace the "artists" key in _all_ recordings
lucifer
alastairp: yes, indeed. but we can restructure the cache or do a partial update instead of insert.
alastairp
lucifer: ok, yeah, right.
so, I like this idea of moving things around to make updates easier, or change the way that we do updates
lucifer
we can also build a temp table of artist rels and index on artist mbid and then join to it instead of having it in CTE.
and in any case everything should happen at once when we run the mb_metadata_cache script.
alastairp
yes, right
you said that the cte for artist rels is pretty quick, does that speed carry over to actually writing the data to a table and building the index too?
just putting another idea out there - is splitting each of these CTEs into a separate temp table and then running the final query as slow as running the entire query in one go?
lucifer
writing the data takes 1 hour for the full table. it only begins after the query has finished executing iiuc.
yes, splitting each cte and indexing before running final query is one of the approaches i want to check.
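The split-and-index approach described above can be sketched like this (a minimal illustration using Python's sqlite3 as a stand-in for Postgres; every table and column name below is hypothetical, not the real mb_metadata_cache schema):

```python
import sqlite3

# Hypothetical stand-ins for the real tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE artist (mbid TEXT PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE artist_rel (artist_mbid TEXT, rel TEXT)")
cur.executemany("INSERT INTO artist VALUES (?, ?)",
                [("a1", "Artist One"), ("a2", "Artist Two")])
cur.executemany("INSERT INTO artist_rel VALUES (?, ?)",
                [("a1", "wikidata"), ("a2", "discogs")])

# Instead of computing artist rels in a CTE inside one big query,
# materialize them first (Postgres would use CREATE TEMP TABLE ... AS)
# and build an index on the artist mbid...
cur.execute("""
    CREATE TABLE tmp_artist_rels AS
    SELECT artist_mbid, rel FROM artist_rel
""")
cur.execute("CREATE INDEX tmp_artist_rels_idx ON tmp_artist_rels (artist_mbid)")

# ...then run the final query as a join against the indexed table.
rows = cur.execute("""
    SELECT a.name, r.rel
    FROM artist a
    JOIN tmp_artist_rels r ON r.artist_mbid = a.mbid
    ORDER BY a.name
""").fetchall()
```

Whether this beats a single-pass CTE depends on the planner, but it lets each intermediate result be timed and indexed independently.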
mayhem: did you mean the normalizing email or the Italian one? :)
Anyway, I'm dealing with both, but can you check the cover art one?
(heh, or the Spanish answer)
mayhem
both. :)
reosarevok
Answered all 3 now
mayhem
thanks
lucifer
mayhem: i updated the PR to move the debug playlists back to the post-recommendation step; only daily jams run hourly now. that was the easiest way to keep debug playlists and not generate them every hour, in my understanding.
mayhem
makes sense
lucifer
also, tested it just now on cron by changing my time zone to US/Hawaii. will merge once tests pass.
monkey: to be clear, i invalidated the match so that the mapper attempted to find a new match and this time it found the right one.
monkey
Ah, I see.
Any way we could improve the odd ones by using the release name in the listen?
lucifer
the mapper should be deterministic and find the same match each time unless there are code changes or new additions to the database. it's neither in this case, so not sure why it chose the wrong one earlier.
yes using release name in the listen is a planned improvement. mayhem can probably tell you more about the exact plan.
monkey
Thanks for the explanations!
lucifer
mayhem: can you review LB#2188. i'll do a release after that.
lucifer: hi, do you know which packet # that line is from by chance? and are you importing the packets into postgres first, or parsing the files directly?
lucifer
bitmap: parsing directly by reading the file as csv with the excel-tab dialect and then trying to parse the olddata/newdata columns as json. i don't know the packet number currently but can find it.
currently, no UI available so have to use api directly.
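The parsing approach described above can be sketched like so (a minimal illustration; the sample row and column layout here are invented, since the real packet files have more fields):

```python
import csv
import io
import json

# Invented sample data: a tab-separated file where the olddata/newdata
# columns hold JSON blobs.
sample = 'id\tolddata\tnewdata\n1\t{"name": "old"}\t{"name": "new"}\n'

rows = []
# Read the file as csv with the excel-tab dialect...
reader = csv.DictReader(io.StringIO(sample), dialect="excel-tab")
for row in reader:
    # ...then try to parse the olddata/newdata columns as json.
    for col in ("olddata", "newdata"):
        try:
            row[col] = json.loads(row[col])
        except json.JSONDecodeError:
            pass  # leave unparseable values as raw strings
    rows.append(row)
```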
alastairp
hah
Tpken?
lucifer
ah yes, typo. should be Token
monkey
One pain point in my personal workflow is that I use the Pano Scrobbler app on my phone, which connects to LFM/LB, but loved tracks are only sent to LFM. Now I'll be able to import them :)
lucifer
currently we only import loved tracks that have a mbid assigned to them by LFM.
we could change it to look up mbids ourselves from the mapper for tracks that do not have a mbid assigned, or maybe look it up for all.
alastairp: LFM uses track mbids in this endpoint. maybe that's where the confusion between track mbid and recording mbid in MLHD came from?
alastairp
lucifer: hmm, no, we were definitely seeing both recording and track mbids in the same field from the response of some API
I think the field name was "track_mbid" or something (as a fallback to pre-NGS)
lucifer
ah ok, i see.
alastairp
lucifer: I'll try and take a look at the BU PR when I get home
lucifer
for a couple of random users, i imported on a test account. i get: `{"inserted":3990,"invalid_mbid":0,"mbid_not_found":291,"missing_mbid":3745,"total":8026}`
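As a quick sanity check on those stats, the four outcome categories add up to the reported total:

```python
import json

# The stats returned by the test import, verbatim from above.
stats = json.loads(
    '{"inserted":3990,"invalid_mbid":0,"mbid_not_found":291,'
    '"missing_mbid":3745,"total":8026}'
)

# The four outcome categories partition the imported tracks,
# so they should sum to the total.
categories = ["inserted", "invalid_mbid", "mbid_not_found", "missing_mbid"]
assert sum(stats[c] for c in categories) == stats["total"]  # 8026
```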
alastairp
though, actually. maybe I'll just do it now, I trust all of your code :)
lucifer
it's possible some of those 291 not found are recording mbids
alastairp
ohhh, interesting
lucifer
hehe. no hurry :)
kepstin
i think i actually noticed recently that last.fm switched from returning recording ids to track ids in the same field, yeah
so older tracks imported from last.fm before some date will have recording ids, after they'll have track ids.
then we decided to add them anyway, but after identifying this, we changed the importer to put them in another field which wasn't the main LB recording mbid field
I don't think we've done anything since then - we could definitely go back and look at all the fields we have in our listens and see which ones can map to recordings and which to tracks
lucifer
with the mapper its much less relevant anyway.
alastairp
yes, right
still, it'd be interesting to see where lfm and the mapper differ
That's all the PRs I had ready lucifer, thanks for waiting
lucifer
np, thanks. will do a release later today
alastairp
!m monkey
BrainzBot
You're doing good work, monkey!
alastairp
monkey: how goes the upgrade?
I saw you closed dependabot, not sure if it was getting noisy due to changes you made, or was just being annoying
monkey
Took a break from diving into nivo internals.
I closed one dependabot PR that was doing the same upgrade I've started, but without implementing the required changes (i.e. still considered a point release despite some APIs and props being completely changed)
But I'm getting there
alastairp
nice
monkey
I'm gonna have to deploy that one to test.LB to ensure all the graphs are working properly
zas: hey, when you are around can you check the openresty logs for prod.caa-redirect.access.log? there seems to be a lot of requests being spammed from a particular user agent