“....If it may seem like a ‘frivolous’ project, I would like to reiterate that music is culture. Music enriches, inspires and connects. Personally, MusicBrainz has helped me a lot over the years to deepen my passion…”
2024-12-30 36553, 2024
aerozol[m]
mayhem: shall I write a word of appreciation and thanks on LinkedIn? Just checking in case there’s something I don’t know about Reiser/if there is an existing relationship
mayhem: monkey ansh aerozol sent yim emails updated with feedback. let me know if something is still broken.
2024-12-30 36547, 2024
lucifer[m]
aerozol: how do you want to coordinate the YIM release? I will trigger full dumps and imports on Jan 1, which takes almost half a day, and then it takes 3-4 hours to generate the YIM data. so we should be ready by the evening of Jan 1, European time.
2024-12-30 36549, 2024
texke has quit
2024-12-30 36531, 2024
mayhem[m]
<lucifer[m]> "mayhem: monkey ansh aerozol sent..." <- I don't seem to have one. checked spam folder too.
I see it in spam, sorry I had checked but for the wrong email account. Email looks good :+1:
2024-12-30 36552, 2024
vardhan_ joined the channel
2024-12-30 36508, 2024
Kladky joined the channel
2024-12-30 36538, 2024
lucifer[m]
mayhem: hi! are you around to discuss a couple of things? (not urgent)
2024-12-30 36556, 2024
mayhem[m]
sure.
2024-12-30 36559, 2024
vardhan_ has quit
2024-12-30 36503, 2024
lucifer[m]
to make listen ingestion in spark realtime, i am planning to do two things.
2024-12-30 36504, 2024
lucifer[m]
1. add a service that runs on michael and periodically polls listens from the RabbitMQ unique listens queue. it combines the listens with the existing incremental listens parquet and replaces it in place. every week or so, we re-partition the data in the spark cluster for performance.
2024-12-30 36523, 2024
Kladky has quit
2024-12-30 36555, 2024
lucifer[m]
2. to make deletes realtime-ish, after listens are deleted from timescale, write them to a deletion queue. the same service on michael deletes them daily before stats generation.
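A minimal sketch of the two steps above, in plain Python rather than Spark, assuming listens are dicts keyed by the (listened_at, user_id, recording_msid) triplet. The function names `merge_incremental` and `apply_deletes` are hypothetical and only illustrate the idea; the real service would read from RabbitMQ and rewrite a parquet file.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: a listen is identified by this triplet.
ListenKey = Tuple[int, int, str]


def key(listen: Dict) -> ListenKey:
    """Return the (listened_at, user_id, recording_msid) triplet of a listen."""
    return (listen["listened_at"], listen["user_id"], listen["recording_msid"])


def merge_incremental(existing: List[Dict], polled: Iterable[Dict]) -> List[Dict]:
    """Combine newly polled listens with the existing incremental set.

    Listens whose triplet is already present are kept once.
    """
    merged = {key(listen): listen for listen in existing}
    for listen in polled:
        merged.setdefault(key(listen), listen)
    return list(merged.values())


def apply_deletes(listens: List[Dict], deleted: Iterable[ListenKey]) -> List[Dict]:
    """Drop every listen whose triplet appears in the deletion queue."""
    deleted_keys = set(deleted)
    return [listen for listen in listens if key(listen) not in deleted_keys]
```

In this sketch `merge_incremental` would run on every poll, and `apply_deletes` once a day before stats generation.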
2024-12-30 36536, 2024
lucifer[m]
however, to account for the case where a user deleted a listen and then imported the same listen from a service, we need something more than the (listened_at, user_id, recording_msid) triplet.
2024-12-30 36502, 2024
lucifer[m]
so we need to assign a unique id to each listen.
2024-12-30 36541, 2024
lucifer[m]
so we need to add a column to the listens table, dumps, api, etc.
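To illustrate the edge case, here is a small sketch in plain Python. It assumes a hypothetical `listen_id` field (a random UUID here) and hypothetical helper names; none of this is the actual ListenBrainz schema. A queued delete for the original listen is applied after the user has re-imported a listen with the same triplet.

```python
import uuid


def make_listen(listened_at, user_id, recording_msid):
    """Build a listen dict with a hypothetical unique listen_id."""
    return {
        "listen_id": uuid.uuid4().hex,
        "listened_at": listened_at,
        "user_id": user_id,
        "recording_msid": recording_msid,
    }


def triplet(listen):
    return (listen["listened_at"], listen["user_id"], listen["recording_msid"])


def delete_by_triplet(listens, deleted_triplets):
    """Deletes keyed on the triplet cannot tell old and re-imported listens apart."""
    return [l for l in listens if triplet(l) not in deleted_triplets]


def delete_by_id(listens, deleted_ids):
    """Deletes keyed on a unique id only remove the listen that was deleted."""
    return [l for l in listens if l["listen_id"] not in deleted_ids]


original = make_listen(1700000000, 7, "msid-a")    # deleted by the user
reimported = make_listen(1700000000, 7, "msid-a")  # same triplet, imported later
store = [reimported]                                # state when the daily delete job runs

lost = delete_by_triplet(store, {triplet(original)})
kept = delete_by_id(store, {original["listen_id"]})
```

With the triplet as the key, the re-imported listen is removed along with the original; with a per-listen id it survives.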
2024-12-30 36520, 2024
mayhem[m]
lucifer[m]: that sounds like sooooo much fun
2024-12-30 36555, 2024
lucifer[m]
yup :/ but i don't think there is another way to handle this case.
2024-12-30 36532, 2024
mayhem[m]
has this ever happened before? is it worth all this trouble for a serious edge case?
2024-12-30 36555, 2024
vardhan joined the channel
2024-12-30 36525, 2024
lucifer[m]
it used to happen a lot with the existing lfm importer because we advised people to delete their listens and import from scratch.
2024-12-30 36516, 2024
lucifer[m]
with the new importer, i would expect it to happen much less often.
2024-12-30 36543, 2024
mayhem[m]
are we engineering for a case that no longer happens (or far less often)? and do people have 1 or 2 listens duplicated, or a whole pile of them?
2024-12-30 36510, 2024
mayhem[m]
could we do this without the extra column now and see how it behaves? if it still is problematic, go back and add the id?
2024-12-30 36519, 2024
lucifer[m]
sure.
2024-12-30 36555, 2024
mayhem[m]
sure, but to which of the 3 questions? :)
2024-12-30 36516, 2024
lucifer[m]
to doing it without the id for now and adding it later if needed.
2024-12-30 36528, 2024
Kladky joined the channel
2024-12-30 36533, 2024
mayhem[m]
then, I think we should do this.
2024-12-30 36554, 2024
lucifer[m]
for the first two questions, i am not sure how often it happens now.
Side note, adding the mail service to the musicbrainz-docker compose might be good?
2024-12-30 36524, 2024
Jade[m]1
My local testing I'm using... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/VAQcEBqDgtthqPkJYQEvppsx>)
2024-12-30 36558, 2024
aerozol[m]
Hey lucifer, I haven’t planned much for the release, the email will do most of the work. But if you tell me what time you would like to launch I will make a post on the socials
2024-12-30 36534, 2024
aerozol[m]
Never heard if we are re-calculating them later? That’s probably the main reason to post (e.g. to encourage people to sign up and import data to get their YIM still)
2024-12-30 36529, 2024
aerozol[m]
monkey: If the “your year in music coming soon” screen is all done in beta, is it still worth pushing it to prod?
2024-12-30 36522, 2024
lucifer[m]
aerozol: i don't have a particular time in mind. but i can generate the data on the 1st and hold off on sending emails until you are around, so we can send the emails and make the announcements at the same time.