“....If it may seem like a ‘frivolous’ project, I would like to reiterate that music is culture. Music enriches, inspires and connects. Personally, MusicBrainz has helped me a lot over the years to deepen my passion…”
mayhem: shall I write a word of appreciation and thanks on LinkedIn? Just checking in case there’s something I don’t know about Reiser/if there is an existing relationship
mayhem: monkey ansh aerozol sent yim emails updated with feedback. let me know if something is still broken.
aerozol: how do you want to coordinate the YIM release? I will trigger full dumps and imports on Jan 1, which takes almost half a day, and then 3-4 hours to generate the YIM data. so we should be ready by the evening of Jan 1, European time.
texke has quit
mayhem[m]
<lucifer[m]> "mayhem: monkey ansh aerozol sent..." <- I don't seem to have one. checked spam folder too.
I see it in spam, sorry I had checked but for the wrong email account. Email looks good :+1:
vardhan_ joined the channel
Kladky joined the channel
lucifer[m]
mayhem: hi! are you around to discuss a couple of things? (not urgent)
mayhem[m]
sure.
vardhan_ has quit
lucifer[m]
to make listen ingestion in spark realtime, i am planning to do two things.
1. add a service that runs on michael and polls listens from RabbitMQ unique listens queue periodically. Combines the listens with existing incremental listens parquet and replaces it in place. every week or so, we re-partition the data in the spark cluster for performance.
Kladky has quit
2. to make deletes realtime-ish, after listens are deleted from timescale, write them to a deletion queue. the same service on michael deletes them daily before stats generation.
however, to account for the case where a user deleted a listen and then imported the same listen from a service we need something more than listened_at, user_id, recording_msid triplet.
so we need to assign a unique id to each listen.
so need to add a column to listens table, dumps, api etc. etc.
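A minimal sketch of the edge case described above, in plain Python rather than Spark. The `Listen` type, the `listen_id` field, and both helper functions are hypothetical names used only for illustration; they are not taken from the ListenBrainz codebase. It assumes the daily job sees both the original listen and the re-imported one in the stored data, along with a deletion queue entry for the original.

```python
from typing import NamedTuple, Optional


class Listen(NamedTuple):
    listened_at: int
    user_id: int
    recording_msid: str
    listen_id: Optional[int] = None  # hypothetical unique id, not an existing column


def triplet(listen):
    """The (listened_at, user_id, recording_msid) key discussed above."""
    return (listen.listened_at, listen.user_id, listen.recording_msid)


def delete_by_triplet(listens, deletion_queue):
    """Drop every stored listen whose triplet matches a queued deletion."""
    doomed = {triplet(d) for d in deletion_queue}
    return [l for l in listens if triplet(l) not in doomed]


def delete_by_id(listens, deletion_queue):
    """Drop only the stored listens whose unique id was queued for deletion."""
    doomed = {d.listen_id for d in deletion_queue}
    return [l for l in listens if l.listen_id not in doomed]


# The user deleted a listen, then re-imported the same listen from a service.
original = Listen(1700000000, 42, "msid-a", listen_id=1)
reimported = Listen(1700000000, 42, "msid-a", listen_id=2)

stored = [original, reimported]  # state when the daily deletion job runs
queue = [original]               # only the original was deleted

after_triplet = delete_by_triplet(stored, queue)  # re-import is lost too
after_id = delete_by_id(stored, queue)            # re-import survives
```

With the triplet alone, the queued deletion cannot be told apart from the later re-import, so both rows are removed. With a per-listen id, only the deleted row goes.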
mayhem[m]
lucifer[m]: that sounds like sooooo much fun
lucifer[m]
yup :/ but i don't think there is another way to handle this case.
mayhem[m]
has this ever happened before? is it worth all this trouble for a serious edge case?
vardhan joined the channel
lucifer[m]
it used to happen a lot with the existing lfm importer because we advised people to delete their listens and import from scratch.
with the new importer, i would expect it to be not that often.
mayhem[m]
are we engineering for a case that no longer happens (or far less often)? and do people have 1 or 2 listens duplicated, or a whole pile of them?
could we do this without the extra column now and see how it behaves? if it still is problematic, go back and add the id?
lucifer[m]
sure.
mayhem[m]
sure, but to which of the 3 questions? :)
lucifer[m]
to doing it without the id for now and adding it later if needed.
Kladky joined the channel
mayhem[m]
then, I think we should do this.
lucifer[m]
for the first two questions, i am not sure how often it happens now.
Hey lucifer, I haven’t planned much for the release, the email will do most of the work. But if you tell me what time you would like to launch I will make a post on the socials
Never heard if we are re-calculating them later? That’s probably the main reason to post (e.g. to encourage people to sign up and import data to get their YIM still)
monkey: If the “your year in music coming soon” screen is all done in beta, is it still worth pushing it to prod?
lucifer[m]
aerozol: i don't have a particular time in mind. but i can generate the data on the 1st and hold off on sending emails until you are around, and we can send the emails and make the announcements at the same time.