well, ok. we HAD a check. there is a define for that.
lucifer: if you put in the check, I'll clean up the DB.
lucifer
on it
ruaok: for the error to show, how does this sound? "Value for key listened_at is too low. listened_at value should be of 2002 (Last.fm founding year) or later"
ruaok
hmmm.
LAST_FM_FOUNDING_YEAR = 2002
hmmm. I have a recollection that that ought to be 2005
alastairp: I have a vague memory of RJ telling us to ignore everything prior to a certain date.
founding year was defined in the main codebase at some point. now it's just in spark, which seems odd since listen validation is being done on the flask side.
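(editor's sketch) The check being discussed could look roughly like this. It is a minimal sketch, not the actual ListenBrainz code: `MIN_LISTENED_AT` and `validate_listened_at` are hypothetical names, and taking the cutoff as 1 January 2002 UTC is an assumption, since the discussion leaves open whether the year should be 2002 or 2005.

```python
from datetime import datetime, timezone

# Value quoted in the discussion; whether it should be 2005 is unresolved.
LAST_FM_FOUNDING_YEAR = 2002

# Assumption: the cutoff is midnight UTC on 1 January of the founding year,
# expressed as a Unix epoch timestamp (listened_at is an epoch timestamp).
MIN_LISTENED_AT = int(
    datetime(LAST_FM_FOUNDING_YEAR, 1, 1, tzinfo=timezone.utc).timestamp()
)


def validate_listened_at(listened_at: int) -> None:
    """Raise ValueError if listened_at is earlier than the cutoff."""
    if listened_at < MIN_LISTENED_AT:
        raise ValueError(
            "Value for key listened_at is too low. listened_at value "
            "should be of %d (Last.fm founding year) or later"
            % LAST_FM_FOUNDING_YEAR
        )
```

Keeping the year and the derived epoch value next to each other would avoid the "there was an epoch timestamp as well" situation where the two drift apart.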
lucifer
yeah
i think at some point the check got deleted.
then stuff got moved around as usual refactoring.
ruaok
I want to find the place where we had defined it originally
there was an epoch timestamp as well.
lucifer
trying to find that in git history.
ruaok
us both
monkey is going to deploy huesound to test.LB, making some final UI refinements
also +100 on the buttons for saying "this is not correct" and a button for saying "this is the one it really is"
<333
ruaok thinks about scalability for the incremental dumps
also re listen/scrobble dates. there are tools for manually adding listened stuff to (last.fm) where you could conceivably add something listened to in 1998 to l.fm
ruaok
lucifer: one thing I hadn't yet considered with the "save incremental listens to a file" approach was that it makes scaling timescale_writer much much harder.
so, I am going back to looking at being able to sort on created, rather than listened_at.
I'm going to try creating an index on created, but I think that in order for timescale to use it, we'd need to upgrade ts.
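(editor's sketch) A small illustration of why selecting on created rather than listened_at can matter for incremental dumps. This is a sketch under assumptions: the `Listen` fields and both helper functions are hypothetical, and the scenario (a listen submitted long after it was played) is an illustration, not something stated in the discussion.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Listen:
    user: str
    listened_at: int  # when the track was played (supplied by the client)
    created: int      # when the row was inserted (set by the server)


def dump_by_listened_at(listens: Iterable[Listen], start: int, end: int) -> List[Listen]:
    """Listens whose play time falls in [start, end), ordered by play time."""
    picked = [l for l in listens if start <= l.listened_at < end]
    return sorted(picked, key=lambda l: l.listened_at)


def dump_by_created(listens: Iterable[Listen], start: int, end: int) -> List[Listen]:
    """Listens inserted in [start, end), ordered by insertion time."""
    picked = [l for l in listens if start <= l.created < end]
    return sorted(picked, key=lambda l: l.created)


# A listen played at t=100 but only submitted at t=250.
late_import = Listen(user="example", listened_at=100, created=250)
fresh = Listen(user="example", listened_at=240, created=241)
all_listens = [late_import, fresh]
```

For a dump window of [200, 300), `dump_by_created(all_listens, 200, 300)` returns both listens, while `dump_by_listened_at(all_listens, 200, 300)` returns only `fresh` and misses the late import.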
lucifer
yeah, i had in mind that ts_writer could be impacted. but i was unsure about the exact numbers.
ruaok
if the spark cluster could do it at the end of the unique stream, that would be better.
lucifer
we have to upgrade ts anyways sooner or later so that's fine
ruaok
but, let me investigate this. I hope this will help and let us skip this extra complexity.
lucifer
can you elaborate on that?
ruaok
on this ? "if the spark cluster could do it at the end of the unique stream, that would be better."
lucifer
yes
oh nvm, i get it now. connect to rmq as a consumer?
ruaok
because the unique stream could come from multiple timescale_writers.
yes, that.
and then the files are already there.
lucifer
sure that's a good plan in the long term. not sure we need that now.
ruaok
we don't need multiple timescale writers yet, no.
lucifer
i think adding a container to do this on kiss/gaga and shipping to spark is a good middle ground for now.
ruaok
but we must resist fucking future ruaok and lucifer. because those two will need to wake up in the middle of the night and deploy more timescale-writers.
"i think adding a container to do this on kiss/gaga and shipping to spark is a good middle ground for now."
it is tempting to think that way. but we're pushing an impending disaster ahead of us. knowingly.
lucifer
oh definitely need to avoid that.
ruaok
yeah.
lucifer
yes but we don't have the MB on spark yet and just getting that to spark is huge work
ruaok
so, if we NEED to make this work differently RMQ on J5 is the way to go.
yeah, loads.
lucifer
but if that needs to be done then let's try to plan it then once alastairp is around or later this week?
even if we don't implement it rn, having an idea of how to go about it would be good.
[listenbrainz-server] mayhem opened pull request #1678 (master…better-incremental-dumps): (LB-980) AISOTT: Improve incremental dumps by sorting on created, rather than listened_at https://github.com/metabrainz/listenbrainz-serv...
ruaok
overheard in catalunya: "I made fideuà like a biryani and now I don't know whom to hide from"
ah ok. i asked because i remember having this discussion with you about another delete endpoint (unfollow user iirc) where we decided it's fine to return 200 if nothing was deleted.
alastairp
oh yes, that's right
lucifer
this is a different endpoint but we should probably be consistent unless there's a reason not to be.
alastairp
so the question is: when is it useful for a user to know the difference between something being successfully deleted, and something not existing?
lucifer
yup right. i cannot think of a use case currently where the distinction would matter.
but get_or_create can be a close analogy. we have that in our own codebase and also use cases where the distinction is useful.
i suggest we document the current behaviour and consider again if someone complains.
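(editor's sketch) The two behaviours being weighed can be sketched as below. It uses plain Python rather than Flask to stay self-contained; `delete_item`, `delete_item_strict`, the `store` dict and the response bodies are all hypothetical and are not the real endpoint.

```python
from typing import Dict, Tuple

Response = Tuple[Dict[str, str], int]


def delete_item(store: Dict[str, str], key: str) -> Response:
    """Idempotent delete: 200 whether or not `key` existed."""
    if key in store:
        del store[key]
    return {"status": "ok"}, 200


def delete_item_strict(store: Dict[str, str], key: str) -> Response:
    """Alternative: tell the caller when there was nothing to delete."""
    if key not in store:
        return {"error": "not found"}, 404
    del store[key]
    return {"status": "ok"}, 200
```

With the first variant a client can safely retry a delete after a timeout and always gets the same answer. The second only helps when the caller actually needs the distinction, which is the get_or_create-style case mentioned above.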
monkey
akshat: Yes, I noticed the huesound page wasn't working on mobile. I did work on it this morning and improved it some. Just deployed the result if you want to have a look
ruaok
alastairp: any chance you might have some time for the huesound PR today?
it would be nice to get that out soon
alastairp
yes, should be no problem. let me just finish some other reviews of lucifer's first
ruaok
👍
lucifer
:-D
alastairp
ruaok: did I see that you were adding an index to listens table?