if you assume for a minute that this will be available on labs, then you should be able to start thinking about the spark aspects of this, no?
2022-06-07 15810, 2022
mayhem
and the date will obviously be adjusted to a time window centered around now()
2022-06-07 15844, 2022
lucifer
makes sense. i forgot what exactly we intended to do with spark in this. let me reread our previous discussion.
2022-06-07 15828, 2022
lucifer
>2. In spark, create a job that downloads this list and then for each user calculates the intersection of their recent discovery data and the artists in the new releases.
2022-06-07 15831, 2022
mayhem
from discovery tracks distill a list of artists and when the user last listened to a track by that artist.
2022-06-07 15838, 2022
mayhem
yeah, that.
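[editor's sketch] The "distill" step above could look roughly like this in plain Python. It is only an illustration of the logic, not the actual Spark job; the tuple layout `(artist_mbid, listened_at)` and the function name `artists_last_listened` are assumptions made up for this example.

```python
from datetime import datetime


def artists_last_listened(listens):
    """Map each artist MBID to the time of the most recent listen.

    `listens` is an iterable of (artist_mbid, listened_at) pairs.
    This layout is hypothetical, chosen only for the sketch.
    """
    last = {}
    for artist_mbid, listened_at in listens:
        # Keep the newest timestamp seen so far for this artist.
        if artist_mbid not in last or listened_at > last[artist_mbid]:
            last[artist_mbid] = listened_at
    return last


# Made-up sample data: two listens for artist-a, one for artist-b.
listens = [
    ("artist-a", datetime(2022, 5, 1)),
    ("artist-b", datetime(2022, 4, 10)),
    ("artist-a", datetime(2022, 6, 1)),
]
summary = artists_last_listened(listens)
```

In Spark this would presumably be a group-by on the artist with a max over the listen time, but that mapping is a guess and not something settled in this discussion.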
2022-06-07 15840, 2022
mayhem
but, the output of that query can be used directly by chinmay to display the same data for a "site wide" view. the per-user view will just be a smaller subset.
2022-06-07 15822, 2022
lucifer
yes makes sense.
2022-06-07 15806, 2022
lucifer
so with that query as input we want (artist_mbid, last_listened) as output?
2022-06-07 15822, 2022
mayhem
no, we want to filter the list and remove all releases that do not contain at least one artist from the discovered tracks. return the data in the same format as the input.
2022-06-07 15819, 2022
lucifer
i see.
2022-06-07 15808, 2022
lucifer
so if the user hasn't listened to a track from the release artist we remove that release from that user's view.
2022-06-07 15825, 2022
mayhem
yes.
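[editor's sketch] The filter agreed on above (drop every release with no artist the user has listened to, and return rows in the same shape as the input) can be sketched in plain Python as follows. The field names `release` and `artist_mbids` and the function name `filter_releases` are assumptions for illustration; the real new-releases format is not shown in this log.

```python
def filter_releases(releases, listened_artists):
    """Keep releases that share at least one artist with the user's listens.

    Rows are returned unchanged, so the output has the same format as
    the input. Field names here are hypothetical.
    """
    listened = set(listened_artists)
    return [r for r in releases if listened.intersection(r["artist_mbids"])]


# Made-up sample data.
releases = [
    {"release": "Album One", "artist_mbids": ["artist-a", "artist-c"]},
    {"release": "Album Two", "artist_mbids": ["artist-d"]},
]
kept = filter_releases(releases, {"artist-a", "artist-b"})
```

Here "Album One" is kept because the user listened to artist-a, even though artist-c is unknown to them; "Album Two" is removed.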
2022-06-07 15841, 2022
mayhem
and if we're going for bonus points, could we create a confidence score?
2022-06-07 15852, 2022
lucifer
do we want to restrict the time range like not listened in last 3 months or never?
2022-06-07 15812, 2022
mayhem
you listened to 1 track by an artist on a release: lowest score. if you listened to a pile of tracks: high score.
2022-06-07 15832, 2022
mayhem
make the time range configurable, please.
2022-06-07 15833, 2022
lucifer
yes should be doable.
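[editor's sketch] One possible reading of the confidence score and the configurable time range, in plain Python. It assumes the score is simply the number of listens to a release's artists inside the window, and that the window limits which listens count. Both are guesses, as are the names `score_releases`, `window_days` and the field names.

```python
from collections import Counter
from datetime import datetime, timedelta


def score_releases(releases, listens, now, window_days=90):
    """Attach a naive confidence score to each matching release.

    confidence = number of listens, within the last `window_days`, to any
    artist on the release. Releases with a score of 0 are dropped.
    All names and the scoring rule are assumptions for this sketch.
    """
    cutoff = now - timedelta(days=window_days)
    # Count listens per artist, ignoring anything older than the cutoff.
    counts = Counter(artist for artist, listened_at in listens if listened_at >= cutoff)
    scored = []
    for release in releases:
        confidence = sum(counts[a] for a in release["artist_mbids"])
        if confidence > 0:
            scored.append({**release, "confidence": confidence})
    # Highest confidence first; ties keep their input order.
    return sorted(scored, key=lambda r: r["confidence"], reverse=True)


# Made-up sample data.
now = datetime(2022, 6, 7)
listens = [
    ("artist-a", now - timedelta(days=1)),
    ("artist-a", now - timedelta(days=5)),
    ("artist-a", now - timedelta(days=20)),
    ("artist-b", now - timedelta(days=30)),
    ("artist-c", now - timedelta(days=200)),
]
releases = [
    {"release": "Album One", "artist_mbids": ["artist-b"]},
    {"release": "Album Two", "artist_mbids": ["artist-a", "artist-x"]},
    {"release": "Album Three", "artist_mbids": ["artist-c"]},
]
scored = score_releases(releases, listens, now, window_days=90)
lax = score_releases(releases, listens, now, window_days=365)
```

With the 90-day window "Album Three" drops out because its only listen is 200 days old; widening the window to 365 days brings it back, which is the "more lax" setting.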
2022-06-07 15853, 2022
mayhem
I think at first we will want to be more lax to draw in more data. but over time we might want to be more constricting.
2022-06-07 15820, 2022
mayhem
I fear that the output will be 1-2 releases for most people, which is not terribly fun to look at.
2022-06-07 15828, 2022
mayhem
and if we have too much data to show, we can dial it back
2022-06-07 15849, 2022
lucifer
and if the user only listened to 1 artist on an album with multiple artists, we still include it, right?
2022-06-07 15818, 2022
mayhem
yes, be as greedy in collecting releases as we can to start with.
2022-06-07 15826, 2022
lucifer
makes sense
2022-06-07 15835, 2022
mayhem
we can always filter more shit out, esp if we have a confidence score.
2022-06-07 15818, 2022
lucifer
yes sounds good
2022-06-07 15830, 2022
mayhem
ok, great.
2022-06-07 15852, 2022
mayhem
sorry for being so absent. I soo hope that life returns to some form of normal next week.
2022-06-07 15842, 2022
alastairp
hullo
2022-06-07 15845, 2022
lucifer
i also looked at the OAuth btw. i think the smallest unit of testable work is implementing one form of grant. so thinking to implement the one we use with pythonbrainz and test with LB.
2022-06-07 15851, 2022
alastairp
sorry I missed the meeting yesterday. forgot it was monday!
2022-06-07 15852, 2022
lucifer
heh np :D
2022-06-07 15810, 2022
alastairp
lucifer: one form of grant sounds neat
2022-06-07 15837, 2022
mayhem
lucifer: https://github.com/metabrainz/listenbrainz-server… on this PR, I forget if we discussed whether the user_setting table will have a JSONB field or individual columns that we will add as we add user options. do you remember?
2022-06-07 15840, 2022
lucifer
mayhem: iirc we decided to do a mix of those. one column for each type of settings. for example, one jsonb column for all troi related settings, one column for timezone, and so on.
2022-06-07 15838, 2022
mayhem
ok, then that PR is spot on, save for the UI being in flask/html rather than react.
2022-06-07 15841, 2022
mayhem
good good.
2022-06-07 15809, 2022
lucifer
mayhem: i see. that page is still in flask so makes sense for it to be in flask for the time being.
hmm, good point. so maybe it doesn't make sense to have an optional dev environment but always run it during tests
2022-06-07 15833, 2022
alastairp
OK, leave that for now - let me think about it to see if there's a better way. maybe it does make sense to always have a bb database too. we decided that the MB database was required
2022-06-07 15827, 2022
ansh
Or we can merge into the feature branch for now. After I add the edition group, then we can merge to master and deploy ?
2022-06-07 15816, 2022
alastairp
no, that's fine - let's merge and deploy to production. as we said last week it would be great to get small changes merged as quickly as possible