(even if we just have our own stuff I think that kind of display, with inviting image previews, would work well for an ‘explore’ section)
2022-10-23 29603, 2022
aerozol
monkey: Hey, the bubbles are looking good already!! Some font tweaks, a tighter crop, and options to add an album cover, and I would think it’s in the bag?
mayhem, lucifer: about generated playlists, are listens of the same weight? -> If I listen to a generated playlist, does Troi think I like the songs in it and then tend to suggest more of the same trend (not sure I'm clear)?
2022-10-23 29637, 2022
lucifer
zas: yes, all listens have the same weight. if you listen to a generated playlist, it would indeed think that you like it and suggest more, but iirc we take 6 months of listens, so a few playlists are unlikely to affect the overall results.
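A minimal sketch of what "all listens have the same weight over ~6 months" could mean when turning a user's listen history into play counts for the recommender; the function name and the (recording_mbid, listened_at) tuple shape are assumptions for illustration, not actual ListenBrainz code.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def playcounts_for_user(listens, window_days=180):
    """Count listens per recording over roughly the last 6 months.

    `listens` is assumed to be an iterable of (recording_mbid, listened_at)
    tuples with timezone-aware datetimes; every listen counts equally.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=window_days)
    # The resulting play count is the implicit "rating" the CF model sees.
    return Counter(mbid for mbid, listened_at in listens if listened_at >= cutoff)
```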
2022-10-23 29647, 2022
zas
mayhem: on the tracklist you linked, since it is based on top recordings and my top recordings are somehow bugged (some are very high in the list because of a spotify bug at some point), some suggestions are a bit weird, though I discovered nice music among the suggested tracks.
2022-10-23 29618, 2022
zas
lucifer: another question, how does the duration of listens impact weights?
2022-10-23 29626, 2022
lucifer
zas: so far it doesn't affect the weight. only the number of times a recording is listened to affects the weight currently.
2022-10-23 29600, 2022
zas
huh? so if I listen to 20% of a song, it thinks I somehow like it?
2022-10-23 29609, 2022
zas
which is usually the reverse
2022-10-23 29641, 2022
lucifer
right. it's a pending improvement to make.
2022-10-23 29644, 2022
lucifer
one thing that still helps is that most users won't go back to a song they skipped, say, halfway, so its listen count will be very low, whereas the ones they like they will have listened to multiple times.
2022-10-23 29658, 2022
lucifer
so the count acts as a proxy. but yes adding duration/support would help make it more explicit.
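A hypothetical way a duration signal could discount partial listens, as hinted at above; this is not current ListenBrainz behaviour (which is one unit of weight per listen), and the threshold value is arbitrary.

```python
def listen_weight(played_ms, track_length_ms, min_fraction=0.5):
    """Hypothetical: weight a listen by how much of the track was played.

    A listen covering less than `min_fraction` of the track (a likely skip)
    is discounted proportionally; anything above counts as a full listen.
    """
    if track_length_ms <= 0:
        return 1.0  # no duration info available, fall back to a plain count
    fraction = min(played_ms / track_length_ms, 1.0)
    return fraction if fraction < min_fraction else 1.0
```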
2022-10-23 29603, 2022
zas
the fact that I don't listen to a track twice doesn't mean I don't like it, but for sure the fact that I don't listen to a track in full usually means I'm not fond of it
2022-10-23 29633, 2022
zas
(or I didn't have time to, but that's rare I guess)
2022-10-23 29645, 2022
lucifer
yes makes sense.
2022-10-23 29643, 2022
zas
Also, if I listen to an album in full once (so 1 listen per track), it usually means I like the music, so I guess listening to multiple tracks by the same artist is a good indicator too, and even more so if one or more albums were listened to in full.
2022-10-23 29659, 2022
zas
Another thing: if I like a band, I'm always curious about what the band members do outside this band, and I expect such suggestions: A & B are in band C, A is also in band D, and B has a solo project; I expect tracks from D and from B's solo project to appear, even though I've never listened to a track from them. Is this taken into account somehow atm?
2022-10-23 29635, 2022
mayhem
moooin!
2022-10-23 29612, 2022
mayhem
aerozol: I see you want to do the compositing with the cover art first and then the image on top? that might work better, indeed.
2022-10-23 29635, 2022
mayhem
are the covers in that image you posted transparent?
2022-10-23 29600, 2022
mayhem
we need to find a place to collect all of these images. can I fetch them out of the figma?
2022-10-23 29629, 2022
mayhem
lucifer: I had a really rough time getting to sleep last night, because the similarity data made me realize a big big thing.
2022-10-23 29605, 2022
mayhem
throwing all of the listens at the similarity alg simply overfits it and the result is.... noise.
2022-10-23 29652, 2022
mayhem
we're getting quite good results with TWO hits; this suggests that the optimum window is some time larger than that.
2022-10-23 29605, 2022
mayhem
let's call it 30 days.
2022-10-23 29601, 2022
mayhem
we should never apply our alg to more data than this window. ever.
2022-10-23 29642, 2022
mayhem
instead, what we should do is calculate many windows over this data across time: chunks made up of 90 days of data.
2022-10-23 29601, 2022
mayhem
(combining 3 chunks at a time)
2022-10-23 29651, 2022
mayhem
the key insight is that we will gain the best data when we analyze tracks in the time when a given track was released -- when people will have been listening to it with other tracks that were released about the same time.
2022-10-23 29646, 2022
mayhem
e.g. a 2000s track will have the best play co-incidences when analyzed with listens from the same era.
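A rough sketch of the chunking idea under discussion, assuming listens arrive as (user_id, recording_mbid, timestamp) tuples with naive UTC datetimes: co-listen counts are computed per fixed-size time chunk instead of over the whole history, so a track is mostly compared against listens from its own era. The session logic of the real job is omitted; the names and the 30-day chunk size are illustrative.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from itertools import combinations

def cooccurrences_by_chunk(listens, chunk_days=30):
    """Count, per time chunk, how often two recordings are listened to
    by the same user within that chunk.

    Returns a dict: chunk index -> Counter of (mbid_a, mbid_b) -> count.
    """
    epoch = datetime(1970, 1, 1)
    chunk_users = defaultdict(lambda: defaultdict(set))  # chunk -> user -> recordings
    for user_id, mbid, listened_at in listens:
        chunk = (listened_at - epoch) // timedelta(days=chunk_days)
        chunk_users[chunk][user_id].add(mbid)

    counts = {}
    for chunk, users in chunk_users.items():
        pair_counts = Counter()
        for recordings in users.values():
            for a, b in combinations(sorted(recordings), 2):
                pair_counts[(a, b)] += 1
        counts[chunk] = pair_counts
    return counts
```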
2022-10-23 29611, 2022
mayhem
I'll likely need to make some graphs to make this more clear.
2022-10-23 29601, 2022
lucifer
zas: currently not considered. what we intend to do is build various types of recommendation algorithms and collate their results. this artist correlation cannot be reliably inferred by the CF algorithm we use currently. the only correlation it can make is that users who listened to artist A listened to artist B as well. however, we can build another algorithm which utilises these artist correlations to suggest tracks.
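One hedged way the "build various algorithms and collate their results" step could look: blending scores from the existing CF recommender with a hypothetical artist-relationship recommender. Both score dicts, the function name and the 70/30 split are assumptions for illustration only.

```python
def blend_recommendations(cf_scores, artist_rel_scores, cf_weight=0.7):
    """Collate two recommenders' outputs into a single ranked list.

    Both inputs are dicts of recording_mbid -> score in [0, 1]; the
    recommenders themselves are assumed to exist elsewhere.
    """
    combined = {}
    for mbid in set(cf_scores) | set(artist_rel_scores):
        combined[mbid] = (cf_weight * cf_scores.get(mbid, 0.0)
                          + (1.0 - cf_weight) * artist_rel_scores.get(mbid, 0.0))
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```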
2022-10-23 29626, 2022
lucifer
mayhem: i see. sounds good to do multiple runs over small chunks.
2022-10-23 29658, 2022
lucifer
i think there is some value in doing a few larger chunks as well, so that we also get similarity between tracks which were released at different points in time, but there's no reason these multiple runs couldn't capture that. we can always experiment and see how it goes.
2022-10-23 29602, 2022
mayhem
first, let's see about finding a well-tuned window size. then we need to explore the temporal nature of the data.
2022-10-23 29628, 2022
mayhem
larger chunks are not the answer, I think.
2022-10-23 29629, 2022
lucifer
to make sure iiuc we will store scores of various windows separately and not aggregate them?
2022-10-23 29650, 2022
mayhem
you always use the same-sized chunk, but you calculate them starting from older starting points.
2022-10-23 29603, 2022
mayhem
and then combine the data from chunks to make a real result.
2022-10-23 29627, 2022
lucifer
but that would act the same as the current algorithm i think.
2022-10-23 29649, 2022
mayhem
if you used *all* chunks for searching, yes.
2022-10-23 29658, 2022
mayhem
but the key is to select only a few chunks for searching.
2022-10-23 29611, 2022
lucifer
whether you calculate the sums for Jan-Mar and Apr-Jul and then add those, or you calculate Jan-Jul at once, it would be the same.
2022-10-23 29625, 2022
mayhem
we need to be mindful of both the indexing chunks and which index chunks are used in a search.
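A sketch of the "only a few chunks for searching" point, assuming a chunk -> Counter mapping like the one sketched earlier: to find similar tracks for a seed recording, merge just the chunks around the seed track's era (three chunks here, matching the "combining 3 chunks at a time" remark) rather than every indexed chunk. The names are hypothetical.

```python
from collections import Counter

def similar_in_era(chunk_counts, seed_mbid, seed_chunk, chunks_each_side=1):
    """Merge co-occurrence counts for a seed recording from only the chunks
    near its era, instead of searching every indexed chunk."""
    merged = Counter()
    for chunk in range(seed_chunk - chunks_each_side, seed_chunk + chunks_each_side + 1):
        for (a, b), count in chunk_counts.get(chunk, Counter()).items():
            if seed_mbid in (a, b):
                other = b if a == seed_mbid else a
                merged[other] += count
    return merged.most_common()  # most similar recordings first
```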
2022-10-23 29635, 2022
lucifer
hmm, i see.
2022-10-23 29635, 2022
mayhem
lucifer: correct.
2022-10-23 29657, 2022
lucifer
i am not sure i understand the plan fully currently but let's try and see. it'll probably become clearer in due time.
2022-10-23 29624, 2022
mayhem
like I said, I am not explaining this well. my brain was racing until 4am when I worked this out and I'm now poorly slept as a result.
2022-10-23 29638, 2022
mayhem
I'll draw a graph about this later today, that will make it more clear.
this is up on beta. save to LB then use export to spotify. (to see the button you'll have to change the url to beta.lb manually and also be logged in there, because open as playlist will go to prod LB)
2022-10-23 29638, 2022
lucifer
also, window sizes of 30 and 90 days have been generated as well. algorithms available: `session_based_days_7_session_300`, `session_based_days_30_session_300` and `session_based_days_90_session_300`.
2022-10-23 29629, 2022
lucifer
the overall lookup is now slower because the 90-day run generated too many rows and there is no minimum threshold in place.
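A minimal sketch of the kind of minimum threshold being discussed: drop rare recording pairs before they are written into the similarity index, so the table stays small and lookups stay fast. The cutoff value and function name are illustrative, not the real configuration.

```python
def apply_min_threshold(pair_counts, min_count=2):
    """Keep only recording pairs whose co-occurrence count reaches
    `min_count`; everything below is assumed to be noise."""
    return {pair: count for pair, count in pair_counts.items() if count >= min_count}
```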
2022-10-23 29629, 2022
Maxr1998 joined the channel
2022-10-23 29601, 2022
Maxr1998_ has quit
2022-10-23 29623, 2022
CatQuest
happy Diwali
2022-10-23 29629, 2022
CatQuest
(:D)
2022-10-23 29645, 2022
aerozol
zas: lucifer: afaik skipping a song or video early on is one of the biggest indicators of 'didn't like' that TikTok etc uses to suggest stuff (and they are very good at creating personalized feeds...)
2022-10-23 29624, 2022
aerozol
But it does seem like that would be quite a new piece of code under the LB hood?
2022-10-23 29652, 2022
aerozol
mayhem: I was going to just do a transparent layer on top with shadows etc but then realized I could just do one image on top and nothing underneath. That one I posted on irc is good for you to use
2022-10-23 29607, 2022
mayhem
hiya!
2022-10-23 29618, 2022
aerozol
Everything I've done is also on the figma, have at it 👍
2022-10-23 29633, 2022
mayhem
I finally figured out later that this is what you had in mind. I'll play with that after I finish the board meeting prep
2022-10-23 29629, 2022
aerozol
Cool - could still do a jpg underneath if image size is an issue (got it to 300kb or so)
2022-10-23 29646, 2022
mayhem
should be fine.
2022-10-23 29616, 2022
aerozol
Happy Diwali all! (thus finishes my morning irc catch-up)
2022-10-23 29640, 2022
lucifer
aerozol: yes. agreed. LB indeed currently doesn't have a way to track skips. however this similar recordings thing we are currently working on can infer skips.
2022-10-23 29657, 2022
lucifer
we haven't reached that point yet though.
2022-10-23 29632, 2022
aerozol
Ooh that sounds really promising. Afaik the modern way to figure out likes is to not even have users like or dislike stuff, just to track where they 'pause' and watch something. Which skews it to clickbaitey stuff but it seems to glue people to their phones pretty good
2022-10-23 29646, 2022
lucifer
yeah. that probably works well for reels like stuff but also needs a lot of tracking afaiu.
2022-10-23 29658, 2022
lucifer
spotify does this a lot. for instance, it tracks why a track was paused, at what points, and so on.
2022-10-23 29600, 2022
akshaaatt
Happy Diwali everyone!❤️❤️❤️❤️
2022-10-23 29613, 2022
lucifer
happy diwali! 🎉🎉
2022-10-23 29624, 2022
ansh
Happy Diwali!! 🎉 🎉
2022-10-23 29602, 2022
aerozol
🎉🎉🎉
2022-10-23 29615, 2022
mayhem
happy diwali!!
2022-10-23 29630, 2022
mayhem
aerozol: I think I like the previous LPs-on-the-floor image better.
2022-10-23 29601, 2022
mayhem
the new one is darker and has more of a margin, which I think is not needed.
2022-10-23 29641, 2022
mayhem
and every time we change anything that moves the cover art around the image, I have to painstakingly align the images again. it took me about an hour to get it right the first time.
2022-10-23 29612, 2022
aerozol
mayhem: you ran with a quick screenshot/snip that I posted 😜
2022-10-23 29627, 2022
aerozol
I'll tweak the pic to match the lineup in a little bit
2022-10-23 29628, 2022
mayhem
ahhh, ok. I can wait.
2022-10-23 29633, 2022
mayhem
k, thanks!
2022-10-23 29659, 2022
aerozol
I'll lighten it up a bit too
2022-10-23 29648, 2022
mayhem
lucifer: at a first glance, those similar data sets look quite interesting.
2022-10-23 29624, 2022
mayhem
but due to the size of the index (without the threshold) and the fact that I have to make one request for every track (and I process 100 of them), it isn't feasible to work on this.
2022-10-23 29658, 2022
mayhem
can you please add the threshold and also enable more than one MBID to be looked up with the similarity endpoint? then things will be snappy. thanks!
2022-10-23 29600, 2022
v6lur joined the channel
2022-10-23 29609, 2022
lucifer
mayhem: yes, makes sense. we'll have to delete the existing table to get rid of those extra entries. i'll add a configurable threshold parameter.
2022-10-23 29628, 2022
mayhem
perfect.
2022-10-23 29642, 2022
lucifer
one thing you could try is passing the count parameter in the meantime. maybe that speeds up the lookup a bit.
2022-10-23 29602, 2022
mayhem
we need to adjust the data set hoster to take a single arg (algorithm) and a list of args (MBIDs). thoughts on how to do that?
2022-10-23 29651, 2022
mayhem
thanks, but it's getting late here. I'll leave this be for today and pick it up again tomorrow.
2022-10-23 29607, 2022
lucifer
simplest way would be to pass in algorithm every time.
2022-10-23 29609, 2022
mayhem
but if aerozol comes up with something, I'll play with that. that's easier to understand when tired. :)
2022-10-23 29640, 2022
mayhem
lucifer: yes, but that suggests that we would honor the algorithm being different MBID by MBID.
2022-10-23 29650, 2022
mayhem
which is not necessary, I would say.
2022-10-23 29621, 2022
lucifer
yeah. another possible solution is to pass the list normally: `recording_mbids` as a list of mbids, instead of doing `recording_mbid`: mbid for each item. the query can have extra logic to interpret the param as a list.
2022-10-23 29644, 2022
mayhem
not quite sure I follow. perhaps better to discuss this tomorrow after more rest.
2022-10-23 29637, 2022
lucifer
we do `[{'x': 1}, {'x': 5}]` currently. instead we could do `[{'x': [1, 5]}]`.
2022-10-23 29658, 2022
lucifer
but yes sounds good to discuss later.
2022-10-23 29613, 2022
mayhem
indeed. that is the easy part. how do you express that in HTTP parameters for the GET?
2022-10-23 29654, 2022
lucifer
the easy way would be to disallow arrays in GET and only allow them in POST.
2022-10-23 29656, 2022
aerozol
What’s the listenbrainz url again for the grids/visualisations?
2022-10-23 29626, 2022
lucifer
other ways are to specify some custom delimiter like a `,` or `;` for multiple values, or to specify the same param multiple times as some APIs do. in any case, i think it'll be inconsistent with the POST way, but that's fine imo.
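One hedged way the GET/POST split discussed here could be handled, sketched with Flask: POST accepts a JSON list of recording MBIDs plus a single algorithm, while GET falls back to a comma-separated parameter. The endpoint path, parameter names and lookup are placeholders, not the actual labs API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/similar-recordings", methods=["GET", "POST"])
def similar_recordings():
    if request.method == "POST":
        payload = request.get_json(force=True) or {}
        mbids = payload.get("recording_mbids", [])
        algorithm = payload.get("algorithm", "")
    else:
        # GET: a single comma-delimited parameter instead of a JSON array.
        mbids = [m for m in request.args.get("recording_mbids", "").split(",") if m]
        algorithm = request.args.get("algorithm", "")
    # lookup_similar(mbids, algorithm) would query the similarity index here.
    return jsonify({"algorithm": algorithm, "recording_mbids": mbids})
```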