(even if we just have our own stuff I think that kind of display, with inviting image previews, would work well for an ‘explore’ section)
monkey: Hey, the bubbles are looking good already!! Some font tweaks, a tighter crop, and an option to add an album cover, and I would think it's in the bag?
mayhem, lucifer: about generated playlists, do all listens carry the same weight? -> If I listen to a generated playlist, does Troi think I like the songs in it and then tend to suggest more of the same trend (not sure I'm being clear).
lucifer
zas: yes, all listens have the same weight. if you listen to a generated playlist, it would indeed think that you like it and suggest more, but iirc we take 6 months of listens, so a few playlists are unlikely to affect the overall results.
zas
mayhem: on the tracklist you linked, since it is based on top recordings and my top recordings are somewhat bugged (some are very high in the list because of a spotify bug at some point), some suggestions are a bit weird, though I discovered nice music among the suggested tracks.
lucifer: another question, how does the duration of listens impact weights?
lucifer
zas: so far it doesn't affect the weight. only the number of times a recording is listened to affects the weight currently.
zas
huh? so if I listen to 20% of a song it thinks I somehow like it?
which is usually the reverse
lucifer
right. it's a pending improvement to make.
one thing that still helps is that most users won't go back to a song they skipped, say, halfway, so its listen count will be very low, whereas the ones they like they will have listened to multiple times.
so the count acts as a proxy. but yes, adding duration support would help make it more explicit.
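The count-as-proxy idea above can be sketched in a few lines of Python (a hypothetical illustration, not the actual ListenBrainz/Troi weighting code):

```python
from collections import Counter

def listen_weights(listens):
    """Weight each recording by its listen count alone: tracks a user
    skips rarely get replayed, so their counts (and weights) stay low."""
    counts = Counter(listens)
    total = sum(counts.values())
    return {mbid: n / total for mbid, n in counts.items()}

# a track replayed nine times vs. one skipped after a single play
weights = listen_weights(["liked"] * 9 + ["skipped"])
```

Adding duration would mean weighting each listen by fraction played rather than counting every listen as 1.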
zas
the fact that I don't listen to a track twice doesn't mean I don't like it, but for sure the fact that I don't listen to a track in full usually means I'm not fond of it
(or I didn't have time to, but that's rare I guess)
lucifer
yes makes sense.
zas
Also, if I listen to an album in full once (so 1 listen per track), it usually means I like the music, so I guess listening to multiple tracks by the same artist is a good indicator too, and even more so if one or more albums were listened to in full.
Another thing: if I like a band, I'm always curious about what the band members do outside this band, and I expect such suggestions: A & B are in band C, A is also in band D, and B has a solo project; I expect tracks from D and from B's solo project to appear, even though I never listened to a track from them. Is this taken into account somehow atm?
mayhem
moooin!
aerozol: I see you want to do the compositing with the cover art first and then the image on top? that might work better, indeed.
are the covers in that image you posted transparent?
we need to find a place to collect all of these images. can I fetch them out of the Figma?
lucifer: I had a really rough time getting to sleep last night, because the similarity data made me realize a big big thing.
throwing all of the listens at the similarity alg simply overfits it and the result is.... noise.
we're getting quite good results with TWO hits; this suggests that the optimum window is some time larger than that.
let's call it 30 days.
we should never apply our alg to more data than this window. ever.
instead, what we should do is calculate many windows of this data over time: chunks made up of 90 days of data.
(combining 3 chunks at a time)
the key insight is that we will gain the best data when we analyze tracks in the time when a given track was released -- when people will have been listening to it with other tracks that were released about the same time.
e.g. a 2000s track will have the best play co-incidences when analyzed with listens from the same era.
I'll likely need to make some graphs to make this more clear.
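A rough sketch of the chunking idea above, assuming listens are (timestamp, recording_mbid) pairs; the window size and the pair-counting metric here are placeholders for illustration, not the actual ListenBrainz algorithm:

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import combinations

WINDOW_DAYS = 30  # candidate size from the discussion; needs tuning

def chunk_listens(listens, window_days=WINDOW_DAYS):
    """Split (timestamp, recording_mbid) listens into consecutive
    fixed-width time windows, so co-occurrence is only ever computed
    among listens from the same era."""
    if not listens:
        return []
    listens = sorted(listens)
    width = timedelta(days=window_days)
    bound = listens[0][0] + width
    chunks, current = [], []
    for ts, mbid in listens:
        while ts >= bound:  # close out windows until ts fits
            chunks.append(current)
            current, bound = [], bound + width
        current.append(mbid)
    chunks.append(current)
    return chunks

def cooccurrence(chunk):
    """Count each unordered pair of distinct tracks seen in one window."""
    pairs = Counter()
    for a, b in combinations(sorted(set(chunk)), 2):
        pairs[(a, b)] += 1
    return pairs
```

Running `cooccurrence` per chunk and then selecting which chunks feed a given search is what keeps era-local co-incidences from being drowned out by the full history.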
lucifer
zas: currently not considered. what we intend to do is build various types of recommendation algorithms and collate their results. this artist correlation cannot be reliably inferred by the CF algorithm we currently use. the only correlations it can make are of the form "users who listened to artist A also listened to artist B". however, we can build another algorithm which utilises these artist correlations to suggest tracks.
mayhem: i see. sounds good to do multiple runs over small chunks.
i think there is some value in doing a few larger chunks as well, so that we also get similarity of tracks which were released at different points in time, but there's no reason these multiple runs couldn't capture that. we can always experiment and see how it goes.
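Collating results from several recommenders, as described above, might look something like this (a hypothetical sketch; the algorithm names and scores are made up):

```python
def collate(results, weights=None):
    """Merge scored suggestions from several recommenders into one ranking.
    `results` maps algorithm name -> {mbid: score}; `weights` optionally
    scales each algorithm's contribution."""
    weights = weights or {}
    combined = {}
    for algo, scores in results.items():
        w = weights.get(algo, 1.0)
        for mbid, score in scores.items():
            combined[mbid] = combined.get(mbid, 0.0) + w * score
    return sorted(combined, key=combined.get, reverse=True)

# CF can't see band-membership links, but a separate artist-overlap
# recommender could surface them and be merged in here.
ranked = collate({"cf": {"a": 1.0, "b": 0.5}, "artist_overlap": {"b": 1.0}})
```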
mayhem
first, let's see about finding a well-tuned window size. then we need to explore the temporal nature of the data.
larger chunks are not the answer, I think.
lucifer
to make sure i understand correctly: we will store the scores of the various windows separately and not aggregate them?
mayhem
you always use the same sized chunk, but you calculate them starting from older starting points.
and then combine the data from chunks to make a real result.
lucifer
but that would act the same as the current algorithm, i think.
mayhem
if you used *all* chunks for searching, yes.
but the key is to select only a few chunks for searching.
lucifer
whether you calculate sums for Jan-Mar and Apr-Jul and then add those, or you calculate Jan-Jul at once, it would be the same.
mayhem
we need to be mindful of both the indexing chunks and which index chunks are used in a search.
lucifer
hmm, i see.
mayhem
lucifer: correct.
lucifer
i am not sure i understand the plan fully currently but let's try and see. it'll probably become clearer in due time.
mayhem
like I said, I am not explaining this well. my brain was racing until 4am when I worked this out and I'm now poorly slept as a result.
I'll draw a graph about this later today, that will make it more clear.
this is up on beta. save to LB, then use export to spotify. (to see the button you'll have to change the url to beta.lb manually and also be logged in there, because open as playlist will go to prod lb)
also, window sizes 30 and 90 have been generated too. algorithms available: `session_based_days_7_session_300`, `session_based_days_30_session_300` and `session_based_days_90_session_300`.
the overall lookup is now slower because the 90 day window generated too many rows and there is no minimum threshold in place.
CatQuest
happy Diwali
(:D)
aerozol
zas: lucifer: afaik skipping a song or video early on is one of the biggest indicators of 'didn't like' that TikTok etc uses to suggest stuff (and they are very good at creating personalized feeds...)
But it does seem like that would be quite a new piece of code under the LB hood?
mayhem: I was going to just do a transparent layer on top with shadows etc but then realized I could just do one image on top and nothing underneath. That one I posted on irc is good for you to use
mayhem
hiya!
aerozol
Everything I've done is also on the figma, have at it 👍
mayhem
I finally figured out later that this is what you had in mind. I'll play with that after I finish the board meeting prep
aerozol
Cool - could still do a jpg underneath if image size is an issue (got it to 300kb or so)
mayhem
should be fine.
aerozol
Happy Diwali all! (thus finishes my morning irc catch-up)
lucifer
aerozol: yes. agreed. LB indeed currently doesn't have a way to track skips. however this similar recordings thing we are currently working on can infer skips.
we havent reached that point yet though.
aerozol
Ooh, that sounds really promising. Afaik the modern way to figure out likes is to not even have users like or dislike stuff, just to track where they 'pause' and watch something. Which skews it towards clickbaity stuff, but it seems to glue people to their phones pretty well
lucifer
yeah. that probably works well for reels-like stuff but also needs a lot of tracking afaiu.
spotify does this a lot. for instance, it tracks why a track was paused, at what points, and so on.
akshaaatt
Happy Diwali everyone!❤️❤️❤️❤️
lucifer
happy diwali! 🎉🎉
ansh
Happy Diwali!! 🎉 🎉
aerozol
🎉🎉🎉
mayhem
happy diwali!!
aerozol: I think I like the previous LPs-on-the-floor image better.
the new one is darker and has more of a margin that I think is not needed.
and every time we change anything that moves the cover art around the image, I have to painstakingly align the images again. it took me about an hour to get it right the first time.
aerozol
mayhem: you ran with a quick screenshot/snip that I posted 😜
I'll tweak the pic to match the lineup in a little bit
mayhem
ahhh, ok. I can wait.
k, thanks!
aerozol
I'll lighten it up a bit too
mayhem
lucifer: at a first glance, those similar data sets look quite interesting.
but due to the size of the index (without the threshold) and the fact that I have to make one request for every track (and I process 100 of them), it isn't feasible to work with this.
can you please add the threshold and also enable more than one MBID to be looked up with the similarity endpoint? then things will be snappy. thanks!
lucifer
mayhem: yes, makes sense. we'll have to delete the existing table to get rid of those extra entries. i'll add a configurable threshold parameter.
mayhem
perfect.
lucifer
one thing you could try in the meantime is passing the count parameter. maybe that speeds up the lookup a bit
mayhem
we need to adjust the data set hoster to take a single arg (algorithm) and a list of args (MBIDs). thoughts on how to do that?
thanks, but it's getting late here. I'll leave this be for today and pick it up again tomorrow.
lucifer
simplest way would be to pass in algorithm every time.
mayhem
but if aerozol comes up with something, I'll play with that. that's easier to understand when tired. :)
lucifer: yes, but that suggests that we would honor the algorithm being different MBID by MBID.
which is not necessary, I would say.
lucifer
yeah. another possible solution is to pass the list normally: recording_mbids as a list of mbids, instead of doing recording_mbid: mbid for each item. the query can have extra logic to interpret the param as a list
mayhem
not quite sure I follow. perhaps better to discuss this tomorrow after more rest.
lucifer
we do `[{'x': 1}, {'x': 5}]` currently. instead we could do `[{'x': [1, 5]}]`.
but yes sounds good to discuss later.
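The two payload shapes being compared, plus a helper that treats them the same (the helper is hypothetical, just to show the equivalence):

```python
def flatten_params(params):
    """Accept either the per-item shape [{'x': 1}, {'x': 5}] or the
    list shape [{'x': [1, 5]}] and return one flat list of values."""
    values = []
    for item in params:
        for v in item.values():
            values.extend(v if isinstance(v, list) else [v])
    return values

per_item = flatten_params([{'x': 1}, {'x': 5}])  # one dict per lookup
as_list = flatten_params([{'x': [1, 5]}])        # single list param
```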
mayhem
indeed. that is the easy part. how do you express that in HTTP parameters for a GET?
lucifer
the easy way would be to disallow arrays in GET and only allow them in POST.
aerozol
What’s the listenbrainz url again for the grids/visualisations?
lucifer
other ways are to specify some custom delimiter like a , or ; for multiple params, or, as some apis do, to specify the same param multiple times. in any case i think it'll be inconsistent with the POST way, but that's fine imo.
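The comma-delimiter option could be parsed like this (a sketch; the parameter name `recording_mbids` is an assumption, not the endpoint's actual signature):

```python
def parse_mbids_param(raw):
    """Split a comma-delimited recording_mbids query value into a list,
    trimming whitespace and dropping empty segments."""
    return [part.strip() for part in raw.split(",") if part.strip()]

# e.g. GET /similarity?recording_mbids=mbid-1,mbid-2,mbid-3
mbids = parse_mbids_param("mbid-1, mbid-2,,mbid-3")
```

The repeated-param alternative (e.g. `?recording_mbid=a&recording_mbid=b`) is what most web frameworks expose via a getlist-style accessor.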