in #metabrainz

0:23 AM
experemental joined the channel
0:24 AM
experemental has quit
1:11 AM
boscage joined the channel
1:21 AM
minimal has quit
4:12 AM
pranavkonidena_ joined the channel
4:26 AM
gcrkrause has quit
5:55 AM
ceyeseb463 joined the channel
5:55 AM
ceyeseb463 has quit
5:56 AM
ceyeseb463 joined the channel
5:58 AM
ceyeseb463 has left the channel
6:12 AM
Kladky joined the channel
6:50 AM
Sophist_UK joined the channel
6:50 AM
Sophist-UK has quit
7:55 AM
yvanzo

MB website response time is still going up and down, looking at logs.
8:37 AM
pranavkonidena_ has quit
9:16 AM
Tickets will be offline starting from 9:30 UTC (in 15min from now) for maintenance.
9:29 AM
experemental joined the channel
9:31 AM
BrainzGit

[listenbrainz-ios] 14akshaaatt opened pull request #16 (03main…add-envs): Separate out Environments https://github.com/metabrainz/listenbrainz-ios/...
9:43 AM
[listenbrainz-ios] 14akshaaatt merged pull request #16 (03main…add-envs): Separate out Environments https://github.com/metabrainz/listenbrainz-ios/...
9:46 AM
yvanzo

Tickets are back on, Jira security update applied.
9:46 AM
rana_satyaraj joined the channel
9:47 AM
rana_satyaraj

Hello everyone, I'm Satyaraj Rana. I'm new to the community and looking forward to contribute.
9:47 AM
Is this the chat for discussions related to Listenbrainz?
9:48 AM
yvanzo

Hi rana_satyaraj, welcome, it is!
9:51 AM
rana_satyaraj

Alright, thank you :)
10:44 AM
outsidecontext

zas: hi, you pinged me yesterday evening, but I was unavailable
10:45 AM
experemental has quit
10:46 AM
zas

outsidecontext: np, we were on smt with rob, and we wanted you to run a test, but we finally sorted it out, it was related to performance of https://github.com/metabrainz/listenbrainz-cont...
10:47 AM
we finally found out it was because mayhem was reading audio files from a slow disk compared to the one used to store database, hiding performance gain (which concern only database).
10:47 AM
outsidecontext

ah, ok :) happy to test this later here as well
10:53 AM
mayhem

new SSD arrives later this week, so that problem goes away. :)
10:58 AM
Sophist-UK joined the channel
10:58 AM
Sophist_UK has quit
11:08 AM
rana_satyaraj

I'm new here, I have set up the ListenBrainz development environment, but having trouble finding something to work on. Can anyone point me in the right direction, maybe give me some tasks to do? It could be anything as long as it's coding.
11:25 AM
mayhem

rana_satyaraj: hi! I'm looking but I can never find the "easy first bugs" label in jira
11:27 AM
yvanzo

That’s good-first-bug actually: https://tickets.metabrainz.org/issues/?jql=proj...
11:27 AM
mayhem

srsly
11:28 AM
rana_satyaraj: ^^
11:28 AM
mayhem saves the filter for future use
11:30 AM
rana_satyaraj

Hey, thanks guys, I'm on it
11:40 AM
monkey

lucifer: Hello! Did you see LB-1455 by any chance? Wondering if it is due to how often we rebuild the cache or if there's something else going on there that prevents it being added to the cache.
11:40 AM
BrainzBot

LB-1455: Album tracklists aren't populated https://tickets.metabrainz.org/browse/LB-1455
11:42 AM
monkey

Also, unless I misunderstood, you logged everyone out of LB in order to fix the tag submission auth issue? It's working now for me ! Thank you !
11:42 AM
zas

mayhem: https://github.com/metabrainz/listenbrainz-cont...
11:42 AM
database needs to be rebuilt (new table)
11:43 AM
mayhem

k
11:44 AM
monkey

Tagging <3
11:49 AM
mayhem

zas: you can run create on an existing DB file and it will make the new table for you.
11:51 AM
zas

great, but I think we'll still need better handling of schema updates at some point
11:51 AM
mayhem

yep.
11:51 AM
I didn't think we'd need it that soon, lol.
11:52 AM
zas

:D
11:54 AM
outsidecontext: The way we manage the catalog of audio files in listenbrainz-content-resolver could be done in Picard btw. In order to speed up music collection updates/tag resync etc
11:55 AM
monkey

Oh boy. mayhem do I have a fun mapping pickle for you !
11:55 AM
These two are not the same recording and not the same artist: pray (by Eve) and Pray (by EVE)
11:55 AM
https://musicbrainz.org/recording/f8c50031-b2e0...
11:55 AM
https://musicbrainz.org/recording/873c6390-a8b1...
11:55 AM
mayhem sticks his fingers in his ears and runs away shouting LALALALALALALA
11:55 AM
mayhem

yeah, this is a great reason why we need to consider release in the mapping.
11:56 AM
that will take care of this case.
12:14 PM
rana_satyaraj has quit
13:52 PM
reosarevok

Dealing with the jira vandalism
13:54 PM
minimal joined the channel
14:01 PM
Ok, did that now, but yvanzo, we might need to block new jira accounts again, our dear Russian idiot is back
14:02 PM
yvanzo

reosarevok: Thanks!
14:04 PM
reosarevok: There are a couple of Jira apps that can help with managing accounts, will look into these after moving to a Data Center license.
14:08 PM
reosarevok

Thanks
14:25 PM
lucifer

monkey: looking into some mapping issues already, might get fixed with that
14:25 PM
monkey

Nice ! :)
14:31 PM
musicListenerSam joined the channel
14:32 PM
musicListenerSam

hey mayhem , are you currently free ?
14:32 PM
mayhem

yes.
14:32 PM
hi!
14:33 PM
I took your code and tweaked it a little, mostly to give it a good footing for the command line:
14:33 PM
https://github.com/metabrainz/bpm-detector
14:33 PM
and i played with it last night and you certainly implemented the basics of the algorithm.
14:33 PM
but the tricky part comes.... now.
14:34 PM
musicListenerSam

yesterday we were talking about including the bpm code in the listen brainz resolver , so i converted it into an .py file in fork https://github.com/sambhavnoobcoder/listenbrain...
14:34 PM
okay
14:34 PM
mayhem

I have one album where most of the tracks are wrong. but it is always possible to get a right answer by changing the window size.
14:34 PM
musicListenerSam

hmm
14:34 PM
could you share the album ?
14:34 PM
mayhem

its not ready for the content resolver. :(
14:35 PM
musicListenerSam

okay , no worries , ill make changes to https://github.com/metabrainz/bpm-detector this repo only
14:37 PM
mayhem

great.
14:37 PM
one of the things I was thinking about is that getting this right is... hard.
14:38 PM
I could get right results by changing window size. 5 or 10 seconds.
14:38 PM
but obviously that doesn't work in the real world.
14:38 PM
the thing I had always wondered about is using machine learning to really solve this problem.
14:39 PM
musicListenerSam

hmm . perhaps instead of creating a voting classifier for multiple mdls we could begin with a voting clasifier for specif window sizes
14:39 PM
thats would be a start
14:39 PM
mayhem

not sure that is the right approach.
14:39 PM
I have a feeling that we should pick a middle of the road window size.
14:40 PM
and then no use a peak detector -- that part is the trickiest.
14:40 PM
what if instead the feed the generated data to something like a neural net?
14:40 PM
we'd need to build a decent training data set, with audio files and expected (verified) BMP values.
14:41 PM
then we can train a BMP classifier with that data.
14:41 PM
what do you think?
14:41 PM
musicListenerSam

yup , we will surely need to start with the data
14:41 PM
i think teh neural network is a right approach
14:41 PM
the algos can often fail in more dymaic scenarios , where the nueural network thrives
14:42 PM
training the BMP classifoeer with a good dataset would be a huge plus
14:43 PM
riksucks has quit
14:43 PM
ig i'll look into the dataset buliding for now then , ig spotify has a lot of bpm data ,or so ive heard
14:43 PM
mayhem

it does.
14:43 PM
but I dont know if we can trust it.
14:43 PM
arsh has quit
14:43 PM
AcousticBrainz has this data, but we can't rely on it.
14:44 PM
vscode_ has quit
14:44 PM
so my take was to make a collection of releases, from many different genres and work out a BPM value for each track in the collection.
14:44 PM
musicListenerSam

in that case , where else can we look for reliable sources of BMP data ?
14:45 PM
hmm'
14:45 PM
mayhem

what do you think is a good training dataset size for this?
14:45 PM
there are other algorithms out there. we could ownload as many as we can find, run then all, pull in AB/Spotify and if we get an agreement, the track goes into the collection.
14:46 PM
that might however, select for easy cases, so we may need to hand resolve the edge cases.
14:46 PM
Shubh has quit
14:47 PM
musicListenerSam

frankly speaking , if i take releases , average duration of each relsese 3 minutes , and since audio files are large sized in sonometry , i think 1 gb worth of data would be a good start . something that can be achoeved in the beginning
14:47 PM
ya i agree with that
14:47 PM
we could run a script matching the two , the data from the spotify api for bpm vs the algo and if its the same , it passes
14:47 PM
ShivamAwasthi has quit
14:48 PM
to enhance accuracy , we could parallely run mutiple agotihms and set a threshold for acceptance
14:48 PM
mayhem

agreed.
14:48 PM
musicListenerSam

that way we would have reduced number of false BMP results in the dataset .
14:48 PM
Freso has quit
14:49 PM
mayhem

let me see what I can do to collect this dataset.
14:50 PM
musicListenerSam

as far as the edge cases are concerned , once we have a neural network that works on the larger chunk of data , certain cases should stand out , say soft music or some other case that the dataset misses . we can then work towards that data needs specifically , perhaps using attention modelling of some sort
14:50 PM
mayhem

zas: are you following this convo?
14:50 PM
musicListenerSam

shouldnt be that hard once we are at that point
14:51 PM
mayhem

for ambient and classical music, ie. music without a clear beat, we should ideally say: Nope, can't determine BPM, rather than giving the wrong BPM.
14:51 PM
zas

mayhem: yes
14:52 PM
musicListenerSam

hmm , ill look into the dataset creation as well .
14:52 PM
perhaps for clasical and amibent we can set a confidence fro the models prediction
14:52 PM
mayhem

so, we're trying to come up with a machine learning BPM alg -- I have a feeling its been done before, but none of these approaches ever made it to open source.
14:52 PM
musicListenerSam

below a certain prediction we just say , no BPM detected
14:52 PM
mayhem

musicListenerSam: why don't you use my music service as source for music for now. let me worry about the dataset.
14:52 PM
musicListenerSam

okay
14:53 PM
mayhem

zas: your collection has more breadth than mine does.
14:54 PM
would you be willing to contribute 5 albums each from punk, jazz, metal for the training dataset?
14:54 PM
zas

np, but genres like jazz & metal are rather fuzzy
14:55 PM
mayhem

yep, understood. thye are just poorly represented in my collection.
14:55 PM
zas

bpm of doom metal is near zero, while bpm of death metal is rather high
14:55 PM
mayhem

which is why I want both.
14:55 PM
the more edge casesy sort of music you can help us with, the better.
14:59 PM
musicListenerSam

hmm . ok so ig understood what i need to do next .(y) i'll ping you mayhem in case of any more exciting developements and if we feel the changes are an improvement we can add them to the bpm-detector repo
15:00 PM
mayhem

yep. if you give me your github handle, I'll give you commit access to that repo.
15:00 PM
and tomorrow, I will start building a test dataset.
15:00 PM
should be good to start testing with tomorrow, but significant size will take some time still.
15:02 PM
musicListenerSam

okay . sure
15:03 PM
we'll start testing in small batches for now anyway , so the dataset size shouldnt be hurdle for now
15:04 PM
mayhem

agreed