in #metabrainz

15:22 PM
ruaok

perhaps we need to have a test set of 100,000 tracks for testing that we can run quickly.
15:23 PM
and when that 100,000 tracks produces some decent data, do we open it up for wider submissions.
15:23 PM
and we really need to find people with large music collections and find a way to use them for bootstrapping a reboot.
15:24 PM
should we stop all AB work until we have a new plan in place?
15:25 PM
reosarevok

What's the current state of AB alternatives?
15:26 PM
ruaok

in what sense, reosarevok ? open alternatives to AB?
15:26 PM
alastairp

I think that having a plan about what we want to do with AB is a good idea
15:26 PM
reosarevok

As in, have researchers managed better algorithms elsewhere? I guess Spotify or whoever might internally, but nothing open?
15:26 PM
alastairp

that is, just throwing data at a database isn't working
15:26 PM
reosarevok: there are many datasets with extracted features, based on algorithm x
15:26 PM
ruaok

well, the algs to hand were promised to be much better than they are.
15:27 PM
alastairp

but they fall into the same problem that caused us to start AB in the first place - that is, they're fixed in time, fixed in dataset size
15:27 PM
ruaok

so, if we can't rely on the original premise of taking things from academia and putting them into production, then the whole value proposition of AB falls on its face.
15:29 PM
reosarevok

Yeah, I was mostly wondering if there's something else that has already "replaced" AB
15:29 PM
Or if what we have is the least bad there is
15:29 PM
ruaok

nothing open. it's a very large effort.
15:29 PM
alastairp

and I think this comes back to the question of scale. When you're testing on 1000 items and you get good results, it's easy to say that it works well
15:30 PM
ruaok

which is why we need to start with something that runs in a reasonable amount of time, yet is representative of the whole picture.
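One way to get a test set that runs quickly yet stays representative is a stratified sample. The following is only a sketch, assuming each track carries some grouping label (genre, region, decade); the function name `stratified_sample`, the `key` callback and the "at least one per group" rule are illustrative assumptions, not anything AB does today.

```python
import random
from collections import defaultdict


def stratified_sample(tracks, key, total, seed=0):
    """Pick about `total` tracks, keeping each group's share close to
    its share of the full collection (hypothetical helper)."""
    rng = random.Random(seed)

    # Bucket the tracks by whatever label `key` extracts.
    groups = defaultdict(list)
    for track in tracks:
        groups[key(track)].append(track)

    sample = []
    for members in groups.values():
        # Proportional quota, but at least one track per group so that
        # rare groups are not silently dropped from the test set.
        quota = max(1, round(total * len(members) / len(tracks)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

Because of the one-per-group minimum, the result can be slightly larger than `total` when there are many very small groups.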
15:30 PM
reosarevok

I think it's fine if AB doesn't work well on African music yet, but then we should be saying "hey, we know we have this issue, who has a large, diverse collection of African music and is willing to help us with it"
15:30 PM
I guess the problem is the whole picture is so absurdly wide
15:31 PM
alastairp

reosarevok: right, but the question here also is if it's just a matter of collecting the data, or if you actually have to perform research on field x in order to learn how to deal with African music
15:31 PM
ruaok

reosarevok: the problem is, how do you tell someone which parts work and which don't?
15:32 PM
if I call an API I expect it to work, or I expect to see a confidence rating of the quality of the results. We have few tools to provide such things.
15:32 PM
alastairp

which was one of the big premises of compmusic - that you needed specific algorithms
15:33 PM
reosarevok

Well, your users should tell you what looks wrong, I guess :) But yeah, it's hard to use that programmatically
15:33 PM
alastairp

for example, one change to essentia since we released AB is that it now gives 5 different key estimations - as we realised that the one "standard" model that we thought worked well really only worked well on a small subset of data
15:33 PM
reosarevok

I would be surprised if the same algorithms which can work with EDM, jazz and classical break with African music
15:33 PM
But it might be that they don't work as well on that either :)
15:33 PM
alastairp

but also, at the scale of AB, knowing 1 piece of data is wrong in 10m files doesn't give a huge amount of context
15:34 PM
ruaok

reosarevok: for instance, I can't use AB in any of my playlist work. I doubt anyone else could.
15:34 PM
reosarevok

ruaok: is it meant for that though? I thought it was meant to mostly just be a long-term slow research project :)
15:34 PM
ruaok

if they do, they are getting shit results. and just think of how many people have already done research based on AB. clearly without vetting the results.
15:34 PM
yes, the idea was to have results after 5 years.
15:35 PM
reosarevok

I'd expect research to be done as in "we ran AB on this huge collection of African music and it worked / didn't work and this is what we saw"
15:35 PM
But I guess that might not be happening
15:35 PM
alastairp

I think that one big problem with AB is that we thought "oh yes, it can follow the research as it improves", but then we didn't make it follow essentia upgrades
15:36 PM
reosarevok

What's the main problem with making it follow upgrades? That it needs to re-scan everything?
15:36 PM
ruaok

alastairp: do you have any faith that updated essentia algs would actually scale better?
15:36 PM
alastairp

yes - a combination of technical and social hurdles
15:37 PM
I'm sure that current essentia algorithms are "better" than the AB ones
15:37 PM
but we're stuck on the definition of better
15:37 PM
on 20m tracks there are still going to be awful results
15:38 PM
reosarevok

Yeah. How doable is it to *know* they are awful? I understand the automatic confidence isn't always great?
15:38 PM
ruaok

I think if we continue with AB we need to make things "algorithms first".
15:38 PM
first prove out that an algorithm works and scales well. then adopt it into AB and run it over data.
15:39 PM
lucifer

80% accuracy on 20m is still 4m tracks wrong.
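The arithmetic behind that figure, as a small sketch. The helper name `expected_wrong` is made up for illustration, and it assumes accuracy is uniform across the whole collection, which the discussion above suggests it is not.

```python
def expected_wrong(total_tracks, accuracy):
    """Expected number of tracks with a wrong result, assuming the same
    accuracy applies uniformly to every track (a simplification)."""
    return round(total_tracks * (1 - accuracy))


for accuracy in (0.80, 0.90, 0.99):
    wrong = expected_wrong(20_000_000, accuracy)
    print(f"{accuracy:.0%} accurate on 20m tracks: {wrong:,} wrong")
```

At 80% on 20m tracks this gives the 4m quoted above; even at 99% it is still 200,000 wrong results.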
15:39 PM
reosarevok

Also, how doable would it be to combine submitting LB listens with AB submission?
15:39 PM
ruaok

reosarevok: not doable at all.
15:39 PM
reosarevok

For people running local plugins on like VLC or something
15:39 PM
alastairp

that's what I was looking at this morning on the BPM algorithms - I had hoped that the histogram strength would show us when there was uncertainty - but in many cases it was pretty confident in its result
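A toy illustration of why a strong histogram peak need not mean a correct tempo. This is not essentia's algorithm; `bpm_peak` and its `bin_width` parameter are assumptions made up for the sketch. It bins inter-beat intervals into a BPM histogram and reports the tallest bin plus the share of intervals that landed in it.

```python
from collections import Counter


def bpm_peak(beat_times, bin_width=1.0):
    """Return (bpm, strength) for the tallest bin of a BPM histogram.

    strength is the fraction of inter-beat intervals in that bin, so it
    measures how consistent the beats are, not whether they are right.
    """
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:]) if b > a]
    if not intervals:
        raise ValueError("need at least two increasing beat times")

    # Convert each interval to BPM and count it in its histogram bin.
    bins = Counter(round(60.0 / iv / bin_width) for iv in intervals)
    top_bin, count = bins.most_common(1)[0]
    return top_bin * bin_width, count / len(intervals)


# A tracker that locks onto every second beat of a 120 BPM track is
# perfectly consistent, so the peak strength is maximal even though the
# reported tempo is half the real one.
half_tempo_beats = [float(i) for i in range(20)]
print(bpm_peak(half_tempo_beats))
```

Under these assumptions the strength value only flags tracks where the beat tracker was inconsistent; a steady half-tempo or double-tempo error looks fully confident.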
15:39 PM
ruaok

90% of our listens come from spotify.
15:40 PM
lucifer

spotify provides an audio analysis api fwiw so we could get that data for comparison with ab at least.
15:40 PM
reosarevok

ruaok: sure, I'm asking for old school people
15:40 PM
alastairp

and hence 95% of research is data management and evaluation
15:40 PM
reosarevok

Since I'm assuming we won't be getting access to all of Spotify :p
15:41 PM
Ideally of course something like AB would have an agreement with something like Spotify, but I assume everyone in that market already has their own inhouse stuff and are not willing to help anybody else
15:42 PM
lucifer: 80% accuracy is probably the most you can hope for, really - I mean, people shouldn't expect magic when using automatic stuff
15:42 PM
If you want perfection, use human-built playlists
15:42 PM
lucifer

reosarevok: indeed and looking at the research paper that describes the current ab algorithm, my understanding is that 80% accuracy is the best case.
15:42 PM
reosarevok

My Spotify release radar for example is a huge mess, playlist-wise, so either they don't even try to sort it, or they do a terrible job of it
15:43 PM
(it's usually full of "rap-classical-metal-rap-classical" in random orders like that)
15:43 PM
CatQuest

hah
15:44 PM
ruaok

I think I am going to spend some time working out if the annoy stuff has any utility. because to date, I haven't been convinced of that.
15:44 PM
CatQuest

btw I mean I would happily submit ab stuff
15:44 PM
reosarevok

So maybe the main issue is not AB data as much as expectations
15:44 PM
CatQuest

also yes, i mean it's automated
15:44 PM
reosarevok

alastairp: how often is essentia updated?
15:44 PM
CatQuest

I was thinking that having a way for letting users on e.g. MB give feedback on AB data shown on recordings might be useful?
15:45 PM
reosarevok

I don't think it'd be doable to ask people to resubmit more than once a year or so, but having a new data version every year might not be that bad?
15:45 PM
alastairp

reosarevok: we try and keep it up to date with new updates to algorithms as they are released
15:45 PM
but again, only a few people involved in doing that
15:45 PM
CatQuest

like if a lot of people like "downvote" a bpm tag from ab
15:45 PM
reosarevok

alastairp: sure, but how often are algorithms released? :D
15:45 PM
alastairp

reosarevok: every time there's a conference
15:46 PM
CatQuest

!recall oh no.
15:46 PM
BrainzBot

https://usercontent.irccloud-cdn.com/file/uwma2...
15:46 PM
lucifer

so yearly ?
15:46 PM
reosarevok

CatQuest's point isn't bad either, the more data we show (as "we don't know if this is good") in MB and elsewhere, the more we could find where we have stuff that just looks bad
15:46 PM
CatQuest

:D
15:46 PM
alastairp

improvements happen all the time. but sometimes that improvement is "we no longer screw up on this small part of this test dataset"
15:47 PM
reosarevok

alastairp: so would it be doable to say "we package all new improvements for the year once a year, and offer a new version of AB that supports that, but needs re-scanning"?
15:47 PM
ruaok

so, my feeling is that the only AB work that should be happening in the short term is to find new algs that are usable and making a plan for how to reboot.
15:47 PM
alastairp

reosarevok: that was one of the original ideas
15:47 PM
reosarevok

I guess it would make the data take a huuuge amount of space though if we have yearly versions of all the data?
15:48 PM
CatQuest

archive old data?
15:48 PM
alastairp

reosarevok: so maybe retire old versions? but then what do you do if an MBID gets processed with n-5 and never gets re-done. do you accept the old (maybe worse) version, or do you delete it?
15:48 PM
CatQuest

mark it as old but keep
15:48 PM
reosarevok

alastairp: maybe retire old versions *except* for stuff not in any newer version?
15:48 PM
alastairp

reosarevok: yes, perhaps
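The "retire old versions except for stuff not in any newer version" rule can be sketched as follows. This assumes submissions can be reduced to `(mbid, extractor_version)` pairs with integer versions; the function name `split_retired` and that data shape are hypothetical, not AB's schema.

```python
def split_retired(rows):
    """Split (mbid, extractor_version) rows into (keep, retire).

    A row is retired only if the same MBID also has a row from a newer
    extractor version; an MBID that was only ever processed with an old
    version keeps its old row.
    """
    rows = list(rows)

    # Newest extractor version seen for each MBID.
    newest = {}
    for mbid, version in rows:
        newest[mbid] = max(version, newest.get(mbid, version))

    keep = [(m, v) for m, v in rows if v == newest[m]]
    retire = [(m, v) for m, v in rows if v < newest[m]]
    return keep, retire
```

Whether "retire" means delete or archive is left open here, as it is in the discussion.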
15:48 PM
reosarevok

And then allow people to optionally ask the API for "latest version, but fill the gaps with historical"?
15:49 PM
CatQuest

show on mb that it needs to be rescanned. call to people for rescanning
15:49 PM
reosarevok

So you can choose if you only want the latest, or all
15:49 PM
CatQuest

also also, make scanning easier. much, much easier
15:49 PM
oh i like that idea reo
15:49 PM
alastairp

reosarevok: that was my idea for what to do when we got a new version of the extractor. stop accepting the old one, when you request an mbid get the new one if it exists otherwise use the old one
15:49 PM
CatQuest

having some way to scan ab with picard would be :chef:
15:50 PM
alastairp

this was always a long-term plan, but it relied on having AB dev resources, having a stable release cadence for essentia, etc, etc
15:50 PM
CatQuest

:(
15:50 PM
reosarevok

alastairp: sounds good, although I think we could still have a way to specifically say "I would rather get 0 results than old results"
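A minimal sketch of the lookup being discussed: prefer the newest extractor version, fall back to the newest older one, and let the caller opt out of the fallback. The `store` layout, the function name `lookup` and the `allow_older` flag are all assumptions for illustration, not the AB API.

```python
def lookup(store, mbid, latest_version, allow_older=True):
    """Return (version, features) for an MBID, or None.

    store maps mbid -> {extractor_version: features} (hypothetical
    layout). With allow_older=False the caller gets nothing rather
    than results from an older extractor.
    """
    versions = store.get(mbid, {})
    if latest_version in versions:
        return latest_version, versions[latest_version]
    if allow_older and versions:
        # Fill the gap with the most recent historical version.
        best = max(versions)
        return best, versions[best]
    return None
```

Returning the version alongside the features lets a client see when it was handed historical data.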
15:51 PM
CatQuest: there's a Picard plugin, but I dunno how well it works?
15:51 PM
CatQuest

I still think it can happen. just. idk, lb is being prioritized now. if prioritizing ab will make lb better, I'm sure we can do that
15:51 PM
reosarevok

Or maybe that's just to *use* data
15:51 PM
CatQuest

mhm
15:52 PM
reosarevok

Oh, seems so
15:52 PM
Anyway, I'm sure it's doable
15:52 PM
alastairp: how many resources would that take? Are we talking "you spending a month a year on it"? or "needs a full-time person"?
15:53 PM
If we update once a year, I'm assuming it needs one big push to make that multi-version system work, and then just some time to update every year?
15:53 PM
BrainzGit

[troi-recommendation-playground] mayhem opened pull request #41 (main…year-review): Year in music and a whole pile of other general development https://github.com/metabrainz/troi-recommendati...
15:54 PM
alastairp

reosarevok: I think that development work on AB to support this kind of feature extraction is probably only a few months of work, if that
15:54 PM
however, I think that building up QA for algorithms, making improvements, and rolling them out is a full time job for an entire data processing team
15:55 PM
reosarevok

Oh, I mean, yes, I'd expect the QA would be "hey, our community has detected these issues, whoever wants to do some research using AB, you can look into improvements for that"
15:56 PM
I can't expect we're going to be doing the algo improvements ourselves
15:57 PM
alastairp

I'm skeptical that a feedback button on an AB page to collect issues would be useful for the long-term improvement of algorithms, though
15:57 PM
CatQuest

we'll be training countless neuralnets to do it for us! :D
15:57 PM
alastairp

a researcher can't do anything with an mbid and "this is wrong". Perhaps they could do more with mbid + bpm annotation (in the case of bpms)
15:58 PM
CatQuest

that's what I meant. the "what is wrong" must be included
15:58 PM
alastairp

because really, you'd need audio in order to make improvements (this is basically a dataset)
15:58 PM
reosarevok

alastairp: I was expecting they could try to find the similarities between what kind of things are wrong, if there are enough reports
15:58 PM
But that'd anyway involve a lot of reports :)
15:59 PM
alastairp

I really don't have enough experience in this area to know if many reports of that form would be useful
15:59 PM
reosarevok

as in "well, we have a lot of reports for music of genre X"
15:59 PM
"so we should specifically try to find a good amount of genre X and see what we find"
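A rough sketch of what such reports could look like if they carry the "what is wrong" part, and how they could be counted per genre. The `Report` fields, the `genre_of` mapping and the function name are hypothetical; nothing like this exists in AB or MB as far as this discussion says.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One user report: which value looked wrong and what it should be."""
    mbid: str
    field: str        # e.g. "bpm"
    shown: float      # value displayed from AB
    corrected: float  # value the reporter believes is right


def reports_per_genre(reports, genre_of):
    """Count reports per genre, most-reported genre first.

    genre_of maps mbid -> genre; MBIDs without a genre are counted
    under "unknown".
    """
    counts = Counter(genre_of.get(r.mbid, "unknown") for r in reports)
    return counts.most_common()
```

A genre that tops this list is a candidate for collecting a dedicated test set, though as noted above a researcher would still need the audio to act on it.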
15:59 PM
But yeah, dunno
16:00 PM
alastairp

yeah, I think that large collections of features -> genres is one of the things that AB _can_ do well.
16:01 PM
unfortunately, right about the time we released it, people got all in on deep learning, which requires orders of magnitude more features than what we put in AB
16:02 PM
and training models once you have more than ~1000 examples starts taking exponentially more time
16:02 PM
so again - in small sets of research data, the data + algorithms looked good, and really did give good results
16:03 PM
but if you try and apply an 8-class genre classifier to a million tracks, you're going to have problems real quick
16:04 PM
reosarevok

Sure
16:04 PM
So essentia isn't expected to be updated with deep learning algos?
16:05 PM
alastairp

it already is
16:05 PM
reosarevok

Oh
16:05 PM
alastairp

so it's already getting interesting results once you have more data: https://towardsdatascience.com/musicnn-5d1a5883...
16:06 PM
reosarevok

So, remind me, how does AB work with essentia again?
16:06 PM
Is essentia the bit that runs on the files locally, then submits the data up to AB?
16:06 PM
(just wondering about the "orders of magnitude more features than what we put in AB")
16:08 PM
alastairp

right. essentia is a library of algorithms (some are "process this audio into a representation that can be used for machine learning" and some are "give me the bpm of audio"). there is a single binary which runs a bunch of different algorithms over an audio file (the 'music extractor'), which is the AB extractor. then the ab submitter takes the result of that and submits it
16:08 PM
https://essentia.upf.edu/streaming_extractor_mu...
16:08 PM
BrainzGit

[bookbrainz-site] MonkeyDo opened pull request #730 (master…monkey-yarn-package-manager): Replace NPM with Yarn https://github.com/bookbrainz/bookbrainz-site/p...
16:10 PM
reosarevok

So "orders of magnitude more features than what we put in AB" just means "because we haven't updated it"?
16:11 PM
Or is there actually a hardcoded issue why AB can't support those?
16:14 PM
alastairp

partially yes, just a matter of adding it (as I said at the beginning of this discussion, we had already started having a discussion about adding a new data type/extractor to AB)
16:15 PM
partially no - the more detailed data that you add, the easier it becomes to reverse that data back into audio
16:16 PM
so then the question of what AB is changes a bit - do you want it to just output single, good values? (accurate bpm, key, etc). if so, you can do this but then you can't use the data in the database to improve algorithms. you have to improve them on external collections of music, then roll out a new version, and do what we discussed about rotating old versions out
16:17 PM
or maybe you want it to be a collection of detailed features that allow people to use these features independently to build new models, new algorithms etc without needing to have access to large collections of audio
16:34 PM
reosarevok

So now we're doing a) ?
16:34 PM
Also, "add detailed data, and see what ungodly mess comes back when trying to turn it back into audio" sounds hilarious
16:38 PM
alastairp

current AB is a bit of both - it includes specific features that required detailed audio data (bpm, key), but then it includes the chroma features which are used for training new models
16:41 PM
https://gist.github.com/bmcfee/a40c3ab83f166a38... this is an interesting experiment doing exactly that - we have some demo pages somewhere that allow us to play back the reproduced audio, let me see if I can find it
16:47 PM
reosarevok

So the doubt is "how many more features can we allow before someone sues us for piracy"?