#metabrainz

      • ruaok
        perhaps we need to have a test set of 100,000 tracks for testing that we can run quickly.
      • and when that 100,000 tracks produces some decent data, do we open it up for wider submissions.
      • and we really need to find people with large music collections and find a way to use them for bootstrapping a reboot.
      • should we stop all AB work until we have a new plan in place?
      • reosarevok
        What's the current state of AB alternatives?
      • ruaok
        in what sense, reosarevok ? open alternatives to AB?
      • alastairp
        I think that having a plan about what we want to do with AB is a good idea
      • reosarevok
        As in, have researchers managed better algorithms elsewhere? I guess Spotify or whoever might internally, but nothing open?
      • alastairp
        that is, just throwing data at a database isn't working
      • reosarevok: there are many datasets with extracted features, based on algorithm x
      • ruaok
        well, the algs to hand were promised to be much better than they are.
      • alastairp
        but they fall into the same problem that caused us to start AB in the first place - that is, they're fixed in time, fixed in dataset size
      • ruaok
        so, if we can't rely on the original premise of taking things from academia and putting them into production, then the whole value proposition of AB falls on its face.
      • reosarevok
        Yeah, I was mostly wondering if there's something else that has already "replaced" AB
      • Or if what we have is the least bad there is
      • ruaok
        nothing open. it's a very large effort.
      • alastairp
        and I think this comes back to the question of scale. When you're testing on 1000 items and you get good results, it's easy to say that it works well
      • ruaok
        which is why we need to start with something that runs in a reasonable amount of time, yet is representative of the whole picture.
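One way to get a test set that runs quickly yet stays representative is proportional stratified sampling. The following is only a sketch: the grouping key (for example a genre or region label) and the `stratified_sample` helper are hypothetical and do not come from the AB codebase.

```python
import random
from collections import defaultdict


def stratified_sample(tracks, key, n, seed=0):
    """Pick about n tracks, keeping each group's share of the collection."""
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    groups = defaultdict(list)
    for track in tracks:
        groups[key(track)].append(track)

    total = len(tracks)
    sample = []
    for members in groups.values():
        # At least one per group, so small groups are not dropped entirely.
        k = max(1, round(n * len(members) / total))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical collection: 900 tracks labelled "a", 100 labelled "b".
tracks = [("a", i) for i in range(900)] + [("b", i) for i in range(100)]
subset = stratified_sample(tracks, key=lambda t: t[0], n=100)
```

Because every group contributes at least one track, the result can be slightly larger than `n` when there are many tiny groups.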
      • reosarevok
        I think it's fine if AB doesn't work well on African music yet, but then we should be saying "hey, we know we have this issue, who has a large, diverse collection of African music and is willing to help us with it"
      • I guess the problem is the whole picture is so absurdly wide
      • alastairp
        reosarevok: right, but the question here also is if it's just a matter of collecting the data, or if you actually have to perform research on field x in order to learn how to deal with African music
      • ruaok
        reosarevok: the problem is, how do you tell someone which parts work and which don't?
      • if I call an API I expect it to work or I expect to see a confidence rating of the quality of the results. We have few tools to provide such things.
      • alastairp
        which was one of the big premises of compmusic - that you needed specific algorithms
      • reosarevok
        Well, your users should tell you what looks wrong, I guess :) But yeah, it's hard to use that programmatically
      • alastairp
        for example, one change to essentia since we released AB is that it now gives 5 different key estimations - as we realised that the one "standard" model that we thought worked well really only worked well on a small subset of data
      • reosarevok
        I would be surprised if the same algorithms which can work with EDM, jazz and classical break with African music
      • But it might be that they don't work as well on that either :)
      • alastairp
        but also, at the scale of AB, knowing 1 piece of data is wrong in 10m files doesn't give a huge amount of context
      • ruaok
        reosarevok: for instance, I can't use AB in any of my playlist work. I doubt anyone else could.
      • reosarevok
        ruaok: is it meant for that though? I thought it was meant to mostly just be a long-term slow research project :)
      • ruaok
        if they do, they are getting shit results. and just think of how many people have already done research based on AB. clearly without vetting the results.
      • yes, the idea was to have results after 5 years.
      • reosarevok
        I'd expect research to be done as in "we ran AB on this huge collection of African music and it worked / didn't work and this is what we saw"
      • But I guess that might not be happening
      • alastairp
        I think that one big problem with AB is that we thought "oh yes, it can follow the research as it improves", but then we didn't make it follow essentia upgrades
      • reosarevok
        What's the main problem with making it follow upgrades? That it needs to re-scan everything?
      • ruaok
        alastairp: do you have any faith that updated essentia algs would actually scale better?
      • alastairp
        yes - a combination of technical and social hurdles
      • I'm sure that current essentia algorithms are "better" than the AB ones
      • but we're stuck on the definition of better
      • on 20m tracks there are still going to be awful results
      • reosarevok
        Yeah. How doable is it to *know* they are awful? I understand the automatic confidence isn't always great?
      • ruaok
        I think if we continue with AB we need to make things "algorithms first".
      • first prove out that an algorithm works and scales well. then adopt it into AB and run it over data.
      • lucifer
        80% accuracy on 20m is still 4m tracks wrong.
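The arithmetic behind this point can be checked in a couple of lines. This is a sketch using the round figures from the chat (20m tracks, 80% accuracy), not measured values; `expected_wrong` is an illustrative name.

```python
def expected_wrong(total_tracks: int, accuracy: float) -> int:
    """Number of tracks expected to be wrong at a given accuracy."""
    return round(total_tracks * (1 - accuracy))


print(expected_wrong(20_000_000, 0.80))  # 4000000
```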
      • reosarevok
        Also, how doable would it be to combine submitting LB listens with AB submission?
      • ruaok
        reosarevok: not doable at all.
      • reosarevok
        For people running local plugins on like VLC or something
      • alastairp
        that's what I was looking at this morning on the BPM algorithms - I had hoped that the histogram strength would show us when there was uncertainty - but in many cases it was pretty confident in its result
      • ruaok
        90% of our listens come from spotify.
      • lucifer
        spotify provides an audio analysis api fwiw so we could get that data for comparison with ab at least.
      • reosarevok
        ruaok: sure, I'm asking for old school people
      • alastairp
        and hence 95% of research is data management and evaluation
      • reosarevok
        Since I'm assuming we won't be getting access to all of Spotify :p
      • Ideally of course something like AB would have an agreement with something like Spotify, but I assume everyone in that market already has their own inhouse stuff and are not willing to help anybody else
      • lucifer: 80% accuracy is probably the most you can hope for, really - I mean, people shouldn't expect magic when using automatic stuff
      • If you want perfection, use human-built playlists
      • lucifer
        reosarevok: indeed and looking at the research paper that describes the current ab algorithm, my understanding is that 80% accuracy is the best case.
      • reosarevok
        My Spotify release radar for example is a huge mess, playlist-wise, so either they don't even try to sort it, or they do a terrible job of it
      • (it's usually full of "rap-classical-metal-rap-classical" in random orders like that)
      • CatQuest
        hah
      • ruaok
        I think I am going to spend some time working out if the annoy stuff has any utility. because to date, I haven't been convinced of that.
      • CatQuest
        btw I mean I would happily submit ab stuff
      • reosarevok
        So maybe the main issue is not AB data as much as expectations
      • CatQuest
        also yes, i mean it's automated
      • reosarevok
        alastairp: how often is essentia updated?
      • CatQuest
        I was thinking that having a way to let users on e.g. MB give feedback on AB data shown on recordings might be useful?
      • reosarevok
        I don't think it'd be doable to ask people to resubmit more than once a year or so, but having a new data version every year might not be that bad?
      • alastairp
        reosarevok: we try and keep it up to date with new updates to algorithms as they are released
      • but again, only a few people involved in doing that
      • CatQuest
        like if a lot of people "downvote" a bpm tag from ab
      • reosarevok
        alastairp: sure, but how often are algorithms released? :D
      • alastairp
        reosarevok: every time there's a conference
      • CatQuest
        !recall oh no.
      • lucifer
        so yearly ?
      • reosarevok
        CatQuest's point isn't bad either, the more data we show (as "we don't know if this is good") in MB and elsewhere, the more we could find where we have stuff that just looks bad
      • CatQuest
        :D
      • alastairp
        improvements happen all the time. but sometimes that improvement is "we no longer screw up on this small part of this test dataset"
      • reosarevok
        alastairp: so would it be doable to say "we package all new improvements for the year once a year, and offer a new version of AB that supports that, but needs re-scanning"?
      • ruaok
        so, my feeling is that the only AB work that should be happening in the short term is to find new algs that are usable and making a plan for how to reboot.
      • alastairp
        reosarevok: that was one of the original ideas
      • reosarevok
        I guess it would make the data take a huuuge amount of space though if we have yearly versions of all the data?
      • CatQuest
        archive old data?
      • alastairp
        reosarevok: so maybe retire old versions? but then what do you do if an MBID gets processed with n-5 and never gets re-done. do you accept the old (maybe worse) version, or do you delete it?
      • CatQuest
        mark it as old but keep
      • reosarevok
        alastairp: maybe retire old versions *except* for stuff not in any newer version?
      • alastairp
        reosarevok: yes, perhaps
      • reosarevok
        And then allow people to optionally ask the API for "latest version, but fill the gaps with historical"?
      • CatQuest
        show on mb that it needs to be rescanned. call to people for rescanning
      • reosarevok
        So you can choose if you only want the latest, or all
      • CatQuest
        also also, make scanning easier. much, much easier
      • oh i like that idea reo
      • alastairp
        reosarevok: that was my idea for what to do when we got a new version of the extractor. stop accepting the old one, when you request an mbid get the new one if it exists otherwise use the old one
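The lookup rule described here, together with the earlier suggestion that callers choose between "only the latest" and "latest, with gaps filled from historical data", could look roughly like this. It is a minimal sketch: the in-memory `store`, the integer version numbers, and the `latest_only` flag are assumptions for illustration, not the AB API.

```python
from typing import Optional

# Hypothetical store: {mbid: {extractor_version: features}}.
Store = dict[str, dict[int, dict]]


def lookup(store: Store, mbid: str, latest_version: int,
           latest_only: bool = False) -> Optional[dict]:
    """Return features for an MBID, preferring the newest extractor version."""
    versions = store.get(mbid, {})
    if latest_version in versions:
        return versions[latest_version]
    if latest_only or not versions:
        return None  # nothing new enough, and the caller does not want old data
    # Fill the gap with the most recent historical version available.
    return versions[max(versions)]


store = {
    "mbid-a": {1: {"bpm": 120}, 2: {"bpm": 121}},
    "mbid-b": {1: {"bpm": 90}},  # never re-processed with version 2
}
new_result = lookup(store, "mbid-a", latest_version=2)
fallback = lookup(store, "mbid-b", latest_version=2)
strict = lookup(store, "mbid-b", latest_version=2, latest_only=True)
```

Here `fallback` is the version 1 data for `mbid-b`, while `strict` is `None` because only old data exists for it.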
      • CatQuest
        having some way to scan ab with picard would be :chef:
      • alastairp
        this was always a long-term plan, but it relied on having AB dev resources, having a stable release cadence for essentia, etc, etc
      • CatQuest
        :(
      • reosarevok
        alastairp: sounds good, although I think we could still have a way to specifically say "I would rather get 0 results than old results"
      • CatQuest: there's a Picard plugin, but I dunno how well it works?
      • CatQuest
        I still think it can happen. just. idk, lb is being prioritized now. if prioritizing ab will make lb better, I'm sure we can do that
      • reosarevok
        Or maybe that's just to *use* data
      • CatQuest
        mhm
      • reosarevok
        Oh, seems so
      • Anyway, I'm sure it's doable
      • alastairp: how many resources would that take? Are we talking "you spending a month a year on it"? or "needs a full-time person"?
      • If we update once a year, I'm assuming it needs one big push to make that multi-version system work, and then just some time to update every year?
      • BrainzGit
        [troi-recommendation-playground] mayhem opened pull request #41 (main…year-review): Year in music and a whole pile of other general development https://github.com/metabrainz/troi-recommendati...
      • alastairp
        reosarevok: I think that development work on AB to support this kind of feature extraction is probably only a few months of work, if that
      • however, I think that building up QA for algorithms, making improvements, and rolling them out is a full time job for an entire data processing team
      • reosarevok
        Oh, I mean, yes, I'd expect the QA would be "hey, our community has detected these issues, whoever wants to do some research using AB, you can look into improvements for that"
      • I can't expect we're going to be doing the algo improvements ourselves
      • alastairp
        I'm skeptical that a feedback button on an AB page to collect issues would be useful for the long-term improvement of algorithms, though
      • CatQuest
        we'll be training countless neuralnets to do it for us! :D
      • alastairp
        a researcher can't do anything with an mbid and "this is wrong". Perhaps they could do more with mbid + bpm annotation (in the case of bpms)
      • CatQuest
        that's what I meant. the "what is wrong" must be included
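A report that carries the "what is wrong" part could be as small as the record below. This is a sketch only: `BpmReport` and its field names are hypothetical, and it covers just the BPM case mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BpmReport:
    mbid: str             # recording the report is about
    shown_bpm: float      # value AB displayed
    corrected_bpm: float  # value the user says is right

    def __post_init__(self):
        if self.shown_bpm <= 0 or self.corrected_bpm <= 0:
            raise ValueError("bpm values must be positive")
        if self.shown_bpm == self.corrected_bpm:
            # A bare "this is wrong" with no correction is not useful.
            raise ValueError("report must include a different, corrected value")

    @property
    def ratio(self) -> float:
        """Corrected value relative to the shown one (2.0 means AB was half)."""
        return self.corrected_bpm / self.shown_bpm


report = BpmReport(mbid="example-mbid", shown_bpm=70.0, corrected_bpm=140.0)
```

Rejecting reports without a corrected value at construction time keeps "mbid + this is wrong" entries out of the collected data.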
      • alastairp
        because really, you'd need audio in order to make improvements (this is basically a dataset)
      • reosarevok
        alastairp: I was expecting they could try to find the similarities between what kind of things are wrong, if there are enough reports
      • But that'd anyway involve a lot of reports :)
      • alastairp
        I really don't have enough experience in this area to know if many reports of that form would be useful
      • reosarevok
        as in "well, we have a lot of reports for music of genre X"
      • "so we should specifically try to find a good amount of genre X and see what we find"
      • But yeah, dunno
      • alastairp
        yeah, I think that large collections of features -> genres is one of the things that AB _can_ do well.
      • unfortunately, right about the time we released it, people got all in on deep learning, which requires orders of magnitude more features than what we put in AB
      • and training models once you have more than ~1000 examples starts taking exponentially more time
      • so again - in small sets of research data, the data + algorithms looked good, and really did give good results
      • but if you try and apply an 8-class genre classifier to a million tracks, you're going to have problems real quick
      • reosarevok
        Sure
      • So essentia isn't expected to be updated with deep learning algos?
      • alastairp
        it already is
      • reosarevok
        Oh
      • alastairp
        so it's already getting interesting results once you have more data: https://towardsdatascience.com/musicnn-5d1a5883...
      • reosarevok
        So, remind me, how does AB work with essentia again?
      • Is essentia the bit that runs on the files locally, then submits the data up to AB?
      • (just wondering about the "orders of magnitude more features than what we put in AB")
      • alastairp
        right. essentia is a library of algorithms (some are "process this audio into a representation that can be used for machine learning" and some are "give me the bpm of audio"). there is a single binary which runs a bunch of different algorithms over an audio file (the 'music extractor'), which is the AB extractor. then the ab submitter takes the result of that and submits it
      • BrainzGit
        [bookbrainz-site] MonkeyDo opened pull request #730 (master…monkey-yarn-package-manager): Replace NPM with Yarn https://github.com/bookbrainz/bookbrainz-site/p...
      • reosarevok
        So "orders of magnitude more features than what we put in AB" just means "because we haven't updated it"?
      • Or is there actually a hardcoded issue why AB can't support those?
      • alastairp
        partially yes, just a matter of adding it (as I said at the beginning of this discussion, we had already started having a discussion about adding a new data type/extractor to AB)
      • partially no - the more detailed data that you add, the easier it becomes to reverse that data back into audio
      • so then the question of what AB is changes a bit - do you want it to just output single, good values? (accurate bpm, key, etc). if so, you can do this but then you can't use the data in the database to improve algorithms. you have to improve them on external collections of music, then roll out a new version, and do what we discussed about rotating old versions out
      • or maybe you want it to be a collection of detailed features that allow people to use these features independently to build new models, new algorithms etc without needing to have access to large collections of audio
      • reosarevok
        So now we're doing a) ?
      • Also, "add detailed data, and see what ungodly mess comes back when trying to turn it back into audio" sounds hilarious
      • alastairp
        current AB is a bit of both - it includes specific features that required detailed audio data (bpm, key), but then it includes the chroma features which are used for training new models
      • https://gist.github.com/bmcfee/a40c3ab83f166a38... this is an interesting experiment doing exactly that - we have some demo pages somewhere that allow us to play back the reproduced audio, let me see if I can find it
      • reosarevok
        So the doubt is "how many more features can we allow before someone sues us for piracy"?