Would be cool if annas archive could help musicbrainz with metadata haha
2025-12-21 35544, 2025
FirefoxDeHuk has quit
2025-12-21 35531, 2025
sam2 has quit
2025-12-21 35550, 2025
avamander[m] joined the channel
2025-12-21 35550, 2025
avamander[m]
200GB of metadata seems like a lot
2025-12-21 35539, 2025
avamander[m]
i fear that that dump will be used to create neural models that generate more inauthentic music that's hard to instantly pick up
2025-12-21 35544, 2025
avamander[m]
s/pick up/detect/
2025-12-21 35531, 2025
avamander[m]
so if it's already a problem I can't imagine how it will be in like 5 years :(
2025-12-21 35503, 2025
avamander[m] uploaded an image: (6032KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/cQKhYTRXUDVDBtGmBeAegIOs/IMG_2410.png >
2025-12-21 35504, 2025
avamander[m]
cool graphs
2025-12-21 35522, 2025
BlastboomStrice[
avamander[m]: Yeah I think they do sell it if you are a big organisation
2025-12-21 35512, 2025
tagomago has quit
2025-12-21 35553, 2025
iconoclasthero has quit
2025-12-21 35555, 2025
hgc has quit
2025-12-21 35513, 2025
iconoclasthero joined the channel
2025-12-21 35558, 2025
MeatPupp3t217 has quit
2025-12-21 35516, 2025
MeatPupp3t217 joined the channel
2025-12-21 35558, 2025
hgc joined the channel
2025-12-21 35556, 2025
iconoclasthero has quit
2025-12-21 35514, 2025
iconoclasthero joined the channel
2025-12-21 35523, 2025
petitminion joined the channel
2025-12-21 35551, 2025
wileyfoxyx[m] joined the channel
2025-12-21 35552, 2025
wileyfoxyx[m] uploaded an image: (60KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/gLClsklwbXktPWVoptFynjbp/image.png >
2025-12-21 35552, 2025
wileyfoxyx[m]
tell me you're an AI artist without telling me...
2025-12-21 35518, 2025
elomatreb[m]
what's with those disambiguations
2025-12-21 35527, 2025
wileyfoxyx[m]
I know right lol. this guy was adding all these releases and put these disambiguations in himself. luckily he still had some of them open for voting, so I told him he should not do this
2025-12-21 35512, 2025
wileyfoxyx[m]
he was also adding these disambiguations to some recordings too. I noted that on those edits that are still on voting and he (for some reason) cancelled two of them lol
2025-12-21 35523, 2025
wileyfoxyx[m] uploaded an image: (51KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/ylZabjaUMgTubUlhaZgPLsMJ/image.png >
Can we consider the metadata an open primary source for MB?
2025-12-21 35515, 2025
shisma[m]
where is this log
2025-12-21 35523, 2025
shisma[m]
\ ?
2025-12-21 35539, 2025
elomatreb[m]
Help -> Show debug log (or something like that)
2025-12-21 35554, 2025
thuna` joined the channel
2025-12-21 35537, 2025
aerozol[m] joined the channel
2025-12-21 35538, 2025
aerozol[m]
wargreen: MusicBrainz doesn't have any official rules around sources. You just have to be prepared to justify your edits. I would say that annas-archive is just a copy of a primary source (Spotify), so why not
2025-12-21 35538, 2025
wargreen has left the channel
2025-12-21 35519, 2025
macularguide[m] has quit
2025-12-21 35509, 2025
BlastboomStrice[
Yeah thinking about it again, it's not very useful for musicbrainz
2025-12-21 35516, 2025
BlastboomStrice[
We already have harmony
2025-12-21 35542, 2025
BlastboomStrice[
It does need some manual curation before applying edits
2025-12-21 35555, 2025
BlastboomStrice[
(Saying this cuz I also shared this news earlier)
2025-12-21 35555, 2025
wargreen[m]
aerozol[m]: Good to know... Maybe one more project for the todo list: write a script to push the data
2025-12-21 35536, 2025
wargreen[m]
BlastboomStrice[: oh, I hadn't seen that
2025-12-21 35538, 2025
wargreen[m]
it links ISRC / artist / album / release / track, so maybe we can use it to add the missing albums?
2025-12-21 35505, 2025
aerozol[m]
wargreen: It was always possible to scrape Spotify data directly to MusicBrainz, but MB rules do not allow this. Community cleanup from a single user doing this with another source, *years* ago, is still ongoing. http://harmony.pulsewidth.org.uk/ is an excellent tool for seeding data from Spotify
2025-12-21 35525, 2025
aerozol[m]
That said, something like adding ISRCs to MB tracks that are already linked to Spotify tracks could be automated. There have been a few similar, very useful, initiatives (like adding genres from Bandcamp, when entities are already linked). Keep in mind that bots have to be signed off :)
2025-12-21 35551, 2025
aerozol[m]
annas-archive could make it easier, but I don't think it adds anything new?
(as a terminal command, if you're comfortable doing that)
2025-12-21 35501, 2025
elomatreb[m]
you can probably also do it from Finder (assuming that's macOS?) but I can't tell you how exactly from memory
2025-12-21 35525, 2025
elomatreb[m]
sorry, the command should be chmod u+w
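
The corrected command, chmod u+w, adds write permission for the file's owner and leaves the other permission bits alone. For anyone who would rather script it than type it in a terminal, a rough Python equivalent could look like the sketch below. The function name is made up for illustration, and the path is left to the caller because the log does not say which file is involved.

```python
import os
import stat


def add_user_write(path: str) -> int:
    """Add the owner-write bit to `path`, like `chmod u+w path`.

    Hypothetical helper for illustration. Returns the new permission bits.
    """
    # Read the current permission bits (without the file-type bits).
    current = stat.S_IMODE(os.stat(path).st_mode)
    # OR in the owner-write bit; every other bit is left as it was.
    new_mode = current | stat.S_IWUSR
    os.chmod(path, new_mode)
    return new_mode
```

Calling add_user_write on a file that is already writable by its owner changes nothing, same as the terminal command.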
2025-12-21 35511, 2025
shisma[m] uploaded an image: (22KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/DwXKZsOjJKTRkCPGSsAUDxKf/Screenshot_2025-12-21_at_21.00.04.png >
2025-12-21 35512, 2025
shisma[m]
this is so weird. I assumed in macOS I would just have to look if this is checked
2025-12-21 35513, 2025
wargreen[m]
<aerozol[m]> "annas-archive could make it..." <- according to the anna-archive blog post :
2025-12-21 35513, 2025
wargreen[m]
> MusicBrainz has 5 million unique ISRCs, while our database has 186 million.
2025-12-21 35513, 2025
wargreen[m]
So my initial question: here, it is not "scraping" spotify... It is now a database on our computers. Do we consider this data as "open" now? :D
2025-12-21 35531, 2025
petitminion joined the channel
2025-12-21 35559, 2025
Island joined the channel
2025-12-21 35536, 2025
iconoclasthero has quit
2025-12-21 35549, 2025
iconoclasthero joined the channel
2025-12-21 35537, 2025
i522
that and you *could* (if you got the actual music files), submit acoustid fingerprints generated from the files, if the spotify entry is already known to correspond to a MB entry
2025-12-21 35537, 2025
i522
how risky that'd be legally for MB is unclear tho
2025-12-21 35520, 2025
petitminion has quit
2025-12-21 35538, 2025
AJ_Z0 has quit
2025-12-21 35512, 2025
AJ_Z0 joined the channel
2025-12-21 35537, 2025
iconoclasthero has quit
2025-12-21 35553, 2025
iconoclasthero joined the channel
2025-12-21 35502, 2025
wargreen[m]
for now, only the metadata have been released
2025-12-21 35541, 2025
iconoclasthero has quit
2025-12-21 35502, 2025
iconoclasthero joined the channel
2025-12-21 35532, 2025
|G0d| has quit
2025-12-21 35557, 2025
aerozol[m]
wargreen: What's the functional difference between taking the data from Spotify and putting it on MB, and moving the data from Spotify to your computer and then putting it on MB? I guess I don't understand the question. The same rules apply to both as far as I can tell.
2025-12-21 35500, 2025
wargreen[m]
aerozol[m]: for me, something changes on the legal side: now there has been a leak of all the data, and it is publicly available
2025-12-21 35531, 2025
aerozol[m]
It was always legal to take ISRCs from Spotify.
2025-12-21 35516, 2025
wargreen[m]
And what about the MB rules? Does anything change?
2025-12-21 35508, 2025
wargreen[m]
I'm thinking of ISRCs, but also all the related metadata. Can we programmatically populate MB from this data?
2025-12-21 35514, 2025
aerozol[m]
If we just cloned Spotify's data... what is the point? The value in MB is the editors who have spent hours comparing recordings, differentiating artists with the same name, linking things together. You could have a billion ISRCs in a file but that stat on its own doesn't really mean anything.
2025-12-21 35502, 2025
aerozol[m]
Note that community cleanup from a single user doing auto-imports from another source, years ago, is still ongoing. Unfortunately it's not so simple. Unless the source is perfect and you can de-dupe etc
2025-12-21 35541, 2025
aerozol[m]
For targeted things, like previous bots have done with fingerprints and genres, leveraging the human connections already made here, all good!
2025-12-21 35516, 2025
elomatreb[m]
@wargreen: The rules about mass imports are not about the data source (although that can also be a problem), they're about the automated editing process in itself
2025-12-21 35552, 2025
corecaps has quit
2025-12-21 35506, 2025
wargreen[m]
I understand this point. It seems difficult to know whether the imported data will be clean
2025-12-21 35519, 2025
elomatreb[m]
legally speaking it should not be a problem to add metadata from the scraped data though, since simple factual statements are not considered to be protected by copyright (at least in the US, European law has some exceptions for this). The various Spotify importer tools already get their data from the same place anyway
2025-12-21 35520, 2025
elomatreb[m]
yeah, the data quality on Spotify's side (or any storefront/streaming provider, really) is really variable
2025-12-21 35523, 2025
julian45[m]
right. it's as if magicisrc or isrchunt simply pulled from an offline, local copy of spotify rather than querying live. you still need to be sure the isrc goes to the right recording first, which is the MB editor's job