<aerozol> "We don't allow full automation -..." <- I see. Thanks for clarifying!
Are there some ideas/efforts (beyond seeding) to facilitate contributions in a more semi-automated way?
<aerozol> "Here's an example of an external..." <- Very cool. Just gave it a try here. Worked pretty well. Only had to select the Label manually. (The strings matched, so maybe even that could be "fixed" so one less manual interaction - just a suggestion)
<aerozol> "Here's an example of an external..." <- is there a similar "seeding flow" but for updating/filling in missing fields for existing items?
aerozol
schickling[m]: hmm, not sure... I haven't used one. Maybe someone else knows one.
ShivangiPatel joined the channel
ShivangiPatel has quit
Rishabh joined the channel
Rishabh has quit
vibhoo_24 joined the channel
d4rkie joined the channel
d4rk has quit
vibhoo_24 has quit
alexrelis has quit
alexrelis joined the channel
alexrelis has quit
thuna` joined the channel
vibhoo_24 joined the channel
schickling[m]
quick question: in plain language - what are the differences between the track and recording table? I see that the track table has around 10M more entries.
Based on this DB schema diagram I was assuming that there's a 1:1 relation between a track and a recording, so I'm a bit puzzled why there are so many more tracks than recordings?
schickling[m] uploaded an image: (629KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/zOMhRmJScfnNyMfbWkftntnG/image.png >
vibhoo_24 has quit
duncan
I'm not familiar with the database structure, but is it not because many releases share the same recordings?
schickling[m]
Ah I see. Yeah, that could make sense. It's basically like an n:m join table between a release/medium and recordings
kepstin
indeed, a track is on a specific release, while a recording can be shared between multiple releases (or be standalone, on no releases)
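A minimal sketch of the relationship described above, using an in-memory SQLite database. The schema is heavily simplified and the column names and sample rows are assumptions made for illustration, not the actual MusicBrainz schema. It shows why there can be more tracks than recordings: each track row ties one recording to one medium, so a recording reused on a compilation gets a second track row.

```python
import sqlite3

# Simplified, hypothetical schema: only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recording (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE medium (id INTEGER PRIMARY KEY, release_name TEXT);
CREATE TABLE track (
    id INTEGER PRIMARY KEY,
    medium INTEGER REFERENCES medium(id),
    recording INTEGER REFERENCES recording(id),
    position INTEGER
);
""")

# Made-up sample data: "Song A" appears on two releases, and one recording
# is standalone (it is on no release at all).
conn.executemany("INSERT INTO recording VALUES (?, ?)",
                 [(1, "Song A"), (2, "Song B"), (3, "Standalone demo")])
conn.executemany("INSERT INTO medium VALUES (?, ?)",
                 [(1, "Original album"), (2, "Compilation")])
conn.executemany("INSERT INTO track VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 1), (2, 1, 2, 2), (3, 2, 1, 1)])

track_count = conn.execute("SELECT COUNT(*) FROM track").fetchone()[0]
recordings_on_tracks = conn.execute(
    "SELECT COUNT(DISTINCT recording) FROM track").fetchone()[0]
standalone_count = conn.execute("""
    SELECT COUNT(*) FROM recording r
    WHERE NOT EXISTS (SELECT 1 FROM track t WHERE t.recording = r.id)
""").fetchone()[0]

print(track_count, recordings_on_tracks, standalone_count)
```

In this toy data there are three tracks but only two distinct recordings behind them, plus one recording that no track points to.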
schickling[m]
Got it. Thanks :)
schickling[m] uploaded an image: (264KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/gMpfCkeLGRqDVvadgPMiQTyi/CleanShot%202022-12-08%20at%2017.39.28%402x.png >
I assume those "missing recordings" were deleted over time?
kepstin
could be some deleted ones; also, because of the way PostgreSQL sequences work, the sequence sometimes skips numbers to ensure there are no problems with parallel transactions.
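The gap behaviour can be illustrated with a toy model. This is plain Python, not PostgreSQL, and the `Sequence` class and `insert_row` helper are hypothetical names invented for the sketch. It only mimics the one property that matters here: a sequence value that has been handed out is not given back when the transaction that asked for it rolls back.

```python
class Sequence:
    """Toy stand-in for a database sequence (hypothetical, for illustration)."""

    def __init__(self):
        self.last = 0

    def nextval(self):
        # The counter only ever moves forward; nothing here can undo it.
        self.last += 1
        return self.last


def insert_row(table, seq, name, commit):
    """Take an id from the sequence, then either commit or roll back the row."""
    new_id = seq.nextval()  # the id is consumed even if we roll back below
    if commit:
        table[new_id] = name
    return new_id


seq = Sequence()
table = {}
insert_row(table, seq, "first", commit=True)
insert_row(table, seq, "rolled back", commit=False)  # id 2 is used up here
insert_row(table, seq, "third", commit=True)

ids = sorted(table)
print(ids)
```

The committed rows end up with ids 1 and 3, so id 2 looks "missing" even though no row was ever deleted.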
schickling[m]
I see. Thanks. (Hope you don't mind these sorts of questions, just trying to "absorb" the existing MusicBrainz database design wisdom 💡)
kepstin
but yeah, probably mostly deleted (or merged) recordings
the gid is the real reference, and you can look up merged gids through a separate table
(row ids are used for internal joins in the database tho)
schickling[m]
is the gid synonymous with MBID in tables like track, recording etc?
kepstin
yeah
schickling[m]
gid = global id?
kepstin
they're sometimes also called uuids, too, tho i dunno if that term's ever used in the mbs codebase.
schickling[m]
different question: if there's a row in e.g. `recording_gid_redirect`, does this give me a guarantee that there isn't a `recording` anymore with that `gid`?
another way to ask the same question: are merged `recording` entries deleted?
acohn has quit
acohn joined the channel
ttree joined the channel
kepstin
i believe that is true, but you should get confirmation from someone who's worked with the database more recently than me :)
I know that for deleted stuff, the main record of what it was when it existed will actually be in the editing history, not in the db.
well, editing history is also in the db, but you know what i mean :)
you might also consider asking in #metabrainz:libera.chat for the more technical musicbrainz internals questions.
ttree has quit
ttree joined the channel
fhe has quit
reosarevok
schickling[m]: yes, a redirect gid is as valid as a non-redirect one, so it only applies to one recording
The data is combined into one entity when merging
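A hedged sketch of how a lookup by MBID could follow a redirect, again using in-memory SQLite. The table names `recording` and `recording_gid_redirect` come from the discussion; the column layout (a `new_id` column pointing at the surviving row), the shortened gid strings, and the `resolve_recording` helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recording (
    id INTEGER PRIMARY KEY,
    gid TEXT UNIQUE NOT NULL,
    name TEXT
);
-- Assumed layout: the old gid maps to the row id of the surviving recording.
CREATE TABLE recording_gid_redirect (
    gid TEXT PRIMARY KEY,
    new_id INTEGER NOT NULL REFERENCES recording(id)
);
""")

# Made-up data: 'bbbb-merged' was merged into the recording with gid 'aaaa-kept'.
conn.execute("INSERT INTO recording VALUES (1, 'aaaa-kept', 'Song A')")
conn.execute("INSERT INTO recording_gid_redirect VALUES ('bbbb-merged', 1)")


def resolve_recording(conn, gid):
    """Return (id, gid, name) for a gid, following a redirect if there is one."""
    row = conn.execute(
        "SELECT id, gid, name FROM recording WHERE gid = ?", (gid,)
    ).fetchone()
    if row is not None:
        return row
    # Not a current gid: check whether it was merged into another recording.
    return conn.execute(
        """
        SELECT r.id, r.gid, r.name
        FROM recording_gid_redirect rd
        JOIN recording r ON r.id = rd.new_id
        WHERE rd.gid = ?
        """,
        (gid,),
    ).fetchone()


direct = resolve_recording(conn, "aaaa-kept")
redirected = resolve_recording(conn, "bbbb-merged")
unknown = resolve_recording(conn, "cccc-never-existed")
print(direct, redirected, unknown)
```

Both the current gid and the merged one resolve to the same single recording, which matches the point that a redirect gid applies to exactly one entity.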
anonn joined the channel
vibhoo_24 joined the channel
schickling[m]
is there an Elasticsearch instance or similar for fuzzy search on MusicBrainz?
kaliko has quit
kaliko joined the channel
reosarevok
Nope. IIRC we had pretty bad experiences with ES specifically too, anyway
From when we tried it for BookBrainz
Solr supports some sorts of fuzzy-ish search, but dunno if it's what you need :)
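As a rough sketch of what "fuzzy-ish" could look like in practice: Lucene-style query syntax uses a trailing `~` on a term for fuzzy matching. The snippet below only builds a search URL and makes no network request. The helper name `fuzzy_recording_search_url` is hypothetical, and whether the search server treats `~` the way plain Lucene does is an assumption worth checking against the search documentation.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def fuzzy_recording_search_url(term, base="https://musicbrainz.org/ws/2/recording"):
    """Build a search URL for a single-word term with a fuzzy (~) operator.

    Assumes `term` is one plain word; terms with spaces or special
    characters would need escaping, which this sketch does not do.
    """
    query = f"recording:{term}~"  # trailing ~ requests a fuzzy match in Lucene syntax
    return f"{base}?{urlencode({'query': query, 'fmt': 'json'})}"


url = fuzzy_recording_search_url("yesterdya")  # deliberately misspelled
params = parse_qs(urlparse(url).query)
print(url)
```

The misspelled term is there on purpose: the idea is that a fuzzy match could still find the correctly spelled title.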
vibhoo_24 has quit
schickling[m]
<reosarevok> "Nope. IIRC we have pretty bad..." <- Would love to hear more about the sort of problems you ran into actually 🤔