#musicbrainz

      • aerozol
        schickling[m]: the best precedents to look at are probably these importers/'seeders': https://wiki.musicbrainz.org/Guides/Userscripts...
      • We don't allow full automation - i.e. a human has to press the 'submit' button - so seeding is the way to go
      • Here's an example of an external website rather than a userscript: https://atisket.pulsewidth.org.uk/
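To make the seeding idea above concrete, here is a minimal sketch of how an external tool might build a form that pre-fills the MusicBrainz release editor. The function name `build_seed_form` is hypothetical, and the endpoint and field names (`name`, `artist_credit.names.0.name`, `mediums.0.track.N.name`, `edit_note`) are assumptions based on the release editor seeding documentation, so check them there before relying on this. The tool only prepares the data: a human still reviews and submits the edit in the release editor.

```python
from html import escape

# Assumed endpoint of the MusicBrainz release editor; seeded forms are POSTed here.
SEED_URL = "https://musicbrainz.org/release/add"


def build_seed_form(title, artist, tracks, edit_note=""):
    """Return an HTML form that pre-fills the release editor (hypothetical helper).

    The field names are assumptions taken from the seeding documentation;
    verify them before use.
    """
    fields = {
        "name": title,
        "artist_credit.names.0.name": artist,
        "edit_note": edit_note,
    }
    # One field per track title, all on the first (index 0) medium.
    for i, track_title in enumerate(tracks):
        fields[f"mediums.0.track.{i}.name"] = track_title

    # Escape names and values so quotes in titles cannot break the markup.
    inputs = "\n".join(
        f'  <input type="hidden" name="{escape(key)}" value="{escape(value)}">'
        for key, value in fields.items()
    )
    return (
        f'<form action="{SEED_URL}" method="post">\n'
        f"{inputs}\n"
        '  <button type="submit">Open in MusicBrainz release editor</button>\n'
        "</form>"
    )
```

The button only opens the editor with the values filled in; nothing is saved until the user submits there.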
      • schickling[m]
        <aerozol> "We don't allow full automation -..." <- I see. Thanks for clarifying!
      • Are there some ideas/efforts (beyond seeding) to facilitate contributions in a more semi-automated way?
      • <aerozol> "Here's an example of an external..." <- Very cool. Just gave it a try here. Worked pretty well. Only had to select the Label manually. (The strings matched, so maybe even that could be "fixed" so one less manual interaction - just a suggestion)
      • <aerozol> "Here's an example of an external..." <- is there a similar "seeding flow" but for updating/filling in missing fields for existing items?
      • aerozol
        schickling[m]: hmm, not sure... I haven't used one. Maybe someone else knows one.
      • schickling[m]
        quick question: in plain language - what are the differences between the track and recording table? I see that the track table has around 10M more entries.
      • Based on this DB schema diagram I was assuming that there's a 1:1 relation between a track and a recording, so I'm a bit puzzled why there are so many more tracks than recordings?
      • schickling[m] uploaded an image: (629KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/zOMhRmJScfnNyMfbWkftntnG/image.png >
      • duncan
        I'm not familiar with the database structure, but is it not because many releases share the same recordings?
      • schickling[m]
        Ah I see. Yeah, that could make sense. It's basically like an n:m join table between a release/medium and recordings
      • kepstin
        indeed, a track is on a specific release, while a recording can be shared between multiple releases (or be standalone, on no releases)
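The relationship described above can be illustrated with a small in-memory sketch. The schema here is heavily simplified and only assumed to mirror the real one in spirit (the actual `track`, `medium` and `recording` tables have many more columns), and all rows are made-up sample data. It shows why the track count exceeds the recording count: each appearance of a recording on a medium is its own track row, and a standalone recording has no track at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the MusicBrainz tables (assumed shape, sample data).
    CREATE TABLE recording (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE medium (id INTEGER PRIMARY KEY, release_name TEXT);
    CREATE TABLE track (
        id INTEGER PRIMARY KEY,
        medium INTEGER REFERENCES medium(id),
        recording INTEGER REFERENCES recording(id),
        position INTEGER
    );
    INSERT INTO recording VALUES
        (1, 'Song A'), (2, 'Song B'), (3, 'Standalone demo');
    INSERT INTO medium VALUES
        (1, 'Original album'), (2, 'Reissue'), (3, 'Compilation');
    -- The same two recordings appear on the album and its reissue;
    -- 'Song A' also appears on a compilation.
    INSERT INTO track VALUES
        (1, 1, 1, 1), (2, 1, 2, 2),
        (3, 2, 1, 1), (4, 2, 2, 2),
        (5, 3, 1, 1);
""")

track_count = conn.execute("SELECT COUNT(*) FROM track").fetchone()[0]
recording_count = conn.execute("SELECT COUNT(*) FROM recording").fetchone()[0]

# LEFT JOIN keeps recordings that are not on any medium (count 0).
tracks_per_recording = dict(conn.execute("""
    SELECT r.name, COUNT(t.id)
    FROM recording r
    LEFT JOIN track t ON t.recording = r.id
    GROUP BY r.id
"""))
```

With this sample data there are five tracks for three recordings, and the standalone recording maps to zero tracks.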
      • schickling[m]
        Got it. Thanks :)
      • schickling[m] uploaded an image: (264KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/gMpfCkeLGRqDVvadgPMiQTyi/CleanShot%202022-12-08%20at%2017.39.28%402x.png >
      • I assume those "missing recordings" were deleted over time?
      • kepstin
        could be some deleted ones; also, because of the way postgresql sequences work, the sequence sometimes skips numbers to ensure there are no problems with parallel transactions.
      • schickling[m]
        I see. Thanks. (Hope you don't mind these sorts of questions, just trying to "absorb" the existing MusicBrainz database design wisdom 💡)
      • kepstin
        but yeah, probably mostly deleted (or merged) recordings
      • the gid is the real reference, and you can look up merged gids through a separate table
      • (row ids are used for internal joins in the database tho)
      • schickling[m]
        is the gid synonymous with MBID in tables like track, recording etc?
      • kepstin
        yeah
      • schickling[m]
        gid = global id?
      • kepstin
        they're sometimes also called uuids, too, tho i dunno if that term's ever used in the mbs codebase.
      • schickling[m]
        different question: if there's a row in e.g. `recording_gid_redirect`, does this give me a guarantee that there isn't a `recording` anymore with that `gid`?
      • another way to ask the same question: are merged `recording` entries deleted?
      • kepstin
        i believe that is true, but you should get confirmation from someone who's worked with the database more recently than me :)
      • I know that for deleted stuff, the main record of what it was when it existed will actually be in the editing history, not in the db.
      • well, editing history is also in the db, but you know what i mean :)
      • you might also consider asking in #metabrainz:libera.chat for the more technical musicbrainz internals questions.
      • reosarevok
        schickling[m]: yes, a redirect gid is as valid as a non-redirect one, so it only applies to one recording
      • The data is combined into one entity when merging
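The merge behaviour described in this exchange can be sketched as a lookup that tries the live `recording` table first and falls back to `recording_gid_redirect`. This is a simplified illustration under stated assumptions: the column names (`gid`, `new_id`) are assumed to match the real schema, the helper names `make_demo_db` and `resolve_recording` are hypothetical, and the gids are made-up sample values.

```python
import sqlite3

# Made-up sample gids (MBIDs) for the sketch.
KEPT_GID = "11111111-1111-4111-8111-111111111111"
MERGED_GID = "22222222-2222-4222-8222-222222222222"


def make_demo_db():
    """Build a tiny in-memory database with one merged recording."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE recording (
            id INTEGER PRIMARY KEY, gid TEXT UNIQUE, name TEXT);
        CREATE TABLE recording_gid_redirect (
            gid TEXT PRIMARY KEY,
            new_id INTEGER REFERENCES recording(id));
    """)
    # The merge target keeps its row; the merged-away gid only survives
    # as a redirect pointing at the target's row id.
    conn.execute("INSERT INTO recording VALUES (1, ?, 'Song A')", (KEPT_GID,))
    conn.execute("INSERT INTO recording_gid_redirect VALUES (?, 1)", (MERGED_GID,))
    return conn


def resolve_recording(conn, gid):
    """Return (id, gid, name) for a gid, following a redirect if needed."""
    row = conn.execute(
        "SELECT id, gid, name FROM recording WHERE gid = ?", (gid,)
    ).fetchone()
    if row is not None:
        return row
    # Not a live gid: see whether it was merged into another recording.
    return conn.execute(
        """
        SELECT r.id, r.gid, r.name
        FROM recording_gid_redirect rd
        JOIN recording r ON r.id = rd.new_id
        WHERE rd.gid = ?
        """,
        (gid,),
    ).fetchone()
```

Both the current gid and the merged-away gid resolve to the same surviving row, while an unknown gid yields `None`.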
      • schickling[m]
        is there an Elasticsearch instance or similar for fuzzy search on MusicBrainz?
      • reosarevok
        Nope. IIRC we have pretty bad experiences with ES specifically too, anyway
      • From when we tried it for BookBrainz
      • Solr supports some sorts of fuzzy-ish search, but dunno if that's what you need :)
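As a rough sketch of the fuzzy-ish search mentioned above: the MusicBrainz search web service accepts Lucene-style queries, where a trailing `~` on a term asks for approximate matching. The helper name `fuzzy_search_url` is hypothetical, and whether the results are fuzzy enough for a given use case is an open question here. The function only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Root of the MusicBrainz web service (version 2).
SEARCH_ROOT = "https://musicbrainz.org/ws/2"


def fuzzy_search_url(entity, field, term, limit=5):
    """Build a search URL using the Lucene fuzzy operator (hypothetical helper).

    Only handles a single-word term; phrases would need Lucene quoting
    and escaping, which this sketch does not attempt.
    """
    query = f"{field}:{term}~"
    params = urlencode({"query": query, "fmt": "json", "limit": limit})
    return f"{SEARCH_ROOT}/{entity}?{params}"


url = fuzzy_search_url("recording", "recording", "nevermnd")
```

Anything fetching this URL should send a descriptive User-Agent and respect the service's rate limits.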
      • schickling[m]
        <reosarevok> "Nope. IIRC we have pretty bad..." <- Would love to hear more about the sort of problems you ran into actually 🤔