#metabrainz

/

      • bitmap
        no it's hardcoded into the structure
      • monkey[m]
        bdrats
      • reosarevok
        recall oh no.
      • Eh
      • bitmap
        at best we could change the index to be calculated on a substring (prefix) of the total name, I think...
      • reosarevok
        !recall oh no.
      • BrainzBot
      • reosarevok
        Yeah I found that as a suggestion when researching this
      • I understand that'd be a schema change tho and not a quick fix
      • bitmap
        well, it's even more annoying since any query that currently uses the index will have to be updated to perform the lookup using the same substring expression
      • reosarevok
        So, block too-long recording names in the schema change? :p
      • Can we refuse to index something if it's too long? Guess not?
      • Changing everything for this seems not worth it, but very long names could legitimately exist (whatever your opinion on the legitimacy of this specific one)...
      • lusciouslover has quit
      • lusciouslover joined the channel
      • minimal joined the channel
      • Tarun_0x0 joined the channel
      • Lotheric_ joined the channel
      • Lotheric has quit
      • Attila41 joined the channel
      • bitmap
        reosarevok: probably, just not sure what the limit should be
      • lusciouslover has quit
      • lusciouslover joined the channel
      • Tarun_0x0 has quit
      • reosarevok
        I guess "the biggest that is likely to fit the index"...
      • I guess that's not chars but bits though?
      • ApeKattQuest
        I mean somewhere out there there is a song that's just like, an entire NOVEL, just because (should that be added to BB as well/instead though 🤔 ) but yea
      • bitmap
        well the error message cites index row size which isn't the same as character count
      • the recording id you posted is the only name > 3000 chars which might be a start (3403 vs. 2955 in second place)
      • aerozol[m] has quit
      • derat joined the channel
      • sorting by char count is a nice way to find high quality white noise though
      • Attila41
        I added the record causing the issue to the MBS-13555 ticket.
      • BrainzBot
        MBS-13555: index row size exceeds btree version 4 maximum https://tickets.metabrainz.org/browse/MBS-13555
      • BrainzGit
        [listenbrainz-server] 14MonkeyDo opened pull request #2848 (03single-page-app…spa-improvement): Single page app fixes https://github.com/metabrainz/listenbrainz-serv...
      • bitmap
        Attila41: thanks, unfortunately it's not a simple fix. if you are able to modify the server code, you can comment out "recording_idx_name" in admin/sql/CreateIndexes.sql before recreating the DB
      • monkey[m]
        ansh (IRC): I deployed the spa-improvement branch agaoin to test.LB with more fixes, and opened LB#2848
      • BrainzBot
      • monkey[m]
        Continuation tomorrow :)
      • Attila41
        bitmap: Thanks. But don't you think there is inappropriate data in the name column? I provided the column content in the ticket. Could you please check it? Maybe it's easier to fix the data issue instead of recreating a DB with a 16K block size.
      • bitmap
        Attila41: I'm fully aware, and we may indeed have to arbitrarily truncate the name and generate a new dump in the interim (which will take some time). however that doesn't fix the underlying issue nor prevent its recurrence
      • reosarevok: lmk when you're back and we can discuss how to proceed
      • reosarevok
        reosarevok: back for a moment
      • ...
      • bitmap: even
      • My brain isn't quite here it seems
      • Ideally we'd do the indexing a truncated version but that seems like a ton of effort. Maybe a limit would be fine
      • Attila41: that does seem to be the actual name of the recording, although it's clearly kinda-sorta trollish, so it's a complicated situation :)
      • bitmap
        I considered changing it to a hash index but I think we do use it for sorting quite a bit
      • ORDER BY.*name has 153 hits so it might not be too bad to identify all the places it's used for sorting
      • so the truncated index is doable but recording_idx_name is not the only problematic index of course
      • Tarun_0x0 joined the channel
      • derat has quit
      • Attila41 has quit
      • ApeKattQuest
        a bagpipe that might interest indians! https://beta.musicbrainz.org/instrument/e1a801e...
      • bitmap
        98% of recording names are below 64 chars, 99.9% below 128 chars. there are only 152 recordings >= 512 chars and 17 >= 1024 chars
      • relaxoMob has quit
      • relaxoMob joined the channel
      • reosarevok
        Yeah, so sorting by truncated would sort almost all fine, but work
      • minimal has quit
      • bitmap
        left(name, 1024) sounds reasonable to me
      • yvanzo: ^ if you're around
      • Tarun_0x0 has quit
      • we shouldn't implement that without a specific hardcoded limit on the recording name length, though (since the index error effectively limits it currently)
      • Tarun_0x0 joined the channel
      • relaxoMob has quit
      • relaxoMob joined the channel
      • reosarevok
        So do you mean truncating the recording names, or creating an index of the first X characters?
      • bitmap
        an index, but the name length limit would be larger than the index limit
      • reosarevok
        Why limit names at all though then?
      • bitmap
        because they're currently limited and I'm not sure opening them is a good idea
      • reosarevok
        Oh, ok. But not bringing that limit down
      • bitmap
        we can set it to something higher than 3403 (which is what's needed to store the masterpiece)
      • the functional limit on our DB probably isn't much higher
      • jetpackjackson[4 joined the channel
      • jetpackjackson[4
        <aerozol> "jetpackjakson: probably more #..." <- https://community.metabrainz.org/t/how-to-undo-... I ended up getting it mostly resolved here but you can still take a look if you want!
      • MatchaTeaCookies joined the channel
      • MatchaTeaCookies has left the channel
      • monkey[m] has quit
      • Tarun_0x0 has quit
      • minimal joined the channel
      • eharris joined the channel
      • lucifer
        mayhem: just noticed spotify has monthly stats, updated daily.
      • *has added
      • eharris has quit
      • relaxoMob has quit
      • yellowhatpro has quit
      • eharris joined the channel
      • relaxoMob joined the channel
      • jetpackjackson[4 has quit
      • relaxoMob has quit
      • relaxoMob joined the channel
      • SigHunter has quit
      • SigHunter joined the channel
      • zerodogg has quit
      • Kladky has quit
      • BrainzGit
        [musicbrainz-server] 14mwiencek opened pull request #3245 (03master…mbs-13557): MBS-13557: Set a Referrer-Policy on login/register pages https://github.com/metabrainz/musicbrainz-serve...
      • bitmap
        lucifer: do you need this on test.mb ^?
      • zerodogg joined the channel
      • Tarun_0x0 joined the channel
      • Tarun_0x0 has quit
      • minimal has quit
      • minimal joined the channel