#metabrainz

      • minimal has quit
      • pite has quit
      • lucifer[m]
        <mayhem[m]> "i think the 2 character limit is..." <- i had run typesense with 2 character limit as well for this comparison.
      • BrainzGit
        [musicbrainz-server] 14reosarevok merged pull request #3315 (03master…useless-hangul-filler): MBS-13528 / MBS-13696: Calculate invalid edit notes in more places and with more invisible characters https://github.com/metabrainz/musicbrainz-serve...
      • Kladky joined the channel
      • relaxo[m] joined the channel
      • relaxo[m]
        reosarevok How does this edit note thing apply to seeds? What happens when there are forbidden chars in the seeded edit note?
      • reosarevok[m]
        Nothing should happen as long as they're not the only characters :)
      • This is meant to stop notes which are only spaces, invisible chars and so on
      • If you have a normal note with some of them they will just work, in theory
      • (if you find a bug, let us know!)
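A minimal sketch of the rule described above (a note is rejected only when it consists entirely of whitespace or invisible characters). This is not the musicbrainz-server implementation; the function name and the `EXTRA_INVISIBLE` set are assumptions for illustration, and the "Hangul filler" entries are suggested only by the branch name of the merged PR.

```python
import unicodedata

# Assumption: an illustrative set of code points that render as blank but are
# not classified as whitespace or format characters by Unicode.
EXTRA_INVISIBLE = {
    "\u3164",  # HANGUL FILLER
    "\u115f",  # HANGUL CHOSEONG FILLER
    "\u1160",  # HANGUL JUNGSEONG FILLER
    "\uffa0",  # HALFWIDTH HANGUL FILLER
    "\u2800",  # BRAILLE PATTERN BLANK
}

# Unicode general categories treated as invisible: format, control, separators.
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Zs", "Zl", "Zp"}


def is_effectively_empty(note: str) -> bool:
    """Return True if the note contains no visible character at all."""
    for ch in note:
        if ch in EXTRA_INVISIBLE or ch.isspace():
            continue
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            continue
        return False  # found at least one visible character
    return True


# A note made only of invisible characters would be rejected ...
print(is_effectively_empty("  \u200b\u3164"))
# ... while a normal note that merely contains one of them would pass.
print(is_effectively_empty("see \u200b booklet"))
```

Under this sketch a seeded note that mixes a hint with such characters is still a valid note, which matches the "as long as they're not the only characters" answer.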
      • Kladky has quit
      • Kladky joined the channel
      • relaxo[m]
        Okay. No bug found yet, but I want to get to another point. Sometimes my seeder does not clearly identify an entity so it should be looked up manually. I want to make sure that the editor will notice it. My idea is to add a forbidden char to the edit note with a hint to point the editor to it.
      • reosarevok[m]
        Hmm
      • If you seed a relationship with name only, it shouldn't let the user submit until they either select the entity or remove the relationship, I think?
      • yvanzo: minor, but in https://github.com/metabrainz/musicbrainz-serve... - we have the prepare jira step first as 1, talking about updating tickets, but the link to the tickets and the transitioning etc is only in the update jira step (7)
      • If we want to update the descriptions and whatnot in step 1 we should move that stuff to step 1, and if not we only need step 7 and step 1 should be empty, no? :)
      • Vile_Vulture joined the channel
      • Vile_Vulture has left the channel
      • relaxo[m]
        reosarevok Thanks, will try. I am using derat/yambs command line. Maybe he is around and can answer if this is possible. derat
      • kellnerd[m]
        Why don't you simply try it yourself? I am pretty sure the form can't be submitted if you only seed names instead of MBIDs.
      • relaxo[m]
        Not at home rn. Will try it later for sure.
      • reosarevok[m]
        yvanzo, bitmap: did some Spanish translating, will release beta in the afternoon / evening - feel free to review or ask for review on stuff if you want me to put them out today
      • yvanzo: what was the requisite to put the Spanish translation out in prod? :)
      • (it's probably close to that, so I'd like to know what to prioritize)
      • reosarevok[m] goes for a dog walk and some food after
      • mayhem[m]
        lucifer: nudge on the failing tests here: https://github.com/metabrainz/metabrainz.org/pu...
      • also, ready to chat about Solr/typesense.
      • yvanzo[m]
        reosarevok: Yes, it is redundant because it has been too often overlooked. Tickets should actually be updated when transitioning to the development branch.
      • reosarevok: For prod languages, it should be nearly complete. For Spanish, relationship types seem to be the only last big chunk.
      • reosarevok: Is that just me or not? https://github.com/metabrainz/musicbrainz-serve...
      • BrainzGit
        [metabrainz.org] 14amCap1712 merged pull request #474 (03master…CVE-2024-40647): Update sentry sdk https://github.com/metabrainz/metabrainz.org/pu...
      • lucifer[m]
        mayhem: PR fixed.
      • for solr/typesense, i think best way would be to find some test cases and compare results and discuss?
      • reosarevok[m]
        yvanzo[m]: I'll recheck when I get home:)
      • mayhem[m]
        lucifer[m]: sounds good. let me finish with the hotel nonsense and we can proceed.
      • lucifer[m]
        👍️
      • BrainzGit
        [metabrainz.org] release 03v-2024-07-23.0 has been published by 14mayhem: https://github.com/metabrainz/metabrainz.org/re...
      • mayhem[m]
        lucifer: ready.
      • derat[m] joined the channel
      • derat[m]
        relaxo yeah, my experience matches others'. i think that edit forms generally won't let you submit if you've only set a name in a field that requires an MBID. pretty much every seeder relies on this functionality if it e.g. can't find an MBID for an artist.
      • lucifer[m]
      • mayhem[m]
        ok, the mapping test cases, ok.
      • we should add a few much longer test cases
      • lucifer[m]
        1. beyoncé - dreaming: doing a simple artist and recording search it should be an exact match which works fine. but note the diacritic e should be present.
      • solr by default doesn't remove diacritics from the text field. but we can create an extra field to store the unidecoded output
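A rough sketch of the "extra field" idea: keep the original text and store a diacritic-free copy next to it at index time. It uses only the standard library as an approximation of unidecode (which also transliterates non-Latin scripts); the field names in `make_doc` are hypothetical and not taken from any actual schema.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks (é -> e, ü -> u)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_doc(artist: str, recording: str) -> dict:
    """Build an index document with original and folded variants.

    Field names are assumptions made for this example.
    """
    return {
        "artist_name": artist,
        "recording_name": recording,
        "artist_name_folded": strip_diacritics(artist).lower(),
        "recording_name_folded": strip_diacritics(recording).lower(),
    }


doc = make_doc("Beyoncé", "Dreaming")
print(doc["artist_name"], "->", doc["artist_name_folded"])
```

With both fields present, a query for "beyonce" can hit the folded field while an exact query with the accent still matches the original one.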
      • BrainzGit
        [listenbrainz-server] 14MonkeyDo merged pull request #2944 (03master…sentry_react_router): feat: Upgrade sentry to use React Router Integration https://github.com/metabrainz/listenbrainz-serv...
      • lucifer[m]
        <mayhem[m]> "we should add a few much..." <- can you look for more test cases while i test these ones?
      • mayhem[m]
        sure.
      • https://musicbrainz.org/recording/84abfa00-de53... artist: "Pink Floyd" album: "Ummagumma" recording: "Several Species of Small Furry Animals Gathered Together in a Cave and Grooving With a Pict"
      • which is worth listening to, if you don't know it.
      • right zas ?
      • zas[m]
        Definitely
      • mayhem[m]
        1daf6f11-cbec-4503-b0b4-6b38716062ef "Metropolitan Opera Orchestra, Erich Leinsdorf", "Die Walküre", "Die Walküre: Act III, Scene I. Vorspiel "Walkürenritt: Hojotoho! Heiaha!" (Gerhilde, Helmwige, Waltraute, Schwertleite, Ortlinde, Siegrune, Grimgerge, Roßweiße)"
      • e4c8c9b3-38f2-41be-a2d8-1ad23d8b7d48 "peedranch ^ Jansky Noise", "Mi^grate", "Love, Exciting and New Come Aboard, We're Expecting You. Love, Life's Sweetest Reward. Let It Flow, It Floats Back to You. The Love Boat Soon Will Be Making Another Run, the Love Boat Promises Something for Everyone, Set a Course for Adventure, Your Mind on a New Romance. Love Won't Hurt Anymore, It's an Open Smile on a Friendly Shore. Yes Love! It's Love!
      • The Love Boat Soon Will Be Making Another Run. The Love Boat Promises Something for Everyone, Set a Course for Adventure, Your Mind on a New Romance. Love Won't Hurt Anymore, It's an Open Smile on a Friendly Shore. It's Love! It's Love! It's Love! It's the Love Boat-Ah! It's the Love Boat-Ah! (Recorded Onboard the Love Boat With the Kitchen Staff)"
      • seriously, hard to tell if spam or not. lol
      • do you think we need more lucifer ?
      • lucifer[m]
        mayhem: not sure, we might need some of different kinds, but the mapping ticket should have them
      • mayhem[m] goes to look
      • reosarevok[m] uploaded an image: (184KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/DmDOeExuBqqpTcaCNarXLuZp/Screenshot%20from%202024-07-23%2013-08-34.png >
      • reosarevok[m]
        yvanzo: ^
      • If that's what you had in mind, seems to work for me (see Addddddd or whatnot on the setlist info that I added to be 100% sure this was the JS version)
      • yvanzo[m]
        reosarevok: Yes, thank you.
      • mayhem[m]
        d0b09116-8cf1-4b5a-baf8-3db8d9fc5116 , "tripleS", "LOVElution <ↀ>", "Speed Love" (open the page to get the correct release name)
      • lucifer: mapping explain is broken: https://labs.api.listenbrainz.org/explain-mbid-... let me make a PR for the fix.
      • lucifer[m]
        mayhem: you can pass a space to release name and it should work for now
      • BrainzGit
        [listenbrainz-server] 14mayhem opened pull request #2945 (03master…fix-mbid-mapping-explain): Make release_name optional https://github.com/metabrainz/listenbrainz-serv...
      • yvanzo[m]
        reosarevok: Found why some strings are not translated anymore: rebases on master missing intermediate changes.
      • There might be more than just string changes.
      • lusciouslover has quit
      • lusciouslover joined the channel
      • lucifer[m]
        mayhem: recording:`rising from the ashes` artist:`erosion 89` can't get solr to match this so far.
      • mayhem[m]
        odd. that seems pretty simple.
      • lucifer[m]
        mayhem: okay got it to match using a fuzzy search. the right artist name is erosion89 (no space) but another observation.
      • recording name - exact search and artist name - fuzzy search. matches.
      • recording name - fuzzy search and artist name - fuzzy search. doesn't match.
      • so we need to test all combinations separately in worst cases.
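One way to enumerate "all combinations" for a single test case is sketched below. It only builds query strings; it does not talk to Solr. The field names `recording` and `artist` are hypothetical, no escaping of special characters is done, and the fuzzy form simply appends the Lucene `~` operator to each term, which may differ from the queries that were actually run.

```python
from itertools import product

MODES = ("exact", "fuzzy")


def clause(field: str, text: str, mode: str) -> str:
    """Build one field clause: a quoted phrase, or per-term fuzzy matching."""
    if mode == "exact":
        return f'{field}:"{text}"'
    fuzzy_terms = " ".join(f"{term}~" for term in text.split())
    return f"{field}:({fuzzy_terms})"


def query_matrix(recording: str, artist: str) -> dict:
    """Return one combined query per (recording mode, artist mode) pair."""
    queries = {}
    for rec_mode, art_mode in product(MODES, repeat=2):
        queries[(rec_mode, art_mode)] = (
            f"{clause('recording', recording, rec_mode)} AND "
            f"{clause('artist', artist, art_mode)}"
        )
    return queries


for modes, query in query_matrix("rising from the ashes", "erosion 89").items():
    print(modes, "->", query)
```

Running every test case through all four queries makes it visible when a pair such as fuzzy/fuzzy behaves differently from exact/fuzzy, as observed above.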
      • mayhem[m]
        huh???
      • aren't the two searches for artist name independent?
      • lucifer[m]
        sorry not sure what you mean
      • mayhem[m]
        "artist name - fuzzy search. matches." and "artist name - fuzzy search. doesn't match." I would expect that these give the same result.
      • because we're looking up artist names separately from recording names, right?
      • lucifer[m]
        ah no
      • recording name AND artist name at the same time.
      • mayhem[m]
        ah, but shouldn't we be testing the separated lookups?
      • lucifer[m]
        it's one index like we have for typesense.
      • mayhem[m]
        because we agreed that each should be looked up separately, right?
      • lucifer[m]
        right but i don't think we can implement that easily with solr.
      • mayhem[m]
        ugh.
      • not sure I like this.
      • lucifer[m]
        it's 50ms for one field fuzzy search. less than 10ms for exact search.
      • we can shard the index in solr based on artist names - but it wouldn't improve the performance.
      • mayhem[m]
        that seems like a deal breaker for solr, no?
      • lucifer[m]
        the overall performance would be better.
      • than what we have with typesense.
      • mayhem[m]
        problem is that 10% better is not solving our problem.
      • 2-3 times faster starts getting there.
      • lucifer[m]
        makes sense. but i don't think there is an equally performant alternative
      • mayhem[m]
        the testing I did with nmslib was clocking in around 5ms per lookup.
      • lucifer[m]
        the testing we did with individual indexes also came about around the same
      • i see
      • can you remind me if it was exact match only or supports fuzzy too?
      • mayhem[m]
        and clearly we'd need to do more than 1 lookup, but I'd expect 2-4 lookups per track. so 20ms or so.
      • fuzzy and also more than just 2 chars.
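The per-track estimate quoted above can be written out as simple arithmetic. The numbers are the ones from the conversation (about 5 ms per lookup, 2 to 4 lookups per track); the function name is made up for this example and nothing here is a measurement.

```python
def per_track_ms(lookups: int, ms_per_lookup: float) -> float:
    """Estimated latency for one track if lookups run one after another."""
    return lookups * ms_per_lookup


MS_PER_LOOKUP = 5  # figure quoted in the chat, not measured here
budget = (per_track_ms(2, MS_PER_LOOKUP), per_track_ms(4, MS_PER_LOOKUP))
print(f"estimated per-track latency: {budget[0]} to {budget[1]} ms")
```

The upper end of that range is the "20ms or so" mentioned in the chat, to be compared against the 50 ms quoted for a single-field fuzzy search in Solr.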
      • lucifer[m]
        cool, let's load test on the vm and if it's the same performance, we can go ahead with that
      • mayhem[m]
        so, finish load testing your solution, then load test mine?
      • lucifer[m]
        yes solr is done for now.
      • mayhem[m]
        ok, cool.
      • the biggest problem that I am still facing with nmslib is the building of indexes.
      • but I can jump back into that with a fresh perspective if you want.
      • lucifer[m]
        i think we can fix that building part later.
      • once we are satisfied with the querying part.
      • mayhem[m]
        let me look at the code again.
      • ahhh, I think I see what is going on now. I am seeing a lot of disk I/O that I hadn't looked at before.
      • IO contention is limiting CPU time.
      • constrained on write. eh???
      • oh, I wonder if scikit learn is being too smart for us. might have a memory use limit and this writes to disk.
      • SKLEARN_WORKING_MEMORY
      • working_memory : int, default=None
      • If set, scikit-learn will attempt to limit the size of temporary arrays to this number of MiB (per job when parallelised), often saving both computation time and memory on expensive operations that can be performed in chunks. Global default: 1024.
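For reference, a small sketch of how the setting quoted above can be inspected or changed. Whether it has anything to do with the disk I/O is only the guess voiced in the chat; this snippet does not confirm it. The values 4096 and 8192 are arbitrary examples, and the `try`/`except` is there so the snippet also runs where scikit-learn is not installed.

```python
import os

# The environment variable is read when scikit-learn is first imported,
# so it has to be set before the import. Value is in MiB (example only).
os.environ["SKLEARN_WORKING_MEMORY"] = "4096"

try:
    import sklearn

    # Global value currently in effect.
    print("working_memory:", sklearn.get_config()["working_memory"])

    # Temporary override for a block of code.
    with sklearn.config_context(working_memory=8192):
        print("inside context:", sklearn.get_config()["working_memory"])
except ImportError:
    print("scikit-learn is not installed; only the environment variable was set")
```

`sklearn.set_config(working_memory=...)` changes the same setting for the rest of the process.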
      • mayhem
        tykling: ping
      • bttf joined the channel
      • ursa-major has quit
      • ajhalili2006 has quit
      • outsidecontext has quit
      • RetroPunk has quit
      • djl has quit
      • serra has quit
      • irimi1 has quit
      • outsidecontext joined the channel
      • djl joined the channel
      • irimi1 joined the channel
      • serra joined the channel
      • RetroPunk joined the channel
      • ursa-major joined the channel
      • ajhalili2006 joined the channel
      • minimal joined the channel
      • Jade[m]
        <yvanzo[m]> "bitmap, Jade, reosarevok: I just..." <- Aah only just saw this, oops!
      • On it now
      • Thank you :)
      • bitmap[m]
        oops x2, I forgot to accept the invitation on Friday and it expired over the weekend. could you please send me one again yvanzo?
      • Jade[m] uploaded an image: (36KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/lXholpJauvAeuipYZGiKxHYd/image.png >
      • Jade[m]
        Yeah I might have done the same if I didn't see it
      • Yep, sorry
      • yvanzo[m]
        Resent it.