<mayhem[m]> "i think the 2 character limit is..." <- i had run typesense with 2 character limit as well for this comparison.
2024-07-23 20558, 2024
BrainzGit
[musicbrainz-server] 14reosarevok merged pull request #3315 (03master…useless-hangul-filler): MBS-13528 / MBS-13696: Calculate invalid edit notes in more places and with more invisible characters https://github.com/metabrainz/musicbrainz-server/…
2024-07-23 20521, 2024
Kladky joined the channel
2024-07-23 20506, 2024
relaxo[m] joined the channel
2024-07-23 20506, 2024
relaxo[m]
reosarevok How does this edit note thing apply to seeds? What happens when there are forbidden chars in the seeded edit note?
2024-07-23 20522, 2024
reosarevok[m]
Nothing should happen as long as they're not the only characters :)
2024-07-23 20536, 2024
reosarevok[m]
This is meant to stop notes which are only spaces, invisible chars and so on
2024-07-23 20555, 2024
reosarevok[m]
If you have a normal note with some of them they will just work, in theory
2024-07-23 20503, 2024
reosarevok[m]
(if you find a bug, let us know!)
2024-07-23 20508, 2024
Kladky has quit
2024-07-23 20541, 2024
Kladky joined the channel
2024-07-23 20504, 2024
relaxo[m]
Okay. No bug found yet, but I want to get to another point. Sometimes my seeder does not clearly identify an entity, so it should be looked up manually. I want to make sure that the editor will notice it. My idea is to add a forbidden char to the edit note with a hint to point the editor to it.
2024-07-23 20520, 2024
reosarevok[m]
Hmm
2024-07-23 20553, 2024
reosarevok[m]
If you seed a relationship with name only, it shouldn't let the user submit until they either select the entity or remove the relationship, I think?
2024-07-23 20511, 2024
reosarevok[m]
yvanzo: minor, but in https://github.com/metabrainz/musicbrainz-server/… - we have the prepare jira step first as 1, talking about updating tickets, but the link to the tickets and the transitioning etc is only in the update jira step (7)
2024-07-23 20541, 2024
reosarevok[m]
If we want to update the descriptions and whatnot in step 1 we should move that stuff to step 1, and if not we only need step 7 and step 1 should be empty, no? :)
2024-07-23 20548, 2024
Vile_Vulture joined the channel
2024-07-23 20504, 2024
Vile_Vulture has left the channel
2024-07-23 20535, 2024
relaxo[m]
reosarevok Thanks, will try. I am using derat/yambs command line. Maybe he is around and can answer if this is possible. derat
2024-07-23 20510, 2024
kellnerd[m]
Why don't you simply try it yourself? I am pretty sure the form can't be submitted if you only seed names instead of MBIDs.
2024-07-23 20522, 2024
relaxo[m]
Not at home rn. Will try it later for sure.
2024-07-23 20526, 2024
reosarevok[m]
yvanzo, bitmap: did some Spanish translating, will release beta in the afternoon / evening - feel free to review or ask for review on stuff if you want me to put them out today
2024-07-23 20548, 2024
reosarevok[m]
yvanzo: what was the requisite to put the Spanish translation out in prod? :)
2024-07-23 20501, 2024
reosarevok[m]
(it's probably close to that, so I'd like to know what to prioritize)
2024-07-23 20515, 2024
reosarevok[m] goes for a dog walk and some food after
reosarevok: Yes, it is redundant because it has been too often overlooked. Tickets should actually be updated when transitioning to the development branch.
2024-07-23 20532, 2024
yvanzo[m]
reosarevok: For prod languages, it should be nearly complete. For Spanish, relationship types seem to be the last big chunk.
relaxo yeah, my experience matches others'. i think that edit forms generally won't let you submit if you've only set a name in a field that requires an MBID. pretty much every seeder relies on this functionality if it e.g. can't find an MBID for an artist.
1. beyoncé - dreaming: doing a simple artist and recording search, it should be an exact match, which works fine. but note the diacritic e should be present.
2024-07-23 20549, 2024
lucifer[m]
solr by default doesn't remove diacritics from the text field. but we can create an extra field to store the unidecoded output
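To illustrate the "extra field" idea, here is a minimal sketch of folding diacritics before indexing. It uses the standard library's `unicodedata` as a stand-in for the unidecode step mentioned above, which is an assumption, and the field name `artist_folded` is hypothetical rather than taken from any actual schema.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip combining marks after NFKD decomposition (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def with_folded_field(artist: str) -> dict:
    """Build a document with the original name plus a folded copy.

    'artist_folded' is a hypothetical field name used only for illustration.
    """
    return {"artist": artist, "artist_folded": fold_diacritics(artist)}


doc = with_folded_field("beyoncé")
print(doc)
```

The original field keeps the diacritic for exact matches, while the folded copy lets a query typed without it still find the same document. Note that this approach only removes combining marks; it does not transliterate characters that have no decomposition.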
which is worth listening to, if you don't know it.
2024-07-23 20521, 2024
mayhem[m]
right zas ?
2024-07-23 20519, 2024
zas[m]
Definitely
2024-07-23 20555, 2024
mayhem[m]
1daf6f11-cbec-4503-b0b4-6b38716062ef "Metropolitan Opera Orchestra, Erich Leinsdorf", "Die Walküre", "Die Walküre: Act III, Scene I. Vorspiel "Walkürenritt: Hojotoho! Heiaha!" (Gerhilde, Helmwige, Waltraute, Schwertleite, Ortlinde, Siegrune, Grimgerge, Roßweiße)"
2024-07-23 20540, 2024
mayhem[m]
e4c8c9b3-38f2-41be-a2d8-1ad23d8b7d48 "peedranch ^ Jansky Noise", "Mi^grate", "Love, Exciting and New Come Aboard, We're Expecting You. Love, Life's Sweetest Reward. Let It Flow, It Floats Back to You. The Love Boat Soon Will Be Making Another Run, the Love Boat Promises Something for Everyone, Set a Course for Adventure, Your Mind on a New Romance. Love Won't Hurt Anymore, It's an Open Smile on a Friendly Shore. Yes Love! It's Love!
2024-07-23 20540, 2024
mayhem[m]
The Love Boat Soon Will Be Making Another Run. The Love Boat Promises Something for Everyone, Set a Course for Adventure, Your Mind on a New Romance. Love Won't Hurt Anymore, It's an Open Smile on a Friendly Shore. It's Love! It's Love! It's Love! It's the Love Boat-Ah! It's the Love Boat-Ah! (Recorded Onboard the Love Boat With the Kitchen Staff)"
2024-07-23 20552, 2024
mayhem[m]
seriously, hard to tell if spam or not. lol
2024-07-23 20506, 2024
mayhem[m]
do you think we need more lucifer ?
2024-07-23 20520, 2024
lucifer[m]
mayhem: not sure we might need some of different kinds but the mapping ticket should have them
2024-07-23 20538, 2024
mayhem[m] goes to look
2024-07-23 20538, 2024
reosarevok[m] uploaded an image: (184KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/DmDOeExuBqqpTcaCNarXLuZp/Screenshot%20from%202024-07-23%2013-08-34.png >
2024-07-23 20541, 2024
reosarevok[m]
yvanzo: ^
2024-07-23 20508, 2024
reosarevok[m]
If that's what you had in mind, seems to work for me (see Addddddd or whatnot on the setlist info that I added to be 100% sure this was the JS version)
2024-07-23 20522, 2024
yvanzo[m]
reosarevok: Yes, thank you.
2024-07-23 20503, 2024
mayhem[m]
d0b09116-8cf1-4b5a-baf8-3db8d9fc5116 , "tripleS", "LOVElution <ↀ>", "Speed Love" (open the page to get the correct release name)
reosarevok: Found why some strings are not translated anymore: rebases on master missing intermediate changes.
2024-07-23 20528, 2024
yvanzo[m]
There might be more than just string changes.
2024-07-23 20534, 2024
lusciouslover has quit
2024-07-23 20520, 2024
lusciouslover joined the channel
2024-07-23 20539, 2024
lucifer[m]
mayhem: `rising from the ashes erosion 89 6979fc1a-f6bc-45a6-9240-a0ca06d213b3` can't get solr to match this so far.
2024-07-23 20556, 2024
lucifer[m]
s/6979fc1a-f6bc-45a6-9240-a0ca06d213b3//
2024-07-23 20518, 2024
lucifer[m]
* mayhem: recording:`rising from the ashes` artist:`erosion 89` can't get solr to match this so far.
2024-07-23 20529, 2024
mayhem[m]
odd. that seems pretty simple.
2024-07-23 20507, 2024
lucifer[m]
mayhem: okay got it to match using a fuzzy search. the right artist name is erosion89 (no space) but another observation.
2024-07-23 20528, 2024
lucifer[m]
recording name - exact search and artist name - fuzzy search. matches.
2024-07-23 20542, 2024
lucifer[m]
recording name - fuzzy search and artist name - fuzzy search. doesn't match.
2024-07-23 20525, 2024
lucifer[m]
so we need to test all combinations separately in worst cases.
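A rough sketch of what "testing all combinations" could look like: it enumerates the four exact/fuzzy pairings for the two fields and builds one query string per pairing. The `field:"phrase"` and `term~` forms are standard Lucene/Solr query syntax; the field names `recording` and `artist` follow the query quoted above, while the helper names are hypothetical. It does not escape special characters and makes no claim about which combinations actually match.

```python
from itertools import product


def clause(field: str, text: str, fuzzy: bool) -> str:
    """Return an exact phrase clause, or a per-term fuzzy clause."""
    if fuzzy:
        # The fuzzy operator applies to single terms, so tag each word.
        terms = " ".join(f"{term}~" for term in text.split())
        return f"{field}:({terms})"
    return f'{field}:"{text}"'


def query_combinations(recording: str, artist: str) -> dict:
    """Map (recording_mode, artist_mode) to a combined query string."""
    queries = {}
    for rec_fuzzy, art_fuzzy in product((False, True), repeat=2):
        key = ("fuzzy" if rec_fuzzy else "exact",
               "fuzzy" if art_fuzzy else "exact")
        queries[key] = (clause("recording", recording, rec_fuzzy)
                        + " AND "
                        + clause("artist", artist, art_fuzzy))
    return queries


queries = query_combinations("rising from the ashes", "erosion 89")
for modes, query in queries.items():
    print(modes, query)
```

Because both clauses hit the same index in a single request, each pairing has to be issued as its own query, which is where the worst-case cost of four requests per lookup comes from.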
2024-07-23 20526, 2024
mayhem[m]
huh???
2024-07-23 20555, 2024
mayhem[m]
aren't the two searches for artist name independent?
2024-07-23 20515, 2024
lucifer[m]
sorry not sure what you mean
2024-07-23 20550, 2024
mayhem[m]
"artist name - fuzzy search. matches." and "artist name - fuzzy search. doesn't match." I would expect that these give the same result.
2024-07-23 20519, 2024
mayhem[m]
because we're looking up artist names separately from recording names, right?
2024-07-23 20527, 2024
lucifer[m]
ah no
2024-07-23 20542, 2024
lucifer[m]
recording name AND artist name at the same time.
2024-07-23 20500, 2024
mayhem[m]
ah, but shouldn't we be testing the separated lookups?
2024-07-23 20509, 2024
lucifer[m]
it's one index like we have for typesense.
2024-07-23 20516, 2024
mayhem[m]
because we agreed that each should be looked up separately, right?
2024-07-23 20503, 2024
lucifer[m]
right but i don't think we can implement that easily with solr.
2024-07-23 20530, 2024
mayhem[m]
ugh.
2024-07-23 20536, 2024
mayhem[m]
not sure I like this.
2024-07-23 20503, 2024
lucifer[m]
it's 50ms for one field fuzzy search. less than 10ms for exact search.
2024-07-23 20549, 2024
lucifer[m]
we can shard the index in solr based on artist names - but it wouldn't improve the performance.
2024-07-23 20553, 2024
mayhem[m]
that seems like a deal breaker for solr, no?
2024-07-23 20511, 2024
lucifer[m]
the overall performance would be better.
2024-07-23 20520, 2024
lucifer[m]
than what we have with typesense.
2024-07-23 20534, 2024
mayhem[m]
problem is that 10% better is not solving our problem.
2024-07-23 20540, 2024
mayhem[m]
2-3 times faster starts getting there.
2024-07-23 20557, 2024
lucifer[m]
makes sense. but i don't think there is an equally performant alternative
2024-07-23 20517, 2024
mayhem[m]
the testing I did with nmslib was clocking in around 5ms per lookup.
2024-07-23 20519, 2024
lucifer[m]
the testing we did with individual indexes also came out around the same
2024-07-23 20527, 2024
lucifer[m]
i see
2024-07-23 20545, 2024
lucifer[m]
can you remind me if it was exact match only or supports fuzzy too?
2024-07-23 20553, 2024
mayhem[m]
and clearly we'd need to do more than 1 lookup, but I'd expect 2-4 lookups per track. so 20ms or so.
2024-07-23 20504, 2024
mayhem[m]
fuzzy and also more than just 2 chars.
2024-07-23 20550, 2024
lucifer[m]
cool, let's load test on the vm and if it's the same performance, we can go ahead with that
2024-07-23 20536, 2024
mayhem[m]
so, finish load testing your solution, then load test mine?
2024-07-23 20551, 2024
lucifer[m]
yes solr is done for now.
2024-07-23 20514, 2024
mayhem[m]
ok, cool.
2024-07-23 20535, 2024
mayhem[m]
the biggest problem that I am still facing with nmslib is the building of indexes.
2024-07-23 20556, 2024
mayhem[m]
but I can jump back into that with a fresh perspective if you want.
2024-07-23 20526, 2024
lucifer[m]
i think we can fix that building part later.
2024-07-23 20548, 2024
lucifer[m]
once we are satisfied with the querying part.
2024-07-23 20505, 2024
mayhem[m]
let me look at the code again.
2024-07-23 20552, 2024
mayhem[m]
ahhh, I think I see what is going on now. I am seeing a lot of disk I/O that I hadn't looked at before.
2024-07-23 20503, 2024
mayhem[m]
IO contention is limiting CPU time.
2024-07-23 20532, 2024
mayhem[m]
constrained on Write. eh???
2024-07-23 20550, 2024
mayhem[m]
oh, I wonder if scikit learn is being too smart for us. might have a memory use limit and this writes to disk.
2024-07-23 20516, 2024
mayhem[m]
SKLEARN_WORKING_MEMORY
2024-07-23 20534, 2024
mayhem[m]
working_memory: int, default=None
2024-07-23 20534, 2024
mayhem[m]
If set, scikit-learn will attempt to limit the size of temporary arrays to this number of MiB (per job when parallelised), often saving both computation time and memory on expensive operations that can be performed in chunks. Global default: 1024.
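For reference, a minimal sketch of how this setting can be changed, assuming scikit-learn is installed. The values 4096 and 8192 are arbitrary illustrations, not recommendations, and whether this limit is actually behind the disk writes seen above is unconfirmed.

```python
import os

# The environment variable is read when scikit-learn is first imported,
# so it has to be set beforehand. Value is in MiB.
# Assumption: 4096 is an arbitrary illustrative value.
os.environ.setdefault("SKLEARN_WORKING_MEMORY", "4096")

try:
    import sklearn
except ImportError:  # keep the sketch importable without scikit-learn
    sklearn = None


def working_memory_mib():
    """Return scikit-learn's current working_memory in MiB, or None."""
    if sklearn is None:
        return None
    return sklearn.get_config()["working_memory"]


if sklearn is not None:
    print("global:", working_memory_mib())
    # Temporarily override the limit for a block of code only.
    with sklearn.config_context(working_memory=8192):
        print("inside context:", working_memory_mib())
```

`sklearn.set_config(working_memory=...)` changes the same setting globally at runtime, if an environment variable is not convenient.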
2024-07-23 20558, 2024
mayhem
tykling: ping
2024-07-23 20533, 2024
bttf joined the channel
2024-07-23 20534, 2024
ursa-major has quit
2024-07-23 20535, 2024
ajhalili2006 has quit
2024-07-23 20535, 2024
outsidecontext has quit
2024-07-23 20535, 2024
RetroPunk has quit
2024-07-23 20535, 2024
djl has quit
2024-07-23 20536, 2024
serra has quit
2024-07-23 20536, 2024
irimi1 has quit
2024-07-23 20554, 2024
outsidecontext joined the channel
2024-07-23 20557, 2024
djl joined the channel
2024-07-23 20557, 2024
irimi1 joined the channel
2024-07-23 20502, 2024
serra joined the channel
2024-07-23 20504, 2024
RetroPunk joined the channel
2024-07-23 20505, 2024
ursa-major joined the channel
2024-07-23 20509, 2024
ajhalili2006 joined the channel
2024-07-23 20552, 2024
minimal joined the channel
2024-07-23 20546, 2024
Jade[m]
<yvanzo[m]> "bitmap, Jade, reosarevok: I just..." <- Aah only just saw this, oops!
2024-07-23 20500, 2024
Jade[m]
On it now
2024-07-23 20539, 2024
Jade[m]
Thank you :)
2024-07-23 20507, 2024
bitmap[m]
oops x2, I forgot to accept the invitation on Friday and it expired over the weekend. could you please send me one again yvanzo?
2024-07-23 20514, 2024
Jade[m] uploaded an image: (36KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/lXholpJauvAeuipYZGiKxHYd/image.png >
2024-07-23 20528, 2024
Jade[m]
Yeah I might have done the same if I didn't see it