I can increase the limit again from 4096, however I'm not sure how to determine the number of clauses being generated by the queries that were failing earlier, in order to get an idea of what a suitable value might be.
d4rk has quit
d4rk joined the channel
zas[m]
atj: moooin, it seems those limits can be set globally or per collection in solr9, but the documentation doesn't say anything about "correct" values. What I know is we get more 500s than before. But perhaps those 500s are expected, since they are triggered by certain queries.
atj[m]
I don't think there is a "correct" value because it depends on your schema and various query parameters
zas[m]
I don't think those queries come from Picard (to know for sure we'd need to find the matching request's User-Agent)
atj[m]
the problem we have is that each word in the query is expanded to query every (?) field in the entity
zas[m]
yes, I mean correct values for a certain domain, but they don't provide many guidelines
atj[m]
so "the~" is expanded to ((artistname:the~2)^1.2 | comment:the~2 | barcode:the~2 | (releaseaccent:the~2)^2.0 | (release:the~2)^1.5 | label:the~2 | ngram:the~2 | (alias:the~2)^1.2 | (creditname:the~2)^1.2 | tag:the~2)
(for release)
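A rough way to see why the clause limit gets hit, as a minimal Python sketch: it assumes every word in the query is expanded against each of the ten release fields listed above, and that each fuzzy term may itself match several indexed terms. The function name and the `expansions_per_term` figure are hypothetical, and Solr's real clause accounting may differ.

```python
# Fields taken from the expansion of "the~" shown above (release entity).
RELEASE_FIELDS = [
    "artistname", "comment", "barcode", "releaseaccent", "release",
    "label", "ngram", "alias", "creditname", "tag",
]

def estimate_clauses(query, fields=RELEASE_FIELDS, expansions_per_term=1):
    """Rough estimate: words x fields x fuzzy expansions (an assumption)."""
    words = query.split()
    return len(words) * len(fields) * expansions_per_term

# A 12-word query where each fuzzy term is assumed to match 50 indexed terms:
# 12 * 10 * 50 = 6000, which is above a limit of 4096.
print(estimate_clauses("a b c d e f g h i j k l", expansions_per_term=50))
```

Under these assumptions the estimate grows linearly with query length, so long free-text queries are the likely outliers.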
i don't really understand how all this works though to be honest
feels like you need a PhD in search
atj[m] uploaded an image: (69KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/lGWgyHRUcwvehDJJkNsmdlct/image.png >
this is quite nice to see, huge improvement in latency, both faster and more consistent
<zas[m]> "yes, I mean correct values for a..." <- i think i will try doubling it again to 8192 and see if that is enough for the remaining outliers
we haven't seen any performance degradation from increasing it to 4096
outsidecontext[m joined the channel
outsidecontext[m
lucifer, monkey: rdswift noted that we probably should adapt the IRC link in the Picard website footer now that we have moved to matrix. The same would also apply to LB and CB. Had there already been some talk about this? We could replace the IRC link with a link to the matrix room. Or we keep both.
zas[m]
atj: yes; performance is much better (more nodes, more powerful), but what I see is that scaling is easier now (thanks to the new structure and Ansible).
old cluster nodes will need to be removed ASAP, likely next week (if everything is ok with new cluster).
zerodogg
Hi, my weekly playlists from listenbrainz have been stuck since March. I've tried reconnecting spotify several times, without any luck. Anyone I can contact to have a look at it?
mayhem[m]
hi zerodogg (IRC) we're working on the computing cluster that computes all of this and should have this resolved soon, sorry.
lucifer
uhh weird, should have been processed on tuesday itself. checking.
zerodogg
mayhem[m]: aha, thanks! :)
lucifer
outsidecontext[m: no discussions afaik but i guess we could point to the communications wiki page which lists both matrix and irc
Rotab has quit
Rotab joined the channel
Jade[m] has quit
atj[m]
<zas[m]> "atj: yes; performance is much..." <- I'm a bit disappointed that the nodes are reading from the disk so much. Wish the VMs were available with 48GB of RAM
I may look at reducing the replication factor to try and reduce the size of each node's indexes.
yvanzo[m]
The fact that the indexer can be prioritized by HAProxy is a big improvement given last weekend’s incident.
atj[m]
yvanzo[m]: We'll have to wait and see how much difference that makes to be honest!
But it's something
yvanzo[m]
<zas[m]> "old cluster nodes will need to..." <- We still need it for dumps at the moment.
<outsidecontext[m> "lucifer, monkey: rdswift noted..." <- MB website footer has been updated in beta already.
rimskii[m]
<lucifer> "rimskii: hmm i see, okay try..." <- once I run this command I should not kill the terminal right?
I ran this command yesterday (took ~4 hours), and I wanted to test it today
but seems like I have to create the tables again?
mayhem[m]
I see data, rimskii . are you getting an error? if so, what is it?
lucifer
rimskii[m]: i see the tables exist on wolf, so no need to create tables again. are you getting any error?
mayhem[m]
lololol
lucifer
XD
rimskii[m]
lol
i mean it's not seeing the table: relation "mapping.canonical_recording_redirect" does not exist
but it's happening now
when I ran it yesterday, it didn't have that issue, but it was running indefinitely and not outputting anything
mayhem[m]
good, that table has 6677454 rows in it. :)
rimskii[m]
I am just afraid if I run "docker run --rm -it --network musicbrainz-docker_default metabrainz/mbid-mapping python3 manage.py canonical-data --use-mb-conn" it will start rebuilding the whole db again
lucifer
yes, do not run it again.
just run the ssh command and start up labs_api container and it should all work.
outsidecontext[m
ah, ok, on MB. yes, but the footer is different. Picard, LB and CB all share the same footer layout, hence I asked. When we change that it should be consistent
Maxr1998 joined the channel
Maxr1998_ has quit
monkey[m]
<yvanzo[m]> "just misread L instead of M" <- MistenBrainz
zas[m]
atj: what's unclear to me is why some nodes are reading from disk much more than others. I guess it relates to cores/shards. That said, it reads from fast SSDs, so that's not much of an issue. The biggest core is recording with ~75GB; all others are much smaller, the next largest being release-group with 7.6GB, which should fit in RAM... does Solr use all available RAM?
yvanzo[m]
outsidecontext: Got it but it still is the same link :)
outsidecontext[m
yvanzo: Nearly, or not yet. The footers of the other projects currently link directly to the metabrainz room via kiwiirc.com. But as lucifer suggested we probably should change this to go to the doc page instead.
atj: it would be interesting to increase the memory for one solr instance and see if it makes any difference regarding disk I/O. Current value is 8g, I would suggest to set it to 12g on one node, and observe changes.
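A back-of-the-envelope sketch of the trade-off behind this experiment, in Python. It assumes the index is read through the OS page cache, so whatever RAM the JVM heap takes is no longer available for caching index files. The 32 GB node size and the 2 GB allowance for other processes are hypothetical; only the 8g/12g heap values and the ~75GB recording core size come from the discussion.

```python
def page_cache_budget_gb(total_ram_gb, heap_gb, other_gb=2.0):
    """RAM left for the OS page cache after the JVM heap and other processes."""
    return max(total_ram_gb - heap_gb - other_gb, 0.0)

def cached_fraction(index_gb, cache_gb):
    """Share of an index that could sit in the page cache at once."""
    return min(cache_gb / index_gb, 1.0)

TOTAL_RAM_GB = 32.0  # hypothetical node size, not stated in the discussion

for heap_gb in (8.0, 12.0):
    cache = page_cache_budget_gb(TOTAL_RAM_GB, heap_gb)
    fraction = cached_fraction(75.0, cache)  # ~75GB recording core
    print(f"heap={heap_gb}g page_cache={cache}GB recording_cached={fraction:.0%}")
```

If disk reads go down with the larger heap, the heap was the constraint; if they go up, the page cache probably was.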
lucifer
rimskii[m]: i will need to create some more tables manually on wolf by dumping data from production. i will let you know in a few hours.
rimskii[m]
ahhh i see
okay thank you !
i will work on other things then
<lucifer> "rimskii: i will need to create..." <- wait can’t I just import the sql data dump you sent me?
lucifer
rimskii[m]: that is for apple music, not spotify.
d4rk-ph0enix joined the channel
d4rk has quit
zas[m]
yvanzo: what needs to be done to move dumps to new cluster?
d4rk-ph0enix has quit
d4rk joined the channel
Jade[m] joined the channel
Jade[m]
bitmap: I just realised I misread the timings and this *GSoC Session: ‘Contributor Evaluations with GSoC Admins’* thing is at the same time as our weekly catch up
Are you OK with having our meeting an hour later, or should I just record the event to watch later?
bitmap[m]
Jade: no problem, I'm fine with meeting later :) just ping me when you're around
BrainzGit
[musicbrainz-server] reosarevok opened pull request #3293 (master…MBS-13630): MBS-13630: Prioritize "Voting is closed" as no vote rights reason in AE elections https://github.com/metabrainz/musicbrainz-serve...
minimal joined the channel
d4rk has quit
d4rk joined the channel
reosarevok[m]
yvanzo: https://github.com/metabrainz/musicbrainz-serve... is supposed to be ready now after bitmap made some improvements. Do you have any time to review / test? :) I'll test it further too but I expect we should all review this before we consider merging it
(probably should release the EAA beta first and merge this to the next beta anyway, but that's a second matter)
yvanzo[m]
Hi reosarevok: Nice to see it updated, I will review it in priority among MBS PRs, hopefully this week.
zas: (1) Adapt the MBS container to pack the new Solr backup format. (2) Move mirrors from Solr standalone to SolrCloud (and still allow them to either build collections or load backups).
Cluster-side, atj already deployed NFS and I tested that.
We might need to allow some connection from the MBS container to the SolrCloud cluster.
That should be all for our infrastructure. (2) is mirrors only.
bitmap[m]
I copied the EAA types to https://wiki.musicbrainz.org/Event_Art/Types yesterday BTW. But I couldn't come up with an introductory description (besides "the event art type describes the type of event art" 😅)
yvanzo[m]
bitmap: I cannot find the EAA types in the translation source strings.
bitmap[m]
You're right, it's missing from extract_pot_db. I'll add it
yvanzo[m]
reosarevok: I have suggestions for the EAA type descriptions, how should I proceed, making separate revisions of the wiki page above?
reosarevok[m]
The descriptions are stored in the DB
So it seems easiest to just discuss them here and then make the change on both places as needed
I hope they don't make them longer because IIRC aerozol was already a bit annoyed with the length of some of them :D
yvanzo[m]
reosarevok: OK, for _Poster_: I suggest replacing “A poster” with “Usually vertical image” to go along with the description of _Banner_.
reosarevok[m]
aerozol: ^ opinion? :) I'm okay with it as long as you are
yvanzo[m]
Should we rather discuss these in the MB channel?
reosarevok[m]
Are there a lot of suggestions? If not, we already started here :) But if there are we can move
yvanzo[m]
reosarevok: It is 3:30 AM for our favorite kiwi.
reosarevok[m]
Yeah, I know, I was expecting to wait for feedback
"A poster" does seem a bit silly, just dunno if "Usually vertical image" helps with poster as much as horizontal helps with banner, since poster is probably more obvious than banner? But we can have it, unless there's a lot of non-vertical posters :D
d4rk has quit
d4rk joined the channel
BrainzGit
[musicbrainz-server] reosarevok merged pull request #3293 (master…MBS-13630): MBS-13630: Prioritize "Voting is closed" as no vote rights reason in AE elections https://github.com/metabrainz/musicbrainz-serve...
yvanzo[m]
Describing foo with A foo just doesn’t help.
atj[m]
<yvanzo[m]> "That should be all for our..." <- I didn't set up `rrsync` yet, is that a requirement?
d4rk has quit
d4rk joined the channel
yvanzo[m]
atj: Good catch! Only SSH access will be required.
atj[m]
<zas[m]> "atj: what's unclear to me is why..." <- The documentation indicates that Solr uses MMAP to read the Lucene indexes so that they can be stored in the Linux page cache.
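What that means in practice, as a minimal Python sketch (not Solr code): a memory-mapped file is read through the kernel's page cache rather than being copied into the process heap, so repeated reads of the same pages can be served from RAM when enough memory is free outside the JVM heap. The temp file and the helper name are illustrative only.

```python
import mmap
import os
import tempfile

def read_via_mmap(path, offset, length):
    """Read a slice of a file through a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]

# Stand-in for an index segment file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"lucene-segment-data" * 1000)
    path = tmp.name

chunk = read_via_mmap(path, 7, 7)
os.remove(path)
print(chunk)  # b'segment'
```

Only the pages actually touched need to be resident, which is why a node can serve an index larger than its RAM at the cost of extra disk reads.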
reosarevok[m]
yvanzo: I agree with that, it's just there to make it into a full sentence 😅 But your wording might be better here