so I'm curious about what was inserted, did the database size actually increase?
2023-06-02 15335, 2023
zas
there was a huge peak in deletes a bit after
2023-06-02 15340, 2023
bitmap
it doesn't seem like it
2023-06-02 15353, 2023
bitmap
I'm trying to figure out where those operations came from
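A minimal sketch of how one might trace those inserts/deletes back to specific tables using the cumulative counters in pg_stat_user_tables; the DSN below is a placeholder, not the actual floyd connection string:

    # Hedged sketch: list the tables that absorbed the most inserted/deleted
    # tuples since the last stats reset. The DSN is a placeholder.
    import psycopg2

    conn = psycopg2.connect("dbname=musicbrainz_db host=localhost")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute("""
            SELECT schemaname, relname, n_tup_ins, n_tup_del
            FROM pg_stat_user_tables
            ORDER BY n_tup_ins + n_tup_del DESC
            LIMIT 20
        """)
        for schema, table, ins, dels in cur.fetchall():
            print(f"{schema}.{table}: {ins} inserted, {dels} deleted")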
2023-06-02 15338, 2023
zas
well, I cannot think anymore, I'm off :) thanks for your help on this, we still have a lot of questions, but the fact that the WAL count stopped increasing is reassuring
2023-06-02 15344, 2023
zas
gn
2023-06-02 15341, 2023
bitmap
good night :)
2023-06-02 15346, 2023
bitmap
zas: I think it's traffic from the mbid_mapper (under the mapping schema in musicbrainz_db), judging by the logs, but not certain
I tried using parts of the universal language icon, but it's so complex that I couldn't get it to work, since we're mainly trying to change the favicon. But I used the same hangul letter so there's some connection
2023-06-02 15339, 2023
aerozol
p.s. if we’re keen on that one I’ll double check if it’s okay to turn the character like that
2023-06-02 15337, 2023
Zhele has quit
2023-06-02 15318, 2023
CatQuest
isn't that japanese hiragana?
2023-06-02 15329, 2023
CatQuest
well I liked it
2023-06-02 15344, 2023
CatQuest
it seemed right
2023-06-02 15354, 2023
CatQuest
idk about the colour though. is green for translationbrainz? :D
2023-06-02 15323, 2023
CatQuest
oh god can we name our weblate instance TransBrainz /jk
ok, in short, for us: a high number of ops on pg floyd caused the creation of 120GB of WAL files on pink (because it couldn't cope with the rate, apparently), then new WAL files kept accumulating
2023-06-02 15329, 2023
zas
we were far from disk size limits, and we're still unsure whether WALs got cleaned after bitmap's action or not
2023-06-02 15359, 2023
zas
also it went almost unnoticed because of a buggy alert (fixed now; the good status of floyd was hiding the bad status of pink)
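A minimal sketch of how the standby's WAL backlog could be watched to confirm whether segments are being cleaned up, using pg_ls_waldir() (PostgreSQL 10+, superuser or pg_monitor privileges); host and DSN are placeholders:

    # Hedged sketch: sample pg_wal usage twice and compare, to see whether
    # segments are being recycled or still accumulating.
    import time
    import psycopg2

    def wal_usage(cur):
        cur.execute("SELECT count(*), coalesce(sum(size), 0)::bigint FROM pg_ls_waldir()")
        return cur.fetchone()

    conn = psycopg2.connect("host=pink dbname=postgres")  # placeholder DSN
    with conn, conn.cursor() as cur:
        files, total = wal_usage(cur)
        print(f"{files} WAL segments, {total / 1024**3:.1f} GiB")
        time.sleep(60)
        files2, total2 = wal_usage(cur)
        print(f"after 60s: {files2 - files:+d} segments, {(total2 - total) / 2**20:+.0f} MiB")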
2023-06-02 15359, 2023
lucifer
i see, makes sense. i think we should be able to avoid the WAL creation at least with the UNLOGGED quickfix.
2023-06-02 15320, 2023
zas
lucifer: it's possible to run this task at any time, right? I mean, can we trigger it for real-life testing (to verify it has the expected results, but under control this time)?
2023-06-02 15343, 2023
lucifer
zas: yes, we can.
2023-06-02 15340, 2023
zas
we measured a lot of db inserts (over more than one hour), followed by a lot of delete operations within a short time. Can you link those to code?
CatQuest: looks Japanese for sure! But pretty sure it's Korean script.. they are all so similar!
2023-06-02 15310, 2023
aerozol
The green is just the MetaBrainz color, I've been using it for stuff that covers all projects (tickets, forums, wiki)
2023-06-02 15357, 2023
reosarevok
I like the icon :)
2023-06-02 15302, 2023
mayhem
moooin
2023-06-02 15322, 2023
mayhem
lucifer: zas: not logging the metadata cache makes sense to me. it's all derived data.
2023-06-02 15302, 2023
zas
mayhem: morning
2023-06-02 15302, 2023
zas
do you have any idea why the problem only appears now? I wonder if that's a change on lb side or due to the switch floyd<->pink we did earlier this week
2023-06-02 15316, 2023
zas
I mean, this cron job has been running for a long time, right?
2023-06-02 15349, 2023
mayhem
yes, no change on our side.
2023-06-02 15320, 2023
mayhem
awww man, 0 listens recorded for all day yesterday. I'm guessing the spotify app dropped off at some point and then panoscrobbler couldn't do its job.
2023-06-02 15334, 2023
mayhem
so panoscrobbler isn't up to the task either.
2023-06-02 15322, 2023
zas
hmmm, it happened in the past; last month we had a similar peak on pink, though it didn't have the same results
mayhem: to confirm, unlogging the table implies that it won't be available on replicas and will not survive a crash or unclean shutdown.
2023-06-02 15323, 2023
mayhem
it's 100% derived data, yes? then I see no problem with that.
2023-06-02 15342, 2023
lucifer
yup completely derived data. cool, sounds good.
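A minimal sketch of the UNLOGGED quickfix discussed above; the table name and DSN are placeholders (the real metadata cache table under the mapping schema may be named differently), and the caveats zas mentioned apply:

    # Hedged sketch: writes to an UNLOGGED table are not WAL-logged, so they
    # neither reach replicas nor survive a crash (the table is truncated on
    # recovery). Note that ALTER TABLE ... SET UNLOGGED rewrites the table.
    import psycopg2

    conn = psycopg2.connect("dbname=musicbrainz_db host=localhost")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute("ALTER TABLE mapping.mb_metadata_cache SET UNLOGGED")  # hypothetical table name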
2023-06-02 15354, 2023
zas
atj: will you be available for the move to new consul today?
2023-06-02 15326, 2023
phw joined the channel
2023-06-02 15335, 2023
atj
i have work commitments until 13:30, then lunch, so will be around after 14:00
2023-06-02 15316, 2023
atj
can you list the steps that are required for the migration?
2023-06-02 15351, 2023
zas
lucifer, mayhem: even though it triggered the pink WALs issue, I don't think that's an actual issue on the LB side. Of course, reducing the use of the main db is always better, but this process didn't trigger the issue on last month's run.
2023-06-02 15307, 2023
zas
atj: yes, that's quite short: deploy consul/unbound changes, stop dnsmasq & consul-agent containers, restart consul/unbound with new configs
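A minimal sketch of those per-node steps, assuming the services run as docker containers with the names mentioned in the conversation; the actual deploy step and container names may differ:

    # Hedged sketch: per-node switchover. Whether consul/unbound run as
    # containers named exactly "consul" and "unbound" is an assumption.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. deploy the new consul/unbound configuration (placeholder for the real deploy step)
    # 2. stop the containers being replaced
    for name in ("dnsmasq", "consul-agent"):
        run(["docker", "stop", name])
    # 3. restart consul and unbound so they pick up the new configs
    for name in ("consul", "unbound"):
        run(["docker", "restart", name])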
2023-06-02 15331, 2023
zas
well, we need to test the process manually on a few nodes first, but I think it can be automated
2023-06-02 15334, 2023
zas
once done, we can completely remove a few containers from each node (serviceregistrator-10.2.2.*, consul-agent, dnsmasq) and remove them from the startup scripts
2023-06-02 15315, 2023
zas
now, we hope the deployment doesn't break anything in running containers; I expect surprises as usual
2023-06-02 15314, 2023
atj
do we want to do this on a Friday afternoon?
2023-06-02 15357, 2023
zas
well, we are in the middle of the river right now, and I prefer we reach the shore before the flood...
2023-06-02 15336, 2023
atj
bit of a tortured metaphor there ;) are we expecting a flood?
2023-06-02 15351, 2023
zas
well, yes, always ;)
2023-06-02 15303, 2023
zas
more seriously we can delay that to Monday morning
2023-06-02 15330, 2023
atj
i'd like to do the testing and write the playbook this afternoon
2023-06-02 15344, 2023
zas
ok, and we can refine steps
2023-06-02 15345, 2023
atj
then we can roll it out on Monday
2023-06-02 15306, 2023
zas
ok for me
2023-06-02 15353, 2023
mayhem
zas: have you started blocking lidarr?
2023-06-02 15349, 2023
atj
rudi/rex are blocking over 500 HTTPS req/s
2023-06-02 15304, 2023
mayhem
damn.
2023-06-02 15322, 2023
mayhem
lidarr specific or total?
2023-06-02 15339, 2023
atj
total
2023-06-02 15345, 2023
atj
i was just surprised by the numbers
2023-06-02 15302, 2023
mayhem
yeah, QNAP et al.
2023-06-02 15310, 2023
zas
atj: dreq is only what we block based on IPs; blocking based on UAs is done at the openresty level (but I parse logs from time to time to block some at the IP level)
2023-06-02 15335, 2023
atj
right, you can't block UAs at TCP level
2023-06-02 15355, 2023
zas
mayhem: I blocked lidarr-extended for now
2023-06-02 15332, 2023
mayhem
I think we should block anything with lidarr in its UA. they have their own hosting -- there is no need to serve any of their requests.
2023-06-02 15308, 2023
zas
ok, I'll check logs
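A minimal sketch of that log check ("parse logs, then block at IP level"), assuming a combined-format openresty access log; the path and field layout are assumptions:

    # Hedged sketch: count client IPs whose User-Agent mentions lidarr.
    import re
    from collections import Counter

    LOG_PATH = "/var/log/openresty/access.log"  # placeholder path
    line_re = re.compile(r'^(\S+) .*?"[^"]*" \d+ \S+ "[^"]*" "([^"]*)"')

    hits = Counter()
    with open(LOG_PATH) as fh:
        for line in fh:
            m = line_re.match(line)
            if m and "lidarr" in m.group(2).lower():
                hits[m.group(1)] += 1

    for ip, count in hits.most_common(20):
        print(f"{ip}\t{count}")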
2023-06-02 15339, 2023
zas
I see no requests from lidarr (only from lidarr-extended)
2023-06-02 15303, 2023
mayhem
ok. let's keep monitoring.
2023-06-02 15307, 2023
zas
oh wait, there are a few, very few
2023-06-02 15339, 2023
zas
I'll block those too
2023-06-02 15353, 2023
mayhem
great
2023-06-02 15328, 2023
zas
done
2023-06-02 15332, 2023
mayhem
thx
2023-06-02 15327, 2023
reosarevok
I did not know zas was the new Cantona, but maybe the weirdly poetic metaphors are a French thing :D
2023-06-02 15333, 2023
zas
:D
2023-06-02 15347, 2023
monkey
Yes they are
2023-06-02 15301, 2023
monkey
mayhem, aerozol: Recs page has been nicely polished up, and is ready for inspection!