so I'm curious about what was inserted, did the database size actually increase?
there was a huge peak in deletes a bit after
bitmap
it doesn't seem like it
I'm trying to figure out where those operations came from
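(A minimal way to check both points, assuming direct psql access to the cluster; the database name comes from the discussion above and the counters only cover activity since the last stats reset:)

    -- overall database size
    SELECT pg_size_pretty(pg_database_size('musicbrainz_db'));

    -- which tables took the inserts/deletes
    SELECT schemaname, relname, n_tup_ins, n_tup_del
    FROM pg_stat_user_tables
    ORDER BY n_tup_ins DESC
    LIMIT 10;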
zas
well, I cannot think anymore, I'm off :) thanks for your help on this, we still have a lot of questions, but the fact the WAL count stopped increasing is reassuring
gn
bitmap
good night :)
zas: I think it's traffic from the mbid_mapper (under mapping schema in musicbrainz_db) judging by logs, but not certain
I tried using parts of the universal language icon but it’s so complex - because we’re mainly trying to change the favicon I couldn’t get it to work. But I used the same hangul letter so there’s some connection
p.s. if we’re keen on that one I’ll double check if it’s okay to turn the character like that
Zhele has quit
CatQuest
isn't that japanese hiragana?
well I liked it
it seemed right
idk about the colour though. is green for translationbrainz? :D
oh god can we name our weblate instance TransBrainz /jk
ok, in short, for us: a high number of ops on pg floyd caused the creation of 120GB of WAL files on pink (because it couldn't cope with the rate, apparently), then new WAL files kept accumulating
we were far from disk size limits, and we're still unsure whether WALs got cleaned after bitmap's action or not
also it went almost unnoticed because of a buggy alert (fixed now; the good status of floyd was hiding the bad status of pink)
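(A quick way to see whether the WAL files actually got cleaned up on pink, assuming PostgreSQL 10+ and superuser or pg_monitor access:)

    -- number and total size of WAL segments currently on disk
    SELECT count(*) AS wal_files,
           pg_size_pretty(sum(size)) AS total_size
    FROM pg_ls_waldir();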
lucifer
i see, makes sense. i think we should be able to avoid the WAL creation at least with the UNLOGGED quickfix.
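(A sketch of what the UNLOGGED quickfix could look like; the table name is a placeholder for the derived cache table, and flipping the flag on an existing table takes a heavy lock and rewrites it:)

    -- create the derived cache unlogged from the start, or...
    -- CREATE UNLOGGED TABLE mapping.mb_metadata_cache (...);

    -- ...flip an existing table; subsequent writes skip the WAL
    ALTER TABLE mapping.mb_metadata_cache SET UNLOGGED;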
zas
lucifer: it's possible to run this task at any time, right? I mean, can we trigger it for real-life testing (to verify it has the expected results, but under control this time)?
lucifer
zas: yes, we can.
zas
we measured a lot of db inserts (for more than one hour), followed by a lot of delete operations within a short time. Can you link those to code?
CatQuest: looks Japanese for sure! But pretty sure it's Korean script.. they are all so similar!
The green is just the MetaBrainz color, I've been using it for stuff that covers all projects (tickets, forums, wiki)
reosarevok
I like the icon :)
mayhem
moooin
lucifer: zas : not logging the metadata cache makes sense to me. it's all derived data.
zas
mayhem: morning
do you have any idea why the problem only appears now? I wonder if that's a change on lb side or due to the switch floyd<->pink we did earlier this week
I mean this cron job has been running for a long time, right?
mayhem
yes, no change on our side.
awww man, 0 listens recorded for all day yesterday. I'm guessing the spotify app dropped off at some point and then panoscrobbler couldn't do its job.
so panoscrobbler isn't up to the task either.
zas
hmmm, it happened in the past, last month we had a similar peak on pink, though it didn't have the same results
mayhem: to confirm, unlogging the table implies that it won't be available on replicas and won't survive a crash or unclean shutdown.
mayhem
it's 100% derived data, yes? then I see no problem with that.
lucifer
yup completely derived data. cool, sounds good.
zas
atj: will you be available for the move to new consul today?
phw joined the channel
atj
i have work commitments until 13:30, then lunch, so will be around after 14:00
can you list the steps that are required for the migration?
zas
lucifer, mayhem: even though it triggered the pink WALs issue, I don't think that's an actual issue on the LB side. Of course, reducing the use of the main db is always better, but this process didn't trigger the issue on last month's run.
atj: yes, that's quite short: deploy consul/unbound changes, stop dnsmasq & consul-agent containers, restart consul/unbound with new configs
well, we need to test the process manually on a few nodes first, but I think it can be automated
once done, we can totally remove a few containers from each node (serviceregistrator-10.2.2.*, consul-agent, dnsmasq) and remove them from startup scripts
now, we hope the deployment doesn't break anything in running containers, I expect surprises as usual
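(Per node, the steps above could look roughly like this; it assumes consul and unbound run as host services and that unbound listens on localhost, so the exact commands would be confirmed while writing the playbook:)

    # stop the legacy resolver and agent containers
    docker stop dnsmasq consul-agent
    # deploy the new consul/unbound configs, then restart both with them
    systemctl restart unbound consul
    # sanity check: consul's own service record should resolve through the new setup
    dig +short @127.0.0.1 consul.service.consul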
atj
do we want to do this on a Friday afternoon?
zas
well, we are in the middle of the river right now, and I prefer we reach the shore before the flood...
atj
bit of a tortured metaphor there ;) are we expecting a flood?
zas
well, yes, always ;)
more seriously we can delay that to Monday morning
atj
i'd like to do the testing and write the playbook this afternoon
zas
ok, and we can refine steps
atj
then we can roll it out on Monday
zas
ok for me
mayhem
zas: have you started blocking lidarr?
atj
rudi/rex are blocking over 500 HTTPS req/s
mayhem
damn.
lidarr specific or total?
atj
total
i was just surprised by the numbers
mayhem
yeah, QNAP et al.
zas
atj: dreq only counts what we block based on IPs; blocking based on UAs is done at the openresty level (but I parse logs from time to time to block some at the IP level too)
atj
right, you can't block UAs at TCP level
zas
mayhem: I blocked lidarr-extended for now
mayhem
I think we should block anything with lidarr in its UA. they have their own hosting -- there is no need to serve any of their requests.
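(A UA-level block at the openresty layer could be a sketch along these lines, assuming there isn't already a User-Agent map in the config; the case-insensitive match also catches lidarr-extended:)

    # http context: flag any User-Agent containing "lidarr"
    map $http_user_agent $block_lidarr {
        default      0;
        "~*lidarr"   1;
    }

    # inside the relevant server/location block:
    if ($block_lidarr) {
        return 403;
    }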
zas
ok, I'll check logs
I see no requests from lidarr (only from lidarr-extended)
mayhem
ok. lets keep monitoring.
zas
oh wait, there are a few, very few
I'll block those too
mayhem
great
zas
done
mayhem
thx
reosarevok
I did not know zas is the new Cantona, but maybe the weirdly poetic metaphors are a French people thing :D
zas
:D
monkey
Yes they are
mayhem, aerozol: Recs page has been nicely polished up, and is ready for inspection!