LB team, please check resources sometimes: lemmy is at 170 load15, gaga's disk is full
shivam-kapila
Morning!
Lol my similarity to myself is 0.007% 🤣
Oops _lucifer beat me. His score is 0.001
_lucifer
😌
MajorLurker joined the channel
MajorLurker has quit
zas
yvanzo: SIR queue has been growing for a while
ballenby joined the channel
sumedh has quit
sumedh joined the channel
navap
I just followed the musicbrainz-docker dev setup steps on Ubuntu 18.04 and ended up with the following error when running `sudo docker-compose up -d`. Any ideas? ERROR: Invalid interpolation format for "volumes" option in service "musicbrainz": "${MUSICBRAINZ_SERVER_LOCAL_ROOT:?Missing path of musicbrainz-server working copy}:/musicbrainz-server"
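For context on the error above: the `${NAME:?message}` form asks Compose to abort with `message` when `NAME` is unset or empty. An "Invalid interpolation format" error may mean the installed docker-compose does not understand that form at all (an assumption, nobody confirmed it in the channel). The rough sketch below mimics the intended behaviour in Python; `interpolate` and its regular expression are illustrative only and are not Compose's actual implementation.

```python
import re

# Matches ${NAME}, ${NAME:?message} and ${NAME?message}.
# Default-value forms such as ${NAME:-x} are deliberately left out of this sketch.
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:(?P<op>:?\?)(?P<msg>[^}]*))?\}"
)


def interpolate(template, env):
    """Substitute ${NAME} and ${NAME:?msg} placeholders using the env mapping."""

    def replace(match):
        name, op, msg = match.group("name", "op", "msg")
        value = env.get(name)
        if op is None:
            # Plain ${NAME}: an unset variable becomes an empty string.
            return value or ""
        # ":?" rejects unset and empty values, "?" rejects only unset ones.
        missing = value is None or (op == ":?" and value == "")
        if missing:
            raise ValueError(f"{name}: {msg}")
        return value

    return _PATTERN.sub(replace, template)
```

With `MUSICBRAINZ_SERVER_LOCAL_ROOT` set to a (hypothetical) path such as `/src/musicbrainz-server`, the volume string from the error resolves to `/src/musicbrainz-server:/musicbrainz-server`. With the variable unset, the sketch raises an error carrying the "Missing path of musicbrainz-server working copy" message.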
MajorLurker joined the channel
MajorLurker has quit
v6lur joined the channel
zas
yvanzo: sir broke at ~3:55 UTC with the following error:
i have no idea where the input from the internet is from
i'll need some help parsing where the load is coming from. it's not from the cron container, i've stopped it. most of htop is a bunch of uwsgi processes, so i assume we're getting a lot of heavy requests. however I have no data on whether we're getting an abnormal number of requests or not, and I'm not sure where to get that data.
if you zoom out you can see a number of peaks like this.
and anytime you see a peak that has a flat top, it is unlikely to be inbound traffic.
but something that is bound, by a NIC or a process.
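The flat-top rule of thumb above can be checked mechanically on exported graph data. The sketch below is only a heuristic under stated assumptions: `flat_top_runs`, `tolerance` and `min_run` are made-up names and defaults, and the samples are assumed to be positive numbers at a fixed interval.

```python
def flat_top_runs(samples, tolerance=0.02, min_run=3):
    """Return (start, end) index pairs where the series stays near its maximum.

    A run of at least `min_run` consecutive samples within `tolerance`
    (as a fraction) of the peak hints at a saturated resource rather than
    a traffic spike, which tends to have a pointed top.
    """
    if not samples:
        return []
    peak = max(samples)
    if peak <= 0:
        return []
    floor = peak * (1 - tolerance)
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value >= floor:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - 1))
    return runs
```

A pointed series such as `[10, 40, 90, 40, 10]` yields no runs, while a capped one such as `[10, 50, 100, 100, 99, 100, 50, 10]` yields one run covering indices 2 to 5.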
Lotheric_ joined the channel
Lotheric has quit
Lotheric__ joined the channel
Lotheric__ is now known as Lotheric
Lotheric_ has quit
reosarevok
zas, ruaok: 502s
In prod
Is that the same lemmy issue?
zas
hmmm
nope, something else
floyd is under heavy load
bitmap: ^^
yvanzo: ^^
sumedh has quit
_lucifer
today i found a LinkedIn profile where someone had put "added 15 artists to MusicBrainz" as experience
sumedh joined the channel
ruaok
reosarevok: 502 in prod and you bug *us*?? why aren't you investigating?
reosarevok
Because that suggests sysadmin to me, not junior dev
Lotheric has a lot of experience suddenly
Hope it's not the whole "people querying VA releases" thing
(that code should be released in the next release, but if it seems to be causing issues we could put it out sooner)
ruaok
reosarevok: you really need to start learning more about our production setup
we all need to take part in it and it's not fair to just push things off to zas.
so, let's start now.
1. high disk writes. 40-70MB/s
2. low disk reads
3. 60% CPU use
reosarevok: go google how to find currently running queries in postgres
reosarevok
I know absolutely 0 things about hardware, so all that tells me very little. I studied web programming and the last time I set up a server it was on a Pentium 2 or so
Ok, that tells me more :p
ruaok
this has ZERO to do with hardware.
this is EVERYTHING to do with the software that YOU help write.
it is doing something wrong. zas didn't write it. perhaps you didn't either, but you should learn more about what it does.
reosarevok
Oh, absolutely, my point is that I know nothing about what influences disk writes and reads and CPU use, I mostly know how to make code that makes stuff show up on a website
zas
it started at 16:11 UTC, massive writes, almost 100% CPU, load ~50, ram ok but usage increased
reosarevok
I'm not saying I shouldn't learn more
ruaok
time to start learning.
reosarevok
Just that this doesn't tell me anything at first :)
Let's see
ruaok
load is more normal now
zas
yes, it decreases
reosarevok
Guessing that means whatever happened is no longer in pg_stat_activity
But I see at least one query for VA
Two
So yeah
Prooobably should hotfix that
Because next release is in almost 10 days
ruaok
there are 4 queries that have been running for 21 days.
2 for 21 days. some for longer
one of them in an explain. wtf.
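For anyone following along, here is a minimal sketch of the kind of check being discussed: list non-idle sessions from `pg_stat_activity` and flag the ones that have been running for a long time. The SQL uses standard `pg_stat_activity` columns. The `long_running` helper, its threshold, and the row layout are assumptions for illustration, not the queries actually run on floyd.

```python
from datetime import timedelta

# Run this against PostgreSQL, for example from psql or any database driver.
ACTIVE_QUERIES_SQL = """
SELECT pid, usename, state, query_start, query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start
"""


def long_running(rows, now, threshold=timedelta(minutes=5)):
    """Return (pid, age, query) for rows whose query started before now - threshold.

    `rows` are (pid, usename, state, query_start, query) tuples as returned
    by ACTIVE_QUERIES_SQL; `now` must be comparable with query_start
    (both timezone-aware or both naive).
    """
    result = []
    for pid, usename, state, query_start, query in rows:
        if query_start is None:
            continue  # the backend has not run a query yet
        age = now - query_start
        if age >= threshold:
            result.append((pid, age, query))
    # Oldest queries first.
    return sorted(result, key=lambda item: item[1], reverse=True)
```

Sorting by age puts something like a 21-day-old query at the top, while short-lived queries that only show up during a spike need a lower threshold and repeated sampling.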
reosarevok
That wouldn't cause a sudden spike, I assume, while we know the VA queries do
But still weird
bitmap, yvanzo: this has two approvals, so I'm going to hotfix it to beta/prod myself, unless one of you is around and has a good reason not to in the time it takes me to do so :p
this actually makes sense to me -- sundays are heavy load time and if we have something that is known to be bad, this can cause everything to back up.
so, yes, please hotfix asap.
zas: I'm also concerned about the fan temp on floyd. do we need to schedule a fix?
zas
I wonder too, but I think it's normal
because that's a huge cpu chip, it tends to produce more heat
also it doesn't throttle at those temperatures
ruaok
ok. let's keep our eyes on it.
zas
temp increases with load, but alert threshold is perhaps a bit low for it
note: it reaches threshold under 100% cpu (all cores) for a loong time
I'd say the cpu cooling system isn't the best one, but is working "normally"
(and btw, it's been on my radar for a while ;)
ruaok
heheh, ok.
shivam-kapila
I am not an expert at all this. But is it possible that one of our clients/users has set up a cron job to fetch/update data for a large data set every Sunday?
(Please ignore if I'm talking nonsense)
Just wondering why such high load occurs every weekend
reosarevok
bitmap: btw, any idea about this one? I think it's the one ruaok mentioned above, and it's about sitemaps so you'd be the most likely to know: