Lb team, check resources sometimes, lemmy 170 load15, Gaga disk full
2020-07-19 20153, 2020
shivam-kapila
Morning!
2020-07-19 20116, 2020
shivam-kapila
Lol my similarity to myself is 0.007% 🤣
2020-07-19 20116, 2020
shivam-kapila
Oops _lucifer beat me. His score is 0.001
2020-07-19 20140, 2020
_lucifer
😌
2020-07-19 20129, 2020
MajorLurker joined the channel
2020-07-19 20137, 2020
MajorLurker has quit
2020-07-19 20106, 2020
zas
yvanzo: SIR queue has been growing for a while
2020-07-19 20145, 2020
ballenby joined the channel
2020-07-19 20114, 2020
sumedh has quit
2020-07-19 20143, 2020
sumedh joined the channel
2020-07-19 20154, 2020
navap
I just followed the musicbrainz-docker dev setup steps on ubuntu 18.04 and ended up with the following error when running `sudo docker-compose up -d`, any ideas? ERROR: Invalid interpolation format for "volumes" option in service "musicbrainz": "${MUSICBRAINZ_SERVER_LOCAL_ROOT:?Missing path of musicbrainz-server working copy}:/musicbrainz-server"
2020-07-19 20127, 2020
MajorLurker joined the channel
2020-07-19 20125, 2020
MajorLurker has quit
2020-07-19 20130, 2020
v6lur joined the channel
2020-07-19 20110, 2020
zas
yvanzo: sir broke at ~3:55 UTC with following error:
i have no idea where the input from the internet is from
2020-07-19 20101, 2020
iliekcomputers
i'll need some help parsing where the load is coming from. it's not from the cron container, i've stopped it. most of htop is a bunch of uwsgi processes, so i assume we're getting a lot of heavy requests. however I have no data on whether we're getting an abnormal number of requests or not, and I'm not sure where to get that data.
if you zoom out you can see a number of peaks like this.
2020-07-19 20104, 2020
ruaok
and anytime you see a peak that has a flat top, it is unlikely to be inbound traffic.
2020-07-19 20124, 2020
ruaok
but something that is bound, by a NIC or a process.
2020-07-19 20139, 2020
Lotheric_ joined the channel
2020-07-19 20100, 2020
Lotheric has quit
2020-07-19 20119, 2020
Lotheric__ joined the channel
2020-07-19 20139, 2020
Lotheric__ is now known as Lotheric
2020-07-19 20124, 2020
Lotheric_ has quit
2020-07-19 20101, 2020
reosarevok
zas, ruaok: 502s
2020-07-19 20105, 2020
reosarevok
In prod
2020-07-19 20123, 2020
reosarevok
Is that the same lemmy issue?
2020-07-19 20144, 2020
zas
hmmm
2020-07-19 20149, 2020
zas
nope, something else
2020-07-19 20157, 2020
zas
floyd is under heavy load
2020-07-19 20100, 2020
zas
bitmap: ^^
2020-07-19 20105, 2020
zas
yvanzo: ^^
2020-07-19 20123, 2020
sumedh has quit
2020-07-19 20124, 2020
_lucifer
today i found a LinkedIn profile where someone had put "added 15 artists to MusicBrainz" as experience
2020-07-19 20146, 2020
sumedh joined the channel
2020-07-19 20109, 2020
ruaok
reosarevok: 502 in prod and you bug *us*?? why aren't you investigating?
2020-07-19 20145, 2020
reosarevok
Because that suggests sysadmin to me, not junior dev
2020-07-19 20110, 2020
Lotheric has a lot of experience suddenly
2020-07-19 20133, 2020
reosarevok
Hope it's not the whole "people querying VA releases" thing
2020-07-19 20102, 2020
reosarevok
(that code should be released in the next release, but if it seems to be causing issues we could put it out sooner)
2020-07-19 20104, 2020
ruaok
reosarevok: you really need to start learning more about our production setup
2020-07-19 20127, 2020
ruaok
we all need to take part in it and it's not fair to just push things off to zas.
2020-07-19 20138, 2020
ruaok
so, let's start now.
2020-07-19 20107, 2020
ruaok
1. high disk writes. 40-70MB/s
2020-07-19 20114, 2020
ruaok
2. low disk reads
2020-07-19 20136, 2020
ruaok
3. 60% CPU use
2020-07-19 20147, 2020
ruaok
reosarevok: go google how to find currently running queries in postgres
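[note] For readers following along, here is a minimal sketch of one common way to list currently running queries in PostgreSQL, via the `pg_stat_activity` view that comes up later in this log. It is not taken from the channel: the `dsn` argument, the helper names, and the five-minute threshold are illustrative assumptions, and `psycopg2` is assumed to be installed if the live-server helper is used.

```python
from datetime import timedelta

# pid, usename, state, query_start and query are standard pg_stat_activity
# columns; the filter and ordering below are illustrative choices.
ACTIVE_QUERIES_SQL = """
SELECT pid,
       usename,
       now() - query_start AS runtime,
       query
  FROM pg_stat_activity
 WHERE state <> 'idle'
   AND pid <> pg_backend_pid()
 ORDER BY runtime DESC NULLS LAST
"""


def fetch_active_queries(dsn):
    """Run the query on a live server; `dsn` is a hypothetical connection string."""
    import psycopg2  # imported lazily so the pure helper below works without it

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(ACTIVE_QUERIES_SQL)
            return cur.fetchall()
    finally:
        conn.close()


def long_running(rows, threshold=timedelta(minutes=5)):
    """Keep rows whose runtime (third column) is at or above `threshold`."""
    return [row for row in rows if row[2] is not None and row[2] >= threshold]
```

Sorting by runtime puts the oldest statements first, which is usually where to start looking when load spikes; `long_running` can then narrow the list down to queries that have been running suspiciously long.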
2020-07-19 20157, 2020
reosarevok
I know absolutely 0 things about hardware, so all that tells me very little. I studied web programming and the last time I set up a server it was on a Pentium 2 or so
2020-07-19 20100, 2020
reosarevok
Ok, that tells me more :p
2020-07-19 20113, 2020
ruaok
this has ZERO to do with hardware.
2020-07-19 20123, 2020
ruaok
this is EVERYTHING to do with the software that YOU help write.
2020-07-19 20100, 2020
ruaok
it is doing something wrong. zas didn't write it. perhaps you didn't either, but you should learn more about what it does.
2020-07-19 20108, 2020
reosarevok
Oh, absolutely, my point is that I know nothing about what influences disk writes and reads and CPU use, I mostly know how to make code that makes stuff show up on a website
2020-07-19 20112, 2020
zas
it started at 16:11 UTC, massive writes, almost 100% CPU, load ~50, ram ok but usage increased
2020-07-19 20112, 2020
reosarevok
I'm not saying I shouldn't learn more
2020-07-19 20115, 2020
ruaok
time to start learning.
2020-07-19 20118, 2020
reosarevok
Just that this doesn't tell me anything at first :)
2020-07-19 20136, 2020
reosarevok
Let's see
2020-07-19 20147, 2020
ruaok
load is more normal now
2020-07-19 20156, 2020
zas
yes, it decreases
2020-07-19 20152, 2020
reosarevok
Guessing that means whatever happened is no longer in pg_stat_activity
2020-07-19 20103, 2020
reosarevok
But I see at least one query for VA
2020-07-19 20112, 2020
reosarevok
Two
2020-07-19 20113, 2020
reosarevok
So yeah
2020-07-19 20120, 2020
reosarevok
Prooobably should hotfix that
2020-07-19 20140, 2020
reosarevok
Because next release is in almost 10 days
2020-07-19 20149, 2020
ruaok
there are 4 queries that have been running for 21 days.
2020-07-19 20100, 2020
ruaok
2 for 21 days. some for longer
2020-07-19 20122, 2020
ruaok
one of them in an explain. wtf.
2020-07-19 20154, 2020
reosarevok
That wouldn't cause a sudden spike, I assume, while we know the VA queries do
2020-07-19 20159, 2020
reosarevok
But still weird
2020-07-19 20102, 2020
reosarevok
bitmap, yvanzo: this has two approvals, so I'm going to hotfix it to beta/prod myself, unless one of you is around and has a good reason not to in the time it takes me to do so :p
this actually makes sense to me -- sundays are heavy load time and if we have something that is known to be bad, this can cause everything to back up.
2020-07-19 20116, 2020
ruaok
so, yes, please hotfix asap.
2020-07-19 20134, 2020
ruaok
zas: I'm also concerned about the fan temp on floyd. do we need to schedule a fix?
2020-07-19 20157, 2020
zas
I wonder too, but I think it's normal
2020-07-19 20126, 2020
zas
because that's a huge cpu chip, it tends to produce more heat
2020-07-19 20153, 2020
zas
also it doesn't throttle at those temperatures
2020-07-19 20127, 2020
ruaok
ok. let's keep our eyes on it.
2020-07-19 20147, 2020
zas
temp increases with load, but alert threshold is perhaps a bit low for it
2020-07-19 20134, 2020
zas
note: it reaches threshold under 100% cpu (all cores) for a loong time
2020-07-19 20102, 2020
zas
I'd say the cpu cooling system isn't the best one, but is working "normally"
2020-07-19 20122, 2020
zas
(and btw, it's been on my radar for a while ;)
2020-07-19 20105, 2020
ruaok
heheh, ok.
2020-07-19 20152, 2020
shivam-kapila
I am not an expert at all this. But is it possible that one of our clients/users has set up a cron job to fetch/update a large data set every Sunday?
2020-07-19 20107, 2020
shivam-kapila
(Please ignore if I talked nonsense)
2020-07-19 20133, 2020
shivam-kapila
Just wondering why such high load occurs every weekend
2020-07-19 20100, 2020
reosarevok
bitmap: btw, any idea about this one? I think it's the one ruaok mentioned above, and it's about sitemaps so you'd be the most likely to know: