ocharles-, warp, ianmcorvidae, getting 502s for just about everything.
2013-03-30 08919, 2013
andreypopp joined the channel
2013-03-30 08901, 2013
reosarevok joined the channel
2013-03-30 08947, 2013
reosarevok
Anyone know why the hell we're getting 50x on every single page?
2013-03-30 08954, 2013
Leftmost
No, and no one with access seems to be around to figure it out.
2013-03-30 08924, 2013
reosarevok
Well
2013-03-30 08927, 2013
reosarevok
Seems to be back for now
2013-03-30 08948, 2013
Leftmost
It's been in and out for me for a while.
2013-03-30 08928, 2013
reosarevok
Yeah, ok
2013-03-30 08932, 2013
reosarevok
Gone again now
2013-03-30 08935, 2013
reosarevok
*grumbles*
2013-03-30 08907, 2013
andreypopp joined the channel
2013-03-30 08951, 2013
petesake joined the channel
2013-03-30 08913, 2013
DremoraLV joined the channel
2013-03-30 08926, 2013
warp
hello!
2013-03-30 08918, 2013
bandtrace joined the channel
2013-03-30 08926, 2013
bandtrace joined the channel
2013-03-30 08912, 2013
Leftmost joined the channel
2013-03-30 08909, 2013
nikki_ joined the channel
2013-03-30 08928, 2013
ruaok joined the channel
2013-03-30 08938, 2013
ruaok
warp: PING
2013-03-30 08900, 2013
warp
ack
2013-03-30 08904, 2013
zas joined the channel
2013-03-30 08930, 2013
nikki_ wakes up to a pile of ISEs about problems reading from the redis server
2013-03-30 08913, 2013
warp
nikki_: yep, we're aware of it. and it's even worse now apparently.
2013-03-30 08900, 2013
nikki_
I thought the redis stuff was supposed to stop it from ISEing like that :/
2013-03-30 08913, 2013
warp
site is back.
2013-03-30 08939, 2013
warp
nikki_: this is a different ISE
2013-03-30 08931, 2013
nikki_
well, there's still a bunch of these "Can't use an undefined value as a HASH reference" ones
2013-03-30 08938, 2013
warp
nikki_: the theory was that either: 1. memcached would lose sessions (it's a cache, not a datastore). 2. if connection to memcached was lost a new session was created, so still losing the session.
2013-03-30 08913, 2013
warp
nikki_: which is why we switched to redis, because redis is a datastore, and the connection handling is better
2013-03-30 08905, 2013
warp
but redis ran out of filehandles (memcached has stuff to deal with this, and redis as well, but our super old version of redis doesn't)
2013-03-30 08903, 2013
warp
nikki_: and of course there are many other things broken in the release editor which can make it ISE.
2013-03-30 08945, 2013
nikki_
so I've noticed
2013-03-30 08913, 2013
nikki_
the majority of the ISEs are the release editor crashing or people submitting cd stubs without a tracklist :(
2013-03-30 08944, 2013
nikki_
oh, or the random search ones
2013-03-30 08952, 2013
Leftmost
Is it okay if I feel a great sense of satisfaction when I add a disc ID that kills a CD stub?
2013-03-30 08940, 2013
warp
Leftmost: yes.
2013-03-30 08943, 2013
nikki_
we can't exactly stop you :P
2013-03-30 08917, 2013
Leftmost
Just because you can't stop me doesn't mean it's okay. :-P
2013-03-30 08953, 2013
zas
Hmmm, I cannot log in to acoustid.org with my usual credentials, is this related to the issue MB just had?
2013-03-30 08943, 2013
luks
zas: might be
2013-03-30 08923, 2013
luks
the MB auth requests are timing out
2013-03-30 08943, 2013
zas
ohoho, 502 again on MB
2013-03-30 08925, 2013
reosarevok joined the channel
2013-03-30 08929, 2013
ocharles- joined the channel
2013-03-30 08942, 2013
ocharles wakes up
2013-03-30 08945, 2013
ocharles
warp: ping
2013-03-30 08910, 2013
warp
ocharles: hello!
2013-03-30 08927, 2013
ocharles
still file handle troubles?
2013-03-30 08929, 2013
warp
ocharles: we've got redis at 100% cpu for no apparent reason
2013-03-30 08901, 2013
warp
I don't think it's file handles.
2013-03-30 08901, 2013
Mineo joined the channel
2013-03-30 08908, 2013
ocharles
what server?
2013-03-30 08917, 2013
warp
roobarb
2013-03-30 08953, 2013
ocharles
let's have a looksie
2013-03-30 08955, 2013
warp
max open files is set correctly when I check /proc/19501/limits
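
A minimal sketch of the check warp describes, assuming the usual Linux `/proc/<pid>/limits` layout (columns separated by runs of spaces). The function name `max_open_files` and the sample text are hypothetical; the numbers in `SAMPLE` are made up and are not roobarb's actual limits.

```python
import re

def _to_limit(value):
    # The kernel prints "unlimited" when a limit has no numeric cap.
    return None if value == "unlimited" else int(value)

def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Split on two or more spaces so the row label stays in one field.
            fields = re.split(r"\s{2,}", line.strip())
            return _to_limit(fields[1]), _to_limit(fields[2])
    return None

# Illustrative sample only; values are invented for the example.
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max cpu time              unlimited            unlimited            seconds\n"
    "Max open files            10032                10032                files\n"
)

print(max_open_files(SAMPLE))
```

On a live machine the text would come from reading `/proc/<pid>/limits` for the redis-server process, which typically requires the same user or root.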
2013-03-30 08916, 2013
warp
connect clients hovers between 200 and 300 when I can connect. (redis-cli info)
2013-03-30 08919, 2013
warp
connected
2013-03-30 08950, 2013
ocharles
we have a 16 core machine so a load of 2 doesn't seem the end of the world, I guess?
2013-03-30 08930, 2013
warp
yeah, that should be fine.
2013-03-30 08900, 2013
ocharles
can I have sudo on that machine?
2013-03-30 08916, 2013
warp
sure
2013-03-30 08958, 2013
warp
ocharles: done.
2013-03-30 08914, 2013
ocharles
thanks
2013-03-30 08938, 2013
warp
hrm. now there are two redis-servers running.
2013-03-30 08902, 2013
ocharles
redis seems to be writing 30MB/s
2013-03-30 08909, 2013
ocharles
so I imagine that cpu usage is almost entirely io dominated
2013-03-30 08913, 2013
warp
ok
2013-03-30 08920, 2013
warp
so it needs to flush/save less.
2013-03-30 08942, 2013
ocharles
but according to /var/log/redis it isn't flushing that often
2013-03-30 08946, 2013
warp
at most it should flush once every minute. but only if 10000 keys have changed since the last save.
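
The rule warp states (a save at most once a minute, and only if 10000 keys changed) can be sketched as a small decision function. This is an illustrative model, not Redis's own code: `should_snapshot` and `SAVE_POINTS` are hypothetical names, and the strict "more than N seconds" comparison is an assumption about how the threshold is applied.

```python
def should_snapshot(seconds_since_save, changes_since_save, save_points):
    """Return True if any (seconds, changes) save point is satisfied.

    A save point is met when more than `seconds` have elapsed since the
    last save AND at least `changes` keys were modified in that time.
    """
    return any(
        seconds_since_save > seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )

# The single save point described in the chat: 60 seconds, 10000 changed keys.
SAVE_POINTS = [(60, 10000)]

print(should_snapshot(61, 10000, SAVE_POINTS))  # both thresholds met
print(should_snapshot(61, 9999, SAVE_POINTS))   # too few changed keys
print(should_snapshot(30, 50000, SAVE_POINTS))  # not enough time elapsed
```

Under this model, a burst of writes alone does not force a save; the time threshold still caps background saves at roughly one per minute.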
2013-03-30 08959, 2013
ocharles
hum, maybe it's not that, atop isn't showing much activity for the disk actually
2013-03-30 08935, 2013
warp
:(
2013-03-30 08907, 2013
andreypopp joined the channel
2013-03-30 08913, 2013
warp
ocharles: what are you currently doing?
2013-03-30 08942, 2013
warp
(there's still two redis-servers running, which cannot be good, I'd like to kill one
19501, which also seems to have killed the other one. so perhaps it's normal.
2013-03-30 08934, 2013
ocharles
the logs say:
2013-03-30 08939, 2013
ocharles
30 Mar 12:04:55 * Opening TCP port: bind: Address already in use
2013-03-30 08940, 2013
warp
but if one is an intentional/internal fork of the other, I would have expected their start time to be the same.
2013-03-30 08942, 2013
ocharles
so I don't think it's normal
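
The "Address already in use" line is what a second process gets when it tries to bind a TCP port that another socket already holds. A minimal sketch of that failure, assuming default socket options (no `SO_REUSEADDR`/`SO_REUSEPORT`) and a free loopback port picked by the OS rather than 6379; `second_bind_fails` is a hypothetical helper name.

```python
import errno
import socket

def second_bind_fails():
    """Bind one listener, then try to bind a second socket to the same port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so the example never
        # collides with a real service.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            # EADDRINUSE is the errno behind "Address already in use".
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()

print(second_bind_fails())
```

This only illustrates the error itself; it says nothing about why a second redis-server was started on roobarb.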
2013-03-30 08949, 2013
ocharles
also
2013-03-30 08950, 2013
ocharles
30 Mar 12:44:47 - The server is now ready to accept connections on port 6379
2013-03-30 08908, 2013
ocharles
30 Mar 11:20:14 * WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
2013-03-30 08916, 2013
ocharles
but that's fail, not spin
2013-03-30 08959, 2013
warp
the server has 4GB free memory, so that message doesn't seem relevant to our current trouble.
2013-03-30 08927, 2013
ocharles
this connection is too weak to do anything useful :/
2013-03-30 08930, 2013
warp
ocharles: anyway, so I don't understand why it's being this finicky. it should have enough file handles, enough CPU, enough memory.
2013-03-30 08933, 2013
ocharles
acid2 [at roobarb]:~$ ps aux | grep strc
2013-03-30 08934, 2013
ocharles
a^C^[[A^C^C^C^C^Z
2013-03-30 08935, 2013
ocharles
for example
2013-03-30 08936, 2013
ocharles
:)
2013-03-30 08902, 2013
warp
ecuador cable internet ftw!
2013-03-30 08942, 2013
warp
ocharles: I'm inclined to install redis 2.x either on roobarb or a new hoser vm. 1.2 seems old. though upgrading and hoping that magically fixes a problem is in general not the best strategy.
2013-03-30 08908, 2013
ocharles
i'm ok with that
2013-03-30 08951, 2013
warp
ok
2013-03-30 08922, 2013
ocharles
i really can't do anything other than advise i'm afraid
2013-03-30 08901, 2013
ocharles
i'm seeing java take 400% of the cpu though atm
2013-03-30 08947, 2013
warp
ocharles: yep, roobarb and dora are the search servers.
2013-03-30 08905, 2013
ocharles
i know
2013-03-30 08911, 2013
ocharles
but i mean the load doesn't seem to be coming from redis right now
2013-03-30 08958, 2013
warp nods.
2013-03-30 08910, 2013
warp
and redis is working fine when it's not spiking at 100%
2013-03-30 08939, 2013
ocharles
it seems to have only broken today though
2013-03-30 08955, 2013
ocharles
and ruaok did an upgrade on the search servers yesterday, so i wonder if the events are correlated?