#musicbrainz-devel

12:34 AM
ruaok

lol @ xkcd

2013-03-30 08923, 2013

12:36 AM
warp

lol!

2013-03-30 08942, 2013

1:20 AM
Prophet5 joined the channel

2013-03-30 08912, 2013

1:37 AM
reoafk joined the channel

2013-03-30 08947, 2013

2:05 AM
Ben\Sput has left the channel

2013-03-30 08901, 2013

2:06 AM
Ben\Sput joined the channel

2013-03-30 08906, 2013

2:06 AM
Ben\Sput

50* errors :(

2013-03-30 08919, 2013

2:09 AM
Ben\Sput has left the channel

2013-03-30 08901, 2013

2:51 AM
j-b_ joined the channel

2013-03-30 08929, 2013

2:51 AM
navap joined the channel

2013-03-30 08910, 2013

2:53 AM
DWSR2 joined the channel

2013-03-30 08945, 2013

2:55 AM
ocharles- joined the channel

2013-03-30 08954, 2013

5:33 AM
Prophet5 joined the channel

2013-03-30 08910, 2013

6:47 AM
andreypopp joined the channel

2013-03-30 08953, 2013

7:22 AM
andreypopp joined the channel

2013-03-30 08957, 2013

7:49 AM
Leftmost

ocharles-, warp, ianmcorvidae, getting 502s for just about everything.

2013-03-30 08919, 2013

8:02 AM
andreypopp joined the channel

2013-03-30 08901, 2013

8:24 AM
reosarevok joined the channel

2013-03-30 08947, 2013

8:25 AM
reosarevok

Anyone know why the hell we're having 50x on every single page?

2013-03-30 08954, 2013

8:30 AM
Leftmost

No, and no one with access seems to be around to figure it out.

2013-03-30 08924, 2013

8:32 AM
reosarevok

Well

2013-03-30 08927, 2013

8:32 AM
reosarevok

Seems to be back for now

2013-03-30 08948, 2013

8:32 AM
Leftmost

It's been in and out for me for a while.

2013-03-30 08928, 2013

8:33 AM
reosarevok

Yeah, ok

2013-03-30 08932, 2013

8:33 AM
reosarevok

Gone again now

2013-03-30 08935, 2013

8:33 AM
reosarevok

*grumbles*

2013-03-30 08907, 2013

8:38 AM
andreypopp joined the channel

2013-03-30 08951, 2013

9:11 AM
petesake joined the channel

2013-03-30 08913, 2013

9:26 AM
DremoraLV joined the channel

2013-03-30 08926, 2013

10:21 AM
warp

hello!

2013-03-30 08918, 2013

10:39 AM
bandtrace joined the channel

2013-03-30 08926, 2013

10:45 AM
bandtrace joined the channel

2013-03-30 08912, 2013

10:52 AM
Leftmost joined the channel

2013-03-30 08909, 2013

11:23 AM
nikki_ joined the channel

2013-03-30 08928, 2013

11:29 AM
ruaok joined the channel

2013-03-30 08938, 2013

11:29 AM
ruaok

warp: PING

2013-03-30 08900, 2013

11:35 AM
warp

ack

2013-03-30 08904, 2013

11:39 AM
zas joined the channel

2013-03-30 08930, 2013

11:39 AM
nikki_ wakes up to a pile of ISEs about problems reading from the redis server

2013-03-30 08913, 2013

11:40 AM
warp

nikki_: yep, we're aware of it. and it's even worse now apparently.

2013-03-30 08900, 2013

11:41 AM
nikki_

I thought the redis stuff was supposed to stop it from ISEing like that :/

2013-03-30 08913, 2013

11:43 AM
warp

site is back.

2013-03-30 08939, 2013

11:43 AM
warp

nikki_: this is a different ISE

2013-03-30 08931, 2013

11:44 AM
nikki_

well, there's still a bunch of these "Can't use an undefined value as a HASH reference" ones

2013-03-30 08938, 2013

11:44 AM
warp

nikki_: the theory was that either: 1. memcached would lose sessions (it's a cache, not a datastore). 2. if connection to memcached was lost a new session was created, so still losing the session.

2013-03-30 08913, 2013

11:45 AM
warp

nikki_: which is why we switched to redis, because redis is a datastore, and the connection handling is better

2013-03-30 08905, 2013

11:46 AM
warp

but redis ran out of filehandles (memcached has stuff to deal with this, and redis as well, but our super old version of redis doesn't)

2013-03-30 08903, 2013

11:47 AM
warp

nikki_: and of course there are many other things broken in the release editor which can make it ISE.

2013-03-30 08945, 2013

11:47 AM
nikki_

so I've noticed

2013-03-30 08913, 2013

11:48 AM
nikki_

the majority of the ISEs are the release editor crashing or people submitting cd stubs without a tracklist :(

2013-03-30 08944, 2013

11:48 AM
nikki_

oh, or the random search ones

2013-03-30 08952, 2013

11:48 AM
Leftmost

Is it okay if I feel a great sense of satisfaction when I add a disc ID that kills a CD stub?

2013-03-30 08940, 2013

11:50 AM
warp

Leftmost: yes.

2013-03-30 08943, 2013

11:50 AM
nikki_

we can't exactly stop you :P

2013-03-30 08917, 2013

11:51 AM
Leftmost

Just because you can't stop me doesn't mean it's okay. :-P

2013-03-30 08953, 2013

11:56 AM
zas

Hmmm, i cannot log in to acoustid.org with my usual credentials, is this related to the issue MB just had?

2013-03-30 08943, 2013

11:57 AM
luks

zas: might be

2013-03-30 08923, 2013

11:58 AM
luks

the MB auth requests are timing out

2013-03-30 08943, 2013

11:58 AM
zas

ohoho, 502 again on MB

2013-03-30 08925, 2013

12:07 PM
reosarevok joined the channel

2013-03-30 08929, 2013

12:18 PM
ocharles- joined the channel

2013-03-30 08942, 2013

12:19 PM
ocharles wakes up

2013-03-30 08945, 2013

12:19 PM
ocharles

warp: ping

2013-03-30 08910, 2013

12:20 PM
warp

ocharles: hello!

2013-03-30 08927, 2013

12:20 PM
ocharles

still file handle troubles?

2013-03-30 08929, 2013

12:20 PM
warp

ocharles: we've got redis at 100% cpu for no apparent reason

2013-03-30 08901, 2013

12:21 PM
warp

I don't think it's file handles.

2013-03-30 08901, 2013

12:21 PM
Mineo joined the channel

2013-03-30 08908, 2013

12:21 PM
ocharles

what server?

2013-03-30 08917, 2013

12:22 PM
warp

roobarb

2013-03-30 08953, 2013

12:22 PM
ocharles

let's have a looksie

2013-03-30 08955, 2013

12:22 PM
warp

max open files is set correctly when I check /proc/19501/limits

2013-03-30 08916, 2013
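The check warp describes (reading "Max open files" from /proc/19501/limits) could be scripted roughly as below. This is a minimal sketch for Linux only; the function names and the sample numbers are made up for illustration and are not taken from roobarb.

```python
from pathlib import Path

# Illustrative sample only; the numbers are made up, not read from roobarb.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            10032                10032                files
"""

def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row; None means unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns after the three-word name are: soft limit, hard limit, units.
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")

def max_open_files(pid):
    """Read the limits of a live process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())

def open_fd_count(pid):
    """Count descriptors currently open; needs permission to read /proc/<pid>/fd."""
    return sum(1 for _ in Path(f"/proc/{pid}/fd").iterdir())
```

Comparing open_fd_count(pid) against the soft limit from max_open_files(pid) would show how close the process is to running out of file handles.
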

12:23 PM
warp

connect clients hovers between 200 and 300 when I can connect. (redis-cli info)

2013-03-30 08919, 2013

12:23 PM
warp

connected

2013-03-30 08950, 2013
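The connected-clients figure warp quotes comes from `redis-cli info`, which prints `key:value` lines. A small sketch of polling that value follows; the helper names and the sample text are hypothetical, and the sample numbers are invented.

```python
import subprocess

# Made-up sample in the key:value layout that `redis-cli info` prints.
SAMPLE_INFO = "connected_clients:250\r\nused_memory:1048576\r\n"

def parse_redis_info(text):
    """Turn INFO output into a dict of strings, skipping blanks and '#' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info

def fetch_redis_info():
    """Run `redis-cli info` against the default local server and parse it."""
    result = subprocess.run(
        ["redis-cli", "info"], capture_output=True, text=True, check=True
    )
    return parse_redis_info(result.stdout)

def connected_clients(info):
    return int(info["connected_clients"])
```

Calling connected_clients(fetch_redis_info()) in a loop would give the same 200 to 300 reading, and a failed call would mark the moments when the server does not accept connections.
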

12:23 PM
ocharles

we have a 16 core machine so a load of 2 doesn't seem the end of the world, I guess?

2013-03-30 08930, 2013

12:24 PM
warp

yeah, that should be fine.

2013-03-30 08900, 2013
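The reasoning here is load average divided by core count. A minimal sketch, with hypothetical function names:

```python
import os

def load_per_core(load_average, cores):
    """Load average normalised by core count; values well below 1.0 mean spare capacity."""
    return load_average / cores

def current_load_per_core():
    """Same figure for the machine this runs on (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)
```

For the numbers in the chat, load_per_core(2, 16) is 0.125. One caveat: a single process pinned at 100% of one core adds only about 1 to the load average, so a low machine-wide figure does not rule out one saturated process.
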

12:25 PM
ocharles

can I have sudo on that machine?

2013-03-30 08916, 2013

12:25 PM
warp

sure

2013-03-30 08958, 2013

12:25 PM
warp

ocharles: done.

2013-03-30 08914, 2013

12:26 PM
ocharles

thanks

2013-03-30 08938, 2013

12:32 PM
warp

hrm. now there's two redis-servers running.

2013-03-30 08902, 2013

12:33 PM
ocharles

redis seems to be writing 30MB/s

2013-03-30 08909, 2013

12:33 PM
ocharles

so I imagine that cpu usage is almost entirely io dominated

2013-03-30 08913, 2013

12:33 PM
warp

ok

2013-03-30 08920, 2013

12:33 PM
warp

so it needs to flush/save less.

2013-03-30 08942, 2013

12:33 PM
ocharles

but according to /var/log/redis it isn't flushing that often

2013-03-30 08946, 2013

12:34 PM
warp

at most it should flush once every minute. but only if 10000 keys have changed since the last save.

2013-03-30 08959, 2013
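The rule warp describes matches a redis.conf save point of the form `save 60 10000` (seconds, then number of changes). Below is a rough sketch of how such save points are evaluated, assuming that is the only save point configured on roobarb; the actual config is not shown in the log, and the function name is made up.

```python
# Each save point is (seconds, changes), mirroring a `save <seconds> <changes>` line.
SAVE_POINTS = [(60, 10000)]  # assumption: the only save point in the config

def snapshot_due(save_points, seconds_since_last_save, changes_since_last_save):
    """True if any save point has both its time and its change threshold met."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

With this rule a background save happens at most about once a minute, and a quiet minute with fewer than 10000 changed keys triggers nothing.
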

12:34 PM
ocharles

hum, maybe it's not that, atop isn't showing much activity for the disk actually

2013-03-30 08935, 2013

12:37 PM
warp

:(

2013-03-30 08907, 2013

12:39 PM
andreypopp joined the channel

2013-03-30 08913, 2013

12:43 PM
warp

ocharles: what are you currently doing?

2013-03-30 08942, 2013

12:43 PM
warp

(there's still two redis-servers running, which cannot be good, I'd like to kill one

2013-03-30 08945, 2013

12:43 PM
warp

)

2013-03-30 08912, 2013

12:44 PM
ocharles

i'm doing nothing

2013-03-30 08913, 2013

12:44 PM
ocharles

go ahead

2013-03-30 08930, 2013

12:44 PM
ocharles does not see two servers

2013-03-30 08957, 2013

12:44 PM
warp

redis 19501 64.5 9.2 1533664 1527524 ? Ss 12:05 24:17 /usr/bin/redis-server /etc/redis/redis.conf

2013-03-30 08900, 2013

12:45 PM
warp

redis 21088 25.8 9.2 1534108 1527768 ? R 12:38 1:17 /usr/bin/redis-server /etc/redis/redis.conf

2013-03-30 08911, 2013

12:45 PM
ocharles

root 18846 0.0 0.0 8292 720 pts/4 S+ 11:41 0:00 tail -f redis-server.log

2013-03-30 08914, 2013

12:45 PM
ocharles

acid2 21241 0.0 0.0 7628 1020 pts/6 S+ 12:44 0:00 grep --color=auto redis

2013-03-30 08918, 2013

12:45 PM
ocharles

oddly, I saw no servers

2013-03-30 08954, 2013

12:45 PM
ocharles

which did you kill?

2013-03-30 08912, 2013

12:46 PM
warp

19501, which also seems to have killed the other one. so perhaps it's normal.

2013-03-30 08934, 2013

12:46 PM
ocharles

the logs say:

2013-03-30 08939, 2013

12:46 PM
ocharles

30 Mar 12:04:55 * Opening TCP port: bind: Address already in use

2013-03-30 08940, 2013

12:46 PM
warp

but if one is an intentional/internal fork of the other, I would have expected their start time to be the same.

2013-03-30 08942, 2013

12:46 PM
ocharles

so I don't think it's normal

2013-03-30 08949, 2013

12:46 PM
ocharles

also

2013-03-30 08950, 2013

12:46 PM
ocharles

30 Mar 12:44:47 - The server is now ready to accept connections on port 6379

2013-03-30 08908, 2013

12:47 PM
ocharles

30 Mar 11:20:14 * WARNING overcommit_memory is set to 0! Background save may fail under low condition memory. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

2013-03-30 08916, 2013

12:47 PM
ocharles

but that's fail, not spin

2013-03-30 08959, 2013

12:47 PM
warp

the server has 4GB free memory, so that message doesn't seem relevant to our current trouble.

2013-03-30 08927, 2013
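The setting named in that warning can be read from /proc/sys/vm/overcommit_memory. A small sketch for Linux follows; the function names are hypothetical, and the mode descriptions paraphrase the kernel documentation.

```python
# Meaning of the three values of vm.overcommit_memory on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}

def describe_overcommit(raw):
    """Map the raw file contents to (mode, description)."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")

def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the live setting (Linux only)."""
    with open(path) as handle:
        return describe_overcommit(handle.read())
```

A result of mode 0 corresponds to the state the log message warns about; as both ocharles and warp note, that would explain a failed background save rather than a spinning process.
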

12:50 PM
ocharles

this connection is too weak to do anything useful :/

2013-03-30 08930, 2013

12:50 PM
warp

ocharles: anyway, so I don't understand why it's being this finicky. it should have enough file handles, enough CPU, enough memory.

2013-03-30 08933, 2013

12:50 PM
ocharles

acid2 [at roobarb]:~$ ps aux | grep strc

2013-03-30 08934, 2013

12:50 PM
ocharles

a^C^[[A^C^C^C^C^Z

2013-03-30 08935, 2013

12:50 PM
ocharles

for example

2013-03-30 08936, 2013

12:50 PM
ocharles

:)

2013-03-30 08902, 2013

12:51 PM
warp

ecuador cable internet ftw!

2013-03-30 08942, 2013

12:52 PM
warp

ocharles: I'm inclined to install redis 2.x either on roobarb or a new hoser vm. 1.2 seems old. though upgrading and hoping that magically fixes a problem is in general not the best strategy.

2013-03-30 08908, 2013

12:53 PM
ocharles

i'm ok with that

2013-03-30 08951, 2013

12:54 PM
warp

ok

2013-03-30 08922, 2013

12:58 PM
ocharles

i really can't do anything other than advise i'm afraid

2013-03-30 08901, 2013

12:59 PM
ocharles

i'm seeing java take 400% of the cpu though atm

2013-03-30 08947, 2013

1:00 PM
warp

ocharles: yep, roobarb and dora are the search servers.

2013-03-30 08905, 2013

1:01 PM
ocharles

i know

2013-03-30 08911, 2013

1:01 PM
ocharles

but i mean the load doesn't seem to be coming from redis right now

2013-03-30 08958, 2013

1:01 PM
warp nods.

2013-03-30 08910, 2013

1:02 PM
warp

and redis is working fine when it's not spiking at 100%

2013-03-30 08939, 2013

1:02 PM
ocharles

it seems to have only broken today though

2013-03-30 08955, 2013

1:02 PM
ocharles

and ruaok did an upgrade on the search servers yesterday, so i wonder if the events are correlated?

2013-03-30 08908, 2013

1:03 PM
ocharles

http://stats.musicbrainz.org/webstats/nginx-rrd/d… shows it kicked in around 1am

2013-03-30 08903, 2013

1:04 PM
ocharles

though that upgrade looks to have finished around 3 hours before

2013-03-30 08922, 2013

1:04 PM
warp

ocharles: and 3 hours is the interval at which we deploy search indexes? :)

2013-03-30 08951, 2013

1:04 PM
ocharles

i thought we did that in a loop now

2013-03-30 08900, 2013

1:06 PM
warp

then the loop takes 3 hours? or I'm just misremembering.

2013-03-30 08945, 2013

1:06 PM
ocharles

3 hours is what it says on the /search page

2013-03-30 08955, 2013

1:06 PM
ocharles

Search Results

2013-03-30 08956, 2013

1:06 PM
ocharles

Last updated: 2013-03-30 08:34 GMT

2013-03-30 08910, 2013

1:07 PM
ocharles

that does seem to be quite a while ago

2013-03-30 08900, 2013

1:09 PM
warp

that is certainly more than 3 hours.

2013-03-30 08950, 2013

1:09 PM
ocharles

yea

2013-03-30 08950, 2013

1:09 PM
warp

ok, I've got a redis 2.6 on dora.

2013-03-30 08954, 2013

1:09 PM
ocharles

cool

2013-03-30 08911, 2013

1:11 PM
warp

shall I just switch things over, or should I make some attempt at preserving the sessions?

2013-03-30 08928, 2013

1:11 PM
ocharles

uff, roobarb at load 13 again

2013-03-30 08931, 2013

1:11 PM
ocharles

and no, just switch them

2013-03-30 08938, 2013

1:11 PM
warp

alright.

2013-03-30 08943, 2013

1:11 PM
ocharles

it seems to be java that's really being problematic here

2013-03-30 08913, 2013

1:12 PM
warp

then switching to dora isn't going to help. it currently has java at 400% and load 4.something.