-
chhavi_
ruaok: Yes, we can do that :)
2018-06-15 16659, 2018
-
Leo__Verto
yvanzo, have you gotten around to running the script I sent you yet?
2018-06-15 16638, 2018
-
zas
samj1912, yvanzo: i re-use musicbrainz_server_session cookie, with cookie prefix haproxy option
2018-06-15 16614, 2018
-
samj1912
cool
2018-06-15 16627, 2018
-
zas
it needs more tests though, not sure yet if the config is correct, still digging docs
2018-06-15 16645, 2018
-
samj1912
but that wont help with sir posting docs right?
2018-06-15 16612, 2018
-
samj1912
zas: I am shifting sir from solr1 to floating ip?
2018-06-15 16615, 2018
-
zas
for post i have another config i'm working on, we can target one server
2018-06-15 16617, 2018
-
samj1912
or should I wait?
2018-06-15 16628, 2018
-
samj1912
zas: always 1 server wont be nice
2018-06-15 16629, 2018
-
zas
wait, i finish the config, then we can test
2018-06-15 16632, 2018
-
samj1912
we need to target the leader
2018-06-15 16641, 2018
-
samj1912
that will result in least latency
2018-06-15 16650, 2018
-
samj1912
otherwise solr forwards the request to leader
2018-06-15 16604, 2018
-
samj1912
and then leader sends replication packets back to other replicas
2018-06-15 16613, 2018
-
zas
but how can we get the leader IP in haproxy?
2018-06-15 16625, 2018
-
samj1912
hmm, I dont think we can
2018-06-15 16631, 2018
-
samj1912
not sure
2018-06-15 16632, 2018
-
zas
well, it embeds lua
2018-06-15 16648, 2018
-
zas
can we query solr to get the leader?
2018-06-15 16653, 2018
-
samj1912
we might
2018-06-15 16659, 2018
-
samj1912
its too complicated though
2018-06-15 16603, 2018
-
samj1912
lets leave it to later
2018-06-15 16624, 2018
-
zas
ok
2018-06-15 16621, 2018
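[Editor's note: a minimal sketch of how the leader question above could be answered, not something the channel actually ran. It assumes SolrCloud's Collections API `CLUSTERSTATUS` action, which reports each replica's `base_url`, `state`, and a `leader` flag. The host names, the collection name `example_collection`, and the shard name `shard1` are hypothetical placeholders.]

```python
import json
import urllib.request
from typing import Optional


def fetch_cluster_status(solr_base_url: str) -> dict:
    """Fetch CLUSTERSTATUS from a Solr node, e.g. http://host:8983/solr (hypothetical host)."""
    url = f"{solr_base_url}/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def find_shard_leader(status: dict, collection: str, shard: str = "shard1") -> Optional[str]:
    """Return the base_url of the active leader replica, or None if there is none."""
    shards = (
        status.get("cluster", {})
        .get("collections", {})
        .get(collection, {})
        .get("shards", {})
    )
    replicas = shards.get(shard, {}).get("replicas", {})
    for replica in replicas.values():
        # Solr usually reports the flag as the string "true"; accept a boolean too.
        is_leader = str(replica.get("leader", "")).lower() == "true"
        if is_leader and replica.get("state") == "active":
            return replica.get("base_url")
    return None


# Hand-written sample in the CLUSTERSTATUS shape, so the lookup can be tried offline.
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "example_collection": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"base_url": "http://solr1.example:8983/solr", "state": "active"},
                            "core_node2": {"base_url": "http://solr2.example:8983/solr", "state": "active", "leader": "true"},
                            "core_node3": {"base_url": "http://solr3.example:8983/solr", "state": "down"},
                        }
                    }
                }
            }
        }
    }
}

leader_url = find_shard_leader(SAMPLE_STATUS, "example_collection")
```

[A health-check script or a Lua hook in the load balancer could poll something like this and route POSTs to the returned address. Whether that is worth the added complexity is the open question raised in the chat.]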
-
zas
can you restart the load testing script?
2018-06-15 16603, 2018
-
samj1912
okay, live or stress?
2018-06-15 16608, 2018
-
zas
stress it !
2018-06-15 16620, 2018
-
samj1912
okay
2018-06-15 16629, 2018
-
samj1912
that will be directly on the lb though
2018-06-15 16634, 2018
-
samj1912
not via mbs
2018-06-15 16635, 2018
-
zas
that's ok
2018-06-15 16657, 2018
-
zas
i just want to be sure everything is still ok, and test with 3 xc51
2018-06-15 16627, 2018
-
zas
hmm, it doesnt load balance
2018-06-15 16635, 2018
-
Leo__Verto
2018-06-15 16635, 2018
-
zas
due to the sticky thing
2018-06-15 16658, 2018
-
zas
samj1912: please stop the script, i'll change the config
2018-06-15 16610, 2018
-
samj1912
okay
2018-06-15 16627, 2018
-
samj1912
2018-06-15 16630, 2018
-
samj1912
stats if you want
2018-06-15 16622, 2018
-
zas
ok, start it again
2018-06-15 16645, 2018
-
samj1912
okay
2018-06-15 16602, 2018
-
ruaok
play it again, sam!
2018-06-15 16608, 2018
-
ruaok runs away
2018-06-15 16637, 2018
-
zas
2018-06-15 16640, 2018
-
samj1912
running
2018-06-15 16652, 2018
-
ruaok
lol. meta memes.
2018-06-15 16610, 2018
-
samj1912
meta meta-memes? :P
2018-06-15 16613, 2018
-
Leo__Verto
meta is what we do best
2018-06-15 16627, 2018
-
ruaok
it is part of our name, after all.
2018-06-15 16641, 2018
-
zas
samj1912: let it run a few mins (the script, not ruaok)
2018-06-15 16603, 2018
-
samj1912
XD
2018-06-15 16639, 2018
-
zas
requests are perfectly balanced, but not the timings... weird it is
2018-06-15 16652, 2018
-
Leo__Verto
perfectly balanced, as all things should be
2018-06-15 16600, 2018
-
Leo__Verto just couldn't resist
2018-06-15 16601, 2018
-
samj1912
zas: something to do with stat display?
2018-06-15 16602, 2018
-
zas
:)
2018-06-15 16615, 2018
-
zas
samj1912: nope, it is confirmed by logs
2018-06-15 16633, 2018
-
zas
that's more related to caches imho
2018-06-15 16602, 2018
-
zas
or the leader thing
2018-06-15 16605, 2018
-
samj1912
let me know when to stop
2018-06-15 16610, 2018
-
zas
never stop !
2018-06-15 16648, 2018
-
ruaok feels like he stumbled into an inspirational hippie IRC channel.
2018-06-15 16650, 2018
-
zas
let's stress til it dies! no no ... no no limit !
2018-06-15 16652, 2018
-
zas
samj1912: looks at cpu usage
2018-06-15 16601, 2018
-
ruaok
samj1912: give the replay script a percentage that is higher than 100%.
2018-06-15 16608, 2018
-
ruaok
(will need code changes, but still)
2018-06-15 16608, 2018
-
zas
solr1 62% and others are at 100%
2018-06-15 16612, 2018
-
ruaok
where does it break?
2018-06-15 16630, 2018
-
samj1912
ruaok: we are doing a stress test currently
2018-06-15 16641, 2018
-
zas
still 100% 200 responses
2018-06-15 16659, 2018
-
samj1912
wait let me increase concurrency then
2018-06-15 16618, 2018
-
samj1912
2018-06-15 16643, 2018
-
samj1912
3x concurrency now
2018-06-15 16647, 2018
-
zas
ok
2018-06-15 16608, 2018
-
zas
115 req/s zero loss on this test
2018-06-15 16610, 2018
-
samj1912
zas, I am stopping sir for a while
2018-06-15 16611, 2018
-
ruaok
is solr-1 running some other task that slows things down?
2018-06-15 16621, 2018
-
samj1912
ruaok: yup
2018-06-15 16621, 2018
-
zas
sir prolly
2018-06-15 16624, 2018
-
samj1912
sir
2018-06-15 16629, 2018
-
ruaok
ah.
2018-06-15 16639, 2018
-
samj1912
and replication packets to other nodes
2018-06-15 16643, 2018
-
zas
from which host do you run the test samj1912 ?
2018-06-15 16643, 2018
-
samj1912
stopped sir
2018-06-15 16645, 2018
-
zas
herb?
2018-06-15 16646, 2018
-
ruaok
someone please write a tool called ma'am.
2018-06-15 16648, 2018
-
samj1912
yup
2018-06-15 16651, 2018
-
zas
ok
2018-06-15 16621, 2018
-
samj1912
sir is down
2018-06-15 16631, 2018
-
zas
ok stress again !
2018-06-15 16639, 2018
-
samj1912
its on
2018-06-15 16647, 2018
-
samj1912
didnt stop the stress test
2018-06-15 16624, 2018
-
samj1912
zas: how is it?
2018-06-15 16639, 2018
-
samj1912
oh wait
2018-06-15 16645, 2018
-
samj1912
let me reshuffle the reqs
2018-06-15 16647, 2018
-
samj1912
otherwise caches
2018-06-15 16651, 2018
-
zas
ok
2018-06-15 16607, 2018
-
samj1912
2018-06-15 16611, 2018
-
zas
peaking at 212 ops
2018-06-15 16649, 2018
-
samj1912
restarted with shuffle
2018-06-15 16659, 2018
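[Editor's note: a minimal sketch of the "reshuffle, then replay at a given concurrency" idea discussed above. It is not the actual replay script used in the channel. The names `shuffled`, `replay`, and the `send` callback are hypothetical, and a real run would pass a `send` that issues an HTTP request to the load balancer and returns the status code.]

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Optional


def shuffled(queries: Iterable[str], seed: Optional[int] = None) -> List[str]:
    """Return the queries in random order so repeated runs do not hit warm caches in the same sequence."""
    items = list(queries)
    random.Random(seed).shuffle(items)
    return items


def replay(queries: Iterable[str], send: Callable[[str], int], concurrency: int = 100) -> Dict[int, int]:
    """Send every query through `send` on a thread pool and count the returned status codes."""
    counts: Dict[int, int] = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for status in pool.map(send, queries):
            counts[status] = counts.get(status, 0) + 1
    return counts


def fake_send(query: str) -> int:
    """Stand-in for a real HTTP call; always reports success."""
    return 200


sample_queries = [f"q=artist:{n}" for n in range(50)]
result = replay(shuffled(sample_queries, seed=1), fake_send, concurrency=10)
```

[Raising `concurrency` here corresponds to the 100, 300, and 1000 settings tried in the chat. The share of non-200 entries in the returned counts is the "loss" being watched.]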
-
zas
you can increase concurrency a bit ? i'd like it to throw some "enough! enough! i'll tell you all!"
2018-06-15 16619, 2018
-
samj1912
its at 100 :P
2018-06-15 16625, 2018
-
ruaok
"She can't handle any more captain!!!"
2018-06-15 16635, 2018
-
ruaok
samj1912: make it go to 11!
2018-06-15 16638, 2018
-
ruaok
er, 110!
2018-06-15 16640, 2018
-
samj1912
I am not sure herb can :P
2018-06-15 16646, 2018
-
zas
can you run the same test script from the db server on hetzner cloud ?
2018-06-15 16600, 2018
-
samj1912
thats destroyed
2018-06-15 16605, 2018
-
ruaok
playing mean now. :)
2018-06-15 16610, 2018
-
zas
herb is very good (usually)
2018-06-15 16621, 2018
-
ruaok
it needs the logs, zas.
2018-06-15 16644, 2018
-
samj1912
okay, 300 concurrency
2018-06-15 16648, 2018
-
samj1912
what gives :
2018-06-15 16650, 2018
-
samj1912
* :P
2018-06-15 16605, 2018
-
samj1912
2018-06-15 16607, 2018
-
samj1912
last test
2018-06-15 16647, 2018
-
samj1912
at 300 concurrency
2018-06-15 16659, 2018
-
zas
solr1 has timings twice better than others...
2018-06-15 16600, 2018
-
samj1912
if it doesn't work, I will run it from another node
2018-06-15 16612, 2018
-
zas
still unexplained
2018-06-15 16621, 2018
-
samj1912
zas: its not been down since forever
2018-06-15 16628, 2018
-
zas
for now, we lose no query at 220 queries/s
2018-06-15 16630, 2018
-
samj1912
its got to have a lot of stuff on memory
2018-06-15 16640, 2018
-
samj1912
not just caches
2018-06-15 16604, 2018
-
samj1912
apart from the jvm, solr indexes are kept in memory
2018-06-15 16652, 2018
-
zas
can you loop the test so it runs 10 times ? and i will power down solr2 in the middle
2018-06-15 16613, 2018
-
samj1912
10 times?
2018-06-15 16622, 2018
-
samj1912
I will have to figure out a doc size
2018-06-15 16634, 2018
-
samj1912
going almost 300 ops now?
2018-06-15 16652, 2018
-
zas
yes
2018-06-15 16606, 2018
-
samj1912
nice
2018-06-15 16643, 2018
-
zas
btw, set this concurrency to 1000
2018-06-15 16653, 2018
-
zas
herb doesn't care, <1 load
2018-06-15 16659, 2018
-
samj1912
lol okay
2018-06-15 16627, 2018
-
zas
imho we'll need to work on reducing latency
2018-06-15 16605, 2018
-
zas
(btw, the bandwidth isn't counted between hetzner machines)
2018-06-15 16635, 2018
-
zas
solr machines reached 50 load
2018-06-15 16637, 2018
-
samj1912
2018-06-15 16648, 2018
-
samj1912
evident
2018-06-15 16648, 2018
-
ruaok
latency in what terms?
2018-06-15 16659, 2018
-
ruaok
little we can do between the regular and cloud machines.
2018-06-15 16616, 2018
-
samj1912 just wants to put it in prod now
2018-06-15 16625, 2018
-
zas
the time needed to answer ONE query, because even without load i have the feeling it could be better than >100ms
2018-06-15 16652, 2018
-
zas
samj1912: step by step ;)
2018-06-15 16613, 2018
-
samj1912
ah, I am not sure how I can improve solr anymore
2018-06-15 16648, 2018
-
zas
still running ?
2018-06-15 16651, 2018
-
ruaok
where can I see the response time graphs?
2018-06-15 16653, 2018
-
zas
load >70
2018-06-15 16600, 2018
-
samj1912
zas: stopped
2018-06-15 16603, 2018
-
zas