-- BotBot disconnected, possible missing messages --
BrainzBot joined the channel
qtjojo has quit
qtjojo joined the channel
Leftmost
LordSputnik, I don't think I have a link to the doc on updating rika. Could you send that my way when you're on next?
Also, I have an inkling of an idea on how to deal with revisions. Instead of creating them directly with a create function, we do it with a trigger on saving an Entity or EntityData instance.
qtjojo has quit
qtjojo2 joined the channel
diana_olhovik_ joined the channel
qtjojo2 has quit
JonnyJD_ joined the channel
ariscop has quit
reosarevok joined the channel
jesus2099 joined the channel
ariscop joined the channel
jesus2099 has quit
ruaok joined the channel
zas
Good morning, Rob
samphippen joined the channel
samphippen has quit
ariZon_a has quit
ariZon_a joined the channel
ruaok has quit
samphippen joined the channel
ruaok joined the channel
ruaok has quit
ruaok joined the channel
ruaok
awesome network is awesome.
ruaok has quit
ruaok joined the channel
ruaok sighs
reosarevok
Heh. Where are you this time?
ruaok
at the uni.
and this is the worst network connection in weeks. :(
it concerns the modgroup command, which doesn't do what it should, not preserving the order in the group after you commit
Nyanko-sensei has quit
ruaok
gah network lag. :(
zas
if you have group G A B and run modgroup G add C before B, it works (crm configure show: G A C B), but after the commit it is set to G A B C ... and since order in the group is very important, it messes up the setup
i haven't checked yet whether it is a known issue (perhaps fixed in later versions)
in fact a group works like colocation + order, so they are redundant if used together (i.e. the initial setup you made)
ruaok: to recover you need to reduce the rate limiter value, e.g. to 500
else it will not work :(
ruaok
500?
wow.
zas
500, then let search servers recover, then move it back to original value
ruaok nods
i wonder why they went mad again
ruaok
it used to be 1000
zas
you can try 1000, but last time i had to go even further down
ruaok
1000 didn't work, on 500 now
zas
this is damn annoying
ruaok nods
ruaok
still not recovering.
zas
try to stop one server, and let the other come up (it takes 7-8 minutes)
ruaok did just that
it is incredible that they can't recover by themselves... we need to find a way to slow start them; bitmap suggested using the rate limiter for that, though i have no idea yet how to set up something simple enough to be reliable
ruaok
I don't think slow starting them is all that hard.
zas
but another approach would be to find WHY they suddenly crash, and fix the issue at the root
ruaok
ok, recovered now
so, in order to cold start a search server we just need to write a script that executes searches on each of the indexes.
pick a random dictionary word, do a search, repeat.
do that for X times on each index and things should be safe enough to restart.
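The warm-up loop described above (random dictionary word, search, repeat X times per index) could be sketched roughly as follows. This is only an illustration: the base URL, the `type` and `query` parameter names, and the index names are assumptions, not the real search server API.

```python
import random
import urllib.parse
import urllib.request


def warm_up(base_url, indexes, words, rounds, fetch=None, rng=random):
    """Run `rounds` random-word searches against every index; return the total."""
    if fetch is None:
        # Default fetcher: plain HTTP GET, response body read and discarded.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                resp.read()
    done = 0
    for index in indexes:
        for _ in range(rounds):
            word = rng.choice(words)
            # Hypothetical query parameters; adjust to the real search endpoint.
            query = urllib.parse.urlencode({"type": index, "query": word})
            fetch(f"{base_url}?{query}")
            done += 1
    return done
```

The word list could come from something like `/usr/share/dict/words`, and `rounds` is the "X" from the discussion; a sensible value would have to be found by experiment.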
zas
we need to block requests until the slow start process is finished
ruaok
that part is harder. :(
zas
a simple iptables rule could do
with a REJECT
ruaok
and then nginx would fail out that search server.
nice idea. :)
zas
block, start server, do slow start queries, unblock
ok both servers are back to business, until next crash
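The "block, start server, do slow start queries, unblock" sequence might look something like the sketch below. It assumes the warm-up queries are sent locally over the loopback interface (so the REJECT rule skips `lo`), that the script runs as root, and that `port`, `start_cmd` and `warm_up` are supplied by the caller; none of these come from the actual setup.

```python
import subprocess


def slow_start(port, start_cmd, warm_up, run=subprocess.check_call):
    """Reject outside traffic to `port`, start the server, warm it up, unblock."""
    # Rule spec shared by insert and delete so the exact same rule is removed.
    # "! -i lo" leaves loopback open for the local warm-up queries (assumption).
    rule = ["INPUT", "!", "-i", "lo", "-p", "tcp",
            "--dport", str(port), "-j", "REJECT"]
    run(["iptables", "-I"] + rule)      # block: nginx sees this backend as down
    try:
        run(start_cmd)                  # start the search server
        warm_up()                       # slow start queries
    finally:
        run(["iptables", "-D"] + rule)  # unblock, even if a step above failed
```

Removing the rule in `finally` keeps a failed start from leaving the port blocked for good; whether that is the right behaviour when the warm-up itself fails is a judgment call.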
ruaok nods to both
about gateway-chef, i'm working on the pacemaker config, and it is a damn hell, but i now know much more about it
rule: it doesn't work as you think it works ;)
ruaok
there is a lot to know. :(
oy
zas
plus it has bugs
small ones, annoying ones
i rewrote all cookbooks, so they play nicely now
ruaok
<3
thank you!
zas
i can install both gateways on VMs easily, but the crm config isn't ready yet
i'm thinking about not using groups at all
dnscache / tinydns depends only on one internal IP
mx/mail IPs are independent from the rest (well, i think...)
nginx / rate limiter depend on both external IPs and a few internal IPs
tinydns has to start before dnscache (since it forwards requests to tinydns)
we have 2 possibilities: either all resources can run anywhere by default (meaning they start as soon as they are configured), or the reverse, which means explicitly defining where they can run; more complex, but it ensures no resource is started until we say so
also i found a nice thing
ruaok
I always wondered about all the IPs we use.
zas
ifconfig doesn't report configured interfaces
ip command does
ruaok
would everything be simpler if we used 1 IP for everything?
zas
but there is a way to have them displayed, using iflabel in primitives
one IP ?
ruaok
we currently have 5 or so IPs that can fail over.
would it be easier with 1 and have all the services on 1?
zas
it will probably be easier, though i think i can make it work as is
Nyanko-sensei joined the channel
basically if a service fails (or an IP) we migrate all to the other server anyway
ruaok
exactly. :)
zas
this is something that was not working, ie. if dnscache failed, all were stopped ...
now the thing i haven't done yet is deep checks: we currently test that the services are running ... not that they are working (e.g. nginx can be running and still not answering)
i prepared what is needed to do that, we can add an extracheck=program for each service
ruaok is starving
ruaok
feh, what I need to buy is the irccloud package. but I can't think straight, so hungry.
back in a bit.
zas
obviously each check program is different, depending on what we check (dns requests / web requests)
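A deep check for a web service could be as small as the sketch below: it makes a real request and reports whether an answer came back. How an `extracheck` program is invoked and which exit codes it should return are not stated in the discussion, so the "0 means healthy, 1 means failed" convention and the URL argument here are assumptions.

```python
import urllib.request


def http_ok(url, timeout=5, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Refused connections, timeouts and HTTP errors all end up here
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False


def main(argv):
    """Hypothetical entry point: argv[1] is the URL to probe."""
    # Assumed convention: exit status 0 when healthy, 1 otherwise.
    return 0 if http_ok(argv[1]) else 1
```

A script using this would finish with `sys.exit(main(sys.argv))`. A DNS check would follow the same shape, but would send a query to the specific server under test instead of an HTTP request.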