-
ruaok
I think it would be better if we picked this up first thing tomorrow.
-
yvanzo
Ok
-
ruaok
kewl
-
_lucifer
-
yvanzo
I will migrate sir-prod this evening, so it will already make trille happier.
-
zas
yvanzo: trying to fix telegraf issue related to rabbitmq
-
yvanzo
zas: I did not anticipate that stats dashboard would need to be updated for the new rabbitmq instance
-
I would be happy to learn more about how to set it up, so I can make use of it for other projects.
-
zas
well, in this case that's rather complicated
-
on yehudi runs a telegraf-services container
-
it is based on consul-template, and lists rabbitmq services
-
I had to update the template because it was not meant to handle multiple instances
-
(btw, the dashboards weren't either)
-
yvanzo
Ok, should I wait before emptying /sir vhost in trille rabbitmq instance?
-
zas
I'm updating dashboards right now
-
-
yvanzo
Ok, I will set up /sir on prince without moving any messages yet.
-
zas
don't worry, that's just stats/alerts/display, I'll fix stuff if needed
-
yvanzo
Ok, updating sir-prod
-
Initialized vhost in new instance
-
now transferring messages
-
updated broker in floyd, stopped message transfer, sir-prod doesn't use trille anymore.
-
in less than ten minutes :)
-
zas
!m yvanzo
-
BrainzBot
You're doing good work, yvanzo!
-
v6lur joined the channel
-
travis-ci joined the channel
-
travis-ci
-
travis-ci has left the channel
-
reosarevok
-
BrainzBot
MBS-11392: not possible to add tracklist and media "Gateway timeout 504"
-
bitmap
it looks like there's been an increase in 50x errors recently, I'll have a look
-
actually I think they've stopped already, but unclear what happened yet
-
ephemer0l has quit
-
zas
-
7GHACW088 joined the channel
-
bitmap
yeah, I think they were spread pretty evenly
-
it was correlated with a rise in the average query time / transaction duration on floyd
-
hmmm
-
-
yvanzo: ^
-
7GHACW088 is now known as ephemer0l
-
ephemer0l is now known as GeneralDiscourse
-
GeneralDiscourse is now known as ephemer0l
-
I think this might be adding a timeout/delay to most inserts
-
zas: where is rabbitmq running now?
-
zas
prince, but not sure what's the current status, yvanzo was managing this
-
bitmap
okay, he might be afk, I'm looking into the login failure
-
did the version of rabbitmq change? iirc I ran into something similar on macos when upgrading versions
-
yeah it's the same issue, logs are being spammed about channel_max
-
ugh, I'm not sure how to resolve this outside of upgrading pg_amqp
-
working on compiling that inside the container, hopefully we can upgrade it without restarting pg
-
zas: I think we'll have to restart PG
-
zas
On Floyd?
-
bitmap
yes
-
or downgrade rabbitmq
-
zas
Can we upgrade on pink, and switch to it?
-
bitmap
well, that will still require downtime
-
zas
Ok, if we have to restart, we'll also reboot
-
bitmap
well, let me see if I can configure rabbitmq to work around this
-
zas
When do you want to do it?
-
K
-
bitmap
oh thank god, I reverted channel_max to 0 in the rabbitmq config and I think it's happy for now
-
I guess this value is "dangerous" in case of a channel leak, so we should plan a restart, but it doesn't have to be now
-
zas
Ok, let's do that soon though
-
bitmap
-
-
currently we are running pg_amqp 0.4.1 which doesn't have this patch
-
yvanzo
bitmap: yes, rabbitmq version changed
-
bitmap
I saw :)
-
yvanzo
we can fall back to trille if needed
-
bitmap
it's ok for now
-
yvanzo
bitmap: so do you want to upgrade pg_amqp or to configure/downgrade rabbitmq?
-
bitmap
yvanzo: we should def. upgrade pg_amqp to avoid this in the future. but as a temporary workaround I've set channel_max to 0 in the rabbitmq container
-
in /etc/rabbitmq/rabbitmq.conf and then restarting
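(For reference, a minimal sketch of that workaround, assuming the new-style key/value `rabbitmq.conf` format. The comments are illustrative and not copied from the actual container config.)

```ini
# /etc/rabbitmq/rabbitmq.conf (assumed form, not the actual file)
# 0 means no limit on channels per connection; treat as a temporary workaround,
# since a client with a channel leak would no longer be capped.
channel_max = 0
```

(The broker needs a restart for the change to take effect, as noted above.)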
-
adhawkins has quit
-
adhawkins joined the channel
-
adhawkins has quit
-
adhawkins joined the channel
-
adhawkins has quit
-
adhawkins joined the channel
-
v6lur has quit
-
adhawkins has quit
-
adhawkins joined the channel
-
d4rkie joined the channel
-
Nyanko-sensei has quit