[@mayhem:chatbrainz.org](https://matrix.to/#/@mayhem:chatbrainz.org) lb and meb redis are separate already. not sure what caused it. were meb or lb under elevated traffic when this happened?
I would check myself but I am unsure when this started.
FWIW, I did find some queries to the donor API just before the redis mess started
mayhem[m]
no elevated traffic; nothing I could see, except from redis connection messages.
lucifer[m]
So I have implemented a cache for those queries
Testing it in beta at the moment.
mayhem[m]
and redis is flushing its data to disk with heavy IO once a minute. that doesn't seem like a good idea...
lucifer[m]
Those queries could create a lot of db connections to the meb db, bringing down meb too
mayhem[m]
ah, interesting.
lucifer[m]
As for redis flushing, the bulk of the data in redis would be metadata caches.
Wild guess would be that the metadata caches overwhelmed redis somehow, which caused issues in lb, which in turn caused the meb donor queries to lag and brought down meb too
But I will have to dig deeper
To be sure
zas[m]
mayhem: How is metabrainz.org linked to kiss? Kiss exhibits high disk write activity every 4 minutes for 3 minutes, leading to overall high disk I/O. Load is rather low, CPU usage is low, memory usage looks normal
mayhem[m]
That might be redis flushing to disk, which seems excessive
<zas[m]> "But metabrainz.org use this..." <- no, but it looks like lucifer may have put a lot of load on the DB testing a pesky query, which may have caused this. that query is now being cached to limit the impact.
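
For illustration only, here is a minimal sketch of the kind of caching discussed above: a time-limited cache in front of an expensive query, so repeated lookups do not each open a new db connection. This is not the actual meb/lb code. `TTLCache`, `fetch_donor_flags`, `cached_donor_flags`, the 300 second TTL and the returned fields are all hypothetical names and values chosen for the example, and a real deployment might well keep the cache in redis rather than in process memory.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache: an entry expires `ttl` seconds after it is stored."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive query
        value = compute()  # missing or expired: run the query once
        self._entries[key] = (now + self.ttl, value)
        return value


# Hypothetical stand-in for the expensive donor query (assumption, not real code).
stats = {"db_calls": 0}


def fetch_donor_flags(user_id: int) -> dict:
    stats["db_calls"] += 1  # count how often the "database" is actually hit
    return {"user_id": user_id, "is_donor": user_id % 2 == 0}


cache = TTLCache(ttl=300)  # 300 seconds is an arbitrary example value


def cached_donor_flags(user_id: int) -> dict:
    return cache.get_or_compute(("donor", user_id), lambda: fetch_donor_flags(user_id))
```

With this in place, calling `cached_donor_flags` repeatedly for the same user within the TTL hits the underlying query only once. The trade-off is that results can be stale for up to `ttl` seconds.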