20:06 PM
mayhem
thx!
20:06 PM
lucifer
this shouldn't change anything but let's see
20:06 PM
because the actual image was built and pushed to dockerhub 2 days ago
20:08 PM
mayhem
hm. random CB tests failing on my pr. seems unrelated.
20:14 PM
lucifer: spark tests still failing. I merged your branch into mine
20:14 PM
20:15 PM
lucifer
mayhem: yes i the previous runs, it's a new error. looking into it.
20:15 PM
*saw the
20:33 PM
aerozol
reosarevok: had to get this out of my head, been floating around for months… if we could assign a ‘product’ attribute to works then I think we could display something like this without having to get new data? It’s already all there, just doesn’t display in a way that’s useful for me
https://usercontent.irccloud-cdn.com/file/bT0ub...
20:33 PM
The niggly bit would be the year, but this would be useful for me even if that column didn’t exist
20:47 PM
monkey
There's an issue with LB, I get this when I try to load my dashboard:
20:47 PM
20:47 PM
mayhem
ok, shifted work-day, but I still got stuff done. except paying some stragglers.
20:47 PM
bitmap: ^^
20:48 PM
on prod, monkey ?
20:48 PM
monkey
yvanzo:
20:48 PM
Whoops, sorry for the ping
20:48 PM
I mean Yes
20:48 PM
mayhem
seems related to the DB move.
20:48 PM
yep. got the same
20:50 PM
mayhem restarts the web container
20:50 PM
zas, atj, bitmap: we've got some DB issues!
20:50 PM
bitmap
I'll stop some of the pgbouncer instances for now
20:50 PM
we might have to increase pg's max_connections
20:51 PM
mayhem
MB seems unaffected.
20:52 PM
mayhem is all for raising number of connections.
20:52 PM
they are not expensive, so why are we so conservative with them?
20:53 PM
bitmap
yeah, we should next time we restart PG
20:54 PM
mayhem
the problem is still happening.
20:54 PM
anything I can help with bitmap?
20:57 PM
bitmap
I stopped the pgbouncer on hendrix which should reduce the # of connections
20:57 PM
hard to tell what's going on though 'cause I think the pgbouncer stats aren't working since the move
20:57 PM
mayhem
no change. should I restart the web container again?
20:58 PM
bitmap
you can try, I'll try increasing LB's connection allotment in the pgbouncer config
20:59 PM
mayhem
no change. :(
20:59 PM
monkey
Some alerts on telegram about gaga. Related?
21:00 PM
mayhem
ah, we're looking at the wrong db. :)
21:00 PM
bitmap: it's gaga, not jimmy/hendrix
21:00 PM
lucifer: ping
21:00 PM
gaga PG is quite unhappy.
21:01 PM
bitmap
ah, no wonder nothing was taking
21:01 PM
lucifer
pong
21:01 PM
mayhem
gaga's PG is very unhappy.
21:01 PM
help me look, please?
21:01 PM
lucifer
sure, unhappy as in?
21:02 PM
bitmap undoes the main cluster changes then
21:02 PM
mayhem
high load, many PG procs.
21:02 PM
21:03 PM
this error was found a lot in the logs before we ran out of connections.
21:03 PM
that might be eating the connections.
21:03 PM
lucifer
2023-06-06 02:38:15.409 UTC [3991072] STATEMENT:
21:03 PM
5 months ago
21:03 PM
mayhem
went too far, lol.
21:04 PM
21:04 PM
ok, this was in the logs before.
21:04 PM
do we have a spark process or anything writing to gaga right now?
21:06 PM
lucifer
not really
21:06 PM
checking pg_stat_activity i see mostly listen read requests
21:06 PM
mayhem
we should check where the connections are coming from.
21:06 PM
how do we do that?
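One way to answer this is to group `pg_stat_activity` by client address, user and state. The following is a minimal sketch, assuming the rows have already been fetched with whatever PostgreSQL client is at hand. The helper name `summarize_connections` is hypothetical; the columns used (`client_addr`, `usename`, `application_name`, `state`, `backend_type`) are standard `pg_stat_activity` columns.

```python
from collections import Counter

# Query to run on the affected server (gaga here). Only client backends are
# counted, so autovacuum workers and other background processes are excluded.
CONNECTIONS_BY_SOURCE_SQL = """
SELECT client_addr, usename, application_name, state, count(*) AS connections
  FROM pg_stat_activity
 WHERE backend_type = 'client backend'
 GROUP BY client_addr, usename, application_name, state
 ORDER BY connections DESC;
"""


def summarize_connections(rows):
    """Count connections per (client_addr, usename, state).

    `rows` is an iterable of dicts, one per pg_stat_activity row.
    Returns a list of ((client_addr, usename, state), count) tuples,
    busiest source first. This helper is illustrative only.
    """
    counts = Counter(
        (row["client_addr"], row["usename"], row["state"]) for row in rows
    )
    return counts.most_common()
```

The SQL alone is probably enough when run from psql; the Python helper is only useful if the rows are collected by a script and the grouping needs to happen client-side.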
21:07 PM
lucifer
it's back to normal now i think
21:07 PM
mayhem
did you change anything?
21:07 PM
lucifer
nope
21:08 PM
mayhem
odd. wtf?
21:08 PM
lucifer
16824 | listenbrainz_ts | 1179026 | | | | | | | | 2023-11-02 20:28:19.08004+00 | 2023-11-02 20:42:10.394548+00 | 2023-11-02 20:42:10.394548+00 | 2023-11-02 20:42:10.394548+00 | Timeout | VacuumDelay | active | | 406250699 | autovacuum: VACUUM ANALYZE
21:08 PM
_timescaledb_internal._hyper_41_6573_chunk | autovacuum worker
21:08 PM
a very wild guess.
21:08 PM
mayhem
plausible.
21:08 PM
lucifer
autovacuum locked the listen table, leading to an accumulation of connections
21:09 PM
mayhem nods
21:09 PM
but then autovacuum timed out so it went back to normal again
21:09 PM
so went from immediate bad to long term bad
21:09 PM
monkey
Although LB is accessible, it shows that I haven't listened to anything on my profile
21:09 PM
mayhem
I wonder what spooked autovacuum.
21:09 PM
monkey: give it a moment to recover
21:09 PM
monkey
Oki
21:09 PM
lucifer
it couldn't complete within the time limit
21:09 PM
not sure why
21:10 PM
mayhem
all my listens are accounted for, but they are coming from navidrome
21:11 PM
lucifer
it's showing some listens now.
21:11 PM
(not for me, aerozol's profile)
21:14 PM
tomorrow, i think we should run `VACUUM VERBOSE listen` and it may list if there are any obvious problems.
21:14 PM
if not, let's upgrade PG/TS on gaga to latest and then look into tuning it.
21:15 PM
mayhem
good plan
21:18 PM
aerozol
lucifer: my ‘older’ button is not working now, I can’t check if the covers are fixed 🫣
21:18 PM
lucifer
aerozol: yup, it's related to the db issues.
21:18 PM
aerozol
“You have listened to 8 songs so far”
21:18 PM
uh oh
21:18 PM
lucifer
the older button that is.
21:18 PM
aerozol
Ah okay, you’re onto it. Thanks
21:18 PM
lucifer
covers are broken due to a different db issue
21:18 PM
which will take longer to fix
21:19 PM
aerozol
allgood
21:37 PM
21:37 PM
21:38 PM
mayhem
yerp
21:38 PM
maybe they also don't know about the "save to spotify" option?
21:38 PM
aerozol
Happy to ask follow up questions if wanted, otherwise happy to leave it as “something for users to live with” :)
21:39 PM
Since I don’t use Spotify I don’t have any idea what’s meant tbh, I assumed it was the player, but maybe not?
21:41 PM
mayhem
21:41 PM
this option exists on a playlist if the user has connected spotify
21:41 PM
I wonder if this person knows about it
21:42 PM
fivesheep joined the channel
21:42 PM
gaga is still under quite a load.
21:42 PM
aerozol
Right - I don’t even know if they are talking about radio, playing stuff from elsewhere on LB, or submitting listens from Spotify...
21:43 PM
mayhem
doesn't matter.
21:43 PM
it's brainzplayer flakiness that they are upset about. so do as we do and send playlists to spotify.
21:43 PM
lucifer
mayhem: i tried cancelling autovacuum but it seems it isn't going away that easily.
21:43 PM
aerozol
Okay, I’ll pass that on
21:44 PM
mayhem
lucifer: do you need to cancel it for each chunk? 🙄
21:44 PM
lucifer
yeah
21:44 PM
well, for the chunks it's currently running on at least
21:44 PM
20 or so
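Cancelling per chunk can be sketched as one `pg_cancel_backend()` call per autovacuum worker found in `pg_stat_activity`. This is a rough illustration, not the commands actually run here: the helper name `autovacuum_cancel_statements` is hypothetical, while `pg_cancel_backend(pid)` and the `backend_type` value `autovacuum worker` are standard PostgreSQL. Cancelled workers may simply be started again by the autovacuum launcher, which would fit the observation that it does not go away easily.

```python
def autovacuum_cancel_statements(rows):
    """Build one cancel statement per autovacuum worker.

    `rows` is an iterable of dicts with at least `pid` and `backend_type`,
    as read from pg_stat_activity. Rows for other backend types are skipped.
    This helper is illustrative only; it returns SQL text and runs nothing.
    """
    return [
        f"SELECT pg_cancel_backend({int(row['pid'])});"
        for row in rows
        if row["backend_type"] == "autovacuum worker"
    ]


# The pid below is the one from the pg_stat_activity row pasted earlier;
# the second row is a made-up client backend that must be left alone.
example_rows = [
    {"pid": 1179026, "backend_type": "autovacuum worker"},
    {"pid": 4242, "backend_type": "client backend"},
]
statements = autovacuum_cancel_statements(example_rows)
```

With the example rows, `statements` holds a single cancel statement for pid 1179026.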
21:44 PM
fivesheep
21:45 PM
lucifer
fivesheep: hi! yes, our db is currently under high load so some glitches on the site. the data is still there.
21:45 PM
mayhem
your listens are fine fivesheep. the listen counts got borked.
21:46 PM
fivesheep
Ah great thank you for the reassurance!!
21:49 PM
lucifer
21:49 PM
the second one looks redundant, i'll drop it
21:49 PM
fivesheep has quit
22:06 PM
fivesheep joined the channel
22:07 PM
lusciouslover joined the channel
22:12 PM
lusciouslover has quit
22:12 PM
fivesheep has quit
22:29 PM
kellnerd
22:29 PM
Scared LB users on the forums as well
22:32 PM
Maxr1998_
Thanks for the update & the hard work on resolving this! 🙂
22:32 PM
mayhem
22:34 PM
the data in the DB appears to be fine.
22:34 PM
time to restart labs api, methinks.
22:35 PM
ah no, that is not labs. huh, odd.
22:39 PM
well, all seems normal now, except for playcounts and top-recordings. the latter is not so important now.
22:41 PM
relaxoMob has quit
22:46 PM
monkey
mayhem: it's not just the play counts that are borked from a user point of view. no listen is shown that is older than 2h ago, so it looks to users as if they lost all their listens. See mine for example:
https://listenbrainz.org/user/mr_monkey/
22:46 PM
mayhem
i understand.
22:47 PM
but pagination is borked because of the play counts.
22:49 PM
monkey
22:50 PM
mayhem
it's all connected, monkey.
22:50 PM
petitminion joined the channel
22:50 PM
relaxoMob joined the channel
22:50 PM
monkey
So weird
23:00 PM
mayhem
listen counts, minimum timestamps and maximum timestamps are all calculated in the same process.
23:00 PM
and without min/max ts you get no working pagination.
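Why broken counts also break pagination can be sketched as follows. This is a toy model under stated assumptions, not ListenBrainz's actual code: the function names `listen_metadata` and `has_older_page` are hypothetical, and timestamps are plain integers.

```python
def listen_metadata(timestamps):
    """Compute listen count, minimum and maximum timestamp in one pass."""
    count, min_ts, max_ts = 0, None, None
    for ts in timestamps:
        count += 1
        if min_ts is None or ts < min_ts:
            min_ts = ts
        if max_ts is None or ts > max_ts:
            max_ts = ts
    return {"count": count, "min_ts": min_ts, "max_ts": max_ts}


def has_older_page(oldest_ts_on_page, metadata):
    """Offer an 'older' page only if listens exist before the current page.

    If the metadata is missing or stale (min_ts unknown or too recent),
    this returns False even though older listens are still in the database.
    """
    min_ts = metadata["min_ts"]
    return min_ts is not None and oldest_ts_on_page > min_ts
```

In this model, a failed metadata run leaves `min_ts` empty or too recent, so the "older" link is never offered even though no listen has been lost.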
23:08 PM
petitminion has quit
23:08 PM
relaxoMob has quit
23:11 PM
outsidecontext
What a shock seeing the LB data missing. I really had enough data loss shit this week. But glad to hear the listens themselves are fine. Wish you the best getting everything back up
23:19 PM
petitminion joined the channel
23:24 PM
relaxoMob joined the channel