in #metabrainz

1:18 AM
minimal has quit
2:31 AM
lucifer[m] joined the channel
2:31 AM
lucifer[m]

[@mayhem:chatbrainz.org](https://matrix.to/#/@mayhem:chatbrainz.org) last week it had run out of resources but today I checked and there is some other weird error. Spark is not able to start at all, I will try to reset the cluster.
2:59 AM
ApeKattQuest joined the channel
3:00 AM
LupinIII has quit
3:01 AM
HSOWA has quit
3:06 AM
HSOWA joined the channel
5:48 AM
pite has quit
7:15 AM
Kladky joined the channel
8:47 AM
reosarevok[m]

zas: MBS-11723 is still an issue, has the VM usage climbed again?
8:47 AM
BrainzBot

MBS-11723: WikiDocs 404 sometimes https://tickets.metabrainz.org/browse/MBS-11723
9:00 AM
reosarevok[m]

aerozol: does my comment in MBS-13809 make any sense to you or do you think it would rather be confusing? (ideally we'd separate the two but that's not trivial at all)
9:00 AM
BrainzBot

MBS-13809: There's no visual indication of open edits for a setlist https://tickets.metabrainz.org/browse/MBS-13809
9:17 AM
prout has quit
9:17 AM
prout joined the channel
9:31 AM
yvanzo[m]

mayhem: Wolf instance is up-to-date. I just restarted it to enable daily replication.
9:31 AM
mayhem[m]

thanks!
9:32 AM
zas[m]

reosarevok: I think we can close MBS-11723, everything seems fine now, and has been for a while.
9:33 AM
BrainzBot

MBS-11723: WikiDocs 404 sometimes https://tickets.metabrainz.org/browse/MBS-11723
9:33 AM
reosarevok[m]

I literally got a report of 404s overnight
9:33 AM
So it doesn't seem so?
9:33 AM
(that's why I brought it up - MBS-13813)
9:33 AM
BrainzBot

MBS-13813: How To page moved? https://tickets.metabrainz.org/browse/MBS-13813
9:58 AM
mayhem[m]

interesting: https://github.com/sjdonado/idonthavespotify
10:19 AM
monkey[m]

<mayhem[m]> "interesting: https://github.com..."; <- Interesting indeed. I'll note that they use the spotify API for searching on Spotify, but prefer to load the apple music web page and parse it for information. Says something about the apple music API, doesn't it?
10:44 AM
mayhem[m]

as always, yes.
10:59 AM
zas: ping
11:00 AM
yvanzo[m]

bitmap, reosarevok: I reported the issue about duplicate messages upstream.
11:00 AM
reosarevok[m]

Thanks
11:01 AM
yvanzo[m]

reosarevok: Are you available to check the CSP issue again?
11:01 AM
reosarevok[m]

Remind me what that was? But I'm available after lunch
11:03 AM
yvanzo[m]

It’s related to Weblate too.
11:03 AM
[OTHER-433](https://tickets.metabrainz.org/browse/OTHER-433)
11:03 AM
BrainzBot

OTHER-433: Weblate: Redirection to authentication provider not working
11:03 AM
reosarevok[m]

Aah, that
11:03 AM
Yes, I'm having lunch but I can check with you after that
11:09 AM
Maxr1998_ joined the channel
11:10 AM
Maxr1998 has quit
11:51 AM
Guest36 joined the channel
11:54 AM
Guest36 is now known as strawberryshaker
11:54 AM
strawberryshaker is now known as suvid
11:55 AM
suvid has quit
11:55 AM
suvid joined the channel
11:56 AM
suvid is now known as strawberryshaker
11:56 AM
strawberryshaker is now known as suvid
11:59 AM
suvid is now known as strawberry
11:59 AM
strawberry is now known as berrybhai
12:00 PM
berrybhai is now known as suvid
12:07 PM
yvanzo: back now
12:13 PM
suvid has quit
12:14 PM
Guest36 joined the channel
12:14 PM
Guest36 is now known as suvid
12:14 PM
suvid has quit
12:14 PM
suvid joined the channel
12:19 PM
yvanzo[m]

reosarevok: Can you still reproduce what you reported in your comment?
12:22 PM
reosarevok[m]

Let me see :) IIRC that was with a different laptop but I can fish it out
12:23 PM
suvid has quit
12:23 PM
Well, right now it seems down?
12:23 PM
So, not yet :)
12:23 PM
yvanzo[m]

No, I’m logged in.
12:23 PM
reosarevok[m]

Huh
12:23 PM
I get connection refused
12:23 PM
Let me change browser
12:24 PM
yvanzo[m]

What browser was it?
12:24 PM
reosarevok[m]

Chrome, but same from Firefox
12:25 PM
Weird, works on my phone 🤷‍♂️
12:25 PM
(including logout and login)
12:25 PM
yvanzo[m]

Please disable userscripts and open the console before logging in, for each browser.
12:26 PM
Your phone with what browser?
12:26 PM
reosarevok[m]

Whatever the default Android is, so I assume Chrome
12:26 PM
For Chrome on my laptop I was in incognito with no scripts, nothing in the console either
12:26 PM
Just immediately refused
12:27 PM
Checking the other laptop now (where I used to hit the error)
12:27 PM
yvanzo[m]

(I’m able to log in using FF but there is a message in the console.)
12:43 PM
reosarevok[m]

On my older laptop it loaded but I can still reproduce that issue in the old version of Chrome it has
12:45 PM
130.0.6723.116, same as here actually, still same error
12:45 PM
Hmm
12:45 PM
Now it worked
12:45 PM
yvanzo[m]

reosarevok: Please fill in the table I’ve just added to the ticket with the details of your different attempts.
12:45 PM
reosarevok[m]

After I unset the beta flag
12:46 PM
I wonder if it's a beta redirect issue then?
12:46 PM
Yeah
12:47 PM
If I set the beta cookie again, then it starts failing again
12:47 PM
yvanzo[m]

I confirm.
12:47 PM
reosarevok[m]

So it's about that
12:47 PM
yvanzo[m]

How come it still works on your phone?
12:48 PM
reosarevok[m]

I probably haven't set the cookie there
12:48 PM
(since I don't use my phone much for MB)
12:57 PM
I wonder if we should skip redirecting to beta for specific endpoints such as this
13:08 PM
minimal joined the channel
13:23 PM
yvanzo[m]

bitmap, reosarevok: I reported the beta/oauth issue separately at https://tickets.metabrainz.org/browse/MBS-13814
13:23 PM
BrainzBot

MBS-13814: OAuth2 violates content security policy when beta site preference is set
13:32 PM
reosarevok[m]

MonkeyPython: see if you can log in if you unset beta :) ^
13:33 PM
(IIRC you were also having some issues here)
13:35 PM
yvanzo[m]

bitmap, reosarevok: Stuck with the initially reported issue (email missing), forwarded it to you by mail.
13:57 PM
reosarevok[m]

zas: did you take the wiki down?
13:57 PM
zas[m]

I'm doing an upgrade right now
13:59 PM
I'll reboot, very short downtime. MySQL was upgraded, so the wiki was unavailable for a few seconds.
14:01 PM
done
14:01 PM
reosarevok[m]

Ah, ok, perfect
14:01 PM
Just funny timing then :D
14:02 PM
Thanks!
14:05 PM
mayhem[m]

zas: let me know when you have a minute or two, would like to discuss a few things.
14:06 PM
zas[m]

We can now
14:08 PM
mayhem[m]

we are adding a few endpoints for the purpose of monitoring some LB services.
14:08 PM
incoming queue, last stats generation, last playlists generation, last dump dates.
14:09 PM
https://api.listenbrainz.org/1/status/get-dump-...
14:09 PM
is one endpoint we already have.
14:09 PM
we would like to do two things:
14:10 PM
1. Set up alerts in Grafana so that we get a notification when these go outside accepted limits.
14:10 PM
2. Make a new Telegram group for LB services (stats, dumps, playlists, etc).
14:10 PM
because the other Telegram channels have too much stuff in them. we would like to have a dedicated LB channel.
14:11 PM
take a look at the data returned from the URL above -- the new endpoints are separate endpoints for each thing to check.
14:11 PM
I think separate endpoints make the Grafana rules easier? or would it be better to check all stats in one call?
14:11 PM
either way, all these calls are cached and recomputed only periodically.
14:14 PM
https://tickets.metabrainz.org/browse/LB-1673
14:14 PM
BrainzBot

LB-1673: Create a service alerts telegram group with the LB team members
14:14 PM
zas[m]

Hmmm, to me it would be much easier to have a "metrics" endpoint (additionally), providing data directly in a usable format. If we want to use telegraf to collect them, see https://github.com/influxdata/telegraf/blob/mas...
14:14 PM
https://docs.influxdata.com/telegraf/v1/configu... for an example
14:15 PM
mayhem[m]

ok, all stats in one endpoint then?
14:15 PM
and do you want unix epoch timestamps or human readable timestamps?
14:17 PM
we don't need to keep these stats over time -- all we really need are alerts.
14:17 PM
zas[m]

And/or we could have a Prometheus-compatible endpoint -> https://signoz.io/guides/prometheus-metrics-end... for a simple example
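(Editor's sketch, not part of the conversation: a rough idea of what a Prometheus-compatible response body could look like, assuming each value is exposed as a gauge in the Prometheus text exposition format. The function name `render_prometheus` and the metric names are hypothetical, not taken from ListenBrainz.)

```python
def render_prometheus(metrics):
    """Render a dict of {metric_name: number} as Prometheus text-format gauges.

    Metric names here are hypothetical placeholders, not real LB metrics.
    """
    lines = []
    for name, value in sorted(metrics.items()):
        # One TYPE comment per metric, followed by the sample line itself.
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The text format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_prometheus({"lb_incoming_queue_size": 12}), end="")
```

A scraper would fetch this body over HTTP at whatever path the service chooses to expose; the path and scrape interval are left open here.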
14:18 PM
mayhem[m]

in an ideal world, for not keeping stats over time, but only notification, what should the data format be?
14:19 PM
zas[m]

Yes; but alerts are based on those stats anyway, basically each metric has its own timestamp (and format can be defined in telegraf, usually unix timestamp), see the example -> https://docs.influxdata.com/telegraf/v1/configu...
14:19 PM
Look at this output -> https://gbfs.citibikenyc.com/gbfs/en/station_st...
14:20 PM
mayhem[m]

let me mock up the proposed JSON. one sec.
14:22 PM
zas[m]

In the example above, they use "last_reported" field as the timestamp
14:23 PM
mayhem[m]

[... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
14:23 PM
where fetched indicates the timestamp when this data was fetched. I'm adding this since the data is cached.
14:24 PM
zas[m]

values can be numbers, or booleans, details in -> https://docs.influxdata.com/telegraf/v1/data_fo...
14:24 PM
mayhem[m]

I'll only need ints for this case. what do you think of the proposed format?
14:27 PM
@zas ^^
14:27 PM
zas: ^^
14:30 PM
zas[m]

last_updated is a timestamp too? What's the purpose? It would probably be more convenient to have a duration since last update instead if the goal is to alert when this is getting too old.
14:31 PM
mayhem[m]

last updated is for when the service last updated the data. fetched is when the data was fetched, since it could be cached.
14:31 PM
do you have seconds elapsed instead?
14:36 PM
* do you want to have seconds
14:36 PM
zas[m]

Since we have a timestamp in fetched, the field can be seconds_since_last_update or the like; it will drop to 0 when just updated and increase over time, and we can have a threshold for the alert ("alert if not updated for N seconds"). It's not very convenient to work with timestamps in fields, since that requires extra calculations (and sometimes that makes things overcomplicated).
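(Editor's sketch, not part of the conversation: one way the suggestion above could be turned into a payload, assuming the service knows the Unix epoch time of each last update. The function name `build_metrics`, the per-service key naming and the sample service names are hypothetical; only `fetched` and `seconds_since_last_update` come from the discussion.)

```python
import json
import time


def build_metrics(last_updates, now=None):
    """Convert {service: last_update_epoch} into elapsed-seconds fields.

    `fetched` records when the values were computed, since the response
    may be served from a cache.
    """
    now = int(time.time()) if now is None else int(now)
    payload = {"fetched": now}
    for name, last_update in last_updates.items():
        # Clamp at 0 so small clock skew never yields a negative duration.
        payload[f"{name}_seconds_since_last_update"] = max(0, now - int(last_update))
    return payload


if __name__ == "__main__":
    sample = build_metrics({"stats": 1000, "dumps": 400}, now=1600)
    print(json.dumps(sample, indent=2))
```

With values in this shape, an alert rule only has to compare each field against a threshold, with no timestamp arithmetic on the monitoring side.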
14:37 PM
mayhem[m]

ok, that is all I needed to know.
14:37 PM
zas[m]

btw, there are plenty of examples in https://github.com/influxdata/telegraf/tree/mas...
14:37 PM
mayhem[m]

[... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
14:39 PM
zas[m]

Yes, but you still need a timestamp somewhere, it can be at an upper level like in https://github.com/influxdata/telegraf/blob/mas...
14:40 PM
mayhem[m]

what does that timestamp signify? current time?
14:41 PM
{... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
14:41 PM
zas[m]

The time the metric is associated with
14:41 PM
mayhem[m]

k
14:46 PM
zas[m]

for a different Telegram channel, I guess we'll have to configure a new contact point, see https://grafana.com/docs/grafana/latest/alertin... & https://grafana.com/docs/grafana/latest/alertin...
14:49 PM
We'll need the Chat ID for the matching Telegram channel, that's all (I think). And then configure alerts accordingly.
14:50 PM
mayhem[m]

zas: how do I indicate an error? Let's assume that there are 4 metrics to fetch and I was unable to fetch a given metric, let's say it timed out. what do I return for that metric? or don't return it?