#metabrainz

15:44 PM
pite joined the channel

2025-03-25 08445, 2025

15:50 PM
monkey[m]

zas: Hello! Hope you are not too busy battling AI vermin, if you have a moment I have a question regarding a Grafana alert.... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/KYxbdoKEvZqxPwRnDSGEQONU>)

2025-03-25 08413, 2025

15:51 PM
d4rk-ph0enix has quit

2025-03-25 08435, 2025

15:51 PM
d4rk-ph0enix joined the channel

2025-03-25 08409, 2025

15:55 PM
zas[m]

Why doesn't it provide data sometimes? I mean, if the reason is the service doesn't run we expect an alert, right?

2025-03-25 08453, 2025

15:55 PM
monkey[m]

Transient issues that resolve themselves

2025-03-25 08436, 2025

15:56 PM
monkey[m]

We have other alerts for non-responding website and API, so I'm not expecting these stats alerts to trigger when connection is temporarily lost.

2025-03-25 08455, 2025

15:56 PM
monkey[m]

After a delay of 10 minutes, then yes it would make sense to trigger.

2025-03-25 08418, 2025

15:57 PM
zas[m]

There's already a 5m pending period for this alert, we can increase this

2025-03-25 08430, 2025

15:57 PM
monkey[m]

Ah, yes please!

2025-03-25 08457, 2025

15:57 PM
monkey[m]

Is that a 5m delay on triggering the alert in any case, or just for loss of data (just out of curiosity)

2025-03-25 08459, 2025

15:57 PM
zas[m]

10m or more?

2025-03-25 08410, 2025

15:58 PM
jasje[m]

Any contributors who are cooking a GSoC proposal towards ListenBrainz Android project should take a look at ideas page again. Eased some things and added more context.

2025-03-25 08416, 2025

15:58 PM
zas[m]

Any alert for this alert rule

2025-03-25 08422, 2025

15:58 PM
zas[m]

We can't dissociate them

2025-03-25 08458, 2025

15:58 PM
monkey[m]

That's what I figured. Thanks for confirming

2025-03-25 08412, 2025

15:59 PM
monkey[m]

To me this reads as a 1m delay, not 5. Am I looking at the wrong thing?

2025-03-25 08415, 2025

15:59 PM
monkey[m] uploaded an image: (34KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/XKEAkQFpjiAZCDNZEtXQOqpW/image.png >

2025-03-25 08430, 2025

16:00 PM
zas[m]

That's the time the rule is evaluated, but the pending period is the time before it actually alerts (if the state is the same)

2025-03-25 08434, 2025

16:00 PM
zas[m] uploaded an image: (46KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/wQKnfFxSmudIESOxhainIIZZ/image.png >

2025-03-25 08407, 2025

16:01 PM
zas[m]

So basically it evaluates the state every minute, but waits for 5 minutes to see if it was transient or not

2025-03-25 08425, 2025

16:01 PM
zas[m]

It limits false alerts

2025-03-25 08453, 2025

16:01 PM
zas[m]

We can evaluate the state less often too

2025-03-25 08406, 2025

16:02 PM
zas[m]

but then the pending period has to be longer

2025-03-25 08439, 2025
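
[Editor's sketch] The distinction zas describes above (the rule is evaluated every minute, but a notification only goes out once the condition has held for the whole pending period) can be illustrated with a small simulation. This is a simplified model and not Grafana's implementation: the `AlertRule` class, the `simulate` function and the state labels are assumptions made for the example, and the 1m/5m values are the ones mentioned in the conversation.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    # Values taken from the discussion; the field names are hypothetical.
    eval_interval_s: int = 60    # rule is evaluated every minute
    pending_period_s: int = 300  # condition must hold this long before notifying


def simulate(rule, breaches):
    """Return one state label per evaluation tick.

    `breaches` is a list of booleans, one per evaluation: True means the
    alert condition (for example "no data") was met at that tick.
    """
    states = []
    breach_started = None  # time at which the current breach began
    for tick, breached in enumerate(breaches):
        now = tick * rule.eval_interval_s
        if not breached:
            # Condition cleared: a transient issue never reaches "Alerting".
            breach_started = None
            states.append("Normal")
            continue
        if breach_started is None:
            breach_started = now
        held_for = now - breach_started
        # Only after the pending period has elapsed would a notification be sent.
        states.append("Alerting" if held_for >= rule.pending_period_s else "Pending")
    return states


if __name__ == "__main__":
    rule = AlertRule()
    print(simulate(rule, [True, True, True, False]))  # transient blip
    print(simulate(rule, [True] * 7))                 # sustained problem
```

In this model a three-minute blip stays in "Pending" and then returns to "Normal" without notifying anyone, which is the false-alert reduction discussed here. Evaluating less often only makes sense if the pending period is lengthened to match.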

16:02 PM
monkey[m]

I was looking at the pending period, it seemed to me to be set to 1m when I opened the alert edit page

2025-03-25 08448, 2025

16:02 PM
zas[m]

If you look carefully Grafana shows alerts as activated as "Pending" when it happens, notifications are sent at the end of the pending period

2025-03-25 08403, 2025

16:03 PM
zas[m]

s/as//

2025-03-25 08404, 2025

16:03 PM
monkey[m]

Anyway, thanks for the assist. Let's see how it goes with 5m delay.

2025-03-25 08417, 2025

16:04 PM
zas[m]

Wait, which alert did you copy from? Because I have 5m for stats

2025-03-25 08428, 2025

16:04 PM
monkey[m]

<zas[m]> "There's already a 5m pending..." <- This is the bit I was confused about. I can only see 1m pending period, nothing that says it was set to 5m

2025-03-25 08429, 2025

16:04 PM
zas[m]

https://stats.metabrainz.org/alerting/fe4fa6v0gnj…

2025-03-25 08437, 2025

16:04 PM
monkey[m]

I was on https://stats.metabrainz.org/alerting/debdj8a19ue…

2025-03-25 08451, 2025

16:04 PM
monkey[m]

I see, those are the sitewide

2025-03-25 08418, 2025

16:05 PM
zas[m]

You can change pending period to 5m for it then

2025-03-25 08419, 2025

16:05 PM
monkey[m]

OK, and it wasn't configured with the same delay. Mystery solved :)

2025-03-25 08419, 2025

16:05 PM
monkey[m]

I'll change that for the sitewide stats alerts

2025-03-25 08446, 2025

16:05 PM
monkey[m]

Then I think it will work fine, the other (non-sitewide) alerts have been behaving better

2025-03-25 08457, 2025

16:05 PM
zas[m]

The pending period limits the number of notifications if states change too quickly$

2025-03-25 08404, 2025

16:06 PM
zas[m]

s/$/./

2025-03-25 08420, 2025

16:06 PM
monkey[m]

Yep, that's what I was looking for.

2025-03-25 08425, 2025

16:07 PM
monkey[m]

Thank you!

2025-03-25 08440, 2025

16:07 PM
zas[m]

np :)

2025-03-25 08429, 2025

16:13 PM
reosarevok[m]

Weren't we supposed to meet around now? :)

2025-03-25 08437, 2025

16:16 PM
zas[m]

yup

2025-03-25 08408, 2025

16:18 PM
reosarevok[m]

mayhem, bitmap, julian45 @julian45:julian45.net

2025-03-25 08414, 2025

16:18 PM
mayhem[m] is here despite being on the phone with the bank

2025-03-25 08409, 2025

16:20 PM
reosarevok[m]

So, any updates or new info?

2025-03-25 08419, 2025

16:21 PM
julian45[m]

none from me that haven't already been discussed out-of-band

2025-03-25 08425, 2025

16:21 PM
zas[m]

We have to decide what to do, it was suggested to use Anubis, does it look actionable? Are there any objections?

2025-03-25 08435, 2025

16:21 PM
mayhem[m]

I wrote a proposal with my ideas, but didn't get a lot of feedback, only from julian45 who made a number of good arguments as to why it's not a great idea.

2025-03-25 08420, 2025

16:22 PM
zas[m]

We had discussions about Cloudflare, and potential conflicts with our policies, this should be investigated too (in case we decide to move to such service, cloudflare or similar)

2025-03-25 08430, 2025

16:22 PM
mayhem[m]

I feel so so about anubis. it seems a bit heavy handed, so I wish we could find out more about what level of effort these people are willing to go through to keep scraping us.

2025-03-25 08454, 2025

16:22 PM
mayhem[m]

the recent cloudflare outages make me really not like that option much.

2025-03-25 08410, 2025

16:23 PM
julian45[m]

anubis looks actionable IMO as a first line/first attempt, esp since docs indicate policy is configurable to allow, e.g., legit scrapers like google while challenging others

2025-03-25 08432, 2025

16:23 PM
julian45[m]

https://anubis.techaro.lol/docs/admin/policies

2025-03-25 08432, 2025

16:24 PM
reosarevok[m]

I think heavy handed is a good start given the situation

2025-03-25 08446, 2025

16:24 PM
mayhem[m]

if we implement anubis, can we have it only on pages that contain data that is being scraped?

2025-03-25 08450, 2025

16:24 PM
reosarevok[m]

(and we could lower the heavy-handedness if we get things under control in other ways)

2025-03-25 08453, 2025

16:24 PM
julian45[m]

i do worry that it could potentially be annoying for some users who, e.g., disable js by default in their browsers, but those kinds of folks should be willing to carve out exceptions

2025-03-25 08410, 2025

16:25 PM
mayhem[m]

e.g. style guides require no anubis

2025-03-25 08412, 2025

16:25 PM
zas[m]

mayhem: about your proposal, I think we should rely on existing tools first if possible (not reinventing the wheel), but I don't totally rule it out, because we might find limitations in third party tools we don't have with our own ones.

2025-03-25 08421, 2025

16:25 PM
reosarevok[m]

Those users could possibly get in touch with us and get exceptions added for them

2025-03-25 08424, 2025

16:25 PM
julian45[m]

mayhem[m]: this kind of goes back to the separation of API requests to a subdomain need that was discussed yesterday

2025-03-25 08427, 2025

16:25 PM
reosarevok[m]

(re: no-js people)

2025-03-25 08407, 2025

16:26 PM
julian45[m]

reosarevok[m]: if they aren't able to configure their clients to make exceptions for us, sure - chances are they would need js to use our site(s) anyway, no?

2025-03-25 08423, 2025

16:26 PM
mayhem[m]

julian45[m]: that seems conflated to me. we can partition the URL space for web pages without ever considering the API.

2025-03-25 08445, 2025

16:26 PM
julian45[m]

ah i see

2025-03-25 08402, 2025

16:27 PM
zas[m]

They (may) hit ANY page, but the fact is those with MB data have a much higher cost for us (they hit backends and db)

2025-03-25 08407, 2025

16:27 PM
reosarevok[m]

julian45: fair, we require JS for editing anyway, just not for reading - but the kind of people we'd possibly make exceptions for probably edit so

2025-03-25 08423, 2025

16:27 PM
julian45[m]

if needed, per the policy doc i linked, i think we can tell anubis to let certain path regexes through but not others

2025-03-25 08429, 2025

16:27 PM
mayhem[m]

julian45[m]: still, your point is valid.

2025-03-25 08442, 2025

16:27 PM
mayhem[m]

but probably not needed right this second.

2025-03-25 08402, 2025

16:28 PM
reosarevok[m]

Would something like anubis interfere with external seeding?

2025-03-25 08437, 2025

16:28 PM
reosarevok[m]

(it's ok if it does temporarily given the circumstances, but making sure we don't need to make MBS changes to support it)

2025-03-25 08443, 2025

16:28 PM
julian45[m]

i doubt it, but then again external seeding is something we allow that many implementers (e.g., GNOME project gitlab) might not

2025-03-25 08403, 2025

16:29 PM
julian45[m]

so i would suggest deploying on test or beta to figure that out before prod

2025-03-25 08420, 2025

16:29 PM
julian45[m]

* i doubt it, but then again external seeding is something we have as part of our use cases that many implementers (e.g., GNOME project gitlab) might not

2025-03-25 08437, 2025

16:30 PM
reosarevok[m]

zas: is beta also being hit?

2025-03-25 08455, 2025

16:30 PM
zas[m]

Yes, it was first

2025-03-25 08457, 2025

16:30 PM
reosarevok[m]

(so, if we put this in front of beta first, would it actually teach us if it will help)

2025-03-25 08400, 2025

16:31 PM
reosarevok[m]

Ok

2025-03-25 08413, 2025

16:31 PM
zas[m]

This is how I discovered the problem, beta containers were eating a lot of resources suddenly

2025-03-25 08448, 2025

16:31 PM
bitmap[m]

reosarevok[m]: it might interfere with release editor seeding since that requires POST data which can't be redirected

2025-03-25 08447, 2025

16:32 PM
reosarevok[m]

Hopefully those can be let through then since those should never match the kind of hits causing issues

2025-03-25 08407, 2025

16:33 PM
zas[m]

But seeding requires being logged in, so we can just skip any check for those

2025-03-25 08432, 2025

16:33 PM
julian45[m]

zas[m]: it usually forces logout, then login though, right?

2025-03-25 08446, 2025

16:33 PM
reosarevok[m]

Ok, that's another thing I asked several times but I think never got an answer for: can we separate logged in from not logged in queries?

2025-03-25 08450, 2025

16:33 PM
reosarevok[m]

For anubis

2025-03-25 08400, 2025

16:34 PM
reosarevok[m]

And run it only on logged out for now

2025-03-25 08426, 2025

16:34 PM
mayhem[m]

I propose that we try anubis on mb.org's data pages and see what happens.

2025-03-25 08427, 2025

16:34 PM
mayhem[m]

I much prefer this option over cloudflare.

2025-03-25 08427, 2025

16:34 PM
mayhem[m]

who objects to this suggestion?

2025-03-25 08439, 2025

16:34 PM
mayhem[m]

(sorry was disconnected for a bit, back now)

2025-03-25 08442, 2025

16:34 PM
zas[m]

bitmap suggested an internal header set by backends for that, so we can get this info on gateways at least

2025-03-25 08450, 2025

16:34 PM
reosarevok[m]

I'd say on beta.mb.org for now, but I'd agree otherwise

2025-03-25 08453, 2025

16:34 PM
julian45[m]

reosarevok[m]: unfortunately not sure

2025-03-25 08403, 2025

16:35 PM
mayhem[m]

mayhem[m]: this may now be out of date. heh.

2025-03-25 08412, 2025

16:35 PM
julian45[m]

* not sure i.r.t. anubis

2025-03-25 08422, 2025

16:35 PM
reosarevok[m]

That'd also allow us to play with any changes we need to make things better on the MBS side

2025-03-25 08428, 2025

16:35 PM
reosarevok[m]

Before we put them in prod

2025-03-25 08448, 2025

16:35 PM
bitmap[m]

julian45[m]: yeah, IIRC the login cookies are not available to the request MB receives

2025-03-25 08404, 2025

16:36 PM
bitmap[m]

but if it's only restricted to data pages then not an issue

2025-03-25 08430, 2025

16:36 PM
reosarevok[m]

Yeah, I guess if we can ignore /edit pages it seems good

2025-03-25 08457, 2025

16:36 PM
reosarevok[m]

Well, /edit /add /create etc

2025-03-25 08415, 2025

16:37 PM
reosarevok[m]

But we could easily come up with a list

2025-03-25 08424, 2025

16:37 PM
julian45[m]

reosarevok[m]: which seems doable but i would like someone to double check the doc page i linked to make sure i'm interpreting correctly

2025-03-25 08443, 2025

16:37 PM
julian45[m]

reosarevok[m]: great, because policy for anubis is configured by json file anyway per docs

2025-03-25 08426, 2025

16:38 PM
reosarevok[m]

Well, https://anubis.techaro.lol/docs/admin/policies/ the docs have

2025-03-25 08429, 2025

16:38 PM
reosarevok[m]

{... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/jJwopxmBmSAzwBbgUVbNzSrO>)

2025-03-25 08447, 2025

16:38 PM
reosarevok[m]

I don't see why it would not work for other things than robots.txt :)

2025-03-25 08458, 2025

16:38 PM
julian45[m]

exactly

2025-03-25 08422, 2025

16:39 PM
julian45[m]

just wanted to be sure i wasn't the only one reaching that conclusion from docs

2025-03-25 08432, 2025

16:41 PM
reosarevok[m]

Ok, this looks like it should work - so sysadmin team works on setting up anubis, mbs team figures out what paths to allow (could include all of /ws/2 as well for now AFAICT), we reconvene and let it loose on test first, then beta if nothing is horribly broken?

2025-03-25 08401, 2025
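
[Editor's sketch] The plan above (let some paths through untouched and challenge the rest) boils down to an ordered list of path patterns with a default action. The snippet below only models that idea in plain Python. It does not reproduce Anubis's policy file format, which should be taken from the docs page linked earlier, and the rule list, the `decide` function and the example paths are illustrative assumptions rather than the list the MBS team would actually choose.

```python
import re

# Hypothetical rules, checked in order; the first matching pattern wins.
# The paths are examples based on the ones mentioned in the discussion.
RULES = [
    ("allow", re.compile(r"^/ws/2(/|$)")),              # web service requests
    ("allow", re.compile(r"^/robots\.txt$")),           # crawler instructions
    ("allow", re.compile(r"/(edit|add|create)(/|$)")),  # editing and seeding pages
]

# Anything not explicitly allowed gets the proof-of-work challenge.
DEFAULT_ACTION = "challenge"


def decide(path):
    """Return the action for a request path (query string already removed)."""
    for action, pattern in RULES:
        if pattern.search(path):
            return action
    return DEFAULT_ACTION


if __name__ == "__main__":
    for example in ("/ws/2/artist/abc", "/release/abc/edit", "/artist/abc"):
        print(example, "->", decide(example))
```

Keeping the allow rules first and the challenge as the default means a path that nobody thought of is challenged rather than silently exempted, which seems the safer failure mode while testing on test and beta.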

16:42 PM
bitmap[m]

"Anubis uses a multi-threaded proof of work check to ensure that users browsers are up to date and support modern standards." not sure what they mean by the last past (since we support older browsers too)

2025-03-25 08450, 2025

16:42 PM
zas[m]

I wonder if Anubis is able to handle our traffic too. There's no number.

2025-03-25 08459, 2025

16:42 PM
zas[m]

"Anubis has very minimal system requirements. I suspect that 128Mi of ram may be sufficient for a large number of concurrent clients. Anubis may be a poor fit for apps that use WebSockets and maintain open connections, but I don't have enough real-world experience to know one way or another."

2025-03-25 08400, 2025

16:43 PM
reosarevok[m]

If we need to limit support for some older browsers temporarily while we find better options, that's a sacrifice that seems sensible to me

2025-03-25 08406, 2025

16:43 PM
bitmap[m]

s/past/part/

2025-03-25 08420, 2025

16:43 PM
reosarevok[m]

zas[m]: Only one way to find out?

2025-03-25 08444, 2025

16:43 PM
zas[m]

Well, we can conduct a test on test.mb (...)

2025-03-25 08455, 2025

16:43 PM
zas[m]

And evaluate pros & cons after that

2025-03-25 08433, 2025

16:44 PM
reosarevok[m]

It seems worth a try compared with what you are having to spend time doing now

2025-03-25 08456, 2025

16:44 PM
reosarevok[m]

Worst case scenario, we know we need to keep doing the same and look into cloudflare or our own version

2025-03-25 08431, 2025

16:45 PM
zas[m]

Also I wonder how it scales, need to check that

2025-03-25 08402, 2025

16:46 PM
lucifer[m]

also, https://github.com/TecharoHQ/anubis/issues/34

2025-03-25 08422, 2025

16:47 PM
zas[m]

Also it might be tricky to insert in our proxy chain ...

2025-03-25 08401, 2025

16:49 PM
zas[m]

So, let's try to deploy it for test.mb at least

2025-03-25 08443, 2025

16:49 PM
zas[m]

About ws / website separation, do we agree to move to api.mb a bit faster?

2025-03-25 08422, 2025

16:50 PM
zas[m]

1) ensure it works 2) update docs & notify users 3) redirects if possible

2025-03-25 08435, 2025

16:51 PM
zas[m]

How long do we need to deprecate mb.o/ws/ ? Years.

2025-03-25 08448, 2025

16:51 PM
mayhem[m]

zas[m]: 10!

2025-03-25 08452, 2025

16:51 PM
reosarevok[m]

As long as we don't entirely break the non api.mb version, it seems fine - even if it's slowed down

2025-03-25 08403, 2025

16:52 PM
mayhem[m]

and people still complained that we yanked the service "without notice"

2025-03-25 08410, 2025

16:54 PM
zas[m]

@bitmap Is moving to api.mb a problem for MB server?

2025-03-25 08417, 2025

16:54 PM
lucifer[m]

i think throttling redirects from ws to api would annoy at least active users and make them migrate.

2025-03-25 08459, 2025

16:54 PM
bitmap[m]

zas[m]: no, it doesn't care about which domain it's being served from as long as it's configured properly

2025-03-25 08411, 2025

16:55 PM
zas[m]

ok, perfect.

2025-03-25 08436, 2025

16:55 PM
zas[m]

So let's check if it's properly configured this week (I'll set it up)

2025-03-25 08458, 2025

16:55 PM
bitmap[m]

I'm not sure we can redirect mb.org/ws/ requests without breaking everything but slowly throttling it more might be a good incentive to switch

2025-03-25 08407, 2025

16:57 PM
zas[m]

I guess there's no problem with GET/HEAD requests right?

2025-03-25 08402, 2025

16:58 PM
zas[m]

Also for beta & test, api.test.mb and api.beta.mb ? or?

2025-03-25 08426, 2025

16:58 PM
bitmap[m]

I expect most clients will follow redirects properly, but can't be certain... only for data submission would it definitely be a problem

2025-03-25 08408, 2025
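
[Editor's sketch] The point made above is that GET/HEAD requests to the old /ws/ paths can probably be redirected, while data submission cannot be redirected safely. A gateway-side decision along those lines might look like the following. Nothing here was decided in the conversation: the `plan_legacy_request` function, the returned labels and the `api.example.org` host are placeholders, and throttling is left out.

```python
from urllib.parse import urlsplit, urlunsplit

# Methods that clients are generally expected to re-issue after a redirect.
SAFE_METHODS = {"GET", "HEAD"}

# Placeholder host; the real API hostname is not assumed here.
API_HOST = "api.example.org"


def plan_legacy_request(method, url):
    """Decide how to handle a request that may target a legacy /ws/ path.

    Returns ("redirect", new_url) for safe methods on /ws/ paths and
    ("serve", url) for everything else, including data submission.
    """
    parts = urlsplit(url)
    if not parts.path.startswith("/ws/"):
        return ("serve", url)
    if method.upper() in SAFE_METHODS:
        # Keep scheme, path and query; only the host changes.
        target = urlunsplit((parts.scheme, API_HOST, parts.path, parts.query, ""))
        return ("redirect", target)
    # POST and other submissions are served in place to avoid losing the body.
    return ("serve", url)


if __name__ == "__main__":
    print(plan_legacy_request("GET", "https://example.org/ws/2/artist/x?fmt=json"))
    print(plan_legacy_request("POST", "https://example.org/ws/2/release/?client=demo-1.0"))
```

Whether real clients follow such redirects would still need checking, as noted in the conversation.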

16:59 PM
bitmap[m]

zas[m]: not sure, does LB use a specific layout already?

2025-03-25 08434, 2025

16:59 PM
mayhem[m]

yes

2025-03-25 08448, 2025

16:59 PM
mayhem[m]

https://test-api.listenbrainz.org/1/status/servic…

2025-03-25 08451, 2025

16:59 PM
lucifer[m]

api., beta-api., test-api.

2025-03-25 08459, 2025

16:59 PM
mayhem[m]

that.

2025-03-25 08444, 2025

17:00 PM
zas[m]

ok

2025-03-25 08410, 2025

17:01 PM
zas[m]

let's stick to that then

2025-03-25 08423, 2025

17:01 PM
zas[m]

I'll configure everything tomorrow