should check to make sure that nginx is sending an X-Forwarded-Proto or similar header so you can check it
(it's non-standard, but most servers set it to the string https for https requests)
bitmap
yeah, I see a line for x-forwarded-proto on the load balancer
kepstin_
the other option would be to do the redirects in nginx rather than the application, but that won't work if you're doing it only for logged in users.
Gentlecat
why not?
kepstin_
hmm, well, I suppose it could if you're setting a cookie for logged in users
and set it up to always redirect the login page, of course
would be an if at the top level of the server block containing a rewrite, plus a separate rewrite, possibly inside a location block, to handle the login page
shouldn't be hard, actually.
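A rough sketch of what kepstin_ describes might look like the following; the cookie name (`logged_in`) and upstream name are assumptions, not anything from the actual config:

```nginx
server {
    listen 80;

    # assumed cookie the app sets for logged-in users:
    # redirect them unconditionally to https
    if ($cookie_logged_in) {
        return 301 https://$host$request_uri;
    }

    # always redirect the login page itself
    location = /login {
        return 301 https://$host$request_uri;
    }

    location / {
        proxy_pass http://app_backend;
    }
}
```

Using `return` inside the `if` avoids the usual "if is evil" pitfalls that come with putting other directives in an `if` block.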
bitmap
hm, it looks like we already rewrite all urls to https in the nginx config?
ianmcorvidae
yeah, I assume what's happening here is
kepstin_
ah, so it's unconditionally https?
ianmcorvidae
a.) nginx is redirecting it
kepstin_
that should be fine, unless the app server is also redirecting it :)
yeah, when running with the nginx redirector, you'll want to disable that. One way would be to hook up the X-Forwarded-Proto header so it knows the URL is already https; the other would be to just remove the decorator.
ianmcorvidae
you should ensure request.is_secure is appropriately set based on X-Forwarded-Proto and/or X-MB-https, both of which we set (the former to 'https', the latter to 'on')
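On the nginx side, forwarding those headers to the app server is a couple of `proxy_set_header` lines. This is only a sketch using the header names and values mentioned above (`X-MB-https` is MusicBrainz-specific):

```nginx
# inside the https server block
location / {
    proxy_set_header X-Forwarded-Proto $scheme;  # 'https' here
    proxy_set_header X-MB-https on;              # MB-specific flag
    proxy_pass http://app_backend;
}
```

The app can then treat the request as secure whenever either header carries the expected value.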
the 'derive.php' tasks are the things that generate thumbnails, of course
kepstin_
hmm. oh, do you have to be logged in for that?
ianmcorvidae
I guess for a non-admin that page doesn't show the "server readonly -- tasks waiting for harddrive fix" status, but still
probably
KillDaBOB
hm. one set of images i just uploaded is showing up fine, but the previous few sets i uploaded aren’t. that’s just strange. but maybe it is just a backlog.
kepstin_
yeah, they have some sort of queue system to do processing of uploads, and it can get backlogged.
ianmcorvidae
theoretically our side of the queue can also get 503d and that can cause problems, which is annoying to check from here so I won't, but :P
KillDaBOB
heh
ianmcorvidae
(also it takes a bit, since it retries a few times first)
KillDaBOB
fair enough
kepstin_
ah, that item history thing is pretty cool
kepstin_ found his archive login :)
ianmcorvidae
one thing to check is if the archive.org item page (the /details/ one) has an index.json
that's what our queue does: it uploads that file
which at least at some points in history has had to be there before the IA would derive the 250/500px thumbnails
KillDaBOB
yeah, that’s what the problem was once before when this was happening. there was a MB server restart and that process didn’t get restarted with it or just didn’t start up. something like that.
but the json file is present on the releases i’m having problems with, so it’s not that.
ianmcorvidae
yeah, probably just backlog at the IA
or a broken hard drive needing replacement (which makes servers readonly)
kepstin
has nginx reverse-proxying itself in order to stack a cache in front of a ratelimiter
ianmcorvidae
hehe :)
arguably we should just use nginx's ratelimiting ourselves
but our rules are rather complex so it's been nice having custom code for it
(especially when you include the stats handling -- though probably that could be done by postprocessing a log)
kepstin
the nginx one is kind of interesting: the way I have it set up, it'll actually queue up and delay requests up to a point rather than rejecting them immediately.
ianmcorvidae
does that tie up nginx worker processes?
I guess probably not, nginx is usually smart about that
kepstin
no, nginx is internally event-based
the worker processes are just there to allow it to scale to multiple cores
ianmcorvidae
right, yeah
how far will it queue/delay rather than failing (and how does it fail? 503/429/something else?)
kepstin
the queue depth is configurable via the "burst" parameter, and the failure status is configurable too; it defaults to 503.
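Concretely, that maps onto nginx's `limit_req` module roughly like this (zone name, rate, and paths here are illustrative, not from any real config):

```nginx
http {
    # one zone: key, shared-memory size, steady-state rate
    limit_req_zone $binary_remote_addr zone=perip:10m rate=1r/s;

    server {
        location /ws/ {
            # queue up to 10 excess requests, delaying them
            # to the zone rate; beyond that, reject
            limit_req zone=perip burst=10;
            limit_req_status 429;   # default is 503
            proxy_pass http://backend;
        }
    }
}
```

`limit_req_status` is what answers the "503/429/something else?" question above: the rejection code is whatever you configure.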
ianmcorvidae
cool
I should try to do our ratelimiting/complexity thereof in nginx sometime, presumably
kepstin
the ratelimiting is key-based; so the trick is just setting up the key somehow.
ianmcorvidae
figuring out how to set the key would be the hard part, making it something other than a chain of gross ifs
yeah
kepstin
in mbjs, the key is $server_name; the whole thing has to share a single ratelimit key, so that was just a static string that was available.
you'll note that i'm also correctly setting the user agent ;)
ianmcorvidae
hehe
kepstin
anyways, the goal of this is that the mbjs page itself doesn't actually need a ratelimiter, since cached content will be served with no limit, and nginx will automatically delay requests for uncached content.
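A minimal sketch of the self-proxying setup kepstin describes, with a cache stacked in front of the limiter. All names (cache path, zone names, ports, upstream host, user agent string) are assumptions for illustration:

```nginx
proxy_cache_path /var/cache/nginx/mbjs keys_zone=mbjs_cache:10m;

# static key: the whole site shares one ratelimit bucket
limit_req_zone $server_name zone=mbjs:1m rate=5r/s;

# front server: serves from cache, no limit on hits
server {
    location / {
        proxy_cache mbjs_cache;
        proxy_cache_valid 200 10m;
        # cache misses go through the ratelimited hop below
        proxy_pass http://127.0.0.1:8081;
    }
}

# internal server: ratelimits only uncached fetches
server {
    listen 127.0.0.1:8081;
    location / {
        limit_req zone=mbjs burst=20;
        proxy_set_header User-Agent "mbjs-mirror/1.0 (contact@example.org)";
        proxy_pass https://upstream.example.org;
    }
}
```

Because the limiter sits behind the cache, cached content is served with no limit, and only cache misses are queued and delayed.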
ianmcorvidae
right now we have a several-step thing: first it modifies the key for some things like customers that get higher ratelimits, then it looks up keys by regex to determine the ratelimit parameters (count and the period the count applies to, mostly)
and then it does this three times, one for IP, one for UA, one for global
yeah
kepstin
yeah, you'd have to have multiple rate limit zones in nginx for the different settings
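nginx does allow several `limit_req` directives in one context, each drawing from its own zone, which maps fairly directly onto the IP/UA/global split described above. A sketch (zone names and rates are made up):

```nginx
limit_req_zone $binary_remote_addr zone=perip:10m   rate=10r/s;
limit_req_zone $http_user_agent    zone=perua:10m   rate=50r/s;
limit_req_zone $server_name        zone=global:1m   rate=300r/s;

location /ws/ {
    # all three limits apply at once;
    # the most restrictive one wins for any given request
    limit_req zone=perip  burst=20;
    limit_req zone=perua  burst=50;
    limit_req zone=global burst=100;
    proxy_pass http://backend;
}
```

Per-customer overrides would still need something like a `map` to rewrite the key or pick a zone, which is where the "chain of gross ifs" worry comes in.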
I wonder if you can use the limit_req inside an if.
ianmcorvidae
heh
at least you can limit_req and then proxy_pass it, like you are
kepstin
... no; it's http, server, location context only.
ianmcorvidae
so we'd still only need one final backend that passes to the real WS servers
kepstin
the nginx ratelimiter's design inherently smooths out bursts
rather than passing through 10 requests immediately and then rejecting the rest, depending on config it'll either (with burst=10) accept 10 but delay them so the output is e.g. 1 req/s, or (with burst=0) accept 1 and reject the others until 1 second passes.
so it's very different behaviour from what musicbrainz has now.
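The two configurations being contrasted look like this; zone name and rate are assumptions:

```nginx
limit_req_zone $binary_remote_addr zone=smooth:10m rate=1r/s;

# burst=10: up to 10 excess requests are queued and
# released at 1 r/s on the output side
location /queued/ {
    limit_req zone=smooth burst=10;
    proxy_pass http://backend;
}

# burst=0 (the default): one request per second is accepted,
# anything arriving sooner is rejected immediately
location /strict/ {
    limit_req zone=smooth;
    proxy_pass http://backend;
}
```

(There is also a `nodelay` flag that forwards the burst immediately instead of trickling it out, while still counting it against the bucket.)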
... unless that's not what it does. It's not exactly super-clear
I wonder if they have the actual algorithm used documented somewhere
given that they talk about delay in the docs, I have to assume that it's doing queueing
KillDaBOB: sorry about that, the main database was locked up for some reason
KillDaBOB
heh.
hopefully it wasn’t down for long
luks
submissions were down for about one hour :(
KillDaBOB
that’s not too bad. i guess it was caught pretty early.
Gentlecat joined the channel
reosarevok joined the channel
jesus2099 joined the channel
jesus2099
do we know why nikki’s simple Remove relationship edits show only « An error occured while loading this edit »? they are just some normal ARs being removed, nothing like new types or anything…
reosarevok
The way attributes and removals are handled changed
Will be fixed tonight anyway
chirlu`
It’s both because they now store the “ended” flag and because they may contain instrument credits.
jesus2099
oh, tonight, great! i feared it would be long term; i can stop switching back and forth tonight :)
reosarevok
Normally this is at most two weeks, since it's just caused by a difference in how edits are treated in beta vs. prod