Since it needs to load everything the site needs to work
bitmap
these roles are specific to our production setup, so I guess it might make sense to store a file defining all the grants in docker-server-configs, and have a step to run that during each schema change?
that's what I assumed about musicbrainz_ro, yeah
reosarevok
yvanzo: does musicbrainz-docker use these at all in any way?
bitmap: sir also needs to access editor, or at least some columns in editor.
bitmap
ah ok
we should be able to set up some column level authorization too
the docker-postgres-cluster repo (which is also private) could also work as a place to store the grants
yvanzo
reosarevok: for development purpose only, initially I opened this ticket to allow using musicbrainz_ro in PROD_STANDBY DBDefs
reosarevok
Just asking because if we put it as prod-only in docker-server-configs then we might need a separate config for other people's docker servers or something
But I guess that's also fine?
yvanzo
but that could/should also be used to connect to pink db when running local branch
for example, not using musicbrainz_ro caused issues when we switched floyd/pink for a few weeks
reosarevok
Maybe the development setup could have a script copy what we do in MB for ease of development
yvanzo
pink was not read-only then, and using musicbrainz_ro would have prevent making changes to the main database.
bitmap: using docker-postgres-cluster would require to replace the container each time we change permissions though? :/ docker-server-configs might be more appropriate then
reosarevok
So should we put together some lists of what each user should have access to? And maybe document them somewhere, if they are not?
yvanzo
yes, and have a script in musicbrainz-server to check/enforce such list
bitmap
yvanzo: we wouldn't necessarily need to replace the container, I guess it'd just be stored as a .sql file that we can copy into the container
since it only needs to be run during schema changes
elgranRoble81 joined the channel
elgranRoble81 has quit
yvanzo
yes but we sometimes introduce new tables in production months before releasing a schema change
(it happened a few times for the latest schema change at least)
bitmap
right
but I think whether it's stored in docker-server-configs doesn't matter much since we have to copy the file to execute it anyway
Do we want "ALL TABLES IN SCHEMA musicbrainz" for caa_redirect?
yvanzo
Or are there other tables to protect more?
bitmap
we could use this file as a base
yvanzo
reosarevok: It needs access to some edit/release/release group info at least.
bitmap
for caa_redirect we can restrict the tables a bit more, I can make a list of the ones it needs access to
yvanzo
👍
reosarevok
I was more thinking about for example users with private collections, tags, ratings etc
But that's per user, and might not be so important to restrict anyway?
I guess we do need musicbrainz_ro to access those, I'm expecting sir for example doesn't need _raw tables?
bitmap
depends on if there are other projects using musicbrainz_ro and what they use it for
reosarevok
Do we need users to be able to log in when ro btw?
yvanzo
we should really have a pg user per project, that helps with debugging
reosarevok
That probably would not hurt
bitmap
yes that would make sense
reosarevok
wikidata-bot certainly doesn't need to access every table
Would we store that info also in docker-server-configs or?
bitmap
(there is also a way to specify application_name when connecting to pg, but it doesn't appear to be set for existing musicbrainz_ro connections)
docker-server-configs works for me
reosarevok
So I guess we need to ask each project to set a list of tables they need access to? If we decide to go that way
bitmap
ideally yes, or at least make sure they are using musicbrainz_ro and set an application_name
reosarevok
I guess application_name doesn't restrict the access they have to though?
bitmap
nope, just helps with debugging where connections are coming from
reosarevok
How important is that restriction? As in (probably dumb question) what's the risk? That someone in that project accidentally selects from editor and prints it on a table? :)
(which reminds me that I have to complete the one for RabbitMQ)
reosarevok: can be accidental or intentional (not necessarily from the code)
odnes_ has quit
The tables application and editor_oauth_token should be restricted too.
bitmap
reosarevok: after the last data leak I think it's good to minimize risk where possible, even if unlikely to occur
reosarevok
Agreed
Ok!
So, can we document the next steps?
I guess there's no specific rush other than "do this before the next time we need to go read-only"
bitmap
well, I'll start with making a list of tables needed for caa_redirect
reosarevok
Can you also document your proposed way to do this under the ticket too?
bitmap
I think we should then move these sql files from docker-postgres repo to docker-server-configs
reosarevok
Also, I guess we did not announce a Q4 schema change, so we aren't having one :)
yvanzo
For table level, it seems easy. For column level, I don't know.
reosarevok
And should we add tickets for each project lead to document which tables they need?
Problem being I'm not even sure what are all the projects that need it :D
Wonder if there's more small things like wikidata-bot :)
I expect each project could have like critiquebrainz_ro or whatever?
bitmap
we could bring it up during the next meeting
reosarevok
Ok
yvanzo
Ok (but I won't around)
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, Per-project PSQL users (bitmap, reo)
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, Per-project mb_db users (bitmap, reo)
bitmap
but we should probably start with just restricting the caa_redirect & sir roles a bit more and have a process for updating those during schema changes
yvanzo
👍
bitmap
the per-project roles can be a future improvement
anyway. I can open a PR to move the .sql files to docker-server-configs this week and we can improve on those in the PR?
reosarevok
And I guess run a one-off script to set the privs as we want them too?
Seems good to me
yvanzo
bitmap: restricting caa_redirect at least would be a good start, as it seems to be the one requiring the less tables.
bitmap
yeah, that one should be easier
yvanzo
for column level permissions, we will have to test whether sir is able to handle it as well
(what I don't know is whether mbdata is not trying to query all the columns anyway)
bitmap
agreed, we don't have to be hasty with the column level stuff
we could probably restrict access to oauth tables at least, to start
yvanzo
👍
reosarevok
Seems like we have a plan
odnes joined the channel
bitmap
yup
reosarevok
So, thanks!
I wanted to ask something else but I forgot, whoops.
So, I guess that's good enough for now :)
yvanzo
so I will
reosarevok
Oh! Go for it :D
yvanzo
would it be fine to add a /search-indexer endpoint?
reosarevok
What's the intention?
yvanzo
to help with setting/updating triggers from search indexer
currently on mirrors, we have to generate SQL files, copy them from a container to another, send run a script
reosarevok
So the idea is to send the file via the endpoint?
yvanzo
ideally, sir should be automatically check its own triggers and update these if necessary through this endpoint
we can also use it to have sir sends its status to the webserver
reosarevok
As long as it's not an added risk, why not :)
yvanzo
it's an added risk of course :)
well, it's not something for now, just an idea in the air
reosarevok
I mean, that the endpoint is meant to make changes on sir but you can't modify the MB DB through it
yvanzo
but the idea of having an all-in-one admin UI in MBS has been around for some years
bitmap
how come you need the endpoint to allow sir to update its own triggers if it already has database access?
Can we give sir right to read triggers without modifying them?
bitmap
oh, okay I'll look
yvanzo
reosarevok: "primary" wouldn't be any better than "core": neither more descriptive nor avoiding collision with other defined terms ;)
reosarevok
Well, hence hoping bitmap has ideas :D
yvanzo
reosarevok: good question, probably yes
Thanks for the feedback on SIR endpoint. As I said, it's not for now, there are a number of higher priority changes to be made about SIR/Solr.
reosarevok
Having a way for sir to notify if it's struggling sounds great, in any case - can we have a page for all our project's statuses?
yvanzo
That's much bigger project and using prometheus /metrics endpoint would be a potential starting point
bitmap, reosarevok: The above ticket is essentially an almost full review of “entity” usage in docs and code. So it can be a bit overwhelming. Maybe to be split under subtickets if there are changes that we identify/agree on sooner than others.