Since it needs to load everything the site needs to work
2022-09-06 24912, 2022
bitmap
these roles are specific to our production setup, so I guess it might make sense to store a file defining all the grants in docker-server-configs, and have a step to run that during each schema change?
2022-09-06 24953, 2022
bitmap
that's what I assumed about musicbrainz_ro, yeah
2022-09-06 24957, 2022
reosarevok
yvanzo: does musicbrainz-docker use these at all in any way?
bitmap: sir also needs to access editor, or at least some columns in editor.
2022-09-06 24923, 2022
bitmap
ah ok
2022-09-06 24911, 2022
bitmap
we should be able to set up some column level authorization too
2022-09-06 24923, 2022
bitmap
the docker-postgres-cluster repo (which is also private) could also work as a place to store the grants
2022-09-06 24934, 2022
yvanzo
reosarevok: for development purpose only, initially I opened this ticket to allow using musicbrainz_ro in PROD_STANDBY DBDefs
2022-09-06 24938, 2022
reosarevok
Just asking because if we put it as prod-only in docker-server-configs then we might need a separate config for other people's docker servers or something
2022-09-06 24944, 2022
reosarevok
But I guess that's also fine?
2022-09-06 24952, 2022
yvanzo
but that could/should also be used to connect to pink db when running local branch
2022-09-06 24944, 2022
yvanzo
for example, not using musicbrainz_ro caused issues when we switched floyd/pink for a few weeks
2022-09-06 24903, 2022
reosarevok
Maybe the development setup could have a script copy what we do in MB for ease of development
2022-09-06 24922, 2022
yvanzo
pink was not read-only then, and using musicbrainz_ro would have prevent making changes to the main database.
2022-09-06 24948, 2022
yvanzo
bitmap: using docker-postgres-cluster would require to replace the container each time we change permissions though? :/ docker-server-configs might be more appropriate then
2022-09-06 24924, 2022
reosarevok
So should we put together some lists of what each user should have access to? And maybe document them somewhere, if they are not?
2022-09-06 24949, 2022
yvanzo
yes, and have a script in musicbrainz-server to check/enforce such list
2022-09-06 24955, 2022
bitmap
yvanzo: we wouldn't necessarily need to replace the container, I guess it'd just be stored as a .sql file that we can copy into the container
2022-09-06 24916, 2022
bitmap
since it only needs to be run during schema changes
2022-09-06 24949, 2022
elgranRoble81 joined the channel
2022-09-06 24956, 2022
elgranRoble81 has quit
2022-09-06 24929, 2022
yvanzo
yes but we sometimes introduce new tables in production months before releasing a schema change
2022-09-06 24902, 2022
yvanzo
(it happened a few times for the latest schema change at least)
2022-09-06 24916, 2022
bitmap
right
2022-09-06 24945, 2022
bitmap
but I think whether it's stored in docker-server-configs doesn't matter much since we have to copy the file to execute it anyway
Do we want "ALL TABLES IN SCHEMA musicbrainz" for caa_redirect?
2022-09-06 24918, 2022
yvanzo
Or are there other tables to protect more?
2022-09-06 24948, 2022
bitmap
we could use this file as a base
2022-09-06 24903, 2022
yvanzo
reosarevok: It needs access to some edit/release/release group info at least.
2022-09-06 24904, 2022
bitmap
for caa_redirect we can restrict the tables a bit more, I can make a list of the ones it needs access to
2022-09-06 24943, 2022
yvanzo
👍
2022-09-06 24945, 2022
reosarevok
I was more thinking about for example users with private collections, tags, ratings etc
2022-09-06 24900, 2022
reosarevok
But that's per user, and might not be so important to restrict anyway?
2022-09-06 24923, 2022
reosarevok
I guess we do need musicbrainz_ro to access those, I'm expecting sir for example doesn't need _raw tables?
2022-09-06 24902, 2022
bitmap
depends on if there are other projects using musicbrainz_ro and what they use it for
2022-09-06 24946, 2022
reosarevok
Do we need users to be able to log in when ro btw?
2022-09-06 24947, 2022
yvanzo
we should really have a pg user per project, that helps with debugging
2022-09-06 24957, 2022
reosarevok
That probably would not hurt
2022-09-06 24917, 2022
bitmap
yes that would make sense
2022-09-06 24930, 2022
reosarevok
wikidata-bot certainly doesn't need to access every table
2022-09-06 24904, 2022
reosarevok
Would we store that info also in docker-server-configs or?
2022-09-06 24912, 2022
bitmap
(there is also a way to specify application_name when connecting to pg, but it doesn't appear to be set for existing musicbrainz_ro connections)
2022-09-06 24905, 2022
bitmap
docker-server-configs works for me
2022-09-06 24911, 2022
reosarevok
So I guess we need to ask each project to set a list of tables they need access to? If we decide to go that way
2022-09-06 24957, 2022
bitmap
ideally yes, or at least make sure they are using musicbrainz_ro and set an application_name
2022-09-06 24929, 2022
reosarevok
I guess application_name doesn't restrict the access they have to though?
2022-09-06 24959, 2022
bitmap
nope, just helps with debugging where connections are coming from
2022-09-06 24956, 2022
reosarevok
How important is that restriction? As in (probably dumb question) what's the risk? That someone in that project accidentally selects from editor and prints it on a table? :)
(which reminds me that I have to complete the one for RabbitMQ)
2022-09-06 24955, 2022
yvanzo
reosarevok: can be accidental or intentional (not necessarily from the code)
2022-09-06 24941, 2022
odnes_ has quit
2022-09-06 24956, 2022
yvanzo
The tables application and editor_oauth_token should be restricted too.
2022-09-06 24908, 2022
bitmap
reosarevok: after the last data leak I think it's good to minimize risk where possible, even if unlikely to occur
2022-09-06 24938, 2022
reosarevok
Agreed
2022-09-06 24940, 2022
reosarevok
Ok!
2022-09-06 24948, 2022
reosarevok
So, can we document the next steps?
2022-09-06 24938, 2022
reosarevok
I guess there's no specific rush other than "do this before the next time we need to go read-only"
2022-09-06 24955, 2022
bitmap
well, I'll start with making a list of tables needed for caa_redirect
2022-09-06 24929, 2022
reosarevok
Can you also document your proposed way to do this under the ticket too?
2022-09-06 24948, 2022
bitmap
I think we should then move these sql files from docker-postgres repo to docker-server-configs
2022-09-06 24958, 2022
reosarevok
Also, I guess we did not announce a Q4 schema change, so we aren't having one :)
2022-09-06 24900, 2022
yvanzo
For table level, it seems easy. For column level, I don't know.
2022-09-06 24900, 2022
reosarevok
And should we add tickets for each project lead to document which tables they need?
2022-09-06 24915, 2022
reosarevok
Problem being I'm not even sure what are all the projects that need it :D
2022-09-06 24933, 2022
reosarevok
Wonder if there's more small things like wikidata-bot :)
2022-09-06 24953, 2022
reosarevok
I expect each project could have like critiquebrainz_ro or whatever?
2022-09-06 24924, 2022
bitmap
we could bring it up during the next meeting
2022-09-06 24947, 2022
reosarevok
Ok
2022-09-06 24937, 2022
yvanzo
Ok (but I won't around)
2022-09-06 24944, 2022
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, Per-project PSQL users (bitmap, reo)
2022-09-06 24954, 2022
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, Per-project mb_db users (bitmap, reo)
2022-09-06 24915, 2022
bitmap
but we should probably start with just restricting the caa_redirect & sir roles a bit more and have a process for updating those during schema changes
2022-09-06 24928, 2022
yvanzo
👍
2022-09-06 24944, 2022
bitmap
the per-project roles can be a future improvement
2022-09-06 24957, 2022
bitmap
anyway. I can open a PR to move the .sql files to docker-server-configs this week and we can improve on those in the PR?
2022-09-06 24937, 2022
reosarevok
And I guess run a one-off script to set the privs as we want them too?
2022-09-06 24945, 2022
reosarevok
Seems good to me
2022-09-06 24950, 2022
yvanzo
bitmap: restricting caa_redirect at least would be a good start, as it seems to be the one requiring the less tables.
2022-09-06 24910, 2022
bitmap
yeah, that one should be easier
2022-09-06 24947, 2022
yvanzo
for column level permissions, we will have to test whether sir is able to handle it as well
2022-09-06 24927, 2022
yvanzo
(what I don't know is whether mbdata is not trying to query all the columns anyway)
2022-09-06 24951, 2022
bitmap
agreed, we don't have to be hasty with the column level stuff
2022-09-06 24906, 2022
bitmap
we could probably restrict access to oauth tables at least, to start
2022-09-06 24919, 2022
yvanzo
👍
2022-09-06 24944, 2022
reosarevok
Seems like we have a plan
2022-09-06 24903, 2022
odnes joined the channel
2022-09-06 24931, 2022
bitmap
yup
2022-09-06 24959, 2022
reosarevok
So, thanks!
2022-09-06 24908, 2022
reosarevok
I wanted to ask something else but I forgot, whoops.
2022-09-06 24916, 2022
reosarevok
So, I guess that's good enough for now :)
2022-09-06 24918, 2022
yvanzo
so I will
2022-09-06 24932, 2022
reosarevok
Oh! Go for it :D
2022-09-06 24933, 2022
yvanzo
would it be fine to add a /search-indexer endpoint?
2022-09-06 24953, 2022
reosarevok
What's the intention?
2022-09-06 24958, 2022
yvanzo
to help with setting/updating triggers from search indexer
2022-09-06 24925, 2022
yvanzo
currently on mirrors, we have to generate SQL files, copy them from a container to another, send run a script
2022-09-06 24949, 2022
reosarevok
So the idea is to send the file via the endpoint?
2022-09-06 24915, 2022
yvanzo
ideally, sir should be automatically check its own triggers and update these if necessary through this endpoint
2022-09-06 24948, 2022
yvanzo
we can also use it to have sir sends its status to the webserver
2022-09-06 24927, 2022
reosarevok
As long as it's not an added risk, why not :)
2022-09-06 24942, 2022
yvanzo
it's an added risk of course :)
2022-09-06 24955, 2022
yvanzo
well, it's not something for now, just an idea in the air
2022-09-06 24957, 2022
reosarevok
I mean, that the endpoint is meant to make changes on sir but you can't modify the MB DB through it
2022-09-06 24921, 2022
yvanzo
but the idea of having an all-in-one admin UI in MBS has been around for some years
2022-09-06 24943, 2022
bitmap
how come you need the endpoint to allow sir to update its own triggers if it already has database access?
Can we give sir right to read triggers without modifying them?
2022-09-06 24958, 2022
bitmap
oh, okay I'll look
2022-09-06 24933, 2022
yvanzo
reosarevok: "primary" wouldn't be any better than "core": neither more descriptive nor avoiding collision with other defined terms ;)
2022-09-06 24948, 2022
reosarevok
Well, hence hoping bitmap has ideas :D
2022-09-06 24952, 2022
yvanzo
reosarevok: good question, probably yes
2022-09-06 24910, 2022
yvanzo
Thanks for the feedback on SIR endpoint. As I said, it's not for now, there are a number of higher priority changes to be made about SIR/Solr.
2022-09-06 24949, 2022
reosarevok
Having a way for sir to notify if it's struggling sounds great, in any case - can we have a page for all our project's statuses?
2022-09-06 24952, 2022
yvanzo
That's much bigger project and using prometheus /metrics endpoint would be a potential starting point
2022-09-06 24907, 2022
yvanzo
bitmap, reosarevok: The above ticket is essentially an almost full review of “entity” usage in docs and code. So it can be a bit overwhelming. Maybe to be split under subtickets if there are changes that we identify/agree on sooner than others.