Skip to content

Add general table / view revocations and future schema/table/view grants and revocations

Taylor A Murphy requested to merge (removed):table-view-revocations into master

Merge Request Checklist

  • Link to an issue #1481 (closed) #1482 (closed)
  • Include the proposed fix or feature
  • Include and update tests for the modified code
  • Include a documentation change
  • Add a CHANGELOG.md entry in the Unreleased section

Once your merge request is complete, please assign it a Meltano maintainer for review cycle. Once the review cycle finished, the reviewer shall approve the merge request so it can be merged by any Meltano maintainer.

Changes:

snowflake_grants.py

  • Removes full_schema_list
  • generate_schema_grants
    • Adds future grants for schemas and handle diffing of grants
    • Adds revocations to future grants for schemas
  • generate_table_and_view_grants
    • Adds revocations to tables/views
    • Adds revocations for future grants for tables/views and handle diffing of grants
    • Simplifies the grants to views since there is only a single "write" permission to be granted
    • Removes the grant to ALL tables/views since we now explicitly fetch all tables/views no matter what. This list is added to the full grant list if <schema_name>.* is passed

snowflake_connector.py

  • Moves the full_schema_list function from snowflake_grants.py and adjusts all references
  • Adds show_future_grants

snowflake_spec_load.py

  • Updated get_privileges_from_snowflake_server to get future grants and consolidates them into self.grants_to_role

I worked hard to keep this iteration small!

I tried to make this iteration small but in doing so realized I had to handle the future grants in this iteration. 😒

The simplest changes were to snowflake_connector and snowflake_spec_loader. I moved the full_schema_list function from the SnowflakeGrantsGenerator to the SnowflakeConnector. I also added a show_future_grants function. snowflake_spec_loader will now fetch future grants using this new function and collapse everything into the single object self.grants_to_role. I'm curious if you think there's a way to simplify the dictionary consolidation. I like it explicit at least for a first pass though. The final format of grants_to_role does not change (i.e. the top level is still role), but now it will include entities like raw.<schema> and analytics.analytics_staging.<table> and analytics.analytics_staging.<view>. These represent the future grants on schemas and tables / views respectively. I also switched to using set_default which is a nice feature I learned about in Fluent Python.

The update to schema grants now handles the case where <db_name>.* is passed - I don't actually think we were handling that properly before. We were fetching all the schemas but we weren't doing the future grants. The function handles that case now and also handles revocations for future grants (the new elif in revokes). It also handles the case (the updated final elif in revokes) where an object is granted on Snowflake but is not explicitly granted with a fully qualified name and is not included as part of the future grant representing all objects.

Table and view grants have also been updated to remove the grant to all objects. Now that we fetch all schemas and tables, if an asterisk is seen, then all objects are just added to the list of things to be granted to. This was easier to handle in the case of revocations as the "grant all" on snowflake unfurls into individual grants anyways. A downside of this is that the variables for the granting in this function feel a bit confusing. There are some that hold the state for the entire function call, but some similarly named variables are only for a given loop instantiation. Guidance on renaming would be appreciated I think. Also, I simplified the view grant logic b/c there is only a single privilege that can be granted for views in either the read or write case - that's "SELECT". This also fixes a bug for the views where table was being selected and not view. This also handles determining whether a future grant is new or not by iterating through the write privilege arrays where.

I've also added revocations to tables and views for regular and future grants. This follows similar logic for schemas.

I did not add any tests to this because we need some sort of Snowflake connection mock for dealing with schemas and views. I will need some help with that for sure.

My recommendations for reviewing:

  • Validate full_schema_list move and changes to conn
  • Validate show_future_grants does what it's supposed to
  • Validate get_privileges_from_snowflake_server collapses privileges properly
  • Validate privileges are properly generated for schemas
  • Validate future privileges are properly generated for schemas
  • Validate revokes are properly generated for schemas
  • Validate privileges are properly generated for tables/views
  • Validate future privileges are properly generated for tables/views
  • Validate revokes are properly generated for tables/views

Notes:

  • There is some new code that probably could be refactored to be a bit more dry. I'm intentionally leaving it to sit for an iteration to see if it's really worth it. Specifically the future grant logic and some of the revoke logic.

Testing I did for this:

  • The engineer role in our instance has future grant privileges on the raw database and most every schema in raw. Removing read and write permissions in the spec resulted in:
REVOKE usage ON FUTURE SCHEMAS IN DATABASE raw FROM ROLE engineer
REVOKE monitor, create table, create view, create stage, create file format, create sequence, create function, create pipe ON FUTURE SCHEMAS IN DATABASE raw FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.greenhouse FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.zuora_stitch FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.engineering_extracts FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.snowflake FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.gitlab_data_yaml FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.snapshots FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.version_db_dump_stitch FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.tap_postgres FROM ROLE engineer
REVOKE select ON FUTURE tables IN SCHEMA raw.sheetload FROM ROLE engineer
REVOKE select ON FUTURE views IN SCHEMA raw.zuora_stitch FROM ROLE engineer
REVOKE insert, update, delete, truncate, references ON FUTURE tables IN SCHEMA raw.engineering_extracts FROM ROLE engineer
REVOKE insert, update, delete, truncate, references ON FUTURE tables IN SCHEMA raw.snapshots FROM ROLE engineer
  • The engineer role spec has analytics.*.* for both read and write and a ton of grant future statements were generated. This is a subset:
+ GRANT usage ON FUTURE SCHEMAS IN DATABASE analytics TO ROLE engineer;
+ GRANT usage, monitor, create table, create view, create stage, create file format, create sequence, create function, create pipe ON FUTURE SCHEMAS IN DATABASE analytics TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.analytics TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.analytics TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.analytics_analytics TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.analytics_analytics TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.analytics_meta TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.analytics_meta TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.analytics_sensitive TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.analytics_sensitive TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.analytics_staging TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.analytics_staging TO ROLE engineer;
+ GRANT select ON FUTURE tables IN SCHEMA analytics.boneyard TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.boneyard TO ROLE engineer;
+ GRANT select, insert, update, delete, truncate, references ON FUTURE tables IN SCHEMA analytics.snowplow_2020_04 TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.snowplow_2020_04 TO ROLE engineer;
+ GRANT select, insert, update, delete, truncate, references ON FUTURE tables IN SCHEMA analytics.tmurphy_scratch_analytics TO ROLE engineer;
+ GRANT select ON FUTURE views IN SCHEMA analytics.tmurphy_scratch_analytics TO ROLE engineer;
  • The pwc_contractor role has read access to raw.salesforce_stitch.* and when removed it generated a ton of statements like
+ REVOKE usage ON schema raw.salesforce_stitch FROM ROLE pwc_contractor;
+ REVOKE select ON table raw.salesforce_stitch.statement_of_work__c FROM ROLE pwc_contractor;
+ REVOKE select ON FUTURE views IN SCHEMA raw.salesforce_stitch FROM ROLE pwc_contractor;
+ REVOKE select ON FUTURE tables IN SCHEMA raw.salesforce_stitch FROM ROLE pwc_contractor;
Edited by Melty Bot

Merge request reports