WIP: Allow system database schema to be overridden when PostgreSQL is used (!1781) · Merge requests · Meltano / Meltano

Charles Julian Knight requested to merge rabidaudio/meltano:2082_override_system_database_schema into master Jul 27, 2020

Merge Request Checklist

Link to an issue
Include the proposed fix or feature
Include and update tests for the modified code
Include a documentation change
Add a CHANGELOG.md entry in the Unreleased section

Once your merge request is complete, please assign it a Meltano maintainer for review cycle. Once the review cycle finished, the reviewer shall approve the merge request so it can be merged by any Meltano maintainer.

Closes #2082 (closed)

I thought I'd take a stab at this one thinking it would be easy, but there's a few gotchas here. What I've committed so far is a conversation starter, it doesn't work yet.

Some problems with this solution:

The env var is hard coded and spread around a few places. Ideally this setting would be configured like everything else, but there's kind of a circular dependency between the config code and the database
core/db.py supports a database engine per project, but I'm not really sure under what situations there would be more than one project. SQLAlchemy seems to set the schema either on the the tables themselves or via MetaData, in either case these would be shared across multiple projects. I'm not sure if that's a concern or not
Getting the migrations to find out about the configuration option has been a lot harder than I expected. It doesn't seem to recognize the option from MetaData. Some sources online suggested using:

    def upgrade(self, silent=False):
        conn = self.engine.connect().execution_options(
            schema_translate_map={
                None: os.getenv('MELTANO_DATABASE_SCHEMA')
            }
        )

This seems to change the schema for Alembic, but it causes a database error when the migrations are run:

[2020-07-27 18:42:59,321] [21398|MainThread|root] [ERROR] (psycopg2.errors.UndefinedObject) type "job_state" does not exist
LINE 5:  state job_state, 
               ^
 [SQL: '\nCREATE TABLE meltano.job (\n\tid SERIAL NOT NULL, \n\tjob_id VARCHAR, \n\tstate job_state, \n\tstarted_at TIMESTAMP WITHOUT TIME ZONE, \n\tended_at TIMESTAMP WITHOUT TIME ZONE, \n\tpayload VARCHAR, \n\tpayload_flags INTEGER, \n\tPRIMARY KEY (id)\n)\n\n'] (Background on this error at: http://sqlalche.me/e/f405)

With a little direction I'd still be willing to contribute. Otherwise I can leave this as context for a future implementer.

Edited Jul 30, 2020 by Charles Julian Knight

WIP: Allow system database schema to be overridden when PostgreSQL is used

Merge Request Checklist

Merge request reports