Make project management in Meltano "git aware"
Documenting here some ideas as discussed in the Meltano-for-Meltano deployment discussion between @aaronsteers and @tayloramurphy.

------

Especially when running inside a container, it could be extremely helpful if Meltano were made git aware. Without this awareness, the Meltano deployment story is extremely difficult for a typical data developer to implement.

A three-phase approach to this:

1. The docker container has a bootloader script that pulls the project from Git as a first step.
2. Meltano natively is able to push and pull prior to certain commands.
3. The Meltano UI is able to push and pull to git, and will prompt you if you have "uncommitted changes".

## Simplify the git workflow for Data Professionals out of the box

- When launching the `meltano/meltano` docker image, the image will detect the repo URL, credentials, and default branch. Assuming no other project has been mapped, it will download the project repo at launch time.
- When needed, `meltano install` will also be run automatically after the project is cloned.
- Unless otherwise specified, we can assume 2 branches on every project repo: `main` (or `master`) and `development`. Projects start on the `development` branch by default, creating it from `main` if it does not yet exist.
- Optionally, a user name or environment name can be appended to the branch name: `development/aj` or `development/web-ui`.
- Meltano will default to "auto-commit mode" for users not familiar with git, or for environments where we do not have direct interactivity with the developer. Within this mode of operation:
  - Commits and pushes are triggered automatically against the development branch if files are changed by a `meltano` CLI command.
  - Pulls are triggered automatically before running `meltano elt` and before modifications to the project.
- The repo is automatically in "read only mode" whenever on `main`. In this mode, `meltano` CLI commands will fail (or prompt for a branch change) if they would modify `meltano.yml`.
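
The branch rules above could be captured in a small policy object. The following is only a sketch of one possible shape, not an existing Meltano API: `GitPolicy`, `ReadOnlyBranchError`, and every method and field name are hypothetical, and the defaults simply mirror the bullets above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


class ReadOnlyBranchError(RuntimeError):
    """Raised when a command would modify the project on a protected branch."""


@dataclass(frozen=True)
class GitPolicy:
    # Hypothetical settings; names and defaults are assumptions for illustration.
    protected_branches: FrozenSet[str] = frozenset({"main", "master"})
    development_branch: str = "development"
    auto_commit: bool = True

    def working_branch(self, suffix: Optional[str] = None) -> str:
        """Return `development`, or `development/<suffix>` for a user or environment."""
        if suffix:
            return f"{self.development_branch}/{suffix}"
        return self.development_branch

    def is_read_only(self, branch: str) -> bool:
        return branch in self.protected_branches

    def steps_before(self, branch: str, modifies_project: bool, is_elt: bool) -> List[str]:
        """Git steps to run before a CLI command; fail early on protected branches."""
        if modifies_project and self.is_read_only(branch):
            raise ReadOnlyBranchError(
                f"'{branch}' is read-only; switch to a development branch first"
            )
        # Auto-commit mode pulls before `meltano elt` and before project modifications.
        if self.auto_commit and (modifies_project or is_elt):
            return ["pull"]
        return []

    def steps_after(self, files_changed: bool) -> List[str]:
        """Git steps to run after a CLI command has finished."""
        if self.auto_commit and files_changed:
            return ["commit", "push"]
        return []
```

Keeping the decision logic separate from the actual `git` calls would make the rules easy to unit test and easy to override for teams that later disable auto-commit mode or protect more branches.
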
## Support for advanced scenarios as teams and requirements evolve

The above would be a default experience for new teams. For advanced teams and for highly tuned environments:

- Auto-commit mode can be disabled.
- The list of protected / read-only branches can be expanded beyond just `main`.
- Specific branches can be checked out.
- The native `git` executable still works within the repo as usual, since git operations performed by `meltano` use the same standard `git` operations.
- Each container can have customized environment variables specifying which set of branches it expects to be run against, or any other constraints such as forcing read-only mode.

In these examples, `meltano project` is similar in behavior to comparable `git` commands, except that additional behaviors and constraints are applied as makes sense for Meltano projects specifically.

```bash
meltano project pull               # Pulls from the repo. URL and creds are in env vars or meltano.yml
meltano project commit             # Check branch rules; commit and push if safe, otherwise throw error.
                                   # A default commit message will be provided if none given.
meltano project checkout <BRANCH>  # Switches between branches
```

## A sample Deployment Story

A possible Kubernetes workflow would then be:

**Project initialization:**

1. Developer creates a copy from our new project template, or pushes the output of `meltano init`.
1. Developer creates an auth token in GitLab/GitHub if they don't have one already.
1. Developer maps their project git URL and auth token into environment variables for docker-compose, Kubernetes, or similar.

**Project deployment:**

1. Container starts up.
2. Some command is passed to the container, such as `meltano install`, `meltano init`, or `meltano elt`.
3. Meltano detects the git settings from env vars.
4. Meltano pulls the latest.
5. The original command is run (probably either `meltano ui` or `meltano elt`).
6. Whenever `meltano.yml` or other files are changed, Meltano attempts to commit and push back.
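
Steps 3 to 5 of the deployment sequence above might look roughly like the sketch below. It is an assumption-laden illustration rather than a proposed implementation: the environment variable names `MELTANO_GIT_URL` and `MELTANO_GIT_BRANCH`, and the functions `plan_startup` and `run_startup`, are invented for this example. Credentials are assumed to be handled by git itself (for example through a credential helper), creating the `development` branch when it is missing is left out, and the commit-and-push step 6 is not covered.

```python
import subprocess
from typing import List, Mapping, Sequence


def plan_startup(
    env: Mapping[str, str],
    command: Sequence[str],
    project_dir: str,
    already_cloned: bool,
) -> List[List[str]]:
    """Build the ordered list of commands a container bootloader would run."""
    # Hypothetical variable names; the issue does not define them.
    repo_url = env.get("MELTANO_GIT_URL")
    branch = env.get("MELTANO_GIT_BRANCH", "development")

    steps: List[List[str]] = []
    if repo_url:
        if already_cloned:
            # Step 4: pull the latest without creating merge commits.
            steps.append(["git", "-C", project_dir, "pull", "--ff-only"])
        else:
            # First launch: clone the requested branch (fails if it does not exist),
            # then install the project's plugins.
            steps.append(["git", "clone", "--branch", branch, repo_url, project_dir])
            steps.append(["meltano", "install"])
    # Step 5: finally run the command originally passed to the container.
    steps.append(list(command))
    return steps


def run_startup(steps: Sequence[Sequence[str]], project_dir: str) -> None:
    """Execute the planned steps, stopping at the first failure."""
    for step in steps:
        # git steps name their directory explicitly; everything else runs in the project.
        cwd = None if step[0] == "git" else project_dir
        subprocess.run(list(step), cwd=cwd, check=True)
```

A container entrypoint could pass `os.environ` and the container's arguments to `plan_startup`, then hand the result to `run_startup`. Separating planning from execution keeps the sequence inspectable and testable without touching a real repository.
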
## Streamlined deployment story:

- With proper env var config, the stock `meltano/meltano` docker image can be run directly from ECS, Kubernetes, or docker-compose, and only requires env vars for git project initialization.
- The project running in the container does not need to be set in read-only mode, but instead can default to auto-commit mode (unless or until the user sets it otherwise).
- After retrieving the repo from git, the image can also auto-install all the components it needs.
- As needed to decrease initialization and install time, eventually we still expect users to create their own Dockerfiles.

## Web UI opportunity down the road:

- If `meltano ui` is executed without a defined project, the Web UI could wait at a project initialization screen, asking a user to input project details and then initialize from user input.