MVC
Jacob: After talking with @joshlambert, we came up with this MVC. The layout doubles as the file structure; for example, the top layer would contain the extractors.
Repo Directory Structure
- Extract
  - Lever (Done)
    - Mapping/Filtering
  - SFDC (Done)
    - Mapping/Filtering
  - GitLab (Done)
    - Mapping/Filtering
  - BambooHR (Done)
    - Mapping/Filtering
  - Zuora (Done)
    - Mapping/Filtering
  - NetSuite (Done)
    - Mapping/Filtering
  - ZenDesk (Done)
    - Mapping/Filtering
  - Fastly (in progress)
    - Mapping/Filtering
  - CSV (todo)
    - Mapping/Filtering
- Load
  - PostgreSQL
  - CSV (not MVC)
  - BigQuery (not MVC)
  - MySQL (not MVC)
  - Snowflake (not MVC)
  - Anonymization/pseudonymization step
- Transform
  - dbt transformations (just files)
  - Python files (for example, for API lookups)
- Model
  - melt files
- Analyze
  - source files of the Flask application
- Orchestrate
  - .gitlab-ci.yml files: the ones we use ourselves, but also 20 other samples and examples
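The anonymization/pseudonymization step listed under Load above could be sketched as a keyed hash, so the same input always maps to the same token and joins still work after loading. This is a hypothetical sketch, not existing code; the function name and secret are placeholders:

```python
import hashlib
import hmac

def pseudonymize(value: str, secret: bytes = b"replace-me") -> str:
    """Deterministically pseudonymize a value with a keyed hash (HMAC-SHA256).

    Hypothetical sketch: keying the hash keeps the mapping stable across
    loads (so foreign keys survive) without storing the raw value, and
    without letting anyone who lacks the secret reverse the mapping by
    brute-forcing plain hashes.
    """
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()

# The same input always yields the same 64-character token.
token = pseudonymize("alice@example.com")
```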
Meltano Analyze will search the extractor directory for a list of extractors. It will then look in the load and orchestrate directories for a directory of the same name to find the corresponding steps in the lifecycle.
Currently we have two types of extractors, which will eventually become one. To make an MVP happen sooner, we will use both. Extractor type 1 is a standalone extractor used with its corresponding loaders. Extractor type 2 is the original style of extractor, which contains a built-in loader; these will be labeled in the directory as extractor_name__legacy and can still be run by Meltano Analyze.
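The crawl described above could be sketched as follows. This is a minimal sketch, assuming lowercase extract/, load/, and orchestrate/ directory names and the __legacy suffix convention; the function name is hypothetical:

```python
from pathlib import Path

def discover_extractors(repo_root: str) -> dict:
    """Crawl extract/ and pair each extractor with any same-named
    load/ and orchestrate/ entries (directory names are assumptions
    based on the structure described above)."""
    root = Path(repo_root)
    extractors = {}
    for entry in sorted((root / "extract").iterdir()):
        if not entry.is_dir():
            continue
        name = entry.name
        # Type-2 (legacy) extractors ship with a built-in loader and are
        # suffixed with "__legacy" per the naming convention above.
        legacy = name.endswith("__legacy")
        base = name[: -len("__legacy")] if legacy else name
        extractors[name] = {
            "legacy": legacy,
            "loader": (root / "load" / base).is_dir(),
            "orchestrate": (root / "orchestrate" / base).is_dir(),
        }
    return extractors
```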
UX
Meltano Analyze will crawl these folders, and the tabs will be:
MELTANO
- Model
  - Look for melt files
  - These exist for the visualizations
- Extract
  - Look for extractor files
  - Be able to run extractors from the UI
- Load
  - Look for loader files
  - Be able to run loaders from the UI
  - Demonstration load: CSV -> PG
- Transform
  - Look for dbt files
- Analyze
  - Charts and tables in the UI through melt files
- Orchestrate
  - Run a real ELT from the console (not MVC: requires credential entry, etc.)
  - List of GitLab CI YAML files (.gitlab-ci.yml)
To-dos
- Python implementation of this
- Architecture step by Alex Z
- Docker & Helm chart
  - Base Dockerfile with no samples (meltano:base)
  - Helm chart to deploy alongside Postgres
- Automate schema creation if it doesn't exist
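The "automate schema creation if it doesn't exist" to-do can lean on an idempotent DDL statement that the loader runs unconditionally at startup. A minimal sketch, assuming PostgreSQL as the target (the function name is hypothetical):

```python
def ensure_schema_sql(schema: str) -> str:
    """Build an idempotent schema-creation statement for PostgreSQL.

    CREATE SCHEMA IF NOT EXISTS is a no-op when the schema already
    exists, so the loader can run it on every start without first
    checking the catalog.
    """
    # Double-quote the identifier and escape any embedded quotes.
    safe = schema.replace('"', '""')
    return f'CREATE SCHEMA IF NOT EXISTS "{safe}";'
```

The returned string would then be executed over the loader's existing database connection before any tables are created.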
Customer installation:
Simple use case
- Helm chart to easily deploy to k8s
  - Provision PostgreSQL
  - Provision Meltano
BYO MeltML/Transforms
- Make your own Dockerfile
  - FROM meltano:base
  - ADD whatever files you want
- Edit the chart's values.yaml to use your image
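A BYO image following the steps above might look like the sketch below. The meltano:base image is the one named earlier; the source and destination paths are assumptions for illustration:

```dockerfile
# Start from the base Meltano image, which ships with no samples
FROM meltano:base

# Layer in your own melt files and dbt transforms
ADD model/ /meltano/model/
ADD transform/ /meltano/transform/
```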
Personas
Data Engineer persona
Installation & Getting started:
- Clone the "getting started" repo from GitLab. This repo includes only: default melt files, dbt transforms, a .gitlab-ci.yml (CI pipeline), and a values.yaml (Helm chart config). Also included is a README with instructions.
- The first stage in the pipeline is to start from the Meltano Docker image, then layer the whole repo on top.
- The second stage in the pipeline is to deploy Meltano to Kubernetes (k8s for now, other targets like VMs later).
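The two pipeline stages described above might be sketched in the repo's .gitlab-ci.yml roughly as follows. The job names, chart path, and helm invocation are assumptions; $CI_REGISTRY_IMAGE and $CI_COMMIT_SHA are standard GitLab CI variables:

```yaml
stages:
  - build
  - deploy

build_image:
  stage: build
  script:
    # Stage 1: layer the whole repo on top of the Meltano base image
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

deploy_k8s:
  stage: deploy
  script:
    # Stage 2: deploy to Kubernetes via the Helm chart and this repo's values.yaml
    - helm upgrade --install meltano ./chart -f values.yaml --set image.tag="$CI_COMMIT_SHA"
```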
Benefits:
- Customers do not have to fork the whole Meltano repo.
- Only the initial MeltML files and dbt transforms are cloned. This is ideal, as these are the files most likely to be changed anyway.
- Fewer files included means fewer worries about conflicts as we update the defaults.
- Simple mechanism to package custom files into the Docker image, and to keep up to date with the melt files in the repo without having to build all the git pull/branch functionality first.
Data Analyst persona
They would have access to the cloned "getting started" repo that the engineer set up earlier. They can then edit the LookML or other dashboard files as desired; after merging, a new container version will be built and pushed.
Developer persona
- Developers can clone the full Meltano repo locally and follow the existing process we use to develop it. The main issue is that they will be unable to run CI, since they will have to fork the repo instead of branching. (Our production variables are not protected yet.)
- If they want to generate a test image, they can simply run a docker build.