Add Airflow DAGs data model

Fred de Gier requested to merge feature/airflow-backend into master

What does this MR do and why?

This MR adds the data model for Airflow, in particular the table airflow_dags. This table contains information about the current state of a user's DAGs in their Airflow instance. This is the first iteration; more tables will be added in future iterations. Below is a screenshot of what it looks like in the frontend.

Screenshots or screen recordings

Scherm_afbeelding_2023-01-18_om_12.29.03

See also the latest Airflow SEG update: https://www.youtube.com/watch?v=E3_YGF7Wr2k

Tables Risk

  • What is the anticipated growth for the new table over the next 3 months, 6 months, 1 year? What assumptions are these based on?

An Airflow instance usually has between 5 and 100 DAGs. These are synced to GitLab; if a DAG is deleted, it is no longer stored in GitLab. My personal aim is to have 500 projects using the Airflow integration in 2023, so let's assume 50,000 records after 1 year (500 projects with up to 100 DAGs each).
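The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes one Airflow instance per project, which the description does not state explicitly, and the variable names are illustrative.

```python
# Figures taken from the answer above; "one instance per project" is an assumption.
projects = 500               # target number of projects using the integration in 2023
min_dags_per_instance = 5    # low end of the typical range
max_dags_per_instance = 100  # high end of the typical range

# Rows in airflow_dags if every project sits at the low or high end of the range.
lower_bound = projects * min_dags_per_instance
upper_bound = projects * max_dags_per_instance

print(f"expected rows after 1 year: {lower_bound} to {upper_bound}")
```

The 50,000 figure is therefore the upper end of the range; if most instances are small, the table would be considerably smaller.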

  • How many reads and writes per hour would you expect this table to have in 3 months, 6 months, 1 year? Under what circumstances are rows updated? What assumptions are these based on?

I would like to implement a smart update algorithm on the client side that minimizes the amount of data pushed, with a maximum of one push per minute per instance. Rows are updated only when such a push arrives.
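One possible shape for such a client-side update algorithm is sketched below. It is not part of this MR: the class name `ThrottledDagSync`, the `push` callback, and the hashing approach are all hypothetical. The only details taken from the MR are the column names `dag_name`, `schedule`, and `fileloc`. The idea is to skip a push when nothing has changed, and otherwise push at most once per minute.

```python
import hashlib
import json
import time
from typing import Callable, Dict, List, Optional

Dag = Dict[str, str]


class ThrottledDagSync:
    """Hypothetical helper: push DAG state only when it changed, at most once per interval."""

    def __init__(
        self,
        push: Callable[[List[Dag]], None],   # assumed callback that sends the DAG list to GitLab
        min_interval_seconds: float = 60.0,  # "maximum of once per minute per instance"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._push = push
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._last_push_at: Optional[float] = None
        self._last_digest: Optional[str] = None

    @staticmethod
    def _digest(dags: List[Dag]) -> str:
        # Sort DAGs and keys so the digest does not depend on ordering.
        ordered = sorted(dags, key=lambda dag: dag["dag_name"])
        payload = json.dumps(ordered, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def maybe_push(self, dags: List[Dag]) -> bool:
        """Return True if a push was sent, False if it was skipped."""
        digest = self._digest(dags)
        if digest == self._last_digest:
            return False  # nothing changed since the last successful push

        now = self._clock()
        if self._last_push_at is not None and now - self._last_push_at < self._min_interval:
            return False  # changed, but still inside the throttle window; retry later

        self._push(dags)
        self._last_push_at = now
        self._last_digest = digest
        return True
```

A caller would invoke `maybe_push` on every scheduler tick with the current DAG list. A change that arrives inside the throttle window is not lost, because the stored digest is only updated after a successful push, so the next call outside the window sends it.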

  • Based on the anticipated data volume and access patterns, does the new table pose an availability risk to GitLab.com or self-managed instances? Does the proposed design scale to support the needs of GitLab.com and self-managed customers?

I would hope not.

We expect data access on these tables to be very low in the beginning, since we are still incubating this feature.

Command outputs

rails db:migrate:up:main VERSION=20230111174113

Attention: used pure ruby version of MurmurHash3
main: == 20230111174113 CreateAirflowDags: migrating ================================
main: -- create_table(:airflow_dags)
main: -- quote_column_name(:dag_name)
main:    -> 0.0000s
main: -- quote_column_name(:schedule)
main:    -> 0.0001s
main: -- quote_column_name(:fileloc)
main:    -> 0.0000s
main:    -> 0.0113s
main: == 20230111174113 CreateAirflowDags: migrated (0.0124s) =======================

rails db:migrate:down:main VERSION=20230111174113

Attention: used pure ruby version of MurmurHash3
main: == 20230111174113 CreateAirflowDags: reverting ================================
main: -- drop_table(:airflow_dags)
main:    -> 0.0031s
main: == 20230111174113 CreateAirflowDags: reverted (0.0065s) =======================

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Fred de Gier
