Skip to content
Snippets Groups Projects

add entry for snowflake development databases, including retention policy

Merged Justin Stark requested to merge add-docs-for-dev-dbs into main
All threads resolved!
@@ -42,22 +42,31 @@ If you're interested in using dbt, the [dbt documentation has a great tutorial](
If you wish to use dbt and contribute to the data team project, you'll need to gain access to our Snowflake instance, which can be done via an [access request](/handbook/business-technology/end-user-services/onboarding-access-requests/access-requests/).
### Configuration
### Local environment
- Ensure you have access to our Snowflake instance
We use user-dedicated development databases and [virtual environments](/handbook/business-technology/data-team/platform/dbt-guide/#venv-workflow) via [make](https://www.gnu.org/software/make/manual/make.html) recipes in order to facilitate a 'local' development environment for GitLab team members contributing to our dbt project(s)
#### 'Local' User-Dedicated Development Databases
When needed for team members we create local development databases corresponding to the snowflake user with `_PREP` and `_PROD` suffixes, corresponding to the `PREP` and `PROD` databases in snowflake. These will be targeted by any dbt runs within the local environment
#### Configuration
- Ensure you have access to our Snowflake instance and dedicated development databases for your snowflake user
- Ensure you have [Make](https://en.wikipedia.org/wiki/Make_(software)) installed (should be installed on new Macs and with XCode)
- Create a folder in your home directory called `.dbt`
- In the `~/.dbt/` folder there should be a `profiles.yml`file that looks like this [sample profile](https://gitlab.com/gitlab-data/analytics/blob/master/admin/sample_profiles.yml)
- The smallest possible warehouse should be stored as an environment variable. Our dbt jobs use `SNOWFLAKE_TRANSFORM_WAREHOUSE` as the variable name to identify the warehouse. The environment variable can be set in the `.bashrc` or `.zshrc` file as follows:
- `export SNOWFLAKE_TRANSFORM_WAREHOUSE="ANALYST_XS"`
- `export SNOWFLAKE_TRANSFORM_WAREHOUSE="DEV_XS"`
- In cases where more compute is required, this variable can be overwritten at run. We will cover how to do this in the [next section](/handbook/business-technology/data-team/platform/dbt-guide/#choosing-the-right-snowflake-warehouse-when-running-dbt).
- Clone the [analytics project](https://gitlab.com/gitlab-data/analytics/)
- If running on Linux:
- Ensure you have [Docker installed](https://docs.docker.com/docker-for-mac/)
- Ensure you have [Rancher Desktop Installed](https://rancherdesktop.io/)
Note that many of these steps are done in the [onboarding script](https://gitlab.com/gitlab-data/analytics/-/blob/master/admin/onboarding_script.zsh) we recommend new analysts run.
### Choosing the right Snowflake warehouse when running dbt
#### Choosing the right Snowflake warehouse when running dbt
Our Snowflake instance contains [warehouses of multiple sizes](https://docs.snowflake.com/en/user-guide/warehouses-overview.html), which allow for dbt developers to allocate
differing levels of compute resources to the queries they run. The larger a warehouse is and the longer it runs, the more the query costs. For example, it costs [8 times](https://docs.snowflake.com/en/user-guide/warehouses-overview.html#warehouse-size) more to run a Large warehouse for an hour than it costs to run an X-Small warehouse for an hour.
@@ -83,7 +92,7 @@ gitlab-snowflake:
user: {username}
role: {rolename}
database: {databasename}
warehouse: ANALYST_XS
warehouse: DEV_XS
schema: preparation
authenticator: externalbrowser
dev_l:
@@ -93,7 +102,7 @@ gitlab-snowflake:
user: {username}
role: {rolename}
database: {databasename}
warehouse: ANALYST_L
warehouse: DEV_L
schema: preparation
authenticator: externalbrowser
```
@@ -105,7 +114,7 @@ use a larger warehouse. You want to retry the build, but this time you want dbt
`dbt run --models @{model_name} --target dev_l`, which tells dbt to use the warehouse you specified in the `dev_l` target in your `profiles.yml` file. After a few minutes, the build
completes and you start checking your work.
### Venv Workflow
#### Venv Workflow
{: #Venv-workflow}
Loading