Accelerated tap development framework (v0.0.1-alpha)
This initial MR would add core capabilities and provide the initial interface spec for tap developers to use when creating new taps.
Objectives:
- We want tap and target developers to get more done with less code, without having to become experts in the Singer spec.
- We want the cost of supporting taps and targets to be significantly decreased.
- We want to enable new features and new Singer spec additions in a systemic and minimally invasive way.
- We want to create a smooth onramp for existing taps.
- We want to take advantage of modern Python typing to eliminate guess work during development.
Related discussion
Please see meltano#2401 (closed) for a robust discussion on the need for a new framework.
Status:
- All core capabilities are working: discovery via scan, discovery via catalog file, sync_one, sync_all, CLI execution.
- docs/dev_guide.md shows recommended usage.
- Cookiecutter started as well here, modeled after the Parquet test. (Needs updates.)
- GitLab CI testing is online, leveraging the pytest framework.
- Poetry has been implemented for package and dependency management.
- Current samples implemented:
  - GitLab (REST/GraphQL streams hybrid)
  - Countries API (GraphQL stream type)
  - Snowflake (Database stream type)
  - Parquet (Generic stream type)
Known limitations:
- Need more robust paging for REST and GraphQL sources.
  - Currently there is no paging implemented for the GraphQL source, and the REST implementation is based on a single sample (GitLab). In theory, the developer can simply override the get_next_page() method, returning something truthy if there's another page, but this is not well defined or well documented as of yet.
- Templating and parameterization are not well documented as of yet, and it may be worth leveraging jinja instead of doing it by hand.
  - I'd like to evaluate whether, instead of using the current and generic {my_val} syntax for templating, we should migrate to jinja syntax {{my_val}}, so that developers and tap users can implement more complex logic if and when it is needed. (I have spun off an issue in #11 (closed) to discuss this in more depth.)
  - A basic means of templating is already implemented for the url_suffix parameterization in REST calls, but we will probably also want a similar standard for parsing tap settings like filepath or file_naming_scheme, as well as GraphQL queries and perhaps also SQL queries.
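To make the paging limitation above more concrete, here is a rough sketch of how a get_next_page() override could drive a paging loop. Everything except the method name is an assumption for illustration: the PagedStream class, the (response, current_page) signature, the records() loop, and the fake_fetch helper are hypothetical and do not reflect the framework's actual interface.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


class PagedStream:
    """Hypothetical base class, NOT the framework's real stream class."""

    def __init__(self, fetch: Callable[[int], Page]) -> None:
        # `fetch` stands in for the HTTP call; it maps a page number to a response.
        self._fetch = fetch

    def get_next_page(self, response: Page, current_page: int) -> Optional[int]:
        """Return something truthy if there is another page, else None.

        The default assumes a single page. The signature is an assumption.
        """
        return None

    def records(self) -> Iterator[Dict[str, Any]]:
        """Yield records page by page until get_next_page() returns a falsy value."""
        page: Optional[int] = 1
        while page:
            response = self._fetch(page)
            yield from response["records"]
            page = self.get_next_page(response, page)


class NumberedPagesStream(PagedStream):
    """Example override for an API that reports a total page count."""

    def get_next_page(self, response: Page, current_page: int) -> Optional[int]:
        if current_page < response["total_pages"]:
            return current_page + 1
        return None


def fake_fetch(page: int) -> Page:
    """In-memory stand-in for a two-page REST endpoint (hypothetical data)."""
    data = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}
    return {"records": data[page], "total_pages": 2}


if __name__ == "__main__":
    for record in NumberedPagesStream(fake_fetch).records():
        print(record)
```

A falsy return value ends the loop, so the base class reads only the first page unless a subclass overrides the hook. Defining that contract explicitly (token type, what counts as "no more pages") is the part that still needs documenting.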
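As a reference point for the templating discussion above, the current {my_val} style can be approximated with Python's built-in string formatting. This is only a sketch using the standard library: render_template, template_fields, and the sample setting values are hypothetical names chosen for illustration, not the framework's implementation.

```python
from string import Formatter
from typing import Any, List, Mapping


def template_fields(template: str) -> List[str]:
    """List the placeholder names used in a `{name}`-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render_template(template: str, values: Mapping[str, Any]) -> str:
    """Fill `{name}` placeholders, failing loudly if a value is missing."""
    missing = [name for name in template_fields(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template values: {', '.join(missing)}")
    return template.format_map(values)


if __name__ == "__main__":
    # Hypothetical url_suffix and file_naming_scheme values.
    print(render_template("/projects/{project_id}/issues", {"project_id": 42}))
    print(render_template("{stream_name}-{batch}.parquet",
                          {"stream_name": "issues", "batch": 1}))
```

This approach covers simple substitution only. With Jinja, the equivalent call would be roughly jinja2.Template("/projects/{{ project_id }}/issues").render(project_id=42), which additionally allows conditionals and filters inside the template, at the cost of a new dependency.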
Many functions still need to be added, and I've opened tickets here for follow-on items. That said, the MR is already getting quite large, and I would love to get initial feedback on a first merge to main.
Edited by AJ Steers