Accelerated tap development framework (v0.0.1-alpha)

This initial MR adds core capabilities and provides the initial interface spec for tap developers to use when creating new taps.

Objectives:

  • We want tap and target developers to get more done with less code, without having to become experts in the Singer spec.
  • We want the cost of supporting taps and targets to be significantly decreased.
  • We want to enable new features and new Singer spec additions in a systemic and minimally invasive way.
  • We want to create a smooth onramp for existing taps.
  • We want to take advantage of modern Python typing to eliminate guesswork during development.

Related discussion

Please see meltano#2401 (closed) for a robust discussion on the need for a new framework.

Status:

  • All core capabilities are working: discovery via scan, discovery via catalog file, sync_one, sync_all, and CLI execution.
  • docs/dev_guide.md shows recommended usage.
  • A Cookiecutter template has also been started here, modeled after the Parquet test. (Needs updates.)
  • GitLab CI testing is online, leveraging the pytest framework.
  • Poetry has been implemented for package and dependency management.
  • Current samples implemented:
    • GitLab (REST/GraphQL streams hybrid)
    • Countries API (GraphQL stream type)
    • Snowflake (Database stream type)
    • Parquet (Generic stream type)

Known limitations:

  1. Need more robust paging for REST and GraphQL sources.
    • Currently there is no paging implemented for the GraphQL source, and the REST implementation is based on a single sample (GitLab). In theory, the developer can simply override the get_next_page() method, returning something truthy if there's another page, but this is not yet well defined or well documented.
  2. Templating and parameterization are not yet well documented, and it may be worth leveraging Jinja instead of doing it by hand.
    • I'd like to evaluate whether, instead of using the current generic {my_val} syntax for templating, we should migrate to Jinja syntax ({{my_val}}) so that developers and tap users can implement more complex logic if and when it is needed. (I have spun off an issue, #11 (closed), to discuss this in more depth.)
    • A basic means of templating is already implemented for the url_suffix parameterization in REST calls, but we will probably also want a similar standard for parsing tap settings like filepath or file_naming_scheme, as well as for GraphQL queries and perhaps SQL queries.
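
To make the two limitations above concrete, here is a minimal sketch of how a REST stream might combine {my_val}-style templating of url_suffix with a get_next_page() override. Everything except the names url_suffix and get_next_page() is hypothetical: the classes FakePagedAPI and SketchRESTStream, the get_next_page() signature, the has_more response field, and the project_id setting are assumptions for illustration only, not the framework's actual interface.

```python
from typing import Any, Dict, Iterator, List, Optional


class FakePagedAPI:
    """Hypothetical stand-in for a REST endpoint that serves numbered pages."""

    def __init__(self, pages: Dict[int, List[dict]]) -> None:
        self.pages = pages
        self.requested_urls: List[str] = []

    def get(self, url: str, page: int) -> Dict[str, Any]:
        # Record the request so the paging behavior can be inspected later.
        self.requested_urls.append(f"{url}?page={page}")
        return {
            "records": self.pages.get(page, []),
            "has_more": (page + 1) in self.pages,
        }


class SketchRESTStream:
    """Illustrative stream only; not the framework's real base class."""

    # Uses the current generic {my_val} templating style.
    url_suffix = "/projects/{project_id}/issues"

    def __init__(self, api: FakePagedAPI, config: Dict[str, Any]) -> None:
        self.api = api
        self.config = config

    def get_url(self) -> str:
        # Plain str.format() fills the {placeholders} from the tap settings.
        return self.url_suffix.format(**self.config)

    def get_next_page(
        self, response: Dict[str, Any], current_page: int
    ) -> Optional[int]:
        # Assumed contract: return something truthy if there is another page.
        return current_page + 1 if response["has_more"] else None

    def get_records(self) -> Iterator[dict]:
        url = self.get_url()
        page: Optional[int] = 1
        while page:
            response = self.api.get(url, page)
            yield from response["records"]
            page = self.get_next_page(response, page)


api = FakePagedAPI({1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]})
stream = SketchRESTStream(api, {"project_id": 42})
records = list(stream.get_records())
```

Under these assumptions, a developer targeting an API with cursor-based paging would only need to change get_next_page() to return the cursor token instead of a page number. A move to Jinja syntax would affect only get_url().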

Many functions still need to be added, and I've opened tickets here for follow-on items. That said, the MR is already getting quite large, and I would love to get initial feedback on a first merge to main.

Edited by AJ Steers
