Conventions re: Multi-call REST implementations

There are three main scenarios in which a REST-based tap may require multiple HTTP calls, typically with different parameter values in each subsequent call.

Scenario A: The ID of the entity being streamed is keyed into the URL, requiring multiple calls: one call per captured entity.

Example: GitLab Projects.

  • The project ID is in the URL, and (in this case) the user is able to specify a comma-separated list of projects to be extracted.
  • Multiple calls need to be made to retrieve the full list of projects. (One call per provided project ID.)

Requirements to fulfill:

  • get_url() should be plural, returning one URL per provided ID.
  • get_params() should probably be plural as well, or dynamic based on an iterated input.
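
The per-ID fan-out described for Scenario A can be sketched roughly as follows. This is only an illustration, not SDK code: `iter_requests`, `URL_TEMPLATE`, the host name, and `example_param` are all hypothetical names chosen for the sketch, and the input is assumed to be a comma-separated string of project IDs.

```python
from typing import Dict, Iterator, Tuple
from urllib.parse import quote

# Hypothetical URL template; a real tap would define its own base URL and path.
URL_TEMPLATE = "https://gitlab.example.com/api/v4/projects/{id}"


def iter_requests(
    project_ids: str, base_params: Dict[str, str]
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield one (url, params) pair per ID in a comma-separated setting."""
    for raw_id in project_ids.split(","):
        project_id = raw_id.strip()
        if not project_id:
            continue  # tolerate stray commas or blank entries
        # Path-style IDs such as "group/project" are URL-encoded here (assumption).
        url = URL_TEMPLATE.format(id=quote(project_id, safe=""))
        # Copy the params so each call could later be customized independently.
        yield url, dict(base_params)


# "example_param" is a placeholder, not a documented API parameter.
requests_to_make = list(
    iter_requests("123, my-group/my-project", {"example_param": "value"})
)
for url, params in requests_to_make:
    print(url, params)
```

Returning a generator of (url, params) pairs is one way to make both the URL and the parameters "plural" at once; a tap could equally iterate over IDs and call singular methods with the current ID as context.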

Scenario B: The parent ID is parameterized, requiring multiple calls: one call per parent entity.

  • Note: This scenario and scenario C are difficult with traditional REST but may be significantly easier in GraphQL, since joins can be performed in the query itself.

Example: GitLab Epic Issues.

Sample path: /groups/{id}/epics/{secondary_id}/issues

  • Since Epics are children of Groups, and since the group ID and epic ID are both part of the URL used to retrieve epic issues, we must make multiple calls, one per epic ID identified during extraction.
  • Since Epic Issues cannot be known unless Epics are known, it is not possible to query Epic Issues without also querying Epics.

Requirements to fulfill:

  • URLs need to be generated dynamically from the output of another REST request.
  • We may need intelligence to differentiate behavior:
    • Epics is selected - in which case we emit matching records and also chain calls to the child stream.
    • Epics is not selected - in which case we still need to query Epics to get the IDs, but we should not emit the corresponding records.
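
The selected/not-selected behavior above might look something like the sketch below. It is a simplified illustration under several assumptions: `sync_epics_and_issues` and `fake_fetch` are hypothetical helpers, `fetch` stands in for an HTTP GET that returns parsed records, and the `"id"` key is assumed to be the value that fills `{secondary_id}` in the sample path.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Record = Dict[str, object]
# Stand-in for an HTTP GET returning parsed JSON records (hypothetical).
Fetch = Callable[[str], List[Record]]


def sync_epics_and_issues(
    fetch: Fetch, group_id: str, epics_selected: bool, issues_selected: bool
) -> Iterator[Tuple[str, Record]]:
    """Yield (stream_name, record) pairs; epics are always queried for their IDs."""
    for epic in fetch(f"/groups/{group_id}/epics"):
        if epics_selected:
            yield "epics", epic  # parent records are emitted only when selected
        if issues_selected:
            # The child URL is built from the parent record's ID.
            child_path = f"/groups/{group_id}/epics/{epic['id']}/issues"
            for issue in fetch(child_path):
                yield "epic_issues", issue


def fake_fetch(path: str) -> List[Record]:
    """Canned responses used in place of real HTTP calls."""
    canned: Dict[str, List[Record]] = {
        "/groups/g1/epics": [{"id": 1}, {"id": 2}],
        "/groups/g1/epics/1/issues": [{"id": 10}],
        "/groups/g1/epics/2/issues": [{"id": 20}, {"id": 21}],
    }
    return canned.get(path, [])


# Epics not selected: they are still queried, but only child records are emitted.
emitted = list(
    sync_epics_and_issues(fake_fetch, "g1", epics_selected=False, issues_selected=True)
)
print(emitted)
```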

Some options:

  • Option A: The parent entity's stream either calls the child stream or registers future calls for the child stream.
  • Option B: The child entity gets access to the parent's keys or is empowered to make calls to retrieve them.
  • Option C: The child entity registers itself in relationship to its parent, and, once registered, receives a call on subsequent iterations of the parent key instance.
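
As one possible reading of Option C, the registration pattern could be sketched like this. The `ParentStream` class, its `register_child` and `sync` methods, and the `group_id`/`epic_id` key names are all hypothetical and are not part of any existing API; the child here only records the path it would request instead of making an HTTP call.

```python
from typing import Callable, Dict, List

ParentKeys = Dict[str, object]


class ParentStream:
    """Hypothetical parent stream that notifies registered children per record."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._children: List[Callable[[ParentKeys], None]] = []

    def register_child(self, on_parent_keys: Callable[[ParentKeys], None]) -> None:
        """Register a callback to be invoked once per parent key set."""
        self._children.append(on_parent_keys)

    def sync(self, records: List[ParentKeys]) -> None:
        """Iterate parent records and hand each one's keys to every child."""
        for record in records:
            for notify in self._children:
                notify(record)


requested_paths: List[str] = []


def epic_issues_child(keys: ParentKeys) -> None:
    # A real child stream would issue an HTTP request here; this sketch
    # only records the path that would be requested.
    requested_paths.append(
        f"/groups/{keys['group_id']}/epics/{keys['epic_id']}/issues"
    )


epics = ParentStream("epics")
epics.register_child(epic_issues_child)
epics.sync([{"group_id": 7, "epic_id": 1}, {"group_id": 7, "epic_id": 2}])
print(requested_paths)
```

Compared with Option A, the parent here needs no knowledge of which children exist; it only promises to call back whoever has registered.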

Scenario C: Follow-on (child) HTTP calls needed for full stream definition.

  • The core stream needs some vital information from another lookup, either in a 1:1, 1:many, or many:many relationship.

Example: Hypothetical "Students" records requiring chained calls.

Taking hypothetical relationships between "students", "users", "addresses", and "subjects":

  • "users" - 1:1 with "students", needs additional lookup by "user_id" on the "student" record.
  • "home_zip_code" - 1:many with "students", needs additional lookup to "addresses" by the "address_id" on the "student" record.
  • "majors" - many:many with "students", needs additional lookup to "subjects" by the "major_subject_ids" on the "student" record.

Requirements to fulfill:

  • Already available: We can use post_process() for the additional lookups.
  • Lower priority: We could create a documented paradigm in samples for calling the (parameterized and auto-retrying) HTTP methods from within post_process().
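
A rough sketch of what such a sample might demonstrate, using the hypothetical "students" example: the function below is a standalone stand-in for a stream's post_process() hook, not the SDK's actual method signature. `FAKE_API` and `lookup` are invented for illustration; in a real tap, `lookup` would wrap the parameterized, auto-retrying HTTP method.

```python
from functools import lru_cache
from typing import Dict

# Hypothetical in-memory stand-ins for the "users", "addresses", and
# "subjects" endpoints.
FAKE_API: Dict[str, Dict[int, Dict[str, str]]] = {
    "users": {1: {"name": "Ada"}},
    "addresses": {5: {"zip_code": "98101"}},
    "subjects": {10: {"title": "Math"}, 11: {"title": "Physics"}},
}


@lru_cache(maxsize=None)
def lookup(resource: str, key: int) -> Dict[str, str]:
    """Fetch one related record; cached so repeated keys cost a single call."""
    return FAKE_API[resource][key]


def post_process(row: dict) -> dict:
    """Enrich one student record with values from the three lookups."""
    enriched = dict(row)  # avoid mutating the caller's record
    enriched["user_name"] = lookup("users", row["user_id"])["name"]
    enriched["home_zip_code"] = lookup("addresses", row["address_id"])["zip_code"]
    enriched["majors"] = [
        lookup("subjects", subject_id)["title"]
        for subject_id in row["major_subject_ids"]
    ]
    return enriched


student = post_process(
    {"id": 100, "user_id": 1, "address_id": 5, "major_subject_ids": [10, 11]}
)
print(student)
```

Caching the lookups matters most for the shared relationships ("addresses" and "subjects"), where many students are likely to reference the same record.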
Edited by AJ Steers