Define license-db architecture for deps.dev BigQuery data

Topic to evaluate

Propose an addition to the license-db infrastructure that imports deps.dev data through BigQuery.

Tasks to evaluate

  • Propose and discuss ways to import deps.dev BigQuery data using the existing infrastructure.
  • Confirm the dataset size (38 GiB).
    • The estimated maximum amount of data processed per snapshot is 8 GiB.
  • Define possible failure modes during import (timeouts, saturated Pub/Sub topics, pressure on the database).
    • Also consider authentication failures, intermittent connection or query failures, and connection timeouts.
  • Document the sequence of events (diagram).
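
As a starting point for the failure-mode discussion, the sketch below shows one common way to handle the intermittent connection and query failures listed above: retry with exponential backoff and jitter. It is an illustration only, not part of the existing license-db code. The names `TransientImportError` and `retry_with_backoff` are hypothetical, and the sketch assumes the importer can tell transient failures (timeouts, dropped connections) apart from permanent ones such as authentication errors, which are not retried.

```python
import random
import time


class TransientImportError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Only TransientImportError is retried. Any other exception (for example
    an authentication failure) propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientImportError:
            if attempt == max_attempts:
                raise  # retries exhausted, surface the failure to the caller
            # The backoff ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random time in [0, ceiling] to avoid
            # synchronized retries from multiple workers.
            sleep(random.uniform(0, ceiling))
```

A caller would wrap a single query or fetch step, for example `retry_with_backoff(lambda: run_snapshot_query())`, where `run_snapshot_query` is a placeholder for whatever function performs the import step. The retry limits are assumptions and would need tuning against the timeouts agreed on in this spike.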

Timebox

3d

Outcome

  • Provide data on estimated resource use.
  • Ensure proposed design is discussed with team.
Edited by Igor Frenkel