Define license-db architecture for deps.dev BigQuery data
Topic to evaluate
Propose an addition to the license-db infrastructure to handle importing deps.dev data through BigQuery.
Tasks to evaluate
- Propose and discuss ways to fit importing of deps.dev BigQuery data into the existing infrastructure.
  - Fan-out between feeder and interfacer (thread).
  - Limit the amount of data processed by the license-db infrastructure (minimal change approach).
- Confirm dataset size (38GiB).
  - The estimated maximum amount of data processed for a snapshot is 8GiB.
- Define possible failure modes during import (timeouts, saturated Pub/Sub topics, pressure on the database).
  - Authentication, intermittent connection/query failures, connection timeout.
- Document sequence of events (diagram).
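To make the fan-out and failure-mode items above easier to discuss, here is a minimal sketch of how a feeder could split query results into bounded batches and publish each one with retries. It is an illustration under assumptions, not the license-db implementation: `TransientError`, `PermanentError`, `with_retry`, `batches`, `publish_all`, and the `publish` callback are all hypothetical names, and the batch size and delays are placeholder values. Transient failures (intermittent connection or query failures, timeouts) are retried with exponential backoff and jitter, while permanent ones (authentication) fail immediately.

```python
import random
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker: dropped connection, query failure, timeout."""


class PermanentError(Exception):
    """Hypothetical marker: failures a retry cannot fix, e.g. bad credentials."""


def with_retry(op: Callable[[], T], attempts: int = 5, base_delay: float = 1.0,
               max_delay: float = 30.0,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Run op, retrying only TransientError with capped exponential backoff."""
    for attempt in range(attempts):
        try:
            return op()
        except TransientError:
            if attempt == attempts - 1:
                raise  # retries exhausted, surface the last error
            # Random jitter keeps parallel workers from retrying in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise ValueError("attempts must be at least 1")


def batches(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` rows so each message stays bounded."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def publish_all(rows: Iterable[T], publish: Callable[[List[T]], None],
                batch_size: int = 500,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Fan rows out as batches; returns the number of batches published."""
    sent = 0
    for batch in batches(rows, batch_size):
        # Bind the batch as a default argument so the retry closure is stable.
        with_retry(lambda b=batch: publish(b), sleep=sleep)
        sent += 1
    return sent
```

In this sketch `publish` stands in for whatever hands a batch to the interfacer. Only `TransientError` is caught, so a `PermanentError` (or any other exception) propagates on the first attempt instead of consuming the retry budget.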
Timebox
3d
Outcome
- Provide data on estimated resource use.
- Ensure proposed design is discussed with team.
Edited by Igor Frenkel