npm feeder job exceeds job timeout
Problem
The npm feeder is hitting the 3-hour job timeout, so we are currently unable to update our license database. This happens because every run of the job starts from the beginning and fetches all the required splits. By contrast, the Go feeder uses a cursor persisted in cloud storage to carry on from where it left off.
Failing Job
Proposed Solutions
- Persist the last round's splits
Implementation Plan 1
- Save the splits from the latest round to the cursor
- Instead of starting from "scratch", start from the last saved splits. Their offsets need to be fetched again, as those might have moved
- Sub-split as needed
- Persist the results of the new sub-splits
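The plan above can be sketched roughly as follows. This is a minimal illustration, not the feeder's actual code: the `Split` and `Cursor` types, their field names, and the `Resume` helper are all hypothetical, and a `bytes.Buffer` stands in for the cloud storage object that would hold the cursor.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Split is a hypothetical key range of the npm registry (assumed shape).
type Split struct {
	StartKey string `json:"start_key"`
	EndKey   string `json:"end_key"`
	Offset   int    `json:"offset"`
}

// Cursor holds the splits saved at the end of the previous round.
type Cursor struct {
	Splits []Split `json:"splits"`
}

// Save writes the cursor as JSON to w.
func (c Cursor) Save(w io.Writer) error {
	return json.NewEncoder(w).Encode(c)
}

// Load reads a cursor from r. Empty input yields an empty cursor,
// which corresponds to the very first run.
func Load(r io.Reader) (Cursor, error) {
	var c Cursor
	err := json.NewDecoder(r).Decode(&c)
	if errors.Is(err, io.EOF) {
		return Cursor{}, nil
	}
	return c, err
}

// Resume returns the splits to start the round from. Saved splits are reused,
// but their offsets are marked as unknown (-1) because they might have moved
// and need to be fetched again. Without saved splits, the round starts from
// the single initial split.
func Resume(c Cursor, initial Split) []Split {
	if len(c.Splits) == 0 {
		return []Split{initial}
	}
	out := make([]Split, len(c.Splits))
	for i, s := range c.Splits {
		s.Offset = -1 // stale: must be re-fetched before use
		out[i] = s
	}
	return out
}

func main() {
	var store bytes.Buffer // stand-in for the persisted cursor object
	prev := Cursor{Splits: []Split{
		{StartKey: "a", EndKey: "m", Offset: 120},
		{StartKey: "m", EndKey: "z", Offset: 340},
	}}
	if err := prev.Save(&store); err != nil {
		panic(err)
	}
	loaded, err := Load(&store)
	if err != nil {
		panic(err)
	}
	for _, s := range Resume(loaded, Split{StartKey: "", EndKey: "\uffff"}) {
		fmt.Printf("%q..%q offset=%d\n", s.StartKey, s.EndKey, s.Offset)
	}
}
```

Sub-splitting and persisting the new sub-splits would then build on the slice returned by `Resume`, ending the round with another call to `Save`.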
Please note: we worked through this solution but ran into an issue with rate-limiting. An extension to the solution was needed.
Implementation Plan 2
Iteration 1
- Create a new compute instance in ext-license-db-dev-d6ba6f35 in the us-east1 region
- Create an administrator account and set a password
- Set up CouchDB
- Create a new database called license-db-npm-mirror
- Start the replication process using https://replicate.npmjs.com/registry as the source and license-db-npm-mirror as the target. Select continuous as the type of replication
- Monitor progress and ensure it completes
- Create a disk image backup
- Set a custom hostname for the instance to avoid relying on an ephemeral IP address
- Perform some manual QA to validate that removing rate-limiting moves us closer to our goal
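The replication step above can also be started over HTTP instead of through the CouchDB web UI, by posting a document to the `_replicator` database. The sketch below only builds that request. The CouchDB address `http://localhost:5984`, the document ID, and the credentials are assumptions for illustration, and credentials that the replicator itself may need to write to the target are left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// replicationDoc mirrors the fields of a CouchDB _replicator document
// that this step needs.
type replicationDoc struct {
	ID         string `json:"_id"`
	Source     string `json:"source"`
	Target     string `json:"target"`
	Continuous bool   `json:"continuous"`
}

// newReplicationRequest builds a POST to <couchURL>/_replicator carrying doc,
// authenticated as the administrator account via basic authentication.
func newReplicationRequest(couchURL, user, password string, doc replicationDoc) (*http.Request, error) {
	body, err := json.Marshal(doc)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, couchURL+"/_replicator", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, password)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	doc := replicationDoc{
		ID:         "npm-mirror", // hypothetical document ID
		Source:     "https://replicate.npmjs.com/registry",
		Target:     "http://localhost:5984/license-db-npm-mirror", // assumed local address
		Continuous: true,
	}
	req, err := newReplicationRequest("http://localhost:5984", "admin", "change-me", doc)
	if err != nil {
		panic(err)
	}
	// Passing req to http.DefaultClient.Do would submit the replication job.
	fmt.Println(req.Method, req.URL)
}
```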
Iteration 2
- Submit "Use errgroup to re-fetch NPM split offsets" for review
- Make npm registry URL configurable in feeder and add support for basic authentication
- Make npm registry URL configurable in interfacer and add support for basic authentication
- Release feeder and interfacer
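A configurable registry URL with optional basic authentication might look something like the sketch below. It is not the actual feeder or interfacer code: the `RegistryConfig` type, the environment variable names, and the use of the `_all_docs` endpoint are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// RegistryConfig is a hypothetical configuration for the npm registry endpoint.
type RegistryConfig struct {
	BaseURL  string
	Username string // empty means no authentication
	Password string
}

// configFromEnv reads the configuration from assumed environment variables,
// falling back to the public replication endpoint when no URL is set.
func configFromEnv() RegistryConfig {
	cfg := RegistryConfig{
		BaseURL:  os.Getenv("NPM_REGISTRY_URL"),
		Username: os.Getenv("NPM_REGISTRY_USERNAME"),
		Password: os.Getenv("NPM_REGISTRY_PASSWORD"),
	}
	if cfg.BaseURL == "" {
		cfg.BaseURL = "https://replicate.npmjs.com/registry"
	}
	return cfg
}

// NewRequest builds a GET request for path relative to the configured base URL
// and attaches basic authentication only when a username is configured.
func (c RegistryConfig) NewRequest(path string) (*http.Request, error) {
	u, err := url.Parse(c.BaseURL)
	if err != nil {
		return nil, fmt.Errorf("parse registry url: %w", err)
	}
	req, err := http.NewRequest(http.MethodGet, u.JoinPath(path).String(), nil)
	if err != nil {
		return nil, err
	}
	if c.Username != "" {
		req.SetBasicAuth(c.Username, c.Password)
	}
	return req, nil
}

func main() {
	cfg := configFromEnv()
	req, err := cfg.NewRequest("_all_docs")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(req.Method, req.URL)
}
```

Keeping authentication optional lets the same code path talk to either the public registry or the private mirror, depending only on configuration.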
Iteration 3
- Review infrastructure changes
- Align with existing approaches to security
- Add terraform code to provision couchdb and associated infrastructure
Edited by Philip Cunningham