Complete docs for DBLab 4.1
# DBLab v4.1 Documentation Gap Analysis

## What's New in v4.1.0

Based on `git log v4.0.4..v4.1.0`, these are the significant documentation gaps introduced or exposed in DBLab 4.1.

## New Features / Changes: Status After Round 1

Re-reviewed against `postgres-ai/docs` `master` at `3a3400d`, including MR `!896` / commit `637e9b8`.

| Feature | Commit | Status after round 1 | Notes |
| --- | --- | --- | --- |
| Protection lease (time-limited clone protection) | `75e09d26` | Fixed in round 1 | `clone-protection.md` was rewritten and the config reference, API reference, and CLI reference now document lease-based protection. |
| Prometheus exporter (`/metrics` endpoint) | `09560329` | Partially fixed | There is now a dedicated Prometheus monitoring page and `/metrics` is covered in the API reference, but `database-lab-engine-components.md` still does not mention the metrics endpoint. |
| Teleport integration (`dblab teleport serve`) | `753d6f7c` | Fixed in round 1 | A dedicated Teleport integration howto exists, CLI docs cover `teleport serve`, and the new `pg_hba.conf` / SSL setup is documented. |
| Database rename (`databaseRename` option) | `0ff34b05` | Partially fixed | `databaseRename` is now documented in the configuration reference, but there is still no dedicated howto / usage note. |
| RDS/Aurora logical refresh component | `0a8fb735` | Partially fixed | A dedicated `rds-refresh.md` page now exists and is linked in the sidebar, but `rds.md` still does not link to or explain this workflow. |
| Sync WAL lag Prometheus metric | `6801c02c` | Fixed in round 1 | The Prometheus monitoring page documents `dblab_sync_wal_lag_seconds`. |
| Default branch set to `main` for clones | `53b94934` | Still missing | I still do not see docs explicitly calling out `main` as the default branch. |
| ARM64 / Colima support | `6d2965dc` | Still missing / unclear | There is an existing macOS / Colima guide, but round 1 did not update it and I do not see explicit coverage of the 4.1 ARM64 / Colima fixes. |
| Drop PostgreSQL 9.6 support | `c457a2d7` | Partially fixed | Main DBLab overview pages now say PostgreSQL 10+, but `docs/questions-and-answers.md` still says DBLab Engine supports 9.6+. |
| RDS IAM instance identifier in config projection | `762cda97` | Partially fixed | `dbInstanceIdentifier` is now documented in the configuration reference and RDS howto, but the config API projection aspect is still not documented explicitly. |
| Pool space indicator fix for shared ZFS pools | `c55f825b` | N/A | Bug fix; no standalone docs action needed. |

## Round 1 Re-review Summary

Fixed in round 1:

- Protection lease docs are now present in the clone protection guide, config reference, API reference, and CLI reference.
- Teleport integration is now documented end-to-end.
- Prometheus monitoring has a dedicated page and `/metrics` is now in the API reference.
- The RDS/Aurora refresh tool now has a dedicated howto page.
- The CLI reference now covers Teleport commands, protection-lease usage, and `snapshot delete --force`.
- Main DBLab overview pages now reflect PostgreSQL 10+ support.

Still missing or partial after round 1:

- `database-lab-engine-components.md` still does not mention `/metrics`.
- `databaseRename` still lacks a dedicated howto / usage note.
- `rds.md` still does not link to or explain the `rds-refresh` flow.
- I still do not see docs explicitly stating that `main` is the default branch.
- ARM64 / Colima fixes are still not clearly documented as part of the 4.1 work.
- `docs/questions-and-answers.md` still says DBLab Engine supports PostgreSQL 9.6+.
- Webhook docs list the new triggers, but the payload section is still vague and does not include a concrete schema / example payload.
- The `dbInstanceIdentifier` config API projection aspect is still not documented explicitly.

## Gap 1: Protection Lease Is Completely Missing

### What exists in code but not in docs

The engine now supports time-limited clone protection instead of just boolean on/off protection.

Configuration in the `cloning` section:

- `protectionLeaseDurationMinutes` (`uint`, default `1440`): default protection duration in minutes; `0` means infinite protection.
- `protectionMaxDurationMinutes` (`uint`, default `10080`): maximum allowed protection duration; `0` means no limit.
- `protectionExpiryWarningMinutes` (`uint`, default `1440`): warning webhook lead time before expiry.

API changes:

- `CreateClone` request: new field `protectionDurationMinutes` (`int64`)
- `UpdateClone` request: new field `protectionDurationMinutes` (`int64`)
- Clone response: new field `protectedTill` (`date-time`)
- `CloneMetadata`: new fields `protectionLeaseDurationMinutes`, `protectionMaxDurationMinutes`
- Cloning status: new fields `protectionLeaseDurationMinutes`, `protectionMaxDurationMinutes`

### What needs to be done

1. Add the 3 new `cloning` parameters to the configuration reference.
2. Update API reference / OpenAPI schemas. `dblab_server_swagger.yaml` is partially updated, but `dblab_openapi.yaml` is not.
3. Document `--protection-duration` for `clone create` and `clone update` in the CLI reference.
4. Rewrite `clone-protection.md`, which currently only covers boolean protection.
5. Mention protection duration in the create-clone howto.

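
When rewriting `clone-protection.md`, a small worked example of how the three duration values interact may help readers. The following is a minimal sketch in Python, not the engine's implementation. The helper names `resolve_protection_minutes` and `protected_till` are hypothetical. The fallback to the lease default, the clamping of oversized or infinite requests to the maximum, and the use of `None` for infinite protection are all assumptions that should be checked against the engine code before publishing.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Defaults quoted in this analysis for the `cloning` section.
DEFAULT_LEASE_MINUTES = 1440   # protectionLeaseDurationMinutes; 0 = infinite
DEFAULT_MAX_MINUTES = 10080    # protectionMaxDurationMinutes; 0 = no limit


def resolve_protection_minutes(
    requested: Optional[int],
    lease_minutes: int = DEFAULT_LEASE_MINUTES,
    max_minutes: int = DEFAULT_MAX_MINUTES,
) -> int:
    """Return the effective protection duration in minutes (0 = infinite).

    ASSUMPTION: an omitted request falls back to the lease default, and a
    request above the maximum is clamped. The engine may reject it instead.
    """
    minutes = lease_minutes if requested is None else requested
    if minutes < 0:
        raise ValueError("protection duration must not be negative")
    if max_minutes == 0:
        return minutes  # no upper limit configured
    if minutes == 0 or minutes > max_minutes:
        return max_minutes  # infinite or oversized requests hit the cap
    return minutes


def protected_till(now: datetime, minutes: int) -> Optional[datetime]:
    """Compute a `protectedTill`-style timestamp.

    ASSUMPTION: None models infinite protection (minutes == 0).
    """
    if minutes == 0:
        return None
    return now + timedelta(minutes=minutes)


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    effective = resolve_protection_minutes(None)
    print(effective, protected_till(now, effective))
```

If the engine turns out to reject out-of-range requests rather than clamp them, the docs example should show the error response instead.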
## Gap 2: Prometheus Exporter Is Not in Published Docs

### What exists

There is a comprehensive `PROMETHEUS.md` in the engine repo root covering:

- 50+ metrics
- Prometheus configuration
- PromQL examples
- Alerting rules
- OpenTelemetry integration

### What's missing from the docs site

- No dedicated Prometheus / monitoring page for DBLab
- No mention of the `/metrics` endpoint in the API reference
- No howto for monitoring DBLab itself
- No Prometheus-related settings in the config reference
- `database-lab-engine-components.md` does not mention the metrics endpoint

### What needs to be done

- Create a new docs page, for example:
  - `docs/database-lab/monitoring.md`, or
  - `docs/dblab-howtos/administration/prometheus-monitoring.md`
- Add `/metrics` to the API reference
- Add the new page to the sidebar

## Gap 3: Teleport Integration Is Not in Published Docs

### What exists

`engine/cmd/cli/commands/teleport/SETUP.md` already has:

- Architecture diagrams
- Prerequisites
- Config examples
- Troubleshooting

### What's missing

- No docs-site page for Teleport integration
- CLI reference does not mention `dblab teleport serve`
- No howto for setting up Teleport with DBLab
- The new `pg_hba.conf` `hostssl cert` rule in v4.1 is undocumented

### What needs to be done

- Create a Teleport integration howto page
- Add the `teleport` command to the CLI reference
- Mention the new default `pg_hba.conf` rules in release notes or config docs

## Gap 4: `databaseRename` Is Missing from Config Reference

### What exists in code

All 5 example config files include `databaseRename` for both logical and physical snapshot jobs. The README mentions it briefly.

### What's missing

- `database-lab-engine-configuration-reference.md` does not document `databaseRename` under either `logicalSnapshot` or `physicalSnapshot`
- No howto explaining when and why to use database rename

### What needs to be done

- Add `databaseRename` (key-value, optional) to both `logicalSnapshot` and `physicalSnapshot` sections in the config reference
- Consider adding a short howto or note in data-source docs

## Gap 5: RDS/Aurora Logical Refresh Component Is Not in Published Docs

### What exists

There is a complete `engine/cmd/rds-refresh/README.md`, example config, and Go implementation for a standalone tool that dumps from temporary RDS clones instead of production.

### What's missing

- No docs-site page for the RDS/Aurora refresh tool
- The RDS data-source howto (`rds.md`) does not mention this alternative approach
- No configuration reference for the `rds-refresh` tool

### What needs to be done

- Create a new howto or data-source page for RDS/Aurora refresh
- Link to it from existing `rds.md`
- Document configuration, IAM policies, and scheduling

## Gap 6: OpenAPI Specs Are Outdated and Inconsistent

### Two spec files with different coverage

1. `dblab_server_swagger.yaml`
   - Version says `3.5.0`
   - Has protection lease fields partially
   - Missing branch endpoints:
     - `/branches`
     - `/branch`
     - `/branch/{name}`
     - `/branch/{name}/log`
     - `/branch/snapshot`
   - Missing:
     - `/full-refresh`
     - `/clones` (list)
     - `/snapshot` (create)
     - `/snapshot/{id}` (delete)
2. `dblab_openapi.yaml`
   - Version says `4.0.0`
   - Has branch endpoints
   - Missing protection lease fields:
     - `protectedTill`
     - `protectionDurationMinutes`
     - `protectionLeaseDurationMinutes`
     - `protectionMaxDurationMinutes`
   - Missing `/metrics`
   - Missing `databaseRename` mentions
   - Still has `#TODO` comments for PATCH clone `404` response

### Neither spec documents

- `/metrics`
- Protection lease fields comprehensively
- `clone_delete` webhook trigger

### What needs to be done

- Update `dblab_openapi.yaml` to version `4.1.0`
- Add protection lease fields to all relevant schemas
- Add `/metrics`
- Merge missing endpoints from `dblab_server_swagger.yaml` into `dblab_openapi.yaml` (or consolidate the two files)
- Remove the stale TODOs
- Update the API reference page to point to v4.1 docs on ReadMe.io

## Gap 7: API Reference Page Is Extremely Thin

### Current state

`database-lab-engine-api-reference.md` is only ~23 lines long and only links to ReadMe.io for 3.5.x and 4.0.x. There is no 4.1.x link and no inline summary.

### What needs to be done

- Add a link for v4.1.0 API reference
- Consider adding a summary table of endpoints inline
- Mention `/metrics` as an unauthenticated endpoint, similar to `/healthz`

## Gap 8: CLI Reference Is Missing Commands and Flags

### Missing from CLI reference

- `teleport` command and its `serve` subcommand
- `clone create --protection-duration`
- `clone update --protection-duration`
- The help overview lists `branch`, `switch`, `commit`, and `log`, but the help-subcommand section does not
- `snapshot delete --force`

## Gap 9: Config Reference Is Missing Several Parameters

### Missing parameters

- `cloning.protectionLeaseDurationMinutes`
- `cloning.protectionMaxDurationMinutes`
- `cloning.protectionExpiryWarningMinutes`
- `logicalSnapshot.databaseRename`
- `physicalSnapshot.databaseRename`
- `logicalDump.source.rdsIam.dbInstanceIdentifier` is present in code but not documented as projectable via config API in v4.1
- Links to example configs still point to v4.0.3 and should point to v4.1.0 or `master`

## Gap 10: Howto Docs Have Several Gaps

| Howto / Area | Issue |
| --- | --- |
| `clone-protection.md` | Only covers boolean protection; no mention of time-limited protection leases |
| `data/rds.md` | Uses `postgresai/dblab-server:4.0.3`; no mention of RDS/Aurora refresh tool or `rdsIam.dbInstanceIdentifier` projection |
| Admin data-source howtos | Docker image tags still reference `4.0.3`; should be `4.1.0` |
| Monitoring / observability | No howto for monitoring DBLab itself |
| Teleport integration | No howto exists |
| GitHub Actions integration | README still says `TBD` |
| GitLab CI/CD integration | README still says `TBD` |
| Database rename | No howto for `databaseRename` |
| ARM64 / Colima / Mac | `run-database-lab-on-mac.md` may not cover ARM64-related fixes |

## Gap 11: Supported Databases Page Is Outdated

The docs overview page (`database-lab/index.md:73`) says DBLab supports PostgreSQL starting from 9.6. DBLab 4.1 dropped PostgreSQL 9.6 support, and the docs should align with the README, which says PostgreSQL 10-18.

## Gap 12: `README.md` Still Has `TBD` Links

In the engine README:

- `How to work with branches` points to `XXXXXXX`
- `How to integrate DBLab with GitHub Actions` points to `XXXXXXX`
- `How to integrate DBLab with GitLab CI/CD` points to `XXXXXXX`

The branching howtos now exist in `docs/dblab-howtos/branching/`, so that README link should be fixed. The CI/CD integration howtos are still missing.

## Gap 13: Version References Throughout the Docs Are Stale

Examples found so far:

- Config reference links to v4.0.3 example configs
- RDS howto uses `postgresai/dblab-server:4.0.3`
- API reference links to v3.5.0 and v4.0.0 on ReadMe.io
- OpenAPI spec still says `4.0.0`
- Demo server description says "DBLab 4.0 demo server"
- Overview page references a DBLab 3.x demo server at `demo.aws.postgres.ai:446`

## Gap 14: Webhook Configuration Docs Are Incomplete

The config reference documents `clone_create` and `clone_reset`, but Teleport integration also uses `clone_delete`, and that trigger is not documented. The payload-format section is also vague and should include the actual schema or at least a concrete example payload.

## Proposed Documentation Work

### High priority

- Document protection leases end-to-end
- Publish monitoring / Prometheus docs
- Publish Teleport integration docs
- Update OpenAPI / API reference to 4.1.0
- Fix supported PostgreSQL version references
- Add missing config parameters

### Cleanup and consistency work

- Update stale version numbers across docs
- Replace README `TBD` links where target docs already exist
- Add missing CLI flags and commands
- Document webhook triggers and payloads more concretely

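
The stale-version cleanup could be partly automated. The sketch below is a rough heuristic in Python, not an existing tool. The helper name `find_stale_versions` is hypothetical, and the assumption that any three-part version older than `4.1.0` is stale will produce false positives (for example, versions of unrelated software or intentional links to older release notes), so hits still need manual review.

```python
import re
from typing import Dict, List

# Matches three-part version strings such as `4.0.3`, `v4.0.3`, or the tag in
# `postgresai/dblab-server:4.0.3`. The lookarounds skip longer dotted runs
# such as IP addresses.
VERSION_RE = re.compile(r"(?<![\d.])v?(\d+)\.(\d+)\.(\d+)(?![\d.])")


def find_stale_versions(text: str, current: str = "4.1.0") -> List[Dict[str, object]]:
    """Return one record per version string in `text` older than `current`.

    ASSUMPTION: every three-part version in the scanned text refers to DBLab.
    """
    current_key = tuple(int(part) for part in current.split("."))
    hits: List[Dict[str, object]] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in VERSION_RE.finditer(line):
            found = tuple(int(group) for group in match.groups())
            if found < current_key:
                hits.append(
                    {"line": line_no, "version": match.group(0), "text": line.strip()}
                )
    return hits


if __name__ == "__main__":
    sample = (
        "image: postgresai/dblab-server:4.0.3\n"
        "see the v4.1.0 example configs\n"
        "API reference for v3.5.0\n"
    )
    for hit in find_stale_versions(sample):
        print(hit["line"], hit["version"])
```

Running it over each Markdown file in the docs repository would give a starting checklist for the version-reference gap. Two-part references such as "DBLab 4.0 demo server" are not matched by this pattern and would need a separate search.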