Complete docs for DBLab 4.1
# DBLab v4.1 Documentation Gap Analysis
## What's New in v4.1.0
Based on `git log v4.0.4..v4.1.0`, these are the significant documentation gaps introduced or exposed in DBLab 4.1.
## New Features / Changes: Status After Round 1
Re-reviewed against `postgres-ai/docs` `master` at `3a3400d`, including MR `!896` / commit `637e9b8`.
| Feature | Commit | Status after round 1 | Notes |
| --- | --- | --- | --- |
| Protection lease (time-limited clone protection) | `75e09d26` | Fixed in round 1 | `clone-protection.md` was rewritten and the config reference, API reference, and CLI reference now document lease-based protection. |
| Prometheus exporter (`/metrics` endpoint) | `09560329` | Partially fixed | There is now a dedicated Prometheus monitoring page and `/metrics` is covered in the API reference, but `database-lab-engine-components.md` still does not mention the metrics endpoint. |
| Teleport integration (`dblab teleport serve`) | `753d6f7c` | Fixed in round 1 | A dedicated Teleport integration howto exists, CLI docs cover `teleport serve`, and the new `pg_hba.conf` / SSL setup is documented. |
| Database rename (`databaseRename` option) | `0ff34b05` | Partially fixed | `databaseRename` is now documented in the configuration reference, but there is still no dedicated howto / usage note. |
| RDS/Aurora logical refresh component | `0a8fb735` | Partially fixed | A dedicated `rds-refresh.md` page now exists and is linked in the sidebar, but `rds.md` still does not link to or explain this workflow. |
| Sync WAL lag Prometheus metric | `6801c02c` | Fixed in round 1 | The Prometheus monitoring page documents `dblab_sync_wal_lag_seconds`. |
| Default branch set to `main` for clones | `53b94934` | Still missing | I still do not see docs explicitly calling out `main` as the default branch. |
| ARM64 / Colima support | `6d2965dc` | Still missing / unclear | There is an existing macOS / Colima guide, but round 1 did not update it and I do not see explicit coverage of the 4.1 ARM64 / Colima fixes. |
| Drop PostgreSQL 9.6 support | `c457a2d7` | Partially fixed | Main DBLab overview pages now say PostgreSQL 10+, but `docs/questions-and-answers.md` still says DBLab Engine supports 9.6+. |
| RDS IAM instance identifier in config projection | `762cda97` | Partially fixed | `dbInstanceIdentifier` is now documented in the configuration reference and RDS howto, but the config API projection aspect is still not documented explicitly. |
| Pool space indicator fix for shared ZFS pools | `c55f825b` | N/A | Bug fix; no standalone docs action needed. |
## Round 1 Re-review Summary
Fixed in round 1:
- Protection lease docs are now present in the clone protection guide, config reference, API reference, and CLI reference.
- Teleport integration is now documented end-to-end.
- Prometheus monitoring has a dedicated page and `/metrics` is now in the API reference.
- The RDS/Aurora refresh tool now has a dedicated howto page.
- The CLI reference now covers Teleport commands, protection-lease usage, and `snapshot delete --force`.
- Main DBLab overview pages now reflect PostgreSQL 10+ support.
Still missing or partial after round 1:
- `database-lab-engine-components.md` still does not mention `/metrics`.
- `databaseRename` still lacks a dedicated howto / usage note.
- `rds.md` still does not link to or explain the `rds-refresh` flow.
- I still do not see docs explicitly stating that `main` is the default branch.
- ARM64 / Colima fixes are still not clearly documented as part of the 4.1 work.
- `docs/questions-and-answers.md` still says DBLab Engine supports PostgreSQL 9.6+.
- Webhook docs list the new triggers, but the payload section is still vague and does not include a concrete schema / example payload.
- The `dbInstanceIdentifier` config API projection aspect is still not documented explicitly.
## Gap 1: Protection Lease Is Completely Missing
### What exists in code but not in docs
The engine now supports time-limited clone protection instead of just boolean on/off protection.
Configuration in the `cloning` section:
- `protectionLeaseDurationMinutes` (`uint`, default `1440`): default protection duration in minutes; `0` means infinite protection.
- `protectionMaxDurationMinutes` (`uint`, default `10080`): maximum allowed protection duration; `0` means no limit.
- `protectionExpiryWarningMinutes` (`uint`, default `1440`): warning webhook lead time before expiry.
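
The semantics listed above can be sketched in Python. This is a minimal illustration of the stated rules only, not the engine's implementation: the function names are hypothetical, and what the engine does with a request that exceeds the maximum (reject or clamp) is not modelled here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Defaults as listed above for the `cloning` section.
LEASE_DEFAULT_MINUTES = 1440
MAX_DURATION_MINUTES = 10080
EXPIRY_WARNING_MINUTES = 1440


def lease_expiry(start: datetime, lease_minutes: int) -> Optional[datetime]:
    """Return when protection ends, or None because 0 means infinite protection."""
    if lease_minutes == 0:
        return None
    return start + timedelta(minutes=lease_minutes)


def within_max(duration_minutes: int, max_minutes: int) -> bool:
    """Check a positive, finite duration against the maximum; 0 means no limit."""
    return max_minutes == 0 or duration_minutes <= max_minutes


def warning_time(expiry: Optional[datetime], warning_minutes: int) -> Optional[datetime]:
    """Return when the expiry warning is due, or None for infinite protection."""
    if expiry is None:
        return None
    return expiry - timedelta(minutes=warning_minutes)


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
expiry = lease_expiry(start, LEASE_DEFAULT_MINUTES)
print(expiry)                                   # 2025-01-02 00:00:00+00:00
print(within_max(20000, MAX_DURATION_MINUTES))  # False
```

With the defaults, the warning lead time equals the lease duration, so `warning_time(expiry, EXPIRY_WARNING_MINUTES)` falls on the moment protection starts. The docs may want to state whether that is intended.
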
API changes:
- `CreateClone` request: new field `protectionDurationMinutes` (`int64`)
- `UpdateClone` request: new field `protectionDurationMinutes` (`int64`)
- Clone response: new field `protectedTill` (`date-time`)
- `CloneMetadata`: new fields `protectionLeaseDurationMinutes`, `protectionMaxDurationMinutes`
- Cloning status: new fields `protectionLeaseDurationMinutes`, `protectionMaxDurationMinutes`
### What needs to be done
1. Add the 3 new `cloning` parameters to the configuration reference.
2. Update API reference / OpenAPI schemas: `dblab_server_swagger.yaml` is partially updated, but `dblab_openapi.yaml` is not.
3. Document `--protection-duration` for `clone create` and `clone update` in the CLI reference.
4. Rewrite `clone-protection.md`, which currently only covers boolean protection.
5. Mention protection duration in the create-clone howto.
## Gap 2: Prometheus Exporter Is Not in Published Docs
### What exists
There is a comprehensive `PROMETHEUS.md` in the engine repo root covering:
- 50+ metrics
- Prometheus configuration
- PromQL examples
- Alerting rules
- OpenTelemetry integration
### What's missing from the docs site
- No dedicated Prometheus / monitoring page for DBLab
- No mention of the `/metrics` endpoint in the API reference
- No howto for monitoring DBLab itself
- No Prometheus-related settings in the config reference
- `database-lab-engine-components.md` does not mention the metrics endpoint
### What needs to be done
- Create a new docs page, for example:
  - `docs/database-lab/monitoring.md`, or
  - `docs/dblab-howtos/administration/prometheus-monitoring.md`
- Add `/metrics` to the API reference
- Add the new page to the sidebar
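
A monitoring howto could include a small example of reading the `/metrics` output. The sketch below parses Prometheus text exposition format and picks out `dblab_sync_wal_lag_seconds`. The sample values, the HELP text, and the labelled metric are made up for illustration, and the regex is a simplification that does not handle every escaping rule of the format.

```python
import re
from typing import Dict, List, Tuple

# metric name, optional {labels}, then the sample value
_SAMPLE_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)")


def parse_metrics(text: str) -> Dict[str, List[Tuple[str, float]]]:
    """Map each metric name to a list of (raw label string, value) samples."""
    samples: Dict[str, List[Tuple[str, float]]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples.setdefault(name, []).append((labels or "", float(value)))
    return samples


# Illustrative payload; real output comes from the engine's /metrics endpoint.
SAMPLE = """\
# HELP dblab_sync_wal_lag_seconds Illustrative help text.
# TYPE dblab_sync_wal_lag_seconds gauge
dblab_sync_wal_lag_seconds 12.5
example_labeled_metric{pool="dblab_pool"} 3
"""

metrics = parse_metrics(SAMPLE)
print(metrics["dblab_sync_wal_lag_seconds"][0][1])  # 12.5
```

In practice a Prometheus server scrapes the endpoint directly. A parser like this is mainly useful for quick checks or smoke tests.
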
## Gap 3: Teleport Integration Is Not in Published Docs
### What exists
`engine/cmd/cli/commands/teleport/SETUP.md` already has:
- Architecture diagrams
- Prerequisites
- Config examples
- Troubleshooting
### What's missing
- No docs-site page for Teleport integration
- CLI reference does not mention `dblab teleport serve`
- No howto for setting up Teleport with DBLab
- The new `pg_hba.conf` `hostssl cert` rule in v4.1 is undocumented
### What needs to be done
- Create a Teleport integration howto page
- Add the `teleport` command to the CLI reference
- Mention the new default `pg_hba.conf` rules in release notes or config docs
## Gap 4: `databaseRename` Is Missing from Config Reference
### What exists in code
All 5 example config files include `databaseRename` for both logical and physical snapshot jobs. The README mentions it briefly.
### What's missing
- `database-lab-engine-configuration-reference.md` does not document `databaseRename` under either `logicalSnapshot` or `physicalSnapshot`
- No howto explaining when and why to use database rename
### What needs to be done
- Add `databaseRename` (key-value, optional) to both `logicalSnapshot` and `physicalSnapshot` sections in the config reference
- Consider adding a short howto or note in data-source docs
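
A usage note could also warn about mappings that cannot work. The sketch below checks a rename mapping for obvious problems. It assumes keys are original database names and values are the new names, which should be confirmed against the example configs. The function name is hypothetical and the engine may perform different checks.

```python
from collections import Counter
from typing import Dict, List


def check_rename_map(rename: Dict[str, str]) -> List[str]:
    """Return human-readable problems found in an old-name -> new-name mapping."""
    problems: List[str] = []
    for old, new in rename.items():
        if not old or not new:
            problems.append(f"empty name in mapping {old!r} -> {new!r}")
        elif old == new:
            problems.append(f"{old!r} is renamed to itself")
    # Two databases cannot end up with the same name.
    for target, count in Counter(rename.values()).items():
        if count > 1:
            problems.append(f"{count} databases are renamed to {target!r}")
    return problems


print(check_rename_map({"app_prod": "app_test"}))  # []
print(check_rename_map({"a": "x", "b": "x"}))      # ["2 databases are renamed to 'x'"]
```
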
## Gap 5: RDS/Aurora Logical Refresh Component Is Not in Published Docs
### What exists
The engine repo contains a complete `engine/cmd/rds-refresh/README.md`, an example config, and a Go implementation of a standalone tool that dumps from temporary RDS clones instead of production.
### What's missing
- No docs-site page for the RDS/Aurora refresh tool
- The RDS data-source howto (`rds.md`) does not mention this alternative approach
- No configuration reference for the `rds-refresh` tool
### What needs to be done
- Create a new howto or data-source page for RDS/Aurora refresh
- Link to it from existing `rds.md`
- Document configuration, IAM policies, and scheduling
## Gap 6: OpenAPI Specs Are Outdated and Inconsistent
### Two spec files with different coverage
1. `dblab_server_swagger.yaml`
   - Version says `3.5.0`
   - Has protection lease fields partially
   - Missing branch endpoints:
     - `/branches`
     - `/branch`
     - `/branch/{name}`
     - `/branch/{name}/log`
     - `/branch/snapshot`
   - Missing:
     - `/full-refresh`
     - `/clones` (list)
     - `/snapshot` (create)
     - `/snapshot/{id}` (delete)
2. `dblab_openapi.yaml`
   - Version says `4.0.0`
   - Has branch endpoints
   - Missing protection lease fields:
     - `protectedTill`
     - `protectionDurationMinutes`
     - `protectionLeaseDurationMinutes`
     - `protectionMaxDurationMinutes`
   - Missing `/metrics`
   - Missing `databaseRename` mentions
   - Still has `#TODO` comments for PATCH clone `404` response
### Neither spec documents
- `/metrics`
- Protection lease fields comprehensively
- `clone_delete` webhook trigger
### What needs to be done
- Update `dblab_openapi.yaml` to version `4.1.0`
- Add protection lease fields to all relevant schemas
- Add `/metrics`
- Merge missing endpoints from `dblab_server_swagger.yaml` into `dblab_openapi.yaml` (or consolidate the two files)
- Remove the stale TODOs
- Update the API reference page to point to v4.1 docs on ReadMe.io
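
Before merging or consolidating the two files, it may help to diff their operations mechanically. The sketch below compares two already-loaded spec dictionaries. Loading the YAML (for example with PyYAML's `yaml.safe_load`) is left out to keep it self-contained. The sample specs and their HTTP methods are illustrative, not the real file contents.

```python
from typing import List, Set, Tuple

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations(spec: dict) -> Set[Tuple[str, str]]:
    """Collect (METHOD, path) pairs from the `paths` object of an OpenAPI document."""
    ops: Set[Tuple[str, str]] = set()
    for path, item in (spec.get("paths") or {}).items():
        for key in (item or {}):
            if key.lower() in HTTP_METHODS:  # skip `parameters`, `summary`, etc.
                ops.add((key.upper(), path))
    return ops


def missing_operations(source: dict, target: dict) -> List[Tuple[str, str]]:
    """Operations present in `source` but absent from `target`, sorted by path."""
    return sorted(operations(source) - operations(target), key=lambda op: (op[1], op[0]))


# Illustrative stand-ins for the two spec files.
spec_a = {"paths": {"/status": {"get": {}}, "/clone": {"post": {}}}}
spec_b = {"paths": {"/status": {"get": {}}, "/branches": {"get": {}}}}

print(missing_operations(spec_b, spec_a))  # [('GET', '/branches')]
print(missing_operations(spec_a, spec_b))  # [('POST', '/clone')]
```

Comparing at the method level rather than by path alone matters here, because gaps such as `/clones` (list) and `/snapshot` (create) are specific operations.
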
## Gap 7: API Reference Page Is Extremely Thin
### Current state
`database-lab-engine-api-reference.md` is only about 23 lines long and links only to the ReadMe.io references for 3.5.x and 4.0.x. There is no 4.1.x link and no inline summary.
### What needs to be done
- Add a link for v4.1.0 API reference
- Consider adding a summary table of endpoints inline
- Mention `/metrics` as an unauthenticated endpoint, similar to `/healthz`
## Gap 8: CLI Reference Is Missing Commands and Flags
### Missing from CLI reference
- `teleport` command and its `serve` subcommand
- `clone create --protection-duration`
- `clone update --protection-duration`
- `branch`, `switch`, `commit`, and `log` in the help-subcommand section (the help overview lists them, but that section does not)
- `snapshot delete --force`
## Gap 9: Config Reference Is Missing Several Parameters
### Missing parameters
- `cloning.protectionLeaseDurationMinutes`
- `cloning.protectionMaxDurationMinutes`
- `cloning.protectionExpiryWarningMinutes`
- `logicalSnapshot.databaseRename`
- `physicalSnapshot.databaseRename`
- `logicalDump.source.rdsIam.dbInstanceIdentifier` is present in code but not documented as projectable via config API in v4.1
- Links to example configs still point to v4.0.3 and should point to v4.1.0 or `master`
## Gap 10: Howto Docs Have Several Gaps
| Howto / Area | Issue |
| --- | --- |
| `clone-protection.md` | Only covers boolean protection; no mention of time-limited protection leases |
| `data/rds.md` | Uses `postgresai/dblab-server:4.0.3`; no mention of RDS/Aurora refresh tool or `rdsIam.dbInstanceIdentifier` projection |
| Admin data-source howtos | Docker image tags still reference `4.0.3`; should be `4.1.0` |
| Monitoring / observability | No howto for monitoring DBLab itself |
| Teleport integration | No howto exists |
| GitHub Actions integration | README still says `TBD` |
| GitLab CI/CD integration | README still says `TBD` |
| Database rename | No howto for `databaseRename` |
| ARM64 / Colima / Mac | `run-database-lab-on-mac.md` may not cover ARM64-related fixes |
## Gap 11: Supported Databases Page Is Outdated
The docs overview page (`database-lab/index.md:73`) says DBLab supports PostgreSQL starting from 9.6. DBLab 4.1 dropped PostgreSQL 9.6 support, and the docs should align with the README, which says PostgreSQL 10-18.
## Gap 12: `README.md` Still Has `TBD` Links
In the engine README:
- `How to work with branches` points to `XXXXXXX`
- `How to integrate DBLab with GitHub Actions` points to `XXXXXXX`
- `How to integrate DBLab with GitLab CI/CD` points to `XXXXXXX`
The branching howtos now exist in `docs/dblab-howtos/branching/`, so that README link should be fixed. The CI/CD integration howtos are still missing.
## Gap 13: Version References Throughout the Docs Are Stale
Examples found so far:
- Config reference links to v4.0.3 example configs
- RDS howto uses `postgresai/dblab-server:4.0.3`
- API reference links to v3.5.0 and v4.0.0 on ReadMe.io
- OpenAPI spec still says `4.0.0`
- Demo server description says "DBLab 4.0 demo server"
- Overview page references a DBLab 3.x demo server at `demo.aws.postgres.ai:446`
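
Stale references like these can be found mechanically. The sketch below scans text for a given set of old version strings and reports line numbers. The function name and the sample text are illustrative. The lookarounds keep it from matching inside longer version numbers such as `14.0.3`.

```python
import re
from typing import Iterable, List, Tuple


def find_stale_versions(text: str, stale: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, version) for each stale version string found in text."""
    versions = sorted(set(stale), key=len, reverse=True)
    if not versions:
        return []
    alternatives = "|".join(re.escape(v) for v in versions)
    # Not preceded by a digit or dot, and not followed by more version digits.
    pattern = re.compile(rf"(?<![\d.])(?:{alternatives})(?!\.?\d)")
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.group(0)))
    return hits


sample = (
    "image: postgresai/dblab-server:4.0.3\n"
    "PostgreSQL 14.0.3 is unrelated\n"
    "DBLab 4.1.0 release\n"
    "see v4.0.0.\n"
)
print(find_stale_versions(sample, ["4.0.3", "4.0.0", "3.5.0"]))
# [(1, '4.0.3'), (4, '4.0.0')]
```

Running a check like this over the docs tree before each release would catch most of the items listed above. Phrases such as "DBLab 4.0 demo server" still need a manual pass.
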
## Gap 14: Webhook Configuration Docs Are Incomplete
The config reference documents `clone_create` and `clone_reset`, but Teleport integration also uses `clone_delete`, and that trigger is not documented.
The payload-format section is also vague and should include the actual schema or at least a concrete example payload.
## Proposed Documentation Work
### High priority
- Document protection leases end-to-end
- Publish monitoring / Prometheus docs
- Publish Teleport integration docs
- Update OpenAPI / API reference to 4.1.0
- Fix supported PostgreSQL version references
- Add missing config parameters
### Cleanup and consistency work
- Update stale version numbers across docs
- Replace README `TBD` links where target docs already exist
- Add missing CLI flags and commands
- Document webhook triggers and payloads more concretely