Counterpart Request for GKG GA rollout
Request
What kind of support are you looking for?
- Feature Group Stable Counterpart
- Dedicated team member/resource
- Advice on structure/performance/etc.
What team?
- Handbook: https://internal.gitlab.com/handbook/engineering/r-and-d-pmo/programs/knowledge-graph-ga/
- Slack Channel: #f_knowledge_graph
- Engineering Manager: @michaelangeloio
- Label: groupknowledge (I think)
Describe the feature or ongoing work that needs assistance
The analytics section is working on getting Siphon set up in staging and production as part of the Knowledge Graph GA rollout. There's a lot of context available in the handbook page and there's a pretty aggressive GA date.
Expectations for participating member(s) of the database group in the target group/project
We will need assistance with best practices for setting up Siphon, and someone to work with the SRE to debug the current connectivity issues (see gitlab-com/gl-infra/production-engineering#28386).
We will also need some dedicated time to execute commands and verification checks on the Postgres staging cluster. The test plan for this is in "Initial Siphon test plan on staging" (gitlab-org/analytics-section/siphon#175 - closed). We can do most of the steps ourselves, but we will need someone to help with the teardown step of dropping the replication slot.
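As a rough illustration of the teardown step, the sketch below builds the SQL statements someone with access to the staging cluster might run to inspect and drop a replication slot. The slot name `siphon_staging_slot` is hypothetical (the real name would come from the Siphon configuration), and the actual teardown procedure is whatever the test plan specifies. `pg_replication_slots` and `pg_drop_replication_slot` are standard PostgreSQL.

```python
import re

# PostgreSQL replication slot names may contain only lowercase letters,
# digits, and underscores, and are limited to 63 characters.
_SLOT_NAME = re.compile(r"[a-z0-9_]{1,63}")


def teardown_statements(slot_name: str) -> list[str]:
    """Return SQL statements to inspect, drop, and verify removal of a slot.

    The statements are only built here, not executed; running them needs a
    session with replication privileges on the cluster.
    """
    if not _SLOT_NAME.fullmatch(slot_name):
        raise ValueError(f"invalid replication slot name: {slot_name!r}")
    return [
        # 1. Confirm the slot exists and is inactive (dropping an active slot fails).
        "SELECT slot_name, active, restart_lsn FROM pg_replication_slots "
        f"WHERE slot_name = '{slot_name}';",
        # 2. Drop the slot so it stops retaining WAL.
        f"SELECT pg_drop_replication_slot('{slot_name}');",
        # 3. Verify the slot is gone (expected count: 0).
        "SELECT count(*) FROM pg_replication_slots "
        f"WHERE slot_name = '{slot_name}';",
    ]


if __name__ == "__main__":
    # "siphon_staging_slot" is a placeholder name, not taken from the test plan.
    for statement in teardown_statements("siphon_staging_slot"):
        print(statement)
```

Dropping the slot matters because an unused slot keeps retaining WAL on the primary, which is the disk pressure the exit criteria aim to avoid.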
We'll also need someone to complete the readiness review (gitlab-com/gl-infra/readiness#120).
We're hoping to get Siphon set up in both staging and production by mid-March.
Expectations for participating member(s) of the database group in the database excellence group
Priorities for DBE member:
- Incident Response
- PG18 Upgrade Planning
- This project, time-boxed to 4h/week
Time commitment:
- Shouldn't be more than 4 hours per week.
- Expected work is mostly reviewing operational aspects of Siphon deployment
Exit Criteria
Siphon works in all environments and does not impact the databases.
Milestone 1:
- Siphon works on the GitLab.com staging environment with proper connectivity and doesn't affect the actual database workload
- Replication lag on the database remains under 10 minutes
- WAL file generation doesn't increase excessively
Milestone 2:
- Siphon works on the GitLab.com production environment with proper connectivity and doesn't affect the actual database workload
- Replication lag on the database remains under 10 minutes
- WAL file generation doesn't put pressure on disks
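The milestone checks above could be turned into simple pass/fail helpers. The sketch below is one possible shape, not an agreed procedure: the 10-minute lag limit comes from the criteria, while the WAL growth ratio of 1.5 is a placeholder assumption, since no numeric threshold has been agreed. The inputs are assumed to come from samples of `pg_current_wal_lsn()` taken some time apart and from a lag measurement such as `now() - pg_last_xact_replay_timestamp()` on a replica.

```python
MAX_LAG_SECONDS = 10 * 60  # "under 10 minutes" from the exit criteria


def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high and low 32 bits
    of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def wal_bytes_generated(start_lsn: str, end_lsn: str) -> int:
    """Bytes of WAL written between two sampled LSNs."""
    return lsn_to_bytes(end_lsn) - lsn_to_bytes(start_lsn)


def lag_ok(lag_seconds: float) -> bool:
    """True if the measured replication lag is under the 10-minute limit."""
    return lag_seconds < MAX_LAG_SECONDS


def wal_growth_ok(baseline_bytes: int, observed_bytes: int,
                  max_ratio: float = 1.5) -> bool:
    """True if WAL volume stays within max_ratio of the pre-Siphon baseline.

    max_ratio is an assumed placeholder; the real threshold needs agreement.
    """
    return observed_bytes <= baseline_bytes * max_ratio


if __name__ == "__main__":
    generated = wal_bytes_generated("16/B374D848", "16/B374D948")
    print(f"WAL generated between samples: {generated} bytes")
    print(f"Lag of 120s acceptable: {lag_ok(120)}")
```

Comparing WAL volume over equal-length windows before and after enabling Siphon keeps the baseline and the observation comparable.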
Checklist
Requesting Team
- The issue has a descriptive title
- There are detailed answers to the questions above
- The issue is assigned to the database team manager
- If this is urgent, reach out to the team manager in Slack
Database Team
- There is enough information to prioritize the request
- The request has been assigned to a member of the team
- The priority of the request has been agreed by the stakeholders and author