Model and table for google cloud logging integration (!119943) · Merge requests · GitLab.org / GitLab

Harsimar Sandhu requested to merge 409421-setup-model-and-table-for-google-cloud-logging-configs into master May 08, 2023

What does this MR do and why?

Model and table for google cloud logging integration

This commit introduces the model and table used to stream audit events to the Google Cloud Logging service. The table stores the configuration associated with IAM service accounts, which is used to generate access tokens.

Changelog: added

Migration logs

rake db:migrate:down:main VERSION=20230507192028
main: == [advisory_lock_connection] object_id: 275260, pg_backend_pid: 524
main: == 20230507192028 CreateAuditEventsGoogleCloudLoggingConfigurations: reverting
main: -- drop_table(:audit_events_google_cloud_logging_configurations)
main:    -> 0.0044s
main: == 20230507192028 CreateAuditEventsGoogleCloudLoggingConfigurations: reverted (0.0117s)

main: == [advisory_lock_connection] object_id: 275260, pg_backend_pid: 524

rake db:migrate:up:main VERSION=20230507192028
main: == [advisory_lock_connection] object_id: 275200, pg_backend_pid: 1230
main: == 20230507192028 CreateAuditEventsGoogleCloudLoggingConfigurations: migrating
main: -- create_table(:audit_events_google_cloud_logging_configurations)
main: -- quote_column_name(:google_project_id_name)
main:    -> 0.0000s
main: -- quote_column_name(:client_email)
main:    -> 0.0000s
main: -- quote_column_name(:log_id_name)
main:    -> 0.0000s
main:    -> 0.0316s
main: == 20230507192028 CreateAuditEventsGoogleCloudLoggingConfigurations: migrated (0.2503s)

main: == [advisory_lock_connection] object_id: 275200, pg_backend_pid: 1230

rake db:migrate:down:main VERSION=20230508074515
main: == [advisory_lock_connection] object_id: 275260, pg_backend_pid: 1995
main: == 20230508074515 AddGoogleCloudLoggingConfigurationLimitToPlanLimits: reverting
main: -- remove_column(:plan_limits, :google_cloud_logging_configurations, :integer, {:default=>5, :null=>false})
main:    -> 0.0083s
main: == 20230508074515 AddGoogleCloudLoggingConfigurationLimitToPlanLimits: reverted (0.0212s)

main: == [advisory_lock_connection] object_id: 275260, pg_backend_pid: 1995


rake db:migrate:up:main VERSION=20230508074515
main: == [advisory_lock_connection] object_id: 275200, pg_backend_pid: 2880
main: == 20230508074515 AddGoogleCloudLoggingConfigurationLimitToPlanLimits: migrating
main: -- add_column(:plan_limits, :google_cloud_logging_configurations, :integer, {:default=>5, :null=>false})
main:    -> 0.0108s
main: == 20230508074515 AddGoogleCloudLoggingConfigurationLimitToPlanLimits: migrated (0.0184s)

main: == [advisory_lock_connection] object_id: 275200, pg_backend_pid: 2880
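The second migration adds a non-null `google_cloud_logging_configurations` column to `plan_limits` with a default of 5. As a rough illustration of what such a limit could mean in practice, here is a minimal plain-Ruby sketch. It assumes the column caps how many configurations may exist per scope and that creation is refused once the count reaches the limit. The constant and method names are hypothetical and are not taken from this MR or from GitLab's code.

```ruby
# Default taken from the add_column call in the migration log above.
DEFAULT_GOOGLE_CLOUD_LOGGING_CONFIGURATION_LIMIT = 5

# Hypothetical helper: returns true when another configuration may be created.
# Assumption: the limit is an upper bound on the number of existing rows.
def can_add_configuration?(existing_count, limit: DEFAULT_GOOGLE_CLOUD_LOGGING_CONFIGURATION_LIMIT)
  existing_count < limit
end

puts can_add_configuration?(4) # one slot left under the default limit
puts can_add_configuration?(5) # default limit already reached
```

How the limit is actually enforced depends on the model validations, which the logs above do not show.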

Table preparation `audit_events_google_cloud_logging_configurations`

What is the anticipated growth for the new table over the next 3 months, 6 months, 1 year? What assumptions are these based on?

The growth of this table depends on how many users integrate with Google Cloud and use audit event streaming. For reference, the current growth rate of audit event streaming to external destinations is shown in this internal dashboard (search for audit_event_destinations): https://app.periscopedata.com/app/gitlab/663045/Govern:-Compliance-PM-Dashboard?widget=14870983&udv=1809512 I expect this table to stay smaller than the external destinations table.

How many reads and writes per hour would you expect this table to have in 3 months, 6 months, 1 year? Under what circumstances are rows updated? What assumptions are these based on?

This table is expected to have a low write frequency, because destinations are typically created once. The read frequency will be directly proportional to the number of jobs executed by the AuditEventStreamingWorker. Since every job has to check for available Google Cloud Logging configurations, we can anticipate around 60-70 read queries per second, based on the current Sidekiq completion rate of 60-70 jobs per second. See the internal dashboard: https://dashboards.gitlab.net/d/stage-groups-compliance/stage-groups-compliance-group-dashboard?orgId=1
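The question asks for reads per hour, while the estimate above is given per second. The conversion can be sketched as follows, assuming each AuditEventStreamingWorker job issues exactly one read against this table. That one-read-per-job figure is an assumption for illustration, and the method name is hypothetical.

```ruby
SECONDS_PER_HOUR = 60 * 60

# Converts a per-second job rate into an hourly read estimate.
# reads_per_job is an assumed value, not a measured one.
def reads_per_hour(jobs_per_second, reads_per_job: 1)
  jobs_per_second * reads_per_job * SECONDS_PER_HOUR
end

low  = reads_per_hour(60)
high = reads_per_hour(70)
puts "Estimated reads per hour: #{low} to #{high}"
```

Under these assumptions, 60-70 jobs per second corresponds to roughly 216,000 to 252,000 reads per hour.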

Based on the anticipated data volume and access patterns, does the new table pose an availability risk to GitLab.com or self-managed instances? Does the proposed design scale to support the needs of GitLab.com and self-managed customers?

The new table does not pose any availability risk to GitLab.com or self-managed instances. The proposed design is expected to scale and support the needs of both GitLab.com and self-managed customers.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Related to #409421 (closed)

Edited May 09, 2023 by Harsimar Sandhu
