Consolidation and Categorization of Audit Event Types
Problem to solve
Audit events are a tool for owners and administrators to track user-driven or system-driven changes.
We currently allow events to be filtered by dates, `entity_type`, and `entity_id`.
While this is helpful, there is no easy way for administrators to retrieve targeted audit events, such as: `In the past month, how many users were added to Project A?` Take another example, a data breach in which a particular user account was compromised. It is currently not possible to fetch login events for that user in the past 24 hours: `Fetch all login attempts for User X in the last 24 hours`.
The data that can help retrieve specific results is currently stored as free-form text in the database, such as: `Changed number of required approvals from 0 to 1`, `Repository Download Started`, `Removed user access via system job. Reason: access expired on 2020-01-24`.
To make the audit logs more meaningful and retrievable, we need to categorize or label audit event types based on the action being performed.
Proposal
List of action types
action_type |
---|
create_deploy_key |
add_email |
add_group |
add_project |
add_project_access |
add_protected_branch |
add_ssh_key |
add_user_access |
archive_project |
change_prevent_merge_request_approval_reviewers |
change_access_level |
change_builds_access_level |
change_forking_access_level |
change_issues_access_level |
change_merge_request_approval_from_author |
change_merge_requests_access_level |
change_metrics_dashboard_access_level |
change_number_of_required_approvals |
change_pages_access_level |
change_password |
change_project_name |
change_project_namespace |
change_project_path |
change_project_visibility |
change_release |
change_release_milestone |
change_repository_access_level |
change_snippets_access_level |
change_username |
change_wiki_access_level |
create_release |
create_release_ios_daily_build |
create_release_job |
grant_oauth_access |
login |
login_failed |
login_with_bitbucket |
login_with_github |
login_with_github_failed |
login_with_google_oauth2 |
login_with_google_oauth2_failed |
login_with_group_saml |
login_with_salesforce |
login_with_two_factor |
login_with_two_factor_via_u2f_device |
mark_group_for_deletion |
mark_project_for_deletion |
remove_email |
remove_project |
remove_protected_branch |
remove_user_access |
request_password_reset |
start_repository_download |
start_export_file_download |
We can break the efforts into the following 4 stages:
Stage 1: Capture `action_type`
- Create a column `action_type` in `audit_events`. We have an option to store known `action_type` values as an enum or in a reference table. To facilitate easier addition, let's start with an enum.
- Ensure an appropriate index is created for `action_type`. We will enable search on this field.
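As a rough illustration of Stage 1, the sketch below models the new column and its index using Python's built-in `sqlite3` module. It is not the actual migration: the integer codes in `ActionType`, the simplified table layout, the index name, and the sample row are all assumptions made for the example. Only the `action_type` names come from the list above.

```python
import sqlite3
from enum import IntEnum


class ActionType(IntEnum):
    """Hypothetical integer codes; only the names come from the list above."""
    ADD_PROJECT = 1
    LOGIN = 2
    LOGIN_FAILED = 3
    REMOVE_USER_ACCESS = 4


def create_schema(conn: sqlite3.Connection) -> None:
    # Simplified stand-in for the real audit_events table (assumed layout).
    conn.execute(
        """
        CREATE TABLE audit_events (
            id INTEGER PRIMARY KEY,
            entity_type TEXT NOT NULL,
            entity_id INTEGER NOT NULL,
            details TEXT,
            action_type INTEGER
        )
        """
    )
    # Index so that filtering on action_type does not scan the whole table.
    conn.execute(
        "CREATE INDEX index_audit_events_on_action_type "
        "ON audit_events (action_type)"
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Placeholder row: action_type stays NULL for events not yet categorised.
conn.execute(
    "INSERT INTO audit_events (entity_type, entity_id, details, action_type) "
    "VALUES (?, ?, ?, ?)",
    ("User", 42, "Signed in", int(ActionType.LOGIN)),
)

rows = conn.execute(
    "SELECT entity_id FROM audit_events WHERE action_type = ?",
    (int(ActionType.LOGIN),),
).fetchall()
```

Storing the enum as an integer keeps the column compact, while adding a new action type only requires a new enum member rather than a schema change.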
Stage 2: Save `action_type` for new audit events
- Modify all write services to write to the new column (i.e. `AuditEventService` and `Auditor`)
- Add a new field to the development doc
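One way to picture the write-path change is a function that takes `action_type` as a required argument and rejects values outside the known set. The sketch below is a hypothetical Python model, not the code of `AuditEventService` or `Auditor`. The names `record_audit_event`, `AuditEvent`, and `KNOWN_ACTION_TYPES` are invented for illustration, and the in-memory list stands in for the database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

# A few values taken from the list of action types; not exhaustive.
KNOWN_ACTION_TYPES = frozenset(
    {"add_project", "login", "login_failed", "remove_user_access"}
)


@dataclass
class AuditEvent:
    entity_type: str
    entity_id: int
    details: str
    action_type: str
    created_at: datetime


def record_audit_event(
    store: List[AuditEvent],
    entity_type: str,
    entity_id: int,
    details: str,
    action_type: str,
) -> AuditEvent:
    """Hypothetical write path: validate action_type, then persist the event."""
    if action_type not in KNOWN_ACTION_TYPES:
        raise ValueError(f"unknown action_type: {action_type!r}")
    event = AuditEvent(
        entity_type, entity_id, details, action_type, datetime.now(timezone.utc)
    )
    store.append(event)  # stand-in for writing to the new column
    return event
```

Rejecting unknown values at write time keeps free-form text out of the new column, which is the property the later filter depends on.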
Stage 3: Categorise existing audit events
- Examine existing audit events and start categorizing them
- Perform data migration to populate `action_type` in batches. May require input from specific teams
Note: Stage 2 and 3 can be executed in parallel.
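The categorisation in Stage 3 could start from pattern matching on the existing free-form messages. The following Python sketch uses the three sample messages quoted in the problem statement. The regular expressions, the `classify` and `backfill` helpers, and the dictionary-based rows are assumptions for illustration, and a real migration would run against the database in batches rather than over an in-memory list.

```python
import re
from typing import Dict, Iterable, Iterator, List, Optional

# Hypothetical patterns, checked in order; each maps a message to an action type.
PATTERNS = [
    (
        re.compile(r"^Changed number of required approvals from \d+ to \d+$"),
        "change_number_of_required_approvals",
    ),
    (re.compile(r"^Repository Download Started$"), "start_repository_download"),
    (re.compile(r"^Removed user access\b"), "remove_user_access"),
]


def classify(message: str) -> Optional[str]:
    """Return the first matching action type, or None if nothing matches."""
    for pattern, action_type in PATTERNS:
        if pattern.search(message):
            return action_type
    return None


def in_batches(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield rows in lists of at most `size` items."""
    batch: List[Dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def backfill(rows: Iterable[Dict], batch_size: int = 1000) -> List[int]:
    """Populate action_type in place; return ids that need manual review."""
    unmatched: List[int] = []
    for batch in in_batches(rows, batch_size):
        for row in batch:
            if row["action_type"] is not None:
                continue  # already categorised, e.g. written after Stage 2
            action_type = classify(row["details"])
            if action_type is None:
                unmatched.append(row["id"])
            else:
                row["action_type"] = action_type
    return unmatched
```

The list of unmatched ids is where input from specific teams would come in: those messages need a new pattern or a new action type before they can be populated.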
Stage 4: Expose filter by `action_type` in Audit event dashboard and API
Once we have bucketed data, we can allow administrators to filter audit events based on the `action_type`.
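To make the target concrete, the request from the problem statement, "Fetch all login attempts for User X in the last 24 hours", becomes a simple filter once `action_type` is populated. The Python sketch below is hypothetical: `filter_events`, the dictionary-based events, and the choice of which values count as login attempts are assumptions, and it does not represent the actual dashboard or API.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set

# Assumed subset of the login-related action types from the list above.
LOGIN_ACTION_TYPES = {"login", "login_failed", "login_with_two_factor"}


def filter_events(
    events: Iterable[Dict],
    *,
    entity_type: str,
    entity_id: int,
    action_types: Set[str],
    since: datetime,
) -> List[Dict]:
    """Combine the existing entity/date filters with the new action_type filter."""
    return [
        event
        for event in events
        if event["entity_type"] == entity_type
        and event["entity_id"] == entity_id
        and event["action_type"] in action_types
        and event["created_at"] >= since
    ]


# Usage: login attempts for a user with id 7 in the 24 hours before `now`.
now = datetime(2020, 1, 25, 12, 0, 0)
events = [
    {"entity_type": "User", "entity_id": 7, "action_type": "login_failed",
     "created_at": now - timedelta(hours=2)},
    {"entity_type": "User", "entity_id": 7, "action_type": "change_password",
     "created_at": now - timedelta(hours=1)},
    {"entity_type": "User", "entity_id": 7, "action_type": "login",
     "created_at": now - timedelta(days=3)},
]
recent_logins = filter_events(
    events,
    entity_type="User",
    entity_id=7,
    action_types=LOGIN_ACTION_TYPES,
    since=now - timedelta(hours=24),
)
```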
Old proposal
Proposal
To start with, we can label audit events into action types.
There are certain action types that are common to most entities, such as:
Common action types |
---|
Create |
Update |
Delete |
Restore |
We can also have specific actions targeted at entities, such as:
Source | Target | Action Type |
---|---|---|
Project | User | Export request |
Project | System | Export complete |
Project | User | Grant access level Maintainer |
User | User | New password request |
This raises the question: do we want to filter actions based on entities? It could potentially change how we store possible values for `action_types`, but this can be addressed later with non-breaking changes.
Implementation plan
We can break the efforts into the following 4 stages:
Stage 1: Capture `action_type`
- Create a column `action_type` in `audit_events`. This can be an enum field, as we want to know the possible values this column can hold. We also want to avoid free-form text to make the field easily queryable.
- To decide: The `action_type` list can be large and will keep evolving. Do we want to store possible values in a YAML file?
Stage 2: Categorize existing audit events
- Examine existing audit events and start categorizing them
- Perform data migration to populate action types in stages (e.g. `login`, `project events`, .. can be done in separate migrations). May require input from specific teams
Stage 3: Save `action_type` for new audit events
As part of audit event service refactoring efforts, we can make changes to the service to accept a parameter for `action_type`.
Note: Stage 2 and 3 can be executed in parallel.
Stage 4: Expose filter by `action_type` in Audit event dashboard and API
Once we have bucketed data, we can allow administrators to filter audit events based on the action type.
This will also allow us to create template reports such as `Authentication and Authorization`, `Access Level`, and so on. It can also be enhanced to build features that let customers create their own reports based on the actions. Reports could also be scheduled to generate at regular intervals, such as `Generate and email Access level report once a week`.
Future work
In certain instances, `entity` (source) and `target` may be different: `User A was added as a maintainer to Project B`. Here, the `entity` is `Project B` and source is `User A`. Administrators should be able to query by `source` or `target` or a combination of both.