Issue created Jul 23, 2021 by Amanda Rueda (@amandarueda), Developer

Data to be included in pseudonymization service

Summary

In &6309 (closed) we are creating a pseudonymization service to protect personally identifiable information related to Users.

In this issue, we define which collected data can be used to identify users and should therefore be included in the pseudonymization service.

What data is in scope of pseudonymization?

Personally identifiable User data which is related to private profiles will be pseudonymized.

We will also pseudonymize some group, project, and namespace data that could lead to the identification of users. Group names are one example: I may call my group "Amanda Rueda's Cool Group", which would then allow viewers of group activity data to know it was my personal activity.

However, we are not pseudonymizing data with the intention of preventing the data from being tied to an entity. With pseudonymization of user data in place, we will still be able to understand activity behavior at a company level.

An example of this would be:

  • We can know that Acme Company has 20 groups and 35 projects
  • We can know that 4 users within Project X (de-identified) created Epics this week
  • We can know that a single user (identity unknown) in Project Y (de-identified) created an MR then purchased additional CI minutes.
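
The bullets above rely on pseudonyms being stable: the same user must always map to the same token so that activity can still be counted and joined. The sketch below shows one way this could work, using a keyed hash (HMAC-SHA256). This technique is an assumption made for illustration; the issue does not say how the service derives pseudonyms. The key, the event tuples, and the project path are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; a real service would load this from a secrets store.
SECRET_KEY = b"example-secret-not-for-production"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, keyed pseudonym for an identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical events: (company, user_id, project_path, action).
events = [
    ("Acme Company", "2890431", "golden-path/project-x", "create_epic"),
    ("Acme Company", "2890431", "golden-path/project-x", "create_epic"),
    ("Acme Company", "8956705", "golden-path/project-x", "create_epic"),
]

# Replace user and project identifiers; keep the company label as-is.
pseudonymized = [
    (company, pseudonymize(user_id), pseudonymize(path), action)
    for company, user_id, path, action in events
]

# Distinct-user counts survive because equal inputs share a pseudonym.
distinct_users = {user for _, user, _, _ in pseudonymized}
print(len(distinct_users))  # prints 2
```

A keyed hash is used here rather than a plain hash because numeric IDs are easy to enumerate; without the key, anyone could hash every possible user ID and reverse the mapping.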

Considered data for pseudonymization

Note: the information in the table below is valid as of 2021-09-15.

| # | Metric | Example Data | Should be De-identified? | Should be Collected? | Currently Collected? | Comment |
|---|---|---|---|---|---|---|
| 1 | user_ID | "2890431" | Yes | Yes | No | This is an indirect identifier which can be used to reveal directly identifiable data. With user_id, anyone can access the name and username of both public and private profiles. |
| 2 | username | "amandarueda" | Yes | No | No | This can be personally identifiable data |
| 3 | user_name | "Amanda Rueda" | Yes | No | No | This is personally identifiable data |
| 4 | user_email | "arueda@gitlab.com" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 5 | user_public_email | "arueda@gitlab.com" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 6 | Social media handles: skype, linkedin, twitter | "amandamrueda", "amandamrueda", "amandamrueda" | No | No | No | This is PI data; we should not collect it at all. |
| 7 | website_url | "https://gitlab.com/amandarueda" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 8 | organization | "GitLab" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 9 | group_ID | "12813618" | No | Yes | No | While the Group ID can be used to identify the group name via the API, this is only true for groups set to Public visibility or where you are a member. Given this, anonymization of Group ID is not necessary. |
| 10 | group_name | "Golden Path" | Yes | No | No | This can be personally identifiable data |
| 11 | group_description | "This is the group Golden Path, we're great!" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 12 | group_path | "golden_path" | Yes | Yes | No | This can be personally identifiable data |
| 13 | group_web_url | "https://gitlab.com/groups/golden_path" | Yes | Yes | Yes | This can be personally identifiable data |
| 14 | group_full_name | "Golden Path" | Yes | No | No | This can be personally identifiable data |
| 15 | group_full_path | "golden-path" | Yes | Yes | No | This can be personally identifiable data |
| 16 | project_ID | "27005757" | No | Yes | No | While the Project ID can be used to identify the project name via the API, this is only true for projects set to Public visibility or where you are a member. Given this, anonymization of Project ID is not necessary. |
| 17 | project_name | "Amanda Rueda Project" | Yes | No | No | This can be personally identifiable data |
| 18 | project_description | "This is Amanda Rueda's project covering all things that are cool." | No | No | No | While this can be PII data, we should not collect it at all. |
| 19 | project_name_with_namespace | "Golden Path / Amanda Rueda Project" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 20 | project_path | "amanda-rueda-project" | Yes | Yes | No | This can be personally identifiable data |
| 21 | project_path_with_namespace | "golden-path/amanda-rueda-project" | Yes | Yes | No | This can be personally identifiable data |
| 22 | project_ssh_url_to_repo | "git@gitlab.com:golden-path/amanda-rueda-project.git" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 23 | project_http_url_to_repo | "https://gitlab.com/golden-path/amanda-rueda-project.git" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 24 | project_web_url | "https://gitlab.com/golden-path/amanda-rueda-project" | No | No | No | While this can be personally identifiable data, we should not collect it at all. |
| 25 | project_readme_url | "https://gitlab.com/golden-path/amanda-rueda-project/-/blob/master/README.md" | No | No | No | While this can be PII data, we should not collect it at all. |
| 26 | namespace_ID | "12174719" | No | Yes | No | While the Namespace ID can be used to identify the namespace name via the API, namespace information is only returned for namespaces of which the requester is a member. |
| 27 | namespace_path | "amandarueda" | Yes | Yes | No | This can be personally identifiable data |
| 28 | namespace_name | "Amanda Rueda" | Yes | No | No | This can be personally identifiable data |
| 29 | namespace_full_path | "amandarueda" | Yes | Yes | No | This can be personally identifiable data |
| 30 | namespace_web_url | "https://gitlab.com/amandarueda" | Yes | Yes | No | This can be personally identifiable data |
| 31 | uuid | 3059d4a0-7f26-4edc-h989-545aff87da5x | No | Yes | Yes | This is not personally identifiable data |
| 32 | ip address | 192.158.1.38 | No | Yes | Yes | This is not personally identifiable data without other joined data |
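
The "Should be De-identified?" and "Should be Collected?" columns can be read as a per-field policy. The following is a minimal sketch of how such a policy might be applied to a single record, not a description of the actual service. The `POLICY` dictionary copies only a few rows from the table; the key, the `apply_policy` helper, the sample event, and the choice to drop unlisted fields are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical key, for illustration only.
KEY = b"example-secret-not-for-production"

# (de-identify?, collect?) copied from a few rows of the table.
POLICY = {
    "user_ID": (True, True),
    "user_email": (False, False),
    "group_ID": (False, True),
    "group_path": (True, True),
    "project_ID": (False, True),
    "project_description": (False, False),
}


def apply_policy(event: dict) -> dict:
    """Drop fields that are not collected and pseudonymize de-identified ones."""
    result = {}
    for field, value in event.items():
        # Unlisted fields are dropped: an assumption, chosen as the safer default.
        deidentify, collect = POLICY.get(field, (False, False))
        if not collect:
            continue
        if deidentify:
            value = hmac.new(KEY, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
        result[field] = value
    return result


# Hypothetical input record using example values from the table.
event = {
    "user_ID": "2890431",
    "user_email": "arueda@gitlab.com",
    "group_ID": "12813618",
    "group_path": "golden_path",
    "unlisted_field": "anything",
}
cleaned = apply_policy(event)
print(sorted(cleaned))  # prints ['group_ID', 'group_path', 'user_ID']
```

In this sketch the "collect" decision takes precedence: a row marked "Yes" for de-identification but "No" for collection (for example, username) never reaches the pseudonymization step because the field is dropped first.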

Example API Outputs

The outputs below were obtained by a non-admin user unrelated to the queried object.

Example API Output - User
{
  "id": 8956705,
  "name": "Amanda Rueda",
  "username": "arueda24",
  "state": "active",
  "avatar_url": "https://secure.gravatar.com/avatar/04cb48ea5a0b81467abc897dca331f61?s=80&d=identicon",
  "web_url": "https://gitlab.com/arueda24",
  "created_at": "2021-05-24T17:43:01.933Z",
  "bio": "",
  "bio_html": "",
  "location": null,
  "public_email": "",
  "skype": "arueda24",
  "linkedin": "arueda24",
  "twitter": "arueda24",
  "website_url": "www.arueda24.com",
  "organization": null,
  "job_title": "",
  "pronouns": null,
  "bot": false,
  "work_information": null,
  "followers": 0,
  "following": 0
}
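
The user output above illustrates why a numeric user ID is treated as an indirect identifier: the ID alone is enough to request a record containing the name and username. The sketch below assumes the output came from GitLab's single-user endpoint (`GET /users/:id`); the issue does not state which call was used. It makes no network request and works on a trimmed copy of the response above. The `user_lookup_url` helper and `DIRECT_IDENTIFIERS` are hypothetical names.

```python
import json

# Base URL as it appears in the "_links" of the group and project outputs.
API_BASE = "https://gitlab.com/api/v4"


def user_lookup_url(user_id: int) -> str:
    """Build the URL of the single-user endpoint for a numeric ID."""
    return f"{API_BASE}/users/{user_id}"


# Trimmed copy of the example response; no network call is made here.
sample_response = json.loads(
    '{"id": 8956705, "name": "Amanda Rueda", "username": "arueda24", "state": "active"}'
)

# Illustrative selection of fields that directly identify a person.
DIRECT_IDENTIFIERS = ("name", "username")
revealed = {field: sample_response[field] for field in DIRECT_IDENTIFIERS}

print(user_lookup_url(sample_response["id"]))
print(revealed)
```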
Example API Output - Group
{
  "id": 12813618,
  "web_url": "https://gitlab.com/groups/teste309",
  "name": "teste",
  "path": "teste309",
  "description": "",
  "visibility": "public",
  "share_with_group_lock": false,
  "require_two_factor_authentication": false,
  "two_factor_grace_period": 48,
  "project_creation_level": "developer",
  "auto_devops_enabled": null,
  "subgroup_creation_level": "maintainer",
  "emails_disabled": null,
  "mentions_disabled": null,
  "lfs_enabled": true,
  "default_branch_protection": 2,
  "avatar_url": null,
  "request_access_enabled": true,
  "full_name": "teste",
  "full_path": "teste309",
  "created_at": "2021-07-23T13:16:10.678Z",
  "parent_id": null,
  "ldap_cn": null,
  "ldap_access": null,
  "shared_with_groups": [],
  "prevent_sharing_groups_outside_hierarchy": false,
  "projects": [
    {
      "id": 28300516,
      "description": "",
      "name": "sfi_ttt_ttt",
      "name_with_namespace": "teste / sfi_ttt_ttt",
      "path": "sfi_ttt_ttt",
      "path_with_namespace": "teste309/sfi_ttt_ttt",
      "created_at": "2021-07-21T17:26:57.957Z",
      "default_branch": "main",
      "tag_list": [],
      "topics": [],
      "ssh_url_to_repo": "git@gitlab.com:teste309/sfi_ttt_ttt.git",
      "http_url_to_repo": "https://gitlab.com/teste309/sfi_ttt_ttt.git",
      "web_url": "https://gitlab.com/teste309/sfi_ttt_ttt",
      "readme_url": "https://gitlab.com/teste309/sfi_ttt_ttt/-/blob/main/README.md",
      "avatar_url": null,
      "forks_count": 0,
      "star_count": 0,
      "last_activity_at": "2021-07-23T14:44:11.063Z",
      "namespace": {
        "id": 12813618,
        "name": "teste",
        "path": "teste309",
        "kind": "group",
        "full_path": "teste309",
        "parent_id": null,
        "avatar_url": null,
        "web_url": "https://gitlab.com/groups/teste309"
      },
      "container_registry_image_prefix": "registry.gitlab.com/teste309/sfi_ttt_ttt",
      "_links": {
        "self": "https://gitlab.com/api/v4/projects/28300516",
        "issues": "https://gitlab.com/api/v4/projects/28300516/issues",
        "merge_requests": "https://gitlab.com/api/v4/projects/28300516/merge_requests",
        "repo_branches": "https://gitlab.com/api/v4/projects/28300516/repository/branches",
        "labels": "https://gitlab.com/api/v4/projects/28300516/labels",
        "events": "https://gitlab.com/api/v4/projects/28300516/events",
        "members": "https://gitlab.com/api/v4/projects/28300516/members"
      },
      "packages_enabled": true,
      "empty_repo": false,
      "archived": false,
      "visibility": "public",
      "resolve_outdated_diff_discussions": false,
      "container_expiration_policy": {
        "cadence": "1d",
        "enabled": false,
        "keep_n": 10,
        "older_than": "90d",
        "name_regex": ".*",
        "name_regex_keep": null,
        "next_run_at": "2021-07-22T17:26:57.999Z"
      },
      "issues_enabled": true,
      "merge_requests_enabled": true,
      "wiki_enabled": true,
      "jobs_enabled": true,
      "snippets_enabled": true,
      "container_registry_enabled": true,
      "service_desk_enabled": true,
      "service_desk_address": "incoming+teste309-sfi-ttt-ttt-28300516-issue-@incoming.gitlab.com",
      "can_create_merge_request_in": true,
      "issues_access_level": "enabled",
      "repository_access_level": "enabled",
      "merge_requests_access_level": "enabled",
      "forking_access_level": "enabled",
      "wiki_access_level": "enabled",
      "builds_access_level": "enabled",
      "snippets_access_level": "enabled",
      "pages_access_level": "enabled",
      "operations_access_level": "enabled",
      "analytics_access_level": "enabled",
      "emails_disabled": false,
      "shared_runners_enabled": true,
      "lfs_enabled": true,
      "creator_id": 8737232,
      "import_status": "none",
      "open_issues_count": 0,
      "ci_default_git_depth": 50,
      "ci_forward_deployment_enabled": true,
      "ci_job_token_scope_enabled": false,
      "public_jobs": true,
      "build_timeout": 3600,
      "auto_cancel_pending_pipelines": "enabled",
      "build_coverage_regex": null,
      "ci_config_path": "",
      "shared_with_groups": [],
      "only_allow_merge_if_pipeline_succeeds": false,
      "allow_merge_on_skipped_pipeline": null,
      "restrict_user_defined_variables": false,
      "request_access_enabled": true,
      "only_allow_merge_if_all_discussions_are_resolved": false,
      "remove_source_branch_after_merge": true,
      "printing_merge_request_link_enabled": true,
      "merge_method": "merge",
      "squash_option": "default_off",
      "suggestion_commit_message": null,
      "auto_devops_enabled": false,
      "auto_devops_deploy_strategy": "continuous",
      "autoclose_referenced_issues": true,
      "keep_latest_artifact": true,
      "approvals_before_merge": 0,
      "mirror": false,
      "external_authorization_classification_label": "",
      "marked_for_deletion_at": null,
      "marked_for_deletion_on": null,
      "requirements_enabled": true,
      "security_and_compliance_enabled": false,
      "compliance_frameworks": [],
      "issues_template": null,
      "merge_requests_template": null,
      "merge_pipelines_enabled": false,
      "merge_trains_enabled": false
    },
    {
      "id": 27860586,
      "description": "",
      "name": "Template_Pipeline",
      "name_with_namespace": "teste / Template_Pipeline",
      "path": "template_pipeline",
      "path_with_namespace": "teste309/template_pipeline",
      "created_at": "2021-07-02T12:04:01.929Z",
      "default_branch": "main",
      "tag_list": [],
      "topics": [],
      "ssh_url_to_repo": "git@gitlab.com:teste309/template_pipeline.git",
      "http_url_to_repo": "https://gitlab.com/teste309/template_pipeline.git",
      "web_url": "https://gitlab.com/teste309/template_pipeline",
      "readme_url": "https://gitlab.com/teste309/template_pipeline/-/blob/main/README.md",
      "avatar_url": null,
      "forks_count": 0,
      "star_count": 0,
      "last_activity_at": "2021-07-23T13:48:26.023Z",
      "namespace": {
        "id": 12813618,
        "name": "teste",
        "path": "teste309",
        "kind": "group",
        "full_path": "teste309",
        "parent_id": null,
        "avatar_url": null,
        "web_url": "https://gitlab.com/groups/teste309"
      },
      "container_registry_image_prefix": "registry.gitlab.com/teste309/template_pipeline",
      "_links": {
        "self": "https://gitlab.com/api/v4/projects/27860586",
        "issues": "https://gitlab.com/api/v4/projects/27860586/issues",
        "merge_requests": "https://gitlab.com/api/v4/projects/27860586/merge_requests",
        "repo_branches": "https://gitlab.com/api/v4/projects/27860586/repository/branches",
        "labels": "https://gitlab.com/api/v4/projects/27860586/labels",
        "events": "https://gitlab.com/api/v4/projects/27860586/events",
        "members": "https://gitlab.com/api/v4/projects/27860586/members"
      },
      "packages_enabled": true,
      "empty_repo": false,
      "archived": false,
      "visibility": "public",
      "resolve_outdated_diff_discussions": false,
      "container_expiration_policy": {
        "cadence": "1d",
        "enabled": false,
        "keep_n": 10,
        "older_than": "90d",
        "name_regex": ".*",
        "name_regex_keep": null,
        "next_run_at": "2021-07-03T12:04:01.951Z"
      },
      "issues_enabled": true,
      "merge_requests_enabled": true,
      "wiki_enabled": true,
      "jobs_enabled": true,
      "snippets_enabled": true,
      "container_registry_enabled": true,
      "service_desk_enabled": true,
      "service_desk_address": "incoming+teste309-template-pipeline-27860586-issue-@incoming.gitlab.com",
      "can_create_merge_request_in": true,
      "issues_access_level": "enabled",
      "repository_access_level": "enabled",
      "merge_requests_access_level": "enabled",
      "forking_access_level": "enabled",
      "wiki_access_level": "enabled",
      "builds_access_level": "enabled",
      "snippets_access_level": "enabled",
      "pages_access_level": "enabled",
      "operations_access_level": "enabled",
      "analytics_access_level": "enabled",
      "emails_disabled": true,
      "shared_runners_enabled": true,
      "lfs_enabled": true,
      "creator_id": 8737232,
      "import_status": "none",
      "open_issues_count": 0,
      "ci_default_git_depth": 50,
      "ci_forward_deployment_enabled": true,
      "ci_job_token_scope_enabled": false,
      "public_jobs": true,
      "build_timeout": 3600,
      "auto_cancel_pending_pipelines": "enabled",
      "build_coverage_regex": null,
      "ci_config_path": "",
      "shared_with_groups": [],
      "only_allow_merge_if_pipeline_succeeds": false,
      "allow_merge_on_skipped_pipeline": null,
      "restrict_user_defined_variables": false,
      "request_access_enabled": true,
      "only_allow_merge_if_all_discussions_are_resolved": false,
      "remove_source_branch_after_merge": true,
      "printing_merge_request_link_enabled": true,
      "merge_method": "merge",
      "squash_option": "default_off",
      "suggestion_commit_message": null,
      "auto_devops_enabled": false,
      "auto_devops_deploy_strategy": "continuous",
      "autoclose_referenced_issues": true,
      "keep_latest_artifact": true,
      "approvals_before_merge": 0,
      "mirror": false,
      "external_authorization_classification_label": "",
      "marked_for_deletion_at": null,
      "marked_for_deletion_on": null,
      "requirements_enabled": true,
      "security_and_compliance_enabled": false,
      "compliance_frameworks": [],
      "issues_template": null,
      "merge_requests_template": null,
      "merge_pipelines_enabled": false,
      "merge_trains_enabled": false
    }
  ],
  "shared_projects": [],
  "shared_runners_minutes_limit": null,
  "extra_shared_runners_minutes_limit": null,
  "prevent_forking_outside_group": null
}
Example API Output - Project
{
  "id": 27005757,
  "description": "Gitaly is a Git RPC service for handling all the git calls made by GitLab",
  "name": "gitaly",
  "name_with_namespace": "Baodong Cao / gitaly",
  "path": "gitaly",
  "path_with_namespace": "icbd/gitaly",
  "created_at": "2021-05-29T03:24:28.229Z",
  "default_branch": "master",
  "tag_list": [],
  "topics": [],
  "ssh_url_to_repo": "git@gitlab.com:icbd/gitaly.git",
  "http_url_to_repo": "https://gitlab.com/icbd/gitaly.git",
  "web_url": "https://gitlab.com/icbd/gitaly",
  "readme_url": "https://gitlab.com/icbd/gitaly/-/blob/master/README.md",
  "avatar_url": "https://gitlab.com/uploads/-/system/project/avatar/27005757/gitaly7.png",
  "forks_count": 0,
  "star_count": 0,
  "last_activity_at": "2021-07-23T14:33:44.661Z",
  "namespace": {
    "id": 3198322,
    "name": "Baodong Cao",
    "path": "icbd",
    "kind": "user",
    "full_path": "icbd",
    "parent_id": null,
    "avatar_url": "/uploads/-/system/user/avatar/2556296/avatar.png",
    "web_url": "https://gitlab.com/icbd"
  },
  "container_registry_image_prefix": "registry.gitlab.com/icbd/gitaly",
  "_links": {
    "self": "https://gitlab.com/api/v4/projects/27005757",
    "issues": "https://gitlab.com/api/v4/projects/27005757/issues",
    "merge_requests": "https://gitlab.com/api/v4/projects/27005757/merge_requests",
    "repo_branches": "https://gitlab.com/api/v4/projects/27005757/repository/branches",
    "labels": "https://gitlab.com/api/v4/projects/27005757/labels",
    "events": "https://gitlab.com/api/v4/projects/27005757/events",
    "members": "https://gitlab.com/api/v4/projects/27005757/members"
  },
  "packages_enabled": true,
  "empty_repo": false,
  "archived": false,
  "visibility": "public",
  "owner": {
    "id": 2556296,
    "name": "Baodong Cao",
    "username": "icbd",
    "state": "active",
    "avatar_url": "https://gitlab.com/uploads/-/system/user/avatar/2556296/avatar.png",
    "web_url": "https://gitlab.com/icbd"
  },
  "resolve_outdated_diff_discussions": false,
  "container_expiration_policy": {
    "cadence": "1d",
    "enabled": false,
    "keep_n": 10,
    "older_than": "90d",
    "name_regex": ".*",
    "name_regex_keep": null,
    "next_run_at": "2021-05-30T03:24:28.257Z"
  },
  "issues_enabled": true,
  "merge_requests_enabled": true,
  "wiki_enabled": true,
  "jobs_enabled": true,
  "snippets_enabled": true,
  "container_registry_enabled": true,
  "service_desk_enabled": true,
  "service_desk_address": "incoming+icbd-gitaly-27005757-issue-@incoming.gitlab.com",
  "can_create_merge_request_in": true,
  "issues_access_level": "enabled",
  "repository_access_level": "enabled",
  "merge_requests_access_level": "enabled",
  "forking_access_level": "enabled",
  "wiki_access_level": "enabled",
  "builds_access_level": "enabled",
  "snippets_access_level": "enabled",
  "pages_access_level": "enabled",
  "operations_access_level": "enabled",
  "analytics_access_level": "enabled",
  "emails_disabled": null,
  "shared_runners_enabled": true,
  "lfs_enabled": true,
  "creator_id": 2556296,
  "forked_from_project": {
    "id": 2009901,
    "description": "Gitaly is a Git RPC service for handling all the git calls made by GitLab",
    "name": "gitaly",
    "name_with_namespace": "GitLab.org / gitaly",
    "path": "gitaly",
    "path_with_namespace": "gitlab-org/gitaly",
    "created_at": "2016-11-14T21:07:35.543Z",
    "default_branch": "master",
    "tag_list": [
      "git",
      "gitlab",
      "rpc"
    ],
    "topics": [
      "git",
      "gitlab",
      "rpc"
    ],
    "ssh_url_to_repo": "git@gitlab.com:gitlab-org/gitaly.git",
    "http_url_to_repo": "https://gitlab.com/gitlab-org/gitaly.git",
    "web_url": "https://gitlab.com/gitlab-org/gitaly",
    "readme_url": "https://gitlab.com/gitlab-org/gitaly/-/blob/master/README.md",
    "avatar_url": "https://gitlab.com/uploads/-/system/project/avatar/2009901/gitaly7.png",
    "forks_count": 138,
    "star_count": 269,
    "last_activity_at": "2021-07-23T13:41:07.002Z",
    "namespace": {
      "id": 9970,
      "name": "GitLab.org",
      "path": "gitlab-org",
      "kind": "group",
      "full_path": "gitlab-org",
      "parent_id": null,
      "avatar_url": "/uploads/-/system/group/avatar/9970/logo-extra-whitespace.png",
      "web_url": "https://gitlab.com/groups/gitlab-org"
    }
  },
  "import_status": "finished",
  "open_issues_count": 0,
  "ci_default_git_depth": 0,
  "ci_forward_deployment_enabled": true,
  "ci_job_token_scope_enabled": false,
  "public_jobs": true,
  "build_timeout": 3600,
  "auto_cancel_pending_pipelines": "enabled",
  "build_coverage_regex": null,
  "ci_config_path": "",
  "shared_with_groups": [],
  "only_allow_merge_if_pipeline_succeeds": false,
  "allow_merge_on_skipped_pipeline": null,
  "restrict_user_defined_variables": false,
  "request_access_enabled": true,
  "only_allow_merge_if_all_discussions_are_resolved": false,
  "remove_source_branch_after_merge": true,
  "printing_merge_request_link_enabled": true,
  "merge_method": "merge",
  "squash_option": "default_off",
  "suggestion_commit_message": null,
  "auto_devops_enabled": false,
  "auto_devops_deploy_strategy": "continuous",
  "autoclose_referenced_issues": true,
  "keep_latest_artifact": true,
  "approvals_before_merge": 0,
  "mirror": true,
  "mirror_user_id": 2556296,
  "mirror_trigger_builds": false,
  "only_mirror_protected_branches": false,
  "mirror_overwrites_diverged_branches": false,
  "external_authorization_classification_label": "",
  "marked_for_deletion_at": null,
  "marked_for_deletion_on": null,
  "requirements_enabled": true,
  "security_and_compliance_enabled": false,
  "compliance_frameworks": [],
  "issues_template": null,
  "merge_requests_template": null,
  "merge_pipelines_enabled": false,
  "merge_trains_enabled": false,
  "permissions": {
    "project_access": null,
    "group_access": null
  }
}
Edited Sep 16, 2021 by Amanda Rueda