Skip to content
Snippets Groups Projects

Improve database response time for listing user activity

Merged Andreas Brandl requested to merge 40525-listing-user-activity-timeouts into master
All threads resolved!

What does this MR do?

This MR improves a database query to display activity of a given user to the currently logged in user. It leverages the user_interacted_tables table introduced in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17327.

Are there points in the code the reviewer needs to double check?

  1. Please double-check permissions are still respected properly.
  2. Also, I ditched the LATERAL join because https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17454#note_61882724 -- anything I'm not seeing?

Why was this MR needed?

For some users, displaying the user activity page times out. That is, the related query runs into the statement timeout.

A URL to test this is https://gitlab.com/ather-embitel (note we fail to show something for 'most recent activity' after a while).

Grafana links: Controller performance

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

Closes #40525 (closed)

Edited by Yorick Peterse

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Andreas Brandl added 127 commits

    added 127 commits

    • 86dc1660...e20f96c8 - 110 commits from branch master
    • 108216f7 - Create user_contributed_projects table.
    • b8d47d7f - Add post-migration to populate user_contributed_projects table.
    • e347975c - Simpler migration strategy for MySQL.
    • e2091b81 - Track which projects a user contributed to.
    • fc791a22 - Cache project/user combinations.
    • da493674 - Treat special cases accordingly.
    • 34f4d045 - Address rubocop offenses.
    • 076a284c - Only track contributions if table is available.
    • 2db19399 - Flush cached information about schema.
    • e9f0e4d6 - Add changelog.
    • 58574a22 - Improve migration robustness and speed.
    • ea626d60 - Rename to UserInteractedProjects.
    • 3bd9cc31 - Leverage user_contributed_projects to find recent events.
    • d5b057f0 - TEMP: Use PG 9.6 for testing.
    • 836db001 - SAVEPOINT
    • aba64821 - fixup! Leverage user_contributed_projects to find recent events.
    • 45eac5d8 - WIP: #to_a necessary?

    Compare with previous version

  • Andreas Brandl marked as a Work In Progress from aba64821

    marked as a Work In Progress from aba64821

  • Andreas Brandl added 132 commits

    added 132 commits

    • 45eac5d8...f29dbaf5 - 116 commits from branch master
    • c2f0321e - Create user_contributed_projects table.
    • 6f8d1815 - Add post-migration to populate user_contributed_projects table.
    • 062d51dd - Simpler migration strategy for MySQL.
    • d72d9790 - Track which projects a user contributed to.
    • 4a9fbfb1 - Cache project/user combinations.
    • 1bba96eb - Treat special cases accordingly.
    • cf3cf5cb - Address rubocop offenses.
    • b31fd8c9 - Only track contributions if table is available.
    • abb5cec7 - Flush cached information about schema.
    • 904a7460 - Add changelog.
    • 11d3266f - Improve migration robustness and speed.
    • 7a28f202 - Rename to UserInteractedProjects.
    • 9a9cfdf6 - For MySQL, foreign keys are dependent on index.
    • c85b07a3 - Leverage user_contributed_projects to find recent events.
    • da12f676 - TEMP: Use PG 9.6 for testing.
    • f2c938b4 - SAVEPOINT

    Compare with previous version

  • Andreas Brandl added 3 commits

    added 3 commits

    • 8b55fc3d - Singularize model name.
    • 77f27fd9 - Leverage user_contributed_projects to find recent events.
    • 1d88d0b1 - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Andreas Brandl added 2 commits

    added 2 commits

    • 8d9694a0 - Leverage user_contributed_projects to find recent events.
    • 2c2b25a4 - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Author Developer

    Query comparison:

    Numbers from staging. All queries yield 7 events.

    New query with LATERAL join:

    SELECT events.id
    FROM (
    	SELECT "projects"."id"
    	FROM "projects"
    	INNER JOIN "project_authorizations" ON "project_authorizations"."project_id" = "projects"."id"
    	INNER JOIN "user_interacted_projects" ON "projects"."id" = "user_interacted_projects"."project_id"
    	WHERE "user_interacted_projects"."user_id" = 876946
    		AND "project_authorizations"."user_id" = 1675239
    	UNION
    	SELECT "projects"."id"
    	FROM "projects"
    	INNER JOIN "user_interacted_projects" ON "projects"."id" = "user_interacted_projects"."project_id"
    	WHERE "user_interacted_projects"."user_id" = 876946
    		AND "projects"."visibility_level" IN (
    			10
    			,20
    			)
    	) AS projects_for_lateral
    INNER JOIN LATERAL(SELECT "events".* FROM "events" WHERE "events"."author_id" = 876946
    		AND (events.project_id = projects_for_lateral.id) ORDER BY id DESC LIMIT 20) AS events ON true
    INNER JOIN "projects" ON "projects"."id" = "events"."project_id"
    LEFT JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    LEFT JOIN "users" ON "users"."id" = "events"."author_id"
    ORDER BY "events"."id" DESC LIMIT 20

    New query with normal join:

    SELECT events.id
    FROM (
    	SELECT "projects"."id"
    	FROM "projects"
    	INNER JOIN "project_authorizations" ON "project_authorizations"."project_id" = "projects"."id"
    	INNER JOIN "user_interacted_projects" ON "projects"."id" = "user_interacted_projects"."project_id"
    	WHERE "user_interacted_projects"."user_id" = 876946
    		AND "project_authorizations"."user_id" = 1675239
    	UNION
    	SELECT "projects"."id"
    	FROM "projects"
    	INNER JOIN "user_interacted_projects" ON "projects"."id" = "user_interacted_projects"."project_id"
    	WHERE "user_interacted_projects"."user_id" = 876946
    		AND "projects"."visibility_level" IN (
    			10
    			,20
    			)
    	) AS projects_for_join
    INNER JOIN (
    	SELECT "events".*
    	FROM "events"
    	WHERE "events"."author_id" = 876946
    	) AS events ON events.project_id = projects_for_join.id
    INNER JOIN "projects" ON "projects"."id" = "events"."project_id"
    LEFT JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    LEFT JOIN "users" ON "users"."id" = "events"."author_id"
    ORDER BY "events"."id" DESC LIMIT 20

    Original query with LIMIT 20:

    Original query with LIMIT 2000

    Choosing a higher LIMIT yields a completely different (and much better!) plan.

    SELECT events.id
    FROM "events"
    LEFT JOIN "users" ON "users"."id" = "events"."author_id"
    LEFT JOIN "projects" ON "projects"."id" = "events"."project_id"
    LEFT JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    WHERE "events"."author_id" = 876946
    	AND (
    		projects.id IN (
    			SELECT projects.id
    			FROM "projects"
    			WHERE (
    					EXISTS (
    						SELECT 1
    						FROM "project_authorizations"
    						WHERE "project_authorizations"."user_id" = 1675239
    							AND (project_authorizations.project_id = projects.id)
    						)
    					)
    			UNION
    			SELECT projects.id
    			FROM "projects"
    			WHERE "projects"."visibility_level" IN (
    					10
    					,20
    					)
    			)
    		)
    ORDER BY "events"."id" DESC LIMIT 20
    Edited by Andreas Brandl
  • Andreas Brandl added 2 commits

    added 2 commits

    • 0161a250 - Leverage user_contributed_projects to find recent events.
    • 54e7d6cf - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Andreas Brandl unmarked as a Work In Progress

    unmarked as a Work In Progress

  • Andreas Brandl changed title from WIP: Resolve "Listing user activity can take more than 15s" to Improve database response time for listing user activity

    changed title from WIP: Resolve "Listing user activity can take more than 15s" to Improve database response time for listing user activity

  • Andreas Brandl changed the description

    changed the description

  • mentioned in issue #40525 (closed)

    • Author Developer
      Resolved by Andreas Brandl

      The current changeset comes with a LATERAL join if the postgres version supports it and otherwise falls back to a normal join. In both cases the execution time is <1ms and the differences between both approaches are negligible in the example in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17454#note_61502737.

      I'm thinking we might ditch the LATERAL approach and not branch out depending on the postgres version. However would like to see production timings first and maybe look at another example user.

      If we were to keep the LATERAL join, I'd strongly suggest to also enable PG 9.6 on CI as otherwise, we'd have untested branches of code that would run on e.g. gitlab.com or any other customer we a >= PG 9.3 setup.

  • Andreas Brandl added 2 commits

    added 2 commits

    • 2fd1be0a - Leverage user_contributed_projects to find recent events.
    • ac5f6f61 - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Andreas Brandl added 7 commits

    added 7 commits

    • e5cad049 - Operate on ids to avoid unnecessary loading.
    • a6d9e22b - Remove unnecessary safe guard.
    • 318805c5 - Nested transaction for find_or_create_by! queries.
    • b844e57f - Improve robustness of migration.
    • ca6bd5d8 - Add id primary key to table.
    • d56e0098 - Leverage user_contributed_projects to find recent events.
    • a1584475 - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Andreas Brandl mentioned in merge request !17327 (merged)

    mentioned in merge request !17327 (merged)

  • Andreas Brandl added 145 commits

    added 145 commits

    • a1584475...49f72d06 - 123 commits from branch master
    • 8cca3282 - Create user_contributed_projects table.
    • f579a087 - Add post-migration to populate user_contributed_projects table.
    • f88d5132 - Simpler migration strategy for MySQL.
    • ebcf3c73 - Track which projects a user contributed to.
    • d4d0740e - Cache project/user combinations.
    • 87010983 - Treat special cases accordingly.
    • 8dde0301 - Address rubocop offenses.
    • 375a5c9f - Only track contributions if table is available.
    • 4418066e - Flush cached information about schema.
    • 7a22a589 - Add changelog.
    • 986f470d - Improve migration robustness and speed.
    • 43b74afd - Rename to UserInteractedProjects.
    • 8a7cd25d - For MySQL, foreign keys are dependent on index.
    • 11a559f7 - Singularize model name.
    • 9ebbcb8f - Operate on ids to avoid unnecessary loading.
    • e83e85ce - Remove unnecessary safe guard.
    • d270b857 - Nested transaction for find_or_create_by! queries.
    • 8e668076 - Improve robustness of migration.
    • 9f776707 - Shortcut and return on duplicate record error.
    • 39d4dd58 - Add id primary key to table.
    • 3f4d8942 - Leverage user_contributed_projects to find recent events.
    • c12cc795 - TEMP: Use PG 9.6 for testing.

    Compare with previous version

  • Andreas Brandl resolved all discussions

    resolved all discussions

  • Andreas Brandl added 2 commits

    added 2 commits

    • b8a61ca5 - Leverage user_contributed_projects to find recent events.
    • b1192689 - Ditch LATERAL join approach.

    Compare with previous version

  • Andreas Brandl changed the description

    changed the description

  • Andreas Brandl marked the checklist item Changelog entry added, if necessary as completed

    marked the checklist item Changelog entry added, if necessary as completed

  • Andreas Brandl marked the checklist item Conform by the style guides as completed

    marked the checklist item Conform by the style guides as completed

  • Andreas Brandl marked the checklist item Conform by the merge request performance guides as completed

    marked the checklist item Conform by the merge request performance guides as completed

  • Andreas Brandl
  • Author Developer

    I think this is good for review along with https://gitlab.com/gitlab-org/gitlab-ce/issues/43460 (which should be reviewed/merged first).

    Over to @yorickpeterse for review. 😃

  • Andreas Brandl changed the description

    changed the description

  • @abrandl Prior to merging I would prefer for related commits to be squashed together, this keeps the git log a bit cleaner.

  • Andreas Brandl resolved all discussions

    resolved all discussions

  • @abrandl Is it possible to run the JOIN vs JOIN LATERAL queries on production? I'm surprised JOIN LATERAL is performing slightly less well, especially since we've used it in the past to speed things up dramatically (in EventCollection that is). I'm wondering if perhaps the presence of buffers may influence the plans/timings here (that is, the first query caching the data so the 2nd one can reuse it).

  • Author Developer

    @yorickpeterse I ran them in production earlier today, see https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17454#note_61882724 for the timings. Not sure why this goes in favor of the normal join.

    Here are the plans for lateral join and normal join. Both queries yield 7 rows.

    The order of those queries did not matter, so I don't think it's due to query caching.

    Edited by Andreas Brandl
  • @abrandl I was curious so I gave both queries a try (but using SELECT events.* instead of SELECT events.id). I also ran both queries on two different secondaries. In this case the following happens: the INNER JOIN query runs in about 5 milliseconds while the JOIN LATERAL one runs in about 55 milliseconds.

    Super weird, but the data confirms INNER JOIN is better suited here so let's go with that.

  • Andreas Brandl marked the checklist item End-to-end tests pass (package-qa manual pipeline job) as completed

    marked the checklist item End-to-end tests pass (package-qa manual pipeline job) as completed

  • Andreas Brandl added 202 commits

    added 202 commits

    Compare with previous version

  • Author Developer

    @yorickpeterse Done, also squashed the commits together

  • Andreas Brandl marked the checklist item Squashed related commits together as completed

    marked the checklist item Squashed related commits together as completed

  • Yorick Peterse marked the checklist item Has been reviewed by Database as completed

    marked the checklist item Has been reviewed by Database as completed

  • Yorick Peterse marked the checklist item Has been reviewed by Backend as completed

    marked the checklist item Has been reviewed by Backend as completed

  • Yorick Peterse approved this merge request

    approved this merge request

  • @rspeicher @jameslopez This MR introduces some very nice performance improvements for user profiles (and solve timeouts). I would like to have this picked into 10.6.0 if possible.

    Sign off:

    Edited by Douwe Maan
  • Yorick Peterse mentioned in commit aec3e1dd

    mentioned in commit aec3e1dd

  • @yorickpeterse

    This MR introduces some very nice performance improvements

    You know how to sell it!

    I'm okay with this one. Stupid question, but has this been tested on mySQL? :)

  • Actually we just merged this: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17636

    **Do not** set the relevant `Pick into X.Y` label (see above) before request an
    exception; this should be done after the exception is approved.

    So I'm removing ~"Pick into 10.6" for now

    Edited by James Lopez
  • Ah! Also, we should use the exception request template on https://gitlab.com/gitlab-org/release/tasks/issues/new

  • @jameslopez

    I'm okay with this one. Stupid question, but has this been tested on mySQL? :)

    Yes, all MySQL builds passed.

  • @jameslopez Wait, so should I create an issue on said issue tracker or copy over something into this MR?

    Edit: Oh, new issue it is.

    Edited by Yorick Peterse
  • Thanks Yorick 👍

  • Setting back ~"Pick into 10.6" as we've got the 3 Yes :)

  • Picked into https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/17723, will merge into 10-6-stable ready for 10.6 RC5

  • mentioned in issue #44446 (closed)

  • Please register or sign in to reply
    Loading