More explicit visibility behavior for projects API
The projects API has a standard GET /api/v4/projects to retrieve a list of projects. By default, we only add a filter to exclude projects pending deletion.
This endpoint is quite slow and we attempted to apply query optimization techniques to the problem in #30815 (comment 243502805). From a database querying perspective, this does work - but it's effectively complicated to implement, complex to handle and as such not necessarily an improvement we want.
This endpoint works in different ways depending on whether or not the user is authenticated:
- If the user is not authenticated, we only retrieve public projects (which has its own issues, but relatively easy to fix)
- If the user is authenticated, we retrieve public and internal projects and private projects the user has access to.
(2) is what we're interested in here. This directly translates into a SQL condition that is causing all the trouble:
SELECT * FROM "projects"
WHERE (EXISTS (
SELECT
1
FROM
"project_authorizations"
WHERE
"project_authorizations"."user_id" = 1562869
AND (project_authorizations.project_id = projects.id))
OR projects.visibility_level IN (10, 20))
The OR is the only reason this query is so slow, as we cannot effectively answer this from one index. The problem in the API is that we currently return all projects the client is allowed to see - depending on whether or not they are authenticated and what access permissions they have.
Going forward, I would like to discuss if we can change our API. If we make it more explicit, which objects the client wants to get, this problem goes away completely.
For example, we could make the visibility parameter required and default to public, internal. In this case, GET /projects would always only return public or internal (depending on authentication) projects. Only if the user is authenticated and explicitly asks for private projects with GET /projects?visibility=private, we return those the user has been given access to.
I think this reduces API complexity quite a bit (from the database querying viewpoint). It does make it a little less flexible and more explicit, but that might even be useful for clients, too.