Large tables with several columns

Last DB office hours we discussed the topic of having larger tables with several columns

# NOTE: The query is using the information of my local schema, results
# might be different on GitLab.com

gitlabhq_development=# SELECT table_name, count(*) as column_count
FROM information_schema.columns
WHERE table_schema = 'public'
GROUP by table_name order by column_count desc
LIMIT(10);
                table_name                | column_count
------------------------------------------+--------------
 application_settings                     |          221
 users                                    |           84
 projects                                 |           82
 geo_node_statuses                        |           54
 ci_builds                                |           50
 namespaces                               |           43
 merge_requests                           |           40
 conversational_development_index_metrics |           33
 epics                                    |           31
 issues                                   |           30
(10 rows)

At the moment users and projects are considered large tables due to its size (and blacklisted from specific operations).

application_settings has the largest number of columns, but since it only has one record it might no be a problem.

Disadvantages of adding more columns to larger tables:

  • We can't easily add defaults for these columns (At least until we upgrade to PostgreSQL 11, as the rewritten behavior changes on this version)
  • There's a performance impact when it comes to reading information from a wide table, we'll always need to read the whole record from disk and not part of it.

There are specific efforts to reduce the number of columns on projects like #19826 (closed) && !19761 (merged)

Proposal - To do

  • Identify large tables with several columns: !28161 (merged)
    • We could start with projects and users, but perhaps also consider issues, merge_requests, ci_builds
  • Add a new cop to prevent adding columns to the wide tables - !28161 (merged)
  • Add documentation to our Database Guides - !28681 (merged)
  • Add a mention in the Engineering Week in Review
Edited by Mayra Cabrera