Make it easy to correlate an SQL query with a given line of code
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Problem to solve
Currently, debugging database performance problems is hindered by the fact that we generate a wide range of queries automatically, and it's difficult to correlate those queries with the code that generates them
Further details
E.g., there have been several queries that introduce a "sequence scan on the projects table" in recent times: https://gitlab.slack.com/archives/CB7P5CJS1/p1531320927000712
Going from "this database query is slow" to "here's the line of code generating that database query" is really difficult
Proposal
Introduce a source map of some kind that allows us to link the SQL statement to the line of GitLab code that generates it
In Postgres, /* ... */ comments seem to work fine for this. They're included in the query that is sent to the server, any server-side logs, and in any returned errors. For example:
c = ActiveRecord::Base.connection
c.execute("SET statement_timeout = 1")
# Raises an ActiveRecord::StatementInvalid: PG::QueryCanceled error
# Note how difficult it is to link the error text to a particular line of code.
# The only option would be to grep for pg_sleep
c.execute("SELECT pg_sleep(2)")
# (1.6ms) SELECT pg_sleep(2)
# ActiveRecord::StatementInvalid: PG::QueryCanceled: ERROR: canceling statement due to statement timeout
# : SELECT pg_sleep(2)
# from /home/lupine/.gem/ruby/2.3.7/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
# Raises the same error
# We now include a reference to a specific file and line of code - (pry):11 here
# We could even make it an URL so you can just click on the link to see the code
c.execute("/* #{__FILE__}:#{__LINE__}*/ SELECT pg_sleep(2)")
# (1.6ms) /* (pry):11 */ SELECT pg_sleep(2)
# ActiveRecord::StatementInvalid: PG::QueryCanceled: ERROR: canceling statement due to statement timeout
# : /* (pry):11 */ SELECT pg_sleep(2)
# from /home/lupine/.gem/ruby/2.3.7/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
Ideally, we'd add this as a filter around all SQL queries. It would introspect the stack, looking for the most specific user code implicated in the query execution and adding it to the query before passing it on.
There may be performance concerns with that. If so, we could have it disabled by default. When we start to see a query causing problems that we don't know the origin of, we can turn it on (feature flag?), determine the code causing the problem, and turn it off again.
What does success look like, and how can we measure that?
We should be able to successfully correlate any SQL query made by GitLab that is output to the server logs, or returned as an error to the client, to GitLab code, in under a minute