Log database queries for a given percentage of requests
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
In order to analyze database queries, we want postgres to log queries executed. There is a configuration setting log_min_duration_statement allowing us to define a threshold of execution time after which the query is logged.
Setting log_min_duration_statement=0 effectively means logging each and every query, which may become a performance problem on a busy cluster. On the other hand, any threshold greater than 0 puts a blind spot on the analysis - because the log does not include queries faster than the threshold.
The proposal here is to have a percentage of requests log their queries with log_min_duration_statement=0. This is a per-session setting in postgres, so we can actually control that on the application side when connecting to the database.
- Add feature flag to be able to control the percentage (0 = no logging, 0.1 = log 10% of requests)
- Execute
SET log_min_duration_statement=0after connecting to the database, if request should get logged
In order to determine the percentage, we'd apply a random number sampling (rand < desired_percentage).