Collect 15-30min of Postgres logs with log_min_duration_statement = 0 for analysis and DB experiments
These actions are part of issue infrastructure#4921.

As discussed with @abrandl and @dawsmith, I'm going to collect a sample (30-60 minutes) of the all-queries log on Wednesday, Sep 19, starting at 12:00 UTC (the busiest time is ~13:00 UTC).

The brief description of actions
---

To do that, I'll set up `log_min_duration_statement` with a manual `alter system set ..` command on the master, followed by `select pg_reload_conf();`. The plan (see details below) is to reach 0 iteratively, descending from the current value of 1 second to 0 in steps ~15 minutes apart, controlling IO and the log size.

Projected IO impact
---

The projected IO impact, based on the `pg_stat_statements` view (it lacks query parameters, but this only adds a small error to the forecast – there are no huge Geo queries on the master node) over the last 10 hours of observation:

```sql
gitlabhq_production=# select
  now(),
  :'TS_PGSS_RESET' as pg_stat_statements_last_reset,
  now() - :'TS_PGSS_RESET' since_reset,
  sum(calls * length(query)) as total_bytes,
  sum(calls * length(query)) / extract(epoch from now() - :'TS_PGSS_RESET') as bytes_per_sec
from pg_stat_statements;
-[ RECORD 1 ]-----------------+------------------------------
now                           | 2018-09-18 07:51:16.052095+00
pg_stat_statements_last_reset | 2018-09-17 21:19:59.557029+00
since_reset                   | 10:31:16.495066
total_bytes                   | 37407827046
bytes_per_sec                 | 987626.41529573
```

This gives an estimate of ~1 MB/s for writing the Postgres log. These are not the busiest hours, though, so the estimate is subject to re-checking during Tuesday mid-day. This is an estimate for `log_min_duration_statement = 0`; for higher values, including the lowest possible positive one (`1ms`), the expected IO will be much lower.
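As a quick sanity check of the estimate above, here is a back-of-the-envelope shell sketch (not part of the plan; the constant is the rounded `bytes_per_sec` value from the query above) projecting the log volume for the longest planned collection window:

```shell
# Rough projection of log volume for the collection window, using the
# bytes_per_sec measured from pg_stat_statements above (rounded down).
BYTES_PER_SEC=987626                            # ~1 MB/s, from the query above
WINDOW_SEC=$((30 * 60))                         # 30-minute window (planned maximum)
TOTAL_BYTES=$((BYTES_PER_SEC * WINDOW_SEC))
echo "$((TOTAL_BYTES / 1024 / 1024)) MiB projected"   # prints "1695 MiB projected", i.e. ~1.7 GiB
```

So even the full 30-minute window at step 8 should stay under ~2 GiB at the measured rate, which is consistent with the risk estimate below.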
Additional observation: a lot of queries under 1ms
---

If we take the Top-15 queries (by `total_time`) among those whose `max_time` is < 1ms, we'll see that they account for >50% of the time:

```sql
select
  total_time,
  sum(total_time) over () as total_time_registered_in_pss,
  round(100::real * (sum(total_time) over (order by total_time desc)) / sum(total_time) over ())::text || '%' as accum_percentage,
  min_time,
  round(mean_time::numeric, 2) as mean_time,
  max_time,
  query
from pg_stat_statements
where max_time < 1
order by total_time desc
limit 15;
```

The detailed plan of actions
---

- [x] 0) Tuesday, 2018-09-18 12:00 UTC: reset `pg_stat_statements` and double-check the write IO estimate. In psql on the Postgres master:

  ```sql
  select now() as "TS_PGSS_RESET" from pg_stat_statements_reset() \gset

  select
    now(),
    :'TS_PGSS_RESET' as pg_stat_statements_last_reset,
    now() - :'TS_PGSS_RESET' since_reset,
    sum(calls * length(query)) as total_bytes,
    sum(calls * length(query)) / extract(epoch from now() - :'TS_PGSS_RESET') as bytes_per_sec
  from pg_stat_statements;
  \watch 300
  ```

- [x] 1) Wednesday, 2018-09-19 11:xx UTC: preparations. Set millisecond precision in log timestamps – use `%m` instead of `%t` (currently we have second-level precision, `%t`):

  ```
  alter system set log_line_prefix = '%m [%p]: [%l-1] db=%d,user=%u ';
  select pg_reload_conf();
  ```

- [x] 2) Wednesday, 2018-09-19 12:00 UTC: start descending.
  In psql on the Postgres master:

  ```sql
  alter system set log_min_duration_statement = '100ms';
  select pg_reload_conf();
  ```

- [x] 3) Observe the current size of the log and the average speed of its growth:

  ```shell
  sudo ls -lah /var/log/gitlab/postgresql/current

  # this will print avg bytes/sec based on the full "current" log file
  sudo head -n1 /var/log/gitlab/postgresql/current | head -c 22 | sed 's/_/T/g' | xargs date +"%s" --date \
    | awk \
      -v date="$(date +"%s")" \
      '{print (date - $1)}' \
    | awk \
      -v bytes="$(sudo wc -c /var/log/gitlab/postgresql/current | awk '{print $1}')" \
      '{print (bytes / $1)}'
  ```

  And check IO with:
  - `sudo iotop -o`
  - monitoring

- [x] 4) Wednesday, 2018-09-19 12:15 UTC: continue descending, set to `'10ms'`. In psql on the Postgres master:

  ```sql
  alter system set log_min_duration_statement = '10ms';
  select pg_reload_conf();
  ```

  Continue observing as described in step 3.

- [x] 5) Wednesday, 2018-09-19 12:30 UTC: continue descending, set to `'5ms'`. In psql on the Postgres master:

  ```sql
  alter system set log_min_duration_statement = '5ms';
  select pg_reload_conf();
  ```

  Continue observing as described in step 3. Also check IO with `sudo iotop -o` and monitoring (https://dashboards.gitlab.net/d/pEfSMUhmz/postgresql-disk-io?orgId=1 + GCP graphs).

- [x] 6) Wednesday, 2018-09-19 12:45 UTC: continue descending, set to `'1ms'`. In psql on the Postgres master:

  ```sql
  alter system set log_min_duration_statement = '1ms';
  select pg_reload_conf();
  ```

  Continue observing as described in step 3. Also check IO with `sudo iotop -o` and monitoring (https://dashboards.gitlab.net/d/pEfSMUhmz/postgresql-disk-io?orgId=1 + GCP graphs).

- [x] 7) DECISION TO BE MADE BASED ON OBSERVED NUMBERS: if the current IO caused by logging is > 1 MB/s, stop at `1ms` and collect only partial logs, with this threshold.
  If it's below, continue with the last step of descending, to collect the all-queries log:

  ```sql
  alter system set log_min_duration_statement = 0;
  select pg_reload_conf();
  ```

- [ ] 8) Collect logs (minimum 15 minutes, maximum 30 minutes):

  ```shell
  sudo cp /var/log/gitlab/postgresql/current ~nikolays/postgres_0_20180919.log
  ```

- [x] 9) Return to the initial state – in psql on the master node:

  ```sql
  alter system reset log_min_duration_statement;
  alter system reset log_line_prefix;
  select pg_reload_conf();
  \c
  show log_min_duration_statement;
  show log_line_prefix;
  ```

Notes on possible risks
---

No performance degradation is expected. In case of any unpredictable IO, the rollback (step 9) will be applied immediately.

It is expected that ~2-4 GB of logs will be generated during the collection phase (with a projected generation speed of ~1-2 MB/s), which is equivalent to ~10 days of logging with the current `log_min_duration_statement` threshold (1s). This might cause some delays in Kibana processing.

TODO after these works are done
---

After this is complete, TODO (separate issues):

- process and analyze the logs with pgBadger
- PoC usage of the logs with pgreplay on experimental DB nodes
- `set log_min_duration_statement to 100ms;` permanently
- set a better `log_line_prefix`
- stop using "one for all" DB roles; use separate ones for different apps and for administration

/cc @gitlab\-com/gl\-infra
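P.S. For the pgBadger item in the TODO list, a hedged sketch of a possible invocation (the file names and the availability guard are illustrative, not from the plan; the `--prefix` value mirrors the `log_line_prefix` set in step 1, and may need adjusting for any timestamp the log shipper prepends to lines of the `current` file):

```shell
# Sketch only: run pgBadger over the collected log, telling it the
# log_line_prefix that was in effect during collection (step 1).
LOG=postgres_0_20180919.log            # the file copied in step 8 (illustrative name)
if command -v pgbadger >/dev/null 2>&1 && [ -f "$LOG" ]; then
  pgbadger --prefix '%m [%p]: [%l-1] db=%d,user=%u ' \
           --outfile postgres_0_20180919.html \
           "$LOG"
else
  echo "pgbadger or $LOG not available; skipping"
fi
```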