K002 Workload type
The aim
While K001 shows just one line with "totals", and K003 shows detailed stats, per each query group, report K002 aims to give an understanding of workload type: how classes of queries (SELECT vs INSERT vs DELETE, etc) are distributed, by any of cumulative metric – from the main ones as calls, total_time, rows to secondary, IO and CPU related.
How to get data
We detect class in simple and straightforward way – observing query's the first word. Of course, this approach has drawbacks – for example, it won't be really helpful if a lot of PLpgSQL functions are being used in the project, and among them there are some read-only, and there are modifying ones or even with DDL.
Nevertheless, this approach is good because it's simple. The query (it also distinguishes regular, non-blocking SELECTs from SELECT..FOR UPDATE and other blocking SELECTs)
with data as (
select
case lower(regexp_replace(s.query, '^\W*(\w+)\W+.*$', '\1'))
when 'select' then
case
when s.query ~* 'for\W+(no\W+key\W+)?update' then 'select ... for [no key] update'
when s.query ~* 'for\W+(key\W+)?share' then 'select ... for [key] share'
else 'select'
end
else lower(regexp_replace(s.query, '^\W*(\w+)\W+.*$', '\1'))
end as word,
count(*) cnt,
sum(calls) calls,
sum(s.total_time) total_time,
1000 * sum(k.user_time) user_time,
1000 * sum(k.system_time) system_time,
sum(reads_blks) reads_blks,
sum(writes_blks) writes_blks
from pg_stat_statements s
left join pg_stat_kcache_detail k using (query)
group by 1
)
select
word,
calls,
round(100 * calls::numeric / (sum(calls) over())::numeric, 2) as "Calls, %",
round(total_time::numeric, 2) total_time,
round(100 * total_time::numeric / (sum(total_time) over())::numeric, 2) as "Time, %",
reads_blks,
round(100 * reads_blks::numeric / (sum(reads_blks) over())::numeric, 2) as "Read blocks, %",
... -- all other metrics
from data
order by total_time desc;
Acceptance criteria
As a DBA, I see the table with several rows (one row for INSERTs, one for DELETEs, etc). I can understand, what is the ratio (in % – see https://gitlab.com/postgres-ai-team/postgres-health-check/issues/179) of, say DELETEs, if we consider total_time. Or if we consider rows. Or shared_blks_read.
For each query class, I also see absolute values for each metric, as well as differentiated (per second, per call) values.