Skip to content

Metadata query `SELECT .. FROM pg_type` returns 20% of all tuples from GitLab.com's Postgres cluster

While investigating another issue I noticed that query: 2814304093225873107 utilizes a great deal of resources on our postgres instances.

Query

The query is as follows:

gitlabhq_production=# select * from pg_stat_statements where queryid=2814304093225873107;
-[ RECORD 1 ]-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
userid              | 16384
dbid                | 16401
queryid             | 2814304093225873107
query               | SELECT t.oid, t.typname, t.typelem, t.typdelim, t.typinput, r.rngsubtype, t.typtype, t.typbasetype                                                                                                            +
                    | FROM pg_type as t                                                                                                                                                                                             +
                    | LEFT JOIN pg_range as r ON oid = rngtypid                                                                                                                                                                     +
                    | WHERE                                                                                                                                                                                                         +
                    |   t.typname IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32, $33, $34, $35, $36, $37, $38, $39, $40)+
                    |   OR t.typtype IN ($41, $42, $43)                                                                                                                                                                             +
                    |   OR t.typinput = $44::regprocedure                                                                                                                                                                           +
                    |   OR t.typelem != $45
calls               | 7819708
total_time          | 28423904.384813692
min_time            | 2.545977
max_time            | 1065.5387460000002
mean_time           | 3.6349061096418893
stddev_time         | 0.8468267154966969
rows                | 6295112473
shared_blks_hit     | 5192924371
shared_blks_read    | 225
shared_blks_dirtied | 0
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 17.147496
blk_write_time      | 0

An example with parameters bound looks like this, and returns about 900 rows

SELECT
  t.oid,
  t.typname,
  t.typelem,
  t.typdelim,
  t.typinput,
  r.rngsubtype,
  t.typtype,
  t.typbasetype
FROM
  pg_type AS t
  LEFT JOIN pg_range AS r ON oid = rngtypid
WHERE
  t.typname IN ('int2', 'int4', 'int8', 'oid', 'float4', 'float8', 'text', 'varchar', 'char', 'name', 'bpchar', 'bool', 'bit', 'varbit', 'timestamptz', 'date', 'money', 'bytea', 'point', 'hstore', 'json', 'jsonb', 'cidr', 'inet', 'uuid', 'xml', 'tsvector', 'macaddr', 'citext', 'ltree', 'line', 'lseg', 'box', 'path', 'polygon', 'circle', 'interval', 'time', 'timestamp', 'numeric') 
  OR t.typtype IN ('r', 'e', 'd')
  OR t.typinput = 'array_in(cstring,oid,integer)'::regprocedure
  OR t.typelem != 0

Explain plan:

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=0.41..321.24 rows=1289 width=86) (actual time=0.022..3.367 rows=809 loops=1)
   Merge Cond: (t.oid = r.rngtypid)
   Buffers: shared hit=669
   ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.28..313.25 rows=1289 width=82) (actual time=0.016..3.148 rows=809 loops=1)
         Filter: ((typname = ANY ('{int2,int4,int8,oid,float4,float8,text,varchar,char,name,bpchar,bool,bit,varbit,timestamptz,date,money,bytea,point,hstore,json,jsonb,cidr,inet,uuid,xml,tsvector,macaddr,citext,ltree,line,lseg,box,path,polygon,circle,interval,time,timestamp,numeric}'::name[])) OR (typtype = ANY ('{r,e,d}'::"char"[])) OR ((typinput)::oid = '750'::oid) OR (typelem <> '0'::oid))
         Rows Removed by Filter: 1907
         Buffers: shared hit=667
   ->  Index Scan using pg_range_rngtypid_index on pg_range r  (cost=0.13..4.72 rows=6 width=8) (actual time=0.003..0.006 rows=6 loops=1)
         Buffers: shared hit=2
 Planning Time: 0.286 ms
 Execution Time: 3.453 ms
(11 rows)

Cost

Between 20% and 30% of all rows returned from postgres are for this query.

image

https://thanos-query.ops.gitlab.net/graph?g0.range_input=1w&g0.max_source_resolution=1h&g0.expr=sum%20by%20(fqdn)%20(rate(pg_stat_statements_rows%7Benv%3D%22gprd%22%2C%20type%3D%22patroni%22%2C%20queryid%3D%222814304093225873107%22%7D%5B1h%5D))%0A%2F%0Asum%20by%20(fqdn)%20(rate(pg_stat_statements_rows%7Benv%3D%22gprd%22%2C%20type%3D%22patroni%22%7D%5B1h%5D))%0A&g0.tab=0

It is called at a sustained rate of around 8 reqs/second on each postgres node:

image

https://thanos-query.ops.gitlab.net/graph?g0.range_input=1w&g0.max_source_resolution=1h&g0.expr=sum%20by%20(fqdn)%20(rate(pg_stat_statements_calls%7Benv%3D%22gprd%22%2C%20type%3D%22patroni%22%2C%20queryid%3D%222814304093225873107%22%7D%5B1h%5D))%0A%0A&g0.tab=0

Ask

This appears to be static, or relatively static data.

Do we need to query it at this rate? Could we reduce the rate or only do this on startup