Different behaviours on pgbouncer nodes on Ubuntu 18.04 vs. 16.04

On the new registry pgbouncer nodes in gstg, which run Ubuntu 18.04, osqueryd and systemd-resolved use significantly more CPU than on the older Ubuntu 16.04 pgbouncer nodes.

We should investigate this and check if this could be related to delivery#2028 (closed).

Current Status

We identified a DNS resolver loop between dnsmasq and systemd-resolved as the cause. The loop makes systemd-resolved spin at 100% CPU on one core and slows down DNS lookups, which caused sporadic DB connection timeouts for registry.

More services might be impacted if they use dnsmasq, systemd-resolved and resolvconf together. One example is deploy-cny-01-sv-gstg.c.gitlab-staging-1.internal, which is also suffering from the DNS loop.
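
To check other hosts for the same kind of loop, something like the following Python sketch might help. It is only an illustration: it assumes the upstream servers of each resolver can be written down as resolv.conf-style `nameserver` lines, and the addresses in the usage note (127.0.0.1 for dnsmasq, 127.0.0.53 for the systemd-resolved stub) are assumptions, not values taken from the affected nodes. The function names `parse_nameservers` and `find_forwarding_loop` are hypothetical helpers.

```python
def parse_nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def find_forwarding_loop(upstreams, start):
    """Follow forwarding edges from `start` and return the first cycle found.

    `upstreams` maps a resolver address to the list of addresses it
    forwards to. Returns the cycle as a list of addresses (first and last
    entry are the same), or None if no loop is reachable from `start`.
    """
    path = []          # resolvers on the current forwarding chain
    on_path = set()    # same content as `path`, for fast membership tests
    visited = set()    # resolvers already fully explored

    def visit(node):
        if node in on_path:
            # We came back to a resolver already on the chain: a loop.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in upstreams.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)
```

For example, with the assumed addresses above, `find_forwarding_loop({"127.0.0.1": parse_nameservers("nameserver 127.0.0.53"), "127.0.0.53": parse_nameservers("nameserver 127.0.0.1")}, "127.0.0.1")` reports a loop between the two local resolvers, while a chain that ends at an external resolver returns `None`.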

Edited by Henri Philipps