Ensure StackGres pods have enough Shared Memory (SHM)
As is very well described in the blog post "PostgreSQL at low level: stay curious!", K8s pods may have a limited amount of SHM (64M) that may cause Postgres errors, in particular in the face of concurrent queries:
```
ERROR: could not resize shared memory segment
"/PostgreSQL.699663942" to 50438144 bytes:
No space left on device
```
We can easily check the amount of shared memory on a given pod on any K8s cluster, and see that it may default to 64M (see the example below, run on microk8s):
```shell
$ kubectl run -it $RANDOM --image busybox --restart Never -- df -h | grep shm
shm                      64.0M         0     64.0M   0% /dev/shm
```
While there's an upcoming fix to K8s, it is not ready yet, and we will need a fix for earlier K8s versions anyway. We cannot let a SG Postgres container be subject to this kind of error; it is not suitable for production workloads.
There's an apparently easy fix (also mentioned in the OpenShift documentation), which is to include the following as part of the pod's definition:
```yaml
volumeMounts:
- mountPath: /dev/shm
  name: dshm
volumes:
- name: dshm
  emptyDir:
    medium: Memory
```
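For illustration, a minimal standalone pod applying this fix might look like the following sketch (the pod and container names are hypothetical, not part of StackGres). Note that a `medium: Memory` emptyDir is a tmpfs mount, which by default can grow up to half of the node's RAM unless a `sizeLimit` is set:

```yaml
# Hypothetical test pod; only the dshm volume/mount is the relevant part.
apiVersion: v1
kind: Pod
metadata:
  name: shm-test
spec:
  restartPolicy: Never
  containers:
  - name: main
    image: busybox
    command: ["df", "-h", "/dev/shm"]
    volumeMounts:
    - mountPath: /dev/shm
      name: dshm
  volumes:
  - name: dshm
    emptyDir:
      medium: Memory
      # Optionally cap the tmpfs size; without this it can use up to
      # half of the node's RAM.
      # sizeLimit: 1Gi
```

Running this pod and inspecting its output should show `/dev/shm` sized well above the 64M default.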
Tasks:

- Create the appropriate e2e tests to verify that the SG pods have more than 64M of shared memory (we might not aim at a fixed amount; but if it is more than the default, we know we have a fix in place, and the amount of SHM will depend on the RAM of the nodes). They should fail at the beginning on some (all) of the e2e-supported platforms. Please report the results here.
- Add and test the above code. It should fix the problem, and the e2e tests should run fine.
Edited by Xavier Sierra