Reduce rate of mapping updates in ES
Relates to https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10094.
One of the things we're looking into as part of stabilising the logging cluster is reducing the rate of mapping updates. Because we're using dynamic mappings, an index's mapping gets updated whenever a document with a new field arrives, and that currently happens quite frequently.
These updates increase the size (and rate of change) of the cluster state, which we hypothesise is putting load on the master nodes.
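As a rough way to gauge how much a single index's mapping contributes, here is a minimal sketch that counts the mapped fields and the compact serialised size of a saved `_mapping` response. The function name `mapping_stats` is made up for illustration, and it assumes that every mapped leaf field carries a string-valued `type` key, so object fields without an explicit `type` are not counted themselves.

```shell
# Hypothetical helper: summarise a saved `_mapping` response.
# Usage: mapping_stats MAPPING.json
mapping_stats() {
  # Count objects whose "type" key is a string, i.e. mapped leaf fields.
  # (A field that is itself named "type" is an object there, so it is not miscounted.)
  fields=$(jq '[paths(objects | (.type | type) == "string")] | length' "$1") || return 1
  # Size in bytes of the compact JSON form (includes jq's trailing newline).
  bytes=$(jq -c . "$1" | wc -c | tr -d ' ')
  echo "fields=${fields} bytes=${bytes}"
}
```

Running it against snapshots taken at different times gives a coarse idea of how fast a mapping is growing. It says nothing about the rest of the cluster state.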
Some methods for inspecting the cluster state were developed in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10134.
Here is a way of capturing the mapping of a specific index twice, some interval apart, and then comparing the two snapshots:
➜ ~ curl -s "${ELASTICSEARCH_URL}/pubsub-rails-inf-gprd/_alias" | jq -r 'to_entries[]|select(.value.aliases[].is_write_index)|.key'
pubsub-rails-inf-gprd-002998
➜ ~ curl -s "${ELASTICSEARCH_URL}/pubsub-rails-inf-gprd-002998/_mapping" > rails_mapping.0.json
# wait some time (you can watch for `update_mapping` entries in the Elastic Cloud logs)
➜ ~ curl -s "${ELASTICSEARCH_URL}/pubsub-rails-inf-gprd-002998/_mapping" > rails_mapping.1.json
➜ ~ diff -U 0 <(jq -r 'paths(scalars)|join(".")' rails_mapping.0.json | sort) <(jq -r 'paths(scalars)|join(".")' rails_mapping.1.json | sort)
This shows which fields were added, and where in the structure they sit. That can point to dynamic fields that we may want to restrict. For example: gitlab-org/gitlab!31910 (merged).
A longer-term strategy is to have a strict schema for the logs we produce, but that is out of scope for this issue.
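If only the newly added fields are of interest, the comparison can be narrowed down. The following is a minimal sketch, not something that has been run against the logging cluster. The function name `added_fields` is made up for illustration, and it assumes both files are saved `_mapping` responses in which each mapped leaf field has a string-valued `type` key.

```shell
# Hypothetical helper: list fields present in the newer snapshot only.
# Usage: added_fields OLD.json NEW.json
added_fields() {
  # Emit the dotted path of every object whose "type" key is a string.
  filter='paths(objects | (.type | type) == "string") | map(tostring) | join(".")'
  old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
  jq -r "$filter" "$1" | sort > "$old_sorted"
  jq -r "$filter" "$2" | sort > "$new_sorted"
  # comm -13 keeps only the lines unique to the second (newer) file.
  comm -13 "$old_sorted" "$new_sorted"
  rm -f "$old_sorted" "$new_sorted"
}
```

With the files captured above this would be called as `added_fields rails_mapping.0.json rails_mapping.1.json`, printing one dotted path per added field.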