Use quantization for Elasticsearch embeddings

Proposal

Use quantization on Elasticsearch for storing embeddings.

Available since 8.12.0:

When using the `int8_hnsw` index type, each dimension of the float vectors is quantized to a 1-byte integer. This can reduce the memory footprint by as much as 75% at the cost of some accuracy. However, disk usage can increase by about 25% due to the overhead of storing both the quantized and the raw vectors.
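As a sketch, enabling quantization is a mapping change on the `dense_vector` field. The field name `embedding`, the `dims` value, and the `cosine` similarity below are assumptions for illustration, not our actual mapping:

```json
{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "int8_hnsw"
        }
      }
    }
  }
}
```

On versions before 8.12.0 the `int8_hnsw` type is not recognised, so this mapping cannot be applied unconditionally.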

A significant optimisation was made in 8.14 (link).

I suggest upgrading to 8.14.3 to unlock the full potential of vector search. Failing that, we can upgrade to 8.12.2 to get just the quantization.

We need to continue supporting earlier versions of Elasticsearch.
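Since older clusters must keep working, the mapping has to be chosen based on the cluster version at index-creation time. A minimal sketch of that gating logic, assuming a hypothetical `embedding_mapping` helper and that plain `hnsw` is the fallback index type:

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a dotted Elasticsearch version string into a comparable tuple."""
    return tuple(int(part) for part in version.split(".")[:3])


def embedding_mapping(es_version: str, dims: int = 768) -> dict:
    """Hypothetical helper: build a dense_vector mapping for the given cluster.

    Uses int8_hnsw quantization on 8.12.0+ and falls back to the
    unquantized hnsw index type on earlier versions.
    """
    index_type = "int8_hnsw" if parse_version(es_version) >= (8, 12, 0) else "hnsw"
    return {
        "type": "dense_vector",
        "dims": dims,
        "index": True,
        "similarity": "cosine",
        "index_options": {"type": index_type},
    }
```

In practice the version string would come from the cluster info API at startup, and the result would be passed into the index-creation request.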

Steps