Question about performance limitations of different setups
Created by: m-attack
Hi, I was looking over this spreadsheet and became quite interested in what you guys do here. The benchmark results are very impressive for both a single node and a cluster. But then I noticed that even though there's support for dimensions in the PostgreSQL indexing backend, the test results don't reflect performance with this backend selected, and there isn't much data about it anywhere. Also, I see in your docs that you recommend using the ddb_proxy component, while the tests write directly to dalmatinerdb. So I have a few questions:
- What are the performance implications of using the PostgreSQL backend instead of the built-in one? What limitations arise?
- What are the performance implications of using ddb_proxy instead of writing data directly to dalmatinerdb? Surely adding another step for the data won't make things faster, but I'm more interested in how much slower the system gets in this case, since the proxy definitely has some pluses, like support for different input data formats.
Thanks in advance for the answers.