chore(metrics): Dedupe response when multiple datapoints with the same TraceID get aggregated

Ankit Bhatnagar requested to merge abhatnagar/dedup-histogram-exemplars into main

Related #2932 (closed)

This MR handles the case where multiple datapoints with the same TraceID are aggregated within a given time window, by deduplicating the TraceIds array. The deduplication logic was added as a common utility, groupUniqInOrder, which also preserves the order in which trace IDs are returned by the ClickHouse query.
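
For reviewers who want the intended semantics spelled out, here is a minimal Go sketch of order-preserving deduplication. It is only an illustration: the name groupUniqInOrder is borrowed from this MR, but the signature, the use of generics, and the sample IDs are assumptions, and the actual utility may be implemented differently (for example, on the query side).

```go
package main

import "fmt"

// groupUniqInOrder is a hypothetical stand-in for the utility named in this MR.
// It returns the distinct elements of ids in order of first occurrence.
func groupUniqInOrder[T comparable](ids []T) []T {
	seen := make(map[T]struct{}, len(ids))
	// Start from an empty, non-nil slice so an empty input stays an empty list.
	out := make([]T, 0, len(ids))
	for _, id := range ids {
		if _, dup := seen[id]; dup {
			continue // already emitted; keep only the first occurrence
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	return out
}

func main() {
	// Illustrative input: the same trace ID reported by several datapoints.
	tids := []string{
		"32432213-1747-e8f0-e9f8-ef9903cfd5fa",
		"a09299d0-7895-cac7-8f31-4aaec93ae07d",
		"32432213-1747-e8f0-e9f8-ef9903cfd5fa",
		"5277e73e-ca2a-fd6a-7aff-15c5a3557627",
		"a09299d0-7895-cac7-8f31-4aaec93ae07d",
	}
	for _, id := range groupUniqInOrder(tids) {
		fmt.Println(id)
	}
}
```

Because the output follows first-occurrence order rather than map or hash-set iteration order, the same input always yields the same list.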

Note: A previous iteration of this MR used arrayReduce('groupUniqArray', $tids), but the order of its result is non-deterministic, which made the structure of our API response non-deterministic as well.

Tested on a dev-cluster with:

➜  ~ curl --silent "http://localhost:8082/v3/query/51792562/metrics/search?mname=rpc.client.duration&mtype=Histogram&mvisual=Heatmap&period=5m" | jq -r '.results[].data[].distribution[0]'
[
  [
    "1726572900000000000",
    "0.000000",
    [
      "32432213-1747-e8f0-e9f8-ef9903cfd5fa",
      "a09299d0-7895-cac7-8f31-4aaec93ae07d",
      "5277e73e-ca2a-fd6a-7aff-15c5a3557627"
    ]
  ],
  [
    "1726572960000000000",
    "0.000000",
    [
      "1b2c0eb5-81bf-b131-2784-137a9b989c7b"
    ]
  ],
  [
    "1726573020000000000",
    "0.000000",
    []
  ]
]
Edited by Ankit Bhatnagar
