Allow aggregation set definitions to specify an offset to apply to their source metrics
In #2445 (closed) we noticed that adding an offset to the metrics used in recording rules avoids incorrectly recorded values. These occur when not all source metrics are available yet at the time the recording rule runs. To do this consistently, we want to be able to configure an offset on a recording rule definition.
This should be applied to the aggregation sets that we're iterating on separately in this project.
Example for recording source metrics:
componentSLIs: aggregationSet.AggregationSet({
  id: 'component',
  name: 'Global SLI Metrics',
  intermediateSource: false,
  selector: { monitor: 'global' },
  labels: ['env', 'environment', 'tier', 'type', 'stage', 'component'],
  supportedBurnRates: ['5m', '30m', '1h', '6h', '3d'],
+ offset: '29s',
  metricFormats: {
    apdexRatio: 'gitlab_component_apdex:ratio_%s',
    opsRate: 'gitlab_component_ops:rate_%s',
    errorRate: 'gitlab_component_errors:rate_%s',
    errorRatio: 'gitlab_component_errors:ratio_%s',
  },
}),
This will generate a recording rule for the ops rate as follows:
- record: gitlab_component_ops:rate_5m
  labels:
    component: rails_request
    tier: sv
    type: web
  expr: |
    sum by (env,environment,stage) (
-     rate(gitlab_sli_rails_request_total{job="gitlab-rails",type="web"}[5m])
+     rate(gitlab_sli_rails_request_total{job="gitlab-rails",type="web"}[5m] offset 29s)
    )
Example for aggregation set transformations:
serviceSLIs: aggregationSet.AggregationSet({
  id: 'service',
  name: 'Global Service-Aggregated Metrics',
  intermediateSource: false,
  selector: { monitor: 'global' },
  labels: ['env', 'environment', 'tier', 'type', 'stage'],
  offset: '29s',
  metricFormats: {
    apdexSuccessRate: 'gitlab_service_apdex:success:rate_%s',
    apdexWeight: 'gitlab_service_apdex:weight:score_%s',
    apdexRatio: 'gitlab_service_apdex:ratio_%s',
    opsRate: 'gitlab_service_ops:rate_%s',
    errorRate: 'gitlab_service_errors:rate_%s',
    errorRatio: 'gitlab_service_errors:ratio_%s',
  },
  // Only include components (SLIs) with service_aggregation="yes"
  aggregationFilter: 'service',
}),
This results in the following change to the generated recording rule:
- record: gitlab_service_ops:rate_5m
  expr: |
    sum by (env,environment,tier,type,stage) (
-     (gitlab_component_ops:rate_5m{env="gprd",monitor="global"} >= 0) and on(component, type) (gitlab_component_service:mapping{global_aggregation="yes",monitor="global",service_aggregation="yes"})
+     (gitlab_component_ops:rate_5m{env="gprd",monitor="global"} offset 29s >= 0) and on(component, type) (gitlab_component_service:mapping{global_aggregation="yes",monitor="global",service_aggregation="yes"})
    )
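In both examples the offset has to sit directly after the selector it modifies: after the range selector inside `rate()` for source metrics, and after the instant selector (before `>= 0`) for aggregation set transformations. A minimal Jsonnet sketch of that placement follows. The helper names `applyOffset`, `sourceRate` and `aggregatedRate` are hypothetical and only for illustration; they are not the existing generator code in this project.

```jsonnet
// Hypothetical helpers for illustration only; names are assumptions.

// Appends a PromQL offset modifier to a selector when an offset is configured.
local applyOffset(selectorExpr, offset=null) =
  if offset == null then selectorExpr else '%s offset %s' % [selectorExpr, offset];

// Source metrics: the offset follows the range selector, inside rate().
local sourceRate(metric, selector, window, offset=null) =
  'rate(%s)' % [applyOffset('%s{%s}[%s]' % [metric, selector, window], offset)];

// Aggregation set transformations: the offset follows the instant selector,
// before the `>= 0` comparison.
local aggregatedRate(metric, selector, offset=null) =
  '(%s >= 0)' % [applyOffset('%s{%s}' % [metric, selector], offset)];

{
  // With the offset from the aggregation set definition.
  source: sourceRate('gitlab_sli_rails_request_total', 'job="gitlab-rails",type="web"', '5m', '29s'),
  aggregated: aggregatedRate('gitlab_component_ops:rate_5m', 'env="gprd",monitor="global"', '29s'),
  // Without an offset the expression is left unchanged.
  unchanged: sourceRate('gitlab_sli_rails_request_total', 'job="gitlab-rails",type="web"', '5m'),
}
```

Defaulting `offset` to `null` in this sketch keeps aggregation sets that do not configure an offset generating the same expressions as before.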
Edited by Bob Van Landuyt