Add number of obs to pre-generated actual content constraints
As Jean-Baptiste,
I want to retrieve the number of observations of a dataflow within milliseconds by requesting the pre-generated actual content constraint,
So that I do not have to wait long for this information, and so that the SDMX web service is not burdened with unnecessary workload.
Requirements
Currently, once a dataflow has been initialised, the NSI availability queries return dynamically generated actual content constraints, which contain the number of available observations in the annotation of type “sdmx_metrics” and id “obs_count”.
For instance:
https://nsi-qa-stable.siscc.org/rest/availableconstraint/OECD.ECO,ADB,1.0/all?mode=exact
returns:
<structure:ContentConstraint id="CC" agencyID="SDMX" version="1.0" type="Actual">
<common:Annotations>
<common:Annotation id="obs_count">
<common:AnnotationTitle>843295</common:AnnotationTitle>
<common:AnnotationType>sdmx_metrics</common:AnnotationType>
</common:Annotation>
</common:Annotations>
<common:Name xml:lang="en">Autogenerated content constraint</common:Name>
The actual content constraints (both live and PIT) that are currently pre-generated by the transfer service should also include this annotation. Once this is done, the DLM could use the pre-generated actual content constraint instead of the current 0-0 range request (to be handled in a separate ticket).
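For illustration, a client could read the observation count from such a constraint roughly as follows. This is a minimal Python sketch, not the DLM's actual code: the function name `read_obs_count` is hypothetical, the sample document is a closed-off version of the excerpt above, and the namespace URIs in the sample are placeholders because the code matches elements by local name only.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def read_obs_count(constraint_xml: str) -> Optional[int]:
    """Return the obs_count annotation value, or None if it is absent.

    Hypothetical helper: looks for an Annotation with id="obs_count" and
    AnnotationType "sdmx_metrics", and reads the count from AnnotationTitle.
    """
    root = ET.fromstring(constraint_xml)
    for element in root.iter():
        if local_name(element.tag) != "Annotation":
            continue
        if element.get("id") != "obs_count":
            continue
        fields = {local_name(c.tag): (c.text or "").strip() for c in element}
        title = fields.get("AnnotationTitle", "")
        if fields.get("AnnotationType") == "sdmx_metrics" and title.isdigit():
            return int(title)
    return None


# Sample based on the excerpt above; namespace URIs are placeholders (assumption).
SAMPLE = """
<structure:ContentConstraint xmlns:structure="urn:example:structure"
    xmlns:common="urn:example:common"
    id="CC" agencyID="SDMX" version="1.0" type="Actual">
  <common:Annotations>
    <common:Annotation id="obs_count">
      <common:AnnotationTitle>843295</common:AnnotationTitle>
      <common:AnnotationType>sdmx_metrics</common:AnnotationType>
    </common:Annotation>
  </common:Annotations>
  <common:Name xml:lang="en">Autogenerated content constraint</common:Name>
</structure:ContentConstraint>
"""

print(read_obs_count(SAMPLE))  # prints 843295
```

Matching on local names keeps the sketch independent of the exact namespace declarations in the response; a real client would more likely bind the proper SDMX namespaces.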
Potential technical implementation approach
The NSI web service already allows retrieving a fully dynamically generated actual content constraint that contains the obs_count annotation, e.g.
https://nsi-qa-stable.siscc.org/rest/availableconstraint/OECD.ECO,ADB,1.0/all?mode=exact
This query seems to perform very well (response in 1.5s). To obtain this annotation easily, to improve the performance of the pre-generation of stored actual content constraints, and to reuse code and thereby reduce the overall code base, the Eurostat approach to generating the actual content constraint could be used instead of our own.
However, if this approach is used, the time range information must be represented correctly in the actual content constraint. Today the NSI returns this information as follows:
...
</structure:CubeRegion>
<structure:ReferencePeriod startTime="1945-01-01T00:00:00Z" endTime="2022-01-01T00:00:00Z"/>
</structure:ContentConstraint>
...
This is incorrect. It should be returned as a TimeRange within the CubeRegion:
...
<common:KeyValue id="TIME_PERIOD">
<common:TimeRange>
<common:StartPeriod isInclusive="true">1945-01-01T00:00:00Z</common:StartPeriod>
<common:EndPeriod isInclusive="false">2022-01-01T00:00:00Z</common:EndPeriod>
</common:TimeRange>
</common:KeyValue>
</structure:CubeRegion>
</structure:ContentConstraint>
This issue is being addressed in this ticket: https://citnet.tech.ec.europa.eu/CITnet/jira/browse/SDMXRI-1527 (see dotstatsuite-core-sdmxri-nsi-ws#108 (closed)).
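The relocation of the time range described above can be sketched in Python as follows. This is only an illustration of the target structure, not the fix delivered in SDMXRI-1527: the function name is hypothetical, the namespace URIs are assumed to be the SDMX 2.1 structure and common namespaces, and the inclusivity flags are copied from the expected output shown above.

```python
import xml.etree.ElementTree as ET

# Assumed SDMX 2.1 namespace URIs (not stated in this ticket).
STRUCTURE_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure"
COMMON_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common"


def move_reference_period_to_time_range(constraint: ET.Element) -> None:
    """Replace a ReferencePeriod by a TIME_PERIOD TimeRange in the CubeRegion.

    Hypothetical helper; does nothing if either element is missing.
    """
    ref = constraint.find(f"{{{STRUCTURE_NS}}}ReferencePeriod")
    region = constraint.find(f"{{{STRUCTURE_NS}}}CubeRegion")
    if ref is None or region is None:
        return

    # Append the KeyValue as the last child of the CubeRegion.
    key_value = ET.SubElement(
        region, f"{{{COMMON_NS}}}KeyValue", {"id": "TIME_PERIOD"}
    )
    time_range = ET.SubElement(key_value, f"{{{COMMON_NS}}}TimeRange")

    start = ET.SubElement(
        time_range, f"{{{COMMON_NS}}}StartPeriod", {"isInclusive": "true"}
    )
    start.text = ref.get("startTime")

    end = ET.SubElement(
        time_range, f"{{{COMMON_NS}}}EndPeriod", {"isInclusive": "false"}
    )
    end.text = ref.get("endTime")

    # The ReferencePeriod element is no longer needed.
    constraint.remove(ref)


# Reduced sample mirroring the current NSI output shown above.
SAMPLE = f"""
<structure:ContentConstraint xmlns:structure="{STRUCTURE_NS}"
    xmlns:common="{COMMON_NS}"
    id="CC" agencyID="SDMX" version="1.0" type="Actual">
  <structure:CubeRegion include="true"/>
  <structure:ReferencePeriod startTime="1945-01-01T00:00:00Z"
      endTime="2022-01-01T00:00:00Z"/>
</structure:ContentConstraint>
"""

constraint = ET.fromstring(SAMPLE)
move_reference_period_to_time_range(constraint)
print(ET.tostring(constraint, encoding="unicode"))
```

The sketch rewrites an already generated constraint; in the service itself the TimeRange would more naturally be emitted directly when the constraint is built.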