12.2 Support NaN as observation values in SDMX-ML data messages
Data upload using the attached SDMX-ML data file fails with the error "OLE DB provider 'STREAM' for linked server '(null)' returned invalid data for column '[!BulkInsert].VALUE'." because the upload doesn't support NaN for the observation values. NaNs are to be considered NULL observation values (consciously "missing" values).
SDMX-CSV upload using the same data works fine.
DF_SDG_KH_data.xml DF_SDG_KH.xml
Data is taken from the NSI WS v6.16.0 (http://cambodgia-statvm1.eastasia.cloudapp.azure.com/SeptemberDisseminateNSIService/rest/data/KH_NIS,DF_SDG_KH,1.1/all/)
Note that SDMX-RI has a configuration feature that generates SDMX-ML data messages with all NULL observation values expressed as NaN.
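The intended handling, reading NaN observation values as NULL, can be sketched as follows. This is only an illustration in Python, not the transfer service's implementation: the XML fragment is a simplified, namespace-free stand-in for a structure-specific SDMX-ML series, and the names `SAMPLE`, `obs_value_to_nullable` and `read_observations` are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Simplified stand-in for an SDMX-ML series (namespaces and header omitted).
# The REF_AREA dimension and the values are made up for illustration.
SAMPLE = """
<Series REF_AREA="KH">
  <Obs TIME_PERIOD="2018" OBS_VALUE="12.5"/>
  <Obs TIME_PERIOD="2019" OBS_VALUE="NaN"/>
  <Obs TIME_PERIOD="2020" OBS_VALUE="14"/>
</Series>
"""


def obs_value_to_nullable(raw):
    """Return the value as a float, or None when it is the NaN marker."""
    value = float(raw)  # float("NaN") parses successfully in Python
    return None if math.isnan(value) else value


def read_observations(xml_text):
    """Map each TIME_PERIOD to its observation value, with NaN read as None."""
    series = ET.fromstring(xml_text.strip())
    return {
        obs.get("TIME_PERIOD"): obs_value_to_nullable(obs.get("OBS_VALUE"))
        for obs in series.iter("Obs")
    }


observations = read_observations(SAMPLE)
```

Here the 2019 observation ends up as `None`, which a loader could then write as NULL instead of passing a non-numeric value to the bulk insert.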
Other test case reported by STATEC:
Complete SDMX-ML file: LU1-DF_B1101-1.0-data.xml
STATEC has also uploaded the data (without NaN values); here is the corresponding table: https://de-qa.siscc.org/vis?lc=en&df[ds]=staging%3ASIS-CC-stable&df[id]=DF_B1101&df[ag]=LU1&df[vs]=1.0&av=true&pd=%2C2020&dq=SL13%2BSL04.A&vw=tb&lb=bt
Although the uploaded data references artefacts with a later version, it should be more or less the same table.
Other example reported by NBB: dotstatsuite-data-lifecycle-manager#258 (closed)
Technical notes
- See implementation approach to differentiate between NaN input and empty input with an Input of type="number".
Simply testing if a double/float value is equal to "NaN" would not be correct, because "NaN" is different from everything, including from "NaN", and this test would always be negative. Also see: https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca2242
- SDMX version 2.0: https://sdmx.org/wp-content/uploads/SDMX_2_0_SECTION_03A_SDMX_ML.pdf, page 105, states:
  5.9 Missing Observation Values
  In some of the SDMX-ML documents, an Observation is required (as in the Utility format) or it is desirable to indicate that a numerical value does not exist. While this information may be captured in an Observation-level attribute such as OBS_STATUS, with a code indicating that the value for the observation is missing, there is also a way to reliably indicate this state in the data itself. For this purpose, missing observation values – when included in an SDMX-ML data file – should be indicated using “NaN”. In XML, this indicates “not a number”, but is still valid in numeric fields. This avoids having to use a number (such as “-9999999” or “0”), along with a status code of “missing” (or similar construct) to indicate missing numeric values.
- In the "Append" context, "NaN" should be interpreted as: The value is to be set to empty (missing). It does not mean "the value is not provided and thus not to be changed".
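The two notes above on NaN comparison and on "Append" semantics can be illustrated with a minimal Python sketch. It is not the actual upload code: `apply_append` and the sample values are hypothetical, and observations are modelled as plain dictionaries keyed by time period. In .NET, the predicate corresponding to `math.isnan` is `double.IsNaN`, which is what the CA2242 rule recommends.

```python
import math

nan = float("nan")

# Equality never matches: NaN compares unequal to everything, itself included.
equality_test = (nan == float("nan"))
# A dedicated predicate is needed instead.
predicate_test = math.isnan(nan)


def apply_append(stored, incoming):
    """Merge incoming observations into the stored ones (hypothetical helper).

    A period absent from `incoming` means "not provided": the stored value
    is kept. A NaN value means "set to missing": None (NULL) is stored.
    """
    merged = dict(stored)
    for period, value in incoming.items():
        merged[period] = None if math.isnan(value) else value
    return merged


stored = {"2018": 12.5, "2019": 13.0, "2020": 14.0}
incoming = {"2019": float("nan"), "2020": 15.0}
result = apply_append(stored, incoming)
```

In this sketch 2018 is untouched because it was not provided, 2019 is reset to missing because it was sent as NaN, and 2020 is overwritten with the new value.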