Use REPORTING_YEAR_START_DAY in PERIOD_START & PERIOD_END value calculation, support irregular "time range" time periods & distinct points in time
NOTE: This should be implemented together with dotstatsuite-core-transfer#190
-
When uploading data into the DB, if a REPORTING_YEAR_START_DAY attribute value is provided then derive the PERIOD_START and PERIOD_END values from the TIME_PERIOD value by taking the REPORTING_YEAR_START_DAY attribute value into account.
-
Also make sure to support irregular "time range" time periods as specified by the SDMX standard https://sdmx.org/wp-content/uploads/SDMX_2-1_SECTION_6_TechnicalNotes.pdf (p. 19ff) when calculating PERIOD_START & PERIOD_END values:
749 4.2.7 Distinct Range 750 In the case that the reporting period does not fit into one of the prescribe periods 751 above, a distinct time range can be used. The value of these ranges is based on the 752 ISO 8601 time interval format of start/duration. Start can be expressed as either an 753 ISO 8601 date or a date-time, and duration is expressed as an ISO 8601 duration. 754 However, the duration can only be positive. … 761 Time Range: Start time and duration (YYYY-MMDD(Thh:mm:ss)?/<duration>) … 826 <Series REF_PERIOD="2000-01-01T00:00:00/P2M"/>
-
Also make sure to support the "Date Time" type that is used to unambiguously state that an observation is measured at a single distinct point in time rather than being reported over a period. The representation is "YYYY-MM-DDThh:mm:ss" (SDMX allows also reporting seconds fractionally - but support doesn't need to be added yet if that requires major changes or a implies a general performance loss.) In this case, the period start and the period end are equal.
IMPORTANT ADDITIONAL NOTE:
SDMX time periods with frequencies higher than daily (e.g. hours, minutes, seconds, milliseconds) need a special type for the fact table columns StartPeriod and EndPeriod. In order to not penalise performance-wise the majority of DSDs that do not use datetime TimePeriods, this special column type should be triggered only through the DSD annotation of type "SUPPORT_DATETIME".
"annotations": [{
"type": "SUPPORT_DATETIME"
}]
Only if that annotation is set, the fact table should be created with the datatime column type for StartPeriod and EndPeriod, otherwise the current type (date?) continuous to be used.
If a user imports data with datetime time periods into a DSD table to yet ready for this type of time periods, due to the missing annotation, then the transfer (at least in advanced validations) should log an appropriate error message: "Error found in {... info on place of error, like row number, component etc. ...}. Datetime time periods are not supported by default. To allow supporting this special type of time periods, you need to add an annotation of type "SUPPORT_DATETIME" to the underlying Data Structure Definition and re-initialise the data DB objects using the transfer method /init/dataflow."
This message should be shown instead of current "ERROR String or binary data would be truncated. The statement has been terminated."
REPORTING_YEAR_START_DAY example
Today when uploading the following data:
then the DB content contains:
See:
- DE: https://de-qa.siscc.org/vis?lc=en&df%5Bds%5D=qa%3Astable&df%5Bid%5D=DF_WPI_OS_2&df%5Bag%5D=ABS&df%5Bvs%5D=0.0.2&av=true&pd=%2C&dq=......Q
- structure: https://nsi-qa-stable.siscc.org/rest/dataflow/ABS/DF_WPI_OS_2/0.0.2
- data: https://nsi-qa-stable.siscc.org/rest/data/ABS,DF_WPI_OS_2,0.0.2/......?dimensionAtObservation=AllDimensions
Expected DB content:
TIME_PERIOD | REPORTING_YEAR_START_DAY | calculated PERIOD_START | calculated PERIOD_END |
---|---|---|---|
1997-Q1 | "--07-01" | "1997-07-01" | "1997-09-30" |
1997-Q2 | "--07-01" | "1997-10-01" | "1997-12-31" |
1997-Q3 | "--07-01" | "1998-01-01" | "1998-03-31" |
1997-Q4 | "--07-01" | "1998-04-01" | "1998-06-30" |
1998-Q1 | "--07-01" | "1998-07-01" | "1998-09-30" |
1998-Q2 | "--07-01" | "1998-10-01" | "1998-12-31" |
For more details on the calculation of time shifts see the SDMX technical notes.
SDMX time-period IDs
As of today, the SDMX Source implementation of the observation object does not consider the SDMX version and accepts the following valid SDMX v2.0 observation time values:
- semester value with period indicator
B
, e.g. 2014-B1 (same as 2014-S1 in SDMX v2.1) - weekly value with only 1 digit after the period indicator
W
, e.g. 2014-W1 (same as 2014-W01 in SDMX v2.1)
Therefore:
- 2021-W1 is correct, and 2021-W01 is correct
- 2021-M1 is correct, and 2021-M01 is correct
- 2021-D1 is correct, and 2021-D001 is correct
Also note that the uploaded time period IDs must be valid SDMX ID for time periods. E.g. According to above SDMX Technical Notes: "The reporting week is always represented as two digits, therefore 1-9 are 0 padded (e.g. 01). This allows the values to be sorted chronologically using textual sorting methods." Similar for monthly ("yyyy-Mmm") and daily data ("yyyy-Dddd") time periods.
Thus:
- 2021-W1 is wrong, 2021-W01 is correct
- 2021-M1 is wrong, 2021-M01 is correct
- 2021-D1 is wrong, 2021-D001 is correct
Wrong weekly example:
- structure: MDP_INDICATORS.xml
- data: data_mdp.xml
Time range example
Use any data upload example that has a TIME_PERIOD time dimension, and replace at least one TIME_PERIOD value with (an) example(s) for (YYYY-MM-DD(Thh:mm:ss)?/):
- "2000-01-01T00:00:00/P5Y": this corresponds to the 5-year range "2000-2004"
Date time example
Example structure and data with TIME_PERIOD=2009-01-15T00:00:00. Currently such data upload returns the error: "Received an invalid column length from the bcp client for colid 1."
Alternatively, use any data upload example that has a TIME_PERIOD time dimension, and replace at least one TIME_PERIOD value with (an) example(s) for (YYYY-MM-DDThh:mm:ss):
- "2000-11-21T10:20:30": this corresponds to the distinct point in time 10:20:30 AM on 2000-11-21.