Redefine Document ID and Datasource definition
Implements a new loading principle based on the datasource definition below:
A **datasource** represents the **definition domain** of a dataflow. Within this domain a dataflow is unique: by construction, no duplicates can exist. A datasource can span several agencies; if dataflows from different agencies must not be merged, a separate datasource must be created for each.
New datasource configuration:
{
  datasourceId,
  url,
  queries: [
    {
      agencyId: agencyId1,
      categorySchemeId: categorySchemeId1,
      version: version1,
    },
    {
      agencyId: agencyId2,
      categorySchemeId: categorySchemeId2,
      version: version2,
    },
  ]
}
where:
- version: a fixed version number or 'latest'; 'all' is not allowed
- agencyId: any Id or 'all'
- categorySchemeId: any Id or 'all'
The above configuration example should result in running 2 SDMX queries:
- ${url}/categoryscheme/agencyId1/categorySchemeId1/version1/?detail=allstubs&references=dataflow
- ${url}/categoryscheme/agencyId2/categorySchemeId2/version2/?detail=allstubs&references=dataflow
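As a rough illustration, the mapping from a datasource configuration to its SDMX queries could be sketched as below. The function name `build_categoryscheme_urls`, the Python dict layout, the example values, and the trailing-slash handling are assumptions made for this sketch, not the actual implementation.

```python
def build_categoryscheme_urls(datasource: dict) -> list:
    """Return one category scheme query URL per entry in `queries`."""
    # Assumption: a trailing slash on the base URL is tolerated and stripped.
    base = datasource["url"].rstrip("/")
    urls = []
    for query in datasource["queries"]:
        if query["version"] == "all":
            # Only a fixed version number or 'latest' is allowed.
            raise ValueError("version 'all' is not allowed")
        urls.append(
            f"{base}/categoryscheme/{query['agencyId']}/"
            f"{query['categorySchemeId']}/{query['version']}/"
            "?detail=allstubs&references=dataflow"
        )
    return urls


# Hypothetical configuration mirroring the example above.
datasource = {
    "datasourceId": "ds1",
    "url": "https://example.org/rest",
    "queries": [
        {"agencyId": "agencyId1", "categorySchemeId": "categorySchemeId1", "version": "version1"},
        {"agencyId": "agencyId2", "categorySchemeId": "categorySchemeId2", "version": "version2"},
    ],
}
urls = build_categoryscheme_urls(datasource)
```

Each entry in `queries` yields exactly one URL, so `urls` holds two entries here.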
The dataflows returned by both queries are then merged into a single list:
- only keep the latest version of any dataflow within a datasource
- remove duplicate dataflows (same dataflowId and version)
- if the result contains dataflows with the same ID and version but different agencies, keep one of them and ignore the others.
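The three rules above amount to keeping one entry per dataflowId. A minimal sketch, assuming versions are dotted numbers (for example `1.10`) and that "keep one of them" means keeping the first one encountered; `version_key` and `unify_dataflows` are hypothetical names:

```python
def version_key(version: str) -> tuple:
    """Order dotted versions numerically, so that '1.10' sorts after '1.9'."""
    # Assumption: every part is an integer; other formats raise ValueError.
    return tuple(int(part) for part in version.split("."))


def unify_dataflows(dataflows: list) -> list:
    """Keep a single entry per dataflowId: the one with the latest version.

    Entries sharing the same dataflowId and version (exact duplicates, or the
    same dataflow published by several agencies) collapse to the first seen.
    """
    kept = {}
    for flow in dataflows:
        current = kept.get(flow["dataflowId"])
        # Strict '>' means an equal version never replaces the first one seen.
        if current is None or version_key(flow["version"]) > version_key(current["version"]):
            kept[flow["dataflowId"]] = flow
    return list(kept.values())


merged = unify_dataflows([
    {"agencyId": "A1", "dataflowId": "DF_X", "version": "1.0"},
    {"agencyId": "A1", "dataflowId": "DF_X", "version": "1.10"},
    {"agencyId": "A2", "dataflowId": "DF_X", "version": "1.10"},  # same ID and version, other agency
    {"agencyId": "A1", "dataflowId": "DF_Y", "version": "2.0"},
    {"agencyId": "A1", "dataflowId": "DF_Y", "version": "2.0"},   # exact duplicate
])
```

Here `merged` contains `DF_X` version `1.10` from agency `A1` and a single `DF_Y` version `2.0`. Which agency wins on a tie is not specified in this issue; first-seen is only the simplest choice.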
Then, for each dataflow (agencyId, dataflowId, version):
- call {url}/dataflow/{agencyId}/{dataflowId}/{version}/?detail=referencepartial&references=all (URL parameters depend on configuration)
- index the dataflow document with ID = (datasourceId, dataflowId)
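A small sketch of this last step, assuming the `detail` and `references` values shown above are configurable defaults; `dataflow_url`, `document_id`, and the example values are hypothetical. It also shows why the document ID stays unique: agency and version are not part of it, and the datasource ID separates identical dataflows loaded through different datasources.

```python
def dataflow_url(url: str, agency_id: str, dataflow_id: str, version: str,
                 detail: str = "referencepartial", references: str = "all") -> str:
    """Build the query URL for one dataflow; query parameters are overridable."""
    return (
        f"{url.rstrip('/')}/dataflow/{agency_id}/{dataflow_id}/{version}/"
        f"?detail={detail}&references={references}"
    )


def document_id(datasource_id: str, dataflow_id: str) -> tuple:
    """Document ID = (datasourceId, dataflowId); agency and version are excluded."""
    return (datasource_id, dataflow_id)


# The same dataflow reached through two hypothetical datasources gives two documents.
flow = {"agencyId": "A1", "dataflowId": "DF_X", "version": "1.10"}
index = {}
for datasource_id in ("ds1", "ds2"):
    index[document_id(datasource_id, flow["dataflowId"])] = dataflow_url(
        "https://example.org/rest", flow["agencyId"], flow["dataflowId"], flow["version"]
    )
```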
Edited by Nicolas Briemant