Improve performance of ReadableDataLocationFactory.GetReadableDataLocation
The dotTrace performance profiler highlights the function CopyStream as a performance hotspot for XML and CSV imports (both from file and from URL) of data and referential metadata. Specifically, the hotspot is the line inputStream.CopyTo(outputStream, 1024); in the class Org.Sdmxsource.Util.Io.StreamUtil:
0.14 % GetDataflowFromSdmxFile • 20,532 ms • DotStat.Transfer.Producer.BaseFileProducer`1.GetDataflowFromSdmxFile(T)
  0.10 % GetReadableDataLocation • 15,049 ms • Org.Sdmxsource.Util.Io.ReadableDataLocationFactory.GetReadableDataLocation(Stream)
    0.10 % ReadableDataLocationTmp..ctor • 15,049 ms • Org.Sdmxsource.Util.Io.ReadableDataLocationTmp..ctor(Stream)
      0.10 % GetFileFromStream • 15,049 ms • Org.Sdmxsource.Util.Io.URIUtil.GetFileFromStream(Stream)
        0.10 % CopyStream • 15,049 ms • Org.Sdmxsource.Util.Io.StreamUtil.CopyStream(Stream, Stream)
          0.10 % CopyTo • 15,049 ms • System.IO.FileStream.CopyTo(Stream, Int32)
  0.03 % GetDataflow • 4,727 ms • DotStat.MappingStore.MappingStoreDataAccess.GetDataflow(String, String, String, String, Boolean, ResolveCrossReferences, Boolean)
  0.00 % MoveNextDataset • 508 ms • Org.Sdmxsource.Sdmx.DataParser.Engine.Reader.AbstractDataReaderEngine.MoveNextDataset
  0.00 % Dispose • 202 ms • Org.Sdmxsource.Sdmx.DataParser.Engine.Reader.AbstractDataReaderEngine.Dispose
  0.00 % GetDataReaderEngine • 22 ms • Org.Sdmxsource.Sdmx.DataParser.Manager.DataReaderManager.GetDataReaderEngine(IReadableDataLocation, ISdmxObjectRetrievalManager)
  0.00 % GetRetrievalManager • 13 ms • DotStat.MappingStore.MappingStoreDataAccess.GetRetrievalManager(String)
  0.00 % FileStream..ctor • 10 ms • System.IO.FileStream..ctor(String, FileMode, FileAccess, FileShare)
This function is called in the import step to define the data source for the reader engine that will eventually read the contents of the import file. The bigger the file, the more time it takes for the function ReadableDataLocationFactory.GetReadableDataLocation to complete.
- Make sure that the size of the input file does not affect the performance of the function ReadableDataLocationFactory.GetReadableDataLocation, especially because there is no need to read the contents of the file at this step.
- Make sure that this function does not open and read the contents of the file, or otherwise copy the contents of the file from one stream to another.
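One possible direction for the two points above, as a minimal sketch only. It assumes the reader engine could work from a location that opens the file on demand, which has not been verified against SdmxSource. LazyFileLocation, LocationSketch, FromStream and CopyWithLargerBuffer are hypothetical names for illustration and are not part of Org.Sdmxsource; only the System.IO calls are real .NET APIs.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a readable data location (NOT an SdmxSource type).
// Creating it costs the same regardless of file size, because only the path is stored.
public sealed class LazyFileLocation
{
    public string FilePath { get; }

    public LazyFileLocation(string filePath)
    {
        // The file is neither opened nor copied here.
        FilePath = filePath ?? throw new ArgumentNullException(nameof(filePath));
    }

    // Opens a fresh read-only stream only when the reader actually needs the contents.
    public Stream OpenRead()
    {
        return new FileStream(FilePath, FileMode.Open, FileAccess.Read, FileShare.Read,
                              81920, FileOptions.SequentialScan);
    }
}

public static class LocationSketch
{
    // If the input is already backed by a file, reuse its path instead of copying it
    // to a temporary file. Returns null when the stream is not a FileStream.
    public static LazyFileLocation FromStream(Stream input)
    {
        if (input is FileStream fileStream)
        {
            return new LazyFileLocation(fileStream.Name);
        }
        return null;
    }

    // Fallback for non-file inputs (for example a URL response), where a copy may still
    // be needed: a larger buffer than 1024 bytes means fewer read/write calls.
    // 81920 bytes is the default buffer size used by Stream.CopyTo(Stream).
    public static void CopyWithLargerBuffer(Stream input, Stream output, int bufferSize = 81920)
    {
        input.CopyTo(output, bufferSize);
    }
}
```

Whether the larger buffer alone removes the hotspot would need to be confirmed with the same dotTrace measurement on the sample files below.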
Sample structures and input files:
- OECD.ECO-EO_TEST_PERFS-1.0-all.xml
- Data file sample in shared folder: \prognoz-app-2\Shared_files_for_SDD\#461\EO_TEST_PERFS.csv
Edited by Pedro Carranza