
WIP: csv: Fix reading empty columns as float

Tjerk Vreeken requested to merge csv-fix-emtpy-column-read-nan into master

We used to call np.genfromtxt with dtype=None, which makes it guess the dtype of each column. If a column had no values at all, it would guess a boolean dtype with every value False. This would eventually be turned into a Timeseries of 0.0, whereas we want a Timeseries full of np.nan.

It is difficult to make np.genfromtxt read every data column as float when the first column can sometimes be a string. Therefore, we switch to the stdlib module 'csv' and build the structured/named array ourselves. That way we can make sure that all data values are read in as floats, and that they are NaN when a value is missing.
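A minimal sketch of the approach, not the actual code in this merge request. The function name read_csv_as_floats, the column names, and the sample data are hypothetical. It assumes a header row, a first column that is kept as a string, and rows that all have as many fields as the header.

```python
import csv
import io

import numpy as np


def read_csv_as_floats(fileobj, delimiter=","):
    """Read CSV into a structured array (hypothetical helper, for illustration).

    The first column is kept as a string; every other column is float64,
    with empty fields stored as NaN.
    """
    reader = csv.reader(fileobj, delimiter=delimiter)
    header = [name.strip() for name in next(reader)]
    rows = [row for row in reader if row]  # skip blank lines

    # Size the string column to the longest value (at least 1 character).
    width = max([len(row[0]) for row in rows] + [1])
    dtype = [(header[0], f"U{width}")] + [(name, "f8") for name in header[1:]]

    data = np.empty(len(rows), dtype=dtype)
    for i, row in enumerate(rows):
        # An empty field becomes NaN instead of being guessed as False/0.0.
        values = [float(v) if v.strip() else np.nan for v in row[1:]]
        data[i] = (row[0], *values)
    return data


# Usage: column "H" has no values at all and is still read as float NaN.
sample = io.StringIO("time,Q,H\n2020-01-01,1.5,\n2020-01-02,,\n")
result = read_csv_as_floats(sample)
print(result.dtype)
print(result["H"])
```

Because the dtype is fixed up front rather than guessed, a column without any values cannot end up as boolean.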

Note that performance is about the same as with np.genfromtxt.

Closes #1128 (closed)

Edited by Tjerk Vreeken
