Refactoring - Data and IndividualData QoL
What does the code in the MR do?
This MR implements several quality-of-life changes: it makes minor refactorings to Data and IndividualData, enables ID-based membership tests for Data, updates the documentation and type hints, and improves tests and coverage.
New features
- ID-based membership test for Data objects, i.e. a correct implementation of 'SUB-ID-12' in data using __contains__, as illustrated (for IndividualParameters) in #42.
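For reviewers who want a quick picture of the intended behaviour, here is a minimal sketch of an ID-based membership test. It is not the MR's code: Individual and DataSketch are hypothetical stand-ins for IndividualData and Data, the ID-keyed dict is an assumed storage layout, and the built-in TypeError stands in for whichever Leaspy exception the MR actually raises. Without __contains__, Python's in operator falls back to iterating over the object, which does not necessarily compare against IDs.

```python
class Individual:
    """Hypothetical stand-in for IndividualData: only carries an ID."""

    def __init__(self, idx):
        self.idx = idx


class DataSketch:
    """Hypothetical stand-in for Data, storing individuals keyed by ID."""

    def __init__(self, individuals):
        # Assumed layout: a dict mapping each ID to its individual.
        self.individuals = {ind.idx: ind for ind in individuals}

    def __contains__(self, key):
        # Only string IDs are supported in this sketch; anything else is
        # rejected explicitly instead of silently returning False.
        if not isinstance(key, str):
            raise TypeError(
                f"Cannot test membership for a key of type {type(key).__name__}"
            )
        return key in self.individuals


data = DataSketch([Individual("SUB-ID-12"), Individual("SUB-ID-13")])
print("SUB-ID-12" in data)
print("SUB-ID-99" in data)
```

Raising on non-string keys is a design choice of the sketch; returning False would be an equally plausible convention.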
Refactoring
IndividualData
- Refactor observation management by using a single, 2D numpy array, instead of a list of lists
- Remove .add_cofactor() and .add_observation(), which were duplicates
- Remove .individual_parameters, which I assumed was legacy (since IndividualParameters now exists for this purpose) and which was not used anywhere
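As a rough illustration of the single 2D array idea, the sketch below stores observations with shape (n_visits, n_features) and appends new visits by concatenation. IndividualSketch, its attribute names, and the add_observations signature are assumptions made for this example and may differ from the actual IndividualData API.

```python
import numpy as np


class IndividualSketch:
    """Hypothetical stand-in for IndividualData."""

    def __init__(self, idx):
        self.idx = idx
        self.timepoints = None    # 1D array, shape (n_visits,)
        self.observations = None  # 2D array, shape (n_visits, n_features)

    def add_observations(self, timepoints, observations):
        new_t = np.asarray(timepoints, dtype=float)
        new_obs = np.asarray(observations, dtype=float)
        # One row of observations is expected per timepoint.
        if new_t.ndim != 1 or new_obs.ndim != 2 or len(new_t) != len(new_obs):
            raise ValueError("Expected one row of observations per timepoint")

        if self.observations is None:
            self.timepoints, self.observations = new_t, new_obs
            return

        # The number of features must stay the same across calls.
        if new_obs.shape[1] != self.observations.shape[1]:
            raise ValueError("Inconsistent number of features")
        self.timepoints = np.concatenate([self.timepoints, new_t])
        self.observations = np.concatenate([self.observations, new_obs], axis=0)


ind = IndividualSketch("SUB-ID-12")
ind.add_observations([70.0, 72.0], [[0.1, 0.2], [0.3, 0.4]])
ind.add_observations([71.0], [[0.5, 0.6]])
print(ind.observations.shape)
```

Compared with a list of lists, the array makes the shape explicit and lets inconsistent rows fail early. This sketch appends visits in call order and does not sort them by time.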
Data
- Rewrite the .to_dataframe() method to:
  - Check for cofactors presence
  - Remove the torch dependency for readability, given that it was purely internal and numpy was already needed
- Remove .get_by_idx(), which was legacy (replaced by __getitem__)
- Set computed attributes as properties, e.g. .dimension, for easier maintenance, assuming they are not computed too frequently
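The two Data changes above can be pictured with the following sketch, which relies only on numpy and pandas. It is an illustration under assumptions, not the MR's implementation: Individual and DataSketch are hypothetical stand-ins, the with_cofactors flag and the "ID" / "TIME" column names are invented for the example, and ValueError stands in for the relevant Leaspy exception.

```python
import numpy as np
import pandas as pd


class Individual:
    """Hypothetical stand-in for IndividualData."""

    def __init__(self, idx, timepoints, observations, cofactors=None):
        self.idx = idx
        self.timepoints = np.asarray(timepoints, dtype=float)
        # 2D array, shape (n_visits, n_features)
        self.observations = np.asarray(observations, dtype=float)
        self.cofactors = dict(cofactors or {})


class DataSketch:
    """Hypothetical stand-in for Data."""

    def __init__(self, individuals, headers):
        self.individuals = {ind.idx: ind for ind in individuals}
        self.headers = list(headers)

    @property
    def dimension(self):
        # Derived on each access instead of being stored and kept in sync.
        return len(self.headers)

    @property
    def cofactors(self):
        # Union of the cofactor names found across all individuals.
        names = set()
        for ind in self.individuals.values():
            names.update(ind.cofactors)
        return sorted(names)

    def to_dataframe(self, with_cofactors=False):
        cofactor_names = self.cofactors
        # Check for cofactors presence before trying to export them.
        if with_cofactors and not cofactor_names:
            raise ValueError("No cofactors are present in the data")

        frames = []
        for ind in self.individuals.values():
            df = pd.DataFrame(ind.observations, columns=self.headers)
            df.insert(0, "TIME", ind.timepoints)
            df.insert(0, "ID", ind.idx)
            if with_cofactors:
                for name in cofactor_names:
                    # Missing cofactors are left empty for that individual.
                    df[name] = ind.cofactors.get(name)
            frames.append(df)
        return pd.concat(frames, ignore_index=True)


data = DataSketch(
    [
        Individual("SUB-ID-12", [70.0, 71.0], [[0.1, 0.2], [0.3, 0.4]], {"sex": "F"}),
        Individual("SUB-ID-13", [65.0], [[0.5, 0.6]]),
    ],
    headers=["feat_a", "feat_b"],
)
print(data.dimension)
print(data.to_dataframe(with_cofactors=True))
```

The property trades a small recomputation cost on each access for not having to update a stored value, which matches the assumption that the attribute is not read in a hot loop. The sketch does not handle a Data object with no individuals.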
Documentation
Updated the documentation of most Data and IndividualData methods, with more accurate type hints and the appropriate Leaspy exceptions.
Where should the reviewer start?
I would suggest IndividualData as a whole, then Data, focusing first on .to_dataframe() and .__contains__(), which are the major changes.
How can the code be tested?
New tests have been implemented for the added / refactored features. All tests succeed.
When is the MR due for? (review deadline)
These are mostly quality-of-life, nice-to-have changes, so there is no priority.
What issues are linked to the MR?
This partly addresses #42 (comments)