Possible speed ups for aiapy.calibrate.register()
tl;dr The easiest acceleration (3-4x vs. the original) is to wait for the contains_full_disk improvement in upstream sunpy. We recommend switching to the OpenCV affine transform (8-9x acceleration vs. the original). We also recommend testing whether registering the raw data first and then creating the SunPy Map is faster than the current approach of creating the map and then registering the map data.
As part of the Helio Hackweek 2020 AIApy project, Raphael Attie (@WaaallEEE) and I looked at aiapy.calibrate.register to see how it could be sped up. We wrote some basic replacements for register and benchmarked them. To summarize:
- cfd_register: Jack Ireland changed sunpy.map.maputils.contains_full_disk to test only the edge pixels instead of every pixel. This is an ongoing pull request in sunpy, but the original version of that change made the overall register 3-4x faster. (Technically, this has nothing to do with the actual affine transform.)
- cupy_register: Drop in CuPy as a one-to-one replacement for scipy.ndimage.affine_transform (CAVEAT: it only supports linear interpolation at the moment). This is about 6-9x faster than register.
- cv_register: Implement R. Attie's version of register using the OpenCV (cv2) Python library and merge it with the current sunpy metadata handling. This is about 6-9x faster than register.
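To illustrate why the edge-only check in cfd_register helps, here is a minimal sketch of the idea in plain NumPy. It is not the sunpy implementation: the function names, the pixel-space disk center, and the pixel-space radius are all assumptions made for illustration. If the disk center lies inside the image and every border pixel is farther from the center than the disk radius, the whole disk must be inside the frame, so interior pixels never need to be tested.

```python
import numpy as np


def edge_pixel_coords(shape):
    """Return (rows, cols) arrays for the pixels on the border of a 2D image.

    Corner pixels appear twice, which is harmless for an all() test.
    """
    ny, nx = shape
    xs = np.arange(nx)
    ys = np.arange(ny)
    rows = np.concatenate([np.zeros(nx), np.full(nx, ny - 1), ys, ys])
    cols = np.concatenate([xs, xs, np.zeros(ny), np.full(ny, nx - 1)])
    return rows, cols


def contains_full_disk_edges(shape, disk_center, disk_radius):
    """Hypothetical edge-only test: is a disk fully inside the image?

    disk_center is (row, col) in pixels and disk_radius is in pixels;
    both are assumptions of this sketch, not sunpy's interface.
    """
    ny, nx = shape
    cy, cx = disk_center
    center_inside = (0 <= cy <= ny - 1) and (0 <= cx <= nx - 1)
    rows, cols = edge_pixel_coords(shape)
    # Distance of every border pixel from the disk center.
    dist = np.hypot(rows - cy, cols - cx)
    return bool(center_inside and np.all(dist > disk_radius))
```

For a 4096 x 4096 AIA image this touches 16,384 border pixels rather than roughly 16.8 million, which is consistent with the check no longer dominating the runtime. Because the border is sampled at pixel centers, the result is only accurate to about half a pixel.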
So, it looks like upgrading to the OpenCV affine transform is the ideal path forward. This may change in a year or two as further improvements are made to CuPy (like adding cubic interpolation to affine_transform).
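As a rough sketch of what swapping the transform backend could look like (this is not the cv_register code itself), the hypothetical helper below takes a matrix and offset in the scipy.ndimage.affine_transform convention, i.e. an output-to-input mapping in (row, col) order, and hands them to cv2.warpAffine. The function name and the fallback to scipy when OpenCV is not installed are assumptions of this example.

```python
import numpy as np
from scipy import ndimage

try:
    import cv2  # optional dependency in this sketch
except ImportError:
    cv2 = None


def fast_affine_transform(image, matrix, offset, missing=0.0):
    """Apply an output -> input affine map given in scipy's (row, col) order.

    Uses OpenCV when available and falls back to scipy otherwise.
    """
    image = np.asarray(image, dtype=np.float32)
    matrix = np.asarray(matrix, dtype=np.float64)
    offset = np.asarray(offset, dtype=np.float64)

    if cv2 is None:
        return ndimage.affine_transform(
            image, matrix, offset=offset, order=3,
            mode="constant", cval=missing,
        )

    # OpenCV works in (x, y) order, so swap the axes of the matrix and
    # the offset, then build the 2x3 matrix that warpAffine expects.
    m_cv = np.hstack([matrix[::-1, ::-1], offset[::-1].reshape(2, 1)])
    ny, nx = image.shape
    # WARP_INVERSE_MAP tells OpenCV that m_cv already maps output pixels
    # to input pixels, which matches the scipy convention.
    return cv2.warpAffine(
        image, m_cv, (nx, ny),
        flags=cv2.INTER_CUBIC | cv2.WARP_INVERSE_MAP,
        borderMode=cv2.BORDER_CONSTANT,
        borderValue=float(missing),
    )
```

The two backends will not agree bit for bit: scipy's order=3 is a cubic B-spline interpolation, while OpenCV's INTER_CUBIC is a bicubic convolution kernel, so any replacement should be validated against the current register output.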
Raphael also broke register down into its individual components for timing, namely the metadata overhead versus the actual affine transformation. His conclusion:
using SunPy maps to update the metadata adds 2000% bottleneck with respect to the fastest AIA data prepping, which is only an interpolation using an Affine Transform. While these accelerated interpolations offer up to ~10x speedup (OpenCV) with respect to the original method, it would be premature optimization to accelerate the Affine Transform & interpolation method any further as long as the Sunpy overhead (~2 seconds) is not dealt with.
Ultimately, we conclude that it may make more sense to build a data prep pipeline that does not go through SunPy maps, i.e. load the data, prepare it, and only then create the finished Map. The added abstraction prevents vectorization of the data prep, which could otherwise be leveraged for acceleration. The data could be loaded with astropy.io.fits, which is already among the sunpy dependencies.
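A minimal sketch of such an array-first pipeline is shown below. It is an illustration under several assumptions, not the aiapy implementation: the helper names load_raw and register_array are hypothetical, the image is assumed to sit in HDU 1 of the FITS file, CRPIX1/CRPIX2 are assumed to mark disk center (CRVAL1 = CRVAL2 = 0), the pixels are assumed square, and the rotation follows the FITS CROTA2 convention. Only CRPIX, CDELT and CROTA2 are updated here, whereas a real register would have to handle more metadata.

```python
import numpy as np
from scipy import ndimage

TARGET_SCALE = 0.6  # arcsec per pixel after registration


def load_raw(path):
    """Hypothetical loader: read data and header without building a Map."""
    from astropy.io import fits  # imported lazily; only needed for file I/O

    with fits.open(path) as hdul:
        hdu = hdul[1]  # assumption: the image lives in the first extension
        return hdu.data.astype(np.float32), hdu.header.copy()


def register_array(data, header, order=3, missing=0.0):
    """Rotate, rescale and recenter a raw image with one affine transform."""
    ny, nx = data.shape
    theta = np.deg2rad(header["CROTA2"])
    c, s = np.cos(theta), np.sin(theta)
    # Input pixels covered by one output pixel.
    zoom = TARGET_SCALE / header["CDELT1"]
    # Output -> input mapping in (row, col) order.
    matrix = zoom * np.array([[c, -s], [s, c]])
    # FITS reference pixels are 1-indexed; arrays are 0-indexed.
    center_in = np.array([header["CRPIX2"] - 1.0, header["CRPIX1"] - 1.0])
    center_out = np.array([(ny - 1) / 2.0, (nx - 1) / 2.0])
    offset = center_in - matrix @ center_out

    out = ndimage.affine_transform(
        data, matrix, offset=offset, order=order,
        mode="constant", cval=missing,
    )

    new_header = header.copy()
    new_header["CRPIX1"] = center_out[1] + 1.0
    new_header["CRPIX2"] = center_out[0] + 1.0
    new_header["CDELT1"] = TARGET_SCALE
    new_header["CDELT2"] = TARGET_SCALE
    new_header["CROTA2"] = 0.0
    return out, new_header


# The Map is created once, at the very end, for example:
# data, header = load_raw("aia_image.fits")   # hypothetical file name
# data, header = register_array(data, header)
# aia_map = sunpy.map.Map(data, header)
```

Because register_array only touches a NumPy array and a header mapping, the transform step can be swapped for a faster backend or batched over many images without paying the Map construction cost for each intermediate result.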