Add an XML option to specify the `kernel-device`
Typical scenario: a deterministic atlas with "big" target data, to be estimated on a CUDA-enabled cluster with a large number of CPU cores. The KeOps kernel is needed for efficient memory usage.
For now, Deformetrica refuses multithreading by default in this case: it overwrites the user-specified `number-of-threads` option to 1,
even though the user might prefer multithreading on the CPU cores over the GPU backend for the KeOps kernel.
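The requested behaviour could look roughly like the sketch below. It is only an illustration under stated assumptions: the XML layout, the accepted values (`cpu`, `gpu`, `auto`) and the helper `resolve_number_of_threads` are hypothetical and not taken from the Deformetrica code base. Only the `kernel-device` and `number-of-threads` option names come from this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment: the root element and the accepted values of
# <kernel-device> are assumptions made for this illustration.
EXAMPLE_XML = """
<optimization-parameters>
    <number-of-threads>16</number-of-threads>
    <kernel-device>cpu</kernel-device>
</optimization-parameters>
"""


def resolve_number_of_threads(xml_text):
    """Return the number of threads to use, given the XML options.

    Assumed rule: if the user explicitly asks for the CPU device, keep the
    requested number of threads. Otherwise (gpu, auto, or option missing),
    fall back to a single thread, as is done today.
    """
    root = ET.fromstring(xml_text)
    requested = int(root.findtext("number-of-threads", default="1"))
    device = root.findtext("kernel-device", default="auto").strip().lower()

    if device == "cpu":
        # User prefers multithreading over the GPU backend.
        return requested
    # Current behaviour: multithreading is refused.
    return 1


if __name__ == "__main__":
    print(resolve_number_of_threads(EXAMPLE_XML))
```

With such a rule, leaving the option out keeps the current single-thread fallback, so existing configurations would not change behaviour.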
Edited by Benoit Martin