feat: add threshold checking nuance to SMART tests
Sometimes, SMART tests on a disk can fail on platforms like Hetzner because the disk "percentage used" is over 100%, despite the disk still being usable.
According to Hetzner support, it's important to check available_spare and available_spare_threshold to determine whether the disk is no longer safe to use and should be replaced.
If available_spare > available_spare_threshold, then we should be safe to continue operating, and the failure is a false positive.
This commit adds a new configuration option (CHECK_SPARE_THRESHOLD) and code to perform more nuanced checking of SMART test results.
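
The spare-threshold rule described above can be sketched roughly as follows. This is only an illustration, not the code in this commit: the function names and the `check_spare_threshold` parameter are hypothetical, and the dictionary is assumed to hold the NVMe health fields (available_spare, available_spare_threshold, percentage_used) as already-parsed integers.

```python
def spare_above_threshold(health: dict) -> bool:
    """True when available_spare is strictly greater than its threshold."""
    return health["available_spare"] > health["available_spare_threshold"]


def failure_is_false_positive(health: dict, check_spare_threshold: bool) -> bool:
    """Decide whether a failed SMART test can be treated as a false positive.

    Hypothetical helper: `check_spare_threshold` stands in for the
    CHECK_SPARE_THRESHOLD option; when it is off, every failure counts.
    """
    if not check_spare_threshold:
        return False
    # Only the "percentage used over 100%" case is treated as a false positive,
    # and only while the spare capacity is still above the threshold.
    over_used = health.get("percentage_used", 0) > 100
    return over_used and spare_above_threshold(health)


# Example values (made up for illustration): worn disk, plenty of spare left.
worn_but_usable = {
    "available_spare": 100,
    "available_spare_threshold": 10,
    "percentage_used": 104,
}
print(failure_is_false_positive(worn_but_usable, check_spare_threshold=True))
```

With the option disabled, or once available_spare drops to the threshold or below, the failure is reported as real.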
Signed-off-by: vados <vados@vadosware.io>