Fuzzy hash / text similarity helper

Right now 'closeable' uses a CRC24 hash generated from the lowercase text content to identify matching tooltips/reminders. The text is transformed to lowercase before hashing to ignore capitalization revisions, which helps but isn't great.
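
For reference, a minimal sketch of that approach. It assumes the OpenPGP CRC-24 variant (init 0xB704CE, polynomial 0x1864CFB); the variant actually used by 'closeable' is not stated here, and `crc24` and `tip_key` are hypothetical names.

```python
def crc24(data: bytes) -> int:
    """CRC-24 as defined for OpenPGP (assumed variant, see note above)."""
    crc = 0xB704CE
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:      # bit 24 set: reduce by the polynomial
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def tip_key(text: str) -> str:
    """Hypothetical storage key: CRC-24 of the lowercased text, as 6 hex chars."""
    return f"{crc24(text.lower().encode('utf-8')):06x}"


# Capitalization-only edits keep the same key.
same = tip_key("Tap To Close") == tip_key("tap to close")
```

Any other edit, even a fixed typo, generally produces an unrelated key, so there is no way to tell a small revision from a complete rewrite.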

Ideally I'd like a fuzzy hash function that can return a 'difference %' between hashes without the original content. This would allow us to set clear reset thresholds for things like closeable tip text.

It would be:

a fuzzy hash, CTPH (context triggered piecewise hashing) or similar
a relatively small piece of code; the current implementation is ~230 characters
reasonably performant for up to a paragraph of input text. Although it isn't meant for real-time operations, it is part of PD init, where it closes previously closed tips.

Should:

make tooltip/reminder reappear only if content was significantly changed
generate hashes of a reasonable size (120 chars? flexible) that work as storage keys.
allow a custom diff threshold
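
One way the requirements above could be met is sketched below. This is not CTPH: it swaps in SimHash over character trigrams as the "or similar" option, because it yields a fixed-size fingerprint whose Hamming distance maps directly to a 'difference %'. The names `fuzzy_hash`, `difference_percent`, and `should_reset`, the 64-bit width, and the 25% default threshold are all assumptions for illustration.

```python
import hashlib

BITS = 64  # fingerprint width (assumption); 64 bits = 16 hex chars as a storage key


def _shingles(text: str, n: int = 3) -> list:
    """Lowercase, collapse whitespace, and return the list of character n-grams."""
    norm = " ".join(text.lower().split())
    if len(norm) <= n:
        return [norm]
    return [norm[i:i + n] for i in range(len(norm) - n + 1)]


def fuzzy_hash(text: str) -> str:
    """SimHash fingerprint of the text as a fixed-length hex string."""
    counts = [0] * BITS
    for sh in _shingles(text):
        digest = hashlib.blake2b(sh.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(BITS):
            # Each n-gram votes +1 or -1 on every bit position.
            counts[bit] += 1 if (h >> bit) & 1 else -1
    fp = 0
    for bit in range(BITS):
        if counts[bit] > 0:
            fp |= 1 << bit
    return f"{fp:016x}"


def difference_percent(hash_a: str, hash_b: str) -> float:
    """Share of differing fingerprint bits, computed from the hashes alone."""
    differing = bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")
    return 100.0 * differing / BITS


def should_reset(old_hash: str, new_hash: str, threshold: float = 25.0) -> bool:
    """True if a closed tip should reappear; threshold is the custom diff limit."""
    return difference_percent(old_hash, new_hash) >= threshold


stored = fuzzy_hash("Swipe left to dismiss this tip.")
current = fuzzy_hash("SWIPE LEFT  to dismiss this tip.")
unchanged = not should_reset(stored, current)
```

Only the stored hash is needed at init time, so the original tip text never has to be kept. Case and whitespace changes are normalized away before hashing; how many bits flip for a given wording change depends on the text, so any real threshold would need tuning against actual tip revisions.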

Edited Feb 26, 2022 by Lorin Halpert