Fuzzy hash / text similarity helper
Right now 'closeable' uses a CRC24 hash of the text content to identify matching tooltips/reminders. The text is lowercased before hashing so that capitalization-only revisions are ignored, which helps but isn't great: any other edit, however small, still changes the hash.
Ideally I'd like a fuzzy hash function that can return a 'difference %' between two hashes without needing the original content. This would allow us to set clear reset thresholds for things like closeable tip text.
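For reference, the current approach can be sketched roughly as follows. This is an illustration in Python, not the actual 'closeable' code: the CRC-24 variant (the OpenPGP one from RFC 4880) and the names `crc24` and `tip_key` are assumptions, since the issue doesn't say which parameters the real implementation uses.

```python
# Assumed variant: CRC-24/OpenPGP (RFC 4880). The real implementation may differ.
CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB


def crc24(data: bytes) -> int:
    """Bitwise CRC-24 over the given bytes."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF


def tip_key(text: str) -> str:
    """Hypothetical storage key: 6 hex digits of the CRC of the lowercased text."""
    return format(crc24(text.lower().encode("utf-8")), "06x")


# Capitalization changes map to the same key...
print(tip_key("Close Me") == tip_key("close me"))
# ...but a one-letter edit yields an unrelated key, so the tip reappears.
print(tip_key("close tip") == tip_key("close tap"))
```

Two keys produced this way can only be tested for equality. They carry no information about how much the text changed, which is why a threshold can't be set with this scheme.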
It would be:
- fuzzy hash, CTPH (context triggered piecewise hashing) or similar
- d::Library relatively small piece of code, current implementation is ~230 characters
- d::Library reasonably performant for up to a paragraph of input text. Although it isn't meant for real-time operations, it is part of PD init to close previously closed tips.
Should:
- make tooltip/reminder reappear only if content was significantly changed
- generate hashes of a reasonable size (120 chars? flexible) that work as storage keys
- allow custom diff threshold
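A minimal sketch of how the requirements above could fit together, in Python for illustration only. It is a simplified piecewise hash in the spirit of CTPH, not real CTPH: it uses word boundaries as the trigger instead of a rolling hash and emits one signature character per word. All names (`fuzzy_hash`, `difference_percent`, `should_reset`), the 64-symbol alphabet, and the 30% default threshold are assumptions made for the example.

```python
import re
import zlib

# 64 symbols that are safe to use inside a storage key (assumed alphabet).
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def fuzzy_hash(text: str) -> str:
    """One signature character per word; case and punctuation are ignored."""
    words = re.findall(r"\w+", text.lower())
    return "".join(ALPHABET[zlib.crc32(w.encode("utf-8")) % 64] for w in words)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two signatures (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def difference_percent(sig_a: str, sig_b: str) -> float:
    """Difference between two signatures, 0.0 (same) to 100.0 (unrelated)."""
    longest = max(len(sig_a), len(sig_b))
    if longest == 0:
        return 0.0
    return 100.0 * edit_distance(sig_a, sig_b) / longest


def should_reset(old_sig: str, new_sig: str, threshold: float = 30.0) -> bool:
    """True if the tip changed enough to be shown again (threshold is tunable)."""
    return difference_percent(old_sig, new_sig) >= threshold


old = fuzzy_hash("Click the X to close this reminder.")
new = fuzzy_hash("Click the X to dismiss this reminder!")
print(old, new, difference_percent(old, new), should_reset(old, new))
```

Only the signatures are compared, so the original text never needs to be stored. With one character per word, a tip of up to 120 words stays within the 120-character budget; longer text would need several words per piece. Two different words can map to the same character, so a small share of edits may go unnoticed, which seems acceptable for deciding whether to re-show a tip.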
Edited by Lorin Halpert