WIP: 19 remove similar

This MR addresses #19

Added dependencies

  1. This MR brings in imagehash as a dependency.

TODO:

  • Validate the approach against its intended use case ("similar" is very nuanced when working with ML image similarity measures).
  • Add an on-disk hash cache. Depending on the use case, keeping all state in memory might be a non-starter, so storing cached hashes (or image vectors/embeddings) on disk might be a better option.
  • Test performance (memory and speed) on large datasets to validate usability.
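
A rough sketch of what the on-disk hash cache from the TODO list could look like, for discussion only. It is not the code in this MR. `HashCache`, `hamming`, `find_similar`, the JSON cache file, and the `max_distance` threshold are all hypothetical names and choices. The hash function is passed in as a parameter and is assumed to return a hex string, so the sketch itself uses only the standard library.

```python
import json
import os
from pathlib import Path
from typing import Callable, Dict, Iterable, List


class HashCache:
    """Hypothetical on-disk cache mapping file path -> hex hash string."""

    def __init__(self, cache_file: str, hash_fn: Callable[[str], str]):
        self.cache_file = Path(cache_file)
        self.hash_fn = hash_fn  # assumed to return a hex string for a path
        self.entries: Dict[str, dict] = {}
        if self.cache_file.exists():
            self.entries = json.loads(self.cache_file.read_text())

    def get(self, path: str) -> str:
        """Return the cached hash, recomputing it if the file's mtime changed."""
        mtime = os.path.getmtime(path)
        entry = self.entries.get(path)
        if entry is None or entry["mtime"] != mtime:
            entry = {"mtime": mtime, "hash": self.hash_fn(path)}
            self.entries[path] = entry
        return entry["hash"]

    def save(self) -> None:
        """Persist the cache so later runs can skip re-hashing."""
        self.cache_file.write_text(json.dumps(self.entries))


def hamming(hex_a: str, hex_b: str) -> int:
    """Number of differing bits between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def find_similar(paths: Iterable[str], cache: HashCache,
                 max_distance: int = 4) -> List[str]:
    """Return paths within max_distance bits of an earlier, kept path.

    Naive O(n^2) comparison; the threshold is an assumption and would
    need tuning against the intended use case.
    """
    kept: List[str] = []
    similar: List[str] = []
    for path in paths:
        h = cache.get(path)
        if any(hamming(h, cache.get(k)) <= max_distance for k in kept):
            similar.append(path)
        else:
            kept.append(path)
    return similar
```

With imagehash, the hash function could plausibly be something like `lambda p: str(imagehash.phash(Image.open(p)))`, since converting an `ImageHash` to `str` yields its hex form. A single JSON file is the simplest thing that could work here; for very large datasets something like SQLite might scale better, which is part of what the performance TODO should answer.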
