Improve blob hashing performance in HgExtractor

Edward Cree requested to merge ecree/reposurgeon:hashtemp into master

Trying to hash a file by catting it over the hgclient protocol is painfully slow; we can improve this by instead catting it to a tempfile (with hg cat -o).
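A minimal sketch of the tempfile approach, in Python. The helper names (`hash_file`, `hash_hg_file`), the `hgblob-` prefix, and the choice of SHA-1 are assumptions made for illustration, not necessarily what HgExtractor does; the only hg feature relied on is `hg cat -r REV -o OUTFILE FILE`.

```python
import hashlib
import os
import subprocess
import tempfile


def hash_file(path, chunk_size=1 << 16):
    """Return the SHA-1 hex digest of the file at path, read in chunks.

    SHA-1 is an assumption for this sketch; any content hash would do.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_hg_file(repo_dir, rev, filename):
    """Write filename as of rev to a tempfile with 'hg cat -o', then hash it.

    Hypothetical helper. Note that -o takes a format string, so a temp
    path containing '%' would need escaping; that case is not handled here.
    """
    fd, tmp_path = tempfile.mkstemp(prefix="hgblob-")
    os.close(fd)  # hg writes the file itself; we only need the name
    try:
        subprocess.check_call(
            ["hg", "cat", "-r", rev, "-o", tmp_path, filename],
            cwd=repo_dir,
        )
        return hash_file(tmp_path)
    finally:
        os.remove(tmp_path)
```

The file contents go straight from hg to disk and are then read back in fixed-size chunks, so nothing has to pass through the command-server channel.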

We can also get some use out of the hg blob hash (from hg manifest) by maintaining a map from the blob hashes we've seen to the resulting content hashes. When the metadata changes we'll have to rehash the file, but otherwise this saves us from redoing the hash calculation for every revision.
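The map can be sketched as a small memoizing wrapper, in Python. `BlobHashCache`, `compute_hash`, and the `misses` counter are hypothetical names for illustration; `compute_hash` stands in for whatever actually hashes the file contents for a given revision and filename.

```python
class BlobHashCache:
    """Map hg blob hashes (as listed by hg manifest) to content hashes.

    Hypothetical sketch: compute_hash(rev, filename) is assumed to return
    the content hash of filename as of rev, and is only called on a miss.
    """

    def __init__(self, compute_hash):
        self._compute_hash = compute_hash
        self._by_blob = {}  # hg blob hash -> content hash
        self.misses = 0     # how many times we actually had to hash

    def content_hash(self, blob_hash, rev, filename):
        """Return the content hash for blob_hash, hashing only if unseen."""
        try:
            return self._by_blob[blob_hash]
        except KeyError:
            self.misses += 1
            result = self._compute_hash(rev, filename)
            self._by_blob[blob_hash] = result
            return result
```

A blob hash that has not been seen before, including one produced by a metadata-only change, simply misses in the map and triggers a rehash, so the cache trades a few redundant hashes for never returning a stale one.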
