Improve handling of large diffs
This MR adjusts the way checking for large diffs takes place. Prior to this MR the procedure was basically as follows:
- Iterate over every diff in a collection
- Load the entire diff into memory, regardless of its size
- Check if the resulting content, including any diff markers and metadata, exceeds a threshold
- Prune or collapse the diff
This MR changes things around so the procedure is instead as follows:
- Iterate over every diff in a collection
- Check if the data modified (excluding diff markers) is larger than a threshold
- If this is not the case, proceed as usual. If it is, prune or collapse the diff
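
To make the new check concrete, here is a minimal sketch in Python. It is not the code from this MR: the names `exceeds_threshold`, `classify_diffs`, and `MAX_CHANGED_BYTES`, the threshold value, and the assumption that each diff is a sequence of unified-diff lines for a single file are all hypothetical. It only illustrates counting the modified data (added and removed lines, without their `+`/`-` markers or any headers) and stopping as soon as the threshold is passed.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical limit; the real threshold is not stated in this description.
MAX_CHANGED_BYTES = 100 * 1024


def exceeds_threshold(diff_lines: Iterable[str], limit: int = MAX_CHANGED_BYTES) -> bool:
    """Return True once the changed data (markers excluded) is larger than `limit`.

    Assumes `diff_lines` is a unified diff for a single file. Lines are
    consumed one at a time, so the diff never has to be held in memory.
    """
    total = 0
    in_hunk = False
    for line in diff_lines:
        if line.startswith("@@"):
            # Hunk header: metadata, not modified data.
            in_hunk = True
            continue
        # Everything before the first hunk (file headers) is ignored.
        if in_hunk and line.startswith(("+", "-")):
            # Drop the leading marker and count only the content bytes.
            total += len(line[1:].encode("utf-8"))
            if total > limit:
                return True  # Stop early; the rest of the diff is not read.
    return False


def classify_diffs(diffs: Mapping[str, Iterable[str]], limit: int = MAX_CHANGED_BYTES) -> Dict[str, str]:
    """Decide per diff whether to proceed as usual or prune/collapse it."""
    return {
        path: "collapse" if exceeds_threshold(lines, limit) else "render"
        for path, lines in diffs.items()
    }
```

Because `exceeds_threshold` accepts any iterable, a caller could pass a generator that streams lines from storage, which is what avoids loading a large diff just to find out it is too large.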