When indexing large blobs that are stored whole (non-delta form),
avoid allocating the entire blob in memory and instead stream it
through the SHA-1 checksum computation. This reduces the size
of memory required by IndexPack when processing very big blobs,
such as a 500 MiB uncompressable binary.
If the large blob already exists in the local repository, its
contents needs to be compared byte-for-byte after the entire pack
has been indexed, to ensure there isn't an unexpected SHA-1 collision
which may result in later data corruption. This compare is performed
as a streaming compare, again avoiding the large object allocation.
This change doesn't improve on memory utilization for large objects
stored as deltas. The change also doesn't improve handling for
any large commits, trees or annotated tags. There isn't much to
be done here for those objects, because they need to be passed down
to the ObjectChecker as a byte[]. Fortunately it isn't common for
these object types to be that large,
Bug: 312868
Change-Id: I862afd4cb78013ee033d4ec68c067b1774a05be8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Roberto Tyley <roberto.tyley@guardian.co.uk>