Add path hash code to ObjectWalk

PackWriter wants to categorize objects that are similar in path name,
so blobs that are probably from the same file (or same sort of file)
can be delta compressed against each other.  Avoid converting into
a string by performing the hashing directly against the path buffer
in the tree iterator.

We only hash the last 16 bytes of the path, and we try to avoid any
spaces, as we want the suffix of a file such as ".java" to be more
important than the directory it is in, like "src".
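
As a rough illustration of the behaviour described above, the following standalone sketch copies the hashing loop from this change into a small class. The class name `PathHashSketch`, the `pathHash` helpers, and the sample paths are hypothetical and exist only for this example; the real method reads the `path` and `pathLen` fields of the tree iterator rather than taking arguments.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone class; not part of JGit.
final class PathHashSketch {
    // Same loop as getEntryPathHashCode() in this change, but the buffer
    // and its length are passed in instead of being iterator fields.
    static int pathHash(byte[] path, int pathLen) {
        int hash = 0;
        // Only the last 16 bytes of the path are considered.
        for (int i = Math.max(0, pathLen - 16); i < pathLen; i++) {
            byte c = path[i];
            // Space bytes are skipped and do not affect the hash.
            if (c != ' ')
                hash = (hash >>> 2) + (c << 24);
        }
        return hash;
    }

    // Convenience wrapper for the example only; the real code avoids
    // String conversion by hashing the raw path buffer directly.
    static int pathHash(String p) {
        byte[] raw = p.getBytes(StandardCharsets.UTF_8);
        return pathHash(raw, raw.length);
    }

    public static void main(String[] args) {
        // Both paths end in the same 16 bytes ("/ObjectWalk.java"),
        // so the leading directories do not influence the result.
        int a = pathHash("src/org/example/revwalk/ObjectWalk.java");
        int b = pathHash("test/org/example/revwalk/ObjectWalk.java");
        System.out.println(a == b); // prints true

        // Spaces are ignored, so these two names hash identically.
        System.out.println(pathHash("My File.java") == pathHash("MyFile.java")); // prints true
    }
}
```

Each new byte lands in the top eight bits and earlier bytes are shifted right by two bits per step, so bytes near the end of the path (typically the file extension) dominate the value while bytes further back fade out.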

Change-Id: I31770ee711526306769a6f534afb19f937e0ba85
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
stable-0.9
Shawn O. Pearce 15 years ago
parent
commit
c20daa7314
  1. org.eclipse.jgit/src/org/eclipse/jgit/revwalk/ObjectWalk.java (12 lines changed)
  2. org.eclipse.jgit/src/org/eclipse/jgit/treewalk/AbstractTreeIterator.java (18 lines changed)

12
org.eclipse.jgit/src/org/eclipse/jgit/revwalk/ObjectWalk.java

@@ -384,6 +384,18 @@ public class ObjectWalk extends RevWalk {
return last != null ? treeWalk.getEntryPathString() : null;
}
/**
* Get the current object's path hash code.
* <p>
* This method computes a hash code on the fly for this path, the hash is
* suitable to cluster objects that may have similar paths together.
*
* @return path hash code; any integer may be returned.
*/
public int getPathHashCode() {
return last != null ? treeWalk.getEntryPathHashCode() : 0;
}
@Override
public void dispose() {
super.dispose();

18
org.eclipse.jgit/src/org/eclipse/jgit/treewalk/AbstractTreeIterator.java

@@ -402,6 +402,24 @@ public abstract class AbstractTreeIterator {
return TreeWalk.pathOf(this);
}
/**
* Get the current entry's path hash code.
* <p>
* This method computes a hash code on the fly for this path, the hash is
* suitable to cluster objects that may have similar paths together.
*
* @return path hash code; any integer may be returned.
*/
public int getEntryPathHashCode() {
int hash = 0;
for (int i = Math.max(0, pathLen - 16); i < pathLen; i++) {
byte c = path[i];
if (c != ' ')
hash = (hash >>> 2) + (c << 24);
}
return hash;
}
/**
* Get the byte array buffer object IDs must be copied out of.
* <p>
