Provide a factory for comparators that use the default heuristics except
with a different ordering of PackSources.
Change-Id: I0809b64deb3d0486040076946fdbdad650d69240
There are several ways of comparing DfsPackDescriptions for different
purposes, such as object lookup search order and reftable ordering. Some
of these are later compounded into comparators on other objects, so they
appear in the code as Comparator<DfsReftable>, for example.
Put all the DfsPackDescription comparators in static methods on
DfsPackDescription itself. Stop implementing Comparable, to avoid giving
the impression that there is always one true and correct way of sorting
packs.
Change-Id: Ia5ca65249c13373f7ef5b8a5d1ad50a26577706c
Rather than requiring callers to do their own computations based on the
package-private "category" number, provide an actual
Comparator<PackSource> instance, and explicitly discourage usage of
default Enum comparison.
Construct the default comparator using a builder pattern based on
defining equivalence classes. This gives us the same behavior as the old
category field in PackSource, with an abstraction that does not leak the
implementation detail of comparing rank numbers.
Change-Id: I6757211397ab1bc181d61298e073f88b69dbefc3
In normal operation, the source of a pack should never be null; the DFS
implementation should always know where a pack came from. Existing
implementations in InMemoryRepository and at Google always have the
source available at construction time.
The problem with null PackSources in the previous implementation was it
made the DfsPackDescription#compareTo method intransitive. Specifically,
it skips comparing the sources at all if *either* operand is null.
Suppose we have three descriptions A, B, and C, where all fields are
equal except the PackSource, and:
* A's source is INSERT
* B's source is null
* C's source is RECEIVE
In this case, A.compareTo(B) == 0, and B.compareTo(C) == 0, since all
fields are equal except the source, which is skipped. But
A.compareTo(C) != 0, since A and B have different sources.
Avoid this problem in compareTo by enforcing that the source is never
null. We could of course assign an arbitrary category number to a null
source in order to make comparison transitive[1], but it's simpler to
implement and reason about if the field is non-nullable, and there is no
real-world use case to make it null.
Although a non-null source is required at construction time, the field
is currently still mutable: DfsPackDecscription#setPackSource is used by
DfsInserterTest to mark packs as garbage. This could probably be
avoided as well, allowing us to convert packSource to a final field, but
doing so is beyond the scope of this change.
[1] The astute reader will notice this is already done by
DfsObjDatabase#reftableComparator(). In fact, the reason that
different comparator implementations non-obviously have different
semantics for this nullable field is another reason why it's clearer
to avoid null entirely.
Change-Id: I85a2aaf3fd6d4868f241f7972a0349f087830ffa
The canonical implementation also doesn't. Compare current
code in remote.c, function get_stale_heads_cb.[1] Not handling
symrefs in this case was introduced in canonical git in [2]
in 2008.
[1] https://github.com/git/git/blob/v2.17.0/remote.c#L2259
[2] https://github.com/git/git/commit/740fdd27f0
Bug: 533549
Change-Id: If348d56bb4a96b8aa7141f7e7b5a0d3dd4e7808b
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Callers of getAllRefs that only iterate over the `values()` of the
returned map can be trivially fixed to call getRefDatabase().getRefs()
instead.
Only fix those where the calling method is already declared to throw
IOException, to avoid potential API changes.
Change-Id: I2b05f785077a1713953cfd42df7bf915f889f90b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Callers should instead use getRefDatabase().getRefs(), which does not
swallow the IOException.
Replace @link with @code in the Javadoc of FileRepository, since linking
to the deprecated method causes an error:
Javadoc: The method getAllRefs() from the type Repository is deprecated
Existing callers of the deprecated method are not adapted in this commit
because many of them require more refactoring. They will be done in
separate follow-up commits.
Bug: 534731
Change-Id: Id84e70e4cd7be3d1ca1795512950c6abe3d18ffd
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Callers should use getRefDatabase().peel(ref) instead since it
doesn't swallow the IOException.
Adapt all trivial callers to user the alternative.
DescribeCommand still uses the deprecated method and is not adapted in
this change since it will require more refactoring to add handling of
the IOException.
Change-Id: I14d4a95a5e0570548753b9fc5c03d024dc3ff832
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This means less cognitive overhead for both implementors and callers,
since this way we can guarantee that they are always synonyms for
RefDatabase#exactRef and RefDatabase#findRef, respectively.
Change-Id: Ic8aeb52fc7ed65672f3f6cd1da39a66908d88baa
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Make FileTreeIterator not enter ignored directories by default. We
only need to enter ignored directories if we do some operation against
git, and there is at least one tracked file underneath an ignored
directory.
Walking ignored directories should be avoided as much as possible as
it is a potential performance bottleneck. Some projects have a lot of
files or very deep hierarchies in ignored directories; walking those
may be costly (especially so on Windows). See for instance also bug
500106.
Provide a FileTreeIterator.setWalkIgnoredDirectories() operation to
force the iterator to iterate also through otherwise ignored
directories. Useful for tests (IgnoreNodeTest, CGitIgnoreTest), or
to implement things like "git ls-files --ignored".
Add tests in DirCacheCheckoutTest, and amend IndexDiffTest to test a
little bit more.
Bug: 388582
Change-Id: I6ff584a42c55a07120a4369fd308409431bdb94a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Callers should use getRefDatabase().getRefsByPrefix(R_TAGS) instead.
Adjust the tests accordingly.
Bug: 534731
Change-Id: Ib28ae365e42720268996ff46e34cae1745ad545c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Declare Repository#notifyIndexChanged() final and
Repository#notifyIndexChanged(boolean) abstract to force implementors
to switch to overriding the latter method. This makes Repository less
error-prone to extend since implementors no longer need to remember to
override one of those two methods.
Change-Id: I721db0f4a4865db3b35212ee0a2045d5b31c96af
Signed-off-by: Jonathan Nieder <jrn@google.com>
e9e150fdd2 (Store in IndexChangedEvent if it was caused by JGit
itself, 2018-05-13) modified Repository#notifyIndexChanged to take a
boolean argument to indicate whether the index change happened under
the current process's control or externally, for use by EGit. In
other words, the function signature changed from
public abstract void notifyIndexChanged();
to
public abstract void notifyIndexChanged(boolean internal);
Callers outside JGit itself notifying a Repository about index changes
are expected to be rare, so this is not very disruptive to them. In
most cases they would be notifying about changes that they made
themselves, so treating their notifyIndexChanged() calls as
notifyIndexChanged(true) should be relatively safe.
Implementors have the opposite problem: adding the new "abstract void
notifyIndexChanged(boolean)" method means they are obligated to
override it. Add a default implementation that calls their existing
override of notifyIndexChanged() to make their migration easier.
The main downside is that authors of new Repository subclasses that
do not realize they need to override notifyIndexChanged would end up
with a default implementation which calls notifyIndexChanged(true),
in turn calling notifyIndexChanged() again and so on, resulting in
StackOverflowException. Add an implementors' note to the class
Javadoc to avoid this issue. A followup commit will force
implementors to adapt to the new API by changing the methods to
@Deprecated
public final void notifyIndexChanged() {
notifyIndexChanged(true);
}
public abstract void notifyIndexChanged(boolean internal);
Change-Id: I7d014890ee19abf283ea824d9baa9044bfdde130
Signed-off-by: Jonathan Nieder <jrn@google.com>
Remove it from
* package private functions.
* try blocks
* for loops
this was done with the following python script:
$ cat f.py
import sys
import re
import os
def replaceFinal(m):
return m.group(1) + "(" + m.group(2).replace('final ', '') + ")"
methodDecl = re.compile(r"^([\t ]*[a-zA-Z_ ]+)\(([^)]*)\)")
def subst(fn):
input = open(fn)
os.rename(fn, fn + "~")
dest = open(fn, 'w')
for l in input:
l = methodDecl.sub(replaceFinal, l)
dest.write(l)
dest.close()
for root, dirs, files in os.walk(".", topdown=False):
for f in files:
if not f.endswith('.java'):
continue
full = os.path.join(root, f)
print full
subst(full)
Change-Id: If533a75a417594fc893e7c669d2c1f0f6caeb7ca
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Refs are not advertised as part of the protocol v2 capability
advertisement. Don't call AdvertiseRefsHook.
Noticed because many implementations of AdvertiseRefsHook read all
refs in order to call UploadPack#setAdvertisedRefs, causing the
capability advertisement to be as slow as a v0 ref advertisement with
some RefDatabase implementations.
Such an AdvertiseRefsHook is of dubious utility (a better place to
determine which refs are advertised is in the RefDatabase
implementation itself, as in Gerrit), but at any rate since it's not
bringing about any benefit here, we can skip the hook call.
TODO:
- call an appropriate hook instead (https://bugs.eclipse.org/534847)
- add tests
[jn: fleshed out commit message; added TODO notes]
Change-Id: I6eb60ccfb251a45432954467a9ae9c1079a8c8b5
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
With this patch, a server spawned by "jgit daemon" can be accessed using
protocol v2 from a Git client that supports it (for example, "git" with
the appropriate patches). This is only activated if the repository's
config has "protocol.version" be 2.
This required a change to the package-private #execute methods in
DaemonService to allow passing of extra parameters.
This has been tested with a patched Git.
Change-Id: Icf043efec7ce956d72b075fc6dc7a87d5a2da82a
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Add support for the "ofs-delta" parameter in the "fetch" command in
the fetch-pack/upload-pack protocol v2.
Change-Id: I728cf986082fce4ddeb6a6435897692e15e60cc7
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Add support for the "include-tag" parameter in the "fetch" command in
the fetch-pack/upload-pack protocol v2.
In order to determine which tags to include, only objects pointed to by
refs starting with "refs/tags/" are checked. This restriction is for
performance reasons and to match the behavior of Git (see add_ref_tag()
in builtin/pack-objects.c).
Change-Id: I7d70aa09bcc8a525218ff1559e286c2a610258ca
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
When OPTION_INCLUDE_TAG is set, UploadPack#sendPack uses the #refs
instance variable as a source of information of tags. A subsequent patch
will need to supply this information to #sendPack without
modifying #refs, so refactor #sendPack to take in this information
through a parameter instead.
Note that prior to this patch, #refs was used twice in #sendPack: once
to generate the argument to PackWriter#setTagTargets, and once to
determine if any tags need to be included in the packfile. This patch
only updates the latter use, since the former is meant not only for
"true" tag targets but any object that should be hoisted earlier during
packing (see the documentation of PackWriter#setTagTargets).
This patch does not introduce any functionality change.
Change-Id: I70ed65a1041334abeda8d4bac98cce7cae7efcdf
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Add an internal lineMapOrNull helper that returns null when the file
is binary.
This is simpler than using an exception for control flow and avoids
having to override fillInStackTrace to avoid a performance regression.
Change-Id: Ib8bb8df6a6bbd60c62cfb3b4c484a962a98b7507
This allows to differentiate if index was changed by an external git
command or by JGit itself.
Change-Id: Iae692ba7d9bf01a288b3fb2dc2d07aec9891c712
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This makes binary detection exact in ResolveMerger and DiffFormatter
This has the same intention as
Id4342a199628d9406bfa04af1b023c27a47d4014 but preserves backward
compatibility of the signature of RawParseUtils.lineMap.
Change-Id: Ia24a4e716592bab3363ae24e3a46315a7511154f
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
https://tools.ietf.org/html/rfc7616 says:
5.12. Parameter Randomness
The security of this protocol is critically dependent on the
randomness of the randomly chosen parameters, such as client and
server nonces. These should be generated by a strong random or
properly seeded pseudorandom source (see [RFC4086]).
Change-Id: I4da5316cb1eb3f59ae06c070ce1c3335e9ee87d6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
On a local non-NFS filesystem the .git/config file will be orphaned if
it is replaced by a new process while the current process is reading the
old file. The current process successfully continues to read the
orphaned file until it closes the file handle.
Since NFS servers do not keep track of open files, instead of orphaning
the old .git/config file, such a replacement on an NFS filesystem will
instead cause the old file to be garbage collected (deleted). A stale
file handle exception will be raised on NFS clients if the file is
garbage collected (deleted) on the server while it is being read. Since
we no longer have access to the old file in these cases, the previous
code would just fail. However, in these cases, reopening the file and
rereading it will succeed (since it will open the new replacement file).
Since retrying the read is a viable strategy to deal with stale file
handles on the .git/config file, implement such a strategy.
Since it is possible that the .git/config file could be replaced again
while rereading it, loop on stale file handle exceptions, up to 5 extra
times, trying to read the .git/config file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.
Change-Id: I6901157b9dfdbd3013360ebe3eb40af147a8c626
Signed-off-by: Nasser Grainawi <nasser@codeaurora.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
That site serves from https now.
Reported-by: Nicholas Glorioso <glorioso@google.com>
Change-Id: I2150a18425a1fe3ab5a022882ffe06ccbde17f16
Signed-off-by: Jonathan Nieder <jrn@google.com>
This is easier to type and makes it clearer that it only returns refs
and not the pseudo-refs returned by getAdditionalRefs. It also puts us
in a better position to add a method to the Repository class later
that delegates to this one without colliding with the existing
Repository#getAllRefs method that returns a Map<String, Ref>.
While at it, clarify the javadoc of getRefs and hasRefs to make the
same point.
Suggested-by: David Pursehouse <david.pursehouse@gmail.com>
Change-Id: I23497c66ac7b5e0c987b91efbc9e9cc29924ca66
Signed-off-by: Jonathan Nieder <jrn@google.com>
Callers can now say:
db.getRefDatabase().hasRefs()
rather than the more verbose:
!db.getRefDatabase().getAllRefs().isEmpty()
The default implementation simply uses getAllRefs().isEmpty(), but a
derived class could possibly override the method with a more efficient
implementation.
Change-Id: I5244520708a1a7d9adb351f10e43fc39d98e22a1
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
The previous version suggested testing w2 first because w1 was used
for hashing, but in fact, hashCode returns w2. The order (w3, w4, w5,
w1, w2) might be better on 64-bit processors too, since it allows
comparing 64 bits at a time, although perhaps on a modern SIMD
processor, the entire 160 bytes would be compared at once anyway.
Change-Id: Ieb69606d3c1456aeff36bffe99a71587ea76e977
Signed-off-by: David Turner <dturner@twosigma.com>
Currently to get all refs, callers must use:
getRefsByPrefix(ALL)
Introduce getAllRefs, which does this, and migrate all existing
callers of getRefsByPrefix(ALL).
Change-Id: I7b1687c162c8ae836dc7db3ccc7ac847863f691d
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
The Javadoc refers to the deprecated getRefs method. Update it to refer
to getRefsByPrefix which is the recommended replacement of getRefs.
Change-Id: I61f2abcf1a3794f40a1746317dbc18aa0beb87a7
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Eclipse warns that DfsReader should be managed by try-with-resource.
As described in 1484d6e (LargePackedWholeObject: Do not reuse released
inflater, 2018-04-26), the DfsReader is owned and closed by the
PackInputStream or explicitly closed in the try block's finally.
Suppress the warning with a brief explanatory comment.
Change-Id: I4187c935742072f3ee7f2d3551a6a98d40fc2702
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
LargePackedWholeObject.openStream produces a stream that allows
reading a large object. This stream holds a DfsReader that takes care
of caching delta bases etc and in particular holds zlib Inflater for
use while reading the each delta in the packfile.
At DfsReader creation time, the Inflater is acquired from a global
InflaterCache to avoid initialization overhead in case there is an
existing Inflater available for reuse. When done with the Inflater,
the DfsReader is responsible for returning it to the cache for reuse.
The DfsReader is AutoClosable to remind the caller to close it and
release the Inflater when finished with it.
b0ac5f9c89 (LargePackedWholeObject:
Refactor to open DfsReader in try-with-resource, 2018-04-11) tried to
clarify the lifetime of the DfsReader but was too aggressive: when
this function returns, PackInputStream owns the DfsReader and is
already going to release it. Worse, the returned InflaterInputStream
holds a reference to the DfsReader's inflater, making releasing the
DfsReader not only unnecessary but unsafe.
The Inflater gets released into the InflaterCache's pool, to be
acquired by another caller that uses it concurrently with the
InflaterInputStream. This results in errors, such as
java.util.zip.ZipException: incorrect header check
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.InflaterInputStream.skip(InflaterInputStream.java:208)
at java.io.BufferedInputStream.skip(BufferedInputStream.java:377)
and
java.util.zip.DataFormatException: incorrect header check
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:259)
at org.eclipse.jgit.internal.storage.dfs.DfsReader.inflate(DfsReader.java:783)
at org.eclipse.jgit.internal.storage.dfs.DfsPackFile.decompress(DfsPackFile.java:420)
at org.eclipse.jgit.internal.storage.dfs.DfsPackFile.load(DfsPackFile.java:767)
and
Caused by: java.util.zip.ZipException: incorrect header check
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.eclipse.jgit.lib.ObjectStream$Filter.read(ObjectStream.java:219)
at org.eclipse.jgit.util.IO.readFully(IO.java:233)
at org.eclipse.jgit.transport.PackParser.checkObjectCollision(PackParser.java:1173)
Verified in production. It should be possible to make a
straightforward unit test for this using the InflaterCache state but
that can wait for a followup commit.
Change-Id: Iaf1d6fd368b64f76c520d215fd270a6098a1f236
Eclipse reports these as errors, so remove them.
Change-Id: Ic53d8003f9faef38fe776af5a73794e7bb1dfc49
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>