Run with @Parameterized, so we don't have to duplicate test setup for
each atomic/non-atomic test. We still have to have two different sets of
asserts for the cases where the behavior is different. In fact, this is
a readability win: it emphasizes that performing the exact same setup
except for the atomic setting will have different behavior.
Change-Id: I78a8214075e204732a423341f14c09de273a7854
The existing packed-refs file provides a mechanism for implementing
atomic multi-ref updates without any changes to the on-disk format or
lockfile protocol. We just need to make sure that there are no loose
refs involved in the transaction, which we can achieve by packing the
refs while holding locks on all loose refs. Full details of the
algorithm are in the PackedBatchRefUpdate javadoc.
This change does not implement reflog support, which will come in a
later change.
Change-Id: I09829544a0d4e8dbb141d28c748c3b96ef66fee1
ReceiveCommand.Result has a slightly richer set of possibilities, so it
makes sense for RefUpdate.Result to have more values in order to match.
In particular, this allows us to return REJECTED_MISSING_OBJECT from
RefUpdate when an object is missing.
The comment in RefUpdate#safeParse about expecting some old objects to be
missing is only applicable to the old ID, not the new ID. A missing new
ID is a bug or programmer error, and we should not update a ref to point
to one.
Fix various tests that started failing because they depended for no good
reason on setting refs to point to nonexistent objects; it's always easy
to create a real object when necessary.
It is possible that some downstream users of RefUpdate.Result might
choose to handle one of the new statuses differently, for example by
providing a more user-readable error message; that is not done in this
change.
Change-Id: I734b1c32d5404752447d9e20329471436ffe05fc
Fix patch matching for patterns of form a/b/** : this should not match
paths like a/b but still match a/b/ and a/b/c.
Change-Id: Iacbf496a43f01312e7d9052f29c3f9c33807c85d
Signed-off-by: Dmitry Pavlenko <pavlenko@tmatesoft.com>
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
When a file uses a different block size (e.g. 500) than the cache
(e.g. 512), and the DfsPackFile's blockSize field has not been
initialized, the cache misaligns block loads. The cache uses its
default of 512 to compute the block alignment instead of the file's
500.
This causes DfsReader try to set an empty range into an Inflater,
resulting in an object being unable to load.
Change-Id: I7d6352708225f62ef2f216d1ddcbaa64be113df6
Simple test to verify two DfsRepository instances will reuse the same
DfsBlocks in the DfsBlockCache, even though the DfsStreamKey instance
is now different between their DfsPackFile instances.
Change-Id: I409c109142dea488d189b9ac0d3c319755dce7b4
By making this a deterministic function, DfsBlockCache can stop
retaining a map of every DfsPackDescription it has ever seen. This
fixes a long standing memory leak in DfsBlockCache.
This refactoring also simplifies the idea of setting up more
lightweight objects around streams.
Change-Id: I051e7b96f5454c6b0a0e652d8f4a69c0bed7f6f4
The reader may find it surprising that this succeeds without incident
unless there is peeling or a fast-forward check involved. This behavior
may be changed in the future, but for now, just document the current
behavior.
Change-Id: I348b37e93e0264dc0905c4d58ce881852d1dfe5e
The RefDirectory implementation of doDelete never considered whether to
delete a symref or its leaf, because the detachingSymbolicRef bit was
never exposed from RefUpdate. The behavior was thus incorrectly to
always delete the symref, never the leaf.
There was no test for this behavior. The only thing that attempted to be
a test was testDeleteHeadInBareRepo, but this test was broken for
reasons unrelated to this bug. Specifically, it set the leaf to point to
a completely nonexistent object, and then asserted that deleting HEAD
resulted in NO_CHANGE. The only reason this test ever passed is because
of a quirk of updateImpl, which treats a missing object as the same as
null. This quirk aside, the test wasn't really testing the right thing.
Turn this into a real test by writing out a real object and pointing the
leaf at that.
Also, add a test for the detachingSymbolicRef case, i.e. deleting the
symref and leaving the leaf alone.
Change-Id: Ib96d2a35b4f99eba0734725486085fc6f9d78aa5
The contents of the packedRefList AtomicReference should never differ
from what we expect prior to writing, because this segment of the code
is protected by the packed-refs lock file on disk. If it does happen,
whether due to programmer error or a rogue process not respecting the
locking protocol, it's better to let the caller know than to silently
drop the whole commit operation on the floor.
The existing concurrentOnlyOneWritesPackedRefs test is inherently
nondeterministic as written, and was already about 6% flaky as measured
by bazel:
$ bazel test --runs_per_test=200 //org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_GcPackRefsTest
...
INFO: Elapsed time: 42.608s, Critical Path: 10.35s
//org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_GcPackRefsTest FAILED in 12 out of 200 in 1.6s
Stats over 200 runs: max = 1.6s, min = 1.1s, avg = 1.3s, dev = 0.1s
This flakiness was caused by the assumption that exactly one of the 2
threads would fail, when both might actually succeed in practice due to
racing on the compare-and-swap.
For whatever reason, this change affected the interleaving behavior in
such a way that the flakiness jumped to around 50%. Making the
interleaving of the test fully deterministic is beyond the scope of this
change, but a simple tweak to the assertion is enough to make it pass
consistently 200+ times both before and after this change.
Change-Id: I5ff4dc39ee05bda88d47909acb70118f3d0c8f74
Some downstream code checks whether a ReceiveCommand is a create or a
delete based on the type field. Other downstream code (in particular a
good chunk of Gerrit code I wrote) checks the same thing by comparing
oldId/newId to zeroId. Unfortunately, there were no strict checks in the
constructor that ensures that zeroId is only set for oldId/newId if the
type argument corresponds, so a caller that passed mismatched IDs and
types would observe completely undefined behavior as a result. This is
and always has been a misuse of the API; throw IllegalArgumentException
so the caller knows that it is a misuse.
Similarly, throw from the constructor if oldId/newId are null. The
non-nullness requirement was already documented. Fix RefDirectoryTest to
not do the wrong thing.
Change-Id: Ie2d0bfed8a2d89e807a41925d548f0f0ce243ecf
This renaming supports reusing DfsStreamKey in a future commit
to index other PackExt type streams inside of the DfsBlockCache.
Change-Id: Ib52d374e47724ccb837f4fbab1fc85c486c5b408
The merger is now able to react to the use of the merge attribute.
The value unset and the custom value 'binary' are handled (-merge
and merge=binary)
Since the specification of the merge attribute states that when the
attribute is unset, ours version must be kept in case of a conflict, we
don't overwrite the file but keep the local version.
Bug: 517128
Change-Id: Ib5fbf17bdaf727bc5d0e106ce88f2620d9f87a6f
Signed-off-by: Mathieu Cartaud <mathieu.cartaud@obeo.fr>
These config options allow overriding the message type (error, warn or
ignore) of a specific message ID such as missingEmail.
The supported fsck message IDs are defined in ObjectChecker.ErrorType.
Since TransferConfig.FsckMode wasn't public parsing fsck configuration
options like e.g. fsck.missingEmail=ignore failed with an
IllegalAccessException. Fix this by declaring this enum public.
Change-Id: I3f41ff7a76a846250a63ce92a9fd111eb347269f
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In the case of multiple tags on the same commit, jgit previously
only ever looked at the last of those tags; git behaviour is to
return the first tag (or first matching one if --match is
specified).
Bug: 518377
Change-Id: I3b6b58ad9f8aa3879ae35b84542b7bddc74a27d6
Signed-off-by: Oliver Lockwood <oliver.lockwood@cantab.net>
A `match()` method has been added to the DescribeCommand, allowing
users to specify one or more `glob(7)` matchers as per Git convention.
Bug: 518377
Change-Id: Ib4cf34ce58128eed0334adf6c4a052dbea62c601
Signed-off-by: Oliver Lockwood <oliver.lockwood@cantab.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: Idcc93c2ca95938995d489cffda649c7d7b26c50e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If set, "singlePack" will create a single GC pack file for all
objects reachable from refs/*. If not set, the GC pack will contain
object reachable from refs/heads/* and refs/tags/*, and the GC_REST
pack will contain all other reachable objects.
Change-Id: I56bcb6a9da2c10a0909c2f940c025db6f3acebcb
Signed-off-by: Terry Parker <tparker@google.com>
When a command invoked from readPipe fails to launch (i.e. the exec call
fails due to a missing command executable), Process.start() throws,
which gets caught by the generic IOException handler, resulting in a
null return. This change detects this case and rethrows a
CommandFailedException instead.
Additionally, this change uses /bin/sh instead of bash for its posix
command failure test, to accomodate building in environments where bash
is unavailable.
Change-Id: Ifae51e457e5718be610c0a0914b18fe35ea7b008
Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Commit db77610 ensured that all refs/tags commits are added to the
primary GC pack. It did that by adding all of the refs/tags commits
to the primary GC pack PackWriter's "interesting" object set.
Unfortunately, all commit objects in the "interesting" set are
selected as commits for which bitmap indices will be built. In a
repository like chromium with lots of tags, this changed the number of
bitmaps created from <700 to >10000. That puts huge memory pressure on
the GC task.
This change restores the original behavior of ignoring tags when
selecting commits for bitmaps.
In the "uninteresting" set, commits for refs/heads and refs/tags for
unannotated tags can not be differentiated. We instead identify
refs/tags commits by passing their ObjectIds as a new "noBitmaps"
parameter to the PackWriter.preparePack() methods.
PackWriterBitmapPreparer.setupTipCommitBitmaps() can then use that
"noBitmaps" parameter to exclude those commits.
Change-Id: Icd287c6b04fc1e48de773033fe432a9b0e904ac5
Signed-off-by: Terry Parker <tparker@google.com>
DirCacheCheckout is generating names for temporary files. It was not checking
the length of this filenames. It may happen that a generated filename is
longer than 255 chars which causes problems on certain platforms. Make sure
that filenames for temporary files do not exceed 255 chars.
Bug: 508823
Change-Id: I9475c04351ce3faebdc6ad40ea4faa3c326815f4
Delete the condition to check whether the garbage pack creation time
is older than the last GC operation, because it's not possible to
find the last GC operation time when there is no GC pack.
Add additional tests to make sure the contents of the expired garbage
packs are considered during the GC operation and any actively
referenced objects from the garbage packs are copied successfully
into the GC pack before deleting the garbage pack.
Change-Id: I09e8b2656de8ba7f9b996724ad1961d908e937b6
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
Android wants them to work, and we're only interested in them for bare
repos, so add them just for that.
Make sure to use symlinks instead of just using the copyfile
implementation. Some scripts look up where they're actually located in
order to find related files, so they need the link back to their
project.
Change-Id: I929b69b2505f03036f69e25a55daf93842871f30
Signed-off-by: Dan Willemsen <dwillemsen@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jeff Gaston <jeffrygaston@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This is necessary for deploying submodules on android.googlesource.com.
* Allow an empty base URL. This is useful if the 'fetch' field is "."
and all names are relative to some host root.
* The URLs in the resulting superproject are relative to the
superproject's URL. Add RepoCommand#setDestinationURI to
set this. If unset, the existing behavior is maintained.
* Add two tests for the Android and Gerrit case, checking the URL
format in .gitmodules; the tests use a custom RemoteReader which is
representative of the use of this class in Gerrit's Supermanifest
plugin.
Change-Id: Ia75530226120d75aa0017c5410fd65d0563e91b
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This is a workaround for
https://bugs.openjdk.java.net/browse/JDK-4666701.
Change-Id: Idd04657e8d95a841d72230f8881b6b899daadbc2
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In the repo manifest documentation [1] the fetch attribute is marked
as "#REQUIRED".
If the fetch attribute is not specified, this would previously result in
NullPointerException. Throw a SAXException instead.
[1] https://gerrit.googlesource.com/git-repo/+/master/docs/manifest-format.txt
Change-Id: Ib8ed8cee6074fe6bf8f9ac6fc7a1664a547d2d49
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
It was already increased in 61a943e, but that was still not enough to
take into account the length of snapshot versions.
Change-Id: Ib54cec97e97042fe274b87a3a1afa9bb06c8bf19
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
All that's really required to run a merge operation is a single
ObjectInserter, from which we can construct a RevWalk, plus a Config
that declares a diff algorithm. Provide some factory methods that don't
take Repository.
Change-Id: Ib884dce2528424b5bcbbbbfc043baec1886b9bbd
A higher limit is required to account for proper JGit version number
being sent in the UserAgent.
The version string "4.7.0.201704031717-r" is 20 characters, however
the strings used during development are shorter:
- When running from mvn, "4.7.0.qualifier" is used; 15 characters
- When running in Eclipse, "unknown" is used; 7 characters
Change-Id: I9aca2f71389a42fedce305e9078db016869c3d1a
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
DiffAlgorithms can return different edit locations for inserts or
deletes, if they can be "shifted" up or down repeating blocks of
lines. This causes the 3-way merge to apply both edits, resulting in
incorrectly removing or duplicating lines.
Augment an existing "tidy-up" stage in DiffAlgorithm to move all
shiftable edits (not just the last INSERT edit) to a consistent
location, and add test cases for previously incorrect merges.
Bug: 514095
Change-Id: I5fe150a2fc04e1cdb012d22609d86df16dfb0b7e
Signed-off-by: KB Sriram <kbsriram@google.com>
Add a new API method to set the recurse mode, and pass the mode into
the fetch command.
Extend the existing FetchCommandRecurseSubmodulesTest to also perform
the same tests for fetch. Rename the test class accordingly.
Change-Id: I12553af47774b4778f7011e1018bd575a7909bd0
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This is a preparation change to Bazel build implementation. Error
Prone rejects the code with variable crypto algorithm as insecure
see: [1].
[1] http://errorprone.info/bugpattern/InsecureCryptoUsage
Change-Id: I92db70a7da454bc364597a995e8be5dccc2d6427
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
With the filename suffix "Tests", the module was not included in tests
when building with Maven, and without the @Test annotations the tests
didn't get executed under Eclipse or buck test.
testRacyGitDetection was failing because the index file did not exist.
Add the missing configuration, the missing annotations, and add a call
to reset() in testRacyGitDetection to force creation of the index file.
Change-Id: I29dd8f89c36fef4ab40bedce7f4a26bd9b2390e4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This fixes error flagged by error prone:
Java compilation in rule '//org.eclipse.jgit.test:jgit' failed: Worker
process sent response with exit code: 1.
org.eclipse.jgit.test/tst/org/eclipse/jgit/revwalk/RevFlagSetTest.java:149:
error: [CollectionIncompatibleType] Argument '"bob"' should not be
passed to this method; its type String is not compatible with its
collection's type argument RevFlag
assertFalse(set.contains("bob"));
Change-Id: I4a971ce92fee55e28b2ab0c7b716ac20fa9c6709
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
The submodule.name.fetchRecurseSubmodules value was being read from the
configuration of the submodule, but it should be read from the config
of the parent repository.
Also, the fetch.recurseSubmodules value from the parent repository's
configuration was not being considered at all.
Fix both of these and add tests. Now the precedence of the recurse mode
is determined as follows:
1. Value passed to the API
2. Value configured in submodule.name.fetchRecurseSubmodules
3. Value configured in fetch.recurseSubmodules
4. Default to "on demand"
Change-Id: Ic23b7c40b5f39135fb3fd754c597dd4bcc94240c
Extend FetchCommand to expose a new method, setRecurseSubmodules(mode),
which allows to set the mode to ON, OFF or ON_DEMAND.
After fetching a repository, its submodules are recursively fetched:
- When the mode is YES, submodules are always fetched.
- When the mode is NO, submodules are not fetched.
- When the mode is ON_DEMAND, submodules are only fetched when the
parent repository receives an update of the submodule and the new
revision is not already in the submodule.
The mode is determined in the following order of precedence:
- Value specified in the API call using setRecurseSubmodules.
- Value specified in the repository's config under the key
submodule.name.fetchRecurseSubmodules
- Defaults to ON_DEMAND if neither of the previous is set.
Extend FetchResult to recursively include results for submodules, as
a map of the submodule path to an instance of FetchResult.
Test setup is based on testCloneRepositoryWithNestedSubmodules.
Change-Id: Ibc841683763307cb76e78e142e0da5b11b1add2a
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This operation was added recently with the goal to provide some
way to auto-correct invalid user input, or to provide a correction
suggestion to the user -- EGit uses it now that way. But the initial
implementation was very restrictive; it removed all non-ASCII
characters and even slashes.
Understandably end users were not happy with that. Git has no such
restriction to ASCII-only; nor does JGit. Branch names should be
meaningful to the end user, and if a user-supplied branch name is
invalid for technical reasons, a "normalized" name should still
be meaningful to the user.
Rewrite to attempt a minimal fix such that the result will pass
isValidRefName.
* Replace all Unicode whitespace by underscore.
* Replace troublesome special characters by dash.
* Collapse sequences of underscores, dots, and dashes.
* Remove underscores, dots, and dashes following slashes, and
collapse sequences of slashes.
* Strip leading and trailing sequences of slashes, dots, dashes,
and underscores.
* Avoid the ".lock" extension.
* Avoid the Windows reserved device names.
* If input name is null return an empty String so callers don't need to
check for null.
This still allows branch names with single slashes as separators
between components, avoids some pitfalls that isValidRefName() tests
for, and leaves other character untouched and thus allows non-ASCII
branch names.
Also move the function from the bottom of the file up to where
isValidRefName is implemented.
Bug: 512508
Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Update SHA1 class to include a Java port of sha1dc[1]'s ubc_check,
which can detect the attack pattern used by the SHAttered[2] authors.
Given the shattered example files that have the same SHA-1, this
modified implementation can identify there is risk of collision given
only one file in the pair:
$ jgit ...
[main] WARN org.eclipse.jgit.util.sha1.SHA1 - SHA-1 collision 38762cf7f55934b34d179ae6a4c80cadccbb7f0a
When JGit detects probability of a collision the SHA1 class now warns
on the logger, reporting the object's SHA-1 hash, and then throws a
Sha1CollisionException to the caller.
From the paper[3] by Marc Stevens, the probability of a false positive
identification of a collision is about 14 * 2^(-160), sufficiently low
enough for any detected collision to likely be a real collision.
git-core[4] may adopt sha1dc before the system migrates to an entirely
new hash function. This commit enables JGit to remain compatible with
that move to sha1dc, and help protect users by warning if similar
attacks as SHAttered are identified.
Performance declined about 8% (detection off), now:
MessageDigest 238.41 MiB/s
MessageDigest 244.52 MiB/s
MessageDigest 244.06 MiB/s
MessageDigest 242.58 MiB/s
SHA1 216.77 MiB/s (was ~240.83 MiB/s)
SHA1 220.98 MiB/s
SHA1 221.76 MiB/s
SHA1 221.34 MiB/s
This decline in throughput is attributed to the step loop unrolling in
compress(), which was necessary to easily fit the UbcCheck logic into
the hash function. Using helper functions s1-s4 reduces the code
explosion, providing acceptable throughput.
With detection enabled (default):
SHA1 detectCollision 180.12 MiB/s
SHA1 detectCollision 181.59 MiB/s
SHA1 detectCollision 181.64 MiB/s
SHA1 detectCollision 182.24 MiB/s
sha1dc (native C) ~206.28 MiB/s
sha1dc (native C) ~204.47 MiB/s
sha1dc (native C) ~203.74 MiB/s
Average time across 100,000 calls to hash 4100 bytes (such as a commit
or tree) for the various algorithms available to JGit also shows SHA1
is slower than MessageDigest, but by an acceptable margin:
MessageDigest 17 usec
SHA1 18 usec
SHA1 detectCollision 22 usec
Time to index-pack for git.git (217982 objects, 69 MiB) has increased:
MessageDigest SHA1 w/ detectCollision
------------- -----------------------
20.12s 25.25s
19.87s 25.48s
20.04s 25.26s
avg 20.01s 25.33s +26%
Being implemented in Java with these additional safety checks is
clearly a penalty, but throughput is still acceptable given the
increased security against object name collisions.
[1] https://github.com/cr-marcstevens/sha1collisiondetection
[2] https://shattered.it/
[3] https://marc-stevens.nl/research/papers/C13-S.pdf
[4] https://public-inbox.org/git/20170223230621.43anex65ndoqbgnf@sigill.intra.peff.net/
Change-Id: I9fe4c6d8fc5e5a661af72cd3246c9e67b1b9fee6
The TreeWalk filtering classes need to support the three different
meanings of the return value the path comparison generates.
A new path comparison method (isPathMatch) is created with
three distinct return values (isPathPrefix use value '0' to
encode two of these) which will makes it possible for the logical
operators (especially NOT) to aggregate a correct verdict.
A filter like: AND(Path("path"), NOT(Path("path/to/other")))
Should filter out 'path/to/other/file', but not 'path/to/my/file'.
The path-limiting feature when testing path/to/my/file, would
result to run test for the following paths:
path
path/to
path/to/my
path/to/my/file
isPathPrefix('path/to/other') will return '0' for the first two
and since there is no way for NOT to distinguish between an exact
match and a match indicating that the tested path is a 'parent',
it will incorrectly return false and thus remove everything below
'path' immediately.
isPathMatch has a distinguished value for 'parent' matches that
will be preserved through the logic operators and should not
cause an over-eager removal of paths.
The functionality of isPathPrefix is required by other parts
and is untouched.
Unit tests are included to ensure that the logical functionality
is correct and can be preserved.
Change-Id: Ice2ca9406f09f1b179569e99b86a0e5d77baa20d
Signed-off-by: Magnus Vigerlöf <magnus.vigerlof@gmail.com>
Allow SHA1 instances to be reused to compute another hash value, and
resume caching them in ObjectInserter and PackParser. This shaves a
small amount of running time off parsing git.git's pack file:
before after
------ ------
25.25s 25.55s
25.48s 25.06s
25.26s 24.94s
Almost noise (small difference), but recycling the instances reduces
some stress on the memory allocator finding two 80 word message block
arrays needed for hashing and collision detection.
Change-Id: I4af88a720e81460293bc5c5d1d3db1a831e7e228