Shallow commits in the RevWalk in the UploadPack object may cause the
UNINTERESTING flag not to be carried over to their parent commits,
since they were marked NO_PARENTS during the assumeShallow or
initializeShallowCommits call.
A new RevWalk needs to be created for this reason, but instead of
creating a new RevWalk from the Repository, we can reuse the
ObjectReader of UploadPack's RevWalk to load objects.
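A minimal sketch of the idea in Java (names are illustrative, not the
actual UploadPack code):

    import org.eclipse.jgit.lib.ObjectReader;
    import org.eclipse.jgit.revwalk.RevWalk;

    // 'walk' stands in for UploadPack's existing RevWalk.
    static RevWalk createFreshWalk(RevWalk walk) {
        ObjectReader reader = walk.getObjectReader();
        // The new walk shares the reader, so no second ObjectReader
        // has to be opened from the Repository.
        return new RevWalk(reader);
    }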
Change-Id: Ic3fee0512d35b4f555c60e696a880f8b192e4439
Signed-off-by: Zhen Chen <czhen@google.com>
Per git-config(1), core.logAllRefUpdates auto-creates reflogs for HEAD
and for refs under heads, notes, and tags. Add notes and remove stash
from ReflogWriter#shouldAutoCreateLog. Explicitly force writing reflogs
for refs/stash at call sites, now that this is supported.
Change-Id: I3a46d2c2703b7c243e0ee2bbf6948279800c485c
Even if a repository has core.logAllRefUpdates=true, ReflogWriter does
not create reflog files unless the refs are under a hard-coded list of
prefixes, or unless the forceWrite bit is set. Expose the forceWrite bit
on a per-update basis in RefUpdate/BatchRefUpdate/ReceiveCommand,
creating ReflogWriters as necessary.
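A hedged sketch of using the per-update bit for refs/stash; the setter
name setForceRefLog is an assumption about the API exposed here:

    import java.io.IOException;
    import org.eclipse.jgit.lib.ObjectId;
    import org.eclipse.jgit.lib.RefUpdate;
    import org.eclipse.jgit.lib.Repository;

    static void writeStashReflog(Repository repo, ObjectId stashCommit)
            throws IOException {
        RefUpdate u = repo.updateRef("refs/stash");
        u.setNewObjectId(stashCommit);
        u.setRefLogMessage("on-stash: created", false);
        u.setForceRefLog(true); // write the reflog even off the prefix list
        u.forceUpdate();
    }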
Change-Id: Ifc851fba00f76bf56d4134f821d0576b37810f80
The ReflogWriter constructor just took a Repository and called
getDirectory() on it to figure out the reflog dirs, but not all
Repository instances use this storage format for reflogs, so it's
incorrect to attempt to use ReflogWriter when there is not a
RefDirectory directly involved. In practice, ReflogWriter was mostly
only used by the implementation of RefDirectory, so enforcing this is
mostly just shuffling around calls in the same internal package.
The one exception is StashDropCommand, which writes to a reflog lock
file directly. This was a reasonable implementation decision, because
there is no general reflog interface in JGit beyond using
(Batch)RefUpdate to write new entries to the reflog. So to implement
"git stash drop <N>", which removes an arbitrary element from the
reflog, it's fair to fall back to the RefDirectory implementation.
Creating and using a more general interface is well beyond the scope of
this change.
That said, the old behavior of writing out the reflog file even if
that's not the reflog format used by the given Repository is clearly
wrong. Fail fast in this case instead.
Change-Id: I9bd4b047bc3e28a5607fd346ec2400dde9151730
This test was never being run. Since it was introduced it has been
named "notest..", which meant it didn't run with JUnit3, and since
it is not annotated with @Test it also doesn't run with JUnit4.
When compiling with Bazel 0.6.0, error-prone raises an error
that the public method is not annotated with @Ignore or @Test.
Given that the test has never been run anyway, we can just
remove it.
Bug: 525415
Change-Id: Ie9a54f89fe42e0c201f547ff54ff1d419ce37864
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Git has a rather elaborate mechanism to specify HTTP configuration
options per URL, based on pattern matching the URL against "http"
subsection names.[1] The URLs used for this matching are always the
original URLs; redirected URLs do not participate.
* Scheme and host must match exactly case-insensitively.
* An optional user name must match exactly.
* Ports must match exactly after default ports have been filled in.
* The path of a subsection, if any, must match a segment prefix of
the path of the URL.
* A match with a user name takes precedence over an equal-length
path match without one, but a longer path match is preferred over
a shorter match with a user name.
Implement this for JGit. Factor out the HttpConfig from TransportHttp
and implement the matching and override mechanism.
The set of supported settings is still the same; JGit currently
supports only followRedirects, postBuffer, and sslVerify, plus the
JGit-specific maxRedirects key.
Add tests for path normalization and prefix matching only on segment
separators, and use the new mechanism in SmartClientSmartServerSslTest
to disable sslVerify selectively for only the test server URLs.
Compare also bug 374703 and bug 465492. With this commit it would be
possible to set sslVerify to false for only the git server using a
self-signed certificate instead of having to switch it off globally
via http.sslVerify.
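For example, with this in place a configuration along these lines (URL
made up) disables certificate verification for that one server only:

    [http]
            sslVerify = true
    [http "https://self-signed.example.com:8443"]
            sslVerify = false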
[1] https://git-scm.com/docs/git-config
Change-Id: I42a3c2399cb937cd7884116a2a32fcaa7a418fcb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
This will be used later when adding support for recursing into
submodules on push.
Change-Id: Ie2a183e5404a32046de9f6524e6ceeec37919671
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
With atomic ref updates using packed refs, JGit did not fire a
RefsChangedEvent. This resulted in a user-visible regression in
EGit: the UI would not update after a "Fetch from upstream...".
Presumably it would also make Gerrit miss out on ref changes?
Strengthen the BatchRefUpdateTest by also asserting the expected
number of RefsChangedEvents, and ensure modCnt is incremented in
RefDirectory.commitPackedRefs() when refs really changed (as opposed
to some internal housekeeping operation, such as packing loose refs).
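For reference, a minimal sketch of the kind of consumer that depends on
this event firing; refreshUi is a made-up callback:

    import org.eclipse.jgit.events.RefsChangedEvent;
    import org.eclipse.jgit.lib.Repository;

    // Registers an EGit-style refresh callback; it now also fires after
    // atomic batch updates that go through commitPackedRefs().
    static void watch(Repository repo) {
        repo.getListenerList().addRefsChangedListener(
                (RefsChangedEvent event) -> refreshUi(event.getRepository()));
    }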
Bug: 521296
Change-Id: Ia985bda1d99f45a5f89c8020ca4845e7a66e743e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Much of the time the caller can specify a RefSpec succinctly using a
string, and doesn't care about calling setters. Add a convenience method
for this case, and use it where applicable in JGit core.
Change-Id: Ic3fac7fc568eee4759236a5264d2e7e5f9b9716d
An application can choose to invoke setAdvertisedRefs multiple times,
for example when several AdvertiseRefsHooks are installed in a chain.
Each of these invocations adds its ObjectIds to the advertisedHaves
collection, accumulating the unique set across invocations.
This can lead to a server over-advertising with ".have" lines if the
first hook pushes in a lot of references, and the second hook filters
this to a subset. ReceivePack will advertise the unique objects from
the first hook using ".have" lines, which may lead to a huge
advertisement sent to the client.
This can also contribute to a very slow connectivity check after the
pack is parsed as ReceivePack calls markUninteresting on every commit
in advertisedHaves. This may require expanding a lot of subtrees to
mark all trees as uninteresting as well. On a very big repository
this can lead to a many-second stall.
Clear the advertisedHaves collection any time the refs are updated.
Add a test to verify the correct set of objects was sent.
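A sketch of the setup this matters for (hook names are illustrative);
only the refs and haves from the last setAdvertisedRefs call should
remain advertised:

    import java.util.Arrays;
    import org.eclipse.jgit.lib.Repository;
    import org.eclipse.jgit.transport.AdvertiseRefsHook;
    import org.eclipse.jgit.transport.AdvertiseRefsHookChain;
    import org.eclipse.jgit.transport.ReceivePack;

    static ReceivePack setup(Repository repo, AdvertiseRefsHook broadHook,
            AdvertiseRefsHook filteringHook) {
        ReceivePack rp = new ReceivePack(repo);
        // Each hook in the chain may call setAdvertisedRefs again; the
        // filtering hook's refs should fully replace the broad hook's
        // advertisement instead of only adding to it.
        rp.setAdvertiseRefsHook(AdvertiseRefsHookChain.newChain(
                Arrays.asList(broadHook, filteringHook)));
        return rp;
    }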
Change-Id: I97f6998d0597251444a2e846a3ea1f461bae96f9
The tests:
- testCheckBlobNotCorrupt
- testCheckBlobCorrupt
create instances of ObjectChecker that are the same.
The tests:
- testCheckBlobWithBlobObjectCheckerNotCorrupt
- testCheckBlobWithBlobObjectCheckerCorrupt
also create instances of ObjectChecker that are the same.
Factor these instances out to constants instead of creating them
in the tests.
The `checker` member is still created anew in each test, since some
of the tests change its state.
Change-Id: I2d90263829d01d208632185b1ec2f678ae1a3f4c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Add tests for "true" and "false" matching to "YES" and "NO".
Change-Id: I58223855022871ac4b21bd34ff6a9cd00fce30a1
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
If a ReftableConfig has been supplied by the caller, write out a
reftable as a sibling of the GC pack, alongside the heads.
To bootstrap from a non-reftable system, the refs are read from the
DfsRefDatabase if no GC reftables are present. It's assumed the
references are fully current and do not need to be merged with any
other reftables. Any non-GC reftables will be pruned at the end of
the GC cycle, just like any packs that were replaced.
If a GC reftable is present, all existing reftables are compacted, and
references from DfsRefDatabase are only used to seed the packer. It's
assumed these are consistent with each other.
Change-Id: Ie397eb58aaaefb6865c816d9b39de3ac12998019
ServerSocket.accept() is not interruptible: a thread busy in accept()
may not react to Thread.interrupt() and may not return from accept()
via an InterruptedException. Close the socket instead to make the
daemon's listener thread terminate.
* Close the listening socket to get the listening thread to exit
instead of interrupting it.
* Add a stopAndWait() method that stops the listening thread and
then waits until it has indeed finished.
* Set SO_REUSEADDR on the listening socket.
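The JDK mechanism relied upon, as a self-contained sketch rather than
the actual Daemon code:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import java.net.Socket;

    class Listener {
        private final ServerSocket listenSocket;
        private final Thread acceptThread;

        Listener(int port) throws IOException {
            listenSocket = new ServerSocket();
            listenSocket.setReuseAddress(true); // SO_REUSEADDR
            listenSocket.bind(new InetSocketAddress(port));
            acceptThread = new Thread(() -> {
                try {
                    for (;;) {
                        try (Socket client = listenSocket.accept()) {
                            // handle the client connection ...
                        }
                    }
                } catch (IOException closed) {
                    // accept() fails once the socket is closed: exit loop
                }
            });
            acceptThread.start();
        }

        void stopAndWait() throws IOException, InterruptedException {
            listenSocket.close();  // unblocks the thread stuck in accept()
            acceptThread.join();   // wait until it has really terminated
        }
    }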
Bug: 376369
Change-Id: I9d6014103e6dcb0173daea134feb44dc52c5c69a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
While parsing .gitmodules, the name of the submodule subsection is
purely arbitrary: it frequently is the path of the submodule, but
there's no requirement for it to be. By building a map of paths to
the section name in .gitmodules, we can more accurately return
the submodule URL.
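For example, an entry like the following (made up) is perfectly legal,
so the URL lookup has to go via the path rather than via the subsection
name:

    [submodule "libfoo"]
            path = third_party/foo
            url = https://example.com/libfoo.git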
Bug: 508801
Change-Id: I8399ccada1834d4cc5d023344b97dcf8d5869b16
Also-by: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Some tests call out to external cgit. Those tests all failed for me
locally on Mac. Turned out that the reason was that the system git
config used by the git in the bazel run contained paths with ~/ but
somehow $HOME was not set. As a result the external git returned
with exit code 128.
Fix this by passing along $HOME explicitly. Also improve assertions
to make sure we do get the stderr of the external command in the
test log.
I hadn't noticed that until now because apparently the maven build
does pass along $HOME.
Change-Id: I7069676d5cc7b23a71e79a4866fe8acab5a405f4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Make jsch visible to the test bundle and add the dependency.
Change-Id: I0c49ee9b8f64fe8a8c74d2f08865917eb33069b4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Do not automatically organize imports using a save action since this
seems to be buggy and removed some annotations that org.eclipse.jgit.pgm
needs in order to use args4j.
Change-Id: I5a91292c3b9241ce2dde3e4ecce14ad460097129
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Revert the following save actions which were introduced in c0ad77d8:
- always use braces around blocks
- remove unused imports
Contrary to what I expected, save actions are run globally on edited
files, and not only on the edited lines.
Hence revert the save action "Convert control statement bodies to
blocks", which would affect a large number of lines not touched by a
change that edits only a small part of a class. This would generate a
large number of changes which may lead to many unnecessary conflicts.
The total number of affected lines across jgit would be around 10k.
Also revert "Remove unused imports" since it erroneously removes imports
of some annotations needed by pgm classes using args4j.
Change-Id: I879a47f68e664129e6124cf25c1ae1f6a2d7a5aa
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add the following Eclipse save actions executed when saving modified
lines. This should help to reduce manual work needed to maintain a clean
and consistent code style:
- organize imports
- always use braces around blocks
- add missing annotations
- @Override including implementation of interface methods
- @Deprecated
- remove
- unused imports
- unnecessary $NON-NLS$ tags
- redundant type arguments
Also add default values for new settings that were introduced in recent
Eclipse versions up to Neon since the last time we updated the save
rules.
Change-Id: Idc90b249df044d0552f04edf01a5f607c4846f50
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Some repositories can have a policy of not accepting certain blobs. To
check if the incoming pack file contains such blobs, ObjectChecker can
be used. However, this ObjectChecker is not called by PackParser if the
blob is stored as a whole. This is because the object can be so large
that it doesn't fit in memory.
This change introduces BlobObjectChecker. This interface takes chunks of
a blob instead of the entire object. ObjectChecker can optionally return
a BlobObjectChecker. This doesn't change existing ObjectChecker
implementations; they continue to receive only deltified blob objects.
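A rough sketch of how a policy check might plug in; the hook and method
names reflect one reading of the new interface and should be checked
against the actual code:

    import org.eclipse.jgit.errors.CorruptObjectException;
    import org.eclipse.jgit.lib.AnyObjectId;
    import org.eclipse.jgit.lib.BlobObjectChecker;
    import org.eclipse.jgit.lib.ObjectChecker;

    static ObjectChecker newPolicyChecker() {
        return new ObjectChecker() {
            @Override
            public BlobObjectChecker newBlobObjectChecker() {
                return new BlobObjectChecker() {
                    @Override
                    public void update(byte[] in, int p, int len) {
                        // feed this chunk into a streaming policy scan
                    }

                    @Override
                    public void endBlob(AnyObjectId id)
                            throws CorruptObjectException {
                        // reject the blob here if the policy was violated
                    }
                };
            }
        };
    }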
Change-Id: Ic33a92c2de42bd7a89786a4da26b7a648b25218d
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
** matching always tries the empty match first. If a mismatch occurs
later, the ** must be extended by exactly one segment and matching must
resume with the matcher following the ** matcher.
Bug: 520920
Change-Id: Id019ad1c773bd645ae92e398021952f8e961f45c
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Path pattern matching for attribute rules is different from matching
for excluded files.
The first difference concerns patterns without slashes. For
gitattributes those must match on the last component only, not on
any earlier segment. This is true also for directory-only patterns.
The second difference concerns directory-only patterns. Those also
must not match on a prefix or segment except the last one. They do
not apply recursively to all files beneath.
And third, a match only on a prefix of the path is valid for
gitattributes only if the last matcher was "/**".
Add a new parameter for such path matching to IMatcher.matches() and
pass it through as appropriate (false for gitignore, true for
gitattributes). As far as gitignore is concerned, there is no change.
New tests have been added, and some existing attribute matching tests
have been fixed since they operated on wrong assumptions.
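As an illustration of the first difference (paths made up), a pattern
without a slash behaves differently in the two files:

    # .gitattributes: applies only to entries whose name is "native",
    # not to files somewhere below a native/ directory
    native -text

    # .gitignore: "native" also matches a native/ directory anywhere,
    # excluding everything beneath it
    native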
Bug: 508568
Change-Id: Ie825dc2cac8a85a72a7eeb0abb888f3193d21dd2
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
These tests verify that JGit produces the same results as C git, for
both attribute matching (.gitattributes) and file exclusion matching
(.gitignore). The tests set up a test repository and test rules,
determine excluded files or attributes with both JGit and with the
native C git, and then compare the results.
For .gitignore tests, we run
git ls-files --ignored --exclude-standard -o
and for attribute tests we use
git check-attr --stdin --all
and pass the list of all files in the repository via stdin.
Change-Id: I5b40946e04ff4a97456be7dffe09374323b7c89d
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
The Eclipse compiler raises errors for unthrown exceptions declared to
be thrown by test methods introduced in 88e45399.
Change-Id: I0d91c89e1b20ceff52c38b759abf906cc94e9902
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Except for %p and %r and partially %C, we can do token substitutions
as defined by OpenSSH inside the config file parser. %p and %r can
be replaced only if specified in the config; if not, it would be the
caller's responsibility to replace them with values obtained from the
URI to connect to.
Jsch doesn't know about token substitutions at all. By doing the
replacements as well as we can in the config file parser, we can
make Jsch support most of these tokens.
%i is not handled at all as Java has no concept of a "user ID".
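For example, a configuration like the following (host made up) can now
be honored through Jsch because %d (local home directory) and %h
(remote host name) are expanded during parsing:

    Host git.example.com
        User git
        IdentityFile %d/.ssh/id_rsa_%h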
Includes unit tests.
Bug: 496170
Change-Id: If9d324090707de5d50c740b0d4455aefa8db46ee
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Ensure the Jsch instance used knows about ~/.ssh/config. This
enables Jsch to honor more user configurations (see
com.jcraft.jsch.Session.applyConfig()), in particular also the
UserKnownHostsFile configuration, or additional identities given
via multiple IdentityFile entries.
Turn JGit's OpenSshConfig into a full parser that can be a
Jsch-compliant ConfigRepository. This avoids a few bugs
in Jsch's OpenSSHConfig and keeps the JGit-facing interface
unchanged. At the same time we can supply a JGit OpenSshConfig
instance as a ConfigRepository to Jsch. And since they'll both
work from the same object, we can also be sure that the parsing
behavior is identical.
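Roughly the wiring this enables, as a sketch rather than the exact JGit
integration code:

    import com.jcraft.jsch.JSch;
    import org.eclipse.jgit.transport.OpenSshConfig;
    import org.eclipse.jgit.util.FS;

    static JSch newJsch() {
        JSch jsch = new JSch();
        // OpenSshConfig now doubles as a Jsch ConfigRepository, so Jsch
        // itself applies entries from ~/.ssh/config to its sessions.
        jsch.setConfigRepository(OpenSshConfig.get(FS.DETECTED));
        return jsch;
    }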
The parser does not handle the "Match" and "Include" keys, and it
doesn't do %-token substitutions (yet).
Note that Jsch doesn't handle multi-valued UserKnownHostsFile
entries as known by modern OpenSSH.[1]
[1] http://man.openbsd.org/OpenBSD-current/man5/ssh_config.5
Additional tests for new features are provided in OpenSshConfigTest.
Bug: 490939
Change-Id: Ic683bd412fa8c5632142aebba4a07fad4c64c637
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add an update_index to every reference in a reftable, storing the
exact transaction that last modified the reference. This is necessary
to fix some merge race conditions.
Consider updates at T1 and T3 present in two reftables. Compacting
these will create a table with range [T1,T3]. If T2 arrives during
or after the compaction, it's impossible for readers to know how to
merge the [T1,T3] table with the T2 table.
With an explicit update_index per reference, MergedReftable is able to
individually sort each reference, merging individual entries at T3
from [T1,T3] ahead of identically named entries appearing in T2.
Change-Id: Ie4065d4176a5a0207dcab9696ae05d086e042140
resolve(Ref) helps callers recursively chase symbolic references and
is a useful function when wrapping a Reftable inside a RefDatabase, as
RefCursor does not resolve symbolic references during iteration.
Change-Id: I1ba143f403773497972e225dc92c35ecb989e154
A compaction of reftables is just copying the results of a
MergedReftable into a ReftableWriter. Wrap this up into a utility.
Change-Id: I6f5677d923e9628993a2d8b4b007a9b8662c9045
MergedReftable combines multiple reference tables together in a stack,
allowing higher/later tables to shadow earlier/lower tables. This
forms the basis of a transaction system, where each transaction writes
a new reftable containing only the modified references, and readers
perform a merge on the fly to get the latest value.
Change-Id: Ic2cb750141e8c61a8b2726b2eb95195acb6ddc83
Add additional test cases for looking up entries within a namespace
such as refs/heads/ or refs/tags/, where the seek is passed a name
that ends with '/'.
Change-Id: I5f944de7518cd0090374bddba48d4dd3955a8d72
Add more test cases that cover larger collections of
references, verifying every reference is accessible
both by scan and by seek.
Change-Id: Icada59fdcfc92a4634f6df61baaebb1c37b75d98
This set of tests covers primitive storage of an empty
file, and each type of supported reference.
Change-Id: I3bdff35cae8ae27283051932f20608b3ac353559
Currently there is no way to determine the precise changes done
to the working tree by a JGit command. Only the CheckoutCommand
actually provides access to the lists of modified, deleted, and
to-be-deleted files, but those lists may be inaccurate (since they
are determined up-front before the working tree is modified) if
the actual checkout then fails halfway through. Moreover, other
JGit commands that modify the working tree do not offer any way to
figure out which files were changed.
This poses problems for EGit, which may need to refresh parts of the
Eclipse workspace when JGit has done java.io file operations.
Provide the foundations for better file change tracking: the working
tree is modified exclusively in DirCacheCheckout. Make it emit a new
type of RepositoryEvent that lists all files that were modified or
deleted, even if the checkout failed halfway through. In case of file
system problems, the 'updated' and 'removed' lists determined up-front
are adjusted to reflect the changes actually made.
EGit thus can register a listener for these events and then knows
exactly which parts of the Eclipse workspace may need to be refreshed.
Two commands manage checking out individual DirCacheEntries themselves:
checking out specific paths, and applying a stash with untracked files.
Make those two also emit such a new WorkingTreeModifiedEvent.
Furthermore, merges may modify files, and clean, rm, and stash create
may delete files.
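A sketch of how a client such as EGit might consume the new event; the
accessor names and the refreshWorkspace callback are assumptions for
illustration:

    import org.eclipse.jgit.events.WorkingTreeModifiedEvent;
    import org.eclipse.jgit.events.WorkingTreeModifiedListener;
    import org.eclipse.jgit.lib.Repository;

    static void watch(Repository repo) {
        // getModified()/getDeleted() accessor names assumed.
        repo.getListenerList().addListener(WorkingTreeModifiedListener.class,
                (WorkingTreeModifiedEvent e) ->
                        refreshWorkspace(e.getModified(), e.getDeleted()));
    }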
CQ: 13969
Bug: 500106
Change-Id: I7a100aee315791fa1201f43bbad61fbae60b35cb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
When creating a new PackFile instance it is specified whether this pack
has an associated bitmap index file or not. This information is cached,
and the public method getBitmapIndex() will always assume a bitmap index
file must exist if the cached data says so. But it may happen that the
packfiles are repacked during a gc in a different process, causing the
packfile, bitmap index and index file to be deleted. Since JGit still
has an open file handle on the packfile, the file is not really deleted
and can still be accessed. But the index and bitmap index files are
deleted.
Fix getBitmapIndex() to invalidate the cached packfile instance if such
a situation occurs.
This problem showed up when a Gerrit server was serving repositories
which were regularly garbage collected with native git. Fetch and
clone commands for certain repositories failed permanently after a
native git gc had deleted old bitmap index files.
Change-Id: I8e620bec74dd3f310ba42024f9a657062f868f0e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Per the git config documentation[1], pushInsteadOf is ignored when
a remote has explicit pushUris.
Implement this, and adapt tests.
Up to now JGit mistakenly applied pushInsteadOf also to existing
pushUris. If some repositories had relied on this mis-feature,
pushes may now suddenly fail (the uncritical case; the config
just needs to be fixed) or even still succeed but push to unexpected
places, namely to the non-rewritten pushUrls (the critical case).
The release notes should point out this change.
[1] https://git-scm.com/docs/git-config
Bug: 393170
Change-Id: I38c83204d2ac74f88f3d22d0550bf5ff7ee86daf
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
According to [1], pushInsteadOf is
1. applied to the uris, not to the pushUris
2. ignored if a remote has an explicit pushUri
JGit applied it only to the pushUris. As a result, pushInsteadOf was
ignored for remotes having only a uri, but no pushUri.
This commit implements (1) if there are no pushUris. I did not dare
implement (2) because:
* there are explicit tests for it that expect that pushInsteadOf gets
applied to existing pushUrls, and
* people may actually use and rely on this JGit behavior.
[1] https://git-scm.com/docs/git-config
Bug: 393170
Change-Id: I6dacbf1768a105190c2a8c5272e7880c1c9c943a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>