Refactor all of the push option support code to allocate the list
immediately before parsing the options section off the stream.
Move option support down to ReceivePack instead of BaseReceivePack.
Push options are specific to the ReceivePack protocol and are not
likely to appear in the 4 year old subscription proposal. These
changes are OK before JGit 4.5 ships as no consumer should be relying
on these new APIs.
Change-Id: Ib07d18c877628aba07da07cd91875f918d509c49
Initialize pushOptions when we decide to use them, instead of when we
advertise them.
In the case of HTTP the advertisement is in a different network
request, hence in a different instance of the BaseReceivePack.
Change-Id: I094c60942e04de82cb6d8433c9cd43a46ffae332
Signed-off-by: Stefan Beller <sbeller@google.com>
Do not open an OBJ_TREE if the caller is expecting an OBJ_BLOB or
OBJ_COMMIT; instead throw IncorrectObjectTypeException. This better
matches behavior of WindowCursor, the ObjectReader implementation of
the local file based object store.
Change-Id: I3fb0e77f54895b123679a405e1b6ba5b95752ff0
DfsRefDatabase#compareAndPut had a vague semantics for reference
matching. Because of this, an operation to make a symbolic
reference had been broken for some DFS implementations even if they
followed the contract of compareAndPut. The clarified semantics
requires the implementations to satisfy the followings:
* Matching references should be both symbolic references or both
object ID references.
* If both are symbolic references, both should have the same target
name.
* If both are object ID references, both should have the same object
ID.
This semantics is defined based on
https://git.eclipse.org/r/#/c/77416/. Before this commit,
DfsRefDatabase couldn't see the target of symbolic references.
InMemoryRepository is changed to comply with the new semantics. This
semantics change can affect the existing DFS implementations that only
checks object IDs. This commit adds two tests that the previous
InMemoryRepository couldn't pass.
Change-Id: I6c6b5d3cc8241a81f4a37782381c88e8a59fdf15
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
When doing a detaching operation, JGit fakes a SymbolicRef as an
ObjectIdRef. This is because RefUpdate#updateImpl dereferences the
SymbolicRef when updating it. For example, assume that HEAD is
pointing to refs/heads/master. If I try to make a detached HEAD
pointing to a commit c0ffee, RefUpdate dereferences HEAD as
refs/heads/master first and changes refs/heads/master to c0ffee. The
detach argument of RefDatabase#newUpdate avoids this dereference by
faking HEAD as ObjectIdRef.
This faking is problematic for the linking operation of
DfsRefDatabase. It does a compare-and-swap operation on every
reference change because of its distributed systems nature. If a
SymbolicRef is faked as an ObjectRef, it thinks that there is a
racing change in the reference and rejects the update. Because of
this, DFS based repositories cannot change the link target of symbolic
refs. This has not been a problem for file-based repositories because
they have a file-lock based semantics instead of the CAS based one.
The reference implementation, InMemoryRepository, is not affected
because it only compares ObjectIds.
When [1] introduced this faking code, there was no way for RefUpdate
to distinguish the detaching operation. When [2] fixed the detaching
operation, it introduced a detachingSymbolicRef flag. This commit uses
this flag to control whether it needs to dereference the symbolic refs
by calling Ref#getLeaf. The same flag is used in the reflog update
operation.
This commit does not affect any operation that succeeds currently. In
some DFS repository implementations, this fixes a ref linking
operation, which is currently failing.
[1]: 01b5392cdb
[2]: 3a86868c08
Change-Id: I118f85f0414dbfad02250944e28d74dddd59469b
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Imitate the packet tracing feature from C Git v1.7.5-rc0~58^2~1 (add
packet tracing debug code, 2011-02-24). Unlike C Git, use the log4j
log level setting instead of the GIT_TRACE_PACKET environment variable
to enable tracing.
Tested as follows:
1. Enable tracing by adding the lines
log4j.logger.org.eclipse.jgit.transport=DEBUG, stderr
log4j.additivity.org.eclipse.jgit.transport=false
to org.eclipse.jgit.pgm/resources/log4j.properties.
2. mvn package
3. org.eclipse.jgit.pgm/target/jgit \
ls-remote git://git.kernel.org/pub/scm/git/git 2>&1 |less
Then the output provides a trace of packets sent and received over
the wire:
2016-08-24 16:36:42 DEBUG PacketLineOut:145 - git> git-upload-pack /pub/scm/git/git^@host=git.kernel.org^@
2016-08-24 16:36:42 DEBUG PacketLineIn:165 - git< 2632c897f74b1cc9b5533f467da459b9ec725538 HEAD^@multi_ack thin-pack side-band side-band-64k ofs-delta shallow no-progress include-tag multi_ack_detailed symref=HEAD:refs/heads/master agent=git/2.8.4
2016-08-24 16:36:42 DEBUG PacketLineIn:165 - git< e0c1ceafc5bece92d35773a75fff59497e1d9bd5 refs/heads/maint
Change-Id: I5028c064f3ac090510386057cb4e6d30d4eae232
Signed-off-by: Dan Wang <dwwang@google.com>
This fixes the tests failed in JDK8.
FS uses java.nio API to get file attributes. The timestamps obtained
from that API are more precise than the ones from
java.io.File#lastModified() since Java8.
This difference accidentally makes JGit detect newly added files as
smudged. Use the precised timestamp to avoid this false positive.
Bug: 500058
Change-Id: I9e587583c85cb6efa7562ad6c5f26577869a2e7c
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
The metadata comparison of submodules is not reliable because of the
last modified timestamp and directory length.
Bug: 498759
Change-Id: If5db69ef3868e475ac477d3e8a7750b268799b0c
The exception can be thrown in a various reason, and sometimes 403
Forbidden is not appropriate. Make the HTTP status code customizable.
Change-Id: If2ef6f454f7479158a4e28a12909837db483521c
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
AddCommandTest is flaky because IOException is thrown sometimes.
Caused by: java.io.IOException: Stream closed
at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
at java.io.OutputStream.write(OutputStream.java:116)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at org.eclipse.jgit.util.FS.runProcess(FS.java:993)
at org.eclipse.jgit.util.FS.execute(FS.java:1102)
at org.eclipse.jgit.treewalk.WorkingTreeIterator.filterClean(WorkingTreeIterator.java:470)
... 22 more
OpenJDK replaces the underlying OutputStream with NullOutputStream when
the process exits. This throws IOException for all write operation. When
it exits before JGit writes the input to the pipe buffer, the input
stays in BufferedOutputStream. The close method tries to write it again,
and IOException is thrown.
Since we ignore IOException in StreamGobbler, we also ignore it when
we close the stream.
Fixes Bug 499633.
Change-Id: I30c7ac78e05b00bd0152f697848f4d17d53efd17
Signed-off-by: Masaya Suzuki <draftcode@gmail.com>
If a reference was updated more recently than a pack was written
(typical) the PackList was perpetually dirty until the next GC
was completed for the repository.
Detect this condition by observing no changes to the PackList
membership and resetting the dirty bit.
Change-Id: Ie2133aca1f8083307c73b6a26358175864f100ef
This will be used by EGit for implementing commit amend in the staging
view (see Idcd1efeeee8b3065bae36e285bfc0af24ab1e88f).
Change-Id: Ice9ebbb1c0c3314c679f4db40cdd3664f61c27c3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Reported-by: David Pursehouse <david.pursehouse@gmail.com>
Change-Id: I9e9b021d335bda4d58b6bcc30f59b81ac5b37724
Signed-off-by: Jonathan Nieder <jrn@google.com>
Since 84d2738ff2 (Don't skip want validation when the client sends no
haves, 2013-06-21), this branch is not taken. Process the
"shallow"s anyway as a defensive measure in case the code path gets
revived.
Change-Id: Idfb834825d77f51e17191c1635c9d78c78738cfd
Signed-off-by: Jonathan Nieder <jrn@google.com>
d385a7a5e5 (Shallow fetch: Respect "shallow" lines, 2016-08-03) forgot
that UploadPack wasn't passing a DepthWalk to PackWriter in the first
place. As a result, shallow clones fail:
java.lang.IllegalArgumentException: Shallow packs require a DepthWalk
at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:756)
at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:1497)
at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:1381)
at org.eclipse.jgit.transport.UploadPack.service(UploadPack.java:774)
at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:667)
at org.eclipse.jgit.http.server.UploadPackServlet.doPost(UploadPackServlet.java:191)
Change-Id: Ib0d8c2946eebfea910a2b767fb92e23da15d4749
cgit changed the --depth parameter to mean the total depth of history
rather than the depth of ancestors to be returned [1]. JGit still uses
the latter meaning, so update it to match cgit.
depth=0 still means a non-shallow clone. depth=1 now means only the
wants rather than the wants and their direct parents.
This is accomplished by changing the semantic meaning of "depth" in
UploadPack and PackWriter to mean the total depth of history desired,
while keeping "depth" in DepthWalk.{RevWalk,ObjectWalk} to mean
the depth of traversal. Thus UploadPack and PackWriter always
initialize their DepthWalks with "depth-1".
[1] upload-pack: fix off-by-one depth calculation in shallow clone
https://code.googlesource.com/git/+/682c7d2f1a2d1a5443777237450505738af2ff1a
Change-Id: I87ed3c0f56c37e3491e367a41f5e555c4207ff44
Signed-off-by: Terry Parker <tparker@google.com>
When fetching from a shallow clone, the client sends "have" lines
to tell the server about objects it already has and "shallow" lines
to tell where its local history terminates. In some circumstances,
the server fails to honor the shallow lines and fails to return
objects that the client needs.
UploadPack passes the "have" lines to PackWriter so PackWriter can
omit them from the generated pack. UploadPack processes "shallow"
lines by calling RevWalk.assumeShallow() with the set of shallow
commits. RevWalk creates and caches RevCommits for these shallow
commits, clearing out their parents. That way, walks correctly
terminate at the shallow commits instead of assuming the client has
history going back behind them. UploadPack converts its RevWalk to an
ObjectWalk, maintaining the cached RevCommits, and passes it to
PackWriter.
Unfortunately, to support shallow fetches the PackWriter does the
following:
if (shallowPack && !(walk instanceof DepthWalk.ObjectWalk))
walk = new DepthWalk.ObjectWalk(reader, depth);
That is, when the client sends a "deepen" line (fetch --depth=<n>)
and the caller has not passed in a DepthWalk.ObjectWalk, PackWriter
throws away the RevWalk that was passed in and makes a new one. The
cleared parent lists prepared by RevWalk.assumeShallow() are lost.
Fortunately UploadPack intends to pass in a DepthWalk.ObjectWalk.
It tries to create it by calling toObjectWalkWithSameObjects() on
a DepthWalk.RevWalk. But it doesn't work: because DepthWalk.RevWalk
does not override the standard RevWalk#toObjectWalkWithSameObjects
implementation, the result is a plain ObjectWalk instead of an
instance of DepthWalk.ObjectWalk.
The result is that the "shallow" information is thrown away and
objects reachable from the shallow commits can be omitted from the
pack sent when fetching with --depth from a shallow clone.
Multiple factors collude to limit the circumstances under which this
bug can be observed:
1. Commits with depth != 0 don't enter DepthGenerator's pending queue.
That means a "have" cannot have any effect on DepthGenerator unless
it is also a "want".
2. DepthGenerator#next() doesn't call carryFlagsImpl(), so the
uninteresting flag is not propagated to ancestors there even if a
"have" is also a "want".
3. JGit treats a depth of 1 as "1 past the wants".
Because of (2), the only place the UNINTERESTING flag can leak to a
shallow commit's parents is in the carryFlags() call from
markUninteresting(). carryFlags() only traverses commits that have
already been parsed: commits yet to be parsed are supposed to inherit
correct flags from their parent in PendingGenerator#next (which
doesn't happen here --- that is (2)). So the list of commits that have
already been parsed becomes relevant.
When we hit the markUninteresting() call, all "want"s, "have"s, and
commits to be unshallowed have been parsed. carryFlags() only
affects the parsed commits. If the "want" is a direct parent of a
"have", then it carryFlags() marks it as uninteresting. If the "have"
was also a "shallow", then its parent pointer should have been null
and the "want" shouldn't have been marked, so we see the bug. If the
"want" is a more distant ancestor then (2) keeps the uninteresting
state from propagating to the "want" and we don't see the bug. If the
"shallow" is not also a "have" then the shallow commit isn't parsed
so (2) keeps the uninteresting state from propagating to the "want
so we don't see the bug.
Here is a reproduction case (time flowing left to right, arrows
pointing to parents). "C" must be a commit that the client
reports as a "have" during negotiation. That can only happen if the
server reports it as an existing branch or tag in the first round of
negotiation:
A <-- B <-- C <-- D
First do
git clone --depth 1 <repo>
which yields D as a "have" and C as a "shallow" commit. Then try
git fetch --depth 1 <repo> B:refs/heads/B
Negotiation sets up: have D, shallow C, have C, want B.
But due to this bug B is marked as uninteresting and is not sent.
Change-Id: I6e14b57b2f85e52d28cdcf356df647870f475440
Signed-off-by: Terry Parker <tparker@google.com>
DepthWalk needs to override toObjectWalkWithSameObjects() and thus
needs to be able to directly set the objects and freeFlags fields, so
make them package private.
Change-Id: I24561b82c54ba3d6522582ca25105b204d777074
Signed-off-by: Terry Parker <tparker@google.com>
When doing an incremental fetch from JGit, "have" commits are marked
as "uninteresting". In a non-shallow fetch, when the RevWalk hits an
"uninteresting" commit it marks the commit's corresponding tree as
uninteresting. That has the effect of dropping those trees and all the
trees and blobs they reference out of the thin pack returned to the
client.
However, shallow fetches use a DepthWalk to limit the RevWalk, which
nearly always causes the RevWalk to terminate before encountering the
"have" commits. As a result the pack created for the incremental fetch
never encounters "uninteresting" tree objects and thus includes
duplicate objects that it knows the client already has.
Change-Id: I7b1f7c3b0d83e04d34cd2fa676f1ad4fec904c05
Signed-off-by: Terry Parker <tparker@google.com>
Previously jgit would attempt to clean git repositories that had not
been committed by calling a non-recursive delete on them, which would
fail as they are directories. This commit addresses that issue in the
following ways.
Repositories are skipped in a default clean, similarly to cgit and only
cleaned when the force flag is applied. When the force flag is applied
repositories are deleted using a recursive delete call. The force flag
and setForce method are added here to CleanCommand to support this
change.
Bug: 498367
Change-Id: Ib6cfff65a033d0d0f76395060bf76719e13fc467
Signed-off-by: Matthaus Owens <matthaus@puppetlabs.com>
We have to be able to access the enum from outside the package as part of
the API.
Change-Id: I4bdc6bd53a14237c5f4fb9397ae850f9a24c4cfb
Signed-off-by: Stefan Beller <sbeller@google.com>
The message "close() called when useCnt is already zero" is logged with
level warning, and then if debug logging is enabled, the stack trace is
logged separately with level debug.
Log the message and the stack trace in the same call, so that they always
appear together in the output rather than potentially interleaved with
other log statements.
Change-Id: I1b5c1557ddc2d19f3f5b29baec96e62bc467d88a
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Gerrit's superproject subscription feature uses RefSpecs to formalize
the ACLs of when the superproject subscription feature is allowed.
As this is a slightly different use case than describing a local/remote
pair of refs, we need to be more permissive. Specifically we want to allow:
refs/heads/*
refs/heads/*:refs/heads/master
refs/heads/master:refs/heads/*
Introduce a new constructor, that allows constructing these RefSpecs.
Change-Id: I46c0bea9d876e61eb2c8d50f404b905792bc72b3
Signed-off-by: Stefan Beller <sbeller@google.com>
We had a case in Gerrits superproject subscriptions where
'refs/heads/' was configured with the intention to mean 'refs/heads/*'.
The first expression lacks the '*', which is why it is not considered
a wildcard but it was considered valid and so was not found early to be
a typo.
Refs are not allowed to end with '/' anyway, so add a check for that.
Change-Id: I3ffdd9002146382acafb4fbc310a64af4cc1b7a9
Signed-off-by: Stefan Beller <sbeller@google.com>
Example usage:
$ ./jgit push \
--push-option "Reviewer=j.doe@example.org" \
--push-option "<arbitrary string>" \
origin HEAD:refs/for/master
Stefan Beller has also made an equivalent change to CGit:
http://thread.gmane.org/gmane.comp.version-control.git/299872
Change-Id: I6797e50681054dce3bd179e80b731aef5e200d77
Signed-off-by: Dan Wang <dwwang@google.com>
What's invalidated when an object database is "dirty" is not the whole
database, but rather a specific list of packs. If there is a race
between getting the pack list and setting the volatile dirty flag
where the packs are rescanned, we don't need to mark the new pack list
as dirty.
This is a fine point that only really applies if the decision of
whether or not to mark dirty actually requires introspecting the pack
list (say, its timestamps). The general operation of "take whatever
is the current pack list and mark it dirty" may still be inherently
racy, but the cost is not so high.
Change-Id: I159e9154bd8b2d348b4e383627a503e85462dcc6
This variable has been populated and never used since it was
introduced in commit 5cf53fdacf
(Speed up clone/fetch with large number of refs, 2013-02-18).
Noted by FindBugs:
"BatchRefUpdate.java:359, UC_USELESS_OBJECT, Priority: Normal"
Change-Id: I7aacb49540aaee4a83db3d38b15633bb6c4773d0
Signed-off-by: Dan Wang <dwwang@google.com>
Currently, there is a race where a user of a DfsRepository in a single
thread may get unexpected MissingObjectExceptions trying to look up an
object that appears as the current value of a ref:
1. Thread A scans packs before scanning refs, for example by reading
an object by SHA-1.
2. Thread B flushes an object and updates a ref to point to that
object.
3. Thread A looks up the ref updated in (2). Since it is scanning refs
for the first time, it sees the new object SHA-1.
4. Thread A tries to read the object it found in (3), using the cached
pack list it got from (1). The object appears missing.
Allow implementations to work around this by marking the object
database's current pack list as "dirty." A dirty pack list means that
DfsReader will rescan packs and try again if a requested object is
missing. Implementations should mark objects as dirty any time the ref
database reads or scans refs that might be newer than a previously
cached pack list.
Change-Id: I06c722b20c859ed1475628ec6a2f6d3d6d580700
We observe in Gerrit 2.12 that useCnt can become negative in rare cases.
Log this to help finding the bug.
Change-Id: Ie91c7f9d190a5d7cf4733d4bf84124d119ca20f7
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When Repository.close() decrements the useCount to 0 currently the cache
immediately evicts the repository from WindowCache and RepositoryCache.
This leads to I/O overhead on busy repositories because pack files and
references are inserted and deleted from the cache frequently.
This commit defers the eviction of a repository from the caches until
last use of the repository is older than time to live. The eviction is
handled by a background task running periodically.
Add two new configuration parameters:
* core.repositoryCacheExpireAfter: cache entries are evicted if the
cache entry wasn't accessed longer than this time in milliseconds
* core.repositoryCacheCleanupDelay: defines the interval in milliseconds
for running a background task evicting expired cache entries. If set to
-1 the delay is set to min(repositoryCacheExpireAfter, 10 minutes). If
set to 0 the time based eviction is switched off and no background task
is started. If time based eviction is switched off the JVM can still
evict cache entries if heap memory is running low.
Change-Id: I4a0214ad8b4a193985dda6a0ade63b70bdb948d7
Also-by: Matthias Sohn <matthias.sohn@sap.com>
Also-by: Hugo Arès <hugo.ares@ericsson.com>
Also-by: Sasa Zivkov <sasa.zivkov@sap.com>
These try-with-resources blocks close the underlying output stream
twice: once when closing the CountingOutputStream wrapper, then again
when closing the DfsOutputStream out.
Simplify by only closing the CountingOutputStream.
In practice this shouldn't matter because the close() method of a
Closable is required to be idempotent, but avoiding the redundant
extra close makes the code simpler to read and understand.
Change-Id: I1778c4fc8ba075a2c6cd2129528bb272cb3a1af7
If the client sent a well-formed enough request to see it wants to use
side-band-64k for status reporting (meaning its a modern client), but
any other command record was somehow invalid (e.g. corrupt SHA-1)
report the parsing exception using channel 3. This allows clients to
see the failure and know the server will not be continuing.
git-core and JGit clients send all commands and then start a sideband
demux before sending the pack. By consuming all commands first we get
the client into a state where it can see and respond to the channel 3
server failure.
This behavior is useful on HTTPS connections when the client is buggy
and sent a corrupt command, but still managed to request side-band-64k
in the first line.
Change-Id: If385b91ceb9f024ccae2d1645caf15bc6b206130
The "shallow $id" parsing can also throw InvalidObjectIdException,
just like parseCommand. Move it into its own method with a proper
try-catch block to convert to the checked PackProtocolException.
Change-Id: I6839b95f0bc4fbf4d2c213331ebb9bd24ab2af65
Instead of deferring until after command parsing, enable the
capabilities after the first pkt-line has been read from the client.
This allows the server to setup the side-band-64k channel immediately.
Change-Id: I141b7fc92e983a41d3a58da8e1464a6917422b6c
If the push client has requested side-band support the server can
signal a fatal error parsing the pack using the error channel (3)
and then hang up. This may cause the PackWriter to fail to write to
data onto the network socket, which throws a misleading error back
up to the application and the user.
During a write failure poll the input to see if the side band system
can parse out an error message off channel 3. This should be fast as
there will either be an error present in the buffer, or the remote will
also have hung-up on the side band channel. In the case of a hang-up
just rethrow the original IOException as its a network error.
This roughly matches what C git does; once commands are sent and the
packer is started a new thread runs in the background to decode any
possible server error during unpacking on the remote peer
Change-Id: Idb37a4a122a565ec4b59130e08c27d963ba09395
The more specific type InvalidObjectIdException is thrown by
ObjectId.fromString(). Use it here in ReceivePack as the more
generic IAE is never thrown by the body of the try-catch block.
Change-Id: I53fc13c561c7d429a50b5eb82773f1a670431c54