If we need to append less than 20 bytes in order to fix a thin pack
and make it complete, we need to set the length of our file back to
the actual number of bytes used because the original SHA-1 footer was
not completely overwritten. That extra data will confuse the header
and footer fixup logic when it tries to read to the end of the file.
This case is uncommon, which is why we've never
seen it before. Getting a delta that requires a whole object which
uses less than 20 bytes in pack representation is really hard.
Generally a delta generator won't make these, because the delta
would be bigger than simply deflating the whole object. I only
managed to do this with a hand-crafted pack file where a 1 byte
delta pointed to a 1 byte whole object.
Normally we try really hard to avoid truncating, because it's
typically not safe across network filesystems. But the odds of
this occurring are very low. This truncation is done on a file
we have open for writing, will append more content onto, and is
a temporary file that we won't move into position for others to
see until we've validated its SHA-1 is sane. I don't think the
truncate on NFS issue is something we need to worry about here.
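For illustration only, a minimal Java sketch of the truncation step. PackTailSketch and its methods are hypothetical names, not JGit code; the sketch assumes the appended bytes are written over the old 20 byte footer and that the file is then cut back to the last byte actually written.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper, not JGit's actual implementation.
final class PackTailSketch {
    /** Overwrites the old footer at footerStart and trims any stale tail. */
    static long appendOverFooter(File pack, long footerStart, byte[] appended)
            throws IOException {
        RandomAccessFile raf = new RandomAccessFile(pack, "rw");
        try {
            raf.seek(footerStart);
            raf.write(appended);
            long used = raf.getFilePointer();
            // With fewer than 20 appended bytes, part of the old footer
            // is still present past 'used' and must be cut off.
            if (used < raf.length())
                raf.setLength(used);
            return used;
        } finally {
            raf.close();
        }
    }

    /** Builds a 12 byte body plus 20 byte footer, then appends 3 bytes. */
    static long demo() {
        try {
            File f = File.createTempFile("thin", ".pack");
            try {
                FileOutputStream out = new FileOutputStream(f);
                try {
                    out.write(new byte[12 + 20]);
                } finally {
                    out.close();
                }
                appendOverFooter(f, 12, new byte[] { 1, 2, 3 });
                return f.length();
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the demo the file ends at 15 bytes, so a later read to the end of the file sees only the data that was really written.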
Change-Id: I102b9637dfd048dc833c050890d142f43c1e75ae
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We can avoid one stat call by trying to perform a directory
listing without checking if the reference File is a directory.
Attempting a directory listing on a non-directory is defined to
return null. The other case for null returns from list is when an
I/O error occurs. Both cases are now interpreted as a possible plain
reference. I/O
errors when reading plain references will be handled (ignored)
in scanRef().
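A small sketch of the idea, assuming a hypothetical classify() helper (RefScanSketch is not JGit code). It relies only on the documented behavior of java.io.File.list(), which returns null when the path is not a directory or an I/O error occurs.

```java
import java.io.File;

// Hypothetical illustration, not the actual scanning code.
final class RefScanSketch {
    enum Kind { DIRECTORY, POSSIBLE_PLAIN_REF }

    static Kind classify(File f) {
        // One call instead of isDirectory() followed by list().
        String[] entries = f.list();
        return entries != null ? Kind.DIRECTORY : Kind.POSSIBLE_PLAIN_REF;
    }
}
```

A null result does not prove the path is a readable reference; it only means the caller should try reading it as one and tolerate failure.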
Change-Id: I9906ed8c42eab4d6029c781aab87b3b07c1a1d2c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
In the current implementation Repository reads user and repository
config only at creation time.
The new implementation checks in Repository.getConfig if user or
repository config have changed on disk and reloads the config if
required.
Change-Id: Ibd97515919ef66c6f8aa1a4fe8a11a6711335dad
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
We only need to check file existence if some other stat returns
a value that may mean that the file does not exist. File.length() == 0
or File.lastModified() == 0 are two such properties. We use length
here.
Change-Id: If626b12e7bb4da994b5c086f6a5b7a12c187261c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
The JSch bundle in Eclipse 3.4 does not export its packages with
version numbers. Use Require-Bundle on version 0.1.37 that comes
with Eclipse 3.4.
There is no 0.1.37 in the maven repositories, so the pom still refers
to 0.1.41 to let the build get the compile time dependencies right.
Bug: 308031
CQ: 3904 jsch Version: 0.1.37 (using Orbit CQ2014)
Change-Id: I12eba86bfbe584560c213882ebba58bf1f9fa0c1
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
When listing branches, EGit only reads the advertisement and
then disconnects. When it closes down the pack channel the remote
side is waiting for the client to send our list of commands, or a
flush-pkt to let it know there is nothing to do.
However if an error thread is open watching the SSH stderr stream,
we ask for it to finish before we send the flush-pkt. Unfortunately
the thread won't terminate until the main output stream closes,
which is waiting for the flush-pkt. A classic network deadlock.
If the output stream needs a flush-pkt we send it before we wait
for the error stream to close. If the flush-pkt is rejected, we
close down the output stream early, assuming that the remote side
is broken and we will get error information soon.
Change-Id: I8d078a339077756220c113f49d206b1bf295d434
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Since the API is changing relative to 0.7.0, we'll call our next
release 0.8.1. But until that gets released, builds from master
will be 0.8.0.qualifier.
Change-Id: I921e984f51ce498610c09e0db21be72a533fee88
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
By default a receive pack assumes that its user will only provide
references to objects that the user already has access to on their
local client. In certain cases, an additional check to verify the
references point only to reachable objects is necessary.
This additional checking is useful when the code doesn't trust
the client not to provide a forged SHA-1 reference to an object,
in an attempt to access parts of the DAG that they weren't allowed
to see by the configured RefFilter.
Change-Id: I3e4b8505cb2992e3e4be253abb14a1501e47b970
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When reading pkt-lines off an InputStream we are quite likely to
consume a whole group of fairly short lines in rapid succession, such
as in the have exchange that occurs in the fetch-pack/upload-pack
protocol. Rather than allocating a throwaway buffer for each
line's raw byte sequence, reuse a buffer that is equal to the small
side-band packet size, which is 1000 bytes. Text based pkt-lines
are required to be less than this size because many widely deployed
versions of C Git use a statically allocated array of this length.
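A rough sketch of the buffer reuse, under the assumption that each pkt-line is a 4 digit hex length (which counts the header itself) followed by the payload, and that "0000" is a flush-pkt. PktLineReaderSketch is a hypothetical class, not JGit's PacketLineIn.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical reader; wraps I/O errors as unchecked for brevity.
final class PktLineReaderSketch {
    static final int SMALL_BUF = 1000; // small side-band packet size

    private final InputStream in;
    private final byte[] lineBuf = new byte[SMALL_BUF]; // reused for every line

    PktLineReaderSketch(InputStream in) {
        this.in = in;
    }

    /** Returns the next line's text, or null when a flush-pkt is read. */
    String readString() {
        try {
            readFully(4);
            int len = Integer.parseInt(
                    new String(lineBuf, 0, 4, StandardCharsets.US_ASCII), 16);
            if (len == 0)
                return null;
            len -= 4; // the length field includes the 4 byte header
            if (len < 0 || len > lineBuf.length)
                throw new IOException("bad pkt-line length");
            readFully(len);
            return new String(lineBuf, 0, len, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void readFully(int n) throws IOException {
        int off = 0;
        while (off < n) {
            int r = in.read(lineBuf, off, n - off);
            if (r < 0)
                throw new EOFException();
            off += r;
        }
    }
}
```

Only the returned String is allocated per line; the raw byte storage is shared across calls, so the reader is not safe for concurrent use.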
Change-Id: Ia5c8e95b85020f7f80b6d269dda5059b092d274d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some transports actually provide stream buffering on their own,
without needing to be wrapped up inside of a BufferedInputStream in
order to smooth out system calls to read or write. A great example
of this is the JSch SSH client, or the Apache MINA SSHD server.
Both use custom buffering to packetize the streams into the encrypted
SSH channel, and wrapping them up inside of a BufferedInputStream
or BufferedOutputStream is relatively pointless.
Our SideBandOutputStream implementation also provides some fairly
large buffering, equal to one complete side-band packet on the main
data channel. Wrapping that inside of a BufferedOutputStream just to
smooth out small writes from PackWriter causes extra data copies, and
provides no advantage. We can save some memory and some CPU cycles
by letting PackWriter dump directly into the SideBandOutputStream's
internal buffer array.
Instead we push the buffering streams down to be as close to the
network socket (or operating system pipe) as possible. This allows
us to smooth out the smaller reads/writes from pkt-line messages
during advertisement and negotiation, but avoid copying altogether
when the stream switches to larger writes over a side band channel.
Change-Id: I2f6f16caee64783c77d3dd1b2a41b3cc0c64c159
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This proved to be a pretty difficult bug to find. If we read exactly
the number of response bytes from the UnionInputStream and didn't
try to read beyond that length, the last connection's InputStream is
still inside of the UnionInputStream, and UnionInputStream.isEmpty()
returns false. But there is no data present, so the next read
request to our UnionInputStream returns EOF at a point where the
HTTP client code should have started a new request in order to get
more data.
Instead of wrapping the UnionInputStream, push a dummy stream onto
the end of it which when invoked always starts the next request and
then returns EOF. The UnionInputStream will automatically pop that
dummy stream out, and then read the next request's stream.
This way we never get into the state where we don't think we need
to run another request in order to satisfy the current read request,
but we really do.
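The dummy stream trick can be sketched as follows. UnionStreamSketch and NextRequestSketch are hypothetical stand-ins written for this note (not JGit's UnionInputStream), and the queue of byte arrays stands in for responses that real code would obtain by issuing HTTP requests.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical miniature of a union stream with a request-starting tail.
final class UnionStreamSketch extends InputStream {
    private final Deque<InputStream> streams = new ArrayDeque<>();

    void add(InputStream in) {
        streams.addLast(in);
    }

    @Override
    public int read() throws IOException {
        while (!streams.isEmpty()) {
            int b = streams.peekFirst().read();
            if (b >= 0)
                return b;
            streams.removeFirst(); // pop the exhausted stream, try the next
        }
        return -1;
    }

    /** Always reports EOF, but first queues the next response if any. */
    static final class NextRequestSketch extends InputStream {
        private final UnionStreamSketch union;
        private final Deque<byte[]> pending;

        NextRequestSketch(UnionStreamSketch union, Deque<byte[]> pending) {
            this.union = union;
            this.pending = pending;
        }

        @Override
        public int read() {
            byte[] next = pending.pollFirst(); // "start the next request"
            if (next != null) {
                union.add(new ByteArrayInputStream(next));
                union.add(new NextRequestSketch(union, pending));
            }
            return -1;
        }
    }

    static String demo() {
        Deque<byte[]> pending = new ArrayDeque<>();
        pending.add("cd".getBytes(StandardCharsets.US_ASCII));
        UnionStreamSketch u = new UnionStreamSketch();
        u.add(new ByteArrayInputStream("ab".getBytes(StandardCharsets.US_ASCII)));
        u.add(new NextRequestSketch(u, pending));
        StringBuilder sb = new StringBuilder();
        try {
            for (int b; (b = u.read()) >= 0;)
                sb.append((char) b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Reading exactly the two bytes of the first response leaves the dummy at the head of the queue, so the third read triggers the second response instead of returning EOF.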
The bug was hidden for so long because BasePackConnection.init()
was always wrapping the InputStream into a BufferedInputStream
with an 8 KiB buffer. This made the odds of us reading from the
UnionInputStream the exact number of available bytes quite low, as
the BufferedInputStream would always try to read a full buffer size.
Change-Id: I02b5ec3ef6853688687d91de000a5fbe2354915d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the application wants to, it can use sendError(String) to send one
or more error messages to clients before the advertisements are sent.
These will cause a C Git client to break out of the advertisement
parsing loop, display "remote error: message\n", and terminate.
Servers can optionally use this to send a detailed error to a client
explaining why it cannot use the ReceivePack service on a repository.
Over smart HTTP these errors are sent in a 200 OK response, and
are in the payload, allowing the Git client to give the end-user
the custom message rather than the generic error "403 Forbidden".
Change-Id: I03f4345183765d21002118617174c77f71427b5a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
GitHub broke the native git protocol a while ago by interjecting an
"ERR message" line into the upload-pack or receive-pack advertisement
list. This didn't match the expected pattern, so it caused existing
C Git clients to abort with a protocol exception.
These days, C Git clients actually look for this message and abort
with a more graceful notice to the end-user. JGit should do the
same, including setting up a custom exception type that makes it
easier for higher-level UIs to identify a message from the remote
site and present it to the user.
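A sketch of the client side check, assuming the pkt-line header has already been stripped and that an error line starts with the literal prefix "ERR ". RemoteErrorSketch and RemoteMessageException are hypothetical names chosen for this example; the exception type actually added to JGit may differ.

```java
// Hypothetical illustration of spotting an "ERR message" advertisement line.
final class RemoteErrorSketch {
    /** Carries the remote's own text so a UI can show it verbatim. */
    static final class RemoteMessageException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        RemoteMessageException(String message) {
            super(message);
        }
    }

    /** Returns the line unchanged unless it is an error from the remote. */
    static String checkLine(String line) {
        if (line.startsWith("ERR "))
            throw new RemoteMessageException(line.substring(4).trim());
        return line;
    }
}
```

A dedicated exception type lets callers distinguish "the server told us no" from a genuine protocol or transport failure.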
Change-Id: I51ab62a382cfaf1082210e8bfaa69506fd0d9786
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
JSch will allow us to close the connection and then just drop
any late messages coming over the stderr stream for the command.
This makes it easy to lose final output on a command, like from
Gerrit Code Review's post receive hook.
Instead spawn a background thread to copy data from JSch's pipe
into our own buffer, and wait for that thread to receive EOF on the
pipe before we declare the connection closed. This way we don't
have a race condition between the stderr data arriving and JSch
just tearing down the channel.
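Roughly, the copy thread could look like the sketch below. StderrCopySketch is a hypothetical class written for this note; the piped streams in demo() merely stand in for JSch's stderr pipe.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical background copier; not the class used by the transport.
final class StderrCopySketch extends Thread {
    private final InputStream src;
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    StderrCopySketch(InputStream src) {
        this.src = src;
        setDaemon(true);
    }

    @Override
    public void run() {
        byte[] tmp = new byte[1024];
        try {
            int n;
            while ((n = src.read(tmp)) != -1)
                buf.write(tmp, 0, n);
        } catch (IOException e) {
            // Treat a broken pipe like EOF and keep what already arrived.
        }
    }

    /** Blocks until EOF was seen on the pipe, then returns the text. */
    String awaitText() throws InterruptedException {
        join();
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    static String demo() {
        try {
            PipedOutputStream remote = new PipedOutputStream();
            StderrCopySketch t = new StderrCopySketch(new PipedInputStream(remote));
            t.start();
            remote.write("hook: done\n".getBytes(StandardCharsets.UTF_8));
            remote.close(); // EOF for the reader
            return t.awaitText();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The key point is that the caller joins the thread before declaring the connection closed, so late output is collected rather than dropped.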
Change-Id: Ica1ba40ed2b4b6efb7d5e4ea240efc0a56fb71f6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Any messages received on side band #2 that aren't scraped as a
progress message into our ProgressMonitor are now forwarded to a
buffer which is later included into the OperationResult object.
Application callers can use this buffer to present the additional
messages from the remote peer after the push or fetch operation
has concluded.
The smart push connections using the native send-pack/receive-pack
protocol now request side-band-64k capability if it is available
and forward any messages received through that channel onto this
message buffer. This makes hook messages available over smart HTTP,
or even over SSH.
The SSH transport was modified to redirect the remote command's
stderr stream into the message buffer, interleaved with any data
received over side band #2. Due to buffering between these two
different channels in the SSH channel mux itself the order of any
writes between the two cannot be ensured, but it tries to stay close.
The local fork transport was also modified to redirect the local
receive-pack's stderr into the message buffer, rather than going to
the invoking JVM's System.err. This gives applications a chance
to log the local error messages, rather than needing to redirect
their JVM's stderr before startup.
To keep things simple, the application has to wait for the entire
operation to complete before it can see the messages. This may
be a downside if the user is trying to debug a remote hook that is
blocking indefinitely, since the user would need to abort the connection
before they can inspect the message buffer in any sort of UI built
on top of JGit.
Change-Id: Ibc215f4569e63071da5b7e5c6674ce924ae39e11
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We now advertise the side-band-64k capability inside of ReceivePack,
allowing hooks to echo status messages down the side band channel
instead of over the optional stderr stream.
This change permits hooks running inside of an http:// based push
invocation to still message the end-user with more detailed errors
than the small per-command string in the status report.
Change-Id: I64f251ef2d13ab3fd0e1a319a4683725455e5244
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
To avoid scraping a non-progress message as though it were a progress
item for the progress monitor, use a more restrictive pattern to
watch the remote side's messages. These two regexps should match
any message produced by C Git since 42e18fbf5f94 ("more compact
progress display", Oct 2007), and which first appeared in Git 1.5.4.
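As an illustration of what restrictive patterns of this kind might look like, here is a sketch. The two regular expressions below are written for this note and are not copied from JGit; they assume the C Git formats "Title:  NN% (done/total)" with an optional ", ..." suffix and "Title: count" with an optional ", done." suffix.

```java
import java.util.regex.Pattern;

// Illustrative patterns only; the real ones may differ.
final class ProgressScrapeSketch {
    static final Pattern BOUNDED = Pattern.compile(
            "^([\\w ]+): +(\\d+)% \\( *(\\d+)/(\\d+)\\)(, .*)?$");
    static final Pattern UNBOUNDED = Pattern.compile(
            "^([\\w ]+): +(\\d+)(, done\\.)?$");

    /** True only when the message has the shape of a progress update. */
    static boolean isProgress(String msg) {
        return BOUNDED.matcher(msg).matches()
                || UNBOUNDED.matcher(msg).matches();
    }
}
```

Anything that fails both patterns, such as a hook's free-form error text, would be treated as an ordinary message.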
Change-Id: I57e34cf59d42c1dbcbd1a83dd6f499ce5e39d15d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When we pull task messages off the remote peer via sideband #2
prefix them with the string "remote: " to make it clear to the
user these are coming from the other system, and not from their
local client.
Change-Id: I02c5e67c6be67e30e40d3bc4be314d6640feb519
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This field is unsigned in the protocol, so treat it
as such when we report the channel number in errors.
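In Java a byte is signed, so the usual way to read it as unsigned is to mask with 0xff. A tiny sketch with a hypothetical helper name:

```java
// Hypothetical helper showing the unsigned conversion.
final class ChannelByteSketch {
    /** Maps a raw channel byte to its unsigned value, 0 through 255. */
    static int channelOf(byte raw) {
        return raw & 0xff;
    }

    static String describe(byte raw) {
        return "Invalid channel " + channelOf(raw);
    }
}
```

Without the mask, the byte 0xC8 would be reported as -56 rather than 200.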
Change-Id: I20a52809c7a756e9f66b3557a4300ae1e11f6d25
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Typically we refer to the raw InputStream (the stream without the
pkt-line headers on it) as rawIn, and the pkt-line header variant
as pckIn. Refactor our fields to reflect that. To ensure these
are actually the same underlying InputStream, we now create our own
PacketLineIn wrapper around the supplied raw InputStream. It's a
very low-cost object since it has only the 4 byte length buffer.
Instead of hardcoding the header length as 5, use the constant from
SideBandOutputStream. This makes it a bit more clear exactly what we
are consuming here.
Change-Id: Iebd05538042913536b88c3ddc3adc3a86a841cc5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of relying on our callers to wrap us up inside of a
BufferedOutputStream and using the proper block sizing, do the
buffering directly inside of SideBandOutputStream. This ensures
we don't get large write-throughs from BufferedOutputStream that
might overflow the configured packet size.
The constructor of SideBandOutputStream is also beefed up to check
its arguments and ensure they are within acceptable ranges for the
current side-band protocol.
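A simplified sketch of internal buffering with argument checks, assuming the side-band framing of a 4 digit hex length, one channel byte, and payload, with 65520 bytes as the largest packet. SideBandBufferSketch is a hypothetical class and omits the bulk write path a real implementation would want.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical miniature; not the actual SideBandOutputStream.
final class SideBandBufferSketch extends OutputStream {
    static final int HDR_SIZE = 5;    // 4 hex length digits + 1 channel byte
    static final int MAX_BUF = 65520; // largest side-band-64k packet

    private final OutputStream out;
    private final byte[] buffer;
    private int cnt = HDR_SIZE; // payload starts right after the header

    SideBandBufferSketch(int channel, int pktSize, OutputStream out) {
        if (channel <= 0 || channel > 255)
            throw new IllegalArgumentException("channel must be 1-255: " + channel);
        if (pktSize <= HDR_SIZE || pktSize > MAX_BUF)
            throw new IllegalArgumentException("packet size out of range: " + pktSize);
        this.out = out;
        this.buffer = new byte[pktSize];
        this.buffer[4] = (byte) channel;
    }

    @Override
    public void write(int b) throws IOException {
        if (cnt == buffer.length)
            writeBuffer(); // never emit more than one packet's worth
        buffer[cnt++] = (byte) b;
    }

    @Override
    public void flush() throws IOException {
        if (cnt > HDR_SIZE)
            writeBuffer();
        out.flush();
    }

    private void writeBuffer() throws IOException {
        byte[] len = String.format("%04x", cnt).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(len, 0, buffer, 0, 4);
        out.write(buffer, 0, cnt);
        cnt = HDR_SIZE;
    }

    static String demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            SideBandBufferSketch sb = new SideBandBufferSketch(1, 10, sink);
            sb.write("helloworld!".getBytes(StandardCharsets.US_ASCII));
            sb.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

With a 10 byte packet size, the 11 byte payload in demo() is split into two full packets and one short packet, each carrying its own header.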
Change-Id: Ic14567327d03c9e972f9734b8228178bc448867d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The tests were using a Locale.ROOT constant which was introduced
in Java 6. However, we need to retain Java 5 support.
Change-Id: I75c5648fcfc728a9aea2e839d2ad0320f5cf742f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Sasa Zivkov <sasa.zivkov@sap.com>
The initial contribution was handled through a CQ, and does not need
to be reported as an individual bug record in the project's IP log.
It's an odd corner case that the EMO IP team doesn't want to see,
even though it's technically a contribution written by at least
some non-committers.
The project.skipCommit variable can now be used to mask out any
particular change from the IP log. Currently within JGit we want
to mask only the initial commit, but others could be masked if the
need arises.
Change-Id: I598e08137ddc5913284471ee2aa545f4df685023
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The support for NLS relies on java.util API to load a standard
ResourceBundle and then uses java reflection API to inject localized
strings into public String fields of the corresponding instance
of TranslationBundle.
Locale setting is supported per thread to enable concurrent threads
to use different locales. This is useful when JGit runs in a server
context where (error) messages might need to differ per-request to
suit the user's preference.
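The mechanism can be sketched in a few lines. NlsSketch and DemoText are hypothetical names, and the in-memory ListResourceBundle below stands in for the bundle that real code would load from properties files through ResourceBundle.getBundle.

```java
import java.lang.reflect.Field;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical miniature of per-thread locale plus reflective injection.
final class NlsSketch {
    /** One public String field per message key. */
    static class DemoText {
        public String cannotRead;
    }

    private static final ThreadLocal<Locale> LOCALE =
            ThreadLocal.withInitial(() -> Locale.ENGLISH);

    /** Affects only the calling thread. */
    static void setLocale(Locale locale) {
        LOCALE.set(locale);
    }

    // Stand-in for loading a bundle from disk for the given locale.
    private static ResourceBundle lookup(Locale locale) {
        final boolean german = "de".equals(locale.getLanguage());
        return new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return new Object[][] { { "cannotRead",
                        german ? "Kann {0} nicht lesen" : "Cannot read {0}" } };
            }
        };
    }

    static DemoText getBundle() {
        ResourceBundle rb = lookup(LOCALE.get());
        DemoText text = new DemoText();
        try {
            for (Field f : DemoText.class.getFields())
                if (f.getType() == String.class)
                    f.set(text, rb.getString(f.getName()));
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return text;
    }
}
```

Callers then read messages as plain fields, for example NlsSketch.getBundle().cannotRead, which keeps message keys checked by the compiler.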
Change-Id: Ie0e63a0d7bb74eaad495dbe8248595d8a3a76883
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
When implementing branch read access, we need to prove that the
newly created reference(s) point to objects that the user can see.
There are two ways that an object is reachable:
1) It's reachable from a branch or change the user can see
2) It was uploaded as part of the pack file the user sent us
This change adds additional methods in ReceivePack that will allow a
server to check the above conditions, in order to ensure that a user
is not trying to create a reference that they cannot see, or that a
malicious user isn't attempting to forge the SHA-1 of an object that
they cannot see in order to base a change off of it.
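The two conditions can be sketched over a toy object graph. ReachabilitySketch is a hypothetical class using plain strings for object names; it is not the API added to ReceivePack, and it ignores the further question of what the uploaded objects themselves point at.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the two reachability rules.
final class ReachabilitySketch {
    /**
     * @param edges object name to the names it points at
     * @param visibleTips tips of branches or changes the user can see
     * @param uploaded objects contained in the pack the user sent
     */
    static boolean mayReference(String target, Map<String, List<String>> edges,
            Set<String> visibleTips, Set<String> uploaded) {
        if (uploaded.contains(target))
            return true; // rule 2: the user sent it themselves
        Deque<String> todo = new ArrayDeque<>(visibleTips);
        Set<String> seen = new HashSet<>(visibleTips);
        while (!todo.isEmpty()) {
            String id = todo.removeFirst();
            if (id.equals(target))
                return true; // rule 1: reachable from something visible
            List<String> next = edges.get(id);
            if (next == null)
                continue;
            for (String n : next)
                if (seen.add(n))
                    todo.addLast(n);
        }
        return false;
    }
}
```

An object that is reachable only from a hidden branch fails both rules, which is the forged SHA-1 case described above.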
Change-Id: Ieba75b4f0331e06a03417c37f4ae1ebca4fbee5a
If the readAdvertisedRefs() method throws an exception, it has already
closed the connection and wrapped the underlying cause inside of a
suitable TransportException object that it is throwing. We shouldn't
catch IOException and rethrow a wrapped copy here, because we'll double
wrap the exception thrown by readAdvertisedRefs. This may obscure the
root cause of the connection failure from the end-user.
Change-Id: I0ca61560f9888c666323dac8a5582aab25e897ff
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When a user of ReceivePack or UploadPack wants to control what refs
are sent to the client, for instance when some refs should be hidden
from some clients, this interface can be extended to provide
fine-grained control over what refs are sent to the client.
Change-Id: Ie6320b0f8922e1a5e1bad91c016bd476ea094366
The boolean field sentCommand is always true at this point, as it
was assigned just 5 lines above. So we always set the status of
the update command object to AWAITING_REPORT.
Simplify the logic by dropping the ?: operator. I assume this is
older code from an attempt to manage dry-run push support within
the native connection, but in fact dry-run support is done higher
up inside of PushProcess.
Change-Id: I450d491bbbb5afecdbf5444ab7169222e856a3bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Windows users by default have core.autocrlf set to true. JGit
does not recognize the flag and thus works as if it is not set. In order
to make JGit more compatible with msysgit we set the flag to false
in repositories that JGit creates.
Bug: 301775
Change-Id: I7ea462fe3516e5060b87aa1f7ed63689936830c2
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Doing a keep call with a length of 1 will copy the current entry just
like the previous add was doing, but it avoids doing any validation
on the entry. This is sane because the entry can be assumed to be
already valid, since it originates from the destination index.
Change-Id: I250d902fc98580444af1ba4b8fedceb654541451
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214/focus=128213
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A 0 file mode in a DirCacheEntry is not a valid mode. To C git
such a value indicates the record should not be present. We already
were catching this bad state and exceptioning out when writing tree
objects to disk, but we did not fail when writing the dircache back
to disk. This allowed JGit applications to create a dircache file
which C git would not like to read.
Instead of checking the mode during writes, we now check during
mutation. This allows application bugs to be detected sooner and
closer to the cause site. It also allows us to avoid checking most
of the records which we read in from disk, as we can assume these
are formatted correctly.
Some of our unit tests were not setting the FileMode on their test
entry, so they had to be updated to use REGULAR_FILE.
Change-Id: Ie412053c390b737c0ece57b8e063e4355ee32437
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214/focus=128213
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Adam W. Hawks <awhawks@writeme.com>
A dircache record must not use a path string like "/a" or "a//b"
as this results in a tree entry being written with a zero length
name component in the record. C git does not support an empty name,
and neither does any modern filesystem.
A record also must not have a stage outside of the standard 0-3
value range, as there are only 2 bits of space available in the
on-disk format of the record to store the stage information.
Any other values would be truncated into this space, storing a
different value than the caller expected.
If an application tries to create a DirCache record with either of
these wrong values, we abort with an IllegalArgumentException.
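A sketch of the two checks, using a hypothetical helper rather than the actual DirCacheEntry code. The checks for an empty path and a trailing slash are additions assumed here because they also produce a zero length name component.

```java
// Hypothetical validation helper, not the real implementation.
final class DirCacheEntryCheckSketch {
    static void check(String path, int stage) {
        if (path.isEmpty() || path.startsWith("/") || path.endsWith("/")
                || path.contains("//"))
            throw new IllegalArgumentException("Invalid path: " + path);
        // Only 2 bits are stored on disk, so 4 would silently become 0.
        if (stage < 0 || stage > 3)
            throw new IllegalArgumentException(
                    "Invalid stage " + stage + " for path " + path);
    }
}
```

Failing at creation time points the stack trace at the caller that supplied the bad value instead of at a later write.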
Change-Id: I699de149efdfccd85d8adde07d3efd080e3b49c2
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Adam W. Hawks <awhawks@writeme.com>
Rather than implementing the file reading logic ourselves, and
winding up leaking the FileInputStream's file descriptor until the
next GC, use IO.readFully(File) which wraps the read loop inside
of a try/finally to ensure the stream is closed before it exits.
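The pattern in question looks roughly like this. ReadFullySketch is a hypothetical stand-in written for this note and is not the body of IO.readFully(File).

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of closing the stream deterministically.
final class ReadFullySketch {
    static byte[] readFully(File path) throws IOException {
        FileInputStream in = new FileInputStream(path);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1)
                out.write(buf, 0, n);
            return out.toByteArray();
        } finally {
            in.close(); // runs on success and on failure alike
        }
    }

    static String demo() {
        try {
            File f = File.createTempFile("readfully", ".txt");
            try {
                FileOutputStream out = new FileOutputStream(f);
                try {
                    out.write("hello".getBytes(StandardCharsets.UTF_8));
                } finally {
                    out.close();
                }
                return new String(readFully(f), StandardCharsets.UTF_8);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```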
Change-Id: I85a3fe87d5eff88fa788962004aebe19d2e91bb4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Roland Grunberg <rgrunber@redhat.com>
Actually set the range of versions we are willing to accept for
each package we import, lest we import something in the future
that isn't compatible with our needs.
Change-Id: I25dbbb9eaabe852631b677e0c608792b3ed97532
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectWalk is invoking next() for each record we consider in a tree.
Rather than doing several method calls against the current parser,
and testing if we are at eof() at least twice per next() invocation,
do it only once and inline the logic to move the parser forward.
Change-Id: If5938f5d7b3ca24f500a184c9bd2ef193015414e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The supplied test case comes out of the example tree identified by
Robert de Wilde and Ilari on #git:
  $ git ls-tree -rt a54f1a85ebf6a7f53aa60a45a1be33f8b078fb7e
  040000 tree bfe058ad536cdb12e127cde63b01472c960ea105 A
  040000 tree 4b825dc642 A/A
  040000 tree 4b825dc642 A/B
  100644 blob abbbfafe3129f85747aba7bfac992af77134c607 B
In this tree, "B" was being skipped because "A/A" as an empty tree
was immediately followed by "A/B", also an empty tree, but the
ObjectWalk broke out too early and never visited "B".
Bug: 286653
Change-Id: I25bcb0bc99d0cbbbdd9c2bd625ad6a691a6d0335
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
During dispose() or reset() we are supposed to be restoring the
ObjectWalk instance back to the original pre-walk state, but we
failed to reset the tree parser. This can lead to confusing state
if the ObjectWalk was reused by the caller, as entries from the
old walk might be reported as part of the new walk.
Change-Id: I6237bae7bfd3794e8b9a92b4dd475559cc72e634
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of including "ObjectId[SHA-1]" in the message, just
use the formatted SHA-1 name of the object by calling name().
Change-Id: I0d1d0e8207f8a3f02188e60242e4e9bf7420e88f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We didn't skip the correct number of bytes when we skipped over an
unrecognized but optional dircache extension. We missed skipping
the 8 byte header that makes up the extension's name and length.
We also didn't include the skipped extension's payload as part of
our index checksum, resulting in a checksum failure when the index
was done reading. So ensure we always scan through a skipped
section and include it in the checksum computation.
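A sketch of skipping one extension while keeping the checksum in step, assuming the layout described above: a 4 byte name and a 4 byte big-endian length, followed by that many payload bytes. SkipExtensionSketch is a hypothetical class, not the DirCache reader.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration; sizes above 2 GiB are not handled.
final class SkipExtensionSketch {
    /** Consumes header and payload, feeding every byte to the digest. */
    static int skipExtension(InputStream in, MessageDigest md) throws IOException {
        byte[] hdr = new byte[8];
        readFully(in, hdr, 8);
        md.update(hdr, 0, 8);
        int size = ((hdr[4] & 0xff) << 24) | ((hdr[5] & 0xff) << 16)
                | ((hdr[6] & 0xff) << 8) | (hdr[7] & 0xff);
        byte[] buf = new byte[4096];
        for (int left = size; left > 0;) {
            int n = Math.min(left, buf.length);
            readFully(in, buf, n);
            md.update(buf, 0, n);
            left -= n;
        }
        return 8 + size;
    }

    private static void readFully(InputStream in, byte[] b, int len) throws IOException {
        int off = 0;
        while (off < len) {
            int r = in.read(b, off, len - off);
            if (r < 0)
                throw new EOFException();
            off += r;
        }
    }

    /** Skips a fake 'ZZZZ' extension and checks position and digest. */
    static boolean demo() {
        byte[] data = { 'Z', 'Z', 'Z', 'Z', 0, 0, 0, 3, 7, 8, 9, 'T' };
        try {
            MessageDigest seen = MessageDigest.getInstance("SHA-1");
            InputStream in = new ByteArrayInputStream(data);
            int consumed = skipExtension(in, seen);
            MessageDigest want = MessageDigest.getInstance("SHA-1");
            want.update(data, 0, 11);
            return consumed == 11 && in.read() == 'T'
                    && Arrays.equals(seen.digest(), want.digest());
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```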
Add a test case for a currently unsupported index extension, 'ZZZZ',
to verify we can still read the DirCache object even though we
don't know what 'ZZZZ' is supposed to mean.
Bug: 301287
Change-Id: I4bdde94576fffe826d0782483fd98cab1ea628fa
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the repository is empty, we have no HEAD branch, which means we
can't test to see if the HEAD is detached and should be advertised
as a .have line.
Change-Id: I6e85f836e7db057cede812d0d6c1aecbd6cbe6c5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The unsetSection method can be used to delete an entire configuration
block, such as a [branch ""] or [remote ""] section in a file.
Change-Id: I93390c9b2187eb1b0d51353518feaed83bed2aad
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Config was confusing the following two variables when writing the
file back to text format:
  [my]
    empty =
    enabled
When parsed, we say that my.empty has 1 value, null, and my.enabled
is an empty string value that in boolean context should be evaluated
as true.
Saving this configuration file back to text format was ignoring the
null value for my.empty, producing a completely different file than
what Config read:
  [my]
    empty
    enabled
Instead handle the writing differently to ensure the original format
is output. New test cases cover the expected behavior and return
values from accessor methods.
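Following the value mapping stated above (null for "empty =", empty string for a bare "enabled"), the writing rule could be sketched like this. ConfigWriteSketch is a hypothetical helper that ignores quoting and escaping; it is not the Config serializer.

```java
// Hypothetical formatter for a single variable line.
final class ConfigWriteSketch {
    static String formatEntry(String name, String value) {
        if (value == null)
            return "\t" + name + " =";  // keeps "empty =" distinct
        if (value.isEmpty())
            return "\t" + name;         // bare name, true in boolean context
        return "\t" + name + " = " + value;
    }
}
```

Keeping the two cases apart on output is what lets a file survive a read and write cycle unchanged.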
Change-Id: Id37379ce20cb27e3330923cf989444dd9f2bdd96
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We don't want to use the JRE cache when fetching content.
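Assuming the cache meant here is the one consulted by java.net.URLConnection, disabling it is a single call made before connecting. NoCacheFetchSketch and the example URL are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper; opening the connection object does not connect yet.
final class NoCacheFetchSketch {
    static URLConnection open(String url) {
        try {
            URLConnection c = URI.create(url).toURL().openConnection();
            c.setUseCaches(false); // always go to the origin server
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```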
Change-Id: Id76f3e618967c98ed4fbc47a1a2a9e77acbe41ab
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>