It's not strictly required that we sort topologically in order to
produce a valid pack file. This was just something that Linus
thought would be a good idea to do. In practice it's not that
important for most repositories. Local file IO quickly falls
outside the access pattern where topological sorting provides
any benefit, so expending extra resources to enforce it when
we make a pack isn't really worth it.
I'm removing this sort in the pipeline because later changes
would support really efficient COMMIT_TIME_DESC sorting on a
non-file storage system, but TOPO sorting would be a bit uglier
to run, due to the in-memory delays it imposes.
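For illustration, a caller that wants streaming order can ask the
standard RevWalk API for commit-time sorting; a minimal sketch:

  import org.eclipse.jgit.lib.Repository;
  import org.eclipse.jgit.revwalk.RevCommit;
  import org.eclipse.jgit.revwalk.RevSort;
  import org.eclipse.jgit.revwalk.RevWalk;

  class WalkByTime {
    static void walk(Repository repo) throws Exception {
      RevWalk rw = new RevWalk(repo);
      // COMMIT_TIME_DESC can stream commits as they parse;
      // RevSort.TOPO must buffer part of the graph in memory first.
      rw.sort(RevSort.COMMIT_TIME_DESC);
      rw.markStart(rw.parseCommit(repo.resolve("HEAD")));
      for (RevCommit c : rw)
        System.out.println(c.name());
    }
  }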
Change-Id: I0121453461c2140c6917cb10c6df584eb47e5795
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If UploadPack invokes flush() on the output stream we pass it, it's
most likely the progress messages coming down the side band stream.
As pack generation can take a while, we want to push that down
at the client as early as we can, to keep the connection alive,
and to let the user know we are still working on their behalf.
Ensure we dump the temporary buffer whenever flush() is invoked;
otherwise the messages don't get sent in a timely fashion to the
user agent (in this case, git fetch).
We specifically don't implement flush() for ReceivePack right now,
as that protocol currently does not provide progress messages to
the user, but it does invoke flush several times, as the different
streams include '0000' type flush-pkts to denote various end points.
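A sketch of the idea as a hypothetical wrapper stream (names are
illustrative, not the actual UploadPack code):

  import java.io.IOException;
  import java.io.OutputStream;

  class SideBandBuffer extends OutputStream {
    private final OutputStream dst;
    private final byte[] buf = new byte[8192];
    private int cnt;

    SideBandBuffer(OutputStream dst) { this.dst = dst; }

    public void write(int b) throws IOException {
      if (cnt == buf.length) drain();
      buf[cnt++] = (byte) b;
    }

    private void drain() throws IOException {
      if (cnt > 0) { dst.write(buf, 0, cnt); cnt = 0; }
    }

    public void flush() throws IOException {
      drain();     // dump the temporary buffer first
      dst.flush(); // so progress reaches git fetch right away
    }
  }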
Change-Id: I797c90a2c562a416223dc0704785f61ac64e0220
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We want to eventually get rid of the mapCommit, mapTree APIs on
Repository and force everyone into the faster parsers that exist
in RevWalk. Rewriting resolve in terms of the faster parsers is
a good first step.
It actually simplifies the code a bit, as we no longer need to
keep track of an ObjectId and an Object (the parsed form), since
all RevObjects implicitly have their ObjectId readily available.
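The core of such a resolve looks roughly like this (a sketch, not
the full revision-syntax parser):

  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.lib.Repository;
  import org.eclipse.jgit.revwalk.RevObject;
  import org.eclipse.jgit.revwalk.RevWalk;

  class ResolveSketch {
    static RevObject parse(Repository repo, String revstr) throws Exception {
      ObjectId id = repo.resolve(revstr);
      if (id == null)
        return null;
      RevWalk rw = new RevWalk(repo);
      // A RevObject carries its ObjectId, so we no longer track
      // an (ObjectId, parsed Object) pair separately.
      return rw.parseAny(id);
    }
  }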
Change-Id: I4d234630195616e2c263e7e70038b55a1be4e7a3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of peeling things by hand in application level code, defer
the peeling logic into RevWalk's new peel utility method.
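Call sites then shrink to something like:

  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.revwalk.RevObject;
  import org.eclipse.jgit.revwalk.RevWalk;

  class PeelSketch {
    static RevObject peelTag(RevWalk walk, ObjectId id) throws Exception {
      // peel() walks through annotated tags until it reaches a
      // non-tag object (e.g. the commit behind a release tag).
      return walk.peel(walk.parseAny(id));
    }
  }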
Change-Id: Idabd10dc41502e782f6a2eeb56f09566b97775a8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We already have these objects parsed and cached in our object pool.
We shouldn't be looking them up via the legacy mapObject API, but
instead can use the pool and the faster parsing routines available
through the RevWalk that we extend.
While we are here fixing the code, let's also correct the tag date
sorting to accept tags that have no tagger identity, because they
were created before Git knew how to store that field.
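The date comparison then tolerates a null tagger identity; a sketch:

  import org.eclipse.jgit.lib.PersonIdent;
  import org.eclipse.jgit.revwalk.RevTag;

  class TagDates {
    // Tags created before Git stored a tagger sort as oldest.
    static int compare(RevTag a, RevTag b) {
      PersonIdent ta = a.getTaggerIdent();
      PersonIdent tb = b.getTaggerIdent();
      long wa = ta != null ? ta.getWhen().getTime() : 0L;
      long wb = tb != null ? tb.getWhen().getTime() : 0L;
      return wa < wb ? -1 : wa == wb ? 0 : 1;
    }
  }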
Change-Id: Id49a11f6d9c050c82b876e5e11058840c894b2d7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rather than relying on the helpers in RepositoryConfig to get
these objects, obtain them directly through the Config API.
It's only slightly more verbose, but permits us to work with the
base Config class, which is more flexible than the highly
file-specific RepositoryConfig.
This is what I really meant to do when I added the section parser
and caching support to Config; we just failed to finish updating
all of the call sites.
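For example, a call site only needs the base class (a sketch using
the UserConfig section parser):

  import org.eclipse.jgit.lib.Config;
  import org.eclipse.jgit.lib.UserConfig;

  class ConfigUse {
    static String committerName(Config cfg) {
      // The parsed section is cached inside the Config instance,
      // so repeated get() calls are cheap.
      UserConfig user = cfg.get(UserConfig.KEY);
      return user.getCommitterName();
    }
  }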
Change-Id: I481cb365aa00bfa8c21e5ad0cd367ddd9c6c0edd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We don't have to assume/depend on RepositoryConfig here; these
two tests can use higher-level versions of the class and still
perform the same test. That frees us up to make some changes
to the RepositoryConfig API.
Change-Id: Ia7b263c8c5efa3fae1054416d39c546867288132
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Under smart HTTP the biDirectionalPipe flag is false, and we return
immediately at this point in the negotiation process. There is
no need to flush the stream to the client; the request is over and
it will be automatically flushed out by the higher level servlet
that invoked us. Avoiding flush here allows us to only use flush
after a progress message is sent during pack generation.
Change-Id: Id0c8b7e95e3be6ca4c1b479e096bed6b0283b828
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This simplifies the PackIndex code, which is trying to quickly copy
an existing ObjectId into a MutableObjectId. Rather than having
the PackIndex violate the ObjectId's internals, expose a copy-from
function similar to the existing ones for copying from raw byte
arrays or hex formatted strings.
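Usage becomes a one-liner:

  import org.eclipse.jgit.lib.AnyObjectId;
  import org.eclipse.jgit.lib.MutableObjectId;

  class CopySketch {
    static void copy(MutableObjectId dst, AnyObjectId src) {
      // Mirrors the existing fromRaw()/fromString() helpers.
      dst.fromObjectId(src);
    }
  }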
Change-Id: I142635cbece54af2ab83c58477961ce925dc8255
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Storage systems can use these implementations to compare a passed
AnyObjectId with a stored representation of an ObjectId in the
canonical network byte order format. This can be useful to do a
binary search, or just linear scan, over an encoded storage file.
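For example, a binary search over a table of raw 20-byte ids stored
back to back in network byte order (a sketch):

  import org.eclipse.jgit.lib.AnyObjectId;
  import org.eclipse.jgit.lib.Constants;

  class IdTable {
    static int find(AnyObjectId id, byte[] table, int cnt) {
      int lo = 0, hi = cnt;
      while (lo < hi) {
        int mid = (lo + hi) >>> 1;
        int cmp = id.compareTo(table, mid * Constants.OBJECT_ID_LENGTH);
        if (cmp < 0)
          hi = mid;
        else if (cmp == 0)
          return mid;
        else
          lo = mid + 1;
      }
      return -1;
    }
  }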
Change-Id: I8c72993c4f4c6e98d599ac2c9867453752f25fd2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
An implementation might prefer to use the RefList type here, and
RefList is part of our public API. Expose the constructor so callers
who have a RefList can take advantage of the existing sorting.
Change-Id: I545867f85aa2c479d2d610024ebbe318144709c8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When we finally move RefDirectory to the new storage.file package,
its associated RefDirectoryUpdate will need visibility to this
constructor in order to initialize itself. This is true of any
other repository implementation, so make it protected rather than
package level visible.
Change-Id: If838aec9baeb80ee2f12dcbca717657c725a9242
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Repository implementations outside of .lib need to be able to
create these events and deliver them to listening application code.
Expose and document the constructors so that they are visible when
we move FileRepository into storage.file.FileRepository.
Change-Id: I7fb6e8f4f5fdab683c5ebb5267673aa6d5b560bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A Git reference name must never end with ".lock", as it would
confuse any existing C client that tries to obtain a clone of the
repository over the network. Even if the repository isn't on a
local filesystem, it still should ban that suffix.
Because I plan to move LockFile to storage.file and make it a private
implementation detail of the local file system storage model,
we can't rely on its package level SUFFIX field here. Making it
public probably won't work long-term either, as I also plan to
pull storage.file into its own separate project that depends on
the core library.
So, just inline the constant here. It's as forbidden as ":" is.
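The check itself is trivial; a sketch of the validation:

  class RefNameCheck {
    // ".lock" is inlined here, just as ":" and other bad
    // sequences are banned directly by the name validator.
    static boolean isValid(String name) {
      if (name.endsWith(".lock"))
        return false;
      if (name.indexOf(':') >= 0)
        return false;
      // ... remaining rules elided ...
      return true;
    }
  }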
Change-Id: If85076861baeacc183b82696375a13e935ba8836
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some sources had DOS line endings. Also configure all projects to use
Unix line endings and UTF-8 text encoding.
Change-Id: I8fc9a1dbb219ffa91d1b3011b3b11b7e48e74ca7
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Modifications to various classes in order to allow serialization
for use of JGit in Hudson's git plugin.
Change-Id: If088717d3da7483538c00a927e433a74085ae9e6
If a repository is "bare", it currently still returns a working directory.
This conflicts with the specification of "bare"-ness.
Bug: 311902
Change-Id: Ib54b31ddc80b9032e6e7bf013948bb83e12cfd88
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Currently, there is no way to read the content
of the Git configuration in a way that would
allow listing all configured values generically.
This change extends the Config class so that
it can return the list of sections and the
list of names for any given section or
subsection.
This is required in order to implement proper
configuration handling in EGit (show all the
content of a given configuration, similar to
"git config -l"); see the sketch below.
Change-Id: Idd4bc47be18ed0e36b11be8c23c9c707159dc830
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Created wrong tags for 0.8.3, hence creating another version.
Change-Id: I4e00bbcffe1cf872e2d7e3f3d88d068701fb5330
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Refactor and extend the internals so users can override and
intervene during formatting, e.g. to colorize output.
Change-Id: Ia1cf40cfd4a5ed7dfb6503f8dfc617237bee0659
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
This is required to enable accessing legal info for
org.eclipse.jgit from
Help > About > Installation Details > Plugins
Change-Id: I73f40dd2018112cd23102954d7647ecdbbbf0d89
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Building jgit with pde.build was broken without resources.
Bug: 315823
Change-Id: I45be510ada068b3ffab0feb30ec60f2c96a5ca32
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
On Windows, FS_Win32_Cygwin has been used if a Cygwin Git installation
is present in the PATH. Assuming that the user works with the Cygwin
Git installation may result in unnecessary overhead if they actually
do not.
Applications built on top of jgit may have more knowledge about the
Git client actually in use (Cygwin or not) and hence should be able
to configure which FS to use accordingly.
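A sketch of what that configuration could look like, assuming a
detect() overload that accepts the answer as a parameter:

  import org.eclipse.jgit.util.FS;

  class FsChoice {
    // The application already knows plain Win32 Git is in use,
    // so skip probing the PATH for Cygwin entirely.
    static FS pick() {
      return FS.detect(Boolean.FALSE);
    }
  }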
Change-Id: Ifc4278078b298781d55cf5421e9647a21fa5db24
A Change-Id helps tools like Gerrit Code Review to keep different
versions of a patch together. The Change-Id is computed as a SHA-1
hash of some of the same basic information as a commit id; it is
created on the first commit intended to solve a particular problem
and then reused for updated solutions.
Change-Id: I04334f84e76e83a4185283cb72ea0308b1cb4182
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
The J2SE NIO APIs require that FileChannel close the underlying file
descriptor if a thread is interrupted while it is inside of a read or
write operation on that channel. This is insane, because it means we
cannot share the file descriptor between threads. If a thread is in
the middle of the FileChannel variant of IO.readFully() and it
receives an interrupt, the pack will be automatically closed on us.
This causes the other threads trying to use that same FileChannel to
receive IOExceptions, which leads to the pack getting marked as
invalid. Once the pack is marked invalid, JGit loses access to its
entire contents and starts to report MissingObjectExceptions.
Because PackWriter must ensure that the chosen pack file stays
available until the current object's data is fully copied to the
output, JGit cannot simply reopen the pack when it's automatically
closed due to an interrupt being sent at the wrong time. The pack may
have been deleted by a concurrent `git gc` process, and that open file
descriptor might be the last reference to the inode on disk. Once it's
closed, the PackWriter loses access to that object representation, and
it cannot complete sending the object to the client.
Fortunately, RandomAccessFile's readFully method does not have this
problem. Interrupts during readFully() are ignored. However, it
requires us to first seek to the offset we need to read, then issue
the read call. This requires locking around the file descriptor to
prevent concurrent threads from moving the pointer before the read.
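The resulting read path is, in sketch form:

  import java.io.IOException;
  import java.io.RandomAccessFile;

  class PackRead {
    // seek() and readFully() must be atomic with respect to other
    // threads, since they share one file pointer. readFully() does
    // not auto-close the descriptor when a thread is interrupted.
    static void read(RandomAccessFile fd, long pos, byte[] dst, int off,
        int cnt) throws IOException {
      synchronized (fd) {
        fd.seek(pos);
        fd.readFully(dst, off, cnt);
      }
    }
  }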
This reduces the concurrency level, as now only one window can be
paged in at a time from each pack. However, the WindowCache should
already be holding most of the pages required to handle the working
set for a process, and its own internal locking was already limiting
us on the number of concurrent loads possible. Provided that most
concurrent accesses are getting hits in the WindowCache, or are for
different repositories on the same server, we shouldn't see a major
performance hit due to the more serialized loading.
I would have preferred to use a pool of RandomAccessFiles for each
pack, with threads borrowing an instance dedicated to that thread
whenever they needed to page in a window. This would permit much
higher levels of concurrency by using multiple file descriptors (and
file pointers) for each pack. However the code became too complex to
develop in any reasonable period of time, so I've chosen to retrofit
the existing code with more serialization instead.
Bug: 308945
Change-Id: I2e6e11c6e5a105e5aef68871b66200fd725134c9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Merges the current head with one other commit.
In this first iteration the merge command supports
only the fast-forward and already-up-to-date cases.
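Typical usage through the new command (a sketch):

  import org.eclipse.jgit.api.Git;
  import org.eclipse.jgit.api.MergeResult;
  import org.eclipse.jgit.lib.Repository;

  class MergeUse {
    static void mergeSide(Repository repo) throws Exception {
      Git git = new Git(repo);
      MergeResult result = git.merge()
          .include(repo.getRef("refs/heads/side"))
          .call();
      // e.g. FAST_FORWARD or ALREADY_UP_TO_DATE in this iteration
      System.out.println(result.getMergeStatus());
    }
  }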
Change-Id: I0db480f061e01b343570cf7da02cac13a0cbdf8f
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The CommitCommand should take care to create a merge commit if the file
$GIT_DIR/MERGE_HEAD exists. It should then read the parents for the merge
commit out of this file. It should also take care, when committing
a merge and no commit message was specified, to read the message from
$GIT_DIR/MERGE_MSG.
Finally the CommitCommand should remove these files if the commit
succeeded.
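A sketch of reading the extra parents (the helper name here is
illustrative, not the actual CommitCommand code):

  import java.io.BufferedReader;
  import java.io.File;
  import java.io.FileReader;
  import java.util.ArrayList;
  import java.util.List;
  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.lib.Repository;

  class MergeState {
    // MERGE_HEAD holds one commit id per line; each becomes an
    // additional parent of the merge commit being created.
    static List<ObjectId> readMergeHeads(Repository repo) throws Exception {
      File f = new File(repo.getDirectory(), "MERGE_HEAD");
      List<ObjectId> heads = new ArrayList<ObjectId>();
      if (!f.exists())
        return heads;
      BufferedReader r = new BufferedReader(new FileReader(f));
      try {
        String line;
        while ((line = r.readLine()) != null)
          heads.add(ObjectId.fromString(line.trim()));
      } finally {
        r.close();
      }
      return heads;
    }
  }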
Change-Id: I4e292115085099d5b86546d2021680cb1454266c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
The strings are externalized into the root resource bundles.
The resource bundles are stored under the new "resources" source
folder to get proper maven build.
Strings from tests are, in general, not externalized. Strings were
externalized only in cases where it was necessary to make the tests
pass; this was typically necessary where e.getMessage() was used in
an assert and the exception message changed slightly due to reuse
of the externalized strings.
Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
In the close() method of SshFetchConnection and SshPushConnection,
errorThread.join() can wait forever if JSch does not close the
channel's error stream. Join with a timeout, and interrupt the
copy thread if it is blocked on data that will never arrive.
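In sketch form:

  class CloseHelper {
    static void drainErrors(Thread errorThread, long timeoutMillis)
        throws InterruptedException {
      errorThread.join(timeoutMillis);
      if (errorThread.isAlive())
        errorThread.interrupt(); // blocked on data that never arrives
    }
  }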
Bug: 312863
Change-Id: I763081267653153eed9cd7763a015059338c2df8
Reported-by: Dmitry Neverov <dmitry.neverov@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If we get an interrupt during an IO operation (src.read or dst.write)
caused by the flush() method incrementing the flush counter, ensure
we restart the proper section of code. Just ignore the interrupt
and continue running.
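The restart logic amounts to (a sketch):

  import java.io.IOException;
  import java.io.InputStream;
  import java.io.InterruptedIOException;

  class CopyLoop {
    static int readRestarting(InputStream src, byte[] buf)
        throws IOException {
      for (;;) {
        try {
          return src.read(buf);
        } catch (InterruptedIOException wakeUp) {
          // interrupted by flush(); clear the flag and retry
          Thread.interrupted();
        }
      }
    }
  }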
Bug: 313082
Change-Id: Ib2b37901af8141289bbac9807cacf42b4e2461bd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The number of bytes to copy was truncated to an int, but the
pack's copyToStream() method expected to be passed a long here.
Pass through the long so we don't truncate a giant object.
Change-Id: I0786ad60a3a33f84d8746efe51f68d64e127c332
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rather than holding onto the PackedObjectLoader, only hold the
PackFile and the object offset. During a reuse copy that is all
we should need to complete a reuse, and the other parts of the
PackedObjectLoader just waste memory.
This change reduces the per-object memory usage of a PackWriter by
32 bytes on a 32 bit JVM using only OFS_DELTA formatted objects.
The savings are even larger (by another 20 bytes) for REF_DELTAs.
This is close to a 50% reduction in the size of ObjectToPack,
making it rather worthwhile to do.
Beyond the memory reduction, this change will help to make future
refactoring work easier. We need to redo the API used to support
copying data, and disconnecting it from the PackedObjectLoader is
a good first step.
Change-Id: I24ba4e621e101f14e79a16463aec5379f447aa9b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rather than keep track of both the position of the object, and the
position of its data, just keep track of the number of bytes used
by the object's header in the pack. This shaves 4 bytes out of the
size of the PackedObjectLoader instances.
We also can defer the addition instruction to the materialize()
operation, avoiding it entirely if the caller never actually uses
the loader. This may be relevant for PackWriter invocations,
where only 1 loader gets chosen for a given object, even though
the object may appear on disk in more than one pack file.
Error reporting is now simplified, as we can rely on the object
offset rather than its data offset. This is the value displayed
by pack debugging tools like `git verify-pack -v`, so it's better
to use that in our own errors.
Because nobody needs getDataOffset() now, we can drop that from
the public API.
Change-Id: Ic639c0d5a722315f4f5c8ffda6e26643d90e5f42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Since we use this code twice, pull it into a private method. Let
the compiler/JIT worry about whether or not this logic should be
inlined into the call sites.
Change-Id: Ia44fb01e0328485bcdfd7af96835d62b227a0fb1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Originally when I wrote this code I had hoped to use OffsetCache
to also implement the UnpackedObjectCache. But it turns out they
need rather different code, and it just wasn't worth trying to
reuse the OffsetCache base class.
Before doing any major refactoring or code cleanups here, squash the
two classes together and delete OffsetCache. As WindowCache is our
only subclass, this is pretty simple to do. We also get a minor
code reduction due to less duplication between the two classes,
and the JIT should be able to do a better job of optimization here
as we can define types up front rather than relying on generics
that erase back to java.lang.Object.
Change-Id: Icac8bda01260e405899efabfdd274928e98f3521
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When we read the object header we copy 20 bytes from the pack data,
then start parsing out the type and the inflated size. For most
objects, this is only going to require 3 bytes, which is sufficient
to represent objects with inflated sizes of up to 2^18. The local
buffer however still has 17 bytes remaining in it, and that can be
used to satisfy the OBJ_OFS_DELTA header.
We shouldn't need to worry about walking off the end of the buffer
here, because delta offsets cannot be larger than 64 bits, and that
requires only 9 bytes in the OFS_DELTA encoding.
Assuming worst-case scenarios of 9 bytes for the OFS_DELTA encoding,
the pack file itself must be approaching 2^64 bytes, an infeasible
size to store on any current technology. However, even if this
were the case we still have 11 bytes for the type/size header.
In that encoding we can represent an object as large as 2^74 bytes,
which is also an infeasible size to process in JGit.
So drop the second read here.
The data offsets we pass into the ObjectLoaders being constructed
need to be computed individually now. This saves a local variable,
but pushes the addition operation into each branch of the switch.
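A sketch of the decoding, to show the byte counts involved (not
JGit's exact code):

  class HeaderParse {
    static void parse(byte[] hdr) {
      int p = 0;
      int c = hdr[p++] & 0xff;
      int type = (c >> 4) & 7;
      long size = c & 15;
      int shift = 4;
      while ((c & 0x80) != 0) { // <= 3 bytes for most objects
        c = hdr[p++] & 0xff;
        size += (long) (c & 0x7f) << shift;
        shift += 7;
      }
      if (type == 6 /* OBJ_OFS_DELTA */) {
        c = hdr[p++] & 0xff;
        long ofs = c & 127;
        while ((c & 128) != 0) { // <= 9 bytes for a 64 bit offset
          c = hdr[p++] & 0xff;
          ofs = ((ofs + 1) << 7) + (c & 127);
        }
        // base lies at (object offset - ofs); data begins at p
      }
    }
  }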
Change-Id: I6cf64697a9878db87bbf31c7636c03392b47a062
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
JSch may hang or abort with the timeout if JGit connects before
it has obtained the streams. Instead defer the connect() call until
after the streams have been configured.
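That is, in sketch form:

  import java.io.InputStream;
  import java.io.OutputStream;
  import com.jcraft.jsch.ChannelExec;

  class SshStart {
    static InputStream start(ChannelExec channel, int timeoutMillis)
        throws Exception {
      // Obtain the streams first; connect() starts the command and
      // output may begin flowing immediately.
      InputStream in = channel.getInputStream();
      OutputStream out = channel.getOutputStream();
      InputStream err = channel.getErrStream();
      channel.connect(timeoutMillis);
      return in;
    }
  }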
Bug: 312383
Change-Id: I7c3a687ba4cb69a41a85e2b60d381d42b9090e3f
Reported-by: Dmitry Neverov <dmitry.neverov@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If a flush() gets delivered at the same time that we are blocking
while writing to an interruptible stream, the copy thread will
abort, assuming it is a stream error. Instead ignore the interrupt,
and retry the write.
Change-Id: Icbf62d1b8abe0fabbb532dbee088020eecf4c6c2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It is possible to miss a flush() invocation in StreamCopyThread.
In this case some data will not be sent to the remote host and we
will wait forever (or until timeout) in src.read().
Use a counter to keep track of the flush requests.
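The counter idea in sketch form:

  import java.util.concurrent.atomic.AtomicInteger;

  class FlushRequests {
    private final AtomicInteger count = new AtomicInteger();

    // Writer side: record the request, then wake the copier.
    void request(Thread copier) {
      count.incrementAndGet();
      copier.interrupt();
    }

    // Copier side: consume all pending requests before blocking
    // in src.read(), so none can be missed.
    boolean consume() {
      int n = count.get();
      if (n > 0) {
        count.addAndGet(-n);
        return true;
      }
      return false;
    }
  }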
Change-Id: Ia818be9b109a1674d9e2a9c78e125ab248cfb75b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Added a new package org.eclipse.jgit.api and a builder-style API for
jgit. Also added a first implementation for two git commands: Commit
and Log.
This API is intended to be used by external components when
functionalities of the standard git commands are required. It will also
help to ease writing JGit tests.
For internal usage this API may often not be optimal because the git
commands are doing much more than required or they expect parameters
of an inappropriate type.
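Typical usage:

  import org.eclipse.jgit.api.Git;
  import org.eclipse.jgit.lib.Repository;
  import org.eclipse.jgit.revwalk.RevCommit;

  class ApiUse {
    static void commitAndLog(Repository repo) throws Exception {
      Git git = new Git(repo);
      RevCommit head = git.commit().setMessage("initial commit").call();
      System.out.println(head.name());
      for (RevCommit c : git.log().call())
        System.out.println(c.getFullMessage());
    }
  }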
Change-Id: I71ac4839ab9d2f848307eba9252090c586b4146b
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>