Simple variant of addAll() that knows how to copy large segments
quickly using System.arraycopy() rather than looping through with
an Iterator object.
Change-Id: Icb50a8f87fe9180ea28b6920f473bb9e70c300f1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In case a file needs to be checked out (from THEIRS) during a merge
operation, it has to be checked if the worktree version of this file
is dirty. If this is true, merge shall fail.
Change-Id: I17c24845584700aad953c3d4f2bea77a0d665ec4
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
1. Perform an explicit check for untracked files.
2. Extract 'dirty checks' into separate methods
3. Clean up comments.
4. Tests: also check contents of files not affected by merge.
Change-Id: Ieb089668834d0a395c9ab192c555538917dfdc47
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
We test the -o option of the commit command very accurate by
writing tests for each line of a decision table. In order to
still be able to point new jgit users to the CommitAndLogCommandTest
to find out how to use log() and commit() I factored out these 1200
lines of very specific tests into their own class.
Change-Id: Icf7c517f790a8fa79c8afd9b7f4a2805cf79196e
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
If the remote peer replies with "ERR %s" instead of "ACK %s common" or
"NAK" during ancestor negotiation in the fetch-pack/upload-pack
protocol, treat that as an exception that aborts processing with the
error text as supplied by the remote system.
This matches behavior with "ERR %s" during the advertisements, which
is also a way for the remote to abort processing.
Change-Id: I2fe818e75c7f46156744ef4f703c40173cbc76d0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
An option to insert a change id into the commit message was added
to CommitCommand.
This change is a prerequisite for removing GitIndex from EGit.
Change-Id: Iff9e26a8aaf21d8224bfd6ce3c98821c077bcd82
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This enables applications to differentiate between explicitly set
configuration parameters and best effort attempts to guess these
parameters from the operating system.
Change-Id: I67cc4099238a40c6dca795e64f0155ced6008ef1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This allows callers to determine if a URI is supported, before
worrying about the local repository.
Suggested-by: Dariusz Luksza <dariusz@luksza.org>
Change-Id: Ifc76a4ba841f2e2e7354bd51306b87b3b9d7f6ab
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This better matches with PackFile and CachedPack's methods
that return the same value.
Change-Id: Idb9b7c71d2048dd2344a62c2cde20b4e34529ab7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
PackWriter incorrectly returned 0 from getObjectsNumber() when the
pack has not been written yet. This caused dumb transports like
amazon-s3:// and sftp:// to abort early and never write out a pack,
under the assumption that the pack had no objects.
Until the pack header is written to the output stream, compute the
current object count each time it is requested. Once the header is
started, use the object count from the stats object.
Change-Id: I041a2368ae0cfe6f649ec28658d41a6355933900
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
OwnerMap is about 200 ms faster than SubclassMap, more friendly to the
GC, and uses less storage: testing the "Counting objects" part of
PackWriter on 1886362 objects:
ObjectIdSubclassMap:
load factor 50%
table: 4194304 (wasted 2307942)
ms spent 36998 36009 34795 34703 34941 35070 34284 34511 34638 34256
ms avg 34800 (last 9 runs)
ObjectIdOwnerMap:
load factor 100%
table: 2097152 (wasted 210790)
directory: 1024
ms spent 36842 35112 34922 34703 34580 34782 34165 34662 34314 34140
ms avg 34597 (last 9 runs)
The major difference with OwnerMap is entries must extend from
ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own
private "next" field into each object. This allows the OwnerMap to use
a singly linked list for chaining collisions within a bucket. By
putting collisions in a linked list, we gain the entire table back for
the SHA-1 bits to index their own "private" slot.
Unfortunately this means that each object can appear in at most ONE
OwnerMap, as there is only one "next" field within the object instance
to thread into the map. For types that are very object map heavy like
RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this
is sufficient, these entity types are only put into one map by their
container. By introducing a new map type, we don't break existing
applications that might be trying to use ObjectIdSubclassMap to track
RevCommits they obtained from a RevWalk.
The OwnerMap uses less memory. Each object uses 1 reference more (so
we're up 1,886,362 references), but the table is 1/2 the size (2^20
rather than 2^21). The table itself wastes only 210,790 slots, rather
than 2,307,942. So OwnerMap is wasting 200k fewer references.
OwnerMap is more friendly to the GC, because it hardly ever generates
garbage. As the map reaches its 100% load factor target, it doubles in
size by allocating additional segment arrays of 2048 entries. (So the
first grow allocates 1 segment, second 2 segments, third 4 segments,
etc.) These segments are hooked into the pre-allocated directory of
1024 spaces. This permits the map to grow to 2 million objects before
the directory itself has to grow. By using segments of 2048 entries,
we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is
easier to satisfy then 2,307,942 bytes (for the 512k table that is
just an intermediate step in the SubclassMap). By reusing the
previously allocated segments (they are re-hashed in-place) we don't
release any memory during a table grow.
When the directory grows, it does so by discarding the old one and
using one that is 4x larger (so the directory goes to 4096 entries on
its first grow). A directory of size 4096 can handle up to 8 millon
objects. The second directory grow (16384) goes to 33 million objects.
At that point we're starting to really push the limits of the JVM
heap, but at least its many small arrays. Previously SubclassMap would
need a table of 67108864 entries to handle that object count, which
needs a single contiguous allocation of 256 MiB. That's hard to come
by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB
each. This is much easier to fit into a fragmented heap.
Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The new TransportProtocol type describes what a particular Transport
implementation wants in order to support a connection. 3rd parties
can now plug into the Transport.open() logic by implementing their
own TransportProtocol and Transport classes, and registering with
Transport.register().
GUI applications can help the user configure a connection by looking
at the supported fields of a particular TransportProtocol type, which
makes the GUI more dynamic and may better support new Transports.
Change-Id: Iafd8e3a6285261412aac6cba8e2c333f8b7b76a5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This change adds the --only/ -o option to the commit command.
Change-Id: I44352d56877f8204d985cb7a35a2e0faffb7d341
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Instead of resizing an ArrayList until all objects have been added,
append objects into a specialized List type that uses small arrays
of 1024 entries for each 1024 objects added.
For a large repository like linux-2.6, PackWriter will now allocate
1,758 smaller arrays to hold the object list, without creating any
garbage from the intermediate states due to list expansion.
1024 was chosen as the block size (and initial directory size) as this
is a reasonable balance for the PackWriter code. Each block uses
approximately 4096 bytes in a 32 bit JVM, as does the default top
level block directory. The top level directory doesn't expand until 1
million items have been added to the list, which for linux-2.6 won't
yet occur as the lists are per-object-type and are thus bounded to
about 1/3 of 1.8 million.
Change-Id: If9e4092eb502394c5d3d044b58cf49952772f6d6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The mapTree() routines have been deprecated for a long time, and their
sibilings for mapCommit() and mapTag() were already removed from the
main Repository API.
Remove mapTree(). Application callers who only need the tree's name
can use resolve("^{tree}") syntax to resolve to the tree ObjectId, or
fail if the input is not a tree.
Applications that want to read a tree should use DirCache or TreeWalk.
Change-Id: I85726413790fc87721271c482f6636f81baf8b82
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This interface has been deprecated for a while now.
Applications can use a TreeWalk instead.
Change-Id: I751d6e919e4b501c36fc36e5f816b8a8c5379cb9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This has been deprecated for some time now. Applications should
instead use DirCache within a TreeWalk.
Change-Id: I8099d93f07139c33fe09bdeef8d739782397da17
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This class has been deprecated for a long time now.
Time to remove it. Applications can use the newer
DirCacheCheckout class as a replacement.
Change-Id: Id66d29fcca5a7286b8f8838303d83f40898918d2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This interface has been deprecated for a long time now.
Time to remove it.
Change-Id: I29a938657e4637b2a9d0561940b38d70866613f7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Bug 317411 (Push does not update remote tracking branch) is assigned to
JGit. This test verifies that JGit does the right thing.
Bug: 317411
Change-Id: I8f632e3e6c8a4f16a1170b1dba92e8fd3d6267d0
When parsing a string such as "foo-gbed2" resolve() was assuming the
suffix was from git describe output. This lead to JGit trying to find
the completion for the object abbreviation "bed2", rather than using
the current value of the reference. If there was only one such object
in the repository, JGit might actually use the wrong value here, as
resolve() would return the completion of the abbreviation "bed2"
rather than the current value of the reference "refs/heads/foo-gbed2".
Move the parsing of git describe abbreviations out of the operator
portion of the resolve() method and into the simple portion that is
supposed to handle only object ids or reference names, and only do the
describe parsing after all other approaches have already failed to
provide a resolution.
Add new unit tests to verify the behavior is as expected by users.
Bug: 338839
Change-Id: I52054d7b89628700c730f9a4bd7743b16b9042a9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Applications may already have a Ref or ObjectId on hand that they want
the remote to be updated to. Instead of converting these into a
String and relying on the parsing rules of resolve(), allow the
application to supply the Ref or ObjectId directly.
Bug: 338839
Change-Id: If5865ac9eb069de1c8f224090b6020fc422f9f12
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Checkout of remote tracking branch failed when no local branch
existed. Also enhance RepositoryTestCase to enable checking index
state of another test repository.
Bug: 337695
Change-Id: Idf4c05bdf23b5161688818342b2bf9a45b49f479
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Previously, this method would not (always) work when a recursive path
such as "a/b" was passed into it.
Change-Id: I0752a1f5fc7fef32064d8f921b33187c0bdc7227
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
If the RevFilter doesn't actually require the commit body,
we shouldn't reparse it if the body was disposed. This happens
often inside of UploadPack during common ancestor negotation, the
RevWalk is reset and re-run over roughly the same commit space,
but the bodies are discarded because the commit message is not
relevant to the process.
Change-Id: I87b6b6a5fb269669867047698abf718d366bd002
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Because of change I28ae5713, the commit message lost the "into HEAD" and
caused the MergeCommandTest to fail. This change fixes it.
Bug: 336059
Change-Id: Ifac0138c6c6d66c40d7295b5e11ff3cd98bc9e0c
When MergeMessageFormatter was given a symbolic ref HEAD which points to
refs/heads/master (which is the case when merging a branch in EGit), it
would result in a merge message like the following:
Merge branch 'a' into HEAD
But it should print the following (as C Git does):
Merge branch 'a'
The solution is to use the leaf ref when checking for refs/heads/master.
Change-Id: I28ae5713b7e8123a0176fc6d7356e469900e7e97
There is no point in pushing all of the files within the edge
commits into the delta search when making a thin pack. This floods
the delta search window with objects that are unlikely to be useful
bases for the objects that will be written out, resulting in lower
data compression and higher transfer sizes.
Instead observe the path of a tree or blob that is being pushed
into the outgoing set, and use that path to locate up to WINDOW
ancestor versions from the edge commits. Push only those objects
into the edgeObjects set, reducing the number of objects seen by the
search window. This allows PackWriter to only look at ancestors
for the modified files, rather than all files in the project.
Limiting the search to WINDOW size makes sense, because more than
WINDOW edge objects will just skip through the window search as
none of them need to be delta compressed.
To further improve compression, sort edge objects into the front
of the window list, rather than randomly throughout. This puts
non-edges later in the window and gives them a better chance at
finding their base, since they search backwards through the window.
These changes make a significant difference in the thin-pack:
Before:
remote: Counting objects: 144190, done
remote: Finding sources: 100% (50275/50275)
remote: Getting sizes: 100% (101405/101405)
remote: Compressing objects: 100% (7587/7587)
Receiving objects: 100% (50275/50275), 24.67 MiB | 9.90 MiB/s, done.
Resolving deltas: 100% (40339/40339), completed with 2218 local objects.
real 0m30.267s
After:
remote: Counting objects: 61549, done
remote: Finding sources: 100% (50275/50275)
remote: Getting sizes: 100% (18862/18862)
remote: Compressing objects: 100% (7588/7588)
Receiving objects: 100% (50275/50275), 11.04 MiB | 3.51 MiB/s, done.
Resolving deltas: 100% (43160/43160), completed with 5014 local objects.
real 0m22.170s
The resulting pack is 13.63 MiB smaller, even though it contains the
same exact objects. 82,543 fewer objects had to have their sizes
looked up, which saved about 8s of server CPU time. 2,796 more
objects from the client were used as part of the base object set,
which contributed to the smaller transfer size.
Change-Id: Id01271950432c6960897495b09deab70e33993a9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Sigend-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Enhance the Git API to support cloning repositories.
Bug: 334763
Change-Id: Ibe1191498dceb9cbd1325aed85b4c403db19f41e
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
After consulting with Christian Halstrick, it turned out that the
handling of rebase during pull was implemented incorrectly.
Change-Id: I40f03409e080cdfeceb21460150f5e02a016e7f4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
The new addIfAbsent() method combines get() with add(), but does
it in a single step so that the common case of get() returning null
for a new object can immediately insert the object into the map.
Change-Id: Ib599ab4de13ad67665ccfccf3ece52ba3222bcba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rebase must honor the upstream configuration
branch.<branchname>.rebase
Change-Id: Ic94f263d3f47b630ad75bd5412cb4741bb1109ca
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This bug was hidden by an incomplete test: the current Rebase
implementation using the "git rebase -i" pattern does not work
correctly if fast-forwarding is involved. The reason for this is that
the log command does not return any commits in this case.
In addition, a check for already merged commits was introduced to
avoid spurious conflicts.
Change-Id: Ib9898fe0f982fa08e41f1dca9452c43de715fdb6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Properly handle return value of java.io.File.createNewFile().
Change-Id: I3a74cc84cd126ca1a0eaccc77b2944d783ff0747
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
java.io.File.mkdir() and mkdirs() report failure as an exceptional
return value false. Fix the code which silently ignored this
exceptional return value.
Change-Id: I41244f4b9d66176e68e2c07e2329cf08492f8619
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This problem surfaced since EGit Core ResetOperationTest is failing
since change I26806d21. JGit detected checkout conflict for untracked
files which never were tracked by the repository.
"git reset --hard" in c git also doesn't remove such untracked files.
Change-Id: Icc8e1c548ecf6ed48bd2979c81eeb6f578d347bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When DirCacheCheckout was checking out it was silently
overwriting untracked files. This is only ok if the
files are also ignored. Untracked and not ignored files
should not be overwritten. This fix adds checks for
this situation.
Because this change in the behaviour also broke tests
which expected that a checkout will overwrite untracked
files (PullCommandTest) these tests have to be modified
also.
Bug: 333093
Change-Id: I26806d2108ceb64c51abaa877e11b584bf527fc9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
By moving the logic that parses a pack stream from the network (or
a bundle) into a type that can be constructed by an ObjectInserter,
repository implementations have a chance to inject their own logic
for storing object data received into the destination repository.
The API isn't completely generic yet, there are still quite a few
assumptions that the PackParser subclass is storing the data onto
the local filesystem as a single file. But its about the simplest
split of IndexPack I can come up with without completely ripping
the code apart.
Change-Id: I5b167c9cc6d7a7c56d0197c62c0fd0036a83ec6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
RevFilter.include()'s documentation promises the RevCommit's
body is parsed before include is invoked. This wasn't always
true if the commit was parsed once, had its body discarded,
the RevWalk was reset() and started a new traversal.
Change-Id: Ie5cafde09ae870712b165d8a97a2c9daf90b1dbd
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>