When MergeMessageFormatter was given a symbolic ref HEAD which points to
refs/heads/master (which is the case when merging a branch in EGit), it
would result in a merge message like the following:
Merge branch 'a' into HEAD
But it should print the following (as C Git does):
Merge branch 'a'
The solution is to use the leaf ref when checking for refs/heads/master.
Change-Id: I28ae5713b7e8123a0176fc6d7356e469900e7e97
There is no point in pushing all of the files within the edge
commits into the delta search when making a thin pack. This floods
the delta search window with objects that are unlikely to be useful
bases for the objects that will be written out, resulting in lower
data compression and higher transfer sizes.
Instead observe the path of a tree or blob that is being pushed
into the outgoing set, and use that path to locate up to WINDOW
ancestor versions from the edge commits. Push only those objects
into the edgeObjects set, reducing the number of objects seen by the
search window. This allows PackWriter to only look at ancestors
for the modified files, rather than all files in the project.
Limiting the search to WINDOW size makes sense, because more than
WINDOW edge objects will just skip through the window search as
none of them need to be delta compressed.
To further improve compression, sort edge objects into the front
of the window list, rather than randomly throughout. This puts
non-edges later in the window and gives them a better chance at
finding their base, since they search backwards through the window.
These changes make a significant difference in the thin-pack:
Before:
remote: Counting objects: 144190, done
remote: Finding sources: 100% (50275/50275)
remote: Getting sizes: 100% (101405/101405)
remote: Compressing objects: 100% (7587/7587)
Receiving objects: 100% (50275/50275), 24.67 MiB | 9.90 MiB/s, done.
Resolving deltas: 100% (40339/40339), completed with 2218 local objects.
real 0m30.267s
After:
remote: Counting objects: 61549, done
remote: Finding sources: 100% (50275/50275)
remote: Getting sizes: 100% (18862/18862)
remote: Compressing objects: 100% (7588/7588)
Receiving objects: 100% (50275/50275), 11.04 MiB | 3.51 MiB/s, done.
Resolving deltas: 100% (43160/43160), completed with 5014 local objects.
real 0m22.170s
The resulting pack is 13.63 MiB smaller, even though it contains the
same exact objects. 82,543 fewer objects had to have their sizes
looked up, which saved about 8s of server CPU time. 2,796 more
objects from the client were used as part of the base object set,
which contributed to the smaller transfer size.
Change-Id: Id01271950432c6960897495b09deab70e33993a9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Sigend-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Some of this code predates making ObjectId.equals() final
and fixing RevObject.equals() to match ObjectId.equals().
It was therefore more complex than it needs to be, because
it tried to work around RevObject's broken equals() rules
by converting to ObjectId in a different collection.
Also combine setUpWalker() and findObjectsToPack() methods,
these can be one method and the code is actually cleaner.
Change-Id: I0f4cf9997cd66d8b6e7f80873979ef1439e507fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The first 'Compressing objects' progress message is wrong, its
actually PackWriter looking up the sizes of each object in the
ObjectDatabase, so objects can be sorted correctly in the later
type-size sort that tries to take advantage of "Linus' Law" to
improve delta compression.
Rename the progress to say 'Getting sizes', which is an accurate
description of what it is doing.
Change-Id: Ida0a052ad2f6e994996189ca12959caab9e556a3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
When compressing objects, don't include the edges in the progress
meter. These cost almost no CPU time as they are simply pushed into
and popped out of the delta search window.
Change-Id: I7ea19f0263e463c65da34a7e92718c6db1d4a131
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Enhance the Git API to support cloning repositories.
Bug: 334763
Change-Id: Ibe1191498dceb9cbd1325aed85b4c403db19f41e
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
CGit push clients 1.6.6 and later support progress messages on the
side-band-64k channel during push, as this was introduced to handle
server side hook errors reported over smart HTTP.
Since JGit's delta resolution isn't always as fast as CGit's is,
a user may think the server has crashed and failed to report
status if the user pushed a lot of content and sees no feedback.
Exposing the progress monitor during the resolving deltas phase
will let the user know the server is still making forward progress.
This also helps BasePackPushConnection, which has a bounded timeout
on how long it will wait before assuming the remote server is dead.
Progress messages pushed down the side-band channel will reset the
read timer, helping the connection to stay alive and avoid timing
out before the remote side's work is complete.
Change-Id: I429c825e5a724d2f21c66f95526d9c49edcc6ca9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Non-commits are added to a pending queue, but duplicates are
removed by checking a flag. During a reset that flag must be
stripped off the old roots, otherwise the caller cannot reuse
the old roots after the reset.
RevWalk already does this correctly for commits, but ObjectWalk
failed to handle the non-commit case itself.
Change-Id: I99e1832bf204eac5a424fdb04f327792e8cded4a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This was a mistake that was missed due to historical reasons.
"The first /r/ tells our Apache to redirect the request to Gerrit.
The second /r/ tells Gerrit that the thing following is a Git SHA-1
and it should try to locate the changes that use that commit object.
Nothing I can easily do about it now. The second /r/ is historical
and comes from Gerrit 1.x days."
Change-Id: Iec2dbf5e077f29c0e0686cab11ef197ffc705012
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
After consulting with Christian Halstrick, it turned out that the
handling of rebase during pull was implemented incorrectly.
Change-Id: I40f03409e080cdfeceb21460150f5e02a016e7f4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
The new addIfAbsent() method combines get() with add(), but does
it in a single step so that the common case of get() returning null
for a new object can immediately insert the object into the map.
Change-Id: Ib599ab4de13ad67665ccfccf3ece52ba3222bcba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This reverts commit f5fe2dca3c.
I regret adding this feature to the public API. Caches aren't always
the best idea, as they require work to maintain. Here the cache is
redundant information that must be computed, and when it grows stale
must be removed. The redundant information takes up more disk space,
about the same size as the pack-*.idx files are. For the linux-2.6
repository, that's more than 40 MB for a 400 MB repository. So the
cache is a 10% increase in disk usage.
The entire point of this cache is to improve PackWriter performance,
and only PackWriter performance, and only when sending an initial
clone to a new client. There may be better ways to optimize this, and
until we have a solid solution, we shouldn't be using a separate cache
in JGit.
Rebase must honor the upstream configuration
branch.<branchname>.rebase
Change-Id: Ic94f263d3f47b630ad75bd5412cb4741bb1109ca
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This bug was hidden by an incomplete test: the current Rebase
implementation using the "git rebase -i" pattern does not work
correctly if fast-forwarding is involved. The reason for this is that
the log command does not return any commits in this case.
In addition, a check for already merged commits was introduced to
avoid spurious conflicts.
Change-Id: Ib9898fe0f982fa08e41f1dca9452c43de715fdb6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
IOException constructor taking Exception as parameter is
new for JDK 6.
Change-Id: Iec349fc7be9e9fbaeb53841894883c47a98a7b29
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Properly handle return value of java.io.File.createNewFile().
Change-Id: I3a74cc84cd126ca1a0eaccc77b2944d783ff0747
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
java.io.File.mkdir() and mkdirs() report failure as an exceptional
return value false. Fix the code which silently ignored this
exceptional return value.
Change-Id: I41244f4b9d66176e68e2c07e2329cf08492f8619
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
DirCacheCheckout.checkoutEntry() prepares the new file content using a
temporary file and then renames it to the file to be written during
checkout. For files to be updated checkout() created each file before
calling checkoutEntry(). Hence renaming the temporary file always
failed which was corrected in exception handling by retrying to rename
the file after deleting the just newly created file.
Change-Id: I219f864f2ed8d68051d7b5955d0659964fa27274
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Counting the objects needed for packing is the most expensive part of
an UploadPack request that has no uninteresting objects (otherwise
known as an initial clone). During this phase the PackWriter is
enumerating the entire set of objects in this repository, so they can
be sent to the client for their new clone.
Allow the ObjectReader (and therefore the underlying storage system)
to keep a cached list of all reachable objects from a small number of
points in the project's history. If one of those points is reached
during enumeration of the commit graph, most objects are obtained from
the cached list instead of direct traversal.
PackWriter uses the list by discarding the current object lists and
restarting a traversal from all refs but marking the object list name
as uninteresting. This allows PackWriter to enumerate all objects
that are more recent than the list creation, or that were on side
branches that the list does not include.
However, ObjectWalk tags all of the trees and commits within the list
commit as UNINTERESTING, which would normally cause PackWriter to
construct a thin pack that excludes these objects. To avoid that,
addObject() was refactored to allow this list-based enumeration to
always include an object, even if it has been tagged UNINTERESTING by
the ObjectWalk. This implies the list-based enumeration may only be
used for initial clones, where all objects are being sent.
The UNINTERESTING labeling occurs because StartGenerator always
enables the BoundaryGenerator if the walker is an ObjectWalk and a
commit was marked UNINTERESTING, even if RevSort.BOUNDARY was not
enabled. This is the default reasonable behavior for an ObjectWalk,
but isn't desired here in PackWriter with the list-based enumeration.
Rather than trying to change all of this behavior, PackWriter works
around it.
Because the list name commit's immediate files and trees were all
enumerated before the list enumeration itself starts (and are also
within the list itself) PackWriter runs the risk of adding the same
objects to its ObjectIdSubclassMap twice. Since this breaks the
internal map data structure (and also may cause the object to transmit
twice), PackWriter needs to use a new "added" RevFlag to track whether
or not an object has been put into the outgoing list yet.
Change-Id: Ie99ed4d969a6bb20cc2528ac6b8fb91043cee071
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It can be very handy for the implementation to resort the
object list based on data locality, improving prefetch in
the operating system's buffer cache.
Export the list to the implementation was a proper List,
and document that its mutable and OK to be modified. The
only caller in PackWriter is already OK with these rules.
Change-Id: I3f51cf4388898917b2be36670587a5aee902ff10
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When performing an initial clone of a repository there are no
uninteresting commits, and the resulting pack will be completely
self-contained. Therefore PackWriter does not need to honor C
Git standard TOPO ordering as described in JGit commit ba984ba2e0
("Fix checkReferencedIsReachable to use correct base list").
Switching to COMMIT_TIME_DESC when there are no uninteresting commits
allows the "Counting objects" phase to emit progress earlier, as the
RevWalk will not buffer the commit list. When TOPO is set the RevWalk
enumerates all commits first, before outputing any for PackWriter to
mark progress updates from.
Change-Id: If2b6a9903b536c7fb3c45f85d0a67ff6c6e66f22
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
These two methods are specific to the FileRepository implementation
and should not be exposed as part of the base Repository API. Now
that PackParser is generic and does not require these two methods
to import a pack stream into a repostiory, it is safe to remove
these and get them out of the public view.
Change-Id: I8990004d08074657f467849dabfdaa7e6674e69a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This problem surfaced since EGit Core ResetOperationTest is failing
since change I26806d21. JGit detected checkout conflict for untracked
files which never were tracked by the repository.
"git reset --hard" in c git also doesn't remove such untracked files.
Change-Id: Icc8e1c548ecf6ed48bd2979c81eeb6f578d347bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This information is generally useful - have followed the
accessor pattern of 'children' and 'parents'
Change-Id: I79b3ddd6f390152aa49e6b7a4c72a4aca0d6bc72
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The change Ie0350e032a97e0d09626d6143c5c692873a5f6a2 was not
done properly. The renamed file was not write protected, and
this broke a test.
Bug: 335388
Change-Id: I41b2235b7677bc5fddc70dda2a56cdd2cb53ce5d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
This is needed for implementing Fetch in EGit using the API.
Change-Id: Ibdcc95906ef0f93e3798ae20d4de353fb394f2e2
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
When DirCacheCheckout was checking out it was silently
overwriting untracked files. This is only ok if the
files are also ignored. Untracked and not ignored files
should not be overwritten. This fix adds checks for
this situation.
Because this change in the behaviour also broke tests
which expected that a checkout will overwrite untracked
files (PullCommandTest) these tests have to be modified
also.
Bug: 333093
Change-Id: I26806d2108ceb64c51abaa877e11b584bf527fc9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The Fetch command line was failing with NPE in case some options were omitted.
Additionally, it was setting a negative timeout when no timeout option was used
which caused HttpURLConnection to throw an IllegalArgumentException.
Change-Id: I2c67e2e1a03044284d183d73f0b82bb7ff79de95
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
There was one place where the parameter substitution wasn't done which caused
text fragments like "{0}" to appear in JGits output.
Bug: 325025
Change-Id: I89b881a8b5ef39f609437546310463ed4f6e1fb5
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
We cannot always rename read-only files on network shares,
so rename the temp file for a new loose object first, and
then set it as read-only.
Bug: 335388
Change-Id: Ie0350e032a97e0d09626d6143c5c692873a5f6a2
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
When debugging and enhancing DirCacheCheckout.processEntry() I found
that some of if-statements where hard to read/understand. This
change just splits some long if statements and adds more comments
explaining in which state we are. This change is only a preparation
for followup commits which introduce checks for untracked+ignored
files.
Change-Id: I670ff08310b72c858709b9e395f0aebb4b290a56
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
If HEAD exists but points to an not-existing branch the merge
command should silently create the missing branch and check
it out. This happens if you pull into freshly initalized repo.
HEAD points to refs/heads/master but refs/heads/master doesn't
exist. If you know merge a commit X into HEAD then the branch
master should be created (pointing to X) the working tree should
be updated to reflect X. That is achieved by checkout with one
tree only (HEAD is missing).
A test for this functionality will come the the next proposal
in PullCommandTest.
Change-Id: Id4a0d56d944e0acebd4b3157428bb50bd3fdd872
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>