Counting the objects needed for packing is the most expensive part of
an UploadPack request that has no uninteresting objects (otherwise
known as an initial clone). During this phase the PackWriter is
enumerating the entire set of objects in this repository, so they can
be sent to the client for their new clone.
Allow the ObjectReader (and therefore the underlying storage system)
to keep a cached list of all reachable objects from a small number of
points in the project's history. If one of those points is reached
during enumeration of the commit graph, most objects are obtained from
the cached list instead of direct traversal.
PackWriter uses the list by discarding the current object lists and
restarting a traversal from all refs but marking the object list name
as uninteresting. This allows PackWriter to enumerate all objects
that are more recent than the list creation, or that were on side
branches that the list does not include.
However, ObjectWalk tags all of the trees and commits within the list
commit as UNINTERESTING, which would normally cause PackWriter to
construct a thin pack that excludes these objects. To avoid that,
addObject() was refactored to allow this list-based enumeration to
always include an object, even if it has been tagged UNINTERESTING by
the ObjectWalk. This implies the list-based enumeration may only be
used for initial clones, where all objects are being sent.
The UNINTERESTING labeling occurs because StartGenerator always
enables the BoundaryGenerator if the walker is an ObjectWalk and a
commit was marked UNINTERESTING, even if RevSort.BOUNDARY was not
enabled. This is the default reasonable behavior for an ObjectWalk,
but isn't desired here in PackWriter with the list-based enumeration.
Rather than trying to change all of this behavior, PackWriter works
around it.
Because the list name commit's immediate files and trees were all
enumerated before the list enumeration itself starts (and are also
within the list itself), PackWriter runs the risk of adding the same
objects to its ObjectIdSubclassMap twice. Since this breaks the
internal map data structure (and also may cause the object to transmit
twice), PackWriter needs to use a new "added" RevFlag to track whether
or not an object has been put into the outgoing list yet.
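The effect of the "added" flag can be sketched without JGit. The class
and field names below are hypothetical stand-ins; the real code uses a
RevFlag on the object rather than a boolean field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an object that can carry a per-walk flag.
final class FlaggedObject {
    final String id;
    boolean added; // plays the role of the "added" RevFlag

    FlaggedObject(String id) {
        this.id = id;
    }
}

// Hypothetical stand-in for the outgoing object list.
final class OutgoingList {
    private final List<FlaggedObject> objects = new ArrayList<>();

    // Adds the object only if it has not been added before.
    boolean add(FlaggedObject o) {
        if (o.added)
            return false; // already queued, do not insert it twice
        o.added = true;
        objects.add(o);
        return true;
    }

    int size() {
        return objects.size();
    }
}
```

An object reached both by the normal traversal and by the cached list
is therefore queued exactly once.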
Change-Id: Ie99ed4d969a6bb20cc2528ac6b8fb91043cee071
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It can be very handy for the implementation to resort the
object list based on data locality, improving prefetch in
the operating system's buffer cache.
Export the list to the implementation as a proper List,
and document that it is mutable and OK to modify. The
only caller in PackWriter is already OK with these rules.
Change-Id: I3f51cf4388898917b2be36670587a5aee902ff10
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When performing an initial clone of a repository there are no
uninteresting commits, and the resulting pack will be completely
self-contained. Therefore PackWriter does not need to honor C
Git standard TOPO ordering as described in JGit commit ba984ba2e0
("Fix checkReferencedIsReachable to use correct base list").
Switching to COMMIT_TIME_DESC when there are no uninteresting commits
allows the "Counting objects" phase to emit progress earlier, as the
RevWalk will not buffer the commit list. When TOPO is set the RevWalk
enumerates all commits first, before outputting any for PackWriter to
mark progress updates from.
Change-Id: If2b6a9903b536c7fb3c45f85d0a67ff6c6e66f22
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
These two methods are specific to the FileRepository implementation
and should not be exposed as part of the base Repository API. Now
that PackParser is generic and does not require these two methods
to import a pack stream into a repository, it is safe to remove
these and get them out of the public view.
Change-Id: I8990004d08074657f467849dabfdaa7e6674e69a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This problem surfaced because EGit Core's ResetOperationTest has been
failing since change I26806d21. JGit detected checkout conflicts for
untracked files which were never tracked by the repository.
"git reset --hard" in C Git also doesn't remove such untracked files.
Change-Id: Icc8e1c548ecf6ed48bd2979c81eeb6f578d347bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This information is generally useful. The accessor follows the
pattern of 'children' and 'parents'.
Change-Id: I79b3ddd6f390152aa49e6b7a4c72a4aca0d6bc72
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The change Ie0350e032a97e0d09626d6143c5c692873a5f6a2 was not
done properly. The renamed file was not write protected, and
this broke a test.
Bug: 335388
Change-Id: I41b2235b7677bc5fddc70dda2a56cdd2cb53ce5d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
This is needed for implementing Fetch in EGit using the API.
Change-Id: Ibdcc95906ef0f93e3798ae20d4de353fb394f2e2
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
When DirCacheCheckout was checking out it was silently
overwriting untracked files. This is only ok if the
files are also ignored. Untracked and not ignored files
should not be overwritten. This fix adds checks for
this situation.
Because this change in the behaviour also broke tests
which expected that a checkout will overwrite untracked
files (PullCommandTest) these tests have to be modified
also.
Bug: 333093
Change-Id: I26806d2108ceb64c51abaa877e11b584bf527fc9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The Fetch command line was failing with NPE in case some options were omitted.
Additionally, it was setting a negative timeout when no timeout option was used
which caused HttpURLConnection to throw an IllegalArgumentException.
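A minimal sketch of the timeout guard, assuming a negative value means
"option not given". The class and method names are hypothetical and
are not the actual command line code.

```java
import java.net.URI;
import java.net.URLConnection;

final class TimeoutSketch {
    // Applies the timeout only when one was requested. Passing a negative
    // value to setConnectTimeout() would raise IllegalArgumentException.
    static void applyTimeout(URLConnection c, int seconds) {
        if (seconds < 0)
            return; // keep the connection's default
        c.setConnectTimeout(seconds * 1000);
        c.setReadTimeout(seconds * 1000);
    }

    public static void main(String[] args) throws Exception {
        // openConnection() only creates the object; nothing is sent yet.
        URLConnection c = URI.create("http://example.invalid/").toURL()
                .openConnection();
        applyTimeout(c, -1); // ignored
        applyTimeout(c, 5);
        System.out.println(c.getConnectTimeout());
    }
}
```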
Change-Id: I2c67e2e1a03044284d183d73f0b82bb7ff79de95
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
There was one place where the parameter substitution wasn't done, which caused
text fragments like "{0}" to appear in JGit's output.
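For illustration, the substitution step looks roughly like this with
java.text.MessageFormat. The message template and class name are
hypothetical, not the actual JGit message.

```java
import java.text.MessageFormat;

final class MessageSketch {
    // Hypothetical template; "{0}" is a MessageFormat placeholder.
    static final String CANNOT_OPEN = "Cannot open {0}";

    // Printing CANNOT_OPEN directly would show the literal "{0}";
    // it has to go through MessageFormat.format() first.
    static String cannotOpen(String name) {
        return MessageFormat.format(CANNOT_OPEN, name);
    }

    public static void main(String[] args) {
        System.out.println(cannotOpen("pack-1.idx"));
    }
}
```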
Bug: 325025
Change-Id: I89b881a8b5ef39f609437546310463ed4f6e1fb5
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
We cannot always rename read-only files on network shares,
so rename the temp file for a new loose object first, and
then set it as read-only.
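A sketch of the intended order using only java.io.File. The class and
method names are hypothetical, and error handling is reduced to the
minimum.

```java
import java.io.File;
import java.io.IOException;

final class LooseObjectSketch {
    // Moves the temp file into place first and write-protects the
    // renamed file afterwards, never the temp file.
    static void publish(File tmp, File dst) throws IOException {
        if (!tmp.renameTo(dst))
            throw new IOException("Cannot rename " + tmp + " to " + dst);
        dst.setReadOnly(); // best effort; the result is ignored here
    }
}
```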
Bug: 335388
Change-Id: Ie0350e032a97e0d09626d6143c5c692873a5f6a2
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
When debugging and enhancing DirCacheCheckout.processEntry() I found
that some of the if statements were hard to read/understand. This
change just splits some long if statements and adds more comments
explaining in which state we are. This change is only a preparation
for followup commits which introduce checks for untracked+ignored
files.
Change-Id: I670ff08310b72c858709b9e395f0aebb4b290a56
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
If HEAD exists but points to a non-existent branch, the merge
command should silently create the missing branch and check
it out. This happens if you pull into a freshly initialized repo:
HEAD points to refs/heads/master but refs/heads/master doesn't
exist. If you now merge a commit X into HEAD, then the branch
master should be created (pointing to X) and the working tree should
be updated to reflect X. That is achieved by a checkout with one
tree only (HEAD is missing).
A test for this functionality will come in the next proposal
in PullCommandTest.
Change-Id: Id4a0d56d944e0acebd4b3157428bb50bd3fdd872
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Add the possibility to disable SSL verification, just as I can do with git
using: git config --global http.sslVerify false
To enable the feature, configure
Window->Preferences->Team->Git->Configuration
and add a new key/value: http.sslVerify=false
When handling repos over https, JGit will then check that flag to see
if security is loose and the SSL verification should be ignored.
Implementing it as a key/value pair keeps it from being too obvious in
the GUI, so users must know what they are doing, and be aware of the
risks, when adding it.
Bug: 332487
Change-Id: I2a1b8098b5890bf512b8dbe07da41036c0fc9b72
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Reading a repository for millions of missing objects might be very
expensive to perform, especially if the repository is on a network
filesystem or some other costly RPC backend. A repository owner
might choose to accept some risk in return for better performance,
so allow disabling collision checking when receiving a pack.
Currently there is no way for an end-user to disable this feature.
This is intentional, because it is generally *NOT* a good idea to
skip this check. Instead this feature is supplied for storage
implementations to bypass the default checking logic, should they
have their own custom routines that are just as effective but can
be handled more efficiently.
Change-Id: I90c801bb40e86412209de0c43e294a28f6a767a5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If a pack uses OFS_DELTA only (e.g. it's an initial push to a
repository) and PackParser's implementation is broken such that the
delta chain that hangs below a particular object offset is empty, the
entryCount won't match the expected objectCount. Fail fast rather
than claiming the stream was parsed correctly.
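The check itself amounts to a comparison of two counters. A minimal
sketch, with a hypothetical class name and message text:

```java
import java.io.IOException;

final class PackCountCheck {
    // Fails fast when the number of entries actually parsed differs
    // from the object count the pack stream announced.
    static void verify(long entryCount, long objectCount) throws IOException {
        if (entryCount != objectCount)
            throw new IOException("Expected " + objectCount
                    + " objects but parsed " + entryCount);
    }
}
```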
The current implementation is not broken as described above. I broke
the code when I implemented my own new subclass of PackParser (which
incorrectly mucked with the object offset information), leading me to
discover this consistency check was missing.
Change-Id: I07540f0ae1144ef6f3bda48774dbdefb8876e1d3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
By moving the logic that parses a pack stream from the network (or
a bundle) into a type that can be constructed by an ObjectInserter,
repository implementations have a chance to inject their own logic
for storing object data received into the destination repository.
The API isn't completely generic yet; there are still quite a few
assumptions that the PackParser subclass is storing the data onto
the local filesystem as a single file. But it's about the simplest
split of IndexPack I can come up with without completely ripping
the code apart.
Change-Id: I5b167c9cc6d7a7c56d0197c62c0fd0036a83ec6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
RevFilter.include()'s documentation promises the RevCommit's
body is parsed before include is invoked. This wasn't always
true if the commit was parsed once, had its body discarded,
the RevWalk was reset() and started a new traversal.
Change-Id: Ie5cafde09ae870712b165d8a97a2c9daf90b1dbd
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This is needed for commands that use Transport internally.
Change-Id: I9417c85255b160723968c647063b9c7e05995ea4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Additionally, defined the NoteMap.getNote method which returns a Note
instance. These changes were necessary to enable implementation of
the NoteMerger interface (the merge method needs to instantiate a
Note) and to enable direct use of NoteMerger which expects instances
of Note class as its paramters. Implementing creation of code review
summary notes in Gerrit [1] will make use of both of these features.
[1] https://review.source.android.com/#change,20045
Change-Id: I627aefcedcd3434deecd63fa1d3e90e303b385ac
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Instead of offering only a high-level isModified() method, a new
method compareMetadata() is introduced which compares a working tree entry
and an index entry by looking at metadata only. Some use-cases
(e.g. computing the content-id in idBuffer()) may use this new method
instead of isModified().
Change-Id: I4de7501d159889fbac5ae6951f4fef8340461b47
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Change-Id: I4f05bdb0c58b039bd379341a6093f06a2cdfec6e
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The java.io.File.createNewFile() method for creating new empty files
reports failure by returning false. To ease proper checking of return
values provide a utility method wrapping createNewFile() throwing
IOException on failure.
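Such a wrapper can be as small as the following sketch. The class name
and the message text are assumptions; only File.createNewFile() is
taken from the description above.

```java
import java.io.File;
import java.io.IOException;

final class FileCreateSketch {
    // Turns the boolean failure result of File.createNewFile() into an
    // IOException so callers cannot silently ignore it.
    static void createNewFile(File f) throws IOException {
        if (!f.createNewFile())
            throw new IOException("Could not create new file " + f);
    }
}
```

Callers then only need a try/catch (or a throws clause) instead of
checking a return value at every call site.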
Change-Id: I42a3dc9d8ff70af62e84de396e6a740050afa896
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If remote branches are present they can not be added
to the RefMap from the local branches - the two RefMaps
have a different value of 'prefix' and consequently an
IllegalArgumentException is thrown.
Java's user.home is not the same as $HOME, so EGit did not see the
same global configuration as C Git does.
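One way to line the two up, assuming $HOME should win whenever it is
set, is sketched below. The class and method names are hypothetical
and this is not necessarily how the lookup is implemented.

```java
final class HomeSketch {
    // Prefers the HOME environment variable and falls back to the
    // value of Java's user.home property.
    static String resolveHome(String envHome, String userHomeProperty) {
        if (envHome != null && !envHome.isEmpty())
            return envHome;
        return userHomeProperty;
    }

    public static void main(String[] args) {
        System.out.println(resolveHome(System.getenv("HOME"),
                System.getProperty("user.home")));
    }
}
```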
Bug: 333269
Change-Id: Id54fc5292bf8c5a67177f9097ee692717a7df336
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
There is a space missing between <from> and "to" in the reflog
message produced by the CheckoutCommand, which is of the form
moving from <from> to <to>
Change-Id: I3dc57ab0a6589292db77a17d9029ee9499dfc725
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This is needed by an EGit change
http://egit.eclipse.org/r/#change,2232
Change-Id: I3d62f904b769fc2f1b7b8f0f24f7dd757fc9c379
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
A test in NLSTest was mixing the "old" and the "new" way of handling
concurrency. This change makes use of the java.util.concurrent facilities to
control concurrency and removes the code that was directly dealing with Thread
objects.
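The java.util.concurrent style referred to here can be sketched as
follows. This is an illustration with hypothetical names, not the
actual NLSTest code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrencySketch {
    // Runs two tasks that are released together by a barrier and
    // collects their results through Futures, without touching Thread.
    static String[] runTogether(Callable<String> a, Callable<String> b)
            throws Exception {
        CyclicBarrier barrier = new CyclicBarrier(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> fa = pool.submit(() -> {
                barrier.await(); // wait until both tasks are ready
                return a.call();
            });
            Future<String> fb = pool.submit(() -> {
                barrier.await();
                return b.call();
            });
            return new String[] { fa.get(), fb.get() };
        } finally {
            pool.shutdown();
        }
    }
}
```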
Change-Id: Ie7267776e988a48a5443f0f3fe4eb43e79eee4b1
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Instead just return success. In the case that no commit has been
cherry-picked or reverted, just return the old HEAD.
Bug: 333814
Change-Id: I67db2b77b52c43932436d22a8daa5a6556423484
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Merging Git notes branches has several differences from merging "normal"
branches. Although Git notes are initially stored as one flat tree, the
tree may fan out when the number of notes becomes too large for
efficient access. In this case the first two hex digits of the note
name will be used as a subdirectory name and the remaining 38 hex
digits as the file name under that directory. Similarly, when the
number of notes decreases a fanout tree may collapse back into a flat
tree. The Git notes merge algorithm must take into account possibly
different tree structures in different note branches and must properly
match them against each other.
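The 2/38 split described above can be sketched with plain string
handling. The class and method names are hypothetical, and only the
single fanout level mentioned here is covered.

```java
final class NoteFanoutSketch {
    // Maps a 40-digit hex note name to its path in a fanout tree:
    // two digits for the subdirectory, 38 for the file name.
    static String fanoutPath(String hexName) {
        if (hexName.length() != 40)
            throw new IllegalArgumentException(
                    "expected 40 hex digits: " + hexName);
        return hexName.substring(0, 2) + "/" + hexName.substring(2);
    }

    // Reverses fanoutPath() so that entries of a flat tree and a
    // fanout tree can be matched by note name.
    static String noteName(String path) {
        return path.replace("/", "");
    }
}
```

Normalizing both sides to the plain note name is one way to match
entries from differently shaped trees against each other.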
Any conflict on a Git note is, by default, resolved by concatenating
the two conflicting versions of the note. A delete-edit conflict is, by
default, resolved by keeping the edit version.
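The two default rules can be sketched on plain strings, with null
standing for a deleted note. The class name and the three-way
signature are assumptions of this sketch and do not reflect the real
merger interface.

```java
import java.util.Objects;

final class NoteMergeSketch {
    // Three-way merge of one note; null means "note does not exist".
    static String merge(String base, String ours, String theirs) {
        if (Objects.equals(ours, theirs))
            return ours; // both sides agree
        if (Objects.equals(base, ours))
            return theirs; // only theirs changed
        if (Objects.equals(base, theirs))
            return ours; // only ours changed
        if (ours == null)
            return theirs; // delete-edit conflict: keep the edit
        if (theirs == null)
            return ours; // edit-delete conflict: keep the edit
        return ours + theirs; // edit-edit conflict: concatenate
    }
}
```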
The note merge logic is pluggable and the caller may provide a custom
note merger that implements a different merging strategy.
Additionally, it is possible to have non-note entries inside a notes
tree. The merge algorithm must also take this fact into account and
will try to merge such non-note entries. However, in case of any merge
conflicts the merge operation will fail. The Git notes merge algorithm
does not currently attempt a content merge of non-note entries.
Thanks to Shawn Pearce for patiently answering my questions related to
this topic, giving hints and providing code snippets.
Change-Id: I3b2335c76c766fd7ea25752e54087f9b19d69c88
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Patterns containing only a trailing slash have to be treated
as "global" patterns. For example: "classes/" matches "classes"
as well as the "dir/classes" directory.
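A simplified sketch of that rule follows. It ignores wildcards and
negation, only decides whether a directory itself matches, and uses
hypothetical names rather than the actual ignore rule classes.

```java
final class TrailingSlashPattern {
    // Matches patterns such as "classes/" whose only slash is the
    // trailing one: any directory with that name, at any depth.
    static boolean matches(String pattern, String path, boolean isDirectory) {
        if (!pattern.endsWith("/")
                || pattern.indexOf('/') != pattern.length() - 1)
            throw new IllegalArgumentException(
                    "not a name-only directory pattern: " + pattern);
        if (!isDirectory)
            return false; // a trailing slash restricts to directories
        String name = pattern.substring(0, pattern.length() - 1);
        String last = path.substring(path.lastIndexOf('/') + 1);
        return name.equals(last);
    }
}
```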