This avoids executing mergeAlgorithm.merge on binary data, which is
unlikely to be useful.
Arguably, binary data should not make it to
ResolveMerger#contentMerge, but this approach has the following
advantages:
* binary detection is exact, since it doesn't only look at the start
of the blob.
* it is cheap, as we have to iterate over the bytes anyway to find
'\n'.
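A minimal sketch of that combined scan (hypothetical helper, not the actual
ResolveMerger code):

  // Hypothetical sketch: flag a blob as binary while scanning it for line
  // breaks. Every byte is examined, so the detection is exact, and the scan
  // for '\n' has to happen anyway to split the blob into lines.
  static boolean isBinaryWhileSplittingLines(byte[] raw) {
    for (int i = 0; i < raw.length; i++) {
      byte b = raw[i];
      if (b == 0) {
        return true; // NUL byte: treat the whole blob as binary
      }
      if (b == '\n') {
        // record a line boundary at i (omitted in this sketch)
      }
    }
    return false;
  }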
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I424295df1dc60a719859d9d7c599067891b15792
The ObjectWalker in PackWriterBitmapWalker needs to be reset whenever it
starts a new walk. Move this responsibility from the caller to the
method when the new walk starts.
Change-Id: Ib66003be1b5bdc80f46b9bbbb17d45e616714912
Signed-off-by: Zhen Chen <czhen@google.com>
Some repository topologies can cause carryOntoHistory to overflow the
thread stack, due to its strategy of recursing into the 2nd+ parents
of a merge commit. This can easily happen if a project maintains a
local fork, and frequently pulls from the upstream repository, which
itself may have a branchy history.
Rewrite the carryOntoHistory algorithm to use a fixed amount of thread
stack, pushing the save points onto the heap. By using heap space the
thread stack depth is no longer a concern. Repositories are instead
limited by available memory.
The algorithm is now structured as two loops:
carryOntoHistory: This outer loop pops saved commits off the top of
the stack, allowing the inner loop algorithm to dive down that path
and carry bits onto commits along that part of the graph. The loop
ends when there are no more stack elements.
carryOntoHistoryInner: The inner loop walks along a single path of
the graph. For a string of pearls (commits with one parent each)
r <- s <- t <- u
the algorithm walks backwards from u to r by iteratively updating
its local variable 'c'. This avoids heap allocation along a simple
path that does not require remembering state.
The inner loop breaks in the HAVE_ALL case, when all bits have been
found to be previously set on the commit. This occurs when a prior
iteration of the outer loop (carryOntoHistory) explored a different
path to this same commit, and copied the bits onto it.
When the inner loop encounters a merge commit, it pushes all parents
onto the heap-based stack by allocating individual CarryStack
elements for each parent. Parents are pushed in order, allowing
side branches to be explored first.
A small optimization is taken for the last parent, avoiding pushing
it and instead updating 'c', allowing the side branch to be entered
without allocating a CarryStack.
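A simplified sketch of that two-loop structure, using hypothetical Commit and
CarryStack types rather than the actual RevCommit internals:

  // Hypothetical types for illustration only; the real code lives in RevCommit.
  class CarryOntoHistorySketch {
    static final class Commit {
      int flags;
      Commit[] parents;
    }

    static final class CarryStack {
      final CarryStack prev;
      final Commit commit;
      CarryStack(CarryStack prev, Commit commit) {
        this.prev = prev;
        this.commit = commit;
      }
    }

    // Outer loop: pop saved commits until no side branches are left.
    static void carryOntoHistory(Commit start, int carry) {
      CarryStack stack = null;
      for (Commit c = start;;) {
        stack = carryOntoHistoryInner(c, carry, stack);
        if (stack == null) {
          return;
        }
        c = stack.commit;   // dive into the next saved side branch
        stack = stack.prev;
      }
    }

    // Inner loop: walk one path, saving extra parents of merges on the heap.
    static CarryStack carryOntoHistoryInner(Commit c, int carry,
        CarryStack stack) {
      for (;;) {
        if ((c.flags & carry) == carry) {
          return stack;   // HAVE_ALL: another path already set these bits
        }
        c.flags |= carry;
        Commit[] parents = c.parents;
        if (parents == null || parents.length == 0) {
          return stack;   // reached a root commit
        }
        // Push all parents but the last; the last parent is followed by
        // updating 'c', avoiding an allocation on simple strings of commits.
        for (int i = 0; i < parents.length - 1; i++) {
          stack = new CarryStack(stack, parents[i]);
        }
        c = parents[parents.length - 1];
      }
    }
  }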
Change-Id: Ib7b67d90f141c497fbdc61a31b0caa832e4b3c04
Add the --recurse-submodules option on the command, which causes
submodules to also be initialized and updated.
Add a callback interface on CloneCommand and SubmoduleUpdateCommand to
allow them to provide progress feedback for clone operations.
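A hedged usage sketch from the library side (assuming CloneCommand's
setCloneSubmodules option; URI and directory are placeholders):

  import java.io.File;
  import org.eclipse.jgit.api.Git;

  // Hedged sketch: clone a repository and also initialize and update its
  // submodules, roughly what the --recurse-submodules option asks for.
  static Git cloneWithSubmodules() throws Exception {
    return Git.cloneRepository()
        .setURI("https://example.com/repo.git")  // placeholder URI
        .setDirectory(new File("repo"))          // placeholder directory
        .setCloneSubmodules(true)
        .call();
  }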
Change-Id: I41b1668bc0d0bdfa46a9a89882c9657ea3063fc1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Delete the condition to check whether the garbage pack creation time
is older than the last GC operation, because it's not possible to
find the last GC operation time when there is no GC pack.
Add additional tests to make sure the contents of the expired garbage
packs are considered during the GC operation and any actively
referenced objects from the garbage packs are copied successfully
into the GC pack before deleting the garbage pack.
Change-Id: I09e8b2656de8ba7f9b996724ad1961d908e937b6
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
It is quite common to want to parse a commit without already having a
RevWalk. Provide a shortcut to do so to make it more convenient, and to
ensure that the RevWalk is released afterwards.
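A hedged sketch of what such a shortcut does internally (the method name and
placement are illustrative):

  import java.io.IOException;
  import org.eclipse.jgit.lib.AnyObjectId;
  import org.eclipse.jgit.lib.Repository;
  import org.eclipse.jgit.revwalk.RevCommit;
  import org.eclipse.jgit.revwalk.RevWalk;

  // Illustrative helper: build a throwaway RevWalk, parse the commit, and
  // make sure the walk is released when done.
  static RevCommit parseCommit(Repository repo, AnyObjectId id)
      throws IOException {
    try (RevWalk rw = new RevWalk(repo)) {
      return rw.parseCommit(id);
    }
  }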
Signed-off-by: Martin Fick <mfick@codeaurora.org>
Change-Id: I9528e80063122ac318f115900422a24ae49a920e
Android wants them to work, and we're only interested in them for bare
repos, so add them just for that.
Make sure to use symlinks instead of just using the copyfile
implementation. Some scripts look up where they're actually located in
order to find related files, so they need the link back to their
project.
Change-Id: I929b69b2505f03036f69e25a55daf93842871f30
Signed-off-by: Dan Willemsen <dwillemsen@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jeff Gaston <jeffrygaston@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Because objects described by the client using "have" lines do not need
to be reachable by any ref on the server, it is possible for them to
point to missing objects in the reachability graph. When such an
object is encountered, I1097a2defa4a9dcf502ca8baca5d32880378818f (Only
throw MissingObjectException when necessary, 2017-03-29) aborts the
"have" walk early to salvage the fetch. The downside of that change
is that remaining "have"s are ignored unless they pointed directly to
an object with a bitmap. In the worst case this can increase the
bandwidth cost of a fetch to the cost of a clone because most "have"s
are ignored.
Avoid this cost by bypassing the failed "have" completely and moving
on to the remaining "have"s.
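A hedged sketch of the intended behavior (hypothetical loop, not the actual
UploadPack code):

  import java.io.IOException;
  import java.util.Collection;
  import org.eclipse.jgit.errors.MissingObjectException;
  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.revwalk.ObjectWalk;

  // Illustrative sketch: a "have" pointing at a missing object is skipped,
  // and the remaining "have" lines are still used for the negotiation.
  static void markHaves(ObjectWalk walk, Collection<ObjectId> peerHas)
      throws IOException {
    for (ObjectId haveId : peerHas) {
      try {
        walk.markUninteresting(walk.parseAny(haveId));
      } catch (MissingObjectException notFound) {
        // bypass this "have" and move on to the next one
      }
    }
  }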
Change-Id: Iac236b6d05f735078c9935abfa6e58d1eb47f388
When looping through alternates, prevent visiting the same object
directory twice. This could happen when the objects/info/alternates file
includes itself directly or indirectly via another repo and its
alternates file.
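A hedged sketch of the guard (names are illustrative; the actual fix lives in
the alternates handling of the object database):

  import java.io.File;
  import java.util.Set;

  // Illustrative sketch: remember every object directory already visited so
  // a cycle in objects/info/alternates cannot make the walk revisit, or loop
  // on, the same directory. The same 'seen' set is shared across the walk.
  static void walkAlternates(File objectsDir, Set<File> seen) {
    File dir = objectsDir.getAbsoluteFile();
    if (!seen.add(dir)) {
      return; // already visited via another alternates chain
    }
    // ... read dir/info/alternates and recurse into each listed directory ...
  }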
Change-Id: I79bb3da099ebc3c262d2e6c61ed4578eb1aa3474
Signed-off-by: James Melvin <jmelvin@codeaurora.org>
Signed-off-by: Martin Fick <mfick@codeaurora.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Generating the src list with an unrestricted wildcard causes all
files in the source tree to be included. This results in junk files
such as .orig (generated during merge conflict resolution) being
included, which causes a build error:
in srcs attribute of java_library rule //org.eclipse.jgit:jgit:
file '//org.eclipse.jgit:src/org/eclipse/jgit/gitrepo/RepoCommand.java.orig'
is misplaced here (expected .java, .srcjar or .properties).
Modify the globs to only include Java source files.
Change-Id: Iaef3db33ac71d71047cd28acb0378e15cb09ece9
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This is necessary for deploying submodules on android.googlesource.com.
* Allow an empty base URL. This is useful if the 'fetch' field is "."
and all names are relative to some host root.
* The URLs in the resulting superproject are relative to the
superproject's URL. Add RepoCommand#setDestinationURI to
set this. If unset, the existing behavior is maintained.
* Add two tests for the Android and Gerrit case, checking the URL
format in .gitmodules; the tests use a custom RemoteReader which is
representative of the use of this class in Gerrit's Supermanifest
plugin.
Change-Id: Ia75530226120d75aa0017c5410fd65d0563e91b
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This is a workaround for
https://bugs.openjdk.java.net/browse/JDK-4666701.
Change-Id: Idd04657e8d95a841d72230f8881b6b899daadbc2
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In the repo manifest documentation [1] the fetch attribute is marked
as "#REQUIRED".
If the fetch attribute is not specified, this would previously result
in a NullPointerException. Throw a SAXException instead.
[1] https://gerrit.googlesource.com/git-repo/+/master/docs/manifest-format.txt
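A hedged sketch of the check (helper name and message text are illustrative):

  import org.xml.sax.Attributes;
  import org.xml.sax.SAXException;

  // Illustrative sketch: report a missing "fetch" attribute as a SAXException
  // instead of letting it surface later as a NullPointerException.
  static String requireFetch(Attributes attributes) throws SAXException {
    String fetch = attributes.getValue("fetch");
    if (fetch == null) {
      throw new SAXException(
          "manifest remote element lacks the required fetch attribute");
    }
    return fetch;
  }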
Change-Id: Ib8ed8cee6074fe6bf8f9ac6fc7a1664a547d2d49
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
OSGi semantic versioning rules allow breaking implementors of an API in
a minor version.
Change-Id: I4ada3e6455e8e8e1bb8fb71affa0a1b36bd46fc4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When preparing the bitmap, the flag ignoreMissingStart only applied to
the start object. However, sometimes the start object is present but
some related objects are not present during the walk; we should only
throw the MissingObjectException when ignoreMissingStart is set to
false.
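A hedged sketch of the intended behavior (hypothetical helper, not the actual
PackWriterBitmapWalker code):

  import java.io.IOException;
  import org.eclipse.jgit.errors.MissingObjectException;
  import org.eclipse.jgit.revwalk.ObjectWalk;
  import org.eclipse.jgit.revwalk.RevObject;

  // Illustrative sketch: objects that go missing anywhere in the walk are
  // tolerated when ignoreMissingStart is set, and the exception is rethrown
  // only when the flag is false.
  static void walkStep(ObjectWalk walker, boolean ignoreMissingStart)
      throws IOException {
    try {
      RevObject obj;
      while ((obj = walker.nextObject()) != null) {
        // ... set bitmap bits for obj ...
      }
    } catch (MissingObjectException missing) {
      if (!ignoreMissingStart) {
        throw missing;
      }
      // otherwise ignore and keep the bits collected so far
    }
  }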
Change-Id: I1097a2defa4a9dcf502ca8baca5d32880378818f
Signed-off-by: Zhen Chen <czhen@google.com>
All that's really required to run a merge operation is a single
ObjectInserter, from which we can construct a RevWalk, plus a Config
that declares a diff algorithm. Provide some factory methods that don't
take Repository.
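A hedged usage sketch, assuming a newMerger(ObjectInserter, Config) style
factory on MergeStrategy; the caller supplies the inserter, the config and
the two tips:

  import java.io.IOException;
  import org.eclipse.jgit.lib.Config;
  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.lib.ObjectInserter;
  import org.eclipse.jgit.merge.MergeStrategy;
  import org.eclipse.jgit.merge.ThreeWayMerger;

  // Hedged sketch: run a content merge with only an inserter and a config,
  // no Repository in sight.
  static ObjectId mergeInCore(ObjectInserter inserter, Config config,
      ObjectId ours, ObjectId theirs) throws IOException {
    ThreeWayMerger merger = MergeStrategy.RECURSIVE.newMerger(inserter, config);
    if (!merger.merge(ours, theirs)) {
      return null;  // conflicts: no merged tree produced
    }
    return merger.getResultTreeId();
  }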
Change-Id: Ib884dce2528424b5bcbbbbfc043baec1886b9bbd
This is passed directly to the super constructor, where it is also
@Nullable. Marking it here saves the reader a jump.
Change-Id: Icc8db2f2dc6aae6e591aa4f09a3c283336a5424c
This is a continuation from https://git.eclipse.org/r/#/c/4716/. For a
non-bidirectional request, we need to consume the request before writing
any response. In UploadPack, we write "shallow"/"unshallow" responses
before parsing "have" lines. This has happened not to be a problem most
of the time in the smart HTTP protocol because the underlying
InputStream has a 32 KiB buffer in SmartOutputStream.
Change-Id: I7c61659e7c4e8bd49a8b17e2fe9be67bb32933d3
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
DiffAlgorithms can return different edit locations for inserts or
deletes if the edits can be "shifted" up or down across repeating
blocks of lines. This can cause the 3-way merge to apply both edits,
incorrectly removing or duplicating lines.
Augment an existing "tidy-up" stage in DiffAlgorithm to move all
shiftable edits (not just the last INSERT edit) to a consistent
location, and add test cases for previously incorrect merges.
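A hedged sketch of the shifting idea (not the actual DiffAlgorithm code; a
simple line list stands in for RawText):

  import java.util.List;

  // Illustrative sketch: a block of lines inserted before a.get(pos) can be
  // slid down one line whenever the block's current first line equals
  // a.get(pos); each slide rotates the block by one line. Normalizing every
  // shiftable edit to the same end of its run keeps both inputs of a 3-way
  // merge agreeing on where the edit happened.
  static int slideInsertDown(List<String> a, List<String> block, int pos) {
    if (block.isEmpty()) {
      return pos;
    }
    int first = 0; // index of the block's current first line, after rotation
    while (pos < a.size() && a.get(pos).equals(block.get(first))) {
      pos++;
      first = (first + 1) % block.size();
    }
    return pos;
  }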
Bug: 514095
Change-Id: I5fe150a2fc04e1cdb012d22609d86df16dfb0b7e
Signed-off-by: KB Sriram <kbsriram@google.com>
* Parse the base URL in ManifestParser construction. This will signal
errors earlier.
* Simplify stripping of trailing slashes.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I4a86f68c9d7737f71cf20352cfe26288fbd2b463
Add NoPackSignatureException and UnsupportedPackVersionException to
explicitly mark permanent unrecoverable problems with a pack.
Assume a problem with a pack is permanent only if we are sure the
exception signals a non-transient problem we can't recover from:
- AccessDeniedException: we lack permissions
- CorruptObjectException: we detected corruption
- EOFException: file ended unexpectedly
- NoPackSignatureException: pack has no pack signature
- NoSuchFileException: file has gone missing
- PackMismatchException: pack no longer matches its index
- UnpackException: unpacking failed
- UnsupportedPackIndexVersionException: unsupported pack index version
- UnsupportedPackVersionException: unsupported pack version
Do not attempt to handle Errors since they are thrown for serious
problems applications should not try to recover from.
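A hedged sketch of that classification (hypothetical helper; the package
locations of the JGit exception types are assumed):

  import java.io.EOFException;
  import java.nio.file.AccessDeniedException;
  import java.nio.file.NoSuchFileException;
  import org.eclipse.jgit.errors.CorruptObjectException;
  import org.eclipse.jgit.errors.NoPackSignatureException;
  import org.eclipse.jgit.errors.PackMismatchException;
  import org.eclipse.jgit.errors.UnpackException;
  import org.eclipse.jgit.errors.UnsupportedPackIndexVersionException;
  import org.eclipse.jgit.errors.UnsupportedPackVersionException;

  // Illustrative sketch: only these exception types mark a pack as
  // permanently bad; anything else is treated as potentially transient.
  static boolean isPermanentError(Throwable t) {
    return t instanceof AccessDeniedException
        || t instanceof CorruptObjectException
        || t instanceof EOFException
        || t instanceof NoPackSignatureException
        || t instanceof NoSuchFileException
        || t instanceof PackMismatchException
        || t instanceof UnpackException
        || t instanceof UnsupportedPackIndexVersionException
        || t instanceof UnsupportedPackVersionException;
  }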
Change-Id: I2c416ce2b0e23255c4fb03a3f9a0ee237f7a484a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
A packfile random file open operation may fail with a
FileNotFoundException even if the file exists, possibly due to a
temporary lack of resources.
Instead of managing the FileNotFoundException as any generic
IOException, it is best to rethrow the exception but prevent the
packfile from being flagged as invalid until it is actually opened and
read, successfully or unsuccessfully.
Bug: 514170
Change-Id: Ie37edba2df77052bceafc0b314fd1d487544bf35
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a new API method to set the recurse mode, and pass the mode into
the fetch command.
Extend the existing FetchCommandRecurseSubmodulesTest to also perform
the same tests for fetch. Rename the test class accordingly.
Change-Id: I12553af47774b4778f7011e1018bd575a7909bd0
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This provides a place to declare visibility restrictions and
transitive dependencies for each library.
Other targets should only declare dependencies on what they directly
use, making dependencies easier to maintain.
Trim the dependencies of org.eclipse.jgit:jgit to follow that rule.
It declares dependencies on Apache httpcomponents and the servlet
API but doesn't use them.
Tested:
* 'bazel build //...' succeeds
* applying the change https://gerrit-review.googlesource.com/90843
to a copy of Gerrit, following the instructions there, and running
'bazel test //...' in that copy of Gerrit still succeeds
Change-Id: I3ab958ce8b3227019cdbe4cc81e0f042e1541034
This fixes Bazel build:
in srcs attribute of java_library rule //org.eclipse.jgit:jgit:
file '//org.eclipse.jgit:src/org/eclipse/jgit/util/sha1/SHA1.recompress'
is misplaced here (expected .java, .srcjar or .properties).
Another option that was considered is to exclude the non-source files.
Change-Id: I7083f27a4a49bf6681c85c7cf7b08a83c9a70c77
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
The FileNotFoundException is typically raised in four conditions:
1. file doesn't exist
2. incompatible read vs. read/write open modes
3. filesystem locking
4. temporary lack of resources (e.g. too many open files)
Case 1 is already managed, case 2 would never happen as packs are not
overwritten, while with cases 3 and 4 it is worth logging the exception
and retrying to read the pack again.
Log transient errors using an exponential backoff strategy to avoid
flooding the logs with the same error if consecutive retries to access
the pack fail repeatedly.
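A hedged sketch of such a retry loop (hypothetical helper; the attempt limit
and base delay are illustrative, not the values used by the fix):

  import java.io.FileNotFoundException;
  import java.util.concurrent.Callable;

  // Illustrative sketch: retry a transient pack read failure with a doubling
  // delay so the log is not flooded by back-to-back identical errors.
  static <T> T readWithBackoff(Callable<T> readPack) throws Exception {
    long delayMillis = 50;              // illustrative base delay
    for (int attempt = 1; ; attempt++) {
      try {
        return readPack.call();
      } catch (FileNotFoundException transientMiss) {
        if (attempt == 5) {             // illustrative retry limit
          throw transientMiss;          // give up, propagate the failure
        }
        // log once per attempt; the growing delay keeps the log quiet
        Thread.sleep(delayMillis);
        delayMillis *= 2;               // exponential backoff
      }
    }
  }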
Bug: 513435
Change-Id: I03c6f6891de3c343d3d517092eaa75dba282c0cd
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The submodule.name.fetchRecurseSubmodules value was being read from the
configuration of the submodule, but it should be read from the config
of the parent repository.
Also, the fetch.recurseSubmodules value from the parent repository's
configuration was not being considered at all.
Fix both of these and add tests. Now the precedence of the recurse mode
is determined as follows:
1. Value passed to the API
2. Value configured in submodule.name.fetchRecurseSubmodules
3. Value configured in fetch.recurseSubmodules
4. Default to "on demand"
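A hedged sketch of that lookup order (hypothetical helper; it assumes JGit's
Config#getEnum and the SubmoduleConfig.FetchRecurseSubmodulesMode enum):

  import org.eclipse.jgit.lib.Config;
  import org.eclipse.jgit.lib.SubmoduleConfig.FetchRecurseSubmodulesMode;

  // Illustrative precedence: API value, then
  // submodule.<name>.fetchRecurseSubmodules, then fetch.recurseSubmodules,
  // then the "on demand" default. Both config values are read from the
  // parent repository's configuration.
  static FetchRecurseSubmodulesMode resolveRecurseMode(Config parentConfig,
      String name, FetchRecurseSubmodulesMode apiValue) {
    if (apiValue != null) {
      return apiValue;
    }
    FetchRecurseSubmodulesMode mode = parentConfig.getEnum(
        FetchRecurseSubmodulesMode.values(), "submodule", name,
        "fetchRecurseSubmodules", null);
    if (mode != null) {
      return mode;
    }
    mode = parentConfig.getEnum(FetchRecurseSubmodulesMode.values(),
        "fetch", null, "recurseSubmodules", null);
    return mode != null ? mode : FetchRecurseSubmodulesMode.ON_DEMAND;
  }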
Change-Id: Ic23b7c40b5f39135fb3fd754c597dd4bcc94240c
Extend FetchCommand to expose a new method, setRecurseSubmodules(mode),
which allows setting the mode to YES, NO or ON_DEMAND.
After fetching a repository, its submodules are recursively fetched:
- When the mode is YES, submodules are always fetched.
- When the mode is NO, submodules are not fetched.
- When the mode is ON_DEMAND, submodules are only fetched when the
parent repository receives an update of the submodule and the new
revision is not already in the submodule.
The mode is determined in the following order of precedence:
- Value specified in the API call using setRecurseSubmodules.
- Value specified in the repository's config under the key
submodule.name.fetchRecurseSubmodules
- Defaults to ON_DEMAND if neither of the previous is set.
Extend FetchResult to recursively include results for submodules, as
a map of the submodule path to an instance of FetchResult.
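A hedged usage sketch (the repository path is a placeholder; the accessor for
the per-submodule results on FetchResult is not shown):

  import java.io.File;
  import org.eclipse.jgit.api.Git;
  import org.eclipse.jgit.lib.SubmoduleConfig.FetchRecurseSubmodulesMode;
  import org.eclipse.jgit.transport.FetchResult;

  // Hedged sketch: fetch with submodule recursion forced on, overriding any
  // configured value per the precedence described above.
  static FetchResult fetchWithSubmodules(File repoDir) throws Exception {
    try (Git git = Git.open(repoDir)) {
      return git.fetch()
          .setRecurseSubmodules(FetchRecurseSubmodulesMode.YES)
          .call();
    }
  }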
Test setup is based on testCloneRepositoryWithNestedSubmodules.
Change-Id: Ibc841683763307cb76e78e142e0da5b11b1add2a
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>