Extract ObjectReachabilityChecker interface from the walk-based
implementation, to add a bitmapped based implementation later.
Refactor the test case to use it for both implementations.
Change-Id: Iaac7c6b037723811956ac22625f27d3b4d742139
Signed-off-by: Ivan Frade <ifrade@google.com>
Preparing the code to optimize the bitmap-based object reachability
checker. We are mirroring first the commit reachability checker
structure (interface + 2 implementations).
Move the walk-base reachability checker to its own class.
This class is public at the moment. Later ObjectWalk will return an
interface and this implementation will be package-private.
Change-Id: Ifac70094e1af137291c3607d95e689992f814b26
Signed-off-by: Ivan Frade <ifrade@google.com>
After negotiation phase of a fetch, the advertised ref map is no longer used and
can be safely cleared. For >1GiB repos object selection and packfile writing may
take 10s of minutes. For the chromium.googlesource.com/chromium/src repo, this
advertised ref map is >400MiB. Returning this memory to the Java heap is a major
scalability win.
Change-Id: I00d453c5ef47630c21f199e333e1cfcf47b7e92a
Signed-off-by: Minh Thai <mthai@google.com>
In Id5376f09f0d a test with dependency on log4j library was added, but
the library was missed to be added to the Bazel build tool chain.
Given that Bazel test runner doesn't suport custom security manager the
test wouldn't pass even if the missing dependency would be added. The
only solution we have for now is to exclude that test from Bazel tool
chain.
Filed a feature request for bazel to support such tests at
https://github.com/bazelbuild/bazel/issues/11146
Bug: 562274
Change-Id: I873a0e09addc583455b68122f66cd3952e485f0e
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Since reftables might have update index ranges that are overlapped.
Change-Id: I8f8215b99a0a978d4dd0155dbaf33e5e06ea8202
Signed-off-by: Minh Thai <mthai@google.com>
(cherry picked from commit 06748c205c)
Ensure files are writable before trying to delete them.
Bug: 408846
Change-Id: I930a547594bba853c33634ae54bd64d236afade3
Signed-off-by: Alexander Nittka <alex@nittka.de>
Revert commit 2323d7a. Using $0 in the shell command call results in
the command string being taken literally. That was introduced to fix
a problem with backslashes, but is actually not correct.
First, the problem with backslashes occurred only on Win32/Cygwin,
and has been properly fixed in commit 6f268f8.
Second, this is used only for hooks (which don't have backslashes in
their names) and filter commands from the git config, where the user
is responsible for properly quoting or escaping such that the commands
work.
Third, using $0 actually breaks correctly quoted filter commands
like in the bug report. The shell really takes the command literally,
and then doesn't find the command because of quotes.
So revert this change.
At the same time there's a related problem with hooks. If the path to
the hook contains blanks, runInShell() would also fail to find the
hook. In this case, the command doesn't come from user input but is
just a Java File object with an absolute path containing blanks. (Can
occur if core.hooksPath points to such a path with blanks, or if the
repository has such a path.)
The path to the hook as obtained from the file system must be quoted.
Add a test for a hook path with a blank.
This reverts commit 2323d7a1ef.
Bug: 561666
Change-Id: I4d7df13e6c9b245fe1706e191e4316685a8a9d59
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Commit 60cf85a4 corrected the handling of check-in for files where
the index version is non-normalized, i.e., contains CR-LF line endings.
However, it did so only for regular files, not executable files.
Bug: 561438
Change-Id: I372cc990c5efeb00315460f36459c0652d5d1e77
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Since reftables might have update index ranges that are overlapped.
Change-Id: I8f8215b99a0a978d4dd0155dbaf33e5e06ea8202
Signed-off-by: Minh Thai <mthai@google.com>
The recursive merge strategy builds a virtual ancestor merging
recursively the common bases (when more than one) between the
want-to-merge commits. While building this virtual ancestor, content
conflicts are ignored, but current code doesn't do so when a file is
removed.
This was spotted in [1], for example. Merging two commits to build the
virtual ancestor bumped into a conflict (modified in one side, deleted
in the other) that stopped the process.
Follow the "spec" and in case of conflict leave the unmerged content in
the index and working trees.
[1] https://android-review.googlesource.com/c/kernel/common/+/1228962
Change-Id: Ife9c32ae3ac3a87d3660fa1242e07854b65169d5
Signed-off-by: Ivan Frade <ifrade@google.com>
Allow explicitly setting the tag option for the remote configuration
when cloning a repository.
Bug: 561021
Change-Id: Iac43268a2bb231ae7599c3255bf555883d34fa32
Signed-off-by: Alexander Nittka <alex@nittka.de>
The error message for an Exception thrown by StartGenerator when given
both the TOPO flag and the TOPO_KEEP_BRANCH_TOGETHER flag mentions a
non-existent flag, TOPO_NON_INTERMIX. The error message was introduced
in commit e498d43.
Replace TOPO_NON_INTERMIX with TOPO_KEEP_BRANCH_TOGETHER in the error
message of an Exception thrown by the StartGenerator when the TOPO flag
is provided together with the TOPO_KEEP_BRANCH_TOGETHER flag.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Id24640dc08e96a196508fe38ce144aa7e035082f
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git-log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays producing a commit until all of its
children have been produced; it does not immediately produce a commit
after its last child has been produced.
Therefore, add a new RevSort option called TOPO_KEEP_BRANCH_TOGETHER
with a new topo sort algorithm in TopoNonIntermixGenerator. In the
Generator, when the last child of a commit has been produced, unpop
that commit so that it will be returned upon the subsequent call to
next(). To avoid producing duplicates, mark commits that have not yet
been produced as TOPO_QUEUED so that when a commit is popped, it is
produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Commit b5e764abd2 modified the existing
TopoSortGenerator to avoid mixing lines of history, but it was reverted
in e40c38ab08 because the new behavior
caused problems for EGit users. This motivated adding a new Generator
for the new behavior.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Icbb24eac98c00e45c175b01e1c8122554f617933
Does not fix any issue but prevents user from shooting themselves in the
foot with improper configuration.
Suggested by Demetr Starshov at https://git.eclipse.org/r/#/c/157681/
Change-Id: I006d65022f0a7d4066970825d00080c59404fdc3
Signed-off-by: Michael Dardis <git@md-5.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The change Ic0b974fa (c217d33, "Documentation/technical/reftable:
improve repo layout") defines a new repository layout, which was
agreed with the git-core mailing list.
It addresses the following problems:
* old git clients will not recognize reftable-based repositories, and
look at encompassing directories.
* Poorly written tools might write directly into
.git/refs/heads/BRANCH.
Since we consider JGit reftable as experimental (git-core doesn't
support it yet), we have no backward compatibility. If you created a
repository with reftable between mid-Nov 2019 and now, you can do the
following to convert:
mv .git/refs .git/reftable/tables.list
git config core.repositoryformatversion 1
git config extensions.refStorage reftable
Change-Id: I80df35b9d22a8ab893dcbe9fbd051d924788d6a5
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
currVisit could be null if a blob is marked as start point in
ObjectWalk. Add null check before skipping current tree.
Change-Id: Ic5d876fe2800f3373d136979be6c27d1bbd38dc1
Signed-off-by: Yunjie Li <yunjieli@google.com>
This reverts commit b5e764abd2.
PlotWalk uses the TopoSortGenerator, which is causing problems for EGit users
who rely on the emission of commits being somewhat based on date as in the
previous topo-sort algorithm.
Bug: 560529
Change-Id: I3dbd3598a7aeb960de3fc39352699b4f11a8c226
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
The Java Language Specification recommends listing modifiers in
the following order:
1. Annotations
2. public
3. protected
4. private
5. abstract
6. static
7. final
8. transient
9. volatile
10. synchronized
11. native
12. strictfp
Not following this convention has no technical impact, but will reduce
the code's readability because most developers are used to the standard
order.
This was detected using SonarLint.
Change-Id: I9cddecb4f4234dae1021b677e915be23d349a380
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays a commit until all of its children have
been produced; it does not immediately produce a commit after its last
child has been produced.
Therefore, when the last child of a commit has been produced, unpop the
commit so that it will be returned upon the subsequent call to next() in
TopoSortGenerator. To avoid producing duplicates, mark commits that
have not yet been produced as TOPO_QUEUED so that when a commit is
popped, it is produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Modify tests that assert that TopoSortGenerator mixes lines of commit
history.
Change-Id: I4ee03c7a8e5265d61230b2a01ae3858745b2432b
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
The ReftableCompactor supported a byteLimit, but this is currently
unused. The FileReftableStack has a more sophisticated strategy that
amortizes compaction costs.
Rename min/maxUpdateIndex to reflogExpire{Min,Max}UpdateIndex to
reflect their purpose more accurately.
Since reflogs are generally pruned chronologically (oldest entries are
expired first), one can only prune entries on full compaction, so they
should not be set by default.
Rephrase the function Reader#minUpdateIndex and maxUpdateIndex. These
vars are documented to affect log entries, but semantically, they are
about ref entries. Since ref entries have their timestamps
delta-compressed, it is important for the min/maxUpdateIndex values to
be coherent between different tables.
The logical timestamps for log entries do not have to be coherent in
different tables, as the timestamps of a log entry is part of the key.
For example, a table written at update index 20 may contain a tombstone
log entry at timestamp 1.
Therefore, we set ReftableWriter's min/maxUpdateIndex from the merged
tables we are compacting, rather than from the compaction settings
(which should only control reflog expiry.)
The previous behavior could drop log entries erroneously, especially
in the presence of tombstone log entries. Unfortunately, testing this
properly requires both an API for adding log tombstones, and a more
refined API for controlling automatic compaction. Hence, no test.
Change-Id: I2f4eb7866f607fddd0629809e8e61f0b9097717f
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
The merged table contains handles to open files. A full compaction
causes those files to be closed, and so further lookups would fail
with EBADF.
Change-Id: I7bb74f7228ecc7fec9535b00e56a617a9c18e00e
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Commit 6216b0de changed the behavior of the setMirror(),
setCloneAllBranches(), and setBranchesToClone() operations. Before
that commit, these could be set and reset independently and only in
call() it would be determined what exactly to do. Since that commit,
the last of these calls would determine the operation. This means
that the sequence
cloneCommand.setCloneAllBranches(true);
cloneCommand.setBranchesToClone(/* some list of refs */);
would formerly do a "clone all" giving a fetch refspec with wildcards
+refs/heads/*:refs/remotes/origin/*
which picks up new upstream branches, whereas since commit 6216b0de
individual non-wildcard fetch refspecs would be generated and new
upstream branches would not be fetched anymore.
Undo this behavioral change. Make the operations independently settable
and resettable again, and determine the exact operation only in call():
mirror=true > cloneAll=true > specific refs, where ">" means "takes
precedence over", and if none is set assume cloneAll=true.
Note that mirror=true implies setBare(true).
Bug: 559796
Change-Id: I7162b60e99de5e3e512bf27ff4113f554c94f5a6
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
The default threads from the ForkJoinPool run without any privileges
when a SecurityManager is installed, leading to SecurityExceptions
when trying to create the probe file even if the application otherwise
has write privileges in the directory.
Use a dedicated ThreadPoolExecutor using daemon threads instead.
Bug: 551690
Change-Id: Id5376f09f0d7da5ceea367e1f0dfc70f823d62d3
Signed-off-by: Alex Jitianu <alex_jitianu@sync.ro>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Enable UnusedException at ERROR level which causes the build to fail
in many places with:
[UnusedException] This catch block catches an symbol and re-throws
another, but swallows the caught symbol rather than setting it as a
cause. This can make debugging harder.
Fix it by setting the caught exception as cause on the subsequently
thrown exception.
Note: The grammatically incorrect error message is copy-pasted as-is
from the version of ErrorProne currently used in Bazel; it has been
fixed by [1] in the latest version.
[1] https://github.com/google/error-prone/commit/d57a39c
Change-Id: I11ed38243091fc12f64f1b2db404ba3f1d2e98b5
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Since version 4.13 JUnit has an assertThrows method. Remove the
implementation in MoreAsserts and use the one from JUnit.
CQ: 21439
Change-Id: I086baa94aa3069cebe87c4cbf91ed1534523c6cb
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Java GC evicts all SoftReferences when the used heap size comes close to
the maximum heap size. This means peaks in heap memory consumption can
flush the complete WindowCache which was observed to have negative
impact on performance of upload-pack in Gerrit.
Hence add a boolean option core.packedGitUseStrongRefs to allow using
strong references to reference packfile pages cached in the WindowCache.
If this option is set to true Java gc can no longer flush the
WindowCache to free memory if the used heap comes close to the maximum
heap size. On the other hand this provides more predictable performance.
Bug: 553573
Change-Id: I9de406293087ab0fa61130c8e0829775762ece8d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>