After packaging references, the folders containing these references are
not deleted. In a busy repository, this causes operations to slow down
as traversing the references tree becomes longer.
Delete empty reference folders after the loose references have been
packed.
To avoid deleting a folder that was just created by another concurrent
operation, only delete folders that were not modified in the last 30
seconds.
Signed-off-by: Hector Oswaldo Caballero <hector.caballero@ericsson.com>
Change-Id: Ie79447d6121271cf5e25171be377ea396c7028e0
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Log as warning when an attempt to remove a directory
fails. This helps troubleshooting some bugs like the GC leaving
behind empty directories.
Change-Id: Idb94ce17f8be9668a970c7ecae31436bf434073c
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Try to give as much information as possible. The connection's
response message might contain additional hints as to why the
connection could not be established.
Bug: 536541
Change-Id: I7230e4e0be9417be8cedeb8aaab35186fcbf00a5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
In a new RevWalk, if the first object parsed is one of the
shallow commits, the following happens:
1) RevCommit.parseCanonical() is called on a new "r1" RevCommit.
2) RevCommit.parseCanonical() immediately calls
RevWalk.initializeShallowCommits().
3) RevWalk.initializeShallowCommits() calls lookupCommit(id),
creating and adding a new "r2" version of this same object and
marking its parents empty.
4) RevCommit.parseCanonical() initializes the "r1" RevCommit's
fields, including the parents.
5) RevCommit.parseCanonical()'s caller uses the "r1" commit that
has parents, losing the fact that it is a shallow commit.
This change passes the current RevCommit as an argument to
RevWalk.initializeShallowCommits() so that method can set its
parents empty rather than creating the duplicate "r2" commit.
Change-Id: I67b79aa2927dd71ac7b0d8f8917f423dcaf08c8a
Signed-off-by: Terry Parker <tparker@google.com>
The previous algorithm selected commits by creating bitmaps at
each branch tip, doing a revwalk to populate each bitmap, and
looping in this way:
1) Select the remaining branch with the most commits (the branch
whose bitmap has the highest cardinality)
2) Select well-spaced bitmaps in that branch
3) Remove commits in the selected branch from the remaining
branch-tip bitmaps
4) Repeat at #1
This algorithm gave good commit selection on all branches but
a more uniform selection on "important" branches, where branch
length is the proxy for "important". However the algorithm
required N bitmaps of size M solely for the purpose of commit
selection, where N is the number of branch tips in the primary
GC pack, and M is the number of objects in the pack.
This new algorithm uses branch modification date as the proxy for
"important" branches, replacing the N*M memory allocation with a
single M-sized bitmap and N revwalks from new branch tips to
shared history (which will be short when there is a lot of shared
history).
GcCommitSelectionTest.testDistributionOnMultipleBranches verifies
that this algorithm still yields good coverage on all branches.
Change-Id: Ib6019b102b67eabb379e6b85623e4b5549590e6e
Signed-off-by: Terry Parker <tparker@google.com>
Since no files are actually deleted it makes no sense to fire such an
event.
Change-Id: I66e87afc1791f27fddaa873bafe8bb8b61662535
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
This change fixes the issue [1]. Before this fix, a merge involving
the caching of consecutive yet similar filenames with Norwegian
characters [2] used to throw an IllegalStateException: Duplicate
stages not allowed. This was caused by inaccurate decoding of the
filenames, using string values assuming default encoding. In the
toString method of DirCacheEntry, used before through getPathString,
UTF-8 encoding is used, but the end result becomes default encoding,
through Object's default toString usage. The special characters in
those two consecutive (particular) filenames [2] were becoming the
very same decoded /single character, lending consecutive -but then
identical- filenames. Thus the perceived duplicate 0-staging of the
file(s).
Replace getPathString usage with getRawPath for this specific case,
or use byte array representations of cached entries instead of string.
Adding a test for this change is not possible, as there is no known
way to change the default encoding for filenames such as [2] (e.g.).
JGitTestUtil does write file contents through UTF-8, but encoding like
so does not apply to the actual file name. Hence there is no way to
create files with names properly made of special characters such as
[2]'s. And the test that is necessary for this case assumes such
Norwegian (or similar characters) filenames. Changing the default
locale programmatically in a test has no effect either. And changing
the LANG value passed to the JVM is only possible upon starting it.
[1] https://bugs.chromium.org/p/gerrit/issues/detail?id=9153
[2] <=>
(...)
"a/b/SíÒr-Norge.map",
"a/b/Sør-Norge.map",
(...)
Change-Id: Ib9f2f5297932337c9817064cc09d9f774dd168f4
Signed-off-by: Marco Miller <marco.miller@ericsson.com>
If I run
git config --global protocol.version 2
mkdir repo
cd repo
git init --bare
git remote add origin https://go.googlesource.com/proposal
git fetch --depth=1
git fetch --unshallow
then I expect to have a full history, just as though I had fetched
without --depth in the first place. Instead, it reports success
but does not fetch enough objects:
$ git fsck
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
Checking objects: 100% (468/468), done.
broken link from commit 2c6bc83f234085c8eadb7ea33405ce6223c44d1b
to commit 878975cf2b600675b4c905e5d9591bd24541ae9e
missing commit 878975cf2b600675b4c905e5d9591bd24541ae9e
dangling commit 314be00dae78dd526851f5635e6349014e2ad0c2
The false success indicates problems in the client and the server.
Git 2.18-rc2 (the client) ought to have been more defensive, noticing
the incomplete history. The greater error is in JGit (the server),
which neglects to send the objects requested.
When serving protocol v0 requests, JGit sends the correct objects by
taking unshallowCommits into account when generating the pack to send
to the client. Do the same in the protocol v2 code path. I forgot to
do this in v5.0.0.201806050710-rc3~6 (Teach UploadPack shallow fetch
in protocol v2, 2018-03-15).
Reported-by: Russ Cox <rsc@golang.org>
Change-Id: I282b45f47616a641b9e8d6210b4a070d3efdbb9b
Signed-off-by: Jonathan Nieder <jrn@google.com>
Instead, do it when we return the first PlotCommit from next().
On a repository with many refs, getAllRefsByPeeledObjectId() can
take a while. Doing a late initialization simplifies the handling
of a PlotWalk.
EGit, for instance, creates and configures an instance, and then
does the real walk in a background job. With late initialization,
the potentially expensive getAllRefsByPeeledObjectId() also occurs
in that background job.
Bug: 485743
Change-Id: I84c020cf8f7afda6f181778786612b8e6ddd7ed8
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
When SshSupport.runSshCommand fails since the executed external ssh
command failed throw a CommandFailedException.
If discovery of LFS server fails due to failure of the
git-lfs-authenticate command chain the CommandFailureException to the
LfsConfigInvalidException in order to allow root cause analysis in the
application using that.
Change-Id: I2f9ea2be11274549f6d845937164c248b3d840b2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
JGit now considers lightweight tags only if the --tags option is set
i.e. `git.describe().setAllTags(true)` has to be set, else the default
is now as in c git:
Only annotated tags are evaluated unless you pass true
equivalent to --tags (or --all) by the option setAllTags.
Hint: This (still) doesn't address any difference between c-git
`--all` and `!--all --tags` behavior;
perhaps this might be a follow up request
Bug: 423206
Change-Id: I9a3699756df0b9c6a7c74a7e8887dea0df17c8e7
Signed-off-by: Marcel Trautwein <me+eclipse@childno.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* URIish seems to have a tiny feature (bug?). The path of the URI
starts with a '/' only if the URI has a port set (it seems).
* GitHub does not return SSH authorization on a single line as Gerrit
does - need to account for that.
* Increase the SSH git-lfs-authenticate timeout, as GitHub sometimes
responds slower than expected.
* Guard against NPE in case the download action does not contain any
additional headers.
Change-Id: Icd1ead3d015479fd4b8bbd42ed42129b0abfb95c
Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com>
Change-Id: Ib4ebc57236bdea663f27295764886413e2550580
Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Jsch checks only for the availability of the algorithms given by
Jsch-internal config keys "CheckCiphers", "CheckKexes", and
"CheckSignatures". If the ssh config defines any algorithms
unknown to Jsch not listed in those keys, it'll still propose them
during the negotiation phase, and run into an NPE later on if the
server happens to propose such an algorithm and it gets chosen.
Jsch reads those "CheckCiphers" and the other values from either a
session-local config, or the global static Jsch config. It bypasses
~/.ssh/config for these values.
Therefore, copy these values from the config as read from
~/.ssh/config into the session-specific config. That makes Jsch
check _all_ configured algorithms up front, discarding any for
which it has no implementation. Thus it proposes only algorithms
it actually can handle.
Bug: 535672
Change-Id: I6a68e54f4d9a3267e895c536bcf3c58099826ad5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
From the javadoc for Files.list:
"The returned stream encapsulates a DirectoryStream. If timely disposal
of file system resources is required, the try-with-resources construct
should be used to ensure that the stream's close method is invoked
after the stream operations are completed."
This is the only call to Files#newDirectoryStream that is not already in
a try-with-resources.
Change-Id: I91e6c56b5d74e8435457ad6ed9e6b4b24d2aa14e
(cherry picked from commit 1c16ea4601)
synchronize on simple Object monitor instead of using ReentrantLock
Change-Id: I897020ab35786336b51b0fef76ea6071aff8aefa
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ensure that notifyIndexChanged is called every time we call
FileSnapshot.save, except the first.
Change-Id: I5a4e9826e791f518787366ae7c3a0ef3d416d2c1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If the configuration variable uploadpack.allowfilter is true, advertise
that "filter" is supported, and support it if the client sends such an
argument.
Change-Id: I7de66c0a0ada46ff71c5ba124d4ffa7c47254c3b
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
A subsequent patch needs dynamic generation of this advertisement
depending on a configuration variable in the underlying repository, so
refactor it into a function instead of using a constant list.
Change-Id: Ie00584add1fb56c9e88c7b57f75703981ea5bb85
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
The implementation of protocol v2 will also need to parse the "filter"
option, so refactor it into its own method.
Change-Id: I751f6e6ca63fab873298594653a3885202297a2e
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
JGit's implementation of the fetch command of protocol v2, unlike its
implementation of ls-refs, currently tolerates unknown arguments.
Tighten fetch to not allow unrecognized arguments and add tests to
verify this behavior for both ls-refs and fetch.
Change-Id: I321161d568bd638252fab1a47b06b924d472a669
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Add support for the "shallow" and "deepen" parameters in the "fetch"
command in the fetch-pack/upload-pack protocol v2. Advertise support for
this in the capability advertisement.
TODO: implement deepen-relative, deepen-since, deepen-not
Change-Id: I7ffd80d6c38872f9d713ac7d6e0412106b3766d7
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
This reduces the amount of state held as instance variables in
UploadPack, and makes it easier for a future patch to contain a clearer
version of UploadPack#processShallow.
Change-Id: I6df80b42f9e5118fda1420692e02e417670cced3
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Teach UploadPack to support protocol v2 with non-bidirectional pipes,
and add support to the HTTP protocol for v2. This is only activated if
the repository's config has "protocol.version" equal to 2.
Change-Id: I093a14acd2c3850b8b98e14936a716958f35a848
Helped-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Provide a factory for comparators that use the default heuristics except
with a different ordering of PackSources.
Change-Id: I0809b64deb3d0486040076946fdbdad650d69240
There are several ways of comparing DfsPackDescriptions for different
purposes, such as object lookup search order and reftable ordering. Some
of these are later compounded into comparators on other objects, so they
appear in the code as Comparator<DfsReftable>, for example.
Put all the DfsPackDescription comparators in static methods on
DfsPackDescription itself. Stop implementing Comparable, to avoid giving
the impression that there is always one true and correct way of sorting
packs.
Change-Id: Ia5ca65249c13373f7ef5b8a5d1ad50a26577706c
Rather than requiring callers to do their own computations based on the
package-private "category" number, provide an actual
Comparator<PackSource> instance, and explicitly discourage usage of
default Enum comparison.
Construct the default comparator using a builder pattern based on
defining equivalence classes. This gives us the same behavior as the old
category field in PackSource, with an abstraction that does not leak the
implementation detail of comparing rank numbers.
Change-Id: I6757211397ab1bc181d61298e073f88b69dbefc3
In normal operation, the source of a pack should never be null; the DFS
implementation should always know where a pack came from. Existing
implementations in InMemoryRepository and at Google always have the
source available at construction time.
The problem with null PackSources in the previous implementation was it
made the DfsPackDescription#compareTo method intransitive. Specifically,
it skips comparing the sources at all if *either* operand is null.
Suppose we have three descriptions A, B, and C, where all fields are
equal except the PackSource, and:
* A's source is INSERT
* B's source is null
* C's source is RECEIVE
In this case, A.compareTo(B) == 0, and B.compareTo(C) == 0, since all
fields are equal except the source, which is skipped. But
A.compareTo(C) != 0, since A and B have different sources.
Avoid this problem in compareTo by enforcing that the source is never
null. We could of course assign an arbitrary category number to a null
source in order to make comparison transitive[1], but it's simpler to
implement and reason about if the field is non-nullable, and there is no
real-world use case to make it null.
Although a non-null source is required at construction time, the field
is currently still mutable: DfsPackDecscription#setPackSource is used by
DfsInserterTest to mark packs as garbage. This could probably be
avoided as well, allowing us to convert packSource to a final field, but
doing so is beyond the scope of this change.
[1] The astute reader will notice this is already done by
DfsObjDatabase#reftableComparator(). In fact, the reason that
different comparator implementations non-obviously have different
semantics for this nullable field is another reason why it's clearer
to avoid null entirely.
Change-Id: I85a2aaf3fd6d4868f241f7972a0349f087830ffa
The canonical implementation also doesn't. Compare current
code in remote.c, function get_stale_heads_cb.[1] Not handling
symrefs in this case was introduced in canonical git in [2]
in 2008.
[1] https://github.com/git/git/blob/v2.17.0/remote.c#L2259
[2] https://github.com/git/git/commit/740fdd27f0
Bug: 533549
Change-Id: If348d56bb4a96b8aa7141f7e7b5a0d3dd4e7808b
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Callers of getAllRefs that only iterate over the `values()` of the
returned map can be trivially fixed to call getRefDatabase().getRefs()
instead.
Only fix those where the calling method is already declared to throw
IOException, to avoid potential API changes.
Change-Id: I2b05f785077a1713953cfd42df7bf915f889f90b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Callers should instead use getRefDatabase().getRefs(), which does not
swallow the IOException.
Replace @link with @code in the Javadoc of FileRepository, since linking
to the deprecated method causes an error:
Javadoc: The method getAllRefs() from the type Repository is deprecated
Existing callers of the deprecated method are not adapted in this commit
because many of them require more refactoring. They will be done in
separate follow-up commits.
Bug: 534731
Change-Id: Id84e70e4cd7be3d1ca1795512950c6abe3d18ffd
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Callers should use getRefDatabase().peel(ref) instead since it
doesn't swallow the IOException.
Adapt all trivial callers to user the alternative.
DescribeCommand still uses the deprecated method and is not adapted in
this change since it will require more refactoring to add handling of
the IOException.
Change-Id: I14d4a95a5e0570548753b9fc5c03d024dc3ff832
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>