If the attributes of FileSnapshot don't detect modification of a
packfile, read the packfile's checksum and compare it against the
checksum cached in the loaded packfile.
Since reading the checksum needs less IO than reloading the complete
packfile, this may help to reduce the overhead of detecting modification
when a gc completes while ObjectDirectory scans for packfiles in another
thread.
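A minimal sketch of the checksum comparison, assuming a SHA-1 repository
where the last 20 bytes of a packfile hold its checksum. The class
PackChecksumSketch and its method names are hypothetical and are not
JGit API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical helper for illustration only, not part of JGit. */
final class PackChecksumSketch {
    /** Assumed length of the trailing SHA-1 checksum of a packfile. */
    static final int CHECKSUM_LENGTH = 20;

    /** Reads only the trailing checksum instead of the whole file. */
    static byte[] readTrailer(Path pack) {
        try (SeekableByteChannel ch = Files.newByteChannel(pack,
                StandardOpenOption.READ)) {
            long size = ch.size();
            if (size < CHECKSUM_LENGTH) {
                throw new IOException("file too short: " + pack);
            }
            ch.position(size - CHECKSUM_LENGTH);
            ByteBuffer buf = ByteBuffer.allocate(CHECKSUM_LENGTH);
            while (buf.hasRemaining()) {
                if (ch.read(buf) < 0) {
                    throw new IOException("unexpected end of file: " + pack);
                }
            }
            return buf.array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the checksum on disk differs from the cached one. */
    static boolean isModified(Path pack, byte[] cachedChecksum) {
        return !Arrays.equals(cachedChecksum, readTrailer(pack));
    }
}
```

A caller would invoke isModified() only after the cheaper stat-based
checks found no difference, so the extra read stays limited to 20 bytes.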
Bug: 546891
Change-Id: I9811b497eb11b8a85ae689081dc5d949ca8c4be5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This allows verifying the expected behavior in
FileSnapshotTest#testSimulatePackfileReplacement and enables extending
FileSnapshot for packfiles to read the packfile's checksum as another
criterion to detect modifications without reading the full content.
Also add another field capturing the result of the last check whether
lastModified was racily clean.
Remove unnecessary determination of raciness in the constructor. It was
determined twice in all relevant cases.
Change-Id: I100a2f49d7949693d7b72daa89437e166f1dc107
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
NTFS does not support FileKey, hence ignore this test on Windows.
Change-Id: I7b53a591daa5e03eb5e401b5b26d612ab68ce10d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
testNewFileNoWait() was identical to testNewFileWithWait() but claimed
it doesn't wait at all. Hence remove the waits.
Change-Id: I49b8ca5cb49a43c55fe61870c18c42f32fb4b74d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Some filesystems expose unique file identifiers, e.g. inodes in POSIX
filesystems, which are named filekeys in Java's BasicFileAttributes. Use
them as another means to detect file modifications based on stat
information.
Running git gc on a repository yields a new packfile with the same id as
a packfile which existed before the gc if these packfiles contain the
same set of objects. The content of the old and the new packfile might
differ if a different PackConfig was used when writing the packfile.
Considering filekeys in FileSnapshot may help to detect such packfile
modifications.
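A minimal sketch of comparing filekeys, using the standard
BasicFileAttributes.fileKey() call. The class FileKeySketch and its
methods are hypothetical. Treating a missing key as "no information" is
an assumption of this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper for illustration only, not part of JGit. */
final class FileKeySketch {
    /** Returns the file key, or null if the filesystem provides none. */
    static Object fileKey(Path path) {
        try {
            return Files.readAttributes(path, BasicFileAttributes.class)
                    .fileKey();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * A differing key means the path now refers to a different file.
     * Assumption: a null key proves nothing, so it never reports a
     * replacement.
     */
    static boolean replaced(Object oldKey, Object newKey) {
        return oldKey != null && newKey != null && !oldKey.equals(newKey);
    }
}
```

When gc writes a new packfile under the same name, the new file usually
gets a different inode, so replaced() can report the change even if size
and timestamp happen to match.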
Bug: 546891
Change-Id: I711a80328c55e1a31171d540880b8e80ec1fe095
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
FileSnapshot.notRacyClean() assumed a worst case filesystem timestamp
resolution of 2.5 sec (FAT has a resolution of 2 sec). Instead measure
timestamp resolution to avoid unnecessary IO caused by false positives
in detecting the racy git problem caused by finite filesystem timestamp
resolution [1].
Cache the measured resolution per FileStore since timestamp resolution
depends on the respective filesystem type. If timestamp resolution
cannot be measured, or measuring fails due to an exception, fall back to
the worst case FAT timestamp resolution and avoid caching this value.
Add a 10% safety margin in FileSnapshot.notRacyClean(), though running
FsTest.testFsTimestampResolution(), which does not use a safety margin,
1000 times didn't fail on Mac using APFS and Java 8, 11, 12.
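The racy check with a 10% safety margin can be sketched as follows. This
is an illustration under stated assumptions, not the actual
FileSnapshot code: the class name, the method signature and the use of
java.time types are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration only, not part of JGit. */
final class RacyCleanSketch {
    /** Worst case used when the resolution cannot be measured (FAT). */
    static final Duration FALLBACK_RESOLUTION = Duration.ofSeconds(2);

    /**
     * Returns true if the file was last modified long enough before it
     * was read that its timestamp can be trusted.
     */
    static boolean notRacyClean(Instant lastModified, Instant lastRead,
            Duration resolution) {
        // measured resolution plus 10% safety margin
        long thresholdNanos = resolution.toNanos() * 11 / 10;
        long ageNanos = Duration.between(lastModified, lastRead).toNanos();
        return ageNanos > thresholdNanos;
    }
}
```

With a measured resolution of a few milliseconds the threshold shrinks
accordingly, which is what avoids the false positives of a fixed
2.5 sec assumption.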
Measured Java file timestamp resolution: [2]
[1] https://github.com/git/git/blob/master/Documentation/technical/racy-git.txt
[2] https://docs.google.com/spreadsheets/d/1imy0y6WmRqBf0kjCxzxj2X7M50eIVfa7oaUIzEOHmjo
Bug: 546891
Change-Id: I493f3b57b6b306285ffa7d392339d253e5966ab8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Do not reload packfiles when their associated filesnapshot is not
modified on disk compared to the one currently stored in memory.
Fix the regression introduced by fef78212 which, in conjunction with
core.trustfolderstats = false, caused any lookup of objects inside
the packlist to loop forever when the object was not found in the pack
list.
Bug: 546190
Change-Id: I38d752ebe47cefc3299740aeba319a2641f19391
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The prune method did not delete empty fanout directories when loose
objects were moved to a new pack file, but only when loose unreferenced
objects were pruned.
Change-Id: Ia068f4914c54d9cf9f40b75e8ea50759402b5000
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Also read and consider the file size, so that a differing file size
helps to detect file changes more accurately without reading the file
content.
Use bulk read to avoid multiple stat calls to retrieve file attributes.
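A minimal sketch of such a bulk read, using the standard
Files.readAttributes() call to fetch timestamp and size together. The
class StatSnapshotSketch is hypothetical and much simpler than
FileSnapshot.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

/** Hypothetical helper for illustration only, not part of JGit. */
final class StatSnapshotSketch {
    final FileTime lastModified;
    final long size;

    StatSnapshotSketch(FileTime lastModified, long size) {
        this.lastModified = lastModified;
        this.size = size;
    }

    /** Reads all needed attributes with a single bulk call. */
    static StatSnapshotSketch of(Path path) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(path,
                    BasicFileAttributes.class);
            return new StatSnapshotSketch(attrs.lastModifiedTime(),
                    attrs.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A differing size reveals a change even if the timestamp is equal. */
    boolean isModified(StatSnapshotSketch current) {
        return size != current.size
                || !lastModified.equals(current.lastModified);
    }
}
```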
Change-Id: I974288fff78ac78c52245d9218b5639603f67a46
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ProtocolV2Parser explains:
// TODO(ifrade): This validation should be done after the
// protocol parsing. It is not a protocol problem asking for an
// unexisting ref and we wouldn't need the ref database here.
Do so. This way all ref database accesses are in one place, in the
UploadPack class.
No user-visible change intended --- this is just to make the code
easier to manipulate.
Change-Id: I68e87dff7b9a63ccc169bd0836e8e8baaf5d1048
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When multiple match options are given in git describe, the result must
not depend on the order of the match options. JGit wrongly picked the
first match using the match options in the order they were defined. Fix
this by concatenating the streams of matching tags for all match options
and then choosing the first match on the concatenated stream sorted in
tie break order.
See https://git-scm.com/docs/git-describe#git-describe---matchltpatterngt
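The approach can be sketched as follows. This is an illustration, not
the DescribeCommand code: the Tag class, the predicate-based matchers
and the use of the tagger time as the only sort key are assumptions of
this sketch.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for illustration only, not part of JGit. */
final class DescribeMatchSketch {
    static final class Tag {
        final String name;
        final long taggerTime; // seconds since the epoch

        Tag(String name, long taggerTime) {
            this.name = name;
            this.taggerTime = taggerTime;
        }
    }

    /** Tie break order assumed here: newer tags first. */
    static final Comparator<Tag> TIE_BREAK = Comparator
            .comparingLong((Tag t) -> t.taggerTime).reversed();

    /**
     * Concatenates the matches of all matchers, sorts them in tie break
     * order and picks the first one, so matcher order is irrelevant.
     */
    static Optional<Tag> pick(List<Tag> tagsAtCommit,
            List<Predicate<String>> matchers) {
        return matchers.stream()
                .flatMap(m -> tagsAtCommit.stream()
                        .filter(t -> m.test(t.name)))
                .sorted(TIE_BREAK)
                .findFirst();
    }
}
```

A tag matched by several matchers shows up more than once in the
concatenated stream, which is harmless because only the first element
is used.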
Change-Id: Id01433d35fa16fb4c30526605bee041ac1d954b2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The correct behaviour as of git 1.7.1.1 is to break ties by choosing
the most recent tag.
https://github.com/git/git/blob/master/Documentation/RelNotes/1.7.1.1.txt:
* "git describe" did not tie-break tags that point at the same commit
correctly; newer ones are preferred by paying attention to the
tagger date now.
Bug: 538610
Change-Id: Ib0b2a301997bb7f75935baf7005473f4de952a64
Signed-off-by: Håvard Wall <haavardw@gmail.com>
The main concern is submodule URLs starting with '-' that could be
passed as options to an unguarded tool.
Pass the ids of blobs identified as .gitmodules files in the
ObjectChecker through the parser. Load the blobs and parse/validate them
in SubmoduleValidator.
Change-Id: Ia0cc32ce020d288f995bf7bc68041fda36be1963
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In order to validate .gitmodules files, we first need to find them
in the incoming pack.
Do it in the ObjectChecker stage. Check in the tree objects if they
point to a .gitmodules file and report the tree id and the .gitmodules
blob id.
This can be used later to check if the file is in the root of the
project and if the contents are good.
While we're here, make isMacHFSGit more accurate by detecting variants
of filenames that vary in case.
[jn: tweaked NTFS and HFS+ checking; added more tests]
Change-Id: I70802e7d2c1374116149de4f89836b9498f39582
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In C git versions before 2.19.1, the submodule is fetched by running
"git clone <uri> <path>". A URI starting with "-" would be interpreted
as an option, causing security problems. See CVE-2018-17456.
Refuse to add submodules with URIs, names or paths starting with "-",
which could be confused with command line arguments.
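A minimal sketch of such a guard. The class and method names are
hypothetical, and the exception type is an assumption; the actual
validation code may report the problem differently.

```java
/** Hypothetical helper for illustration only, not part of JGit. */
final class SubmoduleOptionGuardSketch {
    /** True if the value could be mistaken for a command line option. */
    static boolean looksLikeOption(String value) {
        return value != null && value.startsWith("-");
    }

    /** Rejects a submodule whose name, path or URI starts with "-". */
    static void check(String name, String path, String uri) {
        for (String value : new String[] { name, path, uri }) {
            if (looksLikeOption(value)) {
                throw new IllegalArgumentException(
                        "submodule value must not start with '-': "
                                + value);
            }
        }
    }
}
```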
[jn: backported to JGit 4.7.y, bringing portions of Masaya Suzuki's
dotdot check code in v5.1.0.201808281540-m3~57 (Add API to specify
the submodule name, 2018-07-12) along for the ride]
Change-Id: I2607c3acc480b75ab2b13386fe2cac435839f017
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The statement:
assertThat(recvStream.available(), is(0));
results in a warning from Eclipse:
The expression of type int is boxed into Integer
because recvStream.available() returns int, but the hamcrest is()
method takes an Integer.
Replace it with the equivalent JUnit assertion.
Also remove the suppression of another similar warning and fix that
in the same way.
Change-Id: I6f18b304a540bcd0a10aec7d3abc7dc6f047fe80
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
In order to support GPG-signed commits, add some methods which will
allow GPG signatures to be parsed out of RevCommit objects.
Later, we can add code to verify the signatures.
Change-Id: Ifcf6b3ac79115c15d3ec4b4eaed07315534d09ac
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use existing test utility methods instead of nested PrintWriter usage.
Change-Id: I324852c7971ae644fa499f377a31d1cf265c7fd9
Signed-off-by: René Scheibe <rene.scheibe@gmail.com>
ErrorProne warns [1] about implicit use of the platform default charset,
which can result in differing behaviour between JVM executions or
incorrect behavior if the encoding of the data source doesn't match
expectations.
[1] http://errorprone.info/bugpattern/DefaultCharset
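A minimal sketch of the fix pattern, passing the charset explicitly
instead of relying on the platform default. The class name is
hypothetical, and UTF-8 is chosen here only as an example encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only, not part of JGit. */
final class ExplicitCharsetSketch {
    /** Decodes with a fixed charset, independent of JVM settings. */
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Encodes with a fixed charset, independent of JVM settings. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

In contrast, new String(raw) and text.getBytes() use the platform
default charset, which is the behaviour the ErrorProne check flags.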
Change-Id: I0fd489d352170339c3867355cd24324dfdbd4b59
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
The client stops sending haves when the number of haves sent reaches
maxhaves.
Change-Id: I1e5b1525be4c67f20a81ca24a2770c20eb5c1271
Signed-off-by: Minh Thai <mthai@google.com>
The parsing code for protocol v2 fetch doesn't have any dependency on
the rest of UploadPack.
Move it to its own class. This makes testing easier (no need to
instantiate the full UploadPack), simplifies the code in UploadPack and
increases modularity.
At the moment, the parser needs to know about the reference database to
validate incoming references. This dependency could easily be removed by
moving the validation later in the flow, after the parsing, where other
validations are already happening. Postpone that to keep this patch
about moving unmodified code around.
Change-Id: I7ad29a6b99caa7c12c06f5a7f30ab6a5f6e44dc7
Signed-off-by: Ivan Frade <ifrade@google.com>
At the moment there are two copies of the client shallow commit list:
one in the request and another in the clientShallowCommits member of
the class.
The verifyShallowCommit function was removing missing object ids
from the member but not the request list, and code afterwards was
using the request's version.
In practice, this didn't cause trouble because these shallow commits
are used as endpoints for a walk, and missing ids are just never reached.
Change-Id: I70a8f1fd46de135da09f16e5d954693c8438ffcb
Signed-off-by: Ivan Frade <ifrade@google.com>
This test was never being run. Since it was introduced it was
named "notest.." which meant it didn't run with JUnit3, and
since it is not annotated @Test it also doesn't run with JUnit4.
When compiling with Bazel 0.6.0, error-prone raises an error
that the public method is not annotated with @Ignore or @Test.
Given that the test has never been run anyway, we can just
remove it.
Bug: 525415
Change-Id: Ie9a54f89fe42e0c201f547ff54ff1d419ce37864
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Renamed and restructured tests for improved clarity.
Bug: 479266
Change-Id: Ic9d05ddf722bddd148fa9d9c19248dd53d97f1e4
Signed-off-by: René Scheibe <rene.scheibe@gmail.com>
On recent VMs, collection.toArray(new T[0]) is faster than
collection.toArray(new T[collection.size()]). Since it is also more
readable, it should now be the preferred way of collection to array
conversion.
https://shipilev.net/blog/2016/arrays-wisdom-ancients/
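The preferred form looks like this. The class name is hypothetical; the
toArray call itself is the standard java.util.Collection API.

```java
import java.util.Collection;

/** Hypothetical helper for illustration only, not part of JGit. */
final class ToArraySketch {
    /** Preferred: pass an empty array and let toArray allocate. */
    static String[] toArray(Collection<String> names) {
        return names.toArray(new String[0]);
    }
}
```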
Change-Id: I80388532fb4b2b0663ee1fe8baa94f5df55c8442
Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
If packed refs are used, duplicate updates result in an exception
because JGit tries to lock the same lock file twice. With non-atomic
ref updates, this used to work, since the same ref would simply be
locked and updated twice in succession.
Let's be more lenient in this case and remove duplicates before
trying to do the ref updates. Silently skip duplicate updates
for the same ref, if they both would update the ref to the same
object ID. (If they don't, behavior is undefined anyway, and we
still throw an exception.)
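The deduplication can be sketched as follows. The Update class and the
exception type are hypothetical stand-ins; the real code operates on
JGit's ref update commands.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only, not part of JGit. */
final class RefUpdateDedupSketch {
    static final class Update {
        final String ref;
        final String newId;

        Update(String ref, String newId) {
            this.ref = ref;
            this.newId = newId;
        }
    }

    /**
     * Keeps the first update per ref, silently drops repeats that
     * target the same object id and rejects conflicting ones.
     */
    static List<Update> removeDuplicates(List<Update> updates) {
        Map<String, Update> byRef = new LinkedHashMap<>();
        for (Update u : updates) {
            Update previous = byRef.putIfAbsent(u.ref, u);
            if (previous != null && !previous.newId.equals(u.newId)) {
                throw new IllegalStateException(
                        "conflicting updates for " + u.ref);
            }
        }
        return new ArrayList<>(byRef.values());
    }
}
```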
Add a test that results in a duplicate ref update for a tag.
Bug: 529400
Change-Id: Ide97f20b219646ac24c22e28de0c194a29cb62a5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Bug: 529314
Change-Id: I91eaeda8a988d4786908fba6de00478cfc47a2a2
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Change-Id: I2442394fb7eae5b3715779555477dd27b274ee83
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
The TransportLocal object created in
newTransportLocalWithStrictValidation() closes the original repository
and increments the use count of the test-internal "dst" repository, but
never decrements it. Because of this, the pack file is not closed and,
on Windows, tearDown is unable to delete it.
Bug: 538068
Change-Id: I96df8e91abfee78c91cf26c2466718e9145a69db
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
In Git protocol v2, UploadPack and ReceivePack have the same
capabilities and can process any protocol v2 request. For example, a
client can send a "fetch" command to the "/git-receive-pack" endpoint.
This makes things difficult for existing hook interfaces. For example,
PreUploadHook takes UploadPack, but a "fetch" command may be received by
ReceivePack.
To resolve this skew, this change introduces a different hook interface
for protocol v2. The hook takes a request that is independent of the
handlers (UploadPack, ReceivePack). This also makes it clear which
parameters the hook is counting on, instead of having the hook pull them
through getters from UploadPack / ReceivePack.
Bug: 534847
Change-Id: I71f3266584483db1e2b2edfc1a72d0bdf1bb6041
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
When fetching with protocol v2, git expects the shallow-info section to
appear before wanted-refs if both appear in the response. Teach
UploadPack to do this.
Change-Id: Ie26a91edcce5d27a1d727d7fba5c30e1144e118b
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
ResolveMerger.checkout() and cleanUp() check out files directly and
must honor CR/LF settings and also smudge filters.
Deprecate the 3-argument version of DirCacheCheckout.checkoutEntry().
It isn't used anymore anywhere in JGit (nor in EGit).
Bug: 537410
Change-Id: I062b35401c8bd5bc99deb2f68f91089a0643504c
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
UploadPack already allows the client to send wanted OIDs as "want"
lines. Extend UploadPack to also allow the client to send wanted ref
names as "want-ref" lines when the fetch is done using protocol v2.
The corresponding Git commit is 516e2b76bd ("upload-pack: implement
ref-in-want", 2018-06-28).
To support a two-stage rollout, two configuration variables are
provided: uploadpack.allowrefinwant (default "false") allows clients to
specify "want-ref" in their requests, and uploadpack.advertiserefinwant
(default "true") makes UploadPack advertise this capability. If
uploadpack.allowrefinwant is true but uploadpack.advertiserefinwant is
false, UploadPack will not advertise that it supports "want-ref", but it
will support it.
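The interplay of the two variables can be sketched as follows. The class
is hypothetical. That advertising additionally requires
uploadpack.allowrefinwant to be true is an assumption of this sketch,
inferred from the two-stage rollout described above.

```java
/** Hypothetical helper for illustration only, not part of JGit. */
final class RefInWantConfigSketch {
    final boolean allowRefInWant; // uploadpack.allowrefinwant
    final boolean advertiseRefInWant; // uploadpack.advertiserefinwant

    RefInWantConfigSketch(boolean allow, boolean advertise) {
        this.allowRefInWant = allow;
        this.advertiseRefInWant = advertise;
    }

    /** Defaults as stated above: allow=false, advertise=true. */
    static RefInWantConfigSketch defaults() {
        return new RefInWantConfigSketch(false, true);
    }

    /** Whether "want-ref" lines in a request are accepted. */
    boolean acceptsWantRef() {
        return allowRefInWant;
    }

    /** Assumption: only advertise what is also accepted. */
    boolean advertisesWantRef() {
        return allowRefInWant && advertiseRefInWant;
    }
}
```

In a two-stage rollout, all servers would first run with allow=true and
advertise=false, and advertising would be switched on only once every
server accepts "want-ref".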
Change-Id: I3c24077949640d453af90d81a7f48ce4b8ac9833
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Two submodules at the same path on different branches need not represent
the same repository, and two submodules at different paths can represent
the same one.
The C Git implementation uses the submodule name to internally manage
the submodule repositories under .git/modules. When a submodule
represents different repositories in different branches, this causes a
conflict inside .git/modules.
The current RepoCommand implementation uses submodule paths as the
submodule names. When the manifest file mounts different repositories to
the same path in different branches, this leads to the situation
described above. To solve this issue, we can use the project name
instead of the path as the submodule name.
On the other hand, since repo v1.12.8~3^2 (repo: Support multiple
branches for the same project., 2013-10-11), a manifest file can mount
the same project to different paths. If we naively use the project
name as the submodule name, this causes a conflict in .git/modules, too.
This patch uses the project name as the submodule name by default, but
when the same project is mounted to different paths, it uses the project
name and path as the submodule name.
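The naming rule can be sketched as follows. The Project class is a
hypothetical stand-in for a manifest entry, and joining name and path
with "/" is an assumption of this sketch; the actual format used by
RepoCommand may differ.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only, not part of JGit. */
final class SubmoduleNameSketch {
    static final class Project {
        final String name;
        final String path;

        Project(String name, String path) {
            this.name = name;
            this.path = path;
        }
    }

    /** Maps each mount path to the submodule name chosen for it. */
    static Map<String, String> namesByPath(List<Project> projects) {
        // count how often each project is mounted
        Map<String, Integer> mounts = new HashMap<>();
        for (Project p : projects) {
            mounts.merge(p.name, 1, Integer::sum);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Project p : projects) {
            boolean mountedTwice = mounts.get(p.name) > 1;
            // assumed format for the combined name: "<name>/<path>"
            result.put(p.path,
                    mountedTwice ? p.name + "/" + p.path : p.name);
        }
        return result;
    }
}
```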
Change-Id: I09dc7d62ba59016fe28852d3139a56ef7ef49b8f
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Reported-by: JP Sugarbroad <jpsugar@google.com>
Remove directories under refs/<namespace> entirely when they are empty,
including the first-level partition of the changes.
Bug: 536777
Change-Id: I88304d34cc42435919c2d1480258684d993dfdca
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When processing a fetch using protocol v2, UploadPack#fetchV2 sends an
extraneous flush pkt when also sending a packfile (#sendPack sending its
own flush pkt). Update that method to only send the flush pkt if the
packfile is not being sent.
Change-Id: I7117a264bccd2d7f3a048645fcb8425a9d78d526
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
In a0c9016abd ("upload-pack: send refs' objects despite "filter"",
2018-07-09), Git updated the "filter" option in the fetch-pack
upload-pack protocol to not filter objects explicitly specified in
"want" lines, even if they match the criterion of the filter. Update
JGit to match that behavior.
Change-Id: Ia4d74326edb89e61062e397e05483298c50f9232
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Don't try to delete folders if the merger chooses THEIRS, but all of
BASE, OURS, and THEIRS contain the folder.
Add a test for rebase with auto-stash and subdirectories that
verifies this case. The needless directory deletion and reporting
such directories in getModifiedFiles() was the root cause of bug
536880.
Note that even with this fix, bug 536880 will not be fixed in all cases
yet. There may still be cases where the set of modified files ends
up containing directories. This will be dealt with in EGit where
this set is used. (See https://git.eclipse.org/r/#/c/126242/ .)
Bug: 536880
Change-Id: I62b4571a1c1d4415934a6cb4270e0c8036deb2e9
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
RacyGitTests depend on filesystem timer resolution. We wait for a file
system timer tick, remember that time as t1, modify a file and assume
that this file has a lastmodified of t1.
If this assumption is not fulfilled then ignore the test result.
Bug: 526111
Change-Id: Ia38b7d2f99171ef54b8f9fe5be343cf9fcfd3971
Currently SubmoduleAddCommand always uses the path as submodule name.
This patch lets the caller specify a submodule name.
SubmoduleUpdateCommand still does not make use of the submodule name
(see bug 535027) but Git does. To avoid triggering CVE-2018-11235,
do some validation on the name to avoid '..' path components.
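A minimal sketch of such a name check. The class and method are
hypothetical, and treating the backslash as a separator in addition to
"/" is an assumption made here with Windows paths in mind.

```java
/** Hypothetical helper for illustration only, not part of JGit. */
final class SubmoduleNameCheckSketch {
    /** True if any path component of the name is exactly "..". */
    static boolean hasDotDotComponent(String name) {
        // split on '/' and, as an assumption, also on '\'
        for (String component : name.split("[/\\\\]")) {
            if (component.equals("..")) {
                return true;
            }
        }
        return false;
    }
}
```

A name such as "a..b" passes because ".." only counts as a whole path
component.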
[jn: fleshed out commit message, mostly to work around flaky CI]
Change-Id: I6879c043c6d7973556e2080387f23c246e3d76a5
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Make the method names more consistent and their semantics simpler:
hasRef and seekRef to look up a single exact reference by name and
hasRefsByPrefix and seekRefsByPrefix to look up multiple references by
name prefix.
In particular, splitting hasRef into two separate methods for its
different uses makes DfsReftableDatabase.isNameConflicting easier to
follow.
[jn: fleshed out commit message]
Change-Id: I71106068ff3ec4f7e14dd9eb6ee6b5fab8d14d0b
Signed-off-by: Minh Thai <mthai@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
The reftable implementation of RefDatabase.getRefsByPrefix() should be
more performant, as references are filtered directly by prefix instead
of fetching the whole subtree and then filtering by prefix.
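The general idea of seeking by prefix in sorted data can be sketched
with a sorted map. This only illustrates the technique and is not the
reftable code; the class name and the map of ref name to object id are
hypothetical.

```java
import java.util.NavigableMap;
import java.util.SortedMap;

/** Hypothetical helper for illustration only, not part of JGit. */
final class PrefixLookupSketch {
    /**
     * Returns a view of all entries whose key starts with the prefix by
     * seeking into the sorted map instead of scanning every entry.
     * Simplification: keys containing '\uffff' right after the prefix
     * are not covered.
     */
    static SortedMap<String, String> byPrefix(
            NavigableMap<String, String> refs, String prefix) {
        return refs.subMap(prefix, true,
                prefix + Character.MAX_VALUE, false);
    }
}
```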
Change-Id: If4f5f8c08285ea1eaec9efb83c3d864cea7a1321
Signed-off-by: Minh Thai <mthai@google.com>