Core classes to parse and process .gitattributes files including
support for reading attributes in WorkingTreeIterator and the
dirCacheIterator.
The implementation follows the git ignore implementation. It supports
lazy reading attributes while walking the working tree.
Bug: 342372
CQ: 9078
Change-Id: I05f3ce1861fbf9896b1bcb7816ba78af35f3ad3d
Also-by: Marc Strapetz <marc.strapetz@syntevo.com>
Also-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Also-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
C Git trims name and email before inserting them into the commit header
so that " A U Thor " and " author@example.com " becomes
"A U Thor <author@example.com>" with a single separating space.
This changes PersonIdent#toExternalString() to trim name and email
before concatenating them.
Change-Id: Idd77b659d0db957626824f6632e2da38d7731625
Signed-off-by: Rüdiger Herrmann <ruediger.herrmann@gmx.de>
Native Git canonicalizes line endings when detecting
renames, more specifically it replaces CRLF by LF.
See: hash_chars in diffcore-delta.c
Bug: 449545
Change-Id: Iec2aab12ae9e67074cccb7fbd4d9defe176a0130
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The path matcher should not fail if the rule ends with trailing slash,
target pattern does not ends with the slash and the "assumeDirectory"
flag is set.
E.g. */bin/ should also match a/bin if this pattern is threated as
directory by WorkingTreeIterator (FileMode.TREE).
The old code/tests have never tested directory rules with patterns
*without* trailing slashes but with the "assumeDirectory" flag set.
Unfortunately this is exactly what WorkingTreeIterator does... The tests
are changed to test *both* cases now (with trailing slash and without)
if the target pattern has trailing slash (represents directory).
Bug: 454672
Change-Id: I621c1644d9e94df3eb9f6f09c6de0fe51f0950a4
Also-by: Markus Duft <markus.duft@salomon.at>
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Mac's HFS+ folds concatentations of ".git" and ignorable Unicode
characters [1] to ".git" [2]. Hence we need to disallow all names which
could potentially be a shortname for ".git". Example: in an empty
directory create a folder ".g\U+200Cit". Now you can't create another
folder ".git".
The following characters are ignorable Unicode which are ignored on
HFS+:
unicode hex name
-------------------------------------------------
U+200C 0xe2808c ZERO WIDTH NON-JOINER
U+200D 0xe2808d ZERO WIDTH JOINER
U+200E 0xe2808e LEFT-TO-RIGHT MARK
U+200F 0xe2808f RIGHT-TO-LEFT MARK
U+202A 0xe280aa LEFT-TO-RIGHT EMBEDDING
U+202B 0xe280ab RIGHT-TO-LEFT EMBEDDING
U+202C 0xe280ac POP DIRECTIONAL FORMATTING
U+202D 0xe280ad LEFT-TO-RIGHT OVERRIDE
U+202E 0xe280ae RIGHT-TO-LEFT OVERRIDE
U+206A 0xe281aa INHIBIT SYMMETRIC SWAPPING
U+206B 0xe281ab ACTIVATE SYMMETRIC SWAPPING
U+206C 0xe281ac INHIBIT ARABIC FORM SHAPING
U+206D 0xe281ad ACTIVATE ARABIC FORM SHAPING
U+206E 0xe281ae NATIONAL DIGIT SHAPES
U+206F 0xe281af NOMINAL DIGIT SHAPES
U+FEFF 0xefbbbf ZERO WIDTH NO-BREAK SPACE
[1] http://www.unicode.org/versions/Unicode7.0.0/ch05.pdf#G40025http://www.unicode.org/reports/tr31/#Layout_and_Format_Control_Characters
[2] http://dubeiko.com/development/FileSystems/HFSPLUS/tn1150.html#UnicodeSubtleties
Change-Id: Ib6a1dd090b2649bdd8ec16387c994ed29de2860d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Windows creates shortnames for all non-8.3 files (see [1]). Hence we
need to disallow all names which could potentially be a shortname for
".git". Example: in an empty directory create a folder "GIT~1". Now you
can't create another folder ".git".
The path "GIT~1" may map to ".git" on Windows. A potential victim to
such an attack first has to initialize a git repository in order to
receive any git commits. Hence the .git folder created by init will get
the shortname "GIT~1". ".git" will only get a different shortname if the
user has created a file "GIT~1" before initialization of the git
repository.
[1] http://en.wikipedia.org/wiki/8.3_filename
Change-Id: I9978ab8f2d2951c46c1b9bbde57986d64d26b9b2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Windows treats "foo." and "foo " as "foo". The ".git" directory is
special, as it contains metadata for a local Git repository. Disallow
variations that Windows considers to be the same.
Change-Id: I28eb48859a95a89111b4987c91de97557e3bb539
The component name ".GIT" inside a tree entry could confuse a
case insensitive filesystem into looking at a submodule and
not a directory entry.
Disallow any case permutations of ".git" to prevent this
confusion from entering a repository and showing up at a
later date on a case insensitive system.
Change-Id: Iaa3f768931d0d5764bf07ac5f6f3ff2b1fdda01b
If the DirCache contains a path that is known to be invalid, refuse to
read the DirCache into memory. This avoids confusing errors later if
an invalid path read from the DirCache were to be passed into a new
DirCacheEntry constructor.
Change-Id: Ic033d81e23a5fbd554cc4dff80a232504562ffa8
isValidPath is an older simple form of the validation performed by
checkValidPath. Use the latter as it more consistently matches
git-core's validation rules.
By running the same validation as fsck, callers creating an entry
for the DirCache are more likely to learn early they are trying
to build trees that will fail fsck.
Change-Id: Ibf5ac116097156aa05c18e231bc65c0854932eb1
Windows does not like naming files "a.". The trailing "." may be
dropped by the filesystem, which is confusing. Even though these
tests currently do not write to disk, future tests like them might.
Replace "." with "-", which has the same sorting properties that
were desirable about ".", but does not have the same limitations.
Change-Id: Ie5b7594bf5e79828d1341883c73ddb70123d5055
When updating a submodule (e.g. during recursive clone) the repository
for the submodule should be located at <gitdir>/modules/<submodule-path>
whereas the working tree of the submodule should be located at
<working-tree>/<submodule-path> (<gitdir> and <working-tree> are
associated to the containing repository). Since CloneCommand has learned
about specifying a separate gitdir this is easy to implement in
SubmoduleUpdateCommand.
Change-Id: I9b56a3dfa50f97f6975c2bb7c97b36296f331b64
This feature is needed to support the new submodule layout where the
.git folder of the submodules is under .git/modules/<submodule>.
Change-Id: If5f13426cfd09b7677e23478e9700c8c25a6dae5
Native git's "init" command allows to specify the location of the .git
folder with the option "--separate-git-dir". This allows for example to
setup repositories with a non-standard layout. E.g. .git folder under
/repos/a.git and the worktree under /home/git/a. Both directories
contain pointers to the other side: /repos/a.git/config contains
core.worktree=/home/git/a . And /home/git/a/.git is a file containing
"gitdir: /repos/a.git". This commit adds that option to InitCommand.
This feature is needed to support the new submodule layout where the
.git folder of the submodules is under .git/modules/<submodule>.
Change-Id: I0208f643808bf8f28e2c979d6e33662607775f1f
Without explicitly closing repos we can't delete the test repositories
on windows.
Change-Id: Id5fa17bd764cbf28703c2f21639d7e969289c2d6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
To update the file length stat we need to use the length of the
temporary file since it's not yet renamed to the target file name here.
The incorrect file length stat update was introduced in
a606dc363d.
Bug: 453962
Change-Id: I715c048227553efae6f8f6b6878c0f04f2609d9c
Also-by: Konrad Kügler <swamblumat-eclipsebugs@yahoo.de>
Also-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In case of an add/add conflict, no base stage exists. The previous
implementation would skip over the entries because the condition
expected the base stage to always exist.
Change-Id: Ie2b3685d958c09b241991b74e6177401e8a1ebc9
Signed-off-by: Robin Stocker <robin@nibor.org>
The change tries to make jgit behave more like native CLI git regarding
the negation rules. According to [1] "... prefix "!" which negates the
pattern; any matching file excluded by a previous pattern will become
included again." Negating the pattern should not automatically make the
file *not ignored* - other pattern rules have to be considered too.
The fix adds test cases for both bugs 448094 and 407475.
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
Bug: 448094
Bug: 407475
Change-Id: I322954200dd3c683e3d8f4adc48506eb99e56ae1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Entries should only be written to the working tree managed by the
Repository. Simplify callers by passing only the entry and computing
the work tree location inside of the checkoutEntry method.
Change-Id: I574e41280d0407f1853fda12f4bd0d30f75d74e7
Encourage callers to explicitly name a directory to hold any
overflow data. Call sites have more information about what is
going into the buffer and how it should be protected at the
filesystem level than just throwing content to the system wide
temporary directory.
Callers that still really don't care (or need to care) can pass
null for the File argument to have the system directory used.
Change-Id: I89009bbee49d3850d42cd82c2c462e51043acda0
Increase the in-memory buffer for the TREE extension to 5 MiB, and
overflow to $GIT_DIR instead of /tmp. Using a larger buffer reduces
the chances a repository will overflow and need to spool the extension
to disk. Using $GIT_DIR allows the TREE extension contents to have
the same file system protections as the final $GIT_DIR/index.
Wrap the entire thing in a try/finally to ensure the temp file is
deleted from disk after the block has finished using it. To avoid
dangling NFS files, LocalFile.destroy() does close the local file
before deleting it.
Change-Id: I8f871181a4689e3ebf0cdd4fd1769333cf7546c3
Callers should manage the ObjectReader, as this allows the JGit library to cache
context relevant information across files checked out at the same time. If the
caller only has one file to checkout, it should still explicitly manage the life
span of the ObjectReader.
Change-Id: Ib57fba6cb4b774ccff8c416ef4d32e2b390f16a9
To improve ignore parser performance we can avoid using java.util.regex
code on simple wildcard patterns with leading or trailing asterisk. As
those patterns represent a majority of ignore rules, the index diff
performance can be drastically increased on huge repository with lot of
ignore rules.
Bug: 450466
Change-Id: I80428441cc8d5de5468813f841d89322413eed8b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Actually the test only allows a range from [1,255], so let's name the
variable so.
Change-Id: Iecdb8149b83389c67e3cd2f64f4a654c175475be
Signed-off-by: Stefan Beller <sbeller@google.com>
JGit style is to import exactly the classes required, and never
to use "import foo.*" as the foo package could add new classes
in the future which are conflicting/confusing with the imports
already used by a source file.
Change-Id: I5693408c777e5843ec65fff1163d5d717849fa34
The latest changes to IndexDiff just assumed that all configured
submodules are allways cloned. If a configured submodule did not exist
an exception was thrown. This is fixed by this commit.
Bug: 450567
Change-Id: Iabe3b196d998c19483082e5720038ebddaeb1890
In a situation where a certain path was ignored but a working tree file
with this path existed jgit didn't allow to checkout a branch which
didn't ignore this path but contained different content. JGit considered
this to be a checkout conflict to prevent overwriting the file in the
working tree and raised an error. This commit fixes this by ensuring
that ignored dirty working tree files don't lead to a checkout conflict.
Bug: 450169
Change-Id: I90288d314ffac73c24a9c70a5181f8243bd4679a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
For each submodule native git allows to configure which modifications to
submodules should be ignored by the status command. It is possible to
ignore "none", "all", "dirty", "untracked" [1]. This configuration is
now supported by IndexDiff. The StatusCommand offers the possibility to
specify this mode.
[1] http://git-scm.com/docs/gitmodules
Change-Id: Ifd81d574a680f9b4152945ba70f8ec4af4f452c9
The current IgnoreRule/FileNameMatcher implementation scales not well
with huge repositories - it is both slow and memory expensive while
parsing glob expressions (bug 440732). Addtitionally, the "double star"
pattern (/**/) is not understood by the old parser (bug 416348).
The proposed implementation is a complete clean room rewrite of the
gitignore parser, aiming to add missing double star pattern support and
improve the performance and memory consumption.
The glob expressions from .gitignore rules are converted to Java regular
expressions (java.util.regex.Pattern). java.util.regex.Pattern code can
evaluate expression from gitignore rules considerable faster (and with
less memory consumption) as the old FileNameMatcher implementation.
CQ: 8828
Bug: 416348
Bug: 440732
Change-Id: Ibefb930381f2f16eddb9947e592752f8ae2b76e1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ResetCommand threw an NPE if neither mode nor path was defined. Instead
it should default to a mixed reset like native git does.
Change-Id: I455902394f9e7b0c7afae42381f34838f7f2a138
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When writing new packs it should be allowed to specify objects as "have"
(objects which should not be included in the pack) which do not exist in
the local repository.
This works with the traditional PackWriter, but when PackWriter was
working on a repository with bitmap indexes and used
PackWriterBitmapWalker then this feature was broken. Non-existing "have"
objects lead to MissingObjectExceptions. That broke push and Gerrit
replication. When the replication target had branches unknown to the
replication source then the source repository wanted to build pack files
where "have" included branch-tips which were unknown in the source
repository.
Bug: 427107
Change-Id: I6b6598a1ec49af68aa77ea6f1f06e827982ea4ac
Also-by: Matthias Sohn <matthias.sohn@sap.com>