FileWriter uses the platform default encoding, which might not
be UTF-8. JGit prefers UTF-8 everywhere for string encodings,
so make the unit tests more predictable by ensuring use of UTF-8.
Change-Id: I75bb9f962ee230b73ca3a942bffd7a8a28674ba5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The java.io.File methods for creating directories report failure by
returning false. To ease proper checking of return values provide
utility methods wrapping mkdir() and mkdirs() which throw IOException
on failure.
Also fix the tests to store test data under a trash folder and cleanup
after test.
Change-Id: I09c7f9909caf7e25feabda9d31e21ce154e7fcd5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This is needed to ensure interoperability with the command line: if
the git-rebase-todo file was created manually (by git rebase -i in the
command line), and any commands other than pick are used (reword,
edit, fixup, squash) JGit must abort as it does not understand these
commands yet.
The same is true if an unknown command is found (e.g. due to a typo);
this is the same behavior as shown by the command line.
Change-Id: I2322014f69460361f7fc09da223e8a5c31f100dd
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Always use streaming (for SHA-checksum & collision detection)
when indexing whole blobs, regardless of their size.
Positives:
* benefits of bugfix #312868 will apply to all runtimes, without
additional conf for mem-constrained JVMs (5MB huge for some)
* no byte array allocation
(re-uses readBuffer instead of allocating new full-size array)
* mildly better overall performance
(given the usual blob-does-not-need-collision-checking case)
* removes unnecessary code
Negative:
* doubles the disk IO for a blob comparision
(comparitively rare occurance)
I perf-tested a range of threshold sizes against a random selection
of packfiles I found on my harddrive, the results are here:
https://spreadsheets.google.com/ccc?key=tLCQElyyd2RKN9QevfvgwGQ&hl=en_GB#gid=1
My interpretation of the results is that the streaming size threshold
isn't beneficial (actually seems to be very slightly detrimental) -so
we should just get rid of it. This tallies with some of the comments
Shawn & I had for the default value of streamFileThreshold in the
review for I862afd4c:
http://egit.eclipse.org/r/#patch,sidebyside,2040,2,org.eclipse.jgit/src/org/eclipse/jgit/transport/IndexPack.java
The perf-test code is here: https://gist.github.com/735402
It's a bit scruffy but basically does 10 runs (in randomised order)
for each threshold size on various packfiles, waiting a second
between each pack-indexing to allow GC to catch up. I know it's not
perfect - proper perf testing is hard to do :-)
For convenience provide an option to skip deletion of non-existing
files. Also add some tests for deletion methods in FileUtils.
Change-Id: I33e355cfcdc19367d50208150ee49a4a06394890
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This change contains a bunch of unit tests for the newly introduced
IndexDiffFilter. With these tests the code coverage of
IndexDiffFilter.include() is now 100%, i.e. every special case is
tested at least once.
Change-Id: Ib248d1cd16084f9c8e099006af151814c63c5941
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Rebase would update the original HEAD to the wrong commit when
"skipping" the last commit after a merged commit.
Includes a test for the specific situation.
Change-Id: I087314b1834a3f11a4561f04ca5c21411d54d993
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
If a treewalk walks also over index and the workingtree then the
IndexDiffFilter filter can be used which works much faster then
the semantically equivalent ANY_DIFF filter. This is because this
filter can better avoid computing SHA-1 ids over the content of
working-tree files which is very costly.
This fix will significantly improve the performance of e.g.
EGit's commit dialog.
Change-Id: I2a51816f4ed9df2900c6307a54cd09f50004266f
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
For --continue, the Rebase command asserts that there are no unmerged
paths in the current repository. Then it checks if a commit is needed.
If yes, the commit message and author are taken from the author_script
and message files, respectively, and a commit is performed before the
next step is applied.
For --skip, the workspace is reset to the current HEAD before applying
the next step.
Includes some tests and a refactoring that extracts Strings in the
code into constants.
Change-Id: I72d9968535727046e737ec20e23239fe79976179
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
The isModified() is more efficient because it can skip over files that
are stat clean, without needing to scan them.
This is useful to efficently work on paths that were already staged
and thus differ between HEAD and the index, but not between the index
and the working tree.
Change-Id: I4418202e612f0571974e0898050d987c6c280966
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
To improve runtime performance, caching the WorkingTreeOptions inside
of the Config object using the Config.SectionParser API allows
the WorkingTreeOptions to be accessed more efficiently whenever a
FileTreeIterator is constructed for the Repository.
Instead of passing the filemode handling option into isModified(),
the WorkingTreeIterator should always honor whatever setting has
been configured in this repository, as defined by its own copy of
the WorkingTreeOptions. This simplifies all of the callers as they
no longer need to lookup core.filemode on their own.
A few locations were changed from always using a hardcoded "true"
on the file mode to passing what is actually configured in the
repository. This is a behavior change, but corrects what should be
considered to be bugs as the core.filemode variable wasn't always
being used.
Change-Id: Idb176736fa0dc97af372f1d652a94ecc72fb457c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When indexing large blobs that are stored whole (non-delta form),
avoid allocating the entire blob in memory and instead stream it
through the SHA-1 checksum computation. This reduces the size
of memory required by IndexPack when processing very big blobs,
such as a 500 MiB uncompressable binary.
If the large blob already exists in the local repository, its
contents needs to be compared byte-for-byte after the entire pack
has been indexed, to ensure there isn't an unexpected SHA-1 collision
which may result in later data corruption. This compare is performed
as a streaming compare, again avoiding the large object allocation.
This change doesn't improve on memory utilization for large objects
stored as deltas. The change also doesn't improve handling for
any large commits, trees or annotated tags. There isn't much to
be done here for those objects, because they need to be passed down
to the ObjectChecker as a byte[]. Fortunately it isn't common for
these object types to be that large,
Bug: 312868
Change-Id: I862afd4cb78013ee033d4ec68c067b1774a05be8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Roberto Tyley <roberto.tyley@guardian.co.uk>
Its confusing that a new TreeWalk() needs to have reset() invoked
on it before addTree(). This is a historical accident caused by
how TreeWalk was abused within ObjectWalk.
Drop the initial empty tree from the TreeWalk and thus remove a
number of pointless reset() operations from unit tests and some of
the internal JGit code.
Existing application code which is still calling reset() will simply
be incurring a few unnecessary field assignments, but they should
consider cleaning up their code in the future.
Change-Id: I434e94ffa43491019e7dff52ca420a4d2245f48b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
java.io.File.delete() reports failure as an exceptional
return value false. Fix the code which silently ignored
this exceptional return value. Also remove some duplicate
deletion helper methods.
Change-Id: I80ed20ca1f07a2bc6e779957a4ad0c713789c5be
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
These objects don't need to be updated with the resulting ObjectId of
the formatted content, callers can get that from the ObjectInserter on
their own.
Change-Id: Idc5f097de9f7beafc5e54e597383d82daf9d7db4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
The correct names for these is build(), as that is what a Java
developer will expect given the "builder" pattern.
Bug: 323541
Change-Id: I35042bdc95a955beeaee29e54bde10e4240b2a71
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
When in OURS and THEIRS a new file is created we want a conflict
when the two contents differ. If on two branches the same file
with the same content is created this should not be a conflict.
But: the current merge algorithm is throwing NPEs in this case.
Fix this by choosing an empty RawText as common base if the
base is empty.
Change-Id: I21cb23f852965b82fb82ccd66ec961c7edb3ac3d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
If the CLI stops a rebase upon conflict, the current
step is already popped from the git-rebase-todo and appended to the
"done" file. The current implementation wrongly pops the step only
after successful cherry-pick.
Change-Id: I8640dda0cbb2a5271ecf75fcbad69410122eeab6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
The Repository is then in state "Rebase interactive".
Change-Id: I5d2de57f8670e1d4c71ed22509ab17f04e2561b5
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
The IndexDiff had not collected the info if the flag
"assume-unchanged" is set. This information is useful for clients
which may want to decide if specific actions are allowed on a file.
Bug: 326213
Change-Id: I14bb7b03247d6c0b429a9d8d3f6b10f21d8ddeb1
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
When the assume unchanged flag is set the Add command must not update
the index for this file if any changes are present in the working
directory.
Bug: 331351
Change-Id: I255870f689225a1d88971182e0eb377952641b42
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
The referenced bug showed that JGit produced different merge results
compared to C Git. Unit test was added to reproduce the issue. The
problem can be solved by switching to histogram diff algorithm.
Bug: 331078
Change-Id: I54f30afb3a9fef1dbca365ca5f98f4cc846092e3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
The diff algorithm which is used by Merge, Cherry-Pick, Rebase
should be configurable. A new configuration parameter "diff.algorithm"
is introduced which currently accepts the values "myers" or
"histogram". Based on this parameter for example the ResolveMerger
will choose a diff algorithm. The reason for this is bug 331078.
This bug shows that JGit is more compatible with C Git when
histogram diff is in place. But since histogram diff is quite new we
need an easy way to fall back to Myers diff.
Bug: 331078
Change-Id: I2549c992e478d991c61c9508ad826d1a9e539ae3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Check for deletion of temporary files in .git folder.
Check for deletion and creation of files.
Change-Id: I60b0b2975724f2e3582e8674d9f876dcbf62b350
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Coverage tests showed that we are missing to test certain areas
in the rebase command. Add the missing tests.
Change-Id: Ia4a272d26cde7e1861dac30496e4b6799fc8187a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Add the ability to checkout a branch to the working tree.
Bug: 330860
Change-Id: Ie06b9e799a9e1be384da0b8996efa7209b32eac3
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
There was a bug introduced by commit 0e815fe. For non-versioned files
the merge algorithm detected an incoming deletion from THEIRS.
Consequently such files were deleted. That's a severe bug which was
fixed by more precisely detecting incoming deletions.
Change-Id: I4385d3c990db11d62e371a385dc8ee89841db84a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This is a first iteration to implement Rebase. At the moment, this
does not implement --continue and --skip, so if the first
conflict is found, the only option is to --abort the command.
Bug: 328217
Change-Id: I24d60c0214e71e5572955f8261e10a42e9e95298
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
In the case where DirCacheCheckout was used to checkout a tree
without taking HEAD into account (e.g. during a clone or hard reset)
we didn't handle conflicts correctly. E.g. if there are conflicts
(entries with stage != 0) in the index and we tried to hard reset
we have been processing the conflicting pathes multiple times (once
for every stage). With this fix we will update the index with the
entry from the "merge" state (the state we want checkout) when we
detect existing conflicts.
Change-Id: Iffbddccaa588cf0d1460a5e44dabaf540d996e26
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Leaf level notes trees are split into a new fan-out tree if an
insertion occurs and the tree already contains >= 256 notes in it.
The splitting may occur multiple times if all of the notes have the
same prefix; in the worst case this produces a tree path such as
"00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/be" if all
of the notes begin with zeros.
Change-Id: I2d7d98f35108def9ec49936ddbdc34b13822a3c7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is necessary to allow applications to wrap the note tree in
a commit and update the note branch with the new state.
Change-Id: Idbd7ead4a1b16ae2b64a30a4a01a29cfed548cdf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
NoteMap now supports editing in-memory, allowing applications to
modify the NoteMap once it has been loaded from the branch. The
ability to write the branch back to tree objects is not yet done,
so the edits are strictly transient.
Change-Id: I63448954abfca2a8e3e95369cd84c0d1176cdb79
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Files bigger than 8 MB (2^23 bytes) tended to overflow the internal
hashtable, as the table was capped in size to 2^17 records. If a
file contained 2^17 unique data blocks/lines, the table insertion
got stuck in an infinite loop as the able couldn't grow, and there
was no open slot for the new item.
Remove the artifical 2^17 table limit and instead allow the table
to grow to be as big as 2^30. With a 64 byte block size, this
permits hashing inputs as large as 64 GB.
If the table reaches 2^30 (or cannot be allocated) hashing is
aborted. RenameDetector no longer tries to break a modify file pair,
and it does not try to match the file for rename or copy detection.
Change-Id: Ibb4d756844f4667e181e24a34a468dc3655863ac
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The NoteMap makes it easy to read a small notes tree as created by
the `git notes` command in C Git. To make the initial implementation
simple a notes tree is read recursively into a map in memory.
This is reasonable if the application will need to access all notes,
or if there are less than 256 notes in the tree, but doesn't behave
well when the number of notes exceeds 256 and the application
doesn't need to access all of them.
We can later add support for lazily loading different subpaths,
thus fixing the large note tree problem described above.
Currently the implementation only supports reading. Writing notes
is more complex because trees need to be expanded or collapsed at
the exact 256 entry cut-off in order to retain the same tree SHA-1
that C Git would use for the same content. It also needs to retain
non-note tree entries such as ".gitignore" or ".gitattribute" files
that might randomly appear within a notes tree. We can also add
writing support later.
Change-Id: I93704bd84ebf650d51de34da3f1577ef0f7a9144
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Test was broken by commit b087bba3 changing formatting of merge
commit messages.
Change-Id: I98b1b936b9b6cbaa50fbc59d243a43e66a6ee9f9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We stopped handling URIs such as "example.com:/some/p ath", because
this was confused with the Windows absolute path syntax of "c:/path".
Support absolute style scp URIs again, but only when the host name
is more than 2 characters long.
Change-Id: I9ab049bc9aad2d8d42a78c7ab34fa317a28efc1a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
IndexDiff was extended to detect files which are both removed from the
index and untracked. Before this change these files were only added
to the removed collection.
Change-Id: I971d8261d2e8932039fce462b59c12e143f79f90
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There was a bug in ResolveMerger which is one reason for
bug 328841. If a merge was failing because of conflicts
deletions where not handled correctly. Files which have
to be deleted (because there was a non-conflicting deletion
coming in from THEIRS) are not deleted. In the
non-conflicting case we also forgot to delete the file but
in this case we explicitly checkout in the end these files
get deleted during that checkout.
This is fixed by handling incoming deletions explicitly.
Bug: 328841
Change-Id: I7f4c94ab54138e1b2f3fcdf34fb803d68e209ad0
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
The automatically generated commit message of a merge should have the
same structure as in C Git for consistency (as per git fmt-merge-msg).
Before this change:
merging refs/heads/a into refs/heads/master
After:
Merge branch 'a'
Plurals, "into" and joining by "," and "and" also work.
Change-Id: I9658ce2817adc90d2df1060e8ac508d7bd0571cb
There were also some compiler warning due to empty catch blocks that
were fixed.
Change-Id: I165bcddcdfacd34f020d1b938a41954916eb106e
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This mirrors the getByte() API in ObjectId and allows the caller to
modify a single byte, which is useful when updating it as part of a
loop walking through 0x00..0xff inside of a range of objects.
Change-Id: I57fa8420011fe5ed5fc6bfeb26f87a02b3197dab
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Processing git notes requires random access to part of the raw data
of each ObjectId... which isn't easy because ObjectIds are stored
with an internal representation of 5 ints. Expose random access
to the individual data bytes through new methods, avoiding the
need to convert first to a byte[20].
Change-Id: I99e64700b27fc0c95aa14ef8ad46a0e8832d4441
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The JGit merge algorithm or the Merge Command may have problems with handling
deletions always correctly. Therefore one additional test is added to check
this.
Change-Id: Id6aa49136996b29047c340994fe7faba68858e8c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
JGit merge algorithm behaved differently from C Git when
we had adjacent modifications. If line 9 was modified by
OURS and line 10 by theirs then C Git will return a
conflict while JGit was seeing this as independent
modifications. This change is not only there to achieve
compatibility, but there where also some really wrong
merge results produced by JGit in the area of adjacent
modifications.
Change-Id: I8d77cb59e82638214e45b3cf9ce3a1f1e9b35c70
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Introduced similar helper methods than in AbstractDiffTestCase.
Then the test cases are much smaller and better understandable.
Change-Id: I2beb4db5a93bd8c0c1238d5d3039cbd6719eee90
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>