Implements a garbage collector for FileRepositories. Main ideas are
copied from the garbage collector for DFS based repos
(DfsGarbageCollector). Added functionalities are
- pruning loose objects
- handling of the index
- packing refs
- handling of reflogs (objects referenced from reflog will not be
pruned/)
These are features of a GC which are not handled in this change and
which should come with subsequent changes:
- unpacking packed objects into loose objects (to support that pruning
packed objects doesn't delete them until they are older than two weeks)
- expiration of reflogs
- support for configuration parameters (e.g. gc.pruneExpire)
Change-Id: I14ea5cb7e0fd1b5c50b994fd77f4e05bfbb9d911
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Currently, after a merge/cherry-pick/rebase, all index entries are
smudged as the ResolveMerger never sets entry lengths and/or
modification times. This change teaches it to re-set them at least for
things it did not touch. The other entries are then repaired when the
index is persisted, or entries are checked out.
The first attempt to get this in was commit
3ea694c252 which has been reverted.
Since then some fixes to ResolveMerger and a few more tests have
been added which check situations where the index is not matching
HEAD before we merge.
Change-Id: I648fda30846615b3bf688c34274c6cf4bc857832
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Also-by: Markus Duft <markus.duft@salomon.at>
Currently, after a merge/cherry-pick/rebase, all index entries are
smudged as the ResolveMerger never sets entry lengths and/or
modification times. This change teaches it to re-set them at least for
things it did not touch. The other entries are then repaired when the
index is persisted, or entries are checked out.
Change-Id: I0944f2017483d32043d0d09409b13055b5609a4b
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Otherwise applying will fail with a FileNotFoundException, because
File.createNewFile() fails with missing parents.
Contains change & according test.
Change-Id: I970522b549b8bb260ca6720da11f12c57ee8a492
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
An example where this is necessary is when a whole directory was deleted
and checkout is used to restore a file which was in that directory.
Bug: 372133
Change-Id: I1d45e0a5d2525fe1fdfbf08c9c5c166dd909e9fd
Signed-off-by: Robin Stocker <robin@nibor.org>
This was broken in fe1f1b8f8a, which
preferred the index over the working tree when both were present.
Change-Id: I97dcf9a088adcbd0187fa7eec9ef34445ce3a981
Signed-off-by: Kevin Sawicki <kevin@github.com>
When starting a rebase with C Git, there may be empty lines in the
git-rebase-todo file. Before this change, JGit would fail to parse the
file with e.g. the following exception:
JGitInternalException: Unknown or unsupported command "
#", only "pick" is allowed.
This happened when there was an empty line just before the comments,
because the nextSpace would be the one from the comment. Now the empty
lines are ignored by checking for nextSpace < ptr outside of the loop.
Change-Id: I94ad299f367c846e7729c74f49c6b8f93f75ae81
Signed-off-by: Robin Stocker <robin@nibor.org>
Before, the paths to delete were stored in a HashMap, which doesn't have
a particular order. So when e.g. both the file "a/b" and the directory
"a" were to be deleted, it would sometimes try to delete "a" first. This
resulted in a failed path because File#delete() fails when a directory
isn't empty.
With this change, an ArrayList is used for storing the paths to delete.
The list contains the paths in a top-down order, as defined by the order
of processEntry. When the files are deleted, the list is iterated in
reverse, ensuring that all files of a directory are deleted before the
directory itself.
Bug: 354099
Change-Id: I6b2ce96b3932ca84ecdfbeab457ce823c95433fb
Signed-off-by: Robin Stocker <robin@nibor.org>
ResolveMerger#mergeImpl() was only returning false (= failed) when there
were unmerged paths. In the case when there were only failing paths, it
returned true.
Because MergeCommand looks at the return value for determining if the
merge failed, it would fall into the successful case there, where it
should instead return a MergeResult with MergeStatus.FAILED.
This change adds a test case for this and makes the ResolveMerger return
false when there are failing paths.
This was discovered while working on fixing bug 354099 and is needed for
its test case.
Bug: 354099
Change-Id: I499f518f6289ef93e017db924b2aa857f2154707
Signed-off-by: Robin Stocker <robin@nibor.org>
Whenever a call to JGit returns a Repository the caller should make sure
to call close() on it if he doesn't need it anymore. Since instances of
Repository contain e.g. open FileOutputStreams (for pack files)
forgetting to close the repository can lead to resource leaks.
This was the reason why dozens of the JUnit tests failed on Windows
with "Can't delete file ...." errors.
In LocalDiskRepositoryTestCase.tearDown() we tried to delete the
repositories we used during tests which failed because we had open
FileOutputStreams.
Not only the obvious cases during Clone or Init operations returned
Repositories, but also the new SubModule API created repository
instances. In some places we even forgot to close submodule repositories
in our internal coding.
To see the effects of this fix run the JGit JUnit tests under Windows.
On other platforms it's harder to see because either the leaking
resources don't lead to failing JUnit tests (on Unix you can delete
files with open FileOutputStreams) or the java gc runs differently and
cleans up the resources earlier.
Change-Id: I6d4f637b0d4af20ff4d501db091548696373a58a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When receiving a pack, data buffered after the pack can restored
to the InputStream if the stream supports mark and reset.
Change-Id: If04915c32c91be28db8df7e8491ed3e9fe0e1608
LockFileTest was failing on Windows because we couldn't delete the lock
file of the index. The reason was that a LockFile instance still had an
open handle to the lock file preventing us to delete the file (in
contrast to the behavior on other platforms).
Change-Id: I1d50442b7eb8a27f98f69ad77c5e24a9698a7b66
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
It is not always appropriate to use the .gitmodules file from the
working tree, for example if reading the modules at a specific commit.
And sometimes it is impossible, as in a bare repository.
When using the static factory methods, automatically set up the
appropriate root tree so lazy loading of the config file reads from
the appropriate place. Leave the current behavior of looking in the
working tree as a fallback for the case where walking the index.
Change-Id: I71b7ed3ba16c80b0adb8c5fd85b5c37fd4aef8eb
Relax the read() method to not block until exactly "len" bytes have
been read. Instead, return when one or more bytes have been read, up
to "len", so UnionInputStream more closely resembles InputStream's
read() method.
Change-Id: I3f632be8eb85a4a0baf27c9f067c8d817162de2b
This updates the timestamp of files that are not touched during
checkout. Otherwise the timestamp will always be zero, causing the
IndexDiffFilter to always calculate the checksum of file contents.
Change-Id: I18047f5725f22811bb4194ca1d3a3cac56074183
Add isModeDifferent method to WorkingTreeIterator
that compares mode with consideration of the
core.filemode setting in the config.
Bug: 379004
Change-Id: I07335300d787a69c3d1608242238991d5b5214ac
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
This allows findGitDir to be used for repositories containing
a .git file with a gitdir: ref to the repository's directory
such as submodule repositories that point to a folder under the
parent repository's .git/modules folder
Change-Id: I2f1ec7215a2208aa90511c065cadc7e816522f62
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
PackWriter supports excluding objects from being written to the pack.
You may specify a PackIndex which lists all those objects which should
not go into the new pack. This feature was broken because not all
commits have been checked whether they should be excluded or not. For
other object types the exclude algorithm worked. This commit adds the
missing check.
Change-Id: Id0047098393641ccba784c58b8325175c22fcece
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Write the old object id from the RefUpdate to the
ORIG_HEAD file after the update completes.
Add two new convenience methods to Repository to read
and write the ORIG_HEAD reference similar to the methods
for reading/writing CHERRY_PICK_HEAD and MERGE_HEAD.
Bug: 375525
Change-Id: I120b3b2cd3b1ddae88fce435285bae15cbf96f5e
All commands should throw a GitAPIException so new exceptions can be
added without breaking the builds of old code, i.e. anyone that calls
a Git API should catch GitAPIException and not just the currently known
exceptions.
Now the only checked exceptions on Git API calls are GitException and
subclasses of it. New checked exceptions that are subclasses of
GitException may be added without breaking the API.
Javadoc for GitAPIException is declared on GitCommand and
inherited to subclasses. JGitInternalException is not explicitly
documented anymore.
Unfortunately this change itself breaks the API. The intention is
that it shall be possible to add new checked subclasses of
GitAPIException without breaking the API.
Bug: 366914
EGit-Change-Id: I50380f13fc82c22d0036f47c7859cc3a77e767c5
Change-Id: I50380f13fc82c22d0036f47c7859cc3a77e767c5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This is needed to allow jumping to a selected commit when loading
history incrementally.
Change-Id: Id3b97d88d3b4b2d67561b11f8810cb88fe040823
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Previously if a packed-refs file was racily clean then there
was a 2.5 second window in which each call to getPackedRefs
would increment the mod count causing a RefsChangedEvent to be
fired since the FileSnapshot would report the file as modified.
If a RefsChangedListener called getRef/getRefs from the
onRefsChanged method then a StackOverflowError could occur
since the stack could be exhausted before the 2.5 second
window expired and the packed-refs file would no longer
report being modified.
Now a SHA-1 is computed of the packed-refs file and the
mod count is only incremented when the packed refs are
successfully set and the id of the new packed-refs file
does not match the id of the old packed-refs file.
Change-Id: I8cab6e5929479ed748812b8598c7628370e79697
Overload DirCache.lock to take a repository that is
used for updating smudged index entries with information
from the repository's working tree.
New unit tests are also added for updating smudged index
entries on reset, checkout, and commit.
Change-Id: I88689f26000e4e57e77931e5ace7c804d92af1b6
When reset command was called with tag name as parameter the resulting
HEAD was set to the tag's SHA-1 which is a bug. This patch ensures that
repository.resolve() call always returns commit id.
Change-Id: I219b898c620a75c497c8652dbf4735fd094c4d7c
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If there are a lot of references to modify, using BatchRefUpdate can
save time if the underlying storage is able to combine these updates
together. This should speed up initial clone or fetch into an empty
repository, as some projects can have hundreds of release tags, or
hundreds of branch heads.
Change-Id: Iee9af8d5fa19080077d88357c18853540936e940
This never should have been in the core library test suite, as that
test suite never should depend upon the HTTP server module.
Change-Id: Ie0528c4d1c755823303d138e327a3a2f4caccc32
That happens when the index and a new file is created within the same
second and becomes a problem if we then modify the newly created file
within the same second after adding it to the index. Without smudging
JGit will, on later reads, think the file is unchanged.
The accompanying test passed with the smuding on read.
Change-Id: I4dfecf5c93993ef690e7f0dddb3f3e6125daae15
This is needed to allow jumping to a selected commit when loading
history incrementally.
Change-Id: Id3b97d88d3b4b2d67561b11f8810cb88fe040823
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use the NullOutputStream.INSTANCE value when the
configured output stream is null or the command is
configured to only show name and status.
Also only set the context and prefix options if
formatting is actually being performed.
Bug: 377157
Change-Id: I333cfcc82ee746f3c6a8e94c09dcc803ffbb4b3a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
DirCacheCheckout and CanonicalTreeParser cooperate. CanonicalTreeParser
can detect malformed, potentially malicious tree entries and sets a
flag, while DirCacheCheckout refuses to work with such paths.
Malicious tree entries are ".", "..", ".git" (case insensitive), any
name containing '/' and (on Windows '\') and also (on Windows)
any paths ending in a combination of '.' or space or containing a ':'.
We also forbid all special names like "con" etc on Windows.
Some of the test can execute on any platform by enabling partial
platform emulation.
A new runtime exception, InvalidPathException, is introduced. For
backwards compatibility it extends InvalidArgumentException.
Change-Id: I86199105814b63d4340e5de0e471d0da6b579ead
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Allow adding files with size over 2 GB. The drawback is that the tests
for huge file support adds roughly 10 minutes of execution time.
For that reason we @Ignore the test in the standard test execution.
Change-Id: I5788e8009899203b346f353297166825b3744575
When there is a conflict sometimes we did not set the stage of
the conflict entries properly for the STAGE_1 entry.
Change-Id: I1c28ff6251fdbc95f7c40fc3e401f1b41157a9f6
Tags can be un-annotated whereby there is no RevTag object, only
a ref pointing to the tagged object.
Bug: 360650
Change-Id: I06309c45c0a896fe2a0a874700febf78c9fb87e8
Instead of indexing the subsection names on each request for a given
section name, index both the section and subsection names in a single
scan through the entry list. This should improve lookup time for
reading the section names out of the configuration, especially for the
url.*.insteadof type of processing performed in RemoteConfig.
Change-Id: I7b3269565b1308f69d20dc3f3fe917aea00f8a73
Iterate over all successfully cloned submodules recursively
and continue initializing and updating until no more are found.
Bug: 375426
Change-Id: Ifb99e41e2deb0c369442bca3c0f5f072dd006816
Content length is computed and cached (short term) in the working
tree iterator when core.autocrlf is set.
Hopefully this is a cleaner fix than my previous attempt to make
autocrlf work.
Change-Id: I1b6bbb643101a00db94e5514b5e2b069f338907a
This reverts commit bf845c126d since this
change needs to go through a formal IP review and Chris missed to file a
CQ for that.
Change-Id: I303515d78116f0591a2911dbfb9f857738f086a9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>