JGit failed to do checkouts when the index contained smudged entries and
autocrlf was on. In such cases the WorkingTreeIterator calculated the
SHA1 sometimes on content which was not correctly filtered. The SHA1 was
computed on content which two times went through a lf->crlf conversion.
We used to tell the treewalk whether it is a checkin or checkout
operation and always use the related filters when reading any content.
If on windows and autocrlf is true and we do a checkout operation then
we always used a lf->crlf conversion on any text content. That's not
correct. Even during a checkout we sometimes need the crlf->lf
conversion. E.g. when calculating the content-id for working-tree
content we need to use crlf->lf filtering although the overall operation
type is checkout.
Often this bug does not have effects because we seldom compute the
content-id of filesystem content during a checkout. But we do need to
know whether a file is dirty or not before we overwrite it during a
checkout. And if the index entries are smudged we don't trust the index
and compute filesystem-content-sha1's explicitly.
This caused EGit not to be able to switch branches anymore on Windows
when autocrlf was true. EGit denied the checkout because it thought
workingtree files are dirty because content-sha1 are computed on wrongly
filtered content.
Bug: 493360
Change-Id: I1072a57b4c529ba3aaa50b7b02d2b816bb64a9b8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Before this change, jgit used to read packed-refs before scanning
loose refs. That was not a problem if gc didn't run concurrently. When
gc did run concurrently with such refs reading, that order sometimes
broke the latter. This lead to reading an older version of a ref's
tip, which meant "losing" the real tip or commit. The specific
read-Vs-gc concurrency scenario which broke reading that way follows:
1. let ref R be in packed-refs and R' be in loose
2. jgit starts reading packed-refs
3. gc also starts its business around that very time
4. jgit still has the time to read R from packed-refs
5. as gc is not done yet updating packed-refs with R'
6. jgit then starts scanning loose refs (or is about to)
7. gc quickly ends up being done moving loose R' to packed-refs
8. so gc (quickly) removes loose refs
9. -while jgit is scanning loose refs, now gone
10. so jgit assumes no loose to consider => packed-refs winning
11. so jgit wrongfully returns R (from 4.) as the tip, instead of R'.
This fix switches the order so loose refs are scanned (secured) before
taking the time to read packed-refs. This way, knowledge of the
likelier tip is guaranteed for ref reading to return the true tip
- despite concurrent gc. If there is no loose ref to scan, jgit reads
packed-refs and lands on R' (or S), which it then returns, as
expected. The gerrit issue [1] should be solved by this fix.
[1] https://code.google.com/p/gerrit/issues/detail?id=2302
Change-Id: Ibd120120a361a3a6ed565f3836afc1db706fbcdd
Signed-off-by: Marco Miller <marco.miller@ericsson.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
There was a bug in JGit which caused NPEs being thrown when Transport
errors should be reported. Avoid the NPE to let the original error show
up.
Change-Id: I9e1e2b0195bd61b7e531a09d0fc7bce109bd6515
When CheckoutCommand or MergeCommand is called then not in all situation
the treewalks have been prepared to support clean/smudge filters. Fix
this
Bug: 491505
Change-Id: Iab5608049221c46d06812552ab97299e44d59e64
Such hunks are identifiable by a zero value for "new start line". Prior
to the fix, JGit throws and ArrayIndexOutOfBoundsException on such
patches.
Change-Id: I4f3deb5e5f41a08af965fcc178d678c77270cddb
Signed-off-by: Jonathan Schneider <jkschneider@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
RepositoryCache has 2 methods to remove a repository from the cache but
they are never called when a repository is closed. Users of the cache
were expected to call one of those 2 methods but how could they have
called them at proper time without having visibility of the repository
usage count.
Ideally, I would have reworked the RepositoryCache to wrap any
repository it opens in a class that would be responsible to unregister
them from the cache when it's really closed, i.e. when usage counter
reaches 0. The problem preventing the wrapping solution is the
RepositoryCache.register method that allows to register an already
opened repository in the cache. Such repositories cannot be wrapped
because callers are still holding a reference on the unwrapped
repository.
Document that RepositoryCache.close method is removing the repository
from the cache as well as closing it and rework
RepositoryCache.unregister method to only remove the repository from the
cache. Use the latter to unregister repository when Repository.doClose
is getting executed.
Change-Id: Ia364816e4da8d7b6cfa72f10758ca31aa8a1f9db
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When repositories are opened using the RepositoryCache, they are kept in
memory and when the repository usage counter reaches 0, the
Repository.close method is called which then calls close method on its
reference and object databases.
The problem is that RefDirectory.close method was a no-op and the
reference database was kept in memory. This problem is only happening
when opening a repository using the RepositoryCache because it never
evicts repositories, it's just calling the close method.
Change-Id: Iacb961de8e8b1f5b37824bf0d1a4caf4c6f1233f
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Repository has a usage counter that is initialized to 1 at
instantiation and this counter is decremented when Repository.close
method is called. There is also a Repository.incrementOpen method that
RepositoryCache uses to increment the usage count when it's returning a
repository that is already opened.
The problem was that RepositoryCache was incrementing the usage count
for repositories that it just opened or registered. The usage count was
2 when it should have been 1.
Incrementing usage count is now only be done for repository that are
served from the cache.
This bug is causing slow memory increase of our Gerrit server until the
server become slow. Even if the RepositoryCache is using SoftReference,
it seems that the JVM is not garbage collecting the repositories because
it's not yet on the edge of being out of memory.
To test this change, I replicated all repositories(11k) from Gerrit
master to one slave. The Gerrit master used memory after this test was
10GB without this change and 3.5GB with.
Change-Id: I86c7b36174e384f106b51fe92f306018fd1dbdf0
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
* master:
Add config parameter gc.prunePackExpire for packfile expiration
In TestRepository, use a consistent clock
Change-Id: I7ac568e650fbd191e48a8f1a4068af72deb242e8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When checking out commits/branches JGit was triggering correctly
configured smudge filters. But when checking out paths (either from
index or from commits) JGit was not triggering smudge filters. Fix
CheckoutCommand to properly call filters.
Bug: 486560
Also-by: Pascal Krause <pascal.krausek@sap.com>
Change-Id: I5ff893054defe57ab12e201d901fe74e1376efea
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Implement the DIR_NO_GITLINKS setting with the same functionality
it provides in cGit.
Bug: 436200
Change-Id: I8304e42df2d7e8d7925f515805e075a92ff6ce28
Signed-off-by: Preben Ingvaldsen <preben@puppetlabs.com>
JGit's Garbage Collector is repacking relevant objects into new
packfiles and is afterwards deleting the now obsolete packfiles. But to
prevent problems caused by race conditions JGit was not deleting
packfiles when they are too young. The same mechanism as for loose
objects and the config parameter gc.pruneExpire was used.
But JGit was reusing the parameter gc.pruneExpire also for packfiles
which may cause a lot of filesystem consumption if gc.pruneExpire was
set to the default of 2 weeks. Only two weeks after packfile creation gc
was allowed to delete this packfile.
This change introduces a new config paramter gc.prunePackExpire with a
default of "1.hour". This parameter is used when packfiles are deleted.
Only packfiles older than the specified time can be deleted.
For loose objects the behaviour is not changed and only the old
parameter gc.pruneExpire is relevant.
Change-Id: I6209efb05678b15153bd22479dc13486907a44f8
The default author and committer objects in TestRepository were
initialized statically and did not use the MockSystemReader passed into
the TestRepository ctor. Make these fields non-static and initialize
them with a consistent clock.
Also make the author and commiter name and email strings public for
tests that want to verify against them.
Change-Id: I88b444b96e22743001b32824d8e4e03c2239aa86
Signed-off-by: Terry Parker <tparker@google.com>
The FileLfsRepository.out member could have been accessed from multiple
threads which would corrupt the content.
Don't store the AtomicObjectOutputStream in the FileLfsRepository.out but
move it to the ObjectUploadListener which is instantiated per-request.
Add a parallel upload test.
Change-Id: I62298630e99c46b500d376843ffcde934436215b
Signed-off-by: Saša Živkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This change replaced some tabs with spaces introduced in
change I8b3765713599e34f1411f9bbc7f575ec7c2384e0.
Change-Id: Ia5c23b38c9fbbb46f150e527347b61c64c8d9e87
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
With ignoreRemoteFailures set to true, we can ignore remote failures
(e.g. the branch of a project described in the manifest file does not
exist), skip that project and continue to the next one, instead of fail
the whole operation.
Change-Id: I8b3765713599e34f1411f9bbc7f575ec7c2384e0
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
This commit introduces a FileModeStrategy to
the FileTreeIterator class. This provides a way to
allow different modes of traversing a file tree;
for example, to control whether or not a nested
.git directory should be treated as a gitlink.
Bug: 436200
Change-Id: Ibf85defee28cdeec1e1463e596d0dcd03090dddd
Signed-off-by: Preben Ingvaldsen <preben@puppetlabs.com>
Allow access to the ObjectId of a DirCacheTree if known for low level
integration code.
Change-Id: I6f05b10c9ac781f5e8b38af4a19e653313c91fa8
Signed-off-by: Philipp Marx <smigfu@googlemail.com>
TreeWalk provides the new method getEolStreamType. This new method can
be used with EolStreamTypeUtil in order to create a wrapped InputStream
or OutputStream when reading / writing files. The implementation
implements support for the git configuration options core.crlf, core.eol
and the .gitattributes "text", "eol" and "binary"
CQ: 10896
Bug: 486563
Change-Id: Ie4f6367afc2a6aec1de56faf95120fff0339a358
Signed-off-by: Ivan Motsch <ivan.motsch@bsiag.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14dfa70520 fixed the problem that HEAD wasn't added to the reftree when
rebuilding the reftree in an empty repository where HEAD isn't yet
resolvable. Since non-resolvable refs are filtered out by
RefDatabase.getRefs(ALL) we have to add HEAD to the reftree explicitly
in this special case.
This fix resulted in another bug: rebuilding the reftree in a repository
which has a resolvable HEAD failed with a DirCacheNameConflictException
in RefTree.apply(). If HEAD is resolvable RefDatabase.getRefs(ALL) does
not filter out HEAD. This results in two identical CREATE commands for
HEAD which RefTree.apply() refuses to execute.
Fix this by no longer creating a duplicate CREATE command for HEAD.
See: I46cbc2611b9ae683ef7319dc46af277925dfaee5
Change-Id: I58dd6bcdef88820aa7de29761d43e2edfa18fcbe
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This fixes a MissingResourceException thrown when executing
jgit debug-rebuild-ref-tree --help
Change-Id: I637ea55084a913f5105ebf4cf2baef8b81877938
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This version fixes signing of Apache httplclient.
Change-Id: I81d7a643233386442bd31ee602669d2c88b68576
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The reason is that URIish(URL) and URIish(String) make different parsing
of path / rawPath with regard to drive letters. /C:/... for URL and
C:/... for String. This patch fixes the issue.
Change-Id: I8e2013fff30b7bb198ff733c038e21366667b8a0
Signed-off-by: Ivan Motsch <ivan.motsch@bsiag.com>
Target platform can be configured directly, e.g.:
$ mvn clean install -Dtarget-platform=jgit-4.6
Set the default to use the Mars target platform.
Change-Id: Ib6075af19be88fa418ecbe4dd7a217d9879e178a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This may have caused the spurious compile errors sometimes observed in
Eclipse since org.eclipse.jgit.lfs.lib is a split package to enable
testing package private code in bundle org.eclipse.jgit.lfs.
Change-Id: I0294448965de8ad8c254b26382386ef2b9f6e863
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a storage implementation storing large objects in Amazon S3.
The AmazonS3Repository pre-signs download and upload requests.
AWS access and secret key are expected to be in the
$HOME/.aws/credentials file in the following format:
[default]
accessKey = ...
secretKey = ...
Use AWS version 4 request signing [1] because it is more secure and
supported by all regions. The version 3 signing is not supported in
newer regions.
In follow up changes we should:
- implement getVerifyAction() and do actual verification. Subclasses of
S3Repository can implement caching for object meta data (size) in order
to avoid extra roundtrips to S3. Verification should ensure that meta
data store and content of S3 storage are in sync
- HEAD request used in S3Repository.getSize() seems to always return
Content-length 0 in contrast to the documentation [2]. So getSize() does
detect if the object exists in S3 or not but in case the object exists
it always returns size 0
[1] http://docs.aws.amazon.com/general/latest/gr/signature-version-4.html
[2] https://forums.aws.amazon.com/thread.jspa?threadID=223616
Change-Id: Ic47f094928a259e5264c92b3aacf6d90210907a8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Capture the internal "want X not valid" state as a specific subclass
of PackProtocolException, allowing this to be more easily identified
in server stack traces and wrapper application code.
Change-Id: I4b1adb7497f396432da420b0f600ad25a261f912
If the client sends a SHA-1 that the server does not recognize echo
this back to the client with an explicit error message instead of
the generic "internal server error".
This was always the intent of the implementation but it was being
dropped on smart HTTP due to the UploadPackServlet catching the
PackProtocolException, discarding the buffered message UploadPack
meant to send, and sending along a generic message instead.
Change-Id: I8d96b064ec655aef64ac2ef3e01853625af32cd1