github/jgit - jgit - FanRuan third-party plugin repository

Commit Graph

Author	SHA1	Message	Date
Shawn O. Pearce	fa4cc2475f	DFS: A storage layer for JGit In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size. The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in a completable time scale. Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale, even at 5000 rows/second the lookup stage alone takes 6 minutes (on local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire. On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. 
This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive. I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates. This new DFS storage layer fixes a lot of those issues by taking the simple approach for storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is simply a thunk to trigger a read request over the network. The DFS code in this change is still abstract, it does not store on any particular filesystem, but is fairly well suited to the Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read. Most of the DFS code in this change was blatantly copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesitant to do this due to how well tuned the local filesystem code currently is. Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb	13 years ago
Robin Rosenberg	afd4f3b0cf	Allow '\' in user names in URI-ish Actually this is not ok according to the RFC, but this implementation is meant to be Git compatible. A '\' is needed when the authentication requires or allows authentication to a Windows domain where the user name can be specified as DOMAIN\user. Change-Id: If02f258c032486f1afd2e09592a3c7069942eb8b	13 years ago
Carl Myers	85a9ab7410	Fix NPE when PATH environment variable is empty Change-Id: Ic27d509cd5e2d6c855e7d355fc308399d9dc01c9 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Kevin Sawicki	931b931ee8	Provide an id for submodule entries. Open a repository for submodule entries that have a child .git directory and use the resolved HEAD commit as the entry's id. Change-Id: I68d6e127f018b24ee865865a2dd3011a0e21453c Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Kevin Sawicki	5041f738e9	Suppress unused and unchecked warnings Change-Id: I9f51cc749f5cb9d2e3aa86874e60fca29b779565 Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Marc Strapetz	bf81119e62	DirCacheEntry: accessors for cached creation time (CTIME) Change-Id: I986d5fff63ff1a86cca6bab49c744ea673fe4892	13 years ago
Robin Rosenberg	3ceb4fac23	Do not resolve path using cygwin unless told to The system property jgit.cygpath must be set to true in order for cygwin's cygpath to be used to translate path from cygwin namespace to Windows namespace. The cygwin path translation should be considered deprecated. Bug: 353389 Change-Id: I2b5234c0ab936dac67d1e232f4cd28331bf3226d Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Matthias Sohn	a5f72d6b3b	Implement Config.Entry.toString() to help debugging Change-Id: I86f6359d955d39ab033848b87ed39d20378d3c1f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Shawn O. Pearce	b24a61272a	Ensure the ObjectInserter flushes after a merge If this does not happen some databases may discard objects and not make them available. Change-Id: I347b3c3724db52c8a6c09f4804071497a3a377ab	13 years ago
Robin Rosenberg	57bdb04873	Cosmetic adjustment of relative date format, do not display "0 months" Though it may seem less precise, "0 months" looks bad and the reference Git implementation also does not display "0 months" Change-Id: I488e9c97656f9941788ae88d7c5c1562ab6c26f0	13 years ago
Carsten Pfeiffer	98d4bd6d36	Allow detecting which files were renamed during a revwalk The egit history view shows the files associated with a commit by using a PathFilter. When following renames with a FollowFilter, the PathFilter cannot be configured anymore because the affected files are simply not known. Thus, it should be possible to get to know which files are renamed. Bug: 302549 Change-Id: I4761e9f5cfb4f0ef0b0e1e38991401a1d5003bea	13 years ago
Robin Rosenberg	63bb6ff06c	Fix compatibility breakage for SystemReader Introducing a new abstract method is not nice when one expects others to subclass the class. Create default implementations so old code that implements SystemReader does not break. The default methods just delegate to the JVM. Change-Id: I42cdfdcb6b29f7203697a23833dca85185b0b9b3 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Robin Rosenberg	f4460dda97	Define a utility class for handling Git date formats Besides the formats known by git-log(1) we also add "locale" and "localelocal" that formats dates according to the user's locale. "locale" does not translate into local timezone, while localelocal does. Change-Id: I1c088dcec992c107e43f6c17be4ac9ed6eb428bf	13 years ago
Robin Rosenberg	3a4fa52723	Add locale to the properties manageable by SystemReader Change-Id: I5e9af40d38bb671cb9fcdb0fa3b4eb3af5f36f6c	13 years ago
Robin Rosenberg	06b183f9b7	Add a method to SystemReader to get the time zone Change-Id: Ifd31f408ed2c5b7869694b715fea3219e74963ef	13 years ago
Robin Rosenberg	fb68c7a4cd	Use the SystemReader to get system time Change-Id: Ib79c0cc964bfe799b204419e552b9aa6243966ce	13 years ago
Robin Rosenberg	2e43dcd645	Fix bad checkout behaviour when a file is removed We deleted the entry if there was a file and an index entry, but not when there was just an index entry. Now delete the file in both cases since the missing file just means our worktree is dirty. This affected the implementation of reset --hard. Bug: 347574 Change-Id: Ie66fa61303472422830f5e33614e93ad65094e5d	13 years ago
Kevin Sawicki	86e96b41e2	Correct typo in RevWalk.parseBody comment Change-Id: I0e65a5a6809a8d32d256322dbcae94b6aa603e5e Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Jens Baumgart	6befabcb15	Extend IndexDiff to calculate ignored files and folders IndexDiff was extended to calculate ignored files and folders. The calculation only considers files that are NOT in the index. This functionality is required by the new EGit decorator implementation. Bug: 359264 Change-Id: I8f09d6a4d61b64aeea80fd22bf3a2963c2bca347 Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>	13 years ago
Kevin Sawicki	565e4a06ef	Add missing comment text for mergeCommitTree parameter Change-Id: I35cef13d8be4f06515668f710fd508700b90f44d Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Robin Rosenberg	602c869d7a	Do not attempt to resolve describe-labels with less than four digits Change-Id: I21dcd3cca3b41102fd898238d8d640dea25e0caf Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Robin Rosenberg	1570aa9e5c	Fix DirCacheEditor.DeleteTree for empty string argument Change-Id: I7425da91c0752ae82484e3c29d21b57402d30c61	13 years ago
Kevin Sawicki	654f7235ec	Add varargs version of PathFilterGroup.createFromStrings This allows the following usage pattern: PathFilterGroup.createFromStrings("path1", "path2"); Change-Id: I589e758cc55873ce75614602e017ac793435e24d Signed-off-by: Kevin Sawicki <kevin@github.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Manuel Doninger	458b5a4042	New config constant for default start-point This constant determines the default start-point, if the user doesn't want to create a branch from the current HEAD. Change-Id: Iea944e11e80134fbafc4c47383457d5ed11a4164 Signed-off-by: Manuel Doninger <manuel.doninger@googlemail.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Matthias Sohn	0db0476542	Fire IndexChangedEvent on DirCache.commit() Since we replaced GitIndex by DirCache JGit didn't fire IndexChangedEvents anymore. For EGit this still worked with a high latency since its RepositoryChangeScanner which is scheduled to run each 10 seconds fires the event in case the index changes. This scanner is meant to detect index changes induced by a different process e.g. by calling "git add" from native git. When the index is changed from within the same process we should fire the event synchronously. Compare the index checksum on write to index checksum when index was read earlier to determine if index really changed. Use IndexChangedListener interface to keep DirCache decoupled from Repository. Change-Id: Id4311f7a7859ffe8738863b3d86c83c8b5f513af Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Kevin Sawicki	e630f91305	Remove TODO for generated constructor. Change-Id: Ie405f6de99b8fa632d7462400e647a37f30e2e31 Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Christian Halstrick	1230d353d8	Fix status in index entries after checkout of paths The checkout command was producing an inconsistent state of the index which even confuses native git. The content sha1 of the touched index entries was updated, but the length and the filemode were not updated. Later in the code the index entries got automatically corrected (through Dircache.checkoutEntry()) but the correction was after persisting the index to disk. So, the correction was lost and we ended up with an index where length and sha1 don't fit together. A similar problem is fixed with "lastModified" of DircacheEntry. When checking out a path without specifying an explicit commit (you want to checkout what's in the index) the index was not updated regarding lastModified. Readers of the index will think the checked-out file is dirty because the file has a newer lastModified than what's in the index. Change-Id: Ifc6d806fbf96f53c94d9ded0befcc932d943aa04 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Jens Baumgart <jens.baumgart@sap.com> Bug: 355205	13 years ago
Robin Rosenberg	b4112c1748	Fix DirCache.getEntriesWithin for empty string argument Change-Id: I0bea130df611de3ef8c9251093b11c62b5442cd1	13 years ago
Robin Rosenberg	39ad503fcb	Append merge strategy to reflog message Change-Id: Ia0e73208b86c45a3d96698e973f6e70ec5cb7303 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Robin Rosenberg	4f4e468f6f	Fix the reflog prefix for cherry-pick, revert and merge commands We should see whether the commit was a regular commit or something else. Change-Id: I82d8300cf3c53cb2bdcb6495386aadb803e0c6f7 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Roberto Tyley	791a9fd691	Enable full Transport configuration for JGit API commands Add a TransportConfigCallback parameter to JGit API commands, to allow consumers of the JGit command API to perform custom Transport configuration that would be otherwise difficult to anticipate & expose on the API command builders. My specific use-case is configuring additional properties on SshTransport - I need to take over the SshSessionFactory used by the transport. Using TransportConfigCallback I can simply do this (rather than reimplement the API command classes): public void configure(Transport tn) { if (tn instanceof SshTransport) { ((SshTransport) tn).setSshSessionFactory(factoryProvider.get()); } } Adding an explicit setSshSessionFactory() method to the JGit command classes would bloat the API. Also, creating the replacement SshSessionFactory is unnecessary if the transport is not SSH, but the type of the Transport is only known once the remote has been resolved and the URI parsed - consequently it makes sense to perform this step in a callback, where the transport instance can be inspected to determine if it's of a relevant type. A note about where this leaves the API - there are now 4 commands: CloneCommand PullCommand FetchCommand PushCommand -that share 3 identical transport-related parameters: timeout credentialsProvider transportConfigurator I think there's potential for introducing an interface or val-object to identify/encapsulate this repetition, which I'd be happy to do in a subsequent commit. Change-Id: I8983c3627cdd7d7b2aeb0b6a3dadee553378b951 Signed-off-by: Roberto Tyley <roberto.tyley@gmail.com>	13 years ago
Matthias Sohn	46771e9e88	Remove use of GitIndex to detect index changes We can detect index changes using FileSnapshot. This is more efficient and removes usage of a deprecated class. Change-Id: I4a679102c9a1bd8e82b9ca93eb9dbbde445e9be4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Matthias Sohn	19a366d532	Prepare 1.2.0 builds Change-Id: I9ec247135d93ef28d732e94f18d0ec1d0e2e6d44 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Matthias Sohn	57d6585522	Prepare post v1.1.0.201109151100-r build Change-Id: Ib099ec93d8243b238641d79328216874532ab5eb Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Matthias Sohn	1cb0510cee	JGit v1.1.0.201109151100-r Change-Id: Iadcec7e5973600e005cbdeb837fa197d3ae2ea86 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Shawn O. Pearce	01888db892	UploadPack: Fix races in smart HTTP negotiation Clients cache the set of advertised references at the start of a negotiation, and keep replaying the same "want SHA1" list to the server on each negotiation step. If another client pushes into a branch and moves it by fast-forward, any request to obtain that branch's prior SHA-1 is still valid, the commit is reachable from the new position of the reference. Unfortunately the fast-forward causes smart HTTP negotiations to fail, as the server no longer is advertising that prior SHA-1. Instead of causing clients to fail out with a "want invalid" error and forcing the end-user to retry, possibly getting into a never ending try-fail-retry race while other clients are pushing into the same busy repository, allow the slightly stale want request so long as it is still reachable. C Git implemented this same change recently to fix races on the smart HTTP protocol when the C Git git-http-backend is used. The new RequestPolicy feature also allows server authors to make an even more lenient configuration that exports any SHA-1 to the client. This might be useful in certain settings where a server has authenticated the client as the "repository owner" and wants to allow them to grab any content from the server as a complete unbroken history chain. The new setAdvertisedRefs() method allows server authors to manually fix the references that are advertised, possibly bypassing the getAllRefs() call on the Repository object. Change-Id: I7cdb563bf9c55c83653f217f6e53c3add55a0541 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	1b6a549ff3	PackWriter: Export more statistics Export the shallow pack information, and also a handy function to sum up the total times. Include the time writing out the index file, if it was created. Change-Id: I7f60ae6848455a357b25feedb23743bbf6c153cf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	38b3816d65	Do not requeue state vector in stateless RPC fetch If the no-done capability was enabled on the connection, don't queue up the state vector again once the ACK %s ready message is observed from the remote. The pack will be following in this response stream, so the state vector is no longer required. Change-Id: I7bd1e76957cb58c7ff1cdaeef227f1b02a7e5d24 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	575a80ac44	Wrap excessively long line in BasePackFetchConnection Change-Id: I926838058c1de2146e22faa08570406600457acb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	c1a9b2ae8b	Fix smart HTTP client stream alignment errors The client's use of UnionInputStream was broken when combined with a 8192 byte buffer used by PackParser. A smart HTTP client connection always pushes in the execute stateless RPC input stream after the data stream has ended from the remote peer. At the end of the pack, PackParser asked to fill a 8192 byte buffer, but if only e.g. 1000 bytes remained UnionInputStream went to the next stream and asked it for input, which triggered a new RPC, and failed because there was nothing pending in the request buffer. Change UnionInputStream to only return what it consumed from a single InputStream without invoking the next InputStream, just in case that second InputStream happens to be one of these magical ones that generates an RPC invocation. Change-Id: I0e51a8e6fea1647e4d2e08ac9cfc69c2945ce4cb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Kevin Sawicki	4005f3c693	Remove duplicate calls to DirCache.unlock on checkout Calls to unlock the DirCache before throwing an exception were not needed since checkout calls doCheckout wrapped in a try block that calls DirCache.unlock in a finally block. Change-Id: I2b249a784f9e363430e288aad67fcefb7fac0a6e Signed-off-by: Kevin Sawicki <kevin@github.com>	13 years ago
Robin Rosenberg	a7d3c68015	Allow commit when submodule changes are present We do not yet check or validate submodules, but can accept that someone staged a change in a submodule with other tools. Change-Id: I642ede382314bfbd1892dd509a2222885cc5350a Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Robin Rosenberg	576abf64d1	Ignore submodule on checkout instead of deleting it The purpose of this commit is to prevent destruction of submodules on checkout from a tree with a submodule to another. For consistency we handle the reverse case too, when we checkout a branch that has a submodule and the submodule directory exists. And finally we ignore the case where the submodule changes. We do not update the submodules, we just try to ignore them harder. Bug: 356664 Change-Id: I202c695a57af99b13d0d7220803fd08def3d9b5e Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Robin Rosenberg	2bb8da0405	cleanup: Reuse local variable for current DirCacheEntry Since we already have assigned i.getDirCacheEntry() to dce, use dce instead. Change-Id: I107713ad0b356516d75c29203f945b056bad3ac7 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Matthias Sohn	b09d21b6eb	Prepare post v1.1.0.201109071825-rc3 builds Change-Id: I1244f6639263d156a6f9e4530167e5eb1826a535 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Matthias Sohn	75611a8314	JGit v1.1.0.201109071825-rc3 Change-Id: I1b989d3101272632eacabe25a0b111ad0ff5bb3b Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Dariusz Luksza	570d862ef3	Fix IOOBE in Repository.resolveSimple() IndexOutOfBoundsException is thrown from Repository.resolveSimple() when the '-g' string is located less than 4 characters from the end of the input string. Change-Id: I1128c2cdfec9db3023d4d0f1f40d863e84b75950 Signed-off-by: Dariusz Luksza <dariusz@luksza.org>	13 years ago
Matthias Sohn	cfdb09e9db	Use commit message best practices for Mylyn Commit template We should use a template for Mylyn commit messages that matches with our guidelines for commit messages. http://wiki.eclipse.org/EGit/Contributor_Guide#Commit_message_guidelines Bug: 337401 Change-Id: I05812abf0eb0651d22c439142640f173fc2f2ba0 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Robin Rosenberg	b695f66487	Fix the names in the reflog for checkouts We were diverging from the reference implementation. Always use the ref we checkout to as the to-branch in the reflog and avoid the refs/heads both in the from-name and to-name. Change-Id: Id973d9102593872e4df41d0788f0eb7c7fd130c4 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Robin Rosenberg	eadc26c0a0	Add a helper for parsing branch switch info out of a reflog entry Change-Id: I91c7e08c4afd2562df2226887a933d93c78a0371 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
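
The stream alignment fix described in commit c1a9b2ae8b can be illustrated with a minimal sketch. This is not JGit's UnionInputStream; the class name `ShortReadUnionStream` and the demo class are hypothetical, and only `java.io` is used. It shows the behaviour the commit message describes: a bulk read returns whatever the current stream supplied, even if that is less than the buffer size, and only moves on to the next stream on a later call, once the current one has reported end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a union stream that never spills into the next stream. */
class ShortReadUnionStream extends InputStream {
    private final Deque<InputStream> streams = new ArrayDeque<>();

    ShortReadUnionStream(InputStream... parts) {
        for (InputStream p : parts) {
            streams.add(p);
        }
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n;
        do {
            n = read(one, 0, 1);
        } while (n == 0); // retry if a stream returned no data without signalling EOF
        return n < 0 ? -1 : one[0] & 0xff;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (!streams.isEmpty()) {
            int n = streams.peek().read(b, off, len);
            if (n >= 0) {
                // Return a short read as-is; do not ask the next stream to fill the rest.
                return n;
            }
            // Current stream is exhausted: drop it and try the next one.
            streams.remove().close();
        }
        return -1;
    }
}

class UnionReadDemo {
    public static void main(String[] args) throws IOException {
        // 1000 bytes of pack data followed by an (empty) stand-in for the RPC stream.
        InputStream data = new ByteArrayInputStream(new byte[1000]);
        InputStream rpc = new ByteArrayInputStream(new byte[0]);
        try (ShortReadUnionStream in = new ShortReadUnionStream(data, rpc)) {
            byte[] buf = new byte[8192];
            System.out.println("first read: " + in.read(buf, 0, buf.length));
        }
    }
}
```

With an 8192 byte buffer and only 1000 bytes left in the first stream, the first read returns 1000 and the second stream is left untouched until the caller reads again.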

2187 Commits (ce5fd525be80d664db5f7263f8a4c961e320940e)