github/jgit - jgit - FanRuan third-party plugin repository

Commit Graph

Author	SHA1	Message	Date
Shawn Pearce	3760e4319b	Remove cached_packs support in favor of bitmaps The bitmap code in PackWriter knows exactly when to use a pack as a "cached pack". It enables cached pack usage only when the pack has a bitmap and its entire closure of objects needs to be sent. This is a much simpler code path to maintain, and JGit actually has a way to write the necessary index. Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881	12 years ago
Shawn Pearce	b2c0021b8a	Remove objects before optimization from DfsGarbageCollector Just counting objects is not sufficient. There are some race conditions with receive packs and delta base completion that may confuse such a simple algorithm. Instead always do the larger set computations, and rely on the PackWriter having no objects pending as the way to avoid creating an empty pack file. Change-Id: Ic81fefb158ed6ef8d6522062f2be0338a49f6bc4	12 years ago
Shawn Pearce	fc6b898cbe	Simplfy caching of DfsPackDescription from PackWriter.Statistics Let the pack description copy the relevant stats values. This moves it out of the garbage collector and compactor algorithms, co-locating with something that might care. Remove some unnecessary code from the DfsPackCompactor, the stats tracks the same information and can supply it. Change-Id: Id64ab38d507c0ed19ae0d106862d175b7364eba3	12 years ago
Dave Borowitz	8e2a24a3b6	NameRevCommand: Use ~ notation for first parents of merges Prefer ~(N+1) to ^1~N. Although both are correct, the former is cleaner and matches "git name-rev". Change-Id: I772001a219e5eb346f5552c92e6d98c70b2cfa98	12 years ago
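The notation preference in the commit above can be illustrated with a small sketch. It is written in Python rather than JGit's Java, and `simplify_name` is a hypothetical helper, not part of NameRevCommand. It only rewrites names that end in `^1` or `^1~N`, and relies on `^1` and `~1` both selecting the first parent.

```python
import re

def simplify_name(name: str) -> str:
    """Rewrite a trailing ^1~N (or a bare ^1) as ~(N+1).

    Hypothetical helper for illustration only. Names that do not end in
    a first-parent step are returned unchanged.
    """
    match = re.fullmatch(r"(.*)\^1(?:~(\d+))?", name)
    if match is None:
        return name
    base = match.group(1)
    generations = int(match.group(2) or 0)
    # ^1 is one first-parent step, ~N is N more, so the total is N + 1.
    return f"{base}~{generations + 1}"
```

For example, `simplify_name("master^1~3")` yields `master~4`, while `master^2~3` is left alone because the second parent of a merge cannot be expressed with `~`.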
Dave Borowitz	d2a6c4b955	Allow adding single refs or all tags to NameRevCommand Change-Id: I90e85bc835d11278631afd0e801425a292578bba	12 years ago
Shawn Pearce	e175daf123	Merge "Cluster UNREACHABLE_GARBAGE packs at the end of the search list"	12 years ago
Shawn Pearce	7e229c75c1	Merge "Avoid repacking unreachable garbage in DfsGarbageCollector"	12 years ago
Shawn Pearce	c017d7ef45	Merge changes Icd550359,If7aad533 * changes: Avoid looking at UNREACHABLE_GARBAGE for client have lines Simplify UploadPack by parsing wants separately from haves	12 years ago
Shawn Pearce	ef91da3605	Merge "Add a NameRevCommand for describing IDs in terms of refnames"	12 years ago
Dave Borowitz	30ba407a9a	Add a NameRevCommand for describing IDs in terms of refnames The walk logic does not use RevWalk because it needs to walk all paths to each of the requested commits, keeping track of each path along which the commit was found in the RevCommit subclass. From these paths, a single "best" path is chosen based on the total path length, with a penalty applied for paths that traverse merges. This functionality parallels "git name-rev". Change-Id: I92bfb47dd16c898313d2ee525395609c3bf72ebe	12 years ago
Robin Stocker	9105e1c9af	Add isRebasing to RepositoryState See EGit change Ic69f5c952a49f023c0949f04b3e976be1b267fbe where this could be used. Change-Id: I9ec8568fa1100d2e9c8d4ca0e347bf77ec6d8734	12 years ago
Shawn Pearce	4e9fe58bb5	Avoid looking at UNREACHABLE_GARBAGE for client have lines Clients send a bunch of unknown objects to UploadPack on each round of negotiation. Many of these are not known to the server, which leads the implementation to be looking at indexes for garbage packs. Disable examining the index of a garbage pack, allowing servers to avoid reading them from disk during negotiation. The effect of this change is the server will only ACK a have line if the object was reachable during the last garbage collection, or was recently added to the repository. For most repositories there is no impact in this behavior change. If a repository rewinds a branch, runs GC, and then resets the branch back to where it was before, the now current tip is going to be skipped by this change. A client that has the commit may wind up getting a slightly larger data transfer from the server as an older common ancestor will be chosen during negotiation. This is fixable on the server side by running GC again to correct the layout of objects in pack files. Change-Id: Icd550359ef70fc7b701980f9b13d923fd13c744b	12 years ago
Shawn Pearce	437be8dfad	Simplify UploadPack by parsing wants separately from haves The DHT backend was very slow at parsing objects. To work around that performance limitation I obfuscated UploadPack by folding both the want and have sets together in a single parse queue. Since DHT was removed the complexity is no longer constructive to JGit. Doing this refactoring prepares the code for a slightly future change where the have lines need to be handled specially from the want lines. Splitting the parsing up into two phases makes such a modification trivial. Change-Id: If7aad533b82448bbb688278e21f709282e5ccf4b	12 years ago
Shawn Pearce	ea5eef912a	Cluster UNREACHABLE_GARBAGE packs at the end of the search list Garbage is unlikely to be used by a reader. Ensure they always cluster at the end of the search list, no matter what timestamp was used on the pack files. Change-Id: I3bed89e9569ee3363c36bb3f73fcd34057a3883f	12 years ago
Shawn Pearce	bb002c619b	Avoid repacking unreachable garbage in DfsGarbageCollector If a repository has significant amounts of unreachable garbage the final phase to coalesce it can take longer than any other part of the garbage collection phase. Provide a setting for applications to tweak the threshold where coalescing ends and files just remain on disk. Change-Id: I5f11a998a7185c75ece3271d8bc6181bb83f54c1	12 years ago
Robin Stocker	3ee04e3531	Include the number of ms in timeout error message Noticed that while analyzing bug 402131. Change-Id: If3fd40b64d5088c4579946271a67346cbd9e6556	12 years ago
Robin Rosenberg	3ad454497c	Do not cherry-pick merge commits during rebase Rebase computes the list of commits that are included in the merges, just like Git does, so do not try to include the merge commits. Re-recreating merges during rebase is a bit more complicated and might be a useful future extension, but for now just linearize during rebase. Change-Id: I61239d265f395e5ead580df2528e46393dc6bdbd Signed-off-by: Robin Stocker <robin@nibor.org>	12 years ago
Robin Rosenberg	08d5ede281	Extend FileUtils.delete with option to delete empty directories only The new option EMPTY_DIRECTORIES_ONLY will make delete() only delete empty directories. Any attempt to delete files will fail. Can be combined with RECURSIVE to wipe out entire tree structures and IGNORE_ERRORS to silently ignore any files or non-empty directories. Change-Id: Icaa9a30e5302ee5c0ba23daad11c7b93e26b7445 Signed-off-by: Robin Stocker <robin@nibor.org>	12 years ago
Matthias Sohn	13ea3b0957	Add javaewah bundle to features using it This ensures that OSGi consumers can retrieve this dependency from the JGit or EGit p2 repository. Change-Id: I6f88a4914a19e4e18aa60d59b0cc8a33b61f7fc2 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Shawn Pearce	913cccd5c4	Do not attempt to read bitmap from invalid pack If a pack file has been marked invalid due to a prior IOException accessing its contents, do not offer its bitmap index to callers. The pack cannot be used so its bitmap should be off limits from any reader trying to work from a bitmap. Change-Id: Ia44e46558abdddee560bb184158b1e0af9437eee	12 years ago
Shawn Pearce	88c962484f	Rename DfsPackFile getBitmap method to match PackFile There is no reason for these to differ in name. Match the shorter name used by PackFile. Change-Id: I2d3a299069acc5ce276b1b5439ff2258903c6ff3	12 years ago
Colby Ranger	c660362768	Write the bitmap index correctly in DFS GC. A bug caused the .bitmap to actually have the .idx contents. Change-Id: I428bb27d419e8b1b69b6f3e2fd07cd29703669ad	12 years ago
Colby Ranger	e6883dfe4b	Enable writing bitmaps during GC by default. Bitmaps provide a huge performance boost for counting objects and they play nice with the cgit implementation. Change-Id: I33b05a6c8f1ee2df7770f0b9fdc50d0b4bbf1029	12 years ago
Colby Ranger	f82821728b	Enable writing pack indexes with bitmaps in the GC. Update the dfs and file GC implementations to prepare and write bitmaps on the packs that contain the full closure of the object graph. Update the DfsPackDescription to include the index version. Change-Id: I3f1421e9cd90fe93e7e2ef2b8179ae2f1ba819ed	12 years ago
Colby Ranger	43ea887c8b	Enable serving upload requests using bitmaps. If the pack index has bitmaps, allow the PackWriter to use the bitmaps for upload requests. Change-Id: Iefa995fe927a11e4fd78afb34530995614221fc0	12 years ago
Colby Ranger	dafcb8f6db	Support creating pack bitmap indexes in PackWriter. Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file. Bitmaps are selected at commits every 1 to 5,000 commits for each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmaps every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are prefered over ones with 1 or less. Furthermore, previously computed bitmaps are reused, if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance. Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN, when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds. For fetch request, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds needed to be written are determined by taking all the resulting wants AND NOT the haves. For clone requests, we get cached pack support for "free" since it is possible to determine if all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write. 
On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below:

Operation                    Index V2               Index VE003
Clone                        37530ms (524.06 MiB)   82ms (524.06 MiB)
Fetch (1 commit back)        75ms                   107ms
Fetch (10 commits back)      456ms (269.51 KiB)     341ms (265.19 KiB)
Fetch (100 commits back)     449ms (269.91 KiB)     337ms (267.28 KiB)
Fetch (1000 commits back)    2229ms (14.75 MiB)     189ms (14.42 MiB)
Fetch (10000 commits back)   2177ms (16.30 MiB)     254ms (15.88 MiB)
Fetch (100000 commits back)  14340ms (185.83 MiB)   1655ms (189.39 MiB)

Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d	12 years ago
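The "wants AND NOT haves" computation and the cached-pack check described in the commit above can be sketched with plain Python integers standing in for compressed bitmaps. This is only an illustration under that assumption: the function names are hypothetical, and bit positions play the role of the per-pack object positions.

```python
def bitmap(positions):
    """Build a bitmap (as a Python int) with the given bit positions set."""
    bits = 0
    for position in positions:
        bits |= 1 << position
    return bits

def objects_to_send(wants: int, haves: int) -> int:
    """Objects reachable from the wants but not from the haves."""
    return wants & ~haves

def pack_fully_included(pack_objects: int, to_send: int) -> bool:
    """True if every object in the pack is part of the result, so the
    whole pack could be reused as a "cached pack"."""
    return pack_objects & to_send == pack_objects

def positions(bits: int):
    """List the set bit positions, lowest first."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Closure of the wants covers objects 0..4, the client already has 0..2.
wants = bitmap([0, 1, 2, 3, 4])
haves = bitmap([0, 1, 2])
to_send = objects_to_send(wants, haves)
```

In this sketch a clone corresponds to an empty `haves` bitmap, in which case a pack whose objects are all in `wants` passes `pack_fully_included`.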
Colby Ranger	3b325917a5	Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f	12 years ago
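The XOR compression step described in the commit above can be sketched as follows. This is not the JGit implementation: Python integers stand in for the compressed bitmaps, the function names are hypothetical, and the number of set bits is used as an assumed stand-in for "smallest representation", since the commit message does not say how size is measured.

```python
def cost(bits: int) -> int:
    """Assumed size measure: number of set bits (a proxy, not the real one)."""
    return bin(bits).count("1")

def choose_xor_base(current: int, previous):
    """Pick the cheapest way to store `current`.

    Returns (offset, stored). Offset 0 means the bitmap is stored as-is;
    offset k means it is stored XOR'd against the k-th most recent entry.
    """
    best_offset, best_bits = 0, current
    for offset in range(1, len(previous) + 1):
        candidate = current ^ previous[-offset]
        if cost(candidate) < cost(best_bits):
            best_offset, best_bits = offset, candidate
    return best_offset, best_bits

def restore(offset: int, stored: int, previous) -> int:
    """Undo choose_xor_base: XOR is its own inverse."""
    return stored if offset == 0 else stored ^ previous[-offset]
```

Because a commit's reachable set usually differs from a nearby commit's set in only a few objects, the XOR'd bitmap tends to be mostly zeros, which is what makes it cheap to store.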
Shawn Pearce	234b4e0432	Merge "Break the dependency on RevObject when creating a newObjectToPack()."	12 years ago
Shawn Pearce	374406ac46	Merge "Fix RefUpdate performance for existing Refs"	12 years ago
Shawn Pearce	22625cd1d8	Merge "Fix corrupted CloneCommand bare-repo fetch-refspec (#402031)"	12 years ago
Colby Ranger	be7a135e94	Break the dependency on RevObject when creating a newObjectToPack(). Update the ObjectReuseAsIs API to support creating new ObjectToPack with only the AnyObjectId and Git object type. This is needed to support the future pack index bitmaps, which only contain this information and do not want the overhead of creating a temporary object for every ObjectId. Change-Id: I906360b471412688bf429ecef74fd988f47875dc	12 years ago
Colby Ranger	8d4f227c13	Merge "Remove the unused method PackFile.hasExt()."	12 years ago
Colby Ranger	1512d0ab4e	Remove the unused method PackFile.hasExt(). It will be used in a future change, so just include it with that change. Change-Id: I7db28d86f8e8b282a403acd9a4c4defaae828f94	12 years ago
Roberto Tyley	a46b042905	Fix corrupted CloneCommand bare-repo fetch-refspec (#402031) CloneCommand has been creating fetch refspecs like this on bare clones:

[remote "origin"]
    url = ssh://example.com/my-repo.git
    fetch = +refs/heads/*:refs/heads//*

As you can see, the destination ref pattern has a superfluous slash. It looks like this behaviour has always been the case for CloneCommand, at least since `cc2197ed` when code catering to bare-clone fetch refspecs was added. That was released with JGit v1.0 almost 2 years ago, so there will probably be some bare repos in the wild which will have been cloned with JGit and have these corrupted refspecs. The effect of the corrupted fetch refspec is quite interesting. Up to and including JGit 2.0, the corrupt refspec was tolerated and fetches would work as intended with no indication to the user that anything was amiss. With JGit 2.1, a change was introduced which made JGit less tolerant, and fetches now attempt to update the non-existing ref "refs/heads//master". No exception is raised, but the real ref - "refs/heads/master" - is not updated. This behaviour was noticed by a user of Agit (which does bare clones by default and recently updated from JGit v2.0 to v2.2), reported here: https://github.com/rtyley/agit/issues/92 If you run C-Git fetch on a bare-repo cloned by JGit, it flat-out rejects the refspec (checked against v1.7.10.4):

fatal: Invalid refspec '+refs/heads/*:refs/heads//*'

Incidentally, C-Git does not create an explicit fetch refspec at all when performing a bare clone - the full remote config generated by C-Git looks like this:

[remote "origin"]
    url = ssh://example.com/my-repo.git

Using JGit on such a repository works fine, so omitting the fetch refspec entirely is also an option. Change-Id: I14b0d359dc69b8908f68e02cea7a756ac34bf881	12 years ago
Roberto Tyley	f1dea3e279	Fix RefUpdate performance for existing Refs No longer invoke the expensive RefDatabase.isNameConflicting() check on updating existing refs, reducing batch ref update time by ~97%. The RefDirectory implementation of isNameConflicting() is quite slow (it has to do an expensive loose-ref scan) but it's only necessary to perform this check on ref update if the ref is being created - if the ref already exists, we can already guarantee that it does not conflict with any other refs. C-Git seems to use a similar condition before making the is_refname_available() check: https://github.com/git/git/blob/v1.8.1.4/refs.c#L1660-L1670 As an example of the effects on performance, here's a simple timing experiment using The BFG to remove one file from the JGit repo:

---
$ wget http://repo1.maven.org/maven2/com/madgag/bfg-repo-cleaner/1.0.1/bfg-1.0.1.jar
$ git clone --mirror https://git.eclipse.org/r/p/jgit/jgit.git
$ java -jar bfg-1.0.1.jar -D make_jgit.sh jgit.git
....
Updating references: 100% (5760/5760)
...Ref update completed in 148,949 ms.
BFG run is complete!
---

The execution time for the run is completely dominated by the batch ref update at the end. Repeating the experiment with BFG v1.0.2 (using JGit patched with this change), the refs update is dramatically reduced:

---
Updating references: 100% (5760/5760)
...Ref update completed in 4,327 ms.
---

Change-Id: I9057bc4ee22f9cc269b1cc00c493841c71527cd6	12 years ago
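The idea in the commit above, running the name-conflict check only when a ref is being created, can be sketched like this. The sketch is in Python with hypothetical names (`is_name_conflicting`, `update_ref`, a plain dict as the ref store) and is not the RefDatabase API; the conflict rule modelled here is the directory/file clash between a ref and another ref nested under it.

```python
def is_name_conflicting(name, existing_names):
    """A ref name conflicts if it would be a 'directory' of an existing
    ref or live 'inside' one, e.g. refs/heads/a versus refs/heads/a/b.
    This linear scan stands in for the expensive loose-ref scan."""
    for other in existing_names:
        if other.startswith(name + "/") or name.startswith(other + "/"):
            return True
    return False

def update_ref(refs, name, new_id):
    """Update or create a ref in the dict `refs` (name -> object id)."""
    # An existing ref was already checked when it was created, so the
    # scan is only needed for names that are not yet present.
    if name not in refs and is_name_conflicting(name, refs):
        raise ValueError("ref name conflicts with an existing ref: " + name)
    refs[name] = new_id
```

In a batch update where nearly all refs already exist, almost every call skips the scan, which is where the saving reported in the commit message comes from.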
Shawn Pearce	178d55c24d	Merge "Improve the documentation of the ByteArraySet used by PathFilterGroup"	12 years ago
Colby Ranger	4a317a1790	Include supported extensions in PackFile constructor. Previously a PackFile class was assumed to only support a .pack and .idx file. Update the constructor to enumerate the supported extensions for the pack file. This will allow the bitmap code to only be executed if the bitmap extension file is known to exist. Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf	12 years ago
Gustaf Lundh	212fb3071c	Fix while boundries in DateRevQueue.add() In add(), "low" will never equals "first". This fact should be reflected in the code. Change-Id: I5cab51374e67bd2d3301e5d9dac47c4259b5e562	12 years ago
Shawn Pearce	9613b04d81	Merge "Performance fixes in DateRevQueue"	12 years ago
Gustaf Lundh	84afea9179	Performance fixes in DateRevQueue When a lot of commits are added to DateRevQueue, the sort-on-insertion approach is very heavy on CPU cycles. One approach to fix this was made by Dave Borowitz: https://git.eclipse.org/r/#/c/5491/ But using Java's PriorityQueue seems to have brought some extra overhead, and the desired performance could not be reached. This fix takes another approach to the insertion problem, without changing the expected behaviour or bringing extra memory overhead: If we detect over 1000 commits in the DateRevQueue, a "seek-index" is rebuilt every 1000th added commit. The index keeps track of every 100th commit in the DateRevQueue. During insertions, it will be used for a preliminary scanning (binary search) of the queue, with the intention of helping add() find a good starting point to start walking from. After finding this starting point, add() will step commit-by-commit until the correct insertion place in the queue is found (today, the queue is expected to be sorted at all times). When applied to repositories with many refs, this approach has proven to bring huge performance gains and scales quite well. For instance, in a repository with close to 80000 refs, we could cut down the time a typical Gerrit replication of 1 commit would take (just a push from JGit's point of view) from 32sec down to 3.5sec. Below you see some typical times to add a specific amount of commits (with random commit times) to the DateRevQueue and the difference the preliminary seek-index makes:

Commits | Index | No Index
1024    | 8ms   | 8ms
2048    | 13ms  | 9ms
4096    | 5ms   | 59ms
8192    | 11ms  | 595ms
16384   | 22ms  | 3058ms
32768   | 64ms  | 13811ms
65536   | 201ms | 62677ms
131072  | 783ms | 331585ms

Only one extra reference is needed for every 100 inserted commits (and only when we see more than 1000 commits in the queue), so the memory overhead should be negligible. Various index-stepping values were tested, and 100 seemed to scale very well and be effective from start. 
In the future, it should probably be dynamic and based on the number of refs in the queue, but this should serve well as a starting point. Note: While other fundamentally different data structures may be more suitable, the DateRevQueue is extremely central to many of the Git core operations. This approach was chosen, since the effect of the patch is easy to predict in conjuction with the current implementation. A totally new data structure will make it harder to predict behaviour in many common and uncommon cases (in terms of breaking ties, memory usage, cost when using few elements, object creation/disposing overhead, etc). Change-Id: Ie7b99f40eacf6324bfb4716d82073adeda64d10f	12 years ago
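The seek-index idea from the commit above can be sketched as a sorted linked list plus a sparse index of every 100th entry. This is a simplified Python illustration, not the DateRevQueue code: the class and method names are hypothetical, removal from the head is omitted, and only the 1000 and 100 thresholds are taken from the commit message.

```python
class Entry:
    def __init__(self, time, next=None):
        self.time = time
        self.next = next

class DateQueue:
    STEP = 100            # index every 100th entry
    REBUILD_EVERY = 1000  # rebuild the index every 1000 additions

    def __init__(self):
        self.head = None       # newest entry first
        self.size = 0
        self.since_rebuild = 0
        self.index = []        # sparse list of entries, newest first

    def _rebuild_index(self):
        self.index = []
        entry, position = self.head, 0
        while entry is not None:
            if position % self.STEP == 0:
                self.index.append(entry)
            entry, position = entry.next, position + 1
        self.since_rebuild = 0

    def _start_for(self, time):
        """Binary search: last indexed entry strictly newer than `time`."""
        low, high = 0, len(self.index)
        while low < high:
            mid = (low + high) // 2
            if self.index[mid].time > time:
                low = mid + 1
            else:
                high = mid
        return self.index[low - 1] if low > 0 else None

    def add(self, time):
        self.size += 1
        self.since_rebuild += 1
        if (self.size > self.REBUILD_EVERY
                and self.since_rebuild >= self.REBUILD_EVERY):
            self._rebuild_index()
        prev = self._start_for(time)
        if prev is None:
            if self.head is None or self.head.time <= time:
                self.head = Entry(time, self.head)
                return
            prev = self.head
        # Walk entry by entry from the starting point to the exact slot.
        while prev.next is not None and prev.next.time > time:
            prev = prev.next
        prev.next = Entry(time, prev.next)

    def to_list(self):
        out, entry = [], self.head
        while entry is not None:
            out.append(entry.time)
            entry = entry.next
        return out
```

The index holds references to list entries rather than positions, so insertions between rebuilds do not invalidate it; they only make the walk after the binary search somewhat longer until the next rebuild.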
Matthias Sohn	912e19a8d6	Update last release version to 2.3.1.201302201838-r Change-Id: I9c6d774526028e56707e15e80370460d964de76e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	af64b9a3b3	Deploy Maven artifacts to Eclipse Nexus repository Bug: 401469 Bug: 401470 Change-Id: I4901dc208fe8f9e4055d27ab7e0ced979fd234f5 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
George C. Young	ab99b78ca0	Implement recursive merge strategy Extend ResolveMerger with RecursiveMerger to merge two tips that have up to 200 bases. Bug: 380314 CQ: 6854 Change-Id: I6292bb7bda55c0242a448a94956f2d6a94fddbaa Also-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Chris Aniszczyk <zx@twitter.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Robin Rosenberg	78606404de	Improve the documentation of the ByteArraySet used by PathFilterGroup Change-Id: I2ba7a67e8e1596aa6c33a9caddee03a6be48f008	12 years ago
Colby Ranger	95ef1e83d0	Fix off by one error in PackReverseIndex. The last 32bit offset is at Integer.MAX_VALUE. Change-Id: Idee8be3c7887e1d0c8339ff94aceff36dbf000db	12 years ago
Matthias Sohn	c033f016c9	Merge branch 'stable-2.3' * stable-2.3: Prepare 2.3.2-SNAPSHOT builds JGit v2.3.1.201302201838-r Accept Change-Id even if footer contains not well-formed entries Fix false positives in hashing used by PathFilterGroup Change-Id: I5882aa3b482d6bcd40a45bed51e5ab03f018a5bc Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	49ec6c1b3b	Prepare 2.3.2-SNAPSHOT builds Change-Id: I51a8a53194928416b1aef1f3fce0ce66aadceca4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	63dceceb0e	JGit v2.3.1.201302201838-r Change-Id: I0d79873137ad4042ecc2a0210fe1f6305608b851 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	301df23d9b	Merge "Accept Change-Id even if footer contains not well-formed entries" into stable-2.3	12 years ago
Stefan Lay	3b41fcbd96	Accept Change-Id even if footer contains not well-formed entries Instead of only looking for a Change-Id in the last section if it consists only of well-formed "key: value" lines replace the last occurrence of a valid Change-Id line in the last section. Some tools require footer lines e.g. without a colon. Gerrit doesn't accept Change-Id lines in the footer if the Change-Id line doesn't start at the beginning of the line. Bug: 400818 Change-Id: Icce54872adc8c566994beea848448a2f7ca87085 Signed-off-by: Stefan Lay <stefan.lay@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago

2487 Commits (3760e4319b02ce79ff1eeae021fd88faebf739d5)