jgit/org.eclipse.jgit at 84afea9179932995d1e59f8fda4e6b11217382ad - jgit

A Git library implemented in pure Java.

Gustaf Lundh 84afea9179 Performance fixes in DateRevQueue

When a lot of commits are added to DateRevQueue, the sort-on-insertion approach is very heavy on CPU cycles. One approach to fix this was made by Dave Borowitz: https://git.eclipse.org/r/#/c/5491/

But using Java's PriorityQueue seems to have brought some extra overhead, and the desired performance could not be reached.

This fix takes another approach to the insertion problem, without changing the expected behaviour or bringing extra memory overhead:

If we detect over 1000 commits in the DateRevQueue, a "seek-index" is rebuilt every 1000th added commit. The index keeps track of every 100th commit in the DateRevQueue. During insertions, it is used for a preliminary scan (binary search) of the queue, with the intention of helping add() find a good starting point to start walking from. After finding this starting point, add() steps commit-by-commit until the correct insertion place in the queue is found (today, the queue is expected to be sorted at all times).

When applied to repositories with many refs, this approach has proven to bring huge performance gains and scales quite well. For instance, in a repository with close to 80000 refs, we could cut down the time a typical Gerrit replication of 1 commit would take (just a push from JGit's point of view) from 32sec down to 3.5sec.

Below you see some typical times to add a given number of commits (with random commit times) to the DateRevQueue and the difference the preliminary seek-index makes:

| Commits | Index | No Index |
|---------|-------|----------|
| 1024 | 8ms | 8ms |
| 2048 | 13ms | 9ms |
| 4096 | 5ms | 59ms |
| 8192 | 11ms | 595ms |
| 16384 | 22ms | 3058ms |
| 32768 | 64ms | 13811ms |
| 65536 | 201ms | 62677ms |
| 131072 | 783ms | 331585ms |

Only one extra reference is needed for every 100 inserted commits (and only when we see more than 1000 commits in the queue), so the memory overhead should be negligible. Various index-stepping values were tested, and 100 seemed to scale very well and be effective from the start. In the future, it should probably be dynamic and based on the number of refs in the queue, but this should serve well as a starting point.

Note: While other fundamentally different data structures may be more suitable, the DateRevQueue is extremely central to many of the Git core operations. This approach was chosen, since the effect of the patch is easy to predict in conjunction with the current implementation. A totally new data structure will make it harder to predict behaviour in many common and uncommon cases (in terms of breaking ties, memory usage, cost when using few elements, object creation/disposing overhead, etc).

Change-Id: Ie7b99f40eacf6324bfb4716d82073adeda64d10f		12 years ago
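The seek-index idea described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual JGit code: the class `SeekIndexQueueSketch`, its fields, and its methods are hypothetical names, plain `long` timestamps stand in for commits, and the handling of ties and of removals (the index is simply dropped on `next()`) are assumptions made to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

/** Hypothetical sketch of a date-sorted queue with a sparse seek-index. */
class SeekIndexQueueSketch {
    private static final int REBUILD_EVERY = 1000; // rebuild the index on every 1000th add
    private static final int INDEX_STEP = 100;     // the index references every 100th entry

    private static final class Entry {
        final long time;
        Entry next;
        Entry(long time) { this.time = time; }
    }

    private Entry head;    // newest entry; the list is kept sorted newest-first
    private int size;
    private int adds;
    private Entry[] index; // sparse index into the list, or null if not built

    int size() { return size; }

    void add(long time) {
        Entry e = new Entry(time);
        Entry start = seek(time); // indexed entry with start.time >= time, or null
        if (start == null) {
            if (head == null || head.time < time) {
                e.next = head; // e becomes the new newest entry
                head = e;
                afterAdd();
                return;
            }
            start = head;
        }
        // Step entry-by-entry from the starting point to the insertion place.
        // Assumption: equal timestamps are inserted after existing ones.
        Entry prev = start;
        while (prev.next != null && prev.next.time >= time)
            prev = prev.next;
        e.next = prev.next;
        prev.next = e;
        afterAdd();
    }

    /** Removes and returns the newest timestamp. */
    long next() {
        if (head == null) throw new NoSuchElementException();
        Entry q = head;
        head = q.next;
        size--;
        index = null; // simplification: the index might reference the removed entry
        return q.time;
    }

    private void afterAdd() {
        size++;
        if (++adds % REBUILD_EVERY == 0 && size > REBUILD_EVERY)
            rebuildIndex();
    }

    /** Binary search for the last indexed entry that is not older than time. */
    private Entry seek(long time) {
        if (index == null) return null;
        int lo = 0, hi = index.length - 1;
        Entry best = null;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (index[mid].time >= time) { best = index[mid]; lo = mid + 1; }
            else hi = mid - 1;
        }
        return best;
    }

    private void rebuildIndex() {
        List<Entry> picked = new ArrayList<>();
        int i = 0;
        for (Entry e = head; e != null; e = e.next)
            if (i++ % INDEX_STEP == 0) picked.add(e);
        index = picked.toArray(new Entry[0]);
    }

    public static void main(String[] args) {
        SeekIndexQueueSketch q = new SeekIndexQueueSketch();
        Random rnd = new Random(42);
        for (int i = 0; i < 20000; i++) q.add(rnd.nextInt(1_000_000));
        boolean sorted = true;
        long prev = Long.MAX_VALUE;
        while (q.size() > 0) {
            long t = q.next();
            if (t > prev) sorted = false;
            prev = t;
        }
        System.out.println("drained newest-first: " + sorted);
    }
}
```

Between rebuilds the index goes stale, but every indexed entry is still in the list and still in sorted order, so the binary search remains valid; it just yields a starting point that may be somewhat further from the insertion place.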
.settings	Harmonize the JDT settings within JGit	12 years ago
META-INF	Prepare 2.3.0 builds	12 years ago
findBugs	[findBugs] Silence warning about Transport initialization	12 years ago
resources/org/eclipse/jgit	Fix stash apply using merge logic	12 years ago
src/org/eclipse/jgit	Performance fixes in DateRevQueue	12 years ago
.classpath	Externalize strings from JGit	15 years ago
.fbprefs	Initial JGit contribution to eclipse.org	15 years ago
.gitignore	Finish removing Apache Felix maven-bundle-plugin	15 years ago
.project	Revert "Hide Maven target directories from Eclipse"	14 years ago
about.html	Add missing about.html files to all shipped bundles	14 years ago
build.properties	Add "resources/" as a source folder	15 years ago
plugin.properties	Remove incubation marker	14 years ago
pom.xml	Prepare 2.3.0 builds	12 years ago