github/jgit - jgit - FanRuan Third-Party Plugin Repository

Commit Graph

Author	SHA1	Message	Date
Shawn Pearce	c9707e6353	Always attempt delta compression when reuseDeltas is false If reuseObjects=true but reuseDeltas=false the caller wants to attempt a delta for every object in the input list. Test for reuseDeltas to ensure every object passes through the searchInWindow() method. If no delta is possible for an object and it will be stored whole (non-delta format), PackWriter may still reuse its content from any source pack. This avoids an inflate()-deflate() cycle to recompress the object contents. Change-Id: I845caeded419ef4551ef1c85787dd5ffd73235d9	12 years ago
Shawn Pearce	a5c6aac76c	Avoid TemporaryBuffer.Heap on very small deltas TemporaryBuffer is great when the output size is not known, but must be bound by a relatively large upper limit that fits in memory, e.g. 64 KiB or 20 MiB. The buffer gracefully supports growing storage by allocating 8 KiB blocks and storing them in an ArrayList. In a Git repository many deltas are less than 8 KiB. Typical tree objects are well below this threshold, and their deltas must be encoded even smaller. For these much smaller cases avoid the 8 KiB minimum allocation used by TemporaryBuffer. Instead allocate a very small OutputStream writing to an array that is sized at the limit. Change-Id: Ie25c6d3a8cf4604e0f8cd9a3b5b701a592d6ffca	12 years ago
Shawn Pearce	8a7c2f97d0	Correct distribution of allowed delta size along chain length Nicolas Pitre discovered a very simple rule for selecting between two different delta base candidates: - if based whole object, must be <= 50% of target - if at end of a chain, must be <= 1/depth * 50% of target The rule penalizes deltas near the end of the chain, requiring them to be very small in order to be kept by the packer. This favors deltas that are based on a shorter chain, where the read-time unpack cost is much lower. Fewer bytes need to be consulted from the source pack file, and less copying is required in memory to rebuild the object. Junio Hamano explained Nico's rule to me today, and this commit fixes DeltaWindow to implement it as described. When no base has been chosen the computation is simply the statements denoted above. However once a base with depth of 9 has been chosen (e.g. when pack.depth is limited to 10), a non-delta source may create a new delta that is up to 10x larger than the already selected base. This reflects the intent of Nico's size distribution rule no matter what order objects are visited in the DeltaWindow. With this patch and my other patches applied, repacking JGit with: [pack] reuseObjects = false reuseDeltas = false depth = 50 window = 250 threads = 4 compression = 9 CGit (all) 5,711,735 bytes; real 0m13.942s user 0m47.722s [1] JGit heads 5,718,295 bytes; real 0m11.880s user 0m38.177s [2] rest 9,809 bytes The improved JGit result for the head pack is only 6.4 KiB larger than CGit's resulting pack. This patch allowed JGit to find an additional 39.7 KiB worth of space savings. JGit now also often runs 2s faster than CGit, despite also creating bitmaps and pruning objects after the head pack creation. [1] time git repack -a -d -F --window=250 --depth=50 [2] time java -Xmx128m -jar jgit debug-gc Change-Id: I5caec31359bf7248cabdd2a3254c84d4ee3cd96b	12 years ago
Shawn Pearce	3b7924f403	Split remaining delta work on path boundaries When an idle thread tries to steal work from a sibling's remaining toSearch queue, always try to split along a path boundary. This avoids missing delta opportunities in the current window of the thread whose work is being taken. The search order is reversed to walk further down the chain from current position, avoiding the risk of splitting the list within the path the thread is currently processing. When selecting which thread to split from use an accurate estimate of the size to be taken. This avoids selecting a thread that has only one path remaining but may contain more pending entries than another thread with several paths remaining. As there is now a race condition where the straggling thread can start the next path before the split can finish, the stealWork() loop spins until it is able to acquire a split or there is only one path remaining in the siblings. Change-Id: Ib11ff99f90a4d9efab24bf4a85342cc63203dba5	12 years ago
Shawn Pearce	af33a911d0	Replace DeltaWindow array with circularly linked list Typical window sizes are 10 and 250 (although others are accepted). In either case the pointer overhead of 1 pointer in an array or 2 pointers for a double linked list is trivial. A doubly linked list as used here for window=250 is only another 1024 bytes on a 32 bit machine, or 2048 bytes on a 64 bit machine. The critical search loops scan through the array in either the previous direction or the next direction until the cycle is finished, or some other scan abort condition is reached. Loading the next object's pointer from a field in the current object avoids the branch required to test for wrapping around the edge of the array. It also saves the array bounds check on each access. When a delta is chosen the window is shuffled to hoist the currently selected base as an earlier candidate for the next object. Moving the window entry is easier in a double-linked list than sliding a group of array entries. Change-Id: I9ccf20c3362a78678aede0f0f2cda165e509adff	12 years ago
Shawn Pearce	0f32901ab7	Micro-optimize copy instructions in DeltaEncoder The copy instruction formatter should not compute the shifts and masks twice. Instead compute them once and assume there is a register available to store the temporary "b" for compare with 0. Change-Id: Ic7826f29dca67b16903d8f790bdf785eb478c10d	12 years ago
Shawn Pearce	1db50c9d91	Micro-optimize DeltaWindow primary loop javac and the JIT are more likely to understand a boolean being used as a branch conditional than comparing int against 0 and 1. Rewrite NEXT_RES and NEXT_SRC constants to be booleans so the code is clarified for the JIT. Change-Id: I1bdd8b587a69572975a84609c779b9ebf877b85d	12 years ago
Shawn Pearce	6903fa4a34	Micro-optimize DeltaWindow maxMemory test to be != 0 Instead of using a compare-with-0 use a does not equal 0. javac bytecode has a special instruction for this, as it is very common in software. We can assume the JIT knows how to efficiently translate the opcode to machine code, and processors can do != 0 very quickly. Change-Id: Idb84c1d744d2874517fd4bfa1db390e2dbf64eac	12 years ago
Shawn Pearce	4db695c1c6	Mark DeltaWindowEntry methods final This class and all of its methods are only package visible. Clarify the methods as final for the benefit of the JIT to inline trivial code. Change-Id: I078841f9900dbf299fbe6abf2599f0208ae96856	12 years ago
Shawn Pearce	b5cbfa0146	Merge changes Ideecc472,I2b12788a,I6cb9382d,I12cd3326,I200baa0b,I05626f2e,I65e45422 * changes: Increase PackOutputStream copy buffer to 64 KiB Tighten object header writing in PackOutputStream Skip main thread test in ThreadSafeProgressMonitor Declare members of PackOutputStream final Always allocate the PackOutputStream copyBuffer Disable CRC32 computation when no PackIndex will be created Steal work from delta threads to rebalance CPU load	12 years ago
Robin Rosenberg	8272f65730	Merge "LogCommand.all(): filter out refs that do not refer to commit objects"	12 years ago
Robin Rosenberg	ad2ffc576b	Merge "LogCommand.all(), peel references before using them"	12 years ago
Shawn Pearce	6c0bb4351d	Increase PackOutputStream copy buffer to 64 KiB Colby just pointed out to me the buffer was 16 KiB. This may be very small for common objects. Increase to 64 KiB. Change-Id: Ideecc4720655a57673252f7adb8eebdf2fda230d	12 years ago
Shawn Pearce	46ef61a702	Tighten object header writing in PackOutputStream Most objects are written as OFS_DELTA with the base in the pack, that is why this case comes first in writeHeader(). Rewrite the condition to always examine this first and cache the PackWriter's formatting flag for use of OFS_DELTA headers, in modern Git networks this is true more often than it is false. Assume the cost of write() is high, especially due to entering the MessageDigest to update the pack footer SHA-1 computation. Combine the OFS_DELTA information as part of the header buffer so that the entire burst is a single write call, rather than two relatively small ones. Most OFS_DELTA headers are <= 6 bytes, so this rewrite transforms 2 writes of 3 bytes each into 1 write of ~6 bytes. Try to simplify the objectHeader code to reduce branches and use more local registers. This shouldn't really be necessary if the compiler is well optimized, but it isn't very hard to clarify data usage to either javac or the JIT, which may make it easier for the JIT to produce better machine code for this method. Change-Id: I2b12788ad6866076fabbf7fa11f8cce44e963f35	12 years ago
Shawn Pearce	d01fe32795	Skip main thread test in ThreadSafeProgressMonitor update(int) is only invoked from a worker thread, in JGit's case this is DeltaTask. The Javadoc of TSPM suggests update should only ever be used by a worker thread. Skip the main thread check, saving some cycles on each run of the progress monitor. Change-Id: I6cb9382d71b4cb3f8e8981c7ac382da25304dfcb	12 years ago
Shawn Pearce	66192817cd	Declare members of PackOutputStream final These methods cannot be sanely overridden anywhere. Most methods are package visible only, or are private. A few public methods do exist but there is no useful way to override them since creation of PackOutputStream is managed by PackWriter and cannot be delegated. Change-Id: I12cd3326b78d497c1f9751014d04d1460b46e0b0	12 years ago
Shawn Pearce	2be6927d8e	Always allocate the PackOutputStream copyBuffer The getCopyBuffer() is almost always used during output. All known implementations of ObjectReuseAsIs rely on the buffer to be present, and the only sane way to get good performance from PackWriter is to reuse objects during packing. Avoid a branch and test when obtaining this buffer by making sure it is always populated. Change-Id: I200baa0bde5dcdd11bab7787291ad64535c9f7fb	12 years ago
Shawn Pearce	eb17495ca4	Disable CRC32 computation when no PackIndex will be created If a server is streaming 3GiB worth of pack data to a client there is no reason to compute the CRC32 checksum on the objects. The CRC32 code computed by PackWriter is used only in the new index created by writeIndex(), which is never invoked for the native Git network protocols. Object reuse may still compute its own CRC32 to verify the data being copied from an existing pack has not been corrupted. This check is done by the ObjectReader that implements ObjectReuseAsIs and has no relationship to the CRC32 being skipped during output. Change-Id: I05626f2e0d6ce19119b57d8a27193922636d60a7	12 years ago
Shawn Pearce	d0a5337625	Steal work from delta threads to rebalance CPU load If the configuration wants to run 4 threads the delta search work is initially split somewhat evenly across the 4 threads. During execution some threads will finish early due to the work not being split fairly, as the initial partitions were based on object count and not cost to inflate or size of DeltaIndex. When a thread finishes early it now tries to take 50% of the work remaining on a sibling thread, and executes that before exiting. This repeats as each thread completes until a thread has only 1 object remaining. Repacking Blink, Chromium's new fork of WebKit (2.2M objects 3.9G): [pack] reuseDeltas = false reuseObjects = false depth = 50 threads = 8 window = 250 windowMemory = 800m before: ~105% CPU after 80% after: >780% CPU to 100% Change-Id: I65e45422edd96778aba4b6e5a0fd489ea48e8ca3	12 years ago
Christian Halstrick	266ec24d49	Merge "clean up merge squash and no-commit messages in pgm"	12 years ago
Robin Rosenberg	0a824f5996	Add a constant for info/exclude Change-Id: Ifd537ce4e726cb9460ea332f683428689bd3d7f4	12 years ago
Matthias Sohn	0182e8152e	Merge changes I8445070d,I38f10d62,I2af0bf68 * changes: Fix plugin provider names to conform with release train requirement Add missing @since tags for new API methods DfsReaderOptions are options for a DFS stored repository	12 years ago
Matthias Sohn	011f7fd27d	Fix plugin provider names to conform with release train requirement According to release train requirements [1] the provider name for all artifacts of Eclipse projects is "Eclipse <project name>". [1] http://wiki.eclipse.org/Development_Resources/HOWTO/Release_Reviews#Checklist Change-Id: I8445070d1d96896d378bfc49ed062a5e7e0f201f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Tomasz Zarna	b42b50fdf5	clean up merge squash and no-commit messages in pgm Change-Id: Iffa6e8752fbd94f3ef69f49df772be82e3da5d05	12 years ago
Robin Rosenberg	59baf9148e	Detect and handle a checkout conflict during merge nicely Report the conflicting files nicely and inform the user. Change-Id: I75d464d4156d10c6cc6c7ce5a321e2c9fb0df375	12 years ago
Matthias Sohn	2f93551e18	Add missing @since tags for new API methods Change-Id: I38f10d622c30f19d1154a4901477e844cb411707 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	41cba241d8	DfsReaderOptions are options for a DFS stored repository Change-Id: I2af0bf686188f1402fb53bf6dbe0ecb228069ace Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Shawn Pearce	5d446f410d	Support cutting existing delta chains longer than the max depth Some packs built by JGit have incredibly long delta chains due to a long standing bug in PackWriter. Google has packs created by JGit's DfsGarbageCollector with chains of 6000 objects long, or more. Inflating objects at the end of this 6000 long chain is impossible to complete within a reasonable time bound. It could take a beefy system hours to perform even using the heavily optimized native C implementation of Git, let alone with JGit. Enable pack.cutDeltaChains to be set in a configuration file to permit the PackWriter to determine the length of each delta chain and clip the chain at arbitrary points to fit within pack.depth. Delta chain cycles are still possible, but no attempt is made to detect them. A trivial chain of A->B->A will iterate for the full pack.depth configured limit (e.g. 50) and then pick an object to store as non-delta. When cutting chains the object list is walked in reverse to try and take advantage of existing chain computations. The assumption here is most deltas are near the end of the list, and their bases are near the front of the list. Going up from the tail attempts to reuse chainLength computations by relying on the memoized value in the delta base. The chainLength field in ObjectToPack is overloaded into the depth field normally used by DeltaWindow. This is acceptable because the chain cut happens before delta search, and the chainLength is reset to 0 if delta search will follow. Change-Id: Ida4fde9558f3abbbb77ade398d2af3941de9c812	12 years ago
Shawn Pearce	01a0699acc	Micro-optimize reuseDeltaFor in PackWriter This switch is called mostly for OBJ_TREE and OBJ_BLOB types, which typically make up 66% of the objects in a repository. Simplify the test for these common types by testing for the one bit they have in common and returning early. Object type 5 is currently undefined. In the old code it would hit the default and return true. In the new code it will match the early case and also return true. In either implementation 5 should never show up as it is not a valid type known to Git. Object type 6 OFS_DELTA is not permitted to be supplied here. Object type 7 REF_DELTA is not permitted to be supplied here. Change-Id: I0ede8acee928bb3e73c744450863942064864e9c	12 years ago
Shawn Pearce	8e83c36e27	Static import OBJ_* constants into PackWriter Shortens most of the code that touches the objectLists. Change-Id: Ib14d366dd311e544e7ba50e9ce07a6f3ce0cf254	12 years ago
Shawn Pearce	6a5019f539	Renumber internal ObjectToPack flags Now that WANT_WRITE is gone renumber the flags to move the unused bit next to the type. Recluster AS_IS and DELTA_ATTEMPTED to be next to each other since these bits are tested as a pair. Change-Id: I42994b5ff1f67435e15c3f06d02e3b82141e8f08	12 years ago
Shawn Pearce	241eed844d	Move wantWrite flag to be special offset 1 Free up the WANT_WRITE flag in ObjectToPack by switching the test to use the special offset value of 1. The Git pack file format calls for the first 4 bytes to be 'PACK', which means any object must start at an offset >= 4. Current versions require another 8 bytes in the header, placing the first object at offset = 12. So offset = 1 is an invalid location for an object, and can be used as a marker signal to indicate the writing loop has tried to write the object, but recursed into the base first. When an object is visited with offset == 1 it means there is a cycle in the delta base path, and the cycle must be broken. Change-Id: I2d05b9017c5f9bd9464b91d43e8d4b4a085e55bc	12 years ago
Shawn Pearce	1eed78657f	Don't delta compress garbage objects Garbage is randomly ordered and unlikely to delta compress against other garbage. Disable delta compression allowing objects to switch to whole form when moving to the garbage pack. Because the garbage is not well compressed assume deltas were not attempted during a normal GC cycle. Override the reuse settings, garbage that can be reused should be reused as-is into the garbage pack rather than switching something like the compression level during a GC. It is intended that garbage will eventually be removed from the repository so expending CPU time on a compression switch is not worthwhile. Change-Id: I0e8e58ee99e5011d375d3d89c94f2957de8402b9	12 years ago
Shawn Pearce	56497be34d	Delete broken DFS read-ahead support This implementation has been proven to deadlock in production server loads. Google has been running with it disabled for quite a while, as the bugs have been difficult to identify and fix. Instead of suggesting it works and is useful, drop the code. JGit should not advertise support for functionality that is known to be broken. In a few of the places where read-ahead was enabled by DfsReader there is more information about what blocks should be loaded when. During object representation selection, or size lookup, or sending object as-is to a PackWriter, or sending an entire pack as-is the reader knows exactly which blocks are required in the cache, and it also can compute when those will be needed. The broken read-ahead code was stupid and just read a fixed amount ahead of the current offset, which can waste IOs if more precise data was available. DFS systems are usually slow to respond so read-ahead is still a desired feature, but it needs to be rebuilt from scratch and make better use of the offset information. Change-Id: Ibaed8288ec3340cf93eb269dc0f1f23ab5ab1aea	12 years ago
Shawn Pearce	d72416afbb	Optimize DFS object reuse selection code Rewrite this complicated logic to examine each pack file exactly once. This reduces thrashing when there are many large pack files present and the reader needs to locate each object's header. The intermediate temporary list is now smaller, it is bounded to the same length as the input object list. In the prior version of this code the list contained one entry for every representation of every object being packed. Only one representation object is allocated, reducing the overall memory footprint to be approximately one reference per object found in the current pack file (the pointer in the BlockList). This saves considerable working set memory compared to the prior version that made and held onto a new representation for every ObjectToPack. Change-Id: I2c1f18cd6755643ac4c2cf1f23b5464ca9d91b22	12 years ago
Shawn Pearce	93a27ce728	Simplify size test in PackWriter Clip the configured limit to Integer.MAX_VALUE at the top of the loop, saving a compare branch per object considered. This can cut 2M branches out of a repacking of the Linux kernel. Rewrite the logic so the primary path is to match the conditional; most objects are larger than BLKSZ (16 bytes) and less than limit. This may help branch prediction on CPUs if the CPU tries to assume execution takes the side of the branch and not the second. Change-Id: I5133d1651640939afe9fbcfd8cfdb59965c57d5a	12 years ago
Shawn Pearce	d45277a691	Declare critical exposed methods of ObjectToPack final There is no reasonable way for a subclass to correctly override and implement these methods. They depend on internal state that cannot otherwise be managed. Most of these methods are also in critical paths of PackWriter. Declare them final so subclasses do not try to replace them, and so the JIT knows the smaller ones can be safely inlined. Change-Id: I9026938e5833ac0b94246d21c69a143a9224626c	12 years ago
Shawn Pearce	1d362e35bc	Declare internal flag accessors of ObjectToPack final None of these methods should ever be overridden at runtime by an extension class. Given how small they are the JIT should perform inlining where reasonable. Hint this is possible by marking all methods final so its clear no replacement can be loaded later on. Change-Id: Ia75a5d36c6bd25b24169e2bdfa360c8f52b669cd	12 years ago
Shawn Pearce	876a2ffb21	Remove unused method isDeltaAttempted() This flag is never checked on its own. It is only checked as part of a pair through the doNotAttemptDelta() method. Delete the method so there is less confusion about the flag being used on its own. Change-Id: Id7088caa649599f4f11d633412c2a2af0fd45dd8	12 years ago
Shawn Pearce	594d4ceb12	Simplify setDoNotDelta() to always set the flag This method is only invoked with true as the argument. Remove the unnecessary parameter and branch, making the code easier for the JIT to optimize. Change-Id: I68a9cd82f197b7d00a524ea3354260a0828083c6	12 years ago
Tomasz Zarna	5453585773	Add the no-commit option to MergeCommand Added also tests and the associated option for the command line Merge command. Bug: 335091 Change-Id: Ie321c572284a6f64765a81674089fc408a10d059 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Christian Halstrick	81b601de53	Merge "Fix PathFilterGroup not to throw StopWalkException too early"	12 years ago
Christian Halstrick	ac0481039d	Merge "Indicate initial commit on a branch in the reflog"	12 years ago
Robin Rosenberg	c9a94dc1ee	Fix PathFilterGroup not to throw StopWalkException too early Due to the Git internal sort order a directory is sorted as if it ended with a '/', this means that the path filter didn't set the last possible matching entry to the correct value. In the reported issue we had the following filters. org.eclipse.jgit.console org.eclipse.jgit As an optimization we throw a StopWalkException when the walked tree passes the last possible filter, which was this: org.eclipse.jgit.console Due to the git sorting order, the tree was processed in this order: org.eclipse.jgit.console org.eclipse.jgit.test org.eclipse.jgit At org.eclipse.jgit.test we threw the StopWalkException preventing the walk from completing successfully. A correct last possible match should be: org.eclipse.jgit/ For simplicity we define it as: org/eclipse/jgit/ This filter would be the maximum if we also had e.g. org and org.eclipse in the filter, but that would require more work so we simply replace all characters lower than '/' by a slash. We believe the possible extra walking does not warrant the extra analysis. Bug: 362430 Change-Id: I4869019ea57ca07d4dff6bfa8e81725f56596d9f	12 years ago
Robin Rosenberg	65027d8bb4	Indicate initial commit on a branch in the reflog Bug: 393463 Change-Id: I4733d6f719bc0dc694e7a6a6ad2092de6364898c	12 years ago
Arthur Baars	35be98fb8f	LogCommand.all(): filter out refs that do not refer to commit objects 1. I have authored 100% of the content I'm contributing, 2. I have the rights to donate the content to Eclipse, 3. I contribute the content under the EDL Change-Id: I48b1828e0b1304f76276ec07ebac7ee9f521b194	12 years ago
Arthur Baars	2b9c440fd1	LogCommand.all(), peel references before using them Problem: LogCommand.all() throws an IncorrectObjectTypeException when there are tag references, and the repository does not contain the file "packed-refs". It seems that the references were not properly peeled before being added to the markStart() method. Solution: Call getRepository().peel() on every Ref that has isPeeled()==false in LogCommand.all() . Added test case for LogCommand.all() on repo with a tag. 1. I have authored 100% of the content I'm contributing, 2. I have the rights to donate the content to Eclipse, 3. I contribute the content under the EDL Bug: 402025 Change-Id: Idb8881eeb6ccce8530f2837b25296e8e83636eb7	12 years ago
Robin Rosenberg	5cf53fdacf	Speed up clone/fetch with large number of refs Instead of re-reading all refs after each update, execute the deletes first, then read all refs once and perform the check for conflicting ref names in memory. Change-Id: I17d0b3ccc27f868c8497607d8e57bf7082e65ba3	12 years ago
Robin Rosenberg	4796fe7043	Merge "When renaming the lock file succeeds the lock isn't held anymore"	12 years ago
Shawn Pearce	1f51aecf95	Fix CommitCommand amend mode to preserve parent order Change-Id: I476921ff8dfa6a357932d42ee59340873502b582	12 years ago
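
The size rule described in commit 8a7c2f97d0 above can be illustrated with a small sketch. This is one reading of the commit message, not the DeltaWindow code itself: the class name `DeltaSizeLimitSketch`, both method names, and all parameter names are hypothetical. The first method covers the case where no base has been chosen yet (a delta may be at most 50% of the target, scaled down as the candidate source sits deeper in its chain). The second assumes that once a base is chosen, a new delta must beat the current one after the same depth scaling is applied to both.

```java
/** Hypothetical sketch of the delta size limit described in commit 8a7c2f97d0. */
final class DeltaSizeLimitSketch {
    /**
     * No base chosen yet: allow up to 50% of the target size, distributed
     * evenly over the allowed chain depth. srcDepth is 0 for a whole
     * (non-delta) source and must be below maxDepth.
     */
    static long limitWithoutBase(long targetSize, int srcDepth, int maxDepth) {
        long half = targetSize / 2;
        return half * (maxDepth - srcDepth) / maxDepth;
    }

    /**
     * A base is already chosen: a new delta must be "better" than the
     * current one, with both weighted by the remaining depth (assumption).
     */
    static long limitWithBase(long currentDeltaSize, int baseDepth,
            int srcDepth, int maxDepth) {
        return currentDeltaSize * (maxDepth - srcDepth) / (maxDepth - baseDepth);
    }

    public static void main(String[] args) {
        // Whole source, depth limit 10: half of the target size is allowed.
        System.out.println(limitWithoutBase(1000, 0, 10));
        // Source near the end of a chain: only a very small delta is allowed.
        System.out.println(limitWithoutBase(1000, 9, 10));
        // Current base at depth 9, whole source: delta may be 10x larger.
        System.out.println(limitWithBase(100, 9, 0, 10));
    }
}
```

Under these assumptions a whole source competing against a base at depth 9 (with a depth limit of 10) may yield a delta up to 10x the current size, which matches the example given in the commit message.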

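The sort-order problem described in commit c9a94dc1ee above can be reproduced with plain strings. The sketch below is illustrative only and does not use the PathFilterGroup API: the class name `GitPathOrderSketch` and both method names are hypothetical, and it assumes ASCII names so that Java's string comparison agrees with byte-wise ordering. It sorts directory names as if each ended in '/', and derives the conservative "last possible match" by replacing every character lower than '/' with a slash, as the commit message describes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the directory sort order from commit c9a94dc1ee. */
final class GitPathOrderSketch {
    /** Sorts directory names as Git orders them: as if each ended with '/'. */
    static List<String> sortAsDirectories(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing((String n) -> n + "/"));
        return sorted;
    }

    /**
     * Builds a conservative upper bound for a filter path by replacing
     * every character lower than '/' with '/' and appending a final '/'.
     */
    static String conservativeLastMatch(String path) {
        StringBuilder sb = new StringBuilder(path.length() + 1);
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            sb.append(c < '/' ? '/' : c);
        }
        sb.append('/');
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> dirs = List.of("org.eclipse.jgit",
                "org.eclipse.jgit.test", "org.eclipse.jgit.console");
        // '.' (0x2E) sorts before '/' (0x2F), so the shortest name comes last.
        System.out.println(sortAsDirectories(dirs));
        System.out.println(conservativeLastMatch("org.eclipse.jgit"));
    }
}
```

Because '.' sorts before '/', org.eclipse.jgit ends up after both of its dotted siblings, which is why stopping the walk at org.eclipse.jgit.console was too early.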

2574 Commits (c9707e6353617f86cc75a7692797e1ed45e47cd3)