When packing is able to reuse lots of deltas from existing packs, those objects are marked as "doNotAttemptDelta" and do not contribute to DeltaTask's computeTopPaths() "totalWeight" calculation. In the extreme case when all packs are reusable, "totalWeight" will be zero.

DeltaTask.partitionTasks() uses "totalWeight" to determine a "weightPerThread" size it uses to set up DeltaTasks. When "totalWeight" is small, partitionTasks() ends up creating a DeltaTask for every unique path. For a large repository, the small "weightPerThread" can result in the creation of >100k tasks (for the MSM 3.10 Linux repository, the count was ~150k). This makes the "task stealing" mechanism in DeltaTask very inefficient, because every attempt to steal work does a linear walk through all tasks, searching for the one with the most work remaining, which is O(N^2) comparisons. For the MSM 3.10 repository when all deltas were reusable, PackWriter.parallelDeltaSearch() took (1615+1633+1458)/3 = 1568 seconds.

The error is that DeltaTask treats the weights of objects marked as "doNotAttemptDelta" inconsistently. It ignores the weights when calculating "totalWeight" but uses them when partitioning the tasks. The fix is to also ignore them when partitioning the tasks.

With this patch applied, PackWriter.parallelDeltaSearch() on the MSM 3.10 repository when all deltas are reused went from taking 1568 seconds to 62ms (>25k speedup).

This patch also fixes a totalWeight initialization error in DeltaTask.computeTopPaths().

Change-Id: I2ae37efa83bca42b0e716266ae6aa9d182e76d9c
Signed-off-by: Terry Parker <tparker@google.com>

stable-4.2
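
The inconsistency described in the commit message can be reproduced with a small model. The sketch below is not the JGit code: the class DeltaPartitionSketch, the Obj type, the build() helper, and the rule that a task is closed at a path boundary once its accumulated weight reaches "weightPerThread" (floored at 1) are all simplifying assumptions made for illustration. Only the names "totalWeight", "weightPerThread", and "doNotAttemptDelta" come from the message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the partitioning logic; not the actual DeltaTask code.
class DeltaPartitionSketch {
    // Stand-in for an object queued for delta search.
    static final class Obj {
        final int pathHash;
        final long weight;
        final boolean doNotAttemptDelta;

        Obj(int pathHash, long weight, boolean doNotAttemptDelta) {
            this.pathHash = pathHash;
            this.weight = weight;
            this.doNotAttemptDelta = doNotAttemptDelta;
        }
    }

    // Builds `paths` distinct paths with `perPath` objects each, all of weight 100.
    static List<Obj> build(int paths, int perPath, boolean reused) {
        List<Obj> objs = new ArrayList<>();
        for (int p = 0; p < paths; p++) {
            for (int i = 0; i < perPath; i++) {
                objs.add(new Obj(p, 100, reused));
            }
        }
        return objs;
    }

    // Sums only the weights that still need a delta search.
    static long totalWeight(List<Obj> objs) {
        long total = 0;
        for (Obj o : objs) {
            if (!o.doNotAttemptDelta) {
                total += o.weight;
            }
        }
        return total;
    }

    // Counts the tasks created for objects sorted by path. With
    // ignoreReused == false, reused objects still add weight while
    // partitioning (the inconsistent behaviour); with true they do not.
    static int countTasks(List<Obj> objs, int threads, boolean ignoreReused) {
        long weightPerThread = Math.max(1, totalWeight(objs) / threads); // assumed floor of 1
        int tasks = 0;
        long acc = 0;
        boolean open = false;
        for (int i = 0; i < objs.size(); i++) {
            Obj o = objs.get(i);
            if (!(ignoreReused && o.doNotAttemptDelta)) {
                acc += o.weight;
            }
            open = true;
            boolean pathEnds = i + 1 == objs.size()
                    || objs.get(i + 1).pathHash != o.pathHash;
            if (pathEnds && acc >= weightPerThread) {
                tasks++; // close the current task at a path boundary
                acc = 0;
                open = false;
            }
        }
        return open ? tasks + 1 : tasks;
    }

    public static void main(String[] args) {
        List<Obj> objs = build(1000, 3, true); // every delta is reusable
        System.out.println("totalWeight = " + totalWeight(objs));
        System.out.println("tasks, reused weights counted: " + countTasks(objs, 4, false));
        System.out.println("tasks, reused weights ignored: " + countTasks(objs, 4, true));
    }
}
```

In this model, when every object is reusable the inconsistent variant creates one task per unique path, while the consistent variant leaves a single task. When nothing is reusable, both variants partition identically.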
Terry Parker
9 years ago
1 changed file with 15 additions and 8 deletions