jgit/org.eclipse.jgit/META-INF
Shawn O. Pearce
fa4cc2475f
In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server.

The performance of the DHT schema degrades rapidly as small changes are pushed into the repository, because the chunk size is less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size.

The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as to the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can leave the DHT code unable to enumerate the objects of the linux-2.6 repository in a completable time scale.

Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale; even at 5000 rows/second the lookup stage alone takes 6 minutes (on the local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire.

On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the Linux kernel repository. In the DHT schema, answering a `git clone` of the ~400 MiB Linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than with the native filesystem format, and it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive.
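
As a rough check of the figures above, the following Java sketch recomputes the lookup time and the index data volume. The class and method names are made up for illustration and are not part of JGit; the inputs are the numbers quoted in this message.

```java
// Back-of-the-envelope check of the figures quoted in the commit message.
// DhtCloneCost and its methods are hypothetical helpers, not JGit code.
final class DhtCloneCost {
    /** Seconds needed for one row lookup per object at a fixed lookup rate. */
    static double lookupSeconds(long objects, long rowsPerSecond) {
        return (double) objects / rowsPerSecond;
    }

    /** "Index" data loaded for a clone: redundant chunk data plus OBJECT_INDEX rows. */
    static long indexMiB(long redundantMiB, long objectIndexMiB) {
        return redundantMiB + objectIndexMiB;
    }

    public static void main(String[] args) {
        double seconds = lookupSeconds(2_000_000L, 5_000L);
        System.out.println("lookup stage: " + seconds + " s");
        System.out.println("index data: " + indexMiB(25, 223) + " MiB");
    }
}
```

At 5000 rows/second the 2 million lookups come to 400 seconds, a little under 7 minutes, and 25 MiB + 223 MiB gives the 248 MiB of index data mentioned above.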
I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates.

This new DFS storage layer fixes a lot of those issues by taking the simple approach of storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, the code assumes that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is simply a thunk to trigger a read request over the network.

The DFS code in this change is still abstract; it does not store on any particular filesystem, but is fairly well suited to Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read.

Most of the DFS code in this change was blatantly copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesitant to do this due to how well tuned the local filesystem code currently is.

Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
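
To make the read path described above more concrete, here is a minimal Java sketch of a byte-range reader sitting behind a small in-process block cache. It is only an illustration under stated assumptions: `RangeReader`, `BlockCache`, the fixed block size, and the LRU eviction policy are all hypothetical and are not taken from the DFS code in this change.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a remote read: fetch bytes [offset, offset + length). */
interface RangeReader {
    byte[] read(long offset, int length);
}

/** Tiny LRU cache of fixed-size blocks, keyed by the block's start offset. */
final class BlockCache {
    private final RangeReader reader;
    private final long fileLength;
    private final int blockSize;
    private final Map<Long, byte[]> blocks;
    int remoteReads; // number of calls made to the reader, for illustration

    BlockCache(RangeReader reader, long fileLength, int blockSize, final int maxBlocks) {
        this.reader = reader;
        this.fileLength = fileLength;
        this.blockSize = blockSize;
        // An access-ordered LinkedHashMap drops the least recently used block.
        this.blocks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxBlocks;
            }
        };
    }

    /** Returns the byte at the given position, loading its block on a cache miss. */
    byte get(long position) {
        if (position < 0 || position >= fileLength)
            throw new IndexOutOfBoundsException("position " + position);
        long start = (position / blockSize) * blockSize;
        byte[] block = blocks.get(start);
        if (block == null) {
            // One range request per missing block; no file handle is kept open.
            int length = (int) Math.min(blockSize, fileLength - start);
            block = reader.read(start, length);
            remoteReads++;
            blocks.put(start, block);
        }
        return block[(int) (position - start)];
    }
}
```

In this sketch the reader could be backed by anything that can serve a byte range, for example an in-memory array in a test: `RangeReader r = (off, len) -> java.util.Arrays.copyOfRange(data, (int) off, (int) off + len);`. Reads that fall within an already cached block do not touch the reader again, which is the property that matters on high-latency storage.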
13 years ago

File | Last commit | Updated |
---|---|---|
.. | | |
MANIFEST.MF | DFS: A storage layer for JGit | 13 years ago |
SOURCE-MANIFEST.MF | Prepare 1.2.0 builds | 13 years ago |