Add a storage implementation storing large objects in Amazon S3.
The AmazonS3Repository pre-signs download and upload requests.
AWS access and secret key are expected to be in the
$HOME/.aws/credentials file in the following format:
[default]
accessKey = ...
secretKey = ...
Use AWS version 4 request signing [1] because it is more secure and
supported by all regions. The version 3 signing is not supported in
newer regions.
In follow up changes we should:
- implement getVerifyAction() and do actual verification. Subclasses of
S3Repository can implement caching for object meta data (size) in order
to avoid extra roundtrips to S3. Verification should ensure that meta
data store and content of S3 storage are in sync
- HEAD request used in S3Repository.getSize() seems to always return
Content-length 0 in contrast to the documentation [2]. So getSize() does
detect if the object exists in S3 or not but in case the object exists
it always returns size 0
[1] http://docs.aws.amazon.com/general/latest/gr/signature-version-4.html
[2] https://forums.aws.amazon.com/thread.jspa?threadID=223616
Change-Id: Ic47f094928a259e5264c92b3aacf6d90210907a8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Implement LfsProtocolServlet handling the "Git LFS v1 Batch API"
protocol [1]. Add a simple file system based LFS content store and the
debug-lfs-store command to simplify testing.
Introduce a LargeFileRepository interface to enable additional storage
implementation while reusing the same protocol implementation.
At the client side we have to configure the lfs.url, specify that
we use the batch API and we don't use authentication:
[lfs]
url = http://host:port/lfs
batch = true
[lfs "http://host:port/lfs"]
access = none
the git-lfs client appends the "objects/batch" to the lfs.url.
Hard code an Authorization header in the FileLfsRepository.getAction
because then git-lfs client will skip asking for credentials. It will
just forward the Authorization header from the response to the
download/upload request.
The FileLfsServlet supports file content storage for "Large File
Storage" (LFS) server as defined by the Github LFS API [2].
- upload and download of large files is probably network bound hence use
an asynchronous servlet for good scalability
- simple object storage in file system with 2 level fan-out
- use LockFile to protect writing large objects against multiple
concurrent uploads of the same object
- to prevent corrupt uploads the uploaded file is rejected if its hash
doesn't match id given in URL
The debug-lfs-store command is used to run the LfsProtocolServlet and,
optionally, the FileLfsServlet which makes it easier to setup a
local test server.
[1]
https://github.com/github/git-lfs/blob/master/docs/api/http-v1-batch.md
[2] https://github.com/github/git-lfs/tree/master/docs/api
Bug: 472961
Change-Id: I7378da5575159d2195138d799704880c5c82d5f3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
The Large File Storage extension specified by GitHub [1] uses SHA-256 to
compute the ID of large files stored by the extension. Hence implement a
SHA-256 abstraction similar to the SHA-1 abstraction used by JGit.
[1] https://git-lfs.github.com/
Bug: 470333
Change-Id: I3a95954543c8570d73929e55f4a884b55dbf1b7a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>