Opening a readable channel can be expensive and the number of channels
can be limited in DFS. Ensure that caller of
BlockBasedFile.readOneBlock() is responsible for opening and closing
the file, and that the ReadableChannel is reused in the request. As a side
effect, this makes the code easier to read, with better use of
try-with-resources.
The downside is that this means a readable channel is always opened, even
when the entire pack is already available for copying from cache. This
should be an acceptable cost: it's rare enough not to overload the server
and from a client latency perspective, the latency cost is in the noise
relative to the data transfer cost involved in a clone. If this turns out
to be a problem in practice, we can reintroduce that optimization in a
followup change.
Change-Id: I340428ee4bacd2dce019d5616ef12339a0c85f0b
Signed-off-by: Minh Thai <mthai@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
We see the same index being loaded by multiple threads. Each is
hundreds of MB and takes several seconds to load, causing server to
run out of memory. This change introduces a lock to avoid these
duplicate works. It uses a new set of locks similar in implementation
to the loadLocks for getOrLoad of blocks. The locks are kept separate
to prevent long-running index loading from blocking out fast block
loading. The cache instance can be configured with a consumer to
monitor the wait time of the new locks.
Change-Id: I44962fe84093456962d5981545e3f7851ecb6e43
Signed-off-by: Minh Thai <mthai@google.com>
CheckoutCommand had a setForce() method. But this didn't correspond
to native git's 'git checkout -f' option. Deprecate the old setForce()
method and move its implementation to a new method setForceRefUpdate()
and use it to implement the -B option in the CLI class Checkout.
Add a setForced() method and use it to fix the associated '-f' option of
the CLI Checkout class to behave like native git's 'git checkout -f'
which overwrites dirty worktree files during checkout.
This is still not fully matching native git's behavior: updating
additionally dirty index entries is not done yet.
Bug: 530771
Change-Id: I776b78eb623b6ea0aca42f681788f2e4b1667f15
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
To avoid breaking ABI, take the opportunity to give these setters
(hopefully sometimes better) names and deprecate their old names.
Change-Id: Ib45011678c3d941f8ecc1a1e0fdf4c09cdc337e3
Signed-off-by: Mario Molina <mmolimar@gmail.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Swallowing intermittent errors and trying to recover from them
makes JGit's behavior hard to predict and difficult to debug.
Propagate the errors instead. This doesn't violate JGit's usual
backward compatibility promise for clients because in these
contexts an IOException indicates either repository corruption or
a true I/O error. Let's consider these cases one at a time.
In the case of repository corruption, falling back e.g. to an empty
set of refs or a missing ref will not serve a caller well. The
fallback does not indicate the nature of the corruption, so they are
not in a good place to recover from the error. This is analogous to
Git, which attempts to provide sufficient support to recover from
corruption (by ensuring commands like "git branch -D" cope with
corruption) but little else.
In the case of an I/O error, the best we can do is to propagate the
error so that the user sees a dialog and has an opportunity to try
again. As in the corruption case, the fallback behavior does not
provide enough information for a caller to rely on the current error
handling, and callers such as EGit already need to be able to handle
runtime exceptions.
To be conservative, keep the existing behavior for the deprecated
Repository#peel method. In this example, the fallback behavior is to
return an unpeeled ref, which is distinguishable from the ref not
existing and should thus at least be possible to debug.
Change-Id: I0eb58eb8c77519df7f50d21d1742016b978e67a3
Signed-off-by: Jonathan Nieder <jrn@google.com>
Its implementation contains
} catch (IOException e) {
// Legacy API, assume error means "no"
return false;
}
Better to use ObjectDatabase#has, which throws IOException to report
errors.
Change-Id: I7de02f7ceb8f57b2a8ebdb16d2aa4376775ff933
Signed-off-by: Jonathan Nieder <jrn@google.com>
That constant is just a redirection to a java standard constant
meanwhile. It is not referenced anymore in jgit code (and egit is just
removing it). Clients can use the redirection target directly.
Change-Id: I058d013f61da8d7b771c499d8743aafb8faa5ea8
Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
A checkout done directly after cloning (the "initial
checkout") has a different semantic as a default
checkout. That is defined in the documentation of
"git read-tree" [1]. JGit was detecting that it is
doing an initial checkout differently from native
git: jgit used to check that the index is empty
but native git required that the index file does
not exist [2]. Teach JGit to behave like native
git.
[1] https://github.com/git/git/blob/master/Documentation/git-read-tree.txt#L187
[2] https://marc.info/?t=154150811200001&r=1&w=2
Change-Id: I1dd0f1ede7cd7ea60d28607916d0165269a9f628
and allow package org.eclipse.jgit.http.server to use package
org.eclipse.jgit.internal.transport.parser.
Change-Id: Ief330c3e75a735853d0a5a265a9ff56fb5128b99
Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Another step toward merging BaseReceivePack into ReceivePack.
Change-Id: If861e28ce512f556e574352fa7d4a0df0984693f
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Another step toward merging BaseReceivePack into ReceivePack.
Change-Id: I43cf2e36e2d5b0cd85bf23c81469909c14757b63
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Another step toward eliminating BaseReceivePack as a separate API.
Change-Id: If7b7d5c65a043607a2424211adb479fa33a9077b
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This is a first step toward eliminating the BaseReceivePack API.
Inspired by a larger change by Dan Wang <dwwang@google.com>.
Change-Id: I5c876a67d8db24bf808823d9ea44d991b1ce5277
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
RefDatabase.getRef was declared final in
c1954f6c36
which only affects implementers.
Change-Id: I4c14232a119670d263d88db2b8d725dcdd36ab2a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This simplifies the BaseReceivePack class and decreases its API
surface, which should make merging with ReceivePack easier.
Inspired by 6aca8899a5 (Move first line
parsing for v0/v1 pack negotiation out of UploadPack, 2018-09-17).
Change-Id: I1fc175d15aa7cb5968c26fc83a95480403af617c
Using findRef instead of getRef makes it clearer that the caller wants
to search for the ref in the search path, instead of looking for a ref
that exactly matches the input.
This change introduces the new findRef method and deprecates getRef.
It updates Repository#findRef to use the new method, ensuring some
test coverage. Other callers will be updated in followup changes.
A nice side effect of introducing the new findRef method is that it is
final and based on firstExactRef, so implementers can focus on
implementing the latter efficiently and do not have to carefully write
custom path search code respecting SEARCH_PATH.
Change-Id: Id3bb944344a9743705fd1f20193ab679298fa51c
Signed-off-by: Jonathan Nieder <jrn@google.com>
Override exactRef(String...) and firstExactRef(String...) with
implementations specific to FileRepository.
The specialized implementations are similar to the generic ones from
RefDatabase, but because these use readRef directly instead of
exactRef, they only need to call fireRefsChanged once.
This will allow replacing RefDirectory#getRef with a generic
implementation that uses firstExactRef without hurting performance.
Change-Id: I1be1948bd6121c1a1e8152e201aab97e7fb308bb
Signed-off-by: Jonathan Nieder <jrn@google.com>
Psuedorefs like FETCH_HEAD and MERGE_HEAD are supposed to be directly
under the .git directory, not in other locations in the SEARCH_PATH
like refs/ and refs/heads/. Use exactRef to access them.
Change-Id: Iab8ac47008822fa78fc0691e239e518c34d7a98e
Signed-off-by: Jonathan Nieder <jrn@google.com>
This is simpler to implement than getRef. Make it abstract so
implementers remember to override it.
Change-Id: I5f319be1fb1206d7a0142ea939dc4e1039f850ab
Signed-off-by: Jonathan Nieder <jrn@google.com>
getRef and exactRef can produce recoverable exceptions --- for
example, a corrupt loose ref that cannot be parsed. If readRef was
called and updated looseRefs in the process, RefsChangedEvent should
still be fired.
Noticed while improving the implementation of getRef. This commit
only affects exactRef and getRef. Other methods might be similarly
skipping firing RefsChangedEvent in their error handling code, and
this change does not fix them.
Change-Id: I0f460f6c8d9a585ad8453a4a47c1c77e24a1fb83
Signed-off-by: Jonathan Nieder <jrn@google.com>
Both getRef and exactRef look for a ref or pseudoref in the $GIT_DIR
directory, with careful error handling to handle non-refs like
.git/config.
Avoid the duplication by factoring out a helper that takes care of
this. This should make the code easier to understand and manipulate.
Change-Id: I2ea67816d2385e84e2d3394b897e23df5826ba50
Signed-off-by: Jonathan Nieder <jrn@google.com>
Clients can use --shallow-exclude to obtain information about what
commits are reachable from refs they are not supposed to be able to
see. Plug the hole by allowing the AdvertiseRefsHook and RefFilter to
take effect here, too.
Change-Id: If2b8e95344fa49e10a6a202144318b60d002490e
Signed-off-by: Jonathan Nieder <jrn@google.com>
The AdvertiseRefsHook can be called twice if the following conditions
hold:
1. This AdvertiseRefsHook doesn't set this.refs.
2. getAdvertisedOrDefaultRefs is called after getFilteredRefs.
For example, this can happen when fetchV2 is called after lsRefsV2
when using a stateful bidirectional transport.
The second call does not accomplish anything useful. Guard it with
'if (!advertiseRefsHookCalled)' to avoid wasted work.
Reported-by: Jonathan Tan <jonathantanmy@google.com>
Change-Id: Ib746582e4ef645b767a5b3fb969596df99ac2ab5
Signed-off-by: Jonathan Nieder <jrn@google.com>
Now the reference carries its updateIndex, so the cursor doesn't need
to expose it.
Change-Id: Icbfca46f92a13f3d8215ad10b2a166a6f40b0b0f
Signed-off-by: Ivan Frade <ifrade@google.com>
In DFS implementations the reference table can fall out of sync, but
it is not possible to check this situation in the current API.
Add a property to the Refs indicating the order of its updates. This
version is set only by RefDatabase implementations that support
versioning (e.g reftable based).
Caller is responsible to check if the reference db creates versioned
refs before accessing getUpdateIndex(). E.g:
Ref ref = refdb.exactRef(...);
if (refdb.hasVersioning()) {
ref.getUpdateIndex();
}
Change-Id: I0d5ec8e8df47c730301b2e12851a6bf3dac9d120
Signed-off-by: Ivan Frade <ifrade@google.com>
In the longer term, we can add support for this to the
RequestValidator interface. In the short term, this is a minimal
band-aid to ensure any refs the client requests are visible to the
client.
Change-Id: I0683c7a00e707cf97eef6c6bb782671d0a550ffe
Reported-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ProtocolV2Parser explains:
// TODO(ifrade): This validation should be done after the
// protocol parsing. It is not a protocol problem asking for an
// unexisting ref and we wouldn't need the ref database here.
Do so. This way all ref database accesses are in one place, in the
UploadPack class.
No user-visible change intended --- this is just to make the code
easier to manipulate.
Change-Id: I68e87dff7b9a63ccc169bd0836e8e8baaf5d1048
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
AdvertiseRefsHook is used to limit the visibility of the refs in Gerrit.
If this hook is not called, then all refs are treated as visible.
In protocol v2, the hook is not called, causing the server to advertise
all refs. This bug was introduced in v5.0.0.201805221745-rc1~1^2~9
(Execute AdvertiseRefsHook only for protocol v0 and v1, 2018-05-14).
Even before then, the hook was not called in requests after the
capability advertisement, so in transports like HTTP that do not retain
state between round-trips, the server would advertise all refs in
response to an ls-refs (ls-remote) request.
Fix both cases by using getAdvertisedOrDefaultRefs to retrieve the
advertised refs in lsRefs, ensuring the hook is called in all cases that
use its result.
[jn: backported to stable-5.0; split out from a larger patch that also
fixes protocol v0; avoided filtering this.refs by ref prefix]
Change-Id: I64bce0e72d15b90baccc235c067e57b6af21b55f
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
AdvertiseRefsHook is used to limit the visibility of the refs in Gerrit.
If this hook is not called, then all refs are treated as visible,
causing the server to serve commits reachable from branches the client
should not be able to access, if asked to via a request naming a guessed
object id.
This bug was introduced in v2.0.0.201206130900-r~123 (Modify refs in
UploadPack/ReceivePack using a hook interface, 2012-02-08). Stateful
bidirectional transports are not affected.
Fix it by moving the AdvertiseRefsHook call to
getAdvertisedOrDefaultRefs, ensuring the hook is called in all cases.
[jn: backported to stable-4.5 by splitting out tests and the protocol v2
specific parts]
Change-Id: I159f396216354f2eda3968d17802e166d8c8ec2d
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When a server sends a ref advertisement using protocol v2 it contains
lines other than ref names and sha1s. Attempting to get the sha1 out
of such a line using the substring method can result in a SIOOB error
when it doesn't actually contain the sha1 and ref name.
Add a check that the line is of the expected length, and subsequently
that the extracted object id is valid, and if not throw an exception.
Change-Id: Id92fe66ff8b6deb2cf987d81929f8d0602c399f4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>