* http-protocol question @ 2014-12-02 2:17 Bryan Turner 2014-12-02 3:34 ` Jonathan Nieder 0 siblings, 1 reply; 14+ messages in thread From: Bryan Turner @ 2014-12-02 2:17 UTC (permalink / raw) To: Git Users In Documentation/technical/http-protocol.txt, there is the following statement: "S: Parse the git-upload-pack request: Verify all objects in `want` are directly reachable from refs. The server MAY walk backwards through history or through the reflog to permit slightly stale requests." Is there actually logic somewhere in Git that does that "MAY walk backwards" step? Or is that something another implementation of Git may do but the standard Git does not? Thanks, Bryan Turner ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 2:17 http-protocol question Bryan Turner @ 2014-12-02 3:34 ` Jonathan Nieder 2014-12-02 4:28 ` Bryan Turner 0 siblings, 1 reply; 14+ messages in thread From: Jonathan Nieder @ 2014-12-02 3:34 UTC (permalink / raw) To: Bryan Turner; +Cc: Git Users Hi Bryan, Bryan Turner wrote: > Is there actually logic somewhere in Git that does that "MAY walk > backwards" step? Yes. See upload-pack.c::check_non_tip and http://thread.gmane.org/gmane.comp.version-control.git/178814. Hope that helps, Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 3:34 ` Jonathan Nieder @ 2014-12-02 4:28 ` Bryan Turner 2014-12-02 4:29 ` Bryan Turner 2014-12-02 4:45 ` Jonathan Nieder 0 siblings, 2 replies; 14+ messages in thread From: Bryan Turner @ 2014-12-02 4:28 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Users On Tue, Dec 2, 2014 at 2:34 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Hi Bryan, > > Bryan Turner wrote: > >> Is there actually logic somewhere in Git that does that "MAY walk >> backwards" step? > > Yes. See upload-pack.c::check_non_tip and > http://thread.gmane.org/gmane.comp.version-control.git/178814. Jonathan, Thanks for the reply! I realize now I didn't really cite the part of that documentation that I'm most interested in, which is: "through history _or through the reflog_". It's the reflog bit I'm looking for. Sorry for not being more clear. check_non_tip appears to look at ancestry, walking back up (or down, depending on your view) the DAG to see if the requested SHA-1 is reachable from a current ref, so it looks like that's covering the "through history" part of that blurb. The reason I ask is that there is a race condition that exists where the ref advertisement lists refs/heads/foo at abc1234, and then foo is deleted before the actual upload-pack request comes in. In that situation, walking backwards through _history_ will only allow the upload-pack to complete unless the deleted ref was merged into another ref prior to being deleted (if I understand the code correctly). It seems like looking at the reflogs, and seeing that refs/heads/foo did in fact exist and did in fact refer to abc1234, is the only approach that could allow the upload-pack to complete. The documentation hints that such a thing is possible, but I don't see any code in Git that seems to do that. Does that make more sense? Does that functionality exist and I've just overlooked it? Thanks again, Bryan Turner > > Hope that helps, > Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 4:28 ` Bryan Turner @ 2014-12-02 4:29 ` Bryan Turner 2014-12-02 4:45 ` Jonathan Nieder 1 sibling, 0 replies; 14+ messages in thread From: Bryan Turner @ 2014-12-02 4:29 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Users On Tue, Dec 2, 2014 at 3:28 PM, Bryan Turner <bturner@atlassian.com> wrote: > On Tue, Dec 2, 2014 at 2:34 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: >> Hi Bryan, >> >> Bryan Turner wrote: >> >>> Is there actually logic somewhere in Git that does that "MAY walk >>> backwards" step? >> >> Yes. See upload-pack.c::check_non_tip and >> http://thread.gmane.org/gmane.comp.version-control.git/178814. > > Jonathan, > > Thanks for the reply! I realize now I didn't really cite the part of > that documentation that I'm most interested in, which is: "through > history _or through the reflog_". It's the reflog bit I'm looking for. > Sorry for not being more clear. check_non_tip appears to look at > ancestry, walking back up (or down, depending on your view) the DAG to > see if the requested SHA-1 is reachable from a current ref, so it > looks like that's covering the "through history" part of that blurb. > > The reason I ask is that there is a race condition that exists where > the ref advertisement lists refs/heads/foo at abc1234, and then foo is > deleted before the actual upload-pack request comes in. In that > situation, walking backwards through _history_ will only allow the > upload-pack to complete unless the deleted ref was merged into another s/unless/if, sorry. "In that situation, walking backwards through history will only allow the upload-pack to complete if the deleted ref was merged into another ref." > ref prior to being deleted (if I understand the code correctly). It > seems like looking at the reflogs, and seeing that refs/heads/foo did > in fact exist and did in fact refer to abc1234, is the only approach > that could allow the upload-pack to complete. The documentation hints > that such a thing is possible, but I don't see any code in Git that > seems to do that. > > Does that make more sense? Does that functionality exist and I've just > overlooked it? > > Thanks again, > Bryan Turner > >> >> Hope that helps, >> Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 4:28 ` Bryan Turner 2014-12-02 4:29 ` Bryan Turner @ 2014-12-02 4:45 ` Jonathan Nieder 2014-12-02 5:04 ` Bryan Turner 1 sibling, 1 reply; 14+ messages in thread From: Jonathan Nieder @ 2014-12-02 4:45 UTC (permalink / raw) To: Bryan Turner; +Cc: Git Users Hi, Bryan Turner wrote: > The reason I ask is that there is a race condition that exists where > the ref advertisement lists refs/heads/foo at abc1234, and then foo is > deleted before the actual upload-pack request comes in. Can you say more about the context? For example, was this actually happening, or is this for the sake of understanding the protocol better? I ask because knowing the context might help us give more specific advice. Sometimes when people delete a branch they really want those objects to be inaccessible *right away*. So for such people, git's behavior of failing the request unless the objects are still accessible by some other path is helpful. A stateful server could theoretically cache the list of objects they have advertised for a short time, to avoid clients having to suffer when something becomes inaccessible during the window between the upload-pack advertisement and upload-pack request. Or a permissive server can allow all wants except a specific blacklist (and some people do that). Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 4:45 ` Jonathan Nieder @ 2014-12-02 5:04 ` Bryan Turner 2014-12-02 5:17 ` Jonathan Nieder 2014-12-02 5:33 ` Jeff King 0 siblings, 2 replies; 14+ messages in thread From: Bryan Turner @ 2014-12-02 5:04 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Users On Tue, Dec 2, 2014 at 3:45 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Hi, > > Bryan Turner wrote: > >> The reason I ask is that there is a race condition that exists where >> the ref advertisement lists refs/heads/foo at abc1234, and then foo is >> deleted before the actual upload-pack request comes in. > > Can you say more about the context? For example, was this actually > happening, or is this for the sake of understanding the protocol > better? I ask because it's actually happening. Heavy CI load sometimes has builds fail because git clone dies with "not our ref". That's the specific context I'm working to try and address right now. Some teams use rebase-heavy workflows, which also evades the check_non_tip easing and fails with "not our ref", so I can't be 100% certain it's ref deletion in specific causing it (but I guess which of those it is is probably largely academic; as long as hosting spans multiple requests it seems like this type of race condition is unavoidable). I'm trying to decide if there is something I can enable or tune to prevent it, and the type of twilighting hinted at by the http-protocol documentation seemed like it could be just the thing. > > I ask because knowing the context might help us give more specific > advice. > > Sometimes when people delete a branch they really want those objects > to be inaccessible *right away*. So for such people, git's behavior > of failing the request unless the objects are still accessible by > some other path is helpful. > > A stateful server could theoretically cache the list of objects they > have advertised for a short time, to avoid clients having to suffer > when something becomes inaccessible during the window between the > upload-pack advertisement and upload-pack request. Or a permissive > server can allow all wants except a specific blacklist (and some > people do that). > > Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:04 ` Bryan Turner @ 2014-12-02 5:17 ` Jonathan Nieder 2014-12-02 5:30 ` Duy Nguyen 2014-12-02 5:37 ` Duy Nguyen 2014-12-02 5:33 ` Jeff King 1 sibling, 2 replies; 14+ messages in thread From: Jonathan Nieder @ 2014-12-02 5:17 UTC (permalink / raw) To: Bryan Turner; +Cc: Git Users Bryan Turner wrote: > I ask because it's actually happening. Heavy CI load sometimes has > builds fail because git clone dies with "not our ref". That's the > specific context I'm working to try and address right now. Thanks --- that makes sense. > Some teams > use rebase-heavy workflows, which also evades the check_non_tip easing > and fails with "not our ref", so I can't be 100% certain it's ref > deletion in specific causing it (but I guess which of those it is is > probably largely academic; as long as hosting spans multiple requests > it seems like this type of race condition is unavoidable). Yeah, everything mentioned before about ref deletion also applies to non-fast-forward updates. > I'm trying to decide if there is something I can enable or tune to > prevent it, and the type of twilighting hinted at by the http-protocol > documentation seemed like it could be just the thing. If you control the side that clones, then just cloning the single branch you are building (with --single-branch and -b) can help. Using a bidirectional protocol like git:// or ssh:// (where the ref advertisement and check of wants happen in the same connection) would avoid the problem we're talking about. On the server side, I agree that either mining reflogs or storing advertisements to disk would be a nice way to take care of this. No one has implemented either of those, but it would be a nice setting to have. :) Thanks, Jonathan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:17 ` Jonathan Nieder @ 2014-12-02 5:30 ` Duy Nguyen 2014-12-02 5:37 ` Duy Nguyen 1 sibling, 0 replies; 14+ messages in thread From: Duy Nguyen @ 2014-12-02 5:30 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Bryan Turner, Git Users On Tue, Dec 2, 2014 at 12:17 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > On the server side, I agree that either mining reflogs or storing > advertisements to disk would be a nice way to take care of this. > No one has implemented either of those, but it would be a nice setting > to have. :) and could be dangerous too. If I accidentally add a password file, then delete it (way before a clone is performed), I don't want that part of reflog visible to the cloner. Problem is we don't know how far we should look back in reflog unless we keep some state. -- Duy ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:17 ` Jonathan Nieder 2014-12-02 5:30 ` Duy Nguyen @ 2014-12-02 5:37 ` Duy Nguyen 1 sibling, 0 replies; 14+ messages in thread From: Duy Nguyen @ 2014-12-02 5:37 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Bryan Turner, Git Users On Tue, Dec 2, 2014 at 12:17 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: >> I'm trying to decide if there is something I can enable or tune to >> prevent it, and the type of twilighting hinted at by the http-protocol >> documentation seemed like it could be just the thing. > > If you control the side that clones, then just cloning the single branch > you are building (with --single-branch and -b) can help. Or we could extend this a bit on the server side, if one ref fail, let the clone continue with the remaining refs (and warn loudly to the user). -- Duy ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:04 ` Bryan Turner 2014-12-02 5:17 ` Jonathan Nieder @ 2014-12-02 5:33 ` Jeff King 2014-12-02 5:47 ` Bryan Turner 2014-12-02 17:45 ` Junio C Hamano 1 sibling, 2 replies; 14+ messages in thread From: Jeff King @ 2014-12-02 5:33 UTC (permalink / raw) To: Bryan Turner; +Cc: Jonathan Nieder, Git Users On Tue, Dec 02, 2014 at 04:04:11PM +1100, Bryan Turner wrote: > > Can you say more about the context? For example, was this actually > > happening, or is this for the sake of understanding the protocol > > better? > > I ask because it's actually happening. Heavy CI load sometimes has > builds fail because git clone dies with "not our ref". That's the > specific context I'm working to try and address right now. Some teams > use rebase-heavy workflows, which also evades the check_non_tip easing > and fails with "not our ref", so I can't be 100% certain it's ref > deletion in specific causing it (but I guess which of those it is is > probably largely academic; as long as hosting spans multiple requests > it seems like this type of race condition is unavoidable). There is a practical reason to care. Ref deletion will also delete the reflog, leaving no trace of the reachability. Whereas a non-fast-forward push could be resolved by looking in the reflog. One problem with hunting for sha1s in the reflog is that upload-pack does not know which ref the client thinks they are requesting. So a search would involve looking in _every_ reflog, which could be expensive. It might not be too painful if you do the search only after hitting a "not our ref" condition, which should in theory be rare. A malicious client could convince you to grep your reflogs repeatedly, but that is hardly the only way to convince upload-pack to chew CPU. Asking it to make a pack comes to mind. :) > I'm trying to decide if there is something I can enable or tune to > prevent it, and the type of twilighting hinted at by the http-protocol > documentation seemed like it could be just the thing. For a public repository, it might make sense to provide a config option to loosen the is_our_ref check completely (i.e., to just has_sha1_file). But such an option does not yet exist. -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:33 ` Jeff King @ 2014-12-02 5:47 ` Bryan Turner 2014-12-02 5:52 ` Jeff King 2014-12-02 17:45 ` Junio C Hamano 1 sibling, 1 reply; 14+ messages in thread From: Bryan Turner @ 2014-12-02 5:47 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Nieder, Git Users On Tue, Dec 2, 2014 at 4:33 PM, Jeff King <peff@peff.net> wrote: > On Tue, Dec 02, 2014 at 04:04:11PM +1100, Bryan Turner wrote: > >> > Can you say more about the context? For example, was this actually >> > happening, or is this for the sake of understanding the protocol >> > better? >> >> I ask because it's actually happening. Heavy CI load sometimes has >> builds fail because git clone dies with "not our ref". That's the >> specific context I'm working to try and address right now. Some teams >> use rebase-heavy workflows, which also evades the check_non_tip easing >> and fails with "not our ref", so I can't be 100% certain it's ref >> deletion in specific causing it (but I guess which of those it is is >> probably largely academic; as long as hosting spans multiple requests >> it seems like this type of race condition is unavoidable). > > There is a practical reason to care. Ref deletion will also delete the > reflog, leaving no trace of the reachability. Whereas a non-fast-forward > push could be resolved by looking in the reflog. A fair point. I had mistakenly thought that reflogs would survive the ref's deletion and be "pruned" as part of garbage collection, but a quick test shows that, as I'm sure you already know, that's not true. Thanks for correcting my mistake! Bryan Turner ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:47 ` Bryan Turner @ 2014-12-02 5:52 ` Jeff King 0 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2014-12-02 5:52 UTC (permalink / raw) To: Bryan Turner; +Cc: Jonathan Nieder, Git Users On Tue, Dec 02, 2014 at 04:47:50PM +1100, Bryan Turner wrote: > > There is a practical reason to care. Ref deletion will also delete the > > reflog, leaving no trace of the reachability. Whereas a non-fast-forward > > push could be resolved by looking in the reflog. > > A fair point. I had mistakenly thought that reflogs would survive the > ref's deletion and be "pruned" as part of garbage collection, but a > quick test shows that, as I'm sure you already know, that's not true. I wish it worked that way. Unfortunately there are complications with keeping the old reflogs in place, because they sometimes cause conflicts with new refs being created (e.g., a reflog in ".git/logs/refs/heads/foo" would prevent ".git/logs/refs/heads/foo/bar" from being created). I had some patches long ago to try to keep a "reflog graveyard" around, but they were quite invasive, and there were some corner cases that caused weird errors. Handling this sort of D/F conflict more gracefully is one of the things I'd like to experiment with once we have pluggable ref backends (I think we'll also disallow "foo/bar" if "foo" exists, but the storage could at least keep the reflogs around). -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 5:33 ` Jeff King 2014-12-02 5:47 ` Bryan Turner @ 2014-12-02 17:45 ` Junio C Hamano 2014-12-02 19:50 ` Jeff King 1 sibling, 1 reply; 14+ messages in thread From: Junio C Hamano @ 2014-12-02 17:45 UTC (permalink / raw) To: Jeff King; +Cc: Bryan Turner, Jonathan Nieder, Git Users Jeff King <peff@peff.net> writes: > For a public repository, it might make sense to provide a config option > to loosen the is_our_ref check completely (i.e., to just has_sha1_file). > But such an option does not yet exist. In principle, yes, but that cannot be has_sha1_file(); it has to have a fully connected healthy history behind it. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: http-protocol question 2014-12-02 17:45 ` Junio C Hamano @ 2014-12-02 19:50 ` Jeff King 0 siblings, 0 replies; 14+ messages in thread From: Jeff King @ 2014-12-02 19:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: Bryan Turner, Jonathan Nieder, Git Users On Tue, Dec 02, 2014 at 09:45:06AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > For a public repository, it might make sense to provide a config option > > to loosen the is_our_ref check completely (i.e., to just has_sha1_file). > > But such an option does not yet exist. > > In principle, yes, but that cannot be has_sha1_file(); it has to > have a fully connected healthy history behind it. I thought about that, but I wonder if it matters whether we check up front. We are not advertising it, but only trying our best to fulfill the "want" from the other side. So either: 1. We check immediately whether we have the full history behind it, and reject the "want" otherwise. 2. We start pack-objects on it, and the first thing it will do is collect the full set of objects to send. If it doesn't have them, it will barf. Probably (1) would produce nicer error messages, but (2) is a lot more efficient. I dunno. I am not volunteering to implement, so I will leave thinking on it more to somebody who wants to do so. :) -Peff ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-12-02 19:50 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-02 2:17 http-protocol question Bryan Turner 2014-12-02 3:34 ` Jonathan Nieder 2014-12-02 4:28 ` Bryan Turner 2014-12-02 4:29 ` Bryan Turner 2014-12-02 4:45 ` Jonathan Nieder 2014-12-02 5:04 ` Bryan Turner 2014-12-02 5:17 ` Jonathan Nieder 2014-12-02 5:30 ` Duy Nguyen 2014-12-02 5:37 ` Duy Nguyen 2014-12-02 5:33 ` Jeff King 2014-12-02 5:47 ` Bryan Turner 2014-12-02 5:52 ` Jeff King 2014-12-02 17:45 ` Junio C Hamano 2014-12-02 19:50 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).