All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: Dennis Kaarsemaker <dennis@kaarsemaker.net>,
	gitster@pobox.com, git@vger.kernel.org
Subject: [PATCH v2 0/3] fix http deadlock on giant ref negotiations
Date: Wed, 20 May 2015 03:35:26 -0400	[thread overview]
Message-ID: <20150520073526.GA16784@peff.net> (raw)
In-Reply-To: <55563AD5.4090105@linuxfoundation.org>

On Fri, May 15, 2015 at 02:28:37PM -0400, Konstantin Ryabitsev wrote:

> On 15/05/15 02:22 PM, Junio C Hamano wrote:
> > Also, is it worth allocating small and then growing up to the maximum?
> > I think this only relays one request at a time anyway, and I suspect
> > that a single 1MB allocation at the first call kept getting reused
> > may be sufficient (and much simpler).
> 
> Does it make sense to make that configurable via an env variable at all?
> I suspect the vast majority of people would not hit this bug unless they
> are dealing with repositories polluted with hundreds of refs created by
> automation (like the codeaurora chromium repo).
> 
> E.g. can be set via an Apache directive like
> 
> SetEnv FOO_MAX_SIZE 2048
> 
> The backend can then be configured to emit an error message when the
> spool size is exhausted saying "foo exhausted, increase FOO_MAX_SIZE to
> allow for moar foo."

Yeah, that was the same conclusion I came to elsewhere in the thread.
Here's a re-roll:

  [1/3]: http-backend: fix die recursion with custom handler
  [2/3]: t5551: factor out tag creation
  [3/3]: http-backend: spool ref negotiation requests to buffer

It makes the size configurable (either through the environment, which is
convenient for setting via Apache; or through the config, which is
convenient if you have one absurdly-sized repo). It mentions the
variable name when it barfs into the Apache log. I also bumped the
default to 10MB, which I think should be enough to handle even
ridiculous cases.

I also adapted Dennis's test into the third patch. Beware that it's
quite slow to run (and is protected by the "EXPENSIVE" prerequisite).
Patch 2 is new, and just refactors the script to make adding the new
test easier.

I really wanted to add a test like:

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index e2a2fa1..1fff812 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -273,6 +273,16 @@ test_expect_success 'large fetch-pack requests can be split across POSTs' '
 	test_line_count = 2 posts
 '
 
+test_expect_success 'http-backend does not buffer arbitrarily large requests' '
+	test_when_finished "(
+		cd \"$HTTPD_DOCUMENT_ROOT_PATH/repo.git\" &&
+		test_unconfig http.maxrequestbuffer
+	)" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
+		config http.maxrequestbuffer 100 &&
+	test_must_fail git clone $HTTPD_URL/smart/repo.git foo.git
+'
+
 test_expect_success EXPENSIVE 'http can handle enormous ref negotiation' '
 	(
 		cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&

to test that the maxRequestBuffer code does indeed work. Unfortunately,
even though the server behaves reasonably in this case, the client ends
up hanging forever. I'm not sure there is a simple solution to that; I
think it is a protocol issue where remote-http is waiting for fetch-pack
to speak, but fetch-pack is waiting for more data from the remote.

Personally, I think I'd be much more interested in pursuing a saner,
full duplex http solution like git-over-websockets than trying to iron
out all of the corner cases in the stateless-rpc protocol.

-Peff

  reply	other threads:[~2015-05-20  7:35 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-13 21:04 Clone hangs when done over http with --reference Konstantin Ryabitsev
2015-05-14  0:47 ` Jeff King
2015-05-14  1:02   ` [PATCH] http-backend: fix die recursion with custom handler Jeff King
2015-05-15  6:20     ` Jeff King
2015-05-14 17:08   ` Clone hangs when done over http with --reference Konstantin Ryabitsev
2015-05-14 19:52     ` Jeff King
2015-05-15  6:29   ` [PATCH 0/2] fix http deadlock on giant ref negotiations Jeff King
2015-05-15  6:29     ` [PATCH 1/2] http-backend: fix die recursion with custom handler Jeff King
2015-05-15  6:33     ` [PATCH 2/2] http-backend: spool ref negotiation requests to buffer Jeff King
2015-05-15 18:22       ` Junio C Hamano
2015-05-15 18:28         ` Konstantin Ryabitsev
2015-05-20  7:35           ` Jeff King [this message]
2015-05-20  7:36             ` [PATCH v2 1/3] http-backend: fix die recursion with custom handler Jeff King
2015-05-20  7:36             ` [PATCH v2 2/3] t5551: factor out tag creation Jeff King
2015-05-20  7:37             ` [PATCH v2 3/3] http-backend: spool ref negotiation requests to buffer Jeff King
2015-05-26  2:07               ` Konstantin Ryabitsev
2015-05-26  2:24                 ` Jeff King
2015-05-26  3:43                   ` Junio C Hamano
2015-05-15 19:16         ` [PATCH 2/2] " Jeff King
2015-05-15  7:41     ` [PATCH 0/2] fix http deadlock on giant ref negotiations Dennis Kaarsemaker
2015-05-15  8:38       ` Jeff King
2015-05-15  8:44         ` Dennis Kaarsemaker
2015-05-15  8:53           ` Jeff King
2015-05-15  9:11             ` Dennis Kaarsemaker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150520073526.GA16784@peff.net \
    --to=peff@peff.net \
    --cc=dennis@kaarsemaker.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=konstantin@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.