git.vger.kernel.org archive mirror
* git push keeps writing after server failure
@ 2015-06-12 17:31 Shawn Pearce
  2015-06-12 17:57 ` Junio C Hamano
  2015-06-12 18:12 ` Jeff King
  0 siblings, 2 replies; 5+ messages in thread
From: Shawn Pearce @ 2015-06-12 17:31 UTC (permalink / raw)
  To: git

I did something stupid like trying to push a copy of WebKit[1] into my
GitHub account. This is ~5.2 GiB of data, which GitHub prefers not to
accept. Ok ...

$ git push --all git@github.com:spearce/wk.git
Counting objects: 2752427, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (442684/442684), done.
remote: fatal: pack exceeds maximum allowed size
Writing objects: 100% (2752427/2752427), 5.28 GiB | 8.86 MiB/s, done.
Total 2752427 (delta 2225007), reused 2752427 (delta 2225007)
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly

Notice GitHub prints "remote: fatal: pack exceeds maximum allowed
size". That interrupted my "Writing objects" progress meter, and then
git push just kept going and wrote really really fast (170 MiB/s!)
until the entire pack was sent.

A similar thing happens on https:// if the remote HTTP server goes
away while the pack is being sent. Except it's slower to send the
remainder of the pack before git push finally terminates with an
error.

Shouldn't git push realize its stream is broken and stop writing when
the peer is all like "uh, no, I'm not going to do that, but thanks for
trying"?


[1] https://webkit.googlesource.com/WebKit/


* Re: git push keeps writing after server failure
  2015-06-12 17:31 git push keeps writing after server failure Shawn Pearce
@ 2015-06-12 17:57 ` Junio C Hamano
  2015-06-12 18:12 ` Jeff King
  1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2015-06-12 17:57 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> I did something stupid like trying to push a copy of WebKit[1] into my
> GitHub account. This is ~5.2 GiB of data, which GitHub prefers not to
> accept. Ok ...
> ...
> Shouldn't git push realize its stream is broken and stop writing when
> the peer is all like "uh, no, I'm not going to do that, but thanks for
> trying"?

Nice wish, but I am not sure if we are structured to do that easily.
The "fatal:" message is going on the sideband that is demultiplexed
by a separate async "sideband-demux" thread and pack_objects() is
oblivious to what is being said on the error channel #2.


* Re: git push keeps writing after server failure
  2015-06-12 17:31 git push keeps writing after server failure Shawn Pearce
  2015-06-12 17:57 ` Junio C Hamano
@ 2015-06-12 18:12 ` Jeff King
  2015-06-12 18:20   ` Jeff King
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff King @ 2015-06-12 18:12 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Fri, Jun 12, 2015 at 10:31:33AM -0700, Shawn Pearce wrote:

> I did something stupid like trying to push a copy of WebKit[1] into my
> GitHub account. This is ~5.2 GiB of data, which GitHub prefers not to
> accept. Ok ...

Heh, yeah. We cap it at 2G, and if you are going to have a WebKit fork,
we prefer you fork the actual WebKit repo so it shares objects (though
if you have a need to create a new fork network, let me know).

> $ git push --all git@github.com:spearce/wk.git
> Counting objects: 2752427, done.
> Delta compression using up to 12 threads.
> Compressing objects: 100% (442684/442684), done.
> remote: fatal: pack exceeds maximum allowed size
> Writing objects: 100% (2752427/2752427), 5.28 GiB | 8.86 MiB/s, done.
> Total 2752427 (delta 2225007), reused 2752427 (delta 2225007)
> fatal: The remote end hung up unexpectedly
> fatal: The remote end hung up unexpectedly
> 
> Notice GitHub prints "remote: fatal: pack exceeds maximum allowed
> size". That interrupted my "Writing objects" progress meter, and then
> git push just kept going and wrote really really fast (170 MiB/s!)
> until the entire pack was sent.

Sounds like it's writing to a closed fd, then. Which makes sense; I
think we should hang up the socket after writing the "fatal" message
above.

> Shouldn't git push realize its stream is broken and stop writing when
> the peer is all like "uh, no, I'm not going to do that, but thanks for
> trying"?

Hrm. I have this old patch, which was originally written so that "kill
$(pidof git-push)" did not let a rogue pack-objects continue writing.

I'm not sure if that's what is going on here, though. I think we connect
pack-objects directly to the socket. So it sounds more like
"pack-objects --stdout" needs to know to stop writing when writes to the
socket fail.

-- >8 --
Date: Sun, 3 Apr 2011 20:53:08 -0400
Subject: [PATCH] send-pack: kill pack-objects helper on signal or exit

We spawn an external pack-objects process to actually send
objects to the remote side. If we are killed by a signal
during this process, the pack-objects will keep running and
complete the push, which may surprise the user. We should
take it down when we go down.

Signed-off-by: Jeff King <peff@peff.net>
---
 send-pack.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/send-pack.c b/send-pack.c
index 2a64fec..bdf723b 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -67,6 +67,7 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
 	po.in = -1;
 	po.out = args->stateless_rpc ? -1 : fd;
 	po.git_cmd = 1;
+	po.clean_on_exit = 1;
 	if (start_command(&po))
 		die_errno("git pack-objects failed");
 
-- 
2.4.2.752.geeb594a


* Re: git push keeps writing after server failure
  2015-06-12 18:12 ` Jeff King
@ 2015-06-12 18:20   ` Jeff King
  2015-06-12 18:50     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2015-06-12 18:20 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Fri, Jun 12, 2015 at 02:12:56PM -0400, Jeff King wrote:

> > $ git push --all git@github.com:spearce/wk.git
> > Counting objects: 2752427, done.
> > Delta compression using up to 12 threads.
> > Compressing objects: 100% (442684/442684), done.
> > remote: fatal: pack exceeds maximum allowed size
> > Writing objects: 100% (2752427/2752427), 5.28 GiB | 8.86 MiB/s, done.
> > Total 2752427 (delta 2225007), reused 2752427 (delta 2225007)
> > fatal: The remote end hung up unexpectedly
> > fatal: The remote end hung up unexpectedly
> > 
> > Notice GitHub prints "remote: fatal: pack exceeds maximum allowed
> > size". That interrupted my "Writing objects" progress meter, and then
> > git push just kept going and wrote really really fast (170 MiB/s!)
> > until the entire pack was sent.
> 
> Sounds like it's writing to a closed fd, then. Which makes sense; I
> think we should hang up the socket after writing the "fatal" message
> above.

For reference, here's the patch implementing the max-size check on the
server. It's on my long list of patches to clean up and send to the
list; I never did this one because of the unpack-objects caveat
mentioned below.

The death happens in index-pack. I think receive-pack should notice that
and bail (closing the socket), but it's possible that it doesn't.

There's also a git-aware proxying layer at GitHub between clients and
git-receive-pack these days. It's possible that it is not properly
hanging up until it sees EOF from the client.

If any of those things are true, then it is a GitHub-specific problem
(which I still want to fix, but it is not something git upstream needs
to care about).

-- >8 --
Date: Tue, 4 Sep 2012 12:01:06 -0400
Subject: [PATCH] receive-pack: allow a maximum input size to be specified

Receive-pack feeds its input to index-pack, which will
happily accept as many bytes as a sender is willing to
provide. Let's allow an arbitrary cutoff point where we will
stop writing bytes to disk (they're still left in the
tmp_pack, but cleaning that can easily be done outside of
receive-pack).

Note that this doesn't handle the git-unpack-objects code
path at all (because GitHub sets transfer.unpacklimit to 1,
so we never unpack pushed objects anyway).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/index-pack.c   |  5 +++++
 builtin/receive-pack.c | 14 +++++++++++++-
 t/t5528-push-limits.sh | 27 +++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100755 t/t5528-push-limits.sh

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index dd1c5c9..054a144 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -70,6 +70,7 @@ static struct progress *progress;
 static unsigned char input_buffer[4096];
 static unsigned int input_offset, input_len;
 static off_t consumed_bytes;
+static off_t max_input_size;
 static unsigned deepest_delta;
 static git_SHA_CTX input_ctx;
 static uint32_t input_crc32;
@@ -167,6 +168,8 @@ static void use(int bytes)
 	if (signed_add_overflows(consumed_bytes, bytes))
 		die("pack too large for current definition of off_t");
 	consumed_bytes += bytes;
+	if (max_input_size && consumed_bytes > max_input_size)
+		die("pack exceeds maximum allowed size");
 }
 
 static const char *open_pack_file(const char *pack_name)
@@ -1148,6 +1151,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 					opts.off32_limit = strtoul(c+1, &c, 0);
 				if (*c || opts.off32_limit & 0x80000000)
 					die("bad %s", arg);
+			} else if (!prefixcmp(arg, "--max-input-size=")) {
+				max_input_size = strtoul(arg + 17, NULL, 10);
 			} else
 				usage(index_pack_usage);
 			continue;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 0afb8b2..e64b8d2 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -31,6 +31,7 @@ static int transfer_fsck_objects = -1;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
+static off_t max_input_size;
 static int report_status;
 static int use_sideband;
 static int quiet;
@@ -113,6 +114,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.maxsize") == 0) {
+		max_input_size = git_config_ulong(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -832,9 +838,10 @@ static const char *unpack(void)
 			return NULL;
 		return "unpack-objects abnormal exit";
 	} else {
-		const char *keeper[7];
+		const char *keeper[8];
 		int s, status, i = 0;
 		char keep_arg[256];
+		char max_input_arg[256];
 		struct child_process ip;
 
 		s = sprintf(keep_arg, "--keep=receive-pack %"PRIuMAX" on ", (uintmax_t) getpid());
@@ -848,6 +855,11 @@ static const char *unpack(void)
 		keeper[i++] = "--fix-thin";
 		keeper[i++] = hdr_arg;
 		keeper[i++] = keep_arg;
+		if (max_input_size) {
+			snprintf(max_input_arg, sizeof(max_input_arg),
+				 "--max-input-size=%lu", max_input_size);
+			keeper[i++] = max_input_arg;
+		}
 		keeper[i++] = NULL;
 		memset(&ip, 0, sizeof(ip));
 		ip.argv = keeper;
diff --git a/t/t5528-push-limits.sh b/t/t5528-push-limits.sh
new file mode 100755
index 0000000..c2f6d77
--- /dev/null
+++ b/t/t5528-push-limits.sh
@@ -0,0 +1,27 @@
+#!/bin/sh
+
+test_description='check input limits for pushing'
+. ./test-lib.sh
+
+test_expect_success 'create known-size commit' '
+	test-genrandom foo 1024 >file &&
+	git add file &&
+	test_commit one-k
+'
+
+test_expect_success 'create remote repository' '
+	git init --bare dest &&
+	git --git-dir=dest config receive.unpacklimit 1
+'
+
+test_expect_success 'receive.maxsize rejects push' '
+	git --git-dir=dest config receive.maxsize 512 &&
+	test_must_fail git push dest HEAD
+'
+
+test_expect_success 'bumping limit allows push' '
+	git --git-dir=dest config receive.maxsize 4k &&
+	git push dest HEAD
+'
+
+test_done
-- 
2.4.2.752.geeb594a


* Re: git push keeps writing after server failure
  2015-06-12 18:20   ` Jeff King
@ 2015-06-12 18:50     ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2015-06-12 18:50 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Fri, Jun 12, 2015 at 02:20:45PM -0400, Jeff King wrote:

> > > Notice GitHub prints "remote: fatal: pack exceeds maximum allowed
> > > size". That interrupted my "Writing objects" progress meter, and then
> > > git push just kept going and wrote really really fast (170 MiB/s!)
> > > until the entire pack was sent.
> > 
> > Sounds like it's writing to a closed fd, then. Which makes sense; I
> > think we should hang up the socket after writing the "fatal" message
> > above.
> 
> For reference, here's the patch implementing the max-size check on the
> server. It's on my long list of patches to clean up and send to the
> list; I never did this one because of the unpack-objects caveat
> mentioned below.

I did a little more digging on this.

With the max-size patch, we seem to reliably notice the problem and die
of EPIPE (you can bump receive.maxsize to something reasonable like 1m).
Pushing to GitHub[1], though, sometimes dies and sometimes ends up
pushing the whole pack over the ssh session. It seems racy.

I've confirmed in both cases that the receive-pack process dies on our
server. So presumably the problem is in between; it might be an ssh
weirdness, or it might be a problem with our proxy layer. I'll open an
issue internally to look at that.

-Peff

[1] I can tweak the max-size on a per-repo basis, which is how I did my
    testing without waiting for 2G to transfer. If anybody is interested
    in diagnosing the client side of this, I am happy to drop the
    max-size on a test repo for you. But AFAICT it is not a client
    problem at all.

