git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* http-backend fatal error message on shallow clone
@ 2016-06-21 11:23 Eric Wong
  2016-06-21 12:10 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Wong @ 2016-06-21 11:23 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn O. Pearce

I noticed "fatal: The remote end hung up unexpectedly" in server
logs from shallow clones.  Totally reproducible in the test
cases, too.  The following change shows it:

diff --git a/t/t5561-http-backend.sh b/t/t5561-http-backend.sh
index 90e0d6f..cfa55ce 100755
--- a/t/t5561-http-backend.sh
+++ b/t/t5561-http-backend.sh
@@ -132,5 +132,11 @@ test_expect_success 'server request log matches test results' '
 	test_cmp exp act
 '
 
+test_expect_success 'shallow clone' '
+	config http.uploadpack true &&
+	git clone --depth=1 "$HTTPD_URL/smart/repo.git" shallow &&
+	tail "$HTTPD_ROOT_PATH"/error.log | grep fatal
+'
+
 stop_httpd
 test_done


And the last test ends like this:

expecting success: 
	config http.uploadpack true &&
	git clone --depth=1 "$HTTPD_URL/smart/repo.git" shallow &&
	tail "$HTTPD_ROOT_PATH"/error.log | grep fatal

Cloning into 'shallow'...
[Tue Jun 21 11:07:41.391269 2016] [cgi:error] [pid 21589] [client 127.0.0.1:37518] AH01215: fatal: The remote end hung up unexpectedly
ok 15 - shallow clone

It doesn't show above, but I think http-backend exits
with a non-zero status, too, which might cause some CGI
implementations to complain or break.

Not sure if it's just a corner case that wasn't tested
or something else, but the clone itself seems successful...

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: http-backend fatal error message on shallow clone
  2016-06-21 11:23 http-backend fatal error message on shallow clone Eric Wong
@ 2016-06-21 12:10 ` Jeff King
  2016-06-21 16:41   ` Duy Nguyen
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2016-06-21 12:10 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, Shawn O. Pearce

On Tue, Jun 21, 2016 at 11:23:03AM +0000, Eric Wong wrote:

> I noticed "fatal: The remote end hung up unexpectedly" in server
> logs from shallow clones.  Totally reproducible in the test
> cases, too.  The following change shows it:
> [...]
> [Tue Jun 21 11:07:41.391269 2016] [cgi:error] [pid 21589] [client 127.0.0.1:37518] AH01215: fatal: The remote end hung up unexpectedly
> 
> It doesn't show above, but I think http-backend exits
> with a non-zero status, too, which might cause some CGI
> implementations to complain or break.
> 
> Not sure if it's just a corner case that wasn't tested
> or something else, but the clone itself seems successful...

The dying process is actually upload-pack. If we instrument it like
this:

diff --git a/upload-pack.c b/upload-pack.c
index 9e03c27..a1da676 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -820,6 +820,14 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+NORETURN
+static void custom_die(const char *err, va_list params)
+{
+	vreportf("fatal: ", err, params);
+	warning("aborting");
+	abort();
+}
+
 int main(int argc, const char **argv)
 {
 	const char *dir;
@@ -836,6 +844,9 @@ int main(int argc, const char **argv)
 		OPT_END()
 	};
 
+	warning("running upload-pack");
+	set_die_routine(custom_die);
+
 	git_setup_gettext();
 
 	packet_trace_identity("upload-pack");

we can see two things. One is we can get a backtrace from the core file:

#0  0x00007f04aef51458 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f04aef528da in __GI_abort () at abort.c:89
#2  0x0000000000406009 in custom_die (err=0x4ede60 "The remote end hung up unexpectedly", 
    params=0x7ffe15858758) at upload-pack.c:828
#3  0x000000000045ec63 in die (err=0x4ede60 "The remote end hung up unexpectedly") at usage.c:108
#4  0x000000000041e016 in get_packet_data (fd=0, src_buf=0x0, src_size=0x0, dst=0x7ffe158588b0, 
    size=4, options=2) at pkt-line.c:167
#5  0x000000000041e0ea in packet_read (fd=0, src_buf=0x0, src_len=0x0, 
    buffer=0x73cc40 <packet_buffer> "deepen 1", size=65520, options=2) at pkt-line.c:204
#6  0x000000000041e22b in packet_read_line_generic (fd=0, src=0x0, src_len=0x0, dst_len=0x0)
    at pkt-line.c:234
#7  0x000000000041e27c in packet_read_line (fd=0, len_p=0x0) at pkt-line.c:244
#8  0x0000000000404dc9 in get_common_commits () at upload-pack.c:384
#9  0x0000000000405eb4 in upload_pack () at upload-pack.c:798
#10 0x0000000000406229 in main (argc=1, argv=0x7ffe15858c28) at upload-pack.c:872

So we expected to read a packet but didn't get one. Not surprising. The
interesting thing is that we are in get_common_commits(), and if we were
to get a flush packet in stateless-rpc mode, we would simply exit(0).
But we get EOF instead, which provokes packet_read_line() to die.

The other interesting thing is that we can see in httpd's error.log that
it is the penultimate upload-pack that dies (I trimmed the log lines for
readability):

AH01215: warning: running upload-pack
AH01215: fatal: The remote end hung up unexpectedly
AH01215: warning: aborting
AH01215: error: git-upload-pack died of signal 6
AH01215: warning: running upload-pack

So this request actually takes _two_ upload-pack instances to serve
(which is not uncommon when we need multiple rounds of
get_common_commits(), though I am a little surprised that would be the
case for a clone). And the first one seems to be missing a closing
"0000" flush packet from the client to end it.

If that analysis is correct, this isn't affecting operation in any way;
it's just giving a useless message from upload-pack (and as you note,
that could trigger http-backend to exit(1), which may make the webserver
unhappy).

If we further instrument upload-pack to set GIT_TRACE_PACKET on the
server side, we can see the two requests:

packet:  upload-pack< want 379518c0c94e3b1a0710129d03d5560814a0ba6f multi_ack_detailed no-done side-band-64k thin-pack include-tag ofs-delta agent=git/2.9.0.37.gb3ad8ab.dirty
packet:  upload-pack< deepen 1
packet:  upload-pack< 0000
packet:  upload-pack> shallow 379518c0c94e3b1a0710129d03d5560814a0ba6f
packet:  upload-pack> 0000

packet:  upload-pack< want 379518c0c94e3b1a0710129d03d5560814a0ba6f multi_ack_detailed no-done side-band-64k thin-pack include-tag ofs-delta agent=git/2.9.0.37.gb3ad8ab.dirty
packet:  upload-pack< deepen 1
packet:  upload-pack< 0000
packet:  upload-pack> shallow 379518c0c94e3b1a0710129d03d5560814a0ba6f
packet:  upload-pack> 0000
packet:  upload-pack< done
packet:  upload-pack> NAK
packet:  upload-pack> 0000

I think in the first one we would get "deepen 1" from the client in
receive_needs(), and similarly write out our "shallow" line. But then we
go into get_common_commits() and the client has hung up, which causes
the message. Whereas in the second line it gives us a "done", which
completes the negotiation.

So my not-very-educated thoughts are:

  1. The client should probably be sending an extra flush in the first
     request. Alternatively, in the stateless-rpc case we should just
     accept the lack of flush as an acceptable end to the request.

  2. Presumably the shallowness is what causes the double-request, as
     fetch-pack wants to see the shallow list before proceeding. But
     since it has no actual commits to negotiate, the negotiation is a
     noop. So probably this case could actually happen in a single
     request.

     I suspect that other fetches could not, though, so I'm not sure how
     much effort is worth putting into optimizing.

-Peff

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: http-backend fatal error message on shallow clone
  2016-06-21 12:10 ` Jeff King
@ 2016-06-21 16:41   ` Duy Nguyen
  0 siblings, 0 replies; 3+ messages in thread
From: Duy Nguyen @ 2016-06-21 16:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, Git Mailing List, Shawn O. Pearce

On Tue, Jun 21, 2016 at 2:10 PM, Jeff King <peff@peff.net> wrote:
> ...
> So this request actually takes _two_ upload-pack instances to serve
> (which is not uncommon when we need multiple rounds of
> get_common_commits(), though I am a little surprised that would be the
> case for a clone). And the first one seems to be missing a closing
> "0000" flush packet from the client to end it.
>
> If that analysis is correct, this isn't affecting operation in any way;
> it's just giving a useless message from upload-pack (and as you note,
> that could trigger http-backend to exit(1), which may make the webserver
> unhappy).
>
> If we further instrument upload-pack to set GIT_TRACE_PACKET on the
> server side, we can see the two requests:
>
> packet:  upload-pack< want 379518c0c94e3b1a0710129d03d5560814a0ba6f multi_ack_detailed no-done side-band-64k thin-pack include-tag ofs-delta agent=git/2.9.0.37.gb3ad8ab.dirty
> packet:  upload-pack< deepen 1
> packet:  upload-pack< 0000
> packet:  upload-pack> shallow 379518c0c94e3b1a0710129d03d5560814a0ba6f
> packet:  upload-pack> 0000
>
> packet:  upload-pack< want 379518c0c94e3b1a0710129d03d5560814a0ba6f multi_ack_detailed no-done side-band-64k thin-pack include-tag ofs-delta agent=git/2.9.0.37.gb3ad8ab.dirty
> packet:  upload-pack< deepen 1
> packet:  upload-pack< 0000
> packet:  upload-pack> shallow 379518c0c94e3b1a0710129d03d5560814a0ba6f
> packet:  upload-pack> 0000
> packet:  upload-pack< done
> packet:  upload-pack> NAK
> packet:  upload-pack> 0000
>
> I think in the first one we would get "deepen 1" from the client in
> receive_needs(), and similarly write out our "shallow" line. But then we
> go into get_common_commits() and the client has hung up, which causes
> the message. Whereas in the second line it gives us a "done", which
> completes the negotiation.
>
> So my not-very-educated thoughts are:
>
>   1. The client should probably be sending an extra flush in the first
>      request. Alternatively, in the stateless-rpc case we should just
>      accept the lack of flush as an acceptable end to the request.

Our pkt-line.c can't deal with eof if I remember correctly (I tried to
use pkt-line for the parallel checkout stuff, where workers can also
exit early...) so sending extra flush may be easier. Old upload-pack
will not have a problem with this extra flush right? I haven't checked
upload-pack.c yet...

>   2. Presumably the shallowness is what causes the double-request, as
>      fetch-pack wants to see the shallow list before proceeding. But
>      since it has no actual commits to negotiate, the negotiation is a
>      noop. So probably this case could actually happen in a single
>      request.

I seem to recall our discussion a few months(?) ago about quickfetch()
where shallow clone would force another fetch initiated from
backfill_tags(). I guess that's why you see two fetches.

>
>      I suspect that other fetches could not, though, so I'm not sure how
>      much effort is worth putting into optimizing.
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Duy

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-06-21 16:41 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-21 11:23 http-backend fatal error message on shallow clone Eric Wong
2016-06-21 12:10 ` Jeff King
2016-06-21 16:41   ` Duy Nguyen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).