git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid().
@ 2007-11-04 19:46 Johannes Sixt
  2007-11-05  1:42 ` Michael J. Cohen
  2007-11-05 20:05 ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Johannes Sixt @ 2007-11-04 19:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

upload-pack spawns two processes, rev-list and pack-objects, and carefully
monitors their status so that it can report failure to the remote end.
This change removes the complicated procedures on the grounds of the
following observations:

- If everything is OK, rev-list closes its output pipe end, upon which
  pack-objects (which reads from the pipe) sees EOF and terminates itself,
  closing its output (and error) pipes. upload-pack reads from both until
  it sees EOF in both. It collects the exit codes of the child processes
  (which indicate success) and terminates successfully.

- If rev-list sees an error, it closes its output and terminates with
  failure. pack-objects sees EOF in its input and terminates successfully.
  Again upload-pack reads its inputs until EOF. When it now collects
  the exit codes of its child processes, it notices the failure of rev-list
  and signals failure to the remote end.

- If pack-objects sees an error, it terminates with failure. Since this
  breaks the pipe to rev-list, rev-list is killed with SIGPIPE.
  upload-pack reads its input until EOF, then collects the exit codes of
  the child processes, notices their failures, and signals failure to the
  remote end.

- If upload-pack itself dies unexpectedly, pack-objects is killed with
  SIGPIPE, and subsequently also rev-list.

The upshot of this is that precise monitoring of child processes is not
required because both terminate if either one of them dies unexpectedly.
This allows us to use finish_command() and finish_async() instead of
an explicit waitpid(2) call.

The change is smaller than it looks because most of it only reduces the
indentation of a large part of the inner loop.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
---
	This patch allows us to reduce the differences to the
	MinGW port even further. It goes on top of js/forkexec
	(which meanwhile is in master).

	The test case checks for failures in rev-list (a missing
	object). Any hints how to trigger a failure in pack-objects
	that does not also trigger in rev-list would be welcome.

	BTW, I don't know what it means to process zombies if the
	parent does not waitpid(), but just terminates. Does this
	work as expected, ie. no zombies are left behind?

	-- Hannes

 t/t5530-upload-pack-error.sh |   49 +++++++++++
 upload-pack.c                |  192 +++++++++++++++++-------------------------
 2 files changed, 126 insertions(+), 115 deletions(-)
 create mode 100755 t/t5530-upload-pack-error.sh

diff --git a/t/t5530-upload-pack-error.sh b/t/t5530-upload-pack-error.sh
new file mode 100755
index 0000000..9923ba0
--- /dev/null
+++ b/t/t5530-upload-pack-error.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+
+test_description='errors in upload-pack'
+
+. ./test-lib.sh
+
+D=`pwd`
+
+test_expect_success 'setup and corrupt repository' '
+
+	echo file >file &&
+	git add file &&
+	git rev-parse :file &&
+	git commit -a -m original &&
+	test_tick &&
+	echo changed >file &&
+	git commit -a -m changed &&
+	rm -f .git/objects/f7/3f3093ff865c514c6c51f867e35f693487d0d3
+
+'
+
+test_expect_failure 'fsck fails' '
+
+	git fsck
+'
+
+test_expect_failure 'upload pack fails due to error in rev-list' '
+
+	echo "0032want $(git rev-parse HEAD)
+00000009done
+0000" | git-upload-pack . > /dev/null
+
+'
+
+test_expect_success 'create empty repository' '
+
+	mkdir foo &&
+	cd foo &&
+	git init
+
+'
+
+test_expect_failure 'fetch fails' '
+
+	git fetch .. master
+
+'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c
index 6799468..7e04311 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -144,6 +144,7 @@ static void create_pack_file(void)
 	char abort_msg[] = "aborting due to possible repository "
 		"corruption on the remote side.";
 	int buffered = -1;
+	ssize_t sz;
 	const char *argv[10];
 	int arg = 0;
 
@@ -168,22 +169,15 @@ static void create_pack_file(void)
 	pack_objects.git_cmd = 1;
 	pack_objects.argv = argv;
 
-	if (start_command(&pack_objects)) {
-		/* daemon sets things up to ignore TERM */
-		kill(rev_list.pid, SIGKILL);
+	if (start_command(&pack_objects))
 		die("git-upload-pack: unable to fork git-pack-objects");
-	}
 
 	/* We read from pack_objects.err to capture stderr output for
 	 * progress bar, and pack_objects.out to capture the pack data.
 	 */
 
 	while (1) {
-		const char *who;
 		struct pollfd pfd[2];
-		pid_t pid;
-		int status;
-		ssize_t sz;
 		int pe, pu, pollsize;
 
 		reset_timeout();
@@ -204,123 +198,91 @@ static void create_pack_file(void)
 			pollsize++;
 		}
 
-		if (pollsize) {
-			if (poll(pfd, pollsize, -1) < 0) {
-				if (errno != EINTR) {
-					error("poll failed, resuming: %s",
-					      strerror(errno));
-					sleep(1);
-				}
-				continue;
-			}
-			if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) {
-				/* Data ready; we keep the last byte
-				 * to ourselves in case we detect
-				 * broken rev-list, so that we can
-				 * leave the stream corrupted.  This
-				 * is unfortunate -- unpack-objects
-				 * would happily accept a valid pack
-				 * data with trailing garbage, so
-				 * appending garbage after we pass all
-				 * the pack data is not good enough to
-				 * signal breakage to downstream.
-				 */
-				char *cp = data;
-				ssize_t outsz = 0;
-				if (0 <= buffered) {
-					*cp++ = buffered;
-					outsz++;
-				}
-				sz = xread(pack_objects.out, cp,
-					  sizeof(data) - outsz);
-				if (0 < sz)
-						;
-				else if (sz == 0) {
-					close(pack_objects.out);
-					pack_objects.out = -1;
-				}
-				else
-					goto fail;
-				sz += outsz;
-				if (1 < sz) {
-					buffered = data[sz-1] & 0xFF;
-					sz--;
-				}
-				else
-					buffered = -1;
-				sz = send_client_data(1, data, sz);
-				if (sz < 0)
-					goto fail;
-			}
-			if (0 <= pe && (pfd[pe].revents & (POLLIN|POLLHUP))) {
-				/* Status ready; we ship that in the side-band
-				 * or dump to the standard error.
-				 */
-				sz = xread(pack_objects.err, progress,
-					  sizeof(progress));
-				if (0 < sz)
-					send_client_data(2, progress, sz);
-				else if (sz == 0) {
-					close(pack_objects.err);
-					pack_objects.err = -1;
-				}
-				else
-					goto fail;
+		if (!pollsize)
+			break;
+
+		if (poll(pfd, pollsize, -1) < 0) {
+			if (errno != EINTR) {
+				error("poll failed, resuming: %s",
+				      strerror(errno));
+				sleep(1);
 			}
+			continue;
 		}
-
-		/* See if the children are still there */
-		if (rev_list.pid || pack_objects.pid) {
-			pid = waitpid(-1, &status, WNOHANG);
-			if (!pid)
-				continue;
-			who = ((pid == rev_list.pid) ? "git-rev-list" :
-			       (pid == pack_objects.pid) ? "git-pack-objects" :
-			       NULL);
-			if (!who) {
-				if (pid < 0) {
-					error("git-upload-pack: %s",
-					      strerror(errno));
-					goto fail;
-				}
-				error("git-upload-pack: we weren't "
-				      "waiting for %d", pid);
-				continue;
+		if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) {
+			/* Data ready; we keep the last byte to ourselves
+			 * in case we detect broken rev-list, so that we
+			 * can leave the stream corrupted.  This is
+			 * unfortunate -- unpack-objects would happily
+			 * accept a valid packdata with trailing garbage,
+			 * so appending garbage after we pass all the
+			 * pack data is not good enough to signal
+			 * breakage to downstream.
+			 */
+			char *cp = data;
+			ssize_t outsz = 0;
+			if (0 <= buffered) {
+				*cp++ = buffered;
+				outsz++;
+			}
+			sz = xread(pack_objects.out, cp,
+				  sizeof(data) - outsz);
+			if (0 < sz)
+					;
+			else if (sz == 0) {
+				close(pack_objects.out);
+				pack_objects.out = -1;
 			}
-			if (!WIFEXITED(status) || WEXITSTATUS(status) > 0) {
-				error("git-upload-pack: %s died with error.",
-				      who);
+			else
 				goto fail;
+			sz += outsz;
+			if (1 < sz) {
+				buffered = data[sz-1] & 0xFF;
+				sz--;
 			}
-			if (pid == rev_list.pid)
-				rev_list.pid = 0;
-			if (pid == pack_objects.pid)
-				pack_objects.pid = 0;
-			if (rev_list.pid || pack_objects.pid)
-				continue;
-		}
-
-		/* both died happily */
-		if (pollsize)
-			continue;
-
-		/* flush the data */
-		if (0 <= buffered) {
-			data[0] = buffered;
-			sz = send_client_data(1, data, 1);
+			else
+				buffered = -1;
+			sz = send_client_data(1, data, sz);
 			if (sz < 0)
 				goto fail;
-			fprintf(stderr, "flushed.\n");
 		}
-		if (use_sideband)
-			packet_flush(1);
-		return;
+		if (0 <= pe && (pfd[pe].revents & (POLLIN|POLLHUP))) {
+			/* Status ready; we ship that in the side-band
+			 * or dump to the standard error.
+			 */
+			sz = xread(pack_objects.err, progress,
+				  sizeof(progress));
+			if (0 < sz)
+				send_client_data(2, progress, sz);
+			else if (sz == 0) {
+				close(pack_objects.err);
+				pack_objects.err = -1;
+			}
+			else
+				goto fail;
+		}
+	}
+
+	if (finish_command(&pack_objects)) {
+		error("git-upload-pack: git-pack-objects died with error.");
+		goto fail;
+	}
+	if (finish_async(&rev_list))
+		goto fail;	/* error was already reported */
+
+	/* flush the data */
+	if (0 <= buffered) {
+		data[0] = buffered;
+		sz = send_client_data(1, data, 1);
+		if (sz < 0)
+			goto fail;
+		fprintf(stderr, "flushed.\n");
 	}
+	if (use_sideband)
+		packet_flush(1);
+	return;
+
  fail:
-	if (pack_objects.pid)
-		kill(pack_objects.pid, SIGKILL);
-	if (rev_list.pid)
-		kill(rev_list.pid, SIGKILL);
 	send_client_data(3, abort_msg, sizeof(abort_msg));
 	die("git-upload-pack: %s", abort_msg);
 }
-- 
1.5.3.4.315.g2ce38

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid().
  2007-11-04 19:46 [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid() Johannes Sixt
@ 2007-11-05  1:42 ` Michael J. Cohen
  2007-11-05 20:05 ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Michael J. Cohen @ 2007-11-05  1:42 UTC (permalink / raw)
  To: Git Mailing List

On Nov 4, 2007, at 2:46 PM, Johannes Sixt wrote:

> The change is smaller than it looks because most of it only reduces  
> the
> indentation of a large part of the inner loop.


mac-pro:git mjc$ git diff -u -b HEAD | diffstat
t/t5530-upload-pack-error.sh |   49 ++++++++++++++++++++++++++++
upload-pack.c                |   74 +++++++++ 
+---------------------------------
2 files changed, 67 insertions(+), 56 deletions(-)

agreed. :P

-mjc

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid().
  2007-11-04 19:46 [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid() Johannes Sixt
  2007-11-05  1:42 ` Michael J. Cohen
@ 2007-11-05 20:05 ` Junio C Hamano
  2007-11-05 21:40   ` [PATCH] t5530-upload-pack-error: Check more carefully for failures Johannes Sixt
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2007-11-05 20:05 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Johannes Sixt <johannes.sixt@telecom.at> writes:

> - If pack-objects sees an error, it terminates with failure. Since this
>   breaks the pipe to rev-list, rev-list is killed with SIGPIPE.

I was a bit uneasy about this part.  We do not explicitly tell
rev-list not to ignore SIGPIPE, so it could spin fruitlessly
listing revs and calling write(2).  But the error from that
write should already be handled correctly anyway, so this should
be Ok.

> 	The test case checks for failures in rev-list (a missing
> 	object). Any hints how to trigger a failure in pack-objects
> 	that does not also trigger in rev-list would be welcome.

How about removing a blob from the test repository  to corrupt
it?  rev-list --objects I think would happily list the blob
because it sees its name in its containing tree without checking
its existence.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] t5530-upload-pack-error: Check more carefully for failures.
  2007-11-05 20:05 ` Junio C Hamano
@ 2007-11-05 21:40   ` Johannes Sixt
  2007-11-06  0:22     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Sixt @ 2007-11-05 21:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Previously, the test only triggered a failure in upload-pack's rev-list
subprocess. Here we also test for a failure of pack-objects, and make sure
that the right process failed.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
---
  On Monday 05 November 2007 21:05, Junio C Hamano wrote:
  > > 	The test case checks for failures in rev-list (a missing
  > > 	object). Any hints how to trigger a failure in pack-objects
  > > 	that does not also trigger in rev-list would be welcome.
  >
  > How about removing a blob from the test repository  to corrupt
  > it?  rev-list --objects I think would happily list the blob
  > because it sees its name in its containing tree without checking
  > its existence.

  That does it. This goes on top of my previous patch.

  -- Hannes

 t/t5530-upload-pack-error.sh |   27 +++++++++++++++++++++++----
 1 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/t/t5530-upload-pack-error.sh b/t/t5530-upload-pack-error.sh
index 9923ba0..70d4f86 100755
--- a/t/t5530-upload-pack-error.sh
+++ b/t/t5530-upload-pack-error.sh
@@ -15,7 +15,7 @@ test_expect_success 'setup and corrupt repository' '
 	test_tick &&
 	echo changed >file &&
 	git commit -a -m changed &&
-	rm -f .git/objects/f7/3f3093ff865c514c6c51f867e35f693487d0d3
+	rm -f .git/objects/5e/a2ed416fbd4a4cbe227b75fe255dd7fa6bd4d6
 
 '
 
@@ -24,14 +24,33 @@ test_expect_failure 'fsck fails' '
 	git fsck
 '
 
-test_expect_failure 'upload pack fails due to error in rev-list' '
+test_expect_success 'upload-pack fails due to error in pack-objects' '
 
-	echo "0032want $(git rev-parse HEAD)
+	! echo "0032want $(git rev-parse HEAD)
 00000009done
-0000" | git-upload-pack . > /dev/null
+0000" | git-upload-pack . > /dev/null 2> output.err &&
+	grep "pack-objects died" output.err
+'
+
+test_expect_success 'corrupt repo differently' '
+
+	git hash-object -w file &&
+	rm -f .git/objects/be/c63e37d08c454ad3a60cde90b70f3f7d077852
 
 '
 
+test_expect_failure 'fsck fails' '
+
+	git fsck
+'
+test_expect_success 'upload-pack fails due to error in rev-list' '
+
+	! echo "0032want $(git rev-parse HEAD)
+00000009done
+0000" | git-upload-pack . > /dev/null 2> output.err &&
+	grep "waitpid (async) failed" output.err
+'
+
 test_expect_success 'create empty repository' '
 
 	mkdir foo &&
-- 
1.5.3.4.315.g2ce38

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] t5530-upload-pack-error: Check more carefully for failures.
  2007-11-05 21:40   ` [PATCH] t5530-upload-pack-error: Check more carefully for failures Johannes Sixt
@ 2007-11-06  0:22     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2007-11-06  0:22 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Johannes Sixt <johannes.sixt@telecom.at> writes:

>   On Monday 05 November 2007 21:05, Junio C Hamano wrote:
>   > > 	The test case checks for failures in rev-list (a missing
>   > > 	object). Any hints how to trigger a failure in pack-objects
>   > > 	that does not also trigger in rev-list would be welcome.
>   >
>   > How about removing a blob from the test repository  to corrupt
>   > it?  rev-list --objects I think would happily list the blob
>   > because it sees its name in its containing tree without checking
>   > its existence.
>
>   That does it. This goes on top of my previous patch.

Thanks.  Will squash with further changes attached for readability.

---
 t/t5530-upload-pack-error.sh |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/t/t5530-upload-pack-error.sh b/t/t5530-upload-pack-error.sh
index 70d4f86..cc8949e 100755
--- a/t/t5530-upload-pack-error.sh
+++ b/t/t5530-upload-pack-error.sh
@@ -6,6 +6,13 @@ test_description='errors in upload-pack'
 
 D=`pwd`
 
+corrupt_repo () {
+	object_sha1=$(git rev-parse "$1") &&
+	ob=$(expr "$object_sha1" : "\(..\)") &&
+	ject=$(expr "$object_sha1" : "..\(..*\)") &&
+	rm -f ".git/objects/$ob/$ject"
+}
+
 test_expect_success 'setup and corrupt repository' '
 
 	echo file >file &&
@@ -15,7 +22,7 @@ test_expect_success 'setup and corrupt repository' '
 	test_tick &&
 	echo changed >file &&
 	git commit -a -m changed &&
-	rm -f .git/objects/5e/a2ed416fbd4a4cbe227b75fe255dd7fa6bd4d6
+	corrupt_repo HEAD:file
 
 '
 
@@ -35,7 +42,7 @@ test_expect_success 'upload-pack fails due to error in pack-objects' '
 test_expect_success 'corrupt repo differently' '
 
 	git hash-object -w file &&
-	rm -f .git/objects/be/c63e37d08c454ad3a60cde90b70f3f7d077852
+	corrupt_repo HEAD^^{tree}
 
 '
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-11-06  0:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-04 19:46 [PATCH] upload-pack: Use finish_{command,async}() instead of waitpid() Johannes Sixt
2007-11-05  1:42 ` Michael J. Cohen
2007-11-05 20:05 ` Junio C Hamano
2007-11-05 21:40   ` [PATCH] t5530-upload-pack-error: Check more carefully for failures Johannes Sixt
2007-11-06  0:22     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).