* [RFC] allowing hooks to ignore input?
@ 2014-09-12 22:48 Junio C Hamano
2014-09-13 8:19 ` Johannes Sixt
0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2014-09-12 22:48 UTC (permalink / raw)
To: git
The pre-receive and post-receive hooks were designed to be an
improvement over old style update and post-update hooks that used to
take the update information on the command line and were limited by
the command line length limit. They take the same information from
their standard input. It has been mandatory for these new style
hooks to consume the update information fully from the standard
input stream. Otherwise, they would risk killing the receive-pack
process via SIGPIPE.
This is easy, and it has already been done by existing hooks that
are written correctly, to work around, if a hook does not want to
look at all the information, by sending its standard input to
/dev/null (perhaps a niche use of hook might need to know only the
fact that a push was made, without having to know what objects have
been pushed to update which refs).
However, because there is no good way to consistently fail hooks
that do not consume the input fully, it can lead to a hard to
diagnose "once in a blue moon" phantom failure.
I wonder if this "you must consume the input fully", which is a
mandate that is not enforced strictly, is not helping us to catch
mistakes in hooks more than it is hurting us. Perhaps we can do
something like the attached patch to make it optional for them to
read the input we feed?
I dunno.
builtin/receive-pack.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 30560a7..9d9d16d 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -17,6 +17,7 @@
#include "version.h"
#include "tag.h"
#include "gpg-interface.h"
+#include "sigchain.h"
static const char receive_pack_usage[] = "git receive-pack <git-dir>";
@@ -500,6 +501,8 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
return code;
}
+ sigchain_push(SIGPIPE, SIG_IGN);
+
while (1) {
const char *buf;
size_t n;
@@ -511,6 +514,9 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
close(proc.in);
if (use_sideband)
finish_async(&muxer);
+
+ sigchain_pop(SIGPIPE);
+
return finish_command(&proc);
}
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC] allowing hooks to ignore input?
2014-09-12 22:48 [RFC] allowing hooks to ignore input? Junio C Hamano
@ 2014-09-13 8:19 ` Johannes Sixt
2014-09-15 17:51 ` Junio C Hamano
2014-09-16 22:27 ` Junio C Hamano
0 siblings, 2 replies; 5+ messages in thread
From: Johannes Sixt @ 2014-09-13 8:19 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Am 13.09.2014 um 00:48 schrieb Junio C Hamano:
> The pre-receive and post-receive hooks were designed to be an
> improvement over old style update and post-update hooks that used to
> take the update information on the command line and were limited by
> the command line length limit. They take the same information from
> their standard input. It has been mandatory for these new style
> hooks to consume the update information fully from the standard
> input stream. Otherwise, they would risk killing the receive-pack
> process via SIGPIPE.
>
> This is easy, and it has already been done by existing hooks that
> are written correctly, to work around, if a hook does not want to
> look at all the information, by sending its standard input to
> /dev/null (perhaps a niche use of hook might need to know only the
> fact that a push was made, without having to know what objects have
> been pushed to update which refs).
>
> However, because there is no good way to consistently fail hooks
> that do not consume the input fully, it can lead to a hard to
> diagnose "once in a blue moon" phantom failure.
>
> I wonder if this "you must consume the input fully", which is a
> mandate that is not enforced strictly, is not helping us to catch
> mistakes in hooks more than it is hurting us. Perhaps we can do
> something like the attached patch to make it optional for them to
> read the input we feed?
>
> I dunno.
>
> builtin/receive-pack.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 30560a7..9d9d16d 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -17,6 +17,7 @@
> #include "version.h"
> #include "tag.h"
> #include "gpg-interface.h"
> +#include "sigchain.h"
>
> static const char receive_pack_usage[] = "git receive-pack <git-dir>";
>
> @@ -500,6 +501,8 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
> return code;
> }
>
> + sigchain_push(SIGPIPE, SIG_IGN);
> +
> while (1) {
> const char *buf;
> size_t n;
> @@ -511,6 +514,9 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
> close(proc.in);
> if (use_sideband)
> finish_async(&muxer);
> +
> + sigchain_pop(SIGPIPE);
> +
> return finish_command(&proc);
> }
I think this is a good move. Hooks are written by users, who sometimes
are not clueful enough.
But what do our writers do when they fail with EPIPE? Die? If so, this
patch alone is not sufficient.
-- Hannes
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] allowing hooks to ignore input?
2014-09-13 8:19 ` Johannes Sixt
@ 2014-09-15 17:51 ` Junio C Hamano
2014-09-16 22:27 ` Junio C Hamano
1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2014-09-15 17:51 UTC (permalink / raw)
To: Johannes Sixt; +Cc: git
Johannes Sixt <j6t@kdbg.org> writes:
> I think this is a good move. Hooks are written by users, who sometimes
> are not clueful enough.
>
> But what do our writers do when they fail with EPIPE? Die? If so, this
> patch alone is not sufficient.
I think it is in a loop
while (...) {
if (write_in_full(...))
break;
}
...
return finish_command(&proc);
and the caller only cares about the tatus of finish_command(),
i.e. the exit status of the downstream of the pipe. In an EPIPE
situation, they have already exited with the status they want to
give to us without listening to what we wanted to tell them, so I
think there is no issue.
We _might_ want to check, at that point of "break", if we got a
write error other than EPIPE and act somewhat differently, but that
would be a different issue the current code already has. E.g. if we
broke out of the loop due to an EIO or EPERM, it will not be affect
what the later code would do in the current code.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] allowing hooks to ignore input?
2014-09-13 8:19 ` Johannes Sixt
2014-09-15 17:51 ` Junio C Hamano
@ 2014-09-16 22:27 ` Junio C Hamano
2014-09-18 11:14 ` Jeff King
1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2014-09-16 22:27 UTC (permalink / raw)
To: git; +Cc: Johannes Sixt
Johannes Sixt <j6t@kdbg.org> writes:
> I think this is a good move. Hooks are written by users, who sometimes
> are not clueful enough.
Thanks for a sanity check. I do not think it is about cluefulness
in this particular case. A rule that is not meaningfully enforced
by reliably failing offenders is a rule that is hard to follow if a
clueful user wanted to.
This round comes with a test, but depending on the size of your pipe
buffer and context switching race, an unpatched Git may pass it
(reducing the test_seq number down to say 199 would certainly make
it pass without the fix most of the time).
-- >8 --
Subject: [PATCH] receive-pack: allow hooks to ignore its standard input stream
The pre-receive and post-receive hooks were designed to be an
improvement over old style update and post-update hooks, which take
the update information on their command line and are limited by the
command line length limit. The same information is fed from the
standard input to pre/post-receive hooks instead to lift this
limitation. It has been mandatory for these new style hooks to
consume the update information fully from the standard input stream.
Otherwise, they would risk killing the receive-pack process via
SIGPIPE.
If a hook does not want to look at all the information, it is easy
to send its standard input to /dev/null (perhaps a niche use of hook
might need to know only the fact that a push was made, without
having to know what objects have been pushed to update which refs),
and this has already been done by existing hooks that are written
carefully.
However, because there is no good way to consistently fail hooks
that do not consume the input fully (a small push may result in a
short update record that may fit within the pipe buffer, to which
the receive-pack process may manage to write before the hook has a
chance to exit without reading anything, which will not result in a
death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose
"once in a blue moon" phantom failure.
Lift this "hooks must consume their input fully" mandate. A mandate
that is not enforced strictly is not helping us to catch mistakes in
hooks. If a hook has a good reason to decide the outcome of its
operation without reading the information we feed it, let it do so
as it pleases.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin/receive-pack.c | 6 ++++++
t/t5401-update-hooks.sh | 13 +++++++++++++
2 files changed, 19 insertions(+)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f93ac45..516386f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -15,6 +15,7 @@
#include "connected.h"
#include "argv-array.h"
#include "version.h"
+#include "sigchain.h"
static const char receive_pack_usage[] = "git receive-pack <git-dir>";
@@ -288,6 +289,8 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
return code;
}
+ sigchain_push(SIGPIPE, SIG_IGN);
+
while (1) {
const char *buf;
size_t n;
@@ -299,6 +302,9 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
close(proc.in);
if (use_sideband)
finish_async(&muxer);
+
+ sigchain_pop(SIGPIPE);
+
return finish_command(&proc);
}
diff --git a/t/t5401-update-hooks.sh b/t/t5401-update-hooks.sh
index 17bcb0b..7f278d8 100755
--- a/t/t5401-update-hooks.sh
+++ b/t/t5401-update-hooks.sh
@@ -135,4 +135,17 @@ test_expect_success 'send-pack stderr contains hook messages' '
test_cmp expect actual
'
+test_expect_success 'pre-receive hook that forgets to read its input' '
+ write_script victim.git/hooks/pre-receive <<-\EOF &&
+ exit 0
+ EOF
+ rm -f victim.git/hooks/update victim.git/hooks/post-update &&
+
+ for v in $(test_seq 100 999)
+ do
+ git branch branch_$v master || return
+ done &&
+ git push ./victim.git "+refs/heads/*:refs/heads/*"
+'
+
test_done
--
2.1.0-403-g099cf47
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC] allowing hooks to ignore input?
2014-09-16 22:27 ` Junio C Hamano
@ 2014-09-18 11:14 ` Jeff King
0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2014-09-18 11:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Sixt
On Tue, Sep 16, 2014 at 03:27:12PM -0700, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
>
> > I think this is a good move. Hooks are written by users, who sometimes
> > are not clueful enough.
>
> Thanks for a sanity check. I do not think it is about cluefulness
> in this particular case. A rule that is not meaningfully enforced
> by reliably failing offenders is a rule that is hard to follow if a
> clueful user wanted to.
This looks good to me. Here's a real-world example of somebody getting
bitten by this (I _thought_ we had dealt with it then, but apparently
not):
http://article.gmane.org/gmane.comp.version-control.git/186287
> This round comes with a test, but depending on the size of your pipe
> buffer and context switching race, an unpatched Git may pass it
> (reducing the test_seq number down to say 199 would certainly make
> it pass without the fix most of the time).
I took a look at your test. Having worked on racy sigpipe tests
recently, I think the only way to guarantee failure is to make sure you
fill up the pipe buffer. My experiments showed this was typically 64K on
a variety of Linux systems, but I think can be bumped up at runtime.
When working on the sanitized_signals() test, we decided to just pick an
arbitrarily high number like 1MB and use that (ideally you would send
infinite data until you get SIGPIPE, but the failure mode for your test
is then that it doesn't finish :-/).
You have things much harder and much easier here. Harder, in that you
can only convince git to send ~100 bytes per ref, so you would need a
lot of refs (and thus it's expensive to over-compensate). But easier, in
that you do not need to _reliably_ catch SIGPIPE. You only need to catch
it often enough that somebody notices if the rest is broken, not catch
it every time to ensure success.
So I think it is OK as-is. You should be generating ~91K of ref data
(refname + two sha1s + spaces and newline), and I can't imagine many
systems have a pipe buffer bigger than 64K. If they do, the only
downside is that those systems might only intermittently catch a
regression.
-Peff
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-09-18 11:14 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-12 22:48 [RFC] allowing hooks to ignore input? Junio C Hamano
2014-09-13 8:19 ` Johannes Sixt
2014-09-15 17:51 ` Junio C Hamano
2014-09-16 22:27 ` Junio C Hamano
2014-09-18 11:14 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).