* status hangs trying to get submodule summary @ 2015-03-22 5:56 Wincent Colaiuta 2015-03-22 7:44 ` [PATCH] status: read submodule process output before calling wait() Jeff King 0 siblings, 1 reply; 26+ messages in thread From: Wincent Colaiuta @ 2015-03-22 5:56 UTC (permalink / raw) To: git Hi, I just ran into some odd behavior trying to update a submodule in one of my projects where it would hang indefinitely trying to run either `git status` or `git commit`. Here's the minimal repro recipe: mkdir demo cd demo git init git submodule add git://github.com/ansible/ansible.git ansible (cd ansible && git checkout v1.6.10) git add ansible git commit -m "initial commit" (cd ansible && git checkout v1.8.4) git config status.submodulesummary true git status # hangs... git commit # hangs... At the time `git status` is hanging, these are the Git processes running on the system: 12431 git status 12462 git submodule summary --files --for-status --summary-limit -1 12463 /bin/sh /usr/libexec/git-core/git-submodule summary --files --for-status --summary-limit -1 12507 /bin/sh /usr/libexec/git-core/git-submodule summary --files --for-status --summary-limit -1 12522 git log --pretty=format: %m %s --first-parent 8959338284f6c6e44890b8911434285848f34859...ebc8d48d34296fe010096f044e2b7591df37a622 And `strace` shows the processes `wait4`-ing, except for the `git log` one which is doing a `write` but apparently blocked: # strace -p 12431 Process 12431 attached wait4(12462, ^CProcess 12431 detached <detached ...> # strace -p 12462 Process 12462 attached wait4(12463, ^CProcess 12462 detached <detached ...> # strace -p 12463 Process 12463 attached wait4(-1, ^CProcess 12463 detached <detached ...> # strace -p 12507 Process 12507 attached wait4(-1, ^CProcess 12507 detached <detached ...> # strace -p 12522 Process 12522 attached write(1, "\n > Merge pull request #7814 fr"..., 108^CProcess 12522 detached <detached ...> This repros for me on Mac OS X 10.10.2 with Git 1.9.5 and Git 2.3.3, and on the Amazon LInux (a RHEL-like OS) with Git 2.1.0. Both of these with an empty .gitconfig (other than user.email, user.name and the status.submodulesummary value already mentioned above). I've never seen this hang before despite frequent use of submodules. Oddly, I was able to work around the hang by moving the submodule in two hops (one from Ansible v1.6.10 to v1.7.0, then from v1.7.0 to v1.8.4). I am not sure if this is specific to the Ansible repo, or whether the length of the summary is crossing some threshold that triggers the bug to manifest. If I run the forked commands manually from an interactive shell, they complete just fine. -Greg ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH] status: read submodule process output before calling wait() 2015-03-22 5:56 status hangs trying to get submodule summary Wincent Colaiuta @ 2015-03-22 7:44 ` Jeff King 2015-03-22 8:07 ` Jeff King 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King 0 siblings, 2 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 7:44 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git On Sat, Mar 21, 2015 at 10:56:54PM -0700, Wincent Colaiuta wrote: > I've never seen this hang before despite frequent use of submodules. > Oddly, I was able to work around the hang by moving the submodule in > two hops (one from Ansible v1.6.10 to v1.7.0, then from v1.7.0 to > v1.8.4). I am not sure if this is specific to the Ansible repo, or > whether the length of the summary is crossing some threshold that > triggers the bug to manifest. If I run the forked commands manually > from an interactive shell, they complete just fine. It's the length of the summary. The fix is below. -- >8 -- The status code tries to read the output of "git submodule summary" over a pipe by waiting for the program to finish and then reading its output, like this: run_command(&sm_summary); len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); Besides being a violation of the run-command API (which makes no promises about the state of the struct after run_command returns), this can easily lead to deadlock. The "submodule status" process may fill up the pipe buffer and block on write(). Meanwhile, the reading side in the parent process is blocked in wait(), waiting for the child to finish. Instead, we should start the process, read everything it produces, and only then call wait() to finish it off. Signed-off-by: Jeff King <peff@peff.net> --- I notice that we also don't detect when the sub-command fails. I don't know what we would do in that case (abort the status? print a message?) and it's orthogonal to this issue, so I left it for somebody more clueful in the area to think about. wt-status.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/wt-status.c b/wt-status.c index 7036fa2..96f0033 100644 --- a/wt-status.c +++ b/wt-status.c @@ -748,9 +748,9 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt fflush(s->fp); sm_summary.out = -1; - run_command(&sm_summary); - + start_command(&sm_summary); len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); + finish_command(&sm_summary); /* prepend header, only if there's an actual output */ if (len) { -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH] status: read submodule process output before calling wait() 2015-03-22 7:44 ` [PATCH] status: read submodule process output before calling wait() Jeff King @ 2015-03-22 8:07 ` Jeff King 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King 1 sibling, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 8:07 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git On Sun, Mar 22, 2015 at 03:44:55AM -0400, Jeff King wrote: > The status code tries to read the output of "git submodule > summary" over a pipe by waiting for the program to finish > and then reading its output, like this: > > run_command(&sm_summary); > len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); By the way, I spotted this code as bogus immediately upon seeing it (though certainly it helped to know there was a deadlock in the area, which had me thinking about such things). So I wondered if it could have been easy to catch in review, but its introduction was a little bit subtle. The original run_command invocation came in ac8d5af (builtin-status: submodule summary support, 2008-04-12), which just let the sub-process dump its stdout to the same descriptor that the rest of the status output was going to. So the use of run_command there was fine. It was later, in 3ba7407 (submodule summary: ignore --for-status option, 2013-09-06), that we started post-processing the output and it became buggy. But that's harder to see in review. > Besides being a violation of the run-command API (which > makes no promises about the state of the struct after > run_command returns), This may be overly harsh of me. Certainly we make no guarantees (and things like the dynamic "args" and "env_array" are cleaned up automatically after finish_command returns), but I would not be surprised if there are other spots that treat "struct child_process" as transparent rather than as a black box. It's really the run_command + pipe construct that is really the danger here. I wonder if we should do something like this: diff --git a/run-command.c b/run-command.c index 3afb124..78807de 100644 --- a/run-command.c +++ b/run-command.c @@ -557,7 +557,12 @@ int finish_command(struct child_process *cmd) int run_command(struct child_process *cmd) { - int code = start_command(cmd); + int code; + + if (cmd->out < 0) + die("BUG: run_command with a pipe can cause deadlock"); + + code = start_command(cmd); if (code) return code; return finish_command(cmd); It seems to catch at least one other dubious construct. -Peff ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks 2015-03-22 7:44 ` [PATCH] status: read submodule process output before calling wait() Jeff King 2015-03-22 8:07 ` Jeff King @ 2015-03-22 9:59 ` Jeff King 2015-03-22 10:00 ` [PATCH 1/7] wt-status: don't flush before running "submodule status" Jeff King ` (6 more replies) 1 sibling, 7 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 9:59 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git On Sun, Mar 22, 2015 at 03:44:55AM -0400, Jeff King wrote: > diff --git a/wt-status.c b/wt-status.c > index 7036fa2..96f0033 100644 > --- a/wt-status.c > +++ b/wt-status.c > @@ -748,9 +748,9 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt > fflush(s->fp); > sm_summary.out = -1; > > - run_command(&sm_summary); > - > + start_command(&sm_summary); > len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); > + finish_command(&sm_summary); This isn't quite right. In both the original and the new one, if start_command fails, we may call strbuf_read on whatever it happens to have left in sm_summary.out. Worse, in the new one, we would call finish_command on something that was never started. And in both old and new, we leak the sm_summary.out descriptor. Furthermore, I found two other potential deadlocks (and one more descriptor leak). I tried at first to fix them all up like I did with the above patch, but I found that I was introducing similar problems in each case. So instead, I pulled this pattern out into its own strbuf helper. So here's a replacement series. [1/7]: wt-status: don't flush before running "submodule status" [2/7]: wt_status: fix signedness mismatch in strbuf_read call [3/7]: strbuf: introduce strbuf_read_cmd helper [4/7]: wt-status: use strbuf_read_cmd [5/7]: submodule: use strbuf_read_cmd [6/7]: trailer: use strbuf_read_cmd [7/7]: run-command: forbid using run_command with piped output -Peff ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/7] wt-status: don't flush before running "submodule status" 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King @ 2015-03-22 10:00 ` Jeff King 2015-03-22 10:00 ` [PATCH 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King ` (5 subsequent siblings) 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:00 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git This is a holdover from the original implementation in ac8d5af (builtin-status: submodule summary support, 2008-04-12), which just had the sub-process output to our descriptor; we had to make sure we had flushed any data that we produced before it started writing. Since 3ba7407 (submodule summary: ignore --for-status option, 2013-09-06), however, we pipe the sub-process output back to ourselves. So there's no longer any need to flush (it does not hurt, but it may leave readers wondering why we do it). Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 1 - 1 file changed, 1 deletion(-) diff --git a/wt-status.c b/wt-status.c index 7036fa2..1712762 100644 --- a/wt-status.c +++ b/wt-status.c @@ -745,7 +745,6 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt sm_summary.git_cmd = 1; sm_summary.no_stdin = 1; - fflush(s->fp); sm_summary.out = -1; run_command(&sm_summary); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 2/7] wt_status: fix signedness mismatch in strbuf_read call 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King 2015-03-22 10:00 ` [PATCH 1/7] wt-status: don't flush before running "submodule status" Jeff King @ 2015-03-22 10:00 ` Jeff King 2015-03-22 10:07 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King ` (4 subsequent siblings) 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:00 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git We call strbuf_read(), and want to know whether we got any output. To do so, we assign the result to a size_t, and check whether it is non-zero. But strbuf_read returns a signed ssize_t. If it encounters an error, it will return -1, and we'll end up treating this the same as if we had gotten output. Instead, we can just check whether our buffer has anything in it (which is what we care about anyway, and is the same thing since we know the buffer was empty to begin with). Note that the "len" variable actually has two roles in this function. Now that we've eliminated the first, we can push the declaration closer to the point of use for the second one. Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/wt-status.c b/wt-status.c index 1712762..b47f6d9 100644 --- a/wt-status.c +++ b/wt-status.c @@ -729,7 +729,6 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt struct strbuf cmd_stdout = STRBUF_INIT; struct strbuf summary = STRBUF_INIT; char *summary_content; - size_t len; argv_array_pushf(&sm_summary.env_array, "GIT_INDEX_FILE=%s", s->index_file); @@ -749,10 +748,10 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt run_command(&sm_summary); - len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); + strbuf_read(&cmd_stdout, sm_summary.out, 1024); /* prepend header, only if there's an actual output */ - if (len) { + if (cmd_stdout.len) { if (uncommitted) strbuf_addstr(&summary, _("Submodules changed but not updated:")); else @@ -763,6 +762,7 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt strbuf_release(&cmd_stdout); if (s->display_comment_prefix) { + size_t len; summary_content = strbuf_detach(&summary, &len); strbuf_add_commented_lines(&summary, summary_content, len); free(summary_content); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King 2015-03-22 10:00 ` [PATCH 1/7] wt-status: don't flush before running "submodule status" Jeff King 2015-03-22 10:00 ` [PATCH 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King @ 2015-03-22 10:07 ` Jeff King 2015-03-22 19:36 ` Eric Sunshine 2015-03-22 23:22 ` Junio C Hamano 2015-03-22 10:08 ` [PATCH 4/7] wt-status: use strbuf_read_cmd Jeff King ` (3 subsequent siblings) 6 siblings, 2 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:07 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git Something as simple as reading the stdout from a command turns out to be rather hard to do right. Doing: if (!run_command(&cmd)) strbuf_read(&buf, cmd.out, 0); can result in deadlock if the child process produces a large amount of output. What happens is: 1. The parent spawns the child with its stdout connected to a pipe, of which the parent is the sole reader. 2. The parent calls wait(), blocking until the child exits. 3. The child writes to stdout. If it writes more data than the OS pipe buffer can hold, the write() call will block. This is a deadlock; the parent is waiting for the child to exit, and the child is waiting for the parent to call read(). So we should do instead: if (!start_command(&cmd)) { strbuf_read(&buf, cmd.out, 0); finish_command(&cmd); } But note that this leaks cmd.out (which must be closed). And there's no error handling for strbuf_read. We probably want to know whether the operation succeeded, but we must make sure to always run finish_command even if the read failed (or else we leave a zombie child process). Let's introduce a strbuf helper that can make this a bit simpler for callers to do right. Signed-off-by: Jeff King <peff@peff.net> --- This is really at the intersection of the strbuf and run-command APIs, so you could argue for it being part of either It is logically quite like the strbuf_read_file() function, so I put it there. strbuf.c | 17 +++++++++++++++++ strbuf.h | 10 ++++++++++ 2 files changed, 27 insertions(+) diff --git a/strbuf.c b/strbuf.c index 88cafd4..9d1d48f 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1,6 +1,7 @@ #include "cache.h" #include "refs.h" #include "utf8.h" +#include "run-command.h" int starts_with(const char *str, const char *prefix) { @@ -414,6 +415,22 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) return -1; } +int strbuf_read_cmd(struct strbuf *sb, struct child_process *cmd, size_t hint) +{ + cmd->out = -1; + if (start_command(cmd) < 0) + return -1; + + if (strbuf_read(sb, cmd->out, hint) < 0) { + close(cmd->out); + finish_command(cmd); /* throw away exit code */ + return -1; + } + + close(cmd->out); + return finish_command(cmd); +} + int strbuf_getcwd(struct strbuf *sb) { size_t oldalloc = sb->alloc; diff --git a/strbuf.h b/strbuf.h index 1883494..93a50da 100644 --- a/strbuf.h +++ b/strbuf.h @@ -1,6 +1,8 @@ #ifndef STRBUF_H #define STRBUF_H +struct child_process; + /** * strbuf's are meant to be used with all the usual C string and memory * APIs. Given that the length of the buffer is known, it's often better to @@ -373,6 +375,14 @@ extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint); extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint); /** + * Execute the given command, capturing its stdout in the given strbuf. + * Returns -1 if starting the command fails or reading fails, and otherwise + * returns the exit code of the command. The output collected in the + * buffer is kept even if the command returns a non-zero exit. + */ +int strbuf_read_cmd(struct strbuf *sb, struct child_process *cmd, size_t hint); + +/** * Read a line from a FILE *, overwriting the existing contents * of the strbuf. The second argument specifies the line * terminator character, typically `'\n'`. -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 10:07 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King @ 2015-03-22 19:36 ` Eric Sunshine 2015-03-22 22:54 ` Junio C Hamano 2015-03-22 23:34 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King 2015-03-22 23:22 ` Junio C Hamano 1 sibling, 2 replies; 26+ messages in thread From: Eric Sunshine @ 2015-03-22 19:36 UTC (permalink / raw) To: Jeff King; +Cc: Wincent Colaiuta, Git List On Sun, Mar 22, 2015 at 6:07 AM, Jeff King <peff@peff.net> wrote: > Something as simple as reading the stdout from a command > turns out to be rather hard to do right. Doing: > > if (!run_command(&cmd)) > strbuf_read(&buf, cmd.out, 0); > > can result in deadlock if the child process produces a large > amount of output. [...] > > Let's introduce a strbuf helper that can make this a bit > simpler for callers to do right. > > Signed-off-by: Jeff King <peff@peff.net> > --- > This is really at the intersection of the strbuf and > run-command APIs, so you could argue for it being part of > either It is logically quite like the strbuf_read_file() > function, so I put it there. It does feel like a layering violation. If moved to the run-command API, it could given one of the following names or something better: run_command_capture() capture_command() command_capture() run_command_with_output() capture_output() > diff --git a/strbuf.h b/strbuf.h > index 1883494..93a50da 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -1,6 +1,8 @@ > #ifndef STRBUF_H > #define STRBUF_H > > +struct child_process; > + > /** > * strbuf's are meant to be used with all the usual C string and memory > * APIs. Given that the length of the buffer is known, it's often better to > @@ -373,6 +375,14 @@ extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint); > extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint); > > /** > + * Execute the given command, capturing its stdout in the given strbuf. > + * Returns -1 if starting the command fails or reading fails, and otherwise > + * returns the exit code of the command. The output collected in the > + * buffer is kept even if the command returns a non-zero exit. > + */ > +int strbuf_read_cmd(struct strbuf *sb, struct child_process *cmd, size_t hint); > + > +/** > * Read a line from a FILE *, overwriting the existing contents > * of the strbuf. The second argument specifies the line > * terminator character, typically `'\n'`. > -- > 2.3.3.618.ga041503 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 19:36 ` Eric Sunshine @ 2015-03-22 22:54 ` Junio C Hamano 2015-03-22 23:40 ` Junio C Hamano 2015-03-22 23:34 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King 1 sibling, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2015-03-22 22:54 UTC (permalink / raw) To: Eric Sunshine; +Cc: Jeff King, Wincent Colaiuta, Git List Eric Sunshine <sunshine@sunshineco.com> writes: > On Sun, Mar 22, 2015 at 6:07 AM, Jeff King <peff@peff.net> wrote: >> Something as simple as reading the stdout from a command >> turns out to be rather hard to do right. Doing: >> >> if (!run_command(&cmd)) >> strbuf_read(&buf, cmd.out, 0); >> >> can result in deadlock if the child process produces a large >> amount of output. [...] >> >> Let's introduce a strbuf helper that can make this a bit >> simpler for callers to do right. >> >> Signed-off-by: Jeff King <peff@peff.net> >> --- >> This is really at the intersection of the strbuf and >> run-command APIs, so you could argue for it being part of >> either It is logically quite like the strbuf_read_file() >> function, so I put it there. > > It does feel like a layering violation. If moved to the run-command > API, it could given one of the following names or something better: > > run_command_capture() > capture_command() > command_capture() > run_command_with_output() > capture_output() Sound like a good suggestion (but I haven't read the users of the proposed function, after doing which I might change my mind---I'll see). Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 22:54 ` Junio C Hamano @ 2015-03-22 23:40 ` Junio C Hamano 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King 0 siblings, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2015-03-22 23:40 UTC (permalink / raw) To: Eric Sunshine; +Cc: Jeff King, Wincent Colaiuta, Git List Junio C Hamano <gitster@pobox.com> writes: > Eric Sunshine <sunshine@sunshineco.com> writes: > >> It does feel like a layering violation. If moved to the run-command >> API, it could given one of the following names or something better: >> >> run_command_capture() >> capture_command() >> command_capture() >> run_command_with_output() >> capture_output() > > Sound like a good suggestion (but I haven't read the users of the > proposed function, after doing which I might change my mind---I'll > see). Now I read the callers, it does look like this new function better fits in the run-command suite, essentially allowing us to do what we would do with $(cmd) or `cmd` in shell and Perl scripts, even though I do not particularly agree with the phrase "layering violation" to call its current placement. Thanks. ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v2 0/7] introduce capture_command to avoid deadlocks 2015-03-22 23:40 ` Junio C Hamano @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:53 ` [PATCH v2 1/7] wt-status: don't flush before running "submodule status" Jeff King ` (7 more replies) 0 siblings, 8 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List On Sun, Mar 22, 2015 at 04:40:37PM -0700, Junio C Hamano wrote: > Now I read the callers, it does look like this new function better > fits in the run-command suite, essentially allowing us to do what we > would do with $(cmd) or `cmd` in shell and Perl scripts, even though > I do not particularly agree with the phrase "layering violation" to > call its current placement. I was on the fence, and you both seem to prefer it in run-command, so here is a re-roll in that direction (no other changes). [1/7]: wt-status: don't flush before running "submodule status" [2/7]: wt_status: fix signedness mismatch in strbuf_read call [3/7]: run-command: introduce capture_command helper [4/7]: wt-status: use capture_command [5/7]: submodule: use capture_command [6/7]: trailer: use capture_command [7/7]: run-command: forbid using run_command with piped output -Peff ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH v2 1/7] wt-status: don't flush before running "submodule status" 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:53 ` [PATCH v2 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King ` (6 subsequent siblings) 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List This is a holdover from the original implementation in ac8d5af (builtin-status: submodule summary support, 2008-04-12), which just had the sub-process output to our descriptor; we had to make sure we had flushed any data that we produced before it started writing. Since 3ba7407 (submodule summary: ignore --for-status option, 2013-09-06), however, we pipe the sub-process output back to ourselves. So there's no longer any need to flush (it does not hurt, but it may leave readers wondering why we do it). Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 1 - 1 file changed, 1 deletion(-) diff --git a/wt-status.c b/wt-status.c index 7036fa2..1712762 100644 --- a/wt-status.c +++ b/wt-status.c @@ -745,7 +745,6 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt sm_summary.git_cmd = 1; sm_summary.no_stdin = 1; - fflush(s->fp); sm_summary.out = -1; run_command(&sm_summary); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 2/7] wt_status: fix signedness mismatch in strbuf_read call 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King 2015-03-23 3:53 ` [PATCH v2 1/7] wt-status: don't flush before running "submodule status" Jeff King @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:53 ` [PATCH v2 3/7] run-command: introduce capture_command helper Jeff King ` (5 subsequent siblings) 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List We call strbuf_read(), and want to know whether we got any output. To do so, we assign the result to a size_t, and check whether it is non-zero. But strbuf_read returns a signed ssize_t. If it encounters an error, it will return -1, and we'll end up treating this the same as if we had gotten output. Instead, we can just check whether our buffer has anything in it (which is what we care about anyway, and is the same thing since we know the buffer was empty to begin with). Note that the "len" variable actually has two roles in this function. Now that we've eliminated the first, we can push the declaration closer to the point of use for the second one. Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/wt-status.c b/wt-status.c index 1712762..b47f6d9 100644 --- a/wt-status.c +++ b/wt-status.c @@ -729,7 +729,6 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt struct strbuf cmd_stdout = STRBUF_INIT; struct strbuf summary = STRBUF_INIT; char *summary_content; - size_t len; argv_array_pushf(&sm_summary.env_array, "GIT_INDEX_FILE=%s", s->index_file); @@ -749,10 +748,10 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt run_command(&sm_summary); - len = strbuf_read(&cmd_stdout, sm_summary.out, 1024); + strbuf_read(&cmd_stdout, sm_summary.out, 1024); /* prepend header, only if there's an actual output */ - if (len) { + if (cmd_stdout.len) { if (uncommitted) strbuf_addstr(&summary, _("Submodules changed but not updated:")); else @@ -763,6 +762,7 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt strbuf_release(&cmd_stdout); if (s->display_comment_prefix) { + size_t len; summary_content = strbuf_detach(&summary, &len); strbuf_add_commented_lines(&summary, summary_content, len); free(summary_content); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 3/7] run-command: introduce capture_command helper 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King 2015-03-23 3:53 ` [PATCH v2 1/7] wt-status: don't flush before running "submodule status" Jeff King 2015-03-23 3:53 ` [PATCH v2 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:53 ` [PATCH v2 4/7] wt-status: use capture_command Jeff King ` (4 subsequent siblings) 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List Something as simple as reading the stdout from a command turns out to be rather hard to do right. Doing: cmd.out = -1; run_command(&cmd); strbuf_read(&buf, cmd.out, 0); can result in deadlock if the child process produces a large amount of output. What happens is: 1. The parent spawns the child with its stdout connected to a pipe, of which the parent is the sole reader. 2. The parent calls wait(), blocking until the child exits. 3. The child writes to stdout. If it writes more data than the OS pipe buffer can hold, the write() call will block. This is a deadlock; the parent is waiting for the child to exit, and the child is waiting for the parent to call read(). So we might try instead: start_command(&cmd); strbuf_read(&buf, cmd.out, 0); finish_command(&cmd); But that is not quite right either. We are examining cmd.out and running finish_command whether start_command succeeded or not, which is wrong. Moreover, these snippets do not do any error handling. If our read() fails, we must make sure to still call finish_command (to reap the child process). And both snippets failed to close the cmd.out descriptor, which they must do (provided start_command succeeded). Let's introduce a run-command helper that can make this a bit simpler for callers to get right. Signed-off-by: Jeff King <peff@peff.net> --- run-command.c | 16 ++++++++++++++++ run-command.h | 13 +++++++++++++ 2 files changed, 29 insertions(+) diff --git a/run-command.c b/run-command.c index 3afb124..e591d2c 100644 --- a/run-command.c +++ b/run-command.c @@ -829,3 +829,19 @@ int run_hook_le(const char *const *env, const char *name, ...) return ret; } + +int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint) +{ + cmd->out = -1; + if (start_command(cmd) < 0) + return -1; + + if (strbuf_read(buf, cmd->out, hint) < 0) { + close(cmd->out); + finish_command(cmd); /* throw away exit code */ + return -1; + } + + close(cmd->out); + return finish_command(cmd); +} diff --git a/run-command.h b/run-command.h index d6868dc..263b966 100644 --- a/run-command.h +++ b/run-command.h @@ -71,6 +71,19 @@ int run_command_v_opt(const char **argv, int opt); */ int run_command_v_opt_cd_env(const char **argv, int opt, const char *dir, const char *const *env); +/** + * Execute the given command, capturing its stdout in the given strbuf. + * Returns -1 if starting the command fails or reading fails, and otherwise + * returns the exit code of the command. The output collected in the + * buffer is kept even if the command returns a non-zero exit. The hint field + * gives a starting size for the strbuf allocation. + * + * The fields of "cmd" should be set up as they would for a normal run_command + * invocation. But note that there is no need to set cmd->out; the function + * sets it up for the caller. + */ +int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint); + /* * The purpose of the following functions is to feed a pipe by running * a function asynchronously and providing output that the caller reads. -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 4/7] wt-status: use capture_command 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King ` (2 preceding siblings ...) 2015-03-23 3:53 ` [PATCH v2 3/7] run-command: introduce capture_command helper Jeff King @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:53 ` [PATCH v2 5/7] submodule: " Jeff King ` (3 subsequent siblings) 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List When we spawn "git submodule status" to read its output, we use run_command() followed by strbuf_read() read from the pipe. This can deadlock if the subprocess output is larger than the system pipe buffer. Furthermore, if start_command() fails, we'll try to read from a bogus descriptor (probably "-1" or a descriptor we just closed, but it is a bad idea for us to make assumptions about how start_command implements its error handling). And if start_command succeeds, we leak the file descriptor for the pipe to the child. All of these can be solved by using the capture_command helper. Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/wt-status.c b/wt-status.c index b47f6d9..853419f 100644 --- a/wt-status.c +++ b/wt-status.c @@ -744,11 +744,8 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt sm_summary.git_cmd = 1; sm_summary.no_stdin = 1; - sm_summary.out = -1; - run_command(&sm_summary); - - strbuf_read(&cmd_stdout, sm_summary.out, 1024); + capture_command(&sm_summary, &cmd_stdout, 1024); /* prepend header, only if there's an actual output */ if (cmd_stdout.len) { -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 5/7] submodule: use capture_command 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King ` (3 preceding siblings ...) 2015-03-23 3:53 ` [PATCH v2 4/7] wt-status: use capture_command Jeff King @ 2015-03-23 3:53 ` Jeff King 2015-03-23 3:54 ` [PATCH v2 6/7] trailer: " Jeff King ` (2 subsequent siblings) 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List In is_submodule_commit_present, we call run_command followed by a pipe read, which is prone to deadlock. It is unlikely to happen in this case, as rev-list should never produce more than a single line of output, but it does not hurt to avoid an anti-pattern (and using the helper simplifies the setup and cleanup). Signed-off-by: Jeff King <peff@peff.net> --- submodule.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/submodule.c b/submodule.c index d37d400..c0e6c81 100644 --- a/submodule.c +++ b/submodule.c @@ -576,12 +576,10 @@ static int is_submodule_commit_present(const char *path, unsigned char sha1[20]) cp.env = local_repo_env; cp.git_cmd = 1; cp.no_stdin = 1; - cp.out = -1; cp.dir = path; - if (!run_command(&cp) && !strbuf_read(&buf, cp.out, 1024)) + if (!capture_command(&cp, &buf, 1024) && !buf.len) is_present = 1; - close(cp.out); strbuf_release(&buf); } return is_present; -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 6/7] trailer: use capture_command 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King ` (4 preceding siblings ...) 2015-03-23 3:53 ` [PATCH v2 5/7] submodule: " Jeff King @ 2015-03-23 3:54 ` Jeff King 2015-03-23 3:54 ` [PATCH v2 7/7] run-command: forbid using run_command with piped output Jeff King 2015-03-23 4:40 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Junio C Hamano 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List When we read from a trailer.*.command sub-program, the current code uses run_command followed by a pipe read, which can result in deadlock (though in practice you would have to have a large trailer for this to be a problem). The current code also leaks the file descriptor for the pipe to the sub-command. Instead, let's use capture_command, which makes this simpler (and we can get rid of our custom helper). Signed-off-by: Jeff King <peff@peff.net> --- trailer.c | 18 +++++------------- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/trailer.c b/trailer.c index 05b3859..4b14a56 100644 --- a/trailer.c +++ b/trailer.c @@ -214,16 +214,6 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static int read_from_command(struct child_process *cp, struct strbuf *buf) -{ - if (run_command(cp)) - return error("running trailer command '%s' failed", cp->argv[0]); - if (strbuf_read(buf, cp->out, 1024) < 1) - return error("reading from trailer command '%s' failed", cp->argv[0]); - strbuf_trim(buf); - return 0; -} - static const char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; @@ -240,14 +230,16 @@ static const char *apply_command(const char *command, const char *arg) cp.argv = argv; cp.env = local_repo_env; cp.no_stdin = 1; - cp.out = -1; cp.use_shell = 1; - if (read_from_command(&cp, &buf)) { + if (capture_command(&cp, &buf, 1024)) { + error("running trailer command '%s' failed", cmd.buf); strbuf_release(&buf); result = xstrdup(""); - } else + } else { + strbuf_trim(&buf); result = strbuf_detach(&buf, NULL); + } strbuf_release(&cmd); return result; -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH v2 7/7] run-command: forbid using run_command with piped output 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King ` (5 preceding siblings ...) 2015-03-23 3:54 ` [PATCH v2 6/7] trailer: " Jeff King @ 2015-03-23 3:54 ` Jeff King 2015-03-23 4:40 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Junio C Hamano 7 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-23 3:54 UTC (permalink / raw) To: Junio C Hamano; +Cc: Eric Sunshine, Wincent Colaiuta, Git List Because run_command both spawns and wait()s for the command before returning control to the caller, any reads from the pipes we open must necessarily happen after wait() returns. This can lead to deadlock, as the child process may block on writing to us while we are blocked waiting for it to exit. Worse, it only happens when the child fills the pipe buffer, which means that the problem may come and go depending on the platform and the size of the output produced by the child. Let's detect and flag this dangerous construct so that we can catch potential bugs early in the test suite rather than having them happen in the field. Signed-off-by: Jeff King <peff@peff.net> --- run-command.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/run-command.c b/run-command.c index e591d2c..aad03ab 100644 --- a/run-command.c +++ b/run-command.c @@ -557,7 +557,12 @@ int finish_command(struct child_process *cmd) int run_command(struct child_process *cmd) { - int code = start_command(cmd); + int code; + + if (cmd->out < 0 || cmd->err < 0) + die("BUG: run_command with a pipe can cause deadlock"); + + code = start_command(cmd); if (code) return code; return finish_command(cmd); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v2 0/7] introduce capture_command to avoid deadlocks 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King ` (6 preceding siblings ...) 2015-03-23 3:54 ` [PATCH v2 7/7] run-command: forbid using run_command with piped output Jeff King @ 2015-03-23 4:40 ` Junio C Hamano 7 siblings, 0 replies; 26+ messages in thread From: Junio C Hamano @ 2015-03-23 4:40 UTC (permalink / raw) To: Jeff King; +Cc: Eric Sunshine, Wincent Colaiuta, Git List Thanks; will queue. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 19:36 ` Eric Sunshine 2015-03-22 22:54 ` Junio C Hamano @ 2015-03-22 23:34 ` Jeff King 1 sibling, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 23:34 UTC (permalink / raw) To: Eric Sunshine; +Cc: Wincent Colaiuta, Git List On Sun, Mar 22, 2015 at 03:36:01PM -0400, Eric Sunshine wrote: > > This is really at the intersection of the strbuf and > > run-command APIs, so you could argue for it being part of > > either It is logically quite like the strbuf_read_file() > > function, so I put it there. > > It does feel like a layering violation. If moved to the run-command > API, it could given one of the following names or something better: A layering violation implies there is an ordering to the APIs. Certainly we call APIs from other APIs all the time. I guess you could argue that these are the "same" layer, and should be next to each, and not building on each other (i.e., that strbuf has dependencies only on system APIs like stdio.h, and run-command only on system APIs like unistd.h, etc). But then reversing the order of the dependency does not really solve that. You would have to introduce a new higher-level API that combines them. But that seems silly for a single function (and I do not foresee any other similar functions). That being said, I'm not opposed to one of the reverse names if people feel strongly (I also considered making it an option flag to run_command_v_opt, but it ended up tangling things quite a bit more). -Peff ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 10:07 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King 2015-03-22 19:36 ` Eric Sunshine @ 2015-03-22 23:22 ` Junio C Hamano 2015-03-22 23:36 ` Jeff King 1 sibling, 1 reply; 26+ messages in thread From: Junio C Hamano @ 2015-03-22 23:22 UTC (permalink / raw) To: Jeff King; +Cc: Wincent Colaiuta, git Jeff King <peff@peff.net> writes: > diff --git a/strbuf.h b/strbuf.h > index 1883494..93a50da 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -1,6 +1,8 @@ > #ifndef STRBUF_H > #define STRBUF_H > > +struct child_process; > + > /** > * strbuf's are meant to be used with all the usual C string and memory > * APIs. Given that the length of the buffer is known, it's often better to > @@ -373,6 +375,14 @@ extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint); > extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint); > > /** > + * Execute the given command, capturing its stdout in the given strbuf. > + * Returns -1 if starting the command fails or reading fails, and otherwise > + * returns the exit code of the command. The output collected in the > + * buffer is kept even if the command returns a non-zero exit. > + */ > +int strbuf_read_cmd(struct strbuf *sb, struct child_process *cmd, size_t hint); > + > +/** > * Read a line from a FILE *, overwriting the existing contents > * of the strbuf. The second argument specifies the line > * terminator character, typically `'\n'`. It is an unfortunate tangent that this is a bugfix that may want to go to 'maint' and older, but our earlier jk/strbuf-doc-to-header topic introduces an unnecessary merge conflicts. I've wiggled this part and moved the doc elsewhere, only to remove that in the merge, which may not be optimal from the point of view of what I have to do when merging this topic down from pu to next to master to maint, but I do not see a good way around it. Thanks. The whole series looks very sensible. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper 2015-03-22 23:22 ` Junio C Hamano @ 2015-03-22 23:36 ` Jeff King 0 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 23:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Wincent Colaiuta, git On Sun, Mar 22, 2015 at 04:22:50PM -0700, Junio C Hamano wrote: > > +/** > > * Read a line from a FILE *, overwriting the existing contents > > * of the strbuf. The second argument specifies the line > > * terminator character, typically `'\n'`. > > It is an unfortunate tangent that this is a bugfix that may want to > go to 'maint' and older, but our earlier jk/strbuf-doc-to-header > topic introduces an unnecessary merge conflicts. Yeah, that is the worst part of refactoring and cleanup. Even when you make sure you are not hurting any topics in flight, you cannot know when a new topic will take off in your general area. > I've wiggled this part and moved the doc elsewhere, only to remove > that in the merge, which may not be optimal from the point of view > of what I have to do when merging this topic down from pu to next > to master to maint, but I do not see a good way around it. I'd suggest just dropping the documentation in the "maint" version (i.e., make it a moral cherry-pick of the function declaration only). -Peff ^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 4/7] wt-status: use strbuf_read_cmd 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King ` (2 preceding siblings ...) 2015-03-22 10:07 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King @ 2015-03-22 10:08 ` Jeff King 2015-03-22 10:08 ` [PATCH 5/7] submodule: " Jeff King ` (2 subsequent siblings) 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:08 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git When we spawn "git submodule status" to read its output, we use run_command() followed by a strbuf_read() from a pipe. This can deadlock if the subprocess output is larger than the system pipe buffer. Furthermore, if start_command() fails, we'll try to read from a bogus descriptor (probably "-1" or a descriptor we just closed, but it is a bad idea for us to make assumptions about how start_command implements its error handling). And if start_command succeeds, we leak the file descriptor for the pipe to the child. All of these can be solved by using the strbuf_read_cmd helper. Signed-off-by: Jeff King <peff@peff.net> --- wt-status.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/wt-status.c b/wt-status.c index b47f6d9..fd85d61 100644 --- a/wt-status.c +++ b/wt-status.c @@ -744,11 +744,8 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt sm_summary.git_cmd = 1; sm_summary.no_stdin = 1; - sm_summary.out = -1; - run_command(&sm_summary); - - strbuf_read(&cmd_stdout, sm_summary.out, 1024); + strbuf_read_cmd(&cmd_stdout, &sm_summary, 1024); /* prepend header, only if there's an actual output */ if (cmd_stdout.len) { -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 5/7] submodule: use strbuf_read_cmd 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King ` (3 preceding siblings ...) 2015-03-22 10:08 ` [PATCH 4/7] wt-status: use strbuf_read_cmd Jeff King @ 2015-03-22 10:08 ` Jeff King 2015-03-22 10:09 ` [PATCH 6/7] trailer: " Jeff King 2015-03-22 10:10 ` [PATCH 7/7] run-command: forbid using run_command with piped output Jeff King 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:08 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git In is_submodule_commit_present, we call run_command followed by a pipe read, which is prone to deadlock. It is unlikely to happen in this case, as rev-list should never produce more than a single line of output, but it does not hurt to avoid an anti-pattern (and using the helper simplifies the setup and cleanup). Signed-off-by: Jeff King <peff@peff.net> --- submodule.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/submodule.c b/submodule.c index d37d400..d85f1bb 100644 --- a/submodule.c +++ b/submodule.c @@ -576,12 +576,10 @@ static int is_submodule_commit_present(const char *path, unsigned char sha1[20]) cp.env = local_repo_env; cp.git_cmd = 1; cp.no_stdin = 1; - cp.out = -1; cp.dir = path; - if (!run_command(&cp) && !strbuf_read(&buf, cp.out, 1024)) + if (!strbuf_read_cmd(&buf, &cp, 1024) && !buf.len) is_present = 1; - close(cp.out); strbuf_release(&buf); } return is_present; -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 6/7] trailer: use strbuf_read_cmd 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King ` (4 preceding siblings ...) 2015-03-22 10:08 ` [PATCH 5/7] submodule: " Jeff King @ 2015-03-22 10:09 ` Jeff King 2015-03-22 10:10 ` [PATCH 7/7] run-command: forbid using run_command with piped output Jeff King 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:09 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git When we read from a trailer.*.command sub-program, the current code uses run_command followed by a pipe read, which can result in deadlock (though in practice you would have to have a large trailer for this to be a problem). The current code also leaks the file descriptor for the pipe to the sub-command. Instead, let's use strbuf_read_cmd, which makes this simpler (and we can get rid of our custom helper). Signed-off-by: Jeff King <peff@peff.net> --- trailer.c | 18 +++++------------- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/trailer.c b/trailer.c index 05b3859..b47bc0e 100644 --- a/trailer.c +++ b/trailer.c @@ -214,16 +214,6 @@ static struct trailer_item *remove_first(struct trailer_item **first) return item; } -static int read_from_command(struct child_process *cp, struct strbuf *buf) -{ - if (run_command(cp)) - return error("running trailer command '%s' failed", cp->argv[0]); - if (strbuf_read(buf, cp->out, 1024) < 1) - return error("reading from trailer command '%s' failed", cp->argv[0]); - strbuf_trim(buf); - return 0; -} - static const char *apply_command(const char *command, const char *arg) { struct strbuf cmd = STRBUF_INIT; @@ -240,14 +230,16 @@ static const char *apply_command(const char *command, const char *arg) cp.argv = argv; cp.env = local_repo_env; cp.no_stdin = 1; - cp.out = -1; cp.use_shell = 1; - if (read_from_command(&cp, &buf)) { + if (strbuf_read_cmd(&buf, &cp, 1024)) { + error("running trailer command '%s' failed", cmd.buf); strbuf_release(&buf); result = xstrdup(""); - } else + } else { + strbuf_trim(&buf); result = strbuf_detach(&buf, NULL); + } strbuf_release(&cmd); return result; -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 7/7] run-command: forbid using run_command with piped output 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King ` (5 preceding siblings ...) 2015-03-22 10:09 ` [PATCH 6/7] trailer: " Jeff King @ 2015-03-22 10:10 ` Jeff King 6 siblings, 0 replies; 26+ messages in thread From: Jeff King @ 2015-03-22 10:10 UTC (permalink / raw) To: Wincent Colaiuta; +Cc: git Because run_command both spawns and wait()s for the command before returning control to the caller, any reads from the pipes we open must necessarily happen after wait() returns. This can lead to deadlock, as the child process may block on writing to us while we are blocked waiting for it to exit. Worse, it only happens when the child fills the pipe buffer, which means that the problem may come and go depending on the platform and the size of the output produced by the child. Let's detect and flag this dangerous construct so that we can catch potential bugs early in the test suite rather than having them happen in the field. Signed-off-by: Jeff King <peff@peff.net> --- run-command.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/run-command.c b/run-command.c index 3afb124..23636e0 100644 --- a/run-command.c +++ b/run-command.c @@ -557,7 +557,12 @@ int finish_command(struct child_process *cmd) int run_command(struct child_process *cmd) { - int code = start_command(cmd); + int code; + + if (cmd->out < 0 || cmd->err < 0) + die("BUG: run_command with a pipe can cause deadlock"); + + code = start_command(cmd); if (code) return code; return finish_command(cmd); -- 2.3.3.618.ga041503 ^ permalink raw reply related [flat|nested] 26+ messages in thread
end of thread, other threads:[~2015-03-23 4:40 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-03-22 5:56 status hangs trying to get submodule summary Wincent Colaiuta 2015-03-22 7:44 ` [PATCH] status: read submodule process output before calling wait() Jeff King 2015-03-22 8:07 ` Jeff King 2015-03-22 9:59 ` [PATCH 0/7] introduce strbuf_read_cmd to avoid deadlocks Jeff King 2015-03-22 10:00 ` [PATCH 1/7] wt-status: don't flush before running "submodule status" Jeff King 2015-03-22 10:00 ` [PATCH 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King 2015-03-22 10:07 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King 2015-03-22 19:36 ` Eric Sunshine 2015-03-22 22:54 ` Junio C Hamano 2015-03-22 23:40 ` Junio C Hamano 2015-03-23 3:53 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Jeff King 2015-03-23 3:53 ` [PATCH v2 1/7] wt-status: don't flush before running "submodule status" Jeff King 2015-03-23 3:53 ` [PATCH v2 2/7] wt_status: fix signedness mismatch in strbuf_read call Jeff King 2015-03-23 3:53 ` [PATCH v2 3/7] run-command: introduce capture_command helper Jeff King 2015-03-23 3:53 ` [PATCH v2 4/7] wt-status: use capture_command Jeff King 2015-03-23 3:53 ` [PATCH v2 5/7] submodule: " Jeff King 2015-03-23 3:54 ` [PATCH v2 6/7] trailer: " Jeff King 2015-03-23 3:54 ` [PATCH v2 7/7] run-command: forbid using run_command with piped output Jeff King 2015-03-23 4:40 ` [PATCH v2 0/7] introduce capture_command to avoid deadlocks Junio C Hamano 2015-03-22 23:34 ` [PATCH 3/7] strbuf: introduce strbuf_read_cmd helper Jeff King 2015-03-22 23:22 ` Junio C Hamano 2015-03-22 23:36 ` Jeff King 2015-03-22 10:08 ` [PATCH 4/7] wt-status: use strbuf_read_cmd Jeff King 2015-03-22 10:08 ` [PATCH 5/7] submodule: " Jeff King 2015-03-22 10:09 ` [PATCH 6/7] trailer: " Jeff King 2015-03-22 10:10 ` [PATCH 7/7] run-command: forbid using run_command with piped output Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).