* [PATCH] Use line buffering for standard output @ 2008-08-03 21:26 Anders Melchiorsen 2008-08-03 21:38 ` Junio C Hamano 2008-08-03 21:46 ` Linus Torvalds 0 siblings, 2 replies; 8+ messages in thread From: Anders Melchiorsen @ 2008-08-03 21:26 UTC (permalink / raw) To: git; +Cc: gitster, Anders Melchiorsen Normally, stdout is fully buffered, unless it refers to a terminal device. This gives problems when fork() is in play: the buffer is cloned and output appears twice. By always setting stdout to line buffering, we make the output work identically for all output devices. Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> --- On #git, blix mentioned that running git clone through a pipe made it output the "Initialized empty" line twice. This seems to be due to bad interactions between fork() and buffered stdio. Rather than putting in flushing at all the right places, this sledgehammer fix simply reverts to line buffering for all output devices. Anders. git.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/git.c b/git.c index 37b1d76..040b900 100644 --- a/git.c +++ b/git.c @@ -421,6 +421,12 @@ int main(int argc, const char **argv) int done_alias = 0; /* + * Use line buffering, even if we do not have interactive + * output. Full buffering mixes badly with fork(). + */ + setvbuf(stdout, NULL, _IOLBF, 0); + + /* * Take the basename of argv[0] as the command * name, and the dirname as the default exec_path * if we don't have anything better. -- 1.5.6.4 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Use line buffering for standard output 2008-08-03 21:26 [PATCH] Use line buffering for standard output Anders Melchiorsen @ 2008-08-03 21:38 ` Junio C Hamano 2008-08-03 21:46 ` Linus Torvalds 1 sibling, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2008-08-03 21:38 UTC (permalink / raw) To: Anders Melchiorsen; +Cc: git Anders Melchiorsen <mail@cup.kalibalik.dk> writes: > Normally, stdout is fully buffered, unless it refers to a terminal > device. This gives problems when fork() is in play: the buffer is > cloned and output appears twice. I thought we have been careful to flush() them before we fork but perhaps there are recent changes that were not careful. Wouldn't it be better to fix them first, independent of this change? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use line buffering for standard output 2008-08-03 21:26 [PATCH] Use line buffering for standard output Anders Melchiorsen 2008-08-03 21:38 ` Junio C Hamano @ 2008-08-03 21:46 ` Linus Torvalds 2008-08-03 22:48 ` Anders Melchiorsen 1 sibling, 1 reply; 8+ messages in thread From: Linus Torvalds @ 2008-08-03 21:46 UTC (permalink / raw) To: Anders Melchiorsen; +Cc: git, gitster On Sun, 3 Aug 2008, Anders Melchiorsen wrote: > > Normally, stdout is fully buffered, unless it refers to a terminal > device. This gives problems when fork() is in play: the buffer is > cloned and output appears twice. > > By always setting stdout to line buffering, we make the output work > identically for all output devices. Please don't. This is a huge peformance issue for things like git log -p > file where we really want it to be fully buffered. So please just find the place where we do a fork() without flushing pending output... (We really shouldn't have all that many "fork()" calls left, I thought - the Windows stuff means that most of it should be abstracted away. So it's not like we're talking about hundreds of sites, there should be just a couple). Linus ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use line buffering for standard output 2008-08-03 21:46 ` Linus Torvalds @ 2008-08-03 22:48 ` Anders Melchiorsen 2008-08-04 0:24 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Anders Melchiorsen @ 2008-08-03 22:48 UTC (permalink / raw) To: Linus Torvalds, gitster; +Cc: git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, 3 Aug 2008, Anders Melchiorsen wrote: >> >> By always setting stdout to line buffering, we make the output work >> identically for all output devices. > > Please don't. > > This is a huge peformance issue for things like > > git log -p > file > > where we really want it to be fully buffered. > > So please just find the place where we do a fork() without flushing > pending output... Sure. The sledgehammer approach was partly to get some advice on the proper solution. I now realize that you have generally been careful about this, and so a single flush should be enough. Below are two alternative proposals, one local and one global. Both of them fix the problem for me, but maybe you were even thinking about a third place? For the run-command.c one, I was not sure whether to put it inside or outside the ifdef, and I also was not sure whether to add it for start_command(). Not having other testcases, and not knowing Windows, this is the way it ended up. Cheers, Anders. From: Anders Melchiorsen <mail@cup.kalibalik.dk> Date: Mon, 4 Aug 2008 00:21:49 +0200 Subject: [PATCH] Flush stdout in init-db Before this change, clone outputs "Initialized empty ..." twice if output is piped. Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> --- builtin-init-db.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/builtin-init-db.c b/builtin-init-db.c index baf0d09..954c7e9 100644 --- a/builtin-init-db.c +++ b/builtin-init-db.c @@ -315,11 +315,13 @@ int init_db(const char *template_dir, unsigned int flags) git_config_set("receive.denyNonFastforwards", "true"); } - if (!(flags & INIT_DB_QUIET)) + if (!(flags & INIT_DB_QUIET)) { printf("%s%s Git repository in %s/\n", reinit ? "Reinitialized existing" : "Initialized empty", shared_repository ? " shared" : "", get_git_dir()); + fflush(stdout); + } return 0; } From: Anders Melchiorsen <mail@cup.kalibalik.dk> Date: Mon, 4 Aug 2008 00:35:40 +0200 Subject: [PATCH] Flush standard output in start_async This prevents double output in case stdout is redirected. Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> --- run-command.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/run-command.c b/run-command.c index a3b28a6..67be079 100644 --- a/run-command.c +++ b/run-command.c @@ -304,6 +304,9 @@ int start_async(struct async *async) async->out = pipe_out[0]; #ifndef __MINGW32__ + /* Flush output before fork() to avoid cloning the buffer */ + fflush(stdout); + async->pid = fork(); if (async->pid < 0) { error("fork (async) failed: %s", strerror(errno)); ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Use line buffering for standard output 2008-08-03 22:48 ` Anders Melchiorsen @ 2008-08-04 0:24 ` Junio C Hamano 2008-08-04 0:30 ` [PATCH] Flush output in start_async Anders Melchiorsen 2008-08-04 5:50 ` [PATCH] Use line buffering for standard output Linus Torvalds 0 siblings, 2 replies; 8+ messages in thread From: Junio C Hamano @ 2008-08-04 0:24 UTC (permalink / raw) To: Anders Melchiorsen; +Cc: Linus Torvalds, git Anders Melchiorsen <mail@cup.kalibalik.dk> writes: > From: Anders Melchiorsen <mail@cup.kalibalik.dk> > Date: Mon, 4 Aug 2008 00:35:40 +0200 > Subject: [PATCH] Flush standard output in start_async > > This prevents double output in case stdout is redirected. > > Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> > --- > run-command.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/run-command.c b/run-command.c > index a3b28a6..67be079 100644 > --- a/run-command.c > +++ b/run-command.c > @@ -304,6 +304,9 @@ int start_async(struct async *async) > async->out = pipe_out[0]; > > #ifndef __MINGW32__ > + /* Flush output before fork() to avoid cloning the buffer */ > + fflush(stdout); > + > async->pid = fork(); > if (async->pid < 0) { > error("fork (async) failed: %s", strerror(errno)); I think this with s/stdout/NULL/ would be a reasonable thing to do. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Flush output in start_async 2008-08-04 0:24 ` Junio C Hamano @ 2008-08-04 0:30 ` Anders Melchiorsen 2008-08-04 5:50 ` [PATCH] Use line buffering for standard output Linus Torvalds 1 sibling, 0 replies; 8+ messages in thread From: Anders Melchiorsen @ 2008-08-04 0:30 UTC (permalink / raw) To: gitster; +Cc: git, Anders Melchiorsen This prevents double output in case stdout is redirected. Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> --- run-command.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/run-command.c b/run-command.c index a3b28a6..6af83c5 100644 --- a/run-command.c +++ b/run-command.c @@ -304,6 +304,9 @@ int start_async(struct async *async) async->out = pipe_out[0]; #ifndef __MINGW32__ + /* Flush stdio before fork() to avoid cloning buffers */ + fflush(NULL); + async->pid = fork(); if (async->pid < 0) { error("fork (async) failed: %s", strerror(errno)); -- 1.5.6.4 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Use line buffering for standard output 2008-08-04 0:24 ` Junio C Hamano 2008-08-04 0:30 ` [PATCH] Flush output in start_async Anders Melchiorsen @ 2008-08-04 5:50 ` Linus Torvalds 2008-08-04 10:18 ` [PATCH] Add output flushing before fork() Anders Melchiorsen 1 sibling, 1 reply; 8+ messages in thread From: Linus Torvalds @ 2008-08-04 5:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: Anders Melchiorsen, git On Sun, 3 Aug 2008, Junio C Hamano wrote: > Anders Melchiorsen <mail@cup.kalibalik.dk> writes: > > > > #ifndef __MINGW32__ > > + /* Flush output before fork() to avoid cloning the buffer */ > > + fflush(stdout); > > + > > async->pid = fork(); > > if (async->pid < 0) { > > error("fork (async) failed: %s", strerror(errno)); > > I think this with s/stdout/NULL/ would be a reasonable thing to do. Agreed, I think that's the right thing to do. There's another fork there in start_command(), I suspect we should do it there too: it's a "generic" path, so it should try to be safe. The other ones look ok from a quick scan. I don't know the imap-send.c code, but it's from outside people who hopefully know what they were doing. The other ones don't seem to be using stdio before the fork (except for things like "die()" ;) There is a "fork()" in a _comment_ in builtin-ls-tree.c, and that one definitely should have a fflush(NULL) in front of it. But it _is_ just a comment, and rather than addign a fflush() there, it would probably be better to turn it into a "start_command()" or something like that. Linus. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Add output flushing before fork() 2008-08-04 5:50 ` [PATCH] Use line buffering for standard output Linus Torvalds @ 2008-08-04 10:18 ` Anders Melchiorsen 0 siblings, 0 replies; 8+ messages in thread From: Anders Melchiorsen @ 2008-08-04 10:18 UTC (permalink / raw) To: gitster; +Cc: git, Anders Melchiorsen This adds fflush(NULL) before fork() in start_command(), to keep the generic interface safe. A remaining use of fork() with no flushing is in a comment in show_tree(). Rewrite that comment to use start_command(). Signed-off-by: Anders Melchiorsen <mail@cup.kalibalik.dk> --- This fixes up the remaining two spots that Linus suggested. builtin-ls-tree.c | 13 ++++++------- run-command.c | 1 + 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/builtin-ls-tree.c b/builtin-ls-tree.c index d25767a..cb61717 100644 --- a/builtin-ls-tree.c +++ b/builtin-ls-tree.c @@ -66,17 +66,16 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen, /* * Maybe we want to have some recursive version here? * - * Something like: + * Something similar to this incomplete example: * if (show_subprojects(base, baselen, pathname)) { - if (fork()) { - chdir(base); - exec ls-tree; - } - waitpid(); + struct child_process ls_tree; + + ls_tree.dir = base; + ls_tree.argv = ls-tree; + start_command(&ls_tree); } * - * ..or similar.. */ type = commit_type; } else if (S_ISDIR(mode)) { diff --git a/run-command.c b/run-command.c index 6af83c5..bbb9c77 100644 --- a/run-command.c +++ b/run-command.c @@ -68,6 +68,7 @@ int start_command(struct child_process *cmd) trace_argv_printf(cmd->argv, "trace: run_command:"); #ifndef __MINGW32__ + fflush(NULL); cmd->pid = fork(); if (!cmd->pid) { if (cmd->no_stdin) -- 1.5.6.4 ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-08-04 10:19 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-08-03 21:26 [PATCH] Use line buffering for standard output Anders Melchiorsen 2008-08-03 21:38 ` Junio C Hamano 2008-08-03 21:46 ` Linus Torvalds 2008-08-03 22:48 ` Anders Melchiorsen 2008-08-04 0:24 ` Junio C Hamano 2008-08-04 0:30 ` [PATCH] Flush output in start_async Anders Melchiorsen 2008-08-04 5:50 ` [PATCH] Use line buffering for standard output Linus Torvalds 2008-08-04 10:18 ` [PATCH] Add output flushing before fork() Anders Melchiorsen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).