git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Scripted clone generating an incomplete, unusable .git/config
@ 2010-11-10 23:21 Dun Peal
  2010-11-11  7:55 ` Stefan Naewe
  2010-11-11 10:37 ` Jonathan Nieder
  0 siblings, 2 replies; 17+ messages in thread
From: Dun Peal @ 2010-11-10 23:21 UTC (permalink / raw)
  To: Git ML

This is a weird issue I ran into while scripting some Git operations
with git 1.7.2 on a Linux server.

When running the git-clone command manually from the command line, the
resulting repo/.git/config had all three required sections: core,
remote (origin), branch (master).

When running the exact same git-clone command manually from the Python
scripted, the resulting repo/.git/config was missing the `core` and
`remote` sections.

Here's a bash log fully demonstrating the issue:

  $ python -c "import os; os.popen('git clone
git@git.domain.com:repos/repo.git')"
  [...]
  $ cat repo/.git/config
  [branch "master"]
          remote = origin
          merge = refs/heads/master
  $ rm -Rf repo
  $ git clone git@git.domain.com:repos/repo.git
  $ cat repo/.git/config
  [core]
          repositoryformatversion = 0
          filemode = true
          bare = false
          logallrefupdates = true
  [remote "origin"]
          fetch = +refs/heads/*:refs/remotes/origin/*
          url = git@git.domain.com:repo/repo.git
  [branch "master"]
          remote = origin
          merge = refs/heads/master

What's causing this?  Is it a bug?

Thanks, D

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-10 23:21 Scripted clone generating an incomplete, unusable .git/config Dun Peal
@ 2010-11-11  7:55 ` Stefan Naewe
  2010-11-11  8:00   ` Stefan Naewe
  2010-11-11 10:37 ` Jonathan Nieder
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Naewe @ 2010-11-11  7:55 UTC (permalink / raw)
  To: Dun Peal; +Cc: Git ML

On 11/11/2010 12:21 AM, Dun Peal wrote:
> This is a weird issue I ran into while scripting some Git operations
> with git 1.7.2 on a Linux server.
> 
> When running the git-clone command manually from the command line, the
> resulting repo/.git/config had all three required sections: core,
> remote (origin), branch (master).
> 
> When running the exact same git-clone command manually from the Python
> scripted, the resulting repo/.git/config was missing the `core` and
> `remote` sections.
> 
> Here's a bash log fully demonstrating the issue:
> 
>   $ python -c "import os; os.popen('git clone
> git@git.domain.com:repos/repo.git')"
>   [...]
>   $ cat repo/.git/config
>   [branch "master"]
>           remote = origin
>           merge = refs/heads/master
>   $ rm -Rf repo
>   $ git clone git@git.domain.com:repos/repo.git
>   $ cat repo/.git/config
>   [core]
>           repositoryformatversion = 0
>           filemode = true
>           bare = false
>           logallrefupdates = true
>   [remote "origin"]
>           fetch = +refs/heads/*:refs/remotes/origin/*
>           url = git@git.domain.com:repo/repo.git
>   [branch "master"]
>           remote = origin
>           merge = refs/heads/master
> 
> What's causing this?  Is it a bug?

Same for me with git version 1.7.3.2 on Debian Etch.
Seems to be a problem with the popen() returning too early or
the interpreter dying too early.

This works though:

$ python -c "import subprocess; subprocess.call(['git', 'clone', 'git://host/repoo.git'])"

Regards,
  Stefan
-- 
----------------------------------------------------------------
/dev/random says: I is knot dain bramaged!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11  7:55 ` Stefan Naewe
@ 2010-11-11  8:00   ` Stefan Naewe
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Naewe @ 2010-11-11  8:00 UTC (permalink / raw)
  To: Dun Peal; +Cc: Git ML

On 11/11/2010 8:55 AM, Stefan Naewe wrote:
> On 11/11/2010 12:21 AM, Dun Peal wrote:
>>
>> Here's a bash log fully demonstrating the issue:
>>
>>   $ python -c "import os; os.popen('git clone
>> git@git.domain.com:repos/repo.git')"
>>   [...]
>> What's causing this?  Is it a bug?
> 
> Same for me with git version 1.7.3.2 on Debian Etch.

Make that 'Debian Lenny (5.0.6)' FWIW...

> Seems to be a problem with the popen() returning too early or
> the interpreter dying too early.

I forgot to say:

...because it works if I do it interactivley in the python shell.
 
> This works though:
> 
> $ python -c "import subprocess; subprocess.call(['git', 'clone', 'git://host/repoo.git'])"
 
Regards,
  Stefan
-- 
----------------------------------------------------------------
/dev/random says: I is knot dain bramaged!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-10 23:21 Scripted clone generating an incomplete, unusable .git/config Dun Peal
  2010-11-11  7:55 ` Stefan Naewe
@ 2010-11-11 10:37 ` Jonathan Nieder
  2010-11-11 12:16   ` Nguyen Thai Ngoc Duy
  2010-11-11 17:39   ` Scripted clone generating an incomplete, unusable .git/config Andreas Schwab
  1 sibling, 2 replies; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-11 10:37 UTC (permalink / raw)
  To: Dun Peal; +Cc: Git ML, Stefan Naewe

Hi,

Dun Peal wrote:

>   $ python -c "import os; os.popen('git clone git@git.domain.com:repos/repo.git')"
>   [...]
>   $ cat repo/.git/config
>   [branch "master"]
>           remote = origin
>           merge = refs/heads/master

It looks like you've probably figured it out already, but for
completeness:

Most likely the clone is terminating when Python exits, perhaps due to
SIGPIPE.  It doesn't look like a bug to me; I suspect you meant to use
os.system(), which is synchronous, instead.  You might be able to get
a better sense of what happened with GIT_TRACE=1 or strace.

Hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 10:37 ` Jonathan Nieder
@ 2010-11-11 12:16   ` Nguyen Thai Ngoc Duy
  2010-11-11 17:32     ` Jonathan Nieder
  2010-11-11 17:39   ` Scripted clone generating an incomplete, unusable .git/config Andreas Schwab
  1 sibling, 1 reply; 17+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-11 12:16 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Dun Peal, Git ML, Stefan Naewe

On Thu, Nov 11, 2010 at 5:37 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
>
> Dun Peal wrote:
>
>>   $ python -c "import os; os.popen('git clone git@git.domain.com:repos/repo.git')"
>>   [...]
>>   $ cat repo/.git/config
>>   [branch "master"]
>>           remote = origin
>>           merge = refs/heads/master
>
> It looks like you've probably figured it out already, but for
> completeness:
>
> Most likely the clone is terminating when Python exits, perhaps due to
> SIGPIPE.  It doesn't look like a bug to me; I suspect you meant to use
> os.system(), which is synchronous, instead.  You might be able to get
> a better sense of what happened with GIT_TRACE=1 or strace.

If "git clone" is terminated before it completes, shouldn't it clean
the uncompleted repo?
-- 
Duy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 12:16   ` Nguyen Thai Ngoc Duy
@ 2010-11-11 17:32     ` Jonathan Nieder
  2010-11-11 17:55       ` Daniel Barkalow
  0 siblings, 1 reply; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-11 17:32 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Dun Peal, Git ML, Stefan Naewe, Daniel Barkalow, Carl Worth

On Thu, Nov 11, 2010 at 07:16:27PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Thu, Nov 11, 2010 at 5:37 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Most likely the clone is terminating when Python exits, perhaps due to
>> SIGPIPE.  It doesn't look like a bug to me; I suspect you meant to use
>> os.system(), which is synchronous, instead.
[...]
> If "git clone" is terminated before it completes, shouldn't it clean
> the uncompleted repo?

Ah, so it should.

 trace: built-in: git clone jrn@localhost:/home/jrn/src/xz
 trace: run_command: ssh jrn@localhost git-upload-pack '/home/jrn/src/xz'
 trace: remove junk called
 jrn@localhosts password: 
 trace: run_command: index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
 trace: exec: git index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
 trace: built-in: git index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
 remote: Counting objects: 7299, done.
 remote: Compressing objects: 100% (1826/1826), done.
 remote: Total 7299 (delta 5421), reused 7274 (delta 5401)
 Receiving objects: 100% (7299/7299), 2.36 MiB | 4.43 MiB/s, done.
 Resolving deltas: 100% (5421/5421), done.
 trace: exited with status 0
 trace: exited with status 0
 trace: remove junk called
 trace: remove_junk: pid != 0

Are there any downside to the following?

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
diff --git a/builtin/clone.c b/builtin/clone.c
index 19ed640..af6b40a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -667,6 +667,5 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&branch_top);
 	strbuf_release(&key);
 	strbuf_release(&value);
-	junk_pid = 0;
 	return err;
 }

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 10:37 ` Jonathan Nieder
  2010-11-11 12:16   ` Nguyen Thai Ngoc Duy
@ 2010-11-11 17:39   ` Andreas Schwab
  1 sibling, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2010-11-11 17:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Dun Peal, Git ML, Stefan Naewe

Jonathan Nieder <jrnieder@gmail.com> writes:

> Hi,
>
> Dun Peal wrote:
>
>>   $ python -c "import os; os.popen('git clone git@git.domain.com:repos/repo.git')"
>>   [...]
>>   $ cat repo/.git/config
>>   [branch "master"]
>>           remote = origin
>>           merge = refs/heads/master
>
> It looks like you've probably figured it out already, but for
> completeness:
>
> Most likely the clone is terminating when Python exits, perhaps due to
> SIGPIPE.  It doesn't look like a bug to me; I suspect you meant to use
> os.system(), which is synchronous, instead.  You might be able to get
> a better sense of what happened with GIT_TRACE=1 or strace.

I think it would be a bug in python not wait(2)ing for the subprocess
during the implicit close when exiting.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 17:32     ` Jonathan Nieder
@ 2010-11-11 17:55       ` Daniel Barkalow
  2010-11-11 18:48         ` Jonathan Nieder
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Barkalow @ 2010-11-11 17:55 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Nguyen Thai Ngoc Duy, Dun Peal, Git ML, Stefan Naewe, Carl Worth

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2183 bytes --]

On Thu, 11 Nov 2010, Jonathan Nieder wrote:

> On Thu, Nov 11, 2010 at 07:16:27PM +0700, Nguyen Thai Ngoc Duy wrote:
> > On Thu, Nov 11, 2010 at 5:37 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> 
> >> Most likely the clone is terminating when Python exits, perhaps due to
> >> SIGPIPE.  It doesn't look like a bug to me; I suspect you meant to use
> >> os.system(), which is synchronous, instead.
> [...]
> > If "git clone" is terminated before it completes, shouldn't it clean
> > the uncompleted repo?
> 
> Ah, so it should.
> 
>  trace: built-in: git clone jrn@localhost:/home/jrn/src/xz
>  trace: run_command: ssh jrn@localhost git-upload-pack '/home/jrn/src/xz'
>  trace: remove junk called
>  jrn@localhosts password: 
>  trace: run_command: index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
>  trace: exec: git index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
>  trace: built-in: git index-pack --stdin -v --fix-thin --keep=fetch-pack 19314 on burratino
>  remote: Counting objects: 7299, done.
>  remote: Compressing objects: 100% (1826/1826), done.
>  remote: Total 7299 (delta 5421), reused 7274 (delta 5401)
>  Receiving objects: 100% (7299/7299), 2.36 MiB | 4.43 MiB/s, done.
>  Resolving deltas: 100% (5421/5421), done.
>  trace: exited with status 0
>  trace: exited with status 0
>  trace: remove junk called
>  trace: remove_junk: pid != 0
> 
> Are there any downside to the following?
> 
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 19ed640..af6b40a 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -667,6 +667,5 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	strbuf_release(&branch_top);
>  	strbuf_release(&key);
>  	strbuf_release(&value);
> -	junk_pid = 0;
>  	return err;
>  }

I believe that would cause it to remove the repository when it terminates, 
regardless of whether it completed or not. That line is after all of the 
clone's code. I'm a bit suspicious that it's not flushing some stdio 
buffer and relying on the C library doing it on an orderly exit.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 17:55       ` Daniel Barkalow
@ 2010-11-11 18:48         ` Jonathan Nieder
  2010-11-11 19:05           ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-11 18:48 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Nguyen Thai Ngoc Duy, Dun Peal, Git ML, Stefan Naewe, Carl Worth,
	Jeff King

On Thu, Nov 11, 2010 at 12:55:48PM -0500, Daniel Barkalow wrote:
> On Thu, 11 Nov 2010, Jonathan Nieder wrote:

>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -667,6 +667,5 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>  	strbuf_release(&branch_top);
>>  	strbuf_release(&key);
>>  	strbuf_release(&value);
>> -	junk_pid = 0;
>>  	return err;
>>  }
>
> I believe that would cause it to remove the repository when it terminates, 
> regardless of whether it completed or not.

Ah, right, the second remove_junk() call is because of atexit().

So why does git clone keep running after the first remove_junk() call?
It seems that the signal is initially set up (by Python's popen()?) as
SIG_IGN.  I guess "git clone" should explicitly override that to be
SIG_DFL?

Here's a proof of concept.  It is not very good because it overrides
any previously set sigchain handlers (in case the "git" wrappers
start to use one) and because using SIG_DFL as a sigchain_fun feels
like violating an abstraction.

diff --git a/builtin/clone.c b/builtin/clone.c
index 19ed640..2f21a91 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -458,6 +458,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 	junk_git_dir = git_dir;
 	atexit(remove_junk);
+	sigchain_push_common(SIG_DFL);
 	sigchain_push_common(remove_junk_on_signal);
 
 	setenv(CONFIG_ENVIRONMENT, mkpath("%s/config", git_dir), 1);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 18:48         ` Jonathan Nieder
@ 2010-11-11 19:05           ` Jeff King
  2010-11-12  2:16             ` Jonathan Nieder
  2010-11-12  4:32             ` Jonathan Nieder
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff King @ 2010-11-11 19:05 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

On Thu, Nov 11, 2010 at 12:48:29PM -0600, Jonathan Nieder wrote:

> So why does git clone keep running after the first remove_junk() call?
> It seems that the signal is initially set up (by Python's popen()?) as
> SIG_IGN.  I guess "git clone" should explicitly override that to be
> SIG_DFL?

I was tracing this earlier today, too, and got sidetracked. But I got to
the same confusing point: why doesn't it die after cleaning up? It looks
like we inherit SIG_IGN for SIGPIPE from the parent python process.

I don't think it makes sense for git-clone to do this itself. If we are
going to say "SIGPIPE should default to SIG_DFL on startup" then we
should do it as the very first thing in the git wrapper, not just for
git-clone. That gives each git program a known starting point of
behavior.

But I wonder if we should perhaps just be ignoring SIGPIPE in this
instance instead.  There isn't a real error here; we just ended up not
being able to write some useless progress report to stdout. There's no
reason to fail.

Note that we probably don't want to ignore SIGPIPE for all of git; many
of the output-producing programs rely on it for early termination when
the user closes the pager. But for clone, it makes sense.

> Here's a proof of concept.  It is not very good because it overrides
> any previously set sigchain handlers (in case the "git" wrappers
> start to use one) and because using SIG_DFL as a sigchain_fun feels
> like violating an abstraction.
> [...]
> +	sigchain_push_common(SIG_DFL);

I don't think your patch is the right solution, but FWIW, sigchain was
explicitly intended to be able to take SIG_DFL and SIG_IGN. Probably
sigchain_fun should be removed and we should just use sighandler_t
explicitly (I think in initial versions they were not the same thing,
and I realized the error of my ways, but the sigchain_fun typedef hung
around anyway).

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 19:05           ` Jeff King
@ 2010-11-12  2:16             ` Jonathan Nieder
  2010-11-12  4:24               ` Jeff King
  2010-11-12  4:32             ` Jonathan Nieder
  1 sibling, 1 reply; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-12  2:16 UTC (permalink / raw)
  To: Jeff King
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

Jeff King wrote:

> I don't think your patch is the right solution, but FWIW, sigchain was
> explicitly intended to be able to take SIG_DFL and SIG_IGN. Probably
> sigchain_fun should be removed and we should just use sighandler_t
> explicitly

Sorry, that was lazy of me.  The name sighandler_t is a GNU extension[1].

The following addresses my confusion but I doubt it's worth the
syntactic ugliness.

-- 8< --
Subject: sigchain: hide sigchain_fun type

Signal handlers that might be passed to signal() must be pointers to
function with the prototype

	void handler(int signum);

In glibc this type is called sighandler_t; in the sigchain lib,
sigchain_fun.

These really represent the same type in all respects: even special
values like SIG_IGN and SIG_DFL are perfectly reasonable arguments
for a function accepting values of one of the two types.  Avoid
confusion by eliminating the sigchain_fun name from sigchain.h.

It would be nice to instead use sighandler_t everywhere, but
unfortunately that name is a GNU extension.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 [1] http://www.delorie.com/gnu/docs/glibc/libc_481.html

 sigchain.c |    2 ++
 sigchain.h |    6 ++----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/sigchain.h b/sigchain.h
index 618083b..571d148 100644
--- a/sigchain.h
+++ b/sigchain.h
@@ -1,11 +1,9 @@
 #ifndef SIGCHAIN_H
 #define SIGCHAIN_H
 
-typedef void (*sigchain_fun)(int);
-
-int sigchain_push(int sig, sigchain_fun f);
+int sigchain_push(int sig, void (*f)(int));
 int sigchain_pop(int sig);
 
-void sigchain_push_common(sigchain_fun f);
+void sigchain_push_common(void (*f)(int));
 
 #endif /* SIGCHAIN_H */
diff --git a/sigchain.c b/sigchain.c
index 1118b99..f837f61 100644
--- a/sigchain.c
+++ b/sigchain.c
@@ -3,6 +3,8 @@
 
 #define SIGCHAIN_MAX_SIGNALS 32
 
+typedef void (*sigchain_fun)(int);
+
 struct sigchain_signal {
 	sigchain_fun *old;
 	int n;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-12  2:16             ` Jonathan Nieder
@ 2010-11-12  4:24               ` Jeff King
  2010-11-12  4:35                 ` Jonathan Nieder
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2010-11-12  4:24 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

On Thu, Nov 11, 2010 at 08:16:02PM -0600, Jonathan Nieder wrote:

> > I don't think your patch is the right solution, but FWIW, sigchain was
> > explicitly intended to be able to take SIG_DFL and SIG_IGN. Probably
> > sigchain_fun should be removed and we should just use sighandler_t
> > explicitly
> 
> Sorry, that was lazy of me.  The name sighandler_t is a GNU extension[1].

Ah, you're right. ANSI C defines signal() without using a typedef at
all. Maybe that is why I didn't use it in the first place. I don't
recall.

> The following addresses my confusion but I doubt it's worth the
> syntactic ugliness.
>
> -- 8< --
> Subject: sigchain: hide sigchain_fun type

I think it makes the code uglier. Maybe this is a better solution:

-- >8 --
Subject: [PATCH] document sigchain api

It's pretty straightforward, but a stripped-down example
never hurts. And we should make clear that it is explicitly
OK to use SIG_DFL and SIG_IGN.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/technical/api-sigchain.txt |   41 ++++++++++++++++++++++++++++++
 1 files changed, 41 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/technical/api-sigchain.txt

diff --git a/Documentation/technical/api-sigchain.txt b/Documentation/technical/api-sigchain.txt
new file mode 100644
index 0000000..535cdff
--- /dev/null
+++ b/Documentation/technical/api-sigchain.txt
@@ -0,0 +1,41 @@
+sigchain API
+============
+
+Code often wants to set a signal handler to clean up temporary files or
+other work-in-progress when we die unexpectedly. For multiple pieces of
+code to do this without conflicting, each piece of code must remember
+the old value of the handler and restore it either when:
+
+  1. The work-in-progress is finished, and the handler is no longer
+     necessary. The handler should revert to the original behavior
+     (either another handler, SIG_DFL, or SIG_IGN).
+
+  2. The signal is received. We should then do our cleanup, then chain
+     to the next handler (or die if it is SIG_DFL).
+
+Sigchain is a tiny library for keeping a stack of handlers. Your handler
+and installation code should look something like:
+
+------------------------------------------
+  void clean_foo_on_signal(int sig)
+  {
+	  clean_foo();
+	  sigchain_pop(sig);
+	  raise(sig);
+  }
+
+  void other_func()
+  {
+	  sigchain_push_common(clean_foo_on_signal);
+	  mess_up_foo();
+	  clean_foo();
+  }
+------------------------------------------
+
+Handlers are given the typdef of sigchain_fun. This is the same type
+that is given to signal() or sigaction(). It is perfectly reasonable to
+push SIG_DFL or SIG_IGN onto the stack.
+
+You can sigchain_push and sigchain_pop individual signals. For
+convenience, sigchain_push_common will push the handler onto the stack
+for many common signals.
-- 
1.7.3.2.362.g0e229.dirty

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-11 19:05           ` Jeff King
  2010-11-12  2:16             ` Jonathan Nieder
@ 2010-11-12  4:32             ` Jonathan Nieder
  2010-11-12  4:41               ` Jeff King
  2010-11-12  5:12               ` [RFC/PATCH] daemon, tag, verify-tag: do not pass ignored signals to child (Re: Scripted clone generating an incomplete, unusable .git/config) Jonathan Nieder
  1 sibling, 2 replies; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-12  4:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

Jeff King wrote:

> I don't think it makes sense for git-clone to do this itself. If we are
> going to say "SIGPIPE should default to SIG_DFL on startup" then we
> should do it as the very first thing in the git wrapper, not just for
> git-clone. That gives each git program a known starting point of
> behavior.

Here's what that might look like.

-- 8< --
Subject: SIGPIPE and other fatal signals should default to SIG_DFL

The intuitive behavior when a git command receives a fatal
signal is for it to die.

Indeed, that is the default handling.  However, POSIX semantics
allow the parent of a process to override that default by
ignoring a signal, since ignored signals are preserved by fork() and
exec().  For example, Python 2.6 and 2.7's os.popen() runs a shell
with SIGPIPE ignored (Python issue 1736483).

Worst of all, in such a situation, many git commands use
sigchain_push_common() to run cleanup actions (removing temporary
files, discarding locks, that kind of thing) before passing control to
the original handler; if that handler ignores the signal rather than
exiting, then execution will continue in an unplanned-for and unusual
state.

So give the signals in question default handling at the start of
main(), before sigchain_push_common can be called.

NEEDSWORK:

 - imposes an obnoxious maintenance burden.  New non-builtins
   that might call sigchain_push_common() would need to have
   default handling restored at the start of main().

 - needs tests, especially for interaction with "git daemon"
   client shutdown

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Currently git daemon uses SIG_IGN state on SIGTERM to protect
children with active connections.  Why isn't that causing the same
sort of problems as os.popen() causes?

 check-racy.c   |    1 +
 daemon.c       |    1 +
 fast-import.c  |    1 +
 git.c          |    2 ++
 http-backend.c |    1 +
 http-fetch.c   |    1 +
 http-push.c    |    1 +
 imap-send.c    |    1 +
 remote-curl.c  |    1 +
 shell.c        |    2 ++
 show-index.c   |    2 ++
 upload-pack.c  |    1 +
 12 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/check-racy.c b/check-racy.c
index 00d92a1..f1ad9d5 100644
--- a/check-racy.c
+++ b/check-racy.c
@@ -6,6 +6,7 @@ int main(int ac, char **av)
 	int dirty, clean, racy;
 
 	dirty = clean = racy = 0;
+	sigchain_push_common(SIG_DFL);
 	read_cache();
 	for (i = 0; i < active_nr; i++) {
 		struct cache_entry *ce = active_cache[i];
diff --git a/daemon.c b/daemon.c
index 7ccd097..7719f33 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1001,6 +1001,7 @@ int main(int argc, char **argv)
 	gid_t gid = 0;
 	int i;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 
 	for (i = 1; i < argc; i++) {
diff --git a/fast-import.c b/fast-import.c
index dc48245..ce28759 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3027,6 +3027,7 @@ int main(int argc, const char **argv)
 {
 	unsigned int i;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
diff --git a/git.c b/git.c
index e08d0f1..a8677fb 100644
--- a/git.c
+++ b/git.c
@@ -502,6 +502,8 @@ int main(int argc, const char **argv)
 	if (!cmd)
 		cmd = "git-help";
 
+	sigchain_push_common(SIG_DFL);
+
 	/*
 	 * "git-xxxx" is the same as "git xxxx", but we obviously:
 	 *
diff --git a/http-backend.c b/http-backend.c
index 14c90c2..b03455a 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -550,6 +550,7 @@ int main(int argc, char **argv)
 	char *cmd_arg = NULL;
 	int i;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 	set_die_routine(die_webcgi);
 
diff --git a/http-fetch.c b/http-fetch.c
index 762c750..aca4e8f 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -24,6 +24,7 @@ int main(int argc, const char **argv)
 	int get_verbosely = 0;
 	int get_recover = 0;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 
 	while (arg < argc && argv[arg][0] == '-') {
diff --git a/http-push.c b/http-push.c
index c9bcd11..a9d0894 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1791,6 +1791,7 @@ int main(int argc, char **argv)
 	struct remote *remote;
 	char *rewritten_url = NULL;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 
 	repo = xcalloc(sizeof(*repo), 1);
diff --git a/imap-send.c b/imap-send.c
index 71506a8..eb13adc 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1538,6 +1538,7 @@ int main(int argc, char **argv)
 	int nongit_ok;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 
 	if (argc != 1)
 		usage(imap_send_usage);
diff --git a/remote-curl.c b/remote-curl.c
index 04d4813..804492d 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -787,6 +787,7 @@ int main(int argc, const char **argv)
 	struct strbuf buf = STRBUF_INIT;
 	int nongit;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 	setup_git_directory_gently(&nongit);
 	if (argc < 2) {
diff --git a/shell.c b/shell.c
index dea4cfd..9798b99 100644
--- a/shell.c
+++ b/shell.c
@@ -137,6 +137,8 @@ int main(int argc, char **argv)
 	int devnull_fd;
 	int count;
 
+	sigchain_push_common(SIG_DFL);
+
 	/*
 	 * Always open file descriptors 0/1/2 to avoid clobbering files
 	 * in die().  It also avoids not messing up when the pipes are
diff --git a/show-index.c b/show-index.c
index 4c0ac13..7d48c88 100644
--- a/show-index.c
+++ b/show-index.c
@@ -11,6 +11,8 @@ int main(int argc, char **argv)
 	unsigned int version;
 	static unsigned int top_index[256];
 
+	sigchain_push_common(SIG_DFL);
+
 	if (argc != 1)
 		usage(show_index_usage);
 	if (fread(top_index, 2 * 4, 1, stdin) != 1)
diff --git a/upload-pack.c b/upload-pack.c
index f05e422..4c764f7 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -682,6 +682,7 @@ int main(int argc, char **argv)
 	int i;
 	int strict = 0;
 
+	sigchain_push_common(SIG_DFL);
 	git_extract_argv0_path(argv[0]);
 	read_replace_refs = 0;
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-12  4:24               ` Jeff King
@ 2010-11-12  4:35                 ` Jonathan Nieder
  0 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-12  4:35 UTC (permalink / raw)
  To: Jeff King
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

Jeff King wrote:

> Subject: [PATCH] document sigchain api
> 
> It's pretty straightforward, but a stripped-down example
> never hurts. And we should make clear that it is explicitly
> OK to use SIG_DFL and SIG_IGN.

Nice, thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-12  4:32             ` Jonathan Nieder
@ 2010-11-12  4:41               ` Jeff King
  2010-11-12  5:18                 ` Jonathan Nieder
  2010-11-12  5:12               ` [RFC/PATCH] daemon, tag, verify-tag: do not pass ignored signals to child (Re: Scripted clone generating an incomplete, unusable .git/config) Jonathan Nieder
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2010-11-12  4:41 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

On Thu, Nov 11, 2010 at 10:32:30PM -0600, Jonathan Nieder wrote:

> Subject: SIGPIPE and other fatal signals should default to SIG_DFL
> 
> The intuitive behavior when a git command receives a fatal
> signal is for it to die.
> 
> Indeed, that is the default handling.  However, POSIX semantics
> allow the parent of a process to override that default by
> ignoring a signal, since ignored signals are preserved by fork() and
> exec().  For example, Python 2.6 and 2.7's os.popen() runs a shell
> with SIGPIPE ignored (Python issue 1736483).
>
> [...]
>
>  check-racy.c   |    1 +
>  daemon.c       |    1 +
>  fast-import.c  |    1 +
>  git.c          |    2 ++
>  http-backend.c |    1 +
>  http-fetch.c   |    1 +
>  http-push.c    |    1 +
>  imap-send.c    |    1 +
>  remote-curl.c  |    1 +
>  shell.c        |    2 ++
>  show-index.c   |    2 ++
>  upload-pack.c  |    1 +
>  12 files changed, 15 insertions(+), 0 deletions(-)

Do we need to have it in every command? Is calling git-foo deprecated
enough that we can just put it in git.c?

I guess there are still a few commands that get installed explicitly in
.../bin (upload-pack, for example). They would need a separate one.
Perhaps it doesn't hurt to just put it in all of the non-builtins as you
did. It's not that big a maintenance issue.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC/PATCH] daemon, tag, verify-tag: do not pass ignored signals to child (Re: Scripted clone generating an incomplete, unusable .git/config)
  2010-11-12  4:32             ` Jonathan Nieder
  2010-11-12  4:41               ` Jeff King
@ 2010-11-12  5:12               ` Jonathan Nieder
  1 sibling, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-12  5:12 UTC (permalink / raw)
  To: Jeff King
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

Jonathan Nieder wrote:

> Currently git daemon uses SIG_IGN state on SIGTERM to protect
> children with active connections.  Why isn't that causing the same
> sort of problems as os.popen() causes?

It's late so please do not trust me, but I think the following would
fix that.

-- 8< --
Subject: daemon, tag, verify-tag: do not pass ignored signals to child

It is bad practice to have signals ignored or blocked while creating a
child process, since to do so triggers not-so-well-tested code paths
in many programs.

tag and verify-tag block SIGPIPE to avoid termination from writing
after gpg fails and closes its pipe early.  Ignoring SIGPIPE in the
child is an unintended side-effect; avoid it by narrowing the scope
of the request to ignore SIGPIPE to encompass only the write() (and
in particular, not the fork()).

Connection handling threads in daemon block SIGTERM to avoid
termination of active connections when the number of connections gets
too high.  Use a signal handling function instead of SIG_IGN to
avoid passing the ignored signal to the child.  Ignoring SIGTERM in
the request-handling child is not necessary, since kill_some_child()
never tries to kill those.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Needs tests.

 builtin/tag.c        |   11 +++++++----
 builtin/verify-tag.c |   10 +++++++---
 daemon.c             |    6 +++++-
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index d311491..efc9b93 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -173,10 +173,6 @@ static int do_sign(struct strbuf *buffer)
 			bracket[1] = '\0';
 	}
 
-	/* When the username signingkey is bad, program could be terminated
-	 * because gpg exits without reading and then write gets SIGPIPE. */
-	signal(SIGPIPE, SIG_IGN);
-
 	memset(&gpg, 0, sizeof(gpg));
 	gpg.argv = args;
 	gpg.in = -1;
@@ -189,9 +185,14 @@ static int do_sign(struct strbuf *buffer)
 	if (start_command(&gpg))
 		return error("could not run gpg.");
 
+	/* When the username signingkey is bad, program could be terminated
+	 * because gpg exits without reading and then write gets SIGPIPE. */
+	sigchain_push(SIGPIPE, SIG_IGN);
+
 	if (write_in_full(gpg.in, buffer->buf, buffer->len) != buffer->len) {
 		close(gpg.in);
 		close(gpg.out);
+		sigchain_pop(SIGPIPE);
 		finish_command(&gpg);
 		return error("gpg did not accept the tag data");
 	}
@@ -199,6 +200,8 @@ static int do_sign(struct strbuf *buffer)
 	len = strbuf_read(buffer, gpg.out, 1024);
 	close(gpg.out);
 
+	sigchain_pop(SIGPIPE);
+
 	if (finish_command(&gpg) || !len || len < 0)
 		return error("gpg failed to sign the tag");
 
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index 9f482c2..5361017 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -54,8 +54,15 @@ static int run_gpg_verify(const char *buf, unsigned long size, int verbose)
 		return error("could not run gpg.");
 	}
 
+	/* sometimes the program was terminated because this signal
+	 * was received in the process of writing the gpg input: */
+	sigchain_push(SIGPIPE, ignore_signal);
+
 	write_in_full(gpg.in, buf, len);
 	close(gpg.in);
+
+	sigchain_pop(SIGPIPE);
+
 	ret = finish_command(&gpg);
 
 	unlink_or_warn(path);
@@ -104,9 +111,6 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
 	if (argc <= i)
 		usage_with_options(verify_tag_usage, verify_tag_options);
 
-	/* sometimes the program was terminated because this signal
-	 * was received in the process of writing the gpg input: */
-	signal(SIGPIPE, SIG_IGN);
 	while (i < argc)
 		if (verify_tag(argv[i++], verbose))
 			had_error = 1;
diff --git a/daemon.c b/daemon.c
index 7719f33..ccc560b 100644
--- a/daemon.c
+++ b/daemon.c
@@ -243,6 +243,10 @@ static int git_daemon_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+static void ignore_termination_signal(int sig)
+{
+}
+
 static int run_service(char *dir, struct daemon_service *service)
 {
 	const char *path;
@@ -294,7 +298,7 @@ static int run_service(char *dir, struct daemon_service *service)
 	 * We'll ignore SIGTERM from now on, we have a
 	 * good client.
 	 */
-	signal(SIGTERM, SIG_IGN);
+	signal(SIGTERM, ignore_termination_signal);
 
 	return service->fn();
 }
-- 
1.7.2.3.557.gab647.dirty

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Scripted clone generating an incomplete, unusable .git/config
  2010-11-12  4:41               ` Jeff King
@ 2010-11-12  5:18                 ` Jonathan Nieder
  0 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2010-11-12  5:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Dun Peal, Git ML,
	Stefan Naewe, Carl Worth

Jeff King wrote:
> On Thu, Nov 11, 2010 at 10:32:30PM -0600, Jonathan Nieder wrote:

>> Subject: SIGPIPE and other fatal signals should default to SIG_DFL
>> 
>> The intuitive behavior when a git command receives a fatal
>> signal is for it to die.
[...]
> Do we need to have it in every command? Is calling git-foo deprecated
> enough that we can just put it in git.c?
> 
> I guess there are still a few commands that get installed explicitly in
> .../bin (upload-pack, for example). They would need a separate one.
> Perhaps it doesn't hurt to just put it in all of the non-builtins as you
> did. It's not that big a maintenance issue.

Okay.  Here's a hunk I forgot to add.  The next challenge is how to
test this mess. :)

diff --git a/cache.h b/cache.h
index d85ce86..8088e26 100644
--- a/cache.h
+++ b/cache.h
@@ -5,6 +5,7 @@
 #include "strbuf.h"
 #include "hash.h"
 #include "advice.h"
+#include "sigchain.h"
 
 #include SHA1_HEADER
 #ifndef git_SHA_CTX

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-11-12  5:19 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-10 23:21 Scripted clone generating an incomplete, unusable .git/config Dun Peal
2010-11-11  7:55 ` Stefan Naewe
2010-11-11  8:00   ` Stefan Naewe
2010-11-11 10:37 ` Jonathan Nieder
2010-11-11 12:16   ` Nguyen Thai Ngoc Duy
2010-11-11 17:32     ` Jonathan Nieder
2010-11-11 17:55       ` Daniel Barkalow
2010-11-11 18:48         ` Jonathan Nieder
2010-11-11 19:05           ` Jeff King
2010-11-12  2:16             ` Jonathan Nieder
2010-11-12  4:24               ` Jeff King
2010-11-12  4:35                 ` Jonathan Nieder
2010-11-12  4:32             ` Jonathan Nieder
2010-11-12  4:41               ` Jeff King
2010-11-12  5:18                 ` Jonathan Nieder
2010-11-12  5:12               ` [RFC/PATCH] daemon, tag, verify-tag: do not pass ignored signals to child (Re: Scripted clone generating an incomplete, unusable .git/config) Jonathan Nieder
2010-11-11 17:39   ` Scripted clone generating an incomplete, unusable .git/config Andreas Schwab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).