[PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
@ 2006-10-16  3:04 Gareth Knight
  2006-10-16  3:37 ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Gareth Knight @ 2006-10-16  3:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: torvalds

I looked in MAINTAINERS for a suitable person for the generic signal  
code, but couldn't find anyone in particular.  Please Cc me on  
comments, which are most welcome, as I am not on LKML, although I do  
peruse the archives.

Gareth.

Userspace signal mask
==================

Inspiration and rationale:

This is a proposed addition to the linux kernel to reduce the  
overhead required to mask signals.  The intended usage is an  
application with critical sections that need to be guarded against  
deadlock by preventing signals from being delivered whilst inside one  
of the critical sections.  Currently such applications may be very  
heavy users of the sigprocmask syscall, this proposal provides an  
additional signal mask stored in userspace that can be updated with a  
simple store rather than a syscall.

The userspace sigprocmask is activated via a new subcommand of the  
sigprocmask or rt_sigprocmask syscall.  The first argument is  
currently an integer indicating which operation to perform.   
Currently the operations are SIG_BLOCK, SIG_UNBLOCK and SIG_SETMASK;  
the new operation I've called SIG_SETUSERMASKADDRESS.

Usage is:

sigprocmask( SIG_SETUSERMASKADDRESS, sigset_t *address, NULL);

The *address pointer points to wherever you've decided to keep the  
userspace sigmask in your thread. Any non-zero address enables the  
use of the userspace sigprocmask and conversely, setting the address  
to zero disables it.  On a kernel without support, this operation can  
be expected to return -EINVAL.

Behaviour:

Each thread has its own address, but there is no restriction on  
threads in a process pointing to the same address.

By default, a process does not have a userspace sigmask.  When a new  
thread is created or a process is forked, the new thread inherits any  
userspace sigmask address from its parent.

At any time the userspace sigmask is enabled, the effective sigmask  
is the union of the userspace sigmask and the traditional sigmask in  
the kernel for that thread.  The traditional kernel sigmask is set  
and controlled by the SIG_BLOCK etc operations as it was before.

As soon as the relevant bit is set in the userspace sigmask, that  
signal is blocked in the same way it would be as if you'd blocked it  
via a traditional sigprocmask call.  If a signal is both blocked and  
also pending for delivery, clearing the bit in the userspace sigmask  
does not guarantee immediate delivery as a SIG_UNBLOCK call would -  
instead delivery typically occurs at the next pre-emption point (any  
syscall for example).

If a bad address is passed to the kernel for the location of the  
userspace mask, it is discarded.

Patch:

This patch is against a vanilla 2.6.16 kernel from kernel.org.  The  
modifications are entirely architecture independent.  It has been  
tested on i386, x86_64, ia64 and ppc64 kernels:

--- pristine-linux-2.6.16/include/linux/sched.h	2006-03-19  
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/include/linux/sched.h	2006-10-09  
18:09:24.000000000 -0700
@@ -807,6 +807,7 @@
	struct sighand_struct *sighand;
	sigset_t blocked, real_blocked;
+  sigset_t __user *userspace_blocked; /* If non-zero, task has opted  
to maintain an additional blocked mask in userspace */
	sigset_t saved_sigmask;		/* To be restored with TIF_RESTORE_SIGMASK */
	struct sigpending pending;
--- pristine-linux-2.6.16/kernel/fork.c	2006-03-19 21:53:29.000000000  
-0800
+++ linux-2.6.16-userspacesigmask/kernel/fork.c	2006-10-05  
22:10:13.000000000 -0700
@@ -987,6 +987,7 @@
	spin_lock_init(&p->alloc_lock);
	spin_lock_init(&p->proc_lock);
+  p->userspace_blocked = current->userspace_blocked;
	clear_tsk_thread_flag(p, TIF_SIGPENDING);
	init_sigpending(&p->pending);
--- pristine-linux-2.6.16/kernel/signal.c	2006-03-19  
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/kernel/signal.c	2006-10-14  
21:08:38.000000000 -0700
@@ -24,6 +24,7 @@
#include <linux/ptrace.h>
#include <linux/posix-timers.h>
#include <linux/signal.h>
+#include <linux/mm.h>
#include <linux/audit.h>
#include <linux/capability.h>
#include <asm/param.h>
@@ -224,7 +225,30 @@
void recalc_sigpending(void)
{
-	recalc_sigpending_tsk(current);
+  sigset_t usermask, combinedblocked;
+
+  sigemptyset(&usermask);
+
+  if (current->userspace_blocked != 0)
+  {
+    /* Task is using the optional userspace signal mask, fetch it  
from the address given to us. */
+    /* If the address is bad, we don't mind and the mask remains  
empty */
+    if (copy_from_user(&usermask, current->userspace_blocked, sizeof 
(sigset_t)))
+    {
+      /* If we have a bad address, disable the userspace mask */
+      current->userspace_blocked = 0;
+    }
+  }
+
+  sigorsets(&combinedblocked,&usermask,&current->blocked);
+
+	if (current->signal->group_stop_count > 0 ||
+	    (freezing(current)) ||
+	    PENDING(&current->pending, &combinedblocked) ||
+	    PENDING(&current->signal->shared_pending, &combinedblocked))
+		set_thread_flag(TIF_SIGPENDING);
+	else
+		clear_thread_flag(TIF_SIGPENDING);
}
/* Given the mask, find the first available signal that should be  
serviced. */
@@ -914,6 +938,7 @@
	if (sigismember(&t->blocked, sig)) {
		sigdelset(&t->blocked, sig);
	}
+
	recalc_sigpending_tsk(t);
	ret = specific_send_sig_info(sig, info, t);
	spin_unlock_irqrestore(&t->sighand->siglock, flags);
@@ -1921,7 +1946,7 @@
{
	sigset_t *mask = &current->blocked;
	int signr = 0;
-
+
relock:
	spin_lock_irq(&current->sighand->siglock);
	for (;;) {
@@ -2136,6 +2161,15 @@
	if (set) {
		error = -EFAULT;
+
+    /* SIG_SETUSERMASK just needs to record the address of the extra  
mask in userspace and return */
+    if ( how == SIG_SETUSERMASKADDRESS )
+    {
+      current->userspace_blocked = set;
+      error = 0;
+      goto out;
+		}
+
		if (copy_from_user(&new_set, set, sizeof(*set)))
			goto out;
		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
@@ -2607,6 +2641,9 @@
			break;
		case SIG_SETMASK:
			current->blocked.sig[0] = new_set;
+      break;
+    case SIG_SETUSERMASKADDRESS:
+      current->userspace_blocked = (sigset_t *) set;
			break;
		}
--- pristine-linux-2.6.16/include/asm-generic/signal.h	2006-03-19  
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/include/asm-generic/signal.h	 
2006-10-09 18:13:36.000000000 -0700
@@ -7,6 +7,9 @@
#ifndef SIG_SETMASK
#define SIG_SETMASK        2	/* for setting the signal mask */
#endif
+#ifndef SIG_SETUSERMASKADDRESS
+#define SIG_SETUSERMASKADDRESS    8	/* for setting the optional  
userspace signal mask address */
+#endif
#ifndef __ASSEMBLY__
typedef void __signalfn_t(int);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
  2006-10-16  3:04 [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16 Gareth Knight
@ 2006-10-16  3:37 ` Linus Torvalds
  2006-10-16  3:50   ` Gareth Knight
  2006-10-16 13:28   ` Andi Kleen
  0 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2006-10-16  3:37 UTC (permalink / raw)
  To: Gareth Knight; +Cc: linux-kernel

On Sun, 15 Oct 2006, Gareth Knight wrote:
>
> I looked in MAINTAINERS for a suitable person for the generic signal code, but
> couldn't find anyone in particular.  Please Cc me on comments, which are most
> welcome, as I am not on LKML, although I do peruse the archives.

That's a truly horribly disfigured patch - your whitespace is all screwed 
up.

Anyway, the whole approach is not doable. At all.

Why? You're doing user-space accesses from within critical sections with a 
spinlock, and that's just a big no-no. Think page faults, swapping etc.

That's ignoring all the issues with the fact that doing the user accesses 
during recalc_sigpending is broken for other reasons, namely that we don't 
even _do_ the signal pending recalculation all the time, just when we 
"know" things may have changed. So your approach would miss updates to the 
user-space masks.

So the whole approach is flawed.

You _could_ try to make it do something special at signal delivery time, 
to see if delivery can be delayed at that point, but quite frankly, it's 
going to be nasty there too (and that's going to be a disaster for the 
whole issue of non-thread-specific signals, which have been steered to one 
thread, and then the new mask would say that they can't be accepted by 
that thread after all).

Quite frankly, you'd probably be better off trying to do totally different 
approaches. For example, it would be possible to block all signals 
entirely, and then just create a new system call that uses a _synchronous_ 
delivery method to avoid races with async delivery. Preferably a file 
descriptor, so that you can select/poll on it.

That was discussed at some point. See for example:

	http://groups.google.com/group/linux.kernel/browse_thread/thread/1332715ae3e26b9/1f3fc521db812a07?lnk=st&q=&rnum=1&hl=en#1f3fc521db812a07

which I found by just googling for "synchronous signal queue" with me as 
the author. That's from almost four years ago, and nobody ever got quite 
excited enough about it to actually take it any further, but I still think 
it's a lot better than the alternatives like yours..

			Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
  2006-10-16  3:37 ` Linus Torvalds
@ 2006-10-16  3:50   ` Gareth Knight
  2006-10-16 13:28   ` Andi Kleen
  1 sibling, 0 replies; 7+ messages in thread
From: Gareth Knight @ 2006-10-16  3:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Hey Linus,

Thanks for the lightning fast response !

I take your points - the reason I favoured trying userspace access  
approach was to keep the feature portable.  I've seen this feature in  
other *nix kernels but always done with the mask kept on some sort of  
syspage shared between user and kernel space, at a fixed address but  
contents local to each thread.  I would love to add such a feature to  
Linux - perhaps keep the tid, timeofday and other popular things on  
there as well; however I felt that sort of work was beyond my kernel  
hacking abilities right now.  I guess it could either be done as an  
outright special case or part of supporting a MAP_LOCAL style of mmap  
on Linux.

I did google for MAP_LOCAL and syspage, but I didn't see any  
promising avenues in terms of previous work I could pickup.

Gareth.

>
>
> On Sun, 15 Oct 2006, Gareth Knight wrote:
>>
>> I looked in MAINTAINERS for a suitable person for the generic  
>> signal code, but
>> couldn't find anyone in particular.  Please Cc me on comments,  
>> which are most
>> welcome, as I am not on LKML, although I do peruse the archives.
>
> That's a truly horribly disfigured patch - your whitespace is all  
> screwed
> up.
>
> Anyway, the whole approach is not doable. At all.
>
> Why? You're doing user-space accesses from within critical sections  
> with a
> spinlock, and that's just a big no-no. Think page faults, swapping  
> etc.
>
> That's ignoring all the issues with the fact that doing the user  
> accesses
> during recalc_sigpending is broken for other reasons, namely that  
> we don't
> even _do_ the signal pending recalculation all the time, just when we
> "know" things may have changed. So your approach would miss updates  
> to the
> user-space masks.
>
> So the whole approach is flawed.
>
> You _could_ try to make it do something special at signal delivery  
> time,
> to see if delivery can be delayed at that point, but quite frankly,  
> it's
> going to be nasty there too (and that's going to be a disaster for the
> whole issue of non-thread-specific signals, which have been steered  
> to one
> thread, and then the new mask would say that they can't be accepted by
> that thread after all).
>
> Quite frankly, you'd probably be better off trying to do totally  
> different
> approaches. For example, it would be possible to block all signals
> entirely, and then just create a new system call that uses a  
> _synchronous_
> delivery method to avoid races with async delivery. Preferably a file
> descriptor, so that you can select/poll on it.
>
> That was discussed at some point. See for example:
>
> 	http://groups.google.com/group/linux.kernel/browse_thread/thread/ 
> 1332715ae3e26b9/1f3fc521db812a07? 
> lnk=st&q=&rnum=1&hl=en#1f3fc521db812a07
>
> which I found by just googling for "synchronous signal queue" with  
> me as
> the author. That's from almost four years ago, and nobody ever got  
> quite
> excited enough about it to actually take it any further, but I  
> still think
> it's a lot better than the alternatives like yours..
>
> 			Linus


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
  2006-10-16  3:37 ` Linus Torvalds
  2006-10-16  3:50   ` Gareth Knight
@ 2006-10-16 13:28   ` Andi Kleen
  2006-10-16 23:15     ` Nicholas Miell
  1 sibling, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-10-16 13:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, gk

Linus Torvalds <torvalds@osdl.org> writes:
> 
> Why? You're doing user-space accesses from within critical sections with a 
> spinlock, and that's just a big no-no. Think page faults, swapping etc.

He could pin the page in memory like futexes do. One page pinned
per thread shouldn't be a big DOS issue either.

-Andi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
  2006-10-16 13:28   ` Andi Kleen
@ 2006-10-16 23:15     ` Nicholas Miell
  0 siblings, 0 replies; 7+ messages in thread
From: Nicholas Miell @ 2006-10-16 23:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, linux-kernel, gk

On Mon, 2006-10-16 at 15:28 +0200, Andi Kleen wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> > 
> > Why? You're doing user-space accesses from within critical sections with a 
> > spinlock, and that's just a big no-no. Think page faults, swapping etc.
> 
> He could pin the page in memory like futexes do. One page pinned
> per thread shouldn't be a big DOS issue either.
> 
> -Andi

Whatever the final solution ends up being (assuming this actually goes
somewhere), it should be extensible enough for other uses (for example,
some sort of "please don't preempt me right now" hint for use with
pthread spinlocks.)


-- 
Nicholas Miell <nmiell@comcast.net>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
@ 2006-10-17  9:24 linux
  2006-10-17 12:32 ` Gareth Knight
  0 siblings, 1 reply; 7+ messages in thread
From: linux @ 2006-10-17  9:24 UTC (permalink / raw)
  To: gk, linux-kernel; +Cc: ak, linux, torvalds

> This is a proposed addition to the linux kernel to reduce the  
> overhead required to mask signals.  The intended usage is an  
> application with critical sections that need to be guarded against  
> deadlock by preventing signals from being delivered whilst inside one  
> of the critical sections.  Currently such applications may be very  
> heavy users of the sigprocmask syscall, this proposal provides an  
> additional signal mask stored in userspace that can be updated with a  
> simple store rather than a syscall.

Wouldn't a simpler solution be to provide a way to cancel signal
delivery after it's hit user-space?

Then user space can keep its own block mask, which is maintained as
a superset of the kernel block mask.  Then a critical section would,
in the fast path, proceed like:

- Block signals -> Noted in user-space only
- Do critical section (part 1)
- Do critical section (part 2)
- Unblock signals -> Noted in user-space only
- More code

And if something bad happened

- Block signals -> Noted in user-space only
- Do critical section (part 1)
  - Signal arrives
  - Test against user-space mask
  - Tell kernel about all blocked signals -> Note new kernel mask
  - Return "please try again later" from signal handler
- Do critical section (part 2)
- Unblock signals -> Note that it's less than kernel mask
  - Tell kernel about newly unblocked signals
  - Signal arrives (again)
  - Test against user-space mask
  - Call registered signal handler
  - Return "signal handled"
- More code
   
This does do the whole signal delivery dance twice if it gets unlucky,
but keeps a conceptually simpler kernel interface.

The one thing that might be complicated is pthread signal delivery.
"Please try again later" could need to go back to the process layer and
immediately re-deliver it to a different thread.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16
  2006-10-17  9:24 linux
@ 2006-10-17 12:32 ` Gareth Knight
  0 siblings, 0 replies; 7+ messages in thread
From: Gareth Knight @ 2006-10-17 12:32 UTC (permalink / raw)
  To: linux; +Cc: linux-kernel, ak, torvalds

A way to cancel delivery / push things back onto the signal queue  
would indeed be a worthy addition.  The way I would see that working  
would be where you passed a special or new parameter to sigreturn()  
to indicate you wanted that signal requeued, presumably the  
implication would be that is was also added to the blocked mask at  
the same time.

The scheme for critical sections described below is a decent  
compromise, but it does require more application change than just  
speeding up sigprocmask.  My idea was that glibc could take advantage  
of the sigprocmask speedup when it was available and thus you  
automatically roll out the benefit to existing applications.

Linus pretty much mauled the suggested patch, but I don't think its  
quite as bad as it may first appear.  The suggestion from Andi to  
pin / lock the page (inspired from the futex code) I think solves a  
number of edge cases right off the bat and is straightforward to do.   
The concern over a copy_from_user whilst holding the &current- 
 >sighand->siglock I think is just a case of ensuring  
recalc_sigpending is not called in that situation - it doesn't seem  
necessary to hold the siglock to recalc the pending signals.

I also re-read the signal code and whilst it is quite true that  
recalc_sigpending is not called in every path, the omissions seem to  
me to be places that would be irrelevant anyway - in the case of  
forcing signal delivery for instance, where you wish to ignore any mask.

I do agree that in the case of unblocking a pending signal with my  
scheme, delivery is not instantaneous (typically the next syscall),  
but I would argue that this is only the case for signals which would  
not have been temporally deterministic anyway - some synchronous  
event such as a SIGBUS will cause pre-emption in a deterministic  
manner as always.

Suggestions and comments most welcome,

Gareth.

>> This is a proposed addition to the linux kernel to reduce the
>> overhead required to mask signals.  The intended usage is an
>> application with critical sections that need to be guarded against
>> deadlock by preventing signals from being delivered whilst inside one
>> of the critical sections.  Currently such applications may be very
>> heavy users of the sigprocmask syscall, this proposal provides an
>> additional signal mask stored in userspace that can be updated with a
>> simple store rather than a syscall.
>
> Wouldn't a simpler solution be to provide a way to cancel signal
> delivery after it's hit user-space?
>
> Then user space can keep its own block mask, which is maintained as
> a superset of the kernel block mask.  Then a critical section would,
> in the fast path, proceed like:
>
> - Block signals -> Noted in user-space only
> - Do critical section (part 1)
> - Do critical section (part 2)
> - Unblock signals -> Noted in user-space only
> - More code
>
> And if something bad happened
>
> - Block signals -> Noted in user-space only
> - Do critical section (part 1)
>   - Signal arrives
>   - Test against user-space mask
>   - Tell kernel about all blocked signals -> Note new kernel mask
>   - Return "please try again later" from signal handler
> - Do critical section (part 2)
> - Unblock signals -> Note that it's less than kernel mask
>   - Tell kernel about newly unblocked signals
>   - Signal arrives (again)
>   - Test against user-space mask
>   - Call registered signal handler
>   - Return "signal handled"
> - More code
>
> This does do the whole signal delivery dance twice if it gets unlucky,
> but keeps a conceptually simpler kernel interface.
>
> The one thing that might be complicated is pthread signal delivery.
> "Please try again later" could need to go back to the process layer  
> and
> immediately re-deliver it to a different thread.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-10-17 12:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-16  3:04 [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16 Gareth Knight
2006-10-16  3:37 ` Linus Torvalds
2006-10-16  3:50   ` Gareth Knight
2006-10-16 13:28   ` Andi Kleen
2006-10-16 23:15     ` Nicholas Miell
  -- strict thread matches above, loose matches on Subject: below --
2006-10-17  9:24 linux
2006-10-17 12:32 ` Gareth Knight

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox