public inbox for linux-kernel@vger.kernel.org
* scalability of signal delivery for Posix Threads
@ 2004-11-22 15:51 Ray Bryant
  2004-11-22 16:07 ` Matthew Wilcox
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Ray Bryant @ 2004-11-22 15:51 UTC (permalink / raw)
  To: Kernel Mailing List, linux-ia64@vger.kernel.org, lse-tech
  Cc: holt, Dean Roe, Brian Sumner, John Hawkes

We've encountered a scalability problem with signal delivery.  Our application
is attempting to use ITIMER_PROF to deliver one signal per clock tick to each
thread of a pthreaded (NPTL) application.  These threads are created with
CLONE_SIGHAND, so that there is a single sighand->siglock for the entire
application.
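
For readers unfamiliar with the setup, here is a minimal user-space sketch of
arming a profiling timer of this kind.  It is not taken from the application
described above: the names count_prof_ticks and on_sigprof and the 10 ms
period are assumptions made for illustration, and the sketch runs in a single
thread only, so it does not model how ticks reach each thread.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Number of SIGPROF signals seen so far (hypothetical counter). */
static volatile sig_atomic_t prof_ticks;

static void on_sigprof(int signo)
{
    (void)signo;
    prof_ticks++;
}

/*
 * Arm ITIMER_PROF with a 10 ms period, burn CPU until `wanted` ticks
 * have been delivered, then disarm the timer.
 * Returns the number of ticks observed, or -1 on error.
 */
int count_prof_ticks(int wanted)
{
    struct sigaction sa;
    struct itimerval tv, off;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigprof;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    prof_ticks = 0;
    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = 10000;      /* assumed period: 10 ms */
    tv.it_value = tv.it_interval;
    if (setitimer(ITIMER_PROF, &tv, NULL) != 0)
        return -1;

    /* ITIMER_PROF only advances while the process consumes CPU time. */
    while (prof_ticks < wanted)
        ;

    /* Disarm the timer; the handler stays installed for any straggler. */
    memset(&off, 0, sizeof(off));
    setitimer(ITIMER_PROF, &off, NULL);
    return (int)prof_ticks;
}
```

Calling count_prof_ticks(3) spins until at least three SIGPROF signals have
been handled.  Every such delivery goes through the kernel signal path that
takes sighand->siglock.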

On our Altix systems, everything works fine until we increase the number of
threads (and processors, with one thread bound to each processor) beyond
about 112 threads or so.  At that point lock contention can become severe
enough to make the system unresponsive.  The reason is that each thread has
to obtain the (global to the application, in this case) lock sighand->siglock.

(Obviously, one solution is to recode the application to send fewer signals
per thread as the number of threads increases.  However, we are concerned by
the fact that a user application, of any kind, can be constructed in a way
that causes the system to become unresponsive, and we would like to find a
solution that would let us correctly execute the program as described.)

Perusing the kernel sources shows that this global lock is used, in many 
cases, to protect data that is purely local to the current thread.  (For 
example, see block_all_signals(), where current->sighand->siglock is obtained
and then the current task's signal mask is manipulated.)

This lock is also acquired in the following routines, where most of the time
is being spent in our application:

ia64_do_signal
set_signal_to_deliver
ia64_rt_sigreturn

In these cases, it appears that the global lock is being acquired to make sure
the signal handler definition doesn't change underneath us, as well as dealing
with the per thread signal data.

Since signals are sent much more often than sigaction() is called, it would
seem to make more sense to make sigaction() take a heavier weight lock of
some kind (to update the signal handler description) and to have the signal
delivery mechanism take a lighter weight lock.  Making
current->sighand->siglock a rwlock_t really doesn't improve the situation
much, since cache line contention is just as severe in that case (if not
worse) as it is with the current definition.
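
To illustrate the point about reader/writer locks, here is a toy lock built
on C11 atomics.  It is only a sketch and is not the kernel's rwlock_t; all
toy_* names are hypothetical.  What it shows is that even the read side has
to perform an atomic write to the shared lock word, so the cache line holding
that word still moves between CPUs on every acquisition.

```c
#include <stdatomic.h>

/* state >= 0: number of readers holding the lock; state == -1: one writer. */
struct toy_rwlock {
    atomic_int state;
};

/* Readers still WRITE to the shared word (compare-and-swap increment). */
int toy_read_trylock(struct toy_rwlock *l)
{
    int cur = atomic_load(&l->state);

    while (cur >= 0) {
        /* On failure, cur is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&l->state, &cur, cur + 1))
            return 1;
    }
    return 0;                       /* a writer holds the lock */
}

void toy_read_unlock(struct toy_rwlock *l)
{
    atomic_fetch_sub(&l->state, 1); /* another write to the shared word */
}

int toy_write_trylock(struct toy_rwlock *l)
{
    int expected = 0;               /* succeeds only with no readers/writer */

    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void toy_write_unlock(struct toy_rwlock *l)
{
    atomic_store(&l->state, 0);
}

/* Single-threaded walk through the lock states; returns 1 if all hold. */
int toy_rwlock_demo(void)
{
    struct toy_rwlock l;
    int ok = 1;

    atomic_init(&l.state, 0);
    ok &= toy_read_trylock(&l);              /* reader 1 */
    ok &= toy_read_trylock(&l);              /* reader 2 */
    ok &= (atomic_load(&l.state) == 2);      /* both changed the word */
    ok &= !toy_write_trylock(&l);            /* writer excluded */
    toy_read_unlock(&l);
    toy_read_unlock(&l);
    ok &= toy_write_trylock(&l);             /* writer gets in */
    ok &= !toy_read_trylock(&l);             /* readers excluded */
    toy_write_unlock(&l);
    return ok;
}
```

With one thread per CPU all taking the read side once per tick, every
toy_read_trylock() and toy_read_unlock() is a write to the same memory
location, which is the contention pattern described above.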

It seems to me that scalability would be improved if we moved the siglock from
the sighand structure to the task_struct.  (keep reading, please...)  Code 
that manipulates the current task signal data only would just obtain that 
lock.  Code that needs to change the sighand structure (e.g. sigaction())
would obtain the siglocks of all tasks using the same sighand structure.
A list of those task_structs would be added to the sighand structure to
enable finding these structures without having to take the tasklist_lock
and search for them.
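
The locking scheme proposed above can be modelled in user space roughly as
follows.  This is a sketch under stated assumptions, not kernel code: the
model_* names are hypothetical, pthread mutexes stand in for spinlocks, the
task list is a fixed array filled before use, and the demo is single-threaded.

```c
#include <pthread.h>
#include <stddef.h>

#define MODEL_NSIG      32
#define MODEL_MAX_TASKS 8
#define MODEL_SIG       27    /* arbitrary signal number for the model */

typedef void (*model_handler_t)(int);
struct model_sighand;

struct model_task {
    pthread_mutex_t siglock;          /* per-task lock (the proposal) */
    struct model_sighand *sighand;
    unsigned long blocked;            /* thread-local signal mask bits */
    unsigned long delivered;
};

struct model_sighand {
    model_handler_t action[MODEL_NSIG];
    struct model_task *tasks[MODEL_MAX_TASKS]; /* tasks sharing this sighand */
    size_t ntasks;
};

static int model_hits;
static void model_count_handler(int sig) { (void)sig; model_hits++; }

static void model_attach(struct model_sighand *sh, struct model_task *t)
{
    pthread_mutex_init(&t->siglock, NULL);
    t->sighand = sh;
    t->blocked = 0;
    t->delivered = 0;
    sh->tasks[sh->ntasks++] = t;
}

/* Thread-local change: only the task's own lock is taken. */
void model_block(struct model_task *t, int sig)
{
    pthread_mutex_lock(&t->siglock);
    t->blocked |= 1UL << sig;
    pthread_mutex_unlock(&t->siglock);
}

/* Delivery fast path: only the task's own lock is taken. */
int model_deliver(struct model_task *t, int sig)
{
    model_handler_t h = NULL;

    pthread_mutex_lock(&t->siglock);
    if (!(t->blocked & (1UL << sig)))
        h = t->sighand->action[sig];  /* stable: writers hold this lock too */
    if (h)
        t->delivered++;
    pthread_mutex_unlock(&t->siglock);

    if (h)
        h(sig);
    return h != NULL;
}

/* Slow path: take every task's lock, always in list order. */
void model_sigaction(struct model_sighand *sh, int sig, model_handler_t h)
{
    size_t i;

    for (i = 0; i < sh->ntasks; i++)
        pthread_mutex_lock(&sh->tasks[i]->siglock);
    sh->action[sig] = h;
    for (i = sh->ntasks; i-- > 0; )
        pthread_mutex_unlock(&sh->tasks[i]->siglock);
}

/* Four tasks share one sighand; task 0 blocks the signal. */
int model_demo(void)
{
    struct model_sighand sh = { 0 };
    struct model_task tasks[4];
    int i, delivered = 0;

    model_hits = 0;
    for (i = 0; i < 4; i++)
        model_attach(&sh, &tasks[i]);
    model_sigaction(&sh, MODEL_SIG, model_count_handler);
    model_block(&tasks[0], MODEL_SIG);
    for (i = 0; i < 4; i++)
        delivered += model_deliver(&tasks[i], MODEL_SIG);
    for (i = 0; i < 4; i++)
        pthread_mutex_destroy(&tasks[i].siglock);
    return delivered == model_hits ? delivered : -1;
}
```

The model makes the trade-off visible: model_deliver() touches one lock that
no other task needs on its fast path, while model_sigaction() costs one lock
acquisition per task in the group.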

Obviously, this change ricochets throughout the entire signal handling code.
It also means that sigaction() can become quite expensive for a many threaded
POSIX application, but my guess is that this doesn't happen very often.
The change could also make do_exit(), thread group shutdown, etc slower and
perhaps somewhat more complex.

Anyway, we would be interested in the community's ideas about dealing with
this signal delivery scalability issue.  Comments on the solution above
or suggestions for alternative solutions are welcome.
-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------

* RE: [Lse-tech] scalability of signal delivery for Posix Threads
@ 2004-11-22 21:26 Boehm, Hans
  2004-11-22 21:34 ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Boehm, Hans @ 2004-11-22 21:26 UTC (permalink / raw)
  To: Ray Bryant, Andi Kleen
  Cc: Andreas Schwab, Kernel Mailing List, linux-ia64, lse-tech, holt,
	Dean Roe, Brian Sumner, John Hawkes

Although I don't fully understand all the issues here,
I'm concerned about this proposal.  In particular, our
garbage collector (used by gcj, and Mono, among others)
uses signals to stop threads for each garbage collection.
With a small heap, and many threads, I would expect the
frequency of signal delivery to be similar to what you
get with performance tools.  But it does not, and should not,
use SIGPROF.

I think this is a more general issue.  Special casing one
piece of it is only going to make performance more surprising,
something I think should be avoided if at all possible.

Hans

> -----Original Message-----
> From: linux-ia64-owner@vger.kernel.org
> [mailto:linux-ia64-owner@vger.kernel.org]On Behalf Of Ray Bryant
> Sent: Monday, November 22, 2004 11:23 AM
> To: Andi Kleen
> Cc: Andreas Schwab; Kernel Mailing List; linux-ia64@vger.kernel.org;
> lse-tech; holt@sgi.com; Dean Roe; Brian Sumner; John Hawkes
> Subject: Re: [Lse-tech] scalability of signal delivery for 
> Posix Threads
> 
> 
> OK, apparently SIGPROF is delivered in both the ITIMER_PROF and
> pmu interrupt cases, so if we special case that signal we should
> be fine.

* RE: [Lse-tech] scalability of signal delivery for Posix Threads
@ 2004-11-22 23:01 Boehm, Hans
  0 siblings, 0 replies; 19+ messages in thread
From: Boehm, Hans @ 2004-11-22 23:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ray Bryant, Andreas Schwab, Kernel Mailing List, linux-ia64,
	lse-tech, holt, Dean Roe, Brian Sumner, John Hawkes

Just to clarify:

I have no problem with special-casing signals sent to a specific
thread.  Our garbage collector uses pthread_kill, and thus should
also benefit from that change.  And it makes sense to me that this
kind of signal should be cheaper to deliver.
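
As a concrete picture of what "sent to a specific thread" means, here is a
small sketch using pthread_kill.  It is not the collector's actual code: the
names directed_signal_demo, worker and on_sigusr1 are made up for the
example, and SIGUSR1 is chosen arbitrarily.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handled;
static pthread_t handler_thread;     /* which thread ran the handler */

static void on_sigusr1(int signo)
{
    (void)signo;
    handler_thread = pthread_self(); /* read back only after pthread_join */
    handled = 1;
}

static void *worker(void *arg)
{
    sigset_t wait_mask;

    (void)arg;
    /* Fetch the inherited mask (SIGUSR1 blocked) and unblock it only
     * inside sigsuspend(), so the check of `handled` cannot race. */
    pthread_sigmask(SIG_BLOCK, NULL, &wait_mask);
    sigdelset(&wait_mask, SIGUSR1);
    while (!handled)
        sigsuspend(&wait_mask);
    return NULL;
}

/* Returns 1 if the handler ran in the targeted thread, 0 if not,
 * -1 on setup error. */
int directed_signal_demo(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    pthread_t tid;
    int ok;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
        return -1;

    /* Block SIGUSR1 here so the new thread starts with it blocked. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, &old_mask);

    handled = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
        sigaction(SIGUSR1, &old_sa, NULL);
        return -1;
    }

    pthread_kill(tid, SIGUSR1);      /* directed at one thread only */
    pthread_join(tid, NULL);

    ok = handled && pthread_equal(handler_thread, tid);

    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return ok ? 1 : 0;
}
```

Because the target is named explicitly, nothing has to be chosen among the
threads of the group at delivery time, which is why this class of signal
looks like a natural candidate for a cheaper path.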

SIGSEGV delivery also matters to me.  But that should presumably
also fall into the same class.

I would prefer to avoid special handling for just SIGPROF.
If that was never proposed, please ignore my comments.

Hans

> -----Original Message-----
> From: Andi Kleen [mailto:ak@suse.de]
> Sent: Monday, November 22, 2004 1:35 PM
> To: Boehm, Hans
> Cc: Ray Bryant; Andi Kleen; Andreas Schwab; Kernel Mailing List;
> linux-ia64@vger.kernel.org; lse-tech; holt@sgi.com; Dean Roe; Brian
> Sumner; John Hawkes
> Subject: Re: [Lse-tech] scalability of signal delivery for 
> Posix Threads
> 
> 
> > I think this is a more general issue.  Special casing one
> 
> It just cannot be done in the general case without slowing
> down sigaction significantly. Or maybe it can, but nobody
> has proposed a way to do it so far. 
> 
> It's difficult to design for machines where a simple spinlock
> doesn't work properly anymore.
> 
> > piece of it is only going to make performance more surprising,
> > something I think should be avoided if at all possible.
> 
> The special case in particular would be signals directed to a
> specific TID, compared to signals load balanced over the thread group,
> which need shared writable state. To simplify the fast path you could
> also make more simplifications: no queueing (otherwise you would need
> to duplicate a lot of state to handle that into the task_struct) and
> probably no SIGCHLD, which is also full of special cases.
> 
> -Andi
> 


end of thread, other threads:[~2004-12-01 22:54 UTC | newest]

Thread overview: 19+ messages
2004-11-22 15:51 scalability of signal delivery for Posix Threads Ray Bryant
2004-11-22 16:07 ` Matthew Wilcox
2004-11-22 19:49   ` [Lse-tech] " Ray Bryant
2004-11-22 19:53     ` Andi Kleen
2004-11-22 16:22 ` [Lse-tech] " Andi Kleen
2004-11-22 16:51   ` Andreas Schwab
2004-11-22 16:54     ` Andi Kleen
2004-11-22 18:56       ` Ray Bryant
2004-11-22 19:22       ` Ray Bryant
2004-11-22 17:23   ` Philip J. Mucci
2004-11-22 17:19 ` Robin Holt
2004-11-22 19:25   ` Ray Bryant
2004-11-23 20:42   ` Ray Bryant
2004-11-22 21:27 ` [Lse-tech] " Rick Lindsley
2004-11-22 23:39   ` Ray Bryant
  -- strict thread matches above, loose matches on Subject: below --
2004-11-22 21:26 Boehm, Hans
2004-11-22 21:34 ` Andi Kleen
2004-12-01 22:53   ` Brent Casavant
2004-11-22 23:01 Boehm, Hans
