public inbox for linux-kernel@vger.kernel.org
From: Ray Bryant <raybry@sgi.com>
To: Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	lse-tech <lse-tech@lists.sourceforge.net>
Cc: holt@sgi.com, Dean Roe <roe@sgi.com>, Brian Sumner <bls@sgi.com>,
	John Hawkes <hawkes@tomahawk.engr.sgi.com>
Subject: scalability of signal delivery for Posix Threads
Date: Mon, 22 Nov 2004 09:51:15 -0600
Message-ID: <41A20AF3.9030408@sgi.com>

We've encountered a scalability problem with signal delivery.  Our application
is attempting to use ITIMER_PROF to deliver one signal per clock tick to each
thread of a pthreaded (NPTL) program.  These threads are created with
CLONE_SIGHAND, so that there is a single sighand->siglock for the entire
application.
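
For readers unfamiliar with this kind of setup, a minimal user-space sketch of
arming ITIMER_PROF and counting SIGPROF deliveries might look like the
following.  All function and variable names here are hypothetical, and the
sketch covers a single thread only; how the application in question arms the
timer for each of its threads is not shown in this post.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for sigaction() and setitimer() */
#endif
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical counter, incremented once per delivered SIGPROF. */
static volatile sig_atomic_t prof_ticks;

static void on_sigprof(int signo)
{
	(void)signo;
	prof_ticks++;
}

/* Install the SIGPROF handler and arm a periodic profiling timer.
 * interval_usec must be > 0.  Returns 0 on success, -1 on failure. */
int start_prof_timer(long interval_usec)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigprof;
	sa.sa_flags = SA_RESTART;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGPROF, &sa, NULL) != 0)
		return -1;

	memset(&it, 0, sizeof(it));
	it.it_interval.tv_sec = interval_usec / 1000000;
	it.it_interval.tv_usec = interval_usec % 1000000;
	it.it_value = it.it_interval;
	return setitimer(ITIMER_PROF, &it, NULL);
}

/* Disarm the timer (an all-zero itimerval stops it). */
int stop_prof_timer(void)
{
	struct itimerval it;

	memset(&it, 0, sizeof(it));
	return setitimer(ITIMER_PROF, &it, NULL);
}

/* Burn CPU until at least `want` ticks have arrived (ITIMER_PROF only
 * advances while the process is running), with a bound on the spin so
 * the loop cannot run forever.  Returns the tick count observed. */
int spin_until_ticks(int want)
{
	volatile unsigned long spin = 0;

	while (prof_ticks < want && spin < 4000000000UL)
		spin++;
	return prof_ticks;
}
```

Every delivery of this signal goes through the kernel's signal delivery and
sigreturn paths, which is where the lock discussed below is taken.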

On our Altix systems, everything works fine until we increase the number of
threads (and processors, with one thread bound to each processor) beyond
about 112 threads or so.  At that point lock contention can become severe
enough to make the system unresponsive.  The reason is that each thread has
to obtain the (global to the application, in this case) lock sighand->siglock.

(Obviously, one solution is to recode the application to send fewer signals
per thread as the number of threads increases.  However, we are concerned by
the fact that a user application, of any kind, can be constructed in a way
that causes the system to become unresponsive, and we would like to find a
solution that would let us correctly execute the program as described.)

Perusing the kernel sources shows that this global lock is used, in many 
cases, to protect data that is purely local to the current thread.  (For 
example, see block_all_signals(), where current->sighand->siglock is obtained
and then the current task's signal mask is manipulated.)

This lock is also acquired in the following routines, where most of the time
is being spent in our application:

ia64_do_signal
set_signal_to_deliver
ia64_rt_sigreturn

In these cases, it appears that the global lock is being acquired to make sure
the signal handler definition doesn't change underneath us, as well as to
protect the per-thread signal data.

Since signals are sent much more often than sigaction() is called, it would
seem to make more sense to make sigaction() take a heavier weight lock of
some kind (to update the signal handler description) and to have the signal
delivery mechanism take a lighter weight lock.  Making
current->sighand->siglock a rwlock_t really doesn't improve the situation
much, since cache line contention is just as severe in that case (if not
worse) as it is with the current definition.
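
To illustrate why a reader/writer lock does not remove the cache line
contention, here is a toy reader-count lock written with C11 atomics.  This is
an assumption-laden sketch, not the kernel's rwlock_t implementation; the
names toy_rwlock, toy_read_lock, etc. are hypothetical.  The point is only
that even the read side performs an atomic read-modify-write on one shared
word, so every reader still has to pull that cache line in exclusive state.

```c
#include <stdatomic.h>

/* state > 0: number of readers; state == -1: a writer holds the lock. */
struct toy_rwlock {
	atomic_int state;
};

void toy_read_lock(struct toy_rwlock *l)
{
	for (;;) {
		int s = atomic_load_explicit(&l->state, memory_order_relaxed);

		/* The compare-and-swap below writes the shared word, so
		 * concurrent readers on different CPUs still bounce the
		 * cache line between them. */
		if (s >= 0 &&
		    atomic_compare_exchange_weak_explicit(&l->state, &s, s + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return;
	}
}

void toy_read_unlock(struct toy_rwlock *l)
{
	/* Another write to the same shared word. */
	atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
}

/* Returns 1 if the write lock was taken, 0 if readers or a writer hold it. */
int toy_write_trylock(struct toy_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->state, &expected, -1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

void toy_write_unlock(struct toy_rwlock *l)
{
	atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

With one thread per CPU all taking the read side on every tick, the shared
word is written twice per signal per thread, much as with a plain spinlock.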

It seems to me that scalability would be improved if we moved the siglock from
the sighand structure to the task_struct.  (keep reading, please...)  Code
that manipulates only the current task's signal data would just obtain that
lock.  Code that needs to change the sighand structure (e.g. sigaction())
would obtain the siglocks of all tasks using the same sighand structure.  A
list of those task_structs would be added to the sighand structure to enable
finding these structures without having to take the tasklist_lock and search
for them.

Obviously, this change ricochets throughout the entire signal handling code.
It also means that sigaction() can become quite expensive for a many-threaded
POSIX application, but my guess is that this doesn't happen very often.
The change could also make do_exit(), thread group shutdown, etc. slower and
perhaps somewhat more complex.

Anyway, we would be interested in the community's ideas about dealing with
this signal delivery scalability issue.  Comments on the solution above or
suggestions for alternative solutions are welcome.
-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------
