From: Ravikiran G Thirumalai <kiran@scalex86.org>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Christoph Lameter <clameter@engr.sgi.com>,
Shai Fultheim <shai@scalex86.org>,
Nippun Goel <nippung@calsoftinc.com>,
linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess at getrusage()
Date: Wed, 22 Mar 2006 14:18:44 -0800 [thread overview]
Message-ID: <20060322221844.GA3300@localhost.localdomain> (raw)
In-Reply-To: <441EEEC8.D4D9C40A@tv-sign.ru>
On Mon, Mar 20, 2006 at 09:04:56PM +0300, Oleg Nesterov wrote:
> Hello Ravikiran,
Hi Oleg, sorry for the late response..
>
> Ravikiran G Thirumalai wrote:
> >
> > Following patch avoids taking the global tasklist_lock when possible,
> > if a process is single threaded during getrusage(). Any avoidance of
> > tasklist_lock is good for NUMA boxes (and possibly for large SMPs).
> >
> > ...
> >
> > static void k_getrusage(struct task_struct *p, int who, struct rusage *r)
> > @@ -1681,14 +1697,22 @@ static void k_getrusage(struct task_stru
> > struct task_struct *t;
> > unsigned long flags;
> > cputime_t utime, stime;
> > + int need_lock = 0;
> >
> > memset((char *) r, 0, sizeof *r);
> > -
> > - if (unlikely(!p->signal))
> > - return;
> > -
> > utime = stime = cputime_zero;
> >
> > + need_lock = (p != current || !thread_group_empty(p));
> > + if (need_lock) {
> > + read_lock(&tasklist_lock);
> > + if (unlikely(!p->signal)) {
> > + read_unlock(&tasklist_lock);
> > + return;
> > + }
> > + } else
> > + /* See locking comments above */
> > + smp_rmb();
> > +
>
> I think now it is possible to improve this patch.
>
> Could you look at these patches?
>
> [PATCH] introduce lock_task_sighand() helper
> http://marc.theaimsgroup.com/?l=linux-kernel&m=114028190927763
>
> [PATCH 0/3] make threads traversal ->siglock safe
> http://marc.theaimsgroup.com/?l=linux-kernel&m=114064825626496
>
> I think we can forget about tasklist_lock in k_getrusage() completely
> and just use lock_task_sighand().
>
> What do you think?
Great!! Nice patches to avoid tasklist lock on thread traversal.
However, I was trying to comprehend the tasklist locking changes in
2.6.16-rc6mm2 and hit upon:
__exit_signal
cleanup_sighand(tsk);
kmem_cache_free(sighand)
spin_unlock(sighand->lock)
It looked suspicious to me until I realised sighand cache now had
SLAB_DESTROY_BY_RCU. Can we please add comments (at cleanup_sighand or
__exit_signal) to make it a bit clearer for people like me :)
How is the following patch to avoid tasklist lock completely at getrusage?
(Andrew, I can remake the patch against a reverted
avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
if you prefer it that way)
Thanks,
Kiran
Change avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
patch to not take the global tasklist lock at all. We don't need to take
the tasklist lock for thread traversal of a process since Oleg's
do-__unhash_process-under-siglock.patch and related work.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Index: linux-2.6.16-rc6mm2/kernel/sys.c
===================================================================
--- linux-2.6.16-rc6mm2.orig/kernel/sys.c 2006-03-21 17:04:09.000000000 -0800
+++ linux-2.6.16-rc6mm2/kernel/sys.c 2006-03-22 12:46:03.000000000 -0800
@@ -1860,23 +1860,20 @@ out:
* fields when reaping, so a sample either gets all the additions of a
* given child after it's reaped, or none so this sample is before reaping.
*
- * tasklist_lock locking optimisation:
- * If we are current and single threaded, we do not need to take the tasklist
- * lock or the siglock. No one else can take our signal_struct away,
- * no one else can reap the children to update signal->c* counters, and
- * no one else can race with the signal-> fields.
- * If we do not take the tasklist_lock, the signal-> fields could be read
- * out of order while another thread was just exiting. So we place a
- * read memory barrier when we avoid the lock. On the writer side,
- * write memory barrier is implied in __exit_signal as __exit_signal releases
- * the siglock spinlock after updating the signal-> fields.
- *
- * We don't really need the siglock when we access the non c* fields
- * of the signal_struct (for RUSAGE_SELF) even in multithreaded
- * case, since we take the tasklist lock for read and the non c* signal->
- * fields are updated only in __exit_signal, which is called with
- * tasklist_lock taken for write, hence these two threads cannot execute
- * concurrently.
+ * Locking:
+ * We need to take the siglock for CHILDEREN, SELF and BOTH
+ * for the cases current multithreaded, non-current single threaded
+ * non-current multithreaded. Thread traversal is now safe with
+ * the siglock held.
+ * Strictly speaking, we donot need to take the siglock if we are current and
+ * single threaded, as no one else can take our signal_struct away, no one
+ * else can reap the children to update signal->c* counters, and no one else
+ * can race with the signal-> fields. If we do not take any lock, the
+ * signal-> fields could be read out of order while another thread was just
+ * exiting. So we should place a read memory barrier when we avoid the lock.
+ * On the writer side, write memory barrier is implied in __exit_signal
+ * as __exit_signal releases the siglock spinlock after updating the signal->
+ * fields. But we don't do this yet to keep things simple.
*
*/
@@ -1885,35 +1882,25 @@ static void k_getrusage(struct task_stru
struct task_struct *t;
unsigned long flags;
cputime_t utime, stime;
- int need_lock = 0;
memset((char *) r, 0, sizeof *r);
utime = stime = cputime_zero;
- if (p != current || !thread_group_empty(p))
- need_lock = 1;
-
- if (need_lock) {
- read_lock(&tasklist_lock);
- if (unlikely(!p->signal)) {
- read_unlock(&tasklist_lock);
- return;
- }
- } else
- /* See locking comments above */
- smp_rmb();
+ rcu_read_lock();
+ if (!lock_task_sighand(p, &flags)) {
+ rcu_read_unlock();
+ return;
+ }
switch (who) {
case RUSAGE_BOTH:
case RUSAGE_CHILDREN:
- spin_lock_irqsave(&p->sighand->siglock, flags);
utime = p->signal->cutime;
stime = p->signal->cstime;
r->ru_nvcsw = p->signal->cnvcsw;
r->ru_nivcsw = p->signal->cnivcsw;
r->ru_minflt = p->signal->cmin_flt;
r->ru_majflt = p->signal->cmaj_flt;
- spin_unlock_irqrestore(&p->sighand->siglock, flags);
if (who == RUSAGE_CHILDREN)
break;
@@ -1940,9 +1927,10 @@ static void k_getrusage(struct task_stru
default:
BUG();
}
-
- if (need_lock)
- read_unlock(&tasklist_lock);
+
+ unlock_task_sighand(p, &flags);
+ rcu_read_unlock();
+
cputime_to_timeval(utime, &r->ru_utime);
cputime_to_timeval(stime, &r->ru_stime);
}
next prev parent reply other threads:[~2006-03-22 22:18 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-12-24 17:52 [rfc][patch] Avoid taking global tasklist_lock for single threaded process at getrusage() Oleg Nesterov
2005-12-27 20:21 ` Christoph Lameter
2005-12-28 12:38 ` [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess " Oleg Nesterov
2005-12-28 18:33 ` Ravikiran G Thirumalai
2005-12-28 22:57 ` Ravikiran G Thirumalai
2005-12-30 17:57 ` Oleg Nesterov
2006-01-04 23:16 ` Ravikiran G Thirumalai
2006-01-05 19:17 ` Oleg Nesterov
2006-01-06 9:46 ` Ravikiran G Thirumalai
2006-01-06 17:23 ` Christoph Lameter
2006-01-06 19:46 ` Ravikiran G Thirumalai
2006-03-20 18:04 ` Oleg Nesterov
2006-03-22 22:18 ` Ravikiran G Thirumalai [this message]
2006-03-23 18:18 ` Oleg Nesterov
2006-01-06 23:52 ` Andrew Morton
2006-01-08 11:49 ` Oleg Nesterov
2006-01-08 19:58 ` Ravikiran G Thirumalai
2006-01-09 18:55 ` Oleg Nesterov
2006-01-09 20:54 ` Ravikiran G Thirumalai
2006-01-10 19:03 ` Oleg Nesterov
2006-01-16 20:56 ` Ravikiran G Thirumalai
2006-01-17 19:59 ` Oleg Nesterov
2006-01-17 19:52 ` Ravikiran G Thirumalai
2006-01-18 9:17 ` Oleg Nesterov
2006-01-03 18:18 ` Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060322221844.GA3300@localhost.localdomain \
--to=kiran@scalex86.org \
--cc=akpm@osdl.org \
--cc=clameter@engr.sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nippung@calsoftinc.com \
--cc=oleg@tv-sign.ru \
--cc=shai@scalex86.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox