public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ravikiran G Thirumalai <kiran@scalex86.org>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Christoph Lameter <clameter@engr.sgi.com>,
	Shai Fultheim <shai@scalex86.org>,
	Nippun Goel <nippung@calsoftinc.com>,
	linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess at getrusage()
Date: Wed, 22 Mar 2006 14:18:44 -0800	[thread overview]
Message-ID: <20060322221844.GA3300@localhost.localdomain> (raw)
In-Reply-To: <441EEEC8.D4D9C40A@tv-sign.ru>

On Mon, Mar 20, 2006 at 09:04:56PM +0300, Oleg Nesterov wrote:
> Hello Ravikiran,

Hi Oleg, sorry for the late response..

> 
> Ravikiran G Thirumalai wrote:
> > 
> > Following patch avoids taking the global tasklist_lock when possible,
> > if a process is single threaded during getrusage().  Any avoidance of
> > tasklist_lock is good for NUMA boxes (and possibly for large SMPs).
> >
> > ...
> >
> >  static void k_getrusage(struct task_struct *p, int who, struct rusage *r)
> > @@ -1681,14 +1697,22 @@ static void k_getrusage(struct task_stru
> >         struct task_struct *t;
> >         unsigned long flags;
> >         cputime_t utime, stime;
> > +       int need_lock = 0;
> > 
> >         memset((char *) r, 0, sizeof *r);
> > -
> > -       if (unlikely(!p->signal))
> > -               return;
> > -
> >         utime = stime = cputime_zero;
> > 
> > +       need_lock = (p != current || !thread_group_empty(p));
> > +       if (need_lock) {
> > +               read_lock(&tasklist_lock);
> > +               if (unlikely(!p->signal)) {
> > +                       read_unlock(&tasklist_lock);
> > +                       return;
> > +               }
> > +       } else
> > +               /* See locking comments above */
> > +               smp_rmb();
> > +
> 
> I think now it is possible to improve this patch.
> 
> Could you look at these patches?
> 
> 	[PATCH] introduce lock_task_sighand() helper
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=114028190927763
> 
> 	[PATCH 0/3] make threads traversal ->siglock safe
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=114064825626496
> 
> I think we can forget about tasklist_lock in k_getrusage() completely
> and just use lock_task_sighand().
> 
> What do you think?

Great!! Nice patches to avoid tasklist lock on thread traversal.

However, I was trying to comprehend the tasklist locking changes in 
2.6.16-rc6mm2 and hit upon:

__exit_signal
	cleanup_sighand(tsk);
		kmem_cache_free(sighand)
	spin_unlock(sighand->lock)

It looked suspicious to me until I realised sighand cache now had 
SLAB_DESTROY_BY_RCU. Can we please add comments (at cleanup_sighand or 
__exit_signal) to make it a bit clearer for people like me :)

How is the following patch to avoid tasklist lock completely at getrusage?
(Andrew, I can remake the patch against a reverted 
avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
if you prefer it that way)

Thanks,
Kiran


Change avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
patch to not take the global tasklist lock at all. We don't need to take
the tasklist lock for thread traversal of a process since Oleg's
do-__unhash_process-under-siglock.patch and related work.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>

Index: linux-2.6.16-rc6mm2/kernel/sys.c
===================================================================
--- linux-2.6.16-rc6mm2.orig/kernel/sys.c	2006-03-21 17:04:09.000000000 -0800
+++ linux-2.6.16-rc6mm2/kernel/sys.c	2006-03-22 12:46:03.000000000 -0800
@@ -1860,23 +1860,20 @@ out:
  * fields when reaping, so a sample either gets all the additions of a
  * given child after it's reaped, or none so this sample is before reaping.
  *
- * tasklist_lock locking optimisation:
- * If we are current and single threaded, we do not need to take the tasklist
- * lock or the siglock.  No one else can take our signal_struct away,
- * no one else can reap the children to update signal->c* counters, and
- * no one else can race with the signal-> fields.
- * If we do not take the tasklist_lock, the signal-> fields could be read
- * out of order while another thread was just exiting. So we place a
- * read memory barrier when we avoid the lock.  On the writer side,
- * write memory barrier is implied in  __exit_signal as __exit_signal releases
- * the siglock spinlock after updating the signal-> fields.
- *
- * We don't really need the siglock when we access the non c* fields
- * of the signal_struct (for RUSAGE_SELF) even in multithreaded
- * case, since we take the tasklist lock for read and the non c* signal->
- * fields are updated only in __exit_signal, which is called with
- * tasklist_lock taken for write, hence these two threads cannot execute
- * concurrently.
+ * Locking:
+ * We need to take the siglock for CHILDEREN, SELF and BOTH 
+ * for  the cases current multithreaded, non-current single threaded
+ * non-current multithreaded.  Thread traversal is now safe with 
+ * the siglock held. 
+ * Strictly speaking, we donot need to take the siglock if we are current and 
+ * single threaded,  as no one else can take our signal_struct away, no one 
+ * else can  reap the  children to update signal->c* counters, and no one else 
+ * can race with the signal-> fields. If we do not take any lock, the 
+ * signal-> fields could be read out of order while another thread was just 
+ * exiting. So we should  place a read memory barrier when we avoid the lock.  
+ * On the writer side,  write memory barrier is implied in  __exit_signal 
+ * as __exit_signal releases  the siglock spinlock after updating the signal-> 
+ * fields. But we don't do this yet to keep things simple.
  *
  */
 
@@ -1885,35 +1882,25 @@ static void k_getrusage(struct task_stru
 	struct task_struct *t;
 	unsigned long flags;
 	cputime_t utime, stime;
-	int need_lock = 0;
 
 	memset((char *) r, 0, sizeof *r);
 	utime = stime = cputime_zero;
 
-	if (p != current || !thread_group_empty(p))
-		need_lock = 1;
-
-	if (need_lock) {
-		read_lock(&tasklist_lock);
-		if (unlikely(!p->signal)) {
-			read_unlock(&tasklist_lock);
-			return;
-		}
-	} else
-		/* See locking comments above */
-		smp_rmb();
+	rcu_read_lock();
+	if (!lock_task_sighand(p, &flags)) {
+		rcu_read_unlock();
+		return;
+	}	
 
 	switch (who) {
 		case RUSAGE_BOTH:
 		case RUSAGE_CHILDREN:
-			spin_lock_irqsave(&p->sighand->siglock, flags);
 			utime = p->signal->cutime;
 			stime = p->signal->cstime;
 			r->ru_nvcsw = p->signal->cnvcsw;
 			r->ru_nivcsw = p->signal->cnivcsw;
 			r->ru_minflt = p->signal->cmin_flt;
 			r->ru_majflt = p->signal->cmaj_flt;
-			spin_unlock_irqrestore(&p->sighand->siglock, flags);
 
 			if (who == RUSAGE_CHILDREN)
 				break;
@@ -1940,9 +1927,10 @@ static void k_getrusage(struct task_stru
 		default:
 			BUG();
 	}
-
-	if (need_lock)
-		read_unlock(&tasklist_lock);
+	
+	unlock_task_sighand(p, &flags);
+	rcu_read_unlock();
+	
 	cputime_to_timeval(utime, &r->ru_utime);
 	cputime_to_timeval(stime, &r->ru_stime);
 }


  reply	other threads:[~2006-03-22 22:18 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-12-24 17:52 [rfc][patch] Avoid taking global tasklist_lock for single threaded process at getrusage() Oleg Nesterov
2005-12-27 20:21 ` Christoph Lameter
2005-12-28 12:38   ` [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess " Oleg Nesterov
2005-12-28 18:33     ` Ravikiran G Thirumalai
2005-12-28 22:57       ` Ravikiran G Thirumalai
2005-12-30 17:57         ` Oleg Nesterov
2006-01-04 23:16           ` Ravikiran G Thirumalai
2006-01-05 19:17             ` Oleg Nesterov
2006-01-06  9:46               ` Ravikiran G Thirumalai
2006-01-06 17:23                 ` Christoph Lameter
2006-01-06 19:46                   ` Ravikiran G Thirumalai
2006-03-20 18:04                     ` Oleg Nesterov
2006-03-22 22:18                       ` Ravikiran G Thirumalai [this message]
2006-03-23 18:18                         ` Oleg Nesterov
2006-01-06 23:52                   ` Andrew Morton
2006-01-08 11:49                 ` Oleg Nesterov
2006-01-08 19:58                   ` Ravikiran G Thirumalai
2006-01-09 18:55                     ` Oleg Nesterov
2006-01-09 20:54                       ` Ravikiran G Thirumalai
2006-01-10 19:03                         ` Oleg Nesterov
2006-01-16 20:56                           ` Ravikiran G Thirumalai
2006-01-17 19:59                             ` Oleg Nesterov
2006-01-17 19:52                               ` Ravikiran G Thirumalai
2006-01-18  9:17                                 ` Oleg Nesterov
2006-01-03 18:18         ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060322221844.GA3300@localhost.localdomain \
    --to=kiran@scalex86.org \
    --cc=akpm@osdl.org \
    --cc=clameter@engr.sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nippung@calsoftinc.com \
    --cc=oleg@tv-sign.ru \
    --cc=shai@scalex86.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox