All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ravikiran G Thirumalai <kiran@scalex86.org>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Christoph Lameter <clameter@engr.sgi.com>,
	Shai Fultheim <shai@scalex86.org>,
	Nippun Goel <nippung@calsoftinc.com>,
	linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess at getrusage()
Date: Wed, 22 Mar 2006 14:18:44 -0800	[thread overview]
Message-ID: <20060322221844.GA3300@localhost.localdomain> (raw)
In-Reply-To: <441EEEC8.D4D9C40A@tv-sign.ru>

On Mon, Mar 20, 2006 at 09:04:56PM +0300, Oleg Nesterov wrote:
> Hello Ravikiran,

Hi Oleg, sorry for the late response..

> 
> Ravikiran G Thirumalai wrote:
> > 
> > Following patch avoids taking the global tasklist_lock when possible,
> > if a process is single threaded during getrusage().  Any avoidance of
> > tasklist_lock is good for NUMA boxes (and possibly for large SMPs).
> >
> > ...
> >
> >  static void k_getrusage(struct task_struct *p, int who, struct rusage *r)
> > @@ -1681,14 +1697,22 @@ static void k_getrusage(struct task_stru
> >         struct task_struct *t;
> >         unsigned long flags;
> >         cputime_t utime, stime;
> > +       int need_lock = 0;
> > 
> >         memset((char *) r, 0, sizeof *r);
> > -
> > -       if (unlikely(!p->signal))
> > -               return;
> > -
> >         utime = stime = cputime_zero;
> > 
> > +       need_lock = (p != current || !thread_group_empty(p));
> > +       if (need_lock) {
> > +               read_lock(&tasklist_lock);
> > +               if (unlikely(!p->signal)) {
> > +                       read_unlock(&tasklist_lock);
> > +                       return;
> > +               }
> > +       } else
> > +               /* See locking comments above */
> > +               smp_rmb();
> > +
> 
> I think now it is possible to improve this patch.
> 
> Could you look at these patches?
> 
> 	[PATCH] introduce lock_task_sighand() helper
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=114028190927763
> 
> 	[PATCH 0/3] make threads traversal ->siglock safe
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=114064825626496
> 
> I think we can forget about tasklist_lock in k_getrusage() completely
> and just use lock_task_sighand().
> 
> What do you think?

Great!! Nice patches to avoid tasklist lock on thread traversal.

However, I was trying to comprehend the tasklist locking changes in 
2.6.16-rc6mm2 and hit upon:

__exit_signal
	cleanup_sighand(tsk);
		kmem_cache_free(sighand)
	spin_unlock(sighand->lock)

It looked suspicious to me until I realised sighand cache now had 
SLAB_DESTROY_BY_RCU. Can we please add comments (at cleanup_sighand or 
__exit_signal) to make it a bit clearer for people like me :)

How is the following patch to avoid tasklist lock completely at getrusage?
(Andrew, I can remake the patch against a reverted 
avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
if you prefer it that way)

Thanks,
Kiran


Change avoid-taking-global-tasklist_lock-for-single-threadedprocess-at-getrusage
patch to not take the global tasklist lock at all. We don't need to take
the tasklist lock for thread traversal of a process since Oleg's
do-__unhash_process-under-siglock.patch and related work.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>

Index: linux-2.6.16-rc6mm2/kernel/sys.c
===================================================================
--- linux-2.6.16-rc6mm2.orig/kernel/sys.c	2006-03-21 17:04:09.000000000 -0800
+++ linux-2.6.16-rc6mm2/kernel/sys.c	2006-03-22 12:46:03.000000000 -0800
@@ -1860,23 +1860,20 @@ out:
  * fields when reaping, so a sample either gets all the additions of a
  * given child after it's reaped, or none so this sample is before reaping.
  *
- * tasklist_lock locking optimisation:
- * If we are current and single threaded, we do not need to take the tasklist
- * lock or the siglock.  No one else can take our signal_struct away,
- * no one else can reap the children to update signal->c* counters, and
- * no one else can race with the signal-> fields.
- * If we do not take the tasklist_lock, the signal-> fields could be read
- * out of order while another thread was just exiting. So we place a
- * read memory barrier when we avoid the lock.  On the writer side,
- * write memory barrier is implied in  __exit_signal as __exit_signal releases
- * the siglock spinlock after updating the signal-> fields.
- *
- * We don't really need the siglock when we access the non c* fields
- * of the signal_struct (for RUSAGE_SELF) even in multithreaded
- * case, since we take the tasklist lock for read and the non c* signal->
- * fields are updated only in __exit_signal, which is called with
- * tasklist_lock taken for write, hence these two threads cannot execute
- * concurrently.
+ * Locking:
+ * We need to take the siglock for CHILDEREN, SELF and BOTH 
+ * for  the cases current multithreaded, non-current single threaded
+ * non-current multithreaded.  Thread traversal is now safe with 
+ * the siglock held. 
+ * Strictly speaking, we donot need to take the siglock if we are current and 
+ * single threaded,  as no one else can take our signal_struct away, no one 
+ * else can  reap the  children to update signal->c* counters, and no one else 
+ * can race with the signal-> fields. If we do not take any lock, the 
+ * signal-> fields could be read out of order while another thread was just 
+ * exiting. So we should  place a read memory barrier when we avoid the lock.  
+ * On the writer side,  write memory barrier is implied in  __exit_signal 
+ * as __exit_signal releases  the siglock spinlock after updating the signal-> 
+ * fields. But we don't do this yet to keep things simple.
  *
  */
 
@@ -1885,35 +1882,25 @@ static void k_getrusage(struct task_stru
 	struct task_struct *t;
 	unsigned long flags;
 	cputime_t utime, stime;
-	int need_lock = 0;
 
 	memset((char *) r, 0, sizeof *r);
 	utime = stime = cputime_zero;
 
-	if (p != current || !thread_group_empty(p))
-		need_lock = 1;
-
-	if (need_lock) {
-		read_lock(&tasklist_lock);
-		if (unlikely(!p->signal)) {
-			read_unlock(&tasklist_lock);
-			return;
-		}
-	} else
-		/* See locking comments above */
-		smp_rmb();
+	rcu_read_lock();
+	if (!lock_task_sighand(p, &flags)) {
+		rcu_read_unlock();
+		return;
+	}	
 
 	switch (who) {
 		case RUSAGE_BOTH:
 		case RUSAGE_CHILDREN:
-			spin_lock_irqsave(&p->sighand->siglock, flags);
 			utime = p->signal->cutime;
 			stime = p->signal->cstime;
 			r->ru_nvcsw = p->signal->cnvcsw;
 			r->ru_nivcsw = p->signal->cnivcsw;
 			r->ru_minflt = p->signal->cmin_flt;
 			r->ru_majflt = p->signal->cmaj_flt;
-			spin_unlock_irqrestore(&p->sighand->siglock, flags);
 
 			if (who == RUSAGE_CHILDREN)
 				break;
@@ -1940,9 +1927,10 @@ static void k_getrusage(struct task_stru
 		default:
 			BUG();
 	}
-
-	if (need_lock)
-		read_unlock(&tasklist_lock);
+	
+	unlock_task_sighand(p, &flags);
+	rcu_read_unlock();
+	
 	cputime_to_timeval(utime, &r->ru_utime);
 	cputime_to_timeval(stime, &r->ru_stime);
 }


  reply	other threads:[~2006-03-22 22:18 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-12-24 17:52 [rfc][patch] Avoid taking global tasklist_lock for single threaded process at getrusage() Oleg Nesterov
2005-12-27 20:21 ` Christoph Lameter
2005-12-28 12:38   ` [rfc][patch] Avoid taking global tasklist_lock for single threadedprocess " Oleg Nesterov
2005-12-28 18:33     ` Ravikiran G Thirumalai
2005-12-28 22:57       ` Ravikiran G Thirumalai
2005-12-30 17:57         ` Oleg Nesterov
2006-01-04 23:16           ` Ravikiran G Thirumalai
2006-01-05 19:17             ` Oleg Nesterov
2006-01-06  9:46               ` Ravikiran G Thirumalai
2006-01-06 17:23                 ` Christoph Lameter
2006-01-06 19:46                   ` Ravikiran G Thirumalai
2006-03-20 18:04                     ` Oleg Nesterov
2006-03-22 22:18                       ` Ravikiran G Thirumalai [this message]
2006-03-23 18:18                         ` Oleg Nesterov
2006-01-06 23:52                   ` Andrew Morton
2006-01-08 11:49                 ` Oleg Nesterov
2006-01-08 19:58                   ` Ravikiran G Thirumalai
2006-01-09 18:55                     ` Oleg Nesterov
2006-01-09 20:54                       ` Ravikiran G Thirumalai
2006-01-10 19:03                         ` Oleg Nesterov
2006-01-16 20:56                           ` Ravikiran G Thirumalai
2006-01-17 19:59                             ` Oleg Nesterov
2006-01-17 19:52                               ` Ravikiran G Thirumalai
2006-01-18  9:17                                 ` Oleg Nesterov
2006-01-03 18:18         ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060322221844.GA3300@localhost.localdomain \
    --to=kiran@scalex86.org \
    --cc=akpm@osdl.org \
    --cc=clameter@engr.sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nippung@calsoftinc.com \
    --cc=oleg@tv-sign.ru \
    --cc=shai@scalex86.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.