Re: tree rcu: call_rcu scalability problem?

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@suse.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: tree rcu: call_rcu scalability problem?
Date: Wed, 2 Sep 2009 22:14:27 -0700
Message-ID: <20090903051427.GD7138@linux.vnet.ibm.com>
In-Reply-To: <1251919064.10394.25.camel@laptop>

On Wed, Sep 02, 2009 at 09:17:44PM +0200, Peter Zijlstra wrote:
> On Wed, 2009-09-02 at 14:27 +0200, Nick Piggin wrote:
> 
> > It seems like nearly 2/3 of the cost is here:
> >         /* Add the callback to our list. */
> >         *rdp->nxttail[RCU_NEXT_TAIL] = head; <<<
> >         rdp->nxttail[RCU_NEXT_TAIL] = &head->next;
> > 
> > In loading the pointer to the next tail pointer. If I'm reading the profile
> > correctly. Can't see why that should be a problem though...
> > 
> > ffffffff8107dee0 <__call_rcu>: /* __call_rcu total: 320971 100.000 */
> >    697  0.2172 :ffffffff8107dee0:       push   %r12
> 
> >    921  0.2869 :ffffffff8107df57:       push   %rdx
> >    151  0.0470 :ffffffff8107df58:       popfq
> > 183507 57.1725 :ffffffff8107df59:       mov    0x50(%rbx),%rax
> >    995  0.3100 :ffffffff8107df5d:       mov    %rdi,(%rax)
> 
> I'd guess at popfq to be the expensive op here.. skid usually causes the
> attribution to be a few ops down the line.

I believe that Nick's workload is routinely driving the number of
callbacks queued on a given CPU above 10,000, which would provoke numerous
(and possibly inlined) calls to force_quiescent_state().  Like about
400,000 such calls per second.  Hey, I was naively assuming that no one
would see more than 10,000 callbacks queued on a single CPU unless there
was some sort of major emergency underway, and coded accordingly.  ;-)

I offer the attached experimental (untested, might not even compile) patch.

							Thanx, Paul

------------------------------------------------------------------------

From 0544d2da54bad95556a320e57658e244cb2ae8c6 Mon Sep 17 00:00:00 2001
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Wed, 2 Sep 2009 22:01:50 -0700
Subject: [PATCH] Remove grace-period machinery from rcutree __call_rcu()

The grace-period machinery in __call_rcu() was a failed attempt to avoid
implementing synchronize_rcu_expedited().  But now that this attempt has
failed, try removing the machinery.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   12 ------------
 1 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index d2a372f..104de9e 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1201,26 +1201,14 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
 	 */
 	local_irq_save(flags);
 	rdp = rsp->rda[smp_processor_id()];
-	rcu_process_gp_end(rsp, rdp);
-	check_for_new_grace_period(rsp, rdp);
 
 	/* Add the callback to our list. */
 	*rdp->nxttail[RCU_NEXT_TAIL] = head;
 	rdp->nxttail[RCU_NEXT_TAIL] = &head->next;
 
-	/* Start a new grace period if one not already started. */
-	if (ACCESS_ONCE(rsp->completed) == ACCESS_ONCE(rsp->gpnum)) {
-		unsigned long nestflag;
-		struct rcu_node *rnp_root = rcu_get_root(rsp);
-
-		spin_lock_irqsave(&rnp_root->lock, nestflag);
-		rcu_start_gp(rsp, nestflag);  /* releases rnp_root->lock. */
-	}
-
 	/* Force the grace period if too many callbacks or too long waiting. */
 	if (unlikely(++rdp->qlen > qhimark)) {
 		rdp->blimit = LONG_MAX;
-		force_quiescent_state(rsp, 0);
 	} else if ((long)(ACCESS_ONCE(rsp->jiffies_force_qs) - jiffies) < 0)
 		force_quiescent_state(rsp, 1);
 	local_irq_restore(flags);
-- 
1.5.2.5