Re: rcu-refcount stacker performance

From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: serue@us.ibm.com
Cc: lkml <linux-kernel@vger.kernel.org>,
	Dipankar Sarma <dipankar@in.ibm.com>,
	"David A. Wheeler" <dwheeler@ida.org>,
	Tony Jones <tonyj@immunix.com>
Subject: Re: rcu-refcount stacker performance
Date: Thu, 14 Jul 2005 11:50:53 -0700
Message-ID: <20050714185053.GF1299@us.ibm.com>
In-Reply-To: <20050714171357.GA23309@serge.austin.ibm.com>

On Thu, Jul 14, 2005 at 12:13:57PM -0500, serue@us.ibm.com wrote:
> Quoting Paul E. McKenney (paulmck@us.ibm.com):
> > On Thu, Jul 14, 2005 at 08:44:50AM -0500, serue@us.ibm.com wrote:
> > > Quoting Paul E. McKenney (paulmck@us.ibm.com):
> > > > My guess is that the reference count is indeed costing you quite a
> > > > bit.  I glanced quickly at the patch, and most of the uses seem to
> > > > be of the form:
> > > > 
> > > > 	increment ref count
> > > > 	rcu_read_lock()
> > > > 	do something
> > > > 	rcu_read_unlock()
> > > > 	decrement ref count
> > > > 
> > > > Can't these cases rely solely on rcu_read_lock()?  Why do you also
> > > > need to increment the reference count in these cases?
> > > 
> > > The problem is on module unload: is it possible for CPU1 to be
> > > on "do something", and sleep, and, while it sleeps, CPU2 does
> > > rmmod(lsm), so that by the time CPU1 stops sleeping, the code it
> > > is executing has been freed?
> > 
> > OK, but in the above case, "do something" cannot be sleeping, since
> > it is under rcu_read_lock().
> 
> Oh, but that's not quite what the code is doing, rather it is doing:
> 
> 	rcu_read_lock
> 	while get next element from list
> 		inc element.refcount
> 		rcu_read_unlock
> 		do something
> 		rcu_read_lock
> 		dec refcount
> 	rcu_read_unlock

Color me blind this morning...  :-/  Yes, "do something" can legitimately
sleep.  Sorry for my confusion!

> What I plan to try next is:
> 
> 	rcu_read_lock
> 	while get next element from list
> 		if (element->owning_module->state != LIVE)
> 			continue
> 		rcu_read_unlock
> 		do something
> 		rcu_read_lock
> 	rcu_read_unlock
> 
> > > Because stacker won't remove the lsm from the list of modules
> > > until mod->exit() is executed, and module_free(mod) happens
> > > immediately after that, the above scenario seems possible.
> > 
> > Right, if you have some other code path that sleeps (outside of
> > rcu_read_lock(), right?), then you need the reference count for that
> > code path.  But the code paths that do not sleep should be able to
> > dispense with the reference count, reducing the cache-line traffic.
> 
> Most if not all of the codepaths can sleep, however.  So unfortunately
> that doesn't seem a feasible solution.  That's why I'm hoping there is
> something inherent in the module unload code that I can take advantage
> of to forego my own refcounting.

OK, so the only way that elements are removed is when a module is
unloaded, right?

If your module trick does not pan out, how about the following:

o	Add a "need per-element reference count" global variable.

o	Have a per-CPU reference-count variable.

o	Make your code snippet do something like the following:

	rcu_read_lock()
	while get next element from list
		if (need per-element reference count)
			ref = &element.refcount
		else
			ref = &__get_cpu_var(stacker_refcounts)
		atomic_inc(ref)
		rcu_read_unlock()
		do something
		rcu_read_lock()
		atomic_dec(ref)
	rcu_read_unlock()

o	The point is to (hopefully) reduce the cache thrashing associated
	with the reference counts.
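
The reader-side counter selection above can be modeled in userspace C.
This is only a sketch under stated assumptions: C11 atomics stand in for
the kernel's atomic_inc()/atomic_dec(), an explicit cpu argument stands in
for __get_cpu_var(), and NR_CPUS, stacker_get() and stacker_put() are
hypothetical names that do not come from the stacker patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this userspace model */

struct element {
	atomic_int refcount;	/* per-element count, shared by all CPUs */
};

/* Models the "need per-element reference count" global variable. */
static atomic_bool need_per_element_refcount;

/* Models the per-CPU stacker_refcounts variable. */
static atomic_int stacker_refcounts[NR_CPUS];

/*
 * Pick a counter and take a reference.  In the kernel this would run
 * under rcu_read_lock(), just before dropping it to "do something".
 */
static atomic_int *stacker_get(struct element *e, int cpu)
{
	atomic_int *ref;

	if (atomic_load(&need_per_element_refcount))
		ref = &e->refcount;
	else
		ref = &stacker_refcounts[cpu];
	atomic_fetch_add(ref, 1);
	return ref;
}

/* Drop the reference on the same counter that stacker_get() chose. */
static void stacker_put(atomic_int *ref)
{
	int old = atomic_fetch_sub(ref, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
}
```

Saving the pointer matters: "do something" may sleep, so by the time the
reference is dropped the task may be running on another CPU, or the global
flag may have flipped.  The decrement must still land on the counter that
was incremented, which is also why the per-CPU counter is atomic.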

At module unload time, do something like the following:

	need per-element reference count = 1
	synchronize_rcu()
	for_each_cpu(cpu)
		while (per_cpu(stacker_refcounts,cpu) != 0)
			sleep for a bit

	/* At this point, all CPUs are using per-element reference counts */
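
The unload-time steps can be sketched the same way.  Again this is a
single-threaded userspace model, not kernel code: synchronize_rcu_stub()
is an empty stand-in for the real grace-period wait, the "sleep for a bit"
step is passed in as a callback, and demo_reader_finishes() is a
hypothetical callback that pretends a reader dropped its reference while
the unloader slept.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this userspace model */

static atomic_bool need_per_element_refcount;
static atomic_int stacker_refcounts[NR_CPUS];

/* Stand-in for synchronize_rcu(); a real grace-period wait is assumed. */
static void synchronize_rcu_stub(void)
{
}

/*
 * Switch readers over to per-element counts, then wait until nobody
 * still holds a reference through a per-CPU counter.
 */
static void stacker_wait_for_percpu_readers(void (*sleep_for_a_bit)(void))
{
	assert(sleep_for_a_bit != NULL);

	atomic_store(&need_per_element_refcount, true);
	/* After this, no reader can still pick a per-CPU counter. */
	synchronize_rcu_stub();
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		while (atomic_load(&stacker_refcounts[cpu]) != 0)
			sleep_for_a_bit();
	/* At this point, all readers are using per-element counts. */
}

/* Demo only: while we "slept", one reader dropped its reference. */
static void demo_reader_finishes(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (atomic_load(&stacker_refcounts[cpu]) > 0) {
			atomic_fetch_sub(&stacker_refcounts[cpu], 1);
			return;
		}
	}
}
```

The per-CPU counters can only drain after the flag is set and the grace
period has elapsed, because from then on new readers increment the
per-element count instead.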

If this approach does not reduce cache thrashing enough, one could use
a per-task reference count instead of a per-CPU reference count.  The
downside of the per-task approach is that you have to traverse the
entire task list at unload time, but module unloading should be quite
rare.  The upside is that you don't need atomic increments and
decrements for the reference count, and you get excellent cache
locality.
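
A rough userspace model of the per-task variant follows.  The struct
task, its next pointer, and the helper names are all hypothetical
stand-ins for the kernel's task list, and the model is single-threaded:
real code would still need to consider how the unloader safely reads a
counter that another task is updating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	int stacker_refcount;	/* written only by the owning task: no atomics */
	struct task *next;	/* hypothetical stand-in for the task list */
};

/* Reader side: plain increment on the current task's own counter. */
static void task_get(struct task *self)
{
	self->stacker_refcount++;
}

/* Reader side: plain decrement on the same task's counter. */
static void task_put(struct task *self)
{
	assert(self->stacker_refcount > 0);
	self->stacker_refcount--;
}

/* Unload side: walk every task; true once no task holds a reference. */
static bool all_tasks_drained(const struct task *head)
{
	for (const struct task *t = head; t != NULL; t = t->next)
		if (t->stacker_refcount != 0)
			return false;
	return true;
}
```

The unloader would call all_tasks_drained() in a loop, sleeping between
passes, in place of the for_each_cpu() loop used in the per-CPU version.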

							Thanx, Paul
