From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: serue@us.ibm.com
Cc: lkml <linux-kernel@vger.kernel.org>,
Dipankar Sarma <dipankar@in.ibm.com>,
"David A. Wheeler" <dwheeler@ida.org>,
Tony Jones <tonyj@immunix.com>
Subject: Re: rcu-refcount stacker performance
Date: Thu, 14 Jul 2005 11:50:53 -0700
Message-ID: <20050714185053.GF1299@us.ibm.com>
In-Reply-To: <20050714171357.GA23309@serge.austin.ibm.com>

On Thu, Jul 14, 2005 at 12:13:57PM -0500, serue@us.ibm.com wrote:
> Quoting Paul E. McKenney (paulmck@us.ibm.com):
> > On Thu, Jul 14, 2005 at 08:44:50AM -0500, serue@us.ibm.com wrote:
> > > Quoting Paul E. McKenney (paulmck@us.ibm.com):
> > > > My guess is that the reference count is indeed costing you quite a
> > > > bit.  I glanced quickly at the patch, and most of the uses seem to
> > > > be of the form:
> > > >
> > > > 	increment ref count
> > > > 	rcu_read_lock()
> > > > 	do something
> > > > 	rcu_read_unlock()
> > > > 	decrement ref count
> > > >
> > > > Can't these cases rely solely on rcu_read_lock()? Why do you also
> > > > need to increment the reference count in these cases?
> > >
> > > The problem is on module unload: is it possible for CPU1 to be
> > > on "do something", and sleep, and, while it sleeps, CPU2 does
> > > rmmod(lsm), so that by the time CPU1 stops sleeping, the code it
> > > is executing has been freed?
> >
> > OK, but in the above case, "do something" cannot be sleeping, since
> > it is under rcu_read_lock().
>
> Oh, but that's not quite what the code is doing, rather it is doing:
>
> 	rcu_read_lock
> 	while get next element from list
> 		inc element.refcount
> 		rcu_read_unlock
> 		do something
> 		rcu_read_lock
> 		dec refcount
> 	rcu_read_unlock

Color me blind this morning...  :-/  Yes, "do something" can legitimately
sleep.  Sorry for my confusion!

> What I plan to try next is:
>
> 	rcu_read_lock
> 	while get next element from list
> 		if (element->owning_module->state != LIVE)
> 			continue
> 		rcu_read_unlock
> 		do something
> 		rcu_read_lock
> 	rcu_read_unlock
>
> > > Because stacker won't remove the lsm from the list of modules
> > > until mod->exit() is executed, and module_free(mod) happens
> > > immediately after that, the above scenario seems possible.
> >
> > Right, if you have some other code path that sleeps (outside of
> > rcu_read_lock(), right?), then you need the reference count for that
> > code path. But the code paths that do not sleep should be able to
> > dispense with the reference count, reducing the cache-line traffic.
>
> Most if not all of the codepaths can sleep, however. So unfortunately
> that doesn't seem a feasible solution. That's why I'm hoping there is
> something inherent in the module unload code that I can take advantage
> of to forego my own refcounting.

OK, so the only way that elements are removed is when a module is
unloaded, right?

If your module trick does not pan out, how about the following:

o	Add a "need per-element reference count" global variable.

o	Have a per-CPU reference-count variable.

o	Make your code snippet do something like the following:

		rcu_read_lock()
		while get next element from list
			if (need per-element reference count)
				ref = &element.refcount
			else
				ref = &__get_cpu_var(stacker_refcounts)
			atomic_inc(ref)
			rcu_read_unlock()
			do something
			rcu_read_lock()
			atomic_dec(ref)
		rcu_read_unlock()

o	The point is to (hopefully) reduce the cache thrashing associated
	with the reference counts.
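
For concreteness, here is a rough user-space C11 model of the reader side
sketched above.  It is only an illustration under stated assumptions: RCU
itself is not modeled, the CPU number is passed in explicitly rather than
obtained via __get_cpu_var(), and NR_CPUS, need_elem_ref, stacker_get()
and stacker_put() are made-up names, not anything from the stacker patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this model */

struct element {
	atomic_int refcount;	/* per-element count, used only around unload */
};

/* Stand-in for the "need per-element reference count" global. */
static atomic_bool need_elem_ref;

/* Stand-in for the per-CPU stacker_refcounts variable. */
static atomic_int stacker_refcounts[NR_CPUS];

/*
 * Runs where the pseudocode holds rcu_read_lock().  Picks the counter,
 * increments it, and returns it so the caller can drop the same one later.
 */
static atomic_int *stacker_get(struct element *e, int cpu)
{
	atomic_int *ref;

	assert(cpu >= 0 && cpu < NR_CPUS);
	ref = atomic_load(&need_elem_ref) ? &e->refcount
					  : &stacker_refcounts[cpu];
	atomic_fetch_add(ref, 1);
	return ref;
}

/*
 * The caller may have slept and migrated since stacker_get(), so the
 * decrement can hit another CPU's counter; hence the atomic operation.
 */
static void stacker_put(atomic_int *ref)
{
	atomic_fetch_sub(ref, 1);
}
```

In the common case every reader touches only its own CPU's counter, so the
per-element cache line is left alone until an unload is in progress.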

At module unload time, do something like the following:

	need per-element reference count = 1
	synchronize_rcu()
	for_each_cpu(cpu)
		while (per_cpu(stacker_refcounts,cpu) != 0)
			sleep for a bit
	/* At this point, all CPUs are using per-element reference counts */

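
The unload-time steps can be modeled in user-space C11 roughly as follows.
Again this is a sketch, not kernel code: model_synchronize_rcu() is an
empty placeholder where the kernel would call synchronize_rcu(), and
NR_CPUS, need_elem_ref, stacker_cpu_idle() and stacker_switch_to_elem_refs()
are hypothetical names chosen for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this model */

/* Stand-in for the "need per-element reference count" global. */
static atomic_bool need_elem_ref;

/* Stand-in for the per-CPU stacker_refcounts variable. */
static atomic_int stacker_refcounts[NR_CPUS];

/* Placeholder only: the kernel version would be synchronize_rcu(). */
static void model_synchronize_rcu(void)
{
}

/* True when no reader still holds a reference via this CPU's counter. */
static bool stacker_cpu_idle(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return atomic_load(&stacker_refcounts[cpu]) == 0;
}

/*
 * Flip readers over to per-element counts, wait for readers that may
 * have sampled the old flag value, then wait for each per-CPU count
 * to drain to zero.
 */
static void stacker_switch_to_elem_refs(void)
{
	struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };
	int cpu;

	atomic_store(&need_elem_ref, true);
	model_synchronize_rcu();
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		while (!stacker_cpu_idle(cpu))
			nanosleep(&nap, NULL);	/* "sleep for a bit" */
	/* From here on, only per-element reference counts are in use. */
}
```

Once the flag is set and the grace period has elapsed, no new reader can
increment a per-CPU counter, so each counter can only fall, and checking
the CPUs one at a time is enough.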
If this approach does not reduce cache thrashing enough, one could use
a per-task reference count instead of a per-CPU reference count.  The
downside of doing this per-task approach is that you have to traverse
the entire task list at unload time.  But module unloading should be
quite rare.  If doing the per-task approach, you don't need atomic
increments and decrements for the reference count, and you have excellent
cache locality.

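
A toy, single-threaded C model of the per-task variant might look like
the following.  It is a sketch under assumptions: struct task with its
next pointer is a hypothetical stand-in for the kernel task list, the
function names are invented, and the memory ordering that a real kernel
version would need between the readers and the unload-time scan is
omitted entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	int stacker_ref;	/* written only by the owning task */
	struct task *next;	/* hypothetical stand-in for the task list */
};

/* Only the task itself updates its count, so a plain increment suffices. */
static void task_ref_get(struct task *self)
{
	self->stacker_ref++;
}

static void task_ref_put(struct task *self)
{
	assert(self->stacker_ref > 0);
	self->stacker_ref--;
}

/*
 * Unload side: walk every task and report whether any still holds a
 * reference.  This full traversal is the cost of the per-task scheme.
 */
static bool all_tasks_idle(const struct task *head)
{
	const struct task *t;

	for (t = head; t != NULL; t = t->next)
		if (t->stacker_ref != 0)
			return false;
	return true;
}
```

The unload path would call something like all_tasks_idle() repeatedly,
sleeping between scans, in place of the per-CPU loop.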
Thanx, Paul