From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
tglx@linutronix.de, daniel.blueman@gmail.com,
lizf@cn.fujitsu.com, miles.lane@gmail.com,
manfred@colorfullife.com
Subject: Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
Date: Tue, 22 Jun 2010 13:44:33 -0700 [thread overview]
Message-ID: <20100622204433.GJ2290@linux.vnet.ibm.com> (raw)
In-Reply-To: <1276764554.27822.26.camel@twins>
On Thu, Jun 17, 2010 at 10:49:14AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote:
>
> > Hello, Peter!
> >
> > Here is the story as I understand it:
> >
> > o wake_affine() calls task_group() and uses the resulting
> > pointer, for example, passing it to effective_load().
> >
> > This pointer is to a struct task_group, which contains
> > a struct rcu_head, which is passed to call_rcu in
> > sched_destroy_group(). So some protection really is
> > needed -- or is it enough that wake_affine seems to be
> > invoked on the current task? If the latter, we would
> > need to add a "task == current" check to task_subsys_state().
> >
> > o task_group() calls task_subsys_state(), returning a pointer to
> > the enclosing task_group structure.
> >
> > o task_subsys_state() returns an rcu_dereference_check()ed
> > pointer. The caller must either be in an RCU read-side
> > critical section, hold the ->alloc_lock, or hold the
> > cgroup lock.
> >
> > Now wake_affine() appears to be doing load calculations, so it does not
> > seem reasonable to acquire the lock. Hence the use of RCU.
> >
> > So, what should we be doing instead? ;-)
>
> Well, start by writing a sane changlog ;-)
As soon as I learn the relevant definition of "sane" for this context. ;-)
> I realise you didn't actually wrote these patches, but you should push
> back to the people feeding you these things (esp when you get gems like:
>
> tg = task_group();
> rcu_read_unlock();
>
> which is obvious utter garbage).
Agreed. If you prefer, I can combine the two patches to avoid the
appearance of insanity. (The second patch of the pair adjusts the
rcu_read_unlock() to cover all uses of the "tg" pointer.)
> There's _two_ task_group() users in wake_affine(), at least one should
> be covered by the rq->lock we're holding. It should then explain why the
> other isn't covered (and which the other is).
I am probably missing something, but I see wake_affine() only called
from select_task_rq_fair(), which is one of the possible values for
->select_task_rq(). This can be called from select_task_rq(), which
claims that it can be called without holding rq->lock. I do not see
any rq->lock acquisition on the path from select_task_rq() to the
call to wake_affine().
(I am looking at 2.6.34, FWIW.)
> It should also explain why using RCU read lock is the right solution,
> and doesn't result in funny races. That is, the current changelog reads
> like: "It whines, this makes it quiet." -- which I totally distrust
> because we already found at least two actual bugs in this area
> (sched-cgroup rcu usage).
The usage appears to be heuristic in nature, so that processing old
data should be non-fatal.
> That said, the two patches together might not be wrong, but its very
> hard to verify without more information.
Left to myself, I would combine the two patches and use the changelog
shown below. Does this work for you?
Thanx, Paul
rcu: apply RCU protection to wake_affine()
The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.
Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
next prev parent reply other threads:[~2010-06-22 20:44 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-06-16 4:22 [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes Paul E. McKenney
2010-06-16 5:53 ` Ingo Molnar
2010-06-16 6:23 ` Peter Zijlstra
2010-06-16 22:41 ` Paul E. McKenney
2010-06-17 8:49 ` Peter Zijlstra
2010-06-22 20:44 ` Paul E. McKenney [this message]
2010-06-23 8:08 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100622204433.GJ2290@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=daniel.blueman@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=manfred@colorfullife.com \
--cc=miles.lane@gmail.com \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox