* [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
@ 2010-06-16 4:22 Paul E. McKenney
2010-06-16 5:53 ` Ingo Molnar
0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2010-06-16 4:22 UTC (permalink / raw)
To: mingo
Cc: linux-kernel, akpm, a.p.zijlstra, tglx, daniel.blueman, lizf,
miles.lane, manfred
Hello, Ingo,
Here are a few more fixes for RCU-lockdep splats.
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent
This is based off of v2.6.35-rc3.
Thanx, Paul
------------------>
Daniel J Blueman (1):
rcu: fix lockdep splat in wake_affine()
Paul E. McKenney (2):
rcu: fix scope of wake_affine()'s new RCU read-side critical section
idr: fix RCU lockdep splat in idr_get_next()
kernel/sched_fair.c | 2 ++
lib/idr.c | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-16 4:22 [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes Paul E. McKenney
@ 2010-06-16 5:53 ` Ingo Molnar
2010-06-16 6:23 ` Peter Zijlstra
0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2010-06-16 5:53 UTC (permalink / raw)
To: Paul E. McKenney
Cc: linux-kernel, akpm, a.p.zijlstra, tglx, daniel.blueman, lizf,
miles.lane, manfred
* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> Hello, Ingo,
>
> Here are a few more fixes for RCU-lockdep splats.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent
>
> This is based off of v2.6.35-rc3.
>
> Thanx, Paul
>
> ------------------>
> Daniel J Blueman (1):
> rcu: fix lockdep splat in wake_affine()
>
> Paul E. McKenney (2):
> rcu: fix scope of wake_affine()'s new RCU read-side critical section
> idr: fix RCU lockdep splat in idr_get_next()
>
> kernel/sched_fair.c | 2 ++
> lib/idr.c | 4 ++--
> 2 files changed, 4 insertions(+), 2 deletions(-)
Pulled, thanks a lot Paul!
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-16 5:53 ` Ingo Molnar
@ 2010-06-16 6:23 ` Peter Zijlstra
2010-06-16 22:41 ` Paul E. McKenney
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2010-06-16 6:23 UTC (permalink / raw)
To: Ingo Molnar
Cc: Paul E. McKenney, linux-kernel, akpm, tglx, daniel.blueman, lizf,
miles.lane, manfred
On Wed, 2010-06-16 at 07:53 +0200, Ingo Molnar wrote:
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>
> > Hello, Ingo,
> >
> > Here are a few more fixes for RCU-lockdep splats.
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent
> >
> > This is based off of v2.6.35-rc3.
> >
> > Thanx, Paul
> >
> > ------------------>
> > Daniel J Blueman (1):
> > rcu: fix lockdep splat in wake_affine()
> >
> > Paul E. McKenney (2):
> > rcu: fix scope of wake_affine()'s new RCU read-side critical section
> > idr: fix RCU lockdep splat in idr_get_next()
> >
> > kernel/sched_fair.c | 2 ++
> > lib/idr.c | 4 ++--
> > 2 files changed, 4 insertions(+), 2 deletions(-)
>
> Pulled, thanks a lot Paul!
I'm not at all sure of those two wake_affine() ones..
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-16 6:23 ` Peter Zijlstra
@ 2010-06-16 22:41 ` Paul E. McKenney
2010-06-17 8:49 ` Peter Zijlstra
0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2010-06-16 22:41 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf,
miles.lane, manfred
On Wed, Jun 16, 2010 at 08:23:51AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-16 at 07:53 +0200, Ingo Molnar wrote:
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> >
> > > Hello, Ingo,
> > >
> > > Here are a few more fixes for RCU-lockdep splats.
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent
> > >
> > > This is based off of v2.6.35-rc3.
> > >
> > > Thanx, Paul
> > >
> > > ------------------>
> > > Daniel J Blueman (1):
> > > rcu: fix lockdep splat in wake_affine()
> > >
> > > Paul E. McKenney (2):
> > > rcu: fix scope of wake_affine()'s new RCU read-side critical section
> > > idr: fix RCU lockdep splat in idr_get_next()
> > >
> > > kernel/sched_fair.c | 2 ++
> > > lib/idr.c | 4 ++--
> > > 2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > Pulled, thanks a lot Paul!
>
> I'm not at all sure of those two wake_affine() ones..
Hello, Peter!
Here is the story as I understand it:
o wake_affine() calls task_group() and uses the resulting
pointer, for example, passing it to effective_load().
This pointer is to a struct task_group, which contains
a struct rcu_head, which is passed to call_rcu in
sched_destroy_group(). So some protection really is
needed -- or is it enough that wake_affine seems to be
invoked on the current task? If the latter, we would
need to add a "task == current" check to task_subsys_state().
o task_group() calls task_subsys_state(), returning a pointer to
the enclosing task_group structure.
o task_subsys_state() returns an rcu_dereference_check()ed
pointer. The caller must either be in an RCU read-side
critical section, hold the ->alloc_lock, or hold the
cgroup lock.
Now wake_affine() appears to be doing load calculations, so it does not
seem reasonable to acquire the lock. Hence the use of RCU.
So, what should we be doing instead? ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-16 22:41 ` Paul E. McKenney
@ 2010-06-17 8:49 ` Peter Zijlstra
2010-06-22 20:44 ` Paul E. McKenney
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2010-06-17 8:49 UTC (permalink / raw)
To: paulmck
Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf,
miles.lane, manfred
On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote:
> Hello, Peter!
>
> Here is the story as I understand it:
>
> o wake_affine() calls task_group() and uses the resulting
> pointer, for example, passing it to effective_load().
>
> This pointer is to a struct task_group, which contains
> a struct rcu_head, which is passed to call_rcu in
> sched_destroy_group(). So some protection really is
> needed -- or is it enough that wake_affine seems to be
> invoked on the current task? If the latter, we would
> need to add a "task == current" check to task_subsys_state().
>
> o task_group() calls task_subsys_state(), returning a pointer to
> the enclosing task_group structure.
>
> o task_subsys_state() returns an rcu_dereference_check()ed
> pointer. The caller must either be in an RCU read-side
> critical section, hold the ->alloc_lock, or hold the
> cgroup lock.
>
> Now wake_affine() appears to be doing load calculations, so it does not
> seem reasonable to acquire the lock. Hence the use of RCU.
>
> So, what should we be doing instead? ;-)
Well, start by writing a sane changlog ;-)
I realise you didn't actually wrote these patches, but you should push
back to the people feeding you these things (esp when you get gems like:
tg = task_group();
rcu_read_unlock();
which is obvious utter garbage).
There's _two_ task_group() users in wake_affine(), at least one should
be covered by the rq->lock we're holding. It should then explain why the
other isn't covered (and which the other is).
It should also explain why using RCU read lock is the right solution,
and doesn't result in funny races. That is, the current changelog reads
like: "It whines, this makes it quiet." -- which I totally distrust
because we already found at least two actual bugs in this area
(sched-cgroup rcu usage).
That said, the two patches together might not be wrong, but its very
hard to verify without more information.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-17 8:49 ` Peter Zijlstra
@ 2010-06-22 20:44 ` Paul E. McKenney
2010-06-23 8:08 ` Peter Zijlstra
0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2010-06-22 20:44 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf,
miles.lane, manfred
On Thu, Jun 17, 2010 at 10:49:14AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote:
>
> > Hello, Peter!
> >
> > Here is the story as I understand it:
> >
> > o wake_affine() calls task_group() and uses the resulting
> > pointer, for example, passing it to effective_load().
> >
> > This pointer is to a struct task_group, which contains
> > a struct rcu_head, which is passed to call_rcu in
> > sched_destroy_group(). So some protection really is
> > needed -- or is it enough that wake_affine seems to be
> > invoked on the current task? If the latter, we would
> > need to add a "task == current" check to task_subsys_state().
> >
> > o task_group() calls task_subsys_state(), returning a pointer to
> > the enclosing task_group structure.
> >
> > o task_subsys_state() returns an rcu_dereference_check()ed
> > pointer. The caller must either be in an RCU read-side
> > critical section, hold the ->alloc_lock, or hold the
> > cgroup lock.
> >
> > Now wake_affine() appears to be doing load calculations, so it does not
> > seem reasonable to acquire the lock. Hence the use of RCU.
> >
> > So, what should we be doing instead? ;-)
>
> Well, start by writing a sane changlog ;-)
As soon as I learn the relevant definition of "sane" for this context. ;-)
> I realise you didn't actually wrote these patches, but you should push
> back to the people feeding you these things (esp when you get gems like:
>
> tg = task_group();
> rcu_read_unlock();
>
> which is obvious utter garbage).
Agreed. If you prefer, I can combine the two patches to avoid the
appearance of insanity. (The second patch of the pair adjusts the
rcu_read_unlock() to cover all uses of the "tg" pointer.)
> There's _two_ task_group() users in wake_affine(), at least one should
> be covered by the rq->lock we're holding. It should then explain why the
> other isn't covered (and which the other is).
I am probably missing something, but I see wake_affine() only called
from select_task_rq_fair(), which is one of the possible values for
->select_task_rq(). This can be called from select_task_rq(), which
claims that it can be called without holding rq->lock. I do not see
any rq->lock acquisition on the path from select_task_rq() to the
call to wake_affine().
(I am looking at 2.6.34, FWIW.)
> It should also explain why using RCU read lock is the right solution,
> and doesn't result in funny races. That is, the current changelog reads
> like: "It whines, this makes it quiet." -- which I totally distrust
> because we already found at least two actual bugs in this area
> (sched-cgroup rcu usage).
The usage appears to be heuristic in nature, so that processing old
data should be non-fatal.
> That said, the two patches together might not be wrong, but its very
> hard to verify without more information.
Left to myself, I would combine the two patches and use the changelog
shown below. Does this work for you?
Thanx, Paul
rcu: apply RCU protection to wake_affine()
The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.
Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
2010-06-22 20:44 ` Paul E. McKenney
@ 2010-06-23 8:08 ` Peter Zijlstra
0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2010-06-23 8:08 UTC (permalink / raw)
To: paulmck
Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf,
miles.lane, manfred
On Tue, 2010-06-22 at 13:44 -0700, Paul E. McKenney wrote:
> I am probably missing something, but I see wake_affine() only called
> from select_task_rq_fair(), which is one of the possible values for
> ->select_task_rq(). This can be called from select_task_rq(), which
> claims that it can be called without holding rq->lock. I do not see
> any rq->lock acquisition on the path from select_task_rq() to the
> call to wake_affine().
>
You're right, although try_to_wake_up(), wake_up_new_task() and
sched_exec() all hold a rq->lock (not sufficient to cover both
task_group() callers though).
I posted a patch yesterday that makes try_to_wake_up() call
select_task_rq() without any rq->lock held (although its a scary patch
and needs more work).
> rcu: apply RCU protection to wake_affine()
>
> The task_group() function returns a pointer that must be protected
> by either RCU, the ->alloc_lock, or the cgroup lock (see the
> rcu_dereference_check() in task_subsys_state(), which is invoked by
> task_group()). The wake_affine() function currently does none of these,
> which means that a concurrent update would be within its rights to free
> the structure returned by task_group(). Because wake_affine() uses this
> structure only to compute load-balancing heuristics, there is no reason
> to acquire either of the two locks.
>
> Therefore, this commit introduces an RCU read-side critical section that
> starts before the first call to task_group() and ends after the last use
> of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
> pointing out the need to extend the RCU read-side critical section from
> that proposed by the original patch.
>
> Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
OK, fair enough, thanks!
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-06-23 8:09 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-16 4:22 [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes Paul E. McKenney
2010-06-16 5:53 ` Ingo Molnar
2010-06-16 6:23 ` Peter Zijlstra
2010-06-16 22:41 ` Paul E. McKenney
2010-06-17 8:49 ` Peter Zijlstra
2010-06-22 20:44 ` Paul E. McKenney
2010-06-23 8:08 ` Peter Zijlstra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox