* [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes
@ 2010-06-16 4:22 Paul E. McKenney
2010-06-16 5:53 ` Ingo Molnar
0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2010-06-16 4:22 UTC (permalink / raw)
To: mingo
Cc: linux-kernel, akpm, a.p.zijlstra, tglx, daniel.blueman, lizf,
miles.lane, manfred
Hello, Ingo,
Here are a few more fixes for RCU-lockdep splats.
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent
This is based off of v2.6.35-rc3.
Thanx, Paul
------------------>
Daniel J Blueman (1):
rcu: fix lockdep splat in wake_affine()
Paul E. McKenney (2):
rcu: fix scope of wake_affine()'s new RCU read-side critical section
idr: fix RCU lockdep splat in idr_get_next()
kernel/sched_fair.c | 2 ++
lib/idr.c | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-16 4:22 [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes Paul E. McKenney @ 2010-06-16 5:53 ` Ingo Molnar 2010-06-16 6:23 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Ingo Molnar @ 2010-06-16 5:53 UTC (permalink / raw) To: Paul E. McKenney Cc: linux-kernel, akpm, a.p.zijlstra, tglx, daniel.blueman, lizf, miles.lane, manfred * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > Hello, Ingo, > > Here are a few more fixes for RCU-lockdep splats. > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent > > This is based off of v2.6.35-rc3. > > Thanx, Paul > > ------------------> > Daniel J Blueman (1): > rcu: fix lockdep splat in wake_affine() > > Paul E. McKenney (2): > rcu: fix scope of wake_affine()'s new RCU read-side critical section > idr: fix RCU lockdep splat in idr_get_next() > > kernel/sched_fair.c | 2 ++ > lib/idr.c | 4 ++-- > 2 files changed, 4 insertions(+), 2 deletions(-) Pulled, thanks a lot Paul! Ingo ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-16 5:53 ` Ingo Molnar @ 2010-06-16 6:23 ` Peter Zijlstra 2010-06-16 22:41 ` Paul E. McKenney 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2010-06-16 6:23 UTC (permalink / raw) To: Ingo Molnar Cc: Paul E. McKenney, linux-kernel, akpm, tglx, daniel.blueman, lizf, miles.lane, manfred On Wed, 2010-06-16 at 07:53 +0200, Ingo Molnar wrote: > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > Hello, Ingo, > > > > Here are a few more fixes for RCU-lockdep splats. > > > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent > > > > This is based off of v2.6.35-rc3. > > > > Thanx, Paul > > > > ------------------> > > Daniel J Blueman (1): > > rcu: fix lockdep splat in wake_affine() > > > > Paul E. McKenney (2): > > rcu: fix scope of wake_affine()'s new RCU read-side critical section > > idr: fix RCU lockdep splat in idr_get_next() > > > > kernel/sched_fair.c | 2 ++ > > lib/idr.c | 4 ++-- > > 2 files changed, 4 insertions(+), 2 deletions(-) > > Pulled, thanks a lot Paul! I'm not at all sure of those two wake_affine() ones.. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-16 6:23 ` Peter Zijlstra @ 2010-06-16 22:41 ` Paul E. McKenney 2010-06-17 8:49 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Paul E. McKenney @ 2010-06-16 22:41 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf, miles.lane, manfred On Wed, Jun 16, 2010 at 08:23:51AM +0200, Peter Zijlstra wrote: > On Wed, 2010-06-16 at 07:53 +0200, Ingo Molnar wrote: > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > > > > > Hello, Ingo, > > > > > > Here are a few more fixes for RCU-lockdep splats. > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent > > > > > > This is based off of v2.6.35-rc3. > > > > > > Thanx, Paul > > > > > > ------------------> > > > Daniel J Blueman (1): > > > rcu: fix lockdep splat in wake_affine() > > > > > > Paul E. McKenney (2): > > > rcu: fix scope of wake_affine()'s new RCU read-side critical section > > > idr: fix RCU lockdep splat in idr_get_next() > > > > > > kernel/sched_fair.c | 2 ++ > > > lib/idr.c | 4 ++-- > > > 2 files changed, 4 insertions(+), 2 deletions(-) > > > > Pulled, thanks a lot Paul! > > I'm not at all sure of those two wake_affine() ones.. Hello, Peter! Here is the story as I understand it: o wake_affine() calls task_group() and uses the resulting pointer, for example, passing it to effective_load(). This pointer is to a struct task_group, which contains a struct rcu_head, which is passed to call_rcu in sched_destroy_group(). So some protection really is needed -- or is it enough that wake_affine seems to be invoked on the current task? If the latter, we would need to add a "task == current" check to task_subsys_state(). o task_group() calls task_subsys_state(), returning a pointer to the enclosing task_group structure. o task_subsys_state() returns an rcu_dereference_check()ed pointer. The caller must either be in an RCU read-side critical section, hold the ->alloc_lock, or hold the cgroup lock. Now wake_affine() appears to be doing load calculations, so it does not seem reasonable to acquire the lock. Hence the use of RCU. So, what should we be doing instead? ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-16 22:41 ` Paul E. McKenney @ 2010-06-17 8:49 ` Peter Zijlstra 2010-06-22 20:44 ` Paul E. McKenney 0 siblings, 1 reply; 7+ messages in thread From: Peter Zijlstra @ 2010-06-17 8:49 UTC (permalink / raw) To: paulmck Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf, miles.lane, manfred On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote: > Hello, Peter! > > Here is the story as I understand it: > > o wake_affine() calls task_group() and uses the resulting > pointer, for example, passing it to effective_load(). > > This pointer is to a struct task_group, which contains > a struct rcu_head, which is passed to call_rcu in > sched_destroy_group(). So some protection really is > needed -- or is it enough that wake_affine seems to be > invoked on the current task? If the latter, we would > need to add a "task == current" check to task_subsys_state(). > > o task_group() calls task_subsys_state(), returning a pointer to > the enclosing task_group structure. > > o task_subsys_state() returns an rcu_dereference_check()ed > pointer. The caller must either be in an RCU read-side > critical section, hold the ->alloc_lock, or hold the > cgroup lock. > > Now wake_affine() appears to be doing load calculations, so it does not > seem reasonable to acquire the lock. Hence the use of RCU. > > So, what should we be doing instead? ;-) Well, start by writing a sane changlog ;-) I realise you didn't actually wrote these patches, but you should push back to the people feeding you these things (esp when you get gems like: tg = task_group(); rcu_read_unlock(); which is obvious utter garbage). There's _two_ task_group() users in wake_affine(), at least one should be covered by the rq->lock we're holding. It should then explain why the other isn't covered (and which the other is). It should also explain why using RCU read lock is the right solution, and doesn't result in funny races. That is, the current changelog reads like: "It whines, this makes it quiet." -- which I totally distrust because we already found at least two actual bugs in this area (sched-cgroup rcu usage). That said, the two patches together might not be wrong, but its very hard to verify without more information. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-17 8:49 ` Peter Zijlstra @ 2010-06-22 20:44 ` Paul E. McKenney 2010-06-23 8:08 ` Peter Zijlstra 0 siblings, 1 reply; 7+ messages in thread From: Paul E. McKenney @ 2010-06-22 20:44 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf, miles.lane, manfred On Thu, Jun 17, 2010 at 10:49:14AM +0200, Peter Zijlstra wrote: > On Wed, 2010-06-16 at 15:41 -0700, Paul E. McKenney wrote: > > > Hello, Peter! > > > > Here is the story as I understand it: > > > > o wake_affine() calls task_group() and uses the resulting > > pointer, for example, passing it to effective_load(). > > > > This pointer is to a struct task_group, which contains > > a struct rcu_head, which is passed to call_rcu in > > sched_destroy_group(). So some protection really is > > needed -- or is it enough that wake_affine seems to be > > invoked on the current task? If the latter, we would > > need to add a "task == current" check to task_subsys_state(). > > > > o task_group() calls task_subsys_state(), returning a pointer to > > the enclosing task_group structure. > > > > o task_subsys_state() returns an rcu_dereference_check()ed > > pointer. The caller must either be in an RCU read-side > > critical section, hold the ->alloc_lock, or hold the > > cgroup lock. > > > > Now wake_affine() appears to be doing load calculations, so it does not > > seem reasonable to acquire the lock. Hence the use of RCU. > > > > So, what should we be doing instead? ;-) > > Well, start by writing a sane changlog ;-) As soon as I learn the relevant definition of "sane" for this context. ;-) > I realise you didn't actually wrote these patches, but you should push > back to the people feeding you these things (esp when you get gems like: > > tg = task_group(); > rcu_read_unlock(); > > which is obvious utter garbage). Agreed. If you prefer, I can combine the two patches to avoid the appearance of insanity. (The second patch of the pair adjusts the rcu_read_unlock() to cover all uses of the "tg" pointer.) > There's _two_ task_group() users in wake_affine(), at least one should > be covered by the rq->lock we're holding. It should then explain why the > other isn't covered (and which the other is). I am probably missing something, but I see wake_affine() only called from select_task_rq_fair(), which is one of the possible values for ->select_task_rq(). This can be called from select_task_rq(), which claims that it can be called without holding rq->lock. I do not see any rq->lock acquisition on the path from select_task_rq() to the call to wake_affine(). (I am looking at 2.6.34, FWIW.) > It should also explain why using RCU read lock is the right solution, > and doesn't result in funny races. That is, the current changelog reads > like: "It whines, this makes it quiet." -- which I totally distrust > because we already found at least two actual bugs in this area > (sched-cgroup rcu usage). The usage appears to be heuristic in nature, so that processing old data should be non-fatal. > That said, the two patches together might not be wrong, but its very > hard to verify without more information. Left to myself, I would combine the two patches and use the changelog shown below. Does this work for you? Thanx, Paul rcu: apply RCU protection to wake_affine() The task_group() function returns a pointer that must be protected by either RCU, the ->alloc_lock, or the cgroup lock (see the rcu_dereference_check() in task_subsys_state(), which is invoked by task_group()). The wake_affine() function currently does none of these, which means that a concurrent update would be within its rights to free the structure returned by task_group(). Because wake_affine() uses this structure only to compute load-balancing heuristics, there is no reason to acquire either of the two locks. Therefore, this commit introduces an RCU read-side critical section that starts before the first call to task_group() and ends after the last use of the "tg" pointer returned from task_group(). Thanks to Li Zefan for pointing out the need to extend the RCU read-side critical section from that proposed by the original patch. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes 2010-06-22 20:44 ` Paul E. McKenney @ 2010-06-23 8:08 ` Peter Zijlstra 0 siblings, 0 replies; 7+ messages in thread From: Peter Zijlstra @ 2010-06-23 8:08 UTC (permalink / raw) To: paulmck Cc: Ingo Molnar, linux-kernel, akpm, tglx, daniel.blueman, lizf, miles.lane, manfred On Tue, 2010-06-22 at 13:44 -0700, Paul E. McKenney wrote: > I am probably missing something, but I see wake_affine() only called > from select_task_rq_fair(), which is one of the possible values for > ->select_task_rq(). This can be called from select_task_rq(), which > claims that it can be called without holding rq->lock. I do not see > any rq->lock acquisition on the path from select_task_rq() to the > call to wake_affine(). > You're right, although try_to_wake_up(), wake_up_new_task() and sched_exec() all hold a rq->lock (not sufficient to cover both task_group() callers though). I posted a patch yesterday that makes try_to_wake_up() call select_task_rq() without any rq->lock held (although its a scary patch and needs more work). > rcu: apply RCU protection to wake_affine() > > The task_group() function returns a pointer that must be protected > by either RCU, the ->alloc_lock, or the cgroup lock (see the > rcu_dereference_check() in task_subsys_state(), which is invoked by > task_group()). The wake_affine() function currently does none of these, > which means that a concurrent update would be within its rights to free > the structure returned by task_group(). Because wake_affine() uses this > structure only to compute load-balancing heuristics, there is no reason > to acquire either of the two locks. > > Therefore, this commit introduces an RCU read-side critical section that > starts before the first call to task_group() and ends after the last use > of the "tg" pointer returned from task_group(). Thanks to Li Zefan for > pointing out the need to extend the RCU read-side critical section from > that proposed by the original patch. > > Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> OK, fair enough, thanks! ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-06-23 8:09 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-06-16 4:22 [GIT PULL rcu/urgent] yet more lockdep-RCU splat fixes Paul E. McKenney 2010-06-16 5:53 ` Ingo Molnar 2010-06-16 6:23 ` Peter Zijlstra 2010-06-16 22:41 ` Paul E. McKenney 2010-06-17 8:49 ` Peter Zijlstra 2010-06-22 20:44 ` Paul E. McKenney 2010-06-23 8:08 ` Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox