From: Ed Tomlinson <edt@aei.ca>
To: paulmck@linux.vnet.ibm.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
Steven Rostedt <rostedt@goodmis.org>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Dipankar Sarma <dipankar@in.ibm.com>,
linux-kernel@vger.kernel.org
Subject: Re: INFO: possible circular locking dependency detected
Date: Sat, 16 Jul 2011 21:56:56 -0400 [thread overview]
Message-ID: <201107162156.56830.edt@aei.ca> (raw)
In-Reply-To: <20110717000217.GB2370@linux.vnet.ibm.com>
On Saturday 16 July 2011 20:02:17 Paul E. McKenney wrote:
> On Sat, Jul 16, 2011 at 03:42:30PM -0400, Ed Tomlinson wrote:
> > On Friday 15 July 2011 18:04:47 Paul E. McKenney wrote:
> > > On Fri, Jul 15, 2011 at 05:48:06PM -0400, Ed Tomlinson wrote:
> > > > On Friday 15 July 2011 12:56:13 Paul E. McKenney wrote:
> > >
> > > [ . . . ]
> > >
> > > > > OK. Ed, would you be willing to try the patch out?
> > > >
> > > > I am booted at the same git commit with a bluetooth and the disable local_bh around softirq()
> > > > patch from this thread. So far so good. Not sure how 'easy' this one is to trigger a second time -
> > > > I've been running with threadirq enabled since .39 came out. Last night was the first deadlock...
> > > > If nothing happened post rc6 to make it more likely it could be a while before it triggers again.
> > >
> > > Thank you for trying it out, Ed! And I know that you will not be shy
> > > should the problem recur. ;-)
> >
> > Found this in dmesg this afternoon. This time, though X was dead, I was able to cancel and restart
> > it. This is with Peter's patch to call softirq() with local_bh disabled.
>
> Hmmm... Was RCU_BOOST enabled? If so, could you please try the
> following patch? If not, more thought is required.
>
Paul,
No boost set.
grover linux # grep RCU .config
# RCU Subsystem
CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=64
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_BOOST is not set
# CONFIG_PROVE_RCU is not set
# CONFIG_SPARSE_RCU_POINTER is not set
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_CPU_STALL_VERBOSE is not set
thinking cap time I would guess.
If I enable boost do you think the patch might help?
Ed
>
> ------------------------------------------------------------------------
>
> rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special
>
> The RCU_BOOST commits for TREE_PREEMPT_RCU introduced an other-task
> write to a new RCU_READ_UNLOCK_BOOSTED bit in the task_struct structure's
> ->rcu_read_unlock_special field, but, as noted by Steven Rostedt, without
> correctly synchronizing all accesses to ->rcu_read_unlock_special.
> This could result in bits in ->rcu_read_unlock_special being spuriously
> set and cleared due to conflicting accesses, which in turn could result
> in deadlocks between the rcu_node structure's ->lock and the scheduler's
> rq and pi locks. These deadlocks would result from RCU incorrectly
> believing that the just-ended RCU read-side critical section had been
> preempted and/or boosted. If that RCU read-side critical section was
> executed with either rq or pi locks held, RCU's ensuing (incorrect)
> calls to the scheduler would cause the scheduler to attempt to once
> again acquire the rq and pi locks, resulting in deadlock. More complex
> deadlock cycles are also possible, involving multiple rq and pi locks
> as well as locks from multiple rcu_node structures.
>
> This commit fixes synchronization by creating ->rcu_boosted field in
> task_struct that is accessed and modified only when holding the ->lock
> in the rcu_node structure on which the task is queued (on that rcu_node
> structure's ->blkd_tasks list). This results in tasks accessing only
> their own current->rcu_read_unlock_special fields, making unsynchronized
> access once again legal, and keeping the rcu_read_unlock() fastpath free
> of atomic instructions and memory barriers.
>
> The reason that the rcu_read_unlock() fastpath does not need to access
> the new current->rcu_boosted field is that this new field cannot
> be non-zero unless the RCU_READ_UNLOCK_BLOCKED bit is set in the
> current->rcu_read_unlock_special field. Therefore, rcu_read_unlock()
> need only test current->rcu_read_unlock_special: if that is zero, then
> current->rcu_boosted must also be zero.
>
> This bug does not affect TINY_PREEMPT_RCU because this implementation
> of RCU accesses current->rcu_read_unlock_special with irqs disabled,
> thus preventing races on the !SMP systems that TINY_PREEMPT_RCU runs on.
>
> Maybe-reported-by: Dave Jones <davej@redhat.com>
> Maybe-reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 496770a..76676a4 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1254,6 +1254,9 @@ struct task_struct {
> #ifdef CONFIG_PREEMPT_RCU
> int rcu_read_lock_nesting;
> char rcu_read_unlock_special;
> +#if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU)
> + int rcu_boosted;
> +#endif /* #if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU) */
> struct list_head rcu_node_entry;
> #endif /* #ifdef CONFIG_PREEMPT_RCU */
> #ifdef CONFIG_TREE_PREEMPT_RCU
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 75113cb..8d38a98 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -342,6 +342,11 @@ static void rcu_read_unlock_special(struct task_struct *t)
> #ifdef CONFIG_RCU_BOOST
> if (&t->rcu_node_entry == rnp->boost_tasks)
> rnp->boost_tasks = np;
> + /* Snapshot and clear ->rcu_boosted with rcu_node lock held. */
> + if (t->rcu_boosted) {
> + special |= RCU_READ_UNLOCK_BOOSTED;
> + t->rcu_boosted = 0;
> + }
> #endif /* #ifdef CONFIG_RCU_BOOST */
> t->rcu_blocked_node = NULL;
>
> @@ -358,7 +363,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
> #ifdef CONFIG_RCU_BOOST
> /* Unboost if we were boosted. */
> if (special & RCU_READ_UNLOCK_BOOSTED) {
> - t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_BOOSTED;
> rt_mutex_unlock(t->rcu_boost_mutex);
> t->rcu_boost_mutex = NULL;
> }
> @@ -1174,7 +1178,7 @@ static int rcu_boost(struct rcu_node *rnp)
> t = container_of(tb, struct task_struct, rcu_node_entry);
> rt_mutex_init_proxy_locked(&mtx, t);
> t->rcu_boost_mutex = &mtx;
> - t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BOOSTED;
> + t->rcu_boosted = 1;
> raw_spin_unlock_irqrestore(&rnp->lock, flags);
> rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */
> rt_mutex_unlock(&mtx); /* Keep lockdep happy. */
>
>
next prev parent reply other threads:[~2011-07-17 1:57 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-07-14 14:49 INFO: possible circular locking dependency detected Sergey Senozhatsky
2011-07-14 16:41 ` Peter Zijlstra
2011-07-14 16:57 ` Paul E. McKenney
2011-07-14 19:16 ` Sergey Senozhatsky
2011-07-14 19:15 ` Sergey Senozhatsky
2011-07-14 19:34 ` Paul E. McKenney
2011-07-14 19:38 ` Dave Jones
2011-07-14 20:33 ` Paul E. McKenney
2011-07-14 19:38 ` Sergey Senozhatsky
2011-07-14 16:58 ` Steven Rostedt
2011-07-14 17:02 ` Steven Rostedt
2011-07-14 17:05 ` Paul E. McKenney
2011-07-14 17:32 ` Steven Rostedt
2011-07-14 17:46 ` Steven Rostedt
2011-07-14 19:18 ` Paul E. McKenney
2011-07-14 19:41 ` Steven Rostedt
2011-07-14 20:33 ` Paul E. McKenney
2011-07-15 11:05 ` Ed Tomlinson
2011-07-15 11:29 ` Peter Zijlstra
2011-07-15 11:35 ` Ed Tomlinson
2011-07-15 11:39 ` Peter Zijlstra
2011-07-15 18:11 ` Paul E. McKenney
2011-07-15 12:42 ` Paul E. McKenney
2011-07-15 13:07 ` Peter Zijlstra
2011-07-15 14:36 ` Paul E. McKenney
2011-07-15 15:04 ` Peter Zijlstra
2011-07-15 15:59 ` Paul E. McKenney
2011-07-15 16:11 ` Peter Zijlstra
2011-07-15 16:56 ` Paul E. McKenney
2011-07-15 21:48 ` Ed Tomlinson
2011-07-15 22:04 ` Paul E. McKenney
2011-07-16 19:42 ` Ed Tomlinson
2011-07-17 0:02 ` Paul E. McKenney
2011-07-17 1:56 ` Ed Tomlinson [this message]
2011-07-17 14:28 ` Paul E. McKenney
2011-07-18 15:15 ` Paul E. McKenney
2011-07-18 9:29 ` Peter Zijlstra
2011-07-18 15:29 ` Paul E. McKenney
2011-07-15 16:55 ` Steven Rostedt
2011-07-15 17:03 ` Paul E. McKenney
2011-07-15 17:16 ` Steven Rostedt
2011-07-15 17:24 ` Paul E. McKenney
2011-07-15 17:42 ` Steven Rostedt
2011-07-15 18:33 ` Paul E. McKenney
-- strict thread matches above, loose matches on Subject: below --
2013-11-20 20:15 Alexei Starovoitov
2013-11-20 23:24 ` Casey Schaufler
2011-08-07 16:22 Justin P. Mattock
2011-08-11 20:57 ` Justin P. Mattock
2009-12-06 10:11 Richard Zidlicky
2009-10-10 23:09 John Kacur
2007-02-08 15:03 Pedro M. López
2006-10-16 14:05 alpha @ steudten Engineering
2006-10-16 14:32 ` Nick Piggin
2006-10-16 15:42 ` Randy Dunlap
2006-10-16 15:46 ` Nick Piggin
2006-10-19 6:02 ` Andrew Morton
2006-10-19 6:30 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201107162156.56830.edt@aei.ca \
--to=edt@aei.ca \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=dipankar@in.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=paulmck@linux.vnet.ibm.com \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.