All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <laijs@linux.alibaba.com>
To: linux-kernel@vger.kernel.org
Cc: Lai Jiangshan <laijs@linux.alibaba.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	rcu@vger.kernel.org
Subject: [PATCH 07/11] rcu: set special.b.deferred_qs before wake_up()
Date: Thu, 31 Oct 2019 10:08:02 +0000	[thread overview]
Message-ID: <20191031100806.1326-8-laijs@linux.alibaba.com> (raw)
In-Reply-To: <20191031100806.1326-1-laijs@linux.alibaba.com>

The patch 10f39bb1b2c1
(rcu: protect __rcu_read_unlock() against scheduler-using irq handlers)
unveiled a kind of deadlock and resolved the deadlock problem by
avoiding the condition when ->rcu_read_lock_nesting is zero &&
->rcu_read_unlock_special is non-zero. To achieve it, the commit
used negative values for ->rcu_read_lock_nesting.

But now we have deferred_qs mechanism, we can defer qs rather
than persevere in reporting qs and deadlock. All we need is
setting special.b.deferred_qs before scheduler locks
such as wake_up() and leave the qs deferred and return.

After this change, rcu_read_unlock_special() is safe to be
called in any context, including nested in __rcu_read_unlock()
in interrupt.

This change is important to change ->rcu_read_lock_nesting
back to non-negative and further simplify the rcu_read_unlock().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/rcu/tree_plugin.h | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index e612c77dc446..dbded2b8c792 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -591,6 +591,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	irqs_were_disabled = irqs_disabled_flags(flags);
 	if (preempt_bh_were_disabled || irqs_were_disabled) {
 		bool exp;
+		bool deferred_qs = t->rcu_read_unlock_special.b.deferred_qs;
 		struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 		struct rcu_node *rnp = rdp->mynode;
 
@@ -599,9 +600,18 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		      (rdp->grpmask & rnp->expmask) ||
 		      tick_nohz_full_cpu(rdp->cpu);
 		// Need to defer quiescent state until everything is enabled.
+		// In some cases when in_interrupt() returns false,
+		// raise_softirq_irqoff() has to call wake_up(),
+		// and the !deferred_qs says that scheduler locks
+		// cannot be held, so the wakeup will be safe now.
+		// But this wake_up() may have RCU critical section nested
+		// in the scheduler locks and its rcu_read_unlock() would
+		// call rcu_read_unlock_special() and then wake_up()
+		// recursively and deadlock if deferred_qs is still false.
+		// To avoid it, deferred_qs has to be set beforehand.
+		t->rcu_read_unlock_special.b.deferred_qs = true;
 		if (irqs_were_disabled && use_softirq &&
-		    (in_interrupt() ||
-		     (exp && !t->rcu_read_unlock_special.b.deferred_qs))) {
+		    (in_interrupt() || (exp && !deferred_qs))) {
 			// Using softirq, safe to awaken, and we get
 			// no help from enabling irqs, unlike bh/preempt.
 			raise_softirq_irqoff(RCU_SOFTIRQ);
@@ -620,7 +630,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
 				irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
 			}
 		}
-		t->rcu_read_unlock_special.b.deferred_qs = true;
 		local_irq_restore(flags);
 		return;
 	}
-- 
2.20.1


  parent reply	other threads:[~2019-10-31 10:09 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-31 10:07 [PATCH 00/11] rcu: introduce percpu rcu_preempt_depth Lai Jiangshan
2019-10-31 10:07 ` [PATCH 01/11] rcu: avoid leaking exp_deferred_qs into next GP Lai Jiangshan
2019-10-31 13:43   ` Paul E. McKenney
2019-10-31 18:19     ` Lai Jiangshan
2019-10-31 19:00       ` Paul E. McKenney
2019-10-31 10:07 ` [PATCH 02/11] rcu: fix bug when rcu_exp_handler() in nested interrupt Lai Jiangshan
2019-10-31 13:47   ` Paul E. McKenney
2019-10-31 14:20     ` Lai Jiangshan
2019-10-31 14:31     ` Paul E. McKenney
2019-10-31 15:14       ` Lai Jiangshan
2019-10-31 18:52         ` Paul E. McKenney
2019-11-01  0:19           ` Boqun Feng
2019-11-01  2:29             ` Lai Jiangshan
2019-10-31 10:07 ` [PATCH 03/11] rcu: clean up rcu_preempt_deferred_qs_irqrestore() Lai Jiangshan
2019-10-31 13:52   ` Paul E. McKenney
2019-10-31 15:25     ` Lai Jiangshan
2019-10-31 18:57       ` Paul E. McKenney
2019-10-31 19:02         ` Paul E. McKenney
2019-10-31 10:07 ` [PATCH 04/11] rcu: cleanup rcu_preempt_deferred_qs() Lai Jiangshan
2019-10-31 14:10   ` Paul E. McKenney
2019-10-31 14:35     ` Lai Jiangshan
2019-10-31 15:07       ` Paul E. McKenney
2019-10-31 18:33         ` Lai Jiangshan
2019-10-31 22:45           ` Paul E. McKenney
2019-10-31 10:08 ` [PATCH 05/11] rcu: clean all rcu_read_unlock_special after report qs Lai Jiangshan
2019-11-01 11:54   ` Paul E. McKenney
2019-10-31 10:08 ` [PATCH 06/11] rcu: clear t->rcu_read_unlock_special in one go Lai Jiangshan
2019-11-01 12:10   ` Paul E. McKenney
2019-11-01 16:58     ` Paul E. McKenney
2019-10-31 10:08 ` Lai Jiangshan [this message]
2019-10-31 10:08 ` [PATCH 08/11] rcu: don't use negative ->rcu_read_lock_nesting Lai Jiangshan
2019-11-01 12:33   ` Paul E. McKenney
2019-11-16 13:04     ` Lai Jiangshan
2019-11-17 21:53       ` Paul E. McKenney
2019-11-18  1:54         ` Lai Jiangshan
2019-11-18 14:57           ` Paul E. McKenney
2019-10-31 10:08 ` [PATCH 09/11] rcu: wrap usages of rcu_read_lock_nesting Lai Jiangshan
2019-10-31 10:08 ` [PATCH 10/11] rcu: clear the special.b.need_qs in rcu_note_context_switch() Lai Jiangshan
2019-10-31 10:08 ` [PATCH 11/11] x86,rcu: use percpu rcu_preempt_depth Lai Jiangshan
2019-11-01 12:58   ` Paul E. McKenney
2019-11-01 13:13     ` Peter Zijlstra
2019-11-01 14:30       ` Paul E. McKenney
2019-11-01 15:32         ` Lai Jiangshan
2019-11-01 16:21           ` Paul E. McKenney
2019-11-01 16:21             ` Paul E. McKenney
2019-11-01 15:47       ` Lai Jiangshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191031100806.1326-8-laijs@linux.alibaba.com \
    --to=laijs@linux.alibaba.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.