From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	jiangshanlai@gmail.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
	fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com
Subject: Re: [PATCH tip/core/rcu 18/18] rcu: Better hotplug handling for synchronize_sched_expedited()
Date: Thu, 8 Oct 2015 08:06:39 -0700
Message-ID: <20151008150639.GA3910@linux.vnet.ibm.com>
In-Reply-To: <20151008090128.GI3604@twins.programming.kicks-ass.net>

On Thu, Oct 08, 2015 at 11:01:28AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 07, 2015 at 09:26:53AM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 07, 2015 at 04:26:27PM +0200, Peter Zijlstra wrote:
> 
> > Indeed, that approach looks better than moving rcu_note_context_switch(),
> > which probably results in deadlocks.  I will update my patch accordingly.
> 
> Yeah, calling rcu_note_context_switch() under the rq->lock is asking for
> trouble we don't need.

Please see below for the fixed version.  Thoughts?
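
The net effect on the __schedule() fastpath is roughly the following
(a stripped-down sketch only, with everything else elided -- the real
code is in the kernel/sched/core.c hunk of the patch below):

	/* Sketch of the reworked __schedule() fastpath ordering. */
	local_irq_disable();		/* irqs stay off for the rest of the fastpath */
	rcu_note_context_switch();	/* runs with irqs off, but before rq->lock... */

	smp_mb__before_spinlock();
	raw_spin_lock(&rq->lock);	/* ...so RCU is never invoked under rq->lock */
	lockdep_pin_lock(&rq->lock);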

(Queued for 4.5, want some serious testing on this.)

							Thanx, Paul

------------------------------------------------------------------------

commit 3ab3edf72a59a800e6e59ad3128f5b3b251b8962
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Oct 7 09:10:48 2015 -0700

    rcu: Stop disabling interrupts in scheduler fastpaths
    
    We need the scheduler's fastpaths to be, well, fast, and unnecessarily
    disabling and re-enabling interrupts is not consistent with this goal,
    especially given that there are regions of the scheduler that already
    have interrupts disabled.
    
    This commit therefore moves the call to rcu_note_context_switch()
    to one of the interrupts-disabled regions of the scheduler, and
    removes the now-redundant disabling and re-enabling of interrupts from
    rcu_note_context_switch() and the functions it calls.
    
    Reported-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    [ paulmck: Shift rcu_note_context_switch() to avoid deadlock, as suggested
      by Peter Zijlstra. ]

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 60d15a080d7c..9d3eda39bcd2 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -37,7 +37,7 @@ void rcu_cpu_stall_reset(void);
 /*
  * Note a virtualization-based context switch.  This is simply a
  * wrapper around rcu_note_context_switch(), which allows TINY_RCU
- * to save a few bytes.
+ * to save a few bytes. The caller must have disabled interrupts.
  */
 static inline void rcu_virt_note_context_switch(int cpu)
 {
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c9780fc47391..fbc9b5574e48 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -295,17 +295,16 @@ EXPORT_PER_CPU_SYMBOL_GPL(rcu_qs_ctr);
  * We inform the RCU core by emulating a zero-duration dyntick-idle
  * period, which we in turn do by incrementing the ->dynticks counter
  * by two.
+ *
+ * The caller must have disabled interrupts.
  */
 static void rcu_momentary_dyntick_idle(void)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_dynticks *rdtp;
 	int resched_mask;
 	struct rcu_state *rsp;
 
-	local_irq_save(flags);
-
 	/*
 	 * Yes, we can lose flag-setting operations.  This is OK, because
 	 * the flag will be set again after some delay.
@@ -335,13 +334,12 @@ static void rcu_momentary_dyntick_idle(void)
 		smp_mb__after_atomic(); /* Later stuff after QS. */
 		break;
 	}
-	local_irq_restore(flags);
 }
 
 /*
  * Note a context switch.  This is a quiescent state for RCU-sched,
  * and requires special handling for preemptible RCU.
- * The caller must have disabled preemption.
+ * The caller must have disabled interrupts.
  */
 void rcu_note_context_switch(void)
 {
@@ -371,9 +369,14 @@ EXPORT_SYMBOL_GPL(rcu_note_context_switch);
  */
 void rcu_all_qs(void)
 {
+	unsigned long flags;
+
 	barrier(); /* Avoid RCU read-side critical sections leaking down. */
-	if (unlikely(raw_cpu_read(rcu_sched_qs_mask)))
+	if (unlikely(raw_cpu_read(rcu_sched_qs_mask))) {
+		local_irq_save(flags);
 		rcu_momentary_dyntick_idle();
+		local_irq_restore(flags);
+	}
 	this_cpu_inc(rcu_qs_ctr);
 	barrier(); /* Avoid RCU read-side critical sections leaking up. */
 }
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 97dfa7d57f79..7087fb047e2d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -146,8 +146,8 @@ static void __init rcu_bootup_announce(void)
  * the corresponding expedited grace period will also be the end of the
  * normal grace period.
  */
-static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp,
-				   unsigned long flags) __releases(rnp->lock)
+static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
+	__releases(rnp->lock) /* But leaves rrupts disabled. */
 {
 	int blkd_state = (rnp->gp_tasks ? RCU_GP_TASKS : 0) +
 			 (rnp->exp_tasks ? RCU_EXP_TASKS : 0) +
@@ -235,7 +235,7 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp,
 		rnp->gp_tasks = &t->rcu_node_entry;
 	if (!rnp->exp_tasks && (blkd_state & RCU_EXP_BLKD))
 		rnp->exp_tasks = &t->rcu_node_entry;
-	raw_spin_unlock(&rnp->lock);
+	raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */
 
 	/*
 	 * Report the quiescent state for the expedited GP.  This expedited
@@ -250,7 +250,6 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp,
 	} else {
 		WARN_ON_ONCE(t->rcu_read_unlock_special.b.exp_need_qs);
 	}
-	local_irq_restore(flags);
 }
 
 /*
@@ -285,12 +284,11 @@ static void rcu_preempt_qs(void)
  * predating the current grace period drain, in other words, until
  * rnp->gp_tasks becomes NULL.
  *
- * Caller must disable preemption.
+ * Caller must disable interrupts.
  */
 static void rcu_preempt_note_context_switch(void)
 {
 	struct task_struct *t = current;
-	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
@@ -300,7 +298,7 @@ static void rcu_preempt_note_context_switch(void)
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = this_cpu_ptr(rcu_state_p->rda);
 		rnp = rdp->mynode;
-		raw_spin_lock_irqsave(&rnp->lock, flags);
+		raw_spin_lock(&rnp->lock); /* rrupts already disabled. */
 		smp_mb__after_unlock_lock();
 		t->rcu_read_unlock_special.b.blocked = true;
 		t->rcu_blocked_node = rnp;
@@ -317,7 +315,7 @@ static void rcu_preempt_note_context_switch(void)
 				       (rnp->qsmask & rdp->grpmask)
 				       ? rnp->gpnum
 				       : rnp->gpnum + 1);
-		rcu_preempt_ctxt_queue(rnp, rdp, flags);
+		rcu_preempt_ctxt_queue(rnp, rdp);
 	} else if (t->rcu_read_lock_nesting < 0 &&
 		   t->rcu_read_unlock_special.s) {
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c4e607873d6f..ac246b0b987a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3056,7 +3056,6 @@ static void __sched __schedule(void)
 
 	cpu = smp_processor_id();
 	rq = cpu_rq(cpu);
-	rcu_note_context_switch();
 	prev = rq->curr;
 
 	schedule_debug(prev);
@@ -3064,13 +3063,16 @@ static void __sched __schedule(void)
 	if (sched_feat(HRTICK))
 		hrtick_clear(rq);
 
+	local_irq_disable();
+	rcu_note_context_switch();
+
 	/*
 	 * Make sure that signal_pending_state()->signal_pending() below
 	 * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
 	 * done by the caller to avoid the race with signal_wake_up().
 	 */
 	smp_mb__before_spinlock();
-	raw_spin_lock_irq(&rq->lock);
+	raw_spin_lock(&rq->lock);
 	lockdep_pin_lock(&rq->lock);
 
 	rq->clock_skip_update <<= 1; /* promote REQ to ACT */

