public inbox for linux-kernel@vger.kernel.org
* [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes
@ 2009-09-13 16:14 Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks

This patchset provides updates to TREE_PREEMPT_RCU as follows:

1.	Update the TREE_PREEMPT_RCU description to note that it is
	suitable for small machines.

2.	Add some WARN_ON_ONCE() calls to check for (incorrect)
	concurrent grace-period initialization.

3.	Simplify quiescent-state detection (which also speeds up
	TREE_PREEMPT_RCU grace periods slightly).

4.	Fix a thinko in TREE_PREEMPT_RCU's synchronize_rcu() that
	could result in premature grace periods.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down
  2009-09-13 16:14 [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes Paul E. McKenney
@ 2009-09-13 16:15 ` Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, Paul E. McKenney

To quote Valdis:

	This leaves somebody who has a laptop wondering which choice
	is best for a system with only one or two cores that has
	CONFIG_PREEMPT defined. One choice says it scales down nicely,
	the other explicitly has a 'depends on PREEMPT' attached to it...

So add "scales down nicely" to TREE_PREEMPT_RCU to match that of
TREE_RCU.

Suggested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 init/Kconfig |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8e8b76d..4c2c936 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -331,7 +331,8 @@ config TREE_PREEMPT_RCU
 	  This option selects the RCU implementation that is
 	  designed for very large SMP systems with hundreds or
 	  thousands of CPUs, but for which real-time response
-	  is also required.
+	  is also required.  It also scales down nicely to
+	  smaller systems.
 
 endchoice
 
-- 
1.5.2.5



* [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods.
  2009-09-13 16:14 [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
@ 2009-09-13 16:15 ` Paul E. McKenney
  2009-09-13 16:23   ` Daniel Walker
                     ` (2 more replies)
  2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU Paul E. McKenney
  3 siblings, 3 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, Paul E. McKenney

From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Check to make sure that there are no blocked tasks for the previous
grace period while initializing for the next grace period, and verify
that rcu_preempt_qs() is given the correct CPU number and is never
called for an offline CPU.
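As background, WARN_ON_ONCE() evaluates its condition, complains at most once per call site, and returns the condition's truth value, which is what makes such debug checks cheap enough to leave in place. A rough user-space approximation (a sketch only; the kernel version also taints the kernel and prints a full stack trace, and the helper check_cpu() below is hypothetical):

```c
/* User-space sketch of WARN_ON_ONCE() semantics (GCC statement
 * expression, as used throughout the kernel).  Each macro expansion
 * gets its own static flag, so each call site warns at most once. */
#include <stdio.h>

#define WARN_ON_ONCE(cond) ({						\
	static int __warned;						\
	int __ret_warn_once = !!(cond);					\
									\
	if (__ret_warn_once && !__warned) {				\
		__warned = 1;						\
		fprintf(stderr, "WARNING: %s:%d: %s\n",			\
			__FILE__, __LINE__, #cond);			\
	}								\
	__ret_warn_once;						\
})

/* Hypothetical helper mirroring the patch's first check: complain
 * (once) if rcu_preempt_qs() is handed the wrong CPU number. */
static int check_cpu(int cpu, int this_cpu)
{
	return WARN_ON_ONCE(cpu != this_cpu);
}
```

Because the macro returns the condition's value, a caller can both warn and take corrective action, though the checks added by this patch are warn-only.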

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    2 ++
 kernel/rcutree_plugin.h |   25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index bca0aba..3a01405 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -627,6 +627,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	/* Special-case the common single-level case. */
 	if (NUM_RCU_NODES == 1) {
 		rnp->qsmask = rnp->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		spin_unlock_irqrestore(&rnp->lock, flags);
@@ -660,6 +661,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
 		spin_lock(&rnp_cur->lock);	/* irqs already disabled. */
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		spin_unlock(&rnp_cur->lock);	/* irqs already disabled. */
 	}
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 4778936..51413cb 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -86,6 +86,7 @@ static void rcu_preempt_qs(int cpu)
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
+	    	WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
@@ -103,7 +104,11 @@ static void rcu_preempt_qs(int cpu)
 		 * state for the current grace period), then as long
 		 * as that task remains queued, the current grace period
 		 * cannot end.
+		 *
+		 * But first, note that the current CPU must still be
+		 * on line!
 		 */
+	    	WARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
@@ -259,6 +264,18 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Check that the list of blocked tasks for the newly completed grace
+ * period is in fact empty.  It is a serious bug to complete a grace
+ * period that still has RCU readers blocked!  This function must be
+ * invoked -before- updating this rnp's ->gpnum, and the rnp's ->lock
+ * must be held by the caller.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+	WARN_ON_ONCE(!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]));
+}
+
+/*
  * Check for preempted RCU readers for the specified rcu_node structure.
  * If the caller needs a reliable answer, it must hold the rcu_node's
  * ->lock.
@@ -451,6 +468,14 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Because there is no preemptable RCU, there can be no readers blocked,
+ * so there is no need to check for blocked tasks.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+}
+
+/*
  * Because preemptable RCU does not exist, there are never any preempted
  * RCU readers.
  */
-- 
1.5.2.5



* [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting
  2009-09-13 16:14 [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
  2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
@ 2009-09-13 16:15 ` Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
                     ` (2 more replies)
  2009-09-13 16:15 ` [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU Paul E. McKenney
  3 siblings, 3 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, Paul E. McKenney

From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

The earlier approach required two scheduling-clock ticks to note
a preemptable-RCU quiescent state in the situation in which the
scheduling-clock interrupt is unlucky enough to always interrupt an RCU
read-side critical section.  With this change, the quiescent state is
instead noted by the outermost rcu_read_unlock() immediately following the
first scheduling-clock tick, or, alternatively, by the first subsequent
context switch.  Therefore, this change also speeds up grace periods.
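The before/after behavior can be modeled in a toy, single-task user-space sketch (the names mirror the kernel's, but everything here is simplified and hypothetical; the per-CPU data and blocked-task machinery are omitted):

```c
/* Toy model: a scheduling-clock tick that lands inside an RCU
 * read-side critical section can only set RCU_READ_UNLOCK_NEED_QS;
 * the quiescent state itself is now recorded by the outermost
 * rcu_read_unlock(), rather than waiting for a second tick. */
#define RCU_READ_UNLOCK_NEED_QS	(1 << 1)

static int rcu_read_lock_nesting;	/* models t->rcu_read_lock_nesting */
static int rcu_read_unlock_special;	/* models t->rcu_read_unlock_special */
static int passed_quiesc;		/* models rdp->passed_quiesc */

static void rcu_read_lock_model(void)
{
	rcu_read_lock_nesting++;
}

static void scheduling_clock_tick_model(void)
{
	if (rcu_read_lock_nesting == 0)
		passed_quiesc = 1;	/* not in a reader: QS right away */
	else
		rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
}

static void rcu_read_unlock_model(void)
{
	if (--rcu_read_lock_nesting == 0 &&
	    (rcu_read_unlock_special & RCU_READ_UNLOCK_NEED_QS)) {
		rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
		passed_quiesc = 1;	/* QS noted at outermost unlock */
	}
}
```

Under the old scheme, the outermost rcu_read_unlock() could only set RCU_READ_UNLOCK_GOT_QS, and the quiescent state was not actually recorded until a later scheduling-clock tick noticed that flag; here the outermost unlock records it directly.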

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/sched.h   |    1 -
 kernel/rcutree.c        |   15 +++++-------
 kernel/rcutree_plugin.h |   54 ++++++++++++++++++++++------------------------
 3 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 855fd0d..e00ee56 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1731,7 +1731,6 @@ extern cputime_t task_gtime(struct task_struct *p);
 
 #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
 #define RCU_READ_UNLOCK_NEED_QS (1 << 1) /* RCU core needs CPU response. */
-#define RCU_READ_UNLOCK_GOT_QS  (1 << 2) /* CPU has responded to RCU core. */
 
 static inline void rcu_copy_process(struct task_struct *p)
 {
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3a01405..2454999 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -106,27 +106,23 @@ static void __cpuinit rcu_init_percpu_data(int cpu, struct rcu_state *rsp,
  */
 void rcu_sched_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_sched_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	rcu_preempt_qs(cpu);
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
+	rcu_preempt_note_context_switch(cpu);
 }
 
 void rcu_bh_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_bh_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 #ifdef CONFIG_NO_HZ
@@ -610,6 +606,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 
 	/* Advance to a new grace period and initialize state. */
 	rsp->gpnum++;
+	WARN_ON_ONCE(rsp->signaled == RCU_GP_INIT);
 	rsp->signaled = RCU_GP_INIT; /* Hold off force_quiescent_state. */
 	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
 	record_gp_stall_check_time(rsp);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 51413cb..eb4bae3 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -64,34 +64,42 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * not in a quiescent state.  There might be any number of tasks blocked
  * while in an RCU read-side critical section.
  */
-static void rcu_preempt_qs_record(int cpu)
+static void rcu_preempt_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 /*
- * We have entered the scheduler or are between softirqs in ksoftirqd.
- * If we are in an RCU read-side critical section, we need to reflect
- * that in the state of the rcu_node structure corresponding to this CPU.
- * Caller must disable hardirqs.
+ * We have entered the scheduler, and the current task might soon be
+ * context-switched away from.  If this task is in an RCU read-side
+ * critical section, we will no longer be able to rely on the CPU to
+ * record that fact, so we enqueue the task on the appropriate entry
+ * of the blocked_tasks[] array.  The task will dequeue itself when
+ * it exits the outermost enclosing RCU read-side critical section.
+ * Therefore, the current grace period cannot be permitted to complete
+ * until the blocked_tasks[] entry indexed by the low-order bit of
+ * rnp->gpnum empties.
+ *
+ * Caller must disable preemption.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 	struct task_struct *t = current;
+	unsigned long flags;
 	int phase;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
-	    	WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
 		rnp = rdp->mynode;
-		spin_lock(&rnp->lock);
+		spin_lock_irqsave(&rnp->lock, flags);
 		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BLOCKED;
 		t->rcu_blocked_node = rnp;
 
@@ -112,7 +120,7 @@ static void rcu_preempt_qs(int cpu)
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
-		spin_unlock(&rnp->lock);
+		spin_unlock_irqrestore(&rnp->lock, flags);
 	}
 
 	/*
@@ -124,9 +132,8 @@ static void rcu_preempt_qs(int cpu)
 	 * grace period, then the fact that the task has been enqueued
 	 * means that we continue to block the current grace period.
 	 */
-	rcu_preempt_qs_record(cpu);
-	t->rcu_read_unlock_special &= ~(RCU_READ_UNLOCK_NEED_QS |
-					RCU_READ_UNLOCK_GOT_QS);
+	rcu_preempt_qs(cpu);
+	t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 }
 
 /*
@@ -162,7 +169,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	special = t->rcu_read_unlock_special;
 	if (special & RCU_READ_UNLOCK_NEED_QS) {
 		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
-		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_GOT_QS;
+		rcu_preempt_qs(smp_processor_id());
 	}
 
 	/* Hardware IRQ handlers cannot block. */
@@ -199,9 +206,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		 */
 		if (!empty && rnp->qsmask == 0 &&
 		    list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1])) {
-			t->rcu_read_unlock_special &=
-				~(RCU_READ_UNLOCK_NEED_QS |
-				  RCU_READ_UNLOCK_GOT_QS);
+			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 			if (rnp->parent == NULL) {
 				/* Only one rcu_node in the tree. */
 				cpu_quiet_msk_finish(&rcu_preempt_state, flags);
@@ -352,19 +357,12 @@ static void rcu_preempt_check_callbacks(int cpu)
 	struct task_struct *t = current;
 
 	if (t->rcu_read_lock_nesting == 0) {
-		t->rcu_read_unlock_special &=
-			~(RCU_READ_UNLOCK_NEED_QS | RCU_READ_UNLOCK_GOT_QS);
-		rcu_preempt_qs_record(cpu);
+		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
+		rcu_preempt_qs(cpu);
 		return;
 	}
 	if (per_cpu(rcu_preempt_data, cpu).qs_pending) {
-		if (t->rcu_read_unlock_special & RCU_READ_UNLOCK_GOT_QS) {
-			rcu_preempt_qs_record(cpu);
-			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_GOT_QS;
-		} else if (!(t->rcu_read_unlock_special &
-			     RCU_READ_UNLOCK_NEED_QS)) {
-			t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
-		}
+		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
 	}
 }
 
@@ -451,7 +449,7 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * Because preemptable RCU does not exist, we never have to check for
  * CPUs being in quiescent states.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 }
 
-- 
1.5.2.5



* [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU
  2009-09-13 16:14 [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes Paul E. McKenney
                   ` (2 preceding siblings ...)
  2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
@ 2009-09-13 16:15 ` Paul E. McKenney
  2009-09-15  7:18   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2009-09-17 22:11   ` tip-bot for Paul E. McKenney
  3 siblings, 2 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, dvhltc,
	niv, tglx, peterz, rostedt, Valdis.Kletnieks, Paul E. McKenney

From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

The redirection of synchronize_sched() to synchronize_rcu() was appropriate
for TREE_RCU, but not for TREE_PREEMPT_RCU.  Fix this by creating an
underlying synchronize_sched().  TREE_RCU then redirects synchronize_rcu()
to synchronize_sched(), while TREE_PREEMPT_RCU has its own version of
synchronize_rcu().
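Both paths rest on the same completion-based pattern: queue a callback whose only job is to complete() a waiter, then block until it fires. A user-space sketch with POSIX threads (fake_grace_period() and synchronize_sched_model() are hypothetical stand-ins; in the kernel the callback is wakeme_after_rcu(), queued via call_rcu_sched(), so it runs only after a real grace period has elapsed):

```c
#include <pthread.h>

/* Minimal user-space stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t	lock;
	pthread_cond_t	cond;
	int		done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = 1;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Stand-in for the grace-period machinery: in the kernel, the
 * callback runs only after all pre-existing readers have finished;
 * here it simply runs in a helper thread. */
static void *fake_grace_period(void *arg)
{
	complete(arg);			/* plays the role of wakeme_after_rcu() */
	return NULL;
}

/* Model of synchronize_sched(): queue the "callback", then block. */
static int synchronize_sched_model(void)
{
	struct completion c;
	pthread_t gp;

	init_completion(&c);
	if (pthread_create(&gp, NULL, fake_grace_period, &c) != 0)
		return -1;
	wait_for_completion(&c);	/* returns after the "grace period" */
	pthread_join(gp, NULL);
	return c.done;
}
```

The fix in this patch is purely about which grace-period machinery the completion is handed to: TREE_PREEMPT_RCU needs separate rcu and rcu-sched instances of this pattern, whereas TREE_RCU can alias one to the other.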

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   23 +++++------------------
 include/linux/rcutree.h  |    4 ++--
 kernel/rcupdate.c        |   44 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 95e0615..39dce83 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -52,8 +52,13 @@ struct rcu_head {
 };
 
 /* Exported common interfaces */
+#ifdef CONFIG_TREE_PREEMPT_RCU
 extern void synchronize_rcu(void);
+#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+#define synchronize_rcu synchronize_sched
+#endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 extern void synchronize_rcu_bh(void);
+extern void synchronize_sched(void);
 extern void rcu_barrier(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
@@ -262,24 +267,6 @@ struct rcu_synchronize {
 extern void wakeme_after_rcu(struct rcu_head  *head);
 
 /**
- * synchronize_sched - block until all CPUs have exited any non-preemptive
- * kernel code sequences.
- *
- * This means that all preempt_disable code sequences, including NMI and
- * hardware-interrupt handlers, in progress on entry will have completed
- * before this primitive returns.  However, this does not guarantee that
- * softirq handlers will have completed, since in some kernels, these
- * handlers can run in process context, and can block.
- *
- * This primitive provides the guarantees made by the (now removed)
- * synchronize_kernel() API.  In contrast, synchronize_rcu() only
- * guarantees that rcu_read_lock() sections will have completed.
- * In "classic RCU", these two guarantees happen to be one and
- * the same, but can differ in realtime RCU implementations.
- */
-#define synchronize_sched() __synchronize_sched()
-
-/**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
  * @func: actual update function to be invoked after the grace period
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index a893077..00d08c0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -53,6 +53,8 @@ static inline void __rcu_read_unlock(void)
 	preempt_enable();
 }
 
+#define __synchronize_sched() synchronize_rcu()
+
 static inline void exit_rcu(void)
 {
 }
@@ -68,8 +70,6 @@ static inline void __rcu_read_unlock_bh(void)
 	local_bh_enable();
 }
 
-#define __synchronize_sched() synchronize_rcu()
-
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
 
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index bd5d5c8..28d2f24 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -74,6 +74,8 @@ void wakeme_after_rcu(struct rcu_head  *head)
 	complete(&rcu->completion);
 }
 
+#ifdef CONFIG_TREE_PREEMPT_RCU
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
@@ -87,7 +89,7 @@ void synchronize_rcu(void)
 {
 	struct rcu_synchronize rcu;
 
-	if (rcu_blocking_is_gp())
+	if (!rcu_scheduler_active)
 		return;
 
 	init_completion(&rcu.completion);
@@ -98,6 +100,46 @@ void synchronize_rcu(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
+#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+
+/**
+ * synchronize_sched - wait until an rcu-sched grace period has elapsed.
+ *
+ * Control will return to the caller some time after a full rcu-sched
+ * grace period has elapsed, in other words after all currently executing
+ * rcu-sched read-side critical sections have completed.   These read-side
+ * critical sections are delimited by rcu_read_lock_sched() and
+ * rcu_read_unlock_sched(), and may be nested.  Note that preempt_disable(),
+ * local_irq_disable(), and so on may be used in place of
+ * rcu_read_lock_sched().
+ *
+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns.  However, this does not guarantee that
+ * softirq handlers will have completed, since in some kernels, these
+ * handlers can run in process context, and can block.
+ *
+ * This primitive provides the guarantees made by the (now removed)
+ * synchronize_kernel() API.  In contrast, synchronize_rcu() only
+ * guarantees that rcu_read_lock() sections will have completed.
+ * In "classic RCU", these two guarantees happen to be one and
+ * the same, but can differ in realtime RCU implementations.
+ */
+void synchronize_sched(void)
+{
+	struct rcu_synchronize rcu;
+
+	if (rcu_blocking_is_gp())
+		return;
+
+	init_completion(&rcu.completion);
+	/* Will wake me after RCU finished. */
+	call_rcu_sched(&rcu.head, wakeme_after_rcu);
+	/* Wait for it. */
+	wait_for_completion(&rcu.completion);
+}
+EXPORT_SYMBOL_GPL(synchronize_sched);
+
 /**
  * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed.
  *
-- 
1.5.2.5



* Re: [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods.
  2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
@ 2009-09-13 16:23   ` Daniel Walker
  2009-09-13 16:31     ` Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  2 siblings, 1 reply; 16+ messages in thread
From: Daniel Walker @ 2009-09-13 16:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, dvhltc, niv, tglx, peterz, rostedt, Valdis.Kletnieks

On Sun, 2009-09-13 at 09:15 -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Check to make sure that there are no blocked tasks for the previous
> grace period while initializing for the next grace period, and verify
> that rcu_preempt_qs() is given the correct CPU number and is never
> called for an offline CPU.
> 

You've got a couple of whitespace issues in the WARN_ON_ONCE() lines..
As found by checkpatch,

ERROR: code indent should use tabs where possible
#97: FILE: kernel/rcutree_plugin.h:89:
+^I    ^IWARN_ON_ONCE(cpu != smp_processor_id());$

ERROR: code indent should use tabs where possible
#109: FILE: kernel/rcutree_plugin.h:111:
+^I    ^IWARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);$


Could you fix these up?

Daniel



* Re: [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods.
  2009-09-13 16:23   ` Daniel Walker
@ 2009-09-13 16:31     ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2009-09-13 16:31 UTC (permalink / raw)
  To: Daniel Walker
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	josh, dvhltc, niv, tglx, peterz, rostedt, Valdis.Kletnieks

On Sun, Sep 13, 2009 at 09:23:02AM -0700, Daniel Walker wrote:
> On Sun, 2009-09-13 at 09:15 -0700, Paul E. McKenney wrote:
> > From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > Check to make sure that there are no blocked tasks for the previous
> > grace period while initializing for the next grace period, and verify
> > that rcu_preempt_qs() is given the correct CPU number and is never
> > called for an offline CPU.
> > 
> 
> You've got a couple of whitespace issues in the WARN_ON_ONCE() lines..
> As found by checkpatch,
> 
> ERROR: code indent should use tabs where possible
> #97: FILE: kernel/rcutree_plugin.h:89:
> +^I    ^IWARN_ON_ONCE(cpu != smp_processor_id());$
> 
> ERROR: code indent should use tabs where possible
> #109: FILE: kernel/rcutree_plugin.h:111:
> +^I    ^IWARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);$
> 
> Could you fix these up?

Good catch!  Here is a corrected version.

							Thanx, Paul

------------------------------------------------------------------------

From f5807ddbd4fff957e6c2efdc874a740ff40f1c94 Mon Sep 17 00:00:00 2001
From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue, 8 Sep 2009 16:36:30 -0700
Subject: [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods.

Check to make sure that there are no blocked tasks for the previous
grace period while initializing for the next grace period, and verify
that rcu_preempt_qs() is given the correct CPU number and is never
called for an offline CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    2 ++
 kernel/rcutree_plugin.h |   25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index bca0aba..3a01405 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -627,6 +627,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	/* Special-case the common single-level case. */
 	if (NUM_RCU_NODES == 1) {
 		rnp->qsmask = rnp->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		spin_unlock_irqrestore(&rnp->lock, flags);
@@ -660,6 +661,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
 		spin_lock(&rnp_cur->lock);	/* irqs already disabled. */
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		spin_unlock(&rnp_cur->lock);	/* irqs already disabled. */
 	}
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 4778936..51413cb 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -86,6 +86,7 @@ static void rcu_preempt_qs(int cpu)
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
+		WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
@@ -103,7 +104,11 @@ static void rcu_preempt_qs(int cpu)
 		 * state for the current grace period), then as long
 		 * as that task remains queued, the current grace period
 		 * cannot end.
+		 *
+		 * But first, note that the current CPU must still be
+		 * on line!
 		 */
+		WARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
@@ -259,6 +264,18 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Check that the list of blocked tasks for the newly completed grace
+ * period is in fact empty.  It is a serious bug to complete a grace
+ * period that still has RCU readers blocked!  This function must be
+ * invoked -before- updating this rnp's ->gpnum, and the rnp's ->lock
+ * must be held by the caller.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+	WARN_ON_ONCE(!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]));
+}
+
+/*
  * Check for preempted RCU readers for the specified rcu_node structure.
  * If the caller needs a reliable answer, it must hold the rcu_node's
  * ->lock.
@@ -451,6 +468,14 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Because there is no preemptable RCU, there can be no readers blocked,
+ * so there is no need to check for blocked tasks.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+}
+
+/*
  * Because preemptable RCU does not exist, there are never any preempted
  * RCU readers.
  */
-- 
1.5.2.5



* [tip:core/urgent] rcu: Kconfig help needs to say that TREE_PREEMPT_RCU scales down
  2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
@ 2009-09-15  7:17   ` tip-bot for Paul E. McKenney
  2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-15  7:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Valdis.Kletnieks, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  da054e04e4d7c8d340ddc8dc45d4f7cad7672b96
Gitweb:     http://git.kernel.org/tip/da054e04e4d7c8d340ddc8dc45d4f7cad7672b96
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:08 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 15 Sep 2009 08:43:58 +0200

rcu: Kconfig help needs to say that TREE_PREEMPT_RCU scales down

To quote Valdis:

    This leaves somebody who has a laptop wondering which
    choice is best for a system with only one or two cores that
    has CONFIG_PREEMPT defined. One choice says it scales down
    nicely, the other explicitly has a 'depends on PREEMPT'
    attached to it...

So add "scales down nicely" to TREE_PREEMPT_RCU to match that of
TREE_RCU.

Suggested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference: <12528585112362-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 init/Kconfig |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8e8b76d..4c2c936 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -331,7 +331,8 @@ config TREE_PREEMPT_RCU
 	  This option selects the RCU implementation that is
 	  designed for very large SMP systems with hundreds or
 	  thousands of CPUs, but for which real-time response
-	  is also required.
+	  is also required.  It also scales down nicely to
+	  smaller systems.
 
 endchoice
 


* [tip:core/urgent] rcu: Add debug checks to TREE_PREEMPT_RCU for premature grace periods
  2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
  2009-09-13 16:23   ` Daniel Walker
@ 2009-09-15  7:17   ` tip-bot for Paul E. McKenney
  2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-15  7:17 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  429e6f07df20175fa59927df415c41c5e1d82d91
Gitweb:     http://git.kernel.org/tip/429e6f07df20175fa59927df415c41c5e1d82d91
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:09 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 15 Sep 2009 08:43:58 +0200

rcu: Add debug checks to TREE_PREEMPT_RCU for premature grace periods

Check to make sure that there are no blocked tasks for the previous
grace period while initializing for the next grace period, and verify
that rcu_preempt_qs() is given the correct CPU number and is never
called for an offline CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111986-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 kernel/rcutree.c        |    2 ++
 kernel/rcutree_plugin.h |   25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index da301e2..e9a4ae9 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -632,6 +632,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	/* Special-case the common single-level case. */
 	if (NUM_RCU_NODES == 1) {
 		rnp->qsmask = rnp->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		spin_unlock_irqrestore(&rnp->lock, flags);
@@ -665,6 +666,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
 		spin_lock(&rnp_cur->lock);	/* irqs already disabled. */
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		spin_unlock(&rnp_cur->lock);	/* irqs already disabled. */
 	}
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 4778936..b8e4b03 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -86,6 +86,7 @@ static void rcu_preempt_qs(int cpu)
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
+		WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
@@ -103,7 +104,11 @@ static void rcu_preempt_qs(int cpu)
 		 * state for the current grace period), then as long
 		 * as that task remains queued, the current grace period
 		 * cannot end.
+		 *
+		 * But first, note that the current CPU must still be
+		 * on line!
 		 */
+		WARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
@@ -259,6 +264,18 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Check that the list of blocked tasks for the newly completed grace
+ * period is in fact empty.  It is a serious bug to complete a grace
+ * period that still has RCU readers blocked!  This function must be
+ * invoked -before- updating this rnp's ->gpnum, and the rnp's ->lock
+ * must be held by the caller.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+	WARN_ON_ONCE(!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]));
+}
+
+/*
  * Check for preempted RCU readers for the specified rcu_node structure.
  * If the caller needs a reliable answer, it must hold the rcu_node's
  * ->lock.
@@ -451,6 +468,14 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Because there is no preemptable RCU, there can be no readers blocked,
+ * so there is no need to check for blocked tasks.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+}
+
+/*
  * Because preemptable RCU does not exist, there are never any preempted
  * RCU readers.
  */


* [tip:core/urgent] rcu: Simplify rcu_read_unlock_special() quiescent-state accounting
  2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
@ 2009-09-15  7:17   ` tip-bot for Paul E. McKenney
  2009-09-15 19:53   ` [PATCH tip/core/rcu 3/4] " Josh Triplett
  2009-09-17 22:11   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-15  7:17 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, josh, tglx, mingo

Commit-ID:  ddaad21c6848c599edc9432747a5295ea4d060df
Gitweb:     http://git.kernel.org/tip/ddaad21c6848c599edc9432747a5295ea4d060df
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:10 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 15 Sep 2009 08:43:59 +0200

rcu: Simplify rcu_read_unlock_special() quiescent-state accounting

The earlier approach required two scheduling-clock ticks to note a
preemptable-RCU quiescent state in the situation in which the
scheduling-clock interrupt is unlucky enough to always interrupt an
RCU read-side critical section.

With this change, the quiescent state is instead noted by the
outermost rcu_read_unlock() immediately following the first
scheduling-clock tick, or, alternatively, by the first subsequent
context switch.  Therefore, this change also speeds up grace
periods.

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111945-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/sched.h   |    1 -
 kernel/rcutree.c        |   15 +++++-------
 kernel/rcutree_plugin.h |   54 ++++++++++++++++++++++------------------------
 3 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f3d74bd..c62a9f8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1740,7 +1740,6 @@ extern cputime_t task_gtime(struct task_struct *p);
 
 #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
 #define RCU_READ_UNLOCK_NEED_QS (1 << 1) /* RCU core needs CPU response. */
-#define RCU_READ_UNLOCK_GOT_QS  (1 << 2) /* CPU has responded to RCU core. */
 
 static inline void rcu_copy_process(struct task_struct *p)
 {
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e9a4ae9..6c99553 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -107,27 +107,23 @@ static void __cpuinit rcu_init_percpu_data(int cpu, struct rcu_state *rsp,
  */
 void rcu_sched_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_sched_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	rcu_preempt_qs(cpu);
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
+	rcu_preempt_note_context_switch(cpu);
 }
 
 void rcu_bh_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_bh_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 #ifdef CONFIG_NO_HZ
@@ -615,6 +611,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 
 	/* Advance to a new grace period and initialize state. */
 	rsp->gpnum++;
+	WARN_ON_ONCE(rsp->signaled == RCU_GP_INIT);
 	rsp->signaled = RCU_GP_INIT; /* Hold off force_quiescent_state. */
 	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
 	record_gp_stall_check_time(rsp);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index b8e4b03..c9616e4 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -64,34 +64,42 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * not in a quiescent state.  There might be any number of tasks blocked
  * while in an RCU read-side critical section.
  */
-static void rcu_preempt_qs_record(int cpu)
+static void rcu_preempt_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 /*
- * We have entered the scheduler or are between softirqs in ksoftirqd.
- * If we are in an RCU read-side critical section, we need to reflect
- * that in the state of the rcu_node structure corresponding to this CPU.
- * Caller must disable hardirqs.
+ * We have entered the scheduler, and the current task might soon be
+ * context-switched away from.  If this task is in an RCU read-side
+ * critical section, we will no longer be able to rely on the CPU to
+ * record that fact, so we enqueue the task on the appropriate entry
+ * of the blocked_tasks[] array.  The task will dequeue itself when
+ * it exits the outermost enclosing RCU read-side critical section.
+ * Therefore, the current grace period cannot be permitted to complete
+ * until the blocked_tasks[] entry indexed by the low-order bit of
+ * rnp->gpnum empties.
+ *
+ * Caller must disable preemption.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 	struct task_struct *t = current;
+	unsigned long flags;
 	int phase;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
-		WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
 		rnp = rdp->mynode;
-		spin_lock(&rnp->lock);
+		spin_lock_irqsave(&rnp->lock, flags);
 		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BLOCKED;
 		t->rcu_blocked_node = rnp;
 
@@ -112,7 +120,7 @@ static void rcu_preempt_qs(int cpu)
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
-		spin_unlock(&rnp->lock);
+		spin_unlock_irqrestore(&rnp->lock, flags);
 	}
 
 	/*
@@ -124,9 +132,8 @@ static void rcu_preempt_qs(int cpu)
 	 * grace period, then the fact that the task has been enqueued
 	 * means that we continue to block the current grace period.
 	 */
-	rcu_preempt_qs_record(cpu);
-	t->rcu_read_unlock_special &= ~(RCU_READ_UNLOCK_NEED_QS |
-					RCU_READ_UNLOCK_GOT_QS);
+	rcu_preempt_qs(cpu);
+	t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 }
 
 /*
@@ -162,7 +169,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	special = t->rcu_read_unlock_special;
 	if (special & RCU_READ_UNLOCK_NEED_QS) {
 		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
-		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_GOT_QS;
+		rcu_preempt_qs(smp_processor_id());
 	}
 
 	/* Hardware IRQ handlers cannot block. */
@@ -199,9 +206,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		 */
 		if (!empty && rnp->qsmask == 0 &&
 		    list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1])) {
-			t->rcu_read_unlock_special &=
-				~(RCU_READ_UNLOCK_NEED_QS |
-				  RCU_READ_UNLOCK_GOT_QS);
+			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 			if (rnp->parent == NULL) {
 				/* Only one rcu_node in the tree. */
 				cpu_quiet_msk_finish(&rcu_preempt_state, flags);
@@ -352,19 +357,12 @@ static void rcu_preempt_check_callbacks(int cpu)
 	struct task_struct *t = current;
 
 	if (t->rcu_read_lock_nesting == 0) {
-		t->rcu_read_unlock_special &=
-			~(RCU_READ_UNLOCK_NEED_QS | RCU_READ_UNLOCK_GOT_QS);
-		rcu_preempt_qs_record(cpu);
+		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
+		rcu_preempt_qs(cpu);
 		return;
 	}
 	if (per_cpu(rcu_preempt_data, cpu).qs_pending) {
-		if (t->rcu_read_unlock_special & RCU_READ_UNLOCK_GOT_QS) {
-			rcu_preempt_qs_record(cpu);
-			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_GOT_QS;
-		} else if (!(t->rcu_read_unlock_special &
-			     RCU_READ_UNLOCK_NEED_QS)) {
-			t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
-		}
+		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
 	}
 }
 
@@ -451,7 +449,7 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * Because preemptable RCU does not exist, we never have to check for
  * CPUs being in quiescent states.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 }
 


* [tip:core/urgent] rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU
  2009-09-13 16:15 ` [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU Paul E. McKenney
@ 2009-09-15  7:18   ` tip-bot for Paul E. McKenney
  2009-09-17 22:11   ` tip-bot for Paul E. McKenney
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-15  7:18 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  366b04ca60c70479e2959fe8485b87ff380fdbbf
Gitweb:     http://git.kernel.org/tip/366b04ca60c70479e2959fe8485b87ff380fdbbf
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:11 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 15 Sep 2009 08:43:59 +0200

rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU

The redirection of synchronize_sched() to synchronize_rcu() was
appropriate for TREE_RCU, but not for TREE_PREEMPT_RCU.

Fix this by creating an underlying synchronize_sched().  TREE_RCU
then redirects synchronize_rcu() to synchronize_sched(), while
TREE_PREEMPT_RCU has its own version of synchronize_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111916-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/rcupdate.h |   23 +++++------------------
 include/linux/rcutree.h  |    4 ++--
 kernel/rcupdate.c        |   44 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 95e0615..39dce83 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -52,8 +52,13 @@ struct rcu_head {
 };
 
 /* Exported common interfaces */
+#ifdef CONFIG_TREE_PREEMPT_RCU
 extern void synchronize_rcu(void);
+#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+#define synchronize_rcu synchronize_sched
+#endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 extern void synchronize_rcu_bh(void);
+extern void synchronize_sched(void);
 extern void rcu_barrier(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
@@ -262,24 +267,6 @@ struct rcu_synchronize {
 extern void wakeme_after_rcu(struct rcu_head  *head);
 
 /**
- * synchronize_sched - block until all CPUs have exited any non-preemptive
- * kernel code sequences.
- *
- * This means that all preempt_disable code sequences, including NMI and
- * hardware-interrupt handlers, in progress on entry will have completed
- * before this primitive returns.  However, this does not guarantee that
- * softirq handlers will have completed, since in some kernels, these
- * handlers can run in process context, and can block.
- *
- * This primitive provides the guarantees made by the (now removed)
- * synchronize_kernel() API.  In contrast, synchronize_rcu() only
- * guarantees that rcu_read_lock() sections will have completed.
- * In "classic RCU", these two guarantees happen to be one and
- * the same, but can differ in realtime RCU implementations.
- */
-#define synchronize_sched() __synchronize_sched()
-
-/**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
  * @func: actual update function to be invoked after the grace period
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index a893077..00d08c0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -53,6 +53,8 @@ static inline void __rcu_read_unlock(void)
 	preempt_enable();
 }
 
+#define __synchronize_sched() synchronize_rcu()
+
 static inline void exit_rcu(void)
 {
 }
@@ -68,8 +70,6 @@ static inline void __rcu_read_unlock_bh(void)
 	local_bh_enable();
 }
 
-#define __synchronize_sched() synchronize_rcu()
-
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
 
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index bd5d5c8..28d2f24 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -74,6 +74,8 @@ void wakeme_after_rcu(struct rcu_head  *head)
 	complete(&rcu->completion);
 }
 
+#ifdef CONFIG_TREE_PREEMPT_RCU
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
@@ -87,7 +89,7 @@ void synchronize_rcu(void)
 {
 	struct rcu_synchronize rcu;
 
-	if (rcu_blocking_is_gp())
+	if (!rcu_scheduler_active)
 		return;
 
 	init_completion(&rcu.completion);
@@ -98,6 +100,46 @@ void synchronize_rcu(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
+#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+
+/**
+ * synchronize_sched - wait until an rcu-sched grace period has elapsed.
+ *
+ * Control will return to the caller some time after a full rcu-sched
+ * grace period has elapsed, in other words after all currently executing
+ * rcu-sched read-side critical sections have completed.   These read-side
+ * critical sections are delimited by rcu_read_lock_sched() and
+ * rcu_read_unlock_sched(), and may be nested.  Note that preempt_disable(),
+ * local_irq_disable(), and so on may be used in place of
+ * rcu_read_lock_sched().
+ *
+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns.  However, this does not guarantee that
+ * softirq handlers will have completed, since in some kernels, these
+ * handlers can run in process context, and can block.
+ *
+ * This primitive provides the guarantees made by the (now removed)
+ * synchronize_kernel() API.  In contrast, synchronize_rcu() only
+ * guarantees that rcu_read_lock() sections will have completed.
+ * In "classic RCU", these two guarantees happen to be one and
+ * the same, but can differ in realtime RCU implementations.
+ */
+void synchronize_sched(void)
+{
+	struct rcu_synchronize rcu;
+
+	if (rcu_blocking_is_gp())
+		return;
+
+	init_completion(&rcu.completion);
+	/* Will wake me after RCU finished. */
+	call_rcu_sched(&rcu.head, wakeme_after_rcu);
+	/* Wait for it. */
+	wait_for_completion(&rcu.completion);
+}
+EXPORT_SYMBOL_GPL(synchronize_sched);
+
 /**
  * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed.
  *


* Re: [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting
  2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
@ 2009-09-15 19:53   ` Josh Triplett
  2009-09-17 22:11   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 16+ messages in thread
From: Josh Triplett @ 2009-09-15 19:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
	dvhltc, niv, tglx, peterz, rostedt, Valdis.Kletnieks

On Sun, Sep 13, 2009 at 09:15:10AM -0700, Paul E. McKenney wrote:
> From: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> The earlier approach required two scheduling-clock ticks to note
> a preemptable-RCU quiescent state in the situation in which the
> scheduling-clock interrupt is unlucky enough to always interrupt an RCU
> read-side critical section.  With this change, the quiescent state is
> instead noted by the outermost rcu_read_unlock() immediately following the
> first scheduling-clock tick, or, alternatively, by the first subsequent
> context switch.  Therefore, this change also speeds up grace periods.
> 
> Suggested-by: Josh Triplett <josh@joshtriplett.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Acked-by: Josh Triplett <josh@joshtriplett.org>

(patch left quoted for context)

> ---
>  include/linux/sched.h   |    1 -
>  kernel/rcutree.c        |   15 +++++-------
>  kernel/rcutree_plugin.h |   54 ++++++++++++++++++++++------------------------
>  3 files changed, 32 insertions(+), 38 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 855fd0d..e00ee56 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1731,7 +1731,6 @@ extern cputime_t task_gtime(struct task_struct *p);
>  
>  #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
>  #define RCU_READ_UNLOCK_NEED_QS (1 << 1) /* RCU core needs CPU response. */
> -#define RCU_READ_UNLOCK_GOT_QS  (1 << 2) /* CPU has responded to RCU core. */
>  
>  static inline void rcu_copy_process(struct task_struct *p)
>  {
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 3a01405..2454999 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -106,27 +106,23 @@ static void __cpuinit rcu_init_percpu_data(int cpu, struct rcu_state *rsp,
>   */
>  void rcu_sched_qs(int cpu)
>  {
> -	unsigned long flags;
>  	struct rcu_data *rdp;
>  
> -	local_irq_save(flags);
>  	rdp = &per_cpu(rcu_sched_data, cpu);
> -	rdp->passed_quiesc = 1;
>  	rdp->passed_quiesc_completed = rdp->completed;
> -	rcu_preempt_qs(cpu);
> -	local_irq_restore(flags);
> +	barrier();
> +	rdp->passed_quiesc = 1;
> +	rcu_preempt_note_context_switch(cpu);
>  }
>  
>  void rcu_bh_qs(int cpu)
>  {
> -	unsigned long flags;
>  	struct rcu_data *rdp;
>  
> -	local_irq_save(flags);
>  	rdp = &per_cpu(rcu_bh_data, cpu);
> -	rdp->passed_quiesc = 1;
>  	rdp->passed_quiesc_completed = rdp->completed;
> -	local_irq_restore(flags);
> +	barrier();
> +	rdp->passed_quiesc = 1;
>  }
>  
>  #ifdef CONFIG_NO_HZ
> @@ -610,6 +606,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
>  
>  	/* Advance to a new grace period and initialize state. */
>  	rsp->gpnum++;
> +	WARN_ON_ONCE(rsp->signaled == RCU_GP_INIT);
>  	rsp->signaled = RCU_GP_INIT; /* Hold off force_quiescent_state. */
>  	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
>  	record_gp_stall_check_time(rsp);
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 51413cb..eb4bae3 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -64,34 +64,42 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
>   * not in a quiescent state.  There might be any number of tasks blocked
>   * while in an RCU read-side critical section.
>   */
> -static void rcu_preempt_qs_record(int cpu)
> +static void rcu_preempt_qs(int cpu)
>  {
>  	struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu);
> -	rdp->passed_quiesc = 1;
>  	rdp->passed_quiesc_completed = rdp->completed;
> +	barrier();
> +	rdp->passed_quiesc = 1;
>  }
>  
>  /*
> - * We have entered the scheduler or are between softirqs in ksoftirqd.
> - * If we are in an RCU read-side critical section, we need to reflect
> - * that in the state of the rcu_node structure corresponding to this CPU.
> - * Caller must disable hardirqs.
> + * We have entered the scheduler, and the current task might soon be
> + * context-switched away from.  If this task is in an RCU read-side
> + * critical section, we will no longer be able to rely on the CPU to
> + * record that fact, so we enqueue the task on the appropriate entry
> + * of the blocked_tasks[] array.  The task will dequeue itself when
> + * it exits the outermost enclosing RCU read-side critical section.
> + * Therefore, the current grace period cannot be permitted to complete
> + * until the blocked_tasks[] entry indexed by the low-order bit of
> + * rnp->gpnum empties.
> + *
> + * Caller must disable preemption.
>   */
> -static void rcu_preempt_qs(int cpu)
> +static void rcu_preempt_note_context_switch(int cpu)
>  {
>  	struct task_struct *t = current;
> +	unsigned long flags;
>  	int phase;
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp;
>  
>  	if (t->rcu_read_lock_nesting &&
>  	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
> -	    	WARN_ON_ONCE(cpu != smp_processor_id());
>  
>  		/* Possibly blocking in an RCU read-side critical section. */
>  		rdp = rcu_preempt_state.rda[cpu];
>  		rnp = rdp->mynode;
> -		spin_lock(&rnp->lock);
> +		spin_lock_irqsave(&rnp->lock, flags);
>  		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BLOCKED;
>  		t->rcu_blocked_node = rnp;
>  
> @@ -112,7 +120,7 @@ static void rcu_preempt_qs(int cpu)
>  		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
>  		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
>  		smp_mb();  /* Ensure later ctxt swtch seen after above. */
> -		spin_unlock(&rnp->lock);
> +		spin_unlock_irqrestore(&rnp->lock, flags);
>  	}
>  
>  	/*
> @@ -124,9 +132,8 @@ static void rcu_preempt_qs(int cpu)
>  	 * grace period, then the fact that the task has been enqueued
>  	 * means that we continue to block the current grace period.
>  	 */
> -	rcu_preempt_qs_record(cpu);
> -	t->rcu_read_unlock_special &= ~(RCU_READ_UNLOCK_NEED_QS |
> -					RCU_READ_UNLOCK_GOT_QS);
> +	rcu_preempt_qs(cpu);
> +	t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
>  }
>  
>  /*
> @@ -162,7 +169,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
>  	special = t->rcu_read_unlock_special;
>  	if (special & RCU_READ_UNLOCK_NEED_QS) {
>  		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
> -		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_GOT_QS;
> +		rcu_preempt_qs(smp_processor_id());
>  	}
>  
>  	/* Hardware IRQ handlers cannot block. */
> @@ -199,9 +206,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
>  		 */
>  		if (!empty && rnp->qsmask == 0 &&
>  		    list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1])) {
> -			t->rcu_read_unlock_special &=
> -				~(RCU_READ_UNLOCK_NEED_QS |
> -				  RCU_READ_UNLOCK_GOT_QS);
> +			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
>  			if (rnp->parent == NULL) {
>  				/* Only one rcu_node in the tree. */
>  				cpu_quiet_msk_finish(&rcu_preempt_state, flags);
> @@ -352,19 +357,12 @@ static void rcu_preempt_check_callbacks(int cpu)
>  	struct task_struct *t = current;
>  
>  	if (t->rcu_read_lock_nesting == 0) {
> -		t->rcu_read_unlock_special &=
> -			~(RCU_READ_UNLOCK_NEED_QS | RCU_READ_UNLOCK_GOT_QS);
> -		rcu_preempt_qs_record(cpu);
> +		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
> +		rcu_preempt_qs(cpu);
>  		return;
>  	}
>  	if (per_cpu(rcu_preempt_data, cpu).qs_pending) {
> -		if (t->rcu_read_unlock_special & RCU_READ_UNLOCK_GOT_QS) {
> -			rcu_preempt_qs_record(cpu);
> -			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_GOT_QS;
> -		} else if (!(t->rcu_read_unlock_special &
> -			     RCU_READ_UNLOCK_NEED_QS)) {
> -			t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
> -		}
> +		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
>  	}
>  }
>  
> @@ -451,7 +449,7 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
>   * Because preemptable RCU does not exist, we never have to check for
>   * CPUs being in quiescent states.
>   */
> -static void rcu_preempt_qs(int cpu)
> +static void rcu_preempt_note_context_switch(int cpu)
>  {
>  }
>  


* [tip:core/urgent] rcu: Kconfig help needs to say that TREE_PREEMPT_RCU scales down
  2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
@ 2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-17 22:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Valdis.Kletnieks, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  bbe3eae8bb039b5ffd64a6e3d1a0deaa1f3cbae9
Gitweb:     http://git.kernel.org/tip/bbe3eae8bb039b5ffd64a6e3d1a0deaa1f3cbae9
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:08 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 18 Sep 2009 00:05:53 +0200

rcu: Kconfig help needs to say that TREE_PREEMPT_RCU scales down

To quote Valdis:

    This leaves somebody who has a laptop wondering which
    choice is best for a system with only one or two cores that
    has CONFIG_PREEMPT defined. One choice says it scales down
    nicely, the other explicitly has a 'depends on PREEMPT'
    attached to it...

So add "scales down nicely" to TREE_PREEMPT_RCU to match that of
TREE_RCU.

Suggested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference: <12528585112362-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 init/Kconfig |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8e8b76d..4c2c936 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -331,7 +331,8 @@ config TREE_PREEMPT_RCU
 	  This option selects the RCU implementation that is
 	  designed for very large SMP systems with hundreds or
 	  thousands of CPUs, but for which real-time response
-	  is also required.
+	  is also required.  It also scales down nicely to
+	  smaller systems.
 
 endchoice
 


* [tip:core/urgent] rcu: Add debug checks to TREE_PREEMPT_RCU for premature grace periods
  2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
  2009-09-13 16:23   ` Daniel Walker
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
@ 2009-09-17 22:10   ` tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-17 22:10 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  b0e165c035b13e1074fa0b555318bd9cb7102558
Gitweb:     http://git.kernel.org/tip/b0e165c035b13e1074fa0b555318bd9cb7102558
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:09 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 18 Sep 2009 00:06:13 +0200

rcu: Add debug checks to TREE_PREEMPT_RCU for premature grace periods

Check to make sure that there are no blocked tasks for the previous
grace period while initializing for the next grace period, and verify
that rcu_preempt_qs() is given the correct CPU number and is never
called for an offline CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111986-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 kernel/rcutree.c        |    2 ++
 kernel/rcutree_plugin.h |   25 +++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index da301e2..e9a4ae9 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -632,6 +632,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	/* Special-case the common single-level case. */
 	if (NUM_RCU_NODES == 1) {
 		rnp->qsmask = rnp->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		spin_unlock_irqrestore(&rnp->lock, flags);
@@ -665,6 +666,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
 		spin_lock(&rnp_cur->lock);	/* irqs already disabled. */
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rcu_preempt_check_blocked_tasks(rnp);
 		rnp->gpnum = rsp->gpnum;
 		spin_unlock(&rnp_cur->lock);	/* irqs already disabled. */
 	}
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 4778936..b8e4b03 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -86,6 +86,7 @@ static void rcu_preempt_qs(int cpu)
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
+		WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
@@ -103,7 +104,11 @@ static void rcu_preempt_qs(int cpu)
 		 * state for the current grace period), then as long
 		 * as that task remains queued, the current grace period
 		 * cannot end.
+		 *
+		 * But first, note that the current CPU must still be
+		 * on line!
 		 */
+		WARN_ON_ONCE((rdp->grpmask & rnp->qsmaskinit) == 0);
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
@@ -259,6 +264,18 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Check that the list of blocked tasks for the newly completed grace
+ * period is in fact empty.  It is a serious bug to complete a grace
+ * period that still has RCU readers blocked!  This function must be
+ * invoked -before- updating this rnp's ->gpnum, and the rnp's ->lock
+ * must be held by the caller.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+	WARN_ON_ONCE(!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]));
+}
+
+/*
  * Check for preempted RCU readers for the specified rcu_node structure.
  * If the caller needs a reliable answer, it must hold the rcu_node's
 * ->lock.
@@ -451,6 +468,14 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
+ * Because there is no preemptable RCU, there can be no readers blocked,
+ * so there is no need to check for blocked tasks.
+ */
+static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
+{
+}
+
+/*
  * Because preemptable RCU does not exist, there are never any preempted
  * RCU readers.
  */


* [tip:core/urgent] rcu: Simplify rcu_read_unlock_special() quiescent-state accounting
  2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
  2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
  2009-09-15 19:53   ` [PATCH tip/core/rcu 3/4] " Josh Triplett
@ 2009-09-17 22:11   ` tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-17 22:11 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, josh, tglx, mingo

Commit-ID:  c3422bea5f09b0e85704f51f2b01271630b8940b
Gitweb:     http://git.kernel.org/tip/c3422bea5f09b0e85704f51f2b01271630b8940b
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:10 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 18 Sep 2009 00:06:33 +0200

rcu: Simplify rcu_read_unlock_special() quiescent-state accounting

The earlier approach required two scheduling-clock ticks to note a
preemptable-RCU quiescent state in the situation in which the
scheduling-clock interrupt is unlucky enough to always interrupt an
RCU read-side critical section.

With this change, the quiescent state is instead noted by the
outermost rcu_read_unlock() immediately following the first
scheduling-clock tick, or, alternatively, by the first subsequent
context switch.  Therefore, this change also speeds up grace
periods.

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111945-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/sched.h   |    1 -
 kernel/rcutree.c        |   15 +++++-------
 kernel/rcutree_plugin.h |   54 ++++++++++++++++++++++------------------------
 3 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f3d74bd..c62a9f8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1740,7 +1740,6 @@ extern cputime_t task_gtime(struct task_struct *p);
 
 #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
 #define RCU_READ_UNLOCK_NEED_QS (1 << 1) /* RCU core needs CPU response. */
-#define RCU_READ_UNLOCK_GOT_QS  (1 << 2) /* CPU has responded to RCU core. */
 
 static inline void rcu_copy_process(struct task_struct *p)
 {
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e9a4ae9..6c99553 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -107,27 +107,23 @@ static void __cpuinit rcu_init_percpu_data(int cpu, struct rcu_state *rsp,
  */
 void rcu_sched_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_sched_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	rcu_preempt_qs(cpu);
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
+	rcu_preempt_note_context_switch(cpu);
 }
 
 void rcu_bh_qs(int cpu)
 {
-	unsigned long flags;
 	struct rcu_data *rdp;
 
-	local_irq_save(flags);
 	rdp = &per_cpu(rcu_bh_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
-	local_irq_restore(flags);
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 #ifdef CONFIG_NO_HZ
@@ -615,6 +611,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 
 	/* Advance to a new grace period and initialize state. */
 	rsp->gpnum++;
+	WARN_ON_ONCE(rsp->signaled == RCU_GP_INIT);
 	rsp->signaled = RCU_GP_INIT; /* Hold off force_quiescent_state. */
 	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
 	record_gp_stall_check_time(rsp);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index b8e4b03..c9616e4 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -64,34 +64,42 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * not in a quiescent state.  There might be any number of tasks blocked
  * while in an RCU read-side critical section.
  */
-static void rcu_preempt_qs_record(int cpu)
+static void rcu_preempt_qs(int cpu)
 {
 	struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu);
-	rdp->passed_quiesc = 1;
 	rdp->passed_quiesc_completed = rdp->completed;
+	barrier();
+	rdp->passed_quiesc = 1;
 }
 
 /*
- * We have entered the scheduler or are between softirqs in ksoftirqd.
- * If we are in an RCU read-side critical section, we need to reflect
- * that in the state of the rcu_node structure corresponding to this CPU.
- * Caller must disable hardirqs.
+ * We have entered the scheduler, and the current task might soon be
+ * context-switched away from.  If this task is in an RCU read-side
+ * critical section, we will no longer be able to rely on the CPU to
+ * record that fact, so we enqueue the task on the appropriate entry
+ * of the blocked_tasks[] array.  The task will dequeue itself when
+ * it exits the outermost enclosing RCU read-side critical section.
+ * Therefore, the current grace period cannot be permitted to complete
+ * until the blocked_tasks[] entry indexed by the low-order bit of
+ * rnp->gpnum empties.
+ *
+ * Caller must disable preemption.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 	struct task_struct *t = current;
+	unsigned long flags;
 	int phase;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
 	if (t->rcu_read_lock_nesting &&
 	    (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BLOCKED) == 0) {
-		WARN_ON_ONCE(cpu != smp_processor_id());
 
 		/* Possibly blocking in an RCU read-side critical section. */
 		rdp = rcu_preempt_state.rda[cpu];
 		rnp = rdp->mynode;
-		spin_lock(&rnp->lock);
+		spin_lock_irqsave(&rnp->lock, flags);
 		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BLOCKED;
 		t->rcu_blocked_node = rnp;
 
@@ -112,7 +120,7 @@ static void rcu_preempt_qs(int cpu)
 		phase = !(rnp->qsmask & rdp->grpmask) ^ (rnp->gpnum & 0x1);
 		list_add(&t->rcu_node_entry, &rnp->blocked_tasks[phase]);
 		smp_mb();  /* Ensure later ctxt swtch seen after above. */
-		spin_unlock(&rnp->lock);
+		spin_unlock_irqrestore(&rnp->lock, flags);
 	}
 
 	/*
@@ -124,9 +132,8 @@ static void rcu_preempt_qs(int cpu)
 	 * grace period, then the fact that the task has been enqueued
 	 * means that we continue to block the current grace period.
 	 */
-	rcu_preempt_qs_record(cpu);
-	t->rcu_read_unlock_special &= ~(RCU_READ_UNLOCK_NEED_QS |
-					RCU_READ_UNLOCK_GOT_QS);
+	rcu_preempt_qs(cpu);
+	t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 }
 
 /*
@@ -162,7 +169,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	special = t->rcu_read_unlock_special;
 	if (special & RCU_READ_UNLOCK_NEED_QS) {
 		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
-		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_GOT_QS;
+		rcu_preempt_qs(smp_processor_id());
 	}
 
 	/* Hardware IRQ handlers cannot block. */
@@ -199,9 +206,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		 */
 		if (!empty && rnp->qsmask == 0 &&
 		    list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1])) {
-			t->rcu_read_unlock_special &=
-				~(RCU_READ_UNLOCK_NEED_QS |
-				  RCU_READ_UNLOCK_GOT_QS);
+			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
 			if (rnp->parent == NULL) {
 				/* Only one rcu_node in the tree. */
 				cpu_quiet_msk_finish(&rcu_preempt_state, flags);
@@ -352,19 +357,12 @@ static void rcu_preempt_check_callbacks(int cpu)
 	struct task_struct *t = current;
 
 	if (t->rcu_read_lock_nesting == 0) {
-		t->rcu_read_unlock_special &=
-			~(RCU_READ_UNLOCK_NEED_QS | RCU_READ_UNLOCK_GOT_QS);
-		rcu_preempt_qs_record(cpu);
+		t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
+		rcu_preempt_qs(cpu);
 		return;
 	}
 	if (per_cpu(rcu_preempt_data, cpu).qs_pending) {
-		if (t->rcu_read_unlock_special & RCU_READ_UNLOCK_GOT_QS) {
-			rcu_preempt_qs_record(cpu);
-			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_GOT_QS;
-		} else if (!(t->rcu_read_unlock_special &
-			     RCU_READ_UNLOCK_NEED_QS)) {
-			t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
-		}
+		t->rcu_read_unlock_special |= RCU_READ_UNLOCK_NEED_QS;
 	}
 }
 
@@ -451,7 +449,7 @@ EXPORT_SYMBOL_GPL(rcu_batches_completed);
  * Because preemptable RCU does not exist, we never have to check for
  * CPUs being in quiescent states.
  */
-static void rcu_preempt_qs(int cpu)
+static void rcu_preempt_note_context_switch(int cpu)
 {
 }
 


* [tip:core/urgent] rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU
  2009-09-13 16:15 ` [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU Paul E. McKenney
  2009-09-15  7:18   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
@ 2009-09-17 22:11   ` tip-bot for Paul E. McKenney
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Paul E. McKenney @ 2009-09-17 22:11 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, paulmck, hpa, mingo, tglx, mingo

Commit-ID:  16e3081191837a6a04733de5cd5d1d1b303140d4
Gitweb:     http://git.kernel.org/tip/16e3081191837a6a04733de5cd5d1d1b303140d4
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Sun, 13 Sep 2009 09:15:11 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 18 Sep 2009 00:06:53 +0200

rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU

The redirection of synchronize_sched() to synchronize_rcu() was
appropriate for TREE_RCU, but not for TREE_PREEMPT_RCU.

Fix this by creating an underlying synchronize_sched().  TREE_RCU
then redirects synchronize_rcu() to synchronize_sched(), while
TREE_PREEMPT_RCU has its own version of synchronize_rcu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference: <12528585111916-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/linux/rcupdate.h |   23 +++++------------------
 include/linux/rcutree.h  |    4 ++--
 kernel/rcupdate.c        |   44 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 95e0615..39dce83 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -52,8 +52,13 @@ struct rcu_head {
 };
 
 /* Exported common interfaces */
+#ifdef CONFIG_TREE_PREEMPT_RCU
 extern void synchronize_rcu(void);
+#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+#define synchronize_rcu synchronize_sched
+#endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 extern void synchronize_rcu_bh(void);
+extern void synchronize_sched(void);
 extern void rcu_barrier(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
@@ -262,24 +267,6 @@ struct rcu_synchronize {
 extern void wakeme_after_rcu(struct rcu_head  *head);
 
 /**
- * synchronize_sched - block until all CPUs have exited any non-preemptive
- * kernel code sequences.
- *
- * This means that all preempt_disable code sequences, including NMI and
- * hardware-interrupt handlers, in progress on entry will have completed
- * before this primitive returns.  However, this does not guarantee that
- * softirq handlers will have completed, since in some kernels, these
- * handlers can run in process context, and can block.
- *
- * This primitive provides the guarantees made by the (now removed)
- * synchronize_kernel() API.  In contrast, synchronize_rcu() only
- * guarantees that rcu_read_lock() sections will have completed.
- * In "classic RCU", these two guarantees happen to be one and
- * the same, but can differ in realtime RCU implementations.
- */
-#define synchronize_sched() __synchronize_sched()
-
-/**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
  * @func: actual update function to be invoked after the grace period
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index a893077..00d08c0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -53,6 +53,8 @@ static inline void __rcu_read_unlock(void)
 	preempt_enable();
 }
 
+#define __synchronize_sched() synchronize_rcu()
+
 static inline void exit_rcu(void)
 {
 }
@@ -68,8 +70,6 @@ static inline void __rcu_read_unlock_bh(void)
 	local_bh_enable();
 }
 
-#define __synchronize_sched() synchronize_rcu()
-
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
 
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index bd5d5c8..28d2f24 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -74,6 +74,8 @@ void wakeme_after_rcu(struct rcu_head  *head)
 	complete(&rcu->completion);
 }
 
+#ifdef CONFIG_TREE_PREEMPT_RCU
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
@@ -87,7 +89,7 @@ void synchronize_rcu(void)
 {
 	struct rcu_synchronize rcu;
 
-	if (rcu_blocking_is_gp())
+	if (!rcu_scheduler_active)
 		return;
 
 	init_completion(&rcu.completion);
@@ -98,6 +100,46 @@ void synchronize_rcu(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
+#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+
+/**
+ * synchronize_sched - wait until an rcu-sched grace period has elapsed.
+ *
+ * Control will return to the caller some time after a full rcu-sched
+ * grace period has elapsed, in other words after all currently executing
+ * rcu-sched read-side critical sections have completed.   These read-side
+ * critical sections are delimited by rcu_read_lock_sched() and
+ * rcu_read_unlock_sched(), and may be nested.  Note that preempt_disable(),
+ * local_irq_disable(), and so on may be used in place of
+ * rcu_read_lock_sched().
+ *
+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns.  However, this does not guarantee that
+ * softirq handlers will have completed, since in some kernels, these
+ * handlers can run in process context, and can block.
+ *
+ * This primitive provides the guarantees made by the (now removed)
+ * synchronize_kernel() API.  In contrast, synchronize_rcu() only
+ * guarantees that rcu_read_lock() sections will have completed.
+ * In "classic RCU", these two guarantees happen to be one and
+ * the same, but can differ in realtime RCU implementations.
+ */
+void synchronize_sched(void)
+{
+	struct rcu_synchronize rcu;
+
+	if (rcu_blocking_is_gp())
+		return;
+
+	init_completion(&rcu.completion);
+	/* Will wake me after RCU finished. */
+	call_rcu_sched(&rcu.head, wakeme_after_rcu);
+	/* Wait for it. */
+	wait_for_completion(&rcu.completion);
+}
+EXPORT_SYMBOL_GPL(synchronize_sched);
+
 /**
  * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed.
  *


end of thread, other threads:[~2009-09-17 22:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-13 16:14 [PATCH tip/core/rcu 0/4] Review comments, cleanups, and preemptable synchronize_rcu() fixes Paul E. McKenney
2009-09-13 16:15 ` [PATCH tip/core/rcu 1/4] Kconfig help needs to say that TREE_PREEMPT_RCU scales down Paul E. McKenney
2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
2009-09-17 22:10   ` tip-bot for Paul E. McKenney
2009-09-13 16:15 ` [PATCH tip/core/rcu 2/4] Add debug checks to TREE_PREEMPT_RCU for premature grace periods Paul E. McKenney
2009-09-13 16:23   ` Daniel Walker
2009-09-13 16:31     ` Paul E. McKenney
2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
2009-09-17 22:10   ` tip-bot for Paul E. McKenney
2009-09-13 16:15 ` [PATCH tip/core/rcu 3/4] Simplify rcu_read_unlock_special() quiescent-state accounting Paul E. McKenney
2009-09-15  7:17   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
2009-09-15 19:53   ` [PATCH tip/core/rcu 3/4] " Josh Triplett
2009-09-17 22:11   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
2009-09-13 16:15 ` [PATCH tip/core/rcu 4/4] Fix synchronize_rcu() for TREE_PREEMPT_RCU Paul E. McKenney
2009-09-15  7:18   ` [tip:core/urgent] rcu: " tip-bot for Paul E. McKenney
2009-09-17 22:11   ` tip-bot for Paul E. McKenney
