public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
@ 2012-06-04 12:08 fweisbec
  2012-06-04 12:08 ` [PATCH 1/2] rcu: New rcu_user_enter() and rcu_user_exit() APIs fweisbec
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: fweisbec @ 2012-06-04 12:08 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney
  Cc: LKML, Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Daniel Lezcano,
	Geoff Levand, Gilad Ben Yossef, Hakan Akkan, Kevin Hilman,
	Max Krasnyansky, Peter Zijlstra, Stephen Hemminger,
	Steven Rostedt, Sven-Thorsten Dietrich, Thomas Gleixner

From: Frederic Weisbecker <fweisbec@gmail.com>

Paul, Ingo,

This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
-rcu (rcu/core) branch.

I have only build-tested it so far; I need to do a full rebase of my
tree to test it in practice. But I wanted to show you what it looks
like first.

I also wonder whether we can host this in a tree somewhere. Ingo suggested
setting up a tree on -tip to apply the uncontroversial part of the nohz
cpusets patches and iterate from there. I think it would speed
everything up if we started doing that.

Tell me what you think.

Thanks.

Frederic Weisbecker (2):
  rcu: New rcu_user_enter() and rcu_user_exit() APIs
  rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs

 include/linux/rcupdate.h |    4 +
 kernel/rcutree.c         |  184 +++++++++++++++++++++++++++++++++++++++------
 2 files changed, 163 insertions(+), 25 deletions(-)

-- 
1.7.5.4



* [PATCH 1/2] rcu: New rcu_user_enter() and rcu_user_exit() APIs
  2012-06-04 12:08 [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz fweisbec
@ 2012-06-04 12:08 ` fweisbec
  2012-06-04 12:08 ` [PATCH 2/2] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs fweisbec
  2012-06-04 18:13 ` [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz Paul E. McKenney
  2 siblings, 0 replies; 18+ messages in thread
From: fweisbec @ 2012-06-04 12:08 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney
  Cc: LKML, Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Daniel Lezcano,
	Geoff Levand, Gilad Ben Yossef, Hakan Akkan, Kevin Hilman,
	Max Krasnyansky, Peter Zijlstra, Stephen Hemminger,
	Steven Rostedt, Sven-Thorsten Dietrich, Thomas Gleixner

From: Frederic Weisbecker <fweisbec@gmail.com>

These two APIs are provided to help the implementation
of an adaptive tickless kernel (cf: nohz cpusets). We need
to enter an RCU extended quiescent state while we are in
userland so that a tickless CPU is not involved in the
global RCU state machine and can shut down its tick safely.

These APIs are called from syscall and exception entry/exit
points and must not be called from interrupt context.

They are essentially the same as rcu_idle_enter() and
rcu_idle_exit() minus the checks that ensure the CPU is
running the idle task.
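
As an illustration only, the split described above can be modelled in
userspace C. Everything named model_* below is hypothetical and is not
kernel code: a shared core performs the quiescent-state transition, the
idle variants add the "must be the idle task" check around it, and the
user variants call the core directly. The warnings field merely counts
what WARN_ONCE() would have reported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_state {
	bool is_idle_task;	/* stand-in for is_idle_task(current) */
	bool in_eqs;		/* true while in the extended quiescent state */
	int warnings;		/* counts what WARN_ONCE() would have reported */
};

/* Shared core, playing the role of __rcu_idle_enter() in the patch. */
static void model_eqs_enter_core(struct model_state *s)
{
	assert(!s->in_eqs);
	s->in_eqs = true;
}

/* Shared core, playing the role of __rcu_idle_exit() in the patch. */
static void model_eqs_exit_core(struct model_state *s)
{
	assert(s->in_eqs);
	s->in_eqs = false;
}

/* Idle variants: same transition plus the idle-task sanity check. */
static void model_idle_enter(struct model_state *s)
{
	if (!s->is_idle_task)
		s->warnings++;
	model_eqs_enter_core(s);
}

static void model_idle_exit(struct model_state *s)
{
	model_eqs_exit_core(s);
	if (!s->is_idle_task)
		s->warnings++;
}

/* User variants: the bare transition, legal from any task. */
static void model_user_enter(struct model_state *s)
{
	model_eqs_enter_core(s);
}

static void model_user_exit(struct model_state *s)
{
	model_eqs_exit_core(s);
}
```

In this model a regular task can go through model_user_enter() and
model_user_exit() without tripping the check, whereas the idle variants
flag the same task on both entry and exit.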

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rcupdate.h |    2 +
 kernel/rcutree.c         |  135 +++++++++++++++++++++++++++++++++++++---------
 2 files changed, 112 insertions(+), 25 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index b737a5b..e8323df 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -191,6 +191,8 @@ extern void rcu_idle_enter(void);
 extern void rcu_idle_exit(void);
 extern void rcu_irq_enter(void);
 extern void rcu_irq_exit(void);
+extern void rcu_user_enter(void);
+extern void rcu_user_exit(void);
 extern void exit_rcu(void);
 
 /**
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 6acb7c0..59ac305 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -349,6 +349,29 @@ static int rcu_implicit_offline_qs(struct rcu_data *rdp)
 	return 0;
 }
 
+static void rcu_check_idle_entry(void)
+{
+	struct task_struct *idle;
+	struct rcu_dynticks *rdtp;
+	unsigned long flags;
+
+	if (is_idle_task(current))
+		return;
+
+	local_irq_save(flags);
+
+	rdtp = &__get_cpu_var(rcu_dynticks);
+	idle = idle_task(smp_processor_id());
+
+	trace_rcu_dyntick("Error on entry: not idle task", rdtp->dynticks_nesting, 0);
+	ftrace_dump(DUMP_ORIG);
+	WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
+		  current->pid, current->comm,
+		  idle->pid, idle->comm); /* must be idle task! */
+
+	local_irq_restore(flags);
+}
+
 /*
  * rcu_idle_enter_common - inform RCU that current CPU is moving towards idle
  *
@@ -359,15 +382,6 @@ static int rcu_implicit_offline_qs(struct rcu_data *rdp)
 static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
 {
 	trace_rcu_dyntick("Start", oldval, 0);
-	if (!is_idle_task(current)) {
-		struct task_struct *idle = idle_task(smp_processor_id());
-
-		trace_rcu_dyntick("Error on entry: not idle task", oldval, 0);
-		ftrace_dump(DUMP_ORIG);
-		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
-			  current->pid, current->comm,
-			  idle->pid, idle->comm); /* must be idle task! */
-	}
 	rcu_prepare_for_idle(smp_processor_id());
 	/* CPUs seeing atomic_inc() must see prior RCU read-side crit sects */
 	smp_mb__before_atomic_inc();  /* See above. */
@@ -387,8 +401,9 @@ static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
 			   "Illegal idle entry in RCU-sched read-side critical section.");
 }
 
-/**
- * rcu_idle_enter - inform RCU that current CPU is entering idle
+/*
+ * __rcu_idle_enter - inform RCU that current CPU is entering RCU
+ * idle mode.
  *
  * Enter idle mode, in other words, -leave- the mode in which RCU
  * read-side critical sections can occur.  (Though RCU read-side
@@ -399,7 +414,7 @@ static void rcu_idle_enter_common(struct rcu_dynticks *rdtp, long long oldval)
  * the possibility of usermode upcalls having messed up our count
  * of interrupt nesting level during the prior busy period.
  */
-void rcu_idle_enter(void)
+static void __rcu_idle_enter(void)
 {
 	unsigned long flags;
 	long long oldval;
@@ -416,9 +431,38 @@ void rcu_idle_enter(void)
 	rcu_idle_enter_common(rdtp, oldval);
 	local_irq_restore(flags);
 }
+
+/**
+ * rcu_idle_enter - inform RCU that current CPU is entering RCU
+ * idle mode from the idle task.
+ *
+ * Enter idle mode from the idle task before we put the CPU into
+ * low power mode. No use of RCU is permitted between this call and
+ * rcu_idle_exit(). This way the CPU doesn't need to keep the
+ * timer tick to report quiescent states, which is desired for energy
+ * savings.
+ */
+void rcu_idle_enter(void)
+{
+	rcu_check_idle_entry();
+	__rcu_idle_enter();
+}
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 /**
+ * rcu_user_enter - inform RCU that we are resuming userspace.
+ *
+ * Enter RCU idle mode right before resuming userspace. No use of RCU
+ * is permitted between this call and rcu_user_exit(). This way the
+ * CPU doesn't need to maintain the tick for RCU maintenance purposes
+ * when the CPU runs in userspace.
+ */
+void rcu_user_enter(void)
+{
+	__rcu_idle_enter();
+}
+
+/**
  * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
  *
  * Exit from an interrupt handler, which might possibly result in entering
@@ -452,6 +496,29 @@ void rcu_irq_exit(void)
 	local_irq_restore(flags);
 }
 
+static void rcu_check_idle_exit(long long oldval)
+{
+	struct task_struct *idle;
+	struct rcu_dynticks *rdtp;
+	unsigned long flags;
+
+	if (is_idle_task(current))
+		return;
+
+	local_irq_save(flags);
+
+	idle = idle_task(smp_processor_id());
+	rdtp = &__get_cpu_var(rcu_dynticks);
+	trace_rcu_dyntick("Error on exit: not idle task",
+			  oldval, rdtp->dynticks_nesting);
+	ftrace_dump(DUMP_ORIG);
+	WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
+		  current->pid, current->comm,
+		  idle->pid, idle->comm); /* must be idle task! */
+
+	local_irq_restore(flags);
+}
+
 /*
  * rcu_idle_exit_common - inform RCU that current CPU is moving away from idle
  *
@@ -468,20 +535,11 @@ static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
 	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
 	rcu_cleanup_after_idle(smp_processor_id());
 	trace_rcu_dyntick("End", oldval, rdtp->dynticks_nesting);
-	if (!is_idle_task(current)) {
-		struct task_struct *idle = idle_task(smp_processor_id());
-
-		trace_rcu_dyntick("Error on exit: not idle task",
-				  oldval, rdtp->dynticks_nesting);
-		ftrace_dump(DUMP_ORIG);
-		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
-			  current->pid, current->comm,
-			  idle->pid, idle->comm); /* must be idle task! */
-	}
 }
 
-/**
- * rcu_idle_exit - inform RCU that current CPU is leaving idle
+/*
+ * __rcu_idle_exit - inform RCU that current CPU is leaving RCU
+ * idle mode.
  *
  * Exit idle mode, in other words, -enter- the mode in which RCU
  * read-side critical sections can occur.
@@ -491,7 +549,7 @@ static void rcu_idle_exit_common(struct rcu_dynticks *rdtp, long long oldval)
  * of interrupt nesting level during the busy period that is just
  * now starting.
  */
-void rcu_idle_exit(void)
+static long long __rcu_idle_exit(void)
 {
 	unsigned long flags;
 	struct rcu_dynticks *rdtp;
@@ -507,10 +565,37 @@ void rcu_idle_exit(void)
 		rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
 	rcu_idle_exit_common(rdtp, oldval);
 	local_irq_restore(flags);
+
+	return oldval;
 }
 EXPORT_SYMBOL_GPL(rcu_idle_exit);
 
 /**
+ * rcu_idle_exit - inform RCU that current CPU is leaving RCU
+ * idle mode from the idle task.
+ *
+ * Exit idle mode from the idle task after we wake the CPU up from
+ * low power mode. The CPU can make use of RCU read side critical
+ * sections again after this call.
+ */
+void rcu_idle_exit(void)
+{
+	long long oldval = __rcu_idle_exit();
+	rcu_check_idle_exit(oldval);
+}
+
+/**
+ * rcu_user_exit - inform RCU that we are exiting userspace.
+ *
+ * Exit RCU idle mode while entering the kernel because it can
+ * run an RCU read side critical section anytime.
+ */
+void rcu_user_exit(void)
+{
+	__rcu_idle_exit();
+}
+
+/**
  * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
  *
  * Enter an interrupt handler, which might possibly result in exiting
-- 
1.7.5.4



* [PATCH 2/2] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs
  2012-06-04 12:08 [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz fweisbec
  2012-06-04 12:08 ` [PATCH 1/2] rcu: New rcu_user_enter() and rcu_user_exit() APIs fweisbec
@ 2012-06-04 12:08 ` fweisbec
  2012-06-04 18:13 ` [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz Paul E. McKenney
  2 siblings, 0 replies; 18+ messages in thread
From: fweisbec @ 2012-06-04 12:08 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney
  Cc: LKML, Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Daniel Lezcano,
	Geoff Levand, Gilad Ben Yossef, Hakan Akkan, Kevin Hilman,
	Max Krasnyansky, Peter Zijlstra, Stephen Hemminger,
	Steven Rostedt, Sven-Thorsten Dietrich, Thomas Gleixner

From: Frederic Weisbecker <fweisbec@gmail.com>

A CPU running in adaptive tickless mode wants to enter an
RCU extended quiescent state while running in userspace. This
way we can shut down the tick that each CPU usually needs
for RCU's purposes.
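
To sketch why the tick becomes unnecessary, here is a small userspace
model of a per-CPU counter whose parity tells remote CPUs whether this
CPU is in an extended quiescent state. The model_* names are
hypothetical, and the "even means quiescent, odd means watched"
convention is an assumption inferred from the
WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1)) check in
rcu_idle_exit_common(); memory barriers and atomics are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU dynticks counter.
 * Assumed convention: odd = RCU is watching, even = quiescent. */
struct model_cpu {
	unsigned int dynticks;
};

/* Entering the extended quiescent state makes the counter even. */
static void model_eqs_enter(struct model_cpu *cpu)
{
	assert(cpu->dynticks & 0x1);
	cpu->dynticks++;
}

/* Leaving the extended quiescent state makes the counter odd again. */
static void model_eqs_exit(struct model_cpu *cpu)
{
	assert(!(cpu->dynticks & 0x1));
	cpu->dynticks++;
}

/*
 * Remote check: given a snapshot taken earlier, decide whether the CPU
 * is, or has been, in an extended quiescent state since then. The
 * observed CPU does not need to run anything for this to work.
 */
static bool model_passed_qs(const struct model_cpu *cpu, unsigned int snap)
{
	unsigned int curr = cpu->dynticks;

	return (curr & 0x1) == 0 || (curr - snap) >= 2;
}
```

Because the remote side only reads the counter, a CPU sitting in
userspace with an even counter can be accounted for without its own
tick reporting quiescent states.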

Typically, RCU enters the extended quiescent state when we resume
userspace through a syscall or exception exit; this is done
using rcu_user_enter(). RCU then exits this state by calling
rcu_user_exit() from syscall or exception entry.

However, there are two other points where we may want to enter
or exit this state. Some remote CPU may require a tickless CPU
to restart its tick for any reason and send it an IPI for
this purpose. As we restart the tick, we no longer want to resume
from the IPI in the RCU extended quiescent state.
Similarly, we may stop the tick from an interrupt in userspace, and
we need to be able to enter the RCU extended quiescent state when we
resume from this interrupt to userspace.

To these ends, we provide two new APIs:

- rcu_user_enter_irq(). This must be called from a non-nesting
interrupt between rcu_irq_enter() and rcu_irq_exit().
After the irq calls rcu_irq_exit(), we enter the RCU extended
quiescent state.

- rcu_user_exit_irq(). This must be called from a non-nesting
interrupt, interrupting an RCU extended quiescent state, and
between rcu_irq_enter() and rcu_irq_exit(). After the irq calls
rcu_irq_exit(), we no longer resume the RCU extended quiescent
state.
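
The effect of the two calls on the nesting counter can be sketched with
a userspace model. The model_* names are hypothetical, and
MODEL_TASK_EXIT_IDLE is a placeholder constant, not the kernel's
DYNTICK_TASK_EXIT_IDLE value. The model assumes, as the diff below
suggests, that a nesting value of 0 means "extended quiescent state"
and that irq entry and exit simply increment and decrement it.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder only; the kernel's DYNTICK_TASK_EXIT_IDLE differs. */
#define MODEL_TASK_EXIT_IDLE 1000LL

/* Hypothetical model of the per-CPU nesting state. */
struct model_dynticks {
	long long nesting;	/* 0 means extended quiescent state */
	bool in_eqs;
};

/* Models rcu_irq_enter(): leave the quiescent state if we were in it. */
static void model_irq_enter(struct model_dynticks *d)
{
	long long old = d->nesting++;

	if (old == 0)
		d->in_eqs = false;
}

/* Models rcu_irq_exit(): fall into the quiescent state at nesting 0. */
static void model_irq_exit(struct model_dynticks *d)
{
	if (--d->nesting == 0)
		d->in_eqs = true;
}

/* Models rcu_user_enter_irq(): the irq interrupted a non-idle state. */
static void model_user_enter_irq(struct model_dynticks *d)
{
	assert(d->nesting == MODEL_TASK_EXIT_IDLE + 1);
	d->nesting = 1;		/* the coming irq exit will reach 0 */
}

/* Models rcu_user_exit_irq(): the irq interrupted a quiescent state. */
static void model_user_exit_irq(struct model_dynticks *d)
{
	assert(d->nesting == 1);
	d->nesting = MODEL_TASK_EXIT_IDLE + 1;	/* irq exit stays non-idle */
}
```

Rewriting the counter from inside the interrupt lets the ordinary irq
exit path perform the actual transition, so no separate state change is
needed on the return-to-user path in this model.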

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rcupdate.h |    2 +
 kernel/rcutree.c         |   49 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e8323df..c0280ce 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -193,6 +193,8 @@ extern void rcu_irq_enter(void);
 extern void rcu_irq_exit(void);
 extern void rcu_user_enter(void);
 extern void rcu_user_exit(void);
+extern void rcu_user_enter_irq(void);
+extern void rcu_user_exit_irq(void);
 extern void exit_rcu(void);
 
 /**
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 59ac305..28e542c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -463,6 +463,30 @@ void rcu_user_enter(void)
 }
 
 /**
+ * rcu_user_enter_irq - inform RCU that we are going to resume userspace
+ * after the current irq returns.
+ *
+ * This is similar to rcu_user_enter() but in the context of a non
+ * nesting irq. After this call, RCU enters into idle mode when the
+ * interrupt returns.
+ */
+void rcu_user_enter_irq(void)
+{
+	unsigned long flags;
+	struct rcu_dynticks *rdtp;
+
+	local_irq_save(flags);
+	rdtp = &__get_cpu_var(rcu_dynticks);
+	/*
+	 * Ensure this irq is a non nesting one interrupting
+	 * a non-idle RCU state.
+	 */
+	WARN_ON_ONCE(rdtp->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE + 1);
+	rdtp->dynticks_nesting = 1;
+	local_irq_restore(flags);
+}
+
+/**
  * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
  *
  * Exit from an interrupt handler, which might possibly result in entering
@@ -596,6 +620,31 @@ void rcu_user_exit(void)
 }
 
 /**
+ * rcu_user_exit_irq - inform RCU that we won't resume to userspace
+ * idle mode after the current irq returns.
+ *
+ * This is similar to rcu_user_exit() but in the context of a non
+ * nesting irq. This is called when the irq has interrupted a userspace
+ * RCU idle mode context. When the interrupt returns after this call,
+ * the CPU won't restore the RCU idle mode.
+ */
+void rcu_user_exit_irq(void)
+{
+	unsigned long flags;
+	struct rcu_dynticks *rdtp;
+
+	local_irq_save(flags);
+	rdtp = &__get_cpu_var(rcu_dynticks);
+	/*
+	 * Ensure this irq is a non-nesting one interrupting
+	 * an RCU idle mode.
+	 */
+	WARN_ON_ONCE(rdtp->dynticks_nesting != 1);
+	rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE + 1;
+	local_irq_restore(flags);
+}
+
+/**
  * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
  *
  * Enter an interrupt handler, which might possibly result in exiting
-- 
1.7.5.4



* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-04 12:08 [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz fweisbec
  2012-06-04 12:08 ` [PATCH 1/2] rcu: New rcu_user_enter() and rcu_user_exit() APIs fweisbec
  2012-06-04 12:08 ` [PATCH 2/2] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs fweisbec
@ 2012-06-04 18:13 ` Paul E. McKenney
  2012-06-04 19:06   ` Frederic Weisbecker
  2 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-04 18:13 UTC (permalink / raw)
  To: fweisbec
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> 
> Paul, Ingo,
> 
> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> -rcu (rcu/core) branch.
> 
> I have only built tested it yet, I need to do a full rebase of my
> tree to test it in practice. But I wanted to show you how it looks
> like first.
> 
> I also wonder if we can set that to a tree somewhere. Ingo suggested
> to set up a tree on -tip to apply the uncontroversial part of nohz
> cpusets patches and iterate from there. I think it would accelerate
> everything if we start doing that.

It would probably be best to put these two in the -rcu set in order to
avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
push this to -tip early, if that would help.

							Thanx, Paul

> Tell me what you think.
> 
> Thanks.
> 
> Frederic Weisbecker (2):
>   rcu: New rcu_user_enter() and rcu_user_exit() APIs
>   rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs
> 
>  include/linux/rcupdate.h |    4 +
>  kernel/rcutree.c         |  184 +++++++++++++++++++++++++++++++++++++++------
>  2 files changed, 163 insertions(+), 25 deletions(-)
> 
> -- 
> 1.7.5.4
> 



* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-04 18:13 ` [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz Paul E. McKenney
@ 2012-06-04 19:06   ` Frederic Weisbecker
  2012-06-04 21:07     ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-04 19:06 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
>> From: Frederic Weisbecker <fweisbec@gmail.com>
>>
>> Paul, Ingo,
>>
>> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
>> -rcu (rcu/core) branch.
>>
>> I have only built tested it yet, I need to do a full rebase of my
>> tree to test it in practice. But I wanted to show you how it looks
>> like first.
>>
>> I also wonder if we can set that to a tree somewhere. Ingo suggested
>> to set up a tree on -tip to apply the uncontroversial part of nohz
>> cpusets patches and iterate from there. I think it would accelerate
>> everything if we start doing that.
>
> It would probably be best to put these two in the -rcu set in order to
> avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> push this to -tip early, if that would help.

But then these APIs are going to be upstream in 3.6.
Is that OK with you even if they don't have any upstream user?
We can ifdef them.


* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-04 19:06   ` Frederic Weisbecker
@ 2012-06-04 21:07     ` Paul E. McKenney
  2012-06-05 10:31       ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-04 21:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Mon, Jun 04, 2012 at 09:06:22PM +0200, Frederic Weisbecker wrote:
> 2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> >> From: Frederic Weisbecker <fweisbec@gmail.com>
> >>
> >> Paul, Ingo,
> >>
> >> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> >> -rcu (rcu/core) branch.
> >>
> >> I have only built tested it yet, I need to do a full rebase of my
> >> tree to test it in practice. But I wanted to show you how it looks
> >> like first.
> >>
> >> I also wonder if we can set that to a tree somewhere. Ingo suggested
> >> to set up a tree on -tip to apply the uncontroversial part of nohz
> >> cpusets patches and iterate from there. I think it would accelerate
> >> everything if we start doing that.
> >
> > It would probably be best to put these two in the -rcu set in order to
> > avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> > push this to -tip early, if that would help.
> 
> But then these APIs are going to be upstream on 3.6
> Is that ok for you even if they don't have any upstream user?
> We can ifdef it.

I figured on maintaining a separate rcu/idle topic branch that I would
merge locally for building and testing, but which I would not push
to rcu/next.  If Ingo agrees, I can push separately to -tip so that it
does not go upstream until you are ready, at which point I would merge
it into rcu/next.

Seem reasonable, or would something else work better?

							Thanx, Paul



* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-04 21:07     ` Paul E. McKenney
@ 2012-06-05 10:31       ` Frederic Weisbecker
  2012-06-05 23:46         ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-05 10:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Mon, Jun 04, 2012 at 02:07:09PM -0700, Paul E. McKenney wrote:
> On Mon, Jun 04, 2012 at 09:06:22PM +0200, Frederic Weisbecker wrote:
> > 2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> > >> From: Frederic Weisbecker <fweisbec@gmail.com>
> > >>
> > >> Paul, Ingo,
> > >>
> > >> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> > >> -rcu (rcu/core) branch.
> > >>
> > >> I have only built tested it yet, I need to do a full rebase of my
> > >> tree to test it in practice. But I wanted to show you how it looks
> > >> like first.
> > >>
> > >> I also wonder if we can set that to a tree somewhere. Ingo suggested
> > >> to set up a tree on -tip to apply the uncontroversial part of nohz
> > >> cpusets patches and iterate from there. I think it would accelerate
> > >> everything if we start doing that.
> > >
> > > It would probably be best to put these two in the -rcu set in order to
> > > avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> > > push this to -tip early, if that would help.
> > 
> > But then these APIs are going to be upstream on 3.6
> > Is that ok for you even if they don't have any upstream user?
> > We can ifdef it.
> 
> I figured on maintaining a separate rcu/idle topic branch that I would
> merge locally for building and testing, but which I would not push
> to rcu/next.  If Ingo agrees, I can push separately to -tip so that it
> does not go upstream until you are ready, at which point I would merge
> it into rcu/next.
> 
> Seem reasonable, or would something else work better?

Sounds very good!

Thanks.


* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-05 10:31       ` Frederic Weisbecker
@ 2012-06-05 23:46         ` Paul E. McKenney
  2012-06-07 14:21           ` Frederic Weisbecker
  2012-06-09 22:55           ` [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs Frederic Weisbecker
  0 siblings, 2 replies; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-05 23:46 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Tue, Jun 05, 2012 at 12:31:00PM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 04, 2012 at 02:07:09PM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 04, 2012 at 09:06:22PM +0200, Frederic Weisbecker wrote:
> > > 2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > > On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> > > >> From: Frederic Weisbecker <fweisbec@gmail.com>
> > > >>
> > > >> Paul, Ingo,
> > > >>
> > > >> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> > > >> -rcu (rcu/core) branch.
> > > >>
> > > >> I have only built tested it yet, I need to do a full rebase of my
> > > >> tree to test it in practice. But I wanted to show you how it looks
> > > >> like first.
> > > >>
> > > >> I also wonder if we can set that to a tree somewhere. Ingo suggested
> > > >> to set up a tree on -tip to apply the uncontroversial part of nohz
> > > >> cpusets patches and iterate from there. I think it would accelerate
> > > >> everything if we start doing that.
> > > >
> > > > It would probably be best to put these two in the -rcu set in order to
> > > > avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> > > > push this to -tip early, if that would help.
> > > 
> > > But then these APIs are going to be upstream on 3.6
> > > Is that ok for you even if they don't have any upstream user?
> > > We can ifdef it.
> > 
> > I figured on maintaining a separate rcu/idle topic branch that I would
> > merge locally for building and testing, but which I would not push
> > to rcu/next.  If Ingo agrees, I can push separately to -tip so that it
> > does not go upstream until you are ready, at which point I would merge
> > it into rcu/next.
> > 
> > Seem reasonable, or would something else work better?
> 
> Sounds very good!

Here you go:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/idle

							Thanx, Paul



* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-05 23:46         ` Paul E. McKenney
@ 2012-06-07 14:21           ` Frederic Weisbecker
  2012-06-07 22:45             ` Paul E. McKenney
  2012-06-09 22:55           ` [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs Frederic Weisbecker
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-07 14:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Tue, Jun 05, 2012 at 04:46:40PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 05, 2012 at 12:31:00PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jun 04, 2012 at 02:07:09PM -0700, Paul E. McKenney wrote:
> > > On Mon, Jun 04, 2012 at 09:06:22PM +0200, Frederic Weisbecker wrote:
> > > > 2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > > > On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> > > > >> From: Frederic Weisbecker <fweisbec@gmail.com>
> > > > >>
> > > > >> Paul, Ingo,
> > > > >>
> > > > >> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> > > > >> -rcu (rcu/core) branch.
> > > > >>
> > > > >> I have only built tested it yet, I need to do a full rebase of my
> > > > >> tree to test it in practice. But I wanted to show you how it looks
> > > > >> like first.
> > > > >>
> > > > >> I also wonder if we can set that to a tree somewhere. Ingo suggested
> > > > >> to set up a tree on -tip to apply the uncontroversial part of nohz
> > > > >> cpusets patches and iterate from there. I think it would accelerate
> > > > >> everything if we start doing that.
> > > > >
> > > > > It would probably be best to put these two in the -rcu set in order to
> > > > > avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> > > > > push this to -tip early, if that would help.
> > > > 
> > > > But then these APIs are going to be upstream on 3.6
> > > > Is that ok for you even if they don't have any upstream user?
> > > > We can ifdef it.
> > > 
> > > I figured on maintaining a separate rcu/idle topic branch that I would
> > > merge locally for building and testing, but which I would not push
> > > to rcu/next.  If Ingo agrees, I can push separately to -tip so that it
> > > does not go upstream until you are ready, at which point I would merge
> > > it into rcu/next.
> > > 
> > > Seem reasonable, or would something else work better?
> > 
> > Sounds very good!
> 
> Here you go:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/idle

Thanks!

I can see you've implemented a version for TinyRCU. Nohz cpusets only work on
SMP right now because at least one CPU must keep its tick running
to maintain timekeeping. I'm pretty confident that one day we'll remove
jiffies and be able to do all the timekeeping using the TSC
or similar. There is still quite a way to go before we reach that, though.


* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-07 14:21           ` Frederic Weisbecker
@ 2012-06-07 22:45             ` Paul E. McKenney
  2012-06-09 22:58               ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-07 22:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Thu, Jun 07, 2012 at 04:21:09PM +0200, Frederic Weisbecker wrote:
> On Tue, Jun 05, 2012 at 04:46:40PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 05, 2012 at 12:31:00PM +0200, Frederic Weisbecker wrote:
> > > On Mon, Jun 04, 2012 at 02:07:09PM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jun 04, 2012 at 09:06:22PM +0200, Frederic Weisbecker wrote:
> > > > > 2012/6/4 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > > > > On Mon, Jun 04, 2012 at 02:08:26PM +0200, fweisbec@gmail.com wrote:
> > > > > >> From: Frederic Weisbecker <fweisbec@gmail.com>
> > > > > >>
> > > > > >> Paul, Ingo,
> > > > > >>
> > > > > >> This is a rebase of the nohz cpusets RCU APIs on top of Paul's latest
> > > > > >> -rcu (rcu/core) branch.
> > > > > >>
> > > > > >> I have only built tested it yet, I need to do a full rebase of my
> > > > > >> tree to test it in practice. But I wanted to show you how it looks
> > > > > >> like first.
> > > > > >>
> > > > > >> I also wonder if we can set that to a tree somewhere. Ingo suggested
> > > > > >> to set up a tree on -tip to apply the uncontroversial part of nohz
> > > > > >> cpusets patches and iterate from there. I think it would accelerate
> > > > > >> everything if we start doing that.
> > > > > >
> > > > > > It would probably be best to put these two in the -rcu set in order to
> > > > > > avoid conflicts with possible further RCU_FAST_NO_HZ work.  I could
> > > > > > push this to -tip early, if that would help.
> > > > > 
> > > > > But then these APIs are going to be upstream on 3.6
> > > > > Is that ok for you even if they don't have any upstream user?
> > > > > We can ifdef it.
> > > > 
> > > > I figured on maintaining a separate rcu/idle topic branch that I would
> > > > merge locally for building and testing, but which I would not push
> > > > to rcu/next.  If Ingo agrees, I can push separately to -tip so that it
> > > > does not go upstream until you are ready, at which point I would merge
> > > > it into rcu/next.
> > > > 
> > > > Seem reasonable, or would something else work better?
> > > 
> > > Sounds very good!
> > 
> > Here you go:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/idle
> 
> Thanks!
> 
> I can see you've implemented a version for TinyRCU. Nohz cpusets only work on
> SMP right now because there must be at least one CPU running with the tick
> to maintain the timekeeping. I'm pretty confident that one day we'll remove
> the jiffies and we'll be able to do the whole timekeeping by using the TSC
> or so. There is quite a way before we reach that though.

In the meantime, would it make sense to slow the tick rate by a factor
of 10 or so on that one CPU when nothing else is going on?  Or does
timekeeping absolutely require running the tick at full speed?

							Thanx, Paul


* [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-05 23:46         ` Paul E. McKenney
  2012-06-07 14:21           ` Frederic Weisbecker
@ 2012-06-09 22:55           ` Frederic Weisbecker
  2012-06-10 18:06             ` Paul E. McKenney
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-09 22:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Tue, Jun 05, 2012 at 04:46:40PM -0700, Paul E. McKenney wrote:
> Here you go:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/idle

So I've rebased my nohz cpusets patchset and applied these patches.
During testing I found a bug and realized that rcu_user_exit_irq() needs
to be callable from any irq nesting level.

Here is a fix:

---
From c30610d5ed2c292a87f7e32216c3419cdc12dff0 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sat, 9 Jun 2012 14:06:30 +0200
Subject: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs

rcu_user_exit_irq(), which exits RCU idle mode after the current irq
returns, was designed to be called from non-nesting irqs only.

However, the IPI that restarts the tick and exits RCU user-idle mode
in nohz cpusets can happen at any time. For example, it can be a nesting
irq that interrupts a softirq. In this case the stack of RCU API
calls becomes:

==> IRQ
    rcu_irq_enter()
    ....
    do_softirq {
===== > IRQ (restart tick IPI)
        rcu_irq_enter()
        rcu_user_exit_irq()
        rcu_irq_exit()
<=====
    }
    rcu_irq_exit()

Hence we need to make rcu_user_exit_irq() callable from any nesting
level of interrupt.

rcu_user_enter_irq() is only called from non-nesting irqs, though. But
to stay consistent with the new change, we also allow it to be called
from any irq nesting level.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/rcutree.c |   36 +++++++++++++++---------------------
 1 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 1b0dca2..3e84c4c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -465,11 +465,11 @@ void rcu_user_enter(void)
 
 /**
  * rcu_user_enter_irq - inform RCU that we are going to resume userspace
- * after the current irq returns.
+ * after the current non-nesting irq returns.
  *
- * This is similar to rcu_user_enter() but in the context of a non
- * nesting irq. After this call, RCU enters into idle mode when the
- * interrupt returns.
+ * This is similar to rcu_user_enter() but in the context of an
+ * irq. After this call, RCU enters into idle mode when the
+ * current non-nesting interrupt returns.
  */
 void rcu_user_enter_irq(void)
 {
@@ -478,12 +478,9 @@ void rcu_user_enter_irq(void)
 
 	local_irq_save(flags);
 	rdtp = &__get_cpu_var(rcu_dynticks);
-	/*
-	 * Ensure this irq is a non nesting one interrupting
-	 * a non-idle RCU state.
-	 */
-	WARN_ON_ONCE(rdtp->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE + 1);
-	rdtp->dynticks_nesting = 1;
+	/* Ensure we are interrupting a non-idle RCU state */
+	WARN_ON_ONCE(!(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK));
+	rdtp->dynticks_nesting -= DYNTICK_TASK_EXIT_IDLE;
 	local_irq_restore(flags);
 }
 
@@ -619,12 +616,12 @@ void rcu_user_exit(void)
 
 /**
  * rcu_user_exit_irq - inform RCU that we won't resume to userspace
- * idle mode after the current irq returns.
+ * idle mode after the current non-nesting irq returns.
  *
- * This is similar to rcu_user_exit() but in the context of a non
- * nesting irq. This is called when the irq has interrupted a userspace
- * RCU idle mode context. When the interrupt returns after this call,
- * the CPU won't restore the RCU idle mode.
+ * This is similar to rcu_user_exit() but in the context of an
+ * irq. This is called when the irq has interrupted a userspace
+ * RCU idle mode context. When the current non-nesting interrupt
+ * returns after this call, the CPU won't restore the RCU idle mode.
  */
 void rcu_user_exit_irq(void)
 {
@@ -633,12 +630,9 @@ void rcu_user_exit_irq(void)
 
 	local_irq_save(flags);
 	rdtp = &__get_cpu_var(rcu_dynticks);
-	/*
-	 * Ensure this irq is a non-nesting one interrupting
-	 * an RCU idle mode.
-	 */
-	WARN_ON_ONCE(rdtp->dynticks_nesting != 1);
-	rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE + 1;
+	/* Ensure we are interrupting an RCU idle mode. */
+	WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
+	rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
 	local_irq_restore(flags);
 }
 
-- 
1.7.0.4
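
The nesting scenario described in the changelog above can be replayed
with a minimal user-space model. This is only a sketch, not kernel code:
the sketch_ names and the two SKETCH_ constants are made up for
illustration and do not match the kernel's DYNTICK_TASK_* definitions,
and the old variant's WARN_ON_ONCE() is left out. It assumes the counter
is 0 while the CPU is in user-idle mode and that each irq entry and exit
adjusts it by one.

```c
#include <assert.h>

/*
 * Illustrative stand-ins, NOT the kernel's definitions: the low bits of
 * the counter hold the irq nesting depth, and one high bit marks a
 * non-idle task context.
 */
#define SKETCH_TASK_EXIT_IDLE	(1LL << 32)
#define SKETCH_TASK_NEST_MASK	(~((1LL << 32) - 1))

static long long nesting;	/* models rdtp->dynticks_nesting */

static void sketch_irq_enter(void) { nesting++; }
static void sketch_irq_exit(void)  { nesting--; }

/* Old behaviour: assumes exactly one irq level and overwrites the counter. */
static void sketch_user_exit_irq_old(void)
{
	nesting = SKETCH_TASK_EXIT_IDLE + 1;
}

/* Patched behaviour: keeps whatever irq depth has accumulated. */
static void sketch_user_exit_irq_new(void)
{
	/* We must be interrupting an idle state: no task bits set yet. */
	assert(!(nesting & SKETCH_TASK_NEST_MASK));
	nesting += SKETCH_TASK_EXIT_IDLE;
}

/* Replays the call stack from the changelog; returns the final counter. */
static long long sketch_replay(void (*user_exit_irq)(void))
{
	nesting = 0;		/* user-idle before the first irq */
	sketch_irq_enter();	/* outer IRQ */
	sketch_irq_enter();	/* restart-tick IPI nests inside the softirq */
	user_exit_irq();
	sketch_irq_exit();	/* IPI returns */
	sketch_irq_exit();	/* outer IRQ returns */
	return nesting;
}
```

Calling sketch_replay(sketch_user_exit_irq_new) leaves the counter at
exactly SKETCH_TASK_EXIT_IDLE, that is, task context with no irq levels
pending. sketch_replay(sketch_user_exit_irq_old) ends one below that,
because the overwrite discarded a nesting level.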



* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-07 22:45             ` Paul E. McKenney
@ 2012-06-09 22:58               ` Frederic Weisbecker
  2012-06-10 17:54                 ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-09 22:58 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Thu, Jun 07, 2012 at 03:45:08PM -0700, Paul E. McKenney wrote:
> > I can see you've implemented a version for TinyRCU. Nohz cpusets only work on
> > SMP right now because there must be at least one CPU running with the tick
> > to maintain the timekeeping. I'm pretty confident that one day we'll remove
> > the jiffies and we'll be able to do the whole timekeeping by using the TSC
> > or so. There is quite a way before we reach that though.
> 
> In the meantime, would it make sense to slow the tick rate by a factor
> of 10 or so on that one CPU when nothing else is going on?  Or does
> timekeeping absolutely require running the tick at full speed?

I'm not sure of the possible consequences of that.

* Re: [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz
  2012-06-09 22:58               ` Frederic Weisbecker
@ 2012-06-10 17:54                 ` Paul E. McKenney
  0 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-10 17:54 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Sun, Jun 10, 2012 at 12:58:56AM +0200, Frederic Weisbecker wrote:
> On Thu, Jun 07, 2012 at 03:45:08PM -0700, Paul E. McKenney wrote:
> > > I can see you've implemented a version for TinyRCU. Nohz cpusets only work on
> > > SMP right now because there must be at least one CPU running with the tick
> > > to maintain the timekeeping. I'm pretty confident that one day we'll remove
> > > the jiffies and we'll be able to do the whole timekeeping by using the TSC
> > > or so. There is quite a way before we reach that though.
> > 
> > In the meantime, would it make sense to slow the tick rate by a factor
> > of 10 or so on that one CPU when nothing else is going on?  Or does
> > timekeeping absolutely require running the tick at full speed?
> 
> I'm not sure of the possible consequences of that.

OK, so I will remove the TINY_RCU patches for the moment.  If reducing
tick speed on the sole remaining idle CPU ever becomes feasible and
useful, they are easy to add back in.

							Thanx, Paul


* Re: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-09 22:55           ` [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs Frederic Weisbecker
@ 2012-06-10 18:06             ` Paul E. McKenney
  2012-06-10 20:29               ` Frederic Weisbecker
  2012-06-10 21:47               ` [PATCH v2] " Frederic Weisbecker
  0 siblings, 2 replies; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-10 18:06 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Sun, Jun 10, 2012 at 12:55:54AM +0200, Frederic Weisbecker wrote:
> On Tue, Jun 05, 2012 at 04:46:40PM -0700, Paul E. McKenney wrote:
> > Here you go:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/idle
> 
> So I've rebased my nohz cpusets patchset and applied these patches.
> During testing I found a bug and realized that rcu_user_exit_irq() needs
> to be callable from any irq nesting level.

OK.  ;-)

> Here is a fix:
> 
> ---
> From c30610d5ed2c292a87f7e32216c3419cdc12dff0 Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Sat, 9 Jun 2012 14:06:30 +0200
> Subject: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
> 
> rcu_user_exit_irq(), which exits RCU idle mode after the current irq
> returns, was designed to be called from non-nesting irqs only.
> 
> However, the IPI that restarts the tick and exits RCU user-idle mode
> in nohz cpusets can happen at any time. For example, it can be a nesting
> irq that interrupts a softirq. In this case the stack of RCU API
> calls becomes:
> 
> ==> IRQ
>     rcu_irq_enter()
>     ....
>     do_softirq {
> ===== > IRQ (restart tick IPI)
>         rcu_irq_enter()
>         rcu_user_exit_irq()
>         rcu_irq_exit()
> <=====
>     }
>     rcu_irq_exit()
> 
> Hence we need to make rcu_user_exit_irq() callable from any nesting
> level of interrupt.
> 
> rcu_user_enter_irq() is only called from non-nesting irqs, though. But
> to stay consistent with the new change, we also allow it to be called
> from any irq nesting level.
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  kernel/rcutree.c |   36 +++++++++++++++---------------------
>  1 files changed, 15 insertions(+), 21 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 1b0dca2..3e84c4c 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -465,11 +465,11 @@ void rcu_user_enter(void)
> 
>  /**
>   * rcu_user_enter_irq - inform RCU that we are going to resume userspace
> - * after the current irq returns.
> + * after the current non-nesting irq returns.
>   *
> - * This is similar to rcu_user_enter() but in the context of a non
> - * nesting irq. After this call, RCU enters into idle mode when the
> - * interrupt returns.
> + * This is similar to rcu_user_enter() but in the context of an
> + * irq. After this call, RCU enters into idle mode when the
> + * current non-nesting interrupt returns.
>   */
>  void rcu_user_enter_irq(void)
>  {
> @@ -478,12 +478,9 @@ void rcu_user_enter_irq(void)
> 
>  	local_irq_save(flags);
>  	rdtp = &__get_cpu_var(rcu_dynticks);
> -	/*
> -	 * Ensure this irq is a non nesting one interrupting
> -	 * a non-idle RCU state.
> -	 */
> -	WARN_ON_ONCE(rdtp->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE + 1);
> -	rdtp->dynticks_nesting = 1;
> +	/* Ensure we are interrupting a non-idle RCU state */
> +	WARN_ON_ONCE(!(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK));
> +	rdtp->dynticks_nesting -= DYNTICK_TASK_EXIT_IDLE;

This will be broken on architectures that can fail to return from interrupts
and exceptions and vice versa.  The resulting value of rdtp->dynticks_nesting
might well go negative, or might fail to reach zero when the outermost
interrupt returns.

One workaround would be to add up the relevant fields of preempt_count()
and assign the result to rdtp->dynticks_nesting.
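
To illustrate the concern, here is a minimal user-space sketch (not
kernel code) contrasting the relative update from the hunk above with
the absolute assignment it replaces. The sketch_ names, the `skew`
parameter and SKETCH_TASK_EXIT_IDLE are hypothetical, and the model
simply assumes that a counter value of 0 means RCU-idle. It does not
model the preempt_count() workaround itself, only the difference between
adjusting the counter and assigning it.

```c
#include <assert.h>

/* Illustrative stand-in, NOT the kernel's value. */
#define SKETCH_TASK_EXIT_IDLE	(1LL << 32)

/* Relative update, as in the hunk above. */
static long long sketch_enter_relative(long long nesting)
{
	return nesting - SKETCH_TASK_EXIT_IDLE;
}

/* Absolute update, as in the code being replaced. */
static long long sketch_enter_absolute(long long nesting)
{
	(void)nesting;	/* the previous value is deliberately ignored */
	return 1;
}

/*
 * Models an outermost irq taken in task context that calls
 * rcu_user_enter_irq() and then returns. 'skew' is the drift left in
 * the irq count by earlier misnested entries or returns (0 when every
 * entry was matched by a return).
 * Returns the counter after the irq has returned; 0 means RCU-idle.
 */
static long long sketch_after_irq(long long (*enter)(long long), int skew)
{
	long long nesting = SKETCH_TASK_EXIT_IDLE + skew;

	nesting++;			/* rcu_irq_enter() */
	nesting = enter(nesting);	/* rcu_user_enter_irq() */
	nesting--;			/* rcu_irq_exit() */
	return nesting;
}
```

With a skew of +1 the relative variant ends at 1 and never reaches zero,
and with a skew of -1 it ends at -1. The absolute variant ends at 0 in
both cases, because it discards the drifted count.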

>  	local_irq_restore(flags);
>  }
> 
> @@ -619,12 +616,12 @@ void rcu_user_exit(void)
> 
>  /**
>   * rcu_user_exit_irq - inform RCU that we won't resume to userspace
> - * idle mode after the current irq returns.
> + * idle mode after the current non-nesting irq returns.
>   *
> - * This is similar to rcu_user_exit() but in the context of a non
> - * nesting irq. This is called when the irq has interrupted a userspace
> - * RCU idle mode context. When the interrupt returns after this call,
> - * the CPU won't restore the RCU idle mode.
> + * This is similar to rcu_user_exit() but in the context of an
> + * irq. This is called when the irq has interrupted a userspace
> + * RCU idle mode context. When the current non-nesting interrupt
> + * returns after this call, the CPU won't restore the RCU idle mode.
>   */
>  void rcu_user_exit_irq(void)
>  {
> @@ -633,12 +630,9 @@ void rcu_user_exit_irq(void)
> 
>  	local_irq_save(flags);
>  	rdtp = &__get_cpu_var(rcu_dynticks);
> -	/*
> -	 * Ensure this irq is a non-nesting one interrupting
> -	 * an RCU idle mode.
> -	 */
> -	WARN_ON_ONCE(rdtp->dynticks_nesting != 1);
> -	rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE + 1;
> +	/* Ensure we are interrupting an RCU idle mode. */
> +	WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
> +	rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;

This one works because all of the interrupt misnesting events that I
know of happen from system-call context, not from idle or from user-mode
execution.

							Thanx, Paul

>  	local_irq_restore(flags);
>  }
> 
> -- 
> 1.7.0.4
> 
> 


* Re: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-10 18:06             ` Paul E. McKenney
@ 2012-06-10 20:29               ` Frederic Weisbecker
  2012-06-10 21:47               ` [PATCH v2] " Frederic Weisbecker
  1 sibling, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-10 20:29 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Sun, Jun 10, 2012 at 11:06:59AM -0700, Paul E. McKenney wrote:
> On Sun, Jun 10, 2012 at 12:55:54AM +0200, Frederic Weisbecker wrote:
> >  void rcu_user_enter_irq(void)
> >  {
> > @@ -478,12 +478,9 @@ void rcu_user_enter_irq(void)
> > 
> >  	local_irq_save(flags);
> >  	rdtp = &__get_cpu_var(rcu_dynticks);
> > -	/*
> > -	 * Ensure this irq is a non nesting one interrupting
> > -	 * a non-idle RCU state.
> > -	 */
> > -	WARN_ON_ONCE(rdtp->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE + 1);
> > -	rdtp->dynticks_nesting = 1;
> > +	/* Ensure we are interrupting a non-idle RCU state */
> > +	WARN_ON_ONCE(!(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK));
> > +	rdtp->dynticks_nesting -= DYNTICK_TASK_EXIT_IDLE;
> 
> This will be broken on architectures that can fail to return from interrupts
> and exceptions and vice versa.  The resulting value of rdtp->dynticks_nesting
> might well go negative, or might fail to reach zero when the outermost
> interrupt returns.
> 
> One workaround would be to add up the relevant fields of preempt_count()
> and assign the result to rdtp->dynticks_nesting.

That's OK. I made rcu_user_enter_irq() support nesting irqs only to stay
consistent with the same change in rcu_user_exit_irq(), but this is not
necessary: rcu_user_enter_irq() itself is only called from the outermost irq.

The current code in the rcu/idle tree also has a sanity check that ensures
this, so rcu_user_enter_irq() can stay as it is there.

I'll resend the patch with only the change in rcu_user_exit_irq().

Thanks.

* [PATCH v2] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-10 18:06             ` Paul E. McKenney
  2012-06-10 20:29               ` Frederic Weisbecker
@ 2012-06-10 21:47               ` Frederic Weisbecker
  2012-06-11 21:55                 ` Paul E. McKenney
  1 sibling, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-10 21:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sat, 9 Jun 2012 14:06:30 +0200
Subject: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs

rcu_user_exit_irq(), which exits RCU idle mode after the current irq
returns, was designed to be called from non-nesting irqs only.

However, the IPI that restarts the tick and exits RCU user-idle mode
in nohz cpusets can happen at any time. For example, it can be a nesting
irq that interrupts a softirq. In this case the stack of RCU API
calls becomes:

==> IRQ
    rcu_irq_enter()
    ....
    do_softirq {
===== > IRQ (restart tick IPI)
        rcu_irq_enter()
        rcu_user_exit_irq()
        rcu_irq_exit()
<=====
    }
    rcu_irq_exit()

Hence we need to make rcu_user_exit_irq() callable from any nesting
level of interrupt.

v2: rcu_user_enter_irq() is only called from non-nesting irqs, though, so
the change doesn't need to be propagated to it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/rcutree.c |   19 ++++++++-----------
 1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 1b0dca2..7a54265 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -619,12 +619,12 @@ void rcu_user_exit(void)
 
 /**
  * rcu_user_exit_irq - inform RCU that we won't resume to userspace
- * idle mode after the current irq returns.
+ * idle mode after the current non-nesting irq returns.
  *
- * This is similar to rcu_user_exit() but in the context of a non
- * nesting irq. This is called when the irq has interrupted a userspace
- * RCU idle mode context. When the interrupt returns after this call,
- * the CPU won't restore the RCU idle mode.
+ * This is similar to rcu_user_exit() but in the context of an
+ * irq. This is called when the irq has interrupted a userspace
+ * RCU idle mode context. When the current non-nesting interrupt
+ * returns after this call, the CPU won't restore the RCU idle mode.
  */
 void rcu_user_exit_irq(void)
 {
@@ -633,12 +633,9 @@ void rcu_user_exit_irq(void)
 
 	local_irq_save(flags);
 	rdtp = &__get_cpu_var(rcu_dynticks);
-	/*
-	 * Ensure this irq is a non-nesting one interrupting
-	 * an RCU idle mode.
-	 */
-	WARN_ON_ONCE(rdtp->dynticks_nesting != 1);
-	rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE + 1;
+	/* Ensure we are interrupting an RCU idle mode. */
+	WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
+	rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
 	local_irq_restore(flags);
 }
 
-- 
1.7.0.4


* Re: [PATCH v2] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-10 21:47               ` [PATCH v2] " Frederic Weisbecker
@ 2012-06-11 21:55                 ` Paul E. McKenney
  2012-06-11 22:06                   ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2012-06-11 21:55 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Sun, Jun 10, 2012 at 11:47:26PM +0200, Frederic Weisbecker wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Sat, 9 Jun 2012 14:06:30 +0200
> Subject: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
> 
> rcu_user_exit_irq(), which exits RCU idle mode after the current irq
> returns, was designed to be called from non-nesting irqs only.
> 
> However, the IPI that restarts the tick and exits RCU user-idle mode
> in nohz cpusets can happen at any time. For example, it can be a nesting
> irq that interrupts a softirq. In this case the stack of RCU API
> calls becomes:
> 
> ==> IRQ
>     rcu_irq_enter()
>     ....
>     do_softirq {
> ===== > IRQ (restart tick IPI)
>         rcu_irq_enter()
>         rcu_user_exit_irq()
>         rcu_irq_exit()
> <=====
>     }
>     rcu_irq_exit()
> 
> Hence we need to make rcu_user_exit_irq() callable from any nesting
> level of interrupt.
> 
> v2: rcu_user_enter_irq() is only called from non-nesting irqs, though, so
> the change doesn't need to be propagated to it.

I have queued this on -rcu on branch rcu/idle.  I have also removed
the TINY_RCU commits as discussed.  If you ever need them back, the
magic branch name is idle.2012.06.06a.

							Thanx, Paul

> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  kernel/rcutree.c |   19 ++++++++-----------
>  1 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 1b0dca2..7a54265 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -619,12 +619,12 @@ void rcu_user_exit(void)
> 
>  /**
>   * rcu_user_exit_irq - inform RCU that we won't resume to userspace
> - * idle mode after the current irq returns.
> + * idle mode after the current non-nesting irq returns.
>   *
> - * This is similar to rcu_user_exit() but in the context of a non
> - * nesting irq. This is called when the irq has interrupted a userspace
> - * RCU idle mode context. When the interrupt returns after this call,
> - * the CPU won't restore the RCU idle mode.
> + * This is similar to rcu_user_exit() but in the context of an
> + * irq. This is called when the irq has interrupted a userspace
> + * RCU idle mode context. When the current non-nesting interrupt
> + * returns after this call, the CPU won't restore the RCU idle mode.
>   */
>  void rcu_user_exit_irq(void)
>  {
> @@ -633,12 +633,9 @@ void rcu_user_exit_irq(void)
> 
>  	local_irq_save(flags);
>  	rdtp = &__get_cpu_var(rcu_dynticks);
> -	/*
> -	 * Ensure this irq is a non-nesting one interrupting
> -	 * an RCU idle mode.
> -	 */
> -	WARN_ON_ONCE(rdtp->dynticks_nesting != 1);
> -	rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE + 1;
> +	/* Ensure we are interrupting an RCU idle mode. */
> +	WARN_ON_ONCE(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK);
> +	rdtp->dynticks_nesting += DYNTICK_TASK_EXIT_IDLE;
>  	local_irq_restore(flags);
>  }
> 
> -- 
> 1.7.0.4
> 
> 


* Re: [PATCH v2] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
  2012-06-11 21:55                 ` Paul E. McKenney
@ 2012-06-11 22:06                   ` Frederic Weisbecker
  0 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2012-06-11 22:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Kevin Hilman, Max Krasnyansky,
	Peter Zijlstra, Stephen Hemminger, Steven Rostedt,
	Sven-Thorsten Dietrich, Thomas Gleixner

On Mon, Jun 11, 2012 at 02:55:46PM -0700, Paul E. McKenney wrote:
> On Sun, Jun 10, 2012 at 11:47:26PM +0200, Frederic Weisbecker wrote:
> > From: Frederic Weisbecker <fweisbec@gmail.com>
> > Date: Sat, 9 Jun 2012 14:06:30 +0200
> > Subject: [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs
> > 
> > rcu_user_exit_irq(), which exits RCU idle mode after the current irq
> > returns, was designed to be called from non-nesting irqs only.
> > 
> > However, the IPI that restarts the tick and exits RCU user-idle mode
> > in nohz cpusets can happen at any time. For example, it can be a nesting
> > irq that interrupts a softirq. In this case the stack of RCU API
> > calls becomes:
> > 
> > ==> IRQ
> >     rcu_irq_enter()
> >     ....
> >     do_softirq {
> > ===== > IRQ (restart tick IPI)
> >         rcu_irq_enter()
> >         rcu_user_exit_irq()
> >         rcu_irq_exit()
> > <=====
> >     }
> >     rcu_irq_exit()
> > 
> > Hence we need to make rcu_user_exit_irq() callable from any nesting
> > level of interrupt.
> > 
> > v2: rcu_user_enter_irq() is only called from non-nesting irqs, though, so
> > the change doesn't need to be propagated to it.
> 
> I have queued this on -rcu on branch rcu/idle.  I have also removed
> the TINY_RCU commits as discussed.  If you ever need them back, the
> magic branch name is idle.2012.06.06a.

Thanks a lot!

Thread overview: 18+ messages
2012-06-04 12:08 [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz fweisbec
2012-06-04 12:08 ` [PATCH 1/2] rcu: New rcu_user_enter() and rcu_user_exit() APIs fweisbec
2012-06-04 12:08 ` [PATCH 2/2] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs fweisbec
2012-06-04 18:13 ` [PATCH 0/2] rcu: Extended quiescent state for adaptive nohz Paul E. McKenney
2012-06-04 19:06   ` Frederic Weisbecker
2012-06-04 21:07     ` Paul E. McKenney
2012-06-05 10:31       ` Frederic Weisbecker
2012-06-05 23:46         ` Paul E. McKenney
2012-06-07 14:21           ` Frederic Weisbecker
2012-06-07 22:45             ` Paul E. McKenney
2012-06-09 22:58               ` Frederic Weisbecker
2012-06-10 17:54                 ` Paul E. McKenney
2012-06-09 22:55           ` [PATCH] rcu: Allow calls to rcu_exit_user_irq from nesting irqs Frederic Weisbecker
2012-06-10 18:06             ` Paul E. McKenney
2012-06-10 20:29               ` Frederic Weisbecker
2012-06-10 21:47               ` [PATCH v2] " Frederic Weisbecker
2012-06-11 21:55                 ` Paul E. McKenney
2012-06-11 22:06                   ` Frederic Weisbecker
