* [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
From: Frederic Weisbecker @ 2011-11-17 17:48 UTC
To: Paul E. McKenney
Cc: LKML, Frederic Weisbecker, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Josh Triplett
These two APIs were provided to combine the calls to
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq-disabled section, so that no interrupt happening in between
would needlessly process any RCU work.

However, this is an optimization whose benefit has yet to be
measured. Let's start simple and completely decouple the RCU idle
and dyntick-idle logics.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Triplett <josh@joshtriplett.org>
---
arch/arm/kernel/process.c | 6 +++-
arch/avr32/kernel/process.c | 6 +++-
arch/blackfin/kernel/process.c | 6 +++-
arch/microblaze/kernel/process.c | 6 +++-
arch/mips/kernel/process.c | 6 +++-
arch/openrisc/kernel/idle.c | 6 +++-
arch/powerpc/kernel/idle.c | 15 +++++-----
arch/powerpc/platforms/iseries/setup.c | 12 +++++---
arch/s390/kernel/process.c | 6 +++-
arch/sh/kernel/idle.c | 6 +++-
arch/sparc/kernel/process_64.c | 6 +++-
arch/tile/kernel/process.c | 6 +++-
arch/um/kernel/process.c | 6 +++-
arch/unicore32/kernel/process.c | 6 +++-
arch/x86/kernel/process_32.c | 6 +++-
include/linux/tick.h | 47 +-------------------------------
kernel/time/tick-sched.c | 15 +++++-----
17 files changed, 76 insertions(+), 91 deletions(-)
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 4f83362..0e42a9c 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -183,7 +183,8 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
leds_event(led_idle_start);
while (!need_resched()) {
#ifdef CONFIG_HOTPLUG_CPU
@@ -210,7 +211,8 @@ void cpu_idle(void)
}
}
leds_event(led_idle_end);
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/avr32/kernel/process.c b/arch/avr32/kernel/process.c
index 34c8c70..ea33957 100644
--- a/arch/avr32/kernel/process.c
+++ b/arch/avr32/kernel/process.c
@@ -34,10 +34,12 @@ void cpu_idle(void)
{
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched())
cpu_idle_sleep();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/blackfin/kernel/process.c b/arch/blackfin/kernel/process.c
index 57e0749..8dd0416 100644
--- a/arch/blackfin/kernel/process.c
+++ b/arch/blackfin/kernel/process.c
@@ -88,10 +88,12 @@ void cpu_idle(void)
#endif
if (!idle)
idle = default_idle;
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched())
idle();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
index c6ece38..37ed945 100644
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -103,10 +103,12 @@ void cpu_idle(void)
if (!idle)
idle = default_idle;
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched())
idle();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 7df2ffc..7937367 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -56,7 +56,8 @@ void __noreturn cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched() && cpu_online(cpu)) {
#ifdef CONFIG_MIPS_MT_SMTC
extern void smtc_idle_loop_hook(void);
@@ -77,7 +78,8 @@ void __noreturn cpu_idle(void)
system_state == SYSTEM_BOOTING))
play_dead();
#endif
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/openrisc/kernel/idle.c b/arch/openrisc/kernel/idle.c
index 2e82cd0..e5fc7887 100644
--- a/arch/openrisc/kernel/idle.c
+++ b/arch/openrisc/kernel/idle.c
@@ -51,7 +51,8 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched()) {
check_pgt_cache();
@@ -69,7 +70,8 @@ void cpu_idle(void)
set_thread_flag(TIF_POLLING_NRFLAG);
}
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index 3cd73d1..9c3cd49 100644
--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -62,10 +62,10 @@ void cpu_idle(void)
set_thread_flag(TIF_POLLING_NRFLAG);
while (1) {
- if (idle_uses_rcu)
- tick_nohz_idle_enter();
- else
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ if (!idle_uses_rcu)
+ rcu_idle_enter();
+
while (!need_resched() && !cpu_should_die()) {
ppc64_runlatch_off();
@@ -102,10 +102,9 @@ void cpu_idle(void)
HMT_medium();
ppc64_runlatch_on();
- if (idle_uses_rcu)
- tick_nohz_idle_exit();
- else
- tick_nohz_idle_exit_norcu();
+ if (!idle_uses_rcu)
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
if (cpu_should_die())
cpu_die();
diff --git a/arch/powerpc/platforms/iseries/setup.c b/arch/powerpc/platforms/iseries/setup.c
index 77ff6eb..097f7d5 100644
--- a/arch/powerpc/platforms/iseries/setup.c
+++ b/arch/powerpc/platforms/iseries/setup.c
@@ -562,7 +562,8 @@ static void yield_shared_processor(void)
static void iseries_shared_idle(void)
{
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched() && !hvlpevent_is_pending()) {
local_irq_disable();
ppc64_runlatch_off();
@@ -576,7 +577,8 @@ static void iseries_shared_idle(void)
}
ppc64_runlatch_on();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
if (hvlpevent_is_pending())
process_iSeries_events();
@@ -592,7 +594,8 @@ static void iseries_dedicated_idle(void)
set_thread_flag(TIF_POLLING_NRFLAG);
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
if (!need_resched()) {
while (!need_resched()) {
ppc64_runlatch_off();
@@ -609,7 +612,8 @@ static void iseries_dedicated_idle(void)
}
ppc64_runlatch_on();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 44028ae..bf2bc31 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -90,10 +90,12 @@ static void default_idle(void)
void cpu_idle(void)
{
for (;;) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched())
default_idle();
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c
index ad58e75..406508d 100644
--- a/arch/sh/kernel/idle.c
+++ b/arch/sh/kernel/idle.c
@@ -89,7 +89,8 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched()) {
check_pgt_cache();
@@ -111,7 +112,8 @@ void cpu_idle(void)
start_critical_timings();
}
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 78b1bc0..fde8d72 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -95,12 +95,14 @@ void cpu_idle(void)
set_thread_flag(TIF_POLLING_NRFLAG);
while(1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched() && !cpu_is_offline(cpu))
sparc64_yield(cpu);
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
diff --git a/arch/tile/kernel/process.c b/arch/tile/kernel/process.c
index 53ac895..4c1ac6e 100644
--- a/arch/tile/kernel/process.c
+++ b/arch/tile/kernel/process.c
@@ -85,7 +85,8 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched()) {
if (cpu_is_offline(cpu))
BUG(); /* no HOTPLUG_CPU */
@@ -105,7 +106,8 @@ void cpu_idle(void)
local_irq_enable();
current_thread_info()->status |= TS_POLLING;
}
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 9e7176b..b652ff1 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -245,10 +245,12 @@ void default_idle(void)
if (need_resched())
schedule();
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
nsecs = disable_timer();
idle_sleep(nsecs);
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
}
}
diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index 095ff5a..52edc2b 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -55,7 +55,8 @@ void cpu_idle(void)
{
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched()) {
local_irq_disable();
stop_critical_timings();
@@ -63,7 +64,8 @@ void cpu_idle(void)
local_irq_enable();
start_critical_timings();
}
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index f311d096..44e3384 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -98,7 +98,8 @@ void cpu_idle(void)
/* endless idle loop with no priority at all */
while (1) {
- tick_nohz_idle_enter_norcu();
+ tick_nohz_idle_enter();
+ rcu_idle_enter();
while (!need_resched()) {
check_pgt_cache();
@@ -114,7 +115,8 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
}
- tick_nohz_idle_exit_norcu();
+ rcu_idle_exit();
+ tick_nohz_idle_exit();
preempt_enable_no_resched();
schedule();
preempt_disable();
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 327434a..ab8be90 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -122,45 +122,8 @@ static inline int tick_oneshot_mode_active(void) { return 0; }
#endif /* !CONFIG_GENERIC_CLOCKEVENTS */
# ifdef CONFIG_NO_HZ
-extern void __tick_nohz_idle_enter(void);
-static inline void tick_nohz_idle_enter(void)
-{
- local_irq_disable();
- __tick_nohz_idle_enter();
- local_irq_enable();
-}
+extern void tick_nohz_idle_enter(void);
extern void tick_nohz_idle_exit(void);
-
-/*
- * Call this pair of function if the arch doesn't make any use
- * of RCU in-between. You won't need to call rcu_idle_enter() and
- * rcu_idle_exit().
- * Otherwise you need to call tick_nohz_idle_enter() and tick_nohz_idle_exit()
- * and explicitly tell RCU about the window around the place the CPU enters low
- * power mode where no RCU use is made. This is done by calling rcu_idle_enter()
- * after the last use of RCU before the CPU is put to sleep and by calling
- * rcu_idle_exit() before the first use of RCU after the CPU woke up.
- */
-static inline void tick_nohz_idle_enter_norcu(void)
-{
- /*
- * Also call rcu_idle_enter() in the irq disabled section even
- * if it disables irq itself.
- * Just an optimization that prevents from an interrupt happening
- * between it and __tick_nohz_idle_enter() to lose time to help
- * completing a grace period while we could be in extended grace
- * period already.
- */
- local_irq_disable();
- __tick_nohz_idle_enter();
- rcu_idle_enter();
- local_irq_enable();
-}
-static inline void tick_nohz_idle_exit_norcu(void)
-{
- rcu_idle_exit();
- tick_nohz_idle_exit();
-}
extern void tick_nohz_irq_exit(void);
extern ktime_t tick_nohz_get_sleep_length(void);
extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
@@ -168,14 +131,6 @@ extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
# else
static inline void tick_nohz_idle_enter(void) { }
static inline void tick_nohz_idle_exit(void) { }
-static inline void tick_nohz_idle_enter_norcu(void)
-{
- rcu_idle_enter();
-}
-static inline void tick_nohz_idle_exit_norcu(void)
-{
- rcu_idle_exit();
-}
static inline ktime_t tick_nohz_get_sleep_length(void)
{
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 360d028..0d887e8 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -425,21 +425,20 @@ out:
* When the next event is more than a tick into the future, stop the idle tick
* Called when we start the idle loop.
*
- * If no use of RCU is made in the idle loop between
- * tick_nohz_idle_enter() and tick_nohz_idle_exit() calls, then
- * tick_nohz_idle_enter_norcu() should be called instead and the arch
- * doesn't need to call rcu_idle_enter() and rcu_idle_exit() explicitly.
- *
- * Otherwise the arch is responsible of calling:
+ * The arch is responsible for calling:
*
* - rcu_idle_enter() after its last use of RCU before the CPU is put
* to sleep.
* - rcu_idle_exit() before the first use of RCU after the CPU is woken up.
*/
-void __tick_nohz_idle_enter(void)
+void tick_nohz_idle_enter(void)
{
struct tick_sched *ts;
+ WARN_ON_ONCE(irqs_disabled());
+
+ local_irq_disable();
+
ts = &__get_cpu_var(tick_cpu_sched);
/*
* set ts->inidle unconditionally. even if the system did not
@@ -448,6 +447,8 @@ void __tick_nohz_idle_enter(void)
*/
ts->inidle = 1;
tick_nohz_stop_sched_tick(ts);
+
+ local_irq_enable();
}
/**
--
1.7.5.4
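For reference, every architecture converted above ends up with the same
idle-loop shape. Here is a minimal sketch of that post-patch pattern;
arch_idle() is a hypothetical stand-in for the per-architecture low-power
primitive, and the real loops wrap arch-specific details (LEDs, runlatch,
hotplug checks) around it:

	/* Sketch of the post-patch idle loop; arch_idle() is hypothetical. */
	void cpu_idle(void)
	{
		/* endless idle loop with no priority at all */
		while (1) {
			tick_nohz_idle_enter();	/* stop the periodic tick */
			rcu_idle_enter();	/* after the last use of RCU */
			while (!need_resched())
				arch_idle();	/* arch low-power wait */
			rcu_idle_exit();	/* before the first use of RCU */
			tick_nohz_idle_exit();	/* restart the tick */
			preempt_enable_no_resched();
			schedule();
			preempt_disable();
		}
	}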
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
From: Josh Triplett @ 2011-11-17 20:11 UTC
To: Frederic Weisbecker
Cc: Paul E. McKenney, LKML, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra
On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
> These two APIs were provided to combine the calls to
> tick_nohz_idle_enter() and rcu_idle_enter() into a single
> irq-disabled section, so that no interrupt happening in between
> would needlessly process any RCU work.
>
> However, this is an optimization whose benefit has yet to be
> measured. Let's start simple and completely decouple the RCU idle
> and dyntick-idle logics.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[... quoted patch trimmed ...]
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
From: Paul E. McKenney @ 2011-11-18 1:03 UTC
To: Josh Triplett
Cc: Frederic Weisbecker, LKML, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra
On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
> On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
> > These two APIs were provided to combine the calls to
> > tick_nohz_idle_enter() and rcu_idle_enter() into a single
> > irq-disabled section, so that no interrupt happening in between
> > would needlessly process any RCU work.
> >
> > However, this is an optimization whose benefit has yet to be
> > measured. Let's start simple and completely decouple the RCU idle
> > and dyntick-idle logics.
> >
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Josh Triplett <josh@joshtriplett.org>
>
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Merged, thank you both!
Thanx, Paul
[... quoted patch trimmed ...]
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
From: Paul E. McKenney @ 2011-11-19 0:50 UTC
To: Josh Triplett
Cc: Frederic Weisbecker, LKML, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra
On Thu, Nov 17, 2011 at 05:03:44PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
> > On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
> > > These two APIs were provided to combine the calls to
> > > tick_nohz_idle_enter() and rcu_idle_enter() into a single
> > > irq-disabled section, so that no interrupt happening in between
> > > would needlessly process any RCU work.
> > >
> > > However, this is an optimization whose benefit has yet to be
> > > measured. Let's start simple and completely decouple the RCU idle
> > > and dyntick-idle logics.
> > >
> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Josh Triplett <josh@joshtriplett.org>
> >
> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
>
> Merged, thank you both!
And here is a patch on top of yours to allow nesting of rcu_idle_enter()
and rcu_idle_exit(). Thoughts?
Thanx, Paul
------------------------------------------------------------------------
rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
Running user tasks in dyntick-idle mode requires RCU to undergo
an idle-to-non-idle transition on each entry into the kernel, and
vice versa on each exit from the kernel. However, situations where
user tasks cannot run in dyntick-idle mode (for example, when there
is more than one runnable task on the CPU in question) also require
RCU to undergo an idle-to-non-idle transition when coming out of the
idle loop (and vice versa when entering the idle loop). In this case,
RCU would see one idle-to-non-idle transition when the task became
runnable, and another when the task executed a system call.
Therefore, rcu_idle_enter() and rcu_idle_exit() must handle nested
calls, which this commit provides for.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/rcu.h b/kernel/rcu.h
index aa88baa..8a76a5b 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -33,8 +33,24 @@
* Process-level increment to ->dynticks_nesting field. This allows for
* architectures that use half-interrupts and half-exceptions from
* process context.
+ *
+ * DYNTICK_TASK_NESTING_MASK is a three-bit field that counts the number
+ * of process-based reasons why RCU cannot consider the corresponding CPU
+ * to be idle, and DYNTICK_TASK_NESTING_VALUE is the value used to increment
+ * or decrement this three-bit field. The rest of the bits could in
+ * principle be used to count interrupts, but this would mean that a
+ * negative-one value in the interrupt field could incorrectly zero out
+ * the DYNTICK_TASK_NESTING_MASK field. We therefore provide a two-bit
+ * guard field defined by DYNTICK_TASK_MASK that is set to DYNTICK_TASK_FLAG
+ * upon initial exit from idle. The DYNTICK_TASK_EXIT_IDLE value is
+ * thus the combined value used upon initial exit from idle.
*/
-#define DYNTICK_TASK_NESTING (LLONG_MAX / 2 - 1)
+#define DYNTICK_TASK_NESTING_VALUE (LLONG_MAX / 8 + 1)
+#define DYNTICK_TASK_NESTING_MASK (LLONG_MAX - DYNTICK_TASK_NESTING_VALUE + 1)
+#define DYNTICK_TASK_FLAG ((DYNTICK_TASK_NESTING_VALUE / 8) * 2)
+#define DYNTICK_TASK_MASK ((DYNTICK_TASK_NESTING_VALUE / 8) * 3)
+#define DYNTICK_TASK_EXIT_IDLE (DYNTICK_TASK_NESTING_VALUE + \
+ DYNTICK_TASK_FLAG)
/*
* debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index e5bd949..10523d6 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -53,7 +53,7 @@ static void __call_rcu(struct rcu_head *head,
#include "rcutiny_plugin.h"
-static long long rcu_dynticks_nesting = DYNTICK_TASK_NESTING;
+static long long rcu_dynticks_nesting = DYNTICK_TASK_NESTING_VALUE;
/* Common code for rcu_idle_enter() and rcu_irq_exit(), see kernel/rcutree.c. */
static void rcu_idle_enter_common(long long oldval)
@@ -88,7 +88,12 @@ void rcu_idle_enter(void)
local_irq_save(flags);
oldval = rcu_dynticks_nesting;
- rcu_dynticks_nesting = 0;
+ WARN_ON_ONCE((rcu_dynticks_nesting & DYNTICK_TASK_NESTING_MASK) == 0);
+ if ((rcu_dynticks_nesting & DYNTICK_TASK_NESTING_MASK) ==
+ DYNTICK_TASK_NESTING_VALUE)
+ rcu_dynticks_nesting = 0;
+ else
+ rcu_dynticks_nesting -= DYNTICK_TASK_NESTING_VALUE;
rcu_idle_enter_common(oldval);
local_irq_restore(flags);
}
@@ -140,8 +145,11 @@ void rcu_idle_exit(void)
local_irq_save(flags);
oldval = rcu_dynticks_nesting;
- WARN_ON_ONCE(oldval != 0);
- rcu_dynticks_nesting = DYNTICK_TASK_NESTING;
+ WARN_ON_ONCE(rcu_dynticks_nesting < 0);
+ if (rcu_dynticks_nesting & DYNTICK_TASK_NESTING_MASK)
+ rcu_dynticks_nesting += DYNTICK_TASK_NESTING_VALUE;
+ else
+ rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
rcu_idle_exit_common(oldval);
local_irq_restore(flags);
}
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 7fb8b0e..f1a3379 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -196,7 +196,7 @@ void rcu_note_context_switch(int cpu)
EXPORT_SYMBOL_GPL(rcu_note_context_switch);
DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
- .dynticks_nesting = DYNTICK_TASK_NESTING,
+ .dynticks_nesting = DYNTICK_TASK_NESTING_VALUE,
.dynticks = ATOMIC_INIT(1),
};
@@ -394,7 +394,11 @@ void rcu_idle_enter(void)
local_irq_save(flags);
rdtp = &__get_cpu_var(rcu_dynticks);
oldval = rdtp->dynticks_nesting;
- rdtp->dynticks_nesting = 0;
+ WARN_ON_ONCE((oldval & DYNTICK_TASK_NESTING_MASK) == 0);
+ if ((oldval & DYNTICK_TASK_NESTING_MASK) == DYNTICK_TASK_NESTING_VALUE)
+ rdtp->dynticks_nesting = 0;
+ else
+ rdtp->dynticks_nesting -= DYNTICK_TASK_NESTING_VALUE;
rcu_idle_enter_common(rdtp, oldval);
local_irq_restore(flags);
}
@@ -481,8 +485,11 @@ void rcu_idle_exit(void)
local_irq_save(flags);
rdtp = &__get_cpu_var(rcu_dynticks);
oldval = rdtp->dynticks_nesting;
- WARN_ON_ONCE(oldval != 0);
- rdtp->dynticks_nesting = DYNTICK_TASK_NESTING;
+ WARN_ON_ONCE(oldval < 0);
+ if (oldval & DYNTICK_TASK_NESTING_MASK)
+ rdtp->dynticks_nesting += DYNTICK_TASK_NESTING_VALUE;
+ else
+ rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
rcu_idle_exit_common(rdtp, oldval);
local_irq_restore(flags);
}
@@ -2028,7 +2035,8 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
rdp->nxttail[i] = &rdp->nxtlist;
rdp->qlen = 0;
rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
- WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_NESTING);
+ WARN_ON_ONCE(rdp->dynticks->dynticks_nesting !=
+ DYNTICK_TASK_NESTING_VALUE);
WARN_ON_ONCE(atomic_read(&rdp->dynticks->dynticks) != 1);
rdp->cpu = cpu;
rdp->rsp = rsp;
@@ -2056,8 +2064,10 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible)
rdp->qlen_last_fqs_check = 0;
rdp->n_force_qs_snap = rsp->n_force_qs;
rdp->blimit = blimit;
- WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_NESTING);
+ rdp->dynticks->dynticks_nesting = DYNTICK_TASK_NESTING_VALUE;
WARN_ON_ONCE((atomic_read(&rdp->dynticks->dynticks) & 0x1) != 1);
+ atomic_set(&rdp->dynticks->dynticks,
+ (atomic_read(&rdp->dynticks->dynticks) & ~0x1) + 1);
raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
/*
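To make the bit layout above concrete: with LLONG_MAX = 2^63 - 1,
DYNTICK_TASK_NESTING_VALUE works out to 2^60, so bits 60-62 hold the
task-level nesting count. The following is a minimal user-space sketch
(not kernel code) of the task-level arithmetic from the patch;
idle_enter()/idle_exit() are hypothetical stand-ins for the corresponding
branches of rcu_idle_enter()/rcu_idle_exit(), ignoring irqs and the
dynticks atomic counter:

	#include <limits.h>
	#include <stdio.h>

	/* Constants as defined in kernel/rcu.h by the patch above. */
	#define DYNTICK_TASK_NESTING_VALUE (LLONG_MAX / 8 + 1)
	#define DYNTICK_TASK_NESTING_MASK (LLONG_MAX - DYNTICK_TASK_NESTING_VALUE + 1)
	#define DYNTICK_TASK_FLAG ((DYNTICK_TASK_NESTING_VALUE / 8) * 2)
	#define DYNTICK_TASK_EXIT_IDLE (DYNTICK_TASK_NESTING_VALUE + DYNTICK_TASK_FLAG)

	static long long nesting;	/* stands in for ->dynticks_nesting */

	static void idle_exit(void)	/* task-level part of rcu_idle_exit() */
	{
		if (nesting & DYNTICK_TASK_NESTING_MASK)
			nesting += DYNTICK_TASK_NESTING_VALUE;	/* nested exit */
		else
			nesting = DYNTICK_TASK_EXIT_IDLE;	/* outermost exit */
	}

	static void idle_enter(void)	/* task-level part of rcu_idle_enter() */
	{
		if ((nesting & DYNTICK_TASK_NESTING_MASK) ==
		    DYNTICK_TASK_NESTING_VALUE)
			nesting = 0;	/* outermost enter: truly idle */
		else
			nesting -= DYNTICK_TASK_NESTING_VALUE;
	}

	int main(void)
	{
		idle_exit();	/* task becomes runnable */
		idle_exit();	/* nested: e.g. a system call */
		idle_enter();	/* inner enter: must stay non-idle */
		printf("non-idle: %d\n",
		       (nesting & DYNTICK_TASK_NESTING_MASK) != 0);	/* 1 */
		idle_enter();	/* outermost enter: idle again */
		printf("idle: %d\n", nesting == 0);			/* 1 */
		return 0;
	}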
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
From: Frederic Weisbecker @ 2011-11-21 1:46 UTC
To: paulmck; +Cc: Josh Triplett, LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra
2011/11/19 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Thu, Nov 17, 2011 at 05:03:44PM -0800, Paul E. McKenney wrote:
>> On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
>> > On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
>> > > These two APIs were provided to combine the calls to
>> > > tick_nohz_idle_enter() and rcu_idle_enter() into a single
>> > > irq-disabled section, so that no interrupt happening in between
>> > > would needlessly process any RCU work.
>> > >
>> > > However, this is an optimization whose benefit has yet to be
>> > > measured. Let's start simple and completely decouple the RCU idle
>> > > and dyntick-idle logics.
>> > >
>> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>> > > Cc: Ingo Molnar <mingo@redhat.com>
>> > > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > > Cc: Peter Zijlstra <peterz@infradead.org>
>> > > Cc: Josh Triplett <josh@joshtriplett.org>
>> >
>> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
>>
>> Merged, thank you both!
>
> And here is a patch on top of yours to allow nesting of rcu_idle_enter()
> and rcu_idle_exit(). Thoughts?
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
>
> Running user tasks in dyntick-idle mode requires RCU to undergo
> an idle-to-non-idle transition on each entry into the kernel, and
> vice versa on each exit from the kernel. However, situations where
> user tasks cannot run in dyntick-idle mode (for example, when there
> is more than one runnable task on the CPU in question) also require
> RCU to undergo an idle-to-non-idle transition when coming out of the
> idle loop (and vice versa when entering the idle loop).
I'm not sure what you mean about the idle loop here, given that dyntick-idle
mode is precisely what we can't enter when we resume to userspace with more
than one task on the runqueue.
> In this case,
> RCU would see one idle-to-non-idle transition when the task became
> runnable, and another when the task executed a system call.
I'm a bit confused by this changelog.
What can happen with the adaptive tickless thing is:
- When we resume to userspace after a syscall/irq/exception and we are
not in RCU extended quiescent state, then switch to it. We may call it RCU
idle mode I guess but that may start to be confusing.
So this may involve several kind of nesting. From a single rcu_idle_enter()
to more complicated scenario if we switch to RCU extended qs from an
an interrupt: rcu_idle_exit() is called on entry of the irq, rcu_idle_enter() is
called in the middle then finally a last call to rcu_idle_enter() in the irq
exit at which point only we want the RCU extended qs to be effective.
- We may also exit that RCU extended qs state by involving other funny
nesting. We have the simple syscall enter that just calls rcu_idle_exit() if
we were in userspace in RCU extended qs. We may also receive an IPI
that enqueues a new task, in which case we may exit the RCU extended
quiescent from the irq with the following nesting:
rcu_idle_exit() on irq entry, then another call to rcu_idle_exit() to prevent
from resuming the RCU extended quiescent state when we come back
to userspace and finally the rcu_idle_enter() in the irq exit.
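To make the pairing visible, here are the two sequences as a sketch,
annotated with a hypothetical nesting count (the number of outstanding
rcu_idle_exit() calls; the count and the comments are illustrative, not
actual kernel code):

	/*
	 * Scenario 1: switch to the RCU extended qs from an irq,
	 * starting outside of it (count == 1):
	 */
	irq_enter();
	rcu_idle_exit();	/* count 1 -> 2: the irq may use RCU */
	/* the handler decides to stop the tick for userspace: */
	rcu_idle_enter();	/* count 2 -> 1 */
	irq_exit();
	rcu_idle_enter();	/* count 1 -> 0: the extended qs only
				 * becomes effective here */

	/*
	 * Scenario 2: an IPI enqueues a new task while we are in the
	 * extended qs (count == 0):
	 */
	irq_enter();
	rcu_idle_exit();	/* count 0 -> 1: leave the extended qs */
	/* the IPI handler pins us out of the extended qs: */
	rcu_idle_exit();	/* count 1 -> 2 */
	irq_exit();
	rcu_idle_enter();	/* count 2 -> 1: still out of the extended
				 * qs when we resume userspace */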
Is that what you had in mind?
Thanks.
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
2011-11-21 1:46 ` Frederic Weisbecker
@ 2011-11-21 5:28 ` Paul E. McKenney
2011-11-21 15:23 ` Frederic Weisbecker
0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2011-11-21 5:28 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Josh Triplett, LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra
On Mon, Nov 21, 2011 at 02:46:58AM +0100, Frederic Weisbecker wrote:
> 2011/11/19 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Thu, Nov 17, 2011 at 05:03:44PM -0800, Paul E. McKenney wrote:
> >> On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
> >> > On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
> >> > > Those two APIs were provided to optimize the calls of
> >> > > tick_nohz_idle_enter() and rcu_idle_enter() into a single
> >> > > irq disabled section. This way no interrupt happening in-between would
> >> > > needlessly process any RCU job.
> >> > >
> >> > > Now we are talking about an optimization for which benefits
> >> > > have yet to be measured. Let's start simple and completely decouple
> >> > > idle rcu and dyntick idle logics to simplify.
> >> > >
> >> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> >> > > Cc: Ingo Molnar <mingo@redhat.com>
> >> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> >> > > Cc: Peter Zijlstra <peterz@infradead.org>
> >> > > Cc: Josh Triplett <josh@joshtriplett.org>
> >> >
> >> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> >>
> >> Merged, thank you both!
> >
> > And here is a patch on top of yours to allow nesting of rcu_idle_enter()
> > and rcu_idle_exit(). Thoughts?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
> >
> > Running user tasks in dyntick-idle mode requires RCU to undergo
> > an idle-to-non-idle transition on each entry into the kernel, and
> > vice versa on each exit from the kernel. However, situations where
> > user tasks cannot run in dyntick-idle mode (for example, when there
> > is more than one runnable task on the CPU in question) also require
> > RCU to undergo an idle-to-non-idle transition when coming out of the
> > idle loop (and vice versa when entering the idle loop).
>
> Not sure what you mean about the idle loop here: dyntick-idle mode is
> precisely what we can't enter when we resume to userspace with more than
> one task in the runqueue.
>
> > In this case,
> > RCU would see one idle-to-non-idle transition when the task became
> > runnable, and another when the task executed a system call.
>
> I'm a bit confused by this changelog.
>
> What can happen with the adaptive tickless thing is:
>
> - When we resume to userspace after a syscall/irq/exception and we are
> not in the RCU extended quiescent state, we switch to it. We may call it RCU
> idle mode I guess, but that may start to be confusing.
> This may involve several kinds of nesting, from a single rcu_idle_enter()
> to a more complicated scenario if we switch to the RCU extended qs from
> an interrupt: rcu_idle_exit() is called on irq entry, rcu_idle_enter() is
> called in the middle, then finally a last call to rcu_idle_enter() on irq
> exit, and only at that point do we want the RCU extended qs to take effect.
>
> - We may also exit that RCU extended qs state through other funny
> nesting. We have the simple syscall entry that just calls rcu_idle_exit() if
> we were in userspace in the RCU extended qs.
OK, so perhaps this is what I am missing. Do you avoid calling
rcu_idle_exit() in the case where the user-mode execution was not an
RCU extended quiescent state? If so, then my patch is not needed,
and I can revert it.
> We may also receive an IPI
> that enqueues a new task, in which case we may exit the RCU extended
> quiescent state from the irq with the following nesting:
> rcu_idle_exit() on irq entry, then another call to rcu_idle_exit() to prevent
> us from resuming the RCU extended quiescent state when we come back
> to userspace, and finally rcu_idle_enter() on irq exit.
>
> Is that what you had in mind?
I was concerned about the following scenario:
1. A CPU is initially idle.
2. Task A wakes up on that CPU, enters user-mode execution
in an RCU extended quiescent state.
3. Task B wakes up on that CPU, forcing the CPU out of its
RCU extended quiescent state. However, Task A is higher
priority than is Task B, so Task A continues running.
4. Task A invokes a system call. If the system-call entry
code were to again invoke rcu_idle_enter(), then my patch
is required. If you check and avoid invoking rcu_idle_enter()
in this case, then my patch is not required.
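For reference, the core of my patch boils down to something like this
(a minimal sketch: the per-CPU counter and the *_extended_qs() helpers
are made-up names; the real code manipulates the dynticks nesting count
in kernel/rcutree.c, with irqs disabled):

	/* Number of outstanding rcu_idle_exit() calls on this CPU. */
	static DEFINE_PER_CPU(int, rcu_nonidle_nesting) = 1;

	void rcu_idle_enter(void)	/* called with irqs disabled */
	{
		if (--__get_cpu_var(rcu_nonidle_nesting) == 0)
			rcu_enter_extended_qs();	/* made-up name */
	}

	void rcu_idle_exit(void)	/* called with irqs disabled */
	{
		if (__get_cpu_var(rcu_nonidle_nesting)++ == 0)
			rcu_exit_extended_qs();		/* made-up name */
	}

That way only the last rcu_idle_enter() of a matched set actually starts
the extended quiescent state, and only the first rcu_idle_exit() ends it.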
Thanx, Paul
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
2011-11-21 5:28 ` Paul E. McKenney
@ 2011-11-21 15:23 ` Frederic Weisbecker
2011-11-21 16:37 ` Paul E. McKenney
0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2011-11-21 15:23 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Josh Triplett, LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra
On Sun, Nov 20, 2011 at 09:28:19PM -0800, Paul E. McKenney wrote:
> On Mon, Nov 21, 2011 at 02:46:58AM +0100, Frederic Weisbecker wrote:
> > 2011/11/19 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > On Thu, Nov 17, 2011 at 05:03:44PM -0800, Paul E. McKenney wrote:
> > >> On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
> > >> > On Thu, Nov 17, 2011 at 06:48:14PM +0100, Frederic Weisbecker wrote:
> > >> > > Those two APIs were provided to optimize the calls of
> > >> > > tick_nohz_idle_enter() and rcu_idle_enter() into a single
> > >> > > irq disabled section. This way no interrupt happening in-between would
> > >> > > needlessly process any RCU job.
> > >> > >
> > >> > > Now we are talking about an optimization for which benefits
> > >> > > have yet to be measured. Let's start simple and completely decouple
> > >> > > idle rcu and dyntick idle logics to simplify.
> > >> > >
> > >> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > >> > > Cc: Ingo Molnar <mingo@redhat.com>
> > >> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > >> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > >> > > Cc: Josh Triplett <josh@joshtriplett.org>
> > >> >
> > >> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> > >>
> > >> Merged, thank you both!
> > >
> > > And here is a patch on top of yours to allow nesting of rcu_idle_enter()
> > > and rcu_idle_exit(). Thoughts?
> > >
> > > Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
> > >
> > > Running user tasks in dyntick-idle mode requires RCU to undergo
> > > an idle-to-non-idle transition on each entry into the kernel, and
> > > vice versa on each exit from the kernel. However, situations where
> > > user tasks cannot run in dyntick-idle mode (for example, when there
> > > is more than one runnable task on the CPU in question) also require
> > > RCU to undergo an idle-to-non-idle transition when coming out of the
> > > idle loop (and vice versa when entering the idle loop).
> >
> > Not sure what you mean about the idle loop here: dyntick-idle mode is
> > precisely what we can't enter when we resume to userspace with more than
> > one task in the runqueue.
> >
> > > In this case,
> > > RCU would see one idle-to-non-idle transition when the task became
> > > runnable, and another when the task executed a system call.
> >
> > I'm a bit confused by this changelog.
> >
> > What can happen with the adaptive tickless thing is:
> >
> > - When we resume to userspace after a syscall/irq/exception and we are
> > not in the RCU extended quiescent state, we switch to it. We may call it RCU
> > idle mode I guess, but that may start to be confusing.
> > This may involve several kinds of nesting, from a single rcu_idle_enter()
> > to a more complicated scenario if we switch to the RCU extended qs from
> > an interrupt: rcu_idle_exit() is called on irq entry, rcu_idle_enter() is
> > called in the middle, then finally a last call to rcu_idle_enter() on irq
> > exit, and only at that point do we want the RCU extended qs to take effect.
> >
> > - We may also exit that RCU extended qs state through other funny
> > nesting. We have the simple syscall entry that just calls rcu_idle_exit() if
> > we were in userspace in the RCU extended qs.
>
> OK, so perhaps this is what I am missing. Do you avoid calling
> rcu_idle_exit() in the case where the user-mode execution was not an
> RCU extended quiescent state? If so, then my patch is not needed,
> and I can revert it.
Yes, if we resume to userspace after a syscall but we have more than one
task in the runqueue, then we don't switch to the RCU extended qs: we don't
call rcu_idle_enter() on syscall return in this case.
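On the return path that boils down to something like this (a sketch;
nr_running_this_cpu() and stop_tick_for_user() are made-up names):

	/* Syscall return to userspace, adaptive tickless: */
	if (nr_running_this_cpu() == 1) {
		stop_tick_for_user();	/* hypothetical tick shutdown */
		rcu_idle_enter();	/* switch to the RCU extended qs */
	}
	/* more than one runnable task: keep the tick, RCU stays active */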
>
> > We may also receive an IPI
> > that enqueues a new task, in which case we may exit the RCU extended
> > quiescent state from the irq with the following nesting:
> > rcu_idle_exit() on irq entry, then another call to rcu_idle_exit() to prevent
> > us from resuming the RCU extended quiescent state when we come back
> > to userspace, and finally rcu_idle_enter() on irq exit.
> >
> > Is that what you had in mind?
>
> I was concerned about the following scenario:
>
> 1. A CPU is initially idle.
>
> 2. Task A wakes up on that CPU, enters user-mode execution
> in an RCU extended quiescent state.
Just in case, I would like to note what happens in detail here:

- Idle notices the need to resched, goes out of its idle loop and
calls rcu_idle_exit(). The scheduler's context switching may need
RCU, and we don't know where the next task will resume: if it's in
the kernel, it may need RCU as well. So we need this unconditional
rcu_idle_exit() that re-enables RCU.

- We also re-enable the tick unconditionally at idle exit time. So
when the user task resumes, the tick is there and may decide to shut
itself down again, in which case we call rcu_idle_enter() if we are in
userspace. Otherwise this is done later when we resume userspace
(syscall or exception return).
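As a rough sketch of that ordering (one_task_running(), in_user_mode()
and stop_tick_for_user() are hypothetical helpers, not actual kernel code):

	/* Idle loop exit: both calls are unconditional. */
	rcu_idle_exit();		/* scheduler / next task may use RCU */
	tick_nohz_idle_exit();		/* the tick is running again */
	schedule();

	/* Later, from the restarted tick path: */
	if (one_task_running() && in_user_mode()) {
		stop_tick_for_user();	/* hypothetical tick shutdown */
		rcu_idle_enter();	/* back into the extended qs */
	}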
>
> 3. Task B wakes up on that CPU, forcing the CPU out of its
> RCU extended quiescent state. However, Task A is higher
> priority than is Task B, so Task A continues running.
Right, but then we have two tasks in the runqueue, so we restart
the tick and call rcu_idle_exit().
>
> 4. Task A invokes a system call. If the system-call entry
> code were to again invoke rcu_idle_enter(), then my patch
> is required. If you check and avoid invoking rcu_idle_enter()
> in this case, then my patch is not required.
You mean rcu_idle_exit()? So yeah: since we have the tick running,
and thus RCU is not in the extended QS, we won't call rcu_idle_exit() on
syscall entry.
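In sketch form, the syscall-entry side (rcu_user_qs() is a made-up
predicate for "userspace was in the extended QS"):

	/* Syscall entry from userspace: */
	if (rcu_user_qs())
		rcu_idle_exit();	/* kernel code may need RCU */
	/* tick running => RCU was never in extended QS => nothing to do */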
* Re: [PATCH] nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
2011-11-21 15:23 ` Frederic Weisbecker
@ 2011-11-21 16:37 ` Paul E. McKenney
0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2011-11-21 16:37 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Josh Triplett, LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra
On Mon, Nov 21, 2011 at 04:23:51PM +0100, Frederic Weisbecker wrote:
> On Sun, Nov 20, 2011 at 09:28:19PM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 21, 2011 at 02:46:58AM +0100, Frederic Weisbecker wrote:
> > > 2011/11/19 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > > > On Thu, Nov 17, 2011 at 05:03:44PM -0800, Paul E. McKenney wrote:
> > > >> On Thu, Nov 17, 2011 at 12:11:34PM -0800, Josh Triplett wrote:
[ . . . ]
> > 4. Task A invokes a system call. If the system-call entry
> > code were to again invoke rcu_idle_enter(), then my patch
> > is required. If you check and avoid invoking rcu_idle_enter()
> > in this case, then my patch is not required.
>
> You mean rcu_idle_exit()? So yeah, since we have the tick running
> and thus RCU not in extended QS, we won't call rcu_idle_exit() on syscall
> entry.
OK, then I will drop my patch. ;-)
Thanx, Paul