* [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4
@ 2014-05-13 14:38 Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work Frederic Weisbecker
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
So this version now implements remote irq works using the generic IPI
interrupt available on most archs, as suggested by Peterz.
Keep in mind that the first patch is really just a draft to build the
mockup. It needs to be turned into an internal state set at boot time, or similar.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz-irq-work-v2
Thanks,
Frederic
---
Frederic Weisbecker (5):
irq_work: Let arch tell us if it can raise irq work
irq_work: Force non-lazy works to the IPI
irq_work: Allow remote queueing
nohz: Move full nohz kick to its own IPI
nohz: Use IPI implicit full barrier against rq->nr_running r/w
arch/alpha/kernel/time.c | 5 +++
arch/arm/kernel/smp.c | 5 +++
arch/powerpc/kernel/time.c | 5 +++
arch/sparc/kernel/pcr.c | 5 +++
arch/x86/kernel/irq_work.c | 7 ++++
include/linux/irq_work.h | 3 ++
include/linux/tick.h | 9 ++++-
kernel/irq_work.c | 83 +++++++++++++++++++++++++++++++---------------
kernel/sched/core.c | 14 ++++----
kernel/sched/sched.h | 12 +++++--
kernel/smp.c | 3 ++
kernel/time/tick-sched.c | 10 +++---
kernel/timer.c | 2 +-
13 files changed, 119 insertions(+), 44 deletions(-)
^ permalink raw reply [flat|nested] 14+ messages in thread

* [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
@ 2014-05-13 14:38 ` Frederic Weisbecker
2014-05-13 17:09 ` Peter Zijlstra
2014-05-13 14:38 ` [PATCH 2/5] irq_work: Force non-lazy works to the IPI Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
We prepare for executing the full nohz kick through an irq work. But
if we do this as is, we'll run into conflicting tick locking: the tick
holds the hrtimer lock and the nohz kick may do so too.
So we need to be able to force the execution of some irq works (more
precisely the non-lazy ones) to the arch irq work interrupt, if the arch
has one.
As a start, we need to know if the arch supports sending its own self-IPIs
and doesn't rely on the tick to execute the works.
This solution uses a weak function. Of course it's ugly and intended
only as a draft. The best approach would be to call a generic
irq_work_set_raisable() only once per arch.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Not-Yet-Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
arch/alpha/kernel/time.c | 5 +++++
arch/arm/kernel/smp.c | 5 +++++
arch/powerpc/kernel/time.c | 5 +++++
arch/sparc/kernel/pcr.c | 5 +++++
arch/x86/kernel/irq_work.c | 7 +++++++
kernel/irq_work.c | 5 +++++
6 files changed, 32 insertions(+)
diff --git a/arch/alpha/kernel/time.c b/arch/alpha/kernel/time.c
index ee39cee..b30d7bd 100644
--- a/arch/alpha/kernel/time.c
+++ b/arch/alpha/kernel/time.c
@@ -65,6 +65,11 @@ void arch_irq_work_raise(void)
set_irq_work_pending_flag();
}
+bool arch_irq_work_can_raise(void)
+{
+ return true;
+}
+
#else /* CONFIG_IRQ_WORK */
#define test_irq_work_pending() 0
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7c4fada..89ff3a3 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -459,6 +459,11 @@ void arch_irq_work_raise(void)
if (is_smp())
smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
}
+
+bool arch_irq_work_can_raise(void)
+{
+ return is_smp();
+}
#endif
static const char *ipi_types[NR_IPI] = {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 122a580..e5381e8 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -472,6 +472,11 @@ void arch_irq_work_raise(void)
preempt_enable();
}
+bool arch_irq_work_can_raise(void)
+{
+ return true;
+}
+
#else /* CONFIG_IRQ_WORK */
#define test_irq_work_pending() 0
diff --git a/arch/sparc/kernel/pcr.c b/arch/sparc/kernel/pcr.c
index 269af58..658f4bc 100644
--- a/arch/sparc/kernel/pcr.c
+++ b/arch/sparc/kernel/pcr.c
@@ -48,6 +48,11 @@ void arch_irq_work_raise(void)
set_softint(1 << PIL_DEFERRED_PCR_WORK);
}
+bool arch_irq_work_can_raise(void)
+{
+ return true;
+}
+
const struct pcr_ops *pcr_ops;
EXPORT_SYMBOL_GPL(pcr_ops);
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index 1de84e3..03e1ee4 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -48,3 +48,10 @@ void arch_irq_work_raise(void)
apic_wait_icr_idle();
#endif
}
+
+#ifdef CONFIG_X86_LOCAL_APIC
+bool arch_irq_work_can_raise(void)
+{
+ return cpu_has_apic;
+}
+#endif
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index a82170e..2a5aad4 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -55,6 +55,11 @@ void __weak arch_irq_work_raise(void)
*/
}
+bool __weak arch_irq_work_can_raise(void)
+{
+ return false;
+}
+
/*
* Enqueue the irq_work @entry unless it's already pending
* somewhere.
--
1.8.3.1
* Re: [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work
2014-05-13 14:38 ` [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work Frederic Weisbecker
@ 2014-05-13 17:09 ` Peter Zijlstra
2014-05-13 19:33 ` Frederic Weisbecker
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2014-05-13 17:09 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Andrew Morton, Ingo Molnar, Kevin Hilman, Paul E. McKenney,
Thomas Gleixner, Viresh Kumar
On Tue, May 13, 2014 at 04:38:37PM +0200, Frederic Weisbecker wrote:
> We prepare for executing the full nohz kick through an irq work. But
> if we do this as is, we'll run into conflicting tick locking: the tick
> holds the hrtimer lock and the nohz kick may do so too.
It does? How does the tick end up holding that lock?
Normal hrtimer callbacks run without holding the hrtimer lock -- I made
it so.
This means tick_sched_timer() is called without hrtimer lock, and I
don't see it taking it anywhere in tick_sched_do_timer() or
tick_sched_handle().
* Re: [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work
2014-05-13 17:09 ` Peter Zijlstra
@ 2014-05-13 19:33 ` Frederic Weisbecker
2014-05-13 20:48 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 19:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Andrew Morton, Ingo Molnar, Kevin Hilman, Paul E. McKenney,
Thomas Gleixner, Viresh Kumar
On Tue, May 13, 2014 at 07:09:42PM +0200, Peter Zijlstra wrote:
> On Tue, May 13, 2014 at 04:38:37PM +0200, Frederic Weisbecker wrote:
> > We prepare for executing the full nohz kick through an irq work. But
> > if we do this as is, we'll run into conflicting tick locking: the tick
> > holds the hrtimer lock and the nohz kick may do so too.
>
> It does? How does the tick end up holding that lock?
>
> Normal hrtimer callbacks run without holding the hrtimer lock -- I made
> it so.
>
> This means tick_sched_timer() is called without hrtimer lock, and I
> don't see it taking it anywhere in tick_sched_do_timer() or
> tick_sched_handle().
Check hrtimer_interrupt(), it takes the per cpu base->lock.
* Re: [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work
2014-05-13 19:33 ` Frederic Weisbecker
@ 2014-05-13 20:48 ` Peter Zijlstra
2014-05-13 21:15 ` Frederic Weisbecker
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2014-05-13 20:48 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Andrew Morton, Ingo Molnar, Kevin Hilman, Paul E. McKenney,
Thomas Gleixner, Viresh Kumar
On Tue, May 13, 2014 at 09:33:29PM +0200, Frederic Weisbecker wrote:
> On Tue, May 13, 2014 at 07:09:42PM +0200, Peter Zijlstra wrote:
> > On Tue, May 13, 2014 at 04:38:37PM +0200, Frederic Weisbecker wrote:
> > > We prepare for executing the full nohz kick through an irq work. But
> > > if we do this as is, we'll run into conflicting tick locking: the tick
> > > holds the hrtimer lock and the nohz kick may do so too.
> >
> > It does? How does the tick end up holding that lock?
> >
> > Normal hrtimer callbacks run without holding the hrtimer lock -- I made
> > it so.
> >
> > This means tick_sched_timer() is called without hrtimer lock, and I
> > don't see it taking it anywhere in tick_sched_do_timer() or
> > tick_sched_handle().
>
> Check hrtimer_interrupt(), it takes the per cpu base->lock.
check __run_hrtimer() which drops base->lock over calling ->function.
* Re: [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work
2014-05-13 20:48 ` Peter Zijlstra
@ 2014-05-13 21:15 ` Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 21:15 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, Andrew Morton, Ingo Molnar, Kevin Hilman, Paul E. McKenney,
Thomas Gleixner, Viresh Kumar
On Tue, May 13, 2014 at 10:48:02PM +0200, Peter Zijlstra wrote:
> On Tue, May 13, 2014 at 09:33:29PM +0200, Frederic Weisbecker wrote:
> > On Tue, May 13, 2014 at 07:09:42PM +0200, Peter Zijlstra wrote:
> > > On Tue, May 13, 2014 at 04:38:37PM +0200, Frederic Weisbecker wrote:
> > > > We prepare for executing the full nohz kick through an irq work. But
> > > > if we do this as is, we'll run into conflicting tick locking: the tick
> > > > holds the hrtimer lock and the nohz kick may do so too.
> > >
> > > It does? How does the tick end up holding that lock?
> > >
> > > Normal hrtimer callbacks run without holding the hrtimer lock -- I made
> > > it so.
> > >
> > > This means tick_sched_timer() is called without hrtimer lock, and I
> > > don't see it taking it anywhere in tick_sched_do_timer() or
> > > tick_sched_handle().
> >
> > Check hrtimer_interrupt(), it takes the per cpu base->lock.
>
> check __run_hrtimer() which drops base->lock over calling ->function.
Oh! I had lockdep splats a few days ago. But I think I worked too many hours
on it and eventually developed some brainfarted pet assumptions all along :-(
It was probably due to some other mistakes of mine. OK, let's try again.
* [PATCH 2/5] irq_work: Force non-lazy works to the IPI
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work Frederic Weisbecker
@ 2014-05-13 14:38 ` Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 3/5] irq_work: Allow remote queueing Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
As we plan to handle the full nohz IPI using irq work, we need to
force non-lazy works outside the tick, because the tick is called under
the hrtimer lock. Running there is not desired for the nohz callback
re-evaluating the tick, because it can take the hrtimer lock itself.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/irq_work.h | 1 +
kernel/irq_work.c | 61 +++++++++++++++++++++++++++---------------------
kernel/timer.c | 2 +-
3 files changed, 36 insertions(+), 28 deletions(-)
diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
index 19ae05d..429b1ba 100644
--- a/include/linux/irq_work.h
+++ b/include/linux/irq_work.h
@@ -34,6 +34,7 @@ void init_irq_work(struct irq_work *work, void (*func)(struct irq_work *))
bool irq_work_queue(struct irq_work *work);
void irq_work_run(void);
+void irq_work_run_tick(void);
void irq_work_sync(struct irq_work *work);
#ifdef CONFIG_IRQ_WORK
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 2a5aad4..292a9ac 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -19,8 +19,8 @@
#include <asm/processor.h>
-static DEFINE_PER_CPU(struct llist_head, irq_work_list);
-static DEFINE_PER_CPU(int, irq_work_raised);
+static DEFINE_PER_CPU(struct llist_head, raised_list);
+static DEFINE_PER_CPU(struct llist_head, lazy_list);
/*
* Claim the entry so that no one else will poke at it.
@@ -68,14 +68,14 @@ bool __weak arch_irq_work_can_raise(void)
*/
bool irq_work_queue(struct irq_work *work)
{
+ unsigned long flags;
+
/* Only queue if not already pending */
if (!irq_work_claim(work))
return false;
- /* Queue the entry and raise the IPI if needed. */
- preempt_disable();
-
- llist_add(&work->llnode, &__get_cpu_var(irq_work_list));
+ /* Make sure an IRQ doesn't stop the tick concurrently */
+ local_irq_save(flags);
/*
* If the work is not "lazy" or the tick is stopped, raise the irq
@@ -83,11 +83,13 @@ bool irq_work_queue(struct irq_work *work)
* for the next tick.
*/
if (!(work->flags & IRQ_WORK_LAZY) || tick_nohz_tick_stopped()) {
- if (!this_cpu_cmpxchg(irq_work_raised, 0, 1))
+ if (llist_add(&work->llnode, &__get_cpu_var(raised_list)))
arch_irq_work_raise();
+ } else {
+ llist_add(&work->llnode, &__get_cpu_var(lazy_list));
}
- preempt_enable();
+ local_irq_restore(flags);
return true;
}
@@ -95,10 +97,7 @@ EXPORT_SYMBOL_GPL(irq_work_queue);
bool irq_work_needs_cpu(void)
{
- struct llist_head *this_list;
-
- this_list = &__get_cpu_var(irq_work_list);
- if (llist_empty(this_list))
+ if (llist_empty(&__get_cpu_var(lazy_list)))
return false;
/* All work should have been flushed before going offline */
@@ -107,28 +106,18 @@ bool irq_work_needs_cpu(void)
return true;
}
-static void __irq_work_run(void)
+static void __irq_work_run(struct llist_head *list)
{
unsigned long flags;
struct irq_work *work;
- struct llist_head *this_list;
struct llist_node *llnode;
-
- /*
- * Reset the "raised" state right before we check the list because
- * an NMI may enqueue after we find the list empty from the runner.
- */
- __this_cpu_write(irq_work_raised, 0);
- barrier();
-
- this_list = &__get_cpu_var(irq_work_list);
- if (llist_empty(this_list))
+ if (llist_empty(list))
return;
BUG_ON(!irqs_disabled());
- llnode = llist_del_all(this_list);
+ llnode = llist_del_all(list);
while (llnode != NULL) {
work = llist_entry(llnode, struct irq_work, llnode);
@@ -160,11 +149,28 @@ static void __irq_work_run(void)
void irq_work_run(void)
{
BUG_ON(!in_irq());
- __irq_work_run();
+ __irq_work_run(&__get_cpu_var(raised_list));
+ __irq_work_run(&__get_cpu_var(lazy_list));
}
EXPORT_SYMBOL_GPL(irq_work_run);
/*
+ * Run the lazy irq_work entries on this cpu from the tick. But let
+ * the IPI handle the others. Some works may require to work outside
+ * the tick due to its locking dependencies (hrtimer lock).
+ */
+void irq_work_run_tick(void)
+{
+ BUG_ON(!in_irq());
+
+ if (!arch_irq_work_can_raise()) {
+ /* No IPI support, we don't have the choice... */
+ __irq_work_run(&__get_cpu_var(raised_list));
+ }
+ __irq_work_run(&__get_cpu_var(lazy_list));
+}
+
+/*
* Synchronize against the irq_work @entry, ensures the entry is not
* currently in use.
*/
@@ -188,7 +194,8 @@ static int irq_work_cpu_notify(struct notifier_block *self,
/* Called from stop_machine */
if (WARN_ON_ONCE(cpu != smp_processor_id()))
break;
- __irq_work_run();
+ __irq_work_run(&__get_cpu_var(raised_list));
+ __irq_work_run(&__get_cpu_var(lazy_list));
break;
default:
break;
diff --git a/kernel/timer.c b/kernel/timer.c
index 3bb01a3..0251dfa 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1384,7 +1384,7 @@ void update_process_times(int user_tick)
rcu_check_callbacks(cpu, user_tick);
#ifdef CONFIG_IRQ_WORK
if (in_irq())
- irq_work_run();
+ irq_work_run_tick();
#endif
scheduler_tick();
run_posix_cpu_timers(p);
--
1.8.3.1
* [PATCH 3/5] irq_work: Allow remote queueing
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 2/5] irq_work: Force non-lazy works to the IPI Frederic Weisbecker
@ 2014-05-13 14:38 ` Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 5/5] nohz: Use IPI implicit full barrier against rq->nr_running r/w Frederic Weisbecker
4 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
irq work currently only supports local callbacks. However its code
is mostly ready to run remote callbacks, and we have some potential users.
The full nohz subsystem currently open-codes its own remote irq work
on top of the scheduler IPI when it wants a CPU to re-evaluate its next
tick. However, this ad hoc solution bloats the scheduler IPI.
Let's just extend the irq work subsystem to support remote queueing on top
of the generic SMP IPI to handle this kind of user. This shouldn't add
noticeable overhead.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/irq_work.h | 2 ++
kernel/irq_work.c | 17 +++++++++++++++++
kernel/smp.c | 3 +++
3 files changed, 22 insertions(+)
diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
index 429b1ba..511e7f7 100644
--- a/include/linux/irq_work.h
+++ b/include/linux/irq_work.h
@@ -33,6 +33,8 @@ void init_irq_work(struct irq_work *work, void (*func)(struct irq_work *))
#define DEFINE_IRQ_WORK(name, _f) struct irq_work name = { .func = (_f), }
bool irq_work_queue(struct irq_work *work);
+bool irq_work_queue_on(struct irq_work *work, int cpu);
+
void irq_work_run(void);
void irq_work_run_tick(void);
void irq_work_sync(struct irq_work *work);
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 292a9ac..98dab29 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -66,6 +66,23 @@ bool __weak arch_irq_work_can_raise(void)
*
* Can be re-enqueued while the callback is still in progress.
*/
+bool irq_work_queue_on(struct irq_work *work, int cpu)
+{
+ /* Only queue if not already pending */
+ if (!irq_work_claim(work))
+ return false;
+
+ /* All work should have been flushed before going offline */
+ WARN_ON_ONCE(cpu_is_offline(cpu));
+ WARN_ON_ONCE(work->flags & IRQ_WORK_LAZY);
+
+ if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+ native_send_call_func_single_ipi(cpu);
+
+ return true;
+}
+EXPORT_SYMBOL_GPL(irq_work_queue_on);
+
bool irq_work_queue(struct irq_work *work)
{
unsigned long flags;
diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..f5edb96 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -3,6 +3,7 @@
*
* (C) Jens Axboe <jens.axboe@oracle.com> 2008
*/
+#include <linux/irq_work.h>
#include <linux/rcupdate.h>
#include <linux/rculist.h>
#include <linux/kernel.h>
@@ -198,6 +199,8 @@ void generic_smp_call_function_single_interrupt(void)
csd->func(csd->info);
csd_unlock(csd);
}
+
+ irq_work_run();
}
/*
--
1.8.3.1
* [PATCH 4/5] nohz: Move full nohz kick to its own IPI
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
` (2 preceding siblings ...)
2014-05-13 14:38 ` [PATCH 3/5] irq_work: Allow remote queueing Frederic Weisbecker
@ 2014-05-13 14:38 ` Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 5/5] nohz: Use IPI implicit full barrier against rq->nr_running r/w Frederic Weisbecker
4 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
Now that the irq work subsystem can queue remote callbacks, it's
a perfect fit to safely queue IPIs when interrupts are disabled,
without worrying about concurrent callers.
Let's use it for the full dynticks kick to notify a CPU that it's
exiting single-task mode.
This slightly unbloats the scheduler IPI that the nohz code was abusing
for its cool "callable anywhere/anytime" properties.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/tick.h | 9 ++++++++-
kernel/sched/core.c | 5 +----
kernel/sched/sched.h | 2 +-
kernel/time/tick-sched.c | 10 ++++++----
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..8a4987f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -181,7 +181,13 @@ static inline bool tick_nohz_full_cpu(int cpu)
extern void tick_nohz_init(void);
extern void __tick_nohz_full_check(void);
-extern void tick_nohz_full_kick(void);
+extern void tick_nohz_full_kick_cpu(int cpu);
+
+static inline void tick_nohz_full_kick(void)
+{
+ tick_nohz_full_kick_cpu(smp_processor_id());
+}
+
extern void tick_nohz_full_kick_all(void);
extern void __tick_nohz_task_switch(struct task_struct *tsk);
#else
@@ -189,6 +195,7 @@ static inline void tick_nohz_init(void) { }
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void __tick_nohz_full_check(void) { }
+static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void tick_nohz_full_kick(void) { }
static inline void tick_nohz_full_kick_all(void) { }
static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d9d8ece..fb6dfad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1500,9 +1500,7 @@ void scheduler_ipi(void)
*/
preempt_fold_need_resched();
- if (llist_empty(&this_rq()->wake_list)
- && !tick_nohz_full_cpu(smp_processor_id())
- && !got_nohz_idle_kick())
+ if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
return;
/*
@@ -1519,7 +1517,6 @@ void scheduler_ipi(void)
* somewhat pessimize the simple resched case.
*/
irq_enter();
- tick_nohz_full_check();
sched_ttwu_pending();
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 456e492..6089e00 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,7 +1225,7 @@ static inline void inc_nr_running(struct rq *rq)
if (tick_nohz_full_cpu(rq->cpu)) {
/* Order rq->nr_running write against the IPI */
smp_wmb();
- smp_send_reschedule(rq->cpu);
+ tick_nohz_full_kick_cpu(rq->cpu);
}
}
#endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..3d63944 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -224,13 +224,15 @@ static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
};
/*
- * Kick the current CPU if it's full dynticks in order to force it to
+ * Kick the CPU if it's full dynticks in order to force it to
* re-evaluate its dependency on the tick and restart it if necessary.
*/
-void tick_nohz_full_kick(void)
+void tick_nohz_full_kick_cpu(int cpu)
{
- if (tick_nohz_full_cpu(smp_processor_id()))
- irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+ if (!tick_nohz_full_cpu(cpu))
+ return;
+
+ irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
}
static void nohz_full_kick_ipi(void *info)
--
1.8.3.1
* [PATCH 5/5] nohz: Use IPI implicit full barrier against rq->nr_running r/w
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
` (3 preceding siblings ...)
2014-05-13 14:38 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
@ 2014-05-13 14:38 ` Frederic Weisbecker
4 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-13 14:38 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
A full dynticks CPU is allowed to stop its tick when a single task runs.
Meanwhile when a new task gets enqueued, the CPU must be notified so that
it restarts its tick to maintain local fairness and other accounting details.
This notification is performed by way of an IPI. Then when the target
receives the IPI, we expect it to see the new value of rq->nr_running.
Hence the following ordering scenario:

     CPU 0                      CPU 1

     write rq->nr_running       get IPI
     smp_wmb()                  smp_rmb()
     send IPI                   read rq->nr_running

But Paul McKenney says that nowadays IPIs imply a full barrier on
all architectures. So we can safely remove this pair and rely on the
implicit barriers that come along with IPI send/receive. Let's
just comment on this new assumption.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
kernel/sched/core.c | 9 +++++----
kernel/sched/sched.h | 10 ++++++++--
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fb6dfad..a06cac1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -670,10 +670,11 @@ bool sched_can_stop_tick(void)
rq = this_rq();
- /* Make sure rq->nr_running update is visible after the IPI */
- smp_rmb();
-
- /* More than one running task need preemption */
+ /*
+ * More than one running task need preemption.
+ * nr_running update is assumed to be visible
+ * after IPI is sent from wakers.
+ */
if (rq->nr_running > 1)
return false;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6089e00..219bfbd 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1223,8 +1223,14 @@ static inline void inc_nr_running(struct rq *rq)
#ifdef CONFIG_NO_HZ_FULL
if (rq->nr_running == 2) {
if (tick_nohz_full_cpu(rq->cpu)) {
- /* Order rq->nr_running write against the IPI */
- smp_wmb();
+ /*
+ * Tick is needed if more than one task runs on a CPU.
+ * Send the target an IPI to kick it out of nohz mode.
+ *
+ * We assume that IPI implies full memory barrier and the
+ * new value of rq->nr_running is visible on reception
+ * from the target.
+ */
tick_nohz_full_kick_cpu(rq->cpu);
}
}
--
1.8.3.1
* [GIT PULL] nohz: Move nohz kick out of scheduler IPI, v7
@ 2014-06-03 14:40 Frederic Weisbecker
2014-06-03 14:40 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-06-03 14:40 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: LKML, Frederic Weisbecker, Andrew Morton, Kevin Hilman,
Paul E. McKenney, Thomas Gleixner, Viresh Kumar
Hi,
This version fixes the following concerns from Peterz:
* Warn _before_ the work claim in irq_work_queue_on()
* Warn on in_nmi() while remote queueing
* Only disable preemption (and not irqs) on local queueing
Thanks.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz-irq-work-v5
Thanks,
Frederic
---
Frederic Weisbecker (5):
irq_work: Split raised and lazy lists
irq_work: Shorten a bit irq_work_needs_cpu()
irq_work: Implement remote queueing
nohz: Move full nohz kick to its own IPI
nohz: Use IPI implicit full barrier against rq->nr_running r/w
include/linux/irq_work.h | 2 ++
include/linux/tick.h | 9 +++++-
kernel/irq_work.c | 72 ++++++++++++++++++++++++++++--------------------
kernel/sched/core.c | 14 ++++------
kernel/sched/sched.h | 12 ++++++--
kernel/smp.c | 4 +++
kernel/time/tick-sched.c | 10 ++++---
7 files changed, 77 insertions(+), 46 deletions(-)
* [PATCH 4/5] nohz: Move full nohz kick to its own IPI
2014-06-03 14:40 [GIT PULL] nohz: Move nohz kick out of scheduler IPI, v7 Frederic Weisbecker
@ 2014-06-03 14:40 ` Frederic Weisbecker
2014-06-03 15:00 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-06-03 14:40 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar
Cc: LKML, Frederic Weisbecker, Andrew Morton, Kevin Hilman,
Paul E. McKenney, Thomas Gleixner, Viresh Kumar
Now that the irq work subsystem can queue remote callbacks, it's
a perfect fit to safely queue IPIs when interrupts are disabled,
without worrying about concurrent callers.
Let's use it for the full dynticks kick to notify a CPU that it's
exiting single-task mode.
This slightly unbloats the scheduler IPI that the nohz code was abusing
for its cool "callable anywhere/anytime" properties.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/tick.h | 9 ++++++++-
kernel/sched/core.c | 5 +----
kernel/sched/sched.h | 2 +-
kernel/time/tick-sched.c | 10 ++++++----
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..8a4987f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -181,7 +181,13 @@ static inline bool tick_nohz_full_cpu(int cpu)
extern void tick_nohz_init(void);
extern void __tick_nohz_full_check(void);
-extern void tick_nohz_full_kick(void);
+extern void tick_nohz_full_kick_cpu(int cpu);
+
+static inline void tick_nohz_full_kick(void)
+{
+ tick_nohz_full_kick_cpu(smp_processor_id());
+}
+
extern void tick_nohz_full_kick_all(void);
extern void __tick_nohz_task_switch(struct task_struct *tsk);
#else
@@ -189,6 +195,7 @@ static inline void tick_nohz_init(void) { }
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void __tick_nohz_full_check(void) { }
+static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void tick_nohz_full_kick(void) { }
static inline void tick_nohz_full_kick_all(void) { }
static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d9d8ece..fb6dfad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1500,9 +1500,7 @@ void scheduler_ipi(void)
*/
preempt_fold_need_resched();
- if (llist_empty(&this_rq()->wake_list)
- && !tick_nohz_full_cpu(smp_processor_id())
- && !got_nohz_idle_kick())
+ if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
return;
/*
@@ -1519,7 +1517,6 @@ void scheduler_ipi(void)
* somewhat pessimize the simple resched case.
*/
irq_enter();
- tick_nohz_full_check();
sched_ttwu_pending();
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 456e492..6089e00 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,7 +1225,7 @@ static inline void inc_nr_running(struct rq *rq)
if (tick_nohz_full_cpu(rq->cpu)) {
/* Order rq->nr_running write against the IPI */
smp_wmb();
- smp_send_reschedule(rq->cpu);
+ tick_nohz_full_kick_cpu(rq->cpu);
}
}
#endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..3d63944 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -224,13 +224,15 @@ static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
};
/*
- * Kick the current CPU if it's full dynticks in order to force it to
+ * Kick the CPU if it's full dynticks in order to force it to
* re-evaluate its dependency on the tick and restart it if necessary.
*/
-void tick_nohz_full_kick(void)
+void tick_nohz_full_kick_cpu(int cpu)
{
- if (tick_nohz_full_cpu(smp_processor_id()))
- irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+ if (!tick_nohz_full_cpu(cpu))
+ return;
+
+ irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
}
static void nohz_full_kick_ipi(void *info)
--
1.8.3.1
* Re: [PATCH 4/5] nohz: Move full nohz kick to its own IPI
2014-06-03 14:40 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
@ 2014-06-03 15:00 ` Peter Zijlstra
0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2014-06-03 15:00 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Andrew Morton, Kevin Hilman, Paul E. McKenney,
Thomas Gleixner, Viresh Kumar
On Tue, Jun 03, 2014 at 04:40:19PM +0200, Frederic Weisbecker wrote:
> Now that the irq work subsystem can queue remote callbacks, it's
> a perfect fit to safely queue IPIs when interrupts are disabled
> without worrying about concurrent callers.
>
> Let's use it for the full dynticks kick to notify a CPU that it's
> exiting single-task mode.
>
> This slightly unbloats the scheduler IPI, which the nohz code was abusing
> for its cool "callable anywhere/anytime" properties.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Kevin Hilman <khilman@linaro.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
> include/linux/tick.h | 9 ++++++++-
> kernel/sched/core.c | 5 +----
> kernel/sched/sched.h | 2 +-
> kernel/time/tick-sched.c | 10 ++++++----
> 4 files changed, 16 insertions(+), 10 deletions(-)
>
* [PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v6
@ 2014-05-25 14:29 Frederic Weisbecker
2014-05-25 14:29 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-25 14:29 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
This version implements separate lists for tick and IPI works, which
simplifies the IPI raise decision.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz-irq-work-v4
Thanks,
Frederic
---
Frederic Weisbecker (5):
irq_work: Split raised and lazy lists
irq_work: Shorten a bit irq_work_needs_cpu()
irq_work: Implement remote queueing
nohz: Move full nohz kick to its own IPI
nohz: Use IPI implicit full barrier against rq->nr_running r/w
include/linux/irq_work.h | 2 ++
include/linux/tick.h | 9 +++++-
kernel/irq_work.c | 74 +++++++++++++++++++++++++++---------------------
kernel/sched/core.c | 14 ++++-----
kernel/sched/sched.h | 12 ++++++--
kernel/smp.c | 4 +++
kernel/time/tick-sched.c | 10 ++++---
7 files changed, 76 insertions(+), 49 deletions(-)
* [PATCH 4/5] nohz: Move full nohz kick to its own IPI
2014-05-25 14:29 [PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v6 Frederic Weisbecker
@ 2014-05-25 14:29 ` Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-25 14:29 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
Now that the irq work subsystem can queue remote callbacks, it's
a perfect fit to safely queue IPIs when interrupts are disabled
without worrying about concurrent callers.
Let's use it for the full dynticks kick to notify a CPU that it's
exiting single-task mode.
This slightly unbloats the scheduler IPI, which the nohz code was abusing
for its cool "callable anywhere/anytime" properties.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/tick.h | 9 ++++++++-
kernel/sched/core.c | 5 +----
kernel/sched/sched.h | 2 +-
kernel/time/tick-sched.c | 10 ++++++----
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..8a4987f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -181,7 +181,13 @@ static inline bool tick_nohz_full_cpu(int cpu)
extern void tick_nohz_init(void);
extern void __tick_nohz_full_check(void);
-extern void tick_nohz_full_kick(void);
+extern void tick_nohz_full_kick_cpu(int cpu);
+
+static inline void tick_nohz_full_kick(void)
+{
+ tick_nohz_full_kick_cpu(smp_processor_id());
+}
+
extern void tick_nohz_full_kick_all(void);
extern void __tick_nohz_task_switch(struct task_struct *tsk);
#else
@@ -189,6 +195,7 @@ static inline void tick_nohz_init(void) { }
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void __tick_nohz_full_check(void) { }
+static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void tick_nohz_full_kick(void) { }
static inline void tick_nohz_full_kick_all(void) { }
static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d9d8ece..fb6dfad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1500,9 +1500,7 @@ void scheduler_ipi(void)
*/
preempt_fold_need_resched();
- if (llist_empty(&this_rq()->wake_list)
- && !tick_nohz_full_cpu(smp_processor_id())
- && !got_nohz_idle_kick())
+ if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
return;
/*
@@ -1519,7 +1517,6 @@ void scheduler_ipi(void)
* somewhat pessimize the simple resched case.
*/
irq_enter();
- tick_nohz_full_check();
sched_ttwu_pending();
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 456e492..6089e00 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,7 +1225,7 @@ static inline void inc_nr_running(struct rq *rq)
if (tick_nohz_full_cpu(rq->cpu)) {
/* Order rq->nr_running write against the IPI */
smp_wmb();
- smp_send_reschedule(rq->cpu);
+ tick_nohz_full_kick_cpu(rq->cpu);
}
}
#endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..3d63944 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -224,13 +224,15 @@ static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
};
/*
- * Kick the current CPU if it's full dynticks in order to force it to
+ * Kick the CPU if it's full dynticks in order to force it to
* re-evaluate its dependency on the tick and restart it if necessary.
*/
-void tick_nohz_full_kick(void)
+void tick_nohz_full_kick_cpu(int cpu)
{
- if (tick_nohz_full_cpu(smp_processor_id()))
- irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+ if (!tick_nohz_full_cpu(cpu))
+ return;
+
+ irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
}
static void nohz_full_kick_ipi(void *info)
--
1.8.3.1
* [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v3
@ 2014-05-11 23:33 Frederic Weisbecker
2014-05-11 23:33 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-11 23:33 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
Hi,
So this version gives up on smp_queue_function_single() and instead
extends irq work to support remote queueing, as suggested by Peterz.
Comments?
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/nohz-irq-work
Thanks,
Frederic
---
Frederic Weisbecker (5):
irq_work: Architecture support for remote irq work raise
irq_work: Force non-lazy works on IPI
irq_work: Allow remote queueing
nohz: Move full nohz kick to its own IPI
nohz: Use IPI implicit full barrier against rq->nr_running r/w
arch/Kconfig | 12 +++++++
arch/alpha/kernel/time.c | 3 +-
arch/arm/Kconfig | 1 +
arch/arm/kernel/smp.c | 4 +--
arch/powerpc/kernel/time.c | 3 +-
arch/sparc/kernel/pcr.c | 3 +-
arch/x86/Kconfig | 1 +
arch/x86/kernel/irq_work.c | 10 ++----
include/linux/irq_work.h | 3 ++
include/linux/tick.h | 9 ++++-
kernel/irq_work.c | 87 +++++++++++++++++++++++++++++++---------------
kernel/sched/core.c | 14 ++++----
kernel/sched/sched.h | 12 +++++--
kernel/time/Kconfig | 2 ++
kernel/time/tick-sched.c | 10 +++---
kernel/timer.c | 2 +-
16 files changed, 118 insertions(+), 58 deletions(-)
* [PATCH 4/5] nohz: Move full nohz kick to its own IPI
2014-05-11 23:33 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v3 Frederic Weisbecker
@ 2014-05-11 23:33 ` Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2014-05-11 23:33 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Kevin Hilman,
Paul E. McKenney, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
Now that the irq work subsystem can queue remote callbacks, it's
a perfect fit to safely queue IPIs when interrupts are disabled
without worrying about concurrent callers.
Let's use it for the full dynticks kick to notify a CPU that it's
exiting single-task mode.
This slightly unbloats the scheduler IPI, which the nohz code was abusing
for its cool "callable anywhere/anytime" properties.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
include/linux/tick.h | 9 ++++++++-
kernel/sched/core.c | 5 +----
kernel/sched/sched.h | 2 +-
kernel/time/Kconfig | 2 ++
kernel/time/tick-sched.c | 10 ++++++----
5 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..8a4987f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -181,7 +181,13 @@ static inline bool tick_nohz_full_cpu(int cpu)
extern void tick_nohz_init(void);
extern void __tick_nohz_full_check(void);
-extern void tick_nohz_full_kick(void);
+extern void tick_nohz_full_kick_cpu(int cpu);
+
+static inline void tick_nohz_full_kick(void)
+{
+ tick_nohz_full_kick_cpu(smp_processor_id());
+}
+
extern void tick_nohz_full_kick_all(void);
extern void __tick_nohz_task_switch(struct task_struct *tsk);
#else
@@ -189,6 +195,7 @@ static inline void tick_nohz_init(void) { }
static inline bool tick_nohz_full_enabled(void) { return false; }
static inline bool tick_nohz_full_cpu(int cpu) { return false; }
static inline void __tick_nohz_full_check(void) { }
+static inline void tick_nohz_full_kick_cpu(int cpu) { }
static inline void tick_nohz_full_kick(void) { }
static inline void tick_nohz_full_kick_all(void) { }
static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 268a45e..00ac248 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1500,9 +1500,7 @@ void scheduler_ipi(void)
*/
preempt_fold_need_resched();
- if (llist_empty(&this_rq()->wake_list)
- && !tick_nohz_full_cpu(smp_processor_id())
- && !got_nohz_idle_kick())
+ if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
return;
/*
@@ -1519,7 +1517,6 @@ void scheduler_ipi(void)
* somewhat pessimize the simple resched case.
*/
irq_enter();
- tick_nohz_full_check();
sched_ttwu_pending();
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 456e492..6089e00 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,7 +1225,7 @@ static inline void inc_nr_running(struct rq *rq)
if (tick_nohz_full_cpu(rq->cpu)) {
/* Order rq->nr_running write against the IPI */
smp_wmb();
- smp_send_reschedule(rq->cpu);
+ tick_nohz_full_kick_cpu(rq->cpu);
}
}
#endif
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index f448513..27f1f63 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -101,6 +101,8 @@ config NO_HZ_FULL
depends on HAVE_CONTEXT_TRACKING
# VIRT_CPU_ACCOUNTING_GEN dependency
depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
+ # tickless irq work
+ depends on HAVE_IRQ_WORK_IPI
select NO_HZ_COMMON
select RCU_USER_QS
select RCU_NOCB_CPU
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..3d63944 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -224,13 +224,15 @@ static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
};
/*
- * Kick the current CPU if it's full dynticks in order to force it to
+ * Kick the CPU if it's full dynticks in order to force it to
* re-evaluate its dependency on the tick and restart it if necessary.
*/
-void tick_nohz_full_kick(void)
+void tick_nohz_full_kick_cpu(int cpu)
{
- if (tick_nohz_full_cpu(smp_processor_id()))
- irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+ if (!tick_nohz_full_cpu(cpu))
+ return;
+
+ irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
}
static void nohz_full_kick_ipi(void *info)
--
1.8.3.1
end of thread, other threads:[~2014-06-03 15:01 UTC | newest]
Thread overview: 14+ messages
2014-05-13 14:38 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v4 Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 1/5] irq_work: Let arch tell us if it can raise irq work Frederic Weisbecker
2014-05-13 17:09 ` Peter Zijlstra
2014-05-13 19:33 ` Frederic Weisbecker
2014-05-13 20:48 ` Peter Zijlstra
2014-05-13 21:15 ` Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 2/5] irq_work: Force non-lazy works to the IPI Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 3/5] irq_work: Allow remote queueing Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
2014-05-13 14:38 ` [PATCH 5/5] nohz: Use IPI implicit full barrier against rq->nr_running r/w Frederic Weisbecker
-- strict thread matches above, loose matches on Subject: below --
2014-06-03 14:40 [GIT PULL] nohz: Move nohz kick out of scheduler IPI, v7 Frederic Weisbecker
2014-06-03 14:40 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
2014-06-03 15:00 ` Peter Zijlstra
2014-05-25 14:29 [PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v6 Frederic Weisbecker
2014-05-25 14:29 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
2014-05-11 23:33 [RFC PATCH 0/5] nohz: Move nohz kick out of scheduler IPI, v3 Frederic Weisbecker
2014-05-11 23:33 ` [PATCH 4/5] nohz: Move full nohz kick to its own IPI Frederic Weisbecker