* [PATCH 0/4] nohz: A few improvements v2
@ 2015-04-21 12:23 Frederic Weisbecker
2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
` (4 more replies)
0 siblings, 5 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith,
Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner,
Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar,
Rik van Riel, Martin Schwidefsky
2nd version of the patchset "context_tracking: A few improvements" with
the nohz isolcpus patches from Chris.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
nohz/core
HEAD: b885e51f754a09bce2709c789a913735ee62bfd2
Thanks,
Frederic
---
Chris Metcalf (2):
nohz: Add tick_nohz_full_add_cpus_to() API
nohz: Set isolcpus when nohz_full is set
Frederic Weisbecker (2):
context_tracking: Protect against recursion
context_tracking: Inherit TIF_NOHZ through forks instead of context switches
include/linux/context_tracking.h | 10 -----
include/linux/context_tracking_state.h | 1 +
include/linux/tick.h | 7 ++++
kernel/context_tracking.c | 75 ++++++++++++++++++++++++----------
kernel/sched/core.c | 4 +-
5 files changed, 64 insertions(+), 33 deletions(-)
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH 1/4] context_tracking: Protect against recursion 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker @ 2015-04-21 12:23 ` Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker ` (3 subsequent siblings) 4 siblings, 0 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw) To: LKML Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky Context tracking recursion can happen when an exception triggers in the middle of a call to a context tracking probe. This special case can be caused by vmalloc faults. If an access to a memory area allocated by vmalloc happens in the middle of context_tracking_enter(), we may run into an endless fault loop because the exception in turn calls context_tracking_enter() which faults on the same vmalloc'ed memory, triggering an exception again, etc... Some rare crashes have been reported so lets protect against this with a recursion counter. Reported-by: Dave Jones <davej@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> --- include/linux/context_tracking_state.h | 1 + kernel/context_tracking.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 31 insertions(+) diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h index 6b7b96a..678ecdf 100644 --- a/include/linux/context_tracking_state.h +++ b/include/linux/context_tracking_state.h @@ -12,6 +12,7 @@ struct context_tracking { * may be further optimized using static keys. */ bool active; + int recursion; enum ctx_state { CONTEXT_KERNEL = 0, CONTEXT_USER, diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index 72d59a1..b9e0b4f 100644 --- a/kernel/context_tracking.c +++ b/kernel/context_tracking.c @@ -38,6 +38,25 @@ void context_tracking_cpu_set(int cpu) } } +static bool context_tracking_recursion_enter(void) +{ + int recursion; + + recursion = __this_cpu_inc_return(context_tracking.recursion); + if (recursion == 1) + return true; + + WARN_ONCE((recursion < 1), "Invalid context tracking recursion value %d\n", recursion); + __this_cpu_dec(context_tracking.recursion); + + return false; +} + +static void context_tracking_recursion_exit(void) +{ + __this_cpu_dec(context_tracking.recursion); +} + /** * context_tracking_enter - Inform the context tracking that the CPU is going * enter user or guest space mode. @@ -75,6 +94,11 @@ void context_tracking_enter(enum ctx_state state) WARN_ON_ONCE(!current->mm); local_irq_save(flags); + if (!context_tracking_recursion_enter()) { + local_irq_restore(flags); + return; + } + if ( __this_cpu_read(context_tracking.state) != state) { if (__this_cpu_read(context_tracking.active)) { /* @@ -105,6 +129,7 @@ void context_tracking_enter(enum ctx_state state) */ __this_cpu_write(context_tracking.state, state); } + context_tracking_recursion_exit(); local_irq_restore(flags); } NOKPROBE_SYMBOL(context_tracking_enter); @@ -139,6 +164,10 @@ void context_tracking_exit(enum ctx_state state) return; local_irq_save(flags); + if (!context_tracking_recursion_enter()) { + local_irq_restore(flags); + return; + } if (__this_cpu_read(context_tracking.state) == state) { if (__this_cpu_read(context_tracking.active)) { /* @@ -153,6 +182,7 @@ void context_tracking_exit(enum ctx_state state) } __this_cpu_write(context_tracking.state, CONTEXT_KERNEL); } + context_tracking_recursion_exit(); local_irq_restore(flags); } NOKPROBE_SYMBOL(context_tracking_exit); -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker @ 2015-04-21 12:23 ` Frederic Weisbecker 2015-04-21 15:26 ` Peter Zijlstra 2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker ` (2 subsequent siblings) 4 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw) To: LKML Cc: Frederic Weisbecker, Peter Zijlstra, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky TIF_NOHZ is used by context_tracking to force syscall slow-path on every task in order to track userspace roundtrips. As such, it must be set on all running tasks. It's currently explicitly inherited through context switches. There is no need to do it on this fast-path though. The flag could be simply set once for all on all tasks, whether they are running or not. Lets do this by setting the flag to init task on early boot and let it propagate through fork inheritance. While at it mark context_tracking_cpu_set() as init code, we only need it at early boot time. Suggested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> --- include/linux/context_tracking.h | 10 -------- kernel/context_tracking.c | 51 ++++++++++++++++++++-------------------- kernel/sched/core.c | 1 - 3 files changed, 26 insertions(+), 36 deletions(-) diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h index 2821838..b96bd29 100644 --- a/include/linux/context_tracking.h +++ b/include/linux/context_tracking.h @@ -14,8 +14,6 @@ extern void context_tracking_enter(enum ctx_state state); extern void context_tracking_exit(enum ctx_state state); extern void context_tracking_user_enter(void); extern void context_tracking_user_exit(void); -extern void __context_tracking_task_switch(struct task_struct *prev, - struct task_struct *next); static inline void user_enter(void) { @@ -51,19 +49,11 @@ static inline void exception_exit(enum ctx_state prev_ctx) } } -static inline void context_tracking_task_switch(struct task_struct *prev, - struct task_struct *next) -{ - if (context_tracking_is_enabled()) - __context_tracking_task_switch(prev, next); -} #else static inline void user_enter(void) { } static inline void user_exit(void) { } static inline enum ctx_state exception_enter(void) { return 0; } static inline void exception_exit(enum ctx_state prev_ctx) { } -static inline void context_tracking_task_switch(struct task_struct *prev, - struct task_struct *next) { } #endif /* !CONFIG_CONTEXT_TRACKING */ diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index b9e0b4f..70edc57 100644 --- a/kernel/context_tracking.c +++ b/kernel/context_tracking.c @@ -30,14 +30,6 @@ EXPORT_SYMBOL_GPL(context_tracking_enabled); DEFINE_PER_CPU(struct context_tracking, context_tracking); EXPORT_SYMBOL_GPL(context_tracking); -void context_tracking_cpu_set(int cpu) -{ - if (!per_cpu(context_tracking.active, cpu)) { - per_cpu(context_tracking.active, cpu) = true; - static_key_slow_inc(&context_tracking_enabled); - } -} - static bool context_tracking_recursion_enter(void) { int recursion; @@ -194,24 +186,33 @@ void context_tracking_user_exit(void) } NOKPROBE_SYMBOL(context_tracking_user_exit); -/** - * __context_tracking_task_switch - context switch the syscall callbacks - * @prev: the task that is being switched out - * @next: the task that is being switched in - * - * The context tracking uses the syscall slow path to implement its user-kernel - * boundaries probes on syscalls. This way it doesn't impact the syscall fast - * path on CPUs that don't do context tracking. - * - * But we need to clear the flag on the previous task because it may later - * migrate to some CPU that doesn't do the context tracking. As such the TIF - * flag may not be desired there. - */ -void __context_tracking_task_switch(struct task_struct *prev, - struct task_struct *next) +void __init context_tracking_cpu_set(int cpu) { - clear_tsk_thread_flag(prev, TIF_NOHZ); - set_tsk_thread_flag(next, TIF_NOHZ); + static __initdata bool initialized = false; + struct task_struct *p, *t; + + if (!per_cpu(context_tracking.active, cpu)) { + per_cpu(context_tracking.active, cpu) = true; + static_key_slow_inc(&context_tracking_enabled); + } + + if (initialized) + return; + + set_tsk_thread_flag(&init_task, TIF_NOHZ); + + /* + * There shouldn't be any thread at this early boot stage + * but the scheduler is ready to host any. So lets walk + * the tasklist just in case. tasklist_lock isn't necessary + * either that early but take it for correctness checkers. + */ + read_lock(&tasklist_lock); + for_each_process_thread(p, t) + set_tsk_thread_flag(t, TIF_NOHZ); + read_unlock(&tasklist_lock); + + initialized = true; } #ifdef CONFIG_CONTEXT_TRACKING_FORCE diff --git a/kernel/sched/core.c b/kernel/sched/core.c index f9123a8..8414a6a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2347,7 +2347,6 @@ context_switch(struct rq *rq, struct task_struct *prev, */ spin_release(&rq->lock.dep_map, 1, _THIS_IP_); - context_tracking_task_switch(prev, next); /* Here we just switch the register state and the stack. */ switch_to(prev, next, prev); barrier(); -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker @ 2015-04-21 15:26 ` Peter Zijlstra 2015-04-21 16:51 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Peter Zijlstra @ 2015-04-21 15:26 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote: > +void __init context_tracking_cpu_set(int cpu) > { > + static __initdata bool initialized = false; > + struct task_struct *p, *t; > + > + if (!per_cpu(context_tracking.active, cpu)) { > + per_cpu(context_tracking.active, cpu) = true; > + static_key_slow_inc(&context_tracking_enabled); > + } > + > + if (initialized) > + return; > + > + set_tsk_thread_flag(&init_task, TIF_NOHZ); > + > + /* > + * There shouldn't be any thread at this early boot stage > + * but the scheduler is ready to host any. So lets walk > + * the tasklist just in case. tasklist_lock isn't necessary > + * either that early but take it for correctness checkers. > + */ > + read_lock(&tasklist_lock); > + for_each_process_thread(p, t) > + set_tsk_thread_flag(t, TIF_NOHZ); If there should not be any task, should there not be a WARN_ON_ONCE() here? > + read_unlock(&tasklist_lock); > + > + initialized = true; > } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 15:26 ` Peter Zijlstra @ 2015-04-21 16:51 ` Frederic Weisbecker 2015-04-21 20:52 ` Peter Zijlstra 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 16:51 UTC (permalink / raw) To: Peter Zijlstra Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote: > On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote: > > +void __init context_tracking_cpu_set(int cpu) > > { > > + static __initdata bool initialized = false; > > + struct task_struct *p, *t; > > + > > + if (!per_cpu(context_tracking.active, cpu)) { > > + per_cpu(context_tracking.active, cpu) = true; > > + static_key_slow_inc(&context_tracking_enabled); > > + } > > + > > + if (initialized) > > + return; > > + > > + set_tsk_thread_flag(&init_task, TIF_NOHZ); > > + > > + /* > > + * There shouldn't be any thread at this early boot stage > > + * but the scheduler is ready to host any. So lets walk > > + * the tasklist just in case. tasklist_lock isn't necessary > > + * either that early but take it for correctness checkers. > > + */ > > + read_lock(&tasklist_lock); > > + for_each_process_thread(p, t) > > + set_tsk_thread_flag(t, TIF_NOHZ); > > If there should not be any task, should there not be a WARN_ON_ONCE() > here? Well, it's legal to have a task at that time because sched_init() was called. I just haven't observed any task other than init/0. But future code (or alternate configs than mine) might create a task between sched_init() and tick_init(). And the above code takes care of such a possibility. > > > + read_unlock(&tasklist_lock); > > + > > + initialized = true; > > } ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 16:51 ` Frederic Weisbecker @ 2015-04-21 20:52 ` Peter Zijlstra 2015-04-21 21:06 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Peter Zijlstra @ 2015-04-21 20:52 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 06:51:54PM +0200, Frederic Weisbecker wrote: > On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote: > > On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote: > > > +void __init context_tracking_cpu_set(int cpu) > > > { > > > + static __initdata bool initialized = false; > > > + struct task_struct *p, *t; > > > + > > > + if (!per_cpu(context_tracking.active, cpu)) { > > > + per_cpu(context_tracking.active, cpu) = true; > > > + static_key_slow_inc(&context_tracking_enabled); > > > + } > > > + > > > + if (initialized) > > > + return; > > > + > > > + set_tsk_thread_flag(&init_task, TIF_NOHZ); > > > + > > > + /* > > > + * There shouldn't be any thread at this early boot stage > > > + * but the scheduler is ready to host any. So lets walk > > > + * the tasklist just in case. tasklist_lock isn't necessary > > > + * either that early but take it for correctness checkers. > > > + */ > > > + read_lock(&tasklist_lock); > > > + for_each_process_thread(p, t) > > > + set_tsk_thread_flag(t, TIF_NOHZ); > > > > If there should not be any task, should there not be a WARN_ON_ONCE() > > here? > > Well, it's legal to have a task at that time because sched_init() was called. > I just haven't observed any task other than init/0. But future code (or alternate > configs than mine) might create a task between sched_init() and tick_init(). And > the above code takes care of such a possibility. So why not do it sooner? ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 20:52 ` Peter Zijlstra @ 2015-04-21 21:06 ` Frederic Weisbecker 2015-04-22 15:35 ` Peter Zijlstra 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 21:06 UTC (permalink / raw) To: Peter Zijlstra Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 10:52:18PM +0200, Peter Zijlstra wrote: > On Tue, Apr 21, 2015 at 06:51:54PM +0200, Frederic Weisbecker wrote: > > On Tue, Apr 21, 2015 at 05:26:00PM +0200, Peter Zijlstra wrote: > > > On Tue, Apr 21, 2015 at 02:23:07PM +0200, Frederic Weisbecker wrote: > > > > +void __init context_tracking_cpu_set(int cpu) > > > > { > > > > + static __initdata bool initialized = false; > > > > + struct task_struct *p, *t; > > > > + > > > > + if (!per_cpu(context_tracking.active, cpu)) { > > > > + per_cpu(context_tracking.active, cpu) = true; > > > > + static_key_slow_inc(&context_tracking_enabled); > > > > + } > > > > + > > > > + if (initialized) > > > > + return; > > > > + > > > > + set_tsk_thread_flag(&init_task, TIF_NOHZ); > > > > + > > > > + /* > > > > + * There shouldn't be any thread at this early boot stage > > > > + * but the scheduler is ready to host any. So lets walk > > > > + * the tasklist just in case. tasklist_lock isn't necessary > > > > + * either that early but take it for correctness checkers. > > > > + */ > > > > + read_lock(&tasklist_lock); > > > > + for_each_process_thread(p, t) > > > > + set_tsk_thread_flag(t, TIF_NOHZ); > > > > > > If there should not be any task, should there not be a WARN_ON_ONCE() > > > here? > > > > Well, it's legal to have a task at that time because sched_init() was called. > > I just haven't observed any task other than init/0. But future code (or alternate > > configs than mine) might create a task between sched_init() and tick_init(). And > > the above code takes care of such a possibility. > > So why not do it sooner? Because I need the tick nohz intialization to be after irq initialization (init_IRQ()), which it depends on. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-21 21:06 ` Frederic Weisbecker @ 2015-04-22 15:35 ` Peter Zijlstra 2015-04-22 15:51 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Peter Zijlstra @ 2015-04-22 15:35 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 11:06:04PM +0200, Frederic Weisbecker wrote: > > Because I need the tick nohz intialization to be after irq initialization (init_IRQ()), > which it depends on. OK, that's all _way_ before rest_init() which does the very first fork(). I think we can safely put a WARN in there. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-04-22 15:35 ` Peter Zijlstra @ 2015-04-22 15:51 ` Frederic Weisbecker 0 siblings, 0 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-22 15:51 UTC (permalink / raw) To: Peter Zijlstra Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Wed, Apr 22, 2015 at 05:35:53PM +0200, Peter Zijlstra wrote: > On Tue, Apr 21, 2015 at 11:06:04PM +0200, Frederic Weisbecker wrote: > > > > Because I need the tick nohz intialization to be after irq initialization (init_IRQ()), > > which it depends on. > > OK, that's all _way_ before rest_init() which does the very first > fork(). > > I think we can safely put a WARN in there. Ah good, I'm doing that! thanks! ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker @ 2015-04-21 12:23 ` Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker 2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra 4 siblings, 0 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw) To: LKML Cc: Chris Metcalf, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Frederic Weisbecker, Rik van Riel, Martin Schwidefsky From: Chris Metcalf <cmetcalf@ezchip.com> This API is useful to modify a cpumask indicating some special nohz-type functionality so that the nohz cores are automatically added to that set. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Link: http://lkml.kernel.org/r/1429024675-18938-1-git-send-email-cmetcalf@ezchip.com Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> --- include/linux/tick.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/include/linux/tick.h b/include/linux/tick.h index f8492da5..4191b56 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -134,6 +134,12 @@ static inline bool tick_nohz_full_cpu(int cpu) return cpumask_test_cpu(cpu, tick_nohz_full_mask); } +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) +{ + if (tick_nohz_full_enabled()) + cpumask_or(mask, mask, tick_nohz_full_mask); +} + extern void __tick_nohz_full_check(void); extern void tick_nohz_full_kick(void); extern void tick_nohz_full_kick_cpu(int cpu); @@ -142,6 +148,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk); #else static inline bool tick_nohz_full_enabled(void) { return false; } static inline bool tick_nohz_full_cpu(int cpu) { return false; } +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { } static inline void __tick_nohz_full_check(void) { } static inline void tick_nohz_full_kick_cpu(int cpu) { } static inline void tick_nohz_full_kick(void) { } -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker ` (2 preceding siblings ...) 2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker @ 2015-04-21 12:23 ` Frederic Weisbecker 2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra 4 siblings, 0 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-21 12:23 UTC (permalink / raw) To: LKML Cc: Chris Metcalf, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Frederic Weisbecker, Rik van Riel, Martin Schwidefsky From: Chris Metcalf <cmetcalf@ezchip.com> nohz_full is only useful with isolcpus also set, since otherwise the scheduler has to run periodically to try to determine whether to steal work from other cores. Accordingly, when booting with nohz_full=xxx on the command line, we should act as if isolcpus=xxx was also set, and set (or extend) the isolcpus set to include the nohz_full cpus. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8414a6a..ee4f735 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7051,6 +7051,9 @@ void __init sched_init_smp(void) alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL); alloc_cpumask_var(&fallback_doms, GFP_KERNEL); + /* nohz_full won't take effect without isolating the cpus. */ + tick_nohz_full_add_cpus_to(cpu_isolated_map); + sched_init_numa(); /* -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 0/4] nohz: A few improvements v2 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker ` (3 preceding siblings ...) 2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker @ 2015-04-21 15:26 ` Peter Zijlstra 4 siblings, 0 replies; 38+ messages in thread From: Peter Zijlstra @ 2015-04-21 15:26 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Rafael J . Wysocki, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Tue, Apr 21, 2015 at 02:23:05PM +0200, Frederic Weisbecker wrote: > 2nd version of the patchset "context_tracking: A few improvements" with > the nohz isolcpus patches from Chris. > Other than the one comment; Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 0/4] nohz: A few improvements v3
@ 2015-04-24 15:58 Frederic Weisbecker
2015-04-24 15:58 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-04-24 15:58 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Rafael J . Wysocki, Peter Zijlstra,
Mike Galbraith, Chris Metcalf, Ingo Molnar, Dave Jones,
Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
Rik van Riel, Martin Schwidefsky
Since v2, this set only changes
"context_tracking: Inherit TIF_NOHZ through forks instead of context switches"
which now warns when tasklist is populated earlier than expected as per
Peterz suggestion.
Simpler.
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
nohz/core
HEAD: 3c2b3b78a1cca2865a932d0265690997db9706b1
Thanks,
Frederic
---
Chris Metcalf (2):
nohz: Add tick_nohz_full_add_cpus_to() API
nohz: Set isolcpus when nohz_full is set
Frederic Weisbecker (2):
context_tracking: Protect against recursion
context_tracking: Inherit TIF_NOHZ through forks instead of context switches
include/linux/context_tracking.h | 10 -----
include/linux/context_tracking_state.h | 1 +
include/linux/sched.h | 3 ++
include/linux/tick.h | 7 ++++
kernel/context_tracking.c | 69 +++++++++++++++++++++++-----------
kernel/sched/core.c | 4 +-
6 files changed, 61 insertions(+), 33 deletions(-)
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-04-24 15:58 [PATCH 0/4] nohz: A few improvements v3 Frederic Weisbecker @ 2015-04-24 15:58 ` Frederic Weisbecker 2015-04-24 20:07 ` Gene Heskett 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-24 15:58 UTC (permalink / raw) To: LKML Cc: Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Frederic Weisbecker, Rik van Riel, Martin Schwidefsky From: Chris Metcalf <cmetcalf@ezchip.com> nohz_full is only useful with isolcpus also set, since otherwise the scheduler has to run periodically to try to determine whether to steal work from other cores. Accordingly, when booting with nohz_full=xxx on the command line, we should act as if isolcpus=xxx was also set, and set (or extend) the isolcpus set to include the nohz_full cpus. Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 6f149f8..e95b4d8 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7042,6 +7042,9 @@ void __init sched_init_smp(void) alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL); alloc_cpumask_var(&fallback_doms, GFP_KERNEL); + /* nohz_full won't take effect without isolating the cpus. */ + tick_nohz_full_add_cpus_to(cpu_isolated_map); + sched_init_numa(); /* -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-04-24 15:58 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker @ 2015-04-24 20:07 ` Gene Heskett 2015-04-25 23:13 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Gene Heskett @ 2015-04-24 20:07 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote: > From: Chris Metcalf <cmetcalf@ezchip.com> > > nohz_full is only useful with isolcpus also set, since otherwise the > scheduler has to run periodically to try to determine whether to steal > work from other cores. > > Accordingly, when booting with nohz_full=xxx on the command line, we > should act as if isolcpus=xxx was also set, and set (or extend) the > isolcpus set to include the nohz_full cpus. > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] > Acked-by: Rik van Riel <riel@redhat.com> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > Cc: Mike Galbraith <umgwanakikbuti@gmail.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Rik van Riel <riel@redhat.com> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> As a user of LinuxCNC, we expect the core(s) so isolated by the isolcpus argument at bootup time to remain undisturbed in order to preserve the I/O updates heartbeat latency at the absolute minimum that board and cpu combo can accomplish. If this patch changes that behaviour such that the isolated core is grabbed for another job while the RTAI bits and pieces are loaded and running machinery, causing the machinery to lose this steady, possibly as little as a 20 microsecond period repeating operation heartbeat, this will quite effectively destroy our ability to run this software on linux without farming that whole operation out to intelligent I/O cards. Please keep this in mind. I don't read the patch well enough to determine this myself. > --- > kernel/sched/core.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 6f149f8..e95b4d8 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -7042,6 +7042,9 @@ void __init sched_init_smp(void) > alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL); > alloc_cpumask_var(&fallback_doms, GFP_KERNEL); > > + /* nohz_full won't take effect without isolating the cpus. */ > + tick_nohz_full_add_cpus_to(cpu_isolated_map); > + > sched_init_numa(); > > /* Cheers, Gene Heskett -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Genes Web page <http://geneslinuxbox.net:6309/gene> ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-04-24 20:07 ` Gene Heskett @ 2015-04-25 23:13 ` Frederic Weisbecker 2015-04-25 23:50 ` Gene Heskett 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-04-25 23:13 UTC (permalink / raw) To: Gene Heskett Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Fri, Apr 24, 2015 at 04:07:52PM -0400, Gene Heskett wrote: > On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote: > > From: Chris Metcalf <cmetcalf@ezchip.com> > > > > nohz_full is only useful with isolcpus also set, since otherwise the > > scheduler has to run periodically to try to determine whether to steal > > work from other cores. > > > > Accordingly, when booting with nohz_full=xxx on the command line, we > > should act as if isolcpus=xxx was also set, and set (or extend) the > > isolcpus set to include the nohz_full cpus. > > > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] > > Acked-by: Rik van Riel <riel@redhat.com> > > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> > > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > > Cc: Mike Galbraith <umgwanakikbuti@gmail.com> > > Cc: Ingo Molnar <mingo@redhat.com> > > Cc: Rik van Riel <riel@redhat.com> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> > > As a user of LinuxCNC, we expect the core(s) so isolated by the isolcpus > argument at bootup time to remain undisturbed in order to preserve the > I/O updates heartbeat latency at the absolute minimum that board and cpu > combo can accomplish. > > If this patch changes that behaviour such that the isolated core is > grabbed for another job while the RTAI bits and pieces are loaded and > running machinery, causing the machinery to lose this steady, possibly > as little as a 20 microsecond period repeating operation heartbeat, this > will quite effectively destroy our ability to run this software on linux > without farming that whole operation out to intelligent I/O cards. > > Please keep this in mind. I don't read the patch well enough to > determine this myself. I think it's not a problem. This patch isn't adding any work or noise to isolcpus, it's actually setting CPUs that run tickless in userspace to be part of the isolcpus set, because we need userspace-tickless CPUs to not be disturbed at all. You shouldn't be concerned if you don't use full dynticks. If you happen to use full dynticks in your usecase one day, you'll use it on your isolcpus anyway. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-04-25 23:13 ` Frederic Weisbecker @ 2015-04-25 23:50 ` Gene Heskett 0 siblings, 0 replies; 38+ messages in thread From: Gene Heskett @ 2015-04-25 23:50 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Ingo Molnar, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Saturday 25 April 2015 19:13:10 Frederic Weisbecker wrote: > On Fri, Apr 24, 2015 at 04:07:52PM -0400, Gene Heskett wrote: > > On Friday 24 April 2015 11:58:31 Frederic Weisbecker wrote: > > > From: Chris Metcalf <cmetcalf@ezchip.com> > > > > > > nohz_full is only useful with isolcpus also set, since otherwise > > > the scheduler has to run periodically to try to determine whether > > > to steal work from other cores. > > > > > > Accordingly, when booting with nohz_full=xxx on the command line, > > > we should act as if isolcpus=xxx was also set, and set (or extend) > > > the isolcpus set to include the nohz_full cpus. > > > > > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] > > > Acked-by: Rik van Riel <riel@redhat.com> > > > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> > > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> > > > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > > > Cc: Mike Galbraith <umgwanakikbuti@gmail.com> > > > Cc: Ingo Molnar <mingo@redhat.com> > > > Cc: Rik van Riel <riel@redhat.com> > > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> > > > > As a user of LinuxCNC, we expect the core(s) so isolated by the > > isolcpus argument at bootup time to remain undisturbed in order to > > preserve the I/O updates heartbeat latency at the absolute minimum > > that board and cpu combo can accomplish. > > > > If this patch changes that behaviour such that the isolated core is > > grabbed for another job while the RTAI bits and pieces are loaded > > and running machinery, causing the machinery to lose this steady, > > possibly as little as a 20 microsecond period repeating operation > > heartbeat, this will quite effectively destroy our ability to run > > this software on linux without farming that whole operation out to > > intelligent I/O cards. > > > > Please keep this in mind. I don't read the patch well enough to > > determine this myself. > > I think it's not a problem. This patch isn't adding any work or noise > to isolcpus, it's actually setting CPUs that run tickless in userspace > to be part of the isolcpus set, because we need userspace-tickless > CPUs to not be disturbed at all. You shouldn't be concerned if you > don't use full dynticks. > > If you happen to use full dynticks in your usecase one day, you'll use > it on your isolcpus anyway. TBT I am probably too paranoid for my own good, but I did want to be heard from the far corner of the cheap seats in the bleachers. I write my stuff in bash, but there are, I expect, some guys in our group who could given enough time, run down anything that breaks it for us and patch it back out. We are the guys who generally run an RTAI patched kernel anyway. Some of the machinery driven by LinuxCNC is pretty good sized stuff, like a 5 axis milling machine in Cincinnati that may be the biggest one Cincinnati ever made, bed is 26 feet long, nearly 6 feet wide. The whole thing is a bit over 200k lbs. Bad controller, controller company gone, sold at scrap cheap. Adapted it to be run by LinuxCNC. After commissioning it, he turned his shop machinists loose to use it. First job was a locomotive axle bearing seat in a multi-ton truck casting, needed .001" tolerance for a shrink fitted bearing. Measured when done, it was off .0002" worst case. It has been detected by the seismographs at the UNI across town. :) My biggest machine is less than 200 lbs. Toy's IOW. Thank you for taking the time to reply, Frederick. Its appreciated. Cheers, Gene Heskett -- "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Genes Web page <http://geneslinuxbox.net:6309/gene> ^ permalink raw reply [flat|nested] 38+ messages in thread
* [GIT PULL] nohz: A few improvements v4
@ 2015-05-06 16:04 Frederic Weisbecker
2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Peter Zijlstra,
Mike Galbraith, Chris Metcalf, Dave Jones, Thomas Gleixner,
Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
Martin Schwidefsky
Ingo,
Please pull the nohz/core branch (on top of tip:timers/core) that can be
found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
nohz/core
HEAD: 60d0b4dc36259029e4cc8d030d8f59b33a119814
---
Summary of changes:
* Fix rare crashes due to context tracking recursion when faulting on
vmalloc'ed areas.
* Simplify the TIF_NOHZ propagation (less context switch overhead)
* Set nohz full CPUs as isolcpus. This way we enforce nohz CPUs to be
undisturbed by globally affine tasks.
Thanks,
Frederic
---
Chris Metcalf (2):
nohz: Add tick_nohz_full_add_cpus_to() API
nohz: Set isolcpus when nohz_full is set
Frederic Weisbecker (2):
context_tracking: Protect against recursion
context_tracking: Inherit TIF_NOHZ through forks instead of context switches
include/linux/context_tracking.h | 10 -----
include/linux/context_tracking_state.h | 1 +
include/linux/sched.h | 3 ++
include/linux/tick.h | 7 ++++
kernel/context_tracking.c | 68 +++++++++++++++++++++++-----------
kernel/sched/core.c | 4 +-
6 files changed, 60 insertions(+), 33 deletions(-)
^ permalink raw reply [flat|nested] 38+ messages in thread* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker @ 2015-05-06 16:04 ` Frederic Weisbecker 2015-05-16 19:39 ` Sasha Levin 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw) To: Ingo Molnar Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Frederic Weisbecker, Rik van Riel, Martin Schwidefsky From: Chris Metcalf <cmetcalf@ezchip.com> nohz_full is only useful with isolcpus also set, since otherwise the scheduler has to run periodically to try to determine whether to steal work from other cores. Accordingly, when booting with nohz_full=xxx on the command line, we should act as if isolcpus=xxx was also set, and set (or extend) the isolcpus set to include the nohz_full cpus. Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 6f149f8..e95b4d8 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7042,6 +7042,9 @@ void __init sched_init_smp(void) alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL); alloc_cpumask_var(&fallback_doms, GFP_KERNEL); + /* nohz_full won't take effect without isolating the cpus. */ + tick_nohz_full_add_cpus_to(cpu_isolated_map); + sched_init_numa(); /* -- 2.1.4 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker @ 2015-05-16 19:39 ` Sasha Levin 2015-05-17 5:30 ` Mike Galbraith 0 siblings, 1 reply; 38+ messages in thread From: Sasha Levin @ 2015-05-16 19:39 UTC (permalink / raw) To: Frederic Weisbecker, Ingo Molnar Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On 05/06/2015 12:04 PM, Frederic Weisbecker wrote: > From: Chris Metcalf <cmetcalf@ezchip.com> > > nohz_full is only useful with isolcpus also set, since otherwise the > scheduler has to run periodically to try to determine whether to steal > work from other cores. > > Accordingly, when booting with nohz_full=xxx on the command line, we > should act as if isolcpus=xxx was also set, and set (or extend) the > isolcpus set to include the nohz_full cpus. > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] > Acked-by: Rik van Riel <riel@redhat.com> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > Cc: Mike Galbraith <umgwanakikbuti@gmail.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Rik van Riel <riel@redhat.com> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Hi folks, I've noticed a regression in my testing a few days ago and bisected it down to this patch. I was seeing frequent soft lockups/RCU lockups and the load of the testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them with nohz_full=1-27. This patch sort of explains the behaviour I was seeing now: most of the cores are no longer being used by the scheduler, and the remaining cores can't deal with the load imposed on them which results in "lockups" which are really just the CPUs being unable to keep up. I always thought that nohz_full without isolcpus meant that the cores would be available to the scheduler, but it won't interfere if there is one task running on them. It seems that this patch changed that behaviour. Did I misunderstand that? Thanks, Sasha ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-16 19:39 ` Sasha Levin @ 2015-05-17 5:30 ` Mike Galbraith 2015-05-18 2:17 ` Rik van Riel 2015-05-20 20:38 ` Afzal Mohammed 0 siblings, 2 replies; 38+ messages in thread From: Mike Galbraith @ 2015-05-17 5:30 UTC (permalink / raw) To: Sasha Levin Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Sat, 2015-05-16 at 15:39 -0400, Sasha Levin wrote: > On 05/06/2015 12:04 PM, Frederic Weisbecker wrote: > > From: Chris Metcalf <cmetcalf@ezchip.com> > > > > nohz_full is only useful with isolcpus also set, since otherwise the > > scheduler has to run periodically to try to determine whether to steal > > work from other cores. > > > > Accordingly, when booting with nohz_full=xxx on the command line, we > > should act as if isolcpus=xxx was also set, and set (or extend) the > > isolcpus set to include the nohz_full cpus. > > > > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"] > > Acked-by: Rik van Riel <riel@redhat.com> > > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> > > Cc: Peter Zijlstra (Intel) <peterz@infradead.org> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > > Cc: Mike Galbraith <umgwanakikbuti@gmail.com> > > Cc: Ingo Molnar <mingo@redhat.com> > > Cc: Rik van Riel <riel@redhat.com> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> > > Hi folks, > > I've noticed a regression in my testing a few days ago and bisected it down to > this patch. I was seeing frequent soft lockups/RCU lockups and the load of the > testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them > with nohz_full=1-27. > > This patch sort of explains the behaviour I was seeing now: most of the cores > are no longer being used by the scheduler, and the remaining cores can't deal > with the load imposed on them which results in "lockups" which are really just > the CPUs being unable to keep up. > > I always thought that nohz_full without isolcpus meant that the cores would > be available to the scheduler, but it won't interfere if there is one task > running on them. It seems that this patch changed that behaviour. > > Did I misunderstand that? Yeah, tying nohz_full set to isolcpus set up an initial condition that you have to tear down with cpusets if you want those cpus returned to the general purpose pool. I had considered the kernel setting initial state to be an optimization, but have reconsidered. You may have misunderstood somewhat though, if you do not explicitly isolate the nohz_full set from the scheduler via either isolcpus or cpusets, there is no exclusion from load balancing, the scheduler may place additional tasks on a nohz_full cpu to resolve load imbalance. Given that kernel initiated association to isolcpus, a user turning NO_HZ_FULL_ALL on had better not have much generic load to manage. If he/she does not have CPUSETS enabled, or should Rik's patch rendering isolcpus immutable be merged, he/she would quickly discover that the generic box has been transformed into an utterly inflexible specialist incapable of performing mundane tasks ever again. -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-17 5:30 ` Mike Galbraith @ 2015-05-18 2:17 ` Rik van Riel 2015-05-18 3:29 ` Mike Galbraith 2015-05-20 20:38 ` Afzal Mohammed 1 sibling, 1 reply; 38+ messages in thread From: Rik van Riel @ 2015-05-18 2:17 UTC (permalink / raw) To: Mike Galbraith, Sasha Levin Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On 05/17/2015 01:30 AM, Mike Galbraith wrote: > Given that kernel initiated association to isolcpus, a user turning > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > he/she does not have CPUSETS enabled, or should Rik's patch rendering > isolcpus immutable be merged, My patch does not aim to make isolcpus immutable, it aims to make isolcpus resistent to system management tools (like libvirt) automatically undoing isolcpus the instant a cpuset with the default cpus (inherited from the root group) is created. -- All rights reversed ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-18 2:17 ` Rik van Riel @ 2015-05-18 3:29 ` Mike Galbraith 2015-05-18 14:07 ` Rik van Riel 0 siblings, 1 reply; 38+ messages in thread From: Mike Galbraith @ 2015-05-18 3:29 UTC (permalink / raw) To: Rik van Riel Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote: > On 05/17/2015 01:30 AM, Mike Galbraith wrote: > > > Given that kernel initiated association to isolcpus, a user turning > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > he/she does not have CPUSETS enabled, or should Rik's patch rendering > > isolcpus immutable be merged, > > My patch does not aim to make isolcpus immutable, it aims to make > isolcpus resistent to system management tools (like libvirt) > automatically undoing isolcpus the instant a cpuset with the default > cpus (inherited from the root group) is created. Aim or not, if cpusets is the sole modifier, it'll render isolcpus immutable, no? Cpusets could grow an override to the override I suppose, to regain control of the resource it thinks it manages. -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-18 3:29 ` Mike Galbraith @ 2015-05-18 14:07 ` Rik van Riel 2015-05-18 14:22 ` Mike Galbraith 0 siblings, 1 reply; 38+ messages in thread From: Rik van Riel @ 2015-05-18 14:07 UTC (permalink / raw) To: Mike Galbraith Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On 05/17/2015 11:29 PM, Mike Galbraith wrote: > On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote: >> On 05/17/2015 01:30 AM, Mike Galbraith wrote: >> >>> Given that kernel initiated association to isolcpus, a user turning >>> NO_HZ_FULL_ALL on had better not have much generic load to manage. If >>> he/she does not have CPUSETS enabled, or should Rik's patch rendering >>> isolcpus immutable be merged, >> >> My patch does not aim to make isolcpus immutable, it aims to make >> isolcpus resistent to system management tools (like libvirt) >> automatically undoing isolcpus the instant a cpuset with the default >> cpus (inherited from the root group) is created. > > Aim or not, if cpusets is the sole modifier, it'll render isolcpus > immutable, no? Cpusets could grow an override to the override I > suppose, to regain control of the resource it thinks it manages. The other way would be to make /sys/devices/system/cpu/isolcpus (which Greg KH promised he would queue up for 4.2) writable. I am all for making isolcpus changeable at run time. I just want to prevent it being changed accidentally at run time, by system tools that want to place their workloads in cpusets. -- All rights reversed ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-18 14:07 ` Rik van Riel @ 2015-05-18 14:22 ` Mike Galbraith 2015-05-18 14:52 ` Rik van Riel 0 siblings, 1 reply; 38+ messages in thread From: Mike Galbraith @ 2015-05-18 14:22 UTC (permalink / raw) To: Rik van Riel Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote: > On 05/17/2015 11:29 PM, Mike Galbraith wrote: > > On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote: > >> On 05/17/2015 01:30 AM, Mike Galbraith wrote: > >> > >>> Given that kernel initiated association to isolcpus, a user turning > >>> NO_HZ_FULL_ALL on had better not have much generic load to manage. If > >>> he/she does not have CPUSETS enabled, or should Rik's patch rendering > >>> isolcpus immutable be merged, > >> > >> My patch does not aim to make isolcpus immutable, it aims to make > >> isolcpus resistent to system management tools (like libvirt) > >> automatically undoing isolcpus the instant a cpuset with the default > >> cpus (inherited from the root group) is created. > > > > Aim or not, if cpusets is the sole modifier, it'll render isolcpus > > immutable, no? Cpusets could grow an override to the override I > > suppose, to regain control of the resource it thinks it manages. > > The other way would be to make /sys/devices/system/cpu/isolcpus > (which Greg KH promised he would queue up for 4.2) writable. Anything is better than override the override. That's easy, but the changelog would have to be farmed out to a talented used car salesman. -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-18 14:22 ` Mike Galbraith @ 2015-05-18 14:52 ` Rik van Riel 2015-05-19 2:30 ` Mike Galbraith 0 siblings, 1 reply; 38+ messages in thread From: Rik van Riel @ 2015-05-18 14:52 UTC (permalink / raw) To: Mike Galbraith Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On 05/18/2015 10:22 AM, Mike Galbraith wrote: > On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote: >> On 05/17/2015 11:29 PM, Mike Galbraith wrote: >>> On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote: >>>> On 05/17/2015 01:30 AM, Mike Galbraith wrote: >>>> >>>>> Given that kernel initiated association to isolcpus, a user turning >>>>> NO_HZ_FULL_ALL on had better not have much generic load to manage. If >>>>> he/she does not have CPUSETS enabled, or should Rik's patch rendering >>>>> isolcpus immutable be merged, >>>> >>>> My patch does not aim to make isolcpus immutable, it aims to make >>>> isolcpus resistent to system management tools (like libvirt) >>>> automatically undoing isolcpus the instant a cpuset with the default >>>> cpus (inherited from the root group) is created. >>> >>> Aim or not, if cpusets is the sole modifier, it'll render isolcpus >>> immutable, no? Cpusets could grow an override to the override I >>> suppose, to regain control of the resource it thinks it manages. >> >> The other way would be to make /sys/devices/system/cpu/isolcpus >> (which Greg KH promised he would queue up for 4.2) writable. > > Anything is better than override the override. That's easy, but the > changelog would have to be farmed out to a talented used car salesman. For real time KVM, it is desirable to run the VCPU threads on isolated CPUs as real time tasks. Meanwhile, the emulator threads can run as normal tasks anywhere on the system. This means the /machine cpuset, which all guests live under, needs to contain all the system's CPUs, both isolated and non-isolated, otherwise it will be unable to have the VCPU threads on isolated CPUs and the emulator threads on other CPUs. -- All rights reversed ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-18 14:52 ` Rik van Riel @ 2015-05-19 2:30 ` Mike Galbraith 2015-06-12 19:12 ` Rik van Riel 0 siblings, 1 reply; 38+ messages in thread From: Mike Galbraith @ 2015-05-19 2:30 UTC (permalink / raw) To: Rik van Riel Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote: > For real time KVM, it is desirable to run the VCPU threads on > isolated CPUs as real time tasks. > > Meanwhile, the emulator threads can run as normal tasks anywhere > on the system. > > This means the /machine cpuset, which all guests live under, > needs to contain all the system's CPUs, both isolated and > non-isolated, otherwise it will be unable to have the VCPU > threads on isolated CPUs and the emulator threads on other > CPUs. Ok, I know next to nothing about virtualization only because I failed at maintaining the desired absolute zero, but... Why do you use isolcpus for this? This KVM setup sounds very similar to what I do for real realtime. I use a shield script to set up an exclusive isolated set and a system set and move the mundane world into the system set. Inject realtime proggy into its exclusive home, it creates/pins worker bees, done. Why the cpuset isolcpus hybrid? When you say emulator threads can run anywhere, it sounds like that includes the isolated set, but even so, nothing prevents injecting whatever (the kernel permits) into whichever set. -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-19 2:30 ` Mike Galbraith @ 2015-06-12 19:12 ` Rik van Riel 0 siblings, 0 replies; 38+ messages in thread From: Rik van Riel @ 2015-06-12 19:12 UTC (permalink / raw) To: Mike Galbraith Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Martin Schwidefsky On 05/18/2015 10:30 PM, Mike Galbraith wrote: > On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote: > >> For real time KVM, it is desirable to run the VCPU threads on >> isolated CPUs as real time tasks. >> >> Meanwhile, the emulator threads can run as normal tasks anywhere >> on the system. >> >> This means the /machine cpuset, which all guests live under, >> needs to contain all the system's CPUs, both isolated and >> non-isolated, otherwise it will be unable to have the VCPU >> threads on isolated CPUs and the emulator threads on other >> CPUs. > > Ok, I know next to nothing about virtualization only because I failed at > maintaining the desired absolute zero, but... > > Why do you use isolcpus for this? This KVM setup sounds very similar to > what I do for real realtime. I use a shield script to set up an > exclusive isolated set and a system set and move the mundane world into > the system set. Inject realtime proggy into its exclusive home, it > creates/pins worker bees, done. Why the cpuset isolcpus hybrid? Because libvirt doesn't know any better currently. Ideally in the long run, we would use only cpusets, and not boot with isolcpus at all. Of course, things like irqbalance also key off isolcpus, to avoid assigning irqs to the isolated cpus, so there are quite a few puzzle pieces, and the best I can hope for in the short term is to simply resolve the incompatibilities where one piece of software undoes the settings required by another... -- All rights reversed ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-17 5:30 ` Mike Galbraith 2015-05-18 2:17 ` Rik van Riel @ 2015-05-20 20:38 ` Afzal Mohammed 2015-05-20 21:00 ` Paul E. McKenney 1 sibling, 1 reply; 38+ messages in thread From: Afzal Mohammed @ 2015-05-20 20:38 UTC (permalink / raw) To: Mike Galbraith Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky Hi, On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote: > Yeah, tying nohz_full set to isolcpus set up an initial condition that > you have to tear down with cpusets if you want those cpus returned to > the general purpose pool. I had considered the kernel setting initial > state to be an optimization, but have reconsidered. > Given that kernel initiated association to isolcpus, a user turning > NO_HZ_FULL_ALL on had better not have much generic load to manage. If On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x time as compared to w/o this patch, except boot cpu every one else jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, it was working fine, but not after this - it is now like a single core system. Regards Afzal > he/she does not have CPUSETS enabled, or should Rik's patch rendering > isolcpus immutable be merged, he/she would quickly discover that the > generic box has been transformed into an utterly inflexible specialist > incapable of performing mundane tasks ever again. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-20 20:38 ` Afzal Mohammed @ 2015-05-20 21:00 ` Paul E. McKenney 2015-05-21 12:12 ` Afzal Mohammed 0 siblings, 1 reply; 38+ messages in thread From: Paul E. McKenney @ 2015-05-20 21:00 UTC (permalink / raw) To: Afzal Mohammed Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 02:08:09AM +0530, Afzal Mohammed wrote: > Hi, > > On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote: > > > Yeah, tying nohz_full set to isolcpus set up an initial condition that > > you have to tear down with cpusets if you want those cpus returned to > > the general purpose pool. I had considered the kernel setting initial > > state to be an optimization, but have reconsidered. > > > Given that kernel initiated association to isolcpus, a user turning > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > time as compared to w/o this patch, except boot cpu every one else > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > it was working fine, but not after this - it is now like a single core > system. I have to ask... What is your use case? What are you wanting NO_HZ_FULL to do for you? Thanx, Paul > Regards > Afzal > > > he/she does not have CPUSETS enabled, or should Rik's patch rendering > > isolcpus immutable be merged, he/she would quickly discover that the > > generic box has been transformed into an utterly inflexible specialist > > incapable of performing mundane tasks ever again. > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-20 21:00 ` Paul E. McKenney @ 2015-05-21 12:12 ` Afzal Mohammed 2015-05-21 12:57 ` Paul E. McKenney 0 siblings, 1 reply; 38+ messages in thread From: Afzal Mohammed @ 2015-05-21 12:12 UTC (permalink / raw) To: Paul E. McKenney Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky Hi, On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote: > > > Given that kernel initiated association to isolcpus, a user turning > > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > > time as compared to w/o this patch, except boot cpu every one else > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > > it was working fine, but not after this - it is now like a single core > > system. > > I have to ask... What is your use case? What are you wanting NO_HZ_FULL > to do for you? I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes. Thought that shutting down ticks as much as possible would be beneficial to normal loads too, though it has been mentioned to be used for specialized loads. Seems like drawbacks due to it weigh against normal loads, but haven't so far observed any (on a laptop with normal activities) before this change. Regards Afzal ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 12:12 ` Afzal Mohammed @ 2015-05-21 12:57 ` Paul E. McKenney 2015-05-21 13:06 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Paul E. McKenney @ 2015-05-21 12:57 UTC (permalink / raw) To: Afzal Mohammed Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote: > Hi, > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote: > > > > > Given that kernel initiated association to isolcpus, a user turning > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > > > time as compared to w/o this patch, except boot cpu every one else > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > > > it was working fine, but not after this - it is now like a single core > > > system. > > > > I have to ask... What is your use case? What are you wanting NO_HZ_FULL > > to do for you? > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes. > > Thought that shutting down ticks as much as possible would be > beneficial to normal loads too, though it has been mentioned to be used > for specialized loads. Seems like drawbacks due to it weigh against > normal loads, but haven't so far observed any (on a laptop with normal > activities) before this change. Indeed, NO_HZ_FULL is special purpose. You normally would select NO_HZ_FULL_ALL only on a system intended for heavy compute without normal-workload distractions or for some real-time systems. For mixed workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and use the boot parameters to select which CPUs are to be running the specialized portion of the workload. And you would of course need to lead enough CPUs running normally to handle the non-specialized portion of the workload. This sort of thing has traditionally required specialized kernels, so the cool thing here is that we can make Linux do it. Though, as you noticed, careful configuration is still required. Seem reasonable? Thanx, Paul ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 12:57 ` Paul E. McKenney @ 2015-05-21 13:06 ` Frederic Weisbecker 2015-05-21 13:27 ` Paul E. McKenney ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-05-21 13:06 UTC (permalink / raw) To: Paul E. McKenney Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote: > > Hi, > > > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote: > > > > > > > Given that kernel initiated association to isolcpus, a user turning > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > > > > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > > > > time as compared to w/o this patch, except boot cpu every one else > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > > > > it was working fine, but not after this - it is now like a single core > > > > system. > > > > > > I have to ask... What is your use case? What are you wanting NO_HZ_FULL > > > to do for you? > > > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes. > > > > Thought that shutting down ticks as much as possible would be > > beneficial to normal loads too, though it has been mentioned to be used > > for specialized loads. Seems like drawbacks due to it weigh against > > normal loads, but haven't so far observed any (on a laptop with normal > > activities) before this change. > > Indeed, NO_HZ_FULL is special purpose. You normally would select > NO_HZ_FULL_ALL only on a system intended for heavy compute without > normal-workload distractions or for some real-time systems. For mixed > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > use the boot parameters to select which CPUs are to be running the > specialized portion of the workload. > > And you would of course need to lead enough CPUs running normally to > handle the non-specialized portion of the workload. > > This sort of thing has traditionally required specialized kernels, > so the cool thing here is that we can make Linux do it. Though, as > you noticed, careful configuration is still required. > > Seem reasonable? That said if he saw a big performance regression after applying these patches, then there is likely a problem in the patchset. Well it could be due to that mode which loops on full dynticks before resuming to userspace. Indeed when that is enabled, I expect real throughput issues on workloads doing lots of kernel <-> userspace roundtrips. We just need to make sure this thing only works when requested. Anyway, I need to look at the patchset. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 13:06 ` Frederic Weisbecker @ 2015-05-21 13:27 ` Paul E. McKenney 2015-05-21 13:29 ` Afzal Mohammed 2015-05-21 18:59 ` Mike Galbraith 2 siblings, 0 replies; 38+ messages in thread From: Paul E. McKenney @ 2015-05-21 13:27 UTC (permalink / raw) To: Frederic Weisbecker Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote: > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote: > > > Hi, > > > > > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote: > > > > > > > > > Given that kernel initiated association to isolcpus, a user turning > > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > > > > > > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > > > > > time as compared to w/o this patch, except boot cpu every one else > > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > > > > > it was working fine, but not after this - it is now like a single core > > > > > system. > > > > > > > > I have to ask... What is your use case? What are you wanting NO_HZ_FULL > > > > to do for you? > > > > > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes. > > > > > > Thought that shutting down ticks as much as possible would be > > > beneficial to normal loads too, though it has been mentioned to be used > > > for specialized loads. Seems like drawbacks due to it weigh against > > > normal loads, but haven't so far observed any (on a laptop with normal > > > activities) before this change. > > > > Indeed, NO_HZ_FULL is special purpose. You normally would select > > NO_HZ_FULL_ALL only on a system intended for heavy compute without > > normal-workload distractions or for some real-time systems. For mixed > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > > use the boot parameters to select which CPUs are to be running the > > specialized portion of the workload. > > > > And you would of course need to lead enough CPUs running normally to > > handle the non-specialized portion of the workload. > > > > This sort of thing has traditionally required specialized kernels, > > so the cool thing here is that we can make Linux do it. Though, as > > you noticed, careful configuration is still required. > > > > Seem reasonable? > > That said if he saw a big performance regression after applying these patches, > then there is likely a problem in the patchset. Well it could be due to that mode > which loops on full dynticks before resuming to userspace. Indeed when that is > enabled, I expect real throughput issues on workloads doing lots of kernel <-> > userspace roundtrips. We just need to make sure this thing only works when requested. > > Anyway, I need to look at the patchset. Fair enough! Thanx, Paul ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 13:06 ` Frederic Weisbecker 2015-05-21 13:27 ` Paul E. McKenney @ 2015-05-21 13:29 ` Afzal Mohammed 2015-05-21 14:14 ` Paul E. McKenney 2015-05-21 18:59 ` Mike Galbraith 2 siblings, 1 reply; 38+ messages in thread From: Afzal Mohammed @ 2015-05-21 13:29 UTC (permalink / raw) To: Frederic Weisbecker Cc: Paul E. McKenney, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky Hi, On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote: > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > > Indeed, NO_HZ_FULL is special purpose. You normally would select > > NO_HZ_FULL_ALL only on a system intended for heavy compute without > > normal-workload distractions or for some real-time systems. For mixed > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > > use the boot parameters to select which CPUs are to be running the > > specialized portion of the workload. > > > > And you would of course need to lead enough CPUs running normally to > > handle the non-specialized portion of the workload. > > > > This sort of thing has traditionally required specialized kernels, > > so the cool thing here is that we can make Linux do it. Though, as > > you noticed, careful configuration is still required. > > > > Seem reasonable? Yes, thanks, some dots got connected :) > That said if he saw a big performance regression after applying these patches, > then there is likely a problem in the patchset. Well it could be due to that mode > which loops on full dynticks before resuming to userspace. Indeed when that is > enabled, I expect real throughput issues on workloads doing lots of kernel <-> > userspace roundtrips. We just need to make sure this thing only works when requested. With this change (& having NO_HZ_FULL_ALL), hackbench was being served only by the boot cpu, while w/o this change, all 8 (this is a quad core HT processor) was being used - observation based on 'top'. Regards Afzal ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 13:29 ` Afzal Mohammed @ 2015-05-21 14:14 ` Paul E. McKenney 2015-05-21 14:46 ` Frederic Weisbecker 0 siblings, 1 reply; 38+ messages in thread From: Paul E. McKenney @ 2015-05-21 14:14 UTC (permalink / raw) To: Afzal Mohammed Cc: Frederic Weisbecker, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote: > Hi, > > On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote: > > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > > > > Indeed, NO_HZ_FULL is special purpose. You normally would select > > > NO_HZ_FULL_ALL only on a system intended for heavy compute without > > > normal-workload distractions or for some real-time systems. For mixed > > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > > > use the boot parameters to select which CPUs are to be running the > > > specialized portion of the workload. > > > > > > And you would of course need to lead enough CPUs running normally to > > > handle the non-specialized portion of the workload. > > > > > > This sort of thing has traditionally required specialized kernels, > > > so the cool thing here is that we can make Linux do it. Though, as > > > you noticed, careful configuration is still required. > > > > > > Seem reasonable? > > Yes, thanks, some dots got connected :) > > > That said if he saw a big performance regression after applying these patches, > > then there is likely a problem in the patchset. Well it could be due to that mode > > which loops on full dynticks before resuming to userspace. Indeed when that is > > enabled, I expect real throughput issues on workloads doing lots of kernel <-> > > userspace roundtrips. We just need to make sure this thing only works when requested. > > With this change (& having NO_HZ_FULL_ALL), hackbench was being served > only by the boot cpu, while w/o this change, all 8 (this is a quad > core HT processor) was being used - observation based on 'top'. Good to know! And it will be interesting to see what Frederic decides based on his review of the patchset. Thanx, Paul ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 14:14 ` Paul E. McKenney @ 2015-05-21 14:46 ` Frederic Weisbecker 0 siblings, 0 replies; 38+ messages in thread From: Frederic Weisbecker @ 2015-05-21 14:46 UTC (permalink / raw) To: Paul E. McKenney Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 07:14:09AM -0700, Paul E. McKenney wrote: > On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote: > > Hi, > > > > On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote: > > > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > > > > > > Indeed, NO_HZ_FULL is special purpose. You normally would select > > > > NO_HZ_FULL_ALL only on a system intended for heavy compute without > > > > normal-workload distractions or for some real-time systems. For mixed > > > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > > > > use the boot parameters to select which CPUs are to be running the > > > > specialized portion of the workload. > > > > > > > > And you would of course need to lead enough CPUs running normally to > > > > handle the non-specialized portion of the workload. > > > > > > > > This sort of thing has traditionally required specialized kernels, > > > > so the cool thing here is that we can make Linux do it. Though, as > > > > you noticed, careful configuration is still required. > > > > > > > > Seem reasonable? > > > > Yes, thanks, some dots got connected :) > > > > > That said if he saw a big performance regression after applying these patches, > > > then there is likely a problem in the patchset. Well it could be due to that mode > > > which loops on full dynticks before resuming to userspace. Indeed when that is > > > enabled, I expect real throughput issues on workloads doing lots of kernel <-> > > > userspace roundtrips. We just need to make sure this thing only works when requested. > > > > With this change (& having NO_HZ_FULL_ALL), hackbench was being served > > only by the boot cpu, while w/o this change, all 8 (this is a quad > > core HT processor) was being used - observation based on 'top'. > > Good to know! And it will be interesting to see what Frederic decides > based on his review of the patchset. Ah wait I was confusing this patchset with the dataplane mode. No the performance loss seen by Afzal is actually expected. Now that we set all nohz full CPUs as isolcpus, when you start a process such as hackbench, it will be affine to CPU 0 unless you explicitly tweak the affinity of hackbench. So yeah this is a desired effect :-) ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 13:06 ` Frederic Weisbecker 2015-05-21 13:27 ` Paul E. McKenney 2015-05-21 13:29 ` Afzal Mohammed @ 2015-05-21 18:59 ` Mike Galbraith 2015-05-22 14:39 ` Frederic Weisbecker 2 siblings, 1 reply; 38+ messages in thread From: Mike Galbraith @ 2015-05-21 18:59 UTC (permalink / raw) To: Frederic Weisbecker Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, 2015-05-21 at 15:06 +0200, Frederic Weisbecker wrote: > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote: > > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote: > > > Hi, > > > > > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote: > > > > > > > > > Given that kernel initiated association to isolcpus, a user turning > > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage. If > > > > > > > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x > > > > > time as compared to w/o this patch, except boot cpu every one else > > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load, > > > > > it was working fine, but not after this - it is now like a single core > > > > > system. > > > > > > > > I have to ask... What is your use case? What are you wanting NO_HZ_FULL > > > > to do for you? > > > > > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes. > > > > > > Thought that shutting down ticks as much as possible would be > > > beneficial to normal loads too, though it has been mentioned to be used > > > for specialized loads. Seems like drawbacks due to it weigh against > > > normal loads, but haven't so far observed any (on a laptop with normal > > > activities) before this change. > > > > Indeed, NO_HZ_FULL is special purpose. You normally would select > > NO_HZ_FULL_ALL only on a system intended for heavy compute without > > normal-workload distractions or for some real-time systems. For mixed > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and > > use the boot parameters to select which CPUs are to be running the > > specialized portion of the workload. > > > > And you would of course need to lead enough CPUs running normally to > > handle the non-specialized portion of the workload. > > > > This sort of thing has traditionally required specialized kernels, > > so the cool thing here is that we can make Linux do it. Though, as > > you noticed, careful configuration is still required. > > > > Seem reasonable? > > That said if he saw a big performance regression after applying these patches, > then there is likely a problem in the patchset. Well it could be due to that mode > which loops on full dynticks before resuming to userspace. Indeed when that is > enabled, I expect real throughput issues on workloads doing lots of kernel <-> > userspace roundtrips. We just need to make sure this thing only works when requested. > > Anyway, I need to look at the patchset. I think it's a mistake to make any rash assumption and/or mandate that the user WILL use nohz_full CPUs immediately, or even at all for that matter. nohz_full currently is nothing but a CPU attribute, period, nothing more, nothing less. The overhead that comes with the it is the only thing that suggests that the set really really should be used exclusively for isolated compute loads, and even then, they had better be pure userspace due to the hefty border crossing overhead. Let that overhear be seriously reduced, as per the Ingo/Andy discussion, and presto, the things become a very interesting to dynamic sets. You can really really use them as you see fit, and on the fly, it doesn't cost you an arm and a leg just to turn it on. The only restriction being the static set declaration, that you have to manage until that too goes away. I see no reason to turn nohz_full into a rigid construct. -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-21 18:59 ` Mike Galbraith @ 2015-05-22 14:39 ` Frederic Weisbecker 2015-05-22 15:20 ` Mike Galbraith 0 siblings, 1 reply; 38+ messages in thread From: Frederic Weisbecker @ 2015-05-22 14:39 UTC (permalink / raw) To: Mike Galbraith Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote: > I think it's a mistake to make any rash assumption and/or mandate that > the user WILL use nohz_full CPUs immediately, or even at all for that > matter. nohz_full currently is nothing but a CPU attribute, period, > nothing more, nothing less. That's what the nohz_full parameter is for. Now if you're referring to change the nohz_full toggle to a runtime interface such as sysfs or such, I don't see that's a pressing need, especially considering the added complexity. At least I haven't heard much requests for it. > The overhead that comes with the it is the > only thing that suggests that the set really really should be used > exclusively for isolated compute loads, and even then, they had better > be pure userspace due to the hefty border crossing overhead. Yep. > > Let that overhear be seriously reduced, as per the Ingo/Andy discussion, > and presto, the things become a very interesting to dynamic sets. You > can really really use them as you see fit, and on the fly, it doesn't > cost you an arm and a leg just to turn it on. The only restriction > being the static set declaration, that you have to manage until that too > goes away. > > I see no reason to turn nohz_full into a rigid construct. I'm all for making nohz_full just about the tick and drive the rest of the isolation features through other settings. Ideally the isolation should be tuned by userspace, we have all the interface available for that: process affinity, irq affinity, workqueue affinity (soon), watchdog, etc... Then we'll also add syscall or prctl to loop on hard isolation (dataplane). Unfortunately people seem to prefer adding that complexity to the kernel and let it assume bold and unflexible pre-setting on its own. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set 2015-05-22 14:39 ` Frederic Weisbecker @ 2015-05-22 15:20 ` Mike Galbraith 0 siblings, 0 replies; 38+ messages in thread From: Mike Galbraith @ 2015-05-22 15:20 UTC (permalink / raw) To: Frederic Weisbecker Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel, Martin Schwidefsky On Fri, 2015-05-22 at 16:39 +0200, Frederic Weisbecker wrote: > On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote: > > I think it's a mistake to make any rash assumption and/or mandate that > > the user WILL use nohz_full CPUs immediately, or even at all for that > > matter. nohz_full currently is nothing but a CPU attribute, period, > > nothing more, nothing less. > > That's what the nohz_full parameter is for. Now if you're referring to > change the nohz_full toggle to a runtime interface such as sysfs or such, > I don't see that's a pressing need, especially considering the added > complexity. At least I haven't heard much requests for it. I'm not advocating anything like that, I'm just standing on my soap box advocating NOT restricting user choices. Cpusets works fine for me, and I'd like them to keep on working. If I need to add any hooks, buttons, or rubber tubing I'll do it there ;-) -Mike ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2015-06-12 19:12 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-04-21 12:23 [PATCH 0/4] nohz: A few improvements v2 Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker 2015-04-21 15:26 ` Peter Zijlstra 2015-04-21 16:51 ` Frederic Weisbecker 2015-04-21 20:52 ` Peter Zijlstra 2015-04-21 21:06 ` Frederic Weisbecker 2015-04-22 15:35 ` Peter Zijlstra 2015-04-22 15:51 ` Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker 2015-04-21 12:23 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker 2015-04-21 15:26 ` [PATCH 0/4] nohz: A few improvements v2 Peter Zijlstra -- strict thread matches above, loose matches on Subject: below -- 2015-04-24 15:58 [PATCH 0/4] nohz: A few improvements v3 Frederic Weisbecker 2015-04-24 15:58 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker 2015-04-24 20:07 ` Gene Heskett 2015-04-25 23:13 ` Frederic Weisbecker 2015-04-25 23:50 ` Gene Heskett 2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker 2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker 2015-05-16 19:39 ` Sasha Levin 2015-05-17 5:30 ` Mike Galbraith 2015-05-18 2:17 ` Rik van Riel 2015-05-18 3:29 ` Mike Galbraith 2015-05-18 14:07 ` Rik van Riel 2015-05-18 14:22 ` Mike Galbraith 2015-05-18 14:52 ` Rik van Riel 2015-05-19 2:30 ` Mike Galbraith 2015-06-12 19:12 ` Rik van Riel 2015-05-20 20:38 ` Afzal Mohammed 2015-05-20 21:00 ` Paul E. McKenney 2015-05-21 12:12 ` Afzal Mohammed 2015-05-21 12:57 ` Paul E. McKenney 2015-05-21 13:06 ` Frederic Weisbecker 2015-05-21 13:27 ` Paul E. McKenney 2015-05-21 13:29 ` Afzal Mohammed 2015-05-21 14:14 ` Paul E. McKenney 2015-05-21 14:46 ` Frederic Weisbecker 2015-05-21 18:59 ` Mike Galbraith 2015-05-22 14:39 ` Frederic Weisbecker 2015-05-22 15:20 ` Mike Galbraith
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).