public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
@ 2015-08-14  1:46 Frederic Weisbecker
  2015-08-22 21:09 ` Ping: " Frederic Weisbecker
  2015-08-23  5:40 ` Ingo Molnar
  0 siblings, 2 replies; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-14  1:46 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: LKML, Vatika Harlalka, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Paul E . McKenney,
	Frederic Weisbecker

From: Vatika Harlalka <vatikaharlalka@gmail.com>

The problem addressed in this patch is about affining unpinned timers.
Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full dynticks
in NOHZ_FULL configured systems. It should not introduce overhead in
nohz full off case due to static keys.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tick.h | 9 ++++++++-
 kernel/sched/core.c  | 7 +++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 3741ba1..51e6493 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
 	if (tick_nohz_full_enabled())
 		cpumask_or(mask, mask, tick_nohz_full_mask);
 }
-
+static inline int housekeeping_any_cpu(void)
+{
+	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+	return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9917c96..4fd42e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
 	int i, cpu = smp_processor_id();
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu))
+	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
 		return cpu;
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd)) {
-			if (!idle_cpu(i)) {
+			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
+
+	if (!is_housekeeping_cpu(cpu))
+		cpu = housekeeping_any_cpu();
 unlock:
 	rcu_read_unlock();
 	return cpu;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-14  1:46 [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
@ 2015-08-22 21:09 ` Frederic Weisbecker
  2015-08-23  1:22   ` Christoph Lameter
  2015-08-23  5:40 ` Ingo Molnar
  1 sibling, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-22 21:09 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: LKML, Vatika Harlalka, Chris Metcalf, Thomas Gleixner,
	Preeti U Murthy, Christoph Lameter, Paul E . McKenney


Ping!

On Fri, Aug 14, 2015 at 03:46:14AM +0200, Frederic Weisbecker wrote:
> From: Vatika Harlalka <vatikaharlalka@gmail.com>
> 
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
> 
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
> 
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Chris Metcalf <cmetcalf@ezchip.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  include/linux/tick.h | 9 ++++++++-
>  kernel/sched/core.c  | 7 +++++--
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  	if (tick_nohz_full_enabled())
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>  
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
>  
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
>  			}
>  		}
>  	}
> +
> +	if (!is_housekeeping_cpu(cpu))
> +		cpu = housekeeping_any_cpu();
>  unlock:
>  	rcu_read_unlock();
>  	return cpu;
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-22 21:09 ` Ping: " Frederic Weisbecker
@ 2015-08-23  1:22   ` Christoph Lameter
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Lameter @ 2015-08-23  1:22 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Ingo Molnar, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Paul E . McKenney

On Sat, 22 Aug 2015, Frederic Weisbecker wrote:

> Ping!

Exellent!!!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-14  1:46 [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
  2015-08-22 21:09 ` Ping: " Frederic Weisbecker
@ 2015-08-23  5:40 ` Ingo Molnar
  2015-08-23 16:01   ` Paul E. McKenney
  2015-08-24  1:45   ` Frederic Weisbecker
  1 sibling, 2 replies; 21+ messages in thread
From: Ingo Molnar @ 2015-08-23  5:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter,
	Paul E . McKenney


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> From: Vatika Harlalka <vatikaharlalka@gmail.com>
> 
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
> 
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
> 
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Chris Metcalf <cmetcalf@ezchip.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  include/linux/tick.h | 9 ++++++++-
>  kernel/sched/core.c  | 7 +++++--
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  	if (tick_nohz_full_enabled())
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>  
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
>  
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
>  			}
>  		}
>  	}
> +
> +	if (!is_housekeeping_cpu(cpu))
> +		cpu = housekeeping_any_cpu();
>  unlock:
>  	rcu_read_unlock();
>  	return cpu;

So I almost applied this yesterday, but had the following question: what ensures 
that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
returns cpumask_any_and() of an empty cpumask - which returns an out of range 
index AFAICS - which will crash and burn in:

kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());

housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of 
it in essence), and tick_nohz_full_mask is set via two methods, either via a boot 
parameter:

        if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
completely full - making housekeeping_mask empty?

The other method is via CONFIG_NO_HZ_FULL_ALL:

        cpumask_setall(tick_nohz_full_mask);

here it's fully set - triggering the bug I'm worried about. So what am I missing, 
what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-23  5:40 ` Ingo Molnar
@ 2015-08-23 16:01   ` Paul E. McKenney
  2015-08-24  1:28     ` Frederic Weisbecker
  2015-08-24  6:44     ` Ingo Molnar
  2015-08-24  1:45   ` Frederic Weisbecker
  1 sibling, 2 replies; 21+ messages in thread
From: Paul E. McKenney @ 2015-08-23 16:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > From: Vatika Harlalka <vatikaharlalka@gmail.com>
> > 
> > The problem addressed in this patch is about affining unpinned timers.
> > Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> > jitter due to firing of such timers on them.
> > 
> > This patch will affine timers to online CPUs which are not full dynticks
> > in NOHZ_FULL configured systems. It should not introduce overhead in
> > nohz full off case due to static keys.
> > 
> > Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> > Signed-off by: Vatika Harlalka <vatikaharlalka@gmail.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Chris Metcalf <cmetcalf@ezchip.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> >  include/linux/tick.h | 9 ++++++++-
> >  kernel/sched/core.c  | 7 +++++--
> >  2 files changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > index 3741ba1..51e6493 100644
> > --- a/include/linux/tick.h
> > +++ b/include/linux/tick.h
> > @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
> >  	if (tick_nohz_full_enabled())
> >  		cpumask_or(mask, mask, tick_nohz_full_mask);
> >  }
> > -
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > +}
> >  extern void __tick_nohz_full_check(void);
> >  extern void tick_nohz_full_kick(void);
> >  extern void tick_nohz_full_kick_cpu(int cpu);
> >  extern void tick_nohz_full_kick_all(void);
> >  extern void __tick_nohz_task_switch(struct task_struct *tsk);
> >  #else
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +	return smp_processor_id();
> > +}
> >  static inline bool tick_nohz_full_enabled(void) { return false; }
> >  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> >  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9917c96..4fd42e4 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> >  	int i, cpu = smp_processor_id();
> >  	struct sched_domain *sd;
> >  
> > -	if (!idle_cpu(cpu))
> > +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> >  		return cpu;
> >  
> >  	rcu_read_lock();
> >  	for_each_domain(cpu, sd) {
> >  		for_each_cpu(i, sched_domain_span(sd)) {
> > -			if (!idle_cpu(i)) {
> > +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> >  				cpu = i;
> >  				goto unlock;
> >  			}
> >  		}
> >  	}
> > +
> > +	if (!is_housekeeping_cpu(cpu))
> > +		cpu = housekeeping_any_cpu();
> >  unlock:
> >  	rcu_read_unlock();
> >  	return cpu;
> 
> So I almost applied this yesterday, but had the following question: what ensures 
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
> returns cpumask_any_and() of an empty cpumask - which returns an out of range 
> index AFAICS - which will crash and burn in:
> 
> kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> 
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of 
> it in essence), and tick_nohz_full_mask is set via two methods, either via a boot 
> parameter:
> 
>         if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
> 
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
> completely full - making housekeeping_mask empty?
> 
> The other method is via CONFIG_NO_HZ_FULL_ALL:
> 
>         cpumask_setall(tick_nohz_full_mask);
> 
> here it's fully set - triggering the bug I'm worried about. So what am I missing, 
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
which is called from tick_init() which is called from start_kernel()
shortly after rcu_init():

	cpu = smp_processor_id();

	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
	}

This happens after the call to tick_nohz_init_all() that does the
cpumask_setall() that you called out above.

Or is a recent patch that I missed changing this?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-23 16:01   ` Paul E. McKenney
@ 2015-08-24  1:28     ` Frederic Weisbecker
  2015-08-24  6:44     ` Ingo Molnar
  1 sibling, 0 replies; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-24  1:28 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter

On Sun, Aug 23, 2015 at 09:01:01AM -0700, Paul E. McKenney wrote:
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> which is called from tick_init() which is called from start_kernel()
> shortly after rcu_init():
> 
> 	cpu = smp_processor_id();
> 
> 	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> 		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> 	}
> 
> This happens after the call to tick_nohz_init_all() that does the
> cpumask_setall() that you called out above.
> 
> Or is a recent patch that I missed changing this?

Exactly, this happens right after tick_nohz_full_mask is filled and
makes sure that at least the boot CPU runs as a housekeeper. We
also make sure that it can't become offline later.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-23  5:40 ` Ingo Molnar
  2015-08-23 16:01   ` Paul E. McKenney
@ 2015-08-24  1:45   ` Frederic Weisbecker
  1 sibling, 0 replies; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-24  1:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter,
	Paul E . McKenney

On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
> So I almost applied this yesterday, but had the following question: what ensures 
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
> returns cpumask_any_and() of an empty cpumask - which returns an out of range 
> index AFAICS - which will crash and burn in:
> 
> kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> 
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of 
> it in essence), and tick_nohz_full_mask is set via two methods, either via a boot 
> parameter:
> 
>         if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
> 
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
> completely full - making housekeeping_mask empty?
> 
> The other method is via CONFIG_NO_HZ_FULL_ALL:
> 
>         cpumask_setall(tick_nohz_full_mask);
> 
> here it's fully set - triggering the bug I'm worried about. So what am I missing, 
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Legitimate worry and I should have explained that in the changelog.

Like Paul replied, we make sure that at least the boot CPU is excluded
from tick_nohz_full_mask in tick_nohz_init(). Then housekeeping_mask,
by reverse effect, contains that boot CPU at least.

And we also make sure that the boot CPU can't get offline
(tick_nohz_cpu_down_callback()).

Now we should really document and check that assumption so here is a
second patch below. The sched patch depends on tip:sched/core (to avoid
conflicts with sched changes) and the following one is based on
tip:timer/nohz but should be applicable to sched/core without conflict.
Both are standalone anyway.

Thanks!

---
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sun, 23 Aug 2015 19:34:31 +0200
Subject: [PATCH] nohz: Assert existing housekeepers when nohz full enabled

The code ensures that at least the boot CPU serves as a housekeeper.

Let's assert this assumption to make sure that we have CPUs to handle
unbound jobs like workqueues and timers while nohz full CPUs run
undisturbed.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/time/tick-sched.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..cc9884f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -370,6 +370,12 @@ void __init tick_nohz_init(void)
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
+
+	/*
+	 * We need at least one CPU to handle housekeeping work such
+	 * as timekeeping, unbound timers, workqueues, ...
+	 */
+	WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-23 16:01   ` Paul E. McKenney
  2015-08-24  1:28     ` Frederic Weisbecker
@ 2015-08-24  6:44     ` Ingo Molnar
  2015-08-24  7:23       ` Mike Galbraith
                         ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Ingo Molnar @ 2015-08-24  6:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > here it's fully set - triggering the bug I'm worried about. So what am I 
> > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> 
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is 
> called from tick_init() which is called from start_kernel() shortly after 
> rcu_init():
> 
> 	cpu = smp_processor_id();
> 
> 	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> 		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> 	}
> 
> This happens after the call to tick_nohz_init_all() that does the 
> cpumask_setall() that you called out above.

Ah, indeed - I somehow missed that.

This brings up two other questions:

1)

the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
it (housekeeping_mask - a variable mask to begin with) and recover the 
housekeeping CPU via:

+       return cpumask_any_and(housekeeping_mask, cpu_online_mask);

which can be pretty expensive, and which gets executed in two hotpaths:

kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());

... why not just use a single housekeeping_cpu which would be way faster to pass 
down to the timer code?

2)

What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)

I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
is offlined.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  6:44     ` Ingo Molnar
@ 2015-08-24  7:23       ` Mike Galbraith
  2015-08-24  7:41         ` Ingo Molnar
  2015-08-24 13:36       ` Frederic Weisbecker
  2015-08-24 13:50       ` Paul E. McKenney
  2 siblings, 1 reply; 21+ messages in thread
From: Mike Galbraith @ 2015-08-24  7:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Frederic Weisbecker, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:

> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> it (housekeeping_mask - a variable mask to begin with) and recover the 
> housekeeping CPU via:
> 
> +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);

There can be, and had better be if box is big, multiple housekeepers.

Imagine a NO_HZ_FULL_ALL kernel on an SGI beast from hell.  Offloading
8191 CPUs onto poor little CPU0 probably wouldn't work out well :)

	-Mike


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  7:23       ` Mike Galbraith
@ 2015-08-24  7:41         ` Ingo Molnar
  2015-08-24  7:54           ` Mike Galbraith
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2015-08-24  7:41 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Paul E. McKenney, Frederic Weisbecker, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter


* Mike Galbraith <umgwanakikbuti@gmail.com> wrote:

> On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> 
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > housekeeping CPU via:
> > 
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> 
> There can be, and had better be if box is big, multiple housekeepers.

Yes - but that does not seem to be possible via the code right now AFAICS, so at 
minimum it's incomplete.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  7:41         ` Ingo Molnar
@ 2015-08-24  7:54           ` Mike Galbraith
  2015-08-24  8:00             ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Mike Galbraith @ 2015-08-24  7:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Frederic Weisbecker, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> * Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
> 
> > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > 
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > housekeeping CPU via:
> > > 
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > 
> > There can be, and had better be if box is big, multiple housekeepers.
> 
> Yes - but that does not seem to be possible via the code right now AFAICS, so at 
> minimum it's incomplete.

In master housekeepers are set up in tick_nohz_init().  Everybody who's
not a nohz_full CPU is a housekeeper.

	-Mike


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  7:54           ` Mike Galbraith
@ 2015-08-24  8:00             ` Ingo Molnar
  0 siblings, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2015-08-24  8:00 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Paul E. McKenney, Frederic Weisbecker, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter


* Mike Galbraith <umgwanakikbuti@gmail.com> wrote:

> On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> > * Mike Galbraith <umgwanakikbuti@gmail.com> wrote:
> > 
> > > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > > 
> > > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> > > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > > housekeeping CPU via:
> > > > 
> > > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > > 
> > > There can be, and had better be if box is big, multiple housekeepers.
> > 
> > Yes - but that does not seem to be possible via the code right now AFAICS, so 
> > at minimum it's incomplete.
> 
> In master housekeepers are set up in tick_nohz_init().  Everybody who's not a 
> nohz_full CPU is a housekeeper.

Ah, indeed, I missed the nohz_full= boot option:

static int __init tick_nohz_full_setup(char *str)
{
        alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
        if (cpulist_parse(str, tick_nohz_full_mask) < 0) {


Ok, first question is resolved - but the second one, behavior on boot CPU unplug 
still holds.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  6:44     ` Ingo Molnar
  2015-08-24  7:23       ` Mike Galbraith
@ 2015-08-24 13:36       ` Frederic Weisbecker
  2015-08-24 14:01         ` Mike Galbraith
  2015-08-25  8:29         ` Ingo Molnar
  2015-08-24 13:50       ` Paul E. McKenney
  2 siblings, 2 replies; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-24 13:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 2)
> 
> What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> 
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
is the boot CPU in nohz full, never gets offlined.

> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24  6:44     ` Ingo Molnar
  2015-08-24  7:23       ` Mike Galbraith
  2015-08-24 13:36       ` Frederic Weisbecker
@ 2015-08-24 13:50       ` Paul E. McKenney
  2015-08-24 14:04         ` Frederic Weisbecker
  2 siblings, 1 reply; 21+ messages in thread
From: Paul E. McKenney @ 2015-08-24 13:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > here it's fully set - triggering the bug I'm worried about. So what am I 
> > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > 
> > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is 
> > called from tick_init() which is called from start_kernel() shortly after 
> > rcu_init():
> > 
> > 	cpu = smp_processor_id();
> > 
> > 	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > 		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > 	}
> > 
> > This happens after the call to tick_nohz_init_all() that does the 
> > cpumask_setall() that you called out above.
> 
> Ah, indeed - I somehow missed that.
> 
> This brings up two other questions:
> 
> 1)
> 
> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> it (housekeeping_mask - a variable mask to begin with) and recover the 
> housekeeping CPU via:
> 
> +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> 
> which can be pretty expensive, and which gets executed in two hotpaths:
> 
> kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> 
> ... why not just use a single housekeeping_cpu which would be way faster to pass 
> down to the timer code?

The housekeeping_cpu came later, but that does seem like a good optimization.

> 2)
> 
> What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> 
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> is offlined.

The tick_nohz_cpu_down_callback() function does this, though in a less
than obvious way.  The tick_do_timer_cpu variable is the housekeeping
CPU that is currently handling timing, and it is not permitted to go
offline.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24 13:36       ` Frederic Weisbecker
@ 2015-08-24 14:01         ` Mike Galbraith
  2015-08-25  8:29         ` Ingo Molnar
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2015-08-24 14:01 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Paul E. McKenney, Peter Zijlstra, LKML,
	Vatika Harlalka, Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Mon, 2015-08-24 at 15:36 +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > is offlined.
> 
> We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> is the boot CPU in nohz full, never gets offlined.

Aha.. and it works fine too.  (CPU0 told me to go away earlier;)

	-Mike


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24 13:50       ` Paul E. McKenney
@ 2015-08-24 14:04         ` Frederic Weisbecker
  2015-08-24 15:45           ` Paul E. McKenney
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-24 14:04 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter

On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > > here it's fully set - triggering the bug I'm worried about. So what am I 
> > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > > 
> > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is 
> > > called from tick_init() which is called from start_kernel() shortly after 
> > > rcu_init():
> > > 
> > > 	cpu = smp_processor_id();
> > > 
> > > 	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > > 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > > 		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > > 	}
> > > 
> > > This happens after the call to tick_nohz_init_all() that does the 
> > > cpumask_setall() that you called out above.
> > 
> > Ah, indeed - I somehow missed that.
> > 
> > This brings up two other questions:
> > 
> > 1)
> > 
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > housekeeping CPU via:
> > 
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > 
> > which can be pretty expensive, and which gets executed in two hotpaths:
> > 
> > kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> > kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> > 
> > ... why not just use a single housekeeping_cpu which would be way faster to pass 
> > down to the timer code?
> 
> The housekeeping_cpu came later, but that does seem like a good optimization.

Well nohz full is likely to be used for HPC and that can involve big machines.
Having the housekeeping duty spread per node is a likely future evolution there,
if it isn't already used that way.

So we need to keep it a cpumask.

> 
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > is offlined.
> 
> The tick_nohz_cpu_down_callback() function does this, though in a less
> than obvious way.  The tick_do_timer_cpu variable is the housekeeping
> CPU that is currently handling timing, and it is not permitted to go
> offline.

Indeed, more specifically tick-common.c makes sure to set the timekeeping
duty to a housekeeper and that housekeeper is always the boot CPU due to
early device initialization.

But I should find a way to simplify that code and make it obvious it's always
set to the boot CPU.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24 14:04         ` Frederic Weisbecker
@ 2015-08-24 15:45           ` Paul E. McKenney
  0 siblings, 0 replies; 21+ messages in thread
From: Paul E. McKenney @ 2015-08-24 15:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Vatika Harlalka, Chris Metcalf,
	Thomas Gleixner, Preeti U Murthy, Christoph Lameter

On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > 
> > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > > here it's fully set - triggering the bug I'm worried about. So what am I 
> > > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > > > 
> > > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is 
> > > > called from tick_init() which is called from start_kernel() shortly after 
> > > > rcu_init():
> > > > 
> > > > 	cpu = smp_processor_id();
> > > > 
> > > > 	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > > > 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > > > 		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > > > 	}
> > > > 
> > > > This happens after the call to tick_nohz_init_all() that does the 
> > > > cpumask_setall() that you called out above.
> > > 
> > > Ah, indeed - I somehow missed that.
> > > 
> > > This brings up two other questions:
> > > 
> > > 1)
> > > 
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to 
> > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > housekeeping CPU via:
> > > 
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > > 
> > > which can be pretty expensive, and which gets executed in two hotpaths:
> > > 
> > > kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> > > kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> > > 
> > > ... why not just use a single housekeeping_cpu which would be way faster to pass 
> > > down to the timer code?
> > 
> > The housekeeping_cpu came later, but that does seem like a good optimization.
> 
> Well nohz full is likely to be used for HPC and that can involve big machines.
> Having the housekeeping duty spread per node is a likely future evolution there,
> if it isn't already used that way.
> 
> So we need to keep it a cpumask.

Fair point!

							Thanx, Paul

> > > 2)
> > > 
> > > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > 
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > > is offlined.
> > 
> > The tick_nohz_cpu_down_callback() function does this, though in a less
> > than obvious way.  The tick_do_timer_cpu variable is the housekeeping
> > CPU that is currently handling timing, and it is not permitted to go
> > offline.
> 
> Indeed, more specifically tick-common.c makes sure to set the timekeeping
> duty to a housekeeper and that housekeeper is always the boot CPU due to
> early device initialization.
> 
> But I should find a way to simplify that code and make it obvious it's always
> set to the boot CPU.
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-24 13:36       ` Frederic Weisbecker
  2015-08-24 14:01         ` Mike Galbraith
@ 2015-08-25  8:29         ` Ingo Molnar
  2015-08-25 13:45           ` Frederic Weisbecker
  1 sibling, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2015-08-25  8:29 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > is offlined.
> 
> We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> is the boot CPU in nohz full, never gets offlined.

That solution really sucks - it essentially regresses a feature the user 
explicitly asked for! I also see no way for the user to migrate the timekeeping 
functionality over to another CPU without rebooting.

If this is the last timekeeping CPU then it should migrate the timekeeping 
functionality to another CPU, and perhaps printk a warning if all other CPUs are 
nohz-full and we have to mark one of them as the timekeeper.

Also, the nohz-full and timekeeper functionality should not be a boot parameter 
only thing, but should be runtime configurable.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-25  8:29         ` Ingo Molnar
@ 2015-08-25 13:45           ` Frederic Weisbecker
  2015-08-28  8:32             ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-25 13:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > 2)
> > > 
> > > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > 
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > > is offlined.
> > 
> > We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> > is the boot CPU in nohz full, never gets offlined.
> 
> That solution really sucks - it essentially regresses a feature the user 
> explicitly asked for! I also see no way for the user to migrate the timekeeping 
> functionality over to another CPU without rebooting.
> 
> If this is the last timekeeping CPU then it should migrate the timekeeping 
> functionality to another CPU, and perhaps printk a warning if all other CPUs are 
> nohz-full and we have to mark one of them as the timekeeper.
> 
> Also, the nohz-full and timekeeper functionality should not be a boot parameter 
> only thing, but should be runtime configurable.

When I tried to allow moving the timekeeping duty over all housekeeping CPUs, Thomas got
angry because it broke the KISS current nohz full code. Indeed, there must be at least
one running all the time on behalf of nohz full CPUs that can run anytime. Thus balancing
the timekeeping duty over housekeepers is a bit more complicated than in normal
configurations.

Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper if
any remains or to a nohz full one. Then we must make sure the new timekeeper never
goes to idle.

But nohz_full is a corner usecase and I'm not sure it's worth the complexity. If a
nohz full user came and complained about CPU0 hotplog not working, I would definetly
retry it but I haven't heard about that yet. Besides, hotplug is very isolation-unfriendly
in general due to stop machine.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-25 13:45           ` Frederic Weisbecker
@ 2015-08-28  8:32             ` Ingo Molnar
  2015-08-28 12:30               ` Frederic Weisbecker
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2015-08-28  8:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> > 
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > 2)
> > > > 
> > > > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > > 
> > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > > > is offlined.
> > > 
> > > We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> > > is the boot CPU in nohz full, never gets offlined.
> > 
> > That solution really sucks - it essentially regresses a feature the user 
> > explicitly asked for! I also see no way for the user to migrate the timekeeping 
> > functionality over to another CPU without rebooting.
> > 
> > If this is the last timekeeping CPU then it should migrate the timekeeping 
> > functionality to another CPU, and perhaps printk a warning if all other CPUs are 
> > nohz-full and we have to mark one of them as the timekeeper.
> > 
> > Also, the nohz-full and timekeeper functionality should not be a boot parameter 
> > only thing, but should be runtime configurable.
> 
> When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
> Thomas got angry because it broke the KISS current nohz full code. Indeed, there 
> must be at least one running all the time on behalf of nohz full CPUs that can 
> run anytime. Thus balancing the timekeeping duty over housekeepers is a bit more 
> complicated than in normal configurations.
> 
> Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper if 
> any remains or to a nohz full one. Then we must make sure the new timekeeper 
> never goes to idle.
> 
> But nohz_full is a corner usecase and I'm not sure it's worth the complexity. If 
> a nohz full user came and complained about CPU0 hotplog not working, I would 
> definetly retry it but I haven't heard about that yet. Besides, hotplug is very 
> isolation-unfriendly in general due to stop machine.

Ok, I guess we can live with this.

Mind sending an updated series with all patches?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
  2015-08-28  8:32             ` Ingo Molnar
@ 2015-08-28 12:30               ` Frederic Weisbecker
  0 siblings, 0 replies; 21+ messages in thread
From: Frederic Weisbecker @ 2015-08-28 12:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul E. McKenney, Peter Zijlstra, LKML, Vatika Harlalka,
	Chris Metcalf, Thomas Gleixner, Preeti U Murthy,
	Christoph Lameter

On Fri, Aug 28, 2015 at 10:32:59AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> > > 
> > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > 
> > > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > > 2)
> > > > > 
> > > > > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > > > 
> > > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU 
> > > > > is offlined.
> > > > 
> > > > We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> > > > is the boot CPU in nohz full, never gets offlined.
> > > 
> > > That solution really sucks - it essentially regresses a feature the user 
> > > explicitly asked for! I also see no way for the user to migrate the timekeeping 
> > > functionality over to another CPU without rebooting.
> > > 
> > > If this is the last timekeeping CPU then it should migrate the timekeeping 
> > > functionality to another CPU, and perhaps printk a warning if all other CPUs are 
> > > nohz-full and we have to mark one of them as the timekeeper.
> > > 
> > > Also, the nohz-full and timekeeper functionality should not be a boot parameter 
> > > only thing, but should be runtime configurable.
> > 
> > When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
> > Thomas got angry because it broke the KISS current nohz full code. Indeed, there 
> > must be at least one running all the time on behalf of nohz full CPUs that can 
> > run anytime. Thus balancing the timekeeping duty over housekeepers is a bit more 
> > complicated than in normal configurations.
> > 
> > Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper if 
> > any remains or to a nohz full one. Then we must make sure the new timekeeper 
> > never goes to idle.
> > 
> > But nohz_full is a corner usecase and I'm not sure it's worth the complexity. If 
> > a nohz full user came and complained about CPU0 hotplog not working, I would 
> > definetly retry it but I haven't heard about that yet. Besides, hotplug is very 
> > isolation-unfriendly in general due to stop machine.
> 
> Ok, I guess we can live with this.

Now this will likely evolve in the future, I can easily imagine that timekeeping becomes
balanced among housekeepers when we'll have one per node. It's not yet the priority but
we may come to that one day.

> 
> Mind sending an updated series with all patches?

Sure, I'm cooking that.

Thanks!

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-08-28 12:31 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-14  1:46 [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers Frederic Weisbecker
2015-08-22 21:09 ` Ping: " Frederic Weisbecker
2015-08-23  1:22   ` Christoph Lameter
2015-08-23  5:40 ` Ingo Molnar
2015-08-23 16:01   ` Paul E. McKenney
2015-08-24  1:28     ` Frederic Weisbecker
2015-08-24  6:44     ` Ingo Molnar
2015-08-24  7:23       ` Mike Galbraith
2015-08-24  7:41         ` Ingo Molnar
2015-08-24  7:54           ` Mike Galbraith
2015-08-24  8:00             ` Ingo Molnar
2015-08-24 13:36       ` Frederic Weisbecker
2015-08-24 14:01         ` Mike Galbraith
2015-08-25  8:29         ` Ingo Molnar
2015-08-25 13:45           ` Frederic Weisbecker
2015-08-28  8:32             ` Ingo Molnar
2015-08-28 12:30               ` Frederic Weisbecker
2015-08-24 13:50       ` Paul E. McKenney
2015-08-24 14:04         ` Frederic Weisbecker
2015-08-24 15:45           ` Paul E. McKenney
2015-08-24  1:45   ` Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox