public inbox for linux-kernel@vger.kernel.org
* [PATCH v2] sched: Fast idling of CPU when system is partially loaded
@ 2014-06-16 19:48 Tim Chen
  2014-06-23  4:33 ` Jason Low
  2014-06-23 12:52 ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Tim Chen @ 2014-06-16 19:48 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Andi Kleen, Michel Lespinasse, Rik van Riel, Peter Hurley,
	Jason Low, Davidlohr Bueso, linux-kernel

Thanks to the reviews from Jason and Peter.  I've moved the check
of whether load balancing is required into fair.c's idle_balance.

When a system is lightly loaded (i.e. no more than 1 job per cpu),
attempting to pull a job to a cpu before putting it to idle is
unnecessary and can be skipped.  This patch adds an indicator so the
scheduler can know when no CPU in the system has more than 1 active
job, and skip the needless job pulls.

On a 4 socket machine running a request/response kind of workload from
clients, we saw about 0.13 msec of delay when going through a full load
balance to try to pull a job from all the other cpus.  Since only 0.1
msec was spent processing the request and generating a response, the
0.13 msec load balance overhead exceeded the actual work being done.
This overhead can be skipped much of the time on lightly loaded systems.

We tested this patch with a netperf request/response workload that kept
the server busy on half the cpus of a 4 socket system, and found that it
eliminated 75% of the load balance attempts before idling a cpu.

The overhead of setting/clearing the indicator is low, as we already
gather the necessary info when we call add_nr_running and
update_sd_lb_stats.  We switch back to full load balancing immediately
if any cpu gets more than one job on its run queue in add_nr_running.
We clear the indicator, to avoid load balancing, when we detect that no
cpu has more than one job while scanning the run queues in
update_sg_lb_stats.  In other words, we are aggressive in maintaining
the load balance and opportunistic in skipping it.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/fair.c  | 24 +++++++++++++++++++++---
 kernel/sched/sched.h | 10 ++++++++--
 2 files changed, 29 insertions(+), 5 deletions(-)

Change log:
v2:
1. Move the skip load balance code to idle_balance.
2. Use env->dst_rq->rd to get the root domain directly.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9855e87..95bb541 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5863,7 +5863,8 @@ static inline int sg_capacity(struct lb_env *env, struct sched_group *group)
  */
 static inline void update_sg_lb_stats(struct lb_env *env,
 			struct sched_group *group, int load_idx,
-			int local_group, struct sg_lb_stats *sgs)
+			int local_group, struct sg_lb_stats *sgs,
+			bool *overload)
 {
 	unsigned long load;
 	int i;
@@ -5881,6 +5882,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
 		sgs->group_load += load;
 		sgs->sum_nr_running += rq->nr_running;
+		if (overload && rq->nr_running > 1)
+			*overload = true;
 #ifdef CONFIG_NUMA_BALANCING
 		sgs->nr_numa_running += rq->nr_numa_running;
 		sgs->nr_preferred_running += rq->nr_preferred_running;
@@ -5991,6 +5994,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	struct sched_group *sg = env->sd->groups;
 	struct sg_lb_stats tmp_sgs;
 	int load_idx, prefer_sibling = 0;
+	bool overload = false;
 
 	if (child && child->flags & SD_PREFER_SIBLING)
 		prefer_sibling = 1;
@@ -6011,7 +6015,13 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 				update_group_power(env->sd, env->dst_cpu);
 		}
 
-		update_sg_lb_stats(env, sg, load_idx, local_group, sgs);
+		if (env->sd->parent)
+			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
+						NULL);
+		else
+			/* gather overload info if we are at root domain */
+			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
+						&overload);
 
 		if (local_group)
 			goto next_group;
@@ -6045,6 +6055,13 @@ next_group:
 
 	if (env->sd->flags & SD_NUMA)
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
+
+	if (!env->sd->parent) {
+		/* update overload indicator if we are at root domain */
+		if (env->dst_rq->rd->overload != overload)
+			env->dst_rq->rd->overload = overload;
+	}
+
 }
 
 /**
@@ -6762,7 +6779,8 @@ static int idle_balance(struct rq *this_rq)
 	 */
 	this_rq->idle_stamp = rq_clock(this_rq);
 
-	if (this_rq->avg_idle < sysctl_sched_migration_cost) {
+	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
+	    !this_rq->rd->overload) {
 		rcu_read_lock();
 		sd = rcu_dereference_check_sched_domain(this_rq->sd);
 		if (sd)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e47679b..396bce0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -477,6 +477,9 @@ struct root_domain {
 	cpumask_var_t span;
 	cpumask_var_t online;
 
+	/* Indicate more than one runnable task for any CPU */
+	bool overload;
+
 	/*
 	 * The bit corresponding to a CPU gets set here if such CPU has more
 	 * than one runnable -deadline task (as it is below for RT tasks).
@@ -1212,15 +1215,18 @@ static inline void add_nr_running(struct rq *rq, unsigned count)
 
 	rq->nr_running = prev_nr + count;
 
-#ifdef CONFIG_NO_HZ_FULL
 	if (prev_nr < 2 && rq->nr_running >= 2) {
+		if (!rq->rd->overload)
+			rq->rd->overload = true;
+
+#ifdef CONFIG_NO_HZ_FULL
 		if (tick_nohz_full_cpu(rq->cpu)) {
 			/* Order rq->nr_running write against the IPI */
 			smp_wmb();
 			smp_send_reschedule(rq->cpu);
 		}
-       }
 #endif
+	}
 }
 
 static inline void sub_nr_running(struct rq *rq, unsigned count)
-- 
1.7.11.7
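The set/clear/check protocol described in the changelog above can be
sketched as a simplified user-space model (purely illustrative -- this
is not the kernel code; all names here are made up, with nr_running[]
standing in for the per-cpu runqueue lengths and rd_overload for the
root domain's indicator):

```c
#include <stdbool.h>

#define NR_CPUS 4

static int nr_running[NR_CPUS];
static bool rd_overload;

/* Set side (cf. add_nr_running): flag overload as soon as any cpu
 * goes from fewer than 2 to 2 or more runnable jobs. */
static void model_add_nr_running(int cpu, int count)
{
	int prev = nr_running[cpu];

	nr_running[cpu] = prev + count;
	if (prev < 2 && nr_running[cpu] >= 2 && !rd_overload)
		rd_overload = true;
}

/* Clear side (cf. update_sg_lb_stats/update_sd_lb_stats): only a
 * full scan of all runqueues that finds no overloaded cpu may clear
 * the indicator. */
static void model_update_lb_stats(void)
{
	bool overload = false;

	for (int i = 0; i < NR_CPUS; i++)
		if (nr_running[i] > 1)
			overload = true;
	if (rd_overload != overload)
		rd_overload = overload;
}

/* Check side (cf. the idle_balance hunk): skip the pull attempt
 * entirely while the indicator is clear. */
static bool model_worth_idle_balancing(void)
{
	return rd_overload;
}
```

Note the asymmetry the changelog describes: the indicator is set eagerly
from a single runqueue event, but cleared only by a full scan, so a
missed balancing opportunity can't result from a stale clear flag.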



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-16 19:48 [PATCH v2] sched: Fast idling of CPU when system is partially loaded Tim Chen
@ 2014-06-23  4:33 ` Jason Low
  2014-06-23 12:52 ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Jason Low @ 2014-06-23  4:33 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ingo Molnar, Peter Zijlstra, Andi Kleen, Michel Lespinasse,
	Rik van Riel, Peter Hurley, Davidlohr Bueso, linux-kernel

On Mon, 2014-06-16 at 12:48 -0700, Tim Chen wrote:

> Thanks to the reviews from Jason and Peter.  I've moved the check
> of whether load balancing is required into fair.c's idle_balance.
> 
> When a system is lightly loaded (i.e. no more than 1 job per cpu),
> attempting to pull a job to a cpu before putting it to idle is
> unnecessary and can be skipped.  This patch adds an indicator so the
> scheduler can know when no CPU in the system has more than 1 active
> job, and skip the needless job pulls.

> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>

Acked-by: Jason Low <jason.low2@hp.com>

This change would address one of the main issues I've also been seeing
on my test machines with idle_balance, where most of the
find_busiest_group overhead is wasted because there are no tasks
to move.



* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-16 19:48 [PATCH v2] sched: Fast idling of CPU when system is partially loaded Tim Chen
  2014-06-23  4:33 ` Jason Low
@ 2014-06-23 12:52 ` Peter Zijlstra
  2014-06-23 16:22   ` Andi Kleen
  2014-06-23 16:40   ` Tim Chen
  1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2014-06-23 12:52 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ingo Molnar, Andi Kleen, Michel Lespinasse, Rik van Riel,
	Peter Hurley, Jason Low, Davidlohr Bueso, linux-kernel

On Mon, Jun 16, 2014 at 12:48:47PM -0700, Tim Chen wrote:
> +++ b/kernel/sched/fair.c
> @@ -5863,7 +5863,8 @@ static inline int sg_capacity(struct lb_env *env, struct sched_group *group)
>   */
>  static inline void update_sg_lb_stats(struct lb_env *env,
>  			struct sched_group *group, int load_idx,
> -			int local_group, struct sg_lb_stats *sgs)
> +			int local_group, struct sg_lb_stats *sgs,
> +			bool *overload)
>  {
>  	unsigned long load;
>  	int i;
> @@ -5881,6 +5882,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  
>  		sgs->group_load += load;
>  		sgs->sum_nr_running += rq->nr_running;
> +		if (overload && rq->nr_running > 1)
> +			*overload = true;
>  #ifdef CONFIG_NUMA_BALANCING
>  		sgs->nr_numa_running += rq->nr_numa_running;
>  		sgs->nr_preferred_running += rq->nr_preferred_running;
> @@ -5991,6 +5994,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  	struct sched_group *sg = env->sd->groups;
>  	struct sg_lb_stats tmp_sgs;
>  	int load_idx, prefer_sibling = 0;
> +	bool overload = false;
>  
>  	if (child && child->flags & SD_PREFER_SIBLING)
>  		prefer_sibling = 1;
> @@ -6011,7 +6015,13 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  				update_group_power(env->sd, env->dst_cpu);
>  		}
>  
> -		update_sg_lb_stats(env, sg, load_idx, local_group, sgs);
> +		if (env->sd->parent)
> +			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> +						NULL);
> +		else
> +			/* gather overload info if we are at root domain */
> +			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> +						&overload);
>  
>  		if (local_group)
>  			goto next_group;
> @@ -6045,6 +6055,13 @@ next_group:
>  
>  	if (env->sd->flags & SD_NUMA)
>  		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
> +
> +	if (!env->sd->parent) {
> +		/* update overload indicator if we are at root domain */
> +		if (env->dst_rq->rd->overload != overload)
> +			env->dst_rq->rd->overload = overload;
> +	}
> +
>  }
>  
>  /**

So I don't get why we can't do the below; I think Jason tried to ask the
same...

Making that overload thing unconditional makes the code simpler and the
cost is about the same; it doesn't matter if we test the pointer or
->nr_running, which we've already loaded anyhow.

Also, with only having a single update_sg_lb_stats() callsite GCC can
more easily inline the lot.

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5886,7 +5886,7 @@ static inline void update_sg_lb_stats(st
 
 		sgs->group_load += load;
 		sgs->sum_nr_running += rq->nr_running;
-		if (overload && rq->nr_running > 1)
+		if (rq->nr_running > 1)
 			*overload = true;
 #ifdef CONFIG_NUMA_BALANCING
 		sgs->nr_numa_running += rq->nr_numa_running;
@@ -6019,13 +6019,7 @@ static inline void update_sd_lb_stats(st
 				update_group_capacity(env->sd, env->dst_cpu);
 		}
 
-		if (env->sd->parent)
-			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
-						NULL);
-		else
-			/* gather overload info if we are at root domain */
-			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
-						&overload);
+		update_sg_lb_stats(env, sg, load_idx, local_group, sgs, &overload);
 
 		if (local_group)
 			goto next_group;
@@ -6065,7 +6059,6 @@ static inline void update_sd_lb_stats(st
 		if (env->dst_rq->rd->overload != overload)
 			env->dst_rq->rd->overload = overload;
 	}
-
 }
 
 /**


* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 12:52 ` Peter Zijlstra
@ 2014-06-23 16:22   ` Andi Kleen
  2014-06-23 16:44     ` Jason Low
  2014-06-23 18:47     ` Peter Zijlstra
  2014-06-23 16:40   ` Tim Chen
  1 sibling, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2014-06-23 16:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tim Chen, Ingo Molnar, Andi Kleen, Michel Lespinasse,
	Rik van Riel, Peter Hurley, Jason Low, Davidlohr Bueso,
	linux-kernel

> So I don't get why we can't do the below; I think Jason tried to ask the
> same...

The important part for performance is to minimize the cache line transfers. Your
unconditional variant would cause more dirty cache lines than Tim's,
right?

-Andi
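The cache-line concern raised here can be made concrete with a small
user-space sketch (a model, not kernel code; the writes counter is
instrumentation invented for this illustration).  The root domain is
shared by every cpu, so each store to rd->overload forces the cache
line to be taken exclusive; the patch therefore tests before writing,
keeping the common unchanged case read-only:

```c
#include <stdbool.h>

struct model_root_domain {
	bool overload;
	int writes;	/* counts the stores that would dirty the line */
};

/* Compare first: a read of a shared line is far cheaper than taking
 * it exclusive for a store, so skip the store when nothing changed. */
static void model_set_overload(struct model_root_domain *rd, bool overload)
{
	if (rd->overload != overload) {
		rd->overload = overload;
		rd->writes++;
	}
}
```

This is exactly the `if (env->dst_rq->rd->overload != overload)` guard
in the patch; the open question in the thread is only whether the
per-rq check feeding it needs to be conditional as well.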


* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 12:52 ` Peter Zijlstra
  2014-06-23 16:22   ` Andi Kleen
@ 2014-06-23 16:40   ` Tim Chen
  2014-06-23 18:50     ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Tim Chen @ 2014-06-23 16:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Andi Kleen, Michel Lespinasse, Rik van Riel,
	Peter Hurley, Jason Low, Davidlohr Bueso, linux-kernel

On Mon, 2014-06-23 at 14:52 +0200, Peter Zijlstra wrote:
> On Mon, Jun 16, 2014 at 12:48:47PM -0700, Tim Chen wrote:
> > +++ b/kernel/sched/fair.c
> > @@ -5863,7 +5863,8 @@ static inline int sg_capacity(struct lb_env *env, struct sched_group *group)
> >   */
> >  static inline void update_sg_lb_stats(struct lb_env *env,
> >  			struct sched_group *group, int load_idx,
> > -			int local_group, struct sg_lb_stats *sgs)
> > +			int local_group, struct sg_lb_stats *sgs,
> > +			bool *overload)
> >  {
> >  	unsigned long load;
> >  	int i;
> > @@ -5881,6 +5882,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >  
> >  		sgs->group_load += load;
> >  		sgs->sum_nr_running += rq->nr_running;
> > +		if (overload && rq->nr_running > 1)
> > +			*overload = true;
> >  #ifdef CONFIG_NUMA_BALANCING
> >  		sgs->nr_numa_running += rq->nr_numa_running;
> >  		sgs->nr_preferred_running += rq->nr_preferred_running;
> > @@ -5991,6 +5994,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
> >  	struct sched_group *sg = env->sd->groups;
> >  	struct sg_lb_stats tmp_sgs;
> >  	int load_idx, prefer_sibling = 0;
> > +	bool overload = false;
> >  
> >  	if (child && child->flags & SD_PREFER_SIBLING)
> >  		prefer_sibling = 1;
> > @@ -6011,7 +6015,13 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
> >  				update_group_power(env->sd, env->dst_cpu);
> >  		}
> >  
> > -		update_sg_lb_stats(env, sg, load_idx, local_group, sgs);
> > +		if (env->sd->parent)
> > +			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> > +						NULL);
> > +		else
> > +			/* gather overload info if we are at root domain */
> > +			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> > +						&overload);
> >  
> >  		if (local_group)
> >  			goto next_group;
> > @@ -6045,6 +6055,13 @@ next_group:
> >  
> >  	if (env->sd->flags & SD_NUMA)
> >  		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
> > +
> > +	if (!env->sd->parent) {
> > +		/* update overload indicator if we are at root domain */
> > +		if (env->dst_rq->rd->overload != overload)
> > +			env->dst_rq->rd->overload = overload;
> > +	}
> > +
> >  }
> >  
> >  /**
> 
> So I don't get why we can't do the below; I think Jason tried to ask the
> same...
> 
> Making that overload thing unconditional makes the code simpler and the
> cost is about the same; it doesn't matter if we test the pointer or
> ->nr_running, which we've already loaded anyhow.
> 
> Also, with only having a single update_sg_lb_stats() callsite GCC can
> more easily inline the lot.
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5886,7 +5886,7 @@ static inline void update_sg_lb_stats(st
>  
>  		sgs->group_load += load;
>  		sgs->sum_nr_running += rq->nr_running;
> -		if (overload && rq->nr_running > 1)
> +		if (rq->nr_running > 1)
>  			*overload = true;
>  #ifdef CONFIG_NUMA_BALANCING
>  		sgs->nr_numa_running += rq->nr_numa_running;
> @@ -6019,13 +6019,7 @@ static inline void update_sd_lb_stats(st
>  				update_group_capacity(env->sd, env->dst_cpu);
>  		}
>  
> -		if (env->sd->parent)
> -			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> -						NULL);
> -		else
> -			/* gather overload info if we are at root domain */
> -			update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
> -						&overload);
> +		update_sg_lb_stats(env, sg, load_idx, local_group, sgs, &overload);

With this change, we'll be computing the overload indicator even for
non-root domains, where we don't use it.  That is extra work in
update_sg_lb_stats as it loops through each rq checking nr_running to
update the indicator.  I was hoping to avoid that if possible.

Tim




* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 16:22   ` Andi Kleen
@ 2014-06-23 16:44     ` Jason Low
  2014-06-23 17:01       ` Tim Chen
  2014-06-23 18:47     ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Jason Low @ 2014-06-23 16:44 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Tim Chen, Ingo Molnar, Michel Lespinasse,
	Rik van Riel, Peter Hurley, Davidlohr Bueso, linux-kernel

On Mon, 2014-06-23 at 18:22 +0200, Andi Kleen wrote:
> > So I don't get why we can't do the below; I think Jason tried to ask the
> > same...
> 
> The important part for performance is to minimize the cache line transfers. Your
> unconditional variant would cause more dirty cache lines than Tim's,
> right?

How about that change to the update_sg_lb_stats() call site along with:

                sgs->group_load += load;
                sgs->sum_nr_running += rq->nr_running;
-               if (overload && rq->nr_running > 1)
+               if (!env->sd->parent && rq->nr_running > 1)
                        *overload = true;

which keeps the conditional, avoiding unnecessarily setting overload
when it's not used, while still getting the benefits Peter mentioned.



* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 16:44     ` Jason Low
@ 2014-06-23 17:01       ` Tim Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Tim Chen @ 2014-06-23 17:01 UTC (permalink / raw)
  To: Jason Low
  Cc: Andi Kleen, Peter Zijlstra, Ingo Molnar, Michel Lespinasse,
	Rik van Riel, Peter Hurley, Davidlohr Bueso, linux-kernel

On Mon, 2014-06-23 at 09:44 -0700, Jason Low wrote:
> On Mon, 2014-06-23 at 18:22 +0200, Andi Kleen wrote:
> > > So I don't get why we can't do the below; I think Jason tried to ask the
> > > same...
> > 
> > The important part for performance is to minimize the cache line transfers. Your
> > unconditional variant would cause more dirty cache lines than Tim's,
> > right?
> 
> How about that change to the update_sg_lb_stats() call site along with:
> 
>                 sgs->group_load += load;
>                 sgs->sum_nr_running += rq->nr_running;
> -               if (overload && rq->nr_running > 1)
> +               if (!env->sd->parent && rq->nr_running > 1)
>                         *overload = true;
> 
> which keeps it as a conditional to avoid unnecessarily setting overload
> when it's not used, and still get those benefits Peter mentioned.

I think this change will satisfy both needs.  I'll re-spin a v3 patch
with this modification if there are no objections.

Tim



* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 16:22   ` Andi Kleen
  2014-06-23 16:44     ` Jason Low
@ 2014-06-23 18:47     ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2014-06-23 18:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Tim Chen, Ingo Molnar, Michel Lespinasse, Rik van Riel,
	Peter Hurley, Jason Low, Davidlohr Bueso, linux-kernel

On Mon, Jun 23, 2014 at 06:22:34PM +0200, Andi Kleen wrote:
> > So I don't get why we can't do the below; I think Jason tried to ask the
> > same...
> 
> The important part for performance is to minimize the cache line transfers. Your
> unconditional variant would cause more dirty cache lines than Tim's,
> right?

Don't think so. We already have the overloaded thing on stack and dirty,
and the only thing it needs is rq->nr_running and we already
unconditionally load that.


* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 16:40   ` Tim Chen
@ 2014-06-23 18:50     ` Peter Zijlstra
  2014-06-23 18:59       ` Tim Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2014-06-23 18:50 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ingo Molnar, Andi Kleen, Michel Lespinasse, Rik van Riel,
	Peter Hurley, Jason Low, Davidlohr Bueso, linux-kernel

On Mon, Jun 23, 2014 at 09:40:45AM -0700, Tim Chen wrote:

> > @@ -5886,7 +5886,7 @@ static inline void update_sg_lb_stats(st
> >  
> >  		sgs->group_load += load;
> >  		sgs->sum_nr_running += rq->nr_running;
> > -		if (overload && rq->nr_running > 1)
> > +		if (rq->nr_running > 1)
> >  			*overload = true;
> >  #ifdef CONFIG_NUMA_BALANCING
> >  		sgs->nr_numa_running += rq->nr_numa_running;

> With this change, we'll be computing the overload indicator even
> for non-root domains, where we don't use it.  That is extra work in
> update_sg_lb_stats as it loops through each rq checking nr_running
> to update the indicator.  I was hoping to avoid that if possible.

What extra work? We already load nr_running and overloaded is on-stack
and should be quite dirty already due to that.



* Re: [PATCH v2] sched: Fast idling of CPU when system is partially loaded
  2014-06-23 18:50     ` Peter Zijlstra
@ 2014-06-23 18:59       ` Tim Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Tim Chen @ 2014-06-23 18:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Andi Kleen, Michel Lespinasse, Rik van Riel,
	Peter Hurley, Jason Low, Davidlohr Bueso, linux-kernel

On Mon, 2014-06-23 at 20:50 +0200, Peter Zijlstra wrote:
> On Mon, Jun 23, 2014 at 09:40:45AM -0700, Tim Chen wrote:
> 
> > > @@ -5886,7 +5886,7 @@ static inline void update_sg_lb_stats(st
> > >  
> > >  		sgs->group_load += load;
> > >  		sgs->sum_nr_running += rq->nr_running;
> > > -		if (overload && rq->nr_running > 1)
> > > +		if (rq->nr_running > 1)
> > >  			*overload = true;
> > >  #ifdef CONFIG_NUMA_BALANCING
> > >  		sgs->nr_numa_running += rq->nr_numa_running;
> 
> > With this change, we'll be returning the overload indicator 
> > that we don't use for non-root domains, which will be
> > extra work in sg_lb_stats as it loops through each rq checking
> > the nr_running to update the indicator.  I was hoping to avoid
> > that if possible.
> 
> What extra work? We already load nr_running and overloaded is on-stack
> and should be quite dirty already due to that.

Okay then.  I'll need to modify the v3 patch with the following bits:

-               /* only need to update overload indicator for root domain */
-               if (!env->sd->parent && rq->nr_running > 1)
+               if (rq->nr_running > 1)
                        *overload = true;

Just sent v3 patch out before your email reached me.

Tim



end of thread, other threads:[~2014-06-23 18:59 UTC | newest]

