public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
@ 2008-09-29 10:02 Amit K. Arora
  2008-09-29 16:09 ` Chris Friesen
  2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
  0 siblings, 2 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-29 10:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: vatsa, a.p.zijlstra, mingo, aarora

Hello,

Please consider this patch. It makes a few minor changes to
sched_fair.c.


sched: Minor optimizations in wake_affine and select_task_rq_fair

This patch does the following:
o Reduces the number of arguments to wake_affine().
o Removes the unused variable "rq".
o Optimizes one of the "if" conditions in wake_affine() - i.e. if
  "balanced" is true, we need not do the rest of the calculations in the
  condition.
o If this cpu is the same as the previous cpu (on which the woken-up task
  was running when it went to sleep), there is no need to call wake_affine
  at all.


Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched_fair.c |   29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1143,17 +1143,18 @@ static inline unsigned long effective_lo
 #endif
 
 static int
-wake_affine(struct rq *rq, struct sched_domain *this_sd, struct rq *this_rq,
-	    struct task_struct *p, int prev_cpu, int this_cpu, int sync,
-	    int idx, unsigned long load, unsigned long this_load,
-	    unsigned int imbalance)
+wake_affine(struct sched_domain *this_sd, struct task_struct *p, int prev_cpu,
+	    int this_cpu, int sync, unsigned long load, unsigned long this_load)
 {
+	struct rq *this_rq = cpu_rq(this_cpu);
 	struct task_struct *curr = this_rq->curr;
 	struct task_group *tg;
 	unsigned long tl = this_load;
 	unsigned long tl_per_task;
 	unsigned long weight;
 	int balanced;
+	int idx = this_sd->wake_idx;
+	unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
 
 	if (!(this_sd->flags & SD_WAKE_AFFINE) || !sched_feat(AFFINE_WAKEUPS))
 		return 0;
@@ -1191,8 +1192,8 @@ wake_affine(struct rq *rq, struct sched_
 	schedstat_inc(p, se.nr_wakeups_affine_attempts);
 	tl_per_task = cpu_avg_load_per_task(this_cpu);
 
-	if ((tl <= load && tl + target_load(prev_cpu, idx) <= tl_per_task) ||
-			balanced) {
+	if (balanced || (tl <= load && tl + target_load(prev_cpu, idx) <=
+			tl_per_task)) {
 		/*
 		 * This domain has SD_WAKE_AFFINE and
 		 * p is cache cold in this domain, and
@@ -1211,16 +1212,16 @@ static int select_task_rq_fair(struct ta
 	struct sched_domain *sd, *this_sd = NULL;
 	int prev_cpu, this_cpu, new_cpu;
 	unsigned long load, this_load;
-	struct rq *rq, *this_rq;
 	unsigned int imbalance;
 	int idx;
 
 	prev_cpu	= task_cpu(p);
-	rq		= task_rq(p);
 	this_cpu	= smp_processor_id();
-	this_rq		= cpu_rq(this_cpu);
 	new_cpu		= prev_cpu;
 
+	if (prev_cpu == this_cpu)
+		goto out;
+
 	/*
 	 * 'this_sd' is the first domain that both
 	 * this_cpu and prev_cpu are present in:
@@ -1242,24 +1243,18 @@ static int select_task_rq_fair(struct ta
 		goto out;
 
 	idx = this_sd->wake_idx;
-
-	imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
-
 	load = source_load(prev_cpu, idx);
 	this_load = target_load(this_cpu, idx);
 
-	if (wake_affine(rq, this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
-				     load, this_load, imbalance))
+	if (wake_affine(this_sd, p, prev_cpu, this_cpu, sync, load, this_load))
 		return this_cpu;
 
-	if (prev_cpu == this_cpu)
-		goto out;
-
 	/*
 	 * Start passive balancing when half the imbalance_pct
 	 * limit is reached.
 	 */
 	if (this_sd->flags & SD_WAKE_BALANCE) {
+		imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
 		if (imbalance*this_load <= 100*load) {
 			schedstat_inc(this_sd, ttwu_move_balance);
 			schedstat_inc(p, se.nr_wakeups_passive);


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-29 10:02 [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair Amit K. Arora
@ 2008-09-29 16:09 ` Chris Friesen
  2008-09-30  7:01   ` Ingo Molnar
  2008-09-30  7:03   ` Amit K. Arora
  2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
  1 sibling, 2 replies; 7+ messages in thread
From: Chris Friesen @ 2008-09-29 16:09 UTC (permalink / raw)
  To: Amit K. Arora; +Cc: linux-kernel, vatsa, a.p.zijlstra, mingo

Amit K. Arora wrote:
> Hello,
> 
> Please consider this patch. It makes a few minor changes to
> sched_fair.c.
> 
> 
> sched: Minor optimizations in wake_affine and select_task_rq_fair
> 
> This patch does the following:
> o Reduces the number of arguments to wake_affine().

At what point is it cheaper to pass items as args rather than 
recalculating them?  If reducing the number of args is desirable, what 
about removing the "this_cpu" and "prev_cpu" args and recalculating them 
in wake_affine()?

Chris


* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-29 16:09 ` Chris Friesen
@ 2008-09-30  7:01   ` Ingo Molnar
  2008-09-30 11:40     ` Amit K. Arora
  2008-09-30  7:03   ` Amit K. Arora
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-09-30  7:01 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Amit K. Arora, linux-kernel, vatsa, a.p.zijlstra


* Chris Friesen <cfriesen@nortel.com> wrote:

> Amit K. Arora wrote:
>> Hello,
>>
>> Please consider this patch. It makes a few minor changes to
>> sched_fair.c.
>>
>>
>> sched: Minor optimizations in wake_affine and select_task_rq_fair
>>
>> This patch does the following:
>> o Reduces the number of arguments to wake_affine().
>
> At what point is it cheaper to pass items as args rather than 
> recalculating them?  If reducing the number of args is desirable, what 
> about removing the "this_cpu" and "prev_cpu" args and recalculating 
> them in wake_affine()?

it's usually not worth it, especially if it leads to duplicated 
calculations (and code) like:

+       unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;

gcc will optimize it away because it's all static functions, but still.

'size kernel/sched.o' should be a good guideline: if the .o's text 
section gets smaller due to a patch it usually gets faster as well.

	Ingo


* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-29 16:09 ` Chris Friesen
  2008-09-30  7:01   ` Ingo Molnar
@ 2008-09-30  7:03   ` Amit K. Arora
  1 sibling, 0 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30  7:03 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-kernel, vatsa, a.p.zijlstra, mingo

On Mon, Sep 29, 2008 at 10:09:41AM -0600, Chris Friesen wrote:
> Amit K. Arora wrote:
>> sched: Minor optimizations in wake_affine and select_task_rq_fair
>>
>> This patch does the following:
>> o Reduces the number of arguments to wake_affine().
>
> At what point is it cheaper to pass items as args rather than recalculating 
> them?  If reducing the number of args is desirable, what about removing the 
> "this_cpu" and "prev_cpu" args and recalculating them in wake_affine()?

That's a good question. It's kind of arguable, and I wasn't sure
everyone would be happy if I removed more arguments from wake_affine()
than I did in my patch (because of the recalculations required).

wake_affine() currently has 11 arguments and I thought it made sense to
reduce that to a sane number. For that I chose arguments which I thought
could be recalculated with minimal overhead (a single struct
dereference, a simple per-cpu variable and/or simple arithmetic). And
one argument ("rq"), which is being removed, isn't used in the
function at all!

Regarding the two variables you mentioned, I didn't remove them as args
since I wasn't sure about "this_cpu" (which is nothing but
smp_processor_id()), as it is arch-dependent, and calculating "prev_cpu"
involves two struct dereferences (((struct thread_info *)(task)->stack)->cpu).
And the calculation of the other arguments (like this_sd, load and this_load)
involves a good number of instructions.

If you disagree, what do you suggest we do here?

Regards,
Amit Arora


* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-30  7:01   ` Ingo Molnar
@ 2008-09-30 11:40     ` Amit K. Arora
  0 siblings, 0 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30 11:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Chris Friesen, linux-kernel, vatsa, a.p.zijlstra

On Tue, Sep 30, 2008 at 09:01:58AM +0200, Ingo Molnar wrote:
> 
> * Chris Friesen <cfriesen@nortel.com> wrote:
> 
> > Amit K. Arora wrote:
> >> Hello,
> >>
> >> Please consider this patch. It makes a few minor changes to
> >> sched_fair.c.
> >>
> >>
> >> sched: Minor optimizations in wake_affine and select_task_rq_fair
> >>
> >> This patch does the following:
> >> o Reduces the number of arguments to wake_affine().
> >
> > At what point is it cheaper to pass items as args rather than 
> > recalculating them?  If reducing the number of args is desirable, what 
> > about removing the "this_cpu" and "prev_cpu" args and recalculating 
> > them in wake_affine()?
> 
> it's usually not worth it, especially if it leads to duplicated 
> calculations (and code) like:
> 
> +       unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
> 
> gcc will optimize it away because it's all static functions, but still.

Ok. I will resubmit the patch with only the other suggested changes. It
won't try to reduce wake_affine's arguments (besides the first argument,
"rq", which is not used at all).

Regards,
Amit Arora
> 
> 'size kernel/sched.o' should be a good guideline: if the .o's text 
> section gets smaller due to a patch it usually gets faster as well.
> 
> 	Ingo


* [PATCH][resubmit] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-29 10:02 [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair Amit K. Arora
  2008-09-29 16:09 ` Chris Friesen
@ 2008-09-30 11:45 ` Amit K. Arora
  2008-09-30 13:26   ` Ingo Molnar
  1 sibling, 1 reply; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30 11:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: vatsa, a.p.zijlstra, mingo, aarora

As mentioned in http://lkml.org/lkml/2008/9/30/141, this is the new
patch.


sched: Minor optimizations in wake_affine and select_task_rq_fair

This patch does the following:
o Removes the unused variable and argument "rq".
o Optimizes one of the "if" conditions in wake_affine() - i.e. if
  "balanced" is true, we need not do the rest of the calculations in the
  condition.
o If this cpu is the same as the previous cpu (on which the woken-up task
  was running when it went to sleep), there is no need to call wake_affine
  at all.

Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched_fair.c |   16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1143,7 +1143,7 @@ static inline unsigned long effective_lo
 #endif
 
 static int
-wake_affine(struct rq *rq, struct sched_domain *this_sd, struct rq *this_rq,
+wake_affine(struct sched_domain *this_sd, struct rq *this_rq,
 	    struct task_struct *p, int prev_cpu, int this_cpu, int sync,
 	    int idx, unsigned long load, unsigned long this_load,
 	    unsigned int imbalance)
@@ -1191,8 +1191,8 @@ wake_affine(struct rq *rq, struct sched_
 	schedstat_inc(p, se.nr_wakeups_affine_attempts);
 	tl_per_task = cpu_avg_load_per_task(this_cpu);
 
-	if ((tl <= load && tl + target_load(prev_cpu, idx) <= tl_per_task) ||
-			balanced) {
+	if (balanced || (tl <= load && tl + target_load(prev_cpu, idx) <=
+			tl_per_task)) {
 		/*
 		 * This domain has SD_WAKE_AFFINE and
 		 * p is cache cold in this domain, and
@@ -1211,16 +1211,17 @@ static int select_task_rq_fair(struct ta
 	struct sched_domain *sd, *this_sd = NULL;
 	int prev_cpu, this_cpu, new_cpu;
 	unsigned long load, this_load;
-	struct rq *rq, *this_rq;
+	struct rq *this_rq;
 	unsigned int imbalance;
 	int idx;
 
 	prev_cpu	= task_cpu(p);
-	rq		= task_rq(p);
 	this_cpu	= smp_processor_id();
 	this_rq		= cpu_rq(this_cpu);
 	new_cpu		= prev_cpu;
 
+	if (prev_cpu == this_cpu)
+		goto out;
 	/*
 	 * 'this_sd' is the first domain that both
 	 * this_cpu and prev_cpu are present in:
@@ -1248,13 +1249,10 @@ static int select_task_rq_fair(struct ta
 	load = source_load(prev_cpu, idx);
 	this_load = target_load(this_cpu, idx);
 
-	if (wake_affine(rq, this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
+	if (wake_affine(this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
 				     load, this_load, imbalance))
 		return this_cpu;
 
-	if (prev_cpu == this_cpu)
-		goto out;
-
 	/*
 	 * Start passive balancing when half the imbalance_pct
 	 * limit is reached.


* Re: [PATCH][resubmit] sched: minor optimizations in wake_affine and select_task_rq_fair
  2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
@ 2008-09-30 13:26   ` Ingo Molnar
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2008-09-30 13:26 UTC (permalink / raw)
  To: Amit K. Arora; +Cc: linux-kernel, vatsa, a.p.zijlstra


* Amit K. Arora <aarora@linux.vnet.ibm.com> wrote:

> As mentioned in http://lkml.org/lkml/2008/9/30/141 , this is the new
> patch.
> 
> sched: Minor optimizations in wake_affine and select_task_rq_fair
> 
> This patch does the following:
> o Removes the unused variable and argument "rq".
> o Optimizes one of the "if" conditions in wake_affine() - i.e. if
>   "balanced" is true, we need not do the rest of the calculations in the
>   condition.
> o If this cpu is the same as the previous cpu (on which the woken-up task
>   was running when it went to sleep), there is no need to call wake_affine
>   at all.
> 
> Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Ingo Molnar <mingo@elte.hu>

applied to tip/sched/devel, thanks Amit!

	Ingo

