* [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
@ 2008-09-29 10:02 Amit K. Arora
2008-09-29 16:09 ` Chris Friesen
2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
0 siblings, 2 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-29 10:02 UTC (permalink / raw)
To: linux-kernel; +Cc: vatsa, a.p.zijlstra, mingo, aarora
Hello,
Please consider this patch. It makes a few minor changes to
sched_fair.c.
sched: Minor optimizations in wake_affine and select_task_rq_fair
This patch does the following:
o Reduces the number of arguments to wake_affine().
o Removes the unused variable "rq".
o Optimizes one of the "if" conditions in wake_affine() - i.e. if
"balanced" is true, we need not do the rest of the calculations in the
condition.
o If this cpu is the same as the previous cpu (on which the woken-up task
was running when it went to sleep), there is no need to call wake_affine() at all.
Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Ingo Molnar <mingo@elte.hu>
---
kernel/sched_fair.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1143,17 +1143,18 @@ static inline unsigned long effective_lo
#endif
static int
-wake_affine(struct rq *rq, struct sched_domain *this_sd, struct rq *this_rq,
- struct task_struct *p, int prev_cpu, int this_cpu, int sync,
- int idx, unsigned long load, unsigned long this_load,
- unsigned int imbalance)
+wake_affine(struct sched_domain *this_sd, struct task_struct *p, int prev_cpu,
+ int this_cpu, int sync, unsigned long load, unsigned long this_load)
{
+ struct rq *this_rq = cpu_rq(this_cpu);
struct task_struct *curr = this_rq->curr;
struct task_group *tg;
unsigned long tl = this_load;
unsigned long tl_per_task;
unsigned long weight;
int balanced;
+ int idx = this_sd->wake_idx;
+ unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
if (!(this_sd->flags & SD_WAKE_AFFINE) || !sched_feat(AFFINE_WAKEUPS))
return 0;
@@ -1191,8 +1192,8 @@ wake_affine(struct rq *rq, struct sched_
schedstat_inc(p, se.nr_wakeups_affine_attempts);
tl_per_task = cpu_avg_load_per_task(this_cpu);
- if ((tl <= load && tl + target_load(prev_cpu, idx) <= tl_per_task) ||
- balanced) {
+ if (balanced || (tl <= load && tl + target_load(prev_cpu, idx) <=
+ tl_per_task)) {
/*
* This domain has SD_WAKE_AFFINE and
* p is cache cold in this domain, and
@@ -1211,16 +1212,16 @@ static int select_task_rq_fair(struct ta
struct sched_domain *sd, *this_sd = NULL;
int prev_cpu, this_cpu, new_cpu;
unsigned long load, this_load;
- struct rq *rq, *this_rq;
unsigned int imbalance;
int idx;
prev_cpu = task_cpu(p);
- rq = task_rq(p);
this_cpu = smp_processor_id();
- this_rq = cpu_rq(this_cpu);
new_cpu = prev_cpu;
+ if (prev_cpu == this_cpu)
+ goto out;
+
/*
* 'this_sd' is the first domain that both
* this_cpu and prev_cpu are present in:
@@ -1242,24 +1243,18 @@ static int select_task_rq_fair(struct ta
goto out;
idx = this_sd->wake_idx;
-
- imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
-
load = source_load(prev_cpu, idx);
this_load = target_load(this_cpu, idx);
- if (wake_affine(rq, this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
- load, this_load, imbalance))
+ if (wake_affine(this_sd, p, prev_cpu, this_cpu, sync, load, this_load))
return this_cpu;
- if (prev_cpu == this_cpu)
- goto out;
-
/*
* Start passive balancing when half the imbalance_pct
* limit is reached.
*/
if (this_sd->flags & SD_WAKE_BALANCE) {
+ imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
if (imbalance*this_load <= 100*load) {
schedstat_inc(this_sd, ttwu_move_balance);
schedstat_inc(p, se.nr_wakeups_passive);
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-29 10:02 [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair Amit K. Arora
@ 2008-09-29 16:09 ` Chris Friesen
2008-09-30 7:01 ` Ingo Molnar
2008-09-30 7:03 ` Amit K. Arora
2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
1 sibling, 2 replies; 7+ messages in thread
From: Chris Friesen @ 2008-09-29 16:09 UTC (permalink / raw)
To: Amit K. Arora; +Cc: linux-kernel, vatsa, a.p.zijlstra, mingo
Amit K. Arora wrote:
> Hello,
>
> Please consider this patch. It makes a few minor changes to
> sched_fair.c.
>
>
> sched: Minor optimizations in wake_affine and select_task_rq_fair
>
> This patch does following:
> o Reduces the number of arguments to wake_affine().
At what point is it cheaper to pass items as args rather than
recalculating them? If reducing the number of args is desirable, what
about removing the "this_cpu" and "prev_cpu" args and recalculating them
in wake_affine()?
Chris
* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-29 16:09 ` Chris Friesen
@ 2008-09-30 7:01 ` Ingo Molnar
2008-09-30 11:40 ` Amit K. Arora
2008-09-30 7:03 ` Amit K. Arora
1 sibling, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-09-30 7:01 UTC (permalink / raw)
To: Chris Friesen; +Cc: Amit K. Arora, linux-kernel, vatsa, a.p.zijlstra
* Chris Friesen <cfriesen@nortel.com> wrote:
> Amit K. Arora wrote:
>> Hello,
>>
>> Please consider this patch. It makes a few minor changes to
>> sched_fair.c.
>>
>>
>> sched: Minor optimizations in wake_affine and select_task_rq_fair
>>
>> This patch does following:
>> o Reduces the number of arguments to wake_affine().
>
> At what point is it cheaper to pass items as args rather than
> recalculating them? If reducing the number of args is desirable, what
> about removing the "this_cpu" and "prev_cpu" args and recalculating
> them in wake_affine()?
it's usually not worth it, especially if it leads to duplicated
calculations (and code) like:
+ unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
gcc will optimize it away because it's all static functions, but still.
'size kernel/sched.o' should be a good guideline: if the .o's text
section gets smaller due to a patch it usually gets faster as well.
Ingo
* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-29 16:09 ` Chris Friesen
2008-09-30 7:01 ` Ingo Molnar
@ 2008-09-30 7:03 ` Amit K. Arora
1 sibling, 0 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30 7:03 UTC (permalink / raw)
To: Chris Friesen; +Cc: linux-kernel, vatsa, a.p.zijlstra, mingo
On Mon, Sep 29, 2008 at 10:09:41AM -0600, Chris Friesen wrote:
> Amit K. Arora wrote:
>> sched: Minor optimizations in wake_affine and select_task_rq_fair
>>
>> This patch does following:
>> o Reduces the number of arguments to wake_affine().
>
> At what point is it cheaper to pass items as args rather than recalculating
> them? If reducing the number of args is desirable, what about removing the
> "this_cpu" and "prev_cpu" args and recalculating them in wake_affine()?
That's a good question. It's kind of arguable, and I wasn't sure whether
everyone would be happy if I removed more arguments from wake_affine() than
I did in my patch (because of the recalculations required).
wake_affine() currently has 11 arguments, and I thought it made sense
to reduce that to a sane number. For that I chose arguments which I
thought could be recalculated with minimum overhead (a single struct
dereference, a simple per-cpu variable and/or simple arithmetic). And
one argument ("rq"), which is being removed, isn't used in the
function at all!
Regarding the two variables you have mentioned, I didn't remove them as
args since I wasn't sure about "this_cpu" (which is nothing but
smp_processor_id()), as it is arch dependent, and calculating "prev_cpu"
involves two struct dereferences (((struct thread_info *)(task)->stack)->cpu).
And the calculation of the other arguments (like this_sd, load and this_load)
involves a good number of instructions.
If you disagree, what do you suggest we do here?
Regards,
Amit Arora
* Re: [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-30 7:01 ` Ingo Molnar
@ 2008-09-30 11:40 ` Amit K. Arora
0 siblings, 0 replies; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30 11:40 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Chris Friesen, linux-kernel, vatsa, a.p.zijlstra
On Tue, Sep 30, 2008 at 09:01:58AM +0200, Ingo Molnar wrote:
>
> * Chris Friesen <cfriesen@nortel.com> wrote:
>
> > Amit K. Arora wrote:
> >> Hello,
> >>
> >> Please consider this patch. It makes a few minor changes to
> >> sched_fair.c.
> >>
> >>
> >> sched: Minor optimizations in wake_affine and select_task_rq_fair
> >>
> >> This patch does following:
> >> o Reduces the number of arguments to wake_affine().
> >
> > At what point is it cheaper to pass items as args rather than
> > recalculating them? If reducing the number of args is desirable, what
> > about removing the "this_cpu" and "prev_cpu" args and recalculating
> > them in wake_affine()?
>
> it's usually not worth it, especially if it leads to duplicated
> calculations (and code) like:
>
> + unsigned int imbalance = 100 + (this_sd->imbalance_pct - 100) / 2;
>
> gcc will optimize it away because it's all static functions, but still.
Ok. I will resubmit the patch with only the other suggested changes. It
won't try to reduce wake_affine's arguments (besides the first argument,
"rq", which is not used at all).
Regards,
Amit Arora
>
> 'size kernel/sched.o' should be a good guideline: if the .o's text
> section gets smaller due to a patch it usually gets faster as well.
>
> Ingo
* [PATCH][resubmit] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-29 10:02 [PATCH] sched: minor optimizations in wake_affine and select_task_rq_fair Amit K. Arora
2008-09-29 16:09 ` Chris Friesen
@ 2008-09-30 11:45 ` Amit K. Arora
2008-09-30 13:26 ` Ingo Molnar
1 sibling, 1 reply; 7+ messages in thread
From: Amit K. Arora @ 2008-09-30 11:45 UTC (permalink / raw)
To: linux-kernel; +Cc: vatsa, a.p.zijlstra, mingo, aarora
As mentioned in http://lkml.org/lkml/2008/9/30/141 , this is the new
patch:
sched: Minor optimizations in wake_affine and select_task_rq_fair
This patch does the following:
o Removes the unused variable and argument "rq".
o Optimizes one of the "if" conditions in wake_affine() - i.e. if
"balanced" is true, we need not do the rest of the calculations in the
condition.
o If this cpu is the same as the previous cpu (on which the woken-up task
was running when it went to sleep), there is no need to call wake_affine() at all.
Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Ingo Molnar <mingo@elte.hu>
---
kernel/sched_fair.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1143,7 +1143,7 @@ static inline unsigned long effective_lo
#endif
static int
-wake_affine(struct rq *rq, struct sched_domain *this_sd, struct rq *this_rq,
+wake_affine(struct sched_domain *this_sd, struct rq *this_rq,
struct task_struct *p, int prev_cpu, int this_cpu, int sync,
int idx, unsigned long load, unsigned long this_load,
unsigned int imbalance)
@@ -1191,8 +1191,8 @@ wake_affine(struct rq *rq, struct sched_
schedstat_inc(p, se.nr_wakeups_affine_attempts);
tl_per_task = cpu_avg_load_per_task(this_cpu);
- if ((tl <= load && tl + target_load(prev_cpu, idx) <= tl_per_task) ||
- balanced) {
+ if (balanced || (tl <= load && tl + target_load(prev_cpu, idx) <=
+ tl_per_task)) {
/*
* This domain has SD_WAKE_AFFINE and
* p is cache cold in this domain, and
@@ -1211,16 +1211,17 @@ static int select_task_rq_fair(struct ta
struct sched_domain *sd, *this_sd = NULL;
int prev_cpu, this_cpu, new_cpu;
unsigned long load, this_load;
- struct rq *rq, *this_rq;
+ struct rq *this_rq;
unsigned int imbalance;
int idx;
prev_cpu = task_cpu(p);
- rq = task_rq(p);
this_cpu = smp_processor_id();
this_rq = cpu_rq(this_cpu);
new_cpu = prev_cpu;
+ if (prev_cpu == this_cpu)
+ goto out;
/*
* 'this_sd' is the first domain that both
* this_cpu and prev_cpu are present in:
@@ -1248,13 +1249,10 @@ static int select_task_rq_fair(struct ta
load = source_load(prev_cpu, idx);
this_load = target_load(this_cpu, idx);
- if (wake_affine(rq, this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
+ if (wake_affine(this_sd, this_rq, p, prev_cpu, this_cpu, sync, idx,
load, this_load, imbalance))
return this_cpu;
- if (prev_cpu == this_cpu)
- goto out;
-
/*
* Start passive balancing when half the imbalance_pct
* limit is reached.
* Re: [PATCH][resubmit] sched: minor optimizations in wake_affine and select_task_rq_fair
2008-09-30 11:45 ` [PATCH][resubmit] " Amit K. Arora
@ 2008-09-30 13:26 ` Ingo Molnar
0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2008-09-30 13:26 UTC (permalink / raw)
To: Amit K. Arora; +Cc: linux-kernel, vatsa, a.p.zijlstra
* Amit K. Arora <aarora@linux.vnet.ibm.com> wrote:
> As mentioned in http://lkml.org/lkml/2008/9/30/141 , this is the new
> patch.
>
> sched: Minor optimizations in wake_affine and select_task_rq_fair
>
> This patch does following:
> o Removes unused variable and argument "rq".
> o Optimizes one of the "if" conditions in wake_affine() - i.e. if
> "balanced" is true, we need not do rest of the calculations in the
> condition.
> o If this cpu is same as the previous cpu (on which woken up task
> was running when it went to sleep), no need to call wake_affine at all.
>
> Signed-off-by: Amit K Arora <aarora@linux.vnet.ibm.com>
> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Ingo Molnar <mingo@elte.hu>
applied to tip/sched/devel, thanks Amit!
Ingo