* [PATCH] sched: prevent compiler from optimising sched_avg_update loop
@ 2010-03-23 17:36 Will Deacon
2010-03-23 17:53 ` Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Will Deacon @ 2010-03-23 17:36 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Catalin Marinas, Ingo Molnar, Andrew Morton,
Peter Zijlstra
GCC 4.4.1 on ARM has been observed to replace the while loop
in sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:
kernel/built-in.o: In function `sched_avg_update':
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1
This patch [taken against 2.6.34-rc2] replaces the loop with a call to
div_s64, which allows the kernel to link.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
kernel/sched.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 9ab3cd7..6b74f21 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
 static void sched_avg_update(struct rq *rq)
 {
 	s64 period = sched_avg_period();
+	s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
 
-	while ((s64)(rq->clock - rq->age_stamp) > period) {
-		rq->age_stamp += period;
-		rq->rt_avg /= 2;
-	}
+	rq->age_stamp += (u64)(elapsed_periods * period);
+	rq->rt_avg >>= elapsed_periods;
 }
 
 static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
--
1.6.3.3
* Re: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
2010-03-23 17:36 [PATCH] sched: prevent compiler from optimising sched_avg_update loop Will Deacon
@ 2010-03-23 17:53 ` Eric Dumazet
[not found] ` <000101cacab3$25af90a0$710eb1e0$@deacon@arm.com>
2010-03-23 18:08 ` Peter Zijlstra
2010-03-23 19:05 ` Will Deacon
2 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2010-03-23 17:53 UTC (permalink / raw)
To: Will Deacon
Cc: linux-kernel, Catalin Marinas, Ingo Molnar, Andrew Morton,
Peter Zijlstra
On Tuesday, 23 March 2010 at 17:36 +0000, Will Deacon wrote:
> GCC 4.4.1 on ARM has been observed to replace the while loop
> in sched_avg_update with a call to uldivmod, resulting in the
> following build failure at link-time:
>
> kernel/built-in.o: In function `sched_avg_update':
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> make: *** [.tmp_vmlinux1] Error 1
>
> This patch [taken against 2.6.34-rc2] replaces the loop with a call to
> div_s64 which allows the Kernel to link.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
> kernel/sched.c | 7 +++----
> 1 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6b74f21 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
> static void sched_avg_update(struct rq *rq)
> {
> s64 period = sched_avg_period();
> + s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
>
> - while ((s64)(rq->clock - rq->age_stamp) > period) {
> - rq->age_stamp += period;
> - rq->rt_avg /= 2;
> - }
> + rq->age_stamp += (u64)(elapsed_periods * period);
> + rq->rt_avg >>= elapsed_periods;
> }
>
> static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
Please take a look at __iter_div_u64_rem(), because we had a similar
problem there in the past. We want to avoid this div_s64() call.
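
For readers unfamiliar with the helper mentioned above, here is a minimal
user-space sketch of the same technique: a subtract-in-a-loop division with
an empty inline asm statement that marks the dividend as possibly modified,
which is intended to stop the compiler from recognising the loop as a
division. It assumes GCC or Clang extended asm. The name iter_div_u64_rem,
the stdint types and the assert are illustrative assumptions, not the
kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space sketch modelled on the kernel's
 * __iter_div_u64_rem(). Only worthwhile when the quotient is
 * expected to be very small (typically 0 or 1).
 */
static inline uint32_t iter_div_u64_rem(uint64_t dividend, uint32_t divisor,
					uint64_t *remainder)
{
	uint32_t ret = 0;

	assert(divisor != 0);	/* a zero divisor would never terminate */

	while (dividend >= divisor) {
		/*
		 * Empty asm with a read-write operand: the compiler must
		 * assume 'dividend' may change here, so it should not
		 * fold the loop into a 64-bit divide/modulo library call.
		 */
		__asm__("" : "+rm"(dividend));

		dividend -= divisor;
		ret++;
	}

	*remainder = dividend;
	return ret;
}
```

The asm statement emits no instructions; it only constrains what the
optimiser may assume about the variable between iterations.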
* Re: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
2010-03-23 17:36 [PATCH] sched: prevent compiler from optimising sched_avg_update loop Will Deacon
2010-03-23 17:53 ` Eric Dumazet
@ 2010-03-23 18:08 ` Peter Zijlstra
2010-03-23 19:05 ` Will Deacon
2 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2010-03-23 18:08 UTC (permalink / raw)
To: Will Deacon; +Cc: linux-kernel, Catalin Marinas, Ingo Molnar, Andrew Morton
On Tue, 2010-03-23 at 17:36 +0000, Will Deacon wrote:
> GCC 4.4.1 on ARM has been observed to replace the while loop
> in sched_avg_update with a call to uldivmod, resulting in the
> following build failure at link-time:
>
> kernel/built-in.o: In function `sched_avg_update':
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> make: *** [.tmp_vmlinux1] Error 1
>
> This patch [taken against 2.6.34-rc2] replaces the loop with a call to
> div_s64 which allows the Kernel to link.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
> kernel/sched.c | 7 +++----
> 1 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6b74f21 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
> static void sched_avg_update(struct rq *rq)
> {
> s64 period = sched_avg_period();
> + s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
>
> - while ((s64)(rq->clock - rq->age_stamp) > period) {
> - rq->age_stamp += period;
> - rq->rt_avg /= 2;
> - }
> + rq->age_stamp += (u64)(elapsed_periods * period);
> + rq->rt_avg >>= elapsed_periods;
> }
Hmm, that does an unconditional division. The thing is, under normal
circumstances I don't expect that loop to go round more than once, so
the division will hurt for no reason.
Should we maybe write it like so:
	if ((s64)(rq->clock - rq->age_stamp) > period) {
		rq->age_stamp += period;
		rq->rt_avg >>= 1;
	}

	if (unlikely((s64)(rq->clock - rq->age_stamp) > period)) {
		s64 overflows = div_s64(rq->clock - rq->age_stamp, period);
		int width = sizeof(rq->rt_avg) * 8;

		rq->age_stamp += overflows * period;
		if (unlikely(overflows >= width))
			rq->rt_avg = 0;
		else
			rq->rt_avg >>= overflows;
	}
?
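
A user-space sketch of the fast-path idea, for comparison against the
original loop. Everything here is an assumption made for illustration:
struct rq_model stands in for the three struct rq fields involved, plain
C division replaces div_s64(), and the function names are hypothetical.
The slow path uses (delta - 1) / period, as in the first patch, so that the
result matches the loop exactly; a plain delta / period would advance one
extra period whenever the remaining delta is an exact multiple of period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct rq fields used by the update. */
struct rq_model {
	uint64_t clock;
	uint64_t age_stamp;
	uint64_t rt_avg;
};

/* Reference behaviour: the loop as it appears in kernel/sched.c. */
static void avg_update_loop(struct rq_model *rq, int64_t period)
{
	assert(period > 0);

	while ((int64_t)(rq->clock - rq->age_stamp) > period) {
		rq->age_stamp += period;
		rq->rt_avg /= 2;
	}
}

/* One cheap step for the common case, a division only when far behind. */
static void avg_update_fastpath(struct rq_model *rq, int64_t period)
{
	int64_t delta = (int64_t)(rq->clock - rq->age_stamp);

	assert(period > 0);

	if (delta > period) {
		rq->age_stamp += period;
		rq->rt_avg >>= 1;
		delta -= period;
	}

	if (delta > period) {
		/* Number of further iterations the loop would have run. */
		int64_t n = (delta - 1) / period;
		int width = sizeof(rq->rt_avg) * 8;

		rq->age_stamp += (uint64_t)n * (uint64_t)period;
		/* Shifting by >= the type width is undefined in C. */
		if (n >= width)
			rq->rt_avg = 0;
		else
			rq->rt_avg >>= n;
	}
}
```

With clock = 1000, age_stamp = 0 and period = 100, both functions leave
age_stamp at 900 and shift rt_avg right by nine bits in total.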
* RE: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
[not found] ` <000101cacab3$25af90a0$710eb1e0$@deacon@arm.com>
@ 2010-03-23 18:10 ` Peter Zijlstra
0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2010-03-23 18:10 UTC (permalink / raw)
To: Will Deacon
Cc: 'Eric Dumazet', linux-kernel, Catalin Marinas,
Ingo Molnar, Andrew Morton
On Tue, 2010-03-23 at 18:03 +0000, Will Deacon wrote:
> Hello Eric,
>
> Thanks for looking at the patch.
>
> > > diff --git a/kernel/sched.c b/kernel/sched.c
> > > index 9ab3cd7..6b74f21 100644
> > > --- a/kernel/sched.c
> > > +++ b/kernel/sched.c
> > > @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
> > > static void sched_avg_update(struct rq *rq)
> > > {
> > > s64 period = sched_avg_period();
> > > + s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
> > >
> > > - while ((s64)(rq->clock - rq->age_stamp) > period) {
> > > - rq->age_stamp += period;
> > > - rq->rt_avg /= 2;
> > > - }
> > > + rq->age_stamp += (u64)(elapsed_periods * period);
> > > + rq->rt_avg >>= elapsed_periods;
> > > }
> > >
> > > static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
> >
> > Please take a look at __iter_div_u64_rem() , because we had a similar
> > problem in the past. We want to avoid this div_s64() call.
>
> Yes, I saw the inline assembly fix there. I avoided that fix because
> I was trying not to execute the loop body multiple times. Is the iterative
> approach preferred over a single call to div_s64? I don't have a handle on
> how many iterations are typically executed for this loop.
I expect it to be mostly 0 and occasionally 1 iteration, except when
someone pokes at a sysctl with funny values, at which point it might go
round many more times.
* [PATCH] sched: prevent compiler from optimising sched_avg_update loop
2010-03-23 17:36 [PATCH] sched: prevent compiler from optimising sched_avg_update loop Will Deacon
2010-03-23 17:53 ` Eric Dumazet
2010-03-23 18:08 ` Peter Zijlstra
@ 2010-03-23 19:05 ` Will Deacon
2 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2010-03-23 19:05 UTC (permalink / raw)
To: linux-kernel
Cc: Will Deacon, Catalin Marinas, Ingo Molnar, Andrew Morton,
Peter Zijlstra
GCC 4.4.1 on ARM has been observed to replace the while loop
in sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:
kernel/built-in.o: In function `sched_avg_update':
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1
This patch [taken against 2.6.34-rc2] introduces a fake data hazard to
the loop body to prevent the compiler optimising the loop away.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
kernel/sched.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 9ab3cd7..0846815 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1240,6 +1240,12 @@ static void sched_avg_update(struct rq *rq)
 	s64 period = sched_avg_period();
 
 	while ((s64)(rq->clock - rq->age_stamp) > period) {
+		/*
+		 * Inline assembly required to prevent the compiler
+		 * optimising this loop into a divmod call.
+		 * See __iter_div_u64_rem() for another example of this.
+		 */
+		asm("" : "+rm" (rq->age_stamp));
 		rq->age_stamp += period;
 		rq->rt_avg /= 2;
 	}
--
1.6.3.3