* [PATCH] sched: prevent compiler from optimising sched_avg_update loop
From: Will Deacon @ 2010-03-23 17:36 UTC
  To: linux-kernel
  Cc: Will Deacon, Catalin Marinas, Ingo Molnar, Andrew Morton,
	Peter Zijlstra

GCC 4.4.1 on ARM has been observed to replace the while loop
in sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:

kernel/built-in.o: In function `sched_avg_update':
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1

This patch [taken against 2.6.34-rc2] replaces the loop with a call to
div_s64(), which allows the kernel to link.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 kernel/sched.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 9ab3cd7..6b74f21 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
 static void sched_avg_update(struct rq *rq)
 {
 	s64 period = sched_avg_period();
+	s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
 
-	while ((s64)(rq->clock - rq->age_stamp) > period) {
-		rq->age_stamp += period;
-		rq->rt_avg /= 2;
-	}
+	rq->age_stamp += (u64)(elapsed_periods * period);
+	rq->rt_avg >>= elapsed_periods;
 }
 
 static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
-- 
1.6.3.3



* Re: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
From: Eric Dumazet @ 2010-03-23 17:53 UTC
  To: Will Deacon
  Cc: linux-kernel, Catalin Marinas, Ingo Molnar, Andrew Morton,
	Peter Zijlstra

On Tuesday 23 March 2010 at 17:36 +0000, Will Deacon wrote:
> GCC 4.4.1 on ARM has been observed to replace the while loop
> in sched_avg_update with a call to uldivmod, resulting in the
> following build failure at link-time:
> 
> kernel/built-in.o: In function `sched_avg_update':
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> make: *** [.tmp_vmlinux1] Error 1
> 
> This patch [taken against 2.6.34-rc2] replaces the loop with a call to
> div_s64 which allows the Kernel to link.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  kernel/sched.c |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6b74f21 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
>  static void sched_avg_update(struct rq *rq)
>  {
>  	s64 period = sched_avg_period();
> +	s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
>  
> -	while ((s64)(rq->clock - rq->age_stamp) > period) {
> -		rq->age_stamp += period;
> -		rq->rt_avg /= 2;
> -	}
> +	rq->age_stamp += (u64)(elapsed_periods * period);
> +	rq->rt_avg >>= elapsed_periods;
>  }
>  
>  static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)

Please take a look at __iter_div_u64_rem(), because we had a similar
problem there in the past. We want to avoid this div_s64() call.
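
For readers unfamiliar with the technique referred to above, here is a
minimal userspace sketch of an iterative divide protected by an empty
inline-asm statement. It is modelled on the idea behind
__iter_div_u64_rem() but is not the kernel's code: the name
iter_div_u64_rem_sketch and the use of stdint types are assumptions made
for illustration, and the asm syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model: returns dividend / divisor and stores
 * dividend % divisor in *remainder, using only repeated subtraction.
 * Only sensible when the quotient is expected to be very small.
 */
static uint32_t iter_div_u64_rem_sketch(uint64_t dividend, uint32_t divisor,
                                        uint64_t *remainder)
{
    uint32_t quotient = 0;

    assert(divisor != 0);

    while (dividend >= divisor) {
        /*
         * Empty asm with a read-write operand: the compiler has to
         * assume 'dividend' may change here, so it cannot rewrite the
         * loop as one 64-bit division (a library call on 32-bit ARM).
         */
        __asm__("" : "+rm"(dividend));

        dividend -= divisor;
        quotient++;
    }

    *remainder = dividend;
    return quotient;
}
```

The asm statement emits no instructions; its only effect is to hide the
loop-carried value from the optimiser, which is the "fake data hazard"
idea discussed later in this thread.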





* Re: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
From: Peter Zijlstra @ 2010-03-23 18:08 UTC
  To: Will Deacon; +Cc: linux-kernel, Catalin Marinas, Ingo Molnar, Andrew Morton

On Tue, 2010-03-23 at 17:36 +0000, Will Deacon wrote:
> GCC 4.4.1 on ARM has been observed to replace the while loop
> in sched_avg_update with a call to uldivmod, resulting in the
> following build failure at link-time:
> 
> kernel/built-in.o: In function `sched_avg_update':
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> /linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
> make: *** [.tmp_vmlinux1] Error 1
> 
> This patch [taken against 2.6.34-rc2] replaces the loop with a call to
> div_s64 which allows the Kernel to link.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  kernel/sched.c |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 9ab3cd7..6b74f21 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
>  static void sched_avg_update(struct rq *rq)
>  {
>  	s64 period = sched_avg_period();
> +	s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
>  
> -	while ((s64)(rq->clock - rq->age_stamp) > period) {
> -		rq->age_stamp += period;
> -		rq->rt_avg /= 2;
> -	}
> +	rq->age_stamp += (u64)(elapsed_periods * period);
> +	rq->rt_avg >>= elapsed_periods;
>  }

Hmm, that does an unconditional division. The thing is, I don't expect
that loop to go round more than once under normal circumstances, so the
division will hurt for no reason.

Should we maybe write it like so:

  if ((s64)(rq->clock - rq->age_stamp) > period) {
    rq->age_stamp += period;
    rq->rt_avg >>= 1;
  }
  if (unlikely((s64)(rq->clock - rq->age_stamp) > period)) {
    s64 overflows = div_s64(rq->clock - rq->age_stamp, period);
    int width = sizeof(rq->rt_avg) * 8;

    rq->age_stamp += overflows * period;
    if (unlikely(overflows >= width))
      rq->rt_avg = 0;
    else
      rq->rt_avg >>= overflows;
  }

?
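
A userspace sketch of this split (fast path first, division only when
several periods have elapsed) can be checked against the original loop.
Everything here is an assumption made for illustration: struct rq_model
stands in for the three struct rq fields used above, plain 64-bit
division replaces div_s64(), and the "- 1" in the division is an
adjustment, not part of the suggestion above, so that exact multiples of
the period age by the same amount as the loop would.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct rq fields used in this thread. */
struct rq_model {
    uint64_t clock;
    uint64_t age_stamp;
    uint64_t rt_avg;
};

/* Reference behaviour: the original loop, one halving per elapsed period. */
static void avg_update_loop(struct rq_model *rq, int64_t period)
{
    while ((int64_t)(rq->clock - rq->age_stamp) > period) {
        rq->age_stamp += period;
        rq->rt_avg /= 2;
    }
}

/* Fast path for the common case; divide only if more periods remain. */
static void avg_update_split(struct rq_model *rq, int64_t period)
{
    assert(period > 0);

    if ((int64_t)(rq->clock - rq->age_stamp) > period) {
        rq->age_stamp += period;
        rq->rt_avg >>= 1;
    }
    if ((int64_t)(rq->clock - rq->age_stamp) > period) {
        /* "- 1" keeps exact multiples of the period in step with the loop. */
        int64_t periods = ((int64_t)(rq->clock - rq->age_stamp) - 1) / period;
        int width = (int)(sizeof(rq->rt_avg) * 8);

        rq->age_stamp += (uint64_t)(periods * period);
        /* Shifting by the type's full width or more is undefined in C. */
        if (periods >= width)
            rq->rt_avg = 0;
        else
            rq->rt_avg >>= periods;
    }
}
```

The width check matters because a long stall (or a very small period)
can make the number of elapsed periods exceed 63, at which point the
loop would have halved rt_avg down to zero anyway.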
    


* RE: [PATCH] sched: prevent compiler from optimising sched_avg_update loop
From: Peter Zijlstra @ 2010-03-23 18:10 UTC
  To: Will Deacon
  Cc: 'Eric Dumazet', linux-kernel, Catalin Marinas,
	Ingo Molnar, Andrew Morton

On Tue, 2010-03-23 at 18:03 +0000, Will Deacon wrote:
> Hello Eric,
> 
> Thanks for looking at the patch.
> 
> > > diff --git a/kernel/sched.c b/kernel/sched.c
> > > index 9ab3cd7..6b74f21 100644
> > > --- a/kernel/sched.c
> > > +++ b/kernel/sched.c
> > > @@ -1238,11 +1238,10 @@ static u64 sched_avg_period(void)
> > >  static void sched_avg_update(struct rq *rq)
> > >  {
> > >  	s64 period = sched_avg_period();
> > > +	s64 elapsed_periods = div_s64(rq->clock - rq->age_stamp - 1, period);
> > >
> > > -	while ((s64)(rq->clock - rq->age_stamp) > period) {
> > > -		rq->age_stamp += period;
> > > -		rq->rt_avg /= 2;
> > > -	}
> > > +	rq->age_stamp += (u64)(elapsed_periods * period);
> > > +	rq->rt_avg >>= elapsed_periods;
> > >  }
> > >
> > >  static void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
> > 
> > Please take a look at __iter_div_u64_rem() , because we had a similar
> > problem in the past. We want to avoid this div_s64() call.
> 
> Yes, I saw the inline assembly fix there. I avoided that fix because
> I was trying not to execute the loop body multiple times. Is the iterative
> approach preferred over a single call to div_s64? I don't have a handle on
> how many iterations are typically executed for this loop.

I expect it to run mostly 0 and occasionally 1 iteration, except when
someone pokes at a sysctl with funny values, at which point it might go
round many more times.


* [PATCH] sched: prevent compiler from optimising sched_avg_update loop
From: Will Deacon @ 2010-03-23 19:05 UTC
  To: linux-kernel
  Cc: Will Deacon, Catalin Marinas, Ingo Molnar, Andrew Morton,
	Peter Zijlstra

GCC 4.4.1 on ARM has been observed to replace the while loop
in sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:

kernel/built-in.o: In function `sched_avg_update':
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
/linux-2.6/kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1

This patch [taken against 2.6.34-rc2] introduces a fake data hazard into
the loop body to prevent the compiler from optimising the loop into a
division.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 kernel/sched.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 9ab3cd7..0846815 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1240,6 +1240,12 @@ static void sched_avg_update(struct rq *rq)
 	s64 period = sched_avg_period();
 
 	while ((s64)(rq->clock - rq->age_stamp) > period) {
+		/*
+		 * Inline assembly required to prevent the compiler
+		 * optimising this loop into a divmod call.
+		 * See __iter_div_u64_rem() for another example of this.
+		 */
+		asm("" : "+rm" (rq->age_stamp));
 		rq->age_stamp += period;
 		rq->rt_avg /= 2;
 	}
-- 
1.6.3.3


