linux-snps-arc.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arc: make sure __delay() never gets executed with 0 loops
@ 2016-02-15 16:37 Alexey Brodkin
  2016-02-24  5:05 ` Vineet Gupta
  0 siblings, 1 reply; 3+ messages in thread
From: Alexey Brodkin @ 2016-02-15 16:37 UTC (permalink / raw)
  To: linux-snps-arc

Current implementation of __delay() function uses so-called
zero-delay loops. And the only condition to exit that loop is
LP_COUNT (loop count register) = 1 (but not 0 as it might be easily
imagined).

So if our calculation of "loops" gives 0 (and that is pretty possible
given result of multiplication being >> 32) then zero-delay loop
mechanism starts with LP_COUNT=0 and it ends up decrementing LP_COUNT
while staying in the loop effectively producing close to infinite delay
instead of very short one.

I bumped into it with AXS101 + external DDR controller and caches
disabled. In that case I've got very small
loops_per_jiffy=0xf00:
------------------------>8--------------------
Calibrating delay loop... 0.77 BogoMIPS (lpj=3862)
------------------------>8--------------------

And on console output delays were way too long.

Signed-off-by: Alexey Brodkin <abrodkin at synopsys.com>
Cc: Vineet Gupta <vgupta at synopsys.com>
 Please enter the commit message for your changes. Lines starting
---
 arch/arc/include/asm/delay.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/delay.h b/arch/arc/include/asm/delay.h
index 08e7e2a..1a7a1dc 100644
--- a/arch/arc/include/asm/delay.h
+++ b/arch/arc/include/asm/delay.h
@@ -57,7 +57,8 @@ static inline void __udelay(unsigned long usecs)
 	 */
 	loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
 
-	__delay(loops);
+	if (loops)
+		__delay(loops);
 }
 
 #define udelay(n) (__builtin_constant_p(n) ? ((n) > 20000 ? __bad_udelay() \
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH] arc: make sure __delay() never gets executed with 0 loops
  2016-02-15 16:37 [PATCH] arc: make sure __delay() never gets executed with 0 loops Alexey Brodkin
@ 2016-02-24  5:05 ` Vineet Gupta
  2016-02-24 14:10   ` Alexey Brodkin
  0 siblings, 1 reply; 3+ messages in thread
From: Vineet Gupta @ 2016-02-24  5:05 UTC (permalink / raw)
  To: linux-snps-arc

On Monday 15 February 2016 10:07 PM, Alexey Brodkin wrote:
> Current implementation of __delay() function uses so-called
> zero-delay loops. And the only condition to exit that loop is
> LP_COUNT (loop count register) = 1 (but not 0 as it might be easily
> imagined).

So u can fix this better by doing a lp.nz, but....

> So if our calculation of "loops" gives 0 (and that is pretty possible
> given result of multiplication being >> 32) then zero-delay loop
> mechanism starts with LP_COUNT=0 and it ends up decrementing LP_COUNT
> while staying in the loop effectively producing close to infinite delay
> instead of very short one.
>
> I bumped into it with AXS101 + external DDR controller and caches
> disabled. In that case I've got very small
> loops_per_jiffy=0xf00:

I understand this gives you grief, but the code is doing exactly what it is asked to.
Since the system is slow, You are getting only 0xf00 (3840) loop iterations in 10ms.
So if you want say a delay of 1 micro-sec, you will need to loop for 3840 / 10000
~ 0 loops

This all assumes our lpj computation is correct - otherwise that needs fixing too.

Anyways I think for genuine cases where the number of loops is indeed computed to
0 because caller was passing too small a value, it is better to wait for looong
time to catch the bugger rather than silently returning. This is one of the cases
where disease is better than the cure !

> ------------------------>8--------------------
> Calibrating delay loop... 0.77 BogoMIPS (lpj=3862)
> ------------------------>8--------------------
>
> And on console output delays were way too long.
>
> Signed-off-by: Alexey Brodkin <abrodkin at synopsys.com>
> Cc: Vineet Gupta <vgupta at synopsys.com>
>  Please enter the commit message for your changes. Lines starting
> ---
>  arch/arc/include/asm/delay.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arc/include/asm/delay.h b/arch/arc/include/asm/delay.h
> index 08e7e2a..1a7a1dc 100644
> --- a/arch/arc/include/asm/delay.h
> +++ b/arch/arc/include/asm/delay.h
> @@ -57,7 +57,8 @@ static inline void __udelay(unsigned long usecs)
>  	 */
>  	loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
>  
> -	__delay(loops);
> +	if (loops)
> +		__delay(loops);
>  }
>  
>  #define udelay(n) (__builtin_constant_p(n) ? ((n) > 20000 ? __bad_udelay() \

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH] arc: make sure __delay() never gets executed with 0 loops
  2016-02-24  5:05 ` Vineet Gupta
@ 2016-02-24 14:10   ` Alexey Brodkin
  0 siblings, 0 replies; 3+ messages in thread
From: Alexey Brodkin @ 2016-02-24 14:10 UTC (permalink / raw)
  To: linux-snps-arc

Hi Vineet,

On Wed, 2016-02-24@05:05 +0000, Vineet Gupta wrote:
> On Monday 15 February 2016 10:07 PM, Alexey Brodkin wrote:
> > Current implementation of __delay() function uses so-called
> > zero-delay loops. And the only condition to exit that loop is
> > LP_COUNT (loop count register) = 1 (but not 0 as it might be easily
> > imagined).
> 
> So u can fix this better by doing a lp.nz, but....
> 
> > So if our calculation of "loops" gives 0 (and that is pretty possible
> > given result of multiplication being >> 32) then zero-delay loop
> > mechanism starts with LP_COUNT=0 and it ends up decrementing LP_COUNT
> > while staying in the loop effectively producing close to infinite delay
> > instead of very short one.
> > 
> > I bumped into it with AXS101 + external DDR controller and caches
> > disabled. In that case I've got very small
> > loops_per_jiffy=0xf00:
> 
> I understand this gives you grief, but the code is doing exactly what it is asked to.
> Since the system is slow, You are getting only 0xf00 (3840) loop iterations in 10ms.
> So if you want say a delay of 1 micro-sec, you will need to loop for 3840 / 10000
> ~ 0 loops
> 
> This all assumes our lpj computation is correct - otherwise that needs fixing too.
> 
> Anyways I think for genuine cases where the number of loops is indeed computed to
> 0 because caller was passing too small a value, it is better to wait for looong
> time to catch the bugger rather than silently returning. This is one of the cases
> where disease is better than the cure !

Ok, but see delays (even those for just a few usecs) might be justified by
hardware requirements (i.e. some peripheral need at least 10 usecs between this
and tat operations). So one driver will attempt to use completely correct value
for delay but on slow hardware (or especially in simulation) we'll get kernel
virtually stuck for no obvious reason.

So frankly I don't like proposal to keep existing implementation.
And solution with "lp.nz" looks much better to me.

-Alexey

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-02-24 14:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-15 16:37 [PATCH] arc: make sure __delay() never gets executed with 0 loops Alexey Brodkin
2016-02-24  5:05 ` Vineet Gupta
2016-02-24 14:10   ` Alexey Brodkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).