* [patch 4/21] fix ARCH_HAS_PREFETCH
@ 2002-08-11 7:38 Andrew Morton
2002-08-11 19:49 ` Alan Cox
0 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2002-08-11 7:38 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lkml
include/linux/prefetch.h does a strange thing: if the arch doesn't have
the prefectch functions, this header defines no-op version of them and
then defines ARCH_HAS_PREFETCH. So there's no way for mainline code to
know if the architecture *really* has prefetch instructions.
This information loss is unfortunate. Examples:
for (i = 0; i < N; i++)
prefetch(foo[i]);
Problem is, if `prefetch' is a no-op, the compiler will still
generate an empty busy-wait loop. Which it must do. We need to
know the truth about ARCH_HAS_PREFETCH to correctly elide that loop.
#ifdef ARCH_HAS_PREFETCH
#define prefetch_prev_lru_page(_page, _base, _field) \
do { \
if ((_page)->lru.prev != _base) { \
struct page *prev; \
\
prev = list_entry(_page->lru.prev, \
struct page, lru); \
prefetch(&prev->_field); \
} \
} while (0)
#else
#define prefetch_prev_lru_page(_page, _base, _field) do { } while (0)
#endif
Which needs a working ARCH_HAS_PREFETCH to avoid probable extra code
generation on CPUs which don't have prefetch.
prefetch.h | 3 ---
1 files changed, 3 deletions(-)
--- 2.5.31/include/linux/prefetch.h~arch_has_prefetchw Sun Aug 11 00:20:08 2002
+++ 2.5.31-akpm/include/linux/prefetch.h Sun Aug 11 00:20:08 2002
@@ -39,17 +39,14 @@
*/
#ifndef ARCH_HAS_PREFETCH
-#define ARCH_HAS_PREFETCH
static inline void prefetch(const void *x) {;}
#endif
#ifndef ARCH_HAS_PREFETCHW
-#define ARCH_HAS_PREFETCHW
static inline void prefetchw(const void *x) {;}
#endif
#ifndef ARCH_HAS_SPINLOCK_PREFETCH
-#define ARCH_HAS_SPINLOCK_PREFETCH
#define spin_lock_prefetch(x) prefetchw(x)
#endif
.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
[not found] <3D56147E.15E7A98@zip.com.au.suse.lists.linux.kernel>
@ 2002-08-11 9:53 ` Andi Kleen
0 siblings, 0 replies; 27+ messages in thread
From: Andi Kleen @ 2002-08-11 9:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton <akpm@zip.com.au> writes:
> Which needs a working ARCH_HAS_PREFETCH to avoid probable extra code
> generation on CPUs which don't have prefetch.
When you use gcc 3.1+ you can use __builtin_prefetch() and gcc takes care of
it. See asm-x86_64/prefetch.h of a working example.
Of course generic C code should not use the ugly builtins directly, but
it could be used to define the wrappers.
-Andi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-11 18:47 ` Andrew Morton
@ 2002-08-11 18:46 ` Larry McVoy
2002-08-11 19:06 ` Linus Torvalds
2002-08-13 21:42 ` [patch 4/21] fix ARCH_HAS_PREFETCH Adrian Bunk
2 siblings, 0 replies; 27+ messages in thread
From: Larry McVoy @ 2002-08-11 18:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alan Cox, Linus Torvalds, lkml
On Sun, Aug 11, 2002 at 11:47:22AM -0700, Andrew Morton wrote:
> Alan Cox wrote:
> >
> > On Sun, 2002-08-11 at 08:38, Andrew Morton wrote:
> > > This information loss is unfortunate. Examples:
> > >
> > > for (i = 0; i < N; i++)
> > > prefetch(foo[i]);
> > >
> > > Problem is, if `prefetch' is a no-op, the compiler will still
> > > generate an empty busy-wait loop. Which it must do.
> >
> > Why - nothing there is volatile
>
> Because the compiler sees:
>
> for (i = 0; i < N; i++)
> ;
>
> and it says "ah ha. A busy wait delay loop" and leaves it alone.
>
> It's actually a special-case inside the compiler to not optimise
> away such constructs.
Good lord. If anyone depends all versions of GCC to not optimize this away,
they are going to hate life. Since GCC doesn't seem to do cross file
optimization (does it?) I've found the following works well:
cat > use_result.c
int dummy; // can't be static, the compiler will see it's not read
use_result(int i)
{
dummy = i;
}
^D
for (i = 0; i < N; ++i) use_result(i);
I'm positive we do stuff like this in LMbench, which is fairly well supported
on a pile of different platforms/compilers.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-11 19:49 ` Alan Cox
@ 2002-08-11 18:47 ` Andrew Morton
2002-08-11 18:46 ` Larry McVoy
` (2 more replies)
0 siblings, 3 replies; 27+ messages in thread
From: Andrew Morton @ 2002-08-11 18:47 UTC (permalink / raw)
To: Alan Cox; +Cc: Linus Torvalds, lkml
Alan Cox wrote:
>
> On Sun, 2002-08-11 at 08:38, Andrew Morton wrote:
> > This information loss is unfortunate. Examples:
> >
> > for (i = 0; i < N; i++)
> > prefetch(foo[i]);
> >
> > Problem is, if `prefetch' is a no-op, the compiler will still
> > generate an empty busy-wait loop. Which it must do.
>
> Why - nothing there is volatile
Because the compiler sees:
for (i = 0; i < N; i++)
;
and it says "ah ha. A busy wait delay loop" and leaves it alone.
It's actually a special-case inside the compiler to not optimise
away such constructs.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-11 18:47 ` Andrew Morton
2002-08-11 18:46 ` Larry McVoy
@ 2002-08-11 19:06 ` Linus Torvalds
2002-08-11 20:07 ` GCC still keeps empty loops? (was: [patch 4/21] fix ARCH_HAS_PREFETCH) Jamie Lokier
2002-08-13 21:42 ` [patch 4/21] fix ARCH_HAS_PREFETCH Adrian Bunk
2 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2002-08-11 19:06 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alan Cox, lkml
On Sun, 11 Aug 2002, Andrew Morton wrote:
>
> It's actually a special-case inside the compiler to not optimise
> away such constructs.
I thought that special case was removed long ago, because it is untenable
in C++ etc (where such empty loops happen due to various abstraction
issues, and not optimizing them away is just silly).
But testing shows that you're right at least for 2.95 and 2.96. Argh
Linus
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-11 7:38 [patch 4/21] fix ARCH_HAS_PREFETCH Andrew Morton
@ 2002-08-11 19:49 ` Alan Cox
2002-08-11 18:47 ` Andrew Morton
0 siblings, 1 reply; 27+ messages in thread
From: Alan Cox @ 2002-08-11 19:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linus Torvalds, lkml
On Sun, 2002-08-11 at 08:38, Andrew Morton wrote:
> This information loss is unfortunate. Examples:
>
> for (i = 0; i < N; i++)
> prefetch(foo[i]);
>
> Problem is, if `prefetch' is a no-op, the compiler will still
> generate an empty busy-wait loop. Which it must do.
Why - nothing there is volatile
^ permalink raw reply [flat|nested] 27+ messages in thread
* GCC still keeps empty loops? (was: [patch 4/21] fix ARCH_HAS_PREFETCH)
2002-08-11 19:06 ` Linus Torvalds
@ 2002-08-11 20:07 ` Jamie Lokier
2002-08-12 7:45 ` Ralf Baechle
0 siblings, 1 reply; 27+ messages in thread
From: Jamie Lokier @ 2002-08-11 20:07 UTC (permalink / raw)
To: gcc, Linus Torvalds; +Cc: Andrew Morton, Alan Cox, lkml
Linus Torvalds wrote:
> I thought that special case was removed long ago, because it is untenable
> in C++ etc (where such empty loops happen due to various abstraction
> issues, and not optimizing them away is just silly).
>
> But testing shows that you're right at least for 2.95 and 2.96. Argh
Unbelievably, 3.1 doesn't remove empty loops either.
I think there's a case for a compiler flag, `-fremove-empty-loops'.
Empty loop delays aren't portable acrosss compilers in general. If you
_really_ want an empty loop that must always work with GCC, it's easy
enough to write:
for (i = 0; i < 100000; i++)
__asm__ __volatile__ ("");
-- Jamie
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: GCC still keeps empty loops? (was: [patch 4/21] fix ARCH_HAS_PREFETCH)
2002-08-11 20:07 ` GCC still keeps empty loops? (was: [patch 4/21] fix ARCH_HAS_PREFETCH) Jamie Lokier
@ 2002-08-12 7:45 ` Ralf Baechle
0 siblings, 0 replies; 27+ messages in thread
From: Ralf Baechle @ 2002-08-12 7:45 UTC (permalink / raw)
To: Jamie Lokier; +Cc: gcc, Linus Torvalds, Andrew Morton, Alan Cox, lkml
On Sun, Aug 11, 2002 at 09:07:18PM +0100, Jamie Lokier wrote:
> Unbelievably, 3.1 doesn't remove empty loops either.
> I think there's a case for a compiler flag, `-fremove-empty-loops'.
Indeed ... It's sad having to scatter ifdefs over the code just because
gcc lacks this optimization ...
Ralf
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-11 18:47 ` Andrew Morton
2002-08-11 18:46 ` Larry McVoy
2002-08-11 19:06 ` Linus Torvalds
@ 2002-08-13 21:42 ` Adrian Bunk
2002-08-13 22:12 ` H. Peter Anvin
2 siblings, 1 reply; 27+ messages in thread
From: Adrian Bunk @ 2002-08-13 21:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alan Cox, Linus Torvalds, lkml
On Sun, 11 Aug 2002, Andrew Morton wrote:
> Alan Cox wrote:
> >
> > On Sun, 2002-08-11 at 08:38, Andrew Morton wrote:
> > > This information loss is unfortunate. Examples:
> > >
> > > for (i = 0; i < N; i++)
> > > prefetch(foo[i]);
> > >
> > > Problem is, if `prefetch' is a no-op, the compiler will still
> > > generate an empty busy-wait loop. Which it must do.
> >
> > Why - nothing there is volatile
>
> Because the compiler sees:
>
> for (i = 0; i < N; i++)
> ;
>
> and it says "ah ha. A busy wait delay loop" and leaves it alone.
>
> It's actually a special-case inside the compiler to not optimise
> away such constructs.
Why is this a special case? As long as a compiler can't prove that the
computed value of i isn't used later it mustn't optimize it away.
Kernighan/Ritchie (German translation of the second edition) contains the
following example program that shows why the compiler mustn't optimize it
away:
<-- snip -->
#include <stdio.h>
main()
{
double nc;
for (nc = 0; getchar() != EOF; ++nc)
;
printf("%.0f\n", nc);
}
<-- snip -->
cu
Adrian
--
You only think this is a free country. Like the US the UK spends a lot of
time explaining its a free country because its a police state.
Alan Cox
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-13 21:42 ` [patch 4/21] fix ARCH_HAS_PREFETCH Adrian Bunk
@ 2002-08-13 22:12 ` H. Peter Anvin
2002-08-13 22:22 ` Adrian Bunk
2002-08-14 17:40 ` Rogier Wolff
0 siblings, 2 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-13 22:12 UTC (permalink / raw)
To: linux-kernel
Followup to: <Pine.NEB.4.44.0208132322340.1351-100000@mimas.fachschaften.tu-muenchen.de>
By author: Adrian Bunk <bunk@fs.tum.de>
In newsgroup: linux.dev.kernel
> >
> > Because the compiler sees:
> >
> > for (i = 0; i < N; i++)
> > ;
> >
> > and it says "ah ha. A busy wait delay loop" and leaves it alone.
> >
> > It's actually a special-case inside the compiler to not optimise
> > away such constructs.
>
> Why is this a special case? As long as a compiler can't prove that the
> computed value of i isn't used later it mustn't optimize it away.
>
Bullsh*t. It can legitimately transform it into:
i = N;
> Kernighan/Ritchie (German translation of the second edition) contains the
> following example program that shows why the compiler mustn't optimize it
> away:
>
> <-- snip -->
>
> #include <stdio.h>
>
> main()
> {
> double nc;
>
> for (nc = 0; getchar() != EOF; ++nc)
> ;
> printf("%.0f\n", nc);
>
> }
>
getchar() has side effects.
-hpa
--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <amsp@zytor.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-13 22:12 ` H. Peter Anvin
@ 2002-08-13 22:22 ` Adrian Bunk
2002-08-13 22:42 ` H. Peter Anvin
2002-08-14 17:40 ` Rogier Wolff
1 sibling, 1 reply; 27+ messages in thread
From: Adrian Bunk @ 2002-08-13 22:22 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On 13 Aug 2002, H. Peter Anvin wrote:
>...
> > > Because the compiler sees:
> > >
> > > for (i = 0; i < N; i++)
> > > ;
> > >
> > > and it says "ah ha. A busy wait delay loop" and leaves it alone.
> > >
> > > It's actually a special-case inside the compiler to not optimise
> > > away such constructs.
> >
> > Why is this a special case? As long as a compiler can't prove that the
> > computed value of i isn't used later it mustn't optimize it away.
>
> Bullsh*t. It can legitimately transform it into:
>
> i = N;
>...
Ah, my misunderstanding:
"optimize away" didn't mean "completely remove it" but "transform it to
something semantically equivalent".
> -hpa
Thanks
Adrian
--
You only think this is a free country. Like the US the UK spends a lot of
time explaining its a free country because its a police state.
Alan Cox
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-13 22:22 ` Adrian Bunk
@ 2002-08-13 22:42 ` H. Peter Anvin
0 siblings, 0 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-13 22:42 UTC (permalink / raw)
To: Adrian Bunk; +Cc: linux-kernel
Adrian Bunk wrote:
>
> Ah, my misunderstanding:
> "optimize away" didn't mean "completely remove it" but "transform it to
> something semantically equivalent".
>
The thing is that gcc has a special rule that keeps it from optimizing
it into something with less than O(n) time.
-=hpa
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-13 22:12 ` H. Peter Anvin
2002-08-13 22:22 ` Adrian Bunk
@ 2002-08-14 17:40 ` Rogier Wolff
2002-08-14 19:41 ` H. Peter Anvin
1 sibling, 1 reply; 27+ messages in thread
From: Rogier Wolff @ 2002-08-14 17:40 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel
On Tue, Aug 13, 2002 at 03:12:53PM -0700, H. Peter Anvin wrote:
> Followup to: <Pine.NEB.4.44.0208132322340.1351-100000@mimas.fachschaften.tu-muenchen.de>
> By author: Adrian Bunk <bunk@fs.tum.de>
> In newsgroup: linux.dev.kernel
> > >
> > > Because the compiler sees:
> > >
> > > for (i = 0; i < N; i++)
> > > ;
> > >
> > > and it says "ah ha. A busy wait delay loop" and leaves it alone.
> > >
> > > It's actually a special-case inside the compiler to not optimise
> > > away such constructs.
> >
> > Why is this a special case? As long as a compiler can't prove that the
> > computed value of i isn't used later it mustn't optimize it away.
> >
>
> Bullsh*t. It can legitimately transform it into:
>
> i = N;
Right! But people are confusing "practise", "published interface", and
"spec" again.
Published interface in this case is that gcc will not optimize an empty
loop away, as it is often used to generate a timing loop.
Roger.
--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots.
* There are also old, bald pilots.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 17:40 ` Rogier Wolff
@ 2002-08-14 19:41 ` H. Peter Anvin
2002-08-14 19:57 ` Jamie Lokier
2002-08-14 20:45 ` Willy Tarreau
0 siblings, 2 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-14 19:41 UTC (permalink / raw)
To: Rogier Wolff; +Cc: linux-kernel
Rogier Wolff wrote:
>>
>>Bullsh*t. It can legitimately transform it into:
>>
>> i = N;
>
>
> Right! But people are confusing "practise", "published interface", and
> "spec" again.
>
> Published interface in this case is that gcc will not optimize an empty
> loop away, as it is often used to generate a timing loop.
>
Yes. This is a gcc-specific wart, a bad idea from the start, and
apparently one which has caught up with them to the point that they've
had to abandon it.
-hpa
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 19:41 ` H. Peter Anvin
@ 2002-08-14 19:57 ` Jamie Lokier
2002-08-14 22:03 ` Daniel Jacobowitz
2002-08-14 20:45 ` Willy Tarreau
1 sibling, 1 reply; 27+ messages in thread
From: Jamie Lokier @ 2002-08-14 19:57 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Rogier Wolff, linux-kernel
H. Peter Anvin wrote:
> Yes. This is a gcc-specific wart, a bad idea from the start, and
> apparently one which has caught up with them to the point that they've
> had to abandon it.
It's still there in GCC 3.1.
-- Jamie
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 19:41 ` H. Peter Anvin
2002-08-14 19:57 ` Jamie Lokier
@ 2002-08-14 20:45 ` Willy Tarreau
2002-08-14 20:58 ` H. Peter Anvin
2002-08-14 22:53 ` David Lang
1 sibling, 2 replies; 27+ messages in thread
From: Willy Tarreau @ 2002-08-14 20:45 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Rogier Wolff, linux-kernel
On Wed, Aug 14, 2002 at 12:41:04PM -0700, H. Peter Anvin wrote:
> Rogier Wolff wrote:
> >>
> >>Bullsh*t. It can legitimately transform it into:
> >>
> >> i = N;
> >
> >
> >Right! But people are confusing "practise", "published interface", and
> >"spec" again.
> >
> >Published interface in this case is that gcc will not optimize an empty
> >loop away, as it is often used to generate a timing loop.
> >
>
> Yes. This is a gcc-specific wart, a bad idea from the start, and
> apparently one which has caught up with them to the point that they've
> had to abandon it.
There would be a solution to tell gcc not to optimize things, which may
not require too much work from gcc people. Basically, we would need to
implement a __builtin_nop() function that would respect dependencies but
not generate any code. This way, we could have :
for (i=0; i<N, i++);
optimized as i=N
and
for (i=0; i<N; i++)
__builtin_nop();
or even
for (i=0; i<N; __builtin_nop(i++));
do the real work.
This way, some loops could be optimized, and the developpers could explicitely
tell the compiler when they need to prevent any optimization.
Cheers,
Willy
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 20:45 ` Willy Tarreau
@ 2002-08-14 20:58 ` H. Peter Anvin
2002-08-14 21:11 ` Willy Tarreau
2002-08-15 1:23 ` Alan Cox
2002-08-14 22:53 ` David Lang
1 sibling, 2 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-14 20:58 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Rogier Wolff, linux-kernel
Willy Tarreau wrote:
>
> There would be a solution to tell gcc not to optimize things, which may
> not require too much work from gcc people. Basically, we would need to
> implement a __builtin_nop() function that would respect dependencies but
> not generate any code. This way, we could have :
>
> for (i=0; i<N, i++);
>
> optimized as i=N
> and
> for (i=0; i<N; i++)
> __builtin_nop();
> or even
> for (i=0; i<N; __builtin_nop(i++));
> do the real work.
>
> This way, some loops could be optimized, and the developpers could explicitely
> tell the compiler when they need to prevent any optimization.
>
#define __nop() asm volatile("")
Since some processors now have "busy wait delay" instructions, this
would also make it possible to do:
#if defined(__i386__) || defined(__x86_64__)
#define __busy_wait() asm volatile("rep;nop")
#else
#define __busy_wait() asm volatile("")
#endif
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 20:58 ` H. Peter Anvin
@ 2002-08-14 21:11 ` Willy Tarreau
2002-08-14 21:13 ` H. Peter Anvin
2002-08-15 1:23 ` Alan Cox
1 sibling, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2002-08-14 21:11 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Willy Tarreau, Rogier Wolff, linux-kernel
On Wed, Aug 14, 2002 at 01:58:41PM -0700, H. Peter Anvin wrote:
> Willy Tarreau wrote:
> > This way, some loops could be optimized, and the developpers could explicitely
> > tell the compiler when they need to prevent any optimization.
> >
>
> #define __nop() asm volatile("")
and if you want to pass arguments, to guarantee that no optimization will
be done, even on loop constants ?
eg:
for (i = 0; i < N; i++) {
j++;
__nop();
}
-> might be optimized this way :
j = N;
for (i = 0; i < N; i++) {
__nop();
}
Perhaps using a volatile for j ?
Cheers,
Willy
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 21:11 ` Willy Tarreau
@ 2002-08-14 21:13 ` H. Peter Anvin
0 siblings, 0 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-14 21:13 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Rogier Wolff, linux-kernel
Willy Tarreau wrote:
>>>
>>
>>#define __nop() asm volatile("")
>
> and if you want to pass arguments, to guarantee that no optimization will
> be done, even on loop constants ?
> eg:
> for (i = 0; i < N; i++) {
> j++;
> __nop();
> }
>
> -> might be optimized this way :
> j = N;
> for (i = 0; i < N; i++) {
> __nop();
> }
>
> Perhaps using a volatile for j ?
>
OK, what are you trying to accomplish by this?
But if you wanted to, you could do:
for ( i = 0 ; i < N ; i++ ) {
j++;
asm volatile("" : "=g" (j));
}
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 19:57 ` Jamie Lokier
@ 2002-08-14 22:03 ` Daniel Jacobowitz
0 siblings, 0 replies; 27+ messages in thread
From: Daniel Jacobowitz @ 2002-08-14 22:03 UTC (permalink / raw)
To: Jamie Lokier; +Cc: H. Peter Anvin, Rogier Wolff, linux-kernel
On Wed, Aug 14, 2002 at 08:57:09PM +0100, Jamie Lokier wrote:
> H. Peter Anvin wrote:
> > Yes. This is a gcc-specific wart, a bad idea from the start, and
> > apparently one which has caught up with them to the point that they've
> > had to abandon it.
>
> It's still there in GCC 3.1.
Yes. If you check out the tree-ssa-branch, however (and use
appropriate commandline arguments), I think you'll find that it's no
longer there. That's the future of GCC's optimizer, but most of it
won't make even GCC 3.3.
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 20:45 ` Willy Tarreau
2002-08-14 20:58 ` H. Peter Anvin
@ 2002-08-14 22:53 ` David Lang
2002-08-14 23:31 ` Jamie Lokier
2002-08-15 6:44 ` Willy Tarreau
1 sibling, 2 replies; 27+ messages in thread
From: David Lang @ 2002-08-14 22:53 UTC (permalink / raw)
To: Willy Tarreau; +Cc: H. Peter Anvin, Rogier Wolff, linux-kernel
why are you useing loops for delays in the first place? it's a solution
that will fail as clock speeds keep improving (if for no other reason then
your loop counter will end up needing to be a larger int to achieve the
desired delay!!)
rather then debating how to convince gcc how to not optimize them away and
messing up the timing we should be talking about how to eliminate such
loops in the first place.
David Lang
On Wed, 14 Aug 2002, Willy Tarreau wrote:
> Date: Wed, 14 Aug 2002 22:45:56 +0200
> From: Willy Tarreau <willy@w.ods.org>
> To: H. Peter Anvin <hpa@zytor.com>
> Cc: Rogier Wolff <R.E.Wolff@BitWizard.nl>, linux-kernel@vger.kernel.org
> Subject: Re: [patch 4/21] fix ARCH_HAS_PREFETCH
>
> On Wed, Aug 14, 2002 at 12:41:04PM -0700, H. Peter Anvin wrote:
> > Rogier Wolff wrote:
> > >>
> > >>Bullsh*t. It can legitimately transform it into:
> > >>
> > >> i = N;
> > >
> > >
> > >Right! But people are confusing "practise", "published interface", and
> > >"spec" again.
> > >
> > >Published interface in this case is that gcc will not optimize an empty
> > >loop away, as it is often used to generate a timing loop.
> > >
> >
> > Yes. This is a gcc-specific wart, a bad idea from the start, and
> > apparently one which has caught up with them to the point that they've
> > had to abandon it.
>
> There would be a solution to tell gcc not to optimize things, which may
> not require too much work from gcc people. Basically, we would need to
> implement a __builtin_nop() function that would respect dependencies but
> not generate any code. This way, we could have :
>
> for (i=0; i<N, i++);
>
> optimized as i=N
> and
> for (i=0; i<N; i++)
> __builtin_nop();
> or even
> for (i=0; i<N; __builtin_nop(i++));
> do the real work.
>
> This way, some loops could be optimized, and the developpers could explicitely
> tell the compiler when they need to prevent any optimization.
>
> Cheers,
> Willy
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 22:53 ` David Lang
@ 2002-08-14 23:31 ` Jamie Lokier
2002-08-15 6:44 ` Willy Tarreau
1 sibling, 0 replies; 27+ messages in thread
From: Jamie Lokier @ 2002-08-14 23:31 UTC (permalink / raw)
To: David Lang; +Cc: linux-kernel
David Lang wrote:
> rather then debating how to convince gcc how to not optimize them away and
> messing up the timing we should be talking about how to eliminate such
> loops in the first place.
You misunderstand. We _do_ want gcc to optimize away empty loops.
-- Jamie
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 20:58 ` H. Peter Anvin
2002-08-14 21:11 ` Willy Tarreau
@ 2002-08-15 1:23 ` Alan Cox
2002-08-15 1:28 ` H. Peter Anvin
1 sibling, 1 reply; 27+ messages in thread
From: Alan Cox @ 2002-08-15 1:23 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Willy Tarreau, Rogier Wolff, linux-kernel
On Wed, 2002-08-14 at 21:58, H. Peter Anvin wrote:
>
> Since some processors now have "busy wait delay" instructions, this
> would also make it possible to do:
Nobody should be using an empty busy loop. If its a short timed busy
loop then they should be using udelay, if its a long one
schedule_timeout()
If its polling hardware then it isnt an empty loop
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-15 1:23 ` Alan Cox
@ 2002-08-15 1:28 ` H. Peter Anvin
2002-08-15 1:37 ` Alan Cox
0 siblings, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-15 1:28 UTC (permalink / raw)
To: Alan Cox; +Cc: Willy Tarreau, Rogier Wolff, linux-kernel
Alan Cox wrote:
> On Wed, 2002-08-14 at 21:58, H. Peter Anvin wrote:
>
>>Since some processors now have "busy wait delay" instructions, this
>>would also make it possible to do:
>
>
> Nobody should be using an empty busy loop. If its a short timed busy
> loop then they should be using udelay, if its a long one
> schedule_timeout()
>
Indeed.
> If its polling hardware then it isnt an empty loop
True indeed as well, although we should still have a busy_wait(); macro
that can insert whatever hint instruction the architecture might or
might not have.
-hpa
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-15 1:28 ` H. Peter Anvin
@ 2002-08-15 1:37 ` Alan Cox
0 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2002-08-15 1:37 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Willy Tarreau, Rogier Wolff, linux-kernel
On Thu, 2002-08-15 at 02:28, H. Peter Anvin wrote:
> True indeed as well, although we should still have a busy_wait(); macro
> that can insert whatever hint instruction the architecture might or
> might not have.
We have one - its called cpu_relax()
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-14 22:53 ` David Lang
2002-08-14 23:31 ` Jamie Lokier
@ 2002-08-15 6:44 ` Willy Tarreau
2002-08-15 6:55 ` H. Peter Anvin
1 sibling, 1 reply; 27+ messages in thread
From: Willy Tarreau @ 2002-08-15 6:44 UTC (permalink / raw)
To: David Lang; +Cc: Willy Tarreau, H. Peter Anvin, Rogier Wolff, linux-kernel
On Wed, Aug 14, 2002 at 03:53:15PM -0700, David Lang wrote:
> why are you useing loops for delays in the first place? it's a solution
> that will fail as clock speeds keep improving (if for no other reason then
> your loop counter will end up needing to be a larger int to achieve the
> desired delay!!)
>
> rather then debating how to convince gcc how to not optimize them away and
> messing up the timing we should be talking about how to eliminate such
> loops in the first place.
I'm NEVER using loops for delays, I find this awful and always a waste of
time. But sometimes, you want to benchmark some algorithms, and the compiler
makes this difficult by eliminating things it thinks are useless. I once had
this problem when comparing several linked lists algorithms, for example.
Moreover, some people complain about the fact that gcc doesn't optimize
everything because of people using loops for delays. If we provide convenient
ways to help the user tell gcc when not to optimize, gcc could optimize
everything possible by default.
Cheers,
Willy
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [patch 4/21] fix ARCH_HAS_PREFETCH
2002-08-15 6:44 ` Willy Tarreau
@ 2002-08-15 6:55 ` H. Peter Anvin
0 siblings, 0 replies; 27+ messages in thread
From: H. Peter Anvin @ 2002-08-15 6:55 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Lang, Rogier Wolff, linux-kernel
Willy Tarreau wrote:
>
> Moreover, some people complain about the fact that gcc doesn't optimize
> everything because of people using loops for delays. If we provide convenient
> ways to help the user tell gcc when not to optimize, gcc could optimize
> everything possible by default.
>
Well, again, such ways already exist.
-hpa
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2002-08-15 6:51 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-08-11 7:38 [patch 4/21] fix ARCH_HAS_PREFETCH Andrew Morton
2002-08-11 19:49 ` Alan Cox
2002-08-11 18:47 ` Andrew Morton
2002-08-11 18:46 ` Larry McVoy
2002-08-11 19:06 ` Linus Torvalds
2002-08-11 20:07 ` GCC still keeps empty loops? (was: [patch 4/21] fix ARCH_HAS_PREFETCH) Jamie Lokier
2002-08-12 7:45 ` Ralf Baechle
2002-08-13 21:42 ` [patch 4/21] fix ARCH_HAS_PREFETCH Adrian Bunk
2002-08-13 22:12 ` H. Peter Anvin
2002-08-13 22:22 ` Adrian Bunk
2002-08-13 22:42 ` H. Peter Anvin
2002-08-14 17:40 ` Rogier Wolff
2002-08-14 19:41 ` H. Peter Anvin
2002-08-14 19:57 ` Jamie Lokier
2002-08-14 22:03 ` Daniel Jacobowitz
2002-08-14 20:45 ` Willy Tarreau
2002-08-14 20:58 ` H. Peter Anvin
2002-08-14 21:11 ` Willy Tarreau
2002-08-14 21:13 ` H. Peter Anvin
2002-08-15 1:23 ` Alan Cox
2002-08-15 1:28 ` H. Peter Anvin
2002-08-15 1:37 ` Alan Cox
2002-08-14 22:53 ` David Lang
2002-08-14 23:31 ` Jamie Lokier
2002-08-15 6:44 ` Willy Tarreau
2002-08-15 6:55 ` H. Peter Anvin
[not found] <3D56147E.15E7A98@zip.com.au.suse.lists.linux.kernel>
2002-08-11 9:53 ` Andi Kleen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox