* [PATCH] micro-opt DEBUG_ADD_PAGE
       [not found] <Pine.LNX.4.21.0102071744440.5204-100000@localhost.localdomain>
@ 2001-02-07 18:00 ` Hugh Dickins
  2001-02-07 18:17   ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2001-02-07 18:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, linux-kernel

On Tue, 6 Feb 2001, Linus Torvalds wrote:
> > -		if (bh->b_size % correct_size) {
> > +		if (bh->b_size != correct_size) {
> 
> Actually, I'd rather leave it in, but speed it up with the saner and
> faster  	if (bh->b_size & (correct_size-1)) {
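
For reference, the masked test quoted above agrees with the modulo test only when correct_size is a power of two. A minimal sketch of that equivalence follows; the helper names are made up for illustration and are not kernel functions:

```c
#include <stdbool.h>

/* Hypothetical helper (not from the kernel): true if size is not a
 * multiple of unit, using the general modulo operator. */
static bool misaligned_mod(unsigned long size, unsigned long unit)
{
        return (size % unit) != 0;
}

/* Same question answered with a mask. This only gives the same
 * answer as the modulo form when unit is a power of two. */
static bool misaligned_mask(unsigned long size, unsigned long unit)
{
        return (size & (unit - 1)) != 0;
}
```

With a unit such as 3 the two forms disagree, so the mask is only a safe replacement where the size is known to be a power of two.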

Micro-optimization season?

--- linux-2.4.2-pre1/include/linux/swap.h	Wed Feb  7 15:21:13 2001
+++ linux/include/linux/swap.h	Wed Feb  7 17:21:25 2001
@@ -200,8 +200,8 @@
  * with the pagemap_lru_lock held!
  */
 #define DEBUG_ADD_PAGE \
-	if (PageActive(page) || PageInactiveDirty(page) || \
-					PageInactiveClean(page)) BUG();
+	if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+					(1<<PG_inactive_clean))) BUG();
 
 #define ZERO_PAGE_BUG \
 	if (page_count(page) == 0) BUG();

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-07 18:00 ` [PATCH] micro-opt DEBUG_ADD_PAGE Hugh Dickins
@ 2001-02-07 18:17   ` Linus Torvalds
  2001-02-07 20:42     ` Hugh Dickins
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2001-02-07 18:17 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Rik van Riel, linux-kernel



On Wed, 7 Feb 2001, Hugh Dickins wrote:
> 
> Micro-optimization season?

I'd rather not do these kinds of things that the compiler should be able
to trivially do for us.

(gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it
here? Did you check the code? Have you asked the gcc lists?)

		Linus


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-07 18:17   ` Linus Torvalds
@ 2001-02-07 20:42     ` Hugh Dickins
  2001-02-07 21:40       ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2001-02-07 20:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, Mark Hahn, David Howells, linux-kernel

On Wed, 7 Feb 2001, Linus Torvalds wrote:
> 
> I'd rather not do these kinds of things that the compiler should be able
> to trivially do for us.
> 
> (gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it
> here? Did you check the code? Have you asked the gcc lists?)

The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked
activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0
kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).

None of those optimizes this: I believe the semantics of "||" (don't
try next test if first succeeds) forbid the optimization "|" gives?

2.91 and 2.96 give three movs (two unnecessary), three tests,
three jumps (first two not usually taken):

 232:	8b 43 18             	mov    0x18(%ebx),%eax
 235:	a8 40                	test   $0x40,%al
 237:	75 0f                	jne    248 <activate_page_nolock+0x4c>
 239:	8b 43 18             	mov    0x18(%ebx),%eax
 23c:	a8 80                	test   $0x80,%al
 23e:	75 08                	jne    248 <activate_page_nolock+0x4c>
 240:	8b 43 18             	mov    0x18(%ebx),%eax
 243:	f6 c4 08             	test   $0x8,%ah
 246:	74 19                	je     261 <activate_page_nolock+0x65>

2.97 is jumpier: mov and je mov test jne mov test jne jmp.
That looks worse to me: David, earlier on you advertized
	http://www.codesourcery.com/gcc-snapshots/
Is this something worth your pursuing with the gcc guys?

Hugh

--- linux-2.4.2-pre1/include/linux/swap.h	Wed Feb  7 15:21:13 2001
+++ linux/include/linux/swap.h	Wed Feb  7 17:21:25 2001
@@ -200,8 +200,8 @@
  * with the pagemap_lru_lock held!
  */
 #define DEBUG_ADD_PAGE \
-	if (PageActive(page) || PageInactiveDirty(page) || \
-					PageInactiveClean(page)) BUG();
+	if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+					(1<<PG_inactive_clean))) BUG();
 
 #define ZERO_PAGE_BUG \
 	if (page_count(page) == 0) BUG();


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-07 20:42     ` Hugh Dickins
@ 2001-02-07 21:40       ` Linus Torvalds
  2001-02-08  1:24         ` Kai Germaschewski
  2001-02-08 16:24         ` Hugh Dickins
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2001-02-07 21:40 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Rik van Riel, Mark Hahn, David Howells, linux-kernel



On Wed, 7 Feb 2001, Hugh Dickins wrote:
> 
> The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked
> activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0
> kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).
> 
> None of those optimizes this: I believe the semantics of "||" (don't
> try next test if first succeeds) forbid the optimization "|" gives?

No. The optimization is entirely legal - but the fact that
"constant_test_bit()" uses a "volatile unsigned int *" is the reason why
gcc thinks it can't optimize it.

Oh, well. That "volatile" is really totally bogus. But it's there because
there are probably drivers that do

	while (test_bit(...))
		/* nothing */;

and the compiler would optimize it away a bit too much without the
volatile. Dang.

You could try to remove the volatile from test_bit, and see if that fixes
it - but then we'd have to find and add the proper "rmb()" calls to people
who do the endless loop kind of thing like above.
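
A rough user-space sketch of what such a loop might look like once the volatile is gone. It assumes GCC-style inline asm, uses a compiler-only barrier as a stand-in for the kernel's rmb() (which may also order the CPU's reads), and every name in it is hypothetical:

```c
/* Compiler-only barrier (GCC syntax): the compiler must assume memory
 * changed, so later reads are reloaded. It does not order CPU reads
 * the way a real rmb() may. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical non-volatile bit test, in the spirit of
 * constant_test_bit() but without the volatile cast. */
static int plain_test_bit(int nr, const unsigned int *addr)
{
        return ((1U << (nr & 31)) & addr[nr >> 5]) != 0;
}

/* Poll until bit nr clears or max_spins polls have been made.
 * Returns the number of polls that saw the bit still set. */
static int spin_while_set(int nr, const unsigned int *addr, int max_spins)
{
        int spins = 0;

        while (spins < max_spins && plain_test_bit(nr, addr)) {
                compiler_barrier();     /* force a reload on the next pass */
                spins++;
        }
        return spins;
}
```

The spin limit is only there so the sketch terminates on its own; a driver loop would wait on whatever actually clears the bit.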

		Linus


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-07 21:40       ` Linus Torvalds
@ 2001-02-08  1:24         ` Kai Germaschewski
  2001-02-08 16:24         ` Hugh Dickins
  1 sibling, 0 replies; 14+ messages in thread
From: Kai Germaschewski @ 2001-02-08  1:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Rik van Riel, Mark Hahn, David Howells,
	linux-kernel

On Wed, 7 Feb 2001, Linus Torvalds wrote:

> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

This thing did attract me somewhat and I decided to learn a little about 
compilers.

Result: Unfortunately it's not just the volatile, there's a bunch of 
conditions you have to fulfill to have the compiler optimize this. (Sounds 
like work for the compiler guys).

Test program is attached; inspecting the code (egcs 2.91.66 and
gcc-2.96 (-69) generate the same code) gives the following conclusions:

- f1(unsigned long f): manually optimized

	if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {

  -> optimized code (of course)


- f2(unsigned long f): leave some work to the compiler

	if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {

  -> optimized code (good)


- f3(unsigned int f): use constant_test_bit macro
	
	  if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || 
	      constant_test_bit(4, &f)) {
 
  -> optimized code

  where

	#define constant_test_bit(nr, addr) \
	(((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)

  (doesn't optimize when putting *const* unsigned int there)

- f4: same thing as f3, but use (unsigned long f) instead of 
  (unsigned int f)
  
  -> no optimization

- f5: same thing as f3, but use inline function for constant_test_bit

  -> no optimization

- f6: same thing as f3, but use test_bit instead of constant_test_bit,
  where

	#define test_bit(nr,addr) \
	(__builtin_constant_p(nr) ? \
	constant_test_bit((nr),(addr)) : \
	variable_test_bit((nr),(addr)))

  -> no optimization


Conclusion: With the compilers tested, lots of cases are not optimized
although they could be in theory:
- casting even from unsigned int to unsigned long breaks optimization
- macros are better than inline
- Even though evaluated at compile time, __builtin_constant_p breaks
  optimization here, too.

BTW: volatile makes optimization impossible as well, of course, it leads 
to repeated reloads of the variable, whereas otherwise it's cached in a 
register in the above "no optimization" cases. That's expected behavior.
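
As a side check on why merging the tests is legal in the f2 case: the operands have no side effects, so the short-circuit form and the single-mask form always give the same answer. A small sketch that compares them exhaustively over a range (function names are invented for this illustration):

```c
/* Three separate tests joined by "||", as in f2 above. */
static int any_set_separate(unsigned long f)
{
        return (f & (1UL << 1)) || (f & (1UL << 2)) || (f & (1UL << 4));
}

/* One test against the combined mask, as in f1 above. */
static int any_set_merged(unsigned long f)
{
        return (f & ((1UL << 1) | (1UL << 2) | (1UL << 4))) != 0;
}

/* Returns 1 if both forms agree for every value below limit. */
static int forms_agree(unsigned long limit)
{
        unsigned long f;

        for (f = 0; f < limit; f++)
                if (any_set_separate(f) != any_set_merged(f))
                        return 0;
        return 1;
}
```

This says nothing about whether a given gcc actually performs the merge; that still has to be read off the generated assembly.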

--Kai

Test code:
----------

#define ADDR (*(volatile long *) addr)

static __inline__ int inl_constant_test_bit(int nr, const void * addr)
{
        return ((1UL << (nr & 31)) & (((unsigned int *) addr)[nr >> 5])) != 0;
}

#define constant_test_bit(nr, addr) (((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)

static __inline__ int variable_test_bit(int nr, volatile void * addr)
{
        int oldbit;

        __asm__ __volatile__(
                "btl %2,%1\n\tsbbl %0,%0"
                :"=r" (oldbit)
                :"m" (ADDR),"Ir" (nr));
        return oldbit;
}

#define test_bit(nr,addr) \
(__builtin_constant_p(nr) ? \
 constant_test_bit((nr),(addr)) : \
 variable_test_bit((nr),(addr)))




int f1(unsigned long f)
{
  if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {
    return 1;
  }
  return 0;
}

int f2(unsigned long f)
{
  if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {
    return 1;
  }
  return 0;
}

int f3(unsigned int f)
{
  if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
    return 1;
  }
  return 0;
}

int f4(unsigned long f)
{
  if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
    return 1;
  }
  return 0;
}

int f5(unsigned int f)
{
  if (inl_constant_test_bit(1, &f) || inl_constant_test_bit(2, &f) || inl_constant_test_bit(4, &f)) {
    return 1;
  }
  return 0;
}

int f6(unsigned int f)
{
  if (test_bit(1, &f) || test_bit(2, &f) || test_bit(4, &f)) {
    return 1;
  }
  return 0;
}


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-07 21:40       ` Linus Torvalds
  2001-02-08  1:24         ` Kai Germaschewski
@ 2001-02-08 16:24         ` Hugh Dickins
  2001-02-08 16:37           ` David Weinehall
  2001-02-08 17:02           ` Richard B. Johnson
  1 sibling, 2 replies; 14+ messages in thread
From: Hugh Dickins @ 2001-02-08 16:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, Mark Hahn, David Howells, linux-kernel

On Wed, 7 Feb 2001, Linus Torvalds wrote:
> On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > 
> > None of those optimizes this: I believe the semantics of "||" (don't
> > try next test if first succeeds) forbid the optimization "|" gives?
> 
> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
with three "mov"s.  But take the "volatile"s out of constant_test_bit(),
and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
jumps - which is what originally offended me.

But Mark (in test program in private mail) shows gcc combining bits
into one test and one jump, just as we'd hope (and I wrongly thought
forbidden).  Perhaps the inline function nature of constant_test_bit()
(which Mark didn't use) gets in the way of combining those tests.

> You could try to remove the volatile from test_bit, and see if that fixes
> it - but then we'd have to find and add the proper "rmb()" calls to people
> who do the endless loop kind of thing like above.

That is not an inviting path to me, at least not any time soon!

I think this all argues for the little patch I suggested - just avoid
test_bit() here.  But it was only intended as a quick little suggestion:
looks like our tastes differ, and you prefer taking the _tiny_ hit of
using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Hugh


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 16:24         ` Hugh Dickins
@ 2001-02-08 16:37           ` David Weinehall
  2001-02-08 16:56             ` Hugh Dickins
  2001-02-08 17:02           ` Richard B. Johnson
  1 sibling, 1 reply; 14+ messages in thread
From: David Weinehall @ 2001-02-08 16:37 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Rik van Riel, Mark Hahn, David Howells,
	linux-kernel

On Thu, Feb 08, 2001 at 04:24:23PM +0000, Hugh Dickins wrote:
> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > > 
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> > 
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
> 
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s.  But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
> 
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden).  Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
> 
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
> 
> That is not an inviting path to me, at least not any time soon!
> 
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here.  But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Well, after all, it's debugging code, and the code now is easy to read.
Your code, while more efficient, isn't. I think that clarity takes
priority over efficiency in non-critical code such as debugging
code. Of course, this is my personal opinion...


/David
  _                                                                 _
 // David Weinehall <tao@acc.umu.se> /> Northern lights wander      \\
//  Project MCA Linux hacker        //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/    </   Full colour fire           </

* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 16:37           ` David Weinehall
@ 2001-02-08 16:56             ` Hugh Dickins
  2001-02-08 17:02               ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2001-02-08 16:56 UTC (permalink / raw)
  To: David Weinehall
  Cc: Linus Torvalds, Rik van Riel, Mark Hahn, David Howells,
	linux-kernel

On Thu, 8 Feb 2001, David Weinehall wrote:
> 
> Well, after all, it's debugging code, and the code now is easy to read.
> Your code, while more efficient, isn't. I think that clarity takes
> priority over efficiency in non-critical code such as debugging
> code. Of course, this is my personal opinion...

I agree my version isn't _as_ easy, and if this code only got built
into DEBUG kernels, I would never have bothered about it; but it's
built into every kernel, on executed paths, so it's no less critical.

Hugh


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 16:24         ` Hugh Dickins
  2001-02-08 16:37           ` David Weinehall
@ 2001-02-08 17:02           ` Richard B. Johnson
  2001-02-08 17:19             ` Stephen Wille Padnos
  1 sibling, 1 reply; 14+ messages in thread
From: Richard B. Johnson @ 2001-02-08 17:02 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Rik van Riel, Mark Hahn, David Howells,
	linux-kernel

On Thu, 8 Feb 2001, Hugh Dickins wrote:

> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > > 
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> > 
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
> 
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s.  But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
> 
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden).  Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
> 
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
> 
> That is not an inviting path to me, at least not any time soon!
> 
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here.  But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.
> 

The use of the key word 'volatile' has gone just a bit too far in
some cases.

given:
void funct(void)
{
   volatile unsigned int x;	/* any local variable */
}

Is plain dumb. There is nobody else that can touch that local
variable except the code in funct(). Even if it's recursive,
the Nth invocation still can't (using legal 'C' code) touch
that variable. Therefore, it should not be declared volatile.


Another problem with 'volatile' has to do with pointers. When
it's possible for some object to be modified by some external
influence, we see:

	volatile struct whatever *ptr;

Now, it's unclear if gcc knows that we don't give a damn about
the address contained in 'ptr'. We know that it's not going to
change. What we are concerned with are the items within the
'struct whatever'. From what I've seen, gcc just reloads the
pointer.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.



* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 16:56             ` Hugh Dickins
@ 2001-02-08 17:02               ` Rik van Riel
  0 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2001-02-08 17:02 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: David Weinehall, Linus Torvalds, Mark Hahn, David Howells,
	linux-kernel

On Thu, 8 Feb 2001, Hugh Dickins wrote:
> On Thu, 8 Feb 2001, David Weinehall wrote:
> > 
> > Well, after all, it's debugging code, and the code now is easy to read.
> > Your code, while more efficient, isn't. I think that clarity takes
> > priority over efficiency in non-critical code such as debugging
> > code. Of course, this is my personal opinion...
> 
> I agree my version isn't _as_ easy, and if this code only got built
> into DEBUG kernels, I would never have bothered about it; but it's
> built into every kernel, on executed paths, so it's no less critical.

Since it's DEBUG code only and nicely "hidden" in a .h file,
why not have the efficient code with a well-written comment
documenting what the code does and why it is there ?

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/


* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 17:02           ` Richard B. Johnson
@ 2001-02-08 17:19             ` Stephen Wille Padnos
  2001-02-08 17:57               ` Richard B. Johnson
  0 siblings, 1 reply; 14+ messages in thread
From: Stephen Wille Padnos @ 2001-02-08 17:19 UTC (permalink / raw)
  To: root
  Cc: Hugh Dickins, Linus Torvalds, Rik van Riel, Mark Hahn,
	David Howells, linux-kernel

"Richard B. Johnson" wrote:
[snip]
> Another problem with 'volatile' has to do with pointers. When
> it's possible for some object to be modified by some external
> influence, we see:
> 
>         volatile struct whatever *ptr;
> 
> Now, it's unclear if gcc knows that we don't give a damn about
> the address contained in 'ptr'. We know that it's not going to
> change. What we are concerned with are the items within the
> 'struct whatever'. From what I've seen, gcc just reloads the
> pointer.
> 
> Cheers,
> Dick Johnson
> 
gcc should treat
volatile struct whatever *ptr;

as a different case than
struct whatever * volatile ptr;

which is also different from
volatile struct whatever * volatile ptr;

I think (but can't find my K&R C book to confirm :) that the first case
declares the struct as volatile, and the second case declares the
pointer volatile (the third case declares a volatile pointer to a
structure with volatile parts).  So, the programmer should have the
choice, if gcc is dealing with volatile correctly.

Of course, that doesn't mean that the authors have made the right choice
:)
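
To make the three placements concrete, here is a small sketch. The struct layout and function names are assumptions for illustration only, and it shows what each declaration means in C, not what any particular gcc version emits:

```c
struct whatever { int status; };        /* hypothetical layout */

/* Pointer to volatile data: each p->status is a separate load. */
static int sum_two_reads(volatile struct whatever *p)
{
        return p->status + p->status;
}

/* Volatile pointer to ordinary data: the pointer object itself is
 * re-read on each use; the pointed-to data carries no such rule. */
static int read_via_volatile_ptr(struct whatever *target)
{
        struct whatever * volatile vp = target;

        return vp->status;
}

/* Both: a volatile pointer to volatile data. */
static int read_via_both(struct whatever *target)
{
        volatile struct whatever * volatile vp = target;

        return vp->status;
}
```

Reading the declaration right to left from the identifier gives the rule: a qualifier to the right of the `*` applies to the pointer, one to the left applies to what it points at.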

-- 
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
swpadnos@adelphia.net

* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 17:19             ` Stephen Wille Padnos
@ 2001-02-08 17:57               ` Richard B. Johnson
  2001-02-08 18:16                 ` Stephen Wille Padnos
  0 siblings, 1 reply; 14+ messages in thread
From: Richard B. Johnson @ 2001-02-08 17:57 UTC (permalink / raw)
  To: Stephen Wille Padnos
  Cc: Hugh Dickins, Linus Torvalds, Rik van Riel, Mark Hahn,
	David Howells, linux-kernel

On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> [snip]
> > Another problem with 'volatile' has to do with pointers. When
> > it's possible for some object to be modified by some external
> > influence, we see:
> > 
> >         volatile struct whatever *ptr;
> > 
> > Now, it's unclear if gcc knows that we don't give a damn about
> > the address contained in 'ptr'. We know that it's not going to
> > change. What we are concerned with are the items within the
> > 'struct whatever'. From what I've seen, gcc just reloads the
> > pointer.
> > 
> > Cheers,
> > Dick Johnson
> > 
> gcc should treat
> volatile struct whatever *ptr;
> 
> as a different case than
> struct whatever * volatile ptr;
> 
> which is also different from
> volatile struct whatever * volatile ptr;
> 
> I think (but can't find my K&R C book to confirm :) that the first case
> declares the struct as volatile, and the second case declares the
> pointer volatile (the third case declares a volatile pointer to a
> structure with volatile parts).  So, the programmer should have the
> choice, if gcc is dealing with volatile correctly.
> 
> Of course, that doesn't mean that the authors have made the right choice
> :)
> 

Yes. My point is that a lot of authors have declared just about everything
'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
be "safe". It's likely that there are many hundreds of thousands of
unneeded register-reloads because of this. 

It might be useful for somebody who has a lot of time on his/her
hands to go through some of these drivers.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.



* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 17:57               ` Richard B. Johnson
@ 2001-02-08 18:16                 ` Stephen Wille Padnos
  2001-02-08 19:32                   ` Richard B. Johnson
  0 siblings, 1 reply; 14+ messages in thread
From: Stephen Wille Padnos @ 2001-02-08 18:16 UTC (permalink / raw)
  To: root
  Cc: Hugh Dickins, Linus Torvalds, Rik van Riel, Mark Hahn,
	David Howells, linux-kernel

"Richard B. Johnson" wrote:
> 
> On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
> 
> > "Richard B. Johnson" wrote:
> > [snip]
> > > Another problem with 'volatile' has to do with pointers. When
> > > it's possible for some object to be modified by some external
> > > influence, we see:
> > >
> > >         volatile struct whatever *ptr;
> > >
> > > Now, it's unclear if gcc knows that we don't give a damn about
> > > the address contained in 'ptr'. We know that it's not going to
> > > change. What we are concerned with are the items within the
> > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > pointer.
> > >
[snip]

> Yes. My point is that a lot of authors have declared just about everything
> 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> be "safe". It's likely that there are many hundreds of thousands of
> unneeded register-reloads because of this.
> 
> It might be useful for somebody who has a lot of time on his/her
> hands to go through some of these drivers.

I would be willing to do this (on the slow boat - I don't have THAT much
spare time :), but only if we can be sure that the gcc optimizer will
correctly handle a normal pointer to volatile data.  Your experiences
would seem to indicate that the optimizer needs fixing before much
effort should be spent on this.

-- 
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
swpadnos@adelphia.net

* Re: [PATCH] micro-opt DEBUG_ADD_PAGE
  2001-02-08 18:16                 ` Stephen Wille Padnos
@ 2001-02-08 19:32                   ` Richard B. Johnson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard B. Johnson @ 2001-02-08 19:32 UTC (permalink / raw)
  To: Stephen Wille Padnos
  Cc: Hugh Dickins, Linus Torvalds, Rik van Riel, Mark Hahn,
	David Howells, linux-kernel

On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> > 
> > On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
> > 
> > > "Richard B. Johnson" wrote:
> > > [snip]
> > > > Another problem with 'volatile' has to do with pointers. When
> > > > it's possible for some object to be modified by some external
> > > > influence, we see:
> > > >
> > > >         volatile struct whatever *ptr;
> > > >
> > > > Now, it's unclear if gcc knows that we don't give a damn about
> > > > the address contained in 'ptr'. We know that it's not going to
> > > > change. What we are concerned with are the items within the
> > > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > > pointer.
> > > >
> [snip]
> 
> > Yes. My point is that a lot of authors have declared just about everything
> > 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> > be "safe". It's likely that there are many hundreds of thousands of
> > unneeded register-reloads because of this.
> > 
> > It might be useful for somebody who has a lot of time on his/her
> > hands to go through some of these drivers.
> 
> I would be willing to do this (on the slow boat - I don't have THAT much
> spare time :), but only if we can be sure that the gcc optimizer will
> correctly handle a normal pointer to volatile data.  Your experiences
> would seem to indicate that the optimizer needs fixing before much
> effort should be spent on this.
> 

Well the question for that is; "What compiler?". I'm currently
using egcs-2.91.66, one of the "approved" versions for compiling
the kernel. It treats all volatiles about the same:


volatile int i;
volatile int *p;
int volatile *q;
volatile int * volatile r;

void foo()
{
   while(*p == i) 
       ;
   while(*q == i) 
       ;
   while(*r == i) 
       ;
}
...makes :


	.file	"main.c"
	.version	"01.01"
gcc2_compiled.:
.text
	.align 4
.globl foo
	.type	 foo,@function
foo:
	pushl %ebp
	movl %esp,%ebp
	nop
	.align 4
.L2:
	movl p,%eax
	movl (%eax),%edx
	movl i,%eax
	cmpl %eax,%edx
	je .L4
	jmp .L3
	.align 4
.L4:
	jmp .L2
	.align 4
.L3:
	nop
	.align 4
.L5:
	movl q,%eax
	movl (%eax),%edx
	movl i,%eax
	cmpl %eax,%edx
	je .L7
	jmp .L6
	.align 4
.L7:
	jmp .L5
	.align 4
.L6:
	nop
	.align 4
.L8:
	movl r,%eax
	movl (%eax),%edx
	movl i,%eax
	cmpl %eax,%edx
	je .L10
	jmp .L9
	.align 4
.L10:
	jmp .L8
	.align 4
.L9:
.L1:
	movl %ebp,%esp
	popl %ebp
	ret
.Lfe1:
	.size	 foo,.Lfe1-foo
	.comm	i,4,4
	.comm	p,4,4
	.comm	q,4,4
	.comm	r,4,4
	.ident	"GCC: (GNU) egcs-2.91.66 19990314 (egcs-1.1.2 release)"


Since there seems to be a rather big difference between what is
expected to be done, and what happens to be the result, this
certainly contributes to the possible over-use of 'volatile' in
some kernel code. 

It's certainly better to be safe than sorry, but in some cases "safe"
is just a bit "strange". FYI, ../linux/drivers/net/atp.c doesn't use
'volatile' at all. However, ../linux/drivers/net/bmac.c uses it 40
times. I'll bet a buck that both of the drivers work and the one
without 'volatile' keywords does the work with fewer instructions.

These are just two drivers chosen at random. The driver I've been
working on to make  'bullet proof', pcnet32.c uses 'volatile' twice.
And, at least in one occasion, the wrong thing is declared volatile
(the value in a pointer to a structure ), however gcc doesn't seem
to care because it reloads the values of the structure members every time,
anyway. So, in this case, the address-value in the pointer will never
change, but gcc reloads all the pointed-to members anyway, so the
'volatile' keyword is not useful.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.

