linuxppc-dev.lists.ozlabs.org archive mirror
* Inline Assembly queries
@ 2009-06-27 19:46 kernel mailz
       [not found] ` <abe8a1fd0906271249k479e5a87gfe1ee9c02798a234@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread
From: kernel mailz @ 2009-06-27 19:46 UTC (permalink / raw)
  To: gcc-help, gcc-help-help, linuxppc-dev

Hello all the gurus,

I've been trying my luck with gcc 4.3.2 inline assembly on powerpc
and have a few queries.

1. asm volatile and plain asm produce the same assembly code.
I tried a few examples but didn't find any difference from adding
volatile to asm.

2. Use of "memory" and clobbered registers.

"memory" -
a. announces to the compiler that memory has been modified
b. says this instruction writes to some memory (other than a listed output)
and GCC shouldn't cache memory values in registers across this asm.

I tried with the stw and stwcx. instructions; adding "memory" had no effect.

Is there any example scenario where gcc would generate different
assembly when "memory" is added or removed?


-TZ

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
       [not found]   ` <m3ab3t4623.fsf@google.com>
@ 2009-06-28  4:57     ` kernel mailz
  2009-06-29 15:49       ` kernel mailz
  2009-06-29 15:57       ` David Howells
  0 siblings, 2 replies; 13+ messages in thread
From: kernel mailz @ 2009-06-28  4:57 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help, linuxppc-dev

Thanks Ian,
For the "memory" clobber
I tried with a function from the Linux kernel

--
/*
 * Atomic exchange
 *
 * Changes the memory location '*ptr' to be val and returns
 * the previous value stored there.
 */
static inline unsigned long
__xchg_u32(volatile void *p, unsigned long val)
{
        unsigned long prev;

        __asm__ __volatile__(

"1:     lwarx   %0,0,%2 \n"

"       stwcx.  %3,0,%2 \n\
        bne-    1b"

        : "=&r" (prev), "+m" (*(volatile unsigned int *)p)
        : "r" (p), "r" (val)
        /* clobber list toggled for the comparison: */
        /* : "memory", "cc" */
        );

        return prev;
}
#define ADDR 0x1000
int main()
{
	__xchg_u32((void*)ADDR, 0x2000);
	__xchg_u32((void*)ADDR, 0x3000);

	return 0;

}

Got the same asm when compiled with -O1, with and without the "memory" clobber:

100003fc <main>:
100003fc:       39 20 10 00     li      r9,4096
10000400:       38 00 20 00     li      r0,8192
10000404:       7d 60 48 28     lwarx   r11,0,r9
10000408:       7c 00 49 2d     stwcx.  r0,0,r9
1000040c:       40 a2 ff f8     bne-    10000404 <main+0x8>
10000410:       38 00 30 00     li      r0,12288
10000414:       7d 60 48 28     lwarx   r11,0,r9
10000418:       7c 00 49 2d     stwcx.  r0,0,r9
1000041c:       40 a2 ff f8     bne-    10000414 <main+0x18>
10000420:       38 60 00 00     li      r3,0
10000424:       4e 80 00 20     blr

No difference?
Am I choosing the right example?

-TZ


On Sun, Jun 28, 2009 at 4:50 AM, Ian Lance Taylor<iant@google.com> wrote:
> Please never send a message to both gcc@gcc.gnu.org and
> gcc-help@gcc.gnu.org.  This message is appropriate for
> gcc-help@gcc.gnu.org, not for gcc@gcc.gnu.org.  Thanks.
>
> An asm with no outputs is always considered to be volatile.  To see the
> effect of volatile, just try something like
>    asm ("# modify %0" : "=r" (i) : /* no inputs */ : /* no clobbers */);
> Try it with and without optimization.
>
> As the documentation says, the effect of adding a "memory" clobber is
> that gcc does not cache values in registers across the asm.  So the
> effect will be shown in something like
>  int i = *p;
>  asm volatile ("# read %0" : : "r" (i));
>  return *p;
> The memory clobber will only make a difference when optimizing.
>
> Ian
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Inline Assembly queries
  2009-06-28  4:57     ` kernel mailz
@ 2009-06-29 15:49       ` kernel mailz
  2009-06-29 19:27         ` Scott Wood
  2009-06-29 21:29         ` Ian Lance Taylor
  2009-06-29 15:57       ` David Howells
  1 sibling, 2 replies; 13+ messages in thread
From: kernel mailz @ 2009-06-29 15:49 UTC (permalink / raw)
  To: gcc-help, linuxppc-dev

I tried a small example

int *p = 0x1000;
int a = *p;
asm("sync":::"memory");
a = *p;

and

volatile int *p = 0x1000;
int a = *p;
asm("sync");
a = *p;

Got the same assembly.
Which is right.

So does it mean that if volatile is used properly, there is no need
for "memory"?
But then why does the __xchg example from my earlier mail use both?

I am confused!
Does anyone have a clue?

-TZ



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-28  4:57     ` kernel mailz
  2009-06-29 15:49       ` kernel mailz
@ 2009-06-29 15:57       ` David Howells
  2009-06-29 21:27         ` Ian Lance Taylor
  2009-06-30 10:43         ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 13+ messages in thread
From: David Howells @ 2009-06-29 15:57 UTC (permalink / raw)
  To: kernel mailz; +Cc: gcc-help, linuxppc-dev

kernel mailz <kernelmailz@googlemail.com> wrote:

> asm("sync");

Isn't gcc free to discard this as it has no dependencies, no indicated side
effects, and isn't required to be kept?  I think this should probably be:

	asm volatile("sync");

David

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 15:49       ` kernel mailz
@ 2009-06-29 19:27         ` Scott Wood
  2009-06-30  5:27           ` kernel mailz
  2009-06-29 21:29         ` Ian Lance Taylor
  1 sibling, 1 reply; 13+ messages in thread
From: Scott Wood @ 2009-06-29 19:27 UTC (permalink / raw)
  To: kernel mailz; +Cc: gcc-help, linuxppc-dev

On Mon, Jun 29, 2009 at 09:19:57PM +0530, kernel mailz wrote:
> I tried a small example
> 
> int *p = 0x1000;
> int a = *p;
> asm("sync":::"memory");
> a = *p;
> 
> and
> 
> volatile int *p = 0x1000;
> int a = *p;
> asm("sync");
> a = *p
> 
> Got the same assembly.
> Which is right.
> 
> So does it mean, if proper use of volatile is done, there is no need
> of "memory" ?

No.  As I understand it, volatile concerns deletion of the asm statement
(if no outputs are used) and reordering with respect to other asm
statements (not sure whether GCC will actually do this), while the memory
clobber concerns optimization of non-asm loads/stores around the asm
statement.
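
A minimal sketch of that distinction, assuming a GCC-compatible compiler.
The asm templates are left empty so that it builds on any target (on
powerpc one would put "sync" there), and the function names are made up
for illustration:

```c
/* Kept by the compiler (volatile), but says nothing about memory. */
static inline void fence_volatile_only(void)
{
        __asm__ __volatile__("");
}

/* Also tells GCC that memory may have changed behind its back. */
static inline void fence_with_memory(void)
{
        __asm__ __volatile__("" : : : "memory");
}

int sum_around_volatile_only(int *p)
{
        int a = *p;
        fence_volatile_only();
        return a + *p;  /* GCC may reuse the first load here */
}

int sum_around_memory_clobber(int *p)
{
        int a = *p;
        fence_with_memory();
        return a + *p;  /* *p has to be loaded again here */
}
```

Comparing the optimized assembly of the two functions (for example with
gcc -O2 -S) is one way to look for the extra load; whether the first
function actually drops it depends on the GCC version.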

> static inline unsigned long
> __xchg_u32(volatile void *p, unsigned long val)
> {
>        unsigned long prev;
> 
>        __asm__ __volatile__(
> 
> "1:     lwarx   %0,0,%2 \n"
> 
> "       stwcx.  %3,0,%2 \n\
>        bne-    1b"
> 
>        : "=&r" (prev), "+m" (*(volatile unsigned int *)p)
>        : "r" (p), "r" (val)
> //        :"memory","cc");
> 
>        return prev;
> }
> #define ADDR 0x1000
> int main()
> {
>        __xchg_u32((void*)ADDR, 0x2000);
>        __xchg_u32((void*)ADDR, 0x3000);
> 
>        return 0;
> 
> }
> 
> Got the same asm, when compiled with O1 , with / without "memory" clobber

This isn't a good test case, because there's nothing other than inline
asm going on in that function for GCC to optimize.  Plus, it's generally
not a good idea, when talking about what the compiler is or isn't allowed
to do, to point to a single test case (or even several) and say that it
isn't required because you don't notice a difference.  Even if there were
no code at all with which it made a difference with GCC version X, it
could make a difference with GCC version X+1.
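
A test case therefore needs ordinary C loads or stores next to the asm.
A hedged sketch with plain stores, again using an empty template in place
of the powerpc instruction and invented names:

```c
static int mailbox;     /* plain, non-volatile variable */

void post_without_clobber(void)
{
        mailbox = 1;
        __asm__ __volatile__("");       /* no "memory" clobber */
        mailbox = 2;    /* GCC may drop the dead store of 1 above */
}

void post_with_clobber(void)
{
        mailbox = 1;    /* has to reach memory before the asm */
        __asm__ __volatile__("" : : : "memory");
        mailbox = 2;
}

int read_mailbox(void)
{
        return mailbox;
}
```

Both functions leave 2 in mailbox; the difference, if any, is only in
whether the first store is emitted.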

-Scott

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 15:57       ` David Howells
@ 2009-06-29 21:27         ` Ian Lance Taylor
  2009-06-30 10:43         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 13+ messages in thread
From: Ian Lance Taylor @ 2009-06-29 21:27 UTC (permalink / raw)
  To: David Howells; +Cc: gcc-help, linuxppc-dev, kernel mailz

David Howells <dhowells@redhat.com> writes:

> kernel mailz <kernelmailz@googlemail.com> wrote:
>
>> asm("sync");
>
> Isn't gcc free to discard this as it has no dependencies, no indicated side
> effects, and isn't required to be kept?  I think this should probably be:
>
> 	asm volatile("sync");

An asm with no outputs is considered to be volatile.
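
To illustrate the rule (a sketch, not taken from the GCC sources): an asm
whose only effect is an output that nobody reads may be deleted unless it
is marked volatile, while an asm with no outputs is kept.  The empty
templates and function names are assumptions made for the example:

```c
int plain_asm_unused_output(int x)
{
        int i = x;
        __asm__("" : "+r" (i));         /* i is never read: may be deleted */
        return x;
}

int volatile_asm_unused_output(int x)
{
        int i = x;
        __asm__ __volatile__("" : "+r" (i));    /* kept although i is dead */
        return x;
}

int no_output_asm(int x)
{
        __asm__("" : : "r" (x));        /* no outputs: implicitly volatile */
        return x;
}
```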

Ian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 15:49       ` kernel mailz
  2009-06-29 19:27         ` Scott Wood
@ 2009-06-29 21:29         ` Ian Lance Taylor
  2009-06-30  5:53           ` kernel mailz
  1 sibling, 1 reply; 13+ messages in thread
From: Ian Lance Taylor @ 2009-06-29 21:29 UTC (permalink / raw)
  To: kernel mailz; +Cc: gcc-help, linuxppc-dev

kernel mailz <kernelmailz@googlemail.com> writes:

> I tried a small example
>
> int *p = 0x1000;
> int a = *p;
> asm("sync":::"memory");
> a = *p;
>
> and
>
> volatile int *p = 0x1000;
> int a = *p;
> asm("sync");
> a = *p
>
> Got the same assembly.
> Which is right.
>
> So does it mean, if proper use of volatile is done, there is no need
> of "memory" ?

You have to consider the effects of inlining, which may bring in other
memory loads and stores through non-volatile pointers.
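
A rough sketch of that situation, with hypothetical names and an empty
asm template standing in for the powerpc barrier instruction.  The helper
only touches a volatile pointer, but once it is inlined the caller's plain
store sits right next to the asm:

```c
/* The volatile qualifier orders only the doorbell access itself. */
static inline void ring_doorbell(volatile unsigned int *doorbell)
{
        /* On powerpc the template would hold a barrier instruction. */
        __asm__ __volatile__("" : : : "memory");
        *doorbell = 1;
}

unsigned int fill_and_ring(unsigned int *buf, volatile unsigned int *doorbell)
{
        buf[0] = 0xAB;                  /* plain store from the caller */
        ring_doorbell(doorbell);        /* inlined here; the clobber keeps
                                         * the buf[0] store ahead of the asm */
        return buf[0];
}
```

Without the clobber, GCC would be allowed to sink the store to buf[0]
below the doorbell write, since buf is not volatile.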

Ian

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 19:27         ` Scott Wood
@ 2009-06-30  5:27           ` kernel mailz
  2009-06-30 10:41             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 13+ messages in thread
From: kernel mailz @ 2009-06-30  5:27 UTC (permalink / raw)
  To: Scott Wood; +Cc: gcc-help, linuxppc-dev

Hi Scott,
I agree with you, and I more or less understand that it is required.
But buddy, unless you see a construct at work, or see a visible difference
from adding it, the concept remains just a piece of theory.

I am going through all the inline assembly in the kernel code to find an
example that behaves differently with "memory".

For instance, take atomic_add and atomic_add_return: atomic_add_return
has the "memory" clobber, while atomic_add omits it.

-TZ


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 21:29         ` Ian Lance Taylor
@ 2009-06-30  5:53           ` kernel mailz
  2009-06-30  9:30             ` Andrew Haley
  2009-06-30  9:52             ` Paul Mackerras
  0 siblings, 2 replies; 13+ messages in thread
From: kernel mailz @ 2009-06-30  5:53 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help, linuxppc-dev

Consider atomic_add and atomic_add_return in kernel code.

On Tue, Jun 30, 2009 at 2:59 AM, Ian Lance Taylor<iant@google.com> wrote:
> kernel mailz <kernelmailz@googlemail.com> writes:
>
>> I tried a small example
>>
>> int *p = 0x1000;
>> int a = *p;
>> asm("sync":::"memory");
>> a = *p;
>>
>> and
>>
>> volatile int *p = 0x1000;
>> int a = *p;
>> asm("sync");
>> a = *p
>>
>> Got the same assembly.
>> Which is right.
>>
>> So does it mean, if proper use of volatile is done, there is no need
>> of "memory" ?
>
> You have to consider the effects of inlining, which may bring in other
> memory loads and stores through non-volatile pointers.
>
> Ian
>
Consider

static __inline__ void atomic_add(int a, atomic_t *v)
{
        int t;

        __asm__ __volatile__(
"1:     lwarx   %0,0,%3         # atomic_add\n\
        add     %0,%2,%0\n"
        PPC405_ERR77(0,%3)
"       stwcx.  %0,0,%3 \n\
        bne-    1b"
        : "=&r" (t), "+m" (v->counter)
        : "r" (a), "r" (&v->counter)
        : "cc");
}

static __inline__ int atomic_add_return(int a, atomic_t *v)
{
        int t;

        __asm__ __volatile__(
        LWSYNC_ON_SMP
"1:     lwarx   %0,0,%2         # atomic_add_return\n\
        add     %0,%1,%0\n"
        PPC405_ERR77(0,%2)
"       stwcx.  %0,0,%2 \n\
        bne-    1b"
        ISYNC_ON_SMP
        : "=&r" (t)
        : "r" (a), "r" (&v->counter)
        : "cc", "memory");

        return t;
}

I am not able to figure out why "memory" is added in the latter.

-TZ

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-30  5:53           ` kernel mailz
@ 2009-06-30  9:30             ` Andrew Haley
  2009-06-30  9:52             ` Paul Mackerras
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Haley @ 2009-06-30  9:30 UTC (permalink / raw)
  To: kernel mailz; +Cc: gcc-help, linuxppc-dev, Ian Lance Taylor

kernel mailz wrote:
> Consider atomic_add and atomic_add_return in kernel code.
> 
> On Tue, Jun 30, 2009 at 2:59 AM, Ian Lance Taylor<iant@google.com> wrote:
>> kernel mailz <kernelmailz@googlemail.com> writes:
>>
>>> I tried a small example
>>>
>>> int *p = 0x1000;
>>> int a = *p;
>>> asm("sync":::"memory");
>>> a = *p;
>>>
>>> and
>>>
>>> volatile int *p = 0x1000;
>>> int a = *p;
>>> asm("sync");
>>> a = *p
>>>
>>> Got the same assembly.
>>> Which is right.
>>>
>>> So does it mean, if proper use of volatile is done, there is no need
>>> of "memory" ?
>> You have to consider the effects of inlining, which may bring in other
>> memory loads and stores through non-volatile pointers.

> Consider
> 
> static __inline__ void atomic_add(int a, atomic_t *v)
> {
>         int t;
> 
>         __asm__ __volatile__(
> "1:     lwarx   %0,0,%3         # atomic_add\n\
>         add     %0,%2,%0\n"
>         PPC405_ERR77(0,%3)
> "       stwcx.  %0,0,%3 \n\
>         bne-    1b"
>         : "=&r" (t), "+m" (v->counter)
>         : "r" (a), "r" (&v->counter)
>         : "cc");
> }
> 
> static __inline__ int atomic_add_return(int a, atomic_t *v)
> {
>         int t;
> 
>         __asm__ __volatile__(
>         LWSYNC_ON_SMP
> "1:     lwarx   %0,0,%2         # atomic_add_return\n\
>         add     %0,%1,%0\n"
>         PPC405_ERR77(0,%2)
> "       stwcx.  %0,0,%2 \n\
>         bne-    1b"
>         ISYNC_ON_SMP
>         : "=&r" (t)
>         : "r" (a), "r" (&v->counter)
>         : "cc", "memory");
> 
>         return t;
> }
> 
> I am not able to figure out why "memory" is added in latter

The latter, as well as serving its stated purpose, forms a memory barrier,
so the compiler must be prevented from moving memory accesses across it.  See
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-30  5:53           ` kernel mailz
  2009-06-30  9:30             ` Andrew Haley
@ 2009-06-30  9:52             ` Paul Mackerras
  1 sibling, 0 replies; 13+ messages in thread
From: Paul Mackerras @ 2009-06-30  9:52 UTC (permalink / raw)
  To: kernel mailz; +Cc: gcc-help, linuxppc-dev, Ian Lance Taylor

kernel mailz writes:

> Consider atomic_add and atomic_add_return in kernel code.
> I am not able to figure out why "memory" is added in latter

The "memory" indicates that gcc should not reorder accesses to memory
from one side of the asm to the other.  The reason for putting it on
the atomic ops that return a value is that they are sometimes used to
implement locks or other synchronization primitives.
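
As a hedged illustration of why that matters for locks, here is a toy
try-lock.  It uses GCC's __atomic builtins instead of the kernel's
lwarx/stwcx. sequences, and all names are invented for the example:

```c
static int lock_word;           /* 0 = free, 1 = held */
static int protected_counter;   /* only touched while the lock is held */

/* Returns 1 if the lock was taken.  Acquire ordering keeps later
 * accesses from moving above the exchange. */
static inline int toy_trylock(void)
{
        return __atomic_exchange_n(&lock_word, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Release ordering keeps earlier accesses from moving below the store. */
static inline void toy_unlock(void)
{
        __atomic_store_n(&lock_word, 0, __ATOMIC_RELEASE);
}

int locked_increment(void)
{
        int seen;

        if (!toy_trylock())
                return -1;      /* somebody else holds the lock */
        seen = ++protected_counter;     /* must stay inside the lock */
        toy_unlock();
        return seen;
}
```

If the compiler were free to move the increment outside the lock/unlock
pair, the lock would protect nothing, which is the reordering the
"memory" clobber rules out in the asm versions.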

Paul.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-30  5:27           ` kernel mailz
@ 2009-06-30 10:41             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2009-06-30 10:41 UTC (permalink / raw)
  To: kernel mailz; +Cc: Scott Wood, gcc-help, linuxppc-dev

On Tue, 2009-06-30 at 10:57 +0530, kernel mailz wrote:
> Hi Scott,
> I agree with you, kind of understand that it is required.
> But buddy unless you see some construct work or by adding the
> construct a visible difference is there, the concept is just piece of
> theory.

In this case you'd rather get the theory right :-) The problem here is
that doing it wrong won't bite most of the time ... until for some
reason gcc decides to optimize things, for example by moving things
around, and kaboom ! But it may only happen in one out of 100 cases of
the same inline function depending on the surrounding code.

> I am trying all the kernel code inline assembly to find an example
> that works differently with memory.

Well, we recently had an example in the atomic64 code for 32-bit that
Paulus wrote, where we discovered we were missing the memory clobber in
local_irq_restore() iirc. On a base UP build, we did observe the loads
and stores within the local_irq_save/restore section being moved
outside of it by gcc.
 
> For instance take atomic_add , atomic_add_return, while the
> atomic_add_return has the "memory", atomic_add skips it.

There is a reason for the slightly different semantics here.

The base atomic ops such as atomic_add are defined as having no side
effects and to allow re-ordering. Thus atomic_add has an explicit
clobber of the actual variable that's incremented but nothing else and
no other memory barrier instruction.

The variants that -return- something, however, have been granted
stronger semantics, because they have been (ab)used by callers to
construct what are effectively semaphores or locks, and thus we added
stronger memory barriers to avoid re-ordering of loads and stores around
the atomic operation.
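
The same split can be sketched with C11 atomics.  This is only an analogy
under stated assumptions: the kernel uses its own asm rather than
<stdatomic.h>, and the function names here are invented:

```c
#include <stdatomic.h>

/* No ordering promised beyond the atomicity of the add itself. */
static inline void counter_add(int a, atomic_int *v)
{
        atomic_fetch_add_explicit(v, a, memory_order_relaxed);
}

/* Fully ordered: surrounding loads and stores may not cross it. */
static inline int counter_add_return(int a, atomic_int *v)
{
        return atomic_fetch_add_explicit(v, a, memory_order_seq_cst) + a;
}
```

A caller that only counts events can use the first form; a caller that
builds a lock or semaphore out of the returned value needs the second.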

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Inline Assembly queries
  2009-06-29 15:57       ` David Howells
  2009-06-29 21:27         ` Ian Lance Taylor
@ 2009-06-30 10:43         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2009-06-30 10:43 UTC (permalink / raw)
  To: David Howells; +Cc: gcc-help, linuxppc-dev, kernel mailz

On Mon, 2009-06-29 at 16:57 +0100, David Howells wrote:
> kernel mailz <kernelmailz@googlemail.com> wrote:
> 
> > asm("sync");
> 
> Isn't gcc free to discard this as it has no dependencies, no indicated side
> effects, and isn't required to be kept?  I think this should probably be:
> 
> 	asm volatile("sync");

It should also have a "memory" clobber or it's pointless, since gcc would
otherwise be free to move loads and stores across that barrier.
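
Putting both suggestions together, a barrier helper might look like the
sketch below.  The macro name is made up, and the non-powerpc branch falls
back to GCC's __sync_synchronize() builtin only so that the example builds
elsewhere:

```c
#if defined(__powerpc__) || defined(__powerpc64__)
/* volatile keeps the asm; "memory" stops GCC moving accesses across it */
#define full_barrier() __asm__ __volatile__("sync" : : : "memory")
#else
/* assumption for portability: a full barrier from the GCC builtin */
#define full_barrier() __sync_synchronize()
#endif

int publish_then_read(int *data, int *ready)
{
        *data = 42;
        full_barrier();         /* data is written before ready */
        *ready = 1;
        return *data;
}
```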

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 13+ messages in thread
