* Inline Assembly queries
From: kernel mailz @ 2009-06-27 19:46 UTC
To: gcc, gcc-help, linuxppc-dev
Hello all the gurus,

I've been trying my luck with gcc 4.3.2 inline assembly on powerpc.
There are a few queries:

1. asm volatile or simply asm produce the same assembly code.
Tried with a few examples but didn't find any difference from adding
volatile to asm.

2. Use of "memory" and clobbered registers.

"memory" -
a. announces to the compiler that memory has been modified
b. this instruction writes to some memory (other than a listed output)
and GCC shouldn't cache memory values in registers across this asm.

I tried with the stw and stwcx. instructions; adding "memory" has no effect.

Is there any example scenario where gcc would generate different
assembly by adding / removing "memory"?

-TZ
* Re: Inline Assembly queries
From: kernel mailz @ 2009-06-28 4:57 UTC
To: Ian Lance Taylor; +Cc: gcc-help, linuxppc-dev

Thanks Ian,

For the "memory" clobber I tried a function from the linux kernel --

/*
 * Atomic exchange
 *
 * Changes the memory location '*ptr' to be val and returns
 * the previous value stored there.
 */
static inline unsigned long
__xchg_u32(volatile void *p, unsigned long val)
{
        unsigned long prev;

        __asm__ __volatile__(
"1:     lwarx   %0,0,%2 \n"
"       stwcx.  %3,0,%2 \n\
        bne-    1b"
        : "=&r" (prev), "+m" (*(volatile unsigned int *)p)
        : "r" (p), "r" (val)
//      : "memory", "cc");

        return prev;
}

#define ADDR 0x1000
int main()
{
        __xchg_u32((void *)ADDR, 0x2000);
        __xchg_u32((void *)ADDR, 0x3000);

        return 0;
}

Got the same asm when compiled with -O1, with / without the "memory" clobber:

100003fc <main>:
100003fc:       39 20 10 00     li      r9,4096
10000400:       38 00 20 00     li      r0,8192
10000404:       7d 60 48 28     lwarx   r11,0,r9
10000408:       7c 00 49 2d     stwcx.  r0,0,r9
1000040c:       40 a2 ff f8     bne-    10000404 <main+0x8>
10000410:       38 00 30 00     li      r0,12288
10000414:       7d 60 48 28     lwarx   r11,0,r9
10000418:       7c 00 49 2d     stwcx.  r0,0,r9
1000041c:       40 a2 ff f8     bne-    10000414 <main+0x18>
10000420:       38 60 00 00     li      r3,0
10000424:       4e 80 00 20     blr

No diff?  Am I choosing the right example?

-TZ

On Sun, Jun 28, 2009 at 4:50 AM, Ian Lance Taylor <iant@google.com> wrote:
> kernel mailz <kernelmailz@googlemail.com> writes:
>
>> I've been trying my luck with gcc 4.3.2 inline assembly on powerpc.
>> There are a few queries:
>>
>> 1. asm volatile or simply asm produce the same assembly code.
>> Tried with a few examples but didn't find any difference from adding
>> volatile to asm.
>>
>> 2. Use of "memory" and clobbered registers.
>>
>> "memory" -
>> a. announces to the compiler that memory has been modified
>> b. this instruction writes to some memory (other than a listed output)
>> and GCC shouldn't cache memory values in registers across this asm.
>>
>> I tried with the stw and stwcx. instructions; adding "memory" has no effect.
>>
>> Is there any example scenario where gcc would generate different
>> assembly by adding / removing "memory"?
>
> Please never send a message to both gcc@gcc.gnu.org and
> gcc-help@gcc.gnu.org.  This message is appropriate for
> gcc-help@gcc.gnu.org, not for gcc@gcc.gnu.org.  Thanks.
>
> An asm with no outputs is always considered to be volatile.  To see the
> effect of volatile, just try something like
>     asm ("# modify %0" : "=r" (i) : /* no inputs */ : /* no clobbers */);
> Try it with and without optimization.
>
> As the documentation says, the effect of adding a "memory" clobber is
> that gcc does not cache values in registers across the asm.  So the
> effect will be shown in something like
>   int i = *p;
>   asm volatile ("# read %0" : : "r" (i));
>   return *p;
> The memory clobber will only make a difference when optimizing.
>
> Ian
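A minimal sketch of the two experiments Ian describes (the function names and the "# ..." comment strings are invented for illustration; compile with and without optimization and compare the assembly):

/* 1. The "memory" clobber: without it, an optimizing build may reuse
 *    the value of *p loaded before the asm; with it, *p must be
 *    reloaded after the asm. */
int reload_after_asm(int *p)
{
        int i = *p;
        __asm__ __volatile__("# read %0" : : "r" (i) : "memory");
        return *p;
}

/* 2. volatile on an asm that has an output: without volatile, an asm
 *    whose result is never used may be deleted entirely at -O2. */
int droppable_asm(void)
{
        int i;
        __asm__("# modify %0" : "=r" (i));      /* result unused */
        return 0;
}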
* Re: Inline Assembly queries
From: kernel mailz @ 2009-06-29 15:49 UTC
To: gcc-help, linuxppc-dev

I tried a small example:

int *p = (int *)0x1000;
int a = *p;
asm("sync" : : : "memory");
a = *p;

and

volatile int *p = (volatile int *)0x1000;
int a = *p;
asm("sync");
a = *p;

Got the same assembly, which is right.

So does it mean that, if proper use of volatile is done, there is no need
for "memory"?  But then why does the __xchg_u32 example earlier in this
thread use both?  I am confused!  Anyone have a clue?

-TZ
* Re: Inline Assembly queries
From: Scott Wood @ 2009-06-29 19:27 UTC
To: kernel mailz; +Cc: gcc-help, linuxppc-dev

On Mon, Jun 29, 2009 at 09:19:57PM +0530, kernel mailz wrote:
> I tried a small example:
>
> int *p = (int *)0x1000;
> int a = *p;
> asm("sync" : : : "memory");
> a = *p;
>
> and
>
> volatile int *p = (volatile int *)0x1000;
> int a = *p;
> asm("sync");
> a = *p;
>
> Got the same assembly, which is right.
>
> So does it mean that, if proper use of volatile is done, there is no need
> for "memory"?

No.  As I understand it, volatile concerns deletion of the asm statement
(if no outputs are used) and reordering with respect to other asm
statements (not sure whether GCC will actually do this), while the memory
clobber concerns optimization of non-asm loads/stores around the asm
statement.

> static inline unsigned long
> __xchg_u32(volatile void *p, unsigned long val)
> {
>         unsigned long prev;
>
>         __asm__ __volatile__(
> "1:     lwarx   %0,0,%2 \n"
> "       stwcx.  %3,0,%2 \n\
>         bne-    1b"
>         : "=&r" (prev), "+m" (*(volatile unsigned int *)p)
>         : "r" (p), "r" (val)
> //      : "memory", "cc");
>
>         return prev;
> }
>
> #define ADDR 0x1000
> int main()
> {
>         __xchg_u32((void *)ADDR, 0x2000);
>         __xchg_u32((void *)ADDR, 0x3000);
>
>         return 0;
> }
>
> Got the same asm when compiled with -O1, with / without the "memory" clobber.

This isn't a good test case, because there's nothing other than inline
asm going on in that function for GCC to optimize.  Plus, it's generally
not a good idea, when talking about what the compiler is or isn't allowed
to do, to point to a single test case (or even several) and say that a
construct isn't required because you don't notice a difference.  Even if
there were no code at all with which it made a difference with GCC version
X, it could make a difference with GCC version X+1.

-Scott
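A sketch of the kind of test case Scott is asking for, with ordinary non-asm memory traffic around the barrier for the optimizer to work on (the function name is invented for illustration; compile at -O2 with and without the clobber and compare):

/* Without the "memory" clobber, -O2 is free to reuse the first load of
 * *flag for the second read; with the clobber it must reload *flag
 * after the sync, so the generated code differs. */
int read_around_sync(int *flag)
{
        int before = *flag;                          /* first load */
        __asm__ __volatile__("sync" : : : "memory");
        int after = *flag;                           /* must be a real reload */
        return before + after;
}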
* Re: Inline Assembly queries
From: kernel mailz @ 2009-06-30 5:27 UTC
To: Scott Wood; +Cc: gcc-help, linuxppc-dev

Hi Scott,

I agree with you, and kind of understand that it is required.  But unless
you see the construct at work, or adding the construct makes a visible
difference, the concept is just a piece of theory.

I am going through the inline assembly in the kernel code to find an
example that behaves differently with "memory".  For instance, take
atomic_add and atomic_add_return: while atomic_add_return has the
"memory" clobber, atomic_add skips it.

-TZ

On Tue, Jun 30, 2009 at 12:57 AM, Scott Wood <scottwood@freescale.com> wrote:
> No.  As I understand it, volatile concerns deletion of the asm statement
> (if no outputs are used) and reordering with respect to other asm
> statements (not sure whether GCC will actually do this), while the memory
> clobber concerns optimization of non-asm loads/stores around the asm
> statement.
>
> This isn't a good test case, because there's nothing other than inline
> asm going on in that function for GCC to optimize.  Plus, it's generally
> not a good idea, when talking about what the compiler is or isn't allowed
> to do, to point to a single test case (or even several) and say that a
> construct isn't required because you don't notice a difference.  Even if
> there were no code at all with which it made a difference with GCC version
> X, it could make a difference with GCC version X+1.
>
> -Scott
* Re: Inline Assembly queries
From: Benjamin Herrenschmidt @ 2009-06-30 10:41 UTC
To: kernel mailz; +Cc: Scott Wood, gcc-help, linuxppc-dev

On Tue, 2009-06-30 at 10:57 +0530, kernel mailz wrote:
> Hi Scott,
>
> I agree with you, and kind of understand that it is required.  But unless
> you see the construct at work, or adding the construct makes a visible
> difference, the concept is just a piece of theory.

In this case you'd rather get the theory right :-)

The problem here is that doing it wrong won't bite most of the time ...
until for some reason gcc decides to optimize things, for example by
moving things around, and kaboom!  But it may only happen in one out of
100 cases of the same inline function, depending on the surrounding code.

> I am going through the inline assembly in the kernel code to find an
> example that behaves differently with "memory".

Well, we recently had an example in the atomic64 code for 32-bit that
Paulus wrote, where we discovered we were missing the memory clobber in
local_irq_restore(), iirc.  On a base UP build, we did observe the loads
and stores within the local_irq_save/restore section being moved to
outside of it by gcc.

> For instance, take atomic_add and atomic_add_return: while
> atomic_add_return has the "memory" clobber, atomic_add skips it.

There is a reason for the slightly different semantics here.

The base atomic ops such as atomic_add are defined as having no side
effects and to allow re-ordering.  Thus atomic_add has an explicit clobber
of the actual variable that's incremented but nothing else, and no other
memory barrier instruction.

The variants that -return- something, however, have been granted stronger
semantics, because they have been (ab)used to construct what effectively
are semaphores or locks by callers, and thus we added stronger memory
barriers to avoid re-ordering of loads and stores around the atomic
operation.

Cheers,
Ben.
* Re: Inline Assembly queries
From: Ian Lance Taylor @ 2009-06-29 21:29 UTC
To: kernel mailz; +Cc: gcc-help, linuxppc-dev

kernel mailz <kernelmailz@googlemail.com> writes:

> I tried a small example:
>
> int *p = (int *)0x1000;
> int a = *p;
> asm("sync" : : : "memory");
> a = *p;
>
> and
>
> volatile int *p = (volatile int *)0x1000;
> int a = *p;
> asm("sync");
> a = *p;
>
> Got the same assembly, which is right.
>
> So does it mean that, if proper use of volatile is done, there is no need
> for "memory"?

You have to consider the effects of inlining, which may bring in other
memory loads and stores through non-volatile pointers.

Ian
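A sketch of the inlining situation Ian describes (the helper, the global, and the caller are invented for illustration): once the asm is inlined next to ordinary, non-volatile accesses, the "memory" clobber is what keeps those accesses from being cached in registers or moved across the barrier.

static inline void full_barrier(void)
{
        __asm__ __volatile__("sync" : : : "memory");
}

int shared;                     /* ordinary, non-volatile global */

int publish_then_read(void)
{
        shared = 1;             /* without the clobber, this store could be
                                   moved past the barrier after inlining */
        full_barrier();
        return shared;          /* and this load could simply reuse the 1 */
}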
* Re: Inline Assembly queries
From: kernel mailz @ 2009-06-30 5:53 UTC
To: Ian Lance Taylor; +Cc: gcc-help, linuxppc-dev

Consider atomic_add and atomic_add_return in the kernel code.

On Tue, Jun 30, 2009 at 2:59 AM, Ian Lance Taylor <iant@google.com> wrote:
> You have to consider the effects of inlining, which may bring in other
> memory loads and stores through non-volatile pointers.
>
> Ian

Consider:

static __inline__ void atomic_add(int a, atomic_t *v)
{
        int t;

        __asm__ __volatile__(
"1:     lwarx   %0,0,%3         # atomic_add\n\
        add     %0,%2,%0\n"
        PPC405_ERR77(0,%3)
"       stwcx.  %0,0,%3 \n\
        bne-    1b"
        : "=&r" (t), "+m" (v->counter)
        : "r" (a), "r" (&v->counter)
        : "cc");
}

static __inline__ int atomic_add_return(int a, atomic_t *v)
{
        int t;

        __asm__ __volatile__(
        LWSYNC_ON_SMP
"1:     lwarx   %0,0,%2         # atomic_add_return\n\
        add     %0,%1,%0\n"
        PPC405_ERR77(0,%2)
"       stwcx.  %0,0,%2 \n\
        bne-    1b"
        ISYNC_ON_SMP
        : "=&r" (t)
        : "r" (a), "r" (&v->counter)
        : "cc", "memory");

        return t;
}

I am not able to figure out why "memory" is added in the latter.

-TZ
* Re: Inline Assembly queries
From: Andrew Haley @ 2009-06-30 9:30 UTC
To: kernel mailz; +Cc: gcc-help, linuxppc-dev, Ian Lance Taylor

kernel mailz wrote:

> Consider atomic_add and atomic_add_return in the kernel code.
>
> static __inline__ void atomic_add(int a, atomic_t *v)
> {
>         int t;
>
>         __asm__ __volatile__(
> "1:     lwarx   %0,0,%3         # atomic_add\n\
>         add     %0,%2,%0\n"
>         PPC405_ERR77(0,%3)
> "       stwcx.  %0,0,%3 \n\
>         bne-    1b"
>         : "=&r" (t), "+m" (v->counter)
>         : "r" (a), "r" (&v->counter)
>         : "cc");
> }
>
> static __inline__ int atomic_add_return(int a, atomic_t *v)
> {
>         int t;
>
>         __asm__ __volatile__(
>         LWSYNC_ON_SMP
> "1:     lwarx   %0,0,%2         # atomic_add_return\n\
>         add     %0,%1,%0\n"
>         PPC405_ERR77(0,%2)
> "       stwcx.  %0,0,%2 \n\
>         bne-    1b"
>         ISYNC_ON_SMP
>         : "=&r" (t)
>         : "r" (a), "r" (&v->counter)
>         : "cc", "memory");
>
>         return t;
> }
>
> I am not able to figure out why "memory" is added in the latter.

The latter, as well as its stated purpose, forms a memory barrier, so the
compiler must be prevented from moving memory accesses across it.

See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html

Andrew.
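For the compiler-reordering half of Andrew's point, a sketch of a pure compiler barrier (the macro and variable names are invented; the empty asm with a "memory" clobber is a common idiom):

/* Constrains the compiler only -- no machine instruction is emitted, so
 * the CPU may still reorder; that is why the kernel versions also use
 * sync/lwsync/isync. */
#define compiler_barrier()      __asm__ __volatile__("" : : : "memory")

int data, ready;                /* ordinary, non-volatile globals */

void publish(int value)
{
        data = value;
        compiler_barrier();     /* keeps the data store from being moved
                                   below the ready store by the compiler */
        ready = 1;
}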
* Re: Inline Assembly queries
From: Paul Mackerras @ 2009-06-30 9:52 UTC
To: kernel mailz; +Cc: gcc-help, linuxppc-dev, Ian Lance Taylor

kernel mailz writes:

> Consider atomic_add and atomic_add_return in the kernel code.
>
> I am not able to figure out why "memory" is added in the latter.

The "memory" indicates that gcc should not reorder accesses to memory
from one side of the asm to the other.  The reason for putting it on the
atomic ops that return a value is that they are sometimes used to
implement locks or other synchronization primitives.

Paul.
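A sketch of the usage Paul describes, with invented names; atomic_t and atomic_add_return are assumed to have the kernel definitions quoted above (there they are static inlines; a plain declaration is used here only to keep the sketch short):

typedef struct { int counter; } atomic_t;
int atomic_add_return(int a, atomic_t *v);

static atomic_t refcount;
static int object_state;        /* ordinary, non-volatile data */

int put_object(void)
{
        object_state = 0;       /* tear-down store */
        /* Once the kernel inline is expanded here, its "memory" clobber
         * (plus the barrier instructions) keeps the store above from
         * being moved below the atomic op, and later loads from being
         * hoisted above it. */
        return atomic_add_return(-1, &refcount) == 0;   /* last user? */
}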
* Re: Inline Assembly queries
From: David Howells @ 2009-06-29 15:57 UTC
To: kernel mailz; +Cc: gcc-help, linuxppc-dev

kernel mailz <kernelmailz@googlemail.com> wrote:

> asm("sync");

Isn't gcc free to discard this as it has no dependencies, no indicated
side effects, and isn't required to be kept?  I think this should
probably be:

        asm volatile("sync");

David
* Re: Inline Assembly queries
From: Ian Lance Taylor @ 2009-06-29 21:27 UTC
To: David Howells; +Cc: gcc-help, linuxppc-dev, kernel mailz

David Howells <dhowells@redhat.com> writes:

> kernel mailz <kernelmailz@googlemail.com> wrote:
>
>> asm("sync");
>
> Isn't gcc free to discard this as it has no dependencies, no indicated
> side effects, and isn't required to be kept?  I think this should
> probably be:
>
>         asm volatile("sync");

An asm with no outputs is considered to be volatile.

Ian
* Re: Inline Assembly queries
From: Benjamin Herrenschmidt @ 2009-06-30 10:43 UTC
To: David Howells; +Cc: gcc-help, linuxppc-dev, kernel mailz

On Mon, 2009-06-29 at 16:57 +0100, David Howells wrote:
> kernel mailz <kernelmailz@googlemail.com> wrote:
>
>> asm("sync");
>
> Isn't gcc free to discard this as it has no dependencies, no indicated
> side effects, and isn't required to be kept?  I think this should
> probably be:
>
>         asm volatile("sync");

It should also have a "memory" clobber, or it's pointless, since gcc
would otherwise be free to move loads and stores across that barrier.

Cheers,
Ben.
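Putting the two points together, a minimal sketch of the fully constrained barrier Ben is describing (the macro name and the polling function are invented; the kernel's own barrier macros are the authoritative versions):

#define hw_barrier()    __asm__ __volatile__("sync" : : : "memory")

int wait_then_read(volatile int *status, int *data)
{
        while (!*status)
                ;               /* wait for the device to set status */
        hw_barrier();           /* volatile: cannot be deleted;
                                   "memory": *data cannot be read early */
        return *data;
}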
Thread overview: 13+ messages
2009-06-27 19:46 Inline Assembly queries kernel mailz
[not found] ` <abe8a1fd0906271249k479e5a87gfe1ee9c02798a234@mail.gmail.com>
[not found] ` <m3ab3t4623.fsf@google.com>
2009-06-28 4:57 ` kernel mailz
2009-06-29 15:49 ` kernel mailz
2009-06-29 19:27 ` Scott Wood
2009-06-30 5:27 ` kernel mailz
2009-06-30 10:41 ` Benjamin Herrenschmidt
2009-06-29 21:29 ` Ian Lance Taylor
2009-06-30 5:53 ` kernel mailz
2009-06-30 9:30 ` Andrew Haley
2009-06-30 9:52 ` Paul Mackerras
2009-06-29 15:57 ` David Howells
2009-06-29 21:27 ` Ian Lance Taylor
2009-06-30 10:43 ` Benjamin Herrenschmidt