Linux PARISC architecture development
 help / color / mirror / Atom feed
* ldcw inline assembler patch
@ 2008-06-14 15:36 Helge Deller
  2008-06-16 20:50 ` Carlos O'Donell
  0 siblings, 1 reply; 11+ messages in thread
From: Helge Deller @ 2008-06-14 15:36 UTC (permalink / raw)
  To: linux-parisc

I'm wondering if this patch might help people who are seeing locking 
problems on SMP boxes ?
Helge

diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
index ee80c92..4752684 100644
--- a/include/asm-parisc/system.h
+++ b/include/asm-parisc/system.h
@@ -168,8 +168,9 @@ static inline void set_eiem(unsigned long val)
  /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.  */
  #define __ldcw(a) ({						\
  	unsigned __ret;						\
-	__asm__ __volatile__(__LDCW " 0(%1),%0"			\
-		: "=r" (__ret) : "r" (a));			\
+	__asm__ __volatile__(__LDCW " 0(%2),%0"			\
+			: "=r" (__ret), "=m" (*(a))		\
+			: "r" (a), "m" (*(a))	);		\
  	__ret;							\
  })


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-14 15:36 ldcw inline assembler patch Helge Deller
@ 2008-06-16 20:50 ` Carlos O'Donell
  2008-06-16 21:06   ` Helge Deller
  2008-06-21 18:34   ` John David Anglin
  0 siblings, 2 replies; 11+ messages in thread
From: Carlos O'Donell @ 2008-06-16 20:50 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, John David Anglin

On Sat, Jun 14, 2008 at 11:36 AM, Helge Deller <deller@gmx.de> wrote:
> I'm wondering if this patch might help people who are seeing locking
> problems on SMP boxes ?
> Helge
>
> diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> index ee80c92..4752684 100644
> --- a/include/asm-parisc/system.h
> +++ b/include/asm-parisc/system.h
> @@ -168,8 +168,9 @@ static inline void set_eiem(unsigned long val)
>  /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.  */
>  #define __ldcw(a) ({                                           \
>        unsigned __ret;                                         \
> -       __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> -               : "=r" (__ret) : "r" (a));                      \
> +       __asm__ __volatile__(__LDCW " 0(%2),%0"                 \
> +                       : "=r" (__ret), "=m" (*(a))             \
> +                       : "r" (a), "m" (*(a))   );              \
>        __ret;                                                  \
>  })

You don't want to do that, the compiler might hold "=m" (*(a)) in a
temporary memory location.

e.g.
temp_mem = *a;
reg2 = &temp_mem;
... operation ...
*a = temp_mem;

You would be atomic with regards to the store to temp_mem, but not a.
Infact this could always be the case.

I'm now of the opinion that we need a "memory" clobber in the original
expression to prevent this from ever happening.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 20:50 ` Carlos O'Donell
@ 2008-06-16 21:06   ` Helge Deller
  2008-06-16 21:54     ` Carlos O'Donell
  2008-06-21 18:34   ` John David Anglin
  1 sibling, 1 reply; 11+ messages in thread
From: Helge Deller @ 2008-06-16 21:06 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: linux-parisc, John David Anglin

Hi Carlos,

Carlos O'Donell wrote:
> On Sat, Jun 14, 2008 at 11:36 AM, Helge Deller <deller@gmx.de> wrote:
>> I'm wondering if this patch might help people who are seeing locking
>> problems on SMP boxes ?
>> Helge
>>
>> diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
>> index ee80c92..4752684 100644
>> --- a/include/asm-parisc/system.h
>> +++ b/include/asm-parisc/system.h
>> @@ -168,8 +168,9 @@ static inline void set_eiem(unsigned long val)
>>  /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.  */
>>  #define __ldcw(a) ({                                           \
>>        unsigned __ret;                                         \
>> -       __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
>> -               : "=r" (__ret) : "r" (a));                      \
>> +       __asm__ __volatile__(__LDCW " 0(%2),%0"                 \
>> +                       : "=r" (__ret), "=m" (*(a))             \
>> +                       : "r" (a), "m" (*(a))   );              \
>>        __ret;                                                  \
>>  })
> 
> You don't want to do that, the compiler might hold "=m" (*(a)) in a
> temporary memory location.
> 
> e.g.
> temp_mem = *a;
> reg2 = &temp_mem;
> ... operation ...
> *a = temp_mem;
> 
> You would be atomic with regards to the store to temp_mem, but not a.
> Infact this could always be the case.
> 
> I'm now of the opinion that we need a "memory" clobber in the original
> expression to prevent this from ever happening.

So, your proposal is (copy-and-pasted in here) the following ?

diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
index ee80c92..daeae39 100644
--- a/include/asm-parisc/system.h
+++ b/include/asm-parisc/system.h
@@ -169,7 +169,7 @@ static inline void set_eiem(unsigned long val)
  #define __ldcw(a) ({                                           \
         unsigned __ret;                                         \
         __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
-               : "=r" (__ret) : "r" (a));                      \
+               : "=r" (__ret) : "r" (a) : "memory" );          \
         __ret;                                                  \
  })

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 21:06   ` Helge Deller
@ 2008-06-16 21:54     ` Carlos O'Donell
  2008-06-16 21:57       ` Kyle McMartin
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos O'Donell @ 2008-06-16 21:54 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, John David Anglin

On Mon, Jun 16, 2008 at 5:06 PM, Helge Deller <deller@gmx.de> wrote:
> So, your proposal is (copy-and-pasted in here) the following ?
>
> diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> index ee80c92..daeae39 100644
> --- a/include/asm-parisc/system.h
> +++ b/include/asm-parisc/system.h
> @@ -169,7 +169,7 @@ static inline void set_eiem(unsigned long val)
>  #define __ldcw(a) ({                                           \
>        unsigned __ret;                                         \
>        __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> -               : "=r" (__ret) : "r" (a));                      \
> +               : "=r" (__ret) : "r" (a) : "memory" );          \
>        __ret;                                                  \
>  })

Yes. The asm should clobber memory thus forcing the compiler to avoid
memory temporaries.

Dave may have other opinions.

I think this solution is the safest.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 21:54     ` Carlos O'Donell
@ 2008-06-16 21:57       ` Kyle McMartin
  2008-06-16 22:03         ` Kyle McMartin
  2008-06-16 22:05         ` Matthew Wilcox
  0 siblings, 2 replies; 11+ messages in thread
From: Kyle McMartin @ 2008-06-16 21:57 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Helge Deller, linux-parisc, John David Anglin

On Mon, Jun 16, 2008 at 05:54:24PM -0400, Carlos O'Donell wrote:
> On Mon, Jun 16, 2008 at 5:06 PM, Helge Deller <deller@gmx.de> wrote:
> > So, your proposal is (copy-and-pasted in here) the following ?
> >
> > diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> > index ee80c92..daeae39 100644
> > --- a/include/asm-parisc/system.h
> > +++ b/include/asm-parisc/system.h
> > @@ -169,7 +169,7 @@ static inline void set_eiem(unsigned long val)
> >  #define __ldcw(a) ({                                           \
> >        unsigned __ret;                                         \
> >        __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> > -               : "=r" (__ret) : "r" (a));                      \
> > +               : "=r" (__ret) : "r" (a) : "memory" );          \
> >        __ret;                                                  \
> >  })
> 
> Yes. The asm should clobber memory thus forcing the compiler to avoid
> memory temporaries.
> 

It shouldn't need to, since we're only ever accessing one word (the one
specified in the operand.)

Otherwise basically every inline asm everywhere ever is going to need a
memory clobber, and that's just BROKEN.

r, Kyle

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 21:57       ` Kyle McMartin
@ 2008-06-16 22:03         ` Kyle McMartin
  2008-06-16 22:05         ` Matthew Wilcox
  1 sibling, 0 replies; 11+ messages in thread
From: Kyle McMartin @ 2008-06-16 22:03 UTC (permalink / raw)
  To: Kyle McMartin
  Cc: Carlos O'Donell, Helge Deller, linux-parisc,
	John David Anglin

On Mon, Jun 16, 2008 at 05:57:26PM -0400, Kyle McMartin wrote:
> On Mon, Jun 16, 2008 at 05:54:24PM -0400, Carlos O'Donell wrote:
> > On Mon, Jun 16, 2008 at 5:06 PM, Helge Deller <deller@gmx.de> wrote:
> > > So, your proposal is (copy-and-pasted in here) the following ?
> > >
> > > diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> > > index ee80c92..daeae39 100644
> > > --- a/include/asm-parisc/system.h
> > > +++ b/include/asm-parisc/system.h
> > > @@ -169,7 +169,7 @@ static inline void set_eiem(unsigned long val)
> > >  #define __ldcw(a) ({                                           \
> > >        unsigned __ret;                                         \
> > >        __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> > > -               : "=r" (__ret) : "r" (a));                      \
> > > +               : "=r" (__ret) : "r" (a) : "memory" );          \
> > >        __ret;                                                  \
> > >  })
> > 
> > Yes. The asm should clobber memory thus forcing the compiler to avoid
> > memory temporaries.
> > 
> 
> It shouldn't need to, since we're only ever accessing one word (the one
> specified in the operand.)
> 
> Otherwise basically every inline asm everywhere ever is going to need a
> memory clobber, and that's just BROKEN.
>

Willy points out the caching of the locked data across the lock, but we
surround the inline in a memory barrier, so we're fine.

r, Kyle

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 21:57       ` Kyle McMartin
  2008-06-16 22:03         ` Kyle McMartin
@ 2008-06-16 22:05         ` Matthew Wilcox
  2008-06-16 22:14           ` Carlos O'Donell
  1 sibling, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2008-06-16 22:05 UTC (permalink / raw)
  To: Kyle McMartin
  Cc: Carlos O'Donell, Helge Deller, linux-parisc,
	John David Anglin

On Mon, Jun 16, 2008 at 05:57:26PM -0400, Kyle McMartin wrote:
> On Mon, Jun 16, 2008 at 05:54:24PM -0400, Carlos O'Donell wrote:
> > On Mon, Jun 16, 2008 at 5:06 PM, Helge Deller <deller@gmx.de> wrote:
> > > So, your proposal is (copy-and-pasted in here) the following ?
> > >
> > > diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> > > index ee80c92..daeae39 100644
> > > --- a/include/asm-parisc/system.h
> > > +++ b/include/asm-parisc/system.h
> > > @@ -169,7 +169,7 @@ static inline void set_eiem(unsigned long val)
> > >  #define __ldcw(a) ({                                           \
> > >        unsigned __ret;                                         \
> > >        __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> > > -               : "=r" (__ret) : "r" (a));                      \
> > > +               : "=r" (__ret) : "r" (a) : "memory" );          \
> > >        __ret;                                                  \
> > >  })
> > 
> > Yes. The asm should clobber memory thus forcing the compiler to avoid
> > memory temporaries.
> 
> It shouldn't need to, since we're only ever accessing one word (the one
> specified in the operand.)
> 
> Otherwise basically every inline asm everywhere ever is going to need a
> memory clobber, and that's just BROKEN.

Carlos' and Helge's point (I think) is that the __ldcw() doesn't clobber
memory, so gcc can cache other things in registers across a call to
__ldcw().  While this is true, our definition of __raw_spin_lock() has
two calls to mb() in it, which is defined to clobber memory.

The only users of __ldcw() are in spinlock.h which has the mb()s in
place.  I don't think there's a problem here.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 22:05         ` Matthew Wilcox
@ 2008-06-16 22:14           ` Carlos O'Donell
  2008-06-17  1:56             ` Carlos O'Donell
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos O'Donell @ 2008-06-16 22:14 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kyle McMartin, Helge Deller, linux-parisc, John David Anglin

On Mon, Jun 16, 2008 at 6:05 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> The only users of __ldcw() are in spinlock.h which has the mb()s in
> place.  I don't think there's a problem here.

Given:
mb()
__ldcw(a)
mb()

What stops the compiler from doing?

mb()
*stack_slot = *a
reg1 = *stack_slot
reg2 = __ldcw(reg1)
*a = *stack_slot
mb()

Memory is consistent before and after the memory barriers, but the
operation is not atomic?

While this seems stupid, the compile may create a memory temporary to
shuffle things around.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 22:14           ` Carlos O'Donell
@ 2008-06-17  1:56             ` Carlos O'Donell
  2008-06-17  3:34               ` Carlos O'Donell
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos O'Donell @ 2008-06-17  1:56 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kyle McMartin, Helge Deller, linux-parisc, John David Anglin

On Mon, Jun 16, 2008 at 6:14 PM, Carlos O'Donell
<carlos@systemhalted.org> wrote:
> Given:
> mb()
> __ldcw(a)
> mb()
>
> What stops the compiler from doing?
>
> mb()
> *stack_slot = *a
> reg1 = *stack_slot

This is a mistake, and should read:

reg1 = stack_slot

e.g. Copy the address of the stack slot into reg1.

> reg2 = __ldcw(reg1)
> *a = *stack_slot
> mb()
>
> Memory is consistent before and after the memory barriers, but the
> operation is not atomic?
>
> While this seems stupid, the compile may create a memory temporary to
> shuffle things around.

The compiler creates memory temporaries all the times and expects the
optimizers to see that the temporaries are not needed. However,
sometimes the temporaries aren't removed.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-17  1:56             ` Carlos O'Donell
@ 2008-06-17  3:34               ` Carlos O'Donell
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos O'Donell @ 2008-06-17  3:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kyle McMartin, Helge Deller, linux-parisc, John David Anglin

On Mon, Jun 16, 2008 at 9:56 PM, Carlos O'Donell
<carlos@systemhalted.org> wrote:
> The compiler creates memory temporaries all the times and expects the
> optimizers to see that the temporaries are not needed. However,
> sometimes the temporaries aren't removed.

It turns out this should never happen because raw_spinlock_t->lock is volatile.

Therefore the memory clobber is not required.

The memory barriers will prevent code from being moved before the lock
or past the unlock.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ldcw inline assembler patch
  2008-06-16 20:50 ` Carlos O'Donell
  2008-06-16 21:06   ` Helge Deller
@ 2008-06-21 18:34   ` John David Anglin
  1 sibling, 0 replies; 11+ messages in thread
From: John David Anglin @ 2008-06-21 18:34 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: deller, linux-parisc, dave.anglin

> > diff --git a/include/asm-parisc/system.h b/include/asm-parisc/system.h
> > index ee80c92..4752684 100644
> > --- a/include/asm-parisc/system.h
> > +++ b/include/asm-parisc/system.h
> > @@ -168,8 +168,9 @@ static inline void set_eiem(unsigned long val)
> >  /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.  */
> >  #define __ldcw(a) ({                                           \
> >        unsigned __ret;                                         \
> > -       __asm__ __volatile__(__LDCW " 0(%1),%0"                 \
> > -               : "=r" (__ret) : "r" (a));                      \
> > +       __asm__ __volatile__(__LDCW " 0(%2),%0"                 \
> > +                       : "=r" (__ret), "=m" (*(a))             \
> > +                       : "r" (a), "m" (*(a))   );              \
> >        __ret;                                                  \
> >  })
> 
> You don't want to do that, the compiler might hold "=m" (*(a)) in a
> temporary memory location.
> 
> e.g.
> temp_mem = *a;
> reg2 = &temp_mem;
> ... operation ...
> *a = temp_mem;

The operation uses "a" not "reg2".  Listing *a as written in the
asm should stop the compiler from caching *a in temp_mem.

I don't believe the compiler ever caches values in "memory".  However,
it does cache memory values in registers.  So, we need to tell GCC not
to cache *a in a register.  This can be done either by explicitly
listing the affected locations or by clobbering all memory.

I believe using the volatile keyword and a memory clobber is ok but
overkill.  Explicitly listing the affected memory as suggested by
Helge appears to be the way to go.  There is an example of this in
the GCC manual.

I also believe we can use the "+m" constraint so that *a doesn't need
to be listed twice.

It might be possible to remove the volatile keyword when the affected
memory is explicitly listed.  However, I doubt there are any situations
where the asm can be optimized away.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-06-21 18:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-14 15:36 ldcw inline assembler patch Helge Deller
2008-06-16 20:50 ` Carlos O'Donell
2008-06-16 21:06   ` Helge Deller
2008-06-16 21:54     ` Carlos O'Donell
2008-06-16 21:57       ` Kyle McMartin
2008-06-16 22:03         ` Kyle McMartin
2008-06-16 22:05         ` Matthew Wilcox
2008-06-16 22:14           ` Carlos O'Donell
2008-06-17  1:56             ` Carlos O'Donell
2008-06-17  3:34               ` Carlos O'Donell
2008-06-21 18:34   ` John David Anglin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox