All of lore.kernel.org
 help / color / mirror / Atom feed
* [parisc-linux] coherent ops and mb() revisited
@ 2004-09-05  1:38 Grant Grundler
  2004-09-05  2:56 ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2004-09-05  1:38 UTC (permalink / raw)
  To: parisc-linux

Hi all,
I haven't tested the following patch and just wanted to get
feedback if it is "the right thing" or not.

John Marvin was clear in previous discussion we don't need
to use coherent stores but I've heard we should follow the
arch anyway so it doesn't bite us later. I find it highly
unlikely but see no harm in "doing it right".

The second bit is if adding "memory" reference in the ldcw
and _raw_spin_unlock() could replace the mb() instructions
which James Bottomley added to prevent gcc from re-ordering
instructions outside the critical section.

opinions?

thanks,
grant


Index: include/asm-parisc/spinlock.h
===================================================================
RCS file: /var/cvs/linux-2.6/include/asm-parisc/spinlock.h,v
retrieving revision 1.6
diff -u -p -r1.6 spinlock.h
--- include/asm-parisc/spinlock.h	15 Aug 2004 14:17:39 -0000	1.6
+++ include/asm-parisc/spinlock.h	5 Sep 2004 00:58:42 -0000
@@ -29,20 +29,26 @@ static inline void _raw_spin_lock(spinlo
 {
 	volatile unsigned int *a;
 
-	mb();
 	a = __ldcw_align(x);
 	while (__ldcw(a) == 0)
 		while (*a == 0);
-	mb();
 }
 
 static inline void _raw_spin_unlock(spinlock_t *x)
 {
 	volatile unsigned int *a;
-	mb();
+	register unsigned tmp=1;
+
 	a = __ldcw_align(x);
-	*a = 1;
-	mb();
+	/* use a coherent store. PA1.1 is always strongly ordered.
+	 * Even though no PA2.0 implementation is weakly ordered,
+	 * jda would prefer we use coherent stores (",ma" with zero offset
+	 * is the same thing but PA1.1 compatible).
+	 * Key here is "memory" - prevent gcc from re-ordering memory
+	 * accesses below releasing the lock.
+	 */
+	__asm__ __volatile__ ("stw,ma %1,0(%0)"
+			: : "r" (a), "r" (tmp) : "memory");
 }
 
 static inline int _raw_spin_trylock(spinlock_t *x)
Index: include/asm-parisc/system.h
===================================================================
RCS file: /var/cvs/linux-2.6/include/asm-parisc/system.h,v
retrieving revision 1.7
diff -u -p -r1.7 system.h
--- include/asm-parisc/system.h	18 Aug 2004 20:21:41 -0000	1.7
+++ include/asm-parisc/system.h	5 Sep 2004 00:58:42 -0000
@@ -141,7 +141,7 @@ static inline void set_eiem(unsigned lon
 /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.  */
 #define __ldcw(a) ({ \
 	unsigned __ret; \
-	__asm__ __volatile__("ldcw 0(%1),%0" : "=r" (__ret) : "r" (a)); \
+	__asm__ __volatile__("ldcw 0(%1),%0" : "=r" (__ret) : "r" (a) : "memory"); \
 	__ret; \
 })
 

_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-05  1:38 [parisc-linux] coherent ops and mb() revisited Grant Grundler
@ 2004-09-05  2:56 ` James Bottomley
  2004-09-05  6:27   ` John David Anglin
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2004-09-05  2:56 UTC (permalink / raw)
  To: Grant Grundler; +Cc: parisc-linux

On Sat, 2004-09-04 at 21:38, Grant Grundler wrote:
> I haven't tested the following patch and just wanted to get
> feedback if it is "the right thing" or not.

It should work, but I don't think it's the right thing to do.

__ldcw already insists on operating on volatile data, which is the
correct thing to do.  Adding a memory barrier is probably harmless, but
it crimps any optimisations gcc might like to try with that function
(not that there really are any with such a simple loop).  Indeed, we
could even remove the __volatile__ from the asm.

I think the best implementation is probably

barrier();
a = __ldcw_align(x);
while (__ldcw(a) == 0)
	while(*a == 0)
		;
mb();

Which makes it totally clear what barriers we need where in the spinlock
(the first being simply a compiler ordering barrier and the second
actually a processor memory barrier).

James


_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-05  2:56 ` James Bottomley
@ 2004-09-05  6:27   ` John David Anglin
  2004-09-05 14:36     ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: John David Anglin @ 2004-09-05  6:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: parisc-linux

> I think the best implementation is probably
> 
> barrier();
> a = __ldcw_align(x);

I don't understand why you put the barrier before the __ldcw_align
operation.  The alignment operation is non-critical and could be done
much earlier if that's desireable from a scheduling standpoint.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-05  6:27   ` John David Anglin
@ 2004-09-05 14:36     ` James Bottomley
  2004-09-06  4:19       ` Grant Grundler
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2004-09-05 14:36 UTC (permalink / raw)
  To: John David Anglin; +Cc: parisc-linux

On Sun, 2004-09-05 at 02:27, John David Anglin wrote:
> > I think the best implementation is probably
> > 
> > barrier();
> > a = __ldcw_align(x);
> 
> I don't understand why you put the barrier before the __ldcw_align
> operation.  The alignment operation is non-critical and could be done
> much earlier if that's desireable from a scheduling standpoint.

Actually, you're right.  I think all we need is the mb() after the
spin_lock code and another mb before the spin_unlock code.  How the
taking or releasing of the lock is optimised by gcc should be irrelevant
as long as it's locked when we cross the barrier.

James


_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-05 14:36     ` James Bottomley
@ 2004-09-06  4:19       ` Grant Grundler
  2004-09-06  9:24         ` John David Anglin
  2004-09-06 14:15         ` James Bottomley
  0 siblings, 2 replies; 12+ messages in thread
From: Grant Grundler @ 2004-09-06  4:19 UTC (permalink / raw)
  To: James Bottomley; +Cc: John David Anglin, parisc-linux

On Sun, Sep 05, 2004 at 10:36:43AM -0400, James Bottomley wrote:
> On Sun, 2004-09-05 at 02:27, John David Anglin wrote:
> Actually, you're right.  I think all we need is the mb() after the
> spin_lock code and another mb before the spin_unlock code.  How the
> taking or releasing of the lock is optimised by gcc should be irrelevant
> as long as it's locked when we cross the barrier.

Actaully, I don't think it's irrelevant.  If a lock is contended for,
re-ordering by gcc could excerbate the problem by adding additional
instructions (good for the instruction pipeline) to the "critical
section" (the period we actually hold the lock).

I know, lock contention is bad and it should never happen.
Reality is some workload will contend for a lock. I want to
have some confidence gcc is not making it any worse.
This is why I'm asking about use of "memory" in the actual
asm instruction that either acquire or release the lock
instead of using mb().

thanks,
grant

_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-06  4:19       ` Grant Grundler
@ 2004-09-06  9:24         ` John David Anglin
  2004-09-06 14:15         ` James Bottomley
  1 sibling, 0 replies; 12+ messages in thread
From: John David Anglin @ 2004-09-06  9:24 UTC (permalink / raw)
  To: Grant Grundler; +Cc: James.Bottomley, parisc-linux

> Actaully, I don't think it's irrelevant.  If a lock is contended for,
> re-ordering by gcc could excerbate the problem by adding additional
> instructions (good for the instruction pipeline) to the "critical
> section" (the period we actually hold the lock).
> 
> I know, lock contention is bad and it should never happen.
> Reality is some workload will contend for a lock. I want to
> have some confidence gcc is not making it any worse.
> This is why I'm asking about use of "memory" in the actual
> asm instruction that either acquire or release the lock
> instead of using mb().

I don't have any particular insight here beyond what's in the manual:

If your assembler instructions access memory in an unpredictable
fashion, add @samp{memory} to the list of clobbered registers.  This
will cause GCC to not keep memory values cached in registers across the
assembler instruction and not optimize stores or loads to that memory.
You will also want to add the @code{volatile} keyword if the memory
affected is not listed in the inputs or outputs of the @code{asm}, as
the @samp{memory} clobber does not count as a side-effect of the
@code{asm}.  If you know how large the accessed memory is, you can add
it as input or output but if this is not known, you should add
@samp{memory}.  As an example, if you access ten bytes of a string, you
can use a memory input like:

@example
@{"m"( (@{ struct @{ char x[10]; @} *p = (void *)ptr ; *p; @}) )@}.
@end example

Note that in the following example the memory input is necessary,
otherwise GCC might optimize the store to @code{x} away:
@example
int foo ()
@{
  int x = 42;
  int *y = &x;
  int result;
  asm ("magic stuff accessing an 'int' pointed to by '%1'"
       "=&d" (r) : "a" (y), "m" (*y));
  return result;     
@}
@end example

@noindent
If you write an @code{asm} instruction with no outputs, GCC will know
the instruction has side-effects and will not delete the instruction or
move it outside of loops.

The @code{volatile} keyword indicates that the instruction has
important side-effects.  GCC will not delete a volatile @code{asm} if
it is reachable.  (The instruction can still be deleted if GCC can
prove that control-flow will never reach the location of the
instruction.)  In addition, GCC will not reschedule instructions
across a volatile @code{asm} instruction.  For example:

@smallexample
*(volatile int *)addr = foo;
asm volatile ("eieio" : : );
@end smallexample

@noindent
Assume @code{addr} contains the address of a memory mapped device
register.  The PowerPC @code{eieio} instruction (Enforce In-order
Execution of I/O) tells the CPU to make sure that the store to that
device register happens before it issues any other I/O@.

Note that even a volatile @code{asm} instruction can be moved in ways
that appear insignificant to the compiler, such as across jump
instructions.  You can't expect a sequence of volatile @code{asm}
instructions to remain perfectly consecutive.  If you want consecutive
output, use a single @code{asm}.  Also, GCC will perform some
optimizations across a volatile @code{asm} instruction; GCC does not
``forget everything'' when it encounters a volatile @code{asm}
instruction the way some other compilers do.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-06  4:19       ` Grant Grundler
  2004-09-06  9:24         ` John David Anglin
@ 2004-09-06 14:15         ` James Bottomley
  2004-09-07 15:17           ` Grant Grundler
  1 sibling, 1 reply; 12+ messages in thread
From: James Bottomley @ 2004-09-06 14:15 UTC (permalink / raw)
  To: Grant Grundler; +Cc: John David Anglin, parisc-linux

On Mon, 2004-09-06 at 00:19, Grant Grundler wrote:
> Actaully, I don't think it's irrelevant.  If a lock is contended for,
> re-ordering by gcc could excerbate the problem by adding additional
> instructions (good for the instruction pipeline) to the "critical
> section" (the period we actually hold the lock).

But that's what placing the mb() after the spinlock code does: confines
the critical section and prevents gcc reordering around it.

> I know, lock contention is bad and it should never happen.
> Reality is some workload will contend for a lock. I want to
> have some confidence gcc is not making it any worse.
> This is why I'm asking about use of "memory" in the actual
> asm instruction that either acquire or release the lock
> instead of using mb().

You probably get slightly more confinement by using mb() instead of the
"memory" clobber in __ldcw().  But really it's insignificant.  What
appeals to me is that the barrier in the spinlocks is explicit.  If the
spinlocks were fully asm coded, like for example x86, then adding a
"memory" clobber would make sense, but they're not, they're coded in C
with a little asm help.

James


_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-06 14:15         ` James Bottomley
@ 2004-09-07 15:17           ` Grant Grundler
  2004-09-07 15:30             ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2004-09-07 15:17 UTC (permalink / raw)
  To: James Bottomley; +Cc: John David Anglin, parisc-linux

On Mon, Sep 06, 2004 at 10:15:29AM -0400, James Bottomley wrote:
> On Mon, 2004-09-06 at 00:19, Grant Grundler wrote:
> > Actaully, I don't think it's irrelevant.  If a lock is contended for,
> > re-ordering by gcc could excerbate the problem by adding additional
> > instructions (good for the instruction pipeline) to the "critical
> > section" (the period we actually hold the lock).
> 
> But that's what placing the mb() after the spinlock code does: confines
> the critical section and prevents gcc reordering around it.

Maybe I'm just confused again. I was thinking gcc could move instructions
between the asm("ldcw") and mb(). And the same thing between
mb() and "lock = 1".

In other words, I'm worried about the "distance" bewteen lock/unlock
ops is greater than the "distance" between two mb() that contain
the critical section of code.

> > This is why I'm asking about use of "memory" in the actual
> > asm instruction that either acquire or release the lock
> > instead of using mb().
> 
> You probably get slightly more confinement by using mb() instead of the
> "memory" clobber in __ldcw().

Erm, how?
Our mb() definition uses asm(:::"memory").
Is that different from using "memory" in the actual bit of assembly?

I just noticed set_eiem(), mtsp() and IRQ support in our system.h
do exactly what I proposed.

> What appeals to me is that the barrier in the spinlocks is explicit.

Yes, me tool.
But I'm suggesting those are superfluous when using "memory" in
the core asm() lock operations.

thanks,
grant
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-07 15:17           ` Grant Grundler
@ 2004-09-07 15:30             ` James Bottomley
  2004-09-08 16:52               ` Grant Grundler
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2004-09-07 15:30 UTC (permalink / raw)
  To: Grant Grundler; +Cc: John David Anglin, parisc-linux

On Tue, 2004-09-07 at 11:17, Grant Grundler wrote:
> Maybe I'm just confused again. I was thinking gcc could move instructions
> between the asm("ldcw") and mb(). And the same thing between
> mb() and "lock = 1".

It can, in a restricted fashion (the asm volatile ensures this
restriction).  But that's OK.  What we want is that when we complete the
spin_lock code, the spin is locked and memory clobbered.  That's why the
only necessary mb() is at the end.

> In other words, I'm worried about the "distance" bewteen lock/unlock
> ops is greater than the "distance" between two mb() that contain
> the critical section of code.

Well, the point about using an optimising compiler is that you're
supposed to give it the information necessary to do its job.  In this
case, the correct information is that a is volatile.  And further, after
it's finished the spin lock code we prevent no reordering leaks and
clobber memory.

> > You probably get slightly more confinement by using mb() instead of the
> > "memory" clobber in __ldcw().
> 
> Erm, how?
> Our mb() definition uses asm(:::"memory").
> Is that different from using "memory" in the actual bit of assembly?

Because in your implementation you have the barrier in the middle of a
loop; that means it's crossed many times in the contended case.  In mine
it's only crossed once.

James


_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-07 15:30             ` James Bottomley
@ 2004-09-08 16:52               ` Grant Grundler
  2004-09-08 17:11                 ` James Bottomley
  0 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2004-09-08 16:52 UTC (permalink / raw)
  To: James Bottomley; +Cc: John David Anglin, parisc-linux

On Tue, Sep 07, 2004 at 11:30:17AM -0400, James Bottomley wrote:
> > Maybe I'm just confused again. I was thinking gcc could move instructions
> > between the asm("ldcw") and mb(). And the same thing between
> > mb() and "lock = 1".
> 
> It can, in a restricted fashion (the asm volatile ensures this
> restriction).  But that's OK.  What we want is that when we complete the
> spin_lock code, the spin is locked and memory clobbered.  That's why the
> only necessary mb() is at the end.

yes - I understand that part and agree the current code is correct.

> > In other words, I'm worried about the "distance" bewteen lock/unlock
> > ops is greater than the "distance" between two mb() that contain
> > the critical section of code.
> 
> Well, the point about using an optimising compiler is that you're
> supposed to give it the information necessary to do its job.  In this
> case, the correct information is that a is volatile. 

Yeah - there is a tradeoff between optimizing the pipeline between
cases where the lock is contended and where it is not. The compiler
does not have that info unless it could consume Profile info.
(ie "Profile Based Optimization").
I was thinking we should favor the contended case to make it
less painful. I've changed my mind and will not push this patch.

> And further, after
> it's finished the spin lock code we prevent no reordering leaks and
> clobber memory.

Yes - no argument about correctness of current code.

> > Erm, how?
> > Our mb() definition uses asm(:::"memory").
> > Is that different from using "memory" in the actual bit of assembly?
> 
> Because in your implementation you have the barrier in the middle of a
> loop; that means it's crossed many times in the contended case.  In mine
> it's only crossed once.

But the _ldcw() is part of a tight "while" loop.
What's the penalty for "crossing" the barrier?
I don't see one.

BTW, do you think we should use coherent loads/stores for locks?
That's the other aspect of the patch that I think jda would like
to see incorporated.

thanks,
grant
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-08 16:52               ` Grant Grundler
@ 2004-09-08 17:11                 ` James Bottomley
  2004-09-10 16:11                   ` Grant Grundler
  0 siblings, 1 reply; 12+ messages in thread
From: James Bottomley @ 2004-09-08 17:11 UTC (permalink / raw)
  To: Grant Grundler; +Cc: John David Anglin, parisc-linux

On Wed, 2004-09-08 at 12:52, Grant Grundler wrote:
> Yeah - there is a tradeoff between optimizing the pipeline between
> cases where the lock is contended and where it is not. The compiler
> does not have that info unless it could consume Profile info.
> (ie "Profile Based Optimization").
> I was thinking we should favor the contended case to make it
> less painful. I've changed my mind and will not push this patch.

Erm, no, optimisation of the pipeline is something different again.  If
we want to hint to the compiler that the lock will be uncontended, then
our code should read:

        while (unlikely(__ldcw(a) == 0))
                while (likely(*a == 0));


> But the _ldcw() is part of a tight "while" loop.
> What's the penalty for "crossing" the barrier?
> I don't see one.

In this case, probably not much ... it would forbid clever optimisations
of the while loops, but I'm sure the compiler isn't actually optimising
them very much.  However, the *principle* is that you want your barriers
where they're effective but do the least damage to the compiler's
ability to optimise.

> BTW, do you think we should use coherent loads/stores for locks?
> That's the other aspect of the patch that I think jda would like
> to see incorporated.

I'm not really sure what the coherent hint actually does.

James


_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [parisc-linux] coherent ops and mb() revisited
  2004-09-08 17:11                 ` James Bottomley
@ 2004-09-10 16:11                   ` Grant Grundler
  0 siblings, 0 replies; 12+ messages in thread
From: Grant Grundler @ 2004-09-10 16:11 UTC (permalink / raw)
  To: James Bottomley; +Cc: John David Anglin, parisc-linux

On Wed, Sep 08, 2004 at 01:11:12PM -0400, James Bottomley wrote:
> > does not have that info unless it could consume Profile info.
> > (ie "Profile Based Optimization").
> > I was thinking we should favor the contended case to make it
> > less painful. I've changed my mind and will not push this patch.
> 
> Erm, no, optimisation of the pipeline is something different again.  If
> we want to hint to the compiler that the lock will be uncontended, then
> our code should read:
> 
>         while (unlikely(__ldcw(a) == 0))
>                 while (likely(*a == 0));

This is general and applies to all uses of locks.
PBO would determine which is the common path at each usage.
But hints are right most of the time - included in the new patch below.

> > But the _ldcw() is part of a tight "while" loop.
> > What's the penalty for "crossing" the barrier?
> > I don't see one.
> 
> In this case, probably not much ... it would forbid clever optimisations
> of the while loops, but I'm sure the compiler isn't actually optimising
> them very much.  However, the *principle* is that you want your barriers
> where they're effective but do the least damage to the compiler's
> ability to optimise.

ok. That makes sense. I just didn't want an overly clever compiler
doing optimizations which are good for the pipeline but bad for
the algorithm.

> > BTW, do you think we should use coherent loads/stores for locks?
> > That's the other aspect of the patch that I think jda would like
> > to see incorporated.
> 
> I'm not really sure what the coherent hint actually does.

coherent hint is an architecturally (PA 2.0) defined mechanism to
enforce ordering of memory accesses - ie makes memory writes
visible to other CPUs in order. This assumes PA2.0 CPU implements
weakly ordered memory (which is not the case and I'll continue
to assert is unlikely to ever happen). But I'm happy to include
this if people think it's the "right thing".

The patch below:
o removes unneeded mb()
o adds likely/unlikely to spinlock loop as suggest by James Bottomley
o adds coherent hints to unlock "store" instruction as suggested
  by John David Anglin.

(o and drops the "memory" asm directives from the previous version)

thanks,
grant


Index: include/asm-parisc/spinlock.h
===================================================================
RCS file: /var/cvs/linux-2.6/include/asm-parisc/spinlock.h,v
retrieving revision 1.6
diff -u -p -r1.6 spinlock.h
--- include/asm-parisc/spinlock.h	15 Aug 2004 14:17:39 -0000	1.6
+++ include/asm-parisc/spinlock.h	10 Sep 2004 15:56:25 -0000
@@ -2,6 +2,7 @@
 #define __ASM_SPINLOCK_H
 
 #include <asm/system.h>
+#include <linux/compiler.h>	/* for likely/unlikely ops */
 
 /* Note that PA-RISC has to use `1' to mean unlocked and `0' to mean locked
  * since it only has load-and-zero. Moreover, at least on some PA processors,
@@ -29,20 +30,28 @@ static inline void _raw_spin_lock(spinlo
 {
 	volatile unsigned int *a;
 
-	mb();
 	a = __ldcw_align(x);
-	while (__ldcw(a) == 0)
-		while (*a == 0);
+	while (unlikely(__ldcw(a) == 0))
+		while (likely(*a == 0));
 	mb();
 }
 
 static inline void _raw_spin_unlock(spinlock_t *x)
 {
 	volatile unsigned int *a;
+
 	mb();
 	a = __ldcw_align(x);
-	*a = 1;
-	mb();
+
+	/* use a coherent store. PA1.1 is always strongly ordered.
+	* The idea here is stores to locks will enforce memory ordering
+	* should any PA20 chip ever implement weakly ordered memory.
+	* To date, no PA2.0 implementation is weakly ordered.
+	* jda would prefer we use coherent stores (",ma" with zero offset
+	* is the same thing but PA1.1 compatible).
+	*/
+	__asm__ __volatile__ ("stw,ma %1,0(%0)"
+			: : "r" (a), "r" (1) );
 }
 
 static inline int _raw_spin_trylock(spinlock_t *x)
@@ -50,7 +59,6 @@ static inline int _raw_spin_trylock(spin
 	volatile unsigned int *a;
 	int ret;
 
-	mb();
 	a = __ldcw_align(x);
         ret = __ldcw(a) != 0;
 	mb();
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2004-09-10 16:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-09-05  1:38 [parisc-linux] coherent ops and mb() revisited Grant Grundler
2004-09-05  2:56 ` James Bottomley
2004-09-05  6:27   ` John David Anglin
2004-09-05 14:36     ` James Bottomley
2004-09-06  4:19       ` Grant Grundler
2004-09-06  9:24         ` John David Anglin
2004-09-06 14:15         ` James Bottomley
2004-09-07 15:17           ` Grant Grundler
2004-09-07 15:30             ` James Bottomley
2004-09-08 16:52               ` Grant Grundler
2004-09-08 17:11                 ` James Bottomley
2004-09-10 16:11                   ` Grant Grundler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.