From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Jamie Lokier <jamie@shareable.org>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arm-kernel@lists.arm.linux.org.uk,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: Broken ARM atomic ops wrt memory barriers (was : [PATCH] Add cmpxchg support for ARMv6+ systems)
Date: Mon, 25 May 2009 16:22:10 -0400
Message-ID: <20090525202210.GD22651@Krystal>
In-Reply-To: <20090525195656.GC3667@n2100.arm.linux.org.uk>
* Russell King - ARM Linux (linux@arm.linux.org.uk) wrote:
> This reply is based upon what's in your email rather than atomic_ops.
>
> On Mon, May 25, 2009 at 01:29:55PM -0400, Mathieu Desnoyers wrote:
> > This is a very good start, but I think a few are still missing :
> >
> > in atomic.h :
> >
> > /* Atomic operations are already serializing on ARM */
> > #define smp_mb__before_atomic_dec() barrier()
> > #define smp_mb__after_atomic_dec() barrier()
> > #define smp_mb__before_atomic_inc() barrier()
> > #define smp_mb__after_atomic_inc() barrier()
> >
> > should probably map to smp_mb() for arm v6+.
>
> BTW, I think you're wrong here. atomic_dec() and atomic_inc() are
> implemented using atomic_add_return() and atomic_sub_return(). Both
> of these functions are serializing as a result of the patch you
> replied to.
>
Hrm, then atomic inc/dec should not be implemented on top of the add/sub
return variants with memory barriers, given this would kill performance
for no reason. It would be better to re-implement barrier-less inc/dec
and add those smp_mb__*() primitives.
But you are right: it's not a bug, just... very slow.
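
A minimal userspace sketch of that split, using C11 atomics as a stand-in
for the ARM kernel primitives. All sketch_* names are hypothetical and are
not the kernel's definitions; the point is only that inc/dec stay
barrier-free while the smp_mb__*() helpers carry the ordering cost:

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } sketch_atomic_t;

/* Stand-in for smp_mb(): a full memory barrier. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Barrier-less inc/dec: atomic, but no ordering is implied. */
static inline void sketch_atomic_inc(sketch_atomic_t *v)
{
	atomic_fetch_add_explicit(&v->counter, 1, memory_order_relaxed);
}

static inline void sketch_atomic_dec(sketch_atomic_t *v)
{
	atomic_fetch_sub_explicit(&v->counter, 1, memory_order_relaxed);
}

/* Value-returning variant: fully ordered, so it pays for two barriers. */
static inline int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	int ret;

	sketch_smp_mb();
	ret = atomic_fetch_add_explicit(&v->counter, i,
					memory_order_relaxed) + i;
	sketch_smp_mb();
	return ret;
}

/* Callers that need ordering around inc/dec ask for it explicitly. */
#define sketch_smp_mb__before_atomic_dec()	sketch_smp_mb()
#define sketch_smp_mb__after_atomic_dec()	sketch_smp_mb()
#define sketch_smp_mb__before_atomic_inc()	sketch_smp_mb()
#define sketch_smp_mb__after_atomic_inc()	sketch_smp_mb()
```

With this layout a plain sketch_atomic_inc() costs no barrier, and only
call sites that rely on ordering pay for one.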
> > Also, bitops.h should have : (taken from powerpc)
> >
> > /*
> > * clear_bit doesn't imply a memory barrier
> > */
> > #define smp_mb__before_clear_bit() smp_mb()
> > #define smp_mb__after_clear_bit() smp_mb()
>
> Again, disagree. With the current definition being mb(), they become
> either:
>
> - a compiler barrier on UP architectures (which don't have weak ordering
> models)
> - a data memory barrier on UP coherent xscale (don't know if this has
> weak ordering)
> - a data memory barrier on SMP
>
> So, I think no change is required; mb() is doing at least the right thing.
> (Whether it's heavier than it actually needs to be is another question,
> and that only affects the coherent xscale stuff. That is out of my
> knowledge to answer.)
Right, no bug here, only probably much slower.
To give you an order of magnitude, a cmpxchg_local primitive without the
memory barriers takes 11 cycles on my ARMv7 omap3. cmpxchg with mb()
before and after the cmpxchg takes 71 cycles. That's 6.45 times slower.
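
For reference, the two variants being compared look roughly like this
when written with C11 atomics in userspace. This is a sketch under
assumptions: the sketch_* names are hypothetical, and the relaxed C11
compare-exchange is still atomic across CPUs, whereas the kernel's
cmpxchg_local only needs to be atomic with respect to the local CPU:

```c
#include <stdatomic.h>

/*
 * Barrier-less compare-and-exchange. Returns the value found in *ptr,
 * which equals 'old' when the exchange succeeded.
 */
static inline long sketch_cmpxchg_local(atomic_long *ptr, long old,
					long new_val)
{
	long expected = old;

	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong_explicit(ptr, &expected, new_val,
						memory_order_relaxed,
						memory_order_relaxed);
	return expected;
}

/* Same operation with a full barrier before and after it. */
static inline long sketch_cmpxchg(atomic_long *ptr, long old, long new_val)
{
	long prev;

	atomic_thread_fence(memory_order_seq_cst);	/* mb() before */
	prev = sketch_cmpxchg_local(ptr, old, new_val);
	atomic_thread_fence(memory_order_seq_cst);	/* mb() after */
	return prev;
}
```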
>
> > According to atomic_ops.txt, 3 other bitwise atomic ops imply memory
> > barriers :
> >
> > "There are two special bitops with lock barrier semantics (acquire/release,
> > same as spinlocks). These operate in the same way as their non-_lock/unlock
> > postfixed variants, except that they are to provide acquire/release semantics,
> > respectively. This means they can be used for bit_spin_trylock and
> > bit_spin_unlock type operations without specifying any more barriers.
> >
> > int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
> > void clear_bit_unlock(unsigned long nr, unsigned long *addr);
> > void __clear_bit_unlock(unsigned long nr, unsigned long *addr);
> >
> > The __clear_bit_unlock version is non-atomic, however it still implements
> > unlock barrier semantics. This can be useful if the lock itself is protecting
> > the other bits in the word."
>
> It looks to me that if we make arch/arm/lib/bitops.h fully ordered then
> these get sorted out for free.
>
Yes, that was my first thought too. I think all of these
implementations use bitops.h.
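
To illustrate the lock-style bitops quoted above, here is a userspace
sketch with C11 atomics. The sketch_* names and the atomic_ulong
parameter type are assumptions (the kernel prototypes take a plain
unsigned long *), and it uses acquire/release orderings, the minimum the
quoted text asks for; fully ordered bitops would satisfy it as well:

```c
#include <limits.h>
#include <stdatomic.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Set bit 'nr' with acquire ordering; returns the bit's previous value. */
static inline int sketch_test_and_set_bit_lock(unsigned long nr,
					       atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	atomic_ulong *word = addr + nr / SKETCH_BITS_PER_LONG;
	unsigned long old;

	old = atomic_fetch_or_explicit(word, mask, memory_order_acquire);
	return (old & mask) != 0;
}

/* Clear bit 'nr' with release ordering. */
static inline void sketch_clear_bit_unlock(unsigned long nr,
					   atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	atomic_ulong *word = addr + nr / SKETCH_BITS_PER_LONG;

	atomic_fetch_and_explicit(word, ~mask, memory_order_release);
}
```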
> > arch/arm/include/asm/mutex.h should also have smp_mb() to provide
> > acquire/release semantic to mutex fastpath (like spinlock does),
> > otherwise subtle deadlocks and various problems could occur.
>
> Hmm, the mutex is undocumented in the atomic ops document. Does it
> require ordering both before and after, or do some of those ops just
> need it before acquire and after release?
>
I guess the mutex fast path should probably be added to atomic_ops.txt.
If I look at the PowerPC mutex.h, mutex lock provides acquire semantics
(like spin lock) and mutex unlock provides release semantics (like spin
unlock).
acquire:
  take lock
  smp_mb()
  (critical section memory accesses)
release:
  (critical section memory accesses)
  smp_mb()
  release lock
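
The same ordering can be sketched in userspace C11 as a trylock/unlock
pair. This is not the PowerPC or ARM mutex fast path; sketch_lock_t and
the sketch_* functions are hypothetical, and a seq_cst fence stands in
for smp_mb():

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int locked; } sketch_lock_t;

/* acquire: take the lock, then issue the barrier. */
static inline bool sketch_trylock(sketch_lock_t *l)
{
	int expected = 0;
	bool taken;

	taken = atomic_compare_exchange_strong_explicit(&l->locked,
							&expected, 1,
							memory_order_relaxed,
							memory_order_relaxed);
	if (taken)
		atomic_thread_fence(memory_order_seq_cst);
	/* Critical section accesses cannot move above this point. */
	return taken;
}

/* release: issue the barrier, then release the lock. */
static inline void sketch_unlock(sketch_lock_t *l)
{
	/* Critical section accesses cannot move below this point. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&l->locked, 0, memory_order_relaxed);
}
```

Without the two fences, accesses from the critical section could be
observed outside the locked region on a weakly ordered CPU.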
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68