public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: linux-ia64@vger.kernel.org
Subject: Re: [IA64] Reduce __clear_bit_unlock overhead
Date: Fri, 19 Oct 2007 04:34:48 +0000	[thread overview]
Message-ID: <200710191434.48445.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <Pine.LNX.4.64.0710182037220.25820@schroedinger.engr.sgi.com>

On Friday 19 October 2007 13:38, Christoph Lameter wrote:
> On Fri, 19 Oct 2007, Nick Piggin wrote:
> > I'm not sure, I had an idea it was relatively expensive on ia64,
> > but I didn't really test with a good workload (a microbenchmark
> > probably isn't that good because it won't generate too much out
> > of order memory traffic that needs to be fenced).
>
> Its expensive on IA64. Is it any less expensive on x86?

Probably relatively less because x86 has only a very small amount of
possible reordering, perhaps. (it can only reorder loads past
earlier stores).

Of course, we can avoid the fence altogether on x86 as well, because
it simply isn't needed for release semantics.


> > > Where can I find your patchset? I looked through lkml but did not see
> > > it.
> >
> > Infrastructure in -mm, starting at bitops-introduce-lock-ops.patch.
> > bit_spin_lock-use-lock-bitops.patch and ia64-lock-bitops.patch are
> > ones to look at.
>
> ia64-lock-bitops.patch defines:
>
> static __inline__ void
> clear_bit_unlock (int nr, volatile void *addr)
> {
>        __u32 mask, old, new;
>        volatile __u32 *m;
>        CMPXCHG_BUGCHECK_DECL
>
>        m = (volatile __u32 *) addr + (nr >> 5);
>        mask = ~(1 << (nr & 31));
>        do {
>                CMPXCHG_BUGCHECK(m);
>                old = *m;
>                new = old & mask;
>        } while (cmpxchg_rel(m, old, new) != old);
> }
>
> /**
>  * __clear_bit_unlock - Non-atomically clear a bit with release
>  *
>  * This is like clear_bit_unlock, but the implementation may use a
> non-atomic * store (this one uses an atomic, however).
>  */
> #define __clear_bit_unlock clear_bit_unlock
>
>
> A non atomic store is a misaligned store on IA64. That is not
> relevant here. The data is properly aligned. I guess it was intended to
> refer to the cmpxchg.

Well it's just saying that it _can_ be non-atomic, but the implementation
here is actually atomic, because....


> How about this patch? [Works fine on IA64 simulator...]
>
>
>
>
> IA64: Slim down __clear_bit_unlock
>
> __clear_bit_unlock does not need to perform atomic operations on the
> variable. Avoid a cmpxchg and simply do a store with release semantics. Add
> a barrier to be safe that the compiler does not do funky things.
>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
>
> ---
>  include/asm-ia64/bitops.h |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> Index: linux-2.6.23-mm1/include/asm-ia64/bitops.h
> =================================> --- linux-2.6.23-mm1.orig/include/asm-ia64/bitops.h	2007-10-18
> 19:37:22.000000000 -0700 +++
> linux-2.6.23-mm1/include/asm-ia64/bitops.h	2007-10-18 19:50:22.000000000
> -0700 @@ -124,10 +124,21 @@ clear_bit_unlock (int nr, volatile void
>  /**
>   * __clear_bit_unlock - Non-atomically clear a bit with release
>   *
> - * This is like clear_bit_unlock, but the implementation may use a
> non-atomic - * store (this one uses an atomic, however).
> + * This is like clear_bit_unlock, but the implementation uses a store
> + * with release semantics. See also __raw_spin_unlock().
>   */
> -#define __clear_bit_unlock clear_bit_unlock
> +static __inline__ void
> +__clear_bit_unlock (int nr, volatile void *addr)
> +{
> +	__u32 mask, new;
> +	volatile __u32 *m;
> +
> +	m = (volatile __u32 *) addr + (nr >> 5);
> +	mask = ~(1 << (nr & 31));
> +	new = *m & mask;
> +	barrier();
> +	asm volatile ("st4.rel.nta [%0] = %1\n\t" :: "r"(m), "r"(new));
> +}

.... I didn't realise ia64 could do plain stores with release
semantics -- I should have looked closer at spinlock.h. Yes,
this is exactly what we want here. This patch will now apply
upstream, so if you want to send it to Linus, that would be
nice. Thanks!

Acked-by: Nick Piggin <npiggin@suse.de>

  reply	other threads:[~2007-10-19  4:34 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-19  3:38 [IA64] Reduce __clear_bit_unlock overhead Christoph Lameter
2007-10-19  4:34 ` Nick Piggin [this message]
2007-10-19  9:14 ` Zoltan Menyhart
2007-10-19  9:28 ` Nick Piggin
2007-10-19 10:58 ` Christoph Lameter
2007-10-19 11:12 ` Christoph Lameter
2007-10-19 14:15 ` Zoltan Menyhart
2007-10-19 17:44 ` Christoph Lameter
2007-10-21  4:43 ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200710191434.48445.nickpiggin@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=linux-ia64@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox