From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: [IA64] Reduce __clear_bit_unlock overhead
Date: Fri, 19 Oct 2007 09:14:34 +0000
Message-ID: <4718757A.9040805@bull.net>
In-Reply-To: <Pine.LNX.4.64.0710182037220.25820@schroedinger.engr.sgi.com>

You may want to avoid assembly magic:

static __inline__ void
__clear_bit_unlock(int const nr, volatile void * const addr)
{
        volatile __u32 * const m = (volatile __u32 *) addr + (nr >> 5);

        *m &= ~(1 << (nr & 0x1f));
}

On IA-64, GCC compiles volatile loads with ".acq" and volatile stores with ".rel".
E.g. the following program:

int lo = 3;

main()
{
        __clear_bit_unlock(1, &lo);
}

compiles into (NOP-s removed):

4000000000000680 <main>:
4000000000000680:       0b 70 e0 03 00 24       [MMI]       addl r14=120,r1;;
4000000000000686:       f0 00 38 60 21 00                   ld4.acq r15=[r14]
4000000000000690:       0a 78 f4 1f 2c 22       [MMI]       and r15=-3,r15;;
4000000000000696:       00 78 38 60 23 00                   st4.rel [r14]=r15
40000000000006ac:       08 00 84 00                         br.ret.sptk.many b0;;

Actually, we don't need the load to be ".acq". Somewhat less readable code:

static __inline__ void
__clear_bit_unlock(int const nr, void * const addr)
{
        __u32 * const p = (__u32 *) addr + (nr >> 5);

        * (volatile __u32 *) p = *p & ~(1 << (nr & 0x1f));
}

gives you:

4000000000000680 <main>:
4000000000000680:       0b 70 e0 03 00 24       [MMI]       addl r14=120,r1;;
4000000000000686:       f0 00 38 20 20 00                   ld4 r15=[r14]
4000000000000690:       0a 78 f4 1f 2c 22       [MMI]       and r15=-3,r15;;
4000000000000696:       00 78 38 60 23 00                   st4.rel [r14]=r15
40000000000006ac:       08 00 84 00                         br.ret.sptk.many b0;;

which can be slightly more efficient.
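
For comparison, the same "plain load, release store" idea can be spelled
out with C11 atomics instead of relying on how the compiler treats
volatile. This is only a sketch: the name clear_bit_unlock_c11() is
hypothetical, uint32_t stands in for __u32, and this is not how the
kernel implements the primitive.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical name; a portable sketch, not the kernel implementation. */
static inline void
clear_bit_unlock_c11(int nr, _Atomic uint32_t *addr)
{
        _Atomic uint32_t *p = addr + (nr >> 5);

        /* The load needs no ordering of its own. */
        uint32_t old = atomic_load_explicit(p, memory_order_relaxed);

        /*
         * Non-atomic read-modify-write: only the store carries release
         * ordering, so it is safe only while the caller owns the lock.
         */
        atomic_store_explicit(p, old & ~(UINT32_C(1) << (nr & 0x1f)),
                              memory_order_release);
}
```

As with the volatile version, the read-modify-write as a whole is not
atomic; only the final store is ordered after earlier accesses.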

Another remark:
We are adding more variants of existing functions, e.g.:

clear_bit()
__clear_bit()

I've got problems with hidden semantics.
Just reading the source (where they are used), I simply cannot guess
if a primitive is atomic or not, if it is with some fencing or w/o.

Can we not have some "speaking names"? E.g.: bit_unlock_Natomic_rel()
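
To illustrate what I mean by "speaking names", here is a rough sketch in
which the identifier itself says whether the operation is atomic and what
ordering it has. All three names are hypothetical, and C11 atomics with
uint32_t are used only to keep the example self-contained.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names: atomicity and ordering are part of the identifier. */

/* Non-atomic read-modify-write; the store has release semantics. */
static inline void
bit_clear_nonatomic_rel(int nr, _Atomic uint32_t *addr)
{
        _Atomic uint32_t *p = addr + (nr >> 5);
        uint32_t old = atomic_load_explicit(p, memory_order_relaxed);

        atomic_store_explicit(p, old & ~(UINT32_C(1) << (nr & 0x1f)),
                              memory_order_release);
}

/* Atomic read-modify-write with release semantics. */
static inline void
bit_clear_atomic_rel(int nr, _Atomic uint32_t *addr)
{
        atomic_fetch_and_explicit(addr + (nr >> 5),
                                  ~(UINT32_C(1) << (nr & 0x1f)),
                                  memory_order_release);
}

/* Atomic read-modify-write without any fencing. */
static inline void
bit_clear_atomic_nofence(int nr, _Atomic uint32_t *addr)
{
        atomic_fetch_and_explicit(addr + (nr >> 5),
                                  ~(UINT32_C(1) << (nr & 0x1f)),
                                  memory_order_relaxed);
}
```

With names like these, a reader of the calling code can tell at a glance
which guarantees a primitive gives.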

Zoltan Menyhart
