public inbox for linux-ia64@vger.kernel.org
From: Keith Owens <kaos@sgi.com>
To: linux-ia64@vger.kernel.org
Subject: Re: [patch 2.6.15-rc5] Define an ia64 version of __raw_read_trylock
Date: Mon, 12 Dec 2005 13:20:19 +0000	[thread overview]
Message-ID: <11724.1134393619@ocs3.ocs.com.au> (raw)
In-Reply-To: <17916.1134185068@ocs3.ocs.com.au>

On 12 Dec 2005 14:16:29 +0800, 
Zou Nan hai <nanhai.zou@intel.com> wrote:
>On Sat, 2005-12-10 at 11:24, Keith Owens wrote:
>> IA64 is using the generic version of __raw_read_trylock, which always
>> waits for the lock to be free instead of returning when the lock is in
>> use.  Define an ia64 version of __raw_read_trylock which behaves
>> correctly, and drop the generic one.
>> 
>> Signed-off-by: Keith Owens <kaos@sgi.com>
>> 
>> Index: linux/include/asm-ia64/spinlock.h
>> ===================================================================
>> --- linux.orig/include/asm-ia64/spinlock.h      2005-12-09 14:44:56.883252331 +1100
>> +++ linux/include/asm-ia64/spinlock.h   2005-12-10 14:20:44.459423430 +1100
>> @@ -201,6 +201,16 @@ static inline void __raw_write_unlock(ra
>>  
>>  #endif /* !ASM_SUPPORTED */
>>  
>> -#define __raw_read_trylock(lock) generic__raw_read_trylock(lock)
>> +static inline int __raw_read_trylock(raw_rwlock_t *x)
>> +{
>> +       union {
>> +               raw_rwlock_t lock;
>> +               __u32 word;
>> +       } old, new;
>> +       old.lock = new.lock = *x;
>> +       old.lock.write_lock = new.lock.write_lock = 0;
>> +       ++new.lock.read_counter;
>> +       return (u32)ia64_cmpxchg4_acq((__u32 *)(x), new.word, old.word) == old.word;
>> +}
>>  
>>  #endif /*  _ASM_IA64_SPINLOCK_H */
>> 
>
>   How about using fetchadd4.acq 1 in trylock and rollback by fetchadd4.rel -1 if lock failed, 
>just like how __raw_read_lock does?
>   That should emit less code and add no extra memory footprint, I think.

The code generated for the patch requires no extra memory, the old and
new variables are optimized to registers.

0000000000000000 <_raw_read_trylock>:
   0:   05 58 00 40 10 d0       [MLX]       ld4 r11=[r32]
   6:   7f 00 00 00 00 20                   movl r9=0x7fffffff;;
   c:   f1 f7 ff 67
  10:   0b 18 24 16 0c 20       [MMI]       and r3=r9,r11;;
  16:   a0 08 0c 00 42 00                   adds r10=1,r3
  1c:   00 00 04 00                         nop.i 0x0;;
  20:   00 40 28 12 0c 20       [MII]       and r8=r10,r9
  26:   00 00 00 02 00 00                   nop.i 0x0
  2c:   00 00 04 00                         nop.i 0x0
  30:   0b 00 0c 40 2a 04       [MMI]       mov.m ar.ccv=r3;;
  36:   20 40 80 22 20 00                   cmpxchg4.acq r2=[r32],r8,ar.ccv
  3c:   00 00 04 00                         nop.i 0x0;;
  40:   1d 40 08 06 89 38       [MFB]       cmp4.eq p8,p9=r2,r3
  46:   00 00 00 02 00 00                   nop.f 0x0
  4c:   00 00 00 20                         nop.b 0x0;;
  50:   11 41 04 00 00 64       [MIB] (p08) mov r8=1
  56:   82 00 00 00 42 80             (p09) mov r8=r0
  5c:   08 00 84 00                         br.ret.sptk.many b0;;

The code with fetchadd is one bundle less, but it requires an extra
memory update and possible cache line bounce when the lock is
contended.  We are talking about micro-optimizations here so I doubt
that the difference is measurable, but my gut says that an extra memory
update in the contention case is (in general) the wrong approach.

00000000000007e0 <_raw_read_trylock>:
 7e0:   0e 10 0c 40 91 10       [MMF]       fetchadd4.acq r2=[r32],1
 7e6:   00 00 00 02 00 00                   nop.m 0x0
 7ec:   00 00 04 00                         nop.f 0x0
 7f0:   1d 40 04 00 00 24       [MFB]       mov r8=1
 7f6:   00 00 00 02 00 00                   nop.f 0x0
 7fc:   00 00 00 20                         nop.b 0x0;;
 800:   11 40 08 00 89 30       [MIB]       cmp4.lt p8,p9=r2,r0
 806:   00 00 00 02 80 84                   nop.i 0x0
 80c:   08 00 84 02                   (p09) br.ret.dptk.many b0;;
 810:   00 70 1c 40 b1 10       [MII]       fetchadd4.rel r14=[r32],-1
 816:   00 00 00 02 00 00                   nop.i 0x0
 81c:   00 00 04 00                         nop.i 0x0
 820:   11 40 00 00 00 21       [MIB]       mov r8=r0
 826:   00 00 00 02 00 80                   nop.i 0x0
 82c:   08 00 84 00                         br.ret.sptk.many b0;;

Index: linux/include/asm-ia64/spinlock.h
===================================================================
--- linux.orig/include/asm-ia64/spinlock.h	2005-12-10 19:50:34.001197292 +1100
+++ linux/include/asm-ia64/spinlock.h	2005-12-13 00:12:47.250022120 +1100
@@ -201,6 +201,12 @@ static inline void __raw_write_unlock(ra
 
 #endif /* !ASM_SUPPORTED */
 
-#define __raw_read_trylock(lock) generic__raw_read_trylock(lock)
+static inline int __raw_read_trylock(raw_rwlock_t *x)
+{
+	if (likely(ia64_fetchadd(1, (int *) x, acq) >= 0))
+		return 1;
+	ia64_fetchadd(-1, (int *) x, rel);
+	return 0;
+}
 
 #endif /*  _ASM_IA64_SPINLOCK_H */



Thread overview: 5+ messages
2005-12-10  3:24 [patch 2.6.15-rc5] Define an ia64 version of __raw_read_trylock Keith Owens
2005-12-12  2:34 ` Luck, Tony
2005-12-12  3:04 ` Keith Owens
2005-12-12  6:16 ` Zou Nan hai
2005-12-12 13:20 ` Keith Owens [this message]
