public inbox for linux-kernel@vger.kernel.org
* About a change to the implementation of spin lock in 2.6.12 kernel.
@ 2005-07-14  2:20 multisyncfe991
From: multisyncfe991 @ 2005-07-14  2:20 UTC (permalink / raw)
  To: linux-kernel

Hi,

I found that _spin_lock uses the LOCK prefix to make the following operation
"decb %0" atomic. As you know, the LOCK prefix alone takes almost 70 clock
cycles to finish, and this adds a lot of cost to _spin_lock. However,
_spin_unlock does not use LOCK; it uses "movb $1,%0"
instead, since 4-byte writes on 4-byte aligned addresses are atomic.

So I want to rewrite the _spin_lock defined in spinlock.h
(/linux/include/asm-i386) as follows to reduce its overhead
and make it more efficient.
#define spin_lock_string \
        "\n1:\t" \
        "cmpb $0,%0\n\t" \
        "jle 2f\n\t" \
        "movb $0, %0\n\t" \
        "jmp 3f\n" \
        "2:\t" \
        "rep;nop\n\t" \
        "cmpb $0, %0\n\t" \
        "jle 2b\n\t" \
        "jmp 1b\n" \
        "3:\n\t"

Compared with the original version below, the LOCK prefix is removed.
I rebuilt the Intel e1000 Gigabit driver with this _spin_lock and saw
about a 2% throughput improvement.
#define spin_lock_string \
            "\n1:\t" \
            "lock ; decb %0\n\t" \
            "jns 3f\n" \
            "2:\t" \
            "rep;nop\n\t" \
            "cmpb $0,%0\n\t" \
            "jle 2b\n\t" \
            "jmp 1b\n" \
            "3:\n\t"
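
For readers who do not read i386 asm, here is a rough user-space sketch of
what the original string above does, written with C11 atomics and pthreads.
It is only an illustration under my own assumptions, not the kernel code:
all sketch_* names are hypothetical, and the memory orderings are my reading
of the asm rather than anything stated in spinlock.h.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical type: 1 = unlocked, <= 0 = locked, like the byte in the asm. */
typedef struct { atomic_schar slock; } sketch_spinlock_t;

static sketch_spinlock_t sketch_lock = { 1 };
static long sketch_counter;   /* protected by sketch_lock */
static int sketch_iters;      /* iterations per worker thread */

static void sketch_spin_lock(sketch_spinlock_t *l)
{
    for (;;) {
        /* "lock ; decb %0" then "jns 3f": one atomic read-modify-write.
         * The lock is taken when the old value was 1 (result is 0). */
        if (atomic_fetch_sub_explicit(&l->slock, 1, memory_order_acquire) > 0)
            return;
        /* "2: rep;nop ; cmpb $0,%0 ; jle 2b": wait with plain reads
         * until the byte is positive again. */
        while (atomic_load_explicit(&l->slock, memory_order_relaxed) <= 0)
            ;   /* the asm executes rep;nop (PAUSE) here */
        /* "jmp 1b": go back and retry the atomic decrement. */
    }
}

static void sketch_spin_unlock(sketch_spinlock_t *l)
{
    /* "movb $1,%0": a plain store, no LOCK prefix. */
    atomic_store_explicit(&l->slock, 1, memory_order_release);
}

static void *sketch_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < sketch_iters; i++) {
        sketch_spin_lock(&sketch_lock);
        sketch_counter++;              /* critical section */
        sketch_spin_unlock(&sketch_lock);
    }
    return NULL;
}

/* Hypothetical helper: run nthreads workers and return the final counter. */
static long sketch_run_demo(int nthreads, int iters)
{
    pthread_t tid[8];
    assert(nthreads > 0 && nthreads <= 8);
    atomic_store(&sketch_lock.slock, 1);
    sketch_counter = 0;
    sketch_iters = iters;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, sketch_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return sketch_counter;
}
```

Calling sketch_run_demo(4, 50000) should return 200000 if the lock gives
mutual exclusion. The atomic_fetch_sub_explicit call is the part that
corresponds to the LOCK-prefixed decb.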

Do you think I can get better performance if I dig further?

Any ideas will be greatly appreciated,

L.Y.
