* x86-32: clean up rwsem inline asm statements
@ 2010-01-13 0:21 Linus Torvalds
2010-01-13 0:45 ` Andreas Schwab
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 0:21 UTC (permalink / raw)
To: Ingo Molnar, Peter Anvin, Thomas Gleixner; +Cc: Linux Kernel Mailing List
This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.
So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.
This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".
Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to implement an rwsem. The mm->mmap_sem can be heavily
contended when lots of threads are all taking page faults, and the
spinlock-based fallback rwsem code performs abysmally in that case.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This really should be a no-op, but it does seem to do something to gcc,
because when I "cmp" the whole kernel binaries before and after, they don't
match. The differences _seem_ to be some code-size issue that changes all
the offsets; it's a bit hard to tell more from the binary alone.
Anyway, this is purely preparatory, but I think it cleans up the asm too,
so...
Linus
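As a rough illustration of why the xadd version wins, the read-lock fast path boils down to a single atomic add plus a sign test, with no spinlock round trip. A hypothetical sketch in portable C11 atomics (not the kernel's actual code, which is the inline asm below):

```c
#include <stdatomic.h>

#define RWSEM_ACTIVE_READ_BIAS 1

/* Hypothetical sketch of the xadd-style down_read fast path: one
 * atomic add plus a sign test on the result.  A negative result means
 * a writer holds (or is queued on) the lock and the caller must take
 * the slow path. */
static int down_read_fast(atomic_int *count)
{
    int old = atomic_fetch_add(count, RWSEM_ACTIVE_READ_BIAS);
    return (old + RWSEM_ACTIVE_READ_BIAS) >= 0;   /* the "jns" test */
}
```

On x86 this compiles down to essentially the `lock inc`/`lock xadd` plus sign-flag test that the patch below generates.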
arch/x86/include/asm/rwsem.h | 30 +++++++++++++++---------------
1 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index ca7517d..4a884e0 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -105,7 +105,7 @@ do { \
static inline void __down_read(struct rw_semaphore *sem)
{
asm volatile("# beginning down_read\n\t"
- LOCK_PREFIX " incl (%%eax)\n\t"
+ LOCK_PREFIX " inc%z0 (%1)\n\t"
/* adds 0x00000001, returns the old value */
" jns 1f\n"
" call call_rwsem_down_read_failed\n"
@@ -123,10 +123,10 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
{
__s32 result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
- " movl %0,%1\n\t"
+ " mov%z0 %0,%1\n\t"
"1:\n\t"
- " movl %1,%2\n\t"
- " addl %3,%2\n\t"
+ " mov%z0 %1,%2\n\t"
+ " add%z0 %3,%2\n\t"
" jle 2f\n\t"
LOCK_PREFIX " cmpxchgl %2,%0\n\t"
" jnz 1b\n\t"
@@ -147,9 +147,9 @@ static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
tmp = RWSEM_ACTIVE_WRITE_BIAS;
asm volatile("# beginning down_write\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd%z0 %1,(%2)\n\t"
/* subtract 0x0000ffff, returns the old value */
- " testl %%edx,%%edx\n\t"
+ " test%z0 %1,%1\n\t"
/* was the count 0 before? */
" jz 1f\n"
" call call_rwsem_down_write_failed\n"
@@ -185,7 +185,7 @@ static inline void __up_read(struct rw_semaphore *sem)
{
__s32 tmp = -RWSEM_ACTIVE_READ_BIAS;
asm volatile("# beginning __up_read\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd%z0 %1,(%2)\n\t"
/* subtracts 1, returns the old value */
" jns 1f\n\t"
" call call_rwsem_wake\n"
@@ -201,18 +201,18 @@ static inline void __up_read(struct rw_semaphore *sem)
*/
static inline void __up_write(struct rw_semaphore *sem)
{
+ unsigned long tmp;
asm volatile("# beginning __up_write\n\t"
- " movl %2,%%edx\n\t"
- LOCK_PREFIX " xaddl %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd%z0 %1,(%2)\n\t"
/* tries to transition
0xffff0001 -> 0x00000000 */
" jz 1f\n"
" call call_rwsem_wake\n"
"1:\n\t"
"# ending __up_write\n"
- : "+m" (sem->count)
- : "a" (sem), "i" (-RWSEM_ACTIVE_WRITE_BIAS)
- : "memory", "cc", "edx");
+ : "+m" (sem->count),"=d" (tmp)
+ : "a" (sem), "1" (-RWSEM_ACTIVE_WRITE_BIAS)
+ : "memory", "cc");
}
/*
@@ -221,7 +221,7 @@ static inline void __up_write(struct rw_semaphore *sem)
static inline void __downgrade_write(struct rw_semaphore *sem)
{
asm volatile("# beginning __downgrade_write\n\t"
- LOCK_PREFIX " addl %2,(%%eax)\n\t"
+ LOCK_PREFIX " add%z0 %2,(%1)\n\t"
/* transitions 0xZZZZ0001 -> 0xYYYY0001 */
" jns 1f\n\t"
" call call_rwsem_downgrade_wake\n"
@@ -237,7 +237,7 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
*/
static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
{
- asm volatile(LOCK_PREFIX "addl %1,%0"
+ asm volatile(LOCK_PREFIX "add%z0 %1,%0"
: "+m" (sem->count)
: "ir" (delta));
}
@@ -249,7 +249,7 @@ static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
{
int tmp = delta;
- asm volatile(LOCK_PREFIX "xadd %0,%1"
+ asm volatile(LOCK_PREFIX "xadd%z0 %0,%1"
: "+r" (tmp), "+m" (sem->count)
: : "memory");
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: x86-32: clean up rwsem inline asm statements
2010-01-13 0:21 x86-32: clean up rwsem inline asm statements Linus Torvalds
@ 2010-01-13 0:45 ` Andreas Schwab
2010-01-13 1:26 ` Linus Torvalds
2010-01-13 0:48 ` H. Peter Anvin
2010-01-13 5:03 ` [tip:x86/asm] x86-32: clean up rwsem inline asm statements tip-bot for Linus Torvalds
2 siblings, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2010-01-13 0:45 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, Peter Anvin, Thomas Gleixner,
Linux Kernel Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> @@ -249,7 +249,7 @@ static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
> {
> int tmp = delta;
>
> - asm volatile(LOCK_PREFIX "xadd %0,%1"
> + asm volatile(LOCK_PREFIX "xadd%z0 %0,%1"
> : "+r" (tmp), "+m" (sem->count)
> : : "memory");
I think %z0 should be %z1 here.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: x86-32: clean up rwsem inline asm statements
2010-01-13 0:21 x86-32: clean up rwsem inline asm statements Linus Torvalds
2010-01-13 0:45 ` Andreas Schwab
@ 2010-01-13 0:48 ` H. Peter Anvin
2010-01-13 0:59 ` Linus Torvalds
2010-01-13 5:03 ` [tip:x86/asm] x86-32: clean up rwsem inline asm statements tip-bot for Linus Torvalds
2 siblings, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2010-01-13 0:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
On 01/12/2010 04:21 PM, Linus Torvalds wrote:
>
> So instead of using "(%%eax)" to specify the memory address that is the
> semaphore, we use "(%1)" or similar. And instead of forcing the operation
> to always be 32-bit, we use "%z0", taking the size from the actual
> semaphore data structure itself.
>
> This doesn't actually matter on x86-32, but if we want to use the same
> inline asm for x86-64, we'll need to have the compiler generate the proper
> 64-bit names for the registers (%rax instead of %eax), and if we want to
> use a 64-bit counter too (in order to avoid the 15-bit limit on the
> write counter that limits concurrent users to 32767 threads), we'll need
> to be able to generate instructions with "q" accesses rather than "l".
>
As far as I can tell, very few of these assembly statements actually
need a size at all -- the very first inc statement is purely to memory,
and as such it needs a size marker, but almost any operation which is
register-register or register-memory will simply take its size from the
register operand. For those, it seems cleaner to simply drop the size
suffix, and in fact this is the style we have been pushing people
towards (use the suffix where mandatory or where the size is fixed
anyway, to help catch bugs; use no suffix where the size can vary and is
implied by the operands.)
So, proposed alternate version of your patch attached.
(Also changed cmpxchgl -> cmpxchg.)
-hpa
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index ca7517d..4136200 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -105,7 +105,7 @@ do { \
static inline void __down_read(struct rw_semaphore *sem)
{
asm volatile("# beginning down_read\n\t"
- LOCK_PREFIX " incl (%%eax)\n\t"
+ LOCK_PREFIX " inc%z0 (%1)\n\t"
/* adds 0x00000001, returns the old value */
" jns 1f\n"
" call call_rwsem_down_read_failed\n"
@@ -123,12 +123,12 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
{
__s32 result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
- " movl %0,%1\n\t"
+ " mov %0,%1\n\t"
"1:\n\t"
- " movl %1,%2\n\t"
- " addl %3,%2\n\t"
+ " mov %1,%2\n\t"
+ " add %3,%2\n\t"
" jle 2f\n\t"
- LOCK_PREFIX " cmpxchgl %2,%0\n\t"
+ LOCK_PREFIX " cmpxchg %2,%0\n\t"
" jnz 1b\n\t"
"2:\n\t"
"# ending __down_read_trylock\n\t"
@@ -147,9 +147,9 @@ static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
tmp = RWSEM_ACTIVE_WRITE_BIAS;
asm volatile("# beginning down_write\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* subtract 0x0000ffff, returns the old value */
- " testl %%edx,%%edx\n\t"
+ " test %1,%1\n\t"
/* was the count 0 before? */
" jz 1f\n"
" call call_rwsem_down_write_failed\n"
@@ -185,7 +185,7 @@ static inline void __up_read(struct rw_semaphore *sem)
{
__s32 tmp = -RWSEM_ACTIVE_READ_BIAS;
asm volatile("# beginning __up_read\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* subtracts 1, returns the old value */
" jns 1f\n\t"
" call call_rwsem_wake\n"
@@ -201,18 +201,18 @@ static inline void __up_read(struct rw_semaphore *sem)
*/
static inline void __up_write(struct rw_semaphore *sem)
{
+ unsigned long tmp;
asm volatile("# beginning __up_write\n\t"
- " movl %2,%%edx\n\t"
- LOCK_PREFIX " xaddl %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* tries to transition
0xffff0001 -> 0x00000000 */
" jz 1f\n"
" call call_rwsem_wake\n"
"1:\n\t"
"# ending __up_write\n"
- : "+m" (sem->count)
- : "a" (sem), "i" (-RWSEM_ACTIVE_WRITE_BIAS)
- : "memory", "cc", "edx");
+ : "+m" (sem->count), "=d" (tmp)
+ : "a" (sem), "1" (-RWSEM_ACTIVE_WRITE_BIAS)
+ : "memory", "cc");
}
/*
@@ -221,7 +221,7 @@ static inline void __up_write(struct rw_semaphore *sem)
static inline void __downgrade_write(struct rw_semaphore *sem)
{
asm volatile("# beginning __downgrade_write\n\t"
- LOCK_PREFIX " addl %2,(%%eax)\n\t"
+ LOCK_PREFIX " add%z0 %2,(%1)\n\t"
/* transitions 0xZZZZ0001 -> 0xYYYY0001 */
" jns 1f\n\t"
" call call_rwsem_downgrade_wake\n"
@@ -237,7 +237,7 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
*/
static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
{
- asm volatile(LOCK_PREFIX "addl %1,%0"
+ asm volatile(LOCK_PREFIX "add%z0 %1,%0"
: "+m" (sem->count)
: "ir" (delta));
}
* Re: x86-32: clean up rwsem inline asm statements
2010-01-13 0:48 ` H. Peter Anvin
@ 2010-01-13 0:59 ` Linus Torvalds
2010-01-13 1:24 ` x86: avoid read-cycle on down_read_trylock Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 0:59 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
On Tue, 12 Jan 2010, H. Peter Anvin wrote:
>
> As far as I can tell, very few of these assembly statements actually
> need a size at all -- the very first inc statement is purely to memory,
> and as such it needs a size marker, but almost any operation which is
> register-register or register-memory will simply take its size from the
> register operand. For those, it seems cleaner to simply drop the size
> suffix, and in fact this is the style we have been pushing people
> towards (use the suffix where mandatory or where the size is fixed
> anyway, to help catch bugs; use no suffix where the size can vary and is
> implied by the operands.)
>
> So, proposed alternate version of your patch attached.
Looks good to me.
Linus
* x86: avoid read-cycle on down_read_trylock
2010-01-13 0:59 ` Linus Torvalds
@ 2010-01-13 1:24 ` Linus Torvalds
2010-01-13 1:57 ` x86: clean up rwsem type system Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 1:24 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
We don't want to start the lock sequence with a plain read, since that
will cause the cacheline to be brought in as a shared line initially, only
to need to be converted into an exclusive one immediately afterwards.
So in order to avoid unnecessary bus traffic, just start off assuming
that the lock is unlocked, which is the common case anyway. That way,
the first access to the lock will be the actual locked cycle.
This speeds up the lock ping-pong case, since it now has fewer bus cycles.
The reason down_read_trylock() is so important is that the main rwsem
usage is mmap_sem, and the page fault case - which is the most common case
by far - takes it with a "down_read_trylock()". That, in turn, is because
in case it is locked we want to do the exception table lookup (so that we
get a nice oops rather than a deadlock if we happen to get a page fault
while holding the mmap lock for writing).
So while "trylock" is normally not a very common operation, for rwsems it
ends up being the _normal_ way to get the lock.
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
This is on top of Peter's cleanup of my asm-cleanup patch.
On Hiroyuki-san's load, this trivial change improved his (admittedly
_very_ artificial) page-fault benchmark by about 2%. The profile hit of
down_read_trylock() went from 9.08% down to 7.73%. So the trylock itself
seems to have improved by 15%+ from this.
All numbers above are meaningless, but the point is that the effect of
this cacheline access pattern can be real.
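In portable terms, the change amounts to seeding the cmpxchg loop with the assumed-unlocked value instead of reading the count first. A hypothetical C11 sketch (not the kernel's asm):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RWSEM_UNLOCKED_VALUE   0
#define RWSEM_ACTIVE_READ_BIAS 1

/* Hypothetical sketch of the patched trylock: start from the
 * assumed-unlocked value, so the first access to the lock word is
 * the locked cmpxchg itself rather than a plain read that would pull
 * the line in shared state first. */
static bool down_read_trylock_sketch(atomic_int *count)
{
    int old = RWSEM_UNLOCKED_VALUE;

    for (;;) {
        int sum = old + RWSEM_ACTIVE_READ_BIAS;
        if (sum <= 0)
            return false;          /* write-locked: the "jle 2f" case */
        /* on failure, 'old' is refreshed with the current count */
        if (atomic_compare_exchange_weak(count, &old, sum))
            return true;
    }
}
```

The optimistic guess only costs an extra loop iteration when the lock is actually held by another reader; in the common uncontended case the guess is right on the first try.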
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index 4136200..e9480be 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -123,7 +123,6 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
{
__s32 result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
- " mov %0,%1\n\t"
"1:\n\t"
" mov %1,%2\n\t"
" add %3,%2\n\t"
@@ -133,7 +132,7 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
"2:\n\t"
"# ending __down_read_trylock\n\t"
: "+m" (sem->count), "=&a" (result), "=&r" (tmp)
- : "i" (RWSEM_ACTIVE_READ_BIAS)
+ : "i" (RWSEM_ACTIVE_READ_BIAS), "1" (RWSEM_UNLOCKED_VALUE)
: "memory", "cc");
return result >= 0 ? 1 : 0;
}
* Re: x86-32: clean up rwsem inline asm statements
2010-01-13 0:45 ` Andreas Schwab
@ 2010-01-13 1:26 ` Linus Torvalds
0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 1:26 UTC (permalink / raw)
To: Andreas Schwab
Cc: Ingo Molnar, Peter Anvin, Thomas Gleixner,
Linux Kernel Mailing List
On Wed, 13 Jan 2010, Andreas Schwab wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> > @@ -249,7 +249,7 @@ static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
> > {
> > int tmp = delta;
> >
> > - asm volatile(LOCK_PREFIX "xadd %0,%1"
> > + asm volatile(LOCK_PREFIX "xadd%z0 %0,%1"
> > : "+r" (tmp), "+m" (sem->count)
> > : : "memory");
>
> I think %z0 should be %z1 here.
You're right, but Peter made it all a non-issue anyway.
Thanks for noticing, though, since it implies that somebody read the patch
carefully. It's always a bit scary to change fundamental inline asms,
especially for something like a locking routine where the core _semantics_
are so subtle (and if you get locking wrong, things may still work - you
just end up with some really odd race conditions).
Linus
* x86: clean up rwsem type system
2010-01-13 1:24 ` x86: avoid read-cycle on down_read_trylock Linus Torvalds
@ 2010-01-13 1:57 ` Linus Torvalds
2010-01-13 2:16 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 1:57 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
The fast version of the rwsems (the code that uses xadd) has
traditionally only worked on x86-32, and as a result it mixes different
kinds of types wildly - they just all happen to be 32-bit. We have
"long", we have "__s32", and we have "int".
To make it work on x86-64, the types suddenly matter a lot more. It can
be either a 32-bit or 64-bit signed type, and both work (with the caveat
that a 32-bit counter only has 15 effective bits of write
counter, so it's limited to 32767 users). But whatever type you
choose, it needs to be used consistently.
This makes a new 'rwsem_count_t' that is a 32-bit signed type. For a
64-bit type, you'd need to also update the BIAS values.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
Ok, this is the final cleanup. With this, the <asm/rwsem.h> header file
should be usable on x86-64 too. It needs some more support functions (the
actual rwsem asm->C trampoline functions that save/restore all the
clobbered registers etc), but that's a separate thing.
Similarly, this will need further changes for actual 64-bit counters. The
32767-thread limit is fine on x86-32 (we're thread-limited by lowmem
anyway), but likely not on x86-64.
Comments?
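For reference, the bias layout that the counter type has to match packs the active count into the low 16 bits and the waiter bookkeeping into the high 16; a quick check of the arithmetic, with the values taken from the x86 header:

```c
#include <stdint.h>

/* Bias values from the x86 rwsem header: active count in the low 16
 * bits, waiter bookkeeping in the high 16.  With a signed 32-bit
 * counter the sign-based fast-path tests leave only 15 usable bits of
 * active count, hence the 32767-user limit mentioned above. */
#define RWSEM_UNLOCKED_VALUE    0x00000000
#define RWSEM_ACTIVE_BIAS       0x00000001
#define RWSEM_ACTIVE_MASK       0x0000ffff
#define RWSEM_WAITING_BIAS      (-0x00010000)
#define RWSEM_ACTIVE_READ_BIAS  RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)

/* Count value after taking the write lock from unlocked: this is the
 * 0xffff0001 value the asm comments transition to and from. */
static int32_t write_locked_count(void)
{
    return (int32_t)RWSEM_UNLOCKED_VALUE + RWSEM_ACTIVE_WRITE_BIAS;
}
```

A 64-bit counter would widen both fields, e.g. a 32-bit active count and a 32-bit waiter field, with the bias constants scaled to match.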
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index e9480be..705265f 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -55,6 +55,9 @@ extern asmregparm struct rw_semaphore *
/*
* the semaphore definition
+ *
+ * The bias values and the counter type needs to be extended to 64 bits
+ * if we want to have more than 32767 potential readers/writers
*/
#define RWSEM_UNLOCKED_VALUE 0x00000000
@@ -64,8 +67,10 @@ extern asmregparm struct rw_semaphore *
#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+typedef signed int rwsem_count_t;
+
struct rw_semaphore {
- signed long count;
+ rwsem_count_t count;
spinlock_t wait_lock;
struct list_head wait_list;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -121,7 +126,7 @@ static inline void __down_read(struct rw_semaphore *sem)
*/
static inline int __down_read_trylock(struct rw_semaphore *sem)
{
- __s32 result, tmp;
+ rwsem_count_t result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
"1:\n\t"
" mov %1,%2\n\t"
@@ -142,7 +147,7 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
*/
static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
{
- int tmp;
+ rwsem_count_t tmp;
tmp = RWSEM_ACTIVE_WRITE_BIAS;
asm volatile("# beginning down_write\n\t"
@@ -169,9 +174,9 @@ static inline void __down_write(struct rw_semaphore *sem)
*/
static inline int __down_write_trylock(struct rw_semaphore *sem)
{
- signed long ret = cmpxchg(&sem->count,
- RWSEM_UNLOCKED_VALUE,
- RWSEM_ACTIVE_WRITE_BIAS);
+ rwsem_count_t ret = cmpxchg(&sem->count,
+ RWSEM_UNLOCKED_VALUE,
+ RWSEM_ACTIVE_WRITE_BIAS);
if (ret == RWSEM_UNLOCKED_VALUE)
return 1;
return 0;
@@ -182,7 +187,7 @@ static inline int __down_write_trylock(struct rw_semaphore *sem)
*/
static inline void __up_read(struct rw_semaphore *sem)
{
- __s32 tmp = -RWSEM_ACTIVE_READ_BIAS;
+ rwsem_count_t tmp = -RWSEM_ACTIVE_READ_BIAS;
asm volatile("# beginning __up_read\n\t"
LOCK_PREFIX " xadd %1,(%2)\n\t"
/* subtracts 1, returns the old value */
@@ -200,7 +205,7 @@ static inline void __up_read(struct rw_semaphore *sem)
*/
static inline void __up_write(struct rw_semaphore *sem)
{
- unsigned long tmp;
+ rwsem_count_t tmp;
asm volatile("# beginning __up_write\n\t"
LOCK_PREFIX " xadd %1,(%2)\n\t"
/* tries to transition
@@ -244,9 +249,9 @@ static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
/*
* implement exchange and add functionality
*/
-static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
+static inline rwsem_count_t rwsem_atomic_update(int delta, struct rw_semaphore *sem)
{
- int tmp = delta;
+ rwsem_count_t tmp = delta;
asm volatile(LOCK_PREFIX "xadd %0,%1"
: "+r" (tmp), "+m" (sem->count)
* Re: x86: clean up rwsem type system
2010-01-13 1:57 ` x86: clean up rwsem type system Linus Torvalds
@ 2010-01-13 2:16 ` Linus Torvalds
2010-01-14 7:03 ` H. Peter Anvin
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-01-13 2:16 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
On Tue, 12 Jan 2010, Linus Torvalds wrote:
>
> Ok, this is the final cleanup. With this, the <asm/rwsem.h> header file
> should be usable on x86-64 too. It needs some more support functions (the
> actual rwsem asm->C trampoline functions that save/restore all the
> clobbered registers etc), but that's a separate thing.
>
> Similarly, this will need further changes for actual 64-bit counters. The
> 32767-thread limit is fine on x86-32 (we're thread-limited by lowmem
> anyway), but likely not on x86-64.
In case anybody wants to test, the final piece is appended.
Again, note the 32767-thread limit here. So this really does need that
whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
extension on top of it, but that is conceptually a totally independent
issue.
NOT TESTED! The original patch that this all was based on was tested by
KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
cleaned-up series, so caveat emptor..
Also note that it _may_ be a good idea to mark some more registers
clobbered on x86-64 in the inline asms instead of saving/restoring them.
They are inline functions, but they are only used in places where there
are not a lot of live registers _anyway_, so doing for example the
clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
worse, and would make the slow-path code smaller.
(Not that the slow-path really matters to that degree. Saving a few
unnecessary registers is the _least_ of our problems when we hit the slow
path. The instruction/cycle counting really only matters in the fast
path).
Linus
---
x86-64: support native xadd rwsem implementation
This one is much faster than the spinlock based fallback rwsem code,
with certain artificial benchmarks having shown 300%+ improvement on
threaded page faults etc.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/x86/Kconfig.cpu | 2 +-
arch/x86/lib/Makefile | 1 +
arch/x86/lib/rwsem_64.S | 81 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 83 insertions(+), 1 deletions(-)
diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index f20ddf8..a198293 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -319,7 +319,7 @@ config X86_L1_CACHE_SHIFT
config X86_XADD
def_bool y
- depends on X86_32 && !M386
+ depends on X86_64 || !M386
config X86_PPRO_FENCE
bool "PentiumPro memory ordering errata workaround"
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index cffd754..c802451 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -39,4 +39,5 @@ else
lib-y += thunk_64.o clear_page_64.o copy_page_64.o
lib-y += memmove_64.o memset_64.o
lib-y += copy_user_64.o rwlock_64.o copy_user_nocache_64.o
+ lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem_64.o
endif
diff --git a/arch/x86/lib/rwsem_64.S b/arch/x86/lib/rwsem_64.S
new file mode 100644
index 0000000..15acecf
--- /dev/null
+++ b/arch/x86/lib/rwsem_64.S
@@ -0,0 +1,81 @@
+/*
+ * x86-64 rwsem wrappers
+ *
+ * This interfaces the inline asm code to the slow-path
+ * C routines. We need to save the call-clobbered regs
+ * that the asm does not mark as clobbered, and move the
+ * argument from %rax to %rdi.
+ *
+ * NOTE! We don't need to save %rax, because the functions
+ * will always return the semaphore pointer in %rax (which
+ * is also the input argument to these helpers)
+ *
+ * The following can clobber %rdx because the asm clobbers it:
+ * call_rwsem_down_write_failed
+ * call_rwsem_wake
+ * but %rdi, %rsi, %rcx, %r8-r11 always need saving.
+ */
+
+#include <linux/linkage.h>
+#include <asm/rwlock.h>
+#include <asm/alternative-asm.h>
+#include <asm/frame.h>
+#include <asm/dwarf2.h>
+
+#define save_common_regs \
+ pushq %rdi; \
+ pushq %rsi; \
+ pushq %rcx; \
+ pushq %r8; \
+ pushq %r9; \
+ pushq %r10; \
+ pushq %r11
+
+#define restore_common_regs \
+ popq %r11; \
+ popq %r10; \
+ popq %r9; \
+ popq %r8; \
+ popq %rcx; \
+ popq %rsi; \
+ popq %rdi
+
+/* Fix up special calling conventions */
+ENTRY(call_rwsem_down_read_failed)
+ save_common_regs
+ pushq %rdx
+ movq %rax,%rdi
+ call rwsem_down_read_failed
+ popq %rdx
+ restore_common_regs
+ ret
+ ENDPROC(call_rwsem_down_read_failed)
+
+ENTRY(call_rwsem_down_write_failed)
+ save_common_regs
+ movq %rax,%rdi
+ call rwsem_down_write_failed
+ restore_common_regs
+ ret
+ ENDPROC(call_rwsem_down_write_failed)
+
+ENTRY(call_rwsem_wake)
+ decw %dx /* do nothing if still outstanding active readers */
+ jnz 1f
+ save_common_regs
+ movq %rax,%rdi
+ call rwsem_wake
+ restore_common_regs
+1: ret
+ ENDPROC(call_rwsem_wake)
+
+/* Fix up special calling conventions */
+ENTRY(call_rwsem_downgrade_wake)
+ save_common_regs
+ pushq %rdx
+ movq %rax,%rdi
+ call rwsem_downgrade_wake
+ popq %rdx
+ restore_common_regs
+ ret
+ ENDPROC(call_rwsem_downgrade_wake)
* [tip:x86/asm] x86-32: clean up rwsem inline asm statements
2010-01-13 0:21 x86-32: clean up rwsem inline asm statements Linus Torvalds
2010-01-13 0:45 ` Andreas Schwab
2010-01-13 0:48 ` H. Peter Anvin
@ 2010-01-13 5:03 ` tip-bot for Linus Torvalds
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Linus Torvalds @ 2010-01-13 5:03 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, torvalds, tglx
Commit-ID: 59c33fa7791e9948ba467c2b83e307a0d087ab49
Gitweb: http://git.kernel.org/tip/59c33fa7791e9948ba467c2b83e307a0d087ab49
Author: Linus Torvalds <torvalds@linux-foundation.org>
AuthorDate: Tue, 12 Jan 2010 16:21:09 -0800
Committer: H. Peter Anvin <hpa@zytor.com>
CommitDate: Tue, 12 Jan 2010 20:43:04 -0800
x86-32: clean up rwsem inline asm statements
This makes gcc use the right register names and instruction operand sizes
automatically for the rwsem inline asm statements.
So instead of using "(%%eax)" to specify the memory address that is the
semaphore, we use "(%1)" or similar. And instead of forcing the operation
to always be 32-bit, we use "%z0", taking the size from the actual
semaphore data structure itself.
This doesn't actually matter on x86-32, but if we want to use the same
inline asm for x86-64, we'll need to have the compiler generate the proper
64-bit names for the registers (%rax instead of %eax), and if we want to
use a 64-bit counter too (in order to avoid the 15-bit limit on the
write counter that limits concurrent users to 32767 threads), we'll need
to be able to generate instructions with "q" accesses rather than "l".
Since this header currently isn't enabled on x86-64, none of that matters,
but we do want to use the xadd version of the semaphores rather than have
to take spinlocks to implement an rwsem. The mm->mmap_sem can be heavily
contended when lots of threads are all taking page faults, and the
spinlock-based fallback rwsem code performs abysmally in that case.
[ hpa: modified the patch to skip size suffixes entirely when they are
redundant due to register operands. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
arch/x86/include/asm/rwsem.h | 30 +++++++++++++++---------------
1 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index ca7517d..4136200 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -105,7 +105,7 @@ do { \
static inline void __down_read(struct rw_semaphore *sem)
{
asm volatile("# beginning down_read\n\t"
- LOCK_PREFIX " incl (%%eax)\n\t"
+ LOCK_PREFIX " inc%z0 (%1)\n\t"
/* adds 0x00000001, returns the old value */
" jns 1f\n"
" call call_rwsem_down_read_failed\n"
@@ -123,12 +123,12 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
{
__s32 result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
- " movl %0,%1\n\t"
+ " mov %0,%1\n\t"
"1:\n\t"
- " movl %1,%2\n\t"
- " addl %3,%2\n\t"
+ " mov %1,%2\n\t"
+ " add %3,%2\n\t"
" jle 2f\n\t"
- LOCK_PREFIX " cmpxchgl %2,%0\n\t"
+ LOCK_PREFIX " cmpxchg %2,%0\n\t"
" jnz 1b\n\t"
"2:\n\t"
"# ending __down_read_trylock\n\t"
@@ -147,9 +147,9 @@ static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
tmp = RWSEM_ACTIVE_WRITE_BIAS;
asm volatile("# beginning down_write\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* subtract 0x0000ffff, returns the old value */
- " testl %%edx,%%edx\n\t"
+ " test %1,%1\n\t"
/* was the count 0 before? */
" jz 1f\n"
" call call_rwsem_down_write_failed\n"
@@ -185,7 +185,7 @@ static inline void __up_read(struct rw_semaphore *sem)
{
__s32 tmp = -RWSEM_ACTIVE_READ_BIAS;
asm volatile("# beginning __up_read\n\t"
- LOCK_PREFIX " xadd %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* subtracts 1, returns the old value */
" jns 1f\n\t"
" call call_rwsem_wake\n"
@@ -201,18 +201,18 @@ static inline void __up_read(struct rw_semaphore *sem)
*/
static inline void __up_write(struct rw_semaphore *sem)
{
+ unsigned long tmp;
asm volatile("# beginning __up_write\n\t"
- " movl %2,%%edx\n\t"
- LOCK_PREFIX " xaddl %%edx,(%%eax)\n\t"
+ LOCK_PREFIX " xadd %1,(%2)\n\t"
/* tries to transition
0xffff0001 -> 0x00000000 */
" jz 1f\n"
" call call_rwsem_wake\n"
"1:\n\t"
"# ending __up_write\n"
- : "+m" (sem->count)
- : "a" (sem), "i" (-RWSEM_ACTIVE_WRITE_BIAS)
- : "memory", "cc", "edx");
+ : "+m" (sem->count), "=d" (tmp)
+ : "a" (sem), "1" (-RWSEM_ACTIVE_WRITE_BIAS)
+ : "memory", "cc");
}
/*
@@ -221,7 +221,7 @@ static inline void __up_write(struct rw_semaphore *sem)
static inline void __downgrade_write(struct rw_semaphore *sem)
{
asm volatile("# beginning __downgrade_write\n\t"
- LOCK_PREFIX " addl %2,(%%eax)\n\t"
+ LOCK_PREFIX " add%z0 %2,(%1)\n\t"
/* transitions 0xZZZZ0001 -> 0xYYYY0001 */
" jns 1f\n\t"
" call call_rwsem_downgrade_wake\n"
@@ -237,7 +237,7 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
*/
static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
{
- asm volatile(LOCK_PREFIX "addl %1,%0"
+ asm volatile(LOCK_PREFIX "add%z0 %1,%0"
: "+m" (sem->count)
: "ir" (delta));
}
* Re: x86: clean up rwsem type system
2010-01-13 2:16 ` Linus Torvalds
@ 2010-01-14 7:03 ` H. Peter Anvin
2010-01-17 6:05 ` Ingo Molnar
0 siblings, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2010-01-14 7:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List
On 01/12/2010 06:16 PM, Linus Torvalds wrote:
>
> In case anybody wants to test, the final piece is appended.
>
> Again, note the 32767-thread limit here. So this really does need that
> whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
> extension on top of it, but that is conceptually a totally independent
> issue.
>
> NOT TESTED! The original patch that this all was based on was tested by
> KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
> cleaned-up series, so caveat emptor..
>
> Also note that it _may_ be a good idea to mark some more registers
> clobbered on x86-64 in the inline asms instead of saving/restoring them.
> They are inline functions, but they are only used in places where there
> are not a lot of live registers _anyway_, so doing for example the
> clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
> worse, and would make the slow-path code smaller.
>
Hi Linus,
I have put these into a separate topic branch in the tip tree, which
should get them some test coverage. I will look at 64-bit counters to
support 2^31 threads hopefully later this week, unless you prefer to do
it yourself.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: x86: clean up rwsem type system
2010-01-14 7:03 ` H. Peter Anvin
@ 2010-01-17 6:05 ` Ingo Molnar
2010-01-17 18:24 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2010-01-17 6:05 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Linus Torvalds, Thomas Gleixner, Linux Kernel Mailing List
* H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/12/2010 06:16 PM, Linus Torvalds wrote:
> >
> > In case anybody wants to test, the final piece is appended.
> >
> > Again, note the 32767-thread limit here. So this really does need that
> > whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
> > extension on top of it, but that is conceptually a totally independent
> > issue.
> >
> > NOT TESTED! The original patch that this all was based on was tested by
> > KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
> > cleaned-up series, so caveat emptor..
> >
> > Also note that it _may_ be a good idea to mark some more registers
> > clobbered on x86-64 in the inline asms instead of saving/restoring them.
> > They are inline functions, but they are only used in places where there
> > are not a lot of live registers _anyway_, so doing for example the
> > clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
> > worse, and would make the slow-path code smaller.
> >
>
> Hi Linus,
>
> I have put these into a separate topic branch in the tip tree, which should
> get them some test coverage. I will look at 64-bit counters to support 2^31
> threads hopefully later this week, unless you prefer to do it yourself.
FYI, -tip testing found that these changes break the UML build:
kernel/built-in.o: In function `__up_read':
/home/mingo/tip/arch/x86/include/asm/rwsem.h:192: undefined reference to `call_rwsem_wake'
kernel/built-in.o: In function `__up_write':
/home/mingo/tip/arch/x86/include/asm/rwsem.h:210: undefined reference to `call_rwsem_wake'
kernel/built-in.o: In function `__downgrade_write':
/home/mingo/tip/arch/x86/include/asm/rwsem.h:228: undefined reference to `call_rwsem_downgrade_wake'
kernel/built-in.o: In function `__down_read':
/home/mingo/tip/arch/x86/include/asm/rwsem.h:112: undefined reference to `call_rwsem_down_read_failed'
kernel/built-in.o: In function `__down_write_nested':
/home/mingo/tip/arch/x86/include/asm/rwsem.h:154: undefined reference to `call_rwsem_down_write_failed'
collect2: ld returned 1 exit status
Ingo
* Re: x86: clean up rwsem type system
2010-01-17 6:05 ` Ingo Molnar
@ 2010-01-17 18:24 ` Linus Torvalds
2010-01-21 17:14 ` Ingo Molnar
0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2010-01-17 18:24 UTC (permalink / raw)
To: Ingo Molnar; +Cc: H. Peter Anvin, Thomas Gleixner, Linux Kernel Mailing List
On Sun, 17 Jan 2010, Ingo Molnar wrote:
>
> FYI, -tip testing found that these changes break the UML build:
>
> kernel/built-in.o: In function `__up_read':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:192: undefined reference to `call_rwsem_wake'
> kernel/built-in.o: In function `__up_write':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:210: undefined reference to `call_rwsem_wake'
> kernel/built-in.o: In function `__downgrade_write':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:228: undefined reference to `call_rwsem_downgrade_wake'
> kernel/built-in.o: In function `__down_read':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:112: undefined reference to `call_rwsem_down_read_failed'
> kernel/built-in.o: In function `__down_write_nested':
> /home/mingo/tip/arch/x86/include/asm/rwsem.h:154: undefined reference to `call_rwsem_down_write_failed'
> collect2: ld returned 1 exit status
I think something like this should fix it.
Linus
---
arch/um/sys-x86_64/Makefile | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/um/sys-x86_64/Makefile b/arch/um/sys-x86_64/Makefile
index 2201e9c..c1ea9eb 100644
--- a/arch/um/sys-x86_64/Makefile
+++ b/arch/um/sys-x86_64/Makefile
@@ -8,7 +8,8 @@ obj-y = bug.o bugs.o delay.o fault.o ldt.o mem.o ptrace.o ptrace_user.o \
setjmp.o signal.o stub.o stub_segv.o syscalls.o syscall_table.o \
sysrq.o ksyms.o tls.o
-subarch-obj-y = lib/csum-partial_64.o lib/memcpy_64.o lib/thunk_64.o
+subarch-obj-y = lib/csum-partial_64.o lib/memcpy_64.o lib/thunk_64.o \
+ lib/rwsem_64.o
subarch-obj-$(CONFIG_MODULES) += kernel/module.o
ldt-y = ../sys-i386/ldt.o
* Re: x86: clean up rwsem type system
2010-01-17 18:24 ` Linus Torvalds
@ 2010-01-21 17:14 ` Ingo Molnar
0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2010-01-21 17:14 UTC (permalink / raw)
To: Linus Torvalds; +Cc: H. Peter Anvin, Thomas Gleixner, Linux Kernel Mailing List
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Sun, 17 Jan 2010, Ingo Molnar wrote:
> >
> > FYI, -tip testing found that these changes break the UML build:
> >
> > kernel/built-in.o: In function `__up_read':
> > /home/mingo/tip/arch/x86/include/asm/rwsem.h:192: undefined reference to `call_rwsem_wake'
> > kernel/built-in.o: In function `__up_write':
> > /home/mingo/tip/arch/x86/include/asm/rwsem.h:210: undefined reference to `call_rwsem_wake'
> > kernel/built-in.o: In function `__downgrade_write':
> > /home/mingo/tip/arch/x86/include/asm/rwsem.h:228: undefined reference to `call_rwsem_downgrade_wake'
> > kernel/built-in.o: In function `__down_read':
> > /home/mingo/tip/arch/x86/include/asm/rwsem.h:112: undefined reference to `call_rwsem_down_read_failed'
> > kernel/built-in.o: In function `__down_write_nested':
> > /home/mingo/tip/arch/x86/include/asm/rwsem.h:154: undefined reference to `call_rwsem_down_write_failed'
> > collect2: ld returned 1 exit status
>
> I think something like this should fix it.
That did the trick - thanks!
Ingo
Thread overview: 13+ messages
2010-01-13 0:21 x86-32: clean up rwsem inline asm statements Linus Torvalds
2010-01-13 0:45 ` Andreas Schwab
2010-01-13 1:26 ` Linus Torvalds
2010-01-13 0:48 ` H. Peter Anvin
2010-01-13 0:59 ` Linus Torvalds
2010-01-13 1:24 ` x86: avoid read-cycle on down_read_trylock Linus Torvalds
2010-01-13 1:57 ` x86: clean up rwsem type system Linus Torvalds
2010-01-13 2:16 ` Linus Torvalds
2010-01-14 7:03 ` H. Peter Anvin
2010-01-17 6:05 ` Ingo Molnar
2010-01-17 18:24 ` Linus Torvalds
2010-01-21 17:14 ` Ingo Molnar
2010-01-13 5:03 ` [tip:x86/asm] x86-32: clean up rwsem inline asm statements tip-bot for Linus Torvalds