From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: linux-arch@vger.kernel.org, linux-xtensa@linux-xtensa.org,
Davidlohr Bueso <dave@stgolabs.net>,
linux-ia64@vger.kernel.org, Tim Chen <tim.c.chen@linux.intel.com>,
Arnd Bergmann <arnd@arndb.de>,
linux-sh@vger.kernel.org, linux-hexagon@vger.kernel.org,
x86@kernel.org, Will Deacon <will.deacon@arm.com>,
linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
linux-alpha@vger.kernel.org, sparclinux@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
linuxppc-dev@lists.ozlabs.org,
Andrew Morton <akpm@linux-foundation.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock()
Date: Tue, 12 Feb 2019 14:24:04 +0100
Message-ID: <20190212132404.GI32494@hirez.programming.kicks-ass.net>
In-Reply-To: <1549913486-16799-3-git-send-email-longman@redhat.com>

On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
> Modify __down_read_trylock() to make it generate slightly better code
> (smaller and maybe a tiny bit faster).
>
> Before this patch, down_read_trylock:
>
> 0x0000000000000000 <+0>: callq 0x5 <down_read_trylock+5>
> 0x0000000000000005 <+5>: jmp 0x18 <down_read_trylock+24>
> 0x0000000000000007 <+7>: lea 0x1(%rdx),%rcx
> 0x000000000000000b <+11>: mov %rdx,%rax
> 0x000000000000000e <+14>: lock cmpxchg %rcx,(%rdi)
> 0x0000000000000013 <+19>: cmp %rax,%rdx
> 0x0000000000000016 <+22>: je 0x23 <down_read_trylock+35>
> 0x0000000000000018 <+24>: mov (%rdi),%rdx
> 0x000000000000001b <+27>: test %rdx,%rdx
> 0x000000000000001e <+30>: jns 0x7 <down_read_trylock+7>
> 0x0000000000000020 <+32>: xor %eax,%eax
> 0x0000000000000022 <+34>: retq
> 0x0000000000000023 <+35>: mov %gs:0x0,%rax
> 0x000000000000002c <+44>: or $0x3,%rax
> 0x0000000000000030 <+48>: mov %rax,0x20(%rdi)
> 0x0000000000000034 <+52>: mov $0x1,%eax
> 0x0000000000000039 <+57>: retq
>
> After patch, down_read_trylock:
>
> 0x0000000000000000 <+0>: callq 0x5 <down_read_trylock+5>
> 0x0000000000000005 <+5>: mov (%rdi),%rax
> 0x0000000000000008 <+8>: test %rax,%rax
> 0x000000000000000b <+11>: js 0x2f <down_read_trylock+47>
> 0x000000000000000d <+13>: lea 0x1(%rax),%rdx
> 0x0000000000000011 <+17>: lock cmpxchg %rdx,(%rdi)
> 0x0000000000000016 <+22>: jne 0x8 <down_read_trylock+8>
> 0x0000000000000018 <+24>: mov %gs:0x0,%rax
> 0x0000000000000021 <+33>: or $0x3,%rax
> 0x0000000000000025 <+37>: mov %rax,0x20(%rdi)
> 0x0000000000000029 <+41>: mov $0x1,%eax
> 0x000000000000002e <+46>: retq
> 0x000000000000002f <+47>: xor %eax,%eax
> 0x0000000000000031 <+49>: retq
>
> Using a rwsem microbenchmark, the down_read_trylock() rates on an
> x86-64 system before and after the patch were:
>
>                   Before Patch    After Patch
>    # of Threads      rlock           rlock
>    ------------      -----           -----
>         1            27,787          28,259
>         2             8,359           9,234

From 1/2:

        1            29,201   30,143   29,458   28,615   30,172   29,201
        2             6,807   13,299    1,171    7,725   15,025    1,804

>
> On an ARM64 system, the performance results were:
>
>                   Before Patch    After Patch
>    # of Threads      rlock           rlock
>    ------------      -----           -----
>         1            24,155          25,000
>         2             6,820           8,699
>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/locking/rwsem.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/locking/rwsem.h b/kernel/locking/rwsem.h
> index 067e265..028bc33 100644
> --- a/kernel/locking/rwsem.h
> +++ b/kernel/locking/rwsem.h
> @@ -175,11 +175,11 @@ static inline int __down_read_killable(struct rw_semaphore *sem)
>
>  static inline int __down_read_trylock(struct rw_semaphore *sem)
>  {
> -	long tmp;
> +	long tmp = atomic_long_read(&sem->count);
>
> -	while ((tmp = atomic_long_read(&sem->count)) >= 0) {
> -		if (tmp == atomic_long_cmpxchg_acquire(&sem->count, tmp,
> -				   tmp + RWSEM_ACTIVE_READ_BIAS)) {
> +	while (tmp >= 0) {
> +		if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp,
> +					tmp + RWSEM_ACTIVE_READ_BIAS)) {
>  			return 1;
>  		}
>  	}
> --
> 1.8.3.1
>
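
For readers comparing the two loop shapes in the quoted diff, the following is a minimal userspace sketch in C11, not the kernel code. Everything named here is an assumption made for illustration: READ_BIAS (set to 1 to match the "lea 0x1" in the listings), cmpxchg_acquire(), read_trylock_old() and read_trylock_new() are hypothetical names, and the C11 atomic_compare_exchange_strong_explicit() stands in for the kernel's atomic_long_try_cmpxchg_acquire(), since both write the observed value back into the expected-value variable on failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for RWSEM_ACTIVE_READ_BIAS. */
#define READ_BIAS 1L

/* Value-returning compare-and-exchange, emulating the shape of the
 * kernel's atomic_long_cmpxchg_acquire(): returns what was found in *v. */
long cmpxchg_acquire(atomic_long *v, long old, long new_val)
{
	/* On success 'old' is untouched; on failure it is overwritten
	 * with the observed value. Either way it is the value seen. */
	atomic_compare_exchange_strong_explicit(v, &old, new_val,
			memory_order_acquire, memory_order_relaxed);
	return old;
}

/* Old shape: reload the counter each iteration and compare the
 * returned value against the expected one by hand. */
bool read_trylock_old(atomic_long *count)
{
	long tmp;

	assert(count);
	while ((tmp = atomic_load_explicit(count, memory_order_relaxed)) >= 0) {
		if (tmp == cmpxchg_acquire(count, tmp, tmp + READ_BIAS))
			return true;
	}
	return false;
}

/* New shape: load once; a failed exchange refreshes 'tmp' itself, so
 * the loop only has to re-test the sign before retrying. */
bool read_trylock_new(atomic_long *count)
{
	long tmp;

	assert(count);
	tmp = atomic_load_explicit(count, memory_order_relaxed);
	while (tmp >= 0) {
		if (atomic_compare_exchange_strong_explicit(count, &tmp,
				tmp + READ_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}
```

In the new shape the success or failure of the exchange is the loop condition, which is consistent with the "after" listing branching on the flags left by "lock cmpxchg" (the jne at +22) instead of doing a separate mov and cmp as in the "before" listing.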
Thread overview: 47+ messages
2019-02-11 19:31 [PATCH v2 0/2] locking/rwsem: Remove arch specific rwsem files Waiman Long
2019-02-11 19:31 ` [PATCH v2 1/2] " Waiman Long
2019-02-11 19:31 ` [PATCH v2 2/2] locking/rwsem: Optimize down_read_trylock() Waiman Long
2019-02-12 13:24 ` Peter Zijlstra [this message]
2019-02-12 13:25 ` Peter Zijlstra
2019-02-12 18:36 ` Waiman Long
2019-02-12 18:38 ` Waiman Long
2019-02-12 19:58 ` Linus Torvalds
2019-02-12 21:21 ` Waiman Long
2019-02-13 7:45 ` Ingo Molnar
2019-02-13 15:33 ` Waiman Long