All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
@ 2024-10-01 11:45 Uros Bizjak
  2024-10-02 15:47 ` Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Uros Bizjak @ 2024-10-01 11:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Uros Bizjak, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Waiman Long, Boqun Feng

Replace this pattern in osq_unlock():

    atomic_cmpxchg(*ptr, old, new) == old

... with the simpler and faster:

    atomic_try_cmpxchg(*ptr, &old, new)

The x86 CMPXCHG instruction returns success in the ZF flag,
so this change saves a compare after the CMPXCHG.  The code
in the fast path of osq_unlock() improves from:

 11b:	31 c9                	xor    %ecx,%ecx
 11d:	8d 50 01             	lea    0x1(%rax),%edx
 120:	89 d0                	mov    %edx,%eax
 122:	f0 0f b1 0f          	lock cmpxchg %ecx,(%rdi)
 126:	39 c2                	cmp    %eax,%edx
 128:	75 05                	jne    12f <...>

to:

 12b:	31 d2                	xor    %edx,%edx
 12d:	83 c0 01             	add    $0x1,%eax
 130:	f0 0f b1 17          	lock cmpxchg %edx,(%rdi)
 134:	75 05                	jne    13b <...>

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/osq_lock.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 75a6f6133866..b4233dc2c2b0 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
 	/*
 	 * Fast path for the uncontended case.
 	 */
-	if (likely(atomic_cmpxchg_release(&lock->tail, curr,
-					  OSQ_UNLOCKED_VAL) == curr))
+	if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
 		return;
 
 	/*
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
  2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
@ 2024-10-02 15:47 ` Waiman Long
  2024-10-25  8:02 ` Peter Zijlstra
  2024-10-26  7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  2 siblings, 0 replies; 4+ messages in thread
From: Waiman Long @ 2024-10-02 15:47 UTC (permalink / raw)
  To: Uros Bizjak, linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng

On 10/1/24 07:45, Uros Bizjak wrote:
> Replace this pattern in osq_unlock():
>
>      atomic_cmpxchg(*ptr, old, new) == old
>
> ... with the simpler and faster:
>
>      atomic_try_cmpxchg(*ptr, &old, new)
>
> The x86 CMPXCHG instruction returns success in the ZF flag,
> so this change saves a compare after the CMPXCHG.  The code
> in the fast path of osq_unlock() improves from:
>
>   11b:	31 c9                	xor    %ecx,%ecx
>   11d:	8d 50 01             	lea    0x1(%rax),%edx
>   120:	89 d0                	mov    %edx,%eax
>   122:	f0 0f b1 0f          	lock cmpxchg %ecx,(%rdi)
>   126:	39 c2                	cmp    %eax,%edx
>   128:	75 05                	jne    12f <...>
>
> to:
>
>   12b:	31 d2                	xor    %edx,%edx
>   12d:	83 c0 01             	add    $0x1,%eax
>   130:	f0 0f b1 17          	lock cmpxchg %edx,(%rdi)
>   134:	75 05                	jne    13b <...>
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> ---
>   kernel/locking/osq_lock.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 75a6f6133866..b4233dc2c2b0 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>   	/*
>   	 * Fast path for the uncontended case.
>   	 */
> -	if (likely(atomic_cmpxchg_release(&lock->tail, curr,
> -					  OSQ_UNLOCKED_VAL) == curr))
> +	if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
>   		return;
>   
>   	/*

LGTM

Acked-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
  2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
  2024-10-02 15:47 ` Waiman Long
@ 2024-10-25  8:02 ` Peter Zijlstra
  2024-10-26  7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2024-10-25  8:02 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-kernel, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng

On Tue, Oct 01, 2024 at 01:45:57PM +0200, Uros Bizjak wrote:
> Replace this pattern in osq_unlock():
> 
>     atomic_cmpxchg(*ptr, old, new) == old
> 
> ... with the simpler and faster:
> 
>     atomic_try_cmpxchg(*ptr, &old, new)
> 
> The x86 CMPXCHG instruction returns success in the ZF flag,
> so this change saves a compare after the CMPXCHG.  The code
> in the fast path of osq_unlock() improves from:
> 
>  11b:	31 c9                	xor    %ecx,%ecx
>  11d:	8d 50 01             	lea    0x1(%rax),%edx
>  120:	89 d0                	mov    %edx,%eax
>  122:	f0 0f b1 0f          	lock cmpxchg %ecx,(%rdi)
>  126:	39 c2                	cmp    %eax,%edx
>  128:	75 05                	jne    12f <...>
> 
> to:
> 
>  12b:	31 d2                	xor    %edx,%edx
>  12d:	83 c0 01             	add    $0x1,%eax
>  130:	f0 0f b1 17          	lock cmpxchg %edx,(%rdi)
>  134:	75 05                	jne    13b <...>
> 
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>

Thanks!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [tip: locking/core] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
  2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
  2024-10-02 15:47 ` Waiman Long
  2024-10-25  8:02 ` Peter Zijlstra
@ 2024-10-26  7:35 ` tip-bot2 for Uros Bizjak
  2 siblings, 0 replies; 4+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2024-10-26  7:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Uros Bizjak, Peter Zijlstra (Intel), Waiman Long, x86,
	linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     0d75e0c420e52b4057a2de274054a5274209a2ae
Gitweb:        https://git.kernel.org/tip/0d75e0c420e52b4057a2de274054a5274209a2ae
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Tue, 01 Oct 2024 13:45:57 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 25 Oct 2024 10:01:50 +02:00

locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()

Replace this pattern in osq_unlock():

    atomic_cmpxchg(*ptr, old, new) == old

... with the simpler and faster:

    atomic_try_cmpxchg(*ptr, &old, new)

The x86 CMPXCHG instruction returns success in the ZF flag,
so this change saves a compare after the CMPXCHG.  The code
in the fast path of osq_unlock() improves from:

 11b:	31 c9                	xor    %ecx,%ecx
 11d:	8d 50 01             	lea    0x1(%rax),%edx
 120:	89 d0                	mov    %edx,%eax
 122:	f0 0f b1 0f          	lock cmpxchg %ecx,(%rdi)
 126:	39 c2                	cmp    %eax,%edx
 128:	75 05                	jne    12f <...>

to:

 12b:	31 d2                	xor    %edx,%edx
 12d:	83 c0 01             	add    $0x1,%eax
 130:	f0 0f b1 17          	lock cmpxchg %edx,(%rdi)
 134:	75 05                	jne    13b <...>

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20241001114606.820277-1-ubizjak@gmail.com
---
 kernel/locking/osq_lock.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 75a6f61..b4233dc 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
 	/*
 	 * Fast path for the uncontended case.
 	 */
-	if (likely(atomic_cmpxchg_release(&lock->tail, curr,
-					  OSQ_UNLOCKED_VAL) == curr))
+	if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
 		return;
 
 	/*

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-10-26  7:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
2024-10-02 15:47 ` Waiman Long
2024-10-25  8:02 ` Peter Zijlstra
2024-10-26  7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.