* [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
@ 2024-10-01 11:45 Uros Bizjak
2024-10-02 15:47 ` Waiman Long
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Uros Bizjak @ 2024-10-01 11:45 UTC (permalink / raw)
To: linux-kernel
Cc: Uros Bizjak, Peter Zijlstra, Ingo Molnar, Will Deacon,
Waiman Long, Boqun Feng
Replace this pattern in osq_unlock():
atomic_cmpxchg(*ptr, old, new) == old
... with the simpler and faster:
atomic_try_cmpxchg(*ptr, &old, new)
The x86 CMPXCHG instruction returns success in the ZF flag,
so this change saves a compare after the CMPXCHG. The code
in the fast path of osq_unlock() improves from:
11b: 31 c9 xor %ecx,%ecx
11d: 8d 50 01 lea 0x1(%rax),%edx
120: 89 d0 mov %edx,%eax
122: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
126: 39 c2 cmp %eax,%edx
128: 75 05 jne 12f <...>
to:
12b: 31 d2 xor %edx,%edx
12d: 83 c0 01 add $0x1,%eax
130: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
134: 75 05 jne 13b <...>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
kernel/locking/osq_lock.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 75a6f6133866..b4233dc2c2b0 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
/*
* Fast path for the uncontended case.
*/
- if (likely(atomic_cmpxchg_release(&lock->tail, curr,
- OSQ_UNLOCKED_VAL) == curr))
+ if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
return;
/*
--
2.46.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
@ 2024-10-02 15:47 ` Waiman Long
2024-10-25 8:02 ` Peter Zijlstra
2024-10-26 7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2 siblings, 0 replies; 4+ messages in thread
From: Waiman Long @ 2024-10-02 15:47 UTC (permalink / raw)
To: Uros Bizjak, linux-kernel
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
On 10/1/24 07:45, Uros Bizjak wrote:
> Replace this pattern in osq_unlock():
>
> atomic_cmpxchg(*ptr, old, new) == old
>
> ... with the simpler and faster:
>
> atomic_try_cmpxchg(*ptr, &old, new)
>
> The x86 CMPXCHG instruction returns success in the ZF flag,
> so this change saves a compare after the CMPXCHG. The code
> in the fast path of osq_unlock() improves from:
>
> 11b: 31 c9 xor %ecx,%ecx
> 11d: 8d 50 01 lea 0x1(%rax),%edx
> 120: 89 d0 mov %edx,%eax
> 122: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
> 126: 39 c2 cmp %eax,%edx
> 128: 75 05 jne 12f <...>
>
> to:
>
> 12b: 31 d2 xor %edx,%edx
> 12d: 83 c0 01 add $0x1,%eax
> 130: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
> 134: 75 05 jne 13b <...>
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> ---
> kernel/locking/osq_lock.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 75a6f6133866..b4233dc2c2b0 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> /*
> * Fast path for the uncontended case.
> */
> - if (likely(atomic_cmpxchg_release(&lock->tail, curr,
> - OSQ_UNLOCKED_VAL) == curr))
> + if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
> return;
>
> /*
LGTM
Acked-by: Waiman Long <longman@redhat.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
2024-10-02 15:47 ` Waiman Long
@ 2024-10-25 8:02 ` Peter Zijlstra
2024-10-26 7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2024-10-25 8:02 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-kernel, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng
On Tue, Oct 01, 2024 at 01:45:57PM +0200, Uros Bizjak wrote:
> Replace this pattern in osq_unlock():
>
> atomic_cmpxchg(*ptr, old, new) == old
>
> ... with the simpler and faster:
>
> atomic_try_cmpxchg(*ptr, &old, new)
>
> The x86 CMPXCHG instruction returns success in the ZF flag,
> so this change saves a compare after the CMPXCHG. The code
> in the fast path of osq_unlock() improves from:
>
> 11b: 31 c9 xor %ecx,%ecx
> 11d: 8d 50 01 lea 0x1(%rax),%edx
> 120: 89 d0 mov %edx,%eax
> 122: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
> 126: 39 c2 cmp %eax,%edx
> 128: 75 05 jne 12f <...>
>
> to:
>
> 12b: 31 d2 xor %edx,%edx
> 12d: 83 c0 01 add $0x1,%eax
> 130: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
> 134: 75 05 jne 13b <...>
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Thanks!
^ permalink raw reply [flat|nested] 4+ messages in thread
* [tip: locking/core] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
2024-10-02 15:47 ` Waiman Long
2024-10-25 8:02 ` Peter Zijlstra
@ 2024-10-26 7:35 ` tip-bot2 for Uros Bizjak
2 siblings, 0 replies; 4+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2024-10-26 7:35 UTC (permalink / raw)
To: linux-tip-commits
Cc: Uros Bizjak, Peter Zijlstra (Intel), Waiman Long, x86,
linux-kernel
The following commit has been merged into the locking/core branch of tip:
Commit-ID: 0d75e0c420e52b4057a2de274054a5274209a2ae
Gitweb: https://git.kernel.org/tip/0d75e0c420e52b4057a2de274054a5274209a2ae
Author: Uros Bizjak <ubizjak@gmail.com>
AuthorDate: Tue, 01 Oct 2024 13:45:57 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 25 Oct 2024 10:01:50 +02:00
locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
Replace this pattern in osq_unlock():
atomic_cmpxchg(*ptr, old, new) == old
... with the simpler and faster:
atomic_try_cmpxchg(*ptr, &old, new)
The x86 CMPXCHG instruction returns success in the ZF flag,
so this change saves a compare after the CMPXCHG. The code
in the fast path of osq_unlock() improves from:
11b: 31 c9 xor %ecx,%ecx
11d: 8d 50 01 lea 0x1(%rax),%edx
120: 89 d0 mov %edx,%eax
122: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
126: 39 c2 cmp %eax,%edx
128: 75 05 jne 12f <...>
to:
12b: 31 d2 xor %edx,%edx
12d: 83 c0 01 add $0x1,%eax
130: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
134: 75 05 jne 13b <...>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20241001114606.820277-1-ubizjak@gmail.com
---
kernel/locking/osq_lock.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 75a6f61..b4233dc 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -215,8 +215,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
/*
* Fast path for the uncontended case.
*/
- if (likely(atomic_cmpxchg_release(&lock->tail, curr,
- OSQ_UNLOCKED_VAL) == curr))
+ if (atomic_try_cmpxchg_release(&lock->tail, &curr, OSQ_UNLOCKED_VAL))
return;
/*
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-10-26 7:35 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-01 11:45 [PATCH] locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() Uros Bizjak
2024-10-02 15:47 ` Waiman Long
2024-10-25 8:02 ` Peter Zijlstra
2024-10-26 7:35 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.