linux-kernel.vger.kernel.org archive mirror
* [PATCH] workqueue: Use atomic_try_cmpxchg_relaxed() in tryinc_node_nr_active()
From: Uros Bizjak @ 2025-07-09 13:19 UTC
  To: linux-kernel; +Cc: Uros Bizjak, Tejun Heo, Lai Jiangshan

Use the try_cmpxchg() family of locking primitives instead of
cmpxchg(*ptr, old, new) == old.

The x86 CMPXCHG instruction returns success in the ZF flag, so this
change saves a compare after CMPXCHG (and related move instruction
in front of CMPXCHG).

Also, try_cmpxchg() implicitly assigns the old *ptr value to "old" when
CMPXCHG fails, so there is no need to re-read the value in the loop.
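
The write-back on failure can be illustrated in userspace. This is only
a sketch: C11 <stdatomic.h> stands in for the kernel's atomic_t API, and
try_cmpxchg_sketch() and demo_write_back() are hypothetical names, not
kernel functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_try_cmpxchg_relaxed().
 * On failure, the C11 compare-exchange stores the value it observed
 * in *v into *old, which is the behaviour relied on above.
 */
static bool try_cmpxchg_sketch(atomic_int *v, int *old, int new_val)
{
	return atomic_compare_exchange_strong_explicit(v, old, new_val,
						       memory_order_relaxed,
						       memory_order_relaxed);
}

/* A stale "old" is refreshed by the failed attempt, so no reload is needed. */
static void demo_write_back(void)
{
	atomic_int v = 5;
	int old = 3;					/* stale guess */

	assert(!try_cmpxchg_sketch(&v, &old, 4));	/* fails: v is 5, not 3 */
	assert(old == 5);				/* old now holds the observed value */
	assert(try_cmpxchg_sketch(&v, &old, 6));	/* retry succeeds without a reload */
	assert(atomic_load(&v) == 6);
}
```

Because the failed attempt already hands back the current value, a retry
loop built on this form only needs one explicit load before the loop.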

The generated assembly improves from:

     3f7:	44 8b 0a             	mov    (%rdx),%r9d
     3fa:	eb 12                	jmp    40e <...>
     3fc:	8d 79 01             	lea    0x1(%rcx),%edi
     3ff:	89 c8                	mov    %ecx,%eax
     401:	f0 0f b1 7a 04       	lock cmpxchg %edi,0x4(%rdx)
     406:	39 c1                	cmp    %eax,%ecx
     408:	0f 84 83 00 00 00    	je     491 <...>
     40e:	8b 4a 04             	mov    0x4(%rdx),%ecx
     411:	41 39 c9             	cmp    %ecx,%r9d
     414:	7f e6                	jg     3fc <...>

to:

    256b:	45 8b 08             	mov    (%r8),%r9d
    256e:	41 8b 40 04          	mov    0x4(%r8),%eax
    2572:	41 39 c1             	cmp    %eax,%r9d
    2575:	7e 10                	jle    2587 <...>
    2577:	8d 78 01             	lea    0x1(%rax),%edi
    257a:	f0 41 0f b1 78 04    	lock cmpxchg %edi,0x4(%r8)
    2580:	75 f0                	jne    2572 <...>

No functional change intended.
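
One way to sanity-check that claim outside the kernel is to model both
loop shapes in userspace C11. This is a sketch under assumptions, not
the kernel code: struct nna_sketch, tryinc_old() and tryinc_new() are
hypothetical names, and C11 atomics replace atomic_t and READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two fields used by the function. */
struct nna_sketch {
	int max;
	atomic_int nr;
};

/* Old shape: reload nr on every iteration and compare the returned value. */
static bool tryinc_old(struct nna_sketch *nna)
{
	int max = nna->max;

	while (true) {
		int old = atomic_load_explicit(&nna->nr, memory_order_relaxed);
		int tmp = old;

		if (old >= max)
			return false;
		/* Emulates cmpxchg(): tmp ends up holding the value seen in nr. */
		atomic_compare_exchange_strong_explicit(&nna->nr, &tmp, old + 1,
							memory_order_relaxed,
							memory_order_relaxed);
		if (tmp == old)
			return true;
	}
}

/* New shape: load once, then let each failed attempt refresh "old". */
static bool tryinc_new(struct nna_sketch *nna)
{
	int max = nna->max;
	int old = atomic_load_explicit(&nna->nr, memory_order_relaxed);

	do {
		if (old >= max)
			return false;
	} while (!atomic_compare_exchange_strong_explicit(&nna->nr, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed));

	return true;
}
```

Both helpers increment nr only while it is below max and refuse once the
limit is reached; the new shape simply drops the per-iteration reload.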

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
---
 kernel/workqueue.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9f9148075828..f0bd688bb88b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1686,17 +1686,14 @@ static void __pwq_activate_work(struct pool_workqueue *pwq,
 static bool tryinc_node_nr_active(struct wq_node_nr_active *nna)
 {
 	int max = READ_ONCE(nna->max);
+	int old = atomic_read(&nna->nr);
 
-	while (true) {
-		int old, tmp;
-
-		old = atomic_read(&nna->nr);
+	do {
 		if (old >= max)
 			return false;
-		tmp = atomic_cmpxchg_relaxed(&nna->nr, old, old + 1);
-		if (tmp == old)
-			return true;
-	}
+	} while (!atomic_try_cmpxchg_relaxed(&nna->nr, &old, old + 1));
+
+	return true;
 }
 
 /**
-- 
2.50.0



* Re: [PATCH] workqueue: Use atomic_try_cmpxchg_relaxed() in tryinc_node_nr_active()
From: Tejun Heo @ 2025-07-17 18:15 UTC
  To: Uros Bizjak; +Cc: linux-kernel, Lai Jiangshan

On Wed, Jul 09, 2025 at 03:19:03PM +0200, Uros Bizjak wrote:
> Use try_cmpxchg() family of locking primitives instead of
> cmpxchg(*ptr, old, new) == old.
> 
> The x86 CMPXCHG instruction returns success in the ZF flag, so this
> change saves a compare after CMPXCHG (and related move instruction
> in front of CMPXCHG).
> 
> Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when
> CMPXCHG fails. There is no need to re-read the value in the loop.
> 
> The generated assembly improves from:
> 
>      3f7:	44 8b 0a             	mov    (%rdx),%r9d
>      3fa:	eb 12                	jmp    40e <...>
>      3fc:	8d 79 01             	lea    0x1(%rcx),%edi
>      3ff:	89 c8                	mov    %ecx,%eax
>      401:	f0 0f b1 7a 04       	lock cmpxchg %edi,0x4(%rdx)
>      406:	39 c1                	cmp    %eax,%ecx
>      408:	0f 84 83 00 00 00    	je     491 <...>
>      40e:	8b 4a 04             	mov    0x4(%rdx),%ecx
>      411:	41 39 c9             	cmp    %ecx,%r9d
>      414:	7f e6                	jg     3fc <...>
> 
> to:
> 
>     256b:	45 8b 08             	mov    (%r8),%r9d
>     256e:	41 8b 40 04          	mov    0x4(%r8),%eax
>     2572:	41 39 c1             	cmp    %eax,%r9d
>     2575:	7e 10                	jle    2587 <...>
>     2577:	8d 78 01             	lea    0x1(%rax),%edi
>     257a:	f0 41 0f b1 78 04    	lock cmpxchg %edi,0x4(%r8)
>     2580:	75 f0                	jne    2572 <...>
> 
> No functional change intended.
> 
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>

Applied to wq/for-6.17.

Thanks.

-- 
tejun
