* [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
@ 2024-11-20 15:06 Uros Bizjak
  2024-11-20 15:34 ` Alex Elder
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Uros Bizjak @ 2024-11-20 15:06 UTC (permalink / raw)
  To: linux-xfs, linux-kernel
  Cc: Uros Bizjak, Chandan Babu R, Darrick J. Wong, Christoph Hellwig,
	Dave Chinner

A try_cmpxchg() loop with a constant "new" value can be replaced
with a single xchg() that atomically reads and clears the location.

The code on x86_64 improves from:

    1e7f:	48 89 4c 24 10       	mov    %rcx,0x10(%rsp)
    1e84:	48 03 14 c5 00 00 00 	add    0x0(,%rax,8),%rdx
    1e8b:	00
			1e88: R_X86_64_32S	__per_cpu_offset
    1e8c:	8b 02                	mov    (%rdx),%eax
    1e8e:	41 89 c5             	mov    %eax,%r13d
    1e91:	31 c9                	xor    %ecx,%ecx
    1e93:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
    1e97:	75 f5                	jne    1e8e <xlog_cil_commit+0x84e>
    1e99:	48 8b 4c 24 10       	mov    0x10(%rsp),%rcx
    1e9e:	45 01 e9             	add    %r13d,%r9d

to just:

    1e7f:	48 03 14 cd 00 00 00 	add    0x0(,%rcx,8),%rdx
    1e86:	00
			1e83: R_X86_64_32S	__per_cpu_offset
    1e87:	31 c9                	xor    %ecx,%ecx
    1e89:	87 0a                	xchg   %ecx,(%rdx)
    1e8b:	41 01 cb             	add    %ecx,%r11d

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Chandan Babu R <chandan.babu@oracle.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log_cil.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 80da0cf87d7a..9d667be1d909 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -171,11 +171,8 @@ xlog_cil_insert_pcp_aggregate(
 	 */
 	for_each_cpu(cpu, &ctx->cil_pcpmask) {
 		struct xlog_cil_pcp	*cilpcp = per_cpu_ptr(cil->xc_pcp, cpu);
-		int			old = READ_ONCE(cilpcp->space_used);
 
-		while (!try_cmpxchg(&cilpcp->space_used, &old, 0))
-			;
-		count += old;
+		count += xchg(&cilpcp->space_used, 0);
 	}
 	atomic_add(count, &ctx->space_used);
 }
-- 
2.42.0



* Re: [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
  2024-11-20 15:06 [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate() Uros Bizjak
@ 2024-11-20 15:34 ` Alex Elder
  2024-11-20 15:36   ` Uros Bizjak
  2024-11-20 20:37 ` Dave Chinner
  2024-11-28 12:19 ` Carlos Maiolino
  2 siblings, 1 reply; 6+ messages in thread
From: Alex Elder @ 2024-11-20 15:34 UTC (permalink / raw)
  To: Uros Bizjak, linux-xfs, linux-kernel
  Cc: Chandan Babu R, Darrick J. Wong, Christoph Hellwig, Dave Chinner

On 11/20/24 9:06 AM, Uros Bizjak wrote:
> try_cmpxchg() loop with constant "new" value can be substituted
> with just xchg() to atomically get and clear the location.

You're right.  With a constant new value (0), there is no need
to loop to ensure we get a "stable" update.
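
To make the equivalence concrete, here is a minimal userspace sketch
using C11 <stdatomic.h> rather than the kernel primitives. It is only
an analogy: drain_cmpxchg(), drain_xchg() and check_drain_equivalence()
are hypothetical names, and atomic_compare_exchange_weak() and
atomic_exchange() stand in for try_cmpxchg() and xchg().

```c
#include <assert.h>
#include <stdatomic.h>

/* Old pattern: load, then retry compare-and-swap until storing 0 succeeds. */
static int drain_cmpxchg(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure 'old' is refreshed with the current value; just retry. */
	while (!atomic_compare_exchange_weak(v, &old, 0))
		;
	return old;
}

/* New pattern: one unconditional exchange returns the previous value. */
static int drain_xchg(atomic_int *v)
{
	return atomic_exchange(v, 0);
}

/* Both helpers return the old value and leave the location cleared. */
static void check_drain_equivalence(void)
{
	atomic_int a = 42, b = 42;

	assert(drain_cmpxchg(&a) == 42 && atomic_load(&a) == 0);
	assert(drain_xchg(&b) == 42 && atomic_load(&b) == 0);
}
```

Because the value being stored never depends on the value read, the
compare step has nothing to protect, which is why the loop collapses
into a single exchange.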

Is the READ_ONCE() still needed?

					-Alex

> The code on x86_64 improves from:
> 
>      1e7f:	48 89 4c 24 10       	mov    %rcx,0x10(%rsp)
>      1e84:	48 03 14 c5 00 00 00 	add    0x0(,%rax,8),%rdx
>      1e8b:	00
> 			1e88: R_X86_64_32S	__per_cpu_offset
>      1e8c:	8b 02                	mov    (%rdx),%eax
>      1e8e:	41 89 c5             	mov    %eax,%r13d
>      1e91:	31 c9                	xor    %ecx,%ecx
>      1e93:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
>      1e97:	75 f5                	jne    1e8e <xlog_cil_commit+0x84e>
>      1e99:	48 8b 4c 24 10       	mov    0x10(%rsp),%rcx
>      1e9e:	45 01 e9             	add    %r13d,%r9d
> 
> to just:
> 
>      1e7f:	48 03 14 cd 00 00 00 	add    0x0(,%rcx,8),%rdx
>      1e86:	00
> 			1e83: R_X86_64_32S	__per_cpu_offset
>      1e87:	31 c9                	xor    %ecx,%ecx
>      1e89:	87 0a                	xchg   %ecx,(%rdx)
>      1e8b:	41 01 cb             	add    %ecx,%r11d
> 
> No functional change intended.
> 
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Chandan Babu R <chandan.babu@oracle.com>
> Cc: "Darrick J. Wong" <djwong@kernel.org>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Dave Chinner <dchinner@redhat.com>
> ---
>   fs/xfs/xfs_log_cil.c | 5 +----
>   1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 80da0cf87d7a..9d667be1d909 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -171,11 +171,8 @@ xlog_cil_insert_pcp_aggregate(
>   	 */
>   	for_each_cpu(cpu, &ctx->cil_pcpmask) {
>   		struct xlog_cil_pcp	*cilpcp = per_cpu_ptr(cil->xc_pcp, cpu);
> -		int			old = READ_ONCE(cilpcp->space_used);
>   
> -		while (!try_cmpxchg(&cilpcp->space_used, &old, 0))
> -			;
> -		count += old;
> +		count += xchg(&cilpcp->space_used, 0);
>   	}
>   	atomic_add(count, &ctx->space_used);
>   }



* Re: [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
  2024-11-20 15:34 ` Alex Elder
@ 2024-11-20 15:36   ` Uros Bizjak
  2024-11-20 15:37     ` Alex Elder
  0 siblings, 1 reply; 6+ messages in thread
From: Uros Bizjak @ 2024-11-20 15:36 UTC (permalink / raw)
  To: Alex Elder
  Cc: linux-xfs, linux-kernel, Chandan Babu R, Darrick J. Wong,
	Christoph Hellwig, Dave Chinner

On Wed, Nov 20, 2024 at 4:34 PM Alex Elder <elder@riscstar.com> wrote:
>
> On 11/20/24 9:06 AM, Uros Bizjak wrote:
> > try_cmpxchg() loop with constant "new" value can be substituted
> > with just xchg() to atomically get and clear the location.
>
> You're right.  With a constant new value (0), there is no need
> to loop to ensure we get a "stable" update.
>
> Is the READ_ONCE() still needed?

No, xchg() guarantees atomic access on its own.
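
As a rough illustration (a userspace C11 sketch, not the kernel code),
the aggregation loop needs no separate load before the exchange. The
names aggregate_slots() and NR_SLOTS are hypothetical, and a plain
array of atomic_int stands in for the per-CPU counters:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the CPUs in the mask */

/* Drain every slot with one atomic exchange each; no prior load needed. */
static int aggregate_slots(atomic_int slots[], size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++)
		count += atomic_exchange(&slots[i], 0);
	return count;
}

/* A second pass over already drained slots must add nothing. */
static void check_aggregate(void)
{
	atomic_int slots[NR_SLOTS] = { 1, 2, 3, 4 };

	assert(aggregate_slots(slots, NR_SLOTS) == 10);
	assert(aggregate_slots(slots, NR_SLOTS) == 0);
}
```

In the old code the READ_ONCE() only seeded the expected value for the
first try_cmpxchg() attempt; with an unconditional exchange there is no
expected value to seed.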

Uros.


* Re: [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
  2024-11-20 15:36   ` Uros Bizjak
@ 2024-11-20 15:37     ` Alex Elder
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Elder @ 2024-11-20 15:37 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-xfs, linux-kernel, Chandan Babu R, Darrick J. Wong,
	Christoph Hellwig, Dave Chinner

On 11/20/24 9:36 AM, Uros Bizjak wrote:
> On Wed, Nov 20, 2024 at 4:34 PM Alex Elder <elder@riscstar.com> wrote:
>>
>> On 11/20/24 9:06 AM, Uros Bizjak wrote:
>>> try_cmpxchg() loop with constant "new" value can be substituted
>>> with just xchg() to atomically get and clear the location.
>>
>> You're right.  With a constant new value (0), there is no need
>> to loop to ensure we get a "stable" update.
>>
>> Is the READ_ONCE() still needed?
> 
> No, xchg() guarantees atomic access on its own.
> 
> Uros.

Based on that:

Reviewed-by: Alex Elder <elder@riscstar.com>


* Re: [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
  2024-11-20 15:06 [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate() Uros Bizjak
  2024-11-20 15:34 ` Alex Elder
@ 2024-11-20 20:37 ` Dave Chinner
  2024-11-28 12:19 ` Carlos Maiolino
  2 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2024-11-20 20:37 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-xfs, linux-kernel, Chandan Babu R, Darrick J. Wong,
	Christoph Hellwig, Dave Chinner

On Wed, Nov 20, 2024 at 04:06:22PM +0100, Uros Bizjak wrote:
> try_cmpxchg() loop with constant "new" value can be substituted
> with just xchg() to atomically get and clear the location.
> 
> The code on x86_64 improves from:
> 
>     1e7f:	48 89 4c 24 10       	mov    %rcx,0x10(%rsp)
>     1e84:	48 03 14 c5 00 00 00 	add    0x0(,%rax,8),%rdx
>     1e8b:	00
> 			1e88: R_X86_64_32S	__per_cpu_offset
>     1e8c:	8b 02                	mov    (%rdx),%eax
>     1e8e:	41 89 c5             	mov    %eax,%r13d
>     1e91:	31 c9                	xor    %ecx,%ecx
>     1e93:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
>     1e97:	75 f5                	jne    1e8e <xlog_cil_commit+0x84e>
>     1e99:	48 8b 4c 24 10       	mov    0x10(%rsp),%rcx
>     1e9e:	45 01 e9             	add    %r13d,%r9d
> 
> to just:
> 
>     1e7f:	48 03 14 cd 00 00 00 	add    0x0(,%rcx,8),%rdx
>     1e86:	00
> 			1e83: R_X86_64_32S	__per_cpu_offset
>     1e87:	31 c9                	xor    %ecx,%ecx
>     1e89:	87 0a                	xchg   %ecx,(%rdx)
>     1e8b:	41 01 cb             	add    %ecx,%r11d
> 
> No functional change intended.
> 
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Chandan Babu R <chandan.babu@oracle.com>
> Cc: "Darrick J. Wong" <djwong@kernel.org>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_log_cil.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 80da0cf87d7a..9d667be1d909 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -171,11 +171,8 @@ xlog_cil_insert_pcp_aggregate(
>  	 */
>  	for_each_cpu(cpu, &ctx->cil_pcpmask) {
>  		struct xlog_cil_pcp	*cilpcp = per_cpu_ptr(cil->xc_pcp, cpu);
> -		int			old = READ_ONCE(cilpcp->space_used);
>  
> -		while (!try_cmpxchg(&cilpcp->space_used, &old, 0))
> -			;
> -		count += old;
> +		count += xchg(&cilpcp->space_used, 0);
>  	}
>  	atomic_add(count, &ctx->space_used);
>  }

Looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
  2024-11-20 15:06 [PATCH] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate() Uros Bizjak
  2024-11-20 15:34 ` Alex Elder
  2024-11-20 20:37 ` Dave Chinner
@ 2024-11-28 12:19 ` Carlos Maiolino
  2 siblings, 0 replies; 6+ messages in thread
From: Carlos Maiolino @ 2024-11-28 12:19 UTC (permalink / raw)
  To: linux-xfs, linux-kernel, Uros Bizjak
  Cc: Chandan Babu R, Darrick J. Wong, Christoph Hellwig, Dave Chinner

On Wed, 20 Nov 2024 16:06:22 +0100, Uros Bizjak wrote:
> try_cmpxchg() loop with constant "new" value can be substituted
> with just xchg() to atomically get and clear the location.
> 
> The code on x86_64 improves from:
> 
>     1e7f:	48 89 4c 24 10       	mov    %rcx,0x10(%rsp)
>     1e84:	48 03 14 c5 00 00 00 	add    0x0(,%rax,8),%rdx
>     1e8b:	00
> 			1e88: R_X86_64_32S	__per_cpu_offset
>     1e8c:	8b 02                	mov    (%rdx),%eax
>     1e8e:	41 89 c5             	mov    %eax,%r13d
>     1e91:	31 c9                	xor    %ecx,%ecx
>     1e93:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
>     1e97:	75 f5                	jne    1e8e <xlog_cil_commit+0x84e>
>     1e99:	48 8b 4c 24 10       	mov    0x10(%rsp),%rcx
>     1e9e:	45 01 e9             	add    %r13d,%r9d
> 
> [...]

Applied to for-next, thanks!

[1/1] xfs: Use xchg() in xlog_cil_insert_pcp_aggregate()
      commit: 214093534f3c046bf5acc9affbf4e6bd9af4538b

Best regards,
-- 
Carlos Maiolino <cem@kernel.org>

