Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32()
@ 2026-05-19  9:08 Will Deacon
  2026-05-19  9:24 ` Yeoreum Yun
  2026-05-19 15:09 ` Catalin Marinas
  0 siblings, 2 replies; 3+ messages in thread
From: Will Deacon @ 2026-05-19  9:08 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Will Deacon, Catalin Marinas, Yeoreum Yun

The LSUI futex implementation relies on a cmpxchg() loop to implement
FUTEX_OP_XOR, as the architecture doesn't provide unprivileged *EOR
atomics. Since the unprivileged 'CAST' instructions used to implement
the cmpxchg() can only operate on 64-bit memory locations, the
__lsui_cmpxchg32() helper function performs a song and dance to marshall
the 32-bit futex value into the correct part of a 64-bit register and
fill the remaining bytes with the neighbouring data.

A consequence of this structure is that the 'CAST' failure/success
condition ends up being split into two separate 32-bit checks across
__lsui_cmpxchg32() and its caller. This is a little fiddly to read and
introduces some additional local variables which can be avoided if the
check is done in one place.

Tweak __lsui_cmpxchg32() so that it performs the full 64-bit check on
the value returned from the 'CAST' instruction and returns success to
its caller only in the case that the cmpxchg() operation has succeeded.
With that in place, simplify the outer loop in __lsui_futex_atomic_eor()
to pass 'oldval' by reference and return unless the cmpxchg() operation
returns -EAGAIN.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/futex.h | 53 ++++++++++++----------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index d1d2ff9d323a..db84a7b2de74 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -151,42 +151,31 @@ __lsui_cmpxchg64(u64 __user *uaddr, u64 *oldval, u64 newval)
 }
 
 static __always_inline int
-__lsui_cmpxchg32(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
+__lsui_cmpxchg32(u32 __user *uaddr, u32 *oldval, u32 newval)
 {
 	u64 __user *uaddr64;
 	bool futex_pos, other_pos;
-	u32 other, orig_other;
 	union {
 		u32 futex[2];
 		u64 raw;
-	} oval64, orig64, nval64;
+	} orig64, oval64, nval64;
 
 	uaddr64 = (u64 __user *)PTR_ALIGN_DOWN(uaddr, sizeof(u64));
 	futex_pos = !IS_ALIGNED((unsigned long)uaddr, sizeof(u64));
 	other_pos = !futex_pos;
 
-	oval64.futex[futex_pos] = oldval;
-	if (get_user(oval64.futex[other_pos], (u32 __user *)uaddr64 + other_pos))
+	orig64.futex[futex_pos] = *oldval;
+	if (get_user(orig64.futex[other_pos], (u32 __user *)uaddr64 + other_pos))
 		return -EFAULT;
 
-	orig64.raw = oval64.raw;
-
+	nval64 = oval64 = orig64;
 	nval64.futex[futex_pos] = newval;
-	nval64.futex[other_pos] = oval64.futex[other_pos];
 
 	if (__lsui_cmpxchg64(uaddr64, &oval64.raw, nval64.raw))
 		return -EFAULT;
 
-	oldval = oval64.futex[futex_pos];
-	other = oval64.futex[other_pos];
-	orig_other = orig64.futex[other_pos];
-
-	if (other != orig_other)
-		return -EAGAIN;
-
-	*oval = oldval;
-
-	return 0;
+	*oldval = oval64.futex[futex_pos];
+	return oval64.raw == orig64.raw ? 0 : -EAGAIN;
 }
 
 static __always_inline int
@@ -202,7 +191,7 @@ __lsui_futex_atomic_and(int oparg, u32 __user *uaddr, int *oval)
 static __always_inline int
 __lsui_futex_atomic_eor(int oparg, u32 __user *uaddr, int *oval)
 {
-	u32 oldval, newval, val;
+	u32 oldval, newval;
 	int ret, i;
 
 	if (get_user(oldval, uaddr))
@@ -214,33 +203,27 @@ __lsui_futex_atomic_eor(int oparg, u32 __user *uaddr, int *oval)
 	for (i = 0; i < FUTEX_MAX_LOOPS; i++) {
 		newval = oldval ^ oparg;
 
-		ret = __lsui_cmpxchg32(uaddr, oldval, newval, &val);
-		switch (ret) {
-		case -EFAULT:
-			return ret;
-		case -EAGAIN:
-			continue;
-		}
-
-		if (val == oldval) {
-			*oval = val;
-			return 0;
-		}
-
-		oldval = val;
+		ret = __lsui_cmpxchg32(uaddr, &oldval, newval);
+		if (ret != -EAGAIN)
+			break;
 	}
 
-	return -EAGAIN;
+	*oval = oldval;
+	return ret;
 }
 
 static __always_inline int
 __lsui_futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
 {
+	int ret;
+
 	/*
 	 * Callers of futex_atomic_cmpxchg_inatomic() already retry on
 	 * -EAGAIN, no need for another loop of max retries.
 	 */
-	return __lsui_cmpxchg32(uaddr, oldval, newval, oval);
+	ret = __lsui_cmpxchg32(uaddr, &oldval, newval);
+	*oval = oldval;
+	return ret;
 }
 #endif	/* CONFIG_ARM64_LSUI */
 
-- 
2.54.0.563.g4f69b47b94-goog



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32()
  2026-05-19  9:08 [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32() Will Deacon
@ 2026-05-19  9:24 ` Yeoreum Yun
  2026-05-19 15:09 ` Catalin Marinas
  1 sibling, 0 replies; 3+ messages in thread
From: Yeoreum Yun @ 2026-05-19  9:24 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, Catalin Marinas

Thanks.

At the time of writing the code, expected the upper layer to handle
changes to the futex value for cmpxchg. However, it now seems that it
simply returns -EAGAIN there as well.

Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>

> The LSUI futex implementation relies on a cmpxchg() loop to implement
> FUTEX_OP_XOR, as the architecture doesn't provide unprivileged *EOR
> atomics. Since the unprivileged 'CAST' instructions used to implement
> the cmpxchg() can only operate on 64-bit memory locations, the
> __lsui_cmpxchg32() helper function performs a song and dance to marshall
> the 32-bit futex value into the correct part of a 64-bit register and
> fill the remaining bytes with the neighbouring data.
> 
> A consequence of this structure is that the 'CAST' failure/success
> condition ends up being split into two separate 32-bit checks across
> __lsui_cmpxchg32() and its caller. This is a little fiddly to read and
> introduces some additional local variables which can be avoided if the
> check is done in one place.
> 
> Tweak __lsui_cmpxchg32() so that it performs the full 64-bit check on
> the value returned from the 'CAST' instruction and returns success to
> its caller only in the case that the cmpxchg() operation has succeeded.
> With that in place, simplify the outer loop in __lsui_futex_atomic_eor()
> to pass 'oldval' by reference and return unless the cmpxchg() operation
> returns -EAGAIN.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Yeoreum Yun <yeoreum.yun@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/futex.h | 53 ++++++++++++----------------------
>  1 file changed, 18 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
> index d1d2ff9d323a..db84a7b2de74 100644
> --- a/arch/arm64/include/asm/futex.h
> +++ b/arch/arm64/include/asm/futex.h
> @@ -151,42 +151,31 @@ __lsui_cmpxchg64(u64 __user *uaddr, u64 *oldval, u64 newval)
>  }
>  
>  static __always_inline int
> -__lsui_cmpxchg32(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
> +__lsui_cmpxchg32(u32 __user *uaddr, u32 *oldval, u32 newval)
>  {
>  	u64 __user *uaddr64;
>  	bool futex_pos, other_pos;
> -	u32 other, orig_other;
>  	union {
>  		u32 futex[2];
>  		u64 raw;
> -	} oval64, orig64, nval64;
> +	} orig64, oval64, nval64;
>  
>  	uaddr64 = (u64 __user *)PTR_ALIGN_DOWN(uaddr, sizeof(u64));
>  	futex_pos = !IS_ALIGNED((unsigned long)uaddr, sizeof(u64));
>  	other_pos = !futex_pos;
>  
> -	oval64.futex[futex_pos] = oldval;
> -	if (get_user(oval64.futex[other_pos], (u32 __user *)uaddr64 + other_pos))
> +	orig64.futex[futex_pos] = *oldval;
> +	if (get_user(orig64.futex[other_pos], (u32 __user *)uaddr64 + other_pos))
>  		return -EFAULT;
>  
> -	orig64.raw = oval64.raw;
> -
> +	nval64 = oval64 = orig64;
>  	nval64.futex[futex_pos] = newval;
> -	nval64.futex[other_pos] = oval64.futex[other_pos];
>  
>  	if (__lsui_cmpxchg64(uaddr64, &oval64.raw, nval64.raw))
>  		return -EFAULT;
>  
> -	oldval = oval64.futex[futex_pos];
> -	other = oval64.futex[other_pos];
> -	orig_other = orig64.futex[other_pos];
> -
> -	if (other != orig_other)
> -		return -EAGAIN;
> -
> -	*oval = oldval;
> -
> -	return 0;
> +	*oldval = oval64.futex[futex_pos];
> +	return oval64.raw == orig64.raw ? 0 : -EAGAIN;
>  }
>  
>  static __always_inline int
> @@ -202,7 +191,7 @@ __lsui_futex_atomic_and(int oparg, u32 __user *uaddr, int *oval)
>  static __always_inline int
>  __lsui_futex_atomic_eor(int oparg, u32 __user *uaddr, int *oval)
>  {
> -	u32 oldval, newval, val;
> +	u32 oldval, newval;
>  	int ret, i;
>  
>  	if (get_user(oldval, uaddr))
> @@ -214,33 +203,27 @@ __lsui_futex_atomic_eor(int oparg, u32 __user *uaddr, int *oval)
>  	for (i = 0; i < FUTEX_MAX_LOOPS; i++) {
>  		newval = oldval ^ oparg;
>  
> -		ret = __lsui_cmpxchg32(uaddr, oldval, newval, &val);
> -		switch (ret) {
> -		case -EFAULT:
> -			return ret;
> -		case -EAGAIN:
> -			continue;
> -		}
> -
> -		if (val == oldval) {
> -			*oval = val;
> -			return 0;
> -		}
> -
> -		oldval = val;
> +		ret = __lsui_cmpxchg32(uaddr, &oldval, newval);
> +		if (ret != -EAGAIN)
> +			break;
>  	}
>  
> -	return -EAGAIN;
> +	*oval = oldval;
> +	return ret;
>  }
>  
>  static __always_inline int
>  __lsui_futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
>  {
> +	int ret;
> +
>  	/*
>  	 * Callers of futex_atomic_cmpxchg_inatomic() already retry on
>  	 * -EAGAIN, no need for another loop of max retries.
>  	 */
> -	return __lsui_cmpxchg32(uaddr, oldval, newval, oval);
> +	ret = __lsui_cmpxchg32(uaddr, &oldval, newval);
> +	*oval = oldval;
> +	return ret;
>  }
>  #endif	/* CONFIG_ARM64_LSUI */
>  
> -- 
> 2.54.0.563.g4f69b47b94-goog
> 

-- 
Sincerely,
Yeoreum Yun


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32()
  2026-05-19  9:08 [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32() Will Deacon
  2026-05-19  9:24 ` Yeoreum Yun
@ 2026-05-19 15:09 ` Catalin Marinas
  1 sibling, 0 replies; 3+ messages in thread
From: Catalin Marinas @ 2026-05-19 15:09 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arm-kernel, Yeoreum Yun

On Tue, May 19, 2026 at 10:08:22AM +0100, Will Deacon wrote:
> The LSUI futex implementation relies on a cmpxchg() loop to implement
> FUTEX_OP_XOR, as the architecture doesn't provide unprivileged *EOR
> atomics. Since the unprivileged 'CAST' instructions used to implement
> the cmpxchg() can only operate on 64-bit memory locations, the
> __lsui_cmpxchg32() helper function performs a song and dance to marshall
> the 32-bit futex value into the correct part of a 64-bit register and
> fill the remaining bytes with the neighbouring data.

IIRC, the reason for the current __lsui_cmpxchg32() was not EOR but the
expected futex_atomic_cmpxchg_inatomic() semantics. Looking at it again,
we have wake_futex_pi() that does something else if the ret is 0 but the
value differs. Looking at it again, the caller of wake_futex_pi()
retries on -EAGAIN anyway, so I don't see a correctness issue, it will
eventually hit the condition.

(Sashiko complains about the change but I think we can ignore it)

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-19 15:09 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-19  9:08 [PATCH] arm64: futex: Consolidate 'old == new' check in __lsui_cmpxchg32() Will Deacon
2026-05-19  9:24 ` Yeoreum Yun
2026-05-19 15:09 ` Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox