linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros
@ 2023-04-06  8:20 Leonardo Bras
  2023-04-06  8:20 ` [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Leonardo Bras @ 2023-04-06  8:20 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras, Guo Ren,
	Andrea Parri, Conor Dooley
  Cc: linux-riscv, linux-kernel

While studying riscv's cmpxchg.h file, I got really interested in
understanding how RISCV asm implemented the different versions of
{cmp,}xchg.

When I understood the pattern, it made sense for me to remove the
duplications and create macros to make it easier to understand what exactly
changes between the versions: Instruction sufixes & barriers.

Thanks!
Leo

Changes since RFCv3:
- Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/

Changes since RFCv2:
- Fixed  macros that depend on having a local variable with a magic name
- Previous cast to (long) is now only applied on 4-bytes cmpxchg
https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/

Changes since RFCv1:
- Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/

Leonardo Bras (2):
  riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  riscv/cmpxchg: Deduplicate xchg() asm functions

 arch/riscv/include/asm/cmpxchg.h | 319 +++++++------------------------
 1 file changed, 67 insertions(+), 252 deletions(-)

-- 
2.40.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  2023-04-06  8:20 [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Leonardo Bras
@ 2023-04-06  8:20 ` Leonardo Bras
  2023-04-07  8:29   ` Guo Ren
  2023-04-06  8:20 ` [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
  2023-04-06 17:19 ` [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Andrea Parri
  2 siblings, 1 reply; 10+ messages in thread
From: Leonardo Bras @ 2023-04-06  8:20 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras, Guo Ren,
	Andrea Parri, Conor Dooley
  Cc: linux-riscv, linux-kernel

In this header every cmpxchg define (_relaxed, _acquire, _release,
vanilla) contain it's own asm file, both for 4-byte variables an 8-byte
variables, on a total of 8 versions of mostly the same asm.

This is usually bad, as it means any change may be done in up to 8
different places.

Unify those versions by creating a new define with enough parameters to
generate any version of the previous 8.

Then unify the result under a more general define, and simplify
arch_cmpxchg* generation

(This did not cause any change in generated asm)

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 arch/riscv/include/asm/cmpxchg.h | 184 ++++++-------------------------
 1 file changed, 36 insertions(+), 148 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 12debce235e52..f88fae357071c 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -163,51 +163,23 @@
  * store NEW in MEM.  Return the initial value in MEM.  Success is
  * indicated by comparing RETURN with OLD.
  */
-#define __cmpxchg_relaxed(ptr, old, new, size)				\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
 
-#define arch_cmpxchg_relaxed(ptr, o, n)					\
+#define ____cmpxchg(lr_sfx, sc_sfx, prepend, append, r, rc, p, co, o, n)\
 ({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
+	__asm__ __volatile__ (						\
+		prepend							\
+		"0:	lr" lr_sfx " %0, %2\n"				\
+		"	bne  %0, %z3, 1f\n"				\
+		"	sc" sc_sfx " %1, %z4, %2\n"			\
+		"	bnez %1, 0b\n"					\
+		append							\
+		"1:\n"							\
+		: "=&r" (r), "=&r" (rc), "+A" (*(p))			\
+		: "rJ" (co o), "rJ" (n)					\
+		: "memory");						\
 })
 
-#define __cmpxchg_acquire(ptr, old, new, size)				\
+#define ___cmpxchg(ptr, old, new, size, sc_sfx, prepend, append)	\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(*(ptr)) __old = (old);				\
@@ -216,28 +188,12 @@
 	register unsigned int __rc;					\
 	switch (size) {							\
 	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			RISCV_ACQUIRE_BARRIER				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
+		____cmpxchg(".w", ".w" sc_sfx, prepend, append,		\
+			    __ret, __rc, __ptr, (long), __old, __new);	\
 		break;							\
 	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			RISCV_ACQUIRE_BARRIER				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
+		____cmpxchg(".d", ".d" sc_sfx, prepend, append,		\
+			    __ret, __rc, __ptr, /**/, __old, __new);	\
 		break;							\
 	default:							\
 		BUILD_BUG();						\
@@ -245,105 +201,37 @@
 	__ret;								\
 })
 
-#define arch_cmpxchg_acquire(ptr, o, n)					\
+#define __cmpxchg_relaxed(ptr, old, new, size)				\
+	___cmpxchg(ptr, old, new, size, "", "", "")
+
+#define _arch_cmpxchg(order, ptr, o, n)					\
 ({									\
 	__typeof__(*(ptr)) _o_ = (o);					\
 	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_acquire((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
+	(__typeof__(*(ptr))) __cmpxchg ## order((ptr), _o_, _n_,	\
+						sizeof(*(ptr)));	\
 })
 
+#define arch_cmpxchg_relaxed(ptr, o, n)					\
+	_arch_cmpxchg(_relaxed, ptr, o, n)
+
+#define __cmpxchg_acquire(ptr, old, new, size)				\
+	___cmpxchg(ptr, old, new, size, "", "", RISCV_ACQUIRE_BARRIER)
+
+#define arch_cmpxchg_acquire(ptr, o, n)					\
+	_arch_cmpxchg(_acquire, ptr, o, n)
+
 #define __cmpxchg_release(ptr, old, new, size)				\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	___cmpxchg(ptr, old, new, size, "", RISCV_RELEASE_BARRIER, "")
 
 #define arch_cmpxchg_release(ptr, o, n)					\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
-})
+	_arch_cmpxchg(_release, ptr, o, n)
 
 #define __cmpxchg(ptr, old, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w.rl %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"	fence rw, rw\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d.rl %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"	fence rw, rw\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	___cmpxchg(ptr, old, new, size, ".rl", "", "	fence rw, rw\n")
 
 #define arch_cmpxchg(ptr, o, n)						\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg((ptr),				\
-				       _o_, _n_, sizeof(*(ptr)));	\
-})
+	_arch_cmpxchg(, ptr, o, n)
 
 #define arch_cmpxchg_local(ptr, o, n)					\
 	(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
-- 
2.40.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions
  2023-04-06  8:20 [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Leonardo Bras
  2023-04-06  8:20 ` [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
@ 2023-04-06  8:20 ` Leonardo Bras
  2023-04-07  8:31   ` Guo Ren
  2023-04-06 17:19 ` [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Andrea Parri
  2 siblings, 1 reply; 10+ messages in thread
From: Leonardo Bras @ 2023-04-06  8:20 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras, Guo Ren,
	Andrea Parri, Conor Dooley
  Cc: linux-riscv, linux-kernel

In this header every xchg define (_relaxed, _acquire, _release, vanilla)
contain it's own asm file, both for 4-byte variables an 8-byte variables,
on a total of 8 versions of mostly the same asm.

This is usually bad, as it means any change may be done in up to 8
different places.

Unify those versions by creating a new define with enough parameters to
generate any version of the previous 8.

Then unify the result under a more general define, and simplify
arch_xchg* generation.

(This did not cause any change in generated asm)

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 arch/riscv/include/asm/cmpxchg.h | 135 +++++++------------------------
 1 file changed, 31 insertions(+), 104 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index f88fae357071c..905a888d8b04d 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -11,25 +11,30 @@
 #include <asm/barrier.h>
 #include <asm/fence.h>
 
-#define __xchg_relaxed(ptr, new, size)					\
+#define ____xchg(sfx, prepend, append, r, p, n)				\
+({									\
+	__asm__ __volatile__ (						\
+		prepend							\
+		"	amoswap" sfx " %0, %2, %1\n"			\
+		append							\
+		: "=r" (r), "+A" (*(p))					\
+		: "r" (n)						\
+		: "memory");						\
+})
+
+#define ___xchg(ptr, new, size, sfx, prepend, append)			\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
 	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
+		____xchg(".w" sfx, prepend, append,			\
+			 __ret, __ptr, __new);				\
 		break;							\
 	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
+		____xchg(".d" sfx, prepend, append,			\
+			 __ret, __ptr, __new);				\
 		break;							\
 	default:							\
 		BUILD_BUG();						\
@@ -37,114 +42,36 @@
 	__ret;								\
 })
 
-#define arch_xchg_relaxed(ptr, x)					\
+#define __xchg_relaxed(ptr, new, size)					\
+	___xchg(ptr, new, size, "", "", "")
+
+#define _arch_xchg(order, ptr, x)					\
 ({									\
 	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_relaxed((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
+	(__typeof__(*(ptr))) __xchg ## order((ptr),			\
+					     _x_, sizeof(*(ptr)));	\
 })
 
+#define arch_xchg_relaxed(ptr, x)					\
+	_arch_xchg(_relaxed, ptr, x)
+
 #define __xchg_acquire(ptr, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w %0, %2, %1\n"			\
-			RISCV_ACQUIRE_BARRIER				\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d %0, %2, %1\n"			\
-			RISCV_ACQUIRE_BARRIER				\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	___xchg(ptr, new, size, "", "", RISCV_ACQUIRE_BARRIER)
 
 #define arch_xchg_acquire(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_acquire((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
-})
+	_arch_xchg(_acquire, ptr, x)
 
 #define __xchg_release(ptr, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.w %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.d %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	___xchg(ptr, new, size, "", RISCV_RELEASE_BARRIER, "")
 
 #define arch_xchg_release(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_release((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
-})
+	_arch_xchg(_release, ptr, x)
 
 #define __xchg(ptr, new, size)						\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w.aqrl %0, %2, %1\n"		\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d.aqrl %0, %2, %1\n"		\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	___xchg(ptr, new, size, ".aqrl", "", "")
 
 #define arch_xchg(ptr, x)						\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg((ptr), _x_, sizeof(*(ptr)));	\
-})
+	_arch_xchg(, ptr, x)
 
 #define xchg32(ptr, x)							\
 ({									\
-- 
2.40.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros
  2023-04-06  8:20 [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Leonardo Bras
  2023-04-06  8:20 ` [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
  2023-04-06  8:20 ` [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
@ 2023-04-06 17:19 ` Andrea Parri
  2023-08-02 21:51   ` Leonardo Brás
  2 siblings, 1 reply; 10+ messages in thread
From: Andrea Parri @ 2023-04-06 17:19 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Conor Dooley,
	linux-riscv, linux-kernel

On Thu, Apr 06, 2023 at 05:20:17AM -0300, Leonardo Bras wrote:
> While studying riscv's cmpxchg.h file, I got really interested in
> understanding how RISCV asm implemented the different versions of
> {cmp,}xchg.
> 
> When I understood the pattern, it made sense for me to remove the
> duplications and create macros to make it easier to understand what exactly
> changes between the versions: Instruction sufixes & barriers.
> 
> Thanks!
> Leo
> 
> Changes since RFCv3:
> - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> 
> Changes since RFCv2:
> - Fixed  macros that depend on having a local variable with a magic name
> - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> 
> Changes since RFCv1:
> - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> 
> Leonardo Bras (2):
>   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
>   riscv/cmpxchg: Deduplicate xchg() asm functions
> 
>  arch/riscv/include/asm/cmpxchg.h | 319 +++++++------------------------
>  1 file changed, 67 insertions(+), 252 deletions(-)

LGTM.  AFAICT, this would need to be rebased, cf. e.g.

  a8596dda1fbf7e ("arch: rename all internal names __xchg to __arch_xchg")

from the tip tree.

  Andrea

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  2023-04-06  8:20 ` [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
@ 2023-04-07  8:29   ` Guo Ren
  2023-04-18 19:10     ` Leonardo Bras Soares Passos
  0 siblings, 1 reply; 10+ messages in thread
From: Guo Ren @ 2023-04-07  8:29 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrea Parri,
	Conor Dooley, linux-riscv, linux-kernel

On Thu, Apr 6, 2023 at 4:20 PM Leonardo Bras <leobras@redhat.com> wrote:
>
> In this header every cmpxchg define (_relaxed, _acquire, _release,
> vanilla) contain it's own asm file, both for 4-byte variables an 8-byte
> variables, on a total of 8 versions of mostly the same asm.
>
> This is usually bad, as it means any change may be done in up to 8
> different places.
>
> Unify those versions by creating a new define with enough parameters to
> generate any version of the previous 8.
>
> Then unify the result under a more general define, and simplify
> arch_cmpxchg* generation
>
> (This did not cause any change in generated asm)
>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  arch/riscv/include/asm/cmpxchg.h | 184 ++++++-------------------------
>  1 file changed, 36 insertions(+), 148 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
> index 12debce235e52..f88fae357071c 100644
> --- a/arch/riscv/include/asm/cmpxchg.h
> +++ b/arch/riscv/include/asm/cmpxchg.h
> @@ -163,51 +163,23 @@
>   * store NEW in MEM.  Return the initial value in MEM.  Success is
>   * indicated by comparing RETURN with OLD.
>   */
> -#define __cmpxchg_relaxed(ptr, old, new, size)                         \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(*(ptr)) __old = (old);                               \
> -       __typeof__(*(ptr)) __new = (new);                               \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       register unsigned int __rc;                                     \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.w %0, %2\n"                          \
> -                       "       bne  %0, %z3, 1f\n"                     \
> -                       "       sc.w %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" ((long)__old), "rJ" (__new)              \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.d %0, %2\n"                          \
> -                       "       bne %0, %z3, 1f\n"                      \
> -                       "       sc.d %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" (__old), "rJ" (__new)                    \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
>
> -#define arch_cmpxchg_relaxed(ptr, o, n)                                        \
> +#define ____cmpxchg(lr_sfx, sc_sfx, prepend, append, r, rc, p, co, o, n)\
>  ({                                                                     \
> -       __typeof__(*(ptr)) _o_ = (o);                                   \
> -       __typeof__(*(ptr)) _n_ = (n);                                   \
> -       (__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),                   \
> -                                       _o_, _n_, sizeof(*(ptr)));      \
> +       __asm__ __volatile__ (                                          \
> +               prepend                                                 \
> +               "0:     lr" lr_sfx " %0, %2\n"                          \
> +               "       bne  %0, %z3, 1f\n"                             \
> +               "       sc" sc_sfx " %1, %z4, %2\n"                     \
> +               "       bnez %1, 0b\n"                                  \
> +               append                                                  \
> +               "1:\n"                                                  \
> +               : "=&r" (r), "=&r" (rc), "+A" (*(p))                    \
> +               : "rJ" (co o), "rJ" (n)                                 \
> +               : "memory");                                            \
>  })
>
> -#define __cmpxchg_acquire(ptr, old, new, size)                         \
> +#define ___cmpxchg(ptr, old, new, size, sc_sfx, prepend, append)       \
>  ({                                                                     \
>         __typeof__(ptr) __ptr = (ptr);                                  \
>         __typeof__(*(ptr)) __old = (old);                               \
> @@ -216,28 +188,12 @@
>         register unsigned int __rc;                                     \
>         switch (size) {                                                 \
>         case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.w %0, %2\n"                          \
> -                       "       bne  %0, %z3, 1f\n"                     \
> -                       "       sc.w %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       RISCV_ACQUIRE_BARRIER                           \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" ((long)__old), "rJ" (__new)              \
> -                       : "memory");                                    \
> +               ____cmpxchg(".w", ".w" sc_sfx, prepend, append,         \
> +                           __ret, __rc, __ptr, (long), __old, __new);  \
>                 break;                                                  \
>         case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.d %0, %2\n"                          \
> -                       "       bne %0, %z3, 1f\n"                      \
> -                       "       sc.d %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       RISCV_ACQUIRE_BARRIER                           \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" (__old), "rJ" (__new)                    \
> -                       : "memory");                                    \
> +               ____cmpxchg(".d", ".d" sc_sfx, prepend, append,         \
> +                           __ret, __rc, __ptr, /**/, __old, __new);    \
>                 break;                                                  \
>         default:                                                        \
>                 BUILD_BUG();                                            \
> @@ -245,105 +201,37 @@
>         __ret;                                                          \
>  })
>
> -#define arch_cmpxchg_acquire(ptr, o, n)                                        \
> +#define __cmpxchg_relaxed(ptr, old, new, size)                         \
> +       ___cmpxchg(ptr, old, new, size, "", "", "")
> +
> +#define _arch_cmpxchg(order, ptr, o, n)                                        \
>  ({                                                                     \
>         __typeof__(*(ptr)) _o_ = (o);                                   \
>         __typeof__(*(ptr)) _n_ = (n);                                   \
> -       (__typeof__(*(ptr))) __cmpxchg_acquire((ptr),                   \
> -                                       _o_, _n_, sizeof(*(ptr)));      \
> +       (__typeof__(*(ptr))) __cmpxchg ## order((ptr), _o_, _n_,        \
> +                                               sizeof(*(ptr)));        \
>  })
>
> +#define arch_cmpxchg_relaxed(ptr, o, n)                                        \
> +       _arch_cmpxchg(_relaxed, ptr, o, n)
> +
> +#define __cmpxchg_acquire(ptr, old, new, size)                         \
> +       ___cmpxchg(ptr, old, new, size, "", "", RISCV_ACQUIRE_BARRIER)
> +
> +#define arch_cmpxchg_acquire(ptr, o, n)                                        \
> +       _arch_cmpxchg(_acquire, ptr, o, n)
> +
>  #define __cmpxchg_release(ptr, old, new, size)                         \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(*(ptr)) __old = (old);                               \
> -       __typeof__(*(ptr)) __new = (new);                               \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       register unsigned int __rc;                                     \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       RISCV_RELEASE_BARRIER                           \
> -                       "0:     lr.w %0, %2\n"                          \
> -                       "       bne  %0, %z3, 1f\n"                     \
> -                       "       sc.w %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" ((long)__old), "rJ" (__new)              \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       RISCV_RELEASE_BARRIER                           \
> -                       "0:     lr.d %0, %2\n"                          \
> -                       "       bne %0, %z3, 1f\n"                      \
> -                       "       sc.d %1, %z4, %2\n"                     \
> -                       "       bnez %1, 0b\n"                          \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" (__old), "rJ" (__new)                    \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
> +       ___cmpxchg(ptr, old, new, size, "", RISCV_RELEASE_BARRIER, "")
>
>  #define arch_cmpxchg_release(ptr, o, n)                                        \
> -({                                                                     \
> -       __typeof__(*(ptr)) _o_ = (o);                                   \
> -       __typeof__(*(ptr)) _n_ = (n);                                   \
> -       (__typeof__(*(ptr))) __cmpxchg_release((ptr),                   \
> -                                       _o_, _n_, sizeof(*(ptr)));      \
> -})
> +       _arch_cmpxchg(_release, ptr, o, n)
>
>  #define __cmpxchg(ptr, old, new, size)                                 \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(*(ptr)) __old = (old);                               \
> -       __typeof__(*(ptr)) __new = (new);                               \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       register unsigned int __rc;                                     \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.w %0, %2\n"                          \
> -                       "       bne  %0, %z3, 1f\n"                     \
> -                       "       sc.w.rl %1, %z4, %2\n"                  \
> -                       "       bnez %1, 0b\n"                          \
> -                       "       fence rw, rw\n"                         \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" ((long)__old), "rJ" (__new)              \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "0:     lr.d %0, %2\n"                          \
> -                       "       bne %0, %z3, 1f\n"                      \
> -                       "       sc.d.rl %1, %z4, %2\n"                  \
> -                       "       bnez %1, 0b\n"                          \
> -                       "       fence rw, rw\n"                         \
> -                       "1:\n"                                          \
> -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> -                       : "rJ" (__old), "rJ" (__new)                    \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
> +       ___cmpxchg(ptr, old, new, size, ".rl", "", "    fence rw, rw\n")
>
>  #define arch_cmpxchg(ptr, o, n)                                                \
> -({                                                                     \
> -       __typeof__(*(ptr)) _o_ = (o);                                   \
> -       __typeof__(*(ptr)) _n_ = (n);                                   \
> -       (__typeof__(*(ptr))) __cmpxchg((ptr),                           \
> -                                      _o_, _n_, sizeof(*(ptr)));       \
> -})
> +       _arch_cmpxchg(, ptr, o, n)
>
>  #define arch_cmpxchg_local(ptr, o, n)                                  \
>         (__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
> --
> 2.40.0
>
One patch is much easier to review :)

Reviewed-by: Guo Ren <guoren@kernel.org>

-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions
  2023-04-06  8:20 ` [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
@ 2023-04-07  8:31   ` Guo Ren
  0 siblings, 0 replies; 10+ messages in thread
From: Guo Ren @ 2023-04-07  8:31 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrea Parri,
	Conor Dooley, linux-riscv, linux-kernel

On Thu, Apr 6, 2023 at 4:20 PM Leonardo Bras <leobras@redhat.com> wrote:
>
> In this header every xchg define (_relaxed, _acquire, _release, vanilla)
> contain it's own asm file, both for 4-byte variables an 8-byte variables,
> on a total of 8 versions of mostly the same asm.
>
> This is usually bad, as it means any change may be done in up to 8
> different places.
>
> Unify those versions by creating a new define with enough parameters to
> generate any version of the previous 8.
>
> Then unify the result under a more general define, and simplify
> arch_xchg* generation.
>
> (This did not cause any change in generated asm)
>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  arch/riscv/include/asm/cmpxchg.h | 135 +++++++------------------------
>  1 file changed, 31 insertions(+), 104 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
> index f88fae357071c..905a888d8b04d 100644
> --- a/arch/riscv/include/asm/cmpxchg.h
> +++ b/arch/riscv/include/asm/cmpxchg.h
> @@ -11,25 +11,30 @@
>  #include <asm/barrier.h>
>  #include <asm/fence.h>
>
> -#define __xchg_relaxed(ptr, new, size)                                 \
> +#define ____xchg(sfx, prepend, append, r, p, n)                                \
> +({                                                                     \
> +       __asm__ __volatile__ (                                          \
> +               prepend                                                 \
> +               "       amoswap" sfx " %0, %2, %1\n"                    \
> +               append                                                  \
> +               : "=r" (r), "+A" (*(p))                                 \
> +               : "r" (n)                                               \
> +               : "memory");                                            \
> +})
> +
> +#define ___xchg(ptr, new, size, sfx, prepend, append)                  \
>  ({                                                                     \
>         __typeof__(ptr) __ptr = (ptr);                                  \
>         __typeof__(new) __new = (new);                                  \
>         __typeof__(*(ptr)) __ret;                                       \
>         switch (size) {                                                 \
>         case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.w %0, %2, %1\n"                 \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> +               ____xchg(".w" sfx, prepend, append,                     \
> +                        __ret, __ptr, __new);                          \
>                 break;                                                  \
>         case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.d %0, %2, %1\n"                 \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> +               ____xchg(".d" sfx, prepend, append,                     \
> +                        __ret, __ptr, __new);                          \
>                 break;                                                  \
>         default:                                                        \
>                 BUILD_BUG();                                            \
> @@ -37,114 +42,36 @@
>         __ret;                                                          \
>  })
>
> -#define arch_xchg_relaxed(ptr, x)                                      \
> +#define __xchg_relaxed(ptr, new, size)                                 \
> +       ___xchg(ptr, new, size, "", "", "")
> +
> +#define _arch_xchg(order, ptr, x)                                      \
>  ({                                                                     \
>         __typeof__(*(ptr)) _x_ = (x);                                   \
> -       (__typeof__(*(ptr))) __xchg_relaxed((ptr),                      \
> -                                           _x_, sizeof(*(ptr)));       \
> +       (__typeof__(*(ptr))) __xchg ## order((ptr),                     \
> +                                            _x_, sizeof(*(ptr)));      \
>  })
>
> +#define arch_xchg_relaxed(ptr, x)                                      \
> +       _arch_xchg(_relaxed, ptr, x)
> +
>  #define __xchg_acquire(ptr, new, size)                                 \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(new) __new = (new);                                  \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.w %0, %2, %1\n"                 \
> -                       RISCV_ACQUIRE_BARRIER                           \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.d %0, %2, %1\n"                 \
> -                       RISCV_ACQUIRE_BARRIER                           \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
> +       ___xchg(ptr, new, size, "", "", RISCV_ACQUIRE_BARRIER)
>
>  #define arch_xchg_acquire(ptr, x)                                      \
> -({                                                                     \
> -       __typeof__(*(ptr)) _x_ = (x);                                   \
> -       (__typeof__(*(ptr))) __xchg_acquire((ptr),                      \
> -                                           _x_, sizeof(*(ptr)));       \
> -})
> +       _arch_xchg(_acquire, ptr, x)
>
>  #define __xchg_release(ptr, new, size)                                 \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(new) __new = (new);                                  \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       RISCV_RELEASE_BARRIER                           \
> -                       "       amoswap.w %0, %2, %1\n"                 \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       RISCV_RELEASE_BARRIER                           \
> -                       "       amoswap.d %0, %2, %1\n"                 \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
> +       ___xchg(ptr, new, size, "", RISCV_RELEASE_BARRIER, "")
>
>  #define arch_xchg_release(ptr, x)                                      \
> -({                                                                     \
> -       __typeof__(*(ptr)) _x_ = (x);                                   \
> -       (__typeof__(*(ptr))) __xchg_release((ptr),                      \
> -                                           _x_, sizeof(*(ptr)));       \
> -})
> +       _arch_xchg(_release, ptr, x)
>
>  #define __xchg(ptr, new, size)                                         \
> -({                                                                     \
> -       __typeof__(ptr) __ptr = (ptr);                                  \
> -       __typeof__(new) __new = (new);                                  \
> -       __typeof__(*(ptr)) __ret;                                       \
> -       switch (size) {                                                 \
> -       case 4:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.w.aqrl %0, %2, %1\n"            \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       case 8:                                                         \
> -               __asm__ __volatile__ (                                  \
> -                       "       amoswap.d.aqrl %0, %2, %1\n"            \
> -                       : "=r" (__ret), "+A" (*__ptr)                   \
> -                       : "r" (__new)                                   \
> -                       : "memory");                                    \
> -               break;                                                  \
> -       default:                                                        \
> -               BUILD_BUG();                                            \
> -       }                                                               \
> -       __ret;                                                          \
> -})
> +       ___xchg(ptr, new, size, ".aqrl", "", "")
>
>  #define arch_xchg(ptr, x)                                              \
> -({                                                                     \
> -       __typeof__(*(ptr)) _x_ = (x);                                   \
> -       (__typeof__(*(ptr))) __xchg((ptr), _x_, sizeof(*(ptr)));        \
> -})
> +       _arch_xchg(, ptr, x)
>
>  #define xchg32(ptr, x)                                                 \
>  ({                                                                     \
> --
> 2.40.0
>

Reviewed-by: Guo Ren <guoren@kernel.org>

-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  2023-04-07  8:29   ` Guo Ren
@ 2023-04-18 19:10     ` Leonardo Bras Soares Passos
  0 siblings, 0 replies; 10+ messages in thread
From: Leonardo Bras Soares Passos @ 2023-04-18 19:10 UTC (permalink / raw)
  To: Guo Ren
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrea Parri,
	Conor Dooley, linux-riscv, linux-kernel

On Fri, Apr 7, 2023 at 5:29 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Thu, Apr 6, 2023 at 4:20 PM Leonardo Bras <leobras@redhat.com> wrote:
> >
> > In this header every cmpxchg define (_relaxed, _acquire, _release,
> > vanilla) contain it's own asm file, both for 4-byte variables an 8-byte
> > variables, on a total of 8 versions of mostly the same asm.
> >
> > This is usually bad, as it means any change may be done in up to 8
> > different places.
> >
> > Unify those versions by creating a new define with enough parameters to
> > generate any version of the previous 8.
> >
> > Then unify the result under a more general define, and simplify
> > arch_cmpxchg* generation
> >
> > (This did not cause any change in generated asm)
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  arch/riscv/include/asm/cmpxchg.h | 184 ++++++-------------------------
> >  1 file changed, 36 insertions(+), 148 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
> > index 12debce235e52..f88fae357071c 100644
> > --- a/arch/riscv/include/asm/cmpxchg.h
> > +++ b/arch/riscv/include/asm/cmpxchg.h
> > @@ -163,51 +163,23 @@
> >   * store NEW in MEM.  Return the initial value in MEM.  Success is
> >   * indicated by comparing RETURN with OLD.
> >   */
> > -#define __cmpxchg_relaxed(ptr, old, new, size)                         \
> > -({                                                                     \
> > -       __typeof__(ptr) __ptr = (ptr);                                  \
> > -       __typeof__(*(ptr)) __old = (old);                               \
> > -       __typeof__(*(ptr)) __new = (new);                               \
> > -       __typeof__(*(ptr)) __ret;                                       \
> > -       register unsigned int __rc;                                     \
> > -       switch (size) {                                                 \
> > -       case 4:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.w %0, %2\n"                          \
> > -                       "       bne  %0, %z3, 1f\n"                     \
> > -                       "       sc.w %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" ((long)__old), "rJ" (__new)              \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       case 8:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.d %0, %2\n"                          \
> > -                       "       bne %0, %z3, 1f\n"                      \
> > -                       "       sc.d %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" (__old), "rJ" (__new)                    \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       default:                                                        \
> > -               BUILD_BUG();                                            \
> > -       }                                                               \
> > -       __ret;                                                          \
> > -})
> >
> > -#define arch_cmpxchg_relaxed(ptr, o, n)                                        \
> > +#define ____cmpxchg(lr_sfx, sc_sfx, prepend, append, r, rc, p, co, o, n)\
> >  ({                                                                     \
> > -       __typeof__(*(ptr)) _o_ = (o);                                   \
> > -       __typeof__(*(ptr)) _n_ = (n);                                   \
> > -       (__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),                   \
> > -                                       _o_, _n_, sizeof(*(ptr)));      \
> > +       __asm__ __volatile__ (                                          \
> > +               prepend                                                 \
> > +               "0:     lr" lr_sfx " %0, %2\n"                          \
> > +               "       bne  %0, %z3, 1f\n"                             \
> > +               "       sc" sc_sfx " %1, %z4, %2\n"                     \
> > +               "       bnez %1, 0b\n"                                  \
> > +               append                                                  \
> > +               "1:\n"                                                  \
> > +               : "=&r" (r), "=&r" (rc), "+A" (*(p))                    \
> > +               : "rJ" (co o), "rJ" (n)                                 \
> > +               : "memory");                                            \
> >  })
> >
> > -#define __cmpxchg_acquire(ptr, old, new, size)                         \
> > +#define ___cmpxchg(ptr, old, new, size, sc_sfx, prepend, append)       \
> >  ({                                                                     \
> >         __typeof__(ptr) __ptr = (ptr);                                  \
> >         __typeof__(*(ptr)) __old = (old);                               \
> > @@ -216,28 +188,12 @@
> >         register unsigned int __rc;                                     \
> >         switch (size) {                                                 \
> >         case 4:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.w %0, %2\n"                          \
> > -                       "       bne  %0, %z3, 1f\n"                     \
> > -                       "       sc.w %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       RISCV_ACQUIRE_BARRIER                           \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" ((long)__old), "rJ" (__new)              \
> > -                       : "memory");                                    \
> > +               ____cmpxchg(".w", ".w" sc_sfx, prepend, append,         \
> > +                           __ret, __rc, __ptr, (long), __old, __new);  \
> >                 break;                                                  \
> >         case 8:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.d %0, %2\n"                          \
> > -                       "       bne %0, %z3, 1f\n"                      \
> > -                       "       sc.d %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       RISCV_ACQUIRE_BARRIER                           \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" (__old), "rJ" (__new)                    \
> > -                       : "memory");                                    \
> > +               ____cmpxchg(".d", ".d" sc_sfx, prepend, append,         \
> > +                           __ret, __rc, __ptr, /**/, __old, __new);    \
> >                 break;                                                  \
> >         default:                                                        \
> >                 BUILD_BUG();                                            \
> > @@ -245,105 +201,37 @@
> >         __ret;                                                          \
> >  })
> >
> > -#define arch_cmpxchg_acquire(ptr, o, n)                                        \
> > +#define __cmpxchg_relaxed(ptr, old, new, size)                         \
> > +       ___cmpxchg(ptr, old, new, size, "", "", "")
> > +
> > +#define _arch_cmpxchg(order, ptr, o, n)                                        \
> >  ({                                                                     \
> >         __typeof__(*(ptr)) _o_ = (o);                                   \
> >         __typeof__(*(ptr)) _n_ = (n);                                   \
> > -       (__typeof__(*(ptr))) __cmpxchg_acquire((ptr),                   \
> > -                                       _o_, _n_, sizeof(*(ptr)));      \
> > +       (__typeof__(*(ptr))) __cmpxchg ## order((ptr), _o_, _n_,        \
> > +                                               sizeof(*(ptr)));        \
> >  })
> >
> > +#define arch_cmpxchg_relaxed(ptr, o, n)                                        \
> > +       _arch_cmpxchg(_relaxed, ptr, o, n)
> > +
> > +#define __cmpxchg_acquire(ptr, old, new, size)                         \
> > +       ___cmpxchg(ptr, old, new, size, "", "", RISCV_ACQUIRE_BARRIER)
> > +
> > +#define arch_cmpxchg_acquire(ptr, o, n)                                        \
> > +       _arch_cmpxchg(_acquire, ptr, o, n)
> > +
> >  #define __cmpxchg_release(ptr, old, new, size)                         \
> > -({                                                                     \
> > -       __typeof__(ptr) __ptr = (ptr);                                  \
> > -       __typeof__(*(ptr)) __old = (old);                               \
> > -       __typeof__(*(ptr)) __new = (new);                               \
> > -       __typeof__(*(ptr)) __ret;                                       \
> > -       register unsigned int __rc;                                     \
> > -       switch (size) {                                                 \
> > -       case 4:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       RISCV_RELEASE_BARRIER                           \
> > -                       "0:     lr.w %0, %2\n"                          \
> > -                       "       bne  %0, %z3, 1f\n"                     \
> > -                       "       sc.w %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" ((long)__old), "rJ" (__new)              \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       case 8:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       RISCV_RELEASE_BARRIER                           \
> > -                       "0:     lr.d %0, %2\n"                          \
> > -                       "       bne %0, %z3, 1f\n"                      \
> > -                       "       sc.d %1, %z4, %2\n"                     \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" (__old), "rJ" (__new)                    \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       default:                                                        \
> > -               BUILD_BUG();                                            \
> > -       }                                                               \
> > -       __ret;                                                          \
> > -})
> > +       ___cmpxchg(ptr, old, new, size, "", RISCV_RELEASE_BARRIER, "")
> >
> >  #define arch_cmpxchg_release(ptr, o, n)                                        \
> > -({                                                                     \
> > -       __typeof__(*(ptr)) _o_ = (o);                                   \
> > -       __typeof__(*(ptr)) _n_ = (n);                                   \
> > -       (__typeof__(*(ptr))) __cmpxchg_release((ptr),                   \
> > -                                       _o_, _n_, sizeof(*(ptr)));      \
> > -})
> > +       _arch_cmpxchg(_release, ptr, o, n)
> >
> >  #define __cmpxchg(ptr, old, new, size)                                 \
> > -({                                                                     \
> > -       __typeof__(ptr) __ptr = (ptr);                                  \
> > -       __typeof__(*(ptr)) __old = (old);                               \
> > -       __typeof__(*(ptr)) __new = (new);                               \
> > -       __typeof__(*(ptr)) __ret;                                       \
> > -       register unsigned int __rc;                                     \
> > -       switch (size) {                                                 \
> > -       case 4:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.w %0, %2\n"                          \
> > -                       "       bne  %0, %z3, 1f\n"                     \
> > -                       "       sc.w.rl %1, %z4, %2\n"                  \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "       fence rw, rw\n"                         \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" ((long)__old), "rJ" (__new)              \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       case 8:                                                         \
> > -               __asm__ __volatile__ (                                  \
> > -                       "0:     lr.d %0, %2\n"                          \
> > -                       "       bne %0, %z3, 1f\n"                      \
> > -                       "       sc.d.rl %1, %z4, %2\n"                  \
> > -                       "       bnez %1, 0b\n"                          \
> > -                       "       fence rw, rw\n"                         \
> > -                       "1:\n"                                          \
> > -                       : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)    \
> > -                       : "rJ" (__old), "rJ" (__new)                    \
> > -                       : "memory");                                    \
> > -               break;                                                  \
> > -       default:                                                        \
> > -               BUILD_BUG();                                            \
> > -       }                                                               \
> > -       __ret;                                                          \
> > -})
> > +       ___cmpxchg(ptr, old, new, size, ".rl", "", "    fence rw, rw\n")
> >
> >  #define arch_cmpxchg(ptr, o, n)                                                \
> > -({                                                                     \
> > -       __typeof__(*(ptr)) _o_ = (o);                                   \
> > -       __typeof__(*(ptr)) _n_ = (n);                                   \
> > -       (__typeof__(*(ptr))) __cmpxchg((ptr),                           \
> > -                                      _o_, _n_, sizeof(*(ptr)));       \
> > -})
> > +       _arch_cmpxchg(, ptr, o, n)
> >
> >  #define arch_cmpxchg_local(ptr, o, n)                                  \
> >         (__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
> > --
> > 2.40.0
> >
> One patch is much easier to review :)
>
> Reviewed-by: Guo Ren <guoren@kernel.org>
>
> --
> Best Regards
>  Guo Ren
>

Thanks!


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros
  2023-04-06 17:19 ` [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Andrea Parri
@ 2023-08-02 21:51   ` Leonardo Brás
  2023-08-03  3:14     ` Andrea Parri
  0 siblings, 1 reply; 10+ messages in thread
From: Leonardo Brás @ 2023-08-02 21:51 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Conor Dooley,
	linux-riscv, linux-kernel

On Thu, 2023-04-06 at 19:19 +0200, Andrea Parri wrote:
> On Thu, Apr 06, 2023 at 05:20:17AM -0300, Leonardo Bras wrote:
> > While studying riscv's cmpxchg.h file, I got really interested in
> > understanding how RISCV asm implemented the different versions of
> > {cmp,}xchg.
> > 
> > When I understood the pattern, it made sense for me to remove the
> > duplications and create macros to make it easier to understand what exactly
> > changes between the versions: Instruction sufixes & barriers.
> > 
> > Thanks!
> > Leo
> > 
> > Changes since RFCv3:
> > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> > 
> > Changes since RFCv2:
> > - Fixed  macros that depend on having a local variable with a magic name
> > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> > 
> > Changes since RFCv1:
> > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> > 
> > Leonardo Bras (2):
> >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> >   riscv/cmpxchg: Deduplicate xchg() asm functions
> > 
> >  arch/riscv/include/asm/cmpxchg.h | 319 +++++++------------------------
> >  1 file changed, 67 insertions(+), 252 deletions(-)
> 
> LGTM.  AFAICT, this would need to be rebased, cf. e.g.
> 
>   a8596dda1fbf7e ("arch: rename all internal names __xchg to __arch_xchg")
> 
> from the tip tree.
> 
>   Andrea

Thanks for the heads up!
I will update this and re-send!


And sorry about the delay :(
For some weird reason neither the cover letter, nor your message reached my
gmail, and just now looking at lore I could find your message. 



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros
  2023-08-02 21:51   ` Leonardo Brás
@ 2023-08-03  3:14     ` Andrea Parri
  2023-08-03  6:12       ` Leonardo Brás
  0 siblings, 1 reply; 10+ messages in thread
From: Andrea Parri @ 2023-08-03  3:14 UTC (permalink / raw)
  To: Leonardo Brás
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Conor Dooley,
	linux-riscv, linux-kernel

> > LGTM.  AFAICT, this would need to be rebased, cf. e.g.
> > 
> >   a8596dda1fbf7e ("arch: rename all internal names __xchg to __arch_xchg")
> > 
> > from the tip tree.
> > 
> >   Andrea
> 
> Thanks for the heads up!
> I will update this and re-send!
> 
> 
> And sorry about the delay :(
> For some weird reason neither the cover letter, nor your message reached my
> gmail, and just now looking at lore I could find your message. 

All's well that ends well.  ;-)  Thanks,

  Andrea

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros
  2023-08-03  3:14     ` Andrea Parri
@ 2023-08-03  6:12       ` Leonardo Brás
  0 siblings, 0 replies; 10+ messages in thread
From: Leonardo Brás @ 2023-08-03  6:12 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Conor Dooley,
	Palmer Dabbelt, linux-riscv, linux-kernel

On Thu, 2023-08-03 at 05:14 +0200, Andrea Parri wrote:
> > > LGTM.  AFAICT, this would need to be rebased, cf. e.g.
> > > 
> > >   a8596dda1fbf7e ("arch: rename all internal names __xchg to __arch_xchg")
> > > 
> > > from the tip tree.
> > > 
> > >   Andrea
> > 
> > Thanks for the heads up!
> > I will update this and re-send!
> > 
> > 
> > And sorry about the delay :(
> > For some weird reason neither the cover letter, nor your message reached my
> > gmail, and just now looking at lore I could find your message. 
> 
> All's well that ends well.  ;-)  Thanks,
> 
>   Andrea
> 

Superseded by v2:
https://patchwork.kernel.org/project/linux-riscv/list/?series=772422&state=%2A&archive=both

Leo


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-08-03  6:12 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-06  8:20 [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Leonardo Bras
2023-04-06  8:20 ` [RFC PATCH v1 1/2] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
2023-04-07  8:29   ` Guo Ren
2023-04-18 19:10     ` Leonardo Bras Soares Passos
2023-04-06  8:20 ` [RFC PATCH v1 2/2] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
2023-04-07  8:31   ` Guo Ren
2023-04-06 17:19 ` [RFC PATCH v1 0/2] Deduplicating RISCV cmpxchg.h macros Andrea Parri
2023-08-02 21:51   ` Leonardo Brás
2023-08-03  3:14     ` Andrea Parri
2023-08-03  6:12       ` Leonardo Brás

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).