public inbox for linux-riscv@lists.infradead.org
* [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
@ 2023-08-03  5:13 Leonardo Bras
  2023-08-03  5:13 ` [RFC PATCH v2 1/3] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Leonardo Bras @ 2023-08-03  5:13 UTC (permalink / raw)
  To: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras,
	Andrzej Hajda, Arnd Bergmann, Ingo Molnar, Palmer Dabbelt,
	Guo Ren
  Cc: linux-kernel, linux-riscv

I unified previous patchsets into a single one, since the work is related.

While studying riscv's cmpxchg.h file, I became really interested in
understanding how the RISC-V asm implements the different versions of
{cmp,}xchg.

When I understood the pattern, it made sense to remove the duplication and
create macros that make it easier to see what exactly changes between the
versions: instruction suffixes & barriers.

I also did the same kind of work on atomic.h.

Note to Guo Ren:
I made some further improvements after your previous reviews, so I was
hesitant to include your Reviewed-by before checking that the changes are
still ok with you. Please take another look: I only removed some helper
macros that were not being used elsewhere in the kernel.

Thanks!
Leo


Changes since squashed cmpxchg:
- Unified with atomic.c patchset 
- Rebased on top of torvalds/master (thanks Andrea Parri!)
- Removed helper macros that were not being used elsewhere in the kernel.

Changes since (cmpxchg) RFCv3:
- Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/

Changes since (cmpxchg) RFCv2:
- Fixed macros that depend on having a local variable with a magic name
- Previous cast to (long) is now only applied on 4-bytes cmpxchg
https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/

Changes since (cmpxchg) RFCv1:
- Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/


Leonardo Bras (3):
  riscv/cmpxchg: Deduplicate xchg() asm functions
  riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  riscv/atomic.h : Deduplicate arch_atomic.*

 arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
 arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
 2 files changed, 123 insertions(+), 359 deletions(-)

-- 
2.41.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 1/3] riscv/cmpxchg: Deduplicate xchg() asm functions
  2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
@ 2023-08-03  5:13 ` Leonardo Bras
  2023-08-03  5:13 ` [RFC PATCH v2 2/3] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Leonardo Bras @ 2023-08-03  5:13 UTC (permalink / raw)
  To: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras,
	Andrzej Hajda, Arnd Bergmann, Ingo Molnar, Palmer Dabbelt,
	Guo Ren
  Cc: linux-kernel, linux-riscv

In this header every xchg define (_relaxed, _acquire, _release, vanilla)
contains its own asm implementation, for both 4-byte and 8-byte variables,
for a total of 8 versions of mostly the same asm.

This is usually bad, as it means any change may need to be replicated in
up to 8 different places.

Unify those versions by creating a new define with enough parameters to
generate any version of the previous 8.

Then unify the result under a more general define, and simplify
arch_xchg* generation.

(This did not cause any change in generated asm)

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 arch/riscv/include/asm/cmpxchg.h | 136 +++++--------------------------
 1 file changed, 22 insertions(+), 114 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 2f4726d3cfcc..ec4ea4f3f908 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -11,60 +11,30 @@
 #include <asm/barrier.h>
 #include <asm/fence.h>
 
-#define __xchg_relaxed(ptr, new, size)					\
+#define __arch_xchg(sfx, prepend, append, r, p, n)			\
 ({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
-
-#define arch_xchg_relaxed(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_relaxed((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
+	__asm__ __volatile__ (						\
+		prepend							\
+		"	amoswap" sfx " %0, %2, %1\n"			\
+		append							\
+		: "=r" (r), "+A" (*(p))					\
+		: "r" (n)						\
+		: "memory");						\
 })
 
-#define __xchg_acquire(ptr, new, size)					\
+#define _arch_xchg(ptr, new, sfx, prepend, append)			\
 ({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
+	__typeof__(*(ptr)) __new = (new);				\
 	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
+	__typeof__(ptr) __ptr = (ptr);					\
+	switch (sizeof(*__ptr)) {					\
 	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w %0, %2, %1\n"			\
-			RISCV_ACQUIRE_BARRIER				\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
+		__arch_xchg(".w" sfx, prepend, append,			\
+			      __ret, __ptr, __new);			\
 		break;							\
 	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d %0, %2, %1\n"			\
-			RISCV_ACQUIRE_BARRIER				\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
+		__arch_xchg(".d" sfx, prepend, append,			\
+			      __ret, __ptr, __new);			\
 		break;							\
 	default:							\
 		BUILD_BUG();						\
@@ -72,79 +42,17 @@
 	__ret;								\
 })
 
-#define arch_xchg_acquire(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_acquire((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
-})
+#define arch_xchg_relaxed(ptr, x)					\
+	_arch_xchg(ptr, x, "", "", "")
 
-#define __xchg_release(ptr, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.w %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.d %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+#define arch_xchg_acquire(ptr, x)					\
+	_arch_xchg(ptr, x, "", "", RISCV_ACQUIRE_BARRIER)
 
 #define arch_xchg_release(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_release((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
-})
-
-#define __arch_xchg(ptr, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.w.aqrl %0, %2, %1\n"		\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"	amoswap.d.aqrl %0, %2, %1\n"		\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	_arch_xchg(ptr, x, "", RISCV_RELEASE_BARRIER, "")
 
 #define arch_xchg(ptr, x)						\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __arch_xchg((ptr), _x_, sizeof(*(ptr)));	\
-})
+	_arch_xchg(ptr, x, ".aqrl", "", "")
 
 #define xchg32(ptr, x)							\
 ({									\
-- 
2.41.0




* [RFC PATCH v2 2/3] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
  2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
  2023-08-03  5:13 ` [RFC PATCH v2 1/3] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
@ 2023-08-03  5:13 ` Leonardo Bras
  2023-08-03  5:14 ` [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.* Leonardo Bras
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Leonardo Bras @ 2023-08-03  5:13 UTC (permalink / raw)
  To: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras,
	Andrzej Hajda, Arnd Bergmann, Ingo Molnar, Palmer Dabbelt,
	Guo Ren
  Cc: linux-kernel, linux-riscv

In this header every cmpxchg define (_relaxed, _acquire, _release,
vanilla) contains its own asm implementation, for both 4-byte and 8-byte
variables, for a total of 8 versions of mostly the same asm.

This is usually bad, as it means any change may need to be replicated in
up to 8 different places.

Unify those versions by creating a new define with enough parameters to
generate any version of the previous 8.

Then unify the result under a more general define, and simplify
arch_cmpxchg* generation.

(This did not cause any change in generated asm)

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 arch/riscv/include/asm/cmpxchg.h | 182 +++++--------------------------
 1 file changed, 25 insertions(+), 157 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index ec4ea4f3f908..bb45ef83592f 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -71,81 +71,37 @@
  * store NEW in MEM.  Return the initial value in MEM.  Success is
  * indicated by comparing RETURN with OLD.
  */
-#define __cmpxchg_relaxed(ptr, old, new, size)				\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
 
-#define arch_cmpxchg_relaxed(ptr, o, n)					\
+#define __arch_cmpxchg(lr_sfx, sc_sfx, prepend, append, r, rc, p, co, o, n)\
 ({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
+	__asm__ __volatile__ (						\
+		prepend							\
+		"0:	lr" lr_sfx " %0, %2\n"				\
+		"	bne  %0, %z3, 1f\n"				\
+		"	sc" sc_sfx " %1, %z4, %2\n"			\
+		"	bnez %1, 0b\n"					\
+		append							\
+		"1:\n"							\
+		: "=&r" (r), "=&r" (rc), "+A" (*(p))			\
+		: "rJ" (co o), "rJ" (n)					\
+		: "memory");						\
 })
 
-#define __cmpxchg_acquire(ptr, old, new, size)				\
+#define _arch_cmpxchg(ptr, old, new, sc_sfx, prepend, append)		\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(*(ptr)) __old = (old);				\
 	__typeof__(*(ptr)) __new = (new);				\
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
-	switch (size) {							\
+	switch (sizeof(*__ptr)) {					\
 	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			RISCV_ACQUIRE_BARRIER				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
+		__arch_cmpxchg(".w", ".w" sc_sfx, prepend, append,	\
+			    __ret, __rc, __ptr, (long), __old, __new);	\
 		break;							\
 	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			RISCV_ACQUIRE_BARRIER				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
+		__arch_cmpxchg(".d", ".d" sc_sfx, prepend, append,	\
+			    __ret, __rc, __ptr, /**/, __old, __new);	\
 		break;							\
 	default:							\
 		BUILD_BUG();						\
@@ -153,108 +109,20 @@
 	__ret;								\
 })
 
-#define arch_cmpxchg_acquire(ptr, o, n)					\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_acquire((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
-})
+#define arch_cmpxchg_relaxed(ptr, o, n)					\
+	_arch_cmpxchg((ptr), (o), (n), "", "", "")
 
-#define __cmpxchg_release(ptr, old, new, size)				\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+#define arch_cmpxchg_acquire(ptr, o, n)					\
+	_arch_cmpxchg((ptr), (o), (n), "", "", RISCV_ACQUIRE_BARRIER)
 
 #define arch_cmpxchg_release(ptr, o, n)					\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
-})
-
-#define __cmpxchg(ptr, old, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w.rl %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"	fence rw, rw\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d.rl %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"	fence rw, rw\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
+	_arch_cmpxchg((ptr), (o), (n), "", RISCV_RELEASE_BARRIER, "")
 
 #define arch_cmpxchg(ptr, o, n)						\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg((ptr),				\
-				       _o_, _n_, sizeof(*(ptr)));	\
-})
+	_arch_cmpxchg((ptr), (o), (n), ".rl", "", "	fence rw, rw\n")
 
 #define arch_cmpxchg_local(ptr, o, n)					\
-	(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
+	arch_cmpxchg_relaxed((ptr), (o), (n))
 
 #define arch_cmpxchg64(ptr, o, n)					\
 ({									\
-- 
2.41.0




* [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.*
  2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
  2023-08-03  5:13 ` [RFC PATCH v2 1/3] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
  2023-08-03  5:13 ` [RFC PATCH v2 2/3] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
@ 2023-08-03  5:14 ` Leonardo Bras
  2023-08-03  7:33   ` Guo Ren
  2023-08-03 13:53 ` [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Andrea Parri
  2023-08-04  0:53 ` Guo Ren
  4 siblings, 1 reply; 12+ messages in thread
From: Leonardo Bras @ 2023-08-03  5:14 UTC (permalink / raw)
  To: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Leonardo Bras,
	Andrzej Hajda, Arnd Bergmann, Ingo Molnar, Palmer Dabbelt,
	Guo Ren
  Cc: linux-kernel, linux-riscv

Some functions use mostly the same asm for their 32-bit and 64-bit
versions.

Create macros that are generic enough to cover both and avoid code
duplication.

(This did not cause any change in generated asm)

Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
 arch/riscv/include/asm/atomic.h | 164 +++++++++++++++-----------------
 1 file changed, 76 insertions(+), 88 deletions(-)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index f5dfef6c2153..80cca7ac16fd 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -196,22 +196,28 @@ ATOMIC_OPS(xor, xor, i)
 #undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 
+#define _arch_atomic_fetch_add_unless(_prev, _rc, counter, _a, _u, sfx)	\
+({									\
+	__asm__ __volatile__ (						\
+		"0:	lr." sfx "     %[p],  %[c]\n"			\
+		"	beq	       %[p],  %[u], 1f\n"		\
+		"	add            %[rc], %[p], %[a]\n"		\
+		"	sc." sfx ".rl  %[rc], %[rc], %[c]\n"		\
+		"	bnez           %[rc], 0b\n"			\
+		"	fence          rw, rw\n"			\
+		"1:\n"							\
+		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
+		: [a]"r" (_a), [u]"r" (_u)				\
+		: "memory");						\
+})
+
 /* This is required to provide a full barrier on success. */
 static __always_inline int arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
 {
        int prev, rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.w     %[p],  %[c]\n"
-		"	beq      %[p],  %[u], 1f\n"
-		"	add      %[rc], %[p], %[a]\n"
-		"	sc.w.rl  %[rc], %[rc], %[c]\n"
-		"	bnez     %[rc], 0b\n"
-		"	fence    rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		: [a]"r" (a), [u]"r" (u)
-		: "memory");
+	_arch_atomic_fetch_add_unless(prev, rc, v->counter, a, u, "w");
+
 	return prev;
 }
 #define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
@@ -222,77 +228,86 @@ static __always_inline s64 arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a,
        s64 prev;
        long rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.d     %[p],  %[c]\n"
-		"	beq      %[p],  %[u], 1f\n"
-		"	add      %[rc], %[p], %[a]\n"
-		"	sc.d.rl  %[rc], %[rc], %[c]\n"
-		"	bnez     %[rc], 0b\n"
-		"	fence    rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		: [a]"r" (a), [u]"r" (u)
-		: "memory");
+	_arch_atomic_fetch_add_unless(prev, rc, v->counter, a, u, "d");
+
 	return prev;
 }
 #define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
 #endif
 
+#define _arch_atomic_inc_unless_negative(_prev, _rc, counter, sfx)	\
+({									\
+	__asm__ __volatile__ (						\
+		"0:	lr." sfx "      %[p],  %[c]\n"			\
+		"	bltz            %[p],  1f\n"			\
+		"	addi            %[rc], %[p], 1\n"		\
+		"	sc." sfx ".rl   %[rc], %[rc], %[c]\n"		\
+		"	bnez            %[rc], 0b\n"			\
+		"	fence           rw, rw\n"			\
+		"1:\n"							\
+		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
+		:							\
+		: "memory");						\
+})
+
 static __always_inline bool arch_atomic_inc_unless_negative(atomic_t *v)
 {
 	int prev, rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.w      %[p],  %[c]\n"
-		"	bltz      %[p],  1f\n"
-		"	addi      %[rc], %[p], 1\n"
-		"	sc.w.rl   %[rc], %[rc], %[c]\n"
-		"	bnez      %[rc], 0b\n"
-		"	fence     rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_inc_unless_negative(prev, rc, v->counter, "w");
+
 	return !(prev < 0);
 }
 
 #define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
 
+#define _arch_atomic_dec_unless_positive(_prev, _rc, counter, sfx)	\
+({									\
+	__asm__ __volatile__ (						\
+		"0:	lr." sfx "      %[p],  %[c]\n"			\
+		"	bgtz            %[p],  1f\n"			\
+		"	addi            %[rc], %[p], -1\n"		\
+		"	sc." sfx ".rl   %[rc], %[rc], %[c]\n"		\
+		"	bnez            %[rc], 0b\n"			\
+		"	fence           rw, rw\n"			\
+		"1:\n"							\
+		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
+		:							\
+		: "memory");						\
+})
+
 static __always_inline bool arch_atomic_dec_unless_positive(atomic_t *v)
 {
 	int prev, rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.w      %[p],  %[c]\n"
-		"	bgtz      %[p],  1f\n"
-		"	addi      %[rc], %[p], -1\n"
-		"	sc.w.rl   %[rc], %[rc], %[c]\n"
-		"	bnez      %[rc], 0b\n"
-		"	fence     rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_dec_unless_positive(prev, rc, v->counter, "w");
+
 	return !(prev > 0);
 }
 
 #define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
 
+#define _arch_atomic_dec_if_positive(_prev, _rc, counter, sfx)		\
+({									\
+	__asm__ __volatile__ (						\
+		"0:	lr." sfx "     %[p],  %[c]\n"			\
+		"	addi           %[rc], %[p], -1\n"		\
+		"	bltz           %[rc], 1f\n"			\
+		"	sc." sfx ".rl  %[rc], %[rc], %[c]\n"		\
+		"	bnez           %[rc], 0b\n"			\
+		"	fence          rw, rw\n"			\
+		"1:\n"							\
+		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
+		:							\
+		: "memory");						\
+})
+
 static __always_inline int arch_atomic_dec_if_positive(atomic_t *v)
 {
        int prev, rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.w     %[p],  %[c]\n"
-		"	addi     %[rc], %[p], -1\n"
-		"	bltz     %[rc], 1f\n"
-		"	sc.w.rl  %[rc], %[rc], %[c]\n"
-		"	bnez     %[rc], 0b\n"
-		"	fence    rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_dec_if_positive(prev, rc, v->counter, "w");
+
 	return prev - 1;
 }
 
@@ -304,17 +319,8 @@ static __always_inline bool arch_atomic64_inc_unless_negative(atomic64_t *v)
 	s64 prev;
 	long rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.d      %[p],  %[c]\n"
-		"	bltz      %[p],  1f\n"
-		"	addi      %[rc], %[p], 1\n"
-		"	sc.d.rl   %[rc], %[rc], %[c]\n"
-		"	bnez      %[rc], 0b\n"
-		"	fence     rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_inc_unless_negative(prev, rc, v->counter, "d");
+
 	return !(prev < 0);
 }
 
@@ -325,17 +331,8 @@ static __always_inline bool arch_atomic64_dec_unless_positive(atomic64_t *v)
 	s64 prev;
 	long rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.d      %[p],  %[c]\n"
-		"	bgtz      %[p],  1f\n"
-		"	addi      %[rc], %[p], -1\n"
-		"	sc.d.rl   %[rc], %[rc], %[c]\n"
-		"	bnez      %[rc], 0b\n"
-		"	fence     rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_dec_unless_positive(prev, rc, v->counter, "d");
+
 	return !(prev > 0);
 }
 
@@ -346,17 +343,8 @@ static __always_inline s64 arch_atomic64_dec_if_positive(atomic64_t *v)
        s64 prev;
        long rc;
 
-	__asm__ __volatile__ (
-		"0:	lr.d     %[p],  %[c]\n"
-		"	addi      %[rc], %[p], -1\n"
-		"	bltz     %[rc], 1f\n"
-		"	sc.d.rl  %[rc], %[rc], %[c]\n"
-		"	bnez     %[rc], 0b\n"
-		"	fence    rw, rw\n"
-		"1:\n"
-		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		:
-		: "memory");
+	_arch_atomic_dec_if_positive(prev, rc, v->counter, "d");
+
 	return prev - 1;
 }
 
-- 
2.41.0




* Re: [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.*
  2023-08-03  5:14 ` [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.* Leonardo Bras
@ 2023-08-03  7:33   ` Guo Ren
  0 siblings, 0 replies; 12+ messages in thread
From: Guo Ren @ 2023-08-03  7:33 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Thu, Aug 03, 2023 at 02:14:00AM -0300, Leonardo Bras wrote:
> Some functions use mostly the same asm for 32-bit and 64-bit versions.
> 
> Make a macro that is generic enough and avoid code duplication.
> 
> (This did not cause any change in generated asm)
> 
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  arch/riscv/include/asm/atomic.h | 164 +++++++++++++++-----------------
>  1 file changed, 76 insertions(+), 88 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
> index f5dfef6c2153..80cca7ac16fd 100644
> --- a/arch/riscv/include/asm/atomic.h
> +++ b/arch/riscv/include/asm/atomic.h
> @@ -196,22 +196,28 @@ ATOMIC_OPS(xor, xor, i)
>  #undef ATOMIC_FETCH_OP
>  #undef ATOMIC_OP_RETURN
>  
> +#define _arch_atomic_fetch_add_unless(_prev, _rc, counter, _a, _u, sfx)	\
> +({									\
> +	__asm__ __volatile__ (						\
> +		"0:	lr." sfx "     %[p],  %[c]\n"			\
> +		"	beq	       %[p],  %[u], 1f\n"		\
> +		"	add            %[rc], %[p], %[a]\n"		\
> +		"	sc." sfx ".rl  %[rc], %[rc], %[c]\n"		\
> +		"	bnez           %[rc], 0b\n"			\
> +		"	fence          rw, rw\n"			\
> +		"1:\n"							\
> +		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
> +		: [a]"r" (_a), [u]"r" (_u)				\
> +		: "memory");						\
> +})
> +
>  /* This is required to provide a full barrier on success. */
>  static __always_inline int arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
>  {
>         int prev, rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.w     %[p],  %[c]\n"
> -		"	beq      %[p],  %[u], 1f\n"
> -		"	add      %[rc], %[p], %[a]\n"
> -		"	sc.w.rl  %[rc], %[rc], %[c]\n"
> -		"	bnez     %[rc], 0b\n"
> -		"	fence    rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		: [a]"r" (a), [u]"r" (u)
> -		: "memory");
> +	_arch_atomic_fetch_add_unless(prev, rc, v->counter, a, u, "w");
> +
>  	return prev;
>  }
>  #define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
> @@ -222,77 +228,86 @@ static __always_inline s64 arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a,
>         s64 prev;
>         long rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.d     %[p],  %[c]\n"
> -		"	beq      %[p],  %[u], 1f\n"
> -		"	add      %[rc], %[p], %[a]\n"
> -		"	sc.d.rl  %[rc], %[rc], %[c]\n"
> -		"	bnez     %[rc], 0b\n"
> -		"	fence    rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		: [a]"r" (a), [u]"r" (u)
> -		: "memory");
> +	_arch_atomic_fetch_add_unless(prev, rc, v->counter, a, u, "d");
> +
>  	return prev;
>  }
>  #define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
>  #endif
>  
> +#define _arch_atomic_inc_unless_negative(_prev, _rc, counter, sfx)	\
> +({									\
> +	__asm__ __volatile__ (						\
> +		"0:	lr." sfx "      %[p],  %[c]\n"			\
> +		"	bltz            %[p],  1f\n"			\
> +		"	addi            %[rc], %[p], 1\n"		\
> +		"	sc." sfx ".rl   %[rc], %[rc], %[c]\n"		\
> +		"	bnez            %[rc], 0b\n"			\
> +		"	fence           rw, rw\n"			\
> +		"1:\n"							\
> +		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
> +		:							\
> +		: "memory");						\
> +})
> +
>  static __always_inline bool arch_atomic_inc_unless_negative(atomic_t *v)
>  {
>  	int prev, rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.w      %[p],  %[c]\n"
> -		"	bltz      %[p],  1f\n"
> -		"	addi      %[rc], %[p], 1\n"
> -		"	sc.w.rl   %[rc], %[rc], %[c]\n"
> -		"	bnez      %[rc], 0b\n"
> -		"	fence     rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_inc_unless_negative(prev, rc, v->counter, "w");
> +
>  	return !(prev < 0);
>  }
>  
>  #define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
>  
> +#define _arch_atomic_dec_unless_positive(_prev, _rc, counter, sfx)	\
> +({									\
> +	__asm__ __volatile__ (						\
> +		"0:	lr." sfx "      %[p],  %[c]\n"			\
> +		"	bgtz            %[p],  1f\n"			\
> +		"	addi            %[rc], %[p], -1\n"		\
> +		"	sc." sfx ".rl   %[rc], %[rc], %[c]\n"		\
> +		"	bnez            %[rc], 0b\n"			\
> +		"	fence           rw, rw\n"			\
> +		"1:\n"							\
> +		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
> +		:							\
> +		: "memory");						\
> +})
> +
>  static __always_inline bool arch_atomic_dec_unless_positive(atomic_t *v)
>  {
>  	int prev, rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.w      %[p],  %[c]\n"
> -		"	bgtz      %[p],  1f\n"
> -		"	addi      %[rc], %[p], -1\n"
> -		"	sc.w.rl   %[rc], %[rc], %[c]\n"
> -		"	bnez      %[rc], 0b\n"
> -		"	fence     rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_dec_unless_positive(prev, rc, v->counter, "w");
> +
>  	return !(prev > 0);
>  }
>  
>  #define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
>  
> +#define _arch_atomic_dec_if_positive(_prev, _rc, counter, sfx)		\
> +({									\
> +	__asm__ __volatile__ (						\
> +		"0:	lr." sfx "     %[p],  %[c]\n"			\
> +		"	addi           %[rc], %[p], -1\n"		\
> +		"	bltz           %[rc], 1f\n"			\
> +		"	sc." sfx ".rl  %[rc], %[rc], %[c]\n"		\
> +		"	bnez           %[rc], 0b\n"			\
> +		"	fence          rw, rw\n"			\
> +		"1:\n"							\
> +		: [p]"=&r" (_prev), [rc]"=&r" (_rc), [c]"+A" (counter)	\
> +		:							\
> +		: "memory");						\
> +})
> +
>  static __always_inline int arch_atomic_dec_if_positive(atomic_t *v)
>  {
>         int prev, rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.w     %[p],  %[c]\n"
> -		"	addi     %[rc], %[p], -1\n"
> -		"	bltz     %[rc], 1f\n"
> -		"	sc.w.rl  %[rc], %[rc], %[c]\n"
> -		"	bnez     %[rc], 0b\n"
> -		"	fence    rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_dec_if_positive(prev, rc, v->counter, "w");
> +
>  	return prev - 1;
>  }
>  
> @@ -304,17 +319,8 @@ static __always_inline bool arch_atomic64_inc_unless_negative(atomic64_t *v)
>  	s64 prev;
>  	long rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.d      %[p],  %[c]\n"
> -		"	bltz      %[p],  1f\n"
> -		"	addi      %[rc], %[p], 1\n"
> -		"	sc.d.rl   %[rc], %[rc], %[c]\n"
> -		"	bnez      %[rc], 0b\n"
> -		"	fence     rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_inc_unless_negative(prev, rc, v->counter, "d");
> +
>  	return !(prev < 0);
>  }
>  
> @@ -325,17 +331,8 @@ static __always_inline bool arch_atomic64_dec_unless_positive(atomic64_t *v)
>  	s64 prev;
>  	long rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.d      %[p],  %[c]\n"
> -		"	bgtz      %[p],  1f\n"
> -		"	addi      %[rc], %[p], -1\n"
> -		"	sc.d.rl   %[rc], %[rc], %[c]\n"
> -		"	bnez      %[rc], 0b\n"
> -		"	fence     rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_dec_unless_positive(prev, rc, v->counter, "d");
> +
>  	return !(prev > 0);
>  }
>  
> @@ -346,17 +343,8 @@ static __always_inline s64 arch_atomic64_dec_if_positive(atomic64_t *v)
>         s64 prev;
>         long rc;
>  
> -	__asm__ __volatile__ (
> -		"0:	lr.d     %[p],  %[c]\n"
> -		"	addi      %[rc], %[p], -1\n"
> -		"	bltz     %[rc], 1f\n"
> -		"	sc.d.rl  %[rc], %[rc], %[c]\n"
> -		"	bnez     %[rc], 0b\n"
> -		"	fence    rw, rw\n"
> -		"1:\n"
> -		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> -		:
> -		: "memory");
> +	_arch_atomic_dec_if_positive(prev, rc, v->counter, "d");
> +
>  	return prev - 1;
>  }
I have no problem with this optimization.

Reviewed-by: Guo Ren <guoren@kernel.org> 

>  
> -- 
> 2.41.0
> 

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
                   ` (2 preceding siblings ...)
  2023-08-03  5:14 ` [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.* Leonardo Bras
@ 2023-08-03 13:53 ` Andrea Parri
  2023-08-04  2:20   ` Leonardo Bras Soares Passos
  2023-08-04  0:53 ` Guo Ren
  4 siblings, 1 reply; 12+ messages in thread
From: Andrea Parri @ 2023-08-03 13:53 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, Guo Ren, linux-kernel,
	linux-riscv

On Thu, Aug 03, 2023 at 02:13:57AM -0300, Leonardo Bras wrote:
> I unified previous patchsets into a single one, since the work is related.
> 
> While studying riscv's cmpxchg.h file, I got really interested in
> understanding how RISCV asm implemented the different versions of
> {cmp,}xchg.
> 
> When I understood the pattern, it made sense for me to remove the
> duplications and create macros to make it easier to understand what exactly
> changes between the versions: Instruction suffixes & barriers.
> 
> Also, did the same kind of work on atomic.c.
> 
> Note to Guo Ren:
> I did some further improvement after your previous reviews, so I ended
> up afraid of including your Reviewed-by before checking if the changes are
> ok for you. Please check it out again, I just removed some helper macros
> that were not being used elsewhere in the kernel.
> 
> Thanks!
> Leo
> 
> 
> Changes since squashed cmpxchg:
> - Unified with atomic.c patchset 
> - Rebased on top of torvalds/master (thanks Andrea Parri!)
> - Removed helper macros that were not being used elsewhere in the kernel.
> 
> Changes since (cmpxchg) RFCv3:
> - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> 
> Changes since (cmpxchg) RFCv2:
> - Fixed  macros that depend on having a local variable with a magic name
> - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> 
> Changes since (cmpxchg) RFCv1:
> - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> 
> 
> Leonardo Bras (3):
>   riscv/cmpxchg: Deduplicate xchg() asm functions
>   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
>   riscv/atomic.h : Deduplicate arch_atomic.*
> 
>  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
>  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
>  2 files changed, 123 insertions(+), 359 deletions(-)

LGTM.  For the series,

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>

  Andrea


* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
                   ` (3 preceding siblings ...)
  2023-08-03 13:53 ` [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Andrea Parri
@ 2023-08-04  0:53 ` Guo Ren
  2023-08-04  2:19   ` Leonardo Bras Soares Passos
  4 siblings, 1 reply; 12+ messages in thread
From: Guo Ren @ 2023-08-04  0:53 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Thu, Aug 3, 2023 at 1:14 PM Leonardo Bras <leobras@redhat.com> wrote:
>
> I unified previous patchsets into a single one, since the work is related.
>
> While studying riscv's cmpxchg.h file, I got really interested in
> understanding how RISCV asm implemented the different versions of
> {cmp,}xchg.
>
> When I understood the pattern, it made sense for me to remove the
> duplications and create macros to make it easier to understand what exactly
> changes between the versions: Instruction suffixes & barriers.
>
> Also, did the same kind of work on atomic.c.
>
> Note to Guo Ren:
> I did some further improvement after your previous reviews, so I ended
> up afraid of including your Reviewed-by before checking if the changes are
> ok for you. Please check it out again, I just removed some helper macros
> that were not being used elsewhere in the kernel.
I found that this optimization conflicts with the patches below:
https://lore.kernel.org/linux-riscv/20230802164701.192791-15-guoren@kernel.org/
https://lore.kernel.org/linux-riscv/20230802164701.192791-5-guoren@kernel.org/

If yours is merged, how do we support the inline cmpxchg/xchg_small
functions? It is a real struggle to implement complex functions with
macros.

Could you consider a more relaxed optimization in which we could
insert inline cmpxchg/xchg_small?

>
> Thanks!
> Leo
>
>
> Changes since squashed cmpxchg:
> - Unified with atomic.c patchset
> - Rebased on top of torvalds/master (thanks Andrea Parri!)
> - Removed helper macros that were not being used elsewhere in the kernel.
>
> Changes since (cmpxchg) RFCv3:
> - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
>
> Changes since (cmpxchg) RFCv2:
> - Fixed  macros that depend on having a local variable with a magic name
> - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
>
> Changes since (cmpxchg) RFCv1:
> - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
>
>
> Leonardo Bras (3):
>   riscv/cmpxchg: Deduplicate xchg() asm functions
>   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
>   riscv/atomic.h : Deduplicate arch_atomic.*
>
>  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
>  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
>  2 files changed, 123 insertions(+), 359 deletions(-)
>
> --
> 2.41.0
>


-- 
Best Regards
 Guo Ren


* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-04  0:53 ` Guo Ren
@ 2023-08-04  2:19   ` Leonardo Bras Soares Passos
  2023-08-04  2:29     ` Guo Ren
  0 siblings, 1 reply; 12+ messages in thread
From: Leonardo Bras Soares Passos @ 2023-08-04  2:19 UTC (permalink / raw)
  To: Guo Ren
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Thu, Aug 3, 2023 at 9:53 PM Guo Ren <guoren@kernel.org> wrote:
>
> On Thu, Aug 3, 2023 at 1:14 PM Leonardo Bras <leobras@redhat.com> wrote:
> >
> > I unified previous patchsets into a single one, since the work is related.
> >
> > While studying riscv's cmpxchg.h file, I got really interested in
> > understanding how RISCV asm implemented the different versions of
> > {cmp,}xchg.
> >
> > When I understood the pattern, it made sense for me to remove the
> > duplications and create macros to make it easier to understand what exactly
> > changes between the versions: Instruction suffixes & barriers.
> >
> > Also, did the same kind of work on atomic.c.
> >
> > Note to Guo Ren:
> > I did some further improvement after your previous reviews, so I ended
> > up afraid of including your Reviewed-by before checking if the changes are
> > ok for you. Please check it out again, I just removed some helper macros
> > that were not being used elsewhere in the kernel.
> I found this optimization has conflicts with the below patches:
> https://lore.kernel.org/linux-riscv/20230802164701.192791-15-guoren@kernel.org/
> https://lore.kernel.org/linux-riscv/20230802164701.192791-5-guoren@kernel.org/
>
> If yours merged, how do we support the inline cmpxchg/xchg_small
> function?

Oh, I actually introduced my series so I could later add new xchg and
cmpxchg for sizes 1 and 2. That's what your patches are about, right?

I was working on that yesterday, and decided to send the patchset
without them because I was still not sure enough.

As for the implementation strategy, I was introducing new macros for
xchg & cmpxchg with asm that would work for both size 1 & size 2,
using a switch-case to create the mask and and_value.

Do you think that works?

> It's very struggling to use macros to implement complex
> functions.

I agree, but with this we achieve more generic code, which makes the
pattern for a given function clearer.

> Could you consider a more relaxed optimization in which we could
> insert inline cmpxchg/xchg_small?

What about this: I finish the patches I have been working on
(cmpxchg & xchg for sizes 1 and 2), and if they are fine we expand
this patchset with them. If not, I try relaxing them a little so we
can merge with your set.

Does that work for you?

Best regards,
Leo


>
> >
> > Thanks!
> > Leo
> >
> >
> > Changes since squashed cmpxchg:
> > - Unified with atomic.c patchset
> > - Rebased on top of torvalds/master (thanks Andrea Parri!)
> > - Removed helper macros that were not being used elsewhere in the kernel.
> >
> > Changes since (cmpxchg) RFCv3:
> > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> >
> > Changes since (cmpxchg) RFCv2:
> > - Fixed  macros that depend on having a local variable with a magic name
> > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> >
> > Changes since (cmpxchg) RFCv1:
> > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> >
> >
> > Leonardo Bras (3):
> >   riscv/cmpxchg: Deduplicate xchg() asm functions
> >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> >   riscv/atomic.h : Deduplicate arch_atomic.*
> >
> >  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
> >  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
> >  2 files changed, 123 insertions(+), 359 deletions(-)
> >
> > --
> > 2.41.0
> >
>
>
> --
> Best Regards
>  Guo Ren
>



* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-03 13:53 ` [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Andrea Parri
@ 2023-08-04  2:20   ` Leonardo Bras Soares Passos
  0 siblings, 0 replies; 12+ messages in thread
From: Leonardo Bras Soares Passos @ 2023-08-04  2:20 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, Guo Ren, linux-kernel,
	linux-riscv

On Thu, Aug 3, 2023 at 10:53 AM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Thu, Aug 03, 2023 at 02:13:57AM -0300, Leonardo Bras wrote:
> > I unified previous patchsets into a single one, since the work is related.
> >
> > While studying riscv's cmpxchg.h file, I got really interested in
> > understanding how RISCV asm implemented the different versions of
> > {cmp,}xchg.
> >
> > When I understood the pattern, it made sense for me to remove the
> > duplications and create macros to make it easier to understand what exactly
> > changes between the versions: Instruction suffixes & barriers.
> >
> > Also, did the same kind of work on atomic.c.
> >
> > Note to Guo Ren:
> > I did some further improvement after your previous reviews, so I ended
> > up afraid of including your Reviewed-by before checking if the changes are
> > ok for you. Please check it out again, I just removed some helper macros
> > that were not being used elsewhere in the kernel.
> >
> > Thanks!
> > Leo
> >
> >
> > Changes since squashed cmpxchg:
> > - Unified with atomic.c patchset
> > - Rebased on top of torvalds/master (thanks Andrea Parri!)
> > - Removed helper macros that were not being used elsewhere in the kernel.
> >
> > Changes since (cmpxchg) RFCv3:
> > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> >
> > Changes since (cmpxchg) RFCv2:
> > - Fixed  macros that depend on having a local variable with a magic name
> > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> >
> > Changes since (cmpxchg) RFCv1:
> > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> >
> >
> > Leonardo Bras (3):
> >   riscv/cmpxchg: Deduplicate xchg() asm functions
> >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> >   riscv/atomic.h : Deduplicate arch_atomic.*
> >
> >  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
> >  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
> >  2 files changed, 123 insertions(+), 359 deletions(-)
>
> LGTM.  For the series,
>
> Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
>
>   Andrea

Thanks Andrea!

>



* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-04  2:19   ` Leonardo Bras Soares Passos
@ 2023-08-04  2:29     ` Guo Ren
  2023-08-04  8:05       ` Leonardo Bras Soares Passos
  0 siblings, 1 reply; 12+ messages in thread
From: Guo Ren @ 2023-08-04  2:29 UTC (permalink / raw)
  To: Leonardo Bras Soares Passos
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Fri, Aug 4, 2023 at 10:20 AM Leonardo Bras Soares Passos
<leobras@redhat.com> wrote:
>
> On Thu, Aug 3, 2023 at 9:53 PM Guo Ren <guoren@kernel.org> wrote:
> >
> > On Thu, Aug 3, 2023 at 1:14 PM Leonardo Bras <leobras@redhat.com> wrote:
> > >
> > > I unified previous patchsets into a single one, since the work is related.
> > >
> > > While studying riscv's cmpxchg.h file, I got really interested in
> > > understanding how RISCV asm implemented the different versions of
> > > {cmp,}xchg.
> > >
> > > When I understood the pattern, it made sense for me to remove the
> > > duplications and create macros to make it easier to understand what exactly
> > > changes between the versions: Instruction suffixes & barriers.
> > >
> > > Also, did the same kind of work on atomic.c.
> > >
> > > Note to Guo Ren:
> > > I did some further improvement after your previous reviews, so I ended
> > > up afraid of including your Reviewed-by before checking if the changes are
> > > ok for you. Please check it out again, I just removed some helper macros
> > > that were not being used elsewhere in the kernel.
> > I found this optimization has conflicts with the below patches:
> > https://lore.kernel.org/linux-riscv/20230802164701.192791-15-guoren@kernel.org/
> > https://lore.kernel.org/linux-riscv/20230802164701.192791-5-guoren@kernel.org/
> >
> > If yours merged, how do we support the inline cmpxchg/xchg_small
> > function?
>
> Oh, I actually introduced my series so I could introduce new xchg and
> cmpxchg for size 1 and 2. Is that what your patches are about, right?
>
> I was working on that yesterday, and decided to send the patchset
> without them because I was still not sure enough.
>
> About implementation strategy, I was introducing a new macros for xchg
> & cmpxchg with asm which would work for both for size 1 & size 2, and
> use the switch-case to create the mask and and_value.
>
> You think that works enough?
Good, go ahead.

>
> > It's very struggling to use macros to implement complex
> > functions.
>
> I agree, but with this we can achieve more generic code, which makes
> more clear what is the pattern for given function.
>
> > Could you consider a more relaxed optimization in which we could
> > insert inline cmpxchg/xchg_small?
>
> What about this: I finish the patches I have been working with
> (cmpxchg & xchg for sizes 1 and 2), and if they are fine we expand
> this patchset with them.  If not, I try relaxing them a little so we
> can merge with your set.
>
> Does that work for you?
Great: you could provide cmpxchg & xchg for sizes 1 and 2, then my
patch series will be based on yours. After testing, I will give you my
Tested-by.
>
> Best regards,
> Leo
>
>
> >
> > >
> > > Thanks!
> > > Leo
> > >
> > >
> > > Changes since squashed cmpxchg:
> > > - Unified with atomic.c patchset
> > > - Rebased on top of torvalds/master (thanks Andrea Parri!)
> > > - Removed helper macros that were not being used elsewhere in the kernel.
> > >
> > > Changes since (cmpxchg) RFCv3:
> > > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> > >
> > > Changes since (cmpxchg) RFCv2:
> > > - Fixed  macros that depend on having a local variable with a magic name
> > > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> > >
> > > Changes since (cmpxchg) RFCv1:
> > > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> > >
> > >
> > > Leonardo Bras (3):
> > >   riscv/cmpxchg: Deduplicate xchg() asm functions
> > >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> > >   riscv/atomic.h : Deduplicate arch_atomic.*
> > >
> > >  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
> > >  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
> > >  2 files changed, 123 insertions(+), 359 deletions(-)
> > >
> > > --
> > > 2.41.0
> > >
> >
> >
> > --
> > Best Regards
> >  Guo Ren
> >
>


-- 
Best Regards
 Guo Ren


* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-04  2:29     ` Guo Ren
@ 2023-08-04  8:05       ` Leonardo Bras Soares Passos
  2023-08-04  8:55         ` Leonardo Bras Soares Passos
  0 siblings, 1 reply; 12+ messages in thread
From: Leonardo Bras Soares Passos @ 2023-08-04  8:05 UTC (permalink / raw)
  To: Guo Ren
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Thu, Aug 3, 2023 at 11:29 PM Guo Ren <guoren@kernel.org> wrote:
>
> On Fri, Aug 4, 2023 at 10:20 AM Leonardo Bras Soares Passos
> <leobras@redhat.com> wrote:
> >
> > On Thu, Aug 3, 2023 at 9:53 PM Guo Ren <guoren@kernel.org> wrote:
> > >
> > > On Thu, Aug 3, 2023 at 1:14 PM Leonardo Bras <leobras@redhat.com> wrote:
> > > >
> > > > I unified previous patchsets into a single one, since the work is related.
> > > >
> > > > While studying riscv's cmpxchg.h file, I got really interested in
> > > > understanding how RISCV asm implemented the different versions of
> > > > {cmp,}xchg.
> > > >
> > > > When I understood the pattern, it made sense for me to remove the
> > > > duplications and create macros to make it easier to understand what exactly
> > > > changes between the versions: Instruction suffixes & barriers.
> > > >
> > > > Also, did the same kind of work on atomic.c.
> > > >
> > > > Note to Guo Ren:
> > > > I did some further improvement after your previous reviews, so I ended
> > > > up afraid of including your Reviewed-by before checking if the changes are
> > > > ok for you. Please check it out again, I just removed some helper macros
> > > > that were not being used elsewhere in the kernel.
> > > I found this optimization has conflicts with the below patches:
> > > https://lore.kernel.org/linux-riscv/20230802164701.192791-15-guoren@kernel.org/
> > > https://lore.kernel.org/linux-riscv/20230802164701.192791-5-guoren@kernel.org/
> > >
> > > If yours merged, how do we support the inline cmpxchg/xchg_small
> > > function?
> >
> > Oh, I actually introduced my series so I could introduce new xchg and
> > cmpxchg for size 1 and 2. Is that what your patches are about, right?
> >
> > I was working on that yesterday, and decided to send the patchset
> > without them because I was still not sure enough.
> >
> > About implementation strategy, I was introducing a new macros for xchg
> > & cmpxchg with asm which would work for both for size 1 & size 2, and
> > use the switch-case to create the mask and and_value.
> >
> > You think that works enough?
> Good, go ahead.
>
> >
> > > It's very struggling to use macros to implement complex
> > > functions.
> >
> > I agree, but with this we can achieve more generic code, which makes
> > more clear what is the pattern for given function.
> >
> > > Could you consider a more relaxed optimization in which we could
> > > insert inline cmpxchg/xchg_small?
> >
> > What about this: I finish the patches I have been working with
> > (cmpxchg & xchg for sizes 1 and 2), and if they are fine we expand
> > this patchset with them.  If not, I try relaxing them a little so we
> > can merge with your set.
> >
> > Does that work for you?
> Great, you could provide cmpxchg & xchg for sizes 1 and 2, then my
> patch series would base on yours. After tested, I would give you
> Tested-by.

Awesome! Thanks!

I will send it shortly, just compile-testing the kernel.

> >
> > Best regards,
> > Leo
> >
> >
> > >
> > > >
> > > > Thanks!
> > > > Leo
> > > >
> > > >
> > > > Changes since squashed cmpxchg:
> > > > - Unified with atomic.c patchset
> > > > - Rebased on top of torvalds/master (thanks Andrea Parri!)
> > > > - Removed helper macros that were not being used elsewhere in the kernel.
> > > >
> > > > Changes since (cmpxchg) RFCv3:
> > > > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > > > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> > > >
> > > > Changes since (cmpxchg) RFCv2:
> > > > - Fixed  macros that depend on having a local variable with a magic name
> > > > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > > > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> > > >
> > > > Changes since (cmpxchg) RFCv1:
> > > > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > > > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> > > >
> > > >
> > > > Leonardo Bras (3):
> > > >   riscv/cmpxchg: Deduplicate xchg() asm functions
> > > >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> > > >   riscv/atomic.h : Deduplicate arch_atomic.*
> > > >
> > > >  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
> > > >  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
> > > >  2 files changed, 123 insertions(+), 359 deletions(-)
> > > >
> > > > --
> > > > 2.41.0
> > > >
> > >
> > >
> > > --
> > > Best Regards
> > >  Guo Ren
> > >
> >
>
>
> --
> Best Regards
>  Guo Ren
>



* Re: [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros
  2023-08-04  8:05       ` Leonardo Bras Soares Passos
@ 2023-08-04  8:55         ` Leonardo Bras Soares Passos
  0 siblings, 0 replies; 12+ messages in thread
From: Leonardo Bras Soares Passos @ 2023-08-04  8:55 UTC (permalink / raw)
  To: Guo Ren
  Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrzej Hajda,
	Arnd Bergmann, Ingo Molnar, Palmer Dabbelt, linux-kernel,
	linux-riscv

On Fri, Aug 4, 2023 at 5:05 AM Leonardo Bras Soares Passos
<leobras@redhat.com> wrote:
>
> On Thu, Aug 3, 2023 at 11:29 PM Guo Ren <guoren@kernel.org> wrote:
> >
> > On Fri, Aug 4, 2023 at 10:20 AM Leonardo Bras Soares Passos
> > <leobras@redhat.com> wrote:
> > >
> > > On Thu, Aug 3, 2023 at 9:53 PM Guo Ren <guoren@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 3, 2023 at 1:14 PM Leonardo Bras <leobras@redhat.com> wrote:
> > > > >
> > > > > I unified previous patchsets into a single one, since the work is related.
> > > > >
> > > > > While studying riscv's cmpxchg.h file, I got really interested in
> > > > > understanding how RISCV asm implemented the different versions of
> > > > > {cmp,}xchg.
> > > > >
> > > > > When I understood the pattern, it made sense for me to remove the
> > > > > duplications and create macros to make it easier to understand what exactly
> > > > > changes between the versions: Instruction suffixes & barriers.
> > > > >
> > > > > Also, did the same kind of work on atomic.c.
> > > > >
> > > > > Note to Guo Ren:
> > > > > I did some further improvement after your previous reviews, so I ended
> > > > > up afraid of including your Reviewed-by before checking if the changes are
> > > > > ok for you. Please check it out again, I just removed some helper macros
> > > > > that were not being used elsewhere in the kernel.
> > > > I found this optimization has conflicts with the below patches:
> > > > https://lore.kernel.org/linux-riscv/20230802164701.192791-15-guoren@kernel.org/
> > > > https://lore.kernel.org/linux-riscv/20230802164701.192791-5-guoren@kernel.org/
> > > >
> > > > If yours merged, how do we support the inline cmpxchg/xchg_small
> > > > function?
> > >
> > > Oh, I actually introduced my series so I could introduce new xchg and
> > > cmpxchg for size 1 and 2. Is that what your patches are about, right?
> > >
> > > I was working on that yesterday, and decided to send the patchset
> > > without them because I was still not sure enough.
> > >
> > > About implementation strategy, I was introducing a new macros for xchg
> > > & cmpxchg with asm which would work for both for size 1 & size 2, and
> > > use the switch-case to create the mask and and_value.
> > >
> > > You think that works enough?
> > Good, go ahead.
> >
> > >
> > > > It's very struggling to use macros to implement complex
> > > > functions.
> > >
> > > I agree, but with this we can achieve more generic code, which makes
> > > more clear what is the pattern for given function.
> > >
> > > > Could you consider a more relaxed optimization in which we could
> > > > insert inline cmpxchg/xchg_small?
> > >
> > > What about this: I finish the patches I have been working with
> > > (cmpxchg & xchg for sizes 1 and 2), and if they are fine we expand
> > > this patchset with them.  If not, I try relaxing them a little so we
> > > can merge with your set.
> > >
> > > Does that work for you?
> > Great, you could provide cmpxchg & xchg for sizes 1 and 2, then my
> > patch series would base on yours. After tested, I would give you
> > Tested-by.
>
> Awesome! Thanks!
>
> I will send it shortly, just compile-testing the kernel.
>

v3:
https://patchwork.kernel.org/project/linux-riscv/list/?series=772986&state=%2A&archive=both

> > >
> > > Best regards,
> > > Leo
> > >
> > >
> > > >
> > > > >
> > > > > Thanks!
> > > > > Leo
> > > > >
> > > > >
> > > > > Changes since squashed cmpxchg:
> > > > > - Unified with atomic.c patchset
> > > > > - Rebased on top of torvalds/master (thanks Andrea Parri!)
> > > > > - Removed helper macros that were not being used elsewhere in the kernel.
> > > > >
> > > > > Changes since (cmpxchg) RFCv3:
> > > > > - Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
> > > > > https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
> > > > >
> > > > > Changes since (cmpxchg) RFCv2:
> > > > > - Fixed  macros that depend on having a local variable with a magic name
> > > > > - Previous cast to (long) is now only applied on 4-bytes cmpxchg
> > > > > https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
> > > > >
> > > > > Changes since (cmpxchg) RFCv1:
> > > > > - Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
> > > > > https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
> > > > >
> > > > >
> > > > > Leonardo Bras (3):
> > > > >   riscv/cmpxchg: Deduplicate xchg() asm functions
> > > > >   riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
> > > > >   riscv/atomic.h : Deduplicate arch_atomic.*
> > > > >
> > > > >  arch/riscv/include/asm/atomic.h  | 164 ++++++++--------
> > > > >  arch/riscv/include/asm/cmpxchg.h | 318 +++++--------------------------
> > > > >  2 files changed, 123 insertions(+), 359 deletions(-)
> > > > >
> > > > > --
> > > > > 2.41.0
> > > > >
> > > >
> > > >
> > > > --
> > > > Best Regards
> > > >  Guo Ren
> > > >
> > >
> >
> >
> > --
> > Best Regards
> >  Guo Ren
> >



end of thread, other threads:[~2023-08-04  8:56 UTC | newest]

Thread overview: 12+ messages
2023-08-03  5:13 [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Leonardo Bras
2023-08-03  5:13 ` [RFC PATCH v2 1/3] riscv/cmpxchg: Deduplicate xchg() asm functions Leonardo Bras
2023-08-03  5:13 ` [RFC PATCH v2 2/3] riscv/cmpxchg: Deduplicate cmpxchg() asm and macros Leonardo Bras
2023-08-03  5:14 ` [RFC PATCH v2 3/3] riscv/atomic.h : Deduplicate arch_atomic.* Leonardo Bras
2023-08-03  7:33   ` Guo Ren
2023-08-03 13:53 ` [RFC PATCH v2 0/3] Deduplicate RISCV cmpxchg.h and atomic.c macros Andrea Parri
2023-08-04  2:20   ` Leonardo Bras Soares Passos
2023-08-04  0:53 ` Guo Ren
2023-08-04  2:19   ` Leonardo Bras Soares Passos
2023-08-04  2:29     ` Guo Ren
2023-08-04  8:05       ` Leonardo Bras Soares Passos
2023-08-04  8:55         ` Leonardo Bras Soares Passos
