public inbox for linux-kernel@vger.kernel.org
* [PATCH v10 loongarch-next 0/3] LoongArch: Add 128-bit atomic cmpxchg support
@ 2026-01-10 13:11 George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 1/3] LoongArch: Add SCQ support detection George Guo
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: George Guo @ 2026-01-10 13:11 UTC (permalink / raw)
  To: chenhuacai
  Cc: dongtai.guo, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

This patch series adds 128-bit atomic compare-and-exchange support for
the LoongArch architecture, which fixes BPF scheduler test failures caused
by the missing 128-bit atomics support.

---
Changes in v10:
- Move "scq" after "lam"
- Squash the last patch
- Move patch "LoongArch: Replace seq_printf with seq_puts for simple strings" to the end
- Link to v9: https://lore.kernel.org/all/20260105105514.76021-1-dongtai.guo@linux.dev/

---
Changes in v9:
- Add patch: "LoongArch: Replace seq_printf with seq_puts for simple strings"
- #define system_has_cmpxchg128() (cpu_has_scq) 
- Delete __cmpxchg128_locked
- #define HWCAP_LOONGARCH_CPU_SCQ (1 << 15) in hwcap.h
- Link to v8: https://lore.kernel.org/all/20251231034523.47014-1-dongtai.guo@linux.dev/

---
Changes in v8:
- Merge patch 2 and patch 3 into one patch
- Put HAVE_CMPXCHG_DOUBLE in order
- Link to v7: https://lore.kernel.org/all/20251230013417.37393-1-dongtai.guo@linux.dev/

---
Changes in v7:
- Create patches based on the loongarch-next branch (previously master)
- Link to v6: https://lore.kernel.org/r/20251215-2-v6-0-09a486e8df99@linux.dev

Changes in v6:
- Put SCQ information in hwcap
- Link to v5: https://lore.kernel.org/r/20251212-2-v5-0-704b3af55f7d@linux.dev

Changes in v5:
- Reordered the patches
- Link to v4: https://lore.kernel.org/r/20251205-2-v4-0-e5ab932cf219@linux.dev

Changes in v4:
- Add SCQ support detection
- Add spinlock to emulate 128-bit cmpxchg
- Link to v3: https://lore.kernel.org/r/20251126-2-v3-0-851b5a516801@linux.dev

Changes in v3:
- dbar 0 -> __WEAK_LLSC_MB
- "=ZB" (__ptr[0]) -> "r" (__ptr)
- Link to v2: https://lore.kernel.org/r/20251124-2-v2-0-b38216e25fd9@linux.dev

Changes in v2:
- Use a normal ld.d for the high word instead of ll.d to avoid race
  condition
- Insert a dbar between ll.d and ld.d to prevent reordering
- Simplify __cmpxchg128_asm("ll.d", "sc.q", ptr, o, n) to __cmpxchg128_asm(ptr, o, n)
- Fix address operand constraints after testing different approaches:
  * ld.d with "m"
  * ll.d with "ZC"
  * sc.q with "ZB" (alternative constraints caused issues:
   - "r"  caused system hang
   - "ZC" caused compiler error:
     {standard input}: Assembler messages:
     {standard input}:10037: Fatal error: Immediate overflow.
     format: u0:0 )
- Link to v1: https://lore.kernel.org/r/20251120-2-v1-0-705bdc440550@linux.dev


George Guo (3):
  LoongArch: Add SCQ support detection
  LoongArch: Add 128-bit atomic cmpxchg support
  LoongArch: Replace seq_printf with seq_puts for simple strings

 arch/loongarch/Kconfig                    |  2 +
 arch/loongarch/include/asm/cmpxchg.h      | 48 +++++++++++++++++
 arch/loongarch/include/asm/cpu-features.h |  1 +
 arch/loongarch/include/asm/cpu.h          |  2 +
 arch/loongarch/include/uapi/asm/hwcap.h   |  1 +
 arch/loongarch/kernel/cpu-probe.c         |  4 ++
 arch/loongarch/kernel/proc.c              | 63 ++++++++++++++---------
 7 files changed, 98 insertions(+), 23 deletions(-)

-- 
2.49.0



* [PATCH v10 loongarch-next 1/3] LoongArch: Add SCQ support detection
  2026-01-10 13:11 [PATCH v10 loongarch-next 0/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
@ 2026-01-10 13:11 ` George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings George Guo
  2 siblings, 0 replies; 8+ messages in thread
From: George Guo @ 2026-01-10 13:11 UTC (permalink / raw)
  To: chenhuacai
  Cc: dongtai.guo, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

Check the CPUCFG2_SCQ bit to determine whether the CPU supports
the SC.Q instruction.

Co-developed-by: Yangyang Lian <lianyangyang@kylinos.cn>
Signed-off-by: Yangyang Lian <lianyangyang@kylinos.cn>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
---
 arch/loongarch/include/asm/cpu-features.h | 1 +
 arch/loongarch/include/asm/cpu.h          | 2 ++
 arch/loongarch/include/uapi/asm/hwcap.h   | 1 +
 arch/loongarch/kernel/cpu-probe.c         | 4 ++++
 arch/loongarch/kernel/proc.c              | 1 +
 5 files changed, 9 insertions(+)

diff --git a/arch/loongarch/include/asm/cpu-features.h b/arch/loongarch/include/asm/cpu-features.h
index 3745d991a99a..39c7fe64c3ef 100644
--- a/arch/loongarch/include/asm/cpu-features.h
+++ b/arch/loongarch/include/asm/cpu-features.h
@@ -67,5 +67,6 @@
 #define cpu_has_msgint		cpu_opt(LOONGARCH_CPU_MSGINT)
 #define cpu_has_avecint		cpu_opt(LOONGARCH_CPU_AVECINT)
 #define cpu_has_redirectint	cpu_opt(LOONGARCH_CPU_REDIRECTINT)
+#define cpu_has_scq		cpu_opt(LOONGARCH_CPU_SCQ)
 
 #endif /* __ASM_CPU_FEATURES_H */
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index f3efb00b6141..5531039027ec 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -125,6 +125,7 @@ static inline char *id_to_core_name(unsigned int id)
 #define CPU_FEATURE_MSGINT		29	/* CPU has MSG interrupt */
 #define CPU_FEATURE_AVECINT		30	/* CPU has AVEC interrupt */
 #define CPU_FEATURE_REDIRECTINT		31	/* CPU has interrupt remapping */
+#define CPU_FEATURE_SCQ			32	/* CPU has SC.Q instruction */
 
 #define LOONGARCH_CPU_CPUCFG		BIT_ULL(CPU_FEATURE_CPUCFG)
 #define LOONGARCH_CPU_LAM		BIT_ULL(CPU_FEATURE_LAM)
@@ -158,5 +159,6 @@ static inline char *id_to_core_name(unsigned int id)
 #define LOONGARCH_CPU_MSGINT		BIT_ULL(CPU_FEATURE_MSGINT)
 #define LOONGARCH_CPU_AVECINT		BIT_ULL(CPU_FEATURE_AVECINT)
 #define LOONGARCH_CPU_REDIRECTINT	BIT_ULL(CPU_FEATURE_REDIRECTINT)
+#define LOONGARCH_CPU_SCQ		BIT_ULL(CPU_FEATURE_SCQ)
 
 #endif /* _ASM_CPU_H */
diff --git a/arch/loongarch/include/uapi/asm/hwcap.h b/arch/loongarch/include/uapi/asm/hwcap.h
index 2b34e56cfa9e..a3c570d407b9 100644
--- a/arch/loongarch/include/uapi/asm/hwcap.h
+++ b/arch/loongarch/include/uapi/asm/hwcap.h
@@ -18,5 +18,6 @@
 #define HWCAP_LOONGARCH_LBT_MIPS	(1 << 12)
 #define HWCAP_LOONGARCH_PTW		(1 << 13)
 #define HWCAP_LOONGARCH_LSPW		(1 << 14)
+#define HWCAP_LOONGARCH_CPU_SCQ		(1 << 15)
 
 #endif /* _UAPI_ASM_HWCAP_H */
diff --git a/arch/loongarch/kernel/cpu-probe.c b/arch/loongarch/kernel/cpu-probe.c
index 08a227034042..7c7708ce4063 100644
--- a/arch/loongarch/kernel/cpu-probe.c
+++ b/arch/loongarch/kernel/cpu-probe.c
@@ -177,6 +177,10 @@ static void cpu_probe_common(struct cpuinfo_loongarch *c)
 		c->options |= LOONGARCH_CPU_LAM;
 		elf_hwcap |= HWCAP_LOONGARCH_LAM;
 	}
+	if (config & CPUCFG2_SCQ) {
+		c->options |= LOONGARCH_CPU_SCQ;
+		elf_hwcap |= HWCAP_LOONGARCH_CPU_SCQ;
+	}
 	if (config & CPUCFG2_FP) {
 		c->options |= LOONGARCH_CPU_FPU;
 		elf_hwcap |= HWCAP_LOONGARCH_FPU;
diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
index a8800d20e11b..a60471b96440 100644
--- a/arch/loongarch/kernel/proc.c
+++ b/arch/loongarch/kernel/proc.c
@@ -62,6 +62,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	seq_printf(m, "Features\t\t:");
 	if (cpu_has_cpucfg)	seq_printf(m, " cpucfg");
 	if (cpu_has_lam)	seq_printf(m, " lam");
+	if (cpu_has_scq)	seq_printf(m, " scq");
 	if (cpu_has_ual)	seq_printf(m, " ual");
 	if (cpu_has_fpu)	seq_printf(m, " fpu");
 	if (cpu_has_lsx)	seq_printf(m, " lsx");
-- 
2.49.0
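
For readers wondering how userspace might consume the new hwcap bit, here
is a minimal sketch that is not part of the patch. It assumes glibc's
getauxval() and takes the bit value (1 << 15) from the hwcap.h hunk above;
hwcap_has_scq() and cpu_reports_scq() are hypothetical helper names. The
bit is only meaningful on LoongArch; on other architectures bit 15 of
AT_HWCAP has a different meaning.

```c
#include <assert.h>
#include <sys/auxv.h>

/* Value taken from the hwcap.h hunk in this patch. */
#ifndef HWCAP_LOONGARCH_CPU_SCQ
#define HWCAP_LOONGARCH_CPU_SCQ (1 << 15)
#endif

/* Hypothetical helper: test the SCQ bit in a given AT_HWCAP word. */
static int hwcap_has_scq(unsigned long hwcap)
{
	return (hwcap & HWCAP_LOONGARCH_CPU_SCQ) != 0;
}

/* Hypothetical helper: query the running process's AT_HWCAP word. */
static int cpu_reports_scq(void)
{
	return hwcap_has_scq(getauxval(AT_HWCAP));
}
```

On a LoongArch system, cpu_reports_scq() would be expected to agree with
the "scq" entry in the Features line of /proc/cpuinfo, since both are set
from the same CPUCFG2_SCQ check in cpu_probe_common().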



* [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support
  2026-01-10 13:11 [PATCH v10 loongarch-next 0/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 1/3] LoongArch: Add SCQ support detection George Guo
@ 2026-01-10 13:11 ` George Guo
  2026-01-28  4:26   ` Huacai Chen
  2026-02-16  7:32   ` Thomas Weißschuh
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings George Guo
  2 siblings, 2 replies; 8+ messages in thread
From: George Guo @ 2026-01-10 13:11 UTC (permalink / raw)
  To: chenhuacai
  Cc: dongtai.guo, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

From: George Guo <guodongtai@kylinos.cn>

Implement 128-bit atomic compare-and-exchange using LoongArch's
LL.D/SC.Q instructions.

At the same time, fix BPF scheduler test failures (scx_central, scx_qmap)
caused by kmalloc_nolock_noprof returning NULL due to missing
128-bit atomics. The NULL returns led to -ENOMEM errors during
scheduler initialization, causing the test cases to fail.

Verified with the scx_qmap scheduler (located in tools/sched_ext/),
built with `make` and run as ./tools/sched_ext/build/bin/scx_qmap.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=5fb750e8a9ae
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
---
 arch/loongarch/Kconfig               |  2 ++
 arch/loongarch/include/asm/cmpxchg.h | 48 ++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 730f34214519..f9845ebec1a4 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -114,6 +114,7 @@ config LOONGARCH
 	select GENERIC_TIME_VSYSCALL
 	select GPIOLIB
 	select HAS_IOPORT
+	select HAVE_ALIGNED_STRUCT_PAGE
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_BITREVERSE
 	select HAVE_ARCH_JUMP_LABEL
@@ -130,6 +131,7 @@ config LOONGARCH
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
 	select HAVE_ASM_MODVERSIONS
+	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK
diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
index 0494c2ab553e..d25e25d8fc9e 100644
--- a/arch/loongarch/include/asm/cmpxchg.h
+++ b/arch/loongarch/include/asm/cmpxchg.h
@@ -8,6 +8,7 @@
 #include <linux/bits.h>
 #include <linux/build_bug.h>
 #include <asm/barrier.h>
+#include <asm/cpu-features.h>
 
 #define __xchg_amo_asm(amswap_db, m, val)	\
 ({						\
@@ -137,6 +138,44 @@ __arch_xchg(volatile void *ptr, unsigned long x, int size)
 	__ret;								\
 })
 
+union __u128_halves {
+	u128 full;
+	struct {
+		u64 low;
+		u64 high;
+	};
+};
+
+#define __arch_cmpxchg128(ptr, old, new)					\
+({									\
+	union __u128_halves __old, __new, __ret;			\
+	volatile u64 *__ptr = (volatile u64 *)(ptr);			\
+									\
+	__old.full = (old);                                             \
+	__new.full = (new);						\
+									\
+	__asm__ __volatile__(						\
+	"1:   ll.d    %0, %3		# 128-bit cmpxchg low	\n"	\
+	__WEAK_LLSC_MB							\
+	"     ld.d    %1, %4		# 128-bit cmpxchg high	\n"	\
+	"     bne     %0, %z5, 2f				\n"	\
+	"     bne     %1, %z6, 2f				\n"	\
+	"     move    $t0, %z7					\n"	\
+	"     move    $t1, %z8					\n"	\
+	"     sc.q    $t0, $t1, %2				\n"	\
+	"     beqz    $t0, 1b					\n"	\
+	"2:							\n"	\
+	__WEAK_LLSC_MB							\
+	: "=&r" (__ret.low), "=&r" (__ret.high)				\
+	: "r" (__ptr),							\
+	  "ZC" (__ptr[0]), "m" (__ptr[1]),				\
+	  "Jr" (__old.low), "Jr" (__old.high),				\
+	  "Jr" (__new.low), "Jr" (__new.high)				\
+	: "t0", "t1", "memory");					\
+									\
+	__ret.full;							\
+})
+
 static inline unsigned int __cmpxchg_small(volatile void *ptr, unsigned int old,
 					   unsigned int new, unsigned int size)
 {
@@ -224,6 +263,15 @@ __cmpxchg(volatile void *ptr, unsigned long old, unsigned long new, unsigned int
 	__res;								\
 })
 
+/* cmpxchg128 */
+#define system_has_cmpxchg128()		(cpu_has_scq)
+
+#define arch_cmpxchg128(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 16);				\
+	__arch_cmpxchg128(ptr, o, n);					\
+})
+
 #ifdef CONFIG_64BIT
 #define arch_cmpxchg64_local(ptr, o, n)					\
   ({									\
-- 
2.49.0
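
To make the intended semantics of __arch_cmpxchg128() easier to follow,
here is a plain C reference model, offered as a sketch only. It is NOT
atomic and does not replace the LL.D/SC.Q sequence; it only shows what
value is returned and when the store happens. It assumes a little-endian
host and the GCC/Clang unsigned __int128 extension; cmpxchg128_model() and
union u128_halves are hypothetical names that mirror the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;	/* GCC/Clang extension */

/* Same low/high split as union __u128_halves in the patch. */
union u128_halves {
	u128 full;
	struct {
		uint64_t low;
		uint64_t high;
	};
};

/*
 * Non-atomic model: compare both 64-bit halves against "old", store
 * "new_val" only if both match, and always return the value that was
 * observed in memory before any store.
 */
static u128 cmpxchg128_model(u128 *ptr, u128 old, u128 new_val)
{
	union u128_halves cur, want;

	cur.full = *ptr;
	want.full = old;

	/* Corresponds to the two bne instructions in the asm. */
	if (cur.low == want.low && cur.high == want.high)
		*ptr = new_val;	/* corresponds to a successful sc.q */

	return cur.full;
}
```

A caller detects success the usual way, by comparing the returned value
with the "old" value it passed in.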



* [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings
  2026-01-10 13:11 [PATCH v10 loongarch-next 0/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 1/3] LoongArch: Add SCQ support detection George Guo
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
@ 2026-01-10 13:11 ` George Guo
  2026-01-10 18:04   ` Joe Perches
  2 siblings, 1 reply; 8+ messages in thread
From: George Guo @ 2026-01-10 13:11 UTC (permalink / raw)
  To: chenhuacai
  Cc: dongtai.guo, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

Fix checkpatch.pl warnings such as "Prefer seq_puts to seq_printf".

Replace seq_printf() calls with seq_puts() in show_cpuinfo()
when outputting simple constant strings without format specifiers.

This improves performance slightly as seq_puts() avoids parsing
the format string.

Signed-off-by: George Guo <guodongtai@kylinos.cn>
---
 arch/loongarch/kernel/proc.c | 64 ++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
index a60471b96440..a8127e83da65 100644
--- a/arch/loongarch/kernel/proc.c
+++ b/arch/loongarch/kernel/proc.c
@@ -50,33 +50,49 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	seq_printf(m, "Address Sizes\t\t: %d bits physical, %d bits virtual\n",
 		      cpu_pabits + 1, cpu_vabits + 1);
 
-	seq_printf(m, "ISA\t\t\t:");
+	seq_puts(m, "ISA\t\t\t:");
 	if (isa & LOONGARCH_CPU_ISA_LA32R)
-		seq_printf(m, " loongarch32r");
+		seq_puts(m, " loongarch32r");
 	if (isa & LOONGARCH_CPU_ISA_LA32S)
-		seq_printf(m, " loongarch32s");
+		seq_puts(m, " loongarch32s");
 	if (isa & LOONGARCH_CPU_ISA_LA64)
-		seq_printf(m, " loongarch64");
-	seq_printf(m, "\n");
+		seq_puts(m, " loongarch64");
+	seq_puts(m, "\n");
 
-	seq_printf(m, "Features\t\t:");
-	if (cpu_has_cpucfg)	seq_printf(m, " cpucfg");
-	if (cpu_has_lam)	seq_printf(m, " lam");
-	if (cpu_has_scq)	seq_printf(m, " scq");
-	if (cpu_has_ual)	seq_printf(m, " ual");
-	if (cpu_has_fpu)	seq_printf(m, " fpu");
-	if (cpu_has_lsx)	seq_printf(m, " lsx");
-	if (cpu_has_lasx)	seq_printf(m, " lasx");
-	if (cpu_has_crc32)	seq_printf(m, " crc32");
-	if (cpu_has_complex)	seq_printf(m, " complex");
-	if (cpu_has_crypto)	seq_printf(m, " crypto");
-	if (cpu_has_ptw)	seq_printf(m, " ptw");
-	if (cpu_has_lspw)	seq_printf(m, " lspw");
-	if (cpu_has_lvz)	seq_printf(m, " lvz");
-	if (cpu_has_lbt_x86)	seq_printf(m, " lbt_x86");
-	if (cpu_has_lbt_arm)	seq_printf(m, " lbt_arm");
-	if (cpu_has_lbt_mips)	seq_printf(m, " lbt_mips");
-	seq_printf(m, "\n");
+	seq_puts(m, "Features\t\t:");
+	if (cpu_has_cpucfg)
+		seq_puts(m, " cpucfg");
+	if (cpu_has_lam)
+		seq_puts(m, " lam");
+	if (cpu_has_scq)
+		seq_puts(m, " scq");
+	if (cpu_has_ual)
+		seq_puts(m, " ual");
+	if (cpu_has_fpu)
+		seq_puts(m, " fpu");
+	if (cpu_has_lsx)
+		seq_puts(m, " lsx");
+	if (cpu_has_lasx)
+		seq_puts(m, " lasx");
+	if (cpu_has_crc32)
+		seq_puts(m, " crc32");
+	if (cpu_has_complex)
+		seq_puts(m, " complex");
+	if (cpu_has_crypto)
+		seq_puts(m, " crypto");
+	if (cpu_has_ptw)
+		seq_puts(m, " ptw");
+	if (cpu_has_lspw)
+		seq_puts(m, " lspw");
+	if (cpu_has_lvz)
+		seq_puts(m, " lvz");
+	if (cpu_has_lbt_x86)
+		seq_puts(m, " lbt_x86");
+	if (cpu_has_lbt_arm)
+		seq_puts(m, " lbt_arm");
+	if (cpu_has_lbt_mips)
+		seq_puts(m, " lbt_mips");
+	seq_puts(m, "\n");
 
 	seq_printf(m, "Hardware Watchpoint\t: %s", str_yes_no(cpu_has_watch));
 	if (cpu_has_watch) {
@@ -84,7 +100,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 		      cpu_data[n].watch_ireg_count, cpu_data[n].watch_dreg_count);
 	}
 
-	seq_printf(m, "\n\n");
+	seq_puts(m, "\n\n");
 
 	return 0;
 }
-- 
2.49.0



* Re: [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings George Guo
@ 2026-01-10 18:04   ` Joe Perches
  2026-01-11  1:48     ` Huacai Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Joe Perches @ 2026-01-10 18:04 UTC (permalink / raw)
  To: George Guo, chenhuacai
  Cc: guodongtai, hengqi.chen, kernel, lianyangyang, linux-kernel,
	loongarch, r, xry111

On Sat, 2026-01-10 at 21:11 +0800, George Guo wrote:
> Fix warnings like: "Prefer seq_puts to seq_printf" by checkpatch.pl.
> 
> Replace seq_printf() calls with seq_puts() in show_cpuinfo()
> when outputting simple constant strings without format specifiers.
> 
> This improves performance slightly as seq_puts() avoids parsing
> the format string.
[]
> diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
[]
> @@ -50,33 +50,49 @@ static int show_cpuinfo(struct seq_file *m, void *v)
[]
> -	seq_printf(m, "Features\t\t:");
> -	if (cpu_has_cpucfg)	seq_printf(m, " cpucfg");
> -	if (cpu_has_lam)	seq_printf(m, " lam");
[etc]
> +	seq_puts(m, "Features\t\t:");
> +	if (cpu_has_cpucfg)
> +		seq_puts(m, " cpucfg");
> +	if (cpu_has_lam)
> +		seq_puts(m, " lam");

trivia:

Not sure this is better style as it's fairly difficult to read.

Maybe a macro might help, something like:

#define seq_cpu_feature(m, feature) \
	if (cpu_has_##feature) seq_puts(m, " " #feature)

	seq_cpu_feature(m, cpucfg);
	seq_cpu_feature(m, lam);

etc.
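
A possible refinement of the macro idea above, as a sketch only: a bare
"if" inside a macro can pair with a following "else" at the call site, so
the usual kernel idiom wraps the body in do { } while (0). The version
below compiles in userspace; struct seq_file, seq_puts() and the
cpu_has_* values here are hypothetical stand-ins for the kernel
definitions, chosen only so the example is self-contained.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-in for the kernel's seq_file (hypothetical stub). */
struct seq_file {
	char buf[128];
};

/* Stub: append s to the buffer, truncating if it would overflow. */
static void seq_puts(struct seq_file *m, const char *s)
{
	strncat(m->buf, s, sizeof(m->buf) - strlen(m->buf) - 1);
}

/* Stand-ins for the cpu_has_* feature tests; values picked for the demo. */
#define cpu_has_cpucfg	1
#define cpu_has_lam	0
#define cpu_has_scq	1

/* do/while(0) keeps the macro safe inside unbraced if/else statements. */
#define seq_cpu_feature(m, feature)			\
do {							\
	if (cpu_has_##feature)				\
		seq_puts(m, " " #feature);		\
} while (0)

static void show_features(struct seq_file *m)
{
	seq_puts(m, "Features\t\t:");
	seq_cpu_feature(m, cpucfg);
	seq_cpu_feature(m, lam);
	seq_cpu_feature(m, scq);
	seq_puts(m, "\n");
}
```

With the stand-in values above, only the features whose cpu_has_* test is
non-zero are appended to the Features line.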


* Re: [PATCH v10 loongarch-next 3/3] LoongArch: Replace seq_printf with seq_puts for simple strings
  2026-01-10 18:04   ` Joe Perches
@ 2026-01-11  1:48     ` Huacai Chen
  0 siblings, 0 replies; 8+ messages in thread
From: Huacai Chen @ 2026-01-11  1:48 UTC (permalink / raw)
  To: Joe Perches
  Cc: George Guo, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

Hi, Joe,

On Sun, Jan 11, 2026 at 2:04 AM Joe Perches <joe@perches.com> wrote:
>
> On Sat, 2026-01-10 at 21:11 +0800, George Guo wrote:
> > Fix warnings like: "Prefer seq_puts to seq_printf" by checkpatch.pl.
> >
> > Replace seq_printf() calls with seq_puts() in show_cpuinfo()
> > when outputting simple constant strings without format specifiers.
> >
> > This improves performance slightly as seq_puts() avoids parsing
> > the format string.
> []
> > diff --git a/arch/loongarch/kernel/proc.c b/arch/loongarch/kernel/proc.c
> []
> > @@ -50,33 +50,49 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> []
> > -     seq_printf(m, "Features\t\t:");
> > -     if (cpu_has_cpucfg)     seq_printf(m, " cpucfg");
> > -     if (cpu_has_lam)        seq_printf(m, " lam");
> [etc]
> > +     seq_puts(m, "Features\t\t:");
> > +     if (cpu_has_cpucfg)
> > +             seq_puts(m, " cpucfg");
> > +     if (cpu_has_lam)
> > +             seq_puts(m, " lam");
>
> trivia:
>
> Not sure this is better style as it's fairly difficult to read.
>
> Maybe a macro might help, something like:
>
> #define seq_cpu_feature(m, feature) \
>         if (cpu_has_##feature) seq_puts(m, " " #feature)
>
>         seq_cpu_feature(m, cpucfg);
>         seq_cpu_feature(m, lam);
This seems like bikeshedding; the current style is OK, at least for
me. Since George has worked on this series for a very long time, let's
stop changing it and get it merged.

Huacai

>
> etc.
>


* Re: [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
@ 2026-01-28  4:26   ` Huacai Chen
  2026-02-16  7:32   ` Thomas Weißschuh
  1 sibling, 0 replies; 8+ messages in thread
From: Huacai Chen @ 2026-01-28  4:26 UTC (permalink / raw)
  To: George Guo
  Cc: guodongtai, hengqi.chen, kernel, lianyangyang, linux-kernel,
	loongarch, r, xry111

Applied, with arch_cmpxchg128_local() added; please check whether everything is OK.
https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/commit/?h=loongarch-next&id=09a3171203974a30c6d0ab66f8ef0a6d9e7e780e

Huacai

On Sat, Jan 10, 2026 at 9:11 PM George Guo <dongtai.guo@linux.dev> wrote:
>
> From: George Guo <guodongtai@kylinos.cn>
>
> Implement 128-bit atomic compare-and-exchange using LoongArch's
> LL.D/SC.Q instructions.
>
> At the same time, fix BPF scheduler test failures (scx_central scx_qmap)
> caused by kmalloc_nolock_noprof returning NULL due to missing
> 128-bit atomics. The NULL returns led to -ENOMEM errors during
> scheduler initialization, causing test cases to fail.
>
> Verified by testing with the scx_qmap scheduler (located in
> tools/sched_ext/). Building with `make` and running
> ./tools/sched_ext/build/bin/scx_qmap.
>
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=5fb750e8a9ae
> Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
> Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> ---
>  arch/loongarch/Kconfig               |  2 ++
>  arch/loongarch/include/asm/cmpxchg.h | 48 ++++++++++++++++++++++++++++
>  2 files changed, 50 insertions(+)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 730f34214519..f9845ebec1a4 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -114,6 +114,7 @@ config LOONGARCH
>         select GENERIC_TIME_VSYSCALL
>         select GPIOLIB
>         select HAS_IOPORT
> +       select HAVE_ALIGNED_STRUCT_PAGE
>         select HAVE_ARCH_AUDITSYSCALL
>         select HAVE_ARCH_BITREVERSE
>         select HAVE_ARCH_JUMP_LABEL
> @@ -130,6 +131,7 @@ config LOONGARCH
>         select HAVE_ARCH_TRANSPARENT_HUGEPAGE
>         select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
>         select HAVE_ASM_MODVERSIONS
> +       select HAVE_CMPXCHG_DOUBLE
>         select HAVE_CONTEXT_TRACKING_USER
>         select HAVE_C_RECORDMCOUNT
>         select HAVE_DEBUG_KMEMLEAK
> diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
> index 0494c2ab553e..d25e25d8fc9e 100644
> --- a/arch/loongarch/include/asm/cmpxchg.h
> +++ b/arch/loongarch/include/asm/cmpxchg.h
> @@ -8,6 +8,7 @@
>  #include <linux/bits.h>
>  #include <linux/build_bug.h>
>  #include <asm/barrier.h>
> +#include <asm/cpu-features.h>
>
>  #define __xchg_amo_asm(amswap_db, m, val)      \
>  ({                                             \
> @@ -137,6 +138,44 @@ __arch_xchg(volatile void *ptr, unsigned long x, int size)
>         __ret;                                                          \
>  })
>
> +union __u128_halves {
> +       u128 full;
> +       struct {
> +               u64 low;
> +               u64 high;
> +       };
> +};
> +
> +#define __arch_cmpxchg128(ptr, old, new)                                       \
> +({                                                                     \
> +       union __u128_halves __old, __new, __ret;                        \
> +       volatile u64 *__ptr = (volatile u64 *)(ptr);                    \
> +                                                                       \
> +       __old.full = (old);                                             \
> +       __new.full = (new);                                             \
> +                                                                       \
> +       __asm__ __volatile__(                                           \
> +       "1:   ll.d    %0, %3            # 128-bit cmpxchg low   \n"     \
> +       __WEAK_LLSC_MB                                                  \
> +       "     ld.d    %1, %4            # 128-bit cmpxchg high  \n"     \
> +       "     bne     %0, %z5, 2f                               \n"     \
> +       "     bne     %1, %z6, 2f                               \n"     \
> +       "     move    $t0, %z7                                  \n"     \
> +       "     move    $t1, %z8                                  \n"     \
> +       "     sc.q    $t0, $t1, %2                              \n"     \
> +       "     beqz    $t0, 1b                                   \n"     \
> +       "2:                                                     \n"     \
> +       __WEAK_LLSC_MB                                                  \
> +       : "=&r" (__ret.low), "=&r" (__ret.high)                         \
> +       : "r" (__ptr),                                                  \
> +         "ZC" (__ptr[0]), "m" (__ptr[1]),                              \
> +         "Jr" (__old.low), "Jr" (__old.high),                          \
> +         "Jr" (__new.low), "Jr" (__new.high)                           \
> +       : "t0", "t1", "memory");                                        \
> +                                                                       \
> +       __ret.full;                                                     \
> +})
> +
>  static inline unsigned int __cmpxchg_small(volatile void *ptr, unsigned int old,
>                                            unsigned int new, unsigned int size)
>  {
> @@ -224,6 +263,15 @@ __cmpxchg(volatile void *ptr, unsigned long old, unsigned long new, unsigned int
>         __res;                                                          \
>  })
>
> +/* cmpxchg128 */
> +#define system_has_cmpxchg128()                (cpu_has_scq)
> +
> +#define arch_cmpxchg128(ptr, o, n)                                     \
> +({                                                                     \
> +       BUILD_BUG_ON(sizeof(*(ptr)) != 16);                             \
> +       __arch_cmpxchg128(ptr, o, n);                                   \
> +})
> +
>  #ifdef CONFIG_64BIT
>  #define arch_cmpxchg64_local(ptr, o, n)                                        \
>    ({                                                                   \
> --
> 2.49.0
>
>


* Re: [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support
  2026-01-10 13:11 ` [PATCH v10 loongarch-next 2/3] LoongArch: Add 128-bit atomic cmpxchg support George Guo
  2026-01-28  4:26   ` Huacai Chen
@ 2026-02-16  7:32   ` Thomas Weißschuh
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Weißschuh @ 2026-02-16  7:32 UTC (permalink / raw)
  To: George Guo
  Cc: chenhuacai, guodongtai, hengqi.chen, kernel, lianyangyang,
	linux-kernel, loongarch, r, xry111

Hi!

On Sat, Jan 10, 2026 at 09:11:23PM +0800, George Guo wrote:
> From: George Guo <guodongtai@kylinos.cn>
> 
> Implement 128-bit atomic compare-and-exchange using LoongArch's
> LL.D/SC.Q instructions.
> 
> At the same time, fix BPF scheduler test failures (scx_central scx_qmap)
> caused by kmalloc_nolock_noprof returning NULL due to missing
> 128-bit atomics. The NULL returns led to -ENOMEM errors during
> scheduler initialization, causing test cases to fail.
> 
> Verified by testing with the scx_qmap scheduler (located in
> tools/sched_ext/). Building with `make` and running
> ./tools/sched_ext/build/bin/scx_qmap.
> 
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=5fb750e8a9ae
> Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
> Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>

(...)

> +#define __arch_cmpxchg128(ptr, old, new)					\
> +({									\
> +	union __u128_halves __old, __new, __ret;			\
> +	volatile u64 *__ptr = (volatile u64 *)(ptr);			\
> +									\
> +	__old.full = (old);                                             \
> +	__new.full = (new);						\
> +									\
> +	__asm__ __volatile__(						\
> +	"1:   ll.d    %0, %3		# 128-bit cmpxchg low	\n"	\
> +	__WEAK_LLSC_MB							\
> +	"     ld.d    %1, %4		# 128-bit cmpxchg high	\n"	\
> +	"     bne     %0, %z5, 2f				\n"	\
> +	"     bne     %1, %z6, 2f				\n"	\
> +	"     move    $t0, %z7					\n"	\
> +	"     move    $t1, %z8					\n"	\
> +	"     sc.q    $t0, $t1, %2				\n"	\
> +	"     beqz    $t0, 1b					\n"	\
> +	"2:							\n"	\
> +	__WEAK_LLSC_MB							\
> +	: "=&r" (__ret.low), "=&r" (__ret.high)				\
> +	: "r" (__ptr),							\
> +	  "ZC" (__ptr[0]), "m" (__ptr[1]),				\
> +	  "Jr" (__old.low), "Jr" (__old.high),				\
> +	  "Jr" (__new.low), "Jr" (__new.high)				\
> +	: "t0", "t1", "memory");					\
> +									\
> +	__ret.full;							\
> +})

Older versions of binutils[0] seem to lack support for the sc.q instruction.
So there should be some validation of the toolchain capabilities.

ERROR:root:{standard input}: Assembler messages:
{standard input}:4831: Error: no match insn: sc.q	$t0,$t1,$r14
{standard input}:6407: Error: no match insn: sc.q	$t0,$t1,$r23
{standard input}:10856: Error: no match insn: sc.q	$t0,$t1,$r14
make[4]: *** [../scripts/Makefile.build:289: mm/slub.o] Error 1


[0] Tested with binutils 2.41 from the GCC 13.2 kernel.org toolchain.


Thomas
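
One way the validation requested above could look on the C side, as a
rough sketch under stated assumptions: AS_HAS_SCQ is a hypothetical symbol
that the build system would define only when the assembler accepts sc.q
(it is not an existing kernel config name), and cpu_has_scq_stub stands in
for the runtime cpu_has_scq probe. In a real header the sc.q inline asm
itself would also have to sit under the same #ifdef so that an older
assembler never sees the instruction.

```c
#include <assert.h>

/*
 * Hypothetical build-time gate: AS_HAS_SCQ would be defined by the build
 * system only if the assembler can encode sc.q.
 */
#ifdef AS_HAS_SCQ
#define arch_has_cmpxchg128_asm()	1
#else
#define arch_has_cmpxchg128_asm()	0
#endif

/* Stand-in for the runtime CPUCFG2_SCQ probe result (cpu_has_scq). */
static int cpu_has_scq_stub = 1;

/* 128-bit cmpxchg is usable only if both toolchain and CPU support it. */
static int system_has_cmpxchg128_sketch(void)
{
	return arch_has_cmpxchg128_asm() && cpu_has_scq_stub;
}
```

Built without AS_HAS_SCQ, the sketch reports no 128-bit cmpxchg support
even on a CPU that has SC.Q, which is the fallback behaviour an old
toolchain would need.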
