linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc
@ 2025-12-14 13:02 Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
                   ` (7 more replies)
  0 siblings, 8 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel
  Cc: Mukesh Kumar Chaurasiya

Add support for the generic IRQ entry/exit handling on PowerPC. The
goal is to bring PowerPC in line with other architectures that already
use the common IRQ entry infrastructure, reducing duplicated code and
making it easier to share future changes in the entry/exit paths.

This has been lightly tested on ppc64le and ppc32.

The performance benchmarks are below:

perf bench syscall usec/op

| Test            | With Patch | Without Patch | % Change |
| --------------- | ---------- | ------------- | -------- |
| getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
| getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
| fork usec/op    | 833.986    | 814.809       | +2.35%   |
| execve usec/op  | 360.939    | 365.168       | -1.16%   | 


perf bench syscall ops/sec

| Test            | With Patch | Without Patch | % Change |
| --------------- | ---------- | ------------- | -------- |
| getppid ops/sec | 4,812,433  | 4,753,459     | +1.24%   |
| getpgid ops/sec | 4,847,744  | 4,724,192     | +2.61%   |
| fork ops/sec    | 1,199      | 1,227         | -2.28%   |
| execve ops/sec  | 2,770      | 2,738         | +1.16%   |
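
As a quick sanity check on the two tables, ops/sec is just the reciprocal of
usec/op, and the deltas follow from a simple relative-change formula. A small
sketch using the getppid row (the figures are copied from the tables above;
the helper name is made up for illustration):

```python
def pct_change(patched, baseline):
    # Relative change of the patched result vs. the baseline, in percent.
    return (patched - baseline) / baseline * 100.0

# getppid usec/op, with patch and without patch, from the first table.
usec_op_patched, usec_op_baseline = 0.207795, 0.210373

# A negative delta on usec/op means the patched kernel is faster here.
delta = pct_change(usec_op_patched, usec_op_baseline)

# ops/sec is 1e6 / (usec/op); this should land near the getppid
# with-patch value in the second table.
ops_patched = 1e6 / usec_op_patched
```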

IPI latency benchmark

| Metric                  | With Patch       | Without Patch    | % Change |
| ----------------------- | ---------------- | ---------------- | -------- |
| Dry-run (ns)            | 206,675.81       | 206,719.36       | -0.02%   |
| Self-IPI avg (ns)       | 1,939,991.00     | 1,976,116.15     | -1.83%   |
| Self-IPI max (ns)       | 3,533,718.93     | 3,582,650.33     | -1.37%   |
| Normal IPI avg (ns)     | 111,110,034.23   | 110,513,373.51   | +0.54%   |
| Normal IPI max (ns)     | 150,393,442.64   | 149,669,477.89   | +0.48%   |
| Broadcast IPI max (ns)  | 3,978,231,022.96 | 4,359,916,859.46 | -8.73%   |
| Broadcast lock max (ns) | 4,025,425,714.49 | 4,384,956,730.83 | -8.20%   |

That's very close to the earlier performance with the arch-specific handling.

Tests done:
 - Build and boot on ppc64le pseries.
 - Build and boot on ppc64le powernv8 powernv9 powernv10.
 - Build and boot on ppc32.
 - Performance benchmarks run with perf bench syscall on pseries.
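
For reference, numbers like the above can be gathered with perf's syscall
micro-benchmarks. The exact sub-benchmark names below are an assumption and
depend on the perf version (basic is the getppid loop):

```shell
# Assumed invocations; availability of each sub-benchmark varies by perf version.
perf bench syscall basic     # getppid loop: reports usec/op and ops/sec
perf bench syscall getpgid
perf bench syscall fork
perf bench syscall execve
```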

Changelog:

V1 -> V2
 - Fix an issue where context tracking emitted warnings about an
   incorrect context.
V1: https://lore.kernel.org/all/20251102115358.1744304-1-mkchauras@linux.ibm.com/

RFC -> PATCH V1
 - Fix ppc32 emitting KUAP lock warnings.
 - ppc64le powernv8 crash fix.
 - Review comments incorporated from previous RFC.
RFC: https://lore.kernel.org/all/20250908210235.137300-2-mchauras@linux.ibm.com/

Mukesh Kumar Chaurasiya (8):
  powerpc: rename arch_irq_disabled_regs
  powerpc: Prepare to build with generic entry/exit framework
  powerpc: introduce arch_enter_from_user_mode
  powerpc: Introduce syscall exit arch functions
  powerpc: add exit_flags field in pt_regs
  powerpc: Prepare for IRQ entry exit
  powerpc: Enable IRQ generic entry/exit path.
  powerpc: Enable Generic Entry/Exit for syscalls.

 arch/powerpc/Kconfig                    |   2 +
 arch/powerpc/include/asm/entry-common.h | 536 ++++++++++++++++++++++++
 arch/powerpc/include/asm/hw_irq.h       |   4 +-
 arch/powerpc/include/asm/interrupt.h    | 401 +++---------------
 arch/powerpc/include/asm/ptrace.h       |   3 +
 arch/powerpc/include/asm/stacktrace.h   |   6 +
 arch/powerpc/include/asm/syscall.h      |   5 +
 arch/powerpc/include/asm/thread_info.h  |   1 +
 arch/powerpc/include/uapi/asm/ptrace.h  |  14 +-
 arch/powerpc/kernel/asm-offsets.c       |   1 +
 arch/powerpc/kernel/interrupt.c         | 255 ++---------
 arch/powerpc/kernel/ptrace/ptrace.c     | 142 +------
 arch/powerpc/kernel/signal.c            |   8 +
 arch/powerpc/kernel/syscall.c           | 119 +-----
 arch/powerpc/kernel/traps.c             |   2 +-
 arch/powerpc/kernel/watchdog.c          |   2 +-
 arch/powerpc/perf/core-book3s.c         |   2 +-
 17 files changed, 685 insertions(+), 818 deletions(-)
 create mode 100644 arch/powerpc/include/asm/entry-common.h

-- 
2.52.0



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 1/8] powerpc: rename arch_irq_disabled_regs
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Rename arch_irq_disabled_regs() to regs_irqs_disabled() to align with the
naming used in the generic irqentry framework. This makes the function
available for use both in the PowerPC architecture code and in the
common entry/exit paths shared with other architectures.

This is a preparatory change for enabling the generic irqentry framework
on PowerPC.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
 arch/powerpc/include/asm/hw_irq.h    |  4 ++--
 arch/powerpc/include/asm/interrupt.h | 16 ++++++++--------
 arch/powerpc/kernel/interrupt.c      |  4 ++--
 arch/powerpc/kernel/syscall.c        |  2 +-
 arch/powerpc/kernel/traps.c          |  2 +-
 arch/powerpc/kernel/watchdog.c       |  2 +-
 arch/powerpc/perf/core-book3s.c      |  2 +-
 7 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 1078ba88efaf..8dfe36b442a5 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -393,7 +393,7 @@ static inline void do_hard_irq_enable(void)
 	__hard_irq_enable();
 }
 
-static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
+static inline bool regs_irqs_disabled(struct pt_regs *regs)
 {
 	return (regs->softe & IRQS_DISABLED);
 }
@@ -466,7 +466,7 @@ static inline bool arch_irqs_disabled(void)
 
 #define hard_irq_disable()		arch_local_irq_disable()
 
-static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
+static inline bool regs_irqs_disabled(struct pt_regs *regs)
 {
 	return !(regs->msr & MSR_EE);
 }
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index eb0e4a20b818..0e2cddf8bd21 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -172,7 +172,7 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)
 	/* Enable MSR[RI] early, to support kernel SLB and hash faults */
 #endif
 
-	if (!arch_irq_disabled_regs(regs))
+	if (!regs_irqs_disabled(regs))
 		trace_hardirqs_off();
 
 	if (user_mode(regs)) {
@@ -192,11 +192,11 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)
 			CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
 				   ct_state() != CT_STATE_IDLE);
 		INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
-		INT_SOFT_MASK_BUG_ON(regs, arch_irq_disabled_regs(regs) &&
-					   search_kernel_restart_table(regs->nip));
+		INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
+				     search_kernel_restart_table(regs->nip));
 	}
-	INT_SOFT_MASK_BUG_ON(regs, !arch_irq_disabled_regs(regs) &&
-				   !(regs->msr & MSR_EE));
+	INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
+			     !(regs->msr & MSR_EE));
 
 	booke_restore_dbcr0();
 }
@@ -298,7 +298,7 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 		 * Adjust regs->softe to be soft-masked if it had not been
 		 * reconcied (e.g., interrupt entry with MSR[EE]=0 but softe
 		 * not yet set disabled), or if it was in an implicit soft
-		 * masked state. This makes arch_irq_disabled_regs(regs)
+		 * masked state. This makes regs_irqs_disabled(regs)
 		 * behave as expected.
 		 */
 		regs->softe = IRQS_ALL_DISABLED;
@@ -372,7 +372,7 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
 
 #ifdef CONFIG_PPC64
 #ifdef CONFIG_PPC_BOOK3S
-	if (arch_irq_disabled_regs(regs)) {
+	if (regs_irqs_disabled(regs)) {
 		unsigned long rst = search_kernel_restart_table(regs->nip);
 		if (rst)
 			regs_set_return_ip(regs, rst);
@@ -661,7 +661,7 @@ void replay_soft_interrupts(void);
 
 static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
 {
-	if (!arch_irq_disabled_regs(regs))
+	if (!regs_irqs_disabled(regs))
 		local_irq_enable();
 }
 
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index e0c681d0b076..0d8fd47049a1 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -347,7 +347,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 	unsigned long ret;
 
 	BUG_ON(regs_is_unrecoverable(regs));
-	BUG_ON(arch_irq_disabled_regs(regs));
+	BUG_ON(regs_irqs_disabled(regs));
 	CT_WARN_ON(ct_state() == CT_STATE_USER);
 
 	/*
@@ -396,7 +396,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 
 	local_irq_disable();
 
-	if (!arch_irq_disabled_regs(regs)) {
+	if (!regs_irqs_disabled(regs)) {
 		/* Returning to a kernel context with local irqs enabled. */
 		WARN_ON_ONCE(!(regs->msr & MSR_EE));
 again:
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index be159ad4b77b..9f03a6263fb4 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -32,7 +32,7 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
 
 	BUG_ON(regs_is_unrecoverable(regs));
 	BUG_ON(!user_mode(regs));
-	BUG_ON(arch_irq_disabled_regs(regs));
+	BUG_ON(regs_irqs_disabled(regs));
 
 #ifdef CONFIG_PPC_PKEY
 	if (mmu_has_feature(MMU_FTR_PKEY)) {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index cb8e9357383e..629f2a2d4780 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1956,7 +1956,7 @@ DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
 	 * prevent hash faults on user addresses when reading callchains (and
 	 * looks better from an irq tracing perspective).
 	 */
-	if (IS_ENABLED(CONFIG_PPC64) && unlikely(arch_irq_disabled_regs(regs)))
+	if (IS_ENABLED(CONFIG_PPC64) && unlikely(regs_irqs_disabled(regs)))
 		performance_monitor_exception_nmi(regs);
 	else
 		performance_monitor_exception_async(regs);
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 2429cb1c7baa..6111cbbde069 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -373,7 +373,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
 	u64 tb;
 
 	/* should only arrive from kernel, with irqs disabled */
-	WARN_ON_ONCE(!arch_irq_disabled_regs(regs));
+	WARN_ON_ONCE(!regs_irqs_disabled(regs));
 
 	if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
 		return 0;
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 8b0081441f85..f7518b7e3055 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2482,7 +2482,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
 	 * will trigger a PMI after waking up from idle. Since counter values are _not_
 	 * saved/restored in idle path, can lead to below "Can't find PMC" message.
 	 */
-	if (unlikely(!found) && !arch_irq_disabled_regs(regs))
+	if (unlikely(!found) && !regs_irqs_disabled(regs))
 		printk_ratelimited(KERN_WARNING "Can't find PMC that caused IRQ\n");
 
 	/*
-- 
2.52.0




* [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  9:27   ` Christophe Leroy (CS GROUP)
  2025-12-14 13:02 ` [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

This patch introduces preparatory changes needed to support building
PowerPC with the generic entry/exit (irqentry) framework.

The following infrastructure updates are added:
 - Add a syscall_work field to struct thread_info to hold SYSCALL_WORK_* flags.
 - Provide a stub implementation of arch_syscall_is_vdso_sigreturn(),
   returning false for now.
 - Introduce on_thread_stack() helper to detect if the current stack pointer
   lies within the task’s kernel stack.

These additions enable later integration with the generic entry/exit
infrastructure while keeping existing PowerPC behavior unchanged.

No functional change is intended in this patch.
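
The on_thread_stack() test uses a common trick: two addresses fall within the
same THREAD_SIZE-aligned region exactly when XOR-ing them leaves no bits set
above the in-stack offset. A hypothetical sketch of the same arithmetic (the
16 KiB size and the addresses are made-up values for illustration; the real
THREAD_SIZE is configuration-dependent):

```python
THREAD_SIZE = 16 * 1024  # assumed stack size for this sketch

def on_thread_stack(stack_base: int, sp: int) -> bool:
    # XOR cancels the shared aligned prefix; masking off the offset bits
    # (THREAD_SIZE - 1) must then leave zero if both addresses lie in
    # the same THREAD_SIZE-aligned stack region.
    return ((stack_base ^ sp) & ~(THREAD_SIZE - 1)) == 0
```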

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
 arch/powerpc/include/asm/stacktrace.h   |  6 ++++++
 arch/powerpc/include/asm/syscall.h      |  5 +++++
 arch/powerpc/include/asm/thread_info.h  |  1 +
 4 files changed, 23 insertions(+)
 create mode 100644 arch/powerpc/include/asm/entry-common.h

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
new file mode 100644
index 000000000000..3af16d821d07
--- /dev/null
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_PPC_ENTRY_COMMON_H
+#define _ASM_PPC_ENTRY_COMMON_H
+
+#ifdef CONFIG_GENERIC_IRQ_ENTRY
+
+#include <asm/stacktrace.h>
+
+#endif /* CONFIG_GENERIC_IRQ_ENTRY */
+#endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
index 6149b53b3bc8..a81a9373d723 100644
--- a/arch/powerpc/include/asm/stacktrace.h
+++ b/arch/powerpc/include/asm/stacktrace.h
@@ -10,4 +10,10 @@
 
 void show_user_instructions(struct pt_regs *regs);
 
+static inline bool on_thread_stack(void)
+{
+	return !(((unsigned long)(current->stack) ^ current_stack_pointer)
+			& ~(THREAD_SIZE - 1));
+}
+
 #endif /* _ASM_POWERPC_STACKTRACE_H */
diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index 4b3c52ed6e9d..834fcc4f7b54 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct task_struct *task)
 	else
 		return AUDIT_ARCH_PPC64;
 }
+
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+{
+	return false;
+}
 #endif	/* _ASM_SYSCALL_H */
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b0f200aba2b3..9c8270354f0b 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -57,6 +57,7 @@ struct thread_info {
 #ifdef CONFIG_SMP
 	unsigned int	cpu;
 #endif
+	unsigned long	syscall_work;		/* SYSCALL_WORK_ flags */
 	unsigned long	local_flags;		/* private flags for thread */
 #ifdef CONFIG_LIVEPATCH_64
 	unsigned long *livepatch_sp;
-- 
2.52.0




* [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  9:38   ` Christophe Leroy (CS GROUP)
  2025-12-14 13:02 ` [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Implement the arch_enter_from_user_mode() hook required by the generic
entry/exit framework. This helper prepares the CPU state when entering
the kernel from userspace, ensuring correct handling of KUAP/KUEP,
transactional memory, and debug register state.

As part of this change, move booke_load_dbcr0() from interrupt.c to
interrupt.h so it can be used by the new helper without introducing
cross-file dependencies.

This patch contains no functional changes; it is purely preparatory for
enabling the generic syscall and interrupt entry paths on PowerPC.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/include/asm/entry-common.h | 97 +++++++++++++++++++++++++
 arch/powerpc/include/asm/interrupt.h    | 22 ++++++
 arch/powerpc/kernel/interrupt.c         | 22 ------
 3 files changed, 119 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index 3af16d821d07..093ece06ef79 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -5,7 +5,104 @@
 
 #ifdef CONFIG_GENERIC_IRQ_ENTRY
 
+#include <asm/cputime.h>
+#include <asm/interrupt.h>
 #include <asm/stacktrace.h>
+#include <asm/tm.h>
+
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
+{
+	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
+		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
+
+	BUG_ON(regs_is_unrecoverable(regs));
+	BUG_ON(!user_mode(regs));
+	BUG_ON(regs_irqs_disabled(regs));
+
+#ifdef CONFIG_PPC_PKEY
+	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
+		unsigned long amr, iamr;
+		bool flush_needed = false;
+		/*
+		 * When entering from userspace we mostly have the AMR/IAMR
+		 * different from kernel default values. Hence don't compare.
+		 */
+		amr = mfspr(SPRN_AMR);
+		iamr = mfspr(SPRN_IAMR);
+		regs->amr  = amr;
+		regs->iamr = iamr;
+		if (mmu_has_feature(MMU_FTR_KUAP)) {
+			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
+			flush_needed = true;
+		}
+		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
+			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
+			flush_needed = true;
+		}
+		if (flush_needed)
+			isync();
+	} else
+#endif
+		kuap_assert_locked();
+
+	booke_restore_dbcr0();
+
+	account_cpu_user_entry();
+
+	account_stolen_time();
+
+	/*
+	 * This is not required for the syscall exit path, but makes the
+	 * stack frame look nicer. If this was initialised in the first stack
+	 * frame, or if the unwinder was taught the first stack frame always
+	 * returns to user with IRQS_ENABLED, this store could be avoided!
+	 */
+	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
+
+	/*
+	 * If system call is called with TM active, set _TIF_RESTOREALL to
+	 * prevent RFSCV being used to return to userspace, because POWER9
+	 * TM implementation has problems with this instruction returning to
+	 * transactional state. Final register values are not relevant because
+	 * the transaction will be aborted upon return anyway. Or in the case
+	 * of unsupported_scv SIGILL fault, the return state does not much
+	 * matter because it's an edge case.
+	 */
+	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
+		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
+
+	/*
+	 * If the system call was made with a transaction active, doom it and
+	 * return without performing the system call. Unless it was an
+	 * unsupported scv vector, in which case it's treated like an illegal
+	 * instruction.
+	 */
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
+	    !trap_is_unsupported_scv(regs)) {
+		/* Enable TM in the kernel, and disable EE (for scv) */
+		hard_irq_disable();
+		mtmsr(mfmsr() | MSR_TM);
+
+		/* tabort, this dooms the transaction, nothing else */
+		asm volatile(".long 0x7c00071d | ((%0) << 16)"
+			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
+
+		/*
+		 * Userspace will never see the return value. Execution will
+		 * resume after the tbegin. of the aborted transaction with the
+		 * checkpointed register state. A context switch could occur
+		 * or signal delivered to the process before resuming the
+		 * doomed transaction context, but that should all be handled
+		 * as expected.
+		 */
+		return;
+	}
+#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+}
+
+#define arch_enter_from_user_mode arch_enter_from_user_mode
 
 #endif /* CONFIG_GENERIC_IRQ_ENTRY */
 #endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 0e2cddf8bd21..ca8a2cda9400 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -138,6 +138,28 @@ static inline void nap_adjust_return(struct pt_regs *regs)
 #endif
 }
 
+static inline void booke_load_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+	unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+	if (likely(!(dbcr0 & DBCR0_IDM)))
+		return;
+
+	/*
+	 * Check to see if the dbcr0 register is set up to debug.
+	 * Use the internal debug mode bit to do this.
+	 */
+	mtmsr(mfmsr() & ~MSR_DE);
+	if (IS_ENABLED(CONFIG_PPC32)) {
+		isync();
+		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
+	}
+	mtspr(SPRN_DBCR0, dbcr0);
+	mtspr(SPRN_DBSR, -1);
+#endif
+}
+
 static inline void booke_restore_dbcr0(void)
 {
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 0d8fd47049a1..2a09ac5dabd6 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
 	return true;
 }
 
-static notrace void booke_load_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
-	unsigned long dbcr0 = current->thread.debug.dbcr0;
-
-	if (likely(!(dbcr0 & DBCR0_IDM)))
-		return;
-
-	/*
-	 * Check to see if the dbcr0 register is set up to debug.
-	 * Use the internal debug mode bit to do this.
-	 */
-	mtmsr(mfmsr() & ~MSR_DE);
-	if (IS_ENABLED(CONFIG_PPC32)) {
-		isync();
-		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
-	}
-	mtspr(SPRN_DBCR0, dbcr0);
-	mtspr(SPRN_DBSR, -1);
-#endif
-}
-
 static notrace void check_return_regs_valid(struct pt_regs *regs)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
-- 
2.52.0




* [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
                   ` (2 preceding siblings ...)
  2025-12-14 13:02 ` [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  9:46   ` Christophe Leroy (CS GROUP)
  2025-12-14 13:02 ` [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Add PowerPC-specific implementations of the generic syscall exit hooks
used by the generic entry/exit framework:

 - arch_exit_to_user_mode_prepare()
 - arch_exit_to_user_mode()

These helpers handle user state restoration when returning from the
kernel to userspace, including FPU/VMX/VSX state, transactional memory,
KUAP restore, and per-CPU accounting.

Additionally, move check_return_regs_valid() from interrupt.c to
interrupt.h so it can be shared by the new entry/exit logic, and add
arch_do_signal_or_restart() for use with the generic entry flow.

No functional change is intended with this patch.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/include/asm/entry-common.h | 49 +++++++++++++++
 arch/powerpc/include/asm/interrupt.h    | 82 +++++++++++++++++++++++++
 arch/powerpc/kernel/interrupt.c         | 81 ------------------------
 arch/powerpc/kernel/signal.c            | 14 +++++
 4 files changed, 145 insertions(+), 81 deletions(-)

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index 093ece06ef79..e8ebd42a4e6d 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -8,6 +8,7 @@
 #include <asm/cputime.h>
 #include <asm/interrupt.h>
 #include <asm/stacktrace.h>
+#include <asm/switch_to.h>
 #include <asm/tm.h>
 
 static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
@@ -104,5 +105,53 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 
 #define arch_enter_from_user_mode arch_enter_from_user_mode
 
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+						  unsigned long ti_work)
+{
+	unsigned long mathflags;
+
+	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
+		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+		    unlikely((ti_work & _TIF_RESTORE_TM))) {
+			restore_tm_state(regs);
+		} else {
+			mathflags = MSR_FP;
+
+			if (cpu_has_feature(CPU_FTR_VSX))
+				mathflags |= MSR_VEC | MSR_VSX;
+			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
+				mathflags |= MSR_VEC;
+
+			/*
+			 * If userspace MSR has all available FP bits set,
+			 * then they are live and no need to restore. If not,
+			 * it means the regs were given up and restore_math
+			 * may decide to restore them (to avoid taking an FP
+			 * fault).
+			 */
+			if ((regs->msr & mathflags) != mathflags)
+				restore_math(regs);
+		}
+	}
+
+	check_return_regs_valid(regs);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	local_paca->tm_scratch = regs->msr;
+#endif
+	/* Restore user access locks last */
+	kuap_user_restore(regs);
+}
+
+#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
+
+static __always_inline void arch_exit_to_user_mode(void)
+{
+	booke_load_dbcr0();
+
+	account_cpu_user_exit();
+}
+
+#define arch_exit_to_user_mode arch_exit_to_user_mode
+
 #endif /* CONFIG_GENERIC_IRQ_ENTRY */
 #endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index ca8a2cda9400..77ff8e33f8cd 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -68,6 +68,8 @@
 
 #include <linux/context_tracking.h>
 #include <linux/hardirq.h>
+#include <linux/sched/debug.h> /* for show_regs */
+
 #include <asm/cputime.h>
 #include <asm/firmware.h>
 #include <asm/ftrace.h>
@@ -172,6 +174,86 @@ static inline void booke_restore_dbcr0(void)
 #endif
 }
 
+static inline void check_return_regs_valid(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+	unsigned long trap, srr0, srr1;
+	static bool warned;
+	u8 *validp;
+	char *h;
+
+	if (trap_is_scv(regs))
+		return;
+
+	trap = TRAP(regs);
+	// EE in HV mode sets HSRRs like 0xea0
+	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
+		trap = 0xea0;
+
+	switch (trap) {
+	case 0x980:
+	case INTERRUPT_H_DATA_STORAGE:
+	case 0xe20:
+	case 0xe40:
+	case INTERRUPT_HMI:
+	case 0xe80:
+	case 0xea0:
+	case INTERRUPT_H_FAC_UNAVAIL:
+	case 0x1200:
+	case 0x1500:
+	case 0x1600:
+	case 0x1800:
+		validp = &local_paca->hsrr_valid;
+		if (!READ_ONCE(*validp))
+			return;
+
+		srr0 = mfspr(SPRN_HSRR0);
+		srr1 = mfspr(SPRN_HSRR1);
+		h = "H";
+
+		break;
+	default:
+		validp = &local_paca->srr_valid;
+		if (!READ_ONCE(*validp))
+			return;
+
+		srr0 = mfspr(SPRN_SRR0);
+		srr1 = mfspr(SPRN_SRR1);
+		h = "";
+		break;
+	}
+
+	if (srr0 == regs->nip && srr1 == regs->msr)
+		return;
+
+	/*
+	 * A NMI / soft-NMI interrupt may have come in after we found
+	 * srr_valid and before the SRRs are loaded. The interrupt then
+	 * comes in and clobbers SRRs and clears srr_valid. Then we load
+	 * the SRRs here and test them above and find they don't match.
+	 *
+	 * Test validity again after that, to catch such false positives.
+	 *
+	 * This test in general will have some window for false negatives
+	 * and may not catch and fix all such cases if an NMI comes in
+	 * later and clobbers SRRs without clearing srr_valid, but hopefully
+	 * such things will get caught most of the time, statistically
+	 * enough to be able to get a warning out.
+	 */
+	if (!READ_ONCE(*validp))
+		return;
+
+	if (!data_race(warned)) {
+		data_race(warned = true);
+		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
+		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
+		show_regs(regs);
+	}
+
+	WRITE_ONCE(*validp, 0); /* fixup */
+#endif
+}
+
 static inline void interrupt_enter_prepare(struct pt_regs *regs)
 {
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 2a09ac5dabd6..f53d432f6087 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -4,7 +4,6 @@
 #include <linux/err.h>
 #include <linux/compat.h>
 #include <linux/rseq.h>
-#include <linux/sched/debug.h> /* for show_regs */
 
 #include <asm/kup.h>
 #include <asm/cputime.h>
@@ -78,86 +77,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
 	return true;
 }
 
-static notrace void check_return_regs_valid(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
-	unsigned long trap, srr0, srr1;
-	static bool warned;
-	u8 *validp;
-	char *h;
-
-	if (trap_is_scv(regs))
-		return;
-
-	trap = TRAP(regs);
-	// EE in HV mode sets HSRRs like 0xea0
-	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
-		trap = 0xea0;
-
-	switch (trap) {
-	case 0x980:
-	case INTERRUPT_H_DATA_STORAGE:
-	case 0xe20:
-	case 0xe40:
-	case INTERRUPT_HMI:
-	case 0xe80:
-	case 0xea0:
-	case INTERRUPT_H_FAC_UNAVAIL:
-	case 0x1200:
-	case 0x1500:
-	case 0x1600:
-	case 0x1800:
-		validp = &local_paca->hsrr_valid;
-		if (!READ_ONCE(*validp))
-			return;
-
-		srr0 = mfspr(SPRN_HSRR0);
-		srr1 = mfspr(SPRN_HSRR1);
-		h = "H";
-
-		break;
-	default:
-		validp = &local_paca->srr_valid;
-		if (!READ_ONCE(*validp))
-			return;
-
-		srr0 = mfspr(SPRN_SRR0);
-		srr1 = mfspr(SPRN_SRR1);
-		h = "";
-		break;
-	}
-
-	if (srr0 == regs->nip && srr1 == regs->msr)
-		return;
-
-	/*
-	 * A NMI / soft-NMI interrupt may have come in after we found
-	 * srr_valid and before the SRRs are loaded. The interrupt then
-	 * comes in and clobbers SRRs and clears srr_valid. Then we load
-	 * the SRRs here and test them above and find they don't match.
-	 *
-	 * Test validity again after that, to catch such false positives.
-	 *
-	 * This test in general will have some window for false negatives
-	 * and may not catch and fix all such cases if an NMI comes in
-	 * later and clobbers SRRs without clearing srr_valid, but hopefully
-	 * such things will get caught most of the time, statistically
-	 * enough to be able to get a warning out.
-	 */
-	if (!READ_ONCE(*validp))
-		return;
-
-	if (!data_race(warned)) {
-		data_race(warned = true);
-		printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
-		printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
-		show_regs(regs);
-	}
-
-	WRITE_ONCE(*validp, 0); /* fixup */
-#endif
-}
-
 static notrace unsigned long
 interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
 {
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index aa17e62f3754..719930cf4ae1 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -22,6 +22,11 @@
 
 #include "signal.h"
 
+/* This will be removed */
+#ifdef CONFIG_GENERIC_ENTRY
+#include <linux/entry-common.h>
+#endif /* CONFIG_GENERIC_ENTRY */
+
 #ifdef CONFIG_VSX
 unsigned long copy_fpr_to_user(void __user *to,
 			       struct task_struct *task)
@@ -368,3 +373,12 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
 		printk_ratelimited(regs->msr & MSR_64BIT ? fm64 : fm32, tsk->comm,
 				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
 }
+
+#ifdef CONFIG_GENERIC_ENTRY
+void arch_do_signal_or_restart(struct pt_regs *regs)
+{
+	BUG_ON(regs != current->thread.regs);
+	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
+	do_signal(current);
+}
+#endif /* CONFIG_GENERIC_ENTRY */
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
                   ` (3 preceding siblings ...)
  2025-12-14 13:02 ` [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  9:52   ` Christophe Leroy (CS GROUP)
  2025-12-14 13:02 ` [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Add a new field `exit_flags` to the pt_regs structure. It holds flags set
during interrupt or syscall handling that the exit-to-user path needs to
consult later.

Specifically, the `TIF_RESTOREALL` flag, stored in this field, lets the
exit routine determine whether any non-volatile GPRs (NVGPRs) were modified
and need to be restored before returning to userspace.

This provides a clean, architecture-specific mechanism for tracking
per-syscall or per-interrupt state transitions related to register restore.

Changes:
 - Add `exit_flags` and `__pt_regs_pad` to maintain 16-byte stack alignment
 - Update asm-offsets.c and ptrace.c for offset and validation
 - Update PT_* constants in uapi header to reflect the new layout

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/include/asm/ptrace.h      |  3 +++
 arch/powerpc/include/uapi/asm/ptrace.h | 14 +++++++++-----
 arch/powerpc/kernel/asm-offsets.c      |  1 +
 arch/powerpc/kernel/ptrace/ptrace.c    |  1 +
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 94aa1de2b06e..3af8a5898fe3 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -53,6 +53,9 @@ struct pt_regs
 				unsigned long esr;
 			};
 			unsigned long result;
+			unsigned long exit_flags;
+			/* Maintain 16 byte interrupt stack alignment */
+			unsigned long __pt_regs_pad[1];
 		};
 	};
 #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_KUAP)
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
index 01e630149d48..de56b216c9c5 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -55,6 +55,8 @@ struct pt_regs
 	unsigned long dar;		/* Fault registers */
 	unsigned long dsisr;		/* on 4xx/Book-E used for ESR */
 	unsigned long result;		/* Result of a system call */
+	unsigned long exit_flags;	/* System call exit flags */
+	unsigned long __pt_regs_pad[1];	/* Maintain 16 byte interrupt stack alignment */
 };
 
 #endif /* __ASSEMBLER__ */
@@ -114,10 +116,12 @@ struct pt_regs
 #define PT_DAR	41
 #define PT_DSISR 42
 #define PT_RESULT 43
-#define PT_DSCR 44
-#define PT_REGS_COUNT 44
+#define PT_EXIT_FLAGS 44
+#define PT_PAD 45
+#define PT_DSCR 46
+#define PT_REGS_COUNT 46
 
-#define PT_FPR0	48	/* each FP reg occupies 2 slots in this space */
+#define PT_FPR0	(PT_REGS_COUNT + 4)	/* each FP reg occupies 2 slots in this space */
 
 #ifndef __powerpc64__
 
@@ -129,7 +133,7 @@ struct pt_regs
 #define PT_FPSCR (PT_FPR0 + 32)	/* each FP reg occupies 1 slot in 64-bit space */
 
 
-#define PT_VR0 82	/* each Vector reg occupies 2 slots in 64-bit */
+#define PT_VR0	(PT_FPSCR + 2)	/* <82> each Vector reg occupies 2 slots in 64-bit */
 #define PT_VSCR (PT_VR0 + 32*2 + 1)
 #define PT_VRSAVE (PT_VR0 + 33*2)
 
@@ -137,7 +141,7 @@ struct pt_regs
 /*
  * Only store first 32 VSRs here. The second 32 VSRs in VR0-31
  */
-#define PT_VSR0 150	/* each VSR reg occupies 2 slots in 64-bit */
+#define PT_VSR0	(PT_VRSAVE + 2)	/* each VSR reg occupies 2 slots in 64-bit */
 #define PT_VSR31 (PT_VSR0 + 2*31)
 #endif /* __powerpc64__ */
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index a4bc80b30410..c0bb09f1db78 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -292,6 +292,7 @@ int main(void)
 	STACK_PT_REGS_OFFSET(_ESR, esr);
 	STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3);
 	STACK_PT_REGS_OFFSET(RESULT, result);
+	STACK_PT_REGS_OFFSET(EXIT_FLAGS, exit_flags);
 	STACK_PT_REGS_OFFSET(_TRAP, trap);
 #ifdef CONFIG_PPC64
 	STACK_PT_REGS_OFFSET(SOFTE, softe);
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index c6997df63287..2134b6d155ff 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -432,6 +432,7 @@ void __init pt_regs_check(void)
 	CHECK_REG(PT_DAR, dar);
 	CHECK_REG(PT_DSISR, dsisr);
 	CHECK_REG(PT_RESULT, result);
+	CHECK_REG(PT_EXIT_FLAGS, exit_flags);
 	#undef CHECK_REG
 
 	BUILD_BUG_ON(PT_REGS_COUNT != sizeof(struct user_pt_regs) / sizeof(unsigned long));
-- 
2.52.0




* [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
                   ` (4 preceding siblings ...)
  2025-12-14 13:02 ` [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  9:58   ` Christophe Leroy (CS GROUP)
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
  7 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Move interrupt entry and exit helper routines from interrupt.h into the
PowerPC-specific entry-common.h header as a preparatory step for enabling
the generic entry/exit framework.

This consolidation places all PowerPC interrupt entry/exit handling in a
single common header, aligning with the generic entry infrastructure.
The helpers provide architecture-specific handling for interrupt and NMI
entry/exit sequences, including:

 - arch_interrupt_enter/exit_prepare()
 - arch_interrupt_async_enter/exit_prepare()
 - arch_interrupt_nmi_enter/exit_prepare()
 - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
   debug register maintenance, and soft mask handling.

The functions are copied verbatim from interrupt.h to avoid functional
changes at this stage. Subsequent patches will integrate these routines
into the generic entry/exit flow.

No functional change intended.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/include/asm/entry-common.h | 422 ++++++++++++++++++++++++
 1 file changed, 422 insertions(+)

diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index e8ebd42a4e6d..e8bde4c67eaf 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -7,10 +7,432 @@
 
 #include <asm/cputime.h>
 #include <asm/interrupt.h>
+#include <asm/runlatch.h>
 #include <asm/stacktrace.h>
 #include <asm/switch_to.h>
 #include <asm/tm.h>
 
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+/*
+ * WARN/BUG is handled with a program interrupt so minimise checks here to
+ * avoid recursion and maximise the chance of getting the first oops handled.
+ */
+#define INT_SOFT_MASK_BUG_ON(regs, cond)				\
+do {									\
+	if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM)))	\
+		BUG_ON(cond);						\
+} while (0)
+#else
+#define INT_SOFT_MASK_BUG_ON(regs, cond)
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
+extern char __end_soft_masked[];
+bool search_kernel_soft_mask_table(unsigned long addr);
+unsigned long search_kernel_restart_table(unsigned long addr);
+
+DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
+
+static inline bool is_implicit_soft_masked(struct pt_regs *regs)
+{
+	if (user_mode(regs))
+		return false;
+
+	if (regs->nip >= (unsigned long)__end_soft_masked)
+		return false;
+
+	return search_kernel_soft_mask_table(regs->nip);
+}
+
+static inline void srr_regs_clobbered(void)
+{
+	local_paca->srr_valid = 0;
+	local_paca->hsrr_valid = 0;
+}
+#else
+static inline unsigned long search_kernel_restart_table(unsigned long addr)
+{
+	return 0;
+}
+
+static inline bool is_implicit_soft_masked(struct pt_regs *regs)
+{
+	return false;
+}
+
+static inline void srr_regs_clobbered(void)
+{
+}
+#endif
+
+static inline void nap_adjust_return(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_970_NAP
+	if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
+		/* Can avoid a test-and-clear because NMIs do not call this */
+		clear_thread_local_flags(_TLF_NAPPING);
+		regs_set_return_ip(regs, (unsigned long)power4_idle_nap_return);
+	}
+#endif
+}
+
+static inline void booke_load_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+	unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+	if (likely(!(dbcr0 & DBCR0_IDM)))
+		return;
+
+	/*
+	 * Check to see if the dbcr0 register is set up to debug.
+	 * Use the internal debug mode bit to do this.
+	 */
+	mtmsr(mfmsr() & ~MSR_DE);
+	if (IS_ENABLED(CONFIG_PPC32)) {
+		isync();
+		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
+	}
+	mtspr(SPRN_DBCR0, dbcr0);
+	mtspr(SPRN_DBSR, -1);
+#endif
+}
+
+static inline void booke_restore_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+	unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+	if (IS_ENABLED(CONFIG_PPC32) && unlikely(dbcr0 & DBCR0_IDM)) {
+		mtspr(SPRN_DBSR, -1);
+		mtspr(SPRN_DBCR0, global_dbcr0[smp_processor_id()]);
+	}
+#endif
+}
+
+static inline void check_return_regs_valid(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+	unsigned long trap, srr0, srr1;
+	static bool warned;
+	u8 *validp;
+	char *h;
+
+	if (trap_is_scv(regs))
+		return;
+
+	trap = TRAP(regs);
+	// EE in HV mode sets HSRRs like 0xea0
+	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
+		trap = 0xea0;
+
+	switch (trap) {
+	case 0x980:
+	case INTERRUPT_H_DATA_STORAGE:
+	case 0xe20:
+	case 0xe40:
+	case INTERRUPT_HMI:
+	case 0xe80:
+	case 0xea0:
+	case INTERRUPT_H_FAC_UNAVAIL:
+	case 0x1200:
+	case 0x1500:
+	case 0x1600:
+	case 0x1800:
+		validp = &local_paca->hsrr_valid;
+		if (!READ_ONCE(*validp))
+			return;
+
+		srr0 = mfspr(SPRN_HSRR0);
+		srr1 = mfspr(SPRN_HSRR1);
+		h = "H";
+
+		break;
+	default:
+		validp = &local_paca->srr_valid;
+		if (!READ_ONCE(*validp))
+			return;
+
+		srr0 = mfspr(SPRN_SRR0);
+		srr1 = mfspr(SPRN_SRR1);
+		h = "";
+		break;
+	}
+
+	if (srr0 == regs->nip && srr1 == regs->msr)
+		return;
+
+	/*
+	 * A NMI / soft-NMI interrupt may have come in after we found
+	 * srr_valid and before the SRRs are loaded. The interrupt then
+	 * comes in and clobbers SRRs and clears srr_valid. Then we load
+	 * the SRRs here and test them above and find they don't match.
+	 *
+	 * Test validity again after that, to catch such false positives.
+	 *
+	 * This test in general will have some window for false negatives
+	 * and may not catch and fix all such cases if an NMI comes in
+	 * later and clobbers SRRs without clearing srr_valid, but hopefully
+	 * such things will get caught most of the time, statistically
+	 * enough to be able to get a warning out.
+	 */
+	if (!READ_ONCE(*validp))
+		return;
+
+	if (!data_race(warned)) {
+		data_race(warned = true);
+		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
+		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
+		show_regs(regs);
+	}
+
+	WRITE_ONCE(*validp, 0); /* fixup */
+#endif
+}
+
+static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC64
+	irq_soft_mask_set(IRQS_ALL_DISABLED);
+
+	/*
+	 * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
+	 * Asynchronous interrupts get here with HARD_DIS set (see below), so
+	 * this enables MSR[EE] for synchronous interrupts. IRQs remain
+	 * soft-masked. The interrupt handler may later call
+	 * interrupt_cond_local_irq_enable() to achieve a regular process
+	 * context.
+	 */
+	if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
+		INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
+		__hard_irq_enable();
+	} else {
+		__hard_RI_enable();
+	}
+	/* Enable MSR[RI] early, to support kernel SLB and hash faults */
+#endif
+
+	if (!regs_irqs_disabled(regs))
+		trace_hardirqs_off();
+
+	if (user_mode(regs)) {
+		kuap_lock();
+		CT_WARN_ON(ct_state() != CT_STATE_USER);
+		user_exit_irqoff();
+
+		account_cpu_user_entry();
+		account_stolen_time();
+	} else {
+		kuap_save_and_lock(regs);
+		/*
+		 * CT_WARN_ON comes here via program_check_exception,
+		 * so avoid recursion.
+		 */
+		if (TRAP(regs) != INTERRUPT_PROGRAM)
+			CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
+				   ct_state() != CT_STATE_IDLE);
+		INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
+		INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
+				     search_kernel_restart_table(regs->nip));
+	}
+	INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
+			     !(regs->msr & MSR_EE));
+
+	booke_restore_dbcr0();
+}
+
+/*
+ * Care should be taken to note that arch_interrupt_exit_prepare and
+ * arch_interrupt_async_exit_prepare do not necessarily return immediately to
+ * regs context (e.g., if regs is usermode, we don't necessarily return to
+ * user mode). Other interrupts might be taken between here and return,
+ * context switch / preemption may occur in the exit path after this, or a
+ * signal may be delivered, etc.
+ *
+ * The real interrupt exit code is platform specific, e.g.,
+ * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
+ *
+ * However arch_interrupt_nmi_exit_prepare does return directly to regs, because
+ * NMIs do not do "exit work" or replay soft-masked interrupts.
+ */
+static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
+{
+}
+
+static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC64
+	/* Ensure arch_interrupt_enter_prepare does not enable MSR[EE] */
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+#endif
+	arch_interrupt_enter_prepare(regs);
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * RI=1 is set by arch_interrupt_enter_prepare, so this thread flags access
+	 * has to come afterward (it can cause SLB faults).
+	 */
+	if (cpu_has_feature(CPU_FTR_CTRL) &&
+	    !test_thread_local_flags(_TLF_RUNLATCH))
+		__ppc64_runlatch_on();
+#endif
+	irq_enter();
+}
+
+static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
+{
+	/*
+	 * Adjust at exit so the main handler sees the true NIA. This must
+	 * come before irq_exit() because irq_exit can enable interrupts, and
+	 * if another interrupt is taken before nap_adjust_return has run
+	 * here, then that interrupt would return directly to idle nap return.
+	 */
+	nap_adjust_return(regs);
+
+	irq_exit();
+	arch_interrupt_exit_prepare(regs);
+}
+
+struct interrupt_nmi_state {
+#ifdef CONFIG_PPC64
+	u8 irq_soft_mask;
+	u8 irq_happened;
+	u8 ftrace_enabled;
+	u64 softe;
+#endif
+};
+
+static inline bool nmi_disables_ftrace(struct pt_regs *regs)
+{
+	/* Allow DEC and PMI to be traced when they are soft-NMI */
+	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
+		if (TRAP(regs) == INTERRUPT_DECREMENTER)
+			return false;
+		if (TRAP(regs) == INTERRUPT_PERFMON)
+			return false;
+	}
+	if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
+		if (TRAP(regs) == INTERRUPT_PERFMON)
+			return false;
+	}
+
+	return true;
+}
+
+static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
+					       struct interrupt_nmi_state *state)
+{
+#ifdef CONFIG_PPC64
+	state->irq_soft_mask = local_paca->irq_soft_mask;
+	state->irq_happened = local_paca->irq_happened;
+	state->softe = regs->softe;
+
+	/*
+	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
+	 * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
+	 * because that goes through irq tracing which we don't want in NMI.
+	 */
+	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
+	if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
+		/*
+		 * Adjust regs->softe to be soft-masked if it had not been
+	 * reconciled (e.g., interrupt entry with MSR[EE]=0 but softe
+		 * not yet set disabled), or if it was in an implicit soft
+		 * masked state. This makes regs_irqs_disabled(regs)
+		 * behave as expected.
+		 */
+		regs->softe = IRQS_ALL_DISABLED;
+	}
+
+	__hard_RI_enable();
+
+	/* Don't do any per-CPU operations until interrupt state is fixed */
+
+	if (nmi_disables_ftrace(regs)) {
+		state->ftrace_enabled = this_cpu_get_ftrace_enabled();
+		this_cpu_set_ftrace_enabled(0);
+	}
+#endif
+
+	/* If data relocations are enabled, it's safe to use nmi_enter() */
+	if (mfmsr() & MSR_DR) {
+		nmi_enter();
+		return;
+	}
+
+	/*
+	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
+	 * NMI because not everything it touches is within the RMA limit.
+	 */
+	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
+	    firmware_has_feature(FW_FEATURE_LPAR) &&
+	    !radix_enabled())
+		return;
+
+	/*
+	 * Likewise, don't use it if we have some form of instrumentation (like
+	 * KASAN shadow) that is not safe to access in real mode (even on radix)
+	 */
+	if (IS_ENABLED(CONFIG_KASAN))
+		return;
+
+	/*
+	 * Likewise, do not use it in real mode if percpu first chunk is not
+	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
+	 * are chances where percpu allocation can come from vmalloc area.
+	 */
+	if (percpu_first_chunk_is_paged)
+		return;
+
+	/* Otherwise, it should be safe to call it */
+	nmi_enter();
+}
+
+static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
+					      struct interrupt_nmi_state *state)
+{
+	if (mfmsr() & MSR_DR) {
+		// nmi_exit if relocations are on
+		nmi_exit();
+	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
+		   firmware_has_feature(FW_FEATURE_LPAR) &&
+		   !radix_enabled()) {
+		// no nmi_exit for a pseries hash guest taking a real mode exception
+	} else if (IS_ENABLED(CONFIG_KASAN)) {
+		// no nmi_exit for KASAN in real mode
+	} else if (percpu_first_chunk_is_paged) {
+		// no nmi_exit if percpu first chunk is not embedded
+	} else {
+		nmi_exit();
+	}
+
+	/*
+	 * nmi does not call nap_adjust_return because nmi should not create
+	 * new work to do (must use irq_work for that).
+	 */
+
+#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S
+	if (regs_irqs_disabled(regs)) {
+		unsigned long rst = search_kernel_restart_table(regs->nip);
+
+		if (rst)
+			regs_set_return_ip(regs, rst);
+	}
+#endif
+
+	if (nmi_disables_ftrace(regs))
+		this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+
+	/* Check we didn't change the pending interrupt mask. */
+	WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
+	regs->softe = state->softe;
+	local_paca->irq_happened = state->irq_happened;
+	local_paca->irq_soft_mask = state->irq_soft_mask;
+#endif
+}
+
 static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 {
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-- 
2.52.0




* [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
                   ` (5 preceding siblings ...)
  2025-12-14 13:02 ` [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-16  6:29   ` kernel test robot
                     ` (3 more replies)
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
  7 siblings, 4 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Enable the generic IRQ entry/exit infrastructure on PowerPC by selecting
GENERIC_IRQ_ENTRY and integrating the architecture-specific interrupt
handlers with the generic entry/exit APIs.

This change replaces PowerPC’s local interrupt entry/exit handling with
calls to the generic irqentry_* helpers, aligning the architecture with
the common kernel entry model. The macros that define interrupt, async,
and NMI handlers are updated to use irqentry_enter()/irqentry_exit()
and irqentry_nmi_enter()/irqentry_nmi_exit() where applicable.

Key updates include:
 - Select GENERIC_IRQ_ENTRY in Kconfig.
 - Replace interrupt_enter/exit_prepare() with arch_interrupt_* helpers.
 - Integrate irqentry_enter()/exit() in standard and async interrupt paths.
 - Integrate irqentry_nmi_enter()/exit() in NMI handlers.
 - Remove redundant irq_enter()/irq_exit() calls now handled generically.
 - Use irqentry_exit_cond_resched() for preemption checks.

This change establishes the necessary wiring for PowerPC to use the
generic IRQ entry/exit framework while maintaining existing semantics.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/Kconfig                    |   1 +
 arch/powerpc/include/asm/entry-common.h |  66 +---
 arch/powerpc/include/asm/interrupt.h    | 499 +++---------------------
 arch/powerpc/kernel/interrupt.c         |  13 +-
 4 files changed, 74 insertions(+), 505 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e24f4d88885a..b0c602c3bbe1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -206,6 +206,7 @@ config PPC
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IOREMAP
+	select GENERIC_IRQ_ENTRY
 	select GENERIC_IRQ_SHOW
 	select GENERIC_IRQ_SHOW_LEVEL
 	select GENERIC_PCI_IOMAP		if PCI
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index e8bde4c67eaf..e2ae7416dee1 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -257,6 +257,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
  */
 static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
 {
+	if (user_mode(regs)) {
+		BUG_ON(regs_is_unrecoverable(regs));
+		BUG_ON(regs_irqs_disabled(regs));
+		/*
+		 * We don't need to restore AMR on the way back to userspace for KUAP.
+		 * AMR can only have been unlocked if we interrupted the kernel.
+		 */
+		kuap_assert_locked();
+
+		local_irq_disable();
+	}
 }
 
 static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
@@ -275,7 +286,6 @@ static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
 	    !test_thread_local_flags(_TLF_RUNLATCH))
 		__ppc64_runlatch_on();
 #endif
-	irq_enter();
 }
 
 static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
@@ -288,7 +298,6 @@ static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
 	 */
 	nap_adjust_return(regs);
 
-	irq_exit();
 	arch_interrupt_exit_prepare(regs);
 }
 
@@ -354,59 +363,11 @@ static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
 		this_cpu_set_ftrace_enabled(0);
 	}
 #endif
-
-	/* If data relocations are enabled, it's safe to use nmi_enter() */
-	if (mfmsr() & MSR_DR) {
-		nmi_enter();
-		return;
-	}
-
-	/*
-	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
-	 * NMI because not everything it touches is within the RMA limit.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-	    firmware_has_feature(FW_FEATURE_LPAR) &&
-	    !radix_enabled())
-		return;
-
-	/*
-	 * Likewise, don't use it if we have some form of instrumentation (like
-	 * KASAN shadow) that is not safe to access in real mode (even on radix)
-	 */
-	if (IS_ENABLED(CONFIG_KASAN))
-		return;
-
-	/*
-	 * Likewise, do not use it in real mode if percpu first chunk is not
-	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
-	 * are chances where percpu allocation can come from vmalloc area.
-	 */
-	if (percpu_first_chunk_is_paged)
-		return;
-
-	/* Otherwise, it should be safe to call it */
-	nmi_enter();
 }
 
 static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
 					      struct interrupt_nmi_state *state)
 {
-	if (mfmsr() & MSR_DR) {
-		// nmi_exit if relocations are on
-		nmi_exit();
-	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-		   firmware_has_feature(FW_FEATURE_LPAR) &&
-		   !radix_enabled()) {
-		// no nmi_exit for a pseries hash guest taking a real mode exception
-	} else if (IS_ENABLED(CONFIG_KASAN)) {
-		// no nmi_exit for KASAN in real mode
-	} else if (percpu_first_chunk_is_paged) {
-		// no nmi_exit if percpu first chunk is not embedded
-	} else {
-		nmi_exit();
-	}
-
 	/*
 	 * nmi does not call nap_adjust_return because nmi should not create
 	 * new work to do (must use irq_work for that).
@@ -435,6 +396,8 @@ static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
 
 static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 {
+	kuap_lock();
+
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
 
@@ -467,11 +430,8 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 	} else
 #endif
 		kuap_assert_locked();
-
 	booke_restore_dbcr0();
-
 	account_cpu_user_entry();
-
 	account_stolen_time();
 
 	/*
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 77ff8e33f8cd..e2376de85370 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -66,433 +66,10 @@
 
 #ifndef __ASSEMBLER__
 
-#include <linux/context_tracking.h>
-#include <linux/hardirq.h>
 #include <linux/sched/debug.h> /* for show_regs */
+#include <linux/irq-entry-common.h>
 
-#include <asm/cputime.h>
-#include <asm/firmware.h>
-#include <asm/ftrace.h>
 #include <asm/kprobes.h>
-#include <asm/runlatch.h>
-
-#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
-/*
- * WARN/BUG is handled with a program interrupt so minimise checks here to
- * avoid recursion and maximise the chance of getting the first oops handled.
- */
-#define INT_SOFT_MASK_BUG_ON(regs, cond)				\
-do {									\
-	if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM)))	\
-		BUG_ON(cond);						\
-} while (0)
-#else
-#define INT_SOFT_MASK_BUG_ON(regs, cond)
-#endif
-
-#ifdef CONFIG_PPC_BOOK3S_64
-extern char __end_soft_masked[];
-bool search_kernel_soft_mask_table(unsigned long addr);
-unsigned long search_kernel_restart_table(unsigned long addr);
-
-DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
-
-static inline bool is_implicit_soft_masked(struct pt_regs *regs)
-{
-	if (user_mode(regs))
-		return false;
-
-	if (regs->nip >= (unsigned long)__end_soft_masked)
-		return false;
-
-	return search_kernel_soft_mask_table(regs->nip);
-}
-
-static inline void srr_regs_clobbered(void)
-{
-	local_paca->srr_valid = 0;
-	local_paca->hsrr_valid = 0;
-}
-#else
-static inline unsigned long search_kernel_restart_table(unsigned long addr)
-{
-	return 0;
-}
-
-static inline bool is_implicit_soft_masked(struct pt_regs *regs)
-{
-	return false;
-}
-
-static inline void srr_regs_clobbered(void)
-{
-}
-#endif
-
-static inline void nap_adjust_return(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_970_NAP
-	if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
-		/* Can avoid a test-and-clear because NMIs do not call this */
-		clear_thread_local_flags(_TLF_NAPPING);
-		regs_set_return_ip(regs, (unsigned long)power4_idle_nap_return);
-	}
-#endif
-}
-
-static inline void booke_load_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
-	unsigned long dbcr0 = current->thread.debug.dbcr0;
-
-	if (likely(!(dbcr0 & DBCR0_IDM)))
-		return;
-
-	/*
-	 * Check to see if the dbcr0 register is set up to debug.
-	 * Use the internal debug mode bit to do this.
-	 */
-	mtmsr(mfmsr() & ~MSR_DE);
-	if (IS_ENABLED(CONFIG_PPC32)) {
-		isync();
-		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
-	}
-	mtspr(SPRN_DBCR0, dbcr0);
-	mtspr(SPRN_DBSR, -1);
-#endif
-}
-
-static inline void booke_restore_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
-	unsigned long dbcr0 = current->thread.debug.dbcr0;
-
-	if (IS_ENABLED(CONFIG_PPC32) && unlikely(dbcr0 & DBCR0_IDM)) {
-		mtspr(SPRN_DBSR, -1);
-		mtspr(SPRN_DBCR0, global_dbcr0[smp_processor_id()]);
-	}
-#endif
-}
-
-static inline void check_return_regs_valid(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
-	unsigned long trap, srr0, srr1;
-	static bool warned;
-	u8 *validp;
-	char *h;
-
-	if (trap_is_scv(regs))
-		return;
-
-	trap = TRAP(regs);
-	// EE in HV mode sets HSRRs like 0xea0
-	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
-		trap = 0xea0;
-
-	switch (trap) {
-	case 0x980:
-	case INTERRUPT_H_DATA_STORAGE:
-	case 0xe20:
-	case 0xe40:
-	case INTERRUPT_HMI:
-	case 0xe80:
-	case 0xea0:
-	case INTERRUPT_H_FAC_UNAVAIL:
-	case 0x1200:
-	case 0x1500:
-	case 0x1600:
-	case 0x1800:
-		validp = &local_paca->hsrr_valid;
-		if (!READ_ONCE(*validp))
-			return;
-
-		srr0 = mfspr(SPRN_HSRR0);
-		srr1 = mfspr(SPRN_HSRR1);
-		h = "H";
-
-		break;
-	default:
-		validp = &local_paca->srr_valid;
-		if (!READ_ONCE(*validp))
-			return;
-
-		srr0 = mfspr(SPRN_SRR0);
-		srr1 = mfspr(SPRN_SRR1);
-		h = "";
-		break;
-	}
-
-	if (srr0 == regs->nip && srr1 == regs->msr)
-		return;
-
-	/*
-	 * A NMI / soft-NMI interrupt may have come in after we found
-	 * srr_valid and before the SRRs are loaded. The interrupt then
-	 * comes in and clobbers SRRs and clears srr_valid. Then we load
-	 * the SRRs here and test them above and find they don't match.
-	 *
-	 * Test validity again after that, to catch such false positives.
-	 *
-	 * This test in general will have some window for false negatives
-	 * and may not catch and fix all such cases if an NMI comes in
-	 * later and clobbers SRRs without clearing srr_valid, but hopefully
-	 * such things will get caught most of the time, statistically
-	 * enough to be able to get a warning out.
-	 */
-	if (!READ_ONCE(*validp))
-		return;
-
-	if (!data_race(warned)) {
-		data_race(warned = true);
-		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
-		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
-		show_regs(regs);
-	}
-
-	WRITE_ONCE(*validp, 0); /* fixup */
-#endif
-}
-
-static inline void interrupt_enter_prepare(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC64
-	irq_soft_mask_set(IRQS_ALL_DISABLED);
-
-	/*
-	 * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
-	 * Asynchronous interrupts get here with HARD_DIS set (see below), so
-	 * this enables MSR[EE] for synchronous interrupts. IRQs remain
-	 * soft-masked. The interrupt handler may later call
-	 * interrupt_cond_local_irq_enable() to achieve a regular process
-	 * context.
-	 */
-	if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
-		INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
-		__hard_irq_enable();
-	} else {
-		__hard_RI_enable();
-	}
-	/* Enable MSR[RI] early, to support kernel SLB and hash faults */
-#endif
-
-	if (!regs_irqs_disabled(regs))
-		trace_hardirqs_off();
-
-	if (user_mode(regs)) {
-		kuap_lock();
-		CT_WARN_ON(ct_state() != CT_STATE_USER);
-		user_exit_irqoff();
-
-		account_cpu_user_entry();
-		account_stolen_time();
-	} else {
-		kuap_save_and_lock(regs);
-		/*
-		 * CT_WARN_ON comes here via program_check_exception,
-		 * so avoid recursion.
-		 */
-		if (TRAP(regs) != INTERRUPT_PROGRAM)
-			CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
-				   ct_state() != CT_STATE_IDLE);
-		INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
-		INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
-				     search_kernel_restart_table(regs->nip));
-	}
-	INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
-			     !(regs->msr & MSR_EE));
-
-	booke_restore_dbcr0();
-}
-
-/*
- * Care should be taken to note that interrupt_exit_prepare and
- * interrupt_async_exit_prepare do not necessarily return immediately to
- * regs context (e.g., if regs is usermode, we don't necessarily return to
- * user mode). Other interrupts might be taken between here and return,
- * context switch / preemption may occur in the exit path after this, or a
- * signal may be delivered, etc.
- *
- * The real interrupt exit code is platform specific, e.g.,
- * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
- *
- * However interrupt_nmi_exit_prepare does return directly to regs, because
- * NMIs do not do "exit work" or replay soft-masked interrupts.
- */
-static inline void interrupt_exit_prepare(struct pt_regs *regs)
-{
-}
-
-static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC64
-	/* Ensure interrupt_enter_prepare does not enable MSR[EE] */
-	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
-#endif
-	interrupt_enter_prepare(regs);
-#ifdef CONFIG_PPC_BOOK3S_64
-	/*
-	 * RI=1 is set by interrupt_enter_prepare, so this thread flags access
-	 * has to come afterward (it can cause SLB faults).
-	 */
-	if (cpu_has_feature(CPU_FTR_CTRL) &&
-	    !test_thread_local_flags(_TLF_RUNLATCH))
-		__ppc64_runlatch_on();
-#endif
-	irq_enter();
-}
-
-static inline void interrupt_async_exit_prepare(struct pt_regs *regs)
-{
-	/*
-	 * Adjust at exit so the main handler sees the true NIA. This must
-	 * come before irq_exit() because irq_exit can enable interrupts, and
-	 * if another interrupt is taken before nap_adjust_return has run
-	 * here, then that interrupt would return directly to idle nap return.
-	 */
-	nap_adjust_return(regs);
-
-	irq_exit();
-	interrupt_exit_prepare(regs);
-}
-
-struct interrupt_nmi_state {
-#ifdef CONFIG_PPC64
-	u8 irq_soft_mask;
-	u8 irq_happened;
-	u8 ftrace_enabled;
-	u64 softe;
-#endif
-};
-
-static inline bool nmi_disables_ftrace(struct pt_regs *regs)
-{
-	/* Allow DEC and PMI to be traced when they are soft-NMI */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
-		if (TRAP(regs) == INTERRUPT_DECREMENTER)
-		       return false;
-		if (TRAP(regs) == INTERRUPT_PERFMON)
-		       return false;
-	}
-	if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
-		if (TRAP(regs) == INTERRUPT_PERFMON)
-			return false;
-	}
-
-	return true;
-}
-
-static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
-{
-#ifdef CONFIG_PPC64
-	state->irq_soft_mask = local_paca->irq_soft_mask;
-	state->irq_happened = local_paca->irq_happened;
-	state->softe = regs->softe;
-
-	/*
-	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
-	 * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
-	 * because that goes through irq tracing which we don't want in NMI.
-	 */
-	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
-	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
-
-	if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
-		/*
-		 * Adjust regs->softe to be soft-masked if it had not been
-		 * reconcied (e.g., interrupt entry with MSR[EE]=0 but softe
-		 * not yet set disabled), or if it was in an implicit soft
-		 * masked state. This makes regs_irqs_disabled(regs)
-		 * behave as expected.
-		 */
-		regs->softe = IRQS_ALL_DISABLED;
-	}
-
-	__hard_RI_enable();
-
-	/* Don't do any per-CPU operations until interrupt state is fixed */
-
-	if (nmi_disables_ftrace(regs)) {
-		state->ftrace_enabled = this_cpu_get_ftrace_enabled();
-		this_cpu_set_ftrace_enabled(0);
-	}
-#endif
-
-	/* If data relocations are enabled, it's safe to use nmi_enter() */
-	if (mfmsr() & MSR_DR) {
-		nmi_enter();
-		return;
-	}
-
-	/*
-	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
-	 * NMI because not everything it touches is within the RMA limit.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-	    firmware_has_feature(FW_FEATURE_LPAR) &&
-	    !radix_enabled())
-		return;
-
-	/*
-	 * Likewise, don't use it if we have some form of instrumentation (like
-	 * KASAN shadow) that is not safe to access in real mode (even on radix)
-	 */
-	if (IS_ENABLED(CONFIG_KASAN))
-		return;
-
-	/*
-	 * Likewise, do not use it in real mode if percpu first chunk is not
-	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
-	 * are chances where percpu allocation can come from vmalloc area.
-	 */
-	if (percpu_first_chunk_is_paged)
-		return;
-
-	/* Otherwise, it should be safe to call it */
-	nmi_enter();
-}
-
-static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
-{
-	if (mfmsr() & MSR_DR) {
-		// nmi_exit if relocations are on
-		nmi_exit();
-	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-		   firmware_has_feature(FW_FEATURE_LPAR) &&
-		   !radix_enabled()) {
-		// no nmi_exit for a pseries hash guest taking a real mode exception
-	} else if (IS_ENABLED(CONFIG_KASAN)) {
-		// no nmi_exit for KASAN in real mode
-	} else if (percpu_first_chunk_is_paged) {
-		// no nmi_exit if percpu first chunk is not embedded
-	} else {
-		nmi_exit();
-	}
-
-	/*
-	 * nmi does not call nap_adjust_return because nmi should not create
-	 * new work to do (must use irq_work for that).
-	 */
-
-#ifdef CONFIG_PPC64
-#ifdef CONFIG_PPC_BOOK3S
-	if (regs_irqs_disabled(regs)) {
-		unsigned long rst = search_kernel_restart_table(regs->nip);
-		if (rst)
-			regs_set_return_ip(regs, rst);
-	}
-#endif
-
-	if (nmi_disables_ftrace(regs))
-		this_cpu_set_ftrace_enabled(state->ftrace_enabled);
-
-	/* Check we didn't change the pending interrupt mask. */
-	WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
-	regs->softe = state->softe;
-	local_paca->irq_happened = state->irq_happened;
-	local_paca->irq_soft_mask = state->irq_soft_mask;
-#endif
-}
 
 /*
  * Don't use noinstr here like x86, but rather add NOKPROBE_SYMBOL to each
@@ -574,11 +151,14 @@ static __always_inline void ____##func(struct pt_regs *regs);		\
 									\
 interrupt_handler void func(struct pt_regs *regs)			\
 {									\
-	interrupt_enter_prepare(regs);					\
-									\
+	irqentry_state_t state;						\
+	arch_interrupt_enter_prepare(regs);				\
+	state = irqentry_enter(regs);					\
+	instrumentation_begin();					\
 	____##func (regs);						\
-									\
-	interrupt_exit_prepare(regs);					\
+	instrumentation_end();						\
+	arch_interrupt_exit_prepare(regs);				\
+	irqentry_exit(regs, state);					\
 }									\
 NOKPROBE_SYMBOL(func);							\
 									\
@@ -608,12 +188,15 @@ static __always_inline long ____##func(struct pt_regs *regs);		\
 interrupt_handler long func(struct pt_regs *regs)			\
 {									\
 	long ret;							\
+	irqentry_state_t state;						\
 									\
-	interrupt_enter_prepare(regs);					\
-									\
+	arch_interrupt_enter_prepare(regs);				\
+	state = irqentry_enter(regs);					\
+	instrumentation_begin();					\
 	ret = ____##func (regs);					\
-									\
-	interrupt_exit_prepare(regs);					\
+	instrumentation_end();						\
+	arch_interrupt_exit_prepare(regs);				\
+	irqentry_exit(regs, state);					\
 									\
 	return ret;							\
 }									\
@@ -642,11 +225,16 @@ static __always_inline void ____##func(struct pt_regs *regs);		\
 									\
 interrupt_handler void func(struct pt_regs *regs)			\
 {									\
-	interrupt_async_enter_prepare(regs);				\
-									\
+	irqentry_state_t state;						\
+	arch_interrupt_async_enter_prepare(regs);			\
+	state = irqentry_enter(regs);					\
+	instrumentation_begin();					\
+	irq_enter_rcu();						\
 	____##func (regs);						\
-									\
-	interrupt_async_exit_prepare(regs);				\
+	irq_exit_rcu();							\
+	instrumentation_end();						\
+	arch_interrupt_async_exit_prepare(regs);			\
+	irqentry_exit(regs, state);					\
 }									\
 NOKPROBE_SYMBOL(func);							\
 									\
@@ -676,14 +264,43 @@ ____##func(struct pt_regs *regs);					\
 									\
 interrupt_handler long func(struct pt_regs *regs)			\
 {									\
-	struct interrupt_nmi_state state;				\
+	irqentry_state_t state;						\
+	struct interrupt_nmi_state nmi_state;				\
 	long ret;							\
 									\
-	interrupt_nmi_enter_prepare(regs, &state);			\
-									\
+	arch_interrupt_nmi_enter_prepare(regs, &nmi_state);		\
+	if (mfmsr() & MSR_DR) {						\
+		/* nmi_entry if relocations are on */			\
+		state = irqentry_nmi_enter(regs);			\
+	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&			\
+		   firmware_has_feature(FW_FEATURE_LPAR) &&		\
+		   !radix_enabled()) {					\
+		/* no nmi_entry for a pseries hash guest		\
+		 * taking a real mode exception */			\
+	} else if (IS_ENABLED(CONFIG_KASAN)) {				\
+		/* no nmi_entry for KASAN in real mode */		\
+	} else if (percpu_first_chunk_is_paged) {			\
+		/* no nmi_entry if percpu first chunk is not embedded */\
+	} else {							\
+		state = irqentry_nmi_enter(regs);			\
+	}								\
 	ret = ____##func (regs);					\
-									\
-	interrupt_nmi_exit_prepare(regs, &state);			\
+	arch_interrupt_nmi_exit_prepare(regs, &nmi_state);		\
+	if (mfmsr() & MSR_DR) {						\
+		/* nmi_exit if relocations are on */			\
+		irqentry_nmi_exit(regs, state);				\
+	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&			\
+		   firmware_has_feature(FW_FEATURE_LPAR) &&		\
+		   !radix_enabled()) {					\
+		/* no nmi_exit for a pseries hash guest			\
+		 * taking a real mode exception */			\
+	} else if (IS_ENABLED(CONFIG_KASAN)) {				\
+		/* no nmi_exit for KASAN in real mode */		\
+	} else if (percpu_first_chunk_is_paged) {			\
+		/* no nmi_exit if percpu first chunk is not embedded */	\
+	} else {							\
+		irqentry_nmi_exit(regs, state);				\
+	}								\
 									\
 	return ret;							\
 }									\
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index f53d432f6087..7f67f0b9d627 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -24,10 +24,6 @@
 unsigned long global_dbcr0[NR_CPUS];
 #endif
 
-#if defined(CONFIG_PREEMPT_DYNAMIC)
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#endif
-
 #ifdef CONFIG_PPC_BOOK3S_64
 DEFINE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
 static inline bool exit_must_hard_disable(void)
@@ -297,13 +293,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 		/* Returning to a kernel context with local irqs enabled. */
 		WARN_ON_ONCE(!(regs->msr & MSR_EE));
 again:
-		if (need_irq_preemption()) {
-			/* Return to preemptible kernel context */
-			if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
-				if (preempt_count() == 0)
-					preempt_schedule_irq();
-			}
-		}
+		if (need_irq_preemption())
+			irqentry_exit_cond_resched();
 
 		check_return_regs_valid(regs);
 
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
                   ` (6 preceding siblings ...)
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
@ 2025-12-14 13:02 ` Mukesh Kumar Chaurasiya
  2025-12-14 16:20   ` Segher Boessenkool
                     ` (3 more replies)
  7 siblings, 4 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-14 13:02 UTC (permalink / raw)
  To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>

Convert the PowerPC syscall entry and exit paths to use the generic
entry/exit framework by selecting GENERIC_ENTRY and integrating with
the common syscall handling routines.

This change transitions PowerPC away from its custom syscall entry and
exit code to use the generic helpers such as:
 - syscall_enter_from_user_mode()
 - syscall_exit_to_user_mode()

As part of this migration:
 - The architecture now selects GENERIC_ENTRY in Kconfig.
 - Old tracing, seccomp, and audit handling in ptrace.c is removed in
   favor of generic entry infrastructure.
 - interrupt.c and syscall.c are simplified to delegate context
   management and user exit handling to the generic entry path.
 - The new pt_regs field `exit_flags` introduced earlier is now used
   to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).

This aligns PowerPC with the common entry code used by other
architectures and reduces duplicated logic around syscall tracing,
context tracking, and signal handling.

The performance benchmarks from perf bench syscall are below:

perf bench syscall usec/op

| Test            | With Patch | Without Patch | % Change |
| --------------- | ---------- | ------------- | -------- |
| getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
| getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
| fork usec/op    | 833.986    | 814.809       | +2.35%   |
| execve usec/op  | 360.939    | 365.168       | -1.16%   |

perf bench syscall ops/sec

| Test            | With Patch | Without Patch | % Change |
| --------------- | ---------- | ------------- | -------- |
| getppid ops/sec | 4,812,433  | 4,753,459     | +1.24%   |
| getpgid ops/sec | 4,847,744  | 4,724,192     | +2.61%   |
| fork ops/sec    | 1,199      | 1,227         | -2.28%   |
| execve ops/sec  | 2,770      | 2,738         | +1.16%   |

IPI latency benchmark

| Metric                  | With Patch       | Without Patch    | % Change |
| ----------------------- | ---------------- | ---------------- | -------- |
| Dry-run (ns)            | 206,675.81       | 206,719.36       | -0.02%   |
| Self-IPI avg (ns)       | 1,939,991.00     | 1,976,116.15     | -1.83%   |
| Self-IPI max (ns)       | 3,533,718.93     | 3,582,650.33     | -1.37%   |
| Normal IPI avg (ns)     | 111,110,034.23   | 110,513,373.51   | +0.54%   |
| Normal IPI max (ns)     | 150,393,442.64   | 149,669,477.89   | +0.48%   |
| Broadcast IPI max (ns)  | 3,978,231,022.96 | 4,359,916,859.46 | -8.73%   |
| Broadcast lock max (ns) | 4,025,425,714.49 | 4,384,956,730.83 | -8.20%   |

That's very close to the earlier performance with the arch-specific handling.

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
 arch/powerpc/Kconfig                    |   1 +
 arch/powerpc/include/asm/entry-common.h |   5 +-
 arch/powerpc/kernel/interrupt.c         | 139 +++++++----------------
 arch/powerpc/kernel/ptrace/ptrace.c     | 141 ------------------------
 arch/powerpc/kernel/signal.c            |  10 +-
 arch/powerpc/kernel/syscall.c           | 119 +-------------------
 6 files changed, 49 insertions(+), 366 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b0c602c3bbe1..a4330775b254 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -203,6 +203,7 @@ config PPC
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_CPU_VULNERABILITIES	if PPC_BARRIER_NOSPEC
 	select GENERIC_EARLY_IOREMAP
+	select GENERIC_ENTRY
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_IDLE_POLL_SETUP
 	select GENERIC_IOREMAP
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index e2ae7416dee1..77129174f882 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -3,7 +3,7 @@
 #ifndef _ASM_PPC_ENTRY_COMMON_H
 #define _ASM_PPC_ENTRY_COMMON_H
 
-#ifdef CONFIG_GENERIC_IRQ_ENTRY
+#ifdef CONFIG_GENERIC_ENTRY
 
 #include <asm/cputime.h>
 #include <asm/interrupt.h>
@@ -217,9 +217,6 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
 
 	if (user_mode(regs)) {
 		kuap_lock();
-		CT_WARN_ON(ct_state() != CT_STATE_USER);
-		user_exit_irqoff();
-
 		account_cpu_user_entry();
 		account_stolen_time();
 	} else {
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 7f67f0b9d627..7d5cd4b5a610 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 
 #include <linux/context_tracking.h>
+#include <linux/entry-common.h>
 #include <linux/err.h>
 #include <linux/compat.h>
 #include <linux/rseq.h>
@@ -73,79 +74,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
 	return true;
 }
 
-static notrace unsigned long
-interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
-{
-	unsigned long ti_flags;
-
-again:
-	ti_flags = read_thread_flags();
-	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
-		local_irq_enable();
-		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
-			schedule();
-		} else {
-			/*
-			 * SIGPENDING must restore signal handler function
-			 * argument GPRs, and some non-volatiles (e.g., r1).
-			 * Restore all for now. This could be made lighter.
-			 */
-			if (ti_flags & _TIF_SIGPENDING)
-				ret |= _TIF_RESTOREALL;
-			do_notify_resume(regs, ti_flags);
-		}
-		local_irq_disable();
-		ti_flags = read_thread_flags();
-	}
-
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
-		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
-				unlikely((ti_flags & _TIF_RESTORE_TM))) {
-			restore_tm_state(regs);
-		} else {
-			unsigned long mathflags = MSR_FP;
-
-			if (cpu_has_feature(CPU_FTR_VSX))
-				mathflags |= MSR_VEC | MSR_VSX;
-			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
-				mathflags |= MSR_VEC;
-
-			/*
-			 * If userspace MSR has all available FP bits set,
-			 * then they are live and no need to restore. If not,
-			 * it means the regs were given up and restore_math
-			 * may decide to restore them (to avoid taking an FP
-			 * fault).
-			 */
-			if ((regs->msr & mathflags) != mathflags)
-				restore_math(regs);
-		}
-	}
-
-	check_return_regs_valid(regs);
-
-	user_enter_irqoff();
-	if (!prep_irq_for_enabled_exit(true)) {
-		user_exit_irqoff();
-		local_irq_enable();
-		local_irq_disable();
-		goto again;
-	}
-
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-	local_paca->tm_scratch = regs->msr;
-#endif
-
-	booke_load_dbcr0();
-
-	account_cpu_user_exit();
-
-	/* Restore user access locks last */
-	kuap_user_restore(regs);
-
-	return ret;
-}
-
 /*
  * This should be called after a syscall returns, with r3 the return value
  * from the syscall. If this function returns non-zero, the system call
@@ -160,17 +88,12 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 					   long scv)
 {
 	unsigned long ti_flags;
-	unsigned long ret = 0;
 	bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
 
-	CT_WARN_ON(ct_state() == CT_STATE_USER);
-
 	kuap_assert_locked();
 
 	regs->result = r3;
-
-	/* Check whether the syscall is issued inside a restartable sequence */
-	rseq_syscall(regs);
+	regs->exit_flags = 0;
 
 	ti_flags = read_thread_flags();
 
@@ -183,7 +106,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 
 	if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
 		if (ti_flags & _TIF_RESTOREALL)
-			ret = _TIF_RESTOREALL;
+			regs->exit_flags = _TIF_RESTOREALL;
 		else
 			regs->gpr[3] = r3;
 		clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
@@ -192,18 +115,28 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
 	}
 
 	if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
-		do_syscall_trace_leave(regs);
-		ret |= _TIF_RESTOREALL;
+		regs->exit_flags |= _TIF_RESTOREALL;
 	}
 
-	local_irq_disable();
-	ret = interrupt_exit_user_prepare_main(ret, regs);
+	syscall_exit_to_user_mode(regs);
+
+again:
+	user_enter_irqoff();
+	if (!prep_irq_for_enabled_exit(true)) {
+		user_exit_irqoff();
+		local_irq_enable();
+		local_irq_disable();
+		goto again;
+	}
+
+	/* Restore user access locks last */
+	kuap_user_restore(regs);
 
 #ifdef CONFIG_PPC64
-	regs->exit_result = ret;
+	regs->exit_result = regs->exit_flags;
 #endif
 
-	return ret;
+	return regs->exit_flags;
 }
 
 #ifdef CONFIG_PPC64
@@ -223,13 +156,16 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
 	set_kuap(AMR_KUAP_BLOCKED);
 #endif
 
-	trace_hardirqs_off();
-	user_exit_irqoff();
-	account_cpu_user_entry();
-
-	BUG_ON(!user_mode(regs));
+again:
+	user_enter_irqoff();
+	if (!prep_irq_for_enabled_exit(true)) {
+		user_exit_irqoff();
+		local_irq_enable();
+		local_irq_disable();
+		goto again;
+	}
 
-	regs->exit_result = interrupt_exit_user_prepare_main(regs->exit_result, regs);
+	regs->exit_result |= regs->exit_flags;
 
 	return regs->exit_result;
 }
@@ -241,7 +177,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 
 	BUG_ON(regs_is_unrecoverable(regs));
 	BUG_ON(regs_irqs_disabled(regs));
-	CT_WARN_ON(ct_state() == CT_STATE_USER);
 
 	/*
 	 * We don't need to restore AMR on the way back to userspace for KUAP.
@@ -250,8 +185,21 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 	kuap_assert_locked();
 
 	local_irq_disable();
+	regs->exit_flags = 0;
+again:
+	check_return_regs_valid(regs);
+	user_enter_irqoff();
+	if (!prep_irq_for_enabled_exit(true)) {
+		user_exit_irqoff();
+		local_irq_enable();
+		local_irq_disable();
+		goto again;
+	}
+
+	/* Restore user access locks last */
+	kuap_user_restore(regs);
 
-	ret = interrupt_exit_user_prepare_main(0, regs);
+	ret = regs->exit_flags;
 
 #ifdef CONFIG_PPC64
 	regs->exit_result = ret;
@@ -293,8 +241,6 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 		/* Returning to a kernel context with local irqs enabled. */
 		WARN_ON_ONCE(!(regs->msr & MSR_EE));
 again:
-		if (need_irq_preemption())
-			irqentry_exit_cond_resched();
 
 		check_return_regs_valid(regs);
 
@@ -364,7 +310,6 @@ notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
 #endif
 
 	trace_hardirqs_off();
-	user_exit_irqoff();
 	account_cpu_user_entry();
 
 	BUG_ON(!user_mode(regs));
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 2134b6d155ff..316d4f5ead8e 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -21,9 +21,6 @@
 #include <asm/switch_to.h>
 #include <asm/debug.h>
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
 #include "ptrace-decl.h"
 
 /*
@@ -195,144 +192,6 @@ long arch_ptrace(struct task_struct *child, long request,
 	return ret;
 }
 
-#ifdef CONFIG_SECCOMP
-static int do_seccomp(struct pt_regs *regs)
-{
-	if (!test_thread_flag(TIF_SECCOMP))
-		return 0;
-
-	/*
-	 * The ABI we present to seccomp tracers is that r3 contains
-	 * the syscall return value and orig_gpr3 contains the first
-	 * syscall parameter. This is different to the ptrace ABI where
-	 * both r3 and orig_gpr3 contain the first syscall parameter.
-	 */
-	regs->gpr[3] = -ENOSYS;
-
-	/*
-	 * We use the __ version here because we have already checked
-	 * TIF_SECCOMP. If this fails, there is nothing left to do, we
-	 * have already loaded -ENOSYS into r3, or seccomp has put
-	 * something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
-	 */
-	if (__secure_computing())
-		return -1;
-
-	/*
-	 * The syscall was allowed by seccomp, restore the register
-	 * state to what audit expects.
-	 * Note that we use orig_gpr3, which means a seccomp tracer can
-	 * modify the first syscall parameter (in orig_gpr3) and also
-	 * allow the syscall to proceed.
-	 */
-	regs->gpr[3] = regs->orig_gpr3;
-
-	return 0;
-}
-#else
-static inline int do_seccomp(struct pt_regs *regs) { return 0; }
-#endif /* CONFIG_SECCOMP */
-
-/**
- * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
- * @regs: the pt_regs of the task to trace (current)
- *
- * Performs various types of tracing on syscall entry. This includes seccomp,
- * ptrace, syscall tracepoints and audit.
- *
- * The pt_regs are potentially visible to userspace via ptrace, so their
- * contents is ABI.
- *
- * One or more of the tracers may modify the contents of pt_regs, in particular
- * to modify arguments or even the syscall number itself.
- *
- * It's also possible that a tracer can choose to reject the system call. In
- * that case this function will return an illegal syscall number, and will put
- * an appropriate return value in regs->r3.
- *
- * Return: the (possibly changed) syscall number.
- */
-long do_syscall_trace_enter(struct pt_regs *regs)
-{
-	u32 flags;
-
-	flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
-
-	if (flags) {
-		int rc = ptrace_report_syscall_entry(regs);
-
-		if (unlikely(flags & _TIF_SYSCALL_EMU)) {
-			/*
-			 * A nonzero return code from
-			 * ptrace_report_syscall_entry() tells us to prevent
-			 * the syscall execution, but we are not going to
-			 * execute it anyway.
-			 *
-			 * Returning -1 will skip the syscall execution. We want
-			 * to avoid clobbering any registers, so we don't goto
-			 * the skip label below.
-			 */
-			return -1;
-		}
-
-		if (rc) {
-			/*
-			 * The tracer decided to abort the syscall. Note that
-			 * the tracer may also just change regs->gpr[0] to an
-			 * invalid syscall number, that is handled below on the
-			 * exit path.
-			 */
-			goto skip;
-		}
-	}
-
-	/* Run seccomp after ptrace; allow it to set gpr[3]. */
-	if (do_seccomp(regs))
-		return -1;
-
-	/* Avoid trace and audit when syscall is invalid. */
-	if (regs->gpr[0] >= NR_syscalls)
-		goto skip;
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_enter(regs, regs->gpr[0]);
-
-	if (!is_32bit_task())
-		audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
-				    regs->gpr[5], regs->gpr[6]);
-	else
-		audit_syscall_entry(regs->gpr[0],
-				    regs->gpr[3] & 0xffffffff,
-				    regs->gpr[4] & 0xffffffff,
-				    regs->gpr[5] & 0xffffffff,
-				    regs->gpr[6] & 0xffffffff);
-
-	/* Return the possibly modified but valid syscall number */
-	return regs->gpr[0];
-
-skip:
-	/*
-	 * If we are aborting explicitly, or if the syscall number is
-	 * now invalid, set the return value to -ENOSYS.
-	 */
-	regs->gpr[3] = -ENOSYS;
-	return -1;
-}
-
-void do_syscall_trace_leave(struct pt_regs *regs)
-{
-	int step;
-
-	audit_syscall_exit(regs);
-
-	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
-		trace_sys_exit(regs, regs->result);
-
-	step = test_thread_flag(TIF_SINGLESTEP);
-	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
-		ptrace_report_syscall_exit(regs, step);
-}
-
 void __init pt_regs_check(void);
 
 /*
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 719930cf4ae1..9f1847b4742e 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -6,6 +6,7 @@
  *    Extracted from signal_32.c and signal_64.c
  */
 
+#include <linux/entry-common.h>
 #include <linux/resume_user_mode.h>
 #include <linux/signal.h>
 #include <linux/uprobes.h>
@@ -22,11 +23,6 @@
 
 #include "signal.h"
 
-/* This will be removed */
-#ifdef CONFIG_GENERIC_ENTRY
-#include <linux/entry-common.h>
-#endif /* CONFIG_GENERIC_ENTRY */
-
 #ifdef CONFIG_VSX
 unsigned long copy_fpr_to_user(void __user *to,
 			       struct task_struct *task)
@@ -374,11 +370,9 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
 				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
 }
 
-#ifdef CONFIG_GENERIC_ENTRY
 void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	BUG_ON(regs != current->thread.regs);
-	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
+	regs->exit_flags |= _TIF_RESTOREALL;
 	do_signal(current);
 }
-#endif /* CONFIG_GENERIC_ENTRY */
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index 9f03a6263fb4..df1c9a8d62bc 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -3,6 +3,7 @@
 #include <linux/compat.h>
 #include <linux/context_tracking.h>
 #include <linux/randomize_kstack.h>
+#include <linux/entry-common.h>
 
 #include <asm/interrupt.h>
 #include <asm/kup.h>
@@ -18,124 +19,10 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
 	long ret;
 	syscall_fn f;
 
-	kuap_lock();
-
 	add_random_kstack_offset();
+	r0 = syscall_enter_from_user_mode(regs, r0);
 
-	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
-		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
-
-	trace_hardirqs_off(); /* finish reconciling */
-
-	CT_WARN_ON(ct_state() == CT_STATE_KERNEL);
-	user_exit_irqoff();
-
-	BUG_ON(regs_is_unrecoverable(regs));
-	BUG_ON(!user_mode(regs));
-	BUG_ON(regs_irqs_disabled(regs));
-
-#ifdef CONFIG_PPC_PKEY
-	if (mmu_has_feature(MMU_FTR_PKEY)) {
-		unsigned long amr, iamr;
-		bool flush_needed = false;
-		/*
-		 * When entering from userspace we mostly have the AMR/IAMR
-		 * different from kernel default values. Hence don't compare.
-		 */
-		amr = mfspr(SPRN_AMR);
-		iamr = mfspr(SPRN_IAMR);
-		regs->amr  = amr;
-		regs->iamr = iamr;
-		if (mmu_has_feature(MMU_FTR_KUAP)) {
-			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
-			flush_needed = true;
-		}
-		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
-			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
-			flush_needed = true;
-		}
-		if (flush_needed)
-			isync();
-	} else
-#endif
-		kuap_assert_locked();
-
-	booke_restore_dbcr0();
-
-	account_cpu_user_entry();
-
-	account_stolen_time();
-
-	/*
-	 * This is not required for the syscall exit path, but makes the
-	 * stack frame look nicer. If this was initialised in the first stack
-	 * frame, or if the unwinder was taught the first stack frame always
-	 * returns to user with IRQS_ENABLED, this store could be avoided!
-	 */
-	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
-
-	/*
-	 * If system call is called with TM active, set _TIF_RESTOREALL to
-	 * prevent RFSCV being used to return to userspace, because POWER9
-	 * TM implementation has problems with this instruction returning to
-	 * transactional state. Final register values are not relevant because
-	 * the transaction will be aborted upon return anyway. Or in the case
-	 * of unsupported_scv SIGILL fault, the return state does not much
-	 * matter because it's an edge case.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
-			unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
-		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
-
-	/*
-	 * If the system call was made with a transaction active, doom it and
-	 * return without performing the system call. Unless it was an
-	 * unsupported scv vector, in which case it's treated like an illegal
-	 * instruction.
-	 */
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
-	    !trap_is_unsupported_scv(regs)) {
-		/* Enable TM in the kernel, and disable EE (for scv) */
-		hard_irq_disable();
-		mtmsr(mfmsr() | MSR_TM);
-
-		/* tabort, this dooms the transaction, nothing else */
-		asm volatile(".long 0x7c00071d | ((%0) << 16)"
-				:: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
-
-		/*
-		 * Userspace will never see the return value. Execution will
-		 * resume after the tbegin. of the aborted transaction with the
-		 * checkpointed register state. A context switch could occur
-		 * or signal delivered to the process before resuming the
-		 * doomed transaction context, but that should all be handled
-		 * as expected.
-		 */
-		return -ENOSYS;
-	}
-#endif // CONFIG_PPC_TRANSACTIONAL_MEM
-
-	local_irq_enable();
-
-	if (unlikely(read_thread_flags() & _TIF_SYSCALL_DOTRACE)) {
-		if (unlikely(trap_is_unsupported_scv(regs))) {
-			/* Unsupported scv vector */
-			_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
-			return regs->gpr[3];
-		}
-		/*
-		 * We use the return value of do_syscall_trace_enter() as the
-		 * syscall number. If the syscall was rejected for any reason
-		 * do_syscall_trace_enter() returns an invalid syscall number
-		 * and the test against NR_syscalls will fail and the return
-		 * value to be used is in regs->gpr[3].
-		 */
-		r0 = do_syscall_trace_enter(regs);
-		if (unlikely(r0 >= NR_syscalls))
-			return regs->gpr[3];
-
-	} else if (unlikely(r0 >= NR_syscalls)) {
+	if (unlikely(r0 >= NR_syscalls)) {
 		if (unlikely(trap_is_unsupported_scv(regs))) {
 			/* Unsupported scv vector */
 			_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
-- 
2.52.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
@ 2025-12-14 16:20   ` Segher Boessenkool
  2025-12-15 18:32     ` Mukesh Kumar Chaurasiya
  2025-12-15 20:27   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2025-12-14 16:20 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya
  Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, tglx, thomas.weissschuh, peterz, menglong8.dong,
	bigeasy, namcao, kan.liang, mingo, atrajeev, mark.barnett,
	linuxppc-dev, linux-kernel

Hi!

On Sun, Dec 14, 2025 at 06:32:44PM +0530, Mukesh Kumar Chaurasiya wrote:
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | fork usec/op    | 833.986    | 814.809       | +2.35%   |

What causes this regression, did you investigate?  Maybe there is
something simple you can do to avoid this degradation :-)  All other
numbers look just fine :-)

> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | fork ops/sec    | 1,199      | 1,227         | -2.28%   |

(Same thing seen from another side)


Segher



* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 16:20   ` Segher Boessenkool
@ 2025-12-15 18:32     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-15 18:32 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
	oleg, kees, luto, wad, thuth, sshegde, charlie, macro, akpm, ldv,
	deller, ankur.a.arora, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel

On Sun, Dec 14, 2025 at 10:20:12AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Sun, Dec 14, 2025 at 06:32:44PM +0530, Mukesh Kumar Chaurasiya wrote:
> > 
> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | fork usec/op    | 833.986    | 814.809       | +2.35%   |
> 
> What causes this regression, did you investigate?  Maybe there is
> something simple you can do to avoid this degradation :-)  All other
> numbers look just fine :-)
> 
Hey,

I ran this multiple times and took the average this time; the results
are below:
===========================================
Without patch
===========================================
╰─❯ perf bench syscall fork 
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.514 [sec]

     851.415300 usecs/op
          1,174 ops/sec

╰─❯ perf bench syscall fork
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.572 [sec]

     857.293600 usecs/op
          1,166 ops/sec

╰─❯ perf bench syscall fork 
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.695 [sec]

     869.536500 usecs/op
          1,150 ops/sec
===========================================
With patch
===========================================
╰─❯ perf bench syscall fork 
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.482 [sec]

     848.241300 usecs/op
          1,178 ops/sec

╰─❯ perf bench syscall fork 
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.623 [sec]

     862.389000 usecs/op
          1,159 ops/sec

╰─❯ perf bench syscall fork 
# Running 'syscall/fork' benchmark:
# Executed 10,000 fork() calls
     Total time: 8.530 [sec]

     853.037200 usecs/op
          1,172 ops/sec
===========================================
Average:
===========================================
With Patch:
854.4964 usecs/op
1169 ops/sec

Without patch:
859.4151 usecs/op
1163 ops/sec

That's a ~0.5% improvement when I take the average across the runs.
We can treat this as within run-to-run variation and conclude that
there is no regression here.
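[Editorial note: the averages above can be sanity-checked from the three
individual runs quoted in this reply. A minimal sketch (recomputing gives
854.5558 usecs/op with the patch, slightly different from the 854.4964
quoted, but the conclusion is unchanged):]

```python
# Recompute the per-run averages from the three benchmark runs quoted above.
# Values are usecs/op from each 'perf bench syscall fork' invocation.
without_patch = [851.415300, 857.293600, 869.536500]
with_patch = [848.241300, 862.389000, 853.037200]

avg_without = sum(without_patch) / len(without_patch)
avg_with = sum(with_patch) / len(with_patch)

print(f"without patch: {avg_without:.4f} usecs/op")  # 859.4151
print(f"with patch:    {avg_with:.4f} usecs/op")     # 854.5558
# Difference is well under 1%, i.e. within run-to-run noise.
print(f"delta: {(avg_without - avg_with) / avg_without * 100:.2f}%")
```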

Regards,
Mukesh

> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | fork ops/sec    | 1,199      | 1,227         | -2.28%   |
> 
> (Same thing seen from another side)
> 
> 
> Segher



* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
  2025-12-14 16:20   ` Segher Boessenkool
@ 2025-12-15 20:27   ` kernel test robot
  2025-12-16 15:08     ` Mukesh Kumar Chaurasiya
  2025-12-16  6:41   ` Christophe Leroy (CS GROUP)
  2025-12-16 11:01   ` Christophe Leroy (CS GROUP)
  3 siblings, 1 reply; 37+ messages in thread
From: kernel test robot @ 2025-12-15 20:27 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
	oleg, kees, luto, wad, mchauras, thuth, sshegde, charlie, macro,
	akpm, ldv, deller, ankur.a.arora, segher, tglx, thomas.weissschuh,
	peterz, menglong8.dong, bigeasy, namcao, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel
  Cc: oe-kbuild-all

Hi Mukesh,

kernel test robot noticed the following build errors:

[auto build test ERROR on powerpc/next]
[also build test ERROR on powerpc/fixes linus/master v6.19-rc1 next-20251215]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:    https://lore.kernel.org/r/20251214130245.43664-9-mkchauras%40linux.ibm.com
patch subject: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
config: powerpc-randconfig-001-20251215 (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 9.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512160453.iO9WNjrm-lkp@intel.com/

All errors (new ones prefixed by >>):

   powerpc-linux-ld: init/main.o: in function `do_trace_event_raw_event_initcall_level':
   include/trace/events/initcall.h:10: undefined reference to `memcpy'
   powerpc-linux-ld: init/main.o: in function `repair_env_string':
   init/main.c:512: undefined reference to `memmove'
   powerpc-linux-ld: init/do_mounts.o: in function `do_mount_root':
   init/do_mounts.c:162: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `start_thread':
   arch/powerpc/kernel/process.c:1919: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `__set_breakpoint':
   arch/powerpc/kernel/process.c:880: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `arch_dup_task_struct':
   arch/powerpc/kernel/process.c:1724: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `copy_thread':
   arch/powerpc/kernel/process.c:1801: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/process.c:1812: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/signal.o: in function `do_signal':
   arch/powerpc/kernel/signal.c:247: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/time.o: in function `register_decrementer_clockevent':
>> arch/powerpc/kernel/time.c:834: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/time.o: in function `platform_device_register_resndata':
>> include/linux/platform_device.h:158: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/prom.o: in function `move_device_tree':
>> arch/powerpc/kernel/prom.c:134: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/setup-common.o: in function `probe_machine':
>> arch/powerpc/kernel/setup-common.c:646: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `user_regset_copyin':
>> include/linux/regset.h:276: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_write':
   include/linux/regset.h:42: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `gpr_get':
>> arch/powerpc/kernel/ptrace/ptrace-view.c:230: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_zero':
>> include/linux/regset.h:30: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `gpr32_get_common':
   arch/powerpc/kernel/ptrace/ptrace-view.c:707: undefined reference to `memcpy'
>> powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.c:708: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.c:710: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_zero':
>> include/linux/regset.h:30: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-novsx.o: in function `membuf_write':
   include/linux/regset.h:42: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/optprobes.o: in function `can_optimize':
>> arch/powerpc/kernel/optprobes.c:71: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_map_magic_page':
>> arch/powerpc/kernel/kvm.c:407: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_patch_ins_mtmsrd':
>> arch/powerpc/kernel/kvm.c:178: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_patch_ins_mtmsr':
   arch/powerpc/kernel/kvm.c:231: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `epapr_hypercall0_1':
>> arch/powerpc/include/asm/epapr_hcalls.h:511: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/mm/mem.o: in function `execmem_arch_setup':
>> arch/powerpc/mm/mem.c:423: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_15':
>> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_14':
>> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_13':
>> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/mm/init-common.o:arch/powerpc/mm/init-common.c:81: more undefined references to `memset' follow
   powerpc-linux-ld: arch/powerpc/lib/pmem.o: in function `memcpy_flushcache':
>> arch/powerpc/lib/pmem.c:84: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/sysdev/fsl_mpic_err.o: in function `mpic_setup_error_int':
>> arch/powerpc/sysdev/fsl_mpic_err.c:70: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/platforms/8xx/pic.o: in function `irq_domain_create_linear':
>> include/linux/irqdomain.h:405: undefined reference to `memset'
   powerpc-linux-ld: arch/powerpc/platforms/8xx/cpm1.o: in function `cpm1_clk_setup':
   arch/powerpc/platforms/8xx/cpm1.c:251: undefined reference to `memcpy'
   powerpc-linux-ld: arch/powerpc/platforms/8xx/cpm1-ic.o: in function `irq_domain_create_linear':
   include/linux/irqdomain.h:405: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `do_trace_event_raw_event_task_newtask':
   include/trace/events/task.h:9: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `do_trace_event_raw_event_task_rename':
   include/trace/events/task.h:34: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `copy_struct_from_user':
   include/linux/uaccess.h:396: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `copy_clone_args_from_user':
   kernel/fork.c:2800: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `mm_init':
   kernel/fork.c:1044: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `bitmap_zero':
   include/linux/bitmap.h:238: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `pgd_alloc':
   arch/powerpc/include/asm/nohash/pgalloc.h:26: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `__kmem_cache_create':
   include/linux/slab.h:379: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `arch_dup_task_struct':
   kernel/fork.c:854: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `mm_alloc':
   kernel/fork.c:1120: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `replace_mm_exe_file':
   kernel/fork.c:1238: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `copy_process':
   kernel/fork.c:2030: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `posix_cputimers_init':
   include/linux/posix-timers.h:103: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `copy_sighand':
   kernel/fork.c:1618: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `copy_signal':
   kernel/fork.c:1687: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `dup_mm':
   kernel/fork.c:1483: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/fork.o: in function `create_io_thread':
   kernel/fork.c:2549: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `kernel_thread':
   kernel/fork.c:2661: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `user_mode_thread':
   kernel/fork.c:2678: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `sys_fork':
   kernel/fork.c:2692: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o: in function `sys_vfork':
   kernel/fork.c:2707: undefined reference to `memset'
   powerpc-linux-ld: kernel/fork.o:kernel/fork.c:2740: more undefined references to `memset' follow
   powerpc-linux-ld: kernel/softirq.o: in function `do_trace_event_raw_event_irq_handler_entry':
   include/trace/events/irq.h:53: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/resource.o: in function `find_next_iomem_res':
   kernel/resource.c:372: undefined reference to `memset'
   powerpc-linux-ld: kernel/resource.o: in function `__request_region_locked':
   kernel/resource.c:1261: undefined reference to `memset'
   powerpc-linux-ld: kernel/resource.o: in function `reserve_setup':
   kernel/resource.c:1757: undefined reference to `memset'
   powerpc-linux-ld: kernel/resource.c:1760: undefined reference to `memset'
   powerpc-linux-ld: kernel/sysctl.o: in function `proc_put_long':
   kernel/sysctl.c:339: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sysctl.o: in function `_proc_do_string':
   kernel/sysctl.c:127: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sysctl.o: in function `proc_get_long':
   kernel/sysctl.c:284: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sysctl.o: in function `bitmap_copy':
   include/linux/bitmap.h:259: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sysctl.o: in function `proc_do_static_key':
   kernel/sysctl.c:1433: undefined reference to `memset'
   powerpc-linux-ld: kernel/capability.o: in function `__do_sys_capset':
   kernel/capability.c:218: undefined reference to `memset'
   powerpc-linux-ld: kernel/ptrace.o: in function `syscall_set_arguments':
   arch/powerpc/include/asm/syscall.h:127: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/ptrace.o: in function `ptrace_get_syscall_info':
   kernel/ptrace.c:998: undefined reference to `memset'
   powerpc-linux-ld: kernel/ptrace.o: in function `copy_siginfo':
   include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/user.o: in function `ratelimit_state_init':
   include/linux/ratelimit.h:12: undefined reference to `memset'
   powerpc-linux-ld: kernel/user.o: in function `__kmem_cache_create':
   include/linux/slab.h:379: undefined reference to `memset'
   powerpc-linux-ld: kernel/signal.o: in function `do_trace_event_raw_event_signal_generate':
   include/trace/events/signal.h:50: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/signal.o: in function `clear_siginfo':
   include/linux/signal.h:23: undefined reference to `memset'
   powerpc-linux-ld: kernel/signal.o: in function `copy_siginfo':
   include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/signal.o: in function `do_sigaltstack':
   kernel/signal.c:4396: undefined reference to `memset'
   powerpc-linux-ld: kernel/signal.o: in function `copy_siginfo':
   include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/signal.o: in function `signals_init':
   kernel/signal.c:5011: undefined reference to `memset'
   powerpc-linux-ld: kernel/sys.o: in function `override_release':
   kernel/sys.c:1331: undefined reference to `memset'
   powerpc-linux-ld: kernel/sys.o: in function `__do_sys_newuname':
   kernel/sys.c:1356: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sys.o: in function `__do_sys_uname':
   kernel/sys.c:1380: undefined reference to `memcpy'
   powerpc-linux-ld: kernel/sys.o: in function `prctl_set_auxv':


vim +18 include/linux/signal.h

^1da177e4c3f415 Linus Torvalds    2005-04-16  14  
ae7795bc6187a15 Eric W. Biederman 2018-09-25  15  static inline void copy_siginfo(kernel_siginfo_t *to,
ae7795bc6187a15 Eric W. Biederman 2018-09-25  16  				const kernel_siginfo_t *from)
ca9eb49aa9562ea James Hogan       2016-02-08  17  {
ca9eb49aa9562ea James Hogan       2016-02-08 @18  	memcpy(to, from, sizeof(*to));
ca9eb49aa9562ea James Hogan       2016-02-08  19  }
ca9eb49aa9562ea James Hogan       2016-02-08  20  
ae7795bc6187a15 Eric W. Biederman 2018-09-25  21  static inline void clear_siginfo(kernel_siginfo_t *info)
8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  22  {
8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24 @23  	memset(info, 0, sizeof(*info));
8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  24  }
8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  25  
4ce5f9c9e754691 Eric W. Biederman 2018-09-25  26  #define SI_EXPANSION_SIZE (sizeof(struct siginfo) - sizeof(struct kernel_siginfo))
4ce5f9c9e754691 Eric W. Biederman 2018-09-25  27  
fa4751f454e6b51 Eric W. Biederman 2020-05-05  28  static inline void copy_siginfo_to_external(siginfo_t *to,
fa4751f454e6b51 Eric W. Biederman 2020-05-05  29  					    const kernel_siginfo_t *from)
fa4751f454e6b51 Eric W. Biederman 2020-05-05  30  {
fa4751f454e6b51 Eric W. Biederman 2020-05-05 @31  	memcpy(to, from, sizeof(*from));
fa4751f454e6b51 Eric W. Biederman 2020-05-05 @32  	memset(((char *)to) + sizeof(struct kernel_siginfo), 0,
fa4751f454e6b51 Eric W. Biederman 2020-05-05  33  		SI_EXPANSION_SIZE);
fa4751f454e6b51 Eric W. Biederman 2020-05-05  34  }
fa4751f454e6b51 Eric W. Biederman 2020-05-05  35  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
@ 2025-12-16  6:29   ` kernel test robot
  2025-12-16 15:02     ` Mukesh Kumar Chaurasiya
  2025-12-16 10:43   ` Christophe Leroy (CS GROUP)
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: kernel test robot @ 2025-12-16  6:29 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
	oleg, kees, luto, wad, mchauras, thuth, sshegde, charlie, macro,
	akpm, ldv, deller, ankur.a.arora, segher, tglx, thomas.weissschuh,
	peterz, menglong8.dong, bigeasy, namcao, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel
  Cc: oe-kbuild-all

Hi Mukesh,

kernel test robot noticed the following build warnings:

[auto build test WARNING on powerpc/next]
[also build test WARNING on powerpc/fixes linus/master v6.19-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:    https://lore.kernel.org/r/20251214130245.43664-8-mkchauras%40linux.ibm.com
patch subject: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
config: powerpc-randconfig-r072-20251215 (https://download.01.org/0day-ci/archive/20251216/202512161441.xlMhHxvl-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 8.5.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512161441.xlMhHxvl-lkp@intel.com/

smatch warnings:
arch/powerpc/include/asm/entry-common.h:433 arch_enter_from_user_mode() warn: inconsistent indenting

vim +433 arch/powerpc/include/asm/entry-common.h

2b0f05f77f11f8 Mukesh Kumar Chaurasiya 2025-12-14  396  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  397  static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  398  {
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  399  	kuap_lock();
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  400  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  401  	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  402  		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  403  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  404  	BUG_ON(regs_is_unrecoverable(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  405  	BUG_ON(!user_mode(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  406  	BUG_ON(regs_irqs_disabled(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  407  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  408  #ifdef CONFIG_PPC_PKEY
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  409  	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  410  		unsigned long amr, iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  411  		bool flush_needed = false;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  412  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  413  		 * When entering from userspace we mostly have the AMR/IAMR
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  414  		 * different from kernel default values. Hence don't compare.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  415  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  416  		amr = mfspr(SPRN_AMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  417  		iamr = mfspr(SPRN_IAMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  418  		regs->amr  = amr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  419  		regs->iamr = iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  420  		if (mmu_has_feature(MMU_FTR_KUAP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  421  			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  422  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  423  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  424  		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  425  			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  426  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  427  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  428  		if (flush_needed)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  429  			isync();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  430  	} else
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  431  #endif
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  432  		kuap_assert_locked();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14 @433  	booke_restore_dbcr0();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  434  	account_cpu_user_entry();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  435  	account_stolen_time();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  436  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  437  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  438  	 * This is not required for the syscall exit path, but makes the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  439  	 * stack frame look nicer. If this was initialised in the first stack
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  440  	 * frame, or if the unwinder was taught the first stack frame always
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  441  	 * returns to user with IRQS_ENABLED, this store could be avoided!
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  442  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  443  	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  444  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  445  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  446  	 * If system call is called with TM active, set _TIF_RESTOREALL to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  447  	 * prevent RFSCV being used to return to userspace, because POWER9
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  448  	 * TM implementation has problems with this instruction returning to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  449  	 * transactional state. Final register values are not relevant because
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  450  	 * the transaction will be aborted upon return anyway. Or in the case
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  451  	 * of unsupported_scv SIGILL fault, the return state does not much
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  452  	 * matter because it's an edge case.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  453  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  454  	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  455  	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  456  		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  457  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  458  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  459  	 * If the system call was made with a transaction active, doom it and
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  460  	 * return without performing the system call. Unless it was an
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  461  	 * unsupported scv vector, in which case it's treated like an illegal
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  462  	 * instruction.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  463  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  464  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  465  	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  466  	    !trap_is_unsupported_scv(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  467  		/* Enable TM in the kernel, and disable EE (for scv) */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  468  		hard_irq_disable();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  469  		mtmsr(mfmsr() | MSR_TM);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  470  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  471  		/* tabort, this dooms the transaction, nothing else */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  472  		asm volatile(".long 0x7c00071d | ((%0) << 16)"
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  473  			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  474  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  475  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  476  		 * Userspace will never see the return value. Execution will
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  477  		 * resume after the tbegin. of the aborted transaction with the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  478  		 * checkpointed register state. A context switch could occur
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  479  		 * or signal delivered to the process before resuming the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  480  		 * doomed transaction context, but that should all be handled
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  481  		 * as expected.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  482  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  483  		return;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  484  	}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  485  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  486  }
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  487  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
  2025-12-14 16:20   ` Segher Boessenkool
  2025-12-15 20:27   ` kernel test robot
@ 2025-12-16  6:41   ` Christophe Leroy (CS GROUP)
  2025-12-16 15:09     ` Mukesh Kumar Chaurasiya
  2025-12-16 11:01   ` Christophe Leroy (CS GROUP)
  3 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  6:41 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Convert the PowerPC syscall entry and exit paths to use the generic
> entry/exit framework by selecting GENERIC_ENTRY and integrating with
> the common syscall handling routines.
> 
> This change transitions PowerPC away from its custom syscall entry and
> exit code to use the generic helpers such as:
>   - syscall_enter_from_user_mode()
>   - syscall_exit_to_user_mode()
> 
> As part of this migration:
>   - The architecture now selects GENERIC_ENTRY in Kconfig.
>   - Old tracing, seccomp, and audit handling in ptrace.c is removed in
>     favor of generic entry infrastructure.
>   - interrupt.c and syscall.c are simplified to delegate context
>     management and user exit handling to the generic entry path.
>   - The new pt_regs field `exit_flags` introduced earlier is now used
>     to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).
> 
> This aligns PowerPC with the common entry code used by other
> architectures and reduces duplicated logic around syscall tracing,
> context tracking, and signal handling.
> 
> The performance benchmarks from perf bench basic syscall are below:
> 
> perf bench syscall usec/op
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
> | getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
> | fork usec/op    | 833.986    | 814.809       | +2.35%   |
> | execve usec/op  | 360.939    | 365.168       | -1.16%   |
> 
> perf bench syscall ops/sec
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | getppid ops/sec | 48,12,433  | 47,53,459     | +1.24%   |
> | getpgid ops/sec | 48,47,744  | 47,24,192     | +2.61%   |
> | fork ops/sec    | 1,199      | 1,227         | -2.28%   |
> | execve ops/sec  | 2,770      | 2,738         | +1.16%   |

I get about 2% degradation on powerpc 8xx, and the result is quite
stable across repeated runs of the test.

'perf bench syscall all' on powerpc 8xx (usec per op):

| Test            | With Patch | Without Patch | % Change |
| --------------- | ---------- | ------------- | -------- |
| getppid usec/op | 2.63       | 2.63          | ~ 0%     |
| getpgid usec/op | 2.26       | 2.22          | +2.80%   |
| fork usec/op    | 15300      | 15000         | +2.00%   |
| execve usec/op  | 45700      | 45200         | +1.10%   |

Christophe


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework
  2025-12-14 13:02 ` [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
@ 2025-12-16  9:27   ` Christophe Leroy (CS GROUP)
  2025-12-16 14:42     ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  9:27 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> This patch introduces preparatory changes needed to support building
> PowerPC with the generic entry/exit (irqentry) framework.
> 
> The following infrastructure updates are added:
>   - Add a syscall_work field to struct thread_info to hold SYSCALL_WORK_* flags.
>   - Provide a stub implementation of arch_syscall_is_vdso_sigreturn(),
>     returning false for now.
>   - Introduce on_thread_stack() helper to detect if the current stack pointer
>     lies within the task’s kernel stack.
> 
> These additions enable later integration with the generic entry/exit
> infrastructure while keeping existing PowerPC behavior unchanged.
> 
> No functional change is intended in this patch.
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
>   arch/powerpc/include/asm/stacktrace.h   |  6 ++++++
>   arch/powerpc/include/asm/syscall.h      |  5 +++++
>   arch/powerpc/include/asm/thread_info.h  |  1 +
>   4 files changed, 23 insertions(+)
>   create mode 100644 arch/powerpc/include/asm/entry-common.h
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> new file mode 100644
> index 000000000000..3af16d821d07
> --- /dev/null
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ASM_PPC_ENTRY_COMMON_H
> +#define _ASM_PPC_ENTRY_COMMON_H
> +
> +#ifdef CONFIG_GENERIC_IRQ_ENTRY

Why do you need this #ifdef? I see no reason; the build works fine
without it.

For the time being, CONFIG_GENERIC_IRQ_ENTRY is never selected by
powerpc, meaning you are introducing dead code. If it is really needed,
it would be more explicit to use "#if 0".

> +
> +#include <asm/stacktrace.h>
> +
> +#endif /* CONFIG_GENERIC_IRQ_ENTRY */
> +#endif /* _ASM_PPC_ENTRY_COMMON_H */
> diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
> index 6149b53b3bc8..a81a9373d723 100644
> --- a/arch/powerpc/include/asm/stacktrace.h
> +++ b/arch/powerpc/include/asm/stacktrace.h
> @@ -10,4 +10,10 @@
>   
>   void show_user_instructions(struct pt_regs *regs);
>   
> +static inline bool on_thread_stack(void)

Shouldn't it be __always_inline?

> +{
> +	return !(((unsigned long)(current->stack) ^ current_stack_pointer)
> +			& ~(THREAD_SIZE - 1));
> +}
> +
>   #endif /* _ASM_POWERPC_STACKTRACE_H */
> diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
> index 4b3c52ed6e9d..834fcc4f7b54 100644
> --- a/arch/powerpc/include/asm/syscall.h
> +++ b/arch/powerpc/include/asm/syscall.h
> @@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct task_struct *task)
>   	else
>   		return AUDIT_ARCH_PPC64;
>   }
> +
> +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
> +{
> +	return false;
> +}
>   #endif	/* _ASM_SYSCALL_H */
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index b0f200aba2b3..9c8270354f0b 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -57,6 +57,7 @@ struct thread_info {
>   #ifdef CONFIG_SMP
>   	unsigned int	cpu;
>   #endif
> +	unsigned long	syscall_work;		/* SYSCALL_WORK_ flags */

This is not used, why add it here ?

>   	unsigned long	local_flags;		/* private flags for thread */
>   #ifdef CONFIG_LIVEPATCH_64
>   	unsigned long *livepatch_sp;



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode
  2025-12-14 13:02 ` [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
@ 2025-12-16  9:38   ` Christophe Leroy (CS GROUP)
  2025-12-16 14:47     ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  9:38 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Implement the arch_enter_from_user_mode() hook required by the generic
> entry/exit framework. This helper prepares the CPU state when entering
> the kernel from userspace, ensuring correct handling of KUAP/KUEP,
> transactional memory, and debug register state.
> 
> As part of this change, move booke_load_dbcr0() from interrupt.c to
> interrupt.h so it can be used by the new helper without introducing
> cross-file dependencies.
> 
> This patch contains no functional changes, it is purely preparatory for
> enabling the generic syscall and interrupt entry path on PowerPC.
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/entry-common.h | 97 +++++++++++++++++++++++++
>   arch/powerpc/include/asm/interrupt.h    | 22 ++++++
>   arch/powerpc/kernel/interrupt.c         | 22 ------
>   3 files changed, 119 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index 3af16d821d07..093ece06ef79 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -5,7 +5,104 @@
>   
>   #ifdef CONFIG_GENERIC_IRQ_ENTRY

This #ifdef still seems unnecessary.

>   
> +#include <asm/cputime.h>
> +#include <asm/interrupt.h>
>   #include <asm/stacktrace.h>
> +#include <asm/tm.h>
> +
> +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> +{
> +	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> +		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
> +
> +	BUG_ON(regs_is_unrecoverable(regs));
> +	BUG_ON(!user_mode(regs));
> +	BUG_ON(regs_irqs_disabled(regs));
> +
> +#ifdef CONFIG_PPC_PKEY
> +	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
> +		unsigned long amr, iamr;
> +		bool flush_needed = false;
> +		/*
> +		 * When entering from userspace we mostly have the AMR/IAMR
> +		 * different from kernel default values. Hence don't compare.
> +		 */
> +		amr = mfspr(SPRN_AMR);
> +		iamr = mfspr(SPRN_IAMR);
> +		regs->amr  = amr;
> +		regs->iamr = iamr;
> +		if (mmu_has_feature(MMU_FTR_KUAP)) {
> +			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
> +			flush_needed = true;
> +		}
> +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
> +			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
> +			flush_needed = true;
> +		}
> +		if (flush_needed)
> +			isync();
> +	} else
> +#endif
> +		kuap_assert_locked();

This construct is odd; can you do something about it?

> +
> +	booke_restore_dbcr0();
> +
> +	account_cpu_user_entry();
> +
> +	account_stolen_time();
> +
> +	/*
> +	 * This is not required for the syscall exit path, but makes the
> +	 * stack frame look nicer. If this was initialised in the first stack
> +	 * frame, or if the unwinder was taught the first stack frame always
> +	 * returns to user with IRQS_ENABLED, this store could be avoided!
> +	 */
> +	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
> +
> +	/*
> +	 * If system call is called with TM active, set _TIF_RESTOREALL to
> +	 * prevent RFSCV being used to return to userspace, because POWER9
> +	 * TM implementation has problems with this instruction returning to
> +	 * transactional state. Final register values are not relevant because
> +	 * the transaction will be aborted upon return anyway. Or in the case
> +	 * of unsupported_scv SIGILL fault, the return state does not much
> +	 * matter because it's an edge case.
> +	 */
> +	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> +	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
> +		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
> +
> +	/*
> +	 * If the system call was made with a transaction active, doom it and
> +	 * return without performing the system call. Unless it was an
> +	 * unsupported scv vector, in which case it's treated like an illegal
> +	 * instruction.
> +	 */
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
> +	    !trap_is_unsupported_scv(regs)) {
> +		/* Enable TM in the kernel, and disable EE (for scv) */
> +		hard_irq_disable();
> +		mtmsr(mfmsr() | MSR_TM);
> +
> +		/* tabort, this dooms the transaction, nothing else */
> +		asm volatile(".long 0x7c00071d | ((%0) << 16)"
> +			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
> +
> +		/*
> +		 * Userspace will never see the return value. Execution will
> +		 * resume after the tbegin. of the aborted transaction with the
> +		 * checkpointed register state. A context switch could occur
> +		 * or signal delivered to the process before resuming the
> +		 * doomed transaction context, but that should all be handled
> +		 * as expected.
> +		 */
> +		return;
> +	}
> +#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> +}
> +
> +#define arch_enter_from_user_mode arch_enter_from_user_mode
>   
>   #endif /* CONFIG_GENERIC_IRQ_ENTRY */
>   #endif /* _ASM_PPC_ENTRY_COMMON_H */
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index 0e2cddf8bd21..ca8a2cda9400 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -138,6 +138,28 @@ static inline void nap_adjust_return(struct pt_regs *regs)
>   #endif
>   }
>   
> +static inline void booke_load_dbcr0(void)

It was a notrace function in interrupt.c.
Should it be __always_inline now?

Christophe

> +{
> +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> +	unsigned long dbcr0 = current->thread.debug.dbcr0;
> +
> +	if (likely(!(dbcr0 & DBCR0_IDM)))
> +		return;
> +
> +	/*
> +	 * Check to see if the dbcr0 register is set up to debug.
> +	 * Use the internal debug mode bit to do this.
> +	 */
> +	mtmsr(mfmsr() & ~MSR_DE);
> +	if (IS_ENABLED(CONFIG_PPC32)) {
> +		isync();
> +		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> +	}
> +	mtspr(SPRN_DBCR0, dbcr0);
> +	mtspr(SPRN_DBSR, -1);
> +#endif
> +}
> +
>   static inline void booke_restore_dbcr0(void)
>   {
>   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 0d8fd47049a1..2a09ac5dabd6 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
>   	return true;
>   }
>   
> -static notrace void booke_load_dbcr0(void)
> -{
> -#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> -	unsigned long dbcr0 = current->thread.debug.dbcr0;
> -
> -	if (likely(!(dbcr0 & DBCR0_IDM)))
> -		return;
> -
> -	/*
> -	 * Check to see if the dbcr0 register is set up to debug.
> -	 * Use the internal debug mode bit to do this.
> -	 */
> -	mtmsr(mfmsr() & ~MSR_DE);
> -	if (IS_ENABLED(CONFIG_PPC32)) {
> -		isync();
> -		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> -	}
> -	mtspr(SPRN_DBCR0, dbcr0);
> -	mtspr(SPRN_DBSR, -1);
> -#endif
> -}
> -
>   static notrace void check_return_regs_valid(struct pt_regs *regs)
>   {
>   #ifdef CONFIG_PPC_BOOK3S_64



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions
  2025-12-14 13:02 ` [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
@ 2025-12-16  9:46   ` Christophe Leroy (CS GROUP)
  2025-12-16 14:51     ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  9:46 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Add PowerPC-specific implementations of the generic syscall exit hooks
> used by the generic entry/exit framework:
> 
>   - arch_exit_to_user_mode_work_prepare()
>   - arch_exit_to_user_mode_work()
> 
> These helpers handle user state restoration when returning from the
> kernel to userspace, including FPU/VMX/VSX state, transactional memory,
> KUAP restore, and per-CPU accounting.
> 
> Additionally, move check_return_regs_valid() from interrupt.c to
> interrupt.h so it can be shared by the new entry/exit logic, and add
> arch_do_signal_or_restart() for use with the generic entry flow.
> 
> No functional change is intended with this patch.
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/entry-common.h | 49 +++++++++++++++
>   arch/powerpc/include/asm/interrupt.h    | 82 +++++++++++++++++++++++++
>   arch/powerpc/kernel/interrupt.c         | 81 ------------------------
>   arch/powerpc/kernel/signal.c            | 14 +++++
>   4 files changed, 145 insertions(+), 81 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index 093ece06ef79..e8ebd42a4e6d 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -8,6 +8,7 @@
>   #include <asm/cputime.h>
>   #include <asm/interrupt.h>
>   #include <asm/stacktrace.h>
> +#include <asm/switch_to.h>
>   #include <asm/tm.h>
>   
>   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> @@ -104,5 +105,53 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
>   
>   #define arch_enter_from_user_mode arch_enter_from_user_mode
>   
> +static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
> +						  unsigned long ti_work)
> +{
> +	unsigned long mathflags;
> +
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
> +		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> +		    unlikely((ti_work & _TIF_RESTORE_TM))) {
> +			restore_tm_state(regs);
> +		} else {
> +			mathflags = MSR_FP;
> +
> +			if (cpu_has_feature(CPU_FTR_VSX))
> +				mathflags |= MSR_VEC | MSR_VSX;
> +			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
> +				mathflags |= MSR_VEC;
> +
> +			/*
> +			 * If userspace MSR has all available FP bits set,
> +			 * then they are live and no need to restore. If not,
> +			 * it means the regs were given up and restore_math
> +			 * may decide to restore them (to avoid taking an FP
> +			 * fault).
> +			 */
> +			if ((regs->msr & mathflags) != mathflags)
> +				restore_math(regs);
> +		}
> +	}
> +
> +	check_return_regs_valid(regs);
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +	local_paca->tm_scratch = regs->msr;
> +#endif
> +	/* Restore user access locks last */
> +	kuap_user_restore(regs);
> +}
> +
> +#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
> +
> +static __always_inline void arch_exit_to_user_mode(void)
> +{
> +	booke_load_dbcr0();
> +
> +	account_cpu_user_exit();
> +}
> +
> +#define arch_exit_to_user_mode arch_exit_to_user_mode
> +
>   #endif /* CONFIG_GENERIC_IRQ_ENTRY */
>   #endif /* _ASM_PPC_ENTRY_COMMON_H */
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index ca8a2cda9400..77ff8e33f8cd 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -68,6 +68,8 @@
>   
>   #include <linux/context_tracking.h>
>   #include <linux/hardirq.h>
> +#include <linux/sched/debug.h> /* for show_regs */
> +
>   #include <asm/cputime.h>
>   #include <asm/firmware.h>
>   #include <asm/ftrace.h>
> @@ -172,6 +174,86 @@ static inline void booke_restore_dbcr0(void)
>   #endif
>   }
>   
> +static inline void check_return_regs_valid(struct pt_regs *regs)

This was previously a notrace function. Should it be marked
__always_inline instead of just inline?

> +{
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	unsigned long trap, srr0, srr1;
> +	static bool warned;
> +	u8 *validp;
> +	char *h;
> +
> +	if (trap_is_scv(regs))
> +		return;
> +
> +	trap = TRAP(regs);
> +	// EE in HV mode sets HSRRs like 0xea0
> +	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
> +		trap = 0xea0;
> +
> +	switch (trap) {
> +	case 0x980:
> +	case INTERRUPT_H_DATA_STORAGE:
> +	case 0xe20:
> +	case 0xe40:
> +	case INTERRUPT_HMI:
> +	case 0xe80:
> +	case 0xea0:
> +	case INTERRUPT_H_FAC_UNAVAIL:
> +	case 0x1200:
> +	case 0x1500:
> +	case 0x1600:
> +	case 0x1800:
> +		validp = &local_paca->hsrr_valid;
> +		if (!READ_ONCE(*validp))
> +			return;
> +
> +		srr0 = mfspr(SPRN_HSRR0);
> +		srr1 = mfspr(SPRN_HSRR1);
> +		h = "H";
> +
> +		break;
> +	default:
> +		validp = &local_paca->srr_valid;
> +		if (!READ_ONCE(*validp))
> +			return;
> +
> +		srr0 = mfspr(SPRN_SRR0);
> +		srr1 = mfspr(SPRN_SRR1);
> +		h = "";
> +		break;
> +	}
> +
> +	if (srr0 == regs->nip && srr1 == regs->msr)
> +		return;
> +
> +	/*
> +	 * A NMI / soft-NMI interrupt may have come in after we found
> +	 * srr_valid and before the SRRs are loaded. The interrupt then
> +	 * comes in and clobbers SRRs and clears srr_valid. Then we load
> +	 * the SRRs here and test them above and find they don't match.
> +	 *
> +	 * Test validity again after that, to catch such false positives.
> +	 *
> +	 * This test in general will have some window for false negatives
> +	 * and may not catch and fix all such cases if an NMI comes in
> +	 * later and clobbers SRRs without clearing srr_valid, but hopefully
> +	 * such things will get caught most of the time, statistically
> +	 * enough to be able to get a warning out.
> +	 */
> +	if (!READ_ONCE(*validp))
> +		return;
> +
> +	if (!data_race(warned)) {
> +		data_race(warned = true);
> +		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
> +		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
> +		show_regs(regs);
> +	}
> +
> +	WRITE_ONCE(*validp, 0); /* fixup */
> +#endif
> +}
> +
>   static inline void interrupt_enter_prepare(struct pt_regs *regs)
>   {
>   #ifdef CONFIG_PPC64
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 2a09ac5dabd6..f53d432f6087 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -4,7 +4,6 @@
>   #include <linux/err.h>
>   #include <linux/compat.h>
>   #include <linux/rseq.h>
> -#include <linux/sched/debug.h> /* for show_regs */
>   
>   #include <asm/kup.h>
>   #include <asm/cputime.h>
> @@ -78,86 +77,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
>   	return true;
>   }
>   
> -static notrace void check_return_regs_valid(struct pt_regs *regs)
> -{
> -#ifdef CONFIG_PPC_BOOK3S_64
> -	unsigned long trap, srr0, srr1;
> -	static bool warned;
> -	u8 *validp;
> -	char *h;
> -
> -	if (trap_is_scv(regs))
> -		return;
> -
> -	trap = TRAP(regs);
> -	// EE in HV mode sets HSRRs like 0xea0
> -	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
> -		trap = 0xea0;
> -
> -	switch (trap) {
> -	case 0x980:
> -	case INTERRUPT_H_DATA_STORAGE:
> -	case 0xe20:
> -	case 0xe40:
> -	case INTERRUPT_HMI:
> -	case 0xe80:
> -	case 0xea0:
> -	case INTERRUPT_H_FAC_UNAVAIL:
> -	case 0x1200:
> -	case 0x1500:
> -	case 0x1600:
> -	case 0x1800:
> -		validp = &local_paca->hsrr_valid;
> -		if (!READ_ONCE(*validp))
> -			return;
> -
> -		srr0 = mfspr(SPRN_HSRR0);
> -		srr1 = mfspr(SPRN_HSRR1);
> -		h = "H";
> -
> -		break;
> -	default:
> -		validp = &local_paca->srr_valid;
> -		if (!READ_ONCE(*validp))
> -			return;
> -
> -		srr0 = mfspr(SPRN_SRR0);
> -		srr1 = mfspr(SPRN_SRR1);
> -		h = "";
> -		break;
> -	}
> -
> -	if (srr0 == regs->nip && srr1 == regs->msr)
> -		return;
> -
> -	/*
> -	 * A NMI / soft-NMI interrupt may have come in after we found
> -	 * srr_valid and before the SRRs are loaded. The interrupt then
> -	 * comes in and clobbers SRRs and clears srr_valid. Then we load
> -	 * the SRRs here and test them above and find they don't match.
> -	 *
> -	 * Test validity again after that, to catch such false positives.
> -	 *
> -	 * This test in general will have some window for false negatives
> -	 * and may not catch and fix all such cases if an NMI comes in
> -	 * later and clobbers SRRs without clearing srr_valid, but hopefully
> -	 * such things will get caught most of the time, statistically
> -	 * enough to be able to get a warning out.
> -	 */
> -	if (!READ_ONCE(*validp))
> -		return;
> -
> -	if (!data_race(warned)) {
> -		data_race(warned = true);
> -		printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
> -		printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
> -		show_regs(regs);
> -	}
> -
> -	WRITE_ONCE(*validp, 0); /* fixup */
> -#endif
> -}
> -
>   static notrace unsigned long
>   interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
>   {
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index aa17e62f3754..719930cf4ae1 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -22,6 +22,11 @@
>   
>   #include "signal.h"
>   
> +/* This will be removed */
> +#ifdef CONFIG_GENERIC_ENTRY

Is this #ifdef really needed?

> +#include <linux/entry-common.h>
> +#endif /* CONFIG_GENERIC_ENTRY */
> +
>   #ifdef CONFIG_VSX
>   unsigned long copy_fpr_to_user(void __user *to,
>   			       struct task_struct *task)
> @@ -368,3 +373,12 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
>   		printk_ratelimited(regs->msr & MSR_64BIT ? fm64 : fm32, tsk->comm,
>   				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
>   }
> +
> +#ifdef CONFIG_GENERIC_ENTRY

Why is this #ifdef needed?

> +void arch_do_signal_or_restart(struct pt_regs *regs)
> +{
> +	BUG_ON(regs != current->thread.regs);

Is this BUG_ON() needed? Can't we use something smoother?

> +	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
> +	do_signal(current);
> +}
> +#endif /* CONFIG_GENERIC_ENTRY */



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs
  2025-12-14 13:02 ` [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
@ 2025-12-16  9:52   ` Christophe Leroy (CS GROUP)
  2025-12-16 14:56     ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  9:52 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Add a new field `exit_flags` in the pt_regs structure. This field will hold
> the flags set during interrupt or syscall execution that are required during
> exit to user mode.
> 
> Specifically, the `TIF_RESTOREALL` flag, stored in this field, helps the
> exit routine determine if any NVGPRs were modified and need to be restored
> before returning to userspace.

In the current implementation we did our best to keep this information
in a local variable for performance reasons. Have you assessed the
performance impact of going through the stack for that?

> 
> This addition ensures a clean and architecture-specific mechanism to track
> per-syscall or per-interrupt state transitions related to register restore.
> 
> Changes:
>   - Add `exit_flags` and `__pt_regs_pad` to maintain 16-byte stack alignment
>   - Update asm-offsets.c and ptrace.c for offset and validation
>   - Update PT_* constants in uapi header to reflect the new layout
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/ptrace.h      |  3 +++
>   arch/powerpc/include/uapi/asm/ptrace.h | 14 +++++++++-----
>   arch/powerpc/kernel/asm-offsets.c      |  1 +
>   arch/powerpc/kernel/ptrace/ptrace.c    |  1 +
>   4 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> index 94aa1de2b06e..3af8a5898fe3 100644
> --- a/arch/powerpc/include/asm/ptrace.h
> +++ b/arch/powerpc/include/asm/ptrace.h
> @@ -53,6 +53,9 @@ struct pt_regs
>   				unsigned long esr;
>   			};
>   			unsigned long result;
> +			unsigned long exit_flags;
> +			/* Maintain 16 byte interrupt stack alignment */

On powerpc/32, one 'long' is 4 bytes, not 8.

> +			unsigned long __pt_regs_pad[1];
>   		};
>   	};
>   #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_KUAP)
> diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
> index 01e630149d48..de56b216c9c5 100644
> --- a/arch/powerpc/include/uapi/asm/ptrace.h
> +++ b/arch/powerpc/include/uapi/asm/ptrace.h
> @@ -55,6 +55,8 @@ struct pt_regs
>   	unsigned long dar;		/* Fault registers */
>   	unsigned long dsisr;		/* on 4xx/Book-E used for ESR */
>   	unsigned long result;		/* Result of a system call */
> +	unsigned long exit_flags;	/* System call exit flags */
> +	unsigned long __pt_regs_pad[1];	/* Maintain 16 byte interrupt stack alignment */

On powerpc/32, one 'long' is 4 bytes, not 8.

>   };
>   
>   #endif /* __ASSEMBLER__ */
> @@ -114,10 +116,12 @@ struct pt_regs
>   #define PT_DAR	41
>   #define PT_DSISR 42
>   #define PT_RESULT 43
> -#define PT_DSCR 44
> -#define PT_REGS_COUNT 44
> +#define PT_EXIT_FLAGS 44
> +#define PT_PAD 45
> +#define PT_DSCR 46
> +#define PT_REGS_COUNT 46
>   
> -#define PT_FPR0	48	/* each FP reg occupies 2 slots in this space */
> +#define PT_FPR0	(PT_REGS_COUNT + 4)	/* each FP reg occupies 2 slots in this space */
>   
>   #ifndef __powerpc64__
>   
> @@ -129,7 +133,7 @@ struct pt_regs
>   #define PT_FPSCR (PT_FPR0 + 32)	/* each FP reg occupies 1 slot in 64-bit space */
>   
>   
> -#define PT_VR0 82	/* each Vector reg occupies 2 slots in 64-bit */
> +#define PT_VR0	(PT_FPSCR + 2)	/* <82> each Vector reg occupies 2 slots in 64-bit */
>   #define PT_VSCR (PT_VR0 + 32*2 + 1)
>   #define PT_VRSAVE (PT_VR0 + 33*2)
>   
> @@ -137,7 +141,7 @@ struct pt_regs
>   /*
>    * Only store first 32 VSRs here. The second 32 VSRs in VR0-31
>    */
> -#define PT_VSR0 150	/* each VSR reg occupies 2 slots in 64-bit */
> +#define PT_VSR0	(PT_VRSAVE + 2)	/* each VSR reg occupies 2 slots in 64-bit */
>   #define PT_VSR31 (PT_VSR0 + 2*31)
>   #endif /* __powerpc64__ */
>   
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index a4bc80b30410..c0bb09f1db78 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -292,6 +292,7 @@ int main(void)
>   	STACK_PT_REGS_OFFSET(_ESR, esr);
>   	STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3);
>   	STACK_PT_REGS_OFFSET(RESULT, result);
> +	STACK_PT_REGS_OFFSET(EXIT_FLAGS, exit_flags);

Where is that used?

>   	STACK_PT_REGS_OFFSET(_TRAP, trap);
>   #ifdef CONFIG_PPC64
>   	STACK_PT_REGS_OFFSET(SOFTE, softe);
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> index c6997df63287..2134b6d155ff 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -432,6 +432,7 @@ void __init pt_regs_check(void)
>   	CHECK_REG(PT_DAR, dar);
>   	CHECK_REG(PT_DSISR, dsisr);
>   	CHECK_REG(PT_RESULT, result);
> +	CHECK_REG(PT_EXIT_FLAGS, exit_flags);
>   	#undef CHECK_REG
>   
>   	BUILD_BUG_ON(PT_REGS_COUNT != sizeof(struct user_pt_regs) / sizeof(unsigned long));



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-14 13:02 ` [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
@ 2025-12-16  9:58   ` Christophe Leroy (CS GROUP)
  2025-12-16 15:00     ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16  9:58 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Move interrupt entry and exit helper routines from interrupt.h into the
> PowerPC-specific entry-common.h header as a preparatory step for enabling
> the generic entry/exit framework.
> 
> This consolidation places all PowerPC interrupt entry/exit handling in a
> single common header, aligning with the generic entry infrastructure.
> The helpers provide architecture-specific handling for interrupt and NMI
> entry/exit sequences, including:
> 
>   - arch_interrupt_enter/exit_prepare()
>   - arch_interrupt_async_enter/exit_prepare()
>   - arch_interrupt_nmi_enter/exit_prepare()
>   - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
>     debug register maintenance, and soft mask handling.
> 
> The functions are copied verbatim from interrupt.h to avoid functional
> changes at this stage. Subsequent patches will integrate these routines
> into the generic entry/exit flow.

Can we move them instead of duplicating them?

> 
> No functional change intended.
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/entry-common.h | 422 ++++++++++++++++++++++++
>   1 file changed, 422 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index e8ebd42a4e6d..e8bde4c67eaf 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -7,10 +7,432 @@
>   
>   #include <asm/cputime.h>
>   #include <asm/interrupt.h>
> +#include <asm/runlatch.h>
>   #include <asm/stacktrace.h>
>   #include <asm/switch_to.h>
>   #include <asm/tm.h>
>   
> +#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
> +/*
> + * WARN/BUG is handled with a program interrupt so minimise checks here to
> + * avoid recursion and maximise the chance of getting the first oops handled.
> + */
> +#define INT_SOFT_MASK_BUG_ON(regs, cond)				\
> +do {									\
> +	if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM)))	\
> +		BUG_ON(cond);						\
> +} while (0)
> +#else
> +#define INT_SOFT_MASK_BUG_ON(regs, cond)
> +#endif
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +extern char __end_soft_masked[];
> +bool search_kernel_soft_mask_table(unsigned long addr);
> +unsigned long search_kernel_restart_table(unsigned long addr);
> +
> +DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
> +
> +static inline bool is_implicit_soft_masked(struct pt_regs *regs)
> +{
> +	if (user_mode(regs))
> +		return false;
> +
> +	if (regs->nip >= (unsigned long)__end_soft_masked)
> +		return false;
> +
> +	return search_kernel_soft_mask_table(regs->nip);
> +}
> +
> +static inline void srr_regs_clobbered(void)
> +{
> +	local_paca->srr_valid = 0;
> +	local_paca->hsrr_valid = 0;
> +}
> +#else
> +static inline unsigned long search_kernel_restart_table(unsigned long addr)
> +{
> +	return 0;
> +}
> +
> +static inline bool is_implicit_soft_masked(struct pt_regs *regs)
> +{
> +	return false;
> +}
> +
> +static inline void srr_regs_clobbered(void)
> +{
> +}
> +#endif
> +
> +static inline void nap_adjust_return(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_PPC_970_NAP
> +	if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
> +		/* Can avoid a test-and-clear because NMIs do not call this */
> +		clear_thread_local_flags(_TLF_NAPPING);
> +		regs_set_return_ip(regs, (unsigned long)power4_idle_nap_return);
> +	}
> +#endif
> +}
> +
> +static inline void booke_load_dbcr0(void)
> +{
> +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> +	unsigned long dbcr0 = current->thread.debug.dbcr0;
> +
> +	if (likely(!(dbcr0 & DBCR0_IDM)))
> +		return;
> +
> +	/*
> +	 * Check to see if the dbcr0 register is set up to debug.
> +	 * Use the internal debug mode bit to do this.
> +	 */
> +	mtmsr(mfmsr() & ~MSR_DE);
> +	if (IS_ENABLED(CONFIG_PPC32)) {
> +		isync();
> +		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> +	}
> +	mtspr(SPRN_DBCR0, dbcr0);
> +	mtspr(SPRN_DBSR, -1);
> +#endif
> +}
> +
> +static inline void booke_restore_dbcr0(void)
> +{
> +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> +	unsigned long dbcr0 = current->thread.debug.dbcr0;
> +
> +	if (IS_ENABLED(CONFIG_PPC32) && unlikely(dbcr0 & DBCR0_IDM)) {
> +		mtspr(SPRN_DBSR, -1);
> +		mtspr(SPRN_DBCR0, global_dbcr0[smp_processor_id()]);
> +	}
> +#endif
> +}
> +
> +static inline void check_return_regs_valid(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	unsigned long trap, srr0, srr1;
> +	static bool warned;
> +	u8 *validp;
> +	char *h;
> +
> +	if (trap_is_scv(regs))
> +		return;
> +
> +	trap = TRAP(regs);
> +	// EE in HV mode sets HSRRs like 0xea0
> +	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
> +		trap = 0xea0;
> +
> +	switch (trap) {
> +	case 0x980:
> +	case INTERRUPT_H_DATA_STORAGE:
> +	case 0xe20:
> +	case 0xe40:
> +	case INTERRUPT_HMI:
> +	case 0xe80:
> +	case 0xea0:
> +	case INTERRUPT_H_FAC_UNAVAIL:
> +	case 0x1200:
> +	case 0x1500:
> +	case 0x1600:
> +	case 0x1800:
> +		validp = &local_paca->hsrr_valid;
> +		if (!READ_ONCE(*validp))
> +			return;
> +
> +		srr0 = mfspr(SPRN_HSRR0);
> +		srr1 = mfspr(SPRN_HSRR1);
> +		h = "H";
> +
> +		break;
> +	default:
> +		validp = &local_paca->srr_valid;
> +		if (!READ_ONCE(*validp))
> +			return;
> +
> +		srr0 = mfspr(SPRN_SRR0);
> +		srr1 = mfspr(SPRN_SRR1);
> +		h = "";
> +		break;
> +	}
> +
> +	if (srr0 == regs->nip && srr1 == regs->msr)
> +		return;
> +
> +	/*
> +	 * A NMI / soft-NMI interrupt may have come in after we found
> +	 * srr_valid and before the SRRs are loaded. The interrupt then
> +	 * comes in and clobbers SRRs and clears srr_valid. Then we load
> +	 * the SRRs here and test them above and find they don't match.
> +	 *
> +	 * Test validity again after that, to catch such false positives.
> +	 *
> +	 * This test in general will have some window for false negatives
> +	 * and may not catch and fix all such cases if an NMI comes in
> +	 * later and clobbers SRRs without clearing srr_valid, but hopefully
> +	 * such things will get caught most of the time, statistically
> +	 * enough to be able to get a warning out.
> +	 */
> +	if (!READ_ONCE(*validp))
> +		return;
> +
> +	if (!data_race(warned)) {
> +		data_race(warned = true);
> +		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
> +		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
> +		show_regs(regs);
> +	}
> +
> +	WRITE_ONCE(*validp, 0); /* fixup */
> +#endif
> +}
> +
> +static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_PPC64
> +	irq_soft_mask_set(IRQS_ALL_DISABLED);
> +
> +	/*
> +	 * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
> +	 * Asynchronous interrupts get here with HARD_DIS set (see below), so
> +	 * this enables MSR[EE] for synchronous interrupts. IRQs remain
> +	 * soft-masked. The interrupt handler may later call
> +	 * interrupt_cond_local_irq_enable() to achieve a regular process
> +	 * context.
> +	 */
> +	if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
> +		INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
> +		__hard_irq_enable();
> +	} else {
> +		__hard_RI_enable();
> +	}
> +	/* Enable MSR[RI] early, to support kernel SLB and hash faults */
> +#endif
> +
> +	if (!regs_irqs_disabled(regs))
> +		trace_hardirqs_off();
> +
> +	if (user_mode(regs)) {
> +		kuap_lock();
> +		CT_WARN_ON(ct_state() != CT_STATE_USER);
> +		user_exit_irqoff();
> +
> +		account_cpu_user_entry();
> +		account_stolen_time();
> +	} else {
> +		kuap_save_and_lock(regs);
> +		/*
> +		 * CT_WARN_ON comes here via program_check_exception,
> +		 * so avoid recursion.
> +		 */
> +		if (TRAP(regs) != INTERRUPT_PROGRAM)
> +			CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
> +				   ct_state() != CT_STATE_IDLE);
> +		INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
> +		INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
> +				     search_kernel_restart_table(regs->nip));
> +	}
> +	INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
> +			     !(regs->msr & MSR_EE));
> +
> +	booke_restore_dbcr0();
> +}
> +
> +/*
> + * Care should be taken to note that arch_interrupt_exit_prepare and
> + * arch_interrupt_async_exit_prepare do not necessarily return immediately to
> + * regs context (e.g., if regs is usermode, we don't necessarily return to
> + * user mode). Other interrupts might be taken between here and return,
> + * context switch / preemption may occur in the exit path after this, or a
> + * signal may be delivered, etc.
> + *
> + * The real interrupt exit code is platform specific, e.g.,
> + * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
> + *
> + * However arch_interrupt_nmi_exit_prepare does return directly to regs, because
> + * NMIs do not do "exit work" or replay soft-masked interrupts.
> + */
> +static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
> +{
> +}
> +
> +static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_PPC64
> +	/* Ensure arch_interrupt_enter_prepare does not enable MSR[EE] */
> +	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> +#endif
> +	arch_interrupt_enter_prepare(regs);
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	/*
> +	 * RI=1 is set by arch_interrupt_enter_prepare, so this thread flags access
> +	 * has to come afterward (it can cause SLB faults).
> +	 */
> +	if (cpu_has_feature(CPU_FTR_CTRL) &&
> +	    !test_thread_local_flags(_TLF_RUNLATCH))
> +		__ppc64_runlatch_on();
> +#endif
> +	irq_enter();
> +}
> +
> +static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
> +{
> +	/*
> +	 * Adjust at exit so the main handler sees the true NIA. This must
> +	 * come before irq_exit() because irq_exit can enable interrupts, and
> +	 * if another interrupt is taken before nap_adjust_return has run
> +	 * here, then that interrupt would return directly to idle nap return.
> +	 */
> +	nap_adjust_return(regs);
> +
> +	irq_exit();
> +	arch_interrupt_exit_prepare(regs);
> +}
> +
> +struct interrupt_nmi_state {
> +#ifdef CONFIG_PPC64
> +	u8 irq_soft_mask;
> +	u8 irq_happened;
> +	u8 ftrace_enabled;
> +	u64 softe;
> +#endif
> +};
> +
> +static inline bool nmi_disables_ftrace(struct pt_regs *regs)
> +{
> +	/* Allow DEC and PMI to be traced when they are soft-NMI */
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
> +		if (TRAP(regs) == INTERRUPT_DECREMENTER)
> +			return false;
> +		if (TRAP(regs) == INTERRUPT_PERFMON)
> +			return false;
> +	}
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
> +		if (TRAP(regs) == INTERRUPT_PERFMON)
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
> +					       struct interrupt_nmi_state *state)

CHECK: Alignment should match open parenthesis
#354: FILE: arch/powerpc/include/asm/entry-common.h:322:
+static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
+					       struct interrupt_nmi_state *state)


> +{
> +#ifdef CONFIG_PPC64
> +	state->irq_soft_mask = local_paca->irq_soft_mask;
> +	state->irq_happened = local_paca->irq_happened;
> +	state->softe = regs->softe;
> +
> +	/*
> +	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
> +	 * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
> +	 * because that goes through irq tracing which we don't want in NMI.
> +	 */
> +	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
> +	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> +
> +	if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
> +		/*
> +		 * Adjust regs->softe to be soft-masked if it had not been
> +		 * reconciled (e.g., interrupt entry with MSR[EE]=0 but softe
> +		 * not yet set disabled), or if it was in an implicit soft
> +		 * masked state. This makes regs_irqs_disabled(regs)
> +		 * behave as expected.
> +		 */
> +		regs->softe = IRQS_ALL_DISABLED;
> +	}
> +
> +	__hard_RI_enable();
> +
> +	/* Don't do any per-CPU operations until interrupt state is fixed */
> +
> +	if (nmi_disables_ftrace(regs)) {
> +		state->ftrace_enabled = this_cpu_get_ftrace_enabled();
> +		this_cpu_set_ftrace_enabled(0);
> +	}
> +#endif
> +
> +	/* If data relocations are enabled, it's safe to use nmi_enter() */
> +	if (mfmsr() & MSR_DR) {
> +		nmi_enter();
> +		return;
> +	}
> +
> +	/*
> +	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
> +	 * NMI because not everything it touches is within the RMA limit.
> +	 */
> +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> +	    firmware_has_feature(FW_FEATURE_LPAR) &&
> +	    !radix_enabled())
> +		return;
> +
> +	/*
> +	 * Likewise, don't use it if we have some form of instrumentation (like
> +	 * KASAN shadow) that is not safe to access in real mode (even on radix)
> +	 */
> +	if (IS_ENABLED(CONFIG_KASAN))
> +		return;
> +
> +	/*
> +	 * Likewise, do not use it in real mode if percpu first chunk is not
> +	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
> +	 * are chances where percpu allocation can come from vmalloc area.
> +	 */
> +	if (percpu_first_chunk_is_paged)
> +		return;
> +
> +	/* Otherwise, it should be safe to call it */
> +	nmi_enter();
> +}
> +
> +static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
> +					      struct interrupt_nmi_state *state)
> +{

CHECK: Alignment should match open parenthesis
#425: FILE: arch/powerpc/include/asm/entry-common.h:393:
+static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
+					      struct interrupt_nmi_state *state)

> +	if (mfmsr() & MSR_DR) {
> +		// nmi_exit if relocations are on
> +		nmi_exit();
> +	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> +		   firmware_has_feature(FW_FEATURE_LPAR) &&
> +		   !radix_enabled()) {
> +		// no nmi_exit for a pseries hash guest taking a real mode exception
> +	} else if (IS_ENABLED(CONFIG_KASAN)) {
> +		// no nmi_exit for KASAN in real mode
> +	} else if (percpu_first_chunk_is_paged) {
> +		// no nmi_exit if percpu first chunk is not embedded
> +	} else {
> +		nmi_exit();
> +	}
> +
> +	/*
> +	 * nmi does not call nap_adjust_return because nmi should not create
> +	 * new work to do (must use irq_work for that).
> +	 */
> +
> +#ifdef CONFIG_PPC64
> +#ifdef CONFIG_PPC_BOOK3S
> +	if (regs_irqs_disabled(regs)) {
> +		unsigned long rst = search_kernel_restart_table(regs->nip);
> +
> +		if (rst)
> +			regs_set_return_ip(regs, rst);
> +	}
> +#endif
> +
> +	if (nmi_disables_ftrace(regs))
> +		this_cpu_set_ftrace_enabled(state->ftrace_enabled);
> +
> +	/* Check we didn't change the pending interrupt mask. */
> +	WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
> +	regs->softe = state->softe;
> +	local_paca->irq_happened = state->irq_happened;
> +	local_paca->irq_soft_mask = state->irq_soft_mask;
> +#endif
> +}
> +
>   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
>   {
>   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))



* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
  2025-12-16  6:29   ` kernel test robot
@ 2025-12-16 10:43   ` Christophe Leroy (CS GROUP)
  2025-12-16 15:06     ` Mukesh Kumar Chaurasiya
  2025-12-17  2:10   ` kernel test robot
  2025-12-17 21:32   ` kernel test robot
  3 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16 10:43 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Enable the generic IRQ entry/exit infrastructure on PowerPC by selecting
> GENERIC_IRQ_ENTRY and integrating the architecture-specific interrupt
> handlers with the generic entry/exit APIs.
> 
> This change replaces PowerPC’s local interrupt entry/exit handling with
> calls to the generic irqentry_* helpers, aligning the architecture with
> the common kernel entry model. The macros that define interrupt, async,
> and NMI handlers are updated to use irqentry_enter()/irqentry_exit()
> and irqentry_nmi_enter()/irqentry_nmi_exit() where applicable.
> 
> Key updates include:
>   - Select GENERIC_IRQ_ENTRY in Kconfig.
>   - Replace interrupt_enter/exit_prepare() with arch_interrupt_* helpers.
>   - Integrate irqentry_enter()/exit() in standard and async interrupt paths.
>   - Integrate irqentry_nmi_enter()/exit() in NMI handlers.
>   - Remove redundant irq_enter()/irq_exit() calls now handled generically.
>   - Use irqentry_exit_cond_resched() for preemption checks.
> 
> This change establishes the necessary wiring for PowerPC to use the
> generic IRQ entry/exit framework while maintaining existing semantics.

Did you look into the resulting code?

do_IRQ() is bigger and calls irqentry_enter(), which is bigger than 
irq_enter().

And irq_enter_rcu() was tail-called from irq_enter(); now it is called 
after irqentry_enter().

> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/Kconfig                    |   1 +
>   arch/powerpc/include/asm/entry-common.h |  66 +---
>   arch/powerpc/include/asm/interrupt.h    | 499 +++---------------------
>   arch/powerpc/kernel/interrupt.c         |  13 +-
>   4 files changed, 74 insertions(+), 505 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e24f4d88885a..b0c602c3bbe1 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -206,6 +206,7 @@ config PPC
>   	select GENERIC_GETTIMEOFDAY
>   	select GENERIC_IDLE_POLL_SETUP
>   	select GENERIC_IOREMAP
> +	select GENERIC_IRQ_ENTRY
>   	select GENERIC_IRQ_SHOW
>   	select GENERIC_IRQ_SHOW_LEVEL
>   	select GENERIC_PCI_IOMAP		if PCI
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index e8bde4c67eaf..e2ae7416dee1 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -257,6 +257,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
>    */
>   static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
>   {
> +	if (user_mode(regs)) {
> +		BUG_ON(regs_is_unrecoverable(regs));
> +		BUG_ON(regs_irqs_disabled(regs));
> +		/*
> +		 * We don't need to restore AMR on the way back to userspace for KUAP.
> +		 * AMR can only have been unlocked if we interrupted the kernel.
> +		 */
> +		kuap_assert_locked();
> +
> +		local_irq_disable();
> +	}
>   }
>   
>   static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> @@ -275,7 +286,6 @@ static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
>   	    !test_thread_local_flags(_TLF_RUNLATCH))
>   		__ppc64_runlatch_on();
>   #endif
> -	irq_enter();
>   }
>   
>   static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
> @@ -288,7 +298,6 @@ static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
>   	 */
>   	nap_adjust_return(regs);
>   
> -	irq_exit();
>   	arch_interrupt_exit_prepare(regs);
>   }
>   
> @@ -354,59 +363,11 @@ static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
>   		this_cpu_set_ftrace_enabled(0);
>   	}
>   #endif
> -
> -	/* If data relocations are enabled, it's safe to use nmi_enter() */
> -	if (mfmsr() & MSR_DR) {
> -		nmi_enter();
> -		return;
> -	}
> -
> -	/*
> -	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
> -	 * NMI because not everything it touches is within the RMA limit.
> -	 */
> -	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> -	    firmware_has_feature(FW_FEATURE_LPAR) &&
> -	    !radix_enabled())
> -		return;
> -
> -	/*
> -	 * Likewise, don't use it if we have some form of instrumentation (like
> -	 * KASAN shadow) that is not safe to access in real mode (even on radix)
> -	 */
> -	if (IS_ENABLED(CONFIG_KASAN))
> -		return;
> -
> -	/*
> -	 * Likewise, do not use it in real mode if percpu first chunk is not
> -	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
> -	 * are chances where percpu allocation can come from vmalloc area.
> -	 */
> -	if (percpu_first_chunk_is_paged)
> -		return;
> -
> -	/* Otherwise, it should be safe to call it */
> -	nmi_enter();
>   }
>   
>   static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
>   					      struct interrupt_nmi_state *state)
>   {
> -	if (mfmsr() & MSR_DR) {
> -		// nmi_exit if relocations are on
> -		nmi_exit();
> -	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> -		   firmware_has_feature(FW_FEATURE_LPAR) &&
> -		   !radix_enabled()) {
> -		// no nmi_exit for a pseries hash guest taking a real mode exception
> -	} else if (IS_ENABLED(CONFIG_KASAN)) {
> -		// no nmi_exit for KASAN in real mode
> -	} else if (percpu_first_chunk_is_paged) {
> -		// no nmi_exit if percpu first chunk is not embedded
> -	} else {
> -		nmi_exit();
> -	}
> -
>   	/*
>   	 * nmi does not call nap_adjust_return because nmi should not create
>   	 * new work to do (must use irq_work for that).
> @@ -435,6 +396,8 @@ static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
>   
>   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
>   {
> +	kuap_lock();
> +

Is there a reason this change comes now rather than in the patch that 
added arch_enter_from_user_mode()?

>   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
>   		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
>   
> @@ -467,11 +430,8 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
>   	} else
>   #endif
>   		kuap_assert_locked();
> -
>   	booke_restore_dbcr0();
> -

This is cosmetic; it should have been done when adding 
arch_enter_from_user_mode().

>   	account_cpu_user_entry();
> -
>   	account_stolen_time();
>   
>   	/*


Christophe


* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
                     ` (2 preceding siblings ...)
  2025-12-16  6:41   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 11:01   ` Christophe Leroy (CS GROUP)
  2025-12-16 15:13     ` Mukesh Kumar Chaurasiya
  3 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16 11:01 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, oleg, kees, luto,
	wad, mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, kan.liang, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel



On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> 
> Convert the PowerPC syscall entry and exit paths to use the generic
> entry/exit framework by selecting GENERIC_ENTRY and integrating with
> the common syscall handling routines.
> 
> This change transitions PowerPC away from its custom syscall entry and
> exit code to use the generic helpers such as:
>   - syscall_enter_from_user_mode()
>   - syscall_exit_to_user_mode()
> 
> As part of this migration:
>   - The architecture now selects GENERIC_ENTRY in Kconfig.
>   - Old tracing, seccomp, and audit handling in ptrace.c is removed in
>     favor of generic entry infrastructure.
>   - interrupt.c and syscall.c are simplified to delegate context
>     management and user exit handling to the generic entry path.
>   - The new pt_regs field `exit_flags` introduced earlier is now used
>     to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).
> 
> This aligns PowerPC with the common entry code used by other
> architectures and reduces duplicated logic around syscall tracing,
> context tracking, and signal handling.
> 
> The performance benchmarks from perf bench basic syscall are below:
> 
> perf bench syscall usec/op
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
> | getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
> | fork usec/op    | 833.986    | 814.809       | +2.35%   |
> | execve usec/op  | 360.939    | 365.168       | -1.16%   |
> 
> perf bench syscall ops/sec
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | getppid ops/sec | 48,12,433  | 47,53,459     | +1.24%   |
> | getpgid ops/sec | 48,47,744  | 47,24,192     | +2.61%   |
> | fork ops/sec    | 1,199      | 1,227         | -2.28%   |
> | execve ops/sec  | 2,770      | 2,738         | +1.16%   |
> 
> IPI latency benchmark
> 
> | Metric                  | With Patch       | Without Patch    | % Change |
> | ----------------------- | ---------------- | ---------------- | -------- |
> | Dry-run (ns)            | 206,675.81       | 206,719.36       | -0.02%   |
> | Self-IPI avg (ns)       | 1,939,991.00     | 1,976,116.15     | -1.83%   |
> | Self-IPI max (ns)       | 3,533,718.93     | 3,582,650.33     | -1.37%   |
> | Normal IPI avg (ns)     | 111,110,034.23   | 110,513,373.51   | +0.54%   |
> | Normal IPI max (ns)     | 150,393,442.64   | 149,669,477.89   | +0.48%   |
> | Broadcast IPI max (ns)  | 3,978,231,022.96 | 4,359,916,859.46 | -8.73%   |
> | Broadcast lock max (ns) | 4,025,425,714.49 | 4,384,956,730.83 | -8.20%   |
> 
> That's very close to the earlier performance of the arch-specific handling.
> 
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
>   arch/powerpc/Kconfig                    |   1 +
>   arch/powerpc/include/asm/entry-common.h |   5 +-
>   arch/powerpc/kernel/interrupt.c         | 139 +++++++----------------
>   arch/powerpc/kernel/ptrace/ptrace.c     | 141 ------------------------
>   arch/powerpc/kernel/signal.c            |  10 +-
>   arch/powerpc/kernel/syscall.c           | 119 +-------------------
>   6 files changed, 49 insertions(+), 366 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b0c602c3bbe1..a4330775b254 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -203,6 +203,7 @@ config PPC
>   	select GENERIC_CPU_AUTOPROBE
>   	select GENERIC_CPU_VULNERABILITIES	if PPC_BARRIER_NOSPEC
>   	select GENERIC_EARLY_IOREMAP
> +	select GENERIC_ENTRY
>   	select GENERIC_GETTIMEOFDAY
>   	select GENERIC_IDLE_POLL_SETUP
>   	select GENERIC_IOREMAP
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index e2ae7416dee1..77129174f882 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -3,7 +3,7 @@
>   #ifndef _ASM_PPC_ENTRY_COMMON_H
>   #define _ASM_PPC_ENTRY_COMMON_H
>   
> -#ifdef CONFIG_GENERIC_IRQ_ENTRY
> +#ifdef CONFIG_GENERIC_ENTRY

PowerPC now selects this unconditionally. Why is this #ifdef needed?


>   
>   #include <asm/cputime.h>
>   #include <asm/interrupt.h>
> @@ -217,9 +217,6 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
>   
>   	if (user_mode(regs)) {
>   		kuap_lock();
> -		CT_WARN_ON(ct_state() != CT_STATE_USER);
> -		user_exit_irqoff();
> -
>   		account_cpu_user_entry();
>   		account_stolen_time();
>   	} else {
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 7f67f0b9d627..7d5cd4b5a610 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -1,6 +1,7 @@
>   // SPDX-License-Identifier: GPL-2.0-or-later
>   
>   #include <linux/context_tracking.h>
> +#include <linux/entry-common.h>
>   #include <linux/err.h>
>   #include <linux/compat.h>
>   #include <linux/rseq.h>
> @@ -73,79 +74,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
>   	return true;
>   }
>   
> -static notrace unsigned long
> -interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
> -{
> -	unsigned long ti_flags;
> -
> -again:
> -	ti_flags = read_thread_flags();
> -	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
> -		local_irq_enable();
> -		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
> -			schedule();
> -		} else {
> -			/*
> -			 * SIGPENDING must restore signal handler function
> -			 * argument GPRs, and some non-volatiles (e.g., r1).
> -			 * Restore all for now. This could be made lighter.
> -			 */
> -			if (ti_flags & _TIF_SIGPENDING)
> -				ret |= _TIF_RESTOREALL;
> -			do_notify_resume(regs, ti_flags);

do_notify_resume() has no callers anymore; it should be removed from 
arch/powerpc/include/asm/signal.h and arch/powerpc/kernel/signal.c.



> -		}
> -		local_irq_disable();
> -		ti_flags = read_thread_flags();
> -	}
> -
> -	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
> -		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> -				unlikely((ti_flags & _TIF_RESTORE_TM))) {
> -			restore_tm_state(regs);
> -		} else {
> -			unsigned long mathflags = MSR_FP;
> -
> -			if (cpu_has_feature(CPU_FTR_VSX))
> -				mathflags |= MSR_VEC | MSR_VSX;
> -			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
> -				mathflags |= MSR_VEC;
> -
> -			/*
> -			 * If userspace MSR has all available FP bits set,
> -			 * then they are live and no need to restore. If not,
> -			 * it means the regs were given up and restore_math
> -			 * may decide to restore them (to avoid taking an FP
> -			 * fault).
> -			 */
> -			if ((regs->msr & mathflags) != mathflags)
> -				restore_math(regs);
> -		}
> -	}
> -
> -	check_return_regs_valid(regs);
> -
> -	user_enter_irqoff();
> -	if (!prep_irq_for_enabled_exit(true)) {
> -		user_exit_irqoff();
> -		local_irq_enable();
> -		local_irq_disable();
> -		goto again;
> -	}
> -
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> -	local_paca->tm_scratch = regs->msr;
> -#endif
> -
> -	booke_load_dbcr0();
> -
> -	account_cpu_user_exit();
> -
> -	/* Restore user access locks last */
> -	kuap_user_restore(regs);
> -
> -	return ret;
> -}
> -
>   /*
>    * This should be called after a syscall returns, with r3 the return value
>    * from the syscall. If this function returns non-zero, the system call
> @@ -160,17 +88,12 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   					   long scv)
>   {
>   	unsigned long ti_flags;
> -	unsigned long ret = 0;
>   	bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
>   
> -	CT_WARN_ON(ct_state() == CT_STATE_USER);
> -
>   	kuap_assert_locked();
>   
>   	regs->result = r3;
> -
> -	/* Check whether the syscall is issued inside a restartable sequence */
> -	rseq_syscall(regs);
> +	regs->exit_flags = 0;
>   
>   	ti_flags = read_thread_flags();
>   
> @@ -183,7 +106,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   
>   	if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
>   		if (ti_flags & _TIF_RESTOREALL)
> -			ret = _TIF_RESTOREALL;
> +			regs->exit_flags = _TIF_RESTOREALL;
>   		else
>   			regs->gpr[3] = r3;
>   		clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
> @@ -192,18 +115,28 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>   	}
>   
>   	if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
> -		do_syscall_trace_leave(regs);
> -		ret |= _TIF_RESTOREALL;
> +		regs->exit_flags |= _TIF_RESTOREALL;
>   	}
>   
> -	local_irq_disable();
> -	ret = interrupt_exit_user_prepare_main(ret, regs);
> +	syscall_exit_to_user_mode(regs);
> +
> +again:
> +	user_enter_irqoff();
> +	if (!prep_irq_for_enabled_exit(true)) {
> +		user_exit_irqoff();
> +		local_irq_enable();
> +		local_irq_disable();
> +		goto again;
> +	}
> +
> +	/* Restore user access locks last */
> +	kuap_user_restore(regs);
>   
>   #ifdef CONFIG_PPC64
> -	regs->exit_result = ret;
> +	regs->exit_result = regs->exit_flags;
>   #endif
>   
> -	return ret;
> +	return regs->exit_flags;
>   }
>   
>   #ifdef CONFIG_PPC64
> @@ -223,13 +156,16 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
>   	set_kuap(AMR_KUAP_BLOCKED);
>   #endif
>   
> -	trace_hardirqs_off();
> -	user_exit_irqoff();
> -	account_cpu_user_entry();
> -
> -	BUG_ON(!user_mode(regs));
> +again:
> +	user_enter_irqoff();
> +	if (!prep_irq_for_enabled_exit(true)) {
> +		user_exit_irqoff();
> +		local_irq_enable();
> +		local_irq_disable();
> +		goto again;
> +	}
>   
> -	regs->exit_result = interrupt_exit_user_prepare_main(regs->exit_result, regs);
> +	regs->exit_result |= regs->exit_flags;
>   
>   	return regs->exit_result;
>   }
> @@ -241,7 +177,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
>   
>   	BUG_ON(regs_is_unrecoverable(regs));
>   	BUG_ON(regs_irqs_disabled(regs));
> -	CT_WARN_ON(ct_state() == CT_STATE_USER);
>   
>   	/*
>   	 * We don't need to restore AMR on the way back to userspace for KUAP.
> @@ -250,8 +185,21 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
>   	kuap_assert_locked();
>   
>   	local_irq_disable();
> +	regs->exit_flags = 0;
> +again:
> +	check_return_regs_valid(regs);
> +	user_enter_irqoff();
> +	if (!prep_irq_for_enabled_exit(true)) {
> +		user_exit_irqoff();
> +		local_irq_enable();
> +		local_irq_disable();
> +		goto again;
> +	}
> +
> +	/* Restore user access locks last */
> +	kuap_user_restore(regs);
>   
> -	ret = interrupt_exit_user_prepare_main(0, regs);
> +	ret = regs->exit_flags;
>   
>   #ifdef CONFIG_PPC64
>   	regs->exit_result = ret;
> @@ -293,8 +241,6 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>   		/* Returning to a kernel context with local irqs enabled. */
>   		WARN_ON_ONCE(!(regs->msr & MSR_EE));
>   again:
> -		if (need_irq_preemption())
> -			irqentry_exit_cond_resched();
>   
>   		check_return_regs_valid(regs);
>   
> @@ -364,7 +310,6 @@ notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
>   #endif
>   
>   	trace_hardirqs_off();
> -	user_exit_irqoff();
>   	account_cpu_user_entry();
>   
>   	BUG_ON(!user_mode(regs));
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> index 2134b6d155ff..316d4f5ead8e 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -21,9 +21,6 @@
>   #include <asm/switch_to.h>
>   #include <asm/debug.h>
>   
> -#define CREATE_TRACE_POINTS
> -#include <trace/events/syscalls.h>
> -
>   #include "ptrace-decl.h"
>   
>   /*
> @@ -195,144 +192,6 @@ long arch_ptrace(struct task_struct *child, long request,
>   	return ret;
>   }
>   
> -#ifdef CONFIG_SECCOMP
> -static int do_seccomp(struct pt_regs *regs)
> -{
> -	if (!test_thread_flag(TIF_SECCOMP))
> -		return 0;
> -
> -	/*
> -	 * The ABI we present to seccomp tracers is that r3 contains
> -	 * the syscall return value and orig_gpr3 contains the first
> -	 * syscall parameter. This is different to the ptrace ABI where
> -	 * both r3 and orig_gpr3 contain the first syscall parameter.
> -	 */
> -	regs->gpr[3] = -ENOSYS;
> -
> -	/*
> -	 * We use the __ version here because we have already checked
> -	 * TIF_SECCOMP. If this fails, there is nothing left to do, we
> -	 * have already loaded -ENOSYS into r3, or seccomp has put
> -	 * something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
> -	 */
> -	if (__secure_computing())
> -		return -1;
> -
> -	/*
> -	 * The syscall was allowed by seccomp, restore the register
> -	 * state to what audit expects.
> -	 * Note that we use orig_gpr3, which means a seccomp tracer can
> -	 * modify the first syscall parameter (in orig_gpr3) and also
> -	 * allow the syscall to proceed.
> -	 */
> -	regs->gpr[3] = regs->orig_gpr3;
> -
> -	return 0;
> -}
> -#else
> -static inline int do_seccomp(struct pt_regs *regs) { return 0; }
> -#endif /* CONFIG_SECCOMP */
> -
> -/**
> - * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
> - * @regs: the pt_regs of the task to trace (current)
> - *
> - * Performs various types of tracing on syscall entry. This includes seccomp,
> - * ptrace, syscall tracepoints and audit.
> - *
> - * The pt_regs are potentially visible to userspace via ptrace, so their
> - * contents is ABI.
> - *
> - * One or more of the tracers may modify the contents of pt_regs, in particular
> - * to modify arguments or even the syscall number itself.
> - *
> - * It's also possible that a tracer can choose to reject the system call. In
> - * that case this function will return an illegal syscall number, and will put
> - * an appropriate return value in regs->r3.
> - *
> - * Return: the (possibly changed) syscall number.
> - */
> -long do_syscall_trace_enter(struct pt_regs *regs)

Remove prototype from arch/powerpc/include/asm/ptrace.h

> -{
> -	u32 flags;
> -
> -	flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
> -
> -	if (flags) {
> -		int rc = ptrace_report_syscall_entry(regs);
> -
> -		if (unlikely(flags & _TIF_SYSCALL_EMU)) {
> -			/*
> -			 * A nonzero return code from
> -			 * ptrace_report_syscall_entry() tells us to prevent
> -			 * the syscall execution, but we are not going to
> -			 * execute it anyway.
> -			 *
> -			 * Returning -1 will skip the syscall execution. We want
> -			 * to avoid clobbering any registers, so we don't goto
> -			 * the skip label below.
> -			 */
> -			return -1;
> -		}
> -
> -		if (rc) {
> -			/*
> -			 * The tracer decided to abort the syscall. Note that
> -			 * the tracer may also just change regs->gpr[0] to an
> -			 * invalid syscall number, that is handled below on the
> -			 * exit path.
> -			 */
> -			goto skip;
> -		}
> -	}
> -
> -	/* Run seccomp after ptrace; allow it to set gpr[3]. */
> -	if (do_seccomp(regs))
> -		return -1;
> -
> -	/* Avoid trace and audit when syscall is invalid. */
> -	if (regs->gpr[0] >= NR_syscalls)
> -		goto skip;
> -
> -	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> -		trace_sys_enter(regs, regs->gpr[0]);
> -
> -	if (!is_32bit_task())
> -		audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
> -				    regs->gpr[5], regs->gpr[6]);
> -	else
> -		audit_syscall_entry(regs->gpr[0],
> -				    regs->gpr[3] & 0xffffffff,
> -				    regs->gpr[4] & 0xffffffff,
> -				    regs->gpr[5] & 0xffffffff,
> -				    regs->gpr[6] & 0xffffffff);
> -
> -	/* Return the possibly modified but valid syscall number */
> -	return regs->gpr[0];
> -
> -skip:
> -	/*
> -	 * If we are aborting explicitly, or if the syscall number is
> -	 * now invalid, set the return value to -ENOSYS.
> -	 */
> -	regs->gpr[3] = -ENOSYS;
> -	return -1;
> -}
> -
> -void do_syscall_trace_leave(struct pt_regs *regs)
> -{
> -	int step;
> -
> -	audit_syscall_exit(regs);
> -
> -	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> -		trace_sys_exit(regs, regs->result);
> -
> -	step = test_thread_flag(TIF_SINGLESTEP);
> -	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
> -		ptrace_report_syscall_exit(regs, step);
> -}
> -
>   void __init pt_regs_check(void);
>   
>   /*
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index 719930cf4ae1..9f1847b4742e 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -6,6 +6,7 @@
>    *    Extracted from signal_32.c and signal_64.c
>    */
>   
> +#include <linux/entry-common.h>
>   #include <linux/resume_user_mode.h>
>   #include <linux/signal.h>
>   #include <linux/uprobes.h>
> @@ -22,11 +23,6 @@
>   
>   #include "signal.h"
>   
> -/* This will be removed */
> -#ifdef CONFIG_GENERIC_ENTRY
> -#include <linux/entry-common.h>
> -#endif /* CONFIG_GENERIC_ENTRY */
> -

Until now CONFIG_GENERIC_ENTRY was not defined.

Now that it is defined, we remove the entire block ?

Then why was it added at all ?

>   #ifdef CONFIG_VSX
>   unsigned long copy_fpr_to_user(void __user *to,
>   			       struct task_struct *task)
> @@ -374,11 +370,9 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
>   				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
>   }
>   
> -#ifdef CONFIG_GENERIC_ENTRY
>   void arch_do_signal_or_restart(struct pt_regs *regs)
>   {
>   	BUG_ON(regs != current->thread.regs);
> -	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;

Why was that there ? I thought it was preparatory, then you remove it 
before even using it ?

> +	regs->exit_flags |= _TIF_RESTOREALL;
>   	do_signal(current);
>   }
> -#endif /* CONFIG_GENERIC_ENTRY */


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework
  2025-12-16  9:27   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 14:42     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 14:42 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 10:27:55AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > This patch introduces preparatory changes needed to support building
> > PowerPC with the generic entry/exit (irqentry) framework.
> > 
> > The following infrastructure updates are added:
> >   - Add a syscall_work field to struct thread_info to hold SYSCALL_WORK_* flags.
> >   - Provide a stub implementation of arch_syscall_is_vdso_sigreturn(),
> >     returning false for now.
> >   - Introduce on_thread_stack() helper to detect if the current stack pointer
> >     lies within the task’s kernel stack.
> > 
> > These additions enable later integration with the generic entry/exit
> > infrastructure while keeping existing PowerPC behavior unchanged.
> > 
> > No functional change is intended in this patch.
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
> >   arch/powerpc/include/asm/stacktrace.h   |  6 ++++++
> >   arch/powerpc/include/asm/syscall.h      |  5 +++++
> >   arch/powerpc/include/asm/thread_info.h  |  1 +
> >   4 files changed, 23 insertions(+)
> >   create mode 100644 arch/powerpc/include/asm/entry-common.h
> > 
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > new file mode 100644
> > index 000000000000..3af16d821d07
> > --- /dev/null
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -0,0 +1,11 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef _ASM_PPC_ENTRY_COMMON_H
> > +#define _ASM_PPC_ENTRY_COMMON_H
> > +
> > +#ifdef CONFIG_GENERIC_IRQ_ENTRY
> 
> Why do you need this #ifdef ? I see no reason, the build works well without
> this #ifdef.
> 
> At the time being, CONFIG_GENERIC_IRQ_ENTRY is never selected by powerpc,
> meaning you are introducing dead code. If really needed it would be more
> explicit to add a "#if 0"
> 
Yes, you are correct. I intended it to be dead code until we introduce
the implementation. I'll remove this in the next iteration.
> > +
> > +#include <asm/stacktrace.h>
> > +
> > +#endif /* CONFIG_GENERIC_IRQ_ENTRY */
> > +#endif /* _ASM_PPC_ENTRY_COMMON_H */
> > diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
> > index 6149b53b3bc8..a81a9373d723 100644
> > --- a/arch/powerpc/include/asm/stacktrace.h
> > +++ b/arch/powerpc/include/asm/stacktrace.h
> > @@ -10,4 +10,10 @@
> >   void show_user_instructions(struct pt_regs *regs);
> > +static inline bool on_thread_stack(void)
> 
> Shouldn't it be __always_inline ?
> 
Yes it should. Will fix this too.
> > +{
> > +	return !(((unsigned long)(current->stack) ^ current_stack_pointer)
> > +			& ~(THREAD_SIZE - 1));
> > +}
> > +
> >   #endif /* _ASM_POWERPC_STACKTRACE_H */
> > diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
> > index 4b3c52ed6e9d..834fcc4f7b54 100644
> > --- a/arch/powerpc/include/asm/syscall.h
> > +++ b/arch/powerpc/include/asm/syscall.h
> > @@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct task_struct *task)
> >   	else
> >   		return AUDIT_ARCH_PPC64;
> >   }
> > +
> > +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
> > +{
> > +	return false;
> > +}
> >   #endif	/* _ASM_SYSCALL_H */
> > diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> > index b0f200aba2b3..9c8270354f0b 100644
> > --- a/arch/powerpc/include/asm/thread_info.h
> > +++ b/arch/powerpc/include/asm/thread_info.h
> > @@ -57,6 +57,7 @@ struct thread_info {
> >   #ifdef CONFIG_SMP
> >   	unsigned int	cpu;
> >   #endif
> > +	unsigned long	syscall_work;		/* SYSCALL_WORK_ flags */
> 
> This is not used, why add it here ?
> 
I wanted it in a separate patch from where it's used because, if there
are any cache alignment issues, a bisect can then show that they were
introduced by this variable rather than by the implementation.
Do you think it should go along with the implementation?

I appreciate the review.

Regards,
Mukesh

> >   	unsigned long	local_flags;		/* private flags for thread */
> >   #ifdef CONFIG_LIVEPATCH_64
> >   	unsigned long *livepatch_sp;
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode
  2025-12-16  9:38   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 14:47     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 14:47 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 10:38:50AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> > On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Implement the arch_enter_from_user_mode() hook required by the generic
> > entry/exit framework. This helper prepares the CPU state when entering
> > the kernel from userspace, ensuring correct handling of KUAP/KUEP,
> > transactional memory, and debug register state.
> > 
> > As part of this change, move booke_load_dbcr0() from interrupt.c to
> > interrupt.h so it can be used by the new helper without introducing
> > cross-file dependencies.
> > 
> > This patch contains no functional changes, it is purely preparatory for
> > enabling the generic syscall and interrupt entry path on PowerPC.
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/entry-common.h | 97 +++++++++++++++++++++++++
> >   arch/powerpc/include/asm/interrupt.h    | 22 ++++++
> >   arch/powerpc/kernel/interrupt.c         | 22 ------
> >   3 files changed, 119 insertions(+), 22 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > index 3af16d821d07..093ece06ef79 100644
> > --- a/arch/powerpc/include/asm/entry-common.h
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -5,7 +5,104 @@
> >   #ifdef CONFIG_GENERIC_IRQ_ENTRY
> 
> This #ifdef is still unnecessary it seems.
> 
Sure will fix it in next iteration.
> > +#include <asm/cputime.h>
> > +#include <asm/interrupt.h>
> >   #include <asm/stacktrace.h>
> > +#include <asm/tm.h>
> > +
> > +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> > +{
> > +	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> > +		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
> > +
> > +	BUG_ON(regs_is_unrecoverable(regs));
> > +	BUG_ON(!user_mode(regs));
> > +	BUG_ON(regs_irqs_disabled(regs));
> > +
> > +#ifdef CONFIG_PPC_PKEY
> > +	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
> > +		unsigned long amr, iamr;
> > +		bool flush_needed = false;
> > +		/*
> > +		 * When entering from userspace we mostly have the AMR/IAMR
> > +		 * different from kernel default values. Hence don't compare.
> > +		 */
> > +		amr = mfspr(SPRN_AMR);
> > +		iamr = mfspr(SPRN_IAMR);
> > +		regs->amr  = amr;
> > +		regs->iamr = iamr;
> > +		if (mmu_has_feature(MMU_FTR_KUAP)) {
> > +			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
> > +			flush_needed = true;
> > +		}
> > +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
> > +			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
> > +			flush_needed = true;
> > +		}
> > +		if (flush_needed)
> > +			isync();
> > +	} else
> > +#endif
> > +		kuap_assert_locked();
> 
> This construct is odd, can you do something about it ?
> 
Yeah, it seemed weird to me too. Let me see what I can do about this.
Will do something in the next iteration.
> > +
> > +	booke_restore_dbcr0();
> > +
> > +	account_cpu_user_entry();
> > +
> > +	account_stolen_time();
> > +
> > +	/*
> > +	 * This is not required for the syscall exit path, but makes the
> > +	 * stack frame look nicer. If this was initialised in the first stack
> > +	 * frame, or if the unwinder was taught the first stack frame always
> > +	 * returns to user with IRQS_ENABLED, this store could be avoided!
> > +	 */
> > +	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
> > +
> > +	/*
> > +	 * If system call is called with TM active, set _TIF_RESTOREALL to
> > +	 * prevent RFSCV being used to return to userspace, because POWER9
> > +	 * TM implementation has problems with this instruction returning to
> > +	 * transactional state. Final register values are not relevant because
> > +	 * the transaction will be aborted upon return anyway. Or in the case
> > +	 * of unsupported_scv SIGILL fault, the return state does not much
> > +	 * matter because it's an edge case.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> > +	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
> > +		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
> > +
> > +	/*
> > +	 * If the system call was made with a transaction active, doom it and
> > +	 * return without performing the system call. Unless it was an
> > +	 * unsupported scv vector, in which case it's treated like an illegal
> > +	 * instruction.
> > +	 */
> > +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > +	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
> > +	    !trap_is_unsupported_scv(regs)) {
> > +		/* Enable TM in the kernel, and disable EE (for scv) */
> > +		hard_irq_disable();
> > +		mtmsr(mfmsr() | MSR_TM);
> > +
> > +		/* tabort, this dooms the transaction, nothing else */
> > +		asm volatile(".long 0x7c00071d | ((%0) << 16)"
> > +			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
> > +
> > +		/*
> > +		 * Userspace will never see the return value. Execution will
> > +		 * resume after the tbegin. of the aborted transaction with the
> > +		 * checkpointed register state. A context switch could occur
> > +		 * or signal delivered to the process before resuming the
> > +		 * doomed transaction context, but that should all be handled
> > +		 * as expected.
> > +		 */
> > +		return;
> > +	}
> > +#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > +}
> > +
> > +#define arch_enter_from_user_mode arch_enter_from_user_mode
> >   #endif /* CONFIG_GENERIC_IRQ_ENTRY */
> >   #endif /* _ASM_PPC_ENTRY_COMMON_H */
> > diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> > index 0e2cddf8bd21..ca8a2cda9400 100644
> > --- a/arch/powerpc/include/asm/interrupt.h
> > +++ b/arch/powerpc/include/asm/interrupt.h
> > @@ -138,6 +138,28 @@ static inline void nap_adjust_return(struct pt_regs *regs)
> >   #endif
> >   }
> > +static inline void booke_load_dbcr0(void)
> 
> It was a notrace function in interrupt.c
> Should it be an __always_inline now ?
Yes, will fix this.

Regards,
Mukesh
> 
> Christophe
> 
> > +{
> > +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> > +	unsigned long dbcr0 = current->thread.debug.dbcr0;
> > +
> > +	if (likely(!(dbcr0 & DBCR0_IDM)))
> > +		return;
> > +
> > +	/*
> > +	 * Check to see if the dbcr0 register is set up to debug.
> > +	 * Use the internal debug mode bit to do this.
> > +	 */
> > +	mtmsr(mfmsr() & ~MSR_DE);
> > +	if (IS_ENABLED(CONFIG_PPC32)) {
> > +		isync();
> > +		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> > +	}
> > +	mtspr(SPRN_DBCR0, dbcr0);
> > +	mtspr(SPRN_DBSR, -1);
> > +#endif
> > +}
> > +
> >   static inline void booke_restore_dbcr0(void)
> >   {
> >   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> > diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> > index 0d8fd47049a1..2a09ac5dabd6 100644
> > --- a/arch/powerpc/kernel/interrupt.c
> > +++ b/arch/powerpc/kernel/interrupt.c
> > @@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
> >   	return true;
> >   }
> > -static notrace void booke_load_dbcr0(void)
> > -{
> > -#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> > -	unsigned long dbcr0 = current->thread.debug.dbcr0;
> > -
> > -	if (likely(!(dbcr0 & DBCR0_IDM)))
> > -		return;
> > -
> > -	/*
> > -	 * Check to see if the dbcr0 register is set up to debug.
> > -	 * Use the internal debug mode bit to do this.
> > -	 */
> > -	mtmsr(mfmsr() & ~MSR_DE);
> > -	if (IS_ENABLED(CONFIG_PPC32)) {
> > -		isync();
> > -		global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> > -	}
> > -	mtspr(SPRN_DBCR0, dbcr0);
> > -	mtspr(SPRN_DBSR, -1);
> > -#endif
> > -}
> > -
> >   static notrace void check_return_regs_valid(struct pt_regs *regs)
> >   {
> >   #ifdef CONFIG_PPC_BOOK3S_64
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions
  2025-12-16  9:46   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 14:51     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 14:51 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 10:46:28AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> > On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Add PowerPC-specific implementations of the generic syscall exit hooks
> > used by the generic entry/exit framework:
> > 
> >   - arch_exit_to_user_mode_work_prepare()
> >   - arch_exit_to_user_mode_work()
> > 
> > These helpers handle user state restoration when returning from the
> > kernel to userspace, including FPU/VMX/VSX state, transactional memory,
> > KUAP restore, and per-CPU accounting.
> > 
> > Additionally, move check_return_regs_valid() from interrupt.c to
> > interrupt.h so it can be shared by the new entry/exit logic, and add
> > arch_do_signal_or_restart() for use with the generic entry flow.
> > 
> > No functional change is intended with this patch.
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/entry-common.h | 49 +++++++++++++++
> >   arch/powerpc/include/asm/interrupt.h    | 82 +++++++++++++++++++++++++
> >   arch/powerpc/kernel/interrupt.c         | 81 ------------------------
> >   arch/powerpc/kernel/signal.c            | 14 +++++
> >   4 files changed, 145 insertions(+), 81 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > index 093ece06ef79..e8ebd42a4e6d 100644
> > --- a/arch/powerpc/include/asm/entry-common.h
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -8,6 +8,7 @@
> >   #include <asm/cputime.h>
> >   #include <asm/interrupt.h>
> >   #include <asm/stacktrace.h>
> > +#include <asm/switch_to.h>
> >   #include <asm/tm.h>
> >   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> > @@ -104,5 +105,53 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> >   #define arch_enter_from_user_mode arch_enter_from_user_mode
> > +static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
> > +						  unsigned long ti_work)
> > +{
> > +	unsigned long mathflags;
> > +
> > +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
> > +		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> > +		    unlikely((ti_work & _TIF_RESTORE_TM))) {
> > +			restore_tm_state(regs);
> > +		} else {
> > +			mathflags = MSR_FP;
> > +
> > +			if (cpu_has_feature(CPU_FTR_VSX))
> > +				mathflags |= MSR_VEC | MSR_VSX;
> > +			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
> > +				mathflags |= MSR_VEC;
> > +
> > +			/*
> > +			 * If userspace MSR has all available FP bits set,
> > +			 * then they are live and no need to restore. If not,
> > +			 * it means the regs were given up and restore_math
> > +			 * may decide to restore them (to avoid taking an FP
> > +			 * fault).
> > +			 */
> > +			if ((regs->msr & mathflags) != mathflags)
> > +				restore_math(regs);
> > +		}
> > +	}
> > +
> > +	check_return_regs_valid(regs);
> > +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > +	local_paca->tm_scratch = regs->msr;
> > +#endif
> > +	/* Restore user access locks last */
> > +	kuap_user_restore(regs);
> > +}
> > +
> > +#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
> > +
> > +static __always_inline void arch_exit_to_user_mode(void)
> > +{
> > +	booke_load_dbcr0();
> > +
> > +	account_cpu_user_exit();
> > +}
> > +
> > +#define arch_exit_to_user_mode arch_exit_to_user_mode
> > +
> >   #endif /* CONFIG_GENERIC_IRQ_ENTRY */
> >   #endif /* _ASM_PPC_ENTRY_COMMON_H */
> > diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> > index ca8a2cda9400..77ff8e33f8cd 100644
> > --- a/arch/powerpc/include/asm/interrupt.h
> > +++ b/arch/powerpc/include/asm/interrupt.h
> > @@ -68,6 +68,8 @@
> >   #include <linux/context_tracking.h>
> >   #include <linux/hardirq.h>
> > +#include <linux/sched/debug.h> /* for show_regs */
> > +
> >   #include <asm/cputime.h>
> >   #include <asm/firmware.h>
> >   #include <asm/ftrace.h>
> > @@ -172,6 +174,86 @@ static inline void booke_restore_dbcr0(void)
> >   #endif
> >   }
> > +static inline void check_return_regs_valid(struct pt_regs *regs)
> 
> This was previously a notrace function. Should it be marked __always_inline
> instead of just inline ?
> 
Yes it should. Will fix this too.
> > +{
> > +#ifdef CONFIG_PPC_BOOK3S_64
> > +	unsigned long trap, srr0, srr1;
> > +	static bool warned;
> > +	u8 *validp;
> > +	char *h;
> > +
> > +	if (trap_is_scv(regs))
> > +		return;
> > +
> > +	trap = TRAP(regs);
> > +	// EE in HV mode sets HSRRs like 0xea0
> > +	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
> > +		trap = 0xea0;
> > +
> > +	switch (trap) {
> > +	case 0x980:
> > +	case INTERRUPT_H_DATA_STORAGE:
> > +	case 0xe20:
> > +	case 0xe40:
> > +	case INTERRUPT_HMI:
> > +	case 0xe80:
> > +	case 0xea0:
> > +	case INTERRUPT_H_FAC_UNAVAIL:
> > +	case 0x1200:
> > +	case 0x1500:
> > +	case 0x1600:
> > +	case 0x1800:
> > +		validp = &local_paca->hsrr_valid;
> > +		if (!READ_ONCE(*validp))
> > +			return;
> > +
> > +		srr0 = mfspr(SPRN_HSRR0);
> > +		srr1 = mfspr(SPRN_HSRR1);
> > +		h = "H";
> > +
> > +		break;
> > +	default:
> > +		validp = &local_paca->srr_valid;
> > +		if (!READ_ONCE(*validp))
> > +			return;
> > +
> > +		srr0 = mfspr(SPRN_SRR0);
> > +		srr1 = mfspr(SPRN_SRR1);
> > +		h = "";
> > +		break;
> > +	}
> > +
> > +	if (srr0 == regs->nip && srr1 == regs->msr)
> > +		return;
> > +
> > +	/*
> > +	 * A NMI / soft-NMI interrupt may have come in after we found
> > +	 * srr_valid and before the SRRs are loaded. The interrupt then
> > +	 * comes in and clobbers SRRs and clears srr_valid. Then we load
> > +	 * the SRRs here and test them above and find they don't match.
> > +	 *
> > +	 * Test validity again after that, to catch such false positives.
> > +	 *
> > +	 * This test in general will have some window for false negatives
> > +	 * and may not catch and fix all such cases if an NMI comes in
> > +	 * later and clobbers SRRs without clearing srr_valid, but hopefully
> > +	 * such things will get caught most of the time, statistically
> > +	 * enough to be able to get a warning out.
> > +	 */
> > +	if (!READ_ONCE(*validp))
> > +		return;
> > +
> > +	if (!data_race(warned)) {
> > +		data_race(warned = true);
> > +		pr_warn("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
> > +		pr_warn("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
> > +		show_regs(regs);
> > +	}
> > +
> > +	WRITE_ONCE(*validp, 0); /* fixup */
> > +#endif
> > +}
> > +
> >   static inline void interrupt_enter_prepare(struct pt_regs *regs)
> >   {
> >   #ifdef CONFIG_PPC64
> > diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> > index 2a09ac5dabd6..f53d432f6087 100644
> > --- a/arch/powerpc/kernel/interrupt.c
> > +++ b/arch/powerpc/kernel/interrupt.c
> > @@ -4,7 +4,6 @@
> >   #include <linux/err.h>
> >   #include <linux/compat.h>
> >   #include <linux/rseq.h>
> > -#include <linux/sched/debug.h> /* for show_regs */
> >   #include <asm/kup.h>
> >   #include <asm/cputime.h>
> > @@ -78,86 +77,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
> >   	return true;
> >   }
> > -static notrace void check_return_regs_valid(struct pt_regs *regs)
> > -{
> > -#ifdef CONFIG_PPC_BOOK3S_64
> > -	unsigned long trap, srr0, srr1;
> > -	static bool warned;
> > -	u8 *validp;
> > -	char *h;
> > -
> > -	if (trap_is_scv(regs))
> > -		return;
> > -
> > -	trap = TRAP(regs);
> > -	// EE in HV mode sets HSRRs like 0xea0
> > -	if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
> > -		trap = 0xea0;
> > -
> > -	switch (trap) {
> > -	case 0x980:
> > -	case INTERRUPT_H_DATA_STORAGE:
> > -	case 0xe20:
> > -	case 0xe40:
> > -	case INTERRUPT_HMI:
> > -	case 0xe80:
> > -	case 0xea0:
> > -	case INTERRUPT_H_FAC_UNAVAIL:
> > -	case 0x1200:
> > -	case 0x1500:
> > -	case 0x1600:
> > -	case 0x1800:
> > -		validp = &local_paca->hsrr_valid;
> > -		if (!READ_ONCE(*validp))
> > -			return;
> > -
> > -		srr0 = mfspr(SPRN_HSRR0);
> > -		srr1 = mfspr(SPRN_HSRR1);
> > -		h = "H";
> > -
> > -		break;
> > -	default:
> > -		validp = &local_paca->srr_valid;
> > -		if (!READ_ONCE(*validp))
> > -			return;
> > -
> > -		srr0 = mfspr(SPRN_SRR0);
> > -		srr1 = mfspr(SPRN_SRR1);
> > -		h = "";
> > -		break;
> > -	}
> > -
> > -	if (srr0 == regs->nip && srr1 == regs->msr)
> > -		return;
> > -
> > -	/*
> > -	 * A NMI / soft-NMI interrupt may have come in after we found
> > -	 * srr_valid and before the SRRs are loaded. The interrupt then
> > -	 * comes in and clobbers SRRs and clears srr_valid. Then we load
> > -	 * the SRRs here and test them above and find they don't match.
> > -	 *
> > -	 * Test validity again after that, to catch such false positives.
> > -	 *
> > -	 * This test in general will have some window for false negatives
> > -	 * and may not catch and fix all such cases if an NMI comes in
> > -	 * later and clobbers SRRs without clearing srr_valid, but hopefully
> > -	 * such things will get caught most of the time, statistically
> > -	 * enough to be able to get a warning out.
> > -	 */
> > -	if (!READ_ONCE(*validp))
> > -		return;
> > -
> > -	if (!data_race(warned)) {
> > -		data_race(warned = true);
> > -		printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
> > -		printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
> > -		show_regs(regs);
> > -	}
> > -
> > -	WRITE_ONCE(*validp, 0); /* fixup */
> > -#endif
> > -}
> > -
> >   static notrace unsigned long
> >   interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
> >   {
> > diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> > index aa17e62f3754..719930cf4ae1 100644
> > --- a/arch/powerpc/kernel/signal.c
> > +++ b/arch/powerpc/kernel/signal.c
> > @@ -22,6 +22,11 @@
> >   #include "signal.h"
> > +/* This will be removed */
> > +#ifdef CONFIG_GENERIC_ENTRY
> 
> Is this #ifdef really needed ?
Will fix this.
> 
> > +#include <linux/entry-common.h>
> > +#endif /* CONFIG_GENERIC_ENTRY */
> > +
> >   #ifdef CONFIG_VSX
> >   unsigned long copy_fpr_to_user(void __user *to,
> >   			       struct task_struct *task)
> > @@ -368,3 +373,12 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
> >   		printk_ratelimited(regs->msr & MSR_64BIT ? fm64 : fm32, tsk->comm,
> >   				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
> >   }
> > +
> > +#ifdef CONFIG_GENERIC_ENTRY
> 
> Why is this #ifdef needed ?
> 
> > +void arch_do_signal_or_restart(struct pt_regs *regs)
> > +{
> > +	BUG_ON(regs != current->thread.regs);
> 
> Is this BUG_ON() needed ? Can't we use something smoother ?
> 
I am not sure what to do here. Proceeding silently seemed dangerous,
so I went with a BUG_ON. Please suggest an alternative if something
better comes to mind.

Regards,
Mukesh
> > +	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
> > +	do_signal(current);
> > +}
> > +#endif /* CONFIG_GENERIC_ENTRY */
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs
  2025-12-16  9:52   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 14:56     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 14:56 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 10:52:42AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> Le 14/12/2025 à 14:02, Mukesh Kumar Chaurasiya a écrit :
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Add a new field `exit_flags` in the pt_regs structure. This field will hold
> > the flags set during interrupt or syscall execution that are required during
> > exit to user mode.
> > 
> > Specifically, the `TIF_RESTOREALL` flag, stored in this field, helps the
> > exit routine determine if any NVGPRs were modified and need to be restored
> > before returning to userspace.
> 
> In the current implementation we did our best to keep this information in a
> local var for performance reasons. Have you assessed the performance impact
> of going through the stack for that ?
> 
I needed this information outside the call stack, so I kept it here. After
enabling the code as a whole, I didn't see much of an impact.
> > 
> > This addition ensures a clean and architecture-specific mechanism to track
> > per-syscall or per-interrupt state transitions related to register restore.
> > 
> > Changes:
> >   - Add `exit_flags` and `__pt_regs_pad` to maintain 16-byte stack alignment
> >   - Update asm-offsets.c and ptrace.c for offset and validation
> >   - Update PT_* constants in uapi header to reflect the new layout
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/ptrace.h      |  3 +++
> >   arch/powerpc/include/uapi/asm/ptrace.h | 14 +++++++++-----
> >   arch/powerpc/kernel/asm-offsets.c      |  1 +
> >   arch/powerpc/kernel/ptrace/ptrace.c    |  1 +
> >   4 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
> > index 94aa1de2b06e..3af8a5898fe3 100644
> > --- a/arch/powerpc/include/asm/ptrace.h
> > +++ b/arch/powerpc/include/asm/ptrace.h
> > @@ -53,6 +53,9 @@ struct pt_regs
> >   				unsigned long esr;
> >   			};
> >   			unsigned long result;
> > +			unsigned long exit_flags;
> > +			/* Maintain 16 byte interrupt stack alignment */
> 
> On powerpc/32, one 'long' is 4 bytes not 8.
> 
Oh, OK. Will fix this in the next revision.
> > +			unsigned long __pt_regs_pad[1];
> >   		};
> >   	};
> >   #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_KUAP)
> > diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
> > index 01e630149d48..de56b216c9c5 100644
> > --- a/arch/powerpc/include/uapi/asm/ptrace.h
> > +++ b/arch/powerpc/include/uapi/asm/ptrace.h
> > @@ -55,6 +55,8 @@ struct pt_regs
> >   	unsigned long dar;		/* Fault registers */
> >   	unsigned long dsisr;		/* on 4xx/Book-E used for ESR */
> >   	unsigned long result;		/* Result of a system call */
> > +	unsigned long exit_flags;	/* System call exit flags */
> > +	unsigned long __pt_regs_pad[1];	/* Maintain 16 byte interrupt stack alignment */
> 
> On powerpc/32, one 'long' is 4 bytes not 8.
> 
Will fix this too.
> >   };
> >   #endif /* __ASSEMBLER__ */
> > @@ -114,10 +116,12 @@ struct pt_regs
> >   #define PT_DAR	41
> >   #define PT_DSISR 42
> >   #define PT_RESULT 43
> > -#define PT_DSCR 44
> > -#define PT_REGS_COUNT 44
> > +#define PT_EXIT_FLAGS 44
> > +#define PT_PAD 45
> > +#define PT_DSCR 46
> > +#define PT_REGS_COUNT 46
> > -#define PT_FPR0	48	/* each FP reg occupies 2 slots in this space */
> > +#define PT_FPR0	(PT_REGS_COUNT + 4)	/* each FP reg occupies 2 slots in this space */
> >   #ifndef __powerpc64__
> > @@ -129,7 +133,7 @@ struct pt_regs
> >   #define PT_FPSCR (PT_FPR0 + 32)	/* each FP reg occupies 1 slot in 64-bit space */
> > -#define PT_VR0 82	/* each Vector reg occupies 2 slots in 64-bit */
> > +#define PT_VR0	(PT_FPSCR + 2)	/* <82> each Vector reg occupies 2 slots in 64-bit */
> >   #define PT_VSCR (PT_VR0 + 32*2 + 1)
> >   #define PT_VRSAVE (PT_VR0 + 33*2)
> > @@ -137,7 +141,7 @@ struct pt_regs
> >   /*
> >    * Only store first 32 VSRs here. The second 32 VSRs in VR0-31
> >    */
> > -#define PT_VSR0 150	/* each VSR reg occupies 2 slots in 64-bit */
> > +#define PT_VSR0	(PT_VRSAVE + 2)	/* each VSR reg occupies 2 slots in 64-bit */
> >   #define PT_VSR31 (PT_VSR0 + 2*31)
> >   #endif /* __powerpc64__ */
> > diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> > index a4bc80b30410..c0bb09f1db78 100644
> > --- a/arch/powerpc/kernel/asm-offsets.c
> > +++ b/arch/powerpc/kernel/asm-offsets.c
> > @@ -292,6 +292,7 @@ int main(void)
> >   	STACK_PT_REGS_OFFSET(_ESR, esr);
> >   	STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3);
> >   	STACK_PT_REGS_OFFSET(RESULT, result);
> > +	STACK_PT_REGS_OFFSET(EXIT_FLAGS, exit_flags);
> 
> Where is that used ?
> 
It's not used anywhere as of now, but I kept it there as a convention.
Should it be removed?

Regards,
Mukesh
> >   	STACK_PT_REGS_OFFSET(_TRAP, trap);
> >   #ifdef CONFIG_PPC64
> >   	STACK_PT_REGS_OFFSET(SOFTE, softe);
> > diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> > index c6997df63287..2134b6d155ff 100644
> > --- a/arch/powerpc/kernel/ptrace/ptrace.c
> > +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> > @@ -432,6 +432,7 @@ void __init pt_regs_check(void)
> >   	CHECK_REG(PT_DAR, dar);
> >   	CHECK_REG(PT_DSISR, dsisr);
> >   	CHECK_REG(PT_RESULT, result);
> > +	CHECK_REG(PT_EXIT_FLAGS, exit_flags);
> >   	#undef CHECK_REG
> >   	BUILD_BUG_ON(PT_REGS_COUNT != sizeof(struct user_pt_regs) / sizeof(unsigned long));
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-16  9:58   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 15:00     ` Mukesh Kumar Chaurasiya
  2025-12-16 22:40       ` Christophe Leroy (CS GROUP)
  0 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:00 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 10:58:16AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> Le 14/12/2025 à 14:02, Mukesh Kumar Chaurasiya a écrit :
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Move interrupt entry and exit helper routines from interrupt.h into the
> > PowerPC-specific entry-common.h header as a preparatory step for enabling
> > the generic entry/exit framework.
> > 
> > This consolidation places all PowerPC interrupt entry/exit handling in a
> > single common header, aligning with the generic entry infrastructure.
> > The helpers provide architecture-specific handling for interrupt and NMI
> > entry/exit sequences, including:
> > 
> >   - arch_interrupt_enter/exit_prepare()
> >   - arch_interrupt_async_enter/exit_prepare()
> >   - arch_interrupt_nmi_enter/exit_prepare()
> >   - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
> >     debug register maintenance, and soft mask handling.
> > 
> > The functions are copied verbatim from interrupt.h to avoid functional
> > changes at this stage. Subsequent patches will integrate these routines
> > into the generic entry/exit flow.
> 
> Can we move them instead of duplicating them ?
> 
Until we enable the generic framework, I didn't want to touch the code
path already in use. Once the generic code is enabled, all the unused
code should be removed. This helps with bisecting any future issues
caused by this change.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/entry-common.h | 422 ++++++++++++++++++++++++
> >   1 file changed, 422 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > index e8ebd42a4e6d..e8bde4c67eaf 100644
> > --- a/arch/powerpc/include/asm/entry-common.h
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -7,10 +7,432 @@
> >   #include <asm/cputime.h>
> >   #include <asm/interrupt.h>
> > +#include <asm/runlatch.h>
> >   #include <asm/stacktrace.h>
> >   #include <asm/switch_to.h>
> >   #include <asm/tm.h>
> > +#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
> > +/*
> > + * WARN/BUG is handled with a program interrupt so minimise checks here to
> > + * avoid recursion and maximise the chance of getting the first oops handled.
> > + */
> > +#define INT_SOFT_MASK_BUG_ON(regs, cond)				\
> > +do {									\
> > +	if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM)))	\
> > +		BUG_ON(cond);						\
> > +} while (0)
> > +#else
> > +#define INT_SOFT_MASK_BUG_ON(regs, cond)
> > +#endif
> > +
> > +#ifdef CONFIG_PPC_BOOK3S_64
> > +extern char __end_soft_masked[];
> > +bool search_kernel_soft_mask_table(unsigned long addr);
> > +unsigned long search_kernel_restart_table(unsigned long addr);
> > +
> > +DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
> > +

[...]

> > +static inline bool nmi_disables_ftrace(struct pt_regs *regs)
> > +{
> > +	/* Allow DEC and PMI to be traced when they are soft-NMI */
> > +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
> > +		if (TRAP(regs) == INTERRUPT_DECREMENTER)
> > +			return false;
> > +		if (TRAP(regs) == INTERRUPT_PERFMON)
> > +			return false;
> > +	}
> > +	if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
> > +		if (TRAP(regs) == INTERRUPT_PERFMON)
> > +			return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> > +static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
> > +					       struct interrupt_nmi_state *state)
> 
> CHECK: Alignment should match open parenthesis
> #354: FILE: arch/powerpc/include/asm/entry-common.h:322:
> +static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
> +					       struct interrupt_nmi_state *state)
> 
> 
Will fix this.
> > +{
> > +#ifdef CONFIG_PPC64
> > +	state->irq_soft_mask = local_paca->irq_soft_mask;
> > +	state->irq_happened = local_paca->irq_happened;
> > +	state->softe = regs->softe;
> > +
> > +	/*
> > +	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
> > +	 * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
> > +	 * because that goes through irq tracing which we don't want in NMI.
> > +	 */
> > +	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
> > +	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> > +
> > +	if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
> > +		/*
> > +		 * Adjust regs->softe to be soft-masked if it had not been
> > +		 * reconciled (e.g., interrupt entry with MSR[EE]=0 but softe
> > +		 * not yet set disabled), or if it was in an implicit soft
> > +		 * masked state. This makes regs_irqs_disabled(regs)
> > +		 * behave as expected.
> > +		 */
> > +		regs->softe = IRQS_ALL_DISABLED;
> > +	}
> > +
> > +	__hard_RI_enable();
> > +
> > +	/* Don't do any per-CPU operations until interrupt state is fixed */
> > +
> > +	if (nmi_disables_ftrace(regs)) {
> > +		state->ftrace_enabled = this_cpu_get_ftrace_enabled();
> > +		this_cpu_set_ftrace_enabled(0);
> > +	}
> > +#endif
> > +
> > +	/* If data relocations are enabled, it's safe to use nmi_enter() */
> > +	if (mfmsr() & MSR_DR) {
> > +		nmi_enter();
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
> > +	 * NMI because not everything it touches is within the RMA limit.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> > +	    firmware_has_feature(FW_FEATURE_LPAR) &&
> > +	    !radix_enabled())
> > +		return;
> > +
> > +	/*
> > +	 * Likewise, don't use it if we have some form of instrumentation (like
> > +	 * KASAN shadow) that is not safe to access in real mode (even on radix)
> > +	 */
> > +	if (IS_ENABLED(CONFIG_KASAN))
> > +		return;
> > +
> > +	/*
> > +	 * Likewise, do not use it in real mode if percpu first chunk is not
> > +	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
> > +	 * are chances where percpu allocation can come from vmalloc area.
> > +	 */
> > +	if (percpu_first_chunk_is_paged)
> > +		return;
> > +
> > +	/* Otherwise, it should be safe to call it */
> > +	nmi_enter();
> > +}
> > +
> > +static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
> > +					      struct interrupt_nmi_state *state)
> > +{
> 
> CHECK: Alignment should match open parenthesis
> #425: FILE: arch/powerpc/include/asm/entry-common.h:393:
> +static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
> +					      struct interrupt_nmi_state *state)
> 

Will fix this.

Regards,
Mukesh
> > +	if (mfmsr() & MSR_DR) {
> > +		// nmi_exit if relocations are on
> > +		nmi_exit();
> > +	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> > +		   firmware_has_feature(FW_FEATURE_LPAR) &&
> > +		   !radix_enabled()) {
> > +		// no nmi_exit for a pseries hash guest taking a real mode exception
> > +	} else if (IS_ENABLED(CONFIG_KASAN)) {
> > +		// no nmi_exit for KASAN in real mode
> > +	} else if (percpu_first_chunk_is_paged) {
> > +		// no nmi_exit if percpu first chunk is not embedded
> > +	} else {
> > +		nmi_exit();
> > +	}
> > +
> > +	/*
> > +	 * nmi does not call nap_adjust_return because nmi should not create
> > +	 * new work to do (must use irq_work for that).
> > +	 */
> > +
> > +#ifdef CONFIG_PPC64
> > +#ifdef CONFIG_PPC_BOOK3S
> > +	if (regs_irqs_disabled(regs)) {
> > +		unsigned long rst = search_kernel_restart_table(regs->nip);
> > +
> > +		if (rst)
> > +			regs_set_return_ip(regs, rst);
> > +	}
> > +#endif
> > +
> > +	if (nmi_disables_ftrace(regs))
> > +		this_cpu_set_ftrace_enabled(state->ftrace_enabled);
> > +
> > +	/* Check we didn't change the pending interrupt mask. */
> > +	WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
> > +	regs->softe = state->softe;
> > +	local_paca->irq_happened = state->irq_happened;
> > +	local_paca->irq_soft_mask = state->irq_soft_mask;
> > +#endif
> > +}
> > +
> >   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> >   {
> >   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-16  6:29   ` kernel test robot
@ 2025-12-16 15:02     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:02 UTC (permalink / raw)
  To: kernel test robot
  Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, mingo, atrajeev, mark.barnett,
	linuxppc-dev, linux-kernel, oe-kbuild-all

On Tue, Dec 16, 2025 at 02:29:29PM +0800, kernel test robot wrote:
> Hi Mukesh,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on powerpc/next]
> [also build test WARNING on powerpc/fixes linus/master v6.19-rc1]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> patch link:    https://lore.kernel.org/r/20251214130245.43664-8-mkchauras%40linux.ibm.com
> patch subject: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
> config: powerpc-randconfig-r072-20251215 (https://download.01.org/0day-ci/archive/20251216/202512161441.xlMhHxvl-lkp@intel.com/config)
> compiler: powerpc-linux-gcc (GCC) 8.5.0
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202512161441.xlMhHxvl-lkp@intel.com/
> 
> smatch warnings:
> arch/powerpc/include/asm/entry-common.h:433 arch_enter_from_user_mode() warn: inconsistent indenting
> 
> vim +433 arch/powerpc/include/asm/entry-common.h
> 
> 2b0f05f77f11f8 Mukesh Kumar Chaurasiya 2025-12-14  396  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  397  static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  398  {
> 37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  399  	kuap_lock();
> 37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  400  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  401  	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  402  		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  403  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  404  	BUG_ON(regs_is_unrecoverable(regs));
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  405  	BUG_ON(!user_mode(regs));
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  406  	BUG_ON(regs_irqs_disabled(regs));
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  407  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  408  #ifdef CONFIG_PPC_PKEY
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  409  	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  410  		unsigned long amr, iamr;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  411  		bool flush_needed = false;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  412  		/*
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  413  		 * When entering from userspace we mostly have the AMR/IAMR
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  414  		 * different from kernel default values. Hence don't compare.
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  415  		 */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  416  		amr = mfspr(SPRN_AMR);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  417  		iamr = mfspr(SPRN_IAMR);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  418  		regs->amr  = amr;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  419  		regs->iamr = iamr;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  420  		if (mmu_has_feature(MMU_FTR_KUAP)) {
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  421  			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  422  			flush_needed = true;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  423  		}
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  424  		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  425  			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  426  			flush_needed = true;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  427  		}
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  428  		if (flush_needed)
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  429  			isync();
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  430  	} else
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  431  #endif
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  432  		kuap_assert_locked();
Will fix this in the next iteration.
Regards,
Mukesh
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14 @433  	booke_restore_dbcr0();
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  434  	account_cpu_user_entry();
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  435  	account_stolen_time();
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  436  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  437  	/*
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  438  	 * This is not required for the syscall exit path, but makes the
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  439  	 * stack frame look nicer. If this was initialised in the first stack
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  440  	 * frame, or if the unwinder was taught the first stack frame always
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  441  	 * returns to user with IRQS_ENABLED, this store could be avoided!
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  442  	 */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  443  	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  444  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  445  	/*
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  446  	 * If system call is called with TM active, set _TIF_RESTOREALL to
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  447  	 * prevent RFSCV being used to return to userspace, because POWER9
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  448  	 * TM implementation has problems with this instruction returning to
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  449  	 * transactional state. Final register values are not relevant because
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  450  	 * the transaction will be aborted upon return anyway. Or in the case
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  451  	 * of unsupported_scv SIGILL fault, the return state does not much
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  452  	 * matter because it's an edge case.
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  453  	 */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  454  	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  455  	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  456  		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  457  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  458  	/*
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  459  	 * If the system call was made with a transaction active, doom it and
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  460  	 * return without performing the system call. Unless it was an
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  461  	 * unsupported scv vector, in which case it's treated like an illegal
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  462  	 * instruction.
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  463  	 */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  464  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  465  	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  466  	    !trap_is_unsupported_scv(regs)) {
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  467  		/* Enable TM in the kernel, and disable EE (for scv) */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  468  		hard_irq_disable();
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  469  		mtmsr(mfmsr() | MSR_TM);
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  470  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  471  		/* tabort, this dooms the transaction, nothing else */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  472  		asm volatile(".long 0x7c00071d | ((%0) << 16)"
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  473  			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  474  
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  475  		/*
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  476  		 * Userspace will never see the return value. Execution will
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  477  		 * resume after the tbegin. of the aborted transaction with the
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  478  		 * checkpointed register state. A context switch could occur
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  479  		 * or signal delivered to the process before resuming the
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  480  		 * doomed transaction context, but that should all be handled
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  481  		 * as expected.
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  482  		 */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  483  		return;
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  484  	}
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  485  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  486  }
> 1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  487  
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-16 10:43   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 15:06     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:06 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 11:43:02AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> Le 14/12/2025 à 14:02, Mukesh Kumar Chaurasiya a écrit :
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Enable the generic IRQ entry/exit infrastructure on PowerPC by selecting
> > GENERIC_IRQ_ENTRY and integrating the architecture-specific interrupt
> > handlers with the generic entry/exit APIs.
> > 
> > This change replaces PowerPC’s local interrupt entry/exit handling with
> > calls to the generic irqentry_* helpers, aligning the architecture with
> > the common kernel entry model. The macros that define interrupt, async,
> > and NMI handlers are updated to use irqentry_enter()/irqentry_exit()
> > and irqentry_nmi_enter()/irqentry_nmi_exit() where applicable.
> > 
> > Key updates include:
> >   - Select GENERIC_IRQ_ENTRY in Kconfig.
> >   - Replace interrupt_enter/exit_prepare() with arch_interrupt_* helpers.
> >   - Integrate irqentry_enter()/exit() in standard and async interrupt paths.
> >   - Integrate irqentry_nmi_enter()/exit() in NMI handlers.
> >   - Remove redundant irq_enter()/irq_exit() calls now handled generically.
> >   - Use irqentry_exit_cond_resched() for preemption checks.
> > 
> > This change establishes the necessary wiring for PowerPC to use the
> > generic IRQ entry/exit framework while maintaining existing semantics.
> 
> Did you look into resulting code ?
> 
> do_IRQ() is bigger and calls irqentry_enter() which is bigger than
> irq_enter().
> 
> And irq_enter_rcu() was tail-called from irq_enter(), now is it called after
> irqentry_enter().
> 
I am not sure I understand your question correctly here. Could you
elaborate a little more?

> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/Kconfig                    |   1 +
> >   arch/powerpc/include/asm/entry-common.h |  66 +---
> >   arch/powerpc/include/asm/interrupt.h    | 499 +++---------------------
> >   arch/powerpc/kernel/interrupt.c         |  13 +-
> >   4 files changed, 74 insertions(+), 505 deletions(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index e24f4d88885a..b0c602c3bbe1 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -206,6 +206,7 @@ config PPC
> >   	select GENERIC_GETTIMEOFDAY
> >   	select GENERIC_IDLE_POLL_SETUP
> >   	select GENERIC_IOREMAP
> > +	select GENERIC_IRQ_ENTRY
> >   	select GENERIC_IRQ_SHOW
> >   	select GENERIC_IRQ_SHOW_LEVEL
> >   	select GENERIC_PCI_IOMAP		if PCI
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > index e8bde4c67eaf..e2ae7416dee1 100644
> > --- a/arch/powerpc/include/asm/entry-common.h
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -257,6 +257,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
> >    */
> >   static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
> >   {
> > +	if (user_mode(regs)) {
> > +		BUG_ON(regs_is_unrecoverable(regs));
> > +		BUG_ON(regs_irqs_disabled(regs));
> > +		/*
> > +		 * We don't need to restore AMR on the way back to userspace for KUAP.
> > +		 * AMR can only have been unlocked if we interrupted the kernel.
> > +		 */
> > +		kuap_assert_locked();
> > +
> > +		local_irq_disable();
> > +	}
> >   }
> >   static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> > @@ -275,7 +286,6 @@ static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
> >   	    !test_thread_local_flags(_TLF_RUNLATCH))
> >   		__ppc64_runlatch_on();
> >   #endif
> > -	irq_enter();
> >   }
> >   static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
> > @@ -288,7 +298,6 @@ static inline void arch_interrupt_async_exit_prepare(struct pt_regs *regs)
> >   	 */
> >   	nap_adjust_return(regs);
> > -	irq_exit();
> >   	arch_interrupt_exit_prepare(regs);
> >   }
> > @@ -354,59 +363,11 @@ static inline void arch_interrupt_nmi_enter_prepare(struct pt_regs *regs,
> >   		this_cpu_set_ftrace_enabled(0);
> >   	}
> >   #endif
> > -
> > -	/* If data relocations are enabled, it's safe to use nmi_enter() */
> > -	if (mfmsr() & MSR_DR) {
> > -		nmi_enter();
> > -		return;
> > -	}
> > -
> > -	/*
> > -	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
> > -	 * NMI because not everything it touches is within the RMA limit.
> > -	 */
> > -	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> > -	    firmware_has_feature(FW_FEATURE_LPAR) &&
> > -	    !radix_enabled())
> > -		return;
> > -
> > -	/*
> > -	 * Likewise, don't use it if we have some form of instrumentation (like
> > -	 * KASAN shadow) that is not safe to access in real mode (even on radix)
> > -	 */
> > -	if (IS_ENABLED(CONFIG_KASAN))
> > -		return;
> > -
> > -	/*
> > -	 * Likewise, do not use it in real mode if percpu first chunk is not
> > -	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
> > -	 * are chances where percpu allocation can come from vmalloc area.
> > -	 */
> > -	if (percpu_first_chunk_is_paged)
> > -		return;
> > -
> > -	/* Otherwise, it should be safe to call it */
> > -	nmi_enter();
> >   }
> >   static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
> >   					      struct interrupt_nmi_state *state)
> >   {
> > -	if (mfmsr() & MSR_DR) {
> > -		// nmi_exit if relocations are on
> > -		nmi_exit();
> > -	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
> > -		   firmware_has_feature(FW_FEATURE_LPAR) &&
> > -		   !radix_enabled()) {
> > -		// no nmi_exit for a pseries hash guest taking a real mode exception
> > -	} else if (IS_ENABLED(CONFIG_KASAN)) {
> > -		// no nmi_exit for KASAN in real mode
> > -	} else if (percpu_first_chunk_is_paged) {
> > -		// no nmi_exit if percpu first chunk is not embedded
> > -	} else {
> > -		nmi_exit();
> > -	}
> > -
> >   	/*
> >   	 * nmi does not call nap_adjust_return because nmi should not create
> >   	 * new work to do (must use irq_work for that).
> > @@ -435,6 +396,8 @@ static inline void arch_interrupt_nmi_exit_prepare(struct pt_regs *regs,
> >   static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> >   {
> > +	kuap_lock();
> > +
> 
> A reason why this change comes now and not in the patch that added
> arch_enter_from_user_mode() ?
> 
Yes, it should have been. I will fix this in the next revision.

> >   	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> >   		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
> > @@ -467,11 +430,8 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> >   	} else
> >   #endif
> >   		kuap_assert_locked();
> > -
> >   	booke_restore_dbcr0();
> > -
> 
> This is cosmetic, should have been done when adding
> arch_enter_from_user_mode()
>
Sure, will fix this.

Regards,
Mukesh
> >   	account_cpu_user_entry();
> > -
> >   	account_stolen_time();
> >   	/*
> 
> 
> Christophe


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-15 20:27   ` kernel test robot
@ 2025-12-16 15:08     ` Mukesh Kumar Chaurasiya
  2025-12-16 22:57       ` Christophe Leroy (CS GROUP)
  0 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:08 UTC (permalink / raw)
  To: kernel test robot
  Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
	mchauras, thuth, sshegde, charlie, macro, akpm, ldv, deller,
	ankur.a.arora, segher, tglx, thomas.weissschuh, peterz,
	menglong8.dong, bigeasy, namcao, mingo, atrajeev, mark.barnett,
	linuxppc-dev, linux-kernel, oe-kbuild-all

On Tue, Dec 16, 2025 at 04:27:55AM +0800, kernel test robot wrote:
> Hi Mukesh,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on powerpc/fixes linus/master v6.19-rc1 next-20251215]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> patch link:    https://lore.kernel.org/r/20251214130245.43664-9-mkchauras%40linux.ibm.com
> patch subject: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
> config: powerpc-randconfig-001-20251215 (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/config)
> compiler: powerpc-linux-gcc (GCC) 9.5.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202512160453.iO9WNjrm-lkp@intel.com/
> 
I tried this with gcc 9.4 and gcc 14 but am not able to reproduce it.
I will investigate further; meanwhile, if anyone has ideas that could help, that would be great.

Regards,
Mukesh
> All errors (new ones prefixed by >>):
> 
>    powerpc-linux-ld: init/main.o: in function `do_trace_event_raw_event_initcall_level':
>    include/trace/events/initcall.h:10: undefined reference to `memcpy'
>    powerpc-linux-ld: init/main.o: in function `repair_env_string':
>    init/main.c:512: undefined reference to `memmove'
>    powerpc-linux-ld: init/do_mounts.o: in function `do_mount_root':
>    init/do_mounts.c:162: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `start_thread':
>    arch/powerpc/kernel/process.c:1919: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `__set_breakpoint':
>    arch/powerpc/kernel/process.c:880: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `arch_dup_task_struct':
>    arch/powerpc/kernel/process.c:1724: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/process.o: in function `copy_thread':
>    arch/powerpc/kernel/process.c:1801: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/process.c:1812: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/signal.o: in function `do_signal':
>    arch/powerpc/kernel/signal.c:247: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/time.o: in function `register_decrementer_clockevent':
> >> arch/powerpc/kernel/time.c:834: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/time.o: in function `platform_device_register_resndata':
> >> include/linux/platform_device.h:158: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/prom.o: in function `move_device_tree':
> >> arch/powerpc/kernel/prom.c:134: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/setup-common.o: in function `probe_machine':
> >> arch/powerpc/kernel/setup-common.c:646: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `user_regset_copyin':
> >> include/linux/regset.h:276: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_write':
>    include/linux/regset.h:42: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `gpr_get':
> >> arch/powerpc/kernel/ptrace/ptrace-view.c:230: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_zero':
> >> include/linux/regset.h:30: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `gpr32_get_common':
>    arch/powerpc/kernel/ptrace/ptrace-view.c:707: undefined reference to `memcpy'
> >> powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.c:708: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.c:710: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-view.o: in function `membuf_zero':
> >> include/linux/regset.h:30: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/ptrace/ptrace-novsx.o: in function `membuf_write':
>    include/linux/regset.h:42: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/optprobes.o: in function `can_optimize':
> >> arch/powerpc/kernel/optprobes.c:71: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_map_magic_page':
> >> arch/powerpc/kernel/kvm.c:407: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_patch_ins_mtmsrd':
> >> arch/powerpc/kernel/kvm.c:178: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `kvm_patch_ins_mtmsr':
>    arch/powerpc/kernel/kvm.c:231: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/kernel/kvm.o: in function `epapr_hypercall0_1':
> >> arch/powerpc/include/asm/epapr_hcalls.h:511: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/mm/mem.o: in function `execmem_arch_setup':
> >> arch/powerpc/mm/mem.c:423: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_15':
> >> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_14':
> >> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/mm/init-common.o: in function `ctor_13':
> >> arch/powerpc/mm/init-common.c:81: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/mm/init-common.o:arch/powerpc/mm/init-common.c:81: more undefined references to `memset' follow
>    powerpc-linux-ld: arch/powerpc/lib/pmem.o: in function `memcpy_flushcache':
> >> arch/powerpc/lib/pmem.c:84: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/sysdev/fsl_mpic_err.o: in function `mpic_setup_error_int':
> >> arch/powerpc/sysdev/fsl_mpic_err.c:70: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/platforms/8xx/pic.o: in function `irq_domain_create_linear':
> >> include/linux/irqdomain.h:405: undefined reference to `memset'
>    powerpc-linux-ld: arch/powerpc/platforms/8xx/cpm1.o: in function `cpm1_clk_setup':
>    arch/powerpc/platforms/8xx/cpm1.c:251: undefined reference to `memcpy'
>    powerpc-linux-ld: arch/powerpc/platforms/8xx/cpm1-ic.o: in function `irq_domain_create_linear':
>    include/linux/irqdomain.h:405: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `do_trace_event_raw_event_task_newtask':
>    include/trace/events/task.h:9: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `do_trace_event_raw_event_task_rename':
>    include/trace/events/task.h:34: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `copy_struct_from_user':
>    include/linux/uaccess.h:396: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `copy_clone_args_from_user':
>    kernel/fork.c:2800: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `mm_init':
>    kernel/fork.c:1044: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `bitmap_zero':
>    include/linux/bitmap.h:238: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `pgd_alloc':
>    arch/powerpc/include/asm/nohash/pgalloc.h:26: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `__kmem_cache_create':
>    include/linux/slab.h:379: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `arch_dup_task_struct':
>    kernel/fork.c:854: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `mm_alloc':
>    kernel/fork.c:1120: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `replace_mm_exe_file':
>    kernel/fork.c:1238: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `copy_process':
>    kernel/fork.c:2030: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `posix_cputimers_init':
>    include/linux/posix-timers.h:103: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `copy_sighand':
>    kernel/fork.c:1618: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `copy_signal':
>    kernel/fork.c:1687: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `dup_mm':
>    kernel/fork.c:1483: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/fork.o: in function `create_io_thread':
>    kernel/fork.c:2549: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `kernel_thread':
>    kernel/fork.c:2661: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `user_mode_thread':
>    kernel/fork.c:2678: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `sys_fork':
>    kernel/fork.c:2692: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o: in function `sys_vfork':
>    kernel/fork.c:2707: undefined reference to `memset'
>    powerpc-linux-ld: kernel/fork.o:kernel/fork.c:2740: more undefined references to `memset' follow
>    powerpc-linux-ld: kernel/softirq.o: in function `do_trace_event_raw_event_irq_handler_entry':
>    include/trace/events/irq.h:53: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/resource.o: in function `find_next_iomem_res':
>    kernel/resource.c:372: undefined reference to `memset'
>    powerpc-linux-ld: kernel/resource.o: in function `__request_region_locked':
>    kernel/resource.c:1261: undefined reference to `memset'
>    powerpc-linux-ld: kernel/resource.o: in function `reserve_setup':
>    kernel/resource.c:1757: undefined reference to `memset'
>    powerpc-linux-ld: kernel/resource.c:1760: undefined reference to `memset'
>    powerpc-linux-ld: kernel/sysctl.o: in function `proc_put_long':
>    kernel/sysctl.c:339: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sysctl.o: in function `_proc_do_string':
>    kernel/sysctl.c:127: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sysctl.o: in function `proc_get_long':
>    kernel/sysctl.c:284: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sysctl.o: in function `bitmap_copy':
>    include/linux/bitmap.h:259: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sysctl.o: in function `proc_do_static_key':
>    kernel/sysctl.c:1433: undefined reference to `memset'
>    powerpc-linux-ld: kernel/capability.o: in function `__do_sys_capset':
>    kernel/capability.c:218: undefined reference to `memset'
>    powerpc-linux-ld: kernel/ptrace.o: in function `syscall_set_arguments':
>    arch/powerpc/include/asm/syscall.h:127: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/ptrace.o: in function `ptrace_get_syscall_info':
>    kernel/ptrace.c:998: undefined reference to `memset'
>    powerpc-linux-ld: kernel/ptrace.o: in function `copy_siginfo':
>    include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/user.o: in function `ratelimit_state_init':
>    include/linux/ratelimit.h:12: undefined reference to `memset'
>    powerpc-linux-ld: kernel/user.o: in function `__kmem_cache_create':
>    include/linux/slab.h:379: undefined reference to `memset'
>    powerpc-linux-ld: kernel/signal.o: in function `do_trace_event_raw_event_signal_generate':
>    include/trace/events/signal.h:50: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/signal.o: in function `clear_siginfo':
>    include/linux/signal.h:23: undefined reference to `memset'
>    powerpc-linux-ld: kernel/signal.o: in function `copy_siginfo':
>    include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/signal.o: in function `do_sigaltstack':
>    kernel/signal.c:4396: undefined reference to `memset'
>    powerpc-linux-ld: kernel/signal.o: in function `copy_siginfo':
>    include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: include/linux/signal.h:18: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/signal.o: in function `signals_init':
>    kernel/signal.c:5011: undefined reference to `memset'
>    powerpc-linux-ld: kernel/sys.o: in function `override_release':
>    kernel/sys.c:1331: undefined reference to `memset'
>    powerpc-linux-ld: kernel/sys.o: in function `__do_sys_newuname':
>    kernel/sys.c:1356: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sys.o: in function `__do_sys_uname':
>    kernel/sys.c:1380: undefined reference to `memcpy'
>    powerpc-linux-ld: kernel/sys.o: in function `prctl_set_auxv':
> 
> 
> vim +18 include/linux/signal.h
> 
> ^1da177e4c3f415 Linus Torvalds    2005-04-16  14  
> ae7795bc6187a15 Eric W. Biederman 2018-09-25  15  static inline void copy_siginfo(kernel_siginfo_t *to,
> ae7795bc6187a15 Eric W. Biederman 2018-09-25  16  				const kernel_siginfo_t *from)
> ca9eb49aa9562ea James Hogan       2016-02-08  17  {
> ca9eb49aa9562ea James Hogan       2016-02-08 @18  	memcpy(to, from, sizeof(*to));
> ca9eb49aa9562ea James Hogan       2016-02-08  19  }
> ca9eb49aa9562ea James Hogan       2016-02-08  20  
> ae7795bc6187a15 Eric W. Biederman 2018-09-25  21  static inline void clear_siginfo(kernel_siginfo_t *info)
> 8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  22  {
> 8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24 @23  	memset(info, 0, sizeof(*info));
> 8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  24  }
> 8c5dbf2ae00bb86 Eric W. Biederman 2017-07-24  25  
> 4ce5f9c9e754691 Eric W. Biederman 2018-09-25  26  #define SI_EXPANSION_SIZE (sizeof(struct siginfo) - sizeof(struct kernel_siginfo))
> 4ce5f9c9e754691 Eric W. Biederman 2018-09-25  27  
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  28  static inline void copy_siginfo_to_external(siginfo_t *to,
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  29  					    const kernel_siginfo_t *from)
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  30  {
> fa4751f454e6b51 Eric W. Biederman 2020-05-05 @31  	memcpy(to, from, sizeof(*from));
> fa4751f454e6b51 Eric W. Biederman 2020-05-05 @32  	memset(((char *)to) + sizeof(struct kernel_siginfo), 0,
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  33  		SI_EXPANSION_SIZE);
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  34  }
> fa4751f454e6b51 Eric W. Biederman 2020-05-05  35  
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-16  6:41   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 15:09     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:09 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 07:41:11AM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Convert the PowerPC syscall entry and exit paths to use the generic
> > entry/exit framework by selecting GENERIC_ENTRY and integrating with
> > the common syscall handling routines.
> > 
> > This change transitions PowerPC away from its custom syscall entry and
> > exit code to use the generic helpers such as:
> >   - syscall_enter_from_user_mode()
> >   - syscall_exit_to_user_mode()
> > 
> > As part of this migration:
> >   - The architecture now selects GENERIC_ENTRY in Kconfig.
> >   - Old tracing, seccomp, and audit handling in ptrace.c is removed in
> >     favor of generic entry infrastructure.
> >   - interrupt.c and syscall.c are simplified to delegate context
> >     management and user exit handling to the generic entry path.
> >   - The new pt_regs field `exit_flags` introduced earlier is now used
> >     to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).
> > 
> > This aligns PowerPC with the common entry code used by other
> > architectures and reduces duplicated logic around syscall tracing,
> > context tracking, and signal handling.
> > 
> > The performance benchmarks from perf bench basic syscall are below:
> > 
> > perf bench syscall usec/op
> > 
> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
> > | getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
> > | fork usec/op    | 833.986    | 814.809       | +2.35%   |
> > | execve usec/op  | 360.939    | 365.168       | -1.16%   |
> > 
> > perf bench syscall ops/sec
> > 
> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | getppid ops/sec | 48,12,433  | 47,53,459     | +1.24%   |
> > | getpgid ops/sec | 48,47,744  | 47,24,192     | +2.61%   |
> > | fork ops/sec    | 1,199      | 1,227         | -2.28%   |
> > | execve ops/sec  | 2,770      | 2,738         | +1.16%   |
> 
> I get about 2% degradation on powerpc 8xx, and it is quite stable over time
> when repeating the test.
> 
> 'perf bench syscall all' on powerpc 8xx (usec per op):
> 
> | Test            | With Patch | Without Patch | % Change |
> | --------------- | ---------- | ------------- | -------- |
> | getppid usec/op | 2.63       | 2.63          | ~ 0%     |
> | getpgid usec/op | 2.26       | 2.22          | +2.80%   |
> | fork usec/op    | 15300      | 15000         | +2.00%   |
> | execve usec/op  | 45700      | 45200         | +1.10%   |
> 
Ok.
Do you have any idea where we might be losing performance? I don't have
an 8xx chip, so I am not sure where to look.

Thanks for testing it out.

Regards,
Mukesh
> Christophe


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-16 11:01   ` Christophe Leroy (CS GROUP)
@ 2025-12-16 15:13     ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-16 15:13 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 12:01:32PM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > 
> > Convert the PowerPC syscall entry and exit paths to use the generic
> > entry/exit framework by selecting GENERIC_ENTRY and integrating with
> > the common syscall handling routines.
> > 
> > This change transitions PowerPC away from its custom syscall entry and
> > exit code to use the generic helpers such as:
> >   - syscall_enter_from_user_mode()
> >   - syscall_exit_to_user_mode()
> > 
> > As part of this migration:
> >   - The architecture now selects GENERIC_ENTRY in Kconfig.
> >   - Old tracing, seccomp, and audit handling in ptrace.c is removed in
> >     favor of generic entry infrastructure.
> >   - interrupt.c and syscall.c are simplified to delegate context
> >     management and user exit handling to the generic entry path.
> >   - The new pt_regs field `exit_flags` introduced earlier is now used
> >     to carry per-syscall exit state flags (e.g. _TIF_RESTOREALL).
> > 
> > This aligns PowerPC with the common entry code used by other
> > architectures and reduces duplicated logic around syscall tracing,
> > context tracking, and signal handling.
> > 
> > The performance benchmarks from perf bench basic syscall are below:
> > 
> > perf bench syscall usec/op
> > 
> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | getppid usec/op | 0.207795   | 0.210373      | -1.22%   |
> > | getpgid usec/op | 0.206282   | 0.211676      | -2.55%   |
> > | fork usec/op    | 833.986    | 814.809       | +2.35%   |
> > | execve usec/op  | 360.939    | 365.168       | -1.16%   |
> > 
> > perf bench syscall ops/sec
> > 
> > | Test            | With Patch | Without Patch | % Change |
> > | --------------- | ---------- | ------------- | -------- |
> > | getppid ops/sec | 48,12,433  | 47,53,459     | +1.24%   |
> > | getpgid ops/sec | 48,47,744  | 47,24,192     | +2.61%   |
> > | fork ops/sec    | 1,199      | 1,227         | -2.28%   |
> > | execve ops/sec  | 2,770      | 2,738         | +1.16%   |
> > 
> > IPI latency benchmark
> > 
> > | Metric                  | With Patch       | Without Patch    | % Change |
> > | ----------------------- | ---------------- | ---------------- | -------- |
> > | Dry-run (ns)            | 206,675.81       | 206,719.36       | -0.02%   |
> > | Self-IPI avg (ns)       | 1,939,991.00     | 1,976,116.15     | -1.83%   |
> > | Self-IPI max (ns)       | 3,533,718.93     | 3,582,650.33     | -1.37%   |
> > | Normal IPI avg (ns)     | 111,110,034.23   | 110,513,373.51   | +0.54%   |
> > | Normal IPI max (ns)     | 150,393,442.64   | 149,669,477.89   | +0.48%   |
> > | Broadcast IPI max (ns)  | 3,978,231,022.96 | 4,359,916,859.46 | -8.73%   |
> > | Broadcast lock max (ns) | 4,025,425,714.49 | 4,384,956,730.83 | -8.20%   |
> > 
> > That's very close to the performance of the earlier arch-specific handling.
> > 
> > Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > ---
> >   arch/powerpc/Kconfig                    |   1 +
> >   arch/powerpc/include/asm/entry-common.h |   5 +-
> >   arch/powerpc/kernel/interrupt.c         | 139 +++++++----------------
> >   arch/powerpc/kernel/ptrace/ptrace.c     | 141 ------------------------
> >   arch/powerpc/kernel/signal.c            |  10 +-
> >   arch/powerpc/kernel/syscall.c           | 119 +-------------------
> >   6 files changed, 49 insertions(+), 366 deletions(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index b0c602c3bbe1..a4330775b254 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -203,6 +203,7 @@ config PPC
> >   	select GENERIC_CPU_AUTOPROBE
> >   	select GENERIC_CPU_VULNERABILITIES	if PPC_BARRIER_NOSPEC
> >   	select GENERIC_EARLY_IOREMAP
> > +	select GENERIC_ENTRY
> >   	select GENERIC_GETTIMEOFDAY
> >   	select GENERIC_IDLE_POLL_SETUP
> >   	select GENERIC_IOREMAP
> > diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> > index e2ae7416dee1..77129174f882 100644
> > --- a/arch/powerpc/include/asm/entry-common.h
> > +++ b/arch/powerpc/include/asm/entry-common.h
> > @@ -3,7 +3,7 @@
> >   #ifndef _ASM_PPC_ENTRY_COMMON_H
> >   #define _ASM_PPC_ENTRY_COMMON_H
> > -#ifdef CONFIG_GENERIC_IRQ_ENTRY
> > +#ifdef CONFIG_GENERIC_ENTRY
> 
> Powerpc now selects this unconditionally. Why is this #ifdef needed?
> 
Will remove this.
> 
> >   #include <asm/cputime.h>
> >   #include <asm/interrupt.h>
> > @@ -217,9 +217,6 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
> >   	if (user_mode(regs)) {
> >   		kuap_lock();
> > -		CT_WARN_ON(ct_state() != CT_STATE_USER);
> > -		user_exit_irqoff();
> > -
> >   		account_cpu_user_entry();
> >   		account_stolen_time();
> >   	} else {
> > diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> > index 7f67f0b9d627..7d5cd4b5a610 100644
> > --- a/arch/powerpc/kernel/interrupt.c
> > +++ b/arch/powerpc/kernel/interrupt.c
> > @@ -1,6 +1,7 @@
> >   // SPDX-License-Identifier: GPL-2.0-or-later
> >   #include <linux/context_tracking.h>
> > +#include <linux/entry-common.h>
> >   #include <linux/err.h>
> >   #include <linux/compat.h>
> >   #include <linux/rseq.h>
> > @@ -73,79 +74,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
> >   	return true;
> >   }
> > -static notrace unsigned long
> > -interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
> > -{
> > -	unsigned long ti_flags;
> > -
> > -again:
> > -	ti_flags = read_thread_flags();
> > -	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
> > -		local_irq_enable();
> > -		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
> > -			schedule();
> > -		} else {
> > -			/*
> > -			 * SIGPENDING must restore signal handler function
> > -			 * argument GPRs, and some non-volatiles (e.g., r1).
> > -			 * Restore all for now. This could be made lighter.
> > -			 */
> > -			if (ti_flags & _TIF_SIGPENDING)
> > -				ret |= _TIF_RESTOREALL;
> > -			do_notify_resume(regs, ti_flags);
> 
> do_notify_resume() has no caller anymore, should be removed from
> arch/powerpc/include/asm/signal.h and arch/powerpc/kernel/signal.c
> 
> 
> 
Oh yeah, will remove this.
> > -		}
> > -		local_irq_disable();
> > -		ti_flags = read_thread_flags();
> > -	}
> > -
> > -	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
> > -		if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> > -				unlikely((ti_flags & _TIF_RESTORE_TM))) {
> > -			restore_tm_state(regs);
> > -		} else {
> > -			unsigned long mathflags = MSR_FP;
> > -
> > -			if (cpu_has_feature(CPU_FTR_VSX))
> > -				mathflags |= MSR_VEC | MSR_VSX;
> > -			else if (cpu_has_feature(CPU_FTR_ALTIVEC))
> > -				mathflags |= MSR_VEC;
> > -
> > -			/*
> > -			 * If userspace MSR has all available FP bits set,
> > -			 * then they are live and no need to restore. If not,
> > -			 * it means the regs were given up and restore_math
> > -			 * may decide to restore them (to avoid taking an FP
> > -			 * fault).
> > -			 */
> > -			if ((regs->msr & mathflags) != mathflags)
> > -				restore_math(regs);
> > -		}
> > -	}
> > -
> > -	check_return_regs_valid(regs);
> > -
> > -	user_enter_irqoff();
> > -	if (!prep_irq_for_enabled_exit(true)) {
> > -		user_exit_irqoff();
> > -		local_irq_enable();
> > -		local_irq_disable();
> > -		goto again;
> > -	}
> > -
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > -	local_paca->tm_scratch = regs->msr;
> > -#endif
> > -
> > -	booke_load_dbcr0();
> > -
> > -	account_cpu_user_exit();
> > -
> > -	/* Restore user access locks last */
> > -	kuap_user_restore(regs);
> > -
> > -	return ret;
> > -}
> > -
> >   /*
> >    * This should be called after a syscall returns, with r3 the return value
> >    * from the syscall. If this function returns non-zero, the system call
> > @@ -160,17 +88,12 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
> >   					   long scv)
> >   {
> >   	unsigned long ti_flags;
> > -	unsigned long ret = 0;
> >   	bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
> > -	CT_WARN_ON(ct_state() == CT_STATE_USER);
> > -
> >   	kuap_assert_locked();
> >   	regs->result = r3;
> > -
> > -	/* Check whether the syscall is issued inside a restartable sequence */
> > -	rseq_syscall(regs);
> > +	regs->exit_flags = 0;
> >   	ti_flags = read_thread_flags();
> > @@ -183,7 +106,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
> >   	if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
> >   		if (ti_flags & _TIF_RESTOREALL)
> > -			ret = _TIF_RESTOREALL;
> > +			regs->exit_flags = _TIF_RESTOREALL;
> >   		else
> >   			regs->gpr[3] = r3;
> >   		clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
> > @@ -192,18 +115,28 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
> >   	}
> >   	if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
> > -		do_syscall_trace_leave(regs);
> > -		ret |= _TIF_RESTOREALL;
> > +		regs->exit_flags |= _TIF_RESTOREALL;
> >   	}
> > -	local_irq_disable();
> > -	ret = interrupt_exit_user_prepare_main(ret, regs);
> > +	syscall_exit_to_user_mode(regs);
> > +
> > +again:
> > +	user_enter_irqoff();
> > +	if (!prep_irq_for_enabled_exit(true)) {
> > +		user_exit_irqoff();
> > +		local_irq_enable();
> > +		local_irq_disable();
> > +		goto again;
> > +	}
> > +
> > +	/* Restore user access locks last */
> > +	kuap_user_restore(regs);
> >   #ifdef CONFIG_PPC64
> > -	regs->exit_result = ret;
> > +	regs->exit_result = regs->exit_flags;
> >   #endif
> > -	return ret;
> > +	return regs->exit_flags;
> >   }
> >   #ifdef CONFIG_PPC64
> > @@ -223,13 +156,16 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
> >   	set_kuap(AMR_KUAP_BLOCKED);
> >   #endif
> > -	trace_hardirqs_off();
> > -	user_exit_irqoff();
> > -	account_cpu_user_entry();
> > -
> > -	BUG_ON(!user_mode(regs));
> > +again:
> > +	user_enter_irqoff();
> > +	if (!prep_irq_for_enabled_exit(true)) {
> > +		user_exit_irqoff();
> > +		local_irq_enable();
> > +		local_irq_disable();
> > +		goto again;
> > +	}
> > -	regs->exit_result = interrupt_exit_user_prepare_main(regs->exit_result, regs);
> > +	regs->exit_result |= regs->exit_flags;
> >   	return regs->exit_result;
> >   }
> > @@ -241,7 +177,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
> >   	BUG_ON(regs_is_unrecoverable(regs));
> >   	BUG_ON(regs_irqs_disabled(regs));
> > -	CT_WARN_ON(ct_state() == CT_STATE_USER);
> >   	/*
> >   	 * We don't need to restore AMR on the way back to userspace for KUAP.
> > @@ -250,8 +185,21 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
> >   	kuap_assert_locked();
> >   	local_irq_disable();
> > +	regs->exit_flags = 0;
> > +again:
> > +	check_return_regs_valid(regs);
> > +	user_enter_irqoff();
> > +	if (!prep_irq_for_enabled_exit(true)) {
> > +		user_exit_irqoff();
> > +		local_irq_enable();
> > +		local_irq_disable();
> > +		goto again;
> > +	}
> > +
> > +	/* Restore user access locks last */
> > +	kuap_user_restore(regs);
> > -	ret = interrupt_exit_user_prepare_main(0, regs);
> > +	ret = regs->exit_flags;
> >   #ifdef CONFIG_PPC64
> >   	regs->exit_result = ret;
> > @@ -293,8 +241,6 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
> >   		/* Returning to a kernel context with local irqs enabled. */
> >   		WARN_ON_ONCE(!(regs->msr & MSR_EE));
> >   again:
> > -		if (need_irq_preemption())
> > -			irqentry_exit_cond_resched();
> >   		check_return_regs_valid(regs);
> > @@ -364,7 +310,6 @@ notrace unsigned long interrupt_exit_user_restart(struct pt_regs *regs)
> >   #endif
> >   	trace_hardirqs_off();
> > -	user_exit_irqoff();
> >   	account_cpu_user_entry();
> >   	BUG_ON(!user_mode(regs));
> > diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
> > index 2134b6d155ff..316d4f5ead8e 100644
> > --- a/arch/powerpc/kernel/ptrace/ptrace.c
> > +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> > @@ -21,9 +21,6 @@
> >   #include <asm/switch_to.h>
> >   #include <asm/debug.h>
> > -#define CREATE_TRACE_POINTS
> > -#include <trace/events/syscalls.h>
> > -
> >   #include "ptrace-decl.h"
> >   /*
> > @@ -195,144 +192,6 @@ long arch_ptrace(struct task_struct *child, long request,
> >   	return ret;
> >   }
> > -#ifdef CONFIG_SECCOMP
> > -static int do_seccomp(struct pt_regs *regs)
> > -{
> > -	if (!test_thread_flag(TIF_SECCOMP))
> > -		return 0;
> > -
> > -	/*
> > -	 * The ABI we present to seccomp tracers is that r3 contains
> > -	 * the syscall return value and orig_gpr3 contains the first
> > -	 * syscall parameter. This is different to the ptrace ABI where
> > -	 * both r3 and orig_gpr3 contain the first syscall parameter.
> > -	 */
> > -	regs->gpr[3] = -ENOSYS;
> > -
> > -	/*
> > -	 * We use the __ version here because we have already checked
> > -	 * TIF_SECCOMP. If this fails, there is nothing left to do, we
> > -	 * have already loaded -ENOSYS into r3, or seccomp has put
> > -	 * something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
> > -	 */
> > -	if (__secure_computing())
> > -		return -1;
> > -
> > -	/*
> > -	 * The syscall was allowed by seccomp, restore the register
> > -	 * state to what audit expects.
> > -	 * Note that we use orig_gpr3, which means a seccomp tracer can
> > -	 * modify the first syscall parameter (in orig_gpr3) and also
> > -	 * allow the syscall to proceed.
> > -	 */
> > -	regs->gpr[3] = regs->orig_gpr3;
> > -
> > -	return 0;
> > -}
> > -#else
> > -static inline int do_seccomp(struct pt_regs *regs) { return 0; }
> > -#endif /* CONFIG_SECCOMP */
> > -
> > -/**
> > - * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
> > - * @regs: the pt_regs of the task to trace (current)
> > - *
> > - * Performs various types of tracing on syscall entry. This includes seccomp,
> > - * ptrace, syscall tracepoints and audit.
> > - *
> > - * The pt_regs are potentially visible to userspace via ptrace, so their
> > - * contents is ABI.
> > - *
> > - * One or more of the tracers may modify the contents of pt_regs, in particular
> > - * to modify arguments or even the syscall number itself.
> > - *
> > - * It's also possible that a tracer can choose to reject the system call. In
> > - * that case this function will return an illegal syscall number, and will put
> > - * an appropriate return value in regs->r3.
> > - *
> > - * Return: the (possibly changed) syscall number.
> > - */
> > -long do_syscall_trace_enter(struct pt_regs *regs)
> 
> Remove prototype from arch/powerpc/include/asm/ptrace.h
> 
Sure will do.
> > -{
> > -	u32 flags;
> > -
> > -	flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
> > -
> > -	if (flags) {
> > -		int rc = ptrace_report_syscall_entry(regs);
> > -
> > -		if (unlikely(flags & _TIF_SYSCALL_EMU)) {
> > -			/*
> > -			 * A nonzero return code from
> > -			 * ptrace_report_syscall_entry() tells us to prevent
> > -			 * the syscall execution, but we are not going to
> > -			 * execute it anyway.
> > -			 *
> > -			 * Returning -1 will skip the syscall execution. We want
> > -			 * to avoid clobbering any registers, so we don't goto
> > -			 * the skip label below.
> > -			 */
> > -			return -1;
> > -		}
> > -
> > -		if (rc) {
> > -			/*
> > -			 * The tracer decided to abort the syscall. Note that
> > -			 * the tracer may also just change regs->gpr[0] to an
> > -			 * invalid syscall number, that is handled below on the
> > -			 * exit path.
> > -			 */
> > -			goto skip;
> > -		}
> > -	}
> > -
> > -	/* Run seccomp after ptrace; allow it to set gpr[3]. */
> > -	if (do_seccomp(regs))
> > -		return -1;
> > -
> > -	/* Avoid trace and audit when syscall is invalid. */
> > -	if (regs->gpr[0] >= NR_syscalls)
> > -		goto skip;
> > -
> > -	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> > -		trace_sys_enter(regs, regs->gpr[0]);
> > -
> > -	if (!is_32bit_task())
> > -		audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
> > -				    regs->gpr[5], regs->gpr[6]);
> > -	else
> > -		audit_syscall_entry(regs->gpr[0],
> > -				    regs->gpr[3] & 0xffffffff,
> > -				    regs->gpr[4] & 0xffffffff,
> > -				    regs->gpr[5] & 0xffffffff,
> > -				    regs->gpr[6] & 0xffffffff);
> > -
> > -	/* Return the possibly modified but valid syscall number */
> > -	return regs->gpr[0];
> > -
> > -skip:
> > -	/*
> > -	 * If we are aborting explicitly, or if the syscall number is
> > -	 * now invalid, set the return value to -ENOSYS.
> > -	 */
> > -	regs->gpr[3] = -ENOSYS;
> > -	return -1;
> > -}
> > -
> > -void do_syscall_trace_leave(struct pt_regs *regs)
> > -{
> > -	int step;
> > -
> > -	audit_syscall_exit(regs);
> > -
> > -	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
> > -		trace_sys_exit(regs, regs->result);
> > -
> > -	step = test_thread_flag(TIF_SINGLESTEP);
> > -	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
> > -		ptrace_report_syscall_exit(regs, step);
> > -}
> > -
> >   void __init pt_regs_check(void);
> >   /*
> > diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> > index 719930cf4ae1..9f1847b4742e 100644
> > --- a/arch/powerpc/kernel/signal.c
> > +++ b/arch/powerpc/kernel/signal.c
> > @@ -6,6 +6,7 @@
> >    *    Extracted from signal_32.c and signal_64.c
> >    */
> > +#include <linux/entry-common.h>
> >   #include <linux/resume_user_mode.h>
> >   #include <linux/signal.h>
> >   #include <linux/uprobes.h>
> > @@ -22,11 +23,6 @@
> >   #include "signal.h"
> > -/* This will be removed */
> > -#ifdef CONFIG_GENERIC_ENTRY
> > -#include <linux/entry-common.h>
> > -#endif /* CONFIG_GENERIC_ENTRY */
> > -
> 
> Until now CONFIG_GENERIC_ENTRY was not defined.
> 
> Now that it is defined, we remove the entire block ?
> 
> Then why has it been added at all ?
> 
I wanted all of this to be dead code until we enable the config. I kept
the name CONFIG_GENERIC_ENTRY so that it would be self-explanatory why
the code is dead.

> >   #ifdef CONFIG_VSX
> >   unsigned long copy_fpr_to_user(void __user *to,
> >   			       struct task_struct *task)
> > @@ -374,11 +370,9 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
> >   				   task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
> >   }
> > -#ifdef CONFIG_GENERIC_ENTRY
> >   void arch_do_signal_or_restart(struct pt_regs *regs)
> >   {
> >   	BUG_ON(regs != current->thread.regs);
> > -	local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
> 
> Why was that there ? I thought it was preparatory, then you remove it before
> even using it ?
> 
This should have been removed earlier. Will fix it in the next revision.

I appreciate the detailed review. Thanks :).

Regards,
Mukesh
> > +	regs->exit_flags |= _TIF_RESTOREALL;
> >   	do_signal(current);
> >   }
> > -#endif /* CONFIG_GENERIC_ENTRY */


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-16 15:00     ` Mukesh Kumar Chaurasiya
@ 2025-12-16 22:40       ` Christophe Leroy (CS GROUP)
  2025-12-17  4:43         ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16 22:40 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel



On 16/12/2025 at 16:00, Mukesh Kumar Chaurasiya wrote:
> On Tue, Dec 16, 2025 at 10:58:16AM +0100, Christophe Leroy (CS GROUP) wrote:
>>
>>
>> On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
>>> From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>>>
>>> Move interrupt entry and exit helper routines from interrupt.h into the
>>> PowerPC-specific entry-common.h header as a preparatory step for enabling
>>> the generic entry/exit framework.
>>>
>>> This consolidation places all PowerPC interrupt entry/exit handling in a
>>> single common header, aligning with the generic entry infrastructure.
>>> The helpers provide architecture-specific handling for interrupt and NMI
>>> entry/exit sequences, including:
>>>
>>>    - arch_interrupt_enter/exit_prepare()
>>>    - arch_interrupt_async_enter/exit_prepare()
>>>    - arch_interrupt_nmi_enter/exit_prepare()
>>>    - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
>>>      debug register maintenance, and soft mask handling.
>>>
>>> The functions are copied verbatim from interrupt.h to avoid functional
>>> changes at this stage. Subsequent patches will integrate these routines
>>> into the generic entry/exit flow.
>>
>> Can we move them instead of duplicating them ?
>>
> Until we enable the generic framework I didn't want to touch the
> already-used code path. Once we enable it, all the unused code should
> be removed. This helps us bisect future issues caused by this change.

I can't see how it can help bisecting. What did I miss ?

If you copy the code, you don't know whether what you have copied is 
correct or not until you use it. So when you start using it you don't 
know if the problem is in the copied code or the code using it.

If instead of copying the code you move it and keep using the moved 
code as the only implementation, then when you start using it from the 
new code you are sure it works, and if you hit a problem you know it is 
not in the moved code but in the new code using it.

Christophe


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
  2025-12-16 15:08     ` Mukesh Kumar Chaurasiya
@ 2025-12-16 22:57       ` Christophe Leroy (CS GROUP)
  0 siblings, 0 replies; 37+ messages in thread
From: Christophe Leroy (CS GROUP) @ 2025-12-16 22:57 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, kernel test robot
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	mingo, atrajeev, mark.barnett, linuxppc-dev, linux-kernel,
	oe-kbuild-all



On 16/12/2025 at 16:08, Mukesh Kumar Chaurasiya wrote:
> On Tue, Dec 16, 2025 at 04:27:55AM +0800, kernel test robot wrote:
>> Hi Mukesh,
>>
>> kernel test robot noticed the following build errors:
>>
>> [auto build test ERROR on powerpc/next]
>> [also build test ERROR on powerpc/fixes linus/master v6.19-rc1 next-20251215]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>
>> url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
>> patch link:    https://lore.kernel.org/r/20251214130245.43664-9-mkchauras%40linux.ibm.com
>> patch subject: [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
>> config: powerpc-randconfig-001-20251215 (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/config)
>> compiler: powerpc-linux-gcc (GCC) 9.5.0
>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251216/202512160453.iO9WNjrm-lkp@intel.com/reproduce)
>>
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags
>> | Reported-by: kernel test robot <lkp@intel.com>
>> | Closes: https://lore.kernel.org/oe-kbuild-all/202512160453.iO9WNjrm-lkp@intel.com/
>>
> I tried this with gcc 9.4 and gcc 14. I am not able to reproduce this.
> Will investigate further; meanwhile, if anyone has ideas that could help, that would be great.

I was able to reproduce it with gcc 9.5

Seems to be related to commit 69d4c0d32186 ("entry, kasan, x86: Disallow 
overriding mem*() functions")

Which is in contradiction with commit 26deb04342e3 ("powerpc: prepare 
string/mem functions for KASAN")

Christophe



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
  2025-12-16  6:29   ` kernel test robot
  2025-12-16 10:43   ` Christophe Leroy (CS GROUP)
@ 2025-12-17  2:10   ` kernel test robot
  2025-12-17 21:32   ` kernel test robot
  3 siblings, 0 replies; 37+ messages in thread
From: kernel test robot @ 2025-12-17  2:10 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
	oleg, kees, luto, wad, mchauras, thuth, sshegde, charlie, macro,
	akpm, ldv, deller, ankur.a.arora, segher, tglx, thomas.weissschuh,
	peterz, menglong8.dong, bigeasy, namcao, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel
  Cc: oe-kbuild-all

Hi Mukesh,

kernel test robot noticed the following build warnings:

[auto build test WARNING on powerpc/next]
[also build test WARNING on powerpc/fixes linus/master v6.19-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:    https://lore.kernel.org/r/20251214130245.43664-8-mkchauras%40linux.ibm.com
patch subject: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
config: powerpc-randconfig-r072-20251215 (https://download.01.org/0day-ci/archive/20251217/202512170925.boH1EF7e-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 8.5.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512170925.boH1EF7e-lkp@intel.com/

smatch warnings:
arch/powerpc/include/asm/entry-common.h:433 arch_enter_from_user_mode() warn: inconsistent indenting

vim +433 arch/powerpc/include/asm/entry-common.h

2b0f05f77f11f8 Mukesh Kumar Chaurasiya 2025-12-14  396  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  397  static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  398  {
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  399  	kuap_lock();
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  400  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  401  	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  402  		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  403  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  404  	BUG_ON(regs_is_unrecoverable(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  405  	BUG_ON(!user_mode(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  406  	BUG_ON(regs_irqs_disabled(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  407  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  408  #ifdef CONFIG_PPC_PKEY
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  409  	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  410  		unsigned long amr, iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  411  		bool flush_needed = false;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  412  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  413  		 * When entering from userspace we mostly have the AMR/IAMR
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  414  		 * different from kernel default values. Hence don't compare.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  415  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  416  		amr = mfspr(SPRN_AMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  417  		iamr = mfspr(SPRN_IAMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  418  		regs->amr  = amr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  419  		regs->iamr = iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  420  		if (mmu_has_feature(MMU_FTR_KUAP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  421  			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  422  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  423  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  424  		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  425  			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  426  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  427  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  428  		if (flush_needed)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  429  			isync();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  430  	} else
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  431  #endif
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  432  		kuap_assert_locked();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14 @433  	booke_restore_dbcr0();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  434  	account_cpu_user_entry();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  435  	account_stolen_time();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  436  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  437  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  438  	 * This is not required for the syscall exit path, but makes the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  439  	 * stack frame look nicer. If this was initialised in the first stack
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  440  	 * frame, or if the unwinder was taught the first stack frame always
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  441  	 * returns to user with IRQS_ENABLED, this store could be avoided!
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  442  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  443  	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  444  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  445  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  446  	 * If system call is called with TM active, set _TIF_RESTOREALL to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  447  	 * prevent RFSCV being used to return to userspace, because POWER9
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  448  	 * TM implementation has problems with this instruction returning to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  449  	 * transactional state. Final register values are not relevant because
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  450  	 * the transaction will be aborted upon return anyway. Or in the case
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  451  	 * of unsupported_scv SIGILL fault, the return state does not much
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  452  	 * matter because it's an edge case.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  453  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  454  	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  455  	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  456  		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  457  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  458  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  459  	 * If the system call was made with a transaction active, doom it and
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  460  	 * return without performing the system call. Unless it was an
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  461  	 * unsupported scv vector, in which case it's treated like an illegal
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  462  	 * instruction.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  463  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  464  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  465  	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  466  	    !trap_is_unsupported_scv(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  467  		/* Enable TM in the kernel, and disable EE (for scv) */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  468  		hard_irq_disable();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  469  		mtmsr(mfmsr() | MSR_TM);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  470  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  471  		/* tabort, this dooms the transaction, nothing else */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  472  		asm volatile(".long 0x7c00071d | ((%0) << 16)"
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  473  			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  474  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  475  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  476  		 * Userspace will never see the return value. Execution will
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  477  		 * resume after the tbegin. of the aborted transaction with the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  478  		 * checkpointed register state. A context switch could occur
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  479  		 * or signal delivered to the process before resuming the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  480  		 * doomed transaction context, but that should all be handled
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  481  		 * as expected.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  482  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  483  		return;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  484  	}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  485  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  486  }
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  487  
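[Editorial aside: the "inconsistent indenting" smatch reports at line 433 comes from the `} else` that ends inside the `#ifdef CONFIG_PPC_PKEY` block above it: `kuap_assert_locked()` is the else body, and `booke_restore_dbcr0()` then sits at the same indentation depth without being part of the else. A minimal stand-alone sketch of that control-flow shape — the stub names and globals here are hypothetical, not the real kernel helpers:]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch (not kernel code) of the shape smatch flags at
 * arch/powerpc/include/asm/entry-common.h:433.  When CONFIG_PPC_PKEY is
 * off, the "} else" disappears and the first stub call runs
 * unconditionally; when it is on, the stub call is the else body, and
 * the following call at the same indent looks inconsistent to smatch.
 */
static int locked;
static int dbcr0;

static void kuap_assert_locked_stub(void) { locked = 1; }
static void booke_restore_dbcr0_stub(void) { dbcr0 = 1; }

static void arch_enter_sketch(bool pkey_syscall)
{
	(void)pkey_syscall;	/* silence unused warning when the #ifdef is off */
#ifdef CONFIG_PPC_PKEY
	if (pkey_syscall) {
		/* AMR/IAMR setup would go here */
	} else
#endif
		kuap_assert_locked_stub();
	booke_restore_dbcr0_stub();	/* flagged: same indent as the else body */
}
```

A common way to silence the checker without changing behavior is to give the else an explicit braced body (`} else {` ... `}`) so the next statement is unambiguously outside it.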

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-16 22:40       ` Christophe Leroy (CS GROUP)
@ 2025-12-17  4:43         ` Mukesh Kumar Chaurasiya
  2025-12-19  4:56           ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-17  4:43 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Tue, Dec 16, 2025 at 11:40:51PM +0100, Christophe Leroy (CS GROUP) wrote:
> 
> 
> > On 16/12/2025 at 16:00, Mukesh Kumar Chaurasiya wrote:
> > On Tue, Dec 16, 2025 at 10:58:16AM +0100, Christophe Leroy (CS GROUP) wrote:
> > > 
> > > 
> > > > On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > > > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > > > 
> > > > Move interrupt entry and exit helper routines from interrupt.h into the
> > > > PowerPC-specific entry-common.h header as a preparatory step for enabling
> > > > the generic entry/exit framework.
> > > > 
> > > > This consolidation places all PowerPC interrupt entry/exit handling in a
> > > > single common header, aligning with the generic entry infrastructure.
> > > > The helpers provide architecture-specific handling for interrupt and NMI
> > > > entry/exit sequences, including:
> > > > 
> > > >    - arch_interrupt_enter/exit_prepare()
> > > >    - arch_interrupt_async_enter/exit_prepare()
> > > >    - arch_interrupt_nmi_enter/exit_prepare()
> > > >    - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
> > > >      debug register maintenance, and soft mask handling.
> > > > 
> > > > The functions are copied verbatim from interrupt.h to avoid functional
> > > > changes at this stage. Subsequent patches will integrate these routines
> > > > into the generic entry/exit flow.
> > > 
> > > Can we move them instead of duplicating them ?
> > > 
> > Until we enable the generic framework I didn't want to touch the code
> > path already in use. Once we enable it, all the unused code should be
> > removed. This helps with bisecting future issues caused by this change.
> 
> I can't see how it can help bisecting. What did I miss?
> 
> If you copy the code, you don't know whether what you have copied is correct
> or not until you use it. So when you start using it you don't know if the
> problem is in the copied code or the code using it.
> 
> If instead of copying the code you move it and continue to use the moved
> code from the only implementation, they when you start using it with the new
> code you are sure it works and then if you have a problem you know it is not
> the moved code but the new code using it.
> 
Sure, that makes sense. Will move it instead of duplicating.

Regards,
Mukesh
> Christophe


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
  2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
                     ` (2 preceding siblings ...)
  2025-12-17  2:10   ` kernel test robot
@ 2025-12-17 21:32   ` kernel test robot
  3 siblings, 0 replies; 37+ messages in thread
From: kernel test robot @ 2025-12-17 21:32 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
	oleg, kees, luto, wad, mchauras, thuth, sshegde, charlie, macro,
	akpm, ldv, deller, ankur.a.arora, segher, tglx, thomas.weissschuh,
	peterz, menglong8.dong, bigeasy, namcao, mingo, atrajeev,
	mark.barnett, linuxppc-dev, linux-kernel
  Cc: oe-kbuild-all

Hi Mukesh,

kernel test robot noticed the following build warnings:

[auto build test WARNING on powerpc/next]
[also build test WARNING on powerpc/fixes linus/master v6.19-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mukesh-Kumar-Chaurasiya/powerpc-rename-arch_irq_disabled_regs/20251214-210813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:    https://lore.kernel.org/r/20251214130245.43664-8-mkchauras%40linux.ibm.com
patch subject: [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path.
config: powerpc-randconfig-r072-20251215 (https://download.01.org/0day-ci/archive/20251218/202512180511.ujibhcpR-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 8.5.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512180511.ujibhcpR-lkp@intel.com/

smatch warnings:
arch/powerpc/include/asm/entry-common.h:433 arch_enter_from_user_mode() warn: inconsistent indenting

vim +433 arch/powerpc/include/asm/entry-common.h

2b0f05f77f11f8 Mukesh Kumar Chaurasiya 2025-12-14  396  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  397  static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  398  {
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  399  	kuap_lock();
37ad0d88d9bff7 Mukesh Kumar Chaurasiya 2025-12-14  400  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  401  	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  402  		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  403  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  404  	BUG_ON(regs_is_unrecoverable(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  405  	BUG_ON(!user_mode(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  406  	BUG_ON(regs_irqs_disabled(regs));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  407  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  408  #ifdef CONFIG_PPC_PKEY
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  409  	if (mmu_has_feature(MMU_FTR_PKEY) && trap_is_syscall(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  410  		unsigned long amr, iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  411  		bool flush_needed = false;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  412  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  413  		 * When entering from userspace we mostly have the AMR/IAMR
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  414  		 * different from kernel default values. Hence don't compare.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  415  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  416  		amr = mfspr(SPRN_AMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  417  		iamr = mfspr(SPRN_IAMR);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  418  		regs->amr  = amr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  419  		regs->iamr = iamr;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  420  		if (mmu_has_feature(MMU_FTR_KUAP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  421  			mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  422  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  423  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  424  		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  425  			mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  426  			flush_needed = true;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  427  		}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  428  		if (flush_needed)
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  429  			isync();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  430  	} else
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  431  #endif
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  432  		kuap_assert_locked();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14 @433  	booke_restore_dbcr0();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  434  	account_cpu_user_entry();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  435  	account_stolen_time();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  436  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  437  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  438  	 * This is not required for the syscall exit path, but makes the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  439  	 * stack frame look nicer. If this was initialised in the first stack
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  440  	 * frame, or if the unwinder was taught the first stack frame always
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  441  	 * returns to user with IRQS_ENABLED, this store could be avoided!
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  442  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  443  	irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  444  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  445  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  446  	 * If system call is called with TM active, set _TIF_RESTOREALL to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  447  	 * prevent RFSCV being used to return to userspace, because POWER9
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  448  	 * TM implementation has problems with this instruction returning to
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  449  	 * transactional state. Final register values are not relevant because
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  450  	 * the transaction will be aborted upon return anyway. Or in the case
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  451  	 * of unsupported_scv SIGILL fault, the return state does not much
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  452  	 * matter because it's an edge case.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  453  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  454  	if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  455  	    unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  456  		set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  457  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  458  	/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  459  	 * If the system call was made with a transaction active, doom it and
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  460  	 * return without performing the system call. Unless it was an
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  461  	 * unsupported scv vector, in which case it's treated like an illegal
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  462  	 * instruction.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  463  	 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  464  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  465  	if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  466  	    !trap_is_unsupported_scv(regs)) {
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  467  		/* Enable TM in the kernel, and disable EE (for scv) */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  468  		hard_irq_disable();
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  469  		mtmsr(mfmsr() | MSR_TM);
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  470  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  471  		/* tabort, this dooms the transaction, nothing else */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  472  		asm volatile(".long 0x7c00071d | ((%0) << 16)"
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  473  			     :: "r"(TM_CAUSE_SYSCALL | TM_CAUSE_PERSISTENT));
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  474  
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  475  		/*
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  476  		 * Userspace will never see the return value. Execution will
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  477  		 * resume after the tbegin. of the aborted transaction with the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  478  		 * checkpointed register state. A context switch could occur
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  479  		 * or signal delivered to the process before resuming the
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  480  		 * doomed transaction context, but that should all be handled
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  481  		 * as expected.
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  482  		 */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  483  		return;
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  484  	}
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  485  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  486  }
1a5661537226c3 Mukesh Kumar Chaurasiya 2025-12-14  487  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit
  2025-12-17  4:43         ` Mukesh Kumar Chaurasiya
@ 2025-12-19  4:56           ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 37+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-12-19  4:56 UTC (permalink / raw)
  To: Christophe Leroy (CS GROUP)
  Cc: maddy, mpe, npiggin, oleg, kees, luto, wad, mchauras, thuth,
	sshegde, charlie, macro, akpm, ldv, deller, ankur.a.arora, segher,
	tglx, thomas.weissschuh, peterz, menglong8.dong, bigeasy, namcao,
	kan.liang, mingo, atrajeev, mark.barnett, linuxppc-dev,
	linux-kernel

On Wed, Dec 17, 2025 at 10:13:19AM +0530, Mukesh Kumar Chaurasiya wrote:
> On Tue, Dec 16, 2025 at 11:40:51PM +0100, Christophe Leroy (CS GROUP) wrote:
> > 
> > 
> > > On 16/12/2025 at 16:00, Mukesh Kumar Chaurasiya wrote:
> > > On Tue, Dec 16, 2025 at 10:58:16AM +0100, Christophe Leroy (CS GROUP) wrote:
> > > > 
> > > > 
> > > > > On 14/12/2025 at 14:02, Mukesh Kumar Chaurasiya wrote:
> > > > > From: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> > > > > 
> > > > > Move interrupt entry and exit helper routines from interrupt.h into the
> > > > > PowerPC-specific entry-common.h header as a preparatory step for enabling
> > > > > the generic entry/exit framework.
> > > > > 
> > > > > This consolidation places all PowerPC interrupt entry/exit handling in a
> > > > > single common header, aligning with the generic entry infrastructure.
> > > > > The helpers provide architecture-specific handling for interrupt and NMI
> > > > > entry/exit sequences, including:
> > > > > 
> > > > >    - arch_interrupt_enter/exit_prepare()
> > > > >    - arch_interrupt_async_enter/exit_prepare()
> > > > >    - arch_interrupt_nmi_enter/exit_prepare()
> > > > >    - Supporting helpers such as nap_adjust_return(), check_return_regs_valid(),
> > > > >      debug register maintenance, and soft mask handling.
> > > > > 
> > > > > The functions are copied verbatim from interrupt.h to avoid functional
> > > > > changes at this stage. Subsequent patches will integrate these routines
> > > > > into the generic entry/exit flow.
> > > > 
> > > > Can we move them instead of duplicating them ?
> > > > 
> > > Until we enable the generic framework I didn't want to touch the code
> > > path already in use. Once we enable it, all the unused code should be
> > > removed. This helps with bisecting future issues caused by this change.
> > 
> > I can't see how it can help bisecting. What did I miss?
> > 
> > If you copy the code, you don't know whether what you have copied is correct
> > or not until you use it. So when you start using it you don't know if the
> > problem is in the copied code or the code using it.
> > 
> > If instead of copying the code you move it and continue to use the moved
> > code from the only implementation, they when you start using it with the new
> > code you are sure it works and then if you have a problem you know it is not
> > the moved code but the new code using it.
> > 
> Sure, that makes sense. Will move it instead of duplicating.
> 
> Regards,
> Mukesh
> > Christophe
> 
Ah, one more reason we can't move them: this file is not directly
included by the arch code. It's included by the generic framework, and
then we include the generic header.

Regards,
Mukesh


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2025-12-19  4:57 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-14 13:02 [PATCH v2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
2025-12-16  9:27   ` Christophe Leroy (CS GROUP)
2025-12-16 14:42     ` Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
2025-12-16  9:38   ` Christophe Leroy (CS GROUP)
2025-12-16 14:47     ` Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
2025-12-16  9:46   ` Christophe Leroy (CS GROUP)
2025-12-16 14:51     ` Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
2025-12-16  9:52   ` Christophe Leroy (CS GROUP)
2025-12-16 14:56     ` Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
2025-12-16  9:58   ` Christophe Leroy (CS GROUP)
2025-12-16 15:00     ` Mukesh Kumar Chaurasiya
2025-12-16 22:40       ` Christophe Leroy (CS GROUP)
2025-12-17  4:43         ` Mukesh Kumar Chaurasiya
2025-12-19  4:56           ` Mukesh Kumar Chaurasiya
2025-12-14 13:02 ` [PATCH v2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
2025-12-16  6:29   ` kernel test robot
2025-12-16 15:02     ` Mukesh Kumar Chaurasiya
2025-12-16 10:43   ` Christophe Leroy (CS GROUP)
2025-12-16 15:06     ` Mukesh Kumar Chaurasiya
2025-12-17  2:10   ` kernel test robot
2025-12-17 21:32   ` kernel test robot
2025-12-14 13:02 ` [PATCH v2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
2025-12-14 16:20   ` Segher Boessenkool
2025-12-15 18:32     ` Mukesh Kumar Chaurasiya
2025-12-15 20:27   ` kernel test robot
2025-12-16 15:08     ` Mukesh Kumar Chaurasiya
2025-12-16 22:57       ` Christophe Leroy (CS GROUP)
2025-12-16  6:41   ` Christophe Leroy (CS GROUP)
2025-12-16 15:09     ` Mukesh Kumar Chaurasiya
2025-12-16 11:01   ` Christophe Leroy (CS GROUP)
2025-12-16 15:13     ` Mukesh Kumar Chaurasiya
