* [RFC V2 0/8] Generic IRQ entry/exit support for powerpc
@ 2025-09-08 21:02 Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Add support for generic IRQ entry/exit handling on PowerPC. The
goal is to bring PowerPC in line with other architectures that already
use the common irq entry infrastructure, reducing duplicated code and
making it easier to share future changes in entry/exit paths.
This is lightly tested on ppc64le.
The performance numbers from perf bench syscall basic are below:
| Metric | W/O Generic Framework | With Generic Framework | Improvement |
| ---------- | --------------------- | ---------------------- | ----------- |
| Total time | 0.885 [sec] | 0.880 [sec] | ~0.56% |
| usecs/op | 0.088518 | 0.088005 | ~0.58% |
| ops/sec | 11,297,086 | 11,362,977 | ~0.58% |
That's close to a 0.6% improvement.
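For reference, the numbers above come from perf's syscall microbenchmark,
e.g.:

  $ perf bench syscall basic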
Changelog:
V1 -> V2: Added IRQ handling support via the generic framework.
Mukesh Kumar Chaurasiya (8):
powerpc: rename arch_irq_disabled_regs
powerpc: Prepare to build with generic entry/exit framework
powerpc: introduce arch_enter_from_user_mode
powerpc: Introduce syscall exit arch functions
powerpc: add exit_flags field in pt_regs
powerpc: Prepare for IRQ entry exit
powerpc: Enable IRQ generic entry/exit path.
powerpc: Enable Generic Entry/Exit for syscalls.
arch/powerpc/Kconfig | 2 +
arch/powerpc/include/asm/entry-common.h | 550 ++++++++++++++++++++++++
arch/powerpc/include/asm/hw_irq.h | 4 +-
arch/powerpc/include/asm/interrupt.h | 393 +++--------------
arch/powerpc/include/asm/ptrace.h | 2 +
arch/powerpc/include/asm/stacktrace.h | 8 +
arch/powerpc/include/asm/syscall.h | 5 +
arch/powerpc/include/asm/thread_info.h | 1 +
arch/powerpc/include/uapi/asm/ptrace.h | 14 +-
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/interrupt.c | 251 ++---------
arch/powerpc/kernel/interrupt_64.S | 2 -
arch/powerpc/kernel/ptrace/ptrace.c | 142 +-----
arch/powerpc/kernel/signal.c | 8 +
arch/powerpc/kernel/syscall.c | 119 +----
arch/powerpc/kernel/traps.c | 2 +-
arch/powerpc/kernel/watchdog.c | 2 +-
arch/powerpc/perf/core-book3s.c | 2 +-
18 files changed, 698 insertions(+), 810 deletions(-)
create mode 100644 arch/powerpc/include/asm/entry-common.h
--
2.51.0
* [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-13 12:50 ` Shrikanth Hegde
2025-09-08 21:02 ` [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Rename arch_irq_disabled_regs() to regs_irqs_disabled() so the same name
can be used by both the generic entry/exit framework and powerpc arch
code.
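For context, the generic irqentry exit path consults this helper when
deciding how to return from a kernel-mode interrupt. A simplified sketch
of that generic code (not part of this patch):

  noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
  {
          if (user_mode(regs)) {
                  irqentry_exit_to_user_mode(regs);
          } else if (!regs_irqs_disabled(regs)) {
                  /* returning to kernel with IRQs enabled on regs */
                  ...
          }
  }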
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/hw_irq.h | 4 ++--
arch/powerpc/include/asm/interrupt.h | 12 ++++++------
arch/powerpc/kernel/interrupt.c | 4 ++--
arch/powerpc/kernel/syscall.c | 2 +-
arch/powerpc/kernel/traps.c | 2 +-
arch/powerpc/kernel/watchdog.c | 2 +-
arch/powerpc/perf/core-book3s.c | 2 +-
7 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 569ac1165b069..2b9cf0380e0e9 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -393,7 +393,7 @@ static inline void do_hard_irq_enable(void)
__hard_irq_enable();
}
-static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
+static inline bool regs_irqs_disabled(struct pt_regs *regs)
{
return (regs->softe & IRQS_DISABLED);
}
@@ -466,7 +466,7 @@ static inline bool arch_irqs_disabled(void)
#define hard_irq_disable() arch_local_irq_disable()
-static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
+static inline bool regs_irqs_disabled(struct pt_regs *regs)
{
return !(regs->msr & MSR_EE);
}
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 23638d4e73ac0..56bc8113b8cde 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -172,7 +172,7 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)
/* Enable MSR[RI] early, to support kernel SLB and hash faults */
#endif
- if (!arch_irq_disabled_regs(regs))
+ if (!regs_irqs_disabled(regs))
trace_hardirqs_off();
if (user_mode(regs)) {
@@ -192,10 +192,10 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs)
CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
ct_state() != CT_STATE_IDLE);
INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
- INT_SOFT_MASK_BUG_ON(regs, arch_irq_disabled_regs(regs) &&
+ INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
search_kernel_restart_table(regs->nip));
}
- INT_SOFT_MASK_BUG_ON(regs, !arch_irq_disabled_regs(regs) &&
+ INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
!(regs->msr & MSR_EE));
booke_restore_dbcr0();
@@ -298,7 +298,7 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
* Adjust regs->softe to be soft-masked if it had not been
* reconcied (e.g., interrupt entry with MSR[EE]=0 but softe
* not yet set disabled), or if it was in an implicit soft
- * masked state. This makes arch_irq_disabled_regs(regs)
+ * masked state. This makes regs_irqs_disabled(regs)
* behave as expected.
*/
regs->softe = IRQS_ALL_DISABLED;
@@ -372,7 +372,7 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
#ifdef CONFIG_PPC64
#ifdef CONFIG_PPC_BOOK3S
- if (arch_irq_disabled_regs(regs)) {
+ if (regs_irqs_disabled(regs)) {
unsigned long rst = search_kernel_restart_table(regs->nip);
if (rst)
regs_set_return_ip(regs, rst);
@@ -661,7 +661,7 @@ void replay_soft_interrupts(void);
static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
{
- if (!arch_irq_disabled_regs(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
}
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index e0c681d0b0763..0d8fd47049a19 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -347,7 +347,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
unsigned long ret;
BUG_ON(regs_is_unrecoverable(regs));
- BUG_ON(arch_irq_disabled_regs(regs));
+ BUG_ON(regs_irqs_disabled(regs));
CT_WARN_ON(ct_state() == CT_STATE_USER);
/*
@@ -396,7 +396,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
local_irq_disable();
- if (!arch_irq_disabled_regs(regs)) {
+ if (!regs_irqs_disabled(regs)) {
/* Returning to a kernel context with local irqs enabled. */
WARN_ON_ONCE(!(regs->msr & MSR_EE));
again:
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index be159ad4b77bd..9f03a6263fb41 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -32,7 +32,7 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
BUG_ON(regs_is_unrecoverable(regs));
BUG_ON(!user_mode(regs));
- BUG_ON(arch_irq_disabled_regs(regs));
+ BUG_ON(regs_irqs_disabled(regs));
#ifdef CONFIG_PPC_PKEY
if (mmu_has_feature(MMU_FTR_PKEY)) {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index cb8e9357383e9..629f2a2d4780e 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1956,7 +1956,7 @@ DEFINE_INTERRUPT_HANDLER_RAW(performance_monitor_exception)
* prevent hash faults on user addresses when reading callchains (and
* looks better from an irq tracing perspective).
*/
- if (IS_ENABLED(CONFIG_PPC64) && unlikely(arch_irq_disabled_regs(regs)))
+ if (IS_ENABLED(CONFIG_PPC64) && unlikely(regs_irqs_disabled(regs)))
performance_monitor_exception_nmi(regs);
else
performance_monitor_exception_async(regs);
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 2429cb1c7baa7..6111cbbde069d 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -373,7 +373,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
u64 tb;
/* should only arrive from kernel, with irqs disabled */
- WARN_ON_ONCE(!arch_irq_disabled_regs(regs));
+ WARN_ON_ONCE(!regs_irqs_disabled(regs));
if (!cpumask_test_cpu(cpu, &wd_cpus_enabled))
return 0;
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 8b0081441f85d..f7518b7e30554 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2482,7 +2482,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
* will trigger a PMI after waking up from idle. Since counter values are _not_
* saved/restored in idle path, can lead to below "Can't find PMC" message.
*/
- if (unlikely(!found) && !arch_irq_disabled_regs(regs))
+ if (unlikely(!found) && !regs_irqs_disabled(regs))
printk_ratelimited(KERN_WARNING "Can't find PMC that caused IRQ\n");
/*
--
2.51.0
* [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-13 12:49 ` Shrikanth Hegde
2025-09-08 21:02 ` [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Building with the generic entry/exit framework on powerpc requires a few
preparatory steps.
Introduce minor infrastructure updates to prepare for the generic
framework handling:
- Add syscall_work field to struct thread_info for SYSCALL_WORK_* flags.
- Provide arch_syscall_is_vdso_sigreturn() stub, returning false.
- Add on_thread_stack() helper to test whether the current stack pointer
lies within the task’s kernel stack.
No functional change is intended with this patch.
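The on_thread_stack() helper relies on kernel stacks being THREAD_SIZE
aligned: two addresses lie on the same stack iff all their bits above
log2(THREAD_SIZE) agree. A minimal illustration (hypothetical values,
not part of the patch):

  /* XOR of two pointers on the same THREAD_SIZE-aligned stack has no
   * bits set above the THREAD_SIZE boundary, so masking with
   * ~(THREAD_SIZE - 1) yields zero. */
  bool same_stack = !((sp1 ^ sp2) & ~(THREAD_SIZE - 1));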
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
arch/powerpc/include/asm/stacktrace.h | 8 ++++++++
arch/powerpc/include/asm/syscall.h | 5 +++++
arch/powerpc/include/asm/thread_info.h | 1 +
4 files changed, 25 insertions(+)
create mode 100644 arch/powerpc/include/asm/entry-common.h
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
new file mode 100644
index 0000000000000..3af16d821d07e
--- /dev/null
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_PPC_ENTRY_COMMON_H
+#define _ASM_PPC_ENTRY_COMMON_H
+
+#ifdef CONFIG_GENERIC_IRQ_ENTRY
+
+#include <asm/stacktrace.h>
+
+#endif /* CONFIG_GENERIC_IRQ_ENTRY */
+#endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
index 6149b53b3bc8e..3f0a242468813 100644
--- a/arch/powerpc/include/asm/stacktrace.h
+++ b/arch/powerpc/include/asm/stacktrace.h
@@ -8,6 +8,14 @@
#ifndef _ASM_POWERPC_STACKTRACE_H
#define _ASM_POWERPC_STACKTRACE_H
+#include <linux/sched.h>
+
void show_user_instructions(struct pt_regs *regs);
+static inline bool on_thread_stack(void)
+{
+ return !(((unsigned long)(current->stack) ^ current_stack_pointer)
+ & ~(THREAD_SIZE - 1));
+}
+
#endif /* _ASM_POWERPC_STACKTRACE_H */
diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index 4b3c52ed6e9d2..834fcc4f7b543 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct task_struct *task)
else
return AUDIT_ARCH_PPC64;
}
+
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+{
+ return false;
+}
#endif /* _ASM_SYSCALL_H */
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 2785c7462ebf7..d0e87c9bae0b0 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -54,6 +54,7 @@
struct thread_info {
int preempt_count; /* 0 => preemptable,
<0 => BUG */
+ unsigned long syscall_work; /* SYSCALL_WORK_ flags */
#ifdef CONFIG_SMP
unsigned int cpu;
#endif
--
2.51.0
* [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-14 9:02 ` Shrikanth Hegde
2025-09-08 21:02 ` [RFC V2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
- Implement the arch_enter_from_user_mode() hook for syscall entry.
- Move booke_load_dbcr0() from interrupt.c to interrupt.h.
No functional change intended.
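For context, the generic entry code calls this hook first on entry from
user mode, roughly (simplified from the generic enter_from_user_mode()):

  static __always_inline void enter_from_user_mode(struct pt_regs *regs)
  {
          arch_enter_from_user_mode(regs);
          lockdep_hardirqs_off(CALLER_ADDR0);
          user_exit_irqoff();
          ...
  }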
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/entry-common.h | 96 +++++++++++++++++++++++++
arch/powerpc/include/asm/interrupt.h | 23 ++++++
arch/powerpc/kernel/interrupt.c | 22 ------
3 files changed, 119 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index 3af16d821d07e..49607292bf5a5 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -5,7 +5,103 @@
#ifdef CONFIG_GENERIC_IRQ_ENTRY
+#include <asm/cputime.h>
+#include <asm/interrupt.h>
#include <asm/stacktrace.h>
+#include <asm/tm.h>
+
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
+{
+ if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
+ BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
+
+ BUG_ON(regs_is_unrecoverable(regs));
+ BUG_ON(!user_mode(regs));
+ BUG_ON(regs_irqs_disabled(regs));
+
+#ifdef CONFIG_PPC_PKEY
+ if (mmu_has_feature(MMU_FTR_PKEY)) {
+ unsigned long amr, iamr;
+ bool flush_needed = false;
+ /*
+ * When entering from userspace we mostly have the AMR/IAMR
+ * different from kernel default values. Hence don't compare.
+ */
+ amr = mfspr(SPRN_AMR);
+ iamr = mfspr(SPRN_IAMR);
+ regs->amr = amr;
+ regs->iamr = iamr;
+ if (mmu_has_feature(MMU_FTR_KUAP)) {
+ mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
+ flush_needed = true;
+ }
+ if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
+ mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
+ flush_needed = true;
+ }
+ if (flush_needed)
+ isync();
+ } else
+#endif
+ kuap_assert_locked();
+
+ booke_restore_dbcr0();
+
+ account_cpu_user_entry();
+
+ account_stolen_time();
+
+ /*
+ * This is not required for the syscall exit path, but makes the
+ * stack frame look nicer. If this was initialised in the first stack
+ * frame, or if the unwinder was taught the first stack frame always
+ * returns to user with IRQS_ENABLED, this store could be avoided!
+ */
+ irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
+
+ /*
+ * If system call is called with TM active, set _TIF_RESTOREALL to
+ * prevent RFSCV being used to return to userspace, because POWER9
+ * TM implementation has problems with this instruction returning to
+ * transactional state. Final register values are not relevant because
+ * the transaction will be aborted upon return anyway. Or in the case
+ * of unsupported_scv SIGILL fault, the return state does not much
+ * matter because it's an edge case.
+ */
+ if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+ unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
+ set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
+
+ /*
+ * If the system call was made with a transaction active, doom it and
+ * return without performing the system call. Unless it was an
+ * unsupported scv vector, in which case it's treated like an illegal
+ * instruction.
+ */
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+ if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
+ !trap_is_unsupported_scv(regs)) {
+ /* Enable TM in the kernel, and disable EE (for scv) */
+ hard_irq_disable();
+ mtmsr(mfmsr() | MSR_TM);
+
+ /* tabort, this dooms the transaction, nothing else */
+ asm volatile(".long 0x7c00071d | ((%0) << 16)"
+ :: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
+
+ /*
+ * Userspace will never see the return value. Execution will
+ * resume after the tbegin. of the aborted transaction with the
+ * checkpointed register state. A context switch could occur
+ * or signal delivered to the process before resuming the
+ * doomed transaction context, but that should all be handled
+ * as expected.
+ */
+ return;
+ }
+#endif // CONFIG_PPC_TRANSACTIONAL_MEM
+}
+#define arch_enter_from_user_mode arch_enter_from_user_mode
#endif /* CONFIG_GENERIC_IRQ_ENTRY */
#endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 56bc8113b8cde..6edf064a0fea2 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -138,6 +138,29 @@ static inline void nap_adjust_return(struct pt_regs *regs)
#endif
}
+static inline void booke_load_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+ unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+ if (likely(!(dbcr0 & DBCR0_IDM)))
+ return;
+
+ /*
+ * Check to see if the dbcr0 register is set up to debug.
+ * Use the internal debug mode bit to do this.
+ */
+ mtmsr(mfmsr() & ~MSR_DE);
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ isync();
+ global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
+ }
+ mtspr(SPRN_DBCR0, dbcr0);
+ mtspr(SPRN_DBSR, -1);
+#endif
+}
+
+
static inline void booke_restore_dbcr0(void)
{
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 0d8fd47049a19..2a09ac5dabd62 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
return true;
}
-static notrace void booke_load_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
- unsigned long dbcr0 = current->thread.debug.dbcr0;
-
- if (likely(!(dbcr0 & DBCR0_IDM)))
- return;
-
- /*
- * Check to see if the dbcr0 register is set up to debug.
- * Use the internal debug mode bit to do this.
- */
- mtmsr(mfmsr() & ~MSR_DE);
- if (IS_ENABLED(CONFIG_PPC32)) {
- isync();
- global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
- }
- mtspr(SPRN_DBCR0, dbcr0);
- mtspr(SPRN_DBSR, -1);
-#endif
-}
-
static notrace void check_return_regs_valid(struct pt_regs *regs)
{
#ifdef CONFIG_PPC_BOOK3S_64
--
2.51.0
* [RFC V2 4/8] powerpc: Introduce syscall exit arch functions
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Introduce the following functions for the syscall exit path:
- arch_exit_to_user_mode_prepare
- arch_exit_to_user_mode
The generic exit code invokes them as sketched below.
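Roughly, the generic exit code calls these hooks as follows (call-flow
sketch, not actual code):

  exit_to_user_mode_prepare(regs)
    -> exit_to_user_mode_loop(regs, ti_work)          /* pending TIF work */
    -> arch_exit_to_user_mode_prepare(regs, ti_work)  /* this patch */
  exit_to_user_mode()
    -> arch_exit_to_user_mode()                       /* this patch */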
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/entry-common.h | 46 ++++++++++++++
arch/powerpc/include/asm/interrupt.h | 82 +++++++++++++++++++++++++
arch/powerpc/kernel/interrupt.c | 81 ------------------------
arch/powerpc/kernel/signal.c | 14 +++++
4 files changed, 142 insertions(+), 81 deletions(-)
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index 49607292bf5a5..adea093274279 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -8,6 +8,7 @@
#include <asm/cputime.h>
#include <asm/interrupt.h>
#include <asm/stacktrace.h>
+#include <asm/switch_to.h>
#include <asm/tm.h>
static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
@@ -101,7 +102,52 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
}
#endif // CONFIG_PPC_TRANSACTIONAL_MEM
}
+
#define arch_enter_from_user_mode arch_enter_from_user_mode
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+ unsigned long mathflags;
+
+ if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
+ if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
+ unlikely((ti_work & _TIF_RESTORE_TM))) {
+ restore_tm_state(regs);
+ } else {
+ mathflags = MSR_FP;
+
+ if (cpu_has_feature(CPU_FTR_VSX))
+ mathflags |= MSR_VEC | MSR_VSX;
+ else if (cpu_has_feature(CPU_FTR_ALTIVEC))
+ mathflags |= MSR_VEC;
+
+ /*
+ * If userspace MSR has all available FP bits set,
+ * then they are live and no need to restore. If not,
+ * it means the regs were given up and restore_math
+ * may decide to restore them (to avoid taking an FP
+ * fault).
+ */
+ if ((regs->msr & mathflags) != mathflags)
+ restore_math(regs);
+ }
+ }
+
+ check_return_regs_valid(regs);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+ local_paca->tm_scratch = regs->msr;
+#endif
+}
+#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
+
+static __always_inline void arch_exit_to_user_mode(void)
+{
+ booke_load_dbcr0();
+
+ account_cpu_user_exit();
+}
+#define arch_exit_to_user_mode arch_exit_to_user_mode
+
#endif /* CONFIG_GENERIC_IRQ_ENTRY */
#endif /* _ASM_PPC_ENTRY_COMMON_H */
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 6edf064a0fea2..c6ab286a723f2 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -68,6 +68,8 @@
#include <linux/context_tracking.h>
#include <linux/hardirq.h>
+#include <linux/sched/debug.h> /* for show_regs */
+
#include <asm/cputime.h>
#include <asm/firmware.h>
#include <asm/ftrace.h>
@@ -173,6 +175,86 @@ static inline void booke_restore_dbcr0(void)
#endif
}
+static inline void check_return_regs_valid(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+ unsigned long trap, srr0, srr1;
+ static bool warned;
+ u8 *validp;
+ char *h;
+
+ if (trap_is_scv(regs))
+ return;
+
+ trap = TRAP(regs);
+ // EE in HV mode sets HSRRs like 0xea0
+ if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
+ trap = 0xea0;
+
+ switch (trap) {
+ case 0x980:
+ case INTERRUPT_H_DATA_STORAGE:
+ case 0xe20:
+ case 0xe40:
+ case INTERRUPT_HMI:
+ case 0xe80:
+ case 0xea0:
+ case INTERRUPT_H_FAC_UNAVAIL:
+ case 0x1200:
+ case 0x1500:
+ case 0x1600:
+ case 0x1800:
+ validp = &local_paca->hsrr_valid;
+ if (!READ_ONCE(*validp))
+ return;
+
+ srr0 = mfspr(SPRN_HSRR0);
+ srr1 = mfspr(SPRN_HSRR1);
+ h = "H";
+
+ break;
+ default:
+ validp = &local_paca->srr_valid;
+ if (!READ_ONCE(*validp))
+ return;
+
+ srr0 = mfspr(SPRN_SRR0);
+ srr1 = mfspr(SPRN_SRR1);
+ h = "";
+ break;
+ }
+
+ if (srr0 == regs->nip && srr1 == regs->msr)
+ return;
+
+ /*
+ * A NMI / soft-NMI interrupt may have come in after we found
+ * srr_valid and before the SRRs are loaded. The interrupt then
+ * comes in and clobbers SRRs and clears srr_valid. Then we load
+ * the SRRs here and test them above and find they don't match.
+ *
+ * Test validity again after that, to catch such false positives.
+ *
+ * This test in general will have some window for false negatives
+ * and may not catch and fix all such cases if an NMI comes in
+ * later and clobbers SRRs without clearing srr_valid, but hopefully
+ * such things will get caught most of the time, statistically
+ * enough to be able to get a warning out.
+ */
+ if (!READ_ONCE(*validp))
+ return;
+
+ if (!data_race(warned)) {
+ data_race(warned = true);
+ printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
+ printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
+ show_regs(regs);
+ }
+
+ WRITE_ONCE(*validp, 0); /* fixup */
+#endif
+}
+
static inline void interrupt_enter_prepare(struct pt_regs *regs)
{
#ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 2a09ac5dabd62..f53d432f60870 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -4,7 +4,6 @@
#include <linux/err.h>
#include <linux/compat.h>
#include <linux/rseq.h>
-#include <linux/sched/debug.h> /* for show_regs */
#include <asm/kup.h>
#include <asm/cputime.h>
@@ -78,86 +77,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
return true;
}
-static notrace void check_return_regs_valid(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
- unsigned long trap, srr0, srr1;
- static bool warned;
- u8 *validp;
- char *h;
-
- if (trap_is_scv(regs))
- return;
-
- trap = TRAP(regs);
- // EE in HV mode sets HSRRs like 0xea0
- if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
- trap = 0xea0;
-
- switch (trap) {
- case 0x980:
- case INTERRUPT_H_DATA_STORAGE:
- case 0xe20:
- case 0xe40:
- case INTERRUPT_HMI:
- case 0xe80:
- case 0xea0:
- case INTERRUPT_H_FAC_UNAVAIL:
- case 0x1200:
- case 0x1500:
- case 0x1600:
- case 0x1800:
- validp = &local_paca->hsrr_valid;
- if (!READ_ONCE(*validp))
- return;
-
- srr0 = mfspr(SPRN_HSRR0);
- srr1 = mfspr(SPRN_HSRR1);
- h = "H";
-
- break;
- default:
- validp = &local_paca->srr_valid;
- if (!READ_ONCE(*validp))
- return;
-
- srr0 = mfspr(SPRN_SRR0);
- srr1 = mfspr(SPRN_SRR1);
- h = "";
- break;
- }
-
- if (srr0 == regs->nip && srr1 == regs->msr)
- return;
-
- /*
- * A NMI / soft-NMI interrupt may have come in after we found
- * srr_valid and before the SRRs are loaded. The interrupt then
- * comes in and clobbers SRRs and clears srr_valid. Then we load
- * the SRRs here and test them above and find they don't match.
- *
- * Test validity again after that, to catch such false positives.
- *
- * This test in general will have some window for false negatives
- * and may not catch and fix all such cases if an NMI comes in
- * later and clobbers SRRs without clearing srr_valid, but hopefully
- * such things will get caught most of the time, statistically
- * enough to be able to get a warning out.
- */
- if (!READ_ONCE(*validp))
- return;
-
- if (!data_race(warned)) {
- data_race(warned = true);
- printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
- printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
- show_regs(regs);
- }
-
- WRITE_ONCE(*validp, 0); /* fixup */
-#endif
-}
-
static notrace unsigned long
interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
{
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index aa17e62f37547..719930cf4ae1f 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -22,6 +22,11 @@
#include "signal.h"
+/* This will be removed */
+#ifdef CONFIG_GENERIC_ENTRY
+#include <linux/entry-common.h>
+#endif /* CONFIG_GENERIC_ENTRY */
+
#ifdef CONFIG_VSX
unsigned long copy_fpr_to_user(void __user *to,
struct task_struct *task)
@@ -368,3 +373,12 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
printk_ratelimited(regs->msr & MSR_64BIT ? fm64 : fm32, tsk->comm,
task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
}
+
+#ifdef CONFIG_GENERIC_ENTRY
+void arch_do_signal_or_restart(struct pt_regs *regs)
+{
+ BUG_ON(regs != current->thread.regs);
+ local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
+ do_signal(current);
+}
+#endif /* CONFIG_GENERIC_ENTRY */
--
2.51.0
* [RFC V2 5/8] powerpc: add exit_flags field in pt_regs
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Add an exit_flags field to pt_regs. It holds the flags set while handling
an interrupt or syscall that are needed on the exit-to-user path.
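On 64-bit, exit_flags alone would grow pt_regs by 8 bytes; together with
the pad word the structure grows by 16 bytes, so the interrupt stack
frame keeps its 16-byte alignment. A sanity-check sketch (an assumption,
not part of this patch):

  BUILD_BUG_ON(sizeof(struct pt_regs) % 16);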
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/ptrace.h | 2 ++
arch/powerpc/include/uapi/asm/ptrace.h | 14 +++++++++-----
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/ptrace/ptrace.c | 1 +
4 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index 7b9350756875a..1b0ad5088f60d 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -53,6 +53,8 @@ struct pt_regs
unsigned long esr;
};
unsigned long result;
+ unsigned long exit_flags;
+ unsigned long __pt_regs_pad[1]; /* Maintain 16 byte interrupt stack alignment */
};
};
#if defined(CONFIG_PPC64) || defined(CONFIG_PPC_KUAP)
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
index 7004cfea3f5ff..4de612e2e40ac 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -55,6 +55,8 @@ struct pt_regs
unsigned long dar; /* Fault registers */
unsigned long dsisr; /* on 4xx/Book-E used for ESR */
unsigned long result; /* Result of a system call */
+ unsigned long exit_flags; /* System call exit flags */
+ unsigned long __pt_regs_pad[1]; /* Maintain 16 byte interrupt stack alignment */
};
#endif /* __ASSEMBLY__ */
@@ -114,10 +116,12 @@ struct pt_regs
#define PT_DAR 41
#define PT_DSISR 42
#define PT_RESULT 43
-#define PT_DSCR 44
-#define PT_REGS_COUNT 44
+#define PT_EXIT_FLAGS 44
+#define PT_PAD 45
+#define PT_DSCR 46
+#define PT_REGS_COUNT 46
-#define PT_FPR0 48 /* each FP reg occupies 2 slots in this space */
+#define PT_FPR0 (PT_REGS_COUNT + 4) /* each FP reg occupies 2 slots in this space */
#ifndef __powerpc64__
@@ -129,7 +133,7 @@ struct pt_regs
#define PT_FPSCR (PT_FPR0 + 32) /* each FP reg occupies 1 slot in 64-bit space */
-#define PT_VR0 82 /* each Vector reg occupies 2 slots in 64-bit */
+#define PT_VR0 (PT_FPSCR + 2) /* <82> each Vector reg occupies 2 slots in 64-bit */
#define PT_VSCR (PT_VR0 + 32*2 + 1)
#define PT_VRSAVE (PT_VR0 + 33*2)
@@ -137,7 +141,7 @@ struct pt_regs
/*
* Only store first 32 VSRs here. The second 32 VSRs in VR0-31
*/
-#define PT_VSR0 150 /* each VSR reg occupies 2 slots in 64-bit */
+#define PT_VSR0 (PT_VRSAVE + 2) /* each VSR reg occupies 2 slots in 64-bit */
#define PT_VSR31 (PT_VSR0 + 2*31)
#endif /* __powerpc64__ */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index b3048f6d3822c..4d4e880e3c616 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -291,6 +291,7 @@ int main(void)
STACK_PT_REGS_OFFSET(_ESR, esr);
STACK_PT_REGS_OFFSET(ORIG_GPR3, orig_gpr3);
STACK_PT_REGS_OFFSET(RESULT, result);
+ STACK_PT_REGS_OFFSET(EXIT_FLAGS, exit_flags);
STACK_PT_REGS_OFFSET(_TRAP, trap);
#ifdef CONFIG_PPC64
STACK_PT_REGS_OFFSET(SOFTE, softe);
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index c6997df632873..2134b6d155ff6 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -432,6 +432,7 @@ void __init pt_regs_check(void)
CHECK_REG(PT_DAR, dar);
CHECK_REG(PT_DSISR, dsisr);
CHECK_REG(PT_RESULT, result);
+ CHECK_REG(PT_EXIT_FLAGS, exit_flags);
#undef CHECK_REG
BUILD_BUG_ON(PT_REGS_COUNT != sizeof(struct user_pt_regs) / sizeof(unsigned long));
--
2.51.0
* [RFC V2 6/8] powerpc: Prepare for IRQ entry exit
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Copy these functions from interrupt.h into the arch-specific
entry-common.h, since they will now be part of the common entry code.
No functional change intended here.
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/include/asm/entry-common.h | 420 ++++++++++++++++++++++++
1 file changed, 420 insertions(+)
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index adea093274279..28a96a84e83b5 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -7,10 +7,430 @@
#include <asm/cputime.h>
#include <asm/interrupt.h>
+#include <asm/runlatch.h>
#include <asm/stacktrace.h>
#include <asm/switch_to.h>
#include <asm/tm.h>
+#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
+/*
+ * WARN/BUG is handled with a program interrupt so minimise checks here to
+ * avoid recursion and maximise the chance of getting the first oops handled.
+ */
+#define INT_SOFT_MASK_BUG_ON(regs, cond) \
+do { \
+ if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM))) \
+ BUG_ON(cond); \
+} while (0)
+#else
+#define INT_SOFT_MASK_BUG_ON(regs, cond)
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
+extern char __end_soft_masked[];
+bool search_kernel_soft_mask_table(unsigned long addr);
+unsigned long search_kernel_restart_table(unsigned long addr);
+
+DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
+
+static inline bool is_implicit_soft_masked(struct pt_regs *regs)
+{
+ if (user_mode(regs))
+ return false;
+
+ if (regs->nip >= (unsigned long)__end_soft_masked)
+ return false;
+
+ return search_kernel_soft_mask_table(regs->nip);
+}
+
+static inline void srr_regs_clobbered(void)
+{
+ local_paca->srr_valid = 0;
+ local_paca->hsrr_valid = 0;
+}
+#else
+static inline unsigned long search_kernel_restart_table(unsigned long addr)
+{
+ return 0;
+}
+
+static inline bool is_implicit_soft_masked(struct pt_regs *regs)
+{
+ return false;
+}
+
+static inline void srr_regs_clobbered(void)
+{
+}
+#endif
+
+static inline void nap_adjust_return(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_970_NAP
+ if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
+ /* Can avoid a test-and-clear because NMIs do not call this */
+ clear_thread_local_flags(_TLF_NAPPING);
+ regs_set_return_ip(regs, (unsigned long)power4_idle_nap_return);
+ }
+#endif
+}
+
+static inline void booke_load_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+ unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+ if (likely(!(dbcr0 & DBCR0_IDM)))
+ return;
+
+ /*
+ * Check to see if the dbcr0 register is set up to debug.
+ * Use the internal debug mode bit to do this.
+ */
+ mtmsr(mfmsr() & ~MSR_DE);
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ isync();
+ global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
+ }
+ mtspr(SPRN_DBCR0, dbcr0);
+ mtspr(SPRN_DBSR, -1);
+#endif
+}
+
+
+static inline void booke_restore_dbcr0(void)
+{
+#ifdef CONFIG_PPC_ADV_DEBUG_REGS
+ unsigned long dbcr0 = current->thread.debug.dbcr0;
+
+ if (IS_ENABLED(CONFIG_PPC32) && unlikely(dbcr0 & DBCR0_IDM)) {
+ mtspr(SPRN_DBSR, -1);
+ mtspr(SPRN_DBCR0, global_dbcr0[smp_processor_id()]);
+ }
+#endif
+}
+
+static inline void check_return_regs_valid(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC_BOOK3S_64
+ unsigned long trap, srr0, srr1;
+ static bool warned;
+ u8 *validp;
+ char *h;
+
+ if (trap_is_scv(regs))
+ return;
+
+ trap = TRAP(regs);
+ // EE in HV mode sets HSRRs like 0xea0
+ if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
+ trap = 0xea0;
+
+ switch (trap) {
+ case 0x980:
+ case INTERRUPT_H_DATA_STORAGE:
+ case 0xe20:
+ case 0xe40:
+ case INTERRUPT_HMI:
+ case 0xe80:
+ case 0xea0:
+ case INTERRUPT_H_FAC_UNAVAIL:
+ case 0x1200:
+ case 0x1500:
+ case 0x1600:
+ case 0x1800:
+ validp = &local_paca->hsrr_valid;
+ if (!READ_ONCE(*validp))
+ return;
+
+ srr0 = mfspr(SPRN_HSRR0);
+ srr1 = mfspr(SPRN_HSRR1);
+ h = "H";
+
+ break;
+ default:
+ validp = &local_paca->srr_valid;
+ if (!READ_ONCE(*validp))
+ return;
+
+ srr0 = mfspr(SPRN_SRR0);
+ srr1 = mfspr(SPRN_SRR1);
+ h = "";
+ break;
+ }
+
+ if (srr0 == regs->nip && srr1 == regs->msr)
+ return;
+
+ /*
+ * A NMI / soft-NMI interrupt may have come in after we found
+ * srr_valid and before the SRRs are loaded. The interrupt then
+ * comes in and clobbers SRRs and clears srr_valid. Then we load
+ * the SRRs here and test them above and find they don't match.
+ *
+ * Test validity again after that, to catch such false positives.
+ *
+ * This test in general will have some window for false negatives
+ * and may not catch and fix all such cases if an NMI comes in
+ * later and clobbers SRRs without clearing srr_valid, but hopefully
+ * such things will get caught most of the time, statistically
+ * enough to be able to get a warning out.
+ */
+ if (!READ_ONCE(*validp))
+ return;
+
+ if (!data_race(warned)) {
+ data_race(warned = true);
+ printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
+ printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
+ show_regs(regs);
+ }
+
+ WRITE_ONCE(*validp, 0); /* fixup */
+#endif
+}
+
+static inline void interrupt_enter_prepare(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC64
+ irq_soft_mask_set(IRQS_ALL_DISABLED);
+
+ /*
+ * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
+ * Asynchronous interrupts get here with HARD_DIS set (see below), so
+ * this enables MSR[EE] for synchronous interrupts. IRQs remain
+ * soft-masked. The interrupt handler may later call
+ * interrupt_cond_local_irq_enable() to achieve a regular process
+ * context.
+ */
+ if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
+ INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
+ __hard_irq_enable();
+ } else {
+ __hard_RI_enable();
+ }
+ /* Enable MSR[RI] early, to support kernel SLB and hash faults */
+#endif
+
+ if (!regs_irqs_disabled(regs))
+ trace_hardirqs_off();
+
+ if (user_mode(regs)) {
+ kuap_lock();
+ CT_WARN_ON(ct_state() != CT_STATE_USER);
+ user_exit_irqoff();
+
+ account_cpu_user_entry();
+ account_stolen_time();
+ } else {
+ kuap_save_and_lock(regs);
+ /*
+ * CT_WARN_ON comes here via program_check_exception,
+ * so avoid recursion.
+ */
+ if (TRAP(regs) != INTERRUPT_PROGRAM)
+ CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
+ ct_state() != CT_STATE_IDLE);
+ INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
+ INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
+ search_kernel_restart_table(regs->nip));
+ }
+ INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
+ !(regs->msr & MSR_EE));
+
+ booke_restore_dbcr0();
+}
+
+/*
+ * Care should be taken to note that interrupt_exit_prepare and
+ * interrupt_async_exit_prepare do not necessarily return immediately to
+ * regs context (e.g., if regs is usermode, we don't necessarily return to
+ * user mode). Other interrupts might be taken between here and return,
+ * context switch / preemption may occur in the exit path after this, or a
+ * signal may be delivered, etc.
+ *
+ * The real interrupt exit code is platform specific, e.g.,
+ * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
+ *
+ * However interrupt_nmi_exit_prepare does return directly to regs, because
+ * NMIs do not do "exit work" or replay soft-masked interrupts.
+ */
+static inline void interrupt_exit_prepare(struct pt_regs *regs)
+{
+}
+
+static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC64
+ /* Ensure interrupt_enter_prepare does not enable MSR[EE] */
+ local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+#endif
+ interrupt_enter_prepare(regs);
+#ifdef CONFIG_PPC_BOOK3S_64
+ /*
+ * RI=1 is set by interrupt_enter_prepare, so this thread flags access
+ * has to come afterward (it can cause SLB faults).
+ */
+ if (cpu_has_feature(CPU_FTR_CTRL) &&
+ !test_thread_local_flags(_TLF_RUNLATCH))
+ __ppc64_runlatch_on();
+#endif
+ irq_enter();
+}
+
+static inline void interrupt_async_exit_prepare(struct pt_regs *regs)
+{
+ /*
+ * Adjust at exit so the main handler sees the true NIA. This must
+ * come before irq_exit() because irq_exit can enable interrupts, and
+ * if another interrupt is taken before nap_adjust_return has run
+ * here, then that interrupt would return directly to idle nap return.
+ */
+ nap_adjust_return(regs);
+
+ irq_exit();
+ interrupt_exit_prepare(regs);
+}
+
+struct interrupt_nmi_state {
+#ifdef CONFIG_PPC64
+ u8 irq_soft_mask;
+ u8 irq_happened;
+ u8 ftrace_enabled;
+ u64 softe;
+#endif
+};
+
+static inline bool nmi_disables_ftrace(struct pt_regs *regs)
+{
+ /* Allow DEC and PMI to be traced when they are soft-NMI */
+ if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
+ if (TRAP(regs) == INTERRUPT_DECREMENTER)
+ return false;
+ if (TRAP(regs) == INTERRUPT_PERFMON)
+ return false;
+ }
+ if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
+ if (TRAP(regs) == INTERRUPT_PERFMON)
+ return false;
+ }
+
+ return true;
+}
+
+static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+{
+#ifdef CONFIG_PPC64
+ state->irq_soft_mask = local_paca->irq_soft_mask;
+ state->irq_happened = local_paca->irq_happened;
+ state->softe = regs->softe;
+
+ /*
+ * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
+ * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
+ * because that goes through irq tracing which we don't want in NMI.
+ */
+ local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
+ local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
+ if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
+ /*
+ * Adjust regs->softe to be soft-masked if it had not been
+ * reconcied (e.g., interrupt entry with MSR[EE]=0 but softe
+ * not yet set disabled), or if it was in an implicit soft
+ * masked state. This makes regs_irqs_disabled(regs)
+ * behave as expected.
+ */
+ regs->softe = IRQS_ALL_DISABLED;
+ }
+
+ __hard_RI_enable();
+
+ /* Don't do any per-CPU operations until interrupt state is fixed */
+
+ if (nmi_disables_ftrace(regs)) {
+ state->ftrace_enabled = this_cpu_get_ftrace_enabled();
+ this_cpu_set_ftrace_enabled(0);
+ }
+#endif
+
+ /* If data relocations are enabled, it's safe to use nmi_enter() */
+ if (mfmsr() & MSR_DR) {
+ nmi_enter();
+ return;
+ }
+
+ /*
+ * But do not use nmi_enter() for pseries hash guest taking a real-mode
+ * NMI because not everything it touches is within the RMA limit.
+ */
+ if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
+ firmware_has_feature(FW_FEATURE_LPAR) &&
+ !radix_enabled())
+ return;
+
+ /*
+ * Likewise, don't use it if we have some form of instrumentation (like
+ * KASAN shadow) that is not safe to access in real mode (even on radix)
+ */
+ if (IS_ENABLED(CONFIG_KASAN))
+ return;
+
+ /*
+ * Likewise, do not use it in real mode if percpu first chunk is not
+ * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
+ * are chances where percpu allocation can come from vmalloc area.
+ */
+ if (percpu_first_chunk_is_paged)
+ return;
+
+ /* Otherwise, it should be safe to call it */
+ nmi_enter();
+}
+
+static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+{
+ if (mfmsr() & MSR_DR) {
+ // nmi_exit if relocations are on
+ nmi_exit();
+ } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
+ firmware_has_feature(FW_FEATURE_LPAR) &&
+ !radix_enabled()) {
+ // no nmi_exit for a pseries hash guest taking a real mode exception
+ } else if (IS_ENABLED(CONFIG_KASAN)) {
+ // no nmi_exit for KASAN in real mode
+ } else if (percpu_first_chunk_is_paged) {
+ // no nmi_exit if percpu first chunk is not embedded
+ } else {
+ nmi_exit();
+ }
+
+ /*
+ * nmi does not call nap_adjust_return because nmi should not create
+ * new work to do (must use irq_work for that).
+ */
+
+#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S
+ if (regs_irqs_disabled(regs)) {
+ unsigned long rst = search_kernel_restart_table(regs->nip);
+ if (rst)
+ regs_set_return_ip(regs, rst);
+ }
+#endif
+
+ if (nmi_disables_ftrace(regs))
+ this_cpu_set_ftrace_enabled(state->ftrace_enabled);
+
+ /* Check we didn't change the pending interrupt mask. */
+ WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
+ regs->softe = state->softe;
+ local_paca->irq_happened = state->irq_happened;
+ local_paca->irq_soft_mask = state->irq_soft_mask;
+#endif
+}
+
static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
{
if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
--
2.51.0
* [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path.
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-16 4:16 ` Shrikanth Hegde
2025-09-08 21:02 ` [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Enable the generic IRQ entry/exit path for powerpc.
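With GENERIC_IRQ_ENTRY selected, interrupt wrappers can funnel through
the common helpers; rough shape (sketch, assuming the generic irqentry
API):

  irqentry_state_t state = irqentry_enter(regs);
  irq_enter_rcu();
  /* ... handler body ... */
  irq_exit_rcu();
  irqentry_exit(regs, state);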
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/entry-common.h | 93 ++---
arch/powerpc/include/asm/interrupt.h | 492 +++---------------------
arch/powerpc/kernel/interrupt.c | 9 +-
arch/powerpc/kernel/interrupt_64.S | 2 -
5 files changed, 92 insertions(+), 505 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 93402a1d9c9fc..e0c51d7b5638d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -202,6 +202,7 @@ config PPC
select GENERIC_GETTIMEOFDAY
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
+ select GENERIC_IRQ_ENTRY
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_PCI_IOMAP if PCI
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index 28a96a84e83b5..d3f4a12aeafca 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -191,6 +191,32 @@ static inline void check_return_regs_valid(struct pt_regs *regs)
#endif
}
+static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
+{
+#ifdef CONFIG_PPC64
+ irq_soft_mask_set(IRQS_ALL_DISABLED);
+
+ /*
+ * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
+ * Asynchronous interrupts get here with HARD_DIS set (see below), so
+ * this enables MSR[EE] for synchronous interrupts. IRQs remain
+ * soft-masked. The interrupt handler may later call
+ * interrupt_cond_local_irq_enable() to achieve a regular process
+ * context.
+ */
+ if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
+ INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
+ __hard_irq_enable();
+ } else {
+ __hard_RI_enable();
+ }
+ /* Enable MSR[RI] early, to support kernel SLB and hash faults */
+#endif
+
+ if (!regs_irqs_disabled(regs))
+ trace_hardirqs_off();
+}
+
static inline void interrupt_enter_prepare(struct pt_regs *regs)
{
#ifdef CONFIG_PPC64
@@ -266,7 +292,7 @@ static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
/* Ensure interrupt_enter_prepare does not enable MSR[EE] */
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
#endif
- interrupt_enter_prepare(regs);
+ arch_interrupt_enter_prepare(regs);
#ifdef CONFIG_PPC_BOOK3S_64
/*
* RI=1 is set by interrupt_enter_prepare, so this thread flags access
@@ -276,7 +302,6 @@ static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
!test_thread_local_flags(_TLF_RUNLATCH))
__ppc64_runlatch_on();
#endif
- irq_enter();
}
static inline void interrupt_async_exit_prepare(struct pt_regs *regs)
@@ -288,8 +313,6 @@ static inline void interrupt_async_exit_prepare(struct pt_regs *regs)
* here, then that interrupt would return directly to idle nap return.
*/
nap_adjust_return(regs);
-
- irq_exit();
interrupt_exit_prepare(regs);
}
@@ -319,7 +342,8 @@ static inline bool nmi_disables_ftrace(struct pt_regs *regs)
return true;
}
-static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs,
+ struct interrupt_nmi_state *state)
{
#ifdef CONFIG_PPC64
state->irq_soft_mask = local_paca->irq_soft_mask;
@@ -354,58 +378,11 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
this_cpu_set_ftrace_enabled(0);
}
#endif
-
- /* If data relocations are enabled, it's safe to use nmi_enter() */
- if (mfmsr() & MSR_DR) {
- nmi_enter();
- return;
- }
-
- /*
- * But do not use nmi_enter() for pseries hash guest taking a real-mode
- * NMI because not everything it touches is within the RMA limit.
- */
- if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
- firmware_has_feature(FW_FEATURE_LPAR) &&
- !radix_enabled())
- return;
-
- /*
- * Likewise, don't use it if we have some form of instrumentation (like
- * KASAN shadow) that is not safe to access in real mode (even on radix)
- */
- if (IS_ENABLED(CONFIG_KASAN))
- return;
-
- /*
- * Likewise, do not use it in real mode if percpu first chunk is not
- * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
- * are chances where percpu allocation can come from vmalloc area.
- */
- if (percpu_first_chunk_is_paged)
- return;
-
- /* Otherwise, it should be safe to call it */
- nmi_enter();
}
-static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
+static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs,
+ struct interrupt_nmi_state *state)
{
- if (mfmsr() & MSR_DR) {
- // nmi_exit if relocations are on
- nmi_exit();
- } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
- firmware_has_feature(FW_FEATURE_LPAR) &&
- !radix_enabled()) {
- // no nmi_exit for a pseries hash guest taking a real mode exception
- } else if (IS_ENABLED(CONFIG_KASAN)) {
- // no nmi_exit for KASAN in real mode
- } else if (percpu_first_chunk_is_paged) {
- // no nmi_exit if percpu first chunk is not embedded
- } else {
- nmi_exit();
- }
-
/*
* nmi does not call nap_adjust_return because nmi should not create
* new work to do (must use irq_work for that).
@@ -433,8 +410,11 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
{
- if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
+ kuap_lock();
+
+ if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) {
BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
+ }
BUG_ON(regs_is_unrecoverable(regs));
BUG_ON(!user_mode(regs));
@@ -465,11 +445,8 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
} else
#endif
kuap_assert_locked();
-
booke_restore_dbcr0();
-
account_cpu_user_entry();
-
account_stolen_time();
/*
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index c6ab286a723f2..830501bc1d4aa 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -66,434 +66,10 @@
#ifndef __ASSEMBLY__
-#include <linux/context_tracking.h>
-#include <linux/hardirq.h>
#include <linux/sched/debug.h> /* for show_regs */
+#include <linux/irq-entry-common.h>
-#include <asm/cputime.h>
-#include <asm/firmware.h>
-#include <asm/ftrace.h>
#include <asm/kprobes.h>
-#include <asm/runlatch.h>
-
-#ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
-/*
- * WARN/BUG is handled with a program interrupt so minimise checks here to
- * avoid recursion and maximise the chance of getting the first oops handled.
- */
-#define INT_SOFT_MASK_BUG_ON(regs, cond) \
-do { \
- if ((user_mode(regs) || (TRAP(regs) != INTERRUPT_PROGRAM))) \
- BUG_ON(cond); \
-} while (0)
-#else
-#define INT_SOFT_MASK_BUG_ON(regs, cond)
-#endif
-
-#ifdef CONFIG_PPC_BOOK3S_64
-extern char __end_soft_masked[];
-bool search_kernel_soft_mask_table(unsigned long addr);
-unsigned long search_kernel_restart_table(unsigned long addr);
-
-DECLARE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
-
-static inline bool is_implicit_soft_masked(struct pt_regs *regs)
-{
- if (user_mode(regs))
- return false;
-
- if (regs->nip >= (unsigned long)__end_soft_masked)
- return false;
-
- return search_kernel_soft_mask_table(regs->nip);
-}
-
-static inline void srr_regs_clobbered(void)
-{
- local_paca->srr_valid = 0;
- local_paca->hsrr_valid = 0;
-}
-#else
-static inline unsigned long search_kernel_restart_table(unsigned long addr)
-{
- return 0;
-}
-
-static inline bool is_implicit_soft_masked(struct pt_regs *regs)
-{
- return false;
-}
-
-static inline void srr_regs_clobbered(void)
-{
-}
-#endif
-
-static inline void nap_adjust_return(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_970_NAP
- if (unlikely(test_thread_local_flags(_TLF_NAPPING))) {
- /* Can avoid a test-and-clear because NMIs do not call this */
- clear_thread_local_flags(_TLF_NAPPING);
- regs_set_return_ip(regs, (unsigned long)power4_idle_nap_return);
- }
-#endif
-}
-
-static inline void booke_load_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
- unsigned long dbcr0 = current->thread.debug.dbcr0;
-
- if (likely(!(dbcr0 & DBCR0_IDM)))
- return;
-
- /*
- * Check to see if the dbcr0 register is set up to debug.
- * Use the internal debug mode bit to do this.
- */
- mtmsr(mfmsr() & ~MSR_DE);
- if (IS_ENABLED(CONFIG_PPC32)) {
- isync();
- global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
- }
- mtspr(SPRN_DBCR0, dbcr0);
- mtspr(SPRN_DBSR, -1);
-#endif
-}
-
-
-static inline void booke_restore_dbcr0(void)
-{
-#ifdef CONFIG_PPC_ADV_DEBUG_REGS
- unsigned long dbcr0 = current->thread.debug.dbcr0;
-
- if (IS_ENABLED(CONFIG_PPC32) && unlikely(dbcr0 & DBCR0_IDM)) {
- mtspr(SPRN_DBSR, -1);
- mtspr(SPRN_DBCR0, global_dbcr0[smp_processor_id()]);
- }
-#endif
-}
-
-static inline void check_return_regs_valid(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
- unsigned long trap, srr0, srr1;
- static bool warned;
- u8 *validp;
- char *h;
-
- if (trap_is_scv(regs))
- return;
-
- trap = TRAP(regs);
- // EE in HV mode sets HSRRs like 0xea0
- if (cpu_has_feature(CPU_FTR_HVMODE) && trap == INTERRUPT_EXTERNAL)
- trap = 0xea0;
-
- switch (trap) {
- case 0x980:
- case INTERRUPT_H_DATA_STORAGE:
- case 0xe20:
- case 0xe40:
- case INTERRUPT_HMI:
- case 0xe80:
- case 0xea0:
- case INTERRUPT_H_FAC_UNAVAIL:
- case 0x1200:
- case 0x1500:
- case 0x1600:
- case 0x1800:
- validp = &local_paca->hsrr_valid;
- if (!READ_ONCE(*validp))
- return;
-
- srr0 = mfspr(SPRN_HSRR0);
- srr1 = mfspr(SPRN_HSRR1);
- h = "H";
-
- break;
- default:
- validp = &local_paca->srr_valid;
- if (!READ_ONCE(*validp))
- return;
-
- srr0 = mfspr(SPRN_SRR0);
- srr1 = mfspr(SPRN_SRR1);
- h = "";
- break;
- }
-
- if (srr0 == regs->nip && srr1 == regs->msr)
- return;
-
- /*
- * A NMI / soft-NMI interrupt may have come in after we found
- * srr_valid and before the SRRs are loaded. The interrupt then
- * comes in and clobbers SRRs and clears srr_valid. Then we load
- * the SRRs here and test them above and find they don't match.
- *
- * Test validity again after that, to catch such false positives.
- *
- * This test in general will have some window for false negatives
- * and may not catch and fix all such cases if an NMI comes in
- * later and clobbers SRRs without clearing srr_valid, but hopefully
- * such things will get caught most of the time, statistically
- * enough to be able to get a warning out.
- */
- if (!READ_ONCE(*validp))
- return;
-
- if (!data_race(warned)) {
- data_race(warned = true);
- printk("%sSRR0 was: %lx should be: %lx\n", h, srr0, regs->nip);
- printk("%sSRR1 was: %lx should be: %lx\n", h, srr1, regs->msr);
- show_regs(regs);
- }
-
- WRITE_ONCE(*validp, 0); /* fixup */
-#endif
-}
-
-static inline void interrupt_enter_prepare(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC64
- irq_soft_mask_set(IRQS_ALL_DISABLED);
-
- /*
- * If the interrupt was taken with HARD_DIS clear, then enable MSR[EE].
- * Asynchronous interrupts get here with HARD_DIS set (see below), so
- * this enables MSR[EE] for synchronous interrupts. IRQs remain
- * soft-masked. The interrupt handler may later call
- * interrupt_cond_local_irq_enable() to achieve a regular process
- * context.
- */
- if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS)) {
- INT_SOFT_MASK_BUG_ON(regs, !(regs->msr & MSR_EE));
- __hard_irq_enable();
- } else {
- __hard_RI_enable();
- }
- /* Enable MSR[RI] early, to support kernel SLB and hash faults */
-#endif
-
- if (!regs_irqs_disabled(regs))
- trace_hardirqs_off();
-
- if (user_mode(regs)) {
- kuap_lock();
- CT_WARN_ON(ct_state() != CT_STATE_USER);
- user_exit_irqoff();
-
- account_cpu_user_entry();
- account_stolen_time();
- } else {
- kuap_save_and_lock(regs);
- /*
- * CT_WARN_ON comes here via program_check_exception,
- * so avoid recursion.
- */
- if (TRAP(regs) != INTERRUPT_PROGRAM)
- CT_WARN_ON(ct_state() != CT_STATE_KERNEL &&
- ct_state() != CT_STATE_IDLE);
- INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs));
- INT_SOFT_MASK_BUG_ON(regs, regs_irqs_disabled(regs) &&
- search_kernel_restart_table(regs->nip));
- }
- INT_SOFT_MASK_BUG_ON(regs, !regs_irqs_disabled(regs) &&
- !(regs->msr & MSR_EE));
-
- booke_restore_dbcr0();
-}
-
-/*
- * Care should be taken to note that interrupt_exit_prepare and
- * interrupt_async_exit_prepare do not necessarily return immediately to
- * regs context (e.g., if regs is usermode, we don't necessarily return to
- * user mode). Other interrupts might be taken between here and return,
- * context switch / preemption may occur in the exit path after this, or a
- * signal may be delivered, etc.
- *
- * The real interrupt exit code is platform specific, e.g.,
- * interrupt_exit_user_prepare / interrupt_exit_kernel_prepare for 64s.
- *
- * However interrupt_nmi_exit_prepare does return directly to regs, because
- * NMIs do not do "exit work" or replay soft-masked interrupts.
- */
-static inline void interrupt_exit_prepare(struct pt_regs *regs)
-{
-}
-
-static inline void interrupt_async_enter_prepare(struct pt_regs *regs)
-{
-#ifdef CONFIG_PPC64
- /* Ensure interrupt_enter_prepare does not enable MSR[EE] */
- local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
-#endif
- interrupt_enter_prepare(regs);
-#ifdef CONFIG_PPC_BOOK3S_64
- /*
- * RI=1 is set by interrupt_enter_prepare, so this thread flags access
- * has to come afterward (it can cause SLB faults).
- */
- if (cpu_has_feature(CPU_FTR_CTRL) &&
- !test_thread_local_flags(_TLF_RUNLATCH))
- __ppc64_runlatch_on();
-#endif
- irq_enter();
-}
-
-static inline void interrupt_async_exit_prepare(struct pt_regs *regs)
-{
- /*
- * Adjust at exit so the main handler sees the true NIA. This must
- * come before irq_exit() because irq_exit can enable interrupts, and
- * if another interrupt is taken before nap_adjust_return has run
- * here, then that interrupt would return directly to idle nap return.
- */
- nap_adjust_return(regs);
-
- irq_exit();
- interrupt_exit_prepare(regs);
-}
-
-struct interrupt_nmi_state {
-#ifdef CONFIG_PPC64
- u8 irq_soft_mask;
- u8 irq_happened;
- u8 ftrace_enabled;
- u64 softe;
-#endif
-};
-
-static inline bool nmi_disables_ftrace(struct pt_regs *regs)
-{
- /* Allow DEC and PMI to be traced when they are soft-NMI */
- if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
- if (TRAP(regs) == INTERRUPT_DECREMENTER)
- return false;
- if (TRAP(regs) == INTERRUPT_PERFMON)
- return false;
- }
- if (IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
- if (TRAP(regs) == INTERRUPT_PERFMON)
- return false;
- }
-
- return true;
-}
-
-static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
-{
-#ifdef CONFIG_PPC64
- state->irq_soft_mask = local_paca->irq_soft_mask;
- state->irq_happened = local_paca->irq_happened;
- state->softe = regs->softe;
-
- /*
- * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
- * the right thing, and set IRQ_HARD_DIS. We do not want to reconcile
- * because that goes through irq tracing which we don't want in NMI.
- */
- local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
- local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
-
- if (!(regs->msr & MSR_EE) || is_implicit_soft_masked(regs)) {
- /*
- * Adjust regs->softe to be soft-masked if it had not been
- * reconcied (e.g., interrupt entry with MSR[EE]=0 but softe
- * not yet set disabled), or if it was in an implicit soft
- * masked state. This makes regs_irqs_disabled(regs)
- * behave as expected.
- */
- regs->softe = IRQS_ALL_DISABLED;
- }
-
- __hard_RI_enable();
-
- /* Don't do any per-CPU operations until interrupt state is fixed */
-
- if (nmi_disables_ftrace(regs)) {
- state->ftrace_enabled = this_cpu_get_ftrace_enabled();
- this_cpu_set_ftrace_enabled(0);
- }
-#endif
-
- /* If data relocations are enabled, it's safe to use nmi_enter() */
- if (mfmsr() & MSR_DR) {
- nmi_enter();
- return;
- }
-
- /*
- * But do not use nmi_enter() for pseries hash guest taking a real-mode
- * NMI because not everything it touches is within the RMA limit.
- */
- if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
- firmware_has_feature(FW_FEATURE_LPAR) &&
- !radix_enabled())
- return;
-
- /*
- * Likewise, don't use it if we have some form of instrumentation (like
- * KASAN shadow) that is not safe to access in real mode (even on radix)
- */
- if (IS_ENABLED(CONFIG_KASAN))
- return;
-
- /*
- * Likewise, do not use it in real mode if percpu first chunk is not
- * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
- * are chances where percpu allocation can come from vmalloc area.
- */
- if (percpu_first_chunk_is_paged)
- return;
-
- /* Otherwise, it should be safe to call it */
- nmi_enter();
-}
-
-static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
-{
- if (mfmsr() & MSR_DR) {
- // nmi_exit if relocations are on
- nmi_exit();
- } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
- firmware_has_feature(FW_FEATURE_LPAR) &&
- !radix_enabled()) {
- // no nmi_exit for a pseries hash guest taking a real mode exception
- } else if (IS_ENABLED(CONFIG_KASAN)) {
- // no nmi_exit for KASAN in real mode
- } else if (percpu_first_chunk_is_paged) {
- // no nmi_exit if percpu first chunk is not embedded
- } else {
- nmi_exit();
- }
-
- /*
- * nmi does not call nap_adjust_return because nmi should not create
- * new work to do (must use irq_work for that).
- */
-
-#ifdef CONFIG_PPC64
-#ifdef CONFIG_PPC_BOOK3S
- if (regs_irqs_disabled(regs)) {
- unsigned long rst = search_kernel_restart_table(regs->nip);
- if (rst)
- regs_set_return_ip(regs, rst);
- }
-#endif
-
- if (nmi_disables_ftrace(regs))
- this_cpu_set_ftrace_enabled(state->ftrace_enabled);
-
- /* Check we didn't change the pending interrupt mask. */
- WARN_ON_ONCE((state->irq_happened | PACA_IRQ_HARD_DIS) != local_paca->irq_happened);
- regs->softe = state->softe;
- local_paca->irq_happened = state->irq_happened;
- local_paca->irq_soft_mask = state->irq_soft_mask;
-#endif
-}
/*
* Don't use noinstr here like x86, but rather add NOKPROBE_SYMBOL to each
@@ -575,10 +151,13 @@ static __always_inline void ____##func(struct pt_regs *regs); \
\
interrupt_handler void func(struct pt_regs *regs) \
{ \
- interrupt_enter_prepare(regs); \
- \
+ irqentry_state_t state; \
+ arch_interrupt_enter_prepare(regs); \
+ state = irqentry_enter(regs); \
+ instrumentation_begin(); \
____##func (regs); \
- \
+ instrumentation_end(); \
+ irqentry_exit(regs, state); \
interrupt_exit_prepare(regs); \
} \
NOKPROBE_SYMBOL(func); \
@@ -609,11 +188,14 @@ static __always_inline long ____##func(struct pt_regs *regs); \
interrupt_handler long func(struct pt_regs *regs) \
{ \
long ret; \
+ irqentry_state_t state; \
\
- interrupt_enter_prepare(regs); \
- \
+ arch_interrupt_enter_prepare(regs); \
+ state = irqentry_enter(regs); \
+ instrumentation_begin(); \
ret = ____##func (regs); \
- \
+ instrumentation_end(); \
+ irqentry_exit(regs, state); \
interrupt_exit_prepare(regs); \
\
return ret; \
@@ -643,11 +225,16 @@ static __always_inline void ____##func(struct pt_regs *regs); \
\
interrupt_handler void func(struct pt_regs *regs) \
{ \
+ irqentry_state_t state; \
interrupt_async_enter_prepare(regs); \
- \
+ state = irqentry_enter(regs); \
+ instrumentation_begin(); \
+ irq_enter_rcu(); \
____##func (regs); \
- \
+ irq_exit_rcu(); \
+ instrumentation_end(); \
interrupt_async_exit_prepare(regs); \
+ irqentry_exit(regs, state); \
} \
NOKPROBE_SYMBOL(func); \
\
@@ -677,14 +264,43 @@ ____##func(struct pt_regs *regs); \
\
interrupt_handler long func(struct pt_regs *regs) \
{ \
- struct interrupt_nmi_state state; \
+ irqentry_state_t state; \
+ struct interrupt_nmi_state nmi_state; \
long ret; \
\
- interrupt_nmi_enter_prepare(regs, &state); \
- \
+ interrupt_nmi_enter_prepare(regs, &nmi_state); \
+ if (mfmsr() & MSR_DR) { \
+ /* nmi_entry if relocations are on */ \
+ state = irqentry_nmi_enter(regs); \
+ } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && \
+ firmware_has_feature(FW_FEATURE_LPAR) && \
+ !radix_enabled()) { \
+ /* no nmi_entry for a pseries hash guest \
+ * taking a real mode exception */ \
+ } else if (IS_ENABLED(CONFIG_KASAN)) { \
+ /* no nmi_entry for KASAN in real mode */ \
+ } else if (percpu_first_chunk_is_paged) { \
+ /* no nmi_entry if percpu first chunk is not embedded */\
+ } else { \
+ state = irqentry_nmi_enter(regs); \
+ } \
ret = ____##func (regs); \
- \
- interrupt_nmi_exit_prepare(regs, &state); \
+ if (mfmsr() & MSR_DR) { \
+ /* nmi_exit if relocations are on */ \
+ irqentry_nmi_exit(regs, state); \
+ } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && \
+ firmware_has_feature(FW_FEATURE_LPAR) && \
+ !radix_enabled()) { \
+ /* no nmi_exit for a pseries hash guest \
+ * taking a real mode exception */ \
+ } else if (IS_ENABLED(CONFIG_KASAN)) { \
+ /* no nmi_exit for KASAN in real mode */ \
+ } else if (percpu_first_chunk_is_paged) { \
+ /* no nmi_exit if percpu first chunk is not embedded */ \
+ } else { \
+ irqentry_nmi_exit(regs, state); \
+ } \
+ interrupt_nmi_exit_prepare(regs, &nmi_state); \
\
return ret; \
} \
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index f53d432f60870..7bb8a31b24ea7 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -297,13 +297,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
/* Returning to a kernel context with local irqs enabled. */
WARN_ON_ONCE(!(regs->msr & MSR_EE));
again:
- if (need_irq_preemption()) {
- /* Return to preemptible kernel context */
- if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
- if (preempt_count() == 0)
- preempt_schedule_irq();
- }
- }
+ if (need_irq_preemption())
+ irqentry_exit_cond_resched();
check_return_regs_valid(regs);
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 1ad059a9e2fef..6aa88fe91fb6a 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -418,8 +418,6 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
beq interrupt_return_\srr\()_kernel
interrupt_return_\srr\()_user: /* make backtraces match the _kernel variant */
_ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\()_user)
- addi r3,r1,STACK_INT_FRAME_REGS
- bl CFUNC(interrupt_exit_user_prepare)
#ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
cmpdi r3,0
bne- .Lrestore_nvgprs_\srr
--
2.51.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
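For reference, after the conversion above a handler declared with
DEFINE_INTERRUPT_HANDLER(do_foo) effectively expands to the shape below.
This is only a sketch of the macro at this point in the series;
do_foo/____do_foo are placeholder names, and irqentry_enter/irqentry_exit
come from the generic <linux/irq-entry-common.h> API:

interrupt_handler void do_foo(struct pt_regs *regs)
{
        irqentry_state_t state;

        arch_interrupt_enter_prepare(regs);     /* ppc soft-mask/MSR fixups */
        state = irqentry_enter(regs);           /* generic: RCU, context tracking */
        instrumentation_begin();
        ____do_foo(regs);                       /* the actual handler body */
        instrumentation_end();
        irqentry_exit(regs, state);             /* generic exit work */
        interrupt_exit_prepare(regs);           /* remaining ppc-specific exit */
}
NOKPROBE_SYMBOL(do_foo);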
* [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
` (6 preceding siblings ...)
2025-09-08 21:02 ` [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
@ 2025-09-08 21:02 ` Mukesh Kumar Chaurasiya
2025-09-09 6:54 ` Shrikanth Hegde
2025-09-10 20:34 ` [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Thomas Gleixner
2025-09-24 9:04 ` Samir Alamshaha Mulani
9 siblings, 1 reply; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-08 21:02 UTC (permalink / raw)
To: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
mchauras, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
Enable the syscall entry and exit path from the generic framework.
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/entry-common.h | 2 +-
arch/powerpc/kernel/interrupt.c | 135 +++++++----------------
arch/powerpc/kernel/ptrace/ptrace.c | 141 ------------------------
arch/powerpc/kernel/signal.c | 10 +-
arch/powerpc/kernel/syscall.c | 119 +-------------------
6 files changed, 49 insertions(+), 359 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e0c51d7b5638d..e67294a72e4d4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -199,6 +199,7 @@ config PPC
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
select GENERIC_EARLY_IOREMAP
+ select GENERIC_ENTRY
select GENERIC_GETTIMEOFDAY
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
index d3f4a12aeafca..8fb74e6aa9560 100644
--- a/arch/powerpc/include/asm/entry-common.h
+++ b/arch/powerpc/include/asm/entry-common.h
@@ -3,7 +3,7 @@
#ifndef _ASM_PPC_ENTRY_COMMON_H
#define _ASM_PPC_ENTRY_COMMON_H
-#ifdef CONFIG_GENERIC_IRQ_ENTRY
+#ifdef CONFIG_GENERIC_ENTRY
#include <asm/cputime.h>
#include <asm/interrupt.h>
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 7bb8a31b24ea7..642e22527f9dd 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/context_tracking.h>
+#include <linux/entry-common.h>
#include <linux/err.h>
#include <linux/compat.h>
#include <linux/rseq.h>
@@ -77,79 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
return true;
}
-static notrace unsigned long
-interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
-{
- unsigned long ti_flags;
-
-again:
- ti_flags = read_thread_flags();
- while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
- local_irq_enable();
- if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
- schedule();
- } else {
- /*
- * SIGPENDING must restore signal handler function
- * argument GPRs, and some non-volatiles (e.g., r1).
- * Restore all for now. This could be made lighter.
- */
- if (ti_flags & _TIF_SIGPENDING)
- ret |= _TIF_RESTOREALL;
- do_notify_resume(regs, ti_flags);
- }
- local_irq_disable();
- ti_flags = read_thread_flags();
- }
-
- if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && IS_ENABLED(CONFIG_PPC_FPU)) {
- if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
- unlikely((ti_flags & _TIF_RESTORE_TM))) {
- restore_tm_state(regs);
- } else {
- unsigned long mathflags = MSR_FP;
-
- if (cpu_has_feature(CPU_FTR_VSX))
- mathflags |= MSR_VEC | MSR_VSX;
- else if (cpu_has_feature(CPU_FTR_ALTIVEC))
- mathflags |= MSR_VEC;
-
- /*
- * If userspace MSR has all available FP bits set,
- * then they are live and no need to restore. If not,
- * it means the regs were given up and restore_math
- * may decide to restore them (to avoid taking an FP
- * fault).
- */
- if ((regs->msr & mathflags) != mathflags)
- restore_math(regs);
- }
- }
-
- check_return_regs_valid(regs);
-
- user_enter_irqoff();
- if (!prep_irq_for_enabled_exit(true)) {
- user_exit_irqoff();
- local_irq_enable();
- local_irq_disable();
- goto again;
- }
-
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
- local_paca->tm_scratch = regs->msr;
-#endif
-
- booke_load_dbcr0();
-
- account_cpu_user_exit();
-
- /* Restore user access locks last */
- kuap_user_restore(regs);
-
- return ret;
-}
-
/*
* This should be called after a syscall returns, with r3 the return value
* from the syscall. If this function returns non-zero, the system call
@@ -164,17 +92,12 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
long scv)
{
unsigned long ti_flags;
- unsigned long ret = 0;
bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv;
- CT_WARN_ON(ct_state() == CT_STATE_USER);
-
kuap_assert_locked();
regs->result = r3;
-
- /* Check whether the syscall is issued inside a restartable sequence */
- rseq_syscall(regs);
+ regs->exit_flags = 0;
ti_flags = read_thread_flags();
@@ -187,7 +110,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) {
if (ti_flags & _TIF_RESTOREALL)
- ret = _TIF_RESTOREALL;
+ regs->exit_flags = _TIF_RESTOREALL;
else
regs->gpr[3] = r3;
clear_bits(_TIF_PERSYSCALL_MASK, &current_thread_info()->flags);
@@ -196,18 +119,28 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
}
if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) {
- do_syscall_trace_leave(regs);
- ret |= _TIF_RESTOREALL;
+ regs->exit_flags |= _TIF_RESTOREALL;
}
- local_irq_disable();
- ret = interrupt_exit_user_prepare_main(ret, regs);
+again:
+ syscall_exit_to_user_mode(regs);
+
+ user_enter_irqoff();
+ if (!prep_irq_for_enabled_exit(true)) {
+ user_exit_irqoff();
+ local_irq_enable();
+ local_irq_disable();
+ goto again;
+ }
+
+ /* Restore user access locks last */
+ kuap_user_restore(regs);
#ifdef CONFIG_PPC64
- regs->exit_result = ret;
+ regs->exit_result = regs->exit_flags;
#endif
- return ret;
+ return regs->exit_flags;
}
#ifdef CONFIG_PPC64
@@ -226,14 +159,18 @@ notrace unsigned long syscall_exit_restart(unsigned long r3, struct pt_regs *reg
#ifdef CONFIG_PPC_BOOK3S_64
set_kuap(AMR_KUAP_BLOCKED);
#endif
+again:
+ syscall_exit_to_user_mode(regs);
- trace_hardirqs_off();
- user_exit_irqoff();
- account_cpu_user_entry();
-
- BUG_ON(!user_mode(regs));
+ user_enter_irqoff();
+ if (!prep_irq_for_enabled_exit(true)) {
+ user_exit_irqoff();
+ local_irq_enable();
+ local_irq_disable();
+ goto again;
+ }
- regs->exit_result = interrupt_exit_user_prepare_main(regs->exit_result, regs);
+ regs->exit_result |= regs->exit_flags;
return regs->exit_result;
}
@@ -254,8 +191,20 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
kuap_assert_locked();
local_irq_disable();
+ regs->exit_flags = 0;
+again:
+ irqentry_exit_to_user_mode(regs);
+ check_return_regs_valid(regs);
+
+ user_enter_irqoff();
+ if (!prep_irq_for_enabled_exit(true)) {
+ user_exit_irqoff();
+ local_irq_enable();
+ local_irq_disable();
+ goto again;
+ }
- ret = interrupt_exit_user_prepare_main(0, regs);
+ ret = regs->exit_flags;
#ifdef CONFIG_PPC64
regs->exit_result = ret;
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 2134b6d155ff6..316d4f5ead8ed 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -21,9 +21,6 @@
#include <asm/switch_to.h>
#include <asm/debug.h>
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
#include "ptrace-decl.h"
/*
@@ -195,144 +192,6 @@ long arch_ptrace(struct task_struct *child, long request,
return ret;
}
-#ifdef CONFIG_SECCOMP
-static int do_seccomp(struct pt_regs *regs)
-{
- if (!test_thread_flag(TIF_SECCOMP))
- return 0;
-
- /*
- * The ABI we present to seccomp tracers is that r3 contains
- * the syscall return value and orig_gpr3 contains the first
- * syscall parameter. This is different to the ptrace ABI where
- * both r3 and orig_gpr3 contain the first syscall parameter.
- */
- regs->gpr[3] = -ENOSYS;
-
- /*
- * We use the __ version here because we have already checked
- * TIF_SECCOMP. If this fails, there is nothing left to do, we
- * have already loaded -ENOSYS into r3, or seccomp has put
- * something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
- */
- if (__secure_computing())
- return -1;
-
- /*
- * The syscall was allowed by seccomp, restore the register
- * state to what audit expects.
- * Note that we use orig_gpr3, which means a seccomp tracer can
- * modify the first syscall parameter (in orig_gpr3) and also
- * allow the syscall to proceed.
- */
- regs->gpr[3] = regs->orig_gpr3;
-
- return 0;
-}
-#else
-static inline int do_seccomp(struct pt_regs *regs) { return 0; }
-#endif /* CONFIG_SECCOMP */
-
-/**
- * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
- * @regs: the pt_regs of the task to trace (current)
- *
- * Performs various types of tracing on syscall entry. This includes seccomp,
- * ptrace, syscall tracepoints and audit.
- *
- * The pt_regs are potentially visible to userspace via ptrace, so their
- * contents is ABI.
- *
- * One or more of the tracers may modify the contents of pt_regs, in particular
- * to modify arguments or even the syscall number itself.
- *
- * It's also possible that a tracer can choose to reject the system call. In
- * that case this function will return an illegal syscall number, and will put
- * an appropriate return value in regs->r3.
- *
- * Return: the (possibly changed) syscall number.
- */
-long do_syscall_trace_enter(struct pt_regs *regs)
-{
- u32 flags;
-
- flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
-
- if (flags) {
- int rc = ptrace_report_syscall_entry(regs);
-
- if (unlikely(flags & _TIF_SYSCALL_EMU)) {
- /*
- * A nonzero return code from
- * ptrace_report_syscall_entry() tells us to prevent
- * the syscall execution, but we are not going to
- * execute it anyway.
- *
- * Returning -1 will skip the syscall execution. We want
- * to avoid clobbering any registers, so we don't goto
- * the skip label below.
- */
- return -1;
- }
-
- if (rc) {
- /*
- * The tracer decided to abort the syscall. Note that
- * the tracer may also just change regs->gpr[0] to an
- * invalid syscall number, that is handled below on the
- * exit path.
- */
- goto skip;
- }
- }
-
- /* Run seccomp after ptrace; allow it to set gpr[3]. */
- if (do_seccomp(regs))
- return -1;
-
- /* Avoid trace and audit when syscall is invalid. */
- if (regs->gpr[0] >= NR_syscalls)
- goto skip;
-
- if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
- trace_sys_enter(regs, regs->gpr[0]);
-
- if (!is_32bit_task())
- audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
- regs->gpr[5], regs->gpr[6]);
- else
- audit_syscall_entry(regs->gpr[0],
- regs->gpr[3] & 0xffffffff,
- regs->gpr[4] & 0xffffffff,
- regs->gpr[5] & 0xffffffff,
- regs->gpr[6] & 0xffffffff);
-
- /* Return the possibly modified but valid syscall number */
- return regs->gpr[0];
-
-skip:
- /*
- * If we are aborting explicitly, or if the syscall number is
- * now invalid, set the return value to -ENOSYS.
- */
- regs->gpr[3] = -ENOSYS;
- return -1;
-}
-
-void do_syscall_trace_leave(struct pt_regs *regs)
-{
- int step;
-
- audit_syscall_exit(regs);
-
- if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
- trace_sys_exit(regs, regs->result);
-
- step = test_thread_flag(TIF_SINGLESTEP);
- if (step || test_thread_flag(TIF_SYSCALL_TRACE))
- ptrace_report_syscall_exit(regs, step);
-}
-
void __init pt_regs_check(void);
/*
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 719930cf4ae1f..9f1847b4742e6 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -6,6 +6,7 @@
* Extracted from signal_32.c and signal_64.c
*/
+#include <linux/entry-common.h>
#include <linux/resume_user_mode.h>
#include <linux/signal.h>
#include <linux/uprobes.h>
@@ -22,11 +23,6 @@
#include "signal.h"
-/* This will be removed */
-#ifdef CONFIG_GENERIC_ENTRY
-#include <linux/entry-common.h>
-#endif /* CONFIG_GENERIC_ENTRY */
-
#ifdef CONFIG_VSX
unsigned long copy_fpr_to_user(void __user *to,
struct task_struct *task)
@@ -374,11 +370,9 @@ void signal_fault(struct task_struct *tsk, struct pt_regs *regs,
task_pid_nr(tsk), where, ptr, regs->nip, regs->link);
}
-#ifdef CONFIG_GENERIC_ENTRY
void arch_do_signal_or_restart(struct pt_regs *regs)
{
BUG_ON(regs != current->thread.regs);
- local_paca->generic_fw_flags |= GFW_RESTORE_ALL;
+ regs->exit_flags |= _TIF_RESTOREALL;
do_signal(current);
}
-#endif /* CONFIG_GENERIC_ENTRY */
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index 9f03a6263fb41..df1c9a8d62bc6 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -3,6 +3,7 @@
#include <linux/compat.h>
#include <linux/context_tracking.h>
#include <linux/randomize_kstack.h>
+#include <linux/entry-common.h>
#include <asm/interrupt.h>
#include <asm/kup.h>
@@ -18,124 +19,10 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
long ret;
syscall_fn f;
- kuap_lock();
-
add_random_kstack_offset();
+ r0 = syscall_enter_from_user_mode(regs, r0);
- if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
- BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
-
- trace_hardirqs_off(); /* finish reconciling */
-
- CT_WARN_ON(ct_state() == CT_STATE_KERNEL);
- user_exit_irqoff();
-
- BUG_ON(regs_is_unrecoverable(regs));
- BUG_ON(!user_mode(regs));
- BUG_ON(regs_irqs_disabled(regs));
-
-#ifdef CONFIG_PPC_PKEY
- if (mmu_has_feature(MMU_FTR_PKEY)) {
- unsigned long amr, iamr;
- bool flush_needed = false;
- /*
- * When entering from userspace we mostly have the AMR/IAMR
- * different from kernel default values. Hence don't compare.
- */
- amr = mfspr(SPRN_AMR);
- iamr = mfspr(SPRN_IAMR);
- regs->amr = amr;
- regs->iamr = iamr;
- if (mmu_has_feature(MMU_FTR_KUAP)) {
- mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
- flush_needed = true;
- }
- if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
- mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
- flush_needed = true;
- }
- if (flush_needed)
- isync();
- } else
-#endif
- kuap_assert_locked();
-
- booke_restore_dbcr0();
-
- account_cpu_user_entry();
-
- account_stolen_time();
-
- /*
- * This is not required for the syscall exit path, but makes the
- * stack frame look nicer. If this was initialised in the first stack
- * frame, or if the unwinder was taught the first stack frame always
- * returns to user with IRQS_ENABLED, this store could be avoided!
- */
- irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
-
- /*
- * If system call is called with TM active, set _TIF_RESTOREALL to
- * prevent RFSCV being used to return to userspace, because POWER9
- * TM implementation has problems with this instruction returning to
- * transactional state. Final register values are not relevant because
- * the transaction will be aborted upon return anyway. Or in the case
- * of unsupported_scv SIGILL fault, the return state does not much
- * matter because it's an edge case.
- */
- if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
- unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
- set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
-
- /*
- * If the system call was made with a transaction active, doom it and
- * return without performing the system call. Unless it was an
- * unsupported scv vector, in which case it's treated like an illegal
- * instruction.
- */
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
- if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
- !trap_is_unsupported_scv(regs)) {
- /* Enable TM in the kernel, and disable EE (for scv) */
- hard_irq_disable();
- mtmsr(mfmsr() | MSR_TM);
-
- /* tabort, this dooms the transaction, nothing else */
- asm volatile(".long 0x7c00071d | ((%0) << 16)"
- :: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
-
- /*
- * Userspace will never see the return value. Execution will
- * resume after the tbegin. of the aborted transaction with the
- * checkpointed register state. A context switch could occur
- * or signal delivered to the process before resuming the
- * doomed transaction context, but that should all be handled
- * as expected.
- */
- return -ENOSYS;
- }
-#endif // CONFIG_PPC_TRANSACTIONAL_MEM
-
- local_irq_enable();
-
- if (unlikely(read_thread_flags() & _TIF_SYSCALL_DOTRACE)) {
- if (unlikely(trap_is_unsupported_scv(regs))) {
- /* Unsupported scv vector */
- _exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
- return regs->gpr[3];
- }
- /*
- * We use the return value of do_syscall_trace_enter() as the
- * syscall number. If the syscall was rejected for any reason
- * do_syscall_trace_enter() returns an invalid syscall number
- * and the test against NR_syscalls will fail and the return
- * value to be used is in regs->gpr[3].
- */
- r0 = do_syscall_trace_enter(regs);
- if (unlikely(r0 >= NR_syscalls))
- return regs->gpr[3];
-
- } else if (unlikely(r0 >= NR_syscalls)) {
+ if (unlikely(r0 >= NR_syscalls)) {
if (unlikely(trap_is_unsupported_scv(regs))) {
/* Unsupported scv vector */
_exception(SIGILL, regs, ILL_ILLOPC, regs->nip);
--
2.51.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
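For reference, the ptrace/seccomp/audit logic deleted above is not lost: it
moves into the generic syscall work loop. A rough sketch of the resulting
flow, assuming the generic <linux/entry-common.h> API (internals simplified;
the exact behaviour lives in the generic entry code):

/* Entry side, now in system_call_exception(): */
r0 = syscall_enter_from_user_mode(regs, r0);
/* runs the SYSCALL_WORK_* handlers: seccomp, ptrace entry reporting,
 * syscall tracepoints, audit; may modify or reject the syscall number */

/* Exit side, now in syscall_exit_prepare(): */
syscall_exit_to_user_mode(regs);
/* audit exit, exit tracepoints, ptrace exit/single-step reporting,
 * then exit-to-user work: signal delivery, reschedule, rseq fixup */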
* Re: [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
2025-09-08 21:02 ` [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
@ 2025-09-09 6:54 ` Shrikanth Hegde
2025-09-09 8:46 ` Mukesh Kumar Chaurasiya
2025-09-18 6:57 ` Mukesh Kumar Chaurasiya
0 siblings, 2 replies; 21+ messages in thread
From: Shrikanth Hegde @ 2025-09-09 6:54 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
> Enable the syscall entry and exit path from the generic framework.
>
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
Hi Mukesh.
Thanks for working on this and getting it into better shape.
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/entry-common.h | 2 +-
> arch/powerpc/kernel/interrupt.c | 135 +++++++----------------
> arch/powerpc/kernel/ptrace/ptrace.c | 141 ------------------------
> arch/powerpc/kernel/signal.c | 10 +-
> arch/powerpc/kernel/syscall.c | 119 +-------------------
> 6 files changed, 49 insertions(+), 359 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e0c51d7b5638d..e67294a72e4d4 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -199,6 +199,7 @@ config PPC
> select GENERIC_CPU_AUTOPROBE
> select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
> select GENERIC_EARLY_IOREMAP
> + select GENERIC_ENTRY
> select GENERIC_GETTIMEOFDAY
> select GENERIC_IDLE_POLL_SETUP
> select GENERIC_IOREMAP
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index d3f4a12aeafca..8fb74e6aa9560 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -3,7 +3,7
There could be some configs we need to take care of while enabling generic entry. Since powerpc
didn't have it earlier, there could be areas which need cleanup. One example is dynamic preemption.
There could be more. Do some git history checks and see.
Issue with dynamic preemption:
ld: kernel/entry/common.o:/home/shrikanth/sched_tip/kernel/entry/common.c:161: multiple definition of `sk_dynamic_irqentry_exit_cond_resched';
arch/powerpc/kernel/interrupt.o:/home/shrikanth/sched_tip/arch/powerpc/kernel/interrupt.c:29: first defined here
The below diff helps to fix it, and changing preemption modes works. Also verified preempt lazy works too.
---
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 642e22527f9d..e1e0f0da4165 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -25,10 +25,6 @@
unsigned long global_dbcr0[NR_CPUS];
#endif
-#if defined(CONFIG_PREEMPT_DYNAMIC)
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#endif
-
#ifdef CONFIG_PPC_BOOK3S_64
DEFINE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
static inline bool exit_must_hard_disable(void)
----
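The clash happens because the generic entry code already provides this key
once GENERIC_ENTRY is selected. Roughly, as in kernel/entry/common.c at the
time of writing (guards simplified, may differ across kernel versions):

#ifdef CONFIG_PREEMPT_DYNAMIC
#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
void dynamic_irqentry_exit_cond_resched(void)
{
        /* honour the dynamically selected preemption mode */
        if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
                return;
        raw_irqentry_exit_cond_resched();
}
#endif
#endif

so the duplicate arch definition in arch/powerpc/kernel/interrupt.c must go.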
Though the ideal thing is to move them to sched/core instead of keeping them in generic code. Like below.
https://lore.kernel.org/all/20250716094745.2232041-1-sshegde@linux.ibm.com/
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
2025-09-09 6:54 ` Shrikanth Hegde
@ 2025-09-09 8:46 ` Mukesh Kumar Chaurasiya
2025-09-18 6:57 ` Mukesh Kumar Chaurasiya
1 sibling, 0 replies; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-09 8:46 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 12:24, Shrikanth Hegde wrote:
>
>
> On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
>> Enable the syscall entry and exit path from the generic framework.
>>
>> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>> ---
>
> Hi Mukesh.
> Thanks for working on this and getting it to better shape.
>
>> arch/powerpc/Kconfig | 1 +
>> arch/powerpc/include/asm/entry-common.h | 2 +-
>> arch/powerpc/kernel/interrupt.c | 135 +++++++----------------
>> arch/powerpc/kernel/ptrace/ptrace.c | 141 ------------------------
>> arch/powerpc/kernel/signal.c | 10 +-
>> arch/powerpc/kernel/syscall.c | 119 +-------------------
>> 6 files changed, 49 insertions(+), 359 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index e0c51d7b5638d..e67294a72e4d4 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -199,6 +199,7 @@ config PPC
>> select GENERIC_CPU_AUTOPROBE
>> select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
>> select GENERIC_EARLY_IOREMAP
>> + select GENERIC_ENTRY
>> select GENERIC_GETTIMEOFDAY
>> select GENERIC_IDLE_POLL_SETUP
>> select GENERIC_IOREMAP
>> diff --git a/arch/powerpc/include/asm/entry-common.h
>> b/arch/powerpc/include/asm/entry-common.h
>> index d3f4a12aeafca..8fb74e6aa9560 100644
>> --- a/arch/powerpc/include/asm/entry-common.h
>> +++ b/arch/powerpc/include/asm/entry-common.h
>> @@ -3,7 +3,7
>
> There could be some of the configs we need to take care while enabling
> generic entry. Since powerpc
> didn't have it earlier, there could areas which needs cleanup. One for
> example dynamic preemption.
> There could be more. Do some git history checks and see.
>
> Issue with dynamic preemption:
>
> ld:
> kernel/entry/common.o:/home/shrikanth/sched_tip/kernel/entry/common.c:161:
> multiple definition of `sk_dynamic_irqentry_exit_cond_resched';
> arch/powerpc/kernel/interrupt.o:/home/shrikanth/sched_tip/arch/powerpc/kernel/interrupt.c:29:
> first defined here
>
> Below diff helps to fix and changing preemption modes help. Also
> verified preempt lazy works too.
>
> ---
> diff --git a/arch/powerpc/kernel/interrupt.c
> b/arch/powerpc/kernel/interrupt.c
> index 642e22527f9d..e1e0f0da4165 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -25,10 +25,6 @@
> unsigned long global_dbcr0[NR_CPUS];
> #endif
>
> -#if defined(CONFIG_PREEMPT_DYNAMIC)
> -DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> -#endif
> -
> #ifdef CONFIG_PPC_BOOK3S_64
> DEFINE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
> static inline bool exit_must_hard_disable(void)
>
>
Hey Shrikanth,
Thanks for this. I will add this in the next revision.
Mukesh
> ----
> Though ideal thing is move them to sched/core instead of being in
> generic code. Like below.
> https://lore.kernel.org/all/20250716094745.2232041-1-sshegde@linux.ibm.com/
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 0/8] Generic IRQ entry/exit support for powerpc
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
` (7 preceding siblings ...)
2025-09-08 21:02 ` [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
@ 2025-09-10 20:34 ` Thomas Gleixner
2025-09-24 9:04 ` Samir Alamshaha Mulani
9 siblings, 0 replies; 21+ messages in thread
From: Thomas Gleixner @ 2025-09-10 20:34 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
oleg, kees, luto, wad, mchauras, deller, ldv, macro, charlie,
akpm, bigeasy, ankur.a.arora, sshegde, naveen, thomas.weissschuh,
Jason, peterz, namcao, kan.liang, mingo, oliver.upton,
mark.barnett, atrajeev, rppt, coltonlewis, linuxppc-dev,
linux-kernel
On Tue, Sep 09 2025 at 02:32, Mukesh Kumar Chaurasiya wrote:
> Adding support for the generic irq entry/exit handling for PowerPC. The
> goal is to bring PowerPC in line with other architectures that already
> use the common irq entry infrastructure, reducing duplicated code and
> making it easier to share future changes in entry/exit paths.
>
> This is slightly tested on ppc64le.
>
> The performance benchmarks from perf bench basic syscall are below:
>
> | Metric | W/O Generic Framework | With Generic Framework | Improvement |
> | ---------- | --------------------- | ---------------------- | ----------- |
> | Total time | 0.885 [sec] | 0.880 [sec] | ~0.56% |
> | usecs/op | 0.088518 | 0.088005 | ~0.58% |
> | ops/sec | 1,12,97,086 | 1,13,62,977 | ~0.58% |
>
> Thats close to 0.6% improvement with this.
Cool!
> 18 files changed, 698 insertions(+), 810 deletions(-)
Thanks for moving ppc over to this. Makes everyones life easier!
tglx
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework
2025-09-08 21:02 ` [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
@ 2025-09-13 12:49 ` Shrikanth Hegde
2025-09-16 4:16 ` Mukesh Kumar Chaurasiya
0 siblings, 1 reply; 21+ messages in thread
From: Shrikanth Hegde @ 2025-09-13 12:49 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
> Enabling the build with the generic entry/exit framework for the powerpc
> architecture requires a few necessary steps.
>
> Introducing minor infrastructure updates to prepare for future generic
> framework handling:
>
> - Add syscall_work field to struct thread_info for SYSCALL_WORK_* flags.
> - Provide arch_syscall_is_vdso_sigreturn() stub, returning false.
> - Add on_thread_stack() helper to test whether the current stack pointer
> lies within the task’s kernel stack.
>
> No functional change is intended with this patch.
>
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
> arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
> arch/powerpc/include/asm/stacktrace.h | 8 ++++++++
> arch/powerpc/include/asm/syscall.h | 5 +++++
> arch/powerpc/include/asm/thread_info.h | 1 +
> 4 files changed, 25 insertions(+)
> create mode 100644 arch/powerpc/include/asm/entry-common.h
>
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> new file mode 100644
> index 0000000000000..3af16d821d07e
> --- /dev/null
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ASM_PPC_ENTRY_COMMON_H
> +#define _ASM_PPC_ENTRY_COMMON_H
> +
> +#ifdef CONFIG_GENERIC_IRQ_ENTRY
> +
> +#include <asm/stacktrace.h>
> +
> +#endif /* CONFIG_GENERIC_IRQ_ENTRY */
> +#endif /* _ASM_PPC_ENTRY_COMMON_H */
> diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
> index 6149b53b3bc8e..3f0a242468813 100644
> --- a/arch/powerpc/include/asm/stacktrace.h
> +++ b/arch/powerpc/include/asm/stacktrace.h
> @@ -8,6 +8,14 @@
> #ifndef _ASM_POWERPC_STACKTRACE_H
> #define _ASM_POWERPC_STACKTRACE_H
>
> +#include <linux/sched.h>
nit:
Is sched.h needed? I don't see any reference here.
It compiled for me without it.
> +
> void show_user_instructions(struct pt_regs *regs);
>
> +static inline bool on_thread_stack(void)
> +{
> + return !(((unsigned long)(current->stack) ^ current_stack_pointer)
> + & ~(THREAD_SIZE -1));
> +}
> +
> #endif /* _ASM_POWERPC_STACKTRACE_H */
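The helper relies on the kernel stack being THREAD_SIZE sized and aligned:
XORing the two addresses and masking off the low bits leaves zero only when
both fall inside the same stack region. A quick illustrative expansion (the
addresses are made-up examples, assuming a 16K THREAD_SIZE):

/* sketch only; the same test as on_thread_stack() above */
unsigned long base = (unsigned long)current->stack; /* e.g. 0xc000000012340000 */
unsigned long sp   = current_stack_pointer;         /* e.g. 0xc000000012343f80 */
/* (base ^ sp) == 0x3f80; ~(THREAD_SIZE - 1) == ~0x3fff; AND is 0 -> on stack */
bool on_stack = !((base ^ sp) & ~(THREAD_SIZE - 1));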
> diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
> index 4b3c52ed6e9d2..834fcc4f7b543 100644
> --- a/arch/powerpc/include/asm/syscall.h
> +++ b/arch/powerpc/include/asm/syscall.h
> @@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct task_struct *task)
> else
> return AUDIT_ARCH_PPC64;
> }
> +
> +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
> +{
> + return false;
> +}
> #endif /* _ASM_SYSCALL_H */
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index 2785c7462ebf7..d0e87c9bae0b0 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -54,6 +54,7 @@
> struct thread_info {
> int preempt_count; /* 0 => preemptable,
> <0 => BUG */
> + unsigned long syscall_work; /* SYSCALL_WORK_ flags */
Can this go after cpu? It would be 8-byte aligned then (see the sketch below).
Since it is in the fast path, it might help.
> #ifdef CONFIG_SMP
> unsigned int cpu;
> #endif
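Something like the layout below would do it (sketch against this patch only;
with CONFIG_SMP the int + unsigned int pair fills the first 8 bytes, and
without it the compiler pads syscall_work to the same offset anyway):

struct thread_info {
        int preempt_count;              /* 0 => preemptable, <0 => BUG */
#ifdef CONFIG_SMP
        unsigned int cpu;
#endif
        unsigned long syscall_work;     /* SYSCALL_WORK_ flags, 8-byte aligned */
        /* ... rest of thread_info unchanged ... */
};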
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs
2025-09-08 21:02 ` [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
@ 2025-09-13 12:50 ` Shrikanth Hegde
0 siblings, 0 replies; 21+ messages in thread
From: Shrikanth Hegde @ 2025-09-13 12:50 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
> Renaming arch_irq_disabled_regs to regs_irqs_disabled so it can be used
> commonly in the generic entry/exit framework and ppc arch code.
>
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
> arch/powerpc/include/asm/hw_irq.h | 4 ++--
> arch/powerpc/include/asm/interrupt.h | 12 ++++++------
> arch/powerpc/kernel/interrupt.c | 4 ++--
> arch/powerpc/kernel/syscall.c | 2 +-
> arch/powerpc/kernel/traps.c | 2 +-
> arch/powerpc/kernel/watchdog.c | 2 +-
> arch/powerpc/perf/core-book3s.c | 2 +-
> 7 files changed, 14 insertions(+), 14 deletions(-)
>
...
> return 0;
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 8b0081441f85d..f7518b7e30554 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2482,7 +2482,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
> * will trigger a PMI after waking up from idle. Since counter values are _not_
> * saved/restored in idle path, can lead to below "Can't find PMC" message.
> */
> - if (unlikely(!found) && !arch_irq_disabled_regs(regs))
> + if (unlikely(!found) && !regs_irqs_disabled(regs))
> printk_ratelimited(KERN_WARNING "Can't find PMC that caused IRQ\n");
>
> /*
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode
2025-09-08 21:02 ` [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
@ 2025-09-14 9:02 ` Shrikanth Hegde
2025-09-16 4:19 ` Mukesh Kumar Chaurasiya
0 siblings, 1 reply; 21+ messages in thread
From: Shrikanth Hegde @ 2025-09-14 9:02 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
> - Implement the hook arch_enter_from_user_mode for syscall entry.
nit: for generic syscall infra.
> - Move booke_load_dbcr0 from interrupt.c to interrupt.h
>
> No functional change intended.
>
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
> arch/powerpc/include/asm/entry-common.h | 96 +++++++++++++++++++++++++
> arch/powerpc/include/asm/interrupt.h | 23 ++++++
> arch/powerpc/kernel/interrupt.c | 22 ------
> 3 files changed, 119 insertions(+), 22 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index 3af16d821d07e..49607292bf5a5 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -5,7 +5,103 @@
>
> #ifdef CONFIG_GENERIC_IRQ_ENTRY
>
> +#include <asm/cputime.h>
> +#include <asm/interrupt.h>
> #include <asm/stacktrace.h>
> +#include <asm/tm.h>
> +
> +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
> +{
> + if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
> + BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
> +
> + BUG_ON(regs_is_unrecoverable(regs));
> + BUG_ON(!user_mode(regs));
> + BUG_ON(regs_irqs_disabled(regs));
> +
> +#ifdef CONFIG_PPC_PKEY
> + if (mmu_has_feature(MMU_FTR_PKEY)) {
> + unsigned long amr, iamr;
> + bool flush_needed = false;
> + /*
> + * When entering from userspace we mostly have the AMR/IAMR
> + * different from kernel default values. Hence don't compare.
> + */
> + amr = mfspr(SPRN_AMR);
> + iamr = mfspr(SPRN_IAMR);
> + regs->amr = amr;
> + regs->iamr = iamr;
> + if (mmu_has_feature(MMU_FTR_KUAP)) {
> + mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
> + flush_needed = true;
> + }
> + if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
> + mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
> + flush_needed = true;
> + }
> + if (flush_needed)
> + isync();
> + } else
> +#endif
> + kuap_assert_locked();
> +
> + booke_restore_dbcr0();
> +
> + account_cpu_user_entry();
> +
> + account_stolen_time();
> +
> + /*
> + * This is not required for the syscall exit path, but makes the
> + * stack frame look nicer. If this was initialised in the first stack
> + * frame, or if the unwinder was taught the first stack frame always
> + * returns to user with IRQS_ENABLED, this store could be avoided!
> + */
> + irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
> +
> + /*
> + * If system call is called with TM active, set _TIF_RESTOREALL to
> + * prevent RFSCV being used to return to userspace, because POWER9
> + * TM implementation has problems with this instruction returning to
> + * transactional state. Final register values are not relevant because
> + * the transaction will be aborted upon return anyway. Or in the case
> + * of unsupported_scv SIGILL fault, the return state does not much
> + * matter because it's an edge case.
> + */
> + if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
> + unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
> + set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
> +
> + /*
> + * If the system call was made with a transaction active, doom it and
> + * return without performing the system call. Unless it was an
> + * unsupported scv vector, in which case it's treated like an illegal
> + * instruction.
> + */
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> + if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
> + !trap_is_unsupported_scv(regs)) {
> + /* Enable TM in the kernel, and disable EE (for scv) */
> + hard_irq_disable();
> + mtmsr(mfmsr() | MSR_TM);
> +
> + /* tabort, this dooms the transaction, nothing else */
> + asm volatile(".long 0x7c00071d | ((%0) << 16)"
> + :: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
> +
> + /*
> + * Userspace will never see the return value. Execution will
> + * resume after the tbegin. of the aborted transaction with the
> + * checkpointed register state. A context switch could occur
> + * or signal delivered to the process before resuming the
> + * doomed transaction context, but that should all be handled
> + * as expected.
> + */
> + return;
> + }
> +#endif // CONFIG_PPC_TRANSACTIONAL_MEM
nit: Better to follow standard comment practices.
/* CONFIG_PPC_TRANSACTIONAL_MEM */
> +}
> +#define arch_enter_from_user_mode arch_enter_from_user_mode
>
> #endif /* CONFIG_GENERIC_IRQ_ENTRY */
> #endif /* _ASM_PPC_ENTRY_COMMON_H */
> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
> index 56bc8113b8cde..6edf064a0fea2 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -138,6 +138,29 @@ static inline void nap_adjust_return(struct pt_regs *regs)
> #endif
> }
>
> +static inline void booke_load_dbcr0(void)
> +{
> +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> + unsigned long dbcr0 = current->thread.debug.dbcr0;
> +
> + if (likely(!(dbcr0 & DBCR0_IDM)))
> + return;
> +
> + /*
> + * Check to see if the dbcr0 register is set up to debug.
> + * Use the internal debug mode bit to do this.
> + */
> + mtmsr(mfmsr() & ~MSR_DE);
> + if (IS_ENABLED(CONFIG_PPC32)) {
> + isync();
> + global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> + }
> + mtspr(SPRN_DBCR0, dbcr0);
> + mtspr(SPRN_DBSR, -1);
> +#endif
> +}
> +
Please run checkpatch.pl --strict on the series and fix the simple
ones, such as the use of tabs, spaces and alignment, extra lines, etc.
> +
> static inline void booke_restore_dbcr0(void)
> {
> #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 0d8fd47049a19..2a09ac5dabd62 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
> return true;
> }
>
> -static notrace void booke_load_dbcr0(void)
> -{
> -#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> - unsigned long dbcr0 = current->thread.debug.dbcr0;
> -
> - if (likely(!(dbcr0 & DBCR0_IDM)))
> - return;
> -
> - /*
> - * Check to see if the dbcr0 register is set up to debug.
> - * Use the internal debug mode bit to do this.
> - */
> - mtmsr(mfmsr() & ~MSR_DE);
> - if (IS_ENABLED(CONFIG_PPC32)) {
> - isync();
> - global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
> - }
> - mtspr(SPRN_DBCR0, dbcr0);
> - mtspr(SPRN_DBSR, -1);
> -#endif
> -}
> -
> static notrace void check_return_regs_valid(struct pt_regs *regs)
> {
> #ifdef CONFIG_PPC_BOOK3S_64
^ permalink raw reply [flat|nested] 21+ messages in thread
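For context, the generic framework invokes this hook at the very top of its
user-mode entry path. A simplified sketch of the caller, per
include/linux/entry-common.h (details vary by kernel version):

/* simplified; the real __enter_from_user_mode() is noinstr */
static __always_inline void enter_from_user_mode(struct pt_regs *regs)
{
        arch_enter_from_user_mode(regs);        /* the hook added by this patch */
        lockdep_hardirqs_off(CALLER_ADDR0);
        CT_WARN_ON(__ct_state() != CT_STATE_USER);
        user_exit_irqoff();                     /* context tracking: user -> kernel */

        instrumentation_begin();
        trace_hardirqs_off_finish();
        instrumentation_end();
}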
* Re: [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework
2025-09-13 12:49 ` Shrikanth Hegde
@ 2025-09-16 4:16 ` Mukesh Kumar Chaurasiya
0 siblings, 0 replies; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-16 4:16 UTC (permalink / raw)
To: Shrikanth Hegde, Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/13/25 6:19 PM, Shrikanth Hegde wrote:
>
>
> On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
>> Enabling the build with the generic entry/exit framework for the powerpc
>> architecture requires a few necessary steps.
>>
>> Introducing minor infrastructure updates to prepare for future generic
>> framework handling:
>>
>> - Add syscall_work field to struct thread_info for SYSCALL_WORK_* flags.
>> - Provide arch_syscall_is_vdso_sigreturn() stub, returning false.
>> - Add on_thread_stack() helper to test whether the current stack pointer
>> lies within the task’s kernel stack.
>>
>> No functional change is intended with this patch.
>>
>> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/entry-common.h | 11 +++++++++++
>> arch/powerpc/include/asm/stacktrace.h | 8 ++++++++
>> arch/powerpc/include/asm/syscall.h | 5 +++++
>> arch/powerpc/include/asm/thread_info.h | 1 +
>> 4 files changed, 25 insertions(+)
>> create mode 100644 arch/powerpc/include/asm/entry-common.h
>>
>> diff --git a/arch/powerpc/include/asm/entry-common.h
>> b/arch/powerpc/include/asm/entry-common.h
>> new file mode 100644
>> index 0000000000000..3af16d821d07e
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/entry-common.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _ASM_PPC_ENTRY_COMMON_H
>> +#define _ASM_PPC_ENTRY_COMMON_H
>> +
>> +#ifdef CONFIG_GENERIC_IRQ_ENTRY
>> +
>> +#include <asm/stacktrace.h>
>> +
>> +#endif /* CONFIG_GENERIC_IRQ_ENTRY */
>> +#endif /* _ASM_PPC_ENTRY_COMMON_H */
>> diff --git a/arch/powerpc/include/asm/stacktrace.h
>> b/arch/powerpc/include/asm/stacktrace.h
>> index 6149b53b3bc8e..3f0a242468813 100644
>> --- a/arch/powerpc/include/asm/stacktrace.h
>> +++ b/arch/powerpc/include/asm/stacktrace.h
>> @@ -8,6 +8,14 @@
>> #ifndef _ASM_POWERPC_STACKTRACE_H
>> #define _ASM_POWERPC_STACKTRACE_H
>> +#include <linux/sched.h>
>
> nit:
>
> Is sched.h needed? I don't see any reference here.
> It compiled for me without it.
>
Will remove this in the next revision.
>> +
>> void show_user_instructions(struct pt_regs *regs);
>> +static inline bool on_thread_stack(void)
>> +{
>> + return !(((unsigned long)(current->stack) ^ current_stack_pointer)
>> + & ~(THREAD_SIZE -1));
>> +}
>> +
>> #endif /* _ASM_POWERPC_STACKTRACE_H */
>> diff --git a/arch/powerpc/include/asm/syscall.h
>> b/arch/powerpc/include/asm/syscall.h
>> index 4b3c52ed6e9d2..834fcc4f7b543 100644
>> --- a/arch/powerpc/include/asm/syscall.h
>> +++ b/arch/powerpc/include/asm/syscall.h
>> @@ -139,4 +139,9 @@ static inline int syscall_get_arch(struct
>> task_struct *task)
>> else
>> return AUDIT_ARCH_PPC64;
>> }
>> +
>> +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
>> +{
>> + return false;
>> +}
>> #endif /* _ASM_SYSCALL_H */
>> diff --git a/arch/powerpc/include/asm/thread_info.h
>> b/arch/powerpc/include/asm/thread_info.h
>> index 2785c7462ebf7..d0e87c9bae0b0 100644
>> --- a/arch/powerpc/include/asm/thread_info.h
>> +++ b/arch/powerpc/include/asm/thread_info.h
>> @@ -54,6 +54,7 @@
>> struct thread_info {
>> int preempt_count; /* 0 => preemptable,
>> <0 => BUG */
>> + unsigned long syscall_work; /* SYSCALL_WORK_ flags */
>
> Can this go after cpu? It would be 8-byte aligned then. Since it is
> in the fast path, it might help.
>
Oh yeah, will move this in the next revision.
Thanks,
Mukesh
>> #ifdef CONFIG_SMP
>> unsigned int cpu;
>> #endif
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path.
2025-09-08 21:02 ` [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
@ 2025-09-16 4:16 ` Shrikanth Hegde
2025-09-18 6:55 ` Mukesh Kumar Chaurasiya
0 siblings, 1 reply; 21+ messages in thread
From: Shrikanth Hegde @ 2025-09-16 4:16 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy
Cc: oleg, kees, luto, wad, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, naveen, thomas.weissschuh, Jason, peterz, tglx,
namcao, kan.liang, mingo, oliver.upton, mark.barnett, atrajeev,
rppt, coltonlewis, linuxppc-dev, linux-kernel
On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
> Enable generic entry/exit path for ppc irq.
>
> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/entry-common.h | 93 ++---
> arch/powerpc/include/asm/interrupt.h | 492 +++---------------------
> arch/powerpc/kernel/interrupt.c | 9 +-
> arch/powerpc/kernel/interrupt_64.S | 2 -
> 5 files changed, 92 insertions(+), 505 deletions(-)
>
\
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index f53d432f60870..7bb8a31b24ea7 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -297,13 +297,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
> /* Returning to a kernel context with local irqs enabled. */
> WARN_ON_ONCE(!(regs->msr & MSR_EE));
> again:
> - if (need_irq_preemption()) {
> - /* Return to preemptible kernel context */
> - if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
> - if (preempt_count() == 0)
> - preempt_schedule_irq();
> - }
> - }
> + if (need_irq_preemption())
> + irqentry_exit_cond_resched();
irqentry_exit_cond_resched() is also called in irqentry_exit(). It would
be better if we could find a way to avoid calling it again.
I see a loop here, but the comment says it does not enable irqs again, so
the loop is bounded. So it might be okay to remove the cond_resched here;
do run the preemptirq and irq tracers to ensure that is the case.
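For reference, the generic exit path already does this; a paraphrased
sketch of kernel/entry/common.c, with most details (RCU, instrumentation)
elided:

	noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
	{
		if (user_mode(regs)) {
			irqentry_exit_to_user_mode(regs);
		} else if (!regs_irqs_disabled(regs)) {
			/* Returning to a preemptible kernel context. */
			if (IS_ENABLED(CONFIG_PREEMPTION))
				irqentry_exit_cond_resched();
			/* ... */
		}
	}

so with the hunk above the same exit can end up calling it twice.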
Also, what is this "soft_interrupts"?
>
> check_return_regs_valid(regs);
>
> diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
> index 1ad059a9e2fef..6aa88fe91fb6a 100644
> --- a/arch/powerpc/kernel/interrupt_64.S
> +++ b/arch/powerpc/kernel/interrupt_64.S
> @@ -418,8 +418,6 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
> beq interrupt_return_\srr\()_kernel
> interrupt_return_\srr\()_user: /* make backtraces match the _kernel variant */
> _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\()_user)
> - addi r3,r1,STACK_INT_FRAME_REGS
> - bl CFUNC(interrupt_exit_user_prepare)
> #ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
> cmpdi r3,0
> bne- .Lrestore_nvgprs_\srr
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode
2025-09-14 9:02 ` Shrikanth Hegde
@ 2025-09-16 4:19 ` Mukesh Kumar Chaurasiya
0 siblings, 0 replies; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-16 4:19 UTC (permalink / raw)
To: Shrikanth Hegde, Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/14/25 2:32 PM, Shrikanth Hegde wrote:
>
>
> On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
>> - Implement the hook arch_enter_from_user_mode for syscall entry.
>
> nit: for generic syscall infra.
Cool, will change this.
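For reference, the generic infra invokes this hook first thing on entry
from userspace; a paraphrased sketch of enter_from_user_mode() from
include/linux/entry-common.h, details elided:

	static __always_inline void enter_from_user_mode(struct pt_regs *regs)
	{
		arch_enter_from_user_mode(regs);	/* the hook added here */
		lockdep_hardirqs_off(CALLER_ADDR0);
		user_exit_irqoff();			/* leave user context tracking */

		instrumentation_begin();
		trace_hardirqs_off_finish();
		instrumentation_end();
	}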
>
>> - Move booke_load_dbcr0 from interrupt.c to interrupt.h
>>
>> No functional change intended.
>>
>> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/entry-common.h | 96 +++++++++++++++++++++++++
>> arch/powerpc/include/asm/interrupt.h | 23 ++++++
>> arch/powerpc/kernel/interrupt.c | 22 ------
>> 3 files changed, 119 insertions(+), 22 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
>> index 3af16d821d07e..49607292bf5a5 100644
>> --- a/arch/powerpc/include/asm/entry-common.h
>> +++ b/arch/powerpc/include/asm/entry-common.h
>> @@ -5,7 +5,103 @@
>> #ifdef CONFIG_GENERIC_IRQ_ENTRY
>> +#include <asm/cputime.h>
>> +#include <asm/interrupt.h>
>> #include <asm/stacktrace.h>
>> +#include <asm/tm.h>
>> +
>> +static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
>> +{
>> + if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
>> + BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
>> +
>> + BUG_ON(regs_is_unrecoverable(regs));
>> + BUG_ON(!user_mode(regs));
>> + BUG_ON(regs_irqs_disabled(regs));
>> +
>> +#ifdef CONFIG_PPC_PKEY
>> + if (mmu_has_feature(MMU_FTR_PKEY)) {
>> + unsigned long amr, iamr;
>> + bool flush_needed = false;
>> + /*
>> + * When entering from userspace we mostly have the AMR/IAMR
>> + * different from kernel default values. Hence don't compare.
>> + */
>> + amr = mfspr(SPRN_AMR);
>> + iamr = mfspr(SPRN_IAMR);
>> + regs->amr = amr;
>> + regs->iamr = iamr;
>> + if (mmu_has_feature(MMU_FTR_KUAP)) {
>> + mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
>> + flush_needed = true;
>> + }
>> + if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP)) {
>> + mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
>> + flush_needed = true;
>> + }
>> + if (flush_needed)
>> + isync();
>> + } else
>> +#endif
>> + kuap_assert_locked();
>> +
>> + booke_restore_dbcr0();
>> +
>> + account_cpu_user_entry();
>> +
>> + account_stolen_time();
>> +
>> + /*
>> + * This is not required for the syscall exit path, but makes the
>> + * stack frame look nicer. If this was initialised in the first stack
>> + * frame, or if the unwinder was taught the first stack frame always
>> + * returns to user with IRQS_ENABLED, this store could be avoided!
>> + */
>> + irq_soft_mask_regs_set_state(regs, IRQS_ENABLED);
>> +
>> + /*
>> + * If system call is called with TM active, set _TIF_RESTOREALL to
>> + * prevent RFSCV being used to return to userspace, because POWER9
>> + * TM implementation has problems with this instruction returning to
>> + * transactional state. Final register values are not relevant because
>> + * the transaction will be aborted upon return anyway. Or in the case
>> + * of unsupported_scv SIGILL fault, the return state does not much
>> + * matter because it's an edge case.
>> + */
>> + if (IS_ENABLED(CONFIG_PPC_TRANSACTIONAL_MEM) &&
>> + unlikely(MSR_TM_TRANSACTIONAL(regs->msr)))
>> + set_bits(_TIF_RESTOREALL, &current_thread_info()->flags);
>> +
>> + /*
>> + * If the system call was made with a transaction active, doom it and
>> + * return without performing the system call. Unless it was an
>> + * unsupported scv vector, in which case it's treated like an illegal
>> + * instruction.
>> + */
>> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>> + if (unlikely(MSR_TM_TRANSACTIONAL(regs->msr)) &&
>> + !trap_is_unsupported_scv(regs)) {
>> + /* Enable TM in the kernel, and disable EE (for scv) */
>> + hard_irq_disable();
>> + mtmsr(mfmsr() | MSR_TM);
>> +
>> + /* tabort, this dooms the transaction, nothing else */
>> + asm volatile(".long 0x7c00071d | ((%0) << 16)"
>> + :: "r"(TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT));
>> +
>> + /*
>> + * Userspace will never see the return value. Execution will
>> + * resume after the tbegin. of the aborted transaction with the
>> + * checkpointed register state. A context switch could occur
>> + * or signal delivered to the process before resuming the
>> + * doomed transaction context, but that should all be handled
>> + * as expected.
>> + */
>> + return;
>> + }
>> +#endif // CONFIG_PPC_TRANSACTIONAL_MEM
>
> nit: Better to follow standard comment practices.
> /* CONFIG_PPC_TRANSACTIONAL_MEM */
>
Sure.
>> +}
>> +#define arch_enter_from_user_mode arch_enter_from_user_mode
>> #endif /* CONFIG_GENERIC_IRQ_ENTRY */
>> #endif /* _ASM_PPC_ENTRY_COMMON_H */
>> diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
>> index 56bc8113b8cde..6edf064a0fea2 100644
>> --- a/arch/powerpc/include/asm/interrupt.h
>> +++ b/arch/powerpc/include/asm/interrupt.h
>> @@ -138,6 +138,29 @@ static inline void nap_adjust_return(struct pt_regs *regs)
>> #endif
>> }
>> +static inline void booke_load_dbcr0(void)
>> +{
>> +#ifdef CONFIG_PPC_ADV_DEBUG_REGS
>> + unsigned long dbcr0 = current->thread.debug.dbcr0;
>> +
>> + if (likely(!(dbcr0 & DBCR0_IDM)))
>> + return;
>> +
>> + /*
>> + * Check to see if the dbcr0 register is set up to debug.
>> + * Use the internal debug mode bit to do this.
>> + */
>> + mtmsr(mfmsr() & ~MSR_DE);
>> + if (IS_ENABLED(CONFIG_PPC32)) {
>> + isync();
>> + global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
>> + }
>> + mtspr(SPRN_DBCR0, dbcr0);
>> + mtspr(SPRN_DBSR, -1);
>> +#endif
>> +}
>> +
>
> Please run checkpatch.pl --strict on the series and fix the simple
> ones, such as tabs vs. spaces, alignment, extra blank lines, etc.
>
Sure, will fix these.
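(Will run something like ./scripts/checkpatch.pl --strict --git HEAD-8
over the series.)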
Thanks,
Mukesh
>> +
>> static inline void booke_restore_dbcr0(void)
>> {
>> #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
>> index 0d8fd47049a19..2a09ac5dabd62 100644
>> --- a/arch/powerpc/kernel/interrupt.c
>> +++ b/arch/powerpc/kernel/interrupt.c
>> @@ -78,28 +78,6 @@ static notrace __always_inline bool prep_irq_for_enabled_exit(bool restartable)
>> return true;
>> }
>> -static notrace void booke_load_dbcr0(void)
>> -{
>> -#ifdef CONFIG_PPC_ADV_DEBUG_REGS
>> - unsigned long dbcr0 = current->thread.debug.dbcr0;
>> -
>> - if (likely(!(dbcr0 & DBCR0_IDM)))
>> - return;
>> -
>> - /*
>> - * Check to see if the dbcr0 register is set up to debug.
>> - * Use the internal debug mode bit to do this.
>> - */
>> - mtmsr(mfmsr() & ~MSR_DE);
>> - if (IS_ENABLED(CONFIG_PPC32)) {
>> - isync();
>> - global_dbcr0[smp_processor_id()] = mfspr(SPRN_DBCR0);
>> - }
>> - mtspr(SPRN_DBCR0, dbcr0);
>> - mtspr(SPRN_DBSR, -1);
>> -#endif
>> -}
>> -
>> static notrace void check_return_regs_valid(struct pt_regs *regs)
>> {
>> #ifdef CONFIG_PPC_BOOK3S_64
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path.
2025-09-16 4:16 ` Shrikanth Hegde
@ 2025-09-18 6:55 ` Mukesh Kumar Chaurasiya
0 siblings, 0 replies; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-18 6:55 UTC (permalink / raw)
To: Shrikanth Hegde, Mukesh Kumar Chaurasiya, maddy, mpe, npiggin,
christophe.leroy
Cc: oleg, kees, luto, wad, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, naveen, thomas.weissschuh, Jason, peterz, tglx,
namcao, kan.liang, mingo, oliver.upton, mark.barnett, atrajeev,
rppt, coltonlewis, linuxppc-dev, linux-kernel
On 9/16/25 9:46 AM, Shrikanth Hegde wrote:
>
>
> On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
>> Enable generic entry/exit path for ppc irq.
>>
>> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>> ---
>> arch/powerpc/Kconfig | 1 +
>> arch/powerpc/include/asm/entry-common.h | 93 ++---
>> arch/powerpc/include/asm/interrupt.h | 492 +++---------------------
>> arch/powerpc/kernel/interrupt.c | 9 +-
>> arch/powerpc/kernel/interrupt_64.S | 2 -
>> 5 files changed, 92 insertions(+), 505 deletions(-)
>>
> \
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
>> index f53d432f60870..7bb8a31b24ea7 100644
>> --- a/arch/powerpc/kernel/interrupt.c
>> +++ b/arch/powerpc/kernel/interrupt.c
>> @@ -297,13 +297,8 @@ notrace unsigned long
>> interrupt_exit_kernel_prepare(struct pt_regs *regs)
>> /* Returning to a kernel context with local irqs enabled. */
>> WARN_ON_ONCE(!(regs->msr & MSR_EE));
>> again:
>> - if (need_irq_preemption()) {
>> - /* Return to preemptible kernel context */
>> - if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
>> - if (preempt_count() == 0)
>> - preempt_schedule_irq();
>> - }
>> - }
>> + if (need_irq_preemption())
>> + irqentry_exit_cond_resched();
>
> irqentry_exit_cond_resched() is also called in irqentry_exit(). It would
> be better if we could find a way to avoid calling it again.
>
> I see a loop here, but the comment says it does not enable irqs again, so
> the loop is bounded. So it might be okay to remove the cond_resched here;
> do run the preemptirq and irq tracers to ensure that is the case.
>
Sure.
> Also, what is this "soft_interrupts"?
You mean soft-masked interrupts?
It's a mechanism to buffer interrupts without clearing the MSR[EE] bit so
that we can replay those interrupts later.
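Very roughly, as illustrative pseudo-code (the real state lives in the
PACA as irq_soft_mask/irq_happened, and the replay is done by
replay_soft_interrupts()):

	/* local_irq_disable(): only set a flag, MSR[EE] stays on */
	local_paca->irq_soft_mask = IRQS_DISABLED;

	/* hardware interrupt arriving while soft-masked: */
	if (local_paca->irq_soft_mask) {
		local_paca->irq_happened |= PACA_IRQ_EE;	/* buffer it */
		/* hard-disable at the CPU and return immediately */
	}

	/* local_irq_enable(): unmask, then replay whatever was buffered */
	local_paca->irq_soft_mask = IRQS_ENABLED;
	if (local_paca->irq_happened)
		replay_soft_interrupts();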
>> check_return_regs_valid(regs);
>> diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
>> index 1ad059a9e2fef..6aa88fe91fb6a 100644
>> --- a/arch/powerpc/kernel/interrupt_64.S
>> +++ b/arch/powerpc/kernel/interrupt_64.S
>> @@ -418,8 +418,6 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\())
>> beq interrupt_return_\srr\()_kernel
>> interrupt_return_\srr\()_user: /* make backtraces match the _kernel variant */
>> _ASM_NOKPROBE_SYMBOL(interrupt_return_\srr\()_user)
>> - addi r3,r1,STACK_INT_FRAME_REGS
>> - bl CFUNC(interrupt_exit_user_prepare)
>> #ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
>> cmpdi r3,0
>> bne- .Lrestore_nvgprs_\srr
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls.
2025-09-09 6:54 ` Shrikanth Hegde
2025-09-09 8:46 ` Mukesh Kumar Chaurasiya
@ 2025-09-18 6:57 ` Mukesh Kumar Chaurasiya
1 sibling, 0 replies; 21+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2025-09-18 6:57 UTC (permalink / raw)
To: Shrikanth Hegde, Mukesh Kumar Chaurasiya
Cc: maddy, mpe, npiggin, christophe.leroy, oleg, kees, luto, wad,
deller, ldv, macro, charlie, akpm, bigeasy, ankur.a.arora, naveen,
thomas.weissschuh, Jason, peterz, tglx, namcao, kan.liang, mingo,
oliver.upton, mark.barnett, atrajeev, rppt, coltonlewis,
linuxppc-dev, linux-kernel
On 9/9/25 12:24 PM, Shrikanth Hegde wrote:
>
>
> On 9/9/25 2:32 AM, Mukesh Kumar Chaurasiya wrote:
>> Enable the syscall entry and exit path from generic framework.
>>
>> Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
>> ---
>
> Hi Mukesh.
> Thanks for working on this and getting it into better shape.
>
>> arch/powerpc/Kconfig | 1 +
>> arch/powerpc/include/asm/entry-common.h | 2 +-
>> arch/powerpc/kernel/interrupt.c | 135 +++++++----------------
>> arch/powerpc/kernel/ptrace/ptrace.c | 141 ------------------------
>> arch/powerpc/kernel/signal.c | 10 +-
>> arch/powerpc/kernel/syscall.c | 119 +-------------------
>> 6 files changed, 49 insertions(+), 359 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index e0c51d7b5638d..e67294a72e4d4 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -199,6 +199,7 @@ config PPC
>> select GENERIC_CPU_AUTOPROBE
>> select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
>> select GENERIC_EARLY_IOREMAP
>> + select GENERIC_ENTRY
>> select GENERIC_GETTIMEOFDAY
>> select GENERIC_IDLE_POLL_SETUP
>> select GENERIC_IOREMAP
>> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
>> index d3f4a12aeafca..8fb74e6aa9560 100644
>> --- a/arch/powerpc/include/asm/entry-common.h
>> +++ b/arch/powerpc/include/asm/entry-common.h
>> @@ -3,7 +3,7
>
> There could be some configs we need to take care of while enabling
> generic entry. Since powerpc didn't have it earlier, there could be
> areas which need cleanup; dynamic preemption is one example. There
> could be more. Do some git history checks and see.
>
> Issue with dynamic preemption:
>
> ld:
> kernel/entry/common.o:/home/shrikanth/sched_tip/kernel/entry/common.c:161:
> multiple definition of `sk_dynamic_irqentry_exit_cond_resched';
> arch/powerpc/kernel/interrupt.o:/home/shrikanth/sched_tip/arch/powerpc/kernel/interrupt.c:29:
> first defined here
>
> The diff below fixes it, and switching preemption modes then works. Also
> verified that preempt lazy works too.
>
> ---
> diff --git a/arch/powerpc/kernel/interrupt.c
> b/arch/powerpc/kernel/interrupt.c
> index 642e22527f9d..e1e0f0da4165 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -25,10 +25,6 @@
> unsigned long global_dbcr0[NR_CPUS];
> #endif
>
> -#if defined(CONFIG_PREEMPT_DYNAMIC)
> -DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> -#endif
> -
> #ifdef CONFIG_PPC_BOOK3S_64
> DEFINE_STATIC_KEY_FALSE(interrupt_exit_not_reentrant);
> static inline bool exit_must_hard_disable(void)
>
>
> ----
> Though the ideal thing is to move them to sched/core instead of keeping
> them in generic code, like below:
> https://lore.kernel.org/all/20250716094745.2232041-1-sshegde@linux.ibm.com/
>
>
Sure.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC V2 0/8] Generic IRQ entry/exit support for powerpc
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
` (8 preceding siblings ...)
2025-09-10 20:34 ` [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Thomas Gleixner
@ 2025-09-24 9:04 ` Samir Alamshaha Mulani
9 siblings, 0 replies; 21+ messages in thread
From: Samir Alamshaha Mulani @ 2025-09-24 9:04 UTC (permalink / raw)
To: Mukesh Kumar Chaurasiya, maddy, mpe, npiggin, christophe.leroy,
oleg, kees, luto, wad, deller, ldv, macro, charlie, akpm, bigeasy,
ankur.a.arora, sshegde, naveen, thomas.weissschuh, Jason, peterz,
tglx, namcao, kan.liang, mingo, oliver.upton, mark.barnett,
atrajeev, rppt, coltonlewis, linuxppc-dev, linux-kernel
On 09/09/25 2:32 am, Mukesh Kumar Chaurasiya wrote:
> Adding support for the generic irq entry/exit handling for PowerPC. The
> goal is to bring PowerPC in line with other architectures that already
> use the common irq entry infrastructure, reducing duplicated code and
> making it easier to share future changes in entry/exit paths.
>
> This is slightly tested on ppc64le.
>
> The performance benchmarks from perf bench basic syscall are below:
>
> | Metric | W/O Generic Framework | With Generic Framework | Improvement |
> | ---------- | --------------------- | ---------------------- | ----------- |
> | Total time | 0.885 [sec] | 0.880 [sec] | ~0.56% |
> | usecs/op | 0.088518 | 0.088005 | ~0.58% |
> | ops/sec | 1,12,97,086 | 1,13,62,977 | ~0.58% |
>
> Thats close to 0.6% improvement with this.
>
> Changelog:
> V1 -> V2: Support added for irq with generic framework.
>
> Mukesh Kumar Chaurasiya (8):
> powerpc: rename arch_irq_disabled_regs
> powerpc: Prepare to build with generic entry/exit framework
> powerpc: introduce arch_enter_from_user_mode
> powerpc: Introduce syscall exit arch functions
> powerpc: add exit_flags field in pt_regs
> powerpc: Prepare for IRQ entry exit
> powerpc: Enable IRQ generic entry/exit path.
> powerpc: Enable Generic Entry/Exit for syscalls.
>
> arch/powerpc/Kconfig | 2 +
> arch/powerpc/include/asm/entry-common.h | 550 ++++++++++++++++++++++++
> arch/powerpc/include/asm/hw_irq.h | 4 +-
> arch/powerpc/include/asm/interrupt.h | 393 +++--------------
> arch/powerpc/include/asm/ptrace.h | 2 +
> arch/powerpc/include/asm/stacktrace.h | 8 +
> arch/powerpc/include/asm/syscall.h | 5 +
> arch/powerpc/include/asm/thread_info.h | 1 +
> arch/powerpc/include/uapi/asm/ptrace.h | 14 +-
> arch/powerpc/kernel/asm-offsets.c | 1 +
> arch/powerpc/kernel/interrupt.c | 251 ++---------
> arch/powerpc/kernel/interrupt_64.S | 2 -
> arch/powerpc/kernel/ptrace/ptrace.c | 142 +-----
> arch/powerpc/kernel/signal.c | 8 +
> arch/powerpc/kernel/syscall.c | 119 +----
> arch/powerpc/kernel/traps.c | 2 +-
> arch/powerpc/kernel/watchdog.c | 2 +-
> arch/powerpc/perf/core-book3s.c | 2 +-
> 18 files changed, 698 insertions(+), 810 deletions(-)
> create mode 100644 arch/powerpc/include/asm/entry-common.h
>
Hi,

I have reviewed and tested the generic IRQ entry/exit patch series.
Below are my observations:

Test Coverage
• Successfully ran LTP (specifically the syscall tests) and the entire
LTP test suite without observing any regressions or issues related to
the implementation.

System Configuration
• CPUs: 640
• Kernel: v6.17.0-rc6+
• Processor mode: Shared (uncapped)

Performance Evaluation
• Conducted benchmarking using perf bench syscall basic -l.
• No functional regressions observed, and results were consistent with
expectations.
The performance benchmarks from perf bench basic syscall are below:
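(Each configuration was run with perf bench syscall basic -l set to the
loop counts shown below, i.e. 100,000 / 1,000,000 / 10,000,000.)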
Loops = 100,000

| Metric   | W/O Generic Framework | With Generic Framework | Improvement |
| -------- | --------------------- | ---------------------- | ----------- |
| usecs/op | 0.124562              | 0.124253               | ~0.25%      |
| ops/sec  | 8,028,471             | 8,048,158              | ~0.25%      |

Loops = 1,000,000

| Metric   | W/O Generic Framework | With Generic Framework | Improvement |
| -------- | --------------------- | ---------------------- | ----------- |
| usecs/op | 0.125389              | 0.124374               | ~0.81%      |
| ops/sec  | 7,977,511             | 8,040,330              | ~0.79%      |

Loops = 10,000,000

| Metric   | W/O Generic Framework | With Generic Framework | Improvement |
| -------- | --------------------- | ---------------------- | ----------- |
| usecs/op | 0.124626              | 0.123928               | ~0.56%      |
| ops/sec  | 8,024,058             | 8,069,182              | ~0.56%      |

Overall (aggregated across all runs)

| Metric     | W/O Generic Framework | With Generic Framework | Improvement |
| ---------- | --------------------- | ---------------------- | ----------- |
| Total time | 1.384 [sec]           | 1.376 [sec]            | ~0.58%      |
| usecs/op   | 0.124694              | 0.123971               | ~0.58%      |
| ops/sec    | 8,019,904             | 8,066,393              | ~0.58%      |
With this benchmarking we can see an improvement of close to 0.6%.

Overall, the patch series works as intended in my testing.
Please add the below tag for the patch set.
Tested-by: Samir M <samir@linux.vnet.ibm.com>
Thank You !!
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads: [~2025-09-24 15:57 UTC | newest]
Thread overview: 21+ messages -- links below jump to the message on this page --
2025-09-08 21:02 [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 1/8] powerpc: rename arch_irq_disabled_regs Mukesh Kumar Chaurasiya
2025-09-13 12:50 ` Shrikanth Hegde
2025-09-08 21:02 ` [RFC V2 2/8] powerpc: Prepare to build with generic entry/exit framework Mukesh Kumar Chaurasiya
2025-09-13 12:49 ` Shrikanth Hegde
2025-09-16 4:16 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 3/8] powerpc: introduce arch_enter_from_user_mode Mukesh Kumar Chaurasiya
2025-09-14 9:02 ` Shrikanth Hegde
2025-09-16 4:19 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 4/8] powerpc: Introduce syscall exit arch functions Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 5/8] powerpc: add exit_flags field in pt_regs Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 6/8] powerpc: Prepare for IRQ entry exit Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 7/8] powerpc: Enable IRQ generic entry/exit path Mukesh Kumar Chaurasiya
2025-09-16 4:16 ` Shrikanth Hegde
2025-09-18 6:55 ` Mukesh Kumar Chaurasiya
2025-09-08 21:02 ` [RFC V2 8/8] powerpc: Enable Generic Entry/Exit for syscalls Mukesh Kumar Chaurasiya
2025-09-09 6:54 ` Shrikanth Hegde
2025-09-09 8:46 ` Mukesh Kumar Chaurasiya
2025-09-18 6:57 ` Mukesh Kumar Chaurasiya
2025-09-10 20:34 ` [RFC V2 0/8] Generic IRQ entry/exit support for powerpc Thomas Gleixner
2025-09-24 9:04 ` Samir Alamshaha Mulani