* [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry
From: Jinjie Ruan @ 2025-02-13 12:59 UTC
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
Currently, x86, RISC-V and LoongArch use the generic entry code. Convert
arm64 to use the generic entry infrastructure from kernel/entry/*. The
generic entry code makes maintainers' work easier and the code more
elegant; it will also make PREEMPT_DYNAMIC and PREEMPT_LAZY available
and remove a lot of duplicated code.
This patch series first splits the generic entry code into generic irq
entry and generic syscall entry, and then converts arm64 to use the
generic irq entry. arm64 will be completely converted to the generic
entry code in a follow-up patch series.
The main conversion steps are as follows (a sketch of the resulting
entry code follows the list):
- Split the generic entry code into generic irq entry and generic
syscall entry, so that each patch is focused on switching one thing.
- Make it easier for arm64 to use irqentry_enter/exit().
- Bring arm64 closer to the PREEMPT_DYNAMIC code of the generic entry.
- Switch to the generic irq entry.
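To illustrate the end state, an interrupt entry path on a converted
architecture would look roughly like the sketch below. This is only an
illustration, assuming the generic irq entry API from kernel/entry/*;
the function name is hypothetical, and the real arm64 wiring happens in
the final patch:

#include <linux/irq-entry-common.h>

/* Hypothetical sketch of a converted architecture's IRQ dispatch. */
static void noinstr arch_irq_entry(struct pt_regs *regs)
{
	/* Establish lockdep/RCU/tracing state for the exception. */
	irqentry_state_t state = irqentry_enter(regs);

	/* ... dispatch the interrupt to its handler here ... */

	/* Undo the entry state; may reschedule on return to kernel. */
	irqentry_exit(regs, state);
}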
Changes in v6:
- Rebased on 6.14 rc2 next.
- Put the syscall bits aside and split them out.
- Have the split patch before the arm64 changes.
- Merge some tightly coupled patches.
- Adjust the order of some patches to make them more reasonable.
- Define regs_irqs_disabled() by inline function.
- Define interrupts_enabled() in terms of regs_irqs_disabled().
- Delete the fast_interrupts_enabled() macro.
- irqentry_state_t -> arm64_irqentry_state_t.
- Remove arch_exit_to_user_mode_prepare() and pull local_daif_mask() later
in the arm64 exit sequence.
- Update the commit message.
Changes in v5:
- Leave arm32 unchanged and keep the interrupts_enabled() macro for the
GICv3 driver.
- Move the irqentry_state definition into arch/arm64/kernel/entry-common.c.
- Avoid removing the __enter_from_*() and __exit_to_*() wrappers.
- Rename "irqentry_state_t ret/irq_state" to "state" for consistency.
- Use the generic irq entry header for PREEMPT_DYNAMIC after splitting
the generic entry.
- Also refactor the ARM64 syscall code.
- Introduce arch_ptrace_report_syscall_entry/exit() instead of
arch_pre/post_report_syscall_entry/exit() to simplify the code.
- Cleanly separate the syscall patches.
- Update the commit message.
Changes in v4:
- Split the rework/cleanup into a few patches, as Mark suggested.
- Replace the interrupts_enabled() macro with regs_irqs_disabled(),
instead of leaving it in place.
- Remove the RCU and lockdep state from pt_regs by using a temporary
irqentry_state_t, as Mark suggested.
- Remove some unnecessary intermediate functions for clarity.
- Rework the preempt irq and PREEMPT_DYNAMIC code to make the switch
clearer.
- arch_prepare_*_entry/exit() -> arch_pre_*_entry/exit().
- Expand the comments on the arch functions.
- Move the arch functions closer to their callers.
- Declare saved_reg in the for block.
- Remove arch_exit_to_kernel_mode_prepare(), arch_enter_from_kernel_mode().
- Adjust the "Add few arch functions to use generic entry" patch to be
the penultimate one.
- Update the commit message.
- Add suggested-by.
Changes in v3:
- Test the MTE test cases.
- Handle forget_syscall() in arch_post_report_syscall_entry().
- Make the arch funcs not use __weak as Thomas suggested, so move the
arch funcs to entry-common.h, and fold arch_forget_syscall() into
arch_post_report_syscall_entry() as suggested.
- Move report_single_step() to thread_info.h for arm64.
- Change __always_inline() to inline, and add inline for the other arch
funcs.
- Remove unused signal.h for entry-common.h.
- Add Suggested-by.
- Update the commit message.
Changes in v2:
- Add tested-by.
- Fix a bug where arch_post_report_syscall_entry() was not called in
syscall_trace_enter() if ptrace_report_syscall_entry() returned non-zero.
- Refactor report_syscall().
- Add comment for arch_prepare_report_syscall_exit().
- Adjust entry-common.h header file inclusion to alphabetical order.
- Update the commit message.
Jinjie Ruan (8):
entry: Split generic entry into generic exception and syscall entry
arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
arm64: entry: Refactor the entry and exit for exceptions from EL1
arm64: entry: Rework arm64_preempt_schedule_irq()
arm64: entry: Use preempt_count() and need_resched() helper
arm64: entry: Refactor preempt_schedule_irq() check code
arm64: entry: Move arm64_preempt_schedule_irq() into
__exit_to_kernel_mode()
arm64: entry: Switch to generic IRQ entry
MAINTAINERS | 1 +
arch/Kconfig | 9 +
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/daifflags.h | 2 +-
arch/arm64/include/asm/entry-common.h | 56 ++++
arch/arm64/include/asm/preempt.h | 2 -
arch/arm64/include/asm/ptrace.h | 13 +-
arch/arm64/include/asm/xen/events.h | 2 +-
arch/arm64/kernel/acpi.c | 2 +-
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/entry-common.c | 378 ++++++++-----------------
arch/arm64/kernel/sdei.c | 2 +-
arch/arm64/kernel/signal.c | 3 +-
include/linux/entry-common.h | 382 +------------------------
include/linux/irq-entry-common.h | 389 ++++++++++++++++++++++++++
kernel/entry/Makefile | 3 +-
kernel/entry/common.c | 176 ++----------
kernel/entry/syscall-common.c | 159 +++++++++++
kernel/sched/core.c | 8 +-
19 files changed, 766 insertions(+), 824 deletions(-)
create mode 100644 arch/arm64/include/asm/entry-common.h
create mode 100644 include/linux/irq-entry-common.h
create mode 100644 kernel/entry/syscall-common.c
--
2.34.1
* [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry
From: Jinjie Ruan @ 2025-02-13 13:00 UTC
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
Currently, CONFIG_GENERIC_ENTRY enables both the generic exception
entry logic and the generic syscall entry logic, which are otherwise
loosely coupled.
Introduce separate config options for these so that architectures can
select the two independently. This will make it easier for
architectures to migrate to the generic entry code.
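As an example of the consumer side, an architecture that selects only
GENERIC_IRQ_ENTRY gets the exception entry/exit API from the new header
without pulling in the syscall half. A hedged sketch (the handler name
is hypothetical) of an exception taken from user space:

#include <linux/irq-entry-common.h>

/* Hypothetical handler for an exception taken from user mode. */
static void noinstr arch_handle_user_exception(struct pt_regs *regs)
{
	/* Establish lockdep/RCU/tracing state on entry from user space. */
	irqentry_enter_from_user_mode(regs);

	/* ... instrumentable handling of the exception ... */

	/* Run pending work (signals, resched) and prepare the return. */
	irqentry_exit_to_user_mode(regs);
}

syscall_enter_from_user_mode() and friends remain behind
GENERIC_SYSCALL and <linux/entry-common.h>.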
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Update the commit message.
- Have this before the arm64 changes.
- Make GENERIC_SYSCALL depend on GENERIC_IRQ_ENTRY.
---
MAINTAINERS | 1 +
arch/Kconfig | 9 +
include/linux/entry-common.h | 382 +-----------------------------
include/linux/irq-entry-common.h | 389 +++++++++++++++++++++++++++++++
kernel/entry/Makefile | 3 +-
kernel/entry/common.c | 160 +------------
kernel/entry/syscall-common.c | 159 +++++++++++++
kernel/sched/core.c | 8 +-
8 files changed, 566 insertions(+), 545 deletions(-)
create mode 100644 include/linux/irq-entry-common.h
create mode 100644 kernel/entry/syscall-common.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 92fc0eca7061..56e72dab6655 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9666,6 +9666,7 @@ S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/entry
F: include/linux/entry-common.h
F: include/linux/entry-kvm.h
+F: include/linux/irq-entry-common.h
F: kernel/entry/
GENERIC GPIO I2C DRIVER
diff --git a/arch/Kconfig b/arch/Kconfig
index b8a4ff365582..b59c23594342 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -64,8 +64,17 @@ config HOTPLUG_PARALLEL
bool
select HOTPLUG_SPLIT_STARTUP
+config GENERIC_IRQ_ENTRY
+ bool
+
+config GENERIC_SYSCALL
+ bool
+ depends on GENERIC_IRQ_ENTRY
+
config GENERIC_ENTRY
bool
+ select GENERIC_IRQ_ENTRY
+ select GENERIC_SYSCALL
config KPROBES
bool "Kprobes"
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index fc61d0205c97..b3233e8328c5 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -2,27 +2,15 @@
#ifndef __LINUX_ENTRYCOMMON_H
#define __LINUX_ENTRYCOMMON_H
-#include <linux/static_call_types.h>
+#include <linux/irq-entry-common.h>
#include <linux/ptrace.h>
-#include <linux/syscalls.h>
#include <linux/seccomp.h>
#include <linux/sched.h>
-#include <linux/context_tracking.h>
#include <linux/livepatch.h>
#include <linux/resume_user_mode.h>
-#include <linux/tick.h>
-#include <linux/kmsan.h>
#include <asm/entry-common.h>
-/*
- * Define dummy _TIF work flags if not defined by the architecture or for
- * disabled functionality.
- */
-#ifndef _TIF_PATCH_PENDING
-# define _TIF_PATCH_PENDING (0)
-#endif
-
#ifndef _TIF_UPROBE
# define _TIF_UPROBE (0)
#endif
@@ -55,69 +43,6 @@
SYSCALL_WORK_SYSCALL_EXIT_TRAP | \
ARCH_SYSCALL_WORK_EXIT)
-/*
- * TIF flags handled in exit_to_user_mode_loop()
- */
-#ifndef ARCH_EXIT_TO_USER_MODE_WORK
-# define ARCH_EXIT_TO_USER_MODE_WORK (0)
-#endif
-
-#define EXIT_TO_USER_MODE_WORK \
- (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
- _TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \
- _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \
- ARCH_EXIT_TO_USER_MODE_WORK)
-
-/**
- * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs
- * @regs: Pointer to currents pt_regs
- *
- * Defaults to an empty implementation. Can be replaced by architecture
- * specific code.
- *
- * Invoked from syscall_enter_from_user_mode() in the non-instrumentable
- * section. Use __always_inline so the compiler cannot push it out of line
- * and make it instrumentable.
- */
-static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs);
-
-#ifndef arch_enter_from_user_mode
-static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {}
-#endif
-
-/**
- * enter_from_user_mode - Establish state when coming from user mode
- *
- * Syscall/interrupt entry disables interrupts, but user mode is traced as
- * interrupts enabled. Also with NO_HZ_FULL RCU might be idle.
- *
- * 1) Tell lockdep that interrupts are disabled
- * 2) Invoke context tracking if enabled to reactivate RCU
- * 3) Trace interrupts off state
- *
- * Invoked from architecture specific syscall entry code with interrupts
- * disabled. The calling code has to be non-instrumentable. When the
- * function returns all state is correct and interrupts are still
- * disabled. The subsequent functions can be instrumented.
- *
- * This is invoked when there is architecture specific functionality to be
- * done between establishing state and enabling interrupts. The caller must
- * enable interrupts before invoking syscall_enter_from_user_mode_work().
- */
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
-{
- arch_enter_from_user_mode(regs);
- lockdep_hardirqs_off(CALLER_ADDR0);
-
- CT_WARN_ON(__ct_state() != CT_STATE_USER);
- user_exit_irqoff();
-
- instrumentation_begin();
- kmsan_unpoison_entry_regs(regs);
- trace_hardirqs_off_finish();
- instrumentation_end();
-}
-
/**
* syscall_enter_from_user_mode_prepare - Establish state and enable interrupts
* @regs: Pointer to currents pt_regs
@@ -202,170 +127,6 @@ static __always_inline long syscall_enter_from_user_mode(struct pt_regs *regs, l
return ret;
}
-/**
- * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Defaults to local_irq_enable(). Can be supplied by architecture specific
- * code.
- */
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
-
-#ifndef local_irq_enable_exit_to_user
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
-{
- local_irq_enable();
-}
-#endif
-
-/**
- * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable()
- *
- * Defaults to local_irq_disable(). Can be supplied by architecture specific
- * code.
- */
-static inline void local_irq_disable_exit_to_user(void);
-
-#ifndef local_irq_disable_exit_to_user
-static inline void local_irq_disable_exit_to_user(void)
-{
- local_irq_disable();
-}
-#endif
-
-/**
- * arch_exit_to_user_mode_work - Architecture specific TIF work for exit
- * to user mode.
- * @regs: Pointer to currents pt_regs
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Invoked from exit_to_user_mode_loop() with interrupt enabled
- *
- * Defaults to NOOP. Can be supplied by architecture specific code.
- */
-static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
- unsigned long ti_work);
-
-#ifndef arch_exit_to_user_mode_work
-static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
- unsigned long ti_work)
-{
-}
-#endif
-
-/**
- * arch_exit_to_user_mode_prepare - Architecture specific preparation for
- * exit to user mode.
- * @regs: Pointer to currents pt_regs
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last
- * function before return. Defaults to NOOP.
- */
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
- unsigned long ti_work);
-
-#ifndef arch_exit_to_user_mode_prepare
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
- unsigned long ti_work)
-{
-}
-#endif
-
-/**
- * arch_exit_to_user_mode - Architecture specific final work before
- * exit to user mode.
- *
- * Invoked from exit_to_user_mode() with interrupt disabled as the last
- * function before return. Defaults to NOOP.
- *
- * This needs to be __always_inline because it is non-instrumentable code
- * invoked after context tracking switched to user mode.
- *
- * An architecture implementation must not do anything complex, no locking
- * etc. The main purpose is for speculation mitigations.
- */
-static __always_inline void arch_exit_to_user_mode(void);
-
-#ifndef arch_exit_to_user_mode
-static __always_inline void arch_exit_to_user_mode(void) { }
-#endif
-
-/**
- * arch_do_signal_or_restart - Architecture specific signal delivery function
- * @regs: Pointer to currents pt_regs
- *
- * Invoked from exit_to_user_mode_loop().
- */
-void arch_do_signal_or_restart(struct pt_regs *regs);
-
-/**
- * exit_to_user_mode_loop - do any pending work before leaving to user space
- */
-unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
- unsigned long ti_work);
-
-/**
- * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
- * @regs: Pointer to pt_regs on entry stack
- *
- * 1) check that interrupts are disabled
- * 2) call tick_nohz_user_enter_prepare()
- * 3) call exit_to_user_mode_loop() if any flags from
- * EXIT_TO_USER_MODE_WORK are set
- * 4) check that interrupts are still disabled
- */
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long ti_work;
-
- lockdep_assert_irqs_disabled();
-
- /* Flush pending rcuog wakeup before the last need_resched() check */
- tick_nohz_user_enter_prepare();
-
- ti_work = read_thread_flags();
- if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
- ti_work = exit_to_user_mode_loop(regs, ti_work);
-
- arch_exit_to_user_mode_prepare(regs, ti_work);
-
- /* Ensure that kernel state is sane for a return to userspace */
- kmap_assert_nomap();
- lockdep_assert_irqs_disabled();
- lockdep_sys_exit();
-}
-
-/**
- * exit_to_user_mode - Fixup state when exiting to user mode
- *
- * Syscall/interrupt exit enables interrupts, but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about it.
- *
- * 1) Trace interrupts on state
- * 2) Invoke context tracking if enabled to adjust RCU state
- * 3) Invoke architecture specific last minute exit code, e.g. speculation
- * mitigations, etc.: arch_exit_to_user_mode()
- * 4) Tell lockdep that interrupts are enabled
- *
- * Invoked from architecture specific code when syscall_exit_to_user_mode()
- * is not suitable as the last step before returning to userspace. Must be
- * invoked with interrupts disabled and the caller must be
- * non-instrumentable.
- * The caller has to invoke syscall_exit_to_user_mode_work() before this.
- */
-static __always_inline void exit_to_user_mode(void)
-{
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- instrumentation_end();
-
- user_enter_irqoff();
- arch_exit_to_user_mode();
- lockdep_hardirqs_on(CALLER_ADDR0);
-}
-
/**
* syscall_exit_to_user_mode_work - Handle work before returning to user mode
* @regs: Pointer to currents pt_regs
@@ -412,145 +173,4 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs);
*/
void syscall_exit_to_user_mode(struct pt_regs *regs);
-/**
- * irqentry_enter_from_user_mode - Establish state before invoking the irq handler
- * @regs: Pointer to currents pt_regs
- *
- * Invoked from architecture specific entry code with interrupts disabled.
- * Can only be called when the interrupt entry came from user mode. The
- * calling code must be non-instrumentable. When the function returns all
- * state is correct and the subsequent functions can be instrumented.
- *
- * The function establishes state (lockdep, RCU (context tracking), tracing)
- */
-void irqentry_enter_from_user_mode(struct pt_regs *regs);
-
-/**
- * irqentry_exit_to_user_mode - Interrupt exit work
- * @regs: Pointer to current's pt_regs
- *
- * Invoked with interrupts disabled and fully valid regs. Returns with all
- * work handled, interrupts disabled such that the caller can immediately
- * switch to user mode. Called from architecture specific interrupt
- * handling code.
- *
- * The call order is #2 and #3 as described in syscall_exit_to_user_mode().
- * Interrupt exit is not invoking #1 which is the syscall specific one time
- * work.
- */
-void irqentry_exit_to_user_mode(struct pt_regs *regs);
-
-#ifndef irqentry_state
-/**
- * struct irqentry_state - Opaque object for exception state storage
- * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
- * exit path has to invoke ct_irq_exit().
- * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
- * lockdep state is restored correctly on exit from nmi.
- *
- * This opaque object is filled in by the irqentry_*_enter() functions and
- * must be passed back into the corresponding irqentry_*_exit() functions
- * when the exception is complete.
- *
- * Callers of irqentry_*_[enter|exit]() must consider this structure opaque
- * and all members private. Descriptions of the members are provided to aid in
- * the maintenance of the irqentry_*() functions.
- */
-typedef struct irqentry_state {
- union {
- bool exit_rcu;
- bool lockdep;
- };
-} irqentry_state_t;
-#endif
-
-/**
- * irqentry_enter - Handle state tracking on ordinary interrupt entries
- * @regs: Pointer to pt_regs of interrupted context
- *
- * Invokes:
- * - lockdep irqflag state tracking as low level ASM entry disabled
- * interrupts.
- *
- * - Context tracking if the exception hit user mode.
- *
- * - The hardirq tracer to keep the state consistent as low level ASM
- * entry disabled interrupts.
- *
- * As a precondition, this requires that the entry came from user mode,
- * idle, or a kernel context in which RCU is watching.
- *
- * For kernel mode entries RCU handling is done conditional. If RCU is
- * watching then the only RCU requirement is to check whether the tick has
- * to be restarted. If RCU is not watching then ct_irq_enter() has to be
- * invoked on entry and ct_irq_exit() on exit.
- *
- * Avoiding the ct_irq_enter/exit() calls is an optimization but also
- * solves the problem of kernel mode pagefaults which can schedule, which
- * is not possible after invoking ct_irq_enter() without undoing it.
- *
- * For user mode entries irqentry_enter_from_user_mode() is invoked to
- * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
- * would not be possible.
- *
- * Returns: An opaque object that must be passed to idtentry_exit()
- */
-irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs);
-
-/**
- * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt
- *
- * Conditional reschedule with additional sanity checks.
- */
-void raw_irqentry_exit_cond_resched(void);
-#ifdef CONFIG_PREEMPT_DYNAMIC
-#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched
-#define irqentry_exit_cond_resched_dynamic_disabled NULL
-DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
-#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
-#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void);
-#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
-#endif
-#else /* CONFIG_PREEMPT_DYNAMIC */
-#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
-#endif /* CONFIG_PREEMPT_DYNAMIC */
-
-/**
- * irqentry_exit - Handle return from exception that used irqentry_enter()
- * @regs: Pointer to pt_regs (exception entry regs)
- * @state: Return value from matching call to irqentry_enter()
- *
- * Depending on the return target (kernel/user) this runs the necessary
- * preemption and work checks if possible and required and returns to
- * the caller with interrupts disabled and no further work pending.
- *
- * This is the last action before returning to the low level ASM code which
- * just needs to return to the appropriate context.
- *
- * Counterpart to irqentry_enter().
- */
-void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state);
-
-/**
- * irqentry_nmi_enter - Handle NMI entry
- * @regs: Pointer to currents pt_regs
- *
- * Similar to irqentry_enter() but taking care of the NMI constraints.
- */
-irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs);
-
-/**
- * irqentry_nmi_exit - Handle return from NMI handling
- * @regs: Pointer to pt_regs (NMI entry regs)
- * @irq_state: Return value from matching call to irqentry_nmi_enter()
- *
- * Last action before returning to the low level assembly code.
- *
- * Counterpart to irqentry_nmi_enter().
- */
-void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state);
-
#endif
diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
new file mode 100644
index 000000000000..8af374331900
--- /dev/null
+++ b/include/linux/irq-entry-common.h
@@ -0,0 +1,389 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_IRQENTRYCOMMON_H
+#define __LINUX_IRQENTRYCOMMON_H
+
+#include <linux/static_call_types.h>
+#include <linux/syscalls.h>
+#include <linux/context_tracking.h>
+#include <linux/tick.h>
+#include <linux/kmsan.h>
+
+#include <asm/entry-common.h>
+
+/*
+ * Define dummy _TIF work flags if not defined by the architecture or for
+ * disabled functionality.
+ */
+#ifndef _TIF_PATCH_PENDING
+# define _TIF_PATCH_PENDING (0)
+#endif
+
+/*
+ * TIF flags handled in exit_to_user_mode_loop()
+ */
+#ifndef ARCH_EXIT_TO_USER_MODE_WORK
+# define ARCH_EXIT_TO_USER_MODE_WORK (0)
+#endif
+
+#define EXIT_TO_USER_MODE_WORK \
+ (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+ _TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \
+ _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \
+ ARCH_EXIT_TO_USER_MODE_WORK)
+
+/**
+ * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs
+ * @regs: Pointer to currents pt_regs
+ *
+ * Defaults to an empty implementation. Can be replaced by architecture
+ * specific code.
+ *
+ * Invoked from syscall_enter_from_user_mode() in the non-instrumentable
+ * section. Use __always_inline so the compiler cannot push it out of line
+ * and make it instrumentable.
+ */
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs);
+
+#ifndef arch_enter_from_user_mode
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {}
+#endif
+
+/**
+ * enter_from_user_mode - Establish state when coming from user mode
+ *
+ * Syscall/interrupt entry disables interrupts, but user mode is traced as
+ * interrupts enabled. Also with NO_HZ_FULL RCU might be idle.
+ *
+ * 1) Tell lockdep that interrupts are disabled
+ * 2) Invoke context tracking if enabled to reactivate RCU
+ * 3) Trace interrupts off state
+ *
+ * Invoked from architecture specific syscall entry code with interrupts
+ * disabled. The calling code has to be non-instrumentable. When the
+ * function returns all state is correct and interrupts are still
+ * disabled. The subsequent functions can be instrumented.
+ *
+ * This is invoked when there is architecture specific functionality to be
+ * done between establishing state and enabling interrupts. The caller must
+ * enable interrupts before invoking syscall_enter_from_user_mode_work().
+ */
+static __always_inline void enter_from_user_mode(struct pt_regs *regs)
+{
+ arch_enter_from_user_mode(regs);
+ lockdep_hardirqs_off(CALLER_ADDR0);
+
+ CT_WARN_ON(__ct_state() != CT_STATE_USER);
+ user_exit_irqoff();
+
+ instrumentation_begin();
+ kmsan_unpoison_entry_regs(regs);
+ trace_hardirqs_off_finish();
+ instrumentation_end();
+}
+
+/**
+ * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Defaults to local_irq_enable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
+
+#ifndef local_irq_enable_exit_to_user
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
+{
+ local_irq_enable();
+}
+#endif
+
+/**
+ * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable()
+ *
+ * Defaults to local_irq_disable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_disable_exit_to_user(void);
+
+#ifndef local_irq_disable_exit_to_user
+static inline void local_irq_disable_exit_to_user(void)
+{
+ local_irq_disable();
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode_work - Architecture specific TIF work for exit
+ * to user mode.
+ * @regs: Pointer to currents pt_regs
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_user_mode_loop() with interrupt enabled
+ *
+ * Defaults to NOOP. Can be supplied by architecture specific code.
+ */
+static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work);
+
+#ifndef arch_exit_to_user_mode_work
+static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode_prepare - Architecture specific preparation for
+ * exit to user mode.
+ * @regs: Pointer to currents pt_regs
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last
+ * function before return. Defaults to NOOP.
+ */
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work);
+
+#ifndef arch_exit_to_user_mode_prepare
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode - Architecture specific final work before
+ * exit to user mode.
+ *
+ * Invoked from exit_to_user_mode() with interrupt disabled as the last
+ * function before return. Defaults to NOOP.
+ *
+ * This needs to be __always_inline because it is non-instrumentable code
+ * invoked after context tracking switched to user mode.
+ *
+ * An architecture implementation must not do anything complex, no locking
+ * etc. The main purpose is for speculation mitigations.
+ */
+static __always_inline void arch_exit_to_user_mode(void);
+
+#ifndef arch_exit_to_user_mode
+static __always_inline void arch_exit_to_user_mode(void) { }
+#endif
+
+/**
+ * arch_do_signal_or_restart - Architecture specific signal delivery function
+ * @regs: Pointer to currents pt_regs
+ *
+ * Invoked from exit_to_user_mode_loop().
+ */
+void arch_do_signal_or_restart(struct pt_regs *regs);
+
+/**
+ * exit_to_user_mode_loop - do any pending work before leaving to user space
+ */
+unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
+ unsigned long ti_work);
+
+/**
+ * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
+ * @regs: Pointer to pt_regs on entry stack
+ *
+ * 1) check that interrupts are disabled
+ * 2) call tick_nohz_user_enter_prepare()
+ * 3) call exit_to_user_mode_loop() if any flags from
+ * EXIT_TO_USER_MODE_WORK are set
+ * 4) check that interrupts are still disabled
+ */
+static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
+{
+ unsigned long ti_work;
+
+ lockdep_assert_irqs_disabled();
+
+ /* Flush pending rcuog wakeup before the last need_resched() check */
+ tick_nohz_user_enter_prepare();
+
+ ti_work = read_thread_flags();
+ if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
+ ti_work = exit_to_user_mode_loop(regs, ti_work);
+
+ arch_exit_to_user_mode_prepare(regs, ti_work);
+
+ /* Ensure that kernel state is sane for a return to userspace */
+ kmap_assert_nomap();
+ lockdep_assert_irqs_disabled();
+ lockdep_sys_exit();
+}
+
+/**
+ * exit_to_user_mode - Fixup state when exiting to user mode
+ *
+ * Syscall/interrupt exit enables interrupts, but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about it.
+ *
+ * 1) Trace interrupts on state
+ * 2) Invoke context tracking if enabled to adjust RCU state
+ * 3) Invoke architecture specific last minute exit code, e.g. speculation
+ * mitigations, etc.: arch_exit_to_user_mode()
+ * 4) Tell lockdep that interrupts are enabled
+ *
+ * Invoked from architecture specific code when syscall_exit_to_user_mode()
+ * is not suitable as the last step before returning to userspace. Must be
+ * invoked with interrupts disabled and the caller must be
+ * non-instrumentable.
+ * The caller has to invoke syscall_exit_to_user_mode_work() before this.
+ */
+static __always_inline void exit_to_user_mode(void)
+{
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare();
+ instrumentation_end();
+
+ user_enter_irqoff();
+ arch_exit_to_user_mode();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+}
+
+/**
+ * irqentry_enter_from_user_mode - Establish state before invoking the irq handler
+ * @regs: Pointer to currents pt_regs
+ *
+ * Invoked from architecture specific entry code with interrupts disabled.
+ * Can only be called when the interrupt entry came from user mode. The
+ * calling code must be non-instrumentable. When the function returns all
+ * state is correct and the subsequent functions can be instrumented.
+ *
+ * The function establishes state (lockdep, RCU (context tracking), tracing)
+ */
+void irqentry_enter_from_user_mode(struct pt_regs *regs);
+
+/**
+ * irqentry_exit_to_user_mode - Interrupt exit work
+ * @regs: Pointer to current's pt_regs
+ *
+ * Invoked with interrupts disabled and fully valid regs. Returns with all
+ * work handled, interrupts disabled such that the caller can immediately
+ * switch to user mode. Called from architecture specific interrupt
+ * handling code.
+ *
+ * The call order is #2 and #3 as described in syscall_exit_to_user_mode().
+ * Interrupt exit is not invoking #1 which is the syscall specific one time
+ * work.
+ */
+void irqentry_exit_to_user_mode(struct pt_regs *regs);
+
+#ifndef irqentry_state
+/**
+ * struct irqentry_state - Opaque object for exception state storage
+ * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
+ * exit path has to invoke ct_irq_exit().
+ * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
+ * lockdep state is restored correctly on exit from nmi.
+ *
+ * This opaque object is filled in by the irqentry_*_enter() functions and
+ * must be passed back into the corresponding irqentry_*_exit() functions
+ * when the exception is complete.
+ *
+ * Callers of irqentry_*_[enter|exit]() must consider this structure opaque
+ * and all members private. Descriptions of the members are provided to aid in
+ * the maintenance of the irqentry_*() functions.
+ */
+typedef struct irqentry_state {
+ union {
+ bool exit_rcu;
+ bool lockdep;
+ };
+} irqentry_state_t;
+#endif
+
+/**
+ * irqentry_enter - Handle state tracking on ordinary interrupt entries
+ * @regs: Pointer to pt_regs of interrupted context
+ *
+ * Invokes:
+ * - lockdep irqflag state tracking as low level ASM entry disabled
+ * interrupts.
+ *
+ * - Context tracking if the exception hit user mode.
+ *
+ * - The hardirq tracer to keep the state consistent as low level ASM
+ * entry disabled interrupts.
+ *
+ * As a precondition, this requires that the entry came from user mode,
+ * idle, or a kernel context in which RCU is watching.
+ *
+ * For kernel mode entries RCU handling is done conditional. If RCU is
+ * watching then the only RCU requirement is to check whether the tick has
+ * to be restarted. If RCU is not watching then ct_irq_enter() has to be
+ * invoked on entry and ct_irq_exit() on exit.
+ *
+ * Avoiding the ct_irq_enter/exit() calls is an optimization but also
+ * solves the problem of kernel mode pagefaults which can schedule, which
+ * is not possible after invoking ct_irq_enter() without undoing it.
+ *
+ * For user mode entries irqentry_enter_from_user_mode() is invoked to
+ * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
+ * would not be possible.
+ *
+ * Returns: An opaque object that must be passed to idtentry_exit()
+ */
+irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs);
+
+/**
+ * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt
+ *
+ * Conditional reschedule with additional sanity checks.
+ */
+void raw_irqentry_exit_cond_resched(void);
+#ifdef CONFIG_PREEMPT_DYNAMIC
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched
+#define irqentry_exit_cond_resched_dynamic_disabled NULL
+DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
+#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void);
+#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
+#endif
+#else /* CONFIG_PREEMPT_DYNAMIC */
+#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
+#endif /* CONFIG_PREEMPT_DYNAMIC */
+
+/**
+ * irqentry_exit - Handle return from exception that used irqentry_enter()
+ * @regs: Pointer to pt_regs (exception entry regs)
+ * @state: Return value from matching call to irqentry_enter()
+ *
+ * Depending on the return target (kernel/user) this runs the necessary
+ * preemption and work checks if possible and required and returns to
+ * the caller with interrupts disabled and no further work pending.
+ *
+ * This is the last action before returning to the low level ASM code which
+ * just needs to return to the appropriate context.
+ *
+ * Counterpart to irqentry_enter().
+ */
+void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state);
+
+/**
+ * irqentry_nmi_enter - Handle NMI entry
+ * @regs: Pointer to currents pt_regs
+ *
+ * Similar to irqentry_enter() but taking care of the NMI constraints.
+ */
+irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs);
+
+/**
+ * irqentry_nmi_exit - Handle return from NMI handling
+ * @regs: Pointer to pt_regs (NMI entry regs)
+ * @irq_state: Return value from matching call to irqentry_nmi_enter()
+ *
+ * Last action before returning to the low level assembly code.
+ *
+ * Counterpart to irqentry_nmi_enter().
+ */
+void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state);
+
+#endif
diff --git a/kernel/entry/Makefile b/kernel/entry/Makefile
index 095c775e001e..d38f3a7e7396 100644
--- a/kernel/entry/Makefile
+++ b/kernel/entry/Makefile
@@ -9,5 +9,6 @@ KCOV_INSTRUMENT := n
CFLAGS_REMOVE_common.o = -fstack-protector -fstack-protector-strong
CFLAGS_common.o += -fno-stack-protector
-obj-$(CONFIG_GENERIC_ENTRY) += common.o syscall_user_dispatch.o
+obj-$(CONFIG_GENERIC_IRQ_ENTRY) += common.o
+obj-$(CONFIG_GENERIC_SYSCALL) += syscall-common.o syscall_user_dispatch.o
obj-$(CONFIG_KVM_XFER_TO_GUEST_WORK) += kvm.o
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 20154572ede9..b82032777310 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -1,84 +1,13 @@
// SPDX-License-Identifier: GPL-2.0
-#include <linux/context_tracking.h>
-#include <linux/entry-common.h>
+#include <linux/irq-entry-common.h>
#include <linux/resume_user_mode.h>
#include <linux/highmem.h>
#include <linux/jump_label.h>
#include <linux/kmsan.h>
#include <linux/livepatch.h>
-#include <linux/audit.h>
#include <linux/tick.h>
-#include "common.h"
-
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
-static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
-{
- if (unlikely(audit_context())) {
- unsigned long args[6];
-
- syscall_get_arguments(current, regs, args);
- audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
- }
-}
-
-long syscall_trace_enter(struct pt_regs *regs, long syscall,
- unsigned long work)
-{
- long ret = 0;
-
- /*
- * Handle Syscall User Dispatch. This must comes first, since
- * the ABI here can be something that doesn't make sense for
- * other syscall_work features.
- */
- if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
- if (syscall_user_dispatch(regs))
- return -1L;
- }
-
- /* Handle ptrace */
- if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
- ret = ptrace_report_syscall_entry(regs);
- if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
- return -1L;
- }
-
- /* Do seccomp after ptrace, to catch any tracer changes. */
- if (work & SYSCALL_WORK_SECCOMP) {
- ret = __secure_computing();
- if (ret == -1L)
- return ret;
- }
-
- /* Either of the above might have changed the syscall number */
- syscall = syscall_get_nr(current, regs);
-
- if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
- trace_sys_enter(regs, syscall);
- /*
- * Probes or BPF hooks in the tracepoint may have changed the
- * system call number as well.
- */
- syscall = syscall_get_nr(current, regs);
- }
-
- syscall_enter_audit(regs, syscall);
-
- return ret ? : syscall;
-}
-
-noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
-{
- enter_from_user_mode(regs);
- instrumentation_begin();
- local_irq_enable();
- instrumentation_end();
-}
-
/* Workaround to allow gradual conversion of architecture code */
void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
@@ -133,93 +62,6 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
return ti_work;
}
-/*
- * If SYSCALL_EMU is set, then the only reason to report is when
- * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
- * instruction has been already reported in syscall_enter_from_user_mode().
- */
-static inline bool report_single_step(unsigned long work)
-{
- if (work & SYSCALL_WORK_SYSCALL_EMU)
- return false;
-
- return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
-}
-
-static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
-{
- bool step;
-
- /*
- * If the syscall was rolled back due to syscall user dispatching,
- * then the tracers below are not invoked for the same reason as
- * the entry side was not invoked in syscall_trace_enter(): The ABI
- * of these syscalls is unknown.
- */
- if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
- if (unlikely(current->syscall_dispatch.on_dispatch)) {
- current->syscall_dispatch.on_dispatch = false;
- return;
- }
- }
-
- audit_syscall_exit(regs);
-
- if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
- trace_sys_exit(regs, syscall_get_return_value(current, regs));
-
- step = report_single_step(work);
- if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
- ptrace_report_syscall_exit(regs, step);
-}
-
-/*
- * Syscall specific exit to user mode preparation. Runs with interrupts
- * enabled.
- */
-static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
- unsigned long nr = syscall_get_nr(current, regs);
-
- CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
-
- if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
- if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
- local_irq_enable();
- }
-
- rseq_syscall(regs);
-
- /*
- * Do one-time syscall specific work. If these work items are
- * enabled, we want to run them exactly once per syscall exit with
- * interrupts enabled.
- */
- if (unlikely(work & SYSCALL_WORK_EXIT))
- syscall_exit_work(regs, work);
-}
-
-static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs)
-{
- syscall_exit_to_user_mode_prepare(regs);
- local_irq_disable_exit_to_user();
- exit_to_user_mode_prepare(regs);
-}
-
-void syscall_exit_to_user_mode_work(struct pt_regs *regs)
-{
- __syscall_exit_to_user_mode_work(regs);
-}
-
-__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs)
-{
- instrumentation_begin();
- __syscall_exit_to_user_mode_work(regs);
- instrumentation_end();
- exit_to_user_mode();
-}
-
noinstr void irqentry_enter_from_user_mode(struct pt_regs *regs)
{
enter_from_user_mode(regs);
diff --git a/kernel/entry/syscall-common.c b/kernel/entry/syscall-common.c
new file mode 100644
index 000000000000..88edab20820f
--- /dev/null
+++ b/kernel/entry/syscall-common.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/audit.h>
+#include <linux/entry-common.h>
+#include "common.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
+static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
+{
+ if (unlikely(audit_context())) {
+ unsigned long args[6];
+
+ syscall_get_arguments(current, regs, args);
+ audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
+ }
+}
+
+long syscall_trace_enter(struct pt_regs *regs, long syscall,
+ unsigned long work)
+{
+ long ret = 0;
+
+ /*
+ * Handle Syscall User Dispatch. This must comes first, since
+ * the ABI here can be something that doesn't make sense for
+ * other syscall_work features.
+ */
+ if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+ if (syscall_user_dispatch(regs))
+ return -1L;
+ }
+
+ /* Handle ptrace */
+ if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
+ ret = ptrace_report_syscall_entry(regs);
+ if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
+ return -1L;
+ }
+
+ /* Do seccomp after ptrace, to catch any tracer changes. */
+ if (work & SYSCALL_WORK_SECCOMP) {
+ ret = __secure_computing();
+ if (ret == -1L)
+ return ret;
+ }
+
+ /* Either of the above might have changed the syscall number */
+ syscall = syscall_get_nr(current, regs);
+
+ if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
+ trace_sys_enter(regs, syscall);
+ /*
+ * Probes or BPF hooks in the tracepoint may have changed the
+ * system call number as well.
+ */
+ syscall = syscall_get_nr(current, regs);
+ }
+
+ syscall_enter_audit(regs, syscall);
+
+ return ret ? : syscall;
+}
+
+noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
+{
+ enter_from_user_mode(regs);
+ instrumentation_begin();
+ local_irq_enable();
+ instrumentation_end();
+}
+
+/*
+ * If SYSCALL_EMU is set, then the only reason to report is when
+ * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
+ * instruction has been already reported in syscall_enter_from_user_mode().
+ */
+static inline bool report_single_step(unsigned long work)
+{
+ if (work & SYSCALL_WORK_SYSCALL_EMU)
+ return false;
+
+ return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
+}
+
+static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
+{
+ bool step;
+
+ /*
+ * If the syscall was rolled back due to syscall user dispatching,
+ * then the tracers below are not invoked for the same reason as
+ * the entry side was not invoked in syscall_trace_enter(): The ABI
+ * of these syscalls is unknown.
+ */
+ if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+ if (unlikely(current->syscall_dispatch.on_dispatch)) {
+ current->syscall_dispatch.on_dispatch = false;
+ return;
+ }
+ }
+
+ audit_syscall_exit(regs);
+
+ if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
+ trace_sys_exit(regs, syscall_get_return_value(current, regs));
+
+ step = report_single_step(work);
+ if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
+ ptrace_report_syscall_exit(regs, step);
+}
+
+/*
+ * Syscall specific exit to user mode preparation. Runs with interrupts
+ * enabled.
+ */
+static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
+{
+ unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
+ unsigned long nr = syscall_get_nr(current, regs);
+
+ CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
+
+ if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+ if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
+ local_irq_enable();
+ }
+
+ rseq_syscall(regs);
+
+ /*
+ * Do one-time syscall specific work. If these work items are
+ * enabled, we want to run them exactly once per syscall exit with
+ * interrupts enabled.
+ */
+ if (unlikely(work & SYSCALL_WORK_EXIT))
+ syscall_exit_work(regs, work);
+}
+
+static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs)
+{
+ syscall_exit_to_user_mode_prepare(regs);
+ local_irq_disable_exit_to_user();
+ exit_to_user_mode_prepare(regs);
+}
+
+void syscall_exit_to_user_mode_work(struct pt_regs *regs)
+{
+ __syscall_exit_to_user_mode_work(regs);
+}
+
+__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs)
+{
+ instrumentation_begin();
+ __syscall_exit_to_user_mode_work(regs);
+ instrumentation_end();
+ exit_to_user_mode();
+}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8a478903dea7..09d6712c3ed3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -68,8 +68,8 @@
#include <linux/workqueue_api.h>
#ifdef CONFIG_PREEMPT_DYNAMIC
-# ifdef CONFIG_GENERIC_ENTRY
-# include <linux/entry-common.h>
+# ifdef CONFIG_GENERIC_IRQ_ENTRY
+# include <linux/irq-entry-common.h>
# endif
#endif
@@ -7407,8 +7407,8 @@ EXPORT_SYMBOL(__cond_resched_rwlock_write);
#ifdef CONFIG_PREEMPT_DYNAMIC
-#ifdef CONFIG_GENERIC_ENTRY
-#include <linux/entry-common.h>
+#ifdef CONFIG_GENERIC_IRQ_ENTRY
+#include <linux/irq-entry-common.h>
#endif
/*
--
2.34.1
* [PATCH -next v6 2/8] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
From: Jinjie Ruan @ 2025-02-13 13:00 UTC
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
The generic entry code expects architecture code to provide a
regs_irqs_disabled(regs) function, but arm64 does not have this and
instead provides interrupts_enabled(regs), which has the opposite
polarity.
In preparation for moving arm64 over to the generic entry code,
replace arm64's interrupts_enabled() with regs_irqs_disabled() and
update its callers under arch/arm64.
For the moment, a definition of interrupts_enabled() is provided for
the GICv3 driver. Once arch/arm implements regs_irqs_disabled(), this
can be removed.
Delete the fast_interrupts_enabled() macro as it is unused and we
don't want any new users to show up.
No functional changes.
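To make the polarity flip concrete, a before/after sketch of a typical
caller (illustrative only; the real conversions are in the hunks below):

/* Before: arm64-specific predicate, true when IRQs were enabled. */
if (interrupts_enabled(regs))
	local_irq_enable();

/* After: generic-entry predicate, with the opposite polarity. */
if (!regs_irqs_disabled(regs))
	local_irq_enable();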
Acked-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Define regs_irqs_disabled() by inline function.
- Define interrupts_enabled() in terms of regs_irqs_disabled().
- Delete the fast_interrupts_enabled() macro.
---
arch/arm64/include/asm/daifflags.h | 2 +-
arch/arm64/include/asm/ptrace.h | 9 +++++----
arch/arm64/include/asm/xen/events.h | 2 +-
arch/arm64/kernel/acpi.c | 2 +-
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/entry-common.c | 4 ++--
arch/arm64/kernel/sdei.c | 2 +-
7 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index fbb5c99eb2f9..5fca48009043 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -128,7 +128,7 @@ static inline void local_daif_inherit(struct pt_regs *regs)
{
unsigned long flags = regs->pstate & DAIF_MASK;
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
trace_hardirqs_on();
if (system_uses_irq_prio_masking())
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 47ff8654c5ec..8b915d4a9d4b 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -214,11 +214,12 @@ static inline void forget_syscall(struct pt_regs *regs)
(regs)->pmr == GIC_PRIO_IRQON : \
true)
-#define interrupts_enabled(regs) \
- (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs))
+static __always_inline bool regs_irqs_disabled(const struct pt_regs *regs)
+{
+ return (regs->pstate & PSR_I_BIT) || !irqs_priority_unmasked(regs);
+}
-#define fast_interrupts_enabled(regs) \
- (!((regs)->pstate & PSR_F_BIT))
+#define interrupts_enabled(regs) (!regs_irqs_disabled(regs))
static inline unsigned long user_stack_pointer(struct pt_regs *regs)
{
diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h
index 2788e95d0ff0..2977b5fe068d 100644
--- a/arch/arm64/include/asm/xen/events.h
+++ b/arch/arm64/include/asm/xen/events.h
@@ -14,7 +14,7 @@ enum ipi_vector {
static inline int xen_irqs_disabled(struct pt_regs *regs)
{
- return !interrupts_enabled(regs);
+ return regs_irqs_disabled(regs);
}
#define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index e6f66491fbe9..732f89daae23 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -403,7 +403,7 @@ int apei_claim_sea(struct pt_regs *regs)
return_to_irqs_enabled = !irqs_disabled_flags(arch_local_save_flags());
if (regs)
- return_to_irqs_enabled = interrupts_enabled(regs);
+ return_to_irqs_enabled = !regs_irqs_disabled(regs);
/*
* SEA can interrupt SError, mask it and describe this as an NMI so
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 58f047de3e1c..460c09d03a73 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -231,7 +231,7 @@ static void send_user_sigtrap(int si_code)
if (WARN_ON(!user_mode(regs)))
return;
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
arm64_force_sig_fault(SIGTRAP, si_code, instruction_pointer(regs),
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index b260ddc4d3e9..c547e70428d3 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -73,7 +73,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
{
lockdep_assert_irqs_disabled();
- if (interrupts_enabled(regs)) {
+ if (!regs_irqs_disabled(regs)) {
if (regs->exit_rcu) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
@@ -569,7 +569,7 @@ static void noinstr el1_interrupt(struct pt_regs *regs,
{
write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
- if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs))
+ if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && regs_irqs_disabled(regs))
__el1_pnmi(regs, handler);
else
__el1_irq(regs, handler);
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 255d12f881c2..27a17da635d8 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -247,7 +247,7 @@ unsigned long __kprobes do_sdei_event(struct pt_regs *regs,
* If we interrupted the kernel with interrupts masked, we always go
* back to wherever we came from.
*/
- if (mode == kernel_mode && !interrupts_enabled(regs))
+ if (mode == kernel_mode && regs_irqs_disabled(regs))
return SDEI_EV_HANDLED;
/*
--
2.34.1
* [PATCH -next v6 3/8] arm64: entry: Refactor the entry and exit for exceptions from EL1
From: Jinjie Ruan @ 2025-02-13 13:00 UTC
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
The generic entry code uses irqentry_state_t to track lockdep and RCU
state across exception entry and return. For historical reasons, arm64
embeds similar fields within its pt_regs structure.
In preparation for moving arm64 over to the generic entry code, pull
these fields out of arm64's pt_regs, and use a separate structure,
matching the style of the generic entry code.
No functional changes.
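The resulting pattern threads the state through each handler instead of
stashing it in pt_regs. A minimal sketch of an EL1 handler after this
change (the handler name is hypothetical and the body is elided):

static void noinstr el1_example(struct pt_regs *regs, unsigned long esr)
{
	arm64_irqentry_state_t state = enter_from_kernel_mode(regs);

	/* ... handle the exception ... */

	exit_to_kernel_mode(regs, state);
}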
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- irqentry_state_t -> arm64_irqentry_state_t.
---
arch/arm64/include/asm/ptrace.h | 4 -
arch/arm64/kernel/entry-common.c | 136 +++++++++++++++++++------------
2 files changed, 85 insertions(+), 55 deletions(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 8b915d4a9d4b..65b053a24d82 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -169,10 +169,6 @@ struct pt_regs {
u64 sdei_ttbr1;
struct frame_record_meta stackframe;
-
- /* Only valid for some EL1 exceptions. */
- u64 lockdep_hardirqs;
- u64 exit_rcu;
};
/* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index c547e70428d3..8e597d32433d 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -28,6 +28,13 @@
#include <asm/sysreg.h>
#include <asm/system_misc.h>
+typedef struct irqentry_state {
+ union {
+ bool exit_rcu;
+ bool lockdep;
+ };
+} arm64_irqentry_state_t;
+
/*
* Handle IRQ/context state management when entering from kernel mode.
* Before this function is called it is not safe to call regular kernel code,
@@ -36,29 +43,36 @@
* This is intended to match the logic in irqentry_enter(), handling the kernel
* mode transitions only.
*/
-static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
+static __always_inline arm64_irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
{
- regs->exit_rcu = false;
+ arm64_irqentry_state_t state = {
+ .exit_rcu = false,
+ };
if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
lockdep_hardirqs_off(CALLER_ADDR0);
ct_irq_enter();
trace_hardirqs_off_finish();
- regs->exit_rcu = true;
- return;
+ state.exit_rcu = true;
+ return state;
}
lockdep_hardirqs_off(CALLER_ADDR0);
rcu_irq_enter_check_tick();
trace_hardirqs_off_finish();
+
+ return state;
}
-static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
+static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
{
- __enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = __enter_from_kernel_mode(regs);
+
mte_check_tfsr_entry();
mte_disable_tco_entry(current);
+
+ return state;
}
/*
@@ -69,12 +83,13 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
* This is intended to match the logic in irqentry_exit(), handling the kernel
* mode transitions only, and with preemption handled elsewhere.
*/
-static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
+static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
+ arm64_irqentry_state_t state)
{
lockdep_assert_irqs_disabled();
if (!regs_irqs_disabled(regs)) {
- if (regs->exit_rcu) {
+ if (state.exit_rcu) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
ct_irq_exit();
@@ -84,15 +99,16 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
trace_hardirqs_on();
} else {
- if (regs->exit_rcu)
+ if (state.exit_rcu)
ct_irq_exit();
}
}
-static void noinstr exit_to_kernel_mode(struct pt_regs *regs)
+static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
+ arm64_irqentry_state_t state)
{
mte_check_tfsr_exit();
- __exit_to_kernel_mode(regs);
+ __exit_to_kernel_mode(regs, state);
}
/*
@@ -190,9 +206,11 @@ asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
* mode. Before this function is called it is not safe to call regular kernel
* code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_enter_nmi(struct pt_regs *regs)
+static noinstr arm64_irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
{
- regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
+ arm64_irqentry_state_t state;
+
+ state.lockdep = lockdep_hardirqs_enabled();
__nmi_enter();
lockdep_hardirqs_off(CALLER_ADDR0);
@@ -201,6 +219,8 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
trace_hardirqs_off_finish();
ftrace_nmi_enter();
+
+ return state;
}
/*
@@ -208,19 +228,18 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
* mode. After this function returns it is not safe to call regular kernel
* code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_exit_nmi(struct pt_regs *regs)
+static void noinstr arm64_exit_nmi(struct pt_regs *regs,
+ arm64_irqentry_state_t state)
{
- bool restore = regs->lockdep_hardirqs;
-
ftrace_nmi_exit();
- if (restore) {
+ if (state.lockdep) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
}
ct_nmi_exit();
lockdep_hardirq_exit();
- if (restore)
+ if (state.lockdep)
lockdep_hardirqs_on(CALLER_ADDR0);
__nmi_exit();
}
@@ -230,14 +249,18 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs)
* kernel mode. Before this function is called it is not safe to call regular
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
+static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
{
- regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
+ arm64_irqentry_state_t state;
+
+ state.lockdep = lockdep_hardirqs_enabled();
lockdep_hardirqs_off(CALLER_ADDR0);
ct_nmi_enter();
trace_hardirqs_off_finish();
+
+ return state;
}
/*
@@ -245,17 +268,16 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
* kernel mode. After this function returns it is not safe to call regular
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
+static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
+ arm64_irqentry_state_t state)
{
- bool restore = regs->lockdep_hardirqs;
-
- if (restore) {
+ if (state.lockdep) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
}
ct_nmi_exit();
- if (restore)
+ if (state.lockdep)
lockdep_hardirqs_on(CALLER_ADDR0);
}
@@ -426,78 +448,86 @@ UNHANDLED(el1t, 64, error)
static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ arm64_irqentry_state_t state;
- enter_from_kernel_mode(regs);
+ state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_mem_abort(far, esr, regs);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ arm64_irqentry_state_t state;
- enter_from_kernel_mode(regs);
+ state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_sp_pc_abort(far, esr, regs);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_undef(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_bti(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_gcs(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_mops(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ arm64_irqentry_state_t state;
- arm64_enter_el1_dbg(regs);
+ state = arm64_enter_el1_dbg(regs);
if (!cortex_a76_erratum_1463225_debug_handler(regs))
do_debug_exception(far, esr, regs);
- arm64_exit_el1_dbg(regs);
+ arm64_exit_el1_dbg(regs, state);
}
static void noinstr el1_fpac(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_fpac(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
@@ -546,15 +576,16 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
static __always_inline void __el1_pnmi(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- arm64_enter_nmi(regs);
+ arm64_irqentry_state_t state = arm64_enter_nmi(regs);
+
do_interrupt_handler(regs, handler);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
}
static __always_inline void __el1_irq(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- enter_from_kernel_mode(regs);
+ arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
irq_enter_rcu();
do_interrupt_handler(regs, handler);
@@ -562,7 +593,7 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
arm64_preempt_schedule_irq();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_interrupt(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
@@ -588,11 +619,12 @@ asmlinkage void noinstr el1h_64_fiq_handler(struct pt_regs *regs)
asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
+ arm64_irqentry_state_t state;
local_daif_restore(DAIF_ERRCTX);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
}
static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
@@ -855,12 +887,13 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs)
static void noinstr __el0_error_handler_common(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
+ arm64_irqentry_state_t state;
enter_from_user_mode(regs);
local_daif_restore(DAIF_ERRCTX);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
local_daif_restore(DAIF_PROCCTX);
exit_to_user_mode(regs);
}
@@ -968,6 +1001,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
asmlinkage noinstr unsigned long
__sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
{
+ arm64_irqentry_state_t state;
unsigned long ret;
/*
@@ -992,9 +1026,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
else if (cpu_has_pan())
set_pstate_pan(0);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
ret = do_sdei_event(regs, arg);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
return ret;
}
--
2.34.1
* [PATCH -next v6 4/8] arm64: entry: Rework arm64_preempt_schedule_irq()
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (2 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 3/8] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
@ 2025-02-13 13:00 ` Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 5/8] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
` (4 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Jinjie Ruan @ 2025-02-13 13:00 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
The generic entry code has the form:
| raw_irqentry_exit_cond_resched()
| {
| if (!preempt_count()) {
| ...
| if (need_resched())
| preempt_schedule_irq();
| }
| }
In preparation for moving arm64 over to the generic entry code, align
the structure of the arm64 code with raw_irqentry_exit_cond_resched() from
the generic entry code.
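With the rework, the policy check and the preemption itself are decoupled
at the call site (sketch, matching the diff below):
| if (arm64_preempt_schedule_irq())
|         preempt_schedule_irq();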
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Update the commit message.
---
arch/arm64/kernel/entry-common.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 8e597d32433d..94e4132213ce 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -289,10 +289,10 @@ DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
#endif
-static void __sched arm64_preempt_schedule_irq(void)
+static inline bool arm64_preempt_schedule_irq(void)
{
if (!need_irq_preemption())
- return;
+ return false;
/*
* Note: thread_info::preempt_count includes both thread_info::count
@@ -300,7 +300,7 @@ static void __sched arm64_preempt_schedule_irq(void)
* preempt_count().
*/
if (READ_ONCE(current_thread_info()->preempt_count) != 0)
- return;
+ return false;
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
@@ -309,7 +309,7 @@ static void __sched arm64_preempt_schedule_irq(void)
* DAIF we must have handled an NMI, so skip preemption.
*/
if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return;
+ return false;
/*
* Preempting a task from an IRQ means we leave copies of PSTATE
@@ -319,8 +319,10 @@ static void __sched arm64_preempt_schedule_irq(void)
* Only allow a task to be preempted once cpufeatures have been
* enabled.
*/
- if (system_capabilities_finalized())
- preempt_schedule_irq();
+ if (!system_capabilities_finalized())
+ return false;
+
+ return true;
}
static void do_interrupt_handler(struct pt_regs *regs,
@@ -591,7 +593,8 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- arm64_preempt_schedule_irq();
+ if (arm64_preempt_schedule_irq())
+ preempt_schedule_irq();
exit_to_kernel_mode(regs, state);
}
--
2.34.1
* [PATCH -next v6 5/8] arm64: entry: Use preempt_count() and need_resched() helper
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (3 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 4/8] arm64: entry: Rework arm64_preempt_schedule_irq() Jinjie Ruan
@ 2025-02-13 13:00 ` Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 6/8] arm64: entry: Refactor preempt_schedule_irq() check code Jinjie Ruan
` (3 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Jinjie Ruan @ 2025-02-13 13:00 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
The generic entry code uses preempt_count() and need_resched() helpers to
check whether it should do preempt_schedule_irq(). Currently, arm64 uses its
own check logic, "READ_ONCE(current_thread_info()->preempt_count) == 0",
which is equivalent to "preempt_count() == 0 && need_resched()".
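This equivalence holds because arm64 folds the resched hint into a 64-bit
preempt_count whose upper word is inverted (0 means a resched is needed).
A sketch of the existing thread_info layout, assuming little-endian:
| union {
|         u64 preempt_count;      /* the value the old check read */
|         struct {
|                 u32 count;          /* what preempt_count() returns */
|                 u32 need_resched;   /* 0 when need_resched() is true */
|         } preempt;
| };
The full 64-bit value is therefore zero exactly when both conditions hold.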
In preparation for moving arm64 over to the generic entry code, use
these helpers to replace arm64's own code, and hoist the check earlier.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Update the commit message.
- Move this ahead before we change the preemption logic to
preempt non-IRQ exceptions.
---
arch/arm64/kernel/entry-common.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 94e4132213ce..dceef4cb140b 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -294,14 +294,6 @@ static inline bool arm64_preempt_schedule_irq(void)
if (!need_irq_preemption())
return false;
- /*
- * Note: thread_info::preempt_count includes both thread_info::count
- * and thread_info::need_resched, and is not equivalent to
- * preempt_count().
- */
- if (READ_ONCE(current_thread_info()->preempt_count) != 0)
- return false;
-
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
* priority masking is used the GIC irqchip driver will clear DAIF.IF
@@ -593,8 +585,10 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- if (arm64_preempt_schedule_irq())
- preempt_schedule_irq();
+ if (!preempt_count() && need_resched()) {
+ if (arm64_preempt_schedule_irq())
+ preempt_schedule_irq();
+ }
exit_to_kernel_mode(regs, state);
}
--
2.34.1
* [PATCH -next v6 6/8] arm64: entry: Refactor preempt_schedule_irq() check code
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (4 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 5/8] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
@ 2025-02-13 13:00 ` Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 7/8] arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode() Jinjie Ruan
` (2 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Jinjie Ruan @ 2025-02-13 13:00 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
arm64 requires an additional check to decide whether to reschedule on
return from an interrupt. So add arch_irqentry_exit_need_resched() as a
default implementation that returns true, and hook it into the
need_resched() condition in raw_irqentry_exit_cond_resched(). This allows
arm64 to supply its architecture-specific version when switching over to
the generic entry code.
To align the structure of the code with irqentry_exit_cond_resched()
from the generic entry code, hoist the need_irq_preemption() and
IS_ENABLED() checks earlier, and define separate preemption check
functions depending on whether dynamic preemption is enabled.
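The hook follows the usual generic-entry override pattern: the generic code
declares a default, and an architecture replaces it by defining the symbol
in its asm/entry-common.h, as patch 8/8 later does for arm64 (sketch):
| static inline bool arch_irqentry_exit_need_resched(void)
| {
|         /* e.g. skip preemption when we must have handled an NMI */
|         if (system_uses_irq_prio_masking() && read_sysreg(daif))
|                 return false;
|
|         return true;
| }
| #define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched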
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Update the commit message.
- Hoist the IS_ENABLED() and need_irq_preemption() check earlier.
- Merge the 4 patches.
---
arch/arm64/include/asm/preempt.h | 4 ++++
arch/arm64/kernel/entry-common.c | 35 ++++++++++++++++++--------------
kernel/entry/common.c | 16 ++++++++++++++-
3 files changed, 39 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index 0159b625cc7f..0f0ba250efe8 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -85,6 +85,7 @@ static inline bool should_resched(int preempt_offset)
void preempt_schedule(void);
void preempt_schedule_notrace(void);
+void raw_irqentry_exit_cond_resched(void);
#ifdef CONFIG_PREEMPT_DYNAMIC
DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
@@ -92,11 +93,14 @@ void dynamic_preempt_schedule(void);
#define __preempt_schedule() dynamic_preempt_schedule()
void dynamic_preempt_schedule_notrace(void);
#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
+void dynamic_irqentry_exit_cond_resched(void);
+#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
#else /* CONFIG_PREEMPT_DYNAMIC */
#define __preempt_schedule() preempt_schedule()
#define __preempt_schedule_notrace() preempt_schedule_notrace()
+#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
#endif /* CONFIG_PREEMPT_DYNAMIC */
#endif /* CONFIG_PREEMPTION */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index dceef4cb140b..1b4936d4cf6e 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -281,19 +281,8 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
lockdep_hardirqs_on(CALLER_ADDR0);
}
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#define need_irq_preemption() \
- (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
-#else
-#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
-#endif
-
static inline bool arm64_preempt_schedule_irq(void)
{
- if (!need_irq_preemption())
- return false;
-
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
* priority masking is used the GIC irqchip driver will clear DAIF.IF
@@ -576,6 +565,24 @@ static __always_inline void __el1_pnmi(struct pt_regs *regs,
arm64_exit_nmi(regs, state);
}
+void raw_irqentry_exit_cond_resched(void)
+{
+ if (!preempt_count()) {
+ if (need_resched() && arm64_preempt_schedule_irq())
+ preempt_schedule_irq();
+ }
+}
+
+#ifdef CONFIG_PREEMPT_DYNAMIC
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void)
+{
+ if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+ return;
+ raw_irqentry_exit_cond_resched();
+}
+#endif
+
static __always_inline void __el1_irq(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
@@ -585,10 +592,8 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- if (!preempt_count() && need_resched()) {
- if (arm64_preempt_schedule_irq())
- preempt_schedule_irq();
- }
+ if (IS_ENABLED(CONFIG_PREEMPTION))
+ irqentry_exit_cond_resched();
exit_to_kernel_mode(regs, state);
}
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index b82032777310..4aa9656fa1b4 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -142,6 +142,20 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
return ret;
}
+/**
+ * arch_irqentry_exit_need_resched - Architecture specific need resched function
+ *
+ * Invoked from raw_irqentry_exit_cond_resched() to check whether a resched
+ * is needed; defaults to returning true.
+ *
+ * The main purpose is to permit an arch to skip preempting a task from an IRQ.
+ */
+static inline bool arch_irqentry_exit_need_resched(void);
+
+#ifndef arch_irqentry_exit_need_resched
+static inline bool arch_irqentry_exit_need_resched(void) { return true; }
+#endif
+
void raw_irqentry_exit_cond_resched(void)
{
if (!preempt_count()) {
@@ -149,7 +163,7 @@ void raw_irqentry_exit_cond_resched(void)
rcu_irq_exit_check_preempt();
if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
WARN_ON_ONCE(!on_thread_stack());
- if (need_resched())
+ if (need_resched() && arch_irqentry_exit_need_resched())
preempt_schedule_irq();
}
}
--
2.34.1
* [PATCH -next v6 7/8] arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode()
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (5 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 6/8] arm64: entry: Refactor preempt_schedule_irq() check code Jinjie Ruan
@ 2025-02-13 13:00 ` Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
2025-04-14 9:13 ` [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
8 siblings, 0 replies; 16+ messages in thread
From: Jinjie Ruan @ 2025-02-13 13:00 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
The arm64 entry code only preempts a kernel context upon a return from
a regular IRQ exception. The generic entry code may preempt a kernel
context for any exception return where irqentry_exit() is used, and so
may preempt other exceptions such as faults.
In preparation for moving arm64 over to the generic entry code, align
arm64 with the generic behaviour by calling
arm64_preempt_schedule_irq() from exit_to_kernel_mode(). To make this
possible, arm64_preempt_schedule_irq()
and dynamic/raw_irqentry_exit_cond_resched() are moved earlier in
the file, with no changes.
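With this, the preemption check runs on the common kernel-exit path rather
than only in __el1_irq(); the resulting __exit_to_kernel_mode() looks like
(sketch, condensed from the diff below):
| if (!regs_irqs_disabled(regs)) {
|         if (state.exit_rcu) {
|                 /* ... RCU/lockdep exit for the idle case ... */
|                 return;
|         }
|
|         if (IS_ENABLED(CONFIG_PREEMPTION))
|                 irqentry_exit_cond_resched();
|
|         trace_hardirqs_on();
| }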
As Mark pointed out, this change will have the following two key impacts:
- " We'll preempt even without taking a "real" interrupt. That
shouldn't result in preemption that wasn't possible before,
but it does change the probability of preempting at certain points,
and might have a performance impact, so probably warrants a
benchmark."
- " We will not preempt when taking interrupts from a region of kernel
code where IRQs are enabled but RCU is not watching, matching the
behaviour of the generic entry code.
This has the potential to introduce livelock if we can ever have a
screaming interrupt in such a region, so we'll need to go figure out
whether that's actually a problem.
Having this as a separate patch will make it easier to test/bisect
for that specifically."
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Update the commit message.
---
arch/arm64/kernel/entry-common.c | 92 ++++++++++++++++----------------
1 file changed, 46 insertions(+), 46 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 1b4936d4cf6e..7056c584f59c 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -75,6 +75,49 @@ static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *reg
return state;
}
+static inline bool arm64_preempt_schedule_irq(void)
+{
+ /*
+ * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
+ * priority masking is used the GIC irqchip driver will clear DAIF.IF
+ * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
+ * DAIF we must have handled an NMI, so skip preemption.
+ */
+ if (system_uses_irq_prio_masking() && read_sysreg(daif))
+ return false;
+
+ /*
+ * Preempting a task from an IRQ means we leave copies of PSTATE
+ * on the stack. cpufeature's enable calls may modify PSTATE, but
+ * resuming one of these preempted tasks would undo those changes.
+ *
+ * Only allow a task to be preempted once cpufeatures have been
+ * enabled.
+ */
+ if (!system_capabilities_finalized())
+ return false;
+
+ return true;
+}
+
+void raw_irqentry_exit_cond_resched(void)
+{
+ if (!preempt_count()) {
+ if (need_resched() && arm64_preempt_schedule_irq())
+ preempt_schedule_irq();
+ }
+}
+
+#ifdef CONFIG_PREEMPT_DYNAMIC
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void)
+{
+ if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+ return;
+ raw_irqentry_exit_cond_resched();
+}
+#endif
+
/*
* Handle IRQ/context state management when exiting to kernel mode.
* After this function returns it is not safe to call regular kernel code,
@@ -97,6 +140,9 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
return;
}
+ if (IS_ENABLED(CONFIG_PREEMPTION))
+ irqentry_exit_cond_resched();
+
trace_hardirqs_on();
} else {
if (state.exit_rcu)
@@ -281,31 +327,6 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
lockdep_hardirqs_on(CALLER_ADDR0);
}
-static inline bool arm64_preempt_schedule_irq(void)
-{
- /*
- * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
- * priority masking is used the GIC irqchip driver will clear DAIF.IF
- * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
- * DAIF we must have handled an NMI, so skip preemption.
- */
- if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return false;
-
- /*
- * Preempting a task from an IRQ means we leave copies of PSTATE
- * on the stack. cpufeature's enable calls may modify PSTATE, but
- * resuming one of these preempted tasks would undo those changes.
- *
- * Only allow a task to be preempted once cpufeatures have been
- * enabled.
- */
- if (!system_capabilities_finalized())
- return false;
-
- return true;
-}
-
static void do_interrupt_handler(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
@@ -565,24 +586,6 @@ static __always_inline void __el1_pnmi(struct pt_regs *regs,
arm64_exit_nmi(regs, state);
}
-void raw_irqentry_exit_cond_resched(void)
-{
- if (!preempt_count()) {
- if (need_resched() && arm64_preempt_schedule_irq())
- preempt_schedule_irq();
- }
-}
-
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void)
-{
- if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
- return;
- raw_irqentry_exit_cond_resched();
-}
-#endif
-
static __always_inline void __el1_irq(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
@@ -592,9 +595,6 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- if (IS_ENABLED(CONFIG_PREEMPTION))
- irqentry_exit_cond_resched();
-
exit_to_kernel_mode(regs, state);
}
static void noinstr el1_interrupt(struct pt_regs *regs,
--
2.34.1
* [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (6 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 7/8] arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode() Jinjie Ruan
@ 2025-02-13 13:00 ` Jinjie Ruan
2025-02-27 18:08 ` Valentin Schneider
2025-04-14 9:13 ` [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
8 siblings, 1 reply; 16+ messages in thread
From: Jinjie Ruan @ 2025-02-13 13:00 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
to use the generic entry infrastructure from kernel/entry/*.
The generic entry makes maintainers' work easier and the code
more elegant.
Switch arm64 to the generic IRQ entry first, which removes 100+ lines of
duplicate code and makes lazy preemption available on arm64 by adding a
_TIF_NEED_RESCHED_LAZY bit and enabling ARCH_HAS_PREEMPT_LAZY. The next
patch series will complete the switch to the generic entry. Switching to
the generic entry in two steps, as Mark suggested, makes it easier to
review.
The changes are below (a sketch of the resulting wrappers follows the list):
- Remove *enter_from/exit_to_kernel_mode(), and wrap with generic
irqentry_enter/exit(). Also remove *enter_from/exit_to_user_mode(),
and wrap with generic enter_from/exit_to_user_mode() because they
are exactly the same so far.
- Remove arm64_enter/exit_nmi() and use generic irqentry_nmi_enter/exit()
because they're exactly the same, so the temporary arm64 version
irqentry_state can also be removed.
- Remove PREEMPT_DYNAMIC code, as the generic entry does the same thing
if arm64 implements arch_irqentry_exit_need_resched().
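Taken together, the arm64 entry/exit wrappers become thin shims over the
generic helpers, e.g. (a sketch condensed from the diff below):
| static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
| {
|         irqentry_state_t state = irqentry_enter(regs);
|
|         mte_check_tfsr_entry();
|         mte_disable_tco_entry(current);
|
|         return state;
| }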
Tested OK with the following test cases on the QEMU virt platform:
- Perf tests.
- Different `dynamic preempt` mode switch.
- Pseudo NMI tests.
- Stress-ng CPU stress test.
- MTE test case in Documentation/arch/arm64/memory-tagging-extension.rst
and all test cases in tools/testing/selftests/arm64/mte/*.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v6:
- Remove arch_exit_to_user_mode_prepare() and pull local_daif_mask() later
in the arm64 exit sequence so that we can have it explicit
in entry-common.c.
- Update the commit message.
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/entry-common.h | 56 +++++
arch/arm64/include/asm/preempt.h | 6 -
arch/arm64/kernel/entry-common.c | 350 +++++++-------------------
arch/arm64/kernel/signal.c | 3 +-
5 files changed, 143 insertions(+), 273 deletions(-)
create mode 100644 arch/arm64/include/asm/entry-common.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c997b27b7da1..f234e3e9e956 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -150,6 +150,7 @@ config ARM64
select GENERIC_EARLY_IOREMAP
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
+ select GENERIC_IRQ_ENTRY
select GENERIC_IRQ_IPI
select GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD
select GENERIC_IRQ_PROBE
diff --git a/arch/arm64/include/asm/entry-common.h b/arch/arm64/include/asm/entry-common.h
new file mode 100644
index 000000000000..93c30b8d653d
--- /dev/null
+++ b/arch/arm64/include/asm/entry-common.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_ARM64_ENTRY_COMMON_H
+#define _ASM_ARM64_ENTRY_COMMON_H
+
+#include <linux/thread_info.h>
+
+#include <asm/daifflags.h>
+#include <asm/fpsimd.h>
+#include <asm/mte.h>
+#include <asm/stacktrace.h>
+
+#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
+
+static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+ if (ti_work & _TIF_MTE_ASYNC_FAULT) {
+ clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+ send_sig_fault(SIGSEGV, SEGV_MTEAERR, (void __user *)NULL, current);
+ }
+
+ if (ti_work & _TIF_FOREIGN_FPSTATE)
+ fpsimd_restore_current_state();
+}
+
+#define arch_exit_to_user_mode_work arch_exit_to_user_mode_work
+
+static inline bool arch_irqentry_exit_need_resched(void)
+{
+ /*
+ * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
+ * priority masking is used the GIC irqchip driver will clear DAIF.IF
+ * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
+ * DAIF we must have handled an NMI, so skip preemption.
+ */
+ if (system_uses_irq_prio_masking() && read_sysreg(daif))
+ return false;
+
+ /*
+ * Preempting a task from an IRQ means we leave copies of PSTATE
+ * on the stack. cpufeature's enable calls may modify PSTATE, but
+ * resuming one of these preempted tasks would undo those changes.
+ *
+ * Only allow a task to be preempted once cpufeatures have been
+ * enabled.
+ */
+ if (!system_capabilities_finalized())
+ return false;
+
+ return true;
+}
+
+#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched
+
+#endif /* _ASM_ARM64_ENTRY_COMMON_H */
diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index 0f0ba250efe8..932ea4b62042 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -2,7 +2,6 @@
#ifndef __ASM_PREEMPT_H
#define __ASM_PREEMPT_H
-#include <linux/jump_label.h>
#include <linux/thread_info.h>
#define PREEMPT_NEED_RESCHED BIT(32)
@@ -85,22 +84,17 @@ static inline bool should_resched(int preempt_offset)
void preempt_schedule(void);
void preempt_schedule_notrace(void);
-void raw_irqentry_exit_cond_resched(void);
#ifdef CONFIG_PREEMPT_DYNAMIC
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
void dynamic_preempt_schedule(void);
#define __preempt_schedule() dynamic_preempt_schedule()
void dynamic_preempt_schedule_notrace(void);
#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
-void dynamic_irqentry_exit_cond_resched(void);
-#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
#else /* CONFIG_PREEMPT_DYNAMIC */
#define __preempt_schedule() preempt_schedule()
#define __preempt_schedule_notrace() preempt_schedule_notrace()
-#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
#endif /* CONFIG_PREEMPT_DYNAMIC */
#endif /* CONFIG_PREEMPTION */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 7056c584f59c..c3583524c37a 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -6,6 +6,7 @@
*/
#include <linux/context_tracking.h>
+#include <linux/irq-entry-common.h>
#include <linux/kasan.h>
#include <linux/linkage.h>
#include <linux/lockdep.h>
@@ -28,13 +29,6 @@
#include <asm/sysreg.h>
#include <asm/system_misc.h>
-typedef struct irqentry_state {
- union {
- bool exit_rcu;
- bool lockdep;
- };
-} arm64_irqentry_state_t;
-
/*
* Handle IRQ/context state management when entering from kernel mode.
* Before this function is called it is not safe to call regular kernel code,
@@ -43,31 +37,14 @@ typedef struct irqentry_state {
* This is intended to match the logic in irqentry_enter(), handling the kernel
* mode transitions only.
*/
-static __always_inline arm64_irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
+static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
{
- arm64_irqentry_state_t state = {
- .exit_rcu = false,
- };
-
- if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
- lockdep_hardirqs_off(CALLER_ADDR0);
- ct_irq_enter();
- trace_hardirqs_off_finish();
-
- state.exit_rcu = true;
- return state;
- }
-
- lockdep_hardirqs_off(CALLER_ADDR0);
- rcu_irq_enter_check_tick();
- trace_hardirqs_off_finish();
-
- return state;
+ return irqentry_enter(regs);
}
-static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
+static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
{
- arm64_irqentry_state_t state = __enter_from_kernel_mode(regs);
+ irqentry_state_t state = __enter_from_kernel_mode(regs);
mte_check_tfsr_entry();
mte_disable_tco_entry(current);
@@ -75,49 +52,6 @@ static noinstr arm64_irqentry_state_t enter_from_kernel_mode(struct pt_regs *reg
return state;
}
-static inline bool arm64_preempt_schedule_irq(void)
-{
- /*
- * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
- * priority masking is used the GIC irqchip driver will clear DAIF.IF
- * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
- * DAIF we must have handled an NMI, so skip preemption.
- */
- if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return false;
-
- /*
- * Preempting a task from an IRQ means we leave copies of PSTATE
- * on the stack. cpufeature's enable calls may modify PSTATE, but
- * resuming one of these preempted tasks would undo those changes.
- *
- * Only allow a task to be preempted once cpufeatures have been
- * enabled.
- */
- if (!system_capabilities_finalized())
- return false;
-
- return true;
-}
-
-void raw_irqentry_exit_cond_resched(void)
-{
- if (!preempt_count()) {
- if (need_resched() && arm64_preempt_schedule_irq())
- preempt_schedule_irq();
- }
-}
-
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void)
-{
- if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
- return;
- raw_irqentry_exit_cond_resched();
-}
-#endif
-
/*
* Handle IRQ/context state management when exiting to kernel mode.
* After this function returns it is not safe to call regular kernel code,
@@ -127,31 +61,13 @@ void dynamic_irqentry_exit_cond_resched(void)
* mode transitions only, and with preemption handled elsewhere.
*/
static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
- arm64_irqentry_state_t state)
-{
- lockdep_assert_irqs_disabled();
-
- if (!regs_irqs_disabled(regs)) {
- if (state.exit_rcu) {
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- ct_irq_exit();
- lockdep_hardirqs_on(CALLER_ADDR0);
- return;
- }
-
- if (IS_ENABLED(CONFIG_PREEMPTION))
- irqentry_exit_cond_resched();
-
- trace_hardirqs_on();
- } else {
- if (state.exit_rcu)
- ct_irq_exit();
- }
+ irqentry_state_t state)
+{
+ irqentry_exit(regs, state);
}
static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
- arm64_irqentry_state_t state)
+ irqentry_state_t state)
{
mte_check_tfsr_exit();
__exit_to_kernel_mode(regs, state);
@@ -162,18 +78,15 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
* Before this function is called it is not safe to call regular kernel code,
* instrumentable code, or any code which may trigger an exception.
*/
-static __always_inline void __enter_from_user_mode(void)
+static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
- CT_WARN_ON(ct_state() != CT_STATE_USER);
- user_exit_irqoff();
- trace_hardirqs_off_finish();
+ enter_from_user_mode(regs);
mte_disable_tco_entry(current);
}
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
+static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
{
- __enter_from_user_mode();
+ __enter_from_user_mode(regs);
}
/*
@@ -181,113 +94,18 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
* After this function returns it is not safe to call regular kernel code,
* instrumentable code, or any code which may trigger an exception.
*/
-static __always_inline void __exit_to_user_mode(void)
-{
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- user_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
-}
-
-static void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
-{
- do {
- local_irq_enable();
-
- if (thread_flags & _TIF_NEED_RESCHED)
- schedule();
-
- if (thread_flags & _TIF_UPROBE)
- uprobe_notify_resume(regs);
-
- if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
- clear_thread_flag(TIF_MTE_ASYNC_FAULT);
- send_sig_fault(SIGSEGV, SEGV_MTEAERR,
- (void __user *)NULL, current);
- }
-
- if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
- do_signal(regs);
-
- if (thread_flags & _TIF_NOTIFY_RESUME)
- resume_user_mode_work(regs);
-
- if (thread_flags & _TIF_FOREIGN_FPSTATE)
- fpsimd_restore_current_state();
-
- local_irq_disable();
- thread_flags = read_thread_flags();
- } while (thread_flags & _TIF_WORK_MASK);
-}
-
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
+static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
{
- unsigned long flags;
-
local_irq_disable();
-
- flags = read_thread_flags();
- if (unlikely(flags & _TIF_WORK_MASK))
- do_notify_resume(regs, flags);
-
- local_daif_mask();
-
- lockdep_sys_exit();
-}
-
-static __always_inline void exit_to_user_mode(struct pt_regs *regs)
-{
exit_to_user_mode_prepare(regs);
+ local_daif_mask();
mte_check_tfsr_exit();
- __exit_to_user_mode();
+ exit_to_user_mode();
}
asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
{
- exit_to_user_mode(regs);
-}
-
-/*
- * Handle IRQ/context state management when entering an NMI from user/kernel
- * mode. Before this function is called it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static noinstr arm64_irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
-{
- arm64_irqentry_state_t state;
-
- state.lockdep = lockdep_hardirqs_enabled();
-
- __nmi_enter();
- lockdep_hardirqs_off(CALLER_ADDR0);
- lockdep_hardirq_enter();
- ct_nmi_enter();
-
- trace_hardirqs_off_finish();
- ftrace_nmi_enter();
-
- return state;
-}
-
-/*
- * Handle IRQ/context state management when exiting an NMI from user/kernel
- * mode. After this function returns it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static void noinstr arm64_exit_nmi(struct pt_regs *regs,
- arm64_irqentry_state_t state)
-{
- ftrace_nmi_exit();
- if (state.lockdep) {
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- }
-
- ct_nmi_exit();
- lockdep_hardirq_exit();
- if (state.lockdep)
- lockdep_hardirqs_on(CALLER_ADDR0);
- __nmi_exit();
+ arm64_exit_to_user_mode(regs);
}
/*
@@ -295,9 +113,9 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs,
* kernel mode. Before this function is called it is not safe to call regular
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
-static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
+static noinstr irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
{
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
state.lockdep = lockdep_hardirqs_enabled();
@@ -315,7 +133,7 @@ static noinstr arm64_irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
- arm64_irqentry_state_t state)
+ irqentry_state_t state)
{
if (state.lockdep) {
trace_hardirqs_on_prepare();
@@ -346,7 +164,7 @@ extern void (*handle_arch_fiq)(struct pt_regs *);
static void noinstr __panic_unhandled(struct pt_regs *regs, const char *vector,
unsigned long esr)
{
- arm64_enter_nmi(regs);
+ irqentry_nmi_enter(regs);
console_verbose();
@@ -452,7 +270,7 @@ UNHANDLED(el1t, 64, error)
static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
@@ -464,7 +282,7 @@ static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
@@ -475,7 +293,7 @@ static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_undef(regs, esr);
@@ -485,7 +303,7 @@ static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_bti(regs, esr);
@@ -495,7 +313,7 @@ static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_gcs(regs, esr);
@@ -505,7 +323,7 @@ static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_mops(regs, esr);
@@ -516,7 +334,7 @@ static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
state = arm64_enter_el1_dbg(regs);
if (!cortex_a76_erratum_1463225_debug_handler(regs))
@@ -526,7 +344,7 @@ static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
static void noinstr el1_fpac(struct pt_regs *regs, unsigned long esr)
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_fpac(regs, esr);
@@ -580,16 +398,16 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
static __always_inline void __el1_pnmi(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- arm64_irqentry_state_t state = arm64_enter_nmi(regs);
+ irqentry_state_t state = irqentry_nmi_enter(regs);
do_interrupt_handler(regs, handler);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
}
static __always_inline void __el1_irq(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- arm64_irqentry_state_t state = enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
irq_enter_rcu();
do_interrupt_handler(regs, handler);
@@ -621,22 +439,22 @@ asmlinkage void noinstr el1h_64_fiq_handler(struct pt_regs *regs)
asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
local_daif_restore(DAIF_ERRCTX);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
}
static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_mem_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
@@ -651,50 +469,50 @@ static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
if (!is_ttbr0_addr(far))
arm64_apply_bp_hardening();
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_mem_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpsimd_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_fpsimd_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sve_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sme_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_fpsimd_exc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sys(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_sys(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
@@ -704,58 +522,58 @@ static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
if (!is_ttbr0_addr(instruction_pointer(regs)))
arm64_apply_bp_hardening();
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sp_pc_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sp(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sp_pc_abort(regs->sp, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_undef(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_undef(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_bti(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_bti(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_mops(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_gcs(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_gcs(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
bad_el0_sync(regs, 0, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
@@ -763,28 +581,28 @@ static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
/* Only watchpoints write FAR_EL1, otherwise its UNKNOWN */
unsigned long far = read_sysreg(far_el1);
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
do_debug_exception(far, esr, regs);
local_daif_restore(DAIF_PROCCTX);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_svc(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
fp_user_discard();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_fpac(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
@@ -852,7 +670,7 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
static void noinstr el0_interrupt(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
@@ -863,7 +681,7 @@ static void noinstr el0_interrupt(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr __el0_irq_handler_common(struct pt_regs *regs)
@@ -889,15 +707,15 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs)
static void noinstr __el0_error_handler_common(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_ERRCTX);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
local_daif_restore(DAIF_PROCCTX);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
@@ -908,19 +726,19 @@ asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
#ifdef CONFIG_COMPAT
static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_cp15(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_svc_compat(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc_compat(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
@@ -994,7 +812,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
unsigned long esr = read_sysreg(esr_el1);
unsigned long far = read_sysreg(far_el1);
- arm64_enter_nmi(regs);
+ irqentry_nmi_enter(regs);
panic_bad_stack(regs, esr, far);
}
#endif /* CONFIG_VMAP_STACK */
@@ -1003,7 +821,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
asmlinkage noinstr unsigned long
__sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
{
- arm64_irqentry_state_t state;
+ irqentry_state_t state;
unsigned long ret;
/*
@@ -1028,9 +846,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
else if (cpu_has_pan())
set_pstate_pan(0);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
ret = do_sdei_event(regs, arg);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
return ret;
}
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 99ea26d400ff..e1c1abc2cb3f 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -9,6 +9,7 @@
#include <linux/cache.h>
#include <linux/compat.h>
#include <linux/errno.h>
+#include <linux/irq-entry-common.h>
#include <linux/kernel.h>
#include <linux/signal.h>
#include <linux/freezer.h>
@@ -1616,7 +1617,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* the kernel can handle, and then we build all the user-level signal handling
* stack-frames in one go after that.
*/
-void do_signal(struct pt_regs *regs)
+void arch_do_signal_or_restart(struct pt_regs *regs)
{
unsigned long continue_addr = 0, restart_addr = 0;
int retval = 0;
--
2.34.1
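The signal.c hunk above shows the contract change most directly: with generic entry, signal delivery is invoked from the generic exit-to-user work loop via arch_do_signal_or_restart() rather than an arm64-private do_signal(). A condensed sketch of that loop, based on exit_to_user_mode_loop() in kernel/entry/common.c (simplified; several work bits are omitted):

	static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
						    unsigned long ti_work)
	{
		/* Loop until no exit work is pending; IRQs are disabled on entry. */
		while (ti_work & EXIT_TO_USER_MODE_WORK) {
			local_irq_enable_exit_to_user(ti_work);

			if (ti_work & _TIF_NEED_RESCHED)
				schedule();

			if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
				arch_do_signal_or_restart(regs);	/* was do_signal() on arm64 */

			/* ... uprobe, notify-resume and live-patching work elided ... */

			local_irq_disable_exit_to_user();
			ti_work = read_thread_flags();
		}
		return ti_work;
	}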
* Re: [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry
2025-02-13 13:00 ` [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
@ 2025-02-27 18:08 ` Valentin Schneider
2025-02-27 18:35 ` Mark Rutland
0 siblings, 1 reply; 16+ messages in thread
From: Valentin Schneider @ 2025-02-27 18:08 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, sstabellini, tglx,
peterz, luto, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, kees, aliceryhl,
ojeda, samitolvanen, masahiroy, rppt, xur, paulmck, arnd,
mark.rutland, puranjay, broonie, mbenes, sudeep.holla, guohanjun,
prarit, liuwei09, Jonathan.Cameron, dwmw, kristina.martsenko,
liaochang1, ptosi, thiago.bauermann, kevin.brodsky, Dave.Martin,
joey.gouly, linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie
On 13/02/25 21:00, Jinjie Ruan wrote:
> Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
> to use the generic entry infrastructure from kernel/entry/*.
> The generic entry makes maintainers' work easier and the code
> more elegant.
>
> Switch arm64 to generic IRQ entry first, which removes 100+ lines of
> duplicate code and makes lazy preemption on arm64 available by adding a
> _TIF_NEED_RESCHED_LAZY bit and enabling ARCH_HAS_PREEMPT_LAZY.
Just a drive-by comment as I'm interested in lazy preemption for arm64;
this series doesn't actually enable lazy preemption, is that for a
follow-up with the rest of the generic entry stuff?
From a quick glance, it looks like everything is in place for enabling it.
> The next
> patch series will switch to generic entry completely later. Switching to
> generic entry in two steps, as Mark suggested, will make it
> easier to review.
>
> The changes are below:
> - Remove *enter_from/exit_to_kernel_mode(), and wrap with generic
> irqentry_enter/exit(). Also remove *enter_from/exit_to_user_mode(),
> and wrap with generic enter_from/exit_to_user_mode() because they
> are exactly the same so far.
>
> - Remove arm64_enter/exit_nmi() and use generic irqentry_nmi_enter/exit()
> because they're exactly the same, so the temporary arm64 version of
> irqentry_state_t can also be removed.
>
> - Remove the PREEMPT_DYNAMIC code, as the generic entry code does the same
> thing if arm64 implements arch_irqentry_exit_need_resched().
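(For context, a sketch of what that hook can look like on arm64, reconstructed from the checks the existing arm64_preempt_schedule_irq() performs before allowing preemption on IRQ exit; the exact shape in the patch may differ:)

	static inline bool arch_irqentry_exit_need_resched(void)
	{
		/*
		 * DAIF.IF is left set when handling an NMI under GIC
		 * priority masking, so anything set in DAIF means we are
		 * on an NMI path and must not preempt.
		 */
		if (system_uses_irq_prio_masking() && read_sysreg(daif))
			return false;

		/*
		 * Preempting a task from an IRQ means we leave copies of
		 * PSTATE on the stack; cpufeature enable calls may modify
		 * PSTATE, so only allow preemption once cpufeatures have
		 * been finalized.
		 */
		if (!system_capabilities_finalized())
			return false;

		return true;
	}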
* Re: [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry
2025-02-27 18:08 ` Valentin Schneider
@ 2025-02-27 18:35 ` Mark Rutland
2025-02-27 21:29 ` Valentin Schneider
0 siblings, 1 reply; 16+ messages in thread
From: Mark Rutland @ 2025-02-27 18:35 UTC (permalink / raw)
To: Valentin Schneider
Cc: Jinjie Ruan, catalin.marinas, will, oleg, sstabellini, tglx,
peterz, luto, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, kees, aliceryhl,
ojeda, samitolvanen, masahiroy, rppt, xur, paulmck, arnd,
puranjay, broonie, mbenes, sudeep.holla, guohanjun, prarit,
liuwei09, Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1,
ptosi, thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
On Thu, Feb 27, 2025 at 07:08:56PM +0100, Valentin Schneider wrote:
> On 13/02/25 21:00, Jinjie Ruan wrote:
> > Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
> > to use the generic entry infrastructure from kernel/entry/*.
> > The generic entry makes maintainers' work easier and the code
> > more elegant.
> >
> > Switch arm64 to generic IRQ entry first, which removes 100+ lines of
> > duplicate code and makes lazy preemption on arm64 available by adding a
> > _TIF_NEED_RESCHED_LAZY bit and enabling ARCH_HAS_PREEMPT_LAZY.
>
> Just a drive-by comment as I'm interested in lazy preemption for arm64;
> this series doesn't actually enable lazy preemption, is that for a
> follow-up with the rest of the generic entry stuff?
>
> From a quick glance, it looks like everything is in place for enabling it.
Sorry, there's been some fractured discussion on this on the
linux-rt-users list:
https://lore.kernel.org/linux-rt-users/20241216190451.1c61977c@mordecai.tesarici.cz/
The TL;DR is that lazy preemption doesn't actually depend on generic
entry, and we should be able to enable it on arm64 independently of this
series. I'd posted a quick hack which Mike Galbraith cleaned up:
https://lore.kernel.org/linux-rt-users/a198a7dd9076f97b89d8882bb249b3bf303564ef.camel@gmx.de/
... but that was never posted as a new thread to LAKML.
Would you be happy to take charge and take that patch, test it, and post
it here (or spin your own working version)? I was happy with the way it
looks but hadn't had the time for testing and so on.
I expect that we'll merge the generic entry code too, but having them
separate is a bit easier for bisection and backporting where people want
lazy preemption in downstream trees.
Mark.
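(For readers tracking the lazy-preemption side: the enablement described above is small. A sketch under stated assumptions; the TIF bit number below is purely illustrative, and the authoritative version is the patch linked in the linux-rt-users thread:)

	/* arch/arm64/Kconfig: select ARCH_HAS_PREEMPT_LAZY */

	/* arch/arm64/include/asm/thread_info.h, sketch only: */
	#define TIF_NEED_RESCHED_LAZY	5	/* illustrative bit number */
	#define _TIF_NEED_RESCHED_LAZY	(1 << TIF_NEED_RESCHED_LAZY)

	/*
	 * _TIF_NEED_RESCHED_LAZY is added to _TIF_WORK_MASK so the
	 * return-to-user path honours it, while in-kernel preemption
	 * points keep testing _TIF_NEED_RESCHED alone; deferring the
	 * reschedule to the user return is what makes it "lazy".
	 */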
* Re: [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry
2025-02-27 18:35 ` Mark Rutland
@ 2025-02-27 21:29 ` Valentin Schneider
0 siblings, 0 replies; 16+ messages in thread
From: Valentin Schneider @ 2025-02-27 21:29 UTC (permalink / raw)
To: Mark Rutland
Cc: Jinjie Ruan, catalin.marinas, will, oleg, sstabellini, tglx,
peterz, luto, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, kees, aliceryhl,
ojeda, samitolvanen, masahiroy, rppt, xur, paulmck, arnd,
puranjay, broonie, mbenes, sudeep.holla, guohanjun, prarit,
liuwei09, Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1,
ptosi, thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
On 27/02/25 18:35, Mark Rutland wrote:
> On Thu, Feb 27, 2025 at 07:08:56PM +0100, Valentin Schneider wrote:
>> On 13/02/25 21:00, Jinjie Ruan wrote:
>> > Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
>> > to use the generic entry infrastructure from kernel/entry/*.
>> > The generic entry makes maintainers' work easier and the code
>> > more elegant.
>> >
>> > Switch arm64 to generic IRQ entry first, which removes 100+ lines of
>> > duplicate code and makes lazy preemption on arm64 available by adding a
>> > _TIF_NEED_RESCHED_LAZY bit and enabling ARCH_HAS_PREEMPT_LAZY.
>>
>> Just a drive-by comment as I'm interested in lazy preemption for arm64;
>> this series doesn't actually enable lazy preemption, is that for a
>> follow-up with the rest of the generic entry stuff?
>>
>> From a quick glance, it looks like everything is in place for enabling it.
>
> Sorry, there's been some fractured discussion on this on the
> linux-rt-users list:
>
> https://lore.kernel.org/linux-rt-users/20241216190451.1c61977c@mordecai.tesarici.cz/
>
> The TL;DR is that lazy preemption doesn't actually depend on generic
> entry, and we should be able to enable it on arm64 independently of this
> series. I'd posted a quick hack which Mike Galbraith cleaned up:
>
> https://lore.kernel.org/linux-rt-users/a198a7dd9076f97b89d8882bb249b3bf303564ef.camel@gmx.de/
>
> ... but that was never posted as a new thread to LAKML.
>
Thanks for the breadcrumbs!
> Would you be happy to take charge and take that patch, test it, and post
> it here (or spin your own working version)? I was happy with the way it
> looks but hadn't had the time for testing and so on.
>
Sure, looks straightforward enough, I'll pick this up.
> I expect that we'll merge the generic entry code too, but having them
> separate is a bit easier for bisection and backporting where people want
> lazy preemption in downstream trees.
>
> Mark.
* Re: [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry
2025-02-13 13:00 ` [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry Jinjie Ruan
@ 2025-03-20 14:26 ` Linus Walleij
2025-04-28 14:46 ` Linus Walleij
0 siblings, 1 reply; 16+ messages in thread
From: Linus Walleij @ 2025-03-20 14:26 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
Cc: ruanjinjie, Linus Walleij
> Currently CONFIG_GENERIC_ENTRY enables both the generic exception
> entry logic and the generic syscall entry logic, which are otherwise
> loosely coupled.
>
> Introduce separate config options for these so that architectures can
> select the two independently. This will make it easier for
> architectures to migrate to generic entry code.
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
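(A minimal sketch of the two halves being decoupled; the irqentry_*() and syscall_*() helpers are the existing generic entry API, while the example handlers are hypothetical:)

	/* Exception half: all an arch needs for interrupt/fault entry. */
	static void example_irq_handler(struct pt_regs *regs)
	{
		irqentry_state_t state = irqentry_enter(regs);
		/* ... handle the interrupt ... */
		irqentry_exit(regs, state);
	}

	/* Syscall half: only needed once the syscall path is converted too. */
	static void example_syscall_handler(struct pt_regs *regs, long nr)
	{
		nr = syscall_enter_from_user_mode(regs, nr);
		/* ... dispatch the system call ... */
		syscall_exit_to_user_mode(regs);
	}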
If the generic entry maintainer (hi Thomas) thinks this patch is
OK it would be nice to have it applied because it will also make
the ARM32 transition to generic entry easier as we can work on
exception and syscall entry each in isolation.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Yours,
Linus Walleij
* Re: [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
` (7 preceding siblings ...)
2025-02-13 13:00 ` [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
@ 2025-04-14 9:13 ` Jinjie Ruan
8 siblings, 0 replies; 16+ messages in thread
From: Jinjie Ruan @ 2025-04-14 9:13 UTC (permalink / raw)
To: catalin.marinas, will, oleg, sstabellini, tglx, peterz, luto,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kees, aliceryhl, ojeda, samitolvanen,
masahiroy, rppt, xur, paulmck, arnd, mark.rutland, puranjay,
broonie, mbenes, sudeep.holla, guohanjun, prarit, liuwei09,
Jonathan.Cameron, dwmw, kristina.martsenko, liaochang1, ptosi,
thiago.bauermann, kevin.brodsky, Dave.Martin, joey.gouly,
linux-kernel, linux-arm-kernel, xen-devel
On 2025/2/13 20:59, Jinjie Ruan wrote:
> Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
> to use the generic entry infrastructure from kernel/entry/*. The generic
> entry makes maintainers' work easier and the code more elegant, and will
> make PREEMPT_DYNAMIC and PREEMPT_LAZY available while removing a lot of
> duplicate code.
>
> This patch series splits the generic entry into generic irq entry and
> generic syscall entry first, and then converts arm64 to use the generic
> irq entry. arm64 will be completely converted to generic entry in another
> patch series.
>
> The main conversion steps are as follows:
> - Split generic entry into generic irq entry and generic syscall entry so
> that each patch concentrates on switching to one thing.
> - Make it easier for arm64 to use irqentry_enter/exit().
> - Bring arm64 closer to the PREEMPT_DYNAMIC code of the generic entry.
> - Switch to the generic irq entry.
Gentle Ping.
* Re: [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry
2025-03-20 14:26 ` Linus Walleij
@ 2025-04-28 14:46 ` Linus Walleij
2025-04-29 15:03 ` Will Deacon
0 siblings, 1 reply; 16+ messages in thread
From: Linus Walleij @ 2025-04-28 14:46 UTC (permalink / raw)
To: tglx
Cc: ruanjinjie, catalin.marinas, will, oleg, sstabellini, peterz,
luto, linux-kernel, arnd, paulmck, mingo, dietmar.eggemann,
linux-arm-kernel, mark.rutland, joey.gouly, kevin.brodsky,
dave.martin
Hi Thomas,
On Thu, Mar 20, 2025 at 3:26 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> > Currently CONFIG_GENERIC_ENTRY enables both the generic exception
> > entry logic and the generic syscall entry logic, which are otherwise
> > loosely coupled.
> >
> > Introduce separate config options for these so that architectures can
> > select the two independently. This will make it easier for
> > architectures to migrate to generic entry code.
> >
> > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>
> If the generic entry maintainer (hi Thomas) thinks this patch is
> OK it would be nice to have it applied because it will also make
> the ARM32 transition to generic entry easier as we can work on
> exception and syscall entry each in isolation.
>
> Acked-by: Linus Walleij <linus.walleij@linaro.org>
Do you think this patch is something you could queue on some
tip branch for v6.16?
I don't know if either arm64 or arm32 is going anywhere with
generic entry for v6.16, but having this in-tree makes it easier
to split the task going forward. We can always pull it into some
branch or just wait for v6.16-rc1 to get this patch under our
patch stacks.
Yours,
Linus Walleij
* Re: [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry
2025-04-28 14:46 ` Linus Walleij
@ 2025-04-29 15:03 ` Will Deacon
0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2025-04-29 15:03 UTC (permalink / raw)
To: Linus Walleij
Cc: tglx, ruanjinjie, catalin.marinas, oleg, sstabellini, peterz,
luto, linux-kernel, arnd, paulmck, mingo, dietmar.eggemann,
linux-arm-kernel, mark.rutland, joey.gouly, kevin.brodsky,
dave.martin
On Mon, Apr 28, 2025 at 04:46:02PM +0200, Linus Walleij wrote:
> Hi Thomas,
>
> On Thu, Mar 20, 2025 at 3:26 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> > > Currently CONFIG_GENERIC_ENTRY enables both the generic exception
> > > entry logic and the generic syscall entry logic, which are otherwise
> > > loosely coupled.
> > >
> > > Introduce separate config options for these so that architectures can
> > > select the two independently. This will make it easier for
> > > architectures to migrate to generic entry code.
> > >
> > > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > > Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> >
> > If the generic entry maintainer (hi Thomas) thinks this patch is
> > OK it would be nice to have it applied because it will also make
> > the ARM32 transition to generic entry easier as we can work on
> > exception and syscall entry each in isolation.
> >
> > Acked-by: Linus Walleij <linus.walleij@linaro.org>
>
> Do you think this patch is something you could queue on some
> tip branch for v6.16?
>
> I don't know if either arm64 or arm32 is going anywhere with
> generic entry for v6.16, but having this in-tree makes it easier
> to split the task going forward. We can always pull it into some
> branch or just wait for v6.16-rc1 to get this patch under our
> patch stacks.
Agreed, having the core part merged would be really helpful if possible.
Cheers,
Will
End of thread (newest message: 2025-04-29 15:26 UTC)
Thread overview: 16+ messages
2025-02-13 12:59 [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 1/8] entry: Split generic entry into generic exception and syscall entry Jinjie Ruan
2025-03-20 14:26 ` Linus Walleij
2025-04-28 14:46 ` Linus Walleij
2025-04-29 15:03 ` Will Deacon
2025-02-13 13:00 ` [PATCH -next v6 2/8] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 3/8] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 4/8] arm64: entry: Rework arm64_preempt_schedule_irq() Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 5/8] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 6/8] arm64: entry: Refactor preempt_schedule_irq() check code Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 7/8] arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode() Jinjie Ruan
2025-02-13 13:00 ` [PATCH -next v6 8/8] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
2025-02-27 18:08 ` Valentin Schneider
2025-02-27 18:35 ` Mark Rutland
2025-02-27 21:29 ` Valentin Schneider
2025-04-14 9:13 ` [PATCH -next v6 0/8] arm64: entry: Convert to generic irq entry Jinjie Ruan