* [PATCH -next v4 00/19] arm64: entry: Convert to generic entry
@ 2024-10-25 10:06 Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
` (18 more replies)
0 siblings, 19 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
To: oleg, linux, will, mark.rutland, catalin.marinas, sstabellini,
maz, tglx, peterz, luto, kees, wad, akpm, samitolvanen, arnd,
ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck, aquini,
petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb, wangkefeng.wang,
surenb, linus.walleij, yangyj.ee, broonie, mbenes, puranjay, pcc,
guohanjun, sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
Currently, x86, RISC-V and LoongArch use the generic entry code. Convert
arm64 to use the generic entry infrastructure from kernel/entry/*. The
generic entry makes maintainers' work easier and the code more elegant,
and it also removes a lot of duplicate code.
Patches 1 ~ 5 make it easier for arm64 to use irqentry_enter/exit().
Patches 6 ~ 13 and patch 15 bring the arm64 code closer to the
PREEMPT_DYNAMIC code of the generic entry. Patch 14 splits the generic
entry into generic irq entry and generic syscall entry, so that each of
the following patches switches over a single thing at a time.
Changes in v4:
- Rework/cleanup split into a few patches as Mark suggested.
- Replace the interrupts_enabled() macro with regs_irqs_disabled(), instead
of leaving it in place.
- Remove rcu and lockdep state in pt_regs by using temporary
irqentry_state_t as Mark suggested.
- Remove some unnecessary intermediate functions to make it clear.
- Rework preempt irq and PREEMPT_DYNAMIC code
to make the switch more clear.
- arch_prepare_*_entry/exit() -> arch_pre_*_entry/exit().
- Expand the arch functions comment.
- Make arch functions closer to its caller.
- Declare saved_reg in for block.
- Remove arch_exit_to_kernel_mode_prepare(), arch_enter_from_kernel_mode().
- Adjust "Add few arch functions to use generic entry" patch to be
the penultimate.
- Update the commit message.
- Add suggested-by.
Changes in v3:
- Test the MTE test cases.
- Handle forget_syscall() in arch_post_report_syscall_entry()
- Make the arch funcs not use __weak as Thomas suggested, so move them
to entry-common.h, and fold arch_forget_syscall() into
arch_post_report_syscall_entry() as suggested.
- Move report_single_step() to thread_info.h for arm64
- Change __always_inline() to inline, add inline for the other arch funcs.
- Remove unused signal.h for entry-common.h.
- Add Suggested-by.
- Update the commit message.
Changes in v2:
- Add tested-by.
- Fix a bug where arch_post_report_syscall_entry() was not called in
syscall_trace_enter() if ptrace_report_syscall_entry() returned nonzero.
- Refactor report_syscall().
- Add comment for arch_prepare_report_syscall_exit().
- Adjust entry-common.h header file inclusion to alphabetical order.
- Update the commit message.
Jinjie Ruan (19):
arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
arm64: entry: Refactor the entry and exit for exceptions from EL1
arm64: entry: Remove __enter_from_user_mode()
arm64: entry: Remove __enter_from_kernel_mode()
arm64: entry: Remove __exit_to_kernel_mode()
arm64: entry: Move arm64_preempt_schedule_irq() into
exit_to_kernel_mode()
arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled
arm64: entry: Rework arm64_preempt_schedule_irq()
arm64: entry: Use preempt_count() and need_resched() helper
arm64: entry: preempt_schedule_irq() only if PREEMPTION enabled
arm64: entry: Extract raw_irqentry_exit_cond_resched() function
arm64: entry: Check dynamic key ahead
arm64: entry: Check dynamic resched when PREEMPT_DYNAMIC enabled
entry: Split into irq entry and syscall
entry: Add arch irqentry_exit_need_resched() for arm64
arm64: entry: Switch to generic IRQ entry
entry: Add syscall arch functions to use generic syscall for arm64
arm64/ptrace: Split report_syscall() into separate enter and exit
functions
arm64: entry: Convert to generic entry
MAINTAINERS | 1 +
arch/Kconfig | 8 +
arch/arm/include/asm/ptrace.h | 4 +-
arch/arm/kernel/hw_breakpoint.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/arm/mm/alignment.c | 2 +-
arch/arm/mm/fault.c | 2 +-
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/daifflags.h | 2 +-
arch/arm64/include/asm/entry-common.h | 149 ++++++++++
arch/arm64/include/asm/preempt.h | 2 -
arch/arm64/include/asm/ptrace.h | 8 +-
arch/arm64/include/asm/syscall.h | 6 +-
arch/arm64/include/asm/thread_info.h | 23 +-
arch/arm64/include/asm/xen/events.h | 2 +-
arch/arm64/kernel/acpi.c | 2 +-
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/entry-common.c | 381 +++++++------------------
arch/arm64/kernel/ptrace.c | 90 ------
arch/arm64/kernel/sdei.c | 2 +-
arch/arm64/kernel/signal.c | 3 +-
arch/arm64/kernel/syscall.c | 18 +-
drivers/irqchip/irq-gic-v3.c | 2 +-
include/linux/entry-common.h | 377 +-----------------------
include/linux/irq-entry-common.h | 393 ++++++++++++++++++++++++++
include/linux/thread_info.h | 13 +
kernel/entry/Makefile | 3 +-
kernel/entry/common.c | 175 ++----------
kernel/entry/syscall-common.c | 237 ++++++++++++++++
29 files changed, 962 insertions(+), 950 deletions(-)
create mode 100644 arch/arm64/include/asm/entry-common.h
create mode 100644 include/linux/irq-entry-common.h
create mode 100644 kernel/entry/syscall-common.c
--
2.34.1
* [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:19 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
` (17 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Implement regs_irqs_disabled(), and replace the interrupts_enabled()
macro with regs_irqs_disabled() throughout the tree.
No functional changes.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm/include/asm/ptrace.h | 4 ++--
arch/arm/kernel/hw_breakpoint.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/arm/mm/alignment.c | 2 +-
arch/arm/mm/fault.c | 2 +-
arch/arm64/include/asm/daifflags.h | 2 +-
arch/arm64/include/asm/ptrace.h | 4 ++--
arch/arm64/include/asm/xen/events.h | 2 +-
arch/arm64/kernel/acpi.c | 2 +-
arch/arm64/kernel/debug-monitors.c | 2 +-
arch/arm64/kernel/entry-common.c | 4 ++--
arch/arm64/kernel/sdei.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 2 +-
13 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 6eb311fb2da0..2054b17b3a69 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -46,8 +46,8 @@ struct svc_pt_regs {
#define processor_mode(regs) \
((regs)->ARM_cpsr & MODE_MASK)
-#define interrupts_enabled(regs) \
- (!((regs)->ARM_cpsr & PSR_I_BIT))
+#define regs_irqs_disabled(regs) \
+ ((regs)->ARM_cpsr & PSR_I_BIT)
#define fast_interrupts_enabled(regs) \
(!((regs)->ARM_cpsr & PSR_F_BIT))
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index a12efd0f43e8..bc7c9f5a2767 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -947,7 +947,7 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
preempt_disable();
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
/* We only handle watchpoints and hardware breakpoints. */
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index e16ed102960c..5979a5cec2d0 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -167,7 +167,7 @@ void __show_regs(struct pt_regs *regs)
segment = "user";
printk("Flags: %s IRQs o%s FIQs o%s Mode %s ISA %s Segment %s\n",
- buf, interrupts_enabled(regs) ? "n" : "ff",
+ buf, !regs_irqs_disabled(regs) ? "n" : "ff",
fast_interrupts_enabled(regs) ? "n" : "ff",
processor_modes[processor_mode(regs)],
isa_modes[isa_mode(regs)], segment);
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 3c6ddb1afdc4..642aae48a09e 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -809,7 +809,7 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
int thumb2_32b = 0;
int fault;
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
instrptr = instruction_pointer(regs);
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index ab01b51de559..dd8e95fcce10 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -275,7 +275,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
/* Enable interrupts if they were enabled in the parent context. */
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
/*
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index fbb5c99eb2f9..5fca48009043 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -128,7 +128,7 @@ static inline void local_daif_inherit(struct pt_regs *regs)
{
unsigned long flags = regs->pstate & DAIF_MASK;
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
trace_hardirqs_on();
if (system_uses_irq_prio_masking())
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 47ff8654c5ec..3e5372a98da4 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -214,8 +214,8 @@ static inline void forget_syscall(struct pt_regs *regs)
(regs)->pmr == GIC_PRIO_IRQON : \
true)
-#define interrupts_enabled(regs) \
- (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs))
+#define regs_irqs_disabled(regs) \
+ (((regs)->pstate & PSR_I_BIT) || (!irqs_priority_unmasked(regs)))
#define fast_interrupts_enabled(regs) \
(!((regs)->pstate & PSR_F_BIT))
diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h
index 2788e95d0ff0..2977b5fe068d 100644
--- a/arch/arm64/include/asm/xen/events.h
+++ b/arch/arm64/include/asm/xen/events.h
@@ -14,7 +14,7 @@ enum ipi_vector {
static inline int xen_irqs_disabled(struct pt_regs *regs)
{
- return !interrupts_enabled(regs);
+ return regs_irqs_disabled(regs);
}
#define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index e6f66491fbe9..732f89daae23 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -403,7 +403,7 @@ int apei_claim_sea(struct pt_regs *regs)
return_to_irqs_enabled = !irqs_disabled_flags(arch_local_save_flags());
if (regs)
- return_to_irqs_enabled = interrupts_enabled(regs);
+ return_to_irqs_enabled = !regs_irqs_disabled(regs);
/*
* SEA can interrupt SError, mask it and describe this as an NMI so
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index c60a4a90c6a5..5497df05dd1a 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -231,7 +231,7 @@ static void send_user_sigtrap(int si_code)
if (WARN_ON(!user_mode(regs)))
return;
- if (interrupts_enabled(regs))
+ if (!regs_irqs_disabled(regs))
local_irq_enable();
arm64_force_sig_fault(SIGTRAP, si_code, instruction_pointer(regs),
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index b260ddc4d3e9..c547e70428d3 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -73,7 +73,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
{
lockdep_assert_irqs_disabled();
- if (interrupts_enabled(regs)) {
+ if (!regs_irqs_disabled(regs)) {
if (regs->exit_rcu) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
@@ -569,7 +569,7 @@ static void noinstr el1_interrupt(struct pt_regs *regs,
{
write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
- if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs))
+ if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && regs_irqs_disabled(regs))
__el1_pnmi(regs, handler);
else
__el1_irq(regs, handler);
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 255d12f881c2..27a17da635d8 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -247,7 +247,7 @@ unsigned long __kprobes do_sdei_event(struct pt_regs *regs,
* If we interrupted the kernel with interrupts masked, we always go
* back to wherever we came from.
*/
- if (mode == kernel_mode && !interrupts_enabled(regs))
+ if (mode == kernel_mode && regs_irqs_disabled(regs))
return SDEI_EV_HANDLED;
/*
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index ce87205e3e82..5c832c436bd8 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -932,7 +932,7 @@ static void __gic_handle_irq_from_irqsoff(struct pt_regs *regs)
static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
- if (unlikely(gic_supports_nmi() && !interrupts_enabled(regs)))
+ if (unlikely(gic_supports_nmi() && regs_irqs_disabled(regs)))
__gic_handle_irq_from_irqsoff(regs);
else
__gic_handle_irq_from_irqson(regs);
--
2.34.1
* [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:33 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode() Jinjie Ruan
` (16 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
These changes refactor the entry and exit routines for the exceptions
from EL1. They store the RCU and lockdep state in a struct
irqentry_state variable on the stack, rather than recording it in
fields of pt_regs, since that is safe for these contexts.
Before:
struct pt_regs {
...
u64 lockdep_hardirqs;
u64 exit_rcu;
}
enter_from_kernel_mode(regs);
...
exit_to_kernel_mode(regs);
After:
typedef struct irqentry_state {
union {
bool exit_rcu;
bool lockdep;
};
} irqentry_state_t;
irqentry_state_t state = enter_from_kernel_mode(regs);
...
exit_to_kernel_mode(regs, state);
No functional changes.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/ptrace.h | 11 ++-
arch/arm64/kernel/entry-common.c | 129 +++++++++++++++++++------------
2 files changed, 85 insertions(+), 55 deletions(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 3e5372a98da4..5156c0d5fa20 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -149,6 +149,13 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
return psr;
}
+typedef struct irqentry_state {
+ union {
+ bool exit_rcu;
+ bool lockdep;
+ };
+} irqentry_state_t;
+
/*
* This struct defines the way the registers are stored on the stack during an
* exception. struct user_pt_regs must form a prefix of struct pt_regs.
@@ -169,10 +176,6 @@ struct pt_regs {
u64 sdei_ttbr1;
struct frame_record_meta stackframe;
-
- /* Only valid for some EL1 exceptions. */
- u64 lockdep_hardirqs;
- u64 exit_rcu;
};
/* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index c547e70428d3..68a9aecacdb9 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -36,29 +36,36 @@
* This is intended to match the logic in irqentry_enter(), handling the kernel
* mode transitions only.
*/
-static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
+static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
{
- regs->exit_rcu = false;
+ irqentry_state_t ret = {
+ .exit_rcu = false,
+ };
if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
lockdep_hardirqs_off(CALLER_ADDR0);
ct_irq_enter();
trace_hardirqs_off_finish();
- regs->exit_rcu = true;
- return;
+ ret.exit_rcu = true;
+ return ret;
}
lockdep_hardirqs_off(CALLER_ADDR0);
rcu_irq_enter_check_tick();
trace_hardirqs_off_finish();
+
+ return ret;
}
-static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
+static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
{
- __enter_from_kernel_mode(regs);
+ irqentry_state_t ret = __enter_from_kernel_mode(regs);
+
mte_check_tfsr_entry();
mte_disable_tco_entry(current);
+
+ return ret;
}
/*
@@ -69,12 +76,13 @@ static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
* This is intended to match the logic in irqentry_exit(), handling the kernel
* mode transitions only, and with preemption handled elsewhere.
*/
-static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
+static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
+ irqentry_state_t state)
{
lockdep_assert_irqs_disabled();
if (!regs_irqs_disabled(regs)) {
- if (regs->exit_rcu) {
+ if (state.exit_rcu) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
ct_irq_exit();
@@ -84,15 +92,16 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
trace_hardirqs_on();
} else {
- if (regs->exit_rcu)
+ if (state.exit_rcu)
ct_irq_exit();
}
}
-static void noinstr exit_to_kernel_mode(struct pt_regs *regs)
+static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
+ irqentry_state_t state)
{
mte_check_tfsr_exit();
- __exit_to_kernel_mode(regs);
+ __exit_to_kernel_mode(regs, state);
}
/*
@@ -190,9 +199,11 @@ asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
* mode. Before this function is called it is not safe to call regular kernel
* code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_enter_nmi(struct pt_regs *regs)
+static noinstr irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
{
- regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
+ irqentry_state_t irq_state;
+
+ irq_state.lockdep = lockdep_hardirqs_enabled();
__nmi_enter();
lockdep_hardirqs_off(CALLER_ADDR0);
@@ -201,6 +212,8 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
trace_hardirqs_off_finish();
ftrace_nmi_enter();
+
+ return irq_state;
}
/*
@@ -208,19 +221,18 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
* mode. After this function returns it is not safe to call regular kernel
* code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_exit_nmi(struct pt_regs *regs)
+static void noinstr arm64_exit_nmi(struct pt_regs *regs,
+ irqentry_state_t irq_state)
{
- bool restore = regs->lockdep_hardirqs;
-
ftrace_nmi_exit();
- if (restore) {
+ if (irq_state.lockdep) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
}
ct_nmi_exit();
lockdep_hardirq_exit();
- if (restore)
+ if (irq_state.lockdep)
lockdep_hardirqs_on(CALLER_ADDR0);
__nmi_exit();
}
@@ -230,14 +242,18 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs)
* kernel mode. Before this function is called it is not safe to call regular
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
+static noinstr irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs)
{
- regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
+ irqentry_state_t state;
+
+ state.lockdep = lockdep_hardirqs_enabled();
lockdep_hardirqs_off(CALLER_ADDR0);
ct_nmi_enter();
trace_hardirqs_off_finish();
+
+ return state;
}
/*
@@ -245,17 +261,16 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
* kernel mode. After this function returns it is not safe to call regular
* kernel code, instrumentable code, or any code which may trigger an exception.
*/
-static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
+static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
+ irqentry_state_t state)
{
- bool restore = regs->lockdep_hardirqs;
-
- if (restore) {
+ if (state.lockdep) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
}
ct_nmi_exit();
- if (restore)
+ if (state.lockdep)
lockdep_hardirqs_on(CALLER_ADDR0);
}
@@ -426,78 +441,86 @@ UNHANDLED(el1t, 64, error)
static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ irqentry_state_t state;
- enter_from_kernel_mode(regs);
+ state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_mem_abort(far, esr, regs);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ irqentry_state_t state;
- enter_from_kernel_mode(regs);
+ state = enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_sp_pc_abort(far, esr, regs);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_undef(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_bti(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_gcs(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_mops(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
+ irqentry_state_t state;
- arm64_enter_el1_dbg(regs);
+ state = arm64_enter_el1_dbg(regs);
if (!cortex_a76_erratum_1463225_debug_handler(regs))
do_debug_exception(far, esr, regs);
- arm64_exit_el1_dbg(regs);
+ arm64_exit_el1_dbg(regs, state);
}
static void noinstr el1_fpac(struct pt_regs *regs, unsigned long esr)
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
+
local_daif_inherit(regs);
do_el1_fpac(regs, esr);
local_daif_mask();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
@@ -546,15 +569,16 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
static __always_inline void __el1_pnmi(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- arm64_enter_nmi(regs);
+ irqentry_state_t state = arm64_enter_nmi(regs);
+
do_interrupt_handler(regs, handler);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
}
static __always_inline void __el1_irq(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- enter_from_kernel_mode(regs);
+ irqentry_state_t state = enter_from_kernel_mode(regs);
irq_enter_rcu();
do_interrupt_handler(regs, handler);
@@ -562,7 +586,7 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
arm64_preempt_schedule_irq();
- exit_to_kernel_mode(regs);
+ exit_to_kernel_mode(regs, state);
}
static void noinstr el1_interrupt(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
@@ -588,11 +612,12 @@ asmlinkage void noinstr el1h_64_fiq_handler(struct pt_regs *regs)
asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
+ irqentry_state_t state;
local_daif_restore(DAIF_ERRCTX);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
}
static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
@@ -855,12 +880,13 @@ asmlinkage void noinstr el0t_64_fiq_handler(struct pt_regs *regs)
static void noinstr __el0_error_handler_common(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
+ irqentry_state_t state;
enter_from_user_mode(regs);
local_daif_restore(DAIF_ERRCTX);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
local_daif_restore(DAIF_PROCCTX);
exit_to_user_mode(regs);
}
@@ -968,6 +994,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
asmlinkage noinstr unsigned long
__sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
{
+ irqentry_state_t state;
unsigned long ret;
/*
@@ -992,9 +1019,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
else if (cpu_has_pan())
set_pstate_pan(0);
- arm64_enter_nmi(regs);
+ state = arm64_enter_nmi(regs);
ret = do_sdei_event(regs, arg);
- arm64_exit_nmi(regs);
+ arm64_exit_nmi(regs, state);
return ret;
}
--
2.34.1
* [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:42 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode() Jinjie Ruan
` (15 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
__enter_from_user_mode() is only called by enter_from_user_mode(), so
fold the two into a single enter_from_user_mode() and remove the wrapper.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 68a9aecacdb9..ccf59b44464d 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -109,7 +109,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
* Before this function is called it is not safe to call regular kernel code,
* instrumentable code, or any code which may trigger an exception.
*/
-static __always_inline void __enter_from_user_mode(void)
+static __always_inline void enter_from_user_mode(struct pt_regs *regs)
{
lockdep_hardirqs_off(CALLER_ADDR0);
CT_WARN_ON(ct_state() != CT_STATE_USER);
@@ -118,11 +118,6 @@ static __always_inline void __enter_from_user_mode(void)
mte_disable_tco_entry(current);
}
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
-{
- __enter_from_user_mode();
-}
-
/*
* Handle IRQ/context state management when exiting to user mode.
* After this function returns it is not safe to call regular kernel code,
--
2.34.1
* [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (2 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:37 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 05/19] arm64: entry: Remove __exit_to_kernel_mode() Jinjie Ruan
` (14 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
__enter_from_kernel_mode() is only called by enter_from_kernel_mode(),
so fold it into its caller and remove it.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index ccf59b44464d..a7fd4d6c7650 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -36,7 +36,7 @@
* This is intended to match the logic in irqentry_enter(), handling the kernel
* mode transitions only.
*/
-static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
+static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
{
irqentry_state_t ret = {
.exit_rcu = false,
@@ -55,13 +55,6 @@ static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs
rcu_irq_enter_check_tick();
trace_hardirqs_off_finish();
- return ret;
-}
-
-static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
-{
- irqentry_state_t ret = __enter_from_kernel_mode(regs);
-
mte_check_tfsr_entry();
mte_disable_tco_entry(current);
--
2.34.1
* [PATCH -next v4 05/19] arm64: entry: Remove __exit_to_kernel_mode()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (3 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode() Jinjie Ruan
` (13 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
__exit_to_kernel_mode() is only called by exit_to_kernel_mode(), so
remove it and fold its body into the caller.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index a7fd4d6c7650..137481a3f0fa 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -69,9 +69,11 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
* This is intended to match the logic in irqentry_exit(), handling the kernel
* mode transitions only, and with preemption handled elsewhere.
*/
-static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
- irqentry_state_t state)
+static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
+ irqentry_state_t state)
{
+ mte_check_tfsr_exit();
+
lockdep_assert_irqs_disabled();
if (!regs_irqs_disabled(regs)) {
@@ -90,13 +92,6 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
}
}
-static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
- irqentry_state_t state)
-{
- mte_check_tfsr_exit();
- __exit_to_kernel_mode(regs, state);
-}
-
/*
* Handle IRQ/context state management when entering from user mode.
* Before this function is called it is not safe to call regular kernel code,
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (4 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 05/19] arm64: entry: Remove __exit_to_kernel_mode() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:52 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled Jinjie Ruan
` (12 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Move arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so that
there is a chance to reschedule not only in __el1_irq() but on every
exception return to kernel mode.
As Mark pointed out, this change will have the following key impact:
"We'll preempt even without taking a "real" interrupt. That
shouldn't result in preemption that wasn't possible before,
but it does change the probability of preempting at certain points,
and might have a performance impact, so probably warrants a
benchmark."
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 88 ++++++++++++++++----------------
1 file changed, 44 insertions(+), 44 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 137481a3f0fa..e0380812d71e 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -61,6 +61,48 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
return ret;
}
+#ifdef CONFIG_PREEMPT_DYNAMIC
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+#define need_irq_preemption() \
+ (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+#else
+#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
+#endif
+
+static void __sched arm64_preempt_schedule_irq(void)
+{
+ if (!need_irq_preemption())
+ return;
+
+ /*
+ * Note: thread_info::preempt_count includes both thread_info::count
+ * and thread_info::need_resched, and is not equivalent to
+ * preempt_count().
+ */
+ if (READ_ONCE(current_thread_info()->preempt_count) != 0)
+ return;
+
+ /*
+ * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
+ * priority masking is used the GIC irqchip driver will clear DAIF.IF
+ * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
+ * DAIF we must have handled an NMI, so skip preemption.
+ */
+ if (system_uses_irq_prio_masking() && read_sysreg(daif))
+ return;
+
+ /*
+ * Preempting a task from an IRQ means we leave copies of PSTATE
+ * on the stack. cpufeature's enable calls may modify PSTATE, but
+ * resuming one of these preempted tasks would undo those changes.
+ *
+ * Only allow a task to be preempted once cpufeatures have been
+ * enabled.
+ */
+ if (system_capabilities_finalized())
+ preempt_schedule_irq();
+}
+
/*
* Handle IRQ/context state management when exiting to kernel mode.
* After this function returns it is not safe to call regular kernel code,
@@ -72,6 +114,8 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
irqentry_state_t state)
{
+ arm64_preempt_schedule_irq();
+
mte_check_tfsr_exit();
lockdep_assert_irqs_disabled();
@@ -257,48 +301,6 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
lockdep_hardirqs_on(CALLER_ADDR0);
}
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#define need_irq_preemption() \
- (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
-#else
-#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
-#endif
-
-static void __sched arm64_preempt_schedule_irq(void)
-{
- if (!need_irq_preemption())
- return;
-
- /*
- * Note: thread_info::preempt_count includes both thread_info::count
- * and thread_info::need_resched, and is not equivalent to
- * preempt_count().
- */
- if (READ_ONCE(current_thread_info()->preempt_count) != 0)
- return;
-
- /*
- * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
- * priority masking is used the GIC irqchip driver will clear DAIF.IF
- * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
- * DAIF we must have handled an NMI, so skip preemption.
- */
- if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return;
-
- /*
- * Preempting a task from an IRQ means we leave copies of PSTATE
- * on the stack. cpufeature's enable calls may modify PSTATE, but
- * resuming one of these preempted tasks would undo those changes.
- *
- * Only allow a task to be preempted once cpufeatures have been
- * enabled.
- */
- if (system_capabilities_finalized())
- preempt_schedule_irq();
-}
-
static void do_interrupt_handler(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
@@ -567,8 +569,6 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- arm64_preempt_schedule_irq();
-
exit_to_kernel_mode(regs, state);
}
static void noinstr el1_interrupt(struct pt_regs *regs,
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (5 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-29 14:55 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 08/19] arm64: entry: Rework arm64_preempt_schedule_irq() Jinjie Ruan
` (11 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
There is only a chance to reschedule after the interrupt has been
handled if irqs were enabled when the interrupt was taken, so move the
arm64_preempt_schedule_irq() call into the !regs_irqs_disabled(regs)
branch.
As Mark pointed out, this change will have the following key impact:
"We will not preempt when taking interrupts from a region of kernel
code where IRQs are enabled but RCU is not watching, matching the
behaviour of the generic entry code.
This has the potential to introduce livelock if we can ever have a
screaming interrupt in such a region, so we'll need to go figure out
whether that's actually a problem.
Having this as a separate patch will make it easier to test/bisect
for that specifically."
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index e0380812d71e..b57f6dc66115 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -114,8 +114,6 @@ static void __sched arm64_preempt_schedule_irq(void)
static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
irqentry_state_t state)
{
- arm64_preempt_schedule_irq();
-
mte_check_tfsr_exit();
lockdep_assert_irqs_disabled();
@@ -129,6 +127,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
return;
}
+ arm64_preempt_schedule_irq();
+
trace_hardirqs_on();
} else {
if (state.exit_rcu)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 08/19] arm64: entry: Rework arm64_preempt_schedule_irq()
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (6 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 09/19] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
` (10 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Rework arm64_preempt_schedule_irq() so that the decision of whether a
reschedule is needed is made in a separate predicate,
arm64_irqentry_exit_need_resched().
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index b57f6dc66115..a3414fb599fa 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -69,10 +69,10 @@ DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
#endif
-static void __sched arm64_preempt_schedule_irq(void)
+static inline bool arm64_irqentry_exit_need_resched(void)
{
if (!need_irq_preemption())
- return;
+ return false;
/*
* Note: thread_info::preempt_count includes both thread_info::count
@@ -80,7 +80,7 @@ static void __sched arm64_preempt_schedule_irq(void)
* preempt_count().
*/
if (READ_ONCE(current_thread_info()->preempt_count) != 0)
- return;
+ return false;
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
@@ -89,7 +89,7 @@ static void __sched arm64_preempt_schedule_irq(void)
* DAIF we must have handled an NMI, so skip preemption.
*/
if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return;
+ return false;
/*
* Preempting a task from an IRQ means we leave copies of PSTATE
@@ -99,8 +99,10 @@ static void __sched arm64_preempt_schedule_irq(void)
* Only allow a task to be preempted once cpufeatures have been
* enabled.
*/
- if (system_capabilities_finalized())
- preempt_schedule_irq();
+ if (!system_capabilities_finalized())
+ return false;
+
+ return true;
}
/*
@@ -127,7 +129,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
return;
}
- arm64_preempt_schedule_irq();
+ if (arm64_irqentry_exit_need_resched())
+ preempt_schedule_irq();
trace_hardirqs_on();
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 09/19] arm64: entry: Use preempt_count() and need_resched() helper
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (7 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 08/19] arm64: entry: Rework arm64_preempt_schedule_irq() Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 10/19] arm64: entry: preempt_schedule_irq() only if PREEMPTION enabled Jinjie Ruan
` (9 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
The check "READ_ONCE(current_thread_info()->preempt_count) == 0" is
equivalent to "preempt_count() == 0 && need_resched()", so use these
helpers to replace it, which will make the switch to the generic entry
code clearer.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index a3414fb599fa..3ea3ab32d232 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -74,14 +74,6 @@ static inline bool arm64_irqentry_exit_need_resched(void)
if (!need_irq_preemption())
return false;
- /*
- * Note: thread_info::preempt_count includes both thread_info::count
- * and thread_info::need_resched, and is not equivalent to
- * preempt_count().
- */
- if (READ_ONCE(current_thread_info()->preempt_count) != 0)
- return false;
-
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
* priority masking is used the GIC irqchip driver will clear DAIF.IF
@@ -129,8 +121,10 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
return;
}
- if (arm64_irqentry_exit_need_resched())
- preempt_schedule_irq();
+ if (!preempt_count()) {
+ if (need_resched() && arm64_irqentry_exit_need_resched())
+ preempt_schedule_irq();
+ }
trace_hardirqs_on();
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 10/19] arm64: entry: preempt_schedule_irq() only if PREEMPTION enabled
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (8 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 09/19] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 11/19] arm64: entry: Extract raw_irqentry_exit_cond_resched() function Jinjie Ruan
` (8 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Whether or not PREEMPT_DYNAMIC is enabled, PREEMPTION must be enabled
to allow rescheduling after an interrupt, so gate the resched logic on
IS_ENABLED(CONFIG_PREEMPTION).
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 3ea3ab32d232..58d660878c09 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -65,8 +65,6 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
#define need_irq_preemption() \
(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
-#else
-#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
#endif
static inline bool arm64_irqentry_exit_need_resched(void)
@@ -121,9 +119,12 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
return;
}
- if (!preempt_count()) {
- if (need_resched() && arm64_irqentry_exit_need_resched())
- preempt_schedule_irq();
+ if (IS_ENABLED(CONFIG_PREEMPTION)) {
+ if (!preempt_count()) {
+ if (need_resched() &&
+ arm64_irqentry_exit_need_resched())
+ preempt_schedule_irq();
+ }
}
trace_hardirqs_on();
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 11/19] arm64: entry: Extract raw_irqentry_exit_cond_resched() function
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (9 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 10/19] arm64: entry: preempt_schedule_irq() only if PREEMPTION enabled Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 12/19] arm64: entry: Check dynamic key ahead Jinjie Ruan
` (7 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Extract the arm64 reschedule logic into a new
raw_irqentry_exit_cond_resched() function, which makes the code clearer
when switching to the generic entry code.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/preempt.h | 1 +
arch/arm64/kernel/entry-common.c | 17 ++++++++++-------
2 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index 0159b625cc7f..d0f93385bd85 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -85,6 +85,7 @@ static inline bool should_resched(int preempt_offset)
void preempt_schedule(void);
void preempt_schedule_notrace(void);
+void raw_irqentry_exit_cond_resched(void);
#ifdef CONFIG_PREEMPT_DYNAMIC
DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 58d660878c09..5b7df53cfcf6 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -95,6 +95,14 @@ static inline bool arm64_irqentry_exit_need_resched(void)
return true;
}
+void raw_irqentry_exit_cond_resched(void)
+{
+ if (!preempt_count()) {
+ if (need_resched() && arm64_irqentry_exit_need_resched())
+ preempt_schedule_irq();
+ }
+}
+
/*
* Handle IRQ/context state management when exiting to kernel mode.
* After this function returns it is not safe to call regular kernel code,
@@ -119,13 +127,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
return;
}
- if (IS_ENABLED(CONFIG_PREEMPTION)) {
- if (!preempt_count()) {
- if (need_resched() &&
- arm64_irqentry_exit_need_resched())
- preempt_schedule_irq();
- }
- }
+ if (IS_ENABLED(CONFIG_PREEMPTION))
+ raw_irqentry_exit_cond_resched();
trace_hardirqs_on();
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 12/19] arm64: entry: Check dynamic key ahead
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (10 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 11/19] arm64: entry: Extract raw_irqentry_exit_cond_resched() function Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 13/19] arm64: entry: Check dynamic resched when PREEMPT_DYNAMIC enabled Jinjie Ruan
` (6 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Check the dynamic key first in raw_irqentry_exit_cond_resched(), which
leaves arm64_irqentry_exit_need_resched() containing only the
arch-specific checks.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/entry-common.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 5b7df53cfcf6..3b110dcf4fa3 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -63,15 +63,10 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
#ifdef CONFIG_PREEMPT_DYNAMIC
DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#define need_irq_preemption() \
- (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
#endif
static inline bool arm64_irqentry_exit_need_resched(void)
{
- if (!need_irq_preemption())
- return false;
-
/*
* DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
* priority masking is used the GIC irqchip driver will clear DAIF.IF
@@ -97,6 +92,11 @@ static inline bool arm64_irqentry_exit_need_resched(void)
void raw_irqentry_exit_cond_resched(void)
{
+#ifdef CONFIG_PREEMPT_DYNAMIC
+ if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+ return;
+#endif
+
if (!preempt_count()) {
if (need_resched() && arm64_irqentry_exit_need_resched())
preempt_schedule_irq();
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 13/19] arm64: entry: Check dynamic resched when PREEMPT_DYNAMIC enabled
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (11 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 12/19] arm64: entry: Check dynamic key ahead Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 14/19] entry: Split into irq entry and syscall Jinjie Ruan
` (5 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
When PREEMPT_DYNAMIC is enabled, check the dynamic resched key in a
dedicated dynamic_irqentry_exit_cond_resched() helper.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/preempt.h | 3 +++
arch/arm64/kernel/entry-common.c | 21 +++++++++++----------
2 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index d0f93385bd85..0f0ba250efe8 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -93,11 +93,14 @@ void dynamic_preempt_schedule(void);
#define __preempt_schedule() dynamic_preempt_schedule()
void dynamic_preempt_schedule_notrace(void);
#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
+void dynamic_irqentry_exit_cond_resched(void);
+#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
#else /* CONFIG_PREEMPT_DYNAMIC */
#define __preempt_schedule() preempt_schedule()
#define __preempt_schedule_notrace() preempt_schedule_notrace()
+#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
#endif /* CONFIG_PREEMPT_DYNAMIC */
#endif /* CONFIG_PREEMPTION */
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 3b110dcf4fa3..152216201f84 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -61,10 +61,6 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
return ret;
}
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-#endif
-
static inline bool arm64_irqentry_exit_need_resched(void)
{
/*
@@ -92,17 +88,22 @@ static inline bool arm64_irqentry_exit_need_resched(void)
void raw_irqentry_exit_cond_resched(void)
{
-#ifdef CONFIG_PREEMPT_DYNAMIC
- if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
- return;
-#endif
-
if (!preempt_count()) {
if (need_resched() && arm64_irqentry_exit_need_resched())
preempt_schedule_irq();
}
}
+#ifdef CONFIG_PREEMPT_DYNAMIC
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void)
+{
+ if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+ return;
+ raw_irqentry_exit_cond_resched();
+}
+#endif
+
/*
* Handle IRQ/context state management when exiting to kernel mode.
* After this function returns it is not safe to call regular kernel code,
@@ -128,7 +129,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
}
if (IS_ENABLED(CONFIG_PREEMPTION))
- raw_irqentry_exit_cond_resched();
+ irqentry_exit_cond_resched();
trace_hardirqs_on();
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH -next v4 14/19] entry: Split into irq entry and syscall
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (12 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 13/19] arm64: entry: Check dynamic resched when PREEMPT_DYNAMIC enabled Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64 Jinjie Ruan
` (4 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
As Mark pointed out, do not try to switch to *all* the
generic entry code in one go. The regular entry state management
(e.g. enter_from_user_mode() and exit_to_user_mode()) is largely
separate from the syscall state management. Move arm64 over to
enter_from_user_mode() and exit_to_user_mode() without needing to use
any of the generic syscall logic. Doing that first, *then* moving over
to the generic syscall handling would be much easier to
review/test/bisect, and if there are any ABI issues with the syscall
handling in particular, it will be easier to handle those in isolation.
So split the generic entry code into irq entry and syscall parts, which
makes review easier and the switch to the generic entry clearer.
Introduce two configs, GENERIC_IRQ_ENTRY and GENERIC_SYSCALL, which
control the irq entry and syscall parts of the generic code
respectively, and split the header file irq-entry-common.h out of
entry-common.h for GENERIC_IRQ_ENTRY.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
MAINTAINERS | 1 +
arch/Kconfig | 8 +
include/linux/entry-common.h | 376 +----------------------------
include/linux/irq-entry-common.h | 393 +++++++++++++++++++++++++++++++
kernel/entry/Makefile | 3 +-
kernel/entry/common.c | 159 +------------
kernel/entry/syscall-common.c | 159 +++++++++++++
7 files changed, 565 insertions(+), 534 deletions(-)
create mode 100644 include/linux/irq-entry-common.h
create mode 100644 kernel/entry/syscall-common.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 72dce03be648..468d6e1a3228 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9531,6 +9531,7 @@ S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/entry
F: include/linux/entry-common.h
F: include/linux/entry-kvm.h
+F: include/linux/irq-entry-common.h
F: kernel/entry/
GENERIC GPIO I2C DRIVER
diff --git a/arch/Kconfig b/arch/Kconfig
index feb50cfc4bdb..8e9c6f85960e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -64,8 +64,16 @@ config HOTPLUG_PARALLEL
bool
select HOTPLUG_SPLIT_STARTUP
+config GENERIC_IRQ_ENTRY
+ bool
+
+config GENERIC_SYSCALL
+ bool
+
config GENERIC_ENTRY
bool
+ select GENERIC_IRQ_ENTRY
+ select GENERIC_SYSCALL
config KPROBES
bool "Kprobes"
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 1e50cdb83ae5..1ae3143d4b12 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -2,6 +2,7 @@
#ifndef __LINUX_ENTRYCOMMON_H
#define __LINUX_ENTRYCOMMON_H
+#include <linux/irq-entry-common.h>
#include <linux/static_call_types.h>
#include <linux/ptrace.h>
#include <linux/syscalls.h>
@@ -15,14 +16,6 @@
#include <asm/entry-common.h>
-/*
- * Define dummy _TIF work flags if not defined by the architecture or for
- * disabled functionality.
- */
-#ifndef _TIF_PATCH_PENDING
-# define _TIF_PATCH_PENDING (0)
-#endif
-
#ifndef _TIF_UPROBE
# define _TIF_UPROBE (0)
#endif
@@ -55,68 +48,6 @@
SYSCALL_WORK_SYSCALL_EXIT_TRAP | \
ARCH_SYSCALL_WORK_EXIT)
-/*
- * TIF flags handled in exit_to_user_mode_loop()
- */
-#ifndef ARCH_EXIT_TO_USER_MODE_WORK
-# define ARCH_EXIT_TO_USER_MODE_WORK (0)
-#endif
-
-#define EXIT_TO_USER_MODE_WORK \
- (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
- _TIF_NEED_RESCHED | _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \
- ARCH_EXIT_TO_USER_MODE_WORK)
-
-/**
- * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs
- * @regs: Pointer to currents pt_regs
- *
- * Defaults to an empty implementation. Can be replaced by architecture
- * specific code.
- *
- * Invoked from syscall_enter_from_user_mode() in the non-instrumentable
- * section. Use __always_inline so the compiler cannot push it out of line
- * and make it instrumentable.
- */
-static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs);
-
-#ifndef arch_enter_from_user_mode
-static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {}
-#endif
-
-/**
- * enter_from_user_mode - Establish state when coming from user mode
- *
- * Syscall/interrupt entry disables interrupts, but user mode is traced as
- * interrupts enabled. Also with NO_HZ_FULL RCU might be idle.
- *
- * 1) Tell lockdep that interrupts are disabled
- * 2) Invoke context tracking if enabled to reactivate RCU
- * 3) Trace interrupts off state
- *
- * Invoked from architecture specific syscall entry code with interrupts
- * disabled. The calling code has to be non-instrumentable. When the
- * function returns all state is correct and interrupts are still
- * disabled. The subsequent functions can be instrumented.
- *
- * This is invoked when there is architecture specific functionality to be
- * done between establishing state and enabling interrupts. The caller must
- * enable interrupts before invoking syscall_enter_from_user_mode_work().
- */
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
-{
- arch_enter_from_user_mode(regs);
- lockdep_hardirqs_off(CALLER_ADDR0);
-
- CT_WARN_ON(__ct_state() != CT_STATE_USER);
- user_exit_irqoff();
-
- instrumentation_begin();
- kmsan_unpoison_entry_regs(regs);
- trace_hardirqs_off_finish();
- instrumentation_end();
-}
-
/**
* syscall_enter_from_user_mode_prepare - Establish state and enable interrupts
* @regs: Pointer to currents pt_regs
@@ -201,170 +132,6 @@ static __always_inline long syscall_enter_from_user_mode(struct pt_regs *regs, l
return ret;
}
-/**
- * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Defaults to local_irq_enable(). Can be supplied by architecture specific
- * code.
- */
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
-
-#ifndef local_irq_enable_exit_to_user
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
-{
- local_irq_enable();
-}
-#endif
-
-/**
- * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable()
- *
- * Defaults to local_irq_disable(). Can be supplied by architecture specific
- * code.
- */
-static inline void local_irq_disable_exit_to_user(void);
-
-#ifndef local_irq_disable_exit_to_user
-static inline void local_irq_disable_exit_to_user(void)
-{
- local_irq_disable();
-}
-#endif
-
-/**
- * arch_exit_to_user_mode_work - Architecture specific TIF work for exit
- * to user mode.
- * @regs: Pointer to currents pt_regs
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Invoked from exit_to_user_mode_loop() with interrupt enabled
- *
- * Defaults to NOOP. Can be supplied by architecture specific code.
- */
-static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
- unsigned long ti_work);
-
-#ifndef arch_exit_to_user_mode_work
-static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
- unsigned long ti_work)
-{
-}
-#endif
-
-/**
- * arch_exit_to_user_mode_prepare - Architecture specific preparation for
- * exit to user mode.
- * @regs: Pointer to currents pt_regs
- * @ti_work: Cached TIF flags gathered with interrupts disabled
- *
- * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last
- * function before return. Defaults to NOOP.
- */
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
- unsigned long ti_work);
-
-#ifndef arch_exit_to_user_mode_prepare
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
- unsigned long ti_work)
-{
-}
-#endif
-
-/**
- * arch_exit_to_user_mode - Architecture specific final work before
- * exit to user mode.
- *
- * Invoked from exit_to_user_mode() with interrupt disabled as the last
- * function before return. Defaults to NOOP.
- *
- * This needs to be __always_inline because it is non-instrumentable code
- * invoked after context tracking switched to user mode.
- *
- * An architecture implementation must not do anything complex, no locking
- * etc. The main purpose is for speculation mitigations.
- */
-static __always_inline void arch_exit_to_user_mode(void);
-
-#ifndef arch_exit_to_user_mode
-static __always_inline void arch_exit_to_user_mode(void) { }
-#endif
-
-/**
- * arch_do_signal_or_restart - Architecture specific signal delivery function
- * @regs: Pointer to currents pt_regs
- *
- * Invoked from exit_to_user_mode_loop().
- */
-void arch_do_signal_or_restart(struct pt_regs *regs);
-
-/**
- * exit_to_user_mode_loop - do any pending work before leaving to user space
- */
-unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
- unsigned long ti_work);
-
-/**
- * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
- * @regs: Pointer to pt_regs on entry stack
- *
- * 1) check that interrupts are disabled
- * 2) call tick_nohz_user_enter_prepare()
- * 3) call exit_to_user_mode_loop() if any flags from
- * EXIT_TO_USER_MODE_WORK are set
- * 4) check that interrupts are still disabled
- */
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long ti_work;
-
- lockdep_assert_irqs_disabled();
-
- /* Flush pending rcuog wakeup before the last need_resched() check */
- tick_nohz_user_enter_prepare();
-
- ti_work = read_thread_flags();
- if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
- ti_work = exit_to_user_mode_loop(regs, ti_work);
-
- arch_exit_to_user_mode_prepare(regs, ti_work);
-
- /* Ensure that kernel state is sane for a return to userspace */
- kmap_assert_nomap();
- lockdep_assert_irqs_disabled();
- lockdep_sys_exit();
-}
-
-/**
- * exit_to_user_mode - Fixup state when exiting to user mode
- *
- * Syscall/interrupt exit enables interrupts, but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about it.
- *
- * 1) Trace interrupts on state
- * 2) Invoke context tracking if enabled to adjust RCU state
- * 3) Invoke architecture specific last minute exit code, e.g. speculation
- * mitigations, etc.: arch_exit_to_user_mode()
- * 4) Tell lockdep that interrupts are enabled
- *
- * Invoked from architecture specific code when syscall_exit_to_user_mode()
- * is not suitable as the last step before returning to userspace. Must be
- * invoked with interrupts disabled and the caller must be
- * non-instrumentable.
- * The caller has to invoke syscall_exit_to_user_mode_work() before this.
- */
-static __always_inline void exit_to_user_mode(void)
-{
- instrumentation_begin();
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- instrumentation_end();
-
- user_enter_irqoff();
- arch_exit_to_user_mode();
- lockdep_hardirqs_on(CALLER_ADDR0);
-}
-
/**
* syscall_exit_to_user_mode_work - Handle work before returning to user mode
* @regs: Pointer to currents pt_regs
@@ -411,145 +178,4 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs);
*/
void syscall_exit_to_user_mode(struct pt_regs *regs);
-/**
- * irqentry_enter_from_user_mode - Establish state before invoking the irq handler
- * @regs: Pointer to currents pt_regs
- *
- * Invoked from architecture specific entry code with interrupts disabled.
- * Can only be called when the interrupt entry came from user mode. The
- * calling code must be non-instrumentable. When the function returns all
- * state is correct and the subsequent functions can be instrumented.
- *
- * The function establishes state (lockdep, RCU (context tracking), tracing)
- */
-void irqentry_enter_from_user_mode(struct pt_regs *regs);
-
-/**
- * irqentry_exit_to_user_mode - Interrupt exit work
- * @regs: Pointer to current's pt_regs
- *
- * Invoked with interrupts disabled and fully valid regs. Returns with all
- * work handled, interrupts disabled such that the caller can immediately
- * switch to user mode. Called from architecture specific interrupt
- * handling code.
- *
- * The call order is #2 and #3 as described in syscall_exit_to_user_mode().
- * Interrupt exit is not invoking #1 which is the syscall specific one time
- * work.
- */
-void irqentry_exit_to_user_mode(struct pt_regs *regs);
-
-#ifndef irqentry_state
-/**
- * struct irqentry_state - Opaque object for exception state storage
- * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
- * exit path has to invoke ct_irq_exit().
- * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
- * lockdep state is restored correctly on exit from nmi.
- *
- * This opaque object is filled in by the irqentry_*_enter() functions and
- * must be passed back into the corresponding irqentry_*_exit() functions
- * when the exception is complete.
- *
- * Callers of irqentry_*_[enter|exit]() must consider this structure opaque
- * and all members private. Descriptions of the members are provided to aid in
- * the maintenance of the irqentry_*() functions.
- */
-typedef struct irqentry_state {
- union {
- bool exit_rcu;
- bool lockdep;
- };
-} irqentry_state_t;
-#endif
-
-/**
- * irqentry_enter - Handle state tracking on ordinary interrupt entries
- * @regs: Pointer to pt_regs of interrupted context
- *
- * Invokes:
- * - lockdep irqflag state tracking as low level ASM entry disabled
- * interrupts.
- *
- * - Context tracking if the exception hit user mode.
- *
- * - The hardirq tracer to keep the state consistent as low level ASM
- * entry disabled interrupts.
- *
- * As a precondition, this requires that the entry came from user mode,
- * idle, or a kernel context in which RCU is watching.
- *
- * For kernel mode entries RCU handling is done conditional. If RCU is
- * watching then the only RCU requirement is to check whether the tick has
- * to be restarted. If RCU is not watching then ct_irq_enter() has to be
- * invoked on entry and ct_irq_exit() on exit.
- *
- * Avoiding the ct_irq_enter/exit() calls is an optimization but also
- * solves the problem of kernel mode pagefaults which can schedule, which
- * is not possible after invoking ct_irq_enter() without undoing it.
- *
- * For user mode entries irqentry_enter_from_user_mode() is invoked to
- * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
- * would not be possible.
- *
- * Returns: An opaque object that must be passed to idtentry_exit()
- */
-irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs);
-
-/**
- * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt
- *
- * Conditional reschedule with additional sanity checks.
- */
-void raw_irqentry_exit_cond_resched(void);
-#ifdef CONFIG_PREEMPT_DYNAMIC
-#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched
-#define irqentry_exit_cond_resched_dynamic_disabled NULL
-DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
-#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
-#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void);
-#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
-#endif
-#else /* CONFIG_PREEMPT_DYNAMIC */
-#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
-#endif /* CONFIG_PREEMPT_DYNAMIC */
-
-/**
- * irqentry_exit - Handle return from exception that used irqentry_enter()
- * @regs: Pointer to pt_regs (exception entry regs)
- * @state: Return value from matching call to irqentry_enter()
- *
- * Depending on the return target (kernel/user) this runs the necessary
- * preemption and work checks if possible and required and returns to
- * the caller with interrupts disabled and no further work pending.
- *
- * This is the last action before returning to the low level ASM code which
- * just needs to return to the appropriate context.
- *
- * Counterpart to irqentry_enter().
- */
-void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state);
-
-/**
- * irqentry_nmi_enter - Handle NMI entry
- * @regs: Pointer to currents pt_regs
- *
- * Similar to irqentry_enter() but taking care of the NMI constraints.
- */
-irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs);
-
-/**
- * irqentry_nmi_exit - Handle return from NMI handling
- * @regs: Pointer to pt_regs (NMI entry regs)
- * @irq_state: Return value from matching call to irqentry_nmi_enter()
- *
- * Last action before returning to the low level assembly code.
- *
- * Counterpart to irqentry_nmi_enter().
- */
-void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state);
-
#endif
diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
new file mode 100644
index 000000000000..b7d60a18f1a2
--- /dev/null
+++ b/include/linux/irq-entry-common.h
@@ -0,0 +1,393 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_IRQENTRYCOMMON_H
+#define __LINUX_IRQENTRYCOMMON_H
+
+#include <linux/static_call_types.h>
+#include <linux/ptrace.h>
+#include <linux/syscalls.h>
+#include <linux/seccomp.h>
+#include <linux/sched.h>
+#include <linux/context_tracking.h>
+#include <linux/livepatch.h>
+#include <linux/resume_user_mode.h>
+#include <linux/tick.h>
+#include <linux/kmsan.h>
+
+#include <asm/entry-common.h>
+
+/*
+ * Define dummy _TIF work flags if not defined by the architecture or for
+ * disabled functionality.
+ */
+#ifndef _TIF_PATCH_PENDING
+# define _TIF_PATCH_PENDING (0)
+#endif
+
+/*
+ * TIF flags handled in exit_to_user_mode_loop()
+ */
+#ifndef ARCH_EXIT_TO_USER_MODE_WORK
+# define ARCH_EXIT_TO_USER_MODE_WORK (0)
+#endif
+
+#define EXIT_TO_USER_MODE_WORK \
+ (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+ _TIF_NEED_RESCHED | _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \
+ ARCH_EXIT_TO_USER_MODE_WORK)
+
+/**
+ * arch_enter_from_user_mode - Architecture specific sanity check for user mode regs
+ * @regs: Pointer to currents pt_regs
+ *
+ * Defaults to an empty implementation. Can be replaced by architecture
+ * specific code.
+ *
+ * Invoked from syscall_enter_from_user_mode() in the non-instrumentable
+ * section. Use __always_inline so the compiler cannot push it out of line
+ * and make it instrumentable.
+ */
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs);
+
+#ifndef arch_enter_from_user_mode
+static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) {}
+#endif
+
+/**
+ * enter_from_user_mode - Establish state when coming from user mode
+ *
+ * Syscall/interrupt entry disables interrupts, but user mode is traced as
+ * interrupts enabled. Also with NO_HZ_FULL RCU might be idle.
+ *
+ * 1) Tell lockdep that interrupts are disabled
+ * 2) Invoke context tracking if enabled to reactivate RCU
+ * 3) Trace interrupts off state
+ *
+ * Invoked from architecture specific syscall entry code with interrupts
+ * disabled. The calling code has to be non-instrumentable. When the
+ * function returns all state is correct and interrupts are still
+ * disabled. The subsequent functions can be instrumented.
+ *
+ * This is invoked when there is architecture specific functionality to be
+ * done between establishing state and enabling interrupts. The caller must
+ * enable interrupts before invoking syscall_enter_from_user_mode_work().
+ */
+static __always_inline void enter_from_user_mode(struct pt_regs *regs)
+{
+ arch_enter_from_user_mode(regs);
+ lockdep_hardirqs_off(CALLER_ADDR0);
+
+ CT_WARN_ON(__ct_state() != CT_STATE_USER);
+ user_exit_irqoff();
+
+ instrumentation_begin();
+ kmsan_unpoison_entry_regs(regs);
+ trace_hardirqs_off_finish();
+ instrumentation_end();
+}
+
+/**
+ * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable()
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Defaults to local_irq_enable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
+
+#ifndef local_irq_enable_exit_to_user
+static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
+{
+ local_irq_enable();
+}
+#endif
+
+/**
+ * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable()
+ *
+ * Defaults to local_irq_disable(). Can be supplied by architecture specific
+ * code.
+ */
+static inline void local_irq_disable_exit_to_user(void);
+
+#ifndef local_irq_disable_exit_to_user
+static inline void local_irq_disable_exit_to_user(void)
+{
+ local_irq_disable();
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode_work - Architecture specific TIF work for exit
+ * to user mode.
+ * @regs: Pointer to currents pt_regs
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_user_mode_loop() with interrupt enabled
+ *
+ * Defaults to NOOP. Can be supplied by architecture specific code.
+ */
+static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work);
+
+#ifndef arch_exit_to_user_mode_work
+static inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode_prepare - Architecture specific preparation for
+ * exit to user mode.
+ * @regs: Pointer to currents pt_regs
+ * @ti_work: Cached TIF flags gathered with interrupts disabled
+ *
+ * Invoked from exit_to_user_mode_prepare() with interrupt disabled as the last
+ * function before return. Defaults to NOOP.
+ */
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work);
+
+#ifndef arch_exit_to_user_mode_prepare
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+}
+#endif
+
+/**
+ * arch_exit_to_user_mode - Architecture specific final work before
+ * exit to user mode.
+ *
+ * Invoked from exit_to_user_mode() with interrupt disabled as the last
+ * function before return. Defaults to NOOP.
+ *
+ * This needs to be __always_inline because it is non-instrumentable code
+ * invoked after context tracking switched to user mode.
+ *
+ * An architecture implementation must not do anything complex, no locking
+ * etc. The main purpose is for speculation mitigations.
+ */
+static __always_inline void arch_exit_to_user_mode(void);
+
+#ifndef arch_exit_to_user_mode
+static __always_inline void arch_exit_to_user_mode(void) { }
+#endif
+
+/**
+ * arch_do_signal_or_restart - Architecture specific signal delivery function
+ * @regs: Pointer to currents pt_regs
+ *
+ * Invoked from exit_to_user_mode_loop().
+ */
+void arch_do_signal_or_restart(struct pt_regs *regs);
+
+/**
+ * exit_to_user_mode_loop - do any pending work before leaving to user space
+ */
+unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
+ unsigned long ti_work);
+
+/**
+ * exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
+ * @regs: Pointer to pt_regs on entry stack
+ *
+ * 1) check that interrupts are disabled
+ * 2) call tick_nohz_user_enter_prepare()
+ * 3) call exit_to_user_mode_loop() if any flags from
+ * EXIT_TO_USER_MODE_WORK are set
+ * 4) check that interrupts are still disabled
+ */
+static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
+{
+ unsigned long ti_work;
+
+ lockdep_assert_irqs_disabled();
+
+ /* Flush pending rcuog wakeup before the last need_resched() check */
+ tick_nohz_user_enter_prepare();
+
+ ti_work = read_thread_flags();
+ if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
+ ti_work = exit_to_user_mode_loop(regs, ti_work);
+
+ arch_exit_to_user_mode_prepare(regs, ti_work);
+
+ /* Ensure that kernel state is sane for a return to userspace */
+ kmap_assert_nomap();
+ lockdep_assert_irqs_disabled();
+ lockdep_sys_exit();
+}
+
+/**
+ * exit_to_user_mode - Fixup state when exiting to user mode
+ *
+ * Syscall/interrupt exit enables interrupts, but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about it.
+ *
+ * 1) Trace interrupts on state
+ * 2) Invoke context tracking if enabled to adjust RCU state
+ * 3) Invoke architecture specific last minute exit code, e.g. speculation
+ * mitigations, etc.: arch_exit_to_user_mode()
+ * 4) Tell lockdep that interrupts are enabled
+ *
+ * Invoked from architecture specific code when syscall_exit_to_user_mode()
+ * is not suitable as the last step before returning to userspace. Must be
+ * invoked with interrupts disabled and the caller must be
+ * non-instrumentable.
+ * The caller has to invoke syscall_exit_to_user_mode_work() before this.
+ */
+static __always_inline void exit_to_user_mode(void)
+{
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare();
+ instrumentation_end();
+
+ user_enter_irqoff();
+ arch_exit_to_user_mode();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+}
+
+/**
+ * irqentry_enter_from_user_mode - Establish state before invoking the irq handler
+ * @regs: Pointer to currents pt_regs
+ *
+ * Invoked from architecture specific entry code with interrupts disabled.
+ * Can only be called when the interrupt entry came from user mode. The
+ * calling code must be non-instrumentable. When the function returns all
+ * state is correct and the subsequent functions can be instrumented.
+ *
+ * The function establishes state (lockdep, RCU (context tracking), tracing)
+ */
+void irqentry_enter_from_user_mode(struct pt_regs *regs);
+
+/**
+ * irqentry_exit_to_user_mode - Interrupt exit work
+ * @regs: Pointer to current's pt_regs
+ *
+ * Invoked with interrupts disabled and fully valid regs. Returns with all
+ * work handled, interrupts disabled such that the caller can immediately
+ * switch to user mode. Called from architecture specific interrupt
+ * handling code.
+ *
+ * The call order is #2 and #3 as described in syscall_exit_to_user_mode().
+ * Interrupt exit is not invoking #1 which is the syscall specific one time
+ * work.
+ */
+void irqentry_exit_to_user_mode(struct pt_regs *regs);
+
+#ifndef irqentry_state
+/**
+ * struct irqentry_state - Opaque object for exception state storage
+ * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
+ * exit path has to invoke ct_irq_exit().
+ * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
+ * lockdep state is restored correctly on exit from nmi.
+ *
+ * This opaque object is filled in by the irqentry_*_enter() functions and
+ * must be passed back into the corresponding irqentry_*_exit() functions
+ * when the exception is complete.
+ *
+ * Callers of irqentry_*_[enter|exit]() must consider this structure opaque
+ * and all members private. Descriptions of the members are provided to aid in
+ * the maintenance of the irqentry_*() functions.
+ */
+typedef struct irqentry_state {
+ union {
+ bool exit_rcu;
+ bool lockdep;
+ };
+} irqentry_state_t;
+#endif
+
+/**
+ * irqentry_enter - Handle state tracking on ordinary interrupt entries
+ * @regs: Pointer to pt_regs of interrupted context
+ *
+ * Invokes:
+ * - lockdep irqflag state tracking as low level ASM entry disabled
+ * interrupts.
+ *
+ * - Context tracking if the exception hit user mode.
+ *
+ * - The hardirq tracer to keep the state consistent as low level ASM
+ * entry disabled interrupts.
+ *
+ * As a precondition, this requires that the entry came from user mode,
+ * idle, or a kernel context in which RCU is watching.
+ *
+ * For kernel mode entries RCU handling is done conditional. If RCU is
+ * watching then the only RCU requirement is to check whether the tick has
+ * to be restarted. If RCU is not watching then ct_irq_enter() has to be
+ * invoked on entry and ct_irq_exit() on exit.
+ *
+ * Avoiding the ct_irq_enter/exit() calls is an optimization but also
+ * solves the problem of kernel mode pagefaults which can schedule, which
+ * is not possible after invoking ct_irq_enter() without undoing it.
+ *
+ * For user mode entries irqentry_enter_from_user_mode() is invoked to
+ * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
+ * would not be possible.
+ *
+ * Returns: An opaque object that must be passed to idtentry_exit()
+ */
+irqentry_state_t noinstr irqentry_enter(struct pt_regs *regs);
+
+/**
+ * irqentry_exit_cond_resched - Conditionally reschedule on return from interrupt
+ *
+ * Conditional reschedule with additional sanity checks.
+ */
+void raw_irqentry_exit_cond_resched(void);
+#ifdef CONFIG_PREEMPT_DYNAMIC
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched
+#define irqentry_exit_cond_resched_dynamic_disabled NULL
+DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
+#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void);
+#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
+#endif
+#else /* CONFIG_PREEMPT_DYNAMIC */
+#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
+#endif /* CONFIG_PREEMPT_DYNAMIC */
+
+/**
+ * irqentry_exit - Handle return from exception that used irqentry_enter()
+ * @regs: Pointer to pt_regs (exception entry regs)
+ * @state: Return value from matching call to irqentry_enter()
+ *
+ * Depending on the return target (kernel/user) this runs the necessary
+ * preemption and work checks if possible and required and returns to
+ * the caller with interrupts disabled and no further work pending.
+ *
+ * This is the last action before returning to the low level ASM code which
+ * just needs to return to the appropriate context.
+ *
+ * Counterpart to irqentry_enter().
+ */
+void noinstr irqentry_exit(struct pt_regs *regs, irqentry_state_t state);
+
+/**
+ * irqentry_nmi_enter - Handle NMI entry
+ * @regs: Pointer to currents pt_regs
+ *
+ * Similar to irqentry_enter() but taking care of the NMI constraints.
+ */
+irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs);
+
+/**
+ * irqentry_nmi_exit - Handle return from NMI handling
+ * @regs: Pointer to pt_regs (NMI entry regs)
+ * @irq_state: Return value from matching call to irqentry_nmi_enter()
+ *
+ * Last action before returning to the low level assembly code.
+ *
+ * Counterpart to irqentry_nmi_enter().
+ */
+void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state);
+
+#endif
diff --git a/kernel/entry/Makefile b/kernel/entry/Makefile
index 095c775e001e..d38f3a7e7396 100644
--- a/kernel/entry/Makefile
+++ b/kernel/entry/Makefile
@@ -9,5 +9,6 @@ KCOV_INSTRUMENT := n
CFLAGS_REMOVE_common.o = -fstack-protector -fstack-protector-strong
CFLAGS_common.o += -fno-stack-protector
-obj-$(CONFIG_GENERIC_ENTRY) += common.o syscall_user_dispatch.o
+obj-$(CONFIG_GENERIC_IRQ_ENTRY) += common.o
+obj-$(CONFIG_GENERIC_SYSCALL) += syscall-common.o syscall_user_dispatch.o
obj-$(CONFIG_KVM_XFER_TO_GUEST_WORK) += kvm.o
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 5b6934e23c21..2ad132c7be05 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -1,84 +1,14 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/context_tracking.h>
-#include <linux/entry-common.h>
+#include <linux/irq-entry-common.h>
#include <linux/resume_user_mode.h>
#include <linux/highmem.h>
#include <linux/jump_label.h>
#include <linux/kmsan.h>
#include <linux/livepatch.h>
-#include <linux/audit.h>
#include <linux/tick.h>
-#include "common.h"
-
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
-static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
-{
- if (unlikely(audit_context())) {
- unsigned long args[6];
-
- syscall_get_arguments(current, regs, args);
- audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
- }
-}
-
-long syscall_trace_enter(struct pt_regs *regs, long syscall,
- unsigned long work)
-{
- long ret = 0;
-
- /*
- * Handle Syscall User Dispatch. This must comes first, since
- * the ABI here can be something that doesn't make sense for
- * other syscall_work features.
- */
- if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
- if (syscall_user_dispatch(regs))
- return -1L;
- }
-
- /* Handle ptrace */
- if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
- ret = ptrace_report_syscall_entry(regs);
- if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
- return -1L;
- }
-
- /* Do seccomp after ptrace, to catch any tracer changes. */
- if (work & SYSCALL_WORK_SECCOMP) {
- ret = __secure_computing(NULL);
- if (ret == -1L)
- return ret;
- }
-
- /* Either of the above might have changed the syscall number */
- syscall = syscall_get_nr(current, regs);
-
- if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
- trace_sys_enter(regs, syscall);
- /*
- * Probes or BPF hooks in the tracepoint may have changed the
- * system call number as well.
- */
- syscall = syscall_get_nr(current, regs);
- }
-
- syscall_enter_audit(regs, syscall);
-
- return ret ? : syscall;
-}
-
-noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
-{
- enter_from_user_mode(regs);
- instrumentation_begin();
- local_irq_enable();
- instrumentation_end();
-}
-
/* Workaround to allow gradual conversion of architecture code */
void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
@@ -133,93 +63,6 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
return ti_work;
}
-/*
- * If SYSCALL_EMU is set, then the only reason to report is when
- * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
- * instruction has been already reported in syscall_enter_from_user_mode().
- */
-static inline bool report_single_step(unsigned long work)
-{
- if (work & SYSCALL_WORK_SYSCALL_EMU)
- return false;
-
- return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
-}
-
-static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
-{
- bool step;
-
- /*
- * If the syscall was rolled back due to syscall user dispatching,
- * then the tracers below are not invoked for the same reason as
- * the entry side was not invoked in syscall_trace_enter(): The ABI
- * of these syscalls is unknown.
- */
- if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
- if (unlikely(current->syscall_dispatch.on_dispatch)) {
- current->syscall_dispatch.on_dispatch = false;
- return;
- }
- }
-
- audit_syscall_exit(regs);
-
- if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
- trace_sys_exit(regs, syscall_get_return_value(current, regs));
-
- step = report_single_step(work);
- if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
- ptrace_report_syscall_exit(regs, step);
-}
-
-/*
- * Syscall specific exit to user mode preparation. Runs with interrupts
- * enabled.
- */
-static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
- unsigned long nr = syscall_get_nr(current, regs);
-
- CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
-
- if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
- if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
- local_irq_enable();
- }
-
- rseq_syscall(regs);
-
- /*
- * Do one-time syscall specific work. If these work items are
- * enabled, we want to run them exactly once per syscall exit with
- * interrupts enabled.
- */
- if (unlikely(work & SYSCALL_WORK_EXIT))
- syscall_exit_work(regs, work);
-}
-
-static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs)
-{
- syscall_exit_to_user_mode_prepare(regs);
- local_irq_disable_exit_to_user();
- exit_to_user_mode_prepare(regs);
-}
-
-void syscall_exit_to_user_mode_work(struct pt_regs *regs)
-{
- __syscall_exit_to_user_mode_work(regs);
-}
-
-__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs)
-{
- instrumentation_begin();
- __syscall_exit_to_user_mode_work(regs);
- instrumentation_end();
- exit_to_user_mode();
-}
-
noinstr void irqentry_enter_from_user_mode(struct pt_regs *regs)
{
enter_from_user_mode(regs);
diff --git a/kernel/entry/syscall-common.c b/kernel/entry/syscall-common.c
new file mode 100644
index 000000000000..0eb036986ad4
--- /dev/null
+++ b/kernel/entry/syscall-common.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/audit.h>
+#include <linux/entry-common.h>
+#include "common.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
+static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
+{
+ if (unlikely(audit_context())) {
+ unsigned long args[6];
+
+ syscall_get_arguments(current, regs, args);
+ audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
+ }
+}
+
+long syscall_trace_enter(struct pt_regs *regs, long syscall,
+ unsigned long work)
+{
+ long ret = 0;
+
+ /*
+ * Handle Syscall User Dispatch. This must comes first, since
+ * the ABI here can be something that doesn't make sense for
+ * other syscall_work features.
+ */
+ if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+ if (syscall_user_dispatch(regs))
+ return -1L;
+ }
+
+ /* Handle ptrace */
+ if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
+ ret = ptrace_report_syscall_entry(regs);
+ if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
+ return -1L;
+ }
+
+ /* Do seccomp after ptrace, to catch any tracer changes. */
+ if (work & SYSCALL_WORK_SECCOMP) {
+ ret = __secure_computing(NULL);
+ if (ret == -1L)
+ return ret;
+ }
+
+ /* Either of the above might have changed the syscall number */
+ syscall = syscall_get_nr(current, regs);
+
+ if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
+ trace_sys_enter(regs, syscall);
+ /*
+ * Probes or BPF hooks in the tracepoint may have changed the
+ * system call number as well.
+ */
+ syscall = syscall_get_nr(current, regs);
+ }
+
+ syscall_enter_audit(regs, syscall);
+
+ return ret ? : syscall;
+}
+
+noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
+{
+ enter_from_user_mode(regs);
+ instrumentation_begin();
+ local_irq_enable();
+ instrumentation_end();
+}
+
+/*
+ * If SYSCALL_EMU is set, then the only reason to report is when
+ * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
+ * instruction has been already reported in syscall_enter_from_user_mode().
+ */
+static inline bool report_single_step(unsigned long work)
+{
+ if (work & SYSCALL_WORK_SYSCALL_EMU)
+ return false;
+
+ return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
+}
+
+static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
+{
+ bool step;
+
+ /*
+ * If the syscall was rolled back due to syscall user dispatching,
+ * then the tracers below are not invoked for the same reason as
+ * the entry side was not invoked in syscall_trace_enter(): The ABI
+ * of these syscalls is unknown.
+ */
+ if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
+ if (unlikely(current->syscall_dispatch.on_dispatch)) {
+ current->syscall_dispatch.on_dispatch = false;
+ return;
+ }
+ }
+
+ audit_syscall_exit(regs);
+
+ if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
+ trace_sys_exit(regs, syscall_get_return_value(current, regs));
+
+ step = report_single_step(work);
+ if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
+ ptrace_report_syscall_exit(regs, step);
+}
+
+/*
+ * Syscall specific exit to user mode preparation. Runs with interrupts
+ * enabled.
+ */
+static void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
+{
+ unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
+ unsigned long nr = syscall_get_nr(current, regs);
+
+ CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
+
+ if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+ if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
+ local_irq_enable();
+ }
+
+ rseq_syscall(regs);
+
+ /*
+ * Do one-time syscall specific work. If these work items are
+ * enabled, we want to run them exactly once per syscall exit with
+ * interrupts enabled.
+ */
+ if (unlikely(work & SYSCALL_WORK_EXIT))
+ syscall_exit_work(regs, work);
+}
+
+static __always_inline void __syscall_exit_to_user_mode_work(struct pt_regs *regs)
+{
+ syscall_exit_to_user_mode_prepare(regs);
+ local_irq_disable_exit_to_user();
+ exit_to_user_mode_prepare(regs);
+}
+
+void syscall_exit_to_user_mode_work(struct pt_regs *regs)
+{
+ __syscall_exit_to_user_mode_work(regs);
+}
+
+__visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs)
+{
+ instrumentation_begin();
+ __syscall_exit_to_user_mode_work(regs);
+ instrumentation_end();
+ exit_to_user_mode();
+}
--
2.34.1
* [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (13 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 14/19] entry: Split into irq entry and syscall Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-28 18:05 ` Thomas Gleixner
2024-10-25 10:06 ` [PATCH -next v4 16/19] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
` (3 subsequent siblings)
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Following the preparation in patches 6 ~ 13, arm64_preempt_schedule_irq() is
now the same as the IRQ preemption code in the generic entry, except for the
architecture-specific checks in arm64_irqentry_exit_need_resched().
So add an arch_irqentry_exit_need_resched() hook to support
architecture-specific need_resched() checks. This does not affect the
existing architectures that use the generic entry code, but allows arm64 to
use the generic IRQ entry.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
kernel/entry/common.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 2ad132c7be05..0cc117b658b8 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -143,6 +143,20 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
return ret;
}
+/**
+ * arch_irqentry_exit_need_resched - Architecture specific need_resched() check
+ *
+ * Invoked from raw_irqentry_exit_cond_resched() to check whether a resched
+ * is needed. Defaults to returning true.
+ *
+ * The main purpose is to permit an architecture to skip preempting a task
+ * from an IRQ.
+ */
+static inline bool arch_irqentry_exit_need_resched(void);
+
+#ifndef arch_irqentry_exit_need_resched
+static inline bool arch_irqentry_exit_need_resched(void) { return true; }
+#endif
+
void raw_irqentry_exit_cond_resched(void)
{
if (!preempt_count()) {
@@ -150,7 +164,7 @@ void raw_irqentry_exit_cond_resched(void)
rcu_irq_exit_check_preempt();
if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
WARN_ON_ONCE(!on_thread_stack());
- if (need_resched())
+ if (need_resched() && arch_irqentry_exit_need_resched())
preempt_schedule_irq();
}
}
--
2.34.1
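The `#ifndef` fallback in the patch above is the usual generic-entry pattern for optional arch hooks: an architecture that defines both the `arch_irqentry_exit_need_resched` macro and its own inline function overrides the default; everyone else gets the `return true` stub. A minimal userspace model of the resulting `raw_irqentry_exit_cond_resched()` decision — `would_preempt()` and its parameters are illustrative stand-ins for the real kernel state, not kernel API:

```c
#include <stdbool.h>

/*
 * Default fallback, mirroring kernel/entry/common.c: an architecture
 * that wants custom behaviour defines the macro and its own inline
 * function before this point; otherwise the default returning true
 * is used.
 */
#ifndef arch_irqentry_exit_need_resched
static inline bool arch_irqentry_exit_need_resched(void)
{
	return true;
}
#endif

/*
 * Simplified model of raw_irqentry_exit_cond_resched(): returns whether
 * preempt_schedule_irq() would be invoked. preempt_count and
 * need_resched here are plain values standing in for the real state.
 */
static bool would_preempt(int preempt_count, bool need_resched)
{
	if (preempt_count)
		return false;
	return need_resched && arch_irqentry_exit_need_resched();
}
```

With the default hook, preemption happens exactly when the preempt count is zero and a resched is pending; an arch override can only veto, never force, preemption.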
* [PATCH -next v4 16/19] arm64: entry: Switch to generic IRQ entry
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (14 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64 Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64 Jinjie Ruan
` (2 subsequent siblings)
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
Currently, x86, RISC-V and LoongArch use the generic entry code. Convert
arm64 to use the generic entry infrastructure from kernel/entry/*.
The generic entry makes maintainers' work easier and the code
more elegant.
Switch arm64 to the generic IRQ entry first, which removes more than 100
lines of duplicated code; the next patch will switch arm64 to the generic
entry completely. Switching in two steps, as Mark suggested, makes the
series easier to review.
The changes are below:
- Remove *enter_from/exit_to_kernel_mode() and wrap with the generic
irqentry_enter/exit(). Also remove *enter_from/exit_to_user_mode()
and wrap with the generic enter_from/exit_to_user_mode(). Patches
1 ~ 5 prepare for this switch, and patch 14 splits the generic IRQ
entry from the generic syscall code so that this patch can focus on
switching to the generic IRQ entry alone.
- Remove arm64_enter/exit_nmi() and use the generic irqentry_nmi_enter/exit().
- Remove the PREEMPT_DYNAMIC code, as the generic entry does the same
thing once arm64 implements arch_irqentry_exit_need_resched(). Patches
6 ~ 13 and patch 15 bring the code closer to the generic implementation.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
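The new asm/entry-common.h added by this patch hooks arm64-specific exit work (MTE async faults, foreign FP state) into the generic exit-to-user-mode loop via ARCH_EXIT_TO_USER_MODE_WORK. A rough userspace sketch of how the generic loop folds the arch flags in — the flag values, counters, and the one-pass loop body are illustrative stand-ins, not kernel code:

```c
#include <stdbool.h>

/* Illustrative TI flag values, not the real arm64 ones */
#define _TIF_NEED_RESCHED    (1UL << 0)
#define _TIF_SIGPENDING      (1UL << 1)
#define _TIF_MTE_ASYNC_FAULT (1UL << 2)
#define _TIF_FOREIGN_FPSTATE (1UL << 3)

/* Arch opt-in: these flags make the generic loop call the arch hook */
#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
#define EXIT_TO_USER_MODE_WORK \
	(_TIF_NEED_RESCHED | _TIF_SIGPENDING | ARCH_EXIT_TO_USER_MODE_WORK)

static int mte_faults_handled;
static int fpstate_restores;

/* Stand-in for the arm64 arch_exit_to_user_mode_work() hook */
static void arch_exit_to_user_mode_work(unsigned long ti_work)
{
	if (ti_work & _TIF_MTE_ASYNC_FAULT)
		mte_faults_handled++;	/* kernel: clear flag, send SIGSEGV */
	if (ti_work & _TIF_FOREIGN_FPSTATE)
		fpstate_restores++;	/* kernel: restore FP/SIMD state */
}

/* Simplified model of the generic exit_to_user_mode_loop() */
static unsigned long exit_to_user_mode_loop(unsigned long ti_work)
{
	while (ti_work & EXIT_TO_USER_MODE_WORK) {
		/* generic work (reschedule, signals, ...) handled here */
		arch_exit_to_user_mode_work(ti_work);
		ti_work &= ~EXIT_TO_USER_MODE_WORK; /* assume all work done */
	}
	return ti_work;
}
```

The point of the split is that the generic loop stays arch-agnostic: it only knows the union mask, and anything in the arch subset is delegated to the hook.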
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/entry-common.h | 64 ++++++
arch/arm64/include/asm/preempt.h | 6 -
arch/arm64/include/asm/ptrace.h | 7 -
arch/arm64/kernel/entry-common.c | 303 ++++++--------------------
arch/arm64/kernel/signal.c | 3 +-
6 files changed, 130 insertions(+), 254 deletions(-)
create mode 100644 arch/arm64/include/asm/entry-common.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 232dcade2783..4545017cfd01 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -146,6 +146,7 @@ config ARM64
select GENERIC_CPU_DEVICES
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
+ select GENERIC_IRQ_ENTRY
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
select GENERIC_IRQ_IPI
diff --git a/arch/arm64/include/asm/entry-common.h b/arch/arm64/include/asm/entry-common.h
new file mode 100644
index 000000000000..1cc9d966a6c3
--- /dev/null
+++ b/arch/arm64/include/asm/entry-common.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_ARM64_ENTRY_COMMON_H
+#define _ASM_ARM64_ENTRY_COMMON_H
+
+#include <linux/thread_info.h>
+
+#include <asm/daifflags.h>
+#include <asm/fpsimd.h>
+#include <asm/mte.h>
+#include <asm/stacktrace.h>
+
+#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
+
+static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+ if (ti_work & _TIF_MTE_ASYNC_FAULT) {
+ clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+ send_sig_fault(SIGSEGV, SEGV_MTEAERR, (void __user *)NULL, current);
+ }
+
+ if (ti_work & _TIF_FOREIGN_FPSTATE)
+ fpsimd_restore_current_state();
+}
+
+#define arch_exit_to_user_mode_work arch_exit_to_user_mode_work
+
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ unsigned long ti_work)
+{
+ local_daif_mask();
+}
+
+#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
+
+static inline bool arch_irqentry_exit_need_resched(void)
+{
+ /*
+ * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
+ * priority masking is used the GIC irqchip driver will clear DAIF.IF
+ * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
+ * DAIF we must have handled an NMI, so skip preemption.
+ */
+ if (system_uses_irq_prio_masking() && read_sysreg(daif))
+ return false;
+
+ /*
+ * Preempting a task from an IRQ means we leave copies of PSTATE
+ * on the stack. cpufeature's enable calls may modify PSTATE, but
+ * resuming one of these preempted tasks would undo those changes.
+ *
+ * Only allow a task to be preempted once cpufeatures have been
+ * enabled.
+ */
+ if (!system_capabilities_finalized())
+ return false;
+
+ return true;
+}
+
+#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched
+
+#endif /* _ASM_ARM64_ENTRY_COMMON_H */
diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h
index 0f0ba250efe8..932ea4b62042 100644
--- a/arch/arm64/include/asm/preempt.h
+++ b/arch/arm64/include/asm/preempt.h
@@ -2,7 +2,6 @@
#ifndef __ASM_PREEMPT_H
#define __ASM_PREEMPT_H
-#include <linux/jump_label.h>
#include <linux/thread_info.h>
#define PREEMPT_NEED_RESCHED BIT(32)
@@ -85,22 +84,17 @@ static inline bool should_resched(int preempt_offset)
void preempt_schedule(void);
void preempt_schedule_notrace(void);
-void raw_irqentry_exit_cond_resched(void);
#ifdef CONFIG_PREEMPT_DYNAMIC
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
void dynamic_preempt_schedule(void);
#define __preempt_schedule() dynamic_preempt_schedule()
void dynamic_preempt_schedule_notrace(void);
#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
-void dynamic_irqentry_exit_cond_resched(void);
-#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
#else /* CONFIG_PREEMPT_DYNAMIC */
#define __preempt_schedule() preempt_schedule()
#define __preempt_schedule_notrace() preempt_schedule_notrace()
-#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
#endif /* CONFIG_PREEMPT_DYNAMIC */
#endif /* CONFIG_PREEMPTION */
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 5156c0d5fa20..f14c2adc239a 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -149,13 +149,6 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
return psr;
}
-typedef struct irqentry_state {
- union {
- bool exit_rcu;
- bool lockdep;
- };
-} irqentry_state_t;
-
/*
* This struct defines the way the registers are stored on the stack during an
* exception. struct user_pt_regs must form a prefix of struct pt_regs.
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 152216201f84..55fee0960fca 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -6,6 +6,7 @@
*/
#include <linux/context_tracking.h>
+#include <linux/irq-entry-common.h>
#include <linux/kasan.h>
#include <linux/linkage.h>
#include <linux/lockdep.h>
@@ -38,71 +39,13 @@
*/
static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
{
- irqentry_state_t ret = {
- .exit_rcu = false,
- };
-
- if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
- lockdep_hardirqs_off(CALLER_ADDR0);
- ct_irq_enter();
- trace_hardirqs_off_finish();
-
- ret.exit_rcu = true;
- return ret;
- }
-
- lockdep_hardirqs_off(CALLER_ADDR0);
- rcu_irq_enter_check_tick();
- trace_hardirqs_off_finish();
+ irqentry_state_t state = irqentry_enter(regs);
mte_check_tfsr_entry();
mte_disable_tco_entry(current);
- return ret;
-}
-
-static inline bool arm64_irqentry_exit_need_resched(void)
-{
- /*
- * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
- * priority masking is used the GIC irqchip driver will clear DAIF.IF
- * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
- * DAIF we must have handled an NMI, so skip preemption.
- */
- if (system_uses_irq_prio_masking() && read_sysreg(daif))
- return false;
-
- /*
- * Preempting a task from an IRQ means we leave copies of PSTATE
- * on the stack. cpufeature's enable calls may modify PSTATE, but
- * resuming one of these preempted tasks would undo those changes.
- *
- * Only allow a task to be preempted once cpufeatures have been
- * enabled.
- */
- if (!system_capabilities_finalized())
- return false;
-
- return true;
-}
-
-void raw_irqentry_exit_cond_resched(void)
-{
- if (!preempt_count()) {
- if (need_resched() && arm64_irqentry_exit_need_resched())
- preempt_schedule_irq();
- }
-}
-
-#ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_irqentry_exit_cond_resched(void)
-{
- if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
- return;
- raw_irqentry_exit_cond_resched();
+ return state;
}
-#endif
/*
* Handle IRQ/context state management when exiting to kernel mode.
@@ -116,26 +59,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
irqentry_state_t state)
{
mte_check_tfsr_exit();
-
- lockdep_assert_irqs_disabled();
-
- if (!regs_irqs_disabled(regs)) {
- if (state.exit_rcu) {
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- ct_irq_exit();
- lockdep_hardirqs_on(CALLER_ADDR0);
- return;
- }
-
- if (IS_ENABLED(CONFIG_PREEMPTION))
- irqentry_exit_cond_resched();
-
- trace_hardirqs_on();
- } else {
- if (state.exit_rcu)
- ct_irq_exit();
- }
+ irqentry_exit(regs, state);
}
/*
@@ -143,127 +67,26 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
* Before this function is called it is not safe to call regular kernel code,
* instrumentable code, or any code which may trigger an exception.
*/
-static __always_inline void enter_from_user_mode(struct pt_regs *regs)
+static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
- CT_WARN_ON(ct_state() != CT_STATE_USER);
- user_exit_irqoff();
- trace_hardirqs_off_finish();
+ enter_from_user_mode(regs);
mte_disable_tco_entry(current);
}
-/*
- * Handle IRQ/context state management when exiting to user mode.
- * After this function returns it is not safe to call regular kernel code,
- * instrumentable code, or any code which may trigger an exception.
- */
-static __always_inline void __exit_to_user_mode(void)
-{
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- user_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
-}
-
-static void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
+static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
{
- do {
- local_irq_enable();
-
- if (thread_flags & _TIF_NEED_RESCHED)
- schedule();
-
- if (thread_flags & _TIF_UPROBE)
- uprobe_notify_resume(regs);
-
- if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
- clear_thread_flag(TIF_MTE_ASYNC_FAULT);
- send_sig_fault(SIGSEGV, SEGV_MTEAERR,
- (void __user *)NULL, current);
- }
-
- if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
- do_signal(regs);
-
- if (thread_flags & _TIF_NOTIFY_RESUME)
- resume_user_mode_work(regs);
-
- if (thread_flags & _TIF_FOREIGN_FPSTATE)
- fpsimd_restore_current_state();
-
- local_irq_disable();
- thread_flags = read_thread_flags();
- } while (thread_flags & _TIF_WORK_MASK);
-}
-
-static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long flags;
-
local_irq_disable();
- flags = read_thread_flags();
- if (unlikely(flags & _TIF_WORK_MASK))
- do_notify_resume(regs, flags);
-
- local_daif_mask();
-
- lockdep_sys_exit();
-}
-
-static __always_inline void exit_to_user_mode(struct pt_regs *regs)
-{
+ instrumentation_begin();
exit_to_user_mode_prepare(regs);
+ instrumentation_end();
mte_check_tfsr_exit();
- __exit_to_user_mode();
+ exit_to_user_mode();
}
asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
{
- exit_to_user_mode(regs);
-}
-
-/*
- * Handle IRQ/context state management when entering an NMI from user/kernel
- * mode. Before this function is called it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static noinstr irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
-{
- irqentry_state_t irq_state;
-
- irq_state.lockdep = lockdep_hardirqs_enabled();
-
- __nmi_enter();
- lockdep_hardirqs_off(CALLER_ADDR0);
- lockdep_hardirq_enter();
- ct_nmi_enter();
-
- trace_hardirqs_off_finish();
- ftrace_nmi_enter();
-
- return irq_state;
-}
-
-/*
- * Handle IRQ/context state management when exiting an NMI from user/kernel
- * mode. After this function returns it is not safe to call regular kernel
- * code, instrumentable code, or any code which may trigger an exception.
- */
-static void noinstr arm64_exit_nmi(struct pt_regs *regs,
- irqentry_state_t irq_state)
-{
- ftrace_nmi_exit();
- if (irq_state.lockdep) {
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare();
- }
-
- ct_nmi_exit();
- lockdep_hardirq_exit();
- if (irq_state.lockdep)
- lockdep_hardirqs_on(CALLER_ADDR0);
- __nmi_exit();
+ arm64_exit_to_user_mode(regs);
}
/*
@@ -322,7 +145,7 @@ extern void (*handle_arch_fiq)(struct pt_regs *);
static void noinstr __panic_unhandled(struct pt_regs *regs, const char *vector,
unsigned long esr)
{
- arm64_enter_nmi(regs);
+ irqentry_nmi_enter(regs);
console_verbose();
@@ -556,10 +379,10 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
static __always_inline void __el1_pnmi(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- irqentry_state_t state = arm64_enter_nmi(regs);
+ irqentry_state_t state = irqentry_nmi_enter(regs);
do_interrupt_handler(regs, handler);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
}
static __always_inline void __el1_irq(struct pt_regs *regs,
@@ -600,19 +423,19 @@ asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs)
irqentry_state_t state;
local_daif_restore(DAIF_ERRCTX);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
}
static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_mem_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
@@ -627,50 +450,50 @@ static void noinstr el0_ia(struct pt_regs *regs, unsigned long esr)
if (!is_ttbr0_addr(far))
arm64_apply_bp_hardening();
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_mem_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpsimd_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_fpsimd_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sve_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sme_acc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_fpsimd_exc(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sys(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_sys(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
@@ -680,58 +503,58 @@ static void noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
if (!is_ttbr0_addr(instruction_pointer(regs)))
arm64_apply_bp_hardening();
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sp_pc_abort(far, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_sp(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_sp_pc_abort(regs->sp, esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_undef(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_undef(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_bti(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_bti(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_mops(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_gcs(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_gcs(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
bad_el0_sync(regs, 0, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
@@ -739,28 +562,28 @@ static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
/* Only watchpoints write FAR_EL1, otherwise its UNKNOWN */
unsigned long far = read_sysreg(far_el1);
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
do_debug_exception(far, esr, regs);
local_daif_restore(DAIF_PROCCTX);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_svc(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
fp_user_discard();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_fpac(regs, esr);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
@@ -828,7 +651,7 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
static void noinstr el0_interrupt(struct pt_regs *regs,
void (*handler)(struct pt_regs *))
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
@@ -839,7 +662,7 @@ static void noinstr el0_interrupt(struct pt_regs *regs,
do_interrupt_handler(regs, handler);
irq_exit_rcu();
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr __el0_irq_handler_common(struct pt_regs *regs)
@@ -867,13 +690,13 @@ static void noinstr __el0_error_handler_common(struct pt_regs *regs)
unsigned long esr = read_sysreg(esr_el1);
irqentry_state_t state;
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_ERRCTX);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
do_serror(regs, esr);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
local_daif_restore(DAIF_PROCCTX);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
@@ -884,19 +707,19 @@ asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
#ifdef CONFIG_COMPAT
static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_cp15(esr, regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
static void noinstr el0_svc_compat(struct pt_regs *regs)
{
- enter_from_user_mode(regs);
+ arm64_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc_compat(regs);
- exit_to_user_mode(regs);
+ arm64_exit_to_user_mode(regs);
}
asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
@@ -970,7 +793,7 @@ asmlinkage void noinstr __noreturn handle_bad_stack(struct pt_regs *regs)
unsigned long esr = read_sysreg(esr_el1);
unsigned long far = read_sysreg(far_el1);
- arm64_enter_nmi(regs);
+ irqentry_nmi_enter(regs);
panic_bad_stack(regs, esr, far);
}
#endif /* CONFIG_VMAP_STACK */
@@ -1004,9 +827,9 @@ __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
else if (cpu_has_pan())
set_pstate_pan(0);
- state = arm64_enter_nmi(regs);
+ state = irqentry_nmi_enter(regs);
ret = do_sdei_event(regs, arg);
- arm64_exit_nmi(regs, state);
+ irqentry_nmi_exit(regs, state);
return ret;
}
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 2eb2e97a934f..04b20c2f6cda 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -9,6 +9,7 @@
#include <linux/cache.h>
#include <linux/compat.h>
#include <linux/errno.h>
+#include <linux/irq-entry-common.h>
#include <linux/kernel.h>
#include <linux/signal.h>
#include <linux/freezer.h>
@@ -1540,7 +1541,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* the kernel can handle, and then we build all the user-level signal handling
* stack-frames in one go after that.
*/
-void do_signal(struct pt_regs *regs)
+void arch_do_signal_or_restart(struct pt_regs *regs)
{
unsigned long continue_addr = 0, restart_addr = 0;
int retval = 0;
--
2.34.1
* [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (15 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 16/19] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-28 18:21 ` Thomas Gleixner
2024-10-25 10:06 ` [PATCH -next v4 18/19] arm64/ptrace: Split report_syscall() into separate enter and exit functions Jinjie Ruan
2024-10-25 10:07 ` [PATCH -next v4 19/19] arm64: entry: Convert to generic entry Jinjie Ruan
18 siblings, 1 reply; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
To: oleg, linux, will, mark.rutland, catalin.marinas, sstabellini,
maz, tglx, peterz, luto, kees, wad, akpm, samitolvanen, arnd,
ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck, aquini,
petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb, wangkefeng.wang,
surenb, linus.walleij, yangyj.ee, broonie, mbenes, puranjay, pcc,
guohanjun, sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
Add syscall arch functions so that arm64 can use the generic syscall
code; they do not affect the existing architectures that use generic entry:
- arch_pre/post_report_syscall_entry/exit().
Also make syscall_exit_work() non-static and move report_single_step() to
thread_info.h, so that arm64 can use them later.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
include/linux/entry-common.h | 1 +
include/linux/thread_info.h | 13 +++++
kernel/entry/syscall-common.c | 100 ++++++++++++++++++++++++++++++----
3 files changed, 103 insertions(+), 11 deletions(-)
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 1ae3143d4b12..39a2d41af05e 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -178,4 +178,5 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs);
*/
void syscall_exit_to_user_mode(struct pt_regs *regs);
+void syscall_exit_work(struct pt_regs *regs, unsigned long work);
#endif
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 9ea0b28068f4..062de9666ef3 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -55,6 +55,19 @@ enum syscall_work_bit {
#define SYSCALL_WORK_SYSCALL_AUDIT BIT(SYSCALL_WORK_BIT_SYSCALL_AUDIT)
#define SYSCALL_WORK_SYSCALL_USER_DISPATCH BIT(SYSCALL_WORK_BIT_SYSCALL_USER_DISPATCH)
#define SYSCALL_WORK_SYSCALL_EXIT_TRAP BIT(SYSCALL_WORK_BIT_SYSCALL_EXIT_TRAP)
+
+/*
+ * If SYSCALL_EMU is set, then the only reason to report is when
+ * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
+ * instruction has been already reported in syscall_enter_from_user_mode().
+ */
+static inline bool report_single_step(unsigned long work)
+{
+ if (work & SYSCALL_WORK_SYSCALL_EMU)
+ return false;
+
+ return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
+}
#endif
#include <asm/thread_info.h>
diff --git a/kernel/entry/syscall-common.c b/kernel/entry/syscall-common.c
index 0eb036986ad4..73f87d09e04e 100644
--- a/kernel/entry/syscall-common.c
+++ b/kernel/entry/syscall-common.c
@@ -17,6 +17,49 @@ static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
}
}
+/**
+ * arch_pre_report_syscall_entry - Architecture specific work before
+ * report_syscall_entry().
+ *
+ * Invoked from syscall_trace_enter() to prepare for ptrace_report_syscall_entry().
+ * Defaults to NOP.
+ *
+ * The main purpose is for saving a general purpose register clobbered
+ * in the tracee.
+ */
+static inline unsigned long arch_pre_report_syscall_entry(struct pt_regs *regs);
+
+#ifndef arch_pre_report_syscall_entry
+static inline unsigned long arch_pre_report_syscall_entry(struct pt_regs *regs)
+{
+ return 0;
+}
+#endif
+
+/**
+ * arch_post_report_syscall_entry - Architecture specific work after
+ * report_syscall_entry().
+ *
+ * Invoked from syscall_trace_enter() after calling ptrace_report_syscall_entry().
+ * Defaults to NOP.
+ *
+ * The main purpose is for restoring a general purpose register clobbered
+ * in the tracee, as saved by arch_pre_report_syscall_entry(); it can also
+ * do something arch-specific according to the return value of
+ * ptrace_report_syscall_entry().
+ */
+static inline void arch_post_report_syscall_entry(struct pt_regs *regs,
+ unsigned long saved_reg,
+ long ret);
+
+#ifndef arch_post_report_syscall_entry
+static inline void arch_post_report_syscall_entry(struct pt_regs *regs,
+ unsigned long saved_reg,
+ long ret)
+{
+}
+#endif
+
long syscall_trace_enter(struct pt_regs *regs, long syscall,
unsigned long work)
{
@@ -34,7 +77,9 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
/* Handle ptrace */
if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
+ unsigned long saved_reg = arch_pre_report_syscall_entry(regs);
ret = ptrace_report_syscall_entry(regs);
+ arch_post_report_syscall_entry(regs, saved_reg, ret);
if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
return -1L;
}
@@ -71,20 +116,50 @@ noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
instrumentation_end();
}
-/*
- * If SYSCALL_EMU is set, then the only reason to report is when
- * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
- * instruction has been already reported in syscall_enter_from_user_mode().
+/**
+ * arch_pre_report_syscall_exit - Architecture specific work before
+ * report_syscall_exit().
+ *
+ * Invoked from syscall_exit_work() to prepare for ptrace_report_syscall_exit().
+ * Defaults to NOP.
+ *
+ * The main purpose is for saving a general purpose register clobbered
+ * in the tracee.
*/
-static inline bool report_single_step(unsigned long work)
-{
- if (work & SYSCALL_WORK_SYSCALL_EMU)
- return false;
+static inline unsigned long arch_pre_report_syscall_exit(struct pt_regs *regs,
+ unsigned long work);
- return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
+#ifndef arch_pre_report_syscall_exit
+static inline unsigned long arch_pre_report_syscall_exit(struct pt_regs *regs,
+ unsigned long work)
+{
+ return 0;
}
+#endif
+
+/**
+ * arch_post_report_syscall_exit - Architecture specific work after
+ * report_syscall_exit().
+ *
+ * Invoked from syscall_exit_work() after calling ptrace_report_syscall_exit().
+ * Defaults to NOP.
+ *
+ * The main purpose is for restoring a general purpose register clobbered
+ * in the tracee, as saved by arch_pre_report_syscall_exit().
+ */
+static inline void arch_post_report_syscall_exit(struct pt_regs *regs,
+ unsigned long saved_reg,
+ unsigned long work);
+
+#ifndef arch_post_report_syscall_exit
+static inline void arch_post_report_syscall_exit(struct pt_regs *regs,
+ unsigned long saved_reg,
+ unsigned long work)
+{
+}
+#endif
-static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
+void syscall_exit_work(struct pt_regs *regs, unsigned long work)
{
bool step;
@@ -107,8 +182,11 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
trace_sys_exit(regs, syscall_get_return_value(current, regs));
step = report_single_step(work);
- if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
+ if (step || work & SYSCALL_WORK_SYSCALL_TRACE) {
+ unsigned long saved_reg = arch_pre_report_syscall_exit(regs, work);
ptrace_report_syscall_exit(regs, step);
+ arch_post_report_syscall_exit(regs, saved_reg, work);
+ }
}
/*
--
2.34.1
* [PATCH -next v4 18/19] arm64/ptrace: Split report_syscall() into separate enter and exit functions
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (16 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64 Jinjie Ruan
@ 2024-10-25 10:06 ` Jinjie Ruan
2024-10-25 10:07 ` [PATCH -next v4 19/19] arm64: entry: Convert to generic entry Jinjie Ruan
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:06 UTC (permalink / raw)
To: oleg, linux, will, mark.rutland, catalin.marinas, sstabellini,
maz, tglx, peterz, luto, kees, wad, akpm, samitolvanen, arnd,
ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck, aquini,
petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb, wangkefeng.wang,
surenb, linus.walleij, yangyj.ee, broonie, mbenes, puranjay, pcc,
guohanjun, sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
Split report_syscall() into two separate enter and exit
functions, so it will be clearer when arm64 switches to the
generic entry code.
No functional changes.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/ptrace.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 6c1dcfe6d25a..6ea303ab9e22 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2290,7 +2290,7 @@ enum ptrace_syscall_dir {
PTRACE_SYSCALL_EXIT,
};
-static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
+static void report_syscall_enter(struct pt_regs *regs)
{
int regno;
unsigned long saved_reg;
@@ -2313,13 +2313,24 @@ static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
*/
regno = (is_compat_task() ? 12 : 7);
saved_reg = regs->regs[regno];
- regs->regs[regno] = dir;
+ regs->regs[regno] = PTRACE_SYSCALL_ENTER;
- if (dir == PTRACE_SYSCALL_ENTER) {
- if (ptrace_report_syscall_entry(regs))
- forget_syscall(regs);
- regs->regs[regno] = saved_reg;
- } else if (!test_thread_flag(TIF_SINGLESTEP)) {
+ if (ptrace_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+}
+
+static void report_syscall_exit(struct pt_regs *regs)
+{
+ int regno;
+ unsigned long saved_reg;
+
+ /* See comment for report_syscall_enter() */
+ regno = (is_compat_task() ? 12 : 7);
+ saved_reg = regs->regs[regno];
+ regs->regs[regno] = PTRACE_SYSCALL_EXIT;
+
+ if (!test_thread_flag(TIF_SINGLESTEP)) {
ptrace_report_syscall_exit(regs, 0);
regs->regs[regno] = saved_reg;
} else {
@@ -2339,7 +2350,7 @@ int syscall_trace_enter(struct pt_regs *regs)
unsigned long flags = read_thread_flags();
if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
- report_syscall(regs, PTRACE_SYSCALL_ENTER);
+ report_syscall_enter(regs);
if (flags & _TIF_SYSCALL_EMU)
return NO_SYSCALL;
}
@@ -2367,7 +2378,7 @@ void syscall_trace_exit(struct pt_regs *regs)
trace_sys_exit(regs, syscall_get_return_value(current, regs));
if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
- report_syscall(regs, PTRACE_SYSCALL_EXIT);
+ report_syscall_exit(regs);
rseq_syscall(regs);
}
--
2.34.1
* [PATCH -next v4 19/19] arm64: entry: Convert to generic entry
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
` (17 preceding siblings ...)
2024-10-25 10:06 ` [PATCH -next v4 18/19] arm64/ptrace: Split report_syscall() into separate enter and exit functions Jinjie Ruan
@ 2024-10-25 10:07 ` Jinjie Ruan
18 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-25 10:07 UTC (permalink / raw)
To: oleg, linux, will, mark.rutland, catalin.marinas, sstabellini,
maz, tglx, peterz, luto, kees, wad, akpm, samitolvanen, arnd,
ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck, aquini,
petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb, wangkefeng.wang,
surenb, linus.walleij, yangyj.ee, broonie, mbenes, puranjay, pcc,
guohanjun, sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
Currently, x86, RISC-V and LoongArch use the generic entry. Convert arm64
to use the generic entry infrastructure from kernel/entry/*.
The generic entry makes maintainers' work easier and the code more elegant.
The changes are below:
- Remove the TIF_SYSCALL_* flags, _TIF_WORK_MASK and _TIF_SYSCALL_WORK.
- Remove syscall_trace_enter/exit() and use the identical generic functions.
Tested OK with the following test cases on QEMU Cortex-A53 and HiSilicon
Kunpeng-920:
- Perf tests.
- Different `dynamic preempt` mode switch.
- Pseudo NMI tests.
- Stress-ng CPU stress test.
- MTE test case in Documentation/arch/arm64/memory-tagging-extension.rst
and all test cases in tools/testing/selftests/arm64/mte/* (QEMU only).
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/entry-common.h | 85 ++++++++++++++++++++++
arch/arm64/include/asm/syscall.h | 6 +-
arch/arm64/include/asm/thread_info.h | 23 +-----
arch/arm64/kernel/ptrace.c | 101 --------------------------
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/syscall.c | 18 +++--
7 files changed, 103 insertions(+), 134 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 4545017cfd01..89d46d0fb18b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -146,7 +146,7 @@ config ARM64
select GENERIC_CPU_DEVICES
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
- select GENERIC_IRQ_ENTRY
+ select GENERIC_ENTRY
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
select GENERIC_IRQ_IPI
diff --git a/arch/arm64/include/asm/entry-common.h b/arch/arm64/include/asm/entry-common.h
index 1cc9d966a6c3..04a31b4fc4fd 100644
--- a/arch/arm64/include/asm/entry-common.h
+++ b/arch/arm64/include/asm/entry-common.h
@@ -10,6 +10,11 @@
#include <asm/mte.h>
#include <asm/stacktrace.h>
+enum ptrace_syscall_dir {
+ PTRACE_SYSCALL_ENTER = 0,
+ PTRACE_SYSCALL_EXIT,
+};
+
#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
@@ -61,4 +66,84 @@ static inline bool arch_irqentry_exit_need_resched(void)
#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched
+static inline unsigned long arch_pre_report_syscall_entry(struct pt_regs *regs)
+{
+ unsigned long saved_reg;
+ int regno;
+
+ /*
+ * We have some ABI weirdness here in the way that we handle syscall
+ * exit stops because we indicate whether or not the stop has been
+ * signalled from syscall entry or syscall exit by clobbering a general
+ * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
+ * and restoring its old value after the stop. This means that:
+ *
+ * - Any writes by the tracer to this register during the stop are
+ * ignored/discarded.
+ *
+ * - The actual value of the register is not available during the stop,
+ * so the tracer cannot save it and restore it later.
+ *
+ * - Syscall stops behave differently to seccomp and pseudo-step traps
+ * (the latter do not nobble any registers).
+ */
+ regno = (is_compat_task() ? 12 : 7);
+ saved_reg = regs->regs[regno];
+ regs->regs[regno] = PTRACE_SYSCALL_ENTER;
+
+ return saved_reg;
+}
+
+#define arch_pre_report_syscall_entry arch_pre_report_syscall_entry
+
+static inline void arch_post_report_syscall_entry(struct pt_regs *regs,
+ unsigned long saved_reg, long ret)
+{
+ int regno = (is_compat_task() ? 12 : 7);
+
+ if (ret)
+ forget_syscall(regs);
+
+ regs->regs[regno] = saved_reg;
+}
+
+#define arch_post_report_syscall_entry arch_post_report_syscall_entry
+
+static inline unsigned long arch_pre_report_syscall_exit(struct pt_regs *regs,
+ unsigned long work)
+{
+ unsigned long saved_reg;
+ int regno;
+
+ /* See comment for arch_pre_report_syscall_entry() */
+ regno = (is_compat_task() ? 12 : 7);
+ saved_reg = regs->regs[regno];
+ regs->regs[regno] = PTRACE_SYSCALL_EXIT;
+
+ if (report_single_step(work)) {
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ regs->regs[regno] = saved_reg;
+ }
+
+ return saved_reg;
+}
+
+#define arch_pre_report_syscall_exit arch_pre_report_syscall_exit
+
+static inline void arch_post_report_syscall_exit(struct pt_regs *regs,
+ unsigned long saved_reg,
+ unsigned long work)
+{
+ int regno = (is_compat_task() ? 12 : 7);
+
+ if (!report_single_step(work))
+ regs->regs[regno] = saved_reg;
+}
+
+#define arch_post_report_syscall_exit arch_post_report_syscall_exit
+
#endif /* _ASM_ARM64_ENTRY_COMMON_H */
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index ab8e14b96f68..9891b15da4c3 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -85,7 +85,9 @@ static inline int syscall_get_arch(struct task_struct *task)
return AUDIT_ARCH_AARCH64;
}
-int syscall_trace_enter(struct pt_regs *regs);
-void syscall_trace_exit(struct pt_regs *regs);
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+{
+ return false;
+}
#endif /* __ASM_SYSCALL_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 1114c1c3300a..543fdb00d713 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -43,6 +43,7 @@ struct thread_info {
void *scs_sp;
#endif
u32 cpu;
+ unsigned long syscall_work; /* SYSCALL_WORK_ flags */
};
#define thread_saved_pc(tsk) \
@@ -64,11 +65,6 @@ void arch_setup_new_exec(void);
#define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */
#define TIF_MTE_ASYNC_FAULT 5 /* MTE Asynchronous Tag Check Fault */
#define TIF_NOTIFY_SIGNAL 6 /* signal notifications exist */
-#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
-#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
-#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
-#define TIF_SECCOMP 11 /* syscall secure computing */
-#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
#define TIF_FREEZE 19
#define TIF_RESTORE_SIGMASK 20
@@ -87,28 +83,13 @@ void arch_setup_new_exec(void);
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
-#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
-#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
-#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
-#define _TIF_SECCOMP (1 << TIF_SECCOMP)
-#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
-#define _TIF_UPROBE (1 << TIF_UPROBE)
-#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
+#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
#define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
#define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
-#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
- _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
- _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
- _TIF_NOTIFY_SIGNAL)
-
-#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
- _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
- _TIF_SYSCALL_EMU)
-
#ifdef CONFIG_SHADOW_CALL_STACK
#define INIT_SCS \
.scs_base = init_shadow_call_stack, \
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 6ea303ab9e22..0f642ed4dbe4 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -42,9 +42,6 @@
#include <asm/traps.h>
#include <asm/system_misc.h>
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
struct pt_regs_offset {
const char *name;
int offset;
@@ -2285,104 +2282,6 @@ long arch_ptrace(struct task_struct *child, long request,
return ptrace_request(child, request, addr, data);
}
-enum ptrace_syscall_dir {
- PTRACE_SYSCALL_ENTER = 0,
- PTRACE_SYSCALL_EXIT,
-};
-
-static void report_syscall_enter(struct pt_regs *regs)
-{
- int regno;
- unsigned long saved_reg;
-
- /*
- * We have some ABI weirdness here in the way that we handle syscall
- * exit stops because we indicate whether or not the stop has been
- * signalled from syscall entry or syscall exit by clobbering a general
- * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
- * and restoring its old value after the stop. This means that:
- *
- * - Any writes by the tracer to this register during the stop are
- * ignored/discarded.
- *
- * - The actual value of the register is not available during the stop,
- * so the tracer cannot save it and restore it later.
- *
- * - Syscall stops behave differently to seccomp and pseudo-step traps
- * (the latter do not nobble any registers).
- */
- regno = (is_compat_task() ? 12 : 7);
- saved_reg = regs->regs[regno];
- regs->regs[regno] = PTRACE_SYSCALL_ENTER;
-
- if (ptrace_report_syscall_entry(regs))
- forget_syscall(regs);
- regs->regs[regno] = saved_reg;
-}
-
-static void report_syscall_exit(struct pt_regs *regs)
-{
- int regno;
- unsigned long saved_reg;
-
- /* See comment for report_syscall_enter() */
- regno = (is_compat_task() ? 12 : 7);
- saved_reg = regs->regs[regno];
- regs->regs[regno] = PTRACE_SYSCALL_EXIT;
-
- if (!test_thread_flag(TIF_SINGLESTEP)) {
- ptrace_report_syscall_exit(regs, 0);
- regs->regs[regno] = saved_reg;
- } else {
- regs->regs[regno] = saved_reg;
-
- /*
- * Signal a pseudo-step exception since we are stepping but
- * tracer modifications to the registers may have rewound the
- * state machine.
- */
- ptrace_report_syscall_exit(regs, 1);
- }
-}
-
-int syscall_trace_enter(struct pt_regs *regs)
-{
- unsigned long flags = read_thread_flags();
-
- if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
- report_syscall_enter(regs);
- if (flags & _TIF_SYSCALL_EMU)
- return NO_SYSCALL;
- }
-
- /* Do the secure computing after ptrace; failures should be fast. */
- if (secure_computing() == -1)
- return NO_SYSCALL;
-
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
- trace_sys_enter(regs, regs->syscallno);
-
- audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1],
- regs->regs[2], regs->regs[3]);
-
- return regs->syscallno;
-}
-
-void syscall_trace_exit(struct pt_regs *regs)
-{
- unsigned long flags = read_thread_flags();
-
- audit_syscall_exit(regs);
-
- if (flags & _TIF_SYSCALL_TRACEPOINT)
- trace_sys_exit(regs, syscall_get_return_value(current, regs));
-
- if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
- report_syscall_exit(regs);
-
- rseq_syscall(regs);
-}
-
/*
* SPSR_ELx bits which are always architecturally RES0 per ARM DDI 0487D.a.
* We permit userspace to set SSBS (AArch64 bit 12, AArch32 bit 23) which is
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 04b20c2f6cda..4965cb80e67e 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -8,8 +8,8 @@
#include <linux/cache.h>
#include <linux/compat.h>
+#include <linux/entry-common.h>
#include <linux/errno.h>
-#include <linux/irq-entry-common.h>
#include <linux/kernel.h>
#include <linux/signal.h>
#include <linux/freezer.h>
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index c442fcec6b9e..ea818e3d597b 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -2,6 +2,7 @@
#include <linux/compiler.h>
#include <linux/context_tracking.h>
+#include <linux/entry-common.h>
#include <linux/errno.h>
#include <linux/nospec.h>
#include <linux/ptrace.h>
@@ -65,14 +66,15 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
choose_random_kstack_offset(get_random_u16());
}
-static inline bool has_syscall_work(unsigned long flags)
+static inline bool has_syscall_work(unsigned long work)
{
- return unlikely(flags & _TIF_SYSCALL_WORK);
+ return unlikely(work & SYSCALL_WORK_ENTER);
}
static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
const syscall_fn_t syscall_table[])
{
+ unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
unsigned long flags = read_thread_flags();
regs->orig_x0 = regs->regs[0];
@@ -106,7 +108,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
return;
}
- if (has_syscall_work(flags)) {
+ if (has_syscall_work(work)) {
/*
* The de-facto standard way to skip a system call using ptrace
* is to set the system call to -1 (NO_SYSCALL) and set x0 to a
@@ -124,7 +126,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
*/
if (scno == NO_SYSCALL)
syscall_set_return_value(current, regs, -ENOSYS, 0);
- scno = syscall_trace_enter(regs);
+ scno = syscall_trace_enter(regs, regs->syscallno, work);
if (scno == NO_SYSCALL)
goto trace_exit;
}
@@ -136,14 +138,14 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
* check again. However, if we were tracing entry, then we always trace
* exit regardless, as the old entry assembly did.
*/
- if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
- flags = read_thread_flags();
- if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP))
+ if (!has_syscall_work(work) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
+ work = READ_ONCE(current_thread_info()->syscall_work);
+ if (!has_syscall_work(work) && !report_single_step(work))
return;
}
trace_exit:
- syscall_trace_exit(regs);
+ syscall_exit_work(regs, work);
}
void do_el0_svc(struct pt_regs *regs)
--
2.34.1
* Re: [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64
2024-10-25 10:06 ` [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64 Jinjie Ruan
@ 2024-10-28 18:05 ` Thomas Gleixner
2024-10-28 22:15 ` Thomas Gleixner
2024-10-29 2:33 ` Jinjie Ruan
0 siblings, 2 replies; 35+ messages in thread
From: Thomas Gleixner @ 2024-10-28 18:05 UTC (permalink / raw)
To: Jinjie Ruan, oleg, linux, will, mark.rutland, catalin.marinas,
sstabellini, maz, peterz, luto, kees, wad, akpm, samitolvanen,
arnd, ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck,
aquini, petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb,
wangkefeng.wang, surenb, linus.walleij, yangyj.ee, broonie,
mbenes, puranjay, pcc, guohanjun, sudeep.holla, Jonathan.Cameron,
prarit, liuwei09, dwmw, oliver.upton, kristina.martsenko, ptosi,
frederic, vschneid, thiago.bauermann, joey.gouly, liuyuntao12,
leobras, linux-kernel, linux-arm-kernel, xen-devel
On Fri, Oct 25 2024 at 18:06, Jinjie Ruan wrote:
> As the front patch 6 ~ 13 did, the arm64_preempt_schedule_irq() is
Once this series is applied nobody knows what 'front patch 6 ~ 13' did.
> same with the irq preempt schedule code of generic entry besides those
> architecture-related logic called arm64_irqentry_exit_need_resched().
>
> So add arch irqentry_exit_need_resched() to support architecture-related
> need_resched() check logic, which do not affect existing architectures
> that use generic entry, but support arm64 to use generic irq entry.
Simply say:
ARM64 requires an additional whether to reschedule on return from
interrupt.
Add arch_irqentry_exit_need_resched() as the default NOOP
implementation and hook it up into the need_resched() condition in
raw_irqentry_exit_cond_resched().
This allows ARM64 to implement the architecture specific version for
switching over to the generic entry code.
That explains things completely independently. Hmm?
Thanks,
tglx
* Re: [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64
2024-10-25 10:06 ` [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64 Jinjie Ruan
@ 2024-10-28 18:21 ` Thomas Gleixner
0 siblings, 0 replies; 35+ messages in thread
From: Thomas Gleixner @ 2024-10-28 18:21 UTC (permalink / raw)
To: Jinjie Ruan, oleg, linux, will, mark.rutland, catalin.marinas,
sstabellini, maz, peterz, luto, kees, wad, akpm, samitolvanen,
arnd, ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck,
aquini, petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb,
wangkefeng.wang, surenb, linus.walleij, yangyj.ee, broonie,
mbenes, puranjay, pcc, guohanjun, sudeep.holla, Jonathan.Cameron,
prarit, liuwei09, dwmw, oliver.upton, kristina.martsenko, ptosi,
frederic, vschneid, thiago.bauermann, joey.gouly, liuyuntao12,
leobras, linux-kernel, linux-arm-kernel, xen-devel
On Fri, Oct 25 2024 at 18:06, Jinjie Ruan wrote:
$Subject: Can you please make this simply:
entry: Add arch_pre/post_report_syscall_entry/exit()
> Add some syscall arch functions to support arm64 to use generic syscall
> code, which do not affect existing architectures that use generic entry:
>
> - arch_pre/post_report_syscall_entry/exit().
> Also make syscall_exit_work() not static and move report_single_step() to
> thread_info.h, which can be used by arm64 later.
This does way too many things which have nothing to do with the subject
line.
> long syscall_trace_enter(struct pt_regs *regs, long syscall,
> unsigned long work)
> {
> @@ -34,7 +77,9 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
>
> /* Handle ptrace */
> if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
> + unsigned long saved_reg = arch_pre_report_syscall_entry(regs);
Lacks a new line between declaration and code.
> ret = ptrace_report_syscall_entry(regs);
> + arch_post_report_syscall_entry(regs, saved_reg, ret);
Though I'm not sure whether these pre/post hooks buy anything. It's
probably simpler to do:
- ret = ptrace_report_syscall_entry(regs);
+ ret = arch_ptrace_report_syscall_entry(regs);
And have the default implementation as
return ptrace_report_syscall_entry(regs);
and let ARM64 implement it's magic around it in the architecture
header. The actual ptrace_report_syscall_entry() is simple enough to be
in both places. That reduces the inflation of architecture specific
helpers and keeps the code tidy.
Thanks,
tglx
* Re: [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64
2024-10-28 18:05 ` Thomas Gleixner
@ 2024-10-28 22:15 ` Thomas Gleixner
2024-10-29 2:33 ` Jinjie Ruan
1 sibling, 0 replies; 35+ messages in thread
From: Thomas Gleixner @ 2024-10-28 22:15 UTC (permalink / raw)
To: Jinjie Ruan, oleg, linux, will, mark.rutland, catalin.marinas,
sstabellini, maz, peterz, luto, kees, wad, akpm, samitolvanen,
arnd, ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck,
aquini, petr.pavlu, ruanjinjie, viro, rmk+kernel, ardb,
wangkefeng.wang, surenb, linus.walleij, yangyj.ee, broonie,
mbenes, puranjay, pcc, guohanjun, sudeep.holla, Jonathan.Cameron,
prarit, liuwei09, dwmw, oliver.upton, kristina.martsenko, ptosi,
frederic, vschneid, thiago.bauermann, joey.gouly, liuyuntao12,
leobras, linux-kernel, linux-arm-kernel, xen-devel
On Mon, Oct 28 2024 at 19:05, Thomas Gleixner wrote:
> On Fri, Oct 25 2024 at 18:06, Jinjie Ruan wrote:
>
>> As the front patch 6 ~ 13 did, the arm64_preempt_schedule_irq() is
>
> Once this series is applied nobody knows what 'front patch 6 ~ 13' did.
>
>> same with the irq preempt schedule code of generic entry besides those
>> architecture-related logic called arm64_irqentry_exit_need_resched().
>>
>> So add arch irqentry_exit_need_resched() to support architecture-related
>> need_resched() check logic, which do not affect existing architectures
>> that use generic entry, but support arm64 to use generic irq entry.
>
> Simply say:
>
> ARM64 requires an additional whether to reschedule on return from
ARM64 requires an additional check whether to reschedule on return from
obviously...
> interrupt.
>
> Add arch_irqentry_exit_need_resched() as the default NOOP
> implementation and hook it up into the need_resched() condition in
> raw_irqentry_exit_cond_resched().
>
> This allows ARM64 to implement the architecture specific version for
> switching over to the generic entry code.
>
> That explains things completely independently. Hmm?
>
> Thanks,
>
> tglx
* Re: [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64
2024-10-28 18:05 ` Thomas Gleixner
2024-10-28 22:15 ` Thomas Gleixner
@ 2024-10-29 2:33 ` Jinjie Ruan
1 sibling, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-29 2:33 UTC (permalink / raw)
To: Thomas Gleixner, oleg, linux, will, mark.rutland, catalin.marinas,
sstabellini, maz, peterz, luto, kees, wad, akpm, samitolvanen,
arnd, ojeda, rppt, hca, aliceryhl, samuel.holland, paulmck,
aquini, petr.pavlu, viro, rmk+kernel, ardb, wangkefeng.wang,
surenb, linus.walleij, yangyj.ee, broonie, mbenes, puranjay, pcc,
guohanjun, sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 2:05, Thomas Gleixner wrote:
> On Fri, Oct 25 2024 at 18:06, Jinjie Ruan wrote:
>
>> As the front patch 6 ~ 13 did, the arm64_preempt_schedule_irq() is
>
> Once this series is applied nobody knows what 'front patch 6 ~ 13' did.
Yes, once some of the previous patches are applied, the description
immediately becomes difficult to understand. The similar commit messages
in the other patches will be updated too.
>
>> same with the irq preempt schedule code of generic entry besides those
>> architecture-related logic called arm64_irqentry_exit_need_resched().
>>
>> So add arch irqentry_exit_need_resched() to support architecture-related
>> need_resched() check logic, which do not affect existing architectures
>> that use generic entry, but support arm64 to use generic irq entry.
>
> Simply say:
>
> ARM64 requires an additional check whether to reschedule on return from
> interrupt.
>
> Add arch_irqentry_exit_need_resched() as the default NOOP
> implementation and hook it up into the need_resched() condition in
> raw_irqentry_exit_cond_resched().
>
> This allows ARM64 to implement the architecture specific version for
> switching over to the generic entry code.
>
> That explains things completely independently. Hmm?
Of course, this is clearer, no longer coupled to the other patches, and
describes how to implement it.
>
> Thanks,
>
> tglx
>
* Re: [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
2024-10-25 10:06 ` [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
@ 2024-10-29 14:19 ` Mark Rutland
2024-10-31 3:34 ` Jinjie Ruan
0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:19 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:42PM +0800, Jinjie Ruan wrote:
> Implement regs_irqs_disabled(), and replace interrupts_enabled() macro
> with regs_irqs_disabled() all over the place.
>
> No functional changes.
>
Please say why, e.g.
| The generic entry code expects architecture code to provide
| regs_irqs_disabled(regs), but arm64 does not have this and provides
| interrupts_enabled(regs), which has the opposite polarity.
|
| In preparation for moving arm64 over to the generic entry code,
| replace arm64's interrupts_enabled() with regs_irqs_disabled() and
| update its callers under arch/arm64.
|
| For the moment, a definition of interrupts_enabled() is provided for
> | the GICv3 driver. Once arch/arm implements regs_irqs_disabled(), this
| can be removed.
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
[...]
> arch/arm/include/asm/ptrace.h | 4 ++--
> arch/arm/kernel/hw_breakpoint.c | 2 +-
> arch/arm/kernel/process.c | 2 +-
> arch/arm/mm/alignment.c | 2 +-
> arch/arm/mm/fault.c | 2 +-
> drivers/irqchip/irq-gic-v3.c | 2 +-
I hadn't realised that the GICv3 driver was using this and hence we'd
need to update a few places in arch/arm at the same time. Please update
just the arch/arm64 bits, and add:
| /*
| * Used by the GICv3 driver, can be removed once arch/arm implements
| * regs_irqs_disabled() directly.
| */
| #define interrupts_enabled(regs) (!regs_irqs_disabled(regs))
... and then once 32-bit arm implements this we can update the GIC
driver and remove the architecture definitions.
That way we avoid the risk of conflicts with 32-bit arm.
Mark.
* Re: [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1
2024-10-25 10:06 ` [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
@ 2024-10-29 14:33 ` Mark Rutland
2024-10-31 3:35 ` Jinjie Ruan
0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:33 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:43PM +0800, Jinjie Ruan wrote:
> These changes refactor the entry and exit routines for the exceptions
> from EL1. They store the RCU and lockdep state in a struct
> irqentry_state variable on the stack, rather than recording them
> in the fields of pt_regs, since it is safe enough for these context.
In general, please describe *why* we want to make the change first, e.g.
| The generic entry code uses irqentry_state_t to track lockdep and RCU
| state across exception entry and return. For historical reasons, arm64
| embeds similar fields within its pt_regs structure.
|
| In preparation for moving arm64 over to the generic entry code, pull
> | these fields out of arm64's pt_regs, and use a separate structure,
| matching the style of the generic entry code.
> Before:
> struct pt_regs {
> ...
> u64 lockdep_hardirqs;
> u64 exit_rcu;
> }
>
> enter_from_kernel_mode(regs);
> ...
> exit_to_kernel_mode(regs);
>
> After:
> typedef struct irqentry_state {
> union {
> bool exit_rcu;
> bool lockdep;
> };
> } irqentry_state_t;
>
> irqentry_state_t state = enter_from_kernel_mode(regs);
> ...
> exit_to_kernel_mode(regs, state);
I don't think this part is necessary.
>
> No functional changes.
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/include/asm/ptrace.h | 11 ++-
> arch/arm64/kernel/entry-common.c | 129 +++++++++++++++++++------------
> 2 files changed, 85 insertions(+), 55 deletions(-)
>
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index 3e5372a98da4..5156c0d5fa20 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -149,6 +149,13 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
> return psr;
> }
>
> +typedef struct irqentry_state {
> + union {
> + bool exit_rcu;
> + bool lockdep;
> + };
> +} irqentry_state_t;
AFAICT this can be moved directly into arch/arm64/kernel/entry-common.c.
> +
> /*
> * This struct defines the way the registers are stored on the stack during an
> * exception. struct user_pt_regs must form a prefix of struct pt_regs.
> @@ -169,10 +176,6 @@ struct pt_regs {
>
> u64 sdei_ttbr1;
> struct frame_record_meta stackframe;
> -
> - /* Only valid for some EL1 exceptions. */
> - u64 lockdep_hardirqs;
> - u64 exit_rcu;
> };
>
> /* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index c547e70428d3..68a9aecacdb9 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -36,29 +36,36 @@
> * This is intended to match the logic in irqentry_enter(), handling the kernel
> * mode transitions only.
> */
> -static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
> +static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
> {
> - regs->exit_rcu = false;
> + irqentry_state_t ret = {
> + .exit_rcu = false,
> + };
I realise that the generic entry code calls this 'ret' in
irqentry_enter() and similar, but could we please use 'state'
consistently in the arm64 code?
[...]
> /*
> @@ -190,9 +199,11 @@ asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
> * mode. Before this function is called it is not safe to call regular kernel
> * code, instrumentable code, or any code which may trigger an exception.
> */
> -static void noinstr arm64_enter_nmi(struct pt_regs *regs)
> +static noinstr irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
> {
> - regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
> + irqentry_state_t irq_state;
Likewise, please use 'state' rather than 'irq_state'.
In future we should probably have a separate structure for the NMI
paths, and get rid of the union, which would avoid the possiblity of
using mismatched helpers.
Mark.
* Re: [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode()
2024-10-25 10:06 ` [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode() Jinjie Ruan
@ 2024-10-29 14:37 ` Mark Rutland
2024-10-31 3:56 ` Jinjie Ruan
0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:37 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:45PM +0800, Jinjie Ruan wrote:
> The __enter_from_kernel_mode() is only called by enter_from_kernel_mode(),
> remove it.
The point of this split is to cleanly separate the raw entry logic (in
__enter_from_kernel_mode()) from pieces that run later and can safely be
instrumented (in enter_from_kernel_mode()).
I had expected that a later patch would replace
__enter_from_kernel_mode() with the generic equivalent, leaving
enter_from_kernel_mode() unchanged. It looks like patch 16 could do that
without this patch being necessary -- am I missing something?
Mark.
>
> No functional changes.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/kernel/entry-common.c | 9 +--------
> 1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index ccf59b44464d..a7fd4d6c7650 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -36,7 +36,7 @@
> * This is intended to match the logic in irqentry_enter(), handling the kernel
> * mode transitions only.
> */
> -static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
> +static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
> {
> irqentry_state_t ret = {
> .exit_rcu = false,
> @@ -55,13 +55,6 @@ static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs
> rcu_irq_enter_check_tick();
> trace_hardirqs_off_finish();
>
> - return ret;
> -}
> -
> -static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
> -{
> - irqentry_state_t ret = __enter_from_kernel_mode(regs);
> -
> mte_check_tfsr_entry();
> mte_disable_tco_entry(current);
>
> --
> 2.34.1
>
* Re: [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode()
2024-10-25 10:06 ` [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode() Jinjie Ruan
@ 2024-10-29 14:42 ` Mark Rutland
2024-10-31 3:40 ` Jinjie Ruan
0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:42 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:44PM +0800, Jinjie Ruan wrote:
> The __enter_from_user_mode() is only called by enter_from_user_mode(),
> so replaced it with enter_from_user_mode().
As with the next two patches, all the __enter_from_*() and __exit_to_*()
are supposed to handle the raw entry, closely matching the generic code,
and the non-underscored enter_from_*() and exit_to_*() functions are
supposed to be wrappers that handle (possibly instrumentable)
arm64-specific post-entry and pre-exit logic.
I would prefer to keep that split, even though enter_from_user_mode() is
a trivial wrapper.
Am I missing some reason we must remove the wrappers?
Mark.
>
> No functional changes.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/kernel/entry-common.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index 68a9aecacdb9..ccf59b44464d 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -109,7 +109,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
> * Before this function is called it is not safe to call regular kernel code,
> * instrumentable code, or any code which may trigger an exception.
> */
> -static __always_inline void __enter_from_user_mode(void)
> +static __always_inline void enter_from_user_mode(struct pt_regs *regs)
> {
> lockdep_hardirqs_off(CALLER_ADDR0);
> CT_WARN_ON(ct_state() != CT_STATE_USER);
> @@ -118,11 +118,6 @@ static __always_inline void __enter_from_user_mode(void)
> mte_disable_tco_entry(current);
> }
>
> -static __always_inline void enter_from_user_mode(struct pt_regs *regs)
> -{
> - __enter_from_user_mode();
> -}
> -
> /*
> * Handle IRQ/context state management when exiting to user mode.
> * After this function returns it is not safe to call regular kernel code,
> --
> 2.34.1
>
* Re: [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode()
2024-10-25 10:06 ` [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode() Jinjie Ruan
@ 2024-10-29 14:52 ` Mark Rutland
2024-10-31 4:02 ` Jinjie Ruan
0 siblings, 1 reply; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:52 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:47PM +0800, Jinjie Ruan wrote:
> Move arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so not
> only __el1_irq() but also every time when kernel mode irq return,
> there is a chance to reschedule.
We use exit_to_kernel_mode() for every non-NMI exception return to the
kernel, not just IRQ returns.
> As Mark pointed out, this change will have the following key impact:
>
> "We'll preempt even without taking a "real" interrupt. That
> shouldn't result in preemption that wasn't possible before,
> but it does change the probability of preempting at certain points,
> and might have a performance impact, so probably warrants a
> benchmark."
For anyone following along at home, I said that at:
https://lore.kernel.org/linux-arm-kernel/ZxejvAmccYMTa4P1@J2N7QTR9R3/
... and there I specifically said:
> I'd suggest you first write a patch to align arm64's entry code with the
> generic code, by removing the call to arm64_preempt_schedule_irq() from
> __el1_irq(), and adding a call to arm64_preempt_schedule_irq() in
> __exit_to_kernel_mode(), e.g.
>
> | static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
> | {
> | ...
> | if (interrupts_enabled(regs)) {
> | ...
> | if (regs->exit_rcu) {
> | ...
> | }
> | ...
> | arm64_preempt_schedule_irq();
> | ...
> | } else {
> | ...
> | }
> | }
[...]
> +#ifdef CONFIG_PREEMPT_DYNAMIC
> +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> +#define need_irq_preemption() \
> + (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
> +#else
> +#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
> +#endif
> +
> +static void __sched arm64_preempt_schedule_irq(void)
> +{
> + if (!need_irq_preemption())
> + return;
> +
> + /*
> + * Note: thread_info::preempt_count includes both thread_info::count
> + * and thread_info::need_resched, and is not equivalent to
> + * preempt_count().
> + */
> + if (READ_ONCE(current_thread_info()->preempt_count) != 0)
> + return;
> +
> + /*
> + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
> + * priority masking is used the GIC irqchip driver will clear DAIF.IF
> + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
> + * DAIF we must have handled an NMI, so skip preemption.
> + */
> + if (system_uses_irq_prio_masking() && read_sysreg(daif))
> + return;
> +
> + /*
> + * Preempting a task from an IRQ means we leave copies of PSTATE
> + * on the stack. cpufeature's enable calls may modify PSTATE, but
> + * resuming one of these preempted tasks would undo those changes.
> + *
> + * Only allow a task to be preempted once cpufeatures have been
> + * enabled.
> + */
> + if (system_capabilities_finalized())
> + preempt_schedule_irq();
> +}
> +
> /*
> * Handle IRQ/context state management when exiting to kernel mode.
> * After this function returns it is not safe to call regular kernel code,
> @@ -72,6 +114,8 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
> static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
> irqentry_state_t state)
> {
> + arm64_preempt_schedule_irq();
This is broken; exit_to_kernel_mode() is called for any non-NMI
exception return to the kernel, and this doesn't check that interrupts
were enabled in the context the exception was taken from.
This will preempt in cases where we should not, e.g. if we WARN() in a section with
IRQs disabled.
Mark.
* Re: [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled
2024-10-25 10:06 ` [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled Jinjie Ruan
@ 2024-10-29 14:55 ` Mark Rutland
0 siblings, 0 replies; 35+ messages in thread
From: Mark Rutland @ 2024-10-29 14:55 UTC (permalink / raw)
To: Jinjie Ruan
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On Fri, Oct 25, 2024 at 06:06:48PM +0800, Jinjie Ruan wrote:
> Only if irqs are enabled when the interrupt trapped, there may be
> a chance to reschedule after the interrupt has been handled, so move
> arm64_preempt_schedule_irq() into regs_irqs_disabled() check false
> if block.
>
> As Mark pointed out, this change will have the following key impact:
>
> "We will not preempt when taking interrupts from a region of kernel
> code where IRQs are enabled but RCU is not watching, matching the
> behaviour of the generic entry code.
>
> This has the potential to introduce livelock if we can ever have a
> screaming interrupt in such a region, so we'll need to go figure out
> whether that's actually a problem.
>
> Having this as a separate patch will make it easier to test/bisect
> for that specifically."
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
This should be folded into the prior patch.
Mark.
> ---
> arch/arm64/kernel/entry-common.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index e0380812d71e..b57f6dc66115 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -114,8 +114,6 @@ static void __sched arm64_preempt_schedule_irq(void)
> static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
> irqentry_state_t state)
> {
> - arm64_preempt_schedule_irq();
> -
> mte_check_tfsr_exit();
>
> lockdep_assert_irqs_disabled();
> @@ -129,6 +127,8 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
> return;
> }
>
> + arm64_preempt_schedule_irq();
> +
> trace_hardirqs_on();
> } else {
> if (state.exit_rcu)
> --
> 2.34.1
>
* Re: [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()
2024-10-29 14:19 ` Mark Rutland
@ 2024-10-31 3:34 ` Jinjie Ruan
0 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-31 3:34 UTC (permalink / raw)
To: Mark Rutland
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 22:19, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:42PM +0800, Jinjie Ruan wrote:
>> Implement regs_irqs_disabled(), and replace interrupts_enabled() macro
>> with regs_irqs_disabled() all over the place.
>>
>> No functional changes.
>>
>
> Please say why, e.g.
>
> | The generic entry code expects architecture code to provide
> | regs_irqs_disabled(regs), but arm64 does not have this and provides
> | interrupts_enabled(regs), which has the opposite polarity.
> |
> | In preparation for moving arm64 over to the generic entry code,
> | replace arm64's interrupts_enabled() with regs_irqs_disabled() and
> | update its callers under arch/arm64.
> |
> | For the moment, a definition of interrupts_enabled() is provided for
> | the GICv3 driver. Once arch/arm implements regs_irqs_disabled(), this
> | can be removed.
>
Thank you! I will expand the commit message to describe the motivation
for this patch, and do the same for the other patches.
>> Suggested-by: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>
> [...]
>
>> arch/arm/include/asm/ptrace.h | 4 ++--
>> arch/arm/kernel/hw_breakpoint.c | 2 +-
>> arch/arm/kernel/process.c | 2 +-
>> arch/arm/mm/alignment.c | 2 +-
>> arch/arm/mm/fault.c | 2 +-
>
>> drivers/irqchip/irq-gic-v3.c | 2 +-
>
> I hadn't realised that the GICv3 driver was using this and hence we'd
> need to update a few places in arch/arm at the same time. Please update
> just the arch/arm64 bits, and add:
>
> | /*
> | * Used by the GICv3 driver, can be removed once arch/arm implements
> | * regs_irqs_disabled() directly.
> | */
> | #define interrupts_enabled(regs) (!regs_irqs_disabled(regs))
>
> ... and then once 32-bit arm implements this we can update the GIC
> driver and remove the architecture definitions.
>
> That way we avoid the risk of conflicts with 32-bit arm.
>
> Mark.
>
* Re: [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1
2024-10-29 14:33 ` Mark Rutland
@ 2024-10-31 3:35 ` Jinjie Ruan
0 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-31 3:35 UTC (permalink / raw)
To: Mark Rutland
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 22:33, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:43PM +0800, Jinjie Ruan wrote:
>> These changes refactor the entry and exit routines for the exceptions
>> from EL1. They store the RCU and lockdep state in a struct
>> irqentry_state variable on the stack, rather than recording them
>> in the fields of pt_regs, since it is safe enough for these context.
>
> In general, please describe *why* we want to make the change first, e.g.
>
> | The generic entry code uses irqentry_state_t to track lockdep and RCU
> | state across exception entry and return. For historical reasons, arm64
> | embeds similar fields within its pt_regs structure.
> |
> | In preparation for moving arm64 over to the generic entry code, pull
> | these fields out of arm64's pt_regs, and use a separate structure,
> | matching the style of the generic entry code.
>
>> Before:
>> struct pt_regs {
>> ...
>> u64 lockdep_hardirqs;
>> u64 exit_rcu;
>> }
>>
>> enter_from_kernel_mode(regs);
>> ...
>> exit_to_kernel_mode(regs);
>>
>> After:
>> typedef struct irqentry_state {
>> union {
>> bool exit_rcu;
>> bool lockdep;
>> };
>> } irqentry_state_t;
>>
>> irqentry_state_t state = enter_from_kernel_mode(regs);
>> ...
>> exit_to_kernel_mode(regs, state);
>
> I don't think this part is necessary.
Thank you, will remove it and explain why.
>
>>
>> No functional changes.
>>
>> Suggested-by: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>> arch/arm64/include/asm/ptrace.h | 11 ++-
>> arch/arm64/kernel/entry-common.c | 129 +++++++++++++++++++------------
>> 2 files changed, 85 insertions(+), 55 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>> index 3e5372a98da4..5156c0d5fa20 100644
>> --- a/arch/arm64/include/asm/ptrace.h
>> +++ b/arch/arm64/include/asm/ptrace.h
>> @@ -149,6 +149,13 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
>> return psr;
>> }
>>
>> +typedef struct irqentry_state {
>> + union {
>> + bool exit_rcu;
>> + bool lockdep;
>> + };
>> +} irqentry_state_t;
>
> AFAICT this can be moved directly into arch/arm64/kernel/entry-common.c.
>
>> +
>> /*
>> * This struct defines the way the registers are stored on the stack during an
>> * exception. struct user_pt_regs must form a prefix of struct pt_regs.
>> @@ -169,10 +176,6 @@ struct pt_regs {
>>
>> u64 sdei_ttbr1;
>> struct frame_record_meta stackframe;
>> -
>> - /* Only valid for some EL1 exceptions. */
>> - u64 lockdep_hardirqs;
>> - u64 exit_rcu;
>> };
>>
>> /* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
>> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
>> index c547e70428d3..68a9aecacdb9 100644
>> --- a/arch/arm64/kernel/entry-common.c
>> +++ b/arch/arm64/kernel/entry-common.c
>> @@ -36,29 +36,36 @@
>> * This is intended to match the logic in irqentry_enter(), handling the kernel
>> * mode transitions only.
>> */
>> -static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
>> +static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
>> {
>> - regs->exit_rcu = false;
>> + irqentry_state_t ret = {
>> + .exit_rcu = false,
>> + };
>
> I realise that the generic entry code calls this 'ret' in
> irqentry_enter() and similar, but could we please use 'state'
> consistently in the arm64 code?
>
> [...]
>
>> /*
>> @@ -190,9 +199,11 @@ asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
>> * mode. Before this function is called it is not safe to call regular kernel
>> * code, instrumentable code, or any code which may trigger an exception.
>> */
>> -static void noinstr arm64_enter_nmi(struct pt_regs *regs)
>> +static noinstr irqentry_state_t arm64_enter_nmi(struct pt_regs *regs)
>> {
>> - regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
>> + irqentry_state_t irq_state;
>
> Likewise, please use 'state' rather than 'irq_state'.
>
> In future we should probably have a separate structure for the NMI
> paths, and get rid of the union, which would avoid the possibility of
> using mismatched helpers.
>
> Mark.
>
>
* Re: [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode()
2024-10-29 14:42 ` Mark Rutland
@ 2024-10-31 3:40 ` Jinjie Ruan
0 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-31 3:40 UTC (permalink / raw)
To: Mark Rutland
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 22:42, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:44PM +0800, Jinjie Ruan wrote:
>> The __enter_from_user_mode() is only called by enter_from_user_mode(),
>> so replaced it with enter_from_user_mode().
>
> As with the next two patches, all the __enter_from_*() and __exit_to_*()
> are supposed to handle the raw entry, closely matching the generic code,
> and the non-underscored enter_from_*() and exit_to_*() functions are
> supposed to be wrappers that handle (possibly instrumentable)
Sure, the __enter_from_*() and __exit_to_*() functions are all about the
generic code, and the enter_from_*() and exit_to_*() functions include
the arm64-specific MTE checks.
> arm64-specific post-entry and pre-exit logic.
>
> I would prefer to keep that split, even though enter_from_user_mode() is
> a trivial wrapper.
>
> Am I missing some reason we must remove the wrappers?
It is not necessary to remove these functions; I just found them by
chance and cleaned them up along the way. Originally I thought that
removing the underscored functions might make the relative order of the
MTE functions look clearer.
>
> Mark.
>
>>
>> No functional changes.
>>
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>> arch/arm64/kernel/entry-common.c | 7 +------
>> 1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
>> index 68a9aecacdb9..ccf59b44464d 100644
>> --- a/arch/arm64/kernel/entry-common.c
>> +++ b/arch/arm64/kernel/entry-common.c
>> @@ -109,7 +109,7 @@ static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
>> * Before this function is called it is not safe to call regular kernel code,
>> * instrumentable code, or any code which may trigger an exception.
>> */
>> -static __always_inline void __enter_from_user_mode(void)
>> +static __always_inline void enter_from_user_mode(struct pt_regs *regs)
>> {
>> lockdep_hardirqs_off(CALLER_ADDR0);
>> CT_WARN_ON(ct_state() != CT_STATE_USER);
>> @@ -118,11 +118,6 @@ static __always_inline void __enter_from_user_mode(void)
>> mte_disable_tco_entry(current);
>> }
>>
>> -static __always_inline void enter_from_user_mode(struct pt_regs *regs)
>> -{
>> - __enter_from_user_mode();
>> -}
>> -
>> /*
>> * Handle IRQ/context state management when exiting to user mode.
>> * After this function returns it is not safe to call regular kernel code,
>> --
>> 2.34.1
>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode()
2024-10-29 14:37 ` Mark Rutland
@ 2024-10-31 3:56 ` Jinjie Ruan
0 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-31 3:56 UTC (permalink / raw)
To: Mark Rutland
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 22:37, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:45PM +0800, Jinjie Ruan wrote:
>> The __enter_from_kernel_mode() is only called by
>> enter_from_kernel_mode(), so remove it.
>
> The point of this split is to cleanly separate the raw entry logic (in
> __enter_from_kernel_mode()) from the pieces that run later and can
> safely be instrumented (later in enter_from_kernel_mode()).
Hi Mark,
I reviewed your commit bc29b71f53b1 ("arm64: entry: clarify entry/exit
helpers"); keeping these functions makes the instrumentation boundaries
clearer, so I will keep them unchanged.
>
> I had expected that a later patch would replace
> __enter_from_kernel_mode() with the generic equivalent, leaving
> enter_from_kernel_mode() unchanged. It looks like patch 16 could do that
> without this patch being necessary -- am I missing something?
Yes, you are right! These unnecessary cleanup patches will be removed.
Also, when switching to the generic syscall code, I found that proper
refactoring makes the conversion itself clearer.
Thank you.
>
> Mark.
>
>>
>> No functional changes.
>>
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>> arch/arm64/kernel/entry-common.c | 9 +--------
>> 1 file changed, 1 insertion(+), 8 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
>> index ccf59b44464d..a7fd4d6c7650 100644
>> --- a/arch/arm64/kernel/entry-common.c
>> +++ b/arch/arm64/kernel/entry-common.c
>> @@ -36,7 +36,7 @@
>> * This is intended to match the logic in irqentry_enter(), handling the kernel
>> * mode transitions only.
>> */
>> -static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs)
>> +static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>> {
>> irqentry_state_t ret = {
>> .exit_rcu = false,
>> @@ -55,13 +55,6 @@ static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs
>> rcu_irq_enter_check_tick();
>> trace_hardirqs_off_finish();
>>
>> - return ret;
>> -}
>> -
>> -static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>> -{
>> - irqentry_state_t ret = __enter_from_kernel_mode(regs);
>> -
>> mte_check_tfsr_entry();
>> mte_disable_tco_entry(current);
>>
>> --
>> 2.34.1
>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode()
2024-10-29 14:52 ` Mark Rutland
@ 2024-10-31 4:02 ` Jinjie Ruan
0 siblings, 0 replies; 35+ messages in thread
From: Jinjie Ruan @ 2024-10-31 4:02 UTC (permalink / raw)
To: Mark Rutland
Cc: oleg, linux, will, catalin.marinas, sstabellini, maz, tglx,
peterz, luto, kees, wad, akpm, samitolvanen, arnd, ojeda, rppt,
hca, aliceryhl, samuel.holland, paulmck, aquini, petr.pavlu, viro,
rmk+kernel, ardb, wangkefeng.wang, surenb, linus.walleij,
yangyj.ee, broonie, mbenes, puranjay, pcc, guohanjun,
sudeep.holla, Jonathan.Cameron, prarit, liuwei09, dwmw,
oliver.upton, kristina.martsenko, ptosi, frederic, vschneid,
thiago.bauermann, joey.gouly, liuyuntao12, leobras, linux-kernel,
linux-arm-kernel, xen-devel
On 2024/10/29 22:52, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:47PM +0800, Jinjie Ruan wrote:
>> Move arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so that
>> not only __el1_irq() but every kernel-mode IRQ return gets a chance
>> to reschedule.
>
> We use exit_to_kernel_mode() for every non-NMI exception return to the
> kernel, not just IRQ returns.
Yes, it is not only IRQs but also other non-NMI exceptions; I will
update the commit message.
>
>> As Mark pointed out, this change will have the following key impact:
>>
>> "We'll preempt even without taking a "real" interrupt. That
>> shouldn't result in preemption that wasn't possible before,
>> but it does change the probability of preempting at certain points,
>> and might have a performance impact, so probably warrants a
>> benchmark."
>
> For anyone following along at home, I said that at:
>
> https://lore.kernel.org/linux-arm-kernel/ZxejvAmccYMTa4P1@J2N7QTR9R3/
>
> ... and there I specifically said:
Thank you!
This patch and the next one will be merged as you suggested.
I had thought it would be clearer to first put it in
__exit_to_kernel_mode() and then move it into the interrupts-enabled
block in two steps.
>
>> I'd suggest you first write a patch to align arm64's entry code with the
>> generic code, by removing the call to arm64_preempt_schedule_irq() from
>> __el1_irq(), and adding a call to arm64_preempt_schedule_irq() in
>> __exit_to_kernel_mode(), e.g.
>>
>> | static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
>> | {
>> | ...
>> | if (interrupts_enabled(regs)) {
>> | ...
>> | if (regs->exit_rcu) {
>> | ...
>> | }
>> | ...
>> | arm64_preempt_schedule_irq();
>> | ...
>> | } else {
>> | ...
>> | }
>> | }
>
> [...]
>
>> +#ifdef CONFIG_PREEMPT_DYNAMIC
>> +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
>> +#define need_irq_preemption() \
>> + (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
>> +#else
>> +#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
>> +#endif
>> +
>> +static void __sched arm64_preempt_schedule_irq(void)
>> +{
>> + if (!need_irq_preemption())
>> + return;
>> +
>> + /*
>> + * Note: thread_info::preempt_count includes both thread_info::count
>> + * and thread_info::need_resched, and is not equivalent to
>> + * preempt_count().
>> + */
>> + if (READ_ONCE(current_thread_info()->preempt_count) != 0)
>> + return;
>> +
>> + /*
>> + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
>> + * priority masking is used the GIC irqchip driver will clear DAIF.IF
>> + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
>> + * DAIF we must have handled an NMI, so skip preemption.
>> + */
>> + if (system_uses_irq_prio_masking() && read_sysreg(daif))
>> + return;
>> +
>> + /*
>> + * Preempting a task from an IRQ means we leave copies of PSTATE
>> + * on the stack. cpufeature's enable calls may modify PSTATE, but
>> + * resuming one of these preempted tasks would undo those changes.
>> + *
>> + * Only allow a task to be preempted once cpufeatures have been
>> + * enabled.
>> + */
>> + if (system_capabilities_finalized())
>> + preempt_schedule_irq();
>> +}
>> +
>> /*
>> * Handle IRQ/context state management when exiting to kernel mode.
>> * After this function returns it is not safe to call regular kernel code,
>> @@ -72,6 +114,8 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>> static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
>> irqentry_state_t state)
>> {
>> + arm64_preempt_schedule_irq();
>
> This is broken; exit_to_kernel_mode() is called for any non-NMI
> exception return to the kernel, and this doesn't check that interrupts
> were enabled in the context the exception was taken from.
>
> This will preempt in cases where we should not, e.g. if we WARN() in a section with
> IRQs disabled.
>
> Mark.
>
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2024-10-31 4:02 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-25 10:06 [PATCH -next v4 00/19] arm64: entry: Convert to generic entry Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 01/19] arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled() Jinjie Ruan
2024-10-29 14:19 ` Mark Rutland
2024-10-31 3:34 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 02/19] arm64: entry: Refactor the entry and exit for exceptions from EL1 Jinjie Ruan
2024-10-29 14:33 ` Mark Rutland
2024-10-31 3:35 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 03/19] arm64: entry: Remove __enter_from_user_mode() Jinjie Ruan
2024-10-29 14:42 ` Mark Rutland
2024-10-31 3:40 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 04/19] arm64: entry: Remove __enter_from_kernel_mode() Jinjie Ruan
2024-10-29 14:37 ` Mark Rutland
2024-10-31 3:56 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 05/19] arm64: entry: Remove __exit_to_kernel_mode() Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode() Jinjie Ruan
2024-10-29 14:52 ` Mark Rutland
2024-10-31 4:02 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 07/19] arm64: entry: Call arm64_preempt_schedule_irq() only if irqs enabled Jinjie Ruan
2024-10-29 14:55 ` Mark Rutland
2024-10-25 10:06 ` [PATCH -next v4 08/19] arm64: entry: Rework arm64_preempt_schedule_irq() Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 09/19] arm64: entry: Use preempt_count() and need_resched() helper Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 10/19] arm64: entry: preempt_schedule_irq() only if PREEMPTION enabled Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 11/19] arm64: entry: Extract raw_irqentry_exit_cond_resched() function Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 12/19] arm64: entry: Check dynamic key ahead Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 13/19] arm64: entry: Check dynamic resched when PREEMPT_DYNAMIC enabled Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 14/19] entry: Split into irq entry and syscall Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 15/19] entry: Add arch irqentry_exit_need_resched() for arm64 Jinjie Ruan
2024-10-28 18:05 ` Thomas Gleixner
2024-10-28 22:15 ` Thomas Gleixner
2024-10-29 2:33 ` Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 16/19] arm64: entry: Switch to generic IRQ entry Jinjie Ruan
2024-10-25 10:06 ` [PATCH -next v4 17/19] entry: Add syscall arch functions to use generic syscall for arm64 Jinjie Ruan
2024-10-28 18:21 ` Thomas Gleixner
2024-10-25 10:06 ` [PATCH -next v4 18/19] arm64/ptrace: Split report_syscall() into separate enter and exit functions Jinjie Ruan
2024-10-25 10:07 ` [PATCH -next v4 19/19] arm64: entry: Convert to generic entry Jinjie Ruan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox