* [PATCH v8 00/12] arm64: entry: Convert to Generic Entry
@ 2025-11-26 7:14 Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK Jinjie Ruan
` (11 more replies)
0 siblings, 12 replies; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
Currently, x86, RISC-V and LoongArch use the Generic Entry framework,
which eases maintenance and makes the code cleaner. arm64 has already
switched to the Generic IRQ Entry in commit
b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry"), so it is
time to convert arm64 to Generic Entry completely.
The goal is to bring arm64 in line with other architectures that already
use the generic entry infrastructure, reducing duplicated code and
making it easier to share future changes in entry/exit paths, such as
"Syscall User Dispatch".
This patch set is rebased on v6.18-rc6.
The performance benchmarks from perf bench basic syscall on
real hardware are below:
| Metric | W/O Generic Framework | With Generic Framework | Change |
| ---------- | --------------------- | ---------------------- | ------ |
| Total time | 2.813 [sec] | 2.930 [sec] | ↑4% |
| usecs/op | 0.281349 | 0.293006 | ↑4% |
| ops/sec | 3,554,299 | 3,412,894 | ↓4% |
Compared with the previous arch-specific handling, performance decreased
by approximately 4%.
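As a quick cross-check of the "Change" column above, here is a minimal
sketch of the arithmetic. The helper name pct_change() is made up for
illustration and is not part of this series; the inputs are the numbers
from the table.

```c
#include <assert.h>

/*
 * Relative change of "after" versus "before", in percent.
 * Hypothetical helper, used only to re-derive the table's last column.
 */
static double pct_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

Called as pct_change(2.813, 2.930), pct_change(0.281349, 0.293006) and
pct_change(3554299.0, 3412894.0), the three rows each round to a 4%
change, with the sign flipped for ops/sec because higher is better there.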
The series was tested successfully with the following test cases on the
QEMU virt platform:
- Perf tests.
- Different `dynamic preempt` mode switch.
- Pseudo NMI tests.
- Stress-ng CPU stress test.
- Hackbench stress test.
- MTE test case in Documentation/arch/arm64/memory-tagging-extension.rst
and all test cases in tools/testing/selftests/arm64/mte/*.
- "sud" selftest testcase.
- get_set_sud, get_syscall_info, set_syscall_info, peeksiginfo
in tools/testing/selftests/ptrace.
- breakpoint_test_arm64 in selftests/breakpoints.
- syscall-abi and ptrace in tools/testing/selftests/arm64/abi
- fp-ptrace, sve-ptrace, za-ptrace in selftests/arm64/fp.
- vdso_test_getrandom in tools/testing/selftests/vDSO
- Strace tests.
The test QEMU configuration is as follows:
qemu-system-aarch64 \
-M virt,gic-version=3,virtualization=on,mte=on \
-cpu max,pauth-impdef=on \
-kernel Image \
-smp 8,sockets=1,cores=4,threads=2 \
-m 512m \
-nographic \
-no-reboot \
-device virtio-rng-pci \
-append "root=/dev/vda rw console=ttyAMA0 kgdboc=ttyAMA0,115200 \
earlycon preempt=voluntary irqchip.gicv3_pseudo_nmi=1" \
-drive if=none,file=images/rootfs.ext4,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0
Changes in v8:
- Rename "report_syscall_enter()" to "report_syscall_entry()".
- Add ptrace_save_reg() to avoid duplication.
- Remove unused _TIF_WORK_MASK in a standalone patch.
- Align syscall_trace_enter() return value with the generic version.
- Use "scno" instead of regs->syscallno in el0_svc_common().
- Move rseq_syscall() ahead in a standalone patch to make the change clear.
- Rename "syscall_trace_exit()" to "syscall_exit_work()".
- Keep the goto in el0_svc_common().
- Pass no argument to __secure_computing() and check for -1 rather than -1L.
- Remove "Add has_syscall_work() helper" patch.
- Move "Add syscall_exit_to_user_mode_prepare() helper" patch later.
- Add the missing header for asm/entry-common.h.
- Update the implementation of arch_syscall_is_vdso_sigreturn().
- Define "ARCH_SYSCALL_WORK_EXIT" as "SECCOMP | SYSCALL_EMU"
to keep the behaviour unchanged.
- Add more test cases.
- Add Reviewed-by.
- Update the commit message.
- Link to v7: https://lore.kernel.org/all/20251117133048.53182-1-ruanjinjie@huawei.com/
Changes in v7:
- Support "Syscall User Dispatch" by implementing
arch_syscall_is_vdso_sigreturn() as kemal suggested.
- Add aarch64 support for the "sud" selftest, which passes with
this patch series.
- Fix the kernel test robot warning for arch_ptrace_report_syscall_entry()
and arch_ptrace_report_syscall_exit() in asm/entry-common.h.
- Add perf syscall performance test.
- Link to v6: https://lore.kernel.org/all/20250916082611.2972008-1-ruanjinjie@huawei.com/
Changes in v6:
- Rebased on v6.17-rc5-next as arm64 generic irq entry has merged.
- Update the commit message.
- Link to v5: https://lore.kernel.org/all/20241206101744.4161990-1-ruanjinjie@huawei.com/
Changes in v5:
- Do not change arm32 and keep the interrupts_enabled() macro for the gicv3 driver.
- Move irqentry_state definition into arch/arm64/kernel/entry-common.c.
- Avoid removing the __enter_from_*() and __exit_to_*() wrappers.
- Update "irqentry_state_t ret/irq_state" to "state"
to keep it consistent.
- Use generic irq entry header for PREEMPT_DYNAMIC after splitting
the generic entry.
- Also refactor the ARM64 syscall code.
- Introduce arch_ptrace_report_syscall_entry/exit(), instead of
arch_pre/post_report_syscall_entry/exit() to simplify code.
- Separate the syscall patches clearly.
- Update the commit message.
- Link to v4: https://lore.kernel.org/all/20241025100700.3714552-1-ruanjinjie@huawei.com/
Changes in v4:
- Rework/cleanup split into a few patches as Mark suggested.
- Replace interrupts_enabled() macro with regs_irqs_disabled(), instead
of leaving it in place.
- Remove rcu and lockdep state in pt_regs by using temporary
irqentry_state_t as Mark suggested.
- Remove some unnecessary intermediate functions to make the code clearer.
- Rework preempt irq and PREEMPT_DYNAMIC code
to make the switch clearer.
- arch_prepare_*_entry/exit() -> arch_pre_*_entry/exit().
- Expand the arch functions comment.
- Move arch functions closer to their callers.
- Declare saved_reg in for block.
- Remove arch_exit_to_kernel_mode_prepare(), arch_enter_from_kernel_mode().
- Adjust "Add few arch functions to use generic entry" patch to be
the penultimate.
- Update the commit message.
- Add suggested-by.
- Link to v3: https://lore.kernel.org/all/20240629085601.470241-1-ruanjinjie@huawei.com/
Changes in v3:
- Test the MTE test cases.
- Handle forget_syscall() in arch_post_report_syscall_entry()
- Make the arch funcs not use __weak as Thomas suggested, so move
the arch funcs to entry-common.h, and make arch_forget_syscall() folded
in arch_post_report_syscall_entry() as suggested.
- Move report_single_step() to thread_info.h for arm64
- Change __always_inline() to inline, add inline for the other arch funcs.
- Remove unused signal.h for entry-common.h.
- Add Suggested-by.
- Update the commit message.
Changes in v2:
- Add tested-by.
- Fix a bug where arch_post_report_syscall_entry() was not called in
syscall_trace_enter() if ptrace_report_syscall_entry() returned nonzero.
- Refactor report_syscall().
- Add comment for arch_prepare_report_syscall_exit().
- Adjust entry-common.h header file inclusion to alphabetical order.
- Update the commit message.
Jinjie Ruan (11):
arm64: Remove unused _TIF_WORK_MASK
arm64/ptrace: Split report_syscall()
arm64/ptrace: Refactor syscall_trace_enter/exit()
arm64: ptrace: Move rseq_syscall() before audit_syscall_exit()
arm64: syscall: Rework el0_svc_common()
arm64/ptrace: Return early for ptrace_report_syscall_entry() error
arm64/ptrace: Expand secure_computing() in place
arm64/ptrace: Use syscall_get_arguments() helper
entry: Split syscall_exit_to_user_mode_work() for arch reuse
entry: Add arch_ptrace_report_syscall_entry/exit()
arm64: entry: Convert to generic entry
kemal (1):
selftests: sud_test: Support aarch64
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/entry-common.h | 76 ++++++++++++++++
arch/arm64/include/asm/syscall.h | 21 ++++-
arch/arm64/include/asm/thread_info.h | 22 +----
arch/arm64/kernel/debug-monitors.c | 7 ++
arch/arm64/kernel/ptrace.c | 90 -------------------
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/syscall.c | 25 ++----
include/linux/entry-common.h | 35 +++++---
kernel/entry/syscall-common.c | 43 ++++++++-
.../syscall_user_dispatch/sud_test.c | 4 +
11 files changed, 179 insertions(+), 148 deletions(-)
--
2.34.1
* [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:27 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 02/12] arm64/ptrace: Split report_syscall() Jinjie Ruan
` (10 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
Since commit b3cf07851b6c ("arm64: entry: Switch to generic IRQ
entry"), _TIF_WORK_MASK is never used, so remove it.
Fixes: b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/thread_info.h | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f241b8601ebd..ff4998fa1844 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -106,12 +106,6 @@ void arch_setup_new_exec(void);
#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
#define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
-#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \
- _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
- _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
- _TIF_NOTIFY_SIGNAL | _TIF_SIGPENDING | \
- _TIF_PATCH_PENDING)
-
#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
_TIF_SYSCALL_EMU)
--
2.34.1
* [PATCH v8 02/12] arm64/ptrace: Split report_syscall()
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit() Jinjie Ruan
` (9 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic syscall entry code has the form:
| syscall_trace_enter()
| {
| ptrace_report_syscall_entry()
| }
|
| syscall_exit_work()
| {
| ptrace_report_syscall_exit()
| }
In preparation for moving arm64 over to the generic entry code, split
report_syscall() into separate entry and exit functions to align
the structure of the arm64 code with syscall_trace_enter() and
syscall_exit_work() from the generic entry code.
No functional changes.
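The shared step that both new functions rely on is "clobber a scratch
register with the direction, let the tracer look, then restore it". A
simplified userspace sketch of that pattern follows. All sk_* names are
hypothetical stand-ins (struct sk_regs for struct pt_regs, sk_tracer_stop()
for the ptrace stop), and only the non-single-step path is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for enum ptrace_syscall_dir and struct pt_regs. */
enum sk_dir { SK_SYSCALL_ENTER = 0, SK_SYSCALL_EXIT = 1 };

struct sk_regs {
	unsigned long regs[31];
	bool compat;              /* models is_compat_task() */
	unsigned long tracer_saw; /* scratch register value seen at the stop */
};

/* Clobber the scratch register with the direction; return the old value. */
static unsigned long sk_save_reg(struct sk_regs *regs, enum sk_dir dir,
				 int *regno)
{
	unsigned long saved;

	*regno = regs->compat ? 12 : 7;
	saved = regs->regs[*regno];
	regs->regs[*regno] = dir;
	return saved;
}

/* Models the ptrace stop: the tracer inspects the registers here. */
static void sk_tracer_stop(struct sk_regs *regs, int regno)
{
	regs->tracer_saw = regs->regs[regno];
}

/* Entry and exit differ only in the direction written to the register. */
static void sk_report(struct sk_regs *regs, enum sk_dir dir)
{
	unsigned long saved;
	int regno;

	assert(regs);
	saved = sk_save_reg(regs, dir, &regno);
	sk_tracer_stop(regs, regno);
	regs->regs[regno] = saved; /* restore after the stop */
}
```

In this model the tracer observes the direction in x7 (r12 for compat
tasks) during the stop, while the task itself gets its original register
value back afterwards.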
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
v8:
- report_syscall_enter() -> report_syscall_entry().
- Add ptrace_save_reg() helper.
---
arch/arm64/kernel/ptrace.c | 41 +++++++++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 4b001121c72d..abc5baa29cc9 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2317,9 +2317,10 @@ enum ptrace_syscall_dir {
PTRACE_SYSCALL_EXIT,
};
-static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
+static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
+ enum ptrace_syscall_dir dir,
+ int *regno)
{
- int regno;
unsigned long saved_reg;
/*
@@ -2338,15 +2339,31 @@ static void report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir)
* - Syscall stops behave differently to seccomp and pseudo-step traps
* (the latter do not nobble any registers).
*/
- regno = (is_compat_task() ? 12 : 7);
- saved_reg = regs->regs[regno];
- regs->regs[regno] = dir;
+ *regno = (is_compat_task() ? 12 : 7);
+ saved_reg = regs->regs[*regno];
+ regs->regs[*regno] = dir;
- if (dir == PTRACE_SYSCALL_ENTER) {
- if (ptrace_report_syscall_entry(regs))
- forget_syscall(regs);
- regs->regs[regno] = saved_reg;
- } else if (!test_thread_flag(TIF_SINGLESTEP)) {
+ return saved_reg;
+}
+
+static void report_syscall_entry(struct pt_regs *regs)
+{
+ unsigned long saved_reg;
+ int regno;
+
+ saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
+ if (ptrace_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+}
+
+static void report_syscall_exit(struct pt_regs *regs)
+{
+ unsigned long saved_reg;
+ int regno;
+
+ saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
+ if (!test_thread_flag(TIF_SINGLESTEP)) {
ptrace_report_syscall_exit(regs, 0);
regs->regs[regno] = saved_reg;
} else {
@@ -2366,7 +2383,7 @@ int syscall_trace_enter(struct pt_regs *regs)
unsigned long flags = read_thread_flags();
if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
- report_syscall(regs, PTRACE_SYSCALL_ENTER);
+ report_syscall_entry(regs);
if (flags & _TIF_SYSCALL_EMU)
return NO_SYSCALL;
}
@@ -2394,7 +2411,7 @@ void syscall_trace_exit(struct pt_regs *regs)
trace_sys_exit(regs, syscall_get_return_value(current, regs));
if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
- report_syscall(regs, PTRACE_SYSCALL_EXIT);
+ report_syscall_exit(regs);
rseq_syscall(regs);
}
--
2.34.1
* [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit()
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 02/12] arm64/ptrace: Split report_syscall() Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit() Jinjie Ruan
` (8 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic syscall entry code has the following form, which uses
the syscall work flags and the syscall number as inputs:
| syscall_trace_enter(struct pt_regs *regs, long syscall,
| unsigned long work)
|
| syscall_exit_work(struct pt_regs *regs, unsigned long work)
In preparation for moving arm64 over to the generic entry code,
refactor syscall_trace_enter/exit() to also take the thread flags, and
get the syscall number via the syscall_get_nr() helper.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/syscall.h | 4 ++--
arch/arm64/kernel/ptrace.c | 26 ++++++++++++++++----------
arch/arm64/kernel/syscall.c | 5 +++--
3 files changed, 21 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 712daa90e643..d69f590a989b 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -114,7 +114,7 @@ static inline int syscall_get_arch(struct task_struct *task)
return AUDIT_ARCH_AARCH64;
}
-int syscall_trace_enter(struct pt_regs *regs);
-void syscall_trace_exit(struct pt_regs *regs);
+int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags);
+void syscall_trace_exit(struct pt_regs *regs, unsigned long flags);
#endif /* __ASM_SYSCALL_H */
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index abc5baa29cc9..63ba6c961ecc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2378,10 +2378,8 @@ static void report_syscall_exit(struct pt_regs *regs)
}
}
-int syscall_trace_enter(struct pt_regs *regs)
+int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
{
- unsigned long flags = read_thread_flags();
-
if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
report_syscall_entry(regs);
if (flags & _TIF_SYSCALL_EMU)
@@ -2392,19 +2390,27 @@ int syscall_trace_enter(struct pt_regs *regs)
if (secure_computing() == -1)
return NO_SYSCALL;
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
- trace_sys_enter(regs, regs->syscallno);
+ /* Either of the above might have changed the syscall number */
+ syscall = syscall_get_nr(current, regs);
+
+ if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) {
+ trace_sys_enter(regs, syscall);
- audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1],
+ /*
+ * Probes or BPF hooks in the tracepoint may have changed the
+ * system call number as well.
+ */
+ syscall = syscall_get_nr(current, regs);
+ }
+
+ audit_syscall_entry(syscall, regs->orig_x0, regs->regs[1],
regs->regs[2], regs->regs[3]);
- return regs->syscallno;
+ return syscall;
}
-void syscall_trace_exit(struct pt_regs *regs)
+void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
{
- unsigned long flags = read_thread_flags();
-
audit_syscall_exit(regs);
if (flags & _TIF_SYSCALL_TRACEPOINT)
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index aba7ca6bca2d..ec31f82d2e9f 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -124,7 +124,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
*/
if (scno == NO_SYSCALL)
syscall_set_return_value(current, regs, -ENOSYS, 0);
- scno = syscall_trace_enter(regs);
+ scno = syscall_trace_enter(regs, scno, flags);
if (scno == NO_SYSCALL)
goto trace_exit;
}
@@ -143,7 +143,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
}
trace_exit:
- syscall_trace_exit(regs);
+ flags = read_thread_flags();
+ syscall_trace_exit(regs, flags);
}
void do_el0_svc(struct pt_regs *regs)
--
2.34.1
* [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit()
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (2 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit() Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common() Jinjie Ruan
` (7 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
Commit a9f3a74a29af ("entry: Provide generic syscall exit function")
introduced the generic syscall exit function, which calls rseq_syscall()
before audit_syscall_exit() and arch_syscall_exit_tracehook().
Commit b74406f37737 ("arm: Add syscall detection for restartable
sequences") added rseq support for arm32, which also calls rseq_syscall()
before audit_syscall_exit() and tracehook_report_syscall().
However, commit 409d5db49867c ("arm64: rseq: Implement backend rseq
calls and select HAVE_RSEQ") implemented arm64 rseq with rseq_syscall()
called after audit_syscall_exit() and tracehook_report_syscall().
So compared to the generic entry and arm32 code, arm64 terminates
the process a bit later if a syscall is issued within
a restartable sequence.
But as commit b74406f37737 ("arm: Add syscall detection for restartable
sequences") says, syscalls are not allowed inside restartable sequences,
so rseq_syscall() should be called at the very beginning of the syscall
exit path on CONFIG_DEBUG_RSEQ=y kernels. This helps detect a syscall
issued inside a restartable sequence.
It makes sense to raise SIGSEGV via rseq_syscall() before auditing
and ptrace syscall exit, because this guarantees that the process is
already in an error state with SIGSEGV pending when those later steps
run. Although it makes no practical difference to signal delivery (signals
are processed at the very end in arm64_exit_to_user_mode()), the ordering
is more logical: detect and flag the error first, then proceed with
the remaining work.
To make the ordering more reasonable, and in preparation for moving
arm64 over to the generic entry code, move rseq_syscall() before
audit_syscall_exit().
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/ptrace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 63ba6c961ecc..dfdd886dc0a9 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2411,6 +2411,8 @@ int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
{
+ rseq_syscall(regs);
+
audit_syscall_exit(regs);
if (flags & _TIF_SYSCALL_TRACEPOINT)
@@ -2418,8 +2420,6 @@ void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
report_syscall_exit(regs);
-
- rseq_syscall(regs);
}
/*
--
2.34.1
* [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common()
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (3 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit() Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error Jinjie Ruan
` (6 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic syscall_exit_work() has the following content:
| audit_syscall_exit(regs)
| trace_sys_exit(regs, ...)
| ptrace_report_syscall_exit(regs, step)
The generic syscall_exit_to_user_mode_work() has
the following form:
| unsigned long work = READ_ONCE(current_thread_info()->syscall_work)
| rseq_syscall()
| if (unlikely(work & SYSCALL_WORK_EXIT))
| syscall_exit_work(regs, work)
In preparation for moving arm64 over to the generic entry code,
rework el0_svc_common() as below:
- Rename syscall_trace_exit() to syscall_exit_work().
- Add syscall_exit_to_user_mode_prepare() function to replace
the combination of read_thread_flags() and syscall_exit_work(),
also move the syscall exit check logic into it. Move has_syscall_work()
helper into asm/syscall.h for reuse.
- Since rseq_syscall() is now always called and is itself controlled
by CONFIG_DEBUG_RSEQ, remove the explicit CONFIG_DEBUG_RSEQ
check.
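The resulting exit-side decision can be sketched in userspace as below.
This is only a model of the control flow described in the list above: the
SK_* flag bits are arbitrary illustrative values (the real ones live in
asm/thread_info.h), and the counters in struct sk_exit_trace stand in for
the real rseq_syscall() and syscall_exit_work() calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define SK_TIF_SYSCALL_TRACE		(1UL << 0)
#define SK_TIF_SYSCALL_AUDIT		(1UL << 1)
#define SK_TIF_SYSCALL_TRACEPOINT	(1UL << 2)
#define SK_TIF_SECCOMP			(1UL << 3)
#define SK_TIF_SYSCALL_EMU		(1UL << 4)
#define SK_TIF_SINGLESTEP		(1UL << 5)

#define SK_TIF_SYSCALL_WORK	(SK_TIF_SYSCALL_TRACE | SK_TIF_SYSCALL_AUDIT | \
				 SK_TIF_SYSCALL_TRACEPOINT | SK_TIF_SECCOMP | \
				 SK_TIF_SYSCALL_EMU)

/* Counts how often each step ran, in place of the real side effects. */
struct sk_exit_trace {
	int rseq_checks;
	int exit_work_runs;
};

static bool sk_has_syscall_work(unsigned long flags)
{
	return (flags & SK_TIF_SYSCALL_WORK) != 0;
}

/* Models the shape of syscall_exit_to_user_mode_prepare(). */
static void sk_exit_prepare(unsigned long flags, struct sk_exit_trace *t)
{
	assert(t);

	/* The rseq check is unconditional at this level. */
	t->rseq_checks++;

	/* Exit work runs for any syscall work flag, or when single-stepping. */
	if (sk_has_syscall_work(flags) || (flags & SK_TIF_SINGLESTEP))
		t->exit_work_runs++;
}
```

With no flags set, only the rseq check is counted; setting either a
syscall work flag or the single-step flag also counts one exit-work run.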
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/include/asm/syscall.h | 7 ++++++-
arch/arm64/kernel/ptrace.c | 14 +++++++++++---
arch/arm64/kernel/syscall.c | 20 +-------------------
3 files changed, 18 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index d69f590a989b..6225981fbbdb 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -114,7 +114,12 @@ static inline int syscall_get_arch(struct task_struct *task)
return AUDIT_ARCH_AARCH64;
}
+static inline bool has_syscall_work(unsigned long flags)
+{
+ return unlikely(flags & _TIF_SYSCALL_WORK);
+}
+
int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags);
-void syscall_trace_exit(struct pt_regs *regs, unsigned long flags);
+void syscall_exit_to_user_mode_prepare(struct pt_regs *regs);
#endif /* __ASM_SYSCALL_H */
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index dfdd886dc0a9..233a7688ac94 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2409,10 +2409,8 @@ int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
return syscall;
}
-void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
+static void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
{
- rseq_syscall(regs);
-
audit_syscall_exit(regs);
if (flags & _TIF_SYSCALL_TRACEPOINT)
@@ -2422,6 +2420,16 @@ void syscall_trace_exit(struct pt_regs *regs, unsigned long flags)
report_syscall_exit(regs);
}
+void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
+{
+ unsigned long flags = read_thread_flags();
+
+ rseq_syscall(regs);
+
+ if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
+ syscall_exit_work(regs, flags);
+}
+
/*
* SPSR_ELx bits which are always architecturally RES0 per ARM DDI 0487D.a.
* We permit userspace to set SSBS (AArch64 bit 12, AArch32 bit 23) which is
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index ec31f82d2e9f..65021d0f49e1 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -65,11 +65,6 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
choose_random_kstack_offset(get_random_u16());
}
-static inline bool has_syscall_work(unsigned long flags)
-{
- return unlikely(flags & _TIF_SYSCALL_WORK);
-}
-
static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
const syscall_fn_t syscall_table[])
{
@@ -130,21 +125,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
}
invoke_syscall(regs, scno, sc_nr, syscall_table);
-
- /*
- * The tracing status may have changed under our feet, so we have to
- * check again. However, if we were tracing entry, then we always trace
- * exit regardless, as the old entry assembly did.
- */
- if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
- flags = read_thread_flags();
- if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP))
- return;
- }
-
trace_exit:
- flags = read_thread_flags();
- syscall_trace_exit(regs, flags);
+ syscall_exit_to_user_mode_prepare(regs);
}
void do_el0_svc(struct pt_regs *regs)
--
2.34.1
* [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (4 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common() Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place Jinjie Ruan
` (5 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic entry code aborts the syscall_trace_enter() sequence if
ptrace_report_syscall_entry() returns an error, but arm64 does not.
As the ptrace_report_syscall_entry() comment says, if it returns nonzero
the calling arch code should abort the system call and must prevent
normal entry so that no system call is made.
In preparation for moving arm64 over to the generic entry code,
return early if ptrace_report_syscall_entry() encounters an error.
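A minimal userspace sketch of the early-return behaviour described above,
assuming a toy struct sk_task in place of pt_regs and thread flags. All
sk_* names and flag values are hypothetical; tracer_ret plays the role of
the ptrace_report_syscall_entry() return value, and later_steps counts
whether the seccomp/tracepoint/audit part of the sequence was reached.

```c
#include <assert.h>

#define SK_NO_SYSCALL	(-1)
#define SK_FLAG_TRACE	(1UL << 0)	/* illustrative value only */
#define SK_FLAG_EMU	(1UL << 1)	/* illustrative value only */

struct sk_task {
	int syscallno;
	int tracer_ret;		/* nonzero: the tracer asks to abort */
	int later_steps;	/* steps reached after the ptrace report */
};

/* Models report_syscall_entry(): propagate the tracer's verdict. */
static int sk_report_entry(struct sk_task *t)
{
	if (t->tracer_ret)
		t->syscallno = SK_NO_SYSCALL;	/* models forget_syscall() */
	return t->tracer_ret;
}

/* Models the start of syscall_trace_enter() after this patch. */
static int sk_trace_enter(struct sk_task *t, unsigned long flags)
{
	assert(t);

	if (flags & (SK_FLAG_EMU | SK_FLAG_TRACE)) {
		int ret = sk_report_entry(t);

		/* Abort on a tracer error as well as for syscall emulation. */
		if (ret || (flags & SK_FLAG_EMU))
			return SK_NO_SYSCALL;
	}

	t->later_steps++;	/* seccomp, tracepoints, audit would run here */
	return t->syscallno;
}
```

In this model a nonzero tracer_ret skips the later steps entirely, which
is the behavioural difference from the previous arm64 code.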
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/ptrace.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 233a7688ac94..da9687d30bcf 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2346,15 +2346,18 @@ static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
return saved_reg;
}
-static void report_syscall_entry(struct pt_regs *regs)
+static int report_syscall_entry(struct pt_regs *regs)
{
unsigned long saved_reg;
- int regno;
+ int regno, ret;
 saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
- if (ptrace_report_syscall_entry(regs))
+ ret = ptrace_report_syscall_entry(regs);
+ if (ret)
forget_syscall(regs);
regs->regs[regno] = saved_reg;
+
+ return ret;
}
static void report_syscall_exit(struct pt_regs *regs)
@@ -2380,9 +2383,11 @@ static void report_syscall_exit(struct pt_regs *regs)
int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
{
+ int ret;
+
if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
- report_syscall_entry(regs);
- if (flags & _TIF_SYSCALL_EMU)
+ ret = report_syscall_entry(regs);
+ if (ret || (flags & _TIF_SYSCALL_EMU))
return NO_SYSCALL;
}
--
2.34.1
* [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (5 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() helper Jinjie Ruan
` (4 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic entry code expands secure_computing() in place and calls
__secure_computing() directly.
In order to switch arm64 to the generic entry, expand
secure_computing() in place in syscall_trace_enter() as well.
No functional changes.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/ptrace.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index da9687d30bcf..72d4d987ba3b 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2392,8 +2392,11 @@ int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
}
/* Do the secure computing after ptrace; failures should be fast. */
- if (secure_computing() == -1)
- return NO_SYSCALL;
+ if (flags & _TIF_SECCOMP) {
+ ret = __secure_computing();
+ if (ret == -1)
+ return NO_SYSCALL;
+ }
/* Either of the above might have changed the syscall number */
syscall = syscall_get_nr(current, regs);
@@ -2411,7 +2414,7 @@ int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
audit_syscall_entry(syscall, regs->orig_x0, regs->regs[1],
regs->regs[2], regs->regs[3]);
- return syscall;
+ return ret ? : syscall;
}
static void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
--
2.34.1
* [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() helper
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (6 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:30 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 09/12] entry: Split syscall_exit_to_user_mode_work() for arch reuse Jinjie Ruan
` (3 subsequent siblings)
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
The generic entry code checks the audit context first and uses the
syscall_get_arguments() helper.
In preparation for switching arm64 to the generic entry code:
- Use syscall_get_arguments() to get the last four parameters of
audit_syscall_entry().
- Extract the syscall_enter_audit() helper to make the code clearer.
- Check the audit context first, which saves an unnecessary memcpy() when
the current process's audit_context is NULL.
Overall these changes make syscall_enter_audit() exactly equivalent
to the generic one.
No functional changes.
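The benefit of checking the audit context first can be sketched in user space as below. This is only an illustration under assumptions: the demo_* names are hypothetical, and the register layout simply follows the call site removed by this patch (orig_x0 for the first argument, regs[1..3] for the next three).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NR_ARGS 6

struct demo_regs {
	unsigned long orig_x0;
	unsigned long regs[8];
};

static bool demo_has_audit_context;   /* stand-in for audit_context() != NULL */
static int demo_arg_copies;           /* how often the arguments were copied */
static unsigned long demo_audited[4]; /* last arguments handed to the audit hook */

/* Mock argument fetch: first argument from orig_x0, the rest from regs[1..5]. */
static void demo_get_arguments(const struct demo_regs *regs, unsigned long *args)
{
	args[0] = regs->orig_x0;
	memcpy(&args[1], &regs->regs[1], 5 * sizeof(args[0]));
	demo_arg_copies++;
}

/* The copy only happens when there is an audit context to report to. */
static void demo_syscall_enter_audit(const struct demo_regs *regs)
{
	if (demo_has_audit_context) {
		unsigned long args[DEMO_NR_ARGS];

		demo_get_arguments(regs, args);
		memcpy(demo_audited, args, sizeof(demo_audited));
	}
}
```

Without an audit context the helper returns before any argument is copied, which is the saving the commit message refers to.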
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/kernel/ptrace.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 72d4d987ba3b..c2bd0130212d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -2381,6 +2381,16 @@ static void report_syscall_exit(struct pt_regs *regs)
}
}
+static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
+{
+ if (unlikely(audit_context())) {
+ unsigned long args[6];
+
+ syscall_get_arguments(current, regs, args);
+ audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
+ }
+}
+
int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
{
int ret;
@@ -2411,8 +2421,7 @@ int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
syscall = syscall_get_nr(current, regs);
}
- audit_syscall_entry(syscall, regs->orig_x0, regs->regs[1],
- regs->regs[2], regs->regs[3]);
+ syscall_enter_audit(regs, syscall);
return ret ? : syscall;
}
--
2.34.1
* [PATCH v8 09/12] entry: Split syscall_exit_to_user_mode_work() for arch reuse
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (7 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() helper Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit() Jinjie Ruan
` (2 subsequent siblings)
11 siblings, 0 replies; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
In the generic entry code, the beginning of
syscall_exit_to_user_mode_work() can be reused on arm64 so it makes
sense to split it.
In preparation for moving arm64 over to the generic entry
code, split out the syscall_exit_to_user_mode_prepare() helper from
syscall_exit_to_user_mode_work().
No functional changes.
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
include/linux/entry-common.h | 35 ++++++++++++++++++++++-------------
1 file changed, 22 insertions(+), 13 deletions(-)
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 7177436f0f9e..cd6dacb2d8bf 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -137,20 +137,11 @@ static __always_inline long syscall_enter_from_user_mode(struct pt_regs *regs, l
*/
void syscall_exit_work(struct pt_regs *regs, unsigned long work);
-/**
- * syscall_exit_to_user_mode_work - Handle work before returning to user mode
- * @regs: Pointer to currents pt_regs
- *
- * Same as step 1 and 2 of syscall_exit_to_user_mode() but without calling
- * exit_to_user_mode() to perform the final transition to user mode.
- *
- * Calling convention is the same as for syscall_exit_to_user_mode() and it
- * returns with all work handled and interrupts disabled. The caller must
- * invoke exit_to_user_mode() before actually switching to user mode to
- * make the final state transitions. Interrupts must stay disabled between
- * return from this function and the invocation of exit_to_user_mode().
+/*
+ * Syscall specific exit to user mode preparation. Runs with interrupts
+ * enabled.
*/
-static __always_inline void syscall_exit_to_user_mode_work(struct pt_regs *regs)
+static __always_inline void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
{
unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
unsigned long nr = syscall_get_nr(current, regs);
@@ -171,6 +162,24 @@ static __always_inline void syscall_exit_to_user_mode_work(struct pt_regs *regs)
*/
if (unlikely(work & SYSCALL_WORK_EXIT))
syscall_exit_work(regs, work);
+}
+
+/**
+ * syscall_exit_to_user_mode_work - Handle work before returning to user mode
+ * @regs: Pointer to currents pt_regs
+ *
+ * Same as step 1 and 2 of syscall_exit_to_user_mode() but without calling
+ * exit_to_user_mode() to perform the final transition to user mode.
+ *
+ * Calling convention is the same as for syscall_exit_to_user_mode() and it
+ * returns with all work handled and interrupts disabled. The caller must
+ * invoke exit_to_user_mode() before actually switching to user mode to
+ * make the final state transitions. Interrupts must stay disabled between
+ * return from this function and the invocation of exit_to_user_mode().
+ */
+static __always_inline void syscall_exit_to_user_mode_work(struct pt_regs *regs)
+{
+ syscall_exit_to_user_mode_prepare(regs);
local_irq_disable_exit_to_user();
exit_to_user_mode_prepare(regs);
}
--
2.34.1
* [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit()
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (8 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 09/12] entry: Split syscall_exit_to_user_mode_work() for arch reuse Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:30 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 11/12] arm64: entry: Convert to generic entry Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 12/12] selftests: sud_test: Support aarch64 Jinjie Ruan
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
Unlike the generic entry code, arm64 needs to save and restore a scratch
register around the ptrace syscall entry/exit report because, for
historical reasons, it uses that register (ip/r12 on AArch32, x7 on
AArch64) to denote syscall entry/exit.
In preparation for moving arm64 over to the generic entry code,
add arch_ptrace_report_syscall_entry/exit(), which default to
ptrace_report_syscall_entry/exit(). This allows arm64 to implement
an architecture-specific version.
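The override mechanism relies on the usual "define a macro with the same name as the function" idiom. A minimal stand-alone sketch follows; the demo_* names and the DEMO_ARCH_OVERRIDE switch are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* What an architecture header might provide (hypothetical). */
#ifdef DEMO_ARCH_OVERRIDE
static inline int demo_arch_report_entry(int value)
{
	return value + 100;	/* arch-specific behaviour */
}
/* Self-referential define: only signals that an override exists. */
#define demo_arch_report_entry demo_arch_report_entry
#endif

/* Generic code: the plain reporting function. */
static inline int demo_report_entry(int value)
{
	return value;
}

/* Generic code: default wrapper, compiled only without an override. */
#ifndef demo_arch_report_entry
static inline int demo_arch_report_entry(int value)
{
	return demo_report_entry(value);
}
#endif

/* The caller always goes through the arch_ hook. */
static int demo_trace_enter(int value)
{
	return demo_arch_report_entry(value);
}
```

Built as is, demo_trace_enter(5) goes through the default wrapper and returns 5. Built with -DDEMO_ARCH_OVERRIDE, the same call picks up the override and returns 105, without any change to the generic caller.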
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
kernel/entry/syscall-common.c | 43 +++++++++++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/kernel/entry/syscall-common.c b/kernel/entry/syscall-common.c
index 66e6ba7fa80c..27310e611567 100644
--- a/kernel/entry/syscall-common.c
+++ b/kernel/entry/syscall-common.c
@@ -17,6 +17,25 @@ static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
}
}
+/**
+ * arch_ptrace_report_syscall_entry - Architecture specific
+ * ptrace_report_syscall_entry().
+ *
+ * Invoked from syscall_trace_enter() to wrap ptrace_report_syscall_entry().
+ * Defaults to ptrace_report_syscall_entry.
+ *
+ * The main purpose is to support arch-specific ptrace_report_syscall_entry()
+ * implementation.
+ */
+static __always_inline int arch_ptrace_report_syscall_entry(struct pt_regs *regs);
+
+#ifndef arch_ptrace_report_syscall_entry
+static __always_inline int arch_ptrace_report_syscall_entry(struct pt_regs *regs)
+{
+ return ptrace_report_syscall_entry(regs);
+}
+#endif
+
long syscall_trace_enter(struct pt_regs *regs, long syscall,
unsigned long work)
{
@@ -34,7 +53,7 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
/* Handle ptrace */
if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
- ret = ptrace_report_syscall_entry(regs);
+ ret = arch_ptrace_report_syscall_entry(regs);
if (ret || (work & SYSCALL_WORK_SYSCALL_EMU))
return -1L;
}
@@ -84,6 +103,26 @@ static inline bool report_single_step(unsigned long work)
return work & SYSCALL_WORK_SYSCALL_EXIT_TRAP;
}
+/**
+ * arch_ptrace_report_syscall_exit - Architecture specific
+ * ptrace_report_syscall_exit.
+ *
+ * Invoked from syscall_exit_work() to wrap ptrace_report_syscall_exit().
+ *
+ * The main purpose is to support arch-specific ptrace_report_syscall_exit
+ * implementation.
+ */
+static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
+ int step);
+
+#ifndef arch_ptrace_report_syscall_exit
+static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
+ int step)
+{
+ ptrace_report_syscall_exit(regs, step);
+}
+#endif
+
void syscall_exit_work(struct pt_regs *regs, unsigned long work)
{
bool step;
@@ -108,5 +147,5 @@ void syscall_exit_work(struct pt_regs *regs, unsigned long work)
step = report_single_step(work);
if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
- ptrace_report_syscall_exit(regs, step);
+ arch_ptrace_report_syscall_exit(regs, step);
}
--
2.34.1
* [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (9 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit() Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
2025-11-27 13:31 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 12/12] selftests: sud_test: Support aarch64 Jinjie Ruan
11 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
Currently, x86, RISC-V and LoongArch use the generic entry code, which
makes maintainers' work easier and the code more elegant. arm64 has already
switched to the generic IRQ entry, so completely convert arm64 to use
the generic entry infrastructure from kernel/entry/*.
The changes are as follows:
- Remove the TIF_SYSCALL_TRACE, TIF_SYSCALL_AUDIT and
TIF_SYSCALL_TRACEPOINT flags and _TIF_SYSCALL_WORK, and remove
has_syscall_work(), as _TIF_SYSCALL_WORK is equivalent to
SYSCALL_WORK_ENTER.
- Implement arch_ptrace_report_syscall_entry/exit() based on
report_syscall_entry/exit() to do the arm64-specific save/restore
during syscall entry/exit.
- Define ARCH_SYSCALL_WORK_EXIT as (_TIF_SECCOMP | _TIF_SYSCALL_EMU)
to keep the arm64 behaviour unchanged.
- Remove the arm64 versions of syscall_trace_enter(),
syscall_exit_to_user_mode_prepare() and related helpers, including
syscall_exit_work() and syscall_enter_audit(), and call the generic
entry functions with equivalent functionality instead.
- Implement arch_syscall_is_vdso_sigreturn() to support "Syscall User
Dispatch".
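The arm64-specific save/restore mentioned in the list above can be sketched in user space as follows. The demo_* names and the mock tracer are hypothetical; only the register numbers (12 for AArch32, 7 for AArch64) and the 0/1 direction markers are taken from ptrace_save_reg() in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_dir {
	DEMO_SYSCALL_ENTER = 0,
	DEMO_SYSCALL_EXIT,
};

struct demo_regs {
	unsigned long regs[31];
};

static unsigned long demo_seen_by_tracer; /* marker observed during the stop */

/* Hypothetical tracer stop: records the marker, then tries to overwrite it. */
static void demo_tracer_stop(struct demo_regs *regs, int regno)
{
	demo_seen_by_tracer = regs->regs[regno];
	regs->regs[regno] = 0xdead;	/* discarded by the restore below */
}

/* Clobber the scratch register with the direction, report, then restore. */
static void demo_report(struct demo_regs *regs, enum demo_dir dir, bool compat)
{
	int regno = compat ? 12 : 7;
	unsigned long saved = regs->regs[regno];

	regs->regs[regno] = dir;
	demo_tracer_stop(regs, regno);
	regs->regs[regno] = saved;
}
```

During the stop the mock tracer sees only the direction marker, and its write to the scratch register is lost once the saved value is restored, matching the first two points of the ABI comment in ptrace_save_reg().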
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/entry-common.h | 76 ++++++++++++++
arch/arm64/include/asm/syscall.h | 20 +++-
arch/arm64/include/asm/thread_info.h | 16 +--
arch/arm64/kernel/debug-monitors.c | 7 ++
arch/arm64/kernel/ptrace.c | 138 --------------------------
arch/arm64/kernel/signal.c | 2 +-
arch/arm64/kernel/syscall.c | 6 +-
8 files changed, 108 insertions(+), 159 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6663ffd23f25..1463ff15d67a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -152,9 +152,9 @@ config ARM64
select GENERIC_CPU_DEVICES
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
+ select GENERIC_ENTRY
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IOREMAP
- select GENERIC_IRQ_ENTRY
select GENERIC_IRQ_IPI
select GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD
select GENERIC_IRQ_PROBE
diff --git a/arch/arm64/include/asm/entry-common.h b/arch/arm64/include/asm/entry-common.h
index cab8cd78f693..ab0544b44549 100644
--- a/arch/arm64/include/asm/entry-common.h
+++ b/arch/arm64/include/asm/entry-common.h
@@ -3,14 +3,21 @@
#ifndef _ASM_ARM64_ENTRY_COMMON_H
#define _ASM_ARM64_ENTRY_COMMON_H
+#include <linux/ptrace.h>
#include <linux/thread_info.h>
+#include <asm/compat.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/fpsimd.h>
#include <asm/mte.h>
#include <asm/stacktrace.h>
+enum ptrace_syscall_dir {
+ PTRACE_SYSCALL_ENTER = 0,
+ PTRACE_SYSCALL_EXIT,
+};
+
#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
@@ -54,4 +61,73 @@ static inline bool arch_irqentry_exit_need_resched(void)
#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched
+static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
+ enum ptrace_syscall_dir dir,
+ int *regno)
+{
+ unsigned long saved_reg;
+
+ /*
+ * We have some ABI weirdness here in the way that we handle syscall
+ * exit stops because we indicate whether or not the stop has been
+ * signalled from syscall entry or syscall exit by clobbering a general
+ * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
+ * and restoring its old value after the stop. This means that:
+ *
+ * - Any writes by the tracer to this register during the stop are
+ * ignored/discarded.
+ *
+ * - The actual value of the register is not available during the stop,
+ * so the tracer cannot save it and restore it later.
+ *
+ * - Syscall stops behave differently to seccomp and pseudo-step traps
+ * (the latter do not nobble any registers).
+ */
+ *regno = (is_compat_task() ? 12 : 7);
+ saved_reg = regs->regs[*regno];
+ regs->regs[*regno] = dir;
+
+ return saved_reg;
+}
+
+static __always_inline int arch_ptrace_report_syscall_entry(struct pt_regs *regs)
+{
+ unsigned long saved_reg;
+ int regno, ret;
+
+ saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
+ ret = ptrace_report_syscall_entry(regs);
+ if (ret)
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+
+ return ret;
+}
+
+#define arch_ptrace_report_syscall_entry arch_ptrace_report_syscall_entry
+
+static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
+ int step)
+{
+ unsigned long saved_reg;
+ int regno;
+
+ saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
+ if (!step) {
+ ptrace_report_syscall_exit(regs, 0);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
+
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ ptrace_report_syscall_exit(regs, 1);
+ }
+}
+
+#define arch_ptrace_report_syscall_exit arch_ptrace_report_syscall_exit
+
#endif /* _ASM_ARM64_ENTRY_COMMON_H */
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 6225981fbbdb..f705ba2bb6fd 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -9,6 +9,9 @@
#include <linux/compat.h>
#include <linux/err.h>
+#include <asm/compat.h>
+#include <asm/vdso.h>
+
typedef long (*syscall_fn_t)(const struct pt_regs *regs);
extern const syscall_fn_t sys_call_table[];
@@ -114,12 +117,21 @@ static inline int syscall_get_arch(struct task_struct *task)
return AUDIT_ARCH_AARCH64;
}
-static inline bool has_syscall_work(unsigned long flags)
+static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
{
- return unlikely(flags & _TIF_SYSCALL_WORK);
+ unsigned long sigtramp;
+
+#ifdef CONFIG_COMPAT
+ if (is_compat_task()) {
+ unsigned long vdso = (unsigned long)current->mm->context.sigpage;
+
+ return (regs->pc >= vdso && regs->pc < (vdso + PAGE_SIZE));
+ }
+#endif
+ sigtramp = (unsigned long)VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
+ return regs->pc == (sigtramp + 8);
}
-int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags);
-void syscall_exit_to_user_mode_prepare(struct pt_regs *regs);
+#define ARCH_SYSCALL_WORK_EXIT (_TIF_SECCOMP | _TIF_SYSCALL_EMU)
#endif /* __ASM_SYSCALL_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index ff4998fa1844..d3142b5d1b9c 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -43,6 +43,7 @@ struct thread_info {
void *scs_sp;
#endif
u32 cpu;
+ unsigned long syscall_work; /* SYSCALL_WORK_ flags */
};
#define thread_saved_pc(tsk) \
@@ -65,11 +66,8 @@ void arch_setup_new_exec(void);
#define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
#define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
#define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
-#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
-#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
-#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
-#define TIF_SECCOMP 11 /* syscall secure computing */
-#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
+#define TIF_SECCOMP 11 /* syscall secure computing */
+#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
#define TIF_PATCH_PENDING 13 /* pending live patching update */
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
#define TIF_FREEZE 19
@@ -92,24 +90,16 @@ void arch_setup_new_exec(void);
#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
-#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
-#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
-#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
#define _TIF_UPROBE (1 << TIF_UPROBE)
-#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
#define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
#define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
-#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
- _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
- _TIF_SYSCALL_EMU)
-
#ifdef CONFIG_SHADOW_CALL_STACK
#define INIT_SCS \
.scs_base = init_shadow_call_stack, \
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index 29307642f4c9..e67643a70405 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -385,11 +385,18 @@ void user_enable_single_step(struct task_struct *task)
if (!test_and_set_ti_thread_flag(ti, TIF_SINGLESTEP))
set_regs_spsr_ss(task_pt_regs(task));
+
+ /*
+ * Ensure that a trap is triggered once stepping out of a system
+ * call prior to executing any user instruction.
+ */
+ set_task_syscall_work(task, SYSCALL_EXIT_TRAP);
}
NOKPROBE_SYMBOL(user_enable_single_step);
void user_disable_single_step(struct task_struct *task)
{
clear_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP);
+ clear_task_syscall_work(task, SYSCALL_EXIT_TRAP);
}
NOKPROBE_SYMBOL(user_disable_single_step);
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index c2bd0130212d..9e3b39e207d1 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -42,9 +42,6 @@
#include <asm/traps.h>
#include <asm/system_misc.h>
-#define CREATE_TRACE_POINTS
-#include <trace/events/syscalls.h>
-
struct pt_regs_offset {
const char *name;
int offset;
@@ -2312,141 +2309,6 @@ long arch_ptrace(struct task_struct *child, long request,
return ptrace_request(child, request, addr, data);
}
-enum ptrace_syscall_dir {
- PTRACE_SYSCALL_ENTER = 0,
- PTRACE_SYSCALL_EXIT,
-};
-
-static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
- enum ptrace_syscall_dir dir,
- int *regno)
-{
- unsigned long saved_reg;
-
- /*
- * We have some ABI weirdness here in the way that we handle syscall
- * exit stops because we indicate whether or not the stop has been
- * signalled from syscall entry or syscall exit by clobbering a general
- * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
- * and restoring its old value after the stop. This means that:
- *
- * - Any writes by the tracer to this register during the stop are
- * ignored/discarded.
- *
- * - The actual value of the register is not available during the stop,
- * so the tracer cannot save it and restore it later.
- *
- * - Syscall stops behave differently to seccomp and pseudo-step traps
- * (the latter do not nobble any registers).
- */
- *regno = (is_compat_task() ? 12 : 7);
- saved_reg = regs->regs[*regno];
- regs->regs[*regno] = dir;
-
- return saved_reg;
-}
-
-static int report_syscall_entry(struct pt_regs *regs)
-{
- unsigned long saved_reg;
- int regno, ret;
-
- saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
- ret = ptrace_report_syscall_entry(regs);
- if (ret)
- forget_syscall(regs);
- regs->regs[regno] = saved_reg;
-
- return ret;
-}
-
-static void report_syscall_exit(struct pt_regs *regs)
-{
- unsigned long saved_reg;
- int regno;
-
- saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
- if (!test_thread_flag(TIF_SINGLESTEP)) {
- ptrace_report_syscall_exit(regs, 0);
- regs->regs[regno] = saved_reg;
- } else {
- regs->regs[regno] = saved_reg;
-
- /*
- * Signal a pseudo-step exception since we are stepping but
- * tracer modifications to the registers may have rewound the
- * state machine.
- */
- ptrace_report_syscall_exit(regs, 1);
- }
-}
-
-static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
-{
- if (unlikely(audit_context())) {
- unsigned long args[6];
-
- syscall_get_arguments(current, regs, args);
- audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
- }
-}
-
-int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
-{
- int ret;
-
- if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
- ret = report_syscall_entry(regs);
- if (ret || (flags & _TIF_SYSCALL_EMU))
- return NO_SYSCALL;
- }
-
- /* Do the secure computing after ptrace; failures should be fast. */
- if (flags & _TIF_SECCOMP) {
- ret = __secure_computing();
- if (ret == -1)
- return NO_SYSCALL;
- }
-
- /* Either of the above might have changed the syscall number */
- syscall = syscall_get_nr(current, regs);
-
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) {
- trace_sys_enter(regs, syscall);
-
- /*
- * Probes or BPF hooks in the tracepoint may have changed the
- * system call number as well.
- */
- syscall = syscall_get_nr(current, regs);
- }
-
- syscall_enter_audit(regs, syscall);
-
- return ret ? : syscall;
-}
-
-static void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
-{
- audit_syscall_exit(regs);
-
- if (flags & _TIF_SYSCALL_TRACEPOINT)
- trace_sys_exit(regs, syscall_get_return_value(current, regs));
-
- if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
- report_syscall_exit(regs);
-}
-
-void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
-{
- unsigned long flags = read_thread_flags();
-
- rseq_syscall(regs);
-
- if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
- syscall_exit_work(regs, flags);
-}
-
/*
* SPSR_ELx bits which are always architecturally RES0 per ARM DDI 0487D.a.
* We permit userspace to set SSBS (AArch64 bit 12, AArch32 bit 23) which is
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 1110eeb21f57..d3ec1892b3c7 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -8,8 +8,8 @@
#include <linux/cache.h>
#include <linux/compat.h>
+#include <linux/entry-common.h>
#include <linux/errno.h>
-#include <linux/irq-entry-common.h>
#include <linux/kernel.h>
#include <linux/signal.h>
#include <linux/freezer.h>
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 65021d0f49e1..9848772c63fd 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -2,6 +2,7 @@
#include <linux/compiler.h>
#include <linux/context_tracking.h>
+#include <linux/entry-common.h>
#include <linux/errno.h>
#include <linux/nospec.h>
#include <linux/ptrace.h>
@@ -68,6 +69,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
const syscall_fn_t syscall_table[])
{
+ unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
unsigned long flags = read_thread_flags();
regs->orig_x0 = regs->regs[0];
@@ -101,7 +103,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
return;
}
- if (has_syscall_work(flags)) {
+ if (work & SYSCALL_WORK_ENTER) {
/*
* The de-facto standard way to skip a system call using ptrace
* is to set the system call to -1 (NO_SYSCALL) and set x0 to a
@@ -119,7 +121,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
*/
if (scno == NO_SYSCALL)
syscall_set_return_value(current, regs, -ENOSYS, 0);
- scno = syscall_trace_enter(regs, scno, flags);
+ scno = syscall_trace_enter(regs, scno, work);
if (scno == NO_SYSCALL)
goto trace_exit;
}
--
2.34.1
* [PATCH v8 12/12] selftests: sud_test: Support aarch64
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
` (10 preceding siblings ...)
2025-11-26 7:14 ` [PATCH v8 11/12] arm64: entry: Convert to generic entry Jinjie Ruan
@ 2025-11-26 7:14 ` Jinjie Ruan
11 siblings, 0 replies; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-26 7:14 UTC (permalink / raw)
To: catalin.marinas, will, oleg, tglx, peterz, luto, shuah, kees, wad,
charlie, akpm, ldv, macro, deller, mark.rutland, efault, song,
mbenes, ryan.roberts, ada.coupriediaz, anshuman.khandual, broonie,
kevin.brodsky, pengcan, dvyukov, kmal, linux-arm-kernel,
linux-kernel, linux-kselftest
Cc: ruanjinjie
From: kemal <kmal@cock.li>
Add aarch64 support to the sud_test selftest so that "Syscall User
Dispatch" can be tested on arm64.
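As far as the diff below shows, the aarch64 hunk makes the SIGSYS handler copy x8 (the AArch64 syscall number register) into x0 (the return value register), truncated to 32 bits. A stand-alone sketch of that assignment, using a hypothetical demo_mcontext structure rather than the real ucontext_t:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the general-purpose registers in uc_mcontext. */
struct demo_mcontext {
	uint64_t regs[31];
};

/* Make the intercepted syscall appear to return its own number. */
static void demo_emulate_return(struct demo_mcontext *mc)
{
	mc->regs[0] = (unsigned int)mc->regs[8];
}
```

The cast keeps only the low 32 bits of x8, so any stale upper bits in the syscall number register do not leak into the emulated return value.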
Signed-off-by: kemal <kmal@cock.li>
---
tools/testing/selftests/syscall_user_dispatch/sud_test.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index 2eb2c06303f2..f53ebc89befc 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -192,6 +192,10 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A0] =
((ucontext_t *)ucontext)->uc_mcontext.__gregs[REG_A7];
#endif
+#ifdef __aarch64__
+ ((ucontext_t *)ucontext)->uc_mcontext.regs[0] = (unsigned int)
+ ((ucontext_t *)ucontext)->uc_mcontext.regs[8];
+#endif
}
int setup_sigsys_handler(void)
--
2.34.1
* Re: [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK
2025-11-26 7:14 ` [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK Jinjie Ruan
@ 2025-11-27 13:27 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:27 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> Since commit b3cf07851b6c ("arm64: entry: Switch to generic IRQ
> entry"), _TIF_WORK_MASK is never used, so remove it.
>
> Fixes: b3cf07851b6c ("arm64: entry: Switch to generic IRQ entry")
I'm not sure a Fixes: tag is warranted - this has no functional effect
and doesn't need to be backported.
Otherwise:
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/include/asm/thread_info.h | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f241b8601ebd..ff4998fa1844 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -106,12 +106,6 @@ void arch_setup_new_exec(void);
> #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
> #define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
>
> -#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \
> - _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> - _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \
> - _TIF_NOTIFY_SIGNAL | _TIF_SIGPENDING | \
> - _TIF_PATCH_PENDING)
> -
> #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
> _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> _TIF_SYSCALL_EMU)
* Re: [PATCH v8 02/12] arm64/ptrace: Split report_syscall()
2025-11-26 7:14 ` [PATCH v8 02/12] arm64/ptrace: Split report_syscall() Jinjie Ruan
@ 2025-11-27 13:28 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:28 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic syscall entry code has the form:
>
> | syscall_trace_enter()
> | {
> | ptrace_report_syscall_entry()
> | }
> |
> | syscall_exit_work()
> | {
> | ptrace_report_syscall_exit()
> | }
>
> In preparation for moving arm64 over to the generic entry code, split
> report_syscall() to two separate enter and exit functions to align
> the structure of the arm64 code with syscall_trace_enter() and
> syscall_exit_work() from the generic entry code.
>
> No functional changes.
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
* Re: [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit()
2025-11-26 7:14 ` [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit() Jinjie Ruan
@ 2025-11-27 13:28 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:28 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic syscall entry code has the following form, which uses
> the input syscall work flag and syscall number:
>
> | syscall_trace_enter(struct pt_regs *regs, long syscall,
> | unsigned long work)
> |
> | syscall_exit_work(struct pt_regs *regs, unsigned long work)
>
> In preparation for moving arm64 over to the generic entry code,
> refactor syscall_trace_enter/exit() to also pass thread flags, and
> get syscall number by syscall_get_nr() helper.
>
> No functional changes.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit()
2025-11-26 7:14 ` [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit() Jinjie Ruan
@ 2025-11-27 13:28 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:28 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> Commit a9f3a74a29af ("entry: Provide generic syscall exit function")
> introduced the generic syscall exit function, which calls rseq_syscall()
> before audit_syscall_exit() and arch_syscall_exit_tracehook().
>
> Commit b74406f37737 ("arm: Add syscall detection for restartable
> sequences") added rseq support for arm32, which also calls rseq_syscall()
> before audit_syscall_exit() and tracehook_report_syscall().
>
> However, commit 409d5db49867c ("arm64: rseq: Implement backend rseq
> calls and select HAVE_RSEQ") implemented arm64 rseq and calls
> rseq_syscall() after audit_syscall_exit() and tracehook_report_syscall().
>
> So compared to the generic entry and arm32 code, arm64 terminates
> the process a bit later if the syscall is issued within
> a restartable sequence.
>
> But as commit b74406f37737 ("arm: Add syscall detection for restartable
> sequences") said, syscalls are not allowed inside restartable sequences,
> so rseq_syscall() should be called at the very beginning of the system
> call exit path on CONFIG_DEBUG_RSEQ=y kernels. This helps detect
> whether a syscall was issued inside a restartable sequence.
>
> It makes sense to raise SIGSEGV via rseq_syscall() before auditing
> and ptrace syscall exit, because this guarantees that the process is
> already in an error state with SIGSEGV pending when those later steps
> run. Although it makes no practical difference to signal delivery (signals
> are processed at the very end in arm64_exit_to_user_mode()), the ordering
> is more logical: detect and flag the error first, then proceed with
> the remaining work.
Thanks for providing this separate patch with a detailed rationale!
> To make the ordering more logical, and in preparation for moving arm64
> over to the generic entry code, move rseq_syscall() ahead of
> audit_syscall_exit().
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common()
2025-11-26 7:14 ` [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common() Jinjie Ruan
@ 2025-11-27 13:29 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:29 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic syscall syscall_exit_work() has the following content:
>
> | audit_syscall_exit(regs)
> | trace_sys_exit(regs, ...)
> | ptrace_report_syscall_exit(regs, step)
>
> The generic syscall syscall_exit_to_user_mode_work() has
> the following form:
>
> | unsigned long work = READ_ONCE(current_thread_info()->syscall_work)
> | rseq_syscall()
> | if (unlikely(work & SYSCALL_WORK_EXIT))
> | syscall_exit_work(regs, work)
>
> In preparation for moving arm64 over to the generic entry code,
> rework el0_svc_common() as below:
>
> - Rename syscall_trace_exit() to syscall_exit_work().
>
> - Add syscall_exit_to_user_mode_prepare() function to replace
> the combination of read_thread_flags() and syscall_exit_work(),
> also move the syscall exit check logic into it. Move has_syscall_work()
> helper into asm/syscall.h for reuse.
>
> - As rseq_syscall() is currently always called and is itself controlled
> by the CONFIG_DEBUG_RSEQ macro, the CONFIG_DEBUG_RSEQ check
> is removed.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
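
For readers following the rework, the exit ordering described in the
commit message can be modelled in user space. This is only a sketch: the
demo_* names and bit values are made up for illustration, and the
conditions mirror the arm64 syscall_exit_work() that patch 11 later
removes, not the generic entry source itself.

```c
#include <stddef.h>

/* Hypothetical work bits; the real SYSCALL_WORK_* values differ. */
#define DEMO_WORK_TRACE      (1UL << 0)
#define DEMO_WORK_AUDIT      (1UL << 1)
#define DEMO_WORK_TRACEPOINT (1UL << 2)
#define DEMO_WORK_EXIT       (DEMO_WORK_TRACE | DEMO_WORK_AUDIT | \
			      DEMO_WORK_TRACEPOINT)

/* One entry per step that would run on the syscall exit path. */
enum demo_step { DEMO_RSEQ, DEMO_AUDIT, DEMO_TRACE_EXIT, DEMO_PTRACE };

struct demo_log {
	enum demo_step steps[4];
	size_t n;
};

static void demo_record(struct demo_log *log, enum demo_step s)
{
	if (log->n < 4)
		log->steps[log->n++] = s;
}

/* Stand-in for syscall_exit_work(): audit, tracepoint, then ptrace. */
static void demo_syscall_exit_work(struct demo_log *log, unsigned long work)
{
	demo_record(log, DEMO_AUDIT);
	if (work & DEMO_WORK_TRACEPOINT)
		demo_record(log, DEMO_TRACE_EXIT);
	if (work & DEMO_WORK_TRACE)
		demo_record(log, DEMO_PTRACE);
}

/* Stand-in for the exit prepare step: rseq check first, unconditionally. */
static void demo_syscall_exit_prepare(struct demo_log *log, unsigned long work)
{
	demo_record(log, DEMO_RSEQ);
	if (work & DEMO_WORK_EXIT)
		demo_syscall_exit_work(log, work);
}
```

With no work bits set, only the rseq step is recorded. With any exit work
bit set, the rseq step still comes first and the audit step follows it.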
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error
2025-11-26 7:14 ` [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error Jinjie Ruan
@ 2025-11-27 13:29 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:29 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic entry aborts the syscall_trace_enter() sequence if
> ptrace_report_syscall_entry() errors out, but arm64 does not.
>
> As the ptrace_report_syscall_entry() comment says, the calling arch code
> should abort the system call and must prevent normal entry, so that no
> system call is made if ptrace_report_syscall_entry() returns nonzero.
Which is already the case on arm64 before patch 2, which changes
syscall_trace_enter() so that it no longer returns regs->syscallno,
meaning that forget_syscall() no longer skips the syscall.
The most sensible thing to do is probably to move this patch before
patch 2. This ensures patch 2 doesn't introduce a regression, and then
the only effect of this patch is to abort the trace sequence early.
- Kevin
> In preparation for moving arm64 over to the generic entry code,
> return early if ptrace_report_syscall_entry() encounters an error.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/arm64/kernel/ptrace.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 233a7688ac94..da9687d30bcf 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -2346,15 +2346,18 @@ static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
> return saved_reg;
> }
>
> -static void report_syscall_entry(struct pt_regs *regs)
> +static int report_syscall_entry(struct pt_regs *regs)
> {
> unsigned long saved_reg;
> - int regno;
> + int regno, ret;
>
> saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
> - if (ptrace_report_syscall_entry(regs))
> + ret = ptrace_report_syscall_entry(regs);
> + if (ret)
> forget_syscall(regs);
> regs->regs[regno] = saved_reg;
> +
> + return ret;
> }
>
> static void report_syscall_exit(struct pt_regs *regs)
> @@ -2380,9 +2383,11 @@ static void report_syscall_exit(struct pt_regs *regs)
>
> int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
> {
> + int ret;
> +
> if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
> - report_syscall_entry(regs);
> - if (flags & _TIF_SYSCALL_EMU)
> + ret = report_syscall_entry(regs);
> + if (ret || (flags & _TIF_SYSCALL_EMU))
> return NO_SYSCALL;
> }
>
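
The control flow of the hunk above can be tried out in isolation with a
small user-space model. This is a sketch under assumptions: the demo_*
names, the flag bit values and the tracer_ret parameter (standing in for
the return value of ptrace_report_syscall_entry()) are hypothetical;
NO_SYSCALL is -1 as on arm64.

```c
#define NO_SYSCALL (-1)

/* Hypothetical flag bits, for illustration only. */
#define DEMO_TRACE (1UL << 0)
#define DEMO_EMU   (1UL << 1)

struct demo_regs {
	long syscallno;
};

/* Models report_syscall_entry(): forget the syscall if the tracer fails. */
static int demo_report_syscall_entry(struct demo_regs *regs, int tracer_ret)
{
	if (tracer_ret)
		regs->syscallno = NO_SYSCALL;	/* forget_syscall() */

	return tracer_ret;
}

/* Models the patched syscall_trace_enter(): bail out early on error. */
static long demo_syscall_trace_enter(struct demo_regs *regs,
				     unsigned long flags, int tracer_ret)
{
	if (flags & (DEMO_EMU | DEMO_TRACE)) {
		int ret = demo_report_syscall_entry(regs, tracer_ret);

		if (ret || (flags & DEMO_EMU))
			return NO_SYSCALL;
	}

	/* seccomp, tracepoints and audit would run here. */
	return regs->syscallno;
}
```

In this model a failing report skips everything after the ptrace step,
which is the only new effect of the patch if it is ordered before patch 2.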
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place
2025-11-26 7:14 ` [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place Jinjie Ruan
@ 2025-11-27 13:29 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:29 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic entry expands secure_computing() in place and calls
> __secure_computing() directly. In order to switch to the generic entry
> for arm64, refactor secure_computing() for syscall_trace_enter().
>
> No functional changes.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() heleper
2025-11-26 7:14 ` [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() heleper Jinjie Ruan
@ 2025-11-27 13:30 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:30 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> The generic entry checks the audit context first and uses the
> syscall_get_arguments() helper.
>
> In order to switch to the generic entry for arm64,
>
> - Also use syscall_get_arguments() to get audit_syscall_entry()'s
> last four parameters.
>
> - Extract the syscall_enter_audit() helper to make it clear.
>
> - Check audit context first, which saves an unnecessary memcpy when
> current process's audit_context is NULL.
>
> Overall these changes make syscall_enter_audit() exactly equivalent
> to the generic one.
>
> No functional changes.
>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
As noted in v7, in subject: s/heleper/helper/
Otherwise:
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit()
2025-11-26 7:14 ` [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit() Jinjie Ruan
@ 2025-11-27 13:30 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:30 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> Unlike the generic entry, due to historical reasons, arm64 needs to
> save/restore a register during syscall entry/exit because it uses a scratch
> register (ip(r12) on AArch32, x7 on AArch64) to denote syscall entry/exit.
>
> In preparation for moving arm64 over to the generic entry code,
> add arch_ptrace_report_syscall_entry/exit() as the default
> ptrace_report_syscall_entry/exit() implementation. This allows
> arm64 to implement the architecture specific version.
>
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-26 7:14 ` [PATCH v8 11/12] arm64: entry: Convert to generic entry Jinjie Ruan
@ 2025-11-27 13:31 ` Kevin Brodsky
2025-11-28 3:34 ` Jinjie Ruan
0 siblings, 1 reply; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-27 13:31 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 26/11/2025 08:14, Jinjie Ruan wrote:
> Currently, x86, RISC-V and LoongArch use the generic entry, which makes
> maintainers' work easier and the code more elegant. arm64 has already
> switched to the generic IRQ entry, so completely convert arm64 to use
> the generic entry infrastructure from kernel/entry/*.
>
> The changes are below:
> - Remove TIF_SYSCALL_* flag, _TIF_WORK_MASK, _TIF_SYSCALL_WORK,
_TIF_WORK_MASK is now removed in patch 1.
> and remove has_syscall_work(), as _TIF_SYSCALL_WORK is equal to
> SYSCALL_WORK_ENTER.
>
> [...]
>
> +static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
> + int step)
> +{
> + unsigned long saved_reg;
> + int regno;
> +
> + saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
> + if (!step) {
A difference I noticed here is that the generic report_single_step()
always returns false if SYSCALL_EMU is set. I don't know if the
combination of SYSCALL_EMU and SINGLESTEP is meaningful, but if it is
then I think that's a behaviour change.
> + ptrace_report_syscall_exit(regs, 0);
> + regs->regs[regno] = saved_reg;
> + } else {
> + regs->regs[regno] = saved_reg;
> +
> + /*
> + * Signal a pseudo-step exception since we are stepping but
> + * tracer modifications to the registers may have rewound the
> + * state machine.
> + */
> + ptrace_report_syscall_exit(regs, 1);
> + }
> +}
> +
> +#define arch_ptrace_report_syscall_exit arch_ptrace_report_syscall_exit
> +
> #endif /* _ASM_ARM64_ENTRY_COMMON_H */
> diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
> index 6225981fbbdb..f705ba2bb6fd 100644
> --- a/arch/arm64/include/asm/syscall.h
> +++ b/arch/arm64/include/asm/syscall.h
> @@ -9,6 +9,9 @@
> #include <linux/compat.h>
> #include <linux/err.h>
>
> +#include <asm/compat.h>
> +#include <asm/vdso.h>
> +
> typedef long (*syscall_fn_t)(const struct pt_regs *regs);
>
> extern const syscall_fn_t sys_call_table[];
> @@ -114,12 +117,21 @@ static inline int syscall_get_arch(struct task_struct *task)
> return AUDIT_ARCH_AARCH64;
> }
>
> -static inline bool has_syscall_work(unsigned long flags)
> +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
> {
> - return unlikely(flags & _TIF_SYSCALL_WORK);
> + unsigned long sigtramp;
> +
> +#ifdef CONFIG_COMPAT
> + if (is_compat_task()) {
> + unsigned long vdso = (unsigned long)current->mm->context.sigpage;
Might as well call it sigpage (separate from the vDSO on arm32).
> +
> + return (regs->pc >= vdso && regs->pc < (vdso + PAGE_SIZE));
Nit: no need for parentheses around the expression to return.
> + }
> +#endif
> + sigtramp = (unsigned long)VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
> + return regs->pc == (sigtramp + 8);
> }
>
> -int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags);
> -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs);
> +#define ARCH_SYSCALL_WORK_EXIT (_TIF_SECCOMP | _TIF_SYSCALL_EMU)
>
> #endif /* __ASM_SYSCALL_H */
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index ff4998fa1844..d3142b5d1b9c 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -43,6 +43,7 @@ struct thread_info {
> void *scs_sp;
> #endif
> u32 cpu;
> + unsigned long syscall_work; /* SYSCALL_WORK_ flags */
> };
>
> #define thread_saved_pc(tsk) \
> @@ -65,11 +66,8 @@ void arch_setup_new_exec(void);
> #define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
> #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
> #define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
> -#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
> -#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
> -#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
> -#define TIF_SECCOMP 11 /* syscall secure computing */
> -#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
> +#define TIF_SECCOMP 11 /* syscall secure computing */
> +#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
These seem to have reappeared in v8 for some reason?
> #define TIF_PATCH_PENDING 13 /* pending live patching update */
> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */
> #define TIF_FREEZE 19
> @@ -92,24 +90,16 @@ void arch_setup_new_exec(void);
> #define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY)
> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> #define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
> -#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
> -#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
> -#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
> #define _TIF_SECCOMP (1 << TIF_SECCOMP)
> #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
Ditto.
> #define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
> #define _TIF_UPROBE (1 << TIF_UPROBE)
> -#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
> #define _TIF_32BIT (1 << TIF_32BIT)
> #define _TIF_SVE (1 << TIF_SVE)
> #define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
> #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
> #define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
>
> -#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
> - _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> - _TIF_SYSCALL_EMU)
> -
> #ifdef CONFIG_SHADOW_CALL_STACK
> #define INIT_SCS \
> .scs_base = init_shadow_call_stack, \
> diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
> index 29307642f4c9..e67643a70405 100644
> --- a/arch/arm64/kernel/debug-monitors.c
> +++ b/arch/arm64/kernel/debug-monitors.c
> @@ -385,11 +385,18 @@ void user_enable_single_step(struct task_struct *task)
>
> if (!test_and_set_ti_thread_flag(ti, TIF_SINGLESTEP))
> set_regs_spsr_ss(task_pt_regs(task));
> +
> + /*
> + * Ensure that a trap is triggered once stepping out of a system
> + * call prior to executing any user instruction.
> + */
> + set_task_syscall_work(task, SYSCALL_EXIT_TRAP);
> }
> NOKPROBE_SYMBOL(user_enable_single_step);
>
> void user_disable_single_step(struct task_struct *task)
> {
> clear_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP);
> + clear_task_syscall_work(task, SYSCALL_EXIT_TRAP);
> }
> NOKPROBE_SYMBOL(user_disable_single_step);
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index c2bd0130212d..9e3b39e207d1 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -42,9 +42,6 @@
> #include <asm/traps.h>
> #include <asm/system_misc.h>
>
> -#define CREATE_TRACE_POINTS
> -#include <trace/events/syscalls.h>
> -
> struct pt_regs_offset {
> const char *name;
> int offset;
> @@ -2312,141 +2309,6 @@ long arch_ptrace(struct task_struct *child, long request,
> return ptrace_request(child, request, addr, data);
> }
>
> -enum ptrace_syscall_dir {
> - PTRACE_SYSCALL_ENTER = 0,
> - PTRACE_SYSCALL_EXIT,
> -};
> -
> -static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
> - enum ptrace_syscall_dir dir,
> - int *regno)
> -{
> - unsigned long saved_reg;
> -
> - /*
> - * We have some ABI weirdness here in the way that we handle syscall
> - * exit stops because we indicate whether or not the stop has been
> - * signalled from syscall entry or syscall exit by clobbering a general
> - * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
> - * and restoring its old value after the stop. This means that:
> - *
> - * - Any writes by the tracer to this register during the stop are
> - * ignored/discarded.
> - *
> - * - The actual value of the register is not available during the stop,
> - * so the tracer cannot save it and restore it later.
> - *
> - * - Syscall stops behave differently to seccomp and pseudo-step traps
> - * (the latter do not nobble any registers).
> - */
> - *regno = (is_compat_task() ? 12 : 7);
> - saved_reg = regs->regs[*regno];
> - regs->regs[*regno] = dir;
> -
> - return saved_reg;
> -}
> -
> -static int report_syscall_entry(struct pt_regs *regs)
> -{
> - unsigned long saved_reg;
> - int regno, ret;
> -
> - saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
> - ret = ptrace_report_syscall_entry(regs);
> - if (ret)
> - forget_syscall(regs);
> - regs->regs[regno] = saved_reg;
> -
> - return ret;
> -}
> -
> -static void report_syscall_exit(struct pt_regs *regs)
> -{
> - unsigned long saved_reg;
> - int regno;
> -
> - saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
> - if (!test_thread_flag(TIF_SINGLESTEP)) {
> - ptrace_report_syscall_exit(regs, 0);
> - regs->regs[regno] = saved_reg;
> - } else {
> - regs->regs[regno] = saved_reg;
> -
> - /*
> - * Signal a pseudo-step exception since we are stepping but
> - * tracer modifications to the registers may have rewound the
> - * state machine.
> - */
> - ptrace_report_syscall_exit(regs, 1);
> - }
> -}
> -
> -static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
> -{
> - if (unlikely(audit_context())) {
> - unsigned long args[6];
> -
> - syscall_get_arguments(current, regs, args);
> - audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
> - }
> -}
> -
> -int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
> -{
> - int ret;
> -
> - if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
> - ret = report_syscall_entry(regs);
> - if (ret || (flags & _TIF_SYSCALL_EMU))
> - return NO_SYSCALL;
> - }
> -
> - /* Do the secure computing after ptrace; failures should be fast. */
> - if (flags & _TIF_SECCOMP) {
> - ret = __secure_computing();
> - if (ret == -1)
> - return NO_SYSCALL;
> - }
> -
> - /* Either of the above might have changed the syscall number */
> - syscall = syscall_get_nr(current, regs);
> -
> - if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) {
> - trace_sys_enter(regs, syscall);
> -
> - /*
> - * Probes or BPF hooks in the tracepoint may have changed the
> - * system call number as well.
> - */
> - syscall = syscall_get_nr(current, regs);
> - }
> -
> - syscall_enter_audit(regs, syscall);
> -
> - return ret ? : syscall;
> -}
> -
> -static void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
> -{
> - audit_syscall_exit(regs);
> -
> - if (flags & _TIF_SYSCALL_TRACEPOINT)
> - trace_sys_exit(regs, syscall_get_return_value(current, regs));
> -
> - if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
> - report_syscall_exit(regs);
> -}
> -
> -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
> -{
> - unsigned long flags = read_thread_flags();
> -
> - rseq_syscall(regs);
> -
> - if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
> - syscall_exit_work(regs, flags);
> -}
Aside from the small change in arch_ptrace_report_syscall_exit(), these
look exactly equivalent to the generic functions, so LGTM.
- Kevin
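
The scratch-register handling discussed in this patch (clobber x7 or
ip/r12 with the stop direction, then restore it) can be illustrated with
a user-space model. This is a sketch only: the demo_* names, the tracer
callback and the is_compat parameter are hypothetical stand-ins for the
kernel's pt_regs, the ptrace stop and is_compat_task().

```c
enum demo_dir { DEMO_SYSCALL_ENTER = 0, DEMO_SYSCALL_EXIT = 1 };

struct demo_regs {
	unsigned long regs[31];
};

/* A tracer callback: it may inspect and modify the tracee registers. */
typedef void (*demo_tracer_fn)(struct demo_regs *regs, int regno);

/* Models ptrace_save_reg(): stash the scratch register, store the direction. */
static unsigned long demo_save_reg(struct demo_regs *regs, enum demo_dir dir,
				   int is_compat, int *regno)
{
	unsigned long saved;

	*regno = is_compat ? 12 : 7;
	saved = regs->regs[*regno];
	regs->regs[*regno] = dir;

	return saved;
}

/* Models arch_ptrace_report_syscall_exit() from the quoted hunk. */
static void demo_report_syscall_exit(struct demo_regs *regs, int is_compat,
				     int step, demo_tracer_fn tracer)
{
	int regno;
	unsigned long saved;

	saved = demo_save_reg(regs, DEMO_SYSCALL_EXIT, is_compat, &regno);
	if (!step) {
		tracer(regs, regno);		/* tracer sees the direction */
		regs->regs[regno] = saved;	/* tracer writes are discarded */
	} else {
		regs->regs[regno] = saved;	/* pseudo-step: no clobbering */
		tracer(regs, regno);
	}
}

/* Example tracer: remembers what it saw and scribbles on the register. */
static unsigned long demo_seen;

static void demo_tracer_scribble(struct demo_regs *regs, int regno)
{
	demo_seen = regs->regs[regno];
	regs->regs[regno] = 0xdead;
}
```

In the non-step case the example tracer observes the direction marker and
its write is lost on restore. In the step case it observes the real
register value and its write persists.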
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-27 13:31 ` Kevin Brodsky
@ 2025-11-28 3:34 ` Jinjie Ruan
2025-11-28 13:32 ` Kevin Brodsky
0 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-28 3:34 UTC (permalink / raw)
To: Kevin Brodsky, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 2025/11/27 21:31, Kevin Brodsky wrote:
> On 26/11/2025 08:14, Jinjie Ruan wrote:
>> Currently, x86, RISC-V and LoongArch use the generic entry, which makes
>> maintainers' work easier and the code more elegant. arm64 has already
>> switched to the generic IRQ entry, so completely convert arm64 to use
>> the generic entry infrastructure from kernel/entry/*.
>>
>> The changes are below:
>> - Remove TIF_SYSCALL_* flag, _TIF_WORK_MASK, _TIF_SYSCALL_WORK,
>
> _TIF_WORK_MASK is now removed in patch 1.
>
>> and remove has_syscall_work(), as _TIF_SYSCALL_WORK is equal to
>> SYSCALL_WORK_ENTER.
>>
>> [...]
>>
>> +static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
>> + int step)
>> +{
>> + unsigned long saved_reg;
>> + int regno;
>> +
>> + saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
>> + if (!step) {
>
> A difference I noticed here is that the generic report_single_step()
> always returns false if SYSCALL_EMU is set. I don't know if the
> combination of SYSCALL_EMU and SINGLESTEP is meaningful, but if it is
> then I think that's a behaviour change.
Commit 64eb35f701f0 ("ptrace: Migrate TIF_SYSCALL_EMU to use
SYSCALL_WORK flag") replaced the original report_single_step() logic,
shown below, which returns false in these cases:
- Only _TIF_SYSCALL_EMU is set.
- Both _TIF_SINGLESTEP and _TIF_SYSCALL_EMU are set.
- Neither _TIF_SINGLESTEP nor _TIF_SYSCALL_EMU is set.
#define SYSEMU_STEP (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)
static inline bool report_single_step(unsigned long ti_work)
{
return (ti_work & SYSEMU_STEP) == _TIF_SINGLESTEP;
}
I think the "returns false if SYSCALL_EMU is set" behaviour is correct
according to the ptrace(2) man page: both PTRACE_SYSEMU and
PTRACE_SYSEMU_SINGLESTEP need to report the syscall only once, on
syscall entry.
"For PTRACE_SYSEMU, continue and stop on entry to the next
system call, which will not be executed. See the
documentation on syscall-stops below. For
PTRACE_SYSEMU_SINGLESTEP, do the same but also singlestep
if not a system call."
Link: https://man7.org/linux/man-pages/man2/ptrace.2.html
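
To make the cases listed above easy to check, here is a user-space copy
of the quoted expression. The DEMO_* bit values are hypothetical (the
real _TIF_* bits differ); only the mask-and-compare expression is taken
from the helper quoted above.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_SINGLESTEP  (1UL << 0)
#define DEMO_SYSCALL_EMU (1UL << 1)
#define DEMO_SYSEMU_STEP (DEMO_SINGLESTEP | DEMO_SYSCALL_EMU)

/* Same shape as the original report_single_step() quoted above. */
static inline bool demo_report_single_step(unsigned long ti_work)
{
	/* True only when SINGLESTEP is set and SYSCALL_EMU is clear. */
	return (ti_work & DEMO_SYSEMU_STEP) == DEMO_SINGLESTEP;
}
```

Of the four flag combinations, only "SINGLESTEP set, SYSCALL_EMU clear"
makes the helper return true, which matches the three false cases listed
above.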
>
>> + ptrace_report_syscall_exit(regs, 0);
>> + regs->regs[regno] = saved_reg;
>> + } else {
>> + regs->regs[regno] = saved_reg;
>> +
>> + /*
>> + * Signal a pseudo-step exception since we are stepping but
>> + * tracer modifications to the registers may have rewound the
>> + * state machine.
>> + */
>> + ptrace_report_syscall_exit(regs, 1);
>> + }
>> +}
>> +
>> +#define arch_ptrace_report_syscall_exit arch_ptrace_report_syscall_exit
>> +
>> #endif /* _ASM_ARM64_ENTRY_COMMON_H */
>> diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
>> index 6225981fbbdb..f705ba2bb6fd 100644
>> --- a/arch/arm64/include/asm/syscall.h
>> +++ b/arch/arm64/include/asm/syscall.h
>> @@ -9,6 +9,9 @@
>> #include <linux/compat.h>
>> #include <linux/err.h>
>>
>> +#include <asm/compat.h>
>> +#include <asm/vdso.h>
>> +
>> typedef long (*syscall_fn_t)(const struct pt_regs *regs);
>>
>> extern const syscall_fn_t sys_call_table[];
>> @@ -114,12 +117,21 @@ static inline int syscall_get_arch(struct task_struct *task)
>> return AUDIT_ARCH_AARCH64;
>> }
>>
>> -static inline bool has_syscall_work(unsigned long flags)
>> +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
>> {
>> - return unlikely(flags & _TIF_SYSCALL_WORK);
>> + unsigned long sigtramp;
>> +
>> +#ifdef CONFIG_COMPAT
>> + if (is_compat_task()) {
>> + unsigned long vdso = (unsigned long)current->mm->context.sigpage;
>
> Might as well call it sigpage (separate from the vDSO on arm32).
>
>> +
>> + return (regs->pc >= vdso && regs->pc < (vdso + PAGE_SIZE));
>
> Nit: no need for parentheses around the expression to return.
>
>> + }
>> +#endif
>> + sigtramp = (unsigned long)VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
>> + return regs->pc == (sigtramp + 8);
>> }
>>
>> -int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags);
>> -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs);
>> +#define ARCH_SYSCALL_WORK_EXIT (_TIF_SECCOMP | _TIF_SYSCALL_EMU)
>>
>> #endif /* __ASM_SYSCALL_H */
>> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
>> index ff4998fa1844..d3142b5d1b9c 100644
>> --- a/arch/arm64/include/asm/thread_info.h
>> +++ b/arch/arm64/include/asm/thread_info.h
>> @@ -43,6 +43,7 @@ struct thread_info {
>> void *scs_sp;
>> #endif
>> u32 cpu;
>> + unsigned long syscall_work; /* SYSCALL_WORK_ flags */
>> };
>>
>> #define thread_saved_pc(tsk) \
>> @@ -65,11 +66,8 @@ void arch_setup_new_exec(void);
>> #define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
>> #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
>> #define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
>> -#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
>> -#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
>> -#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
>> -#define TIF_SECCOMP 11 /* syscall secure computing */
>> -#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>> +#define TIF_SECCOMP 11 /* syscall secure computing */
>> +#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>
> These seem to have reappeared in v8 for some reason?
v8 adds ARCH_SYSCALL_WORK_EXIT, defined as "SECCOMP | SYSCALL_EMU",
to keep the arm64 behaviour unchanged, as mentioned in v7.
>
>> #define TIF_PATCH_PENDING 13 /* pending live patching update */
>> #define TIF_MEMDIE 18 /* is terminating due to OOM killer */
>> #define TIF_FREEZE 19
>> @@ -92,24 +90,16 @@ void arch_setup_new_exec(void);
>> #define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY)
>> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
>> #define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
>> -#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
>> -#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
>> -#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
>> #define _TIF_SECCOMP (1 << TIF_SECCOMP)
>> #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
>
> Ditto.
>
>> #define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
>> #define _TIF_UPROBE (1 << TIF_UPROBE)
>> -#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
>> #define _TIF_32BIT (1 << TIF_32BIT)
>> #define _TIF_SVE (1 << TIF_SVE)
>> #define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
>> #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
>> #define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
>>
>> -#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>> - _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
>> - _TIF_SYSCALL_EMU)
>> -
>> #ifdef CONFIG_SHADOW_CALL_STACK
>> #define INIT_SCS \
>> .scs_base = init_shadow_call_stack, \
>> diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
>> index 29307642f4c9..e67643a70405 100644
>> --- a/arch/arm64/kernel/debug-monitors.c
>> +++ b/arch/arm64/kernel/debug-monitors.c
>> @@ -385,11 +385,18 @@ void user_enable_single_step(struct task_struct *task)
>>
>> if (!test_and_set_ti_thread_flag(ti, TIF_SINGLESTEP))
>> set_regs_spsr_ss(task_pt_regs(task));
>> +
>> + /*
>> + * Ensure that a trap is triggered once stepping out of a system
>> + * call prior to executing any user instruction.
>> + */
>> + set_task_syscall_work(task, SYSCALL_EXIT_TRAP);
>> }
>> NOKPROBE_SYMBOL(user_enable_single_step);
>>
>> void user_disable_single_step(struct task_struct *task)
>> {
>> clear_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP);
>> + clear_task_syscall_work(task, SYSCALL_EXIT_TRAP);
>> }
>> NOKPROBE_SYMBOL(user_disable_single_step);
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index c2bd0130212d..9e3b39e207d1 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>> @@ -42,9 +42,6 @@
>> #include <asm/traps.h>
>> #include <asm/system_misc.h>
>>
>> -#define CREATE_TRACE_POINTS
>> -#include <trace/events/syscalls.h>
>> -
>> struct pt_regs_offset {
>> const char *name;
>> int offset;
>> @@ -2312,141 +2309,6 @@ long arch_ptrace(struct task_struct *child, long request,
>> return ptrace_request(child, request, addr, data);
>> }
>>
>> -enum ptrace_syscall_dir {
>> - PTRACE_SYSCALL_ENTER = 0,
>> - PTRACE_SYSCALL_EXIT,
>> -};
>> -
>> -static inline unsigned long ptrace_save_reg(struct pt_regs *regs,
>> - enum ptrace_syscall_dir dir,
>> - int *regno)
>> -{
>> - unsigned long saved_reg;
>> -
>> - /*
>> - * We have some ABI weirdness here in the way that we handle syscall
>> - * exit stops because we indicate whether or not the stop has been
>> - * signalled from syscall entry or syscall exit by clobbering a general
>> - * purpose register (ip/r12 for AArch32, x7 for AArch64) in the tracee
>> - * and restoring its old value after the stop. This means that:
>> - *
>> - * - Any writes by the tracer to this register during the stop are
>> - * ignored/discarded.
>> - *
>> - * - The actual value of the register is not available during the stop,
>> - * so the tracer cannot save it and restore it later.
>> - *
>> - * - Syscall stops behave differently to seccomp and pseudo-step traps
>> - * (the latter do not nobble any registers).
>> - */
>> - *regno = (is_compat_task() ? 12 : 7);
>> - saved_reg = regs->regs[*regno];
>> - regs->regs[*regno] = dir;
>> -
>> - return saved_reg;
>> -}
>> -
>> -static int report_syscall_entry(struct pt_regs *regs)
>> -{
>> - unsigned long saved_reg;
>> - int regno, ret;
>> -
>> - saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_ENTER, &regno);
>> - ret = ptrace_report_syscall_entry(regs);
>> - if (ret)
>> - forget_syscall(regs);
>> - regs->regs[regno] = saved_reg;
>> -
>> - return ret;
>> -}
>> -
>> -static void report_syscall_exit(struct pt_regs *regs)
>> -{
>> - unsigned long saved_reg;
>> - int regno;
>> -
>> - saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
>> - if (!test_thread_flag(TIF_SINGLESTEP)) {
>> - ptrace_report_syscall_exit(regs, 0);
>> - regs->regs[regno] = saved_reg;
>> - } else {
>> - regs->regs[regno] = saved_reg;
>> -
>> - /*
>> - * Signal a pseudo-step exception since we are stepping but
>> - * tracer modifications to the registers may have rewound the
>> - * state machine.
>> - */
>> - ptrace_report_syscall_exit(regs, 1);
>> - }
>> -}
>> -
>> -static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
>> -{
>> - if (unlikely(audit_context())) {
>> - unsigned long args[6];
>> -
>> - syscall_get_arguments(current, regs, args);
>> - audit_syscall_entry(syscall, args[0], args[1], args[2], args[3]);
>> - }
>> -}
>> -
>> -int syscall_trace_enter(struct pt_regs *regs, long syscall, unsigned long flags)
>> -{
>> - int ret;
>> -
>> - if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
>> - ret = report_syscall_entry(regs);
>> - if (ret || (flags & _TIF_SYSCALL_EMU))
>> - return NO_SYSCALL;
>> - }
>> -
>> - /* Do the secure computing after ptrace; failures should be fast. */
>> - if (flags & _TIF_SECCOMP) {
>> - ret = __secure_computing();
>> - if (ret == -1)
>> - return NO_SYSCALL;
>> - }
>> -
>> - /* Either of the above might have changed the syscall number */
>> - syscall = syscall_get_nr(current, regs);
>> -
>> - if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) {
>> - trace_sys_enter(regs, syscall);
>> -
>> - /*
>> - * Probes or BPF hooks in the tracepoint may have changed the
>> - * system call number as well.
>> - */
>> - syscall = syscall_get_nr(current, regs);
>> - }
>> -
>> - syscall_enter_audit(regs, syscall);
>> -
>> - return ret ? : syscall;
>> -}
>> -
>> -static void syscall_exit_work(struct pt_regs *regs, unsigned long flags)
>> -{
>> - audit_syscall_exit(regs);
>> -
>> - if (flags & _TIF_SYSCALL_TRACEPOINT)
>> - trace_sys_exit(regs, syscall_get_return_value(current, regs));
>> -
>> - if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
>> - report_syscall_exit(regs);
>> -}
>> -
>> -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
>> -{
>> - unsigned long flags = read_thread_flags();
>> -
>> - rseq_syscall(regs);
>> -
>> - if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
>> - syscall_exit_work(regs, flags);
>> -}
>
> Aside from the small change in arch_ptrace_report_syscall_exit(), these
> look exactly equivalent to the generic functions, so LGTM.
>
> - Kevin
>
* Re: [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-28 3:34 ` Jinjie Ruan
@ 2025-11-28 13:32 ` Kevin Brodsky
2025-11-29 1:23 ` Jinjie Ruan
0 siblings, 1 reply; 27+ messages in thread
From: Kevin Brodsky @ 2025-11-28 13:32 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 28/11/2025 04:34, Jinjie Ruan wrote:
>
> On 2025/11/27 21:31, Kevin Brodsky wrote:
>> On 26/11/2025 08:14, Jinjie Ruan wrote:
>>> Currently, x86, Riscv, Loongarch use the generic entry which makes
>>> maintainers' work easier and codes more elegant. arm64 has already
>>> switched to the generic IRQ entry, so completely convert arm64 to use
>>> the generic entry infrastructure from kernel/entry/*.
>>>
>>> The changes are below:
>>> - Remove TIF_SYSCALL_* flag, _TIF_WORK_MASK, _TIF_SYSCALL_WORK,
>> _TIF_WORK_MASK is now removed in patch 1.
>>
>>> and remove has_syscall_work(), as _TIF_SYSCALL_WORK is equal with
>>> SYSCALL_WORK_ENTER.
>>>
>>> [...]
>>>
>>> +static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
>>> + int step)
>>> +{
>>> + unsigned long saved_reg;
>>> + int regno;
>>> +
>>> + saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
>>> + if (!step) {
>> A difference I noticed here is that the generic report_single_step()
>> always returns false if SYSCALL_EMU is set. I don't know if the
>> combination of SYSCALL_EMU and SINGLESTEP is meaningful, but if it is
>> then I think that's a behaviour change.
> commit 64eb35f701f0 ("ptrace: Migrate TIF_SYSCALL_EMU to use
> SYSCALL_WORK flag") has changed the following code:
>
> Therefore, the original logic returns false in these cases for
> report_single_step() :
>
> - Only _TIF_SYSCALL_EMU is set.
>
> - Both _TIF_SINGLESTEP and _TIF_SYSCALL_EMU are set.
>
> - Neither _TIF_SINGLESTEP nor _TIF_SYSCALL_EMU is set.
>
>
> #define SYSEMU_STEP (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)
>
> static inline bool report_single_step(unsigned long ti_work)
> {
> return (ti_work & SYSEMU_STEP) == _TIF_SINGLESTEP;
> }
The code did look different before this commit, but AFAICT it was
functionally equivalent w.r.t. SYSEMU / SINGLESTEP.
> I think the "returns false if SYSCALL_EMU is set" behaviour is correct
> according to the man page: both PTRACE_SYSEMU and
> PTRACE_SYSEMU_SINGLESTEP need to report the syscall only once, on syscall
> entry.
>
> “For PTRACE_SYSEMU, continue and stop on entry to the next
> system call, which will not be executed. See the
> documentation on syscall-stops below. For
> PTRACE_SYSEMU_SINGLESTEP, do the same but also singlestep
> if not a system call.”
That seems sensible (based on my very limited understanding of SYSEMU),
nevertheless it is not what arm64 currently does AFAIU. To follow the
same logic as the rest, this change should be made in a separate patch.
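To make the difference concrete, here is a minimal userspace sketch of the two
step-reporting rules as I understand them from this thread. The bit values and
the names W_EMU, W_STEP, generic_step(), old_arm64_step() and check_step_models()
are made up for illustration; they are not kernel definitions. The "old arm64"
model is only my reading of the removed report_syscall_exit(), which tests
TIF_SINGLESTEP alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only (not the kernel's). */
#define W_EMU  (1UL << 0)	/* stands in for SYSCALL_EMU */
#define W_STEP (1UL << 1)	/* stands in for SINGLESTEP / SYSCALL_EXIT_TRAP */

/* Generic entry rule: never report a step on exit when SYSCALL_EMU is set. */
static bool generic_step(unsigned long work)
{
	if (work & W_EMU)
		return false;
	return (work & W_STEP) != 0;
}

/* Removed arm64 rule (as read from the diff): only SINGLESTEP is checked. */
static bool old_arm64_step(unsigned long work)
{
	return (work & W_STEP) != 0;
}

/* The two rules only disagree when both flags are set. */
static void check_step_models(void)
{
	assert(generic_step(0) == old_arm64_step(0));
	assert(generic_step(W_EMU) == old_arm64_step(W_EMU));
	assert(generic_step(W_STEP) == old_arm64_step(W_STEP));
	assert(generic_step(W_EMU | W_STEP) != old_arm64_step(W_EMU | W_STEP));
}
```

Under these assumptions, the combination of SYSCALL_EMU and SINGLESTEP is the
only case where the behaviour changes, which is why it seems worth a patch of
its own.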
> Link: https://man7.org/linux/man-pages/man2/ptrace.2.html
>
>>> [...]
>>>
>>> #define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
>>> #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
>>> #define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
>>> -#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
>>> -#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
>>> -#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
>>> -#define TIF_SECCOMP 11 /* syscall secure computing */
>>> -#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>>> +#define TIF_SECCOMP 11 /* syscall secure computing */
>>> +#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>> These seem to have reappeared in v8 for some reason?
> v8 adds "ARCH_SYSCALL_WORK_EXIT", defined as "SECCOMP | SYSCALL_EMU",
> to keep the arm64 behaviour unchanged, as mentioned in v7.
Ah then that is where the issue is, I missed that: surely switching to
generic entry means that we are using SYSCALL_WORK_BIT_* rather than
TIF_* for all these flags?
- Kevin
* Re: [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-28 13:32 ` Kevin Brodsky
@ 2025-11-29 1:23 ` Jinjie Ruan
2025-12-01 10:17 ` Kevin Brodsky
0 siblings, 1 reply; 27+ messages in thread
From: Jinjie Ruan @ 2025-11-29 1:23 UTC (permalink / raw)
To: Kevin Brodsky, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 2025/11/28 21:32, Kevin Brodsky wrote:
> On 28/11/2025 04:34, Jinjie Ruan wrote:
>>
>> On 2025/11/27 21:31, Kevin Brodsky wrote:
>>> On 26/11/2025 08:14, Jinjie Ruan wrote:
>>>> Currently, x86, Riscv, Loongarch use the generic entry which makes
>>>> maintainers' work easier and codes more elegant. arm64 has already
>>>> switched to the generic IRQ entry, so completely convert arm64 to use
>>>> the generic entry infrastructure from kernel/entry/*.
>>>>
>>>> The changes are below:
>>>> - Remove TIF_SYSCALL_* flag, _TIF_WORK_MASK, _TIF_SYSCALL_WORK,
>>> _TIF_WORK_MASK is now removed in patch 1.
>>>
>>>> and remove has_syscall_work(), as _TIF_SYSCALL_WORK is equal with
>>>> SYSCALL_WORK_ENTER.
>>>>
>>>> [...]
>>>>
>>>> +static __always_inline void arch_ptrace_report_syscall_exit(struct pt_regs *regs,
>>>> + int step)
>>>> +{
>>>> + unsigned long saved_reg;
>>>> + int regno;
>>>> +
>>>> + saved_reg = ptrace_save_reg(regs, PTRACE_SYSCALL_EXIT, &regno);
>>>> + if (!step) {
>>> A difference I noticed here is that the generic report_single_step()
>>> always returns false if SYSCALL_EMU is set. I don't know if the
>>> combination of SYSCALL_EMU and SINGLESTEP is meaningful, but if it is
>>> then I think that's a behaviour change.
>> commit 64eb35f701f0 ("ptrace: Migrate TIF_SYSCALL_EMU to use
>> SYSCALL_WORK flag") has changed the following code:
>>
>> Therefore, the original logic returns false in these cases for
>> report_single_step() :
>>
>> - Only _TIF_SYSCALL_EMU is set.
>>
>> - Both _TIF_SINGLESTEP and _TIF_SYSCALL_EMU are set.
>>
>> - Neither _TIF_SINGLESTEP nor _TIF_SYSCALL_EMU is set.
>>
>>
>> #define SYSEMU_STEP (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)
>>
>> static inline bool report_single_step(unsigned long ti_work)
>> {
>> return (ti_work & SYSEMU_STEP) == _TIF_SINGLESTEP;
>> }
>
> The code did look different before this commit, but AFAICT it was
> functionally equivalent w.r.t. SYSEMU / SINGLESTEP.
>
>> I think the "returns false if SYSCALL_EMU is set" behaviour is correct
>> according to the man page: both PTRACE_SYSEMU and
>> PTRACE_SYSEMU_SINGLESTEP need to report the syscall only once, on syscall
>> entry.
>>
>> “For PTRACE_SYSEMU, continue and stop on entry to the next
>> system call, which will not be executed. See the
>> documentation on syscall-stops below. For
>> PTRACE_SYSEMU_SINGLESTEP, do the same but also singlestep
>> if not a system call.”
>
> That seems sensible (based on my very limited understanding of SYSEMU),
> nevertheless it is not what arm64 currently does AFAIU. To follow the
> same logic as the rest, this change should be made in a separate patch.
Right, and the man page description seems to match the comments of the
report_single_step() function.
"
/*
 * If SYSCALL_EMU is set, then the only reason to report is when
 * SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall
 * instruction has been already reported in
 * syscall_enter_from_user_mode().
 */
"
>
>> Link: https://man7.org/linux/man-pages/man2/ptrace.2.html
>>
>>>> [...]
>>>>
>>>> #define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
>>>> #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
>>>> #define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
>>>> -#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
>>>> -#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
>>>> -#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
>>>> -#define TIF_SECCOMP 11 /* syscall secure computing */
>>>> -#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>>>> +#define TIF_SECCOMP 11 /* syscall secure computing */
>>>> +#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>>> These seem to have reappeared in v8 for some reason?
>> v8 adds "ARCH_SYSCALL_WORK_EXIT", defined as "SECCOMP | SYSCALL_EMU",
>> to keep the arm64 behaviour unchanged, as mentioned in v7.
>
> Ah then that is where the issue is, I missed that: surely switching to
> generic entry means that we are using SYSCALL_WORK_BIT_* rather than
> TIF_* for all these flags?
I think they may be the same thing you mentioned in v7: neither
SYSCALL_WORK_EXIT nor report_single_step() excluded SYSCALL_EMU. Maybe
we should clarify them for arm64 together in a separate patch.
1. "The generic report_single_step() always returns false if SYSCALL_EMU
is set."
2. "
> -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
> -{
> - unsigned long flags = read_thread_flags();
> -
> - rseq_syscall(regs);
> -
> - if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
I believe switching to the generic function introduces a change
here: syscall_exit_work() is only called if a flag in
SYSCALL_WORK_EXIT is set, and this set does not include SYSCALL_EMU and
SECCOMP. Practically this means that audit_syscall_exit() will no
longer be called if only SECCOMP and/or SYSCALL_EMU is set.
It doesn't feel like a major behaviour change, but it should be
pointed out."
>
> - Kevin
>
* Re: [PATCH v8 11/12] arm64: entry: Convert to generic entry
2025-11-29 1:23 ` Jinjie Ruan
@ 2025-12-01 10:17 ` Kevin Brodsky
0 siblings, 0 replies; 27+ messages in thread
From: Kevin Brodsky @ 2025-12-01 10:17 UTC (permalink / raw)
To: Jinjie Ruan, catalin.marinas, will, oleg, tglx, peterz, luto,
shuah, kees, wad, charlie, akpm, ldv, macro, deller, mark.rutland,
efault, song, mbenes, ryan.roberts, ada.coupriediaz,
anshuman.khandual, broonie, pengcan, dvyukov, kmal,
linux-arm-kernel, linux-kernel, linux-kselftest
On 29/11/2025 02:23, Jinjie Ruan wrote:
>>>>> #define TIF_UPROBE 5 /* uprobe breakpoint or singlestep */
>>>>> #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
>>>>> #define TIF_NOTIFY_SIGNAL 7 /* signal notifications exist */
>>>>> -#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
>>>>> -#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
>>>>> -#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
>>>>> -#define TIF_SECCOMP 11 /* syscall secure computing */
>>>>> -#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>>>>> +#define TIF_SECCOMP 11 /* syscall secure computing */
>>>>> +#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
>>>> These seem to have reappeared in v8 for some reason?
>>> v8 adds "ARCH_SYSCALL_WORK_EXIT", defined as "SECCOMP | SYSCALL_EMU",
>>> to keep the arm64 behaviour unchanged, as mentioned in v7.
>> Ah then that is where the issue is, I missed that: surely switching to
>> generic entry means that we are using SYSCALL_WORK_BIT_* rather than
>> TIF_* for all these flags?
> I think they may be the same thing you mentioned in v7: neither
> SYSCALL_WORK_EXIT nor report_single_step() excluded SYSCALL_EMU. Maybe
> we should clarify them for arm64 together in a separate patch.
These two might indeed be related. On second thoughts, while waiting for
more knowledgeable arm64 reviewers, I would suggest aligning arm64 with
the generic entry. Which means...
> 1. "The generic report_single_step() always returns false if SYSCALL_EMU
> is set."
... replicating this behaviour on arm64 (in a separate patch), and...
> 2. "
> > -void syscall_exit_to_user_mode_prepare(struct pt_regs *regs)
> > -{
> > - unsigned long flags = read_thread_flags();
> > -
> > - rseq_syscall(regs);
> > -
> > - if (has_syscall_work(flags) || flags & _TIF_SINGLESTEP)
>
> I believe switching to the generic function introduces a change
> here: syscall_exit_work() is only called if a flag in
> SYSCALL_WORK_EXIT is set, and this set does not include SYSCALL_EMU and
> SECCOMP. Practically this means that audit_syscall_exit() will no
> longer be called if only SECCOMP and/or SYSCALL_EMU is set.
>
> It doesn't feel like a major behaviour change, but it should be
> pointed out."
... replicating this on arm64 as well, i.e. introducing a separate set
of flags for syscall exit. This should be a patch of its own, as it
isn't directly related to the report_single_step() behaviour (especially
since it concerns SECCOMP as well). It would also be an occasion to get
rid of has_syscall_work(), in preparation for the move to generic entry.
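
As a rough illustration of what a separate exit mask changes, here is a small
userspace sketch. All names and bit values (F_*, OLD_SYSCALL_WORK, EXIT_WORK and
the helper functions) are hypothetical and chosen only for this example. The
old mask mirrors the removed _TIF_SYSCALL_WORK definition quoted in the diff;
the exit mask is a simplified stand-in that, per the discussion above, leaves
out SECCOMP and SYSCALL_EMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only (not the kernel's). */
enum {
	F_TRACE      = 1 << 0,
	F_AUDIT      = 1 << 1,
	F_TRACEPOINT = 1 << 2,
	F_SECCOMP    = 1 << 3,
	F_EMU        = 1 << 4,
	F_STEP       = 1 << 5,	/* single-step / exit trap */
};

/* Mirrors the removed arm64 _TIF_SYSCALL_WORK: one mask for entry and exit. */
#define OLD_SYSCALL_WORK (F_TRACE | F_AUDIT | F_TRACEPOINT | F_SECCOMP | F_EMU)

/* Simplified exit-only mask: no SECCOMP, no SYSCALL_EMU. */
#define EXIT_WORK (F_TRACE | F_AUDIT | F_TRACEPOINT | F_STEP)

/* Old arm64 gate: has_syscall_work(flags) || flags & _TIF_SINGLESTEP. */
static bool old_arm64_runs_exit_work(unsigned int flags)
{
	return (flags & OLD_SYSCALL_WORK) || (flags & F_STEP);
}

/* Gate with a dedicated exit mask. */
static bool exit_mask_runs_exit_work(unsigned int flags)
{
	return (flags & EXIT_WORK) != 0;
}

/* The gates only differ when nothing but SECCOMP and/or SYSCALL_EMU is set. */
static void check_exit_gating(void)
{
	assert(old_arm64_runs_exit_work(F_SECCOMP));
	assert(!exit_mask_runs_exit_work(F_SECCOMP));
	assert(old_arm64_runs_exit_work(F_EMU));
	assert(!exit_mask_runs_exit_work(F_EMU));
	assert(old_arm64_runs_exit_work(F_AUDIT | F_SECCOMP));
	assert(exit_mask_runs_exit_work(F_AUDIT | F_SECCOMP));
}
```

In this model, the exit work (and so audit_syscall_exit()) is skipped only for
tasks that have SECCOMP and/or SYSCALL_EMU set and nothing else.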
- Kevin
Thread overview: 27+ messages
2025-11-26 7:14 [PATCH v8 00/12] arm64: entry: Convert to Generic Entry Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 01/12] arm64: Remove unused _TIF_WORK_MASK Jinjie Ruan
2025-11-27 13:27 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 02/12] arm64/ptrace: Split report_syscall() Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 03/12] arm64/ptrace: Refactor syscall_trace_enter/exit() Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 04/12] arm64: ptrace: Move rseq_syscall() before audit_syscall_exit() Jinjie Ruan
2025-11-27 13:28 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 05/12] arm64: syscall: Rework el0_svc_common() Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 06/12] arm64/ptrace: Return early for ptrace_report_syscall_entry() error Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 07/12] arm64/ptrace: Expand secure_computing() in place Jinjie Ruan
2025-11-27 13:29 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 08/12] arm64/ptrace: Use syscall_get_arguments() heleper Jinjie Ruan
2025-11-27 13:30 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 09/12] entry: Split syscall_exit_to_user_mode_work() for arch reuse Jinjie Ruan
2025-11-26 7:14 ` [PATCH v8 10/12] entry: Add arch_ptrace_report_syscall_entry/exit() Jinjie Ruan
2025-11-27 13:30 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 11/12] arm64: entry: Convert to generic entry Jinjie Ruan
2025-11-27 13:31 ` Kevin Brodsky
2025-11-28 3:34 ` Jinjie Ruan
2025-11-28 13:32 ` Kevin Brodsky
2025-11-29 1:23 ` Jinjie Ruan
2025-12-01 10:17 ` Kevin Brodsky
2025-11-26 7:14 ` [PATCH v8 12/12] selftests: sud_test: Support aarch64 Jinjie Ruan