* [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification
@ 2026-07-01 15:05 Renzo Davoli
2026-07-01 15:05 ` [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
PTRACE_SET_SYSCALL_INFO is a generic ptrace API that complements
PTRACE_GET_SYSCALL_INFO by allowing a tracer to modify details of a
system call in which the tracee is currently blocked.
The API is designed to let tracers inspect and modify system call
information in a simple, architecture-agnostic manner.
The current implementation only supports modifying the subset of
system call information needed by strace: the system call number,
arguments, and return value.
This patch set extends PTRACE_SET_SYSCALL_INFO with support for:
  - skipping a system call triggered via seccomp
  - modifying the tracee's instruction pointer
1. Seccomp system call skip
When a seccomp filter returns SECCOMP_RET_TRACE, the tracer receives,
via PTRACE_GET_SYSCALL_INFO, a struct ptrace_syscall_info with
op == PTRACE_SYSCALL_INFO_SECCOMP.
The tracer can skip the system call by setting the system call number
to -1. However, the current PTRACE_SET_SYSCALL_INFO interface does not
provide a way to specify the return value or error code that should be
reported to the tracee after skipping the call.
Patch 1/5 introduces a new op value,
PTRACE_SYSCALL_INFO_SECCOMP_SKIP, for use with
PTRACE_SET_SYSCALL_INFO.
When the tracer retrieves a ptrace_syscall_info structure with
op == PTRACE_SYSCALL_INFO_SECCOMP, it may choose to skip the system
call by changing op to PTRACE_SYSCALL_INFO_SECCOMP_SKIP and
populating the exit union fields (rval and is_error) to define
the return value and error status for the tracee.
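As a rough illustration of the tracer side of this flow, the helper below
turns a record obtained from PTRACE_GET_SYSCALL_INFO into a skip request.
It is only a sketch: prepare_seccomp_skip() is a made-up name, the fallback
#define mirrors the value proposed in patch 1/5 (it is not in released
headers), and the PTRACE_SET_SYSCALL_INFO call itself is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/ptrace.h>

/* Assumption: not in released uapi headers; value taken from patch 1/5. */
#ifndef PTRACE_SYSCALL_INFO_SECCOMP_SKIP
#define PTRACE_SYSCALL_INFO_SECCOMP_SKIP 4
#endif

/*
 * Hypothetical helper: convert a seccomp-stop record into a skip request.
 * rval is what the tracee should see as the result; when is_error is set,
 * rval is expected to be a negated errno value such as -EPERM.
 * Returns 0 on success, -EINVAL if the record is not a seccomp stop.
 */
static int prepare_seccomp_skip(struct ptrace_syscall_info *info,
				long long rval, int is_error)
{
	assert(info != NULL);

	if (info->op != PTRACE_SYSCALL_INFO_SECCOMP)
		return -EINVAL;

	info->op = PTRACE_SYSCALL_INFO_SECCOMP_SKIP;
	/* exit shares a union with seccomp, so nr and args are overwritten. */
	info->exit.rval = rval;
	info->exit.is_error = is_error ? 1 : 0;
	return 0;
}
```

The caller would then hand the modified record to PTRACE_SET_SYSCALL_INFO
and resume the tracee, much as the selftest in patch 2/5 does.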
2. Setting the instruction pointer
Patch 4/5 adds support for modifying the tracee's instruction pointer.
To do this, the tracer stores the new instruction pointer value in the
instruction_pointer field of the ptrace_syscall_info structure and
sets the PTRACE_SYSCALL_INFO_FLAG_SET_IP flag in the flags field.
This flag is introduced to avoid breaking existing code that uses
PTRACE_SET_SYSCALL_INFO and currently ignores the
instruction_pointer field.
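The compatibility argument can be modelled in user space roughly as
follows. This is a sketch of the checks added by patch 4/5, not kernel
code: the SKETCH_* names and model_apply_flags() are stand-ins invented
here, and *ip plays the role of the tracee's saved instruction pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: local stand-ins for the flag values proposed in patch 4/5. */
#define SKETCH_FLAG_SET_IP	(1u << 0)
#define SKETCH_FLAG_ALL		(SKETCH_FLAG_SET_IP)

/*
 * Model of the flag handling: unknown flag bits are rejected, and the
 * instruction pointer is only touched when SET_IP is requested, so a
 * caller that passes flags == 0 sees no change in behaviour.
 */
static int model_apply_flags(uint16_t flags, uint64_t new_ip,
			     unsigned long *ip)
{
	assert(ip != NULL);

	if (flags & ~SKETCH_FLAG_ALL)
		return -EINVAL;

	if (flags & SKETCH_FLAG_SET_IP) {
		unsigned long narrowed = (unsigned long)new_ip;

		/* The 64-bit field must fit in the native word size. */
		if (narrowed != new_ip)
			return -ERANGE;
		*ip = narrowed;
	}
	return 0;
}
```

With flags == 0 the instruction_pointer field is ignored, which is the
behaviour existing PTRACE_SET_SYSCALL_INFO users rely on.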
Renzo Davoli (5):
ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_SECCOMP_SKIP
asm/ptrace.h: add instruction_pointer_set
ptrace: add PTRACE_SYSCALL_INFO_FLAG_SET_IP
selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_FLAG_SET_IP
arch/alpha/include/asm/ptrace.h | 6 +
arch/hexagon/include/asm/ptrace.h | 6 +
arch/m68k/include/asm/ptrace.h | 6 +
arch/microblaze/include/asm/ptrace.h | 6 +
arch/nios2/include/asm/ptrace.h | 6 +
arch/um/include/asm/ptrace-generic.h | 6 +
arch/xtensa/include/asm/ptrace.h | 6 +
include/uapi/linux/ptrace.h | 5 +
kernel/ptrace.c | 39 ++-
.../selftests/ptrace/set_syscall_info.c | 321 +++++++++++++++++-
10 files changed, 401 insertions(+), 6 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
@ 2026-07-01 15:05 ` Renzo Davoli
2026-07-02 8:43 ` Oleg Nesterov
2026-07-01 15:05 ` [PATCH 2/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
This patch extends PTRACE_SET_SYSCALL_INFO with support for skipping a system
call triggered via seccomp.
When the tracer retrieves a ptrace_syscall_info structure with
op == PTRACE_SYSCALL_INFO_SECCOMP, it may choose to skip the system
call by changing op to PTRACE_SYSCALL_INFO_SECCOMP_SKIP and
populating the exit union fields (rval and is_error) to define
the return value and error status for the tracee.
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
---
include/uapi/linux/ptrace.h | 1 +
kernel/ptrace.c | 28 +++++++++++++++++++++++++---
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 5f8ef6156752..22489597a325 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -79,6 +79,7 @@ struct seccomp_metadata {
#define PTRACE_SYSCALL_INFO_ENTRY 1
#define PTRACE_SYSCALL_INFO_EXIT 2
#define PTRACE_SYSCALL_INFO_SECCOMP 3
+#define PTRACE_SYSCALL_INFO_SECCOMP_SKIP 4
struct ptrace_syscall_info {
__u8 op; /* PTRACE_SYSCALL_INFO_* */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d041645d9d17..ff763c87e4f7 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1119,12 +1119,22 @@ ptrace_set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
return 0;
}
+static int
+ptrace_set_syscall_info_seccomp_skip(struct task_struct *child,
+ struct pt_regs *regs,
+ struct ptrace_syscall_info *info)
+{
+ syscall_set_nr(child, regs, -1);
+ return ptrace_set_syscall_info_exit(child, regs, info);
+}
+
static int
ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
const void __user *datavp)
{
struct pt_regs *regs = task_pt_regs(child);
struct ptrace_syscall_info info;
+ int child_op;
if (user_size < sizeof(info))
return -EINVAL;
@@ -1141,9 +1151,19 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
if (info.flags || info.reserved)
return -EINVAL;
- /* Changing the type of the system call stop is not supported yet. */
- if (ptrace_get_syscall_info_op(child) != info.op)
- return -EINVAL;
+ /*
+ * Changing the type of the system call stop is
+ * not allowed, with the following exception:
+ * PTRACE_SYSCALL_INFO_SECCOMP can be changed to
+ * PTRACE_SYSCALL_INFO_SECCOMP_SKIP.
+ */
+
+ child_op = ptrace_get_syscall_info_op(child);
+ if (child_op != info.op) {
+ if ((info.op != PTRACE_SYSCALL_INFO_SECCOMP_SKIP) ||
+ (child_op != PTRACE_SYSCALL_INFO_SECCOMP))
+ return -EINVAL;
+ }
switch (info.op) {
case PTRACE_SYSCALL_INFO_ENTRY:
@@ -1152,6 +1172,8 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
return ptrace_set_syscall_info_exit(child, regs, &info);
case PTRACE_SYSCALL_INFO_SECCOMP:
return ptrace_set_syscall_info_seccomp(child, regs, &info);
+ case PTRACE_SYSCALL_INFO_SECCOMP_SKIP:
+ return ptrace_set_syscall_info_seccomp_skip(child, regs, &info);
default:
/* Other types of system call stops are not supported yet. */
return -EINVAL;
--
2.53.0
* [PATCH 2/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
2026-07-01 15:05 ` [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
@ 2026-07-01 15:05 ` Renzo Davoli
2026-07-01 15:05 ` [PATCH 3/5] asm/ptrace.h: add instruction_pointer_set Renzo Davoli
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
Check whether PTRACE_SYSCALL_INFO_SECCOMP_SKIP semantics implemented in the
kernel matches userspace expectations.
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
---
.../selftests/ptrace/set_syscall_info.c | 174 +++++++++++++++++-
1 file changed, 173 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ptrace/set_syscall_info.c b/tools/testing/selftests/ptrace/set_syscall_info.c
index 1cc411a41cd6..7f397469fd00 100644
--- a/tools/testing/selftests/ptrace/set_syscall_info.c
+++ b/tools/testing/selftests/ptrace/set_syscall_info.c
@@ -11,9 +11,16 @@
#include <err.h>
#include <fcntl.h>
#include <signal.h>
+#include <stdlib.h>
+#include <stddef.h>
#include <asm/unistd.h>
+#include <sys/prctl.h>
#include <linux/types.h>
#include <linux/ptrace.h>
+#include <linux/filter.h>
+#include <linux/seccomp.h>
+#include <linux/prctl.h>
+
#if defined(_MIPS_SIM) && _MIPS_SIM == _MIPS_SIM_NABI32
/*
@@ -36,6 +43,7 @@ struct si_exit {
static unsigned int ptrace_stop;
static pid_t tracee_pid;
+static pid_t tracer_pid;
static int
kill_tracee(pid_t pid)
@@ -64,6 +72,25 @@ sys_ptrace(int request, pid_t pid, unsigned long addr, unsigned long data)
ptrace_stop, ##__VA_ARGS__); \
} while (0)
+static int sys_seccomp(unsigned int operation, unsigned int flags, void *args)
+{
+ return syscall(__NR_seccomp, operation, flags, args);
+}
+
+static struct sock_filter seccomp_filter[] = {
+ BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct seccomp_data, nr)),
+
+ BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_restart_syscall, 0, 1),
+ BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+
+ BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRACE),
+};
+
+static struct sock_fprog seccomp_prog = {
+ .filter = seccomp_filter,
+ .len = ARRAY_SIZE(seccomp_filter)
+};
+
static void
check_psi_entry(struct __test_metadata *_metadata,
const struct ptrace_syscall_info *info,
@@ -128,7 +155,6 @@ check_psi_exit(struct __test_metadata *_metadata,
TEST(set_syscall_info)
{
- const pid_t tracer_pid = getpid();
const kernel_ulong_t dummy[] = {
(kernel_ulong_t) 0xdad0bef0bad0fed0ULL,
(kernel_ulong_t) 0xdad1bef1bad1fed1ULL,
@@ -138,6 +164,7 @@ TEST(set_syscall_info)
(kernel_ulong_t) 0xdad5bef5bad5fed5ULL,
};
int splice_in[2], splice_out[2];
+ tracer_pid = getpid();
ASSERT_EQ(0, pipe(splice_in));
ASSERT_EQ(0, pipe(splice_out));
@@ -516,4 +543,149 @@ TEST(set_syscall_info)
ASSERT_EQ(ptrace_stop, ARRAY_SIZE(si) * 2);
}
+TEST(set_syscall_info_seccomp)
+{
+ tracer_pid = getpid();
+ tracee_pid = fork();
+
+ ASSERT_LE(0, tracee_pid) {
+ TH_LOG("fork: %m");
+ }
+
+ /* tracee */
+ if (tracee_pid == 0) {
+ tracee_pid = getpid();
+ ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+ TH_LOG("PTRACE_TRACEME: %m");
+ }
+ ASSERT_EQ(0, kill(tracee_pid, SIGSTOP)) {
+ /* cannot happen */
+ TH_LOG("kill SIGSTOP: %m");
+ }
+
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+ TH_LOG("prctl: %m");
+ _exit(1);
+ }
+ ASSERT_EQ(0, sys_seccomp(SECCOMP_SET_MODE_FILTER, 0,
+ (void *) &seccomp_prog)) {
+ TH_LOG("seccomp: %m");
+ _exit(1);
+ }
+
+ /* run getpid unmodified */
+ ASSERT_EQ(tracee_pid, getpid()) {
+ TH_LOG("getpid seccomp unchanged: %m");
+ _exit(1);
+ }
+
+ /* run getppid instead of getpid */
+ ASSERT_EQ(tracer_pid, getpid()) {
+ TH_LOG("getpid seccomp nr changes: %m");
+ _exit(1);
+ }
+
+ /* skip getpid and return 42 */
+ ASSERT_EQ(42, getpid()) {
+ TH_LOG("getpid skip set return value changes: %m");
+ _exit(1);
+ }
+ _exit(0);
+ }
+
+ int status;
+
+ /* tracer */
+ ASSERT_LE(0, waitpid(-1,&status,0)) {
+ LOG_KILL_TRACEE("waitpid: %m");
+ }
+
+ ASSERT_EQ(0, sys_ptrace(PTRACE_SETOPTIONS, tracee_pid, 0, PTRACE_O_TRACESECCOMP | PTRACE_O_TRACESYSGOOD))
+ LOG_KILL_TRACEE("PTRACE_SETOPTIONS: %m");
+
+ ASSERT_EQ(0, sys_ptrace(PTRACE_CONT, tracee_pid, 0, 0)) {
+ LOG_KILL_TRACEE("PTRACE_CONT: %m");
+ }
+
+ while (1) {
+ ASSERT_EQ(tracee_pid, wait(&status)) {
+ /* cannot happen */
+ LOG_KILL_TRACEE("wait: %m");
+ }
+ if (WIFEXITED(status)) {
+ tracee_pid = 0; /* the tracee is no more */
+ ASSERT_EQ(0, WEXITSTATUS(status)) {
+ LOG_KILL_TRACEE("unexpected exit status %u",
+ WEXITSTATUS(status));
+ }
+ break;
+ }
+ ASSERT_FALSE(WIFSIGNALED(status)) {
+ tracee_pid = 0; /* the tracee is no more */
+ LOG_KILL_TRACEE("unexpected signal %u",
+ WTERMSIG(status));
+ }
+ ASSERT_TRUE(WIFSTOPPED(status)) {
+ LOG_KILL_TRACEE("unexpected wait status %#x", status);
+ }
+
+ if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))) {
+ struct ptrace_syscall_info info;
+ size_t info_size = sizeof(info);
+ ASSERT_LT(0, sys_ptrace(PTRACE_GET_SYSCALL_INFO, tracee_pid, info_size, (uintptr_t) &info)) {
+ LOG_KILL_TRACEE("PTRACE_GET_SYSCALL_INFO: %m");
+ }
+ ASSERT_EQ(PTRACE_SYSCALL_INFO_SECCOMP, info.op) {
+ LOG_KILL_TRACEE("entry op mismatch: %m");
+ }
+ ASSERT_TRUE(info.arch) {
+ LOG_KILL_TRACEE("entry arch mismatch: %m");
+ }
+ ASSERT_TRUE(info.instruction_pointer) {
+ LOG_KILL_TRACEE("entry instruction_pointer mismatch: %m");
+ }
+ ASSERT_TRUE(info.stack_pointer) {
+ LOG_KILL_TRACEE("entry stack_pointer mismatch: %m");
+ }
+
+ switch (ptrace_stop) {
+ case 0: ASSERT_EQ(__NR_getpid, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_getpid mismatch: %m", ptrace_stop);
+ }
+ ptrace_stop++;
+ break;
+ case 1: ASSERT_EQ(__NR_getpid, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_getpid mismatch: %m", ptrace_stop);
+ }
+ info.seccomp.nr = __NR_getppid;
+ ptrace_stop++;
+ break;
+ case 2: ASSERT_EQ(__NR_getpid, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_getpid mismatch: %m", ptrace_stop);
+ }
+ info.op = PTRACE_SYSCALL_INFO_SECCOMP_SKIP;
+ info.exit.rval = 42;
+ info.exit.is_error = 0;
+ ptrace_stop++;
+ break;
+ case 3: ASSERT_EQ(__NR_exit_group, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_exit_group mismatch: %m", ptrace_stop);
+ }
+ break;
+ default:
+ LOG_KILL_TRACEE("unexpected system call: %m");
+ break;
+
+ }
+ ASSERT_EQ(0,sys_ptrace(PTRACE_SET_SYSCALL_INFO, tracee_pid, info_size, (uintptr_t) &info)) {
+ LOG_KILL_TRACEE("PTRACE_SET_SYSCALL_INFO: %m");
+ }
+
+ ASSERT_EQ(0,sys_ptrace(PTRACE_CONT, tracee_pid, 0, 0)) {
+ LOG_KILL_TRACEE("PTRACE_CONT: %m");
+ }
+ }
+ }
+}
+
TEST_HARNESS_MAIN
--
2.53.0
* [PATCH 3/5] asm/ptrace.h: add instruction_pointer_set
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
2026-07-01 15:05 ` [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
2026-07-01 15:05 ` [PATCH 2/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
@ 2026-07-01 15:05 ` Renzo Davoli
2026-07-01 15:05 ` [PATCH 4/5] ptrace: add PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli
2026-07-01 15:05 ` [PATCH 5/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli
4 siblings, 0 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
Add an instruction_pointer_set function for architectures that do
not currently provide one.
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
---
arch/alpha/include/asm/ptrace.h | 6 ++++++
arch/hexagon/include/asm/ptrace.h | 6 ++++++
arch/m68k/include/asm/ptrace.h | 6 ++++++
arch/microblaze/include/asm/ptrace.h | 6 ++++++
arch/nios2/include/asm/ptrace.h | 6 ++++++
arch/um/include/asm/ptrace-generic.h | 6 ++++++
arch/xtensa/include/asm/ptrace.h | 6 ++++++
7 files changed, 42 insertions(+)
diff --git a/arch/alpha/include/asm/ptrace.h b/arch/alpha/include/asm/ptrace.h
index 3557ce64ed21..0821fe9a27c8 100644
--- a/arch/alpha/include/asm/ptrace.h
+++ b/arch/alpha/include/asm/ptrace.h
@@ -24,4 +24,10 @@ static inline unsigned long regs_return_value(struct pt_regs *regs)
return regs->r0;
}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
#endif
diff --git a/arch/hexagon/include/asm/ptrace.h b/arch/hexagon/include/asm/ptrace.h
index ed35da1ee685..0a121f6e3bfc 100644
--- a/arch/hexagon/include/asm/ptrace.h
+++ b/arch/hexagon/include/asm/ptrace.h
@@ -18,6 +18,12 @@ extern const char *regs_query_register_name(unsigned int offset);
((struct pt_regs *) \
((unsigned long)current_thread_info() + THREAD_SIZE) - 1)
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
#if CONFIG_HEXAGON_ARCH_VERSION >= 4
#define arch_has_single_step() (1)
#endif
diff --git a/arch/m68k/include/asm/ptrace.h b/arch/m68k/include/asm/ptrace.h
index bc86ce012025..6e8a8f0daee8 100644
--- a/arch/m68k/include/asm/ptrace.h
+++ b/arch/m68k/include/asm/ptrace.h
@@ -18,6 +18,12 @@
(struct pt_regs *)((char *)current_thread_info() + THREAD_SIZE) - 1
#define current_user_stack_pointer() rdusp()
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
#define arch_has_single_step() (1)
#ifdef CONFIG_MMU
diff --git a/arch/microblaze/include/asm/ptrace.h b/arch/microblaze/include/asm/ptrace.h
index 17982292a64f..69e10658d7a9 100644
--- a/arch/microblaze/include/asm/ptrace.h
+++ b/arch/microblaze/include/asm/ptrace.h
@@ -20,5 +20,11 @@ static inline long regs_return_value(struct pt_regs *regs)
return regs->r3;
}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
#endif /* __ASSEMBLER__ */
#endif /* _ASM_MICROBLAZE_PTRACE_H */
diff --git a/arch/nios2/include/asm/ptrace.h b/arch/nios2/include/asm/ptrace.h
index 96cbcd40c7ce..d120d8ecb187 100644
--- a/arch/nios2/include/asm/ptrace.h
+++ b/arch/nios2/include/asm/ptrace.h
@@ -70,6 +70,12 @@ struct switch_stack {
#define user_stack_pointer(regs) ((regs)->sp)
extern void show_regs(struct pt_regs *);
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
#define current_pt_regs() \
((struct pt_regs *)((unsigned long)current_thread_info() + THREAD_SIZE)\
- 1)
diff --git a/arch/um/include/asm/ptrace-generic.h b/arch/um/include/asm/ptrace-generic.h
index 86d74f9d33cf..44beb96862d8 100644
--- a/arch/um/include/asm/ptrace-generic.h
+++ b/arch/um/include/asm/ptrace-generic.h
@@ -29,6 +29,12 @@ struct pt_regs {
#define PTRACE_OLDSETOPTIONS 21
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
struct task_struct;
extern long subarch_ptrace(struct task_struct *child, long request,
diff --git a/arch/xtensa/include/asm/ptrace.h b/arch/xtensa/include/asm/ptrace.h
index d0568ff6d349..97b14418955e 100644
--- a/arch/xtensa/include/asm/ptrace.h
+++ b/arch/xtensa/include/asm/ptrace.h
@@ -103,6 +103,12 @@ static inline unsigned long regs_return_value(struct pt_regs *regs)
return regs->areg[2];
}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ instruction_pointer(regs) = val;
+}
+
int do_syscall_trace_enter(struct pt_regs *regs);
void do_syscall_trace_leave(struct pt_regs *regs);
--
2.53.0
* [PATCH 4/5] ptrace: add PTRACE_SYSCALL_INFO_FLAG_SET_IP
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
` (2 preceding siblings ...)
2026-07-01 15:05 ` [PATCH 3/5] asm/ptrace.h: add instruction_pointer_set Renzo Davoli
@ 2026-07-01 15:05 ` Renzo Davoli
2026-07-01 15:05 ` [PATCH 5/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli
4 siblings, 0 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
This flag adds support for modifying the tracee's instruction pointer.
To do this, the tracer stores the new instruction pointer value in the
instruction_pointer field of the ptrace_syscall_info structure and
sets the PTRACE_SYSCALL_INFO_FLAG_SET_IP flag in the flags field.
This flag is introduced to avoid breaking existing code that uses
PTRACE_SET_SYSCALL_INFO and currently ignores the
instruction_pointer field.
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
---
include/uapi/linux/ptrace.h | 4 ++++
kernel/ptrace.c | 11 +++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 22489597a325..1d9c04447bb7 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -81,6 +81,10 @@ struct seccomp_metadata {
#define PTRACE_SYSCALL_INFO_SECCOMP 3
#define PTRACE_SYSCALL_INFO_SECCOMP_SKIP 4
+#define PTRACE_SYSCALL_INFO_FLAG_SET_IP (1 << 0)
+#define PTRACE_SYSCALL_INFO_FLAG_ALL \
+ (PTRACE_SYSCALL_INFO_FLAG_SET_IP)
+
struct ptrace_syscall_info {
__u8 op; /* PTRACE_SYSCALL_INFO_* */
__u8 reserved;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index ff763c87e4f7..57a1731451d0 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1147,8 +1147,8 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
if (copy_from_user(&info, datavp, sizeof(info)))
return -EFAULT;
- /* Reserved for future use. */
- if (info.flags || info.reserved)
+ /* Unused flags and fields reserved for future use. */
+ if ((info.flags & ~PTRACE_SYSCALL_INFO_FLAG_ALL) || info.reserved)
return -EINVAL;
/*
@@ -1165,6 +1165,13 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
return -EINVAL;
}
+ if (info.flags & PTRACE_SYSCALL_INFO_FLAG_SET_IP) {
+ unsigned long ip = info.instruction_pointer;
+ if (ip != info.instruction_pointer)
+ return -ERANGE;
+ instruction_pointer_set(regs, ip);
+ }
+
switch (info.op) {
case PTRACE_SYSCALL_INFO_ENTRY:
return ptrace_set_syscall_info_entry(child, regs, &info);
--
2.53.0
* [PATCH 5/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_FLAG_SET_IP
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
` (3 preceding siblings ...)
2026-07-01 15:05 ` [PATCH 4/5] ptrace: add PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli
@ 2026-07-01 15:05 ` Renzo Davoli
4 siblings, 0 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-01 15:05 UTC (permalink / raw)
To: linux-kernel
Cc: Renzo Davoli, Andrew Morton, Oleg Nesterov, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
Check whether PTRACE_SYSCALL_INFO_FLAG_SET_IP semantics implemented in the
kernel matches userspace expectations.
Signed-off-by: Renzo Davoli <renzo@cs.unibo.it>
---
.../selftests/ptrace/set_syscall_info.c | 147 ++++++++++++++++++
1 file changed, 147 insertions(+)
diff --git a/tools/testing/selftests/ptrace/set_syscall_info.c b/tools/testing/selftests/ptrace/set_syscall_info.c
index 7f397469fd00..772312bb8328 100644
--- a/tools/testing/selftests/ptrace/set_syscall_info.c
+++ b/tools/testing/selftests/ptrace/set_syscall_info.c
@@ -91,6 +91,10 @@ static struct sock_fprog seccomp_prog = {
.len = ARRAY_SIZE(seccomp_filter)
};
+static char w1[] = {'A', '\n'};
+static char w2[] = {'B', '\n'};
+static char w3[] = {'C', '\n'};
+
static void
check_psi_entry(struct __test_metadata *_metadata,
const struct ptrace_syscall_info *info,
@@ -688,4 +692,147 @@ TEST(set_syscall_info_seccomp)
}
}
+TEST(set_syscall_info_setip)
+{
+ tracer_pid = getpid();
+ tracee_pid = fork();
+
+ ASSERT_LE(0, tracee_pid) {
+ TH_LOG("fork: %m");
+ }
+
+ /* tracee */
+ if (tracee_pid == 0) {
+ tracee_pid = getpid();
+ ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+ TH_LOG("PTRACE_TRACEME: %m");
+ }
+ ASSERT_EQ(0, kill(tracee_pid, SIGSTOP)) {
+ /* cannot happen */
+ TH_LOG("kill SIGSTOP: %m");
+ }
+
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+ TH_LOG("prctl: %m");
+ _exit(1);
+ }
+ ASSERT_EQ(0, sys_seccomp(SECCOMP_SET_MODE_FILTER, 0,
+ (void *) &seccomp_prog)) {
+ TH_LOG("seccomp: %m");
+ _exit(1);
+ }
+
+presyscall:;
+ /* this syscall will run twice
+ (the tracer steps back the instruction pointer) */
+ int rv = write(1, w1, sizeof(w1));
+ ASSERT_EQ(2, rv) {
+ TH_LOG("write rerun with modified arguments: %m");
+ _exit(1);
+ }
+
+ /* run write unmodified */
+ ASSERT_EQ(2, write(1, w3, sizeof(w3))) {
+ TH_LOG("write unmodified: %m");
+ _exit(1);
+ }
+ _exit(0);
+ }
+
+ int status;
+ void *doitagain = &&presyscall;
+
+ /* tracer */
+ ASSERT_LE(0, waitpid(-1,&status,0)) {
+ LOG_KILL_TRACEE("waitpid: %m");
+ }
+
+ ASSERT_EQ(0, sys_ptrace(PTRACE_SETOPTIONS, tracee_pid, 0, PTRACE_O_TRACESECCOMP | PTRACE_O_TRACESYSGOOD))
+ LOG_KILL_TRACEE("PTRACE_SETOPTIONS: %m");
+
+ ASSERT_EQ(0, sys_ptrace(PTRACE_CONT, tracee_pid, 0, 0)) {
+ LOG_KILL_TRACEE("PTRACE_CONT: %m");
+ }
+
+ while (1) {
+ ASSERT_EQ(tracee_pid, wait(&status)) {
+ /* cannot happen */
+ LOG_KILL_TRACEE("wait: %m");
+ }
+ if (WIFEXITED(status)) {
+ tracee_pid = 0; /* the tracee is no more */
+ ASSERT_EQ(0, WEXITSTATUS(status)) {
+ LOG_KILL_TRACEE("unexpected exit status %u",
+ WEXITSTATUS(status));
+ }
+ break;
+ }
+ ASSERT_FALSE(WIFSIGNALED(status)) {
+ tracee_pid = 0; /* the tracee is no more */
+ LOG_KILL_TRACEE("unexpected signal %u",
+ WTERMSIG(status));
+ }
+ ASSERT_TRUE(WIFSTOPPED(status)) {
+ LOG_KILL_TRACEE("unexpected wait status %#x", status);
+ }
+
+ if (status >> 8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8))) {
+ struct ptrace_syscall_info info;
+ size_t info_size = sizeof(info);
+ ASSERT_LT(0, sys_ptrace(PTRACE_GET_SYSCALL_INFO, tracee_pid, info_size, (uintptr_t) &info)) {
+ LOG_KILL_TRACEE("PTRACE_GET_SYSCALL_INFO: %m");
+ }
+ ASSERT_EQ(PTRACE_SYSCALL_INFO_SECCOMP, info.op) {
+ LOG_KILL_TRACEE("entry op mismatch: %m");
+ }
+ ASSERT_TRUE(info.arch) {
+ LOG_KILL_TRACEE("entry arch mismatch: %m");
+ }
+ ASSERT_TRUE(info.instruction_pointer) {
+ LOG_KILL_TRACEE("entry instruction_pointer mismatch: %m");
+ }
+ ASSERT_TRUE(info.stack_pointer) {
+ LOG_KILL_TRACEE("entry stack_pointer mismatch: %m");
+ }
+
+ switch (ptrace_stop) {
+ case 0: ASSERT_EQ(__NR_write, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_write mismatch: %m", ptrace_stop);
+ }
+ info.instruction_pointer = (uintptr_t) doitagain;
+ info.flags = PTRACE_SYSCALL_INFO_FLAG_SET_IP;
+ ptrace_stop++;
+ break;
+ case 1:
+ info.seccomp.nr = __NR_write;
+ info.seccomp.args[0] = 1;
+ info.seccomp.args[1] = (uintptr_t) w2;
+ info.seccomp.args[2] = sizeof(w2);
+ ptrace_stop++;
+ break;
+ case 2: ASSERT_EQ(__NR_write, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_write mismatch: %m", ptrace_stop);
+ }
+ ptrace_stop++;
+ break;
+ case 3: ASSERT_EQ(__NR_exit_group, info.seccomp.nr) {
+ LOG_KILL_TRACEE("step %d nr __NR_exit_group mismatch: %m", ptrace_stop);
+ }
+ break;
+ default:
+ LOG_KILL_TRACEE("unexpected system call: %m");
+ break;
+
+ }
+ ASSERT_EQ(0,sys_ptrace(PTRACE_SET_SYSCALL_INFO, tracee_pid, info_size, (uintptr_t) &info)) {
+ LOG_KILL_TRACEE("PTRACE_SET_SYSCALL_INFO: %m");
+ }
+
+ ASSERT_EQ(0,sys_ptrace(PTRACE_CONT, tracee_pid, 0, 0)) {
+ LOG_KILL_TRACEE("PTRACE_CONT: %m");
+ }
+ }
+ }
+}
+
TEST_HARNESS_MAIN
--
2.53.0
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-01 15:05 ` [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
@ 2026-07-02 8:43 ` Oleg Nesterov
2026-07-02 9:09 ` Renzo Davoli
0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2026-07-02 8:43 UTC (permalink / raw)
To: Renzo Davoli
Cc: linux-kernel, Andrew Morton, Shuah Khan, Alexey Gladkov,
Eugene Syromyatnikov, Mike Frysinger, Davide Berardi,
strace-devel, Dmitry Levin
On 07/01, Renzo Davoli wrote:
>
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -79,6 +79,7 @@ struct seccomp_metadata {
> #define PTRACE_SYSCALL_INFO_ENTRY 1
> #define PTRACE_SYSCALL_INFO_EXIT 2
> #define PTRACE_SYSCALL_INFO_SECCOMP 3
> +#define PTRACE_SYSCALL_INFO_SECCOMP_SKIP 4
>
> struct ptrace_syscall_info {
> __u8 op; /* PTRACE_SYSCALL_INFO_* */
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index d041645d9d17..ff763c87e4f7 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -1119,12 +1119,22 @@ ptrace_set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
> return 0;
> }
>
> +static int
> +ptrace_set_syscall_info_seccomp_skip(struct task_struct *child,
> + struct pt_regs *regs,
> + struct ptrace_syscall_info *info)
> +{
> + syscall_set_nr(child, regs, -1);
> + return ptrace_set_syscall_info_exit(child, regs, info);
> +}
Rather than add the new PTRACE_SYSCALL_INFO_SECCOMP_SKIP, can't we teach
ptrace_set_syscall_info_seccomp() to treat info->entry.nr == -1 as "skip" ?
Note that ptrace_set_syscall_info_seccomp() -> ptrace_set_syscall_info_entry()
already does syscall_set_nr().
And perhaps the changelog should say more about motivation...
See also https://sashiko.dev/#/patchset/20260701150558.330348-1-renzo%40cs.unibo.it
Oleg.
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 8:43 ` Oleg Nesterov
@ 2026-07-02 9:09 ` Renzo Davoli
2026-07-02 9:58 ` Oleg Nesterov
0 siblings, 1 reply; 14+ messages in thread
From: Renzo Davoli @ 2026-07-02 9:09 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Andrew Morton, Shuah Khan, Alexey Gladkov,
Eugene Syromyatnikov, Mike Frysinger, Davide Berardi,
strace-devel, Dmitry Levin
Hi Oleg,
> Rather than add the new PTRACE_SYSCALL_INFO_SECCOMP_SKIP, can't we teach
> ptrace_set_syscall_info_seccomp() to treat info->entry.nr == -1 as "skip" ?
it already does
> Note that ptrace_set_syscall_info_seccomp() -> ptrace_set_syscall_info_entry()
> already does syscall_set_nr().
Syscall skipping is useless if there is not a way to set the return value/errno.
As I explain in the cover letter
+ The tracer can skip the system call by setting the system call number
+ to -1. However, the current PTRACE_SET_SYSCALL_INFO interface does not
+ provide a way to specify the return value or error code that should be
+ reported to the tracee after skipping the call.
currently retvalue/errno can be set only at PTRACE_SYSCALL_INFO_EXIT
renzo
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 9:09 ` Renzo Davoli
@ 2026-07-02 9:58 ` Oleg Nesterov
2026-07-02 11:07 ` Dmitry V. Levin
0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2026-07-02 9:58 UTC (permalink / raw)
To: Renzo Davoli
Cc: linux-kernel, Andrew Morton, Shuah Khan, Alexey Gladkov,
Eugene Syromyatnikov, Mike Frysinger, Davide Berardi,
strace-devel, Dmitry Levin
On 07/02, Renzo Davoli wrote:
>
> Hi Oleg,
>
> > Rather than add the new PTRACE_SYSCALL_INFO_SECCOMP_SKIP, can't we teach
> > ptrace_set_syscall_info_seccomp() to treat info->entry.nr == -1 as "skip" ?
> it already does
> > Note that ptrace_set_syscall_info_seccomp() -> ptrace_set_syscall_info_entry()
> > already does syscall_set_nr().
> Syscall skipping is useless if there is not a way to set the return value/errno.
>
> As I explain in the cover letter
> + The tracer can skip the system call by setting the system call number
> + to -1. However, the current PTRACE_SET_SYSCALL_INFO interface does not
> + provide a way to specify the return value or error code that should be
> + reported to the tracee after skipping the call.
>
> currently retvalue/errno can be set only at PTRACE_SYSCALL_INFO_EXIT
I meant something like below. This way both PTRACE_SYSCALL_INFO_ENTRY and
__SECCOMP can skip the syscall and set the return/error value.
Oleg.
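A minimal user-space sketch of the layout this suggests, to show which
fields end up sharing storage. struct entry_sketch and fill_skip() are
illustrative names only and use <stdint.h> types in place of the kernel's
__u64/__s64/__u8; they are not part of the diff that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: mirrors the proposed "entry" member. rval overlays args[0]
 * and is_error overlays the first byte of args[1].
 */
struct entry_sketch {
	uint64_t nr;
	union {
		uint64_t args[6];
		struct {
			int64_t rval;
			uint8_t is_error;
		};
	};
};

/* Hypothetical helper: describe a skipped syscall and its result. */
static void fill_skip(struct entry_sketch *e, int64_t rval, int is_error)
{
	assert(e != NULL);

	e->nr = (uint64_t)-1;	/* nr == -1 requests the skip */
	e->rval = rval;		/* args are unused in that case */
	e->is_error = is_error ? 1 : 0;
}
```

Because the arguments are meaningless once nr is -1, reusing their storage
for the result leaves the overall size of the structure unchanged.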
---
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 5f8ef6156752..4ee7870f3291 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -90,7 +90,13 @@ struct ptrace_syscall_info {
 	union {
 		struct {
 			__u64 nr;
-			__u64 args[6];
+			union {
+				__u64 args[6];
+				struct {
+					__s64 rval;
+					__u8 is_error;
+				};
+			};
 		} entry;
 		struct {
 			__s64 rval;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 130043bfc209..1daac0e62cfa 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1031,6 +1031,28 @@ ptrace_get_syscall_info(struct task_struct *child, unsigned long user_size,
 	return copy_to_user(datavp, &info, write_size) ? -EFAULT : actual_size;
 }
 
+static int
+__set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
+			__s64 __rval, __u8 __is_error)
+{
+	long rval = __rval;
+
+	/*
+	 * Check that the return value specified in info->exit.rval
+	 * is either a value of type "long" or a sign-extended value
+	 * of type "long".
+	 */
+	if (rval != __rval)
+		return -ERANGE;
+
+	if (__is_error)
+		syscall_set_return_value(child, regs, rval, 0);
+	else
+		syscall_set_return_value(child, regs, 0, rval);
+
+	return 0;
+}
+
 static int
 ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
 			      struct ptrace_syscall_info *info)
@@ -1047,6 +1069,11 @@ ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
 	if (nr != info->entry.nr)
 		return -ERANGE;
 
+	syscall_set_nr(child, regs, nr);
+	if (nr == -1)
+		return __set_syscall_info_exit(child, regs,
+				info->entry.rval, info->entry.is_error);
+
 	for (i = 0; i < ARRAY_SIZE(args); i++) {
 		args[i] = info->entry.args[i];
 		/*
@@ -1058,16 +1085,7 @@ ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
 			return -ERANGE;
 	}
 
-	syscall_set_nr(child, regs, nr);
-	/*
-	 * If the syscall number is set to -1, setting syscall arguments is not
-	 * just pointless, it would also clobber the syscall return value on
-	 * those architectures that share the same register both for the first
-	 * argument of syscall and its return value.
-	 */
-	if (nr != -1)
-		syscall_set_arguments(child, regs, args);
-
+	syscall_set_arguments(child, regs, args);
 	return 0;
 }
 
@@ -1086,22 +1104,8 @@ static int
 ptrace_set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
 			     struct ptrace_syscall_info *info)
 {
-	long rval = info->exit.rval;
-
-	/*
-	 * Check that the return value specified in info->exit.rval
-	 * is either a value of type "long" or a sign-extended value
-	 * of type "long".
-	 */
-	if (rval != info->exit.rval)
-		return -ERANGE;
-
-	if (info->exit.is_error)
-		syscall_set_return_value(child, regs, rval, 0);
-	else
-		syscall_set_return_value(child, regs, 0, rval);
-
-	return 0;
+	return __set_syscall_info_exit(child, regs,
+			info->exit.rval, info->exit.is_error);
 }
 
 static int
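
The -ERANGE check that this patch moves into __set_syscall_info_exit() can be
modelled in userspace. The sketch below is only an illustration under stated
assumptions: it models the kernel's native "long" as int32_t (a 32-bit
kernel), and model_check_rval32() is a hypothetical name, not kernel code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace model of the -ERANGE check, with the kernel's "long" modelled
 * as int32_t (assumption: 32-bit kernel). Hypothetical helper.
 */
static int model_check_rval32(int64_t wire_rval, int32_t *out)
{
	/*
	 * Narrow to the modelled native width. An out-of-range conversion is
	 * implementation-defined in C; GCC and Clang wrap modulo 2^32.
	 */
	int32_t native = (int32_t)wire_rval;

	/* Sign-extending the narrowed value must give back the original. */
	if ((int64_t)native != wire_rval)
		return -ERANGE;

	*out = native;
	return 0;
}
```

With this model, -2 is accepted unchanged, while 0x100000000 and 0xffffffff
are rejected because neither is a sign-extended 32-bit value.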
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 9:58 ` Oleg Nesterov
@ 2026-07-02 11:07 ` Dmitry V. Levin
2026-07-02 11:31 ` Oleg Nesterov
0 siblings, 1 reply; 14+ messages in thread
From: Dmitry V. Levin @ 2026-07-02 11:07 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Renzo Davoli, linux-kernel, Andrew Morton, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
On Thu, Jul 02, 2026 at 11:58:14AM +0200, Oleg Nesterov wrote:
[...]
> @@ -1047,6 +1069,11 @@ ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
>  	if (nr != info->entry.nr)
>  		return -ERANGE;
>
> +	syscall_set_nr(child, regs, nr);
> +	if (nr == -1)
> +		return __set_syscall_info_exit(child, regs,
> +				info->entry.rval, info->entry.is_error);
The kernel shouldn't suddenly start interpreting info->entry.rval and
info->entry.is_error because the current users of this interface are not
aware that the kernel might be doing it. If we want to extend
PTRACE_SYSCALL_INFO_ENTRY/PTRACE_SYSCALL_INFO_SECCOMP this way, we would
have to require setting a flag in info->flags signalling the kernel that
the user requests this new behaviour.
--
ldv
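
The compatibility concern above can be illustrated with a local mirror of the
proposed "entry" layout. This is a sketch, not the real uapi header:
struct model_entry and legacy_fill() are hypothetical names, and the layout is
copied from the diff quoted in this thread. A tracer that fills nr and the six
arguments, as existing users do, also writes the bytes that the kernel would
start reading as rval and is_error.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the proposed "entry" layout; not the real uapi header. */
struct model_entry {
	uint64_t nr;
	union {
		uint64_t args[6];
		struct {
			int64_t rval;     /* overlays args[0] */
			uint8_t is_error; /* overlays the first byte of args[1] */
		};
	};
};

/* Fill the structure the way an existing tracer would: nr plus arguments. */
static void legacy_fill(struct model_entry *e, uint64_t nr,
			const uint64_t args[6])
{
	e->nr = nr;
	memcpy(e->args, args, sizeof(e->args));
}
```

After legacy_fill() with nr set to (uint64_t)-1, whatever the tracer left in
args[0] and args[1] becomes the apparent return value and error flag, which is
why an explicit opt-in is requested.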
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 11:07 ` Dmitry V. Levin
@ 2026-07-02 11:31 ` Oleg Nesterov
2026-07-02 11:39 ` Oleg Nesterov
0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2026-07-02 11:31 UTC (permalink / raw)
To: Dmitry V. Levin
Cc: Renzo Davoli, linux-kernel, Andrew Morton, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
On 07/02, Dmitry V. Levin wrote:
>
> On Thu, Jul 02, 2026 at 11:58:14AM +0200, Oleg Nesterov wrote:
> [...]
> > @@ -1047,6 +1069,11 @@ ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
> >  	if (nr != info->entry.nr)
> >  		return -ERANGE;
> >
> > +	syscall_set_nr(child, regs, nr);
> > +	if (nr == -1)
> > +		return __set_syscall_info_exit(child, regs,
> > +				info->entry.rval, info->entry.is_error);
>
> The kernel shouldn't suddenly start interpreting info->entry.rval and
> info->entry.is_error because the current users of this interface are not
> aware that the kernel might be doing it. If we want to extend
> PTRACE_SYSCALL_INFO_ENTRY/PTRACE_SYSCALL_INFO_SECCOMP this way, we would
> have to require setting a flag in info->flags signalling the kernel that
> the user requests this new behaviour.
Ah. I forgot to mention that (obviously) this is a user-visible change,
and a new flag in info->flags will be safer. Of course.
Or we can define a special SKIP_AND_SET_RVAL value for info->entry.nr.
But I am just curious, will this change (without a new flag) actually break
strace? What does strace do when it uses PTRACE_SYSCALL_INFO_ENTRY with
info->entry.nr == -1?
Oleg.
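
The flag-gated variant discussed above could look roughly like the userspace
model below. Everything in it is an assumption made for illustration:
MODEL_FLAG_SET_RVAL is a hypothetical bit (no flag name or value is proposed
in this thread), and struct model_request, struct model_regs and
model_set_entry() are stand-ins, not kernel interfaces.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opt-in bit; no such flag is defined in the thread. */
#define MODEL_FLAG_SET_RVAL 0x0001u

struct model_request {
	uint16_t flags;
	int64_t nr;       /* -1 asks for the syscall to be skipped */
	int64_t rval;
	uint8_t is_error;
};

struct model_regs {
	long nr;
	long ret;
	bool ret_written;
};

static int model_set_entry(struct model_regs *regs,
			   const struct model_request *req)
{
	/* Reject flag bits this model does not know about. */
	if (req->flags & ~MODEL_FLAG_SET_RVAL)
		return -EINVAL;

	regs->nr = (long)req->nr;

	/* Callers that never set the flag keep the old behaviour. */
	if (req->nr == -1 && (req->flags & MODEL_FLAG_SET_RVAL)) {
		regs->ret = (long)req->rval;
		regs->ret_written = true;
	}
	return 0;
}
```

A request with nr == -1 and no flag only changes the syscall number, as
today; the return value is written only when the caller opts in.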
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 11:31 ` Oleg Nesterov
@ 2026-07-02 11:39 ` Oleg Nesterov
2026-07-02 14:47 ` Oleg Nesterov
0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2026-07-02 11:39 UTC (permalink / raw)
To: Dmitry V. Levin
Cc: Renzo Davoli, linux-kernel, Andrew Morton, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
Or we can simply allow the ENTRY/SECCOMP -> EXIT transition, I dunno.
This is safer.
But somehow I don't like the new PTRACE_SYSCALL_INFO_SECCOMP_SKIP...
Oleg.
On 07/02, Oleg Nesterov wrote:
>
> On 07/02, Dmitry V. Levin wrote:
> >
> > On Thu, Jul 02, 2026 at 11:58:14AM +0200, Oleg Nesterov wrote:
> > [...]
> > > @@ -1047,6 +1069,11 @@ ptrace_set_syscall_info_entry(struct task_struct *child, struct pt_regs *regs,
> > >  	if (nr != info->entry.nr)
> > >  		return -ERANGE;
> > >
> > > +	syscall_set_nr(child, regs, nr);
> > > +	if (nr == -1)
> > > +		return __set_syscall_info_exit(child, regs,
> > > +				info->entry.rval, info->entry.is_error);
> >
> > The kernel shouldn't suddenly start interpreting info->entry.rval and
> > info->entry.is_error because the current users of this interface are not
> > aware that the kernel might be doing it. If we want to extend
> > PTRACE_SYSCALL_INFO_ENTRY/PTRACE_SYSCALL_INFO_SECCOMP this way, we would
> > have to require setting a flag in info->flags signalling the kernel that
> > the user requests this new behaviour.
>
> Ah. I forgot to mention that (obviously) this is a user-visible change,
> and a new flag in info->flags will be safer. Of course.
>
> Or we can define a special SKIP_AND_SET_RVAL value for info->entry.nr.
>
> But I am just curious, will this change (without new flag) actually break
> strace? What does strace do when it uses PTRACE_SYSCALL_INFO_ENTRY with
> info->entry.nr == -1?
>
> Oleg.
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 11:39 ` Oleg Nesterov
@ 2026-07-02 14:47 ` Oleg Nesterov
2026-07-02 16:10 ` Renzo Davoli
0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2026-07-02 14:47 UTC (permalink / raw)
To: Dmitry V. Levin
Cc: Renzo Davoli, linux-kernel, Andrew Morton, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
On 07/02, Oleg Nesterov wrote:
>
> Or we can simply allow the ENTRY/SECCOMP -> EXIT transition, I dunno.
> This is safer.
and simpler
> But somehow I don't like the new PTRACE_SYSCALL_INFO_SECCOMP_SKIP...
Wdyt about something like below?
Oleg.
---
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 130043bfc209..ecbfa28dfbf6 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1084,7 +1084,7 @@ ptrace_set_syscall_info_seccomp(struct task_struct *child, struct pt_regs *regs,
 
 static int
 ptrace_set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
-			     struct ptrace_syscall_info *info)
+			     struct ptrace_syscall_info *info, bool force)
 {
 	long rval = info->exit.rval;
 
@@ -1101,6 +1101,9 @@ ptrace_set_syscall_info_exit(struct task_struct *child, struct pt_regs *regs,
 	else
 		syscall_set_return_value(child, regs, 0, rval);
 
+	if (force)
+		syscall_set_nr(child, regs, -1);
+
 	return 0;
 }
 
@@ -1110,6 +1113,7 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
 {
 	struct pt_regs *regs = task_pt_regs(child);
 	struct ptrace_syscall_info info;
+	bool force = false;
 
 	if (user_size < sizeof(info))
 		return -EINVAL;
@@ -1127,14 +1131,17 @@ ptrace_set_syscall_info(struct task_struct *child, unsigned long user_size,
 		return -EINVAL;
 
 	/* Changing the type of the system call stop is not supported yet. */
-	if (ptrace_get_syscall_info_op(child) != info.op)
-		return -EINVAL;
+	if (ptrace_get_syscall_info_op(child) != info.op) {
+		if (info.op != PTRACE_SYSCALL_INFO_EXIT)
+			return -EINVAL;
+		force = true;
+	}
 
 	switch (info.op) {
 	case PTRACE_SYSCALL_INFO_ENTRY:
 		return ptrace_set_syscall_info_entry(child, regs, &info);
 	case PTRACE_SYSCALL_INFO_EXIT:
-		return ptrace_set_syscall_info_exit(child, regs, &info, force);
+		return ptrace_set_syscall_info_exit(child, regs, &info, force);
 	case PTRACE_SYSCALL_INFO_SECCOMP:
 		return ptrace_set_syscall_info_seccomp(child, regs, &info);
 	default:
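
From the tracer's side, the ENTRY/SECCOMP -> EXIT transition in this patch
would mean rewriting the structure obtained at the stop before handing it
back. The sketch below only prepares the structure. It uses a local mirror of
the uapi layout (struct tr_syscall_info, the TR_OP_* constants and
tr_request_skip() are hypothetical names), and it assumes the semantics
proposed in this thread, which are not merged. A real tracer would presumably
pass the result to PTRACE_SET_SYSCALL_INFO afterwards.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the PTRACE_SYSCALL_INFO_* op values from the uapi header. */
enum {
	TR_OP_NONE = 0,
	TR_OP_ENTRY = 1,
	TR_OP_EXIT = 2,
	TR_OP_SECCOMP = 3,
};

/* Local mirror of struct ptrace_syscall_info, for illustration only. */
struct tr_syscall_info {
	uint8_t op;
	uint8_t reserved;
	uint16_t flags;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t stack_pointer;
	union {
		struct { uint64_t nr; uint64_t args[6]; } entry;
		struct { int64_t rval; uint8_t is_error; } exit;
		struct {
			uint64_t nr;
			uint64_t args[6];
			uint32_t ret_data;
			uint32_t reserved2;
		} seccomp;
	};
};

/*
 * Turn a report from an entry or seccomp stop into an "exit" request that
 * carries the result the tracee should see. Returns 0 on success, -1 if the
 * report does not come from an entry or seccomp stop.
 */
static int tr_request_skip(struct tr_syscall_info *info, int64_t rval,
			   int is_error)
{
	if (info->op != TR_OP_ENTRY && info->op != TR_OP_SECCOMP)
		return -1;

	/* Clear stale nr/args; seccomp is the largest union member. */
	memset(&info->seccomp, 0, sizeof(info->seccomp));

	info->op = TR_OP_EXIT;
	info->exit.rval = rval;
	info->exit.is_error = is_error ? 1 : 0;
	return 0;
}
```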
* Re: [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP
2026-07-02 14:47 ` Oleg Nesterov
@ 2026-07-02 16:10 ` Renzo Davoli
0 siblings, 0 replies; 14+ messages in thread
From: Renzo Davoli @ 2026-07-02 16:10 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Dmitry V. Levin, linux-kernel, Andrew Morton, Shuah Khan,
Alexey Gladkov, Eugene Syromyatnikov, Mike Frysinger,
Davide Berardi, strace-devel
On Thu, Jul 02, 2026 at 04:47:08PM +0200, Oleg Nesterov wrote:
> On 07/02, Oleg Nesterov wrote:
> Wdyt about something like below?
I like it.
I have one comment:
> +	if (ptrace_get_syscall_info_op(child) != info.op) {
> +		if (info.op != PTRACE_SYSCALL_INFO_EXIT)
> +			return -EINVAL;
> +		force = true;
> +	}
In the manual, I have found the behavior "negative syscall number => skip syscall"
defined only for PTRACE_EVENT_SECCOMP. I'd restrict the option to this
case, for safety.
A minor detail: I'd also rename "force" to "skip" or "skip_syscall", just
for the sake of readability.
Dmitry: does this proposal have any adverse effects on strace?
renzo
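
The restriction suggested above, accepting only the seccomp stop -> EXIT
transition and naming the flag skip_syscall, can be sketched as a small
userspace model. The KM_OP_* constants mirror the uapi op values, and
km_check_op() is a hypothetical helper, not a proposed kernel function.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors the PTRACE_SYSCALL_INFO_* op values from the uapi header. */
enum {
	KM_OP_ENTRY = 1,
	KM_OP_EXIT = 2,
	KM_OP_SECCOMP = 3,
};

/*
 * stop_op is the kind of stop the tracee is actually in, requested_op is
 * the op supplied by the tracer. On success, *skip_syscall tells the caller
 * whether the syscall number must also be set to -1.
 */
static int km_check_op(int stop_op, int requested_op, bool *skip_syscall)
{
	*skip_syscall = false;

	/* Same op as the current stop: nothing special to do. */
	if (stop_op == requested_op)
		return 0;

	/* Only the seccomp stop -> exit transition is accepted. */
	if (stop_op != KM_OP_SECCOMP || requested_op != KM_OP_EXIT)
		return -EINVAL;

	*skip_syscall = true;
	return 0;
}
```

Compared with the quoted hunk, an ENTRY -> EXIT request is rejected here,
since the skip-on-negative-number behaviour is documented only for seccomp
stops.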
Thread overview: 14+ messages
2026-07-01 15:05 [PATCH 0/5] ptrace_set_syscall_info: add support for seccomp syscall skipping and instruction pointer modification Renzo Davoli
2026-07-01 15:05 ` [PATCH 1/5] ptrace: add PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
2026-07-02 8:43 ` Oleg Nesterov
2026-07-02 9:09 ` Renzo Davoli
2026-07-02 9:58 ` Oleg Nesterov
2026-07-02 11:07 ` Dmitry V. Levin
2026-07-02 11:31 ` Oleg Nesterov
2026-07-02 11:39 ` Oleg Nesterov
2026-07-02 14:47 ` Oleg Nesterov
2026-07-02 16:10 ` Renzo Davoli
2026-07-01 15:05 ` [PATCH 2/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_SECCOMP_SKIP Renzo Davoli
2026-07-01 15:05 ` [PATCH 3/5] asm/ptrace.h: add instruction_pointer_set Renzo Davoli
2026-07-01 15:05 ` [PATCH 4/5] ptrace: add PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli
2026-07-01 15:05 ` [PATCH 5/5] selftests/ptrace: add a test case for PTRACE_SYSCALL_INFO_FLAG_SET_IP Renzo Davoli