From: Oleg Nesterov <oleg@redhat.com>
To: Andy Lutomirski <luto@kernel.org>, Kees Cook <kees@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@kernel.org>, Will Drewry <wad@chromium.org>
Cc: Eric Paris <eparis@redhat.com>,
Kusaram Devineni <kusaram@devineni.in>,
Max Ver <dudududumaxver@gmail.com>,
Paul Moore <paul@paul-moore.com>,
audit@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/2] seccomp: drop syscall exit events for rejected syscalls
Date: Sun, 19 Apr 2026 17:53:25 +0200
Message-ID: <aeT6dasGDZYPMQ_h@redhat.com>
In-Reply-To: <aeT6T7ZJ45yAtdZs@redhat.com>

seccomp_nack_syscall() calls syscall_rollback(), so the syscall exit path
sees the original syscall number where it expects the return value. This
confuses audit_syscall_exit(), trace_syscall_exit(), and ptrace, which
then report completely bogus syscall exit events.

Add a new SYSCALL_WORK_SECCOMP_EXIT flag, set by seccomp_nack_syscall(),
and change syscall_exit_work() to return early if this flag is set. After
all, the syscall was never actually executed.

Note that syscall_exit_work() has to clear SYSCALL_WORK_SECCOMP_EXIT for
the !force_coredump case, where the task can go on to make further
syscalls. That is why the new flag is actually needed:
seccomp_nack_syscall() can't just clear
SYSCALL_WORK_SYSCALL_{AUDIT,TRACEPOINT,TRACE}.

Reported-by: Max Ver <dudududumaxver@gmail.com>
Closes: https://lore.kernel.org/all/CABjJbFJO+p3jA1r0gjUZrCepQb1Fab3kqxYhc_PSfoqo21ypeQ@mail.gmail.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
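For readers who want to see the one-shot flag handling from the changelog
in isolation, here is a minimal user-space sketch. It is not kernel code
and not part of the patch: every model_/MODEL_ name is hypothetical, and
the struct only stands in for the per-task syscall_work word. It assumes
nothing beyond what the changelog states (the flag is set on rejection,
tested and cleared on exit).

```c
/*
 * User-space model of the one-shot flag described in the changelog.
 * NOT kernel code: every model_/MODEL_ name is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_WORK_AUDIT	(1u << 0)	/* stands in for SYSCALL_WORK_SYSCALL_AUDIT */
#define MODEL_WORK_TRACEPOINT	(1u << 1)	/* stands in for SYSCALL_WORK_SYSCALL_TRACEPOINT */
#define MODEL_WORK_SECCOMP_EXIT	(1u << 2)	/* stands in for SYSCALL_WORK_SECCOMP_EXIT */

struct model_task {
	unsigned int syscall_work;	/* per-task work bits */
	unsigned int exit_events;	/* exit events reported so far */
};

/* Models seccomp_nack_syscall(): mark the pending exit as having no valid state. */
static void model_nack_syscall(struct model_task *t)
{
	t->syscall_work |= MODEL_WORK_SECCOMP_EXIT;
}

/*
 * Models syscall_exit_work(). Returns false if exit reporting was
 * suppressed, true if the normal reporting path ran.
 */
static bool model_exit_work(struct model_task *t)
{
	if (t->syscall_work & MODEL_WORK_SECCOMP_EXIT) {
		/* One-shot: clear it so the next syscall is reported normally. */
		t->syscall_work &= ~MODEL_WORK_SECCOMP_EXIT;
		return false;
	}

	if (t->syscall_work & (MODEL_WORK_AUDIT | MODEL_WORK_TRACEPOINT))
		t->exit_events++;
	return true;
}

/* Walks through one rejected syscall followed by one ordinary syscall. */
static void model_selftest(void)
{
	struct model_task t = { .syscall_work = MODEL_WORK_AUDIT };

	model_nack_syscall(&t);
	assert(!model_exit_work(&t));			/* rejected: no exit event */
	assert(t.exit_events == 0);
	assert(t.syscall_work == MODEL_WORK_AUDIT);	/* audit bit untouched */

	assert(model_exit_work(&t));			/* next syscall: reported */
	assert(t.exit_events == 1);
}
```

The point of the model is the third assertion: because the rejection is
recorded in its own bit, the audit bit survives, and the following
syscall is reported as usual. Had the rejection path cleared the audit
bit instead, nothing in this model would set it again.
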
 include/linux/entry-common.h | 9 ++++++++-
 include/linux/thread_info.h  | 2 ++
 kernel/seccomp.c             | 4 ++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 535da46c3ee9..403802eed387 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -34,7 +34,8 @@
 				 SYSCALL_WORK_SYSCALL_TRACE | \
 				 SYSCALL_WORK_SYSCALL_AUDIT | \
 				 SYSCALL_WORK_SYSCALL_USER_DISPATCH | \
-				 SYSCALL_WORK_SYSCALL_EXIT_TRAP)
+				 SYSCALL_WORK_SYSCALL_EXIT_TRAP | \
+				 SYSCALL_WORK_SECCOMP_EXIT)
 
 /**
  * arch_ptrace_report_syscall_entry - Architecture specific ptrace_report_syscall_entry() wrapper
@@ -235,6 +236,12 @@ static __always_inline void syscall_exit_work(struct pt_regs *regs, unsigned lon
 		}
 	}
 
+	if (work & SYSCALL_WORK_SECCOMP_EXIT) {
+		/* Rejected by seccomp, no valid syscall exit state */
+		clear_syscall_work(SECCOMP_EXIT);
+		return;
+	}
+
 	audit_syscall_exit(regs);
 
 	if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 051e42902690..167c850ae16e 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -40,6 +40,7 @@ enum {
 #ifdef CONFIG_GENERIC_ENTRY
 enum syscall_work_bit {
 	SYSCALL_WORK_BIT_SECCOMP,
+	SYSCALL_WORK_BIT_SECCOMP_EXIT,
 	SYSCALL_WORK_BIT_SYSCALL_TRACEPOINT,
 	SYSCALL_WORK_BIT_SYSCALL_TRACE,
 	SYSCALL_WORK_BIT_SYSCALL_EMU,
@@ -50,6 +51,7 @@ enum syscall_work_bit {
 };
 
 #define SYSCALL_WORK_SECCOMP BIT(SYSCALL_WORK_BIT_SECCOMP)
+#define SYSCALL_WORK_SECCOMP_EXIT BIT(SYSCALL_WORK_BIT_SECCOMP_EXIT)
 #define SYSCALL_WORK_SYSCALL_TRACEPOINT BIT(SYSCALL_WORK_BIT_SYSCALL_TRACEPOINT)
 #define SYSCALL_WORK_SYSCALL_TRACE BIT(SYSCALL_WORK_BIT_SYSCALL_TRACE)
 #define SYSCALL_WORK_SYSCALL_EMU BIT(SYSCALL_WORK_BIT_SYSCALL_EMU)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index cb8dd78791cd..35703dceb6d2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1262,6 +1262,10 @@ static void seccomp_nack_syscall(int this_syscall, int data, bool force_coredump
 	syscall_rollback(current, current_pt_regs());
 	/* Let the filter pass back 16 bits of data. */
 	force_sig_seccomp(this_syscall, data, force_coredump);
+#ifdef CONFIG_GENERIC_ENTRY
+	/* No valid syscall exit state after syscall_rollback() */
+	set_syscall_work(SECCOMP_EXIT);
+#endif
 }
 
 static int __seccomp_filter(int this_syscall, const bool recheck_after_trace)
--
2.52.0