From: Oleg Nesterov <oleg@redhat.com>
To: Andy Lutomirski <luto@kernel.org>, Kees Cook <kees@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@kernel.org>, Will Drewry <wad@chromium.org>
Cc: Eric Paris <eparis@redhat.com>,
	Kusaram Devineni <kusaram@devineni.in>,
	Max Ver <dudududumaxver@gmail.com>,
	Paul Moore <paul@paul-moore.com>,
	audit@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/2] seccomp: drop syscall exit events for rejected syscalls
Date: Sun, 19 Apr 2026 17:53:25 +0200
Message-ID: <aeT6dasGDZYPMQ_h@redhat.com>
In-Reply-To: <aeT6T7ZJ45yAtdZs@redhat.com>

seccomp_nack_syscall() calls syscall_rollback(), which means that the
syscall exit path sees the original syscall number as the return value.

This confuses audit_syscall_exit(), trace_syscall_exit(), and ptrace,
causing them to report completely bogus syscall exit events.

Add a new SYSCALL_WORK_SECCOMP_EXIT flag set by seccomp_nack_syscall(),
and change syscall_exit_work() to return early if this flag is set. After
all, this syscall was never actually executed.

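Note that syscall_exit_work() has to clear SYSCALL_WORK_SECCOMP_EXIT itself:
in the !force_coredump case the task can survive the signal and issue further
syscalls, whose exit events must still be reported. That is also why we
actually need the new flag: seccomp_nack_syscall() can't just clear
SYSCALL_WORK_SYSCALL_AUDIT/TRACEPOINT/TRACE, because those bits would then
have to be restored for the next syscall.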
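
For illustration only, the one-shot behaviour of the new flag can be modelled
in plain userspace C. This is a sketch, not kernel code: struct task_model,
the WORK_* bits and the model_*() helpers are hypothetical stand-ins for
thread_info::syscall_work, seccomp_nack_syscall() and syscall_exit_work(),
and auditing is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for two of the per-task syscall_work bits. */
enum {
	WORK_AUDIT        = 1u << 0,	/* persistent: set while auditing is on */
	WORK_SECCOMP_EXIT = 1u << 1,	/* one-shot: set when seccomp rejects */
};

struct task_model {
	unsigned int work;	/* models thread_info::syscall_work */
	int exit_events;	/* exit events reported so far */
};

/* Models seccomp_nack_syscall(): the syscall is rejected and rolled back. */
static void model_nack_syscall(struct task_model *t)
{
	t->work |= WORK_SECCOMP_EXIT;
}

/* Models syscall_exit_work(); returns true if an exit event was reported. */
static bool model_exit_work(struct task_model *t)
{
	if (t->work & WORK_SECCOMP_EXIT) {
		/* Rejected by seccomp: consume the flag, report nothing. */
		t->work &= ~WORK_SECCOMP_EXIT;
		return false;
	}
	if (t->work & WORK_AUDIT) {
		t->exit_events++;
		return true;
	}
	return false;
}

/* One rejected syscall followed by one normal syscall on an audited task. */
static int model_demo(void)
{
	struct task_model t = { .work = WORK_AUDIT, .exit_events = 0 };

	model_nack_syscall(&t);
	assert(!model_exit_work(&t));	/* rejected: exit event suppressed */
	assert(model_exit_work(&t));	/* next syscall: reported as usual */
	return t.exit_events;
}
```

In this model the persistent WORK_AUDIT bit is never touched, so the syscall
following the rejected one is reported normally; only the one-shot bit is set
and consumed.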

Reported-by: Max Ver <dudududumaxver@gmail.com>
Closes: https://lore.kernel.org/all/CABjJbFJO+p3jA1r0gjUZrCepQb1Fab3kqxYhc_PSfoqo21ypeQ@mail.gmail.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/entry-common.h | 9 ++++++++-
 include/linux/thread_info.h  | 2 ++
 kernel/seccomp.c             | 4 ++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 535da46c3ee9..403802eed387 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -34,7 +34,8 @@
 				 SYSCALL_WORK_SYSCALL_TRACE |		\
 				 SYSCALL_WORK_SYSCALL_AUDIT |		\
 				 SYSCALL_WORK_SYSCALL_USER_DISPATCH |	\
-				 SYSCALL_WORK_SYSCALL_EXIT_TRAP)
+				 SYSCALL_WORK_SYSCALL_EXIT_TRAP |	\
+				 SYSCALL_WORK_SECCOMP_EXIT)
 
 /**
  * arch_ptrace_report_syscall_entry - Architecture specific ptrace_report_syscall_entry() wrapper
@@ -235,6 +236,12 @@ static __always_inline void syscall_exit_work(struct pt_regs *regs, unsigned lon
 		}
 	}
 
+	if (work & SYSCALL_WORK_SECCOMP_EXIT) {
+		/* Rejected by seccomp, no valid syscall exit state */
+		clear_syscall_work(SECCOMP_EXIT);
+		return;
+	}
+
 	audit_syscall_exit(regs);
 
 	if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 051e42902690..167c850ae16e 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -40,6 +40,7 @@ enum {
 #ifdef CONFIG_GENERIC_ENTRY
 enum syscall_work_bit {
 	SYSCALL_WORK_BIT_SECCOMP,
+	SYSCALL_WORK_BIT_SECCOMP_EXIT,
 	SYSCALL_WORK_BIT_SYSCALL_TRACEPOINT,
 	SYSCALL_WORK_BIT_SYSCALL_TRACE,
 	SYSCALL_WORK_BIT_SYSCALL_EMU,
@@ -50,6 +51,7 @@ enum syscall_work_bit {
 };
 
 #define SYSCALL_WORK_SECCOMP			BIT(SYSCALL_WORK_BIT_SECCOMP)
+#define SYSCALL_WORK_SECCOMP_EXIT		BIT(SYSCALL_WORK_BIT_SECCOMP_EXIT)
 #define SYSCALL_WORK_SYSCALL_TRACEPOINT		BIT(SYSCALL_WORK_BIT_SYSCALL_TRACEPOINT)
 #define SYSCALL_WORK_SYSCALL_TRACE		BIT(SYSCALL_WORK_BIT_SYSCALL_TRACE)
 #define SYSCALL_WORK_SYSCALL_EMU		BIT(SYSCALL_WORK_BIT_SYSCALL_EMU)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index cb8dd78791cd..35703dceb6d2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1262,6 +1262,10 @@ static void seccomp_nack_syscall(int this_syscall, int data, bool force_coredump
 	syscall_rollback(current, current_pt_regs());
 	/* Let the filter pass back 16 bits of data. */
 	force_sig_seccomp(this_syscall, data, force_coredump);
+#ifdef CONFIG_GENERIC_ENTRY
+	/* No valid syscall exit state after syscall_rollback() */
+	set_syscall_work(SECCOMP_EXIT);
+#endif
 }
 
 static int __seccomp_filter(int this_syscall, const bool recheck_after_trace)
-- 
2.52.0


Thread overview: 5+ messages
2026-04-19 15:52 [RFC PATCH 0/2] seccomp: drop syscall exit events for rejected syscalls Oleg Nesterov
2026-04-19 15:53 ` [RFC PATCH 1/2] seccomp: introduce seccomp_nack_syscall() helper Oleg Nesterov
2026-04-19 15:53 ` Oleg Nesterov [this message]
2026-04-21 16:52   ` [RFC PATCH 2/2] seccomp: drop syscall exit events for rejected syscalls Kees Cook
2026-04-21 18:59     ` Oleg Nesterov
