linux-api.vger.kernel.org archive mirror
From: Kees Cook <kees@kernel.org>
To: Eyal Birger <eyal.birger@gmail.com>
Cc: luto@amacapital.net, wad@chromium.org, oleg@redhat.com,
	mhiramat@kernel.org, andrii@kernel.org, jolsa@kernel.org,
	alexei.starovoitov@gmail.com, olsajiri@gmail.com,
	cyphar@cyphar.com, songliubraving@fb.com, yhs@fb.com,
	john.fastabend@gmail.com, peterz@infradead.org,
	tglx@linutronix.de, bp@alien8.de, daniel@iogearbox.net,
	ast@kernel.org, andrii.nakryiko@gmail.com, rostedt@goodmis.org,
	rafi@rbk.io, shmulik.ladkani@gmail.com, bpf@vger.kernel.org,
	linux-api@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v3 1/2] seccomp: passthrough uretprobe systemcall without filtering
Date: Thu, 6 Feb 2025 13:20:45 -0800
Message-ID: <202502061320.07B459A@keescook>
In-Reply-To: <20250202162921.335813-2-eyal.birger@gmail.com>

On Sun, Feb 02, 2025 at 08:29:20AM -0800, Eyal Birger wrote:
> When attaching uretprobes to processes running inside docker, the attached
> process is segfaulted when encountering the retprobe.
> 
> The reason is that now that uretprobe is a system call the default seccomp
> filters in docker block it as they only allow a specific set of known
> syscalls. This is true for other userspace applications which use seccomp
> to control their syscall surface.
> 
> Since uretprobe is a "kernel implementation detail" system call which is
> not used by userspace application code directly, it is impractical and
> there's very little point in forcing all userspace applications to
> explicitly allow it in order to avoid crashing tracked processes.
> 
> Pass this systemcall through seccomp without depending on configuration.
> 
> Note: uretprobe isn't supported in i386 and __NR_ia32_rt_tgsigqueueinfo
> uses the same number as __NR_uretprobe so the syscall isn't forced in the
> compat bitmap.
> 
> Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe")
> Reported-by: Rafael Buchbinder <rafi@rbk.io>
> Link: https://lore.kernel.org/lkml/CAHsH6Gs3Eh8DFU0wq58c_LF8A4_+o6z456J7BidmcVY2AqOnHQ@mail.gmail.com/
> Link: https://lore.kernel.org/lkml/20250121182939.33d05470@gandalf.local.home/T/#me2676c378eff2d6a33f3054fed4a5f3afa64e65b
> Link: https://lore.kernel.org/lkml/20250128145806.1849977-1-eyal.birger@gmail.com/
> Cc: stable@vger.kernel.org
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
> v3: no change - deferring 32bit compat handling as there aren't plans to
>     support this syscall in compat mode.
> v2: use action_cache bitmap and mode1 array to check the syscall
> ---
>  kernel/seccomp.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f59381c4a2ff..09b6f8e6db51 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -734,13 +734,13 @@ seccomp_prepare_user_filter(const char __user *user_filter)
>  
>  #ifdef SECCOMP_ARCH_NATIVE
>  /**
> - * seccomp_is_const_allow - check if filter is constant allow with given data
> + * seccomp_is_filter_const_allow - check if filter is constant allow with given data
>   * @fprog: The BPF programs
>   * @sd: The seccomp data to check against, only syscall number and arch
>   *      number are considered constant.
>   */
> -static bool seccomp_is_const_allow(struct sock_fprog_kern *fprog,
> -				   struct seccomp_data *sd)
> +static bool seccomp_is_filter_const_allow(struct sock_fprog_kern *fprog,
> +					  struct seccomp_data *sd)
>  {
>  	unsigned int reg_value = 0;
>  	unsigned int pc;
> @@ -812,6 +812,21 @@ static bool seccomp_is_const_allow(struct sock_fprog_kern *fprog,
>  	return false;
>  }
>  
> +static bool seccomp_is_const_allow(struct sock_fprog_kern *fprog,
> +				   struct seccomp_data *sd)
> +{
> +#ifdef __NR_uretprobe
> +	if (sd->nr == __NR_uretprobe
> +#ifdef SECCOMP_ARCH_COMPAT
> +	    && sd->arch != SECCOMP_ARCH_COMPAT
> +#endif
> +	   )
> +		return true;
> +#endif
> +
> +	return seccomp_is_filter_const_allow(fprog, sd);
> +}
> +
>  static void seccomp_cache_prepare_bitmap(struct seccomp_filter *sfilter,
>  					 void *bitmap, const void *bitmap_prev,
>  					 size_t bitmap_size, int arch)

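I minimized the above to a single check placed directly in
seccomp_is_const_allow(), which avoids the rename and the wrapper: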

@@ -749,6 +749,15 @@ static bool seccomp_is_const_allow(struct sock_fprog_kern *fprog,
 	if (WARN_ON_ONCE(!fprog))
 		return false;
 
+	/* Our single exception to filtering. */
+#ifdef __NR_uretprobe
+#ifdef SECCOMP_ARCH_COMPAT
+	if (sd->arch == SECCOMP_ARCH_NATIVE)
+#endif
+		if (sd->nr == __NR_uretprobe)
+			return true;
+#endif
+
 	for (pc = 0; pc < fprog->len; pc++) {
 		struct sock_filter *insn = &fprog->filter[pc];
 		u16 code = insn->code;
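
To make the effect of the nested #ifdef/if pair easier to see, here is a
userspace sketch of the same decision with the preprocessor guards
resolved as they would be on x86-64. Everything prefixed demo_/DEMO_ is
a hypothetical stand-in and not a kernel symbol; the constants are the
x86-64 values (AUDIT_ARCH_X86_64, AUDIT_ARCH_I386, and syscall number
335) hard-coded as assumptions for illustration.

```c
#include <stdbool.h>

/* Assumed x86-64 values, hard-coded for illustration only. */
#define DEMO_ARCH_NATIVE  0xc000003eU /* AUDIT_ARCH_X86_64 */
#define DEMO_ARCH_COMPAT  0x40000003U /* AUDIT_ARCH_I386 */
#define DEMO_NR_URETPROBE 335         /* __NR_uretprobe on x86-64 */

/* Hypothetical subset of struct seccomp_data: only the constant fields. */
struct demo_seccomp_data {
	int nr;            /* syscall number */
	unsigned int arch; /* AUDIT_ARCH_* value */
};

/*
 * Mirrors the check above when both __NR_uretprobe and
 * SECCOMP_ARCH_COMPAT are defined: the exception applies only to the
 * native arch, because the same number means a different syscall in
 * the compat table.
 */
static bool demo_is_uretprobe_exception(const struct demo_seccomp_data *sd)
{
	if (sd->arch == DEMO_ARCH_NATIVE)
		if (sd->nr == DEMO_NR_URETPROBE)
			return true;
	return false;
}
```

With these assumed values, { 335, DEMO_ARCH_NATIVE } is treated as the
exception, while { 335, DEMO_ARCH_COMPAT } is not and would still go
through the normal filter evaluation.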

> @@ -1023,6 +1038,9 @@ static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
>   */
>  static const int mode1_syscalls[] = {
>  	__NR_seccomp_read, __NR_seccomp_write, __NR_seccomp_exit, __NR_seccomp_sigreturn,
> +#ifdef __NR_uretprobe
> +	__NR_uretprobe,
> +#endif
>  	-1, /* negative terminated */
>  };
>  
> -- 
> 2.43.0
> 
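
For readers less familiar with strict mode, the mode1_syscalls hunk
relies on the array being terminated by a negative entry. A minimal
userspace sketch of such a lookup follows; it is not the kernel's actual
loop, the demo_ names are hypothetical, and the numbers are the x86-64
values for read, write, exit, rt_sigreturn and uretprobe, assumed here
for illustration.

```c
#include <stdbool.h>

/* Assumed x86-64 numbers: read, write, exit, rt_sigreturn, uretprobe. */
static const int demo_mode1_syscalls[] = {
	0, 1, 60, 15,
	335,
	-1, /* negative terminated */
};

/* Walk the list until the negative terminator; allow only exact matches. */
static bool demo_mode1_allows(int nr)
{
	const int *allowed;

	for (allowed = demo_mode1_syscalls; *allowed != -1; allowed++)
		if (*allowed == nr)
			return true;
	return false;
}
```

Because the walk stops at the terminator, adding an entry before the -1
is all that is needed to extend the allowed set, which is what the hunk
above does.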

-Kees

-- 
Kees Cook
