From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
linux-api@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
bpf@vger.kernel.org, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
x86@kernel.org
Subject: Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe
Date: Fri, 5 Apr 2024 01:06:03 +0900 [thread overview]
Message-ID: <20240405010603.58bd96cdf56d456103844e65@kernel.org> (raw)
In-Reply-To: <Zg6V8y2-OP_9at2l@krava>
On Thu, 4 Apr 2024 13:58:43 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Wed, Apr 03, 2024 at 07:00:07PM -0700, Andrii Nakryiko wrote:
>
> SNIP
>
> > Check rt_sigreturn syscall (manpage at [0], for example).
> >
> > sigreturn() exists only to allow the implementation of signal
> > handlers. It should never be called directly. (Indeed, a simple
> > sigreturn() wrapper in the GNU C library simply returns -1, with
> > errno set to ENOSYS.) Details of the arguments (if any) passed
> > to sigreturn() vary depending on the architecture. (On some
> > architectures, such as x86-64, sigreturn() takes no arguments,
> > since all of the information that it requires is available in the
> > stack frame that was previously created by the kernel on the
> > user-space stack.)
> >
> > This is a very similar use case. Also, check its source code in
> > arch/x86/kernel/signal_64.c. It sends SIGSEGV to the calling process
> > on any sign of something not being right. It's exactly the same with
> > sys_uretprobe.
> >
> > [0] https://man7.org/linux/man-pages/man2/sigreturn.2.html
> >
> > > And the number of syscalls are limited resource.
> >
> > We have almost 500 of them, it didn't seems like adding 1-2 for good
> > reasons would be a problem. Can you please point to where the limits
> > on syscalls as a resource are described? I'm curious to learn.
> >
> > >
> > > I'm actually not sure how much we need to care of it, but adding a new
> > > syscall is worth to be discussed carefully because all of them are
> > > user-space compatibility.
> >
> > Absolutely, it's a good discussion to have.
> >
> > >
> > > > > > > Also, we should run syzkaller on this syscall. And if uretprobe is
> > > > > >
> > > > > > right, I'll check on syzkaller
> > > > > >
> > > > > > > set in the user function, what happen if the user function directly
> > > > > > > calls this syscall? (maybe it consumes shadow stack?)
> > > > > >
> > > > > > the process should receive SIGILL if there's no pending uretprobe for
> > > > > > the current task, or it will trigger uretprobe if there's one pending
> > > > >
> > > > > No, that is too aggressive and not safe. Since the syscall is exposed to
> > > > > user program, it should return appropriate error code instead of SIGILL.
> > > > >
> > > >
> > > > This is the way it is today with uretprobes even through interrupt.
> > >
> > > I doubt that the interrupt (exception) and syscall should be handled
> > > differently. Especially, this exception is injected by uprobes but
> > > syscall will be caused by itself. But syscall can be called from user
> > > program (of couse this works as sys_kill(self, SIGILL)).
> >
> > Yep, I'd keep the behavior the same between uretprobes implemented
> > through int3 and sys_uretprobe.
>
> +1
>
> >
> > >
> > > > E.g., it could happen that user process is using fibers and is
> > > > replacing stack pointer without kernel realizing this, which will
> > > > trigger some defensive checks in uretprobe handling code and kernel
> > > > will send SIGILL because it can't support such cases. This is
> > > > happening today already, and it works fine in practice (except for
> > > > applications that manually change stack pointer, too bad, you can't
> > > > trace them with uretprobes, unfortunately).
> > >
> > > OK, we at least need to document it.
> >
> > +1, yep
> >
> > >
> > > >
> > > > So I think it's absolutely adequate to have this behavior if the user
> > > > process is *intentionally* abusing this API.
> > >
> > > Of course user expected that it is abusing. So at least we need to
> > > add a document that this syscall number is reserved to uprobes and
> > > user program must not use it.
> > >
> >
> > Totally agree about documenting this.
>
> ok there's map page on sigreturn.. do you think we should add man page
> for uretprobe or you can think of some other place to document it?
I think it is better to have a man-page. Anyway, to discuss and explain
this syscall, the man-page is a good format to describe it.
>
> >
> > > >
> > > > > >
> > > > > > but we could limit the syscall to be executed just from the trampoline,
> > > > > > that should prevent all the user space use cases, I'll do that in next
> > > > > > version and add more tests for that
> > > > >
> > > > > Why not limit? :) The uprobe_handle_trampoline() expects it is called
> > > > > only from the trampoline, so it is natural to check the caller address.
> > > > > (and uprobe should know where is the trampoline)
> > > > >
> > > > > Since the syscall is always exposed to the user program, it should
> > > > > - Do nothing and return an error unless it is properly called.
> > > > > - check the prerequisites for operation strictly.
> > > > > I concern that new system calls introduce vulnerabilities.
> > > > >
> > > >
> > > > As Oleg and Jiri mentioned, this syscall can't harm kernel or other
> > > > processes, only the process that is abusing the API. So any extra
> > > > checks that would slow down this approach is an unnecessary overhead
> > > > and complication that will never be useful in practice.
> > >
> > > I think at least it should check the caller address to ensure the
> > > address is in the trampoline.
> > > But anyway, uprobes itself can break the target process, so no one
> > > might care if this system call breaks the process now.
> >
> > If we already have an expected range of addresses, then I think it's
> > fine to do a quick unlikely() check. I'd be more concerned if we need
> > to do another lookup or search to just validate this. I'm sure Jiri
> > will figure it out.
>
> Oleg mentioned the trampoline address check could race with another
> thread's mremap call, however trap is using that check as well, it
> still seems like good idea to do it also in the uretprobe syscall
Yeah, and also, can we add a stack pointer check if the trampoline is
shared with other probe points?
Thank you,
>
> jirka
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
prev parent reply other threads:[~2024-04-04 16:06 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20240402093302.2416467-1-jolsa@kernel.org>
[not found] ` <20240402093302.2416467-2-jolsa@kernel.org>
[not found] ` <20240403100708.233575a8ac2a5bac2192d180@kernel.org>
[not found] ` <Zg0lvUIB4WdRUGw_@krava>
[not found] ` <20240403230937.c3bd47ee47c102cd89713ee8@kernel.org>
[not found] ` <CAEf4BzZ2RFfz8PNgJ4ENZ0us4uX=DWhYFimXdtWms-VvGXOjgQ@mail.gmail.com>
[not found] ` <20240404095829.ec5db177f29cd29e849169fa@kernel.org>
[not found] ` <CAEf4BzYH60TwvBipHWB_kUqZZ6D-iUVnnFsBv06imRikK3o-bg@mail.gmail.com>
2024-04-04 15:54 ` [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe Masami Hiramatsu
2024-04-04 16:11 ` Oleg Nesterov
2024-04-05 1:22 ` Masami Hiramatsu
2024-04-05 8:56 ` Jiri Olsa
2024-04-05 11:02 ` Oleg Nesterov
2024-04-06 3:05 ` Masami Hiramatsu
2024-04-06 17:55 ` Oleg Nesterov
2024-04-08 3:54 ` Masami Hiramatsu
2024-04-08 16:02 ` Jiri Olsa
2024-04-08 16:22 ` Oleg Nesterov
2024-04-09 12:06 ` Jiri Olsa
2024-04-09 0:34 ` Masami Hiramatsu
2024-04-09 7:57 ` Jiri Olsa
2024-04-08 3:16 ` Masami Hiramatsu
[not found] ` <Zg6V8y2-OP_9at2l@krava>
2024-04-04 16:06 ` Masami Hiramatsu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240405010603.58bd96cdf56d456103844e65@kernel.org \
--to=mhiramat@kernel.org \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bp@alien8.de \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=olsajiri@gmail.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).