From: Jiri Olsa <olsajiri@gmail.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "olsajiri@gmail.com" <olsajiri@gmail.com>,
"songliubraving@fb.com" <songliubraving@fb.com>,
"luto@kernel.org" <luto@kernel.org>,
"mhiramat@kernel.org" <mhiramat@kernel.org>,
"andrii@kernel.org" <andrii@kernel.org>,
"debug@rivosinc.com" <debug@rivosinc.com>,
"john.fastabend@gmail.com" <john.fastabend@gmail.com>,
"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"ast@kernel.org" <ast@kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"linux-man@vger.kernel.org" <linux-man@vger.kernel.org>,
"oleg@redhat.com" <oleg@redhat.com>, "yhs@fb.com" <yhs@fb.com>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"peterz@infradead.org" <peterz@infradead.org>,
"linux-trace-kernel@vger.kernel.org"
<linux-trace-kernel@vger.kernel.org>,
"bp@alien8.de" <bp@alien8.de>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"broonie@kernel.org" <broonie@kernel.org>
Subject: Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe
Date: Mon, 6 May 2024 12:56:19 +0200 [thread overview]
Message-ID: <Zji3U131RJtQDdA_@krava> (raw)
In-Reply-To: <d2e0e53581e26358ee0b3d188a07795878938d2f.camel@intel.com>
On Fri, May 03, 2024 at 08:35:24PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2024-05-03 at 22:17 +0200, Jiri Olsa wrote:
> > when uretprobe is created, kernel overwrites the return address on user
> > stack to point to user space trampoline, so the setup is in kernel hands
>
> I mean for uprobes in general. I'm didn't have any specific ideas in mind, but
> in general when we give the kernel more abilities around shadow stack we have to
> think if attackers could use it to work around shadow stack protections.
>
> >
> > with the hack below on top of this patchset I'm no longer seeing shadow
> > stack app crash on uretprobe.. I'll try to polish it and send out next
> > week, any suggestions are welcome ;-)
>
> Thanks. Some comments below.
>
> >
> > thanks,
> > jirka
> >
> >
> > ---
> > diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
> > index 42fee8959df7..d374305a6851 100644
> > --- a/arch/x86/include/asm/shstk.h
> > +++ b/arch/x86/include/asm/shstk.h
> > @@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct
> > *p, unsigned long clon
> > void shstk_free(struct task_struct *p);
> > int setup_signal_shadow_stack(struct ksignal *ksig);
> > int restore_signal_shadow_stack(void);
> > +void uprobe_change_stack(unsigned long addr);
> > +void uprobe_push_stack(unsigned long addr);
>
> Maybe name them:
> shstk_update_last_frame();
> shstk_push_frame();
ok
>
>
> > #else
> > static inline long shstk_prctl(struct task_struct *task, int option,
> > unsigned long arg2) { return -EINVAL; }
> > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
> > index 59e15dd8d0f8..804c446231d9 100644
> > --- a/arch/x86/kernel/shstk.c
> > +++ b/arch/x86/kernel/shstk.c
> > @@ -577,3 +577,24 @@ long shstk_prctl(struct task_struct *task, int option,
> > unsigned long arg2)
> > return wrss_control(true);
> > return -EINVAL;
> > }
> > +
> > +void uprobe_change_stack(unsigned long addr)
> > +{
> > + unsigned long ssp;
>
> Probably want something like:
>
> if (!features_enabled(ARCH_SHSTK_SHSTK))
> return;
ok
>
> So this doesn't try the below if shadow stack is disabled.
>
> > +
> > + ssp = get_user_shstk_addr();
> > + write_user_shstk_64((u64 __user *)ssp, (u64)addr);
> > +}
>
> Can we know that there was a valid return address just before this point on the
> stack? Or could it be a sigframe or something?
when uprobe hijack the return address it assumes it's on the top of the stack,
so it's saved and replaced with address of the user space trampoline
>
> > +
> > +void uprobe_push_stack(unsigned long addr)
> > +{
> > + unsigned long ssp;
>
> if (!features_enabled(ARCH_SHSTK_SHSTK))
> return;
>
> > +
> > + ssp = get_user_shstk_addr();
> > + ssp -= SS_FRAME_SIZE;
> > + write_user_shstk_64((u64 __user *)ssp, (u64)addr);
> > +
> > + fpregs_lock_and_load();
> > + wrmsrl(MSR_IA32_PL3_SSP, ssp);
> > + fpregs_unlock();
> > +}
> > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > index 81e6ee95784d..259457838020 100644
> > --- a/arch/x86/kernel/uprobes.c
> > +++ b/arch/x86/kernel/uprobes.c
> > @@ -416,6 +416,7 @@ SYSCALL_DEFINE0(uretprobe)
> > regs->r11 = regs->flags;
> > regs->cx = regs->ip;
> >
> > + uprobe_push_stack(r11_cx_ax[2]);
>
> I'm concerned this could be used to push arbitrary frames to the shadow stack.
> Couldn't an attacker do a jump to the point that calls this syscall? Maybe this
> is what peterz was raising.
of course never say never, but here's my reasoning why I think it's ok
the page with the syscall trampoline is mapped in user space and can be
found in procfs maps file under '[uprobes]' name
the syscall can be called only from this trampoline, if it's called from
anywhere else the calling process receives SIGILL
now if you run the uretprobe syscall without any pending uretprobe for
the task it will receive SIGILL before it gets to the point of pushing
address on the shadow stack
and to configure the uretprobe you need to have CAP_PERFMON or CAP_SYS_ADMIN
if you'd actually managed to get the pending uretprobe instance, the shadow
stack entry is going to be used/pop-ed right away in the trampoline with
the ret instruction
and as I mentioned above it's ensured that the syscall is returning to the
trampoline and it can't be called from any other place
>
> > return regs->ax;
> >
> > sigill:
> > @@ -1191,8 +1192,10 @@ arch_uretprobe_hijack_return_addr(unsigned long
> > trampoline_vaddr, struct pt_regs
> > return orig_ret_vaddr;
> >
> > nleft = copy_to_user((void __user *)regs->sp, &trampoline_vaddr,
> > rasize);
> > - if (likely(!nleft))
> > + if (likely(!nleft)) {
> > + uprobe_change_stack(trampoline_vaddr);
> > return orig_ret_vaddr;
> > + }
> >
> > if (nleft != rasize) {
> > pr_err("return address clobbered: pid=%d, %%sp=%#lx,
> > %%ip=%#lx\n",
>
I'll try to add uprobe test under tools/testing/selftests/x86/test_shadow_stack.c
and send that and change below as part of new version
thanks for the comments,
jirka
---
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 42fee8959df7..2e1ddcf98242 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clon
void shstk_free(struct task_struct *p);
int setup_signal_shadow_stack(struct ksignal *ksig);
int restore_signal_shadow_stack(void);
+int shstk_update_last_frame(unsigned long val);
+int shstk_push_frame(unsigned long val);
#else
static inline long shstk_prctl(struct task_struct *task, int option,
unsigned long arg2) { return -EINVAL; }
@@ -31,6 +33,8 @@ static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p,
static inline void shstk_free(struct task_struct *p) {}
static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
static inline int restore_signal_shadow_stack(void) { return 0; }
+static inline int shstk_update_last_frame(unsigned long val) { return 0; }
+static inline int shstk_push_frame(unsigned long val) { return 0; }
#endif /* CONFIG_X86_USER_SHADOW_STACK */
#endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 59e15dd8d0f8..66434dfde52e 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -577,3 +577,32 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2)
return wrss_control(true);
return -EINVAL;
}
+
+int shstk_update_last_frame(unsigned long val)
+{
+ unsigned long ssp;
+
+ if (!features_enabled(ARCH_SHSTK_SHSTK))
+ return 0;
+
+ ssp = get_user_shstk_addr();
+ return write_user_shstk_64((u64 __user *)ssp, (u64)val);
+}
+
+int shstk_push_frame(unsigned long val)
+{
+ unsigned long ssp;
+
+ if (!features_enabled(ARCH_SHSTK_SHSTK))
+ return 0;
+
+ ssp = get_user_shstk_addr();
+ ssp -= SS_FRAME_SIZE;
+ if (write_user_shstk_64((u64 __user *)ssp, (u64)val))
+ return -EFAULT;
+
+ fpregs_lock_and_load();
+ wrmsrl(MSR_IA32_PL3_SSP, ssp);
+ fpregs_unlock();
+ return 0;
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 81e6ee95784d..ae6c3458a675 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -406,6 +406,11 @@ SYSCALL_DEFINE0(uretprobe)
* trampoline's ret instruction
*/
r11_cx_ax[2] = regs->ip;
+
+ /* make the shadow stack follow that */
+ if (shstk_push_frame(regs->ip))
+ goto sigill;
+
regs->ip = ip;
err = copy_to_user((void __user *)regs->sp, r11_cx_ax, sizeof(r11_cx_ax));
@@ -1191,8 +1196,13 @@ arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs
return orig_ret_vaddr;
nleft = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, rasize);
- if (likely(!nleft))
+ if (likely(!nleft)) {
+ if (shstk_update_last_frame(trampoline_vaddr)) {
+ force_sig(SIGSEGV);
+ return -1;
+ }
return orig_ret_vaddr;
+ }
if (nleft != rasize) {
pr_err("return address clobbered: pid=%d, %%sp=%#lx, %%ip=%#lx\n",
next prev parent reply other threads:[~2024-05-06 10:56 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-02 12:23 [PATCHv4 bpf-next 0/7] uprobe: uretprobe speed up Jiri Olsa
2024-05-02 12:23 ` [PATCHv4 bpf-next 1/7] uprobe: Wire up uretprobe system call Jiri Olsa
2024-05-02 12:23 ` [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up return probe Jiri Olsa
2024-05-03 11:34 ` Peter Zijlstra
2024-05-03 13:04 ` Jiri Olsa
2024-05-03 15:53 ` Edgecombe, Rick P
2024-05-03 19:18 ` Jiri Olsa
2024-05-03 19:38 ` Edgecombe, Rick P
2024-05-03 20:17 ` Jiri Olsa
2024-05-03 20:35 ` Edgecombe, Rick P
2024-05-06 10:56 ` Jiri Olsa [this message]
2024-05-03 23:01 ` Deepak Gupta
2024-05-02 12:23 ` [PATCHv4 bpf-next 3/7] selftests/bpf: Add uretprobe syscall test for regs integrity Jiri Olsa
2024-05-02 12:23 ` [PATCHv4 bpf-next 4/7] selftests/bpf: Add uretprobe syscall test for regs changes Jiri Olsa
2024-05-02 12:23 ` [PATCHv4 bpf-next 5/7] selftests/bpf: Add uretprobe syscall call from user space test Jiri Olsa
2024-05-02 16:33 ` Andrii Nakryiko
2024-05-02 12:23 ` [PATCHv4 bpf-next 6/7] selftests/bpf: Add uretprobe compat test Jiri Olsa
2024-05-02 16:35 ` Andrii Nakryiko
2024-05-02 12:23 ` [PATCHv4 7/7] man2: Add uretprobe syscall page Jiri Olsa
2024-05-02 13:43 ` Alejandro Colomar
2024-05-02 20:13 ` Jiri Olsa
2024-05-02 22:06 ` Alejandro Colomar
2024-05-02 16:43 ` [PATCHv4 bpf-next 0/7] uprobe: uretprobe speed up Andrii Nakryiko
2024-05-02 20:04 ` Jiri Olsa
2024-05-03 18:03 ` Andrii Nakryiko
2024-05-03 20:39 ` Jiri Olsa
2024-05-07 7:47 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zji3U131RJtQDdA_@krava \
--to=olsajiri@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bp@alien8.de \
--cc=bpf@vger.kernel.org \
--cc=broonie@kernel.org \
--cc=daniel@iogearbox.net \
--cc=debug@rivosinc.com \
--cc=john.fastabend@gmail.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-man@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).