* [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame @ 2018-06-26 21:16 Mathieu Desnoyers 2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers 0 siblings, 1 reply; 5+ messages in thread From: Mathieu Desnoyers @ 2018-06-26 21:16 UTC (permalink / raw) To: Andy Lutomirski, Thomas Gleixner Cc: linux-kernel, Mathieu Desnoyers, Joel Fernandes, Peter Zijlstra, Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen, H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter, Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng, Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch x86 is moving from is_compat_task() to in_compat_syscall(). However, in_compat_syscall cannot be used to check whether a signal is being delivered on a compat task. Introduce is_compat_frame to allow performing this check from architecture agnostic code. On all architectures except x86, it invokes is_compat_task(). On x86, it uses is_ia32_frame() and is_x32_frame() to check whether the signal frame is 32-bit. This is needed by restartable sequences to detect whether it needs to clear the top bits of the start_ip, abort_ip, and post_commit_offset rseq_cs fields on signal delivery, thus ensuring identical behavior for a 32-bit binary executed on 32-bit and 64-bit kernels. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Andrew Hunter <ahh@google.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> --- arch/x86/include/asm/compat.h | 24 ++++++++++++++++++++++++ arch/x86/kernel/signal.c | 17 ----------------- include/linux/compat.h | 17 +++++++++++++++++ 3 files changed, 41 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h index fb97cf7c4137..1405a8df5215 100644 --- a/arch/x86/include/asm/compat.h +++ b/arch/x86/include/asm/compat.h @@ -8,6 +8,7 @@ #include <linux/types.h> #include <linux/sched.h> #include <linux/sched/task_stack.h> +#include <linux/signal.h> #include <asm/processor.h> #include <asm/user32.h> #include <asm/unistd.h> @@ -242,4 +243,27 @@ struct compat_siginfo; int __copy_siginfo_to_user32(struct compat_siginfo __user *to, const siginfo_t *from, bool x32_ABI); +static inline int is_ia32_compat_frame(struct ksignal *ksig) +{ + return IS_ENABLED(CONFIG_IA32_EMULATION) && + ksig->ka.sa.sa_flags & SA_IA32_ABI; +} + +static inline int is_ia32_frame(struct ksignal *ksig) +{ + return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig); +} + +static inline int is_x32_frame(struct ksignal *ksig) +{ + return IS_ENABLED(CONFIG_X86_X32_ABI) && + ksig->ka.sa.sa_flags & SA_X32_ABI; +} + +static inline bool is_compat_frame(struct ksignal *ksig) +{ + return is_ia32_frame(ksig) || is_x32_frame(ksig); +} +#define is_compat_frame is_compat_frame /* override the generic impl */ + #endif /* _ASM_X86_COMPAT_H */ diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index 92a3b312a53c..cb488e3e952d 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -664,23 +664,6 @@ SYSCALL_DEFINE0(rt_sigreturn) return 0; } -static inline int is_ia32_compat_frame(struct ksignal *ksig) -{ - return IS_ENABLED(CONFIG_IA32_EMULATION) && - ksig->ka.sa.sa_flags & SA_IA32_ABI; -} - -static inline int is_ia32_frame(struct ksignal *ksig) -{ - return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig); -} - -static inline int is_x32_frame(struct ksignal *ksig) -{ - return IS_ENABLED(CONFIG_X86_X32_ABI) && - ksig->ka.sa.sa_flags & SA_X32_ABI; -} - static int setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) { diff --git a/include/linux/compat.h b/include/linux/compat.h index b1a5562b3215..2e1ffba65117 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -1022,6 +1022,19 @@ static inline struct compat_timeval ns_to_compat_timeval(s64 nsec) return ctv; } +/* + * For most but not all architectures, "is this a compat sigframe?" and + * "am I a compat task?" are the same question. For architectures on which + * they aren't the same question, arch code can override is_compat_frame. + */ + +#ifndef is_compat_frame +static inline bool is_compat_frame(struct ksignal *ksig) +{ + return is_compat_task(); +} +#endif + #else /* !CONFIG_COMPAT */ #define is_compat_task() (0) @@ -1029,6 +1042,10 @@ static inline struct compat_timeval ns_to_compat_timeval(s64 nsec) static inline bool in_compat_syscall(void) { return false; } #endif +#ifndef is_compat_frame +static inline bool is_compat_frame(struct ksignal *ksig) { return false; } +#endif + #endif /* CONFIG_COMPAT */ #endif /* _LINUX_COMPAT_H */ -- 2.11.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields 2018-06-26 21:16 [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame Mathieu Desnoyers @ 2018-06-26 21:16 ` Mathieu Desnoyers 2018-06-26 21:58 ` Andy Lutomirski 0 siblings, 1 reply; 5+ messages in thread From: Mathieu Desnoyers @ 2018-06-26 21:16 UTC (permalink / raw) To: Andy Lutomirski, Thomas Gleixner Cc: linux-kernel, Mathieu Desnoyers, Joel Fernandes, Peter Zijlstra, Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen, H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter, Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng, Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch Make the behavior rseq on compat tasks more robust by ensuring that kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset when a 32-bit binary is run on a 64-bit kernel. The intent here is that if user-space has garbage rather than zeroes in its struct rseq_cs fields padding, the behavior will be the same whether the binary is run on 32-bit or 64-bit kernels. Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from system call context, and use is_compat_frame() when invoked from signal delivery. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Andrew Hunter <ahh@google.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> --- kernel/rseq.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/kernel/rseq.c b/kernel/rseq.c index 22b6acf1ad63..7b1d51b965fc 100644 --- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -13,6 +13,7 @@ #include <linux/syscalls.h> #include <linux/rseq.h> #include <linux/types.h> +#include <linux/compat.h> #include <asm/ptrace.h> #define CREATE_TRACE_POINTS @@ -112,7 +113,23 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t) return 0; } -static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs) +#ifdef CONFIG_COMPAT +static void rseq_cs_compat(struct ksignal *ksig, struct rseq_cs *rseq_cs) +{ + if (!(ksig ? is_compat_frame(ksig) : in_compat_syscall())) + return; + + rseq_cs->abort_ip = (compat_uptr_t) rseq_cs->abort_ip; + rseq_cs->start_ip = (compat_uptr_t) rseq_cs->start_ip; + rseq_cs->post_commit_offset = + (compat_uptr_t) rseq_cs->post_commit_offset; +} +#else +static void rseq_cs_compat(struct ksignal *ksig, struct rseq_cs *rseq_cs) { } +#endif + +static int rseq_get_rseq_cs(struct ksignal *ksig, struct task_struct *t, + struct rseq_cs *rseq_cs) { struct rseq_cs __user *urseq_cs; unsigned long ptr; @@ -132,6 +149,7 @@ static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs) return -EFAULT; if (rseq_cs->version > 0) return -EINVAL; + rseq_cs_compat(ksig, rseq_cs); /* Ensure that abort_ip is not in the critical section. */ if (rseq_cs->abort_ip - rseq_cs->start_ip < rseq_cs->post_commit_offset) @@ -209,14 +227,14 @@ static bool in_rseq_cs(unsigned long ip, struct rseq_cs *rseq_cs) return ip - rseq_cs->start_ip < rseq_cs->post_commit_offset; } -static int rseq_ip_fixup(struct pt_regs *regs) +static int rseq_ip_fixup(struct ksignal *ksig, struct pt_regs *regs) { unsigned long ip = instruction_pointer(regs); struct task_struct *t = current; struct rseq_cs rseq_cs; int ret; - ret = rseq_get_rseq_cs(t, &rseq_cs); + ret = rseq_get_rseq_cs(ksig, t, &rseq_cs); if (ret) return ret; @@ -260,7 +278,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs) return; if (unlikely(!access_ok(VERIFY_WRITE, t->rseq, sizeof(*t->rseq)))) goto error; - ret = rseq_ip_fixup(regs); + ret = rseq_ip_fixup(ksig, regs); if (unlikely(ret < 0)) goto error; if (unlikely(rseq_update_cpu_id(t))) @@ -287,7 +305,7 @@ void rseq_syscall(struct pt_regs *regs) if (!t->rseq) return; if (!access_ok(VERIFY_READ, t->rseq, sizeof(*t->rseq)) || - rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs)) + rseq_get_rseq_cs(NULL, t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs)) force_sig(SIGSEGV, t); } -- 2.11.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields 2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers @ 2018-06-26 21:58 ` Andy Lutomirski 2018-06-26 22:17 ` Mathieu Desnoyers 2018-06-28 8:04 ` Thomas Gleixner 0 siblings, 2 replies; 5+ messages in thread From: Andy Lutomirski @ 2018-06-26 21:58 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Thomas Gleixner, linux-kernel, Joel Fernandes, Peter Zijlstra, Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen, H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter, Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng, Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch, x86 > On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > Make the behavior rseq on compat tasks more robust by ensuring that > kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of > rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset > when a 32-bit binary is run on a 64-bit kernel. > > The intent here is that if user-space has garbage rather than zeroes > in its struct rseq_cs fields padding, the behavior will be the same > whether the binary is run on 32-bit or 64-bit kernels. > > Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from > system call context, and use is_compat_frame() when invoked from > signal delivery. > And when it’s invoked due to preemption unrelated to a syscall or signal, you malfunction? I think the only sane solution is to make these fields be u64, delete the LINUX_FIELD_ macros, and possibly teach the x86 slowpath return to inject a signal if it’s trying to return to a 32-bit context with garbage in the high bits of regs->ip so that we determistically fail if the user screws up. Rseq is brand new. It should not need compat code at all. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields 2018-06-26 21:58 ` Andy Lutomirski @ 2018-06-26 22:17 ` Mathieu Desnoyers 2018-06-28 8:04 ` Thomas Gleixner 1 sibling, 0 replies; 5+ messages in thread From: Mathieu Desnoyers @ 2018-06-26 22:17 UTC (permalink / raw) To: Andy Lutomirski Cc: Thomas Gleixner, linux-kernel, Joel Fernandes, Peter Zijlstra, Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen, H. Peter Anvin, Chris Lameter, Russell King, Andrew Hunter, Michael Kerrisk, Paul E. McKenney, Paul Turner, Boqun Feng, Josh Triplett, rostedt, Ben Maurer, linux-api ----- On Jun 26, 2018, at 5:58 PM, Andy Lutomirski luto@amacapital.net wrote: >> On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> wrote: >> >> Make the behavior rseq on compat tasks more robust by ensuring that >> kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of >> rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset >> when a 32-bit binary is run on a 64-bit kernel. >> >> The intent here is that if user-space has garbage rather than zeroes >> in its struct rseq_cs fields padding, the behavior will be the same >> whether the binary is run on 32-bit or 64-bit kernels. >> >> Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from >> system call context, and use is_compat_frame() when invoked from >> signal delivery. >> > > And when it’s invoked due to preemption unrelated to a syscall or signal, you > malfunction? Fair point! Hence the "RFC". ;) So I understand better your intent to use the pt_regs to figure out whether it is compat or not. My is_compat_frame()+in_compat_syscall() approach does not handle this correctly. > > I think the only sane solution is to make these fields be u64, I'm OK with turning the rseq_cs start_ip, post_commit_offset, and abort_ip fields into normal u64. > delete the > LINUX_FIELD_ macros, The LINUX_FIELD_ macros are still needed to ensure single-copy updates of the (struct rseq *__tls_abi)->rseq_cs pointer by 32-bit user-space. > and possibly teach the x86 slowpath return to inject a > signal if it’s trying to return to a 32-bit context with garbage in the high > bits of regs->ip so that we determistically fail if the user screws up. I like the approach of dealing with the rseq_cs fields as u64 even on 32-bit architectures. As a downside, it will require 32-bit architectures to do arithmetic on 64-bit values, but it's not a fast-path. As you point out, the tricky bit is to decide what happens when architecture code returns to userspace with regs->ip containing garbage in the high bits. An alternative approach is to ensure the high bits are cleared when returning to an IP with garbage in the high bits. > Rseq is brand new. It should not need compat code at all. Dealing with u64 for start_ip, post_commit_offset, and abort_ip at the kernel level would indeed provide this characteristic. However, I'm uneasy adding 64-bit arithmetic on operations really caring about 32 bits on 32-bit archs, even though those are not fast paths. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields 2018-06-26 21:58 ` Andy Lutomirski 2018-06-26 22:17 ` Mathieu Desnoyers @ 2018-06-28 8:04 ` Thomas Gleixner 1 sibling, 0 replies; 5+ messages in thread From: Thomas Gleixner @ 2018-06-28 8:04 UTC (permalink / raw) To: Andy Lutomirski Cc: Mathieu Desnoyers, linux-kernel, Joel Fernandes, Peter Zijlstra, Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen, H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter, Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng, Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch, x86 [-- Attachment #1: Type: text/plain, Size: 1434 bytes --] On Tue, 26 Jun 2018, Andy Lutomirski wrote: > > On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > > > Make the behavior rseq on compat tasks more robust by ensuring that > > kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of > > rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset > > when a 32-bit binary is run on a 64-bit kernel. > > > > The intent here is that if user-space has garbage rather than zeroes > > in its struct rseq_cs fields padding, the behavior will be the same > > whether the binary is run on 32-bit or 64-bit kernels. > > > > Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from > > system call context, and use is_compat_frame() when invoked from > > signal delivery. > > > > And when it’s invoked due to preemption unrelated to a syscall or signal, > you malfunction? > > I think the only sane solution is to make these fields be u64, delete the > LINUX_FIELD_ macros, and possibly teach the x86 slowpath return to inject > a signal if it’s trying to return to a 32-bit context with garbage in the > high bits of regs->ip so that we determistically fail if the user screws > up. Right. That's the only sane solution. Trying to play games with 32/64bit for a dubious value is going to bite us within no time and just create ugly workarounds left and right. Forcing a clear handling upfront avoids all of that. Thanks, tglx ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-06-28 8:04 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-06-26 21:16 [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame Mathieu Desnoyers 2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers 2018-06-26 21:58 ` Andy Lutomirski 2018-06-26 22:17 ` Mathieu Desnoyers 2018-06-28 8:04 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).