* [RFC PATCH v2 0/1] x86/pkeys: update PKRU to enable pkey 0 @ 2024-03-21 21:56 Aruna Ramakrishna 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna 0 siblings, 1 reply; 14+ messages in thread From: Aruna Ramakrishna @ 2024-03-21 21:56 UTC (permalink / raw) To: linux-kernel Cc: x86, dave.hansen, tglx, matthias.neugschwandtner, andrew.brownsword v2 updates: - PKRU is passed as a parameter from handle_signal() all the way to copy_fpregs_to_sigframe(), where it actually updates PKRU in the sigframe I'm working on adding a test case in tools/testing/selftests/mm/protection_keys.c. Aruna Ramakrishna (1): x86/pkeys: update PKRU to enable pkey 0 before XSAVE arch/x86/include/asm/fpu/signal.h | 3 +- arch/x86/include/asm/sighandling.h | 10 +++---- arch/x86/kernel/fpu/signal.c | 44 ++++++++++++++++++++++++++---- arch/x86/kernel/fpu/xstate.c | 13 +++++++++ arch/x86/kernel/fpu/xstate.h | 1 + arch/x86/kernel/signal.c | 40 +++++++++++++++++++++------ arch/x86/kernel/signal_32.c | 8 +++--- arch/x86/kernel/signal_64.c | 9 +++--- 8 files changed, 101 insertions(+), 27 deletions(-) -- 2.39.3 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-21 21:56 [RFC PATCH v2 0/1] x86/pkeys: update PKRU to enable pkey 0 Aruna Ramakrishna @ 2024-03-21 21:56 ` Aruna Ramakrishna 2024-03-22 9:46 ` Ingo Molnar ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: Aruna Ramakrishna @ 2024-03-21 21:56 UTC (permalink / raw) To: linux-kernel Cc: x86, dave.hansen, tglx, matthias.neugschwandtner, andrew.brownsword This patch is a workaround for a bug where the PKRU value is not restored to the init value before the signal handler is invoked. Problem description: Let's assume there's a multithreaded application that runs untrusted user code. Each thread has its stack/code protected by a non-zero pkey, and the PKRU register is set up such that only that particular non-zero pkey is enabled. Each thread also sets up an alternate signal stack to handle signals, which is protected by pkey zero. The pkeys man page documents that the PKRU will be reset to init_pkru when the signal handler is invoked, which means that pkey zero access will be enabled. But this reset happens after the kernel attempts to push fpu state to the alternate stack, which is not (yet) accessible by the kernel, which leads to a new SIGSEGV being sent to the application, terminating it. Enabling both the non-zero pkey (for the thread) and pkey zero (in userspace) will not work for us. We cannot have the alt stack writeable by all - the rationale here is that the code running in that thread (using a non-zero pkey) is untrusted and should not have access to the alternate signal stack (that uses pkey zero), to prevent the return address of a function from being changed. The expectation is that kernel should be able to set up the alternate signal stack and deliver the signal to the application even if pkey zero is explicitly disabled by the application. The signal handler accessibility should not be dictated by the PKRU value that the thread sets up. Solution: The PKRU register is managed by XSAVE, which means the sigframe contents must match the register contents - which is not the case here. We want the sigframe to contain the user-defined PKRU value (so that it is restored correctly from sigcontext) but the actual register must be reset to init_pkru so that the alt stack is accessible and the signal can be delivered to the application. It seems that the proper fix here would be to remove PKRU from the XSAVE framework and manage it separately, which is quite complicated. As a workaround, this patch does something like this: orig_pkru = rdpkru(); wrpkru(init_pkru & orig_pkru); xsave_to_user_sigframe(); put_user(pkru_sigframe_addr, orig_pkru) Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Helped-by: Dave Kleikamp <dave.kleikamp@oracle.com> Helped-by: Prakash Sangappa <prakash.sangappa@oracle.com> --- arch/x86/include/asm/fpu/signal.h | 3 +- arch/x86/include/asm/sighandling.h | 10 +++---- arch/x86/kernel/fpu/signal.c | 44 ++++++++++++++++++++++++++---- arch/x86/kernel/fpu/xstate.c | 13 +++++++++ arch/x86/kernel/fpu/xstate.h | 1 + arch/x86/kernel/signal.c | 40 +++++++++++++++++++++------ arch/x86/kernel/signal_32.c | 8 +++--- arch/x86/kernel/signal_64.c | 9 +++--- 8 files changed, 101 insertions(+), 27 deletions(-) diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h index 611fa41711af..deaa81f44c2a 100644 --- a/arch/x86/include/asm/fpu/signal.h +++ b/arch/x86/include/asm/fpu/signal.h @@ -29,7 +29,8 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame, unsigned long fpu__get_fpstate_size(void); -extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size); +extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size, + u32 pkru); extern void fpu__clear_user_states(struct fpu *fpu); extern bool fpu__restore_sig(void __user *buf, int ia32_frame); diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h index e770c4fc47f4..de458354a3ea 100644 --- a/arch/x86/include/asm/sighandling.h +++ b/arch/x86/include/asm/sighandling.h @@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where); void __user * get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size, - void __user **fpstate); + void __user **fpstate, u32 pkru); -int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs); -int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs); -int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs); -int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs); +int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru); +int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru); +int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru); +int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru); #endif /* _ASM_X86_SIGHANDLING_H */ diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 247f2225aa9f..1abbb6551992 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -63,6 +63,32 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf, return true; } +/* + * Update the value of PKRU register that was already pushed + * onto the signal frame. + */ +static inline int +__update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru) +{ + int err = -EFAULT; + struct _fpx_sw_bytes fx_sw; + struct pkru_state *pk = NULL; + + if (unlikely(!check_xstate_in_sigframe((void __user *) buf, &fx_sw))) + goto out; + + pk = get_xsave_addr_user(buf, XFEATURE_PKRU); + if (!pk || !user_write_access_begin(buf, sizeof(struct xregs_state))) + goto out; + unsafe_put_user(pkru, (unsigned int __user *) pk, uaccess_end); + + err = 0; +uaccess_end: + user_access_end(); +out: + return err; +} + /* * Signal frame handlers. */ @@ -156,10 +182,17 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame, return !err; } -static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf) +static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, + u32 pkru) { - if (use_xsave()) - return xsave_to_user_sigframe(buf); + int err = 0; + + if (use_xsave()) { + err = xsave_to_user_sigframe(buf); + if (!err && cpu_feature_enabled(X86_FEATURE_OSPKE)) + err = __update_pkru_in_sigframe(buf, pkru); + return err; + } if (use_fxsr()) return fxsave_to_user_sigframe((struct fxregs_state __user *) buf); else @@ -185,7 +218,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf) * For [f]xsave state, update the SW reserved fields in the [f]xsave frame * indicating the absence/presence of the extended state to the user. */ -bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size) +bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size, + u32 pkru) { struct task_struct *tsk = current; struct fpstate *fpstate = tsk->thread.fpu.fpstate; @@ -228,7 +262,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size) fpregs_restore_userregs(); pagefault_disable(); - ret = copy_fpregs_to_sigframe(buf_fx); + ret = copy_fpregs_to_sigframe(buf_fx, pkru); pagefault_enable(); fpregs_unlock(); diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 117e74c44e75..22623abf32b6 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -991,6 +991,19 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr) return __raw_xsave_addr(xsave, xfeature_nr); } +/* + * Given an xstate feature nr, calculate where in the xsave + * buffer the state is. The xsave buffer should be in standard + * format, not compacted (e.g. user mode signal frames). + */ +void *get_xsave_addr_user(struct xregs_state *xsave, int xfeature_nr) +{ + if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr))) + return NULL; + + return (void *)xsave + xstate_offsets[xfeature_nr]; +} + #ifdef CONFIG_ARCH_HAS_PKEYS /* diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 3518fb26d06b..ade07e78fd26 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -55,6 +55,7 @@ extern void fpu__init_cpu_xstate(void); extern void fpu__init_system_xstate(unsigned int legacy_size); extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr); +extern void *get_xsave_addr_user(struct xregs_state *xsave, int xfeature_nr); static inline u64 xfeatures_mask_supervisor(void) { diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index 31b6f5dddfc2..36134931df68 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig) */ void __user * get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size, - void __user **fpstate) + void __user **fpstate, u32 pkru) { struct k_sigaction *ka = &ksig->ka; int ia32_frame = is_ia32_frame(ksig); @@ -139,7 +139,8 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size, } /* save i387 and extended state */ - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size)) + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, + math_size, pkru)) return (void __user *)-1L; return (void __user *)sp; @@ -206,7 +207,7 @@ unsigned long get_sigframe_size(void) } static int -setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) +setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru) { /* Perform fixup for the pre-signal frame. */ rseq_signal_deliver(ksig, regs); @@ -214,21 +215,41 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) /* Set up the stack frame */ if (is_ia32_frame(ksig)) { if (ksig->ka.sa.sa_flags & SA_SIGINFO) - return ia32_setup_rt_frame(ksig, regs); + return ia32_setup_rt_frame(ksig, regs, pkru); else - return ia32_setup_frame(ksig, regs); + return ia32_setup_frame(ksig, regs, pkru); } else if (is_x32_frame(ksig)) { - return x32_setup_rt_frame(ksig, regs); + return x32_setup_rt_frame(ksig, regs, pkru); } else { - return x64_setup_rt_frame(ksig, regs); + return x64_setup_rt_frame(ksig, regs, pkru); } } +/* + * Ensure that the both the current stack and the alternate signal + * stack is writeable. The alternate stack must be accessible by the + * init PKRU value. + */ +static inline u32 sig_prepare_pkru(void) +{ + u32 current_pkru = read_pkru(); + u32 init_pkru_snapshot = pkru_get_init_value(); + + write_pkru(current_pkru & init_pkru_snapshot); + return current_pkru; +} + +static inline void sig_restore_pkru(u32 pkru) +{ + write_pkru(pkru); +} + static void handle_signal(struct ksignal *ksig, struct pt_regs *regs) { bool stepping, failed; struct fpu *fpu = ¤t->thread.fpu; + u32 pkru; if (v8086_mode(regs)) save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL); @@ -264,7 +285,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) if (stepping) user_disable_single_step(current); - failed = (setup_rt_frame(ksig, regs) < 0); + pkru = sig_prepare_pkru(); + failed = (setup_rt_frame(ksig, regs, pkru) < 0); if (!failed) { /* * Clear the direction flag as per the ABI for function entry. @@ -281,6 +303,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) * Ensure the signal handler starts with the new fpu state. */ fpu__clear_user_states(fpu); + } else { + sig_restore_pkru(pkru); } signal_setup_done(failed, ksig, stepping); } diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c index c12624bc82a3..68f2bfd7d6e7 100644 --- a/arch/x86/kernel/signal_32.c +++ b/arch/x86/kernel/signal_32.c @@ -228,7 +228,7 @@ do { \ goto label; \ } while(0) -int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs) +int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru) { sigset32_t *set = (sigset32_t *) sigmask_to_save(); struct sigframe_ia32 __user *frame; @@ -246,7 +246,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs) 0x80cd, /* int $0x80 */ }; - frame = get_sigframe(ksig, regs, sizeof(*frame), &fp); + frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru); if (ksig->ka.sa.sa_flags & SA_RESTORER) { restorer = ksig->ka.sa.sa_restorer; @@ -299,7 +299,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs) return -EFAULT; } -int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) +int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru) { sigset32_t *set = (sigset32_t *) sigmask_to_save(); struct rt_sigframe_ia32 __user *frame; @@ -319,7 +319,7 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) 0, }; - frame = get_sigframe(ksig, regs, sizeof(*frame), &fp); + frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru); if (!user_access_begin(frame, sizeof(*frame))) return -EFAULT; diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c index 23d8aaf8d9fd..b5c9535ff08b 100644 --- a/arch/x86/kernel/signal_64.c +++ b/arch/x86/kernel/signal_64.c @@ -161,7 +161,7 @@ static unsigned long frame_uc_flags(struct pt_regs *regs) return flags; } -int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) +int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru) { sigset_t *set = sigmask_to_save(); struct rt_sigframe __user *frame; @@ -172,7 +172,8 @@ int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) if (!(ksig->ka.sa.sa_flags & SA_RESTORER)) return -EFAULT; - frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp); + frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp, + pkru); uc_flags = frame_uc_flags(regs); if (!user_access_begin(frame, sizeof(*frame))) @@ -300,7 +301,7 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *to, return __copy_siginfo_to_user32(to, from); } -int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) +int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru) { compat_sigset_t *set = (compat_sigset_t *) sigmask_to_save(); struct rt_sigframe_x32 __user *frame; @@ -311,7 +312,7 @@ int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs) if (!(ksig->ka.sa.sa_flags & SA_RESTORER)) return -EFAULT; - frame = get_sigframe(ksig, regs, sizeof(*frame), &fp); + frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru); uc_flags = frame_uc_flags(regs); -- 2.39.3 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna @ 2024-03-22 9:46 ` Ingo Molnar 2024-03-22 18:30 ` Aruna Ramakrishna 2024-04-25 22:03 ` jeffxu 2024-03-22 15:40 ` Dave Hansen ` (2 subsequent siblings) 3 siblings, 2 replies; 14+ messages in thread From: Ingo Molnar @ 2024-03-22 9:46 UTC (permalink / raw) To: Aruna Ramakrishna Cc: linux-kernel, x86, dave.hansen, tglx, matthias.neugschwandtner, andrew.brownsword * Aruna Ramakrishna <aruna.ramakrishna@oracle.com> wrote: > This patch is a workaround for a bug where the PKRU value is not > restored to the init value before the signal handler is invoked. > > Problem description: > Let's assume there's a multithreaded application that runs untrusted > user code. Each thread has its stack/code protected by a non-zero pkey, > and the PKRU register is set up such that only that particular non-zero > pkey is enabled. Each thread also sets up an alternate signal stack to > handle signals, which is protected by pkey zero. The pkeys man page > documents that the PKRU will be reset to init_pkru when the signal > handler is invoked, which means that pkey zero access will be enabled. > But this reset happens after the kernel attempts to push fpu state > to the alternate stack, which is not (yet) accessible by the kernel, > which leads to a new SIGSEGV being sent to the application, terminating > it. > > Enabling both the non-zero pkey (for the thread) and pkey zero (in > userspace) will not work for us. We cannot have the alt stack writeable > by all - the rationale here is that the code running in that thread > (using a non-zero pkey) is untrusted and should not have access to the > alternate signal stack (that uses pkey zero), to prevent the return > address of a function from being changed. The expectation is that kernel > should be able to set up the alternate signal stack and deliver the > signal to the application even if pkey zero is explicitly disabled by > the application. The signal handler accessibility should not be dictated > by the PKRU value that the thread sets up. > > Solution: > The PKRU register is managed by XSAVE, which means the sigframe contents > must match the register contents - which is not the case here. We want > the sigframe to contain the user-defined PKRU value (so that it is > restored correctly from sigcontext) but the actual register must be > reset to init_pkru so that the alt stack is accessible and the signal > can be delivered to the application. It seems that the proper fix here > would be to remove PKRU from the XSAVE framework and manage it > separately, which is quite complicated. As a workaround, this patch does > something like this: > > orig_pkru = rdpkru(); > wrpkru(init_pkru & orig_pkru); > xsave_to_user_sigframe(); > put_user(pkru_sigframe_addr, orig_pkru) > > Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> > Helped-by: Dave Kleikamp <dave.kleikamp@oracle.com> > Helped-by: Prakash Sangappa <prakash.sangappa@oracle.com> > --- > arch/x86/include/asm/fpu/signal.h | 3 +- > arch/x86/include/asm/sighandling.h | 10 +++---- > arch/x86/kernel/fpu/signal.c | 44 ++++++++++++++++++++++++++---- > arch/x86/kernel/fpu/xstate.c | 13 +++++++++ > arch/x86/kernel/fpu/xstate.h | 1 + > arch/x86/kernel/signal.c | 40 +++++++++++++++++++++------ > arch/x86/kernel/signal_32.c | 8 +++--- > arch/x86/kernel/signal_64.c | 9 +++--- > 8 files changed, 101 insertions(+), 27 deletions(-) Ok, this looks a lot saner than the first patch. A couple of requests: 1) Please split out all the PKRU parameter passing interface changes into a separate patch. Ie. split out patches that don't change any behavior, and try to make the final feature-enabling (bug-fixing) patch as small and easy to read as possible. Maybe even have 3 patches: - function interface changes - helper function additions - behavioral changes to signal handler pkru context 2) I do agree that isolation of sandboxed execution into a non-zero pkey might make sense. But this really needs an actual testcase. 3) The semantics you've implemented for sigaltstacks are not the only possible ones. In principle, a signal handler with its own stack might want to have its own key(s) enabled. In a way a Linux signal handler is a mini-thread created on the fly, with its own stack and its own attributes. Some thought & analysis should go into which way to go here, and the best path should be chosen. Fixing the SIGSEGV you observed should be a happy side effect of other worthwile improvements. Thanks, Ingo ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-22 9:46 ` Ingo Molnar @ 2024-03-22 18:30 ` Aruna Ramakrishna 2024-04-25 22:03 ` jeffxu 1 sibling, 0 replies; 14+ messages in thread From: Aruna Ramakrishna @ 2024-03-22 18:30 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, x86@kernel.org, dave.hansen@linux.intel.com, tglx@linutronix.de, Matthias Neugschwandtner, Andrew Brownsword > On Mar 22, 2024, at 2:46 AM, Ingo Molnar <mingo@kernel.org> wrote: > > Ok, this looks a lot saner than the first patch. > > A couple of requests: > > 1) > > Please split out all the PKRU parameter passing interface changes into a > separate patch. Ie. split out patches that don't change any behavior, and > try to make the final feature-enabling (bug-fixing) patch as small and easy > to read as possible. Maybe even have 3 patches: > > - function interface changes > - helper function additions > - behavioral changes to signal handler pkru context > > 2) > > I do agree that isolation of sandboxed execution into a non-zero pkey might > make sense. But this really needs an actual testcase. > > 3) > > The semantics you've implemented for sigaltstacks are not the only possible > ones. In principle, a signal handler with its own stack might want to have > its own key(s) enabled. In a way a Linux signal handler is a mini-thread > created on the fly, with its own stack and its own attributes. Some thought > & analysis should go into which way to go here, and the best path should be > chosen. Fixing the SIGSEGV you observed should be a happy side effect of > other worthwile improvements. > > Thanks, > > Ingo Thank you, Ingo! I will split this patch into multiple patches when I send it out as a patch review request next. And add a testcase. I agree that this patch covers a very specific use case, and it probably raises more questions than it answers. That’s why I sent it out as an RFC - because I wasn’t sure if this was the best way to add this functionality, and I wanted the experts to weigh in. As Dave suggested, I can instead do wrpkru(0) to enable all pkeys before XSAVE. Thanks, Aruna ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-22 9:46 ` Ingo Molnar 2024-03-22 18:30 ` Aruna Ramakrishna @ 2024-04-25 22:03 ` jeffxu 1 sibling, 0 replies; 14+ messages in thread From: jeffxu @ 2024-04-25 22:03 UTC (permalink / raw) To: aruna.ramakrishna Cc: andrew.brownsword, dave.hansen, linux-kernel, matthias.neugschwandtner, tglx, jeffxu, jeffxu, jannh, sroettger, x86 From: Jeff Xu <jeffxu@chromium.org> Resend(previous email sent to the wrong thread) >The semantics you've implemented for sigaltstacks are not the only possible >ones. In principle, a signal handler with its own stack might want to have >its own key(s) enabled. In a way a Linux signal handler is a mini-thread >created on the fly, with its own stack and its own attributes. Some thought >& analysis should go into which way to go here, and the best path should be >chosen. Fixing the SIGSEGV you observed should be a happy side effect of >other worthwile improvements. > This is exactly right. we wants to use altstack with its own pkey. The pkey won't be writeable during thread's normal operation, only by signaling handling itself. Thanks -Jeff >Thanks, > > Ingo > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna 2024-03-22 9:46 ` Ingo Molnar @ 2024-03-22 15:40 ` Dave Hansen 2024-03-22 18:28 ` Aruna Ramakrishna 2024-04-25 21:05 ` jeffxu 2024-04-25 21:58 ` jeffxu 3 siblings, 1 reply; 14+ messages in thread From: Dave Hansen @ 2024-03-22 15:40 UTC (permalink / raw) To: Aruna Ramakrishna, linux-kernel Cc: x86, dave.hansen, tglx, matthias.neugschwandtner, andrew.brownsword On 3/21/24 14:56, Aruna Ramakrishna wrote: > +/* > + * Ensure that the both the current stack and the alternate signal > + * stack is writeable. The alternate stack must be accessible by the > + * init PKRU value. > + */ > +static inline u32 sig_prepare_pkru(void) > +{ > + u32 current_pkru = read_pkru(); > + u32 init_pkru_snapshot = pkru_get_init_value(); > + > + write_pkru(current_pkru & init_pkru_snapshot); > + return current_pkru; > +} That comment is quite misleading. This code has *ZERO* knowledge of the permissions on either the current or alternate stack. It _assumes_ that the current PKRU permissions allow writes to the current stack and _assumes_ that the init PKRU value can write to the alternative stack. Those aren't bad assumptions, but they _are_ assumptions and need to be clearly called out as such. The '&' operation looks rather random and needs an explanation. What is that logically trying to do? It's trying to clear bits in the old (pre-signal) PKRU value so that it gains write access to the alt stack. Please say that. Which leads me to ask: Why bother with the '&'? It would be simpler to, for instance, just wrpkru(0). What is being written to the old stack at this point? I also dislike something being called 'current_pkru' when it's clearly the old value by the time it is returned. > +static inline void sig_restore_pkru(u32 pkru) > +{ > + write_pkru(pkru); > +} This seems like unnecessary abstraction. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-22 15:40 ` Dave Hansen @ 2024-03-22 18:28 ` Aruna Ramakrishna 0 siblings, 0 replies; 14+ messages in thread From: Aruna Ramakrishna @ 2024-03-22 18:28 UTC (permalink / raw) To: Dave Hansen Cc: linux-kernel@vger.kernel.org, x86@kernel.org, dave.hansen@linux.intel.com, tglx@linutronix.de, Matthias Neugschwandtner, Andrew Brownsword > On Mar 22, 2024, at 8:40 AM, Dave Hansen <dave.hansen@intel.com> wrote: > > On 3/21/24 14:56, Aruna Ramakrishna wrote: >> +/* >> + * Ensure that the both the current stack and the alternate signal >> + * stack is writeable. The alternate stack must be accessible by the >> + * init PKRU value. >> + */ >> +static inline u32 sig_prepare_pkru(void) >> +{ >> + u32 current_pkru = read_pkru(); >> + u32 init_pkru_snapshot = pkru_get_init_value(); >> + >> + write_pkru(current_pkru & init_pkru_snapshot); >> + return current_pkru; >> +} > > That comment is quite misleading. This code has *ZERO* knowledge of the > permissions on either the current or alternate stack. It _assumes_ that > the current PKRU permissions allow writes to the current stack and > _assumes_ that the init PKRU value can write to the alternative stack. > > Those aren't bad assumptions, but they _are_ assumptions and need to be > clearly called out as such. > > The '&' operation looks rather random and needs an explanation. What is > that logically trying to do? It's trying to clear bits in the old > (pre-signal) PKRU value so that it gains write access to the alt stack. > Please say that. > > Which leads me to ask: Why bother with the '&'? It would be simpler to, > for instance, just wrpkru(0). What is being written to the old stack at > this point? Right. This works only for the very specific use case where the alt stack is protected by init_pkru and the current execution stack is protected by the thread’s PKRU. If those assumptions do not hold for an application, then it would still run into the same issue. I wasn’t sure if enabling all pkeys before XSAVE - i.e. wrpkru(0) - will be acceptable from a security standpoint. If it is, that seems like a more generic solution than what’s in this patch. > > I also dislike something being called 'current_pkru' when it's clearly > the old value by the time it is returned. > >> +static inline void sig_restore_pkru(u32 pkru) >> +{ >> + write_pkru(pkru); >> +} > > This seems like unnecessary abstraction. Yeah. Just trying to be consistent with the prep/restore… I can remove this. Thanks, Aruna ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna 2024-03-22 9:46 ` Ingo Molnar 2024-03-22 15:40 ` Dave Hansen @ 2024-04-25 21:05 ` jeffxu 2024-04-25 22:49 ` Aruna Ramakrishna 2024-04-25 21:58 ` jeffxu 3 siblings, 1 reply; 14+ messages in thread From: jeffxu @ 2024-04-25 21:05 UTC (permalink / raw) To: aruna.ramakrishna Cc: andrew.brownsword, dave.hansen, linux-kernel, matthias.neugschwandtner, tglx, jeffxu, jeffxu, jannh, sroettger, x86 From: Jeff Xu <jeffxu@chromium.org> On 3/21/24 14:56, Aruna Ramakrishna wrote: >Enabling both the non-zero pkey (for the thread) and pkey zero (in >userspace) will not work for us. We cannot have the alt stack writeable >by all - the rationale here is that the code running in that thread >(using a non-zero pkey) is untrusted and should not have access to the >alternate signal stack (that uses pkey zero), to prevent the return >address of a function from being changed. The expectation is that kernel >should be able to set up the alternate signal stack and deliver the >signal to the application even if pkey zero is explicitly disabled by >the application. The signal handler accessibility should not be dictated >by the PKRU value that the thread sets up. > We have a similar threat model that we don't want "untrusted threads" to access altstack. I think this patch need not be restricted to the use case of zero pkey for altstack, i.e. application can also set non-zero pkey to altstack and expect the same. >Solution: >The PKRU register is managed by XSAVE, which means the sigframe contents >must match the register contents - which is not the case here. We want >the sigframe to contain the user-defined PKRU value (so that it is >restored correctly from sigcontext) but the actual register must be >reset to init_pkru so that the alt stack is accessible and the signal >can be delivered to the application. It seems that the proper fix here >would be to remove PKRU from the XSAVE framework and manage it >separately, which is quite complicated. As a workaround, this patch does >something like this: > > orig_pkru = rdpkru(); > wrpkru(init_pkru & orig_pkru); > xsave_to_user_sigframe(); > put_user(pkru_sigframe_addr, orig_pkru) > The default PKRU of thread [1] is set as 01 (disable access) for each PKEY from 1 to 15, and 00 (RW) for PKEY 0. Let's use pkey 1 as an example: The init_pkru is 01, if the thread has PKRU (orig_pkru) as 10 (disable write but have read) then new_pkru from (init_pkru & orig_pkru) is 00, which gives RW access to the pkey 1. When the thread has orig_pkru as 01 (disable access) or 00 (RW), new_pkru is unchanged from orig_pkru. Now take pkey 0: the init_pkru is 00, regardless what threads has, new_pkru will always be 00. This seems to work out well for pkey 1 to 15, i.e. signal handing code in kernel only give write access when the thread alrady has read access to the PKEY that is used by the altstack. The threat model interesting here is to prevent untrusted threads from writing to altstack, and read is probably less of a problem. Does this meet what you want? (Note the pkey 0 is different than 1-15) Suppose someone also like to disable all access to altstack, then there is one more place to mind: in sigreturn(), it calls restore_altstack(), and requires read access to altstack. However, at the time, PKRU is already restored from sigframe, so SEGV will raise (the value in sigframe doesn't have read access to the PKEY). Without changing sigreturn, using wrpkru(0) here might not be necessary: the dispatch to user space works fine, only to crash at sigreturn step. [1] defined by init_pkru_value in pkeys.c Best regards, -Jeff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-04-25 21:05 ` jeffxu @ 2024-04-25 22:49 ` Aruna Ramakrishna 2024-04-26 0:12 ` Jeff Xu 0 siblings, 1 reply; 14+ messages in thread From: Aruna Ramakrishna @ 2024-04-25 22:49 UTC (permalink / raw) To: jeffxu@chromium.org Cc: Andrew Brownsword, dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org, Matthias Neugschwandtner, tglx@linutronix.de, jeffxu@google.com, jannh@google.com, sroettger@google.com, x86@kernel.org > On Apr 25, 2024, at 2:05 PM, jeffxu@chromium.org wrote: > > From: Jeff Xu <jeffxu@chromium.org> > > On 3/21/24 14:56, Aruna Ramakrishna wrote: >> Enabling both the non-zero pkey (for the thread) and pkey zero (in >> userspace) will not work for us. We cannot have the alt stack writeable >> by all - the rationale here is that the code running in that thread >> (using a non-zero pkey) is untrusted and should not have access to the >> alternate signal stack (that uses pkey zero), to prevent the return >> address of a function from being changed. The expectation is that kernel >> should be able to set up the alternate signal stack and deliver the >> signal to the application even if pkey zero is explicitly disabled by >> the application. The signal handler accessibility should not be dictated >> by the PKRU value that the thread sets up. >> > We have a similar threat model that we don't want "untrusted threads" to > access altstack. I think this patch need not be restricted to the > use case of zero pkey for altstack, i.e. application can also set > non-zero pkey to altstack and expect the same. Agreed. In the latest version of this patchset, this assumption has been removed. Link here: https://lore.kernel.org/lkml/20240425180542.1042933-1-aruna.ramakrishna@oracle.com/T/#t > >> Solution: >> The PKRU register is managed by XSAVE, which means the sigframe contents >> must match the register contents - which is not the case here. We want >> the sigframe to contain the user-defined PKRU value (so that it is >> restored correctly from sigcontext) but the actual register must be >> reset to init_pkru so that the alt stack is accessible and the signal >> can be delivered to the application. It seems that the proper fix here >> would be to remove PKRU from the XSAVE framework and manage it >> separately, which is quite complicated. As a workaround, this patch does >> something like this: >> >> orig_pkru = rdpkru(); >> wrpkru(init_pkru & orig_pkru); >> xsave_to_user_sigframe(); >> put_user(pkru_sigframe_addr, orig_pkru) >> > The default PKRU of thread [1] is set as 01 (disable access) for each PKEY > from 1 to 15, and 00 (RW) for PKEY 0. > > Let's use pkey 1 as an example: > The init_pkru is 01, if the thread has PKRU (orig_pkru) as 10 (disable write > but have read) then new_pkru from (init_pkru & orig_pkru) is 00, which gives > RW access to the pkey 1. > > When the thread has orig_pkru as 01 (disable access) or 00 (RW), new_pkru is > unchanged from orig_pkru. > > Now take pkey 0: > the init_pkru is 00, regardless what threads has, new_pkru will always be 00. > > This seems to work out well for pkey 1 to 15, i.e. signal handing code in > kernel only give write access when the thread alrady has read access to the > PKEY that is used by the altstack. The threat model interesting here is to > prevent untrusted threads from writing to altstack, and read is probably less > of a problem. > This piece of code assumed that the init PKRU value allows writes to the alternative signal stack. As you mentioned earlier, that may not always be true - a non-zero pkey can be used for the altstack. So the new version simply does write_pkru(0) (i.e. enabled all pkeys) before XSAVE. Is this more reasonable? > > Does this meet what you want? (Note the pkey 0 is different than 1-15) > > Suppose someone also like to disable all access to altstack, then there is one > more place to mind: in sigreturn(), it calls restore_altstack(), and requires > read access to altstack. However, at the time, PKRU is already restored from > sigframe, so SEGV will raise (the value in sigframe doesn't have read access > to the PKEY). > > Without changing sigreturn, using wrpkru(0) here might not be necessary: > the dispatch to user space works fine, only to crash at sigreturn step. > > [1] defined by init_pkru_value in pkeys.c > > Best regards, > -Jeff I see what you're saying. In rt_sigreturn(): if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags)) <— restores PKRU, disabling access to altstack goto badframe; ... if (restore_altstack(&frame->uc.uc_stack)) <— needs read access to altstack goto badframe; I’m wary about reordering anything in here. Also, this code is not aware of the altstack permissions. I’m wondering if wrpkru(0) is needed here too. Thanks, Aruna ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-04-25 22:49 ` Aruna Ramakrishna @ 2024-04-26 0:12 ` Jeff Xu 2024-04-26 16:13 ` Jeff Xu 0 siblings, 1 reply; 14+ messages in thread From: Jeff Xu @ 2024-04-26 0:12 UTC (permalink / raw) To: Aruna Ramakrishna Cc: jeffxu@chromium.org, Andrew Brownsword, dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org, Matthias Neugschwandtner, tglx@linutronix.de, jannh@google.com, sroettger@google.com, x86@kernel.org On Thu, Apr 25, 2024 at 3:49 PM Aruna Ramakrishna <aruna.ramakrishna@oracle.com> wrote: > > > > > On Apr 25, 2024, at 2:05 PM, jeffxu@chromium.org wrote: > > > > From: Jeff Xu <jeffxu@chromium.org> > > > > On 3/21/24 14:56, Aruna Ramakrishna wrote: > >> Enabling both the non-zero pkey (for the thread) and pkey zero (in > >> userspace) will not work for us. We cannot have the alt stack writeable > >> by all - the rationale here is that the code running in that thread > >> (using a non-zero pkey) is untrusted and should not have access to the > >> alternate signal stack (that uses pkey zero), to prevent the return > >> address of a function from being changed. The expectation is that kernel > >> should be able to set up the alternate signal stack and deliver the > >> signal to the application even if pkey zero is explicitly disabled by > >> the application. The signal handler accessibility should not be dictated > >> by the PKRU value that the thread sets up. > >> > > We have a similar threat model that we don't want "untrusted threads" to > > access altstack. I think this patch need not be restricted to the > > use case of zero pkey for altstack, i.e. application can also set > > non-zero pkey to altstack and expect the same. > > Agreed. In the latest version of this patchset, this assumption has been removed. > > Link here: > https://lore.kernel.org/lkml/20240425180542.1042933-1-aruna.ramakrishna@oracle.com/T/#t > > > > >> Solution: > >> The PKRU register is managed by XSAVE, which means the sigframe contents > >> must match the register contents - which is not the case here. We want > >> the sigframe to contain the user-defined PKRU value (so that it is > >> restored correctly from sigcontext) but the actual register must be > >> reset to init_pkru so that the alt stack is accessible and the signal > >> can be delivered to the application. It seems that the proper fix here > >> would be to remove PKRU from the XSAVE framework and manage it > >> separately, which is quite complicated. As a workaround, this patch does > >> something like this: > >> > >> orig_pkru = rdpkru(); > >> wrpkru(init_pkru & orig_pkru); > >> xsave_to_user_sigframe(); > >> put_user(pkru_sigframe_addr, orig_pkru) > >> > > The default PKRU of thread [1] is set as 01 (disable access) for each PKEY > > from 1 to 15, and 00 (RW) for PKEY 0. > > > > Let's use pkey 1 as an example: > > The init_pkru is 01, if the thread has PKRU (orig_pkru) as 10 (disable write > > but have read) then new_pkru from (init_pkru & orig_pkru) is 00, which gives > > RW access to the pkey 1. > > > > When the thread has orig_pkru as 01 (disable access) or 00 (RW), new_pkru is > > unchanged from orig_pkru. > > > > Now take pkey 0: > > the init_pkru is 00, regardless what threads has, new_pkru will always be 00. > > > > This seems to work out well for pkey 1 to 15, i.e. signal handing code in > > kernel only give write access when the thread alrady has read access to the > > PKEY that is used by the altstack. The threat model interesting here is to > > prevent untrusted threads from writing to altstack, and read is probably less > > of a problem. > > > > This piece of code assumed that the init PKRU value allows writes to the alternative > signal stack. As you mentioned earlier, that may not always be true - a non-zero pkey > can be used for the altstack. > Only PKEY 0 has init PKRU as 00. So in your case, the thread doesn't have write access to pkey 0, and need the write access to pkey 0 during signal handling. > So the new version simply does write_pkru(0) (i.e. enabled all pkeys) before XSAVE. > Is this more reasonable? > write_pkru(0) will work, but it is unnecessary in the current patch. Consider the following case: A thread has no access to pkey 1, and use pkey 1 for its altstack. In V3 (use write_pkru(0): Signal will be dispatched to the user, i.e. write to signal frame is OK, but it will SEGV at sigreturn. In V2: it will SEGV earlier at dispatch stage when writing to sigframe. I would rather that the code fails earlier for this case. > > > > Does this meet what you want? (Note the pkey 0 is different than 1-15) > > > > Suppose someone also like to disable all access to altstack, then there is one > > more place to mind: in sigreturn(), it calls restore_altstack(), and requires > > read access to altstack. However, at the time, PKRU is already restored from > > sigframe, so SEGV will raise (the value in sigframe doesn't have read access > > to the PKEY). > > > > Without changing sigreturn, using wrpkru(0) here might not be necessary: > > the dispatch to user space works fine, only to crash at sigreturn step. > > > > [1] defined by init_pkru_value in pkeys.c > > > > Best regards, > > -Jeff > > > I see what you're saying. In rt_sigreturn(): > > if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags)) <— restores PKRU, disabling access to altstack > goto badframe; > ... > if (restore_altstack(&frame->uc.uc_stack)) <— needs read access to altstack > goto badframe; > > > I’m wary about reordering anything in here. Also, this code is not aware of the altstack permissions. I’m wondering if wrpkru(0) is needed here too. > We can't change PKRU after restore_sigcontext, the calling thread would have PKRU 0, not the original PKRU from before handling the signal. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-04-26 0:12 ` Jeff Xu @ 2024-04-26 16:13 ` Jeff Xu 2024-04-26 16:33 ` Edgecombe, Rick P 0 siblings, 1 reply; 14+ messages in thread From: Jeff Xu @ 2024-04-26 16:13 UTC (permalink / raw) To: Jeff Xu Cc: Aruna Ramakrishna, Andrew Brownsword, dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org, Matthias Neugschwandtner, tglx@linutronix.de, jannh@google.com, sroettger@google.com, x86@kernel.org, Kees Cook, rick.p.edgecombe On Thu, Apr 25, 2024 at 5:12 PM Jeff Xu <jeffxu@google.com> wrote: > > On Thu, Apr 25, 2024 at 3:49 PM Aruna Ramakrishna > <aruna.ramakrishna@oracle.com> wrote: > > > > > > > > > On Apr 25, 2024, at 2:05 PM, jeffxu@chromium.org wrote: > > > > > > From: Jeff Xu <jeffxu@chromium.org> > > > > > > On 3/21/24 14:56, Aruna Ramakrishna wrote: > > >> Enabling both the non-zero pkey (for the thread) and pkey zero (in > > >> userspace) will not work for us. We cannot have the alt stack writeable > > >> by all - the rationale here is that the code running in that thread > > >> (using a non-zero pkey) is untrusted and should not have access to the > > >> alternate signal stack (that uses pkey zero), to prevent the return > > >> address of a function from being changed. The expectation is that kernel > > >> should be able to set up the alternate signal stack and deliver the > > >> signal to the application even if pkey zero is explicitly disabled by > > >> the application. The signal handler accessibility should not be dictated > > >> by the PKRU value that the thread sets up. > > >> > > > We have a similar threat model that we don't want "untrusted threads" to > > > access altstack. I think this patch need not be restricted to the > > > use case of zero pkey for altstack, i.e. application can also set > > > non-zero pkey to altstack and expect the same. > > > > Agreed. In the latest version of this patchset, this assumption has been removed. > > > > Link here: > > https://lore.kernel.org/lkml/20240425180542.1042933-1-aruna.ramakrishna@oracle.com/T/#t > > > > > > > >> Solution: > > >> The PKRU register is managed by XSAVE, which means the sigframe contents > > >> must match the register contents - which is not the case here. We want > > >> the sigframe to contain the user-defined PKRU value (so that it is > > >> restored correctly from sigcontext) but the actual register must be > > >> reset to init_pkru so that the alt stack is accessible and the signal > > >> can be delivered to the application. It seems that the proper fix here > > >> would be to remove PKRU from the XSAVE framework and manage it > > >> separately, which is quite complicated. As a workaround, this patch does > > >> something like this: > > >> > > >> orig_pkru = rdpkru(); > > >> wrpkru(init_pkru & orig_pkru); > > >> xsave_to_user_sigframe(); > > >> put_user(pkru_sigframe_addr, orig_pkru) > > >> > > > The default PKRU of thread [1] is set as 01 (disable access) for each PKEY > > > from 1 to 15, and 00 (RW) for PKEY 0. > > > > > > Let's use pkey 1 as an example: > > > The init_pkru is 01, if the thread has PKRU (orig_pkru) as 10 (disable write > > > but have read) then new_pkru from (init_pkru & orig_pkru) is 00, which gives > > > RW access to the pkey 1. > > > > > > When the thread has orig_pkru as 01 (disable access) or 00 (RW), new_pkru is > > > unchanged from orig_pkru. > > > > > > Now take pkey 0: > > > the init_pkru is 00, regardless what threads has, new_pkru will always be 00. > > > > > > This seems to work out well for pkey 1 to 15, i.e. signal handing code in > > > kernel only give write access when the thread alrady has read access to the > > > PKEY that is used by the altstack. The threat model interesting here is to > > > prevent untrusted threads from writing to altstack, and read is probably less > > > of a problem. > > > > > > > This piece of code assumed that the init PKRU value allows writes to the alternative > > signal stack. As you mentioned earlier, that may not always be true - a non-zero pkey > > can be used for the altstack. > > > Only PKEY 0 has init PKRU as 00. > So in your case, the thread doesn't have write access to pkey 0, and > need the write > access to pkey 0 during signal handling. > > > So the new version simply does write_pkru(0) (i.e. enabled all pkeys) before XSAVE. > > Is this more reasonable? > > > write_pkru(0) will work, but it is unnecessary in the current patch. > > Consider the following case: A thread has no access to pkey 1, and > use pkey 1 for its altstack. > > In V3 (use write_pkru(0): > Signal will be dispatched to the user, i.e. write to signal frame is > OK, but it will SEGV at sigreturn. > > In V2: > it will SEGV earlier at dispatch stage when writing to sigframe. > > I would rather that the code fails earlier for this case. > > > > > > > Does this meet what you want? (Note the pkey 0 is different than 1-15) > > > > > > Suppose someone also like to disable all access to altstack, then there is one > > > more place to mind: in sigreturn(), it calls restore_altstack(), and requires > > > read access to altstack. However, at the time, PKRU is already restored from > > > sigframe, so SEGV will raise (the value in sigframe doesn't have read access > > > to the PKEY). > > > > > > Without changing sigreturn, using wrpkru(0) here might not be necessary: > > > the dispatch to user space works fine, only to crash at sigreturn step. > > > > > > [1] defined by init_pkru_value in pkeys.c > > > > > > Best regards, > > > -Jeff > > > > > > I see what you're saying. In rt_sigreturn(): > > > > if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags)) <— restores PKRU, disabling access to altstack > > goto badframe; > > ... > > if (restore_altstack(&frame->uc.uc_stack)) <— needs read access to altstack > > goto badframe; > > > > > > I’m wary about reordering anything here. Also, this code is not aware of the altstack permissions. I’m wondering if wrpkru(0) is needed here too. > > > We can't change PKRU after restore_sigcontext, the calling thread > would have PKRU 0, not the original PKRU from before handling the > signal. probably putting restore_altstack ahead of restore_sigcontext would be good enough. restore_altstack doesn't seem to need to be after restore_sigcontex, it reads data from the sigframe and calls do_sigaltstack to update the current struct. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-04-26 16:13 ` Jeff Xu @ 2024-04-26 16:33 ` Edgecombe, Rick P 2024-04-26 17:13 ` Jeff Xu 0 siblings, 1 reply; 14+ messages in thread From: Edgecombe, Rick P @ 2024-04-26 16:33 UTC (permalink / raw) To: jeffxu@google.com, jeffxu@chromium.org Cc: dave.hansen@linux.intel.com, x86@kernel.org, jannh@google.com, andrew.brownsword@oracle.com, matthias.neugschwandtner@oracle.com, tglx@linutronix.de, aruna.ramakrishna@oracle.com, sroettger@google.com, linux-kernel@vger.kernel.org, keescook@chromium.org On Fri, 2024-04-26 at 09:13 -0700, Jeff Xu wrote: > > > I’m wary about reordering anything here. Also, this code is not aware of > > > the altstack permissions. I’m wondering if wrpkru(0) is needed here too. > > > > > We can't change PKRU after restore_sigcontext, the calling thread > > would have PKRU 0, not the original PKRU from before handling the > > signal. > > probably putting restore_altstack ahead of restore_sigcontext would be > good enough. > restore_altstack doesn't seem to need to be after restore_sigcontex, > it reads data > from the sigframe and calls do_sigaltstack to update the current struct. Just was CCed, and haven't reviewed the whole thread. But I hit an issue with the ordering in setting up a signal frame. I noted that the ordering in sigreturn was potentially wrong in the same way: https://lore.kernel.org/lkml/20231107182251.91276-1-rick.p.edgecombe@intel.com/ It might be useful analysis. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-04-26 16:33 ` Edgecombe, Rick P @ 2024-04-26 17:13 ` Jeff Xu 0 siblings, 0 replies; 14+ messages in thread From: Jeff Xu @ 2024-04-26 17:13 UTC (permalink / raw) To: Edgecombe, Rick P Cc: jeffxu@chromium.org, dave.hansen@linux.intel.com, x86@kernel.org, jannh@google.com, andrew.brownsword@oracle.com, matthias.neugschwandtner@oracle.com, tglx@linutronix.de, aruna.ramakrishna@oracle.com, sroettger@google.com, linux-kernel@vger.kernel.org, keescook@chromium.org On Fri, Apr 26, 2024 at 9:33 AM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote: > > On Fri, 2024-04-26 at 09:13 -0700, Jeff Xu wrote: > > > > I’m wary about reordering anything here. Also, this code is not aware of > > > > the altstack permissions. I’m wondering if wrpkru(0) is needed here too. > > > > > > > We can't change PKRU after restore_sigcontext, the calling thread > > > would have PKRU 0, not the original PKRU from before handling the > > > signal. > > > > probably putting restore_altstack ahead of restore_sigcontext would be > > good enough. > > restore_altstack doesn't seem to need to be after restore_sigcontex, > > it reads data > > from the sigframe and calls do_sigaltstack to update the current struct. > > Just was CCed, and haven't reviewed the whole thread. > > But I hit an issue with the ordering in setting up a signal frame. I noted that > the ordering in sigreturn was potentially wrong in the same way: > https://lore.kernel.org/lkml/20231107182251.91276-1-rick.p.edgecombe@intel.com/ > > It might be useful analysis. Great! so it is already noticed. It can be fixed in this patch set. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna ` (2 preceding siblings ...) 2024-04-25 21:05 ` jeffxu @ 2024-04-25 21:58 ` jeffxu 3 siblings, 0 replies; 14+ messages in thread From: jeffxu @ 2024-04-25 21:58 UTC (permalink / raw) To: aruna.ramakrishna Cc: andrew.brownsword, dave.hansen, linux-kernel, matthias.neugschwandtner, tglx, jeffxu, jeffxu, jannh, sroettger, x86 From: Jeff Xu <jeffxu@chromium.org> >The semantics you've implemented for sigaltstacks are not the only possible >ones. In principle, a signal handler with its own stack might want to have >its own key(s) enabled. In a way a Linux signal handler is a mini-thread >created on the fly, with its own stack and its own attributes. Some thought >& analysis should go into which way to go here, and the best path should be >chosen. Fixing the SIGSEGV you observed should be a happy side effect of >other worthwile improvements. > This is exactly right. we wants to use altstack with its own pkey. The pkey won't be writeable during thread's normal operation, only by signaling handling itself. Thanks -Jeff >Thanks, > > Ingo > ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-04-26 17:13 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-21 21:56 [RFC PATCH v2 0/1] x86/pkeys: update PKRU to enable pkey 0 Aruna Ramakrishna 2024-03-21 21:56 ` [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE Aruna Ramakrishna 2024-03-22 9:46 ` Ingo Molnar 2024-03-22 18:30 ` Aruna Ramakrishna 2024-04-25 22:03 ` jeffxu 2024-03-22 15:40 ` Dave Hansen 2024-03-22 18:28 ` Aruna Ramakrishna 2024-04-25 21:05 ` jeffxu 2024-04-25 22:49 ` Aruna Ramakrishna 2024-04-26 0:12 ` Jeff Xu 2024-04-26 16:13 ` Jeff Xu 2024-04-26 16:33 ` Edgecombe, Rick P 2024-04-26 17:13 ` Jeff Xu 2024-04-25 21:58 ` jeffxu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox