* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features [not found] <20240116234901.3238852-1-avagin@google.com> @ 2024-01-17 19:34 ` Dave Hansen 2024-01-17 22:30 ` Andrei Vagin 0 siblings, 1 reply; 12+ messages in thread From: Dave Hansen @ 2024-01-17 19:34 UTC (permalink / raw) To: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML Cc: x86, H. Peter Anvin ... adding LKML. More context here: https://lore.kernel.org/all/20240116234901.3238852-1-avagin@google.com/ On 1/16/24 15:49, Andrei Vagin wrote: > + /* xstate_size has to fit all requested components. */ > + if (fx_sw->xstate_size != fpstate->user_size) { > + int min_xstate_size = > + xstate_calculate_size(fx_sw->xfeatures, false); > + > + if (min_xstate_size < 0 || > + fx_sw->xstate_size < min_xstate_size || > + fx_sw->xstate_size > fpstate->user_size) > + goto setfx; > + } The bug here is that the buffer from userspace is garbage and the (XSAVE XSTATE_BV) metadata doesn't match the size of the buffer. Right? This proposed fix just checks another piece of user-supplied metadata instead: fx_sw->xstate_size. Can't userspace just provide more bad data there and end up with the same problem? Seems like the real problem here is that the fault_in_readable() doesn't match the XRSTOR. It's going to continue to be a problem as long as we don't know what memory XRSTOR tried to access. We can try all day long to precalculate what XRSTOR _will_ do, but that seems a bit silly because the CPU knows where the fault happened. It told us in CR2 and all we have to do is plumb that back to fault_in_readable(). It would take a little XSTATE_OP() munging to pass something back other than 'err', but that doesn't seem insurmountable. Anybody have better ideas? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-17 19:34 ` [PATCH] x86/fpu: verify xstate buffer size according with requested features Dave Hansen @ 2024-01-17 22:30 ` Andrei Vagin 2024-01-17 23:52 ` Dave Hansen 0 siblings, 1 reply; 12+ messages in thread From: Andrei Vagin @ 2024-01-17 22:30 UTC (permalink / raw) To: Dave Hansen Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Wed, Jan 17, 2024 at 11:34 AM Dave Hansen <dave.hansen@intel.com> wrote: > > .. adding LKML. More context here: > > https://lore.kernel.org/all/20240116234901.3238852-1-avagin@google.com/ > > On 1/16/24 15:49, Andrei Vagin wrote: > > + /* xstate_size has to fit all requested components. */ > > + if (fx_sw->xstate_size != fpstate->user_size) { > > + int min_xstate_size = > > + xstate_calculate_size(fx_sw->xfeatures, false); > > + > > + if (min_xstate_size < 0 || > > + fx_sw->xstate_size < min_xstate_size || > > + fx_sw->xstate_size > fpstate->user_size) > > + goto setfx; > > + } > > The bug here is that the buffer from userspace is garbage and the (XSAVE > XSTATE_BV) metadata doesn't match the size of the buffer. Right? right > > This proposed fix just checks another piece of user-supplied metadata > instead: fx_sw->xstate_size. > > Can't userspace just provide more bad data there and end up with the > same problem? It can't... I would not post this change if I thought otherwise... > > Seems like the real problem here is that the fault_in_readable() doesn't > match the XRSTOR. It's going to continue to be a problem as long as we > don't know what memory XRSTOR tried to access. We can try all day long > to precalculate what XRSTOR _will_ do, but that seems a bit silly I don't understand this part. The behavior of XRSTOR is well-defined by CPU specifications, allowing us to easily precalculate the memory it will attempt to access. What does it mean "we don't know what memory XRSTOR tried to access"? xrstor restores only features that are set in fx_sw->xfeatures. > because the CPU knows where the fault happened. It told us in CR2 and > all we have to do is plumb that back to fault_in_readable(). I considered this option as well, but then I decided that this approach is better. The most important aspect is that it always rejects bad buffers, allowing a user space to detect an issue even when a fault isn't triggered. I believe proper handling of xrstor page faults could be a valuable additional improvement to this change. If we detect a fault outside of a provided buffer, we can print a warning to signal that check_xstate_in_sigframe is incomplete. > > It would take a little XSTATE_OP() munging to pass something back other > than 'err', but that doesn't seem insurmountable. > > Anybody have better ideas? > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-17 22:30 ` Andrei Vagin @ 2024-01-17 23:52 ` Dave Hansen 2024-01-18 7:59 ` Andrei Vagin 0 siblings, 1 reply; 12+ messages in thread From: Dave Hansen @ 2024-01-17 23:52 UTC (permalink / raw) To: Andrei Vagin Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On 1/17/24 14:30, Andrei Vagin wrote: > On Wed, Jan 17, 2024 at 11:34 AM Dave Hansen <dave.hansen@intel.com> wrote: >> Seems like the real problem here is that the fault_in_readable() doesn't >> match the XRSTOR. It's going to continue to be a problem as long as we >> don't know what memory XRSTOR tried to access. We can try all day long >> to precalculate what XRSTOR _will_ do, but that seems a bit silly > > I don't understand this part. The behavior of XRSTOR is well-defined > by CPU specifications, allowing us to easily precalculate the memory it > will attempt to access. What does it mean "we don't know what memory > XRSTOR tried to access"? > > xrstor restores only features that are set in fx_sw->xfeatures. XSAVE is a big old pile of fun. Someone is confused here and that someone might be me. Let me walk through my logic a bit. Maybe you can point out where I've gone wrong. :) Take a look at the "Legacy Region of an XSAVE Area" in the SDM. The x87 state component comprises bytes 23:0 and bytes 159:32. The SSE state component comprises bytes 31:24 and bytes 415:160. The XSAVE feature set does not use bytes 511:416; bytes 463:416 are reserved. and the next section: The XSAVE header of an XSAVE area comprises the 64 bytes starting at offset 512 from the area’s base address: ... Bytes 7:0 of the XSAVE header is a state-component bitmap (see Section 13.1) called XSTATE_BV. XRSTOR accesses memory based on XSTATE_BV (and XCOMP_BV which should be irrelevant here). So we're looking for something at 512 bytes up in the XSAVE buffer. Let's have gdb help us out a bit. Where is sw_reserved? (gdb) print/d &((struct fxregs_state *)0)->sw_reserved $4 = 464 Where does XRSTOR itself look? (gdb) print/d &((union fpregs_state *)0)->xsave->header.xfeatures $5 = 512 (xfeatures aka. XSTATE_BV) There _was_ a reason once upon a time that the kernel started sticking a copy of XSTATE_BV in 'sw_reserved'. I just forget what it is at the moment. It's horribly confusing to it laid out like this, but the SDM is pretty clear: "the XSAVE feature set does not use bytes 511:416". "fx_sw" is actually a software-defined and software-only-consumed area of the XSAVE buffer, thus the '_sw'. Nothing in the '_sw' section tells us how the hardware will behave. >> because the CPU knows where the fault happened. It told us in CR2 and >> all we have to do is plumb that back to fault_in_readable(). > > I considered this option as well, but then I decided that this approach > is better. The most important aspect is that it always rejects bad > buffers, allowing a user space to detect an issue even when a fault > isn't triggered. I believe proper handling of xrstor page faults could > be a valuable additional improvement to this change. If we detect a > fault outside of a provided buffer, we can print a warning to signal > that check_xstate_in_sigframe is incomplete. I'm not really following the logic there. What's the downside of taking the fault? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-17 23:52 ` Dave Hansen @ 2024-01-18 7:59 ` Andrei Vagin 2024-01-18 18:27 ` Dave Hansen 2024-01-18 19:07 ` Thomas Gleixner 0 siblings, 2 replies; 12+ messages in thread From: Andrei Vagin @ 2024-01-18 7:59 UTC (permalink / raw) To: Dave Hansen Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Wed, Jan 17, 2024 at 3:52 PM Dave Hansen <dave.hansen@intel.com> wrote: > > On 1/17/24 14:30, Andrei Vagin wrote: > > On Wed, Jan 17, 2024 at 11:34 AM Dave Hansen <dave.hansen@intel.com> wrote: > >> Seems like the real problem here is that the fault_in_readable() doesn't > >> match the XRSTOR. It's going to continue to be a problem as long as we > >> don't know what memory XRSTOR tried to access. We can try all day long > >> to precalculate what XRSTOR _will_ do, but that seems a bit silly > > > > I don't understand this part. The behavior of XRSTOR is well-defined > > by CPU specifications, allowing us to easily precalculate the memory it > > will attempt to access. What does it mean "we don't know what memory > > XRSTOR tried to access"? > > > > xrstor restores only features that are set in fx_sw->xfeatures. > > XSAVE is a big old pile of fun. Someone is confused here and that > someone might be me. Let me walk through my logic a bit. Maybe you can > point out where I've gone wrong. :) I could be wrong too, so thank you for looking at this and helping to find the right solution. > > Take a look at the "Legacy Region of an XSAVE Area" in the SDM. > > The x87 state component comprises bytes 23:0 and bytes 159:32. > The SSE state component comprises bytes 31:24 and bytes 415:160. > The XSAVE feature set does not use bytes 511:416; bytes 463:416 > are reserved. > > and the next section: > > The XSAVE header of an XSAVE area comprises the 64 bytes > starting at offset 512 from the area’s base address: > ... > Bytes 7:0 of the XSAVE header is a state-component bitmap (see > Section 13.1) called XSTATE_BV. > > XRSTOR accesses memory based on XSTATE_BV (and XCOMP_BV which should be > irrelevant here). So we're looking for something at 512 bytes up in the > XSAVE buffer. > > Let's have gdb help us out a bit. Where is sw_reserved? > > (gdb) print/d &((struct fxregs_state *)0)->sw_reserved > $4 = 464 > > Where does XRSTOR itself look? > > (gdb) print/d &((union fpregs_state *)0)->xsave->header.xfeatures > $5 = 512 > > (xfeatures aka. XSTATE_BV) > > There _was_ a reason once upon a time that the kernel started sticking a > copy of XSTATE_BV in 'sw_reserved'. I just forget what it is at the > moment. It's horribly confusing to it laid out like this, but the SDM > is pretty clear: "the XSAVE feature set does not use bytes 511:416". > > "fx_sw" is actually a software-defined and software-only-consumed area > of the XSAVE buffer, thus the '_sw'. Nothing in the '_sw' section tells > us how the hardware will behave. I think you don't take into account the requested-feature bitmap (RFBM), which is the logical-and of edx:eax and XCR0. In our case, edx:eax is set to the value of fx_sw->xfeatures. > > >> because the CPU knows where the fault happened. It told us in CR2 and > >> all we have to do is plumb that back to fault_in_readable(). > > > > I considered this option as well, but then I decided that this approach > > is better. The most important aspect is that it always rejects bad > > buffers, allowing a user space to detect an issue even when a fault > > isn't triggered. I believe proper handling of xrstor page faults could > > be a valuable additional improvement to this change. If we detect a > > fault outside of a provided buffer, we can print a warning to signal > > that check_xstate_in_sigframe is incomplete. > > I'm not really following the logic there. What's the downside of taking > the fault? Let's consider a scenario where someone messed up with an fpu state on a signal frame. With my approach, a mistake can be promptly detected. However, if we incorporate the page fault handling of xrstor, a mistake will only be identified if xrstor triggers a fault. In cases where a buffer is allocated in a large memory mapping, xrstor may silently read memory beyond the buffer. Next time, a page beyond a buffer might be swapped out, xrstore triggers a fault leading to application crashes. Thanks, Andrei ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 7:59 ` Andrei Vagin @ 2024-01-18 18:27 ` Dave Hansen 2024-01-18 19:54 ` Thomas Gleixner 2024-01-18 19:07 ` Thomas Gleixner 1 sibling, 1 reply; 12+ messages in thread From: Dave Hansen @ 2024-01-18 18:27 UTC (permalink / raw) To: Andrei Vagin Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On 1/17/24 23:59, Andrei Vagin wrote: > On Wed, Jan 17, 2024 at 3:52 PM Dave Hansen <dave.hansen@intel.com> wrote: ... >> "fx_sw" is actually a software-defined and software-only-consumed area >> of the XSAVE buffer, thus the '_sw'. Nothing in the '_sw' section tells >> us how the hardware will behave. > > I think you don't take into account the requested-feature bitmap (RFBM), > which is the logical-and of edx:eax and XCR0. In our case, edx:eax > is set to the value of fx_sw->xfeatures. Ahh, I did miss that indeed. Let's step back and look at what's in play: 1. fx_sw->xstate_size, which is eventually passed to fault_in_readable() 2. fx_sw->xfeatures 3. XSTATE_BV (aka. fpu->...->header.xfeatures) 4. What XRSTOR actually does, which is #2 OR'd with #3. 5. xstate_calculate_size(fx_sw->xfeatures) The bug that you've reported here is essentially that the size passed to fault_in_readable() doesn't match what XRSTOR actually does (#4). Note that today, fault_in_readable() may end up faulting in _too_ much memory if there's a bit clear in XSTATE_BV. The proposed fix adds the #5 calculation. It's conservative because fx_sw->xfeatures is represents a superset of what XRSTOR will actually restore. But now, instead of just faulting in too much memory, a too-small fx_sw->xstate_size will end up zapping all the XSAVE state. This is all freakishly complicated and changes a bunch of behavior. It's way too much to be done in a patch with a 5-line commit message. I suspect this needs some real refactoring. I really think fx_sw should remain unmodified after being copied in from userspace. If you want to start interpreting 'fx_sw->xstate_size' as the fault_in_readable() size, then that needs to be a separate logical variable. >>>> because the CPU knows where the fault happened. It told us in CR2 and >>>> all we have to do is plumb that back to fault_in_readable(). >>> >>> I considered this option as well, but then I decided that this approach >>> is better. The most important aspect is that it always rejects bad >>> buffers, allowing a user space to detect an issue even when a fault >>> isn't triggered. I believe proper handling of xrstor page faults could >>> be a valuable additional improvement to this change. If we detect a >>> fault outside of a provided buffer, we can print a warning to signal >>> that check_xstate_in_sigframe is incomplete. >> >> I'm not really following the logic there. What's the downside of taking >> the fault? > > Let's consider a scenario where someone messed up with an fpu state on a > signal frame. With my approach, a mistake can be promptly detected. > However, if we incorporate the page fault handling of xrstor, a mistake > will only be identified if xrstor triggers a fault. In cases where a > buffer is allocated in a large memory mapping, xrstor may silently read > memory beyond the buffer. Next time, a page beyond a buffer might be > swapped out, xrstore triggers a fault leading to application crashes. I think that's an orthogonal problem really. Fault loops are nasty. There's a reason that the architecture provides CR2 instead of depending on software to, for instance, figure out why and how every instruction faulted. It's easy to look into the past and as the CPU where the fault happened. On the other hand, we have XRSTOR. Sure, it's _possible_ to look into the future and figure out what memory XRSTOR will touch. But, we apparently stink at looking into the future (thus this bug). Not because we're stupid, but simply because looking into the future is hard. I'd much rather fix the fault loop problem by looking into the past than predicting the future. The fault handling _must_ be correct or we get hangs. If we have nice, reliable fault handling and then decide that we've got XRSTOR's running amok reading random memory all over the place that need a nicer error message, then we can add that code to predict the future. If our "predict the future" code goes wrong, then we lose an error message -- not a big deal. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 18:27 ` Dave Hansen @ 2024-01-18 19:54 ` Thomas Gleixner 2024-01-18 22:02 ` Dave Hansen 0 siblings, 1 reply; 12+ messages in thread From: Thomas Gleixner @ 2024-01-18 19:54 UTC (permalink / raw) To: Dave Hansen, Andrei Vagin Cc: Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Thu, Jan 18 2024 at 10:27, Dave Hansen wrote: > If we have nice, reliable fault handling and then decide that we've got > XRSTOR's running amok reading random memory all over the place that need > a nicer error message, then we can add that code to predict the future. > If our "predict the future" code goes wrong, then we lose an error > message -- not a big deal. After staring more at it, it's arguable to pass fpstate->user_size to fault_in_readable() and ignore fx_sw->xstate_size completely. That's a guaranteed to be reliable size which prevents endless loops because arguably that's the maximum size which can be touched by XRSTOR, no? Thanks, tglx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 19:54 ` Thomas Gleixner @ 2024-01-18 22:02 ` Dave Hansen 2024-01-18 22:11 ` Thomas Gleixner 0 siblings, 1 reply; 12+ messages in thread From: Dave Hansen @ 2024-01-18 22:02 UTC (permalink / raw) To: Thomas Gleixner, Andrei Vagin Cc: Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On 1/18/24 11:54, Thomas Gleixner wrote: > On Thu, Jan 18 2024 at 10:27, Dave Hansen wrote: >> If we have nice, reliable fault handling and then decide that we've got >> XRSTOR's running amok reading random memory all over the place that need >> a nicer error message, then we can add that code to predict the future. >> If our "predict the future" code goes wrong, then we lose an error >> message -- not a big deal. > After staring more at it, it's arguable to pass fpstate->user_size to > fault_in_readable() and ignore fx_sw->xstate_size completely. > > That's a guaranteed to be reliable size which prevents endless loops > because arguably that's the maximum size which can be touched by XRSTOR, > no? I like it. It takes fx_sw completely out of the picture, which was the root of the problem in the first place. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 22:02 ` Dave Hansen @ 2024-01-18 22:11 ` Thomas Gleixner 2024-01-22 3:58 ` Andrei Vagin 0 siblings, 1 reply; 12+ messages in thread From: Thomas Gleixner @ 2024-01-18 22:11 UTC (permalink / raw) To: Dave Hansen, Andrei Vagin Cc: Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Thu, Jan 18 2024 at 14:02, Dave Hansen wrote: > On 1/18/24 11:54, Thomas Gleixner wrote: >> On Thu, Jan 18 2024 at 10:27, Dave Hansen wrote: >>> If we have nice, reliable fault handling and then decide that we've got >>> XRSTOR's running amok reading random memory all over the place that need >>> a nicer error message, then we can add that code to predict the future. >>> If our "predict the future" code goes wrong, then we lose an error >>> message -- not a big deal. >> After staring more at it, it's arguable to pass fpstate->user_size to >> fault_in_readable() and ignore fx_sw->xstate_size completely. >> >> That's a guaranteed to be reliable size which prevents endless loops >> because arguably that's the maximum size which can be touched by XRSTOR, >> no? > > I like it. It takes fx_sw completely out of the picture, which was the > root of the problem in the first place. Correct. I really don't care about the esoteric case where this might theoretically result in a unjustified application abort. You really need to twist your brain around 6 corners and then squint twice to construct that case. Of course syzcaller might trigger it, but fuzzing the sigreturn frame is a #GP, #PF and whatever lottery anyway. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 22:11 ` Thomas Gleixner @ 2024-01-22 3:58 ` Andrei Vagin 2024-01-22 13:26 ` Thomas Gleixner 0 siblings, 1 reply; 12+ messages in thread From: Andrei Vagin @ 2024-01-22 3:58 UTC (permalink / raw) To: Thomas Gleixner Cc: Dave Hansen, Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Thu, Jan 18, 2024 at 2:11 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, Jan 18 2024 at 14:02, Dave Hansen wrote: > > On 1/18/24 11:54, Thomas Gleixner wrote: > >> On Thu, Jan 18 2024 at 10:27, Dave Hansen wrote: > >>> If we have nice, reliable fault handling and then decide that we've got > >>> XRSTOR's running amok reading random memory all over the place that need > >>> a nicer error message, then we can add that code to predict the future. > >>> If our "predict the future" code goes wrong, then we lose an error > >>> message -- not a big deal. > >> After staring more at it, it's arguable to pass fpstate->user_size to > >> fault_in_readable() and ignore fx_sw->xstate_size completely. > >> > >> That's a guaranteed to be reliable size which prevents endless loops > >> because arguably that's the maximum size which can be touched by XRSTOR, > >> no? fpstate->user_size isn't constant. It can be modified from the XFD #NM handler. For example, it happens when a process invokes one of amx instructions for the first time. It means we have to be able to restore an fpu state from signal frames generated with a smaller fpstate->user_size. Can it trigger any issues? > > > > I like it. It takes fx_sw completely out of the picture, which was the > > root of the problem in the first place. > > Correct. > > I really don't care about the esoteric case where this might > theoretically result in a unjustified application abort. > > You really need to twist your brain around 6 corners and then squint > twice to construct that case. Of course syzcaller might trigger it, but > fuzzing the sigreturn frame is a #GP, #PF and whatever lottery anyway. In my case, the bug was triggered by gVisor (it is like the user-mode Linux). Thanks, Andrei ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-22 3:58 ` Andrei Vagin @ 2024-01-22 13:26 ` Thomas Gleixner 0 siblings, 0 replies; 12+ messages in thread From: Thomas Gleixner @ 2024-01-22 13:26 UTC (permalink / raw) To: Andrei Vagin Cc: Dave Hansen, Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Sun, Jan 21 2024 at 19:58, Andrei Vagin wrote: > On Thu, Jan 18, 2024 at 2:11 PM Thomas Gleixner <tglx@linutronix.de> wrote: >> >> After staring more at it, it's arguable to pass fpstate->user_size to >> >> fault_in_readable() and ignore fx_sw->xstate_size completely. >> >> >> >> That's a guaranteed to be reliable size which prevents endless loops >> >> because arguably that's the maximum size which can be touched by XRSTOR, >> >> no? > > fpstate->user_size isn't constant. It can be modified from the XFD #NM > handler. For example, it happens when a process invokes one of amx > instructions for the first time. It means we have to be able to restore > an fpu state from signal frames generated with a smaller > fpstate->user_size. Can it trigger any issues? I know, but the #NM handler does not run in the signal restore path. fpstate->user_size is guaranteed to be correct at that point. The untested below should just work. Thanks, tglx --- --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -274,8 +274,7 @@ static int __restore_fpregs_from_user(vo * Attempt to restore the FPU registers directly from user memory. * Pagefaults are handled and any errors returned are fatal. */ -static bool restore_fpregs_from_user(void __user *buf, u64 xrestore, - bool fx_only, unsigned int size) +static bool restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only) { struct fpu *fpu = ¤t->thread.fpu; int ret; @@ -309,7 +308,7 @@ static bool restore_fpregs_from_user(voi if (ret != X86_TRAP_PF) return false; - if (!fault_in_readable(buf, size)) + if (!fault_in_readable(buf, fpu->fpstate->user_size)) goto retry; return false; } @@ -339,7 +338,6 @@ static bool __fpu_restore_sig(void __use struct user_i387_ia32_struct env; bool success, fx_only = false; union fpregs_state *fpregs; - unsigned int state_size; u64 user_xfeatures = 0; if (use_xsave()) { @@ -349,17 +347,14 @@ static bool __fpu_restore_sig(void __use return false; fx_only = !fx_sw_user.magic1; - state_size = fx_sw_user.xstate_size; user_xfeatures = fx_sw_user.xfeatures; } else { user_xfeatures = XFEATURE_MASK_FPSSE; - state_size = fpu->fpstate->user_size; } if (likely(!ia32_fxstate)) { /* Restore the FPU registers directly from user memory. */ - return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only, - state_size); + return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only); } /* ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 7:59 ` Andrei Vagin 2024-01-18 18:27 ` Dave Hansen @ 2024-01-18 19:07 ` Thomas Gleixner 2024-01-22 6:43 ` Andrei Vagin 1 sibling, 1 reply; 12+ messages in thread From: Thomas Gleixner @ 2024-01-18 19:07 UTC (permalink / raw) To: Andrei Vagin, Dave Hansen Cc: Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Wed, Jan 17 2024 at 23:59, Andrei Vagin wrote: > On Wed, Jan 17, 2024 at 3:52 PM Dave Hansen <dave.hansen@intel.com> wrote: >> I'm not really following the logic there. What's the downside of taking >> the fault? > > Let's consider a scenario where someone messed up with an fpu state on a > signal frame. Then he can rightfully keep the pieces... > With my approach, a mistake can be promptly detected. How so? Everything which ends up at the 'setfx:' label will just silently fall back to FX only and init all other components. > However, if we incorporate the page fault handling of xrstor, a > mistake will only be identified if xrstor triggers a fault. In cases > where a buffer is allocated in a large memory mapping, xrstor may > silently read memory beyond the buffer. It's either failing the restore due to invalid data (#GP) or it will restore garbage. User space asked for it. > Next time, a page beyond a buffer might be swapped out, xrstore > triggers a fault leading to application crashes. If it's swapped out it will be swapped back in, no crash. There are two ways for crashing: 1) There is no mapping or a non-sufficient mapping i.e. fault_in_readable() fails. 2) The data in the buffer is invalid. Crashing the application in both cases is just fine. The nasty part is that the expected size of the user space buffer is taken from fx_sw->xstate_size. So you can construct a sigreturn frame where 1) fx_sw->xstate_size is smaller than the size required by the valid bits in fx_sw->xfeatures. 2) user space unmapped parts of the stack so that not all of the buffer (as required by XRSTOR) is accessible. Now XRSTOR tries to restore and accesses the unmapped part of the stack, which results in a fault. But fault_in_readable() succeeds because 'buf + fx_sw->xstate_size' is within the still mapped stack. So it goes back and tries XRSTOR again. Lather, rinse and repeat. That's what Andrej is trying to prevent by calculating the size required by the valid bits in fx_sw->xfeatures and validating that against fx_sw->xstate_size. That fx_sw construct is yet another horror from the past. It's not much better than xsave itself. Thanks, tglx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features 2024-01-18 19:07 ` Thomas Gleixner @ 2024-01-22 6:43 ` Andrei Vagin 0 siblings, 0 replies; 12+ messages in thread From: Andrei Vagin @ 2024-01-22 6:43 UTC (permalink / raw) To: Thomas Gleixner Cc: Dave Hansen, Andrei Vagin, Ingo Molnar, Borislav Petkov, Dave Hansen, LKML, x86, H. Peter Anvin On Thu, Jan 18, 2024 at 11:07 AM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Wed, Jan 17 2024 at 23:59, Andrei Vagin wrote: > > On Wed, Jan 17, 2024 at 3:52 PM Dave Hansen <dave.hansen@intel.com> wrote: > >> I'm not really following the logic there. What's the downside of taking > >> the fault? > > > > Let's consider a scenario where someone messed up with an fpu state on a > > signal frame. > > Then he can rightfully keep the pieces... > > > With my approach, a mistake can be promptly detected. > > How so? Everything which ends up at the 'setfx:' label will just > silently fall back to FX only and init all other components. and it will trigger the x86_fpu_xstate_check_failed tracing event. Usually, when I want to check that our code generates sigframe fpu states properly, I enable this event type, run tests and check that no event has been triggered. Plus, we have tests that check fpu register values, so they will fail if registers are changed in unexpected values. Thanks, Andrei ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-01-22 13:26 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240116234901.3238852-1-avagin@google.com>
2024-01-17 19:34 ` [PATCH] x86/fpu: verify xstate buffer size according with requested features Dave Hansen
2024-01-17 22:30 ` Andrei Vagin
2024-01-17 23:52 ` Dave Hansen
2024-01-18 7:59 ` Andrei Vagin
2024-01-18 18:27 ` Dave Hansen
2024-01-18 19:54 ` Thomas Gleixner
2024-01-18 22:02 ` Dave Hansen
2024-01-18 22:11 ` Thomas Gleixner
2024-01-22 3:58 ` Andrei Vagin
2024-01-22 13:26 ` Thomas Gleixner
2024-01-18 19:07 ` Thomas Gleixner
2024-01-22 6:43 ` Andrei Vagin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox