* [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
@ 2026-04-29 0:06 Andrei Vagin
2026-04-29 7:26 ` Chang S. Bae
0 siblings, 1 reply; 13+ messages in thread
From: Andrei Vagin @ 2026-04-29 0:06 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen
Cc: linux-kernel, criu, x86, Andrei Vagin, Chang S. Bae, stable

This reverts commit dc8aa31a7ac2 ("x86/fpu: Refine and simplify the
magic number check during signal return").

The reverted commit broke applications that construct signal frames in
userspace (such as CRIU and gVisor) if the frame's xstate size is
smaller than the kernel's fpstate->user_size.

Furthermore, this introduces a critical issue for checkpoint/restore
tools like CRIU. If a process is checkpointed while inside a signal
handler, its stack contains a signal frame formatted according to the
source host's xstate capabilities. If that process is later restored on
a destination host with larger xstate capabilities (e.g., a newer CPU
with more features enabled, resulting in a larger fpstate->user_size),
the kernel will look for FP_XSTATE_MAGIC2 at the destination host's
larger user_size offset instead of the offset encoded in the frame's
fx_sw->xstate_size. This causes the magic2 check to fail, forcing
sigreturn to silently fall back to "FX-only" mode. Upon return from the
signal handler, the process's extended state is reset to initial values
instead of being restored, leading to silent data corruption.

The original commit cited commit d877550eaf2d ("x86/fpu: Stop
relying on userspace for info to fault in xsave buffer") as
justification to stop relying on userspace for the magic number check.
However, these two changes are fundamentally different. The cited commit
only changed how much memory the kernel ensures is paged-in before
running XRSTOR to prevent an infinite loop. It did not change the signal
frame format or how the layout is validated.

Reverting this change restores the use of fx_sw->xstate_size for
locating magic2 and restores the necessary sanity checks, ensuring that
the signal frame remains self-describing and portable.

Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: stable@vger.kernel.org
Fixes: dc8aa31a7ac2 ("x86/fpu: Refine and simplify the magic number check during signal return")
Signed-off-by: Andrei Vagin <avagin@google.com>
---
arch/x86/kernel/fpu/signal.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index c3ec2512f2bb..20b638c507ca 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -27,14 +27,19 @@
 static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
 					    struct _fpx_sw_bytes *fx_sw)
 {
+	int min_xstate_size = sizeof(struct fxregs_state) +
+			      sizeof(struct xstate_header);
 	void __user *fpstate = fxbuf;
 	unsigned int magic2;
 
 	if (__copy_from_user(fx_sw, &fxbuf->sw_reserved[0], sizeof(*fx_sw)))
 		return false;
 
-	/* Check for the first magic field */
-	if (fx_sw->magic1 != FP_XSTATE_MAGIC1)
+	/* Check for the first magic field and other error scenarios. */
+	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
+	    fx_sw->xstate_size < min_xstate_size ||
+	    fx_sw->xstate_size > x86_task_fpu(current)->fpstate->user_size ||
+	    fx_sw->xstate_size > fx_sw->extended_size)
 		goto setfx;
 
 	/*
@@ -43,7 +48,7 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
 	 * fpstate layout with out copying the extended state information
 	 * in the memory layout.
 	 */
-	if (__get_user(magic2, (__u32 __user *)(fpstate + x86_task_fpu(current)->fpstate->user_size)))
+	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size)))
 		return false;
 
 	if (likely(magic2 == FP_XSTATE_MAGIC2))
--
2.54.0.545.g6539524ca2-goog
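
The difference between the two lookups described in the commit message
can be modelled in plain userspace C. This is only a sketch, not kernel
code: struct sw_bytes is a hypothetical stand-in for the fields of
struct _fpx_sw_bytes that the patch touches, the helper names are
invented for illustration, the 512 + 64 byte minimum is an assumption
about sizeof(struct fxregs_state) and sizeof(struct xstate_header), and
a "false" return stands for the FX-only fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FP_XSTATE_MAGIC1 0x46505853U
#define FP_XSTATE_MAGIC2 0x46505845U
/* Assumed sizes: 512-byte legacy FXSAVE area plus 64-byte XSAVE header. */
#define MIN_XSTATE_SIZE  (512U + 64U)

/* Hypothetical stand-in for the fields of struct _fpx_sw_bytes used here. */
struct sw_bytes {
	uint32_t magic1;
	uint32_t extended_size;
	uint64_t xfeatures;
	uint32_t xstate_size;
};

/* Bounds-checked read of a 32-bit word at 'off', compared with MAGIC2. */
static bool magic2_at(const uint8_t *frame, size_t len, size_t off)
{
	uint32_t v;

	if (len < sizeof(v) || off > len - sizeof(v))
		return false;
	memcpy(&v, frame + off, sizeof(v));
	return v == FP_XSTATE_MAGIC2;
}

/* Restored behaviour: the frame's own xstate_size locates magic2. */
static bool xstate_ok_frame_offset(const uint8_t *frame, size_t len,
				   const struct sw_bytes *sw,
				   uint32_t user_size)
{
	if (sw->magic1 != FP_XSTATE_MAGIC1 ||
	    sw->xstate_size < MIN_XSTATE_SIZE ||
	    sw->xstate_size > user_size ||
	    sw->xstate_size > sw->extended_size)
		return false;	/* models the FX-only fallback */
	return magic2_at(frame, len, sw->xstate_size);
}

/* Reverted behaviour: the running kernel's user_size locates magic2. */
static bool xstate_ok_kernel_offset(const uint8_t *frame, size_t len,
				    const struct sw_bytes *sw,
				    uint32_t user_size)
{
	if (sw->magic1 != FP_XSTATE_MAGIC1)
		return false;	/* models the FX-only fallback */
	return magic2_at(frame, len, user_size);
}
```

With a frame whose xstate_size is smaller than user_size, the first
helper finds magic2 and the second one reads unrelated bytes, which is
the failure mode the revert addresses.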

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-04-29 7:26 UTC
To: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen
Cc: linux-kernel, criu, x86, stable

On 4/28/2026 5:06 PM, Andrei Vagin wrote:
>
> The reverted commit broke applications that construct signal frames in
> userspace (such as CRIU and gVisor) if the frame's xstate size is
> smaller than the kernel's fpstate->user_size.

In the extended state area, the sigframe embeds the hardware-defined
XSAVE format. If CPU A and CPU B support different XSTATE features, the
layout (size and offsets) differs across systems. However, within a
system, the layout is invariant. Userspace can query CPUID to obtain the
exact offsets and sizes, which effectively defines the ABI.

On top of the XSAVE data, the kernel appends metadata (e.g. the xstate
size and magic values). In particular, fpstate->user_size is written by
save_sw_bytes() at signal delivery. On sigreturn, the kernel validates
this, which is a symmetric and straightforward check.

Because the format is hardware-defined, arbitrary size mismatches should
not be allowed. The sigframe should match the CPU-defined XSAVE layout.
So the change in fact strengthens the sanity check.

> Furthermore, this introduces a critical issue for checkpoint/restore
> tools like CRIU. If a process is checkpointed while inside a signal
> handler, its stack contains a signal frame formatted according to the
> source host's xstate capabilities. If that process is later restored on
> a destination host with larger xstate capabilities (e.g., a newer CPU
> with more features enabled, resulting in a larger fpstate->user_size),
> the kernel will look for FP_XSTATE_MAGIC2 at the destination host's
> larger user_size offset instead of the offset encoded in the frame's
> fx_sw->xstate_size. This causes the magic2 check to fail, forcing
> sigreturn to silently fall back to "FX-only" mode.

It seems that userspace could translate the XSAVE buffer from CPU A's
format to CPU B's format during restore. If so, the frame can be
consistent with the destination system without modifying
fx_sw->xstate_size, and the kernel-side validation would continue to
work as intended.

Thanks,
Chang

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-04-29 16:44 UTC
To: Chang S. Bae
Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Wed, Apr 29, 2026 at 12:27 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 4/28/2026 5:06 PM, Andrei Vagin wrote:
> >
> > The reverted commit broke applications that construct signal frames in
> > userspace (such as CRIU and gVisor) if the frame's xstate size is
> > smaller than the kernel's fpstate->user_size.
>
> In the extended state area, the sigframe embeds the hardware-defined
> XSAVE format. If CPU A and CPU B support different XSTATE features, the
> layout (size and offsets) differs across systems. However, within a
> system, the layout is invariant. Userspace can query CPUID to obtain the
> exact offsets and sizes, which effectively defines the ABI.
>
> On top of the XSAVE data, the kernel appends metadata (e.g. the xstate
> size and magic values). In particular, fpstate->user_size is written by
> save_sw_bytes() at signal delivery. On sigreturn, the kernel validates
> this, which is a symmetric and straightforward check.

First of all, the reverted change broke backward compatibility for
user-space. There are at least two projects (gVisor and CRIU) that
worked correctly before this change. With the reverted commit, they run
into silent memory corruption. We usually try to avoid breaking
user-space like this without strong justification.

As for layout compatibility, in most cases CPU A (older) and CPU B
(newer) have compatible XSAVE layouts in terms of saving states on A
and restoring them on B. CPU B may feature new extended hardware
states, but the layout for previously supported components remains
the same. CRIU relies on this fact to allow users to migrate processes
from older to newer CPUs. CRIU can check whether XSAVE states align
across machines.

>
> Because the format is hardware-defined, arbitrary size mismatches should
> not be allowed. The sigframe should match the CPU-defined XSAVE layout.
> So the change in fact strengthens the sanity check.
>
> > Furthermore, this introduces a critical issue for checkpoint/restore
> > tools like CRIU. If a process is checkpointed while inside a signal
> > handler, its stack contains a signal frame formatted according to the
> > source host's xstate capabilities. If that process is later restored on
> > a destination host with larger xstate capabilities (e.g., a newer CPU
> > with more features enabled, resulting in a larger fpstate->user_size),
> > the kernel will look for FP_XSTATE_MAGIC2 at the destination host's
> > larger user_size offset instead of the offset encoded in the frame's
> > fx_sw->xstate_size. This causes the magic2 check to fail, forcing
> > sigreturn to silently fall back to "FX-only" mode.
>
> It seems that userspace could translate the XSAVE buffer from CPU A's
> format to CPU B's format during restore. If so, the frame can be
> consistent with the destination system without modifying
> fx_sw->xstate_size, and the kernel-side validation would continue to
> work as intended.

When checkpointing a process, CRIU cannot determine whether it is
currently executing within a signal handler, and it cannot find signal
frames on a user stack. In fact, there could be multiple nested signal
frames stacked on top of each other if a process triggered additional
signals while executing in an earlier handler.

Even if CRIU were somehow able to locate these frames, extending
them would be impossible. The target application stack is not
under our control, and other user stack data or local variables
reside immediately after the frame.

Thanks,
Andrei
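
The "older to newer" compatibility rule described in this message can
be sketched as a small predicate. This is a minimal sketch under stated
assumptions, not CRIU's actual code: struct xsave_layout and
layout_restorable_on() are hypothetical names, and the per-component
offset and size tables are assumed to have been filled in elsewhere
(for example from CPUID on each host).

```c
#include <stdbool.h>
#include <stdint.h>

#define XSAVE_MAX_COMPONENTS 64

/* Offset and size of one state component in the standard XSAVE format. */
struct xcomp {
	uint32_t offset;
	uint32_t size;
};

/* Hypothetical per-host description; not a kernel or CRIU structure. */
struct xsave_layout {
	uint64_t features;	/* bitmap of supported components */
	struct xcomp comp[XSAVE_MAX_COMPONENTS];
};

/*
 * A state saved on 'src' is treated as restorable on 'dst' when 'dst'
 * supports every component of 'src' at the same offset and size.
 * Components 0 and 1 (x87, SSE) live in the fixed legacy area and are
 * therefore not compared.
 */
static bool layout_restorable_on(const struct xsave_layout *src,
				 const struct xsave_layout *dst)
{
	if (src->features & ~dst->features)
		return false;

	for (int i = 2; i < XSAVE_MAX_COMPONENTS; i++) {
		if (!(src->features & (1ULL << i)))
			continue;
		if (src->comp[i].offset != dst->comp[i].offset ||
		    src->comp[i].size != dst->comp[i].size)
			return false;
	}
	return true;
}
```

The check is deliberately one-way: a destination with extra components
passes, while a destination that lacks a component or places it at a
different offset does not.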

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-04-29 17:15 UTC
To: Andrei Vagin
Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On 4/29/2026 9:44 AM, Andrei Vagin wrote:
>
> First of all, the reverted change broke backward compatibility for
> user-space.

The ABI itself is still intact. Do you mean that the kernel cannot
strengthen its sanity check logic? The change does not alter the ABI,
but enforces stricter validation of the existing format.

> As for layout compatibility, in most cases CPU A (older) and CPU B
> (newer) have compatible XSAVE layouts in terms of saving states on A
> and restoring them on B. CPU B may feature new extended hardware
> states, but the layout for previously supported components remains
> the same.

I don't think this assumption holds. For example, with APX, the state is
placed at the offset previously used by MPX. So the layout is not
strictly append-only, and offsets are not guaranteed to remain stable
across different CPU generations.

> Even if CRIU were somehow able to locate these frames, extending
> them would be impossible. The target application stack is not
> under our control, and other user stack data or local variables
> reside immediately after the frame.

I'm confused by this point. If the frame cannot be adjusted in the
first place, how does migration work across systems with differing
feature sets?

Features can be introduced or deprecated over time, and a snapshot taken
on one machine cannot be expected to run unmodified on a random machine
with a different XSTATE set. Some form of translation is inevitable for
any cross-machine restore mechanism.

Thanks,
Chang

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-04-29 20:44 UTC
To: Chang S. Bae
Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Wed, Apr 29, 2026 at 10:15 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 4/29/2026 9:44 AM, Andrei Vagin wrote:
> >
> > First of all, the reverted change broke backward compatibility for
> > user-space.
>
> The ABI itself is still intact. Do you mean that the kernel cannot
> strengthen its sanity check logic? The change does not alter the ABI,
> but enforces stricter validation of the existing format.

Enforcing validation against 'fpstate->user_size' instead of the frame's
own 'fx_sw->xstate_size' changes the kernel ABI; it does not strengthen
the sanity check logic. When user-space supplies a valid, self-consistent
frame with an explicit size that older kernels accepted, and the updated
logic rejects it, that is a userspace regression.

CRIU and gVisor breakages are not related to migration from one host to
another. In both cases, they were broken even when running on the same
host. Migration between different CPUs is a separate issue. In both
cases, the code that constructs signal frames has existed for many years
and has worked without any problem before this change.

>
> > As for layout compatibility, in most cases CPU A (older) and CPU B
> > (newer) have compatible XSAVE layouts in terms of saving states on A
> > and restoring them on B. CPU B may feature new extended hardware
> > states, but the layout for previously supported components remains
> > the same.
> I don't think this assumption holds. For example, with APX, the state is
> placed at the offset previously used by MPX. So the layout is not
> strictly append-only, and offsets are not guaranteed to remain stable
> across different CPU generations.

Regarding layout variations (like APX vs MPX), migration tools already
track XSAVE capabilities and offsets. Furthermore, APX has its own
dedicated bit in the 'xfeatures' field of the xstate_header. If
platforms present conflicting layouts or incompatible extensions, CRIU
cancels restoration.

The issue with checking against 'user_size' is that it disrupts
migration even between compatible systems. If offsets match but the
destination CPU has more features (leading to a larger 'user_size'),
validation fails...

>
> > Even if CRIU were somehow able to locate these frames, extending
> > them would be impossible. The target application stack is not
> > under our control, and other user stack data or local variables
> > reside immediately after the frame.
> I'm confused by this point. If the frame cannot be adjusted in the
> first place, how does migration work across systems with differing
> feature sets?

Cross-host migration only works reliably between compatible systems. It
works when both hosts share identical feature sets, or in a one-way
direction when the target host supports all features of the source host
and their XSAVE layouts are compatible. In this context, `compatible`
means FPU states saved on the source host are restorable on the
destination host.

If processes are checkpointed at safe, predefined points where they are
not executing signal handlers, target host requirements can be more
flexible.

Here, I need to mention when CRIU constructs signal frames from
userspace. In the final step, after all file descriptors and memory
mappings are restored, it invokes sigreturn with a pre-constructed
signal frame to restore registers and resume the fully restored process.
Since CRIU constructs these frames, it can adjust the XSAVE layout if
required. We currently do not do this because we have not yet seen
scenarios where it would be required.

> on one machine cannot be expected to run unmodified on a random machine
> with a different XSTATE set. Some form of translation is inevitable for
> any cross-machine restore mechanism.

As I mentioned, migration tools have logic to determine where a specific
workload can be migrated. Because we cannot always control the exact
execution point at which a process is stopped, state translation is not
always feasible. For instance, an active signal frame on a process stack
can be entirely outside our control. However, we can reliably identify
compatible target systems where the workload can be resumed safely.

Thanks,
Andrei
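
For readers unfamiliar with what "constructing a signal frame" involves
on the FPU side, the following sketch fills in the software-reserved
bytes and the trailing magic word of an xstate buffer. It is an
illustration only, not CRIU or gVisor code: build_sw_bytes() is a
hypothetical helper, and the byte offsets (sw_reserved at 464 within the
512-byte FXSAVE area, fields in the order magic1, extended_size,
xfeatures, xstate_size) are assumptions based on the x86 uapi
definition of struct _fpx_sw_bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FP_XSTATE_MAGIC1	0x46505853U
#define FP_XSTATE_MAGIC2	0x46505845U
#define FP_XSTATE_MAGIC2_SIZE	4U
/* Assumed layout constants, see the paragraph above. */
#define SW_RESERVED_OFFSET	464U
#define MIN_XSTATE_SIZE		(512U + 64U)

/*
 * Write the self-describing metadata into 'frame':
 *   - magic1, extended_size, xfeatures and xstate_size in sw_reserved
 *   - magic2 immediately after the xstate area, at offset xstate_size
 * Returns false when the buffer cannot hold the requested layout.
 */
static bool build_sw_bytes(uint8_t *frame, size_t len,
			   uint32_t xstate_size, uint64_t xfeatures)
{
	const uint32_t magic1 = FP_XSTATE_MAGIC1;
	const uint32_t magic2 = FP_XSTATE_MAGIC2;
	uint32_t extended_size;

	if (xstate_size < MIN_XSTATE_SIZE ||
	    len < FP_XSTATE_MAGIC2_SIZE ||
	    xstate_size > len - FP_XSTATE_MAGIC2_SIZE)
		return false;

	extended_size = xstate_size + FP_XSTATE_MAGIC2_SIZE;

	memcpy(frame + SW_RESERVED_OFFSET, &magic1, sizeof(magic1));
	memcpy(frame + SW_RESERVED_OFFSET + 4, &extended_size,
	       sizeof(extended_size));
	memcpy(frame + SW_RESERVED_OFFSET + 8, &xfeatures, sizeof(xfeatures));
	memcpy(frame + SW_RESERVED_OFFSET + 16, &xstate_size,
	       sizeof(xstate_size));
	memcpy(frame + xstate_size, &magic2, sizeof(magic2));
	return true;
}
```

A frame built this way carries its own size, which is why a kernel that
looks for magic2 at a different, host-derived offset no longer
recognises it.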

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-04-29 21:44 UTC
To: Andrei Vagin
Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On 4/29/2026 1:44 PM, Andrei Vagin wrote:
>
> Enforcing validation against 'fpstate->user_size' instead of the frame's
> own 'fx_sw->xstate_size' changes the kernel ABI; it does not strengthen
> the sanity check logic. When user-space supplies a valid, self-consistent
> frame with an explicit size that older kernels accepted, and the updated
> logic rejects it, that is a userspace regression.

Sorry, I don't get your version of ABI.

Eventually, XRSTOR will execute to restore the state. The kernel tracks
each task's requested feature bitmap (RFBM), which determines the size.
As described in SDM Vol. 1, Section 13.13:

  An execution of an instruction in the XSAVE feature set may access
  any byte of any state component on which that execution operates even
  when saving a state component is omitted ...

Given this, the kernel must ensure the backing memory is valid and
sufficient. So this consistency does matter.

Thanks,
Chang

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-04-30 0:28 UTC
To: Chang S. Bae
Cc: Andrei Vagin, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Wed, Apr 29, 2026 at 2:44 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 4/29/2026 1:44 PM, Andrei Vagin wrote:
> >
> > Enforcing validation against 'fpstate->user_size' instead of the frame's
> > own 'fx_sw->xstate_size' changes the kernel ABI; it does not strengthen
> > the sanity check logic. When user-space supplies a valid, self-consistent
> > frame with an explicit size that older kernels accepted, and the updated
> > logic rejects it, that is a userspace regression.
> Sorry, I don't get your version of ABI.
>
> Eventually, XRSTOR will execute to restore the state. The kernel tracks
> each task's requested feature bitmap (RFBM), which determines the size.
> As described in SDM Vol. 1, Section 13.13:
>
>   An execution of an instruction in the XSAVE feature set may access
>   any byte of any state component on which that execution operates even
>   when saving a state component is omitted ...
>
> Given this, the kernel must ensure the backing memory is valid and
> sufficient. So this consistency does matter.

We need to add one more paragraph to have the full context:

  Each instruction in the XSAVE feature set operates on a set of
  XSAVE-managed state components. The specific set of components on
  which an instruction operates is determined by the values of XCR0,
  the IA32_XSS MSR, EDX:EAX, and (for XRSTOR and XRSTORS) the XSAVE
  header. Section 13.4 provides the details necessary to determine the
  location of each state component for any execution of an instruction
  in the XSAVE feature set. An execution of an instruction in the XSAVE
  feature set may access any byte of any state component on which that
  execution operates even when saving a state component is omitted
  because it is in its initial configuration; when restoring a state
  component to its initial configuration; or when XFD is enabled for
  the state components (see Section 13.14).

I interpret this to mean that XRSTOR will not access memory for a
component if its corresponding bit is clear in the XSAVE header.

However, my point was not about the CPU specification, but about the
kernel ABI. The reverted change broke existing user-space applications
without justifying an ABI regression. Even if XRSTOR were to trigger a
fault, the kernel handles it properly, so there is no real issue there.

It feels like we are trying to justify the change after the fact. The
rule is: "we don't break user-space". As usual, there are no rules
without exceptions, but any exception should be explicitly analyzed
considering all side effects. According to the commit message of the
reverted commit, that was not such a case.

Thanks,
Andrei

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-05-01 18:44 UTC
To: Chang S. Bae
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Wed, Apr 29, 2026 at 12:26 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 4/28/2026 5:06 PM, Andrei Vagin wrote:
> >
> > The reverted commit broke applications that construct signal frames in
> > userspace (such as CRIU and gVisor) if the frame's xstate size is
> > smaller than the kernel's fpstate->user_size.
>
> In the extended state area, the sigframe embeds the hardware-defined
> XSAVE format. If CPU A and CPU B support different XSTATE features, the
> layout (size and offsets) differs across systems. However, within a
> system, the layout is invariant. Userspace can query CPUID to obtain the
> exact offsets and sizes, which effectively defines the ABI.

I've been thinking about this more, and I believe the claim that XSAVE
offsets can differ across CPUs for the same feature is inaccurate. The
XSAVE standard format uses fixed offsets specifically to allow migration
between different CPU generations. If a feature exists on both the
source and destination CPUs, its data resides at the exact same byte
offset.

This design is what makes virtual machine migration possible.
Hypervisors cannot "translate" XSTATE data hidden in guest memory, so
they rely on these invariant offsets. The CRIU case is very similar:
when a process is in a signal handler, its state is saved on the stack
as an opaque block of memory.

If a future CPU uses different offsets for existing features, it would
break VM migration. Backward compatibility in this area should be a
requirement even for hardware. If we look at existing CPUs, they follow
this principle.

Thanks,
Andrei

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-05-01 19:13 UTC
To: Andrei Vagin
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On 5/1/2026 11:44 AM, Andrei Vagin wrote:
>
> I've been thinking about this more, and I believe the claim that XSAVE
> offsets can differ across CPUs for the same feature is inaccurate. The
> XSAVE standard format uses fixed offsets specifically to allow migration
> between different CPU generations. If a feature exists on both the
> source and destination CPUs, its data resides at the exact same byte
> offset.

There is commit ba386777a30b ("x86/elf: Add a new FPU buffer layout info
to x86 core files") for this reason:

  ...
  The XSAVE layouts of modern AMD and Intel CPUs differ, especially
  since Memory Protection Keys and the AVX-512 features have been
  inculcated into the AMD CPUs.

  Since AMD never adopted (and hence never left room in the XSAVE
  layout for) the Intel MPX feature, tools like GDB had assumed a
  fixed XSAVE layout matching that of Intel (based on the XCR0 mask).

  Hence, core dumps from AMD CPUs didn't match the known size for the
  XCR0 mask. This resulted in GDB and other tools not being able to
  access the values of the AVX-512 and PKRU registers on AMD CPUs.
  ...

Thanks,
Chang

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-05-01 20:50 UTC
To: Chang S. Bae
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Fri, May 1, 2026 at 12:13 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 5/1/2026 11:44 AM, Andrei Vagin wrote:
> >
> > I've been thinking about this more, and I believe the claim that XSAVE
> > offsets can differ across CPUs for the same feature is inaccurate. The
> > XSAVE standard format uses fixed offsets specifically to allow migration
> > between different CPU generations. If a feature exists on both the
> > source and destination CPUs, its data resides at the exact same byte
> > offset.
>
> There is commit ba386777a30b ("x86/elf: Add a new FPU buffer layout info
> to x86 core files") for this reason:
>
>   ...
>   The XSAVE layouts of modern AMD and Intel CPUs differ, especially
>   since Memory Protection Keys and the AVX-512 features have been
>   inculcated into the AMD CPUs.
>
>   Since AMD never adopted (and hence never left room in the XSAVE
>   layout for) the Intel MPX feature, tools like GDB had assumed a
>   fixed XSAVE layout matching that of Intel (based on the XCR0 mask).
>
>   Hence, core dumps from AMD CPUs didn't match the known size for the
>   XCR0 mask. This resulted in GDB and other tools not being able to
>   access the values of the AVX-512 and PKRU registers on AMD CPUs.
>   ...

This is a different case; here, we have two different CPU vendors whose
XSAVE layouts differ. The XSAVE layout itself is not the only reason why
migration between Intel and AMD cannot work reliably.

Thanks,
Andrei

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-05-01 21:04 UTC
To: Andrei Vagin
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On 5/1/2026 1:50 PM, Andrei Vagin wrote:
>
> This is a different case; here, we have two different CPU vendors whose
> XSAVE layouts differ. The XSAVE layout itself is not the only reason why
> migration between Intel and AMD cannot work reliably.

When saying CPU A and B, I didn't mean the same vendor but x86 in
general.

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Andrei Vagin @ 2026-05-01 21:42 UTC
To: Chang S. Bae
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On Fri, May 1, 2026 at 2:04 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 5/1/2026 1:50 PM, Andrei Vagin wrote:
> >
> > This is a different case; here, we have two different CPU vendors whose
> > XSAVE layouts differ. The XSAVE layout itself is not the only reason why
> > migration between Intel and AMD cannot work reliably.
> When saying CPU A and B, I didn't mean the same vendor but x86 in
> general.

My point is that the reverted change broke a significant, real-life use
case that the hardware was explicitly designed to support.

It is the responsibility of C/R tooling to ensure the migration target
is compatible with the source. Enforcing a magic check based on a fixed
offset does not provide additional security. The kernel must be prepared
to handle "trash" data in the userspace xsave area and manage any
exceptions triggered by the xrstor instruction.

Thanks,
Andrei

* Re: [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return"
From: Chang S. Bae @ 2026-05-02 19:23 UTC
To: Andrei Vagin
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, linux-kernel, criu, x86, stable

On 5/1/2026 2:42 PM, Andrei Vagin wrote:
>
> My point is that the reverted change broke a significant, real-life use
> case that the hardware was explicitly designed to support.
>
> It is the responsibility of C/R tooling to ensure the migration target
> is compatible with the source. Enforcing a magic check based on a fixed
> offset does not provide additional security. The kernel must be prepared
> to handle "trash" data in the userspace xsave area and manage any
> exceptions triggered by the xrstor instruction.

It looks like this behavior has been in place since c37b5efea43f ("x86,
xsave: save/restore the extended state context in sigframe").

With the sanity check, userspace can modify fx_sw->xstate_size and
fx_sw->xfeatures (independently). But it seems there is no consistency
check between the two. For example, the size alone could be set to an
arbitrary value within the valid range, without matching xfeatures.

If userspace sets an inconsistent size vs. xfeatures, maybe zeroing out
the garbage could be an option, which I expect is still compatible with
the portability model.

It's still not entirely clear to me whether your claimed portability was
considered in the original sigframe design. If so, this should be
documented more clearly (e.g., in headers and/or Documentation), along
with relevant selftests. I'd like to follow up on that.

That said, yes, this area ultimately falls under the rule of not
breaking userspace. So,

Acked-by: Chang S. Bae <chang.seok.bae@intel.com>

Thanks,
Chang
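
The missing size-versus-xfeatures consistency check mentioned in this
message could take several forms. The sketch below shows one possible
reading, and nothing in the thread confirms it as the intended design:
features_fitting() is a hypothetical helper that keeps only those
feature bits whose component lies entirely within the frame's declared
xstate_size, so that the remaining components would be treated as being
in their initial state. The component table is an assumed input, and
the offsets used in any example are illustrative.

```c
#include <stdint.h>

#define XSAVE_MAX_COMPONENTS 64

/* Offset and size of one state component in the standard XSAVE format. */
struct xcomp {
	uint32_t offset;
	uint32_t size;
};

/*
 * Return the subset of 'xfeatures' that is backed by data inside a
 * frame of 'xstate_size' bytes. Bits 0 and 1 (x87, SSE) are kept
 * unconditionally because they live in the legacy FXSAVE area. A
 * component with size 0 is treated as unknown and dropped.
 */
static uint64_t features_fitting(uint64_t xfeatures, uint32_t xstate_size,
				 const struct xcomp comp[XSAVE_MAX_COMPONENTS])
{
	uint64_t fit = xfeatures & 0x3;

	for (int i = 2; i < XSAVE_MAX_COMPONENTS; i++) {
		uint64_t bit = 1ULL << i;

		if (!(xfeatures & bit))
			continue;
		if (comp[i].size != 0 &&
		    (uint64_t)comp[i].offset + comp[i].size <= xstate_size)
			fit |= bit;
	}
	return fit;
}
```

A caller could compare the result with the original mask to detect an
inconsistent frame, or use it as the effective mask; which of the two is
appropriate is exactly the open question raised above.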

end of thread, other threads:[~2026-05-02 19:23 UTC | newest]

Thread overview: 13+ messages
2026-04-29  0:06 [PATCH] Revert "x86/fpu: Refine and simplify the magic number check during signal return" Andrei Vagin
2026-04-29  7:26 ` Chang S. Bae
2026-04-29 16:44   ` Andrei Vagin
2026-04-29 17:15     ` Chang S. Bae
2026-04-29 20:44       ` Andrei Vagin
2026-04-29 21:44         ` Chang S. Bae
2026-04-30  0:28           ` Andrei Vagin
2026-05-01 18:44   ` Andrei Vagin
2026-05-01 19:13     ` Chang S. Bae
2026-05-01 20:50       ` Andrei Vagin
2026-05-01 21:04         ` Chang S. Bae
2026-05-01 21:42           ` Andrei Vagin
2026-05-02 19:23             ` Chang S. Bae