public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
@ 2026-01-01  9:05 Paolo Bonzini
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-01  9:05 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, x86

Fix a possible host panic, due to an unexpected #NM, when a KVM guest
is using AMX features.

The guest's XFD value, which is stored in fpstate->xfd, is used for both
guest execution and host XSAVE/XRSTOR operations.  However, the
guest-configured XFD setting can disable features that were still enabled
when XSAVE was executed on the guest FPU state, and this causes a #NM when
XRSTOR is later executed on that state.

This can happen in two cases: due to a KVM_SET_XSAVE that includes a
disabled component, or if an interrupt causes XSAVE to be executed
before the call to fpu_update_guest_xfd().

The first patch fixes both cases; the rest are improvements to the
selftests, in order to cover both scenarios and also verify that #NM
faults are injected correctly.

v1 had extra patches to export higher-level functions for KVM in place
of switch_fpu_return() and fpregs_assert_state_consistent().  Those
were part of refactoring how KVM loaded guest state when KVM_RUN is
issued, but are not needed anymore with this v2 fix and I will submit
them separately.

Tested on a Sapphire Rapids machine.  Reviews and acks are welcome so
that I can submit this to Linus via the KVM tree.

Paolo



Paolo Bonzini (2):
  selftests: kvm: replace numbered sync points with actions
  selftests: kvm: try getting XFD and XSAVE state out of sync

Sean Christopherson (2):
  x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1

 arch/x86/kernel/fpu/core.c                 |  32 ++++-
 arch/x86/kvm/x86.c                         |   9 ++
 tools/testing/selftests/kvm/x86/amx_test.c | 144 ++++++++++++---------
 3 files changed, 123 insertions(+), 62 deletions(-)

-- 
2.52.0


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
@ 2026-01-01  9:05 ` Paolo Bonzini
  2026-01-03  2:06   ` Yao Yuan
                     ` (4 more replies)
  2026-01-01  9:05 ` [PATCH 2/4] selftests: kvm: replace numbered sync points with actions Paolo Bonzini
                   ` (5 subsequent siblings)
  6 siblings, 5 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-01  9:05 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, x86, stable

From: Sean Christopherson <seanjc@google.com>

When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
response to a guest WRMSR, clear XFD-disabled features in the saved (or to
be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
features that are disabled via the guest's XFD.  Because the kernel
executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
will cause XRSTOR to #NM and panic the kernel.

E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:

  ------------[ cut here ]------------
  WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:exc_device_not_available+0x101/0x110
  Call Trace:
   <TASK>
   asm_exc_device_not_available+0x1a/0x20
  RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
   switch_fpu_return+0x4a/0xb0
   kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
   kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
   __x64_sys_ioctl+0x8f/0xd0
   do_syscall_64+0x62/0x940
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
call to fpu_update_guest_xfd().

Similarly, if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:

  ------------[ cut here ]------------
  WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:exc_device_not_available+0x101/0x110
  Call Trace:
   <TASK>
   asm_exc_device_not_available+0x1a/0x20
  RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
   fpu_swap_kvm_fpstate+0x6b/0x120
   kvm_load_guest_fpu+0x30/0x80 [kvm]
   kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
   kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
   __x64_sys_ioctl+0x8f/0xd0
   do_syscall_64+0x62/0x940
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  ---[ end trace 0000000000000000 ]---

The new behavior is consistent with the AMX architecture.  Per Intel's SDM,
XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
(and non-compacted XSAVE saves the initial configuration of the state
component):

  If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
  the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
  instead, it operates as if XINUSE[i] = 0 (and the state component was
  in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
  header as 0; in addition, XSAVE saves the initial configuration of the
  state component (the other instructions do not save state component i).

Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
a constant XFD based on the set of enabled features when XSAVEing for
a struct fpu_guest.  However, having XSTATE_BV[i]=1 for XFD-disabled
features can only happen in the above interrupt case, or in similar
scenarios involving preemption on preemptible kernels, because
fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
outgoing FPU state with the current XFD; and that is (on all but the
first WRMSR to XFD) the guest XFD.

Therefore, since XFD can only go out of sync with XSTATE_BV in those
rare scenarios, we can consider it (de facto) part of the KVM ABI that
KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
 to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/fpu/core.c | 32 +++++++++++++++++++++++++++++---
 arch/x86/kvm/x86.c         |  9 +++++++++
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index da233f20ae6f..166c380b0161 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
 #ifdef CONFIG_X86_64
 void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
 {
+	struct fpstate *fpstate = guest_fpu->fpstate;
+
 	fpregs_lock();
-	guest_fpu->fpstate->xfd = xfd;
-	if (guest_fpu->fpstate->in_use)
-		xfd_update_state(guest_fpu->fpstate);
+
+	/*
+	 * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
+	 * the save state to initialized.  Likewise, KVM_GET_XSAVE does the
+	 * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
+	 *
+	 * If the guest's FPU state is in hardware, just update XFD: the XSAVE
+	 * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
+	 *
+	 * If however the guest's FPU state is NOT resident in hardware, clear
+	 * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
+	 * attempt to load disabled components and generate #NM _in the host_.
+	 */
+	if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
+		fpstate->regs.xsave.header.xfeatures &= ~xfd;
+
+	fpstate->xfd = xfd;
+	if (fpstate->in_use)
+		xfd_update_state(fpstate);
+
 	fpregs_unlock();
 }
 EXPORT_SYMBOL_FOR_KVM(fpu_update_guest_xfd);
@@ -430,6 +449,13 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
 	if (ustate->xsave.header.xfeatures & ~xcr0)
 		return -EINVAL;
 
+	/*
+	 * Disabled features must be in their initial state, otherwise XRSTOR
+	 * causes an exception.
+	 */
+	if (WARN_ON_ONCE(ustate->xsave.header.xfeatures & kstate->xfd))
+		return -EINVAL;
+
 	/*
 	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
 	 * in the header.  KVM's odd ABI is to leave PKRU untouched in this
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff8812f3a129..c0416f53b5f5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5807,9 +5807,18 @@ static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
+	union fpregs_state *xstate = (union fpregs_state *)guest_xsave->region;
+
 	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
 		return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
 
+	/*
+	 * Do not reject non-initialized disabled features for backwards
+	 * compatibility, but clear XSTATE_BV[i] whenever XFD[i]=1.
+	 * Otherwise, XRSTOR would cause a #NM.
+	 */
+	xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd;
+
 	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
 					      guest_xsave->region,
 					      kvm_caps.supported_xcr0,
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/4] selftests: kvm: replace numbered sync points with actions
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
@ 2026-01-01  9:05 ` Paolo Bonzini
  2026-01-06  0:02   ` Sean Christopherson
  2026-01-01  9:05 ` [PATCH 3/4] selftests: kvm: try getting XFD and XSAVE state out of sync Paolo Bonzini
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-01  9:05 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, x86, stable

Rework the guest=>host syncs in the AMX test to use named actions instead
of arbitrary, incrementing numbers.  The "stage" of the test has no real
meaning; what matters is what action the test wants the host to perform.
The incrementing numbers are somewhat helpful for triaging failures, but
fully debugging failures almost always requires a much deeper dive into
the test (and KVM).

Using named actions not only makes it easier to extend the test without
having to shift all sync point numbers, but also makes the code easier
to read.

[Commit message by Sean Christopherson]

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	I wrote this before seeing your patch... It's obviously
	similar but different enough that I kept my version. :)
	Thanks anyway for including it, your commit message was
	better so I used it.

 tools/testing/selftests/kvm/x86/amx_test.c | 88 +++++++++++-----------
 1 file changed, 43 insertions(+), 45 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c
index f4ce5a185a7d..4ac41c1a7255 100644
--- a/tools/testing/selftests/kvm/x86/amx_test.c
+++ b/tools/testing/selftests/kvm/x86/amx_test.c
@@ -124,6 +124,14 @@ static void set_tilecfg(struct tile_config *cfg)
 	}
 }
 
+enum {
+	/* Check TMM0 against tiledata */
+	TEST_COMPARE_TILEDATA = 1,
+
+	/* Full VM save/restore */
+	TEST_SAVE_RESTORE = 2,
+};
+
 static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 						    struct tile_data *tiledata,
 						    struct xstate *xstate)
@@ -131,20 +139,20 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) &&
 		     this_cpu_has(X86_FEATURE_OSXSAVE));
 	check_xtile_info();
-	GUEST_SYNC(1);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 
 	/* xfd=0, enable amx */
 	wrmsr(MSR_IA32_XFD, 0);
-	GUEST_SYNC(2);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == 0);
 	set_tilecfg(amx_cfg);
 	__ldtilecfg(amx_cfg);
-	GUEST_SYNC(3);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	/* Check save/restore when trap to userspace */
 	__tileloadd(tiledata);
-	GUEST_SYNC(4);
+	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
 	__tilerelease();
-	GUEST_SYNC(5);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	/*
 	 * After XSAVEC, XTILEDATA is cleared in the xstate_bv but is set in
 	 * the xcomp_bv.
@@ -154,6 +162,8 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA));
 	GUEST_ASSERT(xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA);
 
+	/* #NM test */
+
 	/* xfd=0x40000, disable amx tiledata */
 	wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA);
 
@@ -166,13 +176,13 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA));
 	GUEST_ASSERT((xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA));
 
-	GUEST_SYNC(6);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
 	set_tilecfg(amx_cfg);
 	__ldtilecfg(amx_cfg);
 	/* Trigger #NM exception */
 	__tileloadd(tiledata);
-	GUEST_SYNC(10);
+	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
 
 	GUEST_DONE();
 }
@@ -180,18 +190,18 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 void guest_nm_handler(struct ex_regs *regs)
 {
 	/* Check if #NM is triggered by XFEATURE_MASK_XTILE_DATA */
-	GUEST_SYNC(7);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(!(get_cr0() & X86_CR0_TS));
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
-	GUEST_SYNC(8);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
 	/* Clear xfd_err */
 	wrmsr(MSR_IA32_XFD_ERR, 0);
 	/* xfd=0, enable amx */
 	wrmsr(MSR_IA32_XFD, 0);
-	GUEST_SYNC(9);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
 }
 
 int main(int argc, char *argv[])
@@ -244,6 +254,7 @@ int main(int argc, char *argv[])
 	memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
 	vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
 
+	int iter = 0;
 	for (;;) {
 		vcpu_run(vcpu);
 		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
@@ -253,20 +264,9 @@ int main(int argc, char *argv[])
 			REPORT_GUEST_ASSERT(uc);
 			/* NOT REACHED */
 		case UCALL_SYNC:
-			switch (uc.args[1]) {
-			case 1:
-			case 2:
-			case 3:
-			case 5:
-			case 6:
-			case 7:
-			case 8:
-				fprintf(stderr, "GUEST_SYNC(%ld)\n", uc.args[1]);
-				break;
-			case 4:
-			case 10:
-				fprintf(stderr,
-				"GUEST_SYNC(%ld), check save/restore status\n", uc.args[1]);
+			++iter;
+			if (uc.args[1] & TEST_COMPARE_TILEDATA) {
+				fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter);
 
 				/* Compacted mode, get amx offset by xsave area
 				 * size subtract 8K amx size.
@@ -279,11 +279,25 @@ int main(int argc, char *argv[])
 				ret = memcmp(amx_start, tiles_data, TILE_SIZE);
 				TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret);
 				kvm_x86_state_cleanup(state);
-				break;
-			case 9:
-				fprintf(stderr,
-				"GUEST_SYNC(%ld), #NM exception and enable amx\n", uc.args[1]);
-				break;
+			}
+			if (uc.args[1] & TEST_SAVE_RESTORE) {
+				fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter);
+				state = vcpu_save_state(vcpu);
+				memset(&regs1, 0, sizeof(regs1));
+				vcpu_regs_get(vcpu, &regs1);
+
+				kvm_vm_release(vm);
+
+				/* Restore state in a new VM.  */
+				vcpu = vm_recreate_with_one_vcpu(vm);
+				vcpu_load_state(vcpu, state);
+				kvm_x86_state_cleanup(state);
+
+				memset(&regs2, 0, sizeof(regs2));
+				vcpu_regs_get(vcpu, &regs2);
+				TEST_ASSERT(!memcmp(&regs1, &regs2, sizeof(regs2)),
+					    "Unexpected register values after vcpu_load_state; rdi: %lx rsi: %lx",
+					    (ulong) regs2.rdi, (ulong) regs2.rsi);
 			}
 			break;
 		case UCALL_DONE:
@@ -293,22 +307,6 @@ int main(int argc, char *argv[])
 			TEST_FAIL("Unknown ucall %lu", uc.cmd);
 		}
 
-		state = vcpu_save_state(vcpu);
-		memset(&regs1, 0, sizeof(regs1));
-		vcpu_regs_get(vcpu, &regs1);
-
-		kvm_vm_release(vm);
-
-		/* Restore state in a new VM.  */
-		vcpu = vm_recreate_with_one_vcpu(vm);
-		vcpu_load_state(vcpu, state);
-		kvm_x86_state_cleanup(state);
-
-		memset(&regs2, 0, sizeof(regs2));
-		vcpu_regs_get(vcpu, &regs2);
-		TEST_ASSERT(!memcmp(&regs1, &regs2, sizeof(regs2)),
-			    "Unexpected register values after vcpu_load_state; rdi: %lx rsi: %lx",
-			    (ulong) regs2.rdi, (ulong) regs2.rsi);
 	}
 done:
 	kvm_vm_free(vm);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/4] selftests: kvm: try getting XFD and XSAVE state out of sync
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
  2026-01-01  9:05 ` [PATCH 2/4] selftests: kvm: replace numbered sync points with actions Paolo Bonzini
@ 2026-01-01  9:05 ` Paolo Bonzini
  2026-01-01  9:05 ` [PATCH 4/4] selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 Paolo Bonzini
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-01  9:05 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, x86, stable

The host is allowed to set FPU state that includes a disabled
xstate component.  Check that this does not cause bad effects.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tools/testing/selftests/kvm/x86/amx_test.c | 38 +++++++++++++++++-----
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c
index 4ac41c1a7255..00a42a592a37 100644
--- a/tools/testing/selftests/kvm/x86/amx_test.c
+++ b/tools/testing/selftests/kvm/x86/amx_test.c
@@ -125,11 +125,17 @@ static void set_tilecfg(struct tile_config *cfg)
 }
 
 enum {
+	/* Retrieve TMM0 from guest, stash it for TEST_RESTORE_TILEDATA */
+	TEST_SAVE_TILEDATA = 1,
+
 	/* Check TMM0 against tiledata */
-	TEST_COMPARE_TILEDATA = 1,
+	TEST_COMPARE_TILEDATA = 2,
+
+	/* Restore TMM0 from earlier save */
+	TEST_RESTORE_TILEDATA = 4,
 
 	/* Full VM save/restore */
-	TEST_SAVE_RESTORE = 2,
+	TEST_SAVE_RESTORE = 8,
 };
 
 static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
@@ -150,7 +156,16 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_SYNC(TEST_SAVE_RESTORE);
 	/* Check save/restore when trap to userspace */
 	__tileloadd(tiledata);
-	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+	GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+
+	/* xfd=0x40000, disable amx tiledata */
+	wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA);
+
+	/* host tries setting tiledata while guest XFD is set */
+	GUEST_SYNC(TEST_RESTORE_TILEDATA);
+	GUEST_SYNC(TEST_SAVE_RESTORE);
+
+	wrmsr(MSR_IA32_XFD, 0);
 	__tilerelease();
 	GUEST_SYNC(TEST_SAVE_RESTORE);
 	/*
@@ -210,10 +225,10 @@ int main(int argc, char *argv[])
 	struct kvm_vcpu *vcpu;
 	struct kvm_vm *vm;
 	struct kvm_x86_state *state;
+	struct kvm_x86_state *tile_state = NULL;
 	int xsave_restore_size;
 	vm_vaddr_t amx_cfg, tiledata, xstate;
 	struct ucall uc;
-	u32 amx_offset;
 	int ret;
 
 	/*
@@ -265,20 +280,27 @@ int main(int argc, char *argv[])
 			/* NOT REACHED */
 		case UCALL_SYNC:
 			++iter;
+			if (uc.args[1] & TEST_SAVE_TILEDATA) {
+				fprintf(stderr, "GUEST_SYNC #%d, save tiledata\n", iter);
+				tile_state = vcpu_save_state(vcpu);
+			}
 			if (uc.args[1] & TEST_COMPARE_TILEDATA) {
 				fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter);
 
 				/* Compacted mode, get amx offset by xsave area
 				 * size subtract 8K amx size.
 				 */
-				amx_offset = xsave_restore_size - NUM_TILES*TILE_SIZE;
-				state = vcpu_save_state(vcpu);
-				void *amx_start = (void *)state->xsave + amx_offset;
+				u32 amx_offset = xsave_restore_size - NUM_TILES*TILE_SIZE;
+				void *amx_start = (void *)tile_state->xsave + amx_offset;
 				void *tiles_data = (void *)addr_gva2hva(vm, tiledata);
 				/* Only check TMM0 register, 1 tile */
 				ret = memcmp(amx_start, tiles_data, TILE_SIZE);
 				TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret);
-				kvm_x86_state_cleanup(state);
+			}
+			if (uc.args[1] & TEST_RESTORE_TILEDATA) {
+				fprintf(stderr, "GUEST_SYNC #%d, before KVM_SET_XSAVE\n", iter);
+				vcpu_xsave_set(vcpu, tile_state->xsave);
+				fprintf(stderr, "GUEST_SYNC #%d, after KVM_SET_XSAVE\n", iter);
 			}
 			if (uc.args[1] & TEST_SAVE_RESTORE) {
 				fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter);
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 4/4] selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
                   ` (2 preceding siblings ...)
  2026-01-01  9:05 ` [PATCH 3/4] selftests: kvm: try getting XFD and XSAVE state out of sync Paolo Bonzini
@ 2026-01-01  9:05 ` Paolo Bonzini
  2026-01-06  1:18 ` [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Sean Christopherson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-01  9:05 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: seanjc, x86

From: Sean Christopherson <seanjc@google.com>

Rework the AMX test's #NM handling to use kvm_asm_safe() to verify an #NM
actually occurs.  As is, a completely missing #NM could go unnoticed.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tools/testing/selftests/kvm/x86/amx_test.c | 30 +++++++++++++---------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c
index 00a42a592a37..371355bde54e 100644
--- a/tools/testing/selftests/kvm/x86/amx_test.c
+++ b/tools/testing/selftests/kvm/x86/amx_test.c
@@ -69,6 +69,12 @@ static inline void __tileloadd(void *tile)
 		     : : "a"(tile), "d"(0));
 }
 
+static inline int tileloadd_safe(void *tile)
+{
+	return kvm_asm_safe(".byte 0xc4,0xe2,0x7b,0x4b,0x04,0x10",
+			    "a"(tile), "d"(0));
+}
+
 static inline void __tilerelease(void)
 {
 	asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0" ::);
@@ -142,6 +148,8 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 						    struct tile_data *tiledata,
 						    struct xstate *xstate)
 {
+	int vector;
+
 	GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) &&
 		     this_cpu_has(X86_FEATURE_OSXSAVE));
 	check_xtile_info();
@@ -195,17 +203,13 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
 	set_tilecfg(amx_cfg);
 	__ldtilecfg(amx_cfg);
+
 	/* Trigger #NM exception */
-	__tileloadd(tiledata);
-	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+	vector = tileloadd_safe(tiledata);
+	__GUEST_ASSERT(vector == NM_VECTOR,
+		       "Wanted #NM on tileloadd with XFD[18]=1, got %s",
+		       ex_str(vector));
 
-	GUEST_DONE();
-}
-
-void guest_nm_handler(struct ex_regs *regs)
-{
-	/* Check if #NM is triggered by XFEATURE_MASK_XTILE_DATA */
-	GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(!(get_cr0() & X86_CR0_TS));
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
@@ -217,6 +221,11 @@ void guest_nm_handler(struct ex_regs *regs)
 	/* xfd=0, enable amx */
 	wrmsr(MSR_IA32_XFD, 0);
 	GUEST_SYNC(TEST_SAVE_RESTORE);
+
+	__tileloadd(tiledata);
+	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+
+	GUEST_DONE();
 }
 
 int main(int argc, char *argv[])
@@ -253,9 +262,6 @@ int main(int argc, char *argv[])
 
 	vcpu_regs_get(vcpu, &regs1);
 
-	/* Register #NM handler */
-	vm_install_exception_handler(vm, NM_VECTOR, guest_nm_handler);
-
 	/* amx cfg for guest_code */
 	amx_cfg = vm_vaddr_alloc_page(vm);
 	memset(addr_gva2hva(vm, amx_cfg), 0x0, getpagesize());
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
@ 2026-01-03  2:06   ` Yao Yuan
  2026-01-05 17:31     ` Sean Christopherson
  2026-01-06  0:54   ` Jim Mattson
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Yao Yuan @ 2026-01-03  2:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, x86, stable

On Thu, Jan 01, 2026 at 10:05:13AM +0100, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
> response to a guest WRMSR, clear XFD-disabled features in the saved (or to
> be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
> features that are disabled via the guest's XFD.  Because the kernel
> executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
> will cause XRSTOR to #NM and panic the kernel.
>
> E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:
>
>   ------------[ cut here ]------------
>   WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
>   Modules linked in: kvm_intel kvm irqbypass
>   CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:exc_device_not_available+0x101/0x110
>   Call Trace:
>    <TASK>
>    asm_exc_device_not_available+0x1a/0x20
>   RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
>    switch_fpu_return+0x4a/0xb0
>    kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
>    kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
>    __x64_sys_ioctl+0x8f/0xd0
>    do_syscall_64+0x62/0x940
>    entry_SYSCALL_64_after_hwframe+0x4b/0x53
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
>
> This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
> and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
> call to fpu_update_guest_xfd().
>
> Similarly, if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:
>
>   ------------[ cut here ]------------
>   WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
>   Modules linked in: kvm_intel kvm irqbypass
>   CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:exc_device_not_available+0x101/0x110
>   Call Trace:
>    <TASK>
>    asm_exc_device_not_available+0x1a/0x20
>   RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
>    fpu_swap_kvm_fpstate+0x6b/0x120
>    kvm_load_guest_fpu+0x30/0x80 [kvm]
>    kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
>    kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
>    __x64_sys_ioctl+0x8f/0xd0
>    do_syscall_64+0x62/0x940
>    entry_SYSCALL_64_after_hwframe+0x4b/0x53
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
>
> The new behavior is consistent with the AMX architecture.  Per Intel's SDM,
> XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
> (and non-compacted XSAVE saves the initial configuration of the state
> component):
>
>   If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
>   the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
>   instead, it operates as if XINUSE[i] = 0 (and the state component was
>   in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
>   header as 0; in addition, XSAVE saves the initial configuration of the
>   state component (the other instructions do not save state component i).
>
> Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
> a constant XFD based on the set of enabled features when XSAVEing for
> a struct fpu_guest.  However, having XSTATE_BV[i]=1 for XFD-disabled
> features can only happen in the above interrupt case, or in similar
> scenarios involving preemption on preemptible kernels, because
> fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
> outgoing FPU state with the current XFD; and that is (on all but the
> first WRMSR to XFD) the guest XFD.
>
> Therefore, since XFD can only go out of sync with XSTATE_BV in those
> rare scenarios, we can consider it (de facto) part of the KVM ABI that
> KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.
>
> Reported-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> [Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
>  to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kernel/fpu/core.c | 32 +++++++++++++++++++++++++++++---
>  arch/x86/kvm/x86.c         |  9 +++++++++
>  2 files changed, 38 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index da233f20ae6f..166c380b0161 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
>  #ifdef CONFIG_X86_64
>  void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
>  {
> +	struct fpstate *fpstate = guest_fpu->fpstate;
> +
>  	fpregs_lock();
> -	guest_fpu->fpstate->xfd = xfd;
> -	if (guest_fpu->fpstate->in_use)
> -		xfd_update_state(guest_fpu->fpstate);
> +
> +	/*
> +	 * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> +	 * the save state to initialized.  Likewise, KVM_GET_XSAVE does the
> +	 * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
> +	 *
> +	 * If the guest's FPU state is in hardware, just update XFD: the XSAVE
> +	 * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
> +	 *
> +	 * If however the guest's FPU state is NOT resident in hardware, clear
> +	 * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
> +	 * attempt to load disabled components and generate #NM _in the host_.
> +	 */

Hi Sean and Paolo,

> +	if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
> +		fpstate->regs.xsave.header.xfeatures &= ~xfd;
> +
> +	fpstate->xfd = xfd;
> +	if (fpstate->in_use)
> +		xfd_update_state(fpstate);

I see a *small* window where a host IRQ can happen just
after the TIF_NEED_FPU_LOAD check above, which could set
TIF_NEED_FPU_LOAD without clearing the XFD-disabled bits
from fpstate->regs.xsave.header.xfeatures.

But there's a WARN in kernel_fpu_begin_mask():

	WARN_ON_FPU(!irq_fpu_usable());

irq_fpu_usable()
{
	...
	/*
	 * In hard interrupt context it's safe when soft interrupts
	 * are enabled, which means the interrupt did not hit in
	 * a fpregs_lock()'ed critical region.
	 */
	return !softirq_count();
}

Looks like we are relying on this to catch the above *small* window,
yet we're inside a fpregs_lock() region.

Is this a correct understanding?

> +
>  	fpregs_unlock();
>  }
>  EXPORT_SYMBOL_FOR_KVM(fpu_update_guest_xfd);
> @@ -430,6 +449,13 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
>  	if (ustate->xsave.header.xfeatures & ~xcr0)
>  		return -EINVAL;
>
> +	/*
> +	 * Disabled features must be in their initial state, otherwise XRSTOR
> +	 * causes an exception.
> +	 */
> +	if (WARN_ON_ONCE(ustate->xsave.header.xfeatures & kstate->xfd))
> +		return -EINVAL;
> +
>  	/*
>  	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
>  	 * in the header.  KVM's odd ABI is to leave PKRU untouched in this
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ff8812f3a129..c0416f53b5f5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5807,9 +5807,18 @@ static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
>  static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
>  					struct kvm_xsave *guest_xsave)
>  {
> +	union fpregs_state *xstate = (union fpregs_state *)guest_xsave->region;
> +
>  	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
>  		return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
>
> +	/*
> +	 * Do not reject non-initialized disabled features for backwards
> +	 * compatibility, but clear XSTATE_BV[i] whenever XFD[i]=1.
> +	 * Otherwise, XRSTOR would cause a #NM.
> +	 */
> +	xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd;
> +
>  	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
>  					      guest_xsave->region,
>  					      kvm_caps.supported_xcr0,
> --
> 2.52.0
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-03  2:06   ` Yao Yuan
@ 2026-01-05 17:31     ` Sean Christopherson
  2026-01-06  5:25       ` Yao Yuan
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2026-01-05 17:31 UTC (permalink / raw)
  To: Yao Yuan; +Cc: Paolo Bonzini, linux-kernel, kvm, x86, stable

On Sat, Jan 03, 2026, Yao Yuan wrote:
> On Thu, Jan 01, 2026 at 10:05:13AM +0100, Paolo Bonzini wrote:
> > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > index da233f20ae6f..166c380b0161 100644
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
> >  #ifdef CONFIG_X86_64
> >  void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
> >  {
> > +	struct fpstate *fpstate = guest_fpu->fpstate;
> > +
> >  	fpregs_lock();
> > -	guest_fpu->fpstate->xfd = xfd;
> > -	if (guest_fpu->fpstate->in_use)
> > -		xfd_update_state(guest_fpu->fpstate);
> > +
> > +	/*
> > +	 * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> > +	 * the save state to initialized.  Likewise, KVM_GET_XSAVE does the
> > +	 * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
> > +	 *
> > +	 * If the guest's FPU state is in hardware, just update XFD: the XSAVE
> > +	 * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
> > +	 *
> > +	 * If however the guest's FPU state is NOT resident in hardware, clear
> > +	 * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
> > +	 * attempt to load disabled components and generate #NM _in the host_.
> > +	 */
> 
> Hi Sean and Paolo,
> 
> > +	if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
> > +		fpstate->regs.xsave.header.xfeatures &= ~xfd;
> > +
> > +	fpstate->xfd = xfd;
> > +	if (fpstate->in_use)
> > +		xfd_update_state(fpstate);
> 
> I see a *small* window where a host IRQ can happen just after the above
> TIF_NEED_FPU_LOAD check, which could set TIF_NEED_FPU_LOAD

Only if the code using FPU from IRQ context is buggy.  More below.

> but without clearing the xfd bits from fpstate->regs.xsave.header.xfeatures.
> 
> But there's a WARN in kernel_fpu_begin_mask():
> 
> 	WARN_ON_FPU(!irq_fpu_usable());
> 
> irq_fpu_usable()
> {
> 	...
> 	/*
> 	 * In hard interrupt context it's safe when soft interrupts
> 	 * are enabled, which means the interrupt did not hit in
> 	 * a fpregs_lock()'ed critical region.
> 	 */
> 	return !softirq_count();
> }
> 
> Looks like we are relying on this to catch the above *small* window,
> yet we're inside a fpregs_lock() region.

Kernel use of FPU from (soft) IRQ context is required to check irq_fpu_usable()
(e.g. via may_use_simd()), i.e. calling fpregs_lock() protects against the kernel
using the FPU and thus setting TIF_NEED_FPU_LOAD.

The WARN in kernel_fpu_begin_mask() is purely a sanity check to help detect and
debug buggy users.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/4] selftests: kvm: replace numbered sync points with actions
  2026-01-01  9:05 ` [PATCH 2/4] selftests: kvm: replace numbered sync points with actions Paolo Bonzini
@ 2026-01-06  0:02   ` Sean Christopherson
  2026-01-07 22:28     ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2026-01-06  0:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, x86, stable

On Thu, Jan 01, 2026, Paolo Bonzini wrote:
> Rework the guest=>host syncs in the AMX test to use named actions instead
> of arbitrary, incrementing numbers.  The "stage" of the test has no real
meaning; what matters is what action the test wants the host to perform.
> The incrementing numbers are somewhat helpful for triaging failures, but
> fully debugging failures almost always requires a much deeper dive into
> the test (and KVM).
> 
> Using named actions not only makes it easier to extend the test without
> having to shift all sync point numbers, it makes the code easier to read.
> 
> [Commit message by Sean Christopherson]
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> 	I wrote this before seeing your patch... It's obviously
> 	similar but different enough that I kept my version. :)

Heh, no worries.

> @@ -244,6 +254,7 @@ int main(int argc, char *argv[])
>  	memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
>  	vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
>  
> +	int iter = 0;

If we want to retain "tracing" of guest syncs, I vote to provide the information
from the guest, otherwise I'll end up counting GUEST_SYNC() calls on my fingers
(and run out of fingers) :-D.

E.g. if we wrap all GUEST_SYNC() calls in a macro, we can print the line number
without having to hardcode sync point numbers.

# ./x86/amx_test 
Random seed: 0x6b8b4567
GUEST_SYNC line 164, save/restore VM state
GUEST_SYNC line 168, save/restore VM state
GUEST_SYNC line 172, save/restore VM state
GUEST_SYNC line 175, save tiledata
GUEST_SYNC line 175, check TMM0 contents
GUEST_SYNC line 175, save/restore VM state
GUEST_SYNC line 181, before KVM_SET_XSAVE
GUEST_SYNC line 181, after KVM_SET_XSAVE
GUEST_SYNC line 182, save/restore VM state
GUEST_SYNC line 186, save/restore VM state
GUEST_SYNC line 210, save/restore VM state
GUEST_SYNC line 224, save/restore VM state
GUEST_SYNC line 231, save/restore VM state
GUEST_SYNC line 234, check TMM0 contents
GUEST_SYNC line 234, save/restore VM state
UCALL_DONE

---
 tools/testing/selftests/kvm/x86/amx_test.c | 55 +++++++++++++---------
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c
index 37b166260ee3..9593ecd47d28 100644
--- a/tools/testing/selftests/kvm/x86/amx_test.c
+++ b/tools/testing/selftests/kvm/x86/amx_test.c
@@ -131,19 +131,27 @@ static void set_tilecfg(struct tile_config *cfg)
 }
 
 enum {
+	TEST_SYNC_LINE_NUMBER_MASK = GENMASK(15, 0),
+
 	/* Retrieve TMM0 from guest, stash it for TEST_RESTORE_TILEDATA */
-	TEST_SAVE_TILEDATA = 1,
+	TEST_SAVE_TILEDATA = BIT(16),
 
 	/* Check TMM0 against tiledata */
-	TEST_COMPARE_TILEDATA = 2,
+	TEST_COMPARE_TILEDATA = BIT(17),
 
 	/* Restore TMM0 from earlier save */
-	TEST_RESTORE_TILEDATA = 4,
+	TEST_RESTORE_TILEDATA = BIT(18),
 
 	/* Full VM save/restore */
-	TEST_SAVE_RESTORE = 8,
+	TEST_SAVE_RESTORE = BIT(19),
 };
 
+#define AMX_GUEST_SYNC(action)						\
+do {									\
+	kvm_static_assert(!((action) & TEST_SYNC_LINE_NUMBER_MASK));	\
+	GUEST_SYNC((action) | __LINE__);				\
+} while (0)
+
 static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 						    struct tile_data *tiledata,
 						    struct xstate *xstate)
@@ -153,29 +161,29 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) &&
 		     this_cpu_has(X86_FEATURE_OSXSAVE));
 	check_xtile_info();
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 
 	/* xfd=0, enable amx */
 	wrmsr(MSR_IA32_XFD, 0);
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == 0);
 	set_tilecfg(amx_cfg);
 	__ldtilecfg(amx_cfg);
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 	/* Check save/restore when trap to userspace */
 	__tileloadd(tiledata);
-	GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
 
 	/* xfd=0x40000, disable amx tiledata */
 	wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA);
 
 	/* host tries setting tiledata while guest XFD is set */
-	GUEST_SYNC(TEST_RESTORE_TILEDATA);
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_RESTORE_TILEDATA);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 
 	wrmsr(MSR_IA32_XFD, 0);
 	__tilerelease();
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 	/*
 	 * After XSAVEC, XTILEDATA is cleared in the xstate_bv but is set in
 	 * the xcomp_bv.
@@ -199,7 +207,7 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA));
 	GUEST_ASSERT((xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA));
 
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
 	set_tilecfg(amx_cfg);
 	__ldtilecfg(amx_cfg);
@@ -213,17 +221,17 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
 	GUEST_ASSERT(!(get_cr0() & X86_CR0_TS));
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
 	GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
 	/* Clear xfd_err */
 	wrmsr(MSR_IA32_XFD_ERR, 0);
 	/* xfd=0, enable amx */
 	wrmsr(MSR_IA32_XFD, 0);
-	GUEST_SYNC(TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
 
 	__tileloadd(tiledata);
-	GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
+	AMX_GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
 
 	GUEST_DONE();
 }
@@ -275,7 +283,6 @@ int main(int argc, char *argv[])
 	memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
 	vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
 
-	int iter = 0;
 	for (;;) {
 		vcpu_run(vcpu);
 		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
@@ -285,13 +292,14 @@ int main(int argc, char *argv[])
 			REPORT_GUEST_ASSERT(uc);
 			/* NOT REACHED */
 		case UCALL_SYNC:
-			++iter;
 			if (uc.args[1] & TEST_SAVE_TILEDATA) {
-				fprintf(stderr, "GUEST_SYNC #%d, save tiledata\n", iter);
+				fprintf(stderr, "GUEST_SYNC line %d, save tiledata\n",
+					(u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
 				tile_state = vcpu_save_state(vcpu);
 			}
 			if (uc.args[1] & TEST_COMPARE_TILEDATA) {
-				fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter);
+				fprintf(stderr, "GUEST_SYNC line %d, check TMM0 contents\n",
+					(u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
 
 				/* Compacted mode, get amx offset by xsave area
 				 * size subtract 8K amx size.
@@ -304,12 +312,15 @@ int main(int argc, char *argv[])
 				TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret);
 			}
 			if (uc.args[1] & TEST_RESTORE_TILEDATA) {
-				fprintf(stderr, "GUEST_SYNC #%d, before KVM_SET_XSAVE\n", iter);
+				fprintf(stderr, "GUEST_SYNC line %d, before KVM_SET_XSAVE\n",
+					(u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
 				vcpu_xsave_set(vcpu, tile_state->xsave);
-				fprintf(stderr, "GUEST_SYNC #%d, after KVM_SET_XSAVE\n", iter);
+				fprintf(stderr, "GUEST_SYNC line %d, after KVM_SET_XSAVE\n",
+					(u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
 			}
 			if (uc.args[1] & TEST_SAVE_RESTORE) {
-				fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter);
+				fprintf(stderr, "GUEST_SYNC line %d, save/restore VM state\n",
+					(u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
 				state = vcpu_save_state(vcpu);
 				memset(&regs1, 0, sizeof(regs1));
 				vcpu_regs_get(vcpu, &regs1);

base-commit: bc6eb58bab2fda28ef473ff06f4229c814c29380
--

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
  2026-01-03  2:06   ` Yao Yuan
@ 2026-01-06  0:54   ` Jim Mattson
  2026-01-06  1:17     ` Sean Christopherson
  2026-01-07  0:28   ` Chang S. Bae
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2026-01-06  0:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, x86, stable

On Thu, Jan 1, 2026 at 1:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
> ...
> +       /*
> +        * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> +        * the save state to initialized.

This comment suggests that an entry should be added to
Documentation/virt/kvm/x86/errata.rst.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-06  0:54   ` Jim Mattson
@ 2026-01-06  1:17     ` Sean Christopherson
  2026-01-06 17:56       ` Jim Mattson
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2026-01-06  1:17 UTC (permalink / raw)
  To: Jim Mattson; +Cc: Paolo Bonzini, linux-kernel, kvm, x86, stable

On Mon, Jan 05, 2026, Jim Mattson wrote:
> On Thu, Jan 1, 2026 at 1:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > From: Sean Christopherson <seanjc@google.com>
> > ...
> > +       /*
> > +        * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> > +        * the save state to initialized.
> 
> This comment suggests that an entry should be added to
> Documentation/virt/kvm/x86/errata.rst.

Hmm, I don't think it's necessary; the SDM (in a style more suited for the APM,
*sigh*), "recommends" that software not rely on state being maintained when disabled
via XFD.

  Before doing so, system software should first initialize AMX state (e.g., by
  executing TILERELEASE); maintaining AMX state in a non-initialized state may
  have negative power and performance implications and will prevent the execution
  of In-Field Scan tests. In addition, software should not rely on the state of
  the tile data after setting IA32_XFD[17] or IA32_XFD[18]; software should always
  reload or reinitialize the tile data after clearing IA32_XFD[17] and IA32_XFD[18].

  System software should not use XFD to implement a “lazy restore” approach to
  management of the TILEDATA state component. This approach will not operate correctly
  for a variety of reasons. One is that the LDTILECFG and TILERELEASE instructions
  initialize TILEDATA and do not cause an #NM exception. Another is that an execution
  of XSAVE, XSAVEC, XSAVEOPT, or XSAVES by a user thread will save TILEDATA as
  initialized instead of the data expected by the user thread.

I suppose that doesn't _quite_ say that the CPU is allowed to clobber state, but
it's darn close.

I'm definitely not opposed to officially documenting KVM's virtual CPU implementation,
but IMO calling it an erratum is a bit unfair.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
                   ` (3 preceding siblings ...)
  2026-01-01  9:05 ` [PATCH 4/4] selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 Paolo Bonzini
@ 2026-01-06  1:18 ` Sean Christopherson
  2026-01-15 12:22 ` Borislav Petkov
  2026-01-16 12:22 ` Borislav Petkov
  6 siblings, 0 replies; 37+ messages in thread
From: Sean Christopherson @ 2026-01-06  1:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, x86

On Thu, Jan 01, 2026, Paolo Bonzini wrote:
> Fix a possible host panic, due to an unexpected #NM, when a KVM guest
> is using AMX features.
> 
> The guest's XFD value, which is stored in fpstate->xfd, is used for both
> guest execution and host XSAVE operations.  However, the guest-configured
> XFD setting can disable features that were enabled when the guest executed
> XSAVE, and this causes a #NM when executing XRSTOR on the guest FPU state.
> 
> This can happen in two cases: due to a KVM_SET_XSAVE that includes a
> disabled component, or if an interrupt causes XSAVE to be executed
> before the call to fpu_update_guest_xfd().
> 
> The first patch fixes both cases, the rest is improvements to selftests
> in order to cover this test and also verify that #NM faults are injected
> correctly.
> 
> v1 had extra patches to export higher-level functions for KVM in place
> of switch_fpu_return() and fpregs_assert_state_consistent().  Those
> were part of refactoring how KVM loaded guest state when KVM_RUN is
> issued, but are not needed anymore with this v2 fix and I will submit
> them separately.
> 
> Tested on a Sapphire Rapids machine, reviews and acks are welcome so
> that I can submit it to Linus via the KVM tree.

Tested on EMR with my simulated IRQ hack.  Other than ongoing complaints
about the prints in the selftest, LGTM :-)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-05 17:31     ` Sean Christopherson
@ 2026-01-06  5:25       ` Yao Yuan
  0 siblings, 0 replies; 37+ messages in thread
From: Yao Yuan @ 2026-01-06  5:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yao Yuan, Paolo Bonzini, linux-kernel, kvm, x86, stable

On Mon, Jan 05, 2026 at 09:31:05AM +0800, Sean Christopherson wrote:
> On Sat, Jan 03, 2026, Yao Yuan wrote:
> > On Thu, Jan 01, 2026 at 10:05:13AM +0100, Paolo Bonzini wrote:
> > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > > index da233f20ae6f..166c380b0161 100644
> > > --- a/arch/x86/kernel/fpu/core.c
> > > +++ b/arch/x86/kernel/fpu/core.c
> > > @@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
> > >  #ifdef CONFIG_X86_64
> > >  void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
> > >  {
> > > +	struct fpstate *fpstate = guest_fpu->fpstate;
> > > +
> > >  	fpregs_lock();
> > > -	guest_fpu->fpstate->xfd = xfd;
> > > -	if (guest_fpu->fpstate->in_use)
> > > -		xfd_update_state(guest_fpu->fpstate);
> > > +
> > > +	/*
> > > +	 * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> > > +	 * the save state to initialized.  Likewise, KVM_GET_XSAVE does the
> > > +	 * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
> > > +	 *
> > > +	 * If the guest's FPU state is in hardware, just update XFD: the XSAVE
> > > +	 * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
> > > +	 *
> > > +	 * If however the guest's FPU state is NOT resident in hardware, clear
> > > +	 * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
> > > +	 * attempt to load disabled components and generate #NM _in the host_.
> > > +	 */
> >
> > Hi Sean and Paolo,
> >
> > > +	if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
> > > +		fpstate->regs.xsave.header.xfeatures &= ~xfd;
> > > +
> > > +	fpstate->xfd = xfd;
> > > +	if (fpstate->in_use)
> > > +		xfd_update_state(fpstate);
> >
> > I see a *small* window where a host IRQ can happen just after the above
> > TIF_NEED_FPU_LOAD check, which could set TIF_NEED_FPU_LOAD
>
> Only if the code using FPU from IRQ context is buggy.  More below.
>
> > but without clearing the xfd bits from fpstate->regs.xsave.header.xfeatures.
> >
> > But there's a WARN in kernel_fpu_begin_mask():
> >
> > 	WARN_ON_FPU(!irq_fpu_usable());
> >
> > irq_fpu_usable()
> > {
> > 	...
> > 	/*
> > 	 * In hard interrupt context it's safe when soft interrupts
> > 	 * are enabled, which means the interrupt did not hit in
> > 	 * a fpregs_lock()'ed critical region.
> > 	 */
> > 	return !softirq_count();
> > }
> >
> > Looks like we are relying on this to catch the above *small* window,
> > yet we're inside a fpregs_lock() region.
>
> Kernel use of FPU from (soft) IRQ context is required to check irq_fpu_usable()
> (e.g. via may_use_simd()), i.e. calling fpregs_lock() protects against the kernel
> using the FPU and thus setting TIF_NEED_FPU_LOAD.
>
> The WARN in kernel_fpu_begin_mask() is purely a sanity check to help detect and
> debug buggy users.

OK, I have same understanding w/ you, thanks.

Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-06  1:17     ` Sean Christopherson
@ 2026-01-06 17:56       ` Jim Mattson
  2026-01-15 16:07         ` Dave Hansen
  0 siblings, 1 reply; 37+ messages in thread
From: Jim Mattson @ 2026-01-06 17:56 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, linux-kernel, kvm, x86, stable

On Mon, Jan 5, 2026 at 5:17 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Jan 05, 2026, Jim Mattson wrote:
> > On Thu, Jan 1, 2026 at 1:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > >
> > > From: Sean Christopherson <seanjc@google.com>
> > > ...
> > > +       /*
> > > +        * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> > > +        * the save state to initialized.
> >
> > This comment suggests that an entry should be added to
> > Documentation/virt/kvm/x86/errata.rst.
>
> Hmm, I don't think it's necessary; the SDM (in a style more suited for the APM,
> *sigh*), "recommends" that software not rely on state being maintained when disabled
> via XFD.
>
>   Before doing so, system software should first initialize AMX state (e.g., by
>   executing TILERELEASE); maintaining AMX state in a non-initialized state may
>   have negative power and performance implications and will prevent the execution
>   of In-Field Scan tests. In addition, software should not rely on the state of
>   the tile data after setting IA32_XFD[17] or IA32_XFD[18]; software should always
>   reload or reinitialize the tile data after clearing IA32_XFD[17] and IA32_XFD[18].
>
>   System software should not use XFD to implement a “lazy restore” approach to
>   management of the TILEDATA state component. This approach will not operate correctly
>   for a variety of reasons. One is that the LDTILECFG and TILERELEASE instructions
>   initialize TILEDATA and do not cause an #NM exception. Another is that an execution
>   of XSAVE, XSAVEC, XSAVEOPT, or XSAVES by a user thread will save TILEDATA as
>   initialized instead of the data expected by the user thread.
>
> I suppose that doesn't _quite_ say that the CPU is allowed to clobber state, but
> it's darn close.
>
> I'm definitely not opposed to officially documenting KVM's virtual CPU implementation,
> but IMO calling it an erratum is a bit unfair.

Apologies. You're right. Though Intel is a bit coy, the only way to
interpret that section of the SDM is to conclude that the AMX state in
the CPU becomes undefined when XFD[18] is set.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
  2026-01-03  2:06   ` Yao Yuan
  2026-01-06  0:54   ` Jim Mattson
@ 2026-01-07  0:28   ` Chang S. Bae
  2026-01-07 22:33     ` Paolo Bonzini
  2026-01-08  3:06   ` Binbin Wu
  2026-01-15 15:54   ` Dave Hansen
  4 siblings, 1 reply; 37+ messages in thread
From: Chang S. Bae @ 2026-01-07  0:28 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, x86, stable

On 1/1/2026 1:05 AM, Paolo Bonzini wrote:
> 
> Therefore, XFD can only go out of sync with XSTATE_BV in the above
> interrupt case, or in similar scenarios involving preemption on

This seems to restate the scenario already described above; I’m not sure
whether the repetition is intentional.

> preemptible kernels, and it we can consider it (de facto) part of KVM
                            ^^^^^
I assume you meant 'we' here, though you might want to slightly rephrase 
it, given the previous debate:

   https://lore.kernel.org/all/87iko54f42.ffs@tglx/

> ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.

On my side, testing on AMX systems, I was able to reproduce the issue 
described and confirm that this patch resolves it:

   Tested-by: Chang S. Bae <chang.seok.bae@intel.com>

The added guards on these paths also look reasonable to me with the 
established KVM ABI. So,

   Reviewed-by: Chang S. Bae <chang.seok.bae@intel.com>

Thanks,
Chang

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/4] selftests: kvm: replace numbered sync points with actions
  2026-01-06  0:02   ` Sean Christopherson
@ 2026-01-07 22:28     ` Paolo Bonzini
  2026-01-08 20:26       ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-07 22:28 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, x86, stable

On Tue, Jan 6, 2026 at 1:02 AM Sean Christopherson <seanjc@google.com> wrote:
> > @@ -244,6 +254,7 @@ int main(int argc, char *argv[])
> >       memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
> >       vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
> >
> > +     int iter = 0;
>
> If we want to retain "tracing" of guest syncs, I vote to provide the information
> from the guest, otherwise I'll end up counting GUEST_SYNC() calls on my fingers
> (and run out of fingers) :-D.

I had a similar idea, but I was too lazy to implement it because for a
very linear test such as this one, "12n" in vi does wonders...

> E.g. if we wrap all GUEST_SYNC() calls in a macro, we can print the line number
> without having to hardcode sync point numbers.

... but there are actually better reasons than laziness and linearity
to keep the simple "iter++".

First, while using line numbers has the advantage of zero maintenance,
the disadvantage is that they change all the time as you're debugging.
So you are left slightly puzzled as to whether the number changed because
the test changed or because of the extra debugging code you added.

Second, the iteration number is probably more useful to identify the
places at which the VM was reentered (which are where the iteration
number changes), than to identify the specific GUEST_SYNC that failed;
from that perspective there's not much difference between line
numbers, manually-numbered sync points, or incrementing a counter in
main().

Paolo

> # ./x86/amx_test
> Random seed: 0x6b8b4567
> GUEST_SYNC line 164, save/restore VM state
> GUEST_SYNC line 168, save/restore VM state
> GUEST_SYNC line 172, save/restore VM state
> GUEST_SYNC line 175, save tiledata
> GUEST_SYNC line 175, check TMM0 contents
> GUEST_SYNC line 175, save/restore VM state
> GUEST_SYNC line 181, before KVM_SET_XSAVE
> GUEST_SYNC line 181, after KVM_SET_XSAVE
> GUEST_SYNC line 182, save/restore VM state
> GUEST_SYNC line 186, save/restore VM state
> GUEST_SYNC line 210, save/restore VM state
> GUEST_SYNC line 224, save/restore VM state
> GUEST_SYNC line 231, save/restore VM state
> GUEST_SYNC line 234, check TMM0 contents
> GUEST_SYNC line 234, save/restore VM state
> UCALL_DONE
>
> ---
>  tools/testing/selftests/kvm/x86/amx_test.c | 55 +++++++++++++---------
>  1 file changed, 33 insertions(+), 22 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86/amx_test.c b/tools/testing/selftests/kvm/x86/amx_test.c
> index 37b166260ee3..9593ecd47d28 100644
> --- a/tools/testing/selftests/kvm/x86/amx_test.c
> +++ b/tools/testing/selftests/kvm/x86/amx_test.c
> @@ -131,19 +131,27 @@ static void set_tilecfg(struct tile_config *cfg)
>  }
>
>  enum {
> +       TEST_SYNC_LINE_NUMBER_MASK = GENMASK(15, 0),
> +
>         /* Retrieve TMM0 from guest, stash it for TEST_RESTORE_TILEDATA */
> -       TEST_SAVE_TILEDATA = 1,
> +       TEST_SAVE_TILEDATA = BIT(16),
>
>         /* Check TMM0 against tiledata */
> -       TEST_COMPARE_TILEDATA = 2,
> +       TEST_COMPARE_TILEDATA = BIT(17),
>
>         /* Restore TMM0 from earlier save */
> -       TEST_RESTORE_TILEDATA = 4,
> +       TEST_RESTORE_TILEDATA = BIT(18),
>
>         /* Full VM save/restore */
> -       TEST_SAVE_RESTORE = 8,
> +       TEST_SAVE_RESTORE = BIT(19),
>  };
>
> +#define AMX_GUEST_SYNC(action)                                         \
> +do {                                                                   \
> +       kvm_static_assert(!((action) & TEST_SYNC_LINE_NUMBER_MASK));    \
> +       GUEST_SYNC((action) | __LINE__);                                \
> +} while (0)
> +
>  static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
>                                                     struct tile_data *tiledata,
>                                                     struct xstate *xstate)
> @@ -153,29 +161,29 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
>         GUEST_ASSERT(this_cpu_has(X86_FEATURE_XSAVE) &&
>                      this_cpu_has(X86_FEATURE_OSXSAVE));
>         check_xtile_info();
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>
>         /* xfd=0, enable amx */
>         wrmsr(MSR_IA32_XFD, 0);
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == 0);
>         set_tilecfg(amx_cfg);
>         __ldtilecfg(amx_cfg);
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>         /* Check save/restore when trap to userspace */
>         __tileloadd(tiledata);
> -       GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_TILEDATA | TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
>
>         /* xfd=0x40000, disable amx tiledata */
>         wrmsr(MSR_IA32_XFD, XFEATURE_MASK_XTILE_DATA);
>
>         /* host tries setting tiledata while guest XFD is set */
> -       GUEST_SYNC(TEST_RESTORE_TILEDATA);
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_RESTORE_TILEDATA);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>
>         wrmsr(MSR_IA32_XFD, 0);
>         __tilerelease();
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>         /*
>          * After XSAVEC, XTILEDATA is cleared in the xstate_bv but is set in
>          * the xcomp_bv.
> @@ -199,7 +207,7 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
>         GUEST_ASSERT(!(xstate->header.xstate_bv & XFEATURE_MASK_XTILE_DATA));
>         GUEST_ASSERT((xstate->header.xcomp_bv & XFEATURE_MASK_XTILE_DATA));
>
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
>         set_tilecfg(amx_cfg);
>         __ldtilecfg(amx_cfg);
> @@ -213,17 +221,17 @@ static void __attribute__((__flatten__)) guest_code(struct tile_config *amx_cfg,
>         GUEST_ASSERT(!(get_cr0() & X86_CR0_TS));
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD_ERR) == XFEATURE_MASK_XTILE_DATA);
>         GUEST_ASSERT(rdmsr(MSR_IA32_XFD) == XFEATURE_MASK_XTILE_DATA);
>         /* Clear xfd_err */
>         wrmsr(MSR_IA32_XFD_ERR, 0);
>         /* xfd=0, enable amx */
>         wrmsr(MSR_IA32_XFD, 0);
> -       GUEST_SYNC(TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_SAVE_RESTORE);
>
>         __tileloadd(tiledata);
> -       GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
> +       AMX_GUEST_SYNC(TEST_COMPARE_TILEDATA | TEST_SAVE_RESTORE);
>
>         GUEST_DONE();
>  }
> @@ -275,7 +283,6 @@ int main(int argc, char *argv[])
>         memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
>         vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
>
> -       int iter = 0;
>         for (;;) {
>                 vcpu_run(vcpu);
>                 TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
> @@ -285,13 +292,14 @@ int main(int argc, char *argv[])
>                         REPORT_GUEST_ASSERT(uc);
>                         /* NOT REACHED */
>                 case UCALL_SYNC:
> -                       ++iter;
>                         if (uc.args[1] & TEST_SAVE_TILEDATA) {
> -                               fprintf(stderr, "GUEST_SYNC #%d, save tiledata\n", iter);
> +                               fprintf(stderr, "GUEST_SYNC line %d, save tiledata\n",
> +                                       (u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
>                                 tile_state = vcpu_save_state(vcpu);
>                         }
>                         if (uc.args[1] & TEST_COMPARE_TILEDATA) {
> -                               fprintf(stderr, "GUEST_SYNC #%d, check TMM0 contents\n", iter);
> +                               fprintf(stderr, "GUEST_SYNC line %d, check TMM0 contents\n",
> +                                       (u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
>
>                                 /* Compacted mode, get amx offset by xsave area
>                                  * size subtract 8K amx size.
> @@ -304,12 +312,15 @@ int main(int argc, char *argv[])
>                                 TEST_ASSERT(ret == 0, "memcmp failed, ret=%d", ret);
>                         }
>                         if (uc.args[1] & TEST_RESTORE_TILEDATA) {
> -                               fprintf(stderr, "GUEST_SYNC #%d, before KVM_SET_XSAVE\n", iter);
> +                               fprintf(stderr, "GUEST_SYNC line %d, before KVM_SET_XSAVE\n",
> +                                       (u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
>                                 vcpu_xsave_set(vcpu, tile_state->xsave);
> -                               fprintf(stderr, "GUEST_SYNC #%d, after KVM_SET_XSAVE\n", iter);
> +                               fprintf(stderr, "GUEST_SYNC line %d, after KVM_SET_XSAVE\n",
> +                                       (u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
>                         }
>                         if (uc.args[1] & TEST_SAVE_RESTORE) {
> -                               fprintf(stderr, "GUEST_SYNC #%d, save/restore VM state\n", iter);
> +                               fprintf(stderr, "GUEST_SYNC line %d, save/restore VM state\n",
> +                                       (u16)(uc.args[1] & TEST_SYNC_LINE_NUMBER_MASK));
>                                 state = vcpu_save_state(vcpu);
>                                 memset(&regs1, 0, sizeof(regs1));
>                                 vcpu_regs_get(vcpu, &regs1);
>
> base-commit: bc6eb58bab2fda28ef473ff06f4229c814c29380
> --
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-07  0:28   ` Chang S. Bae
@ 2026-01-07 22:33     ` Paolo Bonzini
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-07 22:33 UTC (permalink / raw)
  To: Chang S. Bae; +Cc: linux-kernel, kvm, seanjc, x86, stable

On Wed, Jan 7, 2026 at 1:29 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 1/1/2026 1:05 AM, Paolo Bonzini wrote:
> >
> > Therefore, XFD can only go out of sync with XSTATE_BV in the above
> > interrupt case, or in similar scenarios involving preemption on
>
> This seems to restate the scenario already described above; I’m not sure
> whether the repetition is intentional.
>
> > preemptible kernels, and it we can consider it (de facto) part of KVM
>                             ^^^^^
> I assume you meant 'we' here though, you might want to slightly rephrase
> it, given the previous debate:
>
>    https://lore.kernel.org/all/87iko54f42.ffs@tglx/

There are two possible "we"s:

1) the code - in the context of this patch this would be "we force
XSTATE_BV[i] to 0" or "we can be preempted", and I agree it's bad form

2) the community, or the maintainers - this is the case in the commit
message, and I think it's acceptable. While I (Paolo) cannot forcibly
come to your computer and clear XSTATE_BV[i], I certainly can decide
that KVM will do so. :)

> > ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.
>
> On my side, testing on AMX systems, I was able to reproduce the issue
> described and confirm that this patch resolves it:
>
>    Tested-by: Chang S. Bae <chang.seok.bae@intel.com>
>    Reviewed-by: Chang S. Bae <chang.seok.bae@intel.com>

Thanks!

Paolo



* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
                     ` (2 preceding siblings ...)
  2026-01-07  0:28   ` Chang S. Bae
@ 2026-01-08  3:06   ` Binbin Wu
  2026-01-08 16:26     ` Paolo Bonzini
  2026-01-15 15:54   ` Dave Hansen
  4 siblings, 1 reply; 37+ messages in thread
From: Binbin Wu @ 2026-01-08  3:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, x86, stable



On 1/1/2026 5:05 PM, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
> response to a guest WRMSR, clear XFD-disabled features in the saved (or to
> be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
> features that are disabled via the guest's XFD.  Because the kernel
> executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
> will cause XRSTOR to #NM and panic the kernel.
> 
> E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:
> 
>   ------------[ cut here ]------------
>   WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
>   Modules linked in: kvm_intel kvm irqbypass
>   CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:exc_device_not_available+0x101/0x110
>   Call Trace:
>    <TASK>
>    asm_exc_device_not_available+0x1a/0x20
>   RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
>    switch_fpu_return+0x4a/0xb0
>    kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
>    kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
>    __x64_sys_ioctl+0x8f/0xd0
>    do_syscall_64+0x62/0x940
>    entry_SYSCALL_64_after_hwframe+0x4b/0x53
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
> 
> This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
> and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
> call to fpu_update_guest_xfd().
> 
> and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:
> 
>   ------------[ cut here ]------------
>   WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
>   Modules linked in: kvm_intel kvm irqbypass
>   CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:exc_device_not_available+0x101/0x110
>   Call Trace:
>    <TASK>
>    asm_exc_device_not_available+0x1a/0x20
>   RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
>    fpu_swap_kvm_fpstate+0x6b/0x120
>    kvm_load_guest_fpu+0x30/0x80 [kvm]
>    kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
>    kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
>    __x64_sys_ioctl+0x8f/0xd0
>    do_syscall_64+0x62/0x940
>    entry_SYSCALL_64_after_hwframe+0x4b/0x53
>    </TASK>
>   ---[ end trace 0000000000000000 ]---
> 
> The new behavior is consistent with the AMX architecture.  Per Intel's SDM,
> XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
> (and non-compacted XSAVE saves the initial configuration of the state
> component):
> 
>   If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
>   the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
>   instead, it operates as if XINUSE[i] = 0 (and the state component was
>   in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
>   header as 0; in addition, XSAVE saves the initial configuration of the
>   state component (the other instructions do not save state component i).
> 
> Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
> a constant XFD based on the set of enabled features when XSAVEing for
> a struct fpu_guest.  However, having XSTATE_BV[i]=1 for XFD-disabled
> features can only happen in the above interrupt case, or in similar
> scenarios involving preemption on preemptible kernels, because
> fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
> outgoing FPU state with the current XFD; and that is (on all but the
> first WRMSR to XFD) the guest XFD.
> 
> Therefore, XFD can only go out of sync with XSTATE_BV in the above
> interrupt case, or in similar scenarios involving preemption on
> preemptible kernels, and we can consider it (de facto) part of KVM
> ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

One nit below.

> 
> Reported-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> [Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
>  to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kernel/fpu/core.c | 32 +++++++++++++++++++++++++++++---
>  arch/x86/kvm/x86.c         |  9 +++++++++
>  2 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index da233f20ae6f..166c380b0161 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
>  #ifdef CONFIG_X86_64
>  void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
>  {
> +	struct fpstate *fpstate = guest_fpu->fpstate;
> +
>  	fpregs_lock();
> -	guest_fpu->fpstate->xfd = xfd;
> -	if (guest_fpu->fpstate->in_use)
> -		xfd_update_state(guest_fpu->fpstate);
> +
> +	/*
> +	 * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> +	 * the save state to initialized.  Likewise, KVM_GET_XSAVE does the

Nit:
To me "initialized" has the implication that it's active.
I prefer the description "initial state" or "initial configuration" used in
SDM here.
I am not a native English speaker though, please ignore it if it's just my
feeling.


> +	 * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
> +	 *
> +	 * If the guest's FPU state is in hardware, just update XFD: the XSAVE
> +	 * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
> +	 *
> +	 * If however the guest's FPU state is NOT resident in hardware, clear
> +	 * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
> +	 * attempt to load disabled components and generate #NM _in the host_.
> +	 */
> +	if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
> +		fpstate->regs.xsave.header.xfeatures &= ~xfd;
> +
> +	fpstate->xfd = xfd;
> +	if (fpstate->in_use)
> +		xfd_update_state(fpstate);
> +
>  	fpregs_unlock();
>  }
>  EXPORT_SYMBOL_FOR_KVM(fpu_update_guest_xfd);
> @@ -430,6 +449,13 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
>  	if (ustate->xsave.header.xfeatures & ~xcr0)
>  		return -EINVAL;
>  
> +	/*
> +	 * Disabled features must be in their initial state, otherwise XRSTOR
> +	 * causes an exception.
> +	 */
> +	if (WARN_ON_ONCE(ustate->xsave.header.xfeatures & kstate->xfd))
> +		return -EINVAL;
> +
>  	/*
>  	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
>  	 * in the header.  KVM's odd ABI is to leave PKRU untouched in this
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ff8812f3a129..c0416f53b5f5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5807,9 +5807,18 @@ static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
>  static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
>  					struct kvm_xsave *guest_xsave)
>  {
> +	union fpregs_state *xstate = (union fpregs_state *)guest_xsave->region;
> +
>  	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
>  		return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
>  
> +	/*
> +	 * Do not reject non-initialized disabled features for backwards
> +	 * compatibility, but clear XSTATE_BV[i] whenever XFD[i]=1.
> +	 * Otherwise, XRSTOR would cause a #NM.
> +	 */
> +	xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd;
> +
>  	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
>  					      guest_xsave->region,
>  					      kvm_caps.supported_xcr0,



* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-08  3:06   ` Binbin Wu
@ 2026-01-08 16:26     ` Paolo Bonzini
  0 siblings, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-08 16:26 UTC (permalink / raw)
  To: Binbin Wu; +Cc: linux-kernel, kvm, seanjc, x86, stable

On Thu, Jan 8, 2026 at 4:08 AM Binbin Wu <binbin.wu@linux.intel.com> wrote:
> > +     /*
> > +      * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
> > +      * the save state to initialized.  Likewise, KVM_GET_XSAVE does the
>
> Nit:
> To me "initialized" has the implication that it's active.
> I prefer the description "initial state" or "initial configuration" used in
> SDM here.
> I am not a native English speaker though, please ignore it if it's just my
> feeling.

Sure, why not:

   KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert the
   save state to its initial configuration.  Likewise, KVM_GET_XSAVE does
   the same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.

> > +     /*
> > +      * Do not reject non-initialized disabled features for backwards
> > +      * compatibility, but clear XSTATE_BV[i] whenever XFD[i]=1.
> > +      * Otherwise, XRSTOR would cause a #NM.
> > +      */

Same here:

   For backwards compatibility, do not expect disabled features to be in
   their initial state.  XSTATE_BV[i] must still be cleared whenever
   XFD[i]=1, or XRSTOR would cause a #NM.

Thanks!

Paolo



* Re: [PATCH 2/4] selftests: kvm: replace numbered sync points with actions
  2026-01-07 22:28     ` Paolo Bonzini
@ 2026-01-08 20:26       ` Sean Christopherson
  0 siblings, 0 replies; 37+ messages in thread
From: Sean Christopherson @ 2026-01-08 20:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, x86, stable

On Wed, Jan 07, 2026, Paolo Bonzini wrote:
> On Tue, Jan 6, 2026 at 1:02 AM Sean Christopherson <seanjc@google.com> wrote:
> > > @@ -244,6 +254,7 @@ int main(int argc, char *argv[])
> > >       memset(addr_gva2hva(vm, xstate), 0, PAGE_SIZE * DIV_ROUND_UP(XSAVE_SIZE, PAGE_SIZE));
> > >       vcpu_args_set(vcpu, 3, amx_cfg, tiledata, xstate);
> > >
> > > +     int iter = 0;
> >
> > If we want to retain "tracing" of guest syncs, I vote to provide the information
> > from the guest, otherwise I'll end up counting GUEST_SYNC() calls on my fingers
> > (and run out of fingers) :-D.
> 
> I had a similar idea, but I was too lazy to implement it because for a
> very linear test such as this one, "12n" in vi does wonders...
> 
> > E.g. if we wrap all GUEST_SYNC() calls in a macro, we can print the line number
> > without having to hardcode sync point numbers.
> 
> ... but there are actually better reasons than laziness and linearity
> to keep the simple "iter++".
> 
> First, while using line numbers has the advantage of zero maintenance,
> the disadvantage is that they change all the time as you're debugging.
> So you are left slightly puzzled if the number changed because the
> test passed or because of the extra debugging code you added.

True.  I'm good with the current patch.

> Second, the iteration number is probably more useful to identify the
> places at which the VM was reentered (which are where the iteration
> number changes), than to identify the specific GUEST_SYNC that failed;
> from that perspective there's not much difference between line
> numbers, manually-numbered sync points, or incrementing a counter in
> main().


* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
                   ` (4 preceding siblings ...)
  2026-01-06  1:18 ` [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Sean Christopherson
@ 2026-01-15 12:22 ` Borislav Petkov
  2026-01-15 13:49   ` Paolo Bonzini
  2026-01-16 12:22 ` Borislav Petkov
  6 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2026-01-15 12:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, x86

On Thu, Jan 01, 2026 at 10:05:12AM +0100, Paolo Bonzini wrote:
> Fix a possible host panic, due to an unexpected #NM, when a KVM guest
> is using AMX features.
> 
> The guest's XFD value, which is stored in fpstate->xfd, is used for both
> guest execution and host XSAVE operations. 

This already sounds weird. Why?

Why don't we carry separate XFD copies - guest and host - which we use for the
guest and the host, respectively?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-15 12:22 ` Borislav Petkov
@ 2026-01-15 13:49   ` Paolo Bonzini
  2026-01-15 16:39     ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-15 13:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Mailing List, Linux, kvm, Sean Christopherson,
	the arch/x86 maintainers

Il gio 15 gen 2026, 13:22 Borislav Petkov <bp@alien8.de> ha scritto:
>
> On Thu, Jan 01, 2026 at 10:05:12AM +0100, Paolo Bonzini wrote:
> > Fix a possible host panic, due to an unexpected #NM, when a KVM guest
> > is using AMX features.
> >
> > The guest's XFD value, which is stored in fpstate->xfd, is used for both
> > guest execution and host XSAVE operations.
>
> This already sounds weird. Why?

Because the state of disabled components is undefined anyway. There's
no point in making all host XSAVEs more expensive, even when the TMM
registers aren't in use by the guest (which is going to be most of the
time, likely).

> Why don't we carry separate XFD copies - guest and host - which we use for the
> guest and the host, respectively?

That was exactly what I did in v1, but it's more code and less efficient too.

Paolo

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>



* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
                     ` (3 preceding siblings ...)
  2026-01-08  3:06   ` Binbin Wu
@ 2026-01-15 15:54   ` Dave Hansen
  2026-01-15 16:22     ` Paolo Bonzini
  4 siblings, 1 reply; 37+ messages in thread
From: Dave Hansen @ 2026-01-15 15:54 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: seanjc, x86, stable

On 1/1/26 01:05, Paolo Bonzini wrote:
> When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
> response to a guest WRMSR, clear XFD-disabled features in the saved (or to
> be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
> features that are disabled via the guest's XFD.  Because the kernel
> executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
> will cause XRSTOR to #NM and panic the kernel.

It would be really nice to see the actual ordering of events here. What
order do the KVM_SET_XSAVE, XFD[$FOO]=1 and kernel_fpu_begin() have to
happen in to trigger this?


* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-06 17:56       ` Jim Mattson
@ 2026-01-15 16:07         ` Dave Hansen
  2026-01-15 16:12           ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Dave Hansen @ 2026-01-15 16:07 UTC (permalink / raw)
  To: Jim Mattson, Sean Christopherson
  Cc: Paolo Bonzini, linux-kernel, kvm, x86, stable

On 1/6/26 09:56, Jim Mattson wrote:
> Apologies. You're right. Though Intel is a bit coy, the only way to
> interpret that section of the SDM is to conclude that the AMX state in
> the CPU becomes undefined when XFD[18] is set.

I'll touch base with the folks that wrote that blurb. I'm a little
nervous to interpret that "software should not..." blurb as a full
architectural DANGER sign partly because it's in a "RECOMMENDATIONS FOR
SYSTEM SOFTWARE" section.

I'm _sure_ they discussed tying XFD[i] and XINUSE[i] together and there
was a good reason they did not.


* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 16:07         ` Dave Hansen
@ 2026-01-15 16:12           ` Paolo Bonzini
  2026-01-15 16:27             ` Dave Hansen
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-15 16:12 UTC (permalink / raw)
  To: Dave Hansen, Jim Mattson, Sean Christopherson
  Cc: linux-kernel, kvm, x86, stable

On 1/15/26 17:07, Dave Hansen wrote:
> On 1/6/26 09:56, Jim Mattson wrote:
>> Apologies. You're right. Though Intel is a bit coy, the only way to
>> interpret that section of the SDM is to conclude that the AMX state in
>> the CPU becomes undefined when XFD[18] is set.
> 
> I'll touch base with the folks that wrote that blurb. I'm a little
> nervous to interpret that "software should not..." blurb as a full
> architectural DANGER sign partly because it's in a "RECOMMENDATIONS FOR
> SYSTEM SOFTWARE" section.
> 
> I'm _sure_ they discussed tying XFD[i] and XINUSE[i] together and there
> was a good reason they did not.

Is there anything that prevents an SMM handler (or more likely, an SMI 
transfer monitor) from doing an XSAVE/XRSTOR and destroying tile data?

Paolo



* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 15:54   ` Dave Hansen
@ 2026-01-15 16:22     ` Paolo Bonzini
  2026-01-15 18:19       ` Dave Hansen
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-15 16:22 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, kvm; +Cc: seanjc, x86, stable

On 1/15/26 16:54, Dave Hansen wrote:
> On 1/1/26 01:05, Paolo Bonzini wrote:
>> When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
>> response to a guest WRMSR, clear XFD-disabled features in the saved (or to
>> be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
>> features that are disabled via the guest's XFD.  Because the kernel
>> executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
>> will cause XRSTOR to #NM and panic the kernel.
> 
> It would be really nice to see the actual ordering of events here. What
> order do the KVM_SET_XSAVE, XFD[$FOO]=1 and kernel_fpu_begin() have to
> happen in to trigger this?

The problematic case is described a couple paragraphs below: "This can 
happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1, and 
a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's 
call to fpu_update_guest_xfd()."

Or more in detail:

   Guest running with MSR_IA32_XFD = 0
     WRMSR(MSR_IA32_XFD)
     vmexit
   Host:
     enable IRQ
     interrupt handler
       kernel_fpu_begin() -> sets TIF_NEED_FPU_LOAD
         XSAVE -> stores XINUSE[18] = 1
         ...
       kernel_fpu_end()
     handle vmexit
       fpu_update_guest_xfd() -> XFD[18] = 1
     reenter guest
       fpu_swap_kvm_fpstate()
         XRSTOR -> XINUSE[18] = 1 && XFD[18] = 1 -> #NM and boom

With the patch, fpu_update_guest_xfd() sees TIF_NEED_FPU_LOAD set and 
clears the bit from xinuse.
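[Editorial sketch, not kernel code: the sequence above can be modeled as a
toy Python program. The xsave()/xrstor() helpers and the set-based
XFD/XSTATE_BV representation are illustrative assumptions, not real APIs.]

```python
# Toy model of XSAVE/XRSTOR vs. XFD.  Sets of component indices stand in
# for the hardware bitmaps; all names here are illustrative.

XTILEDATA = 18  # AMX tile data component

def xsave(hw_state, xfd):
    """XSAVE: components with XFD[i]=1 are recorded as if XINUSE[i]=0."""
    bv = {i for i in hw_state if i not in xfd}
    return {"xstate_bv": bv, "data": {i: hw_state[i] for i in bv}}

def xrstor(buf, xfd):
    """XRSTOR: #NM if XSTATE_BV[i]=1 while XFD[i]=1."""
    if buf["xstate_bv"] & xfd:
        raise RuntimeError("#NM")
    return {i: buf["data"][i] for i in buf["xstate_bv"]}

# Buggy ordering: the IRQ's XSAVE runs with the old XFD[18]=0, then the
# vmexit handler flips fpstate->xfd without touching the saved XSTATE_BV.
hw = {XTILEDATA: "tile bytes"}
buf = xsave(hw, xfd=set())        # kernel_fpu_begin(): XINUSE[18]=1 saved
guest_xfd = {XTILEDATA}           # fpu_update_guest_xfd(): XFD[18]=1
try:
    xrstor(buf, guest_xfd)        # reenter guest -> #NM and boom
except RuntimeError as err:
    nm = str(err)

# The fix: clear XSTATE_BV[i] whenever XFD[i]=1 before any XRSTOR, so
# the disabled component is simply left in its initial configuration.
buf["xstate_bv"] -= guest_xfd
restored = xrstor(buf, guest_xfd)
```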

Paolo



* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 16:12           ` Paolo Bonzini
@ 2026-01-15 16:27             ` Dave Hansen
  0 siblings, 0 replies; 37+ messages in thread
From: Dave Hansen @ 2026-01-15 16:27 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson, Sean Christopherson
  Cc: linux-kernel, kvm, x86, stable

On 1/15/26 08:12, Paolo Bonzini wrote:
...
>> I'm _sure_ they discussed tying XFD[i] and XINUSE[i] together and there
>> was a good reason they did not.
> 
> Is there anything that prevents an SMM handler (or more likely, an SMI
> transfer monitor) to do an XSAVE/XRSTOR and destroy tile data?

I think you're saying: let's assume XFD[18]=1 and XINUSE[18]=1 and
there's an SMI. The SMI handler does:

	XSAVE(RFBM=-1, &buf)
	... run some gunk
	XRSTOR(RFBM=-1, &buf)

to try and save everything. But, that XSAVE is subject to this behavior
from the SDM:

	If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state
	component i, the instruction does not generate #NM when XCR0[i]
	= IA32_XFD[i] = 1; instead, it operates as if XINUSE[i] = 0 (and
	the state component was in its initial state)

So 'buf' will end up having XFEATURES[18]=0. The XRSTOR will see
XFEATURES[18]=0 and set feature 18 to its init state, effectively
zapping its contents.

I guess the only thing preventing that in practice is the lack of XSAVE
use in SMM handlers. But I see your point.
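[Editorial sketch, not firmware code: the silent-zap sequence can be modeled
with a toy Python program. The helper names and data layout are illustrative
assumptions; only the SDM behavior quoted above is taken as given.]

```python
# Toy model: with XCR0[i] = IA32_XFD[i] = 1, XSAVE records component i as
# init, so a naive save/restore pair destroys live data.

XTILEDATA = 18
INIT = "init"

def xsave(hw_state, xfd):
    # Per the SDM text quoted above: XSAVE operates as if XINUSE[i] = 0
    # and saves XSTATE_BV[i] as 0 for XFD-disabled components.
    return {i: (INIT if i in xfd else v) for i, v in hw_state.items()}

def xrstor(buf, xfd):
    # Restoring a non-init XFD-disabled component would #NM; a component
    # recorded as init is simply re-initialized.
    if any(i in xfd and v != INIT for i, v in buf.items()):
        raise RuntimeError("#NM")
    return dict(buf)

hw = {XTILEDATA: "live tile data"}   # XINUSE[18]=1
xfd = {XTILEDATA}                    # XFD[18]=1
buf = xsave(hw, xfd)                 # buf records feature 18 as init
hw = xrstor(buf, xfd)                # no #NM, but...
zapped = hw[XTILEDATA] == INIT       # ...the tile contents are gone
```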


* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-15 13:49   ` Paolo Bonzini
@ 2026-01-15 16:39     ` Sean Christopherson
  2026-01-15 17:05       ` Borislav Petkov
  0 siblings, 1 reply; 37+ messages in thread
From: Sean Christopherson @ 2026-01-15 16:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Borislav Petkov, Kernel Mailing List, Linux, kvm,
	the arch/x86 maintainers

On Thu, Jan 15, 2026, Paolo Bonzini wrote:
> Il gio 15 gen 2026, 13:22 Borislav Petkov <bp@alien8.de> ha scritto:
> >
> > On Thu, Jan 01, 2026 at 10:05:12AM +0100, Paolo Bonzini wrote:
> > > Fix a possible host panic, due to an unexpected #NM, when a KVM guest
> > > is using AMX features.
> > >
> > > The guest's XFD value, which is stored in fpstate->xfd, is used for both
> > > guest execution and host XSAVE operations.
> >
> > This already sounds weird. Why?
> 
> Because the state of disabled components is undefined anyway. There's
> no point in making all host XSAVEs more expensive, even when the TMM
> registers aren't in use by the guest (which is going to be most of the
> time, likely).
> 
> > Why don't we carry separate XFD copies - guest and host - which we use for the
> > guest and the host, respectively?
> 
> That was exactly what I did in v1, but it's more code and less efficient too.

And creates a weird ABI for KVM:

 : This also creates a nasty, subtle asymmetry in KVM's ABI.  Notably, the comment
 : above is wrong.  XSAVE does NOT run with fpstate->xfd, it runs with whatever
 : happens to be in hardware.  For non-guest tasks, fpstate->xfd is guaranteed to
 : be resident in hardware when save_fpregs_to_fpstate() runs, but for guest tasks,
 : it will usually be the _guest's_ value.  So in the common case, KVM_GET_XSAVE2
 : would not return the same data set by KVM_SET_XSAVE.
 : 
 : In theory we could ensure KVM saved exactly what is resident in hardware, but
 : that's quite tricky (and costly!) as it would require doing xfd_update_state()
 : before _every_ save_fpregs_to_fpstate(), e.g. not just in fpu_swap_kvm_fpstate().
 : E.g. if the host kernel used the FPU from IRQ context (spoiler alert!), then KVM
 : wouldn't have a chance to swap in the maximal XFD[18]=0 value (i.e. the userspace
 : task's XFD).

And IMO papered over the true bug, which is that the xstate snapshot can become
inconsistent relative to KVM's tracking of guest XFD:

 : Lastly, the fix is effectively papering over another bug, which I'm pretty sure
 : is the underlying issue that was originally encountered.  Assuming QEMU doesn't
 : intercept MSR_IA32_XFD for its own purposes, the only sequence I've come up with
 : that would result in KVM trying to load XTILE data with XFD[18]=1, without a
 : colluding userspace VMM (Paolo's selftest) is:
 : 
 :   1. vCPU loads non-init XTILE data without ever setting XFD to a non-zero value
 :      (KVM only disables XFD interception on writes with a non-zero value).
 :   2. Guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1
 :   3. VM-Exit due to the WRMSR
 :   4. Host IRQ arrives and triggers kernel_fpu_begin()
 :   5. save_fpregs_to_fpstate() saves guest FPU with XFD[18]=0
 :   6. fpu_update_guest_xfd() stuffs guest_fpu->fpstate->xfd = XFD[18]=1
 :   7. vcpu_enter_guest() attempts to load XTILE data with XFD[18]=1
 : 
 : Note!  There's no KVM_SET_XSAVE2 in the above, i.e. this doesn't require userspace
 : to trigger save/restore for live migration or whatever, the only timing condition
 : is the arrival of an IRQ that uses kernel FPU during the XFD 0=>1 VM-Exit.

https://lore.kernel.org/all/aVMEcaZD_SzKzRvr@google.com


* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-15 16:39     ` Sean Christopherson
@ 2026-01-15 17:05       ` Borislav Petkov
  2026-01-15 17:12         ` Sean Christopherson
  0 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2026-01-15 17:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Kernel Mailing List, Linux, kvm,
	the arch/x86 maintainers

On Thu, Jan 15, 2026 at 08:39:51AM -0800, Sean Christopherson wrote:
>  :   1. vCPU loads non-init XTILE data without ever setting XFD to a non-zero value
>  :      (KVM only disables XFD interception on writes with a non-zero value).
>  :   2. Guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1
>  :   3. VM-Exit due to the WRMSR
>  :   4. Host IRQ arrives and triggers kernel_fpu_begin()
>  :   5. save_fpregs_to_fpstate() saves guest FPU with XFD[18]=0
>  :   6. fpu_update_guest_xfd() stuffs guest_fpu->fpstate->xfd = XFD[18]=1
>  :   7. vcpu_enter_guest() attempts to load XTILE data with XFD[18]=1

I don't know, maybe I'm missing an important aspect but if not, I'm wondering
how you folks are not seeing the big honking discrepancy here.

*Anything* poking in MSRs under the kernel's feet where the kernel doesn't
know about that poking, is bound to cause trouble. And this is no exception.

Step 5. above should use the updated XFD[18]=1. The guest just disabled that
state! Anything else is bonkers.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-15 17:05       ` Borislav Petkov
@ 2026-01-15 17:12         ` Sean Christopherson
  0 siblings, 0 replies; 37+ messages in thread
From: Sean Christopherson @ 2026-01-15 17:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paolo Bonzini, Kernel Mailing List, Linux, kvm,
	the arch/x86 maintainers

On Thu, Jan 15, 2026, Borislav Petkov wrote:
> On Thu, Jan 15, 2026 at 08:39:51AM -0800, Sean Christopherson wrote:
> >  :   1. vCPU loads non-init XTILE data without ever setting XFD to a non-zero value
> >  :      (KVM only disables XFD interception on writes with a non-zero value).
> >  :   2. Guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1
> >  :   3. VM-Exit due to the WRMSR
> >  :   4. Host IRQ arrives and triggers kernel_fpu_begin()
> >  :   5. save_fpregs_to_fpstate() saves guest FPU with XFD[18]=0
> >  :   6. fpu_update_guest_xfd() stuffs guest_fpu->fpstate->xfd = XFD[18]=1
> >  :   7. vcpu_enter_guest() attempts to load XTILE data with XFD[18]=1
> 
> I don't know, maybe I'm missing an important aspect but if not, I'm wondering
> how you folks are not seeing the big honking discrepancy here.
> 
> *Anything* poking in MSRs under the kernel's feet where the kernel doesn't
> know about that poking, is bound to cause trouble. And this is no exception.

KVM isn't poking the MSR, KVM is literally calling a kernel API, fpu_update_guest_xfd(),
to ask/tell the kernel to update the guest's XFD.  It's the FPU code that's buggy,
because it doesn't ensure the state _it_ saved _without KVM's knowledge_ is
consistent with the new XFD.

> Step 5. above should use the updated XFD[18]=1. The guest just disabled that
> state! Anything else is bonkers.

As I explained in my previous reply, that's easier said than done:

  In theory we could ensure KVM saved exactly what is resident in hardware, but
  that's quite tricky (and costly!) as it would require doing xfd_update_state()
  before _every_ save_fpregs_to_fpstate(), e.g. not just in fpu_swap_kvm_fpstate().
  E.g. if the host kernel used the FPU from IRQ context (spoiler alert!), then KVM
  wouldn't have a chance to swap in the maximal XFD[18]=0 value (i.e. the userspace
  task's XFD).

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 16:22     ` Paolo Bonzini
@ 2026-01-15 18:19       ` Dave Hansen
  2026-01-15 18:26         ` Paolo Bonzini
  2026-01-15 23:43         ` Chang S. Bae
  0 siblings, 2 replies; 37+ messages in thread
From: Dave Hansen @ 2026-01-15 18:19 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: seanjc, x86, Thomas Gleixner, Borislav Petkov, Bae, Chang Seok

On 1/15/26 08:22, Paolo Bonzini wrote:
> 
>   Guest running with MSR_IA32_XFD = 0
>     WRMSR(MSR_IA32_XFD)
>     vmexit
>   Host:
>     enable IRQ
>     interrupt handler
>       kernel_fpu_begin() -> sets TIF_NEED_FPU_LOAD
>         XSAVE -> stores XINUSE[18] = 1
>         ...
>       kernel_fpu_end()
>     handle vmexit
>       fpu_update_guest_xfd() -> XFD[18] = 1
>     reenter guest
>       fpu_swap_kvm_fpstate()
>         XRSTOR -> XINUSE[18] = 1 && XFD[18] = 1 -> #NM and boom
> 
> With the patch, fpu_update_guest_xfd() sees TIF_NEED_FPU_LOAD set and
> clears the bit from xinuse.

Paolo, thanks for clarifying that!

Abbreviated, that's just:

	XFD[18]=0
	...
	# Interrupt (that does XSAVE)
	XFD[18]=1
	XRSTOR => #NM

Is there anything preventing the kernel_fpu_begin() interrupt from
happening a little later, say:

	XFD[18]=0
	...
	XFD[18]=1
	# Interrupt (that does XSAVE)
	XRSTOR (no #NM)
	
In that case, the XSAVE in kernel_fpu_begin() "operates as if XINUSE[i]
= 0" and would set XFEATURES[18]=0; it would save the component as being
in its init state. The later XRSTOR would obviously restore state 18 to
its init state.

Without involving SMIs, I think it lands feature 18 in its init state as
well. The state is _already_ being destroyed in the existing code
without anything exotic needing to happen.

That's a long-winded way of saying I think I agree with the patch. It
destroys the state a bit more aggressively but it doesn't do anything _new_.

What would folks think about making the SDM language stronger, or at
least explicitly adding the language that setting XFD[i]=1 can lead to
XINUSE[i] going from 1=>0. Kinda like the language that's already in
"XRSTOR and the Init and Modified Optimizations", but specific to XFD:

	If XFD[i] = 1 and XINUSE[i] = 1, state component i may be
	tracked as init; XINUSE[i] may be set to 0.

That would make it consistent with the KVM behavior. It might also give
the CPU folks some additional wiggle room for new behavior.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 18:19       ` Dave Hansen
@ 2026-01-15 18:26         ` Paolo Bonzini
  2026-01-15 23:43         ` Chang S. Bae
  1 sibling, 0 replies; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-15 18:26 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, kvm
  Cc: seanjc, x86, Thomas Gleixner, Borislav Petkov, Bae, Chang Seok

On 1/15/26 19:19, Dave Hansen wrote:
> Is there anything preventing the kernel_fpu_begin() interrupt from
> happening a little later, say:
> 
> 	XFD[18]=0
> 	...
> 	XFD[18]=1
> 	# Interrupt (that does XSAVE)
> 	XRSTOR (no #NM)
> 	
> In that case, the XSAVE in kernel_fpu_begin() "operates as if XINUSE[i]
> = 0" and would set XFEATURES[18]=0; it would save the component as being
> in its init state. The later XRSTOR would obviously restore state 18 to
> its init state.

Yes, absolutely, and the fact that the race window is so small is why 
this issue stayed undetected for years.  In fact, consider that XFD 
becomes a pass-through MSR after the first write, at which point there's 
no race window at all---XFD[18] will be 1 if that's the guest value and 
the state will be destroyed.

I only mentioned SMIs as a way for this to happen on bare metal, i.e. 
without KVM involvement at all (though for dual-monitor treatment 
virtualization _is_ involved).

> That's a long-winded way of saying I think I agree with the patch. It
> destroys the state a bit more aggressively but it doesn't do anything _new_.

Thanks. :)

> What would folks think about making the SDM language stronger, or at
> least explicitly adding the language that setting XFD[i]=1 can lead to
> XINUSE[i] going from 1=>0. Kinda like the language that's already in
> "XRSTOR and the Init and Modified Optimizations", but specific to XFD:
> 
> 	If XFD[i] = 1 and XINUSE[i] = 1, state component i may be
> 	tracked as init; XINUSE[i] may be set to 0.
> 
> That would make it consistent with the KVM behavior. It might also give
> the CPU folks some additional wiggle room for new behavior.

Yes, absolutely.  I think any other hypervisor may want to do the same, 
to avoid save/restores of tile data when guest XFD[18]=1 (and to 
avoid unnecessary clearing of XFD, just for the sake of storing tile 
data that is most likely unused).

Paolo


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1
  2026-01-15 18:19       ` Dave Hansen
  2026-01-15 18:26         ` Paolo Bonzini
@ 2026-01-15 23:43         ` Chang S. Bae
  1 sibling, 0 replies; 37+ messages in thread
From: Chang S. Bae @ 2026-01-15 23:43 UTC (permalink / raw)
  To: Dave Hansen, Paolo Bonzini, linux-kernel, kvm
  Cc: seanjc, x86, Thomas Gleixner, Borislav Petkov

On 1/15/2026 10:19 AM, Dave Hansen wrote:
> 
> What would folks think about making the SDM language stronger, or at
> least explicitly adding the language that setting XFD[i]=1 can lead to
> XINUSE[i] going from 1=>0. Kinda like the language that's already in
> "XRSTOR and the Init and Modified Optimizations", but specific to XFD:
> 
> 	If XFD[i] = 1 and XINUSE[i] = 1, state component i may be
> 	tracked as init; XINUSE[i] may be set to 0.
> 
> That would make it consistent with the KVM behavior. It might also give
> the CPU folks some additional wiggle room for new behavior.

Yeah, I saw that you quoted this sentence in the XFD section in your 
other response:

	If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state
	component i, the instruction does not generate #NM when XCR0[i]
	= IA32_XFD[i] = 1; instead, it operates as if XINUSE[i] = 0 (and
	the state component was in its initial state)

Indeed, I do applaud the idea to clarify this behavior more explicitly 
right there.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
                   ` (5 preceding siblings ...)
  2026-01-15 12:22 ` Borislav Petkov
@ 2026-01-16 12:22 ` Borislav Petkov
  2026-01-21 11:35   ` Paolo Bonzini
  6 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2026-01-16 12:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, x86

On Thu, Jan 01, 2026 at 10:05:12AM +0100, Paolo Bonzini wrote:
> Tested on a Sapphire Rapids machine, reviews and acks are welcome so
> that I can submit it to Linus via the KVM tree.

So I wanted to give this a thorough review after yesterday's discussion and
tried to apply the patch but it wouldn't apply. So I took a look at the code
it touches just to find out that the patch is already in Linus' tree!

Why?

Can you folks please explain to me how this is the process we've all agreed
upon?

Where does it say that people should sneak patches behind the maintainers'
backs without even getting an Ack from them?

By that logic, we can just as well sneak KVM patches behind your back and
you're supposed to be fine with it. Right?

Or should we try to adhere to the development rules we all have agreed upon
and work together in a fair and correct way?

I'd probably vote for the latter, after we all sit down and agree upon something.

What I don't want is sneaking patches behind our backs and I'm sure you won't
like this either so let's please stop this.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-16 12:22 ` Borislav Petkov
@ 2026-01-21 11:35   ` Paolo Bonzini
  2026-01-22 11:12     ` Borislav Petkov
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-21 11:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Mailing List, Linux, kvm, Sean Christopherson,
	the arch/x86 maintainers

Il ven 16 gen 2026, 13:23 Borislav Petkov <bp@alien8.de> ha scritto:
>
> On Thu, Jan 01, 2026 at 10:05:12AM +0100, Paolo Bonzini wrote:
> > Tested on a Sapphire Rapids machine, reviews and acks are welcome so
> > that I can submit it to Linus via the KVM tree.
>
> So I wanted to give this a thorough review after yesterday's discussion and
> tried to apply the patch but it wouldn't apply. So I took a look at the code
> it touches just to find out that the patch is already in Linus' tree!
>
> Why?
>
> Can you folks please explain to me how this is the process we've all agreed
> upon?

It's a fix for a host crash that literally adds a single AND to a
function that's called fpu_update_*guest*_xfd. The patch doesn't have
any effect unless KVM is in use, nor on any task other than the one
currently in KVM_RUN (other than by not crashing the system). So,
because of the effect of the bug and the small size/impact of the
patch, and the fact that there are really just two approaches and both
had been discussed extensively on list, I accepted the small
possibility that the patches would be rejected and would have to be
reverted.

If I really wanted to sneak something in, I could have written this
patch entirely in arch/x86/kvm. It would be possible, though the code
would be worse and inefficient. Sean wouldn't have let me :) but
anyway that didn't even cross my mind of course, because sneaking
something past you guys wasn't something I had in mind either. In fact
I instead plan to make that impossible, by making fpregs_lock() not
public and reducing the API exposed to KVM. I certainly will not send
that change to Linus without acks, even though it would also affect
only KVM in practice.

> By that logic, we can just as well sneak KVM patches behind your back and
> you're supposed to be fine with it. Right?

I would be ok with a Cc and sending the patch to Linus after a couple
weeks, yes, for a patch of similarly small and well-defined impact.
For example I didn't have a problem when commit b1e1296d7c6a ("kvm:
explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()",
2023-08-21) was sent without my ack.

Paolo


>
> Or should we try to adhere to the development rules we all have agreed upon
> and work together in a fair and correct way?
>
> I'd probably vote for latter, after we all sit down and agree upon something.
>
> What I don't want is sneaking patches behind our backs and I'm sure you won't
> like this either so let's please stop this.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-21 11:35   ` Paolo Bonzini
@ 2026-01-22 11:12     ` Borislav Petkov
  2026-01-22 12:00       ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2026-01-22 11:12 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kernel Mailing List, Linux, kvm, Sean Christopherson,
	the arch/x86 maintainers

On Wed, Jan 21, 2026 at 12:35:50PM +0100, Paolo Bonzini wrote:
> It's a fix for a host crash that literally adds a single AND to a
> function that's called fpu_update_*guest*_xfd. The patch doesn't have
> any effect unless KVM is in use,

No Paolo, *exactly* *because* arch/x86/ and KVM are so closely intertwined in
some areas, we should sync on changes there. And judging by our questions
on this thread, one of the aspects was whether the handling of the guest
state is adequate. And if it is not then we have to rethink it and
accommodate it.

What we definitely should NOT do is solo efforts without even an ACK.

We've had this before with the X86_FEATURE gunk and we're back at it with the
FPU.

> nor on any task other than the one currently in KVM_RUN (other than by not
> crashing the system). So, because of the effect of the bug and the small
> size/impact of the patch, and the fact that there are really just two
> approaches and both had been discussed extensively on list,

Not by us.

> I accepted the small possibility that the patches would be rejected and
> would have to be reverted.

And all that smoke and effort just because you can't simply wait for us to
take a look. And what happened? We agreed and it is all good.

So what was all that rush all about?

> If I really wanted to sneak something in, I could have written this
> patch entirely in arch/x86/kvm. It would be possible, though the code
> would be worse and inefficient. Sean wouldn't have let me :) but

In my experience, syncing with Sean on who takes what, and giving each other
immutable branches to use, works wonderfully. Why can't we simply stick to
that workflow?

> anyway that didn't even cross my mind of course, because sneaking
> something past you guys wasn't something I had in mind either. In fact
> I instead plan to make that impossible, by making fpregs_lock() not
> public and reducing the API exposed to KVM. I certainly will not send
> that change to Linus without acks, even though it would also affect
> only KVM in practice.

So how about we do only that from now on?

> I would be ok with a Cc and sending the patch to Linus after a couple
> weeks, yes, for a patch of similarly small and well-defined impact.
> For example I didn't have a problem when commit b1e1296d7c6a ("kvm:
> explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()",
> 2023-08-21) was sent without my ack.

It happens. We talked with akpm recently and we'll separate the
responsibilities much better and by the looks of it, it is already much better
this way. I'd suggest you try the same.

What is really annoying and counter-productive are the unsynchronized solo
efforts so let's not do those please.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-22 11:12     ` Borislav Petkov
@ 2026-01-22 12:00       ` Paolo Bonzini
  2026-01-23 13:23         ` Borislav Petkov
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2026-01-22 12:00 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kernel Mailing List, Linux, kvm, Sean Christopherson,
	the arch/x86 maintainers

On Thu, Jan 22, 2026 at 12:13 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jan 21, 2026 at 12:35:50PM +0100, Paolo Bonzini wrote:
> > It's a fix for a host crash that literally adds a single AND to a
> > function that's called fpu_update_*guest*_xfd. The patch doesn't have
> > any effect unless KVM is in use,
>
> No Paolo, *exactly* *because* arch/x86/ and KVM are so closely intertwined in
> some areas, we should sync on changes there. And judging by our questions
> on this thread, one of the aspects was whether the handling of the guest
> state is adequate. And if it is not then we have to rethink it and
> accommodate it.
>
> What we definitely should NOT do is solo efforts without even an ACK.

I agree - as I wrote below, I judged that this was _not_ solo
considering that (while not including any x86 maintainers) there were
multiple people intervening and building on each other's analysis.
Yes, there was no x86 maintainer, I obviously knew that, but my
judgment call was that all these people together had looked at the
code more than it deserved. In the previous mail I said the
probability of a disagreement was small; in fact it was practically
nonexistent.

I don't think you can say that this is routine, for example in commit
eb4441864e03 ("KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA
time", 2024-04-11) I explicitly sought an ack for just an
EXPORT_SYMBOL change. Knowing that x86 maintainers want to tightly
control the API boundary of arch/x86/kernel/fpu, I considered that to
require the attention of you guys *even more* than a code change!

> We've had this before with the X86_FEATURE gunk and we're back at it with the
> FPU.

I agree that causing conflicts on X86_FEATURE (years ago?) was a
mistake; that said, I don't think it's a great example. I still see
occasional changes to cpufeatures.h go in via Sean without ack---and
in fact I check them explicitly when I get his pull requests and look
at what tip is doing with cpufeatures.h in the same merge window. :)

> > If I really wanted to sneak something in, I could have written this
> > patch entirely in arch/x86/kvm. It would be possible, though the code
> > would be worse and inefficient. Sean wouldn't have let me :) but
>
> In my experience, syncing stuff with Sean who takes what and giving each other
> immutable branches to use, works wonderfully. Why can't we simply stick to
> that workflow?

I think it's a perfectly fine workflow across releases, i.e. to prepare
for the merge window; points of contact for -rc patches are rare and
using branches to sync is unlikely to be necessary.

I appreciate a lot the support that Thomas and other arch/x86/ people
put in to help Linux run well and without hacks as a hypervisor. At
the same time I think it's fine for both sides to acknowledge that in
extremely rare cases the lines can be blurred. So rare that I cannot
think of another case in the past and it's no problem for me to say
"never again", but then it would be like saying that the Earth is
spherical...

Paolo


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX
  2026-01-22 12:00       ` Paolo Bonzini
@ 2026-01-23 13:23         ` Borislav Petkov
  0 siblings, 0 replies; 37+ messages in thread
From: Borislav Petkov @ 2026-01-23 13:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kernel Mailing List, Linux, kvm, Sean Christopherson,
	the arch/x86 maintainers

On Thu, Jan 22, 2026 at 01:00:27PM +0100, Paolo Bonzini wrote:
> I agree - as I wrote below, I judged that this was _not_ solo
> considering that (while not including any x86 maintainers) there were
> multiple people intervening and building on each other's analysis.
> Yes, there was no x86 maintainer, I obviously knew that, but my
> judgment call was that all these people together had looked at the
> code more than it deserved. In the previous mail I said the
> probability of a disagreement was small, it was even practically
> nonexistent.

This all doesn't matter. Because when it comes down to cleaning up the mess
people have left behind, it is always we who end up mopping after everyone.
And everyone else skedaddles off to their next feature enablement.

And I do appreciate more than anyone when people make an effort to review
patches. You still need a maintainer ack though.

And it is not hard - you just need to ping us, send us a private mail, or even
call us if you want. :-)

> I don't think you can say that this is routine, for example in commit
> eb4441864e03 ("KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA
> time", 2024-04-11) I explicitly sought an ack for just an
> EXPORT_SYMBOL change. Knowing that x86 maintainers want to tightly
> control the API boundary of arch/x86/kernel/fpu, I considered that to
> require the attention of you guys *even more* than a code change!

Much appreciated, this is how it should always work. So let's make that the
default workflow please.

> I appreciate a lot the support that Thomas and other arch/x86/ people
> put in to help Linux run well and without hacks as a hypervisor. At
> the same time I think it's fine for both sides to acknowledge that in
> extremely rare cases the lines can be blurred.

If we don't reply for a week or so, sure. But if you really need an x86
maintainer ack, I'm sure you'll get one in time.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2026-01-23 13:23 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-01  9:05 [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Paolo Bonzini
2026-01-01  9:05 ` [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1 Paolo Bonzini
2026-01-03  2:06   ` Yao Yuan
2026-01-05 17:31     ` Sean Christopherson
2026-01-06  5:25       ` Yao Yuan
2026-01-06  0:54   ` Jim Mattson
2026-01-06  1:17     ` Sean Christopherson
2026-01-06 17:56       ` Jim Mattson
2026-01-15 16:07         ` Dave Hansen
2026-01-15 16:12           ` Paolo Bonzini
2026-01-15 16:27             ` Dave Hansen
2026-01-07  0:28   ` Chang S. Bae
2026-01-07 22:33     ` Paolo Bonzini
2026-01-08  3:06   ` Binbin Wu
2026-01-08 16:26     ` Paolo Bonzini
2026-01-15 15:54   ` Dave Hansen
2026-01-15 16:22     ` Paolo Bonzini
2026-01-15 18:19       ` Dave Hansen
2026-01-15 18:26         ` Paolo Bonzini
2026-01-15 23:43         ` Chang S. Bae
2026-01-01  9:05 ` [PATCH 2/4] selftests: kvm: replace numbered sync points with actions Paolo Bonzini
2026-01-06  0:02   ` Sean Christopherson
2026-01-07 22:28     ` Paolo Bonzini
2026-01-08 20:26       ` Sean Christopherson
2026-01-01  9:05 ` [PATCH 3/4] selftests: kvm: try getting XFD and XSAVE state out of sync Paolo Bonzini
2026-01-01  9:05 ` [PATCH 4/4] selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1 Paolo Bonzini
2026-01-06  1:18 ` [PATCH v2 0/4] x86, fpu/kvm: fix crash with AMX Sean Christopherson
2026-01-15 12:22 ` Borislav Petkov
2026-01-15 13:49   ` Paolo Bonzini
2026-01-15 16:39     ` Sean Christopherson
2026-01-15 17:05       ` Borislav Petkov
2026-01-15 17:12         ` Sean Christopherson
2026-01-16 12:22 ` Borislav Petkov
2026-01-21 11:35   ` Paolo Bonzini
2026-01-22 11:12     ` Borislav Petkov
2026-01-22 12:00       ` Paolo Bonzini
2026-01-23 13:23         ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox