public inbox for linux-kernel@vger.kernel.org
* [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization
@ 2026-02-05 21:43 Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

Currently, KVM's implementation of nested SVM treats the PAT MSR the same
way whether or not nested NPT is enabled: L1 and L2 share a single
PAT. However, the APM specifies that when nested NPT is enabled, the host
(L1) and the guest (L2) should have independent PATs: hPAT for L1 and gPAT
for L2. This patch series implements the architectural specification in
KVM.

Use the existing PAT MSR (vcpu->arch.pat) for hPAT. Add a new field,
svm->nested.gpat, for gPAT. With nested NPT enabled, redirect guest
accesses to the IA32_PAT MSR to gPAT. All other accesses, including
userspace accesses via KVM_{GET,SET}_MSRS, continue to reference hPAT.  The
special handling of userspace accesses ensures save/restore forward
compatibility (i.e. a checkpoint taken on a new kernel can be resumed on an
older kernel). When an
old kernel restores a checkpoint from a new kernel, the gPAT will be lost,
and L2 will simply use L1's PAT, which is the existing behavior of the old
kernel anyway.

v1: https://lore.kernel.org/kvm/20260113003016.3511895-1-jmattson@google.com/
v2: https://lore.kernel.org/kvm/20260115232154.3021475-1-jmattson@google.com/

v2 -> v3:

* Extract VMCB_NPT clean bit fix as a separate patch [Yosry]
* Squash v2 patches 2 and 3 (cache and validate g_pat) [Yosry]
* Drop redundant npt_enabled check in g_pat validation since existing
  nested_vmcb_check_controls() already rejects NP_ENABLE when !npt_enabled
  [Yosry]
* Fix svm_set_hpat() to propagate to vmcb02 only when !nested_npt_enabled,
  not unconditionally when in guest mode [Jim]
* Warn in svm_{get,set}_msr() if host_initiated and vcpu_wants_to_run when
  accessing IA32_PAT [Sean]
* Use dedicated svm->nested.gpat field instead of vmcb_save_area_cached
* Use dedicated header field (kvm_svm_nested_state_hdr.gpat) for nested
  state save/restore instead of overwriting vmcb01 save area
* Replace restore_gpat_from_pat with legacy_gpat_semantics to correctly
  handle KVM_GET_NESTED_STATE before the first KVM_RUN [Jim]
* Remove nested_vmcb02_compute_g_pat() after removing all callers [Yosry]

Jim Mattson (8):
  KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2
  KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT
  KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT
  KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE
  KVM: x86: nSVM: Handle restore of legacy nested state
  KVM: selftests: nSVM: Add svm_nested_pat test

 arch/x86/include/uapi/asm/kvm.h               |   5 +
 arch/x86/kvm/svm/nested.c                     |  51 ++-
 arch/x86/kvm/svm/svm.c                        |  37 ++-
 arch/x86/kvm/svm/svm.h                        |  35 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../selftests/kvm/x86/svm_nested_pat_test.c   | 298 ++++++++++++++++++
 6 files changed, 406 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_pat_test.c


base-commit: e944fe2c09f405a2e2d147145c9b470084bc4c9a
-- 
2.53.0.rc2.204.g2597b5adb4-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-09 16:05   ` Yosry Ahmed
  2026-02-05 21:43 ` [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat Jim Mattson
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

When an L2 guest is running and MSR_IA32_CR_PAT is written, the host PAT
value is stored in vmcb01.ptr->save.g_pat, but the VMCB_NPT clean bit is
cleared only for svm->vmcb, which points to vmcb02 in guest mode, leaving
vmcb01's clean bit stale.

Introduce the helper svm_set_vmcb_gpat() which sets vmcb->save.g_pat and
marks the VMCB dirty for VMCB_NPT. Use this helper in both svm_set_msr()
for updating vmcb01 and in nested_vmcb02_compute_g_pat() for updating
vmcb02, ensuring both VMCBs are properly marked dirty.

Fixes: 4995a3685f1b ("KVM: SVM: Use a separate vmcb for the nested L2 guest")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c | 2 +-
 arch/x86/kvm/svm/svm.c    | 3 +--
 arch/x86/kvm/svm/svm.h    | 6 ++++++
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index de90b104a0dd..f72dbd10dcad 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -636,7 +636,7 @@ void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm)
 		return;
 
 	/* FIXME: merge g_pat from vmcb01 and vmcb12.  */
-	svm->nested.vmcb02.ptr->save.g_pat = svm->vmcb01.ptr->save.g_pat;
+	svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, svm->vmcb01.ptr->save.g_pat);
 }
 
 static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5f0136dbdde6..08f145eb9215 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2939,10 +2939,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		if (ret)
 			break;
 
-		svm->vmcb01.ptr->save.g_pat = data;
+		svm_set_vmcb_gpat(svm->vmcb01.ptr, data);
 		if (is_guest_mode(vcpu))
 			nested_vmcb02_compute_g_pat(svm);
-		vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
 		break;
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr->host_initiated &&
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index ebd7b36b1ceb..986d90f2d4ca 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -420,6 +420,12 @@ static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
         return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
 }
 
+static inline void svm_set_vmcb_gpat(struct vmcb *vmcb, u64 data)
+{
+	vmcb->save.g_pat = data;
+	vmcb_mark_dirty(vmcb, VMCB_NPT);
+}
+
 static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
 {
 	return container_of(vcpu, struct vcpu_svm, vcpu);
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-06 18:19   ` Sean Christopherson
  2026-02-05 21:43 ` [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT Jim Mattson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
add a validity check so that when nested paging is enabled for vmcb12, an
invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."

Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c | 4 +++-
 arch/x86/kvm/svm/svm.h    | 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index f72dbd10dcad..1d4ff6408b34 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
+	svm->nested.gpat = vmcb12->save.g_pat;
 
 	if (!nested_vmcb_check_save(vcpu) ||
-	    !nested_vmcb_check_controls(vcpu)) {
+	    !nested_vmcb_check_controls(vcpu) ||
+	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
 		vmcb12->control.exit_code    = SVM_EXIT_ERR;
 		vmcb12->control.exit_info_1  = 0;
 		vmcb12->control.exit_info_2  = 0;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 986d90f2d4ca..42a4bf83b3aa 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -208,6 +208,9 @@ struct svm_nested_state {
 	 */
 	struct vmcb_save_area_cached save;
 
+	/* Cached guest PAT from vmcb12.save.g_pat */
+	u64 gpat;
+
 	bool initialized;
 
 	/*
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-06 18:23   ` Sean Christopherson
  2026-02-05 21:43 ` [PATCH v3 4/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT Jim Mattson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

When nested NPT is enabled in vmcb12, copy the (cached and validated)
vmcb12 g_pat field to the guest PAT register. Under KVM, the guest PAT
register lives in the vmcb02 g_pat field.

When NPT is enabled, but nested NPT is disabled, copy L1's IA32_PAT MSR to
the vmcb02 g_pat field, since L2 shares the IA32_PAT MSR with L1.

When NPT is disabled, the vmcb02 g_pat field is ignored by hardware.

Fixes: 15038e147247 ("KVM: SVM: obey guest PAT")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1d4ff6408b34..1ff2ede96094 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -646,9 +646,6 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 	struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 
-	nested_vmcb02_compute_g_pat(svm);
-	vmcb_mark_dirty(vmcb02, VMCB_NPT);
-
 	/* Load the nested guest state */
 	if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
 		new_vmcb12 = true;
@@ -656,6 +653,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 		svm->nested.force_msr_bitmap_recalc = true;
 	}
 
+	if (npt_enabled) {
+		if (nested_npt_enabled(svm)) {
+			if (unlikely(new_vmcb12 ||
+				     vmcb_is_dirty(vmcb12, VMCB_NPT))) {
+				vmcb02->save.g_pat = svm->nested.gpat;
+				vmcb_mark_dirty(vmcb02, VMCB_NPT);
+			}
+		} else {
+			vmcb02->save.g_pat = vcpu->arch.pat;
+			vmcb_mark_dirty(vmcb02, VMCB_NPT);
+		}
+	}
+
 	if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
 		vmcb02->save.es = vmcb12->save.es;
 		vmcb02->save.cs = vmcb12->save.cs;
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 4/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
                   ` (2 preceding siblings ...)
  2026-02-05 21:43 ` [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 5/8] KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT Jim Mattson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

When the vCPU is in guest mode with nested NPT enabled, guest accesses to
IA32_PAT are redirected to the gPAT register, which is stored in
vmcb02->save.g_pat.

Non-guest accesses (e.g. from userspace) to IA32_PAT are always redirected
to hPAT, which is stored in vcpu->arch.pat.

This is architected behavior. It also makes it possible to restore a new
checkpoint on an old kernel with reasonable semantics. After the restore,
gPAT will be lost, and L2 will run on L1's PAT. Note that the old kernel
would have always run L2 on L1's PAT.

Fixes: 15038e147247 ("KVM: SVM: obey guest PAT")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c |  9 ---------
 arch/x86/kvm/svm/svm.c    | 34 ++++++++++++++++++++++++++++------
 arch/x86/kvm/svm/svm.h    | 17 ++++++++++++++++-
 3 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1ff2ede96094..08844bc51b3c 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -630,15 +630,6 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	return 0;
 }
 
-void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm)
-{
-	if (!svm->nested.vmcb02.ptr)
-		return;
-
-	/* FIXME: merge g_pat from vmcb01 and vmcb12.  */
-	svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, svm->vmcb01.ptr->save.g_pat);
-}
-
 static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12)
 {
 	bool new_vmcb12 = false;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 08f145eb9215..b62c32c3942d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2852,6 +2852,20 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_AMD64_DE_CFG:
 		msr_info->data = svm->msr_decfg;
 		break;
+	case MSR_IA32_CR_PAT:
+		/*
+		 * When nested NPT is enabled, L2 has a separate PAT from L1.
+		 * Guest accesses to IA32_PAT while running L2 target L2's gPAT;
+		 * host-initiated accesses always target L1's hPAT for backward
+		 * and forward KVM_GET_MSRS compatibility with older kernels.
+		 */
+		WARN_ON_ONCE(msr_info->host_initiated && vcpu->wants_to_run);
+		if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
+		    nested_npt_enabled(svm))
+			msr_info->data = svm->nested.gpat;
+		else
+			msr_info->data = vcpu->arch.pat;
+		break;
 	default:
 		return kvm_get_msr_common(vcpu, msr_info);
 	}
@@ -2935,13 +2949,21 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 
 		break;
 	case MSR_IA32_CR_PAT:
-		ret = kvm_set_msr_common(vcpu, msr);
-		if (ret)
-			break;
+		if (!kvm_pat_valid(data))
+			return 1;
 
-		svm_set_vmcb_gpat(svm->vmcb01.ptr, data);
-		if (is_guest_mode(vcpu))
-			nested_vmcb02_compute_g_pat(svm);
+		/*
+		 * When nested NPT is enabled, L2 has a separate PAT from L1.
+		 * Guest accesses to IA32_PAT while running L2 target L2's gPAT;
+		 * host-initiated accesses always target L1's hPAT for backward
+		 * and forward KVM_SET_MSRS compatibility with older kernels.
+		 */
+		WARN_ON_ONCE(msr->host_initiated && vcpu->wants_to_run);
+		if (!msr->host_initiated && is_guest_mode(vcpu) &&
+		    nested_npt_enabled(svm))
+			svm_set_gpat(svm, data);
+		else
+			svm_set_hpat(svm, data);
 		break;
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr->host_initiated &&
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 42a4bf83b3aa..a0e94a2c51a1 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -590,6 +590,22 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 	return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
 }
 
+static inline void svm_set_gpat(struct vcpu_svm *svm, u64 data)
+{
+	svm->nested.gpat = data;
+	svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
+}
+
+static inline void svm_set_hpat(struct vcpu_svm *svm, u64 data)
+{
+	svm->vcpu.arch.pat = data;
+	if (npt_enabled) {
+		svm_set_vmcb_gpat(svm->vmcb01.ptr, data);
+		if (is_guest_mode(&svm->vcpu) && !nested_npt_enabled(svm))
+			svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
+	}
+}
+
 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
 {
 	return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_VNMI) &&
@@ -816,7 +832,6 @@ void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
 void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
 				    struct vmcb_save_area *save);
 void nested_sync_control_from_vmcb02(struct vcpu_svm *svm);
-void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm);
 void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb);
 
 extern struct kvm_x86_nested_ops svm_nested_ops;
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 5/8] KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
                   ` (3 preceding siblings ...)
  2026-02-05 21:43 ` [PATCH v3 4/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 6/8] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE Jim Mattson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

According to the APM volume 3 pseudo-code for "VMRUN," when nested paging
is enabled in the vmcb, the guest PAT register (gPAT) is saved to the vmcb
on emulated VMEXIT.

Under KVM, the guest PAT register lives in the vmcb02 g_pat field. Save
this value to the vmcb12 g_pat field on emulated VMEXIT.

Fixes: 15038e147247 ("KVM: SVM: obey guest PAT")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 08844bc51b3c..0b95ae1e864b 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1169,6 +1169,9 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	vmcb12->save.dr6    = svm->vcpu.arch.dr6;
 	vmcb12->save.cpl    = vmcb02->save.cpl;
 
+	if (nested_npt_enabled(svm))
+		vmcb12->save.g_pat = vmcb02->save.g_pat;
+
 	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) {
 		vmcb12->save.s_cet	= vmcb02->save.s_cet;
 		vmcb12->save.isst_addr	= vmcb02->save.isst_addr;
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 6/8] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
                   ` (4 preceding siblings ...)
  2026-02-05 21:43 ` [PATCH v3 5/8] KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state Jim Mattson
  2026-02-05 21:43 ` [PATCH v3 8/8] KVM: selftests: nSVM: Add svm_nested_pat test Jim Mattson
  7 siblings, 0 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

Add a 'flags' field to the SVM nested state header, and use bit 0 of the
flags to indicate that gPAT is stored in the nested state.

If in guest mode with nested NPT enabled, store the current gPAT value in
the nested state header, and set the flag.

Note that struct kvm_svm_nested_state_hdr is included in a union padded to
120 bytes, so there is room to add the flags and gpat fields without
changing any offsets.

Fixes: cc440cdad5b7 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/include/uapi/asm/kvm.h |  5 +++++
 arch/x86/kvm/svm/nested.c       | 14 ++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 846a63215ce1..664d04d1db3f 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -495,6 +495,8 @@ struct kvm_sync_regs {
 
 #define KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE	0x00000001
 
+#define KVM_STATE_SVM_VALID_GPAT	0x00000001
+
 /* vendor-independent attributes for system fd (group 0) */
 #define KVM_X86_GRP_SYSTEM		0
 #  define KVM_X86_XCOMP_GUEST_SUPP	0
@@ -531,6 +533,9 @@ struct kvm_svm_nested_state_data {
 
 struct kvm_svm_nested_state_hdr {
 	__u64 vmcb_pa;
+	__u32 flags;
+	__u32 reserved;
+	__u64 gpat;
 };
 
 /* for KVM_CAP_NESTED_STATE */
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 0b95ae1e864b..3f512fb630db 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1764,6 +1764,10 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
 	/* First fill in the header and copy it out.  */
 	if (is_guest_mode(vcpu)) {
 		kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
+		if (nested_npt_enabled(svm)) {
+			kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
+			kvm_state.hdr.svm.gpat = svm->nested.gpat;
+		}
 		kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
 		kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
 
@@ -1889,6 +1893,12 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	    !__nested_vmcb_check_save(vcpu, &save_cached))
 		goto out_free;
 
+	/*
+	 * Validate gPAT, if provided.
+	 */
+	if ((kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) &&
+	    !kvm_pat_valid(kvm_state->hdr.svm.gpat))
+		goto out_free;
 
 	/*
 	 * All checks done, we can enter guest mode. Userspace provides
@@ -1926,6 +1936,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	if (ret)
 		goto out_free;
 
+	if (nested_npt_enabled(svm) &&
+	    (kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT))
+		svm_set_gpat(svm, kvm_state->hdr.svm.gpat);
+
 	svm->nested.force_msr_bitmap_recalc = true;
 
 	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
                   ` (5 preceding siblings ...)
  2026-02-05 21:43 ` [PATCH v3 6/8] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  2026-02-06 19:17   ` Sean Christopherson
  2026-02-05 21:43 ` [PATCH v3 8/8] KVM: selftests: nSVM: Add svm_nested_pat test Jim Mattson
  7 siblings, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

When nested NPT is enabled and KVM_SET_NESTED_STATE is used to restore an
old checkpoint (without a valid gPAT), the current IA32_PAT value must be
used as L2's gPAT.

The current IA32_PAT value may be restored by KVM_SET_MSRS after
KVM_SET_NESTED_STATE. Furthermore, there may be a KVM_GET_NESTED_STATE
before the first KVM_RUN.

Introduce a new boolean, svm->nested.legacy_gpat_semantics. When set, hPAT
updates are also applied to gPAT, preserving the old behavior where L2
shared L1's PAT. svm_vcpu_pre_run() clears this boolean at the first
KVM_RUN.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/svm/nested.c | 11 ++++++++---
 arch/x86/kvm/svm/svm.c    |  2 ++
 arch/x86/kvm/svm/svm.h    |  9 +++++++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3f512fb630db..a7d6fc1382a7 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1936,9 +1936,14 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	if (ret)
 		goto out_free;
 
-	if (nested_npt_enabled(svm) &&
-	    (kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT))
-		svm_set_gpat(svm, kvm_state->hdr.svm.gpat);
+	if (nested_npt_enabled(svm)) {
+		if (kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) {
+			svm_set_gpat(svm, kvm_state->hdr.svm.gpat);
+		} else {
+			svm_set_gpat(svm, vcpu->arch.pat);
+			svm->nested.legacy_gpat_semantics = true;
+		}
+	}
 
 	svm->nested.force_msr_bitmap_recalc = true;
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b62c32c3942d..a6a44deec82b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4239,6 +4239,8 @@ static int svm_vcpu_pre_run(struct kvm_vcpu *vcpu)
 	if (to_kvm_sev_info(vcpu->kvm)->need_init)
 		return -EINVAL;
 
+	to_svm(vcpu)->nested.legacy_gpat_semantics = false;
+
 	return 1;
 }
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a0e94a2c51a1..a559cd45c8a9 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -221,6 +221,13 @@ struct svm_nested_state {
 	 * on its side.
 	 */
 	bool force_msr_bitmap_recalc;
+
+	/*
+	 * Indicates that a legacy nested state was restored (without a
+	 * valid gPAT). In this mode, updates to hPAT are also applied to
+	 * gPAT, preserving the old behavior where L2 shared L1's PAT.
+	 */
+	bool legacy_gpat_semantics;
 };
 
 struct vcpu_sev_es_state {
@@ -604,6 +611,8 @@ static inline void svm_set_hpat(struct vcpu_svm *svm, u64 data)
 		if (is_guest_mode(&svm->vcpu) && !nested_npt_enabled(svm))
 			svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
 	}
+	if (svm->nested.legacy_gpat_semantics)
+		svm_set_gpat(svm, data);
 }
 
 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm)
-- 
2.53.0.rc2.204.g2597b5adb4-goog



* [PATCH v3 8/8] KVM: selftests: nSVM: Add svm_nested_pat test
  2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
                   ` (6 preceding siblings ...)
  2026-02-05 21:43 ` [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state Jim Mattson
@ 2026-02-05 21:43 ` Jim Mattson
  7 siblings, 0 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-05 21:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest, Yosry Ahmed
  Cc: Jim Mattson

Verify that when nested NPT is enabled, L1 and L2 have independent PATs,
and changes to the PAT by either guest are not visible to the other.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../selftests/kvm/x86/svm_nested_pat_test.c   | 298 ++++++++++++++++++
 2 files changed, 299 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_pat_test.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 58eee0474db6..dddaaa6ac157 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -110,6 +110,7 @@ TEST_GEN_PROGS_x86 += x86/state_test
 TEST_GEN_PROGS_x86 += x86/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86 += x86/svm_vmcall_test
 TEST_GEN_PROGS_x86 += x86/svm_int_ctl_test
+TEST_GEN_PROGS_x86 += x86/svm_nested_pat_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_shutdown_test
 TEST_GEN_PROGS_x86 += x86/svm_nested_soft_inject_test
 TEST_GEN_PROGS_x86 += x86/tsc_scaling_sync
diff --git a/tools/testing/selftests/kvm/x86/svm_nested_pat_test.c b/tools/testing/selftests/kvm/x86/svm_nested_pat_test.c
new file mode 100644
index 000000000000..08c1428969b0
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86/svm_nested_pat_test.c
@@ -0,0 +1,298 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * KVM nested SVM PAT test
+ *
+ * Copyright (C) 2026, Google LLC.
+ *
+ * Test that KVM correctly virtualizes the PAT MSR and VMCB g_pat field
+ * for nested SVM guests:
+ *
+ * o With nested NPT disabled:
+ *     - L1 and L2 share the same PAT
+ *     - The vmcb12.g_pat is ignored
+ * o With nested NPT enabled:
+ *     - Invalid g_pat in vmcb12 should cause VMEXIT_INVALID
+ *     - L2 should see vmcb12.g_pat via RDMSR, not L1's PAT
+ *     - L2's writes to PAT should be saved to vmcb12 on exit
+ *     - L1's PAT should be restored after #VMEXIT from L2
+ *     - State save/restore should preserve both L1's and L2's PAT values
+ */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+
+#define L2_GUEST_STACK_SIZE 256
+
+#define PAT_DEFAULT		0x0007040600070406ULL
+#define L1_PAT_VALUE		0x0007040600070404ULL  /* Change PA0 to WT */
+#define L2_VMCB12_PAT		0x0606060606060606ULL  /* All WB */
+#define L2_PAT_MODIFIED		0x0606060606060604ULL  /* Change PA0 to WT */
+#define INVALID_PAT_VALUE	0x0808080808080808ULL  /* 8 is reserved */
+
+/*
+ * Shared state between L1 and L2 for verification.
+ */
+struct pat_test_data {
+	uint64_t l2_pat_read;
+	uint64_t l2_pat_after_write;
+	uint64_t l1_pat_after_vmexit;
+	uint64_t vmcb12_gpat_after_exit;
+	bool l2_done;
+};
+
+static struct pat_test_data *pat_data;
+
+static void l2_guest_code(void)
+{
+	pat_data->l2_pat_read = rdmsr(MSR_IA32_CR_PAT);
+	wrmsr(MSR_IA32_CR_PAT, L2_PAT_MODIFIED);
+	pat_data->l2_pat_after_write = rdmsr(MSR_IA32_CR_PAT);
+	pat_data->l2_done = true;
+	vmmcall();
+}
+
+static void l2_guest_code_saverestoretest(void)
+{
+	pat_data->l2_pat_read = rdmsr(MSR_IA32_CR_PAT);
+
+	GUEST_SYNC(1);
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), pat_data->l2_pat_read);
+
+	wrmsr(MSR_IA32_CR_PAT, L2_PAT_MODIFIED);
+	pat_data->l2_pat_after_write = rdmsr(MSR_IA32_CR_PAT);
+
+	GUEST_SYNC(2);
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), L2_PAT_MODIFIED);
+
+	pat_data->l2_done = true;
+	vmmcall();
+}
+
+static void l2_guest_code_multi_vmentry(void)
+{
+	pat_data->l2_pat_read = rdmsr(MSR_IA32_CR_PAT);
+	wrmsr(MSR_IA32_CR_PAT, L2_PAT_MODIFIED);
+	pat_data->l2_pat_after_write = rdmsr(MSR_IA32_CR_PAT);
+	vmmcall();
+
+	pat_data->l2_pat_read = rdmsr(MSR_IA32_CR_PAT);
+	pat_data->l2_done = true;
+	vmmcall();
+}
+
+static struct vmcb *l1_common_setup(struct svm_test_data *svm,
+				    struct pat_test_data *data,
+				    void *l2_guest_code,
+				    void *l2_guest_stack)
+{
+	struct vmcb *vmcb = svm->vmcb;
+
+	pat_data = data;
+
+	wrmsr(MSR_IA32_CR_PAT, L1_PAT_VALUE);
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), L1_PAT_VALUE);
+
+	generic_svm_setup(svm, l2_guest_code, l2_guest_stack);
+
+	vmcb->save.g_pat = L2_VMCB12_PAT;
+	vmcb->control.intercept &= ~(1ULL << INTERCEPT_MSR_PROT);
+
+	return vmcb;
+}
+
+static void l1_assert_l2_state(struct pat_test_data *data, uint64_t expected_pat_read)
+{
+	GUEST_ASSERT(data->l2_done);
+	GUEST_ASSERT_EQ(data->l2_pat_read, expected_pat_read);
+	GUEST_ASSERT_EQ(data->l2_pat_after_write, L2_PAT_MODIFIED);
+}
+
+static void l1_svm_code_npt_disabled(struct svm_test_data *svm,
+				     struct pat_test_data *data)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb;
+
+	vmcb = l1_common_setup(svm, data, l2_guest_code,
+			       &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	run_guest(vmcb, svm->vmcb_gpa);
+
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_VMMCALL);
+	l1_assert_l2_state(data, L1_PAT_VALUE);
+
+	data->l1_pat_after_vmexit = rdmsr(MSR_IA32_CR_PAT);
+	GUEST_ASSERT_EQ(data->l1_pat_after_vmexit, L2_PAT_MODIFIED);
+
+	GUEST_DONE();
+}
+
+static void l1_svm_code_invalid_gpat(struct svm_test_data *svm,
+				     struct pat_test_data *data)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb;
+
+	vmcb = l1_common_setup(svm, data, l2_guest_code,
+			       &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	vmcb->save.g_pat = INVALID_PAT_VALUE;
+
+	run_guest(vmcb, svm->vmcb_gpa);
+
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_ERR);
+	GUEST_ASSERT(!data->l2_done);
+
+	GUEST_DONE();
+}
+
+static void l1_svm_code_npt_enabled(struct svm_test_data *svm,
+				    struct pat_test_data *data)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb;
+
+	vmcb = l1_common_setup(svm, data, l2_guest_code,
+			       &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	run_guest(vmcb, svm->vmcb_gpa);
+
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_VMMCALL);
+	l1_assert_l2_state(data, L2_VMCB12_PAT);
+
+	data->vmcb12_gpat_after_exit = vmcb->save.g_pat;
+	GUEST_ASSERT_EQ(data->vmcb12_gpat_after_exit, L2_PAT_MODIFIED);
+
+	data->l1_pat_after_vmexit = rdmsr(MSR_IA32_CR_PAT);
+	GUEST_ASSERT_EQ(data->l1_pat_after_vmexit, L1_PAT_VALUE);
+
+	GUEST_DONE();
+}
+
+static void l1_svm_code_saverestore(struct svm_test_data *svm,
+				    struct pat_test_data *data)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb;
+
+	vmcb = l1_common_setup(svm, data, l2_guest_code_saverestoretest,
+			       &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	run_guest(vmcb, svm->vmcb_gpa);
+
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_VMMCALL);
+	GUEST_ASSERT(data->l2_done);
+
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), L1_PAT_VALUE);
+	GUEST_ASSERT_EQ(vmcb->save.g_pat, L2_PAT_MODIFIED);
+
+	GUEST_DONE();
+}
+
+static void l1_svm_code_multi_vmentry(struct svm_test_data *svm,
+				      struct pat_test_data *data)
+{
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb;
+
+	vmcb = l1_common_setup(svm, data, l2_guest_code_multi_vmentry,
+			       &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_VMMCALL);
+
+	GUEST_ASSERT_EQ(data->l2_pat_after_write, L2_PAT_MODIFIED);
+	GUEST_ASSERT_EQ(vmcb->save.g_pat, L2_PAT_MODIFIED);
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), L1_PAT_VALUE);
+
+	vmcb->save.rip += 3;  /* skip the 3-byte vmmcall */
+	run_guest(vmcb, svm->vmcb_gpa);
+
+	GUEST_ASSERT_EQ(vmcb->control.exit_code, SVM_EXIT_VMMCALL);
+	GUEST_ASSERT(data->l2_done);
+	GUEST_ASSERT_EQ(data->l2_pat_read, L2_PAT_MODIFIED);
+	GUEST_ASSERT_EQ(rdmsr(MSR_IA32_CR_PAT), L1_PAT_VALUE);
+
+	GUEST_DONE();
+}
+
+static void run_test(void *l1_code, const char *test_name, bool npt_enabled,
+		     bool do_save_restore)
+{
+	struct pat_test_data *data_hva;
+	vm_vaddr_t svm_gva, data_gva;
+	struct kvm_x86_state *state;
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	pr_info("Testing: %s\n", test_name);
+
+	vm = vm_create_with_one_vcpu(&vcpu, l1_code);
+	if (npt_enabled)
+		vm_enable_npt(vm);
+
+	vcpu_alloc_svm(vm, &svm_gva);
+
+	data_gva = vm_vaddr_alloc_page(vm);
+	data_hva = addr_gva2hva(vm, data_gva);
+	memset(data_hva, 0, sizeof(*data_hva));
+
+	if (npt_enabled)
+		tdp_identity_map_default_memslots(vm);
+
+	vcpu_args_set(vcpu, 2, svm_gva, data_gva);
+
+	for (;;) {
+		vcpu_run(vcpu);
+		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+			/* NOT REACHED */
+		case UCALL_SYNC:
+			if (do_save_restore) {
+				pr_info("  Save/restore at sync point %ld\n",
+					uc.args[1]);
+				state = vcpu_save_state(vcpu);
+				kvm_vm_release(vm);
+				vcpu = vm_recreate_with_one_vcpu(vm);
+				vcpu_load_state(vcpu, state);
+				kvm_x86_state_cleanup(state);
+			}
+			break;
+		case UCALL_DONE:
+			pr_info("  PASSED\n");
+			kvm_vm_free(vm);
+			return;
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_NPT));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
+
+	run_test(l1_svm_code_npt_disabled, "nested NPT disabled", false, false);
+
+	run_test(l1_svm_code_invalid_gpat, "invalid g_pat", true, false);
+
+	run_test(l1_svm_code_npt_enabled, "nested NPT enabled", true, false);
+
+	run_test(l1_svm_code_saverestore, "save/restore", true, true);
+
+	run_test(l1_svm_code_multi_vmentry, "multiple entries", true, false);
+
+	return 0;
+}
-- 
2.53.0.rc2.204.g2597b5adb4-goog


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-05 21:43 ` [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat Jim Mattson
@ 2026-02-06 18:19   ` Sean Christopherson
  2026-02-06 18:23     ` Yosry Ahmed
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 18:19 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest, Yosry Ahmed

On Thu, Feb 05, 2026, Jim Mattson wrote:
> Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> add a validity check so that when nested paging is enabled for vmcb12, an
> invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> 
> Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  arch/x86/kvm/svm/nested.c | 4 +++-
>  arch/x86/kvm/svm/svm.h    | 3 +++
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index f72dbd10dcad..1d4ff6408b34 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
>  
>  	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
>  	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> +	svm->nested.gpat = vmcb12->save.g_pat;
>  
>  	if (!nested_vmcb_check_save(vcpu) ||
> -	    !nested_vmcb_check_controls(vcpu)) {
> +	    !nested_vmcb_check_controls(vcpu) ||
> +	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
>  		vmcb12->control.exit_code    = SVM_EXIT_ERR;
>  		vmcb12->control.exit_info_1  = 0;
>  		vmcb12->control.exit_info_2  = 0;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 986d90f2d4ca..42a4bf83b3aa 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -208,6 +208,9 @@ struct svm_nested_state {
>  	 */
>  	struct vmcb_save_area_cached save;
>  
> +	/* Cached guest PAT from vmcb12.save.g_pat */
> +	u64 gpat;

Shouldn't this go in vmcb_save_area_cached?

> +
>  	bool initialized;
>  
>  	/*
> -- 
> 2.53.0.rc2.204.g2597b5adb4-goog
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  2026-02-05 21:43 ` [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT Jim Mattson
@ 2026-02-06 18:23   ` Sean Christopherson
  2026-02-06 18:29     ` Yosry Ahmed
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 18:23 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest, Yosry Ahmed

On Thu, Feb 05, 2026, Jim Mattson wrote:
> When nested NPT is enabled in vmcb12, copy the (cached and validated)
> vmcb12 g_pat field to the guest PAT register. Under KVM, the guest PAT
> register lives in the vmcb02 g_pat field.
> 
> When NPT is enabled, but nested NPT is disabled, copy L1's IA32_PAT MSR to
> the vmcb02 g_pat field, since L2 shares the IA32_PAT MSR with L1.
> 
> When NPT is disabled, the vmcb02 g_pat field is ignored by hardware.

Uber nit, the "vmcb02" qualifier can be dropped, i.e.

  When NPT is disabled, the g_pat field is ignored by hardware.

Scoping it to vmcb02 makes it sound like there's a special rule about vmcb02.

> Fixes: 15038e147247 ("KVM: SVM: obey guest PAT")
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  arch/x86/kvm/svm/nested.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 1d4ff6408b34..1ff2ede96094 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -646,9 +646,6 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>  	struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
>  	struct kvm_vcpu *vcpu = &svm->vcpu;
>  
> -	nested_vmcb02_compute_g_pat(svm);
> -	vmcb_mark_dirty(vmcb02, VMCB_NPT);
> -
>  	/* Load the nested guest state */
>  	if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
>  		new_vmcb12 = true;
> @@ -656,6 +653,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>  		svm->nested.force_msr_bitmap_recalc = true;
>  	}
>  
> +	if (npt_enabled) {
> +		if (nested_npt_enabled(svm)) {
> +			if (unlikely(new_vmcb12 ||
> +				     vmcb_is_dirty(vmcb12, VMCB_NPT))) {
> +				vmcb02->save.g_pat = svm->nested.gpat;
> +				vmcb_mark_dirty(vmcb02, VMCB_NPT);
> +			}
> +		} else {
> +			vmcb02->save.g_pat = vcpu->arch.pat;
> +			vmcb_mark_dirty(vmcb02, VMCB_NPT);
> +		}
> +	}

To reduce indentation, how about this?  There's a consistency check for
nested_npt_enabled() vs. npt_enabled, so it's guaranteed to do the right thing.

	if (nested_npt_enabled(svm)) {
		if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
			vmcb02->save.g_pat = svm->nested.gpat;
			vmcb_mark_dirty(vmcb02, VMCB_NPT);
		}
	} else if (npt_enabled) {
		vmcb02->save.g_pat = vcpu->arch.pat;
		vmcb_mark_dirty(vmcb02, VMCB_NPT);
	}

> +
>  	if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
>  		vmcb02->save.es = vmcb12->save.es;
>  		vmcb02->save.cs = vmcb12->save.cs;
> -- 
> 2.53.0.rc2.204.g2597b5adb4-goog
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 18:19   ` Sean Christopherson
@ 2026-02-06 18:23     ` Yosry Ahmed
  2026-02-06 18:32       ` Jim Mattson
  0 siblings, 1 reply; 23+ messages in thread
From: Yosry Ahmed @ 2026-02-06 18:23 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest

February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:


> 
> On Thu, Feb 05, 2026, Jim Mattson wrote:
> 
> > 
> > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> >  add a validity check so that when nested paging is enabled for vmcb12, an
> >  invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> >  specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> >  
> >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> >  Signed-off-by: Jim Mattson <jmattson@google.com>
> >  ---
> >  arch/x86/kvm/svm/nested.c | 4 +++-
> >  arch/x86/kvm/svm/svm.h | 3 +++
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >  
> >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> >  index f72dbd10dcad..1d4ff6408b34 100644
> >  --- a/arch/x86/kvm/svm/nested.c
> >  +++ b/arch/x86/kvm/svm/nested.c
> >  @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> >  
> >  nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> >  nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> >  + svm->nested.gpat = vmcb12->save.g_pat;
> >  
> >  if (!nested_vmcb_check_save(vcpu) ||
> >  - !nested_vmcb_check_controls(vcpu)) {
> >  + !nested_vmcb_check_controls(vcpu) ||
> >  + (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> >  vmcb12->control.exit_code = SVM_EXIT_ERR;
> >  vmcb12->control.exit_info_1 = 0;
> >  vmcb12->control.exit_info_2 = 0;
> >  diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> >  index 986d90f2d4ca..42a4bf83b3aa 100644
> >  --- a/arch/x86/kvm/svm/svm.h
> >  +++ b/arch/x86/kvm/svm/svm.h
> >  @@ -208,6 +208,9 @@ struct svm_nested_state {
> >  */
> >  struct vmcb_save_area_cached save;
> >  
> >  + /* Cached guest PAT from vmcb12.save.g_pat */
> >  + u64 gpat;
> > 
> Shouldn't this go in vmcb_save_area_cached?

I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  2026-02-06 18:23   ` Sean Christopherson
@ 2026-02-06 18:29     ` Yosry Ahmed
  2026-02-06 19:14       ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Yosry Ahmed @ 2026-02-06 18:29 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest

February 6, 2026 at 10:23 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> 
> On Thu, Feb 05, 2026, Jim Mattson wrote:
> 
> > 
> > When nested NPT is enabled in vmcb12, copy the (cached and validated)
> >  vmcb12 g_pat field to the guest PAT register. Under KVM, the guest PAT
> >  register lives in the vmcb02 g_pat field.
> >  
> >  When NPT is enabled, but nested NPT is disabled, copy L1's IA32_PAT MSR to
> >  the vmcb02 g_pat field, since L2 shares the IA32_PAT MSR with L1.
> >  
> >  When NPT is disabled, the vmcb02 g_pat field is ignored by hardware.
> > 
> Uber nit, the "vmcb02" qualifier can be dropped, i.e.
> 
>  When NPT is disabled, the g_pat field is ignored by hardware.
> 
> Scoping it to vmcb02 makes it sound like there's a special rule about vmcb02.
> 
> > 
> > Fixes: 15038e147247 ("KVM: SVM: obey guest PAT")
> >  Signed-off-by: Jim Mattson <jmattson@google.com>
> >  ---
> >  arch/x86/kvm/svm/nested.c | 16 +++++++++++++---
> >  1 file changed, 13 insertions(+), 3 deletions(-)
> >  
> >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> >  index 1d4ff6408b34..1ff2ede96094 100644
> >  --- a/arch/x86/kvm/svm/nested.c
> >  +++ b/arch/x86/kvm/svm/nested.c
> >  @@ -646,9 +646,6 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
> >  struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
> >  struct kvm_vcpu *vcpu = &svm->vcpu;
> >  
> >  - nested_vmcb02_compute_g_pat(svm);
> >  - vmcb_mark_dirty(vmcb02, VMCB_NPT);
> >  -
> >  /* Load the nested guest state */
> >  if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
> >  new_vmcb12 = true;
> >  @@ -656,6 +653,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
> >  svm->nested.force_msr_bitmap_recalc = true;
> >  }
> >  
> >  + if (npt_enabled) {
> >  + if (nested_npt_enabled(svm)) {
> >  + if (unlikely(new_vmcb12 ||
> >  + vmcb_is_dirty(vmcb12, VMCB_NPT))) {
> >  + vmcb02->save.g_pat = svm->nested.gpat;
> >  + vmcb_mark_dirty(vmcb02, VMCB_NPT);
> >  + }
> >  + } else {
> >  + vmcb02->save.g_pat = vcpu->arch.pat;
> >  + vmcb_mark_dirty(vmcb02, VMCB_NPT);
> >  + }
> >  + }
> > 
> To reduce indentation, how about this? There's a consistency check for
> nested_npt_enabled() vs. npt_enabled, so it's guaranteed to do the right thing.

You mean the one that goes away after this patch: https://lore.kernel.org/kvm/20260115011312.3675857-16-yosry.ahmed@linux.dev/?

> 
>  if (nested_npt_enabled(svm)) {
>  if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
>  vmcb02->save.g_pat = svm->nested.gpat;
>  vmcb_mark_dirty(vmcb02, VMCB_NPT);
>  }
>  } else if (npt_enabled) {
>  vmcb02->save.g_pat = vcpu->arch.pat;
>  vmcb_mark_dirty(vmcb02, VMCB_NPT);
>  }
> 
> > 
> > +
> >  if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
> >  vmcb02->save.es = vmcb12->save.es;
> >  vmcb02->save.cs = vmcb12->save.cs;
> >  -- 
> >  2.53.0.rc2.204.g2597b5adb4-goog
> >
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 18:23     ` Yosry Ahmed
@ 2026-02-06 18:32       ` Jim Mattson
  2026-02-06 19:12         ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-06 18:32 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
>
>
> >
> > On Thu, Feb 05, 2026, Jim Mattson wrote:
> >
> > >
> > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> > >  add a validity check so that when nested paging is enabled for vmcb12, an
> > >  invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> > >  specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> > >
> > >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > >  Signed-off-by: Jim Mattson <jmattson@google.com>
> > >  ---
> > >  arch/x86/kvm/svm/nested.c | 4 +++-
> > >  arch/x86/kvm/svm/svm.h | 3 +++
> > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > >
> > >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > >  index f72dbd10dcad..1d4ff6408b34 100644
> > >  --- a/arch/x86/kvm/svm/nested.c
> > >  +++ b/arch/x86/kvm/svm/nested.c
> > >  @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > >
> > >  nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > >  nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > >  + svm->nested.gpat = vmcb12->save.g_pat;
> > >
> > >  if (!nested_vmcb_check_save(vcpu) ||
> > >  - !nested_vmcb_check_controls(vcpu)) {
> > >  + !nested_vmcb_check_controls(vcpu) ||
> > >  + (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > >  vmcb12->control.exit_code = SVM_EXIT_ERR;
> > >  vmcb12->control.exit_info_1 = 0;
> > >  vmcb12->control.exit_info_2 = 0;
> > >  diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > >  index 986d90f2d4ca..42a4bf83b3aa 100644
> > >  --- a/arch/x86/kvm/svm/svm.h
> > >  +++ b/arch/x86/kvm/svm/svm.h
> > >  @@ -208,6 +208,9 @@ struct svm_nested_state {
> > >  */
> > >  struct vmcb_save_area_cached save;
> > >
> > >  + /* Cached guest PAT from vmcb12.save.g_pat */
> > >  + u64 gpat;
> > >
> > Shouldn't this go in vmcb_save_area_cached?
>
> I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.

Right. The two issues with putting it in vmcb_save_area_cached were:

1. Checking all of vmcb_save_area_cached requires access to the
corresponding control area (or at least the boolean, "NPT enabled.")
2. In the nested state serialization payload, everything else in the
vmcb_save_area_cached comes from L1 (host state to be restored at
emulated #VMEXIT.)

The first issue was a little messy, but not that distasteful. The
second issue was really a mess.
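
For reference, the kvm_pat_valid() check this series relies on boils down to
requiring that each of the eight PAT entries encode one of the six
architectural memory types. A rough userspace sketch of that rule
(illustrative only, not KVM's actual implementation):

```python
# Each 8-bit PAT entry must encode one of the six architectural memory
# types; 2, 3, and anything >= 8 are reserved encodings.
VALID_PAT_TYPES = {0x00,  # UC
                   0x01,  # WC
                   0x04,  # WT
                   0x05,  # WP
                   0x06,  # WB
                   0x07}  # UC-

def pat_valid(pat: int) -> bool:
    """Return True if all eight PAT entries are valid memory types."""
    return all(((pat >> (8 * i)) & 0xFF) in VALID_PAT_TYPES
               for i in range(8))
```

The selftest's INVALID_PAT_VALUE (every entry 8) fails this rule, while
PAT_DEFAULT and the all-WB vmcb12 value pass it.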

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 18:32       ` Jim Mattson
@ 2026-02-06 19:12         ` Sean Christopherson
  2026-02-06 19:15           ` Yosry Ahmed
  2026-02-06 20:56           ` Jim Mattson
  0 siblings, 2 replies; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 19:12 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yosry Ahmed, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 06, 2026, Jim Mattson wrote:
> On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> >
> > February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> >
> >
> > >
> > > On Thu, Feb 05, 2026, Jim Mattson wrote:
> > >
> > > >
> > > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> > > >  add a validity check so that when nested paging is enabled for vmcb12, an
> > > >  invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> > > >  specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> > > >
> > > >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > > >  Signed-off-by: Jim Mattson <jmattson@google.com>
> > > >  ---
> > > >  arch/x86/kvm/svm/nested.c | 4 +++-
> > > >  arch/x86/kvm/svm/svm.h | 3 +++
> > > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > >  index f72dbd10dcad..1d4ff6408b34 100644
> > > >  --- a/arch/x86/kvm/svm/nested.c
> > > >  +++ b/arch/x86/kvm/svm/nested.c
> > > >  @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > > >
> > > >  nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > >  nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > >  + svm->nested.gpat = vmcb12->save.g_pat;
> > > >
> > > >  if (!nested_vmcb_check_save(vcpu) ||
> > > >  - !nested_vmcb_check_controls(vcpu)) {
> > > >  + !nested_vmcb_check_controls(vcpu) ||
> > > >  + (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > > >  vmcb12->control.exit_code = SVM_EXIT_ERR;
> > > >  vmcb12->control.exit_info_1 = 0;
> > > >  vmcb12->control.exit_info_2 = 0;
> > > >  diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > >  index 986d90f2d4ca..42a4bf83b3aa 100644
> > > >  --- a/arch/x86/kvm/svm/svm.h
> > > >  +++ b/arch/x86/kvm/svm/svm.h
> > > >  @@ -208,6 +208,9 @@ struct svm_nested_state {
> > > >  */
> > > >  struct vmcb_save_area_cached save;
> > > >
> > > >  + /* Cached guest PAT from vmcb12.save.g_pat */
> > > >  + u64 gpat;
> > > >
> > > Shouldn't this go in vmcb_save_area_cached?
> >
> > I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.

LOL, oh the irony:

  I'm going to cache it on its own to avoid confusion.

> Right. The two issues with putting it in vmcb_save_area_cached were:
> 
> 1. Checking all of vmcb_save_area_cached requires access to the
> corresponding control area (or at least the boolean, "NPT enabled.")

Checking the control area seems like the right answer (I went down that path
before reading this).

> 2. In the nested state serialization payload, everything else in the
> vmcb_save_area_cached comes from L1 (host state to be restored at
> emulated #VMEXIT.)

Hmm, right, but *because* it's ignored, that gives us carte blanche to clobber it.
More below.

> The first issue was a little messy, but not that distasteful.

I actually find it the opposite of distasteful.  KVM definitely _should_ be
checking the controls, not the vCPU state.  If it weren't for needing to get at
MAXPHYADDR in CPUID, I'd push to drop @vcpu entirely.

> The second issue was really a mess.

I'd rather have the mess contained and documented, though.  Caching g_pat outside
of vmcb_save_area_cached bleeds the mess into all of the relevant nSVM code, and
doesn't leave any breadcrumbs in the code/comments to explain that it "needs" to
be kept separate.

AFAICT, the only "problem" is that g_pat in the serialization payload will be
garbage when restoring state from an older KVM.  But that's totally fine, precisely
because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's always resident
in vcpu->arch.pat.  So can't we just do this to avoid a spurious -EINVAL?

	/*
	 * Validate host state saved from before VMRUN (see
	 * nested_svm_check_permissions).
	 */
	__nested_copy_vmcb_save_to_cache(&save_cached, save);

	/*
	 * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
	 * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
	 * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
	 * other way around.
	 */
	save_cached.g_pat = vcpu->arch.pat;

	if (!(save->cr0 & X86_CR0_PG) ||
	    !(save->cr0 & X86_CR0_PE) ||
	    (save->rflags & X86_EFLAGS_VM) ||
	    !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
		goto out_free;

Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
opportunistically drop the useless single-use wrappers (probably in a standalone
patch to plumb in @ctrl).  E.g. (completely untested)

---
 arch/x86/kvm/svm/nested.c | 71 ++++++++++++++++++---------------------
 arch/x86/kvm/svm/svm.c    |  2 +-
 arch/x86/kvm/svm/svm.h    |  6 ++--
 3 files changed, 35 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a7d6fc1382a7..a429947c8966 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -339,8 +339,8 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu *vcpu, u64 pa, u32 size)
 	    kvm_vcpu_is_legal_gpa(vcpu, addr + size - 1);
 }
 
-static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
-					 struct vmcb_ctrl_area_cached *control)
+static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
+				       struct vmcb_ctrl_area_cached *control)
 {
 	if (CC(!vmcb12_is_intercept(control, INTERCEPT_VMRUN)))
 		return false;
@@ -367,8 +367,9 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
 }
 
 /* Common checks that apply to both L1 and L2 state.  */
-static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
-				     struct vmcb_save_area_cached *save)
+static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu,
+				   struct vmcb_ctrl_area_cached *ctrl,
+				   struct vmcb_save_area_cached *save)
 {
 	if (CC(!(save->efer & EFER_SVME)))
 		return false;
@@ -399,25 +400,13 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
 	if (CC(!kvm_valid_efer(vcpu, save->efer)))
 		return false;
 
+	if (CC((ctrl->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
+	       !kvm_pat_valid(save->g_pat)))
+		return false;
+
 	return true;
 }
 
-static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-	struct vmcb_save_area_cached *save = &svm->nested.save;
-
-	return __nested_vmcb_check_save(vcpu, save);
-}
-
-static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-	struct vmcb_ctrl_area_cached *ctl = &svm->nested.ctl;
-
-	return __nested_vmcb_check_controls(vcpu, ctl);
-}
-
 /*
  * If a feature is not advertised to L1, clear the corresponding vmcb12
  * intercept.
@@ -504,6 +493,9 @@ static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
 
 	to->dr6 = from->dr6;
 	to->dr7 = from->dr7;
+
+	to->g_pat = from->g_pat;
+
 }
 
 void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
@@ -644,17 +636,14 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 		svm->nested.force_msr_bitmap_recalc = true;
 	}
 
-	if (npt_enabled) {
-		if (nested_npt_enabled(svm)) {
-			if (unlikely(new_vmcb12 ||
-				     vmcb_is_dirty(vmcb12, VMCB_NPT))) {
-				vmcb02->save.g_pat = svm->nested.gpat;
-				vmcb_mark_dirty(vmcb02, VMCB_NPT);
-			}
-		} else {
-			vmcb02->save.g_pat = vcpu->arch.pat;
+	if (nested_npt_enabled(svm)) {
+		if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
+			vmcb02->save.g_pat = svm->nested.save.g_pat;
 			vmcb_mark_dirty(vmcb02, VMCB_NPT);
 		}
+	} else if (npt_enabled) {
+		vmcb02->save.g_pat = vcpu->arch.pat;
+		vmcb_mark_dirty(vmcb02, VMCB_NPT);
 	}
 
 	if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
@@ -1028,11 +1017,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
-	svm->nested.gpat = vmcb12->save.g_pat;
 
-	if (!nested_vmcb_check_save(vcpu) ||
-	    !nested_vmcb_check_controls(vcpu) ||
-	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
+	if (!nested_vmcb_check_save(vcpu, &svm->nested.ctl, &svm->nested.save) ||
+	    !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
 		vmcb12->control.exit_code    = SVM_EXIT_ERR;
 		vmcb12->control.exit_info_1  = 0;
 		vmcb12->control.exit_info_2  = 0;
@@ -1766,7 +1753,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
 		kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
 		if (nested_npt_enabled(svm)) {
 			kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
-			kvm_state.hdr.svm.gpat = svm->nested.gpat;
+			kvm_state.hdr.svm.gpat = svm->nested.save.g_pat;
 		}
 		kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
 		kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
@@ -1871,7 +1858,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 
 	ret = -EINVAL;
 	__nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl);
-	if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
+	if (!nested_vmcb_check_controls(vcpu, &ctl_cached))
 		goto out_free;
 
 	/*
@@ -1887,15 +1874,21 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	 * nested_svm_check_permissions).
 	 */
 	__nested_copy_vmcb_save_to_cache(&save_cached, save);
+
+	/*
+	 * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
+	 * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
+	 * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
+	 * other way around.
+	 */
+	save_cached.g_pat = vcpu->arch.pat;
+
 	if (!(save->cr0 & X86_CR0_PG) ||
 	    !(save->cr0 & X86_CR0_PE) ||
 	    (save->rflags & X86_EFLAGS_VM) ||
-	    !__nested_vmcb_check_save(vcpu, &save_cached))
+	    !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
 		goto out_free;
 
-	/*
-	 * Validate gPAT, if provided.
-	 */
 	if ((kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) &&
 	    !kvm_pat_valid(kvm_state->hdr.svm.gpat))
 		goto out_free;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a6a44deec82b..bf8562a5f655 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2862,7 +2862,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		WARN_ON_ONCE(msr_info->host_initiated && vcpu->wants_to_run);
 		if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
 		    nested_npt_enabled(svm))
-			msr_info->data = svm->nested.gpat;
+			msr_info->data = svm->nested.save.g_pat;
 		else
 			msr_info->data = vcpu->arch.pat;
 		break;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a559cd45c8a9..6f07d8e3f06e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -146,6 +146,7 @@ struct vmcb_save_area_cached {
 	u64 cr0;
 	u64 dr7;
 	u64 dr6;
+	u64 g_pat;
 };
 
 struct vmcb_ctrl_area_cached {
@@ -208,9 +209,6 @@ struct svm_nested_state {
 	 */
 	struct vmcb_save_area_cached save;
 
-	/* Cached guest PAT from vmcb12.save.g_pat */
-	u64 gpat;
-
 	bool initialized;
 
 	/*
@@ -599,7 +597,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 
 static inline void svm_set_gpat(struct vcpu_svm *svm, u64 data)
 {
-	svm->nested.gpat = data;
+	svm->nested.save.g_pat = data;
 	svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
 }
 

base-commit: 6461c50e232d6f81d5b9604236f7ee3df870e932
-- 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  2026-02-06 18:29     ` Yosry Ahmed
@ 2026-02-06 19:14       ` Sean Christopherson
  0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 19:14 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Jim Mattson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 06, 2026, Yosry Ahmed wrote:
> February 6, 2026 at 10:23 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> > >  	if (svm->nested.vmcb12_gpa != svm->nested.last_vmcb12_gpa) {
> > >  		new_vmcb12 = true;
> > > @@ -656,6 +653,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
> > >  		svm->nested.force_msr_bitmap_recalc = true;
> > >  	}
> > > 
> > > +	if (npt_enabled) {
> > > +		if (nested_npt_enabled(svm)) {
> > > +			if (unlikely(new_vmcb12 ||
> > > +				     vmcb_is_dirty(vmcb12, VMCB_NPT))) {
> > > +				vmcb02->save.g_pat = svm->nested.gpat;
> > > +				vmcb_mark_dirty(vmcb02, VMCB_NPT);
> > > +			}
> > > +		} else {
> > > +			vmcb02->save.g_pat = vcpu->arch.pat;
> > > +			vmcb_mark_dirty(vmcb02, VMCB_NPT);
> > > +		}
> > > +	}
> > > 
> > To reduce indentation, how about this? There's a consistency check for
> > nested_npt_enabled() vs. npt_enabled, so it's guaranteed to do the right thing.
> 
> You mean the one that goes away after this patch: https://lore.kernel.org/kvm/20260115011312.3675857-16-yosry.ahmed@linux.dev/?

Heh, still fine.  All that matters is that nested_npt_enabled() can't be %true
if npt_enabled is %false.

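[Editorial note: the equivalence Sean relies on above can be spot-checked mechanically. The sketch below is illustrative C, not kernel code; the enum and function names are invented, and the `new_vmcb12 || vmcb_is_dirty(...)` condition is collapsed into a single `dirty` flag. It compares the nested form from the patch against the flattened form, which diverge only when the "nested NPT implies host NPT" invariant is violated.]

```c
#include <assert.h>
#include <stdbool.h>

/* What g_pat source vmcb02 would get; names are stand-ins. */
enum gpat_action { GPAT_NONE, GPAT_USE_VMCB12, GPAT_USE_HPAT };

/* The nested form posted in the patch. */
static enum gpat_action nested_form(bool npt, bool nested_npt, bool dirty)
{
	if (npt) {
		if (nested_npt) {
			if (dirty)
				return GPAT_USE_VMCB12;
			return GPAT_NONE;
		}
		return GPAT_USE_HPAT;
	}
	return GPAT_NONE;
}

/* The flattened form suggested in review. */
static enum gpat_action flat_form(bool npt, bool nested_npt, bool dirty)
{
	if (nested_npt) {
		if (dirty)
			return GPAT_USE_VMCB12;
		return GPAT_NONE;
	} else if (npt) {
		return GPAT_USE_HPAT;
	}
	return GPAT_NONE;
}
```

The two forms agree on every input where `nested_npt` implies `npt`, and differ exactly in the (architecturally impossible) `!npt && nested_npt` case, which is the point made in the reply above.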

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 19:12         ` Sean Christopherson
@ 2026-02-06 19:15           ` Yosry Ahmed
  2026-02-06 19:50             ` Sean Christopherson
  2026-02-06 20:56           ` Jim Mattson
  1 sibling, 1 reply; 23+ messages in thread
From: Yosry Ahmed @ 2026-02-06 19:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Jim Mattson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 06, 2026 at 11:12:36AM -0800, Sean Christopherson wrote:
> On Fri, Feb 06, 2026, Jim Mattson wrote:
> > On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > >
> > > February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> > >
> > >
> > > >
> > > > On Thu, Feb 05, 2026, Jim Mattson wrote:
> > > >
> > > > >
> > > > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> > > > >  add a validity check so that when nested paging is enabled for vmcb12, an
> > > > >  invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> > > > >  specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> > > > >
> > > > >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > > > >  Signed-off-by: Jim Mattson <jmattson@google.com>
> > > > >  ---
> > > > >  arch/x86/kvm/svm/nested.c | 4 +++-
> > > > >  arch/x86/kvm/svm/svm.h | 3 +++
> > > > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > > > >
> > > > >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > >  index f72dbd10dcad..1d4ff6408b34 100644
> > > > >  --- a/arch/x86/kvm/svm/nested.c
> > > > >  +++ b/arch/x86/kvm/svm/nested.c
> > > > > @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > > > > 
> > > > >  	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > > >  	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > > > +	svm->nested.gpat = vmcb12->save.g_pat;
> > > > > 
> > > > >  	if (!nested_vmcb_check_save(vcpu) ||
> > > > > -	    !nested_vmcb_check_controls(vcpu)) {
> > > > > +	    !nested_vmcb_check_controls(vcpu) ||
> > > > > +	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > > > >  		vmcb12->control.exit_code    = SVM_EXIT_ERR;
> > > > >  		vmcb12->control.exit_info_1  = 0;
> > > > >  		vmcb12->control.exit_info_2  = 0;
> > > > > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > > > index 986d90f2d4ca..42a4bf83b3aa 100644
> > > > > --- a/arch/x86/kvm/svm/svm.h
> > > > > +++ b/arch/x86/kvm/svm/svm.h
> > > > > @@ -208,6 +208,9 @@ struct svm_nested_state {
> > > > >  	 */
> > > > >  	struct vmcb_save_area_cached save;
> > > > > 
> > > > > +	/* Cached guest PAT from vmcb12.save.g_pat */
> > > > > +	u64 gpat;
> > > > >
> > > > Shouldn't this go in vmcb_save_area_cached?
> > >
> > > I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.
> 
> LOL, oh the irony:
> 
>   I'm going to cache it on its own to avoid confusion.
> 
> > Right. The two issues with putting it in vmcb_save_area_cached were:
> > 
> > 1. Checking all of vmcb_save_area_cached requires access to the
> > corresponding control area (or at least the boolean, "NPT enabled.")
> 
> Checking the control area seems like the right answer (I went down that path
> before reading this).
> 
> > 2. In the nested state serialization payload, everything else in the
> > vmcb_save_area_cached comes from L1 (host state to be restored at
> > emulated #VMEXIT.)
> 
> Hmm, right, but *because* it's ignored, that gives us carte blanche to clobber it.
> More below.
> 
> > The first issue was a little messy, but not that distasteful.
> 
> I actually find it the opposite of distasteful.  KVM definitely _should_ be
> checking the controls, not the vCPU state.  If it weren't for needing to get at
> MAXPHYADDR in CPUID, I'd push to drop @vcpu entirely.
> 
> > The second issue was really a mess.
> 
> I'd rather have the mess contained and documented though.  Caching g_pat outside
> of vmcb_save_area_cached bleeds the mess into all of the relevant nSVM code, and
> doesn't leave any breadcrumbs in the code/comments to explain that it "needs" to
> be kept separate.
> 
> AFAICT, the only "problem" is that g_pat in the serialization payload will be
> garbage when restoring state from an older KVM.  But that's totally fine, precisely
> because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's always resident
> in vcpu->arch.pat.  So can't we just do this to avoid a spurious -EINVAL?
> 
> 	/*
> 	 * Validate host state saved from before VMRUN (see
> 	 * nested_svm_check_permissions).
> 	 */
> 	__nested_copy_vmcb_save_to_cache(&save_cached, save);
> 
> 	/*
> 	 * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
> 	 * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
> 	 * vcpu->arch.pat, i.e. gPAT is a reflection of vcpu->arch.pat, not the
> 	 * other way around.
> 	 */
> 	save_cached.g_pat = vcpu->arch.pat;
> 
> 	if (!(save->cr0 & X86_CR0_PG) ||
> 	    !(save->cr0 & X86_CR0_PE) ||
> 	    (save->rflags & X86_EFLAGS_VM) ||
> 	    !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
> 		goto out_free;
> 
> Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
> opportunistically drop the useless single-use wrappers (probably in a standalone
> patch to plumb in @ctrl).  E.g. (completely untested)

They are dropped by
https://lore.kernel.org/kvm/20260206190851.860662-9-yosry.ahmed@linux.dev/.

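[Editorial note: for readers outside KVM, the validity rule debated in this subthread is that each of the eight byte-wide PAT entries must encode a defined memory type (0, 1, 4, 5, 6, or 7; encodings 2 and 3 are reserved). Below is a standalone sketch written to mirror the shape of KVM's kvm_pat_valid(); it is an illustration, not the kernel source.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each PAT entry is one byte; only its low 3 bits may be set. */
static bool pat_valid(uint64_t data)
{
	/* Bits 3-7 of every byte must be zero. */
	if (data & 0xF8F8F8F8F8F8F8F8ull)
		return false;
	/*
	 * Of the remaining encodings, 0, 1, 4, 5, 6, 7 are valid and
	 * 2, 3 are reserved.  A byte with bit 1 set is valid only if
	 * bit 2 is also set (i.e. 6 or 7, but not 2 or 3), so OR-ing
	 * bit 1 up into bit 2 must not change a valid value.
	 */
	return (data | ((data & 0x0202020202020202ull) << 1)) == data;
}
```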

* Re: [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state
  2026-02-05 21:43 ` [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state Jim Mattson
@ 2026-02-06 19:17   ` Sean Christopherson
  2026-02-06 22:38     ` Jim Mattson
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 19:17 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest, Yosry Ahmed

On Thu, Feb 05, 2026, Jim Mattson wrote:
> When nested NPT is enabled and KVM_SET_NESTED_STATE is used to restore an
> old checkpoint (without a valid gPAT), the current IA32_PAT value must be
> used as L2's gPAT.
> 
> The current IA32_PAT value may be restored by KVM_SET_MSRS after
> KVM_SET_NESTED_STATE. Furthermore, there may be a KVM_GET_NESTED_STATE
> before the first KVM_RUN.
> 
> Introduce a new boolean, svm->nested.legacy_gpat_semantics. When set, hPAT
> updates are also applied to gPAT, preserving the old behavior where L2
> shared L1's PAT. svm_vcpu_pre_run() clears this boolean at the first
> KVM_RUN.

State this last point as a command and explain why.  E.g. I think this is why?

  Clear legacy_gpat_semantics on KVM_RUN so that the legacy semantics are
  scoped to a single restore (inasmuch as possible).  E.g. to support
  restoring a snapshot from an old KVM, and then later restoring a snapshot
  from a new KVM.


* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 19:15           ` Yosry Ahmed
@ 2026-02-06 19:50             ` Sean Christopherson
  0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 19:50 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Jim Mattson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 06, 2026, Yosry Ahmed wrote:
> On Fri, Feb 06, 2026 at 11:12:36AM -0800, Sean Christopherson wrote:
> > Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
> > opportunistically drop the useless single-use wrappers (probably in a standalone
> > patch to plumb in @ctrl).  E.g. (completely untested)
> 
> They are dropped by
> https://lore.kernel.org/kvm/20260206190851.860662-9-yosry.ahmed@linux.dev/.

At least I'm consistent?  Hmm.  Jim, can you base your next version on v5 of
Yosry's series?  I think when I apply, I'll splice in this series between patches
20 and 21.  I could resolve the conflicts when applying, but since I'll probably
tag these for stable@ as well, I'd rather apply patches that are closer to what
was posted.


* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 19:12         ` Sean Christopherson
  2026-02-06 19:15           ` Yosry Ahmed
@ 2026-02-06 20:56           ` Jim Mattson
  2026-02-06 23:07             ` Sean Christopherson
  1 sibling, 1 reply; 23+ messages in thread
From: Jim Mattson @ 2026-02-06 20:56 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yosry Ahmed, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 6, 2026 at 11:12 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 06, 2026, Jim Mattson wrote:
> > On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > >
> > > February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> > >
> > >
> > > >
> > > > On Thu, Feb 05, 2026, Jim Mattson wrote:
> > > >
> > > > >
> > > > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> > > > >  add a validity check so that when nested paging is enabled for vmcb12, an
> > > > >  invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> > > > >  specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> > > > >
> > > > >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > > > >  Signed-off-by: Jim Mattson <jmattson@google.com>
> > > > >  ---
> > > > >  arch/x86/kvm/svm/nested.c | 4 +++-
> > > > >  arch/x86/kvm/svm/svm.h | 3 +++
> > > > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > > > >
> > > > >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > >  index f72dbd10dcad..1d4ff6408b34 100644
> > > > >  --- a/arch/x86/kvm/svm/nested.c
> > > > >  +++ b/arch/x86/kvm/svm/nested.c
> > > > > @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > > > > 
> > > > >  	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > > >  	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > > > +	svm->nested.gpat = vmcb12->save.g_pat;
> > > > > 
> > > > >  	if (!nested_vmcb_check_save(vcpu) ||
> > > > > -	    !nested_vmcb_check_controls(vcpu)) {
> > > > > +	    !nested_vmcb_check_controls(vcpu) ||
> > > > > +	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > > > >  		vmcb12->control.exit_code    = SVM_EXIT_ERR;
> > > > >  		vmcb12->control.exit_info_1  = 0;
> > > > >  		vmcb12->control.exit_info_2  = 0;
> > > > > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > > > index 986d90f2d4ca..42a4bf83b3aa 100644
> > > > > --- a/arch/x86/kvm/svm/svm.h
> > > > > +++ b/arch/x86/kvm/svm/svm.h
> > > > > @@ -208,6 +208,9 @@ struct svm_nested_state {
> > > > >  	 */
> > > > >  	struct vmcb_save_area_cached save;
> > > > > 
> > > > > +	/* Cached guest PAT from vmcb12.save.g_pat */
> > > > > +	u64 gpat;
> > > > >
> > > > Shouldn't this go in vmcb_save_area_cached?
> > >
> > > I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.
>
> LOL, oh the irony:
>
>   I'm going to cache it on its own to avoid confusion.
>
> > Right. The two issues with putting it in vmcb_save_area_cached were:
> >
> > 1. Checking all of vmcb_save_area_cached requires access to the
> > corresponding control area (or at least the boolean, "NPT enabled.")
>
> Checking the control area seems like the right answer (I went down that path
> before reading this).
>
> > 2. In the nested state serialization payload, everything else in the
> > vmcb_save_area_cached comes from L1 (host state to be restored at
> > emulated #VMEXIT.)
>
> Hmm, right, but *because* it's ignored, that gives us carte blanche to clobber it.
> More below.
>
> > The first issue was a little messy, but not that distasteful.
>
> I actually find it the opposite of distasteful.  KVM definitely _should_ be
> checking the controls, not the vCPU state.  If it weren't for needing to get at
> MAXPHYADDR in CPUID, I'd push to drop @vcpu entirely.
>
> > The second issue was really a mess.
>
> I'd rather have the mess contained and documented though.  Caching g_pat outside
> of vmcb_save_area_cached bleeds the mess into all of the relevant nSVM code, and
> doesn't leave any breadcrumbs in the code/comments to explain that it "needs" to
> be kept separate.
>
> AFAICT, the only "problem" is that g_pat in the serialization payload will be
> garbage when restoring state from an older KVM.  But that's totally fine, precisely
> because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's always resident
> in vcpu->arch.pat.  So can't we just do this to avoid a spurious -EINVAL?
>
>         /*
>          * Validate host state saved from before VMRUN (see
>          * nested_svm_check_permissions).
>          */
>         __nested_copy_vmcb_save_to_cache(&save_cached, save);
>
>         /*
>          * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
>          * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
>          * vcpu->arch.pat, i.e. gPAT is a reflection of vcpu->arch.pat, not the
>          * other way around.
>          */
>         save_cached.g_pat = vcpu->arch.pat;

Your comment is a bit optimistic. Qemu, for instance, hasn't restored
MSRs yet, so vcpu->arch.pat will actually be the current vCPU's PAT
(in the case of snapshot restore, some future PAT). But, in any case,
it should be a valid PAT.

>         if (!(save->cr0 & X86_CR0_PG) ||
>             !(save->cr0 & X86_CR0_PE) ||
>             (save->rflags & X86_EFLAGS_VM) ||
>             !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))

Wrong ctl_cached. Those are the vmcb02 controls, but we are checking
the vmcb01 save state.

I think it would be better to add a boolean argument, "check_gpat,"
which will be false at this call site and nested_npt_enabled(vcpu) at
the other call site.

>                 goto out_free;
>
> Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
> opportunistically drop the useless single-use wrappers (probably in a standalone
> patch to plumb in @ctrl).  E.g. (completely untested)
>
> ---
>  arch/x86/kvm/svm/nested.c | 71 ++++++++++++++++++---------------------
>  arch/x86/kvm/svm/svm.c    |  2 +-
>  arch/x86/kvm/svm/svm.h    |  6 ++--
>  3 files changed, 35 insertions(+), 44 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index a7d6fc1382a7..a429947c8966 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -339,8 +339,8 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu *vcpu, u64 pa, u32 size)
>             kvm_vcpu_is_legal_gpa(vcpu, addr + size - 1);
>  }
>
> -static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
> -                                        struct vmcb_ctrl_area_cached *control)
> +static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
> +                                      struct vmcb_ctrl_area_cached *control)
>  {
>         if (CC(!vmcb12_is_intercept(control, INTERCEPT_VMRUN)))
>                 return false;
> @@ -367,8 +367,9 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
>  }
>
>  /* Common checks that apply to both L1 and L2 state.  */
> -static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
> -                                    struct vmcb_save_area_cached *save)
> +static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu,
> +                                  struct vmcb_ctrl_area_cached *ctrl,
> +                                  struct vmcb_save_area_cached *save)
>  {
>         if (CC(!(save->efer & EFER_SVME)))
>                 return false;
> @@ -399,25 +400,13 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
>         if (CC(!kvm_valid_efer(vcpu, save->efer)))
>                 return false;
>
> +       if (CC(ctrl->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
> +              !kvm_pat_valid(save->g_pat))
> +               return false;
> +
>         return true;
>  }
>
> -static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu)
> -{
> -       struct vcpu_svm *svm = to_svm(vcpu);
> -       struct vmcb_save_area_cached *save = &svm->nested.save;
> -
> -       return __nested_vmcb_check_save(vcpu, save);
> -}
> -
> -static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
> -{
> -       struct vcpu_svm *svm = to_svm(vcpu);
> -       struct vmcb_ctrl_area_cached *ctl = &svm->nested.ctl;
> -
> -       return __nested_vmcb_check_controls(vcpu, ctl);
> -}
> -
>  /*
>   * If a feature is not advertised to L1, clear the corresponding vmcb12
>   * intercept.
> @@ -504,6 +493,9 @@ static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
>
>         to->dr6 = from->dr6;
>         to->dr7 = from->dr7;
> +
> +       to->g_pat = from->g_pat;
> +
>  }
>
>  void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
> @@ -644,17 +636,14 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>                 svm->nested.force_msr_bitmap_recalc = true;
>         }
>
> -       if (npt_enabled) {
> -               if (nested_npt_enabled(svm)) {
> -                       if (unlikely(new_vmcb12 ||
> -                                    vmcb_is_dirty(vmcb12, VMCB_NPT))) {
> -                               vmcb02->save.g_pat = svm->nested.gpat;
> -                               vmcb_mark_dirty(vmcb02, VMCB_NPT);
> -                       }
> -               } else {
> -                       vmcb02->save.g_pat = vcpu->arch.pat;
> +       if (nested_npt_enabled(svm)) {
> +               if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
> +                       vmcb02->save.g_pat = svm->nested.save.g_pat;
>                         vmcb_mark_dirty(vmcb02, VMCB_NPT);
>                 }
> +       } else if (npt_enabled) {
> +               vmcb02->save.g_pat = vcpu->arch.pat;
> +               vmcb_mark_dirty(vmcb02, VMCB_NPT);
>         }
>
>         if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
> @@ -1028,11 +1017,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
>
>         nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
>         nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> -       svm->nested.gpat = vmcb12->save.g_pat;
>
> -       if (!nested_vmcb_check_save(vcpu) ||
> -           !nested_vmcb_check_controls(vcpu) ||
> -           (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> +       if (!nested_vmcb_check_save(vcpu, &svm->nested.ctl, &svm->nested.save) ||
> +           !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
>                 vmcb12->control.exit_code    = SVM_EXIT_ERR;
>                 vmcb12->control.exit_info_1  = 0;
>                 vmcb12->control.exit_info_2  = 0;
> @@ -1766,7 +1753,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
>                 kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
>                 if (nested_npt_enabled(svm)) {
>                         kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
> -                       kvm_state.hdr.svm.gpat = svm->nested.gpat;
> +                       kvm_state.hdr.svm.gpat = svm->nested.save.g_pat;
>                 }
>                 kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
>                 kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
> @@ -1871,7 +1858,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>
>         ret = -EINVAL;
>         __nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl);
> -       if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
> +       if (!nested_vmcb_check_controls(vcpu, &ctl_cached))
>                 goto out_free;
>
>         /*
> @@ -1887,15 +1874,21 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>          * nested_svm_check_permissions).
>          */
>         __nested_copy_vmcb_save_to_cache(&save_cached, save);
> +
> +       /*
> +        * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
> +        * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
> +        * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
> +        * other way around.
> +        */
> +       save_cached.g_pat = vcpu->arch.pat;
> +
>         if (!(save->cr0 & X86_CR0_PG) ||
>             !(save->cr0 & X86_CR0_PE) ||
>             (save->rflags & X86_EFLAGS_VM) ||
> -           !__nested_vmcb_check_save(vcpu, &save_cached))
> +           !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
>                 goto out_free;
>
> -       /*
> -        * Validate gPAT, if provided.
> -        */
>         if ((kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) &&
>             !kvm_pat_valid(kvm_state->hdr.svm.gpat))
>                 goto out_free;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index a6a44deec82b..bf8562a5f655 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2862,7 +2862,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                 WARN_ON_ONCE(msr_info->host_initiated && vcpu->wants_to_run);
>                 if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
>                     nested_npt_enabled(svm))
> -                       msr_info->data = svm->nested.gpat;
> +                       msr_info->data = svm->nested.save.g_pat;
>                 else
>                         msr_info->data = vcpu->arch.pat;
>                 break;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index a559cd45c8a9..6f07d8e3f06e 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -146,6 +146,7 @@ struct vmcb_save_area_cached {
>         u64 cr0;
>         u64 dr7;
>         u64 dr6;
> +       u64 g_pat;
>  };
>
>  struct vmcb_ctrl_area_cached {
> @@ -208,9 +209,6 @@ struct svm_nested_state {
>          */
>         struct vmcb_save_area_cached save;
>
> -       /* Cached guest PAT from vmcb12.save.g_pat */
> -       u64 gpat;
> -
>         bool initialized;
>
>         /*
> @@ -599,7 +597,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
>
>  static inline void svm_set_gpat(struct vcpu_svm *svm, u64 data)
>  {
> -       svm->nested.gpat = data;
> +       svm->nested.save.g_pat = data;
>         svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
>  }
>
>
> base-commit: 6461c50e232d6f81d5b9604236f7ee3df870e932
> --


* Re: [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state
  2026-02-06 19:17   ` Sean Christopherson
@ 2026-02-06 22:38     ` Jim Mattson
  0 siblings, 0 replies; 23+ messages in thread
From: Jim Mattson @ 2026-02-06 22:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Shuah Khan, kvm, linux-kernel,
	linux-kselftest, Yosry Ahmed

On Fri, Feb 6, 2026 at 11:18 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Feb 05, 2026, Jim Mattson wrote:
> > When nested NPT is enabled and KVM_SET_NESTED_STATE is used to restore an
> > old checkpoint (without a valid gPAT), the current IA32_PAT value must be
> > used as L2's gPAT.
> >
> > The current IA32_PAT value may be restored by KVM_SET_MSRS after
> > KVM_SET_NESTED_STATE. Furthermore, there may be a KVM_GET_NESTED_STATE
> > before the first KVM_RUN.
> >
> > Introduce a new boolean, svm->nested.legacy_gpat_semantics. When set, hPAT
> > updates are also applied to gPAT, preserving the old behavior where L2
> > shared L1's PAT. svm_vcpu_pre_run() clears this boolean at the first
> > KVM_RUN.
>
> State this last point as a command and explain why.  E.g. I think this is why?
>
>   Clear legacy_gpat_semantics on KVM_RUN so that the legacy semantics are
>   scoped to a single restore (inasmuch as possible).  E.g. to support
>   restoring a snapshot from an old KVM, and then later restoring a snapshot
>   from a new KVM.

Actually, it's just because I'd like to return to normal behavior once
the non-atomic restore operation is complete. I'll add that note.

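[Editorial note: the restore ordering discussed above (KVM_SET_NESTED_STATE, then KVM_SET_MSRS, then the first KVM_RUN) can be modeled as a small state machine. Everything below is an invented toy; only the field name legacy_gpat_semantics and the described behavior come from the patch discussion.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the relevant vCPU state; not the real structures. */
struct toy_vcpu {
	uint64_t hpat;              /* models vcpu->arch.pat (L1's PAT) */
	uint64_t gpat;              /* models the cached vmcb12 g_pat (L2's PAT) */
	bool legacy_gpat_semantics; /* set when restoring a checkpoint w/o a valid gPAT */
};

/* Models a PAT MSR write during/after a legacy-state restore. */
static void toy_set_pat(struct toy_vcpu *v, uint64_t data)
{
	v->hpat = data;
	if (v->legacy_gpat_semantics)
		v->gpat = data; /* old behavior: L2 shares L1's PAT */
}

/* Models the first KVM_RUN ending the legacy-restore window. */
static void toy_pre_run(struct toy_vcpu *v)
{
	v->legacy_gpat_semantics = false;
}
```

The toy shows why clearing the flag at first KVM_RUN gives the behavior Jim describes: hPAT writes made by userspace during the non-atomic restore still reach gPAT, but once the guest runs, the two PATs are independent again.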

* Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
  2026-02-06 20:56           ` Jim Mattson
@ 2026-02-06 23:07             ` Sean Christopherson
  0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2026-02-06 23:07 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yosry Ahmed, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Fri, Feb 06, 2026, Jim Mattson wrote:
> On Fri, Feb 6, 2026 at 11:12 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Feb 06, 2026, Jim Mattson wrote:
> > > On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > > >
> > > > February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> > AFAICT, the only "problem" is that g_pat in the serialization payload will be
> > garbage when restoring state from an older KVM.  But that's totally fine, precisely
> > because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's always resident
> > in vcpu->arch.pat.  So can't we just do this to avoid a spurious -EINVAL?
> >
> >         /*
> >          * Validate host state saved from before VMRUN (see
> >          * nested_svm_check_permissions).
> >          */
> >         __nested_copy_vmcb_save_to_cache(&save_cached, save);
> >
> >         /*
> >          * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
> >          * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
> >          * vcpu->arch.pat, i.e. gPAT is a reflection of vcpu->arch.pat, not the
> >          * other way around.
> >          */
> >         save_cached.g_pat = vcpu->arch.pat;
> 
> Your comment is a bit optimistic. Qemu, for instance, hasn't restored
> MSRs yet, so vcpu->arch.pat will actually be the current vCPU's PAT
> (in the case of snapshot restore, some future PAT).

Yeah, FWIW, I was _trying_ to account for that by not explicitly saying that arch.pat
is the "new" L1 state, but it's difficult to dance around :-/

> But, in any case, it should be a valid PAT.
>
> >         if (!(save->cr0 & X86_CR0_PG) ||
> >             !(save->cr0 & X86_CR0_PE) ||
> >             (save->rflags & X86_EFLAGS_VM) ||
> >             !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
> 
> Wrong ctl_cached. Those are the vmcb02 controls, but we are checking
> the vmcb01 save state.

*sigh*

> I think it would be better to add a boolean argument, "check_gpat,"
> which will be false at this call site and nested_npt_enabled(vcpu) at
> the other call site.

Yeah, agreed.  Because even though arch.pat should be valid, IIUC there isn't a
consistency check on hPAT because it's never reloaded.


* Re: [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2
  2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
@ 2026-02-09 16:05   ` Yosry Ahmed
  0 siblings, 0 replies; 23+ messages in thread
From: Yosry Ahmed @ 2026-02-09 16:05 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	kvm, linux-kernel, linux-kselftest

On Thu, Feb 05, 2026 at 01:43:01PM -0800, Jim Mattson wrote:
> When running an L2 guest and writing to MSR_IA32_CR_PAT, the host PAT value
> is stored in vmcb01.ptr->save.g_pat, but the clean bit was only being
> cleared for svm->vmcb, which points to vmcb02 in guest mode.
> 
> Introduce the helper svm_set_vmcb_gpat() which sets vmcb->save.g_pat and
> marks the VMCB dirty for VMCB_NPT. Use this helper in both svm_set_msr()
> for updating vmcb01 and in nested_vmcb02_compute_g_pat() for updating
> vmcb02, ensuring both VMCBs are properly marked dirty.
> 
> Fixes: 4995a3685f1b ("KVM: SVM: Use a separate vmcb for the nested L2 guest")
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  arch/x86/kvm/svm/nested.c | 2 +-
>  arch/x86/kvm/svm/svm.c    | 3 +--
>  arch/x86/kvm/svm/svm.h    | 6 ++++++
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index de90b104a0dd..f72dbd10dcad 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -636,7 +636,7 @@ void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm)
>  		return;
>  
>  	/* FIXME: merge g_pat from vmcb01 and vmcb12.  */
> -	svm->nested.vmcb02.ptr->save.g_pat = svm->vmcb01.ptr->save.g_pat;
> +	svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, svm->vmcb01.ptr->save.g_pat);
>  }
>  
>  static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12)
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5f0136dbdde6..08f145eb9215 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2939,10 +2939,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		if (ret)
>  			break;
>  
> -		svm->vmcb01.ptr->save.g_pat = data;
> +		svm_set_vmcb_gpat(svm->vmcb01.ptr, data);
>  		if (is_guest_mode(vcpu))
>  			nested_vmcb02_compute_g_pat(svm);
> -		vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>  		break;
>  	case MSR_IA32_SPEC_CTRL:
>  		if (!msr->host_initiated &&
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index ebd7b36b1ceb..986d90f2d4ca 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -420,6 +420,12 @@ static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
>          return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
>  }
>  
> +static inline void svm_set_vmcb_gpat(struct vmcb *vmcb, u64 data)

Nit: vmcb_set_gpat() is probably more consistent with other helpers
(e.g. vmcb_set_intercept() and vmcb_set_seg()).

> +{
> +	vmcb->save.g_pat = data;
> +	vmcb_mark_dirty(vmcb, VMCB_NPT);
> +}
> +
>  static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
>  {
>  	return container_of(vcpu, struct vcpu_svm, vcpu);
> -- 
> 2.53.0.rc2.204.g2597b5adb4-goog
> 

end of thread, other threads:[~2026-02-09 16:06 UTC | newest]

Thread overview: 23+ messages
2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
2026-02-09 16:05   ` Yosry Ahmed
2026-02-05 21:43 ` [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat Jim Mattson
2026-02-06 18:19   ` Sean Christopherson
2026-02-06 18:23     ` Yosry Ahmed
2026-02-06 18:32       ` Jim Mattson
2026-02-06 19:12         ` Sean Christopherson
2026-02-06 19:15           ` Yosry Ahmed
2026-02-06 19:50             ` Sean Christopherson
2026-02-06 20:56           ` Jim Mattson
2026-02-06 23:07             ` Sean Christopherson
2026-02-05 21:43 ` [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT Jim Mattson
2026-02-06 18:23   ` Sean Christopherson
2026-02-06 18:29     ` Yosry Ahmed
2026-02-06 19:14       ` Sean Christopherson
2026-02-05 21:43 ` [PATCH v3 4/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT Jim Mattson
2026-02-05 21:43 ` [PATCH v3 5/8] KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT Jim Mattson
2026-02-05 21:43 ` [PATCH v3 6/8] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE Jim Mattson
2026-02-05 21:43 ` [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state Jim Mattson
2026-02-06 19:17   ` Sean Christopherson
2026-02-06 22:38     ` Jim Mattson
2026-02-05 21:43 ` [PATCH v3 8/8] KVM: selftests: nSVM: Add svm_nested_pat test Jim Mattson
