From: Sean Christopherson <seanjc@google.com>
To: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>,
Paolo Bonzini <pbonzini@redhat.com>,
Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Shuah Khan <shuah@kernel.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat
Date: Fri, 6 Feb 2026 11:12:36 -0800 [thread overview]
Message-ID: <aYY9JOMDBPDY48lA@google.com> (raw)
In-Reply-To: <CALMp9eTJAD4Dc88egovSjV-N2YHd8G80ZP-dL5wXFDAC+WR6fA@mail.gmail.com>
On Fri, Feb 06, 2026, Jim Mattson wrote:
> On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> >
> > February 6, 2026 at 10:19 AM, "Sean Christopherson" <seanjc@google.com> wrote:
> >
> >
> > >
> > > On Thu, Feb 05, 2026, Jim Mattson wrote:
> > >
> > > >
> > > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues, and
> > > > add a validity check so that when nested paging is enabled for vmcb12, an
> > > > invalid g_pat causes an immediate VMEXIT with exit code VMEXIT_INVALID, as
> > > > specified in the APM, volume 2: "Nested Paging and VMRUN/VMEXIT."
> > > >
> > > > Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > > > Signed-off-by: Jim Mattson <jmattson@google.com>
> > > > ---
> > > > arch/x86/kvm/svm/nested.c | 4 +++-
> > > > arch/x86/kvm/svm/svm.h | 3 +++
> > > > 2 files changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > index f72dbd10dcad..1d4ff6408b34 100644
> > > > --- a/arch/x86/kvm/svm/nested.c
> > > > +++ b/arch/x86/kvm/svm/nested.c
> > > > @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > > >
> > > > nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > > nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > > + svm->nested.gpat = vmcb12->save.g_pat;
> > > >
> > > > if (!nested_vmcb_check_save(vcpu) ||
> > > > - !nested_vmcb_check_controls(vcpu)) {
> > > > + !nested_vmcb_check_controls(vcpu) ||
> > > > + (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > > > vmcb12->control.exit_code = SVM_EXIT_ERR;
> > > > vmcb12->control.exit_info_1 = 0;
> > > > vmcb12->control.exit_info_2 = 0;
> > > > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > > index 986d90f2d4ca..42a4bf83b3aa 100644
> > > > --- a/arch/x86/kvm/svm/svm.h
> > > > +++ b/arch/x86/kvm/svm/svm.h
> > > > @@ -208,6 +208,9 @@ struct svm_nested_state {
> > > > */
> > > > struct vmcb_save_area_cached save;
> > > >
> > > > + /* Cached guest PAT from vmcb12.save.g_pat */
> > > > + u64 gpat;
> > > >
> > > Shouldn't this go in vmcb_save_area_cached?
> >
> > I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/20260115232154.3021475-4-jmattson@google.com/.
LOL, oh the irony:
I'm going to cache it on its own to avoid confusion.
> Right. The two issues with putting it in vmcb_save_area_cached were:
>
> 1. Checking all of vmcb_save_area_cached requires access to the
> corresponding control area (or at least the boolean, "NTP enabled.")
Checking the control area seems like the right answer (I went down that path
before reading this).
> 2. In the nested state serialization payload, everything else in the
> vmcb_save_area_cached comes from L1 (host state to be restored at
> emulated #VMEXIT.)
Hmm, right, but *because* it's ignored, that gives us carte blanche to clobber it.
More below.
> The first issue was a little messy, but not that distasteful.
I actually find it the opposite of distasteful. KVM definitely _should_ be
checking the controls, not the vCPU state. If it weren't for needing to get at
MAXPHYADDR in CPUID, I'd push to drop @vcpu entirely.
> The second issue was really a mess.
I'd rather have the mess contained and document though. Caching g_pat outside
of vmcb_save_area_cached bleeds the mess into all of the relevant nSVM code, and
doesn't leave any breadcrumbs in the code/comments to explain that it "needs" to
be kept separate.
AFAICT, the only "problem" is that g_pat in the serialization payload will be
garbage when restoring state from an older KVM. But that's totally fine, precisely
because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's always resident
in vcpu->arch.pat. So can't we just do this to avoid a spurious -EINVAL?
/*
* Validate host state saved from before VMRUN (see
* nested_svm_check_permissions).
*/
__nested_copy_vmcb_save_to_cache(&save_cached, save);
/*
* Stuff gPAT in L1's save state, as older KVM may not have saved L1's
* gPAT. L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
* vcpu->arch.pat, i.e. gPAT is a reflection of vcpu->arch.pat, not the
* other way around.
*/
save_cached.g_pat = vcpu->arch.pat;
if (!(save->cr0 & X86_CR0_PG) ||
!(save->cr0 & X86_CR0_PE) ||
(save->rflags & X86_EFLAGS_VM) ||
!nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
goto out_free;
Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
opportunistically drop the useless single-use wrappers (probably in a standalone
patch to plumb in @ctrl). E.g. (completely untested)
---
arch/x86/kvm/svm/nested.c | 71 ++++++++++++++++++---------------------
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/svm/svm.h | 6 ++--
3 files changed, 35 insertions(+), 44 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a7d6fc1382a7..a429947c8966 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -339,8 +339,8 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu *vcpu, u64 pa, u32 size)
kvm_vcpu_is_legal_gpa(vcpu, addr + size - 1);
}
-static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
- struct vmcb_ctrl_area_cached *control)
+static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
+ struct vmcb_ctrl_area_cached *control)
{
if (CC(!vmcb12_is_intercept(control, INTERCEPT_VMRUN)))
return false;
@@ -367,8 +367,9 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
}
/* Common checks that apply to both L1 and L2 state. */
-static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
- struct vmcb_save_area_cached *save)
+static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu,
+ struct vmcb_ctrl_area_cached *ctrl,
+ struct vmcb_save_area_cached *save)
{
if (CC(!(save->efer & EFER_SVME)))
return false;
@@ -399,25 +400,13 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
if (CC(!kvm_valid_efer(vcpu, save->efer)))
return false;
+ if (CC(ctrl->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
+ !kvm_pat_valid(save->g_pat))
+ return false;
+
return true;
}
-static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
- struct vmcb_save_area_cached *save = &svm->nested.save;
-
- return __nested_vmcb_check_save(vcpu, save);
-}
-
-static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
- struct vmcb_ctrl_area_cached *ctl = &svm->nested.ctl;
-
- return __nested_vmcb_check_controls(vcpu, ctl);
-}
-
/*
* If a feature is not advertised to L1, clear the corresponding vmcb12
* intercept.
@@ -504,6 +493,9 @@ static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
to->dr6 = from->dr6;
to->dr7 = from->dr7;
+
+ to->g_pat = from->g_pat;
+
}
void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
@@ -644,17 +636,14 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
svm->nested.force_msr_bitmap_recalc = true;
}
- if (npt_enabled) {
- if (nested_npt_enabled(svm)) {
- if (unlikely(new_vmcb12 ||
- vmcb_is_dirty(vmcb12, VMCB_NPT))) {
- vmcb02->save.g_pat = svm->nested.gpat;
- vmcb_mark_dirty(vmcb02, VMCB_NPT);
- }
- } else {
- vmcb02->save.g_pat = vcpu->arch.pat;
+ if (nested_npt_enabled(svm)) {
+ if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
+ vmcb02->save.g_pat = svm->nested.save.g_pat;
vmcb_mark_dirty(vmcb02, VMCB_NPT);
}
+ } else if (npt_enabled) {
+ vmcb02->save.g_pat = vcpu->arch.pat;
+ vmcb_mark_dirty(vmcb02, VMCB_NPT);
}
if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
@@ -1028,11 +1017,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
- svm->nested.gpat = vmcb12->save.g_pat;
- if (!nested_vmcb_check_save(vcpu) ||
- !nested_vmcb_check_controls(vcpu) ||
- (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
+ if (!nested_vmcb_check_save(vcpu, &svm->nested.ctl, &svm->nested.save) ||
+ !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
vmcb12->control.exit_code = SVM_EXIT_ERR;
vmcb12->control.exit_info_1 = 0;
vmcb12->control.exit_info_2 = 0;
@@ -1766,7 +1753,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
if (nested_npt_enabled(svm)) {
kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
- kvm_state.hdr.svm.gpat = svm->nested.gpat;
+ kvm_state.hdr.svm.gpat = svm->nested.save.g_pat;
}
kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
@@ -1871,7 +1858,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
ret = -EINVAL;
__nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl);
- if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
+ if (!nested_vmcb_check_controls(vcpu, &ctl_cached))
goto out_free;
/*
@@ -1887,15 +1874,21 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
* nested_svm_check_permissions).
*/
__nested_copy_vmcb_save_to_cache(&save_cached, save);
+
+ /*
+ * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
+ * gPAT. L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
+ * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
+ * other way around.
+ */
+ save_cached.g_pat = vcpu->arch.pat;
+
if (!(save->cr0 & X86_CR0_PG) ||
!(save->cr0 & X86_CR0_PE) ||
(save->rflags & X86_EFLAGS_VM) ||
- !__nested_vmcb_check_save(vcpu, &save_cached))
+ !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
goto out_free;
- /*
- * Validate gPAT, if provided.
- */
if ((kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) &&
!kvm_pat_valid(kvm_state->hdr.svm.gpat))
goto out_free;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a6a44deec82b..bf8562a5f655 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2862,7 +2862,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
WARN_ON_ONCE(msr_info->host_initiated && vcpu->wants_to_run);
if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
nested_npt_enabled(svm))
- msr_info->data = svm->nested.gpat;
+ msr_info->data = svm->nested.save.g_pat;
else
msr_info->data = vcpu->arch.pat;
break;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a559cd45c8a9..6f07d8e3f06e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -146,6 +146,7 @@ struct vmcb_save_area_cached {
u64 cr0;
u64 dr7;
u64 dr6;
+ u64 g_pat;
};
struct vmcb_ctrl_area_cached {
@@ -208,9 +209,6 @@ struct svm_nested_state {
*/
struct vmcb_save_area_cached save;
- /* Cached guest PAT from vmcb12.save.g_pat */
- u64 gpat;
-
bool initialized;
/*
@@ -599,7 +597,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
static inline void svm_set_gpat(struct vcpu_svm *svm, u64 data)
{
- svm->nested.gpat = data;
+ svm->nested.save.g_pat = data;
svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
}
base-commit: 6461c50e232d6f81d5b9604236f7ee3df870e932
--
next prev parent reply other threads:[~2026-02-06 19:12 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-05 21:43 [PATCH v3 0/8] KVM: x86: nSVM: Improve PAT virtualization Jim Mattson
2026-02-05 21:43 ` [PATCH v3 1/8] KVM: x86: nSVM: Clear VMCB_NPT clean bit when updating g_pat in L2 Jim Mattson
2026-02-09 16:05 ` Yosry Ahmed
2026-02-05 21:43 ` [PATCH v3 2/8] KVM: x86: nSVM: Cache and validate vmcb12 g_pat Jim Mattson
2026-02-06 18:19 ` Sean Christopherson
2026-02-06 18:23 ` Yosry Ahmed
2026-02-06 18:32 ` Jim Mattson
2026-02-06 19:12 ` Sean Christopherson [this message]
2026-02-06 19:15 ` Yosry Ahmed
2026-02-06 19:50 ` Sean Christopherson
2026-02-06 20:56 ` Jim Mattson
2026-02-06 23:07 ` Sean Christopherson
2026-02-05 21:43 ` [PATCH v3 3/8] KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT Jim Mattson
2026-02-06 18:23 ` Sean Christopherson
2026-02-06 18:29 ` Yosry Ahmed
2026-02-06 19:14 ` Sean Christopherson
2026-02-05 21:43 ` [PATCH v3 4/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT Jim Mattson
2026-02-05 21:43 ` [PATCH v3 5/8] KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT Jim Mattson
2026-02-05 21:43 ` [PATCH v3 6/8] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE Jim Mattson
2026-02-05 21:43 ` [PATCH v3 7/8] KVM: x86: nSVM: Handle restore of legacy nested state Jim Mattson
2026-02-06 19:17 ` Sean Christopherson
2026-02-06 22:38 ` Jim Mattson
2026-02-05 21:43 ` [PATCH v3 8/8] KVM: selftests: nSVM: Add svm_nested_pat test Jim Mattson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYY9JOMDBPDY48lA@google.com \
--to=seanjc@google.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=shuah@kernel.org \
--cc=tglx@kernel.org \
--cc=x86@kernel.org \
--cc=yosry.ahmed@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.