Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 3/4] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions
Date: Wed, 27 May 2026 16:53:21 -0700	[thread overview]
Message-ID: <aheD8XjfMrL9wGFa@google.com> (raw)
In-Reply-To: <20260527120600.20696-4-pbonzini@redhat.com>

On Wed, May 27, 2026, Paolo Bonzini wrote:
> When L2 runs under nested NPT and uses PAE paging, KVM's cached PDPTRs
> in mmu->pdptrs[] can hold stale or wrong values after nested
> transitions and across migration restore, because both
> nested_svm_load_cr3() and svm_get_nested_state_pages() only refresh
> PDPTRs on the !nested_npt path.
> 
> The user-visible bug is on migration restore of an L2 running with nested
> NPT and 32-bit PAE paging, if userspace uses KVM_SET_SREGS rather than
> KVM_SET_SREGS2.  In that case, load_pdptrs() leaves VCPU_EXREG_PDPTR
> marked as available, and kvm_pdptr_read() will use a stale translation
> that used L1 GPAs instead of L2 nGPAs.  svm_get_nested_state_pages()
> runs on first KVM_RUN but skips the refresh because nested_npt_enabled()
> is true.  The CPU itself reads L2's PDPTRs correctly from memory via
> L1's NPT, but KVM-side walking of guest PAE page tables uses the bogus
> cached values.
> 
> Unlike Intel's GUEST_PDPTR0..3 fields in the VMCS, SVM has no
> VMCB-cached PDPTR state: the in-memory PDPTEs at the current CR3 are
> the only source of truth, and svm_cache_reg(VCPU_EXREG_PDPTR) simply
> reloads them from memory via load_pdptrs().  Clearing the avail
> bit (and the dirty bit because !avail/dirty is invalid) to force
> a reload when PDPTRs as needed fixes the bug.
> 
> Do the same for nested_svm_load_cr3()'s nested_npt branch, so that
> the invariant "PDPTRs need reloading" is handled similarly for both
> immediate and deferred loading.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/kvm_cache_regs.h |  8 ++++++++
>  arch/x86/kvm/svm/nested.c     | 27 ++++++++++++++++++---------
>  2 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
> index 2ae492ad6412..6bae5db5a54e 100644
> --- a/arch/x86/kvm/kvm_cache_regs.h
> +++ b/arch/x86/kvm/kvm_cache_regs.h
> @@ -77,6 +77,14 @@ static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
>  	return test_bit(reg, vcpu->arch.regs_dirty);
>  }
>  
> +static inline void kvm_register_mark_for_reload(struct kvm_vcpu *vcpu,
> +					       enum kvm_reg reg)

I would rather use kvm_clear_available_registers() than add yet another API,
which isn't even a good fit here since SVM never expects the PDPTRs to be dirty.

Though I think it's a moot point, because nSVM should be clearing *all*
lazy-loaded registers.  It just so happens that PDPTRs are the only such "register".

I haven't checked to see if this would actually be correct, I'm just mimicking
the nVMX code.  But conceptually, I think we want something like so:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e74fcde6155e..0c6ab00766b1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1303,6 +1303,8 @@ void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb)
 {
        svm->current_vmcb = target_vmcb;
        svm->vmcb = target_vmcb->ptr;
+
+       kvm_clear_available_registers(&svm->vcpu, SVM_REGS_LAZY_LOAD_SET);
 }
 
 static int svm_vcpu_precreate(struct kvm *kvm)

  reply	other threads:[~2026-05-27 23:53 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-27 12:05 [PATCH 0/4] KVM: x86: small MMU-adjacent cleanups Paolo Bonzini
2026-05-27 12:05 ` [PATCH 1/4] KVM: x86: remove nested_mmu from mmu_is_nested() Paolo Bonzini
2026-05-27 12:05 ` [PATCH 2/4] KVM: nVMX: remove unnecessary code in prepare_vmcs02 Paolo Bonzini
2026-05-28  0:14   ` Sean Christopherson
2026-05-28  7:56     ` Paolo Bonzini
2026-05-27 12:05 ` [PATCH 3/4] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions Paolo Bonzini
2026-05-27 23:53   ` Sean Christopherson [this message]
2026-05-28  8:33     ` Paolo Bonzini
2026-05-28 18:33       ` Jim Mattson
2026-05-27 12:06 ` [PATCH 4/4] KVM: x86: check that kvm_handle_invpcid is only invoked with shadow paging Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aheD8XjfMrL9wGFa@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox