All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 3/4] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions
Date: Wed, 27 May 2026 16:53:21 -0700	[thread overview]
Message-ID: <aheD8XjfMrL9wGFa@google.com> (raw)
In-Reply-To: <20260527120600.20696-4-pbonzini@redhat.com>

On Wed, May 27, 2026, Paolo Bonzini wrote:
> When L2 runs under nested NPT and uses PAE paging, KVM's cached PDPTRs
> in mmu->pdptrs[] can hold stale or wrong values after nested
> transitions and across migration restore, because both
> nested_svm_load_cr3() and svm_get_nested_state_pages() only refresh
> PDPTRs on the !nested_npt path.
> 
> The user-visible bug is on migration restore of an L2 running with nested
> NPT and 32-bit PAE paging, if userspace uses KVM_SET_SREGS rather than
> KVM_SET_SREGS2.  In that case, load_pdptrs() leaves VCPU_EXREG_PDPTR
> marked as available, and kvm_pdptr_read() will use a stale translation
> that used L1 GPAs instead of L2 nGPAs.  svm_get_nested_state_pages()
> runs on first KVM_RUN but skips the refresh because nested_npt_enabled()
> is true.  The CPU itself reads L2's PDPTRs correctly from memory via
> L1's NPT, but KVM-side walking of guest PAE page tables uses the bogus
> cached values.
> 
> Unlike Intel's GUEST_PDPTR0..3 fields in the VMCS, SVM has no
> VMCB-cached PDPTR state: the in-memory PDPTEs at the current CR3 are
> the only source of truth, and svm_cache_reg(VCPU_EXREG_PDPTR) simply
> reloads them from memory via load_pdptrs().  Clearing the avail
> bit (and the dirty bit because !avail/dirty is invalid) to force
> a reload when PDPTRs as needed fixes the bug.
> 
> Do the same for nested_svm_load_cr3()'s nested_npt branch, so that
> the invariant "PDPTRs need reloading" is handled similarly for both
> immediate and deferred loading.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/kvm_cache_regs.h |  8 ++++++++
>  arch/x86/kvm/svm/nested.c     | 27 ++++++++++++++++++---------
>  2 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
> index 2ae492ad6412..6bae5db5a54e 100644
> --- a/arch/x86/kvm/kvm_cache_regs.h
> +++ b/arch/x86/kvm/kvm_cache_regs.h
> @@ -77,6 +77,14 @@ static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
>  	return test_bit(reg, vcpu->arch.regs_dirty);
>  }
>  
> +static inline void kvm_register_mark_for_reload(struct kvm_vcpu *vcpu,
> +					       enum kvm_reg reg)

I would rather use kvm_clear_available_registers() than add yet another API,
which isn't even a good fit here since SVM never expects the PDPTRs to be dirty.

Though I think it's a moot point, because nSVM should be clearing *all*
lazy-loaded registers.  It just so happens that PDPTRs are the only such "register".

I haven't checked to see if this would actually be correct, I'm just mimicking
the nVMX code.  But conceptually, I think we want something like so:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e74fcde6155e..0c6ab00766b1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1303,6 +1303,8 @@ void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb)
 {
        svm->current_vmcb = target_vmcb;
        svm->vmcb = target_vmcb->ptr;
+
+       kvm_clear_available_registers(&svm->vcpu, SVM_REGS_LAZY_LOAD_SET);
 }
 
 static int svm_vcpu_precreate(struct kvm *kvm)

  reply	other threads:[~2026-05-27 23:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-27 12:05 [PATCH 0/4] KVM: x86: small MMU-adjacent cleanups Paolo Bonzini
2026-05-27 12:05 ` [PATCH 1/4] KVM: x86: remove nested_mmu from mmu_is_nested() Paolo Bonzini
2026-05-27 12:05 ` [PATCH 2/4] KVM: nVMX: remove unnecessary code in prepare_vmcs02 Paolo Bonzini
2026-05-28  0:14   ` Sean Christopherson
2026-05-28  7:56     ` Paolo Bonzini
2026-05-27 12:05 ` [PATCH 3/4] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions Paolo Bonzini
2026-05-27 23:53   ` Sean Christopherson [this message]
2026-05-28  8:33     ` Paolo Bonzini
2026-05-28 18:33       ` Jim Mattson
2026-05-30 16:54         ` Paolo Bonzini
2026-05-27 12:06 ` [PATCH 4/4] KVM: x86: check that kvm_handle_invpcid is only invoked with shadow paging Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aheD8XjfMrL9wGFa@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.