public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: "Nikunj A. Dadhania" <nikunj@amd.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
	thomas.lendacky@amd.com,  bp@alien8.de,
	joao.m.martins@oracle.com, kai.huang@intel.com
Subject: Re: [PATCH v6 7/7] KVM: SVM: Add Page modification logging support
Date: Tue, 21 Apr 2026 08:08:28 -0700	[thread overview]
Message-ID: <aeeS7LDPJ6NCy3Uw@google.com> (raw)
In-Reply-To: <34cfe5e8-756a-435a-a73d-54bf69801161@amd.com>

On Mon, Apr 20, 2026, Nikunj A. Dadhania wrote:
> Sashiko reported a couple of issues [1]. Let me address them here:
> 
> On 4/7/2026 12:02 PM, Nikunj A Dadhania wrote:
> > @@ -1206,6 +1209,16 @@ static void init_vmcb(struct kvm_vcpu *vcpu, bool init_event)
> >  	if (vcpu->kvm->arch.bus_lock_detection_enabled)
> >  		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
> >  
> > +	if (pml) {
> > +		/*
> > +		 * Populate the page address and index here, PML is enabled
> > +		 * when dirty logging is enabled on the memslot through
> > +		 * svm_update_cpu_dirty_logging()
> > +		 */
> > +		control->pml_addr = (u64)__sme_set(page_to_phys(vcpu->arch.pml_page));
> > +		control->pml_index = PML_HEAD_INDEX;
> > +	}
> > +
> 
> > If the guest receives an INIT IPI and init_vmcb() is called to reset the
> > vCPU, does unconditionally setting pml_index to PML_HEAD_INDEX discard any
> > un-flushed dirty GPAs logged by the hardware?
> 
> There are two scenarios where init_vmcb() is called:
> 
> 1) During vCPU creation time, where we need to set pml_index to PML_HEAD_INDEX
> 2) During vCPU reset, when init_event=true

Nit, please just call this INIT, even though KVM calls the function "reset".  #1
is KVM's one and only true RESET flow.

>   Before vCPU reset:
>     vcpu_enter_guest()
>       └─> kvm_x86_call(vcpu_run) [VMRUN]
>       └─> [guest executes, PML accumulates dirty pages]
>       └─> VMEXIT
>       └─> svm_handle_exit() --> PML buffer flushed here
>       └─> return to vcpu_run()
> 
>   vCPU Reset:
>     vcpu_enter_guest()
>       ├─> kvm_check_request(KVM_REQ_EVENT)
>       ├─> kvm_apic_accept_events()
>       │     └─> kvm_vcpu_reset(..., true)
>       │           └─> init_vmcb(..., true)
>       │                 └─> control->pml_index = PML_HEAD_INDEX -- PML buffer was already flushed
>       └─> kvm_x86_call(): Next VMRUN
> 
> > Could this result in the hypervisor losing track of dirty memory during live
> > migration, leading to memory corruption on the destination host, since
> > svm_flush_pml_buffer() isn't called before resetting the index?
> 
> AFAIU, no. The PML buffer is always flushed opportunistically at every VM exit.

Huh.  There's a pre-existing bug here.  Commit f7f39c50edb9 ("KVM: x86: Exit to
userspace if fastpath triggers one on instruction skip") added a path that skips
kvm_x86_ops.handle_exit(), and specifically can give userspace control without
going through vmx_flush_pml_buffer():

	if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
		return 0;

	r = kvm_x86_call(handle_exit)(vcpu, exit_fastpath);

Given that SVM support for PML is (obviously) on its way, it's mildly tempting
to add a dedicated kvm_x86_ops hook to flush the buffer on a fastpath userspace
exit.  But, I dislike one-off kvm_x86_ops hooks, and that only works if there's
no other vendor action required.  E.g. very theoretically, a fastpath userspace
exit could also be coincident with bus_lock_detected.

Yikes!  And I think userspace could see a stale CR0 and/or CR3 on AMD.  Hmm, but
waiting until the full exit path to grab CR0 and CR3 is flawed on its own, e.g.
it's a bug waiting to happen if KVM consumes CR3 in the fastpath.

So over two patches, I think the fix for those issues is the below.  I'll test
and send a mini-series.  I don't think there's anything you need to do; I can
resolve the resulting conflict easily enough.

diff --git arch/x86/kvm/svm/svm.c arch/x86/kvm/svm/svm.c
index e7fdd7a9c280..df0bd132edf7 100644
--- arch/x86/kvm/svm/svm.c
+++ arch/x86/kvm/svm/svm.c
@@ -3644,13 +3644,8 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
        struct vcpu_svm *svm = to_svm(vcpu);
        struct kvm_run *kvm_run = vcpu->run;
 
-       /* SEV-ES guests must use the CR write traps to track CR registers. */
-       if (!is_sev_es_guest(vcpu)) {
-               if (!svm_is_intercept(svm, INTERCEPT_CR0_WRITE))
-                       vcpu->arch.cr0 = svm->vmcb->save.cr0;
-               if (npt_enabled)
-                       vcpu->arch.cr3 = svm->vmcb->save.cr3;
-       }
+       if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
+               return 0;
 
        if (is_guest_mode(vcpu)) {
                int vmexit;
@@ -4502,11 +4497,17 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
        if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
                x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
 
+       /* SEV-ES guests must use the CR write traps to track CR registers. */
        if (!is_sev_es_guest(vcpu)) {
                vcpu->arch.cr2 = svm->vmcb->save.cr2;
                vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
                vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
                vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
+
+               if (!svm_is_intercept(svm, INTERCEPT_CR0_WRITE))
+                       vcpu->arch.cr0 = svm->vmcb->save.cr0;
+               if (npt_enabled)
+                       vcpu->arch.cr3 = svm->vmcb->save.cr3;
        }
        vcpu->arch.regs_dirty = 0;
 
diff --git arch/x86/kvm/vmx/vmx.c arch/x86/kvm/vmx/vmx.c
index a29896a9ef14..4cb355ecfe46 100644
--- arch/x86/kvm/vmx/vmx.c
+++ arch/x86/kvm/vmx/vmx.c
@@ -6687,6 +6687,9 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
        if (enable_pml && !is_guest_mode(vcpu))
                vmx_flush_pml_buffer(vcpu);
 
+       if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
+               return 0;
+
        /*
         * KVM should never reach this point with a pending nested VM-Enter.
         * More specifically, short-circuiting VM-Entry to emulate L2 due to
diff --git arch/x86/kvm/x86.c arch/x86/kvm/x86.c
index 0a1b63c63d1a..9ad7ec3bf0f1 100644
--- arch/x86/kvm/x86.c
+++ arch/x86/kvm/x86.c
@@ -11602,9 +11602,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
        if (vcpu->arch.apic_attention)
                kvm_lapic_sync_from_vapic(vcpu);
 
-       if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
-               return 0;
-
        r = kvm_x86_call(handle_exit)(vcpu, exit_fastpath);
        return r;

Thread overview: 10+ messages
2026-04-07  6:32 [PATCH v6 0/7] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 1/7] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 2/7] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 3/7] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 4/7] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 5/7] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 6/7] KVM: nSVM: Add helpers to temporarily switch to vmcb01 Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 7/7] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
2026-04-20  6:38   ` Nikunj A. Dadhania
2026-04-21 15:08     ` Sean Christopherson [this message]
