Re: [PATCH v6 7/7] KVM: SVM: Add Page modification logging support

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Kai Huang <kai.huang@intel.com>
Cc: "thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "pbonzini@redhat.com" <pbonzini@redhat.com>,
	 "joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
	"nikunj@amd.com" <nikunj@amd.com>,  "bp@alien8.de" <bp@alien8.de>
Subject: Re: [PATCH v6 7/7] KVM: SVM: Add Page modification logging support
Date: Wed, 22 Apr 2026 06:20:27 -0700	[thread overview]
Message-ID: <aejLG_w8H16EifMj@google.com> (raw)
In-Reply-To: <977e805339a29ab789650aa18cd320dd1e9e0c25.camel@intel.com>

On Wed, Apr 22, 2026, Kai Huang wrote:
> On Tue, 2026-04-21 at 17:30 -0700, Sean Christopherson wrote:
> > On Tue, Apr 21, 2026, Kai Huang wrote:
> > > On Tue, 2026-04-21 at 08:08 -0700, Sean Christopherson wrote:
> > > > >    vCPU Reset:
> > > > >      vcpu_enter_guest()
> > > > >        ├─> kvm_check_request(KVM_REQ_EVENT)
> > > > >        ├─> kvm_apic_accept_events()
> > > > >        │     └─> kvm_vcpu_reset(..., true)
> > > > >        │           └─> init_vmcb(..., true)
> > > > >        │                 └─> control->pml_index = PML_HEAD_INDEX -- PML buffer was already flushed
> > > > >        └─> kvm_x86_call(): Next VMRUN
> > > > > 
> > > > > > Could this result in the hypervisor losing track of dirty memory during live
> > > > > > migration, leading to memory corruption on the destination host, since
> > > > > > svm_flush_pml_buffer() isn't called before resetting the index?
> > > > > 
> > > > > AFAIU, no. The PML buffer is always flushed opportunistically at every VM exit.
> > > > 
> > > > Huh.  There's a pre-existing bug here.  Commit f7f39c50edb9 ("KVM: x86: Exit to
> > > > userspace if fastpath triggers one on instruction skip") added a path that skips
> > > > kvm_x86_ops.handle_exit(), and specifically can give userspace control without
> > > > going through vmx_flush_pml_buffer():
> > > > 
> > > > 	if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
> > > > 		return 0;
> > > > 
> > > > 	r = kvm_x86_call(handle_exit)(vcpu, exit_fastpath);
> > > > 
> > > > Given that SVM support for PML is (obviously) on its way, it's mildly tempting
> > > > to add a dedicated kvm_x86_ops hook to flush the buffer on a fastpath userspace
> > > > exit.  But, I dislike one-off kvm_x86_ops hooks, and that only works if there's
> > > > no other vendor action required.  E.g. very theoretically, a fastpath userspace
> > > > exit could also be coincident with bus_lock_detected.
> > > 
> > > Seems vmx_vcpu_reset() doesn't reset PML index upon INIT event, which seems
> > > to be fine since we are not losing any dirty GPA tracking AFAICT (otherwise
> > > we already have a bug for VMX here)?
> > > 
> > > How about doing the same for SVM?
> > 
> > We don't really have that luxury.  On SHUTDOWN (even intercepted SHUTDOWN), the
> > state of the VMCB is technically undefined.  I.e. KVM needs to write _something_.
> 
> You mean KVM needs to reset VMCB to reflect the architecturally defined INIT
> state for a vCPU?  Or the hardware itself may reset VMCB thus may reset PML
> index?

Neither.  The APM states that the VMCB is undefined after SHUTDOWN.   PML index
could be anything:

  15.14.3 Shutdown Intercept
  When this intercept occurs, any condition that normally causes a shutdown causes a #VMEXIT to the
  VMM instead. After an intercepted shutdown, the state saved in the VMCB is undefined.

KVM synthesizes an INIT because it's the least awful option.

  static int shutdown_interception(struct kvm_vcpu *vcpu)
  {
	struct kvm_run *kvm_run = vcpu->run;
	struct vcpu_svm *svm = to_svm(vcpu);


	/*
	 * VMCB is undefined after a SHUTDOWN intercept.  INIT the vCPU to put
	 * the VMCB in a known good state.  Unfortuately, KVM doesn't have
	 * KVM_MP_STATE_SHUTDOWN and can't add it without potentially breaking
	 * userspace.  At a platform view, INIT is acceptable behavior as
	 * there exist bare metal platforms that automatically INIT the CPU
	 * in response to shutdown.
	 *
	 * The VM save area for SEV-ES guests has already been encrypted so it
	 * cannot be reinitialized, i.e. synthesizing INIT is futile.
	 */
	if (!is_sev_es_guest(vcpu)) {
		clear_page(svm->vmcb);
#ifdef CONFIG_KVM_SMM
		if (is_smm(vcpu))
			kvm_smm_changed(vcpu, false);
#endif
		kvm_vcpu_reset(vcpu, true);
	}

	kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
	return 0;
  }

Now, maybe the APM is trying to say only the save area is undefined, in which
case PML Index is fine and can and should be left alone.  But if that's the case,
the APM needs to be updated to make explicitly clear what fields in the VMCS are
and are not valid after SHUTDOWN.

> > Hmm, actually, how is that going to work?  Dropping PML entries just because a
> > vCPU hit SHUTDOWN isn't going to fly.  
> > 
> 
> Not dropping, but just leave PML index unchanged.  The PML buffer itself is
> still there unchanged, and PML is still working in hardware thus the buffer
> will eventually get flushed.

As above, the PML index could be garbage.

> This is the case for VMX AFAICT, thus I wonder whether this also works for
> SVM.

No, VMX is fine.

> > E.g. if a VM crashes while it's being
> > migrated by the host, then it could end up with corrupted, incoherent data on
> > the target due to leaving dirtied pages behind.
> 
> If we leave PML index unchanged upon INIT as mentioned above, I don't think
> we will lose any dirty GPA tracking.  But maybe I am missing something.

Correct.  The question is whether or not AMD's architecture guarantees PML Index
will be valid/accurate after SHUTDOWN.

next prev parent reply	other threads:[~2026-04-22 13:20 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-07  6:32 [PATCH v6 0/7] KVM: SVM: Add Page Modification Logging (PML) support Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 1/7] KVM: x86: Carve out PML flush routine Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 2/7] KVM: x86: Move PML page to common vcpu arch structure Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 3/7] KVM: VMX: Use cpu_dirty_log_size instead of enable_pml for PML checks Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 4/7] x86/cpufeatures: Add Page modification logging Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 5/7] KVM: SVM: Use BIT_ULL for 64-bit nested_ctl bit definitions Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 6/7] KVM: nSVM: Add helpers to temporarily switch to vmcb01 Nikunj A Dadhania
2026-04-07  6:32 ` [PATCH v6 7/7] KVM: SVM: Add Page modification logging support Nikunj A Dadhania
2026-04-20  6:38   ` Nikunj A. Dadhania
2026-04-21 15:08     ` Sean Christopherson
2026-04-21 23:50       ` Huang, Kai
2026-04-22  0:30         ` Sean Christopherson
2026-04-22  1:42           ` Huang, Kai
2026-04-22  5:59             ` Nikunj A. Dadhania
2026-04-22  8:14               ` Huang, Kai
2026-04-22 13:20             ` Sean Christopherson [this message]
2026-04-21 23:04   ` Yosry Ahmed
2026-04-21 23:15     ` Sean Christopherson
2026-04-22  6:26       ` Nikunj A. Dadhania

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aejLG_w8H16EifMj@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=joao.m.martins@oracle.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=nikunj@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=thomas.lendacky@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox