From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com
Subject: Re: [PATCH 0/2] KVM: x86/mmu: .change_pte() optimization in TDP MMU
Date: Wed, 16 Aug 2023 11:18:03 -0700
Message-ID: <ZN0S28lkbo6+D7aF@google.com>
In-Reply-To: <20230808085056.14644-1-yan.y.zhao@intel.com>

On Tue, Aug 08, 2023, Yan Zhao wrote:
> This series optimizes KVM mmu notifier.change_pte() handler in x86 TDP MMU
> (i.e. kvm_tdp_mmu_set_spte_gfn()) by removing old dead code and prefetching
> notified new PFN into SPTEs directly in the handler.
> 
> As in [1], .change_pte() has been dead code on x86 for 10+ years.
> Patch 1 drops the dead code in x86 TDP MMU to save cpu cycles and prepare
> for optimization in TDP MMU in patch 2.

If we're going to officially kill the long-dead attempt at optimizing KSM, I'd
strongly prefer to rip out .change_pte() entirely, i.e. kill it off in all
architectures and remove it from mmu_notifiers.  The only reason I haven't proposed
such patches is because I didn't want it to backfire and lead to someone trying
to resurrect the optimizations for KSM.

> Patch 2 optimizes TDP MMU's .change_pte() handler to prefetch SPTEs in the
> handler directly with PFN info contained in .change_pte(), so that each
> vCPU write that triggers .change_pte() no longer has to go through two
> VM-Exits and TDP page faults.

IMO, prefaulting guest memory as writable is better handled by userspace, e.g. by
using QEMU's prealloc option.  It's more coarse grained, but at a minimum it's
sufficient for improving guest boot time, e.g. by preallocating memory below 4GiB.

And we can do even better, e.g. by providing a KVM ioctl() to allow userspace to
prefault memory not just into the primary MMU, but also into KVM's MMU.  Such an
ioctl() is basically mandatory for TDX, we just need to morph the support being
added by TDX into a generic ioctl()[*].

Prefaulting guest memory as writable into the primary MMU should be able to achieve
far better performance than hooking .change_pte(), as it will avoid the mmu_notifier
invalidation, e.g. won't trigger taking mmu_lock for write and the resulting remote
TLB flush(es).  And a KVM ioctl() to prefault into KVM's MMU should eliminate page
fault VM-Exits entirely.
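
To make the "prefault as writable in the primary MMU" idea concrete, here is a
rough userspace sketch, not what QEMU actually does for prealloc.  The helper
names (prefault_writable(), alloc_prefaulted()) are made up for illustration;
the only assumption is that guest memory is an ordinary private anonymous
mapping.  Each page is read and then written back, so the page ends up mapped
writable in the host page tables without its contents changing.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: write-touch every page in [base, base + len) so that
 * the host (primary MMU) maps each page writable up front.  Writing back the
 * value just read keeps the contents intact.  Returns the number of pages
 * touched.
 */
static size_t prefault_writable(void *base, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile uint8_t *p = base;
	size_t touched = 0;

	assert(page > 0);

	for (size_t off = 0; off < len; off += (size_t)page) {
		p[off] = p[off];	/* volatile forces a real load and store */
		touched++;
	}
	return touched;
}

/*
 * Hypothetical helper: allocate an anonymous private region (standing in for
 * guest memory) and prefault it as writable.  Returns NULL on failure.
 */
static void *alloc_prefaulted(size_t len)
{
	void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (mem == MAP_FAILED)
		return NULL;

	prefault_writable(mem, len);
	return mem;
}
```

A VMM would do something along these lines on the backing memory before
registering it as a memslot.  Note this only populates the host page tables;
populating KVM's MMU as well is what the ioctl() mentioned above would be for.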

Explicit prefaulting isn't perfect, but IMO the value added by prefetching in
.change_pte() isn't enough to justify carrying the hook and the code in KVM.

[*] https://lore.kernel.org/all/ZMFYhkSPE6Zbp8Ea@google.com
