Re: [PATCH 1/7] Revert "KVM: x86/mmu: Don't bottom out on leafs when zapping collapsible SPTEs"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: David Matlack <dmatlack@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, kvm@vger.kernel.org
Subject: Re: [PATCH 1/7] Revert "KVM: x86/mmu: Don't bottom out on leafs when zapping collapsible SPTEs"
Date: Fri, 16 Aug 2024 15:47:47 -0700	[thread overview]
Message-ID: <Zr_XE6NG1c-rNXEl@google.com> (raw)
In-Reply-To: <20240805233114.4060019-2-dmatlack@google.com>

On Mon, Aug 05, 2024, David Matlack wrote:
> This reverts commit 85f44f8cc07b5f61bef30fe5343d629fd4263230.
> 
> Bring back the logic that walks down to leafs when zapping collapsible
> SPTEs. Stepping down to leafs is technically unnecessary when zapping,
> but the leaf SPTE will be used in a subsequent commit to construct a
> huge SPTE and recover the huge mapping in place.

Please no, getting rid of the step-up code made me so happy. :-D

It's also suboptimal, e.g. in the worst case scenario (which is comically
unlikely, but theoretically possible), if the first present leaf SPTE in a 1GiB
region is the last SPTE in the last 2MiB range, and all non-leaf SPTEs are
somehow present (this is the super unlikely part), then KVM will read 512*512
SPTEs before encountering a shadow-present leaf SPTE.

The proposed approach will also ignore nx_huge_page_disallowed, and just always
create a huge NX page.  On the plus side, "free" NX hugepage recovery!  The
downside is that it means KVM is pretty much guaranteed to force the guest to
re-fault all of its code pages, and zap a non-trivial number of huge pages that
were just created.  IIRC, we deliberately did that for the zapping case, e.g. to
use the opportunity to recover NX huge pages, but zap+create+zap+create is a bit
different than zap+create (if the guest is still using the region for code).

So rather than looking for a present leaf SPTE, what about "stopping" as soon as
KVM find a SP that can be replaced with a huge SPTE, pre-checking
nx_huge_page_disallowed, and invoking kvm_mmu_do_page_fault() to install a new
SPTE?  Or maybe even use kvm_tdp_map_page()?  Though it might be more work to
massage kvm_tdp_map_page() into a usable form.

By virtue of there being a present non-leaf SPTE, KVM knows the guest accessed
the region at some point.  And now that MTRR virtualization is gone, the only time
KVM zaps _only_ leafs is for mmu_notifiers and and the APIC access page, i.e. the
odds of stepping down NOT finding a present SPTE somewhere in the region is very
small.  Lastly, kvm_mmu_max_mapping_level() has verified there is a valid mapping
in the pr imary MMU, else the max level would be PG_LEVEL_4K.  So the odds of
getting a false positive and effectively pre-faulting memory the guest isn't using
are quite small.

next prev parent reply	other threads:[~2024-08-16 22:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-05 23:31 [PATCH 0/7] KVM: x86/mmu: Optimize TDP MMU huge page recovery during disable-dirty-log David Matlack
2024-08-05 23:31 ` [PATCH 1/7] Revert "KVM: x86/mmu: Don't bottom out on leafs when zapping collapsible SPTEs" David Matlack
2024-08-16 22:47   ` Sean Christopherson [this message]
2024-08-16 23:35     ` David Matlack
2024-08-22 15:10       ` Sean Christopherson
2024-08-05 23:31 ` [PATCH 2/7] KVM: x86/mmu: Drop @max_level from kvm_mmu_max_mapping_level() David Matlack
2024-08-05 23:31 ` [PATCH 3/7] KVM: x86/mmu: Batch TLB flushes when zapping collapsible TDP MMU SPTEs David Matlack
2024-08-05 23:31 ` [PATCH 4/7] KVM: x86/mmu: Recover TDP MMU huge page mappings in-place instead of zapping David Matlack
2024-08-05 23:31 ` [PATCH 5/7] KVM: x86/mmu: Rename make_huge_page_split_spte() to make_small_spte() David Matlack
2024-08-05 23:31 ` [PATCH 6/7] KVM: x86/mmu: WARN if huge page recovery triggered during dirty logging David Matlack
2024-08-05 23:31 ` [PATCH 7/7] KVM: x86/mmu: Recheck SPTE points to a PT during huge page recovery David Matlack
2024-08-22 16:50   ` Sean Christopherson
2024-08-22 18:35     ` David Matlack
2024-08-22 20:11       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zr_XE6NG1c-rNXEl@google.com \
    --to=seanjc@google.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.