From: Sean Christopherson <seanjc@google.com>
To: David Matlack <dmatlack@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Marc Zyngier <maz@kernel.org>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Anup Patel <anup@brainfault.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Andrew Jones <drjones@redhat.com>,
Ben Gardon <bgardon@google.com>, Peter Xu <peterx@redhat.com>,
"Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>,
"moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)"
<kvmarm@lists.cs.columbia.edu>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<linux-mips@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<kvm@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)"
<kvm-riscv@lists.infradead.org>,
Peter Feiner <pfeiner@google.com>
Subject: Re: [PATCH v3 00/23] KVM: Extend Eager Page Splitting to the shadow MMU
Date: Mon, 11 Apr 2022 20:12:41 +0000
Message-ID: <YlSLuZphElMyF2sG@google.com>
In-Reply-To: <CALzav=f_WY7xH_MV8-gJPAVmj1KjE_LvXupL7aA5n-vCjTETNw@mail.gmail.com>

On Mon, Apr 11, 2022, David Matlack wrote:
> On Mon, Apr 11, 2022 at 10:12 AM Sean Christopherson <seanjc@google.com> wrote:
> > Circling back to eager page splitting, this series could be reworked to take the
> > first step of forking FNAME(page_fault), FNAME(fetch) and kvm_mmu_get_page() in
> > order to provide the necessary path for reworking nested MMU page faults. Then it
> > can remove unsync and shrinker support for nested MMUs. With those gone,
> > dissecting the nested MMU variant of kvm_mmu_get_page() should be simpler/cleaner
> > than dealing with the existing kvm_mmu_get_page(), i.e. should eliminate at least
> > some of the complexity/churn.
>
> These sound like useful improvements but I am not really seeing the
> value of sequencing them before this series:
>
> - IMO the "churn" in patches 1-14 is a net improvement to the
> existing code. They improve readability by decomposing the shadow page
> creation path into smaller functions with better names, reduce the
> amount of redundant calculations, and reduce the dependence on struct
> kvm_vcpu where it is not needed. Even if eager page splitting is
> completely dropped I think they would be useful to merge.

I definitely like some of patches 1-14, probably most of them after a few
read-throughs.  But there are key parts I do not like, and those parts are
motivated almost entirely by the desire to support page splitting.  Specifically,
I don't like splitting the logic of finding a page, and I don't like having a
separate alloc vs. initializer (though I'm guessing the latter will be needed
somewhere to split huge pages for nested MMUs).

E.g. I'd prefer the "get" flow look like the below (completely untested, for
discussion purposes only).  There's still churn, but the core loop is almost
entirely unchanged.

And it's not just this series; I don't want future improvements for nested TDP
to have to deal with the legacy baggage.

Waaaay off topic, why do we still bother with stat.max_mmu_page_hash_collisions?
I assume it was originally added to tune the hashing logic?  At this point, is it
anything but wasted cycles?

static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm_vcpu *vcpu,
						     gfn_t gfn,
						     unsigned int gfn_hash,
						     union kvm_mmu_page_role role)
{
	struct kvm *kvm = vcpu->kvm;
	struct hlist_head *sp_list = &kvm->arch.mmu_page_hash[gfn_hash];
	struct kvm_mmu_page *sp;
	LIST_HEAD(invalid_list);
	int collisions = 0;

	for_each_valid_sp(kvm, sp, sp_list) {
		if (sp->gfn != gfn) {
			collisions++;
			continue;
		}

		if (sp->role.word != role.word) {
			/*
			 * If the guest is creating an upper-level page, zap
			 * unsync pages for the same gfn.  While it's possible
			 * the guest is using recursive page tables, in all
			 * likelihood the guest has stopped using the unsync
			 * page and is installing a completely unrelated page.
			 * Unsync pages must not be left as is, because the new
			 * upper-level page will be write-protected.
			 */
			if (role.level > PG_LEVEL_4K && sp->unsync)
				kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
			continue;
		}

		/* unsync and write-flooding only apply to indirect SPs. */
		if (sp->role.direct)
			goto out;

		if (sp->unsync) {
			/*
			 * The page is good, but is stale.  kvm_sync_page does
			 * get the latest guest state, but (unlike mmu_unsync_children)
			 * it doesn't write-protect the page or mark it synchronized!
			 * This way the validity of the mapping is ensured, but the
			 * overhead of write protection is not incurred until the
			 * guest invalidates the TLB mapping.  This allows multiple
			 * SPs for a single gfn to be unsync.
			 *
			 * If the sync fails, the page is zapped.  If so, break
			 * in order to rebuild it.
			 */
			if (!kvm_sync_page(vcpu, sp, &invalid_list))
				break;

			WARN_ON(!list_empty(&invalid_list));
			kvm_flush_remote_tlbs(kvm);
		}

		__clear_sp_write_flooding_count(sp);
		goto out;
	}

	/* No compatible shadow page found, the caller needs to allocate one. */
	sp = NULL;

out:
	if (collisions > kvm->stat.max_mmu_page_hash_collisions)
		kvm->stat.max_mmu_page_hash_collisions = collisions;

	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	return sp;
}

static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu,
						      gfn_t gfn,
						      unsigned int gfn_hash,
						      union kvm_mmu_page_role role)
{
	struct kvm *kvm = vcpu->kvm;
	struct kvm_mmu_page *sp = __kvm_mmu_alloc_shadow_page(vcpu, role.direct);
	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
	struct hlist_head *sp_list = &kvm->arch.mmu_page_hash[gfn_hash];

	++kvm->stat.mmu_cache_miss;

	sp->gfn = gfn;
	sp->role = role;
	sp->mmu_valid_gen = kvm->arch.mmu_valid_gen;

	/*
	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
	 * depends on valid pages being added to the head of the list.  See
	 * comments in kvm_zap_obsolete_pages().
	 */
	list_add(&sp->link, &kvm->arch.active_mmu_pages);
	kvm_mod_used_mmu_pages(kvm, 1);

	/* Insert into the hash bucket that was already computed for the lookup. */
	hlist_add_head(&sp->hash_link, sp_list);

	if (!role.direct)
		account_shadowed(kvm, slot, sp);

	return sp;
}

static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu,
						    gfn_t gfn,
						    union kvm_mmu_page_role role)
{
	unsigned int gfn_hash = kvm_page_table_hashfn(gfn);
	struct kvm_mmu_page *sp;
	bool created = false;

	sp = kvm_mmu_find_shadow_page(vcpu, gfn, gfn_hash, role);
	if (!sp) {
		created = true;
		sp = kvm_mmu_alloc_shadow_page(vcpu, gfn, gfn_hash, role);
	}

	trace_kvm_mmu_get_page(sp, created);
	return sp;
}
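
For completeness, a minimal sketch of the two pieces the above leaves implied.
__kvm_mmu_alloc_shadow_page() is assumed to be today's vCPU-cache-backed
allocation (roughly kvm_mmu_alloc_page() minus the init), and get_child_spt()
is a purely hypothetical caller; kvm_mmu_child_role() stands in for the
role-derivation helper this series adds.  Names and signatures here are
illustrative only, not part of the proposal.

/* Assumed allocation helper: pull everything from the vCPU caches. */
static struct kvm_mmu_page *__kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu,
							bool direct)
{
	struct kvm_mmu_page *sp;

	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
	sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
	if (!direct)
		sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache);
	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);

	return sp;
}

/* Hypothetical caller, e.g. from a fetch-style walk over the guest tables. */
static u64 *get_child_spt(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
			  bool direct, u32 access)
{
	union kvm_mmu_page_role role = kvm_mmu_child_role(sptep, direct, access);
	struct kvm_mmu_page *sp = kvm_mmu_get_shadow_page(vcpu, gfn, role);

	return sp->spt;
}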