From: David Matlack <dmatlack@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Anup Patel <anup@brainfault.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Sean Christopherson <seanjc@google.com>,
Andrew Jones <drjones@redhat.com>,
Ben Gardon <bgardon@google.com>, Peter Xu <peterx@redhat.com>,
maciej.szmigiero@oracle.com,
"moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)"
<kvmarm@lists.cs.columbia.edu>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<linux-mips@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<kvm@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)"
<kvm-riscv@lists.infradead.org>,
Peter Feiner <pfeiner@google.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
David Matlack <dmatlack@google.com>
Subject: [PATCH v6 04/22] KVM: x86/mmu: Derive shadow MMU page role from parent
Date: Mon, 16 May 2022 23:21:20 +0000 [thread overview]
Message-ID: <20220516232138.1783324-5-dmatlack@google.com> (raw)
In-Reply-To: <20220516232138.1783324-1-dmatlack@google.com>
Instead of computing the shadow page role from scratch for every new
page, derive most of the information from the parent shadow page. This
eliminates the dependency on the vCPU root role to allocate shadow page
tables, and reduces the number of parameters to kvm_mmu_get_page().
Preemptively split out the role calculation to a separate function for
use in a following commit.
Note that when calculating the MMU root role, we can take
@role.passthrough, @role.direct, and @role.access directly from
@vcpu->arch.mmu->root_role. Only @role.level and @role.quadrant still
must be overridden for PAE page directories.
No functional change intended.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
arch/x86/kvm/mmu/mmu.c | 98 +++++++++++++++++++++++-----------
arch/x86/kvm/mmu/paging_tmpl.h | 9 ++--
2 files changed, 71 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a9d28bcabcbb..515e0b33144a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2019,33 +2019,15 @@ static void clear_sp_write_flooding_count(u64 *spte)
__clear_sp_write_flooding_count(sptep_to_sp(spte));
}
-static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
- gfn_t gfn,
- gva_t gaddr,
- unsigned level,
- bool direct,
- unsigned int access)
+static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn,
+ union kvm_mmu_page_role role)
{
- union kvm_mmu_page_role role;
struct hlist_head *sp_list;
- unsigned quadrant;
struct kvm_mmu_page *sp;
int ret;
int collisions = 0;
LIST_HEAD(invalid_list);
- role = vcpu->arch.mmu->root_role;
- role.level = level;
- role.direct = direct;
- role.access = access;
- if (role.has_4_byte_gpte) {
- quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level));
- quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
- role.quadrant = quadrant;
- }
- if (level <= vcpu->arch.mmu->cpu_role.base.level)
- role.passthrough = 0;
-
sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
for_each_valid_sp(vcpu->kvm, sp, sp_list) {
if (sp->gfn != gfn) {
@@ -2063,7 +2045,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
* Unsync pages must not be left as is, because the new
* upper-level page will be write-protected.
*/
- if (level > PG_LEVEL_4K && sp->unsync)
+ if (role.level > PG_LEVEL_4K && sp->unsync)
kvm_mmu_prepare_zap_page(vcpu->kvm, sp,
&invalid_list);
continue;
@@ -2104,14 +2086,14 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
++vcpu->kvm->stat.mmu_cache_miss;
- sp = kvm_mmu_alloc_page(vcpu, direct);
+ sp = kvm_mmu_alloc_page(vcpu, role.direct);
sp->gfn = gfn;
sp->role = role;
hlist_add_head(&sp->hash_link, sp_list);
if (sp_has_gptes(sp)) {
account_shadowed(vcpu->kvm, sp);
- if (level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn))
+ if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn))
kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1);
}
trace_kvm_mmu_get_page(sp, true);
@@ -2123,6 +2105,55 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
return sp;
}
+static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct, u32 access)
+{
+ struct kvm_mmu_page *parent_sp = sptep_to_sp(sptep);
+ union kvm_mmu_page_role role;
+
+ role = parent_sp->role;
+ role.level--;
+ role.access = access;
+ role.direct = direct;
+ role.passthrough = 0;
+
+ /*
+ * If the guest has 4-byte PTEs then that means it's using 32-bit,
+ * 2-level, non-PAE paging. KVM shadows such guests with PAE paging
+ * (i.e. 8-byte PTEs). The difference in PTE size means that KVM must
+ * shadow each guest page table with multiple shadow page tables, which
+ * requires extra bookkeeping in the role.
+ *
+ * Specifically, to shadow the guest's page directory (which covers a
+ * 4GiB address space), KVM uses 4 PAE page directories, each mapping
+ * 1GiB of the address space. @role.quadrant encodes which quarter of
+ * the address space each maps.
+ *
+ * To shadow the guest's page tables (which each map a 4MiB region), KVM
+ * uses 2 PAE page tables, each mapping a 2MiB region. For these,
+ * @role.quadrant encodes which half of the region they map.
+ *
+ * Note, the 4 PAE page directories are pre-allocated and the quadrant
+ * assigned in mmu_alloc_root(). So only page tables need to be handled
+ * here.
+ */
+ if (role.has_4_byte_gpte) {
+ WARN_ON_ONCE(role.level != PG_LEVEL_4K);
+ role.quadrant = (sptep - parent_sp->spt) % 2;
+ }
+
+ return role;
+}
+
+static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu,
+ u64 *sptep, gfn_t gfn,
+ bool direct, u32 access)
+{
+ union kvm_mmu_page_role role;
+
+ role = kvm_mmu_child_role(sptep, direct, access);
+ return kvm_mmu_get_page(vcpu, gfn, role);
+}
+
static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
struct kvm_vcpu *vcpu, hpa_t root,
u64 addr)
@@ -2965,8 +2996,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
if (is_shadow_present_pte(*it.sptep))
continue;
- sp = kvm_mmu_get_page(vcpu, base_gfn, it.addr,
- it.level - 1, true, ACC_ALL);
+ sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, ACC_ALL);
link_shadow_page(vcpu, it.sptep, sp);
if (fault->is_tdp && fault->huge_page_disallowed &&
@@ -3369,13 +3399,18 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn)
return ret;
}
-static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva,
+static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant,
u8 level)
{
- bool direct = vcpu->arch.mmu->root_role.direct;
+ union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
struct kvm_mmu_page *sp;
- sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL);
+ role.level = level;
+
+ if (role.has_4_byte_gpte)
+ role.quadrant = quadrant;
+
+ sp = kvm_mmu_get_page(vcpu, gfn, role);
++sp->root_count;
return __pa(sp->spt);
@@ -3409,8 +3444,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
for (i = 0; i < 4; ++i) {
WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
- root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
- i << 30, PT32_ROOT_LEVEL);
+ root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT), i,
+ PT32_ROOT_LEVEL);
mmu->pae_root[i] = root | PT_PRESENT_MASK |
shadow_me_mask;
}
@@ -3579,8 +3614,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
root_gfn = pdptrs[i] >> PAGE_SHIFT;
}
- root = mmu_alloc_root(vcpu, root_gfn, i << 30,
- PT32_ROOT_LEVEL);
+ root = mmu_alloc_root(vcpu, root_gfn, i, PT32_ROOT_LEVEL);
mmu->pae_root[i] = root | pm_mask;
}
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index db80f7ccaa4e..fd73c857af90 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -648,8 +648,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
access = gw->pt_access[it.level - 2];
- sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
- it.level-1, false, access);
+ sp = kvm_mmu_get_child_sp(vcpu, it.sptep, table_gfn,
+ false, access);
+
/*
* We must synchronize the pagetable before linking it
* because the guest doesn't need to flush tlb when
@@ -705,8 +706,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
drop_large_spte(vcpu, it.sptep);
if (!is_shadow_present_pte(*it.sptep)) {
- sp = kvm_mmu_get_page(vcpu, base_gfn, fault->addr,
- it.level - 1, true, direct_access);
+ sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn,
+ true, direct_access);
link_shadow_page(vcpu, it.sptep, sp);
if (fault->huge_page_disallowed &&
fault->req_level >= it.level)
--
2.36.0.550.gb090851708-goog
next prev parent reply other threads:[~2022-05-16 23:21 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-16 23:21 [PATCH v6 00/22] KVM: Extend Eager Page Splitting to the shadow MMU David Matlack
2022-05-16 23:21 ` [PATCH v6 01/22] KVM: x86/mmu: Optimize MMU page cache lookup for all direct SPs David Matlack
2022-05-16 23:21 ` [PATCH v6 02/22] KVM: x86/mmu: Use a bool for direct David Matlack
2022-05-16 23:21 ` [PATCH v6 03/22] KVM: x86/mmu: Stop passing @direct to mmu_alloc_root() David Matlack
2022-06-16 18:47 ` Sean Christopherson
2022-06-22 14:06 ` Paolo Bonzini
2022-06-22 14:19 ` Sean Christopherson
2022-05-16 23:21 ` David Matlack [this message]
2022-06-17 1:19 ` [PATCH v6 04/22] KVM: x86/mmu: Derive shadow MMU page role from parent Sean Christopherson
2022-06-17 15:12 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 05/22] KVM: x86/mmu: Always pass 0 for @quadrant when gptes are 8 bytes David Matlack
2022-06-17 15:20 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 06/22] KVM: x86/mmu: Decompose kvm_mmu_get_page() into separate functions David Matlack
2022-05-16 23:21 ` [PATCH v6 07/22] KVM: x86/mmu: Consolidate shadow page allocation and initialization David Matlack
2022-05-16 23:21 ` [PATCH v6 08/22] KVM: x86/mmu: Rename shadow MMU functions that deal with shadow pages David Matlack
2022-05-16 23:21 ` [PATCH v6 09/22] KVM: x86/mmu: Move guest PT write-protection to account_shadowed() David Matlack
2022-05-16 23:21 ` [PATCH v6 10/22] KVM: x86/mmu: Pass memory caches to allocate SPs separately David Matlack
2022-06-17 15:01 ` Sean Christopherson
2022-06-21 17:06 ` David Matlack
2022-06-21 17:27 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 11/22] KVM: x86/mmu: Replace vcpu with kvm in kvm_mmu_alloc_shadow_page() David Matlack
2022-05-16 23:21 ` [PATCH v6 12/22] KVM: x86/mmu: Pass kvm pointer separately from vcpu to kvm_mmu_find_shadow_page() David Matlack
2022-05-16 23:21 ` [PATCH v6 13/22] KVM: x86/mmu: Allow NULL @vcpu in kvm_mmu_find_shadow_page() David Matlack
2022-06-17 15:28 ` Sean Christopherson
2022-06-22 14:26 ` Paolo Bonzini
2022-05-16 23:21 ` [PATCH v6 14/22] KVM: x86/mmu: Pass const memslot to rmap_add() David Matlack
2022-06-17 15:30 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 15/22] KVM: x86/mmu: Decouple rmap_add() and link_shadow_page() from kvm_vcpu David Matlack
2022-06-17 16:39 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 16/22] KVM: x86/mmu: Update page stats in __rmap_add() David Matlack
2022-06-17 16:40 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 17/22] KVM: x86/mmu: Cache the access bits of shadowed translations David Matlack
2022-06-17 16:53 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 18/22] KVM: x86/mmu: Extend make_huge_page_split_spte() for the shadow MMU David Matlack
2022-06-17 16:56 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 19/22] KVM: x86/mmu: Zap collapsible SPTEs in shadow MMU at all possible levels David Matlack
2022-06-17 17:01 ` Sean Christopherson
2022-06-21 17:24 ` David Matlack
2022-06-21 17:59 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 20/22] KVM: x86/mmu: Refactor drop_large_spte() David Matlack
2022-06-17 17:11 ` Sean Christopherson
2022-06-22 16:13 ` Paolo Bonzini
2022-06-22 16:50 ` Paolo Bonzini
2022-05-16 23:21 ` [PATCH v6 21/22] KVM: Allow for different capacities in kvm_mmu_memory_cache structs David Matlack
2022-05-19 15:33 ` Anup Patel
2022-05-20 23:21 ` Mingwei Zhang
2022-05-23 17:37 ` Sean Christopherson
2022-05-23 17:44 ` David Matlack
2022-05-23 18:13 ` Mingwei Zhang
2022-05-23 18:22 ` David Matlack
2022-05-23 23:53 ` David Matlack
2022-06-17 17:41 ` Sean Christopherson
2022-06-17 18:34 ` Sean Christopherson
2022-05-16 23:21 ` [PATCH v6 22/22] KVM: x86/mmu: Extend Eager Page Splitting to nested MMUs David Matlack
2022-06-01 21:50 ` Ricardo Koller
2022-06-17 19:08 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220516232138.1783324-5-dmatlack@google.com \
--to=dmatlack@google.com \
--cc=aleksandar.qemu.devel@gmail.com \
--cc=anup@brainfault.org \
--cc=aou@eecs.berkeley.edu \
--cc=bgardon@google.com \
--cc=chenhuacai@kernel.org \
--cc=drjones@redhat.com \
--cc=jiangshanlai@gmail.com \
--cc=kvm-riscv@lists.infradead.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-mips@vger.kernel.org \
--cc=maciej.szmigiero@oracle.com \
--cc=maz@kernel.org \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=pfeiner@google.com \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).