From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org
Cc: kai.huang@intel.com, dmatlack@google.com, erdemaktas@google.com,
isaku.yamahata@gmail.com, linux-kernel@vger.kernel.org,
sagis@google.com, yan.y.zhao@intel.com,
rick.p.edgecombe@intel.com,
Isaku Yamahata <isaku.yamahata@intel.com>,
Binbin Wu <binbin.wu@linux.intel.com>
Subject: [PATCH v3 04/17] KVM: x86/mmu: Add an external pointer to struct kvm_mmu_page
Date: Wed, 19 Jun 2024 15:36:01 -0700 [thread overview]
Message-ID: <20240619223614.290657-5-rick.p.edgecombe@intel.com> (raw)
In-Reply-To: <20240619223614.290657-1-rick.p.edgecombe@intel.com>
From: Isaku Yamahata <isaku.yamahata@intel.com>
Add a external pointer to struct kvm_mmu_page for TDX's private page table
and add helper functions to allocate/initialize/free a private page table
page. TDX will only be supported with the TDP MMU. Because KVM TDP MMU
doesn't use unsync_children and write_flooding_count, pack them to have
room for a pointer and use a union to avoid memory overhead.
For private GPA, CPU refers to a private page table whose contents are
encrypted. The dedicated APIs to operate on it (e.g. updating/reading its
PTE entry) are used, and their cost is expensive.
When KVM resolves the KVM page fault, it walks the page tables. To reuse
the existing KVM MMU code and mitigate the heavy cost of directly walking
the private page table allocate two sets of page tables for the private
half of the GPA space.
For the page tables that KVM will walk, allocate them like normal and refer
to them as mirror page tables. Additionally allocate one more page for the
page tables the CPU will walk, and call them external page tables. Resolve
the KVM page fault with the existing code, and do additional operations
necessary for modifying the external page table in future patches.
The relationship of the types of page tables in this scheme is depicted
below:
KVM page fault |
| |
V |
-------------+---------- |
| | |
V V |
shared GPA private GPA |
| | |
V V |
shared PT root mirror PT root | private PT root
| | | |
V V | V
shared PT mirror PT --propagate--> external PT
| | | |
| \-----------------+------\ |
| | | |
V | V V
shared guest page | private guest page
|
non-encrypted memory | encrypted memory
|
PT - Page table
Shared PT - Visible to KVM, and the CPU uses it for shared mappings.
External PT - The CPU uses it, but it is invisible to KVM. TDX module
updates this table to map private guest pages.
Mirror PT - It is visible to KVM, but the CPU doesn't use it. KVM uses
it to propagate PT change to the actual private PT.
Add a helper kvm_has_mirrored_tdp() to trigger this behavior and wire it
to the TDX vm type.
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
---
TDX MMU Prep v3:
- mirrored->external rename (Paolo)
- Remove accidentally included kvm_mmu_alloc_private_spt() (Paolo)
- Those -> These (Paolo)
- Change log updates to make external/mirrored naming more clear
TDX MMU Prep v2:
- Rename private->mirror
- Don't trigger off of shared mask
TDX MMU Prep:
- Rename terminology, dummy PT => mirror PT. and updated the commit message
By Rick and Kai.
- Added a comment on union of private_spt by Rick.
- Don't handle the root case in kvm_mmu_alloc_private_spt(), it will not
be needed in future patches. (Rick)
- Update comments (Yan)
- Remove kvm_mmu_init_private_spt(), open code it in later patches (Yan)
v19:
- typo in the comment in kvm_mmu_alloc_private_spt()
- drop CONFIG_KVM_MMU_PRIVATE
---
arch/x86/include/asm/kvm_host.h | 5 +++++
arch/x86/kvm/mmu.h | 5 +++++
arch/x86/kvm/mmu/mmu.c | 7 +++++++
arch/x86/kvm/mmu/mmu_internal.h | 31 +++++++++++++++++++++++++++----
arch/x86/kvm/mmu/tdp_mmu.c | 1 +
5 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index aabf1648a56a..9e35fe32f500 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -817,6 +817,11 @@ struct kvm_vcpu_arch {
struct kvm_mmu_memory_cache mmu_shadow_page_cache;
struct kvm_mmu_memory_cache mmu_shadowed_info_cache;
struct kvm_mmu_memory_cache mmu_page_header_cache;
+ /*
+ * This cache is to allocate private page table. E.g. private EPT used
+ * by the TDX module.
+ */
+ struct kvm_mmu_memory_cache mmu_external_spt_cache;
/*
* QEMU userspace and the guest each have their own FPU state.
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index dc80e72e4848..0c3bf89cf7db 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -318,4 +318,9 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu,
return gpa;
return translate_nested_gpa(vcpu, gpa, access, exception);
}
+
+static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
+{
+ return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
#endif
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f41c498fcdb5..8023cebeefaa 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -685,6 +685,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
if (r)
return r;
+ if (kvm_has_mirrored_tdp(vcpu->kvm)) {
+ r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_external_spt_cache,
+ PT64_ROOT_MAX_LEVEL);
+ if (r)
+ return r;
+ }
r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
PT64_ROOT_MAX_LEVEL);
if (r)
@@ -704,6 +710,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu)
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache);
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache);
+ kvm_mmu_free_memory_cache(&vcpu->arch.mmu_external_spt_cache);
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache);
}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 706f0ce8784c..d2837f796f34 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -101,7 +101,22 @@ struct kvm_mmu_page {
int root_count;
refcount_t tdp_mmu_root_count;
};
- unsigned int unsync_children;
+ union {
+ /* These two members aren't used for TDP MMU */
+ struct {
+ unsigned int unsync_children;
+ /*
+ * Number of writes since the last time traversal
+ * visited this page.
+ */
+ atomic_t write_flooding_count;
+ };
+ /*
+ * Page table page of private PT.
+ * Passed to TDX module, not accessed by KVM.
+ */
+ void *external_spt;
+ };
union {
struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
tdp_ptep_t ptep;
@@ -124,9 +139,6 @@ struct kvm_mmu_page {
int clear_spte_count;
#endif
- /* Number of writes since the last time traversal visited this page. */
- atomic_t write_flooding_count;
-
#ifdef CONFIG_X86_64
/* Used for freeing the page asynchronously if it is a TDP MMU page. */
struct rcu_head rcu_head;
@@ -145,6 +157,17 @@ static inline int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
return kvm_mmu_role_as_id(sp->role);
}
+static inline void kvm_mmu_alloc_external_spt(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+ /*
+ * external_spt is allocated for TDX module to hold private EPT mappings,
+ * TDX module will initialize the page by itself.
+ * Therefore, KVM does not need to initialize or access external_spt.
+ * KVM only interacts with sp->spt for external EPT operations.
+ */
+ sp->external_spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_external_spt_cache);
+}
+
static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp)
{
/*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 16b54208e8d7..35249555b585 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -53,6 +53,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
{
+ free_page((unsigned long)sp->external_spt);
free_page((unsigned long)sp->spt);
kmem_cache_free(mmu_page_header_cache, sp);
}
--
2.34.1
next prev parent reply other threads:[~2024-06-19 22:36 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-19 22:35 [PATCH v3 00/17] TDX MMU prep series part 1 Rick Edgecombe
2024-06-19 22:35 ` [PATCH v3 01/17] KVM: x86/tdp_mmu: Rename REMOVED_SPTE to FROZEN_SPTE Rick Edgecombe
2024-06-19 22:35 ` [PATCH v3 02/17] KVM: Add member to struct kvm_gfn_range for target alias Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 03/17] KVM: x86: Add a VM type define for TDX Rick Edgecombe
2024-06-19 22:36 ` Rick Edgecombe [this message]
2024-07-03 7:03 ` [PATCH v3 04/17] KVM: x86/mmu: Add an external pointer to struct kvm_mmu_page Yan Zhao
2024-06-19 22:36 ` [PATCH v3 05/17] KVM: x86/mmu: Add an is_mirror member for union kvm_mmu_page_role Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 06/17] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 07/17] KVM: x86/tdp_mmu: Take struct kvm in iter loops Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 08/17] KVM: x86/tdp_mmu: Take a GFN in kvm_tdp_mmu_fast_pf_get_last_sptep() Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 09/17] KVM: x86/mmu: Support GFN direct bits Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 10/17] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root() Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 11/17] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 12/17] KVM: x86/tdp_mmu: Take root in tdp_mmu_for_each_pte() Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 13/17] KVM: x86/tdp_mmu: Support mirror root for TDP MMU Rick Edgecombe
2024-06-24 8:30 ` Yan Zhao
2024-06-25 0:51 ` Edgecombe, Rick P
2024-06-25 5:43 ` Yan Zhao
2024-06-25 20:33 ` Edgecombe, Rick P
2024-06-26 5:05 ` Yan Zhao
2024-07-03 19:40 ` Edgecombe, Rick P
2024-07-04 8:09 ` Yan Zhao
2024-07-09 22:36 ` Edgecombe, Rick P
2024-07-04 8:51 ` Yan Zhao
2024-07-09 22:38 ` Edgecombe, Rick P
2024-07-11 23:54 ` Edgecombe, Rick P
2024-07-12 1:42 ` Yan Zhao
2024-06-19 22:36 ` [PATCH v3 14/17] KVM: x86/tdp_mmu: Propagate attr_filter to MMU notifier callbacks Rick Edgecombe
2024-06-19 22:36 ` [PATCH v3 15/17] KVM: x86/tdp_mmu: Propagate building mirror page tables Rick Edgecombe
2024-06-20 5:15 ` Binbin Wu
2024-06-24 23:52 ` Edgecombe, Rick P
2024-06-19 22:36 ` [PATCH v3 16/17] KVM: x86/tdp_mmu: Propagate tearing down " Rick Edgecombe
2024-06-20 8:44 ` Binbin Wu
2024-06-24 23:55 ` Edgecombe, Rick P
2024-06-19 22:36 ` [PATCH v3 17/17] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots() Rick Edgecombe
2024-06-21 7:10 ` Yan Zhao
2024-06-21 19:08 ` Edgecombe, Rick P
2024-06-24 8:29 ` Yan Zhao
2024-06-24 23:15 ` Edgecombe, Rick P
2024-06-25 6:14 ` Yan Zhao
2024-06-25 20:56 ` Edgecombe, Rick P
2024-06-26 2:25 ` Yan Zhao
2024-07-03 20:00 ` Edgecombe, Rick P
2024-07-05 1:16 ` Yan Zhao
2024-07-09 22:52 ` Edgecombe, Rick P
2024-07-18 15:28 ` Isaku Yamahata
2024-07-18 15:55 ` Edgecombe, Rick P
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240619223614.290657-5-rick.p.edgecombe@intel.com \
--to=rick.p.edgecombe@intel.com \
--cc=binbin.wu@linux.intel.com \
--cc=dmatlack@google.com \
--cc=erdemaktas@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=isaku.yamahata@intel.com \
--cc=kai.huang@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=sagis@google.com \
--cc=seanjc@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox