* [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h)
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2023-02-01 19:45 ` Sean Christopherson
2022-12-21 22:24 ` [RFC 02/14] KVM: x86/MMU: Expose functions for the Shadow MMU Ben Gardon
` (14 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
As a first step to splitting the Shadow MMU out of KVM MMU common code,
add separate files for it with some of the boilerplate and the includes the
Shadow MMU will need.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/Makefile | 2 +-
arch/x86/kvm/mmu/mmu.c | 1 +
arch/x86/kvm/mmu/shadow_mmu.c | 21 +++++++++++++++++++++
arch/x86/kvm/mmu/shadow_mmu.h | 8 ++++++++
4 files changed, 31 insertions(+), 1 deletion(-)
create mode 100644 arch/x86/kvm/mmu/shadow_mmu.c
create mode 100644 arch/x86/kvm/mmu/shadow_mmu.h
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 80e3fe184d17..d6e94660b006 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,7 +12,7 @@ include $(srctree)/virt/kvm/Makefile.kvm
kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
hyperv.o debugfs.o mmu/mmu.o mmu/page_track.o \
- mmu/spte.o
+ mmu/spte.o mmu/shadow_mmu.o
ifdef CONFIG_HYPERV
kvm-y += kvm_onhyperv.o
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4736d7849c60..07b99a7ce830 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -20,6 +20,7 @@
#include "mmu.h"
#include "mmu_internal.h"
#include "tdp_mmu.h"
+#include "shadow_mmu.h"
#include "x86.h"
#include "kvm_cache_regs.h"
#include "smm.h"
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
new file mode 100644
index 000000000000..7bce5ec52b2e
--- /dev/null
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM Shadow MMU
+ *
+ * This file implements the Shadow MMU: the KVM MMU implementation which has
+ * developed organically from hardware which did not have second level paging,
+ * and so used "shadow paging" to virtualize guest memory. The Shadow MMU is
+ * an alternative to the TDP MMU which only supports hardware with Two
+ * Dimensional Paging. (e.g. EPT on Intel or NPT on AMD CPUs.) Note that the
+ * Shadow MMU also supports TDP, it's just less scalable. The Shadow and TDP
+ * MMUs can cooperate to support nested virtualization on hardware with TDP.
+ */
+#include "mmu.h"
+#include "mmu_internal.h"
+#include "mmutrace.h"
+#include "shadow_mmu.h"
+#include "spte.h"
+
+#include <asm/vmx.h>
+#include <asm/cmpxchg.h>
+#include <trace/events/kvm.h>
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
new file mode 100644
index 000000000000..719b10f6c403
--- /dev/null
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __KVM_X86_MMU_SHADOW_MMU_H
+#define __KVM_X86_MMU_SHADOW_MMU_H
+
+#include <linux/kvm_host.h>
+
+#endif /* __KVM_X86_MMU_SHADOW_MMU_H */
--
2.39.0.314.g84b9a713c41-goog
* Re: [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h)
2022-12-21 22:24 ` [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
@ 2023-02-01 19:45 ` Sean Christopherson
2023-02-01 19:48 ` Ben Gardon
0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2023-02-01 19:45 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022, Ben Gardon wrote:
> diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
> new file mode 100644
> index 000000000000..7bce5ec52b2e
> --- /dev/null
> +++ b/arch/x86/kvm/mmu/shadow_mmu.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KVM Shadow MMU
> + *
> + * This file implements the Shadow MMU: the KVM MMU implementation which has
> + * developed organically from hardware which did not have second level paging,
> + * and so used "shadow paging" to virtualize guest memory. The Shadow MMU is
> + * an alternative to the TDP MMU which only supports hardware with Two
> + * Dimensional Paging. (e.g. EPT on Intel or NPT on AMD CPUs.) Note that the
> + * Shadow MMU also supports TDP, it's just less scalable. The Shadow and TDP
> + * MMUs can cooperate to support nested virtualization on hardware with TDP.
> + */
Eh, I vote to omit the comment. For newbies, Documentation is likely a better
landing spot for describing the MMUs, and people that are familiar with the KVM
x86 MMU already know what the shadow MMU is and does. That way we avoid
bikeshedding this comment, at least in the context of this series. E.g. I'm
pretty sure much of the shadow MMU behavior wasn't developed organically, it
was stolen from Xen. And the line about the Shadow and TDP MMUs cooperating to
support nested virt is loaded with assumptions and qualifiers, and makes it
sound like nested virt only works with _the_ TDP MMU as opposed to _a_ TDP MMU.
* Re: [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h)
2023-02-01 19:45 ` Sean Christopherson
@ 2023-02-01 19:48 ` Ben Gardon
0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-01 19:48 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Feb 1, 2023 at 11:45 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Dec 21, 2022, Ben Gardon wrote:
> > diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
> > new file mode 100644
> > index 000000000000..7bce5ec52b2e
> > --- /dev/null
> > +++ b/arch/x86/kvm/mmu/shadow_mmu.c
> > @@ -0,0 +1,21 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * KVM Shadow MMU
> > + *
> > + * This file implements the Shadow MMU: the KVM MMU implementation which has
> > + * developed organically from hardware which did not have second level paging,
> > + * and so used "shadow paging" to virtualize guest memory. The Shadow MMU is
> > + * an alternative to the TDP MMU which only supports hardware with Two
> > + * Dimensional Paging. (e.g. EPT on Intel or NPT on AMD CPUs.) Note that the
> > + * Shadow MMU also supports TDP, it's just less scalable. The Shadow and TDP
> > + * MMUs can cooperate to support nested virtualization on hardware with TDP.
> > + */
>
> Eh, I vote to omit the comment. For newbies, Documentation is likely a better
> landing spot for describing the MMUs, and people that are familiar with KVM x86
> MMU already know what the shadow MMU is and does. That way we avoid bikeshedding
> this comment, at least in the conext of this series. E.g. I'm pretty sure much
> of the shadow MMU behavior wasn't developed organically, it was stolen from Xen.
> And the line about the Shadow and TDP MMUs cooperating support nested virt is
> loaded with assumptions and qualifiers, and makes it sound like nested virt only
> works with _the_ TDP MMU as oposed to _a_ TDP MMU`.
Sounds good, I can dump the comment. I plan to send out a rebased
version of this series tomorrow, incorporating all the feedback this
series has gotten. Thanks for taking another look at it.
* [RFC 02/14] KVM: x86/MMU: Expose functions for the Shadow MMU
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
2022-12-21 22:24 ` [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
` (13 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Expose various common MMU functions which the Shadow MMU will need via
mmu_internal.h. This just slightly reduces the work needed to move the
shadow MMU code out of mmu.c, which will already be a massive change.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 41 ++++++++++++++-------------------
arch/x86/kvm/mmu/mmu_internal.h | 24 +++++++++++++++++++
2 files changed, 41 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 07b99a7ce830..729a2799d4d7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -156,9 +156,9 @@ struct kvm_shadow_walk_iterator {
({ spte = mmu_spte_get_lockless(_walker.sptep); 1; }); \
__shadow_walk_next(&(_walker), spte))
-static struct kmem_cache *pte_list_desc_cache;
+struct kmem_cache *pte_list_desc_cache;
struct kmem_cache *mmu_page_header_cache;
-static struct percpu_counter kvm_total_used_mmu_pages;
+struct percpu_counter kvm_total_used_mmu_pages;
static void mmu_spte_set(u64 *sptep, u64 spte);
@@ -234,11 +234,6 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
return regs;
}
-static inline bool kvm_available_flush_tlb_with_range(void)
-{
- return kvm_x86_ops.tlb_remote_flush_with_range;
-}
-
static void kvm_flush_remote_tlbs_with_range(struct kvm *kvm,
struct kvm_tlb_range *range)
{
@@ -262,8 +257,8 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
kvm_flush_remote_tlbs_with_range(kvm, &range);
}
-static void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
- unsigned int access)
+void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
+ unsigned int access)
{
u64 spte = make_mmio_spte(vcpu, gfn, access);
@@ -610,7 +605,7 @@ static bool mmu_spte_age(u64 *sptep)
return true;
}
-static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
+void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
{
if (is_tdp_mmu(vcpu->arch.mmu)) {
kvm_tdp_mmu_walk_lockless_begin();
@@ -629,7 +624,7 @@ static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
}
}
-static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
+void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
{
if (is_tdp_mmu(vcpu->arch.mmu)) {
kvm_tdp_mmu_walk_lockless_end();
@@ -822,8 +817,8 @@ void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
&kvm->arch.possible_nx_huge_pages);
}
-static void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- bool nx_huge_page_possible)
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ bool nx_huge_page_possible)
{
sp->nx_huge_page_disallowed = true;
@@ -857,16 +852,15 @@ void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
list_del_init(&sp->possible_nx_huge_page_link);
}
-static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
{
sp->nx_huge_page_disallowed = false;
untrack_possible_nx_huge_page(kvm, sp);
}
-static struct kvm_memory_slot *
-gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
- bool no_dirty_log)
+struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+ gfn_t gfn, bool no_dirty_log)
{
struct kvm_memory_slot *slot;
@@ -1403,7 +1397,7 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
return write_protected;
}
-static bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn)
+bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn)
{
struct kvm_memory_slot *slot;
@@ -1902,9 +1896,8 @@ static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
return ret;
}
-static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
- struct list_head *invalid_list,
- bool remote_flush)
+bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list,
+ bool remote_flush)
{
if (!remote_flush && list_empty(invalid_list))
return false;
@@ -1916,7 +1909,7 @@ static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
return true;
}
-static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
+bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
{
if (sp->role.invalid)
return true;
@@ -6148,7 +6141,7 @@ static inline bool need_topup(struct kvm_mmu_memory_cache *cache, int min)
return kvm_mmu_memory_cache_nr_free_objects(cache) < min;
}
-static bool need_topup_split_caches_or_resched(struct kvm *kvm)
+bool need_topup_split_caches_or_resched(struct kvm *kvm)
{
if (need_resched() || rwlock_needbreak(&kvm->mmu_lock))
return true;
@@ -6163,7 +6156,7 @@ static bool need_topup_split_caches_or_resched(struct kvm *kvm)
need_topup(&kvm->arch.split_shadow_page_cache, 1);
}
-static int topup_split_caches(struct kvm *kvm)
+int topup_split_caches(struct kvm *kvm)
{
/*
* Allocating rmap list entries when splitting huge pages for nested
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index dbaf6755c5a7..856e2e0a8420 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -131,7 +131,9 @@ struct kvm_mmu_page {
#endif
};
+extern struct kmem_cache *pte_list_desc_cache;
extern struct kmem_cache *mmu_page_header_cache;
+extern struct percpu_counter kvm_total_used_mmu_pages;
static inline int kvm_mmu_role_as_id(union kvm_mmu_page_role role)
{
@@ -317,6 +319,28 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ bool nx_huge_page_possible);
void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+static inline bool kvm_available_flush_tlb_with_range(void)
+{
+ return kvm_x86_ops.tlb_remote_flush_with_range;
+}
+
+void mark_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 gfn,
+ unsigned int access);
+struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+ gfn_t gfn, bool no_dirty_log);
+bool kvm_vcpu_write_protect_gfn(struct kvm_vcpu *vcpu, u64 gfn);
+bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list,
+ bool remote_flush);
+bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
+
+void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu);
+void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu);
+
+bool need_topup_split_caches_or_resched(struct kvm *kvm);
+int topup_split_caches(struct kvm *kvm);
#endif /* __KVM_X86_MMU_INTERNAL_H */
--
2.39.0.314.g84b9a713c41-goog
* [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
2022-12-21 22:24 ` [RFC 01/14] KVM: x86/MMU: Add shadow_mmu.(c|h) Ben Gardon
2022-12-21 22:24 ` [RFC 02/14] KVM: x86/MMU: Expose functions for the Shadow MMU Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2023-01-06 19:49 ` David Matlack
2022-12-21 22:24 ` [RFC 05/14] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
` (12 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
In preparation for moving paging_tmpl.h to shadow_mmu.c, expose various
functions it needs through mmu_internal.h. This includes modifying the
BUILD_MMU_ROLE_ACCESSOR macro so that it does not automatically include
the static keyword, since some but not all of the accessors are needed by
paging_tmpl.h.
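For reference, a sketch of what an accessor looks like after this change;
the expansion below is worked out by hand from the macro body and is
illustrative only:

    /* BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp) now expands to roughly: */
    inline bool __maybe_unused is_cr0_wp(struct kvm_mmu *mmu)
    {
            return !!(mmu->cpu_role.base.cr0_wp);
    }

    /* Accessors needed only within mmu.c keep internal linkage by
     * prefixing the invocation:
     */
    static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);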
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++----------------
arch/x86/kvm/mmu/mmu_internal.h | 16 ++++++++++++++++
2 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bf14e181eb12..a17e8a79e4df 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -153,18 +153,18 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
* and the vCPU may be incorrect/irrelevant.
*/
#define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name) \
-static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
+inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
{ \
return !!(mmu->cpu_role. base_or_ext . reg##_##name); \
}
BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
+static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
+static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
+static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
+static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
-BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
+static BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
static inline bool is_cr0_pg(struct kvm_mmu *mmu)
{
@@ -210,7 +210,7 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
kvm_flush_remote_tlbs_with_range(kvm, &range);
}
-static gfn_t get_mmio_spte_gfn(u64 spte)
+gfn_t get_mmio_spte_gfn(u64 spte)
{
u64 gpa = spte & shadow_nonpresent_or_rsvd_lower_gfn_mask;
@@ -240,7 +240,7 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte)
return likely(kvm_gen == spte_gen);
}
-static int is_cpuid_PSE36(void)
+int is_cpuid_PSE36(void)
{
return 1;
}
@@ -279,7 +279,7 @@ void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
}
}
-static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
+int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
{
int r;
@@ -818,8 +818,8 @@ static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
return -EFAULT;
}
-static int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
- unsigned int access)
+int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+ unsigned int access)
{
/* The pfn is invalid, report the error! */
if (unlikely(is_error_pfn(fault->pfn)))
@@ -1275,8 +1275,8 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
return RET_PF_RETRY;
}
-static bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
- struct kvm_page_fault *fault)
+bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
+ struct kvm_page_fault *fault)
{
if (unlikely(fault->rsvd))
return false;
@@ -1338,7 +1338,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
}
-static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
{
struct kvm_memory_slot *slot = fault->slot;
bool async;
@@ -1403,8 +1403,8 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
* Returns true if the page fault is stale and needs to be retried, i.e. if the
* root was invalidated by a memslot update or a relevant mmu_notifier fired.
*/
-static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
- struct kvm_page_fault *fault, int mmu_seq)
+bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+ int mmu_seq)
{
struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 74a99b67f09e..957376fcb333 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -341,6 +341,22 @@ bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu);
void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu);
+int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect);
bool need_topup_split_caches_or_resched(struct kvm *kvm);
int topup_split_caches(struct kvm *kvm);
+
+bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+ int mmu_seq);
+bool page_fault_handle_page_track(struct kvm_vcpu *vcpu,
+ struct kvm_page_fault *fault);
+int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
+ unsigned int access);
+
+gfn_t get_mmio_spte_gfn(u64 spte);
+
+bool is_efer_nx(struct kvm_mmu *mmu);
+bool is_cr4_smep(struct kvm_mmu *mmu);
+bool is_cr0_wp(struct kvm_mmu *mmu);
+int is_cpuid_PSE36(void);
#endif /* __KVM_X86_MMU_INTERNAL_H */
--
2.39.0.314.g84b9a713c41-goog
* Re: [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h
2022-12-21 22:24 ` [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
@ 2023-01-06 19:49 ` David Matlack
2023-01-09 18:47 ` Ben Gardon
0 siblings, 1 reply; 29+ messages in thread
From: David Matlack @ 2023-01-06 19:49 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022 at 10:24:08PM +0000, Ben Gardon wrote:
> In preparation for moving paging_tmpl.h to shadow_mmu.c, expose various
> functions it needs through mmu_internal.h. This includes modifying the
> BUILD_MMU_ROLE_ACCESSOR macro so that it does not automatically include
> the static keyword, since some but not all of the accessors are needed by
> paging_tmpl.h.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++----------------
> arch/x86/kvm/mmu/mmu_internal.h | 16 ++++++++++++++++
> 2 files changed, 32 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index bf14e181eb12..a17e8a79e4df 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -153,18 +153,18 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
> * and the vCPU may be incorrect/irrelevant.
> */
> #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name) \
> -static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
> +inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
> { \
> return !!(mmu->cpu_role. base_or_ext . reg##_##name); \
> }
> BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
> -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
> +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
> BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
> -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
> -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
> -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
> +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
> +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
> +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
> BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
> -BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
> +static BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
Suggest moving all the BUILD_MMU_ROLE*() macros to mmu_internal.h, since
they are already static inline. That would be a cleaner patch and reduce
future churn if shadow_mmu.c ever needs to use a different role accessor
at some point.
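A rough sketch of that alternative, in case it helps (same macro body as
today, just relocated; illustrative only):

    /* mmu_internal.h: the accessors stay static inline, so mmu.c and
     * shadow_mmu.c each get their own copies and nothing needs
     * external linkage.
     */
    #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)                  \
    static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
    {                                                                        \
            return !!(mmu->cpu_role. base_or_ext . reg##_##name);            \
    }
    BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
    BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
    /* ... and so on for the remaining invocations ... */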
>
> static inline bool is_cr0_pg(struct kvm_mmu *mmu)
> {
> @@ -210,7 +210,7 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> kvm_flush_remote_tlbs_with_range(kvm, &range);
> }
>
> -static gfn_t get_mmio_spte_gfn(u64 spte)
> +gfn_t get_mmio_spte_gfn(u64 spte)
> {
> u64 gpa = spte & shadow_nonpresent_or_rsvd_lower_gfn_mask;
>
> @@ -240,7 +240,7 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte)
> return likely(kvm_gen == spte_gen);
> }
>
> -static int is_cpuid_PSE36(void)
> +int is_cpuid_PSE36(void)
> {
> return 1;
> }
Can we just drop is_cpuid_PSE36(), e.g. as a precursor patch? It just
returns 1...
* Re: [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h
2023-01-06 19:49 ` David Matlack
@ 2023-01-09 18:47 ` Ben Gardon
0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-01-09 18:47 UTC (permalink / raw)
To: David Matlack
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
Vipin Sharma, Nagareddy Reddy
On Fri, Jan 6, 2023 at 11:49 AM David Matlack <dmatlack@google.com> wrote:
>
> On Wed, Dec 21, 2022 at 10:24:08PM +0000, Ben Gardon wrote:
> > In preparation for moving paging_tmpl.h to shadow_mmu.c, expose various
> > functions it needs through mmu_internal.h. This includes modifying the
> > BUILD_MMU_ROLE_ACCESSOR macro so that it does not automatically include
> > the static keyword, since some but not all of the accessors are needed by
> > paging_tmpl.h.
> >
> > No functional change intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++----------------
> > arch/x86/kvm/mmu/mmu_internal.h | 16 ++++++++++++++++
> > 2 files changed, 32 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index bf14e181eb12..a17e8a79e4df 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -153,18 +153,18 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
> > * and the vCPU may be incorrect/irrelevant.
> > */
> > #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name) \
> > -static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
> > +inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
> > { \
> > return !!(mmu->cpu_role. base_or_ext . reg##_##name); \
> > }
> > BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
> > -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
> > +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
> > BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
> > -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
> > -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
> > -BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
> > +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
> > +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
> > +static BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
> > BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
> > -BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
> > +static BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma);
>
> Suggest moving all the BUILD_MMU_ROLE*() macros to mmu_internal.h, since
> they are already static inline. That would be a cleaner patch and reduce
> future churn if shadow_mmu.c ever needs to use a different role accessor
> at some point.
That sounds reasonable. Will do in V1.
>
> >
> > static inline bool is_cr0_pg(struct kvm_mmu *mmu)
> > {
> > @@ -210,7 +210,7 @@ void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> > kvm_flush_remote_tlbs_with_range(kvm, &range);
> > }
> >
> > -static gfn_t get_mmio_spte_gfn(u64 spte)
> > +gfn_t get_mmio_spte_gfn(u64 spte)
> > {
> > u64 gpa = spte & shadow_nonpresent_or_rsvd_lower_gfn_mask;
> >
> > @@ -240,7 +240,7 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte)
> > return likely(kvm_gen == spte_gen);
> > }
> >
> > -static int is_cpuid_PSE36(void)
> > +int is_cpuid_PSE36(void)
> > {
> > return 1;
> > }
>
> Can we just drop is_cpuid_PSE36(), e.g. as a precursor patch? It just
> returns 1...
Yeah, good idea. Looks like we can eliminate a little dead code doing that too.
* [RFC 05/14] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (2 preceding siblings ...)
2022-12-21 22:24 ` [RFC 04/14] KVM: x86/MMU: Expose functions for paging_tmpl.h Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 06/14] KVM: x86/MMU: Clean up Shadow MMU exports Ben Gardon
` (11 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Move the integration point for paging_tmpl.h to shadow_mmu.c since
paging_tmpl.h is ostensibly part of the Shadow MMU. This requires
modifying some of the definitions to be non-static and then exporting
the pre-processed function names through shadow_mmu.h since they are
needed for mmu context callbacks in mmu.c. This will facilitate cleanups
in following commits because many of the functions being exposed by
shadow_mmu.h are only needed by paging_tmpl.h. Those functions will no
longer need to be exported.
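For context, paging_tmpl.h derives the pre-processed function names from
PTTYPE via its FNAME() macro; a simplified sketch of the name mangling (the
exact definitions live at the top of paging_tmpl.h):

    #if PTTYPE == 64
    #define FNAME(name) paging64_##name
    #elif PTTYPE == 32
    #define FNAME(name) paging32_##name
    #elif PTTYPE == PTTYPE_EPT
    #define FNAME(name) ept_##name
    #endif

    /* So FNAME(page_fault) compiles to paging64_page_fault,
     * paging32_page_fault, or ept_page_fault depending on which
     * inclusion is being processed; those are the names shadow_mmu.h
     * now declares for mmu.c's mmu context callbacks.
     */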
sync_mmio_spte() is only used by paging_tmpl.h, so move it along with
the includes.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 29 -----------------------------
arch/x86/kvm/mmu/paging_tmpl.h | 11 +++++------
arch/x86/kvm/mmu/shadow_mmu.c | 30 ++++++++++++++++++++++++++++++
arch/x86/kvm/mmu/shadow_mmu.h | 25 ++++++++++++++++++++++++-
4 files changed, 59 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a17e8a79e4df..dd97e346c786 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1699,35 +1699,6 @@ static unsigned long get_cr3(struct kvm_vcpu *vcpu)
return kvm_read_cr3(vcpu);
}
-static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
- unsigned int access)
-{
- if (unlikely(is_mmio_spte(*sptep))) {
- if (gfn != get_mmio_spte_gfn(*sptep)) {
- mmu_spte_clear_no_track(sptep);
- return true;
- }
-
- mark_mmio_spte(vcpu, sptep, gfn, access);
- return true;
- }
-
- return false;
-}
-
-#define PTTYPE_EPT 18 /* arbitrary */
-#define PTTYPE PTTYPE_EPT
-#include "paging_tmpl.h"
-#undef PTTYPE
-
-#define PTTYPE 64
-#include "paging_tmpl.h"
-#undef PTTYPE
-
-#define PTTYPE 32
-#include "paging_tmpl.h"
-#undef PTTYPE
-
static void
__reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
u64 pa_bits_rsvd, int level, bool nx, bool gbpages,
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 0f6455072055..2e3b2aca64ad 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -787,7 +787,7 @@ FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
* Returns: 1 if we need to emulate the instruction, 0 otherwise, or
* a negative value on error.
*/
-static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
{
struct guest_walker walker;
int r;
@@ -897,7 +897,7 @@ static gpa_t FNAME(get_level1_sp_gpa)(struct kvm_mmu_page *sp)
return gfn_to_gpa(sp->gfn) + offset * sizeof(pt_element_t);
}
-static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
+void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
{
struct kvm_shadow_walk_iterator iterator;
struct kvm_mmu_page *sp;
@@ -957,9 +957,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa)
}
/* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GPA. */
-static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
- gpa_t addr, u64 access,
- struct x86_exception *exception)
+gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gpa_t addr,
+ u64 access, struct x86_exception *exception)
{
struct guest_walker walker;
gpa_t gpa = INVALID_GPA;
@@ -992,7 +991,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
* 0: the sp is synced and no tlb flushing is required
* > 0: the sp is synced and tlb flushing is required
*/
-static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
{
union kvm_mmu_page_role root_role = vcpu->arch.mmu->root_role;
int i;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 05d8f5be559d..86b5fb75d50a 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -10,6 +10,7 @@
* Shadow MMU also supports TDP, it's just less scalable. The Shadow and TDP
* MMUs can cooperate to support nested virtualization on hardware with TDP.
*/
+#include "ioapic.h"
#include "mmu.h"
#include "mmu_internal.h"
#include "mmutrace.h"
@@ -2798,6 +2799,35 @@ void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
walk_shadow_page_lockless_end(vcpu);
}
+static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
+ unsigned int access)
+{
+ if (unlikely(is_mmio_spte(*sptep))) {
+ if (gfn != get_mmio_spte_gfn(*sptep)) {
+ mmu_spte_clear_no_track(sptep);
+ return true;
+ }
+
+ mark_mmio_spte(vcpu, sptep, gfn, access);
+ return true;
+ }
+
+ return false;
+}
+
+#define PTTYPE_EPT 18 /* arbitrary */
+#define PTTYPE PTTYPE_EPT
+#include "paging_tmpl.h"
+#undef PTTYPE
+
+#define PTTYPE 64
+#include "paging_tmpl.h"
+#undef PTTYPE
+
+#define PTTYPE 32
+#include "paging_tmpl.h"
+#undef PTTYPE
+
static bool is_obsolete_root(struct kvm *kvm, hpa_t root_hpa)
{
struct kvm_mmu_page *sp;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 83876047c1f5..00d2f9abecf0 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -73,7 +73,6 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
int level, pte_t unused);
void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte);
-int nonpaging_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
bool can_yield);
void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
@@ -150,4 +149,28 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
const struct kvm_memory_slot *slot);
unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
+
+/* Exports from paging_tmpl.h */
+gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+ gpa_t vaddr, u64 access,
+ struct x86_exception *exception);
+gpa_t paging64_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+ gpa_t vaddr, u64 access,
+ struct x86_exception *exception);
+gpa_t ept_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gpa_t vaddr,
+ u64 access, struct x86_exception *exception);
+
+int paging32_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int paging64_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int ept_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+
+int paging32_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+int paging64_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+int ept_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+/* Defined in shadow_mmu.c. */
+int nonpaging_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+
+void paging32_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
+void paging64_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
+void ept_invlpg(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root);
#endif /* __KVM_X86_MMU_SHADOW_MMU_H */
--
2.39.0.314.g84b9a713c41-goog
* [RFC 06/14] KVM: x86/MMU: Clean up Shadow MMU exports
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (3 preceding siblings ...)
2022-12-21 22:24 ` [RFC 05/14] KVM: x86/MMU: Move paging_tmpl.h includes to shadow_mmu.c Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
` (10 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Now that paging_tmpl.h is included from shadow_mmu.c, there's no need to
export many of the functions currently in shadow_mmu.h, so remove those
exports and mark the functions static. This cleans up the interface
of the Shadow MMU, and will allow the implementation to keep the details
of rmap_heads internal.
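For context, the rmap_head detail in question is its compressed encoding; a
simplified sketch (struct layout per kvm_host.h and the existing pte_list
code; the bookkeeping fields are elided):

    /*
     * An rmap_head is a single word: when a gfn is mapped by exactly
     * one spte, val points directly at that spte; when mapped by
     * several, the low bit is set and val points at a pte_list_desc.
     */
    struct kvm_rmap_head {
            unsigned long val;
    };

    struct pte_list_desc {
            struct pte_list_desc *more;     /* chains additional descs */
            /* ... spte counts elided ... */
            u64 *sptes[PTE_LIST_EXT];       /* back-pointers to sptes */
    };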
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/shadow_mmu.c | 78 +++++++++++++++++++++--------------
arch/x86/kvm/mmu/shadow_mmu.h | 51 +----------------------
2 files changed, 48 insertions(+), 81 deletions(-)
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 86b5fb75d50a..090b4788f7de 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -21,6 +21,20 @@
#include <asm/cmpxchg.h>
#include <trace/events/kvm.h>
+struct kvm_shadow_walk_iterator {
+ u64 addr;
+ hpa_t shadow_addr;
+ u64 *sptep;
+ int level;
+ unsigned index;
+};
+
+#define for_each_shadow_entry_using_root(_vcpu, _root, _addr, _walker) \
+ for (shadow_walk_init_using_root(&(_walker), (_vcpu), \
+ (_root), (_addr)); \
+ shadow_walk_okay(&(_walker)); \
+ shadow_walk_next(&(_walker)))
+
#define for_each_shadow_entry(_vcpu, _addr, _walker) \
for (shadow_walk_init(&(_walker), _vcpu, _addr); \
shadow_walk_okay(&(_walker)); \
@@ -227,7 +241,7 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
*
* Returns true if the TLB needs to be flushed
*/
-bool mmu_spte_update(u64 *sptep, u64 new_spte)
+static bool mmu_spte_update(u64 *sptep, u64 new_spte)
{
bool flush = false;
u64 old_spte = mmu_spte_update_no_track(sptep, new_spte);
@@ -311,7 +325,7 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
* Directly clear spte without caring the state bits of sptep,
* it is used to set the upper level spte.
*/
-void mmu_spte_clear_no_track(u64 *sptep)
+static void mmu_spte_clear_no_track(u64 *sptep)
{
__update_clear_spte_fast(sptep, 0ull);
}
@@ -354,7 +368,7 @@ static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc)
static bool sp_has_gptes(struct kvm_mmu_page *sp);
-gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
+static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
{
if (sp->role.passthrough)
return sp->gfn;
@@ -410,8 +424,8 @@ static void kvm_mmu_page_set_translation(struct kvm_mmu_page *sp, int index,
sp->gfn, kvm_mmu_page_get_gfn(sp, index), gfn);
}
-void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
- unsigned int access)
+static void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
+ unsigned int access)
{
gfn_t gfn = kvm_mmu_page_get_gfn(sp, index);
@@ -627,7 +641,7 @@ struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
return &slot->arch.rmap[level - PG_LEVEL_4K][idx];
}
-bool rmap_can_add(struct kvm_vcpu *vcpu)
+static bool rmap_can_add(struct kvm_vcpu *vcpu)
{
struct kvm_mmu_memory_cache *mc;
@@ -735,7 +749,7 @@ static u64 *rmap_get_next(struct rmap_iterator *iter)
for (_spte_ = rmap_get_first(_rmap_head_, _iter_); \
_spte_; _spte_ = rmap_get_next(_iter_))
-void drop_spte(struct kvm *kvm, u64 *sptep)
+static void drop_spte(struct kvm *kvm, u64 *sptep)
{
u64 old_spte = mmu_spte_clear_track_bits(kvm, sptep);
@@ -1112,7 +1126,7 @@ static void mmu_page_remove_parent_pte(struct kvm_mmu_page *sp,
pte_list_remove(parent_pte, &sp->parent_ptes);
}
-void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte)
+static void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte)
{
mmu_page_remove_parent_pte(sp, parent_pte);
mmu_spte_clear_no_track(parent_pte);
@@ -1342,8 +1356,8 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
} while (!sp->unsync_children);
}
-int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
- bool can_yield)
+static int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
+ bool can_yield)
{
int i;
struct kvm_mmu_page *sp;
@@ -1389,7 +1403,7 @@ void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
atomic_set(&sp->write_flooding_count, 0);
}
-void clear_sp_write_flooding_count(u64 *spte)
+static void clear_sp_write_flooding_count(u64 *spte)
{
__clear_sp_write_flooding_count(sptep_to_sp(spte));
}
@@ -1602,9 +1616,9 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct,
return role;
}
-struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
- gfn_t gfn, bool direct,
- unsigned int access)
+static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu,
+ u64 *sptep, gfn_t gfn,
+ bool direct, unsigned int access)
{
union kvm_mmu_page_role role;
@@ -1615,8 +1629,9 @@ struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
return kvm_mmu_get_shadow_page(vcpu, gfn, role);
}
-void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
- struct kvm_vcpu *vcpu, hpa_t root, u64 addr)
+static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
+ struct kvm_vcpu *vcpu, hpa_t root,
+ u64 addr)
{
iterator->addr = addr;
iterator->shadow_addr = root;
@@ -1643,14 +1658,14 @@ void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
}
}
-void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
- struct kvm_vcpu *vcpu, u64 addr)
+static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
+ struct kvm_vcpu *vcpu, u64 addr)
{
shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root.hpa,
addr);
}
-bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator)
+static bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator)
{
if (iterator->level < PG_LEVEL_4K)
return false;
@@ -1672,7 +1687,7 @@ static void __shadow_walk_next(struct kvm_shadow_walk_iterator *iterator,
--iterator->level;
}
-void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
+static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator)
{
__shadow_walk_next(iterator, *iterator->sptep);
}
@@ -1703,13 +1718,14 @@ static void __link_shadow_page(struct kvm *kvm,
mark_unsync(sptep);
}
-void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp)
+static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep,
+ struct kvm_mmu_page *sp)
{
__link_shadow_page(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, sptep, sp, true);
}
-void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
- unsigned direct_access)
+static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
+ unsigned direct_access)
{
if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) {
struct kvm_mmu_page *child;
@@ -1731,8 +1747,8 @@ void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
}
/* Returns the number of zapped non-leaf child shadow pages. */
-int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
- struct list_head *invalid_list)
+static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
+ struct list_head *invalid_list)
{
u64 pte;
struct kvm_mmu_page *child;
@@ -2144,9 +2160,9 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot,
return 0;
}
-int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
- u64 *sptep, unsigned int pte_access, gfn_t gfn,
- kvm_pfn_t pfn, struct kvm_page_fault *fault)
+static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
+ u64 *sptep, unsigned int pte_access, gfn_t gfn,
+ kvm_pfn_t pfn, struct kvm_page_fault *fault)
{
struct kvm_mmu_page *sp = sptep_to_sp(sptep);
int level = sp->role.level;
@@ -2251,8 +2267,8 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
return 0;
}
-void __direct_pte_prefetch(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
- u64 *sptep)
+static void __direct_pte_prefetch(struct kvm_vcpu *vcpu,
+ struct kvm_mmu_page *sp, u64 *sptep)
{
u64 *spte, *start = NULL;
int i;
@@ -2788,7 +2804,7 @@ int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
return leaf;
}
-void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
+static void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr)
{
struct kvm_shadow_walk_iterator iterator;
u64 spte;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 00d2f9abecf0..20c65a0ea52c 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -23,32 +23,11 @@ struct pte_list_desc {
u64 *sptes[PTE_LIST_EXT];
};
+/* Only exported for debugfs.c. */
unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
-struct kvm_shadow_walk_iterator {
- u64 addr;
- hpa_t shadow_addr;
- u64 *sptep;
- int level;
- unsigned index;
-};
-
-#define for_each_shadow_entry_using_root(_vcpu, _root, _addr, _walker) \
- for (shadow_walk_init_using_root(&(_walker), (_vcpu), \
- (_root), (_addr)); \
- shadow_walk_okay(&(_walker)); \
- shadow_walk_next(&(_walker)))
-
-bool mmu_spte_update(u64 *sptep, u64 new_spte);
-void mmu_spte_clear_no_track(u64 *sptep);
-gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index);
-void kvm_mmu_page_set_access(struct kvm_mmu_page *sp, int index,
- unsigned int access);
-
struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
const struct kvm_memory_slot *slot);
-bool rmap_can_add(struct kvm_vcpu *vcpu);
-void drop_spte(struct kvm *kvm, u64 *sptep);
bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect);
bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
const struct kvm_memory_slot *slot);
@@ -72,30 +51,8 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
struct kvm_memory_slot *slot, gfn_t gfn,
int level, pte_t unused);
-void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte);
-int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
- bool can_yield);
void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
-void clear_sp_write_flooding_count(u64 *spte);
-
-struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep,
- gfn_t gfn, bool direct,
- unsigned int access);
-
-void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator,
- struct kvm_vcpu *vcpu, hpa_t root, u64 addr);
-void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
- struct kvm_vcpu *vcpu, u64 addr);
-bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator);
-void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator);
-
-void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp);
-
-void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
- unsigned direct_access);
-int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
- struct list_head *invalid_list);
bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
struct list_head *invalid_list,
int *nr_zapped);
@@ -107,11 +64,6 @@ int make_mmu_pages_available(struct kvm_vcpu *vcpu);
int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
-int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
- u64 *sptep, unsigned int pte_access, gfn_t gfn,
- kvm_pfn_t pfn, struct kvm_page_fault *fault);
-void __direct_pte_prefetch(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
- u64 *sptep);
int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
@@ -121,7 +73,6 @@ int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level);
-void shadow_page_table_clear_flood(struct kvm_vcpu *vcpu, gva_t addr);
void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
int bytes, struct kvm_page_track_notifier_node *node);
--
2.39.0.314.g84b9a713c41-goog
* [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (4 preceding siblings ...)
2022-12-21 22:24 ` [RFC 06/14] KVM: x86/MMU: Clean up Shadow MMU exports Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2023-01-13 17:43 ` Vipin Sharma
2023-01-13 17:51 ` Vipin Sharma
2022-12-21 22:24 ` [RFC 08/14] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions Ben Gardon
` (9 subsequent siblings)
15 siblings, 2 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
The MMU shrinker currently only operates on the Shadow MMU, but having
the entire implementation in shadow_mmu.c is awkward since much of the
functionality isn't Shadow MMU specific. There has also been talk of changing the
target of the shrinker to the MMU caches rather than already allocated page
tables. As a result, it makes sense to move some of the implementation back
to mmu.c.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 43 ++++++++++++++++++++++++
arch/x86/kvm/mmu/shadow_mmu.c | 62 ++++++++---------------------------
arch/x86/kvm/mmu/shadow_mmu.h | 3 +-
3 files changed, 58 insertions(+), 50 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dd97e346c786..4c45a5b63356 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3147,6 +3147,49 @@ static unsigned long mmu_shrink_count(struct shrinker *shrink,
return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
}
+unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+ struct kvm *kvm;
+ int nr_to_scan = sc->nr_to_scan;
+ unsigned long freed = 0;
+
+ mutex_lock(&kvm_lock);
+
+ list_for_each_entry(kvm, &vm_list, vm_list) {
+ /*
+ * Never scan more than sc->nr_to_scan VM instances.
+ * Will not hit this condition practically since we do not try
+ * to shrink more than one VM and it is very unlikely to see
+ * !n_used_mmu_pages so many times.
+ */
+ if (!nr_to_scan--)
+ break;
+
+ /*
+ * n_used_mmu_pages is accessed without holding kvm->mmu_lock
+ * here. We may skip a VM instance erroneously, but we do not
+ * want to shrink a VM that only started to populate its MMU
+ * anyway.
+ */
+ if (!kvm->arch.n_used_mmu_pages &&
+ !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
+ continue;
+
+ freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
+
+ /*
+ * unfair on small ones
+ * per-vm shrinkers cry out
+ * sadness comes quickly
+ */
+ list_move_tail(&kvm->vm_list, &vm_list);
+ break;
+ }
+
+ mutex_unlock(&kvm_lock);
+ return freed;
+}
+
static struct shrinker mmu_shrinker = {
.count_objects = mmu_shrink_count,
.scan_objects = mmu_shrink_scan,
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 090b4788f7de..1259c4a3b140 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3147,7 +3147,7 @@ void kvm_zap_obsolete_pages(struct kvm *kvm)
kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
}
-static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
+bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
{
return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
}
@@ -3416,60 +3416,24 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
}
-unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
{
- struct kvm *kvm;
- int nr_to_scan = sc->nr_to_scan;
unsigned long freed = 0;
+ int idx;
- mutex_lock(&kvm_lock);
-
- list_for_each_entry(kvm, &vm_list, vm_list) {
- int idx;
- LIST_HEAD(invalid_list);
-
- /*
- * Never scan more than sc->nr_to_scan VM instances.
- * Will not hit this condition practically since we do not try
- * to shrink more than one VM and it is very unlikely to see
- * !n_used_mmu_pages so many times.
- */
- if (!nr_to_scan--)
- break;
- /*
- * n_used_mmu_pages is accessed without holding kvm->mmu_lock
- * here. We may skip a VM instance erroneously, but we do not
- * want to shrink a VM that only started to populate its MMU
- * anyway.
- */
- if (!kvm->arch.n_used_mmu_pages &&
- !kvm_has_zapped_obsolete_pages(kvm))
- continue;
-
- idx = srcu_read_lock(&kvm->srcu);
- write_lock(&kvm->mmu_lock);
-
- if (kvm_has_zapped_obsolete_pages(kvm)) {
- kvm_mmu_commit_zap_page(kvm,
- &kvm->arch.zapped_obsolete_pages);
- goto unlock;
- }
+ idx = srcu_read_lock(&kvm->srcu);
+ write_lock(&kvm->mmu_lock);
- freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan);
+ if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
+ kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+ goto out;
+ }
-unlock:
- write_unlock(&kvm->mmu_lock);
- srcu_read_unlock(&kvm->srcu, idx);
+ freed = kvm_mmu_zap_oldest_mmu_pages(kvm, pages_to_free);
- /*
- * unfair on small ones
- * per-vm shrinkers cry out
- * sadness comes quickly
- */
- list_move_tail(&kvm->vm_list, &vm_list);
- break;
- }
+out:
+ write_unlock(&kvm->mmu_lock);
+ srcu_read_unlock(&kvm->srcu, idx);
- mutex_unlock(&kvm_lock);
return freed;
}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 20c65a0ea52c..9952aa1e86cf 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -99,7 +99,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
const struct kvm_memory_slot *slot);
-unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
+bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
+unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
/* Exports from paging_tmpl.h */
gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
--
2.39.0.314.g84b9a713c41-goog
* Re: [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
2022-12-21 22:24 ` [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
@ 2023-01-13 17:43 ` Vipin Sharma
2023-01-13 17:51 ` Vipin Sharma
1 sibling, 0 replies; 29+ messages in thread
From: Vipin Sharma @ 2023-01-13 17:43 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
David Matlack, Nagareddy Reddy
On Wed, Dec 21, 2022 at 2:24 PM Ben Gardon <bgardon@google.com> wrote:
>
> The MMU shrinker currently only operates on the Shadow MMU, but having
> the entire implementation in shadow_mmu.c is awkward since much of the
> functionality isn't Shadow MMU specific. There has also been talk of changing the
> target of the shrinker to the MMU caches rather than already allocated page
> tables. As a result, it makes sense to move some of the implementation back
> to mmu.c.
>
My recommendation is to move the whole function back to mmu.c instead
of splitting it, so that all of the shrinker logic is contained in one
place instead of being spread over two files. kvm_shadow_mmu_shrink_scan(),
the new function created in this patch, is not used anywhere else, and
this patch already exposes several other functions from the shadow MMU.
It would be cleaner to just expose the remaining functions needed by
shrink_scan() and keep the implementation of shrink_scan() the same,
in one place.
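A sketch of that alternative, for concreteness (declarations are
illustrative; signatures inferred from the call sites in this patch):

    /* shadow_mmu.h: expose only the helpers mmu_shrink_scan() calls. */
    bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
    void kvm_mmu_commit_zap_page(struct kvm *kvm,
                                 struct list_head *invalid_list);
    unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
                                               unsigned long nr_to_zap);

mmu.c would then keep the whole scan loop, including the srcu/mmu_lock
dance, in one place.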
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 43 ++++++++++++++++++++++++
> arch/x86/kvm/mmu/shadow_mmu.c | 62 ++++++++---------------------------
> arch/x86/kvm/mmu/shadow_mmu.h | 3 +-
> 3 files changed, 58 insertions(+), 50 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index dd97e346c786..4c45a5b63356 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3147,6 +3147,49 @@ static unsigned long mmu_shrink_count(struct shrinker *shrink,
> return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> }
>
> +unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> +{
> + struct kvm *kvm;
> + int nr_to_scan = sc->nr_to_scan;
> + unsigned long freed = 0;
> +
> + mutex_lock(&kvm_lock);
> +
> + list_for_each_entry(kvm, &vm_list, vm_list) {
> + /*
> + * Never scan more than sc->nr_to_scan VM instances.
> + * Will not hit this condition practically since we do not try
> + * to shrink more than one VM and it is very unlikely to see
> + * !n_used_mmu_pages so many times.
> + */
> + if (!nr_to_scan--)
> + break;
> +
> + /*
> + * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> + * here. We may skip a VM instance erroneously, but we do not
> + * want to shrink a VM that only started to populate its MMU
> + * anyway.
> + */
> + if (!kvm->arch.n_used_mmu_pages &&
> + !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
> + continue;
> +
> + freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
> +
> + /*
> + * unfair on small ones
> + * per-vm shrinkers cry out
> + * sadness comes quickly
> + */
> + list_move_tail(&kvm->vm_list, &vm_list);
> + break;
> + }
> +
> + mutex_unlock(&kvm_lock);
> + return freed;
> +}
> +
> static struct shrinker mmu_shrinker = {
> .count_objects = mmu_shrink_count,
> .scan_objects = mmu_shrink_scan,
> diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
> index 090b4788f7de..1259c4a3b140 100644
> --- a/arch/x86/kvm/mmu/shadow_mmu.c
> +++ b/arch/x86/kvm/mmu/shadow_mmu.c
> @@ -3147,7 +3147,7 @@ void kvm_zap_obsolete_pages(struct kvm *kvm)
> kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
> }
>
> -static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
> +bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
> {
> return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
> }
> @@ -3416,60 +3416,24 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> }
>
> -unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> +unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
> {
> - struct kvm *kvm;
> - int nr_to_scan = sc->nr_to_scan;
> unsigned long freed = 0;
> + int idx;
>
> - mutex_lock(&kvm_lock);
> -
> - list_for_each_entry(kvm, &vm_list, vm_list) {
> - int idx;
> - LIST_HEAD(invalid_list);
> -
> - /*
> - * Never scan more than sc->nr_to_scan VM instances.
> - * Will not hit this condition practically since we do not try
> - * to shrink more than one VM and it is very unlikely to see
> - * !n_used_mmu_pages so many times.
> - */
> - if (!nr_to_scan--)
> - break;
> - /*
> - * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> - * here. We may skip a VM instance errorneosly, but we do not
> - * want to shrink a VM that only started to populate its MMU
> - * anyway.
> - */
> - if (!kvm->arch.n_used_mmu_pages &&
> - !kvm_has_zapped_obsolete_pages(kvm))
> - continue;
> -
> - idx = srcu_read_lock(&kvm->srcu);
> - write_lock(&kvm->mmu_lock);
> -
> - if (kvm_has_zapped_obsolete_pages(kvm)) {
> - kvm_mmu_commit_zap_page(kvm,
> - &kvm->arch.zapped_obsolete_pages);
> - goto unlock;
> - }
> + idx = srcu_read_lock(&kvm->srcu);
> + write_lock(&kvm->mmu_lock);
>
> - freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan);
> + if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
> + kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
> + goto out;
> + }
>
> -unlock:
> - write_unlock(&kvm->mmu_lock);
> - srcu_read_unlock(&kvm->srcu, idx);
> + freed = kvm_mmu_zap_oldest_mmu_pages(kvm, pages_to_free);
>
> - /*
> - * unfair on small ones
> - * per-vm shrinkers cry out
> - * sadness comes quickly
> - */
> - list_move_tail(&kvm->vm_list, &vm_list);
> - break;
> - }
> +out:
> + write_unlock(&kvm->mmu_lock);
> + srcu_read_unlock(&kvm->srcu, idx);
>
> - mutex_unlock(&kvm_lock);
> return freed;
> }
> diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
> index 20c65a0ea52c..9952aa1e86cf 100644
> --- a/arch/x86/kvm/mmu/shadow_mmu.h
> +++ b/arch/x86/kvm/mmu/shadow_mmu.h
> @@ -99,7 +99,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
> void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> const struct kvm_memory_slot *slot);
>
> -unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
> +bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
> +unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
>
> /* Exports from paging_tmpl.h */
> gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> --
> 2.39.0.314.g84b9a713c41-goog
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU
2022-12-21 22:24 ` [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
2023-01-13 17:43 ` Vipin Sharma
@ 2023-01-13 17:51 ` Vipin Sharma
1 sibling, 0 replies; 29+ messages in thread
From: Vipin Sharma @ 2023-01-13 17:51 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
David Matlack, Nagareddy Reddy
On Wed, Dec 21, 2022 at 2:24 PM Ben Gardon <bgardon@google.com> wrote:
>
> The MMU shrinker currently only operates on the Shadow MMU, but having
> the entire implementation in shadow_mmu.c is awkward since much of the
> function isn't Shadow MMU specific. There has also been talk of changing the
> target of the shrinker to the MMU caches rather than already allocated page
> tables. As a result, it makes sense to move some of the implementation back
> to mmu.c.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 43 ++++++++++++++++++++++++
> arch/x86/kvm/mmu/shadow_mmu.c | 62 ++++++++---------------------------
> arch/x86/kvm/mmu/shadow_mmu.h | 3 +-
> 3 files changed, 58 insertions(+), 50 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index dd97e346c786..4c45a5b63356 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3147,6 +3147,49 @@ static unsigned long mmu_shrink_count(struct shrinker *shrink,
> return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> }
>
> +unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> +{
> + struct kvm *kvm;
> + int nr_to_scan = sc->nr_to_scan;
> + unsigned long freed = 0;
> +
> + mutex_lock(&kvm_lock);
> +
> + list_for_each_entry(kvm, &vm_list, vm_list) {
> + /*
> + * Never scan more than sc->nr_to_scan VM instances.
> + * Will not hit this condition practically since we do not try
> + * to shrink more than one VM and it is very unlikely to see
> + * !n_used_mmu_pages so many times.
> + */
> + if (!nr_to_scan--)
> + break;
> +
> + /*
> + * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> + * here. We may skip a VM instance erroneously, but we do not
> + * want to shrink a VM that only started to populate its MMU
> + * anyway.
> + */
> + if (!kvm->arch.n_used_mmu_pages &&
> + !kvm_shadow_mmu_has_zapped_obsolete_pages(kvm))
> + continue;
> +
> + freed = kvm_shadow_mmu_shrink_scan(kvm, sc->nr_to_scan);
> +
> + /*
> + * unfair on small ones
> + * per-vm shrinkers cry out
> + * sadness comes quickly
> + */
> + list_move_tail(&kvm->vm_list, &vm_list);
> + break;
> + }
> +
> + mutex_unlock(&kvm_lock);
> + return freed;
> +}
> +
> static struct shrinker mmu_shrinker = {
> .count_objects = mmu_shrink_count,
> .scan_objects = mmu_shrink_scan,
> diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
> index 090b4788f7de..1259c4a3b140 100644
> --- a/arch/x86/kvm/mmu/shadow_mmu.c
> +++ b/arch/x86/kvm/mmu/shadow_mmu.c
> @@ -3147,7 +3147,7 @@ void kvm_zap_obsolete_pages(struct kvm *kvm)
> kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
> }
>
> -static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
> +bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
Renaming the function and removing the static qualifier should be done
as two separate commits.
> {
> return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
> }
> @@ -3416,60 +3416,24 @@ void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> }
>
> -unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> +unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
> {
> - struct kvm *kvm;
> - int nr_to_scan = sc->nr_to_scan;
> unsigned long freed = 0;
> + int idx;
>
> - mutex_lock(&kvm_lock);
> -
> - list_for_each_entry(kvm, &vm_list, vm_list) {
> - int idx;
> - LIST_HEAD(invalid_list);
> -
> - /*
> - * Never scan more than sc->nr_to_scan VM instances.
> - * Will not hit this condition practically since we do not try
> - * to shrink more than one VM and it is very unlikely to see
> - * !n_used_mmu_pages so many times.
> - */
> - if (!nr_to_scan--)
> - break;
> - /*
> - * n_used_mmu_pages is accessed without holding kvm->mmu_lock
> - * here. We may skip a VM instance errorneosly, but we do not
> - * want to shrink a VM that only started to populate its MMU
> - * anyway.
> - */
> - if (!kvm->arch.n_used_mmu_pages &&
> - !kvm_has_zapped_obsolete_pages(kvm))
> - continue;
> -
> - idx = srcu_read_lock(&kvm->srcu);
> - write_lock(&kvm->mmu_lock);
> -
> - if (kvm_has_zapped_obsolete_pages(kvm)) {
> - kvm_mmu_commit_zap_page(kvm,
> - &kvm->arch.zapped_obsolete_pages);
> - goto unlock;
> - }
> + idx = srcu_read_lock(&kvm->srcu);
> + write_lock(&kvm->mmu_lock);
>
> - freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan);
> + if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
> + kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
> + goto out;
> + }
>
> -unlock:
> - write_unlock(&kvm->mmu_lock);
> - srcu_read_unlock(&kvm->srcu, idx);
> + freed = kvm_mmu_zap_oldest_mmu_pages(kvm, pages_to_free);
>
> - /*
> - * unfair on small ones
> - * per-vm shrinkers cry out
> - * sadness comes quickly
> - */
> - list_move_tail(&kvm->vm_list, &vm_list);
> - break;
> - }
> +out:
> + write_unlock(&kvm->mmu_lock);
> + srcu_read_unlock(&kvm->srcu, idx);
>
> - mutex_unlock(&kvm_lock);
> return freed;
> }
> diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
> index 20c65a0ea52c..9952aa1e86cf 100644
> --- a/arch/x86/kvm/mmu/shadow_mmu.h
> +++ b/arch/x86/kvm/mmu/shadow_mmu.h
> @@ -99,7 +99,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
> void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> const struct kvm_memory_slot *slot);
>
> -unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc);
> +bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
> +unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
>
> /* Exports from paging_tmpl.h */
> gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> --
> 2.39.0.314.g84b9a713c41-goog
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC 08/14] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (5 preceding siblings ...)
2022-12-21 22:24 ` [RFC 07/14] KVM: x86/MMU: Cleanup shrinker interface with Shadow MMU Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault Ben Gardon
` (8 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Change the naming scheme for several functions exported from the Shadow
MMU to match the naming scheme used by the TDP MMU: a kvm_shadow_mmu_
prefix.
More cleanups will follow to convert the remaining functions to a similar
naming scheme, but for now, start with the trivial renames.
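For quick reference, the renames in this patch follow the pattern below
(a non-exhaustive summary drawn from the diff):

	make_mmu_pages_available()       -> kvm_shadow_mmu_make_pages_available()
	kvm_mmu_unprotect_page_virt()    -> kvm_shadow_mmu_unprotect_page_virt()
	__direct_map()                   -> kvm_shadow_mmu_direct_map()
	get_walk()                       -> kvm_shadow_mmu_get_walk()
	kvm_zap_obsolete_pages()         -> kvm_shadow_mmu_zap_obsolete_pages()
	kvm_rmap_zap_gfn_range()         -> kvm_shadow_mmu_zap_gfn_range()
	kvm_rmap_zap_collapsible_sptes() -> kvm_shadow_mmu_zap_collapsible_sptes()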
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 19 ++++++++++---------
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
arch/x86/kvm/mmu/shadow_mmu.c | 19 ++++++++++---------
arch/x86/kvm/mmu/shadow_mmu.h | 17 +++++++++--------
4 files changed, 30 insertions(+), 27 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4c45a5b63356..bacb519ba7b4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1129,7 +1129,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
int r;
write_lock(&vcpu->kvm->mmu_lock);
- r = make_mmu_pages_available(vcpu);
+ r = kvm_shadow_mmu_make_pages_available(vcpu);
if (r < 0)
goto out_unlock;
@@ -1204,7 +1204,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
if (is_tdp_mmu(vcpu->arch.mmu))
leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
else
- leaf = get_walk(vcpu, addr, sptes, &root);
+ leaf = kvm_shadow_mmu_get_walk(vcpu, addr, sptes, &root);
walk_shadow_page_lockless_end(vcpu);
@@ -1469,14 +1469,14 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
if (is_page_fault_stale(vcpu, fault, mmu_seq))
goto out_unlock;
- r = make_mmu_pages_available(vcpu);
+ r = kvm_shadow_mmu_make_pages_available(vcpu);
if (r)
goto out_unlock;
if (is_tdp_mmu_fault)
r = kvm_tdp_mmu_map(vcpu, fault);
else
- r = __direct_map(vcpu, fault);
+ r = kvm_shadow_mmu_direct_map(vcpu, fault);
out_unlock:
if (is_tdp_mmu_fault)
@@ -1514,7 +1514,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
trace_kvm_page_fault(vcpu, fault_address, error_code);
if (kvm_event_needs_reinjection(vcpu))
- kvm_mmu_unprotect_page_virt(vcpu, fault_address);
+ kvm_shadow_mmu_unprotect_page_virt(vcpu, fault_address);
r = kvm_mmu_page_fault(vcpu, fault_address, error_code, insn,
insn_len);
} else if (flags & KVM_PV_REASON_PAGE_NOT_PRESENT) {
@@ -2791,7 +2791,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
* In order to ensure all vCPUs drop their soon-to-be invalid roots,
* invalidating TDP MMU roots must be done while holding mmu_lock for
* write and in the same critical section as making the reload request,
- * e.g. before kvm_zap_obsolete_pages() could drop mmu_lock and yield.
+ * e.g. before kvm_shadow_mmu_zap_obsolete_pages() could drop mmu_lock
+ * and yield.
*/
if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_invalidate_all_roots(kvm);
@@ -2806,7 +2807,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
*/
kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
- kvm_zap_obsolete_pages(kvm);
+ kvm_shadow_mmu_zap_obsolete_pages(kvm);
write_unlock(&kvm->mmu_lock);
@@ -2892,7 +2893,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
kvm_mmu_invalidate_begin(kvm, gfn_start, gfn_end);
- flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
+ flush = kvm_shadow_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
if (is_tdp_mmu_enabled(kvm)) {
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
@@ -3036,7 +3037,7 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
{
if (kvm_memslots_have_rmaps(kvm)) {
write_lock(&kvm->mmu_lock);
- kvm_rmap_zap_collapsible_sptes(kvm, slot);
+ kvm_shadow_mmu_zap_collapsible_sptes(kvm, slot);
write_unlock(&kvm->mmu_lock);
}
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 2e3b2aca64ad..85c0b186cb0a 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -874,7 +874,7 @@ int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
if (is_page_fault_stale(vcpu, fault, mmu_seq))
goto out_unlock;
- r = make_mmu_pages_available(vcpu);
+ r = kvm_shadow_mmu_make_pages_available(vcpu);
if (r)
goto out_unlock;
r = FNAME(fetch)(vcpu, fault, &walker);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 1259c4a3b140..e36b4d9c67f2 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1965,7 +1965,7 @@ static inline unsigned long kvm_mmu_available_pages(struct kvm *kvm)
return 0;
}
-int make_mmu_pages_available(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu)
{
unsigned long avail = kvm_mmu_available_pages(vcpu->kvm);
@@ -2029,7 +2029,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
return r;
}
-int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
+int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
{
gpa_t gpa;
int r;
@@ -2319,7 +2319,7 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
__direct_pte_prefetch(vcpu, sp, sptep);
}
-int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
{
struct kvm_shadow_walk_iterator it;
struct kvm_mmu_page *sp;
@@ -2537,7 +2537,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
return r;
write_lock(&vcpu->kvm->mmu_lock);
- r = make_mmu_pages_available(vcpu);
+ r = kvm_shadow_mmu_make_pages_available(vcpu);
if (r < 0)
goto out_unlock;
@@ -2785,7 +2785,8 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu)
*
* Must be called between walk_shadow_page_lockless_{begin,end}.
*/
-int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
+int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+ int *root_level)
{
struct kvm_shadow_walk_iterator iterator;
int leaf = -1;
@@ -3091,7 +3092,7 @@ __always_inline bool slot_handle_level_4k(struct kvm *kvm,
}
#define BATCH_ZAP_PAGES 10
-void kvm_zap_obsolete_pages(struct kvm *kvm)
+void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
{
struct kvm_mmu_page *sp, *node;
int nr_zapped, batch = 0;
@@ -3152,7 +3153,7 @@ bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
}
-bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
+bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
{
const struct kvm_memory_slot *memslot;
struct kvm_memslots *slots;
@@ -3404,8 +3405,8 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
return need_tlb_flush;
}
-void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
- const struct kvm_memory_slot *slot)
+void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
+ const struct kvm_memory_slot *slot)
{
/*
* Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 9952aa1e86cf..148cc3593d2b 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -60,18 +60,19 @@ bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
struct list_head *invalid_list);
void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list);
-int make_mmu_pages_available(struct kvm_vcpu *vcpu);
+int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
-int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
+int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
-int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
+int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level);
int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
-int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level);
+int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+ int *root_level);
void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
int bytes, struct kvm_page_track_notifier_node *node);
@@ -86,8 +87,8 @@ bool slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
bool slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
slot_level_handler fn, bool flush_on_yield);
-void kvm_zap_obsolete_pages(struct kvm *kvm);
-bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
+void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
+bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
const struct kvm_memory_slot *slot);
@@ -96,8 +97,8 @@ void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
const struct kvm_memory_slot *slot,
gfn_t start, gfn_t end,
int target_level);
-void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
- const struct kvm_memory_slot *slot);
+void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
+ const struct kvm_memory_slot *slot);
bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC 09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (6 preceding siblings ...)
2022-12-21 22:24 ` [RFC 08/14] KVM: x86/MMU: Clean up naming of exported Shadow MMU functions Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-29 18:30 ` David Matlack
2022-12-21 22:24 ` [RFC 10/14] KVM: x86/MMU: Fix naming on prepare / commit zap page functions Ben Gardon
` (7 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Now that the Shadow MMU has been factored out of mmu.c and the naming
scheme has been cleaned up, it's clear that there's an unnecessary
operation in direct_page_fault(). Since the MMU page quota is only
applied to the Shadow MMU, there's no point in calling
kvm_shadow_mmu_make_pages_available() on a fault where the TDP MMU is
going to handle installing new TDP PTEs.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bacb519ba7b4..568b36de9eeb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1469,14 +1469,14 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
if (is_page_fault_stale(vcpu, fault, mmu_seq))
goto out_unlock;
- r = kvm_shadow_mmu_make_pages_available(vcpu);
- if (r)
- goto out_unlock;
-
if (is_tdp_mmu_fault)
r = kvm_tdp_mmu_map(vcpu, fault);
- else
+ else {
+ r = kvm_shadow_mmu_make_pages_available(vcpu);
+ if (r)
+ goto out_unlock;
r = kvm_shadow_mmu_direct_map(vcpu, fault);
+ }
out_unlock:
if (is_tdp_mmu_fault)
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC 09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault
2022-12-21 22:24 ` [RFC 09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault Ben Gardon
@ 2022-12-29 18:30 ` David Matlack
0 siblings, 0 replies; 29+ messages in thread
From: David Matlack @ 2022-12-29 18:30 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022 at 2:24 PM Ben Gardon <bgardon@google.com> wrote:
>
> Now that the Shadow MMU has been factored out of mmu.c and the naming
> scheme has been cleaned up, it's clear that there's an unnecessary
> operation in direct_page_fault(). Since the MMU page quota is only
> applied to the Shadow MMU, there's no point in calling
> kvm_shadow_mmu_make_pages_available() on a fault where the TDP MMU is
> going to handle installing new TDP PTEs.
Jinx! An equivalent change recently went into kvm/queue:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/commit/?h=queue&id=1290f90e77186bf8a06a3a35ebf254f5b004676b
^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC 10/14] KVM: x86/MMU: Fix naming on prepare / commit zap page functions
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (7 preceding siblings ...)
2022-12-21 22:24 ` [RFC 09/14] KVM: x86/MMU: Only make pages available on Shadow MMU fault Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 11/14] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c Ben Gardon
` (6 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Since the various prepare / commit zap page functions are part of the
Shadow MMU and used all over both shadow_mmu.c and mmu.c, add _shadow_
to the function names to match the rest of the Shadow MMU interface.
Since there are so many uses of these functions, this rename gets its
own commit.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 21 +++++++--------
arch/x86/kvm/mmu/shadow_mmu.c | 48 ++++++++++++++++++-----------------
arch/x86/kvm/mmu/shadow_mmu.h | 13 +++++-----
3 files changed, 43 insertions(+), 39 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 568b36de9eeb..160dd143a814 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -270,8 +270,9 @@ void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
kvm_tdp_mmu_walk_lockless_end();
} else {
/*
- * Make sure the write to vcpu->mode is not reordered in front of
- * reads to sptes. If it does, kvm_mmu_commit_zap_page() can see us
+ * Make sure the write to vcpu->mode is not reordered in front
+ * of reads to sptes. If it does,
+ * kvm_shadow_mmu_commit_zap_page() can see us
* OUTSIDE_GUEST_MODE and proceed to free the shadow page table.
*/
smp_store_release(&vcpu->mode, OUTSIDE_GUEST_MODE);
@@ -608,7 +609,7 @@ bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm, struct list_head *invalid_list
return false;
if (!list_empty(invalid_list))
- kvm_mmu_commit_zap_page(kvm, invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, invalid_list);
else
kvm_flush_remote_tlbs(kvm);
return true;
@@ -1062,7 +1063,7 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
if (is_tdp_mmu_page(sp))
kvm_tdp_mmu_put_root(kvm, sp, false);
else if (!--sp->root_count && sp->role.invalid)
- kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list);
*root_hpa = INVALID_PAGE;
}
@@ -1115,7 +1116,7 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
mmu->root.pgd = 0;
}
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
write_unlock(&kvm->mmu_lock);
}
EXPORT_SYMBOL_GPL(kvm_mmu_free_roots);
@@ -1417,8 +1418,8 @@ bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
* there is a pending request to free obsolete roots. The request is
* only a hint that the current root _may_ be obsolete and needs to be
* reloaded, e.g. if the guest frees a PGD that KVM is tracking as a
- * previous root, then __kvm_mmu_prepare_zap_page() signals all vCPUs
- * to reload even if no vCPU is actively using the root.
+ * previous root, then __kvm_shadow_mmu_prepare_zap_page() signals all
+ * vCPUs to reload even if no vCPU is actively using the root.
*/
if (!sp && kvm_test_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu))
return true;
@@ -3103,13 +3104,13 @@ void kvm_mmu_zap_all(struct kvm *kvm)
list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
if (WARN_ON(sp->role.invalid))
continue;
- if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
+ if (__kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign))
goto restart;
if (cond_resched_rwlock_write(&kvm->mmu_lock))
goto restart;
}
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_zap_all(kvm);
@@ -3452,7 +3453,7 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
else if (is_tdp_mmu_page(sp))
flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
else
- kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list);
WARN_ON_ONCE(sp->nx_huge_page_disallowed);
if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index e36b4d9c67f2..2d1a4026cf00 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1280,7 +1280,7 @@ static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
int ret = vcpu->arch.mmu->sync_page(vcpu, sp);
if (ret < 0)
- kvm_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
return ret;
}
@@ -1442,8 +1442,8 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
* upper-level page will be write-protected.
*/
if (role.level > PG_LEVEL_4K && sp->unsync)
- kvm_mmu_prepare_zap_page(kvm, sp,
- &invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(kvm, sp,
+ &invalid_list);
continue;
}
@@ -1485,7 +1485,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
++kvm->stat.mmu_cache_miss;
out:
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
if (collisions > kvm->stat.max_mmu_page_hash_collisions)
kvm->stat.max_mmu_page_hash_collisions = collisions;
@@ -1768,8 +1768,8 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, u64 *spte,
*/
if (tdp_enabled && invalid_list &&
child->role.guest_mode && !child->parent_ptes.val)
- return kvm_mmu_prepare_zap_page(kvm, child,
- invalid_list);
+ return kvm_shadow_mmu_prepare_zap_page(kvm,
+ child, invalid_list);
}
} else if (is_mmio_spte(pte)) {
mmu_spte_clear_no_track(spte);
@@ -1814,7 +1814,7 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
struct kvm_mmu_page *sp;
for_each_sp(pages, sp, parents, i) {
- kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list);
mmu_pages_clear_parents(&parents);
zapped++;
}
@@ -1823,9 +1823,9 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
return zapped;
}
-bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- struct list_head *invalid_list,
- int *nr_zapped)
+bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ struct list_head *invalid_list,
+ int *nr_zapped)
{
bool list_unstable, zapped_root = false;
@@ -1886,16 +1886,17 @@ bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
return list_unstable;
}
-bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- struct list_head *invalid_list)
+bool kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ struct list_head *invalid_list)
{
int nr_zapped;
- __kvm_mmu_prepare_zap_page(kvm, sp, invalid_list, &nr_zapped);
+ __kvm_shadow_mmu_prepare_zap_page(kvm, sp, invalid_list, &nr_zapped);
return nr_zapped;
}
-void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list)
+void kvm_shadow_mmu_commit_zap_page(struct kvm *kvm,
+ struct list_head *invalid_list)
{
struct kvm_mmu_page *sp, *nsp;
@@ -1940,8 +1941,8 @@ static unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
if (sp->root_count)
continue;
- unstable = __kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list,
- &nr_zapped);
+ unstable = __kvm_shadow_mmu_prepare_zap_page(kvm, sp,
+ &invalid_list, &nr_zapped);
total_zapped += nr_zapped;
if (total_zapped >= nr_to_zap)
break;
@@ -1950,7 +1951,7 @@ static unsigned long kvm_mmu_zap_oldest_mmu_pages(struct kvm *kvm,
goto restart;
}
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
kvm->stat.mmu_recycled += total_zapped;
return total_zapped;
@@ -2021,9 +2022,9 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
pgprintk("%s: gfn %llx role %x\n", __func__, gfn,
sp->role.word);
r = 1;
- kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(kvm, sp, &invalid_list);
}
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_shadow_mmu_commit_zap_page(kvm, &invalid_list);
write_unlock(&kvm->mmu_lock);
return r;
@@ -3020,7 +3021,8 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
for_each_gfn_valid_sp_with_gptes(vcpu->kvm, sp, gfn) {
if (detect_write_misaligned(sp, gpa, bytes) ||
detect_write_flooding(sp)) {
- kvm_mmu_prepare_zap_page(vcpu->kvm, sp, &invalid_list);
+ kvm_shadow_mmu_prepare_zap_page(vcpu->kvm, sp,
+ &invalid_list);
++vcpu->kvm->stat.mmu_flooded;
continue;
}
@@ -3128,7 +3130,7 @@ void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
goto restart;
}
- unstable = __kvm_mmu_prepare_zap_page(kvm, sp,
+ unstable = __kvm_shadow_mmu_prepare_zap_page(kvm, sp,
&kvm->arch.zapped_obsolete_pages, &nr_zapped);
batch += nr_zapped;
@@ -3145,7 +3147,7 @@ void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm)
* kvm_mmu_load()), and the reload in the caller ensure no vCPUs are
* running with an obsolete MMU.
*/
- kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+ kvm_shadow_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
}
bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm)
@@ -3426,7 +3428,7 @@ unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
write_lock(&kvm->mmu_lock);
if (kvm_shadow_mmu_has_zapped_obsolete_pages(kvm)) {
- kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
+ kvm_shadow_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages);
goto out;
}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 148cc3593d2b..af201d34d0b2 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -53,12 +53,13 @@ bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
-bool __kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- struct list_head *invalid_list,
- int *nr_zapped);
-bool kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- struct list_head *invalid_list);
-void kvm_mmu_commit_zap_page(struct kvm *kvm, struct list_head *invalid_list);
+bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ struct list_head *invalid_list,
+ int *nr_zapped);
+bool kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ struct list_head *invalid_list);
+void kvm_shadow_mmu_commit_zap_page(struct kvm *kvm,
+ struct list_head *invalid_list);
int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC 11/14] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (8 preceding siblings ...)
2022-12-21 22:24 ` [RFC 10/14] KVM: x86/MMU: Fix naming on prepare / commit zap page functions Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 12/14] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
` (5 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
There are several functions in mmu.c which bifurcate to the Shadow
and/or TDP MMU implementations. In most of these, the Shadow MMU
implementation is open-coded. Wrap each of these open-coded instances in
a function which needs only kvm and slot (or similar) arguments. This
matches the TDP MMU interface and will allow for some nice cleanups in a
following commit.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 52 ++++++----------------------
arch/x86/kvm/mmu/shadow_mmu.c | 64 +++++++++++++++++++++++++++++++++++
arch/x86/kvm/mmu/shadow_mmu.h | 15 ++++++++
3 files changed, 90 insertions(+), 41 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 160dd143a814..ce2a6dd38c67 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -417,23 +417,13 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot,
gfn_t gfn_offset, unsigned long mask)
{
- struct kvm_rmap_head *rmap_head;
-
if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
slot->base_gfn + gfn_offset, mask, true);
- if (!kvm_memslots_have_rmaps(kvm))
- return;
-
- while (mask) {
- rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
- PG_LEVEL_4K, slot);
- rmap_write_protect(rmap_head, false);
-
- /* clear the first set bit */
- mask &= mask - 1;
- }
+ if (kvm_memslots_have_rmaps(kvm))
+ kvm_shadow_mmu_write_protect_pt_masked(kvm, slot, gfn_offset,
+ mask);
}
/**
@@ -450,23 +440,13 @@ static void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot,
gfn_t gfn_offset, unsigned long mask)
{
- struct kvm_rmap_head *rmap_head;
-
if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
slot->base_gfn + gfn_offset, mask, false);
- if (!kvm_memslots_have_rmaps(kvm))
- return;
-
- while (mask) {
- rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
- PG_LEVEL_4K, slot);
- __rmap_clear_dirty(kvm, rmap_head, slot);
-
- /* clear the first set bit */
- mask &= mask - 1;
- }
+ if (kvm_memslots_have_rmaps(kvm))
+ kvm_shadow_mmu_clear_dirty_pt_masked(kvm, slot, gfn_offset,
+ mask);
}
/**
@@ -524,16 +504,11 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn,
int min_level)
{
- struct kvm_rmap_head *rmap_head;
- int i;
bool write_protected = false;
- if (kvm_memslots_have_rmaps(kvm)) {
- for (i = min_level; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
- rmap_head = gfn_to_rmap(gfn, i, slot);
- write_protected |= rmap_write_protect(rmap_head, true);
- }
- }
+ if (kvm_memslots_have_rmaps(kvm))
+ write_protected |=
+ kvm_shadow_mmu_write_protect_gfn(kvm, slot, gfn, min_level);
if (is_tdp_mmu_enabled(kvm))
write_protected |=
@@ -2917,8 +2892,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
{
if (kvm_memslots_have_rmaps(kvm)) {
write_lock(&kvm->mmu_lock);
- slot_handle_level(kvm, memslot, slot_rmap_write_protect,
- start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+ kvm_shadow_mmu_wrprot_slot(kvm, memslot, start_level);
write_unlock(&kvm->mmu_lock);
}
@@ -3069,11 +3043,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
{
if (kvm_memslots_have_rmaps(kvm)) {
write_lock(&kvm->mmu_lock);
- /*
- * Clear dirty bits only on 4k SPTEs since the legacy MMU only
- * support dirty logging at a 4k granularity.
- */
- slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
+ kvm_shadow_mmu_clear_dirty_slot(kvm, memslot);
write_unlock(&kvm->mmu_lock);
}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 2d1a4026cf00..80b8c78daaeb 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -3440,3 +3440,67 @@ unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free)
return freed;
}
+
+void kvm_shadow_mmu_write_protect_pt_masked(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ gfn_t gfn_offset, unsigned long mask)
+{
+ struct kvm_rmap_head *rmap_head;
+
+ while (mask) {
+ rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+ PG_LEVEL_4K, slot);
+ rmap_write_protect(rmap_head, false);
+
+ /* clear the first set bit */
+ mask &= mask - 1;
+ }
+}
+
+void kvm_shadow_mmu_clear_dirty_pt_masked(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ gfn_t gfn_offset, unsigned long mask)
+{
+ struct kvm_rmap_head *rmap_head;
+
+ while (mask) {
+ rmap_head = gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
+ PG_LEVEL_4K, slot);
+ __rmap_clear_dirty(kvm, rmap_head, slot);
+
+ /* clear the first set bit */
+ mask &= mask - 1;
+ }
+}
+
+bool kvm_shadow_mmu_write_protect_gfn(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ u64 gfn, int min_level)
+{
+ struct kvm_rmap_head *rmap_head;
+ int i;
+ bool write_protected = false;
+
+ if (kvm_memslots_have_rmaps(kvm)) {
+ for (i = min_level; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
+ rmap_head = gfn_to_rmap(gfn, i, slot);
+ write_protected |= rmap_write_protect(rmap_head, true);
+ }
+ }
+
+ return write_protected;
+}
+
+void kvm_shadow_mmu_clear_dirty_slot(struct kvm *kvm,
+ const struct kvm_memory_slot *memslot)
+{
+ slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
+}
+
+void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
+ const struct kvm_memory_slot *memslot,
+ int start_level)
+{
+ slot_handle_level(kvm, memslot, slot_rmap_write_protect,
+ start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index af201d34d0b2..c322eeaa0688 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -104,6 +104,21 @@ void kvm_shadow_mmu_zap_collapsible_sptes(struct kvm *kvm,
bool kvm_shadow_mmu_has_zapped_obsolete_pages(struct kvm *kvm);
unsigned long kvm_shadow_mmu_shrink_scan(struct kvm *kvm, int pages_to_free);
+void kvm_shadow_mmu_write_protect_pt_masked(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ gfn_t gfn_offset, unsigned long mask);
+void kvm_shadow_mmu_clear_dirty_pt_masked(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ gfn_t gfn_offset, unsigned long mask);
+bool kvm_shadow_mmu_write_protect_gfn(struct kvm *kvm,
+ struct kvm_memory_slot *slot,
+ u64 gfn, int min_level);
+void kvm_shadow_mmu_clear_dirty_slot(struct kvm *kvm,
+ const struct kvm_memory_slot *memslot);
+void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
+ const struct kvm_memory_slot *memslot,
+ int start_level);
+
/* Exports from paging_tmpl.h */
gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
gpa_t vaddr, u64 access,
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC 12/14] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (9 preceding siblings ...)
2022-12-21 22:24 ` [RFC 11/14] KVM: x86/MMU: Factor Shadow MMU wrprot / clear dirty ops out of mmu.c Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2022-12-21 22:24 ` [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
` (4 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Now that the various dirty logging / wrprot function implementations are
in shadow_mmu.c, do another round of cleanups to remove functions which
no longer need to be exposed and can be marked static.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/shadow_mmu.c | 30 +++++++++++++++++-------------
arch/x86/kvm/mmu/shadow_mmu.h | 18 ------------------
2 files changed, 17 insertions(+), 31 deletions(-)
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 80b8c78daaeb..77472eb9b06a 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -632,8 +632,8 @@ unsigned int pte_list_count(struct kvm_rmap_head *rmap_head)
return count;
}
-struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
- const struct kvm_memory_slot *slot)
+static struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
+ const struct kvm_memory_slot *slot)
{
unsigned long idx;
@@ -801,7 +801,7 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect)
return mmu_spte_update(sptep, spte);
}
-bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect)
+static bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect)
{
u64 *sptep;
struct rmap_iterator iter;
@@ -840,8 +840,8 @@ static bool spte_wrprot_for_clear_dirty(u64 *sptep)
* - W bit on ad-disabled SPTEs.
* Returns true iff any D or W bits were cleared.
*/
-bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- const struct kvm_memory_slot *slot)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ const struct kvm_memory_slot *slot)
{
u64 *sptep;
struct rmap_iterator iter;
@@ -3045,6 +3045,11 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
write_unlock(&vcpu->kvm->mmu_lock);
}
+/* The return value indicates if tlb flush on all vcpus is needed. */
+typedef bool (*slot_level_handler) (struct kvm *kvm,
+ struct kvm_rmap_head *rmap_head,
+ const struct kvm_memory_slot *slot);
+
/* The caller should hold mmu-lock before calling this function. */
static __always_inline bool
slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
@@ -3073,10 +3078,10 @@ slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
return flush;
}
-__always_inline bool slot_handle_level(struct kvm *kvm,
- const struct kvm_memory_slot *memslot,
- slot_level_handler fn, int start_level,
- int end_level, bool flush_on_yield)
+static __always_inline bool
+slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
+ slot_level_handler fn, int start_level, int end_level,
+ bool flush_on_yield)
{
return slot_handle_level_range(kvm, memslot, fn, start_level,
end_level, memslot->base_gfn,
@@ -3084,10 +3089,9 @@ __always_inline bool slot_handle_level(struct kvm *kvm,
flush_on_yield, false);
}
-__always_inline bool slot_handle_level_4k(struct kvm *kvm,
- const struct kvm_memory_slot *memslot,
- slot_level_handler fn,
- bool flush_on_yield)
+static __always_inline bool
+slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
+ slot_level_handler fn, bool flush_on_yield)
{
return slot_handle_level(kvm, memslot, fn, PG_LEVEL_4K,
PG_LEVEL_4K, flush_on_yield);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index c322eeaa0688..397fb463ef54 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -26,11 +26,6 @@ struct pte_list_desc {
/* Only exported for debugfs.c. */
unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
-struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level,
- const struct kvm_memory_slot *slot);
-bool rmap_write_protect(struct kvm_rmap_head *rmap_head, bool pt_protect);
-bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- const struct kvm_memory_slot *slot);
bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
struct kvm_memory_slot *slot, gfn_t gfn, int level,
pte_t unused);
@@ -78,22 +73,9 @@ int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
int bytes, struct kvm_page_track_notifier_node *node);
-/* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm,
- struct kvm_rmap_head *rmap_head,
- const struct kvm_memory_slot *slot);
-bool slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
- slot_level_handler fn, int start_level, int end_level,
- bool flush_on_yield);
-bool slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
- slot_level_handler fn, bool flush_on_yield);
-
void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
-bool slot_rmap_write_protect(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- const struct kvm_memory_slot *slot);
-
void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
const struct kvm_memory_slot *slot,
gfn_t start, gfn_t end,
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (10 preceding siblings ...)
2022-12-21 22:24 ` [RFC 12/14] KVM: x86/MMU: Remove unneeded exports from shadow_mmu.c Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
2023-02-01 21:25 ` Sean Christopherson
2022-12-21 22:24 ` [RFC 14/14] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h Ben Gardon
` (3 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
handle_gfn_range + callback is not a bad interface, but it requires
exporting the whole callback scheme to mmu.c. Simplify the interface
with some basic wrapper functions, making the callback scheme internal
to shadow_mmu.c.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 8 +++---
arch/x86/kvm/mmu/shadow_mmu.c | 54 +++++++++++++++++++++++++----------
arch/x86/kvm/mmu/shadow_mmu.h | 25 ++++------------
3 files changed, 48 insertions(+), 39 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ce2a6dd38c67..ceb3146016d0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -530,7 +530,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
bool flush = false;
if (kvm_memslots_have_rmaps(kvm))
- flush = kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
+ flush = kvm_shadow_mmu_unmap_gfn_range(kvm, range);
if (is_tdp_mmu_enabled(kvm))
flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush);
@@ -543,7 +543,7 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
bool flush = false;
if (kvm_memslots_have_rmaps(kvm))
- flush = kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmap);
+ flush = kvm_shadow_mmu_set_spte_gfn(kvm, range);
if (is_tdp_mmu_enabled(kvm))
flush |= kvm_tdp_mmu_set_spte_gfn(kvm, range);
@@ -556,7 +556,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
bool young = false;
if (kvm_memslots_have_rmaps(kvm))
- young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+ young = kvm_shadow_mmu_age_gfn_range(kvm, range);
if (is_tdp_mmu_enabled(kvm))
young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
@@ -569,7 +569,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
bool young = false;
if (kvm_memslots_have_rmaps(kvm))
- young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
+ young = kvm_shadow_mmu_test_age_gfn(kvm, range);
if (is_tdp_mmu_enabled(kvm))
young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 77472eb9b06a..1c6ff6fe3d2c 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -862,16 +862,16 @@ static bool __kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
return kvm_zap_all_rmap_sptes(kvm, rmap_head);
}
-bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t unused)
+static bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ struct kvm_memory_slot *slot, gfn_t gfn, int level,
+ pte_t unused)
{
return __kvm_zap_rmap(kvm, rmap_head, slot);
}
-bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t pte)
+static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ struct kvm_memory_slot *slot, gfn_t gfn, int level,
+ pte_t pte)
{
u64 *sptep;
struct rmap_iterator iter;
@@ -978,9 +978,13 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
slot_rmap_walk_okay(_iter_); \
slot_rmap_walk_next(_iter_))
-__always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
- struct kvm_gfn_range *range,
- rmap_handler_t handler)
+typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ struct kvm_memory_slot *slot, gfn_t gfn,
+ int level, pte_t pte);
+
+static __always_inline bool
+kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
+ rmap_handler_t handler)
{
struct slot_rmap_walk_iterator iterator;
bool ret = false;
@@ -993,9 +997,9 @@ __always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
return ret;
}
-bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t unused)
+static bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ struct kvm_memory_slot *slot, gfn_t gfn, int level,
+ pte_t unused)
{
u64 *sptep;
struct rmap_iterator iter;
@@ -1007,9 +1011,9 @@ bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
return young;
}
-bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn,
- int level, pte_t unused)
+static bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+ struct kvm_memory_slot *slot, gfn_t gfn,
+ int level, pte_t unused)
{
u64 *sptep;
struct rmap_iterator iter;
@@ -3508,3 +3512,23 @@ void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
slot_handle_level(kvm, memslot, slot_rmap_write_protect,
start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
}
+
+bool kvm_shadow_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+ return kvm_handle_gfn_range(kvm, range, kvm_zap_rmap);
+}
+
+bool kvm_shadow_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+ return kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmap);
+}
+
+bool kvm_shadow_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+ return kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+}
+
+bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+{
+ return kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
+}
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 397fb463ef54..2ded3d674cb0 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -26,26 +26,6 @@ struct pte_list_desc {
/* Only exported for debugfs.c. */
unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
-bool kvm_zap_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t unused);
-bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t pte);
-
-typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn,
- int level, pte_t pte);
-bool kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
- rmap_handler_t handler);
-
-bool kvm_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn, int level,
- pte_t unused);
-bool kvm_test_age_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
- struct kvm_memory_slot *slot, gfn_t gfn,
- int level, pte_t unused);
-
void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
@@ -101,6 +81,11 @@ void kvm_shadow_mmu_wrprot_slot(struct kvm *kvm,
const struct kvm_memory_slot *memslot,
int start_level);
+bool kvm_shadow_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
+bool kvm_shadow_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
+
/* Exports from paging_tmpl.h */
gpa_t paging32_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
gpa_t vaddr, u64 access,
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
2022-12-21 22:24 ` [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
@ 2023-02-01 21:25 ` Sean Christopherson
2023-02-01 22:30 ` Ben Gardon
0 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2023-02-01 21:25 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]
On Wed, Dec 21, 2022, Ben Gardon wrote:
> @@ -978,9 +978,13 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> slot_rmap_walk_okay(_iter_); \
> slot_rmap_walk_next(_iter_))
>
> -__always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
> - struct kvm_gfn_range *range,
> - rmap_handler_t handler)
> +typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> + struct kvm_memory_slot *slot, gfn_t gfn,
> + int level, pte_t pte);
> +
> +static __always_inline bool
> +kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
Don't split function returns/attributes from the function declaration. I don't
think the rule ended up getting officially documented and enforced, but Linus was
unequivocal when it came up[*], and I happen to agree with him :-)
Actually, since I'm guessing you got the idea from existing code, can you fold
in the attached patches to purge the existing cases in mmu.c before those uglies
get moved around? Assuming you don't dislike the proposed rename, that is.
[*] https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
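For illustration, the two forms in question, using one of the
declarations from the attached patch (the discouraged form is how the
helpers read before that patch):

	/* Discouraged: return type and attributes split from the name. */
	static __always_inline bool
	walk_slot_rmaps(struct kvm *kvm, const struct kvm_memory_slot *slot,
			slot_rmaps_handler fn, int start_level, int end_level,
			bool flush_on_yield);

	/* Preferred: keep them on one line and wrap the arguments instead. */
	static __always_inline bool walk_slot_rmaps(struct kvm *kvm,
						    const struct kvm_memory_slot *slot,
						    slot_rmaps_handler fn,
						    int start_level, int end_level,
						    bool flush_on_yield);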
[-- Attachment #2: 0001-KVM-x86-mmu-Rename-slot-rmap-walkers-to-add-clarity-.patch --]
[-- Type: text/x-diff, Size: 6324 bytes --]
From 1d263c3b37a74d04bd4dbd0ea2944e1692051e95 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 1 Feb 2023 12:20:52 -0800
Subject: [PATCH 1/3] KVM: x86/mmu: Rename slot rmap walkers to add clarity and
clean up code
Replace "slot_handle_level" with "walk_slot_rmaps" to better capture what
the helpers are doing, and to slightly shorten the function names so that
each function's return type and attributes can be placed on the same line
as the function declaration.
No functional change intended.
Link: https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 66 +++++++++++++++++++++---------------------
1 file changed, 33 insertions(+), 33 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c91ee2927dd7..3b2c477bbcd5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5813,23 +5813,24 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
EXPORT_SYMBOL_GPL(kvm_configure_mmu);
/* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm,
+typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
struct kvm_rmap_head *rmap_head,
const struct kvm_memory_slot *slot);
/* The caller should hold mmu-lock before calling this function. */
-static __always_inline bool
-slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
- slot_level_handler fn, int start_level, int end_level,
- gfn_t start_gfn, gfn_t end_gfn, bool flush_on_yield,
- bool flush)
+static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
+ const struct kvm_memory_slot *slot,
+ slot_rmaps_handler fn,
+ int start_level, int end_level,
+ gfn_t start_gfn, gfn_t end_gfn,
+ bool flush_on_yield, bool flush)
{
struct slot_rmap_walk_iterator iterator;
- for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
+ for_each_slot_rmap_range(slot, start_level, end_level, start_gfn,
end_gfn, &iterator) {
if (iterator.rmap)
- flush |= fn(kvm, iterator.rmap, memslot);
+ flush |= fn(kvm, iterator.rmap, slot);
if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
if (flush && flush_on_yield) {
@@ -5845,23 +5846,23 @@ slot_handle_level_range(struct kvm *kvm, const struct kvm_memory_slot *memslot,
return flush;
}
-static __always_inline bool
-slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
- slot_level_handler fn, int start_level, int end_level,
- bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps(struct kvm *kvm,
+ const struct kvm_memory_slot *slot,
+ slot_rmaps_handler fn,
+ int start_level, int end_level,
+ bool flush_on_yield)
{
- return slot_handle_level_range(kvm, memslot, fn, start_level,
- end_level, memslot->base_gfn,
- memslot->base_gfn + memslot->npages - 1,
- flush_on_yield, false);
+ return __walk_slot_rmaps(kvm, slot, fn, start_level, end_level,
+ slot->base_gfn, slot->base_gfn + slot->npages - 1,
+ flush_on_yield, false);
}
-static __always_inline bool
-slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
- slot_level_handler fn, bool flush_on_yield)
+static __always_inline bool walk_slot_rmaps_4k(struct kvm *kvm,
+ const struct kvm_memory_slot *slot,
+ slot_rmaps_handler fn,
+ bool flush_on_yield)
{
- return slot_handle_level(kvm, memslot, fn, PG_LEVEL_4K,
- PG_LEVEL_4K, flush_on_yield);
+ return walk_slot_rmaps(kvm, slot, fn, PG_LEVEL_4K, PG_LEVEL_4K, flush_on_yield);
}
static void free_mmu_pages(struct kvm_mmu *mmu)
@@ -6156,9 +6157,9 @@ static bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_e
if (WARN_ON_ONCE(start >= end))
continue;
- flush = slot_handle_level_range(kvm, memslot, __kvm_zap_rmap,
- PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
- start, end - 1, true, flush);
+ flush = __walk_slot_rmaps(kvm, memslot, __kvm_zap_rmap,
+ PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
+ start, end - 1, true, flush);
}
}
@@ -6211,8 +6212,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
{
if (kvm_memslots_have_rmaps(kvm)) {
write_lock(&kvm->mmu_lock);
- slot_handle_level(kvm, memslot, slot_rmap_write_protect,
- start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
+ walk_slot_rmaps(kvm, memslot, slot_rmap_write_protect,
+ start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
write_unlock(&kvm->mmu_lock);
}
@@ -6447,10 +6448,9 @@ static void kvm_shadow_mmu_try_split_huge_pages(struct kvm *kvm,
* all the way to the target level. There's no need to split pages
* already at the target level.
*/
- for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) {
- slot_handle_level_range(kvm, slot, shadow_mmu_try_split_huge_pages,
- level, level, start, end - 1, true, false);
- }
+ for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--)
+ __walk_slot_rmaps(kvm, slot, shadow_mmu_try_split_huge_pages,
+ level, level, start, end - 1, true, false);
}
/* Must be called with the mmu_lock held in write-mode. */
@@ -6548,8 +6548,8 @@ static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
* Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap
* pages that are already mapped at the maximum hugepage level.
*/
- if (slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
- PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, true))
+ if (walk_slot_rmaps(kvm, slot, kvm_mmu_zap_collapsible_spte,
+ PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, true))
kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
}
@@ -6593,7 +6593,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
* Clear dirty bits only on 4k SPTEs since the legacy MMU only
* support dirty logging at a 4k granularity.
*/
- slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
+ walk_slot_rmaps_4k(kvm, memslot, __rmap_clear_dirty, false);
write_unlock(&kvm->mmu_lock);
}
base-commit: 11b36fe7d4500c8ef73677c087f302fd713101c2
--
2.39.1.456.gfc5497dd1b-goog
[-- Attachment #3: 0002-KVM-x86-mmu-Replace-comment-with-an-actual-lockdep-a.patch --]
[-- Type: text/x-diff, Size: 1404 bytes --]
From 2771b3dee6637b3f8699ec6cc41692d73ae7f892 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 1 Feb 2023 12:32:39 -0800
Subject: [PATCH 2/3] KVM: x86/mmu: Replace comment with an actual lockdep
assertion on mmu_lock
Assert that mmu_lock is held for write in __walk_slot_rmaps() instead of
hoping the function comment will magically prevent introducing bugs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3b2c477bbcd5..80448b96cf19 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5817,7 +5817,6 @@ typedef bool (*slot_rmaps_handler) (struct kvm *kvm,
struct kvm_rmap_head *rmap_head,
const struct kvm_memory_slot *slot);
-/* The caller should hold mmu-lock before calling this function. */
static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
const struct kvm_memory_slot *slot,
slot_rmaps_handler fn,
@@ -5827,6 +5826,8 @@ static __always_inline bool __walk_slot_rmaps(struct kvm *kvm,
{
struct slot_rmap_walk_iterator iterator;
+ lockdep_assert_held_write(&kvm->mmu_lock);
+
for_each_slot_rmap_range(slot, start_level, end_level, start_gfn,
end_gfn, &iterator) {
if (iterator.rmap)
--
2.39.1.456.gfc5497dd1b-goog
[-- Attachment #4: 0003-KVM-x86-mmu-Clean-up-mmu.c-functions-that-put-return.patch --]
[-- Type: text/x-diff, Size: 6468 bytes --]
From a67e6eaf356f08ca58c9d28f7845ae3e511f24b8 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 1 Feb 2023 12:33:54 -0800
Subject: [PATCH 3/3] KVM: x86/mmu: Clean up mmu.c functions that put return
type on separate line
Adjust a variety of functions in mmu.c to put the function return type on
the same line as the function declaration. As stated in the Linus
specification:
But the "on their own line" is complete garbage to begin with. That
will NEVER be a kernel rule. We should never have a rule that assumes
things are so long that they need to be on multiple lines.
We don't put function return types on their own lines either, even if
some other projects have that rule (just to get function names at the
beginning of lines or some other odd reason).
Leave the functions generated by BUILD_MMU_ROLE_REGS_ACCESSOR() as-is,
that code is basically illegible no matter how it's formatted.
No functional change intended.
Link: https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/mmu.c | 59 ++++++++++++++++++++----------------------
1 file changed, 28 insertions(+), 31 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 80448b96cf19..9bb72c22e757 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -888,9 +888,9 @@ static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
untrack_possible_nx_huge_page(kvm, sp);
}
-static struct kvm_memory_slot *
-gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
- bool no_dirty_log)
+static struct kvm_memory_slot *gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu,
+ gfn_t gfn,
+ bool no_dirty_log)
{
struct kvm_memory_slot *slot;
@@ -950,10 +950,9 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte,
return count;
}
-static void
-pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head,
- struct pte_list_desc *desc, int i,
- struct pte_list_desc *prev_desc)
+static void pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head,
+ struct pte_list_desc *desc, int i,
+ struct pte_list_desc *prev_desc)
{
int j = desc->spte_count - 1;
@@ -1504,8 +1503,8 @@ struct slot_rmap_walk_iterator {
struct kvm_rmap_head *end_rmap;
};
-static void
-rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator, int level)
+static void rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator,
+ int level)
{
iterator->level = level;
iterator->gfn = iterator->start_gfn;
@@ -1513,10 +1512,10 @@ rmap_walk_init_level(struct slot_rmap_walk_iterator *iterator, int level)
iterator->end_rmap = gfn_to_rmap(iterator->end_gfn, level, iterator->slot);
}
-static void
-slot_rmap_walk_init(struct slot_rmap_walk_iterator *iterator,
- const struct kvm_memory_slot *slot, int start_level,
- int end_level, gfn_t start_gfn, gfn_t end_gfn)
+static void slot_rmap_walk_init(struct slot_rmap_walk_iterator *iterator,
+ const struct kvm_memory_slot *slot,
+ int start_level, int end_level,
+ gfn_t start_gfn, gfn_t end_gfn)
{
iterator->slot = slot;
iterator->start_level = start_level;
@@ -3304,9 +3303,9 @@ static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
* Returns true if the SPTE was fixed successfully. Otherwise,
* someone else modified the SPTE from its original value.
*/
-static bool
-fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
- u64 *sptep, u64 old_spte, u64 new_spte)
+static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu,
+ struct kvm_page_fault *fault,
+ u64 *sptep, u64 old_spte, u64 new_spte)
{
/*
* Theoretically we could also set dirty bit (and flush TLB) here in
@@ -4638,10 +4637,9 @@ static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
#include "paging_tmpl.h"
#undef PTTYPE
-static void
-__reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
- u64 pa_bits_rsvd, int level, bool nx, bool gbpages,
- bool pse, bool amd)
+static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
+ u64 pa_bits_rsvd, int level, bool nx,
+ bool gbpages, bool pse, bool amd)
{
u64 gbpages_bit_rsvd = 0;
u64 nonleaf_bit8_rsvd = 0;
@@ -4754,9 +4752,9 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
guest_cpuid_is_amd_or_hygon(vcpu));
}
-static void
-__reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
- u64 pa_bits_rsvd, bool execonly, int huge_page_level)
+static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
+ u64 pa_bits_rsvd, bool execonly,
+ int huge_page_level)
{
u64 high_bits_rsvd = pa_bits_rsvd & rsvd_bits(0, 51);
u64 large_1g_rsvd = 0, large_2m_rsvd = 0;
@@ -4856,8 +4854,7 @@ static inline bool boot_cpu_is_amd(void)
* the direct page table on host, use as much mmu features as
* possible, however, kvm currently does not do execution-protection.
*/
-static void
-reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
+static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
{
struct rsvd_bits_validate *shadow_zero_check;
int i;
@@ -5072,8 +5069,8 @@ static void paging32_init_context(struct kvm_mmu *context)
context->invlpg = paging32_invlpg;
}
-static union kvm_cpu_role
-kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
+static union kvm_cpu_role kvm_calc_cpu_role(struct kvm_vcpu *vcpu,
+ const struct kvm_mmu_role_regs *regs)
{
union kvm_cpu_role role = {0};
@@ -6664,8 +6661,8 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
}
}
-static unsigned long
-mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+static unsigned long mmu_shrink_scan(struct shrinker *shrink,
+ struct shrink_control *sc)
{
struct kvm *kvm;
int nr_to_scan = sc->nr_to_scan;
@@ -6723,8 +6720,8 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
return freed;
}
-static unsigned long
-mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+static unsigned long mmu_shrink_count(struct shrinker *shrink,
+ struct shrink_control *sc)
{
return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
}
--
2.39.1.456.gfc5497dd1b-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread

* Re: [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c
2023-02-01 21:25 ` Sean Christopherson
@ 2023-02-01 22:30 ` Ben Gardon
0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-01 22:30 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Feb 1, 2023 at 1:25 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Dec 21, 2022, Ben Gardon wrote:
> > @@ -978,9 +978,13 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator)
> > slot_rmap_walk_okay(_iter_); \
> > slot_rmap_walk_next(_iter_))
> >
> > -__always_inline bool kvm_handle_gfn_range(struct kvm *kvm,
> > - struct kvm_gfn_range *range,
> > - rmap_handler_t handler)
> > +typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> > + struct kvm_memory_slot *slot, gfn_t gfn,
> > + int level, pte_t pte);
> > +
> > +static __always_inline bool
> > +kvm_handle_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range,
>
> Don't split function returns/attributes from the function declaration. I don't
> think the rule ended up getting officially documented and enforced, but Linus was
> unequivocal when it came up[*], and I happen to agree with him :-)
>
> Actually, since I'm guessing you got the idea from existing code, can you fold
> in the attached patches to purge the existing cases in mmu.c before those uglies
> get moved around? Assuming you don't dislike the proposed rename, that is.
>
> [*] https://lore.kernel.org/mm-commits/CAHk-=wjS-Jg7sGMwUPpDsjv392nDOOs0CtUtVkp=S6Q7JzFJRw@mail.gmail.com
Sounds good to me. Added the attached patches to the start of the series.
I didn't love those weird splits in the function def. Happy to see
them cleaned up too.
^ permalink raw reply [flat|nested] 29+ messages in thread
* [RFC 14/14] KVM: x86/MMU: Add kvm_shadow_mmu_ to the last few functions in shadow_mmu.h
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (11 preceding siblings ...)
2022-12-21 22:24 ` [RFC 13/14] KVM: x86/MMU: Wrap uses of kvm_handle_gfn_range in mmu.c Ben Gardon
@ 2022-12-21 22:24 ` Ben Gardon
[not found] ` <20221221222418.3307832-4-bgardon@google.com>
` (2 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:24 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy, Ben Gardon
Fix up the names of the last few Shadow MMU functions in shadow_mmu.h.
This gives a clean and obvious interface between the shared x86 MMU
code and the Shadow MMU. There are still a few functions exported from
paging_tmpl.h that are left as-is, but changing those will need to be
done separately, if at all.
No functional change intended.
Signed-off-by: Ben Gardon <bgardon@google.com>
---
arch/x86/kvm/mmu/mmu.c | 23 ++++++++++--------
arch/x86/kvm/mmu/shadow_mmu.c | 44 +++++++++++++++++++----------------
arch/x86/kvm/mmu/shadow_mmu.h | 16 +++++++------
3 files changed, 46 insertions(+), 37 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ceb3146016d0..8f3b96af470d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -921,9 +921,11 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
u64 new_spte;
if (is_tdp_mmu(vcpu->arch.mmu))
- sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
+ sptep = kvm_tdp_mmu_fast_pf_get_last_sptep(vcpu,
+ fault->addr, &spte);
else
- sptep = fast_pf_get_last_sptep(vcpu, fault->addr, &spte);
+ sptep = kvm_shadow_mmu_fast_pf_get_last_sptep(vcpu,
+ fault->addr, &spte);
if (!is_shadow_present_pte(spte))
break;
@@ -1113,7 +1115,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
mmu->root.hpa = root;
} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
- root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level);
+ root = kvm_shadow_mmu_alloc_root(vcpu, 0, 0, shadow_root_level);
mmu->root.hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
if (WARN_ON_ONCE(!mmu->pae_root)) {
@@ -1124,8 +1126,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
for (i = 0; i < 4; ++i) {
WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
- root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT), 0,
- PT32_ROOT_LEVEL);
+ root = kvm_shadow_mmu_alloc_root(vcpu,
+ i << (30 - PAGE_SHIFT), 0, PT32_ROOT_LEVEL);
mmu->pae_root[i] = root | PT_PRESENT_MASK |
shadow_me_value;
}
@@ -1665,7 +1667,7 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
* count. Otherwise, clear the write flooding count.
*/
if (!new_role.direct)
- __clear_sp_write_flooding_count(
+ kvm_shadow_mmu_clear_sp_write_flooding_count(
to_shadow_page(vcpu->arch.mmu->root.hpa));
}
EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);
@@ -2447,13 +2449,13 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct);
if (r)
goto out;
- r = mmu_alloc_special_roots(vcpu);
+ r = kvm_shadow_mmu_alloc_special_roots(vcpu);
if (r)
goto out;
if (vcpu->arch.mmu->root_role.direct)
r = mmu_alloc_direct_roots(vcpu);
else
- r = mmu_alloc_shadow_roots(vcpu);
+ r = kvm_shadow_mmu_alloc_shadow_roots(vcpu);
if (r)
goto out;
@@ -2679,7 +2681,8 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
* generally doesn't use PAE paging and can skip allocating the PDP
* table. The main exception, handled here, is SVM's 32-bit NPT. The
* other exception is for shadowing L1's 32-bit or PAE NPT on 64-bit
- * KVM; that horror is handled on-demand by mmu_alloc_special_roots().
+ * KVM; that horror is handled on-demand by
+ * kvm_shadow_mmu_alloc_special_roots().
*/
if (tdp_enabled && kvm_mmu_get_tdp_level(vcpu) > PT32E_ROOT_LEVEL)
return 0;
@@ -2820,7 +2823,7 @@ int kvm_mmu_init_vm(struct kvm *kvm)
if (r < 0)
return r;
- node->track_write = kvm_mmu_pte_write;
+ node->track_write = kvm_shadow_mmu_pte_write;
node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
kvm_page_track_register_notifier(kvm, node);
diff --git a/arch/x86/kvm/mmu/shadow_mmu.c b/arch/x86/kvm/mmu/shadow_mmu.c
index 1c6ff6fe3d2c..6f3e201af670 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.c
+++ b/arch/x86/kvm/mmu/shadow_mmu.c
@@ -1402,14 +1402,14 @@ static int mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *parent,
return 0;
}
-void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
+void kvm_shadow_mmu_clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
{
atomic_set(&sp->write_flooding_count, 0);
}
static void clear_sp_write_flooding_count(u64 *spte)
{
- __clear_sp_write_flooding_count(sptep_to_sp(spte));
+ kvm_shadow_mmu_clear_sp_write_flooding_count(sptep_to_sp(spte));
}
/*
@@ -1480,7 +1480,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
kvm_flush_remote_tlbs(kvm);
}
- __clear_sp_write_flooding_count(sp);
+ kvm_shadow_mmu_clear_sp_write_flooding_count(sp);
goto out;
}
@@ -1605,12 +1605,13 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct,
* Concretely, a 4-byte PDE consumes bits 31:22, while an 8-byte PDE
* consumes bits 29:21. To consume bits 31:30, KVM's uses 4 shadow
* PDPTEs; those 4 PAE page directories are pre-allocated and their
- * quadrant is assigned in mmu_alloc_root(). A 4-byte PTE consumes
- * bits 21:12, while an 8-byte PTE consumes bits 20:12. To consume
- * bit 21 in the PTE (the child here), KVM propagates that bit to the
- * quadrant, i.e. sets quadrant to '0' or '1'. The parent 8-byte PDE
- * covers bit 21 (see above), thus the quadrant is calculated from the
- * _least_ significant bit of the PDE index.
+ * quadrant is assigned in kvm_shadow_mmu_alloc_root().
+ * A 4-byte PTE consumes bits 21:12, while an 8-byte PTE consumes
+ * bits 20:12. To consume bit 21 in the PTE (the child here), KVM
+ * propagates that bit to the quadrant, i.e. sets quadrant to
+ * '0' or '1'. The parent 8-byte PDE covers bit 21 (see above), thus
+ * the quadrant is calculated from the _least_ significant bit of the
+ * PDE index.
*/
if (role.has_4_byte_gpte) {
WARN_ON_ONCE(role.level != PG_LEVEL_4K);
@@ -2377,7 +2378,8 @@ int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *faul
* - Must be called between walk_shadow_page_lockless_{begin,end}.
* - The returned sptep must not be used after walk_shadow_page_lockless_end.
*/
-u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte)
+u64 *kvm_shadow_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa,
+ u64 *spte)
{
struct kvm_shadow_walk_iterator iterator;
u64 old_spte;
@@ -2430,7 +2432,8 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn)
return ret;
}
-hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level)
+hpa_t kvm_shadow_mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant,
+ u8 level)
{
union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
struct kvm_mmu_page *sp;
@@ -2447,7 +2450,7 @@ hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level)
return __pa(sp->spt);
}
-static int mmu_first_shadow_root_alloc(struct kvm *kvm)
+static int kvm_shadow_mmu_first_shadow_root_alloc(struct kvm *kvm)
{
struct kvm_memslots *slots;
struct kvm_memory_slot *slot;
@@ -2508,7 +2511,7 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
return r;
}
-int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
{
struct kvm_mmu *mmu = vcpu->arch.mmu;
u64 pdptrs[4], pm_mask;
@@ -2537,7 +2540,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
}
}
- r = mmu_first_shadow_root_alloc(vcpu->kvm);
+ r = kvm_shadow_mmu_first_shadow_root_alloc(vcpu->kvm);
if (r)
return r;
@@ -2551,8 +2554,8 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
* write-protect the guests page table root.
*/
if (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
- root = mmu_alloc_root(vcpu, root_gfn, 0,
- mmu->root_role.level);
+ root = kvm_shadow_mmu_alloc_root(vcpu, root_gfn, 0,
+ mmu->root_role.level);
mmu->root.hpa = root;
goto set_root_pgd;
}
@@ -2605,7 +2608,8 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
*/
quadrant = (mmu->cpu_role.base.level == PT32_ROOT_LEVEL) ? i : 0;
- root = mmu_alloc_root(vcpu, root_gfn, quadrant, PT32_ROOT_LEVEL);
+ root = kvm_shadow_mmu_alloc_root(vcpu, root_gfn, quadrant,
+ PT32_ROOT_LEVEL);
mmu->pae_root[i] = root | pm_mask;
}
@@ -2624,7 +2628,7 @@ int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
return r;
}
-int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
+int kvm_shadow_mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
{
struct kvm_mmu *mmu = vcpu->arch.mmu;
bool need_pml5 = mmu->root_role.level > PT64_ROOT_4LEVEL;
@@ -2997,8 +3001,8 @@ static u64 *get_written_sptes(struct kvm_mmu_page *sp, gpa_t gpa, int *nspte)
return spte;
}
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
- int bytes, struct kvm_page_track_notifier_node *node)
+void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+ int bytes, struct kvm_page_track_notifier_node *node)
{
gfn_t gfn = gpa >> PAGE_SHIFT;
struct kvm_mmu_page *sp;
diff --git a/arch/x86/kvm/mmu/shadow_mmu.h b/arch/x86/kvm/mmu/shadow_mmu.h
index 2ded3d674cb0..a3e6daa36236 100644
--- a/arch/x86/kvm/mmu/shadow_mmu.h
+++ b/arch/x86/kvm/mmu/shadow_mmu.h
@@ -26,7 +26,7 @@ struct pte_list_desc {
/* Only exported for debugfs.c. */
unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
-void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
+void kvm_shadow_mmu_clear_sp_write_flooding_count(struct kvm_mmu_page *sp);
bool __kvm_shadow_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
struct list_head *invalid_list,
@@ -41,17 +41,19 @@ int kvm_shadow_mmu_make_pages_available(struct kvm_vcpu *vcpu);
int kvm_shadow_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva);
int kvm_shadow_mmu_direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
-u64 *fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa, u64 *spte);
+u64 *kvm_shadow_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gpa_t gpa,
+ u64 *spte);
-hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, u8 level);
-int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
-int mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
+hpa_t kvm_shadow_mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant,
+ u8 level);
+int kvm_shadow_mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu);
+int kvm_shadow_mmu_alloc_special_roots(struct kvm_vcpu *vcpu);
int kvm_shadow_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
int *root_level);
-void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
- int bytes, struct kvm_page_track_notifier_node *node);
+void kvm_shadow_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
+ int bytes, struct kvm_page_track_notifier_node *node);
void kvm_shadow_mmu_zap_obsolete_pages(struct kvm *kvm);
bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
--
2.39.0.314.g84b9a713c41-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread

[parent not found: <20221221222418.3307832-4-bgardon@google.com>]
* Re: [RFC 03/14] KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c
[not found] ` <20221221222418.3307832-4-bgardon@google.com>
@ 2022-12-21 22:40 ` Ben Gardon
2023-01-13 18:06 ` Vipin Sharma
0 siblings, 1 reply; 29+ messages in thread
From: Ben Gardon @ 2022-12-21 22:40 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022 at 2:24 PM Ben Gardon <bgardon@google.com> wrote:
>
> Cut and paste the implementation of the Shadow MMU to shadow_mmu.(c|h).
> This is a monstrously large commit, moving ~3500 lines. With such a
> large move, there's no way to make it easy. Do the move in one massive
> step to simplify dealing with merge conflicts and to make the git
> history a little easier to dig through. Several cleanup commits follow
> this one rather than preceed it so that their git history will remain
> easy to see.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
Woops, I guess this message bounced because the patch was just too long.
I can try to split it in two if folks would prefer, or just send a
list of the functions / definitions moved.
> ---
> arch/x86/kvm/debugfs.c | 1 +
> arch/x86/kvm/mmu/mmu.c | 4526 ++++---------------------------
> arch/x86/kvm/mmu/mmu_internal.h | 4 +-
> arch/x86/kvm/mmu/shadow_mmu.c | 3408 +++++++++++++++++++++++
> arch/x86/kvm/mmu/shadow_mmu.h | 145 +
> 5 files changed, 4086 insertions(+), 3998 deletions(-)
>
...
> --
> 2.39.0.314.g84b9a713c41-goog
>
^ permalink raw reply [flat|nested] 29+ messages in thread* Re: [RFC 03/14] KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c
2022-12-21 22:40 ` [RFC 03/14] KVM: x86/MMU: Move the Shadow MMU implementation to shadow_mmu.c Ben Gardon
@ 2023-01-13 18:06 ` Vipin Sharma
0 siblings, 0 replies; 29+ messages in thread
From: Vipin Sharma @ 2023-01-13 18:06 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
David Matlack, Nagareddy Reddy
On Wed, Dec 21, 2022 at 2:40 PM Ben Gardon <bgardon@google.com> wrote:
>
> On Wed, Dec 21, 2022 at 2:24 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > Cut and paste the implementation of the Shadow MMU to shadow_mmu.(c|h).
> > This is a monstrously large commit, moving ~3500 lines. With such a
> > large move, there's no way to make it easy. Do the move in one massive
> > step to simplify dealing with merge conflicts and to make the git
> > history a little easier to dig through. Several cleanup commits follow
> > this one rather than precede it so that their git history will remain
> > easy to see.
> >
> > No functional change intended.
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
>
> Woops, I guess this message bounced because the patch was just too long.
> I can try to split it in two if folks would prefer, or just send a
> list of the functions / definitions moved.
>
Interesting, I can see this patch in my email client and on
lore.kernel.org/lkml, but not on patchwork.kernel.org.

Another option would be to move the declarations to shadow_mmu.h first
and then, in a subsequent patch, move the definitions to shadow_mmu.c.
I agree it won't reduce the size much, but it will make it easier to
see which functions are becoming part of the API.
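For example, taking a function that's already part of the series'
interface (purely to illustrate the two-step split):

    /* Step 1: shadow_mmu.h gains the declaration; the body stays in mmu.c. */
    bool kvm_shadow_mmu_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start,
                                      gfn_t gfn_end);

    /* Step 2: a follow-up patch moves the definition into shadow_mmu.c. */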
> > ---
> > arch/x86/kvm/debugfs.c | 1 +
> > arch/x86/kvm/mmu/mmu.c | 4526 ++++---------------------------
> > arch/x86/kvm/mmu/mmu_internal.h | 4 +-
> > arch/x86/kvm/mmu/shadow_mmu.c | 3408 +++++++++++++++++++++++
> > arch/x86/kvm/mmu/shadow_mmu.h | 145 +
> > 5 files changed, 4086 insertions(+), 3998 deletions(-)
> >
>
> ...
>
> > --
> > 2.39.0.314.g84b9a713c41-goog
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (13 preceding siblings ...)
[not found] ` <20221221222418.3307832-4-bgardon@google.com>
@ 2023-01-06 19:18 ` David Matlack
2023-01-09 18:43 ` Ben Gardon
2023-02-01 20:02 ` Sean Christopherson
15 siblings, 1 reply; 29+ messages in thread
From: David Matlack @ 2023-01-06 19:18 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022 at 10:24:04PM +0000, Ben Gardon wrote:
> This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
> implemented in separate files, with a defined interface to common code.
Overall I really like the end result.
While looking through I found a few more bits of code that should
probably be moved into shadow_mmu.c:
- kvm_mmu_zap_all(): Move the shadow MMU zapping to shadow_mmu.c (the
active_mmu_pages loop + commit_zap_page).
- need_topup(), need_topup_split_caches_or_resched(), and
topup_split_caches() should be static functions in shadow_mmu.c.
- Split out kvm_mmu_init/uninit_vm() functions for the shadow MMU.
Notably, the split caches, active_mmu_pages, zapped_obsolete_pages,
and other Shadow MMU-specific stuff can go in shadow_mmu.c.
- The Shadow MMU parts of walk_shadow_page_lockless_begin/end() should
go in shadow_mmu.c, e.g. kvm_shadow_mmu_walk_lockless_begin/end().
> Patch 3 is an enormous change, and doing it all at once in a single
> commit all but guarantees merge conflicts and makes it hard to review. I
> don't have a good answer to this problem as there's no easy way to move
> 3.5K lines between files. I tried moving the code bit-by-bit but the
> intermediate steps added complexity and ultimately the 50+ patches it
> created didn't seem any easier to review.
> Doing the big move all at once at least makes it easier to get past when
> doing Git archeology, and doing it at the beginning of the series allows the
> rest of the commits to still show up in Git blame.
An alternative would be to rename mmu.c to shadow_mmu.c first and then
move code in the opposite direction. That would preserve the git-blame
history for shadow_mmu.c. But by the end of the series mmu.c and
shadow_mmu.c are both ~3K LOC, so I don't think doing this is really any
better. Either way, you have to move ~3K LOC.
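If you did go that route, the first patch would just be the rename
(illustrative command; the Makefile would be updated to match):

    git mv arch/x86/kvm/mmu/mmu.c arch/x86/kvm/mmu/shadow_mmu.c

and the rest of the series would move the common code back out into a
fresh mmu.c.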
^ permalink raw reply [flat|nested] 29+ messages in thread

* Re: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU
2023-01-06 19:18 ` [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU David Matlack
@ 2023-01-09 18:43 ` Ben Gardon
0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-01-09 18:43 UTC (permalink / raw)
To: David Matlack
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, Sean Christopherson,
Vipin Sharma, Nagareddy Reddy
On Fri, Jan 6, 2023 at 11:18 AM David Matlack <dmatlack@google.com> wrote:
>
> On Wed, Dec 21, 2022 at 10:24:04PM +0000, Ben Gardon wrote:
> > This series makes the Shadow MMU a distinct part of the KVM x86 MMU,
> > implemented in separate files, with a defined interface to common code.
>
> Overall I really like the end result.
>
> While looking through I found a few more bits of code that should
> probably be moved into shadow_mmu.c:
>
> - kvm_mmu_zap_all(): Move the shadow MMU zapping to shadow_mmu.c (the
> active_mmu_pages loop + commit_zap_page).
>
> - need_topup(), need_topup_split_caches_or_resched()
> topup_split_caches() should be static functions in shadow_mmu.c.
>
> - Split out kvm_mmu_init/uninit_vm() functions for the shadow MMU.
> Notably, the split caches, active_mmu_pages, zapped_obsolete_pages,
> and other Shadow MMU-specific stuff can go in shadow_mmu.c.
>
> - The Shadow MMU parts of walk_shadow_page_lockless_begin/end() should
> go in shadow_mmu.c. e.g. kvm_shadow_mmu_walk_lockless_begin/end().
Awesome, thank you for pointing these out. I'll work them into a V1.
>
> > Patch 3 is an enormous change, and doing it all at once in a single
> > commit all but guarantees merge conflicts and makes it hard to review. I
> > don't have a good answer to this problem as there's no easy way to move
> > 3.5K lines between files. I tried moving the code bit-by-bit but the
> > intermediate steps added complexity and ultimately the 50+ patches it
> > created didn't seem any easier to review.
> > Doing the big move all at once at least makes it easier to get past when
> > doing Git archeology, and doing it at the beginning of the series allows the
> > rest of the commits to still show up in Git blame.
>
> An alternative would be to rename mmu.c to shadow_mmu.c first and then
> move code in the opposite direction. That would preserve the git-blame
> history for shadow_mmu.c. But by the end of the series mmu.c and
> shadow_mmu.c are both ~3K LOC, so I don't think doing this is really any
> better. Either way, you have to move ~3K LOC.
I tried implementing this refactor both ways and ultimately found this
way to be a lot cleaner. Preserving the git blame for the Shadow MMU
code would be nice, since IMO it's the more complex code, but it got
complicated quickly. The in-between stages of moving function
definitions around to header files, and untangling code to move it back
to mmu.c, were a nightmare. It's relatively easy to move the leaf
functions in the call-tree, but I found moving the upper-level
functions difficult to do bit-by-bit.
If anyone wants to try implementing this commit in a more elegant way,
I'm happy to rebase the rest of the series on top of it.
As you said, either way we gotta move 3K lines of code.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU
2022-12-21 22:24 [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU Ben Gardon
` (14 preceding siblings ...)
2023-01-06 19:18 ` [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU David Matlack
@ 2023-02-01 20:02 ` Sean Christopherson
2023-02-01 20:45 ` Ben Gardon
15 siblings, 1 reply; 29+ messages in thread
From: Sean Christopherson @ 2023-02-01 20:02 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Dec 21, 2022, Ben Gardon wrote:
> This series builds on 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc.
Before you send the next version, can you tweak your workflow to generate the
base commit via `git format-patch --base`? That makes it much easier for humans
and scripts to find the base commit, and saves you from having to remember to
manually specify the base. Because of the code movement, applying this series
without the precise base is an exercise in frustration.
E.g. my workflow does
git format-patch --base=HEAD~$nr <more crud>
where $nr is the number of patches to generate. There's also an "auto" option,
but IIRC that only works if you have the upstream pointing at the base, e.g. it
falls apart if upstream points at your own remote "backup" repo.
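For this series, that would look something like the following
(hypothetical invocation, assuming 14 patches plus a cover letter):

    git format-patch --cover-letter --base=HEAD~14 -14

which records the base as a trailer on the generated mails, e.g.:

    base-commit: 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc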
^ permalink raw reply [flat|nested] 29+ messages in thread

* Re: [RFC 00/14] KVM: x86/MMU: Formalize the Shadow MMU
2023-02-01 20:02 ` Sean Christopherson
@ 2023-02-01 20:45 ` Ben Gardon
0 siblings, 0 replies; 29+ messages in thread
From: Ben Gardon @ 2023-02-01 20:45 UTC (permalink / raw)
To: Sean Christopherson
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Xu, David Matlack,
Vipin Sharma, Nagareddy Reddy
On Wed, Feb 1, 2023 at 12:02 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Dec 21, 2022, Ben Gardon wrote:
> > This series builds on 9352e7470a1b4edd2fa9d235420ecc7bc3971bdc.
>
> Before you send the next version, can you tweak your workflow to generate the
> base commit via `git format-patch --base`? That makes it much easier for humans
> and scripts to find the base commit, and saves you from having to remember to
> manually specify the base. Because of the code movement, applying this series
> without the precise base is an exercise in frustration.
>
> E.g. my workflow does
>
> git format-patch --base=HEAD~$nr <more crud>
>
> where $nr is the number of patches to generate. There's also an "auto" option,
> but IIRC that only works if you have the upstream pointing at the base, e.g. it
> falls apart if upstream points at your own remote "backup" repo.
Sure thing, thanks for the tip!
^ permalink raw reply [flat|nested] 29+ messages in thread