From: Marc Zyngier <maz@kernel.org>
To: kvm-riscv@lists.infradead.org
Subject: Re: [PATCH v5 20/21] KVM: Allow for different capacities in kvm_mmu_memory_cache structs
Date: Sun, 15 May 2022 12:42:52 +0100
Message-ID: <87r14v58eb.wl-maz@kernel.org>
In-Reply-To: <20220513202819.829591-21-dmatlack@google.com>

On Fri, 13 May 2022 21:28:18 +0100,
David Matlack <dmatlack@google.com> wrote:
>
> Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at
> declaration time rather than being fixed for all declarations. This will
> be used in a follow-up commit to declare a cache in x86 with a capacity
> of 512+ objects without having to increase the capacity of all caches in
> KVM.
>
> This change requires each cache now specify its capacity at runtime,
> since the cache struct itself no longer has a fixed capacity known at
> compile time. To protect against someone accidentally defining a
> kvm_mmu_memory_cache struct directly (without the extra storage), this
> commit includes a WARN_ON() in kvm_mmu_topup_memory_cache().
>
> In order to support different capacities, this commit changes the
> objects pointer array to be dynamically allocated the first time the
> cache is topped-up.
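
For readers following along, here is a minimal userspace sketch of the lazy allocation described above. It is not the kernel code: struct obj_cache, its obj_size field and the obj_cache_*() helpers are made-up names, and calloc()/free() stand in for kvmalloc_array(), the object allocator and kvfree().

```c
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-in for struct kvm_mmu_memory_cache (hypothetical names). */
struct obj_cache {
	int nobjs;       /* number of objects currently cached */
	int capacity;    /* number of slots in objects[], set by the owner */
	size_t obj_size; /* size of each object (stand-in for kmem_cache) */
	void **objects;  /* pointer array, allocated lazily on first top-up */
};

/* Fill the cache to capacity; succeed if at least min objects are cached. */
static int obj_cache_topup(struct obj_cache *c, int min)
{
	if (c->nobjs >= min)
		return 0;

	/* Mirrors the WARN_ON(): a cache without a capacity is a caller bug. */
	if (c->capacity == 0)
		return -EINVAL;

	/* First top-up: allocate the pointer array itself. */
	if (!c->objects) {
		c->objects = calloc((size_t)c->capacity, sizeof(void *));
		if (!c->objects)
			return -ENOMEM;
	}

	while (c->nobjs < c->capacity) {
		void *obj = calloc(1, c->obj_size);

		if (!obj)
			return c->nobjs >= min ? 0 : -ENOMEM;
		c->objects[c->nobjs++] = obj;
	}
	return 0;
}

/* Pop one cached object, or NULL if the cache is empty. */
static void *obj_cache_alloc(struct obj_cache *c)
{
	return c->nobjs ? c->objects[--c->nobjs] : NULL;
}

/* Free every cached object, then the pointer array. */
static void obj_cache_free(struct obj_cache *c)
{
	while (c->nobjs)
		free(c->objects[--c->nobjs]);
	free(c->objects);
	/* Reset so the next top-up reallocates rather than reusing freed memory. */
	c->objects = NULL;
}
```

A caller would declare the cache with a designated initializer that sets the capacity, top it up before entering the critical section, and call obj_cache_free() when done. The cache can be topped up again after a free because the pointer is reset to NULL.
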
>
> An alternative would be to lay out the objects array after the
> kvm_mmu_memory_cache struct, which can be done at compile time. But that
> change, unfortunately, adds some grottiness to arm64 and riscv, which
> use a function-local (i.e. stack-allocated) kvm_mmu_memory_cache
> struct. Since C does not allow anonymous structs in functions, the new
> wrapper struct that contains kvm_mmu_memory_cache and the objects
> pointer array, must be named, which means dealing with an outer and
> inner struct. The outer struct can't be dropped since then there would
> be no guarantee the kvm_mmu_memory_cache struct and objects array would
> be laid out consecutively on the stack.

You may want to drop this paragraph. Someone interested in the history
can find it on the list.

>
> No functional change intended.
>
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
> arch/arm64/kvm/arm.c | 1 +
> arch/arm64/kvm/mmu.c | 5 ++++-
> arch/mips/kvm/mips.c | 2 ++
> arch/riscv/kvm/mmu.c | 8 ++++----
> arch/riscv/kvm/vcpu.c | 1 +
> arch/x86/kvm/mmu/mmu.c | 9 +++++++++
> include/linux/kvm_types.h | 9 +++++++--
> virt/kvm/kvm_main.c | 20 ++++++++++++++++++--
> 8 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 7fceb855fa71..aa1e0c1659d4 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -320,6 +320,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.target = -1;
> bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
>
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>
> /* Set up the timer */
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 53ae2c0640bc..2f2ef6b60ff4 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -764,7 +764,10 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> {
> phys_addr_t addr;
> int ret = 0;
> - struct kvm_mmu_memory_cache cache = { 0, __GFP_ZERO, NULL, };
> + struct kvm_mmu_memory_cache cache = {
> + .capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE,
> + .gfp_zero = __GFP_ZERO,
> + };
> struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE |
> KVM_PGTABLE_PROT_R |
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index a25e0b73ee70..45c7179144dc 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -387,6 +387,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> if (err)
> goto out_free_gebase;
>
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> +
> return 0;
>
> out_free_gebase:
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index f80a34fbf102..8c2338ecc246 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -347,10 +347,10 @@ static int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
> int ret = 0;
> unsigned long pfn;
> phys_addr_t addr, end;
> - struct kvm_mmu_memory_cache pcache;
> -
> - memset(&pcache, 0, sizeof(pcache));
> - pcache.gfp_zero = __GFP_ZERO;
> + struct kvm_mmu_memory_cache pcache = {
> + .capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE,
> + .gfp_zero = __GFP_ZERO,
> + };
>
> end = (gpa + size + PAGE_SIZE - 1) & PAGE_MASK;
> pfn = __phys_to_pfn(hpa);
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 6785aef4cbd4..bbcb9d4a04fb 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -94,6 +94,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> /* Mark this VCPU never ran */
> vcpu->arch.ran_atleast_once = false;
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>
> /* Setup ISA features available to VCPU */
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4b40fa2e27eb..dad7e19ef8ed 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5803,12 +5803,21 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> {
> int ret;
>
> + vcpu->arch.mmu_pte_list_desc_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache;
> vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO;
>
> + vcpu->arch.mmu_page_header_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_header_cache.kmem_cache = mmu_page_header_cache;
> vcpu->arch.mmu_page_header_cache.gfp_zero = __GFP_ZERO;
>
> + vcpu->arch.mmu_shadowed_info_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> +
> + vcpu->arch.mmu_shadow_page_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
>
> vcpu->arch.mmu = &vcpu->arch.root_mmu;
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index ac1ebb37a0ff..549103a4f7bc 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -83,14 +83,19 @@ struct gfn_to_pfn_cache {
> * MMU flows is problematic, as is triggering reclaim, I/O, etc... while
> * holding MMU locks. Note, these caches act more like prefetch buffers than
> * classical caches, i.e. objects are not returned to the cache on being freed.
> + *
> + * The storage for the cache object pointers is allocated dynamically when the
> + * cache is topped-up. The capacity field defines the number of object pointers
> + * available after the struct.
> */
> struct kvm_mmu_memory_cache {
> int nobjs;
> + int capacity;
> gfp_t gfp_zero;
> struct kmem_cache *kmem_cache;
> - void *objects[KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE];
> + void **objects;
> };
> -#endif
> +#endif /* KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE */

One thing that is missing here (and was already missing) is to make it
plain that kvm_mmu_memory_cache can only be used in contexts where
there are no concurrent accesses to the cache.
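
As an illustration only, such a note could sit in the comment above the struct. The wording below is hypothetical and not taken from the patch; struct mmu_cache_sketch is a cut-down stand-in, and the two usage examples in the comment are assumptions about typical callers.

```c
#include <stddef.h>

/*
 * Hypothetical wording, not taken from the patch:
 *
 * A cache has exactly one user at a time. Top-up, allocation and free all
 * modify nobjs and objects[] without any locking or atomics, so the caller
 * must guarantee exclusion, e.g. a per-vCPU cache touched only by its own
 * vCPU, or a function-local cache that never escapes the function.
 */
struct mmu_cache_sketch {
	int nobjs;      /* modified without locking: single user only */
	int capacity;   /* set once by the owner before the first top-up */
	void **objects; /* allocated on first top-up, freed with the cache */
};
```
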
>
> #define HALT_POLL_HIST_COUNT 32
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e089db822c12..264e4107e06f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -371,12 +371,23 @@ static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache *mc,
>
> int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min)
> {
> + gfp_t gfp = GFP_KERNEL_ACCOUNT;
> void *obj;
>
> if (mc->nobjs >= min)
> return 0;
> - while (mc->nobjs < ARRAY_SIZE(mc->objects)) {
> - obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT);
> +
> + if (WARN_ON(mc->capacity == 0))
> + return -EINVAL;
> +
> + if (!mc->objects) {
> + mc->objects = kvmalloc_array(sizeof(void *), mc->capacity, gfp);
> + if (!mc->objects)
> + return -ENOMEM;
> + }
> +
> + while (mc->nobjs < mc->capacity) {
> + obj = mmu_memory_cache_alloc_obj(mc, gfp);
> if (!obj)
> return mc->nobjs >= min ? 0 : -ENOMEM;
> mc->objects[mc->nobjs++] = obj;
> @@ -397,6 +408,11 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc)
> else
> free_page((unsigned long)mc->objects[--mc->nobjs]);
> }
> +
> + kvfree(mc->objects);
> +
> + /* Note, must set to NULL to avoid use-after-free in the next top-up. */
> + mc->objects = NULL;
> }
>
> void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)

Otherwise:

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

--
Without deviation from the norm, progress is not possible.