From: Marc Zyngier <maz@kernel.org>
To: David Matlack <dmatlack@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Anup Patel <anup@brainfault.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Sean Christopherson <seanjc@google.com>,
Andrew Jones <drjones@redhat.com>,
Ben Gardon <bgardon@google.com>, Peter Xu <peterx@redhat.com>,
maciej.szmigiero@oracle.com,
"moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)"
<kvmarm@lists.cs.columbia.edu>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<linux-mips@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)"
<kvm@vger.kernel.org>,
"open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)"
<kvm-riscv@lists.infradead.org>,
Peter Feiner <pfeiner@google.com>,
Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [PATCH v5 20/21] KVM: Allow for different capacities in kvm_mmu_memory_cache structs
Date: Sun, 15 May 2022 12:42:52 +0100
Message-ID: <87r14v58eb.wl-maz@kernel.org>
In-Reply-To: <20220513202819.829591-21-dmatlack@google.com>

On Fri, 13 May 2022 21:28:18 +0100,
David Matlack <dmatlack@google.com> wrote:
>
> Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at
> declaration time rather than being fixed for all declarations. This will
> be used in a follow-up commit to declare a cache in x86 with a capacity
> of 512+ objects without having to increase the capacity of all caches in
> KVM.
>
> This change requires that each cache now specify its capacity at runtime,
> since the cache struct itself no longer has a fixed capacity known at
> compile time. To protect against someone accidentally defining a
> kvm_mmu_memory_cache struct directly (i.e. without setting its
> capacity), this commit adds a WARN_ON() in kvm_mmu_topup_memory_cache().
>
> In order to support different capacities, this commit changes the
> objects pointer array to be dynamically allocated the first time the
> cache is topped-up.
>
> An alternative would be to lay out the objects array after the
> kvm_mmu_memory_cache struct, which can be done at compile time. But that
> change, unfortunately, adds some grottiness to arm64 and riscv, which
> use a function-local (i.e. stack-allocated) kvm_mmu_memory_cache
> struct. Since C does not allow anonymous structs in functions, the new
> wrapper struct that contains kvm_mmu_memory_cache and the objects
> pointer array must be named, which means dealing with an outer and
> inner struct. The outer struct can't be dropped since then there would
> be no guarantee the kvm_mmu_memory_cache struct and objects array would
> be laid out consecutively on the stack.

You may want to drop this paragraph. Someone interested in the history
can find it on the list.
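
For the record, the rejected alternative would have looked roughly
like this on each stack user (an illustrative sketch only -- the
wrapper name is hypothetical and this code is not part of the patch):

	/*
	 * Naming the pair is what lets the compiler guarantee that
	 * objects[] is laid out immediately after the cache struct.
	 */
	struct kvm_mmu_memory_cache_with_storage {
		struct kvm_mmu_memory_cache cache;
		void *objects[KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE];
	};

	struct kvm_mmu_memory_cache_with_storage pcache = {
		.cache.gfp_zero = __GFP_ZERO,
	};

	/* Every caller must now reach through the outer struct. */
	ret = kvm_mmu_topup_memory_cache(&pcache.cache, 1);

which is exactly the outer/inner grottiness described above.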
>
> No functional change intended.
>
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
> arch/arm64/kvm/arm.c | 1 +
> arch/arm64/kvm/mmu.c | 5 ++++-
> arch/mips/kvm/mips.c | 2 ++
> arch/riscv/kvm/mmu.c | 8 ++++----
> arch/riscv/kvm/vcpu.c | 1 +
> arch/x86/kvm/mmu/mmu.c | 9 +++++++++
> include/linux/kvm_types.h | 9 +++++++--
> virt/kvm/kvm_main.c | 20 ++++++++++++++++++--
> 8 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 7fceb855fa71..aa1e0c1659d4 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -320,6 +320,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> vcpu->arch.target = -1;
> bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
>
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>
> /* Set up the timer */
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 53ae2c0640bc..2f2ef6b60ff4 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -764,7 +764,10 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> {
> phys_addr_t addr;
> int ret = 0;
> - struct kvm_mmu_memory_cache cache = { 0, __GFP_ZERO, NULL, };
> + struct kvm_mmu_memory_cache cache = {
> + .capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE,
> + .gfp_zero = __GFP_ZERO,
> + };
> struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE |
> KVM_PGTABLE_PROT_R |
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index a25e0b73ee70..45c7179144dc 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -387,6 +387,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> if (err)
> goto out_free_gebase;
>
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> +
> return 0;
>
> out_free_gebase:
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index f80a34fbf102..8c2338ecc246 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -347,10 +347,10 @@ static int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
> int ret = 0;
> unsigned long pfn;
> phys_addr_t addr, end;
> - struct kvm_mmu_memory_cache pcache;
> -
> - memset(&pcache, 0, sizeof(pcache));
> - pcache.gfp_zero = __GFP_ZERO;
> + struct kvm_mmu_memory_cache pcache = {
> + .capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE,
> + .gfp_zero = __GFP_ZERO,
> + };
>
> end = (gpa + size + PAGE_SIZE - 1) & PAGE_MASK;
> pfn = __phys_to_pfn(hpa);
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 6785aef4cbd4..bbcb9d4a04fb 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -94,6 +94,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> /* Mark this VCPU never ran */
> vcpu->arch.ran_atleast_once = false;
> + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>
> /* Setup ISA features available to VCPU */
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4b40fa2e27eb..dad7e19ef8ed 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5803,12 +5803,21 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
> {
> int ret;
>
> + vcpu->arch.mmu_pte_list_desc_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache;
> vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO;
>
> + vcpu->arch.mmu_page_header_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_page_header_cache.kmem_cache = mmu_page_header_cache;
> vcpu->arch.mmu_page_header_cache.gfp_zero = __GFP_ZERO;
>
> + vcpu->arch.mmu_shadowed_info_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> +
> + vcpu->arch.mmu_shadow_page_cache.capacity =
> + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE;
> vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
>
> vcpu->arch.mmu = &vcpu->arch.root_mmu;
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index ac1ebb37a0ff..549103a4f7bc 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -83,14 +83,19 @@ struct gfn_to_pfn_cache {
> * MMU flows is problematic, as is triggering reclaim, I/O, etc... while
> * holding MMU locks. Note, these caches act more like prefetch buffers than
> * classical caches, i.e. objects are not returned to the cache on being freed.
> + *
> + * The storage for the cache object pointers is allocated dynamically when the
> + * cache is topped-up. The capacity field defines the maximum number of
> + * objects the cache can hold.
> */
> struct kvm_mmu_memory_cache {
> int nobjs;
> + int capacity;
> gfp_t gfp_zero;
> struct kmem_cache *kmem_cache;
> - void *objects[KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE];
> + void **objects;
> };
> -#endif
> +#endif /* KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE */

One thing that is missing here (and was already missing) is to make it
plain that kvm_mmu_memory_cache can only be used in contexts where
there are no concurrent accesses to the cache.
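
Something along these lines in the comment block would do (the
wording below is only a suggestion):

	/*
	 * Note, there is no internal locking: a cache must only be
	 * topped-up, allocated from, and freed from a single context
	 * at a time, e.g. the vCPU thread that owns it.
	 */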
>
> #define HALT_POLL_HIST_COUNT 32
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e089db822c12..264e4107e06f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -371,12 +371,23 @@ static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache *mc,
>
> int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min)
> {
> + gfp_t gfp = GFP_KERNEL_ACCOUNT;
> void *obj;
>
> if (mc->nobjs >= min)
> return 0;
> - while (mc->nobjs < ARRAY_SIZE(mc->objects)) {
> - obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT);
> +
> + if (WARN_ON(mc->capacity == 0))
> + return -EINVAL;
> +
> + if (!mc->objects) {
> + mc->objects = kvmalloc_array(mc->capacity, sizeof(void *), gfp);
> + if (!mc->objects)
> + return -ENOMEM;
> + }
> +
> + while (mc->nobjs < mc->capacity) {
> + obj = mmu_memory_cache_alloc_obj(mc, gfp);
> if (!obj)
> return mc->nobjs >= min ? 0 : -ENOMEM;
> mc->objects[mc->nobjs++] = obj;
> @@ -397,6 +408,11 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc)
> else
> free_page((unsigned long)mc->objects[--mc->nobjs]);
> }
> +
> + kvfree(mc->objects);
> +
> + /* Note, must set to NULL to avoid use-after-free in the next top-up. */
> + mc->objects = NULL;
> }
>
> void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
Otherwise:

Reviewed-by: Marc Zyngier <maz@kernel.org>

M.
--
Without deviation from the norm, progress is not possible.