linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
	Anup Patel <anup@brainfault.org>,
	Atish Patra <atishp@atishpatra.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	James Morse <james.morse@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	linux-mips@vger.kernel.org, kvm@vger.kernel.org,
	kvm-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v3 4/6] KVM: arm64/mmu: count KVM page table pages in pagetable stats
Date: Tue, 26 Apr 2022 16:58:07 +0100	[thread overview]
Message-ID: <874k2falbk.wl-maz@kernel.org> (raw)
In-Reply-To: <20220426053904.3684293-5-yosryahmed@google.com>

On Tue, 26 Apr 2022 06:39:02 +0100,
Yosry Ahmed <yosryahmed@google.com> wrote:
> 
> Count the pages used by KVM in arm64 for page tables in pagetable stats.
> 
> Account pages allocated for PTEs in pgtable init functions and
> kvm_set_table_pte().
> 
> Since most page table pages are freed using put_page(), add a helper
> function put_pte_page() that checks if this is the last ref for a pte
> page before putting it, and unaccounts stats accordingly.
> 
> Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> ---
>  arch/arm64/kernel/image-vars.h |  3 ++
>  arch/arm64/kvm/hyp/pgtable.c   | 50 +++++++++++++++++++++-------------
>  2 files changed, 34 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> index 241c86b67d01..25bf058714f6 100644
> --- a/arch/arm64/kernel/image-vars.h
> +++ b/arch/arm64/kernel/image-vars.h
> @@ -143,6 +143,9 @@ KVM_NVHE_ALIAS(__hyp_rodata_end);
>  /* pKVM static key */
>  KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
>  
> +/* Called by kvm_account_pgtable_pages() to update pagetable stats */
> +KVM_NVHE_ALIAS(__mod_lruvec_page_state);

This cannot be right. It means that this function will be called
directly from the EL2 code when in protected mode, and will result in
extreme fireworks.  There is no way you can call core kernel stuff
like this from this context.

Please do not add random symbols to this list just for the sake of
being able to link the kernel.

> +
>  #endif /* CONFIG_KVM */
>  
>  #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 2cb3867eb7c2..53e13c3313e9 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -152,6 +152,7 @@ static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
>  
>  	WARN_ON(kvm_pte_valid(old));
>  	smp_store_release(ptep, pte);
> +	kvm_account_pgtable_pages((void *)childp, +1);

Why the + sign?

>  }
>  
>  static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
> @@ -326,6 +327,14 @@ int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
>  	return ret;
>  }
>  
> +static void put_pte_page(kvm_pte_t *ptep, struct kvm_pgtable_mm_ops *mm_ops)
> +{
> +	/* If this is the last page ref, decrement pagetable stats first. */
> +	if (!mm_ops->page_count || mm_ops->page_count(ptep) == 1)
> +		kvm_account_pgtable_pages((void *)ptep, -1);
> +	mm_ops->put_page(ptep);
> +}
> +
>  struct hyp_map_data {
>  	u64				phys;
>  	kvm_pte_t			attr;
> @@ -488,10 +497,10 @@ static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  
>  	dsb(ish);
>  	isb();
> -	mm_ops->put_page(ptep);
> +	put_pte_page(ptep, mm_ops);
>  
>  	if (childp)
> -		mm_ops->put_page(childp);
> +		put_pte_page(childp, mm_ops);
>  
>  	return 0;
>  }
> @@ -522,6 +531,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
>  	pgt->pgd = (kvm_pte_t *)mm_ops->zalloc_page(NULL);
>  	if (!pgt->pgd)
>  		return -ENOMEM;
> +	kvm_account_pgtable_pages((void *)pgt->pgd, +1);
>  
>  	pgt->ia_bits		= va_bits;
>  	pgt->start_level	= KVM_PGTABLE_MAX_LEVELS - levels;
> @@ -541,10 +551,10 @@ static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  	if (!kvm_pte_valid(pte))
>  		return 0;
>  
> -	mm_ops->put_page(ptep);
> +	put_pte_page(ptep, mm_ops);
>  
>  	if (kvm_pte_table(pte, level))
> -		mm_ops->put_page(kvm_pte_follow(pte, mm_ops));
> +		put_pte_page(kvm_pte_follow(pte, mm_ops), mm_ops);

OK, I see the pattern. I don't think this workable as such. I'd rather
the callbacks themselves (put_page, zalloc_page*) call into the
accounting code when it makes sense, rather than spreading the
complexity and having to special case the protected case.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  parent reply	other threads:[~2022-04-26 15:58 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-26  5:38 [PATCH v3 0/6] KVM: mm: count KVM page table pages in memory stats Yosry Ahmed
2022-04-26  5:38 ` [PATCH v3 1/6] mm: add NR_SECONDARY_PAGETABLE stat Yosry Ahmed
2022-04-26  5:39 ` [PATCH v3 2/6] KVM: mmu: add a helper to account page table pages used by KVM Yosry Ahmed
2022-04-26  5:39 ` [PATCH v3 3/6] KVM: x86/mmu: count KVM page table pages in pagetable stats Yosry Ahmed
2022-04-26  5:39 ` [PATCH v3 4/6] KVM: arm64/mmu: " Yosry Ahmed
2022-04-26  7:34   ` Oliver Upton
2022-04-26 19:27     ` Yosry Ahmed
2022-04-28 17:45       ` Oliver Upton
2022-04-28 23:49         ` Yosry Ahmed
2022-04-26 15:58   ` Marc Zyngier [this message]
2022-04-26 19:33     ` Yosry Ahmed
2022-04-26  5:39 ` [PATCH v3 5/6] KVM: riscv/mmu: " Yosry Ahmed
2022-04-26  5:39 ` [PATCH v3 6/6] KVM: mips/mmu: " Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874k2falbk.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aleksandar.qemu.devel@gmail.com \
    --cc=alexandru.elisei@arm.com \
    --cc=anup@brainfault.org \
    --cc=atishp@atishpatra.org \
    --cc=catalin.marinas@arm.com \
    --cc=cgroups@vger.kernel.org \
    --cc=chenhuacai@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=james.morse@arm.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=roman.gushchin@linux.dev \
    --cc=seanjc@google.com \
    --cc=shakeelb@google.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).