Date: Wed, 17 Mar 2021 09:10:20 +0000
Message-ID: <87mtv2i1s3.wl-maz@kernel.org>
From: Marc Zyngier
To: Wanpeng Li
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini
Subject: Re: [PATCH] KVM: arm: memcg awareness
In-Reply-To: <1615959984-7122-1-git-send-email-wanpengli@tencent.com>
References: <1615959984-7122-1-git-send-email-wanpengli@tencent.com>
X-Mailing-List: kvm@vger.kernel.org

On Wed, 17 Mar 2021 05:46:24 +0000,
Wanpeng Li wrote:
>
> From: Wanpeng Li
>
> KVM allocations in the arm kvm code which are tied to the life
> of the VM process should be charged to the VM process's cgroup.
> This will help the memcg controler to do the right decisions.
>
> Signed-off-by: Wanpeng Li
> ---
>  arch/arm64/kvm/arm.c               |  5 +++--
>  arch/arm64/kvm/hyp/pgtable.c       |  4 ++--
>  arch/arm64/kvm/mmu.c               |  4 ++--
>  arch/arm64/kvm/pmu-emul.c          |  2 +-
>  arch/arm64/kvm/reset.c             |  2 +-
>  arch/arm64/kvm/vgic/vgic-debug.c   |  2 +-
>  arch/arm64/kvm/vgic/vgic-init.c    |  2 +-
>  arch/arm64/kvm/vgic/vgic-irqfd.c   |  2 +-
>  arch/arm64/kvm/vgic/vgic-its.c     | 14 +++++++-------
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c |  2 +-
>  arch/arm64/kvm/vgic/vgic-v4.c      |  2 +-
>  11 files changed, 21 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 7f06ba7..8040874 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -278,9 +278,10 @@ long kvm_arch_dev_ioctl(struct file *filp,
>  struct kvm *kvm_arch_alloc_vm(void)
>  {
>  	if (!has_vhe())
> -		return kzalloc(sizeof(struct kvm), GFP_KERNEL);
> +		return kzalloc(sizeof(struct kvm), GFP_KERNEL_ACCOUNT);
>
> -	return vzalloc(sizeof(struct kvm));
> +	return __vmalloc(sizeof(struct kvm),
> +			GFP_KERNEL_ACCOUNT | __GFP_ZERO);
>  }
>
>  void kvm_arch_free_vm(struct kvm *kvm)
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 926fc07..a0845d3 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -366,7 +366,7 @@ static int hyp_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  	if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1))
>  		return -EINVAL;
>
> -	childp = (kvm_pte_t *)get_zeroed_page(GFP_KERNEL);
> +	childp = (kvm_pte_t *)get_zeroed_page(GFP_KERNEL_ACCOUNT);

No, this is wrong. You cannot account the hypervisor page tables to
the guest, because we never unmap them and because we can't
distinguish two data structures from two different VMs occupying the
same page.

>  	if (!childp)
>  		return -ENOMEM;
>
> @@ -401,7 +401,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits)
>  {
>  	u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits);
>
> -	pgt->pgd = (kvm_pte_t *)get_zeroed_page(GFP_KERNEL);
> +	pgt->pgd = (kvm_pte_t *)get_zeroed_page(GFP_KERNEL_ACCOUNT);

There is no VM in this context. There isn't even any userspace
whatsoever in the system when this is called.

[...]

> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index 66508b0..a80cc37 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -227,7 +227,7 @@ int vgic_v4_init(struct kvm *kvm)
>  	nr_vcpus = atomic_read(&kvm->online_vcpus);
>
>  	dist->its_vm.vpes = kcalloc(nr_vcpus, sizeof(*dist->its_vm.vpes),
> -				    GFP_KERNEL);
> +				    GFP_KERNEL_ACCOUNT);

And now for the elephant in the room: what do you do about the GICv4
VPTs that are allocated for each vPE?

	M.

-- 
Without deviation from the norm, progress is not possible.