Date: Mon, 2 Sep 2024 05:31:31 +0000
From: Sebastian Ene
To: Vincent Donnefort
Cc: akpm@linux-foundation.org, alexghiti@rivosinc.com, ankita@nvidia.com,
	ardb@kernel.org, catalin.marinas@arm.com, christophe.leroy@csgroup.eu,
	james.morse@arm.com, mark.rutland@arm.com, maz@kernel.org,
	oliver.upton@linux.dev, rananta@google.com, ryan.roberts@arm.com,
	shahuang@redhat.com, suzuki.poulose@arm.com, will@kernel.org,
	yuzenghui@huawei.com, kvmarm@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH v9 4/5] KVM: arm64: Register ptdump with debugfs on guest creation
References: <20240827084549.45731-1-sebastianene@google.com>
	<20240827084549.45731-5-sebastianene@google.com>

On Fri, Aug 30, 2024 at 11:24:53AM +0100, Vincent Donnefort wrote:
> Hi Seb,
>
> Thanks for the respin.
>
> On Tue, Aug 27, 2024 at 08:45:47AM +0000, Sebastian Ene wrote:
> > While arch/*/mem/ptdump handles the kernel pagetable dumping code,
> > introduce KVM/ptdump to show the guest stage-2 pagetables. The
The > > separation is necessary because most of the definitions from the > > stage-2 pagetable reside in the KVM path and we will be invoking > > functionality specific to KVM. > > > > When a guest is created, register a new file entry under the guest > > debugfs dir which allows userspace to show the contents of the guest > > stage-2 pagetables when accessed. > > > > Signed-off-by: Sebastian Ene > > I only have some nits, otherwise: Hello Vincent, > > Reviewed-by: Vincent Donnefort > Thanks for giving me consistent feedback on the series. I will incorporate your latest suggestions in my patch series and add the tag. > > --- > > arch/arm64/include/asm/kvm_host.h | 6 + > > arch/arm64/kvm/Makefile | 1 + > > arch/arm64/kvm/arm.c | 1 + > > arch/arm64/kvm/ptdump.c | 247 ++++++++++++++++++++++++++++++ > > 4 files changed, 255 insertions(+) > > create mode 100644 arch/arm64/kvm/ptdump.c > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index a33f5996ca9f..4acd589f086b 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -1473,4 +1473,10 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val); > > (pa + pi + pa3) == 1; \ > > }) > > > > +#ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS > > +void kvm_s2_ptdump_create_debugfs(struct kvm *kvm); > > +#else > > +static inline void kvm_s2_ptdump_create_debugfs(struct kvm *kvm) {} > > +#endif /* CONFIG_PTDUMP_STAGE2_DEBUGFS */ > > + > > #endif /* __ARM64_KVM_HOST_H__ */ > > diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile > > index 86a629aaf0a1..e4233b323a73 100644 > > --- a/arch/arm64/kvm/Makefile > > +++ b/arch/arm64/kvm/Makefile > > @@ -27,6 +27,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ > > > > kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o > > kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o > > +kvm-$(CONFIG_PTDUMP_STAGE2_DEBUGFS) += ptdump.o > > > > always-y := hyp_constants.h hyp-constants.s > > > > diff --git 
a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index 9bef7638342e..b9fd928d3477 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -228,6 +228,7 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) > > void kvm_arch_create_vm_debugfs(struct kvm *kvm) > > { > > kvm_sys_regs_create_debugfs(kvm); > > + kvm_s2_ptdump_create_debugfs(kvm); > > } > > > > static void kvm_destroy_mpidr_data(struct kvm *kvm) > > diff --git a/arch/arm64/kvm/ptdump.c b/arch/arm64/kvm/ptdump.c > > new file mode 100644 > > index 000000000000..e72a928d4445 > > --- /dev/null > > +++ b/arch/arm64/kvm/ptdump.c > > @@ -0,0 +1,247 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* > > + * Debug helper used to dump the stage-2 pagetables of the system and their > > + * associated permissions. > > + * > > + * Copyright (C) Google, 2024 > > + * Author: Sebastian Ene > > + */ > > +#include > > +#include > > +#include > > + > > +#include > > +#include > > nit: I believe you wanted to follow the alphabetical order, if that is the case, > kvm_host.h then kvm_pgtable.h > > > +#include > > + > > + > > nit: don't think double empty are a rule, I would remove it. > Ack. > > +#define MARKERS_LEN (2) > > nit: The brackets are not necessary for MARKERS_LEN. > > > +#define KVM_PGTABLE_MAX_LEVELS (KVM_PGTABLE_LAST_LEVEL + 1) > > + > > +struct kvm_ptdump_guest_state { > > + struct kvm *kvm; > > + struct ptdump_pg_state parser_state; > > + struct addr_marker ipa_marker[MARKERS_LEN]; > > + struct ptdump_pg_level level[KVM_PGTABLE_MAX_LEVELS]; > > + struct ptdump_range range[MARKERS_LEN]; > > +}; > > + > > +static const struct ptdump_prot_bits stage2_pte_bits[] = { > > + { > > + .mask = PTE_VALID, > > + .val = PTE_VALID, > > + .set = " ", > > + .clear = "F", > > This is effectively never used because an invalid PTE is 0 and note_page() won't > print it. This probably can be removed? > Please see my previous reply to this. 
I would keep it around as it should print out non-zero invalid PTEs.

> > +	}, {
> > +		.mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > +		.val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | PTE_VALID,
> > +		.set = "R",
> > +		.clear = " ",
> > +	}, {
> > +		.mask = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > +		.val = KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | PTE_VALID,
> > +		.set = "W",
> > +		.clear = " ",
> > +	}, {
> > +		.mask = KVM_PTE_LEAF_ATTR_HI_S2_XN | PTE_VALID,
> > +		.val = PTE_VALID,
> > +		.set = " ",
> > +		.clear = "X",
> > +	}, {
> > +		.mask = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > +		.val = KVM_PTE_LEAF_ATTR_LO_S2_AF | PTE_VALID,
> > +		.set = "AF",
> > +		.clear = " ",
> > +	}, {
> > +		.mask = PTE_TABLE_BIT | PTE_VALID,
> > +		.val = PTE_VALID,
> > +		.set = "BLK",
> > +		.clear = " ",
> > +	},
> > +};
> > +
> > +static int kvm_ptdump_visitor(const struct kvm_pgtable_visit_ctx *ctx,
> > +			      enum kvm_pgtable_walk_flags visit)
> > +{
> > +	struct ptdump_pg_state *st = ctx->arg;
> > +	struct ptdump_state *pt_st = &st->ptdump;
> > +
> > +	note_page(pt_st, ctx->addr, ctx->level, ctx->old);
> > +
> > +	return 0;
> > +}
> > +
> > +static int kvm_ptdump_build_levels(struct ptdump_pg_level *level, u32 start_lvl)
> > +{
> > +	u32 i;
> > +	u64 mask;
> > +
> > +	if (WARN_ON_ONCE(start_lvl >= KVM_PGTABLE_LAST_LEVEL))
> > +		return -EINVAL;
> > +
> > +	mask = 0;
> > +	for (i = 0; i < ARRAY_SIZE(stage2_pte_bits); i++)
> > +		mask |= stage2_pte_bits[i].mask;
> > +
> > +	for (i = start_lvl; i < KVM_PGTABLE_MAX_LEVELS; i++) {
> > +		snprintf(level[i].name, sizeof(level[i].name), "%d", i);
>
> %u, i being unsigned.

Ack.
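
Coming back to the PTE_VALID entry discussed above, here is a minimal
userspace sketch of why I would keep it. It assumes a simplified model of
the decoding: the descriptor is ANDed with the OR of all entry masks (the
same accumulation kvm_ptdump_build_levels() does), nothing is printed when
the result is zero, and otherwise every entry contributes its set or clear
string. The S2_* bit positions, struct prot_bits, combined_mask() and
decode_pte() are hypothetical names for illustration, not the kernel
definitions, and only four of the six entries are modelled.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed bit positions, for illustration only. */
#define S2_VALID	(1ULL << 0)
#define S2_R		(1ULL << 6)
#define S2_W		(1ULL << 7)
#define S2_AF		(1ULL << 10)

/* Hypothetical stand-in for struct ptdump_prot_bits. */
struct prot_bits {
	uint64_t mask;
	uint64_t val;
	const char *set;
	const char *clear;
};

static const struct prot_bits s2_bits[] = {
	{ S2_VALID,		S2_VALID,		" ",	"F" },
	{ S2_R | S2_VALID,	S2_R | S2_VALID,	"R",	" " },
	{ S2_W | S2_VALID,	S2_W | S2_VALID,	"W",	" " },
	{ S2_AF | S2_VALID,	S2_AF | S2_VALID,	"AF",	" " },
};

#define N_BITS (sizeof(s2_bits) / sizeof(s2_bits[0]))

/* OR of every entry's mask, mirroring the first loop of build_levels. */
static uint64_t combined_mask(void)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < N_BITS; i++)
		mask |= s2_bits[i].mask;
	return mask;
}

/*
 * Decode one descriptor into buf. Returns the number of characters
 * written, or 0 when no attribute bit is set (nothing is printed).
 */
static size_t decode_pte(uint64_t pte, char *buf, size_t len)
{
	uint64_t prot = pte & combined_mask();
	size_t off = 0;
	size_t i;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	if (!prot)
		return 0;

	for (i = 0; i < N_BITS; i++) {
		const struct prot_bits *b = &s2_bits[i];
		const char *s = (prot & b->mask) == b->val ? b->set : b->clear;
		int n = snprintf(buf + off, len - off, "%s", s);

		if (n < 0 || (size_t)n >= len - off)
			break;	/* stop on error or truncation */
		off += (size_t)n;
	}
	return off;
}
```

In this model a zero descriptor produces no output, an invalid descriptor
that still carries the R bit decodes to "F" followed by three spaces, and a
valid readable page with AF set decodes to " R AF". So the "F" entry only
becomes visible for non-zero invalid PTEs.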
>
> > +
> > +		level[i].num = ARRAY_SIZE(stage2_pte_bits);
> > +		level[i].bits = stage2_pte_bits;
> > +		level[i].mask = mask;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static struct kvm_ptdump_guest_state *kvm_ptdump_parser_create(struct kvm *kvm)
> > +{
> > +	struct kvm_ptdump_guest_state *st;
> > +	struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > +	struct kvm_pgtable *pgtable = mmu->pgt;
> > +	int ret;
> > +
> > +	st = kzalloc(sizeof(struct kvm_ptdump_guest_state), GFP_KERNEL_ACCOUNT);
> > +	if (!st)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	ret = kvm_ptdump_build_levels(&st->level[0], pgtable->start_level);
> > +	if (ret) {
> > +		kfree(st);
> > +		return ERR_PTR(ret);
> > +	}
> > +
> > +	st->ipa_marker[0].name = "Guest IPA";
> > +	st->ipa_marker[1].start_address = BIT(pgtable->ia_bits);
> > +	st->range[0].end = BIT(pgtable->ia_bits);
> > +
> > +	st->kvm = kvm;
> > +	st->parser_state = (struct ptdump_pg_state) {
> > +		.marker = &st->ipa_marker[0],
> > +		.level = -1,
> > +		.pg_level = &st->level[0],
> > +		.ptdump.range = &st->range[0],
> > +		.start_address = 0,
> > +	};
> > +
> > +	return st;
> > +}
> > +
> > +static int kvm_ptdump_guest_show(struct seq_file *m, void *unused)
> > +{
> > +	int ret;
> > +	struct kvm_ptdump_guest_state *st = m->private;
> > +	struct kvm *kvm = st->kvm;
> > +	struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
> > +	struct ptdump_pg_state *parser_state = &st->parser_state;
> > +	struct kvm_pgtable_walker walker = (struct kvm_pgtable_walker) {
> > +		.cb = kvm_ptdump_visitor,
> > +		.arg = parser_state,
> > +		.flags = KVM_PGTABLE_WALK_LEAF,
> > +	};
> > +
> > +	parser_state->seq = m;
> > +
> > +	write_lock(&kvm->mmu_lock);
> > +	ret = kvm_pgtable_walk(mmu->pgt, 0, BIT(mmu->pgt->ia_bits), &walker);
> > +	write_unlock(&kvm->mmu_lock);
> > +
> > +	return ret;
> > +}
> > +
> > +static int kvm_ptdump_guest_open(struct inode *m, struct file *file)
> > +{
> > +	struct kvm *kvm = m->i_private;
> > +	struct kvm_ptdump_guest_state *st;
> > +	int ret;
> > +
> > +	if (!kvm_get_kvm_safe(kvm))
> > +		return -ENOENT;
> > +
> > +	st = kvm_ptdump_parser_create(kvm);
> > +	if (IS_ERR(st)) {
> > +		ret = PTR_ERR(st);
> > +		goto free_with_kvm_ref;
> > +	}
> > +
> > +	ret = single_open(file, kvm_ptdump_guest_show, st);
> > +	if (!ret)
> > +		return 0;
> > +
> > +	kfree(st);
> > +free_with_kvm_ref:
>
> nit: I believe kfree understands IS_ERR() so you could have a simple "err:"
> label covering all the error path.
>
> > +	kvm_put_kvm(kvm);
> > +	return ret;
> > +}
> > +
> > +static int kvm_ptdump_guest_close(struct inode *m, struct file *file)
> > +{
> > +	struct kvm *kvm = m->i_private;
> > +	void *st = ((struct seq_file *)file->private_data)->private;
> > +
> > +	kfree(st);
> > +	kvm_put_kvm(kvm);
> > +
> > +	return single_release(m, file);
> > +}
> > +
> > +static const struct file_operations kvm_ptdump_guest_fops = {
> > +	.open = kvm_ptdump_guest_open,
> > +	.read = seq_read,
> > +	.llseek = seq_lseek,
> > +	.release = kvm_ptdump_guest_close,
> > +};
> > +
> > +static int kvm_pgtable_debugfs_show(struct seq_file *m, void *unused)
> > +{
> > +	const struct file *file = m->file;
> > +	struct kvm_pgtable *pgtable = m->private;
> > +
> > +	if (!strcmp(file_dentry(file)->d_iname, "ipa_range"))
> > +		seq_printf(m, "%2u\n", pgtable->ia_bits);
> > +	else if (!strcmp(file_dentry(file)->d_iname, "stage2_levels"))
> > +		seq_printf(m, "%1d\n", KVM_PGTABLE_LAST_LEVEL - pgtable->start_level + 1);
>
> nit: KVM_PGTABLE_MAX_LEVELS - pgtable->start_level ?

Yes, we can use this one.
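
As a quick sanity check that the two expressions agree, here is a small
userspace sketch. The value 3 for KVM_PGTABLE_LAST_LEVEL is an assumption
made for the example; KVM_PGTABLE_MAX_LEVELS is defined as in this patch,
and levels_old() and levels_new() are hypothetical helper names.

```c
#include <stdio.h>

/* Assumed value for illustration; the real one comes from the KVM headers. */
#define KVM_PGTABLE_LAST_LEVEL	3
/* Same definition as in the patch. The brackets matter in this macro. */
#define KVM_PGTABLE_MAX_LEVELS	(KVM_PGTABLE_LAST_LEVEL + 1)

/* Expression currently used for the "stage2_levels" file. */
static int levels_old(int start_level)
{
	return KVM_PGTABLE_LAST_LEVEL - start_level + 1;
}

/* Suggested replacement. */
static int levels_new(int start_level)
{
	return KVM_PGTABLE_MAX_LEVELS - start_level;
}

/* Print both results for every start level covered by the sketch. */
static void print_levels(void)
{
	int s;

	for (s = 0; s <= KVM_PGTABLE_LAST_LEVEL; s++)
		printf("start_level=%d old=%d new=%d\n",
		       s, levels_old(s), levels_new(s));
}
```

Both helpers return the same count for each start level in that range.
Without the brackets around KVM_PGTABLE_LAST_LEVEL + 1 the new expression
would still expand correctly here, but the macro would break in contexts
such as a multiplication, which is why they stay on KVM_PGTABLE_MAX_LEVELS
and can go on MARKERS_LEN.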
>
> > +	return 0;
> > +}
> > +
> > +static int kvm_pgtable_debugfs_open(struct inode *m, struct file *file)
> > +{
> > +	struct kvm *kvm = m->i_private;
> > +	struct kvm_pgtable *pgtable;
> > +	int ret;
> > +
> > +	if (!kvm_get_kvm_safe(kvm))
> > +		return -ENOENT;
> > +
> > +	pgtable = kvm->arch.mmu.pgt;
> > +
> > +	ret = single_open(file, kvm_pgtable_debugfs_show, pgtable);
> > +	if (ret < 0)
> > +		kvm_put_kvm(kvm);
> > +	return ret;
> > +}
> > +
> > +static int kvm_pgtable_debugfs_close(struct inode *m, struct file *file)
> > +{
> > +	struct kvm *kvm = m->i_private;
> > +
> > +	kvm_put_kvm(kvm);
> > +	return single_release(m, file);
> > +}
> > +
> > +static const struct file_operations kvm_pgtable_debugfs_fops = {
> > +	.open = kvm_pgtable_debugfs_open,
> > +	.read = seq_read,
> > +	.llseek = seq_lseek,
> > +	.release = kvm_pgtable_debugfs_close,
> > +};
> > +
> > +void kvm_s2_ptdump_create_debugfs(struct kvm *kvm)
> > +{
> > +	debugfs_create_file("stage2_page_tables", 0400, kvm->debugfs_dentry,
> > +			    kvm, &kvm_ptdump_guest_fops);
> > +	debugfs_create_file("ipa_range", 0400, kvm->debugfs_dentry, kvm,
> > +			    &kvm_pgtable_debugfs_fops);
> > +	debugfs_create_file("stage2_levels", 0400, kvm->debugfs_dentry,
> > +			    kvm, &kvm_pgtable_debugfs_fops);
> > +}
> > --
> > 2.46.0.295.g3b9ea8a38a-goog
> >
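
On the single "err:" label suggested for kvm_ptdump_guest_open(): below is
a userspace sketch of what that shape could look like. It is only a model.
err_ptr(), ptr_err(), is_err(), state_create(), fake_single_open() and the
refs counter are hypothetical stand-ins for ERR_PTR(), PTR_ERR(), IS_ERR(),
kvm_ptdump_parser_create(), single_open() and the kvm reference. The sketch
guards the free with is_err() explicitly, since whether kfree() itself
tolerates an ERR_PTR() value is something I would verify before relying
on it.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Hypothetical userspace stand-ins for the kernel ERR_PTR helpers. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static int refs;	/* models the reference taken on the kvm struct */

/* Models the state allocation; fails with -ENOMEM on request. */
static void *state_create(int fail)
{
	return fail ? err_ptr(-ENOMEM) : calloc(1, 16);
}

/* Models single_open(); fails with -ENOMEM on request. */
static int fake_single_open(int fail)
{
	return fail ? -ENOMEM : 0;
}

static int guest_open(int fail_create, int fail_open, void **out)
{
	void *st;
	int ret;

	refs++;				/* take the reference */

	st = state_create(fail_create);
	if (is_err(st)) {
		ret = (int)ptr_err(st);
		goto err;
	}

	ret = fake_single_open(fail_open);
	if (!ret) {
		*out = st;		/* success: keep state and reference */
		return 0;
	}

err:
	if (!is_err(st))		/* only free a real allocation */
		free(st);
	refs--;				/* drop the reference on any failure */
	return ret;
}
```

Both failure paths go through the same label, release the state only when
it was really allocated, and leave the reference count where it started.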