Date: Tue, 5 Mar 2024 08:46:44 +0000
From: Oliver Upton
To: Ganapatrao Kulkarni
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	maz@kernel.org, darren@os.amperecomputing.com,
	d.scott.phillips@amperecomputing.com, James Morse, Suzuki K Poulose,
	Zenghui Yu
Subject: Re: [RFC PATCH] kvm: nv: Optimize the unmapping of shadow S2-MMU tables.
In-Reply-To: <20240305054606.13261-1-gankulkarni@os.amperecomputing.com>

-cc old kvmarm list
+cc new kvmarm list, reviewers

Please run scripts/get_maintainer.pl next time around so we get the
right people looking at a patch.

On Mon, Mar 04, 2024 at 09:46:06PM -0800, Ganapatrao Kulkarni wrote:
> @@ -216,6 +223,13 @@ struct kvm_s2_mmu {
>  	 * >0: Somebody is actively using this.
>  	 */
>  	atomic_t refcnt;
> +
> +	/*
> +	 * For a Canonical IPA to Shadow IPA mapping.
> +	 */
> +	struct rb_root nested_mapipa_root;

There isn't any benefit to tracking the canonical IPA -> shadow IPA(s)
mapping on a per-S2 basis, as there already exists a one-to-many problem
(more below). Maintaining a per-VM data structure (since this is keyed
by canonical IPA) makes a bit more sense.

> +	rwlock_t mmu_lock;
> +

Err, is there any reason the existing mmu_lock is insufficient here?
Surely taking a new reference on a canonical IPA for a shadow S2 must be
done behind the MMU lock for it to be safe against MMU notifiers...

Also, reusing the exact same name for it is sure to produce some lock
imbalance funnies.

>  };
>
>  static inline bool kvm_s2_mmu_valid(struct kvm_s2_mmu *mmu)
> diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> index da7ebd2f6e24..c31a59a1fdc6 100644
> --- a/arch/arm64/include/asm/kvm_nested.h
> +++ b/arch/arm64/include/asm/kvm_nested.h
> @@ -65,6 +65,9 @@ extern void kvm_init_nested(struct kvm *kvm);
>  extern int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu);
>  extern void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu);
>  extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu);
> +extern void add_shadow_ipa_map_node(
> +		struct kvm_s2_mmu *mmu,
> +		phys_addr_t ipa, phys_addr_t shadow_ipa, long size);

Style nitpick: no newline between the open bracket and first parameter.
Wrap as needed at 80 (or a bit more) columns.

> +/*
> + * Create a node and add to lookup table, when a page is mapped to
> + * Canonical IPA and also mapped to Shadow IPA.
> + */
> +void add_shadow_ipa_map_node(struct kvm_s2_mmu *mmu,
> +			phys_addr_t ipa,
> +			phys_addr_t shadow_ipa, long size)
> +{
> +	struct rb_root *ipa_root = &(mmu->nested_mapipa_root);
> +	struct rb_node **node = &(ipa_root->rb_node), *parent = NULL;
> +	struct mapipa_node *new;
> +
> +	new = kzalloc(sizeof(struct mapipa_node), GFP_KERNEL);
> +	if (!new)
> +		return;

Should be GFP_KERNEL_ACCOUNT, you want to charge this to the user.
> +
> +	new->shadow_ipa = shadow_ipa;
> +	new->ipa = ipa;
> +	new->size = size;

What about aliasing? You could have multiple shadow IPAs that point to
the same canonical IPA, even within a single MMU.

> +	write_lock(&mmu->mmu_lock);
> +
> +	while (*node) {
> +		struct mapipa_node *tmp;
> +
> +		tmp = container_of(*node, struct mapipa_node, node);
> +		parent = *node;
> +		if (new->ipa < tmp->ipa) {
> +			node = &(*node)->rb_left;
> +		} else if (new->ipa > tmp->ipa) {
> +			node = &(*node)->rb_right;
> +		} else {
> +			write_unlock(&mmu->mmu_lock);
> +			kfree(new);
> +			return;
> +		}
> +	}
> +
> +	rb_link_node(&new->node, parent, node);
> +	rb_insert_color(&new->node, ipa_root);
> +	write_unlock(&mmu->mmu_lock);

Meh, one of the annoying things with rbtree is you have to build your
own search functions...

It would appear that the rbtree intends to express intervals (i.e. GPA +
size), but the search implementation treats GPA as an index. So I don't
think this works as intended.

Have you considered other abstract data types (e.g. xarray, maple tree)
and how they might apply here?
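
To illustrate the interval-versus-index point, here is a minimal
userspace sketch. It is plain C, not kernel code, and not taken from the
patch. The names ipa_range, lookup_exact, lookup_overlap and demo_table
are hypothetical, and a sorted array of non-overlapping ranges stands in
for the rbtree. lookup_exact mirrors a search that only compares the
start address; lookup_overlap treats each entry as [ipa, ipa + size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mapipa_node: covers [ipa, ipa + size). */
struct ipa_range {
	uint64_t ipa;		/* canonical IPA at the start of the range */
	uint64_t shadow_ipa;	/* shadow IPA that maps the start of the range */
	uint64_t size;		/* length in bytes, non-zero */
};

/* Illustrative data only: sorted by ipa, ranges do not overlap. */
static const struct ipa_range demo_table[] = {
	{ 0x001000, 0x80001000, 0x001000 },	/* one 4KiB page */
	{ 0x200000, 0x80200000, 0x200000 },	/* one 2MiB block */
	{ 0x800000, 0x80800000, 0x001000 },	/* one 4KiB page */
};

/* Index-style search: only hits when ipa equals the start of an entry. */
static const struct ipa_range *lookup_exact(const struct ipa_range *tbl,
					    size_t n, uint64_t ipa)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].ipa == ipa)
			return &tbl[mid];
		if (tbl[mid].ipa < ipa)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/*
 * Interval-style search: return the lowest entry that overlaps
 * [ipa, ipa + size), or NULL. Address wrap-around is ignored here.
 */
static const struct ipa_range *lookup_overlap(const struct ipa_range *tbl,
					      size_t n, uint64_t ipa,
					      uint64_t size)
{
	size_t lo = 0, hi = n;

	assert(size > 0);

	/* Find the first entry whose end lies above ipa. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].ipa + tbl[mid].size <= ipa)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* It overlaps only if it also starts below the end of the query. */
	if (lo < n && tbl[lo].ipa < ipa + size)
		return &tbl[lo];
	return NULL;
}
```

With this table, a query for the page at 0x201000 misses in lookup_exact
even though the 2MiB block starting at 0x200000 covers it, while
lookup_overlap finds that block. This sketch does not address aliasing
or locking.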
> +bool get_shadow_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, phys_addr_t *shadow_ipa, long *size)
> +{
> +	struct rb_node *node;
> +	struct mapipa_node *tmp = NULL;
> +
> +	read_lock(&mmu->mmu_lock);
> +	node = mmu->nested_mapipa_root.rb_node;
> +
> +	while (node) {
> +		tmp = container_of(node, struct mapipa_node, node);
> +
> +		if (tmp->ipa == ipa)
> +			break;
> +		else if (ipa > tmp->ipa)
> +			node = node->rb_right;
> +		else
> +			node = node->rb_left;
> +	}
> +
> +	read_unlock(&mmu->mmu_lock);
> +
> +	if (tmp && tmp->ipa == ipa) {
> +		*shadow_ipa = tmp->shadow_ipa;
> +		*size = tmp->size;
> +		write_lock(&mmu->mmu_lock);
> +		rb_erase(&tmp->node, &mmu->nested_mapipa_root);
> +		write_unlock(&mmu->mmu_lock);
> +		kfree(tmp);
> +		return true;
> +	}

Implicitly evicting the entry isn't going to work if we want to use it
for updates to a stage-2 that do not evict the mapping, like write
protection or access flag updates.

--
Thanks,
Oliver