From: Oliver Upton <oliver.upton@linux.dev>
To: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, maz@kernel.org,
darren@os.amperecomputing.com,
d.scott.phillips@amperecomputing.com,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [RFC PATCH] kvm: nv: Optimize the unmapping of shadow S2-MMU tables.
Date: Tue, 5 Mar 2024 08:46:44 +0000
Message-ID: <Zebb9CyihqC4JqnK@linux.dev>
In-Reply-To: <20240305054606.13261-1-gankulkarni@os.amperecomputing.com>
-cc old kvmarm list
+cc new kvmarm list, reviewers
Please run scripts/get_maintainer.pl next time around so we get the
right people looking at a patch.
On Mon, Mar 04, 2024 at 09:46:06PM -0800, Ganapatrao Kulkarni wrote:
> @@ -216,6 +223,13 @@ struct kvm_s2_mmu {
> * >0: Somebody is actively using this.
> */
> atomic_t refcnt;
> +
> + /*
> + * For a Canonical IPA to Shadow IPA mapping.
> + */
> + struct rb_root nested_mapipa_root;
There isn't any benefit to tracking the canonical IPA -> shadow IPA(s)
mapping on a per-S2 basis, as there already exists a one-to-many problem
(more below). Maintaining a per-VM data structure (since this is keyed
by canonical IPA) makes a bit more sense.
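Roughly something like this is what I have in mind (field name made up,
and whether an rbtree is even the right structure is a separate
question, more on that below):

	/* in struct kvm_arch rather than struct kvm_s2_mmu */
	struct rb_root	nested_mapipa_root;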
> + rwlock_t mmu_lock;
> +
Err, is there any reason the existing mmu_lock is insufficient here?
Surely taking a new reference on a canonical IPA for a shadow S2 must be
done behind the MMU lock for it to be safe against MMU notifiers...
Also, reusing the exact same name for it is sure to produce some lock
imbalance funnies.
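If the insertion happens on the stage-2 fault path then the existing
lock should already be held there, in which case something like the
below (untested, assuming the caller has the struct kvm at hand) is all
you need instead of a second lock:

	lockdep_assert_held(&kvm->mmu_lock);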
> };
>
> static inline bool kvm_s2_mmu_valid(struct kvm_s2_mmu *mmu)
> diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> index da7ebd2f6e24..c31a59a1fdc6 100644
> --- a/arch/arm64/include/asm/kvm_nested.h
> +++ b/arch/arm64/include/asm/kvm_nested.h
> @@ -65,6 +65,9 @@ extern void kvm_init_nested(struct kvm *kvm);
> extern int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu);
> extern void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu);
> extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu);
> +extern void add_shadow_ipa_map_node(
> + struct kvm_s2_mmu *mmu,
> + phys_addr_t ipa, phys_addr_t shadow_ipa, long size);
style nitpick: don't put a newline between the opening parenthesis and
the first parameter. Wrap as needed at 80 (or slightly more) columns.
> +/*
> + * Create a node and add to lookup table, when a page is mapped to
> + * Canonical IPA and also mapped to Shadow IPA.
> + */
> +void add_shadow_ipa_map_node(struct kvm_s2_mmu *mmu,
> + phys_addr_t ipa,
> + phys_addr_t shadow_ipa, long size)
> +{
> + struct rb_root *ipa_root = &(mmu->nested_mapipa_root);
> + struct rb_node **node = &(ipa_root->rb_node), *parent = NULL;
> + struct mapipa_node *new;
> +
> + new = kzalloc(sizeof(struct mapipa_node), GFP_KERNEL);
> + if (!new)
> + return;
Should be GFP_KERNEL_ACCOUNT, you want to charge this to the user.
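i.e. something like:

	new = kzalloc(sizeof(*new), GFP_KERNEL_ACCOUNT);

(using sizeof(*new) rather than spelling out the type is also the
preferred style.)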
> +
> + new->shadow_ipa = shadow_ipa;
> + new->ipa = ipa;
> + new->size = size;
What about aliasing? You could have multiple shadow IPAs that point to
the same canonical IPA, even within a single MMU.
> + write_lock(&mmu->mmu_lock);
> +
> + while (*node) {
> + struct mapipa_node *tmp;
> +
> + tmp = container_of(*node, struct mapipa_node, node);
> + parent = *node;
> + if (new->ipa < tmp->ipa) {
> + node = &(*node)->rb_left;
> + } else if (new->ipa > tmp->ipa) {
> + node = &(*node)->rb_right;
> + } else {
> + write_unlock(&mmu->mmu_lock);
> + kfree(new);
> + return;
> + }
> + }
> +
> + rb_link_node(&new->node, parent, node);
> + rb_insert_color(&new->node, ipa_root);
> + write_unlock(&mmu->mmu_lock);
Meh, one of the annoying things with rbtree is you have to build your
own search functions...
It would appear that the rbtree intends to express intervals (i.e. GPA +
size), but the search implementation treats GPA as an index. So I don't
think this works as intended.
Have you considered other abstract data types (e.g. xarray, maple tree)
and how they might apply here?
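For example, a maple tree already knows how to store and look up ranges.
A completely untested sketch of what the insert/lookup side could look
like, kept per-VM as suggested above (all names made up, and it still
punts on the one-to-many aliasing problem):

	#include <linux/maple_tree.h>
	#include <linux/xarray.h>

	/* per-VM tree, e.g. a new field in struct kvm_arch */
	struct maple_tree	nested_mapipa_mt;

	/* at VM / nested init time */
	mt_init(&kvm->arch.nested_mapipa_mt);

	/* insert: index the tree by the canonical IPA range */
	static int add_shadow_ipa_map(struct kvm *kvm, phys_addr_t ipa,
				      phys_addr_t shadow_ipa, long size)
	{
		/* store the shadow IPA as a value entry to avoid an allocation */
		return mtree_store_range(&kvm->arch.nested_mapipa_mt,
					 ipa, ipa + size - 1,
					 xa_mk_value(shadow_ipa >> PAGE_SHIFT),
					 GFP_KERNEL_ACCOUNT);
	}

	/* lookup: any address inside the stored range resolves */
	static bool get_shadow_ipa(struct kvm *kvm, phys_addr_t ipa,
				   phys_addr_t *shadow_ipa)
	{
		void *entry = mtree_load(&kvm->arch.nested_mapipa_mt, ipa);

		if (!entry)
			return false;

		*shadow_ipa = (phys_addr_t)xa_to_value(entry) << PAGE_SHIFT;
		return true;
	}

mtree_erase() would then give you the destructive variant for the paths
that actually tear the mapping down, rather than having every lookup
evict the entry (see below).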
> +bool get_shadow_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, phys_addr_t *shadow_ipa, long *size)
> +{
> + struct rb_node *node;
> + struct mapipa_node *tmp = NULL;
> +
> + read_lock(&mmu->mmu_lock);
> + node = mmu->nested_mapipa_root.rb_node;
> +
> + while (node) {
> + tmp = container_of(node, struct mapipa_node, node);
> +
> + if (tmp->ipa == ipa)
> + break;
> + else if (ipa > tmp->ipa)
> + node = node->rb_right;
> + else
> + node = node->rb_left;
> + }
> +
> + read_unlock(&mmu->mmu_lock);
> +
> + if (tmp && tmp->ipa == ipa) {
> + *shadow_ipa = tmp->shadow_ipa;
> + *size = tmp->size;
> + write_lock(&mmu->mmu_lock);
> + rb_erase(&tmp->node, &mmu->nested_mapipa_root);
> + write_unlock(&mmu->mmu_lock);
> + kfree(tmp);
> + return true;
> + }
Implicitly evicting the entry isn't going to work if we want to use it
for updates to a stage-2 that do not evict the mapping, like write
protection or access flag updates.
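With a range-based lookup along the lines of the sketch above, that
could be as simple as the caller choosing between a plain load and an
erase (again untested, flag name invented):

	if (evict)
		entry = mtree_erase(&kvm->arch.nested_mapipa_mt, ipa);
	else
		entry = mtree_load(&kvm->arch.nested_mapipa_mt, ipa);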
--
Thanks,
Oliver