From: Jason Gunthorpe <jgg@nvidia.com>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
jean-philippe@linaro.org, miko.lenczewski@arm.com,
balbirs@nvidia.com, peterz@infradead.org, smostafa@google.com,
kevin.tian@intel.com, praan@google.com,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
Date: Wed, 24 Sep 2025 18:42:15 -0300 [thread overview]
Message-ID: <20250924214215.GR2617119@nvidia.com> (raw)
In-Reply-To: <d3f9777e46d531273dfecb54d69725428cdef308.1757373449.git.nicolinc@nvidia.com>
On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
> Update the invs array with the invalidations required by each domain type
> during attachment operations.
>
> Only an SVA domain or a paging domain will have an invs array:
> a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
> per SID
>
> b. Non-nesting-parent paging domain with no ATS-enabled master will add
> a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
>
> c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
> (b) and add an INV_TYPE_ATS per SID
>
> d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
> an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
> will add an INV_TYPE_ATS_FULL per SID
Just some minor forward looking clarification - this behavior should
be triggered when a nest-parent is attached through the viommu using
a nesting domain with a vSTE.
A nesting-parent that is just normally attached should act like a
normal S2 since it does not and can not have a two stage S1 on top of
it.
We can't quite get there yet until the next series of changing how the
VMID allocation works.
> The per-domain invalidation is not needed, until the domain is attached to
> a master, i.e. a possible translation request. Giving this clears a way to
> allowing the domain to be attached to many SMMUs, and avoids any pointless
> invalidation overheads during a teardown if there are no STE/CDs referring
> to the domain. This also means, when the last device is detached, the old
> domain must flush its ASID or VMID because any iommu_unmap() call after it
> wouldn't initiate any invalidation given an empty domain invs array.
Grammar/phrasing in this paragraph
> Introduce some arm_smmu_invs helper functions for building scratch arrays,
> preparing and installing old/new domain's invalidation arrays.
>
> Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
> 2 files changed, 332 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 246c6d84de3ab..e4e0e066108cc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -678,6 +678,8 @@ struct arm_smmu_inv {
> /**
> * struct arm_smmu_invs - Per-domain invalidation array
> * @num_invs: number of invalidations in the flexible array
> + * @old: flag to synchronize with reader
> + * @rwlock: optional rwlock to fench ATS operations
> * @rcu: rcu head for kfree_rcu()
> * @inv: flexible invalidation array
> *
> @@ -703,6 +705,8 @@ struct arm_smmu_inv {
> */
> struct arm_smmu_invs {
> size_t num_invs;
> + rwlock_t rwlock;
> + u8 old;
> struct rcu_head rcu;
> struct arm_smmu_inv inv[];
> };
> @@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
> new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
> if (!new_invs)
> return ERR_PTR(-ENOMEM);
> + rwlock_init(&new_invs->rwlock);
> new_invs->num_invs = num_invs;
> return new_invs;
> }
Put these and related hunks in the patch adding arm_smmu_invs
> @@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> i++;
> } else if (cmp == 0) {
> /* same item */
> - if (refcount_dec_and_test(&invs->inv[i].users))
> + if (refcount_dec_and_test(&invs->inv[i].users)) {
> + /* Notify the caller about this deletion */
> + refcount_set(&to_unref->inv[j].users, 1);
> num_dels++;
This is a bit convoluted. Instead of marking the entry and then
iterating the list again just directly call a function to do the
invalidation.
> +static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
> + struct arm_smmu_domain *smmu_domain,
> + struct arm_smmu_invs *invs, char *name)
> +{
> + size_t i;
> +
> + dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
> + smmu_domain->domain.type, name, invs->num_invs);
> + for (i = 0; i < invs->num_invs; i++) {
> + struct arm_smmu_inv *cur = &invs->inv[i];
> +
> + dev_dbg(master->dev,
> + " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
> + cur->type, cur->id, refcount_read(&cur->users));
> + }
> +}
Move all the debug code to its own commit and don't send it
> +static void
> +arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
> +{
> + struct arm_smmu_inv_state *invst = &state->new_domain_invst;
> +
> + if (!invst->invs_ptr)
> + return;
> +
> + rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
> + /*
> + * Committed to updating the STE, using the new invalidation array, and
> + * acquiring any racing IOPTE updates.
> + */
> + smp_mb();
We are commited to updating the STE. Make the invalidation list
visable to parallel map/unmap threads and acquire any racying IOPTE
updates.
> + kfree_rcu(invst->old_invs, rcu);
> +}
> +
> +/*
> + * When an array entry's users count reaches zero, it means the ASID/VMID is no
> + * longer being invalidated by map/unmap and must be cleaned. The rule is that
> + * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
> + */
> +static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
> +{
> + size_t i;
> +
> + for (i = 0; i != invs->num_invs; i++) {
Just remove the loop and accept struct arm_smmu_inv * and call it
directly.
> + if (!new_invs) {
> + size_t new_num = old_invs->num_invs;
> +
> + /*
> + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> + * trim its size if some tailing entries are marked as trash.
> + */
> + while (new_num != 0) {
> + if (refcount_read(&old_invs->inv[new_num - 1].users))
> + break;
> + new_num--;
> + }
Would be nicer to have arm_smmu_invs_unref return the new size so we
don't need this loop
Looks Ok to me otherwise
Jason
next prev parent reply other threads:[~2025-09-24 21:42 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
2025-09-09 3:16 ` Balbir Singh
2025-09-09 5:42 ` Nicolin Chen
2025-09-09 22:49 ` Balbir Singh
2025-09-10 2:03 ` Nicolin Chen
2025-09-10 2:56 ` Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2025-09-09 3:25 ` Balbir Singh
2025-09-09 22:31 ` Balbir Singh
2025-09-10 2:06 ` Nicolin Chen
2025-09-24 21:07 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
2025-09-24 21:08 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2025-09-09 13:01 ` kernel test robot
2025-09-20 0:26 ` Nicolin Chen
2025-09-24 21:29 ` Jason Gunthorpe
2025-09-29 18:52 ` Nicolin Chen
2025-09-30 12:13 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
2025-09-24 21:32 ` Jason Gunthorpe
2025-09-29 19:11 ` Nicolin Chen
2025-09-30 11:55 ` Jason Gunthorpe
2025-09-08 23:27 ` [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
2025-09-24 21:42 ` Jason Gunthorpe [this message]
2025-09-29 20:52 ` Nicolin Chen
2025-09-30 12:12 ` Jason Gunthorpe
2025-09-30 20:19 ` Nicolin Chen
2025-10-01 16:25 ` Jason Gunthorpe
2025-10-01 17:16 ` Nicolin Chen
2025-09-08 23:27 ` [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
2025-09-24 21:56 ` Jason Gunthorpe
2025-09-29 21:00 ` Nicolin Chen
2025-09-08 23:27 ` [PATCH rfcv2 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250924214215.GR2617119@nvidia.com \
--to=jgg@nvidia.com \
--cc=balbirs@nvidia.com \
--cc=iommu@lists.linux.dev \
--cc=jean-philippe@linaro.org \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=miko.lenczewski@arm.com \
--cc=nicolinc@nvidia.com \
--cc=patches@lists.linux.dev \
--cc=peterz@infradead.org \
--cc=praan@google.com \
--cc=robin.murphy@arm.com \
--cc=smostafa@google.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.