From: Jason Gunthorpe <jgg@nvidia.com>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
jean-philippe@linaro.org, miko.lenczewski@arm.com,
balbirs@nvidia.com, peterz@infradead.org, smostafa@google.com,
kevin.tian@intel.com, praan@google.com,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, patches@lists.linux.dev
Subject: Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
Date: Wed, 24 Sep 2025 18:42:15 -0300
Message-ID: <20250924214215.GR2617119@nvidia.com>
In-Reply-To: <d3f9777e46d531273dfecb54d69725428cdef308.1757373449.git.nicolinc@nvidia.com>
On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
> Update the invs array with the invalidations required by each domain type
> during attachment operations.
>
> Only an SVA domain or a paging domain will have an invs array:
> a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
> per SID
>
> b. Non-nesting-parent paging domain with no ATS-enabled master will add
> a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
>
> c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
> (b) and add an INV_TYPE_ATS per SID
>
> d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
> an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
> will add an INV_TYPE_ATS_FULL per SID
Just a minor forward-looking clarification: this behavior should be
triggered when a nesting-parent is attached through the viommu using a
nesting domain with a vSTE.

A nesting-parent that is just normally attached should act like a
normal S2, since it does not and cannot have a nested S1 on top of it.

We can't quite get there until the next series, which changes how the
VMID allocation works.
> The per-domain invalidation is not needed, until the domain is attached to
> a master, i.e. a possible translation request. Giving this clears a way to
> allowing the domain to be attached to many SMMUs, and avoids any pointless
> invalidation overheads during a teardown if there are no STE/CDs referring
> to the domain. This also means, when the last device is detached, the old
> domain must flush its ASID or VMID because any iommu_unmap() call after it
> wouldn't initiate any invalidation given an empty domain invs array.
The grammar/phrasing in this paragraph needs another pass, e.g.
'Giving this clears a way to allowing' doesn't parse.
> Introduce some arm_smmu_invs helper functions for building scratch arrays,
> preparing and installing old/new domain's invalidation arrays.
>
> Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
> 2 files changed, 332 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 246c6d84de3ab..e4e0e066108cc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -678,6 +678,8 @@ struct arm_smmu_inv {
> /**
> * struct arm_smmu_invs - Per-domain invalidation array
> * @num_invs: number of invalidations in the flexible array
> + * @old: flag to synchronize with readers
> + * @rwlock: optional rwlock to fence ATS operations
> * @rcu: rcu head for kfree_rcu()
> * @inv: flexible invalidation array
> *
> @@ -703,6 +705,8 @@ struct arm_smmu_inv {
> */
> struct arm_smmu_invs {
> size_t num_invs;
> + rwlock_t rwlock;
> + u8 old;
> struct rcu_head rcu;
> struct arm_smmu_inv inv[];
> };
> @@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
> new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
> if (!new_invs)
> return ERR_PTR(-ENOMEM);
> + rwlock_init(&new_invs->rwlock);
> new_invs->num_invs = num_invs;
> return new_invs;
> }
Put these and related hunks in the patch adding arm_smmu_invs
> @@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> i++;
> } else if (cmp == 0) {
> /* same item */
> - if (refcount_dec_and_test(&invs->inv[i].users))
> + if (refcount_dec_and_test(&invs->inv[i].users)) {
> + /* Notify the caller about this deletion */
> + refcount_set(&to_unref->inv[j].users, 1);
> num_dels++;
This is a bit convoluted. Instead of marking the entry and then
iterating the list again, just call a function directly to do the
invalidation.
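Something like this (an untested sketch; arm_smmu_inv_flush_iotlb_tag()
is a made-up name for the per-entry flush helper sketched further
below):

	} else if (cmp == 0) {
		/* same item */
		if (refcount_dec_and_test(&invs->inv[i].users)) {
			/* Last user is gone, flush this IOTLB tag now */
			arm_smmu_inv_flush_iotlb_tag(&invs->inv[i]);
			num_dels++;
		}

Then to_unref doesn't need to carry deletion marks at all.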
> +static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
> + struct arm_smmu_domain *smmu_domain,
> + struct arm_smmu_invs *invs, char *name)
> +{
> + size_t i;
> +
> + dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %zu\n",
> + smmu_domain->domain.type, name, invs->num_invs);
> + for (i = 0; i < invs->num_invs; i++) {
> + struct arm_smmu_inv *cur = &invs->inv[i];
> +
> + dev_dbg(master->dev,
> + " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
> + cur->type, cur->id, refcount_read(&cur->users));
> + }
> +}
Move all the debug code to its own commit and don't send it
> +static void
> +arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
> +{
> + struct arm_smmu_inv_state *invst = &state->new_domain_invst;
> +
> + if (!invst->invs_ptr)
> + return;
> +
> + rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
> + /*
> + * Committed to updating the STE, using the new invalidation array, and
> + * acquiring any racing IOPTE updates.
> + */
> + smp_mb();
We are committed to updating the STE. Make the invalidation list
visible to parallel map/unmap threads and acquire any racing IOPTE
updates.
> + kfree_rcu(invst->old_invs, rcu);
> +}
> +
> +/*
> + * When an array entry's users count reaches zero, it means the ASID/VMID is no
> + * longer being invalidated by map/unmap and must be cleaned. The rule is that
> + * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
> + */
> +static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
> +{
> + size_t i;
> +
> + for (i = 0; i != invs->num_invs; i++) {
Just remove the loop, accept a struct arm_smmu_inv *, and call it
directly.
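i.e. something like this (untested; the case bodies stand in for
whatever the current per-entry loop body already does):

static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
{
	/* Same per-entry work as today's loop body, minus the loop */
	switch (inv->type) {
	case INV_TYPE_S1_ASID:
		/* e.g. issue CMDQ_OP_TLBI_NH_ASID for inv->id */
		break;
	case INV_TYPE_S2_VMID:
		/* e.g. issue CMDQ_OP_TLBI_S12_VMALL for inv->id */
		break;
	default:
		break;
	}
}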
> + if (!new_invs) {
> + size_t new_num = old_invs->num_invs;
> +
> + /*
> + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> + * trim its size if some tailing entries are marked as trash.
> + */
> + while (new_num != 0) {
> + if (refcount_read(&old_invs->inv[new_num - 1].users))
> + break;
> + new_num--;
> + }
It would be nicer to have arm_smmu_invs_unref() return the new size so
we don't need this loop.
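e.g. (untested sketch, keeping the existing merge walk elided):

size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
			   struct arm_smmu_invs *to_unref)
{
	size_t new_num = 0;
	size_t i;

	/* ... existing merge walk; inside it, for each entry i kept: */
	if (refcount_read(&invs->inv[i].users))
		new_num = i + 1;	/* last live entry seen so far */
	/* ... */

	/* array length with the trailing dead entries already trimmed */
	return new_num;
}

Then the OOM fallback collapses to a single num_invs assignment.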
Looks Ok to me otherwise
Jason
Thread overview: 35+ messages
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
2025-09-09 3:16 ` Balbir Singh
2025-09-09 5:42 ` Nicolin Chen
2025-09-09 22:49 ` Balbir Singh
2025-09-10 2:03 ` Nicolin Chen
2025-09-10 2:56 ` Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2025-09-09 3:25 ` Balbir Singh
2025-09-09 22:31 ` Balbir Singh
2025-09-10 2:06 ` Nicolin Chen
2025-09-24 21:07 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
2025-09-24 21:08 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2025-09-09 13:01 ` kernel test robot
2025-09-20 0:26 ` Nicolin Chen
2025-09-24 21:29 ` Jason Gunthorpe
2025-09-29 18:52 ` Nicolin Chen
2025-09-30 12:13 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
2025-09-24 21:32 ` Jason Gunthorpe
2025-09-29 19:11 ` Nicolin Chen
2025-09-30 11:55 ` Jason Gunthorpe
2025-09-08 23:27 ` [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
2025-09-24 21:42 ` Jason Gunthorpe [this message]
2025-09-29 20:52 ` Nicolin Chen
2025-09-30 12:12 ` Jason Gunthorpe
2025-09-30 20:19 ` Nicolin Chen
2025-10-01 16:25 ` Jason Gunthorpe
2025-10-01 17:16 ` Nicolin Chen
2025-09-08 23:27 ` [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
2025-09-24 21:56 ` Jason Gunthorpe
2025-09-29 21:00 ` Nicolin Chen
2025-09-08 23:27 ` [PATCH rfcv2 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen