Linux-ARM-Kernel Archive on lore.kernel.org
From: Will Deacon <will@kernel.org>
To: Nicolin Chen <nicolinc@nvidia.com>
Cc: jean-philippe@linaro.org, robin.murphy@arm.com, joro@8bytes.org,
	jgg@nvidia.com, balbirs@nvidia.com, miko.lenczewski@arm.com,
	peterz@infradead.org, kevin.tian@intel.com, praan@google.com,
	linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 3/7] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
Date: Fri, 23 Jan 2026 17:03:10 +0000	[thread overview]
Message-ID: <aXOpzjkAgODnh2HU@willie-the-truck> (raw)
In-Reply-To: <8c94c5194871ee1a0f3a6b49e18818b88f51226d.1766174731.git.nicolinc@nvidia.com>

On Fri, Dec 19, 2025 at 12:11:25PM -0800, Nicolin Chen wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> Create a new data structure to hold an array of invalidations that need to
> be performed for the domain based on what masters are attached, to replace
> the single smmu pointer and linked list of masters in the current design.
> 
> Each array entry holds one of the invalidation actions (S1_ASID, S2_VMID,
> ATS, or a variant), along with the information needed to build the HW
> invalidation commands.
> It is structured so that multiple SMMUs can participate in the same array,
> removing one key limitation of the current system.
> 
> To maximize performance, a sorted array is used as the data structure. It
> allows grouping SYNCs together to parallelize invalidations. For instance,
> it will group all the ATS entries after the ASID/VMID entry, so they will
> all be pushed to the PCI devices in parallel with one SYNC.
> 
> To minimize the locking cost on the invalidation fast path (reader of the
> invalidation array), the array is managed with RCU.
> 
> Provide a set of APIs to add/delete entries to/from an array, which cover
> cannot-fail attach cases, e.g. attaching to arm_smmu_blocked_domain. Also
> add kunit coverage for those APIs.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  98 +++++++
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c  |  92 +++++++
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 258 ++++++++++++++++++
>  3 files changed, 448 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 96a23ca633cb..a9441dc9070e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -649,6 +649,93 @@ struct arm_smmu_cmdq_batch {
>  	int				num;
>  };
>  
> +/*
> + * The order here also determines the sequence in which commands are sent to the
> + * command queue. E.g. TLBI must be done before ATC_INV.
> + */
> +enum arm_smmu_inv_type {
> +	INV_TYPE_S1_ASID,
> +	INV_TYPE_S2_VMID,
> +	INV_TYPE_S2_VMID_S1_CLEAR,
> +	INV_TYPE_ATS,
> +	INV_TYPE_ATS_FULL,
> +};
> +
> +struct arm_smmu_inv {
> +	struct arm_smmu_device *smmu;
> +	u8 type;
> +	u8 size_opcode;
> +	u8 nsize_opcode;
> +	u32 id; /* ASID or VMID or SID */
> +	union {
> +		size_t pgsize; /* ARM_SMMU_FEAT_RANGE_INV */
> +		u32 ssid; /* INV_TYPE_ATS */
> +	};
> +
> +	refcount_t users; /* users == 0 marks the entry as trash to be purged */

The refcount_t API uses atomics with barrier semantics. Do we actually
need those properties when updating the refcounts here? The ASID lock
gives us pretty strong serialisation even after this patch series and
we rely heavily on that.

> +static void arm_smmu_v3_invs_test(struct kunit *test)
> +{
> +	const int results1[2][3] = { { 1, 1, 3, }, { 1, 1, 1, }, };
> +	const int results2[2][5] = { { 1, 1, 3, 4, 5, }, { 2, 1, 1, 1, 1, }, };
> +	const int results3[2][3] = { { 1, 1, 3, }, { 1, 1, 1, }, };
> +	const int results4[2][5] = { { 1, 1, 3, 5, 6, }, { 2, 1, 1, 1, 1, }, };
> +	const int results5[2][5] = { { 1, 1, 3, 5, 6, }, { 1, 0, 0, 1, 1, }, };
> +	const int results6[2][3] = { { 1, 5, 6, }, { 1, 1, 1, }, };
> +	struct arm_smmu_invs *test_a, *test_b;
> +
> +	/* New array */
> +	test_a = arm_smmu_invs_alloc(0);
> +	KUNIT_EXPECT_EQ(test, test_a->num_invs, 0);
> +
> +	/* Test1: merge invs1 (new array) */
> +	test_b = arm_smmu_invs_merge(test_a, &invs1);
> +	kfree(test_a);
> +	arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results1[0]),
> +				     results1[0], results1[1]);
> +
> +	/* Test2: merge invs2 (new array) */
> +	test_a = arm_smmu_invs_merge(test_b, &invs2);
> +	kfree(test_b);
> +	arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results2[0]),
> +				     results2[0], results2[1]);
> +
> +	/* Test3: unref invs2 (same array) */
> +	arm_smmu_invs_unref(test_a, &invs2, NULL);
> +	arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results3[0]),
> +				     results3[0], results3[1]);
> +	KUNIT_EXPECT_EQ(test, test_a->num_trashes, 0);
> +
> +	/* Test4: merge invs3 (new array) */
> +	test_b = arm_smmu_invs_merge(test_a, &invs3);
> +	kfree(test_a);
> +	arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results4[0]),
> +				     results4[0], results4[1]);
> +
> +	/* Test5: unref invs1 (same array) */
> +	arm_smmu_invs_unref(test_b, &invs1, NULL);
> +	arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results5[0]),
> +				     results5[0], results5[1]);
> +	KUNIT_EXPECT_EQ(test, test_b->num_trashes, 2);
> +
> +	/* Test6: purge test_b (new array) */
> +	test_a = arm_smmu_invs_purge(test_b);
> +	kfree(test_b);
> +	arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results6[0]),
> +				     results6[0], results6[1]);
> +
> +	/* Test7: unref invs3 (same array) */
> +	arm_smmu_invs_unref(test_a, &invs3, NULL);
> +	KUNIT_EXPECT_EQ(test, test_a->num_invs, 0);
> +	KUNIT_EXPECT_EQ(test, test_a->num_trashes, 0);

Wouldn't we be better off testing num_trashes == 0 after test 6 has
completed?
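i.e. something along these lines (a sketch against the quoted test, on the
assumption that the purge is what actually drops the trash entries):

```
	/* Test6: purge test_b (new array) */
	test_a = arm_smmu_invs_purge(test_b);
	kfree(test_b);
	arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results6[0]),
				     results6[0], results6[1]);
	/* the purge itself should have dropped both trash entries */
	KUNIT_EXPECT_EQ(test, test_a->num_trashes, 0);
```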

> +/**
> + * arm_smmu_invs_for_each_entry - Iterate over two sorted arrays computing for
> + *                                arm_smmu_invs_merge() or arm_smmu_invs_unref()
> + * @invs_l: the base invalidation array
> + * @idx_l: a stack variable of 'size_t', to store the base array index
> + * @invs_r: the build_invs array as to_merge or to_unref
> + * @idx_r: a stack variable of 'size_t', to store the build_invs index
> + * @cmp: a stack variable of 'int', to store return value (-1, 0, or 1)
> + */
> +#define arm_smmu_invs_for_each_cmp(invs_l, idx_l, invs_r, idx_r, cmp)          \

nit: the kerneldoc comment doesn't match the name of this function.

Will

