* [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
@ 2025-09-08 23:26 ` Nicolin Chen
2025-09-09 3:16 ` Balbir Singh
2025-09-10 2:56 ` Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
` (6 subsequent siblings)
7 siblings, 2 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
after a submission, so it is safe to clear cmds->num, at least after a
successful submission. This will simplify a wrapper function for the new
arm_smmu_invs structure.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2a8b46b948f05..cccf8f52ee0d5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -974,11 +974,17 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
cmds->num++;
}
+/* Clears cmds->num after a successful submission */
static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq_batch *cmds)
{
- return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
- cmds->num, true);
+ int ret;
+
+ ret = arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
+ cmds->num, true);
+ if (!ret)
+ cmds->num = 0;
+ return ret;
}
static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
--
2.43.0
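To see why the cleared count eases a wrapper, here is a minimal sketch of a
hypothetical helper that queues one group of commands and lets the caller
reuse the same, already-initialized batch for the next group. Only
arm_smmu_cmdq_batch_add() and arm_smmu_cmdq_batch_submit() are existing
functions; the helper itself and its parameters are illustrative:

	/*
	 * Hypothetical helper: queue a group of commands and submit them.
	 * A successful submit leaves cmds->num == 0, so the caller can pass
	 * the same batch straight into the next group without resetting it.
	 */
	static int arm_smmu_push_group(struct arm_smmu_device *smmu,
				       struct arm_smmu_cmdq_batch *cmds,
				       struct arm_smmu_cmdq_ent *ents, int num)
	{
		int i;

		for (i = 0; i < num; i++)
			arm_smmu_cmdq_batch_add(smmu, cmds, &ents[i]);
		return arm_smmu_cmdq_batch_submit(smmu, cmds);
	}

Calling such a helper twice in a row, e.g. once for TLBI commands and once
for ATC invalidations, no longer needs a manual cmds->num = 0 in between.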
* Re: [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
@ 2025-09-09 3:16 ` Balbir Singh
2025-09-09 5:42 ` Nicolin Chen
2025-09-10 2:56 ` Nicolin Chen
1 sibling, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2025-09-09 3:16 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
On 9/9/25 09:26, Nicolin Chen wrote:
> None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
> after a submission. So, it'll be certainly safe to nuke the cmds->num, at
> least upon a successful one. This will ease a bit a wrapper function, for
> the new arm_smmu_invs structure.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 2a8b46b948f05..cccf8f52ee0d5 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -974,11 +974,17 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
> cmds->num++;
> }
>
> +/* Clears cmds->num after a successful submission */
> static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
> struct arm_smmu_cmdq_batch *cmds)
> {
Nit: arm_smmu_cmdq_batch_submit_clear()?
> - return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
> - cmds->num, true);
> + int ret;
> +
> + ret = arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmdq, cmds->cmds,
> + cmds->num, true);
> + if (!ret)
> + cmds->num = 0;
> + return ret;
> }
>
> static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused,
Acked-by: Balbir Singh <balbirs@nvidia.com>
Balbir
* Re: [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-09 3:16 ` Balbir Singh
@ 2025-09-09 5:42 ` Nicolin Chen
2025-09-09 22:49 ` Balbir Singh
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-09 5:42 UTC (permalink / raw)
To: Balbir Singh
Cc: jgg, will, robin.murphy, joro, jean-philippe, miko.lenczewski,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Tue, Sep 09, 2025 at 01:16:11PM +1000, Balbir Singh wrote:
> On 9/9/25 09:26, Nicolin Chen wrote:
> > None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
> > after a submission. So, it'll be certainly safe to nuke the cmds->num, at
> > least upon a successful one. This will ease a bit a wrapper function, for
> > the new arm_smmu_invs structure.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 2a8b46b948f05..cccf8f52ee0d5 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -974,11 +974,17 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
> > cmds->num++;
> > }
> >
> > +/* Clears cmds->num after a successful submission */
> > static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
> > struct arm_smmu_cmdq_batch *cmds)
> > {
>
> Nit: arm_smmu_cmdq_batch_submit_clear()?
Probably not. There is no particular point in highlighting it in
the function name, as there is no use case wanting an uncleared
version.
Thanks
Nicolin
* Re: [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-09 5:42 ` Nicolin Chen
@ 2025-09-09 22:49 ` Balbir Singh
2025-09-10 2:03 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2025-09-09 22:49 UTC (permalink / raw)
To: Nicolin Chen
Cc: jgg, will, robin.murphy, joro, jean-philippe, miko.lenczewski,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On 9/9/25 15:42, Nicolin Chen wrote:
> On Tue, Sep 09, 2025 at 01:16:11PM +1000, Balbir Singh wrote:
>> On 9/9/25 09:26, Nicolin Chen wrote:
>>> None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
>>> after a submission. So, it'll be certainly safe to nuke the cmds->num, at
>>> least upon a successful one. This will ease a bit a wrapper function, for
>>> the new arm_smmu_invs structure.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> ---
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index 2a8b46b948f05..cccf8f52ee0d5 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -974,11 +974,17 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
>>> cmds->num++;
>>> }
>>>
>>> +/* Clears cmds->num after a successful submission */
>>> static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
>>> struct arm_smmu_cmdq_batch *cmds)
>>> {
>>
>> Nit: arm_smmu_cmdq_batch_submit_clear()?
>
> Probably not. There is no particular point in highlighting it in
> the function name, as there is no use case wanting an uncleared
> version.
I did not suggest we need an uncleared version; I suggested the change
in name to highlight that the function has a side effect of clearing
cmds->num.
Thanks,
Balbir
^ permalink raw reply [flat|nested] 35+ messages in thread* Re: [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-09 22:49 ` Balbir Singh
@ 2025-09-10 2:03 ` Nicolin Chen
0 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-10 2:03 UTC (permalink / raw)
To: Balbir Singh
Cc: jgg, will, robin.murphy, joro, jean-philippe, miko.lenczewski,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 10, 2025 at 08:49:52AM +1000, Balbir Singh wrote:
> On 9/9/25 15:42, Nicolin Chen wrote:
> > On Tue, Sep 09, 2025 at 01:16:11PM +1000, Balbir Singh wrote:
> >> On 9/9/25 09:26, Nicolin Chen wrote:
> >>> None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
> >>> after a submission. So, it'll be certainly safe to nuke the cmds->num, at
> >>> least upon a successful one. This will ease a bit a wrapper function, for
> >>> the new arm_smmu_invs structure.
> >>>
> >>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> >>> ---
> >>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++++--
> >>> 1 file changed, 8 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> index 2a8b46b948f05..cccf8f52ee0d5 100644
> >>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >>> @@ -974,11 +974,17 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
> >>> cmds->num++;
> >>> }
> >>>
> >>> +/* Clears cmds->num after a successful submission */
> >>> static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
> >>> struct arm_smmu_cmdq_batch *cmds)
> >>> {
> >>
> >> Nit: arm_smmu_cmdq_batch_submit_clear()?
> >
> > Probably not. There is no particular point in highlighting it in
> > the function name, as there is no use case wanting an uncleared
> > version.
>
> I did not suggest we need an uncleared version, I suggested the change
> in name to highlight that the function has a side-effect of clearing
> the cmds->num
No caller cares about the "side effect"...
Nicolin
* Re: [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
2025-09-09 3:16 ` Balbir Singh
@ 2025-09-10 2:56 ` Nicolin Chen
1 sibling, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-10 2:56 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:55PM -0700, Nicolin Chen wrote:
> None of the callers of arm_smmu_cmdq_batch_submit() cares about the batch
> after a submission. So, it'll be certainly safe to nuke the cmds->num, at
> least upon a successful one. This will ease a bit a wrapper function, for
> the new arm_smmu_invs structure.
This was added when arm_smmu_cmdq_batch_submit() was needed by a
new arm_smmu_invs helper that used it in a few places. But after
a few rounds of rework, I found that there is only one call site now.
I will drop this patch.
Thanks
Nicolin
* [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
@ 2025-09-08 23:26 ` Nicolin Chen
2025-09-09 3:25 ` Balbir Singh
` (2 more replies)
2025-09-08 23:26 ` [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
` (5 subsequent siblings)
7 siblings, 3 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
Both the ARM_SMMU_DOMAIN_S1 case and the SVA case use an ASID, requiring
ASID-based invalidation commands to flush the TLB.
Define an ARM_SMMU_DOMAIN_SVA stage so that the SVA case can explicitly
share the same path as the ARM_SMMU_DOMAIN_S1 case, which will be part of
the routine that builds the new per-domain invalidation array.
There is no functional change.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
3 files changed, 5 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index ae23aacc38402..5c0b38595d209 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -858,6 +858,7 @@ struct arm_smmu_master {
enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
+ ARM_SMMU_DOMAIN_SVA,
};
struct arm_smmu_domain {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 59a480974d80f..6097f1f540d87 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -346,6 +346,7 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
* ARM_SMMU_FEAT_RANGE_INV is present
*/
smmu_domain->domain.pgsize_bitmap = PAGE_SIZE;
+ smmu_domain->stage = ARM_SMMU_DOMAIN_SVA;
smmu_domain->smmu = smmu;
ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index cccf8f52ee0d5..0016ec699acfe 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3070,6 +3070,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
break;
+ default:
+ WARN_ON(true);
+ break;
}
arm_smmu_attach_commit(&state);
--
2.43.0
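As a rough sketch of the kind of shared path this enables, S1 and SVA domains
can now be dispatched by the same switch. The helper below is hypothetical;
only arm_smmu_tlb_inv_asid() and the domain fields are existing code:

	/* Hypothetical: pick the TLBI flavour from the domain stage */
	static void arm_smmu_domain_flush_all(struct arm_smmu_domain *smmu_domain)
	{
		switch (smmu_domain->stage) {
		case ARM_SMMU_DOMAIN_S1:
		case ARM_SMMU_DOMAIN_SVA:
			/* Both stages own an ASID and share ASID invalidation */
			arm_smmu_tlb_inv_asid(smmu_domain->smmu,
					      smmu_domain->cd.asid);
			break;
		case ARM_SMMU_DOMAIN_S2:
			/* S2 domains are invalidated by VMID instead */
			break;
		}
	}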
* Re: [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
@ 2025-09-09 3:25 ` Balbir Singh
2025-09-09 22:31 ` Balbir Singh
2025-09-24 21:07 ` Jason Gunthorpe
2 siblings, 0 replies; 35+ messages in thread
From: Balbir Singh @ 2025-09-09 3:25 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
On 9/9/25 09:26, Nicolin Chen wrote:
> Both the ARM_SMMU_DOMAIN_S1 case and the SVA case use ASID, requiring ASID
> based invalidation commands to flush the TLB.
>
> Define an ARM_SMMU_DOMAIN_SVA to make the SVA case clear to share the same
> path with the ARM_SMMU_DOMAIN_S1 case, which will be a part of the routine
> to build a new per-domain invalidation array.
>
> There is no function change.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
* Re: [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2025-09-09 3:25 ` Balbir Singh
@ 2025-09-09 22:31 ` Balbir Singh
2025-09-10 2:06 ` Nicolin Chen
2025-09-24 21:07 ` Jason Gunthorpe
2 siblings, 1 reply; 35+ messages in thread
From: Balbir Singh @ 2025-09-09 22:31 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
On 9/9/25 09:26, Nicolin Chen wrote:
> Both the ARM_SMMU_DOMAIN_S1 case and the SVA case use ASID, requiring ASID
> based invalidation commands to flush the TLB.
>
> Define an ARM_SMMU_DOMAIN_SVA to make the SVA case clear to share the same
> path with the ARM_SMMU_DOMAIN_S1 case, which will be a part of the routine
> to build a new per-domain invalidation array.
>
> There is no function change.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
> 3 files changed, 5 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index ae23aacc38402..5c0b38595d209 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -858,6 +858,7 @@ struct arm_smmu_master {
> enum arm_smmu_domain_stage {
> ARM_SMMU_DOMAIN_S1 = 0,
> ARM_SMMU_DOMAIN_S2,
> + ARM_SMMU_DOMAIN_SVA,
> };
>
> struct arm_smmu_domain {
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 59a480974d80f..6097f1f540d87 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -346,6 +346,7 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
> * ARM_SMMU_FEAT_RANGE_INV is present
> */
> smmu_domain->domain.pgsize_bitmap = PAGE_SIZE;
> + smmu_domain->stage = ARM_SMMU_DOMAIN_SVA;
> smmu_domain->smmu = smmu;
>
> ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index cccf8f52ee0d5..0016ec699acfe 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3070,6 +3070,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> arm_smmu_install_ste_for_dev(master, &target);
> arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> break;
> + default:
> + WARN_ON(true);
WARN_ON_ONCE(), and shouldn't ret be set to -EINVAL?
> + break;
> }
>
> arm_smmu_attach_commit(&state);
Balbir
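For reference, the suggestion above would make the default case look roughly
like the fragment below; whether a ret variable is in scope at this point of
arm_smmu_attach_dev() depends on the surrounding code, so treat it as a
sketch only:

	default:
		WARN_ON_ONCE(true);
		ret = -EINVAL;
		break;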
* Re: [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA
2025-09-09 22:31 ` Balbir Singh
@ 2025-09-10 2:06 ` Nicolin Chen
0 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-10 2:06 UTC (permalink / raw)
To: Balbir Singh
Cc: jgg, will, robin.murphy, joro, jean-philippe, miko.lenczewski,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 10, 2025 at 08:31:43AM +1000, Balbir Singh wrote:
> On 9/9/25 09:26, Nicolin Chen wrote:
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index cccf8f52ee0d5..0016ec699acfe 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -3070,6 +3070,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > arm_smmu_install_ste_for_dev(master, &target);
> > arm_smmu_clear_cd(master, IOMMU_NO_PASID);
> > break;
> > + default:
> > + WARN_ON(true);
>
> WARN_ON_ONCE(), and shouldn't ret be set to -EINVAL?
I will fix it.
Thanks
Nicolin
* Re: [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
2025-09-09 3:25 ` Balbir Singh
2025-09-09 22:31 ` Balbir Singh
@ 2025-09-24 21:07 ` Jason Gunthorpe
2 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:07 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:56PM -0700, Nicolin Chen wrote:
> Both the ARM_SMMU_DOMAIN_S1 case and the SVA case use ASID, requiring ASID
> based invalidation commands to flush the TLB.
>
> Define an ARM_SMMU_DOMAIN_SVA to make the SVA case clear to share the same
> path with the ARM_SMMU_DOMAIN_S1 case, which will be a part of the routine
> to build a new per-domain invalidation array.
>
> There is no function change.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
> 3 files changed, 5 insertions(+)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
* [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free()
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 1/8] iommu/arm-smmu-v3: Clear cmds->num after arm_smmu_cmdq_batch_submit Nicolin Chen
2025-09-08 23:26 ` [PATCH rfcv2 2/8] iommu/arm-smmu-v3: Explicitly set smmu_domain->stage for SVA Nicolin Chen
@ 2025-09-08 23:26 ` Nicolin Chen
2025-09-24 21:08 ` Jason Gunthorpe
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
` (4 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
There will soon be a few more things to free than the smmu_domain itself. So
keep a simple inline function in the header to share across files.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++--
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 5c0b38595d209..96a23ca633cb6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -954,6 +954,11 @@ extern struct mutex arm_smmu_asid_lock;
struct arm_smmu_domain *arm_smmu_domain_alloc(void);
+static inline void arm_smmu_domain_free(struct arm_smmu_domain *smmu_domain)
+{
+ kfree(smmu_domain);
+}
+
void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid);
struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 6097f1f540d87..fc601b494e0af 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -365,6 +365,6 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
err_asid:
xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
err_free:
- kfree(smmu_domain);
+ arm_smmu_domain_free(smmu_domain);
return ERR_PTR(ret);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0016ec699acfe..08af5f2d1235a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2498,7 +2498,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
ida_free(&smmu->vmid_map, cfg->vmid);
}
- kfree(smmu_domain);
+ arm_smmu_domain_free(smmu_domain);
}
static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
@@ -3359,7 +3359,7 @@ arm_smmu_domain_alloc_paging_flags(struct device *dev, u32 flags,
return &smmu_domain->domain;
err_free:
- kfree(smmu_domain);
+ arm_smmu_domain_free(smmu_domain);
return ERR_PTR(ret);
}
--
2.43.0
* Re: [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free()
2025-09-08 23:26 ` [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
@ 2025-09-24 21:08 ` Jason Gunthorpe
0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:08 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:57PM -0700, Nicolin Chen wrote:
> There will be a bit more things to free than smmu_domain itself. So keep a
> simple inline function in the header to share aross files.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +++++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++--
> 3 files changed, 8 insertions(+), 3 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
* [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
` (2 preceding siblings ...)
2025-09-08 23:26 ` [PATCH rfcv2 3/8] iommu/arm-smmu-v3: Add an inline arm_smmu_domain_free() Nicolin Chen
@ 2025-09-08 23:26 ` Nicolin Chen
2025-09-09 13:01 ` kernel test robot
` (2 more replies)
2025-09-08 23:26 ` [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
` (3 subsequent siblings)
7 siblings, 3 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
From: Jason Gunthorpe <jgg@nvidia.com>
Create a new data structure to hold an array of invalidations that need to
be performed for the domain based on what masters are attached, to replace
the single smmu pointer and linked list of masters in the current design.
Each array entry holds one of the invalidation actions - S1_ASID, S2_VMID,
ATS or their variants - with the information needed to feed invalidation
commands to the HW.
It is structured so that multiple SMMUs can participate in the same array,
removing one key limitation of the current system.
To maximize performance, a sorted array is used as the data structure. It
allows grouping SYNCs together to parallelize invalidations. For instance,
it will group all the ATS entries after the ASID/VMID entry, so they will
all be pushed to the PCI devices in parallel with one SYNC.
To minimize the locking cost on the invalidation fast path (reader of the
invalidation array), the array is managed with RCU.
Provide a set of APIs to add/delete entries to/from an array, which cover
cannot-fail attach cases, e.g. attaching to arm_smmu_blocked_domain. Also
add kunit coverage for those APIs.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 79 +++++++
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 93 ++++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 212 ++++++++++++++++++
3 files changed, 384 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 96a23ca633cb6..34fcc1a930e6a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -649,6 +649,82 @@ struct arm_smmu_cmdq_batch {
int num;
};
+/*
+ * The order here also determines the sequence in which commands are sent to the
+ * command queue. E.g. TLBI must be done before ATC_INV.
+ */
+enum arm_smmu_inv_type {
+ INV_TYPE_S1_ASID,
+ INV_TYPE_S2_VMID,
+ INV_TYPE_S2_VMID_S1_CLEAR,
+ INV_TYPE_ATS,
+ INV_TYPE_ATS_FULL,
+};
+
+struct arm_smmu_inv {
+ struct arm_smmu_device *smmu;
+ u8 type;
+ u8 size_opcode;
+ u8 nsize_opcode;
+ u32 id; /* ASID or VMID or SID */
+ union {
+ size_t pgsize; /* ARM_SMMU_FEAT_RANGE_INV */
+ u32 ssid; /* INV_TYPE_ATS */
+ };
+
+ refcount_t users; /* users=0 to mark as a trash to be purged */
+};
+
+/**
+ * struct arm_smmu_invs - Per-domain invalidation array
+ * @num_invs: number of invalidations in the flexible array
+ * @rcu: rcu head for kfree_rcu()
+ * @inv: flexible invalidation array
+ *
+ * The arm_smmu_invs is an RCU data structure. During a ->attach_dev callback,
+ * arm_smmu_invs_merge(), arm_smmu_invs_unref() and arm_smmu_invs_purge() will
+ * be used to allocate a new copy of an old array for addition and deletion in
+ * the old domain's and new domain's invs arrays.
+ *
+ * The arm_smmu_invs_unref() mutates a given array, by internally reducing the
+ * users counts of some given entries. This exists to support a no-fail routine
+ * like attaching to an IOMMU_DOMAIN_BLOCKED. And it could pair with a followup
+ * arm_smmu_invs_purge() call to generate a new clean array.
+ *
+ * Concurrent invalidation thread will push every invalidation described in the
+ * array into the command queue for each invalidation event. It is designed like
+ * this to optimize the invalidation fast path by avoiding locks.
+ *
+ * A domain can be shared across SMMU instances. When an instance gets removed,
+ * it would delete all the entries that belong to that SMMU instance. Then, a
+ * synchronize_rcu() would have to be called to sync the array, to prevent any
+ * concurrent invalidation thread accessing the old array from issuing commands
+ * to the command queue of a removed SMMU instance.
+ */
+struct arm_smmu_invs {
+ size_t num_invs;
+ struct rcu_head rcu;
+ struct arm_smmu_inv inv[];
+};
+
+static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
+{
+ struct arm_smmu_invs *new_invs;
+
+ new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
+ if (!new_invs)
+ return ERR_PTR(-ENOMEM);
+ new_invs->num_invs = num_invs;
+ return new_invs;
+}
+
+struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
+ struct arm_smmu_invs *to_merge);
+size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
+ struct arm_smmu_invs *to_unref);
+struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
+ size_t num_dels);
+
struct arm_smmu_evtq {
struct arm_smmu_queue q;
struct iopf_queue *iopf;
@@ -875,6 +951,8 @@ struct arm_smmu_domain {
struct iommu_domain domain;
+ struct arm_smmu_invs __rcu *invs;
+
/* List of struct arm_smmu_master_domain */
struct list_head devices;
spinlock_t devices_lock;
@@ -956,6 +1034,7 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void);
static inline void arm_smmu_domain_free(struct arm_smmu_domain *smmu_domain)
{
+ kfree_rcu(smmu_domain->invs, rcu);
kfree(smmu_domain);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index d2671bfd37981..417a2b5ea2024 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -567,6 +567,98 @@ static void arm_smmu_v3_write_cd_test_sva_release(struct kunit *test)
NUM_EXPECTED_SYNCS(2));
}
+static void arm_smmu_v3_invs_test_verify(struct kunit *test,
+ struct arm_smmu_invs *invs, int num,
+ const int *ids, const int *users)
+{
+ KUNIT_EXPECT_EQ(test, invs->num_invs, num);
+ while (num--) {
+ KUNIT_EXPECT_EQ(test, invs->inv[num].id, ids[num]);
+ KUNIT_EXPECT_EQ(test, refcount_read(&invs->inv[num].users),
+ users[num]);
+ }
+}
+
+static struct arm_smmu_invs invs1 = {
+ .num_invs = 3,
+ .inv = { { .type = INV_TYPE_S2_VMID, .id = 1, },
+ { .type = INV_TYPE_S2_VMID, .id = 2, },
+ { .type = INV_TYPE_S2_VMID, .id = 3, }, },
+};
+
+static struct arm_smmu_invs invs2 = {
+ .num_invs = 3,
+ .inv = { { .type = INV_TYPE_S2_VMID, .id = 1, }, /* duplicated */
+ { .type = INV_TYPE_ATS, .id = 4, },
+ { .type = INV_TYPE_ATS, .id = 5, }, },
+};
+
+static struct arm_smmu_invs invs3 = {
+ .num_invs = 3,
+ .inv = { { .type = INV_TYPE_S2_VMID, .id = 1, }, /* duplicated */
+ { .type = INV_TYPE_ATS, .id = 5, }, /* recover a trash */
+ { .type = INV_TYPE_ATS, .id = 6, }, },
+};
+
+static void arm_smmu_v3_invs_test(struct kunit *test)
+{
+ const int results1[2][3] = { { 1, 2, 3, }, { 1, 1, 1, }, };
+ const int results2[2][5] = { { 1, 2, 3, 4, 5, }, { 2, 1, 1, 1, 1, }, };
+ const int results3[2][5] = { { 1, 2, 3, 4, 5, }, { 1, 1, 1, 0, 0, }, };
+ const int results4[2][5] = { { 1, 2, 3, 5, 6, }, { 2, 1, 1, 1, 1, }, };
+ const int results5[2][5] = { { 1, 2, 3, 5, 6, }, { 1, 0, 0, 1, 1, }, };
+ const int results6[2][5] = { { 1, 2, 3, 5, 6, }, { 0, 0, 0, 0, 0, }, };
+ struct arm_smmu_invs *test_a, *test_b;
+ size_t num_dels;
+
+ /* New array */
+ test_a = arm_smmu_invs_alloc(0);
+ KUNIT_EXPECT_EQ(test, test_a->num_invs, 0);
+
+ /* Test1: merge invs1 (new array) */
+ test_b = arm_smmu_invs_merge(test_a, &invs1);
+ kfree(test_a);
+ arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results1[0]),
+ results1[0], results1[1]);
+
+ /* Test2: merge invs2 (new array) */
+ test_a = arm_smmu_invs_merge(test_b, &invs2);
+ kfree(test_b);
+ arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results2[0]),
+ results2[0], results2[1]);
+
+ /* Test3: unref invs2 (same array) */
+ num_dels = arm_smmu_invs_unref(test_a, &invs2);
+ arm_smmu_v3_invs_test_verify(test, test_a, ARRAY_SIZE(results3[0]),
+ results3[0], results3[1]);
+ KUNIT_EXPECT_EQ(test, num_dels, 2);
+
+ /* Test4: merge invs3 (new array) */
+ test_b = arm_smmu_invs_merge(test_a, &invs3);
+ kfree(test_a);
+ arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results4[0]),
+ results4[0], results4[1]);
+
+ /* Test5: unref invs1 (same array) */
+ num_dels = arm_smmu_invs_unref(test_b, &invs1);
+ arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results5[0]),
+ results5[0], results5[1]);
+ KUNIT_EXPECT_EQ(test, num_dels, 2);
+
+ /* Test6: unref invs3 (same array) */
+ num_dels = arm_smmu_invs_unref(test_b, &invs3);
+ arm_smmu_v3_invs_test_verify(test, test_b, ARRAY_SIZE(results6[0]),
+ results6[0], results6[1]);
+ KUNIT_EXPECT_EQ(test, num_dels, 5);
+
+ /* Test7: purge test_b (new array) */
+ test_a = arm_smmu_invs_purge(test_b, num_dels);
+ kfree(test_b);
+ KUNIT_EXPECT_EQ(test, test_a->num_invs, 0);
+
+ kfree(test_a);
+}
+
static struct kunit_case arm_smmu_v3_test_cases[] = {
KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_abort),
KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_bypass),
@@ -590,6 +682,7 @@ static struct kunit_case arm_smmu_v3_test_cases[] = {
KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_s1_stall),
KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_clear),
KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_release),
+ KUNIT_CASE(arm_smmu_v3_invs_test),
{},
};
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 08af5f2d1235a..83d842bd88817 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -26,6 +26,7 @@
#include <linux/pci.h>
#include <linux/pci-ats.h>
#include <linux/platform_device.h>
+#include <linux/sort.h>
#include <linux/string_choices.h>
#include <kunit/visibility.h>
#include <uapi/linux/iommufd.h>
@@ -1033,6 +1034,209 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
+static int arm_smmu_invs_cmp(const void *_l, const void *_r)
+{
+ const struct arm_smmu_inv *l = _l;
+ const struct arm_smmu_inv *r = _r;
+
+ if (l->smmu != r->smmu)
+ return cmp_int((uintptr_t)l->smmu, (uintptr_t)r->smmu);
+ if (l->type != r->type)
+ return cmp_int(l->type, r->type);
+ return cmp_int(l->id, r->id);
+}
+
+/*
+ * Merge compare of two sorted arrays items. If one side is past the end of the
+ * array, return the other side to let it run out the iteration.
+ */
+static inline int
+arm_smmu_invs_merge_cmp(const struct arm_smmu_invs *l, size_t l_idx,
+ const struct arm_smmu_invs *r, size_t r_idx)
+{
+ if (l_idx != l->num_invs && r_idx != r->num_invs)
+ return arm_smmu_invs_cmp(&l->inv[l_idx], &r->inv[r_idx]);
+ if (l_idx != l->num_invs)
+ return -1;
+ return 1;
+}
+
+/**
+ * arm_smmu_invs_merge() - Merge @to_merge into @invs and generate a new array
+ * @invs: the base invalidation array
+ * @to_merge: an array of invlidations to merge
+ *
+ * Return: a newly allocated array on success, or ERR_PTR
+ *
+ * This function must be locked and serialized with arm_smmu_invs_unref() and
+ * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
+ *
+ * Either @invs or @to_merge must be sorted itself. This ensures the returned
+ * array will be sorted as well.
+ *
+ * Caller is resposible for freeing the @invs and the returned new one.
+ *
+ * Entries marked as trash will be purged in the returned array.
+ */
+VISIBLE_IF_KUNIT
+struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
+ struct arm_smmu_invs *to_merge)
+{
+ struct arm_smmu_invs *new_invs;
+ struct arm_smmu_inv *new;
+ size_t num_adds = 0;
+ size_t num_dels = 0;
+ size_t i, j;
+
+ for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
+ int cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
+
+ if (cmp < 0) {
+ /* no found in to_merge, leave alone but delete trash */
+ if (!refcount_read(&invs->inv[i].users))
+ num_dels++;
+ i++;
+ } else if (cmp == 0) {
+ /* same item */
+ i++;
+ j++;
+ } else {
+ /* unique to to_merge */
+ num_adds++;
+ j++;
+ }
+ }
+
+ new_invs = arm_smmu_invs_alloc(invs->num_invs - num_dels + num_adds);
+ if (IS_ERR(new_invs))
+ return new_invs;
+
+ new = new_invs->inv;
+ for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
+ int cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
+
+ if (cmp <= 0 && !refcount_read(&invs->inv[i].users)) {
+ i++;
+ continue;
+ }
+
+ if (cmp < 0) {
+ *new = invs->inv[i];
+ i++;
+ } else if (cmp == 0) {
+ *new = invs->inv[i];
+ refcount_inc(&new->users);
+ i++;
+ j++;
+ } else {
+ *new = to_merge->inv[j];
+ refcount_set(&new->users, 1);
+ j++;
+ }
+ new++;
+ }
+
+ WARN_ON(new != new_invs->inv + new_invs->num_invs);
+
+ return new_invs;
+}
+EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_merge);
+
+/**
+ * arm_smmu_invs_unref() - Find in @invs for all entries in @to_unref, decrease
+ * the user counts without deletions
+ * @invs: the base invalidation array
+ * @to_unref: an array of invlidations to decrease their user counts
+ *
+ * Return: the number of trash entries in the array, for arm_smmu_invs_purge()
+ *
+ * This function will not fail. Any entry with users=0 will be marked as trash.
+ * All trash entries will remain in the @invs until being completely deleted by
+ * the next arm_smmu_invs_merge() or an arm_smmu_invs_purge() function call.
+ *
+ * This function must be locked and serialized with arm_smmu_invs_merge() and
+ * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
+ *
+ * Note that the @invs->num_invs will not be updated, even if the actual number
+ * of invalidations are decreased. Readers should take the read lock to iterate
+ * each entry and check its users counter until @inv->num_invs.
+ */
+VISIBLE_IF_KUNIT
+size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
+ struct arm_smmu_invs *to_unref)
+{
+ size_t num_dels = 0;
+ size_t i, j;
+
+ for (i = j = 0; i != invs->num_invs || j != to_unref->num_invs;) {
+ int cmp;
+
+ if (!refcount_read(&invs->inv[i].users)) {
+ num_dels++;
+ i++;
+ continue;
+ }
+
+ cmp = arm_smmu_invs_merge_cmp(invs, i, to_unref, j);
+ if (cmp < 0) {
+ /* not found in to_unref, leave alone */
+ i++;
+ } else if (cmp == 0) {
+ /* same item */
+ if (refcount_dec_and_test(&invs->inv[i].users))
+ num_dels++;
+ i++;
+ j++;
+ } else {
+ /* item in to_unref is not in invs or already a trash */
+ WARN_ON(true);
+ j++;
+ }
+ }
+ return num_dels;
+}
+EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_unref);
+
+/**
+ * arm_smmu_invs_purge() - Purge all the trash entries in the @invs
+ * @invs: the base invalidation array
+ * @num_dels: expected number of trash entries, typically the return value from
+ * a prior arm_smmu_invs_unref() call
+ *
+ * Return: a newly allocated array on success removing all the trash entries, or
+ * NULL on failure
+ *
+ * This function must be locked and serialized with arm_smmu_invs_merge() and
+ * arm_smmu_invs_unref(), but do not lockdep on any lock for KUNIT test.
+ *
+ * Caller is resposible for freeing the @invs and the returned new one.
+ */
+VISIBLE_IF_KUNIT
+struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
+ size_t num_dels)
+{
+ struct arm_smmu_invs *new_invs;
+ size_t i, j;
+
+ if (WARN_ON(invs->num_invs < num_dels))
+ return NULL;
+
+ new_invs = arm_smmu_invs_alloc(invs->num_invs - num_dels);
+ if (IS_ERR(new_invs))
+ return NULL;
+
+ for (i = j = 0; i != invs->num_invs; i++) {
+ if (!refcount_read(&invs->inv[i].users))
+ continue;
+ new_invs->inv[j] = invs->inv[i];
+ j++;
+ }
+
+ WARN_ON(j != new_invs->num_invs);
+ return new_invs;
+}
+EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_purge);
+
/*
* Based on the value of ent report which bits of the STE the HW will access. It
* would be nice if this was complete according to the spec, but minimally it
@@ -2468,13 +2672,21 @@ static bool arm_smmu_enforce_cache_coherency(struct iommu_domain *domain)
struct arm_smmu_domain *arm_smmu_domain_alloc(void)
{
struct arm_smmu_domain *smmu_domain;
+ struct arm_smmu_invs *new_invs;
smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
if (!smmu_domain)
return ERR_PTR(-ENOMEM);
+ new_invs = arm_smmu_invs_alloc(0);
+ if (IS_ERR(new_invs)) {
+ kfree(smmu_domain);
+ return ERR_CAST(new_invs);
+ }
+
INIT_LIST_HEAD(&smmu_domain->devices);
spin_lock_init(&smmu_domain->devices_lock);
+ rcu_assign_pointer(smmu_domain->invs, new_invs);
return smmu_domain;
}
--
2.43.0
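To make the intended usage of the three helpers and the RCU rules above more
concrete, here is a rough sketch of an attach-side update and of a reader.
The function names, the to_add/to_del arrays (both sorted) and the use of
arm_smmu_asid_lock for writer serialization are assumptions for illustration,
not code from this series:

	/* Sketch only: move a master's invalidations between two domains */
	static int arm_smmu_attach_swap_invs(struct arm_smmu_domain *old_domain,
					     struct arm_smmu_domain *new_domain,
					     struct arm_smmu_invs *to_del,
					     struct arm_smmu_invs *to_add)
	{
		struct arm_smmu_invs *curr, *next;
		size_t num_dels;

		/* Growing the new domain's array can fail, so do it first */
		curr = rcu_dereference_protected(new_domain->invs, true);
		next = arm_smmu_invs_merge(curr, to_add);
		if (IS_ERR(next))
			return PTR_ERR(next);
		rcu_assign_pointer(new_domain->invs, next);
		kfree_rcu(curr, rcu);

		/* Shrinking the old domain's array cannot fail */
		curr = rcu_dereference_protected(old_domain->invs, true);
		num_dels = arm_smmu_invs_unref(curr, to_del);
		next = arm_smmu_invs_purge(curr, num_dels);
		if (next) {
			rcu_assign_pointer(old_domain->invs, next);
			kfree_rcu(curr, rcu);
		}
		/* else: trash entries stay until a later merge/purge succeeds */
		return 0;
	}

	/* Sketch of the reader side: walk the array under RCU, skip trash */
	static void arm_smmu_domain_inv_all(struct arm_smmu_domain *smmu_domain)
	{
		struct arm_smmu_invs *invs;
		size_t i;

		rcu_read_lock();
		invs = rcu_dereference(smmu_domain->invs);
		for (i = 0; i < invs->num_invs; i++) {
			if (!refcount_read(&invs->inv[i].users))
				continue;	/* trash entry, not yet purged */
			/* ... build and push the command for invs->inv[i] ... */
		}
		rcu_read_unlock();
	}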
* Re: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
@ 2025-09-09 13:01 ` kernel test robot
2025-09-20 0:26 ` Nicolin Chen
2025-09-24 21:29 ` Jason Gunthorpe
2 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2025-09-09 13:01 UTC (permalink / raw)
To: Nicolin Chen, jgg, will, robin.murphy
Cc: oe-kbuild-all, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
Hi Nicolin,
kernel test robot noticed the following build errors:
[auto build test ERROR on soc/for-next]
[also build test ERROR on linus/master v6.17-rc5 next-20250909]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Nicolin-Chen/iommu-arm-smmu-v3-Clear-cmds-num-after-arm_smmu_cmdq_batch_submit/20250909-073052
base: https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
patch link: https://lore.kernel.org/r/80310b98efa4bd7e95d7b3ca302f40d4d69e59c5.1757373449.git.nicolinc%40nvidia.com
patch subject: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
config: arm64-randconfig-003-20250909 (https://download.01.org/0day-ci/archive/20250909/202509092020.mxUyqGcN-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250909/202509092020.mxUyqGcN-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202509092020.mxUyqGcN-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1082:23: error: static declaration of 'arm_smmu_invs_merge' follows non-static declaration
1082 | struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
In file included from drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:34:
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h:721:23: note: previous declaration of 'arm_smmu_invs_merge' with type 'struct arm_smmu_invs *(struct arm_smmu_invs *, struct arm_smmu_invs *)'
721 | struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1165:8: error: static declaration of 'arm_smmu_invs_unref' follows non-static declaration
1165 | size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
In file included from drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:34:
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h:723:8: note: previous declaration of 'arm_smmu_invs_unref' with type 'size_t(struct arm_smmu_invs *, struct arm_smmu_invs *)' {aka 'long unsigned int(struct arm_smmu_invs *, struct arm_smmu_invs *)'}
723 | size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1215:23: error: static declaration of 'arm_smmu_invs_purge' follows non-static declaration
1215 | struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
In file included from drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:34:
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h:725:23: note: previous declaration of 'arm_smmu_invs_purge' with type 'struct arm_smmu_invs *(struct arm_smmu_invs *, size_t)' {aka 'struct arm_smmu_invs *(struct arm_smmu_invs *, long unsigned int)'}
725 | struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1215:23: warning: 'arm_smmu_invs_purge' defined but not used [-Wunused-function]
1215 | struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1165:8: warning: 'arm_smmu_invs_unref' defined but not used [-Wunused-function]
1165 | size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1082:23: warning: 'arm_smmu_invs_merge' defined but not used [-Wunused-function]
1082 | struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
| ^~~~~~~~~~~~~~~~~~~
vim +/arm_smmu_invs_merge +1082 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
1063
1064 /**
1065 * arm_smmu_invs_merge() - Merge @to_merge into @invs and generate a new array
1066 * @invs: the base invalidation array
1067 * @to_merge: an array of invlidations to merge
1068 *
1069 * Return: a newly allocated array on success, or ERR_PTR
1070 *
1071 * This function must be locked and serialized with arm_smmu_invs_unref() and
1072 * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
1073 *
1074 * Either @invs or @to_merge must be sorted itself. This ensures the returned
1075 * array will be sorted as well.
1076 *
1077 * Caller is resposible for freeing the @invs and the returned new one.
1078 *
1079 * Entries marked as trash will be purged in the returned array.
1080 */
1081 VISIBLE_IF_KUNIT
> 1082 struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
1083 struct arm_smmu_invs *to_merge)
1084 {
1085 struct arm_smmu_invs *new_invs;
1086 struct arm_smmu_inv *new;
1087 size_t num_adds = 0;
1088 size_t num_dels = 0;
1089 size_t i, j;
1090
1091 for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
1092 int cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
1093
1094 if (cmp < 0) {
1095 /* no found in to_merge, leave alone but delete trash */
1096 if (!refcount_read(&invs->inv[i].users))
1097 num_dels++;
1098 i++;
1099 } else if (cmp == 0) {
1100 /* same item */
1101 i++;
1102 j++;
1103 } else {
1104 /* unique to to_merge */
1105 num_adds++;
1106 j++;
1107 }
1108 }
1109
1110 new_invs = arm_smmu_invs_alloc(invs->num_invs - num_dels + num_adds);
1111 if (IS_ERR(new_invs))
1112 return new_invs;
1113
1114 new = new_invs->inv;
1115 for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
1116 int cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
1117
1118 if (cmp <= 0 && !refcount_read(&invs->inv[i].users)) {
1119 i++;
1120 continue;
1121 }
1122
1123 if (cmp < 0) {
1124 *new = invs->inv[i];
1125 i++;
1126 } else if (cmp == 0) {
1127 *new = invs->inv[i];
1128 refcount_inc(&new->users);
1129 i++;
1130 j++;
1131 } else {
1132 *new = to_merge->inv[j];
1133 refcount_set(&new->users, 1);
1134 j++;
1135 }
1136 new++;
1137 }
1138
1139 WARN_ON(new != new_invs->inv + new_invs->num_invs);
1140
1141 return new_invs;
1142 }
1143 EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_merge);
1144
1145 /**
1146 * arm_smmu_invs_unref() - Find in @invs for all entries in @to_unref, decrease
1147 * the user counts without deletions
1148 * @invs: the base invalidation array
1149 * @to_unref: an array of invlidations to decrease their user counts
1150 *
1151 * Return: the number of trash entries in the array, for arm_smmu_invs_purge()
1152 *
1153 * This function will not fail. Any entry with users=0 will be marked as trash.
1154 * All trash entries will remain in the @invs until being completely deleted by
1155 * the next arm_smmu_invs_merge() or an arm_smmu_invs_purge() function call.
1156 *
1157 * This function must be locked and serialized with arm_smmu_invs_merge() and
1158 * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
1159 *
1160 * Note that the @invs->num_invs will not be updated, even if the actual number
1161 * of invalidations are decreased. Readers should take the read lock to iterate
1162 * each entry and check its users counter until @inv->num_invs.
1163 */
1164 VISIBLE_IF_KUNIT
> 1165 size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
1166 struct arm_smmu_invs *to_unref)
1167 {
1168 size_t num_dels = 0;
1169 size_t i, j;
1170
1171 for (i = j = 0; i != invs->num_invs || j != to_unref->num_invs;) {
1172 int cmp;
1173
1174 if (!refcount_read(&invs->inv[i].users)) {
1175 num_dels++;
1176 i++;
1177 continue;
1178 }
1179
1180 cmp = arm_smmu_invs_merge_cmp(invs, i, to_unref, j);
1181 if (cmp < 0) {
1182 /* not found in to_unref, leave alone */
1183 i++;
1184 } else if (cmp == 0) {
1185 /* same item */
1186 if (refcount_dec_and_test(&invs->inv[i].users))
1187 num_dels++;
1188 i++;
1189 j++;
1190 } else {
1191 /* item in to_unref is not in invs or already a trash */
1192 WARN_ON(true);
1193 j++;
1194 }
1195 }
1196 return num_dels;
1197 }
1198 EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_unref);
1199
1200 /**
1201 * arm_smmu_invs_purge() - Purge all the trash entries in the @invs
1202 * @invs: the base invalidation array
1203 * @num_dels: expected number of trash entries, typically the return value from
1204 * a prior arm_smmu_invs_unref() call
1205 *
1206 * Return: a newly allocated array on success removing all the trash entries, or
1207 * NULL on failure
1208 *
1209 * This function must be locked and serialized with arm_smmu_invs_merge() and
1210 * arm_smmu_invs_unref(), but do not lockdep on any lock for KUNIT test.
1211 *
1212 * Caller is resposible for freeing the @invs and the returned new one.
1213 */
1214 VISIBLE_IF_KUNIT
> 1215 struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
1216 size_t num_dels)
1217 {
1218 struct arm_smmu_invs *new_invs;
1219 size_t i, j;
1220
1221 if (WARN_ON(invs->num_invs < num_dels))
1222 return NULL;
1223
1224 new_invs = arm_smmu_invs_alloc(invs->num_invs - num_dels);
1225 if (IS_ERR(new_invs))
1226 return NULL;
1227
1228 for (i = j = 0; i != invs->num_invs; i++) {
1229 if (!refcount_read(&invs->inv[i].users))
1230 continue;
1231 new_invs->inv[j] = invs->inv[i];
1232 j++;
1233 }
1234
1235 WARN_ON(j != new_invs->num_invs);
1236 return new_invs;
1237 }
1238 EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_purge);
1239
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
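The conflict comes from VISIBLE_IF_KUNIT: with CONFIG_KUNIT disabled it
expands to "static" (see include/kunit/visibility.h), so the unconditional
prototypes in arm-smmu-v3.h clash with the now-static definitions. Assuming
nothing outside arm-smmu-v3.c needs the symbols when the tests are disabled,
one common way out is to expose the prototypes only when the kunit test can
actually see them, for example:

	#if IS_ENABLED(CONFIG_ARM_SMMU_V3_KUNIT_TEST)
	struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
						  struct arm_smmu_invs *to_merge);
	size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
				   struct arm_smmu_invs *to_unref);
	struct arm_smmu_invs *arm_smmu_invs_purge(struct arm_smmu_invs *invs,
						  size_t num_dels);
	#endif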
* Re: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2025-09-09 13:01 ` kernel test robot
@ 2025-09-20 0:26 ` Nicolin Chen
2025-09-24 21:29 ` Jason Gunthorpe
2 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-20 0:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:58PM -0700, Nicolin Chen wrote:
> +/**
> + * arm_smmu_invs_unref() - Find in @invs for all entries in @to_unref, decrease
> + * the user counts without deletions
> + * @invs: the base invalidation array
> + * @to_unref: an array of invlidations to decrease their user counts
> + *
> + * Return: the number of trash entries in the array, for arm_smmu_invs_purge()
> + *
> + * This function will not fail. Any entry with users=0 will be marked as trash.
> + * All trash entries will remain in the @invs until being completely deleted by
> + * the next arm_smmu_invs_merge() or an arm_smmu_invs_purge() function call.
> + *
> + * This function must be locked and serialized with arm_smmu_invs_merge() and
> + * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
> + *
> + * Note that the @invs->num_invs will not be updated, even if the actual number
> + * of invalidations are decreased. Readers should take the read lock to iterate
> + * each entry and check its users counter until @inv->num_invs.
> + */
> +VISIBLE_IF_KUNIT
> +size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> + struct arm_smmu_invs *to_unref)
> +{
> + size_t num_dels = 0;
> + size_t i, j;
> +
> + for (i = j = 0; i != invs->num_invs || j != to_unref->num_invs;) {
> + int cmp;
> +
> + if (!refcount_read(&invs->inv[i].users)) {
> + num_dels++;
> + i++;
> + continue;
> + }
Found that this should AND the condition "i < invs->num_invs".
Will fix in next version.
> +
> + cmp = arm_smmu_invs_merge_cmp(invs, i, to_unref, j);
> + if (cmp < 0) {
> + /* not found in to_unref, leave alone */
> + i++;
> + } else if (cmp == 0) {
> + /* same item */
> + if (refcount_dec_and_test(&invs->inv[i].users))
> + num_dels++;
> + i++;
> + j++;
> + } else {
> + /* item in to_unref is not in invs or already a trash */
> + WARN_ON(true);
> + j++;
> + }
> + }
> + return num_dels;
> +}
Nicolin
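Concretely, that means the trash-entry check only applies while i is still
within the base array, roughly:

	if (i != invs->num_invs &&
	    !refcount_read(&invs->inv[i].users)) {
		num_dels++;
		i++;
		continue;
	}

The rest of the loop stays as posted.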
* Re: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
2025-09-09 13:01 ` kernel test robot
2025-09-20 0:26 ` Nicolin Chen
@ 2025-09-24 21:29 ` Jason Gunthorpe
2025-09-29 18:52 ` Nicolin Chen
2 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:29 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:58PM -0700, Nicolin Chen wrote:
> +/**
> + * arm_smmu_invs_merge() - Merge @to_merge into @invs and generate a new array
> + * @invs: the base invalidation array
> + * @to_merge: an array of invlidations to merge
> + *
> + * Return: a newly allocated array on success, or ERR_PTR
> + *
> + * This function must be locked and serialized with arm_smmu_invs_unref() and
> + * arm_smmu_invs_purge(), but do not lockdep on any lock for KUNIT test.
> + *
> + * Either @invs or @to_merge must be sorted itself. This ensures the returned
s/Either/Both
A merge sort like this requires both lists to be sorted.
> +struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
> + struct arm_smmu_invs *to_merge)
> +{
> + struct arm_smmu_invs *new_invs;
> + struct arm_smmu_inv *new;
> + size_t num_adds = 0;
> + size_t num_dels = 0;
> + size_t i, j;
> +
> + for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
> + int cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
> +
> + if (cmp < 0) {
> + /* no found in to_merge, leave alone but delete trash */
s/no/not/
> + if (!refcount_read(&invs->inv[i].users))
> + num_dels++;
> + i++;
This sequence related to users should be consistent in all the merge
sorts. The one below in unref is the best one:
+ int cmp;
+
+ if (!refcount_read(&invs->inv[i].users)) {
+ num_dels++;
+ i++;
+ continue;
+ }
+
+ cmp = arm_smmu_invs_merge_cmp(invs, i, to_unref, j);
Make all of these loops look like that
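Roughly something like this for the merge loop (an untested sketch; how a
trash entry that also matches to_merge should be handled needs a decision):
----------------------------------------------------------------
	for (i = j = 0; i != invs->num_invs || j != to_merge->num_invs;) {
		int cmp;

		if (i != invs->num_invs &&
		    !refcount_read(&invs->inv[i].users)) {
			/* trash entry, drop it from the new array */
			num_dels++;
			i++;
			continue;
		}

		cmp = arm_smmu_invs_merge_cmp(invs, i, to_merge, j);
		...
	}
----------------------------------------------------------------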
> +
> + WARN_ON(new != new_invs->inv + new_invs->num_invs);
> +
> + return new_invs;
A debugging check that the output list is sorted would be a nice touch
for robustness.
I think this looks OK and has turned out to be pretty simple.
I've been thinking about generalizing it to core code and I think it
would hold up well there as well?
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-24 21:29 ` Jason Gunthorpe
@ 2025-09-29 18:52 ` Nicolin Chen
2025-09-30 12:13 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-29 18:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 24, 2025 at 06:29:12PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 04:26:58PM -0700, Nicolin Chen wrote:
> > +
> > + WARN_ON(new != new_invs->inv + new_invs->num_invs);
> > +
> > + return new_invs;
>
> A debugging check that the output list is sorted would be a nice touch
> for robustness.
Sure. The second for loop generating "new" could run a sanity check.
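Something like this, maybe (arm_smmu_inv_cmp() being a hypothetical helper
that compares two entries by the same sort key):
----------------------------------------------------------------
	/* debug sanity: the merged output must stay sorted */
	for (i = 1; i < new_invs->num_invs; i++)
		WARN_ON(arm_smmu_inv_cmp(&new_invs->inv[i - 1],
					 &new_invs->inv[i]) > 0);
----------------------------------------------------------------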
> I think this looks OK and has turned out to be pretty simple.
>
> I've been thinking about generalizing it to core code and I think it
> would hold up well there as well?
How would you like it to be written in the core: a generalized
structure or a general macro that generates driver structures?
And potentially we need a new drivers/iommu/iommu-invs.c?
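Just to illustrate the two shapes I have in mind (all names below are made
up):
----------------------------------------------------------------
/* Option 1: a generic structure with a common entry type */
struct iommu_invs {
	size_t num_invs;
	struct rcu_head rcu;
	struct iommu_inv inv[];
};

/* Option 2: a macro generating a per-driver structure around its entry type */
#define DECLARE_IOMMU_INVS(name, entry_type)	\
	struct name {				\
		size_t num_invs;		\
		struct rcu_head rcu;		\
		entry_type inv[];		\
	}
----------------------------------------------------------------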
Thanks
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array
2025-09-29 18:52 ` Nicolin Chen
@ 2025-09-30 12:13 ` Jason Gunthorpe
0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-30 12:13 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 29, 2025 at 11:52:09AM -0700, Nicolin Chen wrote:
> > I've been thinking about generalizing it to core code and I think it
> > would hold up well there as well?
>
> How would you like it to be written in the core: a generalized
> structure or a general macro that generates driver structures?
I'm thinking to leave this here for now and revisit it with another
driver someday.
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
` (3 preceding siblings ...)
2025-09-08 23:26 ` [PATCH rfcv2 4/8] iommu/arm-smmu-v3: Introduce a per-domain arm_smmu_invs array Nicolin Chen
@ 2025-09-08 23:26 ` Nicolin Chen
2025-09-24 21:32 ` Jason Gunthorpe
2025-09-08 23:27 ` [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
` (2 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:26 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
When a master is attached from an old domain to a new domain, it needs to
build an invalidation array to delete and add the array entries from/onto
the invalidation arrays of those two domains, passed via the to_merge and
to_unref arguments into arm_smmu_invs_merge/unref() respectively.
Since master->num_streams might differ across masters, memory would have
to be allocated when building a to_merge/to_unref array, which might
fail with -ENOMEM.
On the other hand, an attachment to arm_smmu_blocked_domain must not fail,
so it is best to avoid any memory allocation in that path.
Pre-allocate a fixed size invalidation array for every master. This array
will be used as a scratch to fill dynamically when building a to_merge or
to_unref invs array.
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +++++++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 +++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 34fcc1a930e6a..246c6d84de3ab 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -919,6 +919,14 @@ struct arm_smmu_master {
struct arm_smmu_device *smmu;
struct device *dev;
struct arm_smmu_stream *streams;
+ /*
+ * Scratch memory for a to_merge or to_unref array to build a per-domain
+ * invalidation array. It'll be pre-allocated with enough entries for all
+ * possible build scenarios. It can be used by only one caller at a time
+ * until the arm_smmu_invs_merge/unref() finishes. Must be locked by the
+ * iommu_group mutex.
+ */
+ struct arm_smmu_invs *build_invs;
struct arm_smmu_vmaster *vmaster; /* use smmu->streams_mutex */
/* Locked by the iommu core using the group mutex */
struct arm_smmu_ctx_desc_cfg cd_table;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 83d842bd88817..4e69c81f5a28b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3663,12 +3663,22 @@ static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid)
return 0;
}
+static int arm_smmu_ids_cmp(const void *_l, const void *_r)
+{
+ const typeof_member(struct iommu_fwspec, ids[0]) *l = _l;
+ const typeof_member(struct iommu_fwspec, ids[0]) *r = _r;
+
+ return cmp_int(*l, *r);
+}
+
static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
struct arm_smmu_master *master)
{
int i;
int ret = 0;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
+ bool ats_supported = dev_is_pci(master->dev) &&
+ pci_ats_supported(to_pci_dev(master->dev));
master->streams = kcalloc(fwspec->num_ids, sizeof(*master->streams),
GFP_KERNEL);
@@ -3676,6 +3686,20 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
return -ENOMEM;
master->num_streams = fwspec->num_ids;
+ /* Base case has 1 ASID or 1~2 VMIDs. ATS case adds num_ids */
+ if (!ats_supported)
+ master->build_invs = arm_smmu_invs_alloc(2);
+ else
+ master->build_invs = arm_smmu_invs_alloc(2 + fwspec->num_ids);
+ if (IS_ERR(master->build_invs)) {
+ kfree(master->streams);
+ return PTR_ERR(master->build_invs);
+ }
+
+ /* Put the ids into order for a sorted to_merge or to_unref array */
+ sort_nonatomic(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
+ arm_smmu_ids_cmp, NULL);
+
mutex_lock(&smmu->streams_mutex);
for (i = 0; i < fwspec->num_ids; i++) {
struct arm_smmu_stream *new_stream = &master->streams[i];
@@ -3713,6 +3737,7 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
for (i--; i >= 0; i--)
rb_erase(&master->streams[i].node, &smmu->streams);
kfree(master->streams);
+ kfree(master->build_invs);
}
mutex_unlock(&smmu->streams_mutex);
@@ -3734,6 +3759,7 @@ static void arm_smmu_remove_master(struct arm_smmu_master *master)
mutex_unlock(&smmu->streams_mutex);
kfree(master->streams);
+ kfree(master->build_invs);
}
static struct iommu_device *arm_smmu_probe_device(struct device *dev)
--
2.43.0
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array
2025-09-08 23:26 ` [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
@ 2025-09-24 21:32 ` Jason Gunthorpe
2025-09-29 19:11 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:32 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:26:59PM -0700, Nicolin Chen wrote:
> When a master is attached from an old domain to a new domain, it needs to
> build an invalidation array to delete and add the array entries from/onto
> the invalidation arrays of those two domains, passed via the to_merge and
> to_unref arguments into arm_smmu_invs_merge/unref() respectively.
>
> Since master->num_streams might differ across masters, memory would have
> to be allocated when building a to_merge/to_unref array, which might
> fail with -ENOMEM.
>
> On the other hand, an attachment to arm_smmu_blocked_domain must not fail,
> so it is best to avoid any memory allocation in that path.
>
> Pre-allocate a fixed size invalidation array for every master. This array
> will be used as a scratch to fill dynamically when building a to_merge or
> to_unref invs array.
Mention the sorting here too, though maybe that hunk should be moved
to the next patch.
> + /* Base case has 1 ASID or 1~2 VMIDs. ATS case adds num_ids */
> + if (!ats_supported)
> + master->build_invs = arm_smmu_invs_alloc(2);
> + else
> + master->build_invs = arm_smmu_invs_alloc(2 + fwspec->num_ids);
> + if (IS_ERR(master->build_invs)) {
> + kfree(master->streams);
> + return PTR_ERR(master->build_invs);
> + }
> +
> + /* Put the ids into order for a sorted to_merge or to_unref array */
> + sort_nonatomic(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
> + arm_smmu_ids_cmp, NULL);
The sort could be moved under the above !ats_supported, a little more
insurance in case something is inspecting the ids.
I searched around and only found tegra_dev_iommu_get_stream_id(), which
is fine with the sort due to num_ids == 1.
Otherwise looks OK
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array
2025-09-24 21:32 ` Jason Gunthorpe
@ 2025-09-29 19:11 ` Nicolin Chen
2025-09-30 11:55 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-29 19:11 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 24, 2025 at 06:32:30PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 04:26:59PM -0700, Nicolin Chen wrote:
> > + /* Base case has 1 ASID or 1~2 VMIDs. ATS case adds num_ids */
> > + if (!ats_supported)
> > + master->build_invs = arm_smmu_invs_alloc(2);
> > + else
> > + master->build_invs = arm_smmu_invs_alloc(2 + fwspec->num_ids);
> > + if (IS_ERR(master->build_invs)) {
> > + kfree(master->streams);
> > + return PTR_ERR(master->build_invs);
> > + }
> > +
> > + /* Put the ids into order for a sorted to_merge or to_unref array */
> > + sort_nonatomic(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
> > + arm_smmu_ids_cmp, NULL);
>
> The sort could be moved under the above !ats_supported, a little more
> insurance in case something is inspecting the ids.
You mean this:
----------------------------------------------------------------
@@ -4080,19 +4080,19 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
master->num_streams = fwspec->num_ids;
/* Base case has 1 ASID or 1~2 VMIDs. ATS case adds num_ids */
- if (!ats_supported)
+ if (!ats_supported) {
master->build_invs = arm_smmu_invs_alloc(2);
- else
+ } else {
+ /* Put the ids into order for a sorted to_merge or to_unref array */
+ sort_nonatomic(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
+ arm_smmu_ids_cmp, NULL);
master->build_invs = arm_smmu_invs_alloc(2 + fwspec->num_ids);
+ }
if (IS_ERR(master->build_invs)) {
kfree(master->streams);
return PTR_ERR(master->build_invs);
}
- /* Put the ids into order for a sorted to_merge or to_unref array */
- sort_nonatomic(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
- arm_smmu_ids_cmp, NULL);
-
mutex_lock(&smmu->streams_mutex);
for (i = 0; i < fwspec->num_ids; i++) {
struct arm_smmu_stream *new_stream = &master->streams[i];
----------------------------------------------------------------
?
Hmm, I am not sure how it insures against anything concurrent.
Maybe we should sort it in arm_smmu_of_xlate() each time when
adding a new ID? Or iommu_fwspec_add_ids() itself could sort,
since we are thinking of generalizing this array in the core?
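e.g. something like this at the tail of the core helper (fwspec_id_cmp()
is a made-up comparator):
----------------------------------------------------------------
	/* hypothetical: keep fwspec->ids sorted after every addition */
	sort(fwspec->ids, fwspec->num_ids, sizeof(fwspec->ids[0]),
	     fwspec_id_cmp, NULL);
----------------------------------------------------------------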
Thanks
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array
2025-09-29 19:11 ` Nicolin Chen
@ 2025-09-30 11:55 ` Jason Gunthorpe
0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-30 11:55 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 29, 2025 at 12:11:47PM -0700, Nicolin Chen wrote:
> Hmm, I am not sure how it insures against anything concurrent.
Yes, not concurrent, just if someone is later looking at the IDs list
and expecting it to be in a certain order. I could not find such a
thing so I think it is safe.
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
` (4 preceding siblings ...)
2025-09-08 23:26 ` [PATCH rfcv2 5/8] iommu/arm-smmu-v3: Pre-allocate a per-master invalidation array Nicolin Chen
@ 2025-09-08 23:27 ` Nicolin Chen
2025-09-24 21:42 ` Jason Gunthorpe
2025-09-08 23:27 ` [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
2025-09-08 23:27 ` [PATCH rfcv2 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen
7 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:27 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
Update the invs array with the invalidations required by each domain type
during attachment operations.
Only an SVA domain or a paging domain will have an invs array:
a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
per SID
b. Non-nesting-parent paging domain with no ATS-enabled master will add
a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
(b) and add an INV_TYPE_ATS per SID
d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
will add an INV_TYPE_ATS_FULL per SID
The per-domain invalidation is not needed, until the domain is attached to
a master, i.e. a possible translation request. Giving this clears a way to
allowing the domain to be attached to many SMMUs, and avoids any pointless
invalidation overheads during a teardown if there are no STE/CDs referring
to the domain. This also means, when the last device is detached, the old
domain must flush its ASID or VMID because any iommu_unmap() call after it
wouldn't initiate any invalidation given an empty domain invs array.
Introduce some arm_smmu_invs helper functions for building scratch arrays,
preparing and installing old/new domain's invalidation arrays.
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
2 files changed, 332 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 246c6d84de3ab..e4e0e066108cc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -678,6 +678,8 @@ struct arm_smmu_inv {
/**
* struct arm_smmu_invs - Per-domain invalidation array
* @num_invs: number of invalidations in the flexible array
+ * @old: flag to synchronize with reader
+ * @rwlock: optional rwlock to fence ATS operations
* @rcu: rcu head for kfree_rcu()
* @inv: flexible invalidation array
*
@@ -703,6 +705,8 @@ struct arm_smmu_inv {
*/
struct arm_smmu_invs {
size_t num_invs;
+ rwlock_t rwlock;
+ u8 old;
struct rcu_head rcu;
struct arm_smmu_inv inv[];
};
@@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
if (!new_invs)
return ERR_PTR(-ENOMEM);
+ rwlock_init(&new_invs->rwlock);
new_invs->num_invs = num_invs;
return new_invs;
}
@@ -1082,6 +1087,21 @@ static inline bool arm_smmu_master_canwbs(struct arm_smmu_master *master)
IOMMU_FWSPEC_PCI_RC_CANWBS;
}
+/**
+ * struct arm_smmu_inv_state - Per-domain invalidation array state
+ * @invs_ptr: points to the domain->invs (unwinding nesting/etc.) or is NULL if
+ * no change should be made
+ * @old_invs: the original invs array
+ * @new_invs: for new domain, this is the new invs array to update domain->invs;
+ * for old domain, this is the master->build_invs to pass in as the
+ * to_unref argument to an arm_smmu_invs_unref() call
+ */
+struct arm_smmu_inv_state {
+ struct arm_smmu_invs **invs_ptr;
+ struct arm_smmu_invs *old_invs;
+ struct arm_smmu_invs *new_invs;
+};
+
struct arm_smmu_attach_state {
/* Inputs */
struct iommu_domain *old_domain;
@@ -1091,6 +1111,8 @@ struct arm_smmu_attach_state {
ioasid_t ssid;
/* Resulting state */
struct arm_smmu_vmaster *vmaster;
+ struct arm_smmu_inv_state old_domain_invst;
+ struct arm_smmu_inv_state new_domain_invst;
bool ats_enabled;
};
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4e69c81f5a28b..ee779df1d78fb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
i++;
} else if (cmp == 0) {
/* same item */
- if (refcount_dec_and_test(&invs->inv[i].users))
+ if (refcount_dec_and_test(&invs->inv[i].users)) {
+ /* Notify the caller about this deletion */
+ refcount_set(&to_unref->inv[j].users, 1);
num_dels++;
+ }
i++;
j++;
} else {
@@ -3028,6 +3031,97 @@ static void arm_smmu_disable_iopf(struct arm_smmu_master *master,
iopf_queue_remove_device(master->smmu->evtq.iopf, master->dev);
}
+/*
+ * Use the preallocated scratch array at master->build_invs, to build a to_merge
+ * or to_unref array, to pass into a following arm_smmu_invs_merge/unref() call.
+ *
+ * Do not free the returned invs array. It is reused, and will be overwritten by
+ * the next arm_smmu_master_build_invs() call.
+ */
+static struct arm_smmu_invs *
+arm_smmu_master_build_invs(struct arm_smmu_master *master, bool ats_enabled,
+ ioasid_t ssid, struct arm_smmu_domain *smmu_domain)
+{
+ const bool e2h = master->smmu->features & ARM_SMMU_FEAT_E2H;
+ struct arm_smmu_invs *build_invs = master->build_invs;
+ const bool nesting = smmu_domain->nest_parent;
+ struct arm_smmu_inv *cur;
+
+ iommu_group_mutex_assert(master->dev);
+
+ cur = build_invs->inv;
+
+ switch (smmu_domain->stage) {
+ case ARM_SMMU_DOMAIN_SVA:
+ case ARM_SMMU_DOMAIN_S1:
+ *cur = (struct arm_smmu_inv){
+ .smmu = master->smmu,
+ .type = INV_TYPE_S1_ASID,
+ .id = smmu_domain->cd.asid,
+ .size_opcode = e2h ? CMDQ_OP_TLBI_EL2_VA :
+ CMDQ_OP_TLBI_NH_VA,
+ .nsize_opcode = e2h ? CMDQ_OP_TLBI_EL2_ASID :
+ CMDQ_OP_TLBI_NH_ASID
+ };
+ break;
+ case ARM_SMMU_DOMAIN_S2:
+ *cur = (struct arm_smmu_inv){
+ .smmu = master->smmu,
+ .type = INV_TYPE_S2_VMID,
+ .id = smmu_domain->s2_cfg.vmid,
+ .size_opcode = CMDQ_OP_TLBI_S2_IPA,
+ .nsize_opcode = CMDQ_OP_TLBI_S12_VMALL,
+ };
+ break;
+ default:
+ WARN_ON(true);
+ return NULL;
+ }
+
+ /* Range-based invalidation requires the leaf pgsize for calculation */
+ if (master->smmu->features & ARM_SMMU_FEAT_RANGE_INV)
+ cur->pgsize = __ffs(smmu_domain->domain.pgsize_bitmap);
+ cur++;
+
+ /* All the nested S1 ASIDs have to be flushed when S2 parent changes */
+ if (nesting) {
+ *cur = (struct arm_smmu_inv){
+ .smmu = master->smmu,
+ .type = INV_TYPE_S2_VMID_S1_CLEAR,
+ .id = smmu_domain->s2_cfg.vmid,
+ .size_opcode = CMDQ_OP_TLBI_NH_ALL,
+ .nsize_opcode = CMDQ_OP_TLBI_NH_ALL,
+ };
+ cur++;
+ }
+
+ if (ats_enabled) {
+ size_t i;
+
+ for (i = 0; i < master->num_streams; i++) {
+ /*
+ * If an S2 used as a nesting parent is changed we have
+ * no option but to completely flush the ATC.
+ */
+ *cur = (struct arm_smmu_inv){
+ .smmu = master->smmu,
+ .type = nesting ? INV_TYPE_ATS_FULL :
+ INV_TYPE_ATS,
+ .id = master->streams[i].id,
+ .ssid = ssid,
+ .size_opcode = CMDQ_OP_ATC_INV,
+ .nsize_opcode = CMDQ_OP_ATC_INV,
+ };
+ cur++;
+ }
+ }
+
+ /* Note this build_invs must have been sorted */
+
+ build_invs->num_invs = cur - build_invs->inv;
+ return build_invs;
+}
+
static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
struct iommu_domain *domain,
ioasid_t ssid)
@@ -3057,6 +3151,211 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master,
kfree(master_domain);
}
+static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_invs *invs, char *name)
+{
+ size_t i;
+
+ dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
+ smmu_domain->domain.type, name, invs->num_invs);
+ for (i = 0; i < invs->num_invs; i++) {
+ struct arm_smmu_inv *cur = &invs->inv[i];
+
+ dev_dbg(master->dev,
+ " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
+ cur->type, cur->id, refcount_read(&cur->users));
+ }
+}
+
+/*
+ * During attachment, the updates of the two domain->invs arrays are sequenced:
+ * 1. new domain updates its invs array, merging master->build_invs
+ * 2. new domain starts to include the master during its invalidation
+ * 3. master updates its STE switching from the old domain to the new domain
+ * 4. old domain still includes the master during its invalidation
+ * 5. old domain updates its invs array, unreferencing master->build_invs
+ *
+ * For 1 and 5, prepare the two updated arrays in advance, handling any changes
+ * that could possibly fail. So the actual update of either 1 or 5 won't fail.
+ * arm_smmu_asid_lock ensures that the old invs in the domains are intact while
+ * we are sequencing to update them.
+ */
+static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
+ struct arm_smmu_domain *new_smmu_domain)
+{
+ struct arm_smmu_domain *old_smmu_domain =
+ to_smmu_domain_devices(state->old_domain);
+ struct arm_smmu_master *master = state->master;
+ ioasid_t ssid = state->ssid;
+
+ /* A re-attach case doesn't need to update invs array */
+ if (new_smmu_domain == old_smmu_domain)
+ return 0;
+
+ /*
+ * At this point a NULL domain indicates the domain doesn't use the
+ * IOTLB, see to_smmu_domain_devices().
+ */
+ if (new_smmu_domain) {
+ struct arm_smmu_inv_state *invst = &state->new_domain_invst;
+ struct arm_smmu_invs *build_invs;
+
+ invst->invs_ptr = &new_smmu_domain->invs;
+ invst->old_invs = rcu_dereference_protected(
+ new_smmu_domain->invs,
+ lockdep_is_held(&arm_smmu_asid_lock));
+ build_invs = arm_smmu_master_build_invs(
+ master, state->ats_enabled, ssid, new_smmu_domain);
+ if (!build_invs)
+ return -EINVAL;
+
+ invst->new_invs =
+ arm_smmu_invs_merge(invst->old_invs, build_invs);
+ if (IS_ERR(invst->new_invs))
+ return PTR_ERR(invst->new_invs);
+
+ arm_smmu_invs_dbg(master, new_smmu_domain, invst->old_invs,
+ "new domain's old invs");
+ arm_smmu_invs_dbg(master, new_smmu_domain, build_invs, "merge");
+ arm_smmu_invs_dbg(master, new_smmu_domain, invst->new_invs,
+ "new domain's new invs");
+ }
+
+ if (old_smmu_domain) {
+ struct arm_smmu_inv_state *invst = &state->old_domain_invst;
+
+ invst->invs_ptr = &old_smmu_domain->invs;
+ invst->old_invs = rcu_dereference_protected(
+ old_smmu_domain->invs,
+ lockdep_is_held(&arm_smmu_asid_lock));
+ /* For old_smmu_domain, new_invs points to master->build_invs */
+ invst->new_invs = arm_smmu_master_build_invs(
+ master, master->ats_enabled, ssid, old_smmu_domain);
+ }
+
+ return 0;
+}
+
+/* Must be installed before arm_smmu_install_ste_for_dev() */
+static void
+arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
+{
+ struct arm_smmu_inv_state *invst = &state->new_domain_invst;
+
+ if (!invst->invs_ptr)
+ return;
+
+ rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
+ /*
+ * Committed to updating the STE, using the new invalidation array, and
+ * acquiring any racing IOPTE updates.
+ */
+ smp_mb();
+ kfree_rcu(invst->old_invs, rcu);
+}
+
+/*
+ * When an array entry's users count reaches zero, it means the ASID/VMID is no
+ * longer being invalidated by map/unmap and must be cleaned. The rule is that
+ * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
+ */
+static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
+{
+ size_t i;
+
+ for (i = 0; i != invs->num_invs; i++) {
+ struct arm_smmu_inv *inv = &invs->inv[i];
+ struct arm_smmu_cmdq_ent cmd = {};
+
+ /* arm_smmu_invs_unref() sets users if it was the last user */
+ if (!refcount_read(&inv->users))
+ continue;
+
+ switch (inv->type) {
+ case INV_TYPE_S1_ASID:
+ cmd.tlbi.asid = inv->id;
+ break;
+ case INV_TYPE_S2_VMID:
+ /* S2_VMID using nsize_opcode covers S2_VMID_S1_CLEAR */
+ cmd.tlbi.vmid = inv->id;
+ break;
+ default:
+ continue;
+ }
+
+ cmd.opcode = inv->nsize_opcode;
+ arm_smmu_cmdq_issue_cmd_with_sync(inv->smmu, &cmd);
+ }
+}
+
+/* Should be installed after arm_smmu_install_ste_for_dev() */
+static void
+arm_smmu_install_old_domain_invs(struct arm_smmu_attach_state *state)
+{
+ struct arm_smmu_inv_state *invst = &state->old_domain_invst;
+ struct arm_smmu_domain *old_smmu_domain =
+ to_smmu_domain_devices(state->old_domain);
+ struct arm_smmu_invs *old_invs = invst->old_invs;
+ struct arm_smmu_master *master = state->master;
+ struct arm_smmu_invs *new_invs;
+ unsigned long flags;
+ size_t num_dels;
+
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (!invst->invs_ptr)
+ return;
+
+ arm_smmu_invs_dbg(master, old_smmu_domain, old_invs,
+ "old domain's old invs");
+ arm_smmu_invs_dbg(master, old_smmu_domain, invst->new_invs, "unref");
+ num_dels = arm_smmu_invs_unref(old_invs, invst->new_invs);
+ if (!num_dels) {
+ arm_smmu_invs_dbg(master, old_smmu_domain, old_invs,
+ "old domain's new invs");
+ return;
+ }
+
+ arm_smmu_invs_flush_iotlb_tags(invst->new_invs);
+
+ new_invs = arm_smmu_invs_purge(old_invs, num_dels);
+ if (!new_invs) {
+ size_t new_num = old_invs->num_invs;
+
+ /*
+ * OOM. Couldn't make a copy. Leave the array unoptimized. But
+ * trim its size if some tailing entries are marked as trash.
+ */
+ while (new_num != 0) {
+ if (refcount_read(&old_invs->inv[new_num - 1].users))
+ break;
+ new_num--;
+ }
+
+ arm_smmu_invs_dbg(master, old_smmu_domain, old_invs,
+ "old domain's new invs");
+
+ /* The lock is required to fence concurrent ATS operations. */
+ write_lock_irqsave(&old_invs->rwlock, flags);
+ WRITE_ONCE(old_invs->num_invs, new_num);
+ write_unlock_irqrestore(&old_invs->rwlock, flags);
+ return;
+ }
+
+ arm_smmu_invs_dbg(master, old_smmu_domain, new_invs,
+ "old domain's new invs");
+
+ /* new_invs is a copy, do the copy update part of RCU */
+ rcu_assign_pointer(*invst->invs_ptr, new_invs);
+ /* Notify any concurrent invalidation to read the updated invs */
+ write_lock_irqsave(&old_invs->rwlock, flags);
+ WRITE_ONCE(old_invs->old, true);
+ write_unlock_irqrestore(&old_invs->rwlock, flags);
+
+ kfree_rcu(old_invs, rcu);
+}
+
/*
* Start the sequence to attach a domain to a master. The sequence contains three
* steps:
@@ -3114,12 +3413,16 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
arm_smmu_ats_supported(master);
}
+ ret = arm_smmu_attach_prepare_invs(state, smmu_domain);
+ if (ret)
+ return ret;
+
if (smmu_domain) {
if (new_domain->type == IOMMU_DOMAIN_NESTED) {
ret = arm_smmu_attach_prepare_vmaster(
state, to_smmu_nested_domain(new_domain));
if (ret)
- return ret;
+ goto err_unprepare_invs;
}
master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL);
@@ -3167,6 +3470,8 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
atomic_inc(&smmu_domain->nr_ats_masters);
list_add(&master_domain->devices_elm, &smmu_domain->devices);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+ arm_smmu_install_new_domain_invs(state);
}
if (!state->ats_enabled && master->ats_enabled) {
@@ -3186,6 +3491,8 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
kfree(master_domain);
err_free_vmaster:
kfree(state->vmaster);
+err_unprepare_invs:
+ kfree(state->new_domain_invst.new_invs);
return ret;
}
@@ -3217,6 +3524,7 @@ void arm_smmu_attach_commit(struct arm_smmu_attach_state *state)
}
arm_smmu_remove_master_domain(master, state->old_domain, state->ssid);
+ arm_smmu_install_old_domain_invs(state);
master->ats_enabled = state->ats_enabled;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-08 23:27 ` [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
@ 2025-09-24 21:42 ` Jason Gunthorpe
2025-09-29 20:52 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:42 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
> Update the invs array with the invalidations required by each domain type
> during attachment operations.
>
> Only an SVA domain or a paging domain will have an invs array:
> a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
> per SID
>
> b. Non-nesting-parent paging domain with no ATS-enabled master will add
> a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
>
> c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
> (b) and add an INV_TYPE_ATS per SID
>
> d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
> an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
> will add an INV_TYPE_ATS_FULL per SID
Just some minor forward looking clarification - this behavior should
be triggered when a nest-parent is attached through the viommu using
a nesting domain with a vSTE.
A nesting-parent that is just normally attached should act like a
normal S2 since it does not and can not have a two stage S1 on top of
it.
We can't quite get there yet until the next series of changing how the
VMID allocation works.
> The per-domain invalidation is not needed, until the domain is attached to
> a master, i.e. a possible translation request. Giving this clears a way to
> allowing the domain to be attached to many SMMUs, and avoids any pointless
> invalidation overheads during a teardown if there are no STE/CDs referring
> to the domain. This also means, when the last device is detached, the old
> domain must flush its ASID or VMID because any iommu_unmap() call after it
> wouldn't initiate any invalidation given an empty domain invs array.
Grammar/phrasing in this paragraph
> Introduce some arm_smmu_invs helper functions for building scratch arrays,
> preparing and installing old/new domain's invalidation arrays.
>
> Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
> 2 files changed, 332 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 246c6d84de3ab..e4e0e066108cc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -678,6 +678,8 @@ struct arm_smmu_inv {
> /**
> * struct arm_smmu_invs - Per-domain invalidation array
> * @num_invs: number of invalidations in the flexible array
> + * @old: flag to synchronize with reader
> + * @rwlock: optional rwlock to fence ATS operations
> * @rcu: rcu head for kfree_rcu()
> * @inv: flexible invalidation array
> *
> @@ -703,6 +705,8 @@ struct arm_smmu_inv {
> */
> struct arm_smmu_invs {
> size_t num_invs;
> + rwlock_t rwlock;
> + u8 old;
> struct rcu_head rcu;
> struct arm_smmu_inv inv[];
> };
> @@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
> new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
> if (!new_invs)
> return ERR_PTR(-ENOMEM);
> + rwlock_init(&new_invs->rwlock);
> new_invs->num_invs = num_invs;
> return new_invs;
> }
Put these and related hunks in the patch adding arm_smmu_invs
> @@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> i++;
> } else if (cmp == 0) {
> /* same item */
> - if (refcount_dec_and_test(&invs->inv[i].users))
> + if (refcount_dec_and_test(&invs->inv[i].users)) {
> + /* Notify the caller about this deletion */
> + refcount_set(&to_unref->inv[j].users, 1);
> num_dels++;
This is a bit convoluted. Instead of marking the entry and then
iterating the list again just directly call a function to do the
invalidation.
> +static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
> + struct arm_smmu_domain *smmu_domain,
> + struct arm_smmu_invs *invs, char *name)
> +{
> + size_t i;
> +
> + dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
> + smmu_domain->domain.type, name, invs->num_invs);
> + for (i = 0; i < invs->num_invs; i++) {
> + struct arm_smmu_inv *cur = &invs->inv[i];
> +
> + dev_dbg(master->dev,
> + " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
> + cur->type, cur->id, refcount_read(&cur->users));
> + }
> +}
Move all the debug code to its own commit and don't send it
> +static void
> +arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
> +{
> + struct arm_smmu_inv_state *invst = &state->new_domain_invst;
> +
> + if (!invst->invs_ptr)
> + return;
> +
> + rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
> + /*
> + * Committed to updating the STE, using the new invalidation array, and
> + * acquiring any racing IOPTE updates.
> + */
> + smp_mb();
We are committed to updating the STE. Make the invalidation list
visible to parallel map/unmap threads and acquire any racing IOPTE
updates.
> + kfree_rcu(invst->old_invs, rcu);
> +}
> +
> +/*
> + * When an array entry's users count reaches zero, it means the ASID/VMID is no
> + * longer being invalidated by map/unmap and must be cleaned. The rule is that
> + * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
> + */
> +static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
> +{
> + size_t i;
> +
> + for (i = 0; i != invs->num_invs; i++) {
Just remove the loop and accept struct arm_smmu_inv * and call it
directly.
> + if (!new_invs) {
> + size_t new_num = old_invs->num_invs;
> +
> + /*
> + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> + * trim its size if some tailing entries are marked as trash.
> + */
> + while (new_num != 0) {
> + if (refcount_read(&old_invs->inv[new_num - 1].users))
> + break;
> + new_num--;
> + }
Would be nicer to have arm_smmu_invs_unref return the new size so we
don't need this loop
Looks Ok to me otherwise
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-24 21:42 ` Jason Gunthorpe
@ 2025-09-29 20:52 ` Nicolin Chen
2025-09-30 12:12 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-29 20:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 24, 2025 at 06:42:15PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
> > Update the invs array with the invalidations required by each domain type
> > during attachment operations.
> >
> > Only an SVA domain or a paging domain will have an invs array:
> > a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
> > per SID
> >
> > b. Non-nesting-parent paging domain with no ATS-enabled master will add
> > a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
> >
> > c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
> > (b) and add an INV_TYPE_ATS per SID
> >
> > d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
> > an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
> > will add an INV_TYPE_ATS_FULL per SID
>
> Just some minor forward looking clarification - this behavior should
> be triggered when a nest-parent is attached through the viommu using
> a nesting domain with a vSTE.
>
> A nesting-parent that is just normally attached should act like a
> normal S2 since it does not and can not have a two stage S1 on top of
> it.
>
> We can't quite get there yet until the next series of changing how the
> VMID allocation works.
Yea, you are right. Let's add this:
Note that case #d applies when a nesting parent domain is attached
through a vSMMU instance using a nested domain carrying a vSTE. This model
will allocate a VMID per vSMMU instance vs. the current driver allocating
per S2 domain. So, this requires a few more patches for S2 domain sharing.
> > The per-domain invalidation is not needed, until the domain is attached to
> > a master, i.e. a possible translation request. Giving this clears a way to
> > allowing the domain to be attached to many SMMUs, and avoids any pointless
> > invalidation overheads during a teardown if there are no STE/CDs referring
> > to the domain. This also means, when the last device is detached, the old
> > domain must flush its ASID or VMID because any iommu_unmap() call after it
> > wouldn't initiate any invalidation given an empty domain invs array.
>
> Grammar/phrasing in this paragraph
OK. I asked AI to rewrite it:
The per-domain invalidation is not needed until the domain is attached to
a master (when it starts to possibly use TLB). This will make it possible
to attach the domain to multiple SMMUs and avoid unnecessary invalidation
overhead during teardown if no STEs/CDs refer to the domain. It also means
that when the last device is detached, the old domain must flush its ASID
or VMID, since any new iommu_unmap() call would not trigger invalidations
given an empty domain->invs array.
> > @@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> > i++;
> > } else if (cmp == 0) {
> > /* same item */
> > - if (refcount_dec_and_test(&invs->inv[i].users))
> > + if (refcount_dec_and_test(&invs->inv[i].users)) {
> > + /* Notify the caller about this deletion */
> > + refcount_set(&to_unref->inv[j].users, 1);
> > num_dels++;
>
> This is a bit convoluted. Instead of marking the entry and then
> iterating the list again just directly call a function to do the
> invalidation.
OK. If we want to generalize this arm_smmu_invs_unref function,
I suppose we will need to pass in a callback function pointer.
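Roughly like this, I suppose (the callback type name below is just a
placeholder):
----------------------------------------------------------------
typedef void (*arm_smmu_inv_flush_fn)(struct arm_smmu_inv *inv);

size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
			   struct arm_smmu_invs *to_unref,
			   arm_smmu_inv_flush_fn flush_fn)
{
	...
		} else if (cmp == 0) {
			/* same item */
			if (refcount_dec_and_test(&invs->inv[i].users)) {
				/* flush the stale IOTLB tag right away */
				if (flush_fn)
					flush_fn(&invs->inv[i]);
				num_dels++;
			}
			i++;
			j++;
		}
	...
}
----------------------------------------------------------------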
> > + if (!new_invs) {
> > + size_t new_num = old_invs->num_invs;
> > +
> > + /*
> > + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> > + * trim its size if some tailing entries are marked as trash.
> > + */
> > + while (new_num != 0) {
> > + if (refcount_read(&old_invs->inv[new_num - 1].users))
> > + break;
> > + new_num--;
> > + }
>
> Would be nicer to have arm_smmu_invs_unref return the new size so we
> don't need this loop
The "new size" must be invs->num_invs subtracting the number of
the tailing trash entries. So, arm_smmu_invs_unref() would have
to have the same loop validating the tailing entries, right?
(I will address all other comments as well)
Thanks!
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-29 20:52 ` Nicolin Chen
@ 2025-09-30 12:12 ` Jason Gunthorpe
2025-09-30 20:19 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-30 12:12 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 29, 2025 at 01:52:30PM -0700, Nicolin Chen wrote:
> > > + if (!new_invs) {
> > > + size_t new_num = old_invs->num_invs;
> > > +
> > > + /*
> > > + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> > > + * trim its size if some tailing entries are marked as trash.
> > > + */
> > > + while (new_num != 0) {
> > > + if (refcount_read(&old_invs->inv[new_num - 1].users))
> > > + break;
> > > + new_num--;
> > > + }
> >
> > Would be nicer to have arm_smmu_invs_unref return the new size so we
> > don't need this loop
>
> The "new size" must be invs->num_invs subtracting the number of
> the tailing trash entries. So, arm_smmu_invs_unref() would have
> to have the same loop validating the tailing entries, right?
It doesn't need another loop, it just needs to record the index of the
last valid entry while it is doing its own loop. If it reaches
invs->num_invs then that will be the new length.
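Something like this (new_num being a new local in arm_smmu_invs_unref(),
just a sketch):
----------------------------------------------------------------
	size_t new_num = 0;

	for (i = j = 0; ...) {
		...
		/* any entry that stays in use extends the trimmed length */
		if (refcount_read(&invs->inv[i].users))
			new_num = i + 1;
		...
	}
	/* new_num == invs->num_invs means there is no trailing trash to cut */
----------------------------------------------------------------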
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-30 12:12 ` Jason Gunthorpe
@ 2025-09-30 20:19 ` Nicolin Chen
2025-10-01 16:25 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-30 20:19 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Tue, Sep 30, 2025 at 09:12:00AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 29, 2025 at 01:52:30PM -0700, Nicolin Chen wrote:
>
> > > > + if (!new_invs) {
> > > > + size_t new_num = old_invs->num_invs;
> > > > +
> > > > + /*
> > > > + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> > > > + * trim its size if some tailing entries are marked as trash.
> > > > + */
> > > > + while (new_num != 0) {
> > > > + if (refcount_read(&old_invs->inv[new_num - 1].users))
> > > > + break;
> > > > + new_num--;
> > > > + }
> > >
> > > Would be nicer to have arm_smmu_invs_unref return the new size so we
> > > don't need this loop
> >
> > The "new size" must be invs->num_invs subtracting the number of
> > the tailing trash entries. So, arm_smmu_invs_unref() would have
> > to have the same loop validating the tailing entries, right?
>
> It doesn't need another loop, it just needs to record the index of the
> last valid entry while it is doing its own loop. If it reaches
> invs->num_invs then that will be the new length.
arm_smmu_invs_purge() needs num_dels, while its fallback routine
needs num_invs (new size). This forces arm_smmu_invs_unref() to
return two numbers.
I see a cleaner way of handling this: update invs->num_invs
inside arm_smmu_invs_unref():
----------------------------------------------------------------
@@ -1209,6 +1216,13 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
j++;
}
}
+
+ /* The lock is required to fence concurrent ATS operations. */
+ write_lock_irqsave(&invs->rwlock, flags);
+ /* Trim the size by removing tailing trash entries */
+ WRITE_ONCE(invs->num_invs, num_invs);
+ write_unlock_irqrestore(&invs->rwlock, flags);
+
return num_dels;
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_invs_unref);
----------------------------------------------------------------
So the caller would look like:
----------------------------------------------------------------
num_dels = arm_smmu_invs_unref(old_invs, invst->new_invs);
if (!num_dels)
return;
new_invs = arm_smmu_invs_purge(old_invs, num_dels);
if (!new_invs)
return;
----------------------------------------------------------------
Thanks
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-09-30 20:19 ` Nicolin Chen
@ 2025-10-01 16:25 ` Jason Gunthorpe
2025-10-01 17:16 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-10-01 16:25 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Tue, Sep 30, 2025 at 01:19:29PM -0700, Nicolin Chen wrote:
> > I see a cleaner way of handling this: update invs->num_invs
> > inside arm_smmu_invs_unref():
> ----------------------------------------------------------------
> @@ -1209,6 +1216,13 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> j++;
> }
> }
> +
> + /* The lock is required to fence concurrent ATS operations. */
> + write_lock_irqsave(&invs->rwlock, flags);
> + /* Trim the size by removing tailing trash entries */
> + WRITE_ONCE(invs->num_invs, num_invs);
> + write_unlock_irqrestore(&invs->rwlock, flags);
That seems Ok
It means the arm_smmu_invs_unref() becomes the fence that guarantees
the ATS is stopped for anything marked as trash.
Then the next steps can just be normal RCU and don't need rwlocking.
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters
2025-10-01 16:25 ` Jason Gunthorpe
@ 2025-10-01 17:16 ` Nicolin Chen
0 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-10-01 17:16 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Oct 01, 2025 at 01:25:16PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 30, 2025 at 01:19:29PM -0700, Nicolin Chen wrote:
> > I see a cleaner way of handling this is to update invs->num_invs
> > inside arm_smmu_invs_unref():
> > ----------------------------------------------------------------
> > @@ -1209,6 +1216,13 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> > j++;
> > }
> > }
> > +
> > + /* The lock is required to fence concurrent ATS operations. */
> > + write_lock_irqsave(&invs->rwlock, flags);
> > + /* Trim the size by removing tailing trash entries */
> > + WRITE_ONCE(invs->num_invs, num_invs);
> > + write_unlock_irqrestore(&invs->rwlock, flags);
>
> That seems Ok
>
> It means the arm_smmu_invs_unref() becomes the fence that guarentees
> the ATS is stopped for anything marked as trash.
>
> Then the next steps can just be normal RCU and don't need rwlocking.
Yea. The "old" flag could be dropped too:
while (true) {
invs = rcu_dereference(smmu_domain->invs);
/*
* Avoid locking unless ATS is being used. No ATS invalidate can
* be going on after a domain is detached.
*/
locked = false;
- if (invs->has_ats || READ_ONCE(invs->old)) {
+ if (invs->has_ats) {
read_lock(&invs->rwlock);
- if (invs->old) {
- read_unlock(&invs->rwlock);
- continue;
- }
locked = true;
}
break;
}
Thanks
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
` (5 preceding siblings ...)
2025-09-08 23:27 ` [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs when attaching masters Nicolin Chen
@ 2025-09-08 23:27 ` Nicolin Chen
2025-09-24 21:56 ` Jason Gunthorpe
2025-09-08 23:27 ` [PATCH rfcv2 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs Nicolin Chen
7 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:27 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
Now, each smmu_domain is built with an invs array that keeps all the IDs
(ASID/VMID) and its attached device SIDs, following the exact pattern of
all the existing invalidation functions.
Introduce a new arm_smmu_domain_inv helper iterating smmu_domain->invs,
to convert the invalidation array to commands. Any invalidation request
with no size specified means a full flush rather than a range-based one.
ATC invalidations must be in sync with an attachment, especially when it
is an IOMMU_DOMAIN_BLOCKED that is used by the core for a device reset.
Add a has_ats flag in the invs structure and hold the read lock if it is
set, so that a concurrent attachment will be blocked until any ATC
invalidation being sent to the command queue is finished.
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 16 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 225 ++++++++++++++++++--
2 files changed, 228 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index e4e0e066108cc..c73a94514c6d6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -675,9 +675,15 @@ struct arm_smmu_inv {
refcount_t users; /* users=0 to mark as a trash to be purged */
};
+static inline bool arm_smmu_inv_is_ats(struct arm_smmu_inv *inv)
+{
+ return inv->type == INV_TYPE_ATS || inv->type == INV_TYPE_ATS_FULL;
+}
+
/**
* struct arm_smmu_invs - Per-domain invalidation array
* @num_invs: number of invalidations in the flexible array
+ * @has_ats: flag if the array contains an INV_TYPE_ATS or INV_TYPE_ATS_FULL
* @old: flag to synchronize with reader
* @rwlock: optional rwlock to fence ATS operations
* @rcu: rcu head for kfree_rcu()
@@ -706,6 +712,7 @@ struct arm_smmu_inv {
struct arm_smmu_invs {
size_t num_invs;
rwlock_t rwlock;
+ u8 has_ats;
u8 old;
struct rcu_head rcu;
struct arm_smmu_inv inv[];
@@ -1072,6 +1079,15 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
unsigned long iova, size_t size);
+void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size,
+ unsigned int granule, bool leaf);
+
+static inline void arm_smmu_domain_inv(struct arm_smmu_domain *smmu_domain)
+{
+ arm_smmu_domain_inv_range(smmu_domain, 0, 0, 0, false);
+}
+
void __arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ee779df1d78fb..c06d2dd893c11 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2479,23 +2479,19 @@ static void arm_smmu_tlb_inv_context(void *cookie)
arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
}
-static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
- unsigned long iova, size_t size,
- size_t granule,
- struct arm_smmu_domain *smmu_domain)
+static void arm_smmu_cmdq_batch_add_range(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *cmd,
+ unsigned long iova, size_t size,
+ size_t granule, size_t pgsize)
{
- struct arm_smmu_device *smmu = smmu_domain->smmu;
- unsigned long end = iova + size, num_pages = 0, tg = 0;
+ unsigned long end = iova + size, num_pages = 0, tg = pgsize;
size_t inv_range = granule;
- struct arm_smmu_cmdq_batch cmds;
if (!size)
return;
if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
- /* Get the leaf page size */
- tg = __ffs(smmu_domain->domain.pgsize_bitmap);
-
num_pages = size >> tg;
/* Convert page size of 12,14,16 (log2) to 1,2,3 */
@@ -2515,8 +2511,6 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
num_pages++;
}
- arm_smmu_cmdq_batch_init(smmu, &cmds, cmd);
-
while (iova < end) {
if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
/*
@@ -2544,9 +2538,26 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
}
cmd->tlbi.addr = iova;
- arm_smmu_cmdq_batch_add(smmu, &cmds, cmd);
+ arm_smmu_cmdq_batch_add(smmu, cmds, cmd);
iova += inv_range;
}
+}
+
+static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
+ unsigned long iova, size_t size,
+ size_t granule,
+ struct arm_smmu_domain *smmu_domain)
+{
+ struct arm_smmu_device *smmu = smmu_domain->smmu;
+ struct arm_smmu_cmdq_batch cmds;
+ size_t pgsize;
+
+ /* Get the leaf page size */
+ pgsize = __ffs(smmu_domain->domain.pgsize_bitmap);
+
+ arm_smmu_cmdq_batch_init(smmu, &cmds, cmd);
+ arm_smmu_cmdq_batch_add_range(smmu, &cmds, cmd, iova, size, granule,
+ pgsize);
arm_smmu_cmdq_batch_submit(smmu, &cmds);
}
@@ -2602,6 +2613,194 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
}
+static bool arm_smmu_inv_size_too_big(struct arm_smmu_device *smmu, size_t size,
+ size_t granule)
+{
+ size_t max_tlbi_ops;
+
+ /* 0 size means invalidate all */
+ if (!size || size == SIZE_MAX)
+ return true;
+
+ if (smmu->features & ARM_SMMU_FEAT_RANGE_INV)
+ return false;
+
+ /*
+ * Borrowed from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h,
+ * this is used as a threshold to replace "size_opcode" commands with a
+ * single "nsize_opcode" command, when SMMU doesn't implement the range
+ * invalidation feature, where there can be too many per-granule TLBIs,
+ * resulting in a soft lockup.
+ */
+ max_tlbi_ops = 1 << (ilog2(granule) - 3);
+ return size >= max_tlbi_ops * granule;
+}
+
+/* Used by non INV_TYPE_ATS* invalidations */
+static void arm_smmu_inv_to_cmdq_batch(struct arm_smmu_inv *inv,
+ struct arm_smmu_cmdq_batch *cmds,
+ struct arm_smmu_cmdq_ent *cmd,
+ unsigned long iova, size_t size,
+ unsigned int granule)
+{
+ if (arm_smmu_inv_size_too_big(inv->smmu, size, granule)) {
+ cmd->opcode = inv->nsize_opcode;
+ /* nsize_opcode always needs a sync, no batching */
+ arm_smmu_cmdq_issue_cmd_with_sync(inv->smmu, cmd);
+ return;
+ }
+
+ cmd->opcode = inv->size_opcode;
+ arm_smmu_cmdq_batch_add_range(inv->smmu, cmds, cmd, iova, size, granule,
+ inv->pgsize);
+}
+
+static inline bool arm_smmu_invs_end_batch(struct arm_smmu_inv *cur,
+ struct arm_smmu_inv *next)
+{
+ /* Changing smmu means changing command queue */
+ if (cur->smmu != next->smmu)
+ return true;
+ /* The batch for S2 TLBI must be done before nested S1 ASIDs */
+ if (cur->type != INV_TYPE_S2_VMID_S1_CLEAR &&
+ next->type == INV_TYPE_S2_VMID_S1_CLEAR)
+ return true;
+ /* ATS must be after a sync of the S1/S2 invalidations */
+ if (!arm_smmu_inv_is_ats(cur) && arm_smmu_inv_is_ats(next))
+ return true;
+ return false;
+}
+
+void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
+ unsigned long iova, size_t size,
+ unsigned int granule, bool leaf)
+{
+ struct arm_smmu_cmdq_batch cmds = {};
+ struct arm_smmu_invs *invs;
+ struct arm_smmu_inv *cur;
+ struct arm_smmu_inv *end;
+ bool locked;
+
+ /*
+ * An invalidation request must follow some IOPTE change and then load
+ * an invalidation array. In the meantime, a domain attachment mutates
+ * the array and then stores an STE/CD asking SMMU HW to acquire those
+ * changed IOPTEs. In other words, these two are interdependent and can
+ * race.
+ *
+ * In a race, the RCU design (with its underlying memory barriers) can
+ * ensure the invalidation array to always get updated before loaded.
+ *
+ * smp_mb() is used here, paired with the smp_mb() following the array
+ * update in a concurrent attach, to ensure:
+ * - HW sees the new IOPTEs if it walks after STE installation
+ * - Invalidation thread sees the updated array with the new ASID.
+ *
+ * [CPU0] | [CPU1]
+ * |
+ * change IOPTEs and TLB flush: |
+ * arm_smmu_domain_inv_range() { | arm_smmu_install_new_domain_invs {
+ * ... | rcu_assign_pointer(new_invs);
+ * smp_mb(); // ensure IOPTEs | smp_mb(); // ensure new_invs
+ * ... | kfree_rcu(old_invs, rcu);
+ * // load invalidation array | }
+ * invs = rcu_dereference(); | arm_smmu_install_ste_for_dev {
+ * | STE = TTB0 // read new IOPTEs
+ */
+ smp_mb();
+
+ rcu_read_lock();
+
+ while (true) {
+ invs = rcu_dereference(smmu_domain->invs);
+
+ /*
+ * Avoid locking unless ATS is being used. No ATS invalidate can
+ * be going on after a domain is detached.
+ */
+ locked = false;
+ if (invs->has_ats || READ_ONCE(invs->old)) {
+ read_lock(&invs->rwlock);
+ if (invs->old) {
+ read_unlock(&invs->rwlock);
+ continue;
+ }
+ locked = true;
+ }
+ break;
+ }
+
+ cur = invs->inv;
+ end = cur + READ_ONCE(invs->num_invs);
+ /* Skip any leading entry marked as trash */
+ for (; cur != end; cur++)
+ if (refcount_read(&cur->users))
+ break;
+ while (cur != end) {
+ struct arm_smmu_device *smmu = cur->smmu;
+ struct arm_smmu_cmdq_ent cmd = {
+ /*
+ * Pick size_opcode to run arm_smmu_get_cmdq(). This can
+ * be changed to nsize_opcode, which would result in the
+ * same CMDQ pointer.
+ */
+ .opcode = cur->size_opcode,
+ };
+ struct arm_smmu_inv *next;
+
+ if (!cmds.num)
+ arm_smmu_cmdq_batch_init(smmu, &cmds, &cmd);
+
+ switch (cur->type) {
+ case INV_TYPE_S1_ASID:
+ cmd.tlbi.asid = cur->id;
+ cmd.tlbi.leaf = leaf;
+ arm_smmu_inv_to_cmdq_batch(cur, &cmds, &cmd, iova, size,
+ granule);
+ break;
+ case INV_TYPE_S2_VMID:
+ cmd.tlbi.vmid = cur->id;
+ cmd.tlbi.leaf = leaf;
+ arm_smmu_inv_to_cmdq_batch(cur, &cmds, &cmd, iova, size,
+ granule);
+ break;
+ case INV_TYPE_S2_VMID_S1_CLEAR:
+ /* CMDQ_OP_TLBI_S12_VMALL already flushed S1 entries */
+ if (arm_smmu_inv_size_too_big(cur->smmu, size, granule))
+ continue;
+ cmd.tlbi.vmid = cur->id;
+ arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
+ break;
+ case INV_TYPE_ATS:
+ arm_smmu_atc_inv_to_cmd(cur->ssid, iova, size, &cmd);
+ cmd.atc.sid = cur->id;
+ arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
+ break;
+ case INV_TYPE_ATS_FULL:
+ arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
+ cmd.atc.sid = cur->id;
+ arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ continue;
+ }
+
+ /* Skip any trash entry in-between */
+ for (next = cur + 1; next != end; next++)
+ if (refcount_read(&next->users))
+ break;
+
+ if (cmds.num &&
+ (next == end || arm_smmu_invs_end_batch(cur, next)))
+ arm_smmu_cmdq_batch_submit(smmu, &cmds);
+ cur = next;
+ }
+ if (locked)
+ read_unlock(&invs->rwlock);
+ rcu_read_unlock();
+}
+
static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
unsigned long iova, size_t granule,
void *cookie)
--
2.43.0
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
2025-09-08 23:27 ` [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
@ 2025-09-24 21:56 ` Jason Gunthorpe
2025-09-29 21:00 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2025-09-24 21:56 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Mon, Sep 08, 2025 at 04:27:01PM -0700, Nicolin Chen wrote:
Maybe:
Each smmu_domain now has an arm_smmu_invs array that specifies the
invalidation steps to perform after any change to the IOPTEs. This
includes the basic ASID/VMID cases, the special case for nesting, and ATS.
Introduce a new arm_smmu_domain_inv helper that iterates smmu_domain->invs
to convert the invalidation array into commands. Any invalidation request
with no size specified means an entire flush rather than a range-based one.
Take advantage of the sorted list to batch compatible same-SMMU operations
together. For instance, ATS invalidations for multiple SIDs can be pushed
as a single batch.
ATS invalidations must be completed before the driver disables ATS;
otherwise the device is permitted to ignore the racing invalidation and
cause an SMMU timeout. This sequencing is done with an rwlock, where
holding the write side means there are no outstanding ATS invalidations.
If ATS is not used, the rwlock is ignored, similar to the existing code.
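To make the batching rule above concrete, here is a small stand-alone
model. The three boundary conditions mirror arm_smmu_invs_end_batch() in
this patch; the struct, the sample array and the printouts are purely
illustrative and not driver code:

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for the invalidation types used by the series */
enum inv_type { INV_S1_ASID, INV_S2_VMID, INV_S2_VMID_S1_CLEAR, INV_ATS };

struct inv {
	int smmu;		/* which SMMU instance owns the entry */
	enum inv_type type;	/* what kind of invalidation it is */
};

static bool is_ats(const struct inv *i)
{
	return i->type == INV_ATS;
}

/* Mirrors the three boundary rules of arm_smmu_invs_end_batch() */
static bool end_batch(const struct inv *cur, const struct inv *next)
{
	/* Changing SMMU means changing command queue */
	if (cur->smmu != next->smmu)
		return true;
	/* The S2 TLBI batch must complete before the nested S1 clear */
	if (cur->type != INV_S2_VMID_S1_CLEAR &&
	    next->type == INV_S2_VMID_S1_CLEAR)
		return true;
	/* ATS must follow a sync of the S1/S2 invalidations */
	if (!is_ats(cur) && is_ats(next))
		return true;
	return false;
}

int main(void)
{
	/* A sorted array like the one built per domain in this series */
	struct inv invs[] = {
		{ 0, INV_S1_ASID }, { 0, INV_S1_ASID }, { 0, INV_ATS },
		{ 1, INV_S2_VMID }, { 1, INV_S2_VMID_S1_CLEAR },
	};
	int num = sizeof(invs) / sizeof(invs[0]);

	for (int i = 0; i < num; i++) {
		printf("add entry %d to the batch\n", i);
		if (i + 1 == num || end_batch(&invs[i], &invs[i + 1]))
			printf("-- submit batch + CMD_SYNC --\n");
	}
	return 0;
}

Each "submit" point corresponds to an arm_smmu_cmdq_batch_submit(), which
issues the pending commands followed by a CMD_SYNC before the next group
starts.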
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index e4e0e066108cc..c73a94514c6d6 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -675,9 +675,15 @@ struct arm_smmu_inv {
> refcount_t users; /* users=0 to mark as a trash to be purged */
> };
>
> +static inline bool arm_smmu_inv_is_ats(struct arm_smmu_inv *inv)
> +{
> + return inv->type == INV_TYPE_ATS || inv->type == INV_TYPE_ATS_FULL;
> +}
I would put these has_ats related infrastructure hunks in the first
patch adding arm_smmu_invs
Jason
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range()
2025-09-24 21:56 ` Jason Gunthorpe
@ 2025-09-29 21:00 ` Nicolin Chen
0 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-29 21:00 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: will, robin.murphy, joro, jean-philippe, miko.lenczewski, balbirs,
peterz, smostafa, kevin.tian, praan, linux-arm-kernel, iommu,
linux-kernel, patches
On Wed, Sep 24, 2025 at 06:56:13PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 04:27:01PM -0700, Nicolin Chen wrote:
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index e4e0e066108cc..c73a94514c6d6 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -675,9 +675,15 @@ struct arm_smmu_inv {
> > refcount_t users; /* users=0 to mark as a trash to be purged */
> > };
> >
> > +static inline bool arm_smmu_inv_is_ats(struct arm_smmu_inv *inv)
> > +{
> > + return inv->type == INV_TYPE_ATS || inv->type == INV_TYPE_ATS_FULL;
> > +}
>
> I would put these has_ats related infrastructure hunks in the first
> patch adding arm_smmu_invs
OK. I was trying to move this (and the chunks in the other thread)
closer to where they are being used for a cleaner review.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH rfcv2 8/8] iommu/arm-smmu-v3: Perform per-domain invalidations using arm_smmu_invs
2025-09-08 23:26 [PATCH rfcv2 0/8] iommu/arm-smmu-v3: Introduce an RCU-protected invalidation array Nicolin Chen
` (6 preceding siblings ...)
2025-09-08 23:27 ` [PATCH rfcv2 7/8] iommu/arm-smmu-v3: Add arm_smmu_invs based arm_smmu_domain_inv_range() Nicolin Chen
@ 2025-09-08 23:27 ` Nicolin Chen
7 siblings, 0 replies; 35+ messages in thread
From: Nicolin Chen @ 2025-09-08 23:27 UTC (permalink / raw)
To: jgg, will, robin.murphy
Cc: joro, jean-philippe, miko.lenczewski, balbirs, peterz, smostafa,
kevin.tian, praan, linux-arm-kernel, iommu, linux-kernel, patches
Replace the old invalidation functions with arm_smmu_domain_inv_range() in
all the existing invalidation routines, and drop the old functions.
The new arm_smmu_domain_inv_range() handles the CMDQ_MAX_TLBI_OPS threshold
as well, so drop it from the SVA function.
Since arm_smmu_cmdq_batch_add_range() now has only one caller, which must
pass a valid size, add a WARN_ON_ONCE to catch any missed case.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
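For reference, a rough worked example of the threshold that replaces
CMDQ_MAX_TLBI_OPS, as now computed in arm_smmu_inv_size_too_big(); the
4KiB granule below is only an assumed example, not something fixed by
this patch:

	max_tlbi_ops = 1 << (ilog2(4096) - 3) = 1 << 9 = 512
	max_tlbi_ops * granule = 512 * 4KiB = 2MiB

So on an SMMU without ARM_SMMU_FEAT_RANGE_INV, an invalidation covering
2MiB or more is issued as a single "nsize_opcode" command (e.g. by-ASID
or by-VMID) instead of per-granule TLBIs.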
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 -
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 29 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 165 +-----------------
3 files changed, 11 insertions(+), 190 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index c73a94514c6d6..e9f97301ded31 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1072,13 +1072,6 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
struct arm_smmu_cd *cd, struct iommu_domain *old);
-void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
-void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
- size_t granule, bool leaf,
- struct arm_smmu_domain *smmu_domain);
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- unsigned long iova, size_t size);
-
void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
unsigned long iova, size_t size,
unsigned int granule, bool leaf);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index fc601b494e0af..048b53f79b144 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -122,15 +122,6 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
}
EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd);
-/*
- * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
- * is used as a threshold to replace per-page TLBI commands to issue in the
- * command queue with an address-space TLBI command, when SMMU w/o a range
- * invalidation feature handles too many per-page TLBI commands, which will
- * otherwise result in a soft lockup.
- */
-#define CMDQ_MAX_TLBI_OPS (1 << (PAGE_SHIFT - 3))
-
static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
@@ -146,21 +137,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
* range. So do a simple translation here by calculating size correctly.
*/
size = end - start;
- if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV)) {
- if (size >= CMDQ_MAX_TLBI_OPS * PAGE_SIZE)
- size = 0;
- } else {
- if (size == ULONG_MAX)
- size = 0;
- }
-
- if (!size)
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
- else
- arm_smmu_tlb_inv_range_asid(start, size, smmu_domain->cd.asid,
- PAGE_SIZE, false, smmu_domain);
- arm_smmu_atc_inv_domain(smmu_domain, start, size);
+ arm_smmu_domain_inv_range(smmu_domain, start, size, PAGE_SIZE, false);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
@@ -191,8 +169,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
}
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
- arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
+ arm_smmu_domain_inv(smmu_domain);
}
static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn)
@@ -301,7 +278,7 @@ static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
/*
* Ensure the ASID is empty in the iommu cache before allowing reuse.
*/
- arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid);
+ arm_smmu_domain_inv(smmu_domain);
/*
* Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c06d2dd893c11..4d8b1230f8bb9 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1023,16 +1023,6 @@ static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused
}
/* Context descriptor manipulation functions */
-void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
-{
- struct arm_smmu_cmdq_ent cmd = {
- .opcode = smmu->features & ARM_SMMU_FEAT_E2H ?
- CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID,
- .tlbi.asid = asid,
- };
-
- arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
-}
static int arm_smmu_invs_cmp(const void *_l, const void *_r)
{
@@ -2393,74 +2383,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
return arm_smmu_cmdq_batch_submit(master->smmu, &cmds);
}
-int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
- unsigned long iova, size_t size)
-{
- struct arm_smmu_master_domain *master_domain;
- int i;
- unsigned long flags;
- struct arm_smmu_cmdq_ent cmd = {
- .opcode = CMDQ_OP_ATC_INV,
- };
- struct arm_smmu_cmdq_batch cmds;
-
- if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
- return 0;
-
- /*
- * Ensure that we've completed prior invalidation of the main TLBs
- * before we read 'nr_ats_masters' in case of a concurrent call to
- * arm_smmu_enable_ats():
- *
- * // unmap() // arm_smmu_enable_ats()
- * TLBI+SYNC atomic_inc(&nr_ats_masters);
- * smp_mb(); [...]
- * atomic_read(&nr_ats_masters); pci_enable_ats() // writel()
- *
- * Ensures that we always see the incremented 'nr_ats_masters' count if
- * ATS was enabled at the PCI device before completion of the TLBI.
- */
- smp_mb();
- if (!atomic_read(&smmu_domain->nr_ats_masters))
- return 0;
-
- arm_smmu_cmdq_batch_init(smmu_domain->smmu, &cmds, &cmd);
-
- spin_lock_irqsave(&smmu_domain->devices_lock, flags);
- list_for_each_entry(master_domain, &smmu_domain->devices,
- devices_elm) {
- struct arm_smmu_master *master = master_domain->master;
-
- if (!master->ats_enabled)
- continue;
-
- if (master_domain->nested_ats_flush) {
- /*
- * If a S2 used as a nesting parent is changed we have
- * no option but to completely flush the ATC.
- */
- arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);
- } else {
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size,
- &cmd);
- }
-
- for (i = 0; i < master->num_streams; i++) {
- cmd.atc.sid = master->streams[i].id;
- arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
- }
- }
- spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
-
- return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds);
-}
-
/* IO_PGTABLE API */
static void arm_smmu_tlb_inv_context(void *cookie)
{
struct arm_smmu_domain *smmu_domain = cookie;
- struct arm_smmu_device *smmu = smmu_domain->smmu;
- struct arm_smmu_cmdq_ent cmd;
/*
* NOTE: when io-pgtable is in non-strict mode, we may get here with
@@ -2469,14 +2395,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
* insertion to guarantee those are observed before the TLBI. Do be
* careful, 007.
*/
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
- } else {
- cmd.opcode = CMDQ_OP_TLBI_S12_VMALL;
- cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
- arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
- }
- arm_smmu_atc_inv_domain(smmu_domain, 0, 0);
+ arm_smmu_domain_inv(smmu_domain);
}
static void arm_smmu_cmdq_batch_add_range(struct arm_smmu_device *smmu,
@@ -2488,7 +2407,7 @@ static void arm_smmu_cmdq_batch_add_range(struct arm_smmu_device *smmu,
unsigned long end = iova + size, num_pages = 0, tg = pgsize;
size_t inv_range = granule;
- if (!size)
+ if (WARN_ON_ONCE(!size))
return;
if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
@@ -2543,76 +2462,6 @@ static void arm_smmu_cmdq_batch_add_range(struct arm_smmu_device *smmu,
}
}
-static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
- unsigned long iova, size_t size,
- size_t granule,
- struct arm_smmu_domain *smmu_domain)
-{
- struct arm_smmu_device *smmu = smmu_domain->smmu;
- struct arm_smmu_cmdq_batch cmds;
- size_t pgsize;
-
- /* Get the leaf page size */
- pgsize = __ffs(smmu_domain->domain.pgsize_bitmap);
-
- arm_smmu_cmdq_batch_init(smmu, &cmds, cmd);
- arm_smmu_cmdq_batch_add_range(smmu, &cmds, cmd, iova, size, granule,
- pgsize);
- arm_smmu_cmdq_batch_submit(smmu, &cmds);
-}
-
-static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
- size_t granule, bool leaf,
- struct arm_smmu_domain *smmu_domain)
-{
- struct arm_smmu_cmdq_ent cmd = {
- .tlbi = {
- .leaf = leaf,
- },
- };
-
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
- CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
- cmd.tlbi.asid = smmu_domain->cd.asid;
- } else {
- cmd.opcode = CMDQ_OP_TLBI_S2_IPA;
- cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
- }
- __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
-
- if (smmu_domain->nest_parent) {
- /*
- * When the S2 domain changes all the nested S1 ASIDs have to be
- * flushed too.
- */
- cmd.opcode = CMDQ_OP_TLBI_NH_ALL;
- arm_smmu_cmdq_issue_cmd_with_sync(smmu_domain->smmu, &cmd);
- }
-
- /*
- * Unfortunately, this can't be leaf-only since we may have
- * zapped an entire table.
- */
- arm_smmu_atc_inv_domain(smmu_domain, iova, size);
-}
-
-void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
- size_t granule, bool leaf,
- struct arm_smmu_domain *smmu_domain)
-{
- struct arm_smmu_cmdq_ent cmd = {
- .opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
- CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA,
- .tlbi = {
- .asid = asid,
- .leaf = leaf,
- },
- };
-
- __arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
-}
-
static bool arm_smmu_inv_size_too_big(struct arm_smmu_device *smmu, size_t size,
size_t granule)
{
@@ -2814,7 +2663,9 @@ static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
size_t granule, void *cookie)
{
- arm_smmu_tlb_inv_range_domain(iova, size, granule, false, cookie);
+ struct arm_smmu_domain *smmu_domain = cookie;
+
+ arm_smmu_domain_inv_range(smmu_domain, iova, size, granule, false);
}
static const struct iommu_flush_ops arm_smmu_flush_ops = {
@@ -4123,9 +3974,9 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain,
if (!gather->pgsize)
return;
- arm_smmu_tlb_inv_range_domain(gather->start,
- gather->end - gather->start + 1,
- gather->pgsize, true, smmu_domain);
+ arm_smmu_domain_inv_range(smmu_domain, gather->start,
+ gather->end - gather->start + 1,
+ gather->pgsize, true);
}
static phys_addr_t
--
2.43.0
^ permalink raw reply related [flat|nested] 35+ messages in thread