* [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances
@ 2026-02-23 20:27 Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd() Nicolin Chen
` (9 more replies)
0 siblings, 10 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
In a system with multiple physical SMMU instances, multiple devices can be
passed through to a VM. Currently, a VM allocates one domain per SMMU
instance, which might be shared across devices that sit behind that same
SMMU instance. However, the gPA->PA mappings (either an S1 unmanaged domain
or an S2 nesting parent domain) can also be shared across devices that sit
behind different SMMU instances, provided that the shared I/O page table is
compatible with all the SMMU instances.
The major difficulty in sharing the domain has been invalidation, since a
change to the shared I/O page table results in an invalidation on all SMMU
instances. A traditional approach involves building a linked list of SMMUs
within the domain, which is very inefficient for the invalidation path as
the linked list has to be locked.
To address this, the SMMUv3 driver now uses an RCU-protected invalidation
array. Any new device (and its SMMU) is preloaded into the array during a
device attachment. This array maintains all necessary information, such as
ASID/VMID and which SMMU instance (CMDQ) to issue the command to.
The second issue concerns the lifecycle of the iotlb tag. Currently, ASID
or VMID is allocated per domain and kept in the domain structure (cd->asid
or s2_cfg->vmid). This does not work ideally when the domain (e.g. S2) is
shared, as the VMID will have to be global across all SMMU instances, even
if a VM is not using all of them. This results in wasted VMID resources in
the bitmaps of unused SMMU instances.
Instead, an iotlb tag should be allocated per SMMU instance. Consequently,
these tags must be allocated and maintained separately. Since ASID or VMID
is only used when a CD or STE is installed to the HW (which happens during
device attachment), and the invalidation array is built right before that,
it is ideal to allocate a new iotlb tag before arm_smmu_invs_merge():
- When a device attaches, the driver first searches for an existing iotlb
  tag for the SMMU the device sits behind
- If a match is found, the "users" counter is incremented
- Otherwise, a new tag is allocated
A nested domain case is slightly unique, as certain HW requires the VMID at
the vSMMU init stage vs. at a device attachment (to the nested domain). Thus:
- allocate/free a vmid in vsmmu_init/vsmmu_destroy and store it in the vSMMU
- introduce an INV_TYPE_S2_VMID_VSMMU to separate it from a naked S2 case
- retrieve the vmid from the vSMMU during attachment instead of allocation
With this, remove cd->asid and s2_cfg->vmid from struct arm_smmu_domain,
and replace them with the iotlb tags stored in the smmu_domain->invs array.
Finally, allow sharing a domain across the SMMU instances, so long as they
all pass a compatibility test.
This is on Github:
https://github.com/nicolinc/iommufd/commits/smmuv3_share_domain-v3
This is based on the series "Introduce an RCU-protected invalidation array"
https://lore.kernel.org/all/cover.1766013662.git.nicolinc@nvidia.com/
So the whole implementation follows the path Jason envisioned initially.
An earlier effort to share the S2 domain can be found at:
https://lore.kernel.org/all/cover.1744692494.git.nicolinc@nvidia.com/
Changelog
v3
* Rebase on arm_smmu_invs-v12
* Add Reviewed-by tags from Jason
* Avoid Boolean function parameters
* Set users counter in arm_smmu_invs_unref()
* Add arm_smmu_inv_assert_iotlb_tag() helper
* Fix the return values in arm_smmu_alloc_iotlb_tag()
* Reorder the patches introducing INV_TYPE_S2_VMID_VSMMU
* Add a note explaining the lifecycle of vSMMU-owned iotlb tag
* Compare pgtbl with the new smmu in arm_smmu_domain_can_share()
* Pass in a function pointer to arm_smmu_set_pasid() for CD making
* Pass the raw domain pointer down to arm_smmu_domain_get_iotlb_tag()
v2
https://lore.kernel.org/all/cover.1769044718.git.nicolinc@nvidia.com/
* Add arm_smmu_domain_get_iotlb_tag()
* Drop asid array and vmid from master structure, and get the iotlb
tag in the smmu_domain->invs array
* Introduce INV_TYPE_S2_VMID_VSMMU for vSMMU type, and separate the
nested attach case from a naked S2 attach case
v1
https://lore.kernel.org/all/cover.1766088962.git.nicolinc@nvidia.com/
Thanks
Nicolin
Nicolin Chen (10):
iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd()
iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid()
iommu/arm-smmu-v3: Store IOTLB cache tags in struct
arm_smmu_attach_state
iommu/arm-smmu-v3: Pass in IOTLB cache tag to
arm_smmu_master_build_invs()
iommu/arm-smmu-v3: Pass in IOTLB cache tag to CD and STE
iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU
iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse
iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init
iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain
iommu/arm-smmu-v3: Allow sharing domain across SMMUs
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 76 ++++-
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 52 +++-
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++--
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 19 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 290 ++++++++++--------
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 1 +
6 files changed, 305 insertions(+), 190 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd()
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 02/10] iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid() Nicolin Chen
` (8 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
Rename the existing arm_smmu_make_sva_cd() to __arm_smmu_make_sva_cd().
Add a higher-level wrapper arm_smmu_make_sva_cd() receiving smmu_domain
and master pointers, aligning with arm_smmu_make_s1_cd(). Then, the two
functions can share a common typedef'd function pointer.
No functional changes.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 ++---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 22 +++++++++++++------
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 4 ++--
3 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 36de2b0b2ebe6..dd5d2b5acf664 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1019,9 +1019,9 @@ void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu,
void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master, bool ats_enabled,
unsigned int s1dss);
-void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
- struct arm_smmu_master *master, struct mm_struct *mm,
- u16 asid);
+void __arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct mm_struct *mm, u16 asid);
struct arm_smmu_invs *arm_smmu_invs_merge(struct arm_smmu_invs *invs,
struct arm_smmu_invs *to_merge);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index f1f8e01a7e914..414fc899140f7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -48,9 +48,9 @@ static u64 page_size_to_cd(void)
}
VISIBLE_IF_KUNIT
-void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
- struct arm_smmu_master *master, struct mm_struct *mm,
- u16 asid)
+void __arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct mm_struct *mm, u16 asid)
{
u64 par;
@@ -120,7 +120,15 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
if (system_supports_poe() || system_supports_gcs())
dev_warn_once(master->smmu->dev, "SVA devices ignore permission overlays and GCS\n");
}
-EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd);
+EXPORT_SYMBOL_IF_KUNIT(__arm_smmu_make_sva_cd);
+
+static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain)
+{
+ __arm_smmu_make_sva_cd(target, master, smmu_domain->domain.mm,
+ smmu_domain->cd.asid);
+}
static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
@@ -162,8 +170,8 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- arm_smmu_make_sva_cd(&target, master, NULL,
- smmu_domain->cd.asid);
+ __arm_smmu_make_sva_cd(&target, master, NULL,
+ smmu_domain->cd.asid);
arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target);
}
@@ -265,7 +273,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
* This does not need the arm_smmu_asid_lock because SVA domains never
* get reassigned
*/
- arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid);
+ arm_smmu_make_sva_cd(&target, master, smmu_domain);
ret = arm_smmu_set_pasid(master, smmu_domain, id, &target, old);
mmput(domain->mm);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index 7b8035b1db24d..bf4412e904b01 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -546,7 +546,7 @@ static void arm_smmu_test_make_sva_cd(struct arm_smmu_cd *cd, unsigned int asid)
.smmu = &smmu,
};
- arm_smmu_make_sva_cd(cd, &master, &sva_mm, asid);
+ __arm_smmu_make_sva_cd(cd, &master, &sva_mm, asid);
}
static void arm_smmu_test_make_sva_release_cd(struct arm_smmu_cd *cd,
@@ -556,7 +556,7 @@ static void arm_smmu_test_make_sva_release_cd(struct arm_smmu_cd *cd,
.smmu = &smmu,
};
- arm_smmu_make_sva_cd(cd, &master, NULL, asid);
+ __arm_smmu_make_sva_cd(cd, &master, NULL, asid);
}
static void arm_smmu_v3_write_ste_test_s1_to_s2_stall(struct kunit *test)
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 02/10] iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid()
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd() Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 03/10] iommu/arm-smmu-v3: Store IOTLB cache tags in struct arm_smmu_attach_state Nicolin Chen
` (7 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
To install a domain (CD) to a substream, the common flow in the driver is:
- Make an S1 or SVA CD outside arm_smmu_asid_lock
- Invoke arm_smmu_set_pasid(), which takes arm_smmu_asid_lock and fixes up
  the ASID in the CD
The reason for this flow is the timing of arm_smmu_asid_lock: it was too
early to take the mutex outside the function.
Tidy it up by passing in a function pointer for CD making, which supports
both existing functions: arm_smmu_make_s1_cd() and arm_smmu_make_sva_cd().
Then arm_smmu_set_pasid() can make a CD inside the lock, where the ASID is
safe to access.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++-
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 4 ++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 ++++---------------
3 files changed, 12 insertions(+), 18 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index dd5d2b5acf664..e3a66e6bc303e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1076,9 +1076,14 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
+typedef void (*arm_smmu_make_cd_fn)(struct arm_smmu_cd *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_domain *smmu_domain);
+
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- struct arm_smmu_cd *cd, struct iommu_domain *old);
+ struct arm_smmu_cd *cd, struct iommu_domain *old,
+ arm_smmu_make_cd_fn fn);
void arm_smmu_domain_inv_range(struct arm_smmu_domain *smmu_domain,
unsigned long iova, size_t size,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 414fc899140f7..4370cb88c57cf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -273,8 +273,8 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain,
* This does not need the arm_smmu_asid_lock because SVA domains never
* get reassigned
*/
- arm_smmu_make_sva_cd(&target, master, smmu_domain);
- ret = arm_smmu_set_pasid(master, smmu_domain, id, &target, old);
+ ret = arm_smmu_set_pasid(master, smmu_domain, id, &target, old,
+ arm_smmu_make_sva_cd);
mmput(domain->mm);
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b10a68565e9df..7c075e64f842e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3733,13 +3733,8 @@ static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain,
if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
return -EINVAL;
- /*
- * We can read cd.asid outside the lock because arm_smmu_set_pasid()
- * will fix it
- */
- arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
return arm_smmu_set_pasid(master, to_smmu_domain(domain), id,
- &target_cd, old);
+ &target_cd, old, arm_smmu_make_s1_cd);
}
static void arm_smmu_update_ste(struct arm_smmu_master *master,
@@ -3769,7 +3764,8 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master,
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
- struct arm_smmu_cd *cd, struct iommu_domain *old)
+ struct arm_smmu_cd *cd, struct iommu_domain *old,
+ arm_smmu_make_cd_fn arm_smmu_make_cd_fn)
{
struct iommu_domain *sid_domain =
iommu_driver_get_domain_for_dev(master->dev);
@@ -3800,14 +3796,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (ret)
goto out_unlock;
- /*
- * We don't want to obtain to the asid_lock too early, so fix up the
- * caller set ASID under the lock in case it changed.
- */
- cd->data[0] &= ~cpu_to_le64(CTXDESC_CD_0_ASID);
- cd->data[0] |= cpu_to_le64(
- FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->cd.asid));
-
+ arm_smmu_make_cd_fn(cd, master, smmu_domain);
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 03/10] iommu/arm-smmu-v3: Store IOTLB cache tags in struct arm_smmu_attach_state
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd() Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 02/10] iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid() Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 04/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to arm_smmu_master_build_invs() Nicolin Chen
` (6 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
So far, an IOTLB tag (ASID or VMID) has been stored in the arm_smmu_domain
structure. Its lifecycle is aligned with the smmu_domain.
However, an IOTLB tag (ASID or VMID) will not be used:
1) Before being installed to CD or STE during a device attachment
2) After being removed from CD or STE during a device detachment
Both (1) and (2) exactly align with the lifecycle of smmu_domain->invs.
The bigger problem is that storing the IOTLB tag in struct arm_smmu_domain
makes it difficult to share across SMMU instances, a common use case for a
nesting parent domain.
Introduce arm_smmu_find_iotlb_tag() helper to find an existing IOTLB cache
tag in the smmu_domain->invs array.
Introduce an arm_smmu_alloc_iotlb_tag() helper, provisionally copying the
IOTLB tag from the smmu_domain (cd->asid or s2_cfg.vmid), which will later
be replaced with an actual allocation from the ASID or VMID bitmap.
(Note only the new_smmu_domain pathway is allowed to allocate a new tag.)
The returned tag will be stored in struct arm_smmu_attach_state, which will
be forwarded to arm_smmu_master_build_invs() and eventually set to a CD or
STE accordingly.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 11 +++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 84 +++++++++++++++++++++
2 files changed, 95 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index e3a66e6bc303e..11b61a19e6e53 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -674,6 +674,11 @@ struct arm_smmu_inv {
int users; /* users=0 to mark as a trash to be purged */
};
+static inline void arm_smmu_inv_assert_iotlb_tag(struct arm_smmu_inv *inv)
+{
+ WARN_ON(inv->type != INV_TYPE_S1_ASID && inv->type != INV_TYPE_S2_VMID);
+}
+
static inline bool arm_smmu_inv_is_ats(const struct arm_smmu_inv *inv)
{
return inv->type == INV_TYPE_ATS || inv->type == INV_TYPE_ATS_FULL;
@@ -1117,11 +1122,13 @@ static inline bool arm_smmu_master_canwbs(struct arm_smmu_master *master)
* @new_invs: for new domain, this is the new invs array to update domain->invs;
* for old domain, this is the master->build_invs to pass in as the
* to_unref argument to an arm_smmu_invs_unref() call
+ * @tag: IOTLB cache tag (INV_TYPE_S1_ASID or INV_TYPE_S2_VMID)
*/
struct arm_smmu_inv_state {
struct arm_smmu_invs __rcu **invs_ptr;
struct arm_smmu_invs *old_invs;
struct arm_smmu_invs *new_invs;
+ struct arm_smmu_inv tag;
};
struct arm_smmu_attach_state {
@@ -1138,6 +1145,10 @@ struct arm_smmu_attach_state {
bool ats_enabled;
};
+int arm_smmu_find_iotlb_tag(struct iommu_domain *domain,
+ struct arm_smmu_device *smmu,
+ struct arm_smmu_inv *tag);
+
int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
struct iommu_domain *new_domain);
void arm_smmu_attach_commit(struct arm_smmu_attach_state *state);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7c075e64f842e..2033468dbf1e8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3200,6 +3200,77 @@ static void arm_smmu_disable_iopf(struct arm_smmu_master *master,
iopf_queue_remove_device(master->smmu->evtq.iopf, master->dev);
}
+static int __arm_smmu_domain_find_iotlb_tag(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag)
+{
+ struct arm_smmu_invs *invs = rcu_dereference_protected(
+ smmu_domain->invs, lockdep_is_held(&arm_smmu_asid_lock));
+ size_t i;
+
+ arm_smmu_inv_assert_iotlb_tag(tag);
+
+ for (i = 0; i != invs->num_invs; i++) {
+ if (invs->inv[i].type == tag->type &&
+ invs->inv[i].smmu == tag->smmu &&
+ READ_ONCE(invs->inv[i].users)) {
+ *tag = invs->inv[i];
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
+/* Find an existing IOTLB cache tag in smmu_domain->invs (users counter != 0) */
+int arm_smmu_find_iotlb_tag(struct iommu_domain *domain,
+ struct arm_smmu_device *smmu,
+ struct arm_smmu_inv *tag)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
+
+ if (WARN_ON(!smmu_domain))
+ return -EINVAL;
+
+ /* Decide the type of the iotlb cache tag */
+ switch (smmu_domain->stage) {
+ case ARM_SMMU_DOMAIN_SVA:
+ case ARM_SMMU_DOMAIN_S1:
+ tag->type = INV_TYPE_S1_ASID;
+ break;
+ case ARM_SMMU_DOMAIN_S2:
+ tag->type = INV_TYPE_S2_VMID;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ tag->smmu = smmu;
+
+ return __arm_smmu_domain_find_iotlb_tag(smmu_domain, tag);
+}
+
+/* Allocate a new IOTLB cache tag (users counter == 0) */
+static int arm_smmu_alloc_iotlb_tag(struct iommu_domain *domain,
+ struct arm_smmu_device *smmu,
+ struct arm_smmu_inv *tag)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain);
+ int ret;
+
+ /* Only allocate if there is no IOTLB cache tag to re-use */
+ ret = arm_smmu_find_iotlb_tag(domain, smmu, tag);
+ if (!ret || ret != -ENOENT)
+ return ret;
+
+ /* FIXME replace with an actual allocation from the bitmap */
+ if (tag->type == INV_TYPE_S1_ASID)
+ tag->id = smmu_domain->cd.asid;
+ else
+ tag->id = smmu_domain->s2_cfg.vmid;
+
+ return 0;
+}
+
static struct arm_smmu_inv *
arm_smmu_master_build_inv(struct arm_smmu_master *master,
enum arm_smmu_inv_type type, u32 id, ioasid_t ssid,
@@ -3365,7 +3436,9 @@ static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
struct arm_smmu_domain *new_smmu_domain =
to_smmu_domain_devices(new_domain);
struct arm_smmu_master *master = state->master;
+ struct arm_smmu_device *smmu = master->smmu;
ioasid_t ssid = state->ssid;
+ int ret;
/*
* At this point a NULL domain indicates the domain doesn't use the
@@ -3379,6 +3452,11 @@ static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
invst->old_invs = rcu_dereference_protected(
new_smmu_domain->invs,
lockdep_is_held(&arm_smmu_asid_lock));
+
+ ret = arm_smmu_alloc_iotlb_tag(new_domain, smmu, &invst->tag);
+ if (ret)
+ return ret;
+
build_invs = arm_smmu_master_build_invs(
master, state->ats_enabled, ssid, new_smmu_domain);
if (!build_invs)
@@ -3401,6 +3479,12 @@ static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
invst->old_invs = rcu_dereference_protected(
old_smmu_domain->invs,
lockdep_is_held(&arm_smmu_asid_lock));
+
+ ret = arm_smmu_find_iotlb_tag(state->old_domain, smmu,
+ &invst->tag);
+ if (WARN_ON(ret))
+ return ret;
+
/* For old_smmu_domain, new_invs points to master->build_invs */
invst->new_invs = arm_smmu_master_build_invs(
master, master->ats_enabled, ssid, old_smmu_domain);
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 04/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to arm_smmu_master_build_invs()
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (2 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 03/10] iommu/arm-smmu-v3: Store IOTLB cache tags in struct arm_smmu_attach_state Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 05/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to CD and STE Nicolin Chen
` (5 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
Now struct arm_smmu_attach_state carries an IOTLB cache tag in invst->tag.
Instead of getting the tag from the smmu_domain again, pass invst->tag in
to arm_smmu_master_build_invs(). This simplifies the function.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 37 ++++++++-------------
1 file changed, 13 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2033468dbf1e8..4649368910e0c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3330,7 +3330,8 @@ arm_smmu_master_build_inv(struct arm_smmu_master *master,
*/
static struct arm_smmu_invs *
arm_smmu_master_build_invs(struct arm_smmu_master *master, bool ats_enabled,
- ioasid_t ssid, struct arm_smmu_domain *smmu_domain)
+ ioasid_t ssid, struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag)
{
const bool nesting = smmu_domain->nest_parent;
size_t pgsize = 0, i;
@@ -3343,30 +3344,15 @@ arm_smmu_master_build_invs(struct arm_smmu_master *master, bool ats_enabled,
if (master->smmu->features & ARM_SMMU_FEAT_RANGE_INV)
pgsize = __ffs(smmu_domain->domain.pgsize_bitmap);
- switch (smmu_domain->stage) {
- case ARM_SMMU_DOMAIN_SVA:
- case ARM_SMMU_DOMAIN_S1:
- if (!arm_smmu_master_build_inv(master, INV_TYPE_S1_ASID,
- smmu_domain->cd.asid,
- IOMMU_NO_PASID, pgsize))
- return NULL;
- break;
- case ARM_SMMU_DOMAIN_S2:
- if (!arm_smmu_master_build_inv(master, INV_TYPE_S2_VMID,
- smmu_domain->s2_cfg.vmid,
- IOMMU_NO_PASID, pgsize))
- return NULL;
- break;
- default:
- WARN_ON(true);
+ if (!arm_smmu_master_build_inv(master, tag->type, tag->id,
+ IOMMU_NO_PASID, pgsize))
return NULL;
- }
/* All the nested S1 ASIDs have to be flushed when S2 parent changes */
if (nesting) {
- if (!arm_smmu_master_build_inv(
- master, INV_TYPE_S2_VMID_S1_CLEAR,
- smmu_domain->s2_cfg.vmid, IOMMU_NO_PASID, 0))
+ if (!arm_smmu_master_build_inv(master,
+ INV_TYPE_S2_VMID_S1_CLEAR,
+ tag->id, IOMMU_NO_PASID, 0))
return NULL;
}
@@ -3457,8 +3443,10 @@ static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
if (ret)
return ret;
- build_invs = arm_smmu_master_build_invs(
- master, state->ats_enabled, ssid, new_smmu_domain);
+ build_invs = arm_smmu_master_build_invs(master,
+ state->ats_enabled,
+ ssid, new_smmu_domain,
+ &invst->tag);
if (!build_invs)
return -EINVAL;
@@ -3487,7 +3475,8 @@ static int arm_smmu_attach_prepare_invs(struct arm_smmu_attach_state *state,
/* For old_smmu_domain, new_invs points to master->build_invs */
invst->new_invs = arm_smmu_master_build_invs(
- master, master->ats_enabled, ssid, old_smmu_domain);
+ master, master->ats_enabled, ssid, old_smmu_domain,
+ &invst->tag);
}
return 0;
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 05/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to CD and STE
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (3 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 04/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to arm_smmu_master_build_invs() Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 06/10] iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU Nicolin Chen
` (4 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
Now, struct arm_smmu_attach_state carries the IOTLB cache tags.
Pass them down to arm_smmu_make_s1_cd() and arm_smmu_make_s2_domain_ste()
to set in the CD and STE, removing the use of smmu_domain for the ASID/VMID.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +++---
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 26 ++++++++++++-------
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 23 +++++++++++-----
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 12 +++++++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++------
5 files changed, 62 insertions(+), 29 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 11b61a19e6e53..3b91e4596ffee 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1010,7 +1010,7 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain,
- bool ats_enabled);
+ struct arm_smmu_inv *tag, bool ats_enabled);
#if IS_ENABLED(CONFIG_KUNIT)
void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits);
@@ -1076,14 +1076,16 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
u32 ssid);
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain);
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag);
void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
struct arm_smmu_cd *cdptr,
const struct arm_smmu_cd *target);
typedef void (*arm_smmu_make_cd_fn)(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain);
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag);
int arm_smmu_set_pasid(struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain, ioasid_t pasid,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index ddae0b07c76b5..a77c60321203c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -39,12 +39,15 @@ void *arm_smmu_hw_info(struct device *dev, u32 *length,
return info;
}
-static void arm_smmu_make_nested_cd_table_ste(
- struct arm_smmu_ste *target, struct arm_smmu_master *master,
- struct arm_smmu_nested_domain *nested_domain, bool ats_enabled)
+static void
+arm_smmu_make_nested_cd_table_ste(struct arm_smmu_ste *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_nested_domain *nested_domain,
+ struct arm_smmu_inv *tag, bool ats_enabled)
{
- arm_smmu_make_s2_domain_ste(
- target, master, nested_domain->vsmmu->s2_parent, ats_enabled);
+ arm_smmu_make_s2_domain_ste(target, master,
+ nested_domain->vsmmu->s2_parent, tag,
+ ats_enabled);
target->data[0] = cpu_to_le64(STRTAB_STE_0_V |
FIELD_PREP(STRTAB_STE_0_CFG,
@@ -64,9 +67,11 @@ static void arm_smmu_make_nested_cd_table_ste(
* - Bypass STE (install the S2, no CD table)
* - CD table STE (install the S2 and the userspace CD table)
*/
-static void arm_smmu_make_nested_domain_ste(
- struct arm_smmu_ste *target, struct arm_smmu_master *master,
- struct arm_smmu_nested_domain *nested_domain, bool ats_enabled)
+static void
+arm_smmu_make_nested_domain_ste(struct arm_smmu_ste *target,
+ struct arm_smmu_master *master,
+ struct arm_smmu_nested_domain *nested_domain,
+ struct arm_smmu_inv *tag, bool ats_enabled)
{
unsigned int cfg =
FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(nested_domain->ste[0]));
@@ -82,12 +87,12 @@ static void arm_smmu_make_nested_domain_ste(
switch (cfg) {
case STRTAB_STE_0_CFG_S1_TRANS:
arm_smmu_make_nested_cd_table_ste(target, master, nested_domain,
- ats_enabled);
+ tag, ats_enabled);
break;
case STRTAB_STE_0_CFG_BYPASS:
arm_smmu_make_s2_domain_ste(target, master,
nested_domain->vsmmu->s2_parent,
- ats_enabled);
+ tag, ats_enabled);
break;
case STRTAB_STE_0_CFG_ABORT:
default:
@@ -187,6 +192,7 @@ static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
}
arm_smmu_make_nested_domain_ste(&ste, master, nested_domain,
+ &state.new_domain_invst.tag,
state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &ste);
arm_smmu_attach_commit(&state);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 4370cb88c57cf..846a278fa5469 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -24,12 +24,18 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain)
list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) {
struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd *cdptr;
+ struct arm_smmu_inv tag;
cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ if (WARN_ON(arm_smmu_find_iotlb_tag(&smmu_domain->domain,
+ master->smmu, &tag)))
+ continue;
+ if (WARN_ON(tag.type != INV_TYPE_S1_ASID))
+ continue;
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain, &tag);
arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target_cd);
}
@@ -124,10 +130,10 @@ EXPORT_SYMBOL_IF_KUNIT(__arm_smmu_make_sva_cd);
static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag)
{
- __arm_smmu_make_sva_cd(target, master, smmu_domain->domain.mm,
- smmu_domain->cd.asid);
+ __arm_smmu_make_sva_cd(target, master, smmu_domain->domain.mm, tag->id);
}
static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
@@ -166,12 +172,17 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
struct arm_smmu_master *master = master_domain->master;
struct arm_smmu_cd target;
struct arm_smmu_cd *cdptr;
+ struct arm_smmu_inv tag;
cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid);
if (WARN_ON(!cdptr))
continue;
- __arm_smmu_make_sva_cd(&target, master, NULL,
- smmu_domain->cd.asid);
+ if (WARN_ON(arm_smmu_find_iotlb_tag(&smmu_domain->domain,
+ master->smmu, &tag)))
+ continue;
+ if (WARN_ON(tag.type != INV_TYPE_S1_ASID))
+ continue;
+ __arm_smmu_make_sva_cd(&target, master, NULL, tag.id);
arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr,
&target);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index bf4412e904b01..c0cdded058fc8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -347,6 +347,9 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
struct arm_smmu_domain smmu_domain = {
.pgtbl_ops = &io_pgtable.ops,
};
+ struct arm_smmu_inv tag = {
+ .type = INV_TYPE_S2_VMID,
+ };
io_pgtable.cfg.arm_lpae_s2_cfg.vttbr = 0xdaedbeefdeadbeefULL;
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.ps = 1;
@@ -357,7 +360,8 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste,
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3;
io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4;
- arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled);
+ arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, &tag,
+ ats_enabled);
}
static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test)
@@ -502,6 +506,10 @@ static void arm_smmu_test_make_s1_cd(struct arm_smmu_cd *cd, unsigned int asid)
.asid = asid,
},
};
+ struct arm_smmu_inv tag = {
+ .type = INV_TYPE_S1_ASID,
+ .id = asid,
+ };
io_pgtable.cfg.arm_lpae_s1_cfg.ttbr = 0xdaedbeefdeadbeefULL;
io_pgtable.cfg.arm_lpae_s1_cfg.tcr.ips = 1;
@@ -512,7 +520,7 @@ static void arm_smmu_test_make_s1_cd(struct arm_smmu_cd *cd, unsigned int asid)
io_pgtable.cfg.arm_lpae_s1_cfg.tcr.tsz = 4;
io_pgtable.cfg.arm_lpae_s1_cfg.mair = 0xabcdef012345678ULL;
- arm_smmu_make_s1_cd(cd, &master, &smmu_domain);
+ arm_smmu_make_s1_cd(cd, &master, &smmu_domain, &tag);
}
static void arm_smmu_v3_write_cd_test_s1_clear(struct kunit *test)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4649368910e0c..aa00c7cd3503e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1687,14 +1687,16 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid,
void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
struct arm_smmu_master *master,
- struct arm_smmu_domain *smmu_domain)
+ struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_inv *tag)
{
- struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
const struct io_pgtable_cfg *pgtbl_cfg =
&io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr =
&pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+ WARN_ON(tag->type != INV_TYPE_S1_ASID);
+
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
@@ -1714,7 +1716,7 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target,
CTXDESC_CD_0_R |
CTXDESC_CD_0_A |
CTXDESC_CD_0_ASET |
- FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid)
+ FIELD_PREP(CTXDESC_CD_0_ASID, tag->id)
);
/* To enable dirty flag update, set both Access flag and dirty state update */
@@ -1971,9 +1973,8 @@ EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste);
void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
struct arm_smmu_master *master,
struct arm_smmu_domain *smmu_domain,
- bool ats_enabled)
+ struct arm_smmu_inv *tag, bool ats_enabled)
{
- struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
const struct io_pgtable_cfg *pgtbl_cfg =
&io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
@@ -1981,6 +1982,8 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
u64 vtcr_val;
struct arm_smmu_device *smmu = master->smmu;
+ WARN_ON(tag->type != INV_TYPE_S2_VMID);
+
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
STRTAB_STE_0_V |
@@ -2004,7 +2007,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
target->data[2] = cpu_to_le64(
- FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+ FIELD_PREP(STRTAB_STE_2_S2VMID, tag->id) |
FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) |
STRTAB_STE_2_S2AA64 |
#ifdef __BIG_ENDIAN
@@ -3767,7 +3770,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev,
case ARM_SMMU_DOMAIN_S1: {
struct arm_smmu_cd target_cd;
- arm_smmu_make_s1_cd(&target_cd, master, smmu_domain);
+ arm_smmu_make_s1_cd(&target_cd, master, smmu_domain,
+ &state.new_domain_invst.tag);
arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr,
&target_cd);
arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled,
@@ -3777,6 +3781,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev,
}
case ARM_SMMU_DOMAIN_S2:
arm_smmu_make_s2_domain_ste(&target, master, smmu_domain,
+ &state.new_domain_invst.tag,
state.ats_enabled);
arm_smmu_install_ste_for_dev(master, &target);
arm_smmu_clear_cd(master, IOMMU_NO_PASID);
@@ -3869,7 +3874,8 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
if (ret)
goto out_unlock;
- arm_smmu_make_cd_fn(cd, master, smmu_domain);
+ arm_smmu_make_cd_fn(cd, master, smmu_domain,
+ &state.new_domain_invst.tag);
arm_smmu_write_cd_entry(master, pasid, cdptr, cd);
arm_smmu_update_ste(master, sid_domain, state.ats_enabled);
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 06/10] iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (4 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 05/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to CD and STE Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 07/10] iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse Nicolin Chen
` (3 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
A VMID held by a vSMMU is required to set up hardware (e.g. tegra241-cmdqv)
during its initialization, so it should be allocated in the ->viommu_init
callback. This makes the VMID lifecycle distinct from that of a VMID
allocated for a naked S2 attachment.
Introduce an INV_TYPE_S2_VMID_VSMMU to prepare for this case.
In arm_smmu_alloc_iotlb_tag(), retrieve the preallocated VMID on the vSMMU
directly instead of allocating a new one.
In arm_smmu_find_iotlb_tag(), continue searching in the smmu_domain->invs
using the type INV_TYPE_S2_VMID_VSMMU. This means a second device attached
to a nested domain associated with the same vSMMU instance shall reuse the
VMID held by the vSMMU. (FWIW, a device attached to a nesting parent domain
will have an INV_TYPE_S2_VMID and will not reuse the VMID on any vSMMU.)
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 12 +++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 +++++++++++++++---
2 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3b91e4596ffee..b821241f73c7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -655,6 +655,7 @@ struct arm_smmu_cmdq_batch {
enum arm_smmu_inv_type {
INV_TYPE_S1_ASID,
INV_TYPE_S2_VMID,
+ INV_TYPE_S2_VMID_VSMMU,
INV_TYPE_S2_VMID_S1_CLEAR,
INV_TYPE_ATS,
INV_TYPE_ATS_FULL,
@@ -676,7 +677,9 @@ struct arm_smmu_inv {
static inline void arm_smmu_inv_assert_iotlb_tag(struct arm_smmu_inv *inv)
{
- WARN_ON(inv->type != INV_TYPE_S1_ASID && inv->type != INV_TYPE_S2_VMID);
+ WARN_ON(inv->type != INV_TYPE_S1_ASID &&
+ inv->type != INV_TYPE_S2_VMID &&
+ inv->type != INV_TYPE_S2_VMID_VSMMU);
}
static inline bool arm_smmu_inv_is_ats(const struct arm_smmu_inv *inv)
@@ -1195,6 +1198,13 @@ struct arm_vsmmu {
u16 vmid;
};
+static inline struct arm_vsmmu *to_vsmmu(struct iommu_domain *domain)
+{
+ if (domain->type == IOMMU_DOMAIN_NESTED)
+ return to_smmu_nested_domain(domain)->vsmmu;
+ return NULL;
+}
+
#if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
void *arm_smmu_hw_info(struct device *dev, u32 *length,
enum iommu_hw_info_type *type);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index aa00c7cd3503e..0755ebe1c1560 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1982,7 +1982,8 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
u64 vtcr_val;
struct arm_smmu_device *smmu = master->smmu;
- WARN_ON(tag->type != INV_TYPE_S2_VMID);
+ WARN_ON(tag->type != INV_TYPE_S2_VMID &&
+ tag->type != INV_TYPE_S2_VMID_VSMMU);
memset(target, 0, sizeof(*target));
target->data[0] = cpu_to_le64(
@@ -2678,6 +2679,7 @@ static void __arm_smmu_domain_inv_range(struct arm_smmu_invs *invs,
granule);
break;
case INV_TYPE_S2_VMID:
+ case INV_TYPE_S2_VMID_VSMMU:
cmd.tlbi.vmid = cur->id;
cmd.tlbi.leaf = leaf;
arm_smmu_inv_to_cmdq_batch(cur, &cmds, &cmd, iova, size,
@@ -3241,7 +3243,10 @@ int arm_smmu_find_iotlb_tag(struct iommu_domain *domain,
tag->type = INV_TYPE_S1_ASID;
break;
case ARM_SMMU_DOMAIN_S2:
- tag->type = INV_TYPE_S2_VMID;
+ if (to_vsmmu(domain))
+ tag->type = INV_TYPE_S2_VMID_VSMMU;
+ else
+ tag->type = INV_TYPE_S2_VMID;
break;
default:
return -EINVAL;
@@ -3265,6 +3270,12 @@ static int arm_smmu_alloc_iotlb_tag(struct iommu_domain *domain,
if (!ret || ret != -ENOENT)
return ret;
+ if (tag->type == INV_TYPE_S2_VMID_VSMMU) {
+ /* Use the pre-allocated VMID from vSMMU */
+ tag->id = to_vsmmu(domain)->vmid;
+ return 0;
+ }
+
/* FIXME replace with an actual allocation from the bitmap */
if (tag->type == INV_TYPE_S1_ASID)
tag->id = smmu_domain->cd.asid;
@@ -3308,6 +3319,7 @@ arm_smmu_master_build_inv(struct arm_smmu_master *master,
}
break;
case INV_TYPE_S2_VMID:
+ case INV_TYPE_S2_VMID_VSMMU:
cur->size_opcode = CMDQ_OP_TLBI_S2_IPA;
cur->nsize_opcode = CMDQ_OP_TLBI_S12_VMALL;
break;
@@ -3352,7 +3364,7 @@ arm_smmu_master_build_invs(struct arm_smmu_master *master, bool ats_enabled,
return NULL;
/* All the nested S1 ASIDs have to be flushed when S2 parent changes */
- if (nesting) {
+ if (tag->type == INV_TYPE_S2_VMID_VSMMU) {
if (!arm_smmu_master_build_inv(master,
INV_TYPE_S2_VMID_S1_CLEAR,
tag->id, IOMMU_NO_PASID, 0))
--
2.43.0
* [PATCH v3 07/10] iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (5 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 06/10] iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init Nicolin Chen
` (2 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
An IOTLB tag is now forwarded from arm_smmu_domain_get_iotlb_tag() to its
final destination (a CD or STE entry).
Thus, arm_smmu_domain_get_iotlb_tag() can safely drop its references to the
cd->asid and s2_cfg->vmid in the smmu_domain. Instead, allocate a new IOTLB
cache tag from the xarray/ida.
The old ASID and VMID in the smmu_domain will be deprecated once the VMID
is decoupled in the vSMMU use case too.
Since invst->new_invs->inv[0] and invst->tag are basically the same thing,
merge arm_smmu_inv_flush_iotlb_tag() into arm_smmu_iotlb_tag_free().
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 63 +++++++++++++--------
1 file changed, 38 insertions(+), 25 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0755ebe1c1560..9ab904d9d142c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3276,13 +3276,44 @@ static int arm_smmu_alloc_iotlb_tag(struct iommu_domain *domain,
return 0;
}
- /* FIXME replace with an actual allocation from the bitmap */
+ lockdep_assert_held(&arm_smmu_asid_lock);
+
+ if (tag->type == INV_TYPE_S1_ASID) {
+ ret = xa_alloc(&arm_smmu_asid_xa, &tag->id, smmu_domain,
+ XA_LIMIT(1, (1 << smmu->asid_bits) - 1),
+ GFP_KERNEL);
+ } else {
+ ret = ida_alloc_range(&smmu->vmid_map, 1,
+ (1 << smmu->vmid_bits) - 1, GFP_KERNEL);
+ if (ret > 0) {
+ tag->id = ret; /* int is good for 16-bit VMID */
+ ret = 0;
+ }
+ }
+
+ return ret;
+}
+
+static void arm_smmu_iotlb_tag_free(struct arm_smmu_inv *tag)
+{
+ struct arm_smmu_cmdq_ent cmd = {
+ .opcode = tag->nsize_opcode,
+ };
+
+ arm_smmu_inv_assert_iotlb_tag(tag);
+
if (tag->type == INV_TYPE_S1_ASID)
- tag->id = smmu_domain->cd.asid;
+ cmd.tlbi.asid = tag->id;
else
- tag->id = smmu_domain->s2_cfg.vmid;
+ cmd.tlbi.vmid = tag->id;
+ arm_smmu_cmdq_issue_cmd_with_sync(tag->smmu, &cmd);
- return 0;
+ if (tag->type == INV_TYPE_S1_ASID)
+ xa_erase(&arm_smmu_asid_xa, tag->id);
+ else if (tag->type == INV_TYPE_S2_VMID)
+ ida_free(&tag->smmu->vmid_map, tag->id);
+
+ /* Keep INV_TYPE_S2_VMID_VSMMU. vSMMU will free it */
}
static struct arm_smmu_inv *
@@ -3510,26 +3541,6 @@ arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
kfree_rcu(invst->old_invs, rcu);
}
-static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
-{
- struct arm_smmu_cmdq_ent cmd = {};
-
- switch (inv->type) {
- case INV_TYPE_S1_ASID:
- cmd.tlbi.asid = inv->id;
- break;
- case INV_TYPE_S2_VMID:
- /* S2_VMID using nsize_opcode covers S2_VMID_S1_CLEAR */
- cmd.tlbi.vmid = inv->id;
- break;
- default:
- return;
- }
-
- cmd.opcode = inv->nsize_opcode;
- arm_smmu_cmdq_issue_cmd_with_sync(inv->smmu, &cmd);
-}
-
/* Should be installed after arm_smmu_install_ste_for_dev() */
static void
arm_smmu_install_old_domain_invs(struct arm_smmu_attach_state *state)
@@ -3551,7 +3562,7 @@ arm_smmu_install_old_domain_invs(struct arm_smmu_attach_state *state)
* array must be left cleared in the IOTLB.
*/
if (!READ_ONCE(invst->new_invs->inv[0].users))
- arm_smmu_inv_flush_iotlb_tag(&invst->new_invs->inv[0]);
+ arm_smmu_iotlb_tag_free(&invst->tag);
new_invs = arm_smmu_invs_purge(old_invs);
if (!new_invs)
@@ -3697,6 +3708,8 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state,
err_free_vmaster:
kfree(state->vmaster);
err_unprepare_invs:
+ if (!READ_ONCE(state->new_domain_invst.tag.users))
+ arm_smmu_iotlb_tag_free(&state->new_domain_invst.tag);
kfree(state->new_domain_invst.new_invs);
return ret;
}
--
2.43.0
* [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (6 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 07/10] iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-03-12 17:11 ` Jonathan Cameron
2026-02-23 20:27 ` [PATCH v3 09/10] iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs Nicolin Chen
9 siblings, 1 reply; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
A VMID owned by a vSMMU should be allocated in the viommu_init callback for
two reasons:
- it gives the VMID used by a vSMMU a straightforward lifecycle
- HW like tegra241-cmdqv needs to set up its VINTF with the VMID
Allocate/free a VMID in arm_vsmmu_init/destroy(). This decouples the VMID
owned by vSMMU from the VMID living in the S2 parent domain (s2_cfg.vmid).
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 +
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 24 +++++++++++++++++--
.../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c | 1 +
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index b821241f73c7a..db6568f1b2dd6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1213,6 +1213,7 @@ size_t arm_smmu_get_viommu_size(struct device *dev,
int arm_vsmmu_init(struct iommufd_viommu *viommu,
struct iommu_domain *parent_domain,
const struct iommu_user_data *user_data);
+void arm_vsmmu_destroy(struct iommufd_viommu *viommu);
int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
struct arm_smmu_nested_domain *nested_domain);
void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index a77c60321203c..dc638c38515e4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -406,7 +406,20 @@ int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
return ret;
}
+void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
+{
+ struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
+
+ guard(mutex)(&arm_smmu_asid_lock);
+ /*
+ * arm_smmu_iotlb_tag_free() must have flushed the IOTLB with the VMID,
+ * but it did not free the VMID to align its lifecycle with the vSMMU.
+ */
+ ida_free(&vsmmu->smmu->vmid_map, vsmmu->vmid);
+}
+
static const struct iommufd_viommu_ops arm_vsmmu_ops = {
+ .destroy = arm_vsmmu_destroy,
.alloc_domain_nested = arm_vsmmu_alloc_domain_nested,
.cache_invalidate = arm_vsmmu_cache_invalidate,
};
@@ -456,14 +469,21 @@ int arm_vsmmu_init(struct iommufd_viommu *viommu,
struct arm_smmu_device *smmu =
container_of(viommu->iommu_dev, struct arm_smmu_device, iommu);
struct arm_smmu_domain *s2_parent = to_smmu_domain(parent_domain);
+ int id;
if (s2_parent->smmu != smmu)
return -EINVAL;
+ mutex_lock(&arm_smmu_asid_lock);
+ id = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
+ GFP_KERNEL);
+ mutex_unlock(&arm_smmu_asid_lock);
+ if (id < 0)
+ return id;
+
+ vsmmu->vmid = id;
vsmmu->smmu = smmu;
vsmmu->s2_parent = s2_parent;
- /* FIXME Move VMID allocation from the S2 domain allocation to here */
- vsmmu->vmid = s2_parent->s2_cfg.vmid;
if (viommu->type == IOMMU_VIOMMU_TYPE_ARM_SMMUV3) {
viommu->ops = &arm_vsmmu_ops;
diff --git a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
index 6fe5563eaf9eb..92845fabd0dec 100644
--- a/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
@@ -1152,6 +1152,7 @@ static void tegra241_cmdqv_destroy_vintf_user(struct iommufd_viommu *viommu)
iommufd_viommu_destroy_mmap(&vintf->vsmmu.core,
vintf->mmap_offset);
tegra241_cmdqv_remove_vintf(vintf->cmdqv, vintf->idx);
+ arm_vsmmu_destroy(viommu);
}
static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
--
2.43.0
* [PATCH v3 09/10] iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (7 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs Nicolin Chen
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
Now that the ASID/VMID are stored in the arm_smmu_master, the copies left
in the arm_smmu_domain are dead code.
Remove them all.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ---
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 20 +------
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 3 -
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 58 -------------------
4 files changed, 1 insertion(+), 88 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index db6568f1b2dd6..e0832b191a2a6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -790,10 +790,6 @@ static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table)
return cd_table->used_ssids;
}
-struct arm_smmu_s2_cfg {
- u16 vmid;
-};
-
struct arm_smmu_strtab_cfg {
union {
struct {
@@ -969,10 +965,6 @@ struct arm_smmu_domain {
atomic_t nr_ats_masters;
enum arm_smmu_domain_stage stage;
- union {
- struct arm_smmu_ctx_desc cd;
- struct arm_smmu_s2_cfg s2_cfg;
- };
struct iommu_domain domain;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 846a278fa5469..0e48264ccd01b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -300,14 +300,6 @@ static void arm_smmu_sva_domain_free(struct iommu_domain *domain)
*/
arm_smmu_domain_inv(smmu_domain);
- /*
- * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can
- * still be called/running at this point. We allow the ASID to be
- * reused, and if there is a race then it just suffers harmless
- * unnecessary invalidation.
- */
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
-
/*
* Actual free is defered to the SRCU callback
* arm_smmu_mmu_notifier_free()
@@ -326,7 +318,6 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
struct arm_smmu_master *master = dev_iommu_priv_get(dev);
struct arm_smmu_device *smmu = master->smmu;
struct arm_smmu_domain *smmu_domain;
- u32 asid;
int ret;
if (!(master->smmu->features & ARM_SMMU_FEAT_SVA))
@@ -345,22 +336,13 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev,
smmu_domain->domain.pgsize_bitmap = PAGE_SIZE;
smmu_domain->stage = ARM_SMMU_DOMAIN_SVA;
smmu_domain->smmu = smmu;
-
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- if (ret)
- goto err_free;
-
- smmu_domain->cd.asid = asid;
smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops;
ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm);
if (ret)
- goto err_asid;
+ goto err_free;
return &smmu_domain->domain;
-err_asid:
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
err_free:
arm_smmu_domain_free(smmu_domain);
return ERR_PTR(ret);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
index c0cdded058fc8..d7f39313c6b34 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c
@@ -502,9 +502,6 @@ static void arm_smmu_test_make_s1_cd(struct arm_smmu_cd *cd, unsigned int asid)
struct io_pgtable io_pgtable = {};
struct arm_smmu_domain smmu_domain = {
.pgtbl_ops = &io_pgtable.ops,
- .cd = {
- .asid = asid,
- },
};
struct arm_smmu_inv tag = {
.type = INV_TYPE_S1_ASID,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9ab904d9d142c..a10da6b8f64d5 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2871,66 +2871,17 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void)
static void arm_smmu_domain_free_paging(struct iommu_domain *domain)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
- struct arm_smmu_device *smmu = smmu_domain->smmu;
free_io_pgtable_ops(smmu_domain->pgtbl_ops);
-
- /* Free the ASID or VMID */
- if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
- /* Prevent SVA from touching the CD while we're freeing it */
- mutex_lock(&arm_smmu_asid_lock);
- xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid);
- mutex_unlock(&arm_smmu_asid_lock);
- } else {
- struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
- if (cfg->vmid)
- ida_free(&smmu->vmid_map, cfg->vmid);
- }
-
arm_smmu_domain_free(smmu_domain);
}
-static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain)
-{
- int ret;
- u32 asid = 0;
- struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
-
- /* Prevent SVA from modifying the ASID until it is written to the CD */
- mutex_lock(&arm_smmu_asid_lock);
- ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain,
- XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL);
- cd->asid = (u16)asid;
- mutex_unlock(&arm_smmu_asid_lock);
- return ret;
-}
-
-static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain)
-{
- int vmid;
- struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
-
- /* Reserve VMID 0 for stage-2 bypass STEs */
- vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
- GFP_KERNEL);
- if (vmid < 0)
- return vmid;
-
- cfg->vmid = (u16)vmid;
- return 0;
-}
-
static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_device *smmu, u32 flags)
{
- int ret;
enum io_pgtable_fmt fmt;
struct io_pgtable_cfg pgtbl_cfg;
struct io_pgtable_ops *pgtbl_ops;
- int (*finalise_stage_fn)(struct arm_smmu_device *smmu,
- struct arm_smmu_domain *smmu_domain);
bool enable_dirty = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
pgtbl_cfg = (struct io_pgtable_cfg) {
@@ -2950,7 +2901,6 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
if (enable_dirty)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD;
fmt = ARM_64_LPAE_S1;
- finalise_stage_fn = arm_smmu_domain_finalise_s1;
break;
}
case ARM_SMMU_DOMAIN_S2:
@@ -2959,7 +2909,6 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
pgtbl_cfg.ias = smmu->oas;
pgtbl_cfg.oas = smmu->oas;
fmt = ARM_64_LPAE_S2;
- finalise_stage_fn = arm_smmu_domain_finalise_s2;
if ((smmu->features & ARM_SMMU_FEAT_S2FWB) &&
(flags & IOMMU_HWPT_ALLOC_NEST_PARENT))
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_S2FWB;
@@ -2977,13 +2926,6 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
smmu_domain->domain.geometry.force_aperture = true;
if (enable_dirty && smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
smmu_domain->domain.dirty_ops = &arm_smmu_dirty_ops;
-
- ret = finalise_stage_fn(smmu, smmu_domain);
- if (ret < 0) {
- free_io_pgtable_ops(pgtbl_ops);
- return ret;
- }
-
smmu_domain->pgtbl_ops = pgtbl_ops;
smmu_domain->smmu = smmu;
return 0;
--
2.43.0
* [PATCH v3 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
` (8 preceding siblings ...)
2026-02-23 20:27 ` [PATCH v3 09/10] iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain Nicolin Chen
@ 2026-02-23 20:27 ` Nicolin Chen
9 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-02-23 20:27 UTC (permalink / raw)
To: will, robin.murphy, jgg
Cc: joro, jpb, praan, miko.lenczewski, smostafa, linux-arm-kernel,
iommu, linux-kernel, patches
A VMM needs a domain holding the mappings from gPA to hPA. It can be an S1
domain or an S2 nesting parent domain, depending on whether the VM is built
with a vSMMU or not.
Given that the IOAS for this gPA mapping is the same across SMMU instances,
this domain can be shared across devices even if they sit behind different
SMMUs, so long as the underlying page table is compatible between the SMMU
instances.
The master device carries no direct information about the page table, but
the page table bits held in the domain can be compared against the SMMU
feature bits that were used to decide those page table bits.
Replace the smmu test in arm_smmu_attach_dev() and arm_vsmmu_init() with a
compatibility test for the S1 and S2 cases respectively.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 27 +++++++++++++++++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +--
3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index e0832b191a2a6..2e5981db11cd6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -9,6 +9,7 @@
#define _ARM_SMMU_V3_H
#include <linux/bitfield.h>
+#include <linux/io-pgtable.h>
#include <linux/iommu.h>
#include <linux/iommufd.h>
#include <linux/kernel.h>
@@ -987,6 +988,32 @@ struct arm_smmu_nested_domain {
__le64 ste[2];
};
+static inline bool
+arm_smmu_domain_can_share(struct arm_smmu_domain *smmu_domain,
+ struct arm_smmu_device *new_smmu)
+{
+ struct io_pgtable *pgtbl =
+ io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+
+ if (pgtbl->fmt == ARM_64_LPAE_S1 &&
+ !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S1))
+ return false;
+ if (pgtbl->fmt == ARM_64_LPAE_S2 &&
+ !(new_smmu->features & ARM_SMMU_FEAT_TRANS_S2))
+ return false;
+ if (pgtbl->cfg.pgsize_bitmap & ~new_smmu->pgsize_bitmap)
+ return false;
+ if (pgtbl->cfg.oas > new_smmu->oas)
+ return false;
+ if (pgtbl->cfg.coherent_walk &&
+ !(new_smmu->features & ARM_SMMU_FEAT_COHERENCY))
+ return false;
+ if ((pgtbl->cfg.quirks & IO_PGTABLE_QUIRK_ARM_S2FWB) &&
+ !(new_smmu->features & ARM_SMMU_FEAT_S2FWB))
+ return false;
+ return true;
+}
+
/* The following are exposed for testing purposes. */
struct arm_smmu_entry_writer_ops;
struct arm_smmu_entry_writer {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index dc638c38515e4..fbe5fc4e98ead 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -471,7 +471,7 @@ int arm_vsmmu_init(struct iommufd_viommu *viommu,
struct arm_smmu_domain *s2_parent = to_smmu_domain(parent_domain);
int id;
- if (s2_parent->smmu != smmu)
+ if (!arm_smmu_domain_can_share(s2_parent, smmu))
return -EINVAL;
mutex_lock(&arm_smmu_asid_lock);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a10da6b8f64d5..af99722d3b6fc 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -16,7 +16,6 @@
#include <linux/delay.h>
#include <linux/err.h>
#include <linux/interrupt.h>
-#include <linux/io-pgtable.h>
#include <linux/iopoll.h>
#include <linux/module.h>
#include <linux/msi.h>
@@ -3709,7 +3708,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev,
state.master = master = dev_iommu_priv_get(dev);
smmu = master->smmu;
- if (smmu_domain->smmu != smmu)
+ if (!arm_smmu_domain_can_share(smmu_domain, smmu))
return -EINVAL;
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init
2026-02-23 20:27 ` [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init Nicolin Chen
@ 2026-03-12 17:11 ` Jonathan Cameron
2026-03-12 19:54 ` Nicolin Chen
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2026-03-12 17:11 UTC (permalink / raw)
To: Nicolin Chen
Cc: will, robin.murphy, jgg, joro, jpb, praan, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, patches
On Mon, 23 Feb 2026 12:27:44 -0800
Nicolin Chen <nicolinc@nvidia.com> wrote:
> VMID owned by a vSMMU should be allocated in the viommu_init callback for
> - a straightforward lifecycle for a VMID used by a vSMMU
> - HW like tegra241-cmdqv needs to setup VINTF with the VMID
>
> Allocate/free a VMID in arm_vsmmu_init/destroy(). This decouples the VMID
> owned by vSMMU from the VMID living in the S2 parent domain (s2_cfg.vmid).
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Hi Nicolin,
Not a proper review as I'd need to do a bunch of catch up on how
this stuff all works. So just one query inline.
> ---
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
> index a77c60321203c..dc638c38515e4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
> @@ -406,7 +406,20 @@ int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
> return ret;
> }
>
> +void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
> +{
> + struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
> +
> + guard(mutex)(&arm_smmu_asid_lock);
> + /*
> + * arm_smmu_iotlb_tag_free() must have flushed the IOTLB with the VMID,
> + * but it did not free the VMID to align its lifecycle with the vSMMU.
> + */
> + ida_free(&vsmmu->smmu->vmid_map, vsmmu->vmid);
I'm being slow today, but why do you need the lock?
The ida itself doesn't need it according to the docs.
(it's using the xarray lock underneath)
Likewise for the ida_alloc_range()
> +}
> +
> static const struct iommufd_viommu_ops arm_vsmmu_ops = {
> + .destroy = arm_vsmmu_destroy,
> .alloc_domain_nested = arm_vsmmu_alloc_domain_nested,
> .cache_invalidate = arm_vsmmu_cache_invalidate,
> };
> @@ -456,14 +469,21 @@ int arm_vsmmu_init(struct iommufd_viommu *viommu,
> struct arm_smmu_device *smmu =
> container_of(viommu->iommu_dev, struct arm_smmu_device, iommu);
> struct arm_smmu_domain *s2_parent = to_smmu_domain(parent_domain);
> + int id;
>
> if (s2_parent->smmu != smmu)
> return -EINVAL;
>
> + mutex_lock(&arm_smmu_asid_lock);
> + id = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
> + GFP_KERNEL);
> + mutex_unlock(&arm_smmu_asid_lock);
* Re: [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init
2026-03-12 17:11 ` Jonathan Cameron
@ 2026-03-12 19:54 ` Nicolin Chen
0 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2026-03-12 19:54 UTC (permalink / raw)
To: Jonathan Cameron
Cc: will, robin.murphy, jgg, joro, jpb, praan, miko.lenczewski,
smostafa, linux-arm-kernel, iommu, linux-kernel, patches
On Thu, Mar 12, 2026 at 05:11:20PM +0000, Jonathan Cameron wrote:
> On Mon, 23 Feb 2026 12:27:44 -0800
> Nicolin Chen <nicolinc@nvidia.com> wrote:
> > +void arm_vsmmu_destroy(struct iommufd_viommu *viommu)
> > +{
> > + struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
> > +
> > + guard(mutex)(&arm_smmu_asid_lock);
> > + /*
> > + * arm_smmu_iotlb_tag_free() must have flushed the IOTLB with the VMID,
> > + * but it did not free the VMID to align its lifecycle with the vSMMU.
> > + */
> > + ida_free(&vsmmu->smmu->vmid_map, vsmmu->vmid);
>
> I'm being slow today, but why do you need the lock?
> The ida itself doesn't need it according to the docs.
> (it's using the xarray lock underneath)
>
> Likewise for the ida_alloc_range()
You are right. These do seem unnecessary.
Thanks!
Nicolin
end of thread, other threads: [~2026-03-12 19:55 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-23 20:27 [PATCH v3 00/10] iommu/arm-smmu-v3: Share domain across SMMU/vSMMU instances Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 01/10] iommu/arm-smmu-v3: Add a wrapper for arm_smmu_make_sva_cd() Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 02/10] iommu/arm-smmu-v3: Pass in arm_smmu_make_cd_fn to arm_smmu_set_pasid() Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 03/10] iommu/arm-smmu-v3: Store IOTLB cache tags in struct arm_smmu_attach_state Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 04/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to arm_smmu_master_build_invs() Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 05/10] iommu/arm-smmu-v3: Pass in IOTLB cache tag to CD and STE Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 06/10] iommu/arm-smmu-v3: Introduce INV_TYPE_S2_VMID_VSMMU Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 07/10] iommu/arm-smmu-v3: Allocate IOTLB cache tag if no id to reuse Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 08/10] iommu/arm-smmu-v3: Allocate INV_TYPE_S2_VMID_VSMMU in arm_vsmmu_init Nicolin Chen
2026-03-12 17:11 ` Jonathan Cameron
2026-03-12 19:54 ` Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 09/10] iommu/arm-smmu-v3: Remove ASID/VMID from arm_smmu_domain Nicolin Chen
2026-02-23 20:27 ` [PATCH v3 10/10] iommu/arm-smmu-v3: Allow sharing domain across SMMUs Nicolin Chen