* [PATCH RFC] iommu: Enable per-device SSID space for SVA
@ 2026-04-24 8:53 Joonwon Kang
2026-04-24 13:39 ` Jason Gunthorpe
2026-04-28 17:38 ` Easwar Hariharan
0 siblings, 2 replies; 10+ messages in thread
From: Joonwon Kang @ 2026-04-24 8:53 UTC (permalink / raw)
To: will, robin.murphy, joro, jpb
Cc: jgg, nicolinc, praan, kees, amhetre, Alexander.Grest, baolu.lu,
smostafa, linux-arm-kernel, iommu, linux-kernel, Joonwon Kang
For SVA, the IOMMU core always allocates the PASID from the global PASID
space. The global PASID space exists because of a limitation of the
ENQCMD instruction on Intel CPUs: it fetches its PASID operand from
IA32_PASID, which is per-task.

Because of this, SVA with ARM SMMU v3 does not work in our environment
when other modules/devices compete for PASIDs. The environment looks as
follows:

- The device is not a PCIe device.
- The device is to use SVA.
- The supported SSID/PASID space is very small for the device; only 1 to
  3 SSIDs are supported.
- There is a custom way of transmitting the SSID from the kernel to the
  device.

With this setup, when other modules have already allocated all the PASIDs
that our device can use from the global PASID space via APIs like
iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
our device fails due to the lack of available PASIDs.
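
The failure mode above can be illustrated with a small userspace model.
This is only a sketch under assumptions: `struct pasid_pool`,
`pool_alloc()` and `small_device_alloc()` are hypothetical names rather
than kernel APIs, and the 16-entry pool is an arbitrary toy size. It shows
a device limited to SSIDs 1..3 failing once unrelated users hold the low
IDs of a shared pool, while a private per-device pool still succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POOL_CAPACITY 16u           /* toy capacity, hypothetical */
#define INVALID_PASID UINT32_MAX    /* stands in for IOMMU_PASID_INVALID */

/* Hypothetical ID pool: IDs 1..POOL_CAPACITY-1 are allocatable, 0 is reserved. */
struct pasid_pool {
	bool used[POOL_CAPACITY];
};

/* Return the lowest free ID in 1..max_id, or INVALID_PASID if none is left. */
uint32_t pool_alloc(struct pasid_pool *pool, uint32_t max_id)
{
	assert(max_id < POOL_CAPACITY);
	for (uint32_t id = 1; id <= max_id; id++) {
		if (!pool->used[id]) {
			pool->used[id] = true;
			return id;
		}
	}
	return INVALID_PASID;
}

/*
 * Three unrelated users allocate from the shared pool first, then a device
 * that supports only SSIDs 1..3 asks for one. The device draws either from
 * the same shared pool or from its own private pool.
 */
uint32_t small_device_alloc(bool per_device_space)
{
	struct pasid_pool shared = { { false } };
	struct pasid_pool private_pool = { { false } };
	struct pasid_pool *dev_pool = per_device_space ? &private_pool : &shared;

	for (int i = 0; i < 3; i++)
		pool_alloc(&shared, POOL_CAPACITY - 1);   /* takes IDs 1, 2, 3 */

	return pool_alloc(dev_pool, 3);
}
```

In this model `small_device_alloc(false)` yields `INVALID_PASID`, which
corresponds to the -ENOSPC seen on bind, while `small_device_alloc(true)`
yields a usable ID.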

Since SSID/PASID is supported per-SID in ARM SMMU v3, this commit takes
advantage of that and stops using the global PASID space where possible.
It does the following:

- Introduce a new IOMMU capability, IOMMU_CAP_PER_DEV_PASID_SPACE, which
  indicates that the IOMMU supports an independent PASID space per
  device, not shared across devices. ARM SMMU v3 is one such IOMMU.
- Add a new API, iommu_attach_device_pasid_any(), which allocates any
  available PASID and attaches an IOMMU domain to it.
- Opt out of the global PASID space for SVA if the IOMMU has that
  capability, and use the new API to allocate a PASID in that case.

Signed-off-by: Joonwon Kang <joonwonkang@google.com>
---
v1: Requesting comments on this approach, on other possible approaches,
and on other aspects to consider. The code is not cleaned up and the
commits are not split appropriately in this version.
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +
drivers/iommu/iommu-sva.c | 44 +++++++----
drivers/iommu/iommu.c | 85 ++++++++++++++++++++-
include/linux/iommu.h | 5 ++
4 files changed, 121 insertions(+), 15 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4d00d796f078..3a700ab0b5c7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2494,6 +2494,8 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
return true;
case IOMMU_CAP_DIRTY_TRACKING:
return arm_smmu_dbm_capable(master->smmu);
+ case IOMMU_CAP_PER_DEV_PASID_SPACE:
+ return true;
default:
return false;
}
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 07d64908a05f..637d8fd29cbf 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -21,6 +21,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
{
struct iommu_mm_data *iommu_mm;
ioasid_t pasid;
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
lockdep_assert_held(&iommu_sva_lock);
@@ -39,11 +40,18 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
if (!iommu_mm)
return ERR_PTR(-ENOMEM);
- pasid = iommu_alloc_global_pasid(dev);
- if (pasid == IOMMU_PASID_INVALID) {
- kfree(iommu_mm);
- return ERR_PTR(-ENOSPC);
+ if (ops->capable && ops->capable(dev, IOMMU_CAP_PER_DEV_PASID_SPACE)) {
+ pasid = IOMMU_NO_PASID;
+ iommu_mm->pasid_global = false;
+ } else {
+ pasid = iommu_alloc_global_pasid(dev);
+ if (pasid == IOMMU_PASID_INVALID) {
+ kfree(iommu_mm);
+ return ERR_PTR(-ENOSPC);
+ }
+ iommu_mm->pasid_global = true;
}
+
iommu_mm->pasid = pasid;
iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
@@ -114,13 +122,15 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
goto out_unlock;
}
- /* Search for an existing domain. */
- list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
- ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
- &handle->handle);
- if (!ret) {
- domain->users++;
- goto out;
+ if (iommu_mm->pasid != IOMMU_NO_PASID) {
+ /* Search for an existing domain. */
+ list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
+ &handle->handle);
+ if (!ret) {
+ domain->users++;
+ goto out;
+ }
}
}
@@ -131,8 +141,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
goto out_free_handle;
}
- ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
- &handle->handle);
+ if (iommu_mm->pasid != IOMMU_NO_PASID) {
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
+ &handle->handle);
+ } else {
+ ret = iommu_attach_device_pasid_any(domain, dev, &iommu_mm->pasid,
+ &handle->handle);
+ }
if (ret)
goto out_free_domain;
domain->users = 1;
@@ -211,7 +226,8 @@ void mm_pasid_drop(struct mm_struct *mm)
if (!iommu_mm)
return;
- iommu_free_global_pasid(iommu_mm->pasid);
+ if (iommu_mm->pasid_global)
+ iommu_free_global_pasid(iommu_mm->pasid);
kfree(iommu_mm);
}
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 35db51780954..b882ecad7f57 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1061,7 +1061,7 @@ struct iommu_group *iommu_group_alloc(void)
mutex_init(&group->mutex);
INIT_LIST_HEAD(&group->devices);
INIT_LIST_HEAD(&group->entry);
- xa_init(&group->pasid_array);
+ xa_init_flags(&group->pasid_array, XA_FLAGS_ALLOC);
ret = ida_alloc(&iommu_group_ida, GFP_KERNEL);
if (ret < 0) {
@@ -3619,6 +3619,89 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
}
EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
+/**
+ * iommu_attach_device_pasid_any() - Allocate a pasid of device and attach a
+ * domain to it
+ * @domain: the iommu domain.
+ * @dev: the attached device.
+ * @pasid: pointer to the pasid of the device to be allocated.
+ * @handle: the attach handle.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
+ *
+ * Return: 0 on success, or an error.
+ */
+int iommu_attach_device_pasid_any(struct iommu_domain *domain,
+ struct device *dev,
+ ioasid_t *pasid,
+ struct iommu_attach_handle *handle)
+{
+ /* Caller must be a probed driver on dev */
+ struct iommu_group *group = dev->iommu_group;
+ const struct iommu_ops *ops;
+ void *entry;
+ u32 new_pasid;
+ int ret;
+
+ if (!group)
+ return -ENODEV;
+
+ ops = dev_iommu_ops(dev);
+
+ if (!domain->ops->set_dev_pasid ||
+ !ops->blocked_domain ||
+ !ops->blocked_domain->ops->set_dev_pasid)
+ return -EOPNOTSUPP;
+
+ if (!domain_iommu_ops_compatible(ops, domain) || !pasid)
+ return -EINVAL;
+
+ mutex_lock(&group->mutex);
+
+ /*
+ * This is a concurrent attach during a device reset. Reject it until
+ * pci_dev_reset_iommu_done() attaches the device to group->domain.
+ */
+ if (group->resetting_domain) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+
+ entry = iommu_make_pasid_array_entry(domain, handle);
+
+ struct xa_limit limit = {
+ .min = IOMMU_FIRST_GLOBAL_PASID,
+ .max = dev->iommu->max_pasids - 1,
+ };
+
+ ret = xa_alloc(&group->pasid_array, &new_pasid, XA_ZERO_ENTRY, limit, GFP_KERNEL);
+ if (ret)
+ goto out_unlock;
+
+ ret = __iommu_set_group_pasid(domain, group, new_pasid, NULL);
+ if (ret) {
+ xa_release(&group->pasid_array, new_pasid);
+ goto out_unlock;
+ }
+
+ /*
+ * The xa_alloc() above reserved the entry, and the group->mutex is
+ * held, so this cannot fail. The new domain cannot be visible until the
+ * operation succeeds as we cannot tolerate PRIs becoming concurrently
+ * queued and then failing attach.
+ */
+ WARN_ON(xa_is_err(xa_store(&group->pasid_array,
+ new_pasid, entry, GFP_KERNEL)));
+
+ *pasid = new_pasid;
+
+out_unlock:
+ mutex_unlock(&group->mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device_pasid_any);
+
/**
* iommu_replace_device_pasid - Replace the domain that a specific pasid
* of the device is attached to
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 54b8b48c762e..1665f9fe1d8a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -271,6 +271,7 @@ enum iommu_cap {
*/
IOMMU_CAP_DEFERRED_FLUSH,
IOMMU_CAP_DIRTY_TRACKING, /* IOMMU supports dirty tracking */
+ IOMMU_CAP_PER_DEV_PASID_SPACE, /* IOMMU supports per-device PASID space */
};
/* These are the possible reserved region types */
@@ -1136,6 +1137,7 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ bool pasid_global;
struct mm_struct *mm;
struct list_head sva_domains;
struct list_head mm_list_elm;
@@ -1184,6 +1186,9 @@ void iommu_device_release_dma_owner(struct device *dev);
int iommu_attach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid,
struct iommu_attach_handle *handle);
+int iommu_attach_device_pasid_any(struct iommu_domain *domain,
+ struct device *dev, ioasid_t *pasid,
+ struct iommu_attach_handle *handle);
void iommu_detach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid);
ioasid_t iommu_alloc_global_pasid(struct device *dev);
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-24 8:53 [PATCH RFC] iommu: Enable per-device SSID space for SVA Joonwon Kang
@ 2026-04-24 13:39 ` Jason Gunthorpe
2026-05-07 8:15 ` Tian, Kevin
2026-05-07 9:58 ` Joonwon Kang
2026-04-28 17:38 ` Easwar Hariharan
1 sibling, 2 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2026-04-24 13:39 UTC (permalink / raw)
To: Joonwon Kang
Cc: will, robin.murphy, joro, jpb, nicolinc, praan, kees, amhetre,
Alexander.Grest, baolu.lu, smostafa, linux-arm-kernel, iommu,
linux-kernel
On Fri, Apr 24, 2026 at 08:53:39AM +0000, Joonwon Kang wrote:
> For SVA, the IOMMU core always allocates PASID from the global PASID
> space. The use of this global PASID space comes from the limitation of
> the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
> from IA32_PASID, which is per-task.
That's right, and all the iommu drivers should have no issue with
per-device pasid or they are not following the API contract. I
believe that has been taken care of already.

So, I don't think this is an iommu driver capability.

Instead, you have to decide if the PASID is per device or not based on
whether the system will use ENQCMD or any similar instruction. I
understand ARM has introduced a similar instruction.

So you may be better off with some kind of 'arch has enqcmd like
instruction' to control this instead of involving the iommu driver.
> - The device is not a PCIe device.
> - The device is to use SVA.
> - The supported SSID/PASID space is very small for the device; only 1 to
> 3 SSIDs are supported.
Yuk
> With this setup, when other modules have allocated all the PASIDs that
> our device is expected to use from the global PASID space via APIs like
> iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
> our device fails due to the lack of available PASIDs.
So you have multiple SVA using devices as well? Or multiple instances
of the same device?
Jason
* RE: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-24 13:39 ` Jason Gunthorpe
@ 2026-05-07 8:15 ` Tian, Kevin
2026-05-09 17:03 ` Jason Gunthorpe
2026-05-07 9:58 ` Joonwon Kang
1 sibling, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2026-05-07 8:15 UTC (permalink / raw)
To: Jason Gunthorpe, Joonwon Kang
Cc: will@kernel.org, robin.murphy@arm.com, joro@8bytes.org,
jpb@kernel.org, nicolinc@nvidia.com, praan@google.com,
kees@kernel.org, amhetre@nvidia.com,
Alexander.Grest@microsoft.com, baolu.lu@linux.intel.com,
smostafa@google.com, linux-arm-kernel@lists.infradead.org,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Friday, April 24, 2026 9:40 PM
>
> On Fri, Apr 24, 2026 at 08:53:39AM +0000, Joonwon Kang wrote:
> > For SVA, the IOMMU core always allocates PASID from the global PASID
> > space. The use of this global PASID space comes from the limitation of
> > the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
> > from IA32_PASID, which is per-task.
>
> That's right, and all the iommu drivers should have no issue with
> per-device pasid or they are not following the API contract.. I
> believe that has been taking care of already.
>
> So, I don't think this is an iommu driver capability.
>
> Instead, you have to decide if the PASID is per device or not based on
> if the system will use ENQCMD or any similar instruction. I
> understand ARM has introduced a similar instruction.
>
> So you may be better off with some kind of 'arch has enqcmd like
> instruction' to control this instead of involving the iommu driver.
>
if both arch and device support enqcmd-like insn...
I recall this was discussed years ago. For devices like this, just
let the driver manage its own pasid space, then have a new interface,
e.g. iommu_sva_bind_device_pasid(dev, mm, pasid), to use the
specified pasid.
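
A rough userspace sketch of that idea follows. Everything in it is an
assumption for illustration: iommu_sva_bind_device_pasid() does not exist
today, so `mock_sva_bind_device_pasid()` merely stands in for it, and
`struct drv_pasid_space`, `struct mock_dev`, `drv_bind_mm()` and
`drv_unbind_mm()` are hypothetical driver-side helpers. The driver owns a
tiny allocator sized to the 3 SSIDs mentioned in this thread and passes
the PASID it picked to the bind call.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEV_MAX_SSIDS 3u   /* the device in this thread supports 1 to 3 SSIDs */

/* Hypothetical driver-private allocator: bit i set means SSID i+1 is in use. */
struct drv_pasid_space {
	uint32_t in_use;
};

/* Toy device: its private PASID space plus the mm attached at each PASID. */
struct mock_dev {
	struct drv_pasid_space space;
	const void *attached_mm[DEV_MAX_SSIDS + 1];
};

/* Hand out the lowest free SSID in 1..DEV_MAX_SSIDS, or -ENOSPC. */
int drv_pasid_get(struct drv_pasid_space *s)
{
	for (uint32_t i = 0; i < DEV_MAX_SSIDS; i++) {
		if (!(s->in_use & (1u << i))) {
			s->in_use |= 1u << i;
			return (int)(i + 1);
		}
	}
	return -ENOSPC;
}

void drv_pasid_put(struct drv_pasid_space *s, uint32_t pasid)
{
	assert(pasid >= 1 && pasid <= DEV_MAX_SSIDS);
	s->in_use &= ~(1u << (pasid - 1));
}

/* Mock of the proposed interface: attach mm at a caller-chosen PASID. */
int mock_sva_bind_device_pasid(struct mock_dev *dev, const void *mm,
			       uint32_t pasid)
{
	if (pasid < 1 || pasid > DEV_MAX_SSIDS)
		return -EINVAL;
	if (dev->attached_mm[pasid])
		return -EBUSY;
	dev->attached_mm[pasid] = mm;
	return 0;
}

/* Driver-side bind: returns the PASID used, or a negative errno. */
int drv_bind_mm(struct mock_dev *dev, const void *mm)
{
	int pasid = drv_pasid_get(&dev->space);
	int ret;

	if (pasid < 0)
		return pasid;
	ret = mock_sva_bind_device_pasid(dev, mm, (uint32_t)pasid);
	if (ret) {
		drv_pasid_put(&dev->space, (uint32_t)pasid);
		return ret;
	}
	return pasid;
}

/* Driver-side unbind: detach the mm and return the PASID to the pool. */
void drv_unbind_mm(struct mock_dev *dev, uint32_t pasid)
{
	assert(pasid >= 1 && pasid <= DEV_MAX_SSIDS);
	dev->attached_mm[pasid] = 0;
	drv_pasid_put(&dev->space, pasid);
}
```

In this model the PASID space never leaves the driver, so allocations made
on behalf of other devices cannot exhaust it.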
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-05-07 8:15 ` Tian, Kevin
@ 2026-05-09 17:03 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2026-05-09 17:03 UTC (permalink / raw)
To: Tian, Kevin
Cc: Joonwon Kang, will@kernel.org, robin.murphy@arm.com,
joro@8bytes.org, jpb@kernel.org, nicolinc@nvidia.com,
praan@google.com, kees@kernel.org, amhetre@nvidia.com,
Alexander.Grest@microsoft.com, baolu.lu@linux.intel.com,
smostafa@google.com, linux-arm-kernel@lists.infradead.org,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
On Thu, May 07, 2026 at 08:15:21AM +0000, Tian, Kevin wrote:
> if both arch and device support enqcmd-like insn...
>
> I recalled this was discussed years ago. For devices like this, just
> let driver manage its own pasid space then have a new interface
> e.g. iommu_sva_bind_device_pasid(dev, mm, pasid) to use the
> specified pasid.
Yeah, that makes sense. If the driver knows it doesn't use an ENQCMD
like programming model at all then it can use this API and it should
also avoid programming the MSRs/etc.
Jason
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-24 13:39 ` Jason Gunthorpe
2026-05-07 8:15 ` Tian, Kevin
@ 2026-05-07 9:58 ` Joonwon Kang
2026-05-09 17:10 ` Jason Gunthorpe
1 sibling, 1 reply; 10+ messages in thread
From: Joonwon Kang @ 2026-05-07 9:58 UTC (permalink / raw)
To: jgg
Cc: Alexander.Grest, amhetre, baolu.lu, iommu, joonwonkang, joro, jpb,
kees, linux-arm-kernel, linux-kernel, nicolinc, praan,
robin.murphy, smostafa, will, jacob.jun.pan, easwar.hariharan,
kevin.tian
Hi Jason, thank you for your review and sorry for the late reply.
> On Fri, Apr 24, 2026 at 08:53:39AM +0000, Joonwon Kang wrote:
> > For SVA, the IOMMU core always allocates PASID from the global PASID
> > space. The use of this global PASID space comes from the limitation of
> > the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
> > from IA32_PASID, which is per-task.
>
> That's right, and all the iommu drivers should have no issue with
> per-device pasid or they are not following the API contract.. I
> believe that has been taking care of already.
>
Thanks for pointing out that every IOMMU driver should already support a
per-device PASID space, i.e. each device behind the IOMMU can have its own
PASID space.

Let me clarify my understanding first to prevent future confusion.

The reason for using the global PASID space in the first place, i.e.
`iommu_global_pasid_ida`, is to support the case where a userspace driver
wants to communicate with multiple devices using the ENQCMD instruction
without the kernel's intervention. Since the ENQCMD instruction fetches
the PASID from the per-process IA32_PASID, the userspace driver cannot use
a different PASID for each device. If a syscall had been provided to
change the process's current PASID, however, we might have been able to
get rid of the global PASID space, although that could cause other issues
and would need research on feasibility and effectiveness.

Please let me know if there is any other reason for using the global PASID
space that the team considered back then.
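
To make that reasoning concrete, here is a toy userspace model. It is a
sketch only: `struct task`, `struct device_model` and `enqcmd_submit()`
are hypothetical names, and the real instruction and device behaviour are
far more involved. The only point it captures is that an ENQCMD-style
submission carries the task's single PASID register value to whichever
device it targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy task: one PASID register shared by every ENQCMD-style submission. */
struct task {
	uint32_t pasid_reg;   /* stands in for the per-task IA32_PASID value */
};

/* Toy device: the PASID at which this task's mm is attached on the device. */
struct device_model {
	uint32_t attached_pasid;
};

/*
 * The submission carries the task's register value, not a per-device one,
 * so the device can only serve it if the mm is attached at that same PASID.
 */
bool enqcmd_submit(const struct task *t, const struct device_model *dev)
{
	return t->pasid_reg == dev->attached_pasid;
}
```

With one global PASID per mm, both devices have the mm attached at the
same value and both submissions succeed. With per-device PASIDs the
register can match at most one of two differing attachments.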
> So, I don't think this is an iommu driver capability.
>
> Instead, you have to decide if the PASID is per device or not based on
> if the system will use ENQCMD or any similar instruction. I
> understand ARM has introduced a similar instruction.
>
By "similar instruction" on ARM, I guess you mean ST64BV0, which takes the
bottom 32 bits of the stored data from ACCDATA_EL1. Please let me know if
you meant another one, as it matters. If ST64BV0 is supported on ARM,
however, ST64B and ST64BV are supported as well, according to the LS64
field of ID_AA64ISAR1_EL1. The latter two instructions atomically store
whatever the user wants to a memory location without referring to
ACCDATA_EL1, and all three instructions can be run at EL0. So the
userspace driver would be able to designate an arbitrary PASID via the
latter two instructions when communicating with multiple devices.
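
A small model of that argument, as a sketch under assumptions: the helper
names are hypothetical, the LS64 field is taken to sit in bits [63:60] of
ID_AA64ISAR1_EL1 with values 1, 2 and 3 adding ST64B, ST64BV and ST64BV0
in turn, and the stores are modelled as plain memory copies with no
atomicity or status result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Extract the LS64 field, assumed to be bits [63:60] of ID_AA64ISAR1_EL1. */
unsigned int ls64_level(uint64_t id_aa64isar1)
{
	return (unsigned int)((id_aa64isar1 >> 60) & 0xf);
}

/* Each level is assumed to include the instructions of the levels below it. */
bool has_st64b(uint64_t id_aa64isar1)   { return ls64_level(id_aa64isar1) >= 1; }
bool has_st64bv(uint64_t id_aa64isar1)  { return ls64_level(id_aa64isar1) >= 2; }
bool has_st64bv0(uint64_t id_aa64isar1) { return ls64_level(id_aa64isar1) >= 3; }

/* ST64B/ST64BV model: all 64 bytes are stored exactly as supplied. */
void model_st64b(uint64_t dst[8], const uint64_t src[8])
{
	memcpy(dst, src, 8 * sizeof(uint64_t));
}

/* ST64BV0 model: the bottom 32 bits come from ACCDATA_EL1, not the user. */
void model_st64bv0(uint64_t dst[8], const uint64_t src[8],
		   uint64_t accdata_el1)
{
	memcpy(dst, src, 8 * sizeof(uint64_t));
	dst[0] = (src[0] & ~0xffffffffULL) | (accdata_el1 & 0xffffffffULL);
}
```

In this model a CPU reporting ST64BV0 also reports ST64B, and only the
ST64BV0 variant overrides the low 32 bits that would carry the PASID.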
> So you may be better off with some kind of 'arch has enqcmd like
> instruction' to control this instead of involving the iommu driver.
>
If the above makes sense, I guess we could lift the use of the global
PASID space on ARM unconditionally. What do you think?
> > - The device is not a PCIe device.
> > - The device is to use SVA.
> > - The supported SSID/PASID space is very small for the device; only 1 to
> > 3 SSIDs are supported.
>
> Yuk
>
> > With this setup, when other modules have allocated all the PASIDs that
> > our device is expected to use from the global PASID space via APIs like
> > iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
> > our device fails due to the lack of available PASIDs.
>
> So you have multiple SVA using devices as well? Or multiple instances
> of the same device?
We have multiple processes and a single device. Those processes all want
to do SVA with that device, but only one process does SVA with it at a
time. However, the problem occurs even when unrelated processes allocate
PASIDs from the global PASID space for their own unrelated purposes.
Thanks,
Joonwon Kang
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-05-07 9:58 ` Joonwon Kang
@ 2026-05-09 17:10 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2026-05-09 17:10 UTC (permalink / raw)
To: Joonwon Kang
Cc: Alexander.Grest, amhetre, baolu.lu, iommu, joro, jpb, kees,
linux-arm-kernel, linux-kernel, nicolinc, praan, robin.murphy,
smostafa, will, jacob.jun.pan, easwar.hariharan, kevin.tian
On Thu, May 07, 2026 at 09:58:51AM +0000, Joonwon Kang wrote:
> By "similar instruction" on ARM, I guess you mean ST64BV0, which fetches
> the bottom 32 bits data from ACCDATA_EL1. Please let me know if you meant
> others as it will matter. If ST64BV0 is supported on ARM, however, it
> would mean that ST64B and ST64BV are also supported already according to
> the ID_AA64ISAR1_EL1's LS64 field. The latter 2 instructions are just to
> atomically store whatever user wants to a memory location without
> referring to ACCDATA_EL1 and all the 3 instructions can be run at EL0. So,
> the userspace driver would have enough capability to designate arbitrary
> PASID as it wants via the latter 2 instructions when communicating with
> multiple devices.
IDK exactly what ARM did. IIRC on Intel ENQCMD forms a special
non-posted write TLP and the device can tell the TLP came from ENQCMD
and so it trusts the encoded PASID. ARM has to have done the same
thing - allowing anyone to forge the PASID by using a different
instruction misses the point of the Intel design.
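
The trust argument can be sketched as a toy model. This is an assumption
for illustration, not a description of the PCIe TLP format or of what any
particular device does; `enum write_kind`, `struct submission` and the
helpers are hypothetical names. The device honours the PASID only when the
write is of the distinguishable kind whose PASID userspace cannot choose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum write_kind {
	WRITE_PLAIN,        /* ordinary store, payload fully user-controlled */
	WRITE_ENQCMD_LIKE,  /* distinguishable write, PASID from a privileged register */
};

struct submission {
	enum write_kind kind;
	uint32_t pasid;
};

/* A plain store lets userspace put any PASID it likes into the payload. */
struct submission plain_store(uint32_t user_chosen_pasid)
{
	struct submission s = { WRITE_PLAIN, user_chosen_pasid };
	return s;
}

/* An ENQCMD-like store takes the PASID from the task's privileged register. */
struct submission enqcmd_like_store(uint32_t task_pasid_reg)
{
	struct submission s = { WRITE_ENQCMD_LIKE, task_pasid_reg };
	return s;
}

/* Device policy in this model: trust the PASID only for the special kind. */
bool device_trusts_pasid(const struct submission *s)
{
	return s->kind == WRITE_ENQCMD_LIKE;
}
```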
Honestly, I'm not sure why they even implemented it. SMMUv3 can't do
the translation scheme required to use ENQCMD from a VM anyhow, so it
is pretty useless.
> We have multiple processes and a single device, those processes want to
> do SVA with the same device, and only one process will do SVA with the
> device at a time. Though, the problem occurs even when irrelevant
> processes allocate the PASIDs from the global PASID space for their own
> irrelevant purposes.
The only way to allocate a PASID from the global PASID space is to
establish another SVA, so you have multiple devices doing SVA?
Jason
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-24 8:53 [PATCH RFC] iommu: Enable per-device SSID space for SVA Joonwon Kang
2026-04-24 13:39 ` Jason Gunthorpe
@ 2026-04-28 17:38 ` Easwar Hariharan
2026-04-28 17:44 ` Jason Gunthorpe
1 sibling, 1 reply; 10+ messages in thread
From: Easwar Hariharan @ 2026-04-28 17:38 UTC (permalink / raw)
To: Joonwon Kang
Cc: will, robin.murphy, joro, jpb, easwar.hariharan, jgg, nicolinc,
praan, kees, amhetre, Alexander.Grest, baolu.lu, smostafa,
linux-arm-kernel, iommu, linux-kernel
On 4/24/2026 1:53 AM, Joonwon Kang wrote:
> For SVA, the IOMMU core always allocates PASID from the global PASID
> space. The use of this global PASID space comes from the limitation of
> the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
> from IA32_PASID, which is per-task.
>
> Due to this nature, SVA with ARM SMMU v3 has been found not working in
> our environment when other modules/devices compete for PASID. The
> environment looks as follows:
>
> - The device is not a PCIe device.
> - The device is to use SVA.
> - The supported SSID/PASID space is very small for the device; only 1 to
> 3 SSIDs are supported.
> - There is a custom way of transmitting the SSID from the kernel to the
> device.
>
> With this setup, when other modules have allocated all the PASIDs that
> our device is expected to use from the global PASID space via APIs like
> iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
> our device fails due to the lack of available PASIDs.
>
> Since SSID/PASID is supported per-SID in ARM SMMU v3, this commit
> leverages the fact and lifts the use of the global PASID space if
> possible. What it does includes:
>
> - Introduce a new IOMMU capability IOMMU_CAP_PER_DEV_PASID_SPACE, which
> represents whether the IOMMU supports an independent PASID space per-
> device, not shared across devices. ARM SMMU v3 is the case.
> - Open a new API iommu_attach_device_pasid_any() to allocate any
> available PASID and attach an IOMMU domain to it.
> - Opt out the use of the global PASID space for SVA if the IOMMU has
> that capability, and use the new API to allocate a PASID in that case.
>
> Signed-off-by: Joonwon Kang <joonwonkang@google.com>
> ---
> v1: Request comments for this approach, other possible approaches and/or
> other aspects to consider more. Code is not sanitized and commits are
> not separated appropriately in this version.
>
<snip>
This may be a basic question, but how does this reconcile with the fact
that the process ID space is global? Even with PID namespacing, I understand
that each process in a PID namespace has a "parent" PID in the parent namespace,
unless I'm grossly mistaken.
Also, with the per-device PASID space, different SVA-capable devices
being used by the same process would have different PASIDs referring to the same
process address space, and would break the DSA<->IAA kind of interaction where the
device drivers can communicate the PASID among each other to operate on the same
process address space. Is that a scenario that does not matter to your use case?
Thanks,
Easwar (he/him)
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-28 17:38 ` Easwar Hariharan
@ 2026-04-28 17:44 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2026-04-28 17:44 UTC (permalink / raw)
To: Easwar Hariharan
Cc: Joonwon Kang, will, robin.murphy, joro, jpb, nicolinc, praan,
kees, amhetre, Alexander.Grest, baolu.lu, smostafa,
linux-arm-kernel, iommu, linux-kernel
On Tue, Apr 28, 2026 at 10:38:37AM -0700, Easwar Hariharan wrote:
> process address space, and would break the DSA<->IAA kind of interaction where the
> device drivers can communicate the PASID among each other to operate on the same
> process address space.
That is not part of the Linux model...
Each device has to get its own SVA and it must use the returned PASID,
not just invent one from someplace else.
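
As a toy illustration of that rule (a sketch with hypothetical names:
`mock_sva_bind()` and `mock_sva_get_pasid()` only stand in for
iommu_sva_bind_device() and iommu_sva_get_pasid(), and the per-device
counter is an invented allocator), each device binds the mm itself and
uses only the PASID found in its own handle:

```c
#include <assert.h>
#include <stdint.h>

/* Toy device with its own PASID counter (invented allocator). */
struct mock_device {
	uint32_t next_pasid;
};

/* Toy SVA handle: one per (device, mm) binding. */
struct mock_sva {
	struct mock_device *dev;
	const void *mm;
	uint32_t pasid;
};

/* Stand-in for iommu_sva_bind_device(): binds mm on one specific device. */
struct mock_sva mock_sva_bind(struct mock_device *dev, const void *mm)
{
	struct mock_sva handle = { dev, mm, dev->next_pasid++ };
	return handle;
}

/* Stand-in for iommu_sva_get_pasid(): the only PASID valid for this handle. */
uint32_t mock_sva_get_pasid(const struct mock_sva *handle)
{
	return handle->pasid;
}
```

Two devices bound to the same mm may end up with different PASIDs here, so
a driver that reuses another device's value would address the wrong
context.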
Jason
* [PATCH RFC] iommu: Enable per-device SSID space for SVA
@ 2026-04-24 8:50 Joonwon Kang
2026-04-24 8:57 ` Joonwon Kang
0 siblings, 1 reply; 10+ messages in thread
From: Joonwon Kang @ 2026-04-24 8:50 UTC (permalink / raw)
To: will, robin.murphy, joro
Cc: jgg, nicolinc, praan, kees, amhetre, Alexander.Grest, baolu.lu,
smostafa, linux-arm-kernel, iommu, linux-kernel, Joonwon Kang
For SVA, the IOMMU core always allocates PASID from the global PASID
space. The use of this global PASID space comes from the limitation of
the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
from IA32_PASID, which is per-task.
Due to this nature, SVA with ARM SMMU v3 has been found not working in
our environment when other modules/devices compete for PASID. The
environment looks as follows:
- The device is not a PCIe device.
- The device is to use SVA.
- The supported SSID/PASID space is very small for the device; only 1 to
3 SSIDs are supported.
- There is a custom way of transmitting the SSID from the kernel to the
device.
With this setup, when other modules have allocated all the PASIDs that
our device is expected to use from the global PASID space via APIs like
iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
our device fails due to the lack of available PASIDs.
Since SSID/PASID is supported per-SID in ARM SMMU v3, this commit
leverages the fact and lifts the use of the global PASID space if
possible. What it does includes:
- Introduce a new IOMMU capability IOMMU_CAP_PER_DEV_PASID_SPACE, which
represents whether the IOMMU supports an independent PASID space per-
device, not shared across devices. ARM SMMU v3 is the case.
- Open a new API iommu_attach_device_pasid_any() to allocate any
available PASID and attach an IOMMU domain to it.
- Opt out the use of the global PASID space for SVA if the IOMMU has
that capability, and use the new API to allocate a PASID in that case.
Signed-off-by: Joonwon Kang <joonwonkang@google.com>
---
v1: Request comments for this approach, other possible approaches and/or
other aspects to consider more. Code is not sanitized and commits are
not separated appropriately in this version.
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +
drivers/iommu/iommu-sva.c | 44 +++++++----
drivers/iommu/iommu.c | 85 ++++++++++++++++++++-
include/linux/iommu.h | 5 ++
4 files changed, 121 insertions(+), 15 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4d00d796f078..3a700ab0b5c7 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2494,6 +2494,8 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
return true;
case IOMMU_CAP_DIRTY_TRACKING:
return arm_smmu_dbm_capable(master->smmu);
+ case IOMMU_CAP_PER_DEV_PASID_SPACE:
+ return true;
default:
return false;
}
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 07d64908a05f..637d8fd29cbf 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -21,6 +21,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
{
struct iommu_mm_data *iommu_mm;
ioasid_t pasid;
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
lockdep_assert_held(&iommu_sva_lock);
@@ -39,11 +40,18 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
if (!iommu_mm)
return ERR_PTR(-ENOMEM);
- pasid = iommu_alloc_global_pasid(dev);
- if (pasid == IOMMU_PASID_INVALID) {
- kfree(iommu_mm);
- return ERR_PTR(-ENOSPC);
+ if (ops->capable && ops->capable(dev, IOMMU_CAP_PER_DEV_PASID_SPACE)) {
+ pasid = IOMMU_NO_PASID;
+ iommu_mm->pasid_global = false;
+ } else {
+ pasid = iommu_alloc_global_pasid(dev);
+ if (pasid == IOMMU_PASID_INVALID) {
+ kfree(iommu_mm);
+ return ERR_PTR(-ENOSPC);
+ }
+ iommu_mm->pasid_global = true;
}
+
iommu_mm->pasid = pasid;
iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
@@ -114,13 +122,15 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
goto out_unlock;
}
- /* Search for an existing domain. */
- list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
- ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
- &handle->handle);
- if (!ret) {
- domain->users++;
- goto out;
+ if (iommu_mm->pasid != IOMMU_NO_PASID) {
+ /* Search for an existing domain. */
+ list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
+ &handle->handle);
+ if (!ret) {
+ domain->users++;
+ goto out;
+ }
}
}
@@ -131,8 +141,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
goto out_free_handle;
}
- ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
- &handle->handle);
+ if (iommu_mm->pasid != IOMMU_NO_PASID) {
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid,
+ &handle->handle);
+ } else {
+ ret = iommu_attach_device_pasid_any(domain, dev, &iommu_mm->pasid,
+ &handle->handle);
+ }
if (ret)
goto out_free_domain;
domain->users = 1;
@@ -211,7 +226,8 @@ void mm_pasid_drop(struct mm_struct *mm)
if (!iommu_mm)
return;
- iommu_free_global_pasid(iommu_mm->pasid);
+ if (iommu_mm->pasid_global)
+ iommu_free_global_pasid(iommu_mm->pasid);
kfree(iommu_mm);
}
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 35db51780954..b882ecad7f57 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1061,7 +1061,7 @@ struct iommu_group *iommu_group_alloc(void)
mutex_init(&group->mutex);
INIT_LIST_HEAD(&group->devices);
INIT_LIST_HEAD(&group->entry);
- xa_init(&group->pasid_array);
+ xa_init_flags(&group->pasid_array, XA_FLAGS_ALLOC);
ret = ida_alloc(&iommu_group_ida, GFP_KERNEL);
if (ret < 0) {
@@ -3619,6 +3619,89 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
}
EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
+/**
+ * iommu_attach_device_pasid_any() - Allocate a pasid of device and attach a
+ * domain to it
+ * @domain: the iommu domain.
+ * @dev: the attached device.
+ * @pasid: pointer to the pasid of the device to be allocated.
+ * @handle: the attach handle.
+ *
+ * Caller should always provide a new handle to avoid race with the paths
+ * that have lockless reference to handle if it intends to pass a valid handle.
+ *
+ * Return: 0 on success, or an error.
+ */
+int iommu_attach_device_pasid_any(struct iommu_domain *domain,
+ struct device *dev,
+ ioasid_t *pasid,
+ struct iommu_attach_handle *handle)
+{
+ /* Caller must be a probed driver on dev */
+ struct iommu_group *group = dev->iommu_group;
+ const struct iommu_ops *ops;
+ void *entry;
+ u32 new_pasid;
+ int ret;
+
+ if (!group)
+ return -ENODEV;
+
+ ops = dev_iommu_ops(dev);
+
+ if (!domain->ops->set_dev_pasid ||
+ !ops->blocked_domain ||
+ !ops->blocked_domain->ops->set_dev_pasid)
+ return -EOPNOTSUPP;
+
+ if (!domain_iommu_ops_compatible(ops, domain) || !pasid)
+ return -EINVAL;
+
+ mutex_lock(&group->mutex);
+
+ /*
+ * This is a concurrent attach during a device reset. Reject it until
+ * pci_dev_reset_iommu_done() attaches the device to group->domain.
+ */
+ if (group->resetting_domain) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+
+ entry = iommu_make_pasid_array_entry(domain, handle);
+
+ struct xa_limit limit = {
+ .min = IOMMU_FIRST_GLOBAL_PASID,
+ .max = dev->iommu->max_pasids - 1,
+ };
+
+ ret = xa_alloc(&group->pasid_array, &new_pasid, XA_ZERO_ENTRY, limit, GFP_KERNEL);
+ if (ret)
+ goto out_unlock;
+
+ ret = __iommu_set_group_pasid(domain, group, new_pasid, NULL);
+ if (ret) {
+ xa_release(&group->pasid_array, new_pasid);
+ goto out_unlock;
+ }
+
+ /*
+ * The xa_alloc() above reserved the entry, and the group->mutex is
+ * held, so this cannot fail. The new domain must not be visible until
+ * the operation succeeds, as we cannot tolerate PRIs becoming
+ * concurrently queued and then failing attach.
+ */
+ WARN_ON(xa_is_err(xa_store(&group->pasid_array,
+ new_pasid, entry, GFP_KERNEL)));
+
+ *pasid = new_pasid;
+
+out_unlock:
+ mutex_unlock(&group->mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device_pasid_any);
+
/**
* iommu_replace_device_pasid - Replace the domain that a specific pasid
* of the device is attached to
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 54b8b48c762e..1665f9fe1d8a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -271,6 +271,7 @@ enum iommu_cap {
*/
IOMMU_CAP_DEFERRED_FLUSH,
IOMMU_CAP_DIRTY_TRACKING, /* IOMMU supports dirty tracking */
+ IOMMU_CAP_PER_DEV_PASID_SPACE, /* IOMMU supports per-device PASID space */
};
/* These are the possible reserved region types */
@@ -1136,6 +1137,7 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ bool pasid_global;
struct mm_struct *mm;
struct list_head sva_domains;
struct list_head mm_list_elm;
@@ -1184,6 +1186,9 @@ void iommu_device_release_dma_owner(struct device *dev);
int iommu_attach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid,
struct iommu_attach_handle *handle);
+int iommu_attach_device_pasid_any(struct iommu_domain *domain,
+ struct device *dev, ioasid_t *pasid,
+ struct iommu_attach_handle *handle);
void iommu_detach_device_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid);
ioasid_t iommu_alloc_global_pasid(struct device *dev);
--
2.54.0.545.g6539524ca2-goog
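
The reserve, attach, publish sequence in iommu_attach_device_pasid_any() can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions, not kernel code: struct model_dev, model_reserve(),
model_attach_any(), MODEL_FIRST_PASID and MODEL_MAX_SLOTS are hypothetical
names invented for the illustration, a plain array stands in for
group->pasid_array, and the fail_attach flag stands in for a failing
__iommu_set_group_pasid().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define MODEL_FIRST_PASID 1u	/* models IOMMU_FIRST_GLOBAL_PASID */
#define MODEL_MAX_SLOTS   8u	/* size limit of this toy model only */

enum slot_state { SLOT_FREE, SLOT_RESERVED, SLOT_ATTACHED };

/* One device with its own private PASID space. */
struct model_dev {
	uint32_t max_pasids;			/* valid PASIDs: 0..max_pasids-1 */
	enum slot_state slot[MODEL_MAX_SLOTS];
	bool fail_attach;			/* inject an attach failure */
};

/* Reserve the lowest free PASID in [MODEL_FIRST_PASID, max_pasids - 1]. */
static int model_reserve(struct model_dev *dev, uint32_t *pasid)
{
	assert(dev->max_pasids <= MODEL_MAX_SLOTS);

	for (uint32_t i = MODEL_FIRST_PASID; i < dev->max_pasids; i++) {
		if (dev->slot[i] == SLOT_FREE) {
			dev->slot[i] = SLOT_RESERVED;
			*pasid = i;
			return 0;
		}
	}
	return -EBUSY;	/* this device's space is exhausted */
}

/* Reserve a PASID, attach, and only then publish it to the caller. */
static int model_attach_any(struct model_dev *dev, uint32_t *pasid)
{
	uint32_t new_pasid;
	int ret;

	if (!pasid)
		return -EINVAL;

	ret = model_reserve(dev, &new_pasid);
	if (ret)
		return ret;

	if (dev->fail_attach) {
		/* Roll back the reservation so the PASID can be reused. */
		dev->slot[new_pasid] = SLOT_FREE;
		return -EIO;
	}

	dev->slot[new_pasid] = SLOT_ATTACHED;
	*pasid = new_pasid;
	return 0;
}
```

In this model, two devices that each set max_pasids to 3 can both obtain
PASIDs 1 and 2 independently, so exhausting one device's space does not affect
the other. That independence is the property the commit message relies on for
SMMUv3.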
* Re: [PATCH RFC] iommu: Enable per-device SSID space for SVA
2026-04-24 8:50 Joonwon Kang
@ 2026-04-24 8:57 ` Joonwon Kang
0 siblings, 0 replies; 10+ messages in thread
From: Joonwon Kang @ 2026-04-24 8:57 UTC (permalink / raw)
To: joonwonkang
Cc: Alexander.Grest, amhetre, baolu.lu, iommu, jgg, joro, kees,
linux-arm-kernel, linux-kernel, nicolinc, praan, robin.murphy,
smostafa, will
> For SVA, the IOMMU core always allocates PASID from the global PASID
> space. The use of this global PASID space comes from the limitation of
> the ENQCMD instruction in Intel CPUs that it fetches its PASID operand
> from IA32_PASID, which is per-task.
>
> Due to this nature, SVA with ARM SMMU v3 has been found not working in
> our environment when other modules/devices compete for PASID. The
> environment looks as follows:
>
> - The device is not a PCIe device.
> - The device is to use SVA.
> - The supported SSID/PASID space is very small for the device; only 1 to
> 3 SSIDs are supported.
> - There is a custom way of transmitting the SSID from the kernel to the
> device.
>
> With this setup, when other modules have allocated all the PASIDs that
> our device is expected to use from the global PASID space via APIs like
> iommu_alloc_global_pasid() or iommu_sva_bind_device(), SVA binding to
> our device fails due to the lack of available PASIDs.
>
> Since SSID/PASID is supported per-SID in ARM SMMU v3, this commit
> leverages the fact and lifts the use of the global PASID space if
> possible. What it does includes:
>
> - Introduce a new IOMMU capability IOMMU_CAP_PER_DEV_PASID_SPACE, which
> represents whether the IOMMU supports an independent PASID space per-
> device, not shared across devices. ARM SMMU v3 is the case.
> - Open a new API iommu_attach_device_pasid_any() to allocate any
> available PASID and attach an IOMMU domain to it.
> - Opt out the use of the global PASID space for SVA if the IOMMU has
> that capability, and use the new API to allocate a PASID in that case.
>
> Signed-off-by: Joonwon Kang <joonwonkang@google.com>
Please disregard this RFC as I have sent a new one with more recipients.
Thanks,
Joonwon Kang
Thread overview: 10+ messages
2026-04-24 8:53 [PATCH RFC] iommu: Enable per-device SSID space for SVA Joonwon Kang
2026-04-24 13:39 ` Jason Gunthorpe
2026-05-07 8:15 ` Tian, Kevin
2026-05-09 17:03 ` Jason Gunthorpe
2026-05-07 9:58 ` Joonwon Kang
2026-05-09 17:10 ` Jason Gunthorpe
2026-04-28 17:38 ` Easwar Hariharan
2026-04-28 17:44 ` Jason Gunthorpe
-- strict thread matches above, loose matches on Subject: below --
2026-04-24 8:50 Joonwon Kang
2026-04-24 8:57 ` Joonwon Kang