* [RFC 1/8] iommu: Make iommu_device_register_bus available beyond selftest
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-01 17:30 ` [RFC 2/8] iommu: Add a helper to check if any iommu device is registered Jacob Pan
` (7 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
Bus type specific registration can be used beyond the selftest mock IOMMU
driver, so move it outside CONFIG_IOMMUFD_TEST.
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 59244c744eab..0df914a04064 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -298,7 +298,6 @@ void iommu_device_unregister(struct iommu_device *iommu)
}
EXPORT_SYMBOL_GPL(iommu_device_unregister);
-#if IS_ENABLED(CONFIG_IOMMUFD_TEST)
void iommu_device_unregister_bus(struct iommu_device *iommu,
const struct bus_type *bus,
struct notifier_block *nb)
@@ -347,6 +346,7 @@ int iommu_device_register_bus(struct iommu_device *iommu,
}
EXPORT_SYMBOL_GPL(iommu_device_register_bus);
+#if IS_ENABLED(CONFIG_IOMMUFD_TEST)
int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu)
{
int rc;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
2025-12-01 17:30 ` [RFC 1/8] iommu: Make iommu_device_register_bus available beyond selftest Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-02 2:17 ` Baolu Lu
2025-12-01 17:30 ` [RFC 3/8] iommufd: Add a mock page table format for noiommu mode Jacob Pan
` (6 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
The dummy IOMMU driver for No-IOMMU mode should only be active when
no real IOMMU devices are present in the system. Introduce a helper
to check this condition, ensuring that the dummy driver does not
interfere when hardware-backed IOMMU support is available.
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/iommu.c | 10 ++++++++++
include/linux/iommu.h | 1 +
2 files changed, 11 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0df914a04064..958f612bf176 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2895,6 +2895,16 @@ static const struct iommu_device *iommu_from_fwnode(const struct fwnode_handle *
return ret;
}
+bool iommu_is_registered(void)
+{
+ bool registered;
+
+ spin_lock(&iommu_device_lock);
+ registered = !list_empty(&iommu_device_list);
+ spin_unlock(&iommu_device_lock);
+ return registered;
+}
+
const struct iommu_ops *iommu_ops_from_fwnode(const struct fwnode_handle *fwnode)
{
const struct iommu_device *iommu = iommu_from_fwnode(fwnode);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473..4191ae7312dd 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -933,6 +933,7 @@ extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
extern void iommu_set_default_passthrough(bool cmd_line);
extern void iommu_set_default_translated(bool cmd_line);
extern bool iommu_default_passthrough(void);
+extern bool iommu_is_registered(void);
extern struct iommu_resv_region *
iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot,
enum iommu_resv_type type, gfp_t gfp);
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-01 17:30 ` [RFC 2/8] iommu: Add a helper to check if any iommu device is registered Jacob Pan
@ 2025-12-02 2:17 ` Baolu Lu
2025-12-03 0:06 ` Jacob Pan
0 siblings, 1 reply; 23+ messages in thread
From: Baolu Lu @ 2025-12-02 2:17 UTC (permalink / raw)
To: Jacob Pan, linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Zhang Yu, Jean Philippe-Brucker,
David Matlack
Hi Jacob,
On 12/2/25 01:30, Jacob Pan wrote:
> The dummy IOMMU driver for No-IOMMU mode should only be active when
> no real IOMMU devices are present in the system. Introduce a helper
> to check this condition, ensuring that the dummy driver does not
> interfere when hardware-backed IOMMU support is available.
>
> Signed-off-by: Jacob Pan<jacob.pan@linux.microsoft.com>
> ---
> drivers/iommu/iommu.c | 10 ++++++++++
> include/linux/iommu.h | 1 +
> 2 files changed, 11 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0df914a04064..958f612bf176 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2895,6 +2895,16 @@ static const struct iommu_device *iommu_from_fwnode(const struct fwnode_handle *
> return ret;
> }
>
> +bool iommu_is_registered(void)
> +{
> + bool registered;
> +
> + spin_lock(&iommu_device_lock);
> + registered = !list_empty(&iommu_device_list);
> + spin_unlock(&iommu_device_lock);
> + return registered;
> +}
IOMMU devices might be added by calling iommu_device_register() at any
time. Therefore, an empty iommu_device_list does not necessarily mean
that "no real IOMMU devices are present in the system."
Thanks,
baolu
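
The concern above can be illustrated with a small userspace model. This is only a sketch: the `snap_*` names are hypothetical stand-ins, a pthread mutex replaces the kernel spinlock, and nothing here is kernel code. It shows that a locked emptiness test is a snapshot, valid only until the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_device: only list linkage. */
struct snap_device {
	struct snap_device *next;
};

static pthread_mutex_t snap_lock = PTHREAD_MUTEX_INITIALIZER;
static struct snap_device *snap_list;

/* Same shape as the proposed helper: a locked "is the list empty?" test. */
bool snap_is_registered(void)
{
	bool registered;

	pthread_mutex_lock(&snap_lock);
	registered = (snap_list != NULL);
	pthread_mutex_unlock(&snap_lock);
	/* The answer may already be stale once the lock is released. */
	return registered;
}

/* Models a hardware driver that registers at some later time. */
void snap_register(struct snap_device *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&snap_lock);
	dev->next = snap_list;
	snap_list = dev;
	pthread_mutex_unlock(&snap_lock);
}
```

A caller that reads `snap_is_registered() == false` and then enables a dummy driver has no protection against a later `snap_register()` call, which is the gap the reply points out.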
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-02 2:17 ` Baolu Lu
@ 2025-12-03 0:06 ` Jacob Pan
2025-12-03 3:31 ` Baolu Lu
2025-12-03 13:11 ` Jason Gunthorpe
0 siblings, 2 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-03 0:06 UTC (permalink / raw)
To: Baolu Lu
Cc: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin,
Zhang Yu, Jean Philippe-Brucker, David Matlack, Alex Williamson
Hi Baolu,
On Tue, 2 Dec 2025 10:17:34 +0800
Baolu Lu <baolu.lu@linux.intel.com> wrote:
> Hi Jacob,
>
> On 12/2/25 01:30, Jacob Pan wrote:
> > The dummy IOMMU driver for No-IOMMU mode should only be active when
> > no real IOMMU devices are present in the system. Introduce a helper
> > to check this condition, ensuring that the dummy driver does not
> > interfere when hardware-backed IOMMU support is available.
> >
> > Signed-off-by: Jacob Pan<jacob.pan@linux.microsoft.com>
> > ---
> > drivers/iommu/iommu.c | 10 ++++++++++
> > include/linux/iommu.h | 1 +
> > 2 files changed, 11 insertions(+)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 0df914a04064..958f612bf176 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -2895,6 +2895,16 @@ static const struct iommu_device *iommu_from_fwnode(const struct fwnode_handle *
> > return ret;
> > }
> >
> > +bool iommu_is_registered(void)
> > +{
> > + bool registered;
> > +
> > + spin_lock(&iommu_device_lock);
> > + registered = !list_empty(&iommu_device_list);
> > + spin_unlock(&iommu_device_lock);
> > + return registered;
> > +}
>
> IOMMU devices might be added by calling iommu_device_register() at any
> time. Therefore, an empty iommu_device_list does not necessarily mean
> that "no real IOMMU devices are present in the system."
Good point. My intention was that the noiommu dummy driver should
register only after all hardware IOMMU drivers have completed
registration during boot. Any subsequent registration attempt, such as a
hot-added IOMMU, should fail if noiommu mode is already active.
We could enforce this by introducing a global flag that prevents any
iommu_device from being registered after the noiommu driver has been
initialized.
However, as you pointed out, there seems to be no standard ordering
for IOMMU device registration across platforms, e.g. VT-d hooks into
x86_init, while SMMUv3 registers from its platform driver probe. This
patchset puts the dummy driver under early_initcall, which runs after
both but is not a guarantee for all platforms. Any suggestions?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-03 0:06 ` Jacob Pan
@ 2025-12-03 3:31 ` Baolu Lu
2025-12-03 22:28 ` Jacob Pan
2025-12-03 13:11 ` Jason Gunthorpe
1 sibling, 1 reply; 23+ messages in thread
From: Baolu Lu @ 2025-12-03 3:31 UTC (permalink / raw)
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin,
Zhang Yu, Jean Philippe-Brucker, David Matlack, Alex Williamson
On 12/3/25 08:06, Jacob Pan wrote:
> Hi Baolu,
>
> On Tue, 2 Dec 2025 10:17:34 +0800
> Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
>> Hi Jacob,
>>
>> On 12/2/25 01:30, Jacob Pan wrote:
>>> The dummy IOMMU driver for No-IOMMU mode should only be active when
>>> no real IOMMU devices are present in the system. Introduce a helper
>>> to check this condition, ensuring that the dummy driver does not
>>> interfere when hardware-backed IOMMU support is available.
>>>
>>> Signed-off-by: Jacob Pan<jacob.pan@linux.microsoft.com>
>>> ---
>>> drivers/iommu/iommu.c | 10 ++++++++++
>>> include/linux/iommu.h | 1 +
>>> 2 files changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index 0df914a04064..958f612bf176 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -2895,6 +2895,16 @@ static const struct iommu_device *iommu_from_fwnode(const struct fwnode_handle *
>>> return ret;
>>> }
>>>
>>> +bool iommu_is_registered(void)
>>> +{
>>> + bool registered;
>>> +
>>> + spin_lock(&iommu_device_lock);
>>> + registered = !list_empty(&iommu_device_list);
>>> + spin_unlock(&iommu_device_lock);
>>> + return registered;
>>> +}
>>
>> IOMMU devices might be added by calling iommu_device_register() at any
>> time. Therefore, an empty iommu_device_list does not necessarily mean
>> that "no real IOMMU devices are present in the system."
> Good point. My intention was that the noiommu dummy driver should
> register only after all hardware IOMMU drivers have completed
> registration during boot. Any subsequent registration attempt, such as a
> hot-added IOMMU, should fail if noiommu mode is already active.
>
> We could enforce this by introducing a global flag that prevents any
> iommu_device from being registered after the noiommu driver has been
> initialized.
>
> However, as you pointed out there seems to be no standard ordering
> for iommu device registration across platforms. e.g. VT-d hooks up with
> x86_init, smmuv3 does that in platform driver probe. This patchset puts
> dummy driver under early_initcall which is after both but not a
> guarantee for all platforms. Any suggestions?
Could we add a helper call inside the IOMMU detection path (e.g., within
pci_iommu_alloc()) to set a flag, such as platform_iommu_present, after
successful hardware detection?
void __init pci_iommu_alloc(void)
{
if (xen_pv_domain()) {
pci_xen_swiotlb_init();
return;
}
pci_swiotlb_detect();
gart_iommu_hole_init();
amd_iommu_detect();
detect_intel_iommu();
swiotlb_init(x86_swiotlb_enable, x86_swiotlb_flags);
}
Thanks,
baolu
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-03 3:31 ` Baolu Lu
@ 2025-12-03 22:28 ` Jacob Pan
0 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-03 22:28 UTC (permalink / raw)
To: Baolu Lu
Cc: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin,
Zhang Yu, Jean Philippe-Brucker, David Matlack, Alex Williamson
Hi Baolu,
On Wed, 3 Dec 2025 11:31:27 +0800
Baolu Lu <baolu.lu@linux.intel.com> wrote:
> On 12/3/25 08:06, Jacob Pan wrote:
> > Hi Baolu,
> >
> > On Tue, 2 Dec 2025 10:17:34 +0800
> > Baolu Lu <baolu.lu@linux.intel.com> wrote:
> >
> >> Hi Jacob,
> >>
> >> On 12/2/25 01:30, Jacob Pan wrote:
> >>> The dummy IOMMU driver for No-IOMMU mode should only be active
> >>> when no real IOMMU devices are present in the system. Introduce a
> >>> helper to check this condition, ensuring that the dummy driver
> >>> does not interfere when hardware-backed IOMMU support is
> >>> available.
> >>>
> >>> Signed-off-by: Jacob Pan<jacob.pan@linux.microsoft.com>
> >>> ---
> >>> drivers/iommu/iommu.c | 10 ++++++++++
> >>> include/linux/iommu.h | 1 +
> >>> 2 files changed, 11 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >>> index 0df914a04064..958f612bf176 100644
> >>> --- a/drivers/iommu/iommu.c
> >>> +++ b/drivers/iommu/iommu.c
> >>> @@ -2895,6 +2895,16 @@ static const struct iommu_device *iommu_from_fwnode(const struct fwnode_handle *
> >>> return ret;
> >>> }
> >>>
> >>> +bool iommu_is_registered(void)
> >>> +{
> >>> + bool registered;
> >>> +
> >>> + spin_lock(&iommu_device_lock);
> >>> + registered = !list_empty(&iommu_device_list);
> >>> + spin_unlock(&iommu_device_lock);
> >>> + return registered;
> >>> +}
> >>
> >> IOMMU devices might be added by calling iommu_device_register() at
> >> any time. Therefore, an empty iommu_device_list does not
> >> necessarily mean that "no real IOMMU devices are present in the
> >> system."
> > Good point. My intention was that the noiommu dummy driver should
> > register only after all hardware IOMMU drivers have completed
> > registration during boot. Any subsequent registration attempt, such
> > as a hot-added IOMMU, should fail if noiommu mode is already active.
> >
> > We could enforce this by introducing a global flag that prevents any
> > iommu_device from being registered after the noiommu driver has been
> > initialized.
> >
> > However, as you pointed out there seems to be no standard ordering
> > for iommu device registration across platforms. e.g. VT-d hooks up
> > with x86_init, smmuv3 does that in platform driver probe. This
> > patchset puts dummy driver under early_initcall which is after both
> > but not a guarantee for all platforms. Any suggestions?
>
> Could we add a helper call inside the IOMMU detection path (e.g.,
> within pci_iommu_alloc()) to set a flag, such as
> platform_iommu_present, after successful hardware detection?
>
> void __init pci_iommu_alloc(void)
> {
> if (xen_pv_domain()) {
> pci_xen_swiotlb_init();
> return;
> }
> pci_swiotlb_detect();
> gart_iommu_hole_init();
> amd_iommu_detect();
> detect_intel_iommu();
> swiotlb_init(x86_swiotlb_enable, x86_swiotlb_flags);
> }
I think this would only work for x86.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-03 0:06 ` Jacob Pan
2025-12-03 3:31 ` Baolu Lu
@ 2025-12-03 13:11 ` Jason Gunthorpe
2025-12-03 22:36 ` Jacob Pan
1 sibling, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2025-12-03 13:11 UTC (permalink / raw)
To: Jacob Pan
Cc: Baolu Lu, linux-kernel, iommu@lists.linux.dev, Alex Williamson,
Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin, Zhang Yu,
Jean Philippe-Brucker, David Matlack, Alex Williamson
On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
> However, as you pointed out there seems to be no standard ordering
> for iommu device registration across platforms. e.g. VT-d hooks up with
> x86_init, smmuv3 does that in platform driver probe. This patchset puts
> dummy driver under early_initcall which is after both but not a
> guarantee for all platforms. Any suggestions?
I think we need to do something more like the selftest does and
manually bind a driver to a device so this init time ordering
shouldn't matter.
IDK maybe a fake iommu driver has too many problems :\
Jason
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-03 13:11 ` Jason Gunthorpe
@ 2025-12-03 22:36 ` Jacob Pan
2025-12-04 10:53 ` Robin Murphy
0 siblings, 1 reply; 23+ messages in thread
From: Jacob Pan @ 2025-12-03 22:36 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Baolu Lu, linux-kernel, iommu@lists.linux.dev, Alex Williamson,
Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin, Zhang Yu,
Jean Philippe-Brucker, David Matlack, Alex Williamson
Hi Jason,
On Wed, 3 Dec 2025 09:11:29 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
> > However, as you pointed out there seems to be no standard ordering
> > for iommu device registration across platforms. e.g. VT-d hooks up
> > with x86_init, smmuv3 does that in platform driver probe. This
> > patchset puts dummy driver under early_initcall which is after both
> > but not a guarantee for all platforms. Any suggestions?
>
> I think we need to do something more like the sefltest does and
> manually bind a driver to a device so this init time ordering
> shouldn't matter.
I have moved this dummy iommu driver init under iommufd_init(), which
aligns well since it runs after all physical IOMMU drivers have
registered. This dummy driver is intended for iommufd after all. But I
don't see a need to bind to a platform device as the selftest does.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-03 22:36 ` Jacob Pan
@ 2025-12-04 10:53 ` Robin Murphy
2025-12-04 22:07 ` Jacob Pan
0 siblings, 1 reply; 23+ messages in thread
From: Robin Murphy @ 2025-12-04 10:53 UTC (permalink / raw)
To: Jacob Pan, Jason Gunthorpe
Cc: Baolu Lu, linux-kernel, iommu@lists.linux.dev, Alex Williamson,
Joerg Roedel, Will Deacon, Nicolin Chen, Tian, Kevin, Liu, Yi L,
skhawaja, pasha.tatashin, Zhang Yu, Jean Philippe-Brucker,
David Matlack, Alex Williamson
On 2025-12-03 10:36 pm, Jacob Pan wrote:
> Hi Jason,
>
> On Wed, 3 Dec 2025 09:11:29 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
>>> However, as you pointed out there seems to be no standard ordering
>>> for iommu device registration across platforms. e.g. VT-d hooks up
>>> with x86_init, smmuv3 does that in platform driver probe. This
>>> patchset puts dummy driver under early_initcall which is after both
>>> but not a guarantee for all platforms. Any suggestions?
>>
>> I think we need to do something more like the sefltest does and
>> manually bind a driver to a device so this init time ordering
>> shouldn't matter.
> I have moved this dummy iommu driver init under iommufd_init(), which
> aligns well since it runs after all physical IOMMU drivers have
> registered. This dummy driver is intended for iommufd after all. But I
> don't see a need to bind to a platform device as the selttest does.
There is no "after all physical IOMMU drivers have registered", there is
only "after we've given up waiting to see if one might be loaded as a
module", but even that may be indefinite depending on build/runtime
configuration.
Thanks,
Robin.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-04 10:53 ` Robin Murphy
@ 2025-12-04 22:07 ` Jacob Pan
2025-12-12 4:02 ` Tian, Kevin
0 siblings, 1 reply; 23+ messages in thread
From: Jacob Pan @ 2025-12-04 22:07 UTC (permalink / raw)
To: Robin Murphy
Cc: Jason Gunthorpe, Baolu Lu, linux-kernel, iommu@lists.linux.dev,
Alex Williamson, Joerg Roedel, Will Deacon, Nicolin Chen,
Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin, Zhang Yu,
Jean Philippe-Brucker, David Matlack, Alex Williamson
Hi Robin,
On Thu, 4 Dec 2025 10:53:36 +0000
Robin Murphy <robin.murphy@arm.com> wrote:
> On 2025-12-03 10:36 pm, Jacob Pan wrote:
> > Hi Jason,
> >
> > On Wed, 3 Dec 2025 09:11:29 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> >> On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
> >>> However, as you pointed out there seems to be no standard ordering
> >>> for iommu device registration across platforms. e.g. VT-d hooks up
> >>> with x86_init, smmuv3 does that in platform driver probe. This
> >>> patchset puts dummy driver under early_initcall which is after
> >>> both but not a guarantee for all platforms. Any suggestions?
> >>
> >> I think we need to do something more like the sefltest does and
> >> manually bind a driver to a device so this init time ordering
> >> shouldn't matter.
> > I have moved this dummy iommu driver init under iommufd_init(),
> > which aligns well since it runs after all physical IOMMU drivers
> > have registered. This dummy driver is intended for iommufd after
> > all. But I don't see a need to bind to a platform device as the
> > selttest does.
>
> There is no "after all physical IOMMU drivers have registered", there
> is only "after we've given up waiting to see if one might be loaded
> as a module", but even that may be indefinite depending on
> build/runtime configuration.
OK, how about we make loading the dummy driver an explicit user opt-in,
the same way as /sys/module/iommufd/parameters/allow_unsafe_interrupt?
In addition, make sure once noiommu driver is loaded, no other iommu
device can be registered.
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 4c842368289f..233e2a8a59b9 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -18,6 +18,41 @@ MODULE_PARM_DESC(
"Allow IOMMUFD to bind to devices even if the platform cannot isolate "
"the MSI interrupt window. Enabling this is a security weakness.");
+static bool allow_unsafe_dma;
+
+static int allow_unsafe_dma_set(const char *val, const struct kernel_param *kp)
+{
+ int ret;
+ bool newv;
+
+ ret = kstrtobool(val, &newv);
+ if (ret)
+ return ret;
+ /* If set, call noiommu_init() to load dummy noiommu driver */
+ if (newv && !allow_unsafe_dma) {
+ /* Will fail if HW IOMMU is present */
+ ret = noiommu_init();
+ if (ret)
+ return ret;
+ allow_unsafe_dma = newv;
+ }
+
+ return 0;
+}
+
+static int allow_unsafe_dma_get(char *buf, const struct kernel_param *kp)
+{
+ return param_get_bool(buf, kp);
+}
+
+static const struct kernel_param_ops allow_unsafe_dma_ops = {
+ .set = allow_unsafe_dma_set,
+ .get = allow_unsafe_dma_get,
+};
+
+module_param_cb(allow_unsafe_dma, &allow_unsafe_dma_ops, &allow_unsafe_dma, 0644);
+MODULE_PARM_DESC(allow_unsafe_dma, "Enable unsafe DMA no-IOMMU mode");
+
struct iommufd_attach {
struct iommufd_hw_pagetable *hwpt;
struct xarray device_array;
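
The control flow of the proposed setter can be modeled in userspace. This is a sketch under assumptions: `model_parse_bool()` is a reduced stand-in for `kstrtobool()`, `model_noiommu_init()` is a hypothetical stub for the `noiommu_init()` named in the snippet, and `-EBUSY` is an assumed error code for the "HW IOMMU is present" failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

static bool model_hw_iommu_present;	/* toggled by the caller to emulate hardware */
static bool model_allow_unsafe_dma;	/* models the module parameter value */
static int model_noiommu_init_calls;	/* counts successful dummy driver loads */

/* Hypothetical stub: refuses to load when a hardware IOMMU exists. */
static int model_noiommu_init(void)
{
	if (model_hw_iommu_present)
		return -EBUSY;
	model_noiommu_init_calls++;
	return 0;
}

/* Reduced stand-in for kstrtobool(): looks at the first character only. */
static int model_parse_bool(const char *s, bool *res)
{
	assert(res != NULL);
	if (!s)
		return -EINVAL;
	switch (s[0]) {
	case '1': case 'y': case 'Y':
		*res = true;
		return 0;
	case '0': case 'n': case 'N':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Same flow as the posted setter: act only on a false -> true transition. */
int model_allow_unsafe_dma_set(const char *val)
{
	bool newv;
	int ret;

	ret = model_parse_bool(val, &newv);
	if (ret)
		return ret;
	if (newv && !model_allow_unsafe_dma) {
		ret = model_noiommu_init();
		if (ret)
			return ret;
		model_allow_unsafe_dma = true;
	}
	return 0;
}
```

In this model, as in the snippet as posted, writing a false value after the mode has been enabled returns success but leaves the flag set, so the opt-in is effectively one-way.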
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-04 22:07 ` Jacob Pan
@ 2025-12-12 4:02 ` Tian, Kevin
2025-12-12 19:51 ` Jacob Pan
0 siblings, 1 reply; 23+ messages in thread
From: Tian, Kevin @ 2025-12-12 4:02 UTC (permalink / raw)
To: Jacob Pan, Robin Murphy
Cc: Jason Gunthorpe, Baolu Lu, linux-kernel@vger.kernel.org,
iommu@lists.linux.dev, Alex Williamson, Joerg Roedel, Will Deacon,
Nicolin Chen, Liu, Yi L, skhawaja@google.com,
pasha.tatashin@soleen.com, Zhang Yu, Jean Philippe-Brucker,
David Matlack, Alex Williamson
> From: Jacob Pan <jacob.pan@linux.microsoft.com>
> Sent: Friday, December 5, 2025 6:07 AM
>
> Hi Robin,
>
> On Thu, 4 Dec 2025 10:53:36 +0000
> Robin Murphy <robin.murphy@arm.com> wrote:
>
> > On 2025-12-03 10:36 pm, Jacob Pan wrote:
> > > Hi Jason,
> > >
> > > On Wed, 3 Dec 2025 09:11:29 -0400
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > >> On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
> > >>> However, as you pointed out there seems to be no standard ordering
> > >>> for iommu device registration across platforms. e.g. VT-d hooks up
> > >>> with x86_init, smmuv3 does that in platform driver probe. This
> > >>> patchset puts dummy driver under early_initcall which is after
> > >>> both but not a guarantee for all platforms. Any suggestions?
> > >>
> > >> I think we need to do something more like the sefltest does and
> > >> manually bind a driver to a device so this init time ordering
> > >> shouldn't matter.
> > > I have moved this dummy iommu driver init under iommufd_init(),
> > > which aligns well since it runs after all physical IOMMU drivers
> > > have registered. This dummy driver is intended for iommufd after
> > > all. But I don't see a need to bind to a platform device as the
> > > selttest does.
> >
> > There is no "after all physical IOMMU drivers have registered", there
> > is only "after we've given up waiting to see if one might be loaded
> > as a module", but even that may be indefinite depending on
> > build/runtime configuration.
> OK, how about we make loading the dummy driver an explicit user opt-in,
> the same way as /sys/module/iommufd/parameters/allow_unsafe_interrupt?
enable_unsafe_noiommu_mode plus cdev already implies that need?
>
> In addition, make sure once noiommu driver is loaded, no other iommu
> device can be registered.
>
this is probably the simplest way, e.g. converting below into a helper
to add that check:
spin_lock(&iommu_device_lock);
list_add_tail(&iommu->list, &iommu_device_list);
spin_unlock(&iommu_device_lock);
called by all iommu_device_register() variants.
btw then you may need a new iommu_device_register_noiommu()
as iommufd selftest has no such exclusiveness requirement.
or have a per-bus-type record to ensure each type can be only
probed by a single driver?
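
The helper-based approach suggested above can be sketched as a userspace model. Everything here is an assumption made for illustration: the `model_*` names, the use of a pthread mutex in place of the kernel spinlock, and `-EBUSY` as the failure code. The point it tries to show is that the exclusivity check and the list insertion happen under the same lock, so no window exists between them.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_device: only list linkage. */
struct model_iommu_device {
	struct model_iommu_device *next;
};

/* Models iommu_device_list, iommu_device_lock and an exclusivity flag. */
struct model_registry {
	pthread_mutex_t lock;
	struct model_iommu_device *head;
	bool noiommu_active;
};

#define MODEL_REGISTRY_INIT { PTHREAD_MUTEX_INITIALIZER, NULL, false }

/* Common helper that every register variant would funnel through. */
static int model_list_add(struct model_registry *reg,
			  struct model_iommu_device *dev, bool noiommu)
{
	int ret = 0;

	assert(reg != NULL && dev != NULL);
	pthread_mutex_lock(&reg->lock);
	if (reg->noiommu_active) {
		/* The dummy driver owns the system: refuse any newcomer. */
		ret = -EBUSY;
	} else if (noiommu && reg->head != NULL) {
		/* Another IOMMU device registered first: dummy must not load. */
		ret = -EBUSY;
	} else {
		dev->next = reg->head;
		reg->head = dev;
		if (noiommu)
			reg->noiommu_active = true;
	}
	pthread_mutex_unlock(&reg->lock);
	return ret;
}

/* Regular path: no emptiness requirement, as for the selftest. */
int model_register(struct model_registry *reg, struct model_iommu_device *dev)
{
	return model_list_add(reg, dev, false);
}

/* Exclusive path: succeeds only on an empty list, then blocks the rest. */
int model_register_noiommu(struct model_registry *reg,
			   struct model_iommu_device *dev)
{
	return model_list_add(reg, dev, true);
}
```

With this shape, a late hardware registration fails instead of silently coexisting with the dummy driver, while the regular path keeps no emptiness requirement of its own.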
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 2/8] iommu: Add a helper to check if any iommu device is registered
2025-12-12 4:02 ` Tian, Kevin
@ 2025-12-12 19:51 ` Jacob Pan
0 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-12 19:51 UTC (permalink / raw)
To: Tian, Kevin
Cc: Robin Murphy, Jason Gunthorpe, Baolu Lu,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
Alex Williamson, Joerg Roedel, Will Deacon, Nicolin Chen,
Liu, Yi L, skhawaja@google.com, pasha.tatashin@soleen.com,
Zhang Yu, Jean Philippe-Brucker, David Matlack, Alex Williamson
Hi Kevin,
On Fri, 12 Dec 2025 04:02:38 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:
> > From: Jacob Pan <jacob.pan@linux.microsoft.com>
> > Sent: Friday, December 5, 2025 6:07 AM
> >
> > Hi Robin,
> >
> > On Thu, 4 Dec 2025 10:53:36 +0000
> > Robin Murphy <robin.murphy@arm.com> wrote:
> >
> > > On 2025-12-03 10:36 pm, Jacob Pan wrote:
> > > > Hi Jason,
> > > >
> > > > On Wed, 3 Dec 2025 09:11:29 -0400
> > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >
> > > >> On Tue, Dec 02, 2025 at 04:06:35PM -0800, Jacob Pan wrote:
> > > >>> However, as you pointed out there seems to be no standard
> > > >>> ordering for iommu device registration across platforms. e.g.
> > > >>> VT-d hooks up with x86_init, smmuv3 does that in platform
> > > >>> driver probe. This patchset puts dummy driver under
> > > >>> early_initcall which is after both but not a guarantee for
> > > >>> all platforms. Any suggestions?
> > > >>
> > > >> I think we need to do something more like the sefltest does and
> > > >> manually bind a driver to a device so this init time ordering
> > > >> shouldn't matter.
> > > > I have moved this dummy iommu driver init under iommufd_init(),
> > > > which aligns well since it runs after all physical IOMMU drivers
> > > > have registered. This dummy driver is intended for iommufd after
> > > > all. But I don't see a need to bind to a platform device as the
> > > > selttest does.
> > >
> > > There is no "after all physical IOMMU drivers have registered",
> > > there is only "after we've given up waiting to see if one might
> > > be loaded as a module", but even that may be indefinite depending
> > > on build/runtime configuration.
> > OK, how about we make loading the dummy driver an explicit user
> > opt-in, the same way as
> > /sys/module/iommufd/parameters/allow_unsafe_interrupt?
>
> enable_unsafe_noiommu_mode plus cdev already implies that need?
Yes, I guess you meant hooking this up with the existing vfio knob
/sys/module/vfio/parameters/enable_unsafe_noiommu_mode
>
> >
> > In addition, make sure once noiommu driver is loaded, no other iommu
> > device can be registered.
> >
>
> this is probably the simplest way, e.g. converting below into a helper
> to add that check:
>
> spin_lock(&iommu_device_lock);
> list_add_tail(&iommu->list, &iommu_device_list);
> spin_unlock(&iommu_device_lock);
>
> called by all iommu_device_register() variants.
>
> btw then you may need a new iommu_device_register_noiommu()
> as iommufd selftest has no such exclusiveness requirement.
>
Yes, looks cleaner this way. iommu_device_register_noiommu() would just
fail if any other iommu devices have registered. There is no need for a
separate helper or check.
> or have a per-bus-type record to ensure each type can be only
> probed by a single driver?
That seems to be a broader change. I think it would work too. Let me
give that a try.
AFAIK, noiommu mode is only tied to PCI (due to vfio). I assume things
will stay this way.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [RFC 3/8] iommufd: Add a mock page table format for noiommu mode
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
2025-12-01 17:30 ` [RFC 1/8] iommu: Make iommu_device_register_bus available beyond selftest Jacob Pan
2025-12-01 17:30 ` [RFC 2/8] iommu: Add a helper to check if any iommu device is registered Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-01 17:30 ` [RFC 4/8] iommu: Add a dummy driver " Jacob Pan
` (5 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
The dummy IOMMU driver for No-IOMMU mode needs a page table to store IOVA
to physical address mappings. Instead of inventing a new format, reuse the
mock format provided by the generic IOMMU page table (iommupt) layer. Its
implementation is modeled after the AMD IOMMU v1 format, but generalized
for software-only use.
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/generic_pt/fmt/Makefile | 1 +
drivers/iommu/generic_pt/fmt/iommu_noiommu.c | 10 ++++++++++
include/linux/generic_pt/iommu.h | 5 +++++
3 files changed, 16 insertions(+)
create mode 100644 drivers/iommu/generic_pt/fmt/iommu_noiommu.c
diff --git a/drivers/iommu/generic_pt/fmt/Makefile b/drivers/iommu/generic_pt/fmt/Makefile
index 5a3379107999..59ea791c6383 100644
--- a/drivers/iommu/generic_pt/fmt/Makefile
+++ b/drivers/iommu/generic_pt/fmt/Makefile
@@ -2,6 +2,7 @@
iommu_pt_fmt-$(CONFIG_IOMMU_PT_AMDV1) += amdv1
iommu_pt_fmt-$(CONFIG_IOMMUFD_TEST) += mock
+iommu_pt_fmt-$(CONFIG_NOIOMMU_MODE_IOMMU) += noiommu
iommu_pt_fmt-$(CONFIG_IOMMU_PT_X86_64) += x86_64
diff --git a/drivers/iommu/generic_pt/fmt/iommu_noiommu.c b/drivers/iommu/generic_pt/fmt/iommu_noiommu.c
new file mode 100644
index 000000000000..4991dff60b59
--- /dev/null
+++ b/drivers/iommu/generic_pt/fmt/iommu_noiommu.c
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES
+ * Copyright (c) 2025, Microsoft Corporation.
+ */
+#define PT_FMT amdv1
+#define PT_FMT_VARIANT noiommu
+#define PT_SUPPORTED_FEATURES 0
+
+#include "iommu_template.h"
diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/iommu.h
index fde7ccf007c5..d7f70eaeb37f 100644
--- a/include/linux/generic_pt/iommu.h
+++ b/include/linux/generic_pt/iommu.h
@@ -255,6 +255,11 @@ IOMMU_FORMAT(amdv1, amdpt);
struct pt_iommu_amdv1_mock_hw_info;
IOMMU_PROTOTYPES(amdv1_mock);
+#define pt_iommu_amdv1_noiommu pt_iommu_amdv1
+#define pt_iommu_amdv1_noiommu_cfg pt_iommu_amdv1_cfg
+struct pt_iommu_amdv1_noiommu_hw_info;
+IOMMU_PROTOTYPES(amdv1_noiommu);
+
struct pt_iommu_x86_64_cfg {
struct pt_iommu_cfg common;
};
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [RFC 4/8] iommu: Add a dummy driver for noiommu mode
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (2 preceding siblings ...)
2025-12-01 17:30 ` [RFC 3/8] iommufd: Add a mock page table format for noiommu mode Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-01 17:30 ` [RFC 5/8] vfio: IOMMUFD relax requirement " Jacob Pan
` (4 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
Introduce a dummy IOMMU driver that enables VFIO character devices (cdevs)
to operate under No-IOMMU mode using IOMMUFD.
Similar to VFIO’s existing No-IOMMU mode, this requires userspace to set
the enable_unsafe_noiommu_mode module parameter, allowing DMA only with
physical addresses. Unlike the traditional VFIO No-IOMMU mode, this option
supports IOMMUFD IOAS UAPIs (e.g., map and unmap) by leveraging mock page
tables provided by the generic IOMMU page table layer.
In this model, IOVAs exposed to userspace are not used for DMA. Instead,
they serve as keys to retrieve the corresponding physical addresses from
the mock I/O page tables. Memory is still pinned the same way as it would
be with a physical IOMMU.
For in-kernel DMA, the DMA API uses direct mode only, since this driver
provides only an identity domain.
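The "IOVA as a lookup key" idea can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct demo_mapping and demo_iova_to_phys() are hypothetical names, and a flat array stands in for the mock page table that the real driver gets from the generic page table layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one IOAS mapping: IOVA range -> pinned physical range. */
struct demo_mapping {
	uint64_t iova;   /* start of the IOVA range handed to userspace */
	uint64_t phys;   /* physical address backing the start of the range */
	uint64_t length; /* size of the range in bytes */
};

/*
 * Look up the physical address behind an IOVA. The IOVA is never used for
 * DMA itself; it is only the key into the table.
 * Returns 0 and stores the result in *phys on a hit, -1 if unmapped.
 */
static int demo_iova_to_phys(const struct demo_mapping *maps, size_t n,
			     uint64_t iova, uint64_t *phys)
{
	assert(maps && phys);

	for (size_t i = 0; i < n; i++) {
		const struct demo_mapping *m = &maps[i];

		if (iova >= m->iova && iova - m->iova < m->length) {
			*phys = m->phys + (iova - m->iova);
			return 0;
		}
	}
	return -1;
}
```

A userspace driver would then program the device with the returned physical address rather than with the IOVA.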
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/Kconfig | 25 +++++
drivers/iommu/Makefile | 1 +
drivers/iommu/noiommu.c | 204 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 230 insertions(+)
create mode 100644 drivers/iommu/noiommu.c
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c9ae3221cd6f..9b3423180d16 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,6 +359,31 @@ config HYPERV_IOMMU
Stub IOMMU driver to handle IRQs to support Hyper-V Linux
guest and root partitions.
+config NOIOMMU_MODE_IOMMU
+ bool "Dummy IOMMU driver to support noiommu mode for IOMMUFD"
+ depends on PCI
+ depends on VFIO_NOIOMMU && VFIO_DEVICE_CDEV
+ depends on IOMMUFD_DRIVER
+ depends on IOMMU_PT
+ depends on GENERIC_PT
+ depends on IOMMU_PT_AMDV1
+ select IOMMU_API
+ help
+ This option introduces a dummy IOMMU driver that enables VFIO cdevs
+ to operate under no-IOMMU mode using IOMMUFD. Similar to VFIO’s
+ existing no-IOMMU mode, this requires userspace to set the
+ enable_unsafe_noiommu_mode module parameter, allowing DMA only with physical
+ addresses. Unlike the traditional VFIO no-IOMMU mode, this option
+ supports IOMMUFD IOAS UAPIs such as map and unmap by leveraging mock
+ page tables provided by the generic IOMMU page table layer. The IOVAs
+ exposed to userspace are not used for DMA; instead, they serve as keys
+ to retrieve corresponding physical addresses from these mock tables.
+ Memory pinning is still performed to ensure that physical pages remain
+ resident during DMA operations.
+ VFIO group based No-IOMMU mode is mutually exclusive with this option.
+
+ If unsure, say N here.
+
config VIRTIO_IOMMU
tristate "Virtio IOMMU driver"
depends on VIRTIO
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index b17ef9818759..226041e928fa 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -31,6 +31,7 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
+obj-$(CONFIG_NOIOMMU_MODE_IOMMU) += noiommu.o
obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o
obj-$(CONFIG_IOMMU_IOPF) += io-pgfault.o
obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
diff --git a/drivers/iommu/noiommu.c b/drivers/iommu/noiommu.c
new file mode 100644
index 000000000000..06125a190686
--- /dev/null
+++ b/drivers/iommu/noiommu.c
@@ -0,0 +1,204 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2025, Microsoft Corporation.
+ */
+
+#define pr_fmt(fmt) "NOIOMMU: " fmt
+#include <linux/device.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/generic_pt/iommu.h>
+#include <linux/vfio.h>
+
+#include "iommu-priv.h"
+
+struct noiommu_dev {
+ struct iommu_device iommu;
+ struct device *dev;
+};
+
+struct noiommu_domain {
+ union {
+ struct iommu_domain domain;
+ struct pt_iommu iommu;
+ struct pt_iommu_amdv1 amdv1;
+ };
+};
+
+static struct iommu_ops noiommu_ops;
+
+struct noiommu_dev noiommu_dev = {
+ .iommu = {
+ .ops = &noiommu_ops,
+ },
+};
+
+static void noiommu_release_device(struct device *dev)
+{
+}
+
+static struct iommu_device *noiommu_probe_device(struct device *dev)
+{
+ /* Support VFIO PCI devices only */
+ if (!dev_is_pci(dev))
+ return ERR_PTR(-ENODEV);
+
+ return &noiommu_dev.iommu;
+}
+
+static int noiommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+ return 0;
+}
+
+static void noiommu_domain_free(struct iommu_domain *domain)
+{
+ kfree(domain);
+}
+
+static const struct iommu_domain_ops noiommu_amdv1_ops = {
+ IOMMU_PT_DOMAIN_OPS(amdv1),
+ .free = noiommu_domain_free,
+ .attach_dev = noiommu_attach_dev,
+};
+
+static struct iommu_domain *
+noiommu_domain_alloc_paging_flags(struct device *dev, u32 flags,
+ const struct iommu_user_data *user_data)
+{
+ struct noiommu_domain *noiommu_dom;
+ struct pt_iommu_amdv1_cfg cfg = {};
+ int rc;
+
+ if (user_data)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ if (!vfio_noiommu_enabled()) {
+ pr_info("Must set enable_unsafe_noiommu_mode\n");
+ return ERR_PTR(-ENODEV);
+ }
+
+ cfg.common.hw_max_vasz_lg2 = 64;
+ cfg.common.hw_max_oasz_lg2 = 52;
+ cfg.common.features = BIT(PT_FEAT_AMDV1_FORCE_COHERENCE);
+ cfg.starting_level = 2;
+
+ noiommu_dom = kzalloc(sizeof(*noiommu_dom), GFP_KERNEL);
+ if (!noiommu_dom)
+ return ERR_PTR(-ENOMEM);
+
+ noiommu_dom->amdv1.iommu.nid = NUMA_NO_NODE;
+ noiommu_dom->domain.ops = &noiommu_amdv1_ops;
+
+ /* Use mock page table which is based on AMDV1 */
+ rc = pt_iommu_amdv1_noiommu_init(&noiommu_dom->amdv1, &cfg, GFP_KERNEL);
+ if (rc) {
+ kfree(noiommu_dom);
+ return ERR_PTR(rc);
+ }
+
+ return &noiommu_dom->domain;
+}
+
+static int noiommu_domain_nop_attach(struct iommu_domain *domain,
+ struct device *dev)
+{
+ return 0;
+}
+
+static const struct iommu_domain_ops noiommu_nop_ops = {
+ .attach_dev = noiommu_domain_nop_attach,
+};
+
+static struct iommu_domain noiommu_identity_domain = {
+ .type = IOMMU_DOMAIN_IDENTITY,
+ .ops = &noiommu_nop_ops,
+};
+
+static struct iommu_domain noiommu_blocking_domain = {
+ .type = IOMMU_DOMAIN_BLOCKED,
+ .ops = &noiommu_nop_ops,
+};
+
+static bool noiommu_capable(struct device *dev, enum iommu_cap cap)
+{
+ switch (cap) {
+ /* Fake cache coherency support to allow iommufd-dev bind */
+ case IOMMU_CAP_CACHE_COHERENCY:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static struct iommu_ops noiommu_ops = {
+ .default_domain = &noiommu_identity_domain,
+ .blocked_domain = &noiommu_blocking_domain,
+ .capable = noiommu_capable,
+ .domain_alloc_paging_flags = noiommu_domain_alloc_paging_flags,
+ .probe_device = noiommu_probe_device,
+ .release_device = noiommu_release_device,
+ .device_group = generic_device_group,
+ .owner = THIS_MODULE,
+ .default_domain_ops = &(const struct iommu_domain_ops) {
+ .attach_dev = noiommu_attach_dev,
+ .free = noiommu_domain_free,
+ }
+};
+
+struct notifier_block noiommu_bus_nb = {
+ /* data */
+};
+
+static int iommu_noiommu_dev_add(struct device *dev, struct iommu_device *iommu)
+{
+ return iommu_fwspec_init(dev, iommu->fwnode);
+}
+
+static int __init noiommu_init(void)
+{
+ struct pci_dev *pdev = NULL;
+
+ if (iommu_is_registered()) {
+ pr_info("IOMMU devices already registered, skipping No-IOMMU driver\n");
+ return 0;
+ }
+ pr_debug("Initializing No-IOMMU driver\n");
+ iommu_device_sysfs_add(&noiommu_dev.iommu, noiommu_dev.dev, NULL,
+ "%s", "noiommu");
+
+ if (iommu_device_register_bus(&noiommu_dev.iommu, &noiommu_ops,
+ &pci_bus_type, &noiommu_bus_nb))
+ return -ENODEV;
+
+ for_each_pci_dev(pdev) {
+ if (iommu_noiommu_dev_add(&pdev->dev, &noiommu_dev.iommu)) {
+ dev_err(&pdev->dev, "Failed to add no-IOMMU fwspec\n");
+ continue;
+ }
+ iommu_probe_device(&pdev->dev);
+ dev_dbg(&pdev->dev, "Probed PCI device for no IOMMU\n");
+ }
+
+ return 0;
+}
+early_initcall(noiommu_init);
+
+static void __exit noiommu_exit(void)
+{
+ pr_debug("Exiting No-IOMMU driver\n");
+
+ /* No hardware resources to clean up */
+ iommu_device_unregister(&noiommu_dev.iommu);
+
+}
+
+module_init(noiommu_init);
+module_exit(noiommu_exit);
+
+MODULE_DESCRIPTION("No-IOMMU driver for PCI devices without hardware IOMMU");
+MODULE_AUTHOR("Anonymous");
+MODULE_LICENSE("GPL v2");
\ No newline at end of file
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [RFC 5/8] vfio: IOMMUFD relax requirement for noiommu mode
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (3 preceding siblings ...)
2025-12-01 17:30 ` [RFC 4/8] iommu: Add a dummy driver " Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-12 4:05 ` Tian, Kevin
2025-12-01 17:30 ` [RFC 6/8] vfio: Rename and remove compat from noiommu set function Jacob Pan
` (3 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
A noiommu device only has an auto domain and does not need to attach to
other IOASes.
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/vfio/vfio_main.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 38c8e9350a60..805d30b0b82f 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -317,9 +317,7 @@ static int __vfio_register_dev(struct vfio_device *device,
if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) &&
(!device->ops->bind_iommufd ||
- !device->ops->unbind_iommufd ||
- !device->ops->attach_ioas ||
- !device->ops->detach_ioas)))
+ !device->ops->unbind_iommufd)))
return -EINVAL;
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* RE: [RFC 5/8] vfio: IOMMUFD relax requirement for noiommu mode
2025-12-01 17:30 ` [RFC 5/8] vfio: IOMMUFD relax requirement " Jacob Pan
@ 2025-12-12 4:05 ` Tian, Kevin
2025-12-12 19:53 ` Jacob Pan
0 siblings, 1 reply; 23+ messages in thread
From: Tian, Kevin @ 2025-12-12 4:05 UTC (permalink / raw)
To: Jacob Pan, linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
Jason Gunthorpe, Alex Williamson, Joerg Roedel, Will Deacon,
Robin Murphy, Nicolin Chen, Liu, Yi L
Cc: skhawaja@google.com, pasha.tatashin@soleen.com, Zhang Yu,
Jean Philippe-Brucker, David Matlack
> From: Jacob Pan <jacob.pan@linux.microsoft.com>
> Sent: Tuesday, December 2, 2025 1:30 AM
>
> noiommu device only has auto domain, does not need to attach to other ioas.
>
> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> ---
> drivers/vfio/vfio_main.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 38c8e9350a60..805d30b0b82f 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -317,9 +317,7 @@ static int __vfio_register_dev(struct vfio_device
> *device,
>
> if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) &&
> (!device->ops->bind_iommufd ||
> - !device->ops->unbind_iommufd ||
> - !device->ops->attach_ioas ||
> - !device->ops->detach_ioas)))
> + !device->ops->unbind_iommufd)))
> return -EINVAL;
>
the subject says to relax for noiommu, but above actually relax
the check for all configs...
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC 5/8] vfio: IOMMUFD relax requirement for noiommu mode
2025-12-12 4:05 ` Tian, Kevin
@ 2025-12-12 19:53 ` Jacob Pan
0 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-12 19:53 UTC (permalink / raw)
To: Tian, Kevin
Cc: linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
Jason Gunthorpe, Alex Williamson, Joerg Roedel, Will Deacon,
Robin Murphy, Nicolin Chen, Liu, Yi L, skhawaja@google.com,
pasha.tatashin@soleen.com, Zhang Yu, Jean Philippe-Brucker,
David Matlack
Hi Kevin,
On Fri, 12 Dec 2025 04:05:06 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:
> > From: Jacob Pan <jacob.pan@linux.microsoft.com>
> > Sent: Tuesday, December 2, 2025 1:30 AM
> >
> > noiommu device only has auto domain, does not need to attach to
> > other ioas.
> >
> > Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> > ---
> > drivers/vfio/vfio_main.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> > index 38c8e9350a60..805d30b0b82f 100644
> > --- a/drivers/vfio/vfio_main.c
> > +++ b/drivers/vfio/vfio_main.c
> > @@ -317,9 +317,7 @@ static int __vfio_register_dev(struct
> > vfio_device *device,
> >
> > if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) &&
> > (!device->ops->bind_iommufd ||
> > - !device->ops->unbind_iommufd ||
> > - !device->ops->attach_ioas ||
> > - !device->ops->detach_ioas)))
> > + !device->ops->unbind_iommufd)))
> > return -EINVAL;
> >
>
> the subject says to relax for noiommu, but above actually relax
> the check for all configs...
I will add a check here for noiommu mode.
Thanks,
Jacob
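One way the noiommu-only relaxation discussed in this exchange could be expressed is sketched below as a standalone model. It is not the author's follow-up patch: struct demo_vfio_ops, demo_iommufd_ops_valid() and the stub callbacks are hypothetical, and how the noiommu flag would be known at that point in __vfio_register_dev() is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the iommufd-related vfio_device_ops. */
struct demo_vfio_ops {
	int (*bind_iommufd)(void);
	void (*unbind_iommufd)(void);
	int (*attach_ioas)(void);
	void (*detach_ioas)(void);
};

/*
 * bind/unbind are always required. attach/detach are required unless the
 * device is in noiommu mode, so the check stays strict for all other
 * configurations.
 */
static bool demo_iommufd_ops_valid(const struct demo_vfio_ops *ops,
				   bool noiommu)
{
	assert(ops);

	if (!ops->bind_iommufd || !ops->unbind_iommufd)
		return false;
	if (noiommu)
		return true;
	return ops->attach_ioas && ops->detach_ioas;
}

/* Stub callbacks so the model can be exercised. */
static int demo_bind(void) { return 0; }
static void demo_unbind(void) { }
static int demo_attach(void) { return 0; }
static void demo_detach(void) { }
```

With this shape, a driver that omits attach_ioas and detach_ioas is only accepted when the device is flagged as noiommu.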
^ permalink raw reply [flat|nested] 23+ messages in thread
* [RFC 6/8] vfio: Rename and remove compat from noiommu set function
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (4 preceding siblings ...)
2025-12-01 17:30 ` [RFC 5/8] vfio: IOMMUFD relax requirement " Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-01 17:30 ` [RFC 7/8] iommu: Enable cdev noiommu mode under iommufd Jacob Pan
` (2 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
With the dummy iommu driver, noiommu mode can be supported beyond
vfio_compat mode under IOMMUFD. Rename it accordingly.
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/iommufd/vfio_compat.c | 6 +++---
drivers/vfio/group.c | 2 +-
include/linux/iommufd.h | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iommufd/vfio_compat.c b/drivers/iommu/iommufd/vfio_compat.c
index a258ee2f4579..82630fe024a6 100644
--- a/drivers/iommu/iommufd/vfio_compat.c
+++ b/drivers/iommu/iommufd/vfio_compat.c
@@ -47,12 +47,12 @@ int iommufd_vfio_compat_ioas_get_id(struct iommufd_ctx *ictx, u32 *out_ioas_id)
EXPORT_SYMBOL_NS_GPL(iommufd_vfio_compat_ioas_get_id, "IOMMUFD_VFIO");
/**
- * iommufd_vfio_compat_set_no_iommu - Called when a no-iommu device is attached
+ * iommufd_vfio_set_no_iommu - Called when a no-iommu device is attached
* @ictx: Context to operate on
*
* This allows selecting the VFIO_NOIOMMU_IOMMU and blocks normal types.
*/
-int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx)
+int iommufd_vfio_set_no_iommu(struct iommufd_ctx *ictx)
{
int ret;
@@ -66,7 +66,7 @@ int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx)
xa_unlock(&ictx->objects);
return ret;
}
-EXPORT_SYMBOL_NS_GPL(iommufd_vfio_compat_set_no_iommu, "IOMMUFD_VFIO");
+EXPORT_SYMBOL_NS_GPL(iommufd_vfio_set_no_iommu, "IOMMUFD_VFIO");
/**
* iommufd_vfio_compat_ioas_create - Ensure the compat IOAS is created
diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index c376a6279de0..39fada00ef50 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -134,7 +134,7 @@ static int vfio_group_ioctl_set_container(struct vfio_group *group,
if (!IS_ERR(iommufd)) {
if (IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
group->type == VFIO_NO_IOMMU)
- ret = iommufd_vfio_compat_set_no_iommu(iommufd);
+ ret = iommufd_vfio_set_no_iommu(iommufd);
else
ret = iommufd_vfio_compat_ioas_create(iommufd);
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 6e7efe83bc5d..fb3bc387b78e 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -212,7 +212,7 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
void *data, size_t len, unsigned int flags);
int iommufd_vfio_compat_ioas_get_id(struct iommufd_ctx *ictx, u32 *out_ioas_id);
int iommufd_vfio_compat_ioas_create(struct iommufd_ctx *ictx);
-int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx);
+int iommufd_vfio_set_no_iommu(struct iommufd_ctx *ictx);
#else /* !CONFIG_IOMMUFD */
static inline struct iommufd_ctx *iommufd_ctx_from_file(struct file *file)
{
@@ -250,7 +250,7 @@ static inline int iommufd_vfio_compat_ioas_create(struct iommufd_ctx *ictx)
return -EOPNOTSUPP;
}
-static inline int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx)
+static inline int iommufd_vfio_set_no_iommu(struct iommufd_ctx *ictx)
{
return -EOPNOTSUPP;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [RFC 7/8] iommu: Enable cdev noiommu mode under iommufd
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (5 preceding siblings ...)
2025-12-01 17:30 ` [RFC 6/8] vfio: Rename and remove compat from noiommu set function Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2025-12-01 17:30 ` [RFC 8/8] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA Jacob Pan
2026-01-30 19:35 ` [RFC 0/8] iommufd: Enable noiommu mode for cdev Jason Gunthorpe
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
With an IOAS-capable dummy IOMMU driver in place for noiommu mode, devices
under noiommu mode can now bind with IOMMUFD and perform subsequent
operations such as open, attach, and IOAS map/unmap.
No-IOMMU cdevs are explicitly named with a noiommu- prefix, e.g.:
/dev/vfio/
|-- 7
|-- devices
| `-- noiommu-vfio0
`-- vfio
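The naming rule for the cdev nodes can be reproduced in a few lines of userspace C. This is a sketch only: demo_vfio_dev_name() and demo_is_noiommu_name() are hypothetical helpers, and the format string is copied from the dev_set_name() call in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the cdev node name using the same format string as the patch:
 * "noiommu-vfioN" for noiommu devices, "vfioN" otherwise.
 * Returns the length snprintf() reports for the formatted name.
 */
static int demo_vfio_dev_name(char *buf, size_t len, bool noiommu, int index)
{
	assert(buf && len > 0);

	return snprintf(buf, len, "%svfio%d", noiommu ? "noiommu-" : "", index);
}

/* Userspace-side check: does a node name carry the noiommu- prefix? */
static bool demo_is_noiommu_name(const char *name)
{
	return strncmp(name, "noiommu-", strlen("noiommu-")) == 0;
}
```

A userspace tool scanning /dev/vfio/devices could use a prefix check like this to avoid opening an unsafe noiommu device by accident.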
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/iommufd/hw_pagetable.c | 8 ++++++
drivers/vfio/Kconfig | 3 +--
drivers/vfio/device_cdev.c | 6 +++++
drivers/vfio/vfio.h | 38 +++++++++++++++++++++++++++-
drivers/vfio/vfio_main.c | 16 +++++++++---
include/linux/vfio.h | 2 ++
6 files changed, 67 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index fe789c2dc0c9..8bf76b2002b4 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -345,6 +345,14 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
struct iommufd_device *idev;
int rc;
+ /*
+ * For devices operating in no-IOMMU mode, permit only the automatic
+ * domain HWPT, which uses the generic iommupt to provide mock page
+ * tables.
+ */
+ if (ucmd->ictx->no_iommu_mode)
+ return -EOPNOTSUPP;
+
if (cmd->__reserved)
return -EOPNOTSUPP;
if ((cmd->data_type == IOMMU_HWPT_DATA_NONE && cmd->data_len) ||
diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index ceae52fd7586..7a06a4a9dabe 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -22,8 +22,7 @@ config VFIO_DEVICE_CDEV
The VFIO device cdev is another way for userspace to get device
access. Userspace gets device fd by opening device cdev under
/dev/vfio/devices/vfioX, and then bind the device fd with an iommufd
- to set up secure DMA context for device access. This interface does
- not support noiommu.
+ to set up secure DMA context for device access.
If you don't know what to do here, say N.
diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index 480cac3a0c27..50724e3653ef 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -132,6 +132,12 @@ long vfio_df_ioctl_bind_iommufd(struct vfio_device_file *df,
goto out_unlock;
}
+ if (device->noiommu) {
+ ret = iommufd_vfio_set_no_iommu(df->iommufd);
+ if (ret)
+ goto out_unlock;
+ }
+
/*
* Before the device open, get the KVM pointer currently
* associated with the device file (if there is) and obtain
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 50128da18bca..4c36de62a5f2 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -118,6 +118,19 @@ static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
vdev->group->type == VFIO_NO_IOMMU;
}
+
+static inline bool vfio_cdev_is_noiommu(struct vfio_device *vdev)
+{
+ return IS_ENABLED(CONFIG_NOIOMMU_MODE_IOMMU) && vfio_noiommu_enabled();
+}
+
+static inline int vfio_device_set_no_iommu(struct vfio_device *vdev)
+{
+ if (vfio_device_is_noiommu(vdev) || vfio_cdev_is_noiommu(vdev))
+ vdev->noiommu = true;
+
+ return 0;
+}
#else
struct vfio_group;
@@ -193,6 +206,25 @@ static inline bool vfio_device_is_noiommu(struct vfio_device *vdev)
{
return false;
}
+
+static inline int vfio_device_set_no_iommu(struct vfio_device *vdev)
+{
+ struct iommu_group *iommu_group = iommu_group_get(vdev->dev);
+
+ /* Do not support group device noiommu mode simultaneously */
+ if (iommu_group) {
+ vdev->noiommu = false;
+ iommu_group_put(iommu_group);
+ return -EINVAL;
+ }
+
+ if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu)
+ return -EINVAL;
+
+ vdev->noiommu = true;
+
+ return 0;
+}
#endif /* CONFIG_VFIO_GROUP */
#if IS_ENABLED(CONFIG_VFIO_CONTAINER)
@@ -359,7 +391,11 @@ void vfio_init_device_cdev(struct vfio_device *device);
static inline int vfio_device_add(struct vfio_device *device)
{
- /* cdev does not support noiommu device */
+ /*
+ * cdev does not support noiommu device for VFIO_NOIOMMU group type.
+ * However, under IOMMUFD with dummy iommu driver, noiommu mode is
+ * also supported for cdev devices.
+ */
if (vfio_device_is_noiommu(device))
return device_add(&device->device);
vfio_init_device_cdev(device);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 805d30b0b82f..c8aefa13bda3 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -60,6 +60,10 @@ bool vfio_noiommu __read_mostly;
module_param_named(enable_unsafe_noiommu_mode,
vfio_noiommu, bool, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode. This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel. If you do not know what this is for, step away. (default: false)");
+bool vfio_noiommu_enabled(void)
+{
+ return vfio_noiommu;
+}
#endif
static DEFINE_XARRAY(vfio_device_set_xa);
@@ -327,13 +331,19 @@ static int __vfio_register_dev(struct vfio_device *device,
if (!device->dev_set)
vfio_assign_device_set(device, device);
- ret = dev_set_name(&device->device, "vfio%d", device->index);
+ ret = vfio_device_set_group(device, type);
if (ret)
return ret;
- ret = vfio_device_set_group(device, type);
+ ret = vfio_device_set_no_iommu(device);
if (ret)
- return ret;
+ goto err_out;
+
+ /* To be safe, give noiommu cdev nodes an explicit noiommu- prefix */
+ ret = dev_set_name(&device->device, "%svfio%d",
+ device->noiommu ? "noiommu-" : "", device->index);
+ if (ret)
+ goto err_out;
/*
* VFIO always sets IOMMU_CACHE because we offer no way for userspace to
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index eb563f538dee..944e74dc3da0 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -71,6 +71,7 @@ struct vfio_device {
u8 iommufd_attached:1;
#endif
u8 cdev_opened:1;
+ u8 noiommu:1;
#ifdef CONFIG_DEBUG_FS
/*
* debug_root is a static property of the vfio_device
@@ -331,6 +332,7 @@ static inline bool vfio_file_has_dev(struct file *file, struct vfio_device *devi
return false;
}
#endif
+bool vfio_noiommu_enabled(void);
bool vfio_file_is_valid(struct file *file);
bool vfio_file_enforced_coherent(struct file *file);
void vfio_file_set_kvm(struct file *file, struct kvm *kvm);
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [RFC 8/8] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (6 preceding siblings ...)
2025-12-01 17:30 ` [RFC 7/8] iommu: Enable cdev noiommu mode under iommufd Jacob Pan
@ 2025-12-01 17:30 ` Jacob Pan
2026-01-30 19:35 ` [RFC 0/8] iommufd: Enable noiommu mode for cdev Jason Gunthorpe
8 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2025-12-01 17:30 UTC (permalink / raw)
To: linux-kernel, iommu@lists.linux.dev, Jason Gunthorpe,
Alex Williamson, Joerg Roedel, Will Deacon, Robin Murphy,
Nicolin Chen, Tian, Kevin, Liu, Yi L
Cc: skhawaja, pasha.tatashin, Jacob Pan, Zhang Yu,
Jean Philippe-Brucker, David Matlack
To support no-IOMMU mode, where userspace drivers perform unsafe DMA
using physical addresses, introduce a new API to retrieve the
physical address of a user-allocated DMA buffer that has been mapped to
an IOVA via IOAS. The mapping is backed by mock I/O page tables maintained
by the generic IOMMUPT framework.
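From userspace, the proposed ioctl might be called roughly as follows. This is a sketch under stated assumptions: the struct is a local mirror of the layout proposed in this RFC (it is not in any released UAPI header), the DEMO_* names and demo_get_pa() are hypothetical, and the ';' type value is assumed to match IOMMUFD_TYPE in the upstream iommufd UAPI header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of struct iommu_ioas_get_pa as proposed in this RFC. */
struct demo_ioas_get_pa {
	uint32_t size;
	uint32_t flags;
	uint32_t ioas_id;
	uint32_t reserved;
	uint64_t iova;   /* in: IOVA to query */
	uint64_t length; /* out: contiguous bytes starting at phys */
	uint64_t phys;   /* out: physical address the IOVA maps to */
};

/* Assumed to match IOMMUFD_TYPE and the 0x95 command number from the patch. */
#define DEMO_IOMMUFD_TYPE ';'
#define DEMO_IOMMU_IOAS_GET_PA _IO(DEMO_IOMMUFD_TYPE, 0x95)

/*
 * Query the physical address behind one IOVA of an IOAS.
 * Returns 0 on success, -1 on failure (errno set by ioctl()).
 */
static int demo_get_pa(int iommufd, uint32_t ioas_id, uint64_t iova,
		       uint64_t *phys, uint64_t *length)
{
	struct demo_ioas_get_pa cmd;

	assert(phys && length);

	memset(&cmd, 0, sizeof(cmd));
	cmd.size = sizeof(cmd);
	cmd.ioas_id = ioas_id;
	cmd.iova = iova;

	if (ioctl(iommufd, DEMO_IOMMU_IOAS_GET_PA, &cmd) < 0)
		return -1;

	*phys = cmd.phys;
	*length = cmd.length;
	return 0;
}
```

Since the RFC currently reports a length of PAGE_SIZE, a caller covering a larger buffer would presumably repeat the query once per page.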
Link: https://lore.kernel.org/linux-iommu/20250603175403.GA407344@nvidia.com/
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
drivers/iommu/iommufd/io_pagetable.c | 44 +++++++++++++++++++++++++
drivers/iommu/iommufd/ioas.c | 24 ++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 3 ++
drivers/iommu/iommufd/main.c | 3 ++
include/uapi/linux/iommufd.h | 25 ++++++++++++++
5 files changed, 99 insertions(+)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index c0360c450880..134a16acb44f 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -813,6 +813,50 @@ int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
return iopt_unmap_iova_range(iopt, iova, iova_last, unmapped);
}
+int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova,
+ phys_addr_t *paddr, u64 *length)
+{
+ unsigned long area_iova;
+ struct iopt_area *area;
+ unsigned long offset;
+ int rc = 0;
+
+ down_read(&iopt->iova_rwsem);
+ area = iopt_area_iter_first(iopt, iova, iova);
+ if (!area || !area->pages) {
+ pr_warn("%s: No area for iova 0x%lx\n", __func__, iova);
+ rc = -ENOENT;
+ goto unlock_exit;
+ }
+
+ if (!area->storage_domain) {
+ pr_warn("%s: area has no storage_domain\n", __func__);
+ rc = -EINVAL;
+ goto unlock_exit;
+ }
+
+ area_iova = iopt_area_iova(area);
+ offset = iova - area_iova;
+ *paddr = iommu_iova_to_phys(area->storage_domain, iova);
+ if (!*paddr) {
+ pr_warn("%s: No paddr for iova 0x%lx\n", __func__, iova);
+ rc = -EINVAL;
+ goto unlock_exit;
+ }
+ /*
+ * TBD: we can return contiguous IOVA length so that userspace can
+ * keep searching for next physical address.
+ * e.g.
+ * iopt_area_length(area) - offset;
+ */
+ *length = PAGE_SIZE;
+
+unlock_exit:
+ up_read(&iopt->iova_rwsem);
+
+ return rc;
+}
+
int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped)
{
int rc;
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 1542c5fd10a8..c11c5fce955a 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -377,6 +377,30 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
return rc;
}
+int iommufd_ioas_get_pa(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_ioas_get_pa *cmd = ucmd->cmd;
+ struct iommufd_ioas *ioas;
+ int rc;
+
+ ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ rc = iopt_get_phys(&ioas->iopt, cmd->iova, &cmd->phys, &cmd->length);
+ if (rc) {
+ pr_err("%s: Failed to get PA for IOVA 0x%llx length 0x%llx: %d\n",
+ __func__, cmd->iova, cmd->length, rc);
+ goto out_put;
+ }
+
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+out_put:
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
+
+ return rc;
+}
+
static void iommufd_release_all_iova_rwsem(struct iommufd_ctx *ictx,
struct xarray *ioas_list)
{
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 627f9b78483a..f74a0aea70bf 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -117,6 +117,8 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
unsigned long length, unsigned long *unmapped);
int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped);
+int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova,
+ phys_addr_t *paddr, u64 *length);
int iopt_read_and_clear_dirty_data(struct io_pagetable *iopt,
struct iommu_domain *domain,
@@ -345,6 +347,7 @@ int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
int iommufd_ioas_change_process(struct iommufd_ucmd *ucmd);
int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
+int iommufd_ioas_get_pa(struct iommufd_ucmd *ucmd);
int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
int iommufd_option_rlimit_mode(struct iommu_option *cmd,
struct iommufd_ctx *ictx);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index ce775fbbae94..37e785d0e40d 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -432,6 +432,7 @@ union ucmd_buffer {
struct iommu_veventq_alloc veventq;
struct iommu_vfio_ioas vfio_ioas;
struct iommu_viommu_alloc viommu;
+ struct iommu_ioas_get_pa get_pa;
#ifdef CONFIG_IOMMUFD_TEST
struct iommu_test_cmd test;
#endif
@@ -484,6 +485,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
struct iommu_ioas_map_file, iova),
IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
length),
+ IOCTL_OP(IOMMU_IOAS_GET_PA, iommufd_ioas_get_pa, struct iommu_ioas_get_pa,
+ phys),
IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option, val64),
IOCTL_OP(IOMMU_VDEVICE_ALLOC, iommufd_vdevice_alloc_ioctl,
struct iommu_vdevice_alloc, virt_id),
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index c218c89e0e2e..915cb128f220 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -57,6 +57,7 @@ enum {
IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94,
+ IOMMUFD_CMD_IOAS_GET_PA = 0x95,
};
/**
@@ -219,6 +220,30 @@ struct iommu_ioas_map {
};
#define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
+/**
+ * struct iommu_ioas_get_pa - ioctl(IOMMU_IOAS_GET_PA)
+ * @size: sizeof(struct iommu_ioas_get_pa)
+ * @flags: TBD
+ * @ioas_id: IOAS ID to query IOVA to PA mapping from
+ * @__reserved: Must be 0
+ * @iova: IOVA to query
+ * @length: Output number of contiguous physical bytes starting at @phys
+ * @phys: Output physical address the IOVA maps to
+ *
+ * Query the physical address backing an IOVA range. The entire range must be
+ * mapped already. For noiommu devices doing unsafe DMA only.
+ */
+struct iommu_ioas_get_pa {
+ __u32 size;
+ __u32 flags;
+ __u32 ioas_id;
+ __u32 __reserved;
+ __aligned_u64 iova;
+ __aligned_u64 length;
+ __aligned_u64 phys;
+};
+#define IOMMU_IOAS_GET_PA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_GET_PA)
+
/**
* struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
* @size: sizeof(struct iommu_ioas_map_file)
--
2.34.1
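
For reference, a minimal userspace sketch of how the proposed ioctl might be driven. It assumes the uAPI exactly as posted in this RFC: the struct layout and command number 0x95 are copied from the patch above and are not in any released header, so they are mirrored locally here. IOMMUFD_TYPE is ';' as in the existing include/uapi/linux/iommufd.h. The helper names get_pa_init() and ioas_get_pa() are hypothetical. The excerpt does not say whether @length is an input, an output, or both; the sketch assumes the caller passes the length it wants covered and reads back @phys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the RFC's proposed uAPI; NOT part of a released header. */
#define IOMMUFD_TYPE (';')
#define IOMMUFD_CMD_IOAS_GET_PA 0x95

struct iommu_ioas_get_pa {
	uint32_t size;
	uint32_t flags;
	uint32_t ioas_id;
	uint32_t __reserved;
	uint64_t iova __attribute__((aligned(8)));
	uint64_t length __attribute__((aligned(8)));
	uint64_t phys __attribute__((aligned(8)));
};
#define IOMMU_IOAS_GET_PA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_GET_PA)

/*
 * The IOCTL_OP() entry names 'phys' as the last mandatory member, so the
 * smallest struct the kernel would accept ends right after 'phys'.
 */
static_assert(offsetof(struct iommu_ioas_get_pa, phys) + sizeof(uint64_t) ==
	      sizeof(struct iommu_ioas_get_pa), "phys must be the last member");
static_assert(sizeof(struct iommu_ioas_get_pa) == 40, "unexpected layout");

/* Hypothetical helper: zero the command and fill in the query fields. */
static void get_pa_init(struct iommu_ioas_get_pa *cmd, uint32_t ioas_id,
			uint64_t iova, uint64_t length)
{
	memset(cmd, 0, sizeof(*cmd));	/* flags and __reserved must be 0 */
	cmd->size = sizeof(*cmd);
	cmd->ioas_id = ioas_id;
	cmd->iova = iova;
	cmd->length = length;		/* assumption: treated as an input */
}

/*
 * Hypothetical helper: issue the query on an open /dev/iommu fd.
 * Returns 0 on success with cmd->phys filled in, or -1 with errno set.
 */
static int ioas_get_pa(int iommufd, struct iommu_ioas_get_pa *cmd)
{
	return ioctl(iommufd, IOMMU_IOAS_GET_PA, cmd);
}
```

A caller would open /dev/iommu, allocate an IOAS, map a buffer, then call get_pa_init() followed by ioas_get_pa() and program the device with cmd.phys. On a kernel without this series the ioctl is expected to fail, so the return value has to be checked.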
* Re: [RFC 0/8] iommufd: Enable noiommu mode for cdev
2025-12-01 17:30 [RFC 0/8] iommufd: Enable noiommu mode for cdev Jacob Pan
` (7 preceding siblings ...)
2025-12-01 17:30 ` [RFC 8/8] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA Jacob Pan
@ 2026-01-30 19:35 ` Jason Gunthorpe
2026-02-06 22:50 ` Jacob Pan
8 siblings, 1 reply; 23+ messages in thread
From: Jason Gunthorpe @ 2026-01-30 19:35 UTC (permalink / raw)
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Alex Williamson,
Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin, Zhang Yu,
Jean Philippe-Brucker, David Matlack
On Mon, Dec 01, 2025 at 09:30:04AM -0800, Jacob Pan wrote:
> VFIO's unsafe_noiommu_mode has long provided a way for userspace drivers
> to operate on platforms lacking a hardware IOMMU. Today, IOMMUFD also
> supports No-IOMMU mode for group based devices under vfio_compat mode.
> However, IOMMUFD's native character device (cdev) does not yet implement
> No-IOMMU mode, which is the purpose of this patch. In summary, we have:
I tried to build this without the fake iommu driver and it worked out OK:
https://github.com/jgunthorpe/linux/commits/iommufd_noiommu/
Needs a little more checking than I might have done, but can you take
that and try for a submission?
Thanks,
Jason
* Re: [RFC 0/8] iommufd: Enable noiommu mode for cdev
2026-01-30 19:35 ` [RFC 0/8] iommufd: Enable noiommu mode for cdev Jason Gunthorpe
@ 2026-02-06 22:50 ` Jacob Pan
0 siblings, 0 replies; 23+ messages in thread
From: Jacob Pan @ 2026-02-06 22:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: linux-kernel, iommu@lists.linux.dev, Alex Williamson,
Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
Tian, Kevin, Liu, Yi L, skhawaja, pasha.tatashin, Zhang Yu,
Jean Philippe-Brucker, David Matlack
Hi Jason,
On Fri, 30 Jan 2026 15:35:29 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Mon, Dec 01, 2025 at 09:30:04AM -0800, Jacob Pan wrote:
> > VFIO's unsafe_noiommu_mode has long provided a way for userspace
> > drivers to operate on platforms lacking a hardware IOMMU. Today,
> > IOMMUFD also supports No-IOMMU mode for group based devices under
> > vfio_compat mode. However, IOMMUFD's native character device (cdev)
> > does not yet implement No-IOMMU mode, which is the purpose of this
> > patch. In summary, we have:
>
> I tried to build this without the fake iommu driver and it worked out
> OK:
>
> https://github.com/jgunthorpe/linux/commits/iommufd_noiommu/
>
> Needs a little more checking than I might have done, but can you take
> that and try for a submission?
>
Will look into it.
Thanks
Jacob