* [PATCH 0/2] iommu/riscv: Enable IOMMU_DMA
@ 2026-05-08 21:23 Andrew Jones
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
2026-05-08 21:23 ` [PATCH 2/2] iommu/dma: enable IOMMU_DMA for RISC-V Andrew Jones
0 siblings, 2 replies; 10+ messages in thread
From: Andrew Jones @ 2026-05-08 21:23 UTC (permalink / raw)
To: linux-riscv, iommu; +Cc: linux-kernel, tjeznach, joro, will, pjw, palmer, anup
Arguably long overdue, let's start using paging domains. One blocker
to enabling IOMMU_DMA was that platforms with IMSICs would fault on
MSIs; patch 1 handles that. And, since QEMU is still one of the
most-used RISC-V platforms, another issue is that commit 69541898b71a
("iommu/riscv: Enable SVNAPOT support for contiguous ptes") exposes
a bug in the QEMU RISC-V IOMMU model. A patch for that is now on the
QEMU list[1].
Rest assured that the irqbypass work will get a v3 posted soon. This
series can be independently merged though since we don't need irqbypass
to enable paging domains and deliver MSIs for host devices.
[1] https://lore.kernel.org/all/20260508205129.377032-1-andrew.jones@oss.qualcomm.com/
Andrew Jones (1):
iommu/riscv: Map IMSIC addresses for paging domains
Tomasz Jeznach (1):
iommu/dma: enable IOMMU_DMA for RISC-V
drivers/iommu/Kconfig | 2 +-
drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
include/linux/irqchip/riscv-imsic.h | 7 ++++++
3 files changed, 42 insertions(+), 1 deletion(-)
--
2.43.0
* [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-08 21:23 [PATCH 0/2] iommu/riscv: Enable IOMMU_DMA Andrew Jones
@ 2026-05-08 21:23 ` Andrew Jones
2026-05-09 2:21 ` fangyu.yu
` (2 more replies)
2026-05-08 21:23 ` [PATCH 2/2] iommu/dma: enable IOMMU_DMA for RISC-V Andrew Jones
1 sibling, 3 replies; 10+ messages in thread
From: Andrew Jones @ 2026-05-08 21:23 UTC (permalink / raw)
To: linux-riscv, iommu; +Cc: linux-kernel, tjeznach, joro, will, pjw, palmer, anup
When IOMMU_DMA is enabled, devices get paging domains and MSI writes
to IMSIC interrupt files must be handled correctly in the s-stage.
As the device always writes to the host physical IMSIC addresses,
which the IMSIC irqchip programs directly, install s-stage identity
mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
since the 1:1 mappings aren't required for device assignment.
Loop over the cpus rather than imsic groups to handle asymmetric
configurations.
Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
---
drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
include/linux/irqchip/riscv-imsic.h | 7 ++++++
2 files changed, 41 insertions(+)
diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
index a31f50bbad35..3c6aa9d69f95 100644
--- a/drivers/iommu/riscv/iommu.c
+++ b/drivers/iommu/riscv/iommu.c
@@ -19,6 +19,7 @@
#include <linux/init.h>
#include <linux/iommu.h>
#include <linux/iopoll.h>
+#include <linux/irqchip/riscv-imsic.h>
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/generic_pt/iommu.h>
@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
return &domain->domain;
}
+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
+{
+ const struct imsic_global_config *imsic_global;
+ unsigned int cpu;
+
+ if (!imsic_enabled())
+ return;
+
+ imsic_global = imsic_get_global_config();
+
+ for_each_possible_cpu(cpu) {
+ const struct imsic_local_config *local;
+ struct iommu_resv_region *reg;
+
+ local = per_cpu_ptr(imsic_global->local, cpu);
+ if (!local->msi_va)
+ continue;
+
+ /*
+ * The device always writes to the host physical IMSIC address, so install
+ * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
+ * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
+ * devices.
+ */
+ reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
+ IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
+ IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
+ if (reg)
+ list_add_tail(&reg->list, head);
+ }
+}
+
static int riscv_iommu_attach_blocking_domain(struct iommu_domain *iommu_domain,
struct device *dev,
struct iommu_domain *old)
@@ -1401,6 +1434,7 @@ static const struct iommu_ops riscv_iommu_ops = {
.blocked_domain = &riscv_iommu_blocking_domain,
.release_domain = &riscv_iommu_blocking_domain,
.domain_alloc_paging = riscv_iommu_alloc_paging_domain,
+ .get_resv_regions = riscv_iommu_get_resv_regions,
.device_group = riscv_iommu_device_group,
.probe_device = riscv_iommu_probe_device,
.release_device = riscv_iommu_release_device,
diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
index 4b348836de7a..ba3000f047b0 100644
--- a/include/linux/irqchip/riscv-imsic.h
+++ b/include/linux/irqchip/riscv-imsic.h
@@ -88,6 +88,13 @@ static inline const struct imsic_global_config *imsic_get_global_config(void)
#endif
+static inline bool imsic_enabled(void)
+{
+ const struct imsic_global_config *imsic_global = imsic_get_global_config();
+
+ return imsic_global && imsic_global->nr_ids;
+}
+
#if IS_ENABLED(CONFIG_ACPI) && IS_ENABLED(CONFIG_RISCV_IMSIC)
int imsic_platform_acpi_probe(struct fwnode_handle *fwnode);
struct fwnode_handle *imsic_acpi_get_fwnode(struct device *dev);
--
2.43.0
* [PATCH 2/2] iommu/dma: enable IOMMU_DMA for RISC-V
2026-05-08 21:23 [PATCH 0/2] iommu/riscv: Enable IOMMU_DMA Andrew Jones
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
@ 2026-05-08 21:23 ` Andrew Jones
1 sibling, 0 replies; 10+ messages in thread
From: Andrew Jones @ 2026-05-08 21:23 UTC (permalink / raw)
To: linux-riscv, iommu
Cc: linux-kernel, tjeznach, joro, will, pjw, palmer, anup, Nutty Liu
From: Tomasz Jeznach <tjeznach@rivosinc.com>
With the iommu/riscv driver available, we can enable IOMMU_DMA support
for the RISC-V architecture.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
---
drivers/iommu/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f86262b11416..34d8a792339f 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -151,7 +151,7 @@ config OF_IOMMU
# IOMMU-agnostic DMA-mapping layer
config IOMMU_DMA
- def_bool ARM64 || X86 || S390
+ def_bool ARM64 || X86 || S390 || RISCV
select DMA_OPS_HELPERS
select IOMMU_API
select IOMMU_IOVA
--
2.43.0
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
@ 2026-05-09 2:21 ` fangyu.yu
2026-05-09 19:47 ` Andrew Jones
2026-05-12 13:38 ` Jason Gunthorpe
2026-05-12 17:21 ` Tomasz Jeznach
2 siblings, 1 reply; 10+ messages in thread
From: fangyu.yu @ 2026-05-09 2:21 UTC (permalink / raw)
To: andrew.jones
Cc: anup, iommu, joro, linux-kernel, linux-riscv, palmer, pjw,
tjeznach, will
>When IOMMU_DMA is enabled, devices get paging domains and MSI writes
>to IMSIC interrupt files must be handled correctly in the s-stage.
>As the device always writes to the host physical IMSIC addresses,
>which the IMSIC irqchip programs directly, install s-stage identity
>mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
>since the 1:1 mappings aren't required for device assignment.
>
>Loop over the cpus rather than imsic groups to handle asymmetric
>configurations.
>
>Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
>---
> drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
> include/linux/irqchip/riscv-imsic.h | 7 ++++++
> 2 files changed, 41 insertions(+)
>
>diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
>index a31f50bbad35..3c6aa9d69f95 100644
>--- a/drivers/iommu/riscv/iommu.c
>+++ b/drivers/iommu/riscv/iommu.c
>@@ -19,6 +19,7 @@
> #include <linux/init.h>
> #include <linux/iommu.h>
> #include <linux/iopoll.h>
>+#include <linux/irqchip/riscv-imsic.h>
> #include <linux/kernel.h>
> #include <linux/pci.h>
> #include <linux/generic_pt/iommu.h>
>@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
> return &domain->domain;
> }
>
>+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
>+{
>+ const struct imsic_global_config *imsic_global;
>+ unsigned int cpu;
>+
>+ if (!imsic_enabled())
>+ return;
>+
>+ imsic_global = imsic_get_global_config();
>+
>+ for_each_possible_cpu(cpu) {
>+ const struct imsic_local_config *local;
>+ struct iommu_resv_region *reg;
>+
>+ local = per_cpu_ptr(imsic_global->local, cpu);
>+ if (!local->msi_va)
>+ continue;
>+
>+ /*
>+ * The device always writes to the host physical IMSIC address, so install
>+ * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
>+ * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
>+ * devices.
>+ */
>+ reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
>+ IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
>+ IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
>+ if (reg)
>+ list_add_tail(&reg->list, head);
>+ }
>+}
>+
Hi Andrew,
Thanks for picking this up -- enabling IOMMU_DMA on RISC-V has been
a long-standing gap, and handling the IMSIC MSI mapping is the missing
piece that finally unblocks it.
One concern is that the current implementation emits one 4 KiB
RESV_DIRECT_RELAXABLE region for each possible CPU. On platforms
with hundreds of harts, this noticeably increases the cost of both
.get_resv_regions() and the iommu_create_device_direct_mappings()
walk.
Since interrupt files within one IMSIC group occupy a physically
contiguous range, would it make sense to emit one region per IMSIC
group covering the full group stride, aligned down/up to 2 MiB so
the core can map it as a superpage? This would over-map some padding
within the IMSIC PA window, but RESV_DIRECT_RELAXABLE keeps the
padding out of assigned-device IOVA space, so it looks harmless.
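To make the idea concrete, here is a minimal userspace sketch (not kernel
code and not part of the patch) of the coalescing step. It assumes the
per-hart page addresses are already sorted, and it merges by address rather
than by IMSIC group, which is a simplification of the per-group proposal.
The names struct region_sketch, coalesce_pages() and SZ_2M_SKETCH are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_2M_SKETCH (UINT64_C(2) << 20)

/* Hypothetical stand-in for a reserved region: start address and length. */
struct region_sketch {
	uint64_t start;
	uint64_t len;
};

/* Round x down to a power-of-two alignment. */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return x & ~(align - 1);
}

/* Round x up to a power-of-two alignment. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return align_down(x + align - 1, align);
}

/*
 * Expand each page to its aligned window and merge windows that overlap
 * or abut the previous region. 'pages' must be sorted in ascending order.
 * Returns the number of regions written; stops early if 'out' is full.
 */
static size_t coalesce_pages(const uint64_t *pages, size_t n,
			     uint64_t page_sz, uint64_t align,
			     struct region_sketch *out, size_t out_cap)
{
	size_t nr = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t start = align_down(pages[i], align);
		uint64_t end = align_up(pages[i] + page_sz, align);

		if (nr) {
			uint64_t prev_end = out[nr - 1].start + out[nr - 1].len;

			if (start <= prev_end) {
				/* Same or adjacent window: extend the previous region. */
				if (end > prev_end)
					out[nr - 1].len = end - out[nr - 1].start;
				continue;
			}
		}

		if (nr == out_cap)
			break;

		out[nr].start = start;
		out[nr].len = end - start;
		nr++;
	}

	return nr;
}
```

With this scheme, several 4 KiB pages that fall in the same 2 MiB window
collapse into one region. Because abutting windows are merged too, two
adjacent groups could end up in a single region, which may or may not be
what we want.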
Thanks,
Fangyu
> static int riscv_iommu_attach_blocking_domain(struct iommu_domain *iommu_domain,
> struct device *dev,
> struct iommu_domain *old)
>@@ -1401,6 +1434,7 @@ static const struct iommu_ops riscv_iommu_ops = {
> .blocked_domain = &riscv_iommu_blocking_domain,
> .release_domain = &riscv_iommu_blocking_domain,
> .domain_alloc_paging = riscv_iommu_alloc_paging_domain,
>+ .get_resv_regions = riscv_iommu_get_resv_regions,
> .device_group = riscv_iommu_device_group,
> .probe_device = riscv_iommu_probe_device,
> .release_device = riscv_iommu_release_device,
>diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
>index 4b348836de7a..ba3000f047b0 100644
>--- a/include/linux/irqchip/riscv-imsic.h
>+++ b/include/linux/irqchip/riscv-imsic.h
>@@ -88,6 +88,13 @@ static inline const struct imsic_global_config *imsic_get_global_config(void)
>
> #endif
>
>+static inline bool imsic_enabled(void)
>+{
>+ const struct imsic_global_config *imsic_global = imsic_get_global_config();
>+
>+ return imsic_global && imsic_global->nr_ids;
>+}
>+
> #if IS_ENABLED(CONFIG_ACPI) && IS_ENABLED(CONFIG_RISCV_IMSIC)
> int imsic_platform_acpi_probe(struct fwnode_handle *fwnode);
> struct fwnode_handle *imsic_acpi_get_fwnode(struct device *dev);
>--
>2.43.0
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-09 2:21 ` fangyu.yu
@ 2026-05-09 19:47 ` Andrew Jones
2026-05-10 14:40 ` fangyu.yu
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2026-05-09 19:47 UTC (permalink / raw)
To: fangyu.yu
Cc: anup, iommu, joro, linux-kernel, linux-riscv, palmer, pjw,
tjeznach, will
On Sat, May 09, 2026 at 10:21:13AM +0800, fangyu.yu@linux.alibaba.com wrote:
> >When IOMMU_DMA is enabled, devices get paging domains and MSI writes
> >to IMSIC interrupt files must be handled correctly in the s-stage.
> >As the device always writes to the host physical IMSIC addresses,
> >which the IMSIC irqchip programs directly, install s-stage identity
> >mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
> >since the 1:1 mappings aren't required for device assignment.
> >
> >Loop over the cpus rather than imsic groups to handle asymmetric
> >configurations.
> >
> >Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
> >---
> > drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
> > include/linux/irqchip/riscv-imsic.h | 7 ++++++
> > 2 files changed, 41 insertions(+)
> >
> >diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
> >index a31f50bbad35..3c6aa9d69f95 100644
> >--- a/drivers/iommu/riscv/iommu.c
> >+++ b/drivers/iommu/riscv/iommu.c
> >@@ -19,6 +19,7 @@
> > #include <linux/init.h>
> > #include <linux/iommu.h>
> > #include <linux/iopoll.h>
> >+#include <linux/irqchip/riscv-imsic.h>
> > #include <linux/kernel.h>
> > #include <linux/pci.h>
> > #include <linux/generic_pt/iommu.h>
> >@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
> > return &domain->domain;
> > }
> >
> >+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
> >+{
> >+ const struct imsic_global_config *imsic_global;
> >+ unsigned int cpu;
> >+
> >+ if (!imsic_enabled())
> >+ return;
> >+
> >+ imsic_global = imsic_get_global_config();
> >+
> >+ for_each_possible_cpu(cpu) {
> >+ const struct imsic_local_config *local;
> >+ struct iommu_resv_region *reg;
> >+
> >+ local = per_cpu_ptr(imsic_global->local, cpu);
> >+ if (!local->msi_va)
> >+ continue;
> >+
> >+ /*
> >+ * The device always writes to the host physical IMSIC address, so install
> >+ * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
> >+ * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
> >+ * devices.
> >+ */
> >+ reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
> >+ IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
> >+ IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
> >+ if (reg)
> >+ list_add_tail(&reg->list, head);
> >+ }
> >+}
> >+
>
> Hi Andrew,
>
> Thanks for picking this up -- enabling IOMMU_DMA on RISC-V has been
> a long-standing gap, and handling the IMSIC MSI mapping is the missing
> piece that finally unblocks it.
>
> One concern is that the current implementation emits one 4 KiB
> RESV_DIRECT_RELAXABLE region for each possible CPU. On platforms
> with hundreds of harts, this noticeably increases the cost of both
> .get_resv_regions() and the iommu_create_device_direct_mappings()
> walk.
>
> Since interrupt files within one IMSIC group occupy a physically
> contiguous range, would it make sense to emit one region per IMSIC
> group covering the full group stride, aligned down/up to 2 MiB so
> the core can map it as a superpage? This would over-map some padding
> within the IMSIC PA window, but RESV_DIRECT_RELAXABLE keeps the
> padding out of assigned-device IOVA space, so it looks harmless.
>
Hi Fangyu,
Thanks for pointing out this issue. We'll need to decide how much we
want to isolate the devices from VMs in order to address it, though,
because, if we do group mappings, then we'll also be exposing the guest
interrupt files to the devices.
I'll certainly send a v2 to do larger mappings when s-mode imsics are
contiguous (nr-guest-files = 0) and then we can consider creating a
way to opt-in to group mappings even when nr-guest-files != 0. How's
that sound?
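For anyone following along, a rough userspace sketch of why the guest
files matter here. It assumes my reading of the AIA layout: each hart owns
a window of (4 KiB << guest_index_bits) within its group, with the S-level
file in the first page and the guest files in the pages that follow. The
struct imsic_layout_sketch and the helper names are hypothetical; only
IMSIC_MMIO_PAGE_SHIFT, IMSIC_MMIO_PAGE_SZ and guest_index_bits mirror names
used by the irqchip code.

```c
#include <assert.h>
#include <stdint.h>

#define IMSIC_MMIO_PAGE_SHIFT	12
#define IMSIC_MMIO_PAGE_SZ	(UINT64_C(1) << IMSIC_MMIO_PAGE_SHIFT)

/* Hypothetical, simplified view of one IMSIC group. */
struct imsic_layout_sketch {
	uint64_t group_base;		/* PA of hart 0's S-level file */
	unsigned int guest_index_bits;	/* log2 of pages per hart window */
};

/* Distance between the S-level files of consecutive harts. */
static uint64_t hart_stride(const struct imsic_layout_sketch *l)
{
	return IMSIC_MMIO_PAGE_SZ << l->guest_index_bits;
}

/* PA of the S-level interrupt file of a hart within the group. */
static uint64_t s_file_pa(const struct imsic_layout_sketch *l,
			  unsigned int hart)
{
	return l->group_base + (uint64_t)hart * hart_stride(l);
}

/* PA of guest interrupt file 'guest' (1-based) of a hart. */
static uint64_t guest_file_pa(const struct imsic_layout_sketch *l,
			      unsigned int hart, unsigned int guest)
{
	assert(guest >= 1 && guest < (1u << l->guest_index_bits));
	return s_file_pa(l, hart) + (uint64_t)guest * IMSIC_MMIO_PAGE_SZ;
}

/* S-level files are back to back only when there are no guest pages. */
static int s_files_contiguous(const struct imsic_layout_sketch *l)
{
	return hart_stride(l) == IMSIC_MMIO_PAGE_SZ;
}
```

Under these assumptions, with guest_index_bits = 0 a single range covers
only S-level files, while with guest_index_bits != 0 any range spanning
more than one hart necessarily covers the guest files that sit between the
S-level files.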
Thanks,
drew
* Re: Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-09 19:47 ` Andrew Jones
@ 2026-05-10 14:40 ` fangyu.yu
0 siblings, 0 replies; 10+ messages in thread
From: fangyu.yu @ 2026-05-10 14:40 UTC (permalink / raw)
To: andrew.jones
Cc: anup, fangyu.yu, iommu, joro, linux-kernel, linux-riscv, palmer,
pjw, tjeznach, will
>> >When IOMMU_DMA is enabled, devices get paging domains and MSI writes
>> >to IMSIC interrupt files must be handled correctly in the s-stage.
>> >As the device always writes to the host physical IMSIC addresses,
>> >which the IMSIC irqchip programs directly, install s-stage identity
>> >mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
>> >since the 1:1 mappings aren't required for device assignment.
>> >
>> >Loop over the cpus rather than imsic groups to handle asymmetric
>> >configurations.
>> >
>> >Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
>> >---
>> > drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
>> > include/linux/irqchip/riscv-imsic.h | 7 ++++++
>> > 2 files changed, 41 insertions(+)
>> >
>> >diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
>> >index a31f50bbad35..3c6aa9d69f95 100644
>> >--- a/drivers/iommu/riscv/iommu.c
>> >+++ b/drivers/iommu/riscv/iommu.c
>> >@@ -19,6 +19,7 @@
>> > #include <linux/init.h>
>> > #include <linux/iommu.h>
>> > #include <linux/iopoll.h>
>> >+#include <linux/irqchip/riscv-imsic.h>
>> > #include <linux/kernel.h>
>> > #include <linux/pci.h>
>> > #include <linux/generic_pt/iommu.h>
>> >@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
>> > return &domain->domain;
>> > }
>> >
>> >+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
>> >+{
>> >+ const struct imsic_global_config *imsic_global;
>> >+ unsigned int cpu;
>> >+
>> >+ if (!imsic_enabled())
>> >+ return;
>> >+
>> >+ imsic_global = imsic_get_global_config();
>> >+
>> >+ for_each_possible_cpu(cpu) {
>> >+ const struct imsic_local_config *local;
>> >+ struct iommu_resv_region *reg;
>> >+
>> >+ local = per_cpu_ptr(imsic_global->local, cpu);
>> >+ if (!local->msi_va)
>> >+ continue;
>> >+
>> >+ /*
>> >+ * The device always writes to the host physical IMSIC address, so install
>> >+ * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
>> >+ * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
>> >+ * devices.
>> >+ */
>> >+ reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
>> >+ IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
>> >+ IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
>> >+ if (reg)
>> >+ list_add_tail(&reg->list, head);
>> >+ }
>> >+}
>> >+
>>
>> Hi Andrew,
>>
>> Thanks for picking this up -- enabling IOMMU_DMA on RISC-V has been
>> a long-standing gap, and handling the IMSIC MSI mapping is the missing
>> piece that finally unblocks it.
>>
>> One concern is that the current implementation emits one 4 KiB
>> RESV_DIRECT_RELAXABLE region for each possible CPU. On platforms
>> with hundreds of harts, this noticeably increases the cost of both
>> .get_resv_regions() and the iommu_create_device_direct_mappings()
>> walk.
>>
>> Since interrupt files within one IMSIC group occupy a physically
>> contiguous range, would it make sense to emit one region per IMSIC
>> group covering the full group stride, aligned down/up to 2 MiB so
>> the core can map it as a superpage? This would over-map some padding
>> within the IMSIC PA window, but RESV_DIRECT_RELAXABLE keeps the
>> padding out of assigned-device IOVA space, so it looks harmless.
>>
>
>Hi Fangyu,
>
>Thanks for pointing out this issue. We'll need to decide how much we
>want to isolate the devices from VMs in order to address it, though,
>because, if we do group mappings, then we'll also be exposing the guest
>interrupt files to the devices.
>
>I'll certainly send a v2 to do larger mappings when s-mode imsics are
>contiguous (nr-guest-files = 0) and then we can consider creating a
>way to opt-in to group mappings even when nr-guest-files != 0. How's
>that sound?
>
Hi Andrew,
As you mentioned, my suggestion could also expose the guest interrupt
files to the device-visible/mappable range, but the current approach
already appears to provide only limited isolation. So I think this
change would not materially increase the risk.
In any case, I respect your decision, and we can certainly proceed
with your current suggestion for now.
Thanks,
Fangyu
>Thanks,
>drew
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
2026-05-09 2:21 ` fangyu.yu
@ 2026-05-12 13:38 ` Jason Gunthorpe
2026-05-12 16:22 ` Andrew Jones
2026-05-12 17:21 ` Tomasz Jeznach
2 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2026-05-12 13:38 UTC (permalink / raw)
To: Andrew Jones
Cc: linux-riscv, iommu, linux-kernel, tjeznach, joro, will, pjw,
palmer, anup
On Fri, May 08, 2026 at 04:23:38PM -0500, Andrew Jones wrote:
> +static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
> +{
> + const struct imsic_global_config *imsic_global;
> + unsigned int cpu;
> +
> + if (!imsic_enabled())
> + return;
> +
> + imsic_global = imsic_get_global_config();
> +
> + for_each_possible_cpu(cpu) {
> + const struct imsic_local_config *local;
> + struct iommu_resv_region *reg;
> +
> + local = per_cpu_ptr(imsic_global->local, cpu);
> + if (!local->msi_va)
> + continue;
> +
> + /*
> + * The device always writes to the host physical IMSIC address, so install
> + * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
> + * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
> + * devices.
Oh? Why not?
> + reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
> + IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
> + IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
> + if (reg)
> + list_add_tail(&reg->list, head);
This seems like quite a hack; the ARM way seems much better. The
interrupt controller should be using the iommu_dma_prepare_msi() path
to obtain an appropriately translated MSI address for the aperture.
Then things will work correctly with VFIO too.
Jason
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-12 13:38 ` Jason Gunthorpe
@ 2026-05-12 16:22 ` Andrew Jones
2026-05-12 16:33 ` Jason Gunthorpe
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2026-05-12 16:22 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: linux-riscv, iommu, linux-kernel, tjeznach, joro, will, pjw,
palmer, anup
On Tue, May 12, 2026 at 10:38:54AM -0300, Jason Gunthorpe wrote:
> On Fri, May 08, 2026 at 04:23:38PM -0500, Andrew Jones wrote:
> > +static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
> > +{
> > + const struct imsic_global_config *imsic_global;
> > + unsigned int cpu;
> > +
> > + if (!imsic_enabled())
> > + return;
> > +
> > + imsic_global = imsic_get_global_config();
> > +
> > + for_each_possible_cpu(cpu) {
> > + const struct imsic_local_config *local;
> > + struct iommu_resv_region *reg;
> > +
> > + local = per_cpu_ptr(imsic_global->local, cpu);
> > + if (!local->msi_va)
> > + continue;
> > +
> > + /*
> > + * The device always writes to the host physical IMSIC address, so install
> > + * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
> > + * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
> > + * devices.
>
> Oh? Why not?
Hi Jason,
I should change the comment above to be stronger. 'not required' sounds
like a choice is being made, but guest devices must not have mappings to
host IMSICs - that would break their isolation.
RISC-V AIA has the concept of guest interrupt files. Assigned devices must
write the addresses of those interrupt files to deliver MSIs to guest
IMSICs (virtual IMSICs). Also, the VMM can map the virtual IMSICs where it
likes, which will not necessarily be the same addresses the host IMSICs
use. We need the irqbypass series I'm working on for guests, not these
direct mappings.
Also, I actually discovered IOMMU_RESV_DIRECT_RELAXABLE after first trying
IOMMU_RESV_DIRECT which resulted in "Firmware has requested this device
have a 1:1 IOMMU mapping, rejecting..." failures when running the vfio
kselftests.
>
> > + reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
> > + IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
> > + IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
> > + if (reg)
> > + list_add_tail(&reg->list, head);
>
> This seems like quite a hack; the ARM way seems much better. The
> interrupt controller should be using the iommu_dma_prepare_msi() path
> to obtain an appropriately translated MSI address for the aperture.
The difference between ARM and RISC-V is that on RISC-V each CPU has an
IMSIC and the device will target any one of them, depending on its current
affinity (i.e. the MSI target changes with irq-set-affinity). ARM has a
single doorbell address which has a single IOVA->PA mapping created for
it that never changes. The ITS manages everything, including affinity
changes. Even when I get an IR irqdomain posted that implements
irq-set-affinity to help further isolate devices on the host, we'll still
want get_resv to pre-create these direct mappings since we can't
create/alloc them at irq-set-affinity time which runs in atomic context.
Thanks,
drew
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-12 16:22 ` Andrew Jones
@ 2026-05-12 16:33 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2026-05-12 16:33 UTC (permalink / raw)
To: Andrew Jones
Cc: linux-riscv, iommu, linux-kernel, tjeznach, joro, will, pjw,
palmer, anup
On Tue, May 12, 2026 at 11:22:46AM -0500, Andrew Jones wrote:
> > > + /*
> > > + * The device always writes to the host physical IMSIC address, so install
> > > + * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
> > > + * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
> > > + * devices.
> >
> > Oh? Why not?
>
> Hi Jason,
>
> I should change the comment above to be stronger. 'not required' sounds
> like a choice is being made, but guest devices must not have mappings to
> host IMSICs - that would break their isolation.
VFIO is more than virtualization, and it is certainly required to have
these mappings to operate interrupts in normal non-virtualization VFIO
cases.
> RISC-V AIA has the concept of guest interrupt files. Assigned devices must
> write the addresses of those interrupt files to deliver MSIs to guest
> IMSICs (virtual IMSICs). Also, the VMM can map the virtual IMSICs where it
> likes, which will not necessarily be the same addresses the host IMSICs
> use. We need the irqbypass series I'm working on for guests, not these
> direct mappings.
I remember going over this and it was decided it couldn't do the MSI
security so it was in trouble..
And mapping the virtual IMSICs is problematic too, ARM has the same
issue with its ITS page that we never solved.
The proper comment is that using IOMMU_RESV_DIRECT_RELAXABLE causes
interrupts to be unavailable to VFIO in the normal ways, which I think
is not acceptable for an IOMMU driver.
> > > + reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
> > > + IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
> > > + IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
> > > + if (reg)
> > > + list_add_tail(&reg->list, head);
> >
> > This seems like quite a hack; the ARM way seems much better. The
> > interrupt controller should be using the iommu_dma_prepare_msi() path
> > to obtain an appropriately translated MSI address for the aperture.
>
> The difference between ARM and RISC-V is that on RISC-V each CPU has an
> IMSIC and the device will target any one of them, depending on its current
> affinity (i.e. the MSI target changes with irq-set-affinity). ARM has a
> single doorbell address which has a single IOVA->PA mapping created for
> it that never changes. The ITS manages everything, including affinity
> changes. Even when I get an IR irqdomain posted that implements
> irq-set-affinity to help further isolate devices on the host, we'll still
> want get_resv to pre-create these direct mappings since we can't
> create/alloc them at irq-set-affinity time which runs in atomic context.
You'd have to use the iommu_dma_prepare_msi() path to pre-create all
the IMSIC mappings when the IRQ is first attached.
The reserved regions were supposed to come from FW. The API path to
connect the irq driver to the iommu is through
iommu_dma_prepare_msi().
If you mix them up like this then your VFIO is broken.
Jason
* Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
2026-05-09 2:21 ` fangyu.yu
2026-05-12 13:38 ` Jason Gunthorpe
@ 2026-05-12 17:21 ` Tomasz Jeznach
2 siblings, 0 replies; 10+ messages in thread
From: Tomasz Jeznach @ 2026-05-12 17:21 UTC (permalink / raw)
To: Andrew Jones
Cc: linux-riscv, iommu, linux-kernel, joro, will, pjw, palmer, anup
Hi,
On Fri, May 8, 2026 at 2:23 PM Andrew Jones
<andrew.jones@oss.qualcomm.com> wrote:
>
> When IOMMU_DMA is enabled, devices get paging domains and MSI writes
> to IMSIC interrupt files must be handled correctly in the s-stage.
> As the device always writes to the host physical IMSIC addresses,
> which the IMSIC irqchip programs directly, install s-stage identity
> mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
> since the 1:1 mappings aren't required for device assignment.
>
Devices are expected to send MSI writes as MemWr with AT=00b, which
triggers address translation by the responsible IOMMU. As long as the
MSI target address is identical to the IMSIC physical address, direct
mapping might work; however, I consider this an interim hack to
achieve basic MSI forwarding. It is quite similar to a workaround I
previously used with IOMMU_RESV_MSI regions [1], which worked well
enough with VFIO interfaces (assuming unsafe interrupts are
acceptable).
I would prefer to use the iommu_dma_prepare_msi() path, as Jason
suggested, or implement complete IRQ remapping for RISC-V IOMMU.
[1] https://github.com/tjeznach/linux/commit/3a165e1b8f7cc4d00770a932fc0840cfed760485
> Loop over the cpus rather than imsic groups to handle asymmetric
> configurations.
>
> Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
> ---
> drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
> include/linux/irqchip/riscv-imsic.h | 7 ++++++
> 2 files changed, 41 insertions(+)
>
> diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
> index a31f50bbad35..3c6aa9d69f95 100644
> --- a/drivers/iommu/riscv/iommu.c
> +++ b/drivers/iommu/riscv/iommu.c
> @@ -19,6 +19,7 @@
> #include <linux/init.h>
> #include <linux/iommu.h>
> #include <linux/iopoll.h>
> +#include <linux/irqchip/riscv-imsic.h>
> #include <linux/kernel.h>
> #include <linux/pci.h>
> #include <linux/generic_pt/iommu.h>
> @@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
> return &domain->domain;
> }
>
> +static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
> +{
> + const struct imsic_global_config *imsic_global;
> + unsigned int cpu;
> +
> + if (!imsic_enabled())
> + return;
> +
> + imsic_global = imsic_get_global_config();
> +
> + for_each_possible_cpu(cpu) {
> + const struct imsic_local_config *local;
> + struct iommu_resv_region *reg;
> +
> + local = per_cpu_ptr(imsic_global->local, cpu);
> + if (!local->msi_va)
> + continue;
> +
> + /*
> + * The device always writes to the host physical IMSIC address, so install
> + * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
> + * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
> + * devices.
> + */
As mentioned earlier, the comment "1:1 mappings are not required for
assigned devices" is inaccurate, assuming we'd like MSI to work.
> + reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
> + IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
> + IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
> + if (reg)
> + list_add_tail(&reg->list, head);
> + }
> +}
> +
> static int riscv_iommu_attach_blocking_domain(struct iommu_domain *iommu_domain,
> struct device *dev,
> struct iommu_domain *old)
> @@ -1401,6 +1434,7 @@ static const struct iommu_ops riscv_iommu_ops = {
> .blocked_domain = &riscv_iommu_blocking_domain,
> .release_domain = &riscv_iommu_blocking_domain,
> .domain_alloc_paging = riscv_iommu_alloc_paging_domain,
> + .get_resv_regions = riscv_iommu_get_resv_regions,
> .device_group = riscv_iommu_device_group,
> .probe_device = riscv_iommu_probe_device,
> .release_device = riscv_iommu_release_device,
> diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
> index 4b348836de7a..ba3000f047b0 100644
> --- a/include/linux/irqchip/riscv-imsic.h
> +++ b/include/linux/irqchip/riscv-imsic.h
> @@ -88,6 +88,13 @@ static inline const struct imsic_global_config *imsic_get_global_config(void)
>
> #endif
>
> +static inline bool imsic_enabled(void)
> +{
> + const struct imsic_global_config *imsic_global = imsic_get_global_config();
> +
> + return imsic_global && imsic_global->nr_ids;
> +}
> +
> #if IS_ENABLED(CONFIG_ACPI) && IS_ENABLED(CONFIG_RISCV_IMSIC)
> int imsic_platform_acpi_probe(struct fwnode_handle *fwnode);
> struct fwnode_handle *imsic_acpi_get_fwnode(struct device *dev);
> --
> 2.43.0
>
Best,
- Tomasz