From: fangyu.yu@linux.alibaba.com
To: andrew.jones@oss.qualcomm.com
Cc: anup@brainfault.org, iommu@lists.linux.dev, joro@8bytes.org,
linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
palmer@dabbelt.com, pjw@kernel.org, tjeznach@rivosinc.com,
will@kernel.org
Subject: Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
Date: Sat, 9 May 2026 10:21:13 +0800 [thread overview]
Message-ID: <20260509022113.53400-1-fangyu.yu@linux.alibaba.com> (raw)
In-Reply-To: <20260508212339.381933-2-andrew.jones@oss.qualcomm.com>
>When IOMMU_DMA is enabled, devices get paging domains and MSI writes
>to IMSIC interrupt files must be handled correctly in the s-stage.
>As the device always writes to the host physical IMSIC addresses,
>which the IMSIC irqchip programs directly, install s-stage identity
>mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
>since the 1:1 mappings aren't required for device assignment.
>
>Loop over the cpus rather than imsic groups to handle asymmetric
>configurations.
>
>Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
>---
> drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
> include/linux/irqchip/riscv-imsic.h | 7 ++++++
> 2 files changed, 41 insertions(+)
>
>diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
>index a31f50bbad35..3c6aa9d69f95 100644
>--- a/drivers/iommu/riscv/iommu.c
>+++ b/drivers/iommu/riscv/iommu.c
>@@ -19,6 +19,7 @@
> #include <linux/init.h>
> #include <linux/iommu.h>
> #include <linux/iopoll.h>
>+#include <linux/irqchip/riscv-imsic.h>
> #include <linux/kernel.h>
> #include <linux/pci.h>
> #include <linux/generic_pt/iommu.h>
>@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
> return &domain->domain;
> }
>
>+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
>+{
>+ const struct imsic_global_config *imsic_global;
>+ unsigned int cpu;
>+
>+ if (!imsic_enabled())
>+ return;
>+
>+ imsic_global = imsic_get_global_config();
>+
>+ for_each_possible_cpu(cpu) {
>+ const struct imsic_local_config *local;
>+ struct iommu_resv_region *reg;
>+
>+ local = per_cpu_ptr(imsic_global->local, cpu);
>+ if (!local->msi_va)
>+ continue;
>+
>+ /*
>+ * The device always writes to the host physical IMSIC address, so install
>+ * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
>+ * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
>+ * devices.
>+ */
>+ reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
>+ IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
>+ IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
>+ if (reg)
>+			list_add_tail(&reg->list, head);
>+ }
>+}
>+
Hi Andrew,
Thanks for picking this up -- enabling IOMMU_DMA on RISC-V has been a
long-standing gap, and handling the IMSIC MSI mapping is the missing
piece that finally unblocks it.
One concern is that the current implementation emits one 4 KiB
RESV_DIRECT_RELAXABLE region for each possible CPU. On platforms
with hundreds of harts, this noticeably increases the cost of both
.get_resv_regions() and the iommu_create_device_direct_mappings()
walk.
Since interrupt files within one IMSIC group occupy a physically
contiguous range, would it make sense to emit one region per IMSIC
group covering the full group stride, aligned down/up to 2 MiB so
the core can map it as a superpage? This would over-map some padding
within the IMSIC PA window, but RESV_DIRECT_RELAXABLE keeps the
padding out of assigned-device IOVA space, so it looks harmless.
Thanks,
Fangyu
> static int riscv_iommu_attach_blocking_domain(struct iommu_domain *iommu_domain,
> struct device *dev,
> struct iommu_domain *old)
>@@ -1401,6 +1434,7 @@ static const struct iommu_ops riscv_iommu_ops = {
> .blocked_domain = &riscv_iommu_blocking_domain,
> .release_domain = &riscv_iommu_blocking_domain,
> .domain_alloc_paging = riscv_iommu_alloc_paging_domain,
>+ .get_resv_regions = riscv_iommu_get_resv_regions,
> .device_group = riscv_iommu_device_group,
> .probe_device = riscv_iommu_probe_device,
> .release_device = riscv_iommu_release_device,
>diff --git a/include/linux/irqchip/riscv-imsic.h b/include/linux/irqchip/riscv-imsic.h
>index 4b348836de7a..ba3000f047b0 100644
>--- a/include/linux/irqchip/riscv-imsic.h
>+++ b/include/linux/irqchip/riscv-imsic.h
>@@ -88,6 +88,13 @@ static inline const struct imsic_global_config *imsic_get_global_config(void)
>
> #endif
>
>+static inline bool imsic_enabled(void)
>+{
>+ const struct imsic_global_config *imsic_global = imsic_get_global_config();
>+
>+ return imsic_global && imsic_global->nr_ids;
>+}
>+
> #if IS_ENABLED(CONFIG_ACPI) && IS_ENABLED(CONFIG_RISCV_IMSIC)
> int imsic_platform_acpi_probe(struct fwnode_handle *fwnode);
> struct fwnode_handle *imsic_acpi_get_fwnode(struct device *dev);
>--
>2.43.0
Thread overview: 22+ messages
2026-05-08 21:23 [PATCH 0/2] iommu/riscv: Enable IOMMU_DMA Andrew Jones
2026-05-08 21:23 ` [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains Andrew Jones
2026-05-09  2:21   ` fangyu.yu [this message]
2026-05-09 19:47     ` Andrew Jones
2026-05-10 14:40       ` fangyu.yu
2026-05-12 13:38         ` Jason Gunthorpe
2026-05-12 16:22           ` Andrew Jones
2026-05-12 16:33             ` Jason Gunthorpe
2026-05-12 17:21             ` Tomasz Jeznach
2026-05-12 20:28               ` Andrew Jones
2026-05-08 21:23 ` [PATCH 2/2] iommu/dma: enable IOMMU_DMA for RISC-V Andrew Jones