From: fangyu.yu@linux.alibaba.com
To: andrew.jones@oss.qualcomm.com
Cc: anup@brainfault.org, fangyu.yu@linux.alibaba.com,
iommu@lists.linux.dev, joro@8bytes.org,
linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
palmer@dabbelt.com, pjw@kernel.org, tjeznach@rivosinc.com,
will@kernel.org
Subject: Re: Re: [PATCH 1/2] iommu/riscv: Map IMSIC addresses for paging domains
Date: Sun, 10 May 2026 22:40:38 +0800 [thread overview]
Message-ID: <20260510144038.54523-1-fangyu.yu@linux.alibaba.com> (raw)
In-Reply-To: <fhv3w4kmxd5kw26kdlj4uj7cgr5mhov7rytph6dedokdpe36d4@svjdhnilgnwe>
>> >When IOMMU_DMA is enabled, devices get paging domains and MSI writes
>> >to IMSIC interrupt files must be handled correctly in the s-stage.
>> >As the device always writes to the host physical IMSIC addresses,
>> >which the IMSIC irqchip programs directly, install s-stage identity
>> >mappings for the host IMSICs. But, use IOMMU_RESV_DIRECT_RELAXABLE
>> >since the 1:1 mappings aren't required for device assignment.
>> >
>> >Loop over the cpus rather than imsic groups to handle asymmetric
>> >configurations.
>> >
>> >Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
>> >---
>> > drivers/iommu/riscv/iommu.c | 34 +++++++++++++++++++++++++++++
>> > include/linux/irqchip/riscv-imsic.h | 7 ++++++
>> > 2 files changed, 41 insertions(+)
>> >
>> >diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
>> >index a31f50bbad35..3c6aa9d69f95 100644
>> >--- a/drivers/iommu/riscv/iommu.c
>> >+++ b/drivers/iommu/riscv/iommu.c
>> >@@ -19,6 +19,7 @@
>> > #include <linux/init.h>
>> > #include <linux/iommu.h>
>> > #include <linux/iopoll.h>
>> >+#include <linux/irqchip/riscv-imsic.h>
>> > #include <linux/kernel.h>
>> > #include <linux/pci.h>
>> > #include <linux/generic_pt/iommu.h>
>> >@@ -1286,6 +1287,38 @@ static struct iommu_domain *riscv_iommu_alloc_paging_domain(struct device *dev)
>> > return &domain->domain;
>> > }
>> >
>> >+static void riscv_iommu_get_resv_regions(struct device *dev, struct list_head *head)
>> >+{
>> >+	const struct imsic_global_config *imsic_global;
>> >+	unsigned int cpu;
>> >+
>> >+	if (!imsic_enabled())
>> >+		return;
>> >+
>> >+	imsic_global = imsic_get_global_config();
>> >+
>> >+	for_each_possible_cpu(cpu) {
>> >+		const struct imsic_local_config *local;
>> >+		struct iommu_resv_region *reg;
>> >+
>> >+		local = per_cpu_ptr(imsic_global->local, cpu);
>> >+		if (!local->msi_va)
>> >+			continue;
>> >+
>> >+		/*
>> >+		 * The device always writes to the host physical IMSIC address, so install
>> >+		 * identity mappings directly. Use IOMMU_RESV_DIRECT_RELAXABLE instead of
>> >+		 * IOMMU_RESV_DIRECT since these 1:1 mappings are not required for assigned
>> >+		 * devices.
>> >+		 */
>> >+		reg = iommu_alloc_resv_region(local->msi_pa, IMSIC_MMIO_PAGE_SZ,
>> >+					      IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
>> >+					      IOMMU_RESV_DIRECT_RELAXABLE, GFP_KERNEL);
>> >+		if (reg)
>> >+			list_add_tail(&reg->list, head);
>> >+	}
>> >+}
>> >+
>>
>> Hi Andrew,
>>
>> Thanks for picking this up -- enabling IOMMU_DMA on RISC-V has been
>> a long-standing gap, and handling the IMSIC MSI mapping is the missing
>> piece that finally unblocks it.
>>
>> One concern is that the current implementation emits one 4 KiB
>> RESV_DIRECT_RELAXABLE region for each possible CPU. On platforms
>> with hundreds of harts, this noticeably increases the cost of both
>> .get_resv_regions() and the iommu_create_device_direct_mappings()
>> walk.
>>
>> Since interrupt files within one IMSIC group occupy a physically
>> contiguous range, would it make sense to emit one region per IMSIC
>> group covering the full group stride, aligned down/up to 2 MiB so
>> the core can map it as a superpage? This would over-map some padding
>> within the IMSIC PA window, but RESV_DIRECT_RELAXABLE keeps the
>> padding out of assigned-device IOVA space, so it looks harmless.
>>
>
>Hi Fangyu,
>
>Thanks for pointing out this issue. We'll need to decide how much we
>want to isolate the devices from VMs in order to address it, though,
>because, if we do group mappings, then we'll also be exposing the guest
>interrupt files to the devices.
>
>I'll certainly send a v2 to do larger mappings when s-mode imsics are
>contiguous (nr-guest-files = 0) and then we can consider creating a
>way to opt-in to group mappings even when nr-guest-files != 0. How's
>that sound?
>
Hi Andrew,
As you mentioned, my suggestion would also expose the guest interrupt
files to the device-visible/mappable range. However, the current
approach already provides only limited isolation, so I don't think the
change would materially increase the risk.
In any case, I respect your decision, and we can certainly proceed
with your current suggestion for now.
Thanks,
Fangyu
>Thanks,
>drew