From: Andrew Jones <ajones@ventanamicro.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
iommu@lists.linux.dev, kvm-riscv@lists.infradead.org,
kvm@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, zong.li@sifive.com,
tjeznach@rivosinc.com, joro@8bytes.org, will@kernel.org,
robin.murphy@arm.com, anup@brainfault.org, atish.patra@linux.dev,
alex.williamson@redhat.com, paul.walmsley@sifive.com,
palmer@dabbelt.com, alex@ghiti.fr
Subject: Re: [RFC PATCH v2 08/18] iommu/riscv: Use MSI table to enable IMSIC access
Date: Tue, 23 Sep 2025 10:12:42 -0500
Message-ID: <20250923-b85e3309c54eaff1cdfddcf9@orel>
In-Reply-To: <20250923140646.GM1391379@nvidia.com>
On Tue, Sep 23, 2025 at 11:06:46AM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 23, 2025 at 12:12:52PM +0200, Thomas Gleixner wrote:
> > With a remapping domain intermediary this looks like this:
> >
> > [ CPU domain ] --- [ Remap domain] --- [ MSI domain ] -- device
> >
> > device driver allocates an MSI interrupt in the MSI domain
> >
> > MSI domain allocates an interrupt in the Remap domain
> >
> > Remap domain allocates a resource in the remap space, e.g. an entry
> > in the remap translation table and then allocates an interrupt in the
> > CPU domain.
>
> Thanks!
>
> And to be very crystal clear here, the meaning of
> IRQ_DOMAIN_FLAG_ISOLATED_MSI is that the remap domain has a security
> feature such that the device can only trigger CPU domain interrupts
> that have been explicitly allocated in the remap domain for that
> device. The device can never go through the remap domain and trigger
> some other device's interrupt.
>
> This is usually done by having the remap domain's HW take in the
> Addr/Data pair, do a per-BDF table lookup and then completely replace
> the Addr/Data pair with the "remapped" version. By fully replacing the
> pair, the remap domain prevents the device from generating a
> disallowed addr/data pair toward the CPU domain.
>
> It fundamentally must be done by having the HW do a per-RID/BDF table
> lookup based on the incoming MSI addr/data and fully sanitize the
> resulting output.
>
> There is some legacy history here. When MSI was first invented the
> goal was to make interrupts scalable by removing any state from the
> CPU side. The device would be told what Addr/Data to send to the CPU
> and the CPU would just take some encoded information in that pair as a
> delivery instruction. No state on the CPU side per interrupt.
>
> In the world of virtualization it was realized this is not secure, so
> the archs undid the core principle of MSI and the CPU HW has some kind
> of state/table entry for every single device interrupt source.
>
> x86/AMD did this by having per-device remapping tables in their IOMMU
> device context that are selected by incoming RID and effectively
> completely rewrite the addr/data pair before it reaches the APIC. The
> remap table alone now basically specifies where the interrupt is
> delivered.
>
> ARM doesn't do remapping, instead the interrupt controller itself has
> a table that converts (BDF,Data) into a delivery instruction. It is
> inherently secure.
Thanks, Jason. All the above information is very much appreciated,
particularly the history.
>
> That flag has nothing to do with affinity.
>
So the reason I keep bringing affinity into the context of isolation is
that, for MSI-capable RISC-V, each CPU has its own MSI controller (IMSIC).
As RISC-V is missing data validation, it's closer to the legacy, insecure
description above, but the "The device would be told what Addr/Data to
send to the CPU and the CPU would just take some encoded information in
that pair as a delivery instruction" part becomes "Addr is used to select
a CPU and then the CPU would take some encoded information in Data as the
delivery instruction". Since setting irq affinity is a way to restrict Addr
to one of a particular set of CPUs, a device cannot raise interrupts
on CPUs outside that set. And only interrupts that the allowed set of
CPUs are aware of may be raised. As a device's irqs move around, whether
from irqbalance or a user's selection, we can ensure that only the CPU an
irq should be able to reach is reachable by managing the IOMMU MSI table.
This gives us some level of isolation, but there is still the possibility
that a device raises an interrupt it should not be able to when its irqs
are affined to the same CPU as another device's and the malicious/broken
device uses the wrong MSI data. For the non-virt case it's fair to say
that's nowhere near isolated enough. However, for the virt case, Addr is
set to guest interrupt files (something like virtual IMSICs), which means
there will be no other host device or other guest device irqs sharing
those Addrs. Interrupts for devices assigned to guests are truly isolated
(not within the guest, but we need nested support to fully isolate within
the guest anyway).
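To make the gap concrete, here is a toy model of the check described
above. It is only a sketch, not the driver code: struct msi_table,
msi_table_allow(), msi_table_revoke() and msi_write_allowed() are names
made up for illustration, and the per-device MSI table is reduced to a
bitmask of CPUs whose IMSIC the device may write to. The point is that
Addr is filtered but Data is never looked at.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* assumption: small fixed CPU count for the model */

/* Hypothetical per-device MSI table: bit N set => CPU N's IMSIC is mapped. */
struct msi_table {
	uint32_t allowed_cpus;
};

/* An irq of this device was affined to @cpu: map that CPU's IMSIC. */
static void msi_table_allow(struct msi_table *t, unsigned int cpu)
{
	if (cpu < NR_CPUS)
		t->allowed_cpus |= 1u << cpu;
}

/* No irq of this device targets @cpu any longer: unmap its IMSIC. */
static void msi_table_revoke(struct msi_table *t, unsigned int cpu)
{
	if (cpu < NR_CPUS)
		t->allowed_cpus &= ~(1u << cpu);
}

/*
 * Model of the check applied to a device's MSI write. Addr is reduced
 * to the CPU it selects. @data (the interrupt identity) is deliberately
 * ignored, because nothing validates it in this model.
 */
static bool msi_write_allowed(const struct msi_table *t,
			      unsigned int cpu, uint32_t data)
{
	(void)data;	/* not validated: this is the isolation gap */

	if (cpu >= NR_CPUS)
		return false;
	return (t->allowed_cpus & (1u << cpu)) != 0;
}
```

With two devices whose irqs are both affined to CPU 1, a write from
either device to CPU 1 passes this check with any Data value, including
the other device's interrupt identity, while a write to CPU 2 is
rejected for both. In the virt case the "CPU" selected by Addr would be
a guest interrupt file that no other device's table maps, so the
unchecked Data cannot reach anyone else's interrupts.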
In v1, I tried to only turn IRQ_DOMAIN_FLAG_ISOLATED_MSI on for the virt
case, but, as you pointed out, that wasn't a good idea. For v2, I was
hoping the comment above the flag was enough, but thinking about it some
more, I agree it's not. I'm not sure what we can do for this other than
an IOMMU spec change at this point.
Thanks,
drew