From: Andrew Jones <ajones@ventanamicro.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: iommu@lists.linux.dev, kvm-riscv@lists.infradead.org,
kvm@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, tjeznach@rivosinc.com,
zong.li@sifive.com, joro@8bytes.org, will@kernel.org,
robin.murphy@arm.com, anup@brainfault.org,
atishp@atishpatra.org, tglx@linutronix.de,
alex.williamson@redhat.com, paul.walmsley@sifive.com,
palmer@dabbelt.com, aou@eecs.berkeley.edu
Subject: Re: [RFC PATCH 08/15] iommu/riscv: Add IRQ domain for interrupt remapping
Date: Fri, 22 Nov 2024 18:07:59 +0100
Message-ID: <20241122-8c00551e2383787346c5249f@orel>
In-Reply-To: <20241122153340.GC773835@ziepe.ca>

On Fri, Nov 22, 2024 at 11:33:40AM -0400, Jason Gunthorpe wrote:
> On Fri, Nov 22, 2024 at 04:11:36PM +0100, Andrew Jones wrote:
>
> > The reason is that the RISC-V IOMMU only checks the MSI table, i.e.
> > enables its support for MSI remapping, when the g-stage (second-stage)
> > page table is in use. However, the expected virtual memory scheme for an
> > OS to use for DMA would be to have s-stage (first-stage) in use and the
> > g-stage set to 'Bare' (not in use).
>
> That isn't really a technical reason.
>
> > IOW, it doesn't appear the spec authors expected MSI remapping to
> > be enabled for the host DMA use case. That does make some sense,
> > since it's actually not necessary. For the host DMA use case,
> > providing mappings for each s-mode interrupt file which the device
> > is allowed to write to in the s-stage page table sufficiently
> > enables MSIs to be delivered.
>
> Well, that seems to be the main problem here. You are grappling with a
> spec design that doesn't match the SW expectations. Since it has
> deviated from what everyone else has done you now have extra
> challenges to resolve in some way.
>
> Just always using interrupt remapping if the HW is capable of
> interrupt remapping and ignoring the spec "expectation" is a nice and
> simple way to make things work with existing Linux.
>
> > If "default VFIO" means VFIO without irqbypass, then it would work the
> > same as the DMA API, assuming all mappings for all necessary s-mode
> > interrupt files are created (something the DMA API needs as well).
> > However, VFIO would also need 'vfio_iommu_type1.allow_unsafe_interrupts=1'
> > to be set for this no-irqbypass configuration.
>
> Which isn't what anyone wants, you need to make the DMA API domain be
> fully functional so that VFIO works.
>
> > > That isn't ideal, the translation under the IRQs shouldn't really be
> > > changing as the translation under the IOMMU changes.
> >
> > Unless the device is assigned to a guest, then the IRQ domain wouldn't
> > do anything at all (it'd just sit between the device and the device's
> > old MSI parent domain), but it also wouldn't come and go, risking issues
> > with anything sensitive to changes in the IRQ domain hierarchy.
>
> VFIO isn't restricted to such a simple use model. You have to support
> all the generality, which includes fully supporting changing the iommu
> translation on the fly.
>
> > > Further, VFIO assumes iommu_group_has_isolated_msi(), ie
> > > IRQ_DOMAIN_FLAG_ISOLATED_MSI, is fixed while it is bound. Will that
> > > be true if the iommu is flapping all about? What will you do when VFIO
> > > has it attached to a blocked domain?
> > >
> > > It just doesn't make sense to change something so fundamental as the
> > > interrupt path on an iommu domain attachment. :\
> >
> > Yes, it does appear I should be doing this at iommu device probe time
> > instead. It won't provide any additional functionality to use cases which
> > aren't assigning devices to guests, but it also won't hurt, and it should
> > avoid the risks you point out.
>
> Even if you statically create the domain you can't change the value of
> IRQ_DOMAIN_FLAG_ISOLATED_MSI depending on what is currently attached
> to the IOMMU.
>
> What you are trying to do is not supported by the software stack right
> now. You need to make much bigger, more intrusive changes, if you
> really want to make interrupt remapping dynamic.
>
Let the fun begin. I'll look into this more. It also looks like I need to
collect some test cases to ensure I can support all use cases with
whatever I propose next. Pointers for those would be welcome.
Thanks,
drew