All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Nicolin Chen <nicolinc@nvidia.com>,
	tglx@linutronix.de, bhelgaas@google.com,
	alex.williamson@redhat.com, jgg@nvidia.com, leonro@nvidia.com,
	shameerali.kolothum.thodi@huawei.com, dlemoal@kernel.org,
	kevin.tian@intel.com, smostafa@google.com,
	andriy.shevchenko@linux.intel.com, reinette.chatre@intel.com,
	eric.auger@redhat.com, ddutile@redhat.com, yebin10@huawei.com,
	brauner@kernel.org, apatel@ventanamicro.com,
	shivamurthy.shastri@linutronix.de, anna-maria@linutronix.de,
	nipun.gupta@amd.com, marek.vasut+renesas@mailbox.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector
Date: Mon, 11 Nov 2024 14:14:15 +0000	[thread overview]
Message-ID: <86pln1zwlk.wl-maz@kernel.org> (raw)
In-Reply-To: <a63e7c3b-ce96-47a5-b462-d5de3a2edb56@arm.com>

On Mon, 11 Nov 2024 13:09:20 +0000,
Robin Murphy <robin.murphy@arm.com> wrote:
> 
> On 2024-11-09 5:48 am, Nicolin Chen wrote:
> > On ARM GIC systems and others, the target address of the MSI is translated
> > by the IOMMU. For GIC, the MSI address page is called "ITS" page. When the
> > IOMMU is disabled, the MSI address is programmed to the physical location
> > of the GIC ITS page (e.g. 0x20200000). When the IOMMU is enabled, the ITS
> > page is behind the IOMMU, so the MSI address is programmed to an allocated
> > IO virtual address (a.k.a IOVA), e.g. 0xFFFF0000, which must be mapped to
> > the physical ITS page: IOVA (0xFFFF0000) ===> PA (0x20200000).
> > When a 2-stage translation is enabled, IOVA will be still used to program
> > the MSI address, though the mappings will be in two stages:
> >    IOVA (0xFFFF0000) ===> IPA (e.g. 0x80900000) ===> 0x20200000
> > (IPA stands for Intermediate Physical Address).
> > 
> > If the device that generates MSI is attached to an IOMMU_DOMAIN_DMA, the
> > IOVA is dynamically allocated from the top of the IOVA space. If attached
> > to an IOMMU_DOMAIN_UNMANAGED (e.g. a VFIO passthrough device), the IOVA is
> > fixed to an MSI window reported by the IOMMU driver via IOMMU_RESV_SW_MSI,
> > which is hardwired to MSI_IOVA_BASE (IOVA==0x8000000) for ARM IOMMUs.
> > 
> > So far, this IOMMU_RESV_SW_MSI works well as kernel is entirely in charge
> > of the IOMMU translation (1-stage translation), since the IOVA for the ITS
> > page is fixed and known by kernel. However, with virtual machine enabling
> > a nested IOMMU translation (2-stage), a guest kernel directly controls the
> > stage-1 translation with an IOMMU_DOMAIN_DMA, mapping a vITS page (at an
> > IPA 0x80900000) onto its own IOVA space (e.g. 0xEEEE0000). Then, the host
> > kernel can't know that guest-level IOVA to program the MSI address.
> > 
> > To solve this problem the VMM should capture the MSI IOVA allocated by the
> > guest kernel and relay it to the GIC driver in the host kernel, to program
> > the correct MSI IOVA. And this requires a new ioctl via VFIO.
> 
> Once VFIO has that information from userspace, though, do we really
> need the whole complicated dance to push it right down into the
> irqchip layer just so it can be passed back up again? AFAICS
> vfio_msi_set_vector_signal() via VFIO_DEVICE_SET_IRQS already
> explicitly rewrites MSI-X vectors, so it seems like it should be
> pretty straightforward to override the message address in general at
> that level, without the lower layers having to be aware at all, no?

+1.

I would like to avoid polluting each and every interrupt controller
with usage-specific knowledge (they usually are brain-damaged enough).
We already have an indirection into the IOMMU subsystem and it
shouldn't be a big deal to intercept the message for all
implementations at this level.

I also wonder how to handle the case of braindead^Wwonderful platforms
where ITS transactions are not translated by the SMMU. Somehow, VFIO
should be made aware of this situation.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply	other threads:[~2024-11-11 14:14 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-09  5:48 [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 1/7] genirq/msi: Allow preset IOVA in struct msi_desc for MSI doorbell address Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 2/7] irqchip/gic-v3-its: Bypass iommu_cookie if desc->msi_iova is preset Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 3/7] PCI/MSI: Pass in msi_iova to msi_domain_insert_msi_desc Nicolin Chen
2024-11-09  9:43   ` kernel test robot
2024-11-09  5:48 ` [PATCH RFCv1 4/7] PCI/MSI: Allow __pci_enable_msi_range to pass in iova Nicolin Chen
2024-11-11  9:30   ` Andy Shevchenko
2024-11-09  5:48 ` [PATCH RFCv1 5/7] PCI/MSI: Extract a common __pci_alloc_irq_vectors function Nicolin Chen
2024-11-11  9:33   ` Andy Shevchenko
2024-11-09  5:48 ` [PATCH RFCv1 6/7] PCI/MSI: Add pci_alloc_irq_vectors_iovas helper Nicolin Chen
2024-11-09 21:52   ` kernel test robot
2024-11-10  9:51   ` kernel test robot
2024-11-11  9:34   ` Andy Shevchenko
2024-11-12 22:14     ` Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 7/7] vfio/pci: Allow preset MSI IOVAs via VFIO_IRQ_SET_ACTION_PREPARE Nicolin Chen
2024-11-11 13:09 ` [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector Robin Murphy
2024-11-11 14:14   ` Marc Zyngier [this message]
2024-11-12 22:13     ` Nicolin Chen
2024-11-12 21:54   ` Nicolin Chen
2024-11-13  1:34     ` Jason Gunthorpe
2024-11-13 21:11       ` Alex Williamson
2024-11-14 15:35         ` Robin Murphy
2024-11-20 13:17           ` Eric Auger
2024-11-20 14:03             ` Jason Gunthorpe
2024-11-28 11:15               ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86pln1zwlk.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=anna-maria@linutronix.de \
    --cc=apatel@ventanamicro.com \
    --cc=bhelgaas@google.com \
    --cc=brauner@kernel.org \
    --cc=ddutile@redhat.com \
    --cc=dlemoal@kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=marek.vasut+renesas@mailbox.org \
    --cc=nicolinc@nvidia.com \
    --cc=nipun.gupta@amd.com \
    --cc=reinette.chatre@intel.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=shivamurthy.shastri@linutronix.de \
    --cc=smostafa@google.com \
    --cc=tglx@linutronix.de \
    --cc=yebin10@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.