Re: [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jason Gunthorpe <jgg@nvidia.com>
To: Eric Auger <eric.auger@redhat.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	tglx@linutronix.de, maz@kernel.org, bhelgaas@google.com,
	leonro@nvidia.com, shameerali.kolothum.thodi@huawei.com,
	dlemoal@kernel.org, kevin.tian@intel.com, smostafa@google.com,
	andriy.shevchenko@linux.intel.com, reinette.chatre@intel.com,
	ddutile@redhat.com, yebin10@huawei.com, brauner@kernel.org,
	apatel@ventanamicro.com, shivamurthy.shastri@linutronix.de,
	anna-maria@linutronix.de, nipun.gupta@amd.com,
	marek.vasut+renesas@mailbox.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector
Date: Wed, 20 Nov 2024 10:03:37 -0400	[thread overview]
Message-ID: <20241120140337.GA772273@nvidia.com> (raw)
In-Reply-To: <66977090-d707-4585-b0c5-8b48f663827e@redhat.com>

On Wed, Nov 20, 2024 at 02:17:46PM +0100, Eric Auger wrote:
> > Yeah, I wasn't really suggesting to literally hook into this exact
> > case; it was more just a general observation that if VFIO already has
> > one justification for tinkering with pci_write_msi_msg() directly
> > without going through the msi_domain layer, then adding another
> > (wherever it fits best) can't be *entirely* unreasonable.

I'm not sure that we can assume VFIO is the only thing touching the
interrupt programming.

I think there is a KVM path, and also the /proc/ path that will change
the MSI affinity on the fly for a VFIO created IRQ. If the platform
requires a MSI update to do this (ie encoding affinity in the
add/data, not using IRQ remapping HW) then we still need to ensure the
correct MSI address is hooked in.

> >> Is it possible to do this with the existing write_msi_msg callback on
> >> the msi descriptor?  For instance we could simply translate the msg
> >> address and call pci_write_msi_msg() (while avoiding an infinite
> >> recursion).  Or maybe there should be an xlate_msi_msg callback we can
> >> register.  Or I suppose there might be a way to insert an irqchip that
> >> does the translation on write.  Thanks,
> >
> > I'm far from keen on the idea, but if there really is an appetite for
> > more indirection, then I guess the least-worst option would be yet
> > another type of iommu_dma_cookie to work via the existing
> > iommu_dma_compose_msi_msg() flow, 

For this direction I think I would turn iommu_dma_compose_msi_msg()
into a function pointer stored in the iommu_domain and have
vfio/iommufd provide its own implementation. The thing that is in
control of the domain's translation should be providing the msi_msg.

> > update per-device addresses direcitly. But then it's still going to
> > need some kind of "layering violation" for VFIO to poke the IRQ layer
> > into re-composing and re-writing a message whenever userspace feels
> > like changing an address

I think we'd need to get into the affinity update path and force a MSI
write as well, even if the platform isn't changing the MSI for
affinity. Processing a vMSI entry update would be two steps where we
update the MSI addr in VFIO and then set the affinity.

> for the record, the first integration was based on such distinct
> iommu_dma_cookie
> 
> [PATCH v15 00/12] SMMUv3 Nested Stage Setup (IOMMU part) <https://lore.kernel.org/all/20210411111228.14386-1-eric.auger@redhat.com/#r>, patches 8 - 11

There are some significant differences from that series with this idea:

 - We want to maintain a per-MSI-index/per-device lookup table. It
   is not just a simple cookie, the msi_desc->dev &&
   msi_desc->msi_index have to be matched against what userspace
   provides in the per-vMSI IOCTL

 - There would be no implicit progamming of the stage 2, this will be
   done directly in userspace by creating an IOAS area for the ITS page

 - It shouldn't have any sort of dynamic allocation behavior. It is an
   error for the kernel to ask for an msi_desc that userspace hasn't
   provided a mapping for
 
 - It should work well with nested and non-nested domains

Jason

next prev parent reply	other threads:[~2024-11-20 14:03 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-09  5:48 [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 1/7] genirq/msi: Allow preset IOVA in struct msi_desc for MSI doorbell address Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 2/7] irqchip/gic-v3-its: Bypass iommu_cookie if desc->msi_iova is preset Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 3/7] PCI/MSI: Pass in msi_iova to msi_domain_insert_msi_desc Nicolin Chen
2024-11-09  9:43   ` kernel test robot
2024-11-09  5:48 ` [PATCH RFCv1 4/7] PCI/MSI: Allow __pci_enable_msi_range to pass in iova Nicolin Chen
2024-11-11  9:30   ` Andy Shevchenko
2024-11-09  5:48 ` [PATCH RFCv1 5/7] PCI/MSI: Extract a common __pci_alloc_irq_vectors function Nicolin Chen
2024-11-11  9:33   ` Andy Shevchenko
2024-11-09  5:48 ` [PATCH RFCv1 6/7] PCI/MSI: Add pci_alloc_irq_vectors_iovas helper Nicolin Chen
2024-11-09 21:52   ` kernel test robot
2024-11-10  9:51   ` kernel test robot
2024-11-11  9:34   ` Andy Shevchenko
2024-11-12 22:14     ` Nicolin Chen
2024-11-09  5:48 ` [PATCH RFCv1 7/7] vfio/pci: Allow preset MSI IOVAs via VFIO_IRQ_SET_ACTION_PREPARE Nicolin Chen
2024-11-11 13:09 ` [PATCH RFCv1 0/7] vfio: Allow userspace to specify the address for each MSI vector Robin Murphy
2024-11-11 14:14   ` Marc Zyngier
2024-11-12 22:13     ` Nicolin Chen
2024-11-12 21:54   ` Nicolin Chen
2024-11-13  1:34     ` Jason Gunthorpe
2024-11-13 21:11       ` Alex Williamson
2024-11-14 15:35         ` Robin Murphy
2024-11-20 13:17           ` Eric Auger
2024-11-20 14:03             ` Jason Gunthorpe [this message]
2024-11-28 11:15               ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241120140337.GA772273@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=anna-maria@linutronix.de \
    --cc=apatel@ventanamicro.com \
    --cc=bhelgaas@google.com \
    --cc=brauner@kernel.org \
    --cc=ddutile@redhat.com \
    --cc=dlemoal@kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=marek.vasut+renesas@mailbox.org \
    --cc=maz@kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=nipun.gupta@amd.com \
    --cc=reinette.chatre@intel.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=shivamurthy.shastri@linutronix.de \
    --cc=smostafa@google.com \
    --cc=tglx@linutronix.de \
    --cc=yebin10@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.