From: Nicolin Chen <nicolinc@nvidia.com>
To: Eric Auger <eric.auger@redhat.com>
Cc: Shameer Kolothum <skolothumtho@nvidia.com>, <qemu-arm@nongnu.org>,
	<qemu-devel@nongnu.org>, <peter.maydell@linaro.org>,
	<jgg@nvidia.com>, <ddutile@redhat.com>, <berrange@redhat.com>,
	<nathanc@nvidia.com>, <mochs@nvidia.com>, <smostafa@google.com>,
	<wangzhou1@hisilicon.com>, <jiangkunkun@huawei.com>,
	<jonathan.cameron@huawei.com>, <zhangfei.gao@linaro.org>,
	<zhenzhong.duan@intel.com>, <yi.l.liu@intel.com>,
	<shameerkolothum@gmail.com>
Subject: Re: [PATCH v4 11/27] hw/pci/pci: Introduce optional get_msi_address_space() callback
Date: Mon, 20 Oct 2025 11:00:21 -0700
Message-ID: <aPZ4tcsMfN+2puGL@Asurada-Nvidia>
In-Reply-To: <291fe8be-405e-4ea3-acfb-d090f6a7cd15@redhat.com>
On Mon, Oct 20, 2025 at 06:14:33PM +0200, Eric Auger wrote:
> >> This will cause the device to be configured with the wrong MSI doorbell
> >> address if it returns the system address space.
> >
> > I think it'd be nicer to elaborate why a wrong address will be returned:
> >
> > --------------------------------------------------------------------------
> > On ARM, a device behind an IOMMU requires translation for its MSI doorbell
> > address. When HW nested translation is enabled, the translation will also
> > happen in two stages: gIOVA => gPA => ITS page.
> >
> > In the accelerated SMMUv3 mode, both stages are translated by the HW. So,
> > get_address_space() returns the system address space for stage-2 mappings,
> > as the smmuv3-accel model doesn't involve in either stage.
> I don't understand "doesn't involve in either stage". It is still not
> obvious to me that for a HW accelerated nested IOMMU get_address_space()
> shall return the system address space. I think this deserves to be
> explained and maybe documented along with the callback.

get_address_space() is used by pci_device_iommu_address_space(),
which is called for device attach or for address translation.

In QEMU, we have an "iommu" type of memory region to represent
the address space providing the stage-1 translation.

In the accel case, excluding MSI, there is no need for an "emulated
iommu translation" since the HW/host SMMU takes care of both stages.
Thus, the system address space is returned for get_address_space(),
to avoid stage-1 translation and also to allow VFIO devices to attach
to the system address space, which the VFIO core monitors to take
care of stage-2 mappings.
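
To illustrate the choice described above, here is a minimal sketch in
plain C. It is a toy model, not QEMU code: ToyAddressSpace, ToySMMU and
toy_get_address_space() are hypothetical names made up for this example,
and the "accel" flag only stands in for the accelerated SMMUv3 mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an address space; not the QEMU AddressSpace type. */
typedef struct ToyAddressSpace {
    const char *name;
    bool translates_stage1;   /* true if backed by an "iommu" region */
} ToyAddressSpace;

static ToyAddressSpace toy_system_as = { "system", false };
static ToyAddressSpace toy_iommu_as  = { "smmu-stage1", true };

/* Hypothetical SMMU state: only the mode matters for this sketch. */
typedef struct ToySMMU {
    bool accel;
} ToySMMU;

/*
 * In accel mode the host SMMU handles both stages, so the emulated
 * stage-1 address space is skipped and the system address space is
 * returned. Otherwise the emulated "iommu" address space is used.
 */
static ToyAddressSpace *toy_get_address_space(const ToySMMU *s)
{
    assert(s != NULL);
    return s->accel ? &toy_system_as : &toy_iommu_as;
}
```

With accel set, the returned address space cannot do a stage-1
translation, which is fine for DMA but is what the MSI path trips over.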
> > On the other hand, this callback is also invoked by QEMU/KVM:
> >
> >  kvm_irqchip_add_msi_route()
> >    kvm_arch_fixup_msi_route()
> >      pci_device_iommu_address_space()
> >       get_address_space()
> >
> > What KVM wants is to translate an MSI doorbell gIOVA to a vITS page (gPA),
> > so as to inject IRQs to the guest VM. And it expected get_address_space()
> > to return the address space for stage-1 mappings instead. Apparently, this
> > is broken.
> "Apparently this is broken". Please clarify what is broken. Definitively if
> pci_device_iommu_address_space(dev) returns @address_system_memory, no
> translation is attempted.

Hmm, I thought my writing was clear:
 - pci_device_iommu_address_space() returns the system address
   space, which can't do a stage-1 translation.
 - The KVM/MSI pathway requires an address space that can do a
   stage-1 translation.
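
One way to picture the optional callback from the subject line is the
fallback sketch below. Only the callback names get_address_space() and
get_msi_address_space() come from this thread; DemoAddressSpace,
DemoIOMMUOps and demo_msi_address_space() are hypothetical, and the real
patch may wire this up differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an address space; not the QEMU type. */
typedef struct DemoAddressSpace {
    const char *name;
    bool has_stage1;
} DemoAddressSpace;

static DemoAddressSpace demo_system_as = { "system", false };
static DemoAddressSpace demo_stage1_as = { "stage-1", true };

/* Hypothetical ops table; get_msi_address_space may be NULL. */
typedef struct DemoIOMMUOps {
    DemoAddressSpace *(*get_address_space)(void *opaque);
    DemoAddressSpace *(*get_msi_address_space)(void *opaque);
} DemoIOMMUOps;

/* Accel-style callbacks: system AS for DMA, stage-1 AS for MSI. */
static DemoAddressSpace *demo_accel_get_as(void *opaque)
{
    (void)opaque;
    return &demo_system_as;
}

static DemoAddressSpace *demo_accel_get_msi_as(void *opaque)
{
    (void)opaque;
    return &demo_stage1_as;
}

/* MSI lookups prefer the optional callback and fall back otherwise. */
static DemoAddressSpace *demo_msi_address_space(const DemoIOMMUOps *ops,
                                                void *opaque)
{
    assert(ops != NULL && ops->get_address_space != NULL);
    if (ops->get_msi_address_space) {
        return ops->get_msi_address_space(opaque);
    }
    return ops->get_address_space(opaque);
}
```

An IOMMU model that does not set the optional callback keeps today's
behavior, so only the accel case would see a different address space on
the MSI path.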
> kvm_arch_fixup_msi_route() was introduced by
> https://lore.kernel.org/all/1523518688-26674-12-git-send-email-eric.auger@redhat.com/
>
> This relies on the vIOMMU translate callback, which is supposed to be
> bypassed in general with VFIO devices. Isn't it needed only for emulated
> devices?

Not only for emulated devices.

This KVM function needs the translation for the IRQ injection for
VFIO devices as well.

Although we use RMR for the underlying HW to bypass stage-1, the
translation for gIOVA => vITS page (VIRT_GIC_ITS) still exists at
the guest level. FWIW, it just doesn't have a stage-2 mapping
because the HW never uses the "gIOVA" but a hard-coded SW_MSI address.

In the meantime, a VFIO device in the guest is programmed with a
gIOVA for its MSI doorbell. This gIOVA can't be used by the KVM code
to inject IRQs. KVM needs the gPA (i.e. VIRT_GIC_ITS). So, it needs
an address space that can do the translation.
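
As a rough illustration of the gIOVA => gPA step, here is a toy stage-1
lookup in plain C. Everything in it is an assumption for the example:
ToyS1Mapping and toy_s1_translate() are hypothetical, the table is a flat
array rather than a real page table, and the addresses used with it are
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 0x1000ULL
#define TOY_PAGE_MASK (TOY_PAGE_SIZE - 1)

/* One guest stage-1 mapping: page-aligned gIOVA to page-aligned gPA. */
typedef struct ToyS1Mapping {
    uint64_t giova_page;
    uint64_t gpa_page;
} ToyS1Mapping;

/*
 * Translate a doorbell gIOVA to a gPA using the guest stage-1 mappings.
 * Returns false when no mapping covers the address, which is what a
 * caller effectively gets when it only has the system address space.
 */
static bool toy_s1_translate(const ToyS1Mapping *map, size_t n,
                             uint64_t giova, uint64_t *gpa)
{
    assert(gpa != NULL);
    for (size_t i = 0; i < n; i++) {
        if (map[i].giova_page == (giova & ~TOY_PAGE_MASK)) {
            /* Keep the offset within the page. */
            *gpa = map[i].gpa_page | (giova & TOY_PAGE_MASK);
            return true;
        }
    }
    return false;
}
```

For example, with a single mapping from 0xffff0000 to 0x08080000, a
doorbell gIOVA of 0xffff0040 resolves to 0x08080040, while an address
outside the mapped page is reported as untranslatable.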
Hope this is clear now.
Thanks
Nicolin