qemu-devel.nongnu.org archive mirror
From: Eric Auger <eric.auger@redhat.com>
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org, skolothumtho@nvidia.com
Cc: peter.maydell@linaro.org, jgg@nvidia.com, nicolinc@nvidia.com,
	ddutile@redhat.com, berrange@redhat.com, nathanc@nvidia.com,
	mochs@nvidia.com, smostafa@google.com, linuxarm@huawei.com,
	wangzhou1@hisilicon.com, jiangkunkun@huawei.com,
	jonathan.cameron@huawei.com, zhangfei.gao@linaro.org,
	zhenzhong.duan@intel.com, shameerkolothum@gmail.com
Subject: Re: [RFC PATCH v3 11/15] hw/pci/pci: Introduce optional get_msi_address_space() callback.
Date: Fri, 5 Sep 2025 12:11:33 +0200
Message-ID: <d29ccf81-59d2-4ae4-9548-f27cfc4255c5@redhat.com>
In-Reply-To: <20250714155941.22176-12-shameerali.kolothum.thodi@huawei.com>



On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> On ARM, when a device is behind an IOMMU, its MSI doorbell address is
> subject to translation by the IOMMU. This behavior affects vfio-pci
> passthrough devices assigned to guests using an accelerated SMMUv3.
>
> In this setup, we configure the host SMMUv3 in nested mode, where
> VFIO sets up the Stage-2 (S2) mappings for guest RAM, while the guest
> controls Stage-1 (S1). To allow VFIO to correctly configure S2 mappings,
> we currently return the system address space via the get_address_space()
> callback for vfio-pci devices.
>
> However, QEMU/KVM also uses this same callback path when resolving the
> address space for MSI doorbells:
>
> kvm_irqchip_add_msi_route()
>   kvm_arch_fixup_msi_route()
>     pci_device_iommu_address_space()
>
> This leads to problems when MSI doorbells need to be translated.
It would be worth explaining the exact "problem" here ;-)

Eric
>
> To fix this, introduce an optional get_msi_address_space() callback.
> In the SMMUv3 accelerated case, this callback returns the IOMMU address
> space if the guest has set up S1 translations for the vfio-pci device.
> Otherwise, it returns the system address space.
>
> Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/smmuv3-accel.c | 25 +++++++++++++++++++++++++
>  hw/pci/pci.c          | 19 +++++++++++++++++++
>  include/hw/pci/pci.h  | 16 ++++++++++++++++
>  target/arm/kvm.c      |  2 +-
>  4 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index f1584dd775..04c665ccf5 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -346,6 +346,30 @@ static void smmuv3_accel_unset_iommu_device(PCIBus *bus, void *opaque,
>      }
>  }
>  
> +static AddressSpace *smmuv3_accel_find_msi_as(PCIBus *bus, void *opaque,
> +                                                  int devfn)
> +{
> +    SMMUState *bs = opaque;
> +    SMMUPciBus *sbus;
> +    SMMUv3AccelDevice *accel_dev;
> +    SMMUDevice *sdev;
> +
> +    sbus = smmu_get_sbus(bs, bus);
> +    accel_dev = smmuv3_accel_get_dev(bs, sbus, bus, devfn);
> +    sdev = &accel_dev->sdev;
> +
> +    /*
> +     * If the assigned vfio-pci dev has S1 translation enabled by
> +     * Guest, return IOMMU address space for MSI translation.
> +     * Otherwise, return system address space.
> +     */
> +    if (accel_dev->s1_hwpt) {
> +        return &sdev->as;
> +    } else {
> +        return &accel_dev->as_sysmem;
> +    }
> +}
> +
>  static bool smmuv3_accel_pdev_allowed(PCIDevice *pdev, bool *vfio_pci)
>  {
>  
> @@ -407,6 +431,7 @@ static const PCIIOMMUOps smmuv3_accel_ops = {
>      .get_viommu_cap = smmuv3_accel_get_viommu_cap,
>      .set_iommu_device = smmuv3_accel_set_iommu_device,
>      .unset_iommu_device = smmuv3_accel_unset_iommu_device,
> +    .get_msi_address_space = smmuv3_accel_find_msi_as,
>  };
>  
>  void smmuv3_accel_init(SMMUv3State *s)
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 13de0e2809..404aeb643d 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2957,6 +2957,25 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>      return &address_space_memory;
>  }
>  
> +AddressSpace *pci_device_iommu_msi_address_space(PCIDevice *dev)
> +{
> +    PCIBus *bus;
> +    PCIBus *iommu_bus;
> +    int devfn;
> +
> +    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
> +    if (iommu_bus) {
> +        if (iommu_bus->iommu_ops->get_msi_address_space) {
> +            return iommu_bus->iommu_ops->get_msi_address_space(bus,
> +                                 iommu_bus->iommu_opaque, devfn);
> +        } else {
> +            return iommu_bus->iommu_ops->get_address_space(bus,
> +                                 iommu_bus->iommu_opaque, devfn);
> +        }
> +    }
> +    return &address_space_memory;
> +}
> +
>  int pci_iommu_init_iotlb_notifier(PCIDevice *dev, IOMMUNotifier *n,
>                                    IOMMUNotify fn, void *opaque)
>  {
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index d1d43e9fb9..55138c406e 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -639,12 +639,28 @@ typedef struct PCIIOMMUOps {
>                              uint32_t pasid, bool priv_req, bool exec_req,
>                              hwaddr addr, bool lpig, uint16_t prgi, bool is_read,
>                              bool is_write);
> +    /**
> +     * @get_msi_address_space: get the address space for MSI doorbell address
> +     * for devices
> +     *
> +     * Optional callback which returns a pointer to an #AddressSpace. This
> +     * is required if MSI doorbell also gets translated through IOMMU(eg: ARM)
> +     *
> +     * @bus: the #PCIBus being accessed.
> +     *
> +     * @opaque: the data passed to pci_setup_iommu().
> +     *
> +     * @devfn: device and function number
> +     */
> +    AddressSpace * (*get_msi_address_space)(PCIBus *bus, void *opaque,
> +                                            int devfn);
>  } PCIIOMMUOps;
>  
>  AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>  bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
>                                   Error **errp);
>  void pci_device_unset_iommu_device(PCIDevice *dev);
> +AddressSpace *pci_device_iommu_msi_address_space(PCIDevice *dev);
>  
>  /**
>   * pci_device_get_viommu_cap: get vIOMMU capabilities.
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 6672344855..c78d0d59bb 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1535,7 +1535,7 @@ int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level)
>  int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>                               uint64_t address, uint32_t data, PCIDevice *dev)
>  {
> -    AddressSpace *as = pci_device_iommu_address_space(dev);
> +    AddressSpace *as = pci_device_iommu_msi_address_space(dev);
>      hwaddr xlat, len, doorbell_gpa;
>      MemoryRegionSection mrs;
>      MemoryRegion *mr;
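
For readers following the thread, the dispatch rule the patch describes
(prefer the optional MSI callback, otherwise fall back to the regular
address-space callback, otherwise use system memory) can be sketched in
isolation. This is a minimal standalone illustration, not the QEMU code:
every Toy* type and toy_* function below is a hypothetical stand-in, and
the s1_enabled flag only mimics the accel_dev->s1_hwpt check in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not QEMU types. */
typedef struct { const char *name; } ToyAddressSpace;

typedef struct {
    /* Mandatory: address space used for the device's DMA. */
    ToyAddressSpace *(*get_as)(void *opaque, int devfn);
    /* Optional: address space used to resolve the MSI doorbell. */
    ToyAddressSpace *(*get_msi_as)(void *opaque, int devfn);
} ToyIommuOps;

typedef struct { bool s1_enabled; } ToyDev;

static ToyAddressSpace toy_sysmem = { "sysmem" };
static ToyAddressSpace toy_iommu  = { "iommu" };

/* DMA path: always system memory, so Stage-2 can map guest RAM. */
static ToyAddressSpace *toy_get_as(void *opaque, int devfn)
{
    (void)opaque;
    (void)devfn;
    return &toy_sysmem;
}

/* MSI path: IOMMU address space only once the guest has enabled Stage-1. */
static ToyAddressSpace *toy_get_msi_as(void *opaque, int devfn)
{
    ToyDev *dev = opaque;
    (void)devfn;
    return dev->s1_enabled ? &toy_iommu : &toy_sysmem;
}

/* Resolver: optional MSI callback first, then the mandatory callback. */
static ToyAddressSpace *toy_resolve_msi_as(const ToyIommuOps *ops,
                                           void *opaque, int devfn)
{
    if (!ops) {
        return &toy_sysmem;          /* no IOMMU in front of the device */
    }
    assert(ops->get_as);             /* the mandatory callback must exist */
    if (ops->get_msi_as) {
        return ops->get_msi_as(opaque, devfn);
    }
    return ops->get_as(opaque, devfn);
}
```

With an ops table that sets both callbacks, toy_resolve_msi_as() returns
the IOMMU address space only while s1_enabled is true; with get_msi_as
left NULL it behaves exactly like the existing get_as path.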


