* [PATCH 0/3] Two minor fixes on virtio-iommu @ 2024-01-22 6:40 Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw) To: qemu-devel, qemu-arm Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx, jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan PATCH1 fixes a potential issue with vfio devices when reboot to a different OS which set bus number differently from previous OS. I didn't reproduce the issue in reality, but it's still possible in theory. PATCH2 is a prerequisite of of PATCH3. PATCH3 make virtio-iommu support PCI device aliases. If there are more than one device in same IOMMU group, either due to topology, isolation feature, etc. virtio-iommu can only make one which has alias BDF works. This impacts both emulated and vfio devices. I have reproduced the failure with an example config to have two vfio devices under same pcie to pci bridge. This patch also make a proper place in virtio-iommu to store iova_ranges from vfio device when vfio device shares same IOMMU group with other devices, either emulated or vfio devices. Zhenzhong Duan (3): virtio_iommu: Clear IOMMUPciBus pointer cache when system reset hw/pci: Add two parameters to get_address_space virtio-iommu: Support PCI device aliases include/hw/pci/pci.h | 11 ++++++++--- hw/alpha/typhoon.c | 3 ++- hw/arm/smmu-common.c | 3 ++- hw/i386/amd_iommu.c | 6 ++++-- hw/i386/intel_iommu.c | 6 ++++-- hw/pci-host/astro.c | 3 ++- hw/pci-host/designware.c | 3 ++- hw/pci-host/dino.c | 3 ++- hw/pci-host/pnv_phb3.c | 3 ++- hw/pci-host/pnv_phb4.c | 3 ++- hw/pci-host/ppce500.c | 3 ++- hw/pci-host/raven.c | 3 ++- hw/pci-host/sabre.c | 3 ++- hw/pci/pci.c | 3 ++- hw/ppc/ppc440_pcix.c | 3 ++- hw/ppc/spapr_pci.c | 3 ++- hw/remote/iommu.c | 3 ++- hw/s390x/s390-pci-bus.c | 3 ++- hw/virtio/virtio-iommu.c | 21 ++++++++++++--------- 19 files changed, 58 insertions(+), 31 deletions(-) -- 2.34.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset 2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan @ 2024-01-22 6:40 ` Zhenzhong Duan 2024-01-23 9:41 ` Cédric Le Goater 2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan 2 siblings, 1 reply; 10+ messages in thread From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw) To: qemu-devel, qemu-arm Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx, jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan IOMMUPciBus pointer cache is indexed by bus number, bus number may not always be a fixed value, i.e., guest reboot to different kernel which set bus number with different algorithm. This could lead to endpoint binding to wrong iommu MR in virtio_iommu_get_endpoint(), then vfio device setup wrong mapping from other device. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- hw/virtio/virtio-iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index 8a4bd933c6..bfce3237f3 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque) trace_virtio_iommu_system_reset(); + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num)); + /* * config.bypass is sticky across device reset, but should be restored on * system reset -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset 2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan @ 2024-01-23 9:41 ` Cédric Le Goater 2024-01-23 10:03 ` Duan, Zhenzhong 0 siblings, 1 reply; 10+ messages in thread From: Cédric Le Goater @ 2024-01-23 9:41 UTC (permalink / raw) To: Zhenzhong Duan, qemu-devel, qemu-arm Cc: eric.auger, jean-philippe, alex.williamson, peterx, jasowang, mst, yi.l.liu, chao.p.peng On 1/22/24 07:40, Zhenzhong Duan wrote: > IOMMUPciBus pointer cache is indexed by bus number, bus number > may not always be a fixed value, i.e., guest reboot to different > kernel which set bus number with different algorithm. > > This could lead to endpoint binding to wrong iommu MR in > virtio_iommu_get_endpoint(), then vfio device setup wrong > mapping from other device. > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > hw/virtio/virtio-iommu.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > index 8a4bd933c6..bfce3237f3 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void *opaque) > > trace_virtio_iommu_system_reset(); > > + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s->iommu_pcibus_by_bus_num)); > + > /* > * config.bypass is sticky across device reset, but should be restored on > * system reset you could remove the memset in virtio_iommu_device_realize() then ? Thanks, C. ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset 2024-01-23 9:41 ` Cédric Le Goater @ 2024-01-23 10:03 ` Duan, Zhenzhong 2024-01-24 21:04 ` Eric Auger 0 siblings, 1 reply; 10+ messages in thread From: Duan, Zhenzhong @ 2024-01-23 10:03 UTC (permalink / raw) To: Cédric Le Goater, qemu-devel@nongnu.org, qemu-arm@nongnu.org Cc: eric.auger@redhat.com, jean-philippe@linaro.org, alex.williamson@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, Liu, Yi L, Peng, Chao P >-----Original Message----- >From: Cédric Le Goater <clg@redhat.com> >Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache >when system reset > >On 1/22/24 07:40, Zhenzhong Duan wrote: >> IOMMUPciBus pointer cache is indexed by bus number, bus number >> may not always be a fixed value, i.e., guest reboot to different >> kernel which set bus number with different algorithm. >> >> This could lead to endpoint binding to wrong iommu MR in >> virtio_iommu_get_endpoint(), then vfio device setup wrong >> mapping from other device. >> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> --- >> hw/virtio/virtio-iommu.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c >> index 8a4bd933c6..bfce3237f3 100644 >> --- a/hw/virtio/virtio-iommu.c >> +++ b/hw/virtio/virtio-iommu.c >> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void >*opaque) >> >> trace_virtio_iommu_system_reset(); >> >> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s- >>iommu_pcibus_by_bus_num)); >> + >> /* >> * config.bypass is sticky across device reset, but should be restored on >> * system reset > >you could remove the memset in virtio_iommu_device_realize() then ? Good suggestion, will do. Thanks Zhenzhong ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset 2024-01-23 10:03 ` Duan, Zhenzhong @ 2024-01-24 21:04 ` Eric Auger 2024-01-25 2:46 ` Duan, Zhenzhong 0 siblings, 1 reply; 10+ messages in thread From: Eric Auger @ 2024-01-24 21:04 UTC (permalink / raw) To: Duan, Zhenzhong, Cédric Le Goater, qemu-devel@nongnu.org, qemu-arm@nongnu.org Cc: jean-philippe@linaro.org, alex.williamson@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, Liu, Yi L, Peng, Chao P Hi Zhenzhong, On 1/23/24 11:03, Duan, Zhenzhong wrote: > >> -----Original Message----- >> From: Cédric Le Goater <clg@redhat.com> >> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache >> when system reset >> >> On 1/22/24 07:40, Zhenzhong Duan wrote: >>> IOMMUPciBus pointer cache is indexed by bus number, bus number >>> may not always be a fixed value, i.e., guest reboot to different >>> kernel which set bus number with different algorithm. this cannot harm to reset it but I don't know if this can be encountered. >>> >>> This could lead to endpoint binding to wrong iommu MR in >>> virtio_iommu_get_endpoint(), then vfio device setup wrong >>> mapping from other device. >>> >>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >>> --- >>> hw/virtio/virtio-iommu.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c >>> index 8a4bd933c6..bfce3237f3 100644 >>> --- a/hw/virtio/virtio-iommu.c >>> +++ b/hw/virtio/virtio-iommu.c >>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void >> *opaque) >>> trace_virtio_iommu_system_reset(); >>> >>> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s- >>> iommu_pcibus_by_bus_num)); >>> + >>> /* >>> * config.bypass is sticky across device reset, but should be restored on >>> * system reset >> you could remove the memset in virtio_iommu_device_realize() then ? > Good suggestion, will do. By the way what about the vtd implementation. s->vtd_address_spaces is hash table but I don't see it reset either? Also if is requested here we would need it on smmuv3 as well. Thanks Eric > > Thanks > Zhenzhong ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset 2024-01-24 21:04 ` Eric Auger @ 2024-01-25 2:46 ` Duan, Zhenzhong 0 siblings, 0 replies; 10+ messages in thread From: Duan, Zhenzhong @ 2024-01-25 2:46 UTC (permalink / raw) To: eric.auger@redhat.com, Cédric Le Goater, qemu-devel@nongnu.org, qemu-arm@nongnu.org Cc: jean-philippe@linaro.org, alex.williamson@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, Liu, Yi L, Peng, Chao P Hi Eric, >-----Original Message----- >From: Eric Auger <eric.auger@redhat.com> >Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache >when system reset > >Hi Zhenzhong, > >On 1/23/24 11:03, Duan, Zhenzhong wrote: >> >>> -----Original Message----- >>> From: Cédric Le Goater <clg@redhat.com> >>> Subject: Re: [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer >cache >>> when system reset >>> >>> On 1/22/24 07:40, Zhenzhong Duan wrote: >>>> IOMMUPciBus pointer cache is indexed by bus number, bus number >>>> may not always be a fixed value, i.e., guest reboot to different >>>> kernel which set bus number with different algorithm. >this cannot harm to reset it but I don't know if this can be encountered. >>>> >>>> This could lead to endpoint binding to wrong iommu MR in >>>> virtio_iommu_get_endpoint(), then vfio device setup wrong >>>> mapping from other device. >>>> >>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >>>> --- >>>> hw/virtio/virtio-iommu.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c >>>> index 8a4bd933c6..bfce3237f3 100644 >>>> --- a/hw/virtio/virtio-iommu.c >>>> +++ b/hw/virtio/virtio-iommu.c >>>> @@ -1264,6 +1264,8 @@ static void virtio_iommu_system_reset(void >>> *opaque) >>>> trace_virtio_iommu_system_reset(); >>>> >>>> + memset(s->iommu_pcibus_by_bus_num, 0, sizeof(s- >>>> iommu_pcibus_by_bus_num)); >>>> + >>>> /* >>>> * config.bypass is sticky across device reset, but should be restored >on >>>> * system reset >>> you could remove the memset in virtio_iommu_device_realize() then ? >> Good suggestion, will do. >By the way what about the vtd implementation. s->vtd_address_spaces is >hash table but I don't see it reset either? Good question! s->vtd_address_spaces is indexed by a key containing (PCIBus *) which is reliable. /* * PCI bus number (or SID) is not reliable since the device is usaully * initialized before guest can configure the PCI bridge * (SECONDARY_BUS_NUMBER). */ struct vtd_as_key { PCIBus *bus; uint8_t devfn; uint32_t pasid; }; So I don’t think it should reset, same for s->as_by_busptr in virtio-iommu. s->vtd_as_cache[bus_num] is unstable after guest reboot, but vtd_get_as_by_sid() has logic to verify and update cache, so it doesn't have issue. But if we hotplug/unplug bridge in a loop, there may be issue with s->vtd_address_spaces Because old AS is never released. Anyway that's beyond scope of this patch. >Also if is requested here we would need it on smmuv3 as well. Good suggestion, I think so, I'll add a patch to smmuv3 for review. Thanks Zhenzhong ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] hw/pci: Add two parameters to get_address_space 2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan @ 2024-01-22 6:40 ` Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan 2 siblings, 0 replies; 10+ messages in thread From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw) To: qemu-devel, qemu-arm Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx, jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan, Richard Henderson, Peter Maydell, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum, Helge Deller, Andrey Smirnov, Cédric Le Goater, Nicholas Piggin, Frédéric Barrat, Hervé Poussineau, Mark Cave-Ayland, BALATON Zoltan, Daniel Henrique Barboza, David Gibson, Harsh Prateek Bora, Elena Ufimtseva, Jagannathan Raman, Matthew Rosato, Eric Farman, Thomas Huth, Halil Pasic, Christian Borntraeger, David Hildenbrand, Ilya Leoshkevich, open list:PowerNV Non-Virt..., open list:S390 PCI This adds PCI device's real bus and devfn to API get_address_space(), for vIOMMU which also wants real BDF, i.e., virtio-iommu. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- include/hw/pci/pci.h | 11 ++++++++--- hw/alpha/typhoon.c | 3 ++- hw/arm/smmu-common.c | 3 ++- hw/i386/amd_iommu.c | 6 ++++-- hw/i386/intel_iommu.c | 6 ++++-- hw/pci-host/astro.c | 3 ++- hw/pci-host/designware.c | 3 ++- hw/pci-host/dino.c | 3 ++- hw/pci-host/pnv_phb3.c | 3 ++- hw/pci-host/pnv_phb4.c | 3 ++- hw/pci-host/ppce500.c | 3 ++- hw/pci-host/raven.c | 3 ++- hw/pci-host/sabre.c | 3 ++- hw/pci/pci.c | 3 ++- hw/ppc/ppc440_pcix.c | 3 ++- hw/ppc/spapr_pci.c | 3 ++- hw/remote/iommu.c | 3 ++- hw/s390x/s390-pci-bus.c | 3 ++- hw/virtio/virtio-iommu.c | 3 ++- 19 files changed, 48 insertions(+), 23 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index fa6313aabc..2483d61a8c 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -378,13 +378,18 @@ typedef struct PCIIOMMUOps { * * Mandatory callback which returns a pointer to an #AddressSpace * - * @bus: the #PCIBus being accessed. + * @bus: the aliased #PCIBus being accessed. * * @opaque: the data passed to pci_setup_iommu(). * - * @devfn: device and function number + * @devfn: aliased device and function number + * + * @real_bus: the #PCIBus being accessed. + * + * @real_devfn: device and function number */ - AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn); + AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn); } PCIIOMMUOps; AddressSpace *pci_device_iommu_address_space(PCIDevice *dev); diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c index e8711ae16a..eda5a1c391 100644 --- a/hw/alpha/typhoon.c +++ b/hw/alpha/typhoon.c @@ -732,7 +732,8 @@ static IOMMUTLBEntry typhoon_translate_iommu(IOMMUMemoryRegion *iommu, return ret; } -static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *typhoon_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { TyphoonState *s = opaque; return &s->pchip.iommu_as; diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 9a8ac45431..c3a8f84c38 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -569,7 +569,8 @@ SMMUPciBus *smmu_find_smmu_pcibus(SMMUState *s, uint8_t bus_num) return NULL; } -static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { SMMUState *s = opaque; SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_pcibus_by_busptr, bus); diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c index 4203144da9..0cc63fd050 100644 --- a/hw/i386/amd_iommu.c +++ b/hw/i386/amd_iommu.c @@ -1390,7 +1390,8 @@ static const MemoryRegionOps amdvi_ir_ops = { } }; -static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { char name[128]; AMDVIState *s = opaque; @@ -1578,7 +1579,8 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error **errp) } /* Pseudo address space under root PCI bus. */ - x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID); + x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID, + NULL, 0); /* set up MMIO */ memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio", diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 1a07faddb4..9d8c8e1d03 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -4094,7 +4094,8 @@ static void vtd_reset(DeviceState *dev) vtd_address_space_refresh_all(s); } -static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { IntelIOMMUState *s = opaque; VTDAddressSpace *vtd_as; @@ -4233,7 +4234,8 @@ static void vtd_realize(DeviceState *dev, Error **errp) vtd_init(s); pci_setup_iommu(bus, &vtd_iommu_ops, dev); /* Pseudo address space under root PCI bus. */ - x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC); + x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC, + NULL, 0); qemu_add_machine_init_done_notifier(&vtd_machine_done_notify); } diff --git a/hw/pci-host/astro.c b/hw/pci-host/astro.c index 37d271118c..c6c0f3f95f 100644 --- a/hw/pci-host/astro.c +++ b/hw/pci-host/astro.c @@ -337,7 +337,8 @@ static IOMMUTLBEntry astro_translate_iommu(IOMMUMemoryRegion *iommu, } static AddressSpace *elroy_pcihost_set_iommu(PCIBus *bus, void *opaque, - int devfn) + int devfn, + PCIBus *real_bus, int real_devfn) { ElroyState *s = opaque; return &s->astro->iommu_as; diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c index dd9e389c07..fc652e6609 100644 --- a/hw/pci-host/designware.c +++ b/hw/pci-host/designware.c @@ -656,7 +656,8 @@ static const MemoryRegionOps designware_pci_mmio_ops = { }; static AddressSpace *designware_pcie_host_set_iommu(PCIBus *bus, void *opaque, - int devfn) + int devfn, PCIBus *real_bus, + int real_devfn) { DesignwarePCIEHost *s = DESIGNWARE_PCIE_HOST(opaque); diff --git a/hw/pci-host/dino.c b/hw/pci-host/dino.c index d992c4bb69..45f8784b2b 100644 --- a/hw/pci-host/dino.c +++ b/hw/pci-host/dino.c @@ -347,7 +347,8 @@ static const MemoryRegionOps dino_config_addr_ops = { }; static AddressSpace *dino_pcihost_set_iommu(PCIBus *bus, void *opaque, - int devfn) + int devfn, PCIBus *real_bus, + int real_devfn) { DinoState *s = opaque; diff --git a/hw/pci-host/pnv_phb3.c b/hw/pci-host/pnv_phb3.c index 2a74dbe45f..a0c4235fae 100644 --- a/hw/pci-host/pnv_phb3.c +++ b/hw/pci-host/pnv_phb3.c @@ -935,7 +935,8 @@ static const MemoryRegionOps pnv_phb3_msi_ops = { .endianness = DEVICE_LITTLE_ENDIAN }; -static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *pnv_phb3_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { PnvPHB3 *phb = opaque; PnvPhb3DMASpace *ds; diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c index 075499d36d..fee34f376e 100644 --- a/hw/pci-host/pnv_phb4.c +++ b/hw/pci-host/pnv_phb4.c @@ -1450,7 +1450,8 @@ static PnvPhb4DMASpace *pnv_phb4_dma_find(PnvPHB4 *phb, PCIBus *bus, int devfn) return ds; } -static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { PnvPHB4 *phb = opaque; PnvPhb4DMASpace *ds; diff --git a/hw/pci-host/ppce500.c b/hw/pci-host/ppce500.c index fa0d67b342..52c17a8a02 100644 --- a/hw/pci-host/ppce500.c +++ b/hw/pci-host/ppce500.c @@ -428,7 +428,8 @@ static void e500_pcihost_bridge_realize(PCIDevice *d, Error **errp) } static AddressSpace *e500_pcihost_set_iommu(PCIBus *bus, void *opaque, - int devfn) + int devfn, PCIBus *real_bus, + int real_devfn) { PPCE500PCIState *s = opaque; diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c index c7a0a2878a..d1b7c1a847 100644 --- a/hw/pci-host/raven.c +++ b/hw/pci-host/raven.c @@ -216,7 +216,8 @@ static void raven_set_irq(void *opaque, int irq_num, int level) } static AddressSpace *raven_pcihost_set_iommu(PCIBus *bus, void *opaque, - int devfn) + int devfn, PCIBus *real_bus, + int real_devfn) { PREPPCIState *s = opaque; diff --git a/hw/pci-host/sabre.c b/hw/pci-host/sabre.c index d0851b48b0..c82c62c640 100644 --- a/hw/pci-host/sabre.c +++ b/hw/pci-host/sabre.c @@ -105,7 +105,8 @@ static inline void sabre_clear_request(SabreState *s, unsigned int irq_num) s->irq_request = NO_IRQ_REQUEST; } -static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *sabre_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { IOMMUState *is = opaque; diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 76080af580..2f91b87559 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2719,7 +2719,8 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) } if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) { return iommu_bus->iommu_ops->get_address_space(bus, - iommu_bus->iommu_opaque, devfn); + iommu_bus->iommu_opaque, devfn, + pci_get_bus(dev), dev->devfn); } return &address_space_memory; } diff --git a/hw/ppc/ppc440_pcix.c b/hw/ppc/ppc440_pcix.c index df4ee374d0..ea8fa28699 100644 --- a/hw/ppc/ppc440_pcix.c +++ b/hw/ppc/ppc440_pcix.c @@ -442,7 +442,8 @@ static void ppc440_pcix_set_irq(void *opaque, int irq_num, int level) qemu_set_irq(*pci_irq, level); } -static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn) +static AddressSpace *ppc440_pcix_set_iommu(PCIBus *b, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { PPC440PCIXState *s = opaque; diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 25e0295d6f..4a2893c845 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -773,7 +773,8 @@ static const MemoryRegionOps spapr_msi_ops = { /* * PHB PCI device */ -static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { SpaprPhbState *phb = opaque; diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c index 7c56aad0fc..7e71647e79 100644 --- a/hw/remote/iommu.c +++ b/hw/remote/iommu.c @@ -37,7 +37,8 @@ */ static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, - void *opaque, int devfn) + void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { RemoteIommu *iommu = opaque; RemoteIommuElem *elem = NULL; diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index 347580ebac..00de41df69 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -644,7 +644,8 @@ static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus, return iommu; } -static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn) +static AddressSpace *s390_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn, + PCIBus *real_bus, int real_devfn) { S390pciState *s = opaque; S390PCIIOMMU *iommu = s390_pci_get_iommu(s, bus, devfn); diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index bfce3237f3..d99c1f0d64 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -395,7 +395,8 @@ static void add_prop_resv_regions(IOMMUDevice *sdev) } static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque, - int devfn) + int devfn, PCIBus *real_bus, + int real_devfn) { VirtIOIOMMU *s = opaque; IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus); -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] virtio-iommu: Support PCI device aliases 2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan @ 2024-01-22 6:40 ` Zhenzhong Duan 2024-01-24 20:55 ` Eric Auger 2 siblings, 1 reply; 10+ messages in thread From: Zhenzhong Duan @ 2024-01-22 6:40 UTC (permalink / raw) To: qemu-devel, qemu-arm Cc: eric.auger, jean-philippe, alex.williamson, clg, peterx, jasowang, mst, yi.l.liu, chao.p.peng, Zhenzhong Duan Currently virtio-iommu doesn't work well if there are multiple devices in same iommu group. In below example config, guest virtio-iommu driver can successfully probe first device but fail on others. Only one device under the bridge can work normally. -device virtio-iommu \ -device pcie-pci-bridge,id=root0 \ -device vfio-pci,host=81:11.0,bus=root0 \ -device vfio-pci,host=6f:01.0,bus=root0 \ The reason is virtio-iommu stores AS(address space) in hash table with aliased BDF and corelates endpoint which is indexed by device's real BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash table, we either get wrong AS or NULL. Fix it by storing AS indexed by real BDF. This way also make iova_ranges from vfio device stored in IOMMUDevice of real BDF successfully. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> --- hw/virtio/virtio-iommu.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index d99c1f0d64..6880d92a44 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque, int real_devfn) { VirtIOIOMMU *s = opaque; - IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus); + IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus); static uint32_t mr_index; IOMMUDevice *sdev; if (!sbus) { sbus = g_malloc0(sizeof(IOMMUPciBus) + sizeof(IOMMUDevice *) * PCI_DEVFN_MAX); - sbus->bus = bus; - g_hash_table_insert(s->as_by_busptr, bus, sbus); + sbus->bus = real_bus; + g_hash_table_insert(s->as_by_busptr, real_bus, sbus); } - sdev = sbus->pbdev[devfn]; + sdev = sbus->pbdev[real_devfn]; if (!sdev) { char *name = g_strdup_printf("%s-%d-%d", TYPE_VIRTIO_IOMMU_MEMORY_REGION, - mr_index++, devfn); - sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1); + mr_index++, real_devfn); + sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1); sdev->viommu = s; - sdev->bus = bus; - sdev->devfn = devfn; + sdev->bus = real_bus; + sdev->devfn = real_devfn; trace_virtio_iommu_init_iommu_mr(name); -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases 2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan @ 2024-01-24 20:55 ` Eric Auger 2024-01-25 5:58 ` Duan, Zhenzhong 0 siblings, 1 reply; 10+ messages in thread From: Eric Auger @ 2024-01-24 20:55 UTC (permalink / raw) To: Zhenzhong Duan, qemu-devel, qemu-arm Cc: jean-philippe, alex.williamson, clg, peterx, jasowang, mst, yi.l.liu, chao.p.peng Hi Zhenzhong, On 1/22/24 07:40, Zhenzhong Duan wrote: > Currently virtio-iommu doesn't work well if there are multiple devices > in same iommu group. In below example config, guest virtio-iommu driver > can successfully probe first device but fail on others. Only one device > under the bridge can work normally. > > -device virtio-iommu \ > -device pcie-pci-bridge,id=root0 \ > -device vfio-pci,host=81:11.0,bus=root0 \ > -device vfio-pci,host=6f:01.0,bus=root0 \ > > The reason is virtio-iommu stores AS(address space) in hash table with > aliased BDF and corelates endpoint which is indexed by device's real > BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash > table, we either get wrong AS or NULL. > > Fix it by storing AS indexed by real BDF. This way also make iova_ranges > from vfio device stored in IOMMUDevice of real BDF successfully. > > Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> > --- > hw/virtio/virtio-iommu.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c > index d99c1f0d64..6880d92a44 100644 > --- a/hw/virtio/virtio-iommu.c > +++ b/hw/virtio/virtio-iommu.c > @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque, > int real_devfn) > { > VirtIOIOMMU *s = opaque; > - IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus); > + IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus); > static uint32_t mr_index; > IOMMUDevice *sdev; > > if (!sbus) { > sbus = g_malloc0(sizeof(IOMMUPciBus) + > sizeof(IOMMUDevice *) * PCI_DEVFN_MAX); > - sbus->bus = bus; > - g_hash_table_insert(s->as_by_busptr, bus, sbus); > + sbus->bus = real_bus; > + g_hash_table_insert(s->as_by_busptr, real_bus, sbus); > } > > - sdev = sbus->pbdev[devfn]; > + sdev = sbus->pbdev[real_devfn]; > if (!sdev) { > char *name = g_strdup_printf("%s-%d-%d", > TYPE_VIRTIO_IOMMU_MEMORY_REGION, > - mr_index++, devfn); > - sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1); > + mr_index++, real_devfn); > + sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1); > > sdev->viommu = s; > - sdev->bus = bus; > - sdev->devfn = devfn; > + sdev->bus = real_bus; > + sdev->devfn = real_devfn; but then this means the 2 devices would be abstracted by two different IOMMU MRs whereas in practice they cannot be distinguished from an IOMMU pov. Shouldn't the virtio-iommu driver use the same ep_id for both devices within the same group? Note there are some known issues about virtio-iommu and pcie-to-pci bridges which were reported early last year and confirmed by Robin Murphy. See: [RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() <https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/#r> https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/ Thanks Eric > > trace_virtio_iommu_init_iommu_mr(name); > ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH 3/3] virtio-iommu: Support PCI device aliases 2024-01-24 20:55 ` Eric Auger @ 2024-01-25 5:58 ` Duan, Zhenzhong 0 siblings, 0 replies; 10+ messages in thread From: Duan, Zhenzhong @ 2024-01-25 5:58 UTC (permalink / raw) To: eric.auger@redhat.com, qemu-devel@nongnu.org, qemu-arm@nongnu.org Cc: jean-philippe@linaro.org, alex.williamson@redhat.com, clg@redhat.com, peterx@redhat.com, jasowang@redhat.com, mst@redhat.com, Liu, Yi L, Peng, Chao P >-----Original Message----- >From: Eric Auger <eric.auger@redhat.com> >Subject: Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases > >Hi Zhenzhong, > >On 1/22/24 07:40, Zhenzhong Duan wrote: >> Currently virtio-iommu doesn't work well if there are multiple devices >> in same iommu group. In below example config, guest virtio-iommu driver >> can successfully probe first device but fail on others. Only one device >> under the bridge can work normally. >> >> -device virtio-iommu \ >> -device pcie-pci-bridge,id=root0 \ >> -device vfio-pci,host=81:11.0,bus=root0 \ >> -device vfio-pci,host=6f:01.0,bus=root0 \ >> >> The reason is virtio-iommu stores AS(address space) in hash table with >> aliased BDF and corelates endpoint which is indexed by device's real >> BDF, i.e., virtio_iommu_mr() is passed a real BDF to lookup AS hash >> table, we either get wrong AS or NULL. >> >> Fix it by storing AS indexed by real BDF. This way also make iova_ranges >> from vfio device stored in IOMMUDevice of real BDF successfully. >> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> >> --- >> hw/virtio/virtio-iommu.c | 16 ++++++++-------- >> 1 file changed, 8 insertions(+), 8 deletions(-) >> >> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c >> index d99c1f0d64..6880d92a44 100644 >> --- a/hw/virtio/virtio-iommu.c >> +++ b/hw/virtio/virtio-iommu.c >> @@ -399,27 +399,27 @@ static AddressSpace >*virtio_iommu_find_add_as(PCIBus *bus, void *opaque, >> int real_devfn) >> { >> VirtIOIOMMU *s = opaque; >> - IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus); >> + IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, >real_bus); >> static uint32_t mr_index; >> IOMMUDevice *sdev; >> >> if (!sbus) { >> sbus = g_malloc0(sizeof(IOMMUPciBus) + >> sizeof(IOMMUDevice *) * PCI_DEVFN_MAX); >> - sbus->bus = bus; >> - g_hash_table_insert(s->as_by_busptr, bus, sbus); >> + sbus->bus = real_bus; >> + g_hash_table_insert(s->as_by_busptr, real_bus, sbus); >> } >> >> - sdev = sbus->pbdev[devfn]; >> + sdev = sbus->pbdev[real_devfn]; >> if (!sdev) { >> char *name = g_strdup_printf("%s-%d-%d", >> TYPE_VIRTIO_IOMMU_MEMORY_REGION, >> - mr_index++, devfn); >> - sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1); >> + mr_index++, real_devfn); >> + sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1); >> >> sdev->viommu = s; >> - sdev->bus = bus; >> - sdev->devfn = devfn; >> + sdev->bus = real_bus; >> + sdev->devfn = real_devfn; >but then this means the 2 devices would be abstracted by two different >IOMMU MRs whereas in practice they cannot be distinguished from an >IOMMU pov. Yes, normally the two different IOMMU MRs should link to same guest domain, so translation result will be same. But if a malicious guest try to break, then it fails to block that. >Shouldn't the virtio-iommu driver use the same ep_id for both >devices within the same group? IIUC, you mean for domain attach and not probe request? I was thinking ep_id represented an existing device in guest, not the aliased one. > >Note there are some known issues about virtio-iommu and pcie-to-pci >bridges which were reported early last year and confirmed by Robin >Murphy. See: > >[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() ><https://lore.kernel.org/all/20230116124709.793084-1- >eric.auger@redhat.com/#r> Thanks for sharing, it’s valuable😊 BRs. Zhenzhong ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-01-25 5:59 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-01-22 6:40 [PATCH 0/3] Two minor fixes on virtio-iommu Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 1/3] virtio_iommu: Clear IOMMUPciBus pointer cache when system reset Zhenzhong Duan 2024-01-23 9:41 ` Cédric Le Goater 2024-01-23 10:03 ` Duan, Zhenzhong 2024-01-24 21:04 ` Eric Auger 2024-01-25 2:46 ` Duan, Zhenzhong 2024-01-22 6:40 ` [PATCH 2/3] hw/pci: Add two parameters to get_address_space Zhenzhong Duan 2024-01-22 6:40 ` [PATCH 3/3] virtio-iommu: Support PCI device aliases Zhenzhong Duan 2024-01-24 20:55 ` Eric Auger 2024-01-25 5:58 ` Duan, Zhenzhong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).