* [PATCH 0/3] Add bypass mode support to assigned device
  From: Zhenzhong Duan
  Date: 2022-06-13 6:10 UTC
  To: eric.auger
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

Currently virtio-iommu's logic to support bypass mode works only for
emulated devices. For an assigned device, no GPA -> HPA mapping is set up
in the IOMMU page tables.

The host reports errors like:

[3713481.750944] dmar_fault: 191 callbacks suppressed
[3713481.750953] DMAR: DRHD: handling fault status reg 302
[3713481.750962] DMAR: [DMA Read NO_PASID] Request device [2f:00.1] fault addr 0x7ebb0000 [fault reason 0x06] PTE Read access is not set
[3713481.751003] DMAR: DRHD: handling fault status reg 402
[3713481.751007] DMAR: [DMA Read NO_PASID] Request device [2f:00.1] fault addr 0x7ebb0000 [fault reason 0x06] PTE Read access is not set
[3713481.751023] DMAR: DRHD: handling fault status reg 502
[3713481.751026] DMAR: [DMA Write NO_PASID] Request device [2f:00.1] fault addr 0x7ebb0000 [fault reason 0x05] PTE Write access is not set
[3713481.751072] DMAR: DRHD: handling fault status reg 602

The guest kernel reports errors like:

[    3.461716] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[    3.462605] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[    3.464630] i40e 0000:00:04.0: Adding to iommu group 5
[    3.482093] i40e 0000:00:04.0: fw 0.0.00000 api 0.0 nvm 0.00 0x176953ce 28.50.1 [8086:37d3] [8086:35d0]
[    3.484295] i40e 0000:00:04.0: eeprom check failed (-62), Tx/Rx traffic disabled
[    3.487268] i40e 0000:00:04.0: configure_lan_hmc failed: -49
[    3.489066] i40e: probe of 0000:00:04.0 failed with error -2

Fix it by switching between the bypass and IOMMU address spaces just like
the virtual VT-d implementation does, so that in bypass mode the VFIO
GPA -> HPA mappings are set up.

Tested with all four combinations of QEMU's virtio-iommu.boot-bypass=true/false
and the guest kernel's iommu=pt/nopt on an x86_64 platform.

Zhenzhong Duan (3):
  virtio-iommu: Add bypass mode support to assigned device
  virtio-iommu: Use recursive lock to avoid deadlock
  virtio-iommu: Add an assert check in translate routine

 hw/virtio/trace-events           |   1 +
 hw/virtio/virtio-iommu.c         | 135 ++++++++++++++++++++++++++++---
 include/hw/virtio/virtio-iommu.h |   4 +-
 3 files changed, 130 insertions(+), 10 deletions(-)

-- 
2.25.1
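For readers who want to reproduce the test matrix, one of the four combinations
could be launched roughly as follows (illustrative command line only, not taken
from the posting; the machine options, host BDF 2f:00.1 and image path are
placeholders):

    qemu-system-x86_64 -M q35 -accel kvm -m 4G \
        -device virtio-iommu-pci,boot-bypass=true \
        -device vfio-pci,host=2f:00.1 \
        -drive file=guest.img,if=virtio

    # In the guest, boot once with iommu=pt and once with iommu=nopt on the
    # kernel command line; repeat with boot-bypass=false to cover all four
    # combinations mentioned above.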
* [PATCH 1/3] virtio-iommu: Add bypass mode support to assigned device
  From: Zhenzhong Duan
  Date: 2022-06-13 6:10 UTC
  To: eric.auger
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

Currently assigned devices cannot work in virtio-iommu bypass mode: the
guest driver fails to probe the device because DMA fails, and the root
cause is that no GPA -> HPA mappings exist when the VM is created.

Add a root container memory region that holds both the bypass memory
region and the IOMMU memory region, so that switching between them is
supported just like in the virtual VT-d implementation.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/trace-events           |   1 +
 hw/virtio/virtio-iommu.c         | 115 ++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-iommu.h |   2 +
 3 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index ab8e095b73fa..20af2e7ebd78 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -124,6 +124,7 @@ virtio_iommu_remap(const char *name, uint64_t virt_start, uint64_t virt_end, uin
 virtio_iommu_set_page_size_mask(const char *name, uint64_t old, uint64_t new) "mr=%s old_mask=0x%"PRIx64" new_mask=0x%"PRIx64
 virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s"
 virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s"
+virtio_iommu_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
 
 # virtio-mem.c
 virtio_mem_send_response(uint16_t type) "type=%" PRIu16
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 2597e166f939..ff718107ee02 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -69,6 +69,77 @@ static inline uint16_t virtio_iommu_get_bdf(IOMMUDevice *dev)
     return PCI_BUILD_BDF(pci_bus_num(dev->bus), dev->devfn);
 }
 
+static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
+{
+    uint32_t sid;
+    bool bypassed;
+    VirtIOIOMMU *s = sdev->viommu;
+    VirtIOIOMMUEndpoint *ep;
+
+    sid = virtio_iommu_get_bdf(sdev);
+
+    qemu_mutex_lock(&s->mutex);
+    /* need to check bypass before system reset */
+    if (!s->endpoints) {
+        bypassed = s->config.bypass;
+        goto unlock;
+    }
+
+    ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+    if (!ep || !ep->domain) {
+        bypassed = s->config.bypass;
+    } else {
+        bypassed = ep->domain->bypass;
+    }
+
+unlock:
+    qemu_mutex_unlock(&s->mutex);
+    return bypassed;
+}
+
+/* Return whether the device is using IOMMU translation. */
+static bool virtio_iommu_switch_address_space(IOMMUDevice *sdev)
+{
+    bool use_remapping;
+
+    assert(sdev);
+
+    use_remapping = !virtio_iommu_device_bypassed(sdev);
+
+    trace_virtio_iommu_switch_address_space(pci_bus_num(sdev->bus),
+                                            PCI_SLOT(sdev->devfn),
+                                            PCI_FUNC(sdev->devfn),
+                                            use_remapping);
+
+    /* Turn off first then on the other */
+    if (use_remapping) {
+        memory_region_set_enabled(&sdev->bypass_mr, false);
+        memory_region_set_enabled(MEMORY_REGION(&sdev->iommu_mr), true);
+    } else {
+        memory_region_set_enabled(MEMORY_REGION(&sdev->iommu_mr), false);
+        memory_region_set_enabled(&sdev->bypass_mr, true);
+    }
+
+    return use_remapping;
+}
+
+static void virtio_iommu_switch_address_space_all(VirtIOIOMMU *s)
+{
+    GHashTableIter iter;
+    IOMMUPciBus *iommu_pci_bus;
+    int i;
+
+    g_hash_table_iter_init(&iter, s->as_by_busptr);
+    while (g_hash_table_iter_next(&iter, NULL, (void **)&iommu_pci_bus)) {
+        for (i = 0; i < PCI_DEVFN_MAX; i++) {
+            if (!iommu_pci_bus->pbdev[i]) {
+                continue;
+            }
+            virtio_iommu_switch_address_space(iommu_pci_bus->pbdev[i]);
+        }
+    }
+}
+
 /**
  * The bus number is used for lookup when SID based operations occur.
  * In that case we lazily populate the IOMMUPciBus array from the bus hash
@@ -213,6 +284,7 @@ static gboolean virtio_iommu_notify_map_cb(gpointer key, gpointer value,
 static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
 {
     VirtIOIOMMUDomain *domain = ep->domain;
+    IOMMUDevice *sdev = container_of(ep->iommu_mr, IOMMUDevice, iommu_mr);
 
     if (!ep->domain) {
         return;
@@ -221,6 +293,7 @@ static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
                                    ep->iommu_mr);
     QLIST_REMOVE(ep, next);
     ep->domain = NULL;
+    virtio_iommu_switch_address_space(sdev);
 }
 
 static VirtIOIOMMUEndpoint *virtio_iommu_get_endpoint(VirtIOIOMMU *s,
@@ -323,12 +396,39 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
 
         trace_virtio_iommu_init_iommu_mr(name);
 
+        memory_region_init(&sdev->root, OBJECT(s), name, UINT64_MAX);
+        address_space_init(&sdev->as, &sdev->root, TYPE_VIRTIO_IOMMU);
+
+        /*
+         * Build the IOMMU disabled container with aliases to the
+         * shared MRs. Note that aliasing to a shared memory region
+         * could help the memory API to detect same FlatViews so we
+         * can have devices to share the same FlatView when in bypass
+         * mode. (either by not configuring virtio-iommu driver or with
+         * "iommu=pt"). It will greatly reduce the total number of
+         * FlatViews of the system hence VM runs faster.
+         */
+        memory_region_init_alias(&sdev->bypass_mr, OBJECT(s),
+                                 "system", get_system_memory(), 0,
+                                 memory_region_size(get_system_memory()));
+
         memory_region_init_iommu(&sdev->iommu_mr, sizeof(sdev->iommu_mr),
                                  TYPE_VIRTIO_IOMMU_MEMORY_REGION,
                                  OBJECT(s), name,
                                  UINT64_MAX);
-        address_space_init(&sdev->as,
-                           MEMORY_REGION(&sdev->iommu_mr), TYPE_VIRTIO_IOMMU);
+
+        /*
+         * Hook both the containers under the root container, we
+         * switch between iommu & bypass MRs by enable/disable
+         * corresponding sub-containers
+         */
+        memory_region_add_subregion_overlap(&sdev->root, 0,
+                                            MEMORY_REGION(&sdev->iommu_mr),
+                                            0);
+        memory_region_add_subregion_overlap(&sdev->root, 0,
+                                            &sdev->bypass_mr, 0);
+
+        virtio_iommu_switch_address_space(sdev);
         g_free(name);
     }
     return &sdev->as;
@@ -342,6 +442,7 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     uint32_t flags = le32_to_cpu(req->flags);
     VirtIOIOMMUDomain *domain;
     VirtIOIOMMUEndpoint *ep;
+    IOMMUDevice *sdev;
 
     trace_virtio_iommu_attach(domain_id, ep_id);
 
@@ -375,6 +476,8 @@ static int virtio_iommu_attach(VirtIOIOMMU *s,
     QLIST_INSERT_HEAD(&domain->endpoint_list, ep, next);
 
     ep->domain = domain;
+    sdev = container_of(ep->iommu_mr, IOMMUDevice, iommu_mr);
+    virtio_iommu_switch_address_space(sdev);
 
     /* Replay domain mappings on the associated memory region */
     g_tree_foreach(domain->mappings, virtio_iommu_notify_map_cb,
@@ -887,6 +990,7 @@ static void virtio_iommu_set_config(VirtIODevice *vdev,
             return;
         }
         dev_config->bypass = in_config->bypass;
+        virtio_iommu_switch_address_space_all(dev);
     }
 
     trace_virtio_iommu_set_config(in_config->bypass);
@@ -1026,6 +1130,8 @@ static void virtio_iommu_system_reset(void *opaque)
      * system reset
      */
     s->config.bypass = s->boot_bypass;
+    virtio_iommu_switch_address_space_all(s);
+
 }
 
 static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
@@ -1041,6 +1147,11 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
                              virtio_iommu_handle_command);
     s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);
 
+    /*
+     * config.bypass is needed to get initial address space early, such as
+     * in vfio realize
+     */
+    s->config.bypass = s->boot_bypass;
     s->config.page_size_mask = TARGET_PAGE_MASK;
     s->config.input_range.end = UINT64_MAX;
     s->config.domain_range.end = UINT32_MAX;
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 84391f844826..102eeefa731d 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -37,6 +37,8 @@ typedef struct IOMMUDevice {
     int          devfn;
     IOMMUMemoryRegion  iommu_mr;
     AddressSpace  as;
+    MemoryRegion root;          /* The root container of the device */
+    MemoryRegion bypass_mr;     /* The alias of shared memory MR */
 } IOMMUDevice;
 
 typedef struct IOMMUPciBus {
-- 
2.25.1
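To summarize the memory-region plumbing the patch introduces, the resulting
per-device layout looks roughly like this (editorial sketch inferred from the
patch above, not part of the submission):

    /*
     * sdev->as
     *   └── sdev->root               (container, size UINT64_MAX)
     *         ├── sdev->iommu_mr     (IOMMU MR, enabled while the endpoint is
     *         │                       attached to a non-bypass domain)
     *         └── sdev->bypass_mr    (alias of system memory, enabled while
     *                                 the endpoint is bypassed)
     *
     * Exactly one of the two subregions is enabled at a time, so the device's
     * FlatView is either the translated view or plain guest RAM; in the latter
     * case VFIO's memory listener can install the GPA -> HPA mappings that an
     * assigned device needs.
     */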
* Re: [PATCH 1/3] virtio-iommu: Add bypass mode support to assigned device
  From: Eric Auger
  Date: 2022-06-23 16:52 UTC
  To: Zhenzhong Duan
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

Hi,

On 6/13/22 08:10, Zhenzhong Duan wrote:
[...]
> +        /*
> +         * Build the IOMMU disabled container with aliases to the
> +         * shared MRs. Note that aliasing to a shared memory region

What do you call the 'disabled container' and the shared MRs?

> +         * could help the memory API to detect same FlatViews so we
> +         * can have devices to share the same FlatView when in bypass
> +         * mode. (either by not configuring virtio-iommu driver or with
> +         * "iommu=pt"). It will greatly reduce the total number of
> +         * FlatViews of the system hence VM runs faster.

Do you mean that we could have used a shared bypass MR instead of
creating one per device?

[...]
> +        /*
> +         * Hook both the containers under the root container, we

Did you mean "hook both sub-regions to the root container"?

[...]
> +    /*
> +     * config.bypass is needed to get initial address space early, such as
> +     * in vfio realize
> +     */

I don't understand this comment, can you please elaborate?

[...]
> +    MemoryRegion root;          /* The root container of the device */
> +    MemoryRegion bypass_mr;     /* The alias of shared memory MR */
>  } IOMMUDevice;

Did you test migration? I wonder if we shouldn't call
virtio_iommu_switch_address_space_all(s) after restoring the endpoints
in iommu_post_load(), as is done in intel-iommu.

Thanks

Eric
* RE: [PATCH 1/3] virtio-iommu: Add bypass mode support to assigned device
  From: Duan, Zhenzhong
  Date: 2022-06-24 8:28 UTC
  To: eric.auger@redhat.com
  Cc: qemu-devel@nongnu.org, mst@redhat.com, jean-philippe@linaro.org, pbonzini@redhat.com, Zhang, Yu C, Dong, Chuanxiao, Zhang, Tina

> From: Eric Auger <eric.auger@redhat.com>
> Sent: Friday, June 24, 2022 12:52 AM
[...]
>> +        /*
>> +         * Build the IOMMU disabled container with aliases to the
>> +         * shared MRs. Note that aliasing to a shared memory region
>
> What do you call the 'disabled container' and the shared MRs?

The 'IOMMU disabled container' is sdev->bypass_mr. The shared MR is
get_system_memory(), a.k.a. system_memory. Every endpoint's sdev->bypass_mr
aliases system_memory, hence calling sdev->bypass_mr the 'IOMMU disabled'
container and system_memory the shared MR.

>> +         * could help the memory API to detect same FlatViews so we
>> +         * can have devices to share the same FlatView when in bypass
>> +         * mode. (either by not configuring virtio-iommu driver or with
>> +         * "iommu=pt"). It will greatly reduce the total number of
>> +         * FlatViews of the system hence VM runs faster.
>
> Do you mean that we could have used a shared bypass MR instead of
> creating one per device?

I think we still need to create an IOMMU MR per device, which is enabled
when the endpoint is attached to a DMA domain. The shared bypass MR is
enabled when:
1. the endpoint is not yet attached to any domain and
   virtio-iommu.boot-bypass=true, or
2. the endpoint is attached to a bypass domain.

>> +        /*
>> +         * Hook both the containers under the root container, we
>
> Did you mean "hook both sub-regions to the root container"?

Yes.

>> +    /*
>> +     * config.bypass is needed to get initial address space early, such as
>> +     * in vfio realize
>> +     */
>
> I don't understand this comment, can you please elaborate?

In vfio_realize(), virtio_iommu_find_add_as() is called to get the address
space from virtio-iommu. virtio_iommu_find_add_as() calls
virtio_iommu_switch_address_space(), which needs to check config.bypass to
enable the right MR, either the shared bypass MR or the IOMMU MR.

It could also be OK to use s->boot_bypass directly in
virtio_iommu_device_bypassed(); then this change could be removed.

[...]
> Did you test migration? I wonder if we shouldn't call
> virtio_iommu_switch_address_space_all(s) after restoring the endpoints
> in iommu_post_load(), as is done in intel-iommu.

Thanks for pointing that out, I missed the migration scenario. After your
suggested change, migration works without a pass-through device. With a
pass-through device I get "VFIO device doesn't support migration"; it is
not clear to me whether pass-through device migration is fully supported
yet.

Zhenzhong
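For completeness, the post_load change discussed above might look like the
following (a minimal sketch of the idea, not necessarily the follow-up patch
that was actually merged; it assumes the existing endpoint-reconstruction walk
in virtio-iommu.c stays as is):

    static int iommu_post_load(void *opaque, int version_id)
    {
        VirtIOIOMMU *s = opaque;

        g_tree_foreach(s->domains, reconstruct_endpoints, s);
        /*
         * Domain/endpoint state is restored at this point, so re-evaluate
         * bypass vs. translated address space for every device behind the
         * vIOMMU, as intel-iommu does after migration.
         */
        virtio_iommu_switch_address_space_all(s);
        return 0;
    }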
* [PATCH 2/3] virtio-iommu: Use recursive lock to avoid deadlock
  From: Zhenzhong Duan
  Date: 2022-06-13 6:10 UTC
  To: eric.auger
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

When the address space is switched while the mutex is held, the mappings
are replayed for the assigned device, which takes the lock again and
deadlocks. Switch to a recursive mutex.

Also release the mutex resource in the unrealize routine.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/virtio-iommu.c         | 20 +++++++++++---------
 include/hw/virtio/virtio-iommu.h |  2 +-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index ff718107ee02..73d5bde9d122 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -78,7 +78,7 @@ static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
 
     sid = virtio_iommu_get_bdf(sdev);
 
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
     /* need to check bypass before system reset */
     if (!s->endpoints) {
         bypassed = s->config.bypass;
@@ -93,7 +93,7 @@ static bool virtio_iommu_device_bypassed(IOMMUDevice *sdev)
     }
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
     return bypassed;
 }
 
@@ -745,7 +745,7 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
             tail.status = VIRTIO_IOMMU_S_DEVERR;
             goto out;
         }
-        qemu_mutex_lock(&s->mutex);
+        qemu_rec_mutex_lock(&s->mutex);
         switch (head.type) {
         case VIRTIO_IOMMU_T_ATTACH:
             tail.status = virtio_iommu_handle_attach(s, iov, iov_cnt);
@@ -774,7 +774,7 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
         default:
             tail.status = VIRTIO_IOMMU_S_UNSUPP;
         }
-        qemu_mutex_unlock(&s->mutex);
+        qemu_rec_mutex_unlock(&s->mutex);
 
 out:
         sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
@@ -862,7 +862,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     sid = virtio_iommu_get_bdf(sdev);
 
     trace_virtio_iommu_translate(mr->parent_obj.name, sid, addr, flag);
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
 
     ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
     if (!ep) {
@@ -946,7 +946,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     trace_virtio_iommu_translate_out(addr, entry.translated_addr, sid);
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
     return entry;
 }
 
@@ -1035,7 +1035,7 @@ static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
 
     sid = virtio_iommu_get_bdf(sdev);
 
-    qemu_mutex_lock(&s->mutex);
+    qemu_rec_mutex_lock(&s->mutex);
 
     if (!s->endpoints) {
         goto unlock;
@@ -1049,7 +1049,7 @@ static void virtio_iommu_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
     g_tree_foreach(ep->domain->mappings, virtio_iommu_remap, mr);
 
 unlock:
-    qemu_mutex_unlock(&s->mutex);
+    qemu_rec_mutex_unlock(&s->mutex);
 }
 
 static int virtio_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu_mr,
@@ -1167,7 +1167,7 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_PROBE);
     virtio_add_feature(&s->features, VIRTIO_IOMMU_F_BYPASS_CONFIG);
 
-    qemu_mutex_init(&s->mutex);
+    qemu_rec_mutex_init(&s->mutex);
 
     s->as_by_busptr = g_hash_table_new_full(NULL, NULL, NULL, g_free);
 
@@ -1195,6 +1195,8 @@ static void virtio_iommu_device_unrealize(DeviceState *dev)
         g_tree_destroy(s->endpoints);
     }
 
+    qemu_rec_mutex_destroy(&s->mutex);
+
     virtio_delete_queue(s->req_vq);
     virtio_delete_queue(s->event_vq);
     virtio_cleanup(vdev);
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 102eeefa731d..2ad5ee320be9 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -58,7 +58,7 @@ struct VirtIOIOMMU {
     ReservedRegion *reserved_regions;
     uint32_t nb_reserved_regions;
     GTree *domains;
-    QemuMutex mutex;
+    QemuRecMutex mutex;
     GTree *endpoints;
     bool boot_bypass;
 };
-- 
2.25.1
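To make the deadlock being fixed here concrete, the re-entrant path looks
roughly like this (call chain reconstructed for illustration; the intermediate
steps are a simplification of the memory-API/VFIO notifier machinery):

    /*
     * virtio_iommu_handle_command()           takes s->mutex
     *   -> virtio_iommu_handle_attach()
     *     -> virtio_iommu_switch_address_space()
     *       -> memory_region_set_enabled()    address space / FlatView update
     *         -> VFIO listener replays mappings for the assigned device
     *           -> virtio_iommu_replay()      takes s->mutex again -> deadlock
     *
     * A plain QemuMutex is not re-entrant, so the second lock attempt hangs;
     * QemuRecMutex lets the same thread take the lock recursively.
     */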
* [PATCH 3/3] virtio-iommu: Add an assert check in translate routine
  From: Zhenzhong Duan
  Date: 2022-06-13 6:10 UTC
  To: eric.auger
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

With the address space switch in place, DMA translation can only happen
after the endpoint has been attached to a non-bypass domain, so assert
that in the translate routine.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/virtio-iommu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 73d5bde9d122..7c122ab95780 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -865,6 +865,10 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
     qemu_rec_mutex_lock(&s->mutex);
 
     ep = g_tree_lookup(s->endpoints, GUINT_TO_POINTER(sid));
+
+    if (bypass_allowed)
+        assert(ep && ep->domain && !ep->domain->bypass);
+
     if (!ep) {
         if (!bypass_allowed) {
             error_report_once("%s sid=%d is not known!!", __func__, sid);
-- 
2.25.1
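The assertion holds because, with the address-space switch from patch 1/3,
virtio_iommu_translate() can only be reached through the IOMMU memory region,
and that region is only enabled while the endpoint is attached to a non-bypass
domain. A brace-style variant following QEMU coding conventions could look
like this (editorial suggestion only, not what was posted):

    if (bypass_allowed) {
        assert(ep && ep->domain && !ep->domain->bypass);
    }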
* Re: [PATCH 0/3] Add bypass mode support to assigned device
  From: Eric Auger
  Date: 2022-06-23 16:52 UTC
  To: Zhenzhong Duan
  Cc: qemu-devel, mst, jean-philippe, pbonzini, yu.c.zhang, chuanxiao.dong, tina.zhang

Hi Duan,

On 6/13/22 08:10, Zhenzhong Duan wrote:
[...]
> Tested with four combinations of qemu's virtio-iommu.boot-bypass=true/false
> with guest kernel's iommu=pt/nopt on x86_64 platform.

I know this has already landed upstream (I was off last week) but I have
a few comments/questions related to the series.

> Zhenzhong Duan (3):
>   virtio-iommu: Add bypass mode support to assigned device
>   virtio-iommu: Use recursive lock to avoid deadlock

This patch could have been squashed into the previous one, as
"virtio-iommu: Add bypass mode support to assigned device" deadlocks on
its own.

Eric
* RE: [PATCH 0/3] Add bypass mode support to assigned device
  From: Duan, Zhenzhong
  Date: 2022-06-24 7:36 UTC
  To: eric.auger@redhat.com
  Cc: qemu-devel@nongnu.org, mst@redhat.com, jean-philippe@linaro.org, pbonzini@redhat.com, Zhang, Yu C, Dong, Chuanxiao, Zhang, Tina

Hi Eric,

> From: Eric Auger <eric.auger@redhat.com>
> Sent: Friday, June 24, 2022 12:52 AM
[...]
> I know this has already landed upstream (I was off last week) but I have
> a few comments/questions related to the series.

Sure 😊

>> Zhenzhong Duan (3):
>>   virtio-iommu: Add bypass mode support to assigned device
>>   virtio-iommu: Use recursive lock to avoid deadlock
>
> This patch could have been squashed into the previous one, as
> "virtio-iommu: Add bypass mode support to assigned device" deadlocks on
> its own.

Yes, that may be better. My initial thought was to keep the first patch a
bit slimmer, which might be clearer for maintainers to review.

Thanks
Zhenzhong