qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/5] Fix vIOMMU reset order
@ 2025-02-18 18:25 Eric Auger
  2025-02-18 18:25 ` [PATCH v3 1/5] hw/virtio/virtio-iommu: Migrate to 3-phase reset Eric Auger
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

With current reset scheme, DMA capable devices are reset after
the vIOMMU which translate them. This holds for the different
IOMMUs and various DMA capable devices such as virtio devices
and VFIO ones. With virtio devices, spurious traces can be
observed at qemu level such as "virtio: zero sized buffers are
not allowed" while for VFIO devices, translation faults can be
observed at host level.

Virtio devices use 3 phase reset and virtio-pci devices are reset
in the 'hold' phase. VFIO device reset are registered using
qemu_register_reset() and as a consequence they are also reset
on 'hold' phase.

Note that the tree of QOM devices resets depth-first but it does
so while enforcing the 3 phases. First the tree is traversed doing
the 'enter' phase, then the 'hold' phase and eventually the 'exit'
phase.

However the QOM hierarchy is not built so that vIOMMUs get reset
after the DMA capable devices (IOMMUs are using either legacy reset
scheme or hold phase). Changing the QOM hierarchy does not sound
trivial while forcing the vIOMMUs to be reset on 'exit' phase
sounds reasonable and much simpler. Obviously this relies on the
assumption that all DMA capable devices quiesce their DMA before
(ie. during 'enter' or hold' phase).

This was tested with qmp system_reset and virsh reset.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/viommu-3phase-reset-v2

History:
v2 -> v3:
- Collected R-bs and A-bs from Zhenzhong and Jason
- fixed the cover letter (Zhenzhong)
- no code change

v1 -> v2:
- Removed hw/i386/intel_iommu: Tear down address spaces before
  IOMMU reset
- Also move SMMU base class reset to exit reset. This was an
  oversight from v1
- Add last patch documenting expectations in terms of DMA reset
- Improved commit messages and cover letter
- dared to keep Michael's A-b for the patches whose code was
  not altered

References:
[1] [PATCH 0/4] intel_iommu: Reset vIOMMU after all the rest of devices
https://lore.kernel.org/all/20240117091559.144730-1-peterx@redhat.com/

Eric Auger (5):
  hw/virtio/virtio-iommu: Migrate to 3-phase reset
  hw/i386/intel-iommu: Migrate to 3-phase reset
  hw/arm/smmuv3: Move reset to exit phase
  hw/vfio/common: Add a trace point in vfio_reset_handler
  docs/devel/reset: Document reset expectations for DMA and IOMMU

 docs/devel/reset.rst     |  5 +++++
 hw/arm/smmu-common.c     |  9 +++++++--
 hw/arm/smmuv3.c          | 14 ++++++++++----
 hw/i386/intel_iommu.c    | 12 +++++++++---
 hw/vfio/common.c         |  1 +
 hw/virtio/virtio-iommu.c | 14 ++++++++++----
 hw/arm/trace-events      |  1 +
 hw/i386/trace-events     |  1 +
 hw/vfio/trace-events     |  1 +
 hw/virtio/trace-events   |  2 +-
 10 files changed, 46 insertions(+), 14 deletions(-)

-- 
2.47.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 1/5] hw/virtio/virtio-iommu: Migrate to 3-phase reset
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
@ 2025-02-18 18:25 ` Eric Auger
  2025-02-18 18:25 ` [PATCH v3 2/5] hw/i386/intel-iommu: " Eric Auger
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

Currently the iommu may be reset before the devices
it protects. For example this happens with virtio-net.

Let's use 3-phase reset mechanism and reset the IOMMU on
exit phase after all DMA capable devices have been
reset during the 'enter' or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>

---
v2 -> v3:
- add a comment emphasizing the fact that the exit reset
  needs to be used
---
 hw/virtio/virtio-iommu.c | 14 ++++++++++----
 hw/virtio/trace-events   |  2 +-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index f41104a952..b6e7e01ef7 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -1504,11 +1504,11 @@ static void virtio_iommu_device_unrealize(DeviceState *dev)
     virtio_cleanup(vdev);
 }
 
-static void virtio_iommu_device_reset(VirtIODevice *vdev)
+static void virtio_iommu_device_reset_exit(Object *obj, ResetType type)
 {
-    VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
+    VirtIOIOMMU *s = VIRTIO_IOMMU(obj);
 
-    trace_virtio_iommu_device_reset();
+    trace_virtio_iommu_device_reset_exit();
 
     if (s->domains) {
         g_tree_destroy(s->domains);
@@ -1668,6 +1668,7 @@ static void virtio_iommu_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
 
     device_class_set_props(dc, virtio_iommu_properties);
     dc->vmsd = &vmstate_virtio_iommu;
@@ -1675,7 +1676,12 @@ static void virtio_iommu_class_init(ObjectClass *klass, void *data)
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     vdc->realize = virtio_iommu_device_realize;
     vdc->unrealize = virtio_iommu_device_unrealize;
-    vdc->reset = virtio_iommu_device_reset;
+
+    /*
+     * Use 'exit' reset phase to make sure all DMA requests
+     * have been quiesced during 'enter' or 'hold' phase
+     */
+    rc->phases.exit = virtio_iommu_device_reset_exit;
     vdc->get_config = virtio_iommu_get_config;
     vdc->set_config = virtio_iommu_set_config;
     vdc->get_features = virtio_iommu_get_features;
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 04e36ae047..76f0d458b2 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -108,7 +108,7 @@ virtio_pci_notify_write(uint64_t addr, uint64_t val, unsigned int size) "0x%" PR
 virtio_pci_notify_write_pio(uint64_t addr, uint64_t val, unsigned int size) "0x%" PRIx64" = 0x%" PRIx64 " (%d)"
 
 # hw/virtio/virtio-iommu.c
-virtio_iommu_device_reset(void) "reset!"
+virtio_iommu_device_reset_exit(void) "reset!"
 virtio_iommu_system_reset(void) "system reset!"
 virtio_iommu_get_features(uint64_t features) "device supports features=0x%"PRIx64
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 2/5] hw/i386/intel-iommu: Migrate to 3-phase reset
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
  2025-02-18 18:25 ` [PATCH v3 1/5] hw/virtio/virtio-iommu: Migrate to 3-phase reset Eric Auger
@ 2025-02-18 18:25 ` Eric Auger
  2025-02-18 18:25 ` [PATCH v3 3/5] hw/arm/smmuv3: Move reset to exit phase Eric Auger
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

Currently the IOMMU may be reset before the devices
it protects. For example this happens with virtio devices
but also with VFIO devices. In this latter case this
produces spurious translation faults on host.

Let's use 3-phase reset mechanism and reset the IOMMU on
exit phase after all DMA capable devices have been reset
on 'enter' or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Zhenzhong Duan <zhenzhong.duan@intel.com>

---

v1 -> v2:
- add a comment emphasizing the importance of using an 'exit reset
---
 hw/i386/intel_iommu.c | 12 +++++++++---
 hw/i386/trace-events  |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f366c223d0..a5cf2d0e81 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -4697,10 +4697,11 @@ static void vtd_init(IntelIOMMUState *s)
 /* Should not reset address_spaces when reset because devices will still use
  * the address space they got at first (won't ask the bus again).
  */
-static void vtd_reset(DeviceState *dev)
+static void vtd_reset_exit(Object *obj, ResetType type)
 {
-    IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
+    IntelIOMMUState *s = INTEL_IOMMU_DEVICE(obj);
 
+    trace_vtd_reset_exit();
     vtd_init(s);
     vtd_address_space_refresh_all(s);
 }
@@ -4864,8 +4865,13 @@ static void vtd_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     X86IOMMUClass *x86_class = X86_IOMMU_DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);
 
-    device_class_set_legacy_reset(dc, vtd_reset);
+    /*
+     * Use 'exit' reset phase to make sure all DMA requests
+     * have been quiesced during 'enter' or 'hold' phase
+     */
+    rc->phases.exit = vtd_reset_exit;
     dc->vmsd = &vtd_vmstate;
     device_class_set_props(dc, vtd_properties);
     dc->hotpluggable = false;
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index 53c02d7ac8..ac9e1a10aa 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -68,6 +68,7 @@ vtd_frr_new(int index, uint64_t hi, uint64_t lo) "index %d high 0x%"PRIx64" low
 vtd_warn_invalid_qi_tail(uint16_t tail) "tail 0x%"PRIx16
 vtd_warn_ir_vector(uint16_t sid, int index, int vec, int target) "sid 0x%"PRIx16" index %d vec %d (should be: %d)"
 vtd_warn_ir_trigger(uint16_t sid, int index, int trig, int target) "sid 0x%"PRIx16" index %d trigger %d (should be: %d)"
+vtd_reset_exit(void) ""
 
 # amd_iommu.c
 amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64" +  offset 0x%"PRIx32
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 3/5] hw/arm/smmuv3: Move reset to exit phase
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
  2025-02-18 18:25 ` [PATCH v3 1/5] hw/virtio/virtio-iommu: Migrate to 3-phase reset Eric Auger
  2025-02-18 18:25 ` [PATCH v3 2/5] hw/i386/intel-iommu: " Eric Auger
@ 2025-02-18 18:25 ` Eric Auger
  2025-02-18 18:25 ` [PATCH v3 4/5] hw/vfio/common: Add a trace point in vfio_reset_handler Eric Auger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

Currently the iommu may be reset before the devices
it protects. For example this happens with virtio-scsi-pci.
when system_reset is issued from qmp monitor: spurious
"virtio: zero sized buffers are not allowed" warnings can
be observed. This happens because outstanding DMA requests
are still happening while the SMMU gets reset.

This can also happen with VFIO devices. In that case
spurious DMA translation faults can be observed on host.

Make sure the SMMU is reset in the 'exit' phase after
all DMA capable devices have been reset during the 'enter'
or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

---

v1 -> v2:
- also migrate smmu_base_reset_hold to smmu_base_reset_exit
  which was an oversight of v1
- add a comment emphasizing the importance of doing the reset
  in 'exit' phase.
---
 hw/arm/smmu-common.c |  9 +++++++--
 hw/arm/smmuv3.c      | 14 ++++++++++----
 hw/arm/trace-events  |  1 +
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index dd74c2e558..8c1b407b82 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -924,7 +924,12 @@ static void smmu_base_realize(DeviceState *dev, Error **errp)
     }
 }
 
-static void smmu_base_reset_hold(Object *obj, ResetType type)
+/*
+ * Make sure the IOMMU is reset in 'exit' phase after
+ * all outstanding DMA requests have been quiesced during
+ * the 'enter' or 'hold' reset phases
+ */
+static void smmu_base_reset_exit(Object *obj, ResetType type)
 {
     SMMUState *s = ARM_SMMU(obj);
 
@@ -949,7 +954,7 @@ static void smmu_base_class_init(ObjectClass *klass, void *data)
     device_class_set_props(dc, smmu_dev_properties);
     device_class_set_parent_realize(dc, smmu_base_realize,
                                     &sbc->parent_realize);
-    rc->phases.hold = smmu_base_reset_hold;
+    rc->phases.exit = smmu_base_reset_exit;
 }
 
 static const TypeInfo smmu_base_info = {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c0cf5df0f6..b49a59b64c 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1870,13 +1870,19 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev)
     }
 }
 
-static void smmu_reset_hold(Object *obj, ResetType type)
+/*
+ * Make sure the IOMMU is reset in 'exit' phase after
+ * all outstanding DMA requests have been quiesced during
+ * the 'enter' or 'hold' reset phases
+ */
+static void smmu_reset_exit(Object *obj, ResetType type)
 {
     SMMUv3State *s = ARM_SMMUV3(obj);
     SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
 
-    if (c->parent_phases.hold) {
-        c->parent_phases.hold(obj, type);
+    trace_smmu_reset_exit();
+    if (c->parent_phases.exit) {
+        c->parent_phases.exit(obj, type);
     }
 
     smmuv3_init_regs(s);
@@ -1999,7 +2005,7 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
     SMMUv3Class *c = ARM_SMMUV3_CLASS(klass);
 
     dc->vmsd = &vmstate_smmuv3;
-    resettable_class_set_parent_phases(rc, NULL, smmu_reset_hold, NULL,
+    resettable_class_set_parent_phases(rc, NULL, NULL, smmu_reset_exit,
                                        &c->parent_phases);
     device_class_set_parent_realize(dc, smmu_realize,
                                     &c->parent_realize);
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index c64ad344bd..7790db780e 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -56,6 +56,7 @@ smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x"
 smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
 smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
 smmuv3_inv_notifiers_iova(const char *name, int asid, int vmid, uint64_t iova, uint8_t tg, uint64_t num_pages, int stage) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" stage=%d"
+smmu_reset_exit(void) ""
 
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 4/5] hw/vfio/common: Add a trace point in vfio_reset_handler
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
                   ` (2 preceding siblings ...)
  2025-02-18 18:25 ` [PATCH v3 3/5] hw/arm/smmuv3: Move reset to exit phase Eric Auger
@ 2025-02-18 18:25 ` Eric Auger
  2025-02-18 18:25 ` [PATCH v3 5/5] docs/devel/reset: Document reset expectations for DMA and IOMMU Eric Auger
  2025-02-18 22:31 ` [PATCH v3 0/5] Fix vIOMMU reset order Peter Xu
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

To ease the debug of reset sequence, let's add a trace point
in vfio_reset_handler()

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/common.c     | 1 +
 hw/vfio/trace-events | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f7499a9b74..173fb3a997 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1386,6 +1386,7 @@ void vfio_reset_handler(void *opaque)
 {
     VFIODevice *vbasedev;
 
+    trace_vfio_reset_handler();
     QLIST_FOREACH(vbasedev, &vfio_device_list, global_next) {
         if (vbasedev->dev->realized) {
             vbasedev->ops->vfio_compute_needs_reset(vbasedev);
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index cab1cf1de0..c5385e1a4f 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -120,6 +120,7 @@ vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype
 vfio_legacy_dma_unmap_overflow_workaround(void) ""
 vfio_get_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64
 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64
+vfio_reset_handler(void) ""
 
 # platform.c
 vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 5/5] docs/devel/reset: Document reset expectations for DMA and IOMMU
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
                   ` (3 preceding siblings ...)
  2025-02-18 18:25 ` [PATCH v3 4/5] hw/vfio/common: Add a trace point in vfio_reset_handler Eric Auger
@ 2025-02-18 18:25 ` Eric Auger
  2025-02-18 22:31 ` [PATCH v3 0/5] Fix vIOMMU reset order Peter Xu
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Auger @ 2025-02-18 18:25 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	mst, jasowang, imammedo, peterx, alex.williamson, clg, philmd,
	zhenzhong.duan, ddutile

To avoid any translation faults, the IOMMUs are expected to be
reset after the devices they protect. Document that we expect
DMA requests to be stopped during the 'enter' or 'hold' phase
while IOMMUs should be reset during the 'exit' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 docs/devel/reset.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/devel/reset.rst b/docs/devel/reset.rst
index adefd59ef9..0b8b2fa5f4 100644
--- a/docs/devel/reset.rst
+++ b/docs/devel/reset.rst
@@ -143,6 +143,11 @@ The *exit* phase is executed only when the last reset operation ends. Therefore
 the object does not need to care how many of reset controllers it has and how
 many of them have started a reset.
 
+DMA capable devices are expected to cancel all outstanding DMA operations
+during either 'enter' or 'hold' phases. IOMMUs are expected to reset during
+the 'exit' phase and this sequencing makes sure no outstanding DMA request
+will fault.
+
 
 Handling reset in a resettable object
 -------------------------------------
-- 
2.47.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 0/5] Fix vIOMMU reset order
  2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
                   ` (4 preceding siblings ...)
  2025-02-18 18:25 ` [PATCH v3 5/5] docs/devel/reset: Document reset expectations for DMA and IOMMU Eric Auger
@ 2025-02-18 22:31 ` Peter Xu
  5 siblings, 0 replies; 7+ messages in thread
From: Peter Xu @ 2025-02-18 22:31 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-devel, qemu-arm, peter.maydell, mst,
	jasowang, imammedo, alex.williamson, clg, philmd, zhenzhong.duan,
	ddutile

On Tue, Feb 18, 2025 at 07:25:30PM +0100, Eric Auger wrote:
> With current reset scheme, DMA capable devices are reset after
> the vIOMMU which translate them. This holds for the different
> IOMMUs and various DMA capable devices such as virtio devices
> and VFIO ones. With virtio devices, spurious traces can be
> observed at qemu level such as "virtio: zero sized buffers are
> not allowed" while for VFIO devices, translation faults can be
> observed at host level.
> 
> Virtio devices use 3 phase reset and virtio-pci devices are reset
> in the 'hold' phase. VFIO device reset are registered using
> qemu_register_reset() and as a consequence they are also reset
> on 'hold' phase.
> 
> Note that the tree of QOM devices resets depth-first but it does
> so while enforcing the 3 phases. First the tree is traversed doing
> the 'enter' phase, then the 'hold' phase and eventually the 'exit'
> phase.
> 
> However the QOM hierarchy is not built so that vIOMMUs get reset
> after the DMA capable devices (IOMMUs are using either legacy reset
> scheme or hold phase). Changing the QOM hierarchy does not sound
> trivial while forcing the vIOMMUs to be reset on 'exit' phase
> sounds reasonable and much simpler. Obviously this relies on the
> assumption that all DMA capable devices quiesce their DMA before
> (ie. during 'enter' or hold' phase).
> 
> This was tested with qmp system_reset and virsh reset.

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-02-18 22:32 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-18 18:25 [PATCH v3 0/5] Fix vIOMMU reset order Eric Auger
2025-02-18 18:25 ` [PATCH v3 1/5] hw/virtio/virtio-iommu: Migrate to 3-phase reset Eric Auger
2025-02-18 18:25 ` [PATCH v3 2/5] hw/i386/intel-iommu: " Eric Auger
2025-02-18 18:25 ` [PATCH v3 3/5] hw/arm/smmuv3: Move reset to exit phase Eric Auger
2025-02-18 18:25 ` [PATCH v3 4/5] hw/vfio/common: Add a trace point in vfio_reset_handler Eric Auger
2025-02-18 18:25 ` [PATCH v3 5/5] docs/devel/reset: Document reset expectations for DMA and IOMMU Eric Auger
2025-02-18 22:31 ` [PATCH v3 0/5] Fix vIOMMU reset order Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).