* [PATCH 0/4] Some trivial live update fixes
@ 2025-06-23 10:22 Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Zhenzhong Duan @ 2025-06-23 10:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, steven.sistare, chao.p.peng,
Zhenzhong Duan, Michael Tokarev, Laurent Vivier,
open list:Trivial patches
Hi
These are trivial VFIO live update fixes in corner cases.
1) open /dev/vfio/vfio fail trigger SIGSEGV
2) mdev hotplug trigger qemu abort
3) potential SIGSEGV when unmap-all-vaddr failed
3) potential vfio_container_post_load failure
Thanks
Zhenzhong
Zhenzhong Duan (4):
vfio/container: Fix SIGSEGV when open container file fails
vfio/container: fails mdev hotplug if add migration blocker failed
vfio/container: Fix potential SIGSEGV when recover from
unmap-all-vaddr failure
vfio/container: Fix vfio_container_post_load()
include/hw/vfio/vfio-cpr.h | 7 ++++---
hw/vfio/container.c | 12 +++++++++---
hw/vfio/cpr-legacy.c | 23 +++++++++--------------
3 files changed, 22 insertions(+), 20 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails
2025-06-23 10:22 [PATCH 0/4] Some trivial live update fixes Zhenzhong Duan
@ 2025-06-23 10:22 ` Zhenzhong Duan
2025-06-24 8:53 ` Cédric Le Goater
2025-06-24 9:55 ` Cédric Le Goater
2025-06-23 10:22 ` [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed Zhenzhong Duan
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Zhenzhong Duan @ 2025-06-23 10:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, steven.sistare, chao.p.peng,
Zhenzhong Duan
When open /dev/vfio/vfio fails, SIGSEGV triggers because
vfio_listener_unregister() doesn't support a NULL bcontainer
pointer.
Fixes: a1f267a7d4d9 ("vfio/container: reform vfio_container_connect cleanup")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/container.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 3e8d645ebb..2853f6f08b 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -710,7 +710,9 @@ static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
return true;
fail:
- vfio_listener_unregister(bcontainer);
+ if (new_container) {
+ vfio_listener_unregister(bcontainer);
+ }
if (group_was_added) {
vfio_container_group_del(container, group);
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed
2025-06-23 10:22 [PATCH 0/4] Some trivial live update fixes Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
@ 2025-06-23 10:22 ` Zhenzhong Duan
2025-06-24 9:21 ` Cédric Le Goater
2025-06-23 10:22 ` [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 4/4] vfio/container: Fix vfio_container_post_load() Zhenzhong Duan
3 siblings, 1 reply; 14+ messages in thread
From: Zhenzhong Duan @ 2025-06-23 10:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, steven.sistare, chao.p.peng,
Zhenzhong Duan
It's aggressive to abort a running QEMU process when hotplug a mdev
and it fails migration blocker adding.
Fix by just failing mdev hotplug itself.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/container.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 2853f6f08b..68b4fdb401 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -992,12 +992,16 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
if (vbasedev->mdev) {
error_setg(&vbasedev->cpr.mdev_blocker,
"CPR does not support vfio mdev %s", vbasedev->name);
- migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, &error_fatal,
- MIG_MODE_CPR_TRANSFER, -1);
+ if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp,
+ MIG_MODE_CPR_TRANSFER, -1)) {
+ goto hiod_unref_exit;
+ }
}
return true;
+hiod_unref_exit:
+ object_unref(vbasedev->hiod);
device_put_exit:
vfio_device_put(vbasedev);
group_put_exit:
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure
2025-06-23 10:22 [PATCH 0/4] Some trivial live update fixes Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed Zhenzhong Duan
@ 2025-06-23 10:22 ` Zhenzhong Duan
2025-06-24 16:54 ` Cédric Le Goater
2025-06-23 10:22 ` [PATCH 4/4] vfio/container: Fix vfio_container_post_load() Zhenzhong Duan
3 siblings, 1 reply; 14+ messages in thread
From: Zhenzhong Duan @ 2025-06-23 10:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, steven.sistare, chao.p.peng,
Zhenzhong Duan
cpr.saved_dma_map isn't initialized in source qemu which lead to vioc->dma_map
assigned a NULL value, this will trigger SIGSEGV.
Fix it by save and restore vioc->dma_map locally.
Fixes: eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-cpr.h | 8 +++++---
hw/vfio/cpr-legacy.c | 3 ++-
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
index 8bf85b9f4e..aef542e93c 100644
--- a/include/hw/vfio/vfio-cpr.h
+++ b/include/hw/vfio/vfio-cpr.h
@@ -16,14 +16,16 @@ struct VFIOContainer;
struct VFIOContainerBase;
struct VFIOGroup;
+typedef int (*DMA_MAP_FUNC)(const struct VFIOContainerBase *bcontainer,
+ hwaddr iova, ram_addr_t size, void *vaddr,
+ bool readonly, MemoryRegion *mr);
+
typedef struct VFIOContainerCPR {
Error *blocker;
bool vaddr_unmapped;
NotifierWithReturn transfer_notifier;
MemoryListener remap_listener;
- int (*saved_dma_map)(const struct VFIOContainerBase *bcontainer,
- hwaddr iova, ram_addr_t size,
- void *vaddr, bool readonly, MemoryRegion *mr);
+ DMA_MAP_FUNC saved_dma_map;
} VFIOContainerCPR;
typedef struct VFIODeviceCPR {
diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
index a84c3247b7..100a8db74d 100644
--- a/hw/vfio/cpr-legacy.c
+++ b/hw/vfio/cpr-legacy.c
@@ -148,6 +148,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
*/
VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+ DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
vioc->dma_map = vfio_legacy_cpr_dma_map;
container->cpr.remap_listener = (MemoryListener) {
@@ -158,7 +159,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
bcontainer->space->as);
memory_listener_unregister(&container->cpr.remap_listener);
container->cpr.vaddr_unmapped = false;
- vioc->dma_map = container->cpr.saved_dma_map;
+ vioc->dma_map = saved_dma_map;
}
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] vfio/container: Fix vfio_container_post_load()
2025-06-23 10:22 [PATCH 0/4] Some trivial live update fixes Zhenzhong Duan
` (2 preceding siblings ...)
2025-06-23 10:22 ` [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure Zhenzhong Duan
@ 2025-06-23 10:22 ` Zhenzhong Duan
2025-06-26 13:21 ` Steven Sistare
3 siblings, 1 reply; 14+ messages in thread
From: Zhenzhong Duan @ 2025-06-23 10:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, steven.sistare, chao.p.peng,
Zhenzhong Duan
When there are multiple VFIO containers, vioc->dma_map is restored
multiple times, this made only first container work and remaining
containers using vioc->dma_map restored by first container.
Fix it by save and restore vioc->dma_map locally. saved_dma_map in
VFIOContainerCPR becomes useless and is removed.
Fixes: 7e9f21411302 ("vfio/container: restore DMA vaddr")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-cpr.h | 1 -
hw/vfio/cpr-legacy.c | 20 +++++++-------------
2 files changed, 7 insertions(+), 14 deletions(-)
diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
index aef542e93c..71a1c1a70c 100644
--- a/include/hw/vfio/vfio-cpr.h
+++ b/include/hw/vfio/vfio-cpr.h
@@ -25,7 +25,6 @@ typedef struct VFIOContainerCPR {
bool vaddr_unmapped;
NotifierWithReturn transfer_notifier;
MemoryListener remap_listener;
- DMA_MAP_FUNC saved_dma_map;
} VFIOContainerCPR;
typedef struct VFIODeviceCPR {
diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
index 100a8db74d..ff45a5128f 100644
--- a/hw/vfio/cpr-legacy.c
+++ b/hw/vfio/cpr-legacy.c
@@ -99,20 +99,21 @@ static int vfio_container_post_load(void *opaque, int version_id)
{
VFIOContainer *container = opaque;
VFIOContainerBase *bcontainer = &container->bcontainer;
- VFIOGroup *group;
+ VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+ DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
Error *local_err = NULL;
+ /* During incoming CPR, divert calls to dma_map. */
+ vioc->dma_map = vfio_legacy_cpr_dma_map;
+
if (!vfio_listener_register(bcontainer, &local_err)) {
error_report_err(local_err);
return -1;
}
- QLIST_FOREACH(group, &container->group_list, container_next) {
- VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+ /* Restore original dma_map function */
+ vioc->dma_map = saved_dma_map;
- /* Restore original dma_map function */
- vioc->dma_map = container->cpr.saved_dma_map;
- }
return 0;
}
@@ -180,13 +181,6 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp)
vmstate_register(NULL, -1, &vfio_container_vmstate, container);
- /* During incoming CPR, divert calls to dma_map. */
- if (cpr_is_incoming()) {
- VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
- container->cpr.saved_dma_map = vioc->dma_map;
- vioc->dma_map = vfio_legacy_cpr_dma_map;
- }
-
migration_add_notifier_mode(&container->cpr.transfer_notifier,
vfio_cpr_fail_notifier,
MIG_MODE_CPR_TRANSFER);
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
@ 2025-06-24 8:53 ` Cédric Le Goater
2025-06-24 9:55 ` Cédric Le Goater
1 sibling, 0 replies; 14+ messages in thread
From: Cédric Le Goater @ 2025-06-24 8:53 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, steven.sistare, chao.p.peng
On 6/23/25 12:22, Zhenzhong Duan wrote:
> When open /dev/vfio/vfio fails, SIGSEGV triggers because
> vfio_listener_unregister() doesn't support a NULL bcontainer
> pointer.
>
> Fixes: a1f267a7d4d9 ("vfio/container: reform vfio_container_connect cleanup")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> hw/vfio/container.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 3e8d645ebb..2853f6f08b 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -710,7 +710,9 @@ static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
> return true;
>
> fail:
> - vfio_listener_unregister(bcontainer);
> + if (new_container) {
> + vfio_listener_unregister(bcontainer);
> + }
>
> if (group_was_added) {
> vfio_container_group_del(container, group);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed
2025-06-23 10:22 ` [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed Zhenzhong Duan
@ 2025-06-24 9:21 ` Cédric Le Goater
2025-06-25 21:08 ` Cédric Le Goater
0 siblings, 1 reply; 14+ messages in thread
From: Cédric Le Goater @ 2025-06-24 9:21 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, steven.sistare, chao.p.peng
On 6/23/25 12:22, Zhenzhong Duan wrote:
> It's aggressive to abort a running QEMU process when hotplug a mdev
> and it fails migration blocker adding.
>
> Fix by just failing mdev hotplug itself.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/container.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 2853f6f08b..68b4fdb401 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -992,12 +992,16 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
> if (vbasedev->mdev) {
> error_setg(&vbasedev->cpr.mdev_blocker,
> "CPR does not support vfio mdev %s", vbasedev->name);
> - migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, &error_fatal,
> - MIG_MODE_CPR_TRANSFER, -1);
> + if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp,
> + MIG_MODE_CPR_TRANSFER, -1)) {
migrate_add_blocker_modes() returns -errno. Testing with '< 0' would be
better.
Thanks,
C.
> + goto hiod_unref_exit;
> + }
> }
>
> return true;
>
> +hiod_unref_exit:
> + object_unref(vbasedev->hiod);
> device_put_exit:
> vfio_device_put(vbasedev);
> group_put_exit:
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
2025-06-24 8:53 ` Cédric Le Goater
@ 2025-06-24 9:55 ` Cédric Le Goater
1 sibling, 0 replies; 14+ messages in thread
From: Cédric Le Goater @ 2025-06-24 9:55 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, steven.sistare, chao.p.peng
On 6/23/25 12:22, Zhenzhong Duan wrote:
> When open /dev/vfio/vfio fails, SIGSEGV triggers because
> vfio_listener_unregister() doesn't support a NULL bcontainer
> pointer.
>
> Fixes: a1f267a7d4d9 ("vfio/container: reform vfio_container_connect cleanup")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/container.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 3e8d645ebb..2853f6f08b 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -710,7 +710,9 @@ static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
> return true;
>
> fail:
> - vfio_listener_unregister(bcontainer);
> + if (new_container) {
> + vfio_listener_unregister(bcontainer);
> + }
>
> if (group_was_added) {
> vfio_container_group_del(container, group);
Applied to vfio-next.
Thanks,
C.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure
2025-06-23 10:22 ` [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure Zhenzhong Duan
@ 2025-06-24 16:54 ` Cédric Le Goater
2025-06-26 12:53 ` Steven Sistare
0 siblings, 1 reply; 14+ messages in thread
From: Cédric Le Goater @ 2025-06-24 16:54 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, steven.sistare, chao.p.peng
On 6/23/25 12:22, Zhenzhong Duan wrote:
> cpr.saved_dma_map isn't initialized in source qemu which lead to vioc->dma_map
> assigned a NULL value, this will trigger SIGSEGV.
I don't understand the scenario. Could you please explain more ?
> Fix it by save and restore vioc->dma_map locally.
Steve, this is cpr territory. Is it still an issue with the rest
of the patches applied ?
Thanks,
C.
>
> Fixes: eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/vfio/vfio-cpr.h | 8 +++++---
> hw/vfio/cpr-legacy.c | 3 ++-
> 2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
> index 8bf85b9f4e..aef542e93c 100644
> --- a/include/hw/vfio/vfio-cpr.h
> +++ b/include/hw/vfio/vfio-cpr.h
> @@ -16,14 +16,16 @@ struct VFIOContainer;
> struct VFIOContainerBase;
> struct VFIOGroup;
>
> +typedef int (*DMA_MAP_FUNC)(const struct VFIOContainerBase *bcontainer,
> + hwaddr iova, ram_addr_t size, void *vaddr,
> + bool readonly, MemoryRegion *mr);
> +
> typedef struct VFIOContainerCPR {
> Error *blocker;
> bool vaddr_unmapped;
> NotifierWithReturn transfer_notifier;
> MemoryListener remap_listener;
> - int (*saved_dma_map)(const struct VFIOContainerBase *bcontainer,
> - hwaddr iova, ram_addr_t size,
> - void *vaddr, bool readonly, MemoryRegion *mr);
> + DMA_MAP_FUNC saved_dma_map;
> } VFIOContainerCPR;
>
> typedef struct VFIODeviceCPR {
> diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
> index a84c3247b7..100a8db74d 100644
> --- a/hw/vfio/cpr-legacy.c
> +++ b/hw/vfio/cpr-legacy.c
> @@ -148,6 +148,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
> */
>
> VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> + DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
> vioc->dma_map = vfio_legacy_cpr_dma_map;
>
> container->cpr.remap_listener = (MemoryListener) {
> @@ -158,7 +159,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
> bcontainer->space->as);
> memory_listener_unregister(&container->cpr.remap_listener);
> container->cpr.vaddr_unmapped = false;
> - vioc->dma_map = container->cpr.saved_dma_map;
> + vioc->dma_map = saved_dma_map;
> }
> return 0;
> }
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed
2025-06-24 9:21 ` Cédric Le Goater
@ 2025-06-25 21:08 ` Cédric Le Goater
2025-06-26 1:43 ` Duan, Zhenzhong
0 siblings, 1 reply; 14+ messages in thread
From: Cédric Le Goater @ 2025-06-25 21:08 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, steven.sistare, chao.p.peng
On 6/24/25 11:21, Cédric Le Goater wrote:
> On 6/23/25 12:22, Zhenzhong Duan wrote:
>> It's aggressive to abort a running QEMU process when hotplug a mdev
>> and it fails migration blocker adding.
>>
>> Fix by just failing mdev hotplug itself.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/vfio/container.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 2853f6f08b..68b4fdb401 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -992,12 +992,16 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
>> if (vbasedev->mdev) {
>> error_setg(&vbasedev->cpr.mdev_blocker,
>> "CPR does not support vfio mdev %s", vbasedev->name);
>> - migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, &error_fatal,
>> - MIG_MODE_CPR_TRANSFER, -1);
>> + if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp,
>> + MIG_MODE_CPR_TRANSFER, -1)) {
>
> migrate_add_blocker_modes() returns -errno. Testing with '< 0' would be
> better.
>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
I changed the test on the value returned by migrate_add_blocker_modes().
Thanks,
C.
> Thanks,
>
> C.
>
>
>
>> + goto hiod_unref_exit;
>> + }
>> }
>> return true;
>> +hiod_unref_exit:
>> + object_unref(vbasedev->hiod);
>> device_put_exit:
>> vfio_device_put(vbasedev);
>> group_put_exit:
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed
2025-06-25 21:08 ` Cédric Le Goater
@ 2025-06-26 1:43 ` Duan, Zhenzhong
0 siblings, 0 replies; 14+ messages in thread
From: Duan, Zhenzhong @ 2025-06-26 1:43 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, eric.auger@redhat.com,
steven.sistare@oracle.com, Peng, Chao P
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 2/4] vfio/container: fails mdev hotplug if add migration
>blocker failed
>
>On 6/24/25 11:21, Cédric Le Goater wrote:
>> On 6/23/25 12:22, Zhenzhong Duan wrote:
>>> It's aggressive to abort a running QEMU process when hotplug a mdev
>>> and it fails migration blocker adding.
>>>
>>> Fix by just failing mdev hotplug itself.
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/vfio/container.c | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>>> index 2853f6f08b..68b4fdb401 100644
>>> --- a/hw/vfio/container.c
>>> +++ b/hw/vfio/container.c
>>> @@ -992,12 +992,16 @@ static bool vfio_legacy_attach_device(const char
>*name, VFIODevice *vbasedev,
>>> if (vbasedev->mdev) {
>>> error_setg(&vbasedev->cpr.mdev_blocker,
>>> "CPR does not support vfio mdev %s", vbasedev->name);
>>> - migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker,
>&error_fatal,
>>> - MIG_MODE_CPR_TRANSFER, -1);
>>> + if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp,
>>> + MIG_MODE_CPR_TRANSFER, -1)) {
>>
>> migrate_add_blocker_modes() returns -errno. Testing with '< 0' would be
>> better.
>>
>
>
>Reviewed-by: Cédric Le Goater <clg@redhat.com>
>
Thanks Cédric.
I ever planned to send an update after receiving comments for patch3,4.
BRs,
Zhenzhong
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure
2025-06-24 16:54 ` Cédric Le Goater
@ 2025-06-26 12:53 ` Steven Sistare
2025-06-26 13:04 ` Steven Sistare
0 siblings, 1 reply; 14+ messages in thread
From: Steven Sistare @ 2025-06-26 12:53 UTC (permalink / raw)
To: Cédric Le Goater, Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, chao.p.peng
On 6/24/2025 12:54 PM, Cédric Le Goater wrote:
> On 6/23/25 12:22, Zhenzhong Duan wrote:
>> cpr.saved_dma_map isn't initialized in source qemu which lead to vioc->dma_map
>> assigned a NULL value, this will trigger SIGSEGV.
>
> I don't understand the scenario. Could you please explain more ?
Thank you Zhenzhong, I see the bug.
CPR overrides then restores dma_map in both outgoing and incoming QEMU, for
different reasons. But it only sets saved_dma_map in the target.
So, a more future-proof fix IMO is to always set saved_dma_map:
@@ -182,9 +182,9 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *conta
vmstate_register(NULL, -1, &vfio_container_vmstate, container);
/* During incoming CPR, divert calls to dma_map. */
+ VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+ container->cpr.saved_dma_map = vioc->dma_map;
if (cpr_is_incoming()) {
- VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
- container->cpr.saved_dma_map = vioc->dma_map;
vioc->dma_map = vfio_legacy_cpr_dma_map;
}
- Steve
>> Fix it by save and restore vioc->dma_map locally.
>
> Steve, this is cpr territory. Is it still an issue with the rest
> of the patches applied ?
>
> Thanks,
>
> C.
>
>> Fixes: eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/hw/vfio/vfio-cpr.h | 8 +++++---
>> hw/vfio/cpr-legacy.c | 3 ++-
>> 2 files changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
>> index 8bf85b9f4e..aef542e93c 100644
>> --- a/include/hw/vfio/vfio-cpr.h
>> +++ b/include/hw/vfio/vfio-cpr.h
>> @@ -16,14 +16,16 @@ struct VFIOContainer;
>> struct VFIOContainerBase;
>> struct VFIOGroup;
>> +typedef int (*DMA_MAP_FUNC)(const struct VFIOContainerBase *bcontainer,
>> + hwaddr iova, ram_addr_t size, void *vaddr,
>> + bool readonly, MemoryRegion *mr);
>> +
>> typedef struct VFIOContainerCPR {
>> Error *blocker;
>> bool vaddr_unmapped;
>> NotifierWithReturn transfer_notifier;
>> MemoryListener remap_listener;
>> - int (*saved_dma_map)(const struct VFIOContainerBase *bcontainer,
>> - hwaddr iova, ram_addr_t size,
>> - void *vaddr, bool readonly, MemoryRegion *mr);
>> + DMA_MAP_FUNC saved_dma_map;
>> } VFIOContainerCPR;
>> typedef struct VFIODeviceCPR {
>> diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
>> index a84c3247b7..100a8db74d 100644
>> --- a/hw/vfio/cpr-legacy.c
>> +++ b/hw/vfio/cpr-legacy.c
>> @@ -148,6 +148,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
>> */
>> VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
>> + DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
>> vioc->dma_map = vfio_legacy_cpr_dma_map;
>> container->cpr.remap_listener = (MemoryListener) {
>> @@ -158,7 +159,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
>> bcontainer->space->as);
>> memory_listener_unregister(&container->cpr.remap_listener);
>> container->cpr.vaddr_unmapped = false;
>> - vioc->dma_map = container->cpr.saved_dma_map;
>> + vioc->dma_map = saved_dma_map;
>> }
>> return 0;
>> }
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure
2025-06-26 12:53 ` Steven Sistare
@ 2025-06-26 13:04 ` Steven Sistare
0 siblings, 0 replies; 14+ messages in thread
From: Steven Sistare @ 2025-06-26 13:04 UTC (permalink / raw)
To: Cédric Le Goater, Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, chao.p.peng
On 6/26/2025 8:53 AM, Steven Sistare wrote:
> On 6/24/2025 12:54 PM, Cédric Le Goater wrote:
>> On 6/23/25 12:22, Zhenzhong Duan wrote:
>>> cpr.saved_dma_map isn't initialized in source qemu which lead to vioc->dma_map
>>> assigned a NULL value, this will trigger SIGSEGV.
>>
>> I don't understand the scenario. Could you please explain more ?
>
> Thank you Zhenzhong, I see the bug.
>
> CPR overrides then restores dma_map in both outgoing and incoming QEMU, for
> different reasons. But it only sets saved_dma_map in the target.
>
> So, a more future-proof fix IMO is to always set saved_dma_map:
>
> @@ -182,9 +182,9 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *conta
> vmstate_register(NULL, -1, &vfio_container_vmstate, container);
>
> /* During incoming CPR, divert calls to dma_map. */
> + VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> + container->cpr.saved_dma_map = vioc->dma_map;
> if (cpr_is_incoming()) {
> - VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> - container->cpr.saved_dma_map = vioc->dma_map;
> vioc->dma_map = vfio_legacy_cpr_dma_map;
> }
Now I see that patch 4 deletes saved_dma_map entirely, and looks fine.
I still think the above diff is the best fix for patch 3, and you
can move the "save and restore locally" parts of patch 3 to patch 4.
- Steve
>>> Fix it by save and restore vioc->dma_map locally.
>>
>> Steve, this is cpr territory. Is it still an issue with the rest
>> of the patches applied ?
>>
>> Thanks,
>>
>> C.
>>
>>> Fixes: eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> include/hw/vfio/vfio-cpr.h | 8 +++++---
>>> hw/vfio/cpr-legacy.c | 3 ++-
>>> 2 files changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
>>> index 8bf85b9f4e..aef542e93c 100644
>>> --- a/include/hw/vfio/vfio-cpr.h
>>> +++ b/include/hw/vfio/vfio-cpr.h
>>> @@ -16,14 +16,16 @@ struct VFIOContainer;
>>> struct VFIOContainerBase;
>>> struct VFIOGroup;
>>> +typedef int (*DMA_MAP_FUNC)(const struct VFIOContainerBase *bcontainer,
>>> + hwaddr iova, ram_addr_t size, void *vaddr,
>>> + bool readonly, MemoryRegion *mr);
>>> +
>>> typedef struct VFIOContainerCPR {
>>> Error *blocker;
>>> bool vaddr_unmapped;
>>> NotifierWithReturn transfer_notifier;
>>> MemoryListener remap_listener;
>>> - int (*saved_dma_map)(const struct VFIOContainerBase *bcontainer,
>>> - hwaddr iova, ram_addr_t size,
>>> - void *vaddr, bool readonly, MemoryRegion *mr);
>>> + DMA_MAP_FUNC saved_dma_map;
>>> } VFIOContainerCPR;
>>> typedef struct VFIODeviceCPR {
>>> diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
>>> index a84c3247b7..100a8db74d 100644
>>> --- a/hw/vfio/cpr-legacy.c
>>> +++ b/hw/vfio/cpr-legacy.c
>>> @@ -148,6 +148,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
>>> */
>>> VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
>>> + DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
>>> vioc->dma_map = vfio_legacy_cpr_dma_map;
>>> container->cpr.remap_listener = (MemoryListener) {
>>> @@ -158,7 +159,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
>>> bcontainer->space->as);
>>> memory_listener_unregister(&container->cpr.remap_listener);
>>> container->cpr.vaddr_unmapped = false;
>>> - vioc->dma_map = container->cpr.saved_dma_map;
>>> + vioc->dma_map = saved_dma_map;
>>> }
>>> return 0;
>>> }
>>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] vfio/container: Fix vfio_container_post_load()
2025-06-23 10:22 ` [PATCH 4/4] vfio/container: Fix vfio_container_post_load() Zhenzhong Duan
@ 2025-06-26 13:21 ` Steven Sistare
0 siblings, 0 replies; 14+ messages in thread
From: Steven Sistare @ 2025-06-26 13:21 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel; +Cc: alex.williamson, clg, eric.auger, chao.p.peng
On 6/23/2025 6:22 AM, Zhenzhong Duan wrote:
> When there are multiple VFIO containers, vioc->dma_map is restored
> multiple times, this made only first container work and remaining
> containers using vioc->dma_map restored by first container.
>
> Fix it by save and restore vioc->dma_map locally. saved_dma_map in
> VFIOContainerCPR becomes useless and is removed.
>
> Fixes: 7e9f21411302 ("vfio/container: restore DMA vaddr")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/vfio/vfio-cpr.h | 1 -
> hw/vfio/cpr-legacy.c | 20 +++++++-------------
> 2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
> index aef542e93c..71a1c1a70c 100644
> --- a/include/hw/vfio/vfio-cpr.h
> +++ b/include/hw/vfio/vfio-cpr.h
> @@ -25,7 +25,6 @@ typedef struct VFIOContainerCPR {
> bool vaddr_unmapped;
> NotifierWithReturn transfer_notifier;
> MemoryListener remap_listener;
> - DMA_MAP_FUNC saved_dma_map;
> } VFIOContainerCPR;
>
> typedef struct VFIODeviceCPR {
> diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
> index 100a8db74d..ff45a5128f 100644
> --- a/hw/vfio/cpr-legacy.c
> +++ b/hw/vfio/cpr-legacy.c
> @@ -99,20 +99,21 @@ static int vfio_container_post_load(void *opaque, int version_id)
> {
> VFIOContainer *container = opaque;
> VFIOContainerBase *bcontainer = &container->bcontainer;
> - VFIOGroup *group;
> + VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> + DMA_MAP_FUNC saved_dma_map = vioc->dma_map;
> Error *local_err = NULL;
>
> + /* During incoming CPR, divert calls to dma_map. */
> + vioc->dma_map = vfio_legacy_cpr_dma_map;
> +
> if (!vfio_listener_register(bcontainer, &local_err)) {
> error_report_err(local_err);
> return -1;
> }
>
> - QLIST_FOREACH(group, &container->group_list, container_next) {
> - VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> + /* Restore original dma_map function */
> + vioc->dma_map = saved_dma_map;
>
> - /* Restore original dma_map function */
> - vioc->dma_map = container->cpr.saved_dma_map;
> - }
> return 0;
> }
>
> @@ -180,13 +181,6 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp)
>
> vmstate_register(NULL, -1, &vfio_container_vmstate, container);
>
> - /* During incoming CPR, divert calls to dma_map. */
> - if (cpr_is_incoming()) {
> - VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> - container->cpr.saved_dma_map = vioc->dma_map;
> - vioc->dma_map = vfio_legacy_cpr_dma_map;
> - }
> -
> migration_add_notifier_mode(&container->cpr.transfer_notifier,
> vfio_cpr_fail_notifier,
> MIG_MODE_CPR_TRANSFER);
Good fix, thanks. All looks good. I will wait to see the "local save" parts
of patch 3 squashed into patch 4 before sending RB.
One nit about DMA_MAP_FUNC -- the majority of typedefs for function signatures
use lower case. See
git grep typedef | fgrep '(*'
- Steve
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-06-26 13:23 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-23 10:22 [PATCH 0/4] Some trivial live update fixes Zhenzhong Duan
2025-06-23 10:22 ` [PATCH 1/4] vfio/container: Fix SIGSEGV when open container file fails Zhenzhong Duan
2025-06-24 8:53 ` Cédric Le Goater
2025-06-24 9:55 ` Cédric Le Goater
2025-06-23 10:22 ` [PATCH 2/4] vfio/container: fails mdev hotplug if add migration blocker failed Zhenzhong Duan
2025-06-24 9:21 ` Cédric Le Goater
2025-06-25 21:08 ` Cédric Le Goater
2025-06-26 1:43 ` Duan, Zhenzhong
2025-06-23 10:22 ` [PATCH 3/4] vfio/container: Fix potential SIGSEGV when recover from unmap-all-vaddr failure Zhenzhong Duan
2025-06-24 16:54 ` Cédric Le Goater
2025-06-26 12:53 ` Steven Sistare
2025-06-26 13:04 ` Steven Sistare
2025-06-23 10:22 ` [PATCH 4/4] vfio/container: Fix vfio_container_post_load() Zhenzhong Duan
2025-06-26 13:21 ` Steven Sistare
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).