* [PATCH 0/3] vfio/migration: Some bug fixes and cleanups
@ 2024-10-20 13:01 Avihai Horon
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Avihai Horon @ 2024-10-20 13:01 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
Avihai Horon
This small series contains some bug fixes and cleanups.
Reviews are welcome.
Thanks.
Avihai Horon (3):
vfio/migration: Report only stop-copy size in
vfio_state_pending_exact()
vfio/migration: Refactor vfio_vmstate_change/_prepare() error
reporting
vfio/migration: Change trace formats from hex to decimal
hw/vfio/migration.c | 34 +++++++++++++++++++++-------------
hw/vfio/trace-events | 10 +++++-----
2 files changed, 26 insertions(+), 18 deletions(-)
--
2.40.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact()
2024-10-20 13:01 [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Avihai Horon
@ 2024-10-20 13:01 ` Avihai Horon
2024-10-21 6:48 ` Cédric Le Goater
` (2 more replies)
2024-10-20 13:01 ` [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting Avihai Horon
` (2 subsequent siblings)
3 siblings, 3 replies; 20+ messages in thread
From: Avihai Horon @ 2024-10-20 13:01 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
Avihai Horon
vfio_state_pending_exact() is used to update migration core how much
device data is left for the device migration. Currently, the sum of
pre-copy and stop-copy sizes of the VFIO device are reported.
The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl,
which returns the amount of device data available to be transferred
while the device is in the PRE_COPY states.
The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE
ioctl, which returns the total amount of device data left to be
transferred in order to complete the device migration.
According to the above, current implementation is wrong -- it reports
extra overlapping data because pre-copy size is already contained in
stop-copy size. Fix it by reporting only stop-copy size.
Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/migration.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 17199b73ae..992dc3b102 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -576,9 +576,6 @@ static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
if (vfio_device_state_is_precopy(vbasedev)) {
vfio_query_precopy_size(migration);
-
- *must_precopy +=
- migration->precopy_init_size + migration->precopy_dirty_size;
}
trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
--
2.40.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-20 13:01 [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Avihai Horon
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
@ 2024-10-20 13:01 ` Avihai Horon
2024-10-21 15:14 ` Cédric Le Goater
2024-10-20 13:01 ` [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal Avihai Horon
2024-10-23 15:08 ` [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Cédric Le Goater
3 siblings, 1 reply; 20+ messages in thread
From: Avihai Horon @ 2024-10-20 13:01 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
Avihai Horon
When the VM is shut down vfio_vmstate_change/_prepare() are called to
transition the VFIO device state to STOP. They are called after
migration_shutdown() and thus, by this time, the migration object is
already freed (more specifically, MigrationState->qemu_file_lock is
already destroyed).
In this case, if there is an error in vfio_vmstate_change/_prepare(), it
calls migration_file_set_error() which tries to lock the already
destroyed MigrationState->qemu_file_lock, leading to the following
assert:
qemu-system-x86_64: ../util/qemu-thread-posix.c:92: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
Fix this by not setting migration file error in the shut down flow.
Fixes: 20c64c8a51a4 ("migration: migration_file_set_error")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/migration.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 992dc3b102..1c44b036ea 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -783,6 +783,25 @@ static const SaveVMHandlers savevm_vfio_handlers = {
/* ---------------------------------------------------------------------- */
+static void vfio_vmstate_change_error_report(int ret, Error *err,
+ RunState state)
+{
+ if (state == RUN_STATE_SHUTDOWN) {
+ /*
+ * If VM is being shut down, migration object might have already been
+ * freed, so just report the error.
+ */
+ error_report_err(err);
+ return;
+ }
+
+ /*
+ * Migration should be aborted in this case, but vm_state_notify()
+ * currently does not support reporting failures.
+ */
+ migration_file_set_error(ret, err);
+}
+
static void vfio_vmstate_change_prepare(void *opaque, bool running,
RunState state)
{
@@ -798,11 +817,7 @@ static void vfio_vmstate_change_prepare(void *opaque, bool running,
ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
- /*
- * Migration should be aborted in this case, but vm_state_notify()
- * currently does not support reporting failures.
- */
- migration_file_set_error(ret, local_err);
+ vfio_vmstate_change_error_report(ret, local_err, state);
}
trace_vfio_vmstate_change_prepare(vbasedev->name, running,
@@ -829,11 +844,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
- /*
- * Migration should be aborted in this case, but vm_state_notify()
- * currently does not support reporting failures.
- */
- migration_file_set_error(ret, local_err);
+ vfio_vmstate_change_error_report(ret, local_err, state);
}
trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
--
2.40.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal
2024-10-20 13:01 [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Avihai Horon
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
2024-10-20 13:01 ` [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting Avihai Horon
@ 2024-10-20 13:01 ` Avihai Horon
2024-10-21 15:17 ` Cédric Le Goater
2024-10-23 15:08 ` [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Cédric Le Goater
3 siblings, 1 reply; 20+ messages in thread
From: Avihai Horon @ 2024-10-20 13:01 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
Avihai Horon
Data sizes in VFIO migration trace events are printed in hex format
while in migration core trace events they are printed in decimal format.
This inconsistency makes it less readable when using both trace event
types. Hence, change the data sizes print format to decimal in VFIO
migration trace events.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/trace-events | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index c475c273fd..29789e8d27 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -151,7 +151,7 @@ vfio_display_edid_write_error(void) ""
vfio_load_cleanup(const char *name) " (%s)"
vfio_load_device_config_state(const char *name) " (%s)"
vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
-vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size 0x%"PRIx64" ret %d"
+vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d"
vfio_migration_realize(const char *name) " (%s)"
vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s"
vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s"
@@ -160,10 +160,10 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
vfio_save_cleanup(const char *name) " (%s)"
vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
vfio_save_device_config_state(const char *name) " (%s)"
-vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
-vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
-vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
-vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
+vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64
+vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size %"PRIu64
+vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
+vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
--
2.40.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact()
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
@ 2024-10-21 6:48 ` Cédric Le Goater
2024-10-23 10:06 ` Cédric Le Goater
2024-10-25 15:18 ` Michael Tokarev
2 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-21 6:48 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Peter Xu, Fabiano Rosas, Zhiyi Guo
+Zhiyi
On 10/20/24 15:01, Avihai Horon wrote:
> vfio_state_pending_exact() is used to update migration core how much
> device data is left for the device migration. Currently, the sum of
> pre-copy and stop-copy sizes of the VFIO device are reported.
>
> The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl,
> which returns the amount of device data available to be transferred
> while the device is in the PRE_COPY states.
>
> The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE
> ioctl, which returns the total amount of device data left to be
> transferred in order to complete the device migration.
>
> According to the above, current implementation is wrong -- it reports
> extra overlapping data because pre-copy size is already contained in
> stop-copy size. Fix it by reporting only stop-copy size.
>
> Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---> hw/vfio/migration.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 17199b73ae..992dc3b102 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -576,9 +576,6 @@ static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
>
> if (vfio_device_state_is_precopy(vbasedev)) {
> vfio_query_precopy_size(migration);
> -
> - *must_precopy +=
> - migration->precopy_init_size + migration->precopy_dirty_size;
> }
>
> trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-20 13:01 ` [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting Avihai Horon
@ 2024-10-21 15:14 ` Cédric Le Goater
2024-10-21 15:50 ` Peter Xu
0 siblings, 1 reply; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-21 15:14 UTC (permalink / raw)
To: Avihai Horon, qemu-devel; +Cc: Alex Williamson, Peter Xu, Fabiano Rosas
Hello Avihai,
On 10/20/24 15:01, Avihai Horon wrote:
> When the VM is shut down vfio_vmstate_change/_prepare() are called to
> transition the VFIO device state to STOP. They are called after
> migration_shutdown() and thus, by this time, the migration object is
> already freed (more specifically, MigrationState->qemu_file_lock is
> already destroyed).
>
> In this case, if there is an error in vfio_vmstate_change/_prepare(), it
> calls migration_file_set_error() which tries to lock the already
> destroyed MigrationState->qemu_file_lock, leading to the following
> assert:
>
> qemu-system-x86_64: ../util/qemu-thread-posix.c:92: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
>
> Fix this by not setting migration file error in the shut down flow.
>
> Fixes: 20c64c8a51a4 ("migration: migration_file_set_error")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> hw/vfio/migration.c | 31 +++++++++++++++++++++----------
> 1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 992dc3b102..1c44b036ea 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -783,6 +783,25 @@ static const SaveVMHandlers savevm_vfio_handlers = {
>
> /* ---------------------------------------------------------------------- */
>
> +static void vfio_vmstate_change_error_report(int ret, Error *err,
> + RunState state)
> +{
> + if (state == RUN_STATE_SHUTDOWN) {
> + /*
> + * If VM is being shut down, migration object might have already been
> + * freed, so just report the error.
> + */
> + error_report_err(err);
> + return;
> + }
> +
> + /*
> + * Migration should be aborted in this case, but vm_state_notify()
> + * currently does not support reporting failures.
> + */
> + migration_file_set_error(ret, err);
> +}
This seems correct, but testing the machine's runtime state to
work around the fact that 'current_migration' is not safe to
use reveals a flaw.
In vfio, migration_is_setup_or_active() is unsafe. So are the
calls to vfio_set_migration_error().
This comment seems in contradiction with the migration code :
/* When we add fault tolerance, we could have several
migrations at once. For now we don't need to add
dynamic creation of migration */
Why is 'current_migration' allocated and destroyed today ?
Could we replace it with a singleton and switch the state to
MIGRATION_STATUS_NONE instead of destroying it ?
If not, should 'current_migration' be set to NULL in
migration_shutdown() and its value tested before accessing
to any of its fields, in a thread safe manner if necessary.
Thanks,
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal
2024-10-20 13:01 ` [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal Avihai Horon
@ 2024-10-21 15:17 ` Cédric Le Goater
0 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-21 15:17 UTC (permalink / raw)
To: Avihai Horon, qemu-devel; +Cc: Alex Williamson, Peter Xu, Fabiano Rosas
On 10/20/24 15:01, Avihai Horon wrote:
> Data sizes in VFIO migration trace events are printed in hex format
> while in migration core trace events they are printed in decimal format.
>
> This inconsistency makes it less readable when using both trace event
> types. Hence, change the data sizes print format to decimal in VFIO
> migration trace events.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> hw/vfio/trace-events | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index c475c273fd..29789e8d27 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -151,7 +151,7 @@ vfio_display_edid_write_error(void) ""
> vfio_load_cleanup(const char *name) " (%s)"
> vfio_load_device_config_state(const char *name) " (%s)"
> vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
> -vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size 0x%"PRIx64" ret %d"
> +vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d"
> vfio_migration_realize(const char *name) " (%s)"
> vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s"
> vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s"
> @@ -160,10 +160,10 @@ vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
> vfio_save_cleanup(const char *name) " (%s)"
> vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
> vfio_save_device_config_state(const char *name) " (%s)"
> -vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
> -vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size 0x%"PRIx64
> -vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
> -vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64
> +vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> +vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size %"PRIu64
> +vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> +vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
> vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
> vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-21 15:14 ` Cédric Le Goater
@ 2024-10-21 15:50 ` Peter Xu
2024-10-21 16:43 ` Cédric Le Goater
0 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2024-10-21 15:50 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
On Mon, Oct 21, 2024 at 05:14:11PM +0200, Cédric Le Goater wrote:
> Hello Avihai,
>
> On 10/20/24 15:01, Avihai Horon wrote:
> > When the VM is shut down vfio_vmstate_change/_prepare() are called to
> > transition the VFIO device state to STOP. They are called after
> > migration_shutdown() and thus, by this time, the migration object is
> > already freed (more specifically, MigrationState->qemu_file_lock is
> > already destroyed).
> >
> > In this case, if there is an error in vfio_vmstate_change/_prepare(), it
> > calls migration_file_set_error() which tries to lock the already
> > destroyed MigrationState->qemu_file_lock, leading to the following
> > assert:
> >
> > qemu-system-x86_64: ../util/qemu-thread-posix.c:92: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
> >
> > Fix this by not setting migration file error in the shut down flow.
> >
> > Fixes: 20c64c8a51a4 ("migration: migration_file_set_error")
> > Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> > ---
> > hw/vfio/migration.c | 31 +++++++++++++++++++++----------
> > 1 file changed, 21 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> > index 992dc3b102..1c44b036ea 100644
> > --- a/hw/vfio/migration.c
> > +++ b/hw/vfio/migration.c
> > @@ -783,6 +783,25 @@ static const SaveVMHandlers savevm_vfio_handlers = {
> > /* ---------------------------------------------------------------------- */
> > +static void vfio_vmstate_change_error_report(int ret, Error *err,
> > + RunState state)
> > +{
> > + if (state == RUN_STATE_SHUTDOWN) {
> > + /*
> > + * If VM is being shut down, migration object might have already been
> > + * freed, so just report the error.
> > + */
> > + error_report_err(err);
> > + return;
> > + }
> > +
> > + /*
> > + * Migration should be aborted in this case, but vm_state_notify()
> > + * currently does not support reporting failures.
> > + */
> > + migration_file_set_error(ret, err);
> > +}
>
> This seems correct, but testing the machine's runtime state to
> work around the fact that 'current_migration' is not safe to
> use reveals a flaw.
>
> In vfio, migration_is_setup_or_active() is unsafe. So are the
> calls to vfio_set_migration_error().
>
>
> This comment seems in contradiction with the migration code :
>
> /* When we add fault tolerance, we could have several
> migrations at once. For now we don't need to add
> dynamic creation of migration */
>
> Why is 'current_migration' allocated and destroyed today ?
>
> Could we replace it with a singleton and switch the state to
> MIGRATION_STATUS_NONE instead of destroying it ?
>
> If not, should 'current_migration' be set to NULL in
> migration_shutdown() and its value tested before accessing
> to any of its fields, in a thread safe manner if necessary.
IIUC the migration thread should always see valid migration object, as it
takes one refcount at the entrance of migration_thread():
object_ref(OBJECT(s));
So it's not yet clear to me on why the mutex was destroyed, if the main
object should still be there; logically it was only destroyed in the
finalize phase (migration_instance_finalize), but that should be after
migration thread quits and releases the refcount..
--
Peter Xu
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-21 15:50 ` Peter Xu
@ 2024-10-21 16:43 ` Cédric Le Goater
2024-10-21 16:54 ` Peter Xu
0 siblings, 1 reply; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-21 16:43 UTC (permalink / raw)
To: Peter Xu; +Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
Hello,
> IIUC the migration thread should always see valid migration object, as it
> takes one refcount at the entrance of migration_thread():
>
> object_ref(OBJECT(s));
Could the migration have failed before ? in migrate_fd_connect()
> So it's not yet clear to me on why the mutex was destroyed, if the main
> object should still be there; logically it was only destroyed in the
> finalize phase (migration_instance_finalize), but that should be after
> migration thread quits and releases the refcount..
Avihai, would you have a reproducer ? or did you keep some logs ?
a backtrace maybe ?
Thanks,
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-21 16:43 ` Cédric Le Goater
@ 2024-10-21 16:54 ` Peter Xu
2024-10-22 9:38 ` Avihai Horon via
0 siblings, 1 reply; 20+ messages in thread
From: Peter Xu @ 2024-10-21 16:54 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
> Hello,
>
> > IIUC the migration thread should always see valid migration object, as it
> > takes one refcount at the entrance of migration_thread():
> >
> > object_ref(OBJECT(s));
>
> Could the migration have failed before ? in migrate_fd_connect()
I just noticed it's a vm state change notifier..
If so, maybe VFIO could refer to its internal states showing that it's
during a migration first (so as to guarantee the migration object is valid;
e.g., only after save_setup() but before save_cleanup() hooks are invoked).
Then migration_file_set_error() is only needed when it's during a migration
already. Otherwise it's only a vm state change event, so nothing relevant
to migration.
--
Peter Xu
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-21 16:54 ` Peter Xu
@ 2024-10-22 9:38 ` Avihai Horon via
2024-10-22 13:24 ` Cédric Le Goater
0 siblings, 1 reply; 20+ messages in thread
From: Avihai Horon via @ 2024-10-22 9:38 UTC (permalink / raw)
To: Peter Xu, Cédric Le Goater
Cc: qemu-devel, Alex Williamson, Fabiano Rosas
On 21/10/2024 19:54, Peter Xu wrote:
> External email: Use caution opening links or attachments
>
>
> On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
>> Hello,
>>
>>> IIUC the migration thread should always see valid migration object, as it
>>> takes one refcount at the entrance of migration_thread():
>>>
>>> object_ref(OBJECT(s));
>> Could the migration have failed before ? in migrate_fd_connect()
> I just noticed it's a vm state change notifier..
Yep.
I stumbled upon this bug by accident when running on a buggy machine.
Migration wasn't involved, I just started the VM, shut it down and got
the assert (as my VFIO device was faulty and errored on RUNNING->STOP
state change).
You can repro it by forcefully triggering an error on *->STOP transition:
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 17199b73ae..d41cb7c634 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -831,7 +831,9 @@ static void vfio_vmstate_change(void *opaque, bool
running, RunState state)
}
ret = vfio_migration_set_state_or_reset(vbasedev, new_state,
&local_err);
- if (ret) {
+ if (ret || new_state == VFIO_DEVICE_STATE_STOP) {
+ ret = -1;
+ error_setg(&local_err, "%s: forced error", vbasedev->name);
/*
* Migration should be aborted in this case, but vm_state_notify()
* currently does not support reporting failures.
>
> If so, maybe VFIO could refer to its internal states showing that it's
> during a migration first (so as to guarantee the migration object is valid;
> e.g., only after save_setup() but before save_cleanup() hooks are invoked).
It's an option, but I think it's a bit awkward as we'd need to check
that VFIOMigration->data_buffer is set or add a new flag.
Besides that, as Cedric pointed out, VFIO code calls
migration_is_setup_or_active() which can also be unsafe, as it might be
invoked after migration object has been freed.
Maybe a simpler solution would be to extend the the migration object
lifetime?
Looking at commit history, you can see that commit 1f8956041ad3
("migration: finalize current_migration object") added migration object
finalization at the very end of qemu cleanup.
Then came commit 892ae715b6bc ("migration: Cleanup during exit") and
moved the migration object finalization to the beginning of qemu cleanup
(before stopping the VM etc.).
If so, the fix could be something like the below?
-------------8<-------------
diff --git a/include/migration/misc.h b/include/migration/misc.h
index bfadc5613b..5eb099349a 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -52,6 +52,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
/* migration/migration.c */
void migration_object_init(void);
+void migration_object_finalize(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(void);
diff --git a/migration/migration.c b/migration/migration.c
index 021faee2f3..9eaa7947bc 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -265,6 +265,11 @@ void migration_object_init(void)
dirty_bitmap_mig_init();
}
+void migration_object_finalize(void)
+{
+ object_unref(OBJECT(current_migration));
+}
+
typedef struct {
QEMUBH *bh;
QEMUBHFunc *cb;
@@ -330,7 +335,6 @@ void migration_shutdown(void)
* stop the migration using this structure
*/
migration_cancel(NULL);
- object_unref(OBJECT(current_migration));
/*
* Cancel outgoing migration of dirty bitmaps. It should
diff --git a/system/runstate.c b/system/runstate.c
index c2c9afa905..fa823f5e72 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -930,5 +930,6 @@ void qemu_cleanup(int status)
monitor_cleanup();
qemu_chr_cleanup();
user_creatable_cleanup();
+ migration_object_finalize();
/* TODO: unref root container, check all devices are ok */
}
-------------8<-------------
Thanks.
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-22 9:38 ` Avihai Horon via
@ 2024-10-22 13:24 ` Cédric Le Goater
2024-10-22 14:21 ` Avihai Horon
2024-10-22 14:29 ` Peter Xu
0 siblings, 2 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-22 13:24 UTC (permalink / raw)
To: Avihai Horon, Peter Xu; +Cc: qemu-devel, Alex Williamson, Fabiano Rosas
On 10/22/24 11:38, Avihai Horon wrote:
>
> On 21/10/2024 19:54, Peter Xu wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
>>> Hello,
>>>
>>>> IIUC the migration thread should always see valid migration object, as it
>>>> takes one refcount at the entrance of migration_thread():
>>>>
>>>> object_ref(OBJECT(s));
>>> Could the migration have failed before ? in migrate_fd_connect()
>> I just noticed it's a vm state change notifier..
>
> Yep.
> I stumbled upon this bug by accident when running on a buggy machine.
> Migration wasn't involved, I just started the VM, shut it down and got the assert (as my VFIO device was faulty and errored on RUNNING->STOP state change).
>
> You can repro it by forcefully triggering an error on *->STOP transition:
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 17199b73ae..d41cb7c634 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -831,7 +831,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
> }
>
> ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
> - if (ret) {
> + if (ret || new_state == VFIO_DEVICE_STATE_STOP) {
> + ret = -1;
> + error_setg(&local_err, "%s: forced error", vbasedev->name);
> /*
> * Migration should be aborted in this case, but vm_state_notify()
> * currently does not support reporting failures.
>
>>
>> If so, maybe VFIO could refer to its internal states showing that it's
>> during a migration first (so as to guarantee the migration object is valid;
>> e.g., only after save_setup() but before save_cleanup() hooks are invoked).
>
> It's an option, but I think it's a bit awkward as we'd need to check that VFIOMigration->data_buffer is set
That's what I was looking at too. It works. It feels a bit awkward
indeed. We could hide the details in an helper routine though.
> or add a new flag.
yes that's another solution.
Peter,
I wonder if we could grab a ref on current_migration in save_setup(),
store it under VFIOMigration and drop it save_cleanup() ?
> Besides that, as Cedric pointed out, VFIO code calls migration_is_setup_or_active() which can also be unsafe, as it might be invoked after migration object has been freed.
>
> Maybe a simpler solution would be to extend the the migration object lifetime?
> Looking at commit history, you can see that commit 1f8956041ad3 ("migration: finalize current_migration object") added migration object finalization at the very end of qemu cleanup.
> Then came commit 892ae715b6bc ("migration: Cleanup during exit") and moved the migration object finalization to the beginning of qemu cleanup (before stopping the VM etc.).
>
> If so, the fix could be something like the below?
>
> -------------8<-------------
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index bfadc5613b..5eb099349a 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -52,6 +52,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>
> /* migration/migration.c */
> void migration_object_init(void);
> +void migration_object_finalize(void);
> void migration_shutdown(void);
> bool migration_is_idle(void);
> bool migration_is_active(void);
> diff --git a/migration/migration.c b/migration/migration.c
> index 021faee2f3..9eaa7947bc 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -265,6 +265,11 @@ void migration_object_init(void)
> dirty_bitmap_mig_init();
> }
>
> +void migration_object_finalize(void)
> +{
> + object_unref(OBJECT(current_migration));
> +}
> +
> typedef struct {
> QEMUBH *bh;
> QEMUBHFunc *cb;
> @@ -330,7 +335,6 @@ void migration_shutdown(void)
> * stop the migration using this structure
> */
> migration_cancel(NULL);
> - object_unref(OBJECT(current_migration));
>
> /*
> * Cancel outgoing migration of dirty bitmaps. It should
> diff --git a/system/runstate.c b/system/runstate.c
> index c2c9afa905..fa823f5e72 100644
> --- a/system/runstate.c
> +++ b/system/runstate.c
> @@ -930,5 +930,6 @@ void qemu_cleanup(int status)
> monitor_cleanup();
> qemu_chr_cleanup();
> user_creatable_cleanup();
> + migration_object_finalize();
> /* TODO: unref root container, check all devices are ok */
> }
> -------------8<-------------
892ae715b6bc was trying to fix potential use-after-free issues.
I think it is safer to introduce an helper routine checking
(in some ways) if a migration is in progress than partially
revert 892ae715b6bc.
Thanks,
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-22 13:24 ` Cédric Le Goater
@ 2024-10-22 14:21 ` Avihai Horon
2024-10-22 14:29 ` Peter Xu
1 sibling, 0 replies; 20+ messages in thread
From: Avihai Horon @ 2024-10-22 14:21 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Fabiano Rosas
On 22/10/2024 16:24, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> On 10/22/24 11:38, Avihai Horon wrote:
>>
>> On 21/10/2024 19:54, Peter Xu wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
>>>> Hello,
>>>>
>>>>> IIUC the migration thread should always see valid migration
>>>>> object, as it
>>>>> takes one refcount at the entrance of migration_thread():
>>>>>
>>>>> object_ref(OBJECT(s));
>>>> Could the migration have failed before ? in migrate_fd_connect()
>>> I just noticed it's a vm state change notifier..
>>
>> Yep.
>> I stumbled upon this bug by accident when running on a buggy machine.
>> Migration wasn't involved, I just started the VM, shut it down and
>> got the assert (as my VFIO device was faulty and errored on
>> RUNNING->STOP state change).
>>
>> You can repro it by forcefully triggering an error on *->STOP
>> transition:
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 17199b73ae..d41cb7c634 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -831,7 +831,9 @@ static void vfio_vmstate_change(void *opaque,
>> bool running, RunState state)
>> }
>>
>> ret = vfio_migration_set_state_or_reset(vbasedev, new_state,
>> &local_err);
>> - if (ret) {
>> + if (ret || new_state == VFIO_DEVICE_STATE_STOP) {
>> + ret = -1;
>> + error_setg(&local_err, "%s: forced error", vbasedev->name);
>> /*
>> * Migration should be aborted in this case, but
>> vm_state_notify()
>> * currently does not support reporting failures.
>>
>>>
>>> If so, maybe VFIO could refer to its internal states showing that it's
>>> during a migration first (so as to guarantee the migration object is
>>> valid;
>>> e.g., only after save_setup() but before save_cleanup() hooks are
>>> invoked).
>>
>> It's an option, but I think it's a bit awkward as we'd need to check
>> that VFIOMigration->data_buffer is set
>
> That's what I was looking at too. It works. It feels a bit awkward
> indeed. We could hide the details in an helper routine though.
>
>> or add a new flag.
>
> yes that's another solution.
>
> Peter,
>
> I wonder if we could grab a ref on current_migration in save_setup(),
> store it under VFIOMigration and drop it save_cleanup() ?
>
>
>> Besides that, as Cedric pointed out, VFIO code calls
>> migration_is_setup_or_active() which can also be unsafe, as it might
>> be invoked after migration object has been freed.
>>
>> Maybe a simpler solution would be to extend the the migration object
>> lifetime?
>> Looking at commit history, you can see that commit 1f8956041ad3
>> ("migration: finalize current_migration object") added migration
>> object finalization at the very end of qemu cleanup.
>> Then came commit 892ae715b6bc ("migration: Cleanup during exit") and
>> moved the migration object finalization to the beginning of qemu
>> cleanup (before stopping the VM etc.).
>>
>> If so, the fix could be something like the below?
>>
>> -------------8<-------------
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index bfadc5613b..5eb099349a 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -52,6 +52,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>>
>> /* migration/migration.c */
>> void migration_object_init(void);
>> +void migration_object_finalize(void);
>> void migration_shutdown(void);
>> bool migration_is_idle(void);
>> bool migration_is_active(void);
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 021faee2f3..9eaa7947bc 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -265,6 +265,11 @@ void migration_object_init(void)
>> dirty_bitmap_mig_init();
>> }
>>
>> +void migration_object_finalize(void)
>> +{
>> + object_unref(OBJECT(current_migration));
>> +}
>> +
>> typedef struct {
>> QEMUBH *bh;
>> QEMUBHFunc *cb;
>> @@ -330,7 +335,6 @@ void migration_shutdown(void)
>> * stop the migration using this structure
>> */
>> migration_cancel(NULL);
>> - object_unref(OBJECT(current_migration));
>>
>> /*
>> * Cancel outgoing migration of dirty bitmaps. It should
>> diff --git a/system/runstate.c b/system/runstate.c
>> index c2c9afa905..fa823f5e72 100644
>> --- a/system/runstate.c
>> +++ b/system/runstate.c
>> @@ -930,5 +930,6 @@ void qemu_cleanup(int status)
>> monitor_cleanup();
>> qemu_chr_cleanup();
>> user_creatable_cleanup();
>> + migration_object_finalize();
>> /* TODO: unref root container, check all devices are ok */
>> }
>> -------------8<-------------
>
> 892ae715b6bc was trying to fix potential use-after-free issues.
Yes, and it did, by ref/unref current_migration in migration_thread().
In addition, it added a migrate_fd_cancel() in the beginning of qemu
cleanup to encourage migration_thread to quit.
IIUC, these are the core changes of the commit, and the only reason to
move "object_unref(OBJECT(current_migration))" from the end of qemu
cleanup to the beginning of it was to put migration cleanup code in one
place.
That's why I think moving "object_unref(OBJECT(current_migration))" back
to the end of qemu cleanup is safe and shouldn't make much difference.
It will delay freeing of MigrationState right to the end and will allow
us to use the already existing helper (migration_is_setup_or_active())
to check migration status.
IMHO this seems like the cleanest solution, unless I am missing something.
Thanks.
>
> I think it is safer to introduce an helper routine checking
> (in some ways) if a migration is in progress than partially
> revert 892ae715b6bc.
>
> Thanks,
>
> C.
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-22 13:24 ` Cédric Le Goater
2024-10-22 14:21 ` Avihai Horon
@ 2024-10-22 14:29 ` Peter Xu
2024-10-22 14:42 ` Cédric Le Goater
1 sibling, 1 reply; 20+ messages in thread
From: Peter Xu @ 2024-10-22 14:29 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
On Tue, Oct 22, 2024 at 03:24:56PM +0200, Cédric Le Goater wrote:
> On 10/22/24 11:38, Avihai Horon wrote:
> >
> > On 21/10/2024 19:54, Peter Xu wrote:
> > > External email: Use caution opening links or attachments
> > >
> > >
> > > On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
> > > > Hello,
> > > >
> > > > > IIUC the migration thread should always see valid migration object, as it
> > > > > takes one refcount at the entrance of migration_thread():
> > > > >
> > > > > object_ref(OBJECT(s));
> > > > Could the migration have failed before ? in migrate_fd_connect()
> > > I just noticed it's a vm state change notifier..
> >
> > Yep.
> > I stumbled upon this bug by accident when running on a buggy machine.
> > Migration wasn't involved, I just started the VM, shut it down and got the assert (as my VFIO device was faulty and errored on RUNNING->STOP state change).
> >
> > You can repro it by forcefully triggering an error on *->STOP transition:
> >
> > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> > index 17199b73ae..d41cb7c634 100644
> > --- a/hw/vfio/migration.c
> > +++ b/hw/vfio/migration.c
> > @@ -831,7 +831,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
> > }
> >
> > ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
> > - if (ret) {
> > + if (ret || new_state == VFIO_DEVICE_STATE_STOP) {
> > + ret = -1;
> > + error_setg(&local_err, "%s: forced error", vbasedev->name);
> > /*
> > * Migration should be aborted in this case, but vm_state_notify()
> > * currently does not support reporting failures.
> >
> > >
> > > If so, maybe VFIO could refer to its internal states showing that it's
> > > during a migration first (so as to guarantee the migration object is valid;
> > > e.g., only after save_setup() but before save_cleanup() hooks are invoked).
> >
> > It's an option, but I think it's a bit awkward as we'd need to check
> > that VFIOMigration->data_buffer is set
>
> That's what I was looking at too. It works. It feels a bit awkward
> indeed. We could hide the details in an helper routine though.
>
> > or add a new flag.
>
> yes that's another solution.
>
> Peter,
>
> I wonder if we could grab a ref on current_migration in save_setup(),
> store it under VFIOMigration and drop it save_cleanup() ?
VFIO can definitely do that, but I am not sure how that would help.. as I
think the migration object can never go away anyway during setup->cleanup,
so taking that extra refcount shouldn't change anything.
IOW, AFAICT this bug is triggered only when without migration at all.
>
>
> > Besides that, as Cedric pointed out, VFIO code calls migration_is_setup_or_active() which can also be unsafe, as it might be invoked after migration object has been freed.
> >
> > Maybe a simpler solution would be to extend the the migration object lifetime?
> > Looking at commit history, you can see that commit 1f8956041ad3 ("migration: finalize current_migration object") added migration object finalization at the very end of qemu cleanup.
> > Then came commit 892ae715b6bc ("migration: Cleanup during exit") and moved the migration object finalization to the beginning of qemu cleanup (before stopping the VM etc.).
> >
> > If so, the fix could be something like the below?
> >
> > -------------8<-------------
> > diff --git a/include/migration/misc.h b/include/migration/misc.h
> > index bfadc5613b..5eb099349a 100644
> > --- a/include/migration/misc.h
> > +++ b/include/migration/misc.h
> > @@ -52,6 +52,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
> >
> > /* migration/migration.c */
> > void migration_object_init(void);
> > +void migration_object_finalize(void);
> > void migration_shutdown(void);
> > bool migration_is_idle(void);
> > bool migration_is_active(void);
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 021faee2f3..9eaa7947bc 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -265,6 +265,11 @@ void migration_object_init(void)
> > dirty_bitmap_mig_init();
> > }
> >
> > +void migration_object_finalize(void)
> > +{
> > + object_unref(OBJECT(current_migration));
> > +}
> > +
> > typedef struct {
> > QEMUBH *bh;
> > QEMUBHFunc *cb;
> > @@ -330,7 +335,6 @@ void migration_shutdown(void)
> > * stop the migration using this structure
> > */
> > migration_cancel(NULL);
> > - object_unref(OBJECT(current_migration));
> >
> > /*
> > * Cancel outgoing migration of dirty bitmaps. It should
> > diff --git a/system/runstate.c b/system/runstate.c
> > index c2c9afa905..fa823f5e72 100644
> > --- a/system/runstate.c
> > +++ b/system/runstate.c
> > @@ -930,5 +930,6 @@ void qemu_cleanup(int status)
> > monitor_cleanup();
> > qemu_chr_cleanup();
> > user_creatable_cleanup();
> > + migration_object_finalize();
> > /* TODO: unref root container, check all devices are ok */
> > }
> > -------------8<-------------
>
> 892ae715b6bc was trying to fix potential use-after-free issues.
>
> I think it is safer to introduce an helper routine checking
> (in some ways) if a migration is in progress than partially
> revert 892ae715b6bc.
Right, Dave also mentioned something in 892ae715b6bc about moving it
earlier:
We do this a bit earlier; so hopefully migration gets out of the way
before all the devices etc are freed.
But I don't know the relationship on device free() v.s. the migration
object. We might at least want to figure that out if we want to move it
again.
I notice that vdpa also started using migration_is_setup_or_active(), so if
it's used in more places, maybe indeed we can consider making them safe to
be called at any phase of QEMU.
Logically it is safe in vm state change hook always because it has the BQL
and the object can only be freed when with BQL.
So let me send a small patch later to hopefully make all these exported
functions (including migration_file_set_error() in this case, logically
anything in migration/misc.h) safe to be called without migration.
--
Peter Xu
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-22 14:29 ` Peter Xu
@ 2024-10-22 14:42 ` Cédric Le Goater
2024-10-22 16:10 ` Peter Xu
0 siblings, 1 reply; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-22 14:42 UTC (permalink / raw)
To: Peter Xu; +Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
On 10/22/24 16:29, Peter Xu wrote:
> On Tue, Oct 22, 2024 at 03:24:56PM +0200, Cédric Le Goater wrote:
>> On 10/22/24 11:38, Avihai Horon wrote:
>>>
>>> On 21/10/2024 19:54, Peter Xu wrote:
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On Mon, Oct 21, 2024 at 06:43:13PM +0200, Cédric Le Goater wrote:
>>>>> Hello,
>>>>>
>>>>>> IIUC the migration thread should always see valid migration object, as it
>>>>>> takes one refcount at the entrance of migration_thread():
>>>>>>
>>>>>> object_ref(OBJECT(s));
>>>>> Could the migration have failed before ? in migrate_fd_connect()
>>>> I just noticed it's a vm state change notifier..
>>>
>>> Yep.
>>> I stumbled upon this bug by accident when running on a buggy machine.
>>> Migration wasn't involved, I just started the VM, shut it down and got the assert (as my VFIO device was faulty and errored on RUNNING->STOP state change).
>>>
>>> You can repro it by forcefully triggering an error on *->STOP transition:
>>>
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index 17199b73ae..d41cb7c634 100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -831,7 +831,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
>>> }
>>>
>>> ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
>>> - if (ret) {
>>> + if (ret || new_state == VFIO_DEVICE_STATE_STOP) {
>>> + ret = -1;
>>> + error_setg(&local_err, "%s: forced error", vbasedev->name);
>>> /*
>>> * Migration should be aborted in this case, but vm_state_notify()
>>> * currently does not support reporting failures.
>>>
>>>>
>>>> If so, maybe VFIO could refer to its internal states showing that it's
>>>> during a migration first (so as to guarantee the migration object is valid;
>>>> e.g., only after save_setup() but before save_cleanup() hooks are invoked).
>>>
>>> It's an option, but I think it's a bit awkward as we'd need to check
>>> that VFIOMigration->data_buffer is set
>>
>> That's what I was looking at too. It works. It feels a bit awkward
>> indeed. We could hide the details in an helper routine though.
>>
>>> or add a new flag.
>>
>> yes that's another solution.
>>
>> Peter,
>>
>> I wonder if we could grab a ref on current_migration in save_setup(),
>> store it under VFIOMigration and drop it save_cleanup() ?
>
> VFIO can definitely do that, but I am not sure how that would help.. as I
> think the migration object can never go away anyway during setup->cleanup,
> so taking that extra refcount shouldn't change anything.
It won't be but we would have a VFIOMigration::MigrationState pointer to
test in the vmstate change handler, which is a bit cleaner than testing
migration->data_buffer. Just an idea. I am not convinced myself either.
> IOW, AFAICT this bug is triggered only when without migration at all.
>
>>
>>
>>> Besides that, as Cedric pointed out, VFIO code calls migration_is_setup_or_active() which can also be unsafe, as it might be invoked after migration object has been freed.
>>>
>>> Maybe a simpler solution would be to extend the the migration object lifetime?
>>> Looking at commit history, you can see that commit 1f8956041ad3 ("migration: finalize current_migration object") added migration object finalization at the very end of qemu cleanup.
>>> Then came commit 892ae715b6bc ("migration: Cleanup during exit") and moved the migration object finalization to the beginning of qemu cleanup (before stopping the VM etc.).
>>>
>>> If so, the fix could be something like the below?
>>>
>>> -------------8<-------------
>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>> index bfadc5613b..5eb099349a 100644
>>> --- a/include/migration/misc.h
>>> +++ b/include/migration/misc.h
>>> @@ -52,6 +52,7 @@ void dump_vmstate_json_to_file(FILE *out_fp);
>>>
>>> /* migration/migration.c */
>>> void migration_object_init(void);
>>> +void migration_object_finalize(void);
>>> void migration_shutdown(void);
>>> bool migration_is_idle(void);
>>> bool migration_is_active(void);
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 021faee2f3..9eaa7947bc 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -265,6 +265,11 @@ void migration_object_init(void)
>>> dirty_bitmap_mig_init();
>>> }
>>>
>>> +void migration_object_finalize(void)
>>> +{
>>> + object_unref(OBJECT(current_migration));
>>> +}
>>> +
>>> typedef struct {
>>> QEMUBH *bh;
>>> QEMUBHFunc *cb;
>>> @@ -330,7 +335,6 @@ void migration_shutdown(void)
>>> * stop the migration using this structure
>>> */
>>> migration_cancel(NULL);
>>> - object_unref(OBJECT(current_migration));
>>>
>>> /*
>>> * Cancel outgoing migration of dirty bitmaps. It should
>>> diff --git a/system/runstate.c b/system/runstate.c
>>> index c2c9afa905..fa823f5e72 100644
>>> --- a/system/runstate.c
>>> +++ b/system/runstate.c
>>> @@ -930,5 +930,6 @@ void qemu_cleanup(int status)
>>> monitor_cleanup();
>>> qemu_chr_cleanup();
>>> user_creatable_cleanup();
>>> + migration_object_finalize();
>>> /* TODO: unref root container, check all devices are ok */
>>> }
>>> -------------8<-------------
>>
>> 892ae715b6bc was trying to fix potential use-after-free issues.
>>
>> I think it is safer to introduce an helper routine checking
>> (in some ways) if a migration is in progress than partially
>> revert 892ae715b6bc.
>
> Right, Dave also mentioned something in 892ae715b6bc about moving it
> earlier:
>
> We do this a bit earlier; so hopefully migration gets out of the way
> before all the devices etc are freed.
>
> But I don't know the relationship on device free() v.s. the migration
> object. We might at least want to figure that out if we want to move it
> again.
>
> I notice that vdpa also started using migration_is_setup_or_active(), so if
> it's used in more places, maybe indeed we can consider making them safe to
> be called at any phase of QEMU.
>
> Logically it is safe in vm state change hook always because it has the BQL
> and the object can only be freed when with BQL.
>
> So let me send a small patch later to hopefully make all these exported
> functions (including migration_file_set_error() in this case, logically
> anything in migration/misc.h) safe to be called without migration.
>
OK.
Thanks
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting
2024-10-22 14:42 ` Cédric Le Goater
@ 2024-10-22 16:10 ` Peter Xu
0 siblings, 0 replies; 20+ messages in thread
From: Peter Xu @ 2024-10-22 16:10 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Fabiano Rosas
On Tue, Oct 22, 2024 at 04:42:14PM +0200, Cédric Le Goater wrote:
> > So let me send a small patch later to hopefully make all these exported
> > functions (including migration_file_set_error() in this case, logically
> > anything in migration/misc.h) safe to be called without migration.
>
> OK.
Posted here:
https://lore.kernel.org/qemu-devel/20241022160720.1013543-1-peterx@redhat.com/
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact()
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
2024-10-21 6:48 ` Cédric Le Goater
@ 2024-10-23 10:06 ` Cédric Le Goater
2024-10-25 15:18 ` Michael Tokarev
2 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-23 10:06 UTC (permalink / raw)
To: Avihai Horon, qemu-devel; +Cc: Alex Williamson, Peter Xu, Fabiano Rosas
On 10/20/24 15:01, Avihai Horon wrote:
> vfio_state_pending_exact() is used to update migration core how much
> device data is left for the device migration. Currently, the sum of
> pre-copy and stop-copy sizes of the VFIO device are reported.
>
> The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl,
> which returns the amount of device data available to be transferred
> while the device is in the PRE_COPY states.
>
> The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE
> ioctl, which returns the total amount of device data left to be
> transferred in order to complete the device migration.
>
> According to the above, current implementation is wrong -- it reports
> extra overlapping data because pre-copy size is already contained in
> stop-copy size. Fix it by reporting only stop-copy size.
>
> Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> hw/vfio/migration.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 17199b73ae..992dc3b102 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -576,9 +576,6 @@ static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
>
> if (vfio_device_state_is_precopy(vbasedev)) {
> vfio_query_precopy_size(migration);
> -
> - *must_precopy +=
> - migration->precopy_init_size + migration->precopy_dirty_size;
> }
>
> trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] vfio/migration: Some bug fixes and cleanups
2024-10-20 13:01 [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Avihai Horon
` (2 preceding siblings ...)
2024-10-20 13:01 ` [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal Avihai Horon
@ 2024-10-23 15:08 ` Cédric Le Goater
3 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-23 15:08 UTC (permalink / raw)
To: Avihai Horon, qemu-devel; +Cc: Alex Williamson, Peter Xu, Fabiano Rosas
On 10/20/24 15:01, Avihai Horon wrote:
> This small series contains some bug fixes and cleanups.
> Reviews are welcome.
>
> Thanks.
>
> Avihai Horon (3):
> vfio/migration: Report only stop-copy size in
> vfio_state_pending_exact()
> vfio/migration: Refactor vfio_vmstate_change/_prepare() error
> reporting
> vfio/migration: Change trace formats from hex to decimal
>
> hw/vfio/migration.c | 34 +++++++++++++++++++++-------------
> hw/vfio/trace-events | 10 +++++-----
> 2 files changed, 26 insertions(+), 18 deletions(-)
>
Applied 1 and 3 to vfio-next. Patch 2 is being addressed in another series.
Thanks,
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact()
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
2024-10-21 6:48 ` Cédric Le Goater
2024-10-23 10:06 ` Cédric Le Goater
@ 2024-10-25 15:18 ` Michael Tokarev
2024-10-25 16:09 ` Cédric Le Goater
2 siblings, 1 reply; 20+ messages in thread
From: Michael Tokarev @ 2024-10-25 15:18 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
qemu-stable
20.10.2024 16:01, Avihai Horon:
> vfio_state_pending_exact() is used to update migration core how much
> device data is left for the device migration. Currently, the sum of
> pre-copy and stop-copy sizes of the VFIO device are reported.
>
> The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl,
> which returns the amount of device data available to be transferred
> while the device is in the PRE_COPY states.
>
> The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE
> ioctl, which returns the total amount of device data left to be
> transferred in order to complete the device migration.
>
> According to the above, current implementation is wrong -- it reports
> extra overlapping data because pre-copy size is already contained in
> stop-copy size. Fix it by reporting only stop-copy size.
>
> Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
This smells like a qemu-stable material (for 8.1+).
Please let me know if it is not.
Thanks,
/mjt
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact()
2024-10-25 15:18 ` Michael Tokarev
@ 2024-10-25 16:09 ` Cédric Le Goater
0 siblings, 0 replies; 20+ messages in thread
From: Cédric Le Goater @ 2024-10-25 16:09 UTC (permalink / raw)
To: Michael Tokarev, Avihai Horon, qemu-devel
Cc: Alex Williamson, Peter Xu, Fabiano Rosas, qemu-stable
On 10/25/24 17:18, Michael Tokarev wrote:
> 20.10.2024 16:01, Avihai Horon:
>> vfio_state_pending_exact() is used to update migration core how much
>> device data is left for the device migration. Currently, the sum of
>> pre-copy and stop-copy sizes of the VFIO device are reported.
>>
>> The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl,
>> which returns the amount of device data available to be transferred
>> while the device is in the PRE_COPY states.
>>
>> The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE
>> ioctl, which returns the total amount of device data left to be
>> transferred in order to complete the device migration.
>>
>> According to the above, current implementation is wrong -- it reports
>> extra overlapping data because pre-copy size is already contained in
>> stop-copy size. Fix it by reporting only stop-copy size.
>>
>> Fixes: eda7362af959 ("vfio/migration: Add VFIO migration pre-copy support")
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>
> This smells like a qemu-stable material (for 8.1+).
I am in favor.
Thanks,
C.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-10-25 16:10 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-20 13:01 [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Avihai Horon
2024-10-20 13:01 ` [PATCH 1/3] vfio/migration: Report only stop-copy size in vfio_state_pending_exact() Avihai Horon
2024-10-21 6:48 ` Cédric Le Goater
2024-10-23 10:06 ` Cédric Le Goater
2024-10-25 15:18 ` Michael Tokarev
2024-10-25 16:09 ` Cédric Le Goater
2024-10-20 13:01 ` [PATCH 2/3] vfio/migration: Refactor vfio_vmstate_change/_prepare() error reporting Avihai Horon
2024-10-21 15:14 ` Cédric Le Goater
2024-10-21 15:50 ` Peter Xu
2024-10-21 16:43 ` Cédric Le Goater
2024-10-21 16:54 ` Peter Xu
2024-10-22 9:38 ` Avihai Horon via
2024-10-22 13:24 ` Cédric Le Goater
2024-10-22 14:21 ` Avihai Horon
2024-10-22 14:29 ` Peter Xu
2024-10-22 14:42 ` Cédric Le Goater
2024-10-22 16:10 ` Peter Xu
2024-10-20 13:01 ` [PATCH 3/3] vfio/migration: Change trace formats from hex to decimal Avihai Horon
2024-10-21 15:17 ` Cédric Le Goater
2024-10-23 15:08 ` [PATCH 0/3] vfio/migration: Some bug fixes and cleanups Cédric Le Goater
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).