qemu-devel.nongnu.org archive mirror
* [PULL 0/2] Migration 20240331 patches
@ 2024-03-31 18:32 peterx
  2024-03-31 18:32 ` [PULL 1/2] migration: Set migration error in migration_completion() peterx
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: peterx @ 2024-03-31 18:32 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Cédric Le Goater, Prasad Pandit, peterx, Fabiano Rosas

From: Peter Xu <peterx@redhat.com>

The following changes since commit b9dbf6f9bf533564f6a4277d03906fcd32bb0245:

  Merge tag 'pull-tcg-20240329' of https://gitlab.com/rth7680/qemu into staging (2024-03-30 14:54:57 +0000)

are available in the Git repository at:

  https://gitlab.com/peterx/qemu.git tags/migration-20240331-pull-request

for you to fetch changes up to d0ad271a7613459bd0a3397c8071a4ad06f3f7eb:

  migration/postcopy: Ensure postcopy_start() sets errp if it fails (2024-03-31 14:30:03 -0400)

----------------------------------------------------------------
Migration pull for 9.0-rc2

- Avihai's two fixes on error paths

----------------------------------------------------------------

Avihai Horon (2):
  migration: Set migration error in migration_completion()
  migration/postcopy: Ensure postcopy_start() sets errp if it fails

 migration/migration.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

-- 
2.44.0




* [PULL 1/2] migration: Set migration error in migration_completion()
  2024-03-31 18:32 [PULL 0/2] Migration 20240331 patches peterx
@ 2024-03-31 18:32 ` peterx
  2024-03-31 18:32 ` [PULL 2/2] migration/postcopy: Ensure postcopy_start() sets errp if it fails peterx
  2024-04-01 14:30 ` [PULL 0/2] Migration 20240331 patches Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: peterx @ 2024-03-31 18:32 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Cédric Le Goater, Prasad Pandit, peterx, Fabiano Rosas,
	Avihai Horon

From: Avihai Horon <avihaih@nvidia.com>

After commit 9425ef3f990a ("migration: Use migrate_has_error() in
close_return_path_on_source()"), close_return_path_on_source() assumes
that migration error is set if an error occurs during migration.

This may not be true if migration errors in migration_completion(). For
example, if qemu_savevm_state_complete_precopy() errors, migration error
will not be set.

This, in turn, causes a migration hang similar to the one fixed by
commit 22b04245f0d5 ("migration: Join the return path thread before
releasing to_dst_file"), because shutdown() is not issued for the
return-path channel.

Fix it by ensuring migration error is set in case of error in
migration_completion().

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: 9425ef3f990a ("migration: Use migrate_has_error() in close_return_path_on_source()")
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 9fe8fd2afd..b73ae3a72c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2799,6 +2799,7 @@ static void migration_completion(MigrationState *s)
 {
     int ret = 0;
     int current_active_state = s->state;
+    Error *local_err = NULL;
 
     if (s->state == MIGRATION_STATUS_ACTIVE) {
         ret = migration_completion_precopy(s, &current_active_state);
@@ -2832,6 +2833,15 @@ static void migration_completion(MigrationState *s)
     return;
 
 fail:
+    if (qemu_file_get_error_obj(s->to_dst_file, &local_err)) {
+        migrate_set_error(s, local_err);
+        error_free(local_err);
+    } else if (ret) {
+        error_setg_errno(&local_err, -ret, "Error in migration completion");
+        migrate_set_error(s, local_err);
+        error_free(local_err);
+    }
+
     migration_completion_failed(s, current_active_state);
 }
 
-- 
2.44.0
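
The fail-path logic in the patch above can be sketched outside of QEMU
as follows. This is only an illustrative model, not QEMU code: the
MigState struct, its channel_errno field, and the helpers error_new(),
mig_set_error(), mig_has_error() and completion_fail_path() are all
hypothetical stand-ins. It assumes, as the patch's free-after-set usage
suggests, that the setter stores its own copy of the error and keeps
only the first one recorded.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[128];
} Error;

/* Hypothetical stand-in for MigrationState: only what the sketch needs. */
typedef struct MigState {
    Error *error;       /* first recorded migration error, or NULL */
    int channel_errno;  /* 0, or a negative errno latched on the channel */
} MigState;

/* Builds an error of the form "<text>: <strerror(err)>". */
static Error *error_new(const char *text, int err)
{
    Error *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    snprintf(e->msg, sizeof(e->msg), "%s: %s", text, strerror(err));
    return e;
}

/* Keeps only the first error and stores a copy; the caller still owns 'err'. */
static void mig_set_error(MigState *s, const Error *err)
{
    if (s->error == NULL) {
        s->error = malloc(sizeof(*err));
        assert(s->error != NULL);
        *s->error = *err;
    }
}

static int mig_has_error(const MigState *s)
{
    return s->error != NULL;
}

/* Failure path: prefer the channel's own error, else fall back to 'ret'. */
static void completion_fail_path(MigState *s, int ret)
{
    Error *local_err = NULL;

    if (s->channel_errno) {
        local_err = error_new("Channel error", -s->channel_errno);
    } else if (ret) {
        local_err = error_new("Error in migration completion", -ret);
    }
    if (local_err) {
        mig_set_error(s, local_err);
        free(local_err);
    }
}
```

With this shape, a later cleanup step that asks mig_has_error() sees a
failure whether it came from the channel or only from a negative return
code, which is the property the commit message says
close_return_path_on_source() relies on.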




* [PULL 2/2] migration/postcopy: Ensure postcopy_start() sets errp if it fails
  2024-03-31 18:32 [PULL 0/2] Migration 20240331 patches peterx
  2024-03-31 18:32 ` [PULL 1/2] migration: Set migration error in migration_completion() peterx
@ 2024-03-31 18:32 ` peterx
  2024-04-01 14:30 ` [PULL 0/2] Migration 20240331 patches Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: peterx @ 2024-03-31 18:32 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Cédric Le Goater, Prasad Pandit, peterx, Fabiano Rosas,
	Avihai Horon, qemu-stable

From: Avihai Horon <avihaih@nvidia.com>

There are several places where postcopy_start() fails without setting
errp. This can cause a NULL pointer dereference because, on error, the
caller of postcopy_start() copies/prints the error set in errp.

Fix it by setting errp in all of postcopy_start() error paths.

Cc: qemu-stable <qemu-stable@nongnu.org>
Fixes: 908927db28ea ("migration: Update error description whenever migration fails")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240328140252.16756-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index b73ae3a72c..86bf76e925 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2510,6 +2510,8 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         migration_wait_main_channel(ms);
         if (postcopy_preempt_establish_channel(ms)) {
             migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+            error_setg(errp, "%s: Failed to establish preempt channel",
+                       __func__);
             return -1;
         }
     }
@@ -2525,17 +2527,22 @@ static int postcopy_start(MigrationState *ms, Error **errp)
 
     ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "%s: Failed to stop the VM", __func__);
         goto fail;
     }
 
     ret = migration_maybe_pause(ms, &cur_state,
                                 MIGRATION_STATUS_POSTCOPY_ACTIVE);
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "%s: Failed in migration_maybe_pause()",
+                         __func__);
         goto fail;
     }
 
     ret = bdrv_inactivate_all();
     if (ret < 0) {
+        error_setg_errno(errp, -ret, "%s: Failed in bdrv_inactivate_all()",
+                         __func__);
         goto fail;
     }
     restart_block = true;
@@ -2612,6 +2619,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
 
     /* Now send that blob */
     if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) {
+        error_setg(errp, "%s: Failed to send packaged data", __func__);
         goto fail_closefb;
     }
     qemu_fclose(fb);
-- 
2.44.0
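
The contract this patch restores (every failing return sets errp, so a
caller may read the error unconditionally) can be sketched in plain C.
This is a simplified model, not QEMU's implementation:
sketch_error_setg() is a hypothetical, reduced imitation of
error_setg(), and start_sketch() and run_and_report() are invented
names used only for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[128];
} Error;

/* Reduced imitation of error_setg(): fills *errp if the caller passed one. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;

    if (errp == NULL) {
        return;                 /* caller is not interested in details */
    }
    assert(*errp == NULL);      /* an error must not be set twice */
    *errp = calloc(1, sizeof(**errp));
    assert(*errp != NULL);
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

/* Hypothetical multi-step operation; fail_step picks the failing step (0 = none). */
static int start_sketch(int fail_step, Error **errp)
{
    if (fail_step == 1) {
        sketch_error_setg(errp, "%s: Failed to stop the VM", __func__);
        return -1;
    }
    if (fail_step == 2) {
        sketch_error_setg(errp, "%s: Failed to send packaged data", __func__);
        return -1;
    }
    return 0;
}

/* Caller that reads the error unconditionally when the callee fails. */
static int run_and_report(int fail_step, char *out, size_t out_len)
{
    Error *local_err = NULL;
    int ret = start_sketch(fail_step, &local_err);

    if (ret < 0) {
        /* Safe only because every failing path above sets errp. */
        snprintf(out, out_len, "%s", local_err->msg);
        free(local_err);
    }
    return ret;
}
```

If any failing branch of start_sketch() returned -1 without calling
sketch_error_setg(), run_and_report() would dereference a NULL
local_err, which is the class of bug the commit message describes.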




* Re: [PULL 0/2] Migration 20240331 patches
  2024-03-31 18:32 [PULL 0/2] Migration 20240331 patches peterx
  2024-03-31 18:32 ` [PULL 1/2] migration: Set migration error in migration_completion() peterx
  2024-03-31 18:32 ` [PULL 2/2] migration/postcopy: Ensure postcopy_start() sets errp if it fails peterx
@ 2024-04-01 14:30 ` Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2024-04-01 14:30 UTC (permalink / raw)
  To: peterx; +Cc: qemu-devel, Cédric Le Goater, Prasad Pandit, Fabiano Rosas

On Sun, 31 Mar 2024 at 19:32, <peterx@redhat.com> wrote:
>
> From: Peter Xu <peterx@redhat.com>
>
> The following changes since commit b9dbf6f9bf533564f6a4277d03906fcd32bb0245:
>
>   Merge tag 'pull-tcg-20240329' of https://gitlab.com/rth7680/qemu into staging (2024-03-30 14:54:57 +0000)
>
> are available in the Git repository at:
>
>   https://gitlab.com/peterx/qemu.git tags/migration-20240331-pull-request
>
> for you to fetch changes up to d0ad271a7613459bd0a3397c8071a4ad06f3f7eb:
>
>   migration/postcopy: Ensure postcopy_start() sets errp if it fails (2024-03-31 14:30:03 -0400)
>
> ----------------------------------------------------------------
> Migration pull for 9.0-rc2
>
> - Avihai's two fixes on error paths


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0
for any user-visible changes.

-- PMM


