All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>
Subject: Re: [PATCH 5/6] migration: Fix hang after error in destination setup phase
Date: Wed, 4 Dec 2024 13:44:35 -0500	[thread overview]
Message-ID: <Z1CjE2UYUECkRhJZ@x1n> (raw)
In-Reply-To: <20241202220137.32584-6-farosas@suse.de>

On Mon, Dec 02, 2024 at 07:01:36PM -0300, Fabiano Rosas wrote:
> If the destination side fails at migration_ioc_process_incoming()
> before starting the coroutine, it will report the error but QEMU will
> not exit.
> 
> Set the migration state to FAILED and exit the process if
> exit-on-error allows.
> 
> CC: Thomas Huth <thuth@redhat.com>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2633
> Reported-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

(I skipped the postcopy patches as of now, until we finish the discussion
 in patch 2)

> ---
>  migration/channel.c   | 11 ++++++-----
>  migration/migration.c | 31 ++++++++++++++++++-------------
>  migration/migration.h |  2 +-
>  3 files changed, 25 insertions(+), 19 deletions(-)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index f9de064f3b..6d7f9172d8 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -40,13 +40,14 @@ void migration_channel_process_incoming(QIOChannel *ioc)
>  
>      if (migrate_channel_requires_tls_upgrade(ioc)) {
>          migration_tls_channel_process_incoming(s, ioc, &local_err);
> +
> +        if (local_err) {
> +            error_report_err(local_err);
> +        }

What if tls processing failed here, do we have similar issue that qemu will
stall?  Do we want to cover that too?

> +
>      } else {
>          migration_ioc_register_yank(ioc);
> -        migration_ioc_process_incoming(ioc, &local_err);
> -    }
> -
> -    if (local_err) {
> -        error_report_err(local_err);
> +        migration_ioc_process_incoming(ioc);
>      }
>  }
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 8a61cc26d7..cd88ebc875 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -943,7 +943,7 @@ static bool migration_should_start_incoming(bool main_channel)
>      return true;
>  }
>  
> -void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> +void migration_ioc_process_incoming(QIOChannel *ioc)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      Error *local_err = NULL;
> @@ -966,10 +966,9 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>           * issue is not possible.
>           */
>          ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
> -                                          sizeof(channel_magic), errp);
> -
> +                                          sizeof(channel_magic), &local_err);
>          if (ret != 0) {
> -            return;
> +            goto err;
>          }
>  
>          default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
> @@ -977,8 +976,8 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>          default_channel = !mis->from_src_file;
>      }
>  
> -    if (multifd_recv_setup(errp) != 0) {
> -        return;
> +    if (multifd_recv_setup(&local_err) != 0) {
> +        goto err;
>      }
>  
>      if (default_channel) {
> @@ -995,18 +994,24 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>              postcopy_preempt_new_channel(mis, f);
>          }
>          if (local_err) {
> -            error_propagate(errp, local_err);
> -            return;
> +            goto err;
>          }
>      }
>  
> -    if (migration_should_start_incoming(default_channel)) {
> -        /* If it's a recovery, we're done */
> -        if (postcopy_try_recover()) {
> -            return;
> -        }
> +    if (migration_should_start_incoming(default_channel) &&
> +        !postcopy_try_recover()) {
>          migration_incoming_process();
>      }
> +
> +    return;
> +
> +err:
> +    error_report_err(local_err);
> +    migrate_set_state(&mis->state, MIGRATION_STATUS_SETUP,
> +                      MIGRATION_STATUS_FAILED);
> +    if (mis->exit_on_error) {
> +        exit(EXIT_FAILURE);
> +    }
>  }
>  
>  /**
> diff --git a/migration/migration.h b/migration/migration.h
> index 0956e9274b..c367e5ea40 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -477,7 +477,7 @@ void migrate_set_state(MigrationStatus *state, MigrationStatus old_state,
>                         MigrationStatus new_state);
>  
>  void migration_fd_process_incoming(QEMUFile *f);
> -void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
> +void migration_ioc_process_incoming(QIOChannel *ioc);
>  void migration_incoming_process(void);
>  
>  bool  migration_has_all_channels(void);
> -- 
> 2.35.3
> 

-- 
Peter Xu



  reply	other threads:[~2024-12-04 18:45 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-02 22:01 [PATCH 0/6] migration: Fix issues during qmp_migrate_cancel Fabiano Rosas
2024-12-02 22:01 ` [PATCH 1/6] tests/qtest/migration: Introduce migration_test_add_suffix Fabiano Rosas
2024-12-04 18:44   ` Peter Xu
2024-12-02 22:01 ` [PATCH 2/6] migration: Kick postcopy threads on cancel Fabiano Rosas
2024-12-04 18:39   ` Peter Xu
2024-12-04 19:02     ` Fabiano Rosas
2024-12-04 19:39       ` Peter Xu
2024-12-04 20:02         ` Daniel P. Berrangé
2024-12-04 20:40           ` Fabiano Rosas
2024-12-04 20:59             ` Peter Xu
2024-12-04 20:51           ` Peter Xu
2024-12-04 21:01             ` Fabiano Rosas
2024-12-04 21:11               ` Peter Xu
2024-12-05 10:52             ` Daniel P. Berrangé
2024-12-05 13:18               ` Fabiano Rosas
2024-12-05 15:03                 ` Peter Xu
2024-12-05 15:44                   ` Daniel P. Berrangé
2024-12-05 15:40                 ` Daniel P. Berrangé
2024-12-02 22:01 ` [PATCH 3/6] migration: Fix postcopy listen thread exit Fabiano Rosas
2024-12-02 22:01 ` [PATCH 4/6] migration: Make sure postcopy recovery doesn't hang when cancelling Fabiano Rosas
2024-12-02 22:01 ` [PATCH 5/6] migration: Fix hang after error in destination setup phase Fabiano Rosas
2024-12-04 18:44   ` Peter Xu [this message]
2024-12-04 19:05     ` Fabiano Rosas
2024-12-02 22:01 ` [PATCH 6/6] tests/qtest/migration: Add a cancel test Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1CjE2UYUECkRhJZ@x1n \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=farosas@suse.de \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.