From: Juan Quintela <quintela@redhat.com>
To: Leonardo Bras <leobras@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	 Peter Xu <peterx@redhat.com>,
	 qemu-devel@nongnu.org,  Li Xiaohui <xiaohli@redhat.com>
Subject: Re: [PATCH v1 1/1] migration: Fix yank on postcopy multifd crashing guest after migration
Date: Thu, 10 Nov 2022 14:47:53 +0100
Message-ID: <87tu362a5y.fsf@secure.mitica>
In-Reply-To: <20221109055629.789795-1-leobras@redhat.com> (Leonardo Bras's message of "Wed, 9 Nov 2022 02:56:29 -0300")

Leonardo Bras <leobras@redhat.com> wrote:
> When multifd and postcopy-ram capabilities are enabled, if a
> migrate-start-postcopy is attempted, the migration will finish sending the
> memory pages and then crash with the following error:
>
> qemu-system-x86_64: ../util/yank.c:107: yank_unregister_instance: Assertion
> `QLIST_EMPTY(&entry->yankfns)' failed.
>
> This happens because even though all multifd channels could
> yank_register_function(), none of them could unregister it before
> unregistering the MIGRATION_YANK_INSTANCE, causing the assert to fail.
>
> Fix that by calling multifd_load_cleanup() on postcopy_ram_listen_thread()
> before MIGRATION_YANK_INSTANCE is unregistered.
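
To make the failure mode described above concrete, here is a minimal
sketch of the invariant behind the assertion: an instance may only be
unregistered once every function registered on it has been
unregistered. This is a toy model, not QEMU's yank code. ToyYankInstance
and all toy_* names are hypothetical, and a plain counter stands in for
the real function list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a yank instance: it only counts the
 * functions currently registered on it. */
typedef struct {
    bool registered;
    size_t nfuncs;
} ToyYankInstance;

void toy_register_instance(ToyYankInstance *inst)
{
    inst->registered = true;
    inst->nfuncs = 0;
}

/* Each channel would register one function when it is set up. */
void toy_register_function(ToyYankInstance *inst)
{
    assert(inst->registered);
    inst->nfuncs++;
}

/* Each channel must unregister its function during cleanup. */
void toy_unregister_function(ToyYankInstance *inst)
{
    assert(inst->registered && inst->nfuncs > 0);
    inst->nfuncs--;
}

/* True only when no functions remain registered on the instance. */
bool toy_can_unregister_instance(const ToyYankInstance *inst)
{
    return inst->registered && inst->nfuncs == 0;
}

void toy_unregister_instance(ToyYankInstance *inst)
{
    /* Models the quoted check: the function list must be empty. */
    assert(toy_can_unregister_instance(inst));
    inst->registered = false;
}
```

In this model, skipping the per-channel toy_unregister_function() calls
before toy_unregister_instance() trips the assertion, which is the
ordering problem the commit message describes.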

Hi

One question:
What guarantees that migration_load_cleanup() is not called twice?

I can't see anything here that provides that.  Or has postcopy never
done the cleanup of multifd channels before?
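
For illustration only, one common way to make a cleanup routine safe
against a second call is to have it return early once its state has
been released. The sketch below is a toy model under that assumption.
ToyRecvState and the toy_* names are hypothetical, and nothing in this
message establishes whether multifd_load_cleanup() already behaves this
way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receive-side state; not QEMU's real multifd structure. */
typedef struct {
    int nchannels;
    bool active;
} ToyRecvState;

static ToyRecvState *toy_recv_state;  /* NULL once cleanup has run */
static int toy_cleanup_runs;          /* counts real cleanups, for checking */

void toy_setup(ToyRecvState *state)
{
    assert(state != NULL);
    state->active = true;
    toy_recv_state = state;
}

/* Returns 0 on success. Safe to call more than once: a second call
 * sees the NULL pointer and does nothing. */
int toy_load_cleanup(void)
{
    if (toy_recv_state == NULL) {
        return 0;  /* already cleaned up, or never set up */
    }
    toy_recv_state->active = false;
    toy_recv_state = NULL;
    toy_cleanup_runs++;
    return 0;
}
```

With a guard like this, calling the cleanup from both the normal
incoming path and the postcopy listen thread would perform the real
work only once. Concurrent callers would still need locking, which this
sketch leaves out.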

Later, Juan.


> Fixes: b5eea99ec2 ("migration: Add yank feature")
> Reported-by: Li Xiaohui <xiaohli@redhat.com>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
>  migration/migration.h |  1 +
>  migration/migration.c | 18 +++++++++++++-----
>  migration/savevm.c    |  2 ++
>  3 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index cdad8aceaa..240f64efb0 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -473,6 +473,7 @@ void migration_make_urgent_request(void);
>  void migration_consume_urgent_request(void);
>  bool migration_rate_limit(void);
>  void migration_cancel(const Error *error);
> +bool migration_load_cleanup(void);
>  
>  void populate_vfio_info(MigrationInfo *info);
>  void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
> diff --git a/migration/migration.c b/migration/migration.c
> index 739bb683f3..4f363b2a95 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -486,6 +486,17 @@ void migrate_add_address(SocketAddress *address)
>                        QAPI_CLONE(SocketAddress, address));
>  }
>  
> +bool migration_load_cleanup(void)
> +{
> +    Error *local_err = NULL;
> +
> +    if (multifd_load_cleanup(&local_err)) {
> +        error_report_err(local_err);
> +        return true;
> +    }
> +    return false;
> +}
> +
>  static void qemu_start_incoming_migration(const char *uri, Error **errp)
>  {
>      const char *p = NULL;
> @@ -540,8 +551,7 @@ static void process_incoming_migration_bh(void *opaque)
>       */
>      qemu_announce_self(&mis->announce_timer, migrate_announce_params());
>  
> -    if (multifd_load_cleanup(&local_err) != 0) {
> -        error_report_err(local_err);
> +    if (migration_load_cleanup()) {
>          autostart = false;
>      }
>      /* If global state section was not received or we are in running
> @@ -646,9 +656,7 @@ fail:
>      migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>                        MIGRATION_STATUS_FAILED);
>      qemu_fclose(mis->from_src_file);
> -    if (multifd_load_cleanup(&local_err) != 0) {
> -        error_report_err(local_err);
> -    }
> +    migration_load_cleanup();
>      exit(EXIT_FAILURE);
>  }
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index a0cdb714f7..250caff7f4 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1889,6 +1889,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
>          exit(EXIT_FAILURE);
>      }
>  
> +    migration_load_cleanup();
> +

This is the addition I don't understand: why was it not needed (or
done) before?

>      migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>                                     MIGRATION_STATUS_COMPLETED);
>      /*

Later, Juan.



