From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Lukas Straub <lukasstraub2@web.de>,
	Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
	qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH 1/5] migration: Fix missing join() of rp_thread
Date: Wed, 21 Jul 2021 10:49:21 +0100
Message-ID: <YPftoYsHQGgvvUqy@work-vm>
In-Reply-To: <20210721012134.792845-2-peterx@redhat.com>

* Peter Xu (peterx@redhat.com) wrote:
> It's possible for the migration thread to skip the join() of the rp_thread in
> the race below and crash on the source right as migration finishes:
> 
>        migration_thread                     rp_thread
>        ----------------                     ---------
>     migration_completion()
>                                         (before rp_thread quits)
>                                         from_dst_file=NULL
>                                         [thread got scheduled out]
>       s->rp_state.from_dst_file==NULL
>         (skip join() of rp_thread)
>     migrate_fd_cleanup()
>       qemu_fclose(s->to_dst_file)
>       yank_unregister_instance()
>         assert(yank_find_entry())  <------- crash
> 
> It could mostly happen with postcopy, but that shouldn't be required, e.g., I
> think it could also trigger with MIGRATION_CAPABILITY_RETURN_PATH set.
> 
> It's suspected that above race could be the root cause of a recent (but rare)
> migration-test break reported by either Dave or PMM:
> 
> https://lore.kernel.org/qemu-devel/YPamXAHwan%2FPPXLf@work-vm/
> 
> The issue is: from_dst_file is reset in the rp_thread, so if the thread resets
> it to NULL fast enough then the migration thread will assume there's no
> rp_thread at all.
>
> This could potentially cause more severe issues (e.g. a crash) after the yank code.
> 
> Fix it by using a boolean to keep "whether we've created rp_thread".
> 
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 4 +++-
>  migration/migration.h | 7 +++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 2d306582eb..21b94f75a3 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2867,6 +2867,7 @@ static int open_return_path_on_source(MigrationState *ms,
>  
>      qemu_thread_create(&ms->rp_state.rp_thread, "return path",
>                         source_return_path_thread, ms, QEMU_THREAD_JOINABLE);
> +    ms->rp_state.rp_thread_created = true;
>  
>      trace_open_return_path_on_source_continue();
>  
> @@ -2891,6 +2892,7 @@ static int await_return_path_close_on_source(MigrationState *ms)
>      }
>      trace_await_return_path_close_on_source_joining();
>      qemu_thread_join(&ms->rp_state.rp_thread);
> +    ms->rp_state.rp_thread_created = false;
>      trace_await_return_path_close_on_source_close();
>      return ms->rp_state.error;
>  }
> @@ -3170,7 +3172,7 @@ static void migration_completion(MigrationState *s)
>       * it will wait for the destination to send it's status in
>       * a SHUT command).
>       */
> -    if (s->rp_state.from_dst_file) {
> +    if (s->rp_state.rp_thread_created) {
>          int rp_error;
>          trace_migration_return_path_end_before();
>          rp_error = await_return_path_close_on_source(s);
> diff --git a/migration/migration.h b/migration/migration.h
> index 2ebb740dfa..c302879fad 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -195,6 +195,13 @@ struct MigrationState {
>          QEMUFile     *from_dst_file;
>          QemuThread    rp_thread;
>          bool          error;
> +        /*
> +         * We can also check non-zero of rp_thread, but there's no "official"
> +         * way to do this, so this bool makes it slightly more elegant.
> +         * Checking from_dst_file for this is racy because from_dst_file will
> +         * be cleared in the rp_thread!
> +         */
> +        bool          rp_thread_created;
>          QemuSemaphore rp_sem;
>      } rp_state;
>  
> -- 
> 2.31.1
> 
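
For readers following the thread, the pattern the patch adopts can be
sketched outside QEMU. The block below is a minimal illustration using
plain POSIX threads, not QEMU code: struct rp_demo, worker(),
demo_start() and demo_finish() are hypothetical names invented for this
sketch. It assumes, as in the patch, that the join decision is made by
the thread that created the worker, so the flag itself needs no locking,
while the pointer the worker clears is never consulted for that decision.

```c
/* Standalone sketch with POSIX threads; NOT QEMU code.
 * Every name here is hypothetical and exists only for illustration. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct rp_demo {
    _Atomic(void *) from_dst;  /* stands in for from_dst_file; cleared by the worker */
    pthread_t thread;
    bool thread_created;       /* written and read only by the owning thread */
    int worker_finished;       /* written by the worker, read only after join */
};

/* The worker clears the shared pointer before it exits, which is what
 * makes "pointer != NULL" an unreliable test for "thread exists". */
static void *worker(void *opaque)
{
    struct rp_demo *d = opaque;

    atomic_store(&d->from_dst, NULL);
    d->worker_finished = 1;
    return NULL;
}

/* Create the worker and record that a join is now owed. */
static int demo_start(struct rp_demo *d, void *file)
{
    atomic_store(&d->from_dst, file);
    d->worker_finished = 0;
    if (pthread_create(&d->thread, NULL, worker, d) != 0) {
        return -1;
    }
    d->thread_created = true;
    return 0;
}

/* Join if and only if a worker was created; the pointer is not checked.
 * Returns true when a join was actually performed. */
static bool demo_finish(struct rp_demo *d)
{
    if (!d->thread_created) {
        return false;
    }
    pthread_join(d->thread, NULL);
    d->thread_created = false;
    return true;
}
```

Calling demo_start() followed by demo_finish() from a single thread
always joins the worker, no matter how early the worker clears from_dst.
Checking from_dst instead would skip the join whenever the worker won
the race.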
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK