From: Peter Xu <peterx@redhat.com>
To: Juraj Marcin <jmarcin@redhat.com>
Cc: qemu-devel@nongnu.org, Jiri Denemark <jdenemar@redhat.com>,
	Stefan Weil <sw@weilnetz.de>, Paolo Bonzini <pbonzini@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [RFC PATCH 2/4] migration: Fix state transition in postcopy_start() error handling
Date: Thu, 7 Aug 2025 16:54:16 -0400
Message-ID: <aJUSeOIKfQ47uliY@x1.local>
In-Reply-To: <20250807114922.1013286-3-jmarcin@redhat.com>

On Thu, Aug 07, 2025 at 01:49:10PM +0200, Juraj Marcin wrote:
> From: Juraj Marcin <jmarcin@redhat.com>
> 
> Depending on where an error during postcopy_start() happens, the state
> can be either "active", "device" or "cancelling", but never
> "postcopy-active". Migration state is transitioned to "postcopy-active"
> only just before a successful return from the function.
> 
> Accept any state except "cancelling" when transitioning to "failed"
> state.
> 
> Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
> ---
>  migration/migration.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 10c216d25d..e5ce2940d5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2872,8 +2872,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>  fail_closefb:
>      qemu_fclose(fb);
>  fail:
> -    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
> +    if (ms->state != MIGRATION_STATUS_CANCELLING) {
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +    }

Hmm, this might have been overlooked in my commit 48814111366b.  Maybe
worth a Fixes: tag and a copy to stable?

For example, I would expect the old code (prior to 48814111366b) to still
be able to fail postcopy and resume the src QEMU if
qemu_savevm_send_packaged() failed.  Now it looks like it'll be stuck in
the "device" state.

The other thing is that setting FAILED while not messing with a CANCELLING
state looks like a common pattern.  It's not easy to always remember this,
so maybe we should consider having a helper function?

  migrate_set_failure(MigrationState *, Error *err);

It could record err with migrate_set_error() (likely we could also
error_report() the error), and move the state to FAILED iff it's not
CANCELLING.
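
To make the idea concrete, here is a minimal standalone sketch of what such
a helper could do.  Everything in it is a hypothetical stand-in: MigStatus,
MigState and mig_set_failure() are not QEMU names, and a plain string
replaces Error *.  The real helper would go through migrate_set_error() and
migrate_set_state() rather than touching the fields directly.

```c
#include <stdio.h>

/* Hypothetical stand-ins for QEMU's migration status and state types. */
typedef enum {
    MIG_ACTIVE,
    MIG_DEVICE,
    MIG_CANCELLING,
    MIG_FAILED,
} MigStatus;

typedef struct {
    MigStatus state;
    const char *error;   /* first recorded error, if any */
} MigState;

/*
 * Record the error, then move to FAILED unless a cancel is in progress.
 * Returns 1 if the state was changed to FAILED, 0 if it was left alone.
 */
static int mig_set_failure(MigState *s, const char *err)
{
    /* Keep only the first error and report it once. */
    if (err && !s->error) {
        s->error = err;
        fprintf(stderr, "migration failed: %s\n", err);
    }

    /* A pending cancel wins: do not overwrite CANCELLING with FAILED. */
    if (s->state == MIG_CANCELLING) {
        return 0;
    }

    s->state = MIG_FAILED;
    return 1;
}
```

With something like this, each error path would shrink to a single call
and the CANCELLING check could not be forgotten.  The sketch ignores
atomicity of the state update, which the real code would need to handle.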

I saw three occurrences where such a helper may apply, but it's worth
double checking:

postcopy_start[2725]           if (ms->state != MIGRATION_STATUS_CANCELLING) {
migration_completion[3069]     if (s->state != MIGRATION_STATUS_CANCELLING) {
migration_connect[4064]        if (s->state != MIGRATION_STATUS_CANCELLING) {

If the cleanup looks worthwhile, and if the Fixes tag applies, we could put
the cleanup patch on top of the fix so that the fix is easier to backport.

Thanks,

>      migration_block_activate(NULL);
>      migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
>      bql_unlock();
> -- 
> 2.50.1
> 

-- 
Peter Xu


