From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, Juraj Marcin <jmarcin@redhat.com>
Cc: qemu-devel@nongnu.org, Jiri Denemark <jdenemar@redhat.com>,
Stefan Weil <sw@weilnetz.de>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [RFC PATCH 2/4] migration: Fix state transition in postcopy_start() error handling
Date: Fri, 08 Aug 2025 16:08:12 -0300
Message-ID: <87ectl1vj7.fsf@suse.de>
In-Reply-To: <aJUSeOIKfQ47uliY@x1.local>

Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 07, 2025 at 01:49:10PM +0200, Juraj Marcin wrote:
>> From: Juraj Marcin <jmarcin@redhat.com>
>>
>> Depending on where an error during postcopy_start() happens, the state
>> can be either "active", "device" or "cancelling", but never
>> "postcopy-active". Migration state is transitioned to "postcopy-active"
>> only just before a successful return from the function.
>>
>> Accept any state except "cancelling" when transitioning to "failed"
>> state.
>>
>> Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
>> ---
>> migration/migration.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 10c216d25d..e5ce2940d5 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -2872,8 +2872,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>>  fail_closefb:
>>      qemu_fclose(fb);
>>  fail:
>> -    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>> -                          MIGRATION_STATUS_FAILED);
>> +    if (ms->state != MIGRATION_STATUS_CANCELLING) {
>> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
>> +    }
>
> Hmm, this might have been overlooked in my commit 48814111366b. Maybe
> worth a Fixes tag and a copy to stable?
>
> For example, I would expect the old code (prior to 48814111366b) to still
> be able to fail postcopy and resume the source QEMU if
> qemu_savevm_send_packaged() failed. Now it looks like it will be stuck in
> the "device" state.
>
> The other thing is that setting FAILED while leaving a CANCELLING state
> untouched looks like a common pattern. It's not easy to always remember
> this, so maybe we should consider having a helper function?
>
> migrate_set_failure(MigrationState *, Error *err);
>
We didn't do it back then because at the time there would have been a
logical conflict with this series:

https://lore.kernel.org/r/20250110100707.4805-1-shivam.kumar1@nutanix.com

But I don't remember the details. If it works this time I'm all for it.

> Which could set err with migrate_set_error() (likely we could also
> error_report() the error), and update FAILED iff it's not CANCELLING.
>
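
To make the shape of the proposed helper concrete, here is a minimal
standalone sketch. It is not QEMU code: SketchStatus, SketchState and
sketch_set_failure() are hypothetical stand-ins for MigrationStatus,
MigrationState and the suggested migrate_set_failure(), and the plain
assignments stand in for migrate_set_error() and migrate_set_state().
Keeping only the first error is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for QEMU's MigrationStatus and MigrationState,
 * reduced to the states named in this thread.
 */
typedef enum {
    SK_STATUS_ACTIVE,
    SK_STATUS_DEVICE,
    SK_STATUS_CANCELLING,
    SK_STATUS_FAILED,
} SketchStatus;

typedef struct {
    SketchStatus state;
    const char *error;   /* first recorded error, NULL if none */
} SketchState;

/*
 * Record the error, then move to FAILED unless a cancel is in flight.
 * Real code would go through migrate_set_error() and migrate_set_state()
 * rather than plain assignments.
 */
static void sketch_set_failure(SketchState *s, const char *err)
{
    assert(s != NULL);

    /* Assumption of this sketch: the first error wins. */
    if (s->error == NULL) {
        s->error = err;
    }

    /* A pending cancel takes precedence over the failure. */
    if (s->state != SK_STATUS_CANCELLING) {
        s->state = SK_STATUS_FAILED;
    }
}
```

Called from a "device" or "active" state this ends in FAILED; called
while CANCELLING it records the error and leaves the state alone, so the
cancel path can finish on its own.
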
> I saw three occurrences where such a helper may apply, but they are
> worth double-checking:
>
> postcopy_start[2725] if (ms->state != MIGRATION_STATUS_CANCELLING) {
> migration_completion[3069] if (s->state != MIGRATION_STATUS_CANCELLING) {
> migration_connect[4064] if (s->state != MIGRATION_STATUS_CANCELLING) {
>
> If the cleanup looks worthwhile, and if the Fixes apply, we could have the
> cleanup patch on top of the fixes patch so patch 1 is easier to backport.
>
> Thanks,
>
>>      migration_block_activate(NULL);
>>      migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
>>      bql_unlock();
>> --
>> 2.50.1
>>