All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: Prasad Pandit <ppandit@redhat.com>, Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Prasad Pandit <pjp@fedoraproject.org>
Subject: Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Date: Tue, 06 Jan 2026 10:47:30 -0300	[thread overview]
Message-ID: <87zf6q26q5.fsf@suse.de> (raw)
In-Reply-To: <CAE8KmOzcOdYhnxpDr8BMV8zjixpEh9r+COe=xyLfXCVWKD0CRw@mail.gmail.com>

Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Tue, 23 Dec 2025 at 21:00, Peter Xu <peterx@redhat.com> wrote:
>> One thing good about exposing such status via QAPI is, it can help us
>> diagnose issues by seeing CANCELLING / FAILING even looking at
>> query-migrate results (as normally when bug happens we can't see the
>> internal status..), so that we know either it's explicitly cancelled, or
>> something went wrong.
>>
>> If it's a completely hidden / internal status, we may see ACTIVE even if
>> something wrong happened..
>
> * Both process state and reason(s) for the state change needs to be
> visible to the user. But states like cancelling/failing are redundant,
> users would derive the same conclusion from CANCELLED and CANCELLING
> OR FAILED AND FAILING. Besides, migration_cleanup() does exactly the
> same steps irrespective of whether migration is failing or cancelling
> or failed or cancelled.
>
>> My current hope is any mgmt should normally by default ignore new migration
>> states..  If that's always achieved, it looks to me adding FAILING directly
>> into migration status would still have some benefits on debugging.
>
> * libvirtd(8) complains about unknown states multiple times:
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event
>
>
>> > > @@ -2907,7 +2914,7 @@ fail_closefb:
>> > >      qemu_fclose(fb);
>> > >  fail:
>> > >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
>> > > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
>> > > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
>> > >      }
>> >
>> > This is a good example where having MigrationStatus makes the code more
>> > complicated. This could be exiting=true, running=false, etc. It
>> > shouldn't matter why this routine failed. If we reach
>> > migration_cleanup() and, at the very end, state is CANCELLING, then we
>> > know the cancel command has caused this and set the state to
>> > CANCELLED. If the state was something else, then it's unintended and we
>> > set FAILED.
>>
>> If it'll be an internal status, we'll still need to identify if someone
>> already have cancelled it, right?
>>
>> Assuming we introduce stop_reason flag, making it:
>>
>> enum {
>>     MIG_STOP_REASON_CANCEL,
>>     MIG_STOP_REASON_FAIL,
>> } MigrationStopReason;
>>
>> Then we can switch to CANCELLED / FAILED when cleanup from those reasons.
>>
>> Then here, logically we also need logic like:
>>
>>     if (stop_reason != MIG_STOP_REASON_CANCEL) {
>>         stop_reason = MIG_STOP_REASON_FAIL;
>>     }
>>
>> Because we want to make sure when the user already triggered cancel, it
>> won't show FAILED but only show CANCELLED at last?
>
> * I think the way we are setting/changing these states in as many
> locations is only adding to the complications. Do we have to
> explicitly set these states like this? What if migration_cleanup()
> always sets the state to 'STOP'. Similarly other places set the state
> to a predefined state. OR
> ===
>     struct {
>         current_state;
>         old_state;
>         event/trigger;
>         reason[];
>     } MigrationState s;
>
>     migration_change_state(s) {
>           s->old_state = s->current_state;
>           if (s->current_state == START && s->trigger ==
> 'connections-established') {
>               s->current_state = ACTIVE;
>               s->reason = "connections-established, migration starting"
>           } else if (s->current_state == ACTIVE && s->trigger == 'completed') {
>               s->current_state = STOP
>               s->reason = "migration completed"
>           } else if (s->current_state == ACTIVE  && s->trigger == 'pause') {
>               s->current_state = PAUSE
>               s->reason = "pause, migration paused"
>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'error-occurred') {
>               s->current_state = STOP
>               s->reason = "Error occurred, migration failed"
>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'user-cancel') {
>               s->current_state = STOP
>               s->reason = "user-cancel, migration cancelled"
>          } else {
>               s->current_state = s->current_state;
>               warn_msg("unknown combination, maybe define a new rule?");
>          }
>     }
> ===
> * We define explicit rules for the state change and accordingly we
> only call migration_change_state() at any point and it'll change to an
> appropriate next state, recording the due reason for the change.
>

If we had a linear state transition table, i.e. a DFA without any
branching, that would be ideal. But since we have states that can reach
(and be reached from) multiple other states, then we'll always need some
input to migration_change_state(). Here you're making it the
s->trigger. Where will that come from?

Looking at runstate.c and job.c, it seems we could at least define a
state transition table and do away with the second parameter to
migrate_set_state(s, old, new).

As we've been discussing, the current state-change mechanism has the
dual purpose of emitting the state change event and also serving as
internal tracking of the migration state. It's not clear to me whether
you're covering both in this proposal or just one of them.

I don't think we've established actually what are the goals of having
any state changes. Do we even need state changes for internal tracking?
We could use your s->trigger as an enum and just check it wherever
necessary. And keep the MIGRATION_STATUS exclusive for the external API,
in which case, it's probably better to just set it unconditionally (in
many places migrate_set_state already takes the current state as
argument, i.e. it doesn't care about the current state).

> Wdyt...?
>
> Thank you.
> ---
>   - Prasad


  reply	other threads:[~2026-01-06 13:48 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-22 11:48 [PATCH] migration: introduce MIGRATION_STATUS_FAILING Prasad Pandit
2025-12-22 14:29 ` Fabiano Rosas
2025-12-23 11:17   ` Prasad Pandit
2025-12-23 15:04     ` Fabiano Rosas
2026-01-06 10:54       ` Prasad Pandit
2026-01-06 14:53         ` Peter Xu
2025-12-23 15:30   ` Peter Xu
2025-12-23 16:01     ` Fabiano Rosas
2026-01-06 11:45     ` Prasad Pandit
2026-01-06 13:47       ` Fabiano Rosas [this message]
2026-01-07 10:49         ` Prasad Pandit
2026-01-07 13:23           ` Fabiano Rosas
2026-01-12 13:05             ` Prasad Pandit
2026-01-12 19:45               ` Fabiano Rosas
2026-01-13 11:59                 ` Prasad Pandit
2026-01-14 12:16                   ` Fabiano Rosas
2026-01-06 15:09       ` Peter Xu
2026-01-07 11:11         ` Prasad Pandit
2026-01-07 17:03           ` Peter Xu
2025-12-23 15:17 ` Peter Xu
2025-12-23 15:36   ` Fabiano Rosas
2025-12-23 16:27     ` Peter Xu
2026-01-06 10:49   ` Prasad Pandit
2026-01-06 15:23     ` Peter Xu
2026-01-07 11:01       ` Prasad Pandit
2026-01-07 17:07         ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zf6q26q5.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=peterx@redhat.com \
    --cc=pjp@fedoraproject.org \
    --cc=ppandit@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.