From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, qemu-devel@nongnu.org
Cc: "Juraj Marcin" <jmarcin@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Prasad Pandit" <ppandit@redhat.com>,
	peterx@redhat.com, "Cédric Le Goater" <clg@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: Re: [PATCH 3/5] migration: Notify migration FAILED before starting VM
Date: Fri, 23 Jan 2026 09:59:35 -0300
Message-ID: <874iocjxhk.fsf@suse.de>
In-Reply-To: <20260122230331.3543312-4-peterx@redhat.com>

Peter Xu <peterx@redhat.com> writes:

> Devices may opt in to migration FAILED notifiers, which are invoked when
> migration fails.  Currently, the notifications happen in
> migration_cleanup().  This is normally fine, but may not be ideal if a
> device's fallback has an ordering dependency on the VM being started.
>
> This patch moves the FAILED notification earlier, so that if the failure
> happened during switchover, it'll notify before VM restart.
>

The change to FAILED in patch 2 should be moved into this patch, to avoid
having a window where the notification only happens at the end.

> After walking over all existing FAILED notifier users, I concluded that
> this should also be a cleaner approach, at least from a design POV.
>
> We have these notifier users, where the first two do not need to trap
> FAILED:
>
> |----------------------------+-------------------------------------+---------------------|
> | device                     | handler                             | events needed       |
> |----------------------------+-------------------------------------+---------------------|
> | gicv3                      | kvm_arm_gicv3_notifier              | DONE                |
> | vfio_iommufd / vfio_legacy | vfio_cpr_reboot_notifier            | SETUP               |
> | cpr-exec                   | cpr_exec_notifier                   | FAILED, DONE        |
> | virtio-net                 | virtio_net_migration_state_notifier | SETUP, FAILED       |
> | vfio                       | vfio_migration_state_notifier       | FAILED              |
> | vdpa                       | vdpa_net_migration_state_notifier   | SETUP, FAILED       |
> | spice [*]                  | migration_state_notifier            | SETUP, FAILED, DONE |
> |----------------------------+-------------------------------------+---------------------|
>
> For cpr-exec, it tries to clean up some cpr-exec specific fds or env
> variables.  This should be fine either way, as long as it happens before
> migration_cleanup().
>
> For virtio-net, we need to re-plug the primary device back to guest in the
> failover mode.  Likely benign.
>
> VFIO needs to restart the device if FAILED.  IIUC it should do so before
> vm_start(): if the VFIO device can be put into a stopped state due to
> migration, we should logically make it run again before vCPUs run.
>
> VDPA will disable SVQ when migration is FAILED.  Likely benign too, but
> looks better if we can do it before resuming vCPUs.
>
> For spice, we should rely on "spice_server_migrate_end(false)" to retake
> the ownership.  Benign, but looks more reasonable if the spice client does
> it before VM runs again.
>
> Note that this change may introduce slightly more downtime if the
> migration fails exactly at the switchover phase.  But that's very rare,
> and even if it happens, none of the above is expected to add a long
> delay, only a short one that will likely be buried in the total downtime.
>
> Cc: Cédric Le Goater <clg@redhat.com>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 91775f8472..1d9a2fc068 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1481,7 +1481,6 @@ static void migration_cleanup_json_writer(MigrationState *s)
>  
>  static void migration_cleanup(MigrationState *s)
>  {
> -    MigrationEventType type;
>      QEMUFile *tmp = NULL;
>  
>      trace_migration_cleanup();
> @@ -1535,9 +1534,15 @@ static void migration_cleanup(MigrationState *s)
>          /* It is used on info migrate.  We can't free it */
>          error_report_err(error_copy(s->error));
>      }
> -    type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
> -                                     MIG_EVENT_PRECOPY_DONE;
> -    migration_call_notifiers(s, type, NULL);
> +
> +    /*
> +     * FAILED notification should have already happened.  Notify DONE if
> +     * migration completed successfully.
> +     */
> +    if (!migration_has_failed(s)) {
> +        migration_call_notifiers(s, MIG_EVENT_PRECOPY_DONE, NULL);
> +    }
> +
>      yank_unregister_instance(MIGRATION_YANK_INSTANCE);
>  }
>  
> @@ -3589,6 +3594,13 @@ static void migration_iteration_finish(MigrationState *s)
>              error_free(local_err);
>              break;
>          }
> +
> +        /*
> +         * Notify FAILED before starting VM, so that devices can invoke
> +         * necessary fallbacks before vCPUs run again.
> +         */
> +        migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
> +
>          if (runstate_is_live(s->vm_old_state)) {
>              if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>                  vm_start();
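
For reference, the ordering this patch aims for can be modelled in a few
lines of standalone C.  This is only a sketch, not QEMU code: ModelState,
model_call_notifiers(), model_iteration_finish(), model_cleanup() and
model_run() are hypothetical stand-ins for the real migration functions,
and the "log" array exists only so the order of calls can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the FAILED / DONE migration events. */
typedef enum {
    MODEL_EVENT_FAILED,
    MODEL_EVENT_DONE,
} ModelEvent;

#define MODEL_LOG_MAX 8

typedef struct {
    bool failed;            /* did the (modelled) migration fail? */
    bool vm_was_running;    /* was the VM live before switchover? */
    const char *log[MODEL_LOG_MAX];
    size_t log_len;
} ModelState;

/* Record one step so callers can check the resulting order. */
void model_log(ModelState *s, const char *what)
{
    assert(s->log_len < MODEL_LOG_MAX);
    s->log[s->log_len++] = what;
}

/* Stand-in for invoking every registered device notifier. */
void model_call_notifiers(ModelState *s, ModelEvent ev)
{
    model_log(s, ev == MODEL_EVENT_FAILED ? "notify:FAILED" : "notify:DONE");
}

/* Failure path: notify FAILED first, only then let vCPUs run again. */
void model_iteration_finish(ModelState *s)
{
    if (s->failed) {
        model_call_notifiers(s, MODEL_EVENT_FAILED);
        if (s->vm_was_running) {
            model_log(s, "vm_start");
        }
    }
}

/* Cleanup no longer notifies FAILED; it only notifies DONE on success. */
void model_cleanup(ModelState *s)
{
    if (!s->failed) {
        model_call_notifiers(s, MODEL_EVENT_DONE);
    }
    model_log(s, "cleanup");
}

/* Run one modelled migration end to end, starting from a clean log. */
void model_run(ModelState *s, bool failed, bool vm_was_running)
{
    memset(s, 0, sizeof(*s));
    s->failed = failed;
    s->vm_was_running = vm_was_running;
    model_iteration_finish(s);
    model_cleanup(s);
}
```

Calling model_run() with failed=true and vm_was_running=true records
"notify:FAILED", then "vm_start", then "cleanup"; with failed=false it
records "notify:DONE" followed by "cleanup".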

