From: Peter Xu <peterx@redhat.com>
To: Shivam Kumar <shivam.kumar1@nutanix.com>
Cc: Fabiano Rosas <farosas@suse.de>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [RFC PATCH] Fix race in live migration failure path
Date: Thu, 23 Jan 2025 15:40:25 -0500 [thread overview]
Message-ID: <Z5KpOayak2EA3ajB@x1n> (raw)
In-Reply-To: <CA955950-188F-4FA7-9405-D7BBC81EB84E@nutanix.com>
On Thu, Jan 23, 2025 at 06:05:28PM +0000, Shivam Kumar wrote:
>
>
> > On 23 Jan 2025, at 9:57 PM, Peter Xu <peterx@redhat.com> wrote:
> >
> > !-------------------------------------------------------------------|
> > CAUTION: External Email
> >
> > |-------------------------------------------------------------------!
> >
> > On Thu, Jan 23, 2025 at 09:53:16AM +0000, Shivam Kumar wrote:
> >>
> >>
> >>> On 22 Jan 2025, at 10:10 PM, Peter Xu <peterx@redhat.com> wrote:
> >>>
> >>> !-------------------------------------------------------------------|
> >>> CAUTION: External Email
> >>>
> >>> |-------------------------------------------------------------------!
> >>>
> >>> Hi, Shivam,
> >>>
> >>> On Wed, Jan 22, 2025 at 10:54:17AM +0000, Shivam Kumar wrote:
> >>>> There is one place where we set the migration status to FAILED but we don’t set
> >>>> s->error, i.e. in save_snapshot workflow, please check qemu_savevm_state but
> >>>> not sure setting s->error in this case is possible (or required), as it seems a
> >>>> different workflow to me.
> >>>
> >>> Yes it's very different indeed. I may not yet fully get the challenge on
> >>> how switching to s->error (implies FAILING) would affect this one, though.
> >>> I mean, in qemu_savevm_state() it tries to fetch qemufile errors with:
> >>>
> >>> ret = qemu_file_get_error(f);
> >>>
> >>> IIUC we could also try to fetch s->error just like what migration thread
> >>> does, and if it sets means it's failing?
> >> Yes, I can just set s->error in qemu_savevm_state.
> >> @@ -1686,7 +1686,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
> >> - status = MIGRATION_STATUS_FAILED;
> >> + s->error = *errp;
> >> But my question was: is migrate_fd_cleanup called (where we will set the status
> >> to FAILED if s->error is set) in the snapshot workflow?
> >
> > I see what you meant. It's not called indeed. We may need to process it
> > the same as what migrate_fd_cleanup() does.
> >
> > So far the snapshot code reuses migration code in a partial way, so it's
> > not crystal clear where the line is, e.g., obviously it still moves the
> > migration state machine but it never shows "active" phase at all (even if
> > it has a major chunk of duration that it's actively migrating the data to
> > the snapshot files). Here the state machine only goes from SETUP to either
> > FAILED or COMPLETED.
> >
> > From that POV looks like we should duplicate such s->error detection logic
> > here on deciding whether to switch to FAILED, if we treat s->error as the
> > internal "single failure point" for migration.
> I hope setting the state to FAILED at the end of save_snapshot might work:
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1722,7 +1722,7 @@ cleanup:
> qemu_savevm_state_cleanup();
>
> if (ret != 0) {
> - qatomic_set(&ms->failing, 1);
> + migrate_set_error(ms, *errp);
> } else {
> migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
> MIGRATION_STATUS_COMPLETED);
> @@ -3054,6 +3054,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
> RunState saved_state = runstate_get();
> uint64_t vm_state_size;
> g_autoptr(GDateTime) now = g_date_time_new_now_local();
> + MigrationState *ms;
>
> GLOBAL_STATE_CODE();
>
> @@ -3149,8 +3150,13 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>
> the_end:
> bdrv_drain_all_end();
> -
> + ms = migrate_get_current();
> vm_resume(saved_state);
> + if (migrate_has_error(ms)) {
> + migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
> + MIGRATION_STATUS_FAILED);
> + }
> +
I actually am not sure whether we need to postpone the FAILED update until
here for save_snapshot.
After all, merely the whole qemu_savevm_state() takes BQL, then there's no
way to race it with another "migrate" (QMP command "migrate" needs BQL too)
at this point.
OTOH, it also feels weird to update state in qemu_savevm_state_cleanup()..
Perhaps we can keep how qemu_savevm_state() would update FAILED state, then
we only need to update FAILED for precopy in migrate_fd_cleanup()?
We can still check for s->error in qemu_savevm_state(), though, because
after switching to a heavier use of s->error maybe we can fail the
save_snapshot() too with some s->error set (even if qemufile is still
working). Then we may want to fail the save_snapshot() too.
Thanks,
--
Peter Xu
prev parent reply other threads:[~2025-01-23 20:41 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-10 10:07 [RFC PATCH] Fix race in live migration failure path Shivam Kumar
2025-01-10 13:09 ` Fabiano Rosas
2025-01-13 16:29 ` Peter Xu
2025-01-22 10:54 ` Shivam Kumar
2025-01-22 16:40 ` Peter Xu
2025-01-23 9:53 ` Shivam Kumar
2025-01-23 16:27 ` Peter Xu
2025-01-23 18:05 ` Shivam Kumar
2025-01-23 20:40 ` Peter Xu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z5KpOayak2EA3ajB@x1n \
--to=peterx@redhat.com \
--cc=farosas@suse.de \
--cc=qemu-devel@nongnu.org \
--cc=shivam.kumar1@nutanix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.