From: "Wang, Lei" <lei4.wang@intel.com>
To: Avihai Horon <avihaih@nvidia.com>, qemu-devel@nongnu.org
Cc: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>,
	Joao Martins <joao.m.martins@oracle.com>
Subject: Re: [PATCH] migration: Don't serialize migration while can't switchover
Date: Tue, 27 Feb 2024 11:16:38 +0800
Message-ID: <6fa132c5-8ed8-41a6-a70d-90230ce3ca84@intel.com>
In-Reply-To: <20240222155627.14563-1-avihaih@nvidia.com>
On 2/22/2024 23:56, Avihai Horon wrote:
> Currently, migration code serializes device data sending during pre-copy
> iterative phase. As noted in the code comment, this is done to prevent
> faster changing device from sending its data over and over.
>
> However, with switchover-ack capability enabled, this behavior can be
> problematic and may prevent migration from converging. The problem lies
> in the fact that an earlier device may never finish sending its data and
> thus block other devices from sending theirs.
>
> This bug was observed in several VFIO migration scenarios where some
> workload on the VM prevented RAM from ever reaching a hard zero, not
> allowing VFIO initial pre-copy data to be sent, and thus destination
> could not ack switchover. Note that the same scenario, but without
> switchover-ack, would converge.
>
> Fix it by not serializing device data sending during pre-copy iterative
> phase if switchover was not acked yet.

Hi Avihai,

Could this bug be solved by ordering the priorities of the different devices' handlers?

>
> Fixes: 1b4adb10f898 ("migration: Implement switchover ack logic")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> migration/savevm.h | 2 +-
> migration/migration.c | 4 ++--
> migration/savevm.c | 22 +++++++++++++++-------
> 3 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 74669733dd6..d4a368b522b 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -36,7 +36,7 @@ void qemu_savevm_state_setup(QEMUFile *f);
> bool qemu_savevm_state_guest_unplug_pending(void);
> int qemu_savevm_state_resume_prepare(MigrationState *s);
> void qemu_savevm_state_header(QEMUFile *f);
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool can_switchover);
> void qemu_savevm_state_cleanup(void);
> void qemu_savevm_state_complete_postcopy(QEMUFile *f);
> int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> diff --git a/migration/migration.c b/migration/migration.c
> index ab21de2cadb..d8bfe1fb1b9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3133,7 +3133,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> }
>
> /* Just another iteration step */
> - qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
> + qemu_savevm_state_iterate(s->to_dst_file, in_postcopy, can_switchover);
> return MIG_ITERATE_RESUME;
> }
>
> @@ -3216,7 +3216,7 @@ static MigIterateState bg_migration_iteration_run(MigrationState *s)
> {
> int res;
>
> - res = qemu_savevm_state_iterate(s->to_dst_file, false);
> + res = qemu_savevm_state_iterate(s->to_dst_file, false, true);
> if (res > 0) {
> bg_migration_completion(s);
> return MIG_ITERATE_BREAK;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index d612c8a9020..3a012796375 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1386,7 +1386,7 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
> * 0 : We haven't finished, caller have to go again
> * 1 : We have finished, we can go to complete phase
> */
> -int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
> +int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, bool can_switchover)
> {
> SaveStateEntry *se;
> int ret = 1;
> @@ -1430,12 +1430,20 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
> "%d(%s): %d",
> se->section_id, se->idstr, ret);
> qemu_file_set_error(f, ret);
> + return ret;
> }
> - if (ret <= 0) {
> - /* Do not proceed to the next vmstate before this one reported
> - completion of the current stage. This serializes the migration
> - and reduces the probability that a faster changing state is
> - synchronized over and over again. */
> +
> + if (ret == 0 && can_switchover) {
> + /*
> + * Do not proceed to the next vmstate before this one reported
> + * completion of the current stage. This serializes the migration
> + * and reduces the probability that a faster changing state is
> + * synchronized over and over again.
> + * Do it only if migration can switchover. If migration can't
> + * switchover yet, do proceed to let other devices send their data
> + * too, as this may be required for switchover to be acked and
> + * migration to converge.
> + */
> break;
> }
> }
> @@ -1724,7 +1732,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
> qemu_savevm_state_setup(f);
>
> while (qemu_file_get_error(f) == 0) {
> - if (qemu_savevm_state_iterate(f, false) > 0) {
> + if (qemu_savevm_state_iterate(f, false, true) > 0) {
> break;
> }
> }
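
To make the behaviour change in the quoted hunk easier to follow, here is a small standalone toy model of the iterate loop. It is only a sketch and not QEMU code: ToyDevice, toy_iterate, toy_state_iterate and toy_second_device_sends are hypothetical names made up for illustration. The first device stands in for RAM that never reports completion, the second for a device (such as VFIO) whose initial pre-copy data must be sent before the destination can ack switchover. Only the break condition (ret == 0 && can_switchover) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SaveStateEntry with a save_live_iterate hook. */
typedef struct {
    bool never_finishes; /* models state that is dirtied faster than it is sent */
    int sends;           /* number of times this device was given a turn */
} ToyDevice;

/* Returns 1 when the device reports its stage complete, 0 otherwise. */
static int toy_iterate(ToyDevice *d)
{
    d->sends++;
    return d->never_finishes ? 0 : 1;
}

/* One pass over all devices, using the same break condition as the patch. */
static void toy_state_iterate(ToyDevice *devs, int n, bool can_switchover)
{
    for (int i = 0; i < n; i++) {
        int ret = toy_iterate(&devs[i]);
        if (ret == 0 && can_switchover) {
            break; /* serialize: later devices wait for this one */
        }
    }
}

/* Runs several passes and reports how often the second device got to send. */
int toy_second_device_sends(bool can_switchover, int rounds)
{
    ToyDevice devs[2] = {
        { .never_finishes = true,  .sends = 0 }, /* "RAM" under a busy workload */
        { .never_finishes = false, .sends = 0 }, /* device behind it in the list */
    };

    assert(rounds >= 0);
    for (int r = 0; r < rounds; r++) {
        toy_state_iterate(devs, 2, can_switchover);
    }
    return devs[1].sends;
}
```

In this model, toy_second_device_sends(true, 5) leaves the second device with no turns at all, which corresponds to the serialized behaviour, while toy_second_device_sends(false, 5) gives it one turn per pass. The model ignores return values, error handling and rate limiting, so it says nothing about how the real loop should report completion.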