From: Avihai Horon <avihaih@nvidia.com>
To: <qemu-devel@nongnu.org>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
"Wang, Lei" <lei4.wang@intel.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
"Avihai Horon" <avihaih@nvidia.com>
Subject: [PATCH v2 1/3] migration: Don't serialize devices in qemu_savevm_state_iterate()
Date: Mon, 4 Mar 2024 12:53:37 +0200
Message-ID: <20240304105339.20713-2-avihaih@nvidia.com>
In-Reply-To: <20240304105339.20713-1-avihaih@nvidia.com>

Commit 90697be8896c ("live migration: Serialize vmstate saving in stage
2") introduced device serialization in qemu_savevm_state_iterate(). The
rationale behind it was to first complete migration of slower-changing
block devices and only then migrate the RAM, to avoid sending
fast-changing RAM pages over and over.

That commit was added a long time ago, and while it was useful back
then, this is no longer the case:

1. Block migration is deprecated, see commit 66db46ca83b8 ("migration:
   Deprecate block migration").
2. Today there are other iterative devices besides RAM and block, such
   as VFIO, which are registered for migration after RAM. With the
   current serialization behavior, a fast-changing device can block
   other devices from sending their data, which may prevent migration
   from converging in some cases.

The issue described in item 2 was observed in several VFIO migration
scenarios with the switchover-ack capability enabled, where some
workload on the VM prevented RAM from ever reaching a hard zero, thus
blocking VFIO initial pre-copy data from being sent. Hence, the
destination could not ack switchover and migration could not converge.

Fix that by not serializing iterative devices in
qemu_savevm_state_iterate().

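The difference between the two loop shapes can be illustrated with a
toy model. This is only a sketch under stated assumptions: ToyDev,
toy_iterate(), iterate_serialized() and iterate_all() are hypothetical
names invented for illustration and are not QEMU code; only the
control flow mirrors the old and new loops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a migration handler; not QEMU's SaveStateEntry. */
typedef struct ToyDev {
    int pending; /* units of data still to send */
    int refill;  /* units dirtied again after every call (0 = converges) */
    int sent;    /* total units sent so far */
} ToyDev;

/* Toy iterate handler: sends at most one unit, returns 1 when done, else 0. */
static int toy_iterate(ToyDev *d)
{
    if (d->pending > 0) {
        d->pending--;
        d->sent++;
    }
    d->pending += d->refill;
    return d->pending == 0;
}

/* Old shape: stop at the first device that has not finished. */
static int iterate_serialized(ToyDev *devs, size_t n)
{
    int ret = 1;

    assert(devs != NULL);
    for (size_t i = 0; i < n; i++) {
        ret = toy_iterate(&devs[i]);
        if (ret <= 0) {
            break;
        }
    }
    return ret;
}

/* New shape: visit every device, report 1 only if all of them finished. */
static int iterate_all(ToyDev *devs, size_t n)
{
    bool all_finished = true;

    assert(devs != NULL);
    for (size_t i = 0; i < n; i++) {
        int ret = toy_iterate(&devs[i]);
        if (ret < 0) {
            return ret; /* error path; never taken by this toy handler */
        } else if (!ret) {
            all_finished = false;
        }
    }
    return all_finished;
}
```

With a first device that is re-dirtied on every call ({1, 1, 0}) and a
second device holding three units ({3, 0, 0}), iterate_serialized()
never reaches the second device, while iterate_all() drains it in three
passes and keeps returning 0 because the first device has not finished.
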
Note that this still doesn't fully prevent device starvation. As
correctly pointed out by Peter [1], a fast-changing device might
constantly consume all allocated bandwidth and block the devices that
follow it. However, this scenario is likely to happen only if
max-bandwidth is low.

[1] https://lore.kernel.org/qemu-devel/Zd6iw9dBhW6wKNxx@x1n/

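The remaining starvation case can be sketched the same way. The model
below is an assumption made for illustration, not a description of how
QEMU accounts for bandwidth: BudgetDev and pass_with_budget() are
hypothetical names, and the per-pass budget is only a stand-in for the
migration rate limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: wants to send 'chunk' units on every pass. */
typedef struct BudgetDev {
    int chunk; /* units the device tries to send per pass */
    int sent;  /* total units sent so far */
} BudgetDev;

/*
 * One pass over all devices under a fixed budget. Devices are visited in
 * order, and the pass stops as soon as the budget is used up.
 * Returns the budget left over at the end of the pass.
 */
static int pass_with_budget(BudgetDev *devs, size_t n, int budget)
{
    assert(devs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (budget <= 0) {
            break; /* limit reached: later devices wait for the next pass */
        }
        int take = devs[i].chunk < budget ? devs[i].chunk : budget;
        devs[i].sent += take;
        budget -= take;
    }
    return budget;
}
```

If the first device wants 8 units per pass and the budget is also 8,
the second device sends nothing on that pass. With a budget of 16 the
second device gets its turn, which matches the observation that the
problem shows up mainly with a low max-bandwidth.
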
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 migration/savevm.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index d612c8a9020..d76d82e7596 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1389,7 +1389,8 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
 int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
 {
     SaveStateEntry *se;
-    int ret = 1;
+    bool all_finished = true;
+    int ret;
 
     trace_savevm_state_iterate();
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1430,16 +1431,12 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
                          "%d(%s): %d",
                          se->section_id, se->idstr, ret);
             qemu_file_set_error(f, ret);
-        }
-        if (ret <= 0) {
-            /* Do not proceed to the next vmstate before this one reported
-               completion of the current stage. This serializes the migration
-               and reduces the probability that a faster changing state is
-               synchronized over and over again. */
-            break;
+            return ret;
+        } else if (!ret) {
+            all_finished = false;
         }
     }
-    return ret;
+    return all_finished;
 }
 
 static bool should_send_vmdesc(void)
-- 
2.26.3