From: Peter Xu <peterx@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu-devel@nongnu.org,
"Leonardo Bras Soares Passos" <lsoaresp@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Juan Quintela" <quintela@redhat.com>,
"Chensheng Dong" <chdong@redhat.com>,
"Zhiyi Guo" <zhguo@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>
Subject: Re: [PATCH] migration: Allow user to specify migration available bandwidth
Date: Tue, 25 Jul 2023 12:42:06 -0400
Message-ID: <ZL/7XtiEFWEprQhD@x1n>
In-Reply-To: <87351cfdrq.fsf@pond.sub.org>
Hi, Markus,
On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Migration bandwidth is a very important value to live migration. It's
> > because it's one of the major factors that we'll make decision on when to
> > switchover to destination in a precopy process.
> >
> > This value is currently estimated by QEMU during the whole live migration
> > process by monitoring how fast we were sending the data. This can be the
> > most accurate bandwidth if in the ideal world, where we're always feeding
> > unlimited data to the migration channel, and then it'll be limited to the
> > bandwidth that is available.
> >
> > However in reality it may be very different, e.g., over a 10Gbps network we
> > can see query-migrate showing migration bandwidth of only a few tens of
> > MB/s just because there are plenty of other things the migration thread
> > might be doing. For example, the migration thread can be busy scanning
> > zero pages, or it can be fetching dirty bitmap from other external dirty
> > sources (like vhost or KVM). It means we may not be pushing data as much
> > as possible to migration channel, so the bandwidth estimated from "how many
> > data we sent in the channel" can be dramatically inaccurate sometimes,
> > e.g., that a few tens of MB/s even if 10Gbps available, and then the
> > decision to switchover will be further affected by this.
> >
> > The migration may not even converge at all with the downtime specified,
> > with that wrong estimation of bandwidth.
> >
> > The issue is QEMU itself may not be able to avoid those uncertainties on
> > measuring the real "available migration bandwidth". At least not something
> > I can think of so far.
> >
> > One way to fix this is when the user is fully aware of the available
> > bandwidth, then we can allow the user to help providing an accurate value.
> >
> > For example, if the user has a dedicated channel of 10Gbps for migration
> > for this specific VM, the user can specify this bandwidth so QEMU can
> > always do the calculation based on this fact, trusting the user as long as
> > specified.
> >
> > When the user wants to have migration only use 5Gbps out of that 10Gbps,
> > one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> > so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> > other things). So it can be useful even if the network is not dedicated,
> > but as long as the user can know a solid value.
> >
> > A new parameter "available-bandwidth" is introduced just for this. So when
> > the user specified this parameter, instead of trusting the estimated value
> > from QEMU itself (based on the QEMUFile send speed), let's trust the user
> > more.
> >
> > This can resolve issues like "unconvergence migration" which is caused by
> > hilariously low "migration bandwidth" detected for whatever reason.
> >
> > Reported-by: Zhiyi Guo <zhguo@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > qapi/migration.json | 20 +++++++++++++++++++-
> > migration/migration.h | 2 +-
> > migration/options.h | 1 +
> > migration/migration-hmp-cmds.c | 14 ++++++++++++++
> > migration/migration.c | 19 +++++++++++++++----
> > migration/options.c | 28 ++++++++++++++++++++++++++++
> > migration/trace-events | 2 +-
> > 7 files changed, 79 insertions(+), 7 deletions(-)
> >
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 47dfef0278..fdc269e0a1 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -730,6 +730,16 @@
> > # @max-bandwidth: to set maximum speed for migration. maximum speed
> > # in bytes per second. (Since 2.8)
> > #
> > +# @available-bandwidth: to set available bandwidth for migration. By
> > +# default, this value is zero, means the user is not aware of the
> > +# available bandwidth that can be used by QEMU migration, so QEMU will
> > +# estimate the bandwidth automatically. This can be set when the
> > +# estimated value is not accurate, while the user is able to guarantee
> > +# such bandwidth is available for migration purpose during the
> > +# migration procedure. When specified correctly, this can make the
> > +# switchover decision much more accurate, which will also be based on
> > +# the max downtime specified. (Since 8.2)
>
> Humor me: break lines slightly earlier, like
>
> # @available-bandwidth: to set available bandwidth for migration. By
> # default, this value is zero, means the user is not aware of the
> # available bandwidth that can be used by QEMU migration, so QEMU
> # will estimate the bandwidth automatically. This can be set when
> # the estimated value is not accurate, while the user is able to
> # guarantee such bandwidth is available for migration purpose
> # during the migration procedure. When specified correctly, this
> # can make the switchover decision much more accurate, which will
> # also be based on the max downtime specified. (Since 8.2)
Sure.
>
> > +#
> > # @downtime-limit: set maximum tolerated downtime for migration.
> > # maximum downtime in milliseconds (Since 2.8)
> > #
> > @@ -803,7 +813,7 @@
> > 'cpu-throttle-initial', 'cpu-throttle-increment',
> > 'cpu-throttle-tailslow',
> > 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> > - 'downtime-limit',
> > + 'available-bandwidth', 'downtime-limit',
> > { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
> > 'block-incremental',
> > 'multifd-channels',
> > @@ -886,6 +896,9 @@
> > # @max-bandwidth: to set maximum speed for migration. maximum speed
> > # in bytes per second. (Since 2.8)
> > #
> > +# @available-bandwidth: to set available migration bandwidth. Please refer
> > +# to comments in MigrationParameter for more information. (Since 8.2)
>
> For better or worse, we duplicate full documentation between
> MigrationParameter, MigrateSetParameters, and MigrationParameters. This
> would be the first instance where we reference instead. I'm not opposed
> to use references, but if we do, I want them used consistently.
We discussed this for the other "switchover" parameter, but that patchset
has stalled.
Perhaps I should provide a prerequisite patch that removes all the comments in
MigrateSetParameters and MigrationParameters, letting them all point to
MigrationParameter?
One thing I should have mentioned much earlier is that this patch is 8.2
material.
>
> > +#
> > # @downtime-limit: set maximum tolerated downtime for migration.
> > # maximum downtime in milliseconds (Since 2.8)
> > #
> > @@ -971,6 +984,7 @@
> > '*tls-hostname': 'StrOrNull',
> > '*tls-authz': 'StrOrNull',
> > '*max-bandwidth': 'size',
> > + '*available-bandwidth': 'size',
> > '*downtime-limit': 'uint64',
> > '*x-checkpoint-delay': { 'type': 'uint32',
> > 'features': [ 'unstable' ] },
> > @@ -1078,6 +1092,9 @@
> > # @max-bandwidth: to set maximum speed for migration. maximum speed
> > # in bytes per second. (Since 2.8)
> > #
> > +# @available-bandwidth: to set available migration bandwidth. Please refer
> > +# to comments in MigrationParameter for more information. (Since 8.2)
> > +#
> > # @downtime-limit: set maximum tolerated downtime for migration.
> > # maximum downtime in milliseconds (Since 2.8)
> > #
> > @@ -1160,6 +1177,7 @@
> > '*tls-hostname': 'str',
> > '*tls-authz': 'str',
> > '*max-bandwidth': 'size',
> > + '*available-bandwidth': 'size',
> > '*downtime-limit': 'uint64',
> > '*x-checkpoint-delay': { 'type': 'uint32',
> > 'features': [ 'unstable' ] },
> > diff --git a/migration/migration.h b/migration/migration.h
> > index b7c8b67542..fadbf64d9d 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -283,7 +283,7 @@ struct MigrationState {
> > /*
> > * The final stage happens when the remaining data is smaller than
> > * this threshold; it's calculated from the requested downtime and
> > - * measured bandwidth
> > + * measured bandwidth, or available-bandwidth if user specified.
>
> Suggest to scratch "user".
Will do, thanks.
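As a rough illustration of what the comment in the hunk above describes, here is a minimal sketch of the threshold calculation. It is not the actual QEMU code: the function names are hypothetical, and it only assumes what the patch states, namely that bandwidth is in bytes per second, downtime is in milliseconds, and an available-bandwidth of zero means "not specified".

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): choose the bandwidth used for
 * the switchover decision. A zero available_bw means the user did not
 * specify one, so fall back to the value estimated from the send rate.
 */
static uint64_t switchover_bandwidth(uint64_t estimated_bw,
                                     uint64_t available_bw)
{
    return available_bw ? available_bw : estimated_bw;
}

/*
 * Hypothetical helper (not a QEMU function): number of bytes that can be
 * transferred within downtime_ms at bw_bytes_per_sec. Once the remaining
 * data drops below this threshold, the final stage can start.
 */
static uint64_t switchover_threshold(uint64_t bw_bytes_per_sec,
                                     uint64_t downtime_ms)
{
    return bw_bytes_per_sec * downtime_ms / 1000;
}
```

With a 300 ms downtime limit, an estimated 50 MB/s yields a threshold of 15 MB, while a user-specified 10 Gbps (1.25 GB/s) yields 375 MB, which is why a badly underestimated bandwidth can keep the migration from converging.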
--
Peter Xu