Re: [PATCH v4 13/14] migration: for SEV live migration bump downtime limit to 1s.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Ashish Kalra <Ashish.Kalra@amd.com>
Cc: Thomas.Lendacky@amd.com, brijesh.singh@amd.com,
	ehabkost@redhat.com, jejb@linux.ibm.com, tobin@ibm.com,
	qemu-devel@nongnu.org, dgilbert@redhat.com,
	dovmurik@linux.vnet.ibm.com, pbonzini@redhat.com
Subject: Re: [PATCH v4 13/14] migration: for SEV live migration bump downtime limit to 1s.
Date: Fri, 10 Sep 2021 10:43:50 +0100	[thread overview]
Message-ID: <YTso1rziufm6Fi+j@redhat.com> (raw)
In-Reply-To: <b1468803a2200c3b5e1f1434eb74302ec4b824c6.1628076205.git.ashish.kalra@amd.com>

On Wed, Aug 04, 2021 at 11:59:47AM +0000, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> Now, qemu has a default expected downtime of 300 ms and
> SEV Live migration has a page-per-second bandwidth of 350-450 pages
> ( SEV Live migration being generally slow due to guest RAM pages
> being migrated after encryption using the security processor ).
> With this expected downtime of 300ms and 350-450 pps bandwith,
> the threshold size = <1/3 of the PPS bandwidth = ~100 pages.
> 
> Now, this threshold size is the maximum pages/bytes that can be
> sent in the final completion phase of Live migration
> (where the source VM is stopped) with the expected downtime.
> Therefore, with the threshold size computed above,
> the migration completion phase which halts the source VM
> and then transfers the leftover dirty pages,
> is only reached in SEV live migration case when # of dirty pages are ~100.
> 
> The dirty-pages-rate with larger guest RAM configuration like 4G, 8G, etc.
> is much higher, typically in the range of 300-400+ pages, hence,
> we always remain in the "dirty-sync" phase of migration and never
> reach the migration completion phase with above guest RAM configs.
> 
> To summarize, with larger guest RAM configs,
> the dirty-pages-rate > threshold_size (with the default qemu expected downtime of 300ms).
> 
> So, the fix is to increase qemu's expected downtime.
> 
> This is a tweakable parameter which can be set using "migrate_set_downtime".
> 
> With a downtime of 1 second, we get a threshold size of ~350-450 pages,
> which will handle the "dirty-pages-rate" of 300+ pages and complete
> the migration process, so we bump the default downtime to 1s in case
> of SEV live migration being active.
> 
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  migration/migration.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index daea3ecd04..c9bc33fb10 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3568,6 +3568,10 @@ static void migration_update_counters(MigrationState *s,
>      transferred = current_bytes - s->iteration_initial_bytes;
>      time_spent = current_time - s->iteration_start_time;
>      bandwidth = (double)transferred / time_spent;
> +    if (memcrypt_enabled() &&
> +        s->parameters.downtime_limit < 1000) {
> +        s->parameters.downtime_limit = 1000;
> +    }

I don't think we can be silently changing a value set by the mgmt
app. If the app requests 300 ms downtime, then we *must* honour
that, because it is driven by the SLA they need to privide to the
guest user's workload. If it means the migration won't complete,
it is up to the app to deal with that in some manner.

At most I think this is a documentation task to give guidance to
mgmt apps about what special SEV-only things to consider whe tuning
live migration.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

next prev parent reply	other threads:[~2021-09-10  9:45 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04 11:52 [PATCH v4 00/14] Add SEV guest live migration support Ashish Kalra
2021-08-04 11:53 ` [PATCH v4 01/14] doc: update AMD SEV API spec web link Ashish Kalra
2021-08-16 18:44   ` Dr. David Alan Gilbert
2021-08-04 11:53 ` [PATCH v4 02/14] doc: update AMD SEV to include Live migration flow Ashish Kalra
2021-08-05  6:34   ` Dov Murik
2021-08-05  9:39     ` Ashish Kalra
2021-09-10  9:53   ` Daniel P. Berrangé
2021-08-04 11:54 ` [PATCH v4 03/14] migration.json: add AMD SEV specific migration parameters Ashish Kalra
2021-08-05  9:42   ` Dov Murik
2021-08-05 14:41     ` Ashish Kalra
2021-08-05 20:18   ` Eric Blake
2021-08-04 11:55 ` [PATCH v4 04/14] confidential guest support: introduce ConfidentialGuestMemoryEncryptionOps for encrypted VMs Ashish Kalra
2021-08-05 12:20   ` Dov Murik
2021-08-05 14:43     ` Ashish Kalra
2021-08-04 11:56 ` [PATCH v4 05/14] target/i386: sev: provide callback to setup outgoing context Ashish Kalra
2021-08-05 13:06   ` Dov Murik
2021-08-05 14:45     ` Ashish Kalra
2021-08-04 11:56 ` [PATCH v4 06/14] target/i386: sev: do not create launch context for an incoming guest Ashish Kalra
2021-08-04 11:56 ` [PATCH v4 07/14] target/i386: sev: add support to encrypt the outgoing page Ashish Kalra
2021-08-05 14:35   ` Dov Murik
2021-08-04 11:57 ` [PATCH v4 08/14] target/i386: sev: add support to load incoming encrypted page Ashish Kalra
2021-08-04 11:57 ` [PATCH v4 09/14] kvm: Add support for SEV shared regions list and KVM_EXIT_HYPERCALL Ashish Kalra
2021-08-04 11:57 ` [PATCH v4 10/14] migration: add support to migrate shared regions list Ashish Kalra
2021-09-10  7:54   ` Wang, Wei W
2021-09-10  8:47     ` Ashish Kalra
2021-09-10  9:11       ` Wang, Wei W
2021-09-10  9:42         ` Ashish Kalra
2021-08-04 11:58 ` [PATCH v4 11/14] migration/ram: add support to send encrypted pages Ashish Kalra
2021-08-04 11:59 ` [PATCH v4 12/14] migration/ram: Force encrypted status for flash0 & flash1 devices Ashish Kalra
2021-08-04 11:59 ` [PATCH v4 13/14] migration: for SEV live migration bump downtime limit to 1s Ashish Kalra
2021-09-10  9:43   ` Daniel P. Berrangé [this message]
2021-09-10 10:18     ` Ashish Kalra via
2021-08-04 12:00 ` [PATCH v4 14/14] kvm: Add support for userspace MSR filtering and handling of MSR_KVM_MIGRATION_CONTROL Ashish Kalra
2021-09-10  7:56   ` Wang, Wei W
2021-09-10  9:14     ` Ashish Kalra
2021-09-10  9:36       ` Wang, Wei W

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YTso1rziufm6Fi+j@redhat.com \
    --to=berrange@redhat.com \
    --cc=Ashish.Kalra@amd.com \
    --cc=Thomas.Lendacky@amd.com \
    --cc=brijesh.singh@amd.com \
    --cc=dgilbert@redhat.com \
    --cc=dovmurik@linux.vnet.ibm.com \
    --cc=ehabkost@redhat.com \
    --cc=jejb@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tobin@ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).