From: Keir Fraser <keir.xen@gmail.com>
To: Olaf Hering <olaf@aepfle.de>, xen-devel@lists.xen.org
Subject: Re: reliable live migration of large and busy guests
Date: Tue, 06 Nov 2012 20:45:57 +0000
Message-ID: <CCBF2785.442DA%keir.xen@gmail.com>
In-Reply-To: <20121106202816.GA29655@aepfle.de>

On 06/11/2012 20:28, "Olaf Hering" <olaf@aepfle.de> wrote:

> We got a customer report about long-lasting and then failing live
> migration of busy guests.
> 
> The guest has 64G of memory, is busy with its set of applications and as a
> result there will always be dirty pages to transfer. While some of this
> can be solved with a faster network connection, the underlying issue is
> that tools/libxc/xc_domain_save.c:xc_domain_save will suspend a domain
> after a given number of iterations to transfer the remaining dirty
> pages. From what I understand this pausing of the guest (I don't know how
> long it is actually paused) is causing issues within the guest: the
> applications start to fail (again, no details).
> 
> Their suggestion is to add some knob to the overall live migration
> process to avoid the suspend. If the guest could not be transferred with
> the parameters passed to xc_domain_save(), abort the migration and leave
> it running on the old host.
> 
> 
> My questions are:
> Has such an issue been seen elsewhere?

It's known that if you have a workload that is dirtying lots of pages
quickly, the final stop-and-copy phase will necessarily be large. A VM that
is busy dirtying lots of pages can dirty pages much quicker than they can be
transferred over the LAN.
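
To make the convergence argument concrete, here is a rough sketch in Python.
It assumes a deliberately simplified model that is not taken from
xc_domain_save.c: the guest dirties distinct pages at a constant rate, the
link sends pages at a constant rate, and each round resends everything that
was dirtied during the previous round. The function name and its parameters
are hypothetical.

```python
def remaining_dirty_pages(total_pages, dirty_rate, link_rate, rounds):
    """Return the number of pages left to send after each pre-copy round.

    Simplified model (an assumption, not Xen's actual accounting):
    while a round sends n pages at link_rate pages/s, the guest dirties
    pages at dirty_rate pages/s, and every dirtied page is distinct.
    """
    history = []
    to_send = total_pages  # the first round sends the whole guest
    for _ in range(rounds):
        seconds = to_send / link_rate
        # Pages dirtied during this round, capped at the guest size.
        to_send = min(total_pages, int(dirty_rate * seconds))
        history.append(to_send)
    return history
```

In this model, a dirty rate of half the link rate halves the set each round,
while a dirty rate above the link rate never lets it shrink below the full
guest size, so the final stop-and-copy stays large.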

> Should 'xm migrate --live' and 'xl migrate' get something like a
> --no-suspend option?

Well, it is not really possible to avoid the suspend altogether; there is
always going to be some minimal 'dirty working set'. But we could provide
parameters to require the dirty working set to be smaller than X pages
within Y rounds of dirty page copying.
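
As a minimal sketch of what such a check might look like, written in Python
rather than against the real libxc code: the function name, the parameter
names (max_dirty_pages for X, max_rounds for Y), the return values, and the
choice to abort when the threshold is not met are all assumptions for
illustration, not an existing xc_domain_save() interface.

```python
def migration_decision(dirty_per_round, max_dirty_pages, max_rounds):
    """Decide whether to go ahead with the final suspend or give up.

    dirty_per_round: dirty page counts observed after each copy round.
    Returns ("suspend", round_number) for the first round within
    max_rounds that leaves fewer than max_dirty_pages dirty, otherwise
    ("abort", rounds_examined).
    """
    examined = dirty_per_round[:max_rounds]
    for round_number, dirty in enumerate(examined, start=1):
        if dirty < max_dirty_pages:
            # Working set is small enough: suspend and copy the rest.
            return ("suspend", round_number)
    # Threshold never reached: leave the guest running on the old host.
    return ("abort", len(examined))
```

For example, with X = 200 and Y = 5, per-round counts of 500, 250 and 125
would lead to a suspend after round 3, whereas the same counts with Y = 2
would abort.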

 -- Keir

