From: Brendan Cully <brendan@cs.ubc.ca>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: xen-devel@lists.xensource.com,
Andreas Olsowski <andreas.olsowski@uni.leuphana.de>
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 09:27:45 -0700
Message-ID: <20100602162745.GA27542@kremvax.cs.ubc.ca>
In-Reply-To: <19462.33905.936222.605434@mariner.uk.xensource.com>
On Wednesday, 02 June 2010 at 17:18, Ian Jackson wrote:
> Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 pvops"):
> > [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal
> > error: Error when reading batch size
> > [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal
> > error: error when buffering batch, finishing
>
> These errors, and the slowness of migrations, are caused by changes
> made to support Remus. Previously, a migration would be regarded as
> complete as soon as the final information including CPU states was
> received at the migration target. xc_domain_restore would return
> immediately at that point.
>
> Since the Remus patches, xc_domain_restore waits until it gets an IO
> error, and also has a very short timeout which induces IO errors if
> nothing is received within that timeout. This is correct in the
> Remus case but wrong in the normal case.
>
> The code should be changed so that xc_domain_restore
>  (a) takes an explicit parameter for the IO timeout, which
>      should default to something much longer than the 100ms or so of
>      the Remus case, and
>  (b) gets told whether
>     (i) it should return immediately after receiving the "tail"
>         which contains the CPU state; or
>     (ii) it should attempt to keep reading after receiving the "tail"
>         and only return when the connection fails.
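
For reference, the interface proposed in the quoted text could be
sketched roughly as below. This is only an illustration: the struct,
the enum, the helper names and the 60 second default are all
hypothetical and do not exist in libxc; only the ~100ms Remus figure
comes from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: (b) what to do once the "tail" (CPU state) has arrived. */
enum restore_tail_mode {
    RESTORE_RETURN_AFTER_TAIL,   /* (b)(i): plain migration */
    RESTORE_READ_UNTIL_FAILURE   /* (b)(ii): keep reading, Remus style */
};

/* Hypothetical option block that a caller would pass to the restore code. */
struct restore_opts {
    int io_timeout_ms;                 /* (a) explicit IO timeout */
    enum restore_tail_mode tail_mode;  /* (b) behaviour after the tail */
};

/* Assumed values: the default only needs to be "much longer" than Remus. */
#define RESTORE_DEFAULT_TIMEOUT_MS 60000
#define RESTORE_REMUS_TIMEOUT_MS   100

/* Options for an ordinary live migration: long timeout, stop at the tail. */
static struct restore_opts restore_opts_migration(void)
{
    struct restore_opts o = { RESTORE_DEFAULT_TIMEOUT_MS,
                              RESTORE_RETURN_AFTER_TAIL };
    return o;
}

/* Options for Remus: short timeout, read until the connection fails. */
static struct restore_opts restore_opts_remus(void)
{
    struct restore_opts o = { RESTORE_REMUS_TIMEOUT_MS,
                              RESTORE_READ_UNTIL_FAILURE };
    return o;
}

/* True if the restore loop should continue reading after the tail. */
static bool restore_keep_reading(const struct restore_opts *o)
{
    assert(o != NULL);
    return o->tail_mode == RESTORE_READ_UNTIL_FAILURE;
}
```

With something of this shape, the caller would state its intent
explicitly instead of the restore code inferring it from IO errors.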

I'm going to have a look at this today, but the way the code was
originally written I don't believe this should have been a problem:

1. reads are only supposed to be able to time out after the entire
   first checkpoint has been received (IOW this wouldn't kick in until
   normal migration had already completed)

2. in normal migration, the sender should close the fd after sending
   all data, immediately triggering an IO error on the receiver and
   completing the restore.

I did try to avoid disturbing regular live migration as much as
possible when I wrote the code. I suspect some other regression has
crept in, and I'll investigate.
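
To make the two points above concrete, here is a minimal sketch of the
intended receive behaviour. It is not the libxc code: the function
names and status codes are made up for illustration. It assumes a plain
POSIX fd, uses select() for the per-read timeout, and treats a zero
return from read() (sender closed the fd) as the end of the stream.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch. */
enum rd_status { RD_OK = 0, RD_EOF = -1, RD_TIMEOUT = -2, RD_ERROR = -3 };

/*
 * Point 1: no timeout until the first checkpoint is complete.
 * A negative result means "block indefinitely".
 */
static int effective_timeout_ms(int first_checkpoint_done, int remus_timeout_ms)
{
    return first_checkpoint_done ? remus_timeout_ms : -1;
}

/*
 * Read exactly len bytes from fd. If timeout_ms >= 0, each wait for
 * data is bounded by that many milliseconds; otherwise reads block.
 */
static int read_exact_timed(int fd, void *buf, size_t len, int timeout_ms)
{
    unsigned char *p = buf;
    size_t off = 0;

    assert(buf != NULL || len == 0);

    while (off < len) {
        ssize_t n;

        if (timeout_ms >= 0) {
            fd_set rfds;
            struct timeval tv;
            int rc;

            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;

            rc = select(fd + 1, &rfds, NULL, NULL, &tv);
            if (rc < 0) {
                if (errno == EINTR)
                    continue;
                return RD_ERROR;
            }
            if (rc == 0)
                return RD_TIMEOUT;   /* nothing arrived in time */
        }

        n = read(fd, p + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return RD_ERROR;
        }
        if (n == 0)
            return RD_EOF;           /* point 2: sender closed the fd */
        off += (size_t)n;
    }
    return RD_OK;
}
```

In this model a normal migration never sees RD_TIMEOUT before the first
checkpoint is in, and finishes as soon as the sender's close() shows up
as RD_EOF on the receiver.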