From: Andreas Olsowski <andreas.olsowski@uni.leuphana.de>
To: xen-devel@lists.xensource.com
Subject: Re: slow live magration / xc_restore on xen4 pvops
Date: Wed, 2 Jun 2010 17:46:45 +0200
Message-ID: <20100602174645.9b37b6b1.andreas.olsowski@uni.leuphana.de>
In-Reply-To: <C82BC2B3.166A7%keir.fraser@eu.citrix.com>
Hi Keir,
I changed all DPRINTF calls to ERROR, including the commented-out (// DPRINTF) ones.
There are no DBGPRINTF calls in my xc_domain_restore.c though.
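For anyone wanting to reproduce this without editing every call site, the same effect can probably be had by remapping the debug macro onto the error macro. The following is only a minimal sketch of that idea, not the libxc code: `log_error` and `error_count` are hypothetical stand-ins, and the real ERROR macro in libxc is defined differently.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's error logger (an assumption,
 * not the libxc implementation). It counts calls so the effect of the
 * remapping can be observed. */
static int error_count;

static void log_error(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    fputs("ERROR: ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);
    error_count++;
}

/* Error-level logging always reaches the log. */
#define ERROR(...)   log_error(__VA_ARGS__)

/* Remap the debug macro onto ERROR so that debug messages are logged
 * at error priority while investigating. */
#define DPRINTF(...) ERROR(__VA_ARGS__)
```

With this in place, a call such as `DPRINTF("reading batch of %d pages", 1024);` goes through the error path, which is why the debug messages below carry an error prefix.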
This is the new xend.log output. Note that in this case the "ERROR Internal error:" lines are actually debug output.
xenturio1:~# tail -f /var/log/xen/xend.log
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 50 51 1 2 0 0 0 0
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: xc_domain_restore start: p2m_size = 20000
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: Reloading memory pages: 0%
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of -7 pages
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423)
[2010-06-02 15:49:03 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
...
[2010-06-02 15:49:09 5468] INFO (XendCheckpoint:423) ERROR Internal err100%
...
One can see the time gap between the first and the following memory batch reads: almost five minutes (15:44:19 to 15:49:02).
After that, restoration works as expected.
You might notice that the progress output shows "0%" and then "100%" with no steps in between, whereas xc_save does show intermediate steps. Is that intentional, or maybe another symptom of the same problem?
As for the read_exact stuff:
tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H RDEXACT {} \;
tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H rdexact {} \;
There are no RDEXACT/rdexact matches in my xen source code.
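For reference, the two read strategies discussed in the quoted mail below (a plain read-until-complete loop versus a loop that first waits in select() with a timeout) can be sketched roughly as follows. This is an illustration only and not the code from the Xen tree: the function names `read_exact_sketch` and `read_exact_timeout_sketch` and the `timeout_ms` parameter are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `size` bytes from `fd`, retrying on
 * short reads and EINTR. Returns 0 on success, -1 on error or early EOF. */
static int read_exact_sketch(int fd, void *data, size_t size)
{
    size_t offset = 0;

    while (offset < size) {
        ssize_t len = read(fd, (char *)data + offset, size - offset);

        if (len == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (len <= 0)
            return -1;              /* read error, or EOF before `size` bytes */
        offset += (size_t)len;
    }
    return 0;
}

/* Hypothetical variant: same loop, but wait in select() for at most
 * `timeout_ms` milliseconds before each read. Returns -1 with errno set
 * to ETIMEDOUT if no data arrives within the timeout. */
static int read_exact_timeout_sketch(int fd, void *data, size_t size,
                                     long timeout_ms)
{
    size_t offset = 0;

    while (offset < size) {
        fd_set rfds;
        struct timeval tv;
        int rc;
        ssize_t len;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc == -1 && errno == EINTR)
            continue;               /* interrupted: wait again */
        if (rc == -1)
            return -1;              /* select() failed */
        if (rc == 0) {
            errno = ETIMEDOUT;      /* nothing readable within the timeout */
            return -1;
        }

        len = read(fd, (char *)data + offset, size - offset);
        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        offset += (size_t)len;
    }
    return 0;
}
```

The first variant simply blocks in read() until data arrives; the second gives up once the sender has been silent for longer than the timeout, so the two differ in how they behave when the sending side stalls.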
In a few hours I will shut down all virtual machines on one of the hosts experiencing slow xc_restores, maybe reboot it, and check whether xc_restore is any faster without load on the machine.
I'll check in with results later.
On Wed, 2 Jun 2010 08:11:31 +0100
Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> Hi Andreas,
>
> This is an interesting bug, to be sure. I think you need to modify the
> restore code to get a better idea of what's going on. The file in the Xen
> tree is tools/libxc/xc_domain_restore.c. You will see it contains many
> DBGPRINTF and DPRINTF calls, some of which are commented out, and some of
> which may 'log' at too low a priority level to make it to the log file. For
> your purposes you might change them to ERROR calls as they will definitely
> get properly logged. One area of possible concern is that our read function
> (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to
> have a select() call with a timeout of 1000ms. Do I entirely trust it? Not
> when we have the inexplicable behaviour that you're seeing. So you might try
> mapping RDEXACT() to read_exact() instead (which is what we already do when
> building for __MINIOS__).
>
> This all assumes you know your way around C code at least a little bit.
>
> -- Keir
--
Andreas Olsowski <andreas.olsowski@uni.leuphana.de>
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg
Tel: ++49 4131 / 6771309