From: Jim Henderson <his_jimboness@yahoo.ca>
To: xen-devel@lists.xensource.com
Subject: Migration - Still Issues?
Date: Tue, 26 Apr 2005 14:55:21 -0600
Message-ID: <426EAAB9.4060102@yahoo.ca>
[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]
All,
I am currently running a xen-2.0-testing snapshot from April 20. I'm
having sporadic problems with migration.
I have two xen machines, 10.130.2.35 and 10.130.2.36, booting from a
read-only, iso image loopback iscsi target from a third machine. I'm
using the Cisco iscsi-initiator and iscsi-init module for the boot. The
iscsi has been solid so far.
The iscsi target shows up as /dev/sda in Dom 0 on both machines. That
same read-only device is then exported to a xenU as /dev/hda when the
xenU is created, as the following xenU config file shows:
kernel = "/boot/kernel-2.6.11-xen-2.0.5-domU"
ramdisk = "/boot/initrd"
memory = 64
name = "test"
vif = [ 'mac=00:55:4F:44:00:01' ]
disk = [ 'phy:sda,hda,r' ]
dhcp="dhcp"
root = "/dev/ram0 ro init=/linuxrc cdroot"
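To make the disk line above explicit: each entry is a comma-separated triple of backend device in Dom 0, device name seen by the xenU, and access mode. The following is only a minimal sketch of that reading; `DiskSpec` and `parse_disk` are hypothetical names used for illustration and are not part of the Xen tools.

```python
from typing import NamedTuple


class DiskSpec(NamedTuple):
    backend: str   # device as seen in Dom 0, e.g. "phy:sda"
    frontend: str  # device name presented to the xenU, e.g. "hda"
    mode: str      # "r" for read-only, "w" for read-write


def parse_disk(entry: str) -> DiskSpec:
    """Split one 'backend,frontend,mode' disk entry into its three fields."""
    parts = [p.strip() for p in entry.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 'backend,frontend,mode', got {entry!r}")
    backend, frontend, mode = parts
    if mode not in ("r", "w"):
        raise ValueError(f"unknown access mode {mode!r}")
    return DiskSpec(backend, frontend, mode)


# The entry from the config file above.
spec = parse_disk("phy:sda,hda,r")
print(spec)
```

Read this way, the config exports /dev/sda from Dom 0 to the xenU as hda, read-only.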
Everything boots just fine. The "test" xenU runs flawlessly; I can ssh
into it, run whatever. No problems there. And it's surprisingly fast
over iscsi, even though I've only got 100 Mbit Ethernet adapters.
BUT...
I've been migrating between the machines, both live and non-live, with
mixed success. Roughly 1 in 10 migrations fails with the errors in the
attached xfrd.log files. The .1 file is from the source of the migration
and the .2 file is from the destination. The other 9 times out of 10,
the domain migrates just fine.
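For anyone comparing a good run against a bad one, a small script can count the failure markers in an xfrd.log. This is only a sketch: the patterns are copied from the attached logs, while the label names and the `summarize` helper are my own and not part of xfrd.

```python
import re
from collections import Counter

# Patterns taken from the attached xfrd.log files; the keys are made-up labels.
MARKERS = {
    "retry_suspend": re.compile(r"^Retry suspend domain"),
    "unable_to_suspend": re.compile(r"^Unable to suspend domain"),
    "state_file_error": re.compile(r"Error when reading from state file"),
    "xfr_service_error": re.compile(r"XFRD> Xfr service err=[1-9]"),
}


def summarize(lines):
    """Count how many lines match each failure marker."""
    counts = Counter()
    for line in lines:
        for label, pattern in MARKERS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# A few lines lifted from the attachments, as a quick self-check.
sample = [
    "2759 [INF] XFRD> Xfr service for 127.0.0.1:1145",
    "Retry suspend domain (0)",
    "Retry suspend domain (0)",
    "Unable to suspend domain. (0)",
    "98%Error when reading from state file",
    "2759 [INF] XFRD> Xfr service err=1",
]
print(dict(summarize(sample)))
```

To run it on a real log, pass an open file instead of the sample list, for example `summarize(open("xfrd.log.1"))`.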
I don't seem to get these problems when I do not export /dev/sda to a
domU. For example, if I use just a simple domU (using the same kernel)
with no mounts and an initrd file system, I don't have these problems.
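To put a number on the failure rate with and without the /dev/sda export, the migration can be repeated in a loop. The sketch below makes several assumptions that are not from my setup notes: that `xm migrate` returns a non-zero exit status when the migration fails, that a failed migration leaves the domain on the source host, and that both Dom 0 machines accept ssh commands. `ssh_runner` and `bounce` are hypothetical helpers.

```python
import subprocess

# The two Dom 0 machines from this report.
HOSTS = ["10.130.2.35", "10.130.2.36"]


def migrate_command(domain, target, live=True):
    """Build the xm command line for one migration."""
    cmd = ["xm", "migrate"]
    if live:
        cmd.append("--live")
    return cmd + [domain, target]


def ssh_runner(host, cmd):
    """Hypothetical runner: execute cmd on host over ssh, return its exit status."""
    return subprocess.call(["ssh", host] + cmd)


def bounce(domain, rounds, run=ssh_runner, hosts=HOSTS, live=True):
    """Migrate the domain back and forth; return the round numbers that failed.

    Assumes a non-zero exit status means the migration failed and the
    domain is still running on the source host.
    """
    current, other = hosts
    failed_rounds = []
    for n in range(1, rounds + 1):
        status = run(current, migrate_command(domain, other, live))
        if status == 0:
            current, other = other, current
        else:
            failed_rounds.append(n)
    return failed_rounds
```

Calling `bounce("test", 50)` once with the disk line in the config and once without it would give two failure lists to compare. The `run` parameter is there so the loop can be tried with a stand-in function before pointing it at real machines.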
I saw mailing list messages a while back dealing with migration and the
possibility of a crash under heavy network load. Further, I saw a patch
that had been applied:
<QUOTE>
[PATCH] stream fixes for migration
I've attached a patch for libxutil/libxc. This fixes one of the hangs
I've seen during migrations. It applies against 2.0 and 2.0-testing.
Changes:
* Encountering EOF or error when xfrd reads from stream could cause an
infinite loop.
* Cleaned up the closing of streams.
* Fixed several memory leaks.
Signed-off-by: Charles Coffing <ccoffing@novell.com>
</QUOTE>
The version of 2.0-testing I'm using has this patch applied. But the
comments in this patch imply that there are still more "hangs" during
migration. Have I stumbled on another one of these?
I believe this patch fixed a previous problem: I would get a looping
hang under 2.0.5 stable, and I haven't seen that since going to
2.0-testing.
Am I wrong to assume that I can mount an iscsi target read-only from two
machines at once?
Or could hardware be a factor? For testing, I'm just running cheap-o VIA
Rhine 100-TX controllers. I thought I would post this before shelling
out for some Intel gig nics and gig switches though.
Thank you very much for your help.
-James Henderson
[-- Attachment #2: xfrd.log.1 --]
[-- Type: text/plain, Size: 7677 bytes --]
2605 [INF] XFRD> Accepted connection from 127.0.0.1:1145 on 2
2759 [INF] XFRD> Xfr service for 127.0.0.1:1145
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr>
(xfr.hello 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.migrate 5 "(domain (id 5) (name test) (memory 63) (maxmem 65536) (state -b---) (cpu 0) (cpu_time 0.137634952) (up_time 15.0879249573) (start_time 1114545755.39) (console (status listening) (id 11) (domain 5) (local_port 11) (remote_port 1) (console_port 9605)) (devices (vif (idx 0) (vif 0) (mac 00:55:4f:44:00:01)(vifname vif5.0) (evtchn 12 3) (index 0)) (vbd (idx 0) (vdev 768) (device 2048)(mode r) (dev hda) (uname phy:sda) (node sda) (index 0))) (config (vm (name test) (memory 64) (image (linux (kernel /boot/kernel-2.6.11-xen-2.0.5-domU) (ramdisk /boot/initrd) (ip :1.2.3.4::::eth0:dhcp) (root '/dev/ram0 ro init=/linuxrc cdroot'))) (device (vbd (uname phy:sda) (dev hda) (mode r))) (device (vif (mac 00:55:4F:44:00:01))))))" 10.130.2.36 8002 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_connect> addr=10.130.2.36:8002
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr>
(xfr.err 0)[DEBUG] Conn_sxpr< err=0
[1114545770.483473] xc_linux_save start 5
xc_linux_save start 5
[1114545770.485161] Saving memory pages: iter 1 0%
Saving memory pages: iter 1 0%FNI 189 : [1000000c,1020] pte=00be4063, mfn=00000be4, pfn=ffffffff [mfn]=deadbeef
6%
12%
18%
25%
31%
38%
44%
50%
56%
63%
69%
75%
82%
88%
95%
1: sent 16165, skipped 219,
1: sent 16165, skipped 219, delta 6695ms, dom0 21%, target 73%, sent 79Mb/s, dirtied 1Mb/s 260 pages
[1114545777.180435] Saving memory pages: iter 2 0%
2: sent 242, skipped 12, 2 0%
2: sent 242, skipped 12, delta 102ms, dom0 20%, target 79%, sent 77Mb/s, dirtied 3Mb/s 12 pages
[1114545777.283396] Saving memory pages: iter 3 0%
3: sent 0, skipped 12, r 3 0%
3: sent 0, skipped 12, [DEBUG] Conn_sxpr>
(xfr.err 22)[DEBUG] Conn_sxpr< err=0
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Retry suspend domain (0)
Unable to suspend domain. (0)
Unable to suspend domain. (0)
Domain appears not to have suspended: 0
Domain appears not to have suspended: 0
2759 [WRN] XFRD> Transfer errors:
2759 [WRN] XFRD> state=XFR_STATE err=1
2759 [INF] XFRD> Xfr service err=1
[-- Attachment #3: xfrd.log.2 --]
[-- Type: text/plain, Size: 1027 bytes --]
2515 [INF] XFRD> Accepted connection from 10.130.2.35:4227 on 2
2656 [INF] XFRD> Xfr service for 10.130.2.35:4227
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr>
(xfr.hello 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.xfr 5)[DEBUG] Conn_sxpr< err=0
[1114545766.260913] xc_linux_restore start
xc_linux_restore start
[1114545766.265957] Created domain 5
Created domain 5
(Domain-0 Domain-5)'domain id=5 name=test memory=64 console=9605 image=/boot/kernel-2.6.11-xen-2.0.5-domU'[1114545766.340293] Reloading memory pages: 0%
Reloading memory pages: 6%
12%
18%
25%
31%
37%
43%
50%
56%
62%
68%
75%
81%
87%
93%
98%
98%Error when reading from state file
Error when reading from state file
2656 [INF] XFRD> Xfr service err=1
[-- Attachment #4: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel