* xm migrate, xc_save failed
@ 2014-10-03 11:09 Armin Zentai
  0 siblings, 0 replies; only message in thread
From: Armin Zentai @ 2014-10-03 11:09 UTC (permalink / raw)
  To: xen-devel

Dear Xen Developers!


We've encountered a problem while doing a live migration between 
hypervisors.

The process is the following:

# xm migrate fnavnzt3lm0r6f 10.0.20.24 --live
Error: /usr/lib/xen/bin/xc_save 20 13 0 0 1 failed

Usage: xm migrate <Domain> <Host>

Migrate a domain to another machine.

Options:

-h, --help           Print this help.
-l, --live           Use live migration.
-p=portnum, --port=portnum
                      Use specified port for migration.
-n=nodenum, --node=nodenum
                      Use specified NUMA node on target.
-s, --ssl            Use ssl connection for migration.
-c, --change_home_server
                      Change home server for managed domains.


The command takes about 15 minutes to fail. The VM runs fine while the 
command is running, and it continues to run on the source hypervisor 
after the xm migrate command fails.

Normally the xc_save process runs for only a few seconds, but in this 
case it keeps running for about 15 minutes and then times out.

In xend.log we found the following lines:
[2014-10-03 02:52:48 9020] DEBUG (XendCheckpoint:124) [xc_save]: 
/usr/lib/xen/bin/xc_save 29 17 0 0 1
[2014-10-03 02:52:48 9020] INFO (XendCheckpoint:423) xc_save: failed to 
get the suspend evtchn port
[2014-10-03 02:52:48 9020] INFO (XendCheckpoint:423)

After about 15 minutes it times out, and xend logs:

[2014-10-03 03:05:54 9020] INFO (XendCheckpoint:423) xc: error: Error 
when writing to state file (4c) (errno 110) (110 = Connection timed 
out): Internal error
[2014-10-03 03:05:54 9020] ERROR (XendCheckpoint:185) Save failed on 
domain b415gk79eo345x (24) - resuming.
Traceback (most recent call last):
   File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", 
line 146, in save
     forkHelper(cmd, fd, saveInputHandler, False)
   File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", 
line 411, in forkHelper
     raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 25 24 0 0 1 failed
[2014-10-03 03:05:54 9020] DEBUG (XendDomainInfo:3141) 
XendDomainInfo.resumeDomain(24)
[2014-10-03 03:08:25 9020] INFO (XendCheckpoint:423) xc: error: Error 
when writing to state file (4a) (errno 110) (110 = Connection timed 
out): Internal error
[2014-10-03 03:08:25 9020] ERROR (XendCheckpoint:185) Save failed on 
domain y1xeszf11s89ab (17) - resuming.
Traceback (most recent call last):
   File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", 
line 146, in save
     forkHelper(cmd, fd, saveInputHandler, False)
   File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", 
line 411, in forkHelper
     raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 29 17 0 0 1 failed
[2014-10-03 03:08:25 9020] DEBUG (XendDomainInfo:3141) 
XendDomainInfo.resumeDomain(17)
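
To put a number on the "about 15 minutes", the start and failure entries for the same xc_save command line can be paired up and their timestamps subtracted. The following is only a rough sketch, not part of xend: the function names are made up for illustration, and it assumes the log prefix format and the "[xc_save]:" / "XendError: ... failed" messages look exactly as in the excerpt above. It also re-joins lines that were wrapped by the mail client.

```python
import re
from datetime import datetime

# Assumed formats, taken from the log excerpt in this mail (not from xend docs).
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+\]")
START_RE = re.compile(r"\[xc_save\]:\s*(\S*xc_save(?: \d+)+)")
FAIL_RE = re.compile(r"XendError: (\S*xc_save(?: \d+)+) failed")


def join_wrapped(lines):
    """Merge continuation lines (anything not starting with '[') into
    the preceding timestamped entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("[") or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def xc_save_durations(lines):
    """Return {xc_save command line: timedelta between start and failure}.

    Failures whose start entry is not present in the input are skipped.
    """
    started = {}
    durations = {}
    for entry in join_wrapped(lines):
        ts_match = TS_RE.match(entry)
        if not ts_match:
            continue
        ts = datetime.strptime(ts_match.group(1), "%Y-%m-%d %H:%M:%S")
        start = START_RE.search(entry)
        if start:
            started[start.group(1)] = ts
        fail = FAIL_RE.search(entry)
        if fail and fail.group(1) in started:
            durations[fail.group(1)] = ts - started[fail.group(1)]
    return durations


if __name__ == "__main__":
    # Path is an assumption; adjust to wherever xend.log lives on the host.
    with open("/var/log/xen/xend.log") as log:
        for cmd, elapsed in xc_save_durations(log).items():
            print(cmd, "ran for", elapsed)
```

For the lines quoted above, this pairs the 02:52:48 start of "xc_save 29 17 0 0 1" with the 03:08:25 failure, i.e. 15 minutes 37 seconds. The failure for "xc_save 25 24 0 0 1" is skipped because its start entry is not in the excerpt.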


We've encountered this error on multiple hypervisors, with multiple 
VMs.

Some info about the hypervisors:
# uname -a
Linux c2-node15 3.10.43-11.el6.centos.alt.x86_64 #1 SMP Mon Jun 16 
14:22:02 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

# xm info
host                   : c2-node15
release                : 3.10.43-11.el6.centos.alt.x86_64
version                : #1 SMP Mon Jun 16 14:22:02 UTC 2014
machine                : x86_64
nr_cpus                : 12
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2660
hw_caps                : 
bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps              : hvm
total_memory           : 49139
free_memory            : 18959
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .4-33.el6
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 
hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=3145728 noreboot=true pcie_asmp=off 
dom0_max_vcpus=6
cc_compiler            : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
cc_compile_by          : mockbuild
cc_compile_domain      : centos.org
cc_compile_date        : Mon Jun 16 17:22:14 UTC 2014
xend_config_format     : 4


CPU:  Intel(R) Xeon(R) CPU X5650  @ 2.67GHz
Memory: 48GB

All hypervisors are Dell R410 machines, with the same CPU and the same amount of memory.
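
Since the source and target are expected to be identical, one way to double-check that is to compare the "xm info" output of two hosts field by field. The sketch below is illustrative only: parse_xm_info and diff_hosts are hypothetical helper names, the parser assumes the "key : value" layout shown above, and it treats lines without that layout as wrapped continuations of the previous value.

```python
import re

# A field line looks like "xen_major              : 4" (whitespace before
# the colon). Requiring that whitespace keeps wrapped values such as
# "bfebfbff:2c100800:..." from being mistaken for a new field.
FIELD_RE = re.compile(r"^(\w+)\s+:\s*(.*)$")


def parse_xm_info(text):
    """Parse `xm info` output into a dict of field name -> value string."""
    info = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            key = match.group(1)
            info[key] = match.group(2)
        elif key is not None:
            # Continuation of a value that was wrapped onto the next line.
            info[key] = (info[key] + " " + line).strip()
    return info


def diff_hosts(a, b, ignore=("host", "free_memory")):
    """Return {field: (value_on_a, value_on_b)} for fields that differ.

    Fields that are expected to differ between hosts are ignored.
    """
    return {
        field: (a.get(field), b.get(field))
        for field in sorted(set(a) | set(b))
        if field not in ignore and a.get(field) != b.get(field)
    }
```

Saving the output of "xm info" on each hypervisor to a file and passing both parsed results to diff_hosts would list any field (for example xen_extra or xen_commandline) where the two hosts disagree.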


Thanks for your help,
  - Armin Zentai
