From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nathan March
Subject: Re: Internal error during live migration saving
Date: Wed, 14 Sep 2011 10:58:49 -0700
Message-ID: <4E70EB59.1080708@gt.net>
References: <4E6FC4A6.5040304@gt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: rshriram@cs.ubc.ca
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 9/14/2011 10:53 AM, Shriram Rajagopalan wrote:
> On Tue, Sep 13, 2011 at 2:01 PM, Nathan March wrote:
>> Just wondering if this is a known bug?
>>
>> Trying to migrate the VM off to a diff dom0 results in the below error.
>> Other VMs migrated off fine (started at around the same time as this vm) and
>> I've tried a few different target servers, all resulting in the same thing.
>>
> Were other domains linux 3.0.3 as well ?

All the dom0's are 3.0.3 and all the domU's are 2.6.32.27 (w/ grsec). I did a cold reboot of the VM and now it migrates properly.

>> [2011-09-13 13:48:24 3996] DEBUG (XendCheckpoint:124) [xc_save]:
>> /usr/lib/xen/bin/xc_save 29 77 0 0 1
>> [2011-09-13 13:48:24 3996] INFO (XendCheckpoint:423) xc_save: failed to get
>> the suspend evtchn port
>> [2011-09-13 13:48:24 3996] INFO (XendCheckpoint:423)
>> [2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:394) suspend
>> [2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:127) In saveInputHandler
>> suspend
>> [2011-09-13 13:49:03 3996] DEBUG (XendCheckpoint:129) Suspending 77 ...
>> [2011-09-13 13:49:03 3996] DEBUG (XendDomainInfo:524)
>> XendDomainInfo.shutdown(suspend)
>> [2011-09-13 13:49:03 3996] DEBUG (XendDomainInfo:1881)
>> XendDomainInfo.handleShutdownWatch
>> [2011-09-13 13:50:06 3996] DEBUG (XendDomainInfo:1881)
>> XendDomainInfo.handleShutdownWatch
>> [2011-09-13 13:50:06 3996] INFO (XendCheckpoint:423) xc: error: Suspend
>> request failed: Internal error
>> [2011-09-13 13:50:06 3996] INFO (XendCheckpoint:423) xc: error: Domain
>> appears not to have suspended: Internal error
>> [2011-09-13 13:50:06 3996] ERROR (XendCheckpoint:185) Save failed on domain
>> globish (77) - resuming.
>> Traceback (most recent call last):
>> File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
>> 146, in save
>> forkHelper(cmd, fd, saveInputHandler, False)
>> File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
>> 395, in forkHelper
>> inputHandler(line, child.tochild)
>> File "/usr/lib64/python2.6/site-packages/xen/xend/XendCheckpoint.py", line
>> 131, in saveInputHandler
>> dominfo.waitForSuspend()
>> File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line
>> 2998, in waitForSuspend
>> raise XendError(msg)
>> XendError: Timeout waiting for domain 77 to suspend
>> [2011-09-13 13:50:06 3996] DEBUG (XendDomainInfo:3135)
>> XendDomainInfo.resumeDomain(77)
>>
>> xend-debug.log and the target dom0 logs don't show anything of value.
>>
>> This is xen 4.1.1 on linux 3.0.3
>>
> Did you try xm save -c (or the xl equivalent) ? This should be
> activating the same
> code path where this error seems to appear.
>
> Also, make sure you have CONFIG_XEN_SAVE_RESTORE enabled.

Unfortunately I didn't think to try it. I do have that set on both dom0 and domu.

- Nathan
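
For anyone else checking the CONFIG_XEN_SAVE_RESTORE suggestion above, here is a minimal sketch of one way to test whether an option is enabled in kernel config text. The helper name kernel_option_enabled and the sample config are made up for illustration; this is not part of xend or any Xen tool.

```python
import re


def kernel_option_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel .config text.

    Hypothetical helper for illustration only. Lines of the form
    "# CONFIG_FOO is not set" do not match and therefore count as disabled.
    """
    pattern = re.compile(r"^%s=(y|m)\s*$" % re.escape(option), re.MULTILINE)
    return pattern.search(config_text) is not None


# Made-up sample config text, not taken from the machines in this thread.
sample = """\
CONFIG_XEN=y
# CONFIG_XEN_DEBUG_FS is not set
CONFIG_XEN_SAVE_RESTORE=y
"""

print(kernel_option_enabled(sample, "CONFIG_XEN_SAVE_RESTORE"))
```

On a real dom0 or domU the text would have to come from wherever that kernel's config is kept, for example /proc/config.gz if the kernel was built to expose it; that location is an assumption about the build, not something stated in this thread.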