From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: xen/stable-2.6.32.x xen-4.1.1 live migration fails with kernels 2.6.39, 3.0.3 and 3.1-rc2
Date: Mon, 29 Aug 2011 15:49:11 -0400
Message-ID: <20110829194911.GC16530@dumpdata.com>
References: <4E4EA3E2.2040809@leuphana.de>
 <4E52224902000078000525CC@nat28.tlf.novell.com>
 <4E52601B.5060609@leuphana.de>
 <20110824203435.GA27865@dumpdata.com>
 <4E55F682.8060405@leuphana.de>
 <20110826150054.GA1793@dumpdata.com>
 <4E57D745.2080701@leuphana.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4E57D745.2080701@leuphana.de>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andreas Olsowski
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On Fri, Aug 26, 2011 at 07:26:29PM +0200, Andreas Olsowski wrote:
> >My todo list is not getting any shorter sadly so not sure when I will
> >get to try this out. But let me do that when I get my 32GB machine
> >working again.
> It would certainly be interesting to know if you experience the same
> thing on your platforms. This may or may not have something to do
> with the hardware in play.

OK, got my box online. Getting closer to trying to reproduce the problem.

> >
> >
> >Yeah, that really points to either the tools not liking the
> >MFN being too high or the hypervisor. Or the save/resume path in the
> >Linux kernel is failing silently and sticking in invalid MFNs
> >as it can't deal with higher MFNs.
> >
> >In other words - need to run this to figure out.
> >
> >Unless you are up for helping out by debugging the code a bit and
> >seeing if you can come up with a fix?
>
> Although I am willing, I probably won't be able to, since I lack the
> necessary understanding of the low-level workings of Xen and I am
> not very experienced at debugging C code/programs.

OK.
>
>
> However, I did some additional testing, this time with Xen 4.2, and
> things have gotten worse:

Yeah, xen-unstable past c/s 23379 is doing a lot of weird stuff for me.

>
> The two servers involved BOTH have 96GB RAM and are both running
> the latest Xen 4.2, but are of different hardware (R710 and R610):
> http://pastebin.com/AaSpWZdg
>
> And this is what happens when I throw a 32GB server (PE2950) in the mix:
> http://pastebin.com/7X8t022R
>
> So with 4.2 there are still migration errors, but what's worse, now I
> can't migrate anything anywhere anymore when the platform is
> different.
>
> Within the same platform everything works fine (2x R610):
> http://pastebin.com/ZWByjjY5
>
> What is going on here?

Development - and not all developers test everything in the mix.