From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Byrne
Subject: Re: Stability of migration?
Date: Tue, 13 Jun 2006 18:12:25 -0700
Message-ID: <448F6279.90502@hp.com>
References: <448F5934.1010805@hp.com>
In-Reply-To: <448F5934.1010805@hp.com>
To: xen-devel
List-Id: xen-devel@lists.xenproject.org

I should have made clear that I am testing on x86_64.

John

John Byrne wrote:
>
> Hi,
>
> With xen-unstable changeset 10333:360f9dc71f51, live migration is not
> reliable. Migrating an active domain (I use a kernel build in my test)
> back and forth between two machines results in either the build or the
> domain crashing. I tweaked xc_linux_save.c to enable the verify pass
> without outputting all the debugging messages, and the log shows that
> one or two pages fail the data match.
>
> I have yet to see a domain fail with non-live migration, but I
> sometimes see a data mismatch on a page during the verification. That
> would indicate that either suspend doesn't mean what I think it does, or
> pages of a suspended VM are being altered when they shouldn't be.
>
> So, I guess I'll start with the easy question: should non-live migration
> ever have a page fail to verify? If not, how can I identify the source
> of the problem?
>
> The harder question: how do I identify the source of the corruption in
> live migration?
>
> Thanks,
>
> John Byrne
>