From: Keir Fraser
Subject: Re: slow live migration / xc_restore on xen4 pvops
Date: Thu, 3 Jun 2010 16:18:39 +0100
In-Reply-To: <20100603150305.GA53591@zanzibar.domain.invalid>
To: Brendan Cully, Ian Jackson
Cc: "xen-devel@lists.xensource.com", Andreas Olsowski, "Zhai, Edwin"
List-Id: xen-devel@lists.xenproject.org

On 03/06/2010 16:03, "Brendan Cully" wrote:
> I see no evidence that Remus has anything to do with the live
> migration performance regression discussed in this thread, and I
> haven't seen any other reported issues either. I think the mlock issue
> is a much more likely candidate.

I agree: it's probably the lack of batching plus expensive mlocks. The
performance difference between the machines under test is either because
one runs out of 2MB superpage extents before the other (for some reason),
or because mlock operations are much more likely to take a slow path in
the kernel (possibly including disk I/O) on one of them.

We need to get batching back, and Edwin is on the case for that; I hope
Andreas will try out Edwin's patch to that end. We can also reduce mlock
cost by mlocking some of the domain_restore arrays across the entire
restore operation, I should imagine.

 -- Keir
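
As a rough illustration of the two ideas above -- batching the page-data
requests and mlocking the restore buffers once for the whole operation
rather than around every hypercall -- here is a minimal sketch in C. It is
not the actual libxc code and not Edwin's patch; restore_pages(),
restore_batch_hypercall() and the batch size are assumed names chosen for
the example.

    /*
     * Sketch only:
     *  1. mlock the long-lived batch buffer once, up front, instead of
     *     locking/unlocking around every hypercall; and
     *  2. batch page requests so per-call overhead (including any mlock
     *     slow path) is paid once per ~1024 pages rather than per page.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <sys/mman.h>

    #define MAX_BATCH 1024   /* pages per batched request (illustrative) */

    typedef uint64_t xen_pfn_t;

    /* Placeholder for whatever batched restore hypercall the real patch uses. */
    static int restore_batch_hypercall(const xen_pfn_t *pfns, size_t count)
    {
        (void)pfns;
        (void)count;
        /* ... issue one hypercall covering 'count' pages ... */
        return 0;
    }

    int restore_pages(const xen_pfn_t *all_pfns, size_t npages)
    {
        xen_pfn_t *batch;
        size_t i, n = 0;
        int rc = 0;

        batch = malloc(MAX_BATCH * sizeof(*batch));
        if (!batch)
            return -1;

        /*
         * Lock the batch buffer once for the whole restore, rather than
         * relying on an mlock/munlock pair inside every hypercall wrapper.
         */
        if (mlock(batch, MAX_BATCH * sizeof(*batch)) != 0) {
            perror("mlock");
            free(batch);
            return -1;
        }

        for (i = 0; i < npages; i++) {
            batch[n++] = all_pfns[i];
            if (n == MAX_BATCH || i == npages - 1) {
                if (restore_batch_hypercall(batch, n) < 0) {
                    rc = -1;
                    break;
                }
                n = 0;
            }
        }

        munlock(batch, MAX_BATCH * sizeof(*batch));
        free(batch);
        return rc;
    }

The point is simply that each mlock/munlock pair is paid once per restore
(or once per large batch) instead of once per page, which is where the
slow-path cost discussed in this thread shows up.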