Date: Mon, 5 Nov 2012 11:30:06 +1100
From: David Gibson
To: Orit Wasserman
Cc: aik@ozlabs.ru, qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] Testing migration under stress
Message-ID: <20121105003006.GW27695@truffula.fritz.box>
In-Reply-To: <5093B8A9.4060501@redhat.com>

On Fri, Nov 02, 2012 at 02:12:25PM +0200, Orit Wasserman wrote:
> On 11/02/2012 05:10 AM, David Gibson wrote:
> > Asking for some advice on the list.
> >
> > I have prototype savevm and migration support ready for the pseries
> > machine.  They seem to work under simple circumstances (idle guest).
> > To test them more extensively I've been attempting to perform live
> > migrations (just over tcp to localhost) while the guest is active
> > with something.  In particular I've tried while using octave to do a
> > matrix multiply (so exercising the FP unit), and my colleague Alexey
> > has tried during some video encoding.
>
> As you are doing local migration, one option is to set the speed
> higher than line speed, since we don't actually send the data over a
> real link; another is to set a high downtime.

I'm not entirely sure what you mean by that.  But I do have
suspicions, based on this and other factors, that the default
bandwidth it's throttling to is horribly, horribly low.  (A sketch of
the monitor commands involved is below, after the sig.)

> > However, in each of these cases, we've found that the migration only
> > completes and the source instance only stops after the intensive
> > workload has (just) completed.  What I surmise is happening is that
> > the workload is touching memory pages fast enough that the ram
> > migration code never gets below the threshold needed to complete the
> > migration until the guest is idle again.
>
> The workload you chose is really bad for live migration, as all the
> guest does is dirty its memory.

Well, I realised that was true of the matrix multiply.  For the video
encode, though, the output data should be much, much smaller than the
input, so I wouldn't expect it to be dirtying memory that fast.  (The
convergence arithmetic behind this is sketched below as well.)

> I recommend looking for a workload that does some networking or disk
> IO.  Vinod succeeded in running the SwingBench and SLOB benchmarks,
> which converged ok.  I don't know whether they run on pseries, but a
> similar workload (small database/warehouse) should be ok.  We found
> that SpecJbb, on the other hand, is hard to converge.  Web workloads
> or video streaming also do the trick.

Hrm.  As something really simple and stupid, I did try migrating
during an ls -lR /, but even that didn't converge :/.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
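
For concreteness, here is a minimal monitor session on the source QEMU
along the lines Orit suggests.  The command names are standard HMP of
this era (migrate_set_speed, migrate_set_downtime); the 1g cap, the
2-second downtime, and port 4444 are illustrative values, not taken
from the thread:

    (qemu) migrate_set_speed 1g
    (qemu) migrate_set_downtime 2
    (qemu) migrate -d tcp:localhost:4444
    (qemu) info migrate

migrate_set_speed lifts the transfer throttle; the default cap in QEMU
of this vintage was around 32 MiB/s, which would fit "horribly,
horribly low".  migrate_set_downtime raises the allowed stopped time
(in seconds), lowering the bar the dirty-page backlog has to get
under.  Polling info migrate shows the transferred/remaining byte
counts, so you can watch whether the backlog is actually shrinking or
just churning.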
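
The convergence argument can also be made concrete.  Pre-copy
migration can finish only once the remaining dirty backlog could be
flushed within the allowed downtime; a guest that dirties pages faster
than the link drains them never lets the backlog shrink that far.  A
simplified sketch of the check, with illustrative names rather than
QEMU's actual code, assuming the ~32 MiB/s default cap and a 30 ms
downtime budget:

    /* convergence.c -- simplified pre-copy completion check (sketch only,
     * not QEMU's actual code; the defaults below are assumptions). */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static bool migration_can_complete(uint64_t remaining_dirty_bytes,
                                       uint64_t bandwidth_bytes_per_sec,
                                       double max_downtime_sec)
    {
        /* Time to flush the whole backlog with the guest paused. */
        double expected_downtime =
            (double)remaining_dirty_bytes / (double)bandwidth_bytes_per_sec;
        return expected_downtime <= max_downtime_sec;
    }

    int main(void)
    {
        uint64_t cap = 32ULL << 20;   /* assumed default throttle, bytes/s */
        double downtime = 0.030;      /* assumed default downtime budget, s */

        /* The backlog must fall below cap * downtime -- under ~1 MiB. */
        printf("backlog must drop below %.0f KiB\n", cap * downtime / 1024.0);
        printf("64 MiB backlog completes? %s\n",
               migration_can_complete(64ULL << 20, cap, downtime)
                   ? "yes" : "no");
        return 0;
    }

At those defaults the guest has to get its instantaneous dirty
footprint under about a megabyte, which an octave matrix multiply or a
video encoder will essentially never do -- matching the behaviour seen
above.  Raising the speed cap raises the drain rate; raising the
downtime raises the threshold; either can tip a borderline workload
into converging.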
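
Finally, for a stress load whose dirty rate you can actually control,
something like the following (a hypothetical test harness, not from
the thread or from QEMU) run inside the guest lets you hunt for the
convergence threshold empirically:

    /* dirty.c -- dirty guest pages at a controlled rate.
     * Hypothetical harness for probing migration convergence.
     * Usage: ./dirty [working-set MiB] [dirty rate MiB/s]
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        size_t mib  = argc > 1 ? strtoul(argv[1], NULL, 0) : 256;
        size_t rate = argc > 2 ? strtoul(argv[2], NULL, 0) : 64;
        const size_t page = 4096;
        size_t len = mib << 20;
        size_t npages = len / page;
        char *buf = malloc(len);

        if (!buf) {
            perror("malloc");
            return 1;
        }
        memset(buf, 1, len);          /* fault the working set in up front */

        size_t pages_per_sec = (rate << 20) / page;
        for (unsigned char v = 2; ; v++) {
            for (size_t i = 0; i < pages_per_sec; i++) {
                /* One byte written dirties the whole page. */
                buf[(size_t)random() % npages * page] = v;
            }
            sleep(1);                 /* crude pacing: burst, then rest */
        }
    }

Start it at a low MiB/s, kick off the migration, and watch info
migrate; step the rate up between runs, and the point where the
remaining count stops falling should track the effective migration
bandwidth fairly closely.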