From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:59976)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1TVAaS-0007Yk-4k
	for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:30:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1TVAaQ-00072V-EZ
	for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:30:44 -0500
Received: from ozlabs.org ([2402:b800:7003:1:1::1]:55758)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgibson@ozlabs.org>) id 1TVAaP-00070e-UV
	for qemu-devel@nongnu.org; Sun, 04 Nov 2012 19:30:42 -0500
Date: Mon, 5 Nov 2012 11:31:58 +1100
From: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20121105003158.GX27695@truffula.fritz.box>
References: <20121102031011.GM27695@truffula.fritz.box>
	<87sj8sgnku.fsf@elfo.mitica>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87sj8sgnku.fsf@elfo.mitica>
Subject: Re: [Qemu-devel] Testing migration under stress
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Juan Quintela <quintela@redhat.com>
Cc: aik@ozlabs.ru, qemu-devel@nongnu.org

On Fri, Nov 02, 2012 at 02:07:45PM +0100, Juan Quintela wrote:
> David Gibson <david@gibson.dropbear.id.au> wrote:
> > Asking for some advice on the list.
> >
> > I have prototype savevm and migration support ready for the pseries
> > machine.  They seem to work under simple circumstances (idle guest).
> > To test them more extensively I've been attempting to perform live
> > migrations (just over tcp->localhost) while the guest is active with
> > something.  In particular I've tried while using octave to do matrix
> > multiply (so exercising the FP unit) and my colleague Alexey has tried
> > during some video encoding.
> >
> > However, in each of these cases, we've found that the migration only
> > completes and the source instance only stops after the intensive
> > workload has (just) completed.  What I surmise is happening is that
> > the workload is touching memory pages fast enough that the ram
> > migration code is never getting below the threshold to complete the
> > migration until the guest is idle again.
> >
> > Does anyone have some ideas for testing this better: workloads that
> > are less likely to trigger this behaviour, or settings to tweak in the
> > migration itself to make it more likely to complete migration while
> > the workload is still active.
> 
> You can:
> 
> migrate_set_downtime 2s (or so)
> 
> I normally run stress, and you move the memory that it dirties until it
> converges (depends a lot on your networking).

I'm using TCP to localhost, so it should be really fast, but it
doesn't seem to be :/.  I suspect there are some other bugs here.

> Doing anything that is really memory intensive is basically never going
> to converge.

Well, I didn't think the loads I chose would be memory limited
(especially the video encode), but...
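
To make the convergence argument above concrete, here is a toy model
of iterative pre-copy.  It is only a sketch, not QEMU's actual code:
the function name, its parameters and every number in the usage lines
are made up for illustration, and it assumes a constant dirty rate and
a constant transfer bandwidth, which a real guest will not have.

```python
def precopy_rounds(ram_bytes, dirty_rate, bandwidth, max_downtime,
                   max_rounds=30):
    """Toy pre-copy model (hypothetical, not QEMU's algorithm).

    ram_bytes    -- guest RAM size in bytes
    dirty_rate   -- bytes the guest dirties per second (assumed constant)
    bandwidth    -- migration throughput in bytes/second (assumed constant)
    max_downtime -- allowed stop-and-copy time in seconds

    Returns the number of pre-copy rounds done before the remaining data
    fits in max_downtime, or None if that never happens in max_rounds.
    """
    remaining = ram_bytes  # first round has to send all of RAM
    for rounds in range(max_rounds + 1):
        # Converged: what is left can be sent while the guest is paused.
        if remaining / bandwidth <= max_downtime:
            return rounds
        # While this round is on the wire, the guest keeps dirtying pages.
        transfer_time = remaining / bandwidth
        # Cannot have more dirty data than there is RAM.
        remaining = min(ram_bytes, dirty_rate * transfer_time)
    return None


MiB = 2 ** 20

# Illustrative numbers only: 1 GiB guest, 100 MiB/s link.
idle = precopy_rounds(1024 * MiB, 1 * MiB, 100 * MiB, 0.03)
busy = precopy_rounds(1024 * MiB, 200 * MiB, 100 * MiB, 0.03)
relaxed = precopy_rounds(1024 * MiB, 50 * MiB, 100 * MiB, 2.0)
print(idle, busy, relaxed)
```

In this model the remaining data shrinks each round by the ratio
dirty_rate / bandwidth, so it only converges when the guest dirties
memory more slowly than the link can carry it.  A larger allowed
downtime (as with migrate_set_downtime) just means fewer rounds are
needed before the final stop-and-copy.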

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson