Date: Thu, 19 Mar 2015 14:40:18 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20150319144018.GG2409@work-vm>
References: <1426582438-9698-1-git-send-email-liang.z.li@intel.com> <87wq2fkelb.fsf@neno.neno> <20150318111709.GB4576@noname.redhat.com> <20150318165549.GI2355@work-vm>
Subject: Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Kevin Wolf, qemu-block@nongnu.org, quintela@redhat.com, qemu-devel@nongnu.org, "Zhang, Yang Z", amit.shah@redhat.com

* Li, Liang Z (liang.z.li@intel.com) wrote:
> > * Li, Liang Z (liang.z.li@intel.com) wrote:
> > > > > > First, an explanation of why I think this doesn't fix the full
> > > > > > problem. With this patch, we fix the problem where we have a
> > > > > > dirty block layer but basically nothing dirtying the memory on
> > > > > > the guest: we move the 20 seconds of block-layer flush from
> > > > > > max_downtime to the point where we have decided that the amount
> > > > > > of dirty memory is small enough to be transferred during
> > > > > > max_downtime. But it is still going to take 20 seconds to flush
> > > > > > the block layer, and during those 20 seconds the amount of
> > > > > > memory that can be dirtied is HUGE.
> > > > >
> > > > > It's true.
> > > >
> > > > What kind of cache is it actually that takes 20s to flush here?
> > >
> > > I run a script in the guest which does a dd operation, like this:
> > >
> > > #!/bin/sh
> > > for i in {1..1000000}
> > > do
> > >     time dd if=/dev/zero of=/time.bdf bs=4k count=200000
> > >     rm /time.bdf
> > > done
> > >
> > > It's an extreme case.
> >
> > With what qemu options for the device, and what was your device backed by?
>
> Very simple:
> ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none rhel6u5.img -monitor stdio
>
> And it's a local migration. I will do the test between two physical machines later.

OK, but for shared storage you would have to add cache=none (or something
like that), so that would change the behaviour anyway.

Dave

> Liang

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
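
To make Dave's suggestion concrete, here is a minimal sketch of what the
shared-storage test between two physical machines might look like. The
-drive spelling, the port number 4444, and the <dest-host> placeholder are
illustrative assumptions, not commands taken from the thread; only the
image name and machine sizing are carried over from Liang's example.

    # Source host: Liang's command line, but with the image opened
    # cache=none so guest writes bypass the host page cache, which is
    # the usual requirement when two hosts share one backing image.
    ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none \
        -drive file=rhel6u5.img,cache=none -monitor stdio

    # Destination host: the same command line plus -incoming, so the
    # VM starts paused and waits for the migration stream.
    ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none \
        -drive file=rhel6u5.img,cache=none -monitor stdio \
        -incoming tcp:0:4444

    # In the source monitor, start the migration in the background:
    (qemu) migrate -d tcp:<dest-host>:4444

With cache=none the guest's dd writes go straight to the backing storage
rather than accumulating in the host page cache, which is why Dave expects
the flush-time behaviour under discussion to change in this configuration.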