Date: Thu, 19 Mar 2015 14:40:18 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20150319144018.GG2409@work-vm>
References: <1426582438-9698-1-git-send-email-liang.z.li@intel.com> <87wq2fkelb.fsf@neno.neno> <20150318111709.GB4576@noname.redhat.com> <20150318165549.GI2355@work-vm>
Subject: Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Kevin Wolf, qemu-block@nongnu.org, quintela@redhat.com, qemu-devel@nongnu.org, "Zhang, Yang Z", amit.shah@redhat.com

* Li, Liang Z (liang.z.li@intel.com) wrote:
> > * Li, Liang Z (liang.z.li@intel.com) wrote:
> > > > > > First, an explanation of why I think this doesn't fix the full
> > > > > > problem. With this patch, we fix the problem where we have a
> > > > > > dirty block layer but basically nothing dirtying the memory on
> > > > > > the guest: we move the 20 seconds of block-layer flush from
> > > > > > max_downtime to the point where we have decided that the amount
> > > > > > of dirty memory is small enough to be transferred during
> > > > > > max_downtime. But it is still going to take 20 seconds to flush
> > > > > > the block layer, and during those 20 seconds the amount of
> > > > > > memory that can be dirtied is HUGE.
> > > > >
> > > > > It's true.
> > > >
> > > > What kind of cache is it actually that takes 20s to flush here?
> > >
> > > I run a script in the guest which does a dd operation, like this:
> > >
> > > #!/bin/sh
> > > for i in {1..1000000}
> > > do
> > >     time dd if=/dev/zero of=/time.bdf bs=4k count=200000
> > >     rm /time.bdf
> > > done
> > >
> > > It's an extreme case.
> >
> > With what qemu options for the device, and what was your device backed by?
>
> Very simple:
> ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none rhel6u5.img -monitor stdio
>
> And it's a local migration. I will do the test between two physical machines later.

OK, but for shared storage you would have to add cache=none (or something
like that), so that would change the behaviour anyway.

Dave

> Liang

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
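
To make Dave's suggestion concrete, here is a minimal sketch of what the
shared-storage test between two physical machines might look like. The
-drive spelling, the port number 4444, and the <dest-host> placeholder are
illustrative assumptions, not commands taken from the thread; only the
image name and machine sizing are carried over from Liang's example.

    # Source host: Liang's command line, but with the image opened
    # cache=none so guest writes bypass the host page cache, which is
    # the usual requirement when two hosts share one backing image.
    ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none \
        -drive file=rhel6u5.img,cache=none -monitor stdio

    # Destination host: the same command line plus -incoming, so the
    # VM starts paused and waits for the migration stream.
    ./qemu-system-x86_64 -enable-kvm -smp 4 -m 4096 -net none \
        -drive file=rhel6u5.img,cache=none -monitor stdio \
        -incoming tcp:0:4444

    # In the source monitor, start the migration in the background:
    (qemu) migrate -d tcp:<dest-host>:4444

With cache=none the guest's dd writes go straight to the backing storage
rather than accumulating in the host page cache, which is why Dave expects
the flush-time behaviour under discussion to change in this configuration.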