From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin Wolf
Subject: Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions
Date: Thu, 10 Nov 2011 11:41:32 +0100
Message-ID: <4EBBAA5C.9010505@redhat.com>
References: <4EBAAA68.10801@redhat.com> <4EBAACAF.4080407@codemonkey.ws> <4EBAB236.2060409@redhat.com> <4EBAB9FA.3070601@codemonkey.ws> <20111109201836.GA28457@redhat.com> <4EBAE0EA.1030405@codemonkey.ws> <20111109210052.GB28599@redhat.com> <4EBAEA33.9090709@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: "Michael S. Tsirkin" , Lucas Meneghel Rodrigues , KVM mailing list , Juan Jose Quintela Carreira , Marcelo Tosatti , QEMU devel , Avi Kivity
To: Anthony Liguori
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:32230 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751479Ab1KJKi3 (ORCPT ); Thu, 10 Nov 2011 05:38:29 -0500
In-Reply-To: <4EBAEA33.9090709@codemonkey.ws>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 09.11.2011 22:01, Anthony Liguori wrote:
> On 11/09/2011 03:00 PM, Michael S. Tsirkin wrote:
>> On Wed, Nov 09, 2011 at 02:22:02PM -0600, Anthony Liguori wrote:
>>> On 11/09/2011 02:18 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Nov 09, 2011 at 11:35:54AM -0600, Anthony Liguori wrote:
>>>>> On 11/09/2011 11:02 AM, Avi Kivity wrote:
>>>>>> On 11/09/2011 06:39 PM, Anthony Liguori wrote:
>>>>>>>
>>>>>>> Migration with qcow2 is not a supported feature for 1.0. Migration is
>>>>>>> only supported with raw images using coherent shared storage[1].
>>>>>>>
>>>>>>> [1] NFS is only coherent with close-to-open which right now is not
>>>>>>> good enough for migration.
>>>>>>
>>>>>> Say what?
>>>>>
>>>>> Due to block format probing, we read at least the first sector of
>>>>> the disk during start up.
>>>>
>>>> A simple solution is not to do any probing before the VM is first
>>>> started on the incoming path.
>>>>
>>>> Any issues with this?
>>>>
>>>
>>> http://mid.gmane.org/1284213896-12705-4-git-send-email-aliguori@us.ibm.com
>>> I think Kevin wanted open to get delayed.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>
>> So, this patchset just needs to be revived and polished up?
>
> What I took from the feedback was that Kevin wanted to defer open until the
> device model started. That eliminates the need to reopen or have a invalidation
> callback.
>
> I think it would be good for Kevin to comment here though because I might have
> misunderstood his feedback.

Your approach was to delay reads, but still keep the image open. I think
I worried that we might have additional reads somewhere that we don't
know about, and this is why I proposed delaying the open as well, so
that any read would always fail.

I believe just reopening the image is (almost?) as good and it's way
easier to do, so I would be inclined to do that for 1.0.

I'm not 100% sure about cases like iSCSI, where reopening doesn't help.
I think delaying the open doesn't help there either: if you migrate from
A to B and then back from B to A, you could still get old data. So for
iSCSI, cache=none probably remains the only safe choice, whatever we do.

Kevin
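
To make the stale-read concern concrete, here is a toy sketch in Python of why reopening helps. It is not QEMU code: `ToyImage`, `load_header`, `reopen`, and `write_header` are hypothetical names invented for illustration, and a temporary local file stands in for shared storage. The assumption is simply that the destination caches the first sector when it opens the image (as with format probing), the source then changes that sector, and the destination reopens before it starts running.

```python
import os
import tempfile

SECTOR_SIZE = 512


class ToyImage:
    """Hypothetical stand-in for an opened disk image; not a QEMU API."""

    def __init__(self, path):
        self.path = path
        self.cached_header = b""
        self.load_header()

    def load_header(self):
        # Like format probing, opening reads and caches the first sector.
        with open(self.path, "rb") as f:
            self.cached_header = f.read(SECTOR_SIZE)

    def reopen(self):
        # Discard whatever was cached by the early open and read it again.
        self.load_header()

    def header(self):
        # Served from the cache, so it can be stale if the file changed.
        return self.cached_header.rstrip(b"\0")


def write_header(path, payload):
    # Simulates the migration source updating metadata on shared storage.
    with open(path, "r+b") as f:
        f.write(payload.ljust(SECTOR_SIZE, b"\0"))


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_header(path, b"old metadata")
        dest = ToyImage(path)                # destination opens early
        write_header(path, b"new metadata")  # source keeps writing
        stale = dest.header()                # still the old sector
        dest.reopen()                        # reopen once migration completes
        fresh = dest.header()
        return stale, fresh
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The sketch only covers data cached inside the process. It says nothing about caches below it (host page cache, NFS client, iSCSI target), which is where the cache=none caveat above applies.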