From mboxrd@z Thu Jan  1 00:00:00 1970
From: Juan Quintela <quintela@redhat.com>
Subject: Re: qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions
Date: Thu, 10 Nov 2011 17:50:59 +0100
Message-ID: <m3obwj7wgc.fsf@neno.neno>
References: <4EBAAA68.10801@redhat.com> <4EBAACAF.4080407@codemonkey.ws>
	<4EBAB236.2060409@redhat.com> <4EBAB9FA.3070601@codemonkey.ws>
	<20111109201836.GA28457@redhat.com> <4EBAE0EA.1030405@codemonkey.ws>
	<20111109210052.GB28599@redhat.com> <4EBAEA33.9090709@codemonkey.ws>
	<4EBBAA5C.9010505@redhat.com>
Reply-To: quintela@redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Anthony Liguori <anthony@codemonkey.ws>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Lucas Meneghel Rodrigues <lmr@redhat.com>,
	KVM mailing list <kvm@vger.kernel.org>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	QEMU devel <qemu-devel@nongnu.org>, Avi Kivity <avi@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:30914 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S935797Ab1KJQwX (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 10 Nov 2011 11:52:23 -0500
In-Reply-To: <4EBBAA5C.9010505@redhat.com> (Kevin Wolf's message of "Thu, 10
	Nov 2011 11:41:32 +0100")
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Kevin Wolf <kwolf@redhat.com> wrote:

>> What I took from the feedback was that Kevin wanted to defer open until the 
>> device model started.  That eliminates the need to reopen or have a invalidation 
>> callback.
>> 
>> I think it would be good for Kevin to comment here though because I might have 
>> misunderstood his feedback.
>
> Your approach was to delay reads, but still keep the image open. I think
> I worried that we might have additional reads somewhere that we don't
> know about, and this is why I proposed delaying the open as well, so
> that any read would always fail.
>
> I believe just reopening the image is (almost?) as good and it's way
> easier to do, so I would be inclined to do that for 1.0.
>
> I'm not 100% sure about cases like iscsi, where reopening doesn't help.
> I think delaying the open doesn't help there either if you migrate from
> A to B and then back from B to A, you could still get old data. So for
> iscsi probably cache=none remains the only safe choice, whatever we do.

iSCSI and NFS only work with cache=none.  Even on NFS with close+open,
we have trouble if anything else has the file open (think libvirt,
guestfs, whatever).  I really think that anything other than
cache=none for iSCSI or NFS is just gambling (and yes, it took a while
for Christoph to convince me; I was trying to build a "poor man's"
distributed lock manager, and as everybody knows, that is a _difficult_
problem to solve).

Later, Juan.