From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:40399)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RQJV4-0001yk-14
	for qemu-devel@nongnu.org; Tue, 15 Nov 2011 08:56:37 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RQJV3-0004LP-4W
	for qemu-devel@nongnu.org; Tue, 15 Nov 2011 08:56:34 -0500
Received: from mail-iy0-f173.google.com ([209.85.210.173]:46199)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RQJV3-0004LD-1M
	for qemu-devel@nongnu.org; Tue, 15 Nov 2011 08:56:33 -0500
Received: by iakk32 with SMTP id k32so9970104iak.4
	for <qemu-devel@nongnu.org>; Tue, 15 Nov 2011 05:56:32 -0800 (PST)
Message-ID: <4EC26F8C.10102@codemonkey.ws>
Date: Tue, 15 Nov 2011 07:56:28 -0600
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <4EBAAA68.10801@redhat.com>
	<4EBAACAF.4080407@codemonkey.ws>	<4EBAB236.2060409@redhat.com>
	<4EBAB9FA.3070601@codemonkey.ws>	<4EBB919B.7040605@redhat.com>
	<4EBC1792.3030004@codemonkey.ws>	<4EBC4260.1090405@codemonkey.ws>
	<4EBCF5DA.1000605@redhat.com>	<4EBE499E.4030100@redhat.com>
	<20111114101610.GA32392@redhat.com>	<4EC1238B.2030906@codemonkey.ws>
	<874ny5iktv.fsf@trasno.mitica>
In-Reply-To: <874ny5iktv.fsf@trasno.mitica>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress
 introduces qcow2 corruptions
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: quintela@redhat.com
Cc: Kevin Wolf <kwolf@redhat.com>, Lucas Meneghel Rodrigues <lmr@redhat.com>, KVM mailing list <kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>, "libvir-list@redhat.com" <libvir-list@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>, QEMU devel <qemu-devel@nongnu.org>, Avi Kivity <avi@redhat.com>

On 11/15/2011 07:20 AM, Juan Quintela wrote:
>> Again, I think defaulting DAS to cache=none|directsync is what makes
>> the most sense here.
>
> I think it is the only sane solution.  Otherwise, we need to write the
> equivalent of a lock manager, to know _who_ has the storage, and
> distributed lock managers are a mess :-(
>
>> We can even add a migration blocker for DAS with cache=on.  If we can
>> do dynamic toggling of the cache setting, then that's pretty friendly
>> at the end of the day.
>
> That could fix the problem also.  At the moment that we start migration,
> we do an fsync() + switch to O_DIRECT for all filesystems.
>
> As you said, time for implementing fcntl(O_DIRECT).

Yeah, I think this ends up being a very elegant solution.

We always open block devices O_DIRECT to start with.  That ensures reads go
directly to disk if it's DAS, or result in NFS protocol reads otherwise.

As long as we fsync on the source (and we do), then we're okay.

For cache=write{back,through}, we would then just fcntl() away O_DIRECT as soon
as we start the guest.  From then on, reads can go through the page cache.
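
A rough sketch of the toggling, to make it concrete.  This assumes Linux,
where fcntl(F_SETFL) is allowed to change O_DIRECT on an open descriptor.
set_direct_io() is a made-up helper name for illustration, not an existing
QEMU function.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* O_DIRECT is only exposed by <fcntl.h> with this set */
#endif
#include <fcntl.h>

/*
 * Hypothetical helper (not an existing QEMU function): turn O_DIRECT on or
 * off for an already-open descriptor.  Returns 0 on success, -1 with errno
 * set on failure.
 */
static int set_direct_io(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1) {
        return -1;
    }
    if (enable) {
        flags |= O_DIRECT;   /* bypass the page cache */
    } else {
        flags &= ~O_DIRECT;  /* allow reads through the page cache */
    }
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

With cache=write{back,through}, the destination would call
set_direct_io(fd, 0) once the guest is running.  A source that wants to get
back to direct I/O before migrating would fsync(fd) and then call
set_direct_io(fd, 1).  Turning O_DIRECT on can fail with EINVAL if the
filesystem doesn't support direct I/O, so the caller has to check the return
value.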

Regards,

Anthony Liguori

> Later, Juan.
>