From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1KpS40-0006Va-M6
	for qemu-devel@nongnu.org; Mon, 13 Oct 2008 14:22:40 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1KpS3x-0006TZ-7y
	for qemu-devel@nongnu.org; Mon, 13 Oct 2008 14:22:39 -0400
Received: from [199.232.76.173] (port=33307 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1KpS3w-0006TI-W5
	for qemu-devel@nongnu.org; Mon, 13 Oct 2008 14:22:37 -0400
Received: from mail2.shareable.org ([80.68.89.115]:34351)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <jamie@shareable.org>) id 1KpS3w-0003o6-Ha
	for qemu-devel@nongnu.org; Mon, 13 Oct 2008 14:22:36 -0400
Date: Mon, 13 Oct 2008 19:22:31 +0100
From: Jamie Lokier <jamie@shareable.org>
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Message-ID: <20081013182231.GA6369@shareable.org>
References: <48EE38B9.2050106@codemonkey.ws> <48F38C5E.1080504@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <48F38C5E.1080504@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: qemu-devel@nongnu.org
Cc: Chris Wright <chrisw@redhat.com>, Mark McLoughlin <markmc@redhat.com>, Ryan Harper <ryanh@us.ibm.com>, kvm-devel <kvm-devel@lists.sourceforge.net>, Laurent Vivier <Laurent.Vivier@bull.net>

Rik van Riel wrote:
> >When cache=on, read requests may not actually go to the disk.  If a 
> >previous read request (by some application on the system) has read the 
> >same data, then it becomes a simple memcpy().  Also, the host IO 
> >scheduler may do read ahead which means that the data may be available 
> >from that. 
> 
> This can be as much of a data integrity problem as
> asynchronous writes, if various qemu/kvm guests are
> accessing the same disk image with a cluster filesystem
> like GFS.

If multiple qemu/kvm guests access the same disk image in a cluster,
ordinary cached reads should be fine, provided the host cluster
filesystem uses a fully coherent protocol (so not NFS, for example).

The behaviour should be equivalent to a "virtual SAN".

(Btw, some other OSes have an O_RSYNC flag to force reads to hit the
media, much as O_DSYNC forces writes to.  That might be relevant to
accessing a disk image file on non-coherent cluster filesystems, but I
wouldn't recommend that.)
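
For illustration only, here is a rough sketch of how those two flags
are passed to open().  It is not something QEMU does, and the helper
name write_then_read is made up for this example.  It assumes a POSIX
host; where a flag is not defined it falls back to a plain open.  Note
that POSIX defines O_RSYNC only in combination with O_DSYNC or O_SYNC,
and that as far as I know Linux does not implement O_RSYNC separately
(glibc gives it the same value as O_SYNC), so there it will not bypass
the host page cache.

```python
import os
import tempfile

# Not every platform defines these flags; fall back to 0 (no extra flag).
O_DSYNC = getattr(os, "O_DSYNC", 0)
O_RSYNC = getattr(os, "O_RSYNC", 0)


def write_then_read(path, payload):
    """Hypothetical helper: write with O_DSYNC, read back with O_RSYNC."""
    # O_DSYNC asks that each write return only once the data is on
    # stable storage (how far that holds depends on the filesystem).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_DSYNC, 0o600)
    try:
        view = memoryview(payload)
        while view:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
    finally:
        os.close(fd)

    # O_RSYNC is combined with O_DSYNC, as POSIX describes it.
    fd = os.open(path, os.O_RDONLY | O_RSYNC | O_DSYNC)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:  # empty read means end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image = os.path.join(tmp, "disk.img")
        print(write_then_read(image, b"sector 0 contents"))
```

Even where O_RSYNC is honoured, it only affects this one host's view
of the file; it does not make a non-coherent cluster filesystem
coherent.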

-- Jamie