From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=48338 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ouk5M-0000xR-2e
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:47:01 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <avi@redhat.com>) id 1Ouk5K-0004hN-Ox
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:46:59 -0400
Received: from mx1.redhat.com ([209.132.183.28]:41745)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <avi@redhat.com>) id 1Ouk5K-0004hB-9x
	for qemu-devel@nongnu.org; Sun, 12 Sep 2010 06:46:58 -0400
Message-ID: <4C8CAF9C.8090903@redhat.com>
Date: Sun, 12 Sep 2010 12:46:52 +0200
From: Avi Kivity <avi@redhat.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC][PATCH 0/3] Fix caching issues with live
	migration
References: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com>
In-Reply-To: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Juan Quintela <quintela@redhat.com>, qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

On 09/11/2010 05:04 PM, Anthony Liguori wrote:
> Today, live migration only works when using shared storage that is fully
> cache coherent using raw images.
>
> The failure case with weakly coherent storage (e.g. NFS) is subtle but nonetheless
> exists.  NFS only guarantees close-to-open coherence and when performing a live
> migration, we do an open on the source and an open on the destination.  We
> fsync() on the source before launching the destination but since we have two
> simultaneous opens, we're not guaranteed coherence.
>
> This is not necessarily a problem except that we are a bit gratuitous in reading
> from the disk before launching a guest.  This means that as things stand today,
> we're guaranteed to read the first 64k of the disk and as such, if a client
> writes to that region during live migration, corruption will result.
>
> The second failure condition has to do with image files (such as qcow2).  Today,
> we aggressively cache metadata in all image formats and that cache is definitely
> not coherent even with fully coherent shared storage.
>
> In all image formats, we prefetch at least the L1 table in open() which means
> that if there is a write operation that causes a modification to an L1 table,
> corruption will ensue.
>
> This series attempts to address both of these issues.  Technically, if an NFS
> client aggressively prefetches, this solution is not enough but in practice,
> Linux doesn't do that.

I think it is unlikely that it will, but I prefer to be on the right 
side of the standards.  Why not delay image open until after migration 
completes?  I know your concern about the image not being there, but we 
can verify that with access().  If the image is deleted between access() 
and open() then the user has much bigger problems.
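
A rough sketch of the access()-then-delayed-open() idea, to make the
ordering concrete.  This is not QEMU code: image_precheck() and
image_open_after_migration() are hypothetical names, and the R_OK | W_OK
mode and O_RDWR flag are assumptions about how the image would be used.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: verify at destination startup that the image is
 * reachable and accessible, without opening it.  No file descriptor is
 * held, so no second open exists while the source is still running.
 * Returns 0 on success, -errno on failure.
 */
static int image_precheck(const char *path)
{
    if (access(path, R_OK | W_OK) != 0) {
        return -errno;
    }
    return 0;
}

/*
 * Hypothetical helper: perform the real open only once migration has
 * completed, i.e. after the source has flushed and closed the image.
 * Returns a file descriptor (>= 0) on success, -errno on failure.
 */
static int image_open_after_migration(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        return -errno;
    }
    return fd;
}
```

The destination would call image_precheck() when it starts and
image_open_after_migration() once the source has handed over.  The
window between the two calls is the access()/open() race mentioned
above, so the open can still fail and its result has to be checked.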

Note that on NFS, removing (and I think chmod'ing) a file after it is
opened will cause subsequent data access to fail, unlike POSIX, where an
already-open descriptor stays usable.

-- 
error compiling committee.c: too many arguments to function