From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [RFC][PATCH 0/3] Fix caching issues with live migration
Date: Sun, 12 Sep 2010 08:12:15 -0500
Message-ID: <4C8CD1AF.3060904@codemonkey.ws>
In-Reply-To: <4C8CAF9C.8090903@redhat.com>

On 09/12/2010 05:46 AM, Avi Kivity wrote:
>  On 09/11/2010 05:04 PM, Anthony Liguori wrote:
>> Today, live migration only works when using shared storage that is fully
>> cache coherent using raw images.
>>
>> The failure case with weakly coherent storage (e.g. NFS) is subtle but
>> nonetheless exists.  NFS only guarantees close-to-open coherence, and
>> when performing a live migration, we do an open on the source and an
>> open on the destination.  We fsync() on the source before launching
>> the destination, but since we have two simultaneous opens, we're not
>> guaranteed coherence.
>>
>> This is not necessarily a problem except that we are a bit gratuitous
>> in reading from the disk before launching a guest.  This means that as
>> things stand today, we're guaranteed to read the first 64k of the disk
>> and as such, if a client writes to that region during live migration,
>> corruption will result.
>>
>> The second failure condition has to do with image files (such as
>> qcow2).  Today, we aggressively cache metadata in all image formats
>> and that cache is definitely not coherent even with fully coherent
>> shared storage.
>>
>> In all image formats, we prefetch at least the L1 table in open(),
>> which means that if there is a write operation that causes a
>> modification to an L1 table, corruption will ensue.
>>
>> This series attempts to address both of these issues.  Technically, if
>> an NFS client aggressively prefetches, this solution is not enough, but
>> in practice, Linux doesn't do that.
>
> I think it is unlikely that it will, but I prefer to be on the right 
> side of the standards.

I've been asking around about this and one thing that was suggested was 
acquiring a file lock as NFS requires that a lock acquisition drops any 
client cache for a file.  I need to understand this a bit more so it's 
step #2.
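
To make the locking idea concrete, here is a minimal sketch of taking and
releasing a whole-file advisory lock with fcntl().  The helper name
revalidate_via_lock() is hypothetical and is not part of this series.
Whether acquiring the lock really makes the NFS client drop its cached
data for the file is exactly the part that still needs to be understood,
so treat that effect as an assumption rather than something this code
guarantees.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: take a shared (read) advisory lock covering the
 * whole file, then release it again.  The fd must be open for reading.
 * Returns 0 on success, -1 on error (errno is set by fcntl()).
 *
 * Assumption: on NFS, acquiring the lock is what prompts the client to
 * revalidate its cache.  On a local filesystem this is just a lock and
 * unlock with no caching side effect.
 */
int revalidate_via_lock(int fd)
{
    struct flock fl = {
        .l_type   = F_RDLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,      /* 0 means "through end of file" */
    };

    /* Block until the lock is granted. */
    if (fcntl(fd, F_SETLKW, &fl) == -1) {
        return -1;
    }

    /* Release it straight away; only the acquisition matters here. */
    fl.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        return -1;
    }
    return 0;
}
```

The destination would call something like this on the image fd after the
source has done its fsync() and before its first read.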

>   Why not delay image open until after migration completes?  I know 
> your concern about the image not being there, but we can verify that 
> with access().  If the image is deleted between access() and open() 
> then the user has much bigger problems.
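
The access() check suggested above could look roughly like the sketch
below.  The function name image_looks_usable() is made up for
illustration.  Note that access() checks permissions against the real
UID/GID of the process, and that the result can be stale by the time the
delayed open() happens, which is the race the quoted text already
acknowledges.

```c
#include <unistd.h>

/*
 * Hypothetical pre-flight check: returns 1 if the path currently exists
 * and is readable and writable by the calling process, 0 otherwise.
 * This does not open the file, so it reads no data and populates no
 * caches.
 */
int image_looks_usable(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```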

3/3 would still be needed because if we delay the open we obviously can't 
do a read until an open.

So it's only really a choice between invalidate_cache and delaying 
open.  It's a far less invasive change to just do invalidate_cache 
though and it has some nice properties.

Regards,

Anthony Liguori

> Note that on NFS, removing (and I think chmod-ing) a file after it is 
> opened will cause subsequent data access to fail, unlike POSIX.
