From: Anthony Liguori <aliguori@linux.vnet.ibm.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Date: Mon, 13 Sep 2010 15:09:18 -0500
Message-ID: <4C8E84EE.9040702@linux.vnet.ibm.com>
In-Reply-To: <4C8E8390.6050607@redhat.com>
On 09/13/2010 03:03 PM, Kevin Wolf wrote:
> On 13.09.2010 21:29, Stefan Hajnoczi wrote:
>
>> On Mon, Sep 13, 2010 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>
>>> On 13.09.2010 15:42, Anthony Liguori wrote:
>>>
>>>> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>>>
>>>>>> Yeah, one of the key design points of live migration is to minimize the
>>>>>> number of failure scenarios where you lose a VM. If someone typed the
>>>>>> wrong command line or shared storage hasn't been mounted yet and we
>>>>>> delay failure until live migration is in the critical path, that would
>>>>>> be terribly unfortunate.
>>>>>>
>>>>>>
>>>>> We would catch most of them if we try to open the image when migration
>>>>> starts and immediately close it again until migration is (almost)
>>>>> completed, so that no other code can possibly use it before the source
>>>>> has really closed it.
>>>>>
>>>>>
>>>> I think the only real advantage is that we fix NFS migration, right?
>>>>
>>> That's the one that we know about, yes.
>>>
>>> The rest is not a specific scenario, but a strong feeling that having an
>>> image opened twice at the same time feels dangerous. As soon as an
>>> open/close sequence writes to the image for some format, we probably
>>> have a bug. For example, what about this mounted flag that you were
>>> discussing for QED?
>>>
>> There is some room left to work in, even if we can't check in open().
>> One idea would be to do the check asynchronously once I/O begins. It
>> is actually easy to check L1/L2 tables as they are loaded.
>>
>> The only barrier relationship between I/O and checking is that an
>> allocating write (which will need to update L1/L2 tables) is only
>> allowed after check completes. Otherwise reads and non-allocating
>> writes may proceed while the image is not yet fully checked. We can
>> detect when a table element is an invalid offset and discard it.
>>
> I'm not even talking about such complicated things. You wanted to have a
> dirty flag in the header, right? So when we allow opening an image
> twice, you get this sequence with migration:
>
> Source: open
> Destination: open (with dirty image)
> Source: close
>
> The image is now marked as clean, even though the destination is still
> working on it.
>

The dirty flag should be read on demand (which is the first time we
fetch an L1/L2 table).

I agree that the life cycle of the block drivers is getting fuzzy. Need
to think quite a bit here.

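
To make the sequence above concrete, here is a minimal toy model. It is a sketch under stated assumptions, not QED or QEMU code: ImageHeader, Driver, load_table() and migrate() are hypothetical names, and the header is a plain in-memory object standing in for the shared on-disk header. It replays the open/open/close ordering once with the flag handled in open() and once with the flag deferred to the first L1/L2 table fetch.

```python
from dataclasses import dataclass


@dataclass
class ImageHeader:
    """Stand-in for the on-disk header shared by source and destination."""
    dirty: bool = False


class Driver:
    """Hypothetical block driver instance; one per QEMU process."""

    def __init__(self, header: ImageHeader, lazy: bool):
        self.header = header
        self.lazy = lazy
        self.touched = False  # True once this instance has marked the image dirty

    def open(self) -> None:
        # Eager variant: mark the image dirty as part of open().
        if not self.lazy:
            self._mark_dirty()

    def load_table(self) -> None:
        # Lazy variant: the first L1/L2 table fetch is the first point
        # at which this instance writes the flag.
        if self.lazy and not self.touched:
            self._mark_dirty()

    def close(self) -> None:
        # Only an instance that marked the image dirty clears the flag.
        if self.touched:
            self.header.dirty = False
            self.touched = False

    def _mark_dirty(self) -> None:
        self.header.dirty = True
        self.touched = True


def migrate(lazy: bool) -> bool:
    """Replay the migration ordering; return the flag while the destination runs."""
    header = ImageHeader()
    source = Driver(header, lazy)
    dest = Driver(header, lazy)

    source.open()
    source.load_table()  # the source guest has been doing I/O
    dest.open()          # destination opens while the source is still open
    source.close()       # migration completes, source releases the image
    dest.load_table()    # destination guest starts and issues its first I/O
    return header.dirty
```

In this model, migrate(lazy=False) leaves the flag clean while the destination is still using the image, which is the problem Kevin describes. migrate(lazy=True) leaves it dirty, because the destination does not touch the flag until after the source has closed. The model assumes the destination issues no I/O before the source closes, which is the point of this patch.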
Regards,
Anthony Liguori
> Kevin
>