qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Supriya Kannery <supriyak@linux.vnet.ibm.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Safely reopening image files by stashing fds
Date: Mon, 08 Aug 2011 17:16:53 +0200	[thread overview]
Message-ID: <4E3FFDE5.1020802@redhat.com> (raw)
In-Reply-To: <CAJSP0QVtJ0Nd03ajcaqEa+gnyM9b9+jfLhB91jcLhP-OsBKaTg@mail.gmail.com>

Am 08.08.2011 16:49, schrieb Stefan Hajnoczi:
> On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 05.08.2011 11:29, schrieb Stefan Hajnoczi:
>>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf <kwolf@redhat.com> wrote:
>>>> Am 05.08.2011 10:40, schrieb Stefan Hajnoczi:
>>>>> We've discussed safe methods for reopening image files (e.g. useful for
>>>>> changing the hostcache parameter).  The problem is that closing the file first
>>>>> and then opening it again exposes us to the error case where the open fails.
>>>>> At that point we cannot get to the file anymore and our options are to
>>>>> terminate QEMU, pause the VM, or offline the block device.
>>>>>
>>>>> This window of vulnerability can be eliminated by keeping the file descriptor
>>>>> around and falling back to it should the open fail.
>>>>>
>>>>> The challenge for the file descriptor approach is that image formats, like
>>>>> VMDK, can span multiple files.  Therefore the solution is not as simple as
>>>>> stashing a single file descriptor and reopening from it.
>>>>
>>>> So far I agree. The rest I believe is wrong because you can't assume
>>>> that every backend uses file descriptors. The qemu block layer is based
>>>> on BlockDriverStates, not fds. They are a concept that should be hidden
>>>> in raw-posix.
>>>>
>>>> I think something like this could do:
>>>>
>>>> struct BDRVReopenState {
>>>>    BlockDriverState *bs;
>>>>    /* can be extended by block drivers */
>>>> };
>>>>
>>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state, int
>>>> flags);
>>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>>>
>>>> raw-posix would store the old file descriptor in its reopen_state. On
>>>> commit, it closes the old descriptors, on abort it reverts to the old
>>>> one and closes the newly opened one.
>>>>
>>>> Makes things a bit more complicated than the simple bdrv_reopen I had in
>>>> mind before, but it allows VMDK to get an all-or-nothing semantics.
>>>
>>> Can you show how bdrv_reopen() would use these new interfaces?  I'm
>>> not 100% clear on the idea.
>>
>> Well, you wouldn't only call bdrv_reopen, but also either
>> bdrv_reopen_commit/abort (for the top-level caller we can have a wrapper
>> function that does both, but that's syntactic sugar).
>>
>> For example we would have:
>>
>> int vmdk_reopen()
> 
> .bdrv_reopen() is a confusing name for this operation because it does
> not reopen anything.  bdrv_prepare_reopen() might be clearer.

Makes sense.

> 
>> {
>>    *((VMDKReopenState**) rs) = malloc();
>>
>>    foreach (extent in s->extents) {
>>        ret = bdrv_reopen(extent->file, &extent->reopen_state)
>>        if (ret < 0)
>>            goto fail;
>>    }
>>    return 0;
>>
>> fail:
>>    foreach (extent in rs->already_reopened) {
>>        bdrv_reopen_abort(extent->reopen_state);
>>    }
>>    return ret;
>> }
> 
>> void vmdk_reopen_commit()
>> {
>>    foreach (extent in s->extents) {
>>        bdrv_reopen_commit(extent->reopen_state);
>>    }
>>    free(rs);
>> }
>>
>> void vmdk_reopen_abort()
>> {
>>    foreach (extent in s->extents) {
>>        bdrv_reopen_abort(extent->reopen_state);
>>    }
>>    free(rs);
>> }
> 
> Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs,
> &rs)? 

No. Closing the old backend would be part of bdrv_reopen_commit.

Do you have a use case where it would be helpful if the caller invoked
bdrv_close?

> There is more state than just the file descriptors and I'm not
> sure that that gets preserved unless we add code to stash away stuff.
> I'm basically hoping this interface does not require touching every
> BlockDriver.

If we only want to change flags like O_DIRECT or O_SYNC, I think format
drivers (except VMDK) can use a standard implementation that just
reopens bs->file.

If we wanted bdrv_reopen to ensure that all caches are dropped etc. then
I think we need a specific implementation in all drivers unless
bdrv->bdrv_open/bdrv_close is good enough to emulate it.

Kevin

  reply	other threads:[~2011-08-08 15:14 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-05  8:40 [Qemu-devel] Safely reopening image files by stashing fds Stefan Hajnoczi
2011-08-05  9:04 ` Paolo Bonzini
2011-08-05  9:27   ` Stefan Hajnoczi
2011-08-05  9:55     ` Paolo Bonzini
2011-08-05 13:03       ` Stefan Hajnoczi
2011-08-05 13:12     ` Daniel P. Berrange
2011-08-05 14:28       ` Christoph Hellwig
2011-08-05 15:24         ` Stefan Hajnoczi
2011-08-05 15:43           ` Kevin Wolf
2011-08-05 15:49             ` Anthony Liguori
2011-08-08  7:02               ` Supriya Kannery
2011-08-08  8:12                 ` Kevin Wolf
2011-08-09  9:22                   ` supriya kannery
2011-08-09  9:51                     ` Kevin Wolf
2011-08-09  9:32                       ` supriya kannery
2011-08-16 19:18                         ` [Qemu-devel] [RFC] " Supriya Kannery
2011-08-16 19:18                         ` Supriya Kannery
2011-08-17 14:35                           ` Kevin Wolf
2011-10-10 18:28                     ` [Qemu-devel] " Kevin Wolf
2011-10-11  5:21                       ` Supriya Kannery
2011-08-05 14:27     ` Christoph Hellwig
2011-08-05  9:07 ` Kevin Wolf
2011-08-05  9:29   ` Stefan Hajnoczi
2011-08-05  9:48     ` Kevin Wolf
2011-08-08 14:49       ` Stefan Hajnoczi
2011-08-08 15:16         ` Kevin Wolf [this message]
2011-08-09 10:25           ` Stefan Hajnoczi
2011-08-09 10:35             ` Kevin Wolf
2011-08-09 10:50               ` Stefan Hajnoczi
2011-08-09 10:56                 ` Stefan Hajnoczi
2011-08-09 11:39                   ` Kevin Wolf
2011-08-09 12:00                     ` Stefan Hajnoczi
2011-08-09 12:24                       ` Kevin Wolf
2011-08-09 19:39                         ` Blue Swirl
2011-08-10  7:58                           ` Kevin Wolf
2011-08-10 17:20                             ` Blue Swirl
2011-08-11  7:37                               ` Kevin Wolf
2011-08-11 16:21                                 ` Blue Swirl
2011-08-05 20:16 ` Blue Swirl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E3FFDE5.1020802@redhat.com \
    --to=kwolf@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=supriyak@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).