From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:49763) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QqRWi-0003rk-6K for qemu-devel@nongnu.org; Mon, 08 Aug 2011 11:14:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QqRWg-0000ct-RS for qemu-devel@nongnu.org; Mon, 08 Aug 2011 11:14:00 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38575) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QqRWg-0000ci-HF for qemu-devel@nongnu.org; Mon, 08 Aug 2011 11:13:58 -0400 Message-ID: <4E3FFDE5.1020802@redhat.com> Date: Mon, 08 Aug 2011 17:16:53 +0200 From: Kevin Wolf MIME-Version: 1.0 References: <4E3BB2C3.4020607@redhat.com> <4E3BBC64.9020005@redhat.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] Safely reopening image files by stashing fds List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: Supriya Kannery , Anthony Liguori , qemu-devel Am 08.08.2011 16:49, schrieb Stefan Hajnoczi: > On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf wrote: >> Am 05.08.2011 11:29, schrieb Stefan Hajnoczi: >>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf wrote: >>>> Am 05.08.2011 10:40, schrieb Stefan Hajnoczi: >>>>> We've discussed safe methods for reopening image files (e.g. useful for >>>>> changing the hostcache parameter). The problem is that closing the file first >>>>> and then opening it again exposes us to the error case where the open fails. >>>>> At that point we cannot get to the file anymore and our options are to >>>>> terminate QEMU, pause the VM, or offline the block device. >>>>> >>>>> This window of vulnerability can be eliminated by keeping the file descriptor >>>>> around and falling back to it should the open fail. >>>>> >>>>> The challenge for the file descriptor approach is that image formats, like >>>>> VMDK, can span multiple files. Therefore the solution is not as simple as >>>>> stashing a single file descriptor and reopening from it. >>>> >>>> So far I agree. The rest I believe is wrong because you can't assume >>>> that every backend uses file descriptors. The qemu block layer is based >>>> on BlockDriverStates, not fds. They are a concept that should be hidden >>>> in raw-posix. >>>> >>>> I think something like this could do: >>>> >>>> struct BDRVReopenState { >>>> BlockDriverState *bs; >>>> /* can be extended by block drivers */ >>>> }; >>>> >>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state, int >>>> flags); >>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state); >>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state); >>>> >>>> raw-posix would store the old file descriptor in its reopen_state. On >>>> commit, it closes the old descriptors, on abort it reverts to the old >>>> one and closes the newly opened one. >>>> >>>> Makes things a bit more complicated than the simple bdrv_reopen I had in >>>> mind before, but it allows VMDK to get an all-or-nothing semantics. >>> >>> Can you show how bdrv_reopen() would use these new interfaces? I'm >>> not 100% clear on the idea. >> >> Well, you wouldn't only call bdrv_reopen, but also either >> bdrv_reopen_commit/abort (for the top-level caller we can have a wrapper >> function that does both, but that's syntactic sugar). >> >> For example we would have: >> >> int vmdk_reopen() > > .bdrv_reopen() is a confusing name for this operation because it does > not reopen anything. bdrv_prepare_reopen() might be clearer. Makes sense. > >> { >> *((VMDKReopenState**) rs) = malloc(); >> >> foreach (extent in s->extents) { >> ret = bdrv_reopen(extent->file, &extent->reopen_state) >> if (ret < 0) >> goto fail; >> } >> return 0; >> >> fail: >> foreach (extent in rs->already_reopened) { >> bdrv_reopen_abort(extent->reopen_state); >> } >> return ret; >> } > >> void vmdk_reopen_commit() >> { >> foreach (extent in s->extents) { >> bdrv_reopen_commit(extent->reopen_state); >> } >> free(rs); >> } >> >> void vmdk_reopen_abort() >> { >> foreach (extent in s->extents) { >> bdrv_reopen_abort(extent->reopen_state); >> } >> free(rs); >> } > > Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs, > &rs)? No. Closing the old backend would be part of bdrv_reopen_commit. Do you have a use case where it would be helpful if the caller invoked bdrv_close? > There is more state than just the file descriptors and I'm not > sure that that gets preserved unless we add code to stash away stuff. > I'm basically hoping this interface does not require touching every > BlockDriver. If we only want to change flags like O_DIRECT or O_SYNC, I think format drivers (except VMDK) can use a standard implementation that just reopens bs->file. If we wanted bdrv_reopen to ensure that all caches are dropped etc. then I think we need a specific implementation in all drivers unless bdrv->bdrv_open/bdrv_close is good enough to emulate it. Kevin