From: "Denis V. Lunev" <den@openvz.org>
To: Roman Kagan <rkagan@parallels.com>, Kevin Wolf <kwolf@redhat.com>,
qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update
Date: Wed, 14 Jan 2015 15:30:17 +0300 [thread overview]
Message-ID: <54B66159.3010906@openvz.org> (raw)
In-Reply-To: <20150114120146.GA27418@rkaganb.sw.ru>
On 14/01/15 15:01, Roman Kagan wrote:
> On Tue, Jan 13, 2015 at 11:16:04PM +0300, Denis V. Lunev wrote:
>> On 13/01/15 18:17, Denis V. Lunev wrote:
>>> On 13/01/15 17:50, Roman Kagan wrote:
>>>> On Tue, Dec 30, 2014 at 01:07:09PM +0300, Denis V. Lunev wrote:
>>>>> --- a/block/parallels.c
>>>>> +++ b/block/parallels.c
>>>>> @@ -194,7 +194,7 @@ static int64_t allocate_sector(BlockDriverState
>>>>> *bs, int64_t sector_num)
>>>>> tmp = cpu_to_le32(s->bat[idx]);
>>>>> - ret = bdrv_pwrite_sync(bs->file, bat_offset(idx), &tmp,
>>>>> sizeof(tmp));
>>>>> + ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
>>>>> if (ret < 0) {
>>>>> return ret;
>>>>> }
>>>> if the preceding bdrv_truncate(+cluster_size) gets reordered with this
>>>> bat entry update, there's a risk that on poweroff (snapshot, etc.) the
>>>> bat entry gets written to the storage while the file size is not
>>>> updated. As a result, the bat entry in the file would point at a
>>>> cluster which hadn't been allocated yet. When a new block is eventually
>>>> added to the file, you'd have two bat entries pointing at the same
>>>> cluster.
>>>>
>>>> The _sync version used to leave this window quite narrow due to the
>>>> flush following the write. The patch makes the reordering more likely.
>>>>
>>>> I'm afraid the only reliable way to handle it is to put a barrier
>>>> between truncate and bat update, and mitigate the costs by batching the
>>>> file expansions, as you seem to do in the followup patches.
>> Thinking a bit more on this. IMHO there is no problem
>> with a normal workflow, the problem could come
>> from a crash when the file state and a BAT state will
>> be not consistent and the BAT entry will point out
>> of the file (could point out of the file).
> Yes that's exactly the scenario I've been talking about.
>
>> But it is not possible to fix at all this way. Single sync
>> does not provide much warranty.
> How's that?
>
>> We should write proper magic into ParallelsHeader->inuse
>> on open and call sync immediately. On subsequent
>> open of the broken image we should perform
>> check consistency. The magic should be set to 0
>> on clean close.
> This seems to address the problem, but only if this protocol is adhered
> to by all the tools that may access the image (Parallels Desktop,
> Parallels Server, ploop, etc.). Furthermore, if it is, it *has* to be
> adhered to by this code, too, while adding write support, otherwise
> we'll break the assumptions of those other tools.
>
>> I think that this stuff could be implemented separately
>> in the next patchset. If you this that this is mandatory,
>> OK, I can do that, be this will increase patchset a lot.
>> Check consistency is not an easy stuff.
> I think as a 1st approximation, instead of a full-fledged consistency
> check and repair, just refusing write access to the image that has
> ->inuse set would suffice.
>
> Roman.
ok, seems reasonable
next prev parent reply other threads:[~2015-01-14 12:31 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-30 9:28 [Qemu-devel] [PATCH v2 0/19] write/create for Parallels images with reasonable performance Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 01/19] iotests, parallels: quote TEST_IMG in 076 test to be path-safe Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 02/19] block/parallels: rename parallels_header to ParallelsHeader Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 03/19] block/parallels: switch to bdrv_read Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 04/19] block/parallels: read up to cluster end in one go Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 05/19] block/parallels: add get_block_status Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 06/19] block/parallels: provide _co_readv routine for parallels format driver Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 07/19] block/parallels: replace magic constants 4, 64 with proper sizeofs Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 08/19] block/parallels: _co_writev callback for Parallels format Denis V. Lunev
2014-12-30 9:28 ` [Qemu-devel] [PATCH 09/19] iotests, parallels: test for write into Parallels image Denis V. Lunev
2014-12-30 10:07 ` [Qemu-devel] [PATCH 10/19] block/parallels: support parallels image creation Denis V. Lunev
2014-12-30 10:07 ` [Qemu-devel] [PATCH 11/19] iotests, parallels: test for newly created parallels image via qemu-img Denis V. Lunev
2014-12-30 10:07 ` [Qemu-devel] [PATCH 12/19] parallels: change copyright information in the image header Denis V. Lunev
2014-12-30 10:07 ` [Qemu-devel] [PATCH 13/19] block/parallels: store ParallelsHeader to the BDRVParallelsState Denis V. Lunev
2015-01-13 14:13 ` Roman Kagan
2014-12-30 10:07 ` [Qemu-devel] [PATCH 14/19] block/parallels: create catalog_offset helper Denis V. Lunev
2015-01-13 14:14 ` Roman Kagan
2014-12-30 10:07 ` [Qemu-devel] [PATCH 15/19] block/parallels: rename catalog_ names to bat_ Denis V. Lunev
2015-01-13 14:15 ` Roman Kagan
2014-12-30 10:07 ` [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update Denis V. Lunev
2015-01-13 14:50 ` Roman Kagan
2015-01-13 15:17 ` Denis V. Lunev
2015-01-13 20:16 ` Denis V. Lunev
2015-01-14 12:01 ` Roman Kagan
2015-01-14 12:30 ` Denis V. Lunev [this message]
2014-12-30 10:07 ` [Qemu-devel] [PATCH 17/19] block/parallels: delay writing to BAT till bdrv_co_flush_to_os Denis V. Lunev
2015-01-14 13:03 ` Roman Kagan
2015-01-14 13:08 ` Denis V. Lunev
2015-01-14 13:34 ` Roman Kagan
2015-01-14 14:29 ` Denis V. Lunev
2015-01-14 16:44 ` Denis V. Lunev
2014-12-30 10:07 ` [Qemu-devel] [PATCH 18/19] block/parallels: add prealloc-mode and prealloc-size open paramemets Denis V. Lunev
2015-01-14 14:26 ` Roman Kagan
2015-01-14 14:31 ` Denis V. Lunev
2015-01-14 17:22 ` Roman Kagan
2014-12-30 10:07 ` [Qemu-devel] [PATCH 19/19] block/parallels: optimize linear image expansion Denis V. Lunev
2015-01-14 17:56 ` Roman Kagan
2015-01-14 17:57 ` Denis V. Lunev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54B66159.3010906@openvz.org \
--to=den@openvz.org \
--cc=kwolf@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rkagan@parallels.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).