qemu-devel.nongnu.org archive mirror
From: "Denis V. Lunev" <den@openvz.org>
To: Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 08/27] block/parallels: _co_writev callback for Parallels format
Date: Thu, 23 Apr 2015 12:47:00 +0300
Message-ID: <5538BF94.9080904@openvz.org>
In-Reply-To: <20150423093223.GC5289@noname.redhat.com>

On 23/04/15 12:32, Kevin Wolf wrote:
> On 23.04.2015 at 11:20, Stefan Hajnoczi wrote:
>> On Wed, Apr 22, 2015 at 04:16:38PM +0300, Denis V. Lunev wrote:
>>> On 22/04/15 16:08, Stefan Hajnoczi wrote:
>>>> On Wed, Mar 11, 2015 at 01:28:02PM +0300, Denis V. Lunev wrote:
>>>>> +static int64_t allocate_cluster(BlockDriverState *bs, int64_t sector_num)
>>>>> +{
>>>>> +    BDRVParallelsState *s = bs->opaque;
>>>>> +    uint32_t idx, offset, tmp;
>>>>> +    int64_t pos;
>>>>> +    int ret;
>>>>> +
>>>>> +    idx = sector_num / s->tracks;
>>>>> +    offset = sector_num % s->tracks;
>>>>> +
>>>>> +    if (idx >= s->catalog_size) {
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +    if (s->catalog_bitmap[idx] != 0) {
>>>>> +        return (uint64_t)s->catalog_bitmap[idx] * s->off_multiplier + offset;
>>>>> +    }
>>>>> +
>>>>> +    pos = bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS;
>>>>> +    bdrv_truncate(bs->file, (pos + s->tracks) << BDRV_SECTOR_BITS);
>>>>> +    s->catalog_bitmap[idx] = pos / s->off_multiplier;
>>>>> +
>>>>> +    tmp = cpu_to_le32(s->catalog_bitmap[idx]);
>>>>> +
>>>>> +    ret = bdrv_pwrite_sync(bs->file,
>>>>> +            sizeof(ParallelsHeader) + idx * sizeof(tmp), &tmp, sizeof(tmp));
>>>> What is the purpose of the sync?
>>> From my point of view, this is necessary to preserve image
>>> consistency on a crash. There is no consistency check at the moment.
>>> The sync will be removed later, once proper crash detection
>>> code is added (patches 19, 20, 21).
>> Let's look at possible orderings in case of failure:
>>
>> A. BAT update
>> B. Data write
>>
>> This sync enforces A, B ordering.  If we can see B, then A must also
>> have happened thanks to the sync.
>>
>> But A, B ordering is too conservative.  Imagine B, A ordering and the
>> failure where we crash before A.  It means we wrote the data but never
>> linked it into the BAT.
>>
>> What happens in that case?  We've leaked a cluster in the underlying
>> image file but it doesn't corrupt the visible disk from the guest
>> point-of-view.
>>
>> Because your implementation uses truncate to extend the file size before
>> A, even the A, B failure case results in a leaked cluster.  So the B, A
>> case is not worse in any way.
>>
>> Why do other image formats sync cluster allocation updates?  Because
>> they support backing files and in that case an A, B ordering results in
>> data corruption so they enforce B, A ordering (the opposite of what
>> you're trying to do!).
>>
>> The reason why A, B ordering results in data corruption when backing
>> files are in use is because the guest's write request might touch only a
>> subset of the cluster (a couple of sectors out of the whole cluster).
>> So the guest needs to copy the remaining sectors from the backing file.
>> If there is a dangling BAT entry like in the A, B failure case, then the
>> guest will see a zeroed cluster instead of the contents of the backing
>> file.  This is a data corruption, but only if a backing file is being
>> used!
>>
>> So the sync is not necessary, both A, B and B, A ordering work for
>> block/parallels.c.
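
The crash windows described above can be modeled in a few lines of C.
This is only an illustrative sketch, not code from the series: the names
(CrashState, crash_after, guest_view, is_corrupt, leaks_cluster) are
hypothetical, and the model assumes that a freshly allocated cluster
reads back as zeroes.

```c
#include <assert.h>
#include <stdbool.h>

/* A = BAT update, B = data write; the file is extended before both. */
typedef enum { ORDER_AB, ORDER_BA } Order;

/* What the guest reads from the cluster after the crash. */
typedef enum { READS_OLD, READS_NEW, READS_ZEROES } GuestView;

typedef struct {
    bool bat_linked;    /* step A reached the disk */
    bool data_written;  /* step B reached the disk */
} CrashState;

/* On-disk state if we crash after steps_done of the two steps (0..2). */
static CrashState crash_after(Order order, int steps_done)
{
    CrashState st = { false, false };

    for (int i = 0; i < steps_done && i < 2; i++) {
        bool is_a = (order == ORDER_AB) ? (i == 0) : (i == 1);
        if (is_a) {
            st.bat_linked = true;
        } else {
            st.data_written = true;
        }
    }
    return st;
}

static GuestView guest_view(CrashState st)
{
    if (!st.bat_linked) {
        /* Unallocated: the guest still sees the pre-write contents. */
        return READS_OLD;
    }
    /* Linked but unwritten: dangling BAT entry to a zeroed cluster. */
    return st.data_written ? READS_NEW : READS_ZEROES;
}

/*
 * Without a backing file, the old contents of an unallocated cluster
 * are zeroes anyway, so READS_ZEROES is indistinguishable from
 * READS_OLD. With a backing file, zeroes hide the backing data.
 */
static bool is_corrupt(GuestView view, bool has_backing_file)
{
    return view == READS_ZEROES && has_backing_file;
}

/* The file was already extended, so an unlinked cluster is leaked. */
static bool leaks_cluster(CrashState st)
{
    return !st.bat_linked;
}
```

In this model the only corrupting case is A, B with a crash between the
two steps while a backing file is in use; every other crash point either
leaks a cluster or leaves the write fully visible.
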
> Actually, I suspect this means that the parallels driver is restricted
> to protocols with bdrv_has_zero_init() == true, otherwise zeros can turn
> into random data (which means that it can't work e.g. directly on host
> block devices).
>
> Do we enforce this?
>
> Kevin
This is fixed in patch 26, where the code is replaced with

+    if (s->data_end + s->tracks > pos) {
+        int ret;
+        if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
+            ret = bdrv_write_zeroes(bs->file, s->data_end,
+                                    s->prealloc_size, 0);
+        } else {
+            ret = bdrv_truncate(bs->file,
+                    (s->data_end + s->prealloc_size) << BDRV_SECTOR_BITS);
+        }
+        if (ret < 0) {
+            return ret;
+        }
+    }

on the default path, but you are correct. Some checking is
necessary to be on the safe side.
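
One possible shape for such a check, as a standalone sketch rather than
actual driver code: refuse the truncate-based expansion when the
underlying protocol does not guarantee zero-initialized growth. The enum
and function names below are hypothetical, and the boolean parameter
stands in for the result of bdrv_has_zero_init() on bs->file. Falling
back to explicit zeroing instead of failing would be an equally valid
policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two expansion paths in the hunk above. */
typedef enum {
    MODEL_PREALLOC_FALLOCATE,  /* new space is explicitly zeroed */
    MODEL_PREALLOC_TRUNCATE    /* new space comes from extending the file */
} ModelPreallocMode;

/*
 * Returns 0 if the mode is safe on this protocol, -ENOTSUP otherwise.
 * file_has_zero_init models bdrv_has_zero_init(bs->file) != 0.
 */
static int model_check_prealloc_mode(ModelPreallocMode mode,
                                     bool file_has_zero_init)
{
    if (mode == MODEL_PREALLOC_TRUNCATE && !file_has_zero_init) {
        /* Extended space may contain stale data, e.g. on a block device. */
        return -ENOTSUP;
    }
    return 0;
}
```
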

Den
