All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lei Li <lilei@linux.vnet.ibm.com>
To: Fam Zheng <famcool@gmail.com>
Cc: Gerhard Wiesinger <lists@wiesinger.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH] block.c, block/vmdk.c: Fixed major bug in VMDK WRITE and READ handling - FIXES DATA CORRUPTION
Date: Fri, 09 Nov 2012 15:38:30 +0800	[thread overview]
Message-ID: <509CB2F6.9010300@linux.vnet.ibm.com> (raw)
In-Reply-To: <CABgc4wQ7GKhxHrHXzp0JvnYDfWOEAv4ABVFJujmp6RppLeauEg@mail.gmail.com>

On 11/09/2012 01:12 PM, Fam Zheng wrote:
> Yes, I can confirm the presence of this bug and this is a valid fix.
>
> May I ask where is this kind of vmdk from? Because regularly we see
> extents in identical size.

I guess the regularly extents you see are SPARSE extent in identical size.
This is the FLAT extent, which is preallocated virtual disk.
   

>
> ---
> Best regards!
> Fam Zheng
>
>
> On Fri, Nov 9, 2012 at 3:05 AM, Gerhard Wiesinger <lists@wiesinger.com> wrote:
>> Fixed a MAJOR BUG in VMDK files on file boundaries on reads
>> and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!!
>>
>> Triggered for example with the following VMDK file (partly listed):
>> # Extent description
>> RW 4193792 FLAT "XP-W1-f001.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f002.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f003.vmdk" 0
>> RW 512 FLAT "XP-W1-f004.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f005.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f006.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f007.vmdk" 0
>> RW 512 FLAT "XP-W1-f008.vmdk" 0
>>
>> Patch includes:
>> 1.) Patch fixes wrong calculation on extent boundaries. Especially it fixes
>> the relativeness of the sector number to the current extent.
>> 2.) Added debug code to block.c and to block/vmdk.c to verify correctness
>> 3.) Also optimized code which avoids multiplication and uses shifts.
>>
>> Verfied correctness with:
>> 1.) Converted either with Virtualbox to VDI and then with qemu-img and then
>> with qemu-img only
>> VBoxManage clonehd --format vdi /VM/XP-W/new/XP-W1.vmdk
>> ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi
>> ./qemu-img convert -O raw ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi
>> /root/QEMU/VM-XP-W1/XP-W1-via-VBOX.img
>> md5sum /root/QEMU/VM-XP-W/XP-W1-direct.img
>> md5sum /root/QEMU/VM-XP-W/XP-W1-via-VBOX.img
>> => same MD5 hash
>> 2.) Verified debug log files
>> 3.) Run Windows XP successfully
>> 4.) chkdsk run successfully without any errors
>>
>> Signed-off-by: Gerhard Wiesinger <lists@wiesinger.com>
>> ---
>>   block.c      |  50 +++++++++++++++++++++++
>>   block/vmdk.c | 129
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>   2 files changed, 170 insertions(+), 9 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index da1fdca..69259f2 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -49,6 +49,12 @@
>>   #include <windows.h>
>>   #endif
>>
>> +#if 0
>> +   #define DEBUG_BLOCK
>> +#endif
>> +
>> +#define DEBUG_BLOCK_PREFIX "BLOCK: "
>> +
>>   #define NOT_DONE 0x7fffffff /* used while emulated sync operation in
>> progress */
>>
>>   typedef enum {
>> @@ -789,6 +795,18 @@ int bdrv_open(BlockDriverState *bs, const char
>> *filename, int flags,
>>       int ret;
>>       char tmp_filename[PATH_MAX];
>>
>> +#ifdef DEBUG_BLOCK
>> +    const char *format = "(nil)";
>> +    if (drv) {
>> +        format = drv->format_name;
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_open: filename=%s, BlockDriver=%p, format=%s\n",
>> +           filename, drv, format
>> +           );
>> +#endif
>> +
>>       if (flags & BDRV_O_SNAPSHOT) {
>>           BlockDriverState *bs1;
>>           int64_t total_size;
>> @@ -2004,6 +2022,22 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t
>> sector_num, uint8_t *buf,
>>   int bdrv_read(BlockDriverState *bs, int64_t sector_num,
>>                 uint8_t *buf, int nb_sectors)
>>   {
>> +#ifdef DEBUG_BLOCK
>> +    BlockDriver *drv = bs->drv;
>> +    const char *format_name = "(nil)";
>> +    if (drv) {
>> +        if (drv->format_name) {
>> +            format_name = drv->format_name;
>> +        }
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_read: driver=%s, filename=%s, sector_num=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           format_name, bs->filename, sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       return bdrv_rw_co(bs, sector_num, buf, nb_sectors, false);
>>   }
>>
>> @@ -2060,6 +2094,22 @@ static void set_dirty_bitmap(BlockDriverState *bs,
>> int64_t sector_num,
>>   int bdrv_write(BlockDriverState *bs, int64_t sector_num,
>>                  const uint8_t *buf, int nb_sectors)
>>   {
>> +#ifdef DEBUG_BLOCK
>> +    BlockDriver *drv = bs->drv;
>> +    const char *format_name = "(nil)";
>> +    if (drv) {
>> +        if (drv->format_name) {
>> +            format_name = drv->format_name;
>> +        }
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_write: driver=%s, filename=%s, sector_num=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           format_name, bs->filename, sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       return bdrv_rw_co(bs, sector_num, (uint8_t *)buf, nb_sectors, true);
>>   }
>>
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index 1a80e5a..92ab92c 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -29,6 +29,15 @@
>>   #include "migration.h"
>>   #include <zlib.h>
>>
>> +#if 0
>> +    #define DEBUG_VMDK
>> +#endif
>> +
>> +#define DEBUG_VMDK_PREFIX "VMDK: "
>> +#define DEBUG_VMDK_SEPARATOR \
>> +        "########################################" \
>> +        "########################################"
>> +
>>   #define VMDK3_MAGIC (('C' << 24) | ('O' << 16) | ('W' << 8) | 'D')
>>   #define VMDK4_MAGIC (('K' << 24) | ('D' << 16) | ('M' << 8) | 'V')
>>   #define VMDK4_COMPRESSION_DEFLATE 1
>> @@ -377,6 +386,16 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState
>> *bs,
>>       VmdkExtent *extent;
>>       BDRVVmdkState *s = bs->opaque;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_add_extent: flat=%i, sectors=%" PRId64
>> +           ", l1_backup_offset=%" PRId64
>> +           ", l1_size=%u, l2_size=%i, cluster_sectors=%u\n",
>> +           (int)flat, sectors, l1_backup_offset,
>> +           l1_size, l2_size, cluster_sectors
>> +           );
>> +#endif
>> +
>>       s->extents = g_realloc(s->extents,
>>                                 (s->num_extents + 1) * sizeof(VmdkExtent));
>>       extent = &s->extents[s->num_extents];
>> @@ -674,6 +693,14 @@ static int vmdk_parse_extents(const char *desc,
>> BlockDriverState *bs,
>>               return ret;
>>           }
>>
>> +#ifdef DEBUG_VMDK
>> +        printf(DEBUG_VMDK_PREFIX
>> +               "valid extent found: access=%s, sectors=%" PRId64
>> +               ", type=%s, filename=%s, flat_offset=%" PRId64 "\n",
>> +               access, sectors, type, fname, flat_offset
>> +               );
>> +#endif
>> +
>>           /* save to extents array */
>>           if (!strcmp(type, "FLAT")) {
>>               /* FLAT extent */
>> @@ -704,6 +731,45 @@ next_line:
>>       return 0;
>>   }
>>
>> +#ifdef DEBUG_VMDK
>> +static void vmdk_debug_print_one_extent(VmdkExtent *extent)
>> +{
>> +    printf(DEBUG_VMDK_PREFIX DEBUG_VMDK_SEPARATOR "\n");
>> +    printf(DEBUG_VMDK_PREFIX "filename%s\n", extent->file->filename);
>> +    printf(DEBUG_VMDK_PREFIX "flat=%i\n", (int)extent->flat);
>> +    printf(DEBUG_VMDK_PREFIX "compressed=%i\n", (int)extent->compressed);
>> +    printf(DEBUG_VMDK_PREFIX "has_marker=%i\n", (int)extent->has_marker);
>> +    printf(DEBUG_VMDK_PREFIX "sectors=%" PRId64 "\n", extent->sectors);
>> +    printf(DEBUG_VMDK_PREFIX "end_sector=%" PRId64 "\n",
>> extent->end_sector);
>> +    printf(DEBUG_VMDK_PREFIX "flat_start_offset=%" PRId64 "\n",
>> +                             extent->flat_start_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_table_offsett=%" PRId64 "\n",
>> +                             extent->l1_table_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_backup_table_offsett=%"
>> +                             PRId64 "\n", extent->l1_backup_table_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_size=%u\n", extent->l1_size);
>> +    printf(DEBUG_VMDK_PREFIX "l1_entry_sectors=%u\n",
>> extent->l1_entry_sectors);
>> +    printf(DEBUG_VMDK_PREFIX "l2_size=%u\n", extent->l2_size);
>> +    printf(DEBUG_VMDK_PREFIX "cluster_sectors=%u\n",
>> extent->cluster_sectors);
>> +    fflush(stdout);
>> +}
>> +
>> +static void vmdk_debug_print_extents(BlockDriverState *bs)
>> +{
>> +    BDRVVmdkState *s = bs->opaque;
>> +    VmdkExtent *extent = &s->extents[0];
>> +
>> +    printf(DEBUG_VMDK_PREFIX DEBUG_VMDK_SEPARATOR "\n");
>> +    printf(DEBUG_VMDK_PREFIX "total_sectors=%" PRId64 "\n",
>> bs->total_sectors);
>> +    printf(DEBUG_VMDK_PREFIX "num_extents=%u\n", s->num_extents);
>> +
>> +    while (extent < &s->extents[s->num_extents]) {
>> +        vmdk_debug_print_one_extent(extent);
>> +        extent++;
>> +    }
>> +}
>> +#endif
>> +
>>   static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
>>                                  int64_t desc_offset)
>>   {
>> @@ -728,7 +794,13 @@ static int vmdk_open_desc_file(BlockDriverState *bs,
>> int flags,
>>           return -ENOTSUP;
>>       }
>>       s->desc_offset = 0;
>> -    return vmdk_parse_extents(buf, bs, bs->file->filename);
>> +    ret = vmdk_parse_extents(buf, bs, bs->file->filename);
>> +
>> +#ifdef DEBUG_VMDK
>> +    vmdk_debug_print_extents(bs);
>> +#endif
>> +
>> +    return ret;
>>   }
>>
>>   static int vmdk_open(BlockDriverState *bs, int flags)
>> @@ -774,6 +846,14 @@ static int get_whole_cluster(BlockDriverState *bs,
>>       /* 128 sectors * 512 bytes each = grain size 64KB */
>>       uint8_t  whole_grain[extent->cluster_sectors * 512];
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_get_whole_cluster: filename=%s, cluster_offset=%" PRId64
>> +           ", offset=%" PRId64 "\n",
>> +           extent->file->filename, cluster_offset, offset
>> +           );
>> +#endif
>> +
>>       /* we will be here if it's first write on non-exist grain(cluster).
>>        * try to read from parent image, if exist */
>>       if (bs->backing_hd) {
>> @@ -1032,18 +1112,35 @@ static int vmdk_read_extent(VmdkExtent *extent,
>> int64_t cluster_offset,
>>       VmdkGrainMarker *marker;
>>       uLongf buf_len;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_read_extent: filename=%s, cluster_offset=%" PRId64
>> +           ", offset_in_cluster=%" PRId64 ", nb_sectors=%i\n",
>> +           extent->file->filename, cluster_offset,
>> +           offset_in_cluster, nb_sectors
>> +           );
>> +#endif
>>
>>       if (!extent->compressed) {
>> +
>> +#ifdef DEBUG_VMDK
>> +        printf(DEBUG_VMDK_PREFIX
>> +               "vmdk_read_extent:bdrv_pread filename=%s, offset=%" PRId64
>> +               ", length=%i\n",
>> +               extent->file->filename, offset_in_cluster, nb_sectors << 9
>> +               );
>> +#endif
>> +
>>           ret = bdrv_pread(extent->file,
>>                             cluster_offset + offset_in_cluster,
>> -                          buf, nb_sectors * 512);
>> -        if (ret == nb_sectors * 512) {
>> +                          buf, nb_sectors << 9);
>> +        if (ret == nb_sectors << 9) {
>>               return 0;
>>           } else {
>>               return -EIO;
>>           }
>>       }
>> -    cluster_bytes = extent->cluster_sectors * 512;
>> +    cluster_bytes = extent->cluster_sectors << 9;
>>       /* Read two clusters in case GrainMarker + compressed data > one
>> cluster */
>>       buf_bytes = cluster_bytes * 2;
>>       cluster_buf = g_malloc(buf_bytes);
>> @@ -1092,9 +1189,18 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>       BDRVVmdkState *s = bs->opaque;
>>       int ret;
>>       uint64_t n, index_in_cluster;
>> +    uint64_t extent_begin_sector, extent_relative_sector_num;
>>       VmdkExtent *extent = NULL;
>>       uint64_t cluster_offset;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_read: sectornum=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       while (nb_sectors > 0) {
>>           extent = find_extent(s, sector_num, extent);
>>           if (!extent) {
>> @@ -1103,7 +1209,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>           ret = get_cluster_offset(
>>                               bs, extent, NULL,
>>                               sector_num << 9, 0, &cluster_offset);
>> -        index_in_cluster = sector_num % extent->cluster_sectors;
>> +        extent_begin_sector = extent->end_sector - extent->sectors;
>> +        extent_relative_sector_num = sector_num - extent_begin_sector;
>> +        index_in_cluster = extent_relative_sector_num %
>> extent->cluster_sectors;
>>           n = extent->cluster_sectors - index_in_cluster;
>>           if (n > nb_sectors) {
>>               n = nb_sectors;
>> @@ -1119,11 +1227,11 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>                       return ret;
>>                   }
>>               } else {
>> -                memset(buf, 0, 512 * n);
>> +                memset(buf, 0, n << 9);
>>               }
>>           } else {
>>               ret = vmdk_read_extent(extent,
>> -                            cluster_offset, index_in_cluster * 512,
>> +                            cluster_offset, index_in_cluster << 9,
>>                               buf, n);
>>               if (ret) {
>>                   return ret;
>> @@ -1131,7 +1239,7 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>           }
>>           nb_sectors -= n;
>>           sector_num += n;
>> -        buf += n * 512;
>> +        buf += n << 9;
>>       }
>>       return 0;
>>   }
>> @@ -1154,6 +1262,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t
>> sector_num,
>>       VmdkExtent *extent = NULL;
>>       int n, ret;
>>       int64_t index_in_cluster;
>> +    uint64_t extent_begin_sector, extent_relative_sector_num;
>>       uint64_t cluster_offset;
>>       VmdkMetaData m_data;
>>
>> @@ -1196,7 +1305,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t
>> sector_num,
>>           if (ret) {
>>               return -EINVAL;
>>           }
>> -        index_in_cluster = sector_num % extent->cluster_sectors;
>> +        extent_begin_sector = extent->end_sector - extent->sectors;
>> +        extent_relative_sector_num = sector_num - extent_begin_sector;
>> +        index_in_cluster = extent_relative_sector_num %
>> extent->cluster_sectors;
>>           n = extent->cluster_sectors - index_in_cluster;
>>           if (n > nb_sectors) {
>>               n = nb_sectors;
>> --
>> 1.7.11.7
>>


-- 
Lei

  parent reply	other threads:[~2012-11-09  7:39 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-08 19:05 [Qemu-devel] [PATCH] block.c, block/vmdk.c: Fixed major bug in VMDK WRITE and READ handling - FIXES DATA CORRUPTION Gerhard Wiesinger
2012-11-09  5:12 ` Fam Zheng
2012-11-09  6:51   ` Gerhard Wiesinger
2012-11-09  7:38   ` Lei Li [this message]
2012-11-09  8:26 ` Paolo Bonzini
2012-11-10  8:30   ` Gerhard Wiesinger
2012-11-10  8:55     ` Paolo Bonzini
2012-11-10  9:59       ` Gerhard Wiesinger
2012-11-13 12:27         ` Kevin Wolf
2012-11-13 13:36           ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=509CB2F6.9010300@linux.vnet.ibm.com \
    --to=lilei@linux.vnet.ibm.com \
    --cc=famcool@gmail.com \
    --cc=lists@wiesinger.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.