qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Gerhard Wiesinger <lists@wiesinger.com>
To: Fam Zheng <famcool@gmail.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH] block.c, block/vmdk.c: Fixed major bug in VMDK WRITE and READ handling - FIXES DATA CORRUPTION
Date: Fri, 09 Nov 2012 07:51:37 +0100	[thread overview]
Message-ID: <509CA7F9.7020400@wiesinger.com> (raw)
In-Reply-To: <CABgc4wQ7GKhxHrHXzp0JvnYDfWOEAv4ABVFJujmp6RppLeauEg@mail.gmail.com>

Hello,

It was created with VMWare Server 2.0.x or maybe even with 1.x. under Linux.

Ciao,
Gerhard

On 09.11.2012 06:12, Fam Zheng wrote:
> Yes, I can confirm the presence of this bug and this is a valid fix.
>
> May I ask where is this kind of vmdk from? Because regularly we see
> extents in identical size.
>
> ---
> Best regards!
> Fam Zheng
>
>
> On Fri, Nov 9, 2012 at 3:05 AM, Gerhard Wiesinger <lists@wiesinger.com> wrote:
>> Fixed a MAJOR BUG in VMDK files on file boundaries on reads
>> and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!!
>>
>> Triggered for example with the following VMDK file (partly listed):
>> # Extent description
>> RW 4193792 FLAT "XP-W1-f001.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f002.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f003.vmdk" 0
>> RW 512 FLAT "XP-W1-f004.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f005.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f006.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f007.vmdk" 0
>> RW 512 FLAT "XP-W1-f008.vmdk" 0
>>
>> Patch includes:
>> 1.) Patch fixes wrong calculation on extent boundaries. Especially it fixes
>> the relativeness of the sector number to the current extent.
>> 2.) Added debug code to block.c and to block/vmdk.c to verify correctness
>> 3.) Also optimized code which avoids multiplication and uses shifts.
>>
>> Verfied correctness with:
>> 1.) Converted either with Virtualbox to VDI and then with qemu-img and then
>> with qemu-img only
>> VBoxManage clonehd --format vdi /VM/XP-W/new/XP-W1.vmdk
>> ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi
>> ./qemu-img convert -O raw ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi
>> /root/QEMU/VM-XP-W1/XP-W1-via-VBOX.img
>> md5sum /root/QEMU/VM-XP-W/XP-W1-direct.img
>> md5sum /root/QEMU/VM-XP-W/XP-W1-via-VBOX.img
>> => same MD5 hash
>> 2.) Verified debug log files
>> 3.) Run Windows XP successfully
>> 4.) chkdsk run successfully without any errors
>>
>> Signed-off-by: Gerhard Wiesinger <lists@wiesinger.com>
>> ---
>>   block.c      |  50 +++++++++++++++++++++++
>>   block/vmdk.c | 129
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>   2 files changed, 170 insertions(+), 9 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index da1fdca..69259f2 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -49,6 +49,12 @@
>>   #include <windows.h>
>>   #endif
>>
>> +#if 0
>> +   #define DEBUG_BLOCK
>> +#endif
>> +
>> +#define DEBUG_BLOCK_PREFIX "BLOCK: "
>> +
>>   #define NOT_DONE 0x7fffffff /* used while emulated sync operation in
>> progress */
>>
>>   typedef enum {
>> @@ -789,6 +795,18 @@ int bdrv_open(BlockDriverState *bs, const char
>> *filename, int flags,
>>       int ret;
>>       char tmp_filename[PATH_MAX];
>>
>> +#ifdef DEBUG_BLOCK
>> +    const char *format = "(nil)";
>> +    if (drv) {
>> +        format = drv->format_name;
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_open: filename=%s, BlockDriver=%p, format=%s\n",
>> +           filename, drv, format
>> +           );
>> +#endif
>> +
>>       if (flags & BDRV_O_SNAPSHOT) {
>>           BlockDriverState *bs1;
>>           int64_t total_size;
>> @@ -2004,6 +2022,22 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t
>> sector_num, uint8_t *buf,
>>   int bdrv_read(BlockDriverState *bs, int64_t sector_num,
>>                 uint8_t *buf, int nb_sectors)
>>   {
>> +#ifdef DEBUG_BLOCK
>> +    BlockDriver *drv = bs->drv;
>> +    const char *format_name = "(nil)";
>> +    if (drv) {
>> +        if (drv->format_name) {
>> +            format_name = drv->format_name;
>> +        }
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_read: driver=%s, filename=%s, sector_num=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           format_name, bs->filename, sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       return bdrv_rw_co(bs, sector_num, buf, nb_sectors, false);
>>   }
>>
>> @@ -2060,6 +2094,22 @@ static void set_dirty_bitmap(BlockDriverState *bs,
>> int64_t sector_num,
>>   int bdrv_write(BlockDriverState *bs, int64_t sector_num,
>>                  const uint8_t *buf, int nb_sectors)
>>   {
>> +#ifdef DEBUG_BLOCK
>> +    BlockDriver *drv = bs->drv;
>> +    const char *format_name = "(nil)";
>> +    if (drv) {
>> +        if (drv->format_name) {
>> +            format_name = drv->format_name;
>> +        }
>> +    }
>> +
>> +    printf(DEBUG_BLOCK
>> +           "bdrv_write: driver=%s, filename=%s, sector_num=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           format_name, bs->filename, sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       return bdrv_rw_co(bs, sector_num, (uint8_t *)buf, nb_sectors, true);
>>   }
>>
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index 1a80e5a..92ab92c 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -29,6 +29,15 @@
>>   #include "migration.h"
>>   #include <zlib.h>
>>
>> +#if 0
>> +    #define DEBUG_VMDK
>> +#endif
>> +
>> +#define DEBUG_VMDK_PREFIX "VMDK: "
>> +#define DEBUG_VMDK_SEPARATOR \
>> +        "########################################" \
>> +        "########################################"
>> +
>>   #define VMDK3_MAGIC (('C' << 24) | ('O' << 16) | ('W' << 8) | 'D')
>>   #define VMDK4_MAGIC (('K' << 24) | ('D' << 16) | ('M' << 8) | 'V')
>>   #define VMDK4_COMPRESSION_DEFLATE 1
>> @@ -377,6 +386,16 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState
>> *bs,
>>       VmdkExtent *extent;
>>       BDRVVmdkState *s = bs->opaque;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_add_extent: flat=%i, sectors=%" PRId64
>> +           ", l1_backup_offset=%" PRId64
>> +           ", l1_size=%u, l2_size=%i, cluster_sectors=%u\n",
>> +           (int)flat, sectors, l1_backup_offset,
>> +           l1_size, l2_size, cluster_sectors
>> +           );
>> +#endif
>> +
>>       s->extents = g_realloc(s->extents,
>>                                 (s->num_extents + 1) * sizeof(VmdkExtent));
>>       extent = &s->extents[s->num_extents];
>> @@ -674,6 +693,14 @@ static int vmdk_parse_extents(const char *desc,
>> BlockDriverState *bs,
>>               return ret;
>>           }
>>
>> +#ifdef DEBUG_VMDK
>> +        printf(DEBUG_VMDK_PREFIX
>> +               "valid extent found: access=%s, sectors=%" PRId64
>> +               ", type=%s, filename=%s, flat_offset=%" PRId64 "\n",
>> +               access, sectors, type, fname, flat_offset
>> +               );
>> +#endif
>> +
>>           /* save to extents array */
>>           if (!strcmp(type, "FLAT")) {
>>               /* FLAT extent */
>> @@ -704,6 +731,45 @@ next_line:
>>       return 0;
>>   }
>>
>> +#ifdef DEBUG_VMDK
>> +static void vmdk_debug_print_one_extent(VmdkExtent *extent)
>> +{
>> +    printf(DEBUG_VMDK_PREFIX DEBUG_VMDK_SEPARATOR "\n");
>> +    printf(DEBUG_VMDK_PREFIX "filename%s\n", extent->file->filename);
>> +    printf(DEBUG_VMDK_PREFIX "flat=%i\n", (int)extent->flat);
>> +    printf(DEBUG_VMDK_PREFIX "compressed=%i\n", (int)extent->compressed);
>> +    printf(DEBUG_VMDK_PREFIX "has_marker=%i\n", (int)extent->has_marker);
>> +    printf(DEBUG_VMDK_PREFIX "sectors=%" PRId64 "\n", extent->sectors);
>> +    printf(DEBUG_VMDK_PREFIX "end_sector=%" PRId64 "\n",
>> extent->end_sector);
>> +    printf(DEBUG_VMDK_PREFIX "flat_start_offset=%" PRId64 "\n",
>> +                             extent->flat_start_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_table_offsett=%" PRId64 "\n",
>> +                             extent->l1_table_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_backup_table_offsett=%"
>> +                             PRId64 "\n", extent->l1_backup_table_offset);
>> +    printf(DEBUG_VMDK_PREFIX "l1_size=%u\n", extent->l1_size);
>> +    printf(DEBUG_VMDK_PREFIX "l1_entry_sectors=%u\n",
>> extent->l1_entry_sectors);
>> +    printf(DEBUG_VMDK_PREFIX "l2_size=%u\n", extent->l2_size);
>> +    printf(DEBUG_VMDK_PREFIX "cluster_sectors=%u\n",
>> extent->cluster_sectors);
>> +    fflush(stdout);
>> +}
>> +
>> +static void vmdk_debug_print_extents(BlockDriverState *bs)
>> +{
>> +    BDRVVmdkState *s = bs->opaque;
>> +    VmdkExtent *extent = &s->extents[0];
>> +
>> +    printf(DEBUG_VMDK_PREFIX DEBUG_VMDK_SEPARATOR "\n");
>> +    printf(DEBUG_VMDK_PREFIX "total_sectors=%" PRId64 "\n",
>> bs->total_sectors);
>> +    printf(DEBUG_VMDK_PREFIX "num_extents=%u\n", s->num_extents);
>> +
>> +    while (extent < &s->extents[s->num_extents]) {
>> +        vmdk_debug_print_one_extent(extent);
>> +        extent++;
>> +    }
>> +}
>> +#endif
>> +
>>   static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
>>                                  int64_t desc_offset)
>>   {
>> @@ -728,7 +794,13 @@ static int vmdk_open_desc_file(BlockDriverState *bs,
>> int flags,
>>           return -ENOTSUP;
>>       }
>>       s->desc_offset = 0;
>> -    return vmdk_parse_extents(buf, bs, bs->file->filename);
>> +    ret = vmdk_parse_extents(buf, bs, bs->file->filename);
>> +
>> +#ifdef DEBUG_VMDK
>> +    vmdk_debug_print_extents(bs);
>> +#endif
>> +
>> +    return ret;
>>   }
>>
>>   static int vmdk_open(BlockDriverState *bs, int flags)
>> @@ -774,6 +846,14 @@ static int get_whole_cluster(BlockDriverState *bs,
>>       /* 128 sectors * 512 bytes each = grain size 64KB */
>>       uint8_t  whole_grain[extent->cluster_sectors * 512];
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_get_whole_cluster: filename=%s, cluster_offset=%" PRId64
>> +           ", offset=%" PRId64 "\n",
>> +           extent->file->filename, cluster_offset, offset
>> +           );
>> +#endif
>> +
>>       /* we will be here if it's first write on non-exist grain(cluster).
>>        * try to read from parent image, if exist */
>>       if (bs->backing_hd) {
>> @@ -1032,18 +1112,35 @@ static int vmdk_read_extent(VmdkExtent *extent,
>> int64_t cluster_offset,
>>       VmdkGrainMarker *marker;
>>       uLongf buf_len;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_read_extent: filename=%s, cluster_offset=%" PRId64
>> +           ", offset_in_cluster=%" PRId64 ", nb_sectors=%i\n",
>> +           extent->file->filename, cluster_offset,
>> +           offset_in_cluster, nb_sectors
>> +           );
>> +#endif
>>
>>       if (!extent->compressed) {
>> +
>> +#ifdef DEBUG_VMDK
>> +        printf(DEBUG_VMDK_PREFIX
>> +               "vmdk_read_extent:bdrv_pread filename=%s, offset=%" PRId64
>> +               ", length=%i\n",
>> +               extent->file->filename, offset_in_cluster, nb_sectors << 9
>> +               );
>> +#endif
>> +
>>           ret = bdrv_pread(extent->file,
>>                             cluster_offset + offset_in_cluster,
>> -                          buf, nb_sectors * 512);
>> -        if (ret == nb_sectors * 512) {
>> +                          buf, nb_sectors << 9);
>> +        if (ret == nb_sectors << 9) {
>>               return 0;
>>           } else {
>>               return -EIO;
>>           }
>>       }
>> -    cluster_bytes = extent->cluster_sectors * 512;
>> +    cluster_bytes = extent->cluster_sectors << 9;
>>       /* Read two clusters in case GrainMarker + compressed data > one
>> cluster */
>>       buf_bytes = cluster_bytes * 2;
>>       cluster_buf = g_malloc(buf_bytes);
>> @@ -1092,9 +1189,18 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>       BDRVVmdkState *s = bs->opaque;
>>       int ret;
>>       uint64_t n, index_in_cluster;
>> +    uint64_t extent_begin_sector, extent_relative_sector_num;
>>       VmdkExtent *extent = NULL;
>>       uint64_t cluster_offset;
>>
>> +#ifdef DEBUG_VMDK
>> +    printf(DEBUG_VMDK_PREFIX
>> +           "vmdk_read: sectornum=%" PRId64
>> +           ", nb_sectors=%i\n",
>> +           sector_num, nb_sectors
>> +           );
>> +#endif
>> +
>>       while (nb_sectors > 0) {
>>           extent = find_extent(s, sector_num, extent);
>>           if (!extent) {
>> @@ -1103,7 +1209,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>           ret = get_cluster_offset(
>>                               bs, extent, NULL,
>>                               sector_num << 9, 0, &cluster_offset);
>> -        index_in_cluster = sector_num % extent->cluster_sectors;
>> +        extent_begin_sector = extent->end_sector - extent->sectors;
>> +        extent_relative_sector_num = sector_num - extent_begin_sector;
>> +        index_in_cluster = extent_relative_sector_num %
>> extent->cluster_sectors;
>>           n = extent->cluster_sectors - index_in_cluster;
>>           if (n > nb_sectors) {
>>               n = nb_sectors;
>> @@ -1119,11 +1227,11 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>                       return ret;
>>                   }
>>               } else {
>> -                memset(buf, 0, 512 * n);
>> +                memset(buf, 0, n << 9);
>>               }
>>           } else {
>>               ret = vmdk_read_extent(extent,
>> -                            cluster_offset, index_in_cluster * 512,
>> +                            cluster_offset, index_in_cluster << 9,
>>                               buf, n);
>>               if (ret) {
>>                   return ret;
>> @@ -1131,7 +1239,7 @@ static int vmdk_read(BlockDriverState *bs, int64_t
>> sector_num,
>>           }
>>           nb_sectors -= n;
>>           sector_num += n;
>> -        buf += n * 512;
>> +        buf += n << 9;
>>       }
>>       return 0;
>>   }
>> @@ -1154,6 +1262,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t
>> sector_num,
>>       VmdkExtent *extent = NULL;
>>       int n, ret;
>>       int64_t index_in_cluster;
>> +    uint64_t extent_begin_sector, extent_relative_sector_num;
>>       uint64_t cluster_offset;
>>       VmdkMetaData m_data;
>>
>> @@ -1196,7 +1305,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t
>> sector_num,
>>           if (ret) {
>>               return -EINVAL;
>>           }
>> -        index_in_cluster = sector_num % extent->cluster_sectors;
>> +        extent_begin_sector = extent->end_sector - extent->sectors;
>> +        extent_relative_sector_num = sector_num - extent_begin_sector;
>> +        index_in_cluster = extent_relative_sector_num %
>> extent->cluster_sectors;
>>           n = extent->cluster_sectors - index_in_cluster;
>>           if (n > nb_sectors) {
>>               n = nb_sectors;
>> --
>> 1.7.11.7
>>

  reply	other threads:[~2012-11-09  6:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-08 19:05 [Qemu-devel] [PATCH] block.c, block/vmdk.c: Fixed major bug in VMDK WRITE and READ handling - FIXES DATA CORRUPTION Gerhard Wiesinger
2012-11-09  5:12 ` Fam Zheng
2012-11-09  6:51   ` Gerhard Wiesinger [this message]
2012-11-09  7:38   ` Lei Li
2012-11-09  8:26 ` Paolo Bonzini
2012-11-10  8:30   ` Gerhard Wiesinger
2012-11-10  8:55     ` Paolo Bonzini
2012-11-10  9:59       ` Gerhard Wiesinger
2012-11-13 12:27         ` Kevin Wolf
2012-11-13 13:36           ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=509CA7F9.7020400@wiesinger.com \
    --to=lists@wiesinger.com \
    --cc=famcool@gmail.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).