qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5
@ 2010-07-14 11:23 Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 01/14] vmdk: fix double free Kevin Wolf
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:23 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Hi Aurelien,

these are the block related patches that have been applied to git master since
0.12.4 was released and that I consider relevant for stable-0.12.

Kevin


The following changes since commit 0c0f53e25cded4b3b1909391f9bf22d334ed8155:

  qemu-options: add documentation for stdio signal=on|off (2010-07-13 21:18:13 +0200)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-stable-0.12

Kevin Wolf (13):
      vmdk: fix double free
      qcow2: Fix creation of large images
      vmdk: Fix COW
      qcow2: Remove abort on free_clusters failure
      block/vdi: Fix image opening and creation for odd disk sizes
      qcow2: Restore L1 entry on l2_allocate failure
      block: Add bdrv_(p)write_sync
      qcow: Use bdrv_(p)write_sync for metadata writes
      qcow2: Use bdrv_(p)write_sync for metadata writes
      vmdk: Use bdrv_(p)write_sync for metadata writes
      vpc: Use bdrv_(p)write_sync for metadata writes
      block: Fix early failure in multiwrite
      block: Handle multiwrite errors only when all requests have completed

Stefan Weil (1):
      block/vpc: Fix conversion from size to disk geometry

 block.c                |   81 +++++++++++++++++++++++++++++++++++++++++------
 block.h                |    4 ++
 block/qcow.c           |   18 ++++++-----
 block/qcow2-cluster.c  |   46 +++++++++++++++------------
 block/qcow2-refcount.c |   33 ++++++++++---------
 block/qcow2-snapshot.c |   23 ++++++-------
 block/qcow2.c          |   41 +++++++++++++++++++-----
 block/vdi.c            |   18 +++++++++--
 block/vmdk.c           |   47 ++++++++++------------------
 block/vpc.c            |   30 ++++++++++--------
 block_int.h            |    1 +
 11 files changed, 219 insertions(+), 123 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 01/14] vmdk: fix double free
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 02/14] qcow2: Fix creation of large images Kevin Wolf
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

fail_gd error case would also free rgd_buf that was already freed

Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit a161329b61106ab093aab6d3227ac85e0b8251a9)

Conflicts:

	block/vmdk.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 4e48622..765e95a 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -285,7 +285,6 @@ static int vmdk_snapshot_create(const char *filename, const char *backing_file)
         goto fail_rgd;
     if (write(snp_fd, rgd_buf, gd_size) == -1)
         goto fail_rgd;
-    qemu_free(rgd_buf);
 
     /* write GD */
     gd_buf = qemu_malloc(gd_size);
@@ -298,6 +297,7 @@ static int vmdk_snapshot_create(const char *filename, const char *backing_file)
     if (write(snp_fd, gd_buf, gd_size) == -1)
         goto fail_gd;
     qemu_free(gd_buf);
+    qemu_free(rgd_buf);
 
     close(p_fd);
     close(snp_fd);
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 02/14] qcow2: Fix creation of large images
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 01/14] vmdk: fix double free Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 03/14] vmdk: Fix COW Kevin Wolf
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

qcow_create2 assumes that the new image will only need one cluster for its
refcount table initially. Obviously that's not true any more when the image is
big enough (exact value depends on the cluster size).

This patch calculates the refcount table size dynamically.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4768fa902c3860f2fe34403e6e1c83bfca6da034)

Conflicts:

	block/qcow2.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c |   41 ++++++++++++++++++++++++++++++++---------
 1 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 5d33d6c..35c05e0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -747,10 +747,11 @@ static int qcow_create2(const char *filename, int64_t total_size,
 {
 
     int fd, header_size, backing_filename_len, l1_size, i, shift, l2_bits;
-    int ref_clusters, backing_format_len = 0;
+    int ref_clusters, reftable_clusters, backing_format_len = 0;
     int rounded_ext_bf_len = 0;
     QCowHeader header;
     uint64_t tmp, offset;
+    uint64_t old_ref_clusters;
     QCowCreateState s1, *s = &s1;
     QCowExtension ext_bf = {0, 0};
 
@@ -809,17 +810,37 @@ static int qcow_create2(const char *filename, int64_t total_size,
     header.l1_size = cpu_to_be32(l1_size);
     offset += align_offset(l1_size * sizeof(uint64_t), s->cluster_size);
 
-    s->refcount_table = qemu_mallocz(s->cluster_size);
+    /* count how many refcount blocks needed */
+
+#define NUM_CLUSTERS(bytes) \
+    (((bytes) + (s->cluster_size) - 1) / (s->cluster_size))
+
+    ref_clusters = NUM_CLUSTERS(NUM_CLUSTERS(offset) * sizeof(uint16_t));
+
+    do {
+        uint64_t image_clusters;
+        old_ref_clusters = ref_clusters;
+
+        /* Number of clusters used for the refcount table */
+        reftable_clusters = NUM_CLUSTERS(ref_clusters * sizeof(uint64_t));
+
+        /* Number of clusters that the whole image will have */
+        image_clusters = NUM_CLUSTERS(offset) + ref_clusters
+            + reftable_clusters;
+
+        /* Number of refcount blocks needed for the image */
+        ref_clusters = NUM_CLUSTERS(image_clusters * sizeof(uint16_t));
+
+    } while (ref_clusters != old_ref_clusters);
+
+    s->refcount_table = qemu_mallocz(reftable_clusters * s->cluster_size);
 
     s->refcount_table_offset = offset;
     header.refcount_table_offset = cpu_to_be64(offset);
-    header.refcount_table_clusters = cpu_to_be32(1);
-    offset += s->cluster_size;
+    header.refcount_table_clusters = cpu_to_be32(reftable_clusters);
+    offset += (reftable_clusters * s->cluster_size);
     s->refcount_block_offset = offset;
 
-    /* count how many refcount blocks needed */
-    tmp = offset >> s->cluster_bits;
-    ref_clusters = (tmp >> (s->cluster_bits - REFCOUNT_SHIFT)) + 1;
     for (i=0; i < ref_clusters; i++) {
         s->refcount_table[i] = cpu_to_be64(offset);
         offset += s->cluster_size;
@@ -831,7 +852,8 @@ static int qcow_create2(const char *filename, int64_t total_size,
     qcow2_create_refcount_update(s, 0, header_size);
     qcow2_create_refcount_update(s, s->l1_table_offset,
         l1_size * sizeof(uint64_t));
-    qcow2_create_refcount_update(s, s->refcount_table_offset, s->cluster_size);
+    qcow2_create_refcount_update(s, s->refcount_table_offset,
+        reftable_clusters * s->cluster_size);
     qcow2_create_refcount_update(s, s->refcount_block_offset,
         ref_clusters * s->cluster_size);
 
@@ -859,7 +881,8 @@ static int qcow_create2(const char *filename, int64_t total_size,
         write(fd, &tmp, sizeof(tmp));
     }
     lseek(fd, s->refcount_table_offset, SEEK_SET);
-    write(fd, s->refcount_table, s->cluster_size);
+    write(fd, s->refcount_table,
+        reftable_clusters * s->cluster_size);
 
     lseek(fd, s->refcount_block_offset, SEEK_SET);
     write(fd, s->refcount_block, ref_clusters * s->cluster_size);
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 03/14] vmdk: Fix COW
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 01/14] vmdk: fix double free Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 02/14] qcow2: Fix creation of large images Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 04/14] qcow2: Remove abort on free_clusters failure Kevin Wolf
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

When trying to do COW, VMDK wrote the data back to the backing file. This
problem was revealed by the patch that made backing files read-only. This patch
does not only fix the problem, but also simplifies the VMDK code a bit.

This fixes the backing file qemu-iotests cases for VMDK.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c336500df5bf08492f4e7796b2193cd4976f3548)
---
 block/vmdk.c |   35 +++++++++++------------------------
 1 files changed, 11 insertions(+), 24 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 765e95a..d52904a 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -87,14 +87,6 @@ typedef struct VmdkMetaData {
     int valid;
 } VmdkMetaData;
 
-typedef struct ActiveBDRVState{
-    BlockDriverState *hd;            // active image handler
-    uint64_t cluster_offset;         // current write offset
-}ActiveBDRVState;
-
-static ActiveBDRVState activeBDRV;
-
-
 static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
     uint32_t magic;
@@ -458,30 +450,28 @@ static uint64_t get_cluster_offset(BlockDriverState *bs, VmdkMetaData *m_data,
 static int get_whole_cluster(BlockDriverState *bs, uint64_t cluster_offset,
                              uint64_t offset, int allocate)
 {
-    uint64_t parent_cluster_offset;
     BDRVVmdkState *s = bs->opaque;
     uint8_t  whole_grain[s->cluster_sectors*512];        // 128 sectors * 512 bytes each = grain size 64KB
 
     // we will be here if it's first write on non-exist grain(cluster).
     // try to read from parent image, if exist
     if (bs->backing_hd) {
-        BDRVVmdkState *ps = bs->backing_hd->opaque;
+        int ret;
 
         if (!vmdk_is_cid_valid(bs))
             return -1;
 
-        parent_cluster_offset = get_cluster_offset(bs->backing_hd, NULL,
-            offset, allocate);
-
-        if (parent_cluster_offset) {
-            BDRVVmdkState *act_s = activeBDRV.hd->opaque;
-
-            if (bdrv_pread(ps->hd, parent_cluster_offset, whole_grain, ps->cluster_sectors*512) != ps->cluster_sectors*512)
-                return -1;
+        ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
+            s->cluster_sectors);
+        if (ret < 0) {
+            return -1;
+        }
 
-            //Write grain only into the active image
-            if (bdrv_pwrite(act_s->hd, activeBDRV.cluster_offset << 9, whole_grain, sizeof(whole_grain)) != sizeof(whole_grain))
-                return -1;
+        //Write grain only into the active image
+        ret = bdrv_write(s->hd, cluster_offset, whole_grain,
+            s->cluster_sectors);
+        if (ret < 0) {
+            return -1;
         }
     }
     return 0;
@@ -567,9 +557,6 @@ static uint64_t get_cluster_offset(BlockDriverState *bs, VmdkMetaData *m_data,
             cluster_offset >>= 9;
             tmp = cpu_to_le32(cluster_offset);
             l2_table[l2_index] = tmp;
-            // Save the active image state
-            activeBDRV.cluster_offset = cluster_offset;
-            activeBDRV.hd = bs;
         }
         /* First of all we write grain itself, to avoid race condition
          * that may to corrupt the image.
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 04/14] qcow2: Remove abort on free_clusters failure
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (2 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 03/14] vmdk: Fix COW Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 05/14] block/vpc: Fix conversion from size to disk geometry Kevin Wolf
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

While it's true that during regular operation free_clusters failure would be a
bug, an I/O error can always happen. There's no need to kill the VM, the worst
thing that can happen (and it will) is that we leak some clusters.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 003fad6e2cae5311d3aea996388c90e3ab17de90)
---
 block/qcow2-refcount.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 465d5d3..ff2cf6d 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -631,7 +631,7 @@ void qcow2_free_clusters(BlockDriverState *bs,
     ret = update_refcount(bs, offset, size, -1);
     if (ret < 0) {
         fprintf(stderr, "qcow2_free_clusters failed: %s\n", strerror(-ret));
-        abort();
+        /* TODO Remember the clusters to free them later and avoid leaking */
     }
 }
 
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 05/14] block/vpc: Fix conversion from size to disk geometry
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (3 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 04/14] qcow2: Remove abort on free_clusters failure Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 06/14] block/vdi: Fix image opening and creation for odd disk sizes Kevin Wolf
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

From: Stefan Weil <weil@mail.berlios.de>

The VHD algorithm calculates a disk geometry
which is usually smaller than the requested size.

QEMU tried to round up but failed for certain sizes:

qemu-img create -f vpc disk.vpc 9437184
would create an image with 9435136 bytes
(which is too small for qemu-img convert).

Instead of hacking the geometry algorithm, the patch
increases the number of sectors until we get enough
sectors.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit dede4188cc817a039154ed2ecd7f3285f6b94056)
---
 block/vpc.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 950ad58..afe6f1a 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -470,9 +470,7 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
         }
     }
 
-    // Note: Rounding up deviates from the Virtual PC behaviour
-    // However, we need this to avoid truncating images in qemu-img convert
-    *cyls = (cyls_times_heads + *heads - 1) / *heads;
+    *cyls = cyls_times_heads / *heads;
 
     return 0;
 }
@@ -484,9 +482,9 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
     struct vhd_dyndisk_header* dyndisk_header =
         (struct vhd_dyndisk_header*) buf;
     int fd, i;
-    uint16_t cyls;
-    uint8_t heads;
-    uint8_t secs_per_cyl;
+    uint16_t cyls = 0;
+    uint8_t heads = 0;
+    uint8_t secs_per_cyl = 0;
     size_t block_size, num_bat_entries;
     int64_t total_sectors = 0;
 
@@ -503,9 +501,14 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
     if (fd < 0)
         return -EIO;
 
-    // Calculate matching total_size and geometry
-    if (calculate_geometry(total_sectors, &cyls, &heads, &secs_per_cyl))
-        return -EFBIG;
+    /* Calculate matching total_size and geometry. Increase the number of
+       sectors requested until we get enough (or fail). */
+    for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
+        if (calculate_geometry(total_sectors + i,
+                               &cyls, &heads, &secs_per_cyl)) {
+            return -EFBIG;
+        }
+    }
     total_sectors = (int64_t) cyls * heads * secs_per_cyl;
 
     // Prepare the Hard Disk Footer
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 06/14] block/vdi: Fix image opening and creation for odd disk sizes
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (4 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 05/14] block/vpc: Fix conversion from size to disk geometry Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 07/14] qcow2: Restore L1 entry on l2_allocate failure Kevin Wolf
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

The fix is based on a patch from Kevin Wolf. Here his comment:

"The number of blocks needs to be rounded up to cover all of the virtual hard
disk. Without this fix, we can't even open our own images if their size is not
a multiple of the block size."

While Kevin's patch addressed vdi_create, my modification also fixes
vdi_open which now accepts images with odd disk sizes.

v3:
Don't allow reading of disk images with too large disk sizes.
Neither VBoxManage nor old versions of qemu-img read such images.
This change requires rounding of odd disk sizes before we do the checks.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: François Revol <revol@free.fr>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f21dc3a4652eeb82117d7d55d975278fe1444b26)

Conflicts:

	block/vdi.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vdi.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 45aa81c..64bf66f 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -399,6 +399,15 @@ static int vdi_open(BlockDriverState *bs, const char *filename, int flags)
     vdi_header_print(&header);
 #endif
 
+    if (header.disk_size % SECTOR_SIZE != 0) {
+        /* 'VBoxManage convertfromraw' can create images with odd disk sizes.
+           We accept them but round the disk size to the next multiple of
+           SECTOR_SIZE. */
+        logout("odd disk size %" PRIu64 " B, round up\n", header.disk_size);
+        header.disk_size += SECTOR_SIZE - 1;
+        header.disk_size &= ~(SECTOR_SIZE - 1);
+    }
+
     if (header.version != VDI_VERSION_1_1) {
         logout("unsupported version %u.%u\n",
                header.version >> 16, header.version & 0xffff);
@@ -417,9 +426,9 @@ static int vdi_open(BlockDriverState *bs, const char *filename, int flags)
     } else if (header.block_size != 1 * MiB) {
         logout("unsupported block size %u B\n", header.block_size);
         goto fail;
-    } else if (header.disk_size !=
+    } else if (header.disk_size >
                (uint64_t)header.blocks_in_image * header.block_size) {
-        logout("unexpected block number %u B\n", header.blocks_in_image);
+        logout("unsupported disk size %" PRIu64 " B\n", header.disk_size);
         goto fail;
     } else if (!uuid_is_null(header.uuid_link)) {
         logout("link uuid != 0, unsupported\n");
@@ -831,7 +840,10 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
         return -errno;
     }
 
-    blocks = bytes / block_size;
+    /* We need enough blocks to store the given disk size,
+       so always round up. */
+    blocks = (bytes + block_size - 1) / block_size;
+
     bmap_size = blocks * sizeof(uint32_t);
     bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1));
 
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 07/14] qcow2: Restore L1 entry on l2_allocate failure
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (5 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 06/14] block/vdi: Fix image opening and creation for odd disk sizes Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 08/14] block: Add bdrv_(p)write_sync Kevin Wolf
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

If writing the L1 table to disk failed, we need to restore its old content in
memory to avoid inconsistencies.

Reported-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 68dba0bf455e60061bb3c9c40ef0d82916372664)
---
 block/qcow2-cluster.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b7a5b35..8c67e3c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -266,6 +266,7 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
     return l2_table;
 
 fail:
+    s->l1_table[l1_index] = old_l2_offset;
     qcow2_l2_cache_reset(bs);
     return NULL;
 }
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 08/14] block: Add bdrv_(p)write_sync
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (6 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 07/14] qcow2: Restore L1 entry on l2_allocate failure Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 09/14] qcow: Use bdrv_(p)write_sync for metadata writes Kevin Wolf
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Add new functions that write and flush the written data to disk immediately.
This is what needs to be used for image format metadata to maintain integrity
for cache=... modes that don't use O_DSYNC. (Actually, we only need barriers,
and therefore the functions are defined as such, but flushes is what is
implemented in this patch - we can try to change that later)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f08145fe16470aca09304099888f68cfbc5d1de7)
---
 block.c     |   39 +++++++++++++++++++++++++++++++++++++++
 block.h     |    4 ++++
 block_int.h |    1 +
 3 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 298414c..e3b3a55 100644
--- a/block.c
+++ b/block.c
@@ -452,6 +452,8 @@ int bdrv_open2(BlockDriverState *bs, const char *filename, int flags,
             (flags & (BDRV_O_CACHE_MASK|BDRV_O_NATIVE_AIO));
     else
         open_flags = flags & ~(BDRV_O_FILE | BDRV_O_SNAPSHOT);
+
+    bs->open_flags = open_flags;
     if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv))
         ret = -ENOTSUP;
     else
@@ -779,6 +781,43 @@ int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
     return count1;
 }
 
+/*
+ * Writes to the file and ensures that no writes are reordered across this
+ * request (acts as a barrier)
+ *
+ * Returns 0 on success, -errno in error cases.
+ */
+int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
+    const void *buf, int count)
+{
+    int ret;
+
+    ret = bdrv_pwrite(bs, offset, buf, count);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* No flush needed for cache=writethrough, it uses O_DSYNC */
+    if ((bs->open_flags & BDRV_O_CACHE_MASK) != 0) {
+        bdrv_flush(bs);
+    }
+
+    return 0;
+}
+
+/*
+ * Writes to the file and ensures that no writes are reordered across this
+ * request (acts as a barrier)
+ *
+ * Returns 0 on success, -errno in error cases.
+ */
+int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num,
+    const uint8_t *buf, int nb_sectors)
+{
+    return bdrv_pwrite_sync(bs, BDRV_SECTOR_SIZE * sector_num,
+        buf, BDRV_SECTOR_SIZE * nb_sectors);
+}
+
 /**
  * Truncate file to 'offset' bytes (needed only for file protocols)
  */
diff --git a/block.h b/block.h
index fa51ddf..762d88a 100644
--- a/block.h
+++ b/block.h
@@ -77,6 +77,10 @@ int bdrv_pread(BlockDriverState *bs, int64_t offset,
                void *buf, int count);
 int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
                 const void *buf, int count);
+int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
+    const void *buf, int count);
+int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num,
+    const uint8_t *buf, int nb_sectors);
 int bdrv_truncate(BlockDriverState *bs, int64_t offset);
 int64_t bdrv_getlength(BlockDriverState *bs);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
diff --git a/block_int.h b/block_int.h
index 9a3b2e0..631e8c5 100644
--- a/block_int.h
+++ b/block_int.h
@@ -127,6 +127,7 @@ struct BlockDriverState {
     int64_t total_sectors; /* if we are reading a disk image, give its
                               size in sectors */
     int read_only; /* if true, the media is read only */
+    int open_flags; /* flags used to open the file, re-used for re-open */
     int removable; /* if true, the media can be removed */
     int locked;    /* if true, the media cannot temporarily be ejected */
     int encrypted; /* if true, the media is encrypted */
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 09/14] qcow: Use bdrv_(p)write_sync for metadata writes
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (7 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 08/14] block: Add bdrv_(p)write_sync Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 10/14] qcow2: " Kevin Wolf
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5e5557d97026d1d3325e0e7b0ba593366da2f3dc)

Conflicts:

	block/qcow.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 7fc85ae..6a4c30f 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -277,8 +277,9 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
         /* update the L1 entry */
         s->l1_table[l1_index] = l2_offset;
         tmp = cpu_to_be64(l2_offset);
-        if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp),
-                        &tmp, sizeof(tmp)) != sizeof(tmp))
+        if (bdrv_pwrite_sync(s->hd,
+                s->l1_table_offset + l1_index * sizeof(tmp),
+                &tmp, sizeof(tmp)) < 0)
             return 0;
         new_l2_table = 1;
     }
@@ -306,8 +307,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
     l2_table = s->l2_cache + (min_index << s->l2_bits);
     if (new_l2_table) {
         memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
-        if (bdrv_pwrite(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
-            s->l2_size * sizeof(uint64_t))
+        if (bdrv_pwrite_sync(s->hd, l2_offset, l2_table,
+                s->l2_size * sizeof(uint64_t)) < 0)
             return 0;
     } else {
         if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
@@ -372,8 +373,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
         /* update L2 table */
         tmp = cpu_to_be64(cluster_offset);
         l2_table[l2_index] = tmp;
-        if (bdrv_pwrite(s->hd,
-                        l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)) != sizeof(tmp))
+        if (bdrv_pwrite_sync(s->hd, l2_offset + l2_index * sizeof(tmp),
+                &tmp, sizeof(tmp)) < 0)
             return 0;
     }
     return cluster_offset;
@@ -821,8 +822,9 @@ static int qcow_make_empty(BlockDriverState *bs)
     int ret;
 
     memset(s->l1_table, 0, l1_length);
-    if (bdrv_pwrite(s->hd, s->l1_table_offset, s->l1_table, l1_length) < 0)
-	return -1;
+    if (bdrv_pwrite_sync(s->hd, s->l1_table_offset, s->l1_table,
+            l1_length) < 0)
+        return -1;
     ret = bdrv_truncate(s->hd, s->l1_table_offset + l1_length);
     if (ret < 0)
         return ret;
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 10/14] qcow2: Use bdrv_(p)write_sync for metadata writes
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (8 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 09/14] qcow: Use bdrv_(p)write_sync for metadata writes Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 11/14] vmdk: " Kevin Wolf
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8b3b720620a1137a1b794fc3ed64734236f94e06)

Conflicts:

	block/qcow2-cluster.c
	block/qcow2-refcount.c
	block/qcow2-snapshot.c
	block/qcow2.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c  |   45 ++++++++++++++++++++++++---------------------
 block/qcow2-refcount.c |   31 ++++++++++++++++---------------
 block/qcow2-snapshot.c |   23 +++++++++++------------
 3 files changed, 51 insertions(+), 48 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8c67e3c..0a555dc 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -62,8 +62,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 
     for(i = 0; i < s->l1_size; i++)
         new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
-    ret = bdrv_pwrite(s->hd, new_l1_table_offset, new_l1_table, new_l1_size2);
-    if (ret != new_l1_size2)
+    ret = bdrv_pwrite_sync(s->hd, new_l1_table_offset, new_l1_table, new_l1_size2);
+    if (ret < 0)
         goto fail;
     for(i = 0; i < s->l1_size; i++)
         new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
@@ -71,8 +71,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
     /* set new table */
     cpu_to_be32w((uint32_t*)data, new_l1_size);
     cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset);
-    ret = bdrv_pwrite(s->hd, offsetof(QCowHeader, l1_size), data,sizeof(data));
-    if (ret != sizeof(data)) {
+    ret = bdrv_pwrite_sync(s->hd, offsetof(QCowHeader, l1_size), data,sizeof(data));
+    if (ret < 0) {
         goto fail;
     }
     qemu_free(s->l1_table);
@@ -84,7 +84,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
  fail:
     qemu_free(new_l1_table);
     qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
-    return ret < 0 ? ret : -EIO;
+    return ret;
 }
 
 void qcow2_l2_cache_reset(BlockDriverState *bs)
@@ -188,17 +188,17 @@ static int write_l1_entry(BDRVQcowState *s, int l1_index)
 {
     uint64_t buf[L1_ENTRIES_PER_SECTOR];
     int l1_start_index;
-    int i;
+    int i, ret;
 
     l1_start_index = l1_index & ~(L1_ENTRIES_PER_SECTOR - 1);
     for (i = 0; i < L1_ENTRIES_PER_SECTOR; i++) {
         buf[i] = cpu_to_be64(s->l1_table[l1_start_index + i]);
     }
 
-    if (bdrv_pwrite(s->hd, s->l1_table_offset + 8 * l1_start_index,
-        buf, sizeof(buf)) != sizeof(buf))
-    {
-        return -1;
+    ret = bdrv_pwrite_sync(s->hd, s->l1_table_offset + 8 * l1_start_index,
+        buf, sizeof(buf));
+    if (ret < 0) {
+        return ret;
     }
 
     return 0;
@@ -221,6 +221,7 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
     uint64_t old_l2_offset;
     uint64_t *l2_table;
     int64_t l2_offset;
+    int ret;
 
     old_l2_offset = s->l1_table[l1_index];
 
@@ -247,10 +248,11 @@ static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
             goto fail;
     }
     /* write the l2 table to the file */
-    if (bdrv_pwrite(s->hd, l2_offset,
-                    l2_table, s->l2_size * sizeof(uint64_t)) !=
-        s->l2_size * sizeof(uint64_t))
+    ret = bdrv_pwrite_sync(s->hd, l2_offset, l2_table,
+        s->l2_size * sizeof(uint64_t));
+    if (ret < 0) {
         goto fail;
+    }
 
     /* update the L1 entry */
     s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
@@ -384,8 +386,8 @@ static int copy_sectors(BlockDriverState *bs, uint64_t start_sect,
                         s->cluster_data, n, 1,
                         &s->aes_encrypt_key);
     }
-    ret = bdrv_write(s->hd, (cluster_offset >> 9) + n_start,
-                     s->cluster_data, n);
+    ret = bdrv_write_sync(s->hd, (cluster_offset >> 9) + n_start,
+        s->cluster_data, n);
     if (ret < 0)
         return ret;
     return 0;
@@ -597,10 +599,10 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     /* compressed clusters never have the copied flag */
 
     l2_table[l2_index] = cpu_to_be64(cluster_offset);
-    if (bdrv_pwrite(s->hd,
+    if (bdrv_pwrite_sync(s->hd,
                     l2_offset + l2_index * sizeof(uint64_t),
                     l2_table + l2_index,
-                    sizeof(uint64_t)) != sizeof(uint64_t))
+                    sizeof(uint64_t)) < 0)
         return 0;
 
     return cluster_offset;
@@ -618,11 +620,12 @@ static int write_l2_entries(BDRVQcowState *s, uint64_t *l2_table,
     int start_offset = (8 * l2_index) & ~511;
     int end_offset = (8 * (l2_index + num) + 511) & ~511;
     size_t len = end_offset - start_offset;
+    int ret;
 
-    if (bdrv_pwrite(s->hd, l2_offset + start_offset, &l2_table[l2_start_index],
-        len) != len)
-    {
-        return -1;
+    ret = bdrv_pwrite_sync(s->hd, l2_offset + start_offset,
+        &l2_table[l2_start_index], len);
+    if (ret < 0) {
+        return ret;
     }
 
     return 0;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ff2cf6d..06998da 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -42,8 +42,8 @@ static int write_refcount_block(BDRVQcowState *s)
         return 0;
     }
 
-    if (bdrv_pwrite(s->hd, s->refcount_block_cache_offset,
-            s->refcount_block_cache, size) != size)
+    if (bdrv_pwrite_sync(s->hd, s->refcount_block_cache_offset,
+            s->refcount_block_cache, size) < 0)
     {
         return -EIO;
     }
@@ -246,7 +246,7 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t cluster_index)
     }
 
     /* Now the new refcount block needs to be written to disk */
-    ret = bdrv_pwrite(s->hd, new_block, s->refcount_block_cache,
+    ret = bdrv_pwrite_sync(s->hd, new_block, s->refcount_block_cache,
         s->cluster_size);
     if (ret < 0) {
         goto fail_block;
@@ -255,7 +255,7 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t cluster_index)
     /* If the refcount table is big enough, just hook the block up there */
     if (refcount_table_index < s->refcount_table_size) {
         uint64_t data64 = cpu_to_be64(new_block);
-        ret = bdrv_pwrite(s->hd,
+        ret = bdrv_pwrite_sync(s->hd,
             s->refcount_table_offset + refcount_table_index * sizeof(uint64_t),
             &data64, sizeof(data64));
         if (ret < 0) {
@@ -332,7 +332,7 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t cluster_index)
     }
 
     /* Write refcount blocks to disk */
-    ret = bdrv_pwrite(s->hd, meta_offset, new_blocks,
+    ret = bdrv_pwrite_sync(s->hd, meta_offset, new_blocks,
         blocks_clusters * s->cluster_size);
     qemu_free(new_blocks);
     if (ret < 0) {
@@ -344,7 +344,7 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t cluster_index)
         cpu_to_be64s(&new_table[i]);
     }
 
-    ret = bdrv_pwrite(s->hd, table_offset, new_table,
+    ret = bdrv_pwrite_sync(s->hd, table_offset, new_table,
         table_size * sizeof(uint64_t));
     if (ret < 0) {
         goto fail_table;
@@ -358,7 +358,7 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t cluster_index)
     uint8_t data[12];
     cpu_to_be64w((uint64_t*)data, table_offset);
     cpu_to_be32w((uint32_t*)(data + 8), table_clusters);
-    ret = bdrv_pwrite(s->hd, offsetof(QCowHeader, refcount_table_offset),
+    ret = bdrv_pwrite_sync(s->hd, offsetof(QCowHeader, refcount_table_offset),
         data, sizeof(data));
     if (ret < 0) {
         goto fail_table;
@@ -397,6 +397,7 @@ static int write_refcount_block_entries(BDRVQcowState *s,
     int64_t refcount_block_offset, int first_index, int last_index)
 {
     size_t size;
+    int ret;
 
     if (cache_refcount_updates) {
         return 0;
@@ -411,11 +412,11 @@ static int write_refcount_block_entries(BDRVQcowState *s,
         & ~(REFCOUNTS_PER_SECTOR - 1);
 
     size = (last_index - first_index) << REFCOUNT_SHIFT;
-    if (bdrv_pwrite(s->hd,
+    ret = bdrv_pwrite_sync(s->hd,
         refcount_block_offset + (first_index << REFCOUNT_SHIFT),
-        &s->refcount_block_cache[first_index], size) != size)
-    {
-        return -EIO;
+        &s->refcount_block_cache[first_index], size);
+    if (ret < 0) {
+        return ret;
     }
 
     return 0;
@@ -771,8 +772,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                 }
             }
             if (l2_modified) {
-                if (bdrv_pwrite(s->hd,
-                                l2_offset, l2_table, l2_size) != l2_size)
+                if (bdrv_pwrite_sync(s->hd,
+                                l2_offset, l2_table, l2_size) < 0)
                     goto fail;
             }
 
@@ -793,8 +794,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     if (l1_modified) {
         for(i = 0; i < l1_size; i++)
             cpu_to_be64s(&l1_table[i]);
-        if (bdrv_pwrite(s->hd, l1_table_offset, l1_table,
-                        l1_size2) != l1_size2)
+        if (bdrv_pwrite_sync(s->hd, l1_table_offset, l1_table,
+                        l1_size2) < 0)
             goto fail;
         for(i = 0; i < l1_size; i++)
             be64_to_cpus(&l1_table[i]);
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 8ddaea2..64af361 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -158,25 +158,25 @@ static int qcow_write_snapshots(BlockDriverState *bs)
         h.id_str_size = cpu_to_be16(id_str_size);
         h.name_size = cpu_to_be16(name_size);
         offset = align_offset(offset, 8);
-        if (bdrv_pwrite(s->hd, offset, &h, sizeof(h)) != sizeof(h))
+        if (bdrv_pwrite_sync(s->hd, offset, &h, sizeof(h)) < 0)
             goto fail;
         offset += sizeof(h);
-        if (bdrv_pwrite(s->hd, offset, sn->id_str, id_str_size) != id_str_size)
+        if (bdrv_pwrite_sync(s->hd, offset, sn->id_str, id_str_size) < 0)
             goto fail;
         offset += id_str_size;
-        if (bdrv_pwrite(s->hd, offset, sn->name, name_size) != name_size)
+        if (bdrv_pwrite_sync(s->hd, offset, sn->name, name_size) < 0)
             goto fail;
         offset += name_size;
     }
 
     /* update the various header fields */
     data64 = cpu_to_be64(snapshots_offset);
-    if (bdrv_pwrite(s->hd, offsetof(QCowHeader, snapshots_offset),
-                    &data64, sizeof(data64)) != sizeof(data64))
+    if (bdrv_pwrite_sync(s->hd, offsetof(QCowHeader, snapshots_offset),
+                    &data64, sizeof(data64)) < 0)
         goto fail;
     data32 = cpu_to_be32(s->nb_snapshots);
-    if (bdrv_pwrite(s->hd, offsetof(QCowHeader, nb_snapshots),
-                    &data32, sizeof(data32)) != sizeof(data32))
+    if (bdrv_pwrite_sync(s->hd, offsetof(QCowHeader, nb_snapshots),
+                    &data32, sizeof(data32)) < 0)
         goto fail;
 
     /* free the old snapshot table */
@@ -284,9 +284,8 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     for(i = 0; i < s->l1_size; i++) {
         l1_table[i] = cpu_to_be64(s->l1_table[i]);
     }
-    if (bdrv_pwrite(s->hd, sn->l1_table_offset,
-                    l1_table, s->l1_size * sizeof(uint64_t)) !=
-        (s->l1_size * sizeof(uint64_t)))
+    if (bdrv_pwrite_sync(s->hd, sn->l1_table_offset,
+                    l1_table, s->l1_size * sizeof(uint64_t)) < 0)
         goto fail;
     qemu_free(l1_table);
     l1_table = NULL;
@@ -335,8 +334,8 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     if (bdrv_pread(s->hd, sn->l1_table_offset,
                    s->l1_table, l1_size2) != l1_size2)
         goto fail;
-    if (bdrv_pwrite(s->hd, s->l1_table_offset,
-                    s->l1_table, l1_size2) != l1_size2)
+    if (bdrv_pwrite_sync(s->hd, s->l1_table_offset,
+                    s->l1_table, l1_size2) < 0)
         goto fail;
     for(i = 0;i < s->l1_size; i++) {
         be64_to_cpus(&s->l1_table[i]);
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 11/14] vmdk: Use bdrv_(p)write_sync for metadata writes
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (9 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 10/14] qcow2: " Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 12/14] vpc: " Kevin Wolf
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b8852e87d9d113096342c3e0977266cda0fe9ee5)

Conflicts:

	block/vmdk.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vmdk.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index d52904a..cd87279 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -153,7 +153,7 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
         pstrcat(desc, sizeof(desc), tmp_desc);
     }
 
-    if (bdrv_pwrite(s->hd, 0x200, desc, DESC_SIZE) != DESC_SIZE)
+    if (bdrv_pwrite_sync(s->hd, 0x200, desc, DESC_SIZE) < 0)
         return -1;
     return 0;
 }
@@ -482,14 +482,14 @@ static int vmdk_L2update(BlockDriverState *bs, VmdkMetaData *m_data)
     BDRVVmdkState *s = bs->opaque;
 
     /* update L2 table */
-    if (bdrv_pwrite(s->hd, ((int64_t)m_data->l2_offset * 512) + (m_data->l2_index * sizeof(m_data->offset)),
-                    &(m_data->offset), sizeof(m_data->offset)) != sizeof(m_data->offset))
+    if (bdrv_pwrite_sync(s->hd, ((int64_t)m_data->l2_offset * 512) + (m_data->l2_index * sizeof(m_data->offset)),
+                    &(m_data->offset), sizeof(m_data->offset)) < 0)
         return -1;
     /* update backup L2 table */
     if (s->l1_backup_table_offset != 0) {
         m_data->l2_offset = s->l1_backup_table[m_data->l1_index];
-        if (bdrv_pwrite(s->hd, ((int64_t)m_data->l2_offset * 512) + (m_data->l2_index * sizeof(m_data->offset)),
-                        &(m_data->offset), sizeof(m_data->offset)) != sizeof(m_data->offset))
+        if (bdrv_pwrite_sync(s->hd, ((int64_t)m_data->l2_offset * 512) + (m_data->l2_index * sizeof(m_data->offset)),
+                        &(m_data->offset), sizeof(m_data->offset)) < 0)
             return -1;
     }
 
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 12/14] vpc: Use bdrv_(p)write_sync for metadata writes
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (10 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 11/14] vmdk: " Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 13/14] block: Fix early failure in multiwrite Kevin Wolf
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 078a458e077d6b0db262c4b05fee51d01de2d1d2)

Conflicts:

	block/vpc.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/vpc.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index afe6f1a..9e0acf4 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -266,7 +266,7 @@ static inline int64_t get_sector_offset(BlockDriverState *bs,
 
         s->last_bitmap_offset = bitmap_offset;
         memset(bitmap, 0xff, s->bitmap_size);
-        bdrv_pwrite(s->hd, bitmap_offset, bitmap, s->bitmap_size);
+        bdrv_pwrite_sync(s->hd, bitmap_offset, bitmap, s->bitmap_size);
     }
 
 //    printf("sector: %" PRIx64 ", index: %x, offset: %x, bioff: %" PRIx64 ", bloff: %" PRIx64 "\n",
@@ -316,7 +316,7 @@ static int rewrite_footer(BlockDriverState* bs)
     BDRVVPCState *s = bs->opaque;
     int64_t offset = s->free_data_block_offset;
 
-    ret = bdrv_pwrite(s->hd, offset, s->footer_buf, HEADER_SIZE);
+    ret = bdrv_pwrite_sync(s->hd, offset, s->footer_buf, HEADER_SIZE);
     if (ret < 0)
         return ret;
 
@@ -351,7 +351,8 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
 
     // Initialize the block's bitmap
     memset(bitmap, 0xff, s->bitmap_size);
-    bdrv_pwrite(s->hd, s->free_data_block_offset, bitmap, s->bitmap_size);
+    bdrv_pwrite_sync(s->hd, s->free_data_block_offset, bitmap,
+        s->bitmap_size);
 
     // Write new footer (the old one will be overwritten)
     s->free_data_block_offset += s->block_size + s->bitmap_size;
@@ -362,7 +363,7 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
     // Write BAT entry to disk
     bat_offset = s->bat_offset + (4 * index);
     bat_value = be32_to_cpu(s->pagetable[index]);
-    ret = bdrv_pwrite(s->hd, bat_offset, &bat_value, 4);
+    ret = bdrv_pwrite_sync(s->hd, bat_offset, &bat_value, 4);
     if (ret < 0)
         goto fail;
 
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 13/14] block: Fix early failure in multiwrite
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (11 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 12/14] vpc: " Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 14/14] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
  2010-07-14 12:50 ` [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Aurelien Jarno
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

bdrv_aio_writev may call the callback immediately (and it will commonly do so
in error cases). Current code doesn't consider this. For details see the
comment added by this patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 453f9a1652629e5805995b165be2e634c8487139)

Conflicts:

	block.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c |   37 ++++++++++++++++++++++++++++++-------
 1 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index e3b3a55..80f2fae 100644
--- a/block.c
+++ b/block.c
@@ -1802,8 +1802,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
     // Check for mergable requests
     num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
 
-    // Run the aio requests
+    /*
+     * Run the aio requests. As soon as one request can't be submitted
+     * successfully, fail all requests that are not yet submitted (we must
+     * return failure for all requests anyway)
+     *
+     * num_requests cannot be set to the right value immediately: If
+     * bdrv_aio_writev fails for some request, num_requests would be too high
+     * and therefore multiwrite_cb() would never recognize the multiwrite
+     * request as completed. We also cannot use the loop variable i to set it
+     * when the first request fails because the callback may already have been
+     * called for previously submitted requests. Thus, num_requests must be
+     * incremented for each request that is submitted.
+     *
+     * The problem that callbacks may be called early also means that we need
+     * to take care that num_requests doesn't become 0 before all requests are
+     * submitted - multiwrite_cb() would consider the multiwrite request
+     * completed. A dummy request that is "completed" by a manual call to
+     * multiwrite_cb() takes care of this.
+     */
+    mcb->num_requests = 1;
+
     for (i = 0; i < num_reqs; i++) {
+        mcb->num_requests++;
         acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov,
             reqs[i].nb_sectors, multiwrite_cb, mcb);
 
@@ -1811,23 +1832,25 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
             // We can only fail the whole thing if no request has been
             // submitted yet. Otherwise we'll wait for the submitted AIOs to
             // complete and report the error in the callback.
-            if (mcb->num_requests == 0) {
-                reqs[i].error = -EIO;
+            if (i == 0) {
                 goto fail;
             } else {
-                mcb->num_requests++;
                 multiwrite_cb(mcb, -EIO);
                 break;
             }
-        } else {
-            mcb->num_requests++;
         }
     }
 
+    /* Complete the dummy request */
+    multiwrite_cb(mcb, 0);
+
     return 0;
 
 fail:
-    free(mcb);
+    for (i = 0; i < mcb->num_callbacks; i++) {
+        reqs[i].error = -EIO;
+    }
+    qemu_free(mcb);
     return -1;
 }
 
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [STABLE][PATCH 14/14] block: Handle multiwrite errors only when all requests have completed
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (12 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 13/14] block: Fix early failure in multiwrite Kevin Wolf
@ 2010-07-14 11:24 ` Kevin Wolf
  2010-07-14 12:50 ` [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Aurelien Jarno
  14 siblings, 0 replies; 16+ messages in thread
From: Kevin Wolf @ 2010-07-14 11:24 UTC (permalink / raw)
  To: aurelien; +Cc: kwolf, qemu-devel

Don't try to be clever by freeing all temporary data and calling all callbacks
when the return value (an error) is certain. Doing so has at least two
important problems:

* The temporary data that is freed (qiov, possibly zero buffer) is still used
  by the requests that have not yet completed.
* Calling the callbacks for all requests in the multiwrite means for the caller
  that it may free buffers etc. which are still in use.

Just remember the error value and do the cleanup when all requests have
completed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit de189a1b4a471d37a2909e97646654fc9751b52f)
---
 block.c |    5 +----
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 80f2fae..1694780 100644
--- a/block.c
+++ b/block.c
@@ -1661,14 +1661,11 @@ static void multiwrite_cb(void *opaque, int ret)
 
     if (ret < 0 && !mcb->error) {
         mcb->error = ret;
-        multiwrite_user_cb(mcb);
     }
 
     mcb->num_requests--;
     if (mcb->num_requests == 0) {
-        if (mcb->error == 0) {
-            multiwrite_user_cb(mcb);
-        }
+        multiwrite_user_cb(mcb);
         qemu_free(mcb);
     }
 }
-- 
1.7.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5
  2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
                   ` (13 preceding siblings ...)
  2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 14/14] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
@ 2010-07-14 12:50 ` Aurelien Jarno
  14 siblings, 0 replies; 16+ messages in thread
From: Aurelien Jarno @ 2010-07-14 12:50 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Wed, Jul 14, 2010 at 01:23:59PM +0200, Kevin Wolf wrote:
> Hi Aurelien,
> 
> these are the block related patches that have been applied to git master since
> 0.12.4 was released and that I consider relevant for stable-0.12.
> 

Thanks, pulled.

> 
> 
> The following changes since commit 0c0f53e25cded4b3b1909391f9bf22d334ed8155:
> 
>   qemu-options: add documentation for stdio signal=on|off (2010-07-13 21:18:13 +0200)
> 
> are available in the git repository at:
>   git://repo.or.cz/qemu/kevin.git for-stable-0.12
> 
> Kevin Wolf (13):
>       vmdk: fix double free
>       qcow2: Fix creation of large images
>       vmdk: Fix COW
>       qcow2: Remove abort on free_clusters failure
>       block/vdi: Fix image opening and creation for odd disk sizes
>       qcow2: Restore L1 entry on l2_allocate failure
>       block: Add bdrv_(p)write_sync
>       qcow: Use bdrv_(p)write_sync for metadata writes
>       qcow2: Use bdrv_(p)write_sync for metadata writes
>       vmdk: Use bdrv_(p)write_sync for metadata writes
>       vpc: Use bdrv_(p)write_sync for metadata writes
>       block: Fix early failure in multiwrite
>       block: Handle multiwrite errors only when all requests have completed
> 
> Stefan Weil (1):
>       block/vpc: Fix conversion from size to disk geometry
> 
>  block.c                |   81 +++++++++++++++++++++++++++++++++++++++++------
>  block.h                |    4 ++
>  block/qcow.c           |   18 ++++++-----
>  block/qcow2-cluster.c  |   46 +++++++++++++++------------
>  block/qcow2-refcount.c |   33 ++++++++++---------
>  block/qcow2-snapshot.c |   23 ++++++-------
>  block/qcow2.c          |   41 +++++++++++++++++++-----
>  block/vdi.c            |   18 +++++++++--
>  block/vmdk.c           |   47 ++++++++++------------------
>  block/vpc.c            |   30 ++++++++++--------
>  block_int.h            |    1 +
>  11 files changed, 219 insertions(+), 123 deletions(-)
> 
> 

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-07-14 12:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-14 11:23 [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 01/14] vmdk: fix double free Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 02/14] qcow2: Fix creation of large images Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 03/14] vmdk: Fix COW Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 04/14] qcow2: Remove abort on free_clusters failure Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 05/14] block/vpc: Fix conversion from size to disk geometry Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 06/14] block/vdi: Fix image opening and creation for odd disk sizes Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 07/14] qcow2: Restore L1 entry on l2_allocate failure Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 08/14] block: Add bdrv_(p)write_sync Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 09/14] qcow: Use bdrv_(p)write_sync for metadata writes Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 10/14] qcow2: " Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 11/14] vmdk: " Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 12/14] vpc: " Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 13/14] block: Fix early failure in multiwrite Kevin Wolf
2010-07-14 11:24 ` [Qemu-devel] [STABLE][PATCH 14/14] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
2010-07-14 12:50 ` [Qemu-devel] [STABLE][PULL 00/14] Block patches for 0.12.5 Aurelien Jarno

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).