[Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots
@ 2011-11-17 15:13 Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 1/8] qcow2: Return real error code in qcow2_read_snapshots Kevin Wolf
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This is more or less the same kind of fixes that we made in the rest of qcow2
last year: Return the right error codes and make the order of operations safe
so that a crash can lead to no more than cluster leaks.

Although all of these are bug fixes, I'm not so sure about taking them into
1.0. Maybe we can take some of the easier ones and leave others out, or just
move the whole series to 1.1. Feedback on this would appreciated.

Kevin Wolf (8):
  qcow2: Return real error code in qcow2_read_snapshots
  qcow2: Return real error code in qcow2_write_snapshots
  qcow2: Cleanups and memleak fix in qcow2_snapshot_create
  qcow2: Rework qcow2_snapshot_create error handling
  qcow2: Return real error in qcow2_snapshot_goto
  qcow2: Fix order of refcount updates in qcow2_snapshot_goto
  qcow2: Fix order in qcow2_snapshot_delete
  qcow2: Fix error path in qcow2_snapshot_load_tmp

 block/qcow2-refcount.c |    7 +-
 block/qcow2-snapshot.c |  322 +++++++++++++++++++++++++++++++++++-------------
 block/qcow2.c          |    5 +-
 3 files changed, 244 insertions(+), 90 deletions(-)

-- 
1.7.6.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/8] qcow2: Return real error code in qcow2_read_snapshots
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots Kevin Wolf
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   25 ++++++++++++++++++++-----
 block/qcow2.c          |    5 +++--
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index fb7f58c..db49bb3 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -72,6 +72,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
     int i, id_str_size, name_size;
     int64_t offset;
     uint32_t extra_data_size;
+    int ret;
 
     if (!s->nb_snapshots) {
         s->snapshots = NULL;
@@ -81,10 +82,15 @@ int qcow2_read_snapshots(BlockDriverState *bs)
 
     offset = s->snapshots_offset;
     s->snapshots = g_malloc0(s->nb_snapshots * sizeof(QCowSnapshot));
+
     for(i = 0; i < s->nb_snapshots; i++) {
+        /* Read statically sized part of the snapshot header */
         offset = align_offset(offset, 8);
-        if (bdrv_pread(bs->file, offset, &h, sizeof(h)) != sizeof(h))
+        ret = bdrv_pread(bs->file, offset, &h, sizeof(h));
+        if (ret < 0) {
             goto fail;
+        }
+
         offset += sizeof(h);
         sn = s->snapshots + i;
         sn->l1_table_offset = be64_to_cpu(h.l1_table_offset);
@@ -98,25 +104,34 @@ int qcow2_read_snapshots(BlockDriverState *bs)
         id_str_size = be16_to_cpu(h.id_str_size);
         name_size = be16_to_cpu(h.name_size);
 
+        /* Skip extra data */
         offset += extra_data_size;
 
+        /* Read snapshot ID */
         sn->id_str = g_malloc(id_str_size + 1);
-        if (bdrv_pread(bs->file, offset, sn->id_str, id_str_size) != id_str_size)
+        ret = bdrv_pread(bs->file, offset, sn->id_str, id_str_size);
+        if (ret < 0) {
             goto fail;
+        }
         offset += id_str_size;
         sn->id_str[id_str_size] = '\0';
 
+        /* Read snapshot name */
         sn->name = g_malloc(name_size + 1);
-        if (bdrv_pread(bs->file, offset, sn->name, name_size) != name_size)
+        ret = bdrv_pread(bs->file, offset, sn->name, name_size);
+        if (ret < 0) {
             goto fail;
+        }
         offset += name_size;
         sn->name[name_size] = '\0';
     }
+
     s->snapshots_size = offset - s->snapshots_offset;
     return 0;
- fail:
+
+fail:
     qcow2_free_snapshots(bs);
-    return -1;
+    return ret;
 }
 
 /* add at the end of the file a new list of snapshots */
diff --git a/block/qcow2.c b/block/qcow2.c
index a56b011..27cbbeb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -272,8 +272,9 @@ static int qcow2_open(BlockDriverState *bs, int flags)
         }
         bs->backing_file[len] = '\0';
     }
-    if (qcow2_read_snapshots(bs) < 0) {
-        ret = -EINVAL;
+
+    ret = qcow2_read_snapshots(bs);
+    if (ret < 0) {
         goto fail;
     }
 
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 1/8] qcow2: Return real error code in qcow2_read_snapshots Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-18 14:14   ` Stefan Hajnoczi
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 3/8] qcow2: Cleanups and memleak fix in qcow2_snapshot_create Kevin Wolf
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Doesn't immediately fix anything as the callers don't use the return
value, but they will be fixed next.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   48 ++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index db49bb3..4d387a7 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -144,6 +144,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
     uint64_t data64;
     uint32_t data32;
     int64_t offset, snapshots_offset;
+    int ret;
 
     /* compute the size of the snapshots */
     offset = 0;
@@ -156,6 +157,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
     }
     snapshots_size = offset;
 
+    /* Allocate space for the new snapshot list */
     snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
     bdrv_flush(bs->file);
     offset = snapshots_offset;
@@ -163,6 +165,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
         return offset;
     }
 
+    /* Write all snapshots to the new list */
     for(i = 0; i < s->nb_snapshots; i++) {
         sn = s->snapshots + i;
         memset(&h, 0, sizeof(h));
@@ -178,34 +181,59 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
         h.id_str_size = cpu_to_be16(id_str_size);
         h.name_size = cpu_to_be16(name_size);
         offset = align_offset(offset, 8);
-        if (bdrv_pwrite_sync(bs->file, offset, &h, sizeof(h)) < 0)
+
+        ret = bdrv_pwrite(bs->file, offset, &h, sizeof(h));
+        if (ret < 0) {
             goto fail;
+        }
         offset += sizeof(h);
-        if (bdrv_pwrite_sync(bs->file, offset, sn->id_str, id_str_size) < 0)
+
+        ret = bdrv_pwrite(bs->file, offset, sn->id_str, id_str_size);
+        if (ret < 0) {
             goto fail;
+        }
         offset += id_str_size;
-        if (bdrv_pwrite_sync(bs->file, offset, sn->name, name_size) < 0)
+
+        ret = bdrv_pwrite(bs->file, offset, sn->name, name_size);
+        if (ret < 0) {
             goto fail;
+        }
         offset += name_size;
     }
 
-    /* update the various header fields */
+    /*
+     * Update the header to point to the new snapshot table. This requires the
+     * new table and its refcounts to be stable on disk.
+     *
+     * FIXME This should be done with a single write
+     */
+    ret = bdrv_flush(bs);
+    if (ret < 0) {
+        goto fail;
+    }
+
     data64 = cpu_to_be64(snapshots_offset);
-    if (bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, snapshots_offset),
-                    &data64, sizeof(data64)) < 0)
+    ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, snapshots_offset),
+                      &data64, sizeof(data64));
+    if (ret < 0) {
         goto fail;
+    }
+
     data32 = cpu_to_be32(s->nb_snapshots);
-    if (bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, nb_snapshots),
-                    &data32, sizeof(data32)) < 0)
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, nb_snapshots),
+                           &data32, sizeof(data32));
+    if (ret < 0) {
         goto fail;
+    }
 
     /* free the old snapshot table */
     qcow2_free_clusters(bs, s->snapshots_offset, s->snapshots_size);
     s->snapshots_offset = snapshots_offset;
     s->snapshots_size = snapshots_size;
     return 0;
- fail:
-    return -1;
+
+fail:
+    return ret;
 }
 
 static void find_new_snapshot_id(BlockDriverState *bs,
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 3/8] qcow2: Cleanups and memleak fix in qcow2_snapshot_create
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 1/8] qcow2: Return real error code in qcow2_read_snapshots Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 4/8] qcow2: Rework qcow2_snapshot_create error handling Kevin Wolf
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

sn->id_str could be leaked before this. The rest of this patch changes
comments, fixes coding style or removes checks that are unnecessary with
g_malloc.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   26 +++++++++++---------------
 1 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 4d387a7..2df7858 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -290,21 +290,20 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 
     memset(sn, 0, sizeof(*sn));
 
+    /* Generate an ID if it wasn't passed */
     if (sn_info->id_str[0] == '\0') {
-        /* compute a new id */
         find_new_snapshot_id(bs, sn_info->id_str, sizeof(sn_info->id_str));
     }
 
-    /* check that the ID is unique */
-    if (find_snapshot_by_id(bs, sn_info->id_str) >= 0)
+    /* Check that the ID is unique */
+    if (find_snapshot_by_id(bs, sn_info->id_str) >= 0) {
         return -ENOENT;
+    }
 
+    /* Populate sn with passed data */
     sn->id_str = g_strdup(sn_info->id_str);
-    if (!sn->id_str)
-        goto fail;
     sn->name = g_strdup(sn_info->name);
-    if (!sn->name)
-        goto fail;
+
     sn->vm_state_size = sn_info->vm_state_size;
     sn->date_sec = sn_info->date_sec;
     sn->date_nsec = sn_info->date_nsec;
@@ -314,7 +313,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     if (ret < 0)
         goto fail;
 
-    /* create the L1 table of the snapshot */
+    /* Allocate the L1 table of the snapshot and copy the current one there. */
     l1_table_offset = qcow2_alloc_clusters(bs, s->l1_size * sizeof(uint64_t));
     if (l1_table_offset < 0) {
         goto fail;
@@ -324,12 +323,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     sn->l1_table_offset = l1_table_offset;
     sn->l1_size = s->l1_size;
 
-    if (s->l1_size != 0) {
-        l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
-    } else {
-        l1_table = NULL;
-    }
-
+    l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
     for(i = 0; i < s->l1_size; i++) {
         l1_table[i] = cpu_to_be64(s->l1_table[i]);
     }
@@ -356,7 +350,9 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     }
 #endif
     return 0;
- fail:
+
+fail:
+    g_free(sn->id_str);
     g_free(sn->name);
     g_free(l1_table);
     return -1;
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 4/8] qcow2: Rework qcow2_snapshot_create error handling
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (2 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 3/8] qcow2: Cleanups and memleak fix in qcow2_snapshot_create Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto Kevin Wolf
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Increase refcounts only after allocating a new L1 table has succeeded in
order to make leaks less likely. If writing the snapshot table fails,
revert in-memory state to be consistent with that on disk.

While at it, make it return the real error codes instead of -1.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   55 +++++++++++++++++++++++++++++++++++------------
 1 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 2df7858..066d56b 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -283,7 +283,9 @@ static int find_snapshot_by_id_or_name(BlockDriverState *bs, const char *name)
 int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 {
     BDRVQcowState *s = bs->opaque;
-    QCowSnapshot *snapshots1, sn1, *sn = &sn1;
+    QCowSnapshot *new_snapshot_list = NULL;
+    QCowSnapshot *old_snapshot_list = NULL;
+    QCowSnapshot sn1, *sn = &sn1;
     int i, ret;
     uint64_t *l1_table = NULL;
     int64_t l1_table_offset;
@@ -309,16 +311,12 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     sn->date_nsec = sn_info->date_nsec;
     sn->vm_clock_nsec = sn_info->vm_clock_nsec;
 
-    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
-    if (ret < 0)
-        goto fail;
-
     /* Allocate the L1 table of the snapshot and copy the current one there. */
     l1_table_offset = qcow2_alloc_clusters(bs, s->l1_size * sizeof(uint64_t));
     if (l1_table_offset < 0) {
+        ret = l1_table_offset;
         goto fail;
     }
-    bdrv_flush(bs->file);
 
     sn->l1_table_offset = l1_table_offset;
     sn->l1_size = s->l1_size;
@@ -327,22 +325,50 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     for(i = 0; i < s->l1_size; i++) {
         l1_table[i] = cpu_to_be64(s->l1_table[i]);
     }
-    if (bdrv_pwrite_sync(bs->file, sn->l1_table_offset,
-                    l1_table, s->l1_size * sizeof(uint64_t)) < 0)
+
+    ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table,
+                      s->l1_size * sizeof(uint64_t));
+    if (ret < 0) {
         goto fail;
+    }
+
     g_free(l1_table);
     l1_table = NULL;
 
-    snapshots1 = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
+    /*
+     * Increase the refcounts of all clusters and make sure everything is
+     * stable on disk before updating the snapshot table to contain a pointer
+     * to the new L1 table.
+     */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    ret = bdrv_flush(bs->file);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* Append the new snapshot to the snapshot list */
+    new_snapshot_list = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
     if (s->snapshots) {
-        memcpy(snapshots1, s->snapshots, s->nb_snapshots * sizeof(QCowSnapshot));
-        g_free(s->snapshots);
+        memcpy(new_snapshot_list, s->snapshots,
+               s->nb_snapshots * sizeof(QCowSnapshot));
+        old_snapshot_list = s->snapshots;
     }
-    s->snapshots = snapshots1;
+    s->snapshots = new_snapshot_list;
     s->snapshots[s->nb_snapshots++] = *sn;
 
-    if (qcow2_write_snapshots(bs) < 0)
+    ret = qcow2_write_snapshots(bs);
+    if (ret < 0) {
+        g_free(s->snapshots);
+        s->snapshots = old_snapshot_list;
         goto fail;
+    }
+
+    g_free(old_snapshot_list);
+
 #ifdef DEBUG_ALLOC
     {
       BdrvCheckResult result = {0};
@@ -355,7 +381,8 @@ fail:
     g_free(sn->id_str);
     g_free(sn->name);
     g_free(l1_table);
-    return -1;
+
+    return ret;
 }
 
 /* copy the snapshot 'snapshot_name' into the current disk image */
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (3 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 4/8] qcow2: Rework qcow2_snapshot_create error handling Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-18 16:08   ` Stefan Hajnoczi
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates " Kevin Wolf
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   50 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 066d56b..9f6647f 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -392,17 +392,32 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     QCowSnapshot *sn;
     int i, snapshot_index;
     int cur_l1_bytes, sn_l1_bytes;
+    int ret;
 
+    /* Search the snapshot */
     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
-    if (snapshot_index < 0)
+    if (snapshot_index < 0) {
         return -ENOENT;
+    }
     sn = &s->snapshots[snapshot_index];
 
-    if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) < 0)
+    /* Decrease refcount of clusters of current L1 table.
+     * FIXME This is too early! */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset,
+                                         s->l1_size, -1);
+    if (ret < 0) {
         goto fail;
+    }
 
-    if (qcow2_grow_l1_table(bs, sn->l1_size, true) < 0)
+    /*
+     * Make sure that the current L1 table is big enough to contain the whole
+     * L1 table of the snapshot. If the snapshot L1 table is smaller, the
+     * current one must be padded with zeros.
+     */
+    ret = qcow2_grow_l1_table(bs, sn->l1_size, true);
+    if (ret < 0) {
         goto fail;
+    }
 
     cur_l1_bytes = s->l1_size * sizeof(uint64_t);
     sn_l1_bytes = sn->l1_size * sizeof(uint64_t);
@@ -411,19 +426,31 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
         memset(s->l1_table + sn->l1_size, 0, cur_l1_bytes - sn_l1_bytes);
     }
 
-    /* copy the snapshot l1 table to the current l1 table */
-    if (bdrv_pread(bs->file, sn->l1_table_offset,
-                   s->l1_table, sn_l1_bytes) < 0)
+    /*
+     * Copy the snapshot L1 table to the current L1 table.
+     *
+     * Before overwriting the old current L1 table on disk, make sure to
+     * increase all refcounts for the clusters referenced by the new one.
+     */
+    ret = bdrv_pread(bs->file, sn->l1_table_offset, s->l1_table, sn_l1_bytes);
+    if (ret < 0) {
         goto fail;
-    if (bdrv_pwrite_sync(bs->file, s->l1_table_offset,
-                    s->l1_table, cur_l1_bytes) < 0)
+    }
+
+    ret = bdrv_pwrite(bs->file, s->l1_table_offset, s->l1_table, cur_l1_bytes);
+    if (ret < 0) {
         goto fail;
+    }
+
     for(i = 0;i < s->l1_size; i++) {
         be64_to_cpus(&s->l1_table[i]);
     }
 
-    if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1) < 0)
+    /* FIXME This is too late! */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
+    if (ret < 0) {
         goto fail;
+    }
 
 #ifdef DEBUG_ALLOC
     {
@@ -432,8 +459,9 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     }
 #endif
     return 0;
- fail:
-    return -EIO;
+
+fail:
+    return ret;
 }
 
 int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates in qcow2_snapshot_goto
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (4 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-18 16:28   ` Stefan Hajnoczi
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 7/8] qcow2: Fix order in qcow2_snapshot_delete Kevin Wolf
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

The refcount updates must be moved so that in the worst case we can get
cluster leaks, but refcounts may never be too low.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c |    7 ++++-
 block/qcow2-snapshot.c |   61 ++++++++++++++++++++++++++++++++++-------------
 2 files changed, 50 insertions(+), 18 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 9605367..2db2ede 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -700,6 +700,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     l2_table = NULL;
     l1_table = NULL;
     l1_size2 = l1_size * sizeof(uint64_t);
+
+    /* WARNING: qcow2_snapshot_goto relies on this function not using the
+     * l1_table_offset when it is the current s->l1_table_offset! Be careful
+     * when changing this! */
     if (l1_table_offset != s->l1_table_offset) {
         if (l1_size2 != 0) {
             l1_table = g_malloc0(align_offset(l1_size2, 512));
@@ -819,7 +823,8 @@ fail:
     qcow2_cache_set_writethrough(bs, s->refcount_block_cache,
         old_refcount_writethrough);
 
-    if (l1_modified) {
+    /* Update L1 only if it isn't deleted anyway (addend = -1) */
+    if (addend >= 0 && l1_modified) {
         for(i = 0; i < l1_size; i++)
             cpu_to_be64s(&l1_table[i]);
         if (bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table,
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 9f6647f..d6b5506 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -393,6 +393,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     int i, snapshot_index;
     int cur_l1_bytes, sn_l1_bytes;
     int ret;
+    uint64_t *sn_l1_table = NULL;
 
     /* Search the snapshot */
     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
@@ -401,14 +402,6 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     }
     sn = &s->snapshots[snapshot_index];
 
-    /* Decrease refcount of clusters of current L1 table.
-     * FIXME This is too early! */
-    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset,
-                                         s->l1_size, -1);
-    if (ret < 0) {
-        goto fail;
-    }
-
     /*
      * Make sure that the current L1 table is big enough to contain the whole
      * L1 table of the snapshot. If the snapshot L1 table is smaller, the
@@ -422,32 +415,65 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     cur_l1_bytes = s->l1_size * sizeof(uint64_t);
     sn_l1_bytes = sn->l1_size * sizeof(uint64_t);
 
-    if (cur_l1_bytes > sn_l1_bytes) {
-        memset(s->l1_table + sn->l1_size, 0, cur_l1_bytes - sn_l1_bytes);
-    }
-
     /*
      * Copy the snapshot L1 table to the current L1 table.
      *
      * Before overwriting the old current L1 table on disk, make sure to
      * increase all refcounts for the clusters referenced by the new one.
+     * Decrease the refcount referenced by the old one only when the L1
+     * table is overwritten.
      */
-    ret = bdrv_pread(bs->file, sn->l1_table_offset, s->l1_table, sn_l1_bytes);
+    sn_l1_table = g_malloc0(cur_l1_bytes);
+
+    ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    ret = qcow2_update_snapshot_refcount(bs, sn->l1_table_offset,
+                                         sn->l1_size, 1);
     if (ret < 0) {
         goto fail;
     }
 
-    ret = bdrv_pwrite(bs->file, s->l1_table_offset, s->l1_table, cur_l1_bytes);
+    ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset, sn_l1_table,
+                           cur_l1_bytes);
     if (ret < 0) {
         goto fail;
     }
 
+    /*
+     * Decrease refcount of clusters of current L1 table.
+     *
+     * At this point, the in-memory s->l1_table points to the old L1 table,
+     * whereas on disk we already have the new one.
+     *
+     * qcow2_update_snapshot_refcount special cases the current L1 table to use
+     * the in-memory data instead of really using the offset to load a new one,
+     * which is why this works.
+     */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset,
+                                         s->l1_size, -1);
+
+    /*
+     * Now update the in-memory L1 table to be in sync with the on-disk one. We
+     * need to do this even if updating refcounts failed.
+     */
     for(i = 0;i < s->l1_size; i++) {
-        be64_to_cpus(&s->l1_table[i]);
+        s->l1_table[i] = be64_to_cpu(sn_l1_table[i]);
     }
 
-    /* FIXME This is too late! */
-    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    g_free(sn_l1_table);
+
+    /*
+     * Update QCOW_OFLAG_COPIED in the active L1 table (it may have changed
+     * when we decreased the refcount of the old snapshot.
+     */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
     if (ret < 0) {
         goto fail;
     }
@@ -461,6 +487,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     return 0;
 
 fail:
+    g_free(sn_l1_table);
     return ret;
 }
 
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 7/8] qcow2: Fix order in qcow2_snapshot_delete
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (5 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates " Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp Kevin Wolf
  2011-11-18 16:48 ` [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Stefan Hajnoczi
  8 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

First the snapshot must be deleted and only then the refcounts can be
decreased.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   48 +++++++++++++++++++++++++++++++++---------------
 1 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index d6b5506..4c2fbe8 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -494,32 +494,50 @@ fail:
 int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
 {
     BDRVQcowState *s = bs->opaque;
-    QCowSnapshot *sn;
+    QCowSnapshot sn;
     int snapshot_index, ret;
 
+    /* Search the snapshot */
     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
-    if (snapshot_index < 0)
+    if (snapshot_index < 0) {
         return -ENOENT;
-    sn = &s->snapshots[snapshot_index];
+    }
+    sn = s->snapshots[snapshot_index];
 
-    ret = qcow2_update_snapshot_refcount(bs, sn->l1_table_offset, sn->l1_size, -1);
-    if (ret < 0)
+    /* Remove it from the snapshot list */
+    memmove(s->snapshots + snapshot_index,
+            s->snapshots + snapshot_index + 1,
+            (s->nb_snapshots - snapshot_index - 1) * sizeof(sn));
+    s->nb_snapshots--;
+    ret = qcow2_write_snapshots(bs);
+    if (ret < 0) {
         return ret;
-    /* must update the copied flag on the current cluster offsets */
-    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
-    if (ret < 0)
+    }
+
+    /*
+     * The snapshot is now unused, clean up. If we fail after this point, we
+     * won't recover but just leak clusters.
+     */
+    g_free(sn.id_str);
+    g_free(sn.name);
+
+    /*
+     * Now decrease the refcounts of clusters referenced by the snapshot and
+     * free the L1 table.
+     */
+    ret = qcow2_update_snapshot_refcount(bs, sn.l1_table_offset,
+                                         sn.l1_size, -1);
+    if (ret < 0) {
         return ret;
-    qcow2_free_clusters(bs, sn->l1_table_offset, sn->l1_size * sizeof(uint64_t));
+    }
+    qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t));
 
-    g_free(sn->id_str);
-    g_free(sn->name);
-    memmove(sn, sn + 1, (s->nb_snapshots - snapshot_index - 1) * sizeof(*sn));
-    s->nb_snapshots--;
-    ret = qcow2_write_snapshots(bs);
+    /* must update the copied flag on the current cluster offsets */
+    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
     if (ret < 0) {
-        /* XXX: restore snapshot if error ? */
         return ret;
     }
+
 #ifdef DEBUG_ALLOC
     {
         BdrvCheckResult result = {0};
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (6 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 7/8] qcow2: Fix order in qcow2_snapshot_delete Kevin Wolf
@ 2011-11-17 15:13 ` Kevin Wolf
  2011-11-18 16:45   ` Stefan Hajnoczi
  2011-11-18 16:48 ` [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Stefan Hajnoczi
  8 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-11-17 15:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

If the bdrv_read() of the snapshot's L1 table fails, return the right
error code and make sure that the old L1 table is still loaded and we
don't break the BlockDriverState completely.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-snapshot.c |   33 +++++++++++++++++++++------------
 1 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 4c2fbe8..0214b95 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -578,32 +578,41 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
 
 int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
 {
-    int i, snapshot_index, l1_size2;
+    int i, snapshot_index;
     BDRVQcowState *s = bs->opaque;
     QCowSnapshot *sn;
+    uint64_t *new_l1_table;
+    int new_l1_bytes;
+    int ret;
 
+    assert(bs->read_only);
+
+    /* Search the snapshot */
     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_name);
     if (snapshot_index < 0) {
         return -ENOENT;
     }
-
     sn = &s->snapshots[snapshot_index];
-    s->l1_size = sn->l1_size;
-    l1_size2 = s->l1_size * sizeof(uint64_t);
-    if (s->l1_table != NULL) {
-        g_free(s->l1_table);
-    }
 
-    s->l1_table_offset = sn->l1_table_offset;
-    s->l1_table = g_malloc0(align_offset(l1_size2, 512));
+    /* Allocate and read in the snapshot's L1 table */
+    new_l1_bytes = s->l1_size * sizeof(uint64_t);
+    new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
 
-    if (bdrv_pread(bs->file, sn->l1_table_offset,
-                   s->l1_table, l1_size2) != l1_size2) {
-        return -1;
+    ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
+    if (ret < 0) {
+        g_free(new_l1_table);
+        return ret;
     }
 
+    /* Switch the L1 table */
+    s->l1_size = sn->l1_size;
+    s->l1_table_offset = sn->l1_table_offset;
+    s->l1_table = new_l1_table;
+    g_free(s->l1_table);
+
     for(i = 0;i < s->l1_size; i++) {
         be64_to_cpus(&s->l1_table[i]);
     }
+
     return 0;
 }
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots Kevin Wolf
@ 2011-11-18 14:14   ` Stefan Hajnoczi
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-11-18 14:14 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> -    /* update the various header fields */
> +    /*
> +     * Update the header to point to the new snapshot table. This requires the
> +     * new table and its refcounts to be stable on disk.
> +     *
> +     * FIXME This should be done with a single write

The nb_snapshots and snapshots_offset writes can be easily combined
since the fields are adjacent in QcowHeader.  I think it's worth
changing the code to do it in this patch series instead of leaving a
FIXME comment.

Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto Kevin Wolf
@ 2011-11-18 16:08   ` Stefan Hajnoczi
  2011-11-18 16:26     ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-11-18 16:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-snapshot.c |   50 +++++++++++++++++++++++++++++++++++++----------
>  1 files changed, 39 insertions(+), 11 deletions(-)
>
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index 066d56b..9f6647f 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -392,17 +392,32 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>     QCowSnapshot *sn;
>     int i, snapshot_index;
>     int cur_l1_bytes, sn_l1_bytes;
> +    int ret;
>
> +    /* Search the snapshot */
>     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
> -    if (snapshot_index < 0)
> +    if (snapshot_index < 0) {
>         return -ENOENT;
> +    }
>     sn = &s->snapshots[snapshot_index];
>
> -    if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) < 0)
> +    /* Decrease refcount of clusters of current L1 table.
> +     * FIXME This is too early! */
> +    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset,
> +                                         s->l1_size, -1);

Just following along your comments:

Here we may free clusters.  Should any of the following intermediate
steps fail, we're left without a backup plan ;).

If this function fails we're in trouble.  We still have the l1 table
in memory but refcounts are broken, especially if we execute
qcow2_alloc_clusters() and freed clusters get reallocated.

So if this function fails I think the image is in a dangerous state.
It may not be possible to recover data referenced by the current l1
table.

> +    if (ret < 0) {
>         goto fail;
> +    }
>
> -    if (qcow2_grow_l1_table(bs, sn->l1_size, true) < 0)
> +    /*
> +     * Make sure that the current L1 table is big enough to contain the whole
> +     * L1 table of the snapshot. If the snapshot L1 table is smaller, the
> +     * current one must be padded with zeros.
> +     */
> +    ret = qcow2_grow_l1_table(bs, sn->l1_size, true);
> +    if (ret < 0) {
>         goto fail;
> +    }
>
>     cur_l1_bytes = s->l1_size * sizeof(uint64_t);
>     sn_l1_bytes = sn->l1_size * sizeof(uint64_t);
> @@ -411,19 +426,31 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>         memset(s->l1_table + sn->l1_size, 0, cur_l1_bytes - sn_l1_bytes);
>     }
>
> -    /* copy the snapshot l1 table to the current l1 table */
> -    if (bdrv_pread(bs->file, sn->l1_table_offset,
> -                   s->l1_table, sn_l1_bytes) < 0)
> +    /*
> +     * Copy the snapshot L1 table to the current L1 table.
> +     *
> +     * Before overwriting the old current L1 table on disk, make sure to
> +     * increase all refcounts for the clusters referenced by the new one.
> +     */
> +    ret = bdrv_pread(bs->file, sn->l1_table_offset, s->l1_table, sn_l1_bytes);
> +    if (ret < 0) {
>         goto fail;
> -    if (bdrv_pwrite_sync(bs->file, s->l1_table_offset,
> -                    s->l1_table, cur_l1_bytes) < 0)
> +    }
> +
> +    ret = bdrv_pwrite(bs->file, s->l1_table_offset, s->l1_table, cur_l1_bytes);

Now this function does not issue an explicit bdrv_flush() anymore.  Is
this really okay?

Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto
  2011-11-18 16:08   ` Stefan Hajnoczi
@ 2011-11-18 16:26     ` Kevin Wolf
  0 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-18 16:26 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel

Am 18.11.2011 17:08, schrieb Stefan Hajnoczi:
> On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> ---
>>  block/qcow2-snapshot.c |   50 +++++++++++++++++++++++++++++++++++++----------
>>  1 files changed, 39 insertions(+), 11 deletions(-)
>>
>> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
>> index 066d56b..9f6647f 100644
>> --- a/block/qcow2-snapshot.c
>> +++ b/block/qcow2-snapshot.c
>> @@ -392,17 +392,32 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>>     QCowSnapshot *sn;
>>     int i, snapshot_index;
>>     int cur_l1_bytes, sn_l1_bytes;
>> +    int ret;
>>
>> +    /* Search the snapshot */
>>     snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
>> -    if (snapshot_index < 0)
>> +    if (snapshot_index < 0) {
>>         return -ENOENT;
>> +    }
>>     sn = &s->snapshots[snapshot_index];
>>
>> -    if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) < 0)
>> +    /* Decrease refcount of clusters of current L1 table.
>> +     * FIXME This is too early! */
>> +    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset,
>> +                                         s->l1_size, -1);
> 
> Just following along your comments:
> 
> Here we may free clusters.  Should any of the following intermediate
> steps fail, we're left without a backup plan ;).
> 
> If this function fails we're in trouble.  We still have the l1 table
> in memory but refcounts are broken, especially if we execute
> qcow2_alloc_clusters() and freed clusters get reallocated.
> 
> So if this function fails I think the image is in a dangerous state.
> It may not be possible to recover data referenced by the current l1
> table.

Correct. This patch doesn't do anything else than making this clear (and
fixing return codes, of course). The next one fixes the order.

Initially, I had both of them in the same patch, but I found it hard to
understand because fixing the order is really hard stuff. So I decided
to do the boring stuff in this patch so that it doesn't distract
reviewers when they try to understand the hard part.

I guess I should have mentioned this in the commit log.

>> +    if (ret < 0) {
>>         goto fail;
>> +    }
>>
>> -    if (qcow2_grow_l1_table(bs, sn->l1_size, true) < 0)
>> +    /*
>> +     * Make sure that the current L1 table is big enough to contain the whole
>> +     * L1 table of the snapshot. If the snapshot L1 table is smaller, the
>> +     * current one must be padded with zeros.
>> +     */
>> +    ret = qcow2_grow_l1_table(bs, sn->l1_size, true);
>> +    if (ret < 0) {
>>         goto fail;
>> +    }
>>
>>     cur_l1_bytes = s->l1_size * sizeof(uint64_t);
>>     sn_l1_bytes = sn->l1_size * sizeof(uint64_t);
>> @@ -411,19 +426,31 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>>         memset(s->l1_table + sn->l1_size, 0, cur_l1_bytes - sn_l1_bytes);
>>     }
>>
>> -    /* copy the snapshot l1 table to the current l1 table */
>> -    if (bdrv_pread(bs->file, sn->l1_table_offset,
>> -                   s->l1_table, sn_l1_bytes) < 0)
>> +    /*
>> +     * Copy the snapshot L1 table to the current L1 table.
>> +     *
>> +     * Before overwriting the old current L1 table on disk, make sure to
>> +     * increase all refcounts for the clusters referenced by the new one.
>> +     */
>> +    ret = bdrv_pread(bs->file, sn->l1_table_offset, s->l1_table, sn_l1_bytes);
>> +    if (ret < 0) {
>>         goto fail;
>> -    if (bdrv_pwrite_sync(bs->file, s->l1_table_offset,
>> -                    s->l1_table, cur_l1_bytes) < 0)
>> +    }
>> +
>> +    ret = bdrv_pwrite(bs->file, s->l1_table_offset, s->l1_table, cur_l1_bytes);
> 
> Now this function does not issue an explicit bdrv_flush() anymore.  Is
> this really okay?

No. The next patch reintroduces the sync. Yes, I should learn how to
split patches properly. :-)

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates in qcow2_snapshot_goto
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates " Kevin Wolf
@ 2011-11-18 16:28   ` Stefan Hajnoczi
  2011-11-18 16:38     ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-11-18 16:28 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> -    /* FIXME This is too late! */
> -    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +
> +    g_free(sn_l1_table);

Prevent double-free if qcow2_update_snapshot_refcount() fails below:

sn_l1_table = NULL;

Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates in qcow2_snapshot_goto
  2011-11-18 16:28   ` Stefan Hajnoczi
@ 2011-11-18 16:38     ` Kevin Wolf
  0 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-18 16:38 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel

Am 18.11.2011 17:28, schrieb Stefan Hajnoczi:
> On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> -    /* FIXME This is too late! */
>> -    ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    g_free(sn_l1_table);
> 
> Prevent double-free if qcow2_update_snapshot_refcount() fails below:
> 
> sn_l1_table = NULL;

Thanks, good catch. Fixed this locally.

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp Kevin Wolf
@ 2011-11-18 16:45   ` Stefan Hajnoczi
  2011-11-18 17:01     ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-11-18 16:45 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> +    s->l1_table = new_l1_table;
> +    g_free(s->l1_table);

O_o .oO( immediately frees new_l1_table?! )

Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots
  2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
                   ` (7 preceding siblings ...)
  2011-11-17 15:13 ` [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp Kevin Wolf
@ 2011-11-18 16:48 ` Stefan Hajnoczi
  8 siblings, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-11-18 16:48 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> This is more or less the same kind of fixes that we made in the rest of qcow2
> last year: Return the right error codes and make the order of operations safe
> so that a crash can lead to no more than cluster leaks.
>
> Although all of these are bug fixes, I'm not so sure about taking them into
> 1.0. Maybe we can take some of the easier ones and leave others out, or just
> move the whole series to 1.1. Feedback on this would appreciated.

Thanks Kevin.  I left comments on individual patches.

I'll review v2 from scratch again because these changes can be subtle.

Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp
  2011-11-18 16:45   ` Stefan Hajnoczi
@ 2011-11-18 17:01     ` Kevin Wolf
  0 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-11-18 17:01 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel

Am 18.11.2011 17:45, schrieb Stefan Hajnoczi:
> On Thu, Nov 17, 2011 at 3:13 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> +    s->l1_table = new_l1_table;
>> +    g_free(s->l1_table);
> 
> O_o .oO( immediately frees new_l1_table?! )

This was just a test to see if you're still awake at patch 8.

*cough*

Ok, I guess you found the patch that I forgot to test...

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-11-18 16:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-17 15:13 [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 1/8] qcow2: Return real error code in qcow2_read_snapshots Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 2/8] qcow2: Return real error code in qcow2_write_snapshots Kevin Wolf
2011-11-18 14:14   ` Stefan Hajnoczi
2011-11-17 15:13 ` [Qemu-devel] [PATCH 3/8] qcow2: Cleanups and memleak fix in qcow2_snapshot_create Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 4/8] qcow2: Rework qcow2_snapshot_create error handling Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 5/8] qcow2: Return real error in qcow2_snapshot_goto Kevin Wolf
2011-11-18 16:08   ` Stefan Hajnoczi
2011-11-18 16:26     ` Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 6/8] qcow2: Fix order of refcount updates " Kevin Wolf
2011-11-18 16:28   ` Stefan Hajnoczi
2011-11-18 16:38     ` Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 7/8] qcow2: Fix order in qcow2_snapshot_delete Kevin Wolf
2011-11-17 15:13 ` [Qemu-devel] [PATCH 8/8] qcow2: Fix error path in qcow2_snapshot_load_tmp Kevin Wolf
2011-11-18 16:45   ` Stefan Hajnoczi
2011-11-18 17:01     ` Kevin Wolf
2011-11-18 16:48 ` [Qemu-devel] [PATCH 0/8] qcow2: Fix error paths for internal snapshots Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).