qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
@ 2012-11-26 13:04 Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
                   ` (26 more replies)
  0 siblings, 27 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

This patchset is the first working version of the QCOW2 deduplication.

Images must be created with "-o dedup=on" in order to activate the
deduplication in the image.


Since v2: make it barely work
          replace kernel red-black trees with GTree.

Benoît Canet (24):
  qcow2: Add deduplication to the qcow2 specification.
  qcow2: Add deduplication structures and fields.
  qcow2: Add qcow2_dedup_read_missing_and_concatenate
  qcow2: Make update_cluster_refcount public.
  qcow2: Create a way to link to l2 tables in dedup.
  qcow2: Add qcow2_dedup and related functions.
  qcow2: Add qcow2_dedup_write_new_hashes.
  qcow2: Implement qcow2_compute_cluster_hash.
  qcow2: Extract qcow2_dedup_grow_table
  qcow2: create function to load deduplication hashes at startup.
  qcow2: Load and save deduplication table header extension.
  qcow2: Extract qcow2_do_table_init.
  qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
  qcow2: Extract qcow2_add_feature and qcow2_remove_feature.
  block: Add dedup image create option.
  qcow2: Allow creation of images using deduplication.
  qcow2: Behave correctly when refcount reach 0 or 2^16.
  qcow2: Integrate deduplication in qcow2_co_writev loop.
  qcow2: Add verification of dedup table.
  qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
  qcow2: Add check_dedup_l2 in order to check l2 of dedup table.
  qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED.
  qcow2: init and cleanup deduplication.
  qemu-iotests: Filter dedup=on/off so existing tests don't break.

 Makefile                     |    3 +
 Makefile.target              |    2 +-
 block/Makefile.objs          |    1 +
 block/qcow2-cluster.c        |  115 ++++--
 block/qcow2-dedup.c          |  914 ++++++++++++++++++++++++++++++++++++++++++
 block/qcow2-refcount.c       |  154 +++++--
 block/qcow2.c                |  267 ++++++++++--
 block/qcow2.h                |   89 +++-
 block_int.h                  |    1 +
 docs/specs/qcow2.txt         |   33 +-
 tests/qemu-iotests/common.rc |    3 +-
 11 files changed, 1480 insertions(+), 102 deletions(-)
 create mode 100644 block/qcow2-dedup.c

-- 
1.7.10.4

* [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 11:28   ` Stefan Hajnoczi
                     ` (2 more replies)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields Benoît Canet
                   ` (25 subsequent siblings)
  26 siblings, 3 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 docs/specs/qcow2.txt |   33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..16eafd7 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -80,7 +80,10 @@ in the description of a field.
                                 tables to repair refcounts before accessing the
                                 image.
 
-                    Bits 1-63:  Reserved (set to 0)
+                    Bit 1:      Deduplication bit.  If this bit is set then
+                                deduplication is used on this image.
+
+                    Bits 2-63:  Reserved (set to 0)
 
          80 -  87:  compatible_features
                     Bitmask of compatible features. An implementation can
@@ -116,6 +119,7 @@ be stored. Each extension has a structure like the following:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
                         0x6803f857 - Feature name table
+                        0xCD8E819B - Deduplication
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -159,6 +163,33 @@ the header extension data. Each entry look like this:
                     terminated if it has full length)
 
 
+== Deduplication ==
+
+The deduplication extension contains the offset and size of the deduplication
+table.
+
+    Byte   0 - 7:   Offset
+
+          8 - 11:   Size
+
+== Deduplication table ==
+
+The deduplication table contains 64-bit offsets to the level 2 deduplication
+table clusters.
+Each entry of these clusters contains a 32-byte SHA-256 hash followed by the
+64-bit logical offset of the first encountered block having this hash.
+
+Entries in the deduplication table are ordered by physical cluster index.
+
+The number of entries in an l2 deduplication table cluster is:
+l2_dedup_cluster_entries = cluster_size / (32 + 8)
+
+The index in the level 1 deduplication table is:
+l1_dedup_index = physical_cluster_index / l2_dedup_cluster_entries
+
+The index in the level 2 deduplication table is:
+l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries
+
 == Host cluster management ==
 
 qcow2 manages the allocation of host clusters by maintaining a reference count
-- 
1.7.10.4
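
The index formulas added to the specification in this patch can be sketched in C as below. This is a minimal sketch under stated assumptions: the helper names (`l2_dedup_cluster_entries`, `dedup_table_index`) and the `DedupTableIndex` type are hypothetical and do not appear in the series; only the entry size (32-byte hash plus 8-byte offset) and the division/modulo split come from the spec text.

```c
#include <assert.h>
#include <stdint.h>

/* Entry layout from the spec text: 32-byte hash + 8-byte logical offset. */
#define DEDUP_HASH_SIZE   32
#define DEDUP_OFFSET_SIZE 8

/* Hypothetical result type, used only for this illustration. */
typedef struct {
    uint64_t l1_index;   /* index in the level 1 deduplication table */
    uint64_t l2_index;   /* index inside the level 2 table cluster */
} DedupTableIndex;

/* Number of hash entries that fit in one level 2 table cluster. */
static uint64_t l2_dedup_cluster_entries(uint64_t cluster_size)
{
    assert(cluster_size >= DEDUP_HASH_SIZE + DEDUP_OFFSET_SIZE);
    return cluster_size / (DEDUP_HASH_SIZE + DEDUP_OFFSET_SIZE);
}

/* Split a physical cluster index into level 1 and level 2 indexes. */
static DedupTableIndex dedup_table_index(uint64_t physical_cluster_index,
                                         uint64_t cluster_size)
{
    uint64_t entries = l2_dedup_cluster_entries(cluster_size);
    DedupTableIndex idx = {
        .l1_index = physical_cluster_index / entries,
        .l2_index = physical_cluster_index % entries,
    };
    return idx;
}
```

With a 64 KiB cluster size, each level 2 cluster holds 65536 / 40 = 1638 entries (16 bytes per cluster stay unused), so physical cluster 5000 maps to level 1 index 3 and level 2 index 86.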

* [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 11:34   ` Stefan Hajnoczi
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2.h |   28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index b4eb654..e192001 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -58,6 +58,23 @@
 
 #define DEFAULT_CLUSTER_SIZE 65536
 
+/* deduplication node */
+typedef struct {
+    uint8_t *hash;         /* 32 bytes hash of a given cluster */
+    uint64_t offset;       /* offset where the cluster is stored (sectors) */
+    uint64_t first_logical_offset;
+} QCowHashNode;
+
+/* Undedupable hashes that must be written later to disk */
+typedef struct QCowHashElement {
+    uint8_t *hash;
+    QTAILQ_ENTRY(QCowHashElement) next;
+} QCowHashElement;
+
+typedef struct UndedupableHashes {
+    QTAILQ_HEAD(, QCowHashElement) undedupable_hashes;
+} UndedupableHashes;
+
 typedef struct QCowHeader {
     uint32_t magic;
     uint32_t version;
@@ -114,8 +131,10 @@ enum {
 enum {
     QCOW2_INCOMPAT_DIRTY_BITNR   = 0,
     QCOW2_INCOMPAT_DIRTY         = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
+    QCOW2_INCOMPAT_DEDUP_BITNR   = 1,
+    QCOW2_INCOMPAT_DEDUP         = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
 
-    QCOW2_INCOMPAT_MASK          = QCOW2_INCOMPAT_DIRTY,
+    QCOW2_INCOMPAT_MASK          = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
 };
 
 /* Compatible feature bits */
@@ -148,6 +167,7 @@ typedef struct BDRVQcowState {
 
     Qcow2Cache* l2_table_cache;
     Qcow2Cache* refcount_block_cache;
+    Qcow2Cache *dedup_cluster_cache;
 
     uint8_t *cluster_cache;
     uint8_t *cluster_data;
@@ -160,6 +180,12 @@ typedef struct BDRVQcowState {
     int64_t free_cluster_index;
     int64_t free_byte_offset;
 
+    bool has_dedup;
+    uint64_t *dedup_table;
+    uint64_t dedup_table_offset;
+    int32_t dedup_table_size;
+    GTree *dedup_tree_by_hash;
+    GTree *dedup_tree_by_offset;
     CoMutex lock;
 
     uint32_t crypt_method; /* current crypt method, 0 if no key yet */
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 11:52   ` Stefan Hajnoczi
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 04/24] qcow2: Make update_cluster_refcount public Benoît Canet
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

This function reads the missing data when an unaligned write is done.
It also concatenates the missing data with the given qiov data in order
to prepare a buffer used to look for duplicated clusters.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/Makefile.objs |    1 +
 block/qcow2-dedup.c |  150 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h       |    8 +++
 3 files changed, 159 insertions(+)
 create mode 100644 block/qcow2-dedup.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 554f429..790cb5f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
 block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2-dedup.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
new file mode 100644
index 0000000..e4e6108
--- /dev/null
+++ b/block/qcow2-dedup.c
@@ -0,0 +1,150 @@
+/*
+ * Deduplication for the QCOW2 format
+ *
+ * Copyright (C) Nodalink, SARL. 2012
+ *
+ * Author:
+ *   Benoît Canet <benoit.canet@irqsave.net>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block_int.h"
+#include "qemu-common.h"
+#include "qcow2.h"
+
+/**
+ * Read some data from the QCOW2 file
+ *
+ * @data:       the buffer where the data must be stored
+ * @sector_num: the sector number to read in the QCOW2 file
+ * @nb_sectors: the number of sectors to read
+ * @ret:        negative on error
+ */
+static int qcow2_dedup_read_missing_cluster_data(BlockDriverState *bs,
+                                                 uint8_t *data,
+                                                 uint64_t sector_num,
+                                                 int nb_sectors)
+{
+    BDRVQcowState *s = bs->opaque;
+    QEMUIOVector qiov;
+    struct iovec iov;
+    int ret;
+
+    iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE;
+    iov.iov_base = data;
+    qemu_iovec_init_external(&qiov, &iov, 1);
+    qemu_co_mutex_unlock(&s->lock);
+    ret = bdrv_co_readv(bs, sector_num, nb_sectors, &qiov);
+    qemu_co_mutex_lock(&s->lock);
+    if (ret < 0) {
+        error_report("failed to read %d sectors at sector %" PRIu64,
+                     nb_sectors, sector_num);
+    }
+
+    return ret;
+}
+
+/*
+ * Prepare a buffer containing all the data required to compute cluster
+ * sized deduplication hashes.
+ * If sector_num and nb_sectors are not cluster aligned, it reads the missing
+ * data before and after the qiov.
+ *
+ * @qiov:               the qiov for which missing data must be read
+ * @sector_num:         the first sector that must be read into the qiov
+ * @nb_sectors:         the number of sectors to read into the qiov
+ * @data:               the place where the data will be concatenated and stored
+ * @nb_data_sectors:    the resulting size of the concatenated data (in sectors)
+ * @ret:                negative on error
+ */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+                                             QEMUIOVector *qiov,
+                                             uint64_t sector_num,
+                                             int nb_sectors,
+                                             uint8_t **data,
+                                             int *nb_data_sectors)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret;
+    uint64_t cluster_beginning_sector;
+    uint64_t first_sector_after_qiov;
+    int cluster_beginning_nr;
+    int cluster_ending_nr;
+    int unaligned_ending_nr;
+    uint64_t max_cluster_ending_nr;
+
+    /* compute how much and where to read at the beginning */
+    cluster_beginning_nr = sector_num & (s->cluster_sectors - 1);
+    cluster_beginning_sector = sector_num - cluster_beginning_nr;
+
+    /* for the ending */
+    first_sector_after_qiov = sector_num + nb_sectors;
+    unaligned_ending_nr = first_sector_after_qiov & (s->cluster_sectors - 1);
+    cluster_ending_nr = unaligned_ending_nr ?
+                        s->cluster_sectors - unaligned_ending_nr : 0;
+
+    /* compute total size in sectors and allocate memory */
+    *nb_data_sectors = cluster_beginning_nr + nb_sectors + cluster_ending_nr;
+    *data = qemu_blockalign(bs, *nb_data_sectors * BDRV_SECTOR_SIZE);
+    memset(*data, 0, *nb_data_sectors * BDRV_SECTOR_SIZE);
+
+    /* read beginning */
+    if (cluster_beginning_nr) {
+        ret = qcow2_dedup_read_missing_cluster_data(bs,
+                                                    *data,
+                                                    cluster_beginning_sector,
+                                                    cluster_beginning_nr);
+
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    /* append qiov content */
+    qemu_iovec_to_buf(qiov, 0, *data + cluster_beginning_nr * BDRV_SECTOR_SIZE,
+                      qiov->size);
+
+    /* Fix cluster_ending_nr if we are at risk of reading outside the image
+     * (Cluster unaligned image size)
+     */
+    max_cluster_ending_nr = bs->total_sectors - first_sector_after_qiov;
+    cluster_ending_nr = max_cluster_ending_nr < (uint64_t) cluster_ending_nr ?
+                        (int) max_cluster_ending_nr : cluster_ending_nr;
+
+    /* read and add ending */
+    if (cluster_ending_nr) {
+        ret = qcow2_dedup_read_missing_cluster_data(bs,
+                                                    *data +
+                                                    (cluster_beginning_nr +
+                                                    nb_sectors) *
+                                                    BDRV_SECTOR_SIZE,
+                                                    first_sector_after_qiov,
+                                                    cluster_ending_nr);
+
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    return 0;
+
+fail:
+    return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index e192001..858fef3 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -359,4 +359,12 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
     void **table);
 int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
 
+/* qcow2-dedup.c functions */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+                                             QEMUIOVector *qiov,
+                                             uint64_t sector,
+                                             int sectors_nr,
+                                             uint8_t **dedup_cluster_data,
+                                             int *dedup_cluster_data_nr);
+
 #endif
-- 
1.7.10.4
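
The alignment arithmetic in qcow2_dedup_read_missing_and_concatenate can be isolated from the I/O as in the sketch below. It is an illustration only: the `DedupPadding` type and the `dedup_padding` helper are hypothetical names, and the sketch assumes a power-of-two `cluster_sectors` and a request that ends inside the image. Like the patch, it sizes the buffer before clamping the trailing read to the end of the image.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, used only for this illustration. */
typedef struct {
    uint64_t first_sector;  /* cluster-aligned sector where the buffer starts */
    int head_sectors;       /* sectors to read before the request */
    int tail_sectors;       /* sectors to read after the request (clamped) */
    int buffer_sectors;     /* size of the concatenated buffer, in sectors */
} DedupPadding;

static DedupPadding dedup_padding(uint64_t sector_num, int nb_sectors,
                                  int cluster_sectors, uint64_t total_sectors)
{
    DedupPadding p;
    uint64_t end = sector_num + (uint64_t)nb_sectors;
    uint64_t mask = (uint64_t)cluster_sectors - 1;
    uint64_t max_tail;
    int unaligned_end;

    /* The masking trick only works for power-of-two cluster sizes. */
    assert(cluster_sectors > 0);
    assert((cluster_sectors & (cluster_sectors - 1)) == 0);
    assert(nb_sectors >= 0 && end <= total_sectors);

    /* How much is missing before the request, and where it starts. */
    p.head_sectors = (int)(sector_num & mask);
    p.first_sector = sector_num - (uint64_t)p.head_sectors;

    /* How much is missing after the request to reach a cluster boundary. */
    unaligned_end = (int)(end & mask);
    p.tail_sectors = unaligned_end ? cluster_sectors - unaligned_end : 0;

    /* Buffer size is computed before clamping, as in the patch. */
    p.buffer_sectors = p.head_sectors + nb_sectors + p.tail_sectors;

    /* Do not read past the end of a cluster-unaligned image. */
    max_tail = total_sectors - end;
    if (max_tail < (uint64_t)p.tail_sectors) {
        p.tail_sectors = (int)max_tail;
    }
    return p;
}
```

For example, with 128 sectors per cluster, a 10-sector write at sector 130 needs 2 sectors read before it and 116 after it, giving a 128-sector buffer starting at sector 128. If the image is only 200 sectors long, the trailing read shrinks to 60 sectors and the rest of the buffer keeps the zeros written by memset.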

* [Qemu-devel] [RFC V3 04/24] qcow2: Make update_cluster_refcount public.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (2 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 05/24] qcow2: Create a way to link to l2 tables in dedup Benoît Canet
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Also add a flush parameter to make flushing optional.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-refcount.c |   25 +++++++++++++++++--------
 block/qcow2.h          |    4 ++++
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 5e3f915..faca64c 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -513,9 +513,10 @@ fail:
  * If the return value is non-negative, it is the new refcount of the cluster.
  * If it is negative, it is -errno and indicates an error.
  */
-static int update_cluster_refcount(BlockDriverState *bs,
-                                   int64_t cluster_index,
-                                   int addend)
+int update_cluster_refcount(BlockDriverState *bs,
+                            int64_t cluster_index,
+                            int addend,
+                            bool flush)
 {
     BDRVQcowState *s = bs->opaque;
     int ret;
@@ -525,7 +526,9 @@ static int update_cluster_refcount(BlockDriverState *bs,
         return ret;
     }
 
-    bdrv_flush(bs->file);
+    if (flush) {
+        bdrv_flush(bs->file);
+    }
 
     return get_refcount(bs, cluster_index);
 }
@@ -644,7 +647,7 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if (free_in_cluster == 0)
             s->free_byte_offset = 0;
         if ((offset & (s->cluster_size - 1)) != 0)
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1, true);
     } else {
         offset = qcow2_alloc_clusters(bs, s->cluster_size);
         if (offset < 0) {
@@ -654,7 +657,7 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if ((cluster_offset + s->cluster_size) == offset) {
             /* we are lucky: contiguous data */
             offset = s->free_byte_offset;
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1, true);
             s->free_byte_offset += size;
         } else {
             s->free_byte_offset = offset;
@@ -795,7 +798,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                     } else {
                         uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
                         if (addend != 0) {
-                            refcount = update_cluster_refcount(bs, cluster_index, addend);
+                            refcount = update_cluster_refcount(bs,
+                                                               cluster_index,
+                                                               addend,
+                                                               true);
                         } else {
                             refcount = get_refcount(bs, cluster_index);
                         }
@@ -827,7 +833,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
 
             if (addend != 0) {
-                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend);
+                refcount = update_cluster_refcount(bs,
+                                                   l2_offset >> s->cluster_bits,
+                                                   addend,
+                                                   true);
             } else {
                 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
             }
diff --git a/block/qcow2.h b/block/qcow2.h
index 858fef3..ee9aecc 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -304,6 +304,10 @@ void qcow2_free_clusters(BlockDriverState *bs,
     int64_t offset, int64_t size);
 void qcow2_free_any_clusters(BlockDriverState *bs,
     uint64_t cluster_offset, int nb_clusters);
+int update_cluster_refcount(BlockDriverState *bs,
+                            int64_t cluster_index,
+                            int addend,
+                            bool flush);
 
 int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     int64_t l1_table_offset, int l1_size, int addend);
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 05/24] qcow2: Create a way to link to l2 tables in dedup.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (3 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 04/24] qcow2: Make update_cluster_refcount public Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions Benoît Canet
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-cluster.c |    9 +++++++--
 block/qcow2.c         |    2 ++
 block/qcow2.h         |    2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e179211..9a07191 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -683,7 +683,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
             old_cluster[j++] = l2_table[l2_index + i];
 
         l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
-                    (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
+                    (i << s->cluster_bits)) |
+                    (m->oflag_copied ? QCOW_OFLAG_COPIED : 0));
      }
 
 
@@ -696,7 +697,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      * If this was a COW, we need to decrease the refcount of the old cluster.
      * Also flush bs->file to get the right order for L2 and refcount update.
      */
-    if (j != 0) {
+    if (!m->overwrite && j != 0) {
         for (i = 0; i < j; i++) {
             qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
         }
@@ -922,6 +923,8 @@ again:
     *m = (QCowL2Meta) {
         .cluster_offset     = cluster_offset,
         .nb_clusters        = 0,
+        .oflag_copied       = true,
+        .overwrite          = false,
     };
     qemu_co_queue_init(&m->dependent_requests);
 
@@ -970,6 +973,8 @@ again:
                 .n_start        = keep_clusters == 0 ? n_start : 0,
                 .nb_clusters    = nb_clusters,
                 .nb_available   = MIN(requested_sectors, avail_sectors),
+                .oflag_copied   = true,
+                .overwrite      = false,
             };
             qemu_co_queue_init(&m->dependent_requests);
             QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
diff --git a/block/qcow2.c b/block/qcow2.c
index c1ff31f..b5276c0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -776,6 +776,8 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
     uint8_t *cluster_data = NULL;
     QCowL2Meta l2meta = {
         .nb_clusters = 0,
+        .oflag_copied = true,
+        .overwrite = false,
     };
 
     trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
diff --git a/block/qcow2.h b/block/qcow2.h
index ee9aecc..ccb24ad 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -231,6 +231,8 @@ typedef struct QCowL2Meta
     int n_start;
     int nb_available;
     int nb_clusters;
+    bool oflag_copied;
+    bool overwrite;
     CoQueue dependent_requests;
 
     QLIST_ENTRY(QCowL2Meta) next_in_flight;
-- 
1.7.10.4
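
The effect of the new oflag_copied field on the value stored in the L2 table can be sketched as follows. This is a simplified illustration, not the patch's code: `l2_entry_for_cluster` is a hypothetical helper, the cpu_to_be64() conversion done by qcow2_alloc_cluster_link_l2 is left out, and QCOW_OFLAG_COPIED is assumed to be bit 63 as in the qcow2 format. Since that flag marks a cluster whose refcount is exactly one, a deduplicated (shared) cluster presumably has to be linked with the flag cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of an L2 entry: set when the cluster's refcount is exactly one. */
#define QCOW_OFLAG_COPIED (1ULL << 63)

/*
 * Build the host-endian L2 entry for the i-th cluster of an allocation
 * starting at cluster_offset. The on-disk value would additionally be
 * converted to big endian.
 */
static uint64_t l2_entry_for_cluster(uint64_t cluster_offset, int i,
                                     int cluster_bits, bool oflag_copied)
{
    uint64_t entry;

    assert(i >= 0 && cluster_bits > 0 && cluster_bits < 63);

    entry = cluster_offset + ((uint64_t)i << cluster_bits);
    return entry | (oflag_copied ? QCOW_OFLAG_COPIED : 0);
}
```

With 64 KiB clusters (cluster_bits = 16), the second cluster of an allocation at offset 0x50000 gets the entry 0x60000, with bit 63 set only when oflag_copied is true.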

* [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (4 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 05/24] qcow2: Create a way to link to l2 tables in dedup Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 13:16   ` Stefan Hajnoczi
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 07/24] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c |  312 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h       |   13 +++
 2 files changed, 325 insertions(+)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index e4e6108..a7c7202 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -29,6 +29,8 @@
 #include "qemu-common.h"
 #include "qcow2.h"
 
+#define HASH_LENGTH 32
+
 /**
  * Read some data from the QCOW2 file
  *
@@ -148,3 +150,313 @@ int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
 fail:
     return ret;
 }
+
+/*
+ * Build a QCowHashNode structure
+ *
+ * @hash:             the given hash
+ * @physical_offset:  the cluster offset in the QCOW2 file
+ * @first_offset:     the first logical cluster offset written
+ * @ret:              the built QCowHashNode
+ */
+static QCowHashNode *qcow2_dedup_build_qcow_hash_node(uint8_t *hash,
+                                                      uint64_t physical_offset,
+                                                      uint64_t first_offset)
+{
+    QCowHashNode *data;
+
+    data = g_new0(QCowHashNode, 1);
+    data->hash = hash;
+    data->offset = physical_offset;
+    data->first_logical_offset = first_offset;
+
+    return data;
+}
+
+/*
+ * Compute the hash of a given cluster
+ *
+ * @data: a buffer containing the cluster data
+ * @ret:  a HASH_LENGTH long dynamically allocated array containing the hash
+ */
+static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
+                                           uint8_t *data)
+{
+    return NULL;
+}
+
+/* Try to find the offset of a given cluster if it's duplicated
+ * Exceptionally we cast return value to int64_t to use as error code.
+ *
+ * @data:            a buffer containing the cluster
+ * @skip_cluster_nr: the number of clusters to skip in the buffer
+ * @hash:            if hash is provided it's used else it's computed
+ * @ret:             QCowHashNode of the duplicated cluster or NULL
+ */
+static QCowHashNode *qcow2_build_hash_lookup_offset(BlockDriverState *bs,
+                                                    uint8_t *data,
+                                                    int skip_cluster_nr,
+                                                    uint8_t **hash)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowHashNode *hash_node;
+    if (!*hash) {
+        /* no hash has been provided compute it and store it for caller usage */
+        *hash = qcow2_compute_cluster_hash(bs,
+                                           data + skip_cluster_nr *
+                                           s->cluster_size);
+    }
+    hash_node =  g_tree_lookup(s->dedup_tree_by_hash, *hash);
+    if (hash_node) {
+        return hash_node;
+    }
+
+    /* cluster not duplicated */
+    hash_node = qcow2_dedup_build_qcow_hash_node(*hash,
+                                                 QCOW_FLAG_EMPTY,
+                                                 QCOW_FLAG_EMPTY);
+    g_tree_insert(s->dedup_tree_by_hash, *hash, hash_node);
+
+    return NULL;
+}
+
+/*
+ * Helper used to build a QCowHashElement
+ *
+ * @hash: the hash to use
+ * @ret:  a newly allocated QCowHashElement pointing to the given hash
+ */
+static QCowHashElement *qcow2_build_dedup_hash(uint8_t *hash)
+{
+    QCowHashElement *dedup_hash;
+    dedup_hash = g_new0(QCowHashElement, 1);
+    dedup_hash->hash = hash;
+    return dedup_hash;
+}
+
+/*
+ * Helper used to link a deduplicated cluster in the l2
+ *
+ * @logical_cluster_offset:  the cluster offset seen by the guest (in sectors)
+ * @physical_cluster_offset: the cluster offset in the QCOW2 file (in sectors)
+ * @overwrite:               true if we are overwriting an l2 table
+ * @ret:                     0 on success, negative on error
+ */
+static int qcow2_dedup_link_l2(BlockDriverState *bs,
+                               uint64_t logical_cluster_offset,
+                               uint64_t physical_cluster_offset,
+                               bool overwrite)
+{
+    QCowL2Meta m;
+    /* function correctness regarding copy on write ? */
+    m.offset         = logical_cluster_offset << 9;
+    m.alloc_offset   = physical_cluster_offset << 9;
+    m.nb_clusters    = 1; /* we are linking only one cluster in l2 */
+    m.cluster_offset = 0;
+    m.n_start        = 0;
+    m.nb_available   = 0;
+    m.oflag_copied   = false;
+    m.overwrite      = overwrite;
+    return qcow2_alloc_cluster_link_l2(bs, &m);
+}
+
+/* This function tries to deduplicate a given cluster.
+ *
+ * @sector_num:           the logical sector number we are trying to deduplicate
+ * @precomputed_hash:     Used instead of computing the hash if provided
+ * @data:                 the buffer in which to look for a duplicated cluster
+ * @skip_clusters_nr:     the number of clusters that must be skipped in data
+ * @non_duplicated_dedup_hash:   returned if the cluster is not deduplicated
+ * @ret:                  ret < 0 on error, 1 on deduplication else 0
+ */
+static int qcow2_dedup_cluster(BlockDriverState *bs,
+                               uint64_t sector_num,
+                               uint8_t **precomputed_hash,
+                               uint8_t *data,
+                               int skip_clusters_nr,
+                               QCowHashElement **non_duplicated_dedup_hash)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowHashNode *hash_node;
+    int64_t physical_cluster_offset;
+    uint64_t first_logical_offset;
+    uint64_t logical_cluster_offset;
+    uint64_t existing_physical_cluster_offset;
+    int ret;
+    int pnum = s->cluster_sectors;
+    *non_duplicated_dedup_hash = NULL;
+
+    /* look if the cluster is duplicated */
+    hash_node = qcow2_build_hash_lookup_offset(bs,
+                                               data,
+                                               skip_clusters_nr,
+                                               precomputed_hash);
+
+    /* round the logical cluster offset to cluster boundaries */
+    logical_cluster_offset = sector_num & ~(s->cluster_sectors - 1);
+    ret = qcow2_get_cluster_offset(bs, logical_cluster_offset << 9,
+                                   &pnum, &existing_physical_cluster_offset);
+    if (ret < 0) {
+        goto exit;
+    }
+
+    if (!hash_node) {
+        /* no duplicated cluster found, store the hash for later usage */
+        *non_duplicated_dedup_hash = qcow2_build_dedup_hash(*precomputed_hash);
+        return 0;
+    } else {
+        /* duplicated cluster found */
+        physical_cluster_offset = hash_node->offset;
+        first_logical_offset = hash_node->first_logical_offset;
+
+        if (existing_physical_cluster_offset != physical_cluster_offset << 9) {
+
+            ret = update_cluster_refcount(bs,
+                                          physical_cluster_offset /
+                                          s->cluster_sectors,
+                                          1,
+                                          false);
+            if (ret < 0) {
+                goto exit;
+            }
+
+            ret = qcow2_dedup_link_l2(bs, logical_cluster_offset,
+                                      physical_cluster_offset,
+                                      false);
+            if (ret < 0) {
+                goto exit;
+            }
+
+            /* if refcount was one remove the QCOW_FLAG_FIRST flag */
+            if (first_logical_offset & QCOW_FLAG_FIRST) {
+                first_logical_offset &= ~QCOW_FLAG_FIRST;
+                ret = qcow2_dedup_link_l2(bs, first_logical_offset,
+                                          physical_cluster_offset,
+                                          true);
+                if (ret < 0) {
+                    goto exit;
+                }
+                hash_node->first_logical_offset = first_logical_offset;
+            }
+        }
+    }
+
+    ret = 1;
+exit:
+    g_free(*precomputed_hash);
+    *precomputed_hash = NULL;
+    return ret;
+}
+
+/*
+ * Deduplicate all the clusters that can be deduplicated.
+ *
+ * Next it computes the number of non deduplicable sectors to come while storing
+ * the hashes of these sectors in a linked list for later usage.
+ * Then it computes the hash of the first duplicated cluster that comes after
+ * the non deduplicable clusters; this hash is used at the next call.
+ *
+ * @u:                where to store the list of undedupable hashes
+ * @sector_num:       the first sector to deduplicate (in sectors)
+ * @data:             the buffer containing the data to deduplicate
+ * @data_nr:          the size of the buffer in sectors
+ * @skip_clusters_nr: the number of clusters to skip at the beginning of data
+ * @next_non_dedupable_sectors_nr: result containing the number of non
+ *                                 deduplicable sectors to come
+ * @next_call_first_hash:          hash saved between calls of the function
+ *
+ * @ret:              the number of deduplicated sectors, 0 if none
+ *
+ */
+int qcow2_dedup(BlockDriverState *bs,
+                UndedupableHashes *u,
+                uint64_t sector_num,
+                uint8_t *data,
+                int data_nr,
+                int *skip_clusters_nr,
+                int *next_non_dedupable_sectors_nr,
+                uint8_t **next_call_first_hash)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret = 0;
+    int deduped_clusters_nr = 0;
+    int left_to_test_clusters_nr;
+    int begining_index;
+    uint8_t *hash = NULL;
+    QCowHashElement *non_duplicated_dedup_hash = NULL;
+
+    /* should already be zero when entering this function */
+    assert(*next_non_dedupable_sectors_nr == 0);
+
+    begining_index = sector_num & (s->cluster_sectors - 1);
+
+    left_to_test_clusters_nr = (data_nr / s->cluster_sectors) -
+                               *skip_clusters_nr;
+
+    /* Deduplicate all that can be */
+    while (left_to_test_clusters_nr-- &&
+           (ret = qcow2_dedup_cluster(bs,
+                                      sector_num,
+                                      next_call_first_hash,
+                                      data,
+                                      (*skip_clusters_nr)++,
+                                      &non_duplicated_dedup_hash)) == 1) {
+        sector_num += s->cluster_sectors;
+        deduped_clusters_nr++;
+    }
+
+    if (ret < 0) {
+        *next_call_first_hash = NULL;
+        goto exit;
+    }
+
+    /* We deduped everything till the end */
+    if (!non_duplicated_dedup_hash) {
+        *next_call_first_hash = NULL;
+        *next_non_dedupable_sectors_nr = 0;
+        goto exit;
+    }
+
+    /* We consumed the precomputed hash */
+    *next_call_first_hash = NULL;
+
+    /* remember that the last sector we tested is non deduplicable */
+    *next_non_dedupable_sectors_nr += s->cluster_sectors;
+
+    /* Memorize the hash of the first non duplicated cluster.
+     * we will store it before writing the cluster to disk.
+     */
+    QTAILQ_INSERT_TAIL(&u->undedupable_hashes, non_duplicated_dedup_hash, next);
+
+    /* Count how many non duplicated sectors can be written and memorize the
+     * hashes for later.
+     * We make sure we pass hash == NULL to force computation of the hash.
+     */
+    hash = NULL;
+    while (left_to_test_clusters_nr-- > 0 &&
+           !qcow2_build_hash_lookup_offset(bs,
+                                           data,
+                                           *skip_clusters_nr,
+                                           &hash)) {
+        *next_non_dedupable_sectors_nr += s->cluster_sectors;
+        non_duplicated_dedup_hash = qcow2_build_dedup_hash(hash);
+        QTAILQ_INSERT_TAIL(&u->undedupable_hashes,
+                           non_duplicated_dedup_hash, next);
+        hash = NULL;
+        (*skip_clusters_nr)++;
+    }
+
+    /* We found a duplicated cluster before we stopped iterating: store its
+     * hash for the next call.
+     */
+    if (hash && non_duplicated_dedup_hash &&
+        non_duplicated_dedup_hash->hash != hash) {
+        *next_call_first_hash = hash;
+    }
+
+exit:
+    if (!deduped_clusters_nr) {
+        return 0;
+    }
+    return deduped_clusters_nr * s->cluster_sectors - begining_index;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index ccb24ad..5c18425 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -58,6 +58,11 @@
 
 #define DEFAULT_CLUSTER_SIZE 65536
 
+/* indicates that the hash structure is empty and has no offset yet */
+#define QCOW_FLAG_EMPTY   (1ULL << 62)
+/* indicates that the cluster for this hash has QCOW_OFLAG_COPIED on disk */
+#define QCOW_FLAG_FIRST   (1ULL << 63)
+
 /* deduplication node */
 typedef struct {
     uint8_t *hash;         /* 32 bytes hash of a given cluster */
@@ -372,5 +377,13 @@ int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
                                              int sectors_nr,
                                              uint8_t **dedup_cluster_data,
                                              int *dedup_cluster_data_nr);
+int qcow2_dedup(BlockDriverState *bs,
+                UndedupableHashes *u,
+                uint64_t sector_num,
+                uint8_t *data,
+                int data_nr,
+                int *skip_clusters_nr,
+                int *next_non_dedupable_sectors_nr,
+                uint8_t **next_call_first_hash);
 
 #endif
-- 
1.7.10.4
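
As a reading aid for qcow2_dedup() in the patch above, its two scans can be
modelled without any QEMU types. This is only a minimal sketch under stated
assumptions, not the code of the patch: is_dup[] stands in for the lookup done
by qcow2_build_hash_lookup_offset(), and DedupScan, dedup_scan() and
deduped_sectors() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Result of one scan pass; both fields count clusters, not sectors. */
typedef struct {
    size_t deduped;      /* leading clusters that matched a known hash */
    size_t undedupable;  /* clusters right after them that matched nothing */
} DedupScan;

/*
 * is_dup[i] is true when cluster i of the buffer matches a known hash
 * (a stand-in for the tree lookup). Scanning starts at index skip.
 */
static DedupScan dedup_scan(const bool *is_dup, size_t nr, size_t skip)
{
    DedupScan r = { 0, 0 };
    size_t i = skip;

    assert(skip <= nr);

    /* Phase 1: consume the run of duplicated clusters. */
    while (i < nr && is_dup[i]) {
        r.deduped++;
        i++;
    }
    /* Phase 2: count the run of clusters that must be written out. */
    while (i < nr && !is_dup[i]) {
        r.undedupable++;
        i++;
    }
    /* A duplicate found at index i, if any, is left for the next call. */
    return r;
}

/* Cluster count to sectors, minus the offset into the first cluster. */
static long deduped_sectors(size_t deduped, int cluster_sectors,
                            int begin_index)
{
    if (deduped == 0) {
        return 0;
    }
    return (long)deduped * cluster_sectors - begin_index;
}
```

With 64 KiB clusters (128 sectors) and a request that starts 8 sectors into
its first cluster, two deduplicated clusters would be reported as 248
sectors, which mirrors the return expression at the end of qcow2_dedup().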


* [Qemu-devel] [RFC V3 07/24] qcow2: Add qcow2_dedup_write_new_hashes.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (5 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c |  220 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h       |    5 ++
 2 files changed, 225 insertions(+)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index a7c7202..83ad61e 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -31,6 +31,12 @@
 
 #define HASH_LENGTH 32
 
+static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
+                                       uint8_t **hash,
+                                       uint64_t *first_logical_offset,
+                                       uint64_t physical_cluster_offset,
+                                       bool write);
+
 /**
  * Read some data from the QCOW2 file
  *
@@ -336,7 +342,15 @@ static int qcow2_dedup_cluster(BlockDriverState *bs,
                 if (ret < 0) {
                     goto exit;
                 }
+
                 hash_node->first_logical_offset = first_logical_offset;
+                ret = qcow2_dedup_read_write_hash(bs, precomputed_hash,
+                                                  &first_logical_offset,
+                                                  physical_cluster_offset,
+                                                  true);
+                if (ret < 0) {
+                    goto exit;
+                }
             }
         }
     }
@@ -460,3 +474,209 @@ exit:
     }
     return deduped_clusters_nr * s->cluster_sectors - begining_index;
 }
+
+/* Read a hash cluster from disk or allocate it if it doesn't exist yet
+ *
+ * @in_dedup_table_index: The index of the hash cluster in the dedup table
+ * @hash_block:           the place where the cluster will be loaded
+ * @create:               set to true if dedup table entries must be created
+ *                        when not found
+ * @ret:                  0 on success, 1 if not found, -errno on error
+ */
+static int get_hash_cluster_from_cache(BlockDriverState *bs,
+                                       int32_t in_dedup_table_index,
+                                       uint8_t **hash_block, bool create)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret = -ENOSPC;
+    int64_t hash_cluster_offset;
+
+    if (in_dedup_table_index >= (s->dedup_table_size - 1)) {
+        goto fail;
+    }
+
+    hash_cluster_offset = s->dedup_table[in_dedup_table_index];
+    if (!hash_cluster_offset && create) {
+        /* the dedup table entry doesn't exist and we must create it */
+        uint64_t data64;
+        /* allocate a new dedup table cluster */
+        hash_cluster_offset = qcow2_alloc_clusters(bs, s->cluster_size);
+        if (hash_cluster_offset < 0) {
+            return hash_cluster_offset;
+        }
+
+        ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+        if (ret < 0) {
+            goto fail;
+        }
+
+        s->dedup_table[in_dedup_table_index] = hash_cluster_offset;
+        /* get an empty cluster from the dedup cache */
+        ret = qcow2_cache_get_empty(bs, s->dedup_cluster_cache,
+                                    hash_cluster_offset,
+                                    (void **) hash_block);
+        if (ret < 0) {
+            goto fail;
+        }
+        /* clear it */
+        memset(*hash_block, 0, s->cluster_size);
+        /* write the new block offset in the dedup table */
+        data64 = cpu_to_be64(hash_cluster_offset);
+        ret = bdrv_pwrite_sync(bs->file,
+                               s->dedup_table_offset +
+                               in_dedup_table_index * sizeof(uint64_t),
+                               &data64, sizeof(data64));
+        if (ret < 0) {
+            goto fail;
+        }
+    } else if (!hash_cluster_offset && !create) {
+        /* the dedup table entry doesn't exist and we must _not_ create it */
+        return 1;
+    } else {
+        /* the entry exists get it */
+        hash_cluster_offset = s->dedup_table[in_dedup_table_index];
+        ret = qcow2_cache_get(bs, s->dedup_cluster_cache,
+                              hash_cluster_offset, (void **) hash_block);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+
+fail:
+    qcow2_free_clusters(bs, hash_cluster_offset, s->cluster_size);
+    return ret;
+}
+
+/* Read/write a given hash and cluster_offset from/to the dedup table
+ *
+ * This function doesn't flush the dedup cache to disk
+ *
+ * @hash:                     the hash to read or store
+ * @first_logical_offset:     logical offset of the QCOW_OFLAG_COPIED cluster
+ * @physical_cluster_offset:  offset of the cluster in QCOW2 file (in sectors)
+ * @write:                    true to write, false to read
+ * @ret:                      0 on success, 1 if not found, -errno on error
+ */
+static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
+                                       uint8_t **hash,
+                                       uint64_t *first_logical_offset,
+                                       uint64_t physical_cluster_offset,
+                                       bool write)
+{
+    BDRVQcowState *s = bs->opaque;
+    uint8_t *hash_block = NULL;
+    int ret;
+    int64_t cluster_number;
+    int64_t in_dedup_table_index;
+    int hash_block_offset;
+    int nb_hash_in_dedup_cluster = s->cluster_size / (HASH_LENGTH + 8);
+    uint64_t first;
+
+    cluster_number = physical_cluster_offset / s->cluster_sectors;
+    in_dedup_table_index = cluster_number / nb_hash_in_dedup_cluster;
+
+    /* if we are doing a write this will create missing dedup table entries */
+    ret = get_hash_cluster_from_cache(bs, in_dedup_table_index,
+                                      &hash_block, write);
+    if (ret < 0) {
+        return ret;
+    }
+
+    hash_block_offset = (cluster_number % nb_hash_in_dedup_cluster) *
+                        (HASH_LENGTH + 8);
+    if (ret == 1) {
+        /* dedup cache is not used */
+        *hash = g_malloc0(HASH_LENGTH);
+        *first_logical_offset = 0;
+    } else if (write)  {
+        first = cpu_to_be64(*first_logical_offset);
+        memcpy(hash_block + hash_block_offset , *hash, HASH_LENGTH);
+        memcpy(hash_block + hash_block_offset + HASH_LENGTH, &first, 8);
+        qcow2_cache_entry_mark_dirty(s->dedup_cluster_cache, hash_block);
+    } else  {
+        *hash = g_malloc(HASH_LENGTH);
+        memcpy(*hash, hash_block + hash_block_offset, HASH_LENGTH);
+        memcpy(&first, hash_block + hash_block_offset + HASH_LENGTH, 8);
+        *first_logical_offset = be64_to_cpu(first);
+    }
+
+    if (!ret) {
+        qcow2_cache_put(bs, s->dedup_cluster_cache, (void **) &hash_block);
+    }
+
+    return ret;
+}
+
+static void qcow2_dedup_remove_old_hash_by_offset(BlockDriverState *bs,
+                                                  uint64_t offset)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowHashNode *hash_node;
+
+    hash_node = g_tree_lookup(s->dedup_tree_by_offset, &offset);
+
+    if (hash_node) {
+        g_tree_remove(s->dedup_tree_by_offset, &hash_node->offset);
+        g_tree_remove(s->dedup_tree_by_hash, hash_node->hash);
+    }
+}
+
+/* This function writes the hashes of the clusters which are not duplicated
+ *
+ * @u:                       the list of undedupable hashes
+ * @logical_cluster_offset:  logical offset of the first cluster (in sectors)
+ * @physical_cluster_offset: offset of the first cluster (in sectors)
+ * @ret:                     0 on success, -errno on error
+ */
+int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
+                                 UndedupableHashes *u,
+                                 int hash_count,
+                                 uint64_t logical_cluster_offset,
+                                 uint64_t physical_cluster_offset)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    QCowHashElement *dedup_hash, *next_dedup_hash;
+    QCowHashNode *hash_node;
+
+    int i = 0;
+
+    QTAILQ_FOREACH_SAFE(dedup_hash, &u->undedupable_hashes,
+                        next, next_dedup_hash) {
+        uint64_t physical = physical_cluster_offset + i * s->cluster_sectors;
+        uint64_t logical = logical_cluster_offset + i * s->cluster_sectors;
+
+        hash_node = g_tree_lookup(s->dedup_tree_by_hash, dedup_hash->hash);
+
+        if (hash_node && hash_node->offset & QCOW_FLAG_EMPTY) {
+            logical = logical | QCOW_FLAG_FIRST;
+            hash_node->offset = physical;
+            hash_node->first_logical_offset = logical &
+                                              ~(s->cluster_sectors - 1);
+            qcow2_dedup_remove_old_hash_by_offset(bs, hash_node->offset);
+            g_tree_insert(s->dedup_tree_by_offset, &hash_node->offset,
+                          hash_node);
+
+            ret = qcow2_dedup_read_write_hash(bs, &dedup_hash->hash,
+                                              &logical,
+                                              physical,
+                                              true);
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+
+        QTAILQ_REMOVE(&u->undedupable_hashes, dedup_hash, next);
+        g_free(dedup_hash);
+        i++;
+        if (i == hash_count) {
+            break;
+        }
+    }
+
+    ret = qcow2_cache_flush(bs, s->dedup_cluster_cache);
+fail:
+    return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 5c18425..3e05a8c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -385,5 +385,10 @@ int qcow2_dedup(BlockDriverState *bs,
                 int *skip_clusters_nr,
                 int *next_non_dedupable_sectors_nr,
                 uint8_t **next_call_first_hash);
+int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
+                                 UndedupableHashes *u,
+                                 int hash_count,
+                                 uint64_t logical_cluster_offset,
+                                 uint64_t physical_cluster_offset);
 
 #endif
-- 
1.7.10.4
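
The index arithmetic in qcow2_dedup_read_write_hash() can be checked in
isolation. The sketch below is a simplified model, not the patch itself: it
assumes 512-byte sectors (as the "<< 9" shifts elsewhere in the series
suggest) and an entry made of the 32-byte hash followed by the 64-bit first
logical offset in big-endian order. HashSlot, hash_slot(), hash_entry_store()
and hash_entry_load_offset() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_LENGTH 32                  /* as in the patch */
#define ENTRY_SIZE  (HASH_LENGTH + 8)   /* hash + 64-bit logical offset */

/* Where the entry for one physical cluster lives (hypothetical type). */
typedef struct {
    uint64_t table_index;   /* index into the dedup table */
    uint32_t byte_offset;   /* offset of the entry inside the hash cluster */
} HashSlot;

static HashSlot hash_slot(uint64_t physical_sector, uint32_t cluster_size)
{
    uint32_t cluster_sectors = cluster_size / 512;
    uint32_t per_cluster = cluster_size / ENTRY_SIZE;
    uint64_t cluster_number;
    HashSlot slot;

    assert(cluster_sectors > 0 && per_cluster > 0);
    cluster_number = physical_sector / cluster_sectors;
    slot.table_index = cluster_number / per_cluster;
    slot.byte_offset = (uint32_t)(cluster_number % per_cluster) * ENTRY_SIZE;
    return slot;
}

/* Serialize one entry: raw hash, then the offset in big-endian order. */
static void hash_entry_store(uint8_t *entry, const uint8_t *hash,
                             uint64_t first_logical_offset)
{
    int i;

    memcpy(entry, hash, HASH_LENGTH);
    for (i = 0; i < 8; i++) {
        entry[HASH_LENGTH + i] =
            (uint8_t)(first_logical_offset >> (56 - 8 * i));
    }
}

/* Read back the big-endian offset stored after the hash. */
static uint64_t hash_entry_load_offset(const uint8_t *entry)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++) {
        v = (v << 8) | entry[HASH_LENGTH + i];
    }
    return v;
}
```

With the default 64 KiB cluster size this gives 1638 entries per hash cluster
(65536 / 40), so the last 16 bytes of each hash cluster stay unused.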


* [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (6 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 07/24] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 13:28   ` Stefan Hajnoczi
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 09/24] qcow2: Extract qcow2_dedup_grow_table Benoît Canet
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 Makefile            |    3 +++
 Makefile.target     |    2 +-
 block/qcow2-dedup.c |   10 ++++++++--
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 88285a4..c79b2da 100644
--- a/Makefile
+++ b/Makefile
@@ -168,6 +168,9 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) $(qapi-obj-y) \
                               qapi-visit.o qapi-types.o
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
+qemu-img$(EXESUF): LIBS+=-lcrypto
+qemu-nbd$(EXESUF): LIBS+=-lcrypto
+qemu-io$(EXESUF): LIBS+=-lcrypto
 
 qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
 
diff --git a/Makefile.target b/Makefile.target
index 3822bc5..f9a988a 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -119,7 +119,7 @@ obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
 obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
 obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
 obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
-LIBS+=-lz
+LIBS+=-lz -lcrypto
 
 QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
 QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 83ad61e..37e8266 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -25,11 +25,13 @@
  * THE SOFTWARE.
  */
 
+#include <openssl/sha.h>
+#include <openssl/evp.h>
 #include "block_int.h"
 #include "qemu-common.h"
 #include "qcow2.h"
 
-#define HASH_LENGTH 32
+#define HASH_LENGTH SHA256_DIGEST_LENGTH
 
 static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
                                        uint8_t **hash,
@@ -188,7 +190,11 @@ static QCowHashNode *qcow2_dedup_build_qcow_hash_node(uint8_t *hash,
 static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
                                            uint8_t *data)
 {
-    return NULL;
+    BDRVQcowState *s = bs->opaque;
+    uint8_t *hash = g_malloc0(HASH_LENGTH);
+    EVP_Digest(data, s->cluster_size,
+               hash, NULL, EVP_sha256(), NULL);
+    return hash;
 }
 
 /* Try to find the offset of a given cluster if it's duplicated
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 09/24] qcow2: Extract qcow2_dedup_grow_table
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (7 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 10/24] qcow2: create function to load deduplication hashes at startup Benoît Canet
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-cluster.c |  102 +++++++++++++++++++++++++++++++------------------
 block/qcow2-dedup.c   |   45 +++++++++++++++++++++-
 block/qcow2.h         |    9 +++++
 3 files changed, 116 insertions(+), 40 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 9a07191..8db1b2a 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,44 +29,48 @@
 #include "block/qcow2.h"
 #include "trace.h"
 
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
+int qcow2_do_grow_table(BlockDriverState *bs, int min_size, bool exact_size,
+                        uint64_t **table, uint64_t *table_offset,
+                        int *table_size, qcow2_save_table save_table,
+                        const char *table_name)
 {
     BDRVQcowState *s = bs->opaque;
-    int new_l1_size, new_l1_size2, ret, i;
-    uint64_t *new_l1_table;
-    int64_t new_l1_table_offset;
-    uint8_t data[12];
+    int new_size, new_size2, ret, i;
+    uint64_t *new_table;
+    int64_t new_table_offset;
 
-    if (min_size <= s->l1_size)
+    if (min_size <= *table_size) {
         return 0;
+    }
 
     if (exact_size) {
-        new_l1_size = min_size;
+        new_size = min_size;
     } else {
         /* Bump size up to reduce the number of times we have to grow */
-        new_l1_size = s->l1_size;
-        if (new_l1_size == 0) {
-            new_l1_size = 1;
+        new_size = *table_size;
+        if (new_size == 0) {
+            new_size = 1;
         }
-        while (min_size > new_l1_size) {
-            new_l1_size = (new_l1_size * 3 + 1) / 2;
+        while (min_size > new_size) {
+            new_size = (new_size * 3 + 1) / 2;
         }
     }
 
 #ifdef DEBUG_ALLOC2
-    fprintf(stderr, "grow l1_table from %d to %d\n", s->l1_size, new_l1_size);
+    fprintf(stderr, "grow %s_table from %d to %d\n",
+            table_name, *table_size, new_size);
 #endif
 
-    new_l1_size2 = sizeof(uint64_t) * new_l1_size;
-    new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
-    memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
+    new_size2 = sizeof(uint64_t) * new_size;
+    new_table = g_malloc0(align_offset(new_size2, 512));
+    memcpy(new_table, *table, *table_size * sizeof(uint64_t));
 
     /* write new table (align to cluster) */
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
-    new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
-    if (new_l1_table_offset < 0) {
-        g_free(new_l1_table);
-        return new_l1_table_offset;
+    new_table_offset = qcow2_alloc_clusters(bs, new_size2);
+    if (new_table_offset < 0) {
+        g_free(new_table);
+        return new_table_offset;
     }
 
     ret = qcow2_cache_flush(bs, s->refcount_block_cache);
@@ -75,34 +79,56 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
-    for(i = 0; i < s->l1_size; i++)
-        new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
-    ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset, new_l1_table, new_l1_size2);
+    for (i = 0; i < *table_size; i++) {
+        new_table[i] = cpu_to_be64(new_table[i]);
+    }
+    ret = bdrv_pwrite_sync(bs->file, new_table_offset, new_table, new_size2);
     if (ret < 0)
         goto fail;
-    for(i = 0; i < s->l1_size; i++)
-        new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
+    for (i = 0; i < *table_size; i++) {
+        new_table[i] = be64_to_cpu(new_table[i]);
+    }
+
+    g_free(*table);
+    qcow2_free_clusters(bs, *table_offset, *table_size * sizeof(uint64_t));
+    *table_offset = new_table_offset;
+    *table = new_table;
+    *table_size = new_size;
 
     /* set new table */
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
-    cpu_to_be32w((uint32_t*)data, new_l1_size);
-    cpu_to_be64wu((uint64_t*)(data + 4), new_l1_table_offset);
-    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size), data,sizeof(data));
-    if (ret < 0) {
-        goto fail;
-    }
-    g_free(s->l1_table);
-    qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t));
-    s->l1_table_offset = new_l1_table_offset;
-    s->l1_table = new_l1_table;
-    s->l1_size = new_l1_size;
+    save_table(bs, *table_offset, *table_size);
+
     return 0;
  fail:
-    g_free(new_l1_table);
-    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
+    g_free(new_table);
+    qcow2_free_clusters(bs, new_table_offset, new_size2);
     return ret;
 }
 
+static int qcow2_l1_save_table(BlockDriverState *bs,
+                               int64_t table_offset, int size)
+{
+    uint8_t data[12];
+    cpu_to_be32w((uint32_t *)data, size);
+    cpu_to_be64wu((uint64_t *)(data + 4), table_offset);
+    return bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+                            data, sizeof(data));
+}
+
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
+{
+    BDRVQcowState *s = bs->opaque;
+    return qcow2_do_grow_table(bs,
+                               min_size,
+                               exact_size,
+                               &s->l1_table,
+                               &s->l1_table_offset,
+                               &s->l1_size,
+                               qcow2_l1_save_table,
+                               "l1");
+}
+
 /*
  * l2_load
  *
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 37e8266..2ebbbcf 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -497,8 +497,11 @@ static int get_hash_cluster_from_cache(BlockDriverState *bs,
     int ret = -ENOSPC;
     int64_t hash_cluster_offset;
 
-    if (in_dedup_table_index >= (s->dedup_table_size - 1)) {
-        goto fail;
+    if (in_dedup_table_index >= (s->dedup_table_size-1)) {
+        ret = qcow2_dedup_grow_table(bs, in_dedup_table_index + 1, false);
+        if (ret < 0) {
+            return ret;
+        }
     }
 
     hash_cluster_offset = s->dedup_table[in_dedup_table_index];
@@ -686,3 +689,41 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
 fail:
     return ret;
 }
+
+/*
+ * Save the dedup table information into the header extensions
+ *
+ * @table_offset: the dedup table offset in the QCOW2 file
+ * @size:         the size of the dedup table
+ * @ret:          0 on success, -errno  on error
+ */
+static int qcow2_dedup_save_table_info(BlockDriverState *bs,
+                                       int64_t table_offset, int size)
+{
+    BDRVQcowState *s = bs->opaque;
+    s->dedup_table_offset = table_offset;
+    s->dedup_table_size = size;
+    return qcow2_update_header(bs);
+}
+
+/*
+ * Grow the deduplication table
+ *
+ * @min_size:   minimal size
+ * @exact_size: if true force to grow to the exact size
+ * @ret:        0 on success, -errno  on error
+ */
+int qcow2_dedup_grow_table(BlockDriverState *bs,
+                           int min_size,
+                           bool exact_size)
+{
+    BDRVQcowState *s = bs->opaque;
+    return qcow2_do_grow_table(bs,
+                               min_size,
+                               exact_size,
+                               &s->dedup_table,
+                               &s->dedup_table_offset,
+                               &s->dedup_table_size,
+                               qcow2_dedup_save_table_info,
+                               "dedup");
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 3e05a8c..62822b7 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -323,6 +323,12 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                           BdrvCheckMode fix);
 
 /* qcow2-cluster.c functions */
+typedef int (*qcow2_save_table)(BlockDriverState *bs,
+                                int64_t table_offset, int size);
+int qcow2_do_grow_table(BlockDriverState *bs, int min_size, bool exact_size,
+                        uint64_t **table, uint64_t *table_offset,
+                        int *table_size, qcow2_save_table save_table,
+                        const char *table_name);
 int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
 void qcow2_l2_cache_reset(BlockDriverState *bs);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
@@ -390,5 +396,8 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
                                  int hash_count,
                                  uint64_t logical_cluster_offset,
                                  uint64_t physical_cluster_offset);
+int qcow2_dedup_grow_table(BlockDriverState *bs,
+                           int min_size,
+                           bool exact_size);
 
 #endif
-- 
1.7.10.4
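
The sizing policy that qcow2_do_grow_table() now shares between the L1 table
and the dedup table can be looked at on its own. The helper below is a
minimal sketch of that policy only (no allocation, no I/O);
grown_table_size() is a hypothetical name, and the sketch does not guard
against integer overflow for very large tables.

```c
#include <assert.h>

/*
 * Return the table size, in entries, needed to hold min_size entries.
 * Returns cur_size unchanged when the table is already large enough.
 */
static int grown_table_size(int cur_size, int min_size, int exact_size)
{
    int new_size;

    assert(cur_size >= 0 && min_size >= 0);

    if (min_size <= cur_size) {
        return cur_size;
    }
    if (exact_size) {
        return min_size;
    }
    /* Grow by about 1.5x per step to limit the number of reallocations. */
    new_size = cur_size ? cur_size : 1;
    while (min_size > new_size) {
        new_size = (new_size * 3 + 1) / 2;
    }
    return new_size;
}
```

Starting from one entry, the non-exact path steps through 2, 3, 5, 8, 12, 18
and so on, so a request for 4 entries ends up with a table of 5.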


* [Qemu-devel] [RFC V3 10/24] qcow2: create function to load deduplication hashes at startup.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (8 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 09/24] qcow2: Extract qcow2_dedup_grow_table Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 11/24] qcow2: Load and save deduplication table header extension Benoît Canet
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h       |    1 +
 2 files changed, 67 insertions(+)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 2ebbbcf..1760e8a 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -690,6 +690,72 @@ fail:
     return ret;
 }
 
+static void qcow2_dedup_insert_hash_and_preserve_newer(BlockDriverState *bs,
+                                                       QCowHashNode *hash_node)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowHashNode *newer_hash_node;
+
+    newer_hash_node = g_tree_lookup(s->dedup_tree_by_offset,
+                                    &hash_node->offset);
+
+    if (!newer_hash_node) {
+        g_tree_insert(s->dedup_tree_by_hash, hash_node->hash, hash_node);
+        g_tree_insert(s->dedup_tree_by_offset, &hash_node->offset, hash_node);
+    } else {
+        g_free(hash_node->hash);
+        g_free(hash_node);
+    }
+}
+
+/*
+ * This coroutine loads the deduplication hashes into the tree
+ *
+ * @opaque: the given BlockDriverState
+ * @ret:    none (the function returns void)
+ */
+void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    BDRVQcowState *s = bs->opaque;
+    int ret;
+    uint8_t *hash = NULL;
+    uint8_t null_hash[HASH_LENGTH];
+    uint64_t max_cluster_offset, i;
+    uint64_t first_logical_offset;
+    int nb_hash_in_dedup_cluster = s->cluster_size / (HASH_LENGTH + 8);
+    QCowHashNode *hash_node;
+
+    /* prepare the null hash */
+    memset(null_hash, 0, HASH_LENGTH);
+
+    max_cluster_offset = s->dedup_table_size * nb_hash_in_dedup_cluster;
+
+    for (i = 0; i < max_cluster_offset; i++) {
+        /* get the hash */
+        qemu_co_mutex_lock(&s->lock);
+        ret = qcow2_dedup_read_write_hash(bs, &hash,
+                                          &first_logical_offset,
+                                          i * s->cluster_sectors,
+                                          false);
+        if (ret < 0) {
+            qemu_co_mutex_unlock(&s->lock);
+            error_report("Failed to load deduplication hash.");
+        }
+
+        /* if the hash is not null load it into the tree */
+        if (memcmp(hash, null_hash, HASH_LENGTH)) {
+            hash_node = qcow2_dedup_build_qcow_hash_node(hash,
+                                                         i * s->cluster_sectors,
+                                                         first_logical_offset);
+            qcow2_dedup_insert_hash_and_preserve_newer(bs, hash_node);
+        } else {
+            g_free(hash);
+        }
+        qemu_co_mutex_unlock(&s->lock);
+    }
+}
+
 /*
  * Save the dedup table information into the header extensions
  *
diff --git a/block/qcow2.h b/block/qcow2.h
index 62822b7..c7edb14 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -396,6 +396,7 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
                                  int hash_count,
                                  uint64_t logical_cluster_offset,
                                  uint64_t physical_cluster_offset);
+void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque);
 int qcow2_dedup_grow_table(BlockDriverState *bs,
                            int min_size,
                            bool exact_size);
-- 
1.7.10.4
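
A note on the loop bound in qcow2_co_load_dedup_hashes(): the code divides
the cluster size by (HASH_LENGTH + 8), which suggests that each on-disk
record is a hash followed by a 64-bit logical offset. The sketch below only
redoes that arithmetic in standalone C. HASH_LENGTH is not defined in this
patch, so the value 32 is an assumption, and every sketch_* name is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-byte digest; the real HASH_LENGTH is defined elsewhere. */
#define SKETCH_HASH_LENGTH 32
/* One record: the hash followed by a 64-bit logical offset. */
#define SKETCH_RECORD_SIZE (SKETCH_HASH_LENGTH + 8)

/* Number of whole hash records that fit in one cluster. */
static uint64_t sketch_hashes_per_cluster(uint64_t cluster_size)
{
    assert(cluster_size >= SKETCH_RECORD_SIZE);
    return cluster_size / SKETCH_RECORD_SIZE;
}

/* Loop upper bound: table entries times records per cluster. */
static uint64_t sketch_max_hash_slots(uint64_t table_entries,
                                      uint64_t cluster_size)
{
    return table_entries * sketch_hashes_per_cluster(cluster_size);
}
```

Under these assumptions, a 4096-byte cluster holds 102 records, and a
512-entry table (the size chosen at image creation later in this series)
gives 52224 slots for the loader to visit.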

* [Qemu-devel] [RFC V3 11/24] qcow2: Load and save deduplication table header extension.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (9 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 10/24] qcow2: create function to load deduplication hashes at startup Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 12/24] qcow2: Extract qcow2_do_table_init Benoît Canet
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2.c |   35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index b5276c0..6329770 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -53,9 +53,15 @@ typedef struct {
     uint32_t len;
 } QCowExtension;
 
+typedef struct {
+    uint64_t offset;
+    int32_t  size;
+} QCowDedupTableExtension;
+
 #define  QCOW2_EXT_MAGIC_END 0
 #define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
 #define  QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
+#define  QCOW2_EXT_MAGIC_DEDUP_TABLE 0xCD8E819B
 
 static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
@@ -84,6 +90,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
     QCowExtension ext;
     uint64_t offset;
     int ret;
+    QCowDedupTableExtension dedup_table_extension;
 
 #ifdef DEBUG_EXT
     printf("qcow2_read_extensions: start=%ld end=%ld\n", start_offset, end_offset);
@@ -148,6 +155,18 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             }
             break;
 
+        case QCOW2_EXT_MAGIC_DEDUP_TABLE:
+            ret = bdrv_pread(bs->file, offset,
+                             &dedup_table_extension, ext.len);
+            if (ret < 0) {
+                return ret;
+            }
+            s->dedup_table_offset =
+                be64_to_cpu(dedup_table_extension.offset);
+            s->dedup_table_size =
+                be32_to_cpu(dedup_table_extension.size);
+            break;
+
         default:
             /* unknown magic - save it in case we need to rewrite the header */
             {
@@ -966,6 +985,7 @@ int qcow2_update_header(BlockDriverState *bs)
     uint32_t refcount_table_clusters;
     size_t header_length;
     Qcow2UnknownHeaderExtension *uext;
+    QCowDedupTableExtension dedup_table_extension;
 
     buf = qemu_blockalign(bs, buflen);
 
@@ -1069,6 +1089,21 @@ int qcow2_update_header(BlockDriverState *bs)
     buf += ret;
     buflen -= ret;
 
+    if (s->has_dedup) {
+        dedup_table_extension.offset = cpu_to_be64(s->dedup_table_offset);
+        dedup_table_extension.size = cpu_to_be32(s->dedup_table_size);
+        ret = header_ext_add(buf,
+                             QCOW2_EXT_MAGIC_DEDUP_TABLE,
+                             &dedup_table_extension,
+                             sizeof(dedup_table_extension),
+                             buflen);
+        if (ret < 0) {
+            goto fail;
+        }
+        buf += ret;
+        buflen -= ret;
+    }
+
     /* Keep unknown header extensions */
     QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
         ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
-- 
1.7.10.4
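
The extension payload in this patch is a big-endian 64-bit table offset
followed by a big-endian 32-bit table size. The standalone sketch below
shows one way to encode and decode such a payload byte by byte. It is not
the code from the patch: the fixed 12-byte length and all sketch_* names are
assumptions (the patch writes sizeof() of the struct, which may include
padding). The length check in the decoder is a suggestion, since the patch
reads ext.len bytes into a fixed-size struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed payload length: 8-byte offset + 4-byte size, no padding. */
#define SKETCH_DEDUP_EXT_LEN 12

typedef struct {
    uint64_t offset;
    int32_t size;
} SketchDedupExt;

/* Store the low nbytes of v at p, most significant byte first. */
static void sketch_put_be(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = nbytes - 1; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Read nbytes from p as a big-endian unsigned value. */
static uint64_t sketch_get_be(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* buf must have room for SKETCH_DEDUP_EXT_LEN bytes. */
static void sketch_dedup_ext_encode(const SketchDedupExt *e, uint8_t *buf)
{
    sketch_put_be(buf, e->offset, 8);
    sketch_put_be(buf + 8, (uint32_t)e->size, 4);
}

/* Returns 0 on success, -1 if the payload is shorter than expected. */
static int sketch_dedup_ext_decode(const uint8_t *buf, size_t len,
                                   SketchDedupExt *e)
{
    if (len < SKETCH_DEDUP_EXT_LEN) {
        return -1;
    }
    e->offset = sketch_get_be(buf, 8);
    e->size = (int32_t)(uint32_t)sketch_get_be(buf + 8, 4);
    return 0;
}
```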

* [Qemu-devel] [RFC V3 12/24] qcow2: Extract qcow2_do_table_init.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (10 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 11/24] qcow2: Load and save deduplication table header extension Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 13/24] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-refcount.c |   43 ++++++++++++++++++++++++++++++-------------
 block/qcow2.h          |    5 +++++
 2 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index faca64c..7681001 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -35,27 +35,44 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
 /*********************************************************/
 /* refcount handling */
 
-int qcow2_refcount_init(BlockDriverState *bs)
+int qcow2_do_table_init(BlockDriverState *bs,
+                        uint64_t **table,
+                        int64_t offset,
+                        int size,
+                        bool is_refcount)
 {
-    BDRVQcowState *s = bs->opaque;
-    int ret, refcount_table_size2, i;
-
-    refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
-    s->refcount_table = g_malloc(refcount_table_size2);
-    if (s->refcount_table_size > 0) {
-        BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
-        ret = bdrv_pread(bs->file, s->refcount_table_offset,
-                         s->refcount_table, refcount_table_size2);
-        if (ret != refcount_table_size2)
+    int ret, size2, i;
+
+    size2 = size * sizeof(uint64_t);
+    *table = g_malloc(size2);
+    if (size > 0) {
+        if (is_refcount) {
+            BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
+        }
+        ret = bdrv_pread(bs->file, offset,
+                         *table, size2);
+        if (ret != size2) {
             goto fail;
-        for(i = 0; i < s->refcount_table_size; i++)
-            be64_to_cpus(&s->refcount_table[i]);
+        }
+        for (i = 0; i < size; i++) {
+            be64_to_cpus(&(*table)[i]);
+        }
     }
     return 0;
  fail:
     return -ENOMEM;
 }
 
+int qcow2_refcount_init(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    return qcow2_do_table_init(bs,
+                               &s->refcount_table,
+                               s->refcount_table_offset,
+                               s->refcount_table_size,
+                               true);
+}
+
 void qcow2_refcount_close(BlockDriverState *bs)
 {
     BDRVQcowState *s = bs->opaque;
diff --git a/block/qcow2.h b/block/qcow2.h
index c7edb14..af80d16 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -300,6 +300,11 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
 int qcow2_update_header(BlockDriverState *bs);
 
 /* qcow2-refcount.c functions */
+int qcow2_do_table_init(BlockDriverState *bs,
+                        uint64_t **table,
+                        int64_t offset,
+                        int size,
+                        bool is_refcount);
 int qcow2_refcount_init(BlockDriverState *bs);
 void qcow2_refcount_close(BlockDriverState *bs);
 
-- 
1.7.10.4
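
The extracted qcow2_do_table_init() reads size * 8 bytes from the image and
converts every entry from big-endian to host order. A minimal standalone
sketch of that pattern follows, using a byte buffer in place of the image
file. All sketch_* names are hypothetical, and returning NULL on a short
buffer is a choice made for this sketch (the patch keeps the existing
behaviour of returning -ENOMEM when the read is short).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode one big-endian 64-bit value. */
static uint64_t sketch_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * raw stands in for the image file, entries for the table size.
 * Returns a malloc'd host-endian table that the caller must free,
 * or NULL if raw is too short or the allocation fails.
 */
static uint64_t *sketch_load_table(const uint8_t *raw, size_t raw_len,
                                   size_t entries)
{
    uint64_t *table;

    if (entries > raw_len / 8) {
        return NULL;
    }
    table = malloc(entries ? entries * sizeof(*table) : 1);
    if (!table) {
        return NULL;
    }
    for (size_t i = 0; i < entries; i++) {
        table[i] = sketch_be64(raw + i * 8);
    }
    return table;
}
```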

* [Qemu-devel] [RFC V3 13/24] qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (11 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 12/24] qcow2: Extract qcow2_do_table_init Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 14/24] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c |   16 ++++++++++++++++
 block/qcow2.h       |    2 ++
 2 files changed, 18 insertions(+)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 1760e8a..80e9477 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -793,3 +793,19 @@ int qcow2_dedup_grow_table(BlockDriverState *bs,
                                qcow2_dedup_save_table_info,
                                "dedup");
 }
+
+int qcow2_dedup_init(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    return qcow2_do_table_init(bs,
+                               &s->dedup_table,
+                               s->dedup_table_offset,
+                               s->dedup_table_size,
+                               false);
+}
+
+void qcow2_dedup_close(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    g_free(s->dedup_table);
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index af80d16..87a7f43 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -405,5 +405,7 @@ void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque);
 int qcow2_dedup_grow_table(BlockDriverState *bs,
                            int min_size,
                            bool exact_size);
+int qcow2_dedup_init(BlockDriverState *bs);
+void qcow2_dedup_close(BlockDriverState *bs);
 
 #endif
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 14/24] qcow2: Extract qcow2_add_feature and qcow2_remove_feature.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (12 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 13/24] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 15/24] block: Add dedup image create option Benoît Canet
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

These functions will be used to mark that deduplication is activated
on an image.
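
As an illustration of the add/remove pattern, here is a standalone sketch
with the header write replaced by an in-memory stand-in. The SketchImage
type, its fail_writes switch and all sketch_* names are hypothetical. One
deliberate difference from the patch: the sketch clears the in-memory bit
only after the write succeeds, whereas qcow2_remove_feature() clears it
before calling qcow2_update_header().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    SKETCH_INCOMPAT_DIRTY = 1 << 0,
    SKETCH_INCOMPAT_DEDUP = 1 << 1,
};

typedef struct {
    uint64_t incompatible_features; /* in-memory copy of the bits */
    uint64_t on_disk;               /* stands in for the header field */
    bool fail_writes;               /* simulate an I/O error */
} SketchImage;

/* Pretend to write the feature word to the header. */
static int sketch_write_header(SketchImage *img, uint64_t val)
{
    if (img->fail_writes) {
        return -5; /* arbitrary negative error code for the sketch */
    }
    img->on_disk = val;
    return 0;
}

/* Set a feature bit; memory is updated only if the write succeeded. */
static int sketch_add_feature(SketchImage *img, uint64_t feature)
{
    int ret;

    assert(feature != 0);
    if (img->incompatible_features & feature) {
        return 0; /* already added */
    }
    ret = sketch_write_header(img, img->incompatible_features | feature);
    if (ret < 0) {
        return ret;
    }
    img->incompatible_features |= feature;
    return 0;
}

/* Clear a feature bit; memory is updated only if the write succeeded. */
static int sketch_remove_feature(SketchImage *img, uint64_t feature)
{
    uint64_t val = img->incompatible_features & ~feature;
    int ret;

    if (!(img->incompatible_features & feature)) {
        return 0; /* not set, nothing to do */
    }
    ret = sketch_write_header(img, val);
    if (ret < 0) {
        return ret;
    }
    img->incompatible_features = val;
    return 0;
}
```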

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2.c |   40 +++++++++++++++++++++++++++-------------
 block/qcow2.h |    4 ++--
 2 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 6329770..76d2340 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -235,13 +235,14 @@ static void report_unsupported_feature(BlockDriverState *bs,
 }
 
 /*
- * Sets the dirty bit and flushes afterwards if necessary.
+ * Sets an incompatible feature bit and flushes afterwards if necessary.
  *
  * The incompatible_features bit is only set if the image file header was
  * updated successfully.  Therefore it is not required to check the return
  * value of this function.
  */
-static int qcow2_mark_dirty(BlockDriverState *bs)
+static int qcow2_add_feature(BlockDriverState *bs,
+                             QCow2IncompatibleFeature feature)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t val;
@@ -249,11 +250,11 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
 
     assert(s->qcow_version >= 3);
 
-    if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
-        return 0; /* already dirty */
+    if (s->incompatible_features & feature) {
+        return 0; /* already added */
     }
 
-    val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
+    val = cpu_to_be64(s->incompatible_features | feature);
     ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features),
                       &val, sizeof(val));
     if (ret < 0) {
@@ -264,32 +265,45 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
         return ret;
     }
 
-    /* Only treat image as dirty if the header was updated successfully */
-    s->incompatible_features |= QCOW2_INCOMPAT_DIRTY;
+    /* Only treat image as having the feature if the header was updated
+     * successfully
+     */
+    s->incompatible_features |= feature;
     return 0;
 }
 
+static int qcow2_mark_dirty(BlockDriverState *bs)
+{
+    return qcow2_add_feature(bs, QCOW2_INCOMPAT_DIRTY);
+}
+
 /*
- * Clears the dirty bit and flushes before if necessary.  Only call this
- * function when there are no pending requests, it does not guard against
- * concurrent requests dirtying the image.
+ * Clears an incompatible feature bit and flushes before if necessary.
+ * Only call this function when there are no pending requests, it does not
+ * guard against concurrent requests adding a feature to the image.
  */
-static int qcow2_mark_clean(BlockDriverState *bs)
+static int qcow2_remove_feature(BlockDriverState *bs,
+                                QCow2IncompatibleFeature feature)
 {
     BDRVQcowState *s = bs->opaque;
 
-    if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
+    if (s->incompatible_features & feature) {
         int ret = bdrv_flush(bs);
         if (ret < 0) {
             return ret;
         }
 
-        s->incompatible_features &= ~QCOW2_INCOMPAT_DIRTY;
+        s->incompatible_features &= ~feature;
         return qcow2_update_header(bs);
     }
     return 0;
 }
 
+static int qcow2_mark_clean(BlockDriverState *bs)
+{
+    return qcow2_remove_feature(bs, QCOW2_INCOMPAT_DIRTY);
+}
+
 static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
                        BdrvCheckMode fix)
 {
diff --git a/block/qcow2.h b/block/qcow2.h
index 87a7f43..9d08bf9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -133,14 +133,14 @@ enum {
 };
 
 /* Incompatible feature bits */
-enum {
+typedef enum {
     QCOW2_INCOMPAT_DIRTY_BITNR   = 0,
     QCOW2_INCOMPAT_DIRTY         = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
     QCOW2_INCOMPAT_DEDUP_BITNR   = 1,
     QCOW2_INCOMPAT_DEDUP         = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
 
     QCOW2_INCOMPAT_MASK          = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
-};
+} QCow2IncompatibleFeature;
 
 /* Compatible feature bits */
 enum {
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 15/24] block: Add dedup image create option.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (13 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 14/24] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 16/24] qcow2: Allow creation of images using deduplication Benoît Canet
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block_int.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/block_int.h b/block_int.h
index f4bae04..6419513 100644
--- a/block_int.h
+++ b/block_int.h
@@ -55,6 +55,7 @@
 #define BLOCK_OPT_SUBFMT            "subformat"
 #define BLOCK_OPT_COMPAT_LEVEL      "compat"
 #define BLOCK_OPT_LAZY_REFCOUNTS    "lazy_refcounts"
+#define BLOCK_OPT_DEDUP             "dedup"
 
 typedef struct BdrvTrackedRequest BdrvTrackedRequest;
 
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 16/24] qcow2: Allow creation of images using deduplication.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (14 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 15/24] block: Add dedup image create option Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 17/24] qcow2: Behave correctly when refcount reach 0 or 2^16 Benoît Canet
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

todo: Change qemu-img output so it reflects the dedup cluster size.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2.c |   94 +++++++++++++++++++++++++++++++++++++++++++++++++--------
 block/qcow2.h |    2 ++
 2 files changed, 84 insertions(+), 12 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 76d2340..e641049 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -277,6 +277,11 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
     return qcow2_add_feature(bs, QCOW2_INCOMPAT_DIRTY);
 }
 
+static int qcow2_activate_dedup(BlockDriverState *bs)
+{
+    return qcow2_add_feature(bs, QCOW2_INCOMPAT_DEDUP);
+}
+
 /*
  * Clears an incompatible feature bit and flushes before if necessary.
  * Only call this function when there are no pending requests, it does not
@@ -913,6 +918,11 @@ static void qcow2_close(BlockDriverState *bs)
     BDRVQcowState *s = bs->opaque;
     g_free(s->l1_table);
 
+    if (s->has_dedup) {
+        qcow2_cache_flush(bs, s->dedup_cluster_cache);
+        qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+    }
+
     qcow2_cache_flush(bs, s->l2_table_cache);
     qcow2_cache_flush(bs, s->refcount_block_cache);
 
@@ -1231,7 +1241,8 @@ static int preallocate(BlockDriverState *bs)
 static int qcow2_create2(const char *filename, int64_t total_size,
                          const char *backing_file, const char *backing_format,
                          int flags, size_t cluster_size, int prealloc,
-                         QEMUOptionParameter *options, int version)
+                         QEMUOptionParameter *options, int version,
+                         bool dedup)
 {
     /* Calculate cluster_bits */
     int cluster_bits;
@@ -1258,8 +1269,10 @@ static int qcow2_create2(const char *filename, int64_t total_size,
      * size for any qcow2 image.
      */
     BlockDriverState* bs;
+    BDRVQcowState *s;
     QCowHeader header;
-    uint8_t* refcount_table;
+    uint8_t *tables;
+    int size;
     int ret;
 
     ret = bdrv_create_file(filename, options);
@@ -1301,10 +1314,11 @@ static int qcow2_create2(const char *filename, int64_t total_size,
         goto out;
     }
 
-    /* Write an empty refcount table */
-    refcount_table = g_malloc0(cluster_size);
-    ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
-    g_free(refcount_table);
+    /* Write an empty refcount table + extra space for dedup table if needed */
+    size = cluster_size * (dedup ? 2 : 1);
+    tables = g_malloc0(size);
+    ret = bdrv_pwrite(bs, cluster_size, tables, size);
+    g_free(tables);
 
     if (ret < 0) {
         goto out;
@@ -1325,7 +1339,8 @@ static int qcow2_create2(const char *filename, int64_t total_size,
         goto out;
     }
 
-    ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
+    size += cluster_size;
+    ret = qcow2_alloc_clusters(bs, size);
     if (ret < 0) {
         goto out;
 
@@ -1335,11 +1350,32 @@ static int qcow2_create2(const char *filename, int64_t total_size,
     }
 
     /* Okay, now that we have a valid image, let's give it the right size */
+    s = bs->opaque;
+    size = (total_size + (dedup ? s->cluster_sectors : 0)) * BDRV_SECTOR_SIZE;
     ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
     if (ret < 0) {
         goto out;
     }
 
+    if (dedup) {
+        s->has_dedup = true;
+        s->dedup_table_offset = cluster_size * 2;
+        s->dedup_table_size = cluster_size / sizeof(uint64_t);
+
+        ret = qcow2_activate_dedup(bs);
+        if (ret < 0) {
+            goto out;
+        }
+
+        ret = qcow2_update_header(bs);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* minimal init */
+        s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+    }
+
     /* Want a backing file? There you go.*/
     if (backing_file) {
         ret = bdrv_change_backing_file(bs, backing_file, backing_format);
@@ -1365,15 +1401,30 @@ out:
     return ret;
 }
 
+static int qcow2_warn_if_version_3_is_needed(int version,
+                                             bool has_feature,
+                                             const char *feature)
+{
+    if (version < 3 && has_feature) {
+        fprintf(stderr, "%s only supported with compatibility "
+                "level 1.1 and above (use compat=1.1 or greater)\n",
+                feature);
+        return -EINVAL;
+    }
+    return 0;
+}
+
 static int qcow2_create(const char *filename, QEMUOptionParameter *options)
 {
     const char *backing_file = NULL;
     const char *backing_fmt = NULL;
     uint64_t sectors = 0;
     int flags = 0;
+    int ret;
     size_t cluster_size = DEFAULT_CLUSTER_SIZE;
     int prealloc = 0;
     int version = 2;
+    bool dedup = false;
 
     /* Read out options */
     while (options && options->name) {
@@ -1411,24 +1462,38 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
             }
         } else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) {
             flags |= options->value.n ? BLOCK_FLAG_LAZY_REFCOUNTS : 0;
+        } else if (!strcmp(options->name, BLOCK_OPT_DEDUP)) {
+            dedup = options->value.n ? true : false;
         }
         options++;
     }
 
+    if (dedup) {
+        cluster_size = 4096;
+    }
+
     if (backing_file && prealloc) {
         fprintf(stderr, "Backing file and preallocation cannot be used at "
             "the same time\n");
         return -EINVAL;
     }
 
-    if (version < 3 && (flags & BLOCK_FLAG_LAZY_REFCOUNTS)) {
-        fprintf(stderr, "Lazy refcounts only supported with compatibility "
-                "level 1.1 and above (use compat=1.1 or greater)\n");
-        return -EINVAL;
+    ret = qcow2_warn_if_version_3_is_needed(version,
+                                            flags & BLOCK_FLAG_LAZY_REFCOUNTS,
+                                            "Lazy refcounts");
+    if (ret < 0) {
+        return ret;
+    }
+    ret = qcow2_warn_if_version_3_is_needed(version,
+                                            dedup,
+                                            "Deduplication");
+    if (ret < 0) {
+        return ret;
     }
 
     return qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
-                         cluster_size, prealloc, options, version);
+                         cluster_size, prealloc, options, version,
+                         dedup);
 }
 
 static int qcow2_make_empty(BlockDriverState *bs)
@@ -1731,6 +1796,11 @@ static QEMUOptionParameter qcow2_create_options[] = {
         .type = OPT_FLAG,
         .help = "Postpone refcount updates",
     },
+    {
+        .name = BLOCK_OPT_DEDUP,
+        .type = OPT_FLAG,
+        .help = "Live deduplication",
+    },
     { NULL }
 };
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 9d08bf9..90dcdd9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -56,6 +56,8 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define REFCOUNT_CACHE_SIZE 4
 
+#define DEDUP_CACHE_SIZE 4
+
 #define DEFAULT_CLUSTER_SIZE 65536
 
 /* indicate that the hash structure is empty and miss offset */
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 17/24] qcow2: Behave correctly when refcount reach 0 or 2^16.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (15 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 16/24] qcow2: Allow creation of images using deduplication Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 18/24] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

When the refcount reaches zero we destroy the hash on disk and remove it from
the GTree. When the refcount is at its maximum value we mark the hash so it
won't be loaded at next startup and remove it from the GTree.
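
The two limits can be sketched in standalone C as follows. This is only an
illustration of the rule described above, not the patch itself: the entry
type, the in_tree flag (standing in for GTree membership) and every sketch_*
name are hypothetical, and the 32-byte hash length is an assumption. The
flag bit mirrors QCOW_FLAG_MAX_REFCOUNT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HASH_LENGTH 32                 /* assumption */
#define SKETCH_FLAG_MAX_REFCOUNT (1ULL << 61) /* same bit as the patch */
#define SKETCH_REFCOUNT_MAX 0xffff

typedef struct {
    uint8_t hash[SKETCH_HASH_LENGTH];
    uint64_t first_logical_offset;
    bool in_tree; /* stands in for membership in the GTrees */
} SketchHashEntry;

/* Apply the rule when a cluster's refcount changes. */
static void sketch_on_refcount_change(SketchHashEntry *e, uint16_t refcount)
{
    if (refcount == 0) {
        /* cluster is free: wipe the hash and forget it */
        memset(e->hash, 0, SKETCH_HASH_LENGTH);
        e->in_tree = false;
    } else if (refcount == SKETCH_REFCOUNT_MAX) {
        /* cluster cannot take more references: stop offering it */
        e->first_logical_offset |= SKETCH_FLAG_MAX_REFCOUNT;
        e->in_tree = false;
    }
}

/* Startup filter: skip null hashes and hashes marked as maxed out. */
static bool sketch_should_load(const SketchHashEntry *e)
{
    static const uint8_t null_hash[SKETCH_HASH_LENGTH];

    return memcmp(e->hash, null_hash, SKETCH_HASH_LENGTH) != 0 &&
           !(e->first_logical_offset & SKETCH_FLAG_MAX_REFCOUNT);
}
```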

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c    |   46 ++++++++++++++++++++++++++++++++++++++++++++--
 block/qcow2-refcount.c |    6 ++++++
 block/qcow2.h          |    6 ++++++
 3 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 80e9477..097d71b 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -743,8 +743,9 @@ void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque)
             error_report("Failed to load deduplication hash.");
         }
 
-        /* if the hash is not null load it into the tree */
-        if (memcmp(hash, null_hash, HASH_LENGTH)) {
+        /* load the hash into the tree unless it is null or refcount-maxed */
+        if (memcmp(hash, null_hash, HASH_LENGTH) &&
+            !(first_logical_offset & QCOW_FLAG_MAX_REFCOUNT)) {
             hash_node = qcow2_dedup_build_qcow_hash_node(hash,
                                                          i * s->cluster_sectors,
                                                          first_logical_offset);
@@ -809,3 +810,44 @@ void qcow2_dedup_close(BlockDriverState *bs)
     BDRVQcowState *s = bs->opaque;
     g_free(s->dedup_table);
 }
+
+static void qcow2_dedup_refcount_limit_reached(BlockDriverState *bs,
+                                        uint64_t cluster_index,
+                                        bool bottom)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowHashNode *hash_node;
+    uint64_t offset = cluster_index * s->cluster_sectors;
+
+    hash_node = g_tree_lookup(s->dedup_tree_by_offset, &offset);
+    if (!hash_node) {
+        return;
+    }
+
+    if (bottom) {
+        /* clear the hash from disk */
+        memset(hash_node->hash, 0, HASH_LENGTH);
+    } else {
+        /* mark this hash so we won't load it anymore at startup */
+        hash_node->first_logical_offset |= QCOW_FLAG_MAX_REFCOUNT;
+    }
+    qcow2_dedup_read_write_hash(bs,
+                                &hash_node->hash,
+                                &hash_node->first_logical_offset,
+                                hash_node->offset,
+                                true);
+    g_tree_remove(s->dedup_tree_by_offset, &hash_node->offset);
+    g_tree_remove(s->dedup_tree_by_hash, hash_node->hash);
+}
+
+void qcow2_dedup_refcount_zero_reached(BlockDriverState *bs,
+                                       uint64_t cluster_index)
+{
+    qcow2_dedup_refcount_limit_reached(bs, cluster_index, true);
+}
+
+void qcow2_dedup_refcount_max_reached(BlockDriverState *bs,
+                                      uint64_t cluster_index)
+{
+    qcow2_dedup_refcount_limit_reached(bs, cluster_index, false);
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7681001..efc1179 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -492,6 +492,12 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             ret = -EINVAL;
             goto fail;
         }
+        if (s->has_dedup && refcount == 0) {
+            qcow2_dedup_refcount_zero_reached(bs, cluster_index);
+        }
+        if (s->has_dedup && refcount == 0xffff) {
+            qcow2_dedup_refcount_max_reached(bs, cluster_index);
+        }
         if (refcount == 0 && cluster_index < s->free_cluster_index) {
             s->free_cluster_index = cluster_index;
         }
diff --git a/block/qcow2.h b/block/qcow2.h
index 90dcdd9..8852696 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -60,6 +60,8 @@
 
 #define DEFAULT_CLUSTER_SIZE 65536
 
+/* indicate that this cluster refcount has reached its maximum value */
+#define QCOW_FLAG_MAX_REFCOUNT (1LL << 61)
 /* indicate that the hash structure is empty and miss offset */
 #define QCOW_FLAG_EMPTY   (1LL << 62)
 /* indicate that the cluster for this hash has QCOW_OFLAG_COPIED on disk */
@@ -409,5 +411,9 @@ int qcow2_dedup_grow_table(BlockDriverState *bs,
                            bool exact_size);
 int qcow2_dedup_init(BlockDriverState *bs);
 void qcow2_dedup_close(BlockDriverState *bs);
+void qcow2_dedup_refcount_zero_reached(BlockDriverState *bs,
+                                       uint64_t cluster_index);
+void qcow2_dedup_refcount_max_reached(BlockDriverState *bs,
+                                      uint64_t cluster_index);
 
 #endif
-- 
1.7.10.4

* [Qemu-devel] [RFC V3 18/24] qcow2: Integrate deduplication in qcow2_co_writev loop.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (16 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 17/24] qcow2: Behave correctly when refcount reach 0 or 2^16 Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 19/24] qcow2: Add verification of dedup table Benoît Canet
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2.c |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e641049..d5f28dd 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -330,6 +330,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
     QCowHeader header;
     uint64_t ext_end;
 
+    s->has_dedup = false;
     ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
     if (ret < 0) {
         goto fail;
@@ -812,11 +813,19 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
     QEMUIOVector hd_qiov;
     uint64_t bytes_done = 0;
     uint8_t *cluster_data = NULL;
+    uint8_t *dedup_cluster_data = NULL;
+    uint8_t *next_call_first_hash;
+    int dedup_cluster_data_nr;
+    int deduped_sectors_nr;
+    int skip_before_dedup_clusters_nr;
+    int next_non_dedupable_sectors_nr;
+    UndedupableHashes u;
     QCowL2Meta l2meta = {
         .nb_clusters = 0,
         .oflag_copied = true,
         .overwrite = false,
     };
+    QTAILQ_INIT(&u.undedupable_hashes);
 
     trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
                                  remaining_sectors);
@@ -829,11 +838,67 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
 
     qemu_co_mutex_lock(&s->lock);
 
+    if (s->has_dedup) {
+        /* if deduplication is on we make sure dedup_cluster_data
+         * contains a multiple of cluster size of data in order
+         * to compute the hashes
+         */
+        ret = qcow2_dedup_read_missing_and_concatenate(bs,
+                                                       qiov,
+                                                       sector_num,
+                                                       remaining_sectors,
+                                                       &dedup_cluster_data,
+                                                       &dedup_cluster_data_nr);
+
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    next_call_first_hash = NULL;
+    next_non_dedupable_sectors_nr = 0;
+    skip_before_dedup_clusters_nr = 0;
     while (remaining_sectors != 0) {
 
         trace_qcow2_writev_start_part(qemu_coroutine_self());
+
+        if (s->has_dedup && next_non_dedupable_sectors_nr == 0) {
+            /* Try to deduplicate as many clusters as possible */
+            deduped_sectors_nr = qcow2_dedup(bs,
+                                             &u,
+                                             sector_num,
+                                             dedup_cluster_data,
+                                             dedup_cluster_data_nr,
+                                             &skip_before_dedup_clusters_nr,
+                                             &next_non_dedupable_sectors_nr,
+                                             &next_call_first_hash);
+
+            remaining_sectors -= deduped_sectors_nr;
+            sector_num += deduped_sectors_nr;
+            bytes_done += deduped_sectors_nr * 512;
+
+            /* no more data to write -> exit
+             * Can be < 0 because of the presence of sectors we read in
+             * qcow2_dedup_read_missing_and_concatenate.
+             */
+            if (next_non_dedupable_sectors_nr <= 0) {
+                goto fail;
+            }
+
+            /* if we deduped something trace it */
+            if (deduped_sectors_nr) {
+                trace_qcow2_writev_done_part(qemu_coroutine_self(),
+                                             deduped_sectors_nr);
+                trace_qcow2_writev_start_part(qemu_coroutine_self());
+            }
+        }
+
         index_in_cluster = sector_num & (s->cluster_sectors - 1);
-        n_end = index_in_cluster + remaining_sectors;
+        n_end = s->has_dedup &&
+                next_non_dedupable_sectors_nr < remaining_sectors ?
+                index_in_cluster + next_non_dedupable_sectors_nr :
+                index_in_cluster + remaining_sectors;
+
         if (s->crypt_method &&
             n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) {
             n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors;
@@ -875,6 +940,23 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
                 cur_nr_sectors * 512);
         }
 
+        /* Write the non duplicated clusters hashes to disk */
+        if (s->has_dedup) {
+            int count = cur_nr_sectors / s->cluster_sectors;
+            int has_ending = ((cluster_offset >> 9) + index_in_cluster +
+                             cur_nr_sectors) & (s->cluster_sectors - 1);
+            count = index_in_cluster ? count + 1 : count;
+            count = has_ending ? count + 1 : count;
+            ret = qcow2_dedup_write_new_hashes(bs,
+                                               &u,
+                                               count,
+                                               sector_num,
+                                               (cluster_offset >> 9));
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+
         BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
         qemu_co_mutex_unlock(&s->lock);
         trace_qcow2_writev_data(qemu_coroutine_self(),
@@ -894,6 +976,7 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
 
         run_dependent_requests(s, &l2meta);
 
+        next_non_dedupable_sectors_nr -= cur_nr_sectors;
         remaining_sectors -= cur_nr_sectors;
         sector_num += cur_nr_sectors;
         bytes_done += cur_nr_sectors * 512;
@@ -908,6 +991,7 @@ fail:
 
     qemu_iovec_destroy(&hd_qiov);
     qemu_vfree(cluster_data);
+    qemu_vfree(dedup_cluster_data);
     trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
 
     return ret;
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 19/24] qcow2: Add verification of dedup table.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (17 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 18/24] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 20/24] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-refcount.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index efc1179..16aa8a2 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1168,6 +1168,14 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
         goto fail;
     }
 
+    if (s->has_dedup) {
+        ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
+                                 s->dedup_table_offset, s->dedup_table_size, 0);
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
     /* snapshots */
     for(i = 0; i < s->nb_snapshots; i++) {
         sn = s->snapshots + i;
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 20/24] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (18 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 19/24] qcow2: Add verification of dedup table Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 21/24] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-refcount.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 16aa8a2..9b64e37 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1013,7 +1013,14 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                         PRIx64 ": %s\n", l2_entry, strerror(-refcount));
                     goto fail;
                 }
-                if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
+                if (!s->has_dedup &&
+                    (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
+                    fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
+                        PRIx64 " refcount=%d\n", l2_entry, refcount);
+                    res->corruptions++;
+                }
+                if (s->has_dedup && refcount > 1 &&
+                    ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
                     fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
                         PRIx64 " refcount=%d\n", l2_entry, refcount);
                     res->corruptions++;
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 21/24] qcow2: Add check_dedup_l2 in order to check l2 of dedup table.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (19 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 20/24] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 22/24] qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED Benoît Canet
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-refcount.c |   65 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 56 insertions(+), 9 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 9b64e37..0cf4ad7 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1057,6 +1057,43 @@ fail:
     return -EIO;
 }
 
+static int check_dedup_l2(BlockDriverState *bs, BdrvCheckResult *res,
+                          int64_t l2_offset)
+{
+    BDRVQcowState *s = bs->opaque;
+    uint64_t *l2_table;
+    int i, l2_size;
+
+    /* Read L2 table from disk */
+    l2_size = s->cluster_size;
+    l2_table = g_malloc(l2_size);
+
+    if (bdrv_pread(bs->file, l2_offset, l2_table, l2_size) != l2_size) {
+        goto fail;
+    }
+
+    /* Do the actual checks */
+    for (i = 0; i < (s->l2_size - 5); i += 5) {
+        uint64_t first_logical_offset = be64_to_cpu(l2_table[i + 4]) &
+                                        ~QCOW_FLAG_FIRST;
+        if (first_logical_offset > (bs->total_sectors * BDRV_SECTOR_SIZE)) {
+            fprintf(stderr, "ERROR: l2 deduplication first_logical_offset"
+                    "=%" PRIi64 " outside of deduplicated volume in l2 table "
+                    "with offset %" PRIi64 ".\n", first_logical_offset,
+                    l2_offset);
+            res->corruptions++;
+        }
+    }
+
+    g_free(l2_table);
+    return 0;
+
+fail:
+    fprintf(stderr, "ERROR: I/O error in check_dedup_l2\n");
+    g_free(l2_table);
+    return -EIO;
+}
+
 /*
  * Increases the refcount for the L1 table, its L2 tables and all referenced
  * clusters in the given refcount table. While doing so, performs some checks
@@ -1070,7 +1107,8 @@ static int check_refcounts_l1(BlockDriverState *bs,
                               uint16_t *refcount_table,
                               int refcount_table_size,
                               int64_t l1_table_offset, int l1_size,
-                              int check_copied)
+                              int check_copied,
+                              bool dedup)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t *l1_table, l2_offset, l1_size2;
@@ -1126,11 +1164,19 @@ static int check_refcounts_l1(BlockDriverState *bs,
                 res->corruptions++;
             }
 
-            /* Process and check L2 entries */
-            ret = check_refcounts_l2(bs, res, refcount_table,
-                refcount_table_size, l2_offset, check_copied);
-            if (ret < 0) {
-                goto fail;
+            if (dedup) {
+                /* Process and check dedup l2 entries */
+                ret = check_dedup_l2(bs, res, l2_offset);
+                if (ret < 0) {
+                    goto fail;
+                }
+            } else {
+                /* Process and check L2 entries */
+                ret = check_refcounts_l2(bs, res, refcount_table,
+                    refcount_table_size, l2_offset, check_copied);
+                if (ret < 0) {
+                    goto fail;
+                }
             }
         }
     }
@@ -1170,14 +1216,15 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 
     /* current L1 table */
     ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
-                       s->l1_table_offset, s->l1_size, 1);
+                       s->l1_table_offset, s->l1_size, 1, false);
     if (ret < 0) {
         goto fail;
     }
 
     if (s->has_dedup) {
         ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
-                                 s->dedup_table_offset, s->dedup_table_size, 0);
+                                 s->dedup_table_offset, s->dedup_table_size,
+                                 0, true);
         if (ret < 0) {
             goto fail;
         }
@@ -1187,7 +1234,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
     for(i = 0; i < s->nb_snapshots; i++) {
         sn = s->snapshots + i;
         ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
-            sn->l1_table_offset, sn->l1_size, 0);
+            sn->l1_table_offset, sn->l1_size, 0, false);
         if (ret < 0) {
             goto fail;
         }
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 22/24] qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (20 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 21/24] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 23/24] qcow2: init and cleanup deduplication Benoît Canet
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

In the case of a race between two writes, an L2 entry can be written without
QCOW_OFLAG_COPIED before the first write fills it.
This patch simply checks whether the L2 entry already holds the correct offset
without QCOW_OFLAG_COPIED and, if so, leaves it untouched.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-cluster.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8db1b2a..0042742 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -699,6 +699,10 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
 
     for (i = 0; i < m->nb_clusters; i++) {
+        if (be64_to_cpu(l2_table[l2_index + i]) ==
+            (cluster_offset + (i << s->cluster_bits))) {
+            continue;
+        }
         /* if two concurrent writes happen to the same unallocated cluster
 	 * each write allocates separate cluster and writes data concurrently.
 	 * The first one to complete updates l2 table with pointer to its
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 23/24] qcow2: init and cleanup deduplication.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (21 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 22/24] qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 24/24] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 block/qcow2-dedup.c |   71 +++++++++++++++++++++++++++++++++++++++++++++++----
 block/qcow2.c       |   16 +++++++++---
 2 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 097d71b..cdd1b83 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -795,20 +795,81 @@ int qcow2_dedup_grow_table(BlockDriverState *bs,
                                "dedup");
 }
 
+static gint qcow2_dedup_compare_by_hash(gconstpointer a,
+                                        gconstpointer b,
+                                        gpointer data)
+{
+    uint8_t *hash_a = (uint8_t *) a;
+    uint8_t *hash_b = (uint8_t *) b;
+    return memcmp(hash_a, hash_b, HASH_LENGTH);
+}
+
+static void qcow2_dedup_destroy_qcow_hash_node(gpointer p)
+{
+    QCowHashNode *hash_node = (QCowHashNode *) p;
+    g_free(hash_node->hash);
+    g_free(hash_node);
+}
+
+static gint qcow2_dedup_compare_by_offset(gconstpointer a,
+                                          gconstpointer b,
+                                          gpointer data)
+{
+    uint64_t offset_a = *((uint64_t *) a);
+    uint64_t offset_b = *((uint64_t *) b);
+
+    if (offset_a > offset_b) {
+        return 1;
+    }
+    if (offset_a < offset_b) {
+        return -1;
+    }
+    return 0;
+}
+
 int qcow2_dedup_init(BlockDriverState *bs)
 {
     BDRVQcowState *s = bs->opaque;
-    return qcow2_do_table_init(bs,
-                               &s->dedup_table,
-                               s->dedup_table_offset,
-                               s->dedup_table_size,
-                               false);
+    Coroutine *co;
+    int ret;
+
+    s->has_dedup = true;
+    s->dedup_tree_by_hash = g_tree_new_full(qcow2_dedup_compare_by_hash, NULL,
+                                            NULL,
+                                            qcow2_dedup_destroy_qcow_hash_node);
+    s->dedup_tree_by_offset = g_tree_new_full(qcow2_dedup_compare_by_offset,
+                                              NULL, NULL, NULL);
+
+    s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+
+    ret = qcow2_do_table_init(bs,
+                              &s->dedup_table,
+                              s->dedup_table_offset,
+                              s->dedup_table_size,
+                              false);
+
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* load the hashes asynchronously */
+    co = qemu_coroutine_create(qcow2_co_load_dedup_hashes);
+    qemu_coroutine_enter(co, bs);
+    return 0;
+
+fail:
+    qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+    return ret;
 }
 
 void qcow2_dedup_close(BlockDriverState *bs)
 {
     BDRVQcowState *s = bs->opaque;
+    qcow2_cache_flush(bs, s->dedup_cluster_cache);
+    qcow2_cache_destroy(bs, s->dedup_cluster_cache);
     g_free(s->dedup_table);
+    g_tree_destroy(s->dedup_tree_by_offset);
+    g_tree_destroy(s->dedup_tree_by_hash);
 }
 
 static void qcow2_dedup_refcount_limit_reached(BlockDriverState *bs,
diff --git a/block/qcow2.c b/block/qcow2.c
index d5f28dd..566cf1f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -534,6 +534,13 @@ static int qcow2_open(BlockDriverState *bs, int flags)
         }
     }
 
+    if (s->incompatible_features & QCOW2_INCOMPAT_DEDUP) {
+        ret = qcow2_dedup_init(bs);
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
 #ifdef DEBUG_ALLOC
     {
         BdrvCheckResult result = {0};
@@ -1000,11 +1007,11 @@ fail:
 static void qcow2_close(BlockDriverState *bs)
 {
     BDRVQcowState *s = bs->opaque;
+
     g_free(s->l1_table);
 
     if (s->has_dedup) {
-        qcow2_cache_flush(bs, s->dedup_cluster_cache);
-        qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+        qcow2_dedup_close(bs);
     }
 
     qcow2_cache_flush(bs, s->l2_table_cache);
@@ -1457,7 +1464,10 @@ static int qcow2_create2(const char *filename, int64_t total_size,
         }
 
         /* minimal init */
-        s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+        ret = qcow2_dedup_init(bs);
+        if (ret < 0) {
+            goto out;
+        }
     }
 
     /* Want a backing file? There you go.*/
-- 
1.7.10.4


* [Qemu-devel] [RFC V3 24/24] qemu-iotests: Filter dedup=on/off so existing tests don't break.
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (22 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 23/24] qcow2: init and cleanup deduplication Benoît Canet
@ 2012-11-26 13:05 ` Benoît Canet
  2012-12-11 14:19 ` [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Stefan Hajnoczi
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-11-26 13:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha

Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
 tests/qemu-iotests/common.rc |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d534e94..411f135 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -114,7 +114,8 @@ _make_test_img()
             -e "s# compat='[^']*'##g" \
             -e "s# compat6=\\(on\\|off\\)##g" \
             -e "s# static=\\(on\\|off\\)##g" \
-            -e "s# lazy_refcounts=\\(on\\|off\\)##g"
+            -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+            -e "s# dedup=\\(on\\|off\\)##g"
 }
 
 _cleanup_test_img()
-- 
1.7.10.4


* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
@ 2012-12-11 11:28   ` Stefan Hajnoczi
  2012-12-11 11:32   ` Stefan Hajnoczi
  2012-12-11 23:03   ` Eric Blake
  2 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 11:28 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:00PM +0100, Benoît Canet wrote:
> Signed-off-by: Benoit Canet <benoit@irqsave.net>
> ---
>  docs/specs/qcow2.txt |   33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 36a559d..16eafd7 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -80,7 +80,10 @@ in the description of a field.
>                                  tables to repair refcounts before accessing the
>                                  image.
>  
> -                    Bits 1-63:  Reserved (set to 0)
> +                    Bit 1:      Deduplication bit.  If this bit is set then
> +                                deduplication is used on this image.
> +
> +                    Bits 2-63:  Reserved (set to 0)
>  
>           80 -  87:  compatible_features
>                      Bitmask of compatible features. An implementation can

This bit prevents programs that don't support dedup from opening the
image file.  What are the restrictions really - can a program without
dedup support read the file?  Can it write to the file (invalidating the
dedup table)?

> @@ -116,6 +119,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0xCD8E819B - Deduplication
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -159,6 +163,33 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Deduplication ==
> +
> +The deduplication extension contains the offset and size of the deduplication
> +table.
> +
> +    Byte   0 - 7:   Offset
> +
> +          8 - 11:   Size

Units?

> +
> +== Deduplication table ==

Before going into the layout please summarize the point of this table:

The deduplication table maps a physical offset to a data hash and
logical offset.  ...

> +The deduplication table contains 64 bits offsets to the level 2 deduplication
> +table clusters.
> +Each entry of these clusters contains a 32 bytes SHA256 hash followed by the
> +64 bits logical offset of the first encountered block having this hash.

At this point a diagram showing L1, L2, and dedup table entry would
help.

Or perhaps the entry structure can be presented like other structures in
this spec to reduce the amount of English description and use a more
formal reference:

Each L2 deduplication table entry has the following structure:

    Byte  0 - 31:   SHA256 hash of data cluster

         32 - 39:   Logical offset of first encountered block having
                    this hash

> +Entries in the deduplication table are ordered by physical cluster index.
> +
> +The number of entries in an l2 deduplication table cluster is :
> +l2_dedup_cluster_entries = cluster_size / (32 + 8)
> +
> +The index in the level 1 deduplication table is :
> +l1_dedup_index = physical_cluster_index / l2_dedup_cluster_entries
> +
> +The index in the level 2 deduplication table is:
> +l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries
> +
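
To make the index arithmetic above concrete, here is a minimal sketch in C.
It only assumes what the spec text states: 40-byte entries (a 32-byte hash
plus a 64-bit offset) packed into cluster-sized level 2 tables.  The names
DedupIndexSketch, dedup_l2_entries_sketch and dedup_index_sketch are
hypothetical and do not appear anywhere in the series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HASH_LENGTH 32                        /* SHA256 digest size */
#define SKETCH_ENTRY_SIZE  (SKETCH_HASH_LENGTH + 8)  /* hash + 64-bit offset */

/* Hypothetical helper type: position of an entry in the two-level table. */
typedef struct {
    uint64_t l1_index;  /* index into the level 1 deduplication table */
    uint64_t l2_index;  /* entry index inside the level 2 cluster */
} DedupIndexSketch;

/* Number of whole entries that fit in one level 2 deduplication cluster. */
static uint64_t dedup_l2_entries_sketch(uint32_t cluster_size)
{
    return cluster_size / SKETCH_ENTRY_SIZE;
}

/* Split a physical cluster index into its level 1 and level 2 indexes. */
static DedupIndexSketch dedup_index_sketch(uint64_t physical_cluster_index,
                                           uint32_t cluster_size)
{
    uint64_t entries = dedup_l2_entries_sketch(cluster_size);
    DedupIndexSketch idx;

    idx.l1_index = physical_cluster_index / entries;
    idx.l2_index = physical_cluster_index % entries;
    return idx;
}
```

With the default 64 KiB cluster size this gives 1638 entries per level 2
cluster, leaving 16 bytes unused at the end of each cluster, which the spec
could mention explicitly.
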
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> -- 
> 1.7.10.4
> 
> 


* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
  2012-12-11 11:28   ` Stefan Hajnoczi
@ 2012-12-11 11:32   ` Stefan Hajnoczi
  2012-12-12 15:57     ` Benoît Canet
  2012-12-11 23:03   ` Eric Blake
  2 siblings, 1 reply; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 11:32 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:00PM +0100, Benoît Canet wrote:
> +== Deduplication table ==
> +
> +The deduplication table contains 64 bits offsets to the level 2 deduplication
> +table clusters.
> +Each entry of these clusters contains a 32 bytes SHA256 hash followed by the
> +64 bits logical offset of the first encountered block having this hash.

Can you foresee the need to use a different hash algorithm in the future
and should we add a hash_algo enum field to the dedup QCOW2 header
extension?
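
A rough sketch of what I have in mind, under loud assumptions: bytes 0 - 7
(offset) and 8 - 11 (size) are taken from the patch, while the hash_algo
byte at offset 12, the enum value and every identifier below are
hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm identifiers; only SHA256 exists in the series. */
enum {
    DEDUP_HASH_ALGO_SHA256_SKETCH = 0,
};

/* Hypothetical in-memory form of the deduplication header extension. */
typedef struct {
    uint64_t table_offset;  /* bytes 0 - 7 of the extension */
    uint32_t table_size;    /* bytes 8 - 11 of the extension */
    uint8_t  hash_algo;     /* byte 12: assumed new field */
} DedupExtSketch;

/* Read an n-byte big-endian integer, as qcow2 stores header fields. */
static uint64_t load_be_sketch(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Returns 0 on success, -1 if too short, -2 on an unknown hash algorithm. */
static int dedup_ext_parse_sketch(const uint8_t *buf, size_t len,
                                  DedupExtSketch *ext)
{
    if (len < 13) {
        return -1;
    }
    ext->table_offset = load_be_sketch(buf, 8);
    ext->table_size = (uint32_t) load_be_sketch(buf + 8, 4);
    ext->hash_algo = buf[12];
    if (ext->hash_algo != DEDUP_HASH_ALGO_SHA256_SKETCH) {
        return -2;  /* refuse images hashed with an algorithm we lack */
    }
    return 0;
}
```

Rejecting unknown values at open time would let a future algorithm be added
without another incompatible feature bit.
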

Stefan


* Re: [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields Benoît Canet
@ 2012-12-11 11:34   ` Stefan Hajnoczi
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 11:34 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:01PM +0100, Benoît Canet wrote:
> diff --git a/block/qcow2.h b/block/qcow2.h
> index b4eb654..e192001 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -58,6 +58,23 @@
>  
>  #define DEFAULT_CLUSTER_SIZE 65536
>  
> +/* deduplication node */
> +typedef struct {
> +    uint8_t *hash;         /* 32 bytes hash of a given cluster */

Pointer to the hash value instead of storing the value inline?  At this
point in the series I'm not sure yet why it's not stored inline.  That
way we'd avoid a 4- or 8-byte pointer to a separately allocated 32-byte
blob.  Maybe there is a reason later on...
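
For illustration, an inline variant could look like the sketch below.  It is
not the QCowHashNode from the patch: the struct name, the single offset
field and both helpers are hypothetical, and plain malloc stands in for
g_malloc to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_HASH_LENGTH 32

/* Hypothetical node storing the hash inline instead of behind a pointer. */
typedef struct {
    uint8_t  hash[SKETCH_HASH_LENGTH];  /* first member: node doubles as key */
    uint64_t offset;                    /* assumed payload field */
} HashNodeInlineSketch;

/* One allocation per node; returns NULL if the allocation fails. */
static HashNodeInlineSketch *hash_node_new_sketch(const uint8_t *hash,
                                                  uint64_t offset)
{
    HashNodeInlineSketch *node = malloc(sizeof(*node));

    if (!node) {
        return NULL;
    }
    memcpy(node->hash, hash, SKETCH_HASH_LENGTH);
    node->offset = offset;
    return node;
}

/* Orders nodes by hash bytes, like a memcmp-based tree comparator. */
static int hash_node_compare_sketch(const void *a, const void *b)
{
    const HashNodeInlineSketch *na = a;
    const HashNodeInlineSketch *nb = b;

    return memcmp(na->hash, nb->hash, SKETCH_HASH_LENGTH);
}
```

Each node then needs one allocation and one free instead of two, and the
destroy callback reduces to a plain free of the node.
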

Stefan


* Re: [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
@ 2012-12-11 11:52   ` Stefan Hajnoczi
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 11:52 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:02PM +0100, Benoît Canet wrote:
> +/**
> + * Read some data from the QCOW2 file
> + *
> + * @data:       the buffer where the data must be stored
> + * @sector_num: the sector number to read in the QCOW2 file
> + * @nb_sectors: the number of sectors to read
> + * @ret:        negative on error

Dropping s->lock is important information to document - it means things
can change by the time this function returns and the caller needs to be
prepared.

Perhaps this function can be moved to qcow2.c and given a generic name.
It does nothing dedup-specific.

> + */
> +static int qcow2_dedup_read_missing_cluster_data(BlockDriverState *bs,
> +                                                 uint8_t *data,
> +                                                 uint64_t sector_num,
> +                                                 int nb_sectors)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    QEMUIOVector qiov;
> +    struct iovec iov;
> +    int ret;
> +
> +    iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE;
> +    iov.iov_base = data;
> +    qemu_iovec_init_external(&qiov, &iov, 1);
> +    qemu_co_mutex_unlock(&s->lock);
> +    ret = bdrv_co_readv(bs, sector_num, nb_sectors, &qiov);
> +    qemu_co_mutex_lock(&s->lock);
> +    if (ret < 0) {
> +        error_report("failed to read %d sectors at offset %" PRIu64 "\n",
> +                     nb_sectors, sector_num);
> +    }
> +
> +    return ret;
> +}
> +
> +/*
> + * Prepare a buffer containing all the required data required to compute cluster
> + * sized deduplication hashes.
> + * If sector_num and nb_sectors are unaligned cluster wize it read the missing
> + * data before and after the qiov.

If sector_num or nb_sectors are not cluster-aligned, missing data
before/after the qiov will be read.
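
The arithmetic behind that sentence can be sketched on its own as below.
This is a simplified stand-alone version, not the patch code: the struct and
function names are hypothetical, cluster_sectors is assumed to be a power of
two, and the request is assumed to end at or before total_sectors.  As in
the quoted function, the buffer size is computed before the tail is clamped
to the end of the image.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type describing the cluster-aligned buffer. */
typedef struct {
    uint64_t first_sector;  /* cluster-aligned sector where the buffer starts */
    int head_nr;            /* sectors to read before the qiov */
    int tail_nr;            /* sectors to read after the qiov (clamped) */
    int total_nr;           /* buffer size in sectors, a cluster multiple */
} DedupSpanSketch;

static DedupSpanSketch dedup_span_sketch(uint64_t sector_num, int nb_sectors,
                                         int cluster_sectors,
                                         uint64_t total_sectors)
{
    DedupSpanSketch s;
    uint64_t mask = (uint64_t) (cluster_sectors - 1);
    uint64_t end = sector_num + (uint64_t) nb_sectors;
    int unaligned = (int) (end & mask);
    int pad = unaligned ? cluster_sectors - unaligned : 0;
    uint64_t max_tail = total_sectors - end;

    s.head_nr = (int) (sector_num & mask);
    s.first_sector = sector_num - (uint64_t) s.head_nr;
    s.total_nr = s.head_nr + nb_sectors + pad;
    /* never read past the end of a cluster-unaligned image */
    s.tail_nr = max_tail < (uint64_t) pad ? (int) max_tail : pad;
    return s;
}
```

For example, a 10-sector write at sector 130 with 128-sector clusters needs
2 sectors read before it and 116 after it to fill one whole cluster.
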

> + *
> + * @qiov:               the qiov for which missing data must be read
> + * @sector_num:         the first sectors that must be read into the qiov
> + * @nb_sectors:         the number of sectors to read into the qiov
> + * @data:               the place where the data will be concatenated and stored
> + * @nb_data_sectors:    the resulting size of the concatenated data (in sectors)
> + * @ret:                negative on error
> + */
> +int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
> +                                             QEMUIOVector *qiov,
> +                                             uint64_t sector_num,
> +                                             int nb_sectors,
> +                                             uint8_t **data,
> +                                             int *nb_data_sectors)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    int ret;
> +    uint64_t cluster_beginning_sector;
> +    uint64_t first_sector_after_qiov;
> +    int cluster_beginning_nr;
> +    int cluster_ending_nr;
> +    int unaligned_ending_nr;
> +    uint64_t max_cluster_ending_nr;
> +
> +    /* compute how much and where to read at the beginning */
> +    cluster_beginning_nr = sector_num & (s->cluster_sectors - 1);
> +    cluster_beginning_sector = sector_num - cluster_beginning_nr;
> +
> +    /* for the ending */
> +    first_sector_after_qiov = sector_num + nb_sectors;
> +    unaligned_ending_nr = first_sector_after_qiov & (s->cluster_sectors - 1);
> +    cluster_ending_nr = unaligned_ending_nr ?
> +                        s->cluster_sectors - unaligned_ending_nr : 0;
> +
> +    /* compute total size in sectors and allocate memory */
> +    *nb_data_sectors = cluster_beginning_nr + nb_sectors + cluster_ending_nr;
> +    *data = qemu_blockalign(bs, *nb_data_sectors * BDRV_SECTOR_SIZE);
> +    memset(*data, 0, *nb_data_sectors * BDRV_SECTOR_SIZE);

Is memset necessary since we either read all data or return an error?

> +    /* read beginning */
> +    if (cluster_beginning_nr) {
> +        ret = qcow2_dedup_read_missing_cluster_data(bs,
> +                                                    *data,
> +                                                    cluster_beginning_sector,
> +                                                    cluster_beginning_nr);
> +
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +    }
> +
> +    /* append qiov content */
> +    qemu_iovec_to_buf(qiov, 0, *data + cluster_beginning_nr * BDRV_SECTOR_SIZE,
> +                      qiov->size);
> +
> +    /* Fix cluster_ending_nr if we are at risk of reading outside the image
> +     * (Cluster unaligned image size)
> +     */
> +    max_cluster_ending_nr = bs->total_sectors - first_sector_after_qiov;
> +    cluster_ending_nr = max_cluster_ending_nr < (uint64_t) cluster_ending_nr ?
> +                        (int) max_cluster_ending_nr : cluster_ending_nr;
> +
> +    /* read and add ending */
> +    if (cluster_ending_nr) {
> +        ret = qcow2_dedup_read_missing_cluster_data(bs,
> +                                                    *data +
> +                                                    (cluster_beginning_nr +
> +                                                    nb_sectors) *
> +                                                    BDRV_SECTOR_SIZE,
> +                                                    first_sector_after_qiov,
> +                                                    cluster_ending_nr);
> +
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +    }
> +
> +    return 0;
> +
> +fail:

Is it useful to leave the caller with a failed buffer still allocated?

qemu_vfree(*data);
*data = NULL;
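
The head/tail arithmetic in the quoted function can be checked in isolation. The following is a minimal sketch, not code from the patch: DedupPadding and dedup_compute_padding are hypothetical names, and unlike the patch it computes the total size after clamping the tail to the end of the image.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, introduced only for this sketch. */
typedef struct {
    uint64_t first_sector; /* cluster-aligned start of the padded buffer */
    int head_sectors;      /* sectors to read before the request */
    int tail_sectors;      /* sectors to read after the request */
    int total_sectors;     /* head + request + tail */
} DedupPadding;

/* Compute how many sectors must be read around an unaligned request so
 * that the resulting buffer covers whole clusters.  cluster_sectors must
 * be a power of two; total_sectors is the image size in sectors. */
static DedupPadding dedup_compute_padding(uint64_t sector_num, int nb_sectors,
                                          int cluster_sectors,
                                          uint64_t total_sectors)
{
    DedupPadding p;
    uint64_t mask = (uint64_t)cluster_sectors - 1;
    uint64_t end = sector_num + (uint64_t)nb_sectors;
    uint64_t unaligned_end;
    uint64_t room;

    assert(cluster_sectors > 0 &&
           (cluster_sectors & (cluster_sectors - 1)) == 0);
    assert(nb_sectors >= 0 && end <= total_sectors);

    /* beginning: distance back to the previous cluster boundary */
    p.head_sectors = (int)(sector_num & mask);
    p.first_sector = sector_num - (uint64_t)p.head_sectors;

    /* ending: distance forward to the next cluster boundary */
    unaligned_end = end & mask;
    p.tail_sectors = unaligned_end ?
                     (int)((uint64_t)cluster_sectors - unaligned_end) : 0;

    /* never read past the end of a cluster-unaligned image */
    room = total_sectors - end;
    if (room < (uint64_t)p.tail_sectors) {
        p.tail_sectors = (int)room;
    }

    p.total_sectors = p.head_sectors + nb_sectors + p.tail_sectors;
    return p;
}
```

With 8 sectors per cluster, a 4-sector request at sector 10 needs 2 sectors before and 2 after, giving one full cluster starting at sector 8.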


* Re: [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions Benoît Canet
@ 2012-12-11 13:16   ` Stefan Hajnoczi
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 13:16 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:05PM +0100, Benoît Canet wrote:
> +/*
> + * Compute the hash of a given cluster
> + *
> + * @data: a buffer containing the cluster data
> + * @ret:  a HASH_LENGTH long dynamically allocated array containing the hash
> + */
> +static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
> +                                           uint8_t *data)
> +{
> +    return NULL;
> +}
> +
> +/* Try to find the offset of a given cluster if it's duplicated
> + * Exceptionally we cast return value to int64_t to use as error code.

I don't see an int64_t return value; this comment is probably outdated.

> + *
> + * @data:            a buffer containing the cluster
> + * @skip_cluster_nr: the number of cluster to skip in the buffer
> + * @hash:            if hash is provided it's used else it's computed
> + * @ret:             QCowHashNode of the duplicated cluster or NULL

This function does several things:
1. Allocating and computing *hash if not given.
2. Returning existing dedup_tree_by_hash node or NULL if the node wasn't
   already in the tree.
3. Inserting the node into the tree if not present.

I wonder if it can be simplified or split to do less work.

> +/*
> + * Helper used to link a deduplicated cluster in the l2
> + *
> + * @logical_cluster_offset:  the cluster offset seen by the guest (in sectors)
> + * @physical_cluster_offset: the cluster offset in the QCOW2 file (in sectors)

Perhaps s/_offset/_sect/ because usually offset is in bytes.


* Re: [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
@ 2012-12-11 13:28   ` Stefan Hajnoczi
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 13:28 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:05:07PM +0100, Benoît Canet wrote:
> Signed-off-by: Benoit Canet <benoit@irqsave.net>
> ---
>  Makefile            |    3 +++
>  Makefile.target     |    2 +-
>  block/qcow2-dedup.c |   10 ++++++++--
>  3 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 88285a4..c79b2da 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -168,6 +168,9 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) $(qapi-obj-y) \
>                                qapi-visit.o qapi-types.o
>  qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
>  qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
> +qemu-img$(EXESUF): LIBS+=-lcrypto
> +qemu-nbd$(EXESUF): LIBS+=-lcrypto
> +qemu-io$(EXESUF): LIBS+=-lcrypto
>  
>  qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
>  
> diff --git a/Makefile.target b/Makefile.target
> index 3822bc5..f9a988a 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -119,7 +119,7 @@ obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
>  obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
>  obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
>  obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
> -LIBS+=-lz
> +LIBS+=-lz -lcrypto

Need a ./configure check for openssl?

VNC can already use gnutls so perhaps we should support that?
http://gnutls.org/manual/gnutls.html#Hash-and-HMAC-functions

>  
>  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
> diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
> index 83ad61e..37e8266 100644
> --- a/block/qcow2-dedup.c
> +++ b/block/qcow2-dedup.c
> @@ -25,11 +25,13 @@
>   * THE SOFTWARE.
>   */
>  
> +#include <openssl/sha.h>
> +#include <openssl/evp.h>
>  #include "block_int.h"
>  #include "qemu-common.h"
>  #include "qcow2.h"
>  
> -#define HASH_LENGTH 32
> +#define HASH_LENGTH SHA256_DIGEST_LENGTH
>  
>  static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
>                                         uint8_t **hash,
> @@ -188,7 +190,11 @@ static QCowHashNode *qcow2_dedup_build_qcow_hash_node(uint8_t *hash,
>  static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
>                                             uint8_t *data)
>  {
> -    return NULL;
> +    BDRVQcowState *s = bs->opaque;
> +    uint8_t *hash = g_malloc0(HASH_LENGTH);
> +    EVP_Digest(data, s->cluster_size,
> +               hash, NULL, EVP_sha256(), NULL);
> +    return hash;
>  }

Not sure if it's worth allocating these relatively small objects on the
heap and worrying about their lifecycle.

It's simpler to pass references to the hashes and copy the entire object
to pass ownership.

This function would become:

typedef struct {
    uint8_t data[SHA256_DIGEST_LENGTH];
} QcowHash;

void qcow2_compute_cluster_hash(BlockDriverState *bs, uint8_t *data, QcowHash *hash);

The caller needs to decide whether a stack-allocated variable is
appropriate or if the hash should live inside a QcowHashNode, etc.
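
A minimal sketch of that ownership model, under stated assumptions: toy_cluster_hash is a stand-in FNV-1a-style digest used only so the example needs no crypto library (it is NOT SHA-256), and HashNodeSketch is a hypothetical node type, not the QCowHashNode from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LENGTH 32 /* SHA256_DIGEST_LENGTH in the patch */

/* Fixed-size hash value: can live on the stack or inside another struct. */
typedef struct {
    uint8_t data[HASH_LENGTH];
} QcowHash;

/* Hypothetical tree node that embeds the hash instead of pointing to it. */
typedef struct {
    QcowHash hash;
    uint64_t physical_sect;
} HashNodeSketch;

/* Stand-in digest (assumption, NOT SHA-256): writes into caller storage,
 * so there is nothing to allocate or free. */
static void toy_cluster_hash(const uint8_t *data, size_t len, QcowHash *hash)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */
    size_t i;

    assert(data != NULL && hash != NULL);
    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;          /* FNV-1a prime */
    }
    /* repeat the 64-bit state to fill the 32-byte output */
    for (i = 0; i < HASH_LENGTH; i++) {
        hash->data[i] = (uint8_t)(h >> (8 * (i % 8)));
    }
}

static int qcow_hash_equal(const QcowHash *a, const QcowHash *b)
{
    return memcmp(a->data, b->data, HASH_LENGTH) == 0;
}
```

Passing ownership is then a plain struct assignment such as `node.hash = hash;`, which copies all 32 bytes.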

Stefan


* Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (23 preceding siblings ...)
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 24/24] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
@ 2012-12-11 14:19 ` Stefan Hajnoczi
  2012-12-11 14:38 ` Stefan Hajnoczi
  2012-12-12 16:14 ` Benoît Canet
  26 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 14:19 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha

On Mon, Nov 26, 2012 at 02:04:59PM +0100, Benoît Canet wrote:
> This patchset is the first working version of the QCOW2 deduplication.
> 
> Images must be created with "-o dedup=on" in order to activate the
> deduplication in the image.
> 
> 
> Since v2: make it work barely
>           replace kernel red black trees by gtree.
> 
> Benoît Canet (24):
>   qcow2: Add deduplication to the qcow2 specification.
>   qcow2: Add deduplication structures and fields.
>   qcow2: Add qcow2_dedup_read_missing_and_concatenate
>   qcow2: Make update_cluster_refcount public.
>   qcow2: Create a way to link to l2 tables in dedup.
>   qcow2: Add qcow2_dedup and related functions.
>   qcow2: Add qcow2_dedup_write_new_hashes.
>   qcow2: Implement qcow2_compute_cluster_hash.
>   qcow2: Extract qcow2_dedup_grow_table
>   qcow2: create function to load deduplication hashes at startup.
>   qcow2: Load and save deduplication table header extension.
>   qcow2: Extract qcow2_do_table_init.
>   qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
>   qcow2: Extract qcow2_add_feature and qcow2_remove_feature.
>   block: Add dedup image create option.
>   qcow2: Allow creation of images using deduplication.
>   qcow2: Behave correctly when refcount reach 0 or 2^16.
>   qcow2: Integrate deduplication in qcow2_co_writev loop.
>   qcow2: Add verification of dedup table.
>   qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
>   qcow2: Add check_dedup_l2 in order to check l2 of dedup table.
>   qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED.
>   qcow2: init and cleanup deduplication.
>   qemu-iotests: Filter dedup=on/off so existing tests don't break.
> 
>  Makefile                     |    3 +
>  Makefile.target              |    2 +-
>  block/Makefile.objs          |    1 +
>  block/qcow2-cluster.c        |  115 ++++--
>  block/qcow2-dedup.c          |  914 ++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2-refcount.c       |  154 +++++--
>  block/qcow2.c                |  267 ++++++++++--
>  block/qcow2.h                |   89 +++-
>  block_int.h                  |    1 +
>  docs/specs/qcow2.txt         |   33 +-
>  tests/qemu-iotests/common.rc |    3 +-
>  11 files changed, 1480 insertions(+), 102 deletions(-)
>  create mode 100644 block/qcow2-dedup.c

$ make check
  LINK  tests/test-coroutine
/usr/bin/ld: block/qcow2-dedup.o: undefined reference to symbol 'EVP_sha256@@libcrypto.so.10'
/usr/bin/ld: note: 'EVP_sha256@@libcrypto.so.10' is defined in DSO /lib64/libcrypto.so.10 so try adding it to the linker command line
/lib64/libcrypto.so.10: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
make: *** [tests/test-coroutine] Error 1


* Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (24 preceding siblings ...)
  2012-12-11 14:19 ` [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Stefan Hajnoczi
@ 2012-12-11 14:38 ` Stefan Hajnoczi
  2012-12-12 16:14 ` Benoît Canet
  26 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-11 14:38 UTC (permalink / raw)
  To: Benoît Canet; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

$ ./check -qcow2 026
[...]
Event: l1_grow.activate_table; errno: 5; imm: off; once: on
wrote 65536/65536 bytes at offset 0
64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
ERROR: cluster 14482163507622771: copied flag must never be set for compressed clusters
Warning: cluster offset=0xdcdcdcdcdcdcc00 is after the end of the image file, can't properly check refcounts.


* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
  2012-12-11 11:28   ` Stefan Hajnoczi
  2012-12-11 11:32   ` Stefan Hajnoczi
@ 2012-12-11 23:03   ` Eric Blake
  2012-12-12 15:59     ` Benoît Canet
  2 siblings, 1 reply; 40+ messages in thread
From: Eric Blake @ 2012-12-11 23:03 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha


On 11/26/2012 06:05 AM, Benoît Canet wrote:
> Signed-off-by: Benoit Canet <benoit@irqsave.net>
> ---
>  docs/specs/qcow2.txt |   33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 

In addition to Stefan's comments,

> @@ -159,6 +163,33 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Deduplication ==
> +
> +The deduplication extension contains the offset and size of the deduplication
> +table.
> +
> +    Byte   0 - 7:   Offset
> +
> +          8 - 11:   Size
> +
> +== Deduplication table ==
> +
> +The deduplication table contains 64 bits offsets to the level 2 deduplication

s/64 bits/64-bit/

> +table clusters.
> +Each entry of these clusters contains a 32 bytes SHA256 hash followed by the

s/32 bytes/32-byte/

> +64 bits logical offset of the first encountered block having this hash.

s/64 bits/64-bit/

> +
> +Entries in the deduplication table are orderered by physical cluster index.

s/orderered/ordered/

> +
> +The number of entries in an l2 deduplication table cluster is :
> +l2_dedup_cluster_entries = cluster_size / (32 + 8)

32+8 is not a power of two; what happens to the tail bytes at the end of
a cluster of entries?  If you define them to be 0 now, you can use them
for possible extensions later.

> +
> +The index in the level 1 deduplication table is :
> +l1_dedup_index = physical_cluster_index / l2_dedup_cluster_entries
> +
> +The index in the level 2 deduplication table is:
> +l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> 
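
The layout arithmetic from the quoted specification, including the unused tail bytes, can be sketched as follows. The function and macro names are hypothetical; only the formulas come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEDUP_HASH_SIZE   32 /* SHA256 digest */
#define DEDUP_OFFSET_SIZE 8  /* 64-bit logical offset */
#define DEDUP_ENTRY_SIZE  (DEDUP_HASH_SIZE + DEDUP_OFFSET_SIZE)

/* l2_dedup_cluster_entries = cluster_size / (32 + 8) */
static uint64_t l2_dedup_cluster_entries(uint64_t cluster_size)
{
    assert(cluster_size >= DEDUP_ENTRY_SIZE);
    return cluster_size / DEDUP_ENTRY_SIZE;
}

/* Bytes left over at the end of each L2 dedup cluster. */
static uint64_t l2_dedup_tail_bytes(uint64_t cluster_size)
{
    return cluster_size % DEDUP_ENTRY_SIZE;
}

/* l1_dedup_index = physical_cluster_index / l2_dedup_cluster_entries */
static uint64_t l1_dedup_index(uint64_t physical_cluster_index,
                               uint64_t cluster_size)
{
    return physical_cluster_index / l2_dedup_cluster_entries(cluster_size);
}

/* l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries */
static uint64_t l2_dedup_index(uint64_t physical_cluster_index,
                               uint64_t cluster_size)
{
    return physical_cluster_index % l2_dedup_cluster_entries(cluster_size);
}
```

For 64 KiB clusters this gives 1638 entries per cluster and 16 spare bytes; 4 KiB clusters hold 102 entries, again with 16 spare bytes.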

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-12-11 11:32   ` Stefan Hajnoczi
@ 2012-12-12 15:57     ` Benoît Canet
  2012-12-18 13:38       ` Stefan Hajnoczi
  0 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-12-12 15:57 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, qemu-devel, stefanha

> Can you foresee the need to use a different hash algorithm in the future
> and should we add a hash_algo enum field to the dedup QCOW2 header
> extension?

Yes, I foresee the future use of faster hash functions such as SHA-3 or Skein.

I also think an alternative deduplication mechanism, where lookups are done
on disk so that very large volumes can be deduplicated, could be added.

What would be the cleanest way to store this in the header extension:
bitmaps or two char fields?

Benoît


* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-12-11 23:03   ` Eric Blake
@ 2012-12-12 15:59     ` Benoît Canet
  0 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-12-12 15:59 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, qemu-devel, stefanha

> 32+8 is not a power of two; what happens to the tail bytes at the end of
> a cluster of entries?  If you define them to be 0 now, you can use them
> for possible extensions later.
They are currently unused. That's a nice idea.

Benoît


* Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
  2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
                   ` (25 preceding siblings ...)
  2012-12-11 14:38 ` Stefan Hajnoczi
@ 2012-12-12 16:14 ` Benoît Canet
  2012-12-18 13:42   ` Stefan Hajnoczi
  26 siblings, 1 reply; 40+ messages in thread
From: Benoît Canet @ 2012-12-12 16:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha


Hi Stefan,

I have a few questions.

1) Overlapping sequential sub-cluster writes

The current code passes most of the tests and behaves well with a 4KB cluster
sized ext3 volume on the deduplicated image.

But sequential writes smaller than the cluster size are troublesome.
They fail with xfstests.
The problem is that the lock is released twice, so coherency is not
guaranteed when two sub-cluster-size writes are done on the same area
(a deduplication attempt is made while the first write is not yet on disk).

My understanding is that a wait_for_overlapping_cluster_write function called
before the writev loop to serialize such writes would solve the problem.
What do you think of this idea?

2) Internal snapshots
I don't fully understand whether the current deduplication implementation is
compatible with internal snapshots. If not, could that be done in a later
patchset?

Benoît



* Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
  2012-12-12 15:57     ` Benoît Canet
@ 2012-12-18 13:38       ` Stefan Hajnoczi
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-18 13:38 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, Stefan Hajnoczi, qemu-devel

On Wed, Dec 12, 2012 at 04:57:38PM +0100, Benoît Canet wrote:
> > Can you foresee the need to use a different hash algorithm in the future
> > and should we add a hash_algo enum field to the dedup QCOW2 header
> > extension?
> 
> Yes, I foresee the future use of faster hash functions such as SHA-3 or Skein.
> 
> I also think an alternative deduplication mechanism, where lookups are done
> on disk so that very large volumes can be deduplicated, could be added.
> 
> What would be the cleanest way to store this in the header extension:
> bitmaps or two char fields?

The header extension could have a uint8_t hash_algo field (and 3
reserved bytes that can be used in the future).

0 - SHA256
1 - Skein
...
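
A sketch of how such an extension payload might be parsed. Assumptions, not part of the proposal: hash_algo sits at byte 12, directly after the offset (bytes 0 - 7) and size (bytes 8 - 11) fields from the spec patch, the three reserved bytes follow it, and integers are big-endian as elsewhere in qcow2. DedupExtSketch and dedup_ext_parse are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    DEDUP_HASH_SHA256 = 0,
    DEDUP_HASH_SKEIN  = 1
};

#define DEDUP_EXT_LEN 16 /* 8 (offset) + 4 (size) + 1 (algo) + 3 (reserved) */

typedef struct {
    uint64_t table_offset;
    uint32_t table_size;
    uint8_t hash_algo;
} DedupExtSketch;

/* Decode the big-endian payload.  Returns 0 on success, -1 if the hash
 * algorithm is unknown.  Reserved bytes are ignored on read so that
 * future versions can give them a meaning. */
static int dedup_ext_parse(const uint8_t *buf, DedupExtSketch *ext)
{
    int i;

    assert(buf != NULL && ext != NULL);

    ext->table_offset = 0;
    for (i = 0; i < 8; i++) {
        ext->table_offset = (ext->table_offset << 8) | buf[i];
    }

    ext->table_size = 0;
    for (i = 8; i < 12; i++) {
        ext->table_size = (ext->table_size << 8) | buf[i];
    }

    ext->hash_algo = buf[12];
    if (ext->hash_algo > DEDUP_HASH_SKEIN) {
        return -1;
    }
    return 0;
}
```

Rejecting unknown hash_algo values at open time keeps an older binary from misinterpreting hashes written with a newer algorithm.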

Stefan


* Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
  2012-12-12 16:14 ` Benoît Canet
@ 2012-12-18 13:42   ` Stefan Hajnoczi
  2012-12-24 12:26     ` Benoît Canet
  0 siblings, 1 reply; 40+ messages in thread
From: Stefan Hajnoczi @ 2012-12-18 13:42 UTC (permalink / raw)
  To: Benoît Canet; +Cc: kwolf, qemu-devel

On Wed, Dec 12, 2012 at 05:14:28PM +0100, Benoît Canet wrote:
> 
> Hi Stefan,
> 
> I have a few questions.
> 
> 1) Overlapping sequential sub-cluster writes
> 
> The current code passes most of the tests and behaves well with a 4KB cluster
> sized ext3 volume on the deduplicated image.
> 
> But sequential writes smaller than the cluster size are troublesome.
> They fail with xfstests.
> The problem is that the lock is released twice, so coherency is not
> guaranteed when two sub-cluster-size writes are done on the same area
> (a deduplication attempt is made while the first write is not yet on disk).
> 
> My understanding is that a wait_for_overlapping_cluster_write function called
> before the writev loop to serialize such writes would solve the problem.
> What do you think of this idea?

Yes, it's the same problem that copy-on-read has.  We can serialize I/O
requests, if necessary, in order to prevent them racing with each other.
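
The overlap test behind such serialization can be sketched as below. This is an illustration with hypothetical helper names, not the code used by copy-on-read: two requests conflict when their ranges, rounded out to cluster boundaries, intersect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round a request out to cluster boundaries.  cluster_sectors must be a
 * power of two.  The result is the half-open range [*start, *end). */
static void request_round_to_clusters(uint64_t sector_num, int nb_sectors,
                                      int cluster_sectors,
                                      uint64_t *start, uint64_t *end)
{
    uint64_t mask = (uint64_t)cluster_sectors - 1;

    assert(cluster_sectors > 0 &&
           (cluster_sectors & (cluster_sectors - 1)) == 0);
    assert(nb_sectors > 0);

    *start = sector_num & ~mask;
    *end = (sector_num + (uint64_t)nb_sectors + mask) & ~mask;
}

/* True if two requests touch at least one common cluster, in which case
 * the later one should wait until the earlier one has completed. */
static bool requests_overlap_cluster_wise(uint64_t sector_a, int nb_a,
                                          uint64_t sector_b, int nb_b,
                                          int cluster_sectors)
{
    uint64_t start_a, end_a, start_b, end_b;

    request_round_to_clusters(sector_a, nb_a, cluster_sectors,
                              &start_a, &end_a);
    request_round_to_clusters(sector_b, nb_b, cluster_sectors,
                              &start_b, &end_b);

    return start_a < end_b && start_b < end_a;
}
```

Two 4-sector writes at sectors 0 and 4 do not overlap sector-wise, but with 8 sectors per cluster they hit the same cluster and must be serialized.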

> 2) Internal snapshots
> I don't fully understand whether the current deduplication implementation is
> compatible with internal snapshots. If not, could that be done in a later
> patchset?

Let's figure out how hard it is to support internal snapshots for dedup.

Internal snapshot creation is simple:

1. Copy the current L1 table for the internal snapshot.
2. Increment refcounts for L2 and data clusters.
3. Finalize the internal snapshot.
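
Steps 1 and 2 can be modelled on a toy in-memory image to see how they interact with deduplicated clusters. Everything here (ToyImage, the table sizes, toy_snapshot_create) is a hypothetical simplification and not the qcow2 on-disk format; step 3 is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_L1_ENTRIES 2
#define TOY_L2_ENTRIES 2
#define TOY_CLUSTERS   8

/* Toy model: clusters are identified by index, 0 means "unallocated". */
typedef struct {
    uint32_t l1[TOY_L1_ENTRIES];               /* cluster of each L2 table */
    uint32_t l2[TOY_CLUSTERS][TOY_L2_ENTRIES]; /* l2[c]: entries of the L2
                                                  table stored in cluster c */
    uint16_t refcount[TOY_CLUSTERS];
} ToyImage;

static void toy_refcount_inc(ToyImage *img, uint32_t cluster)
{
    assert(cluster < TOY_CLUSTERS);
    assert(img->refcount[cluster] < UINT16_MAX); /* 16-bit refcounts */
    img->refcount[cluster]++;
}

static void toy_snapshot_create(ToyImage *img,
                                uint32_t snap_l1[TOY_L1_ENTRIES])
{
    int i, j;

    /* 1. copy the current L1 table for the snapshot */
    memcpy(snap_l1, img->l1, sizeof(img->l1));

    /* 2. increment refcounts of L2 tables and of the data clusters
     *    they reference, once per referencing L2 entry */
    for (i = 0; i < TOY_L1_ENTRIES; i++) {
        uint32_t l2_cluster = img->l1[i];

        if (!l2_cluster) {
            continue;
        }
        toy_refcount_inc(img, l2_cluster);
        for (j = 0; j < TOY_L2_ENTRIES; j++) {
            if (img->l2[l2_cluster][j]) {
                toy_refcount_inc(img, img->l2[l2_cluster][j]);
            }
        }
    }
}
```

In this model a data cluster shared by two L2 entries through dedup starts at refcount 2 and ends at 4 after one snapshot, so dedup makes the 16-bit refcount limit easier to reach.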

Where do you see an issue - do you think the refcount manipulations
you're doing for dedup might conflict with internal snapshots?

Stefan


* Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication
  2012-12-18 13:42   ` Stefan Hajnoczi
@ 2012-12-24 12:26     ` Benoît Canet
  0 siblings, 0 replies; 40+ messages in thread
From: Benoît Canet @ 2012-12-24 12:26 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Benoît Canet, kwolf, qemu-devel

> Yes, it's the same problem that copy-on-read has.  We can serialize I/O
> requests, if necessary, in order to prevent them racing with each other.

My current patchset has a big dedup_lock coroutine mutex.
I'll replace it with the overlapping request solution.

> Where do you see an issue - do you think the refcount manipulations
> you're doing for dedup might conflict with internal snapshots?

OK.
In any case, it seems to pass the savevm tests.

Benoît


end of thread, other threads:[~2012-12-24 12:26 UTC | newest]

Thread overview: 40+ messages
2012-11-26 13:04 [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification Benoît Canet
2012-12-11 11:28   ` Stefan Hajnoczi
2012-12-11 11:32   ` Stefan Hajnoczi
2012-12-12 15:57     ` Benoît Canet
2012-12-18 13:38       ` Stefan Hajnoczi
2012-12-11 23:03   ` Eric Blake
2012-12-12 15:59     ` Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 02/24] qcow2: Add deduplication structures and fields Benoît Canet
2012-12-11 11:34   ` Stefan Hajnoczi
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 03/24] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
2012-12-11 11:52   ` Stefan Hajnoczi
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 04/24] qcow2: Make update_cluster_refcount public Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 05/24] qcow2: Create a way to link to l2 tables in dedup Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 06/24] qcow2: Add qcow2_dedup and related functions Benoît Canet
2012-12-11 13:16   ` Stefan Hajnoczi
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 07/24] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 08/24] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
2012-12-11 13:28   ` Stefan Hajnoczi
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 09/24] qcow2: Extract qcow2_dedup_grow_table Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 10/24] qcow2: create function to load deduplication hashes at startup Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 11/24] qcow2: Load and save deduplication table header extension Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 12/24] qcow2: Extract qcow2_do_table_init Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 13/24] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 14/24] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 15/24] block: Add dedup image create option Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 16/24] qcow2: Allow creation of images using deduplication Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 17/24] qcow2: Behave correctly when refcount reach 0 or 2^16 Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 18/24] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 19/24] qcow2: Add verification of dedup table Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 20/24] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 21/24] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 22/24] qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 23/24] qcow2: init and cleanup deduplication Benoît Canet
2012-11-26 13:05 ` [Qemu-devel] [RFC V3 24/24] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
2012-12-11 14:19 ` [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication Stefan Hajnoczi
2012-12-11 14:38 ` Stefan Hajnoczi
2012-12-12 16:14 ` Benoît Canet
2012-12-18 13:42   ` Stefan Hajnoczi
2012-12-24 12:26     ` Benoît Canet
