* [Qemu-devel] [RFC V7 01/32] qcow2: Add deduplication to the qcow2 specification.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 02/32] qmp: Add DedupStatus enum Benoît Canet
` (30 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
docs/specs/qcow2.txt | 105 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 103 insertions(+), 2 deletions(-)
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..8e52de1 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -80,7 +80,12 @@ in the description of a field.
tables to repair refcounts before accessing the
image.
- Bits 1-63: Reserved (set to 0)
+ Bit 1: Deduplication bit. If this bit is set then
+ deduplication is used on this image.
+ The L2 table size (64KB) then differs from
+ the cluster size (4KB).
+
+ Bits 2-63: Reserved (set to 0)
80 - 87: compatible_features
Bitmask of compatible features. An implementation can
@@ -116,6 +121,7 @@ be stored. Each extension has a structure like the following:
0x00000000 - End of the header extension area
0xE2792ACA - Backing file format name
0x6803f857 - Feature name table
+ 0xCD8E819B - Deduplication
other - Unknown header extension, can be safely
ignored
@@ -159,6 +165,101 @@ the header extension data. Each entry look like this:
terminated if it has full length)
+== Deduplication ==
+
+The deduplication extension contains information concerning deduplication.
+
+ Byte 0 - 7: Offset of the RAM deduplication table (RAM lookup)
+
+ 8 - 11: Size of the RAM deduplication table = number of L1 64-bit
+ pointers
+
+ 12: Hash algo enum field
+ 0: SHA-256
+ 1: SHA3
+ 2: SKEIN-256
+
+ 13: Dedup strategies bitmap
+ 0: RAM based hash lookup (always set to 1 for now)
+ 1: Disk based hash lookup
+ 2: Deduplication running if set to 1
+
+ 14 - 69: Set to zero and reserved for future use
+
+Disk based lookup structure will be described in a future QCOW2 specification.
+
+== Deduplication table (RAM method) ==
+
+The deduplication table maps a physical offset to a data hash and
+logical offset. It is used to permanently store the information to
+do the deduplication. It is loaded at startup into a RAM based representation
+used to do the lookups.
+
+The deduplication table contains 64-bit offsets to the level 2 deduplication
+table blocks.
+Each entry of these blocks contains a 32-byte SHA256 hash followed by the
+64-bit logical offset of the first encountered cluster having this hash.
+
+== Deduplication table schematic (RAM method) ==
+
+0 l1_dedup_index Size
+ |
+|--------------------------------------------------------------------|
+| | |
+| | L1 Deduplication table |
+| | |
+|--------------------------------------------------------------------|
+ |
+ |
+ |
+0 | l2_dedup_block_entries
+ |
+|---------------------------------|
+| |
+| L2 deduplication block |
+| |
+| l2_dedup_index |
+|---------------------------------|
+ |
+ 0 | 40
+ |
+ |-------------------------------|
+ | |
+ | Deduplication table entry |
+ | |
+ |-------------------------------|
+
+
+== Deduplication table entry description (RAM method) ==
+
+Each L2 deduplication table entry has the following structure:
+
+ Byte 0 - 31: hash of data cluster
+
+ 32 - 39: Logical offset of first encountered block having
+ this hash
+
+== Deduplication table arithmetic (RAM method) ==
+
+cluster_size = 4096
+dedup_block_size = 65536 * 5
+l2_size = 65536 * 16 (16 factor is from the smaller cluster_size)
+refcount_order must be >= 4
+
+Entries in the deduplication table are ordered by physical cluster index.
+
+The number of entries in an L2 deduplication table block is:
+l2_dedup_block_entries = FLOOR(dedup_block_size / (32 + 8))
+
+The index in the level 1 deduplication table is:
+l1_dedup_index = physical_cluster_index / l2_dedup_block_entries
+
+The index in the level 2 deduplication table is:
+l2_dedup_index = physical_cluster_index % l2_dedup_block_entries
+
+The 16 remaining bytes in each L2 deduplication block are set to zero and
+reserved for future use.
+
== Host cluster management ==
qcow2 manages the allocation of host clusters by maintaining a reference count
@@ -211,7 +312,7 @@ guest clusters to host clusters. They are called L1 and L2 table.
The L1 table has a variable size (stored in the header) and may use multiple
clusters, however it must be contiguous in the image file. L2 tables are
-exactly one cluster in size.
+exactly one cluster in size, except in the deduplication case.
Given a offset into the virtual disk, the offset into the image file can be
obtained as follows:
--
1.7.10.4
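
Note on the table arithmetic in the spec text above: the following is a
minimal sketch of the L1/L2 index computation, not code from this series.
The constants are the ones given in the spec text; the DedupEntryPos type
and the dedup_block_entries()/dedup_locate() helpers are hypothetical names
introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the spec text in this patch. */
#define DEDUP_BLOCK_SIZE (65536 * 5)  /* bytes per L2 deduplication block */
#define DEDUP_ENTRY_SIZE (32 + 8)     /* 32-byte hash + 64-bit logical offset */

/* Hypothetical helper type: where an entry lives in the two-level table. */
typedef struct {
    uint64_t l1_index;  /* which L1 pointer to follow */
    uint64_t l2_index;  /* which entry inside the L2 block */
    uint64_t byte_off;  /* byte offset of that entry inside the L2 block */
} DedupEntryPos;

/* l2_dedup_block_entries = FLOOR(dedup_block_size / (32 + 8)) */
static uint64_t dedup_block_entries(void)
{
    return DEDUP_BLOCK_SIZE / DEDUP_ENTRY_SIZE;
}

/* Entries are ordered by physical cluster index, so a divide and a
 * modulo are enough to find the entry of a given physical cluster. */
static DedupEntryPos dedup_locate(uint64_t physical_cluster_index)
{
    uint64_t entries = dedup_block_entries();
    DedupEntryPos pos;

    assert(entries > 0);
    pos.l1_index = physical_cluster_index / entries;
    pos.l2_index = physical_cluster_index % entries;
    pos.byte_off = pos.l2_index * DEDUP_ENTRY_SIZE;
    return pos;
}
```

With dedup_block_size = 65536 * 5 the division is exact (8192 entries per
block). A 16-byte remainder, as mentioned in the spec text, would instead
arise with a 65536-byte block (1638 entries of 40 bytes leave 16 bytes).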
* [Qemu-devel] [RFC V7 02/32] qmp: Add DedupStatus enum.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 01/32] qcow2: Add deduplication to the qcow2 specification Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 03/32] qcow2: Add deduplication structures and fields Benoît Canet
` (29 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
qapi-schema.json | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/qapi-schema.json b/qapi-schema.json
index 28b070f..ec01773 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -839,6 +839,24 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'] }
##
+# @DedupStatus
+#
+# An enumeration of a virtual block device deduplication status.
+#
+# @stopped: The deduplication has been stopped
+#
+# @starting: The deduplication is starting
+#
+# @started: The deduplication is started
+#
+# @stopping: The deduplication is stopping
+#
+# Since: 1.5.0
+##
+{ 'enum': 'DedupStatus', 'data': [ 'stopped', 'starting', 'started',
+ 'stopping' ] }
+
+##
# @BlockDeviceStats:
#
# Statistics of a virtual block device or a block backing device.
--
1.7.10.4
* [Qemu-devel] [RFC V7 03/32] qcow2: Add deduplication structures and fields.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 01/32] qcow2: Add deduplication to the qcow2 specification Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 02/32] qmp: Add DedupStatus enum Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 04/32] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
` (28 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2.h | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 76 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.h b/block/qcow2.h
index 718b52b..87da573 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -43,6 +43,10 @@
#define QCOW_OFLAG_COPIED (1LL << 63)
/* indicate that the cluster is compressed (they never have the copied flag) */
#define QCOW_OFLAG_COMPRESSED (1LL << 62)
+/* indicate that the cluster must be processed when deduplication restarts;
+ * also indicate that the on-disk dedup hash must be ignored and discarded
+ */
+#define QCOW_OFLAG_PENDING_DEDUP (1LL << 61)
/* The cluster reads as all zeros */
#define QCOW_OFLAG_ZERO (1LL << 0)
@@ -56,7 +60,64 @@
/* Must be at least 4 to cover all cases of refcount table growth */
#define REFCOUNT_CACHE_SIZE 4
+#define DEDUP_CACHE_SIZE 4
#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_DEDUP_CLUSTER_SIZE 4096
+
+#define HASH_LENGTH 32
+
+/* indicate that this cluster refcount has reached its maximum value */
+#define QCOW_DEDUP_FLAG_HALF_MAX_REFCOUNT (1LL << 61)
+/* indicate that the hash structure is empty and its offsets are missing */
+#define QCOW_DEDUP_FLAG_EMPTY (1LL << 62)
+
+#define QCOW_DEDUP_STRATEGY_RUNNING (1 << 0)
+#define QCOW_DEDUP_STRATEGY_RAM (1 << 1)
+
+typedef enum {
+ QCOW_HASH_SHA256 = 0,
+ QCOW_HASH_SHA3 = 1,
+ QCOW_HASH_SKEIN = 2,
+} QCowHashAlgo;
+
+typedef struct {
+ uint8_t data[HASH_LENGTH]; /* 32 bytes hash of a given cluster */
+} QCowHash;
+
+/* Used to keep a single precomputed hash between the calls of the dedup
+ * function
+ */
+typedef struct {
+ QCowHash hash;
+ bool reuse; /* The main deduplication function can set this field to
+ * true before exiting to avoid computing the same hash
+ * twice. It's a speed optimization.
+ */
+} QcowPersistentHash;
+
+/* deduplication node */
+typedef struct {
+ QCowHash hash;
+ uint64_t physical_sect; /* where the cluster is stored on disk */
+ uint64_t first_logical_sect; /* logical sector of the first occurrence of
+ * this cluster
+ */
+} QCowHashNode;
+
+/* Undedupable hashes that must be written later to disk */
+typedef struct QCowHashElement {
+ QCowHash hash;
+ QTAILQ_ENTRY(QCowHashElement) next;
+} QCowHashElement;
+
+typedef struct {
+ QcowPersistentHash phash; /* contains a hash persisting between calls of
+ * qcow2_dedup()
+ */
+ QTAILQ_HEAD(, QCowHashElement) undedupables;
+ uint64_t nb_clusters_processed;
+ uint64_t nb_undedupable_sectors;
+} QCowDedupState;
typedef struct QCowHeader {
uint32_t magic;
@@ -114,8 +175,10 @@ enum {
enum {
QCOW2_INCOMPAT_DIRTY_BITNR = 0,
QCOW2_INCOMPAT_DIRTY = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
+ QCOW2_INCOMPAT_DEDUP_BITNR = 1,
+ QCOW2_INCOMPAT_DEDUP = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
- QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY,
+ QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
};
/* Compatible feature bits */
@@ -138,6 +201,7 @@ typedef struct BDRVQcowState {
int cluster_sectors;
int l2_bits;
int l2_size;
+ int hash_block_size;
int l1_size;
int l1_vm_state_index;
int csize_shift;
@@ -148,6 +212,7 @@ typedef struct BDRVQcowState {
Qcow2Cache* l2_table_cache;
Qcow2Cache* refcount_block_cache;
+ Qcow2Cache *dedup_cluster_cache;
uint8_t *cluster_cache;
uint8_t *cluster_data;
@@ -160,6 +225,16 @@ typedef struct BDRVQcowState {
int64_t free_cluster_index;
int64_t free_byte_offset;
+ bool has_dedup;
+ DedupStatus dedup_status;
+ QCowHashAlgo dedup_hash_algo;
+ Coroutine *dedup_resume_co;
+ int dedup_co_delay;
+ uint64_t *dedup_table;
+ uint64_t dedup_table_offset;
+ size_t dedup_table_size;
+ GTree *dedup_tree_by_hash;
+
CoMutex lock;
uint32_t crypt_method; /* current crypt method, 0 if no key yet */
--
1.7.10.4
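
Note on dedup_tree_by_hash: this patch only declares the GTree, so the
ordering used for its keys is not visible here. The following is a sketch of
one plausible ordering, a bytewise comparison of the 32-byte hashes. The
function name qcow_hash_compare is hypothetical; its signature is shaped like
GLib's GCompareFunc (two const void pointers, returning an int), but GLib is
deliberately not included so that the example stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_LENGTH 32

/* Same layout as the QCowHash structure added by this patch. */
typedef struct {
    uint8_t data[HASH_LENGTH];
} QCowHash;

/* Hypothetical comparison callback: negative, zero or positive depending
 * on the bytewise order of the two hashes, as a balanced tree expects. */
static int qcow_hash_compare(const void *a, const void *b)
{
    const QCowHash *ha = a;
    const QCowHash *hb = b;

    assert(ha != NULL && hb != NULL);
    return memcmp(ha->data, hb->data, HASH_LENGTH);
}
```

Any total order over the hash bytes would do for lookups; memcmp() is simply
the shortest way to get one.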
* [Qemu-devel] [RFC V7 04/32] qcow2: Add qcow2_dedup_read_missing_and_concatenate
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (2 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 03/32] qcow2: Add deduplication structures and fields Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 05/32] qcow2: Create a way to link to l2 tables when deduplicating Benoît Canet
` (27 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This function reads the missing data when unaligned writes are
done. It also concatenates the missing data with the given
qiov data in order to prepare a buffer used to look for duplicated
clusters.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/Makefile.objs | 1 +
block/qcow2-dedup.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.c | 35 +++++++++++++++
block/qcow2.h | 12 +++++
4 files changed, 169 insertions(+)
create mode 100644 block/qcow2-dedup.c
diff --git a/block/Makefile.objs b/block/Makefile.objs
index c067f38..21afc85 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2-dedup.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-y += parallels.o blkdebug.o blkverify.o
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
new file mode 100644
index 0000000..bc6e2c2
--- /dev/null
+++ b/block/qcow2-dedup.c
@@ -0,0 +1,121 @@
+/*
+ * Deduplication for the QCOW2 format
+ *
+ * Copyright (C) Nodalink, SARL. 2012-2013
+ *
+ * Author:
+ * Benoît Canet <benoit.canet@irqsave.net>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block/block_int.h"
+#include "qemu-common.h"
+#include "qcow2.h"
+
+/*
+ * Prepare a buffer containing everything required to compute cluster
+ * sized deduplication hashes.
+ * If sector_num or nb_sectors are not cluster-aligned, missing data
+ * before/after the qiov will be read.
+ *
+ * @qiov: the qiov for which missing data must be read
+ * @sector_num: the first sector that must be read into the qiov
+ * @nb_sectors: the number of sectors to read into the qiov
+ * @data: the place where the data will be concatenated and stored
+ * the caller is responsible for freeing data with
+ * qemu_vfree() on success.
+ * @nb_data_sectors: the resulting size of the concatenated data (in sectors)
+ * @ret: negative on error
+ */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ uint64_t sector_num,
+ int nb_sectors,
+ uint8_t **data,
+ int *nb_data_sectors)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ uint64_t cluster_beginning_sector;
+ uint64_t first_sector_after_qiov;
+ int cluster_beginning_nr;
+ int cluster_ending_nr;
+ int unaligned_ending_nr;
+ uint64_t max_cluster_ending_nr;
+
+ /* compute how much and where to read at the beginning */
+ cluster_beginning_nr = sector_num & (s->cluster_sectors - 1);
+ cluster_beginning_sector = sector_num - cluster_beginning_nr;
+
+ /* for the ending */
+ first_sector_after_qiov = sector_num + nb_sectors;
+ unaligned_ending_nr = first_sector_after_qiov & (s->cluster_sectors - 1);
+ cluster_ending_nr = unaligned_ending_nr ?
+ s->cluster_sectors - unaligned_ending_nr : 0;
+
+ /* compute total size in sectors and allocate memory */
+ *nb_data_sectors = cluster_beginning_nr + nb_sectors + cluster_ending_nr;
+ *data = qemu_blockalign(bs, *nb_data_sectors * BDRV_SECTOR_SIZE);
+
+ /* read beginning */
+ if (cluster_beginning_nr) {
+ ret = qcow2_read_cluster_data(bs,
+ *data,
+ cluster_beginning_sector,
+ cluster_beginning_nr);
+ }
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ /* append qiov content */
+ qemu_iovec_to_buf(qiov, 0, *data + cluster_beginning_nr * BDRV_SECTOR_SIZE,
+ qiov->size);
+
+ /* Fix cluster_ending_nr if we are at risk of reading outside the image
+ * (Cluster unaligned image size)
+ */
+ max_cluster_ending_nr = bs->total_sectors - first_sector_after_qiov;
+ cluster_ending_nr = max_cluster_ending_nr < (uint64_t) cluster_ending_nr ?
+ (int) max_cluster_ending_nr : cluster_ending_nr;
+
+ /* read and add ending */
+ if (cluster_ending_nr) {
+ ret = qcow2_read_cluster_data(bs,
+ *data +
+ (cluster_beginning_nr +
+ nb_sectors) *
+ BDRV_SECTOR_SIZE,
+ first_sector_after_qiov,
+ cluster_ending_nr);
+ }
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ return 0;
+
+fail:
+ qemu_vfree(*data);
+ *data = NULL;
+ return ret;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 7610e56..ca38cc3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1110,6 +1110,41 @@ fail:
return ret;
}
+/**
+ * Read some data from the QCOW2 file
+ *
+ * Important: s->lock is dropped during the read and taken again before
+ * returning, so state can change before the function returns to the caller.
+ *
+ * @data: the buffer where the data must be stored
+ * @sector_num: the sector number to read in the QCOW2 file
+ * @nb_sectors: the number of sectors to read
+ * @ret: negative on error
+ */
+coroutine_fn int qcow2_read_cluster_data(BlockDriverState *bs,
+ uint8_t *data,
+ uint64_t sector_num,
+ int nb_sectors)
+{
+ BDRVQcowState *s = bs->opaque;
+ QEMUIOVector qiov;
+ struct iovec iov;
+ int ret;
+
+ iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE;
+ iov.iov_base = data;
+ qemu_iovec_init_external(&qiov, &iov, 1);
+ qemu_co_mutex_unlock(&s->lock);
+ ret = qcow2_co_readv(bs, sector_num, nb_sectors, &qiov);
+ qemu_co_mutex_lock(&s->lock);
+ if (ret < 0) {
+ error_report("failed to read %d sectors at sector %" PRIu64,
+ nb_sectors, sector_num);
+ }
+
+ return ret;
+}
+
static int qcow2_change_backing_file(BlockDriverState *bs,
const char *backing_file, const char *backing_fmt)
{
diff --git a/block/qcow2.h b/block/qcow2.h
index 87da573..83c90b6 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -381,6 +381,10 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
int qcow2_mark_dirty(BlockDriverState *bs);
int qcow2_update_header(BlockDriverState *bs);
+int qcow2_read_cluster_data(BlockDriverState *bs,
+ uint8_t *data,
+ uint64_t sector_num,
+ int nb_sectors);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
@@ -449,4 +453,12 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
void **table);
int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
+/* qcow2-dedup.c functions */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ uint64_t sector,
+ int sectors_nr,
+ uint8_t **dedup_cluster_data,
+ int *dedup_cluster_data_nr);
+
#endif
--
1.7.10.4
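
Note on the alignment arithmetic in qcow2_dedup_read_missing_and_concatenate():
the sketch below isolates the head/tail computation so it can be checked
without QEMU. It is an illustration under stated assumptions, not part of
the series: the DedupPadding type and the dedup_padding() helper are
hypothetical, and the request is assumed to end inside the image.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: what must be read around an unaligned write. */
typedef struct {
    uint64_t head_start;  /* first sector of the cluster holding sector_num */
    int head_nr;          /* sectors to read before the request */
    int tail_nr;          /* sectors to read after the request */
    int total_nr;         /* size of the concatenated buffer, in sectors */
} DedupPadding;

static DedupPadding dedup_padding(uint64_t sector_num, int nb_sectors,
                                  int cluster_sectors, uint64_t total_sectors)
{
    DedupPadding p;
    uint64_t end = sector_num + (uint64_t)nb_sectors;
    uint64_t mask = (uint64_t)cluster_sectors - 1;
    uint64_t max_tail;
    int unaligned_end;

    /* the masks below only work for power-of-two cluster sizes */
    assert(cluster_sectors > 0 && (cluster_sectors & (cluster_sectors - 1)) == 0);
    assert(nb_sectors >= 0 && end <= total_sectors);

    /* beginning: distance back to the start of the cluster */
    p.head_nr = (int)(sector_num & mask);
    p.head_start = sector_num - (uint64_t)p.head_nr;

    /* ending: distance forward to the end of the cluster */
    unaligned_end = (int)(end & mask);
    p.tail_nr = unaligned_end ? cluster_sectors - unaligned_end : 0;

    /* as in the patch, the buffer is sized before the tail is clamped */
    p.total_nr = p.head_nr + nb_sectors + p.tail_nr;

    /* do not read past the end of an image whose size is not
     * cluster-aligned */
    max_tail = total_sectors - end;
    if (max_tail < (uint64_t)p.tail_nr) {
        p.tail_nr = (int)max_tail;
    }
    return p;
}
```

For example, with 8 sectors per cluster (4KB clusters), a 3-sector write at
sector 10 needs 2 sectors read before it and 3 after it, giving one full
8-sector cluster to hash.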
* [Qemu-devel] [RFC V7 05/32] qcow2: Create a way to link to l2 tables when deduplicating.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (3 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 04/32] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 06/32] qcow2: Make qcow2_update_cluster_refcount public Benoît Canet
` (26 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-cluster.c | 7 +++++--
block/qcow2.h | 6 ++++++
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 56fccf9..3354f39 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -693,7 +693,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
old_cluster[j++] = l2_table[l2_index + i];
l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
- (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
+ (i << s->cluster_bits)) | m->l2_entry_flags);
}
@@ -706,7 +706,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
* If this was a COW, we need to decrease the refcount of the old cluster.
* Also flush bs->file to get the right order for L2 and refcount update.
*/
- if (j != 0) {
+ if (!m->overwrite && j != 0) {
for (i = 0; i < j; i++) {
qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
}
@@ -1006,6 +1006,9 @@ again:
.offset = nb_sectors * BDRV_SECTOR_SIZE,
.nb_sectors = avail_sectors - nb_sectors,
},
+
+ .l2_entry_flags = QCOW_OFLAG_COPIED,
+ .overwrite = false,
};
qemu_co_queue_init(&(*m)->dependent_requests);
QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
diff --git a/block/qcow2.h b/block/qcow2.h
index 83c90b6..6c45520 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -309,6 +309,12 @@ typedef struct QCowL2Meta
*/
CoQueue dependent_requests;
+ /* contains the flags to apply to the l2 entry */
+ uint64_t l2_entry_flags;
+
+ /* set to true if we are overwriting an L2 table entry */
+ bool overwrite;
+
/**
* The COW Region between the start of the first allocated cluster and the
* area the guest actually writes to.
--
1.7.10.4
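
Note on l2_entry_flags: the sketch below shows how an L2 entry value is
composed from a cluster offset and a flag word, mirroring the expression
changed in qcow2_alloc_cluster_link_l2(). It is an illustration only. Both
helper names are hypothetical, the value is kept in host byte order (the
patch applies cpu_to_be64() before storing), and the rule in
l2_flags_for_refcount() is an assumption drawn from the qcow2 convention
that QCOW_OFLAG_COPIED marks clusters whose refcount is exactly one.

```c
#include <assert.h>
#include <stdint.h>

#define QCOW_OFLAG_COPIED (1ULL << 63)

/* Hypothetical helper: a cluster shared by several L2 entries (refcount
 * above one, as after deduplication) must not carry the COPIED flag. */
static uint64_t l2_flags_for_refcount(int refcount)
{
    assert(refcount >= 1);
    return refcount == 1 ? QCOW_OFLAG_COPIED : 0;
}

/* Hypothetical helper: value of the i-th L2 entry of a contiguous
 * allocation starting at cluster_offset. */
static uint64_t l2_entry_value(uint64_t cluster_offset, int i,
                               int cluster_bits, uint64_t flags)
{
    return (cluster_offset + ((uint64_t)i << cluster_bits)) | flags;
}
```

A regular allocation would pass QCOW_OFLAG_COPIED here, while a deduplicated
link would pass 0, which is what making the flags a QCowL2Meta field allows.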
* [Qemu-devel] [RFC V7 06/32] qcow2: Make qcow2_update_cluster_refcount public.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (4 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 05/32] qcow2: Create a way to link to l2 tables when deduplicating Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 07/32] qcow2: Add qcow2_dedup and related functions Benoît Canet
` (25 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Also add a flush parameter to it.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-refcount.c | 28 ++++++++++++++++++++--------
block/qcow2.h | 4 ++++
2 files changed, 24 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 55543ed..e12b58c 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -510,13 +510,15 @@ fail:
/*
* Increases or decreases the refcount of a given cluster by one.
* addend must be 1 or -1.
+ * flush must be true if bs->file must be flushed after the update
*
* If the return value is non-negative, it is the new refcount of the cluster.
* If it is negative, it is -errno and indicates an error.
*/
-static int update_cluster_refcount(BlockDriverState *bs,
- int64_t cluster_index,
- int addend)
+int qcow2_update_cluster_refcount(BlockDriverState *bs,
+ int64_t cluster_index,
+ int addend,
+ bool flush)
{
BDRVQcowState *s = bs->opaque;
int ret;
@@ -526,7 +528,9 @@ static int update_cluster_refcount(BlockDriverState *bs,
return ret;
}
- bdrv_flush(bs->file);
+ if (flush) {
+ bdrv_flush(bs->file);
+ }
return get_refcount(bs, cluster_index);
}
@@ -645,7 +649,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
if (free_in_cluster == 0)
s->free_byte_offset = 0;
if ((offset & (s->cluster_size - 1)) != 0)
- update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+ qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+ true);
} else {
offset = qcow2_alloc_clusters(bs, s->cluster_size);
if (offset < 0) {
@@ -655,7 +660,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
if ((cluster_offset + s->cluster_size) == offset) {
/* we are lucky: contiguous data */
offset = s->free_byte_offset;
- update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+ qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+ true);
s->free_byte_offset += size;
} else {
s->free_byte_offset = offset;
@@ -792,7 +798,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
} else {
uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
if (addend != 0) {
- refcount = update_cluster_refcount(bs, cluster_index, addend);
+ refcount = qcow2_update_cluster_refcount(bs,
+ cluster_index,
+ addend,
+ true);
} else {
refcount = get_refcount(bs, cluster_index);
}
@@ -824,7 +833,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
if (addend != 0) {
- refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend);
+ refcount = qcow2_update_cluster_refcount(bs,
+ l2_offset >> s->cluster_bits,
+ addend,
+ true);
} else {
refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
}
diff --git a/block/qcow2.h b/block/qcow2.h
index 6c45520..3dc9834 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -410,6 +410,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
+int qcow2_update_cluster_refcount(BlockDriverState *bs,
+ int64_t cluster_index,
+ int addend,
+ bool flush);
/* qcow2-cluster.c functions */
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
--
1.7.10.4
* [Qemu-devel] [RFC V7 07/32] qcow2: Add qcow2_dedup and related functions
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (5 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 06/32] qcow2: Make qcow2_update_cluster_refcount public Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 08/32] qcow2: Add qcow2_dedup_store_new_hashes Benoît Canet
` (24 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 432 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 5 +
2 files changed, 437 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index bc6e2c2..3ef34a9 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -119,3 +119,435 @@ fail:
*data = NULL;
return ret;
}
+
+/*
+ * Build a QCowHashNode structure
+ *
+ * @hash: the given hash
+ * @physical_sect: the cluster offset in the QCOW2 file
+ * @first_logical_sect: the first logical cluster offset written
+ * @ret: the newly built QCowHashNode
+ */
+static QCowHashNode *qcow2_hash_node_new(QCowHash *hash,
+ uint64_t physical_sect,
+ uint64_t first_logical_sect)
+{
+ QCowHashNode *hash_node;
+
+ hash_node = g_new0(QCowHashNode, 1);
+ memcpy(hash_node->hash.data, hash->data, HASH_LENGTH);
+ hash_node->physical_sect = physical_sect;
+ hash_node->first_logical_sect = first_logical_sect;
+
+ return hash_node;
+}
+
+/*
+ * Compute the hash of a given cluster
+ *
+ * @data: a buffer containing the cluster data
+ * @hash: a QCowHash where to store the computed hash
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_compute_cluster_hash(BlockDriverState *bs,
+ QCowHash *hash,
+ uint8_t *data)
+{
+ return 0;
+}
+
+/*
+ * Get a QCowHashNode corresponding to a cluster data
+ *
+ * @phash: if phash can be used no hash is computed
+ * @data: a buffer containing the cluster
+ * @err: Error code if any
+ * @ret: QCowHashNode of the duplicated cluster or NULL if not found
+ */
+static QCowHashNode *qcow2_get_hash_node_for_cluster(BlockDriverState *bs,
+ QcowPersistentHash *phash,
+ uint8_t *data,
+ int *err)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ *err = 0;
+
+ /* no hash has been provided: compute it and store it for later use */
+ if (!phash->reuse) {
+ ret = qcow2_compute_cluster_hash(bs,
+ &phash->hash,
+ data);
+ }
+
+ /* do not reuse the hash anymore if it was precomputed */
+ phash->reuse = false;
+
+ if (ret < 0) {
+ *err = ret;
+ return NULL;
+ }
+
+ return g_tree_lookup(s->dedup_tree_by_hash, &phash->hash);
+}
+
+/*
+ * Build a QCowHashNode from a given QCowHash and insert it into the tree
+ *
+ * @hash: the given QCowHash
+ */
+static void qcow2_build_and_insert_hash_node(BlockDriverState *bs,
+ QCowHash *hash)
+{
+ BDRVQcowState *s = bs->opaque;
+ QCowHashNode *hash_node;
+
+ /* build the hash node with QCOW_DEDUP_FLAG_EMPTY as offsets so we will remember
+ * to fill these fields later with real values.
+ */
+ hash_node = qcow2_hash_node_new(hash,
+ QCOW_DEDUP_FLAG_EMPTY,
+ QCOW_DEDUP_FLAG_EMPTY);
+ g_tree_insert(s->dedup_tree_by_hash, &hash_node->hash, hash_node);
+}
+
+/*
+ * Helper used to build a QCowHashElement
+ *
+ * @hash: the QCowHash to use
+ * @ret: a newly allocated QCowHashElement containing the given hash
+ */
+static QCowHashElement *qcow2_dedup_hash_new(QCowHash *hash)
+{
+ QCowHashElement *dedup_hash;
+ dedup_hash = g_new0(QCowHashElement, 1);
+ memcpy(dedup_hash->hash.data, hash->data, HASH_LENGTH);
+ return dedup_hash;
+}
+
+/*
+ * Helper used to link a deduplicated cluster in the l2
+ *
+ * @logical_sect: the cluster sector seen by the guest
+ * @physical_sect: the cluster sector in the QCOW2 file
+ * @overwrite: true if we must overwrite the L2 table entry
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_dedup_link_l2(BlockDriverState *bs,
+ uint64_t logical_sect,
+ uint64_t physical_sect,
+ bool overwrite)
+{
+ QCowL2Meta m = {
+ .alloc_offset = physical_sect << 9,
+ .offset = logical_sect << 9,
+ .nb_clusters = 1,
+ .nb_available = 0,
+ .cow_start = {
+ .offset = 0,
+ .nb_sectors = 0,
+ },
+ .cow_end = {
+ .offset = 0,
+ .nb_sectors = 0,
+ },
+ .l2_entry_flags = 0,
+ .overwrite = overwrite,
+ };
+ return qcow2_alloc_cluster_link_l2(bs, &m);
+}
+
+/* Clear the QCOW_OFLAG_COPIED from the first L2 entry written for a physical
+ * cluster.
+ *
+ * @hash_node: the duplicated hash node
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_clear_l2_copied_flag_if_needed(BlockDriverState *bs,
+ QCowHashNode *hash_node)
+{
+ int ret = 0;
+ uint64_t first_logical_sect = hash_node->first_logical_sect;
+
+ /* QCOW_OFLAG_COPIED already cleared -> do nothing */
+ if (!(first_logical_sect & QCOW_OFLAG_COPIED)) {
+ return 0;
+ }
+
+ first_logical_sect &= ~QCOW_OFLAG_COPIED;
+
+ /* overwrite first L2 entry to clear QCOW_OFLAG_COPIED */
+ ret = qcow2_dedup_link_l2(bs, first_logical_sect,
+ hash_node->physical_sect,
+ true);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* remember that we don't need to clear QCOW_OFLAG_COPIED again */
+ hash_node->first_logical_sect = first_logical_sect;
+
+ return 0;
+}
+
+/* This function deduplicates a cluster
+ *
+ * @logical_sect: The logical sector of the write
+ * @hash_node: The duplicated cluster hash node
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_deduplicate_cluster(BlockDriverState *bs,
+ uint64_t logical_sect,
+ QCowHashNode *hash_node)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t cluster_index = hash_node->physical_sect / s->cluster_sectors;
+ int ret = 0;
+
+ /* Increment the refcount of the cluster */
+ ret = qcow2_update_cluster_refcount(bs,
+ cluster_index,
+ 1,
+ false);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* create new L2 entry */
+ return qcow2_dedup_link_l2(bs, logical_sect,
+ hash_node->physical_sect,
+ false);
+}
+
+/* This function tries to deduplicate a given cluster.
+ *
+ * @sector_num: the logical sector number we are trying to deduplicate
+ * @phash: Used instead of computing the hash if provided
+ * @data: the buffer in which to look for a duplicated cluster
+ * @ret: ret < 0 on error, 1 on deduplication else 0
+ */
+static int qcow2_try_dedup_cluster(BlockDriverState *bs,
+ QcowPersistentHash *phash,
+ uint64_t sector_num,
+ uint8_t *data)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ QCowHashNode *hash_node;
+ uint64_t logical_sect;
+ uint64_t existing_physical_offset;
+ int pnum = s->cluster_sectors;
+
+ /* search the tree for duplicated cluster */
+ hash_node = qcow2_get_hash_node_for_cluster(bs,
+ phash,
+ data,
+ &ret);
+
+ /* we won't reuse the hash on error */
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* if cluster is not duplicated store hash for later usage */
+ if (!hash_node) {
+ qcow2_build_and_insert_hash_node(bs, &phash->hash);
+ return 0;
+ }
+
+ logical_sect = sector_num & ~(s->cluster_sectors - 1);
+ ret = qcow2_get_cluster_offset(bs, logical_sect << 9,
+ &pnum, &existing_physical_offset);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* if we are rewriting the same cluster at the same place do nothing */
+ if (existing_physical_offset == hash_node->physical_sect << 9) {
+ return 1;
+ }
+
+ /* take care of not having refcount > 1 and QCOW_OFLAG_COPIED at once */
+ ret = qcow2_clear_l2_copied_flag_if_needed(bs, hash_node);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* do the deduplication */
+ ret = qcow2_deduplicate_cluster(bs, logical_sect,
+ hash_node);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ return 1;
+}
+
+
+static void add_hash_to_undedupable_list(BlockDriverState *bs,
+ QCowDedupState *ds)
+{
+ /* memorise hash for later storage in gtree and disk */
+ QCowHashElement *dedup_hash = qcow2_dedup_hash_new(&ds->phash.hash);
+ QTAILQ_INSERT_TAIL(&ds->undedupables, dedup_hash, next);
+}
+
+static int qcow2_dedup_starting_from_begining(BlockDriverState *bs,
+ QCowDedupState *ds,
+ uint64_t sector_num,
+ uint8_t *data,
+ int left_to_process)
+{
+ BDRVQcowState *s = bs->opaque;
+ int i;
+ int ret = 0;
+
+ for (i = 0; i < left_to_process; i++) {
+ ret = qcow2_try_dedup_cluster(bs,
+ &ds->phash,
+ sector_num + i * s->cluster_sectors,
+ data + i * s->cluster_size);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* stop if a cluster has not been deduplicated */
+ if (ret != 1) {
+ break;
+ }
+ }
+
+ return i;
+}
+
+static int qcow2_count_next_non_dedupable_clusters(BlockDriverState *bs,
+ QCowDedupState *ds,
+ uint8_t *data,
+ int left_to_process)
+{
+ BDRVQcowState *s = bs->opaque;
+ int i;
+ int ret = 0;
+ QCowHashNode *hash_node;
+
+ for (i = 0; i < left_to_process; i++) {
+ hash_node = qcow2_get_hash_node_for_cluster(bs,
+ &ds->phash,
+ data + i * s->cluster_size,
+ &ret);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* found a duplicated cluster : stop here */
+ if (hash_node) {
+ break;
+ }
+
+ qcow2_build_and_insert_hash_node(bs, &ds->phash.hash);
+ add_hash_to_undedupable_list(bs, ds);
+ }
+
+ return i;
+}
+
+
+/* Deduplicate all the clusters that can be deduplicated.
+ *
+ * Next, count the non-deduplicable sectors that follow, storing the hashes
+ * of their clusters in a linked list for later use.
+ * Then compute the hash of the first duplicated cluster that comes after the
+ * non-deduplicable clusters; this hash is reused at the next call of the
+ * function.
+ *
+ * @ds: a structure containing the state of the deduplication
+ * for this write request
+ * @sector_num: The logical sector
+ * @data: the buffer containing the data to deduplicate
+ * @data_nr: the size of the buffer in sectors
+ *
+ */
+int qcow2_dedup(BlockDriverState *bs,
+ QCowDedupState *ds,
+ uint64_t sector_num,
+ uint8_t *data,
+ int data_nr)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ int deduped_clusters_nr = 0;
+ int left_to_process;
+ int start_index;
+
+ start_index = sector_num & (s->cluster_sectors - 1);
+
+ left_to_process = (data_nr / s->cluster_sectors) -
+ ds->nb_clusters_processed;
+
+ data += ds->nb_clusters_processed * s->cluster_size;
+
+ /* start deduplicating all that can be cluster after cluster */
+ ret = qcow2_dedup_starting_from_begining(bs,
+ ds,
+ sector_num,
+ data,
+ left_to_process);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ deduped_clusters_nr = ret;
+
+ left_to_process -= ret;
+ ds->nb_clusters_processed += ret;
+ data += ret * s->cluster_size;
+
+ /* We deduped everything till the end */
+ if (!left_to_process) {
+ ds->nb_undedupable_sectors = 0;
+ goto exit;
+ }
+
+ /* skip and account the first undedupable cluster found */
+ left_to_process--;
+ ds->nb_clusters_processed++;
+ data += s->cluster_size;
+ ds->nb_undedupable_sectors += s->cluster_sectors;
+
+ add_hash_to_undedupable_list(bs, ds);
+
+ /* Count how many non-duplicated sectors can be written and memorize the
+ * hashes so they can be written after the data has reached disk.
+ */
+ ret = qcow2_count_next_non_dedupable_clusters(bs,
+ ds,
+ data,
+ left_to_process);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ left_to_process -= ret;
+ ds->nb_clusters_processed += ret;
+ ds->nb_undedupable_sectors += ret * s->cluster_sectors;
+
+ /* remember to reuse the last hash computed at new qcow2_dedup call */
+ if (left_to_process) {
+ ds->phash.reuse = true;
+ }
+
+exit:
+ if (!deduped_clusters_nr) {
+ return 0;
+ }
+
+ return deduped_clusters_nr * s->cluster_sectors - start_index;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 3dc9834..6194030 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -470,5 +470,10 @@ int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
int sectors_nr,
uint8_t **dedup_cluster_data,
int *dedup_cluster_data_nr);
+int qcow2_dedup(BlockDriverState *bs,
+ QCowDedupState *ds,
+ uint64_t sector_num,
+ uint8_t *data,
+ int data_nr);
#endif
--
1.7.10.4
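As a reading aid for the control flow of qcow2_dedup() above, here is a minimal standalone sketch of how one call splits the remaining clusters into a leading run of deduplicated clusters followed by a run of non-deduplicable ones. It is only an illustration: the is_dup array stands in for the hash tree lookups, DedupSplit and split_request are hypothetical names that do not appear in the patch, and the refcount and L2 updates are left out.

```c
#include <stdbool.h>

/* Hypothetical result type, not part of the patch. */
typedef struct {
    int deduped_clusters;      /* leading run that was deduplicated */
    int undedupable_clusters;  /* following run that must be written */
    int ret_sectors;           /* value qcow2_dedup() would return */
} DedupSplit;

/* is_dup[i] stands in for "a hash node already exists for cluster i". */
static DedupSplit split_request(const bool *is_dup, int left_to_process,
                                int cluster_sectors, int start_index)
{
    DedupSplit r = { 0, 0, 0 };
    int i = 0;

    /* phase 1: deduplicate cluster after cluster until one fails */
    while (i < left_to_process && is_dup[i]) {
        i++;
    }
    r.deduped_clusters = i;

    if (i < left_to_process) {
        /* phase 2: account the first undedupable cluster ... */
        i++;
        r.undedupable_clusters = 1;
        /* ... then count the following ones up to the next duplicate */
        while (i < left_to_process && !is_dup[i]) {
            i++;
            r.undedupable_clusters++;
        }
    }

    /* same formula as the end of qcow2_dedup() */
    if (r.deduped_clusters) {
        r.ret_sectors = r.deduped_clusters * cluster_sectors - start_index;
    }
    return r;
}
```

For a request covering five clusters flagged { dup, dup, new, new, dup } with 8 sectors per cluster and start_index 0, this yields two deduplicated clusters, two non-deduplicable clusters and a return value of 16 sectors. The last cluster is left for the next call, which is the case where the patch sets ds->phash.reuse.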
* [Qemu-devel] [RFC V7 08/32] qcow2: Add qcow2_dedup_store_new_hashes.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (6 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 07/32] qcow2: Add qcow2_dedup and related functions Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 09/32] qcow2: Do allocate on rewrite on the dedup case Benoît Canet
` (23 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 292 ++++++++++++++++++++++++++++++++++++++++++++++++++-
block/qcow2.h | 5 +
2 files changed, 296 insertions(+), 1 deletion(-)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 3ef34a9..3210d26 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -29,6 +29,12 @@
#include "qemu-common.h"
#include "qcow2.h"
+static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
+ QCowHash *hash,
+ uint64_t *first_logical_sect,
+ uint64_t physical_sect,
+ bool write);
+
/*
* Prepare a buffer containing everything required to compute cluster
* sized deduplication hashes.
@@ -288,7 +294,11 @@ static int qcow2_clear_l2_copied_flag_if_needed(BlockDriverState *bs,
/* remember that we don't need to clear QCOW_OFLAG_COPIED again */
hash_node->first_logical_sect = first_logical_sect;
- return 0;
+ /* clear the QCOW_OFLAG_COPIED flag from disk */
+ return qcow2_dedup_read_write_hash(bs, &hash_node->hash,
+ &hash_node->first_logical_sect,
+ hash_node->physical_sect,
+ true);
}
/* This function deduplicates a cluster
@@ -551,3 +561,283 @@ exit:
return deduped_clusters_nr * s->cluster_sectors - start_index;
}
+
+
+/* Create a deduplication table hash block, write its offset to disk and
+ * reference it in the RAM deduplication table
+ *
+ * sync this to disk and get the dedup cluster cache entry
+ *
+ * @index: index in the RAM deduplication table
+ * @ret: offset on success, negative on error
+ */
+static int64_t qcow2_create_dedup_block(BlockDriverState *bs,
+ int32_t index)
+{
+ BDRVQcowState *s = bs->opaque;
+ int64_t offset;
+ uint64_t data64;
+ int ret = 0;
+
+ /* allocate a new dedup table hash block */
+ offset = qcow2_alloc_clusters(bs, s->hash_block_size);
+
+ if (offset < 0) {
+ return offset;
+ }
+
+ ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ if (ret < 0) {
+ goto free_fail;
+ }
+
+ /* write the new block offset in the dedup table L1 */
+ data64 = cpu_to_be64(offset);
+ ret = bdrv_pwrite_sync(bs->file,
+ s->dedup_table_offset +
+ index * sizeof(uint64_t),
+ &data64, sizeof(data64));
+
+ if (ret < 0) {
+ goto free_fail;
+ }
+
+ s->dedup_table[index] = offset;
+
+ return offset;
+
+free_fail:
+ qcow2_free_clusters(bs, offset, s->hash_block_size);
+ return ret;
+}
+
+static int qcow2_create_and_get_block(BlockDriverState *bs,
+ uint32_t index,
+ uint8_t **block)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ int64_t offset;
+
+ offset = qcow2_create_dedup_block(bs, index);
+
+ if (offset < 0) {
+ return offset;
+ }
+
+
+ /* get an empty cluster from the dedup cache */
+ ret = qcow2_cache_get_empty(bs, s->dedup_cluster_cache,
+ offset,
+ (void **) block);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* clear it */
+ memset(*block, 0, s->hash_block_size);
+
+ return 0;
+}
+
+static inline bool qcow2_has_dedup_block(BlockDriverState *bs,
+ uint32_t index)
+{
+ BDRVQcowState *s = bs->opaque;
+ return s->dedup_table[index];
+}
+
+static inline void qcow2_set_hash_block_entry(BlockDriverState *bs,
+ uint8_t *block,
+ QCowHash *hash,
+ int offset,
+ uint64_t *logical_sect)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t first;
+ first = cpu_to_be64(*logical_sect);
+ memcpy(block + offset, hash->data, HASH_LENGTH);
+ memcpy(block + offset + HASH_LENGTH, &first, 8);
+ qcow2_cache_entry_mark_dirty(s->dedup_cluster_cache, block);
+}
+
+static inline uint64_t qcow2_read_hash_from_block(uint8_t *block,
+ QCowHash *hash,
+ int offset)
+{
+ uint64_t first;
+ memcpy(hash->data, block + offset, HASH_LENGTH);
+ memcpy(&first, block + offset + HASH_LENGTH, 8);
+ return be64_to_cpu(first);
+}
+
+/* Read/write a given hash and cluster_sect from/to the dedup table
+ *
+ * This function doesn't flush the dedup cache to disk
+ *
+ * @hash: the hash to read or store
+ * @first_logical_sect: logical sector
+ * @physical_sect: sector of the cluster in QCOW2 file (in sectors)
+ * @write: true to write, false to read
+ * @ret: 0 on success, -errno on error
+ */
+static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
+ QCowHash *hash,
+ uint64_t *first_logical_sect,
+ uint64_t physical_sect,
+ bool write)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint8_t *block = NULL;
+ int ret = 0;
+ int64_t cluster_number;
+ uint32_t index_in_dedup_table;
+ int offset_in_block;
+ int nb_hash_in_block = s->hash_block_size / (HASH_LENGTH + 8);
+
+ cluster_number = physical_sect / s->cluster_sectors;
+ index_in_dedup_table = cluster_number / nb_hash_in_block;
+
+ if (s->dedup_table_size <= index_in_dedup_table) {
+ return -ENOSPC;
+ }
+
+ /* if we must read and there is nothing to read return a null hash */
+ if (!qcow2_has_dedup_block(bs, index_in_dedup_table) && !write) {
+ memset(hash->data, 0, HASH_LENGTH);
+ *first_logical_sect = 0;
+ return 0;
+ }
+
+ if (qcow2_has_dedup_block(bs, index_in_dedup_table)) {
+ ret = qcow2_cache_get(bs,
+ s->dedup_cluster_cache,
+ s->dedup_table[index_in_dedup_table],
+ (void **) &block);
+ } else {
+ ret = qcow2_create_and_get_block(bs,
+ index_in_dedup_table,
+ &block);
+ }
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ offset_in_block = (cluster_number % nb_hash_in_block) *
+ (HASH_LENGTH + 8);
+
+ if (write) {
+ qcow2_set_hash_block_entry(bs,
+ block,
+ hash,
+ offset_in_block,
+ first_logical_sect);
+ } else {
+ *first_logical_sect = qcow2_read_hash_from_block(block,
+ hash,
+ offset_in_block);
+ }
+
+ qcow2_cache_put(bs, s->dedup_cluster_cache, (void **) &block);
+
+ return 0;
+}
+
+static inline bool is_hash_node_empty(QCowHashNode *hash_node)
+{
+ return hash_node->physical_sect & QCOW_DEDUP_FLAG_EMPTY;
+}
+
+/* This function stores hash information to disk and RAM
+ *
+ * @hash: the QCowHash to process
+ * @logical_sect: the logical sector of the cluster seen by the guest
+ * @physical_sect: the physical sector of the stored cluster
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_store_hash(BlockDriverState *bs,
+ QCowHash *hash,
+ uint64_t logical_sect,
+ uint64_t physical_sect)
+{
+ BDRVQcowState *s = bs->opaque;
+ QCowHashNode *hash_node;
+
+ hash_node = g_tree_lookup(s->dedup_tree_by_hash, hash);
+
+ /* no hash node found for this hash */
+ if (!hash_node) {
+ return 0;
+ }
+
+ /* the hash node information is already complete */
+ if (!is_hash_node_empty(hash_node)) {
+ return 0;
+ }
+
+ /* Remember that this QCowHashNode represents the first occurrence of the
+ * cluster so we will be able to clear QCOW_OFLAG_COPIED from the L2 table
+ * entry when the refcount goes above 1.
+ */
+ logical_sect = logical_sect | QCOW_OFLAG_COPIED;
+
+ /* fill the missing fields of the hash node */
+ hash_node->physical_sect = physical_sect;
+ hash_node->first_logical_sect = logical_sect;
+
+ /* write the hash to disk */
+ return qcow2_dedup_read_write_hash(bs,
+ hash,
+ &logical_sect,
+ physical_sect,
+ true);
+}
+
+/* This function stores the hashes of the clusters which are not duplicated
+ *
+ * @ds: The deduplication state
+ * @count: the number of dedup hash to process
+ * @logical_sect: logical offset of the first cluster (in sectors)
+ * @physical_sect: offset of the first cluster (in sectors)
+ * @ret: 0 on success, -errno on error
+ */
+int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
+ QCowDedupState *ds,
+ int count,
+ uint64_t logical_sect,
+ uint64_t physical_sect)
+{
+ int ret = 0;
+ int i = 0;
+ BDRVQcowState *s = bs->opaque;
+ QCowHashElement *dedup_hash, *next_dedup_hash;
+
+ /* round values on cluster boundaries for easier cluster deletion */
+ logical_sect = logical_sect & ~(s->cluster_sectors - 1);
+ physical_sect = physical_sect & ~(s->cluster_sectors - 1);
+
+ QTAILQ_FOREACH_SAFE(dedup_hash, &ds->undedupables, next, next_dedup_hash) {
+
+ ret = qcow2_store_hash(bs,
+ &dedup_hash->hash,
+ logical_sect + i * s->cluster_sectors,
+ physical_sect + i * s->cluster_sectors);
+
+ QTAILQ_REMOVE(&ds->undedupables, dedup_hash, next);
+ g_free(dedup_hash);
+
+ if (ret < 0) {
+ break;
+ }
+
+ i++;
+
+ if (i == count) {
+ break;
+ }
+ }
+
+ return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 6194030..7979fc2 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -475,5 +475,10 @@ int qcow2_dedup(BlockDriverState *bs,
uint64_t sector_num,
uint8_t *data,
int data_nr);
+int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
+ QCowDedupState *ds,
+ int count,
+ uint64_t logical_sect,
+ uint64_t physical_sect);
#endif
--
1.7.10.4
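The on-disk lookup done by qcow2_dedup_read_write_hash() in this patch comes down to a little arithmetic plus a fixed entry layout (the hash followed by a big-endian 64-bit first logical sector). The following standalone sketch mirrors that arithmetic outside QEMU. HASH_LENGTH = 32 is an assumption based on SHA-256, HashSlot, locate_hash_slot, store_entry and load_entry are hypothetical names, and the manual byte loops stand in for cpu_to_be64()/be64_to_cpu().

```c
#include <stdint.h>
#include <string.h>

#define HASH_LENGTH 32                 /* assumed: SHA-256 digest size */
#define ENTRY_SIZE  (HASH_LENGTH + 8)  /* hash + 64-bit first logical sector */

/* Hypothetical helper type, not part of the patch. */
typedef struct {
    uint32_t table_index;      /* index in the RAM deduplication table */
    int      offset_in_block;  /* byte offset inside the hash block */
} HashSlot;

/* Same arithmetic as qcow2_dedup_read_write_hash() */
static HashSlot locate_hash_slot(uint64_t physical_sect, int cluster_sectors,
                                 int hash_block_size)
{
    int nb_hash_in_block = hash_block_size / ENTRY_SIZE;
    uint64_t cluster_number = physical_sect / cluster_sectors;
    HashSlot slot;

    slot.table_index = (uint32_t)(cluster_number / nb_hash_in_block);
    slot.offset_in_block = (int)(cluster_number % nb_hash_in_block) *
                           ENTRY_SIZE;
    return slot;
}

/* Write the hash, then the sector in big-endian byte order */
static void store_entry(uint8_t *block, int offset, const uint8_t *hash,
                        uint64_t first_logical_sect)
{
    int i;

    memcpy(block + offset, hash, HASH_LENGTH);
    for (i = 0; i < 8; i++) {
        block[offset + HASH_LENGTH + i] =
            (uint8_t)(first_logical_sect >> (56 - 8 * i));
    }
}

/* Read the hash back and return the sector in host byte order */
static uint64_t load_entry(const uint8_t *block, int offset, uint8_t *hash)
{
    uint64_t first_logical_sect = 0;
    int i;

    memcpy(hash, block + offset, HASH_LENGTH);
    for (i = 0; i < 8; i++) {
        first_logical_sect = (first_logical_sect << 8) |
                             block[offset + HASH_LENGTH + i];
    }
    return first_logical_sect;
}
```

With 8 sectors per cluster and a hypothetical 65536-byte hash block, each block holds 1638 entries, so physical sector 40 (cluster 5) maps to table index 0 at byte offset 200, and cluster 1638 is the first entry of table index 1.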
* [Qemu-devel] [RFC V7 09/32] qcow2: Do allocate on rewrite on the dedup case.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (7 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 08/32] qcow2: Add qcow2_dedup_store_new_hashes Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 10/32] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
` (22 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This patch allocates a new cluster on every rewrite when deduplication is on.
This removes the need to delete the old hash from the lookup structure
when a cluster gets rewritten.
The old data is left in place and will be collected/deleted when its cluster's
refcount reaches 0.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-cluster.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 3354f39..fae4110 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -898,11 +898,17 @@ again:
cluster_offset = be64_to_cpu(l2_table[l2_index]);
+ /* If dedup is on we always allocate, whether it's a write or a
+ * rewrite. This greatly simplifies the cluster rewrite case since
+ * we don't need to remove the obsolete hash from the tree.
+ * Old clusters will be deleted when their refcount reaches 0.
+ */
/*
- * Check how many clusters are already allocated and don't need COW, and how
- * many need a new allocation.
+ * Check how many clusters are already allocated and don't need COW,
+ * and how many need a new allocation.
*/
- if (qcow2_get_cluster_type(cluster_offset) == QCOW2_CLUSTER_NORMAL
+ if (!s->has_dedup &&
+ qcow2_get_cluster_type(cluster_offset) == QCOW2_CLUSTER_NORMAL
&& (cluster_offset & QCOW_OFLAG_COPIED))
{
/* We keep all QCOW_OFLAG_COPIED clusters */
@@ -931,11 +937,11 @@ again:
cluster_offset &= L2E_OFFSET_MASK;
/*
- * The L2 table isn't used any more after this. As long as the cache works
- * synchronously, it's important to release it before calling
- * do_alloc_cluster_offset, which may yield if we need to wait for another
- * request to complete. If we still had the reference, we could use up the
- * whole cache with sleeping requests.
+ * The L2 table isn't used any more after this. As long as the cache
+ * works synchronously, it's important to release it before calling
+ * do_alloc_cluster_offset, which may yield if we need to wait for
+ * another request to complete. If we still had the reference, we could
+ * use up the whole cache with sleeping requests.
*/
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
--
1.7.10.4
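To make the effect of the added !s->has_dedup test concrete, here is a tiny standalone sketch of the decision the patched condition encodes. It is an illustration only: can_overwrite_in_place is a hypothetical name, the cluster type check is reduced to a boolean, and OFLAG_COPIED is a local stand-in for QCOW_OFLAG_COPIED.

```c
#include <stdbool.h>
#include <stdint.h>

#define OFLAG_COPIED (1ULL << 63)  /* local stand-in for QCOW_OFLAG_COPIED */

/* True when an existing cluster may be rewritten in place without a new
 * allocation. With dedup enabled the answer is always false, so every
 * rewrite allocates a fresh cluster and the old hash stays valid. */
static bool can_overwrite_in_place(bool has_dedup, bool is_normal_cluster,
                                   uint64_t l2_entry)
{
    return !has_dedup && is_normal_cluster && (l2_entry & OFLAG_COPIED) != 0;
}
```

Without dedup a normal cluster with the COPIED flag is reused as before. With dedup the same entry forces a new allocation, and the old cluster is reclaimed once its refcount drops to 0.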
* [Qemu-devel] [RFC V7 10/32] qcow2: Implement qcow2_compute_cluster_hash.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (8 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 09/32] qcow2: Do allocate on rewrite on the dedup case Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 11/32] qcow2: Add qcow2_dedup_grow_table and use it Benoît Canet
` (21 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Also factor out the libgnutls detection so that it is shared with VNC TLS.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 17 +++++++++-
configure | 86 +++++++++++++++++++++++++++++++++++++--------------
2 files changed, 79 insertions(+), 24 deletions(-)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 3210d26..089b999 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -28,6 +28,10 @@
#include "block/block_int.h"
#include "qemu-common.h"
#include "qcow2.h"
+#ifdef CONFIG_SHA256_DEDUP
+#include <gnutls/gnutls.h>
+#include <gnutls/crypto.h>
+#endif
static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
QCowHash *hash,
@@ -159,7 +163,18 @@ static int qcow2_compute_cluster_hash(BlockDriverState *bs,
QCowHash *hash,
uint8_t *data)
{
- return 0;
+ BDRVQcowState *s = bs->opaque;
+ switch (s->dedup_hash_algo) {
+#ifdef CONFIG_SHA256_DEDUP
+ case QCOW_HASH_SHA256:
+ return gnutls_hash_fast(GNUTLS_DIG_SHA256, data,
+ s->cluster_size, hash->data);
+#endif
+ default:
+ error_report("Invalid deduplication hash algorithm %i",
+ s->dedup_hash_algo);
+ abort();
+ }
}
/*
diff --git a/configure b/configure
index 84317c6..0b8c92e 100755
--- a/configure
+++ b/configure
@@ -228,6 +228,7 @@ glusterfs=""
virtio_blk_data_plane=""
gtk=""
gtkabi="2.0"
+sha256_dedup="yes"
# parse CC options first
for opt do
@@ -905,6 +906,10 @@ for opt do
;;
--with-gtkabi=*) gtkabi="$optarg"
;;
+ --disable-sha256-dedup) sha256_dedup="no"
+ ;;
+ --enable-sha256-dedup) sha256_dedup="yes"
+ ;;
*) echo "ERROR: unknown option $opt"; show_help="yes"
;;
esac
@@ -1158,6 +1163,8 @@ echo " --with-coroutine=BACKEND coroutine backend. Supported options:"
echo " gthread, ucontext, sigaltstack, windows"
echo " --enable-glusterfs enable GlusterFS backend"
echo " --disable-glusterfs disable GlusterFS backend"
+echo " --disable-sha256-dedup disable sha256 dedup"
+echo " --enable-sha256-dedup enable sha256 dedup"
echo " --enable-gcov enable test coverage analysis with gcov"
echo " --gcov=GCOV use specified gcov [$gcov_tool]"
echo ""
@@ -1763,32 +1770,60 @@ EOF
fi
##########################################
-# VNC TLS/WS detection
-if test "$vnc" = "yes" -a \( "$vnc_tls" != "no" -o "$vnc_ws" != "no" \) ; then
- cat > $TMPC <<EOF
+# gnutls detection (factorize the VNC TLS and SHA256 deduplication test)
+cat > $TMPC <<EOF
#include <gnutls/gnutls.h>
-int main(void) { gnutls_session_t s; gnutls_init(&s, GNUTLS_SERVER); return 0; }
+#include <gnutls/crypto.h>
+int main(void) {char data[4096], digest[32];
+gnutls_hash_fast(GNUTLS_DIG_SHA256, data, 4096, digest);
+return 0;
+}
EOF
- vnc_tls_cflags=`$pkg_config --cflags gnutls 2> /dev/null`
- vnc_tls_libs=`$pkg_config --libs gnutls 2> /dev/null`
- if compile_prog "$vnc_tls_cflags" "$vnc_tls_libs" ; then
- if test "$vnc_tls" != "no" ; then
- vnc_tls=yes
- fi
- if test "$vnc_ws" != "no" ; then
- vnc_ws=yes
- fi
- libs_softmmu="$vnc_tls_libs $libs_softmmu"
- QEMU_CFLAGS="$QEMU_CFLAGS $vnc_tls_cflags"
+gnu_tls_cflags=`$pkg_config --cflags gnutls 2> /dev/null`
+gnu_tls_libs=`$pkg_config --libs gnutls 2> /dev/null`
+if compile_prog "$gnu_tls_cflags" "$gnu_tls_libs" ; then
+ gnu_tls=yes
+else
+ gnu_tls=no
+fi
+
+##########################################
+# VNC TLS/WS
+if test "$vnc" = "yes" -a "$gnu_tls" != "no"; then
+ if test "$vnc_tls" != "no" ; then
+ libs_softmmu="$gnu_tls_libs $libs_softmmu"
+ libs_tools="$gnu_tls_libs $libs_tools"
+ QEMU_CFLAGS="$QEMU_CFLAGS $gnu_tls_cflags"
+ vnc_tls=yes
+ fi
+ if test "$vnc_ws" != "no" ; then
+ libs_softmmu="$gnu_tls_libs $libs_softmmu"
+ libs_tools="$gnu_tls_libs $libs_tools"
+ QEMU_CFLAGS="$QEMU_CFLAGS $gnu_tls_cflags"
+ vnc_ws=yes
+ fi
+else
+ if test "$vnc_tls" = "yes" ; then
+ feature_not_found "vnc-tls"
+ fi
+ if test "$vnc_ws" = "yes" ; then
+ feature_not_found "vnc-ws"
+ fi
+ vnc_tls=no
+ vnc_ws=no
+fi
+
+##########################################
+# SHA256 deduplication
+if test "$sha256_dedup" = "yes"; then
+ if test "$gnu_tls" = "yes"; then
+ libs_softmmu="$gnu_tls_libs $libs_softmmu"
+ libs_tools="$gnu_tls_libs $libs_tools"
+ QEMU_CFLAGS="$QEMU_CFLAGS $gnu_tls_cflags"
+ sha256_dedup=yes
else
- if test "$vnc_tls" = "yes" ; then
- feature_not_found "vnc-tls"
- fi
- if test "$vnc_ws" = "yes" ; then
- feature_not_found "vnc-ws"
- fi
- vnc_tls=no
- vnc_ws=no
+ echo "gnutls > 2.10.0 required to compile QEMU with sha256 deduplication"
+ exit 1
fi
fi
@@ -3418,6 +3453,7 @@ echo "seccomp support $seccomp"
echo "coroutine backend $coroutine_backend"
echo "GlusterFS support $glusterfs"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
+echo "sha256-dedup $sha256_dedup"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
@@ -3786,6 +3822,10 @@ if test "$virtio_blk_data_plane" = "yes" ; then
echo "CONFIG_VIRTIO_BLK_DATA_PLANE=y" >> $config_host_mak
fi
+if test "$sha256_dedup" = "yes" ; then
+ echo "CONFIG_SHA256_DEDUP=y" >> $config_host_mak
+fi
+
# USB host support
case "$usb" in
linux)
--
1.7.10.4
* [Qemu-devel] [RFC V7 11/32] qcow2: Add qcow2_dedup_grow_table and use it.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (9 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 10/32] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 12/32] qcow2: Makes qcow2_alloc_cluster_link_l2 mark to deduplicate clusters Benoît Canet
` (20 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++-
include/block/block.h | 4 ++
2 files changed, 103 insertions(+), 1 deletion(-)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 089b999..819c37e 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -40,6 +40,100 @@ static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
bool write);
/*
+ * Grow the deduplication table
+ *
+ * @min_size: minimal size
+ * @exact_size: if true force to grow to the exact size
+ * @ret: 0 on success, -errno on error
+ */
+static int qcow2_grow_dedup_table(BlockDriverState *bs, size_t min_size,
+ bool exact_size)
+{
+ BDRVQcowState *s = bs->opaque;
+ size_t table_size, table_size2;
+ int ret, i;
+ uint64_t *new_dedup_table;
+ int64_t table_offset;
+
+ if (min_size <= s->dedup_table_size) {
+ return 0;
+ }
+
+ if (exact_size) {
+ table_size = min_size;
+ } else {
+ /* Bump size up to reduce the number of times we have to grow */
+ table_size = s->dedup_table_size;
+ if (table_size == 0) {
+ table_size = 1;
+ }
+ while (min_size > table_size) {
+ table_size = (table_size * 3 + 1) / 2;
+ }
+ }
+
+#ifdef DEBUG_ALLOC2
+ fprintf(stderr, "grow dedup_table from %d to %d\n", s->dedup_table_size,
+ table_size);
+#endif
+
+ table_size2 = sizeof(uint64_t) * table_size;
+ new_dedup_table = g_malloc0(align_offset(table_size2, 512));
+ memcpy(new_dedup_table, s->dedup_table,
+ s->dedup_table_size * sizeof(uint64_t));
+
+ /* write new table (align to cluster) */
+ BLKDBG_EVENT(bs->file, BLKDBG_DEDUP_GROW_ALLOC_TABLE);
+ table_offset = qcow2_alloc_clusters(bs, table_size2);
+ if (table_offset < 0) {
+ g_free(new_dedup_table);
+ return table_offset;
+ }
+
+ ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ if (ret < 0) {
+ goto fail;
+ }
+
+ BLKDBG_EVENT(bs->file, BLKDBG_DEDUP_GROW_WRITE_TABLE);
+ for (i = 0; i < s->dedup_table_size; i++) {
+ new_dedup_table[i] = cpu_to_be64(new_dedup_table[i]);
+ }
+
+ ret = bdrv_pwrite_sync(bs->file, table_offset,
+ new_dedup_table, table_size2);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ for (i = 0; i < s->dedup_table_size; i++) {
+ new_dedup_table[i] = be64_to_cpu(new_dedup_table[i]);
+ }
+
+ g_free(s->dedup_table);
+ qcow2_free_clusters(bs, s->dedup_table_offset,
+ s->dedup_table_size * sizeof(uint64_t));
+
+ /* set new table */
+ s->dedup_table = new_dedup_table;
+ BLKDBG_EVENT(bs->file, BLKDBG_DEDUP_GROW_ACTIVATE_TABLE);
+ s->dedup_table_offset = table_offset;
+ s->dedup_table_size = table_size;
+ ret = qcow2_update_header(bs);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ return 0;
+fail:
+ g_free(new_dedup_table);
+ qcow2_free_clusters(bs, table_offset, table_size2);
+ return ret;
+}
+
+/*
* Prepare a buffer containing everything required to compute cluster
* sized deduplication hashes.
* If sector_num or nb_sectors are not cluster-aligned, missing data
@@ -715,7 +809,11 @@ static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
index_in_dedup_table = cluster_number / nb_hash_in_block;
if (s->dedup_table_size <= index_in_dedup_table) {
- return -ENOSPC;
+ ret = qcow2_grow_dedup_table(bs, index_in_dedup_table + 1, false);
+ }
+
+ if (ret < 0) {
+ return ret;
}
/* if we must read and there is nothing to read return a null hash */
diff --git a/include/block/block.h b/include/block/block.h
index 0f750d7..f109452 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -445,6 +445,10 @@ typedef enum {
BLKDBG_CLUSTER_ALLOC_BYTES,
BLKDBG_CLUSTER_FREE,
+ BLKDBG_DEDUP_GROW_ALLOC_TABLE,
+ BLKDBG_DEDUP_GROW_WRITE_TABLE,
+ BLKDBG_DEDUP_GROW_ACTIVATE_TABLE,
+
BLKDBG_EVENT_MAX,
} BlkDebugEvent;
--
1.7.10.4
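The sizing policy of qcow2_grow_dedup_table() can be tried in isolation. The sketch below reproduces only the size computation (grow by roughly 1.5x until min_size fits, or jump straight to min_size when exact_size is set). next_table_size is a hypothetical name, and the allocation, write and header update steps of the patch are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the number of table entries after a grow request.
 * Mirrors the size computation in qcow2_grow_dedup_table(). */
static size_t next_table_size(size_t current, size_t min_size,
                              bool exact_size)
{
    size_t table_size;

    /* already large enough: the patch returns without growing */
    if (min_size <= current) {
        return current;
    }

    if (exact_size) {
        return min_size;
    }

    /* bump the size up to reduce the number of times we have to grow */
    table_size = current ? current : 1;
    while (min_size > table_size) {
        table_size = (table_size * 3 + 1) / 2;
    }
    return table_size;
}
```

Starting from an empty table with min_size 10, the sequence is 1, 2, 3, 5, 8, 12, so the table ends up with 12 entries, while exact_size gives exactly 10.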
* [Qemu-devel] [RFC V7 12/32] qcow2: Makes qcow2_alloc_cluster_link_l2 mark to deduplicate clusters.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (10 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 11/32] qcow2: Add qcow2_dedup_grow_table and use it Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 13/32] qcow2: make the deduplication forget a cluster hash when a cluster is to dedupe Benoît Canet
` (19 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-cluster.c | 4 +++-
block/qcow2-dedup.c | 6 ++++++
block/qcow2.h | 3 +++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index fae4110..0d11ef0 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1013,7 +1013,9 @@ again:
.nb_sectors = avail_sectors - nb_sectors,
},
- .l2_entry_flags = QCOW_OFLAG_COPIED,
+ .l2_entry_flags = QCOW_OFLAG_COPIED |
+ (qcow2_must_deduplicate(bs) ?
+ QCOW_OFLAG_PENDING_DEDUP : 0),
.overwrite = false,
};
qemu_co_queue_init(&(*m)->dependent_requests);
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 819c37e..c106bd5 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -39,6 +39,12 @@ static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
uint64_t physical_sect,
bool write);
+bool qcow2_must_deduplicate(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ return s->has_dedup && s->dedup_status != DEDUP_STATUS_STARTED;
+}
+
/*
* Grow the deduplication table
*
diff --git a/block/qcow2.h b/block/qcow2.h
index 7979fc2..9f24b4c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -314,6 +314,8 @@ typedef struct QCowL2Meta
/* set to true if we are overwriting an L2 table entry */
bool overwrite;
+ /* set to true if the cluster must be tagged with QCOW_OFLAG_PENDING_DEDUP */
+ bool to_deduplicate;
/**
* The COW Region between the start of the first allocated cluster and the
@@ -464,6 +466,7 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
/* qcow2-dedup.c functions */
+bool qcow2_must_deduplicate(BlockDriverState *bs);
int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
QEMUIOVector *qiov,
uint64_t sector,
--
1.7.10.4
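A note on the l2_entry_flags initializer in this patch: it combines a bitwise OR with a conditional expression, and in C the | operator binds more tightly than ?:, so the conditional needs its own parentheses to get "COPIED plus an optional PENDING_DEDUP". The standalone sketch below shows the two readings side by side. OFLAG_PENDING_DEDUP uses a placeholder bit because the real value of QCOW_OFLAG_PENDING_DEDUP is not shown in this patch, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define OFLAG_COPIED        (1ULL << 63)  /* stand-in for QCOW_OFLAG_COPIED */
#define OFLAG_PENDING_DEDUP (1ULL << 61)  /* placeholder bit, assumption */

/* How the compiler reads "COPIED | must_dedup ? PENDING : 0":
 * the whole OR is the condition, and it is always non-zero. */
static uint64_t flags_as_parsed_without_parens(bool must_dedup)
{
    return (OFLAG_COPIED | (uint64_t)must_dedup) ? OFLAG_PENDING_DEDUP : 0;
}

/* Intended reading: always COPIED, plus PENDING_DEDUP when requested. */
static uint64_t flags_with_parens(bool must_dedup)
{
    return OFLAG_COPIED | (must_dedup ? OFLAG_PENDING_DEDUP : 0);
}
```

Without the inner parentheses the entry would always carry only the pending flag and never the COPIED flag, whatever qcow2_must_deduplicate() returns.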
* [Qemu-devel] [RFC V7 13/32] qcow2: make the deduplication forget a cluster hash when a cluster is to dedupe
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (11 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 12/32] qcow2: Makes qcow2_alloc_cluster_link_l2 mark to deduplicate clusters Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 14/32] qcow2: Create qcow2_is_cluster_to_dedup Benoît Canet
` (18 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-cluster.c | 11 +++++++++--
block/qcow2-dedup.c | 40 ++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 2 ++
3 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 0d11ef0..3cbb64f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -683,6 +683,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
for (i = 0; i < m->nb_clusters; i++) {
+ uint64_t offset = cluster_offset + (i << s->cluster_bits);
/* if two concurrent writes happen to the same unallocated cluster
* each write allocates separate cluster and writes data concurrently.
* The first one to complete updates l2 table with pointer to its
@@ -692,8 +693,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
if(l2_table[l2_index + i] != 0)
old_cluster[j++] = l2_table[l2_index + i];
- l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
- (i << s->cluster_bits)) | m->l2_entry_flags);
+ l2_table[l2_index + i] = cpu_to_be64(offset | m->l2_entry_flags);
+
+ /* make the deduplication forget the cluster so that the dedup
+ * never points to a cluster whose content has changed behind its back.
+ */
+ if (m->to_deduplicate) {
+ qcow2_dedup_forget_cluster_by_sector(bs, offset >> 9);
+ }
}
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index c106bd5..2aca01a 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -869,6 +869,46 @@ static inline bool is_hash_node_empty(QCowHashNode *hash_node)
return hash_node->physical_sect & QCOW_DEDUP_FLAG_EMPTY;
}
+/* This function removes a hash_node from the trees given a physical sector
+ *
+ * @physical_sect: The physical sector of the cluster corresponding to the hash
+ */
+static void qcow2_remove_hash_node_by_sector(BlockDriverState *bs,
+ uint64_t physical_sect)
+{
+ BDRVQcowState *s = bs->opaque;
+ QCowHash hash;
+ int ret = 0;
+ uint8_t *data = qemu_blockalign(bs, s->cluster_sectors * BDRV_SECTOR_SIZE);
+
+
+ /* read the cluster data */
+ ret = bdrv_pread(bs->file, physical_sect << 9, data, s->cluster_size);
+
+ if (ret < 0) {
+ goto free_exit;
+ }
+
+ ret = qcow2_compute_cluster_hash(bs,
+ &hash,
+ data);
+
+ if (ret < 0) {
+ goto free_exit;
+ }
+
+ g_tree_remove(s->dedup_tree_by_hash, &hash);
+
+free_exit:
+ qemu_vfree(data);
+}
+
+void qcow2_dedup_forget_cluster_by_sector(BlockDriverState *bs,
+ uint64_t physical_sect)
+{
+ qcow2_remove_hash_node_by_sector(bs, physical_sect);
+}
+
/* This function store a hash information to disk and RAM
*
* @hash: the QCowHash to process
diff --git a/block/qcow2.h b/block/qcow2.h
index 9f24b4c..2eab1a4 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -467,6 +467,8 @@ int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
/* qcow2-dedup.c functions */
bool qcow2_must_deduplicate(BlockDriverState *bs);
+void qcow2_dedup_forget_cluster_by_sector(BlockDriverState *bs,
+ uint64_t physical_sect);
int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
QEMUIOVector *qiov,
uint64_t sector,
--
1.7.10.4
* [Qemu-devel] [RFC V7 14/32] qcow2: Create qcow2_is_cluster_to_dedup.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (12 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 13/32] qcow2: make the deduplication forget a cluster hash when a cluster is to dedupe Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 15/32] qcow2: Load and save deduplication table header extension Benoît Canet
` (17 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
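Note: the lookup in qcow2_is_cluster_to_dedup() splits a logical cluster index
into an L1 index and an L2 index, then converts the byte offset found in the
L2 entry into a 512-byte sector with ">> 9". The arithmetic can be sketched on
its own as below. ClusterPath and both helpers are hypothetical names chosen
for this illustration; l2_bits is assumed to be log2 of the number of entries
in one L2 table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where a logical cluster lives in the tables. */
typedef struct {
    uint64_t l1_index; /* which L2 table */
    uint64_t l2_index; /* which entry inside that L2 table */
} ClusterPath;

/* Splits a logical cluster index, as the patch does with
 * "index >> s->l2_bits" and "index & (s->l2_size - 1)". */
static ClusterPath split_cluster_index(uint64_t index, unsigned l2_bits)
{
    ClusterPath p;
    uint64_t l2_size;

    assert(l2_bits > 0 && l2_bits < 64);
    l2_size = UINT64_C(1) << l2_bits;
    p.l1_index = index >> l2_bits;
    p.l2_index = index & (l2_size - 1);
    return p;
}

/* Converts a byte offset in the image file into a 512-byte sector number. */
static uint64_t offset_to_sector(uint64_t offset)
{
    return offset >> 9;
}
```

With 64KB L2 tables of 8-byte entries (l2_bits = 13), cluster index 8193 is
entry 1 of the second L2 table.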
block/qcow2-cluster.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 4 ++++
2 files changed, 56 insertions(+)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 3cbb64f..bc42fc6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -488,6 +488,58 @@ out:
return ret;
}
+/* Check whether a cluster is to be deduplicated, given its index
+ *
+ * @index: The logical index of the cluster, starting from 0
+ * @physical_sect: The physical sector of the cluster, as a return value
+ * @err: 0 on success, negative on error
+ * @ret: True if the cluster is to be deduplicated, else false
+ */
+bool qcow2_is_cluster_to_dedup(BlockDriverState *bs,
+ uint64_t index,
+ uint64_t *physical_sect,
+ int *err)
+{
+ BDRVQcowState *s = bs->opaque;
+ unsigned int l1_index, l2_index;
+ uint64_t offset;
+ uint64_t l2_offset;
+ uint64_t *l2_table = NULL;
+
+ *physical_sect = 0;
+ *err = 0;
+
+ l1_index = index >> s->l2_bits;
+
+ if (l1_index >= s->l1_size) {
+ return false;
+ }
+
+ /* no l1 entry */
+ if (!(s->l1_table[l1_index] & L1E_OFFSET_MASK)) {
+ return false;
+ }
+
+ l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+
+ *err = l2_load(bs, l2_offset, &l2_table);
+ if (*err < 0) {
+ return false;
+ }
+
+ l2_index = index & (s->l2_size - 1);
+
+ offset = be64_to_cpu(l2_table[l2_index]);
+ *physical_sect = (offset & L2E_OFFSET_MASK) >> 9;
+
+ *err = qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
+ if (*err < 0) {
+ return false;
+ }
+
+ return offset & QCOW_OFLAG_PENDING_DEDUP;
+}
+
/*
* get_cluster_table
*
diff --git a/block/qcow2.h b/block/qcow2.h
index 2eab1a4..59c4881 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -438,6 +438,10 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors);
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
+bool qcow2_is_cluster_to_dedup(BlockDriverState *bs,
+ uint64_t index,
+ uint64_t *physical_sect,
+ int *err);
/* qcow2-snapshot.c functions */
int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
--
1.7.10.4
* [Qemu-devel] [RFC V7 15/32] qcow2: Load and save deduplication table header extension.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (13 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 14/32] qcow2: Create qcow2_is_cluster_to_dedup Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 16/32] qcow2: Extract qcow2_do_table_init Benoît Canet
` (16 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
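Note: patch 01 specifies the extension as big-endian fields at fixed byte
positions (0-7 offset, 8-11 size, 12 hash algorithm, 13 strategies bitmap).
QCowDedupTableExtension is not declared packed, so sizeof() may include
padding on common ABIs. One way to stay independent of struct layout is to
encode and decode byte by byte, as in the sketch below. DedupExtFields and
all function names are hypothetical, and only the 14 bytes listed above are
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEDUP_EXT_FIXED_LEN 14 /* bytes 0-13 of the extension */

/* Hypothetical host-order view of the extension fields. */
typedef struct {
    uint64_t table_offset;
    uint32_t table_size;
    uint8_t hash_algo;
    uint8_t strategies;
} DedupExtFields;

/* Stores the low nbytes of v at dst, most significant byte first. */
static void put_be(uint8_t *dst, uint64_t v, size_t nbytes)
{
    size_t i;

    assert(nbytes <= 8);
    for (i = 0; i < nbytes; i++) {
        dst[nbytes - 1 - i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Reads an nbytes-wide big-endian integer from src. */
static uint64_t get_be(const uint8_t *src, size_t nbytes)
{
    uint64_t v = 0;
    size_t i;

    assert(nbytes <= 8);
    for (i = 0; i < nbytes; i++) {
        v = (v << 8) | src[i];
    }
    return v;
}

/* Writes the fixed fields into buf (at least DEDUP_EXT_FIXED_LEN bytes). */
static void dedup_ext_encode(const DedupExtFields *f, uint8_t *buf)
{
    put_be(buf, f->table_offset, 8);
    put_be(buf + 8, f->table_size, 4);
    buf[12] = f->hash_algo;
    buf[13] = f->strategies;
}

/* Returns 0 on success, -1 when len cannot hold the fixed fields. */
static int dedup_ext_decode(const uint8_t *buf, size_t len, DedupExtFields *f)
{
    if (len < DEDUP_EXT_FIXED_LEN) {
        return -1;
    }
    f->table_offset = get_be(buf, 8);
    f->table_size = (uint32_t)get_be(buf + 8, 4);
    f->hash_algo = buf[12];
    f->strategies = buf[13];
    return 0;
}

/* Encodes then decodes; returns 1 when every field survives the trip. */
static int dedup_ext_roundtrip_ok(const DedupExtFields *in)
{
    uint8_t buf[DEDUP_EXT_FIXED_LEN];
    DedupExtFields out;

    dedup_ext_encode(in, buf);
    if (dedup_ext_decode(buf, sizeof(buf), &out) < 0) {
        return 0;
    }
    return out.table_offset == in->table_offset &&
           out.table_size == in->table_size &&
           out.hash_algo == in->hash_algo &&
           out.strategies == in->strategies;
}
```

Rejecting a too-short extension in the decoder also avoids reading fields
that were never filled in when ext.len is smaller than expected.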
block/qcow2.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/block/qcow2.c b/block/qcow2.c
index ca38cc3..eaddcb6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -53,9 +53,18 @@ typedef struct {
uint32_t len;
} QCowExtension;
+typedef struct {
+ uint64_t offset;
+ int32_t size;
+ uint8_t hash_algo;
+ uint8_t strategies;
+ char reserved[56];
+} QCowDedupTableExtension;
+
#define QCOW2_EXT_MAGIC_END 0
#define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
#define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
+#define QCOW2_EXT_MAGIC_DEDUP_TABLE 0xCD8E819B
static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
{
@@ -84,6 +93,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
QCowExtension ext;
uint64_t offset;
int ret;
+ QCowDedupTableExtension dedup_table_extension;
#ifdef DEBUG_EXT
printf("qcow2_read_extensions: start=%ld end=%ld\n", start_offset, end_offset);
@@ -148,6 +158,25 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
}
break;
+ case QCOW2_EXT_MAGIC_DEDUP_TABLE:
+ if (ext.len > sizeof(dedup_table_extension)) {
+ fprintf(stderr, "ERROR: dedup_table_extension: len=%u too large"
+ " (>%zu)\n",
+ ext.len, sizeof(dedup_table_extension));
+ return 2;
+ }
+ ret = bdrv_pread(bs->file, offset,
+ &dedup_table_extension, ext.len);
+ if (ret < 0) {
+ return ret;
+ }
+ s->dedup_table_offset =
+ be64_to_cpu(dedup_table_extension.offset);
+ s->dedup_table_size =
+ be32_to_cpu(dedup_table_extension.size);
+ s->dedup_hash_algo = dedup_table_extension.hash_algo;
+ break;
+
default:
/* unknown magic - save it in case we need to rewrite the header */
{
@@ -959,6 +988,7 @@ int qcow2_update_header(BlockDriverState *bs)
uint32_t refcount_table_clusters;
size_t header_length;
Qcow2UnknownHeaderExtension *uext;
+ QCowDedupTableExtension dedup_table_extension;
buf = qemu_blockalign(bs, buflen);
@@ -1062,6 +1092,25 @@ int qcow2_update_header(BlockDriverState *bs)
buf += ret;
buflen -= ret;
+ if (s->has_dedup) {
+ memset(&dedup_table_extension, 0, sizeof(dedup_table_extension));
+ dedup_table_extension.offset = cpu_to_be64(s->dedup_table_offset);
+ dedup_table_extension.size = cpu_to_be32(s->dedup_table_size);
+ dedup_table_extension.hash_algo = s->dedup_hash_algo;
+ dedup_table_extension.strategies |= QCOW_DEDUP_STRATEGY_RAM;
+ dedup_table_extension.strategies |= QCOW_DEDUP_STRATEGY_RUNNING;
+ ret = header_ext_add(buf,
+ QCOW2_EXT_MAGIC_DEDUP_TABLE,
+ &dedup_table_extension,
+ sizeof(dedup_table_extension),
+ buflen);
+ if (ret < 0) {
+ goto fail;
+ }
+ buf += ret;
+ buflen -= ret;
+ }
+
/* Keep unknown header extensions */
QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
--
1.7.10.4
* [Qemu-devel] [RFC V7 16/32] qcow2: Extract qcow2_do_table_init.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (14 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 15/32] qcow2: Load and save deduplication table header extension Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 17/32] qcow2-cache: Allow to choose table size at creation Benoît Canet
` (15 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
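Note: qcow2_do_table_init() reads a table of big-endian 64-bit entries and
converts it to host byte order in place. The conversion step can be sketched
without QEMU as below. be64_entry() and table_to_host() are hypothetical
names, and returning -EIO on a short read is an assumption of this sketch;
the patch keeps the pre-existing -ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Reads the i-th big-endian 64-bit entry from a raw on-disk table. */
static uint64_t be64_entry(const uint8_t *raw, size_t i)
{
    uint64_t v = 0;
    size_t b;

    for (b = 0; b < 8; b++) {
        v = (v << 8) | raw[i * 8 + b];
    }
    return v;
}

/* Converts a raw table of `entries` big-endian values into host order.
 * bytes_read is what the read call returned.
 * Returns 0 on success, -EIO when the read was short (sketch assumption). */
static int table_to_host(const uint8_t *raw, size_t bytes_read,
                         uint64_t *table, size_t entries)
{
    size_t i;

    assert(raw != NULL && table != NULL);
    if (entries > SIZE_MAX / sizeof(uint64_t) ||
        bytes_read != entries * sizeof(uint64_t)) {
        return -EIO;
    }
    for (i = 0; i < entries; i++) {
        table[i] = be64_entry(raw, i);
    }
    return 0;
}
```

Decoding byte by byte gives the same result on little-endian and big-endian
hosts, which is what be64_to_cpus() achieves in the patch.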
block/qcow2-refcount.c | 43 ++++++++++++++++++++++++++++++-------------
block/qcow2.h | 5 +++++
2 files changed, 35 insertions(+), 13 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index e12b58c..b2b3031 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -35,27 +35,44 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
/*********************************************************/
/* refcount handling */
-int qcow2_refcount_init(BlockDriverState *bs)
+int qcow2_do_table_init(BlockDriverState *bs,
+ uint64_t **table,
+ int64_t offset,
+ int size,
+ bool is_refcount)
{
- BDRVQcowState *s = bs->opaque;
- int ret, refcount_table_size2, i;
-
- refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
- s->refcount_table = g_malloc(refcount_table_size2);
- if (s->refcount_table_size > 0) {
- BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
- ret = bdrv_pread(bs->file, s->refcount_table_offset,
- s->refcount_table, refcount_table_size2);
- if (ret != refcount_table_size2)
+ int ret, size2, i;
+
+ size2 = size * sizeof(uint64_t);
+ *table = g_malloc(size2);
+ if (size > 0) {
+ if (is_refcount) {
+ BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
+ }
+ ret = bdrv_pread(bs->file, offset,
+ *table, size2);
+ if (ret != size2) {
goto fail;
- for(i = 0; i < s->refcount_table_size; i++)
- be64_to_cpus(&s->refcount_table[i]);
+ }
+ for (i = 0; i < size; i++) {
+ be64_to_cpus(&(*table)[i]);
+ }
}
return 0;
fail:
return -ENOMEM;
}
+int qcow2_refcount_init(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ return qcow2_do_table_init(bs,
+ &s->refcount_table,
+ s->refcount_table_offset,
+ s->refcount_table_size,
+ true);
+}
+
void qcow2_refcount_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
diff --git a/block/qcow2.h b/block/qcow2.h
index 59c4881..ab3a3c5 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -395,6 +395,11 @@ int qcow2_read_cluster_data(BlockDriverState *bs,
int nb_sectors);
/* qcow2-refcount.c functions */
+int qcow2_do_table_init(BlockDriverState *bs,
+ uint64_t **table,
+ int64_t offset,
+ int size,
+ bool is_refcount);
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
--
1.7.10.4
* [Qemu-devel] [RFC V7 17/32] qcow2-cache: Allow to choose table size at creation.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (15 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 16/32] qcow2: Extract qcow2_do_table_init Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 18/32] qcow2: Extract qcow2_set_incompat_feature and qcow2_clear_incompat_feature Benoît Canet
` (14 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-cache.c | 12 +++++++-----
block/qcow2.c | 5 +++--
block/qcow2.h | 3 ++-
3 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 2f3114e..83f2814 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -40,20 +40,22 @@ struct Qcow2Cache {
struct Qcow2Cache* depends;
int size;
bool depends_on_flush;
+ int table_size;
};
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
+Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
+ int table_size)
{
- BDRVQcowState *s = bs->opaque;
Qcow2Cache *c;
int i;
c = g_malloc0(sizeof(*c));
c->size = num_tables;
c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
+ c->table_size = table_size;
for (i = 0; i < c->size; i++) {
- c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
+ c->entries[i].table = qemu_blockalign(bs, c->table_size);
}
return c;
@@ -121,7 +123,7 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
}
ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
- s->cluster_size);
+ c->table_size);
if (ret < 0) {
return ret;
}
@@ -253,7 +255,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
}
- ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
+ ret = bdrv_pread(bs->file, offset, c->entries[i].table, c->table_size);
if (ret < 0) {
return ret;
}
diff --git a/block/qcow2.c b/block/qcow2.c
index eaddcb6..6d693ac 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -459,8 +459,9 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
/* alloc L2 table/refcount block cache */
- s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
- s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
+ s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE, s->cluster_size);
+ s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE,
+ s->cluster_size);
s->cluster_cache = g_malloc(s->cluster_size);
/* one more sector for decompressed data alignment */
diff --git a/block/qcow2.h b/block/qcow2.h
index ab3a3c5..1493276 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -459,7 +459,8 @@ void qcow2_free_snapshots(BlockDriverState *bs);
int qcow2_read_snapshots(BlockDriverState *bs);
/* qcow2-cache.c functions */
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
+Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
+ int table_size);
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
--
1.7.10.4
* [Qemu-devel] [RFC V7 18/32] qcow2: Extract qcow2_set_incompat_feature and qcow2_clear_incompat_feature.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (16 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 17/32] qcow2-cache: Allow to choose table size at creation Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 19/32] block: Add qcow2_dedup format and image creation code Benoît Canet
` (13 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Also change callers.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
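Note: the generalisation in this patch boils down to passing the feature bit
as a parameter instead of hard-coding the dirty bit. The bit manipulation on
its own looks roughly like the sketch below. The enum mirrors the values in
qcow2.h, but the type and function names are hypothetical, and the header
write and flush done by the real functions are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the incompatible feature bits of qcow2.h (names shortened). */
typedef enum {
    INCOMPAT_DIRTY_BITNR = 0,
    INCOMPAT_DIRTY = 1 << INCOMPAT_DIRTY_BITNR,
    INCOMPAT_DEDUP_BITNR = 1,
    INCOMPAT_DEDUP = 1 << INCOMPAT_DEDUP_BITNR
} IncompatFeature;

/* Returns non-zero when the feature bit is present in the mask. */
static int incompat_has(uint64_t features, IncompatFeature f)
{
    assert(f != 0);
    return (features & (uint64_t)f) != 0;
}

/* Returns the mask with the feature bit set. */
static uint64_t incompat_set(uint64_t features, IncompatFeature f)
{
    assert(f != 0);
    return features | (uint64_t)f;
}

/* Returns the mask with the feature bit cleared. */
static uint64_t incompat_clear(uint64_t features, IncompatFeature f)
{
    assert(f != 0);
    return features & ~(uint64_t)f;
}
```

Setting the dedup bit leaves the dirty bit untouched and vice versa, so the
two features can be toggled independently.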
block/qcow2-cluster.c | 2 +-
block/qcow2.c | 43 ++++++++++++++++++++++---------------------
block/qcow2.h | 7 ++++---
3 files changed, 27 insertions(+), 25 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index bc42fc6..1008df8 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -721,7 +721,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
/* Update L2 table. */
if (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS) {
- qcow2_mark_dirty(bs);
+ qcow2_set_incompat_feature(bs, QCOW2_INCOMPAT_DIRTY);
}
if (qcow2_need_accurate_refcounts(s)) {
qcow2_cache_set_dependency(bs, s->l2_table_cache,
diff --git a/block/qcow2.c b/block/qcow2.c
index 6d693ac..1210780 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -245,56 +245,57 @@ static void report_unsupported_feature(BlockDriverState *bs,
}
/*
- * Sets the dirty bit and flushes afterwards if necessary.
+ * Sets an incompatible feature bit and flushes afterwards if necessary.
*
* The incompatible_features bit is only set if the image file header was
* updated successfully. Therefore it is not required to check the return
* value of this function.
*/
-int qcow2_mark_dirty(BlockDriverState *bs)
+int qcow2_set_incompat_feature(BlockDriverState *bs,
+ QCow2IncompatibleFeature feature)
{
BDRVQcowState *s = bs->opaque;
uint64_t val;
- int ret;
+ int ret = 0;
assert(s->qcow_version >= 3);
- if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
- return 0; /* already dirty */
+ if (s->incompatible_features & feature) {
+ return 0; /* already added */
}
- val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
+ val = cpu_to_be64(s->incompatible_features | feature);
ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features),
&val, sizeof(val));
if (ret < 0) {
return ret;
}
- ret = bdrv_flush(bs->file);
- if (ret < 0) {
- return ret;
- }
- /* Only treat image as dirty if the header was updated successfully */
- s->incompatible_features |= QCOW2_INCOMPAT_DIRTY;
+ /* Only treat image as having the feature if the header was updated
+ * successfully
+ */
+ s->incompatible_features |= feature;
return 0;
}
/*
- * Clears the dirty bit and flushes before if necessary. Only call this
- * function when there are no pending requests, it does not guard against
- * concurrent requests dirtying the image.
+ * Clears an incompatible feature bit and flushes before if necessary.
+ * Only call this function when there are no pending requests, it does not
+ * guard against concurrent requests adding a feature to the image.
*/
-static int qcow2_mark_clean(BlockDriverState *bs)
+static int qcow2_clear_incompat_feature(BlockDriverState *bs,
+ QCow2IncompatibleFeature feature)
{
BDRVQcowState *s = bs->opaque;
+ int ret = 0;
- if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
- int ret = bdrv_flush(bs);
+ if (s->incompatible_features & feature) {
+ ret = bdrv_flush(bs);
if (ret < 0) {
return ret;
}
- s->incompatible_features &= ~QCOW2_INCOMPAT_DIRTY;
+ s->incompatible_features &= ~feature;
return qcow2_update_header(bs);
}
return 0;
@@ -309,7 +310,7 @@ static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
}
if (fix && result->check_errors == 0 && result->corruptions == 0) {
- return qcow2_mark_clean(bs);
+ return qcow2_clear_incompat_feature(bs, QCOW2_INCOMPAT_DIRTY);
}
return ret;
}
@@ -906,7 +907,7 @@ static void qcow2_close(BlockDriverState *bs)
qcow2_cache_flush(bs, s->l2_table_cache);
qcow2_cache_flush(bs, s->refcount_block_cache);
- qcow2_mark_clean(bs);
+ qcow2_clear_incompat_feature(bs, QCOW2_INCOMPAT_DIRTY);
qcow2_cache_destroy(bs, s->l2_table_cache);
qcow2_cache_destroy(bs, s->refcount_block_cache);
diff --git a/block/qcow2.h b/block/qcow2.h
index 1493276..fd48243 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -172,14 +172,14 @@ enum {
};
/* Incompatible feature bits */
-enum {
+typedef enum {
QCOW2_INCOMPAT_DIRTY_BITNR = 0,
QCOW2_INCOMPAT_DIRTY = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
QCOW2_INCOMPAT_DEDUP_BITNR = 1,
QCOW2_INCOMPAT_DEDUP = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
-};
+} QCow2IncompatibleFeature;
/* Compatible feature bits */
enum {
@@ -387,7 +387,8 @@ static inline bool qcow2_need_accurate_refcounts(BDRVQcowState *s)
int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t sector_num, int nb_sectors);
-int qcow2_mark_dirty(BlockDriverState *bs);
+int qcow2_set_incompat_feature(BlockDriverState *bs,
+ QCow2IncompatibleFeature feature);
int qcow2_update_header(BlockDriverState *bs);
int qcow2_read_cluster_data(BlockDriverState *bs,
uint8_t *data,
--
1.7.10.4
* [Qemu-devel] [RFC V7 19/32] block: Add qcow2_dedup format and image creation code.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (17 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 18/32] qcow2: Extract qcow2_set_incompat_feature and qcow2_clear_incompat_feature Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 20/32] qcow2: Drop hash for a given cluster when dedup makes refcount > 2^16/2 Benoît Canet
` (12 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Also update qemu-iotests.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
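Note: at creation time the image starts with the header in cluster 0, the
refcount table in cluster 1 and, when deduplication is requested, the dedup
table in cluster 2. The resulting offsets and sizes can be worked out as in
the sketch below. InitialLayout and initial_layout() are hypothetical names
used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of the metadata written by qcow2_create2(). */
typedef struct {
    uint64_t refcount_table_offset; /* always cluster 1 */
    uint64_t dedup_table_offset;    /* cluster 2, or 0 without dedup */
    uint64_t dedup_table_entries;   /* number of 64-bit pointers */
    uint64_t metadata_bytes;        /* header + tables to allocate */
} InitialLayout;

/* Computes the initial layout for a given cluster size. */
static InitialLayout initial_layout(uint64_t cluster_size, int dedup)
{
    InitialLayout l;
    uint64_t tables = dedup ? 2 : 1; /* refcount table (+ dedup table) */

    /* cluster sizes are powers of two, at least one sector */
    assert(cluster_size >= 512 && (cluster_size & (cluster_size - 1)) == 0);

    l.refcount_table_offset = cluster_size;
    l.dedup_table_offset = dedup ? 2 * cluster_size : 0;
    l.dedup_table_entries = dedup ? cluster_size / sizeof(uint64_t) : 0;
    l.metadata_bytes = (tables + 1) * cluster_size; /* +1 for the header */
    return l;
}
```

For a 4KB cluster size with deduplication, the dedup table sits at offset
8192, holds 512 pointers, and 12288 bytes of metadata are allocated.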
block/qcow2.c | 185 +++++++++++++++++++++++++++++++++++++++---
include/block/block_int.h | 1 +
tests/qemu-iotests/common.rc | 3 +-
3 files changed, 175 insertions(+), 14 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 1210780..9032dfc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1263,7 +1263,8 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, int prealloc,
- QEMUOptionParameter *options, int version)
+ QEMUOptionParameter *options, int version,
+ bool dedup, uint8_t hash_algo)
{
/* Calculate cluster_bits */
int cluster_bits;
@@ -1291,7 +1292,8 @@ static int qcow2_create2(const char *filename, int64_t total_size,
*/
BlockDriverState* bs;
QCowHeader header;
- uint8_t* refcount_table;
+ uint8_t *tables;
+ int size;
int ret;
ret = bdrv_create_file(filename, options);
@@ -1333,10 +1335,11 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
- /* Write an empty refcount table */
- refcount_table = g_malloc0(cluster_size);
- ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
- g_free(refcount_table);
+ /* Write an empty refcount table + extra space for dedup table if needed */
+ size = dedup ? 2 : 1;
+ tables = g_malloc0(size * cluster_size);
+ ret = bdrv_pwrite(bs, cluster_size, tables, size * cluster_size);
+ g_free(tables);
if (ret < 0) {
goto out;
@@ -1347,7 +1350,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/*
* And now open the image and make it consistent first (i.e. increase the
* refcount of the cluster that is occupied by the header and the refcount
- * table)
+ * table and, if present, the dedup table)
*/
BlockDriver* drv = bdrv_find_format("qcow2");
assert(drv != NULL);
@@ -1357,7 +1360,8 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
- ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
+ size++; /* Add a cluster for the header */
+ ret = qcow2_alloc_clusters(bs, size * cluster_size);
if (ret < 0) {
goto out;
@@ -1367,11 +1371,31 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
+ BDRVQcowState *s = bs->opaque;
ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
if (ret < 0) {
goto out;
}
+ if (dedup) {
+ s->has_dedup = true;
+ s->dedup_table_offset = cluster_size * 2;
+ s->dedup_table_size = cluster_size / sizeof(uint64_t);
+ s->dedup_hash_algo = hash_algo;
+
+ ret = qcow2_set_incompat_feature(bs, QCOW2_INCOMPAT_DEDUP);
+ if (ret < 0) {
+ goto out;
+ }
+
+ s->dedup_status = DEDUP_STATUS_STARTED;
+ ret = qcow2_update_header(bs);
+ s->dedup_status = DEDUP_STATUS_STOPPED;
+ if (ret < 0) {
+ goto out;
+ }
+ }
+
/* Want a backing file? There you go.*/
if (backing_file) {
ret = bdrv_change_backing_file(bs, backing_file, backing_format);
@@ -1397,15 +1421,41 @@ out:
return ret;
}
+static int qcow2_warn_if_version_3_is_needed(int version,
+ bool has_feature,
+ const char *feature)
+{
+ if (version < 3 && has_feature) {
+ fprintf(stderr, "%s only supported with compatibility "
+ "level 1.1 and above (use compat=1.1 or greater)\n",
+ feature);
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static int8_t qcow2_get_dedup_hash_algo(char *value)
+{
+ if (!value || !strcmp(value, "sha256")) {
+ return QCOW_HASH_SHA256;
+ }
+
+ error_printf("Unsupported deduplication hash algorithm.\n");
+ return -EINVAL;
+}
+
static int qcow2_create(const char *filename, QEMUOptionParameter *options)
{
const char *backing_file = NULL;
const char *backing_fmt = NULL;
uint64_t sectors = 0;
int flags = 0;
+ int ret;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
int prealloc = 0;
int version = 2;
+ bool dedup = false;
+ int8_t hash_algo = 0;
/* Read out options */
while (options && options->name) {
@@ -1443,6 +1493,13 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
}
} else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) {
flags |= options->value.n ? BLOCK_FLAG_LAZY_REFCOUNTS : 0;
+ } else if (!strcmp(options->name, BLOCK_OPT_DEDUP)) {
+ hash_algo = qcow2_get_dedup_hash_algo(options->value.s);
+ if (hash_algo < 0) {
+ return hash_algo;
+ }
+ dedup = true;
+ version = 3;
}
options++;
}
@@ -1453,14 +1510,22 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
return -EINVAL;
}
- if (version < 3 && (flags & BLOCK_FLAG_LAZY_REFCOUNTS)) {
- fprintf(stderr, "Lazy refcounts only supported with compatibility "
- "level 1.1 and above (use compat=1.1 or greater)\n");
- return -EINVAL;
+ ret = qcow2_warn_if_version_3_is_needed(version,
+ flags & BLOCK_FLAG_LAZY_REFCOUNTS,
+ "Lazy refcounts");
+ if (ret < 0) {
+ return ret;
+ }
+ ret = qcow2_warn_if_version_3_is_needed(version,
+ dedup,
+ "Deduplication");
+ if (ret < 0) {
+ return ret;
}
return qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
- cluster_size, prealloc, options, version);
+ cluster_size, prealloc, options, version,
+ dedup, hash_algo);
}
static int qcow2_make_empty(BlockDriverState *bs)
@@ -1766,6 +1831,51 @@ static QEMUOptionParameter qcow2_create_options[] = {
{ NULL }
};
+static QEMUOptionParameter qcow2_dedup_create_options[] = {
+ {
+ .name = BLOCK_OPT_SIZE,
+ .type = OPT_SIZE,
+ .help = "Virtual disk size"
+ },
+ {
+ .name = BLOCK_OPT_BACKING_FILE,
+ .type = OPT_STRING,
+ .help = "File name of a base image"
+ },
+ {
+ .name = BLOCK_OPT_BACKING_FMT,
+ .type = OPT_STRING,
+ .help = "Image format of the base image"
+ },
+ {
+ .name = BLOCK_OPT_ENCRYPT,
+ .type = OPT_FLAG,
+ .help = "Encrypt the image"
+ },
+ {
+ .name = BLOCK_OPT_CLUSTER_SIZE,
+ .type = OPT_SIZE,
+ .help = "qcow2 cluster size",
+ .value = { .n = DEFAULT_DEDUP_CLUSTER_SIZE },
+ },
+ {
+ .name = BLOCK_OPT_PREALLOC,
+ .type = OPT_STRING,
+ .help = "Preallocation mode (allowed values: off, metadata)"
+ },
+ {
+ .name = BLOCK_OPT_LAZY_REFCOUNTS,
+ .type = OPT_FLAG,
+ .help = "Postpone refcount updates",
+ },
+ {
+ .name = BLOCK_OPT_DEDUP,
+ .type = OPT_STRING,
+ .help = "Deduplication",
+ },
+ { NULL }
+};
+
static BlockDriver bdrv_qcow2 = {
.format_name = "qcow2",
.instance_size = sizeof(BDRVQcowState),
@@ -1805,9 +1915,58 @@ static BlockDriver bdrv_qcow2 = {
.bdrv_check = qcow2_check,
};
+/* All the defined .create_options are passed to qcow2_create(), even those
+ * the user did not specify, so plain qcow2 cannot default to a 4KB cluster
+ * size for deduplication only.
+ * For example, qcow2_create() cannot tell the default 64KB cluster size
+ * create option of qcow2 apart from a user-specified 64KB cluster size.
+ * So we declare the qcow2_dedup format in order to be able to define
+ * deduplication-specific create options.
+ * It will also help with qemu-iotests integration.
+ */
+static BlockDriver bdrv_qcow2_dedup = {
+ .format_name = "qcow2_dedup",
+ .instance_size = sizeof(BDRVQcowState),
+ .bdrv_probe = qcow2_probe,
+ .bdrv_open = qcow2_open,
+ .bdrv_close = qcow2_close,
+ .bdrv_reopen_prepare = qcow2_reopen_prepare,
+ .bdrv_create = qcow2_create,
+ .bdrv_co_is_allocated = qcow2_co_is_allocated,
+ .bdrv_set_key = qcow2_set_key,
+ .bdrv_make_empty = qcow2_make_empty,
+
+ .bdrv_co_readv = qcow2_co_readv,
+ .bdrv_co_writev = qcow2_co_writev,
+ .bdrv_co_flush_to_os = qcow2_co_flush_to_os,
+
+ .bdrv_co_write_zeroes = qcow2_co_write_zeroes,
+ .bdrv_co_discard = qcow2_co_discard,
+ .bdrv_truncate = qcow2_truncate,
+ .bdrv_write_compressed = qcow2_write_compressed,
+
+ .bdrv_snapshot_create = qcow2_snapshot_create,
+ .bdrv_snapshot_goto = qcow2_snapshot_goto,
+ .bdrv_snapshot_delete = qcow2_snapshot_delete,
+ .bdrv_snapshot_list = qcow2_snapshot_list,
+ .bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
+ .bdrv_get_info = qcow2_get_info,
+
+ .bdrv_save_vmstate = qcow2_save_vmstate,
+ .bdrv_load_vmstate = qcow2_load_vmstate,
+
+ .bdrv_change_backing_file = qcow2_change_backing_file,
+
+ .bdrv_invalidate_cache = qcow2_invalidate_cache,
+
+ .create_options = qcow2_dedup_create_options,
+ .bdrv_check = qcow2_check,
+};
+
static void bdrv_qcow2_init(void)
{
bdrv_register(&bdrv_qcow2);
+ bdrv_register(&bdrv_qcow2_dedup);
}
block_init(bdrv_qcow2_init);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index eaad53e..62c72fc 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -57,6 +57,7 @@
#define BLOCK_OPT_COMPAT_LEVEL "compat"
#define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts"
#define BLOCK_OPT_ADAPTER_TYPE "adapter_type"
+#define BLOCK_OPT_DEDUP "dedup"
typedef struct BdrvTrackedRequest BdrvTrackedRequest;
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index e522d61..520083a 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -124,7 +124,8 @@ _make_test_img()
-e "s# compat='[^']*'##g" \
-e "s# compat6=\\(on\\|off\\)##g" \
-e "s# static=\\(on\\|off\\)##g" \
- -e "s# lazy_refcounts=\\(on\\|off\\)##g"
+ -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+ -e "s# dedup=\\('sha256'\\|'skein'\\|'sha3'\\)##g"
# Start an NBD server on the image file, which is what we'll be talking to
if [ $IMGPROTO = "nbd" ]; then
--
1.7.10.4
* [Qemu-devel] [RFC V7 20/32] qcow2: Drop hash for a given cluster when dedup makes refcount > 2^16/2.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (18 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 19/32] block: Add qcow2_dedup format and image creation code Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 21/32] qcow2: Remove hash when cluster is deleted Benoît Canet
` (11 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
A new physical cluster with the same hash value will be used for further
occurrences of this hash.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
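Note: the decision taken after each refcount increment can be isolated as in
the sketch below. It assumes, as the patch does, that the refcount update
returns the new refcount on success or a negative errno on failure, and that
refcounts are 16 bits wide. The macro and function names are hypothetical.

```c
#include <assert.h>

#define REFCOUNT_MAX 0xFFFF
#define REFCOUNT_HALF_MAX (REFCOUNT_MAX / 2) /* 32767 */

/* Interprets the value returned by the refcount update.
 * Returns a negative errno unchanged, 1 when the hash must be retired
 * because the refcount reached half of the maximum, and 0 otherwise. */
static int dedup_after_refcount_update(int refcount)
{
    assert(refcount <= REFCOUNT_MAX);
    if (refcount < 0) {
        return refcount; /* propagate the error to the caller */
    }
    return refcount >= REFCOUNT_HALF_MAX;
}
```

Retiring the hash at half of the maximum leaves headroom before the refcount
could overflow, and later writes with the same content simply start a fresh
physical cluster.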
block/qcow2-dedup.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 49 insertions(+), 5 deletions(-)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 2aca01a..e017721 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -38,6 +38,8 @@ static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
uint64_t *first_logical_sect,
uint64_t physical_sect,
bool write);
+static void qcow2_remove_hash_node(BlockDriverState *bs,
+ QCowHashNode *hash_node);
bool qcow2_must_deduplicate(BlockDriverState *bs)
{
@@ -416,6 +418,38 @@ static int qcow2_clear_l2_copied_flag_if_needed(BlockDriverState *bs,
true);
}
+/* Force the use of a new physical cluster and QCowHashNode once the refcount
+ * reaches 2^16/2.
+ *
+ * @cluster_index: the index of the physical cluster
+ * @ret: 0 on success, -errno on error
+ */
+static int qcow2_dedup_refcount_half_max_reached(BlockDriverState *bs,
+ uint64_t cluster_index,
+ QCowHashNode *hash_node)
+{
+ int ret = 0;
+
+ /* mark this hash so we won't load it anymore at startup after writing it */
+ hash_node->first_logical_sect |= QCOW_DEDUP_FLAG_HALF_MAX_REFCOUNT;
+
+ /* write to disk */
+ ret = qcow2_dedup_read_write_hash(bs,
+ &hash_node->hash,
+ &hash_node->first_logical_sect,
+ hash_node->physical_sect,
+ true);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* remove the QCowHashNode from ram so we won't use it anymore for dedup */
+ qcow2_remove_hash_node(bs, hash_node);
+
+ return 0;
+}
+
/* This function deduplicate a cluster
*
* @logical_sect: The logical sector of the write
@@ -428,13 +462,23 @@ static int qcow2_deduplicate_cluster(BlockDriverState *bs,
{
BDRVQcowState *s = bs->opaque;
uint64_t cluster_index = hash_node->physical_sect / s->cluster_sectors;
- int ret = 0;
+ int refcount, ret = 0;
/* Increment the refcount of the cluster */
- ret = qcow2_update_cluster_refcount(bs,
- cluster_index,
- 1,
- false);
+ refcount = qcow2_update_cluster_refcount(bs,
+ cluster_index,
+ 1,
+ false);
+
+ if (refcount < 0) {
+ return refcount;
+ }
+
+ if (refcount >= 0xFFFF/2) {
+ ret = qcow2_dedup_refcount_half_max_reached(bs,
+ cluster_index,
+ hash_node);
+ }
if (ret < 0) {
return ret;
--
1.7.10.4
* [Qemu-devel] [RFC V7 21/32] qcow2: Remove hash when cluster is deleted.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (19 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 20/32] qcow2: Drop hash for a given cluster when dedup makes refcount > 2^16/2 Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 22/32] qcow2: Add qcow2_dedup_is_running to probe if dedup is running Benoît Canet
` (10 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 26 ++++++++++++++++++++++++++
block/qcow2-refcount.c | 3 +++
block/qcow2.h | 2 ++
3 files changed, 31 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index e017721..68a09ff 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -1044,3 +1044,29 @@ int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
return ret;
}
+
+/* Clean the last reference to a given cluster when its refcount is zero
+ *
+ * @cluster_index: the index of the physical cluster
+ */
+void qcow2_dedup_destroy_hash(BlockDriverState *bs,
+ uint64_t cluster_index)
+{
+ BDRVQcowState *s = bs->opaque;
+ QCowHash null_hash;
+ uint64_t logical_sect = 0;
+ uint64_t physical_sect = cluster_index * s->cluster_sectors;
+
+ /* prepare null hash */
+ memset(&null_hash, 0, sizeof(null_hash));
+
+ /* clear from disk */
+ qcow2_dedup_read_write_hash(bs,
+ &null_hash,
+ &logical_sect,
+ physical_sect,
+ true);
+
+ /* remove from ram if present so we won't dedup with it anymore */
+ qcow2_remove_hash_node_by_sector(bs, physical_sect);
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index b2b3031..ffb8d3a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -493,6 +493,9 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
ret = -EINVAL;
goto fail;
}
+ if (s->has_dedup && refcount == 0) {
+ qcow2_dedup_destroy_hash(bs, cluster_index);
+ }
if (refcount == 0 && cluster_index < s->free_cluster_index) {
s->free_cluster_index = cluster_index;
}
diff --git a/block/qcow2.h b/block/qcow2.h
index fd48243..c1c0978 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -496,5 +496,7 @@ int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
int count,
uint64_t logical_sect,
uint64_t physical_sect);
+void qcow2_dedup_destroy_hash(BlockDriverState *bs,
+ uint64_t cluster_index);
#endif
--
1.7.10.4
* [Qemu-devel] [RFC V7 22/32] qcow2: Add qcow2_dedup_is_running to probe if dedup is running.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (20 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 21/32] qcow2: Remove hash when cluster is deleted Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 23/32] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
` (9 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 6 ++++++
block/qcow2.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 68a09ff..cd47e2c 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -1070,3 +1070,9 @@ void qcow2_dedup_destroy_hash(BlockDriverState *bs,
/* remove from ram if present so we won't dedup with it anymore */
qcow2_remove_hash_node_by_sector(bs, physical_sect);
}
+
+bool qcow2_dedup_is_running(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ return s->has_dedup && s->dedup_status == DEDUP_STATUS_STARTED;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index c1c0978..b858db9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -498,5 +498,6 @@ int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
uint64_t physical_sect);
void qcow2_dedup_destroy_hash(BlockDriverState *bs,
uint64_t cluster_index);
+bool qcow2_dedup_is_running(BlockDriverState *bs);
#endif
--
1.7.10.4
* [Qemu-devel] [RFC V7 23/32] qcow2: Integrate deduplication in qcow2_co_writev loop.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (21 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 22/32] qcow2: Add qcow2_dedup_is_running to probe if dedup is running Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 24/32] qcow2: Serialize write requests when deduplication is activated Benoît Canet
` (8 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 88 insertions(+), 2 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 9032dfc..838241c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -322,6 +322,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
QCowHeader header;
uint64_t ext_end;
+ s->has_dedup = false;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
goto fail;
@@ -784,13 +785,18 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
BDRVQcowState *s = bs->opaque;
int index_in_cluster;
int n_end;
- int ret;
+ int ret = 0;
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t cluster_offset;
QEMUIOVector hd_qiov;
uint64_t bytes_done = 0;
uint8_t *cluster_data = NULL;
QCowL2Meta *l2meta = NULL;
+ uint8_t *dedup_cluster_data = NULL;
+ int dedup_cluster_data_nr;
+ int deduped_sectors_nr;
+ QCowDedupState ds;
+ bool atomic_dedup_is_running;
trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
remaining_sectors);
@@ -801,13 +807,70 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_lock(&s->lock);
+ atomic_dedup_is_running = qcow2_dedup_is_running(bs);
+ if (atomic_dedup_is_running) {
+ QTAILQ_INIT(&ds.undedupables);
+ ds.phash.reuse = false;
+ ds.nb_undedupable_sectors = 0;
+ ds.nb_clusters_processed = 0;
+
+ /* if deduplication is on we make sure dedup_cluster_data
+ * contains a multiple of cluster size of data in order
+ * to compute the hashes
+ */
+ ret = qcow2_dedup_read_missing_and_concatenate(bs,
+ qiov,
+ sector_num,
+ remaining_sectors,
+ &dedup_cluster_data,
+ &dedup_cluster_data_nr);
+
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
while (remaining_sectors != 0) {
l2meta = NULL;
trace_qcow2_writev_start_part(qemu_coroutine_self());
+
+ if (atomic_dedup_is_running && ds.nb_undedupable_sectors == 0) {
+ /* Try to deduplicate as many clusters as possible */
+ deduped_sectors_nr = qcow2_dedup(bs,
+ &ds,
+ sector_num,
+ dedup_cluster_data,
+ dedup_cluster_data_nr);
+
+ if (deduped_sectors_nr < 0) {
+ goto fail;
+ }
+
+ remaining_sectors -= deduped_sectors_nr;
+ sector_num += deduped_sectors_nr;
+ bytes_done += deduped_sectors_nr * 512;
+
+ /* no more data to write -> exit */
+ if (remaining_sectors <= 0) {
+ break;
+ }
+
+ /* if we deduped something trace it */
+ if (deduped_sectors_nr) {
+ trace_qcow2_writev_done_part(qemu_coroutine_self(),
+ deduped_sectors_nr);
+ trace_qcow2_writev_start_part(qemu_coroutine_self());
+ }
+ }
+
index_in_cluster = sector_num & (s->cluster_sectors - 1);
- n_end = index_in_cluster + remaining_sectors;
+ n_end = atomic_dedup_is_running &&
+ ds.nb_undedupable_sectors < remaining_sectors ?
+ index_in_cluster + ds.nb_undedupable_sectors :
+ index_in_cluster + remaining_sectors;
+
if (s->crypt_method &&
n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) {
n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors;
@@ -874,6 +937,28 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
l2meta = NULL;
}
+ /* Write the non duplicated clusters hashes to disk */
+ if (atomic_dedup_is_running) {
+ int count = cur_nr_sectors / s->cluster_sectors;
+ int has_ending = ((cluster_offset >> 9) + index_in_cluster +
+ cur_nr_sectors) & (s->cluster_sectors - 1);
+ if (index_in_cluster) {
+ count++;
+ }
+ if (has_ending) {
+ count++;
+ }
+ ret = qcow2_dedup_store_new_hashes(bs,
+ &ds,
+ count,
+ sector_num,
+ (cluster_offset >> 9));
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ ds.nb_undedupable_sectors -= cur_nr_sectors;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
@@ -894,6 +979,7 @@ fail:
qemu_iovec_destroy(&hd_qiov);
qemu_vfree(cluster_data);
+ qemu_vfree(dedup_cluster_data);
trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
return ret;
--
1.7.10.4
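
The hash count passed to qcow2_dedup_store_new_hashes in the hunk above is easier to check by hand when the arithmetic is pulled out into a pure function. The sketch below mirrors that arithmetic only; toy_hash_count is a hypothetical name, cluster_sectors is assumed to be a power of two, and physical_sect stands for (cluster_offset >> 9).

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the count computation in the hunk above:
 * whole clusters, plus one for a partial head, plus one for a partial tail.
 */
static int toy_hash_count(uint64_t physical_sect, int index_in_cluster,
                          int cur_nr_sectors, int cluster_sectors)
{
    assert(cluster_sectors > 0 &&
           (cluster_sectors & (cluster_sectors - 1)) == 0);

    int count = cur_nr_sectors / cluster_sectors;
    uint64_t end = physical_sect + (uint64_t)index_in_cluster +
                   (uint64_t)cur_nr_sectors;
    int has_ending = (int)(end & (uint64_t)(cluster_sectors - 1));

    if (index_in_cluster) {
        count++;    /* write starts in the middle of a cluster */
    }
    if (has_ending) {
        count++;    /* write ends in the middle of a cluster */
    }
    return count;
}
```

With 8-sector clusters, a 16-sector write starting on a cluster boundary gives 2, and a 12-sector write starting on a cluster boundary also gives 2 (one whole cluster plus a partial tail).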
* [Qemu-devel] [RFC V7 24/32] qcow2: Serialize write requests when deduplication is activated.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (22 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 23/32] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 25/32] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
` (7 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This fixes the race conditions on sub-cluster-sized writes until a faster
solution is available.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2.c | 17 ++++++++++++++++-
block/qcow2.h | 1 +
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 838241c..9c613e5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -515,6 +515,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
/* Initialise locks */
qemu_co_mutex_init(&s->lock);
+ qemu_co_mutex_init(&s->dedup_lock);
/* Repair image if dirty */
if (!(flags & BDRV_O_CHECK) && !bs->read_only &&
@@ -805,9 +806,19 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
s->cluster_cache_offset = -1; /* disable compressed cache */
+ atomic_dedup_is_running = qcow2_dedup_is_running(bs);
+
+ if (atomic_dedup_is_running) {
+ /* This mutex serializes the write requests in the dedup case.
+ * The goal is to prevent concurrent requests to the same clusters
+ * from corrupting data; without it,
+ * qcow2_dedup_read_missing_and_concatenate would not work.
+ */
+ qemu_co_mutex_lock(&s->dedup_lock);
+ }
+
qemu_co_mutex_lock(&s->lock);
- atomic_dedup_is_running = qcow2_dedup_is_running(bs);
if (atomic_dedup_is_running) {
QTAILQ_INIT(&ds.undedupables);
ds.phash.reuse = false;
@@ -977,6 +988,10 @@ fail:
g_free(l2meta);
}
+ if (atomic_dedup_is_running) {
+ qemu_co_mutex_unlock(&s->dedup_lock);
+ }
+
qemu_iovec_destroy(&hd_qiov);
qemu_vfree(cluster_data);
qemu_vfree(dedup_cluster_data);
diff --git a/block/qcow2.h b/block/qcow2.h
index b858db9..a430fe1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -236,6 +236,7 @@ typedef struct BDRVQcowState {
GTree *dedup_tree_by_hash;
CoMutex lock;
+ CoMutex dedup_lock;
uint32_t crypt_method; /* current crypt method, 0 if no key yet */
uint32_t crypt_method_header;
--
1.7.10.4
* [Qemu-devel] [RFC V7 25/32] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (23 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 24/32] qcow2: Serialize write requests when deduplication is activated Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 26/32] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
` (6 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-refcount.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ffb8d3a..7a53983 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1029,7 +1029,19 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
PRIx64 ": %s\n", l2_entry, strerror(-refcount));
goto fail;
}
- if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
+ /* QCOW_OFLAG_COPIED is not guaranteed to be set when
+ * refcount == 1 when dedup is enabled since it would be
+ * too expensive to set it back every time refcount is
+ * decremented to 1.
+ */
+ if (!s->has_dedup &&
+ (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
+ fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
+ PRIx64 " refcount=%d\n", l2_entry, refcount);
+ res->corruptions++;
+ }
+ if (s->has_dedup && refcount > 1 &&
+ ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
PRIx64 " refcount=%d\n", l2_entry, refcount);
res->corruptions++;
--
1.7.10.4
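
The two conditions introduced by this patch can be summarised as a single predicate. The following is a minimal sketch under stated assumptions: toy_oflag_copied_is_corrupt and TOY_OFLAG_COPIED are hypothetical names, and the flag is assumed to be bit 63 of the L2 entry as in the qcow2 format. Without dedup the flag must be set exactly when the refcount is 1; with dedup only the combination "flag set and refcount above 1" is reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the COPIED flag: bit 63 of an L2 entry. */
#define TOY_OFLAG_COPIED (1ULL << 63)

/* Returns true when the check in this patch would count a corruption. */
static bool toy_oflag_copied_is_corrupt(bool has_dedup, int refcount,
                                        uint64_t l2_entry)
{
    bool copied = (l2_entry & TOY_OFLAG_COPIED) != 0;

    assert(refcount >= 0);
    if (!has_dedup) {
        /* Strict rule: flag set if and only if refcount == 1. */
        return (refcount == 1) != copied;
    }
    /* Relaxed rule: a missing flag at refcount == 1 is tolerated. */
    return refcount > 1 && copied;
}
```

The relaxed branch accepts an entry with refcount 1 and no flag, which is the case the comment in the patch describes as too expensive to keep up to date.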
* [Qemu-devel] [RFC V7 26/32] qcow2: Add check_dedup_l2 in order to check l2 of dedup table.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (24 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 25/32] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 27/32] qcow2: Add verification " Benoît Canet
` (5 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-refcount.c | 62 +++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 54 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7a53983..af18f9b 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1087,6 +1087,43 @@ fail:
return -EIO;
}
+static int check_dedup_l2(BlockDriverState *bs, BdrvCheckResult *res,
+ int64_t l2_offset)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t *l2_table;
+ int i, l2_size;
+
+ /* Read L2 table from disk */
+ l2_size = s->cluster_size;
+ l2_table = g_malloc(l2_size);
+
+ if (bdrv_pread(bs->file, l2_offset, l2_table, l2_size) != l2_size) {
+ goto fail;
+ }
+
+ /* Do the actual checks */
+ for (i = 0; i < (s->l2_size - 5); i += 5) {
+ uint64_t first_logical_offset = be64_to_cpu(l2_table[i + 4]) &
+ ~QCOW_OFLAG_COPIED;
+ if (first_logical_offset > (bs->total_sectors * BDRV_SECTOR_SIZE)) {
+ fprintf(stderr, "ERROR: l2 deduplication first_logical_offset"
+ "=%" PRIi64 " outside of deduplicated volume in l2 table "
+ "with offset %" PRIi64 ".\n", first_logical_offset,
+ l2_offset);
+ res->corruptions++;
+ }
+ }
+
+ g_free(l2_table);
+ return 0;
+
+fail:
+ fprintf(stderr, "ERROR: I/O error in check_dedup_l2\n");
+ g_free(l2_table);
+ return -EIO;
+}
+
/*
* Increases the refcount for the L1 table, its L2 tables and all referenced
* clusters in the given refcount table. While doing so, performs some checks
@@ -1100,7 +1137,8 @@ static int check_refcounts_l1(BlockDriverState *bs,
uint16_t *refcount_table,
int refcount_table_size,
int64_t l1_table_offset, int l1_size,
- int flags)
+ int flags,
+ bool dedup)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table, l2_offset, l1_size2;
@@ -1156,11 +1194,19 @@ static int check_refcounts_l1(BlockDriverState *bs,
res->corruptions++;
}
- /* Process and check L2 entries */
- ret = check_refcounts_l2(bs, res, refcount_table,
- refcount_table_size, l2_offset, flags);
- if (ret < 0) {
- goto fail;
+ if (dedup) {
+ /* Process and check dedup l2 entries */
+ ret = check_dedup_l2(bs, res, l2_offset);
+ if (ret < 0) {
+ goto fail;
+ }
+ } else {
+ /* Process and check L2 entries */
+ ret = check_refcounts_l2(bs, res, refcount_table,
+ refcount_table_size, l2_offset, flags);
+ if (ret < 0) {
+ goto fail;
+ }
}
}
}
@@ -1202,7 +1248,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
/* current L1 table */
ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
s->l1_table_offset, s->l1_size,
- CHECK_OFLAG_COPIED | CHECK_FRAG_INFO);
+ CHECK_OFLAG_COPIED | CHECK_FRAG_INFO, false);
if (ret < 0) {
goto fail;
}
@@ -1211,7 +1257,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
for(i = 0; i < s->nb_snapshots; i++) {
sn = s->snapshots + i;
ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
- sn->l1_table_offset, sn->l1_size, 0);
+ sn->l1_table_offset, sn->l1_size, 0, false);
if (ret < 0) {
goto fail;
}
--
1.7.10.4
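
The walk over the dedup table block in check_dedup_l2 can be sketched outside of QEMU. This is an illustration under assumptions, not the patch's code: the toy_ names are hypothetical, the hash length is assumed to be 32 bytes so that each entry is five 64-bit words (hash followed by a big-endian first logical offset), and unlike the loop in the patch this sketch visits every complete entry in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_LENGTH 32                     /* assumed: 256-bit hash */
#define TOY_ENTRY_SIZE (TOY_HASH_LENGTH + 8)   /* hash + 64-bit offset */
#define TOY_OFLAG_COPIED (1ULL << 63)          /* flag bit masked off below */

/* Portable big-endian 64-bit load, standing in for be64_to_cpu(). */
static uint64_t toy_read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Big-endian 64-bit store, used only to build sample data. */
static void toy_write_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Count entries whose first logical offset lies beyond the volume. */
static int toy_count_out_of_range(const uint8_t *block, size_t block_size,
                                  uint64_t volume_bytes)
{
    int corruptions = 0;

    assert(block != NULL);
    for (size_t off = 0; off + TOY_ENTRY_SIZE <= block_size;
         off += TOY_ENTRY_SIZE) {
        uint64_t first_logical_offset =
            toy_read_be64(block + off + TOY_HASH_LENGTH) & ~TOY_OFLAG_COPIED;
        if (first_logical_offset > volume_bytes) {
            corruptions++;
        }
    }
    return corruptions;
}
```

A caller would fill a buffer with toy_write_be64 at offset TOY_HASH_LENGTH of each entry and compare the returned count against the number of entries deliberately placed out of range.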
* [Qemu-devel] [RFC V7 27/32] qcow2: Add verification of dedup table.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (25 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 26/32] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 28/32] qcow2: Integrate SKEIN hash algorithm in deduplication Benoît Canet
` (4 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-refcount.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index af18f9b..58d142f 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1253,6 +1253,15 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
goto fail;
}
+ if (s->has_dedup) {
+ ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
+ s->dedup_table_offset, s->dedup_table_size,
+ 0, true);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
/* snapshots */
for(i = 0; i < s->nb_snapshots; i++) {
sn = s->snapshots + i;
--
1.7.10.4
* [Qemu-devel] [RFC V7 28/32] qcow2: Integrate SKEIN hash algorithm in deduplication.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (26 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 27/32] qcow2: Add verification " Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 29/32] qcow: Set large dedup hash block size Benoît Canet
` (3 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 15 +++++++++++++++
block/qcow2.c | 5 +++++
configure | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 55 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index cd47e2c..6ad9d0c 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -33,6 +33,10 @@
#include <gnutls/crypto.h>
#endif
+#ifdef CONFIG_SKEIN_DEDUP
+#include <skeinApi.h>
+#endif
+
static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
QCowHash *hash,
uint64_t *first_logical_sect,
@@ -272,6 +276,17 @@ static int qcow2_compute_cluster_hash(BlockDriverState *bs,
return gnutls_hash_fast(GNUTLS_DIG_SHA256, data,
s->cluster_size, hash->data);
#endif
+#if defined(CONFIG_SKEIN_DEDUP)
+ case QCOW_HASH_SKEIN:
+ {
+ SkeinCtx_t ctx;
+ skeinCtxPrepare(&ctx, Skein256);
+ skeinInit(&ctx, Skein256);
+ skeinUpdate(&ctx, data, s->cluster_size);
+ skeinFinal(&ctx, hash->data);
+ }
+ return 0;
+#endif
default:
error_report("Invalid deduplication hash algorithm %i",
s->dedup_hash_algo);
diff --git a/block/qcow2.c b/block/qcow2.c
index 9c613e5..17b2fcb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1540,6 +1540,11 @@ static int8_t qcow2_get_dedup_hash_algo(char *value)
if (!value || !strcmp(value, "sha256")) {
return QCOW_HASH_SHA256;
}
+#if defined(CONFIG_SKEIN_DEDUP)
+ if (!strcmp(value, "skein")) {
+ return QCOW_HASH_SKEIN;
+ }
+#endif
error_printf("Unsupported deduplication hash algorithm.\n");
return -EINVAL;
diff --git a/configure b/configure
index 0b8c92e..6631173 100755
--- a/configure
+++ b/configure
@@ -229,6 +229,7 @@ virtio_blk_data_plane=""
gtk=""
gtkabi="2.0"
sha256_dedup="yes"
+skein_dedup="no"
# parse CC options first
for opt do
@@ -910,6 +911,10 @@ for opt do
;;
--enable-sha256-dedup) sha256_dedup="yes"
;;
+ --disable-skein-dedup) skein_dedup="no"
+ ;;
+ --enable-skein-dedup) skein_dedup="yes"
+ ;;
*) echo "ERROR: unknown option $opt"; show_help="yes"
;;
esac
@@ -1167,6 +1172,7 @@ echo " --disable-sha256-dedup disable sha256 dedup"
echo " --enable-sha256-dedup enables sha256 dedup"
echo " --enable-gcov enable test coverage analysis with gcov"
echo " --gcov=GCOV use specified gcov [$gcov_tool]"
+echo " --enable-skein-dedup enable computing dedup hashes with SKEIN"
echo ""
echo "NOTE: The object files are built at the place where configure is launched"
exit 1
@@ -2509,6 +2515,30 @@ EOF
fi
fi
+##########################################
+# SKEIN dedup hash function probe
+if test "$skein_dedup" != "no" ; then
+ cat > $TMPC <<EOF
+#include <skeinApi.h>
+int main(void) {
+ SkeinCtx_t ctx;
+ skeinCtxPrepare(&ctx, 512);
+ return 0;
+}
+EOF
+ skein_libs="-lskein3fish"
+ if compile_prog "" "$skein_libs" ; then
+ skein_dedup=yes
+ libs_tools="$skein_libs $libs_tools"
+ libs_softmmu="$skein_libs $libs_softmmu"
+ else
+ if test "$skein_dedup" = "yes" ; then
+ feature_not_found "libskein3fish not found"
+ fi
+ skein_dedup=no
+ fi
+fi
+
#
# Check for xxxat() functions when we are building linux-user
# emulator. This is done because older glibc versions don't
@@ -3456,6 +3486,7 @@ echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "sha256-dedup $sha256_dedup"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
+echo "SKEIN support $skein_dedup"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -3826,6 +3857,10 @@ if test "$sha256_dedup" = "yes" ; then
echo "CONFIG_SHA256_DEDUP=y" >> $config_host_mak
fi
+if test "$skein_dedup" = "yes" ; then
+ echo "CONFIG_SKEIN_DEDUP=y" >> $config_host_mak
+fi
+
# USB host support
case "$usb" in
linux)
--
1.7.10.4
* [Qemu-devel] [RFC V7 29/32] qcow: Set large dedup hash block size.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (27 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 28/32] qcow2: Integrate SKEIN hash algorithm in deduplication Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 30/32] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
` (2 subsequent siblings)
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-refcount.c | 5 +++--
block/qcow2.c | 3 +++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 58d142f..7e74896 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1095,7 +1095,7 @@ static int check_dedup_l2(BlockDriverState *bs, BdrvCheckResult *res,
int i, l2_size;
/* Read L2 table from disk */
- l2_size = s->cluster_size;
+ l2_size = s->hash_block_size;
l2_table = g_malloc(l2_size);
if (bdrv_pread(bs->file, l2_offset, l2_table, l2_size) != l2_size) {
@@ -1185,7 +1185,8 @@ static int check_refcounts_l1(BlockDriverState *bs,
/* Mark L2 table as used */
l2_offset &= L1E_OFFSET_MASK;
inc_refcounts(bs, res, refcount_table, refcount_table_size,
- l2_offset, s->cluster_size);
+ l2_offset,
+ dedup ? s->hash_block_size : s->l2_size * sizeof(uint64_t));
/* L2 tables are cluster aligned */
if (l2_offset & (s->cluster_size - 1)) {
diff --git a/block/qcow2.c b/block/qcow2.c
index 17b2fcb..96fc86a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -426,6 +426,9 @@ static int qcow2_open(BlockDriverState *bs, int flags)
s->cluster_sectors = 1 << (s->cluster_bits - 9);
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
s->l2_size = 1 << s->l2_bits;
+ if (s->incompatible_features & QCOW2_INCOMPAT_DEDUP) {
+ s->hash_block_size = DEFAULT_CLUSTER_SIZE * 5;
+ }
bs->total_sectors = header.size / 512;
s->csize_shift = (62 - (s->cluster_bits - 8));
s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
--
1.7.10.4
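
The factor of 5 in the hash block size lines up with the entry layout used elsewhere in the series, where the number of hashes per block is computed as hash_block_size / (HASH_LENGTH + 8). The sketch below works through that arithmetic; the toy_ names are hypothetical and HASH_LENGTH is assumed to be 32 bytes, which makes each entry 40 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_HASH_LENGTH 32   /* assumed: 256-bit hash */

/* Number of (hash, first logical sector) entries in one hash block. */
static uint64_t toy_hashes_per_block(uint64_t hash_block_size)
{
    return hash_block_size / (TOY_HASH_LENGTH + 8);
}

/* Upper bound on the clusters that a dedup table can describe. */
static uint64_t toy_max_hashed_clusters(uint64_t dedup_table_size,
                                        uint64_t hash_block_size)
{
    return dedup_table_size * toy_hashes_per_block(hash_block_size);
}
```

Under these assumptions a block of 5 * 65536 bytes holds exactly 8192 entries and a block of 5 * 4096 bytes holds exactly 512, with no bytes left over in either case, since any power-of-two size of at least 8 bytes multiplied by 5 is a multiple of 40.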
* [Qemu-devel] [RFC V7 30/32] qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (28 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 29/32] qcow: Set large dedup hash block size Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 31/32] qcow2: Add qcow2_co_dedup_resume to restart deduplication Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 32/32] qcow2: Enable the deduplication feature Benoît Canet
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Signed-off-by: Benoit Canet <benoit@irqsave.net>
---
block/qcow2-dedup.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 2 ++
2 files changed, 80 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 6ad9d0c..c2dd145 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -1091,3 +1091,81 @@ bool qcow2_dedup_is_running(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
return s->has_dedup && s->dedup_status == DEDUP_STATUS_STARTED;
}
+
+static gint qcow2_dedup_compare_by_hash(gconstpointer a,
+ gconstpointer b,
+ gpointer data)
+{
+ QCowHash *hash_a = (QCowHash *) a;
+ QCowHash *hash_b = (QCowHash *) b;
+ return memcmp(hash_a->data, hash_b->data, HASH_LENGTH);
+}
+
+static void qcow2_dedup_destroy_qcow_hash_node(gpointer p)
+{
+ QCowHashNode *hash_node = (QCowHashNode *) p;
+ g_free(hash_node);
+}
+
+static int qcow2_dedup_alloc(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret;
+
+ ret = qcow2_do_table_init(bs,
+ &s->dedup_table,
+ s->dedup_table_offset,
+ s->dedup_table_size,
+ false);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ s->dedup_tree_by_hash = g_tree_new_full(qcow2_dedup_compare_by_hash, NULL,
+ NULL,
+ qcow2_dedup_destroy_qcow_hash_node);
+
+ s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE,
+ s->hash_block_size);
+
+ return 0;
+}
+
+static void qcow2_dedup_free(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ g_free(s->dedup_table);
+
+ qcow2_cache_flush(bs, s->dedup_cluster_cache);
+ qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+ g_tree_destroy(s->dedup_tree_by_hash);
+}
+
+int qcow2_dedup_init(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+
+ s->has_dedup = true;
+
+ ret = qcow2_dedup_alloc(bs);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* if we are read-only we don't load the deduplication table */
+ if (bs->read_only) {
+ return 0;
+ }
+
+ s->dedup_status = DEDUP_STATUS_STARTING;
+
+ return 0;
+}
+
+void qcow2_dedup_close(BlockDriverState *bs)
+{
+ qcow2_dedup_free(bs);
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index a430fe1..9275be1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -500,5 +500,7 @@ int qcow2_dedup_store_new_hashes(BlockDriverState *bs,
void qcow2_dedup_destroy_hash(BlockDriverState *bs,
uint64_t cluster_index);
bool qcow2_dedup_is_running(BlockDriverState *bs);
+int qcow2_dedup_init(BlockDriverState *bs);
+void qcow2_dedup_close(BlockDriverState *bs);
#endif
--
1.7.10.4
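
The comparator registered with g_tree_new_full orders nodes by a plain memcmp over the hash bytes. To show that ordering without depending on GLib, the sketch below swaps the GTree for a sorted array searched with the standard qsort and bsearch; the toy_ names are hypothetical and the 32-byte hash length is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_HASH_LENGTH 32   /* assumed: 256-bit hash */

typedef struct ToyHash {
    uint8_t data[TOY_HASH_LENGTH];
} ToyHash;

typedef struct ToyHashNode {
    ToyHash hash;             /* first member, used as the sort key */
    uint64_t physical_sect;   /* where the deduplicated data lives */
} ToyHashNode;

/* Same ordering rule as the patch's comparator: raw byte comparison. */
static int toy_compare_by_hash(const void *a, const void *b)
{
    const ToyHashNode *na = a;
    const ToyHashNode *nb = b;
    return memcmp(na->hash.data, nb->hash.data, TOY_HASH_LENGTH);
}

/* Look up a hash in an array previously sorted with toy_compare_by_hash. */
static const ToyHashNode *toy_lookup(const ToyHashNode *sorted, size_t n,
                                     const ToyHash *hash)
{
    ToyHashNode key;

    assert(sorted != NULL && hash != NULL);
    memset(&key, 0, sizeof(key));
    key.hash = *hash;
    return bsearch(&key, sorted, n, sizeof(*sorted), toy_compare_by_hash);
}
```

A balanced tree such as GTree gives the same lookup result while also supporting cheap insertion and removal, which a sorted array does not.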
* [Qemu-devel] [RFC V7 31/32] qcow2: Add qcow2_co_dedup_resume to restart deduplication.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (29 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 30/32] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 32/32] qcow2: Enable the deduplication feature Benoît Canet
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 179 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 179 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index c2dd145..93917d5 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -37,6 +37,7 @@
#include <skeinApi.h>
#endif
+static void qcow2_dedup_reset(BlockDriverState *bs);
static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
QCowHash *hash,
uint64_t *first_logical_sect,
@@ -1092,6 +1093,174 @@ bool qcow2_dedup_is_running(BlockDriverState *bs)
return s->has_dedup && s->dedup_status == DEDUP_STATUS_STARTED;
}
+static bool hash_is_null(QCowHash *hash)
+{
+ QCowHash null_hash;
+ memset(&null_hash.data, 0, HASH_LENGTH);
+ return !memcmp(hash->data, null_hash.data, HASH_LENGTH);
+}
+
+static void qcow2_dedup_insert_hash_node(BlockDriverState *bs,
+ QCowHashNode *hash_node)
+{
+ BDRVQcowState *s = bs->opaque;
+
+ g_tree_insert(s->dedup_tree_by_hash, &hash_node->hash, hash_node);
+}
+
+/* This loads the QCowHashNode corresponding to a given cluster index into RAM
+ *
+ * @index: index of the given physical cluster
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_load_cluster_hash(BlockDriverState *bs,
+ uint64_t index)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+ QCowHash hash;
+ uint64_t first_logical_sect;
+ QCowHashNode *hash_node;
+
+ /* get the hash */
+ ret = qcow2_dedup_read_write_hash(bs, &hash,
+ &first_logical_sect,
+ index * s->cluster_sectors,
+ false);
+
+ if (ret < 0) {
+ error_report("Failed to load deduplication hash.");
+ return ret;
+ }
+
+ /* if the hash is null don't load it */
+ if (hash_is_null(&hash)) {
+ return ret;
+ }
+
+ hash_node = qcow2_hash_node_new(&hash,
+ index * s->cluster_sectors,
+ first_logical_sect);
+ qcow2_dedup_insert_hash_node(bs, hash_node);
+
+ return 0;
+}
+
+/* Load all the active hashes into RAM
+ *
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_load_valid_hashes(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t max_clusters, i;
+ int nb_hash_in_hash_block = s->hash_block_size / (HASH_LENGTH + 8);
+ int ret = 0;
+
+ max_clusters = s->dedup_table_size * nb_hash_in_hash_block;
+
+ /* load all the hashes stored on disk into memory */
+ for (i = 0; i < max_clusters; i++) {
+ if (!(i % nb_hash_in_hash_block)) {
+ co_sleep_ns(rt_clock, s->dedup_co_delay);
+ }
+ qemu_co_mutex_lock(&s->lock);
+ ret = qcow2_load_cluster_hash(bs, i);
+ qemu_co_mutex_unlock(&s->lock);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+static int qcow2_drop_to_dedup_stale_hash(BlockDriverState *bs,
+ uint64_t index)
+{
+ int ret = 0;
+ bool to_dedup;
+ uint64_t physical_sect;
+
+ to_dedup = qcow2_is_cluster_to_dedup(bs, index, &physical_sect, &ret);
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ if (!to_dedup) {
+ return 0;
+ }
+
+ qcow2_remove_hash_node_by_sector(bs, physical_sect);
+ return 0;
+}
+
+/* For each L2 entry marked as QCOW_OFLAG_PENDING_DEDUP, drop the obsolete hash
+ * from the trees
+ *
+ * @ret: 0 on success, negative on error
+ */
+static int qcow2_drop_to_dedup_hashes(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t i;
+ int ret = 0;
+
+ /* for each l2 entry */
+ for (i = 0; i < (uint64_t)s->l2_size * s->l1_size; i++) {
+ if (!(i % s->l2_size)) {
+ co_sleep_ns(rt_clock, s->dedup_co_delay);
+ }
+ qemu_co_mutex_lock(&s->lock);
+ ret = qcow2_drop_to_dedup_stale_hash(bs, i);
+ qemu_co_mutex_unlock(&s->lock);
+
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * This coroutine resumes deduplication
+ *
+ * @opaque: the given BlockDriverState
+ * Returns nothing; the outcome is recorded in s->dedup_status
+ */
+static void coroutine_fn qcow2_co_dedup_resume(void *opaque)
+{
+ BlockDriverState *bs = opaque;
+ BDRVQcowState *s = bs->opaque;
+ int ret = 0;
+
+ ret = qcow2_load_valid_hashes(bs);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ ret = qcow2_drop_to_dedup_hashes(bs);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ qemu_co_mutex_lock(&s->lock);
+ s->dedup_status = DEDUP_STATUS_STARTED;
+ qemu_co_mutex_unlock(&s->lock);
+
+ return;
+
+fail:
+ qemu_co_mutex_lock(&s->lock);
+ s->dedup_status = DEDUP_STATUS_STOPPED;
+ qcow2_dedup_reset(bs);
+ qemu_co_mutex_unlock(&s->lock);
+}
+
static gint qcow2_dedup_compare_by_hash(gconstpointer a,
gconstpointer b,
gpointer data)
@@ -1142,6 +1311,12 @@ static void qcow2_dedup_free(BlockDriverState *bs)
g_tree_destroy(s->dedup_tree_by_hash);
}
+static void qcow2_dedup_reset(BlockDriverState *bs)
+{
+ qcow2_dedup_free(bs);
+ qcow2_dedup_alloc(bs);
+}
+
int qcow2_dedup_init(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
@@ -1162,6 +1337,10 @@ int qcow2_dedup_init(BlockDriverState *bs)
s->dedup_status = DEDUP_STATUS_STARTING;
+ /* resume deduplication */
+ s->dedup_resume_co = qemu_coroutine_create(qcow2_co_dedup_resume);
+ qemu_coroutine_enter(s->dedup_resume_co, bs);
+
return 0;
}
--
1.7.10.4
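Editorial note on patch 31/32: the two small pieces of logic that drive qcow2_load_valid_hashes() can be tried outside QEMU. The sketch below is standalone and hypothetical, not code from the series. It assumes HASH_LENGTH is 32 bytes (a 256-bit hash) and that the extra 8 bytes per on-disk entry hold the 64-bit first logical sector; the type Hash and the helpers entries_per_hash_block() and max_clusters() are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HASH_LENGTH 32 /* assumption: 256-bit hash */

/* Hypothetical stand-in for QCowHash */
typedef struct {
    uint8_t data[HASH_LENGTH];
} Hash;

/* An all-zero hash is treated as an unused slot and is not loaded */
static bool hash_is_null(const Hash *hash)
{
    static const uint8_t zero[HASH_LENGTH]; /* zero-initialized */
    return memcmp(hash->data, zero, HASH_LENGTH) == 0;
}

/* Each on-disk entry is assumed to be the hash plus one 64-bit field */
static uint64_t entries_per_hash_block(uint64_t hash_block_size)
{
    assert(hash_block_size >= HASH_LENGTH + 8);
    return hash_block_size / (HASH_LENGTH + 8);
}

/* Upper bound on the number of cluster indexes the resume loop visits */
static uint64_t max_clusters(uint64_t table_size, uint64_t hash_block_size)
{
    return table_size * entries_per_hash_block(hash_block_size);
}
```

Under these assumptions a 4096-byte hash block holds 102 entries, so the loop in the patch would yield to other coroutines once every 102 cluster indexes.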
* [Qemu-devel] [RFC V7 32/32] qcow2: Enable the deduplication feature.
2013-03-15 14:49 [Qemu-devel] [RFC V7 00/32] QCOW2 deduplication core functionality Benoît Canet
` (30 preceding siblings ...)
2013-03-15 14:49 ` [Qemu-devel] [RFC V7 31/32] qcow2: Add qcow2_co_dedup_resume to restart deduplication Benoît Canet
@ 2013-03-15 14:49 ` Benoît Canet
31 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2013-03-15 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/block/qcow2.c b/block/qcow2.c
index 96fc86a..135c71e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -531,6 +531,13 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
}
+ if (s->incompatible_features & QCOW2_INCOMPAT_DEDUP) {
+ ret = qcow2_dedup_init(bs);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
#ifdef DEBUG_ALLOC
{
BdrvCheckResult result = {0};
@@ -1006,8 +1013,13 @@ fail:
static void qcow2_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
+
g_free(s->l1_table);
+ if (s->has_dedup) {
+ qcow2_dedup_close(bs);
+ }
+
qcow2_cache_flush(bs, s->l2_table_cache);
qcow2_cache_flush(bs, s->refcount_block_cache);
@@ -1498,6 +1510,12 @@ static int qcow2_create2(const char *filename, int64_t total_size,
if (ret < 0) {
goto out;
}
+
+ /* minimal init */
+ ret = qcow2_dedup_init(bs);
+ if (ret < 0) {
+ goto out;
+ }
}
/* Want a backing file? There you go.*/
--
1.7.10.4