* [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication
@ 2012-10-17 16:00 Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification Benoît Canet
` (20 more replies)
0 siblings, 21 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This patchset is a not yet working draft starting to implement deduplication
in QCOW2.
The Kernel red black trees are indented in kernel style.
I'll be happy to know what is the policy regarding this kind of inclusion
and what tools can be used to make the style compliant.
I am posting this patchset in order to have an early feedback regarding the
design.
Benoît Canet (20):
qcow2: Add deduplication to the qcow2 specification.
qcow2: Add kernel red black trees
qcow2: Add deduplication structures and fields.
qcow2: Add qcow2_dedup_read_missing_and_concatenate
qcow2: Rename update_refcount into qcow2_update_refcount.
qcow2: Add qcow2_dedup and related functions.
qcow2: Add qcow2_dedup_write_new_hashes.
qcow2: Implement qcow2_compute_cluster_hash.
qcow2: Add qcow2_co_load_dedup_hashes.
qcow2: Add qcow2_dedup_grow_table.
qcow2: Load and save deduplication table header extension.
qcow2: Extract qcow2_do_table_init.
qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
qcow2: Extract qcow2_add_feature and qcow2_remove_feature.
block: Add dedup image create option.
qcow2: Allow creation of images using deduplication.
qcow2: Integrate deduplication in qcow2_co_writev loop.
qcow2: Add method to destroy the deduplication red black tree.
qcow2: init and cleanup deduplication.
qemu-iotests: Filter dedup=on/off so existing tests don't break.
Makefile | 3 +
Makefile.objs | 1 +
Makefile.target | 2 +-
block/Makefile.objs | 1 +
block/qcow2-cluster.c | 95 ++++--
block/qcow2-dedup.c | 774 ++++++++++++++++++++++++++++++++++++++++++
block/qcow2-refcount.c | 79 +++--
block/qcow2.c | 247 ++++++++++++--
block/qcow2.h | 72 +++-
block_int.h | 1 +
docs/specs/qcow2.txt | 16 +-
rbtree.c | 389 +++++++++++++++++++++
rbtree.h | 160 +++++++++
tests/qemu-iotests/common.rc | 3 +-
14 files changed, 1749 insertions(+), 94 deletions(-)
create mode 100644 block/qcow2-dedup.c
create mode 100644 rbtree.c
create mode 100644 rbtree.h
--
1.7.10.4
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:28 ` Eric Blake
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 02/20] qcow2: Add kernel red black trees Benoît Canet
` (19 subsequent siblings)
20 siblings, 1 reply; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
docs/specs/qcow2.txt | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..013a16b 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -80,7 +80,10 @@ in the description of a field.
tables to repair refcounts before accessing the
image.
- Bits 1-63: Reserved (set to 0)
+ Bit 1: Deduplication bit. If this bit is set then
+ deduplication is used on this image.
+
+ Bits 2-63: Reserved (set to 0)
80 - 87: compatible_features
Bitmask of compatible features. An implementation can
@@ -116,6 +119,7 @@ be stored. Each extension has a structure like the following:
0x00000000 - End of the header extension area
0xE2792ACA - Backing file format name
0x6803f857 - Feature name table
+ 0xCD8E819B - Deduplication
other - Unknown header extension, can be safely
ignored
@@ -159,6 +163,16 @@ the header extension data. Each entry look like this:
terminated if it has full length)
+== Deduplication ==
+
+The deduplication extension contains the offset and size of the deduplication
+table.
+
+ Byte 0 - 7: Offset
+
+ 8 - 11: Size
+
+
== Host cluster management ==
qcow2 manages the allocation of host clusters by maintaining a reference count
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 02/20] qcow2: Add kernel red black trees
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 03/20] qcow2: Add deduplication structures and fields Benoît Canet
` (18 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
Makefile.objs | 1 +
rbtree.c | 389 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rbtree.h | 160 ++++++++++++++++++++++++
3 files changed, 550 insertions(+)
create mode 100644 rbtree.c
create mode 100644 rbtree.h
diff --git a/Makefile.objs b/Makefile.objs
index 74b3542..b842280 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -45,6 +45,7 @@ block-obj-y = cutils.o iov.o cache-utils.o qemu-option.o module.o async.o
block-obj-y += nbd.o block.o blockjob.o aio.o aes.o qemu-config.o
block-obj-y += qemu-progress.o qemu-sockets.o uri.o
block-obj-y += $(coroutine-obj-y) $(qobject-obj-y) $(version-obj-y)
+block-obj-y += rbtree.o
block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += block/
diff --git a/rbtree.c b/rbtree.c
new file mode 100644
index 0000000..6ad800f
--- /dev/null
+++ b/rbtree.c
@@ -0,0 +1,389 @@
+/*
+ Red Black Trees
+ (C) 1999 Andrea Arcangeli <andrea@suse.de>
+ (C) 2002 David Woodhouse <dwmw2@infradead.org>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+ linux/lib/rbtree.c
+*/
+
+#include "rbtree.h"
+
+static void __rb_rotate_left(struct rb_node *node, struct rb_root *root)
+{
+ struct rb_node *right = node->rb_right;
+ struct rb_node *parent = rb_parent(node);
+
+ if ((node->rb_right = right->rb_left))
+ rb_set_parent(right->rb_left, node);
+ right->rb_left = node;
+
+ rb_set_parent(right, parent);
+
+ if (parent)
+ {
+ if (node == parent->rb_left)
+ parent->rb_left = right;
+ else
+ parent->rb_right = right;
+ }
+ else
+ root->rb_node = right;
+ rb_set_parent(node, right);
+}
+
+static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
+{
+ struct rb_node *left = node->rb_left;
+ struct rb_node *parent = rb_parent(node);
+
+ if ((node->rb_left = left->rb_right))
+ rb_set_parent(left->rb_right, node);
+ left->rb_right = node;
+
+ rb_set_parent(left, parent);
+
+ if (parent)
+ {
+ if (node == parent->rb_right)
+ parent->rb_right = left;
+ else
+ parent->rb_left = left;
+ }
+ else
+ root->rb_node = left;
+ rb_set_parent(node, left);
+}
+
+void rb_insert_color(struct rb_node *node, struct rb_root *root)
+{
+ struct rb_node *parent, *gparent;
+
+ while ((parent = rb_parent(node)) && rb_is_red(parent))
+ {
+ gparent = rb_parent(parent);
+
+ if (parent == gparent->rb_left)
+ {
+ {
+ register struct rb_node *uncle = gparent->rb_right;
+ if (uncle && rb_is_red(uncle))
+ {
+ rb_set_black(uncle);
+ rb_set_black(parent);
+ rb_set_red(gparent);
+ node = gparent;
+ continue;
+ }
+ }
+
+ if (parent->rb_right == node)
+ {
+ register struct rb_node *tmp;
+ __rb_rotate_left(parent, root);
+ tmp = parent;
+ parent = node;
+ node = tmp;
+ }
+
+ rb_set_black(parent);
+ rb_set_red(gparent);
+ __rb_rotate_right(gparent, root);
+ } else {
+ {
+ register struct rb_node *uncle = gparent->rb_left;
+ if (uncle && rb_is_red(uncle))
+ {
+ rb_set_black(uncle);
+ rb_set_black(parent);
+ rb_set_red(gparent);
+ node = gparent;
+ continue;
+ }
+ }
+
+ if (parent->rb_left == node)
+ {
+ register struct rb_node *tmp;
+ __rb_rotate_right(parent, root);
+ tmp = parent;
+ parent = node;
+ node = tmp;
+ }
+
+ rb_set_black(parent);
+ rb_set_red(gparent);
+ __rb_rotate_left(gparent, root);
+ }
+ }
+
+ rb_set_black(root->rb_node);
+}
+
+static void __rb_erase_color(struct rb_node *node, struct rb_node *parent,
+ struct rb_root *root)
+{
+ struct rb_node *other;
+
+ while ((!node || rb_is_black(node)) && node != root->rb_node)
+ {
+ if (parent->rb_left == node)
+ {
+ other = parent->rb_right;
+ if (rb_is_red(other))
+ {
+ rb_set_black(other);
+ rb_set_red(parent);
+ __rb_rotate_left(parent, root);
+ other = parent->rb_right;
+ }
+ if ((!other->rb_left || rb_is_black(other->rb_left)) &&
+ (!other->rb_right || rb_is_black(other->rb_right)))
+ {
+ rb_set_red(other);
+ node = parent;
+ parent = rb_parent(node);
+ }
+ else
+ {
+ if (!other->rb_right || rb_is_black(other->rb_right))
+ {
+ struct rb_node *o_left;
+ if ((o_left = other->rb_left))
+ rb_set_black(o_left);
+ rb_set_red(other);
+ __rb_rotate_right(other, root);
+ other = parent->rb_right;
+ }
+ rb_set_color(other, rb_color(parent));
+ rb_set_black(parent);
+ if (other->rb_right)
+ rb_set_black(other->rb_right);
+ __rb_rotate_left(parent, root);
+ node = root->rb_node;
+ break;
+ }
+ }
+ else
+ {
+ other = parent->rb_left;
+ if (rb_is_red(other))
+ {
+ rb_set_black(other);
+ rb_set_red(parent);
+ __rb_rotate_right(parent, root);
+ other = parent->rb_left;
+ }
+ if ((!other->rb_left || rb_is_black(other->rb_left)) &&
+ (!other->rb_right || rb_is_black(other->rb_right)))
+ {
+ rb_set_red(other);
+ node = parent;
+ parent = rb_parent(node);
+ }
+ else
+ {
+ if (!other->rb_left || rb_is_black(other->rb_left))
+ {
+ register struct rb_node *o_right;
+ if ((o_right = other->rb_right))
+ rb_set_black(o_right);
+ rb_set_red(other);
+ __rb_rotate_left(other, root);
+ other = parent->rb_left;
+ }
+ rb_set_color(other, rb_color(parent));
+ rb_set_black(parent);
+ if (other->rb_left)
+ rb_set_black(other->rb_left);
+ __rb_rotate_right(parent, root);
+ node = root->rb_node;
+ break;
+ }
+ }
+ }
+ if (node)
+ rb_set_black(node);
+}
+
+void rb_erase(struct rb_node *node, struct rb_root *root)
+{
+ struct rb_node *child, *parent;
+ int color;
+
+ if (!node->rb_left)
+ child = node->rb_right;
+ else if (!node->rb_right)
+ child = node->rb_left;
+ else
+ {
+ struct rb_node *old = node, *left;
+
+ node = node->rb_right;
+ while ((left = node->rb_left) != NULL)
+ node = left;
+ child = node->rb_right;
+ parent = rb_parent(node);
+ color = rb_color(node);
+
+ if (child)
+ rb_set_parent(child, parent);
+ if (parent == old) {
+ parent->rb_right = child;
+ parent = node;
+ } else
+ parent->rb_left = child;
+
+ node->rb_parent_color = old->rb_parent_color;
+ node->rb_right = old->rb_right;
+ node->rb_left = old->rb_left;
+
+ if (rb_parent(old))
+ {
+ if (rb_parent(old)->rb_left == old)
+ rb_parent(old)->rb_left = node;
+ else
+ rb_parent(old)->rb_right = node;
+ } else
+ root->rb_node = node;
+
+ rb_set_parent(old->rb_left, node);
+ if (old->rb_right)
+ rb_set_parent(old->rb_right, node);
+ goto color;
+ }
+
+ parent = rb_parent(node);
+ color = rb_color(node);
+
+ if (child)
+ rb_set_parent(child, parent);
+ if (parent)
+ {
+ if (parent->rb_left == node)
+ parent->rb_left = child;
+ else
+ parent->rb_right = child;
+ }
+ else
+ root->rb_node = child;
+
+ color:
+ if (color == RB_BLACK)
+ __rb_erase_color(child, parent, root);
+}
+
+/*
+ * This function returns the first node (in sort order) of the tree.
+ */
+struct rb_node *rb_first(struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_left)
+ n = n->rb_left;
+ return n;
+}
+
+struct rb_node *rb_last(struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_right)
+ n = n->rb_right;
+ return n;
+}
+
+struct rb_node *rb_next(struct rb_node *node)
+{
+ struct rb_node *parent;
+
+ if (rb_parent(node) == node)
+ return NULL;
+
+ /* If we have a right-hand child, go down and then left as far
+ as we can. */
+ if (node->rb_right) {
+ node = node->rb_right;
+ while (node->rb_left)
+ node=node->rb_left;
+ return node;
+ }
+
+ /* No right-hand children. Everything down and left is
+ smaller than us, so any 'next' node must be in the general
+ direction of our parent. Go up the tree; any time the
+ ancestor is a right-hand child of its parent, keep going
+ up. First time it's a left-hand child of its parent, said
+ parent is our 'next' node. */
+ while ((parent = rb_parent(node)) && node == parent->rb_right)
+ node = parent;
+
+ return parent;
+}
+
+struct rb_node *rb_prev(struct rb_node *node)
+{
+ struct rb_node *parent;
+
+ if (rb_parent(node) == node)
+ return NULL;
+
+ /* If we have a left-hand child, go down and then right as far
+ as we can. */
+ if (node->rb_left) {
+ node = node->rb_left;
+ while (node->rb_right)
+ node=node->rb_right;
+ return node;
+ }
+
+ /* No left-hand children. Go up till we find an ancestor which
+ is a right-hand child of its parent */
+ while ((parent = rb_parent(node)) && node == parent->rb_left)
+ node = parent;
+
+ return parent;
+}
+
+void rb_replace_node(struct rb_node *victim, struct rb_node *new,
+ struct rb_root *root)
+{
+ struct rb_node *parent = rb_parent(victim);
+
+ /* Set the surrounding nodes to point to the replacement */
+ if (parent) {
+ if (victim == parent->rb_left)
+ parent->rb_left = new;
+ else
+ parent->rb_right = new;
+ } else {
+ root->rb_node = new;
+ }
+ if (victim->rb_left)
+ rb_set_parent(victim->rb_left, new);
+ if (victim->rb_right)
+ rb_set_parent(victim->rb_right, new);
+
+ /* Copy the pointers/colour from the victim to the replacement */
+ *new = *victim;
+}
diff --git a/rbtree.h b/rbtree.h
new file mode 100644
index 0000000..e020e7c
--- /dev/null
+++ b/rbtree.h
@@ -0,0 +1,160 @@
+/*
+ Red Black Trees
+ (C) 1999 Andrea Arcangeli <andrea@suse.de>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+ linux/include/linux/rbtree.h
+
+ To use rbtrees you'll have to implement your own insert and search cores.
+ This will avoid us to use callbacks and to drop drammatically performances.
+ I know it's not the cleaner way, but in C (not in C++) to get
+ performances and genericity...
+
+ Some example of insert and search follows here. The search is a plain
+ normal search over an ordered tree. The insert instead must be implemented
+ int two steps: as first thing the code must insert the element in
+ order as a red leaf in the tree, then the support library function
+ rb_insert_color() must be called. Such function will do the
+ not trivial work to rebalance the rbtree if necessary.
+
+-----------------------------------------------------------------------
+static inline struct page * rb_search_page_cache(struct inode * inode,
+ unsigned long offset)
+{
+ struct rb_node * n = inode->i_rb_page_cache.rb_node;
+ struct page * page;
+
+ while (n)
+ {
+ page = rb_entry(n, struct page, rb_page_cache);
+
+ if (offset < page->offset)
+ n = n->rb_left;
+ else if (offset > page->offset)
+ n = n->rb_right;
+ else
+ return page;
+ }
+ return NULL;
+}
+
+static inline struct page * __rb_insert_page_cache(struct inode * inode,
+ unsigned long offset,
+ struct rb_node * node)
+{
+ struct rb_node ** p = &inode->i_rb_page_cache.rb_node;
+ struct rb_node * parent = NULL;
+ struct page * page;
+
+ while (*p)
+ {
+ parent = *p;
+ page = rb_entry(parent, struct page, rb_page_cache);
+
+ if (offset < page->offset)
+ p = &(*p)->rb_left;
+ else if (offset > page->offset)
+ p = &(*p)->rb_right;
+ else
+ return page;
+ }
+
+ rb_link_node(node, parent, p);
+
+ return NULL;
+}
+
+static inline struct page * rb_insert_page_cache(struct inode * inode,
+ unsigned long offset,
+ struct rb_node * node)
+{
+ struct page * ret;
+ if ((ret = __rb_insert_page_cache(inode, offset, node)))
+ goto out;
+ rb_insert_color(node, &inode->i_rb_page_cache);
+ out:
+ return ret;
+}
+-----------------------------------------------------------------------
+*/
+
+#ifndef _LINUX_RBTREE_H
+#define _LINUX_RBTREE_H
+#include <stdlib.h>
+struct rb_node
+{
+ unsigned long rb_parent_color;
+#define RB_RED 0
+#define RB_BLACK 1
+ struct rb_node *rb_right;
+ struct rb_node *rb_left;
+} __attribute__((aligned(sizeof(long))));
+ /* The alignment might seem pointless, but allegedly CRIS needs it */
+
+struct rb_root
+{
+ struct rb_node *rb_node;
+ void (*rotate_notify)(struct rb_node *old_parent, struct rb_node *node);
+
+};
+
+
+#define rb_parent(r) ((struct rb_node *)((r)->rb_parent_color & ~3))
+#define rb_color(r) ((r)->rb_parent_color & 1)
+#define rb_is_red(r) (!rb_color(r))
+#define rb_is_black(r) rb_color(r)
+#define rb_set_red(r) do { (r)->rb_parent_color &= ~1; } while (0)
+#define rb_set_black(r) do { (r)->rb_parent_color |= 1; } while (0)
+
+static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
+{
+ rb->rb_parent_color = (rb->rb_parent_color & 3) | (unsigned long)p;
+}
+static inline void rb_set_color(struct rb_node *rb, int color)
+{
+ rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
+}
+
+#define RB_ROOT (struct rb_root) { NULL, }
+#define rb_entry(ptr, type, member) container_of(ptr, type, member)
+
+#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
+#define RB_EMPTY_NODE(node) (rb_parent(node) == node)
+#define RB_CLEAR_NODE(node) (rb_set_parent(node, node))
+
+extern void rb_insert_color(struct rb_node *, struct rb_root *);
+extern void rb_erase(struct rb_node *, struct rb_root *);
+
+/* Find logical next and previous nodes in a tree */
+extern struct rb_node *rb_next(struct rb_node *);
+extern struct rb_node *rb_prev(struct rb_node *);
+extern struct rb_node *rb_first(struct rb_root *);
+extern struct rb_node *rb_last(struct rb_root *);
+
+/* Fast replacement of a single node without remove/rebalance/add/rebalance */
+extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
+ struct rb_root *root);
+
+static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
+ struct rb_node ** rb_link)
+{
+ node->rb_parent_color = (unsigned long )parent;
+ node->rb_left = node->rb_right = NULL;
+
+ *rb_link = node;
+}
+
+#endif /* _LINUX_RBTREE_H */
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 03/20] qcow2: Add deduplication structures and fields.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 02/20] qcow2: Add kernel red black trees Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 04/20] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
` (17 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2.h | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.h b/block/qcow2.h
index b4eb654..e38667c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -26,6 +26,7 @@
#define BLOCK_QCOW2_H
#include "aes.h"
+#include "rbtree.h"
#include "qemu-coroutine.h"
//#define DEBUG_ALLOC
@@ -58,6 +59,19 @@
#define DEFAULT_CLUSTER_SIZE 65536
+/* Red Black Tree deduplication node */
+typedef struct {
+ struct rb_node node;
+ uint8_t *hash; /* SHA256 hash of a given cluster */
+ uint64_t offset; /* offset where the cluster is stored (sectors) */
+} QCowHashNode;
+
+/* Undedupable hashes that must be written later to disk */
+typedef struct QCowHashElement {
+ uint8_t *hash;
+ QTAILQ_ENTRY(QCowHashElement) next;
+} QCowHashElement;
+
typedef struct QCowHeader {
uint32_t magic;
uint32_t version;
@@ -114,8 +128,10 @@ enum {
enum {
QCOW2_INCOMPAT_DIRTY_BITNR = 0,
QCOW2_INCOMPAT_DIRTY = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
+ QCOW2_INCOMPAT_DEDUP_BITNR = 1,
+ QCOW2_INCOMPAT_DEDUP = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
- QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY,
+ QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
};
/* Compatible feature bits */
@@ -148,6 +164,7 @@ typedef struct BDRVQcowState {
Qcow2Cache* l2_table_cache;
Qcow2Cache* refcount_block_cache;
+ Qcow2Cache *dedup_cluster_cache;
uint8_t *cluster_cache;
uint8_t *cluster_data;
@@ -160,6 +177,12 @@ typedef struct BDRVQcowState {
int64_t free_cluster_index;
int64_t free_byte_offset;
+ bool has_dedup;
+ uint64_t *dedup_table;
+ uint64_t dedup_table_offset;
+ int32_t dedup_table_size;
+ struct rb_root dedup_rb_tree;
+ QTAILQ_HEAD(, QCowHashElement) undedupable_hashes;
CoMutex lock;
uint32_t crypt_method; /* current crypt method, 0 if no key yet */
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 04/20] qcow2: Add qcow2_dedup_read_missing_and_concatenate
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (2 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 03/20] qcow2: Add deduplication structures and fields Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 05/20] qcow2: Rename update_refcount into qcow2_update_refcount Benoît Canet
` (16 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This function is used to read missing data when unaligned writes are
done. This function also concatenate missing data with the given
qiov data in order to prepare a buffer used to look for duplicated
clusters.
---
block/Makefile.objs | 1 +
block/qcow2-dedup.c | 137 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 8 +++
3 files changed, 146 insertions(+)
create mode 100644 block/qcow2-dedup.c
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 554f429..790cb5f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2-dedup.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
new file mode 100644
index 0000000..9955843
--- /dev/null
+++ b/block/qcow2-dedup.c
@@ -0,0 +1,137 @@
+/*
+ * Deduplication for the QCOW2 format
+ *
+ * Copyright (C) Nodalink, SARL. 2012
+ *
+ * Author:
+ * Benoît Canet <benoit.canet@irqsave.net>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block_int.h"
+#include "qemu-common.h"
+#include "qcow2.h"
+
+/**
+ * Read some data from the QCOW2 file
+ *
+ * @data: the buffer where the data must be stored
+ * @sector_num: the sector number to read in the QCOW2 file
+ * @nb_sectors: the number of sectors to read
+ * @ret: negative on error
+ */
+static int qcow2_dedup_read_missing_cluster_data(BlockDriverState *bs,
+ uint8_t *data,
+ uint64_t sector_num,
+ int nb_sectors)
+{
+ BDRVQcowState *s = bs->opaque;
+ QEMUIOVector qiov;
+ struct iovec iov;
+ int ret;
+
+ iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE;
+ iov.iov_base = data;
+ qemu_iovec_init_external(&qiov, &iov, 1);
+ qemu_co_mutex_unlock(&s->lock);
+ ret = bdrv_co_readv(bs, sector_num, nb_sectors, &qiov);
+ qemu_co_mutex_lock(&s->lock);
+ if (ret < 0) {
+ error_report("failed to read %d sectors at offset %" PRIu64 "\n",
+ nb_sectors, sector_num);
+ }
+
+ return ret;
+}
+
+/*
+ * Prepare a buffer containing all the required data required to compute cluster
+ * sized deduplication hashes.
+ * If sector_num and nb_sectors are unaligned cluster wize it read the missing
+ * data before and after the qiov.
+ *
+ * @qiov: the qiov for which missing data must be read
+ * @sector_num: the first sectors that must be read into the qiov
+ * @nb_sectors: the number of sectors to read into the qiov
+ * @data: the place where the data will be concatenated and stored
+ * @nb_data_sectors: the resulting size of the contatenated data (in sectors)
+ * @ret: negative on error
+ */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ uint64_t sector_num,
+ int nb_sectors,
+ uint8_t **data,
+ int *nb_data_sectors)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret;
+ uint64_t cluster_beginning_sector;
+ uint64_t first_sector_after_qiov;
+ int cluster_beginning_nr;
+ int cluster_ending_nr;
+ int unaligned_ending_nr;
+
+ /* compute how much and where to read at the beginning */
+ cluster_beginning_nr = sector_num & (s->cluster_sectors - 1);
+ cluster_beginning_sector = sector_num - cluster_beginning_nr;
+
+ /* for the ending */
+ first_sector_after_qiov = sector_num + nb_sectors;
+ unaligned_ending_nr = first_sector_after_qiov & (s->cluster_sectors - 1);
+ cluster_ending_nr = unaligned_ending_nr ?
+ s->cluster_sectors - unaligned_ending_nr : 0;
+
+ /* compute total size in sectors and allocate memory */
+ *nb_data_sectors = cluster_beginning_nr + nb_sectors + cluster_ending_nr;
+ *data = qemu_blockalign(bs, *nb_data_sectors * BDRV_SECTOR_SIZE);
+
+ /* read beginning */
+ ret = qcow2_dedup_read_missing_cluster_data(bs,
+ *data,
+ cluster_beginning_sector,
+ cluster_beginning_nr);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ /* append qiov content */
+ qemu_iovec_to_buf(qiov, 0, *data + cluster_beginning_nr * BDRV_SECTOR_SIZE,
+ qiov->size);
+
+ /* read and add ending */
+ ret = qcow2_dedup_read_missing_cluster_data(bs,
+ *data +
+ (cluster_beginning_nr +
+ nb_sectors) * BDRV_SECTOR_SIZE,
+ first_sector_after_qiov,
+ cluster_ending_nr);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ return 0;
+
+fail:
+ qemu_vfree(*data);
+ return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index e38667c..6e867c8 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -356,4 +356,12 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
void **table);
int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
+/* qcow2-dedup.c functions */
+int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ uint64_t sector,
+ int sectors_nr,
+ uint8_t **dedup_cluster_data,
+ int *dedup_cluster_data_nr);
+
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 05/20] qcow2: Rename update_refcount into qcow2_update_refcount.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (3 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 04/20] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 06/20] qcow2: Add qcow2_dedup and related functions Benoît Canet
` (15 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
Also make it public so qcow2-dedup.c can use it.
---
block/qcow2-refcount.c | 38 +++++++++++++++++++-------------------
block/qcow2.h | 4 +++-
2 files changed, 22 insertions(+), 20 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 5e3f915..692f3fb 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -27,9 +27,6 @@
#include "block/qcow2.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
-static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
- int64_t offset, int64_t length,
- int addend);
/*********************************************************/
@@ -188,10 +185,11 @@ static int alloc_refcount_block(BlockDriverState *bs,
* recursion. Instead we must place the refcount blocks in a way that
* they can describe them themselves.
*
- * - We need to consider that at this point we are inside update_refcounts
- * and doing the initial refcount increase. This means that some clusters
- * have already been allocated by the caller, but their refcount isn't
- * accurate yet. free_cluster_index tells us where this allocation ends
+ * - We need to consider that at this point we are inside
+ * qcow2_update_refcounts and doing the initial refcount increase.
+ * This means that some clusters have already been allocated by the
+ * caller, but their refcount isn't accurate yet.
+ * free_cluster_index tells us where this allocation ends
* as long as we don't overwrite it by freeing clusters.
*
* - alloc_clusters_noref and qcow2_free_clusters may load a different
@@ -232,7 +230,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
} else {
/* Described somewhere else. This can recurse at most twice before we
* arrive at a block that describes itself. */
- ret = update_refcount(bs, new_block, s->cluster_size, 1);
+ ret = qcow2_update_refcount(bs, new_block, s->cluster_size, 1);
if (ret < 0) {
goto fail_block;
}
@@ -240,7 +238,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
bdrv_flush(bs->file);
/* Initialize the new refcount block only after updating its refcount,
- * update_refcount uses the refcount cache itself */
+ * qcow2_update_refcount uses the refcount cache itself */
ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block,
(void**) refcount_block);
if (ret < 0) {
@@ -412,7 +410,7 @@ fail_block:
}
/* XXX: cache several refcount block clusters ? */
-static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
+int QEMU_WARN_UNUSED_RESULT qcow2_update_refcount(BlockDriverState *bs,
int64_t offset, int64_t length, int addend)
{
BDRVQcowState *s = bs->opaque;
@@ -422,8 +420,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
int ret;
#ifdef DEBUG_ALLOC2
- fprintf(stderr, "update_refcount: offset=%" PRId64 " size=%" PRId64 " addend=%d\n",
- offset, length, addend);
+ fprintf(stderr, "qcow2_update_refcount: offset=%" PRId64
+ " size=%" PRId64 " addend=%d\n", offset, length, addend);
#endif
if (length < 0) {
return -EINVAL;
@@ -499,7 +497,8 @@ fail:
*/
if (ret < 0) {
int dummy;
- dummy = update_refcount(bs, offset, cluster_offset - offset, -addend);
+ dummy = qcow2_update_refcount(bs, offset,
+ cluster_offset - offset, -addend);
(void)dummy;
}
@@ -520,7 +519,8 @@ static int update_cluster_refcount(BlockDriverState *bs,
BDRVQcowState *s = bs->opaque;
int ret;
- ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend);
+ ret = qcow2_update_refcount(bs, cluster_index << s->cluster_bits,
+ 1, addend);
if (ret < 0) {
return ret;
}
@@ -574,7 +574,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size)
return offset;
}
- ret = update_refcount(bs, offset, size, 1);
+ ret = qcow2_update_refcount(bs, offset, size, 1);
if (ret < 0) {
return ret;
}
@@ -606,7 +606,7 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
old_free_cluster_index = s->free_cluster_index;
s->free_cluster_index = cluster_index + i;
- ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
+ ret = qcow2_update_refcount(bs, offset, i << s->cluster_bits, 1);
if (ret < 0) {
return ret;
}
@@ -672,7 +672,7 @@ void qcow2_free_clusters(BlockDriverState *bs,
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_FREE);
- ret = update_refcount(bs, offset, size, -1);
+ ret = qcow2_update_refcount(bs, offset, size, -1);
if (ret < 0) {
fprintf(stderr, "qcow2_free_clusters failed: %s\n", strerror(-ret));
/* TODO Remember the clusters to free them later and avoid leaking */
@@ -779,7 +779,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
s->csize_mask) + 1;
if (addend != 0) {
int ret;
- ret = update_refcount(bs,
+ ret = qcow2_update_refcount(bs,
(offset & s->cluster_offset_mask) & ~511,
nb_csectors * 512, addend);
if (ret < 0) {
@@ -1213,7 +1213,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
i, refcount1, refcount2);
if (num_fixed) {
- ret = update_refcount(bs, i << s->cluster_bits, 1,
+ ret = qcow2_update_refcount(bs, i << s->cluster_bits, 1,
refcount2 - refcount1);
if (ret >= 0) {
(*num_fixed)++;
diff --git a/block/qcow2.h b/block/qcow2.h
index 6e867c8..77c0f2b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -292,7 +292,9 @@ int qcow2_update_header(BlockDriverState *bs);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
-
+int QEMU_WARN_UNUSED_RESULT qcow2_update_refcount(BlockDriverState *bs,
+ int64_t offset, int64_t length,
+ int addend);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size);
int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
int nb_clusters);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 06/20] qcow2: Add qcow2_dedup and related functions.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (4 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 05/20] qcow2: Rename update_refcount into qcow2_update_refcount Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 07/20] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
` (14 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 332 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 7 ++
2 files changed, 339 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 9955843..ae45130 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -29,6 +29,8 @@
#include "qemu-common.h"
#include "qcow2.h"
+#define HASH_LENGTH 32
+
/**
* Read some data from the QCOW2 file
*
@@ -135,3 +137,333 @@ fail:
qemu_vfree(*data);
return ret;
}
+
+/*
+ * Lookup a hash in the deduplication red black tree
+ *
+ * @hash: the hash to lookup in the red black tree
+ * @ret: a structure containing the offset of the cluster and the hash or NULL
+ */
+static QCowHashNode *qcow2_dedup_lookup_hash_in_rb_tree(BlockDriverState *bs,
+ uint8_t *hash)
+{
+ BDRVQcowState *s = bs->opaque;
+ struct rb_node *node = s->dedup_rb_tree.rb_node;
+
+ while (node) {
+ QCowHashNode *data = container_of(node, QCowHashNode, node);
+ int result = memcmp(hash, data->hash, HASH_LENGTH);
+
+ if (result < 0) {
+ node = node->rb_left;
+ } else if (result > 0) {
+ node = node->rb_right;
+ } else {
+ return data;
+ }
+ }
+
+ return NULL;
+}
+
+/*
+ * Insert a hash and offset couple in the red black tree
+ *
+ * @hash: the given hash
+ * @physical_cluster_offset: the cluster offset in the QCOW2 file
+ * @ret: true on successfull insertion, false on duplicate
+ */
+static bool qcow2_dedup_insert_hash_in_rb_tree(BlockDriverState *bs,
+ uint8_t *hash,
+ uint64_t physical_cluster_offset)
+{
+ BDRVQcowState *s = bs->opaque;
+ struct rb_node **new = &(s->dedup_rb_tree.rb_node), *parent = NULL;
+ QCowHashNode *data;
+
+ /* Figure out where to put new node */
+ while (*new) {
+ QCowHashNode *this = container_of(*new, QCowHashNode, node);
+ int result = memcmp(hash, this->hash, HASH_LENGTH);
+
+ parent = *new;
+ if (result < 0) {
+ new = &((*new)->rb_left);
+ } else if (result > 0) {
+ new = &((*new)->rb_right);
+ } else {
+ return false;
+ }
+ }
+
+ /* alloc node */
+ data = g_new0(QCowHashNode, 1);
+ data->hash = hash;
+ data->offset = physical_cluster_offset;
+
+ /* Add new node and rebalance tree. */
+ rb_link_node(&data->node, parent, new);
+ rb_insert_color(&data->node, &s->dedup_rb_tree);
+
+ return true;
+}
+
+/*
+ * Compute the hash of a given cluster
+ *
+ * @data: a buffer containing the cluster data
+ * @ret: a HASH_LENGTH long dynamically allocated array containing the hash
+ */
+static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
+ uint8_t *data)
+{
+ return NULL;
+}
+
+/* Try to find the offset of a given cluster if it's duplicated
+ * Exceptionally we cast return value to int64_t to use as error code.
+ *
+ * @data: a buffer containing the cluster
+ * @skip_cluster_nr: the number of cluster to skip in the buffer
+ * @hash: if hash is provided it's used else it's computed
+ * @ret: the offset in sectors of the duplicated cluster or -1
+ */
+static int64_t qcow2_dedup_compute_hash_lookup_offset(BlockDriverState *bs,
+ uint8_t *data,
+ int skip_cluster_nr,
+ uint8_t **hash)
+{
+ BDRVQcowState *s = bs->opaque;
+ QCowHashNode *hash_node;
+ if (!*hash) {
+ /* no hash has been provided compute it and store it for caller usage */
+ *hash = qcow2_compute_cluster_hash(bs,
+ data + skip_cluster_nr *
+ s->cluster_size);
+ }
+ hash_node = qcow2_dedup_lookup_hash_in_rb_tree(bs, *hash);
+ if (hash_node) {
+ /* return offset of the duplicated cluster */
+ return hash_node->offset;
+ }
+ /* cluster not duplicated */
+ qcow2_dedup_insert_hash_in_rb_tree(bs, *hash,
+ (uint64_t) 1 <<
+ (sizeof(uint64_t) * 8 - 1));
+ return -1;
+}
+
+/*
+ * Helper used to build a QCowHashElement
+ *
+ * @hash: the hash to use
+ * @ret: a newly allocated QCowHashElement pointing to the given hash
+ */
+static QCowHashElement *qcow2_build_dedup_hash(uint8_t *hash)
+{
+ QCowHashElement *dedup_hash;
+ dedup_hash = g_new0(QCowHashElement, 1);
+ dedup_hash->hash = hash;
+ return dedup_hash;
+}
+
+/*
+ * Helper used to link a deduplicated cluster in the l2
+ *
+ * @logical_cluster_offset: the cluster offset seen by the guest (in sectors)
+ * @physical_cluster_offset: the cluster offset in the QCOW2 file (in sectors)
+ * @ret:
+ */
+static int qcow2_dedup_link_l2(BlockDriverState *bs,
+ uint64_t logical_cluster_offset,
+ uint64_t physical_cluster_offset)
+{
+ QCowL2Meta m;
+ /* function correctness regarding copy on write ? */
+ m.offset = logical_cluster_offset;
+ m.alloc_offset = physical_cluster_offset;
+ m.nb_clusters = 1; /* we are linking only one cluster in l2 */
+ return qcow2_alloc_cluster_link_l2(bs, &m);
+}
+
+/* This function tries to deduplicate a given cluster.
+ *
+ * @sector_num: the logical sector number we are trying to deduplicate
+ * @precomputed_hash: Used instead of computing the hash if provided
+ * @data: the buffer in which to look for a duplicated cluster
+ * @skip_clusters_nr: the number of cluster that must be skipped in data
+ * @non_duplicated_dedup_hash: returned if the cluster is not deduplicated
+ * @ret: ret < 0 on error, 1 on deduplication else 0
+ */
+static int qcow2_dedup_cluster(BlockDriverState *bs,
+ uint64_t sector_num,
+ uint8_t **precomputed_hash,
+ uint8_t *data,
+ int skip_clusters_nr,
+ QCowHashElement **non_duplicated_dedup_hash)
+{
+ BDRVQcowState *s = bs->opaque;
+ int64_t physical_cluster_offset;
+ uint64_t logical_cluster_offset;
+ uint64_t existing_physical_cluster_offset;
+ int ret;
+ int pnum = s->cluster_sectors;
+ *non_duplicated_dedup_hash = NULL;
+
+ /* look for physical cluster offset if the cluster is duplicated */
+ physical_cluster_offset =
+ qcow2_dedup_compute_hash_lookup_offset(bs,
+ data,
+ skip_clusters_nr,
+ precomputed_hash);
+
+ /* round the logical cluster offset to cluster boundaries */
+ logical_cluster_offset = sector_num & ~(s->cluster_sectors - 1);
+ ret = qcow2_get_cluster_offset(bs, logical_cluster_offset << 9,
+ &pnum, &existing_physical_cluster_offset);
+ if (ret < 0) {
+ goto exit;
+ }
+
+ if (physical_cluster_offset == -1) {
+ /* no duplicated cluster found, store the hash for later usage */
+ *non_duplicated_dedup_hash = qcow2_build_dedup_hash(*precomputed_hash);
+ return 0;
+ } else {
+ /* duplicated cluster found */
+
+ if (ret == QCOW2_CLUSTER_NORMAL) {
+ /* we are replacing a cluster, decrement it's physical refcount */
+ qcow2_free_clusters(bs, existing_physical_cluster_offset,
+ s->cluster_size);
+ }
+
+ ret = qcow2_update_refcount(bs, physical_cluster_offset,
+ s->cluster_size, 1);
+ if (ret < 0) {
+ goto exit;
+ }
+
+ ret = qcow2_dedup_link_l2(bs, logical_cluster_offset,
+ physical_cluster_offset);
+ if (ret < 0) {
+ goto exit;
+ }
+
+ }
+
+ ret = 1;
+exit:
+ *precomputed_hash = NULL;
+ free(*precomputed_hash);
+ return ret;
+}
+
+/*
+ * Deduplicate all the cluster that can be deduplicated.
+ *
+ * Next it compute the number of non deduplicable sectors to come while storing
+ * the hashes of these sectors in a linked list for later usage.
+ * Then it compute the first duplicated cluster hash that come after non
+ * deduplicable cluster, this hash will be used at next call of the function
+ *
+ * @sector_num: the first sector to deduplicate (in sectors)
+ * @data: the buffer containing the data to deduplicate
+ * @data_nr: the size of the buffer in sectors
+ * @skip_cluster_nr: the number of cluster to skip at the begining of data
+ * @next_non_dedupable_sectors_nr: result containing the number of non
+ * deduplicable sectors to come
+ * next_call_first_hash: hash saved between the function call
+ *
+ *
+ *
+ */
+int qcow2_dedup(BlockDriverState *bs,
+ uint64_t sector_num,
+ uint8_t *data,
+ int data_nr,
+ int *skip_clusters_nr,
+ int *next_non_dedupable_sectors_nr,
+ uint8_t **next_call_first_hash)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret;
+ int deduped_clusters_nr = 0;
+ int left_to_test_clusters_nr;
+ int begining_index;
+ uint8_t *hash = NULL;
+ QCowHashElement *non_duplicated_dedup_hash = NULL;
+
+ /* should already be zero when entering this function */
+ assert(*next_non_dedupable_sectors_nr == 0);
+
+ begining_index = sector_num & (s->cluster_sectors - 1);
+
+ left_to_test_clusters_nr = (data_nr / s->cluster_sectors) -
+ *skip_clusters_nr;
+
+ /* Deduplicate all that can be */
+ while (left_to_test_clusters_nr-- &&
+ (ret = qcow2_dedup_cluster(bs,
+ sector_num,
+ next_call_first_hash,
+ data,
+ (*skip_clusters_nr)++,
+ &non_duplicated_dedup_hash)) == 1) {
+ deduped_clusters_nr++;
+ }
+
+ if (ret < 0) {
+ *next_call_first_hash = NULL;
+ goto exit;
+ }
+
+ /* We deduped everything till the end */
+ if (!non_duplicated_dedup_hash) {
+ *next_call_first_hash = NULL;
+ *next_non_dedupable_sectors_nr = 0;
+ goto exit;
+ }
+
+ /* We consumed the precomputed hash */
+ *next_call_first_hash = NULL;
+
+ /* remember that the last sector we tested in non deduplicable */
+ *next_non_dedupable_sectors_nr += s->cluster_sectors;
+
+ /* Memorize the hash of the first non duplicated cluster.
+ * we will store it before writing the cluster to disk.
+ */
+ QTAILQ_INSERT_TAIL(&s->undedupable_hashes, non_duplicated_dedup_hash, next);
+
+ /* Count how many non duplicated sector can be written and memorize the
+ * hashes for later.
+ * We make sure we pass hash == NULL to force computation of the hash.
+ */
+ hash = NULL;
+ while (left_to_test_clusters_nr-- > 0 &&
+ -1 == qcow2_dedup_compute_hash_lookup_offset(bs,
+ data,
+ (*skip_clusters_nr)++,
+ &hash)) {
+ *next_non_dedupable_sectors_nr += s->cluster_sectors;
+ non_duplicated_dedup_hash = qcow2_build_dedup_hash(hash);
+ QTAILQ_INSERT_TAIL(&s->undedupable_hashes,
+ non_duplicated_dedup_hash, next);
+ hash = NULL;
+ }
+
+ /* We find a duplicated cluster before stopping iterating store it for
+ * next call.
+ */
+ if (hash && non_duplicated_dedup_hash &&
+ non_duplicated_dedup_hash->hash != hash) {
+ *next_call_first_hash = hash;
+ }
+
+exit:
+ if (!deduped_clusters_nr) {
+ return 0;
+ }
+ return deduped_clusters_nr * s->cluster_sectors - begining_index;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 77c0f2b..6292d4e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -365,5 +365,12 @@ int qcow2_dedup_read_missing_and_concatenate(BlockDriverState *bs,
int sectors_nr,
uint8_t **dedup_cluster_data,
int *dedup_cluster_data_nr);
+int qcow2_dedup(BlockDriverState *bs,
+ uint64_t sector_num,
+ uint8_t *data,
+ int data_nr,
+ int *skip_clusters_nr,
+ int *next_non_dedupable_sectors_nr,
+ uint8_t **next_call_first_hash);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 07/20] qcow2: Add qcow2_dedup_write_new_hashes.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (5 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 06/20] qcow2: Add qcow2_dedup and related functions Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 08/20] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
` (13 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 161 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 3 +
2 files changed, 164 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index ae45130..50d61f2 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -467,3 +467,164 @@ exit:
}
return deduped_clusters_nr * s->cluster_sectors - begining_index;
}
+
+/* Read a hash cluster from disk or allocate it if it doesn't exist yet
+ *
+ * @in_dedup_table_index: The index of the hash cluster in the dedup table
+ * @hash_block: the place where the cluster will be loaded
+ * @create: set to true if dedup table entries must be created
+ * when not found
+ * @ret: 0 on success, errno on error
+ */
+static int get_hash_cluster_from_cache(BlockDriverState *bs,
+ int32_t in_dedup_table_index,
+ uint8_t **hash_block, bool create)
+{
+ BDRVQcowState *s = bs->opaque;
+ int ret = -ENOSPC;
+ uint64_t hash_cluster_offset;
+
+ if (in_dedup_table_index > s->dedup_table_size) {
+ goto fail;
+ }
+
+ if (!s->dedup_table[in_dedup_table_index] && create) {
+ /* the dedup table entry doesn't exists and we must create it */
+ uint64_t data64;
+ /* allocate a new dedup table cluster */
+ hash_cluster_offset = qcow2_alloc_clusters(bs, s->cluster_size);
+ s->dedup_table[in_dedup_table_index] = hash_cluster_offset;
+ /* get an empty cluster from the dedup cache */
+ ret = qcow2_cache_get_empty(bs, s->dedup_cluster_cache,
+ hash_cluster_offset,
+ (void **) hash_block);
+ if (ret < 0) {
+ goto fail;
+ }
+ /* clear it */
+ memset(*hash_block, 0, s->cluster_size);
+ /* write the new block offset in the dedup table */
+ data64 = cpu_to_be64(hash_cluster_offset);
+ ret = bdrv_pwrite_sync(bs->file,
+ s->dedup_table_offset * BDRV_SECTOR_SIZE +
+ in_dedup_table_index * sizeof(uint64_t),
+ &data64, sizeof(data64));
+ if (ret < 0) {
+ goto fail;
+ }
+ } else if (!s->dedup_table[in_dedup_table_index] && !create) {
+ /* the dedup table entry doesn't exits and we must _not_ create */
+ *hash_block = g_malloc0(s->cluster_size);
+ return 1;
+ } else {
+ /* the entry exists get it */
+ hash_cluster_offset = s->dedup_table[in_dedup_table_index];
+ ret = qcow2_cache_get(bs, s->dedup_cluster_cache,
+ hash_cluster_offset, (void **) hash_block);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ return 0;
+
+fail:
+ return ret;
+}
+
+/* Read/write a given hash and cluster_offset from/to the dedup table
+ *
+ * This function doesn't flush the dedup cache to disk
+ *
+ * @hash: the hash to read or store
+ * @physical_cluster_offset: offset of the cluster in QCOW2 file (in sectors)
+ * @write: true to write, false to read
+ * @ret: 0 on succes, errno on error
+ */
+static int qcow2_dedup_read_write_hash(BlockDriverState *bs,
+ uint8_t **hash,
+ uint64_t physical_cluster_offset,
+ bool write)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint8_t *hash_block = NULL;
+ int ret;
+ int64_t cluster_number;
+ int64_t in_dedup_table_index;
+ int hash_block_offset;
+ int nb_entries_by_dedup_table_cluster = s->cluster_size / sizeof(uint64_t);
+ int nb_hash_in_dedup_cluster = s->cluster_size / HASH_LENGTH;
+
+ cluster_number = physical_cluster_offset / s->cluster_sectors;
+ in_dedup_table_index = cluster_number /
+ (nb_entries_by_dedup_table_cluster *
+ nb_hash_in_dedup_cluster);
+
+ /* if we are doing a write this will create missing dedup table entries */
+ ret = get_hash_cluster_from_cache(bs, in_dedup_table_index,
+ &hash_block, write);
+ if (ret < 0) {
+ return ret;
+ }
+
+ hash_block_offset = (cluster_number % nb_hash_in_dedup_cluster) *
+ HASH_LENGTH;
+ if (write) {
+ memcpy(hash_block + hash_block_offset , *hash, HASH_LENGTH);
+ } else {
+ *hash = g_malloc(HASH_LENGTH);
+ memcpy(*hash, hash_block + hash_block_offset, HASH_LENGTH);
+ }
+
+ if (!ret) {
+ qcow2_cache_put(bs, s->dedup_cluster_cache, (void **) &hash_block);
+ }
+
+ return 0;
+}
+
+/* This function write the hashes of the clusters which are not duplicated
+ *
+ * @physical_cluster_offset: offset of the first cluster (in sectors)
+ * @nr: the number of clusters to do
+ * @ret: 0 on succes, errno on error
+ */
+int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
+ uint64_t physical_cluster_offset,
+ int nr)
+{
+ int ret;
+ BDRVQcowState *s = bs->opaque;
+ QCowHashElement *dedup_hash, *next_dedup_hash;
+ QCowHashNode *hash_node;
+
+ int i = 0;
+
+ QTAILQ_FOREACH_SAFE(dedup_hash, &s->undedupable_hashes,
+ next, next_dedup_hash) {
+ ret = qcow2_dedup_read_write_hash(bs, &dedup_hash->hash,
+ physical_cluster_offset + i *
+ s->cluster_sectors,
+ true);
+ if (ret < 0) {
+ goto fail;
+ }
+
+ hash_node = qcow2_dedup_lookup_hash_in_rb_tree(bs, dedup_hash->hash);
+ if (hash_node->offset == (uint64_t) 1 <<
+ (sizeof(uint64_t) * 8 - 1)) {
+ hash_node->offset = physical_cluster_offset +
+ i * s->cluster_sectors;
+ }
+ QTAILQ_REMOVE(&s->undedupable_hashes, dedup_hash, next);
+ g_free(dedup_hash);
+ i++;
+ if (i == nr) {
+ break;
+ }
+ }
+
+ ret = qcow2_cache_flush(bs, s->dedup_cluster_cache);
+fail:
+ return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 6292d4e..58aee77 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -372,5 +372,8 @@ int qcow2_dedup(BlockDriverState *bs,
int *skip_clusters_nr,
int *next_non_dedupable_sectors_nr,
uint8_t **next_call_first_hash);
+int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
+ uint64_t cluster_offset,
+ int count);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 08/20] qcow2: Implement qcow2_compute_cluster_hash.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (6 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 07/20] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 09/20] qcow2: Add qcow2_co_load_dedup_hashes Benoît Canet
` (12 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
Makefile | 3 +++
Makefile.target | 2 +-
block/qcow2-dedup.c | 10 ++++++++--
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/Makefile b/Makefile
index a9c22bf..b4dab79 100644
--- a/Makefile
+++ b/Makefile
@@ -166,6 +166,9 @@ qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) $(qapi-obj-y) \
qapi-visit.o qapi-types.o
qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
+qemu-img$(EXESUF): LIBS+=-lcrypto
+qemu-nbd$(EXESUF): LIBS+=-lcrypto
+qemu-io$(EXESUF): LIBS+=-lcrypto
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
diff --git a/Makefile.target b/Makefile.target
index 3822bc5..f9a988a 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -119,7 +119,7 @@ obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
-LIBS+=-lz
+LIBS+=-lz -lcrypto
QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 50d61f2..c626720 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -25,11 +25,13 @@
* THE SOFTWARE.
*/
+#include <openssl/sha.h>
+#include <openssl/evp.h>
#include "block_int.h"
#include "qemu-common.h"
#include "qcow2.h"
-#define HASH_LENGTH 32
+#define HASH_LENGTH SHA256_DIGEST_LENGTH
/**
* Read some data from the QCOW2 file
@@ -217,7 +219,11 @@ static bool qcow2_dedup_insert_hash_in_rb_tree(BlockDriverState *bs,
static uint8_t *qcow2_compute_cluster_hash(BlockDriverState *bs,
uint8_t *data)
{
- return NULL;
+ BDRVQcowState *s = bs->opaque;
+ uint8_t *hash = g_malloc0(HASH_LENGTH);
+ EVP_Digest(data, s->cluster_size,
+ hash, NULL, EVP_sha256(), NULL);
+ return hash;
}
/* Try to find the offset of a given cluster if it's duplicated
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 09/20] qcow2: Add qcow2_co_load_dedup_hashes.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (7 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 08/20] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 10/20] qcow2: Add qcow2_dedup_grow_table Benoît Canet
` (11 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
This coroutine load all the hashes in the red black tree at startup.
---
block/qcow2-dedup.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.h | 1 +
2 files changed, 45 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index c626720..a2f6657 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -634,3 +634,47 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
fail:
return ret;
}
+
+/*
+ * This coroutine load the deduplication hashes in the red black tree
+ *
+ * @data: the given BlockDriverState
+ * @ret: NULL
+ */
+void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque)
+{
+ BlockDriverState *bs = opaque;
+ BDRVQcowState *s = bs->opaque;
+ int ret;
+ uint8_t *hash = NULL;
+ uint8_t null_hash[HASH_LENGTH];
+ uint64_t max_cluster_offset, i;
+ int nb_hash_in_dedup_cluster = s->cluster_size / HASH_LENGTH;
+
+ /* prepare the null hash */
+ memset(null_hash, 0, HASH_LENGTH);
+
+ max_cluster_offset = s->dedup_table_size * nb_hash_in_dedup_cluster;
+
+ for (i = 0; i < max_cluster_offset; i++) {
+ /* get the hash */
+ qemu_co_mutex_lock(&s->lock);
+ ret = qcow2_dedup_read_write_hash(bs, &hash,
+ i * s->cluster_sectors,
+ false);
+ qemu_co_mutex_unlock(&s->lock);
+ if (ret < 0) {
+ error_report("Failed to load deduplication hash.");
+ }
+
+ /* if the hash is not null load it into red black tree */
+ if (memcmp(hash, null_hash, HASH_LENGTH)) {
+ qemu_co_mutex_lock(&s->lock);
+ qcow2_dedup_insert_hash_in_rb_tree(bs, hash,
+ i * s->cluster_sectors);
+ qemu_co_mutex_unlock(&s->lock);
+ } else {
+ free(hash);
+ }
+ }
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 58aee77..5bab6e2 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -375,5 +375,6 @@ int qcow2_dedup(BlockDriverState *bs,
int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
uint64_t cluster_offset,
int count);
+void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 10/20] qcow2: Add qcow2_dedup_grow_table.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (8 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 09/20] qcow2: Add qcow2_co_load_dedup_hashes Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 11/20] qcow2: Load and save deduplication table header extension Benoît Canet
` (10 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-cluster.c | 95 +++++++++++++++++++++++++++++++------------------
block/qcow2-dedup.c | 38 ++++++++++++++++++++
block/qcow2.h | 10 ++++++
3 files changed, 108 insertions(+), 35 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e179211..04f7cab 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,44 +29,47 @@
#include "block/qcow2.h"
#include "trace.h"
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
+int qcow2_do_grow_table(BlockDriverState *bs, int min_size, bool exact_size,
+ uint64_t **table, uint64_t *table_offset,
+ int *table_size, qcow2_save_table save_table,
+ const char *table_name)
{
BDRVQcowState *s = bs->opaque;
- int new_l1_size, new_l1_size2, ret, i;
- uint64_t *new_l1_table;
- int64_t new_l1_table_offset;
- uint8_t data[12];
+ int new_size, new_size2, ret, i;
+ uint64_t *new_table;
+ int64_t new_table_offset;
- if (min_size <= s->l1_size)
+ if (min_size <= *table_size)
return 0;
if (exact_size) {
- new_l1_size = min_size;
+ new_size = min_size;
} else {
/* Bump size up to reduce the number of times we have to grow */
- new_l1_size = s->l1_size;
- if (new_l1_size == 0) {
- new_l1_size = 1;
+ new_size = *table_size;
+ if (new_size == 0) {
+ new_size = 1;
}
- while (min_size > new_l1_size) {
- new_l1_size = (new_l1_size * 3 + 1) / 2;
+ while (min_size > new_size) {
+ new_size = (new_size * 3 + 1) / 2;
}
}
#ifdef DEBUG_ALLOC2
- fprintf(stderr, "grow l1_table from %d to %d\n", s->l1_size, new_l1_size);
+ fprintf(stderr, "grow %s_table from %d to %d\n",
+ table_name, *table_size, new_size);
#endif
- new_l1_size2 = sizeof(uint64_t) * new_l1_size;
- new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
- memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
+ new_size2 = sizeof(uint64_t) * new_size;
+ new_table = g_malloc0(align_offset(new_size2, 512));
+ memcpy(new_table, *table, *table_size * sizeof(uint64_t));
/* write new table (align to cluster) */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
- new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
- if (new_l1_table_offset < 0) {
- g_free(new_l1_table);
- return new_l1_table_offset;
+ new_table_offset = qcow2_alloc_clusters(bs, new_size2);
+ if (new_table_offset < 0) {
+ g_free(new_table);
+ return new_table_offset;
}
ret = qcow2_cache_flush(bs, s->refcount_block_cache);
@@ -75,34 +78,56 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
}
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
- for(i = 0; i < s->l1_size; i++)
- new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
- ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset, new_l1_table, new_l1_size2);
+ for(i = 0; i < *table_size; i++)
+ new_table[i] = cpu_to_be64(new_table[i]);
+ ret = bdrv_pwrite_sync(bs->file, new_table_offset, new_table, new_size2);
if (ret < 0)
goto fail;
- for(i = 0; i < s->l1_size; i++)
- new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
+ for(i = 0; i < *table_size; i++)
+ new_table[i] = be64_to_cpu(new_table[i]);
/* set new table */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
- cpu_to_be32w((uint32_t*)data, new_l1_size);
- cpu_to_be64wu((uint64_t*)(data + 4), new_l1_table_offset);
- ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size), data,sizeof(data));
+ ret = save_table(bs, new_table_offset, new_size);
+
if (ret < 0) {
goto fail;
}
- g_free(s->l1_table);
- qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t));
- s->l1_table_offset = new_l1_table_offset;
- s->l1_table = new_l1_table;
- s->l1_size = new_l1_size;
+ g_free(*table);
+ qcow2_free_clusters(bs, *table_offset, *table_size * sizeof(uint64_t));
+ *table_offset = new_table_offset;
+ *table = new_table;
+ *table_size = new_size;
return 0;
fail:
- g_free(new_l1_table);
- qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
+ g_free(new_table);
+ qcow2_free_clusters(bs, new_table_offset, new_size2);
return ret;
}
+static int qcow2_l1_save_table(BlockDriverState *bs,
+ int64_t table_offset, int size)
+{
+ uint8_t data[12];
+ cpu_to_be32w((uint32_t*)data, size);
+ cpu_to_be64wu((uint64_t*)(data + 4), table_offset);
+ return bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+ data,sizeof(data));
+}
+
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
+{
+ BDRVQcowState *s = bs->opaque;
+ return qcow2_do_grow_table(bs,
+ min_size,
+ exact_size,
+ &s->l1_table,
+ &s->l1_table_offset,
+ &s->l1_size,
+ qcow2_l1_save_table,
+ "l1");
+}
+
/*
* l2_load
*
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index a2f6657..5f648b5 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -678,3 +678,41 @@ void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque)
}
}
}
+
+/*
+ * Save the dedup table information into the header extensions
+ *
+ * @table_offset: the dedup table offset in the QCOW2 file
+ * @size: the size of the dedup table
+ * @ret: 0 on success, -errno on error
+ */
+static int qcow2_dedup_save_table_info(BlockDriverState *bs,
+ int64_t table_offset, int size)
+{
+ BDRVQcowState *s = bs->opaque;
+ s->dedup_table_offset = table_offset;
+ s->dedup_table_size = size;
+ return qcow2_update_header(bs);
+}
+
+/*
+ * Grow the deduplication table
+ *
+ * @min_size: minimal size
+ * @exact_size: if true force to grow to the exact size
+ * @ret: 0 on success, -errno on error
+ */
+int qcow2_dedup_grow_table(BlockDriverState *bs,
+ int min_size,
+ bool exact_size)
+{
+ BDRVQcowState *s = bs->opaque;
+ return qcow2_do_grow_table(bs,
+ min_size,
+ exact_size,
+ &s->dedup_table,
+ &s->dedup_table_offset,
+ &s->dedup_table_size,
+ qcow2_dedup_save_table_info,
+ "dedup");
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 5bab6e2..6c6e666 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -311,6 +311,12 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
/* qcow2-cluster.c functions */
+typedef int (*qcow2_save_table)(BlockDriverState *bs,
+ int64_t table_offset, int size);
+int qcow2_do_grow_table(BlockDriverState *bs, int min_size, bool exact_size,
+ uint64_t **table, uint64_t *table_offset,
+ int *table_size, qcow2_save_table save_table,
+ const char *table_name);
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
void qcow2_l2_cache_reset(BlockDriverState *bs);
int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
@@ -376,5 +382,9 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
uint64_t cluster_offset,
int count);
void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque);
+void qcow2_dedup_start_loading_hashes(BlockDriverState *bs);
+int qcow2_dedup_grow_table(BlockDriverState *bs,
+ int min_size,
+ bool exact_size);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 11/20] qcow2: Load and save deduplication table header extension.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (9 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 10/20] qcow2: Add qcow2_dedup_grow_table Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 12/20] qcow2: Extract qcow2_do_table_init Benoît Canet
` (9 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/block/qcow2.c b/block/qcow2.c
index c1ff31f..18cb85d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -53,9 +53,15 @@ typedef struct {
uint32_t len;
} QCowExtension;
+typedef struct {
+ uint64_t offset;
+ int32_t size;
+} QCowDedupTableExtension;
+
#define QCOW2_EXT_MAGIC_END 0
#define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
#define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
+#define QCOW2_EXT_MAGIC_DEDUP_TABLE 0xCD8E819B
static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
{
@@ -84,6 +90,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
QCowExtension ext;
uint64_t offset;
int ret;
+ QCowDedupTableExtension dedup_table_extension;
#ifdef DEBUG_EXT
printf("qcow2_read_extensions: start=%ld end=%ld\n", start_offset, end_offset);
@@ -148,6 +155,18 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
}
break;
+ case QCOW2_EXT_MAGIC_DEDUP_TABLE:
+ ret = bdrv_pread(bs->file, offset,
+ &dedup_table_extension, ext.len);
+ if (ret < 0) {
+ return ret;
+ }
+ s->dedup_table_offset =
+ be64_to_cpu(dedup_table_extension.offset);
+ s->dedup_table_size =
+ be32_to_cpu(dedup_table_extension.size);
+ break;
+
default:
/* unknown magic - save it in case we need to rewrite the header */
{
@@ -964,6 +983,7 @@ int qcow2_update_header(BlockDriverState *bs)
uint32_t refcount_table_clusters;
size_t header_length;
Qcow2UnknownHeaderExtension *uext;
+ QCowDedupTableExtension dedup_table_extension;
buf = qemu_blockalign(bs, buflen);
@@ -1067,6 +1087,21 @@ int qcow2_update_header(BlockDriverState *bs)
buf += ret;
buflen -= ret;
+ if (s->has_dedup) {
+ dedup_table_extension.offset = cpu_to_be64(s->dedup_table_offset);
+ dedup_table_extension.size = cpu_to_be32(s->dedup_table_size);
+ ret = header_ext_add(buf,
+ QCOW2_EXT_MAGIC_DEDUP_TABLE,
+ &dedup_table_extension,
+ sizeof(dedup_table_extension),
+ buflen);
+ if (ret < 0) {
+ goto fail;
+ }
+ buf += ret;
+ buflen -= ret;
+ }
+
/* Keep unknown header extensions */
QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 12/20] qcow2: Extract qcow2_do_table_init.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (10 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 11/20] qcow2: Load and save deduplication table header extension Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 13/20] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
` (8 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-refcount.c | 41 ++++++++++++++++++++++++++++-------------
block/qcow2.h | 5 +++++
2 files changed, 33 insertions(+), 13 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 692f3fb..3fa2d44 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -32,27 +32,42 @@ static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
/*********************************************************/
/* refcount handling */
-int qcow2_refcount_init(BlockDriverState *bs)
+int qcow2_do_table_init(BlockDriverState *bs,
+ uint64_t **table,
+ int64_t offset,
+ int size,
+ bool is_refcount)
{
- BDRVQcowState *s = bs->opaque;
- int ret, refcount_table_size2, i;
-
- refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
- s->refcount_table = g_malloc(refcount_table_size2);
- if (s->refcount_table_size > 0) {
- BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
- ret = bdrv_pread(bs->file, s->refcount_table_offset,
- s->refcount_table, refcount_table_size2);
- if (ret != refcount_table_size2)
+ int ret, size2, i;
+
+ size2 = size * sizeof(uint64_t);
+ *table = g_malloc(size2);
+ if (size > 0) {
+ if (is_refcount) {
+ BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_LOAD);
+ }
+ ret = bdrv_pread(bs->file, offset,
+ *table, size2);
+ if (ret != size2)
goto fail;
- for(i = 0; i < s->refcount_table_size; i++)
- be64_to_cpus(&s->refcount_table[i]);
+ for(i = 0; i < size; i++)
+ be64_to_cpus(&(*table)[i]);
}
return 0;
fail:
return -ENOMEM;
}
+int qcow2_refcount_init(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ return qcow2_do_table_init(bs,
+ &s->refcount_table,
+ s->refcount_table_offset,
+ s->refcount_table_size,
+ true);
+}
+
void qcow2_refcount_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
diff --git a/block/qcow2.h b/block/qcow2.h
index 6c6e666..1bc6668 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -290,6 +290,11 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
int qcow2_update_header(BlockDriverState *bs);
/* qcow2-refcount.c functions */
+int qcow2_do_table_init(BlockDriverState *bs,
+ uint64_t **table,
+ int64_t offset,
+ int size,
+ bool is_refcount);
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int QEMU_WARN_UNUSED_RESULT qcow2_update_refcount(BlockDriverState *bs,
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 13/20] qcow2: Add qcow2_dedup_init and qcow2_dedup_close.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (11 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 12/20] qcow2: Extract qcow2_do_table_init Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 14/20] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
` (7 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 16 ++++++++++++++++
block/qcow2.h | 2 ++
2 files changed, 18 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 5f648b5..72fab64 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -716,3 +716,19 @@ int qcow2_dedup_grow_table(BlockDriverState *bs,
qcow2_dedup_save_table_info,
"dedup");
}
+
+int qcow2_dedup_init(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ return qcow2_do_table_init(bs,
+ &s->dedup_table,
+ s->dedup_table_offset,
+ s->dedup_table_size,
+ false);
+}
+
+void qcow2_dedup_close(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+ g_free(s->dedup_table);
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 1bc6668..ba900e0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -391,5 +391,7 @@ void qcow2_dedup_start_loading_hashes(BlockDriverState *bs);
int qcow2_dedup_grow_table(BlockDriverState *bs,
int min_size,
bool exact_size);
+int qcow2_dedup_init(BlockDriverState *bs);
+void qcow2_dedup_close(BlockDriverState *bs);
#endif
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 14/20] qcow2: Extract qcow2_add_feature and qcow2_remove_feature.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (12 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 13/20] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 15/20] block: Add dedup image create option Benoît Canet
` (6 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
These functions will be use to mark that deduplication is activatedi
on an image.
---
block/qcow2.c | 40 +++++++++++++++++++++++++++-------------
block/qcow2.h | 4 ++--
2 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 18cb85d..b11b6a7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -235,13 +235,14 @@ static void report_unsupported_feature(BlockDriverState *bs,
}
/*
- * Sets the dirty bit and flushes afterwards if necessary.
+ * Sets the an incompatible feature bit and flushes afterwards if necessary.
*
* The incompatible_features bit is only set if the image file header was
* updated successfully. Therefore it is not required to check the return
* value of this function.
*/
-static int qcow2_mark_dirty(BlockDriverState *bs)
+static int qcow2_add_feature(BlockDriverState *bs,
+ QCow2IncompatibleFeature feature)
{
BDRVQcowState *s = bs->opaque;
uint64_t val;
@@ -249,11 +250,11 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
assert(s->qcow_version >= 3);
- if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
- return 0; /* already dirty */
+ if (s->incompatible_features & feature) {
+ return 0; /* already added */
}
- val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
+ val = cpu_to_be64(s->incompatible_features | feature);
ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features),
&val, sizeof(val));
if (ret < 0) {
@@ -264,32 +265,45 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
return ret;
}
- /* Only treat image as dirty if the header was updated successfully */
- s->incompatible_features |= QCOW2_INCOMPAT_DIRTY;
+ /* Only treat image as having the feature if the header was updated
+ * successfully
+ */
+ s->incompatible_features |= feature;
return 0;
}
+static int qcow2_mark_dirty(BlockDriverState *bs)
+{
+ return qcow2_add_feature(bs, QCOW2_INCOMPAT_DIRTY);
+}
+
/*
- * Clears the dirty bit and flushes before if necessary. Only call this
- * function when there are no pending requests, it does not guard against
- * concurrent requests dirtying the image.
+ * Clears an incompatible feature bit and flushes before if necessary.
+ * Only call this function when there are no pending requests, it does not
+ * guard against concurrent requests adding a feature to the image.
*/
-static int qcow2_mark_clean(BlockDriverState *bs)
+static int qcow2_remove_feature(BlockDriverState *bs,
+ QCow2IncompatibleFeature feature)
{
BDRVQcowState *s = bs->opaque;
- if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
+ if (s->incompatible_features & feature) {
int ret = bdrv_flush(bs);
if (ret < 0) {
return ret;
}
- s->incompatible_features &= ~QCOW2_INCOMPAT_DIRTY;
+ s->incompatible_features &= ~feature;
return qcow2_update_header(bs);
}
return 0;
}
+static int qcow2_mark_clean(BlockDriverState *bs)
+{
+ return qcow2_remove_feature(bs, QCOW2_INCOMPAT_DIRTY);
+}
+
static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
{
diff --git a/block/qcow2.h b/block/qcow2.h
index ba900e0..5ecea94 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -125,14 +125,14 @@ enum {
};
/* Incompatible feature bits */
-enum {
+typedef enum {
QCOW2_INCOMPAT_DIRTY_BITNR = 0,
QCOW2_INCOMPAT_DIRTY = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
QCOW2_INCOMPAT_DEDUP_BITNR = 1,
QCOW2_INCOMPAT_DEDUP = 1 << QCOW2_INCOMPAT_DEDUP_BITNR,
QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_DEDUP,
-};
+} QCow2IncompatibleFeature;
/* Compatible feature bits */
enum {
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 15/20] block: Add dedup image create option.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (13 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 14/20] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 16/20] qcow2: Allow creation of images using deduplication Benoît Canet
` (5 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block_int.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/block_int.h b/block_int.h
index f4bae04..6419513 100644
--- a/block_int.h
+++ b/block_int.h
@@ -55,6 +55,7 @@
#define BLOCK_OPT_SUBFMT "subformat"
#define BLOCK_OPT_COMPAT_LEVEL "compat"
#define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+#define BLOCK_OPT_DEDUP "dedup"
typedef struct BdrvTrackedRequest BdrvTrackedRequest;
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 16/20] qcow2: Allow creation of images using deduplication.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (14 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 15/20] block: Add dedup image create option Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 17/20] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
` (4 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
todo: Change qemu-img output so it reflect the dedup cluster size.
---
block/qcow2.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
block/qcow2.h | 2 ++
2 files changed, 79 insertions(+), 6 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index b11b6a7..c6879ea 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -277,6 +277,11 @@ static int qcow2_mark_dirty(BlockDriverState *bs)
return qcow2_add_feature(bs, QCOW2_INCOMPAT_DIRTY);
}
+static int qcow2_activate_dedup(BlockDriverState *bs)
+{
+ return qcow2_add_feature(bs, QCOW2_INCOMPAT_DEDUP);
+}
+
/*
* Clears an incompatible feature bit and flushes before if necessary.
* Only call this function when there are no pending requests, it does not
@@ -911,6 +916,11 @@ static void qcow2_close(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
+ if (s->has_dedup) {
+ qcow2_cache_flush(bs, s->dedup_cluster_cache);
+ qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+ }
+
qcow2_cache_flush(bs, s->l2_table_cache);
qcow2_cache_flush(bs, s->refcount_block_cache);
@@ -1229,7 +1239,8 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, int prealloc,
- QEMUOptionParameter *options, int version)
+ QEMUOptionParameter *options, int version,
+ bool dedup)
{
/* Calculate cluster_bits */
int cluster_bits;
@@ -1256,6 +1267,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
* size for any qcow2 image.
*/
BlockDriverState* bs;
+ BDRVQcowState *s;
QCowHeader header;
uint8_t* refcount_table;
int ret;
@@ -1338,6 +1350,26 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
+ if (dedup) {
+ s = bs->opaque;
+ s->has_dedup = true;
+ s->dedup_table_offset = qcow2_alloc_clusters(bs, cluster_size);
+ s->dedup_table_size = cluster_size / sizeof(uint64_t);
+
+ ret = qcow2_activate_dedup(bs);
+ if (ret < 0) {
+ goto out;
+ }
+
+ ret = qcow2_update_header(bs);
+ if (ret < 0) {
+ goto out;
+ }
+
+ /* minimal init */
+ s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+ }
+
/* Want a backing file? There you go.*/
if (backing_file) {
ret = bdrv_change_backing_file(bs, backing_file, backing_format);
@@ -1363,15 +1395,30 @@ out:
return ret;
}
+static int qcow2_warn_if_version_3_is_needed(int version,
+ bool has_feature,
+ const char *feature)
+{
+ if (version < 3 && has_feature) {
+ fprintf(stderr, "%s only supported with compatibility "
+ "level 1.1 and above (use compat=1.1 or greater)\n",
+ feature);
+ return -EINVAL;
+ }
+ return 0;
+}
+
static int qcow2_create(const char *filename, QEMUOptionParameter *options)
{
const char *backing_file = NULL;
const char *backing_fmt = NULL;
uint64_t sectors = 0;
int flags = 0;
+ int ret;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
int prealloc = 0;
int version = 2;
+ bool dedup = false;
/* Read out options */
while (options && options->name) {
@@ -1409,24 +1456,43 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
}
} else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) {
flags |= options->value.n ? BLOCK_FLAG_LAZY_REFCOUNTS : 0;
+ } else if (!strcmp(options->name, BLOCK_OPT_DEDUP)) {
+ dedup = options->value.n ? true : false;
}
options++;
}
+ if (dedup && cluster_size != DEFAULT_CLUSTER_SIZE) {
+ fprintf(stderr, "Deduplication cluster size must be 4096\n");
+ return -EINVAL;
+ }
+
+ if (dedup) {
+ cluster_size = 4096;
+ }
+
if (backing_file && prealloc) {
fprintf(stderr, "Backing file and preallocation cannot be used at "
"the same time\n");
return -EINVAL;
}
- if (version < 3 && (flags & BLOCK_FLAG_LAZY_REFCOUNTS)) {
- fprintf(stderr, "Lazy refcounts only supported with compatibility "
- "level 1.1 and above (use compat=1.1 or greater)\n");
- return -EINVAL;
+ ret = qcow2_warn_if_version_3_is_needed(version,
+ flags & BLOCK_FLAG_LAZY_REFCOUNTS,
+ "Lazy refcounts");
+ if (ret < 0) {
+ return ret;
+ }
+ ret = qcow2_warn_if_version_3_is_needed(version,
+ dedup,
+ "Deduplication");
+ if (ret < 0) {
+ return ret;
}
return qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
- cluster_size, prealloc, options, version);
+ cluster_size, prealloc, options, version,
+ dedup);
}
static int qcow2_make_empty(BlockDriverState *bs)
@@ -1729,6 +1795,11 @@ static QEMUOptionParameter qcow2_create_options[] = {
.type = OPT_FLAG,
.help = "Postpone refcount updates",
},
+ {
+ .name = BLOCK_OPT_DEDUP,
+ .type = OPT_FLAG,
+ .help = "Live deduplication",
+ },
{ NULL }
};
diff --git a/block/qcow2.h b/block/qcow2.h
index 5ecea94..0b999fb 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -57,6 +57,8 @@
/* Must be at least 4 to cover all cases of refcount table growth */
#define REFCOUNT_CACHE_SIZE 4
+#define DEDUP_CACHE_SIZE 4
+
#define DEFAULT_CLUSTER_SIZE 65536
/* Red Black Tree deduplication node */
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 17/20] qcow2: Integrate deduplication in qcow2_co_writev loop.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (15 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 16/20] qcow2: Allow creation of images using deduplication Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 18/20] qcow2: Add method to destroy the deduplication red black tree Benoît Canet
` (3 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 78 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index c6879ea..e0c1a68 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -330,6 +330,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
QCowHeader header;
uint64_t ext_end;
+ s->has_dedup = false;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
goto fail;
@@ -812,6 +813,12 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
QEMUIOVector hd_qiov;
uint64_t bytes_done = 0;
uint8_t *cluster_data = NULL;
+ uint8_t *dedup_cluster_data = NULL;
+ uint8_t *next_call_first_hash;
+ int dedup_cluster_data_nr;
+ int deduped_sectors_nr;
+ int skip_before_dedup_clusters_nr;
+ int next_non_dedupable_sectors_nr;
QCowL2Meta l2meta = {
.nb_clusters = 0,
};
@@ -827,11 +834,66 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_lock(&s->lock);
+ if (s->has_dedup) {
+ /* if deduplication is on we make sure dedup_cluster_data
+ * contains a multiple of cluster size of data in order
+ * to compute the hashes
+ */
+ ret = qcow2_dedup_read_missing_and_concatenate(bs,
+ qiov,
+ sector_num,
+ remaining_sectors,
+ &dedup_cluster_data,
+ &dedup_cluster_data_nr);
+
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ next_call_first_hash = NULL;
+ next_non_dedupable_sectors_nr = 0;
+ skip_before_dedup_clusters_nr = 0;
while (remaining_sectors != 0) {
trace_qcow2_writev_start_part(qemu_coroutine_self());
+
+ if (s->has_dedup && next_non_dedupable_sectors_nr == 0) {
+ /* Try to deduplicate as much clusters as possible */
+ deduped_sectors_nr = qcow2_dedup(bs,
+ sector_num,
+ dedup_cluster_data,
+ dedup_cluster_data_nr,
+ &skip_before_dedup_clusters_nr,
+ &next_non_dedupable_sectors_nr,
+ &next_call_first_hash);
+
+ remaining_sectors -= deduped_sectors_nr;
+ sector_num += deduped_sectors_nr;
+ bytes_done += deduped_sectors_nr * 512;
+
+ /* no more data to write -> exit
+ * Can be < 0 because of the presence of sectors we read in
+ * qcow2_read_missing_dedup_sectors_and_concatenate.
+ */
+ if (next_non_dedupable_sectors_nr <= 0) {
+ goto fail;
+ }
+
+ /* if we deduped something trace it */
+ if (deduped_sectors_nr) {
+ trace_qcow2_writev_done_part(qemu_coroutine_self(),
+ deduped_sectors_nr);
+ trace_qcow2_writev_start_part(qemu_coroutine_self());
+ }
+ }
+
index_in_cluster = sector_num & (s->cluster_sectors - 1);
- n_end = index_in_cluster + remaining_sectors;
+ n_end = s->has_dedup &&
+ next_non_dedupable_sectors_nr < remaining_sectors ?
+ index_in_cluster + next_non_dedupable_sectors_nr :
+ index_in_cluster + remaining_sectors;
+
if (s->crypt_method &&
n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) {
n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors;
@@ -873,6 +935,19 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
cur_nr_sectors * 512);
}
+ /* Write the non duplicated clusters hashes to disk
+ * just before writing non duplicated data.
+ */
+ if (s->has_dedup) {
+ ret = qcow2_dedup_write_new_hashes(bs,
+ (cluster_offset >> 9),
+ cur_nr_sectors /
+ s->cluster_sectors);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
qemu_co_mutex_unlock(&s->lock);
trace_qcow2_writev_data(qemu_coroutine_self(),
@@ -892,6 +967,7 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
run_dependent_requests(s, &l2meta);
+ next_non_dedupable_sectors_nr -= cur_nr_sectors;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
@@ -906,6 +982,7 @@ fail:
qemu_iovec_destroy(&hd_qiov);
qemu_vfree(cluster_data);
+ qemu_vfree(dedup_cluster_data);
trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
return ret;
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 18/20] qcow2: Add method to destroy the deduplication red black tree.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (16 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 17/20] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 19/20] qcow2: init and cleanup deduplication Benoît Canet
` (2 subsequent siblings)
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 16 ++++++++++++++++
block/qcow2.h | 1 +
2 files changed, 17 insertions(+)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 72fab64..60e93fe 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -680,6 +680,22 @@ void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque)
}
/*
+ * This function walk in the red black tree remove the elements and free them.
+ */
+void qcow2_dedup_destroy_rb_tree(BlockDriverState *bs)
+{
+ BDRVQcowState *s = bs->opaque;
+
+ while (!RB_EMPTY_ROOT(&s->dedup_rb_tree)) {
+ QCowHashNode *data =
+ rb_entry(s->dedup_rb_tree.rb_node, QCowHashNode, node);
+ rb_erase(&data->node, &s->dedup_rb_tree);
+ g_free(data->hash);
+ g_free(data);
+ }
+}
+
+/*
* Save the dedup table information into the header extensions
*
* @table_offset: the dedup table offset in the QCOW2 file
diff --git a/block/qcow2.h b/block/qcow2.h
index 0b999fb..1d629a7 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -390,6 +390,7 @@ int qcow2_dedup_write_new_hashes(BlockDriverState *bs,
int count);
void coroutine_fn qcow2_co_load_dedup_hashes(void *opaque);
void qcow2_dedup_start_loading_hashes(BlockDriverState *bs);
+void qcow2_dedup_destroy_rb_tree(BlockDriverState *bs);
int qcow2_dedup_grow_table(BlockDriverState *bs,
int min_size,
bool exact_size);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 19/20] qcow2: init and cleanup deduplication.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (17 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 18/20] qcow2: Add method to destroy the deduplication red black tree Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 20/20] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
2012-10-17 17:09 ` [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Avi Kivity
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
block/qcow2-dedup.c | 34 +++++++++++++++++++++++++++++-----
block/qcow2.c | 16 +++++++++++++---
2 files changed, 42 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-dedup.c b/block/qcow2-dedup.c
index 60e93fe..d19bf8c 100644
--- a/block/qcow2-dedup.c
+++ b/block/qcow2-dedup.c
@@ -736,15 +736,39 @@ int qcow2_dedup_grow_table(BlockDriverState *bs,
int qcow2_dedup_init(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
- return qcow2_do_table_init(bs,
- &s->dedup_table,
- s->dedup_table_offset,
- s->dedup_table_size,
- false);
+ Coroutine *co;
+ int ret;
+
+ s->has_dedup = true;
+ s->dedup_rb_tree = RB_ROOT;
+ QTAILQ_INIT(&s->undedupable_hashes);
+ s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+
+ ret = qcow2_do_table_init(bs,
+ &s->dedup_table,
+ s->dedup_table_offset,
+ s->dedup_table_size,
+ false);
+
+ if (ret < 0) {
+ goto fail;
+ }
+
+ /* load asynchronously the hashes */
+ co = qemu_coroutine_create(qcow2_co_load_dedup_hashes);
+ qemu_coroutine_enter(co, bs);
+ return 0;
+
+fail:
+ qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+ return ret;
}
void qcow2_dedup_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
+ qcow2_cache_flush(bs, s->dedup_cluster_cache);
+ qcow2_cache_destroy(bs, s->dedup_cluster_cache);
g_free(s->dedup_table);
+ qcow2_dedup_destroy_rb_tree(bs);
}
diff --git a/block/qcow2.c b/block/qcow2.c
index e0c1a68..4102cf8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -534,6 +534,13 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
}
+ if (s->incompatible_features & QCOW2_INCOMPAT_DEDUP) {
+ ret = qcow2_dedup_init(bs);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
#ifdef DEBUG_ALLOC
{
BdrvCheckResult result = {0};
@@ -991,11 +998,11 @@ fail:
static void qcow2_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
+
g_free(s->l1_table);
if (s->has_dedup) {
- qcow2_cache_flush(bs, s->dedup_cluster_cache);
- qcow2_cache_destroy(bs, s->dedup_cluster_cache);
+ qcow2_dedup_close(bs);
}
qcow2_cache_flush(bs, s->l2_table_cache);
@@ -1444,7 +1451,10 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* minimal init */
- s->dedup_cluster_cache = qcow2_cache_create(bs, DEDUP_CACHE_SIZE);
+ ret = qcow2_dedup_init(bs);
+ if (ret < 0) {
+ goto out;
+ }
}
/* Want a backing file? There you go.*/
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [Qemu-devel] [RFC V2 20/20] qemu-iotests: Filter dedup=on/off so existing tests don't break.
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (18 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 19/20] qcow2: init and cleanup deduplication Benoît Canet
@ 2012-10-17 16:00 ` Benoît Canet
2012-10-17 17:09 ` [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Avi Kivity
20 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-17 16:00 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Benoît Canet, stefanha
---
tests/qemu-iotests/common.rc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d534e94..411f135 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -114,7 +114,8 @@ _make_test_img()
-e "s# compat='[^']*'##g" \
-e "s# compat6=\\(on\\|off\\)##g" \
-e "s# static=\\(on\\|off\\)##g" \
- -e "s# lazy_refcounts=\\(on\\|off\\)##g"
+ -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+ -e "s# dedup=\\(on\\|off\\)##g"
}
_cleanup_test_img()
--
1.7.10.4
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification.
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification Benoît Canet
@ 2012-10-17 16:28 ` Eric Blake
2012-10-18 9:06 ` Benoît Canet
0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2012-10-17 16:28 UTC (permalink / raw)
To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha
[-- Attachment #1: Type: text/plain, Size: 823 bytes --]
On 10/17/2012 10:00 AM, Benoît Canet wrote:
> ---
> docs/specs/qcow2.txt | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 36a559d..013a16b 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
>
> +== Deduplication ==
> +
> +The deduplication extension contains the offset and size of the deduplication
> +table.
> +
> + Byte 0 - 7: Offset
> +
> + 8 - 11: Size
That's still too sparse for a formal documentation - what is the format
of the deduplication table, and what do the bits in that table imply
with regards to how the rest of the qcow2 file is used?
--
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
` (19 preceding siblings ...)
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 20/20] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
@ 2012-10-17 17:09 ` Avi Kivity
2012-10-18 8:32 ` Benoît Canet
20 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2012-10-17 17:09 UTC (permalink / raw)
To: Benoît Canet; +Cc: kwolf, qemu-devel, stefanha
On 10/17/2012 06:00 PM, Benoît Canet wrote:
> This patchset is a not yet working draft starting to implement deduplication
> in QCOW2.
>
> The Kernel red black trees are indented in kernel style.
> I'll be happy to know what is the policy regarding this kind of inclusion
> and what tools can be used to make the style compliant.
Can you use glib GTree instead?
> I am posting this patchset in order to have an early feedback regarding the
> design.
Since it's still not working, it's hard to tell how effective it is in
practice.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication
2012-10-17 17:09 ` [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Avi Kivity
@ 2012-10-18 8:32 ` Benoît Canet
0 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-18 8:32 UTC (permalink / raw)
To: Avi Kivity; +Cc: kwolf, qemu-devel, stefanha
> Can you use glib GTree instead?
Ok
>
> > I am posting this patchset in order to have an early feedback regarding the
> > design.
>
> Since it's still not working, it's hard to tell how effective it is in
> practice.
I hope to have something working soon.
Best regards
Benoît
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification.
2012-10-17 16:28 ` Eric Blake
@ 2012-10-18 9:06 ` Benoît Canet
0 siblings, 0 replies; 25+ messages in thread
From: Benoît Canet @ 2012-10-18 9:06 UTC (permalink / raw)
To: Eric Blake; +Cc: kwolf, qemu-devel, stefanha
> That's still too sparse for a formal documentation - what is the format
> of the deduplication table, and what do the bits in that table imply
> with regards to how the rest of the qcow2 file is used?
I will add this in the next iteration.
Best regards
Benoît
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2012-10-18 9:06 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-17 16:00 [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 01/20] qcow2: Add deduplication to the qcow2 specification Benoît Canet
2012-10-17 16:28 ` Eric Blake
2012-10-18 9:06 ` Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 02/20] qcow2: Add kernel red black trees Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 03/20] qcow2: Add deduplication structures and fields Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 04/20] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 05/20] qcow2: Rename update_refcount into qcow2_update_refcount Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 06/20] qcow2: Add qcow2_dedup and related functions Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 07/20] qcow2: Add qcow2_dedup_write_new_hashes Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 08/20] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 09/20] qcow2: Add qcow2_co_load_dedup_hashes Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 10/20] qcow2: Add qcow2_dedup_grow_table Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 11/20] qcow2: Load and save deduplication table header extension Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 12/20] qcow2: Extract qcow2_do_table_init Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 13/20] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 14/20] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 15/20] block: Add dedup image create option Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 16/20] qcow2: Allow creation of images using deduplication Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 17/20] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 18/20] qcow2: Add method to destroy the deduplication red black tree Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 19/20] qcow2: init and cleanup deduplication Benoît Canet
2012-10-17 16:00 ` [Qemu-devel] [RFC V2 20/20] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
2012-10-17 17:09 ` [Qemu-devel] [RFC V2 00/20] QCOW2 deduplication Avi Kivity
2012-10-18 8:32 ` Benoît Canet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).