[Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off.

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off.
@ 2008-08-13 14:59 Laurent.Vivier
  2008-08-13 14:59 ` [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset() Laurent.Vivier
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

These patches improve qcow2 performance when used with cache=off.

They modify block-qcow2.c to read/write as many clusters as
possible per bdrv_aio_[read|write]().

These patches have been tested several hours, with normal disk, encrypted disk,
compressed disk and with backing file or not.
Tests were done with mkfs, untar of kernel source, build of kernel 
sources, rm of all files, dbench 1, dbench 16 and dd to fill the disk (500 MB).
fsck is used every time to check filesystem consistency.

Some benchmarks (from [v2]):

* mkfs on 500 MB qcow2 disk

mkfs                            WITHOUT         WITH

ide, cache=off,snapshot=off     41 s            5 s     8x faster
ide, cache=off,snapshot=on      40 s            5 s     8x faster
ide, cache=on, snapshot=off      3 s            3 s
ide, cache=on, snapshot=on       4 s            3 s

* tar jxvf linux-2.6.25.7.tar.bz2

ide, cache=off,snapshot=off     847 s           379 s   2.2x faster
ide, cache=off,snapshot=on      801 s           364 s   2.2x faster
ide, cache=on, snapshot=off     238 s           236 s
ide, cache=on, snapshot=on      236 s           237 s

* dd if=/dev/zero of=file

dd if=/dev/null                 WITHOUT         WITH

ide, cache=off,snapshot=off     333 kB/s        3.7 MB/s  11.3x faster
ide, cache=off,snapshot=on      337 kB/s        3.6 MB/s  10.9x faster
ide, cache=on, snapshot=off    9.06 MB/s        9.23 MB/s
ide, cache=on, snapshot=on     8,89 MB/s        8.89 MB/s

* dbench 1

dbench                          WITHOUT          WITH

ide, cache=off,snapshot=off     20.8494 MB/sec  24.8521 MB/sec          +19,2%
ide, cache=off,snapshot=on      20.9349 MB/sec  24.2296 MB/sec          +15,7%
ide, cache=on, snapshot=off     23.6612 MB/sec  25.4724 MB/sec          + 7,6%
ide, cache=on, snapshot=on      24.1836 MB/sec  24.8169 MB/sec

Changelog: 
v2: follow the advice of Avi kivity, and modify get_cluster_offset()
v3: follow comments from Kevin Wolf, and add some comments, remove
    free_used_clusters().
    follow comments from Anthony Liguori, and make more tests,
    correct a SEGV with encrypted disk (too small buffer),
    and a file corruption on commit.

[PATCH 1/5] extract code from get_cluster_offset() into new functions
            seek_l2_table(), l2_load() and l2_allocate().

[PATCH 2/5] divide get_cluster_offset() into get_cluster_offset() and
            alloc_cluster_offset().

[PATCH 3/5] divide alloc_cluster_offset() into alloc_cluster_offset()
            and alloc_compressed_cluster_offset().

[PATCH 4/5] modify get_cluster_offset(), alloc_cluster_offset() to specify 
            how many clusters we want.

[PATCH 5/5] in alloc_cluster_offset(), try to aggregate free clusters
            and freed clusters.
--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset()
  2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
@ 2008-08-13 14:59 ` Laurent.Vivier
  2008-08-14 14:07   ` [Qemu-devel] " Kevin Wolf
  2008-08-13 14:59 ` [Qemu-devel] [patch 2/5][v3] Divide get_cluster_offset() Laurent.Vivier
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

[-- Attachment #1: qcow2-factor.patch --]
[-- Type: text/plain, Size: 9482 bytes --]

Extract code from get_cluster_offset() into new functions:

- seek_l2_table()

Search an l2 offset in the l2_cache table.

- l2_load()

Read the l2 entry from disk

- l2_allocate()

Allocate a new l2 entry.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
 block-qcow2.c |  216 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 154 insertions(+), 62 deletions(-)

Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c	2008-08-08 11:18:13.000000000 +0200
+++ qemu/block-qcow2.c	2008-08-08 15:12:12.000000000 +0200
@@ -480,101 +480,192 @@ static int grow_l1_table(BlockDriverStat
     return -EIO;
 }
 
-/* 'allocate' is:
+/*
+ * seek_l2_table
  *
- * 0 not to allocate.
+ * seek l2_offset in the l2_cache table
+ * if not found, return NULL,
+ * if found,
+ *   increments the l2 cache count of the entry,
+ *   if counter overflow, divide by two all counters
+ *   return the pointer to the l2 cache entry
  *
- * 1 to allocate a normal cluster (for sector indexes 'n_start' to
- * 'n_end')
+ */
+
+static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset)
+{
+    int i, j;
+
+    for(i = 0; i < L2_CACHE_SIZE; i++) {
+        if (l2_offset == s->l2_cache_offsets[i]) {
+            /* increment the hit count */
+            if (++s->l2_cache_counts[i] == 0xffffffff) {
+                for(j = 0; j < L2_CACHE_SIZE; j++) {
+                    s->l2_cache_counts[j] >>= 1;
+                }
+            }
+            return s->l2_cache + (i << s->l2_bits);
+        }
+    }
+    return NULL;
+}
+
+/*
+ * l2_load
  *
- * 2 to allocate a compressed cluster of size
- * 'compressed_size'. 'compressed_size' must be > 0 and <
- * cluster_size
+ * seek in the l2 cache the l2 table offset
+ * if found return 1
+ * if not found allocate a new entry in the l2 cache
+ *              load the l2 table from the file
+ *              update the l2 cache entry
+ *
+ * on exit, *l2_table is a pointer to the l2_table for the given offset
+ * on error, return 0, otherwise return 1
  *
- * return 0 if not allocated.
  */
+
+static uint64_t *l2_load(BlockDriverState *bs, uint64_t l2_offset)
+{
+    BDRVQcowState *s = bs->opaque;
+    int min_index;
+    uint64_t *l2_table;
+
+    /* seek if the table for the given offset is in the cache */
+
+    l2_table = seek_l2_table(s, l2_offset);
+    if (l2_table != NULL)
+        return l2_table;
+
+    /* not found: load a new entry in the least used one */
+
+    min_index = l2_cache_new_entry(bs);
+    l2_table = s->l2_cache + (min_index << s->l2_bits);
+    if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
+        s->l2_size * sizeof(uint64_t))
+        return NULL;
+    s->l2_cache_offsets[min_index] = l2_offset;
+    s->l2_cache_counts[min_index] = 1;
+
+    return l2_table;
+}
+
+/*
+ *  l2_allocate
+ *
+ *  allocate a new l2 entry in the file
+ *
+ */
+
+static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
+{
+    BDRVQcowState *s = bs->opaque;
+    int min_index;
+    uint64_t old_l2_offset, tmp;
+    uint64_t *l2_table, l2_offset;
+
+    old_l2_offset = s->l1_table[l1_index];
+
+    /* allocate a new l2 entry */
+
+    l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
+
+    /* update the L1 entry */
+
+    s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
+
+    tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED);
+    if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp),
+                    &tmp, sizeof(tmp)) != sizeof(tmp))
+        return NULL;
+
+    /* allocate a new entry in the l2 cache */
+
+    min_index = l2_cache_new_entry(bs);
+    l2_table = s->l2_cache + (min_index << s->l2_bits);
+
+    if (old_l2_offset == 0) {
+        /* if there was no old l2 table, clear the new table */
+        memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
+    } else {
+        /* if there was an old l2 table, read it from the disk */
+        if (bdrv_pread(s->hd, old_l2_offset,
+                       l2_table, s->l2_size * sizeof(uint64_t)) !=
+            s->l2_size * sizeof(uint64_t))
+            return NULL;
+    }
+    /* write the l2 table to the file */
+    if (bdrv_pwrite(s->hd, l2_offset,
+                    l2_table, s->l2_size * sizeof(uint64_t)) !=
+        s->l2_size * sizeof(uint64_t))
+        return NULL;
+
+    /* update the l2 cache entry */
+
+    s->l2_cache_offsets[min_index] = l2_offset;
+    s->l2_cache_counts[min_index] = 1;
+
+    return l2_table;
+}
+
 static uint64_t get_cluster_offset(BlockDriverState *bs,
                                    uint64_t offset, int allocate,
                                    int compressed_size,
                                    int n_start, int n_end)
 {
     BDRVQcowState *s = bs->opaque;
-    int min_index, i, j, l1_index, l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset, tmp, old_l2_offset;
+    int l1_index, l2_index, ret;
+    uint64_t l2_offset, *l2_table, cluster_offset, tmp;
+
+    /* seek the the l2 offset in the l1 table */
 
     l1_index = offset >> (s->l2_bits + s->cluster_bits);
     if (l1_index >= s->l1_size) {
         /* outside l1 table is allowed: we grow the table if needed */
         if (!allocate)
             return 0;
-        if (grow_l1_table(bs, l1_index + 1) < 0)
+        ret = grow_l1_table(bs, l1_index + 1);
+        if (ret < 0)
             return 0;
     }
     l2_offset = s->l1_table[l1_index];
+
+    /* seek the l2 table of the given l2 offset */
+
     if (!l2_offset) {
+        /* the l2 table doesn't exist */
         if (!allocate)
             return 0;
-    l2_allocate:
-        old_l2_offset = l2_offset;
-        /* allocate a new l2 entry */
-        l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
-        /* update the L1 entry */
-        s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
-        tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED);
-        if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp),
-                        &tmp, sizeof(tmp)) != sizeof(tmp))
-            return 0;
-        min_index = l2_cache_new_entry(bs);
-        l2_table = s->l2_cache + (min_index << s->l2_bits);
-
-        if (old_l2_offset == 0) {
-            memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
-        } else {
-            if (bdrv_pread(s->hd, old_l2_offset,
-                           l2_table, s->l2_size * sizeof(uint64_t)) !=
-                s->l2_size * sizeof(uint64_t))
-                return 0;
-        }
-        if (bdrv_pwrite(s->hd, l2_offset,
-                        l2_table, s->l2_size * sizeof(uint64_t)) !=
-            s->l2_size * sizeof(uint64_t))
+        /* allocate a new l2 table for this offset */
+        l2_table = l2_allocate(bs, l1_index);
+        if (l2_table == NULL)
             return 0;
+        l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
     } else {
-        if (!(l2_offset & QCOW_OFLAG_COPIED)) {
-            if (allocate) {
-                free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
-                goto l2_allocate;
-            }
+        /* the l2 table exists */
+        if (!(l2_offset & QCOW_OFLAG_COPIED) && allocate) {
+            /* duplicate the l2 table, and free the old table */
+            free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            l2_table = l2_allocate(bs, l1_index);
+            if (l2_table == NULL)
+                return 0;
+            l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
         } else {
+            /* load the l2 table in memory */
             l2_offset &= ~QCOW_OFLAG_COPIED;
+            l2_table = l2_load(bs, l2_offset);
+            if (l2_table == NULL)
+                return 0;
         }
-        for(i = 0; i < L2_CACHE_SIZE; i++) {
-            if (l2_offset == s->l2_cache_offsets[i]) {
-                /* increment the hit count */
-                if (++s->l2_cache_counts[i] == 0xffffffff) {
-                    for(j = 0; j < L2_CACHE_SIZE; j++) {
-                        s->l2_cache_counts[j] >>= 1;
-                    }
-                }
-                l2_table = s->l2_cache + (i << s->l2_bits);
-                goto found;
-            }
-        }
-        /* not found: load a new entry in the least used one */
-        min_index = l2_cache_new_entry(bs);
-        l2_table = s->l2_cache + (min_index << s->l2_bits);
-        if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
-            s->l2_size * sizeof(uint64_t))
-            return 0;
     }
-    s->l2_cache_offsets[min_index] = l2_offset;
-    s->l2_cache_counts[min_index] = 1;
- found:
+
+    /* find the cluster offset for the given disk offset */
+
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
     if (!cluster_offset) {
+        /* cluster doesn't exist */
         if (!allocate)
-            return cluster_offset;
+            return 0;
     } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) {
         if (!allocate)
             return cluster_offset;
@@ -592,6 +683,7 @@ static uint64_t get_cluster_offset(Block
         cluster_offset &= ~QCOW_OFLAG_COPIED;
         return cluster_offset;
     }
+
     if (allocate == 1) {
         /* allocate a new cluster */
         cluster_offset = alloc_clusters(bs, s->cluster_size);

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [patch 2/5][v3] Divide get_cluster_offset()
  2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
  2008-08-13 14:59 ` [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset() Laurent.Vivier
@ 2008-08-13 14:59 ` Laurent.Vivier
  2008-08-14 14:10   ` [Qemu-devel] " Kevin Wolf
  2008-08-13 14:59 ` [Qemu-devel] [patch 3/5][v3] Extract compressing part from alloc_cluster_offset() Laurent.Vivier
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

[-- Attachment #1: qcow2-divide.patch --]
[-- Type: text/plain, Size: 11416 bytes --]

Divide get_cluster_offset() into get_cluster_offset() and
alloc_cluster_offset().

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
 block-qcow2.c |  208 ++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 132 insertions(+), 76 deletions(-)

Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c	2008-08-11 17:35:58.000000000 +0200
+++ qemu/block-qcow2.c	2008-08-11 17:44:07.000000000 +0200
@@ -607,22 +607,77 @@ static uint64_t *l2_allocate(BlockDriver
     return l2_table;
 }
 
-static uint64_t get_cluster_offset(BlockDriverState *bs,
-                                   uint64_t offset, int allocate,
-                                   int compressed_size,
-                                   int n_start, int n_end)
+/*
+ * get_cluster_offset
+ *
+ * For a given offset of the disk image, return cluster offset in
+ * qcow2 file.
+ *
+ * Return 1, if the offset is found
+ * Return 0, otherwise.
+ *
+ */
+
+static uint64_t get_cluster_offset(BlockDriverState *bs, uint64_t offset)
+{
+    BDRVQcowState *s = bs->opaque;
+    int l1_index, l2_index;
+    uint64_t l2_offset, *l2_table, cluster_offset;
+
+    /* seek the the l2 offset in the l1 table */
+
+    l1_index = offset >> (s->l2_bits + s->cluster_bits);
+    if (l1_index >= s->l1_size)
+        return 0;
+
+    l2_offset = s->l1_table[l1_index];
+
+    /* seek the l2 table of the given l2 offset */
+
+    if (!l2_offset)
+        return 0;
+
+    /* load the l2 table in memory */
+
+    l2_offset &= ~QCOW_OFLAG_COPIED;
+    l2_table = l2_load(bs, l2_offset);
+    if (l2_table == NULL)
+        return 0;
+
+    /* find the cluster offset for the given disk offset */
+
+    l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
+    cluster_offset = be64_to_cpu(l2_table[l2_index]);
+
+    return cluster_offset & ~QCOW_OFLAG_COPIED;
+}
+
+/*
+ * alloc_cluster_offset
+ *
+ * For a given offset of the disk image, return cluster offset in
+ * qcow2 file.
+ *
+ * If the offset is not found, allocate a new cluster.
+ *
+ * Return the cluster offset if successful,
+ * Return 0, otherwise.
+ *
+ */
+
+static uint64_t alloc_cluster_offset(BlockDriverState *bs,
+                                     uint64_t offset,
+                                     int compressed_size,
+                                     int n_start, int n_end)
 {
     BDRVQcowState *s = bs->opaque;
     int l1_index, l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset, tmp;
+    uint64_t l2_offset, *l2_table, cluster_offset;
 
     /* seek the the l2 offset in the l1 table */
 
     l1_index = offset >> (s->l2_bits + s->cluster_bits);
     if (l1_index >= s->l1_size) {
-        /* outside l1 table is allowed: we grow the table if needed */
-        if (!allocate)
-            return 0;
         ret = grow_l1_table(bs, l1_index + 1);
         if (ret < 0)
             return 0;
@@ -631,44 +686,30 @@ static uint64_t get_cluster_offset(Block
 
     /* seek the l2 table of the given l2 offset */
 
-    if (!l2_offset) {
-        /* the l2 table doesn't exist */
-        if (!allocate)
+    if (l2_offset & QCOW_OFLAG_COPIED) {
+        /* load the l2 table in memory */
+        l2_offset &= ~QCOW_OFLAG_COPIED;
+        l2_table = l2_load(bs, l2_offset);
+        if (l2_table == NULL)
             return 0;
-        /* allocate a new l2 table for this offset */
+    } else {
+        if (l2_offset)
+            free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
         l2_table = l2_allocate(bs, l1_index);
         if (l2_table == NULL)
             return 0;
         l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
-    } else {
-        /* the l2 table exists */
-        if (!(l2_offset & QCOW_OFLAG_COPIED) && allocate) {
-            /* duplicate the l2 table, and free the old table */
-            free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
-            l2_table = l2_allocate(bs, l1_index);
-            if (l2_table == NULL)
-                return 0;
-            l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
-        } else {
-            /* load the l2 table in memory */
-            l2_offset &= ~QCOW_OFLAG_COPIED;
-            l2_table = l2_load(bs, l2_offset);
-            if (l2_table == NULL)
-                return 0;
-        }
     }
 
     /* find the cluster offset for the given disk offset */
 
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
-    if (!cluster_offset) {
-        /* cluster doesn't exist */
-        if (!allocate)
-            return 0;
-    } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) {
-        if (!allocate)
-            return cluster_offset;
+
+    if (cluster_offset & QCOW_OFLAG_COPIED)
+        return cluster_offset & ~QCOW_OFLAG_COPIED;
+
+    if (cluster_offset) {
         /* free the cluster */
         if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
             int nb_csectors;
@@ -679,46 +720,62 @@ static uint64_t get_cluster_offset(Block
         } else {
             free_clusters(bs, cluster_offset, s->cluster_size);
         }
-    } else {
-        cluster_offset &= ~QCOW_OFLAG_COPIED;
-        return cluster_offset;
     }
 
-    if (allocate == 1) {
-        /* allocate a new cluster */
-        cluster_offset = alloc_clusters(bs, s->cluster_size);
-
-        /* we must initialize the cluster content which won't be
-           written */
-        if ((n_end - n_start) < s->cluster_sectors) {
-            uint64_t start_sect;
-
-            start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
-            ret = copy_sectors(bs, start_sect,
-                               cluster_offset, 0, n_start);
-            if (ret < 0)
-                return 0;
-            ret = copy_sectors(bs, start_sect,
-                               cluster_offset, n_end, s->cluster_sectors);
-            if (ret < 0)
-                return 0;
-        }
-        tmp = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED);
-    } else {
+    if (compressed_size) {
         int nb_csectors;
+
         cluster_offset = alloc_bytes(bs, compressed_size);
         nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
-            (cluster_offset >> 9);
+                      (cluster_offset >> 9);
+
         cluster_offset |= QCOW_OFLAG_COMPRESSED |
-            ((uint64_t)nb_csectors << s->csize_shift);
+                          ((uint64_t)nb_csectors << s->csize_shift);
+
+        /* update L2 table */
+
         /* compressed clusters never have the copied flag */
-        tmp = cpu_to_be64(cluster_offset);
+
+        l2_table[l2_index] = cpu_to_be64(cluster_offset);
+        if (bdrv_pwrite(s->hd,
+                        l2_offset + l2_index * sizeof(uint64_t),
+                        l2_table + l2_index,
+                        sizeof(uint64_t)) != sizeof(uint64_t))
+            return 0;
+
+        return cluster_offset;
     }
+
+    /* allocate a new cluster */
+
+    cluster_offset = alloc_clusters(bs, s->cluster_size);
+
+    /* we must initialize the cluster content which won't be
+       written */
+
+    if ((n_end - n_start) < s->cluster_sectors) {
+        uint64_t start_sect;
+
+        start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
+        ret = copy_sectors(bs, start_sect,
+                           cluster_offset, 0, n_start);
+        if (ret < 0)
+            return 0;
+        ret = copy_sectors(bs, start_sect,
+                           cluster_offset, n_end, s->cluster_sectors);
+        if (ret < 0)
+            return 0;
+    }
+
     /* update L2 table */
-    l2_table[l2_index] = tmp;
+
+    l2_table[l2_index] = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED);
     if (bdrv_pwrite(s->hd,
-                    l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)) != sizeof(tmp))
+                    l2_offset + l2_index * sizeof(uint64_t),
+                    l2_table + l2_index,
+                    sizeof(uint64_t)) != sizeof(uint64_t))
         return 0;
+
     return cluster_offset;
 }
 
@@ -729,7 +786,7 @@ static int qcow_is_allocated(BlockDriver
     int index_in_cluster, n;
     uint64_t cluster_offset;
 
-    cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0);
+    cluster_offset = get_cluster_offset(bs, sector_num << 9);
     index_in_cluster = sector_num & (s->cluster_sectors - 1);
     n = s->cluster_sectors - index_in_cluster;
     if (n > nb_sectors)
@@ -811,7 +868,7 @@ static int qcow_read(BlockDriverState *b
     uint64_t cluster_offset;
 
     while (nb_sectors > 0) {
-        cluster_offset = get_cluster_offset(bs, sector_num << 9, 0, 0, 0, 0);
+        cluster_offset = get_cluster_offset(bs, sector_num << 9);
         index_in_cluster = sector_num & (s->cluster_sectors - 1);
         n = s->cluster_sectors - index_in_cluster;
         if (n > nb_sectors)
@@ -860,9 +917,9 @@ static int qcow_write(BlockDriverState *
         n = s->cluster_sectors - index_in_cluster;
         if (n > nb_sectors)
             n = nb_sectors;
-        cluster_offset = get_cluster_offset(bs, sector_num << 9, 1, 0,
-                                            index_in_cluster,
-                                            index_in_cluster + n);
+        cluster_offset = alloc_cluster_offset(bs, sector_num << 9, 0,
+                                              index_in_cluster,
+                                              index_in_cluster + n);
         if (!cluster_offset)
             return -1;
         if (s->crypt_method) {
@@ -935,8 +992,7 @@ static void qcow_aio_read_cb(void *opaqu
     }
 
     /* prepare next AIO request */
-    acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9,
-                                             0, 0, 0, 0);
+    acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9);
     index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
     acb->n = s->cluster_sectors - index_in_cluster;
     if (acb->n > acb->nb_sectors)
@@ -1045,9 +1101,9 @@ static void qcow_aio_write_cb(void *opaq
     acb->n = s->cluster_sectors - index_in_cluster;
     if (acb->n > acb->nb_sectors)
         acb->n = acb->nb_sectors;
-    cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, 1, 0,
-                                        index_in_cluster,
-                                        index_in_cluster + acb->n);
+    cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9, 0,
+                                          index_in_cluster,
+                                          index_in_cluster + acb->n);
     if (!cluster_offset || (cluster_offset & 511) != 0) {
         ret = -EIO;
         goto fail;
@@ -1307,8 +1363,8 @@ static int qcow_write_compressed(BlockDr
         /* could not compress: write normal cluster */
         qcow_write(bs, sector_num, buf, s->cluster_sectors);
     } else {
-        cluster_offset = get_cluster_offset(bs, sector_num << 9, 2,
-                                            out_len, 0, 0);
+        cluster_offset = alloc_cluster_offset(bs, sector_num << 9,
+                                              out_len, 0, 0);
         cluster_offset &= s->cluster_offset_mask;
         if (bdrv_pwrite(s->hd, cluster_offset, out_buf, out_len) != out_len) {
             qemu_free(out_buf);

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [patch 3/5][v3] Extract compressing part from alloc_cluster_offset()
  2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
  2008-08-13 14:59 ` [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset() Laurent.Vivier
  2008-08-13 14:59 ` [Qemu-devel] [patch 2/5][v3] Divide get_cluster_offset() Laurent.Vivier
@ 2008-08-13 14:59 ` Laurent.Vivier
  2008-08-14 14:25   ` [Qemu-devel] " Kevin Wolf
  2008-08-13 14:59 ` [Qemu-devel] [patch 4/5][v3] Aggregate same type clusters Laurent.Vivier
  2008-08-13 14:59 ` [Qemu-devel] [patch 5/5][v3] Try to aggregate free clusters and freed clusters Laurent.Vivier
  4 siblings, 1 reply; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

[-- Attachment #1: qcow2-compressed.patch --]
[-- Type: text/plain, Size: 8347 bytes --]

Divide alloc_cluster_offset() into alloc_cluster_offset() and
alloc_compressed_cluster_offset().
Common parts are moved to free_any_clusters() and get_cluster_table();

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
 block-qcow2.c |  177 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 128 insertions(+), 49 deletions(-)

Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c	2008-08-11 17:47:45.000000000 +0200
+++ qemu/block-qcow2.c	2008-08-11 17:56:38.000000000 +0200
@@ -653,26 +653,53 @@ static uint64_t get_cluster_offset(Block
 }
 
 /*
- * alloc_cluster_offset
+ * free_any_clusters
  *
- * For a given offset of the disk image, return cluster offset in
- * qcow2 file.
+ * free clusters according to its type: compressed or not
  *
- * If the offset is not found, allocate a new cluster.
+ */
+
+static void free_any_clusters(BlockDriverState *bs,
+                              uint64_t cluster_offset)
+{
+    BDRVQcowState *s = bs->opaque;
+
+    if (cluster_offset == 0)
+        return;
+
+    /* free the cluster */
+
+    if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
+        int nb_csectors;
+        nb_csectors = ((cluster_offset >> s->csize_shift) &
+                       s->csize_mask) + 1;
+        free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511,
+                      nb_csectors * 512);
+        return;
+    }
+
+    free_clusters(bs, cluster_offset, s->cluster_size);
+}
+
+/*
+ * get_cluster_table
  *
- * Return the cluster offset if successful,
- * Return 0, otherwise.
+ * for a given disk offset, load (and allocate if needed)
+ * the l2 table.
+ *
+ * the l2 table offset in the qcow2 file and the cluster index
+ * in the l2 table are given to the caller.
  *
  */
 
-static uint64_t alloc_cluster_offset(BlockDriverState *bs,
-                                     uint64_t offset,
-                                     int compressed_size,
-                                     int n_start, int n_end)
+static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
+                             uint64_t **new_l2_table,
+                             uint64_t *new_l2_offset,
+                             int *new_l2_index)
 {
     BDRVQcowState *s = bs->opaque;
     int l1_index, l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset;
+    uint64_t l2_offset, *l2_table;
 
     /* seek the the l2 offset in the l1 table */
 
@@ -704,47 +731,97 @@ static uint64_t alloc_cluster_offset(Blo
     /* find the cluster offset for the given disk offset */
 
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
-    cluster_offset = be64_to_cpu(l2_table[l2_index]);
 
+    *new_l2_table = l2_table;
+    *new_l2_offset = l2_offset;
+    *new_l2_index = l2_index;
+
+    return 1;
+}
+
+/*
+ * alloc_compressed_cluster_offset
+ *
+ * For a given offset of the disk image, return cluster offset in
+ * qcow2 file.
+ *
+ * If the offset is not found, allocate a new compressed cluster.
+ *
+ * Return the cluster offset if successful,
+ * Return 0, otherwise.
+ *
+ */
+
+static uint64_t alloc_compressed_cluster_offset(BlockDriverState *bs,
+                                                uint64_t offset,
+                                                int compressed_size)
+{
+    BDRVQcowState *s = bs->opaque;
+    int l2_index, ret;
+    uint64_t l2_offset, *l2_table, cluster_offset;
+    int nb_csectors;
+
+    ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
+    if (ret == 0)
+        return 0;
+
+    cluster_offset = be64_to_cpu(l2_table[l2_index]);
     if (cluster_offset & QCOW_OFLAG_COPIED)
         return cluster_offset & ~QCOW_OFLAG_COPIED;
 
-    if (cluster_offset) {
-        /* free the cluster */
-        if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
-            int nb_csectors;
-            nb_csectors = ((cluster_offset >> s->csize_shift) &
-                           s->csize_mask) + 1;
-            free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511,
-                          nb_csectors * 512);
-        } else {
-            free_clusters(bs, cluster_offset, s->cluster_size);
-        }
-    }
+    free_any_clusters(bs, cluster_offset);
 
-    if (compressed_size) {
-        int nb_csectors;
+    cluster_offset = alloc_bytes(bs, compressed_size);
+    nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
+                  (cluster_offset >> 9);
 
-        cluster_offset = alloc_bytes(bs, compressed_size);
-        nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
-                      (cluster_offset >> 9);
-
-        cluster_offset |= QCOW_OFLAG_COMPRESSED |
-                          ((uint64_t)nb_csectors << s->csize_shift);
-
-        /* update L2 table */
-
-        /* compressed clusters never have the copied flag */
-
-        l2_table[l2_index] = cpu_to_be64(cluster_offset);
-        if (bdrv_pwrite(s->hd,
-                        l2_offset + l2_index * sizeof(uint64_t),
-                        l2_table + l2_index,
-                        sizeof(uint64_t)) != sizeof(uint64_t))
-            return 0;
+    cluster_offset |= QCOW_OFLAG_COMPRESSED |
+                      ((uint64_t)nb_csectors << s->csize_shift);
 
-        return cluster_offset;
-    }
+    /* update L2 table */
+
+    /* compressed clusters never have the copied flag */
+
+    l2_table[l2_index] = cpu_to_be64(cluster_offset);
+    if (bdrv_pwrite(s->hd,
+                    l2_offset + l2_index * sizeof(uint64_t),
+                    l2_table + l2_index,
+                    sizeof(uint64_t)) != sizeof(uint64_t))
+        return 0;
+
+    return cluster_offset;
+}
+
+/*
+ * alloc_cluster_offset
+ *
+ * For a given offset of the disk image, return cluster offset in
+ * qcow2 file.
+ *
+ * If the offset is not found, allocate a new cluster.
+ *
+ * Return the cluster offset if successful,
+ * Return 0, otherwise.
+ *
+ */
+
+static uint64_t alloc_cluster_offset(BlockDriverState *bs,
+                                     uint64_t offset,
+                                     int n_start, int n_end)
+{
+    BDRVQcowState *s = bs->opaque;
+    int l2_index, ret;
+    uint64_t l2_offset, *l2_table, cluster_offset;
+
+    ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
+    if (ret == 0)
+        return 0;
+
+    cluster_offset = be64_to_cpu(l2_table[l2_index]);
+    if (cluster_offset & QCOW_OFLAG_COPIED)
+        return cluster_offset & ~QCOW_OFLAG_COPIED;
+
+    free_any_clusters(bs, cluster_offset);
 
     /* allocate a new cluster */
 
@@ -917,7 +994,7 @@ static int qcow_write(BlockDriverState *
         n = s->cluster_sectors - index_in_cluster;
         if (n > nb_sectors)
             n = nb_sectors;
-        cluster_offset = alloc_cluster_offset(bs, sector_num << 9, 0,
+        cluster_offset = alloc_cluster_offset(bs, sector_num << 9,
                                               index_in_cluster,
                                               index_in_cluster + n);
         if (!cluster_offset)
@@ -1101,7 +1178,7 @@ static void qcow_aio_write_cb(void *opaq
     acb->n = s->cluster_sectors - index_in_cluster;
     if (acb->n > acb->nb_sectors)
         acb->n = acb->nb_sectors;
-    cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9, 0,
+    cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9,
                                           index_in_cluster,
                                           index_in_cluster + acb->n);
     if (!cluster_offset || (cluster_offset & 511) != 0) {
@@ -1363,8 +1440,10 @@ static int qcow_write_compressed(BlockDr
         /* could not compress: write normal cluster */
         qcow_write(bs, sector_num, buf, s->cluster_sectors);
     } else {
-        cluster_offset = alloc_cluster_offset(bs, sector_num << 9,
-                                              out_len, 0, 0);
+        cluster_offset = alloc_compressed_cluster_offset(bs, sector_num << 9,
+                                              out_len);
+        if (!cluster_offset)
+            return -1;
         cluster_offset &= s->cluster_offset_mask;
         if (bdrv_pwrite(s->hd, cluster_offset, out_buf, out_len) != out_len) {
             qemu_free(out_buf);

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [patch 4/5][v3] Aggregate same type clusters.
  2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
                   ` (2 preceding siblings ...)
  2008-08-13 14:59 ` [Qemu-devel] [patch 3/5][v3] Extract compressing part from alloc_cluster_offset() Laurent.Vivier
@ 2008-08-13 14:59 ` Laurent.Vivier
  2008-08-14 15:53   ` [Qemu-devel] " Kevin Wolf
  2008-08-13 14:59 ` [Qemu-devel] [patch 5/5][v3] Try to aggregate free clusters and freed clusters Laurent.Vivier
  4 siblings, 1 reply; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

[-- Attachment #1: qcow2-extent.patch --]
[-- Type: text/plain, Size: 13871 bytes --]

Modify get_cluster_offset(), alloc_cluster_offset() to specify how many clusters
we want.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
 block-qcow2.c |  236 ++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 182 insertions(+), 54 deletions(-)

Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c	2008-08-12 12:49:24.000000000 +0200
+++ qemu/block-qcow2.c	2008-08-12 15:26:58.000000000 +0200
@@ -52,6 +52,8 @@
 #define QCOW_CRYPT_NONE 0
 #define QCOW_CRYPT_AES  1
 
+#define QCOW_MAX_CRYPT_CLUSTERS 32
+
 /* indicate that the refcount of the referenced cluster is exactly one. */
 #define QCOW_OFLAG_COPIED     (1LL << 63)
 /* indicate that the cluster is compressed (they never have the copied flag) */
@@ -263,7 +265,8 @@ static int qcow_open(BlockDriverState *b
     if (!s->cluster_cache)
         goto fail;
     /* one more sector for decompressed data alignment */
-    s->cluster_data = qemu_malloc(s->cluster_size + 512);
+    s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+                                  + 512);
     if (!s->cluster_data)
         goto fail;
     s->cluster_cache_offset = -1;
@@ -613,43 +616,98 @@ static uint64_t *l2_allocate(BlockDriver
  * For a given offset of the disk image, return cluster offset in
  * qcow2 file.
  *
+ * on entry, *num is the number of contiguous clusters we'd like to
+ * access following offset.
+ *
+ * on exit, *num is the number of contiguous clusters we can read.
+ *
  * Return 1, if the offset is found
  * Return 0, otherwise.
  *
  */
 
-static uint64_t get_cluster_offset(BlockDriverState *bs, uint64_t offset)
+static uint64_t get_cluster_offset(BlockDriverState *bs,
+                                   uint64_t offset, int *num)
 {
     BDRVQcowState *s = bs->opaque;
     int l1_index, l2_index;
-    uint64_t l2_offset, *l2_table, cluster_offset;
+    uint64_t l2_offset, *l2_table, cluster_offset, next;
+    int l1_bits;
+    int index_in_cluster, nb_available, nb_needed;
+
+    index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
+    nb_needed = *num + index_in_cluster;
+
+    l1_bits = s->l2_bits + s->cluster_bits;
+
+    /* compute how many bytes there are between the offset and
+     * and the end of the l1 entry
+     */
+
+    nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1));
+
+    /* compute the number of available sectors */
+
+    nb_available = (nb_available >> 9) + index_in_cluster;
+
+    cluster_offset = 0;
 
     /* seek the the l2 offset in the l1 table */
 
-    l1_index = offset >> (s->l2_bits + s->cluster_bits);
+    l1_index = offset >> l1_bits;
     if (l1_index >= s->l1_size)
-        return 0;
+        goto out;
 
     l2_offset = s->l1_table[l1_index];
 
     /* seek the l2 table of the given l2 offset */
 
     if (!l2_offset)
-        return 0;
+        goto out;
 
     /* load the l2 table in memory */
 
     l2_offset &= ~QCOW_OFLAG_COPIED;
     l2_table = l2_load(bs, l2_offset);
     if (l2_table == NULL)
-        return 0;
+        goto out;
 
     /* find the cluster offset for the given disk offset */
 
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
+    nb_available = s->cluster_sectors;
+    l2_index++;
+
+    if (!cluster_offset) {
 
-    return cluster_offset & ~QCOW_OFLAG_COPIED;
+       /* how many empty clusters ? */
+
+       while (nb_available < nb_needed && !l2_table[l2_index]) {
+           l2_index++;
+           nb_available += s->cluster_sectors;
+       }
+    } else {
+
+       /* how many allocated clusters ? */
+
+       cluster_offset &= ~QCOW_OFLAG_COPIED;
+       while (nb_available < nb_needed) {
+           next = be64_to_cpu(l2_table[l2_index]) & ~QCOW_OFLAG_COPIED;
+           if (next != cluster_offset + (nb_available << 9))
+               break;
+           l2_index++;
+           nb_available += s->cluster_sectors;
+       }
+   }
+
+out:
+    if (nb_available > nb_needed)
+        nb_available = nb_needed;
+
+    *num = nb_available - index_in_cluster;
+
+    return cluster_offset;
 }
 
 /*
@@ -660,13 +718,10 @@ static uint64_t get_cluster_offset(Block
  */
 
 static void free_any_clusters(BlockDriverState *bs,
-                              uint64_t cluster_offset)
+                              uint64_t cluster_offset, int nb_clusters)
 {
     BDRVQcowState *s = bs->opaque;
 
-    if (cluster_offset == 0)
-        return;
-
     /* free the cluster */
 
     if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
@@ -678,7 +733,9 @@ static void free_any_clusters(BlockDrive
         return;
     }
 
-    free_clusters(bs, cluster_offset, s->cluster_size);
+    free_clusters(bs, cluster_offset, nb_clusters << s->cluster_bits);
+
+    return;
 }
 
 /*
@@ -769,7 +826,8 @@ static uint64_t alloc_compressed_cluster
     if (cluster_offset & QCOW_OFLAG_COPIED)
         return cluster_offset & ~QCOW_OFLAG_COPIED;
 
-    free_any_clusters(bs, cluster_offset);
+    if (cluster_offset)
+        free_any_clusters(bs, cluster_offset, 1);
 
     cluster_offset = alloc_bytes(bs, compressed_size);
     nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
@@ -807,68 +865,136 @@ static uint64_t alloc_compressed_cluster
 
 static uint64_t alloc_cluster_offset(BlockDriverState *bs,
                                      uint64_t offset,
-                                     int n_start, int n_end)
+                                     int n_start, int n_end,
+                                     int *num)
 {
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
     uint64_t l2_offset, *l2_table, cluster_offset;
+    int nb_available, nb_clusters, i;
+    uint64_t start_sect, current;
 
     ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
     if (ret == 0)
         return 0;
 
+    nb_clusters = ((n_end << 9) + s->cluster_size - 1) >>
+                  s->cluster_bits;
+    if (nb_clusters > s->l2_size - l2_index)
+            nb_clusters = s->l2_size - l2_index;
+
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
-    if (cluster_offset & QCOW_OFLAG_COPIED)
-        return cluster_offset & ~QCOW_OFLAG_COPIED;
 
-    free_any_clusters(bs, cluster_offset);
+    /* We keep all QCOW_OFLAG_COPIED clusters */
+
+    if (cluster_offset & QCOW_OFLAG_COPIED) {
+
+        for (i = 1; i < nb_clusters; i++) {
+            current = be64_to_cpu(l2_table[l2_index + i]);
+            if (cluster_offset + (i << s->cluster_bits) != current)
+                break;
+        }
+        nb_clusters = i;
+
+        nb_available = nb_clusters << (s->cluster_bits - 9);
+        if (nb_available > n_end)
+            nb_available = n_end;
+
+        cluster_offset &= ~QCOW_OFLAG_COPIED;
+
+        goto out;
+    }
+
+    /* for the moment, multiple compressed clusters are not managed */
+
+    if (cluster_offset & QCOW_OFLAG_COMPRESSED)
+        nb_clusters = 1;
+
+    /* how many empty or how many to free ? */
+
+    if (!cluster_offset) {
+
+        /* how many free clusters ? */
+
+        i = 1;
+        while (i < nb_clusters &&
+               l2_table[l2_index + i] == 0) {
+            i++;
+        }
+        nb_clusters = i;
+
+    } else {
+
+        /* how many contiguous clusters ? */
+
+        for (i = 1; i < nb_clusters; i++) {
+            current = be64_to_cpu(l2_table[l2_index + i]);
+            if (cluster_offset + (i << s->cluster_bits) != current)
+                break;
+        }
+        nb_clusters = i;
+
+        free_any_clusters(bs, cluster_offset, i);
+    }
 
     /* allocate a new cluster */
 
-    cluster_offset = alloc_clusters(bs, s->cluster_size);
+    cluster_offset = alloc_clusters(bs, nb_clusters * s->cluster_size);
 
     /* we must initialize the cluster content which won't be
        written */
 
-    if ((n_end - n_start) < s->cluster_sectors) {
-        uint64_t start_sect;
-
-        start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
-        ret = copy_sectors(bs, start_sect,
-                           cluster_offset, 0, n_start);
+    nb_available = nb_clusters << (s->cluster_bits - 9);
+    if (nb_available > n_end)
+        nb_available = n_end;
+
+    /* copy content of unmodified sectors */
+
+    start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
+    if (n_start) {
+        ret = copy_sectors(bs, start_sect, cluster_offset, 0, n_start);
         if (ret < 0)
             return 0;
-        ret = copy_sectors(bs, start_sect,
-                           cluster_offset, n_end, s->cluster_sectors);
+    }
+
+    if (nb_available & (s->cluster_sectors - 1)) {
+        uint64_t end = nb_available & ~(uint64_t)(s->cluster_sectors - 1);
+        ret = copy_sectors(bs, start_sect + end,
+                           cluster_offset + (end << 9),
+                           nb_available - end,
+                           s->cluster_sectors);
         if (ret < 0)
             return 0;
     }
 
     /* update L2 table */
 
-    l2_table[l2_index] = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED);
+    for (i = 0; i < nb_clusters; i++)
+        l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
+                                             (i << s->cluster_bits)) |
+                                             QCOW_OFLAG_COPIED);
+
     if (bdrv_pwrite(s->hd,
                     l2_offset + l2_index * sizeof(uint64_t),
                     l2_table + l2_index,
-                    sizeof(uint64_t)) != sizeof(uint64_t))
+                    nb_clusters * sizeof(uint64_t)) !=
+                    nb_clusters * sizeof(uint64_t))
         return 0;
 
+out:
+    *num = nb_available - n_start;
+
     return cluster_offset;
 }
 
 static int qcow_is_allocated(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors, int *pnum)
 {
-    BDRVQcowState *s = bs->opaque;
-    int index_in_cluster, n;
     uint64_t cluster_offset;
 
-    cluster_offset = get_cluster_offset(bs, sector_num << 9);
-    index_in_cluster = sector_num & (s->cluster_sectors - 1);
-    n = s->cluster_sectors - index_in_cluster;
-    if (n > nb_sectors)
-        n = nb_sectors;
-    *pnum = n;
+    *pnum = nb_sectors;
+    cluster_offset = get_cluster_offset(bs, sector_num << 9, pnum);
+
     return (cluster_offset != 0);
 }
 
@@ -945,11 +1071,9 @@ static int qcow_read(BlockDriverState *b
     uint64_t cluster_offset;
 
     while (nb_sectors > 0) {
-        cluster_offset = get_cluster_offset(bs, sector_num << 9);
+        n = nb_sectors;
+        cluster_offset = get_cluster_offset(bs, sector_num << 9, &n);
         index_in_cluster = sector_num & (s->cluster_sectors - 1);
-        n = s->cluster_sectors - index_in_cluster;
-        if (n > nb_sectors)
-            n = nb_sectors;
         if (!cluster_offset) {
             if (bs->backing_hd) {
                 /* read from the base image */
@@ -988,15 +1112,17 @@ static int qcow_write(BlockDriverState *
     BDRVQcowState *s = bs->opaque;
     int ret, index_in_cluster, n;
     uint64_t cluster_offset;
+    int n_end;
 
     while (nb_sectors > 0) {
         index_in_cluster = sector_num & (s->cluster_sectors - 1);
-        n = s->cluster_sectors - index_in_cluster;
-        if (n > nb_sectors)
-            n = nb_sectors;
+        n_end = index_in_cluster + nb_sectors;
+        if (s->crypt_method &&
+            n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors)
+            n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors;
         cluster_offset = alloc_cluster_offset(bs, sector_num << 9,
                                               index_in_cluster,
-                                              index_in_cluster + n);
+                                              n_end, &n);
         if (!cluster_offset)
             return -1;
         if (s->crypt_method) {
@@ -1069,11 +1195,9 @@ static void qcow_aio_read_cb(void *opaqu
     }
 
     /* prepare next AIO request */
-    acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9);
+    acb->n = acb->nb_sectors;
+    acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9, &acb->n);
     index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
-    acb->n = s->cluster_sectors - index_in_cluster;
-    if (acb->n > acb->nb_sectors)
-        acb->n = acb->nb_sectors;
 
     if (!acb->cluster_offset) {
         if (bs->backing_hd) {
@@ -1153,6 +1277,7 @@ static void qcow_aio_write_cb(void *opaq
     int index_in_cluster;
     uint64_t cluster_offset;
     const uint8_t *src_buf;
+    int n_end;
 
     acb->hd_aiocb = NULL;
 
@@ -1175,19 +1300,22 @@ static void qcow_aio_write_cb(void *opaq
     }
 
     index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
-    acb->n = s->cluster_sectors - index_in_cluster;
-    if (acb->n > acb->nb_sectors)
-        acb->n = acb->nb_sectors;
+    n_end = index_in_cluster + acb->nb_sectors;
+    if (s->crypt_method &&
+        n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors)
+        n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors;
+
     cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9,
                                           index_in_cluster,
-                                          index_in_cluster + acb->n);
+                                          n_end, &acb->n);
     if (!cluster_offset || (cluster_offset & 511) != 0) {
         ret = -EIO;
         goto fail;
     }
     if (s->crypt_method) {
         if (!acb->cluster_data) {
-            acb->cluster_data = qemu_mallocz(s->cluster_size);
+            acb->cluster_data = qemu_mallocz(QCOW_MAX_CRYPT_CLUSTERS *
+                                             s->cluster_size);
             if (!acb->cluster_data) {
                 ret = -ENOMEM;
                 goto fail;

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [patch 5/5][v3] Try to aggregate free clusters and freed clusters
  2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
                   ` (3 preceding siblings ...)
  2008-08-13 14:59 ` [Qemu-devel] [patch 4/5][v3] Aggregate same type clusters Laurent.Vivier
@ 2008-08-13 14:59 ` Laurent.Vivier
  2008-08-14 16:11   ` [Qemu-devel] " Kevin Wolf
  4 siblings, 1 reply; 13+ messages in thread
From: Laurent.Vivier @ 2008-08-13 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier, Kevin Wolf

[-- Attachment #1: qcow2-extent-more.patch --]
[-- Type: text/plain, Size: 2725 bytes --]

In alloc_cluster_offset(), ry to aggregate free clusters and freed clusters.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
 block-qcow2.c |   54 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 36 insertions(+), 18 deletions(-)

Index: qemu/block-qcow2.c
===================================================================
--- qemu.orig/block-qcow2.c	2008-08-12 15:56:41.000000000 +0200
+++ qemu/block-qcow2.c	2008-08-12 16:02:22.000000000 +0200
@@ -871,7 +871,7 @@ static uint64_t alloc_cluster_offset(Blo
     BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
     uint64_t l2_offset, *l2_table, cluster_offset;
-    int nb_available, nb_clusters, i;
+    int nb_available, nb_clusters, i, j;
     uint64_t start_sect, current;
 
     ret = get_cluster_table(bs, offset, &l2_table, &l2_offset, &l2_index);
@@ -910,32 +910,50 @@ static uint64_t alloc_cluster_offset(Blo
     if (cluster_offset & QCOW_OFLAG_COMPRESSED)
         nb_clusters = 1;
 
-    /* how many empty or how many to free ? */
+    /* how many available clusters ? */
 
-    if (!cluster_offset) {
+    i = 0;
+    while (i < nb_clusters) {
 
-        /* how many free clusters ? */
+        i++;
 
-        i = 1;
-        while (i < nb_clusters &&
-               l2_table[l2_index + i] == 0) {
-            i++;
-        }
-        nb_clusters = i;
+        if (!cluster_offset) {
 
-    } else {
+            /* how many free clusters ? */
 
-        /* how many contiguous clusters ? */
+            while (i < nb_clusters) {
+                cluster_offset = l2_table[l2_index + i];
+                if (cluster_offset != 0)
+                    break;
+                i++;
+            }
 
-        for (i = 1; i < nb_clusters; i++) {
-            current = be64_to_cpu(l2_table[l2_index + i]);
-            if (cluster_offset + (i << s->cluster_bits) != current)
+            if ((cluster_offset & QCOW_OFLAG_COPIED) ||
+                (cluster_offset & QCOW_OFLAG_COMPRESSED))
                 break;
-        }
-        nb_clusters = i;
 
-        free_any_clusters(bs, cluster_offset, i);
+        } else {
+
+            /* how many contiguous clusters ? */
+
+            j = 1;
+            current = 0;
+            while (i < nb_clusters) {
+                current = be64_to_cpu(l2_table[l2_index + i]);
+                if (cluster_offset + (j << s->cluster_bits) != current)
+                    break;
+
+                i++;
+                j++;
+            }
+
+            free_any_clusters(bs, cluster_offset, j);
+            if (current)
+                break;
+            cluster_offset = current;
+        }
     }
+    nb_clusters = i;
 
     /* allocate a new cluster */
 

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 1/5][v3] Extract code from get_cluster_offset()
  2008-08-13 14:59 ` [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset() Laurent.Vivier
@ 2008-08-14 14:07   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 14:07 UTC (permalink / raw)
  To: Laurent.Vivier; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

Laurent.Vivier@bull.net schrieb:
> Extract code from get_cluster_offset() into new functions:
> 
> - seek_l2_table()
> 
> Search an l2 offset in the l2_cache table.
> 
> - l2_load()
> 
> Read the l2 entry from disk
> 
> - l2_allocate()
> 
> Allocate a new l2 entry.
> 
> Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
> ---
>  block-qcow2.c |  216 +++++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 154 insertions(+), 62 deletions(-)

The logic looks fine to me. The comment for l2_load doesn't match the
code, though. As Laurent is not available until September I'm attaching
a corrected version of the patch myself.

Kevin

[-- Attachment #2: qcow2-factor.patch --]
[-- Type: text/x-patch, Size: 9617 bytes --]

Extract code from get_cluster_offset() into new functions:

- seek_l2_table()

Search an l2 offset in the l2_cache table.

- l2_load()

Read the l2 entry from disk

- l2_allocate()

Allocate a new l2 entry.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Kevin Wolf <kwolf@suse.de>
---
 block-qcow2.c |  215 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 153 insertions(+), 62 deletions(-)

Index: qemu-svn/block-qcow2.c
===================================================================
--- qemu-svn.orig/block-qcow2.c
+++ qemu-svn/block-qcow2.c
@@ -480,101 +480,191 @@ static int grow_l1_table(BlockDriverStat
     return -EIO;
 }
 
-/* 'allocate' is:
+/*
+ * seek_l2_table
  *
- * 0 not to allocate.
+ * seek l2_offset in the l2_cache table
+ * if not found, return NULL,
+ * if found,
+ *   increments the l2 cache hit count of the entry,
+ *   if counter overflow, divide by two all counters
+ *   return the pointer to the l2 cache entry
  *
- * 1 to allocate a normal cluster (for sector indexes 'n_start' to
- * 'n_end')
+ */
+
+static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset)
+{
+    int i, j;
+
+    for(i = 0; i < L2_CACHE_SIZE; i++) {
+        if (l2_offset == s->l2_cache_offsets[i]) {
+            /* increment the hit count */
+            if (++s->l2_cache_counts[i] == 0xffffffff) {
+                for(j = 0; j < L2_CACHE_SIZE; j++) {
+                    s->l2_cache_counts[j] >>= 1;
+                }
+            }
+            return s->l2_cache + (i << s->l2_bits);
+        }
+    }
+    return NULL;
+}
+
+/*
+ * l2_load
+ *
+ * Loads a L2 table into memory. If the table is in the cache, the cache
+ * is used; otherwise the L2 table is loaded from the image file.
+ *
+ * Returns a pointer to the L2 table on success, or NULL if the read from
+ * the image file failed.
+ */
+
+static uint64_t *l2_load(BlockDriverState *bs, uint64_t l2_offset)
+{
+    BDRVQcowState *s = bs->opaque;
+    int min_index;
+    uint64_t *l2_table;
+
+    /* seek if the table for the given offset is in the cache */
+
+    l2_table = seek_l2_table(s, l2_offset);
+    if (l2_table != NULL)
+        return l2_table;
+
+    /* not found: load a new entry in the least used one */
+
+    min_index = l2_cache_new_entry(bs);
+    l2_table = s->l2_cache + (min_index << s->l2_bits);
+    if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
+        s->l2_size * sizeof(uint64_t))
+        return NULL;
+    s->l2_cache_offsets[min_index] = l2_offset;
+    s->l2_cache_counts[min_index] = 1;
+
+    return l2_table;
+}
+
+/*
+ * l2_allocate
  *
- * 2 to allocate a compressed cluster of size
- * 'compressed_size'. 'compressed_size' must be > 0 and <
- * cluster_size
+ * Allocate a new l2 entry in the file. If l1_index points to an already
+ * used entry in the L2 table (i.e. we are doing a copy on write for the L2
+ * table) copy the contents of the old L2 table into the newly allocated one.
+ * Otherwise the new table is initialized with zeros.
  *
- * return 0 if not allocated.
  */
+
+static uint64_t *l2_allocate(BlockDriverState *bs, int l1_index)
+{
+    BDRVQcowState *s = bs->opaque;
+    int min_index;
+    uint64_t old_l2_offset, tmp;
+    uint64_t *l2_table, l2_offset;
+
+    old_l2_offset = s->l1_table[l1_index];
+
+    /* allocate a new l2 entry */
+
+    l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
+
+    /* update the L1 entry */
+
+    s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
+
+    tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED);
+    if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp),
+                    &tmp, sizeof(tmp)) != sizeof(tmp))
+        return NULL;
+
+    /* allocate a new entry in the l2 cache */
+
+    min_index = l2_cache_new_entry(bs);
+    l2_table = s->l2_cache + (min_index << s->l2_bits);
+
+    if (old_l2_offset == 0) {
+        /* if there was no old l2 table, clear the new table */
+        memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
+    } else {
+        /* if there was an old l2 table, read it from the disk */
+        if (bdrv_pread(s->hd, old_l2_offset,
+                       l2_table, s->l2_size * sizeof(uint64_t)) !=
+            s->l2_size * sizeof(uint64_t))
+            return NULL;
+    }
+    /* write the l2 table to the file */
+    if (bdrv_pwrite(s->hd, l2_offset,
+                    l2_table, s->l2_size * sizeof(uint64_t)) !=
+        s->l2_size * sizeof(uint64_t))
+        return NULL;
+
+    /* update the l2 cache entry */
+
+    s->l2_cache_offsets[min_index] = l2_offset;
+    s->l2_cache_counts[min_index] = 1;
+
+    return l2_table;
+}
+
 static uint64_t get_cluster_offset(BlockDriverState *bs,
                                    uint64_t offset, int allocate,
                                    int compressed_size,
                                    int n_start, int n_end)
 {
     BDRVQcowState *s = bs->opaque;
-    int min_index, i, j, l1_index, l2_index, ret;
-    uint64_t l2_offset, *l2_table, cluster_offset, tmp, old_l2_offset;
+    int l1_index, l2_index, ret;
+    uint64_t l2_offset, *l2_table, cluster_offset, tmp;
+
+    /* seek the the l2 offset in the l1 table */
 
     l1_index = offset >> (s->l2_bits + s->cluster_bits);
     if (l1_index >= s->l1_size) {
         /* outside l1 table is allowed: we grow the table if needed */
         if (!allocate)
             return 0;
-        if (grow_l1_table(bs, l1_index + 1) < 0)
+        ret = grow_l1_table(bs, l1_index + 1);
+        if (ret < 0)
             return 0;
     }
     l2_offset = s->l1_table[l1_index];
+
+    /* seek the l2 table of the given l2 offset */
+
     if (!l2_offset) {
+        /* the l2 table doesn't exist */
         if (!allocate)
             return 0;
-    l2_allocate:
-        old_l2_offset = l2_offset;
-        /* allocate a new l2 entry */
-        l2_offset = alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
-        /* update the L1 entry */
-        s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
-        tmp = cpu_to_be64(l2_offset | QCOW_OFLAG_COPIED);
-        if (bdrv_pwrite(s->hd, s->l1_table_offset + l1_index * sizeof(tmp),
-                        &tmp, sizeof(tmp)) != sizeof(tmp))
-            return 0;
-        min_index = l2_cache_new_entry(bs);
-        l2_table = s->l2_cache + (min_index << s->l2_bits);
-
-        if (old_l2_offset == 0) {
-            memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
-        } else {
-            if (bdrv_pread(s->hd, old_l2_offset,
-                           l2_table, s->l2_size * sizeof(uint64_t)) !=
-                s->l2_size * sizeof(uint64_t))
-                return 0;
-        }
-        if (bdrv_pwrite(s->hd, l2_offset,
-                        l2_table, s->l2_size * sizeof(uint64_t)) !=
-            s->l2_size * sizeof(uint64_t))
+        /* allocate a new l2 table for this offset */
+        l2_table = l2_allocate(bs, l1_index);
+        if (l2_table == NULL)
             return 0;
+        l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
     } else {
-        if (!(l2_offset & QCOW_OFLAG_COPIED)) {
-            if (allocate) {
-                free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
-                goto l2_allocate;
-            }
+        /* the l2 table exists */
+        if (!(l2_offset & QCOW_OFLAG_COPIED) && allocate) {
+            /* duplicate the l2 table, and free the old table */
+            free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            l2_table = l2_allocate(bs, l1_index);
+            if (l2_table == NULL)
+                return 0;
+            l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
         } else {
+            /* load the l2 table in memory */
             l2_offset &= ~QCOW_OFLAG_COPIED;
+            l2_table = l2_load(bs, l2_offset);
+            if (l2_table == NULL)
+                return 0;
         }
-        for(i = 0; i < L2_CACHE_SIZE; i++) {
-            if (l2_offset == s->l2_cache_offsets[i]) {
-                /* increment the hit count */
-                if (++s->l2_cache_counts[i] == 0xffffffff) {
-                    for(j = 0; j < L2_CACHE_SIZE; j++) {
-                        s->l2_cache_counts[j] >>= 1;
-                    }
-                }
-                l2_table = s->l2_cache + (i << s->l2_bits);
-                goto found;
-            }
-        }
-        /* not found: load a new entry in the least used one */
-        min_index = l2_cache_new_entry(bs);
-        l2_table = s->l2_cache + (min_index << s->l2_bits);
-        if (bdrv_pread(s->hd, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
-            s->l2_size * sizeof(uint64_t))
-            return 0;
     }
-    s->l2_cache_offsets[min_index] = l2_offset;
-    s->l2_cache_counts[min_index] = 1;
- found:
+
+    /* find the cluster offset for the given disk offset */
+
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
     if (!cluster_offset) {
+        /* cluster doesn't exist */
         if (!allocate)
-            return cluster_offset;
+            return 0;
     } else if (!(cluster_offset & QCOW_OFLAG_COPIED)) {
         if (!allocate)
             return cluster_offset;
@@ -592,6 +682,7 @@ static uint64_t get_cluster_offset(Block
         cluster_offset &= ~QCOW_OFLAG_COPIED;
         return cluster_offset;
     }
+
     if (allocate == 1) {
         /* allocate a new cluster */
         cluster_offset = alloc_clusters(bs, s->cluster_size);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 2/5][v3] Divide get_cluster_offset()
  2008-08-13 14:59 ` [Qemu-devel] [patch 2/5][v3] Divide get_cluster_offset() Laurent.Vivier
@ 2008-08-14 14:10   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 14:10 UTC (permalink / raw)
  To: Laurent.Vivier; +Cc: qemu-devel

Laurent.Vivier@bull.net schrieb:
> Divide get_cluster_offset() into get_cluster_offset() and
> alloc_cluster_offset().
> 
> Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
> ---
>  block-qcow2.c |  208 ++++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 132 insertions(+), 76 deletions(-)

The comment of get_cluster_offset describes a wrong return value.

Otherwise Acked-by: Kevin Wolf <kwolf@suse.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 3/5][v3] Extract compressing part from alloc_cluster_offset()
  2008-08-13 14:59 ` [Qemu-devel] [patch 3/5][v3] Extract compressing part from alloc_cluster_offset() Laurent.Vivier
@ 2008-08-14 14:25   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 14:25 UTC (permalink / raw)
  To: Laurent.Vivier; +Cc: qemu-devel

Laurent.Vivier@bull.net schrieb:
> Divide alloc_cluster_offset() into alloc_cluster_offset() and
> alloc_compressed_cluster_offset().
> Common parts are moved to free_any_clusters() and get_cluster_table();
> 
> Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
> ---
>  block-qcow2.c |  177 +++++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 128 insertions(+), 49 deletions(-)

Acked-by: Kevin Wolf <kwolf@suse.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 4/5][v3] Aggregate same type clusters.
  2008-08-13 14:59 ` [Qemu-devel] [patch 4/5][v3] Aggregate same type clusters Laurent.Vivier
@ 2008-08-14 15:53   ` Kevin Wolf
  2008-08-14 16:20     ` Anthony Liguori
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 15:53 UTC (permalink / raw)
  To: Laurent.Vivier; +Cc: qemu-devel

Laurent.Vivier@bull.net schrieb:
> Modify get_cluster_offset(), alloc_cluster_offset() to specify how many clusters
> we want.
> 
> Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
> ---
>  block-qcow2.c |  236 ++++++++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 182 insertions(+), 54 deletions(-)
> 
> Index: qemu/block-qcow2.c
> ===================================================================
> --- qemu.orig/block-qcow2.c	2008-08-12 12:49:24.000000000 +0200
> +++ qemu/block-qcow2.c	2008-08-12 15:26:58.000000000 +0200
> @@ -52,6 +52,8 @@
>  #define QCOW_CRYPT_NONE 0
>  #define QCOW_CRYPT_AES  1
>  
> +#define QCOW_MAX_CRYPT_CLUSTERS 32
> +
>  /* indicate that the refcount of the referenced cluster is exactly one. */
>  #define QCOW_OFLAG_COPIED     (1LL << 63)
>  /* indicate that the cluster is compressed (they never have the copied flag) */
> @@ -263,7 +265,8 @@ static int qcow_open(BlockDriverState *b
>      if (!s->cluster_cache)
>          goto fail;
>      /* one more sector for decompressed data alignment */
> -    s->cluster_data = qemu_malloc(s->cluster_size + 512);
> +    s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
> +                                  + 512);
>      if (!s->cluster_data)
>          goto fail;
>      s->cluster_cache_offset = -1;
> @@ -613,43 +616,98 @@ static uint64_t *l2_allocate(BlockDriver
>   * For a given offset of the disk image, return cluster offset in
>   * qcow2 file.
>   *
> + * on entry, *num is the number of contiguous clusters we'd like to
> + * access following offset.
> + *
> + * on exit, *num is the number of contiguous clusters we can read.
> + *
>   * Return 1, if the offset is found
>   * Return 0, otherwise.
>   *
>   */
>  
> -static uint64_t get_cluster_offset(BlockDriverState *bs, uint64_t offset)
> +static uint64_t get_cluster_offset(BlockDriverState *bs,
> +                                   uint64_t offset, int *num)
>  {
>      BDRVQcowState *s = bs->opaque;
>      int l1_index, l2_index;
> -    uint64_t l2_offset, *l2_table, cluster_offset;
> +    uint64_t l2_offset, *l2_table, cluster_offset, next;
> +    int l1_bits;
> +    int index_in_cluster, nb_available, nb_needed;
> +
> +    index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
> +    nb_needed = *num + index_in_cluster;
> +
> +    l1_bits = s->l2_bits + s->cluster_bits;
> +
> +    /* compute how many bytes there are between the offset and
> +     * and the end of the l1 entry
> +     */
> +
> +    nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1));
> +
> +    /* compute the number of available sectors */
> +
> +    nb_available = (nb_available >> 9) + index_in_cluster;
> +
> +    cluster_offset = 0;
>  
>      /* seek the the l2 offset in the l1 table */
>  
> -    l1_index = offset >> (s->l2_bits + s->cluster_bits);
> +    l1_index = offset >> l1_bits;
>      if (l1_index >= s->l1_size)
> -        return 0;
> +        goto out;
>  
>      l2_offset = s->l1_table[l1_index];
>  
>      /* seek the l2 table of the given l2 offset */
>  
>      if (!l2_offset)
> -        return 0;
> +        goto out;
>  
>      /* load the l2 table in memory */
>  
>      l2_offset &= ~QCOW_OFLAG_COPIED;
>      l2_table = l2_load(bs, l2_offset);
>      if (l2_table == NULL)
> -        return 0;
> +        goto out;

You agreed that return 0 is actually the right thing to do here because
this is a real error.


>  
>      /* find the cluster offset for the given disk offset */
>  
>      l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
>      cluster_offset = be64_to_cpu(l2_table[l2_index]);
> +    nb_available = s->cluster_sectors;
> +    l2_index++;
> +
> +    if (!cluster_offset) {
>  
> -    return cluster_offset & ~QCOW_OFLAG_COPIED;
> +       /* how many empty clusters ? */
> +
> +       while (nb_available < nb_needed && !l2_table[l2_index]) {
> +           l2_index++;
> +           nb_available += s->cluster_sectors;
> +       }
> +    } else {
> +
> +       /* how many allocated clusters ? */
> +
> +       cluster_offset &= ~QCOW_OFLAG_COPIED;
> +       while (nb_available < nb_needed) {
> +           next = be64_to_cpu(l2_table[l2_index]) & ~QCOW_OFLAG_COPIED;
> +           if (next != cluster_offset + (nb_available << 9))
> +               break;
> +           l2_index++;
> +           nb_available += s->cluster_sectors;
> +       }
> +   }
> +
> +out:
> +    if (nb_available > nb_needed)
> +        nb_available = nb_needed;
> +
> +    *num = nb_available - index_in_cluster;
> +
> +    return cluster_offset;
>  }
>  
>  /*
> @@ -660,13 +718,10 @@ static uint64_t get_cluster_offset(Block
>   */
>  
>  static void free_any_clusters(BlockDriverState *bs,
> -                              uint64_t cluster_offset)
> +                              uint64_t cluster_offset, int nb_clusters)
>  {
>      BDRVQcowState *s = bs->opaque;
>  
> -    if (cluster_offset == 0)
> -        return;
> -

I don't think this check hurts. Even if this should never happen anyway.

The rest of it looks good.

Kevin

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 5/5][v3] Try to aggregate free clusters and freed clusters
  2008-08-13 14:59 ` [Qemu-devel] [patch 5/5][v3] Try to aggregate free clusters and freed clusters Laurent.Vivier
@ 2008-08-14 16:11   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 16:11 UTC (permalink / raw)
  To: Laurent.Vivier; +Cc: qemu-devel

Laurent.Vivier@bull.net schrieb:
> In alloc_cluster_offset(), ry to aggregate free clusters and freed clusters.
> 
> Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
> ---
>  block-qcow2.c |   54 ++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 36 insertions(+), 18 deletions(-)

Acked-by: Kevin Wolf <kwolf@suse.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 4/5][v3] Aggregate same type clusters.
  2008-08-14 15:53   ` [Qemu-devel] " Kevin Wolf
@ 2008-08-14 16:20     ` Anthony Liguori
  2008-08-14 17:16       ` Kevin Wolf
  0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2008-08-14 16:20 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Laurent.Vivier, qemu-devel

Kevin Wolf wrote:
> Laurent.Vivier@bull.net schrieb:
>   
>>      /* seek the the l2 offset in the l1 table */
>>  
>> -    l1_index = offset >> (s->l2_bits + s->cluster_bits);
>> +    l1_index = offset >> l1_bits;
>>      if (l1_index >= s->l1_size)
>> -        return 0;
>> +        goto out;
>>  
>>      l2_offset = s->l1_table[l1_index];
>>  
>>      /* seek the l2 table of the given l2 offset */
>>  
>>      if (!l2_offset)
>> -        return 0;
>> +        goto out;
>>  
>>      /* load the l2 table in memory */
>>  
>>      l2_offset &= ~QCOW_OFLAG_COPIED;
>>      l2_table = l2_load(bs, l2_offset);
>>      if (l2_table == NULL)
>> -        return 0;
>> +        goto out;
>>     
>
> You agreed that return 0 is actually the right thing to do here because
> this is a real error.
>   

I'm inclined to apply this patch (and the rest of the series) and then 
when Laurent gets back, we can have another patch that changes this back 
to return 0.  Any objections?

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] Re: [patch 4/5][v3] Aggregate same type clusters.
  2008-08-14 16:20     ` Anthony Liguori
@ 2008-08-14 17:16       ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2008-08-14 17:16 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Laurent.Vivier, qemu-devel

Anthony Liguori schrieb:
> Kevin Wolf wrote:
>> Laurent.Vivier@bull.net schrieb:
>>  
>>>      /* seek the the l2 offset in the l1 table */
>>>  
>>> -    l1_index = offset >> (s->l2_bits + s->cluster_bits);
>>> +    l1_index = offset >> l1_bits;
>>>      if (l1_index >= s->l1_size)
>>> -        return 0;
>>> +        goto out;
>>>  
>>>      l2_offset = s->l1_table[l1_index];
>>>  
>>>      /* seek the l2 table of the given l2 offset */
>>>  
>>>      if (!l2_offset)
>>> -        return 0;
>>> +        goto out;
>>>  
>>>      /* load the l2 table in memory */
>>>  
>>>      l2_offset &= ~QCOW_OFLAG_COPIED;
>>>      l2_table = l2_load(bs, l2_offset);
>>>      if (l2_table == NULL)
>>> -        return 0;
>>> +        goto out;
>>>     
>>
>> You agreed that return 0 is actually the right thing to do here because
>> this is a real error.
>>   
> 
> I'm inclined to apply this patch (and the rest of the series) and then
> when Laurent gets back, we can have another patch that changes this back
> to return 0.  Any objections?

I'd prefer that you change it before committing. If we don't return 0
here, the L2 table is considered free if it can't be loaded. So
returning 0 is definitely safer.

But then, if you can't load the L2 table, you're in trouble anyway...
It's your decision, I can live with both.

Kevin

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-08-14 17:17 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-13 14:59 [Qemu-devel] [patch 0/5][v3] Improve qcow2 performance when used with cache=off Laurent.Vivier
2008-08-13 14:59 ` [Qemu-devel] [patch 1/5][v3] Extract code from get_cluster_offset() Laurent.Vivier
2008-08-14 14:07   ` [Qemu-devel] " Kevin Wolf
2008-08-13 14:59 ` [Qemu-devel] [patch 2/5][v3] Divide get_cluster_offset() Laurent.Vivier
2008-08-14 14:10   ` [Qemu-devel] " Kevin Wolf
2008-08-13 14:59 ` [Qemu-devel] [patch 3/5][v3] Extract compressing part from alloc_cluster_offset() Laurent.Vivier
2008-08-14 14:25   ` [Qemu-devel] " Kevin Wolf
2008-08-13 14:59 ` [Qemu-devel] [patch 4/5][v3] Aggregate same type clusters Laurent.Vivier
2008-08-14 15:53   ` [Qemu-devel] " Kevin Wolf
2008-08-14 16:20     ` Anthony Liguori
2008-08-14 17:16       ` Kevin Wolf
2008-08-13 14:59 ` [Qemu-devel] [patch 5/5][v3] Try to aggregate free clusters and freed clusters Laurent.Vivier
2008-08-14 16:11   ` [Qemu-devel] " Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).