* [Qemu-devel] [PATCH V14 0/6] add-cow file format
From: Dong Xu Wang @ 2012-10-25 13:36 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
This series introduces a new file format: add-cow.
The add-cow file format makes it possible to perform copy-on-write on top of
a raw disk image. When we know that no backing file clusters remain visible
(e.g. we have streamed the entire image and copied all data from the backing
file), then it is possible to discard the add-cow file and use the raw image
file directly.
This feature adds the copy-on-write feature to raw files (which cannot support
it natively) while allowing us to get full performance again later when we no
longer need copy-on-write.
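The read path this implies can be sketched as follows. This is only an illustration under stated assumptions, not code from block/add-cow.c: the enum and function names are hypothetical, a backing_size of 0 stands for "no backing file", and the bit order (the least significant bit of each byte covers the lowest-numbered cluster) follows docs/specs/add-cow.txt in patch 1/6.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from block/add-cow.c. */
enum read_source { READ_FROM_IMAGE, READ_FROM_BACKING, READ_AS_ZEROES };

/* Test the bit that tracks one cluster (LSB of each byte comes first). */
static bool cluster_is_allocated(const uint8_t *bitmap, uint64_t cluster)
{
    return (bitmap[cluster / 8] >> (cluster % 8)) & 1;
}

/* Set the bit that tracks one cluster, as a write to that cluster would. */
static void mark_cluster_allocated(uint8_t *bitmap, uint64_t cluster)
{
    bitmap[cluster / 8] |= (uint8_t)(1u << (cluster % 8));
}

/*
 * Decide where a read at byte 'offset' is served from.
 * Assumption of this sketch: backing_size == 0 means no backing file.
 */
static enum read_source pick_read_source(const uint8_t *bitmap,
                                         uint32_t cluster_bits,
                                         uint64_t offset,
                                         uint64_t backing_size)
{
    uint64_t cluster = offset >> cluster_bits;

    if (cluster_is_allocated(bitmap, cluster)) {
        return READ_FROM_IMAGE;
    }
    if (offset >= backing_size) {
        /* Beyond the end of the backing file (or none): read zeroes. */
        return READ_AS_ZEROES;
    }
    return READ_FROM_BACKING;
}
```

Once every bit is set, no read is served from the backing file any more, which is the point at which the add-cow file can be discarded and the raw image used directly.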
add-cow can benefit from other available functions, such as path_has_protocol
and qed_read_string, so we will make them public.
snapshot_blkdev is not yet supported for add-cow; it will be added in further patches.
These patches use the QemuOpts parser; the earlier patches can be found here:
http://lists.nongnu.org/archive/html/qemu-devel/2012-10/msg04439.html
v13->v14:
1) Make some sentences clearer in docs.
2) Shrink MAGIC from 8 bytes to 4 bytes.
3) Fix some bugs.
v12->v13:
1) Use QemuOpts, not QEMUOptionParameter.
2) Make cluster_size configurable.
3) Refactor block-cache.c.
4) Correct qemu-iotests script.
5) Other bug fixes.
v11->v12:
1) Remove unused feature bit.
2) Share cache code with qcow2.c.
3) Remove snapshot_blkdev support; it will be added in another patch.
5) COW Bitmap field in add-cow file will be a multiple of 65536.
6) Fix grammar and typos.
Dong Xu Wang (6):
docs: document for add-cow file format
make path_has_protocol non static
qed_read_string to bdrv_read_string
rename qcow2-cache.c to block-cache.c
add-cow file format core code.
qemu-iotests: add add-cow iotests support.
block.c | 29 ++-
block.h | 3 +
block/Makefile.objs | 4 +-
block/add-cow.c | 692 ++++++++++++++++++++++++++++++++++++++++++
block/add-cow.h | 83 +++++
block/block-cache.c | 321 +++++++++++++++++++
block/block-cache.h | 77 +++++
block/qcow2-cache.c | 323 --------------------
block/qcow2-cluster.c | 53 ++--
block/qcow2-refcount.c | 66 +++--
block/qcow2.c | 21 +-
block/qcow2.h | 24 +--
block/qed.c | 34 +--
block_int.h | 2 +
docs/specs/add-cow.txt | 151 +++++++++
tests/qemu-iotests/017 | 2 +-
tests/qemu-iotests/020 | 2 +-
tests/qemu-iotests/common | 6 +
tests/qemu-iotests/common.rc | 15 +-
trace-events | 13 +-
20 files changed, 1472 insertions(+), 449 deletions(-)
create mode 100644 block/add-cow.c
create mode 100644 block/add-cow.h
create mode 100644 block/block-cache.c
create mode 100644 block/block-cache.h
delete mode 100644 block/qcow2-cache.c
create mode 100644 docs/specs/add-cow.txt
* [Qemu-devel] [PATCH V14 1/6] docs: document for add-cow file format
From: Dong Xu Wang @ 2012-10-25 13:36 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
Document the add-cow format: introduce its usage and its specification.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
docs/specs/add-cow.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 151 insertions(+), 0 deletions(-)
create mode 100644 docs/specs/add-cow.txt
diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..eeaaf10
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,151 @@
+== General ==
+
+The raw file format does not support backing files or copy-on-write.
+The add-cow image format makes it possible to use backing files with a raw
+image by keeping a separate .add-cow metadata file. Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+An example usage of add-cow looks like:
+(ubuntu.img is a disk image which has an installed OS.)
+ 1) Create a raw image of the same size as ubuntu.img
+ qemu-img create -f raw test.raw 8G
+ 2) Create an add-cow image which will store the dirty bitmap
+ qemu-img create -f add-cow test.add-cow \
+ -o backing_file=ubuntu.img,image_file=test.raw
+ 3) Run qemu with add-cow image
+ qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img; in that case, the size of test.add-cow
+will be calculated from the size of test.raw.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------------------------+
+ | Header | COW bitmap |
+ +---------------+-------------------------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(HEADER_SIZE is stored in bytes 44 - 47.)
+ Byte 0 - 3: magic
+ add-cow magic string ("ACOW").
+
+ 4 - 7: version
+ Version number (only valid value is 1 now).
+
+ 8 - 11: backing file name offset
+ Offset in the add-cow file at which the backing file
+ name is stored (NB: The string is not NUL-terminated).
+ If backing file name does NOT exist, this field will be
+ 0. Otherwise it must be between 80 and [HEADER_SIZE - 2]
+ (a file name must be at least 1 byte).
+
+ 12 - 15: backing file name size
+ Length of the backing file name in bytes. It will be 0
+ if the backing file name offset is 0. If backing file
+ name offset is non-zero, then it must be non-zero. Must
+ be less than [HEADER_SIZE - 80] to fit in the reserved
+ part of the header. Backing file name offset + size
+ must be no more than HEADER_SIZE.
+
+ 16 - 19: image file name offset
+ Offset in the add-cow file at which the image file name
+ is stored (NB: The string is not NUL-terminated). It
+ must be between 80 and [HEADER_SIZE - 2]. Image file
+ name size + offset must be no more than HEADER_SIZE.
+
+ 20 - 23: image file name size
+ Length of the image file name in bytes.
+ Must be less than [HEADER_SIZE - 80] to fit in the reserved
+ part of the header.
+
+ 24 - 27: cluster bits
+ Number of bits that are used for addressing an offset
+ within a cluster (1 << cluster_bits is the cluster size).
+ Must not be less than 9 (i.e. 512 byte clusters).
+
+ Note: qemu as of today has an implementation limit of 2 MB
+ as the maximum cluster size and won't be able to open images
+ with larger cluster sizes.
+
+ 28 - 35: features
+ Bitmask of features. If a feature bit is set that cannot
+ be recognized, the add-cow file should be dropped. No
+ feature bits are used in v1.
+
+ Bits 0-63: Reserved (set to 0)
+
+ 36 - 43: compatible features
+ Bitmask of compatible features. An implementation can
+ safely ignore any unknown bits that are set.
+ Bit 0: All allocated bit. If this bit is set then
+ backing file and COW bitmap will not be used,
+ and the image file is read and written directly.
+
+ Bits 1-63: Reserved (set to 0)
+
+ 44 - 47: HEADER_SIZE
+ The header field is variable-sized. This field indicates
+ how many bytes will be used to store add-cow header.
+ In add-cow v1, it is fixed to 4096.
+
+ 48 - 63: backing file format
+ Format of backing file. It will be filled with 0 if
+ backing file name offset is 0. If backing file name
+ offset is non-zero, it must be non-empty. It is coded
+ in free-form ASCII, and is not NUL-terminated. Zero
+ padded on the right.
+
+ 64 - 79: image file format
+ Format of image file. It must be non-empty. It is coded
+ in free-form ASCII, and is not NUL-terminated. Zero
+ padded on the right.
+
+ 80 - [HEADER_SIZE - 1]:
+ Padding that makes the COW bitmap field start at byte
+ HEADER_SIZE. The backing file name and image file name
+ are stored here. Bytes that are not part of the
+ backing file and image file names must be set to 0.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at offset HEADER_SIZE and stores a bitmap related
+to the backing file and image file. The bitmap tracks whether each cluster is
+dirty, i.e. has been written to the image file.
+
+Each bit in the bitmap tracks one cluster's status. For example, if cluster
+bits is 16, then each bit tracks one cluster of (1 << 16) = 65536 bytes. The
+image file size is rounded up to cluster size (where any bytes in the
+last cluster that do not fit in the image are ignored). If the
+number of clusters is not a multiple of 8, the remaining bits in the
+bitmap will be set to 0.
+
+The size of bitmap is calculated according to virtual size of image file, and
+the size of bitmap should be a multiple of add-cow file's cluster size; the bits
+not used will be set to 0. Within each byte, the least significant bit covers
+the first cluster. Bit orders in one byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, it indicates the cluster has not been allocated in image file,
+data should be loaded from backing file while reading; if the bit is 1, it
+indicates the related cluster is dirty and should be loaded from image file
+while reading. Writing to a cluster causes the corresponding bit to be set to 1.
+If there is no backing file, or if the image file is larger than the backing
+file and the offset is beyond the end of the backing file, then the data should
+be read as all zero bytes instead.
+
+If raw image is not an even multiple of cluster bytes, bits that correspond to
+bytes beyond the raw file size in add-cow must be written as 0 and must be
+ignored when reading.
+
+Image file name and backing file name must NOT be the same; we prevent this
+while creating add-cow files via qemu-img. If image file name and backing file
+name are the same, the add-cow image must be treated as invalid.
--
1.7.1
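The header constraints and the bitmap sizing rule from the specification above can be sketched as a pair of checks. This is a reading of the text, not the code in block/add-cow.c: the struct and function names are hypothetical, the fields are assumed to be already converted from little endian, the magic check is left out, the upper bound of 21 on cluster bits reflects the 2 MB qemu implementation limit rather than the format itself, and rounding the bitmap up to a whole number of clusters is one interpretation of "multiple of add-cow file's cluster size".

```c
#include <stdbool.h>
#include <stdint.h>

#define NAME_AREA_START 80u   /* first byte after the fixed header fields */

/* Hypothetical in-memory view of the header, already in host byte order. */
struct add_cow_header_sketch {
    uint32_t version;
    uint32_t backing_offset, backing_size;
    uint32_t image_offset, image_size;
    uint32_t cluster_bits;
    uint64_t features;
    uint32_t header_size;
};

/* A file name must lie entirely inside bytes 80 .. header_size - 1. */
static bool name_fits(uint32_t off, uint32_t len, uint32_t header_size)
{
    return off >= NAME_AREA_START && off <= header_size - 2 &&
           len >= 1 && len < header_size - NAME_AREA_START &&
           (uint64_t)off + len <= header_size;
}

static bool header_is_valid(const struct add_cow_header_sketch *h)
{
    if (h->version != 1 || h->header_size != 4096) {
        return false;
    }
    if (h->features != 0) {           /* v1 defines no feature bits */
        return false;
    }
    if (h->cluster_bits < 9 || h->cluster_bits > 21) {
        return false;                 /* 512 bytes .. 2 MB (qemu limit) */
    }
    if (!name_fits(h->image_offset, h->image_size, h->header_size)) {
        return false;
    }
    if (h->backing_offset == 0) {     /* no backing file: size must be 0 */
        return h->backing_size == 0;
    }
    return name_fits(h->backing_offset, h->backing_size, h->header_size);
}

/* One bit per cluster, rounded up to whole bytes, then to whole clusters. */
static uint64_t bitmap_bytes(uint64_t image_size, uint32_t cluster_bits)
{
    uint64_t cluster_size = UINT64_C(1) << cluster_bits;
    uint64_t clusters = (image_size + cluster_size - 1) >> cluster_bits;
    uint64_t bytes = (clusters + 7) / 8;

    return (bytes + cluster_size - 1) / cluster_size * cluster_size;
}
```

For the 8G test.raw from the usage example with 64 KB clusters, this gives 131072 clusters, 16384 bitmap bytes, and a bitmap field of one 65536-byte cluster.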
* [Qemu-devel] [PATCH V14 2/6] make path_has_protocol non static
From: Dong Xu Wang @ 2012-10-25 13:36 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
We will use path_has_protocol outside block.c, so just make it public.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
---
block.c | 2 +-
block.h | 1 +
2 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/block.c b/block.c
index f639655..03ba485 100644
--- a/block.c
+++ b/block.c
@@ -198,7 +198,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
}
/* check if the path starts with "<protocol>:" */
-static int path_has_protocol(const char *path)
+int path_has_protocol(const char *path)
{
const char *p;
diff --git a/block.h b/block.h
index 7842d85..364ba04 100644
--- a/block.h
+++ b/block.h
@@ -329,6 +329,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
char *get_human_readable_size(char *buf, int buf_size, int64_t size);
int path_is_absolute(const char *path);
+int path_has_protocol(const char *path);
void path_combine(char *dest, int dest_size,
const char *base_path,
const char *filename);
--
1.7.1
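For readers without block.c at hand, the check that the comment "check if the path starts with <protocol>:" describes can be sketched as below. The hunk only shows the declaration of p, and the next patch's context shows the final return; the strcspn() scan in between is an assumption about the elided body, covering the non-Windows case only, and the function is renamed here so it is not mistaken for the QEMU symbol.

```c
#include <string.h>

/*
 * Sketch of a "<protocol>:" prefix test. Assumption: scan to the first
 * ':' or '/' and report a protocol only if the ':' comes first, so that
 * a colon inside a later path component does not count.
 */
static int path_has_protocol_sketch(const char *path)
{
    const char *p = path + strcspn(path, ":/");

    return *p == ':';
}
```

Under this sketch "nbd:localhost:10809" is treated as having a protocol, while "/var/lib/images/a:b.raw" and "test.raw" are not.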
* [Qemu-devel] [PATCH V14 3/6] qed_read_string to bdrv_read_string
From: Dong Xu Wang @ 2012-10-25 13:36 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
Make qed_read_string a common interface: move it to block.c and rename it to
bdrv_read_string.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
block.c | 27 +++++++++++++++++++++++++++
block.h | 2 ++
block/qed.c | 34 ++++------------------------------
3 files changed, 33 insertions(+), 30 deletions(-)
diff --git a/block.c b/block.c
index 03ba485..9afb7b5 100644
--- a/block.c
+++ b/block.c
@@ -215,6 +215,33 @@ int path_has_protocol(const char *path)
return *p == ':';
}
+/**
+ * Read a string of known length from the image file
+ *
+ * @bs: Image file
+ * @offset: File offset to start of string, in bytes
+ * @n: String length in bytes
+ * @buf: Destination buffer
+ * @buflen: Destination buffer length in bytes
+ * @ret: 0 on success, -errno on failure
+ *
+ * The string is NUL-terminated.
+ */
+int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n,
+ char *buf, size_t buflen)
+{
+ int ret;
+ if (n >= buflen) {
+ return -EINVAL;
+ }
+ ret = bdrv_pread(bs, offset, buf, n);
+ if (ret < 0) {
+ return ret;
+ }
+ buf[n] = '\0';
+ return 0;
+}
+
int path_is_absolute(const char *path)
{
#ifdef _WIN32
diff --git a/block.h b/block.h
index 364ba04..166e00c 100644
--- a/block.h
+++ b/block.h
@@ -168,6 +168,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
const void *buf, int count);
int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
+int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n,
+ char *buf, size_t buflen);
int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
diff --git a/block/qed.c b/block/qed.c
index 0a9dbe8..096de21 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size,
}
/**
- * Read a string of known length from the image file
- *
- * @file: Image file
- * @offset: File offset to start of string, in bytes
- * @n: String length in bytes
- * @buf: Destination buffer
- * @buflen: Destination buffer length in bytes
- * @ret: 0 on success, -errno on failure
- *
- * The string is NUL-terminated.
- */
-static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n,
- char *buf, size_t buflen)
-{
- int ret;
- if (n >= buflen) {
- return -EINVAL;
- }
- ret = bdrv_pread(file, offset, buf, n);
- if (ret < 0) {
- return ret;
- }
- buf[n] = '\0';
- return 0;
-}
-
-/**
* Allocate new clusters
*
* @s: QED state
@@ -437,9 +410,10 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
return -EINVAL;
}
- ret = qed_read_string(bs->file, s->header.backing_filename_offset,
- s->header.backing_filename_size, bs->backing_file,
- sizeof(bs->backing_file));
+ ret = bdrv_read_string(bs->file, s->header.backing_filename_offset,
+ s->header.backing_filename_size,
+ bs->backing_file,
+ sizeof(bs->backing_file));
if (ret < 0) {
return ret;
}
--
1.7.1
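The contract of bdrv_read_string() can be exercised outside QEMU with a small stand-in. In the sketch below, struct mem_file and mem_pread() are hypothetical replacements for BlockDriverState and bdrv_pread(); only the length check, the read, and the NUL termination mirror the function moved by this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a block device: a byte array in memory. */
struct mem_file {
    const uint8_t *data;
    size_t len;
};

/* Stand-in for bdrv_pread(): copy n bytes or fail with -EIO. */
static int mem_pread(const struct mem_file *f, uint64_t offset,
                     void *buf, size_t n)
{
    if (offset > f->len || n > f->len - offset) {
        return -EIO;
    }
    memcpy(buf, f->data + offset, n);
    return (int)n;
}

/* Same shape as bdrv_read_string(): n bytes plus a terminating NUL. */
static int read_string(const struct mem_file *f, uint64_t offset, size_t n,
                       char *buf, size_t buflen)
{
    int ret;

    if (n >= buflen) {
        return -EINVAL;     /* no room left for the terminator */
    }
    ret = mem_pread(f, offset, buf, n);
    if (ret < 0) {
        return ret;
    }
    buf[n] = '\0';
    return 0;
}
```

Because the names stored in the add-cow header are not NUL-terminated on disk, the caller's buffer has to be at least one byte larger than the stored length; a buffer of exactly n bytes is rejected with -EINVAL.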
* [Qemu-devel] [PATCH V14 4/6] rename qcow2-cache.c to block-cache.c
From: Dong Xu Wang @ 2012-10-25 13:36 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
We will reuse qcow2-cache as common block layer cache code, so rename it
and make some changes: define an enum named BlockTableType, and pass the
BlockTableType and the table size (cluster_size) as parameters to the
block cache initialization function.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
block/Makefile.objs | 3 +-
block/block-cache.c | 317 +++++++++++++++++++++++++++++++++++++++++++++++
block/block-cache.h | 76 +++++++++++
block/qcow2-cache.c | 323 ------------------------------------------------
block/qcow2-cluster.c | 53 ++++----
block/qcow2-refcount.c | 66 ++++++-----
block/qcow2.c | 21 ++--
block/qcow2.h | 24 +---
trace-events | 13 +-
9 files changed, 481 insertions(+), 415 deletions(-)
create mode 100644 block/block-cache.c
create mode 100644 block/block-cache.h
delete mode 100644 block/qcow2-cache.c
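The replacement policy that this patch carries over unchanged from qcow2-cache.c can be tried in isolation. The sketch below keeps the logic of block_cache_find_entry_to_replace() from the diff that follows, with two labeled differences: the entry struct is cut down to the two fields the policy reads (a hypothetical struct entry_sketch), and it returns -1 instead of calling abort() when every entry is referenced.

```c
#include <limits.h>

/* Hypothetical reduced entry: only what the replacement policy needs. */
struct entry_sketch {
    int cache_hits;
    int ref;
};

/*
 * Pick the unreferenced entry with the fewest hits, and halve the hit
 * count of every unreferenced entry so that newer hits weigh more.
 * Returns -1 if all entries are in use (the patch aborts in that case).
 */
static int find_entry_to_replace(struct entry_sketch *e, int size)
{
    int i;
    int min_count = INT_MAX;
    int min_index = -1;

    for (i = 0; i < size; i++) {
        if (e[i].ref) {
            continue;       /* still held by a caller, never evicted */
        }
        if (e[i].cache_hits < min_count) {
            min_index = i;
            min_count = e[i].cache_hits;
        }
        e[i].cache_hits /= 2;
    }
    return min_index;
}
```

Note that the comparison uses each entry's hit count before it is halved, and that referenced entries keep their count untouched.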
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 554f429..f128b78 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
+block-obj-y += block-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
diff --git a/block/block-cache.c b/block/block-cache.c
new file mode 100644
index 0000000..bf5c57c
--- /dev/null
+++ b/block/block-cache.c
@@ -0,0 +1,317 @@
+/*
+ * QEMU Block Layer Cache
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ * Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This file is based on qcow2-cache.c, see its copyrights below:
+ *
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block_int.h"
+#include "qemu-common.h"
+#include "trace.h"
+#include "block-cache.h"
+
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables,
+ size_t cluster_size, BlockTableType type)
+{
+ BlockCache *c;
+ int i;
+
+ c = g_malloc0(sizeof(*c));
+ c->size = num_tables;
+ c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
+ c->table_type = type;
+ c->cluster_size = cluster_size;
+
+ for (i = 0; i < c->size; i++) {
+ c->entries[i].table = qemu_blockalign(bs, cluster_size);
+ }
+
+ return c;
+}
+
+int block_cache_destroy(BlockDriverState *bs, BlockCache *c)
+{
+ int i;
+
+ for (i = 0; i < c->size; i++) {
+ assert(c->entries[i].ref == 0);
+ qemu_vfree(c->entries[i].table);
+ }
+
+ g_free(c->entries);
+ g_free(c);
+
+ return 0;
+}
+
+static int block_cache_flush_dependency(BlockDriverState *bs, BlockCache *c)
+{
+ int ret;
+
+ ret = block_cache_flush(bs, c->depends);
+ if (ret < 0) {
+ return ret;
+ }
+
+ c->depends = NULL;
+ c->depends_on_flush = false;
+
+ return 0;
+}
+
+static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
+{
+ int ret = 0;
+
+ if (!c->entries[i].dirty || !c->entries[i].offset) {
+ return 0;
+ }
+
+ trace_block_cache_entry_flush(qemu_coroutine_self(), c->table_type, i);
+
+ if (c->depends) {
+ ret = block_cache_flush_dependency(bs, c);
+ } else if (c->depends_on_flush) {
+ ret = bdrv_flush(bs->file);
+ if (ret >= 0) {
+ c->depends_on_flush = false;
+ }
+ }
+
+ if (ret < 0) {
+ return ret;
+ }
+
+ if (c->table_type == BLOCK_TABLE_REF) {
+ BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
+ } else if (c->table_type == BLOCK_TABLE_L2) {
+ BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
+ }
+
+ ret = bdrv_pwrite(bs->file, c->entries[i].offset,
+ c->entries[i].table, c->cluster_size);
+ if (ret < 0) {
+ return ret;
+ }
+
+ c->entries[i].dirty = false;
+
+ return 0;
+}
+
+int block_cache_flush(BlockDriverState *bs, BlockCache *c)
+{
+ int result = 0;
+ int ret;
+ int i;
+
+ trace_block_cache_flush(qemu_coroutine_self(), c->table_type);
+
+ for (i = 0; i < c->size; i++) {
+ ret = block_cache_entry_flush(bs, c, i);
+ if (ret < 0 && result != -ENOSPC) {
+ result = ret;
+ }
+ }
+
+ if (result == 0) {
+ ret = bdrv_flush(bs->file);
+ if (ret < 0) {
+ result = ret;
+ }
+ }
+
+ return result;
+}
+
+int block_cache_set_dependency(BlockDriverState *bs,
+ BlockCache *c,
+ BlockCache *dependency)
+{
+ int ret;
+
+ if (dependency->depends) {
+ ret = block_cache_flush_dependency(bs, dependency);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ if (c->depends && (c->depends != dependency)) {
+ ret = block_cache_flush_dependency(bs, c);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ c->depends = dependency;
+ return 0;
+}
+
+void block_cache_depends_on_flush(BlockCache *c)
+{
+ c->depends_on_flush = true;
+}
+
+static int block_cache_find_entry_to_replace(BlockCache *c)
+{
+ int i;
+ int min_count = INT_MAX;
+ int min_index = -1;
+
+
+ for (i = 0; i < c->size; i++) {
+ if (c->entries[i].ref) {
+ continue;
+ }
+
+ if (c->entries[i].cache_hits < min_count) {
+ min_index = i;
+ min_count = c->entries[i].cache_hits;
+ }
+
+ /* Give newer hits priority */
+ /* TODO Check how to optimize the replacement strategy */
+ c->entries[i].cache_hits /= 2;
+ }
+
+ if (min_index == -1) {
+ /* This can't happen in current synchronous code, but leave the check
+ * here as a reminder for whoever starts using AIO with the cache */
+ abort();
+ }
+ return min_index;
+}
+
+static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
+ uint64_t offset, void **table,
+ bool read_from_disk)
+{
+ int i;
+ int ret;
+
+ trace_block_cache_get(qemu_coroutine_self(), c->table_type,
+ offset, read_from_disk);
+
+ /* Check if the table is already cached */
+ for (i = 0; i < c->size; i++) {
+ if (c->entries[i].offset == offset) {
+ goto found;
+ }
+ }
+
+ /* If not, write a table back and replace it */
+ i = block_cache_find_entry_to_replace(c);
+ trace_block_cache_get_replace_entry(qemu_coroutine_self(),
+ c->table_type, i);
+ if (i < 0) {
+ return i;
+ }
+
+ ret = block_cache_entry_flush(bs, c, i);
+ if (ret < 0) {
+ return ret;
+ }
+
+ trace_block_cache_get_read(qemu_coroutine_self(),
+ c->table_type, i);
+ c->entries[i].offset = 0;
+ if (read_from_disk) {
+ if (c->table_type == BLOCK_TABLE_L2) {
+ BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
+ }
+
+ ret = bdrv_pread(bs->file, offset, c->entries[i].table,
+ c->cluster_size);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ /* Give the table some hits for the start so that it won't be replaced
+ * immediately. The number 32 is completely arbitrary. */
+ c->entries[i].cache_hits = 32;
+ c->entries[i].offset = offset;
+
+ /* And return the right table */
+found:
+ c->entries[i].cache_hits++;
+ c->entries[i].ref++;
+ *table = c->entries[i].table;
+
+ trace_block_cache_get_done(qemu_coroutine_self(),
+ c->table_type, i);
+
+ return 0;
+}
+
+int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
+ void **table)
+{
+ return block_cache_do_get(bs, c, offset, table, true);
+}
+
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c,
+ uint64_t offset, void **table)
+{
+ return block_cache_do_get(bs, c, offset, table, false);
+}
+
+int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table)
+{
+ int i;
+
+ for (i = 0; i < c->size; i++) {
+ if (c->entries[i].table == *table) {
+ goto found;
+ }
+ }
+ return -ENOENT;
+
+found:
+ c->entries[i].ref--;
+ assert(c->entries[i].ref >= 0);
+ *table = NULL;
+ return 0;
+}
+
+void block_cache_entry_mark_dirty(BlockCache *c, void *table)
+{
+ int i;
+
+ for (i = 0; i < c->size; i++) {
+ if (c->entries[i].table == table) {
+ goto found;
+ }
+ }
+ abort();
+
+found:
+ c->entries[i].dirty = true;
+}
diff --git a/block/block-cache.h b/block/block-cache.h
new file mode 100644
index 0000000..4efa06e
--- /dev/null
+++ b/block/block-cache.h
@@ -0,0 +1,76 @@
+/*
+ * QEMU Block Layer Cache
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ * Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This file is based on qcow2-cache.c, see its copyrights below:
+ *
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_CACHE_H
+#define BLOCK_CACHE_H
+
+typedef enum {
+ BLOCK_TABLE_REF,
+ BLOCK_TABLE_L2,
+} BlockTableType;
+
+typedef struct BlockCachedTable {
+ void *table;
+ int64_t offset;
+ bool dirty;
+ int cache_hits;
+ int ref;
+} BlockCachedTable;
+
+struct BlockCache {
+ BlockCachedTable *entries;
+ struct BlockCache *depends;
+ int size;
+ size_t cluster_size;
+ BlockTableType table_type;
+ bool depends_on_flush;
+};
+
+struct BlockCache;
+typedef struct BlockCache BlockCache;
+
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables,
+ size_t cluster_size, BlockTableType type);
+int block_cache_destroy(BlockDriverState *bs, BlockCache *c);
+int block_cache_flush(BlockDriverState *bs, BlockCache *c);
+int block_cache_set_dependency(BlockDriverState *bs,
+ BlockCache *c,
+ BlockCache *dependency);
+void block_cache_depends_on_flush(BlockCache *c);
+int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
+ void **table);
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c,
+ uint64_t offset, void **table);
+int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table);
+void block_cache_entry_mark_dirty(BlockCache *c, void *table);
+#endif /* BLOCK_CACHE_H */
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
deleted file mode 100644
index 2d4322a..0000000
--- a/block/qcow2-cache.c
+++ /dev/null
@@ -1,323 +0,0 @@
-/*
- * L2/refcount table cache for the QCOW2 format
- *
- * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to deal
- * in the Software without restriction, including without limitation the rights
- * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- * copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- * THE SOFTWARE.
- */
-
-#include "block_int.h"
-#include "qemu-common.h"
-#include "qcow2.h"
-#include "trace.h"
-
-typedef struct Qcow2CachedTable {
- void* table;
- int64_t offset;
- bool dirty;
- int cache_hits;
- int ref;
-} Qcow2CachedTable;
-
-struct Qcow2Cache {
- Qcow2CachedTable* entries;
- struct Qcow2Cache* depends;
- int size;
- bool depends_on_flush;
-};
-
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
-{
- BDRVQcowState *s = bs->opaque;
- Qcow2Cache *c;
- int i;
-
- c = g_malloc0(sizeof(*c));
- c->size = num_tables;
- c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
-
- for (i = 0; i < c->size; i++) {
- c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
- }
-
- return c;
-}
-
-int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
-{
- int i;
-
- for (i = 0; i < c->size; i++) {
- assert(c->entries[i].ref == 0);
- qemu_vfree(c->entries[i].table);
- }
-
- g_free(c->entries);
- g_free(c);
-
- return 0;
-}
-
-static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c)
-{
- int ret;
-
- ret = qcow2_cache_flush(bs, c->depends);
- if (ret < 0) {
- return ret;
- }
-
- c->depends = NULL;
- c->depends_on_flush = false;
-
- return 0;
-}
-
-static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
-{
- BDRVQcowState *s = bs->opaque;
- int ret = 0;
-
- if (!c->entries[i].dirty || !c->entries[i].offset) {
- return 0;
- }
-
- trace_qcow2_cache_entry_flush(qemu_coroutine_self(),
- c == s->l2_table_cache, i);
-
- if (c->depends) {
- ret = qcow2_cache_flush_dependency(bs, c);
- } else if (c->depends_on_flush) {
- ret = bdrv_flush(bs->file);
- if (ret >= 0) {
- c->depends_on_flush = false;
- }
- }
-
- if (ret < 0) {
- return ret;
- }
-
- if (c == s->refcount_block_cache) {
- BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
- } else if (c == s->l2_table_cache) {
- BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
- }
-
- ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
- s->cluster_size);
- if (ret < 0) {
- return ret;
- }
-
- c->entries[i].dirty = false;
-
- return 0;
-}
-
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
-{
- BDRVQcowState *s = bs->opaque;
- int result = 0;
- int ret;
- int i;
-
- trace_qcow2_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache);
-
- for (i = 0; i < c->size; i++) {
- ret = qcow2_cache_entry_flush(bs, c, i);
- if (ret < 0 && result != -ENOSPC) {
- result = ret;
- }
- }
-
- if (result == 0) {
- ret = bdrv_flush(bs->file);
- if (ret < 0) {
- result = ret;
- }
- }
-
- return result;
-}
-
-int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
- Qcow2Cache *dependency)
-{
- int ret;
-
- if (dependency->depends) {
- ret = qcow2_cache_flush_dependency(bs, dependency);
- if (ret < 0) {
- return ret;
- }
- }
-
- if (c->depends && (c->depends != dependency)) {
- ret = qcow2_cache_flush_dependency(bs, c);
- if (ret < 0) {
- return ret;
- }
- }
-
- c->depends = dependency;
- return 0;
-}
-
-void qcow2_cache_depends_on_flush(Qcow2Cache *c)
-{
- c->depends_on_flush = true;
-}
-
-static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
-{
- int i;
- int min_count = INT_MAX;
- int min_index = -1;
-
-
- for (i = 0; i < c->size; i++) {
- if (c->entries[i].ref) {
- continue;
- }
-
- if (c->entries[i].cache_hits < min_count) {
- min_index = i;
- min_count = c->entries[i].cache_hits;
- }
-
- /* Give newer hits priority */
- /* TODO Check how to optimize the replacement strategy */
- c->entries[i].cache_hits /= 2;
- }
-
- if (min_index == -1) {
- /* This can't happen in current synchronous code, but leave the check
- * here as a reminder for whoever starts using AIO with the cache */
- abort();
- }
- return min_index;
-}
-
-static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
- uint64_t offset, void **table, bool read_from_disk)
-{
- BDRVQcowState *s = bs->opaque;
- int i;
- int ret;
-
- trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
- offset, read_from_disk);
-
- /* Check if the table is already cached */
- for (i = 0; i < c->size; i++) {
- if (c->entries[i].offset == offset) {
- goto found;
- }
- }
-
- /* If not, write a table back and replace it */
- i = qcow2_cache_find_entry_to_replace(c);
- trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(),
- c == s->l2_table_cache, i);
- if (i < 0) {
- return i;
- }
-
- ret = qcow2_cache_entry_flush(bs, c, i);
- if (ret < 0) {
- return ret;
- }
-
- trace_qcow2_cache_get_read(qemu_coroutine_self(),
- c == s->l2_table_cache, i);
- c->entries[i].offset = 0;
- if (read_from_disk) {
- if (c == s->l2_table_cache) {
- BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
- }
-
- ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
- if (ret < 0) {
- return ret;
- }
- }
-
- /* Give the table some hits for the start so that it won't be replaced
- * immediately. The number 32 is completely arbitrary. */
- c->entries[i].cache_hits = 32;
- c->entries[i].offset = offset;
-
- /* And return the right table */
-found:
- c->entries[i].cache_hits++;
- c->entries[i].ref++;
- *table = c->entries[i].table;
-
- trace_qcow2_cache_get_done(qemu_coroutine_self(),
- c == s->l2_table_cache, i);
-
- return 0;
-}
-
-int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
- void **table)
-{
- return qcow2_cache_do_get(bs, c, offset, table, true);
-}
-
-int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
- void **table)
-{
- return qcow2_cache_do_get(bs, c, offset, table, false);
-}
-
-int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
-{
- int i;
-
- for (i = 0; i < c->size; i++) {
- if (c->entries[i].table == *table) {
- goto found;
- }
- }
- return -ENOENT;
-
-found:
- c->entries[i].ref--;
- *table = NULL;
-
- assert(c->entries[i].ref >= 0);
- return 0;
-}
-
-void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
-{
- int i;
-
- for (i = 0; i < c->size; i++) {
- if (c->entries[i].table == table) {
- goto found;
- }
- }
- abort();
-
-found:
- c->entries[i].dirty = true;
-}
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index e179211..b5a0960 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -69,7 +69,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
return new_l1_table_offset;
}
- ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ ret = block_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail;
}
@@ -119,7 +119,8 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
BDRVQcowState *s = bs->opaque;
int ret;
- ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
+ ret = block_cache_get(bs, s->l2_table_cache, l2_offset,
+ (void **) l2_table);
return ret;
}
@@ -180,7 +181,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
return l2_offset;
}
- ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ ret = block_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail;
}
@@ -188,7 +189,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
/* allocate a new entry in the l2 cache */
trace_qcow2_l2_allocate_get_empty(bs, l1_index);
- ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
+ ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset,
+ (void **) table);
if (ret < 0) {
return ret;
}
@@ -203,16 +205,16 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
/* if there was an old l2 table, read it from the disk */
BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
- ret = qcow2_cache_get(bs, s->l2_table_cache,
- old_l2_offset & L1E_OFFSET_MASK,
- (void**) &old_table);
+ ret = block_cache_get(bs, s->l2_table_cache,
+ old_l2_offset & L1E_OFFSET_MASK,
+ (void **) &old_table);
if (ret < 0) {
goto fail;
}
memcpy(l2_table, old_table, s->cluster_size);
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &old_table);
if (ret < 0) {
goto fail;
}
@@ -222,8 +224,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
trace_qcow2_l2_allocate_write_l2(bs, l1_index);
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
- ret = qcow2_cache_flush(bs, s->l2_table_cache);
+ block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ ret = block_cache_flush(bs, s->l2_table_cache);
if (ret < 0) {
goto fail;
}
@@ -242,7 +244,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
fail:
trace_qcow2_l2_allocate_done(bs, l1_index, ret);
- qcow2_cache_put(bs, s->l2_table_cache, (void**) table);
+ block_cache_put(bs, s->l2_table_cache, (void **) table);
s->l1_table[l1_index] = old_l2_offset;
return ret;
}
@@ -475,7 +477,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
abort();
}
- qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
nb_available = (c * s->cluster_sectors);
@@ -584,13 +586,13 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
* allocated. */
cluster_offset = be64_to_cpu(l2_table[l2_index]);
if (cluster_offset & L2E_OFFSET_MASK) {
- qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
return 0;
}
cluster_offset = qcow2_alloc_bytes(bs, compressed_size);
if (cluster_offset < 0) {
- qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
return 0;
}
@@ -605,9 +607,9 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
/* compressed clusters never have the copied flag */
BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
l2_table[l2_index] = cpu_to_be64(cluster_offset);
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
return 0;
}
@@ -659,18 +661,19 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
* handled.
*/
if (cow) {
- qcow2_cache_depends_on_flush(s->l2_table_cache);
+ block_cache_depends_on_flush(s->l2_table_cache);
}
+
if (qcow2_need_accurate_refcounts(s)) {
- qcow2_cache_set_dependency(bs, s->l2_table_cache,
+ block_cache_set_dependency(bs, s->l2_table_cache,
s->refcount_block_cache);
}
ret = get_cluster_table(bs, m->offset, &l2_table, &l2_index);
if (ret < 0) {
goto err;
}
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
for (i = 0; i < m->nb_clusters; i++) {
/* if two concurrent writes happen to the same unallocated cluster
@@ -687,7 +690,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
}
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
goto err;
}
@@ -913,7 +916,7 @@ again:
* request to complete. If we still had the reference, we could use up the
* whole cache with sleeping requests.
*/
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
return ret;
}
@@ -1077,14 +1080,14 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
}
/* First remove L2 entries */
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
l2_table[l2_index + i] = cpu_to_be64(0);
/* Then decrease the refcount */
qcow2_free_any_clusters(bs, old_offset, 1);
}
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
return ret;
}
@@ -1154,7 +1157,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
old_offset = be64_to_cpu(l2_table[l2_index + i]);
/* Update L2 entries */
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
if (old_offset & QCOW_OFLAG_COMPRESSED) {
l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
qcow2_free_any_clusters(bs, old_offset, 1);
@@ -1163,7 +1166,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
}
}
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
return ret;
}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 5e3f915..0b0c2e2 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -71,8 +71,8 @@ static int load_refcount_block(BlockDriverState *bs,
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD);
- ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
- refcount_block);
+ ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
+ refcount_block);
return ret;
}
@@ -98,8 +98,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
if (!refcount_block_offset)
return 0;
- ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
- (void**) &refcount_block);
+ ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
+ (void **) &refcount_block);
if (ret < 0) {
return ret;
}
@@ -108,8 +108,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
refcount = be16_to_cpu(refcount_block[block_index]);
- ret = qcow2_cache_put(bs, s->refcount_block_cache,
- (void**) &refcount_block);
+ ret = block_cache_put(bs, s->refcount_block_cache,
+ (void **) &refcount_block);
if (ret < 0) {
return ret;
}
@@ -201,7 +201,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
*refcount_block = NULL;
/* We write to the refcount table, so we might depend on L2 tables */
- qcow2_cache_flush(bs, s->l2_table_cache);
+ block_cache_flush(bs, s->l2_table_cache);
/* Allocate the refcount block itself and mark it as used */
int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
@@ -217,8 +217,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) {
/* Zero the new refcount block before updating it */
- ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block,
- (void**) refcount_block);
+ ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
+ (void **) refcount_block);
if (ret < 0) {
goto fail_block;
}
@@ -241,8 +241,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
/* Initialize the new refcount block only after updating its refcount,
* update_refcount uses the refcount cache itself */
- ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block,
- (void**) refcount_block);
+ ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
+ (void **) refcount_block);
if (ret < 0) {
goto fail_block;
}
@@ -252,8 +252,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
/* Now the new refcount block needs to be written to disk */
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_WRITE);
- qcow2_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block);
- ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ block_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block);
+ ret = block_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail_block;
}
@@ -273,7 +273,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
return 0;
}
- ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+ ret = block_cache_put(bs, s->refcount_block_cache,
+ (void **) refcount_block);
if (ret < 0) {
goto fail_block;
}
@@ -406,7 +407,8 @@ fail_table:
g_free(new_table);
fail_block:
if (*refcount_block != NULL) {
- qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+ block_cache_put(bs, s->refcount_block_cache,
+ (void **) refcount_block);
}
return ret;
}
@@ -432,8 +434,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
}
if (addend < 0) {
- qcow2_cache_set_dependency(bs, s->refcount_block_cache,
- s->l2_table_cache);
+ block_cache_set_dependency(bs, s->refcount_block_cache,
+ s->l2_table_cache);
}
start = offset & ~(s->cluster_size - 1);
@@ -449,8 +451,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
/* Load the refcount block and allocate it if needed */
if (table_index != old_table_index) {
if (refcount_block) {
- ret = qcow2_cache_put(bs, s->refcount_block_cache,
- (void**) &refcount_block);
+ ret = block_cache_put(bs, s->refcount_block_cache,
+ (void **) &refcount_block);
if (ret < 0) {
goto fail;
}
@@ -463,7 +465,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
}
old_table_index = table_index;
- qcow2_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block);
+ block_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block);
/* we can update the count and save it */
block_index = cluster_index &
@@ -486,8 +488,8 @@ fail:
/* Write last changed block to disk */
if (refcount_block) {
int wret;
- wret = qcow2_cache_put(bs, s->refcount_block_cache,
- (void**) &refcount_block);
+ wret = block_cache_put(bs, s->refcount_block_cache,
+ (void **) &refcount_block);
if (wret < 0) {
return ret < 0 ? ret : wret;
}
@@ -763,8 +765,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
old_l2_offset = l2_offset;
l2_offset &= L1E_OFFSET_MASK;
- ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
- (void**) &l2_table);
+ ret = block_cache_get(bs, s->l2_table_cache, l2_offset,
+ (void **) &l2_table);
if (ret < 0) {
goto fail;
}
@@ -811,16 +813,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
}
if (offset != old_offset) {
if (addend > 0) {
- qcow2_cache_set_dependency(bs, s->l2_table_cache,
+ block_cache_set_dependency(bs, s->l2_table_cache,
s->refcount_block_cache);
}
l2_table[j] = cpu_to_be64(offset);
- qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ block_cache_entry_mark_dirty(s->l2_table_cache,
+ l2_table);
}
}
}
- ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ ret = block_cache_put(bs, s->l2_table_cache,
+ (void **) &l2_table);
if (ret < 0) {
goto fail;
}
@@ -847,7 +851,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
ret = 0;
fail:
if (l2_table) {
- qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ block_cache_put(bs, s->l2_table_cache,
+ (void **) &l2_table);
}
/* Update L1 only if it isn't deleted anyway (addend = -1) */
@@ -1130,8 +1135,9 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
0, s->cluster_size);
/* current L1 table */
- ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
- s->l1_table_offset, s->l1_size, 1);
+ ret = check_refcounts_l1(bs, res, refcount_table,
+ nb_clusters, s->l1_table_offset,
+ s->l1_size, 1);
if (ret < 0) {
goto fail;
}
diff --git a/block/qcow2.c b/block/qcow2.c
index 54bda70..3f4c09e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -430,8 +430,11 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
/* alloc L2 table/refcount block cache */
- s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
- s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
+ s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE,
+ s->cluster_size, BLOCK_TABLE_L2);
+ s->refcount_block_cache = block_cache_create(bs, REFCOUNT_CACHE_SIZE,
+ s->cluster_size,
+ BLOCK_TABLE_REF);
s->cluster_cache = g_malloc(s->cluster_size);
/* one more sector for decompressed data alignment */
@@ -510,7 +513,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
qcow2_refcount_close(bs);
g_free(s->l1_table);
if (s->l2_table_cache) {
- qcow2_cache_destroy(bs, s->l2_table_cache);
+ block_cache_destroy(bs, s->l2_table_cache);
}
g_free(s->cluster_cache);
qemu_vfree(s->cluster_data);
@@ -878,13 +881,13 @@ static void qcow2_close(BlockDriverState *bs)
BDRVQcowState *s = bs->opaque;
g_free(s->l1_table);
- qcow2_cache_flush(bs, s->l2_table_cache);
- qcow2_cache_flush(bs, s->refcount_block_cache);
+ block_cache_flush(bs, s->l2_table_cache);
+ block_cache_flush(bs, s->refcount_block_cache);
qcow2_mark_clean(bs);
- qcow2_cache_destroy(bs, s->l2_table_cache);
- qcow2_cache_destroy(bs, s->refcount_block_cache);
+ block_cache_destroy(bs, s->l2_table_cache);
+ block_cache_destroy(bs, s->refcount_block_cache);
g_free(s->unknown_header_fields);
cleanup_unknown_header_ext(bs);
@@ -1552,14 +1555,14 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
int ret;
qemu_co_mutex_lock(&s->lock);
- ret = qcow2_cache_flush(bs, s->l2_table_cache);
+ ret = block_cache_flush(bs, s->l2_table_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
}
if (qcow2_need_accurate_refcounts(s)) {
- ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+ ret = block_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index b4eb654..cb6fd7a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
#include "aes.h"
#include "qemu-coroutine.h"
+#include "block-cache.h"
//#define DEBUG_ALLOC
//#define DEBUG_ALLOC2
@@ -94,8 +95,6 @@ typedef struct QCowSnapshot {
uint64_t vm_clock_nsec;
} QCowSnapshot;
-struct Qcow2Cache;
-typedef struct Qcow2Cache Qcow2Cache;
typedef struct Qcow2UnknownHeaderExtension {
uint32_t magic;
@@ -146,8 +145,8 @@ typedef struct BDRVQcowState {
uint64_t l1_table_offset;
uint64_t *l1_table;
- Qcow2Cache* l2_table_cache;
- Qcow2Cache* refcount_block_cache;
+ BlockCache *l2_table_cache;
+ BlockCache *refcount_block_cache;
uint8_t *cluster_cache;
uint8_t *cluster_data;
@@ -316,21 +315,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
void qcow2_free_snapshots(BlockDriverState *bs);
int qcow2_read_snapshots(BlockDriverState *bs);
-
-/* qcow2-cache.c functions */
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
-int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
-
-void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
-int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
- Qcow2Cache *dependency);
-void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-
-int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
- void **table);
-int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
- void **table);
-int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
-
#endif
diff --git a/trace-events b/trace-events
index e2d4580..f53a8f5 100644
--- a/trace-events
+++ b/trace-events
@@ -454,12 +454,13 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d"
-qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
-qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d"
-qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+# block/block-cache.c
+block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
+block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_flush(void *co, int c) "co %p is_l2_cache %d"
+block_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d"
# block/qed-l2-cache.c
qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p"
--
1.7.1
* [Qemu-devel] [PATCH V14 5/6] add-cow file format core code.
2012-10-25 13:36 [Qemu-devel] [PATCH V14 0/6] add-cow file format Dong Xu Wang
` (3 preceding siblings ...)
2012-10-25 13:36 ` [Qemu-devel] [PATCH V14 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang
@ 2012-10-25 13:36 ` Dong Xu Wang
2012-10-25 13:36 ` [Qemu-devel] [PATCH V14 6/6] qemu-iotests: add add-cow iotests support Dong Xu Wang
5 siblings, 0 replies; 10+ messages in thread
From: Dong Xu Wang @ 2012-10-25 13:36 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
add-cow file format core code. It uses block-cache.c as its cache code.
It does not yet support snapshot_blkdev.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
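Note for reviewers (not part of the commit): the allocation bitmap uses one bit per cluster, which is the arithmetic behind bitmap_size in add_cow_open() and the bit test in is_allocated(). Below is a minimal standalone sketch of that arithmetic. It assumes a flat in-memory bitmap rather than the block-cache entries the driver actually reads through, and the names SECTOR_SIZE, bitmap_bytes and cluster_bit_is_set are illustrative only; they do not exist in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* assumed equal to QEMU's BDRV_SECTOR_SIZE */

/* Bytes of bitmap needed to cover total_sectors, one bit per cluster.
 * Mirrors the rounding-up division used for bitmap_size in the patch. */
static uint64_t bitmap_bytes(uint64_t total_sectors, uint32_t cluster_size)
{
    assert(cluster_size >= SECTOR_SIZE);
    uint64_t sectors_per_byte = (uint64_t)(cluster_size / SECTOR_SIZE) * 8;
    return (total_sectors + sectors_per_byte - 1) / sectors_per_byte;
}

/* Test the bit of the cluster that contains sector_num.
 * Hypothetical helper: indexes a flat bitmap, not a cache entry. */
static bool cluster_bit_is_set(const uint8_t *bitmap, uint64_t sector_num,
                               uint32_t cluster_size)
{
    uint64_t cluster_num = sector_num / (cluster_size / SECTOR_SIZE);
    return (bitmap[cluster_num / 8] >> (cluster_num % 8)) & 1;
}
```

With 64 KiB clusters each bitmap byte covers 1024 sectors, so a 1 GiB image (2097152 sectors) needs 2048 bytes of bitmap. The driver itself additionally reduces the byte index modulo ADD_COW_CACHE_ENTRY_SIZE, because it looks the byte up inside a single cached table.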
block/Makefile.objs | 1 +
block/add-cow.c | 692 +++++++++++++++++++++++++++++++++++++++++++++++++++
block/add-cow.h | 83 ++++++
block/block-cache.c | 4 +
block/block-cache.h | 1 +
block_int.h | 2 +
6 files changed, 783 insertions(+), 0 deletions(-)
create mode 100644 block/add-cow.c
create mode 100644 block/add-cow.h
diff --git a/block/Makefile.objs b/block/Makefile.objs
index f128b78..ed9222d 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
+block-obj-y += add-cow.o
block-obj-y += block-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
diff --git a/block/add-cow.c b/block/add-cow.c
new file mode 100644
index 0000000..ad6e22e
--- /dev/null
+++ b/block/add-cow.c
@@ -0,0 +1,692 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ * Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "block_int.h"
+#include "module.h"
+#include "add-cow.h"
+
+static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu)
+{
+ cpu->magic = le32_to_cpu(le->magic);
+ cpu->version = le32_to_cpu(le->version);
+
+ cpu->backing_offset = le32_to_cpu(le->backing_offset);
+ cpu->backing_size = le32_to_cpu(le->backing_size);
+ cpu->image_offset = le32_to_cpu(le->image_offset);
+ cpu->image_size = le32_to_cpu(le->image_size);
+
+ cpu->cluster_bits = le32_to_cpu(le->cluster_bits);
+ cpu->features = le64_to_cpu(le->features);
+ cpu->compat_features = le64_to_cpu(le->compat_features);
+ cpu->header_size = le32_to_cpu(le->header_size);
+
+ memcpy(cpu->backing_fmt, le->backing_fmt, sizeof(cpu->backing_fmt));
+ memcpy(cpu->image_fmt, le->image_fmt, sizeof(cpu->image_fmt));
+}
+
+static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le)
+{
+ le->magic = cpu_to_le32(cpu->magic);
+ le->version = cpu_to_le32(cpu->version);
+
+ le->backing_offset = cpu_to_le32(cpu->backing_offset);
+ le->backing_size = cpu_to_le32(cpu->backing_size);
+ le->image_offset = cpu_to_le32(cpu->image_offset);
+ le->image_size = cpu_to_le32(cpu->image_size);
+
+ le->cluster_bits = cpu_to_le32(cpu->cluster_bits);
+ le->features = cpu_to_le64(cpu->features);
+ le->compat_features = cpu_to_le64(cpu->compat_features);
+ le->header_size = cpu_to_le32(cpu->header_size);
+ memcpy(le->backing_fmt, cpu->backing_fmt, sizeof(le->backing_fmt));
+ memcpy(le->image_fmt, cpu->image_fmt, sizeof(le->image_fmt));
+}
+
+static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+ const AddCowHeader *header = (const AddCowHeader *)buf;
+
+ if (buf_size < sizeof(*header)) {
+ return 0;
+ }
+ if (le32_to_cpu(header->magic) == ADD_COW_MAGIC &&
+ le32_to_cpu(header->version) == ADD_COW_VERSION) {
+ return 100;
+ }
+ return 0;
+}
+
+static int add_cow_create(const char *filename, QemuOpts *opts)
+{
+ AddCowHeader header = {
+ .magic = ADD_COW_MAGIC,
+ .version = ADD_COW_VERSION,
+ .features = 0,
+ .compat_features = 0,
+ .header_size = ADD_COW_HEADER_SIZE,
+ };
+ AddCowHeader le_header;
+ int64_t image_len;
+ const char *backing_filename, *backing_fmt, *image_filename, *image_format;
+ BlockDriverState *bs, *image_bs, *backing_bs;
+ BlockDriver *drv = bdrv_find_format("add-cow");
+ size_t cluster_size;
+ int ret;
+
+ image_len = qemu_opt_get_number(opts, BLOCK_OPT_SIZE, 0);
+ backing_filename = qemu_opt_get(opts, BLOCK_OPT_BACKING_FILE);
+ backing_fmt = qemu_opt_get(opts, BLOCK_OPT_BACKING_FMT);
+ image_filename = qemu_opt_get(opts, BLOCK_OPT_IMAGE_FILE);
+ image_format = qemu_opt_get(opts, BLOCK_OPT_IMAGE_FMT);
+ cluster_size = qemu_opt_get_size(opts, BLOCK_OPT_CLUSTER_SIZE,
+ ADD_COW_CLUSTER_SIZE);
+
+ header.cluster_bits = ffs(cluster_size) - 1;
+ if (header.cluster_bits < MIN_CLUSTER_BITS ||
+ header.cluster_bits > MAX_CLUSTER_BITS ||
+ (1 << header.cluster_bits) != cluster_size) {
+ error_report(
+ "Cluster size must be a power of two between %d and %dk",
+ 1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
+ return -EINVAL;
+ }
+
+ if (backing_filename) {
+ header.backing_offset = sizeof(header);
+ header.backing_size = strlen(backing_filename);
+
+ if (!backing_fmt) {
+ backing_bs = bdrv_new("image");
+ ret = bdrv_open(backing_bs, backing_filename,
+ BDRV_O_RDWR | BDRV_O_CACHE_WB, NULL);
+ if (ret < 0) {
+ return ret;
+ }
+ backing_fmt = bdrv_get_format_name(backing_bs);
+ bdrv_delete(backing_bs);
+ }
+ } else {
+ header.compat_features |= ADD_COW_F_ALL_ALLOCATED;
+ }
+
+ if (image_filename) {
+ header.image_offset = sizeof(header) + header.backing_size;
+ header.image_size = strlen(image_filename);
+ } else {
+ error_report("Error: image_file should be given.");
+ return -EINVAL;
+ }
+
+ if (backing_filename && !strcmp(backing_filename, image_filename)) {
+ error_report("Error: Trying to create an image with the "
+ "same backing file name as the image file name");
+ return -EINVAL;
+ }
+
+ if (!strcmp(filename, image_filename)) {
+ error_report("Error: Trying to create an image with the "
+ "same filename as the image file name");
+ return -EINVAL;
+ }
+
+ if (header.image_offset + header.image_size > header.header_size) {
+ error_report("image_file name or backing_file name too long.");
+ return -ENOSPC;
+ }
+
+ ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR);
+ if (ret < 0) {
+ return ret;
+ }
+ bdrv_delete(image_bs);
+
+ ret = bdrv_create_file(filename, NULL);
+ if (ret < 0) {
+ return ret;
+ }
+
+ ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR);
+ if (ret < 0) {
+ return ret;
+ }
+ snprintf(header.backing_fmt, sizeof(header.backing_fmt), "%s",
+ backing_fmt ? backing_fmt : "");
+ snprintf(header.image_fmt, sizeof(header.image_fmt), "%s",
+ image_format ? image_format : "raw");
+ add_cow_header_cpu_to_le(&header, &le_header);
+ ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header));
+ if (ret < 0) {
+ bdrv_delete(bs);
+ return ret;
+ }
+
+ ret = bdrv_pwrite(bs, header.image_offset, image_filename,
+ header.image_size);
+ if (ret < 0) {
+ bdrv_delete(bs);
+ return ret;
+ }
+
+ if (backing_filename) {
+ ret = bdrv_pwrite(bs, header.backing_offset, backing_filename,
+ header.backing_size);
+ if (ret < 0) {
+ bdrv_delete(bs);
+ return ret;
+ }
+ }
+ bdrv_close(bs);
+
+ ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+ if (ret < 0) {
+ bdrv_delete(bs);
+ return ret;
+ }
+
+ ret = bdrv_truncate(bs, image_len);
+ bdrv_delete(bs);
+ return ret;
+}
+
+static int add_cow_open(BlockDriverState *bs, int flags)
+{
+ char image_filename[PATH_MAX];
+ char tmp_name[PATH_MAX];
+ int ret;
+ int sector_per_byte;
+ BDRVAddCowState *s = bs->opaque;
+ AddCowHeader le_header;
+ char backing_filename[PATH_MAX];
+
+ ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header));
+ if (ret < 0) {
+ goto fail;
+ }
+
+ add_cow_header_le_to_cpu(&le_header, &s->header);
+
+ if (s->header.magic != ADD_COW_MAGIC) {
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ if (s->header.version != ADD_COW_VERSION) {
+ char version[64];
+ snprintf(version, sizeof(version), "ADD-COW version %d",
+ s->header.version);
+ qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+ bs->device_name, "add-cow", version);
+ ret = -ENOTSUP;
+ goto fail;
+ }
+
+ if (s->header.features & ~ADD_COW_FEATURE_MASK) {
+ char buf[64];
+ snprintf(buf, sizeof(buf), "Feature Flags: %" PRIx64,
+ s->header.features & ~ADD_COW_FEATURE_MASK);
+ qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+ bs->device_name, "add-cow", buf);
+ return -ENOTSUP;
+ }
+
+ if ((s->header.compat_features & ADD_COW_F_ALL_ALLOCATED) == 0) {
+ snprintf(bs->backing_format, sizeof(bs->backing_format),
+ "%s", s->header.backing_fmt);
+ }
+
+ if (s->header.cluster_bits < MIN_CLUSTER_BITS ||
+ s->header.cluster_bits > MAX_CLUSTER_BITS) {
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ if ((s->header.compat_features & ADD_COW_F_ALL_ALLOCATED) == 0) {
+ ret = bdrv_read_string(bs->file, s->header.backing_offset,
+ s->header.backing_size,
+ bs->backing_file,
+ sizeof(bs->backing_file));
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ ret = bdrv_read_string(bs->file, s->header.image_offset,
+ s->header.image_size, tmp_name,
+ sizeof(tmp_name));
+ if (ret < 0) {
+ goto fail;
+ }
+
+ s->image_hd = bdrv_new("");
+ if (path_has_protocol(tmp_name)) {
+ pstrcpy(image_filename, sizeof(image_filename), tmp_name);
+ } else {
+ path_combine(image_filename, sizeof(image_filename),
+ bs->filename, tmp_name);
+ }
+ bdrv_get_full_backing_filename(bs, backing_filename,
+ sizeof(backing_filename));
+ if (!strcmp(backing_filename, image_filename)) {
+ fprintf(stderr, "Error: backing file and image file are the same: %s\n",
+ backing_filename);
+ ret = -EINVAL;
+ goto fail;
+ }
+ ret = bdrv_open(s->image_hd, image_filename, flags,
+ bdrv_find_format(s->header.image_fmt));
+ if (ret < 0) {
+ bdrv_delete(s->image_hd);
+ goto fail;
+ }
+
+ bs->total_sectors = bdrv_getlength(s->image_hd) / BDRV_SECTOR_SIZE;
+ s->cluster_size = 1 << s->header.cluster_bits;
+ sector_per_byte = (s->cluster_size / BDRV_SECTOR_SIZE) * 8;
+ s->bitmap_size =
+ (bs->total_sectors + sector_per_byte - 1) / sector_per_byte;
+ s->bitmap_cache = block_cache_create(bs, ADD_COW_CACHE_SIZE,
+ ADD_COW_CACHE_ENTRY_SIZE,
+ BLOCK_TABLE_BITMAP);
+ qemu_co_mutex_init(&s->lock);
+ return 0;
+fail:
+ if (s->bitmap_cache) {
+ block_cache_destroy(bs, s->bitmap_cache);
+ }
+ return ret;
+}
+
+static void add_cow_close(BlockDriverState *bs)
+{
+ BDRVAddCowState *s = bs->opaque;
+ if (s->bitmap_cache) {
+ block_cache_destroy(bs, s->bitmap_cache);
+ }
+ bdrv_delete(s->image_hd);
+}
+
+static bool is_allocated(BlockDriverState *bs, int64_t sector_num)
+{
+ BDRVAddCowState *s = bs->opaque;
+ BlockCache *c = s->bitmap_cache;
+ int64_t cluster_num = sector_num / (s->cluster_size / BDRV_SECTOR_SIZE);
+ uint8_t *table = NULL;
+ bool val = false;
+ int ret;
+
+ uint64_t offset = s->header.header_size
+ + (offset_in_bitmap(sector_num, s->cluster_size / BDRV_SECTOR_SIZE)
+ & (~(c->cluster_size - 1)));
+ ret = block_cache_get(bs, s->bitmap_cache, offset, (void **)&table);
+ if (ret < 0) {
+ return ret;
+ }
+
+ val = table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE]
+ & (1 << (cluster_num % 8));
+ ret = block_cache_put(bs, s->bitmap_cache, (void **)&table);
+ if (ret < 0) {
+ return ret;
+ }
+ return val;
+}
+
+static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs,
+ int64_t sector_num, int nb_sectors, int *num_same)
+{
+ BDRVAddCowState *s = bs->opaque;
+ int changed;
+
+ if (nb_sectors == 0) {
+ *num_same = 0;
+ return 0;
+ }
+
+ if (s->header.compat_features & ADD_COW_F_ALL_ALLOCATED) {
+ *num_same = nb_sectors;
+ return 1;
+ }
+ changed = is_allocated(bs, sector_num);
+
+ for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
+ if (is_allocated(bs, sector_num + *num_same) != changed) {
+ break;
+ }
+ }
+ return changed;
+}
+
+static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov,
+ int64_t sector_num, int nb_sectors)
+{
+ int n1;
+ if ((sector_num + nb_sectors) <= bs->total_sectors) {
+ return nb_sectors;
+ }
+ if (sector_num >= bs->total_sectors) {
+ n1 = 0;
+ } else {
+ n1 = bs->total_sectors - sector_num;
+ }
+
+ qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1,
+ 0, BDRV_SECTOR_SIZE * (nb_sectors - n1));
+
+ return n1;
+}
+
+static coroutine_fn int add_cow_co_readv(BlockDriverState *bs,
+ int64_t sector_num,
+ int remaining_sectors,
+ QEMUIOVector *qiov)
+{
+ BDRVAddCowState *s = bs->opaque;
+ int cur_nr_sectors;
+ uint64_t bytes_done = 0;
+ QEMUIOVector hd_qiov;
+ int n1, ret = 0;
+
+ qemu_iovec_init(&hd_qiov, qiov->niov);
+ qemu_co_mutex_lock(&s->lock);
+ while (remaining_sectors != 0) {
+ cur_nr_sectors = remaining_sectors;
+ if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors,
+ &cur_nr_sectors)) {
+ qemu_iovec_reset(&hd_qiov);
+ qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
+ cur_nr_sectors * BDRV_SECTOR_SIZE);
+ qemu_co_mutex_unlock(&s->lock);
+ ret = bdrv_co_readv(s->image_hd, sector_num,
+ cur_nr_sectors, &hd_qiov);
+ qemu_co_mutex_lock(&s->lock);
+ if (ret < 0) {
+ goto fail;
+ }
+ } else {
+ if (bs->backing_hd) {
+ qemu_iovec_reset(&hd_qiov);
+ qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
+ cur_nr_sectors * BDRV_SECTOR_SIZE);
+ n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov,
+ sector_num, cur_nr_sectors);
+ if (n1 > 0) {
+ qemu_co_mutex_unlock(&s->lock);
+ ret = bdrv_co_readv(bs->backing_hd, sector_num,
+ n1, &hd_qiov);
+ qemu_co_mutex_lock(&s->lock);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+ } else {
+ qemu_iovec_memset(qiov, bytes_done, 0,
+ BDRV_SECTOR_SIZE * cur_nr_sectors);
+ }
+ }
+ remaining_sectors -= cur_nr_sectors;
+ sector_num += cur_nr_sectors;
+ bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE;
+ }
+fail:
+ qemu_co_mutex_unlock(&s->lock);
+ qemu_iovec_destroy(&hd_qiov);
+ return ret;
+}
+
+static int coroutine_fn copy_sectors(BlockDriverState *bs,
+ int n_start, int n_end)
+{
+ BDRVAddCowState *s = bs->opaque;
+ QEMUIOVector qiov;
+ struct iovec iov;
+ int n, ret;
+
+ n = n_end - n_start;
+ if (n <= 0) {
+ return 0;
+ }
+
+ iov.iov_len = n * BDRV_SECTOR_SIZE;
+ iov.iov_base = qemu_blockalign(bs, iov.iov_len);
+
+ qemu_iovec_init_external(&qiov, &iov, 1);
+
+ ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov);
+ if (ret < 0) {
+ goto out;
+ }
+ ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov);
+ if (ret < 0) {
+ goto out;
+ }
+
+ ret = 0;
+out:
+ qemu_vfree(iov.iov_base);
+ return ret;
+}
+
+static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
+ int64_t sector_num,
+ int remaining_sectors,
+ QEMUIOVector *qiov)
+{
+ BDRVAddCowState *s = bs->opaque;
+ BlockCache *c = s->bitmap_cache;
+ int ret = 0, i;
+ QEMUIOVector hd_qiov;
+ uint8_t *table;
+ uint64_t offset;
+ int mask = (s->cluster_size / BDRV_SECTOR_SIZE) - 1;
+ int cluster_mask = c->cluster_size - 1;
+ int64_t sector_per_cluster = s->cluster_size / BDRV_SECTOR_SIZE;
+
+ qemu_co_mutex_lock(&s->lock);
+ qemu_iovec_init(&hd_qiov, qiov->niov);
+ ret = bdrv_co_writev(s->image_hd, sector_num,
+ remaining_sectors, qiov);
+
+ if (ret < 0) {
+ goto fail;
+ }
+ if ((s->header.compat_features & ADD_COW_F_ALL_ALLOCATED) == 0) {
+ /* Copy content of unmodified sectors */
+ if (!is_cluster_head(sector_num, sector_per_cluster)
+ && !is_allocated(bs, sector_num)) {
+ ret = copy_sectors(bs, sector_num & ~mask, sector_num);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ if (!is_cluster_tail(sector_num + remaining_sectors - 1,
+ sector_per_cluster)
+ && !is_allocated(bs, sector_num + remaining_sectors - 1)) {
+ ret = copy_sectors(bs, sector_num + remaining_sectors,
+ ((sector_num + remaining_sectors) | mask) + 1);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ for (i = sector_num / sector_per_cluster;
+ i <= (sector_num + remaining_sectors - 1) / sector_per_cluster;
+ i++) {
+ offset = s->header.header_size
+ + (offset_in_bitmap(i * sector_per_cluster,
+ s->cluster_size / BDRV_SECTOR_SIZE) & (~cluster_mask));
+ ret = block_cache_get(bs, s->bitmap_cache, offset, (void **)&table);
+ if (ret < 0) {
+ goto fail;
+ }
+ if ((table[i / 8 % ADD_COW_CACHE_ENTRY_SIZE] & (1 << (i % 8))) == 0) {
+ table[i / 8 % ADD_COW_CACHE_ENTRY_SIZE] |= (1 << (i % 8));
+ block_cache_entry_mark_dirty(s->bitmap_cache, table);
+ }
+
+ ret = block_cache_put(bs, s->bitmap_cache, (void **) &table);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+ }
+ ret = 0;
+fail:
+ qemu_co_mutex_unlock(&s->lock);
+ qemu_iovec_destroy(&hd_qiov);
+ return ret;
+}
+
+static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
+{
+ BDRVAddCowState *s = bs->opaque;
+ int sector_per_byte = (s->cluster_size / BDRV_SECTOR_SIZE) * 8;
+ int ret;
+ int64_t bitmap_size =
+ (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
+ bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1)
+ & (~(ADD_COW_CACHE_ENTRY_SIZE - 1));
+
+ ret = bdrv_truncate(bs->file, s->header.header_size + bitmap_size);
+ if (ret < 0) {
+ return ret;
+ }
+
+ ret = bdrv_truncate(s->image_hd, size);
+ if (ret < 0) {
+ return ret;
+ }
+ return 0;
+}
+
+static int add_cow_reopen_prepare(BDRVReopenState *state,
+ BlockReopenQueue *queue, Error **errp)
+{
+ BDRVAddCowState *s;
+ int ret = -1;
+
+ assert(state != NULL);
+ assert(state->bs != NULL);
+
+ if (queue == NULL) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "No reopen queue for add-cow");
+ goto exit;
+ }
+
+ s = state->bs->opaque;
+
+ assert(s != NULL);
+
+
+ bdrv_reopen_queue(queue, s->image_hd, state->flags);
+ ret = 0;
+
+exit:
+ return ret;
+}
+
+
+static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
+{
+ BDRVAddCowState *s = bs->opaque;
+ int ret = 0;
+
+ /* Always release the lock, even when flushing the bitmap cache fails */
+ qemu_co_mutex_lock(&s->lock);
+ if (s->bitmap_cache) {
+ ret = block_cache_flush(bs, s->bitmap_cache);
+ }
+ if (ret >= 0) {
+ ret = bdrv_flush(s->image_hd);
+ }
+ qemu_co_mutex_unlock(&s->lock);
+ return ret;
+}
+
+static int add_cow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+ BDRVAddCowState *s = bs->opaque;
+ bdi->cluster_size = s->cluster_size;
+ return 0;
+}
+
+static QemuOptsList add_cow_create_opts = {
+ .name = "add-cow-create-opts",
+ .head = QTAILQ_HEAD_INITIALIZER(add_cow_create_opts.head),
+ .desc = {
+ {
+ .name = BLOCK_OPT_SIZE,
+ .type = QEMU_OPT_NUMBER,
+ .help = "Virtual disk size"
+ },
+ {
+ .name = BLOCK_OPT_BACKING_FILE,
+ .type = QEMU_OPT_STRING,
+ .help = "File name of a base image"
+ },
+ {
+ .name = BLOCK_OPT_BACKING_FMT,
+ .type = QEMU_OPT_STRING,
+ .help = "Image format of the base image"
+ },
+ {
+ .name = BLOCK_OPT_IMAGE_FILE,
+ .type = QEMU_OPT_STRING,
+ .help = "File name of an image file"
+ },
+ {
+ .name = BLOCK_OPT_IMAGE_FMT,
+ .type = QEMU_OPT_STRING,
+ .help = "Image format of the image file"
+ },
+ {
+ .name = BLOCK_OPT_CLUSTER_SIZE,
+ .type = QEMU_OPT_SIZE,
+ .help = "add-cow cluster size",
+ .def_value = ADD_COW_CLUSTER_SIZE
+ },
+ { /* end of list */ }
+ }
+};
+
+static QemuOptsList *add_cow_create_options(void)
+{
+ return &add_cow_create_opts;
+}
+
+static BlockDriver bdrv_add_cow = {
+ .format_name = "add-cow",
+ .instance_size = sizeof(BDRVAddCowState),
+ .bdrv_probe = add_cow_probe,
+ .bdrv_open = add_cow_open,
+ .bdrv_close = add_cow_close,
+ .bdrv_create = add_cow_create,
+ .bdrv_co_readv = add_cow_co_readv,
+ .bdrv_co_writev = add_cow_co_writev,
+ .bdrv_truncate = bdrv_add_cow_truncate,
+ .bdrv_co_is_allocated = add_cow_is_allocated,
+ .bdrv_reopen_prepare = add_cow_reopen_prepare,
+ .bdrv_get_info = add_cow_get_info,
+
+ .bdrv_create_options = add_cow_create_options,
+ .bdrv_co_flush_to_os = add_cow_co_flush,
+};
+
+static void bdrv_add_cow_init(void)
+{
+ bdrv_register(&bdrv_add_cow);
+}
+
+block_init(bdrv_add_cow_init);
diff --git a/block/add-cow.h b/block/add-cow.h
new file mode 100644
index 0000000..dba2b6c
--- /dev/null
+++ b/block/add-cow.h
@@ -0,0 +1,83 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ * Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef BLOCK_ADD_COW_H
+#define BLOCK_ADD_COW_H
+#include "block-cache.h"
+
+enum {
+ /* compat_features bit */
+ ADD_COW_F_ALL_ALLOCATED = 0x01,
+
+ /* no feature bits are defined yet */
+ ADD_COW_FEATURE_MASK = 0,
+
+ ADD_COW_MAGIC = 'A' | 'C' << 8 | 'O' << 16 | 'W' << 24,
+ ADD_COW_VERSION = 1,
+ ADD_COW_CACHE_SIZE = 16,
+ ADD_COW_CACHE_ENTRY_SIZE = 65536,
+ ADD_COW_CLUSTER_SIZE = 65536,
+ ADD_COW_HEADER_SIZE = 4096,
+ MIN_CLUSTER_BITS = 9,
+ MAX_CLUSTER_BITS = 21,
+};
+
+typedef struct AddCowHeader {
+ uint32_t magic;
+ uint32_t version;
+
+ uint32_t backing_offset;
+ uint32_t backing_size;
+ uint32_t image_offset;
+ uint32_t image_size;
+
+ uint32_t cluster_bits;
+ uint64_t features;
+ uint64_t compat_features;
+ uint32_t header_size;
+
+ char backing_fmt[16];
+ char image_fmt[16];
+} QEMU_PACKED AddCowHeader;
+
+typedef struct BDRVAddCowState {
+ BlockDriverState *image_hd;
+ CoMutex lock;
+ int cluster_size;
+ BlockCache *bitmap_cache;
+ uint64_t bitmap_size;
+ AddCowHeader header;
+ char backing_fmt[16];
+ char image_fmt[16];
+} BDRVAddCowState;
+
+/* Convert sector_num to offset in bitmap */
+static inline int64_t offset_in_bitmap(int64_t sector_num,
+ int64_t sector_per_cluster)
+{
+ int64_t cluster_num = sector_num / sector_per_cluster;
+ return cluster_num / 8;
+}
+
+static inline bool is_cluster_head(int64_t sector_num,
+ int64_t sector_per_cluster)
+{
+ return sector_num % sector_per_cluster == 0;
+}
+
+static inline bool is_cluster_tail(int64_t sector_num,
+ int64_t sector_per_cluster)
+{
+ return (sector_num + 1) % sector_per_cluster == 0;
+}
+#endif
diff --git a/block/block-cache.c b/block/block-cache.c
index bf5c57c..1a30462 100644
--- a/block/block-cache.c
+++ b/block/block-cache.c
@@ -112,6 +112,8 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
} else if (c->table_type == BLOCK_TABLE_L2) {
BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
+ } else if (c->table_type == BLOCK_TABLE_BITMAP) {
+ BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
}
ret = bdrv_pwrite(bs->file, c->entries[i].offset,
@@ -245,6 +247,8 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
if (read_from_disk) {
if (c->table_type == BLOCK_TABLE_L2) {
BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
+ } else if (c->table_type == BLOCK_TABLE_BITMAP) {
+ BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
}
ret = bdrv_pread(bs->file, offset, c->entries[i].table,
diff --git a/block/block-cache.h b/block/block-cache.h
index 4efa06e..a3c4a1c 100644
--- a/block/block-cache.h
+++ b/block/block-cache.h
@@ -37,6 +37,7 @@
typedef enum {
BLOCK_TABLE_REF,
BLOCK_TABLE_L2,
+ BLOCK_TABLE_BITMAP,
} BlockTableType;
typedef struct BlockCachedTable {
diff --git a/block_int.h b/block_int.h
index a104e70..8a79045 100644
--- a/block_int.h
+++ b/block_int.h
@@ -55,6 +55,8 @@
#define BLOCK_OPT_SUBFMT "subformat"
#define BLOCK_OPT_COMPAT_LEVEL "compat"
#define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+#define BLOCK_OPT_IMAGE_FILE "image_file"
+#define BLOCK_OPT_IMAGE_FMT "image_format"
typedef struct BdrvTrackedRequest BdrvTrackedRequest;
--
1.7.1
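
For readers comparing the AddCowHeader struct in block/add-cow.h with the
byte ranges quoted from the format documentation (8 - 11 backing file name
offset, 16 - 19 image file name offset, 28 - 35 features, 48 - 63 backing
file format, 64 - 79 image file format), here is a minimal sketch that checks
the packed layout. The names AddCowHeaderSketch and add_cow_header_layout_ok
are hypothetical, and the packed attribute is the GCC/Clang spelling assumed
to be equivalent to QEMU_PACKED.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the on-disk header from block/add-cow.h.
 * __attribute__((packed)) is assumed to match QEMU_PACKED. */
typedef struct AddCowHeaderSketch {
    uint32_t magic;
    uint32_t version;

    uint32_t backing_offset;
    uint32_t backing_size;
    uint32_t image_offset;
    uint32_t image_size;

    uint32_t cluster_bits;
    uint64_t features;
    uint64_t compat_features;
    uint32_t header_size;

    char backing_fmt[16];
    char image_fmt[16];
} __attribute__((packed)) AddCowHeaderSketch;

/* Hypothetical helper: returns 1 when the packed struct matches the
 * byte ranges listed in the format documentation, 0 otherwise. */
static int add_cow_header_layout_ok(void)
{
    return offsetof(AddCowHeaderSketch, backing_offset) == 8 &&
           offsetof(AddCowHeaderSketch, image_offset) == 16 &&
           offsetof(AddCowHeaderSketch, features) == 28 &&
           offsetof(AddCowHeaderSketch, backing_fmt) == 48 &&
           offsetof(AddCowHeaderSketch, image_fmt) == 64 &&
           sizeof(AddCowHeaderSketch) == 80;
}
```

Without the packed attribute a compiler would typically pad before the 64-bit
features field, which would shift every later offset, so a check like this is
cheap insurance when the struct doubles as the on-disk format.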
* [Qemu-devel] [PATCH V14 6/6] qemu-iotests: add add-cow iotests support.
2012-10-25 13:36 [Qemu-devel] [PATCH V14 0/6] add-cow file format Dong Xu Wang
` (4 preceding siblings ...)
2012-10-25 13:36 ` [Qemu-devel] [PATCH V14 5/6] add-cow file format core code Dong Xu Wang
@ 2012-10-25 13:36 ` Dong Xu Wang
5 siblings, 0 replies; 10+ messages in thread
From: Dong Xu Wang @ 2012-10-25 13:36 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Dong Xu Wang
This patch adds add-cow support to qemu-iotests so that the new file format
can be tested.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
tests/qemu-iotests/017 | 2 +-
tests/qemu-iotests/020 | 2 +-
tests/qemu-iotests/common | 6 ++++++
tests/qemu-iotests/common.rc | 15 ++++++++++++++-
4 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index 66951eb..d31432f 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
. ./common.pattern
# Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
_supported_proto generic
_supported_os Linux
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 2fb0ff8..3dbb495 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
. ./common.pattern
# Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
_supported_proto generic
_supported_os Linux
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 1f6fdf5..4c06895 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -128,6 +128,7 @@ common options
check options
-raw test raw (default)
-cow test cow
+ -add-cow test add-cow
-qcow test qcow
-qcow2 test qcow2
-qed test qed
@@ -163,6 +164,11 @@ testlist options
xpand=false
;;
+ -add-cow)
+ IMGFMT=add-cow
+ xpand=false
+ ;;
+
-qcow)
IMGFMT=qcow
xpand=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d534e94..f48b02a 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -97,6 +97,16 @@ _make_test_img()
fi
if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
+ elif [ "$IMGFMT" = "add-cow" ]; then
+ local IMG="$TEST_IMG"".raw"
+ if [ "$1" = "-b" ]; then
+ IMG="$IMG"".b"
+ $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+ extra_img_options="-o image_file=$IMG $extra_img_options"
+ else
+ $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+ extra_img_options="-o image_file=$IMG"
+ fi
fi
if [ -n "$optstr" ]; then
@@ -114,7 +124,8 @@ _make_test_img()
-e "s# compat='[^']*'##g" \
-e "s# compat6=\\(on\\|off\\)##g" \
-e "s# static=\\(on\\|off\\)##g" \
- -e "s# lazy_refcounts=\\(on\\|off\\)##g"
+ -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+ -e "s# image_file='[^']*'##g"
}
_cleanup_test_img()
@@ -125,6 +136,8 @@ _cleanup_test_img()
rm -f $TEST_DIR/t.$IMGFMT
rm -f $TEST_DIR/t.$IMGFMT.orig
rm -f $TEST_DIR/t.$IMGFMT.base
+ rm -f $TEST_DIR/t.$IMGFMT.raw
+ rm -f $TEST_DIR/t.$IMGFMT.raw.b
;;
rbd)
--
1.7.1
* Re: [Qemu-devel] [PATCH V14 1/6] docs: document for add-cow file format
2012-10-25 13:36 ` [Qemu-devel] [PATCH V14 1/6] docs: document for " Dong Xu Wang
@ 2012-10-25 14:56 ` Eric Blake
2012-10-26 2:54 ` Dong Xu Wang
0 siblings, 1 reply; 10+ messages in thread
From: Eric Blake @ 2012-10-25 14:56 UTC (permalink / raw)
To: Dong Xu Wang; +Cc: kwolf, qemu-devel
On 10/25/2012 07:36 AM, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has an installed OS.)
> + 1) Create a raw image with the same size of ubuntu.img
> + qemu-img create -f raw test.raw 8G
> + 2) Create an add-cow image which will store dirty bitmap
> + qemu-img create -f add-cow test.add-cow \
> + -o backing_file=ubuntu.img,image_file=test.raw
This example does not specify the file format of either the backing_file
or the image_file, yet[1]...
>
> + 8 - 11: backing file name offset
> + Offset in the add-cow file at which the backing file
> + name is stored (NB: The string is not nul-terminated).
Correct spelling of nul-terminated, but...[2]
>
> + 16 - 19: image file name offset
> + Offset in the add-cow file at which the image file name
> + is stored (NB: The string is not null terminated). It
[2]...here, you used the wrong spelling...
> + 28 - 35: features
> + Bitmask of features. If a feature bit set that can not
> + be recognized, the add-cow file should be droped. They are not
> + used in v1.
If a feature bit is set but not recognized, the add-cow file should be
dropped.
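
To make the rule concrete, here is a minimal sketch of how a reader of the
format could reject a file with an unknown feature bit. The names
SKETCH_FEATURE_MASK and sketch_check_features are hypothetical; the mask value
0 mirrors ADD_COW_FEATURE_MASK from the patch, where no feature bits are
defined for v1. This is not taken from add_cow_open.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors ADD_COW_FEATURE_MASK in the patch: v1 defines no feature bits. */
#define SKETCH_FEATURE_MASK UINT64_C(0)

/* Hypothetical helper: returns 0 when every set bit in 'features' is
 * known, -1 when at least one unknown bit is set and the file must be
 * refused. */
static int sketch_check_features(uint64_t features)
{
    if (features & ~SKETCH_FEATURE_MASK) {
        return -1;
    }
    return 0;
}
```

With the v1 mask, any non-zero features value is refused; a later version
would only need to widen the mask.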
> +
> + 48 - 63: backing file format
> + Format of backing file. It will be filled with 0 if
> + backing file name offset is 0. If backing file name
> + offset is non-empty, it must be non-empty. It is coded
> + in free-form ASCII, and is not NUL-terminated. Zero
> + padded on the right.
[2]...and here, you used different capitalization. I think I prefer
NUL-terminated in all cases.
> +
> + 64 - 79: image file format
> + Format of image file. It must be non-empty. It is coded
> + in free-form ASCII, and is not NUL-terminated. Zero
> + padded on the right.
[1]...here you claim that backing and image file format are mandatory
(must not be empty). Shouldn't you allow the file format to be empty,
in which case qemu will probe? And why do you even need image file
format - isn't the whole point of add-cow to wrap a raw image file, or
are you planning on also being able to wrap non-raw files? Are there
other non-raw file formats that lack backing file support, where add-cow
can be used to give it a backing file?
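
As an aside on the encoding itself: because the field is zero padded and not
NUL-terminated, a 16-character format name fills it completely, so it cannot
be handed directly to a function that expects a C string. A minimal sketch of
a safe conversion follows; sketch_fmt_to_cstr and SKETCH_FMT_FIELD_LEN are
hypothetical names, and treating an empty result as "probe the format" is only
the behaviour suggested above, not something the patch implements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_FMT_FIELD_LEN 16

/* Copy a zero-padded, not NUL-terminated format field into a C string.
 * 'out' must have room for SKETCH_FMT_FIELD_LEN + 1 bytes.
 * Returns the name length; 0 means the field is empty, in which case a
 * caller could fall back to probing the format. */
static size_t sketch_fmt_to_cstr(const char field[SKETCH_FMT_FIELD_LEN],
                                 char *out)
{
    size_t len = 0;

    while (len < SKETCH_FMT_FIELD_LEN && field[len] != '\0') {
        len++;
    }
    memcpy(out, field, len);
    out[len] = '\0';
    return len;
}
```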
> +
> + 80 - [HEADER_SIZE - 1]:
> + It is used to make sure COW bitmap field starts at the
> + HEADER_SIZE byte, backing file name and image file name
> + will be stored here. The bytes that is not pointing to
s/is/are/
> + backing file and image file names must be set to 0.
> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
Rather, it is tracking whether the sector in image file is allocated or not.
> +
> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
> +bit is 16, then each bit tracks one cluster, (1 << 16) = 65536 bytes. The
> +image file size is rounded up to cluster size (where any bytes in the
> +last cluster that do not fit in the image are ignored), then if the
> +number of clusters is not a multiple of 8, then remaining bits in the
> +bitmap will be set to 0.
> +
> +The size of bitmap is calculated according to virtual size of image file,and
s/file,and/file, and/
> +the size of bitmap should be multiple of add-cow file's cluster size, the bits
> +not used will be set to 0. Within each byte, the least significant bit covers
> +the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
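
To illustrate the size rule, here is a minimal sketch of the calculation: one
bit per cluster, rounded up to whole bytes, then rounded up to an alignment.
The function name sketch_bitmap_bytes is hypothetical, and the alignment is
left as a parameter because the text above says "cluster size" while
bdrv_add_cow_truncate in patch 5/6 rounds to ADD_COW_CACHE_ENTRY_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for the COW bitmap of an image of
 * 'image_bytes' bytes, with one bit per cluster of (1 << cluster_bits)
 * bytes, rounded up to a multiple of 'align' (a power of two). */
static uint64_t sketch_bitmap_bytes(uint64_t image_bytes,
                                    unsigned cluster_bits,
                                    uint64_t align)
{
    uint64_t cluster_size = UINT64_C(1) << cluster_bits;
    /* A partial last cluster still needs its own bit */
    uint64_t clusters = (image_bytes + cluster_size - 1) >> cluster_bits;
    uint64_t bytes = (clusters + 7) / 8;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (bytes + align - 1) & ~(align - 1);
}
```

For the 8G example image with 64K clusters this gives 131072 clusters, that
is 16384 bitmap bytes, which rounds up to 65536 when aligned to 65536.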
> +
> +If the bit is 0, it indicates the sector has not been allocated in image file,
> +data should be loaded from backing file while reading; if the bit is 1, it
> +indicates the related sector has been dirty, should be loaded from image file
> +while reading. Writing to a sector causes the corresponding bit to be set to 1.
> +If there is no backing file, or if the image file is larger than the backing
> +file and the offset is beyond the end of the backing file, then the data should
> +be read as all zero bytes instead.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow must be written as 0 and must be
> +ignored when reading.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files via qemu-img. If image file name and backing file
> +name are the same, the add-cow image must be treated as invalid.
>
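
Putting the bitmap rules above together, here is a minimal sketch of the
lookup a reader would perform: map the sector to its cluster, test the bit
with the least significant bit covering the first cluster, then pick the data
source. All names are hypothetical, the bitmap is assumed to be fully loaded
in memory (the driver goes through a cache instead), and a sector size of 512
bytes is assumed to match BDRV_SECTOR_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512 /* assumed equal to BDRV_SECTOR_SIZE */

typedef enum {
    READ_FROM_IMAGE,   /* bit is 1: cluster allocated in the image file */
    READ_FROM_BACKING, /* bit is 0 and the backing file covers the sector */
    READ_ZEROES        /* bit is 0 and no backing data exists there */
} SketchReadSource;

/* True when the cluster containing 'sector_num' is marked allocated.
 * Within each byte, bit 0 covers the first cluster. */
static bool sketch_is_allocated(const uint8_t *bitmap, int64_t sector_num,
                                unsigned cluster_bits)
{
    int64_t sectors_per_cluster =
        (INT64_C(1) << cluster_bits) / SKETCH_SECTOR_SIZE;
    int64_t cluster = sector_num / sectors_per_cluster;

    return (bitmap[cluster / 8] >> (cluster % 8)) & 1;
}

/* 'backing_sectors' is the backing file length in sectors, or 0 when
 * there is no backing file. */
static SketchReadSource sketch_read_source(const uint8_t *bitmap,
                                           int64_t sector_num,
                                           unsigned cluster_bits,
                                           int64_t backing_sectors)
{
    if (sketch_is_allocated(bitmap, sector_num, cluster_bits)) {
        return READ_FROM_IMAGE;
    }
    if (sector_num < backing_sectors) {
        return READ_FROM_BACKING;
    }
    return READ_ZEROES;
}
```

With cluster_bits = 16 each bit covers 128 sectors, so a bitmap starting with
the bytes 0x05 0x01 marks clusters 0, 2 and 8 as allocated.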
--
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
* Re: [Qemu-devel] [PATCH V14 1/6] docs: document for add-cow file format
2012-10-25 14:56 ` Eric Blake
@ 2012-10-26 2:54 ` Dong Xu Wang
2012-10-26 8:34 ` Kevin Wolf
0 siblings, 1 reply; 10+ messages in thread
From: Dong Xu Wang @ 2012-10-26 2:54 UTC (permalink / raw)
To: Kevin Wolf; +Cc: qemu-devel
On Thu, Oct 25, 2012 at 10:56 PM, Eric Blake <eblake@redhat.com> wrote:
> On 10/25/2012 07:36 AM, Dong Xu Wang wrote:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>> +An example usage of add-cow would look like::
>> +(ubuntu.img is a disk image which has an installed OS.)
>> + 1) Create a raw image with the same size of ubuntu.img
>> + qemu-img create -f raw test.raw 8G
>> + 2) Create an add-cow image which will store dirty bitmap
>> + qemu-img create -f add-cow test.add-cow \
>> + -o backing_file=ubuntu.img,image_file=test.raw
>
> This example does not specify the file format of either the backing_file
> or the image_file, yet[1]...
>
>>
>> + 8 - 11: backing file name offset
>> + Offset in the add-cow file at which the backing file
>> + name is stored (NB: The string is not nul-terminated).
>
> Correct spelling of nul-terminated, but...[2]
>
>>
>> + 16 - 19: image file name offset
>> + Offset in the add-cow file at which the image file name
>> + is stored (NB: The string is not null terminated). It
>
> [2]...here, you used the wrong spelling...
>
>> + 28 - 35: features
>> + Bitmask of features. If a feature bit set that can not
>> + be recognized, the add-cow file should be droped. They are not
>> + used in v1.
>
> If a feature bit is set but not recognized, the add-cow file should be
> dropped.
>
>> +
>> + 48 - 63: backing file format
>> + Format of backing file. It will be filled with 0 if
>> + backing file name offset is 0. If backing file name
>> + offset is non-empty, it must be non-empty. It is coded
>> + in free-form ASCII, and is not NUL-terminated. Zero
>> + padded on the right.
>
> [2]...and here, you used different capitalization. I think I prefer
> NUL-terminated in all cases.
>
>> +
>> + 64 - 79: image file format
>> + Format of image file. It must be non-empty. It is coded
>> + in free-form ASCII, and is not NUL-terminated. Zero
>> + padded on the right.
>
> [1]...here you claim that backing and image file format are mandatory
> (must not be empty). Shouldn't you allow the file format to be empty,
> in which case qemu will probe? And why do you even need image file
> format - isn't the whole point of add-cow to wrap a raw image file, or
> are you planning on also being able to wrap non-raw files? Are there
> other non-raw file formats that lack backing file support, where add-cow
> can be used to give it a backing file?
>
Kevin or Stefan, can you give me some opinion? Thanks.
>> +
>> + 80 - [HEADER_SIZE - 1]:
>> + It is used to make sure COW bitmap field starts at the
>> + HEADER_SIZE byte, backing file name and image file name
>> + will be stored here. The bytes that is not pointing to
>
> s/is/are/
>
>> + backing file and image file names must be set to 0.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>
> Rather, it is tracking whether the sector in image file is allocated or not.
>
>> +
>> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
>> +bit is 16, then each bit tracks one cluster, (1 << 16) = 65536 bytes. The
>> +image file size is rounded up to cluster size (where any bytes in the
>> +last cluster that do not fit in the image are ignored), then if the
>> +number of clusters is not a multiple of 8, then remaining bits in the
>> +bitmap will be set to 0.
>> +
>> +The size of bitmap is calculated according to virtual size of image file,and
>
> s/file,and/file, and/
>
>> +the size of bitmap should be multiple of add-cow file's cluster size, the bits
>> +not used will be set to 0. Within each byte, the least significant bit covers
>> +the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, it indicates the sector has not been allocated in image file,
>> +data should be loaded from backing file while reading; if the bit is 1, it
>> +indicates the related sector has been dirty, should be loaded from image file
>> +while reading. Writing to a sector causes the corresponding bit to be set to 1.
>> +If there is no backing file, or if the image file is larger than the backing
>> +file and the offset is beyond the end of the backing file, then the data should
>> +be read as all zero bytes instead.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow must be written as 0 and must be
>> +ignored when reading.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files via qemu-img. If image file name and backing file
>> +name are the same, the add-cow image must be treated as invalid.
>>
>
Thank you, Eric. I will correct these in the next version.
> --
> Eric Blake eblake@redhat.com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
* Re: [Qemu-devel] [PATCH V14 1/6] docs: document for add-cow file format
2012-10-26 2:54 ` Dong Xu Wang
@ 2012-10-26 8:34 ` Kevin Wolf
0 siblings, 0 replies; 10+ messages in thread
From: Kevin Wolf @ 2012-10-26 8:34 UTC (permalink / raw)
To: Dong Xu Wang; +Cc: qemu-devel
Am 26.10.2012 04:54, schrieb Dong Xu Wang:
> On Thu, Oct 25, 2012 at 10:56 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 10/25/2012 07:36 AM, Dong Xu Wang wrote:
>>> +
>>> + 64 - 79: image file format
>>> + Format of image file. It must be non-empty. It is coded
>>> + in free-form ASCII, and is not NUL-terminated. Zero
>>> + padded on the right.
>>
>> [1]...here you claim that backing and image file format are mandatory
>> (must not be empty). Shouldn't you allow the file format to be empty,
>> in which case qemu will probe? And why do you even need image file
>> format - isn't the whole point of add-cow to wrap a raw image file, or
>> are you planning on also being able to wrap non-raw files? Are there
>> other non-raw file formats that lack backing file support, where add-cow
>> can be used to give it a backing file?
>>
>
> Kevin or Stefan, can you give me some opinion? Thanks.
Yes, there are plenty of other block drivers that don't support backing
files, either because their image format/protocol can't provide such
information or just because it's not implemented in qemu today.
Kevin