* [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support
@ 2012-03-27 15:03 Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3 Kevin Wolf
` (15 more replies)
0 siblings, 16 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
While this is not quite ready to be merged, I think the important stuff is done
and works (it survives qemu-iotests at least) and probably now is the right
time to start getting feedback. I'm going to get this merged before the 1.1
soft freeze.
What's left is probably some cleanup, adding more new test cases and obviously
fixing whatever bugs come up.
Kevin Wolf (15):
Specification for qcow2 version 3
qcow2: Ignore reserved bits in get_cluster_offset
qcow2: Ignore reserved bits in count_contiguous_clusters()
qcow2: Fail write_compressed when overwriting data
qcow2: Ignore reserved bits in L1/L2 entries
qcow2: Refactor qcow2_free_any_clusters
qcow2: Simplify count_cow_clusters
qcow2: Ignore reserved bits in refcount table entries
qcow2: Ignore reserved bits in check_refcounts
qcow2: Version 3 images
qcow2: Support reading zero clusters
qcow2: Support for feature table header extension
qemu-iotests: Test COW with zero clusters
qcow2: Zero write support
qemu-iotests: use qcow3
Paolo Bonzini (1):
qemu-iotests: add a simple test for write_zeroes
block.c | 14 ++-
block/qcow2-cluster.c | 224 ++++++++++++++++++++++++++----------
block/qcow2-refcount.c | 156 ++++++++++++++-----------
block/qcow2.c | 260 ++++++++++++++++++++++++++++++++++++++----
block/qcow2.h | 58 +++++++++-
block_int.h | 1 +
docs/specs/qcow2.txt | 129 +++++++++++++++++----
tests/qemu-iotests/031 | 137 ++++++++++++++++++++++
tests/qemu-iotests/031.out | 109 ++++++++++++++++++
tests/qemu-iotests/common.rc | 21 ++++-
tests/qemu-iotests/group | 1 +
11 files changed, 932 insertions(+), 178 deletions(-)
create mode 100755 tests/qemu-iotests/031
create mode 100644 tests/qemu-iotests/031.out
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 16:25 ` Eric Blake
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 02/16] qcow2: Ignore reserved bits in get_cluster_offset Kevin Wolf
` (14 subsequent siblings)
15 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
This is the second draft for what I think could be added when we increase qcow2's
version number to 3. This includes points that have been made by several people
over the past few months. We're probably not going to implement this next week,
but I think it's important to get discussions started early, so here it is.
Changes implemented in this RFC:
- Added compatible/incompatible/auto-clear feature bits plus an optional
feature name table to allow useful error messages even if an older version
doesn't know some feature at all.
- Added a dirty flag which indicates that the refcounts may not be accurate
("QED mode"). This means that we can save writes to the refcount table with
cache=writethrough, but it isn't really useful otherwise since the
introduction of Qcow2Cache.
- Configurable refcount width. If you don't want to use internal snapshots,
make refcounts one bit and save cache space and I/O.
- Added subclusters. This separates the COW size (one subcluster, I'm thinking
of 64k default size here) from the allocation size (one cluster, 2M). Less
fragmentation, less metadata, but still reasonable COW granularity.
This also allows preallocating clusters without allocating any of their
subclusters. You can have an image that is like raw + COW metadata, and you
can also preallocate metadata for images with backing files.
- Zero cluster flags. This allows discard even with a backing file that doesn't
contain zeros. It is also useful for copy-on-read/image streaming, as you'll
want to keep sparseness without accessing the remote image for an unallocated
cluster all the time.
- Fixed internal snapshot metadata to use 64 bit VM state size. You can't save
a snapshot of a VM with >= 4 GB RAM today.
Possible future additions:
- Add per-L2-table dirty flag to L1?
- Add per-refcount-block full flag to refcount table?
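To illustrate the feature-bit rules from the first item in the list of changes,
here is a minimal sketch of how an implementation might evaluate the three
bitmaps when opening an image. It is not taken from the patches: the struct,
the function name check_features() and the "known" masks are hypothetical, and
only the rules themselves (fail on unknown incompatible bits, ignore unknown
compatible bits, clear unknown auto-clear bits before writing) come from the
draft.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three feature bitmaps of the draft v3 header (hypothetical struct). */
struct feature_bits {
    uint64_t incompatible;
    uint64_t compatible;
    uint64_t autoclear;
};

/*
 * Returns false if the image must not be opened at all. When the image is
 * opened for writing, unknown auto-clear bits are dropped from *img so that
 * the caller can write the updated header back before modifying the image.
 */
static inline bool check_features(struct feature_bits *img,
                                  const struct feature_bits *known,
                                  bool want_write)
{
    /* Any unknown incompatible bit means we cannot interpret the image. */
    if (img->incompatible & ~known->incompatible) {
        return false;
    }

    /* Unknown compatible bits can safely be ignored: nothing to do. */

    /* Unknown auto-clear bits must be cleared before the first write. */
    if (want_write) {
        img->autoclear &= known->autoclear;
    }

    return true;
}
```

A read-only open leaves all three bitmaps untouched in this sketch.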
---
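As a note outside the description proper, the subcluster bitmap from the
proposed L2 entry layout could be decoded roughly as follows. This is only a
sketch under one assumption: the draft says the first subcluster is represented
by the LSB, and I assume here that subcluster i then occupies bits 2*i and
2*i + 1 of the 64 bit bitmask. The enum and helper names are hypothetical.

```c
#include <stdint.h>

/* Per-subcluster status values as listed in the draft. */
enum subcluster_status {
    SC_UNALLOCATED = 0,
    SC_ALLOCATED   = 1,
    SC_ZERO        = 2,   /* unallocated, reads as zeros */
    SC_RESERVED    = 3,
};

/* Read the status of subcluster 'index' (0..31) from the bitmask. */
static inline enum subcluster_status get_subcluster_status(uint64_t bitmap,
                                                           unsigned index)
{
    return (enum subcluster_status)((bitmap >> (2 * index)) & 3);
}

/* Return a copy of the bitmask with subcluster 'index' set to 'status'. */
static inline uint64_t set_subcluster_status(uint64_t bitmap, unsigned index,
                                             enum subcluster_status status)
{
    bitmap &= ~(3ULL << (2 * index));
    bitmap |= (uint64_t)status << (2 * index);
    return bitmap;
}
```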
docs/specs/qcow2.txt | 129 +++++++++++++++++++++++++++++++++++++++++---------
1 files changed, 107 insertions(+), 22 deletions(-)
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index b6adcad..9a492cd 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -18,7 +18,7 @@ The first cluster of a qcow2 image contains the file header:
QCOW magic string ("QFI\xfb")
4 - 7: version
- Version number (only valid value is 2)
+ Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
@@ -67,12 +67,53 @@ The first cluster of a qcow2 image contains the file header:
Offset into the image file at which the snapshot table
starts. Must be aligned to a cluster boundary.
+If the version is 3 or higher, the header has the following additional fields.
+For version 2, the values are assumed to be zero, unless specified otherwise
+in the description of a field.
+
+ 72 - 79: incompatible_features
+ Bitmask of incompatible features. An implementation must
+ fail to open an image if an unknown bit is set.
+
+ Bit 0: The reference counts in the image file may be
+ inaccurate. Implementations must check/rebuild
+ them if they rely on them.
+
+ Bit 1: Enable subclusters. This affects the L2 table
+ format.
+
+ Bits 2-31: Reserved (set to 0)
+
+ 80 - 87: compatible_features
+ Bitmask of compatible features. An implementation can
+ safely ignore any unknown bits that are set.
+
+ Bits 0-31: Reserved (set to 0)
+
+ 88 - 95: autoclear_features
+ Bitmask of auto-clear features. An implementation may only
+ write to an image with unknown auto-clear features if it
+ clears the respective bits from this field first.
+
+ Bits 0-31: Reserved (set to 0)
+
+ 96 - 99: refcount_bits
+ Size of a reference count block entry in bits. For version 2
+ images, the size is always assumed to be 16 bits. The size
+ must be a power of two.
+ [ TODO: Define order in sub-byte sizes ]
+
+ 100 - 103: header_length
+ Length of the header structure in bytes. For version 2
+ images, the length is always assumed to be 72 bytes.
+
Directly after the image header, optional sections called header extensions can
be stored. Each extension has a structure like the following:
Byte 0 - 3: Header extension type:
0x00000000 - End of the header extension area
0xE2792ACA - Backing file format name
+ 0x6803f857 - Feature name table
other - Unknown header extension, can be safely
ignored
@@ -84,8 +125,32 @@ be stored. Each extension has a structure like the following:
multiple of 8.
The remaining space between the end of the header extension area and the end of
-the first cluster can be used for other data. Usually, the backing file name is
-stored there.
+the first cluster can be used for the backing file name. It is not allowed to
+store other data here, so that an implementation can safely modify the header
+and add extensions without harming data of compatible features that it
+doesn't support. Compatible features that need space for additional data can
+use a header extension.
+
+
+== Feature name table ==
+
+A feature name table is an optional header extension that contains the name for
+features used by the image. It can be used by applications that don't know
+the respective feature (e.g. because the feature was introduced only later) to
+display a useful error message.
+
+The number of entries in the feature name table is determined by the length of
+the header extension data. Its entries look like this:
+
+ Byte 0: Type of feature (select feature bitmap)
+ 0: Incompatible feature
+ 1: Compatible feature
+ 2: Autoclear feature
+
+ 1: Bit number within the selected feature bitmap
+
+ 2 - 47: Feature name (padded with zeros, but not necessarily null
+ terminated if it has full length)
== Host cluster management ==
@@ -138,7 +203,8 @@ guest clusters to host clusters. They are called L1 and L2 table.
The L1 table has a variable size (stored in the header) and may use multiple
clusters, however it must be contiguous in the image file. L2 tables are
-exactly one cluster in size.
+exactly one cluster in size if subclusters are disabled, and two clusters if
+they are enabled.
Given a offset into the virtual disk, the offset into the image file can be
obtained as follows:
@@ -168,9 +234,40 @@ L1 table entry:
refcount is exactly one. This information is only accurate
in the active L1 table.
-L2 table entry (for normal clusters):
+L2 table entry:
- Bit 0 - 8: Reserved (set to 0)
+ Bit 0 - 61: Cluster descriptor
+
+ 62: 0 for standard clusters
+ 1 for compressed clusters
+
+ 63: 0 for a cluster that is unused or requires COW, 1 if its
+ refcount is exactly one. This information is only accurate
+ in L2 tables that are reachable from the the active L1
+ table.
+
+ 64 - 127: If subclusters are enabled, this contains a bitmask that
+ describes the allocation status of all 32 subclusters (two
+ bits for each). The first subcluster is represented by the
+ LSB. The values for each subcluster are:
+
+ 0: Subcluster is unallocated
+ 1: Subcluster is allocated
+ 2: Subcluster is unallocated and reads as all zeros
+ instead of referring to the backing file
+ 3: Reserved
+
+Standard Cluster Descriptor:
+
+ Bit 0: If set to 1, the cluster reads as all zeros. The host
+ cluster offset can be used to describe a preallocation,
+ but it won't be used for reading data from this cluster,
+ nor is data read from the backing file if the cluster is
+ unallocated.
+
+ With version 2, this is always 0.
+
+ 1 - 8: Reserved (set to 0)
9 - 55: Bits 9-55 of host cluster offset. Must be aligned to a
cluster boundary. If the offset is 0, the cluster is
@@ -178,29 +275,17 @@ L2 table entry (for normal clusters):
56 - 61: Reserved (set to 0)
- 62: 0 (this cluster is not compressed)
-
- 63: 0 for a cluster that is unused or requires COW, 1 if its
- refcount is exactly one. This information is only accurate
- in L2 tables that are reachable from the the active L1
- table.
-L2 table entry (for compressed clusters; x = 62 - (cluster_size - 8)):
+Compressed Clusters Descriptor (x = 62 - (cluster_size - 8)):
Bit 0 - x: Host cluster offset. This is usually _not_ aligned to a
cluster boundary!
x+1 - 61: Compressed size of the images in sectors of 512 bytes
- 62: 1 (this cluster is compressed using zlib)
-
- 63: 0 for a cluster that is unused or requires COW, 1 if its
- refcount is exactly one. This information is only accurate
- in L2 tables that are reachable from the the active L1
- table.
-
-If a cluster is unallocated, read requests shall read the data from the backing
-file. If there is no backing file or the backing file is smaller than the image,
+If a cluster or a subcluster is unallocated, read requests shall read the data
+from the backing file (except if bit 0 in the Standard Cluster Descriptor is
+set). If there is no backing file or the backing file is smaller than the image,
they shall read zeros for all parts that are not covered by the backing file.
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 02/16] qcow2: Ignore reserved bits in get_cluster_offset
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3 Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 03/16] qcow2: Ignore reserved bits in count_contiguous_clusters() Kevin Wolf
` (13 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
With this change, reading from a qcow2 image ignores all reserved bits
that are set in an L1 or L2 table entry.
Now get_cluster_offset() assigns *cluster_offset only the offset without
any other flags. The cluster type is no longer encoded in the offset, but
is returned as a non-negative return value on success.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
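For readers who want to try the new classification outside of qemu, here is a
standalone sketch. The masks, the enum and qcow2_get_cluster_type() mirror the
definitions added to qcow2.h in this patch; the two flag macros restate the
existing bit 62/63 flags, and classify_l2_entry() is a hypothetical helper that
only imitates what qcow2_get_cluster_offset() does with the entry. It takes a
host-endian L2 entry, so no be64_to_cpu() is involved.

```c
#include <stdint.h>

/* Existing L2 entry flags (bits 63 and 62). */
#define QCOW_OFLAG_COPIED     (1ULL << 63)
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)

/* Masks as introduced by this patch. */
#define L2E_OFFSET_MASK                 0x00ffffffffffff00ULL
#define L2E_COMPRESSED_OFFSET_SIZE_MASK 0x3fffffffffffffffULL

enum {
    QCOW2_CLUSTER_UNALLOCATED,
    QCOW2_CLUSTER_NORMAL,
    QCOW2_CLUSTER_COMPRESSED,
};

static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
    if (l2_entry & QCOW_OFLAG_COMPRESSED) {
        return QCOW2_CLUSTER_COMPRESSED;
    } else if (!(l2_entry & L2E_OFFSET_MASK)) {
        return QCOW2_CLUSTER_UNALLOCATED;
    } else {
        return QCOW2_CLUSTER_NORMAL;
    }
}

/*
 * Hypothetical helper: returns the cluster type and stores the entry with
 * all flags and reserved bits stripped in *cluster_offset.
 */
static inline int classify_l2_entry(uint64_t l2_entry,
                                    uint64_t *cluster_offset)
{
    int type = qcow2_get_cluster_type(l2_entry);

    switch (type) {
    case QCOW2_CLUSTER_COMPRESSED:
        /* Keep offset and compressed size, drop the two flag bits */
        *cluster_offset = l2_entry & L2E_COMPRESSED_OFFSET_SIZE_MASK;
        break;
    case QCOW2_CLUSTER_NORMAL:
        *cluster_offset = l2_entry & L2E_OFFSET_MASK;
        break;
    default:
        *cluster_offset = 0;
        break;
    }
    return type;
}
```

Note that an entry which has only the copied flag or a reserved bit set, but no
offset, is reported as unallocated.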
block/qcow2-cluster.c | 41 +++++++++++++++++++++++++----------------
block/qcow2.c | 17 ++++++++++++++---
block/qcow2.h | 21 +++++++++++++++++++++
3 files changed, 60 insertions(+), 19 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index cbd224d..44d13de 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -367,11 +367,9 @@ out:
*
* on exit, *num is the number of contiguous sectors we can read.
*
- * Return 0, if the offset is found
- * Return -errno, otherwise.
- *
+ * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
+ * cases.
*/
-
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset)
{
@@ -407,19 +405,19 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
/* seek the the l2 offset in the l1 table */
l1_index = offset >> l1_bits;
- if (l1_index >= s->l1_size)
+ if (l1_index >= s->l1_size) {
+ ret = QCOW2_CLUSTER_UNALLOCATED;
goto out;
+ }
- l2_offset = s->l1_table[l1_index];
-
- /* seek the l2 table of the given l2 offset */
-
- if (!l2_offset)
+ l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+ if (!l2_offset) {
+ ret = QCOW2_CLUSTER_UNALLOCATED;
goto out;
+ }
/* load the l2 table in memory */
- l2_offset &= ~QCOW_OFLAG_COPIED;
ret = l2_load(bs, l2_offset, &l2_table);
if (ret < 0) {
return ret;
@@ -431,26 +429,37 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
*cluster_offset = be64_to_cpu(l2_table[l2_index]);
nb_clusters = size_to_clusters(s, nb_needed << 9);
- if (!*cluster_offset) {
+ ret = qcow2_get_cluster_type(*cluster_offset);
+ switch (ret) {
+ case QCOW2_CLUSTER_COMPRESSED:
+ /* Compressed clusters can only be processed one by one */
+ c = 1;
+ *cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
+ break;
+ case QCOW2_CLUSTER_UNALLOCATED:
/* how many empty clusters ? */
c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]);
- } else {
+ *cluster_offset = 0;
+ break;
+ case QCOW2_CLUSTER_NORMAL:
/* how many allocated clusters ? */
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0, QCOW_OFLAG_COPIED);
+ *cluster_offset &= L2E_OFFSET_MASK;
+ break;
}
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
- nb_available = (c * s->cluster_sectors);
+ nb_available = (c * s->cluster_sectors);
+
out:
if (nb_available > nb_needed)
nb_available = nb_needed;
*num = nb_available - index_in_cluster;
- *cluster_offset &=~QCOW_OFLAG_COPIED;
- return 0;
+ return ret;
}
/*
diff --git a/block/qcow2.c b/block/qcow2.c
index 70d3141..4c82adc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -449,7 +449,8 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
qemu_iovec_copy(&hd_qiov, qiov, bytes_done,
cur_nr_sectors * 512);
- if (!cluster_offset) {
+ switch (ret) {
+ case QCOW2_CLUSTER_UNALLOCATED:
if (bs->backing_hd) {
/* read from the base image */
@@ -469,7 +470,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
/* Note: in this case, no need to wait */
qemu_iovec_memset(&hd_qiov, 0, 512 * cur_nr_sectors);
}
- } else if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
+ break;
+
+ case QCOW2_CLUSTER_COMPRESSED:
/* add AIO support for compressed blocks ? */
ret = qcow2_decompress_cluster(bs, cluster_offset);
if (ret < 0) {
@@ -479,7 +482,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
qemu_iovec_from_buffer(&hd_qiov,
s->cluster_cache + index_in_cluster * 512,
512 * cur_nr_sectors);
- } else {
+ break;
+
+ case QCOW2_CLUSTER_NORMAL:
if ((cluster_offset & 511) != 0) {
ret = -EIO;
goto fail;
@@ -520,6 +525,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
qemu_iovec_from_buffer(&hd_qiov, cluster_data,
512 * cur_nr_sectors);
}
+ break;
+
+ default:
+ g_assert_not_reached();
+ ret = -EIO;
+ goto fail;
}
remaining_sectors -= cur_nr_sectors;
diff --git a/block/qcow2.h b/block/qcow2.h
index e4ac366..bad5448 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -164,6 +164,16 @@ typedef struct QCowL2Meta
QLIST_ENTRY(QCowL2Meta) next_in_flight;
} QCowL2Meta;
+enum {
+ QCOW2_CLUSTER_UNALLOCATED,
+ QCOW2_CLUSTER_NORMAL,
+ QCOW2_CLUSTER_COMPRESSED,
+};
+
+#define L1E_OFFSET_MASK 0x00ffffffffffff00ULL
+#define L2E_OFFSET_MASK 0x00ffffffffffff00ULL
+#define L2E_COMPRESSED_OFFSET_SIZE_MASK 0x3fffffffffffffffULL
+
static inline int size_to_clusters(BDRVQcowState *s, int64_t size)
{
return (size + (s->cluster_size - 1)) >> s->cluster_bits;
@@ -181,6 +191,17 @@ static inline int64_t align_offset(int64_t offset, int n)
return offset;
}
+static inline int qcow2_get_cluster_type(uint64_t l2_entry)
+{
+ if (l2_entry & QCOW_OFLAG_COMPRESSED) {
+ return QCOW2_CLUSTER_COMPRESSED;
+ } else if (!(l2_entry & L2E_OFFSET_MASK)) {
+ return QCOW2_CLUSTER_UNALLOCATED;
+ } else {
+ return QCOW2_CLUSTER_NORMAL;
+ }
+}
+
// FIXME Need qcow2_ prefix to global functions
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 03/16] qcow2: Ignore reserved bits in count_contiguous_clusters()
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3 Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 02/16] qcow2: Ignore reserved bits in get_cluster_offset Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 04/16] qcow2: Fail write_compressed when overwriting data Kevin Wolf
` (12 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Until now, count_contiguous_clusters() had an argument for specifying
flags that should be ignored in the comparison, i.e. that are allowed to
change between contiguous clusters.
This patch changes the function so that it ignores all flags by default;
callers now pass the flags on which it should stop.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
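The effect of the inverted argument can be seen in a simplified, standalone
sketch. It is not the qemu function: count_contiguous() is a hypothetical name,
it works on host-endian entries, always starts at index 0 and expects
nb_clusters >= 1. Only the mask construction (stop_flags | L2E_OFFSET_MASK) is
taken from the patch.

```c
#include <stdint.h>

#define QCOW_OFLAG_COPIED     (1ULL << 63)
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)
#define L2E_OFFSET_MASK       0x00ffffffffffff00ULL

/*
 * Counts how many entries, starting with l2_entries[0], point to
 * consecutive host clusters. Bits outside of the offset are ignored unless
 * they are listed in stop_flags; a change in one of those flags ends the
 * run.
 */
static inline int count_contiguous(const uint64_t *l2_entries,
                                   int nb_clusters, uint64_t cluster_size,
                                   uint64_t stop_flags)
{
    uint64_t mask = stop_flags | L2E_OFFSET_MASK;
    uint64_t first = l2_entries[0] & mask;
    int i;

    if (!first) {
        return 0;
    }

    for (i = 0; i < nb_clusters; i++) {
        uint64_t expected = first + (uint64_t) i * cluster_size;
        if (expected != (l2_entries[i] & mask)) {
            break;
        }
    }

    return i;
}
```

With stop_flags = 0 a run continues across a change of the copied flag, while
passing QCOW_OFLAG_COPIED makes the run end where the flag changes.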
block/qcow2-cluster.c | 38 ++++++++++++++++++++++++++++----------
1 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 44d13de..9547fa9 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -246,28 +246,44 @@ fail:
return ret;
}
+/*
+ * Checks how many clusters in a given L2 table are contiguous in the image
+ * file. As soon as one of the flags in the bitmask stop_flags changes compared
+ * to the first cluster, the search is stopped and the cluster is not counted
+ * as contiguous. (This allows it, for example, to stop at the first compressed
+ * cluster which may require a different handling)
+ */
static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size,
- uint64_t *l2_table, uint64_t start, uint64_t mask)
+ uint64_t *l2_table, uint64_t start, uint64_t stop_flags)
{
int i;
- uint64_t offset = be64_to_cpu(l2_table[0]) & ~mask;
+ uint64_t mask = stop_flags | L2E_OFFSET_MASK;
+ uint64_t offset = be64_to_cpu(l2_table[0]) & mask;
if (!offset)
return 0;
- for (i = start; i < start + nb_clusters; i++)
- if (offset + (uint64_t) i * cluster_size != (be64_to_cpu(l2_table[i]) & ~mask))
+ for (i = start; i < start + nb_clusters; i++) {
+ uint64_t l2_entry = be64_to_cpu(l2_table[i]) & mask;
+ if (offset + (uint64_t) i * cluster_size != l2_entry) {
break;
+ }
+ }
return (i - start);
}
static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t *l2_table)
{
- int i = 0;
+ int i;
- while(nb_clusters-- && l2_table[i] == 0)
- i++;
+ for (i = 0; i < nb_clusters; i++) {
+ int type = qcow2_get_cluster_type(be64_to_cpu(l2_table[i]));
+
+ if (type != QCOW2_CLUSTER_UNALLOCATED) {
+ break;
+ }
+ }
return i;
}
@@ -444,7 +460,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
case QCOW2_CLUSTER_NORMAL:
/* how many allocated clusters ? */
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
- &l2_table[l2_index], 0, QCOW_OFLAG_COPIED);
+ &l2_table[l2_index], 0, QCOW_OFLAG_COMPRESSED);
*cluster_offset &= L2E_OFFSET_MASK;
break;
}
@@ -696,7 +712,8 @@ static int count_cow_clusters(BDRVQcowState *s, int nb_clusters,
while (i < nb_clusters) {
i += count_contiguous_clusters(nb_clusters - i, s->cluster_size,
- &l2_table[l2_index], i, 0);
+ &l2_table[l2_index], i,
+ QCOW_OFLAG_COPIED | QCOW_OFLAG_COMPRESSED);
if ((i >= nb_clusters) || be64_to_cpu(l2_table[l2_index + i])) {
break;
}
@@ -854,7 +871,8 @@ again:
if (cluster_offset & QCOW_OFLAG_COPIED) {
/* We keep all QCOW_OFLAG_COPIED clusters */
keep_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size,
- &l2_table[l2_index], 0, 0);
+ &l2_table[l2_index], 0,
+ QCOW_OFLAG_COPIED);
assert(keep_clusters <= nb_clusters);
nb_clusters -= keep_clusters;
} else {
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 04/16] qcow2: Fail write_compressed when overwriting data
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (2 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 03/16] qcow2: Ignore reserved bits in count_contiguous_clusters() Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 05/16] qcow2: Ignore reserved bits in L1/L2 entries Kevin Wolf
` (11 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
qcow2_alloc_compressed_cluster_offset() already fails if the copied flag
is set, because qcow2_write_compressed() doesn't perform COW as it would
have to do to allow this.
However, what we really want to check here is whether the cluster is
allocated or not. With internal snapshots the copied flag may not be set
on allocated clusters. Check the cluster offset instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
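The difference between the old and the new check shows up for a cluster that is
shared with an internal snapshot, i.e. allocated but without the copied flag.
A tiny sketch with hypothetical helper names, working on a host-endian L2
entry:

```c
#include <stdbool.h>
#include <stdint.h>

#define QCOW_OFLAG_COPIED (1ULL << 63)
#define L2E_OFFSET_MASK   0x00ffffffffffff00ULL

/* Old condition: refuse the compressed write only if the copied flag is set */
static inline bool old_check_refuses(uint64_t l2_entry)
{
    return (l2_entry & QCOW_OFLAG_COPIED) != 0;
}

/* New condition: refuse whenever the entry has a host cluster offset */
static inline bool new_check_refuses(uint64_t l2_entry)
{
    return (l2_entry & L2E_OFFSET_MASK) != 0;
}
```

For an entry such as 0x30000 (allocated, refcount > 1) only the new check
refuses the write.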
block/qcow2-cluster.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 9547fa9..b26028c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -571,15 +571,14 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
return 0;
}
+ /* Compression can't overwrite anything. Fail if the cluster was already
+ * allocated. */
cluster_offset = be64_to_cpu(l2_table[l2_index]);
- if (cluster_offset & QCOW_OFLAG_COPIED) {
+ if (cluster_offset & L2E_OFFSET_MASK) {
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
return 0;
}
- if (cluster_offset)
- qcow2_free_any_clusters(bs, cluster_offset, 1);
-
cluster_offset = qcow2_alloc_bytes(bs, compressed_size);
if (cluster_offset < 0) {
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 05/16] qcow2: Ignore reserved bits in L1/L2 entries
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (3 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 04/16] qcow2: Fail write_compressed when overwriting data Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 06/16] qcow2: Refactor qcow2_free_any_clusters Kevin Wolf
` (10 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
This changes the remaining places that assume that the only flags
are QCOW_OFLAG_COPIED and QCOW_OFLAG_COMPRESSED so that they properly
mask out reserved bits.
It does not convert bdrv_check yet.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
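Why the masking matters for the refcount code can be shown with a small
sketch. The helper names are hypothetical; the masks are the ones from
qcow2.h, and the entries are assumed to be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

#define QCOW_OFLAG_COPIED (1ULL << 63)
#define L1E_OFFSET_MASK   0x00ffffffffffff00ULL
#define L2E_OFFSET_MASK   0x00ffffffffffff00ULL

/* Host offset of the L2 table referenced by an L1 entry, flags removed */
static inline uint64_t l1_entry_to_l2_offset(uint64_t l1_entry)
{
    return l1_entry & L1E_OFFSET_MASK;
}

/* The copied flag has to be tested on the raw entry, not on the offset */
static inline bool l1_entry_is_copied(uint64_t l1_entry)
{
    return (l1_entry & QCOW_OFLAG_COPIED) != 0;
}

/*
 * Index of the host cluster that an L2 entry points to, as needed for
 * refcount lookups. Without the mask, flag bits would end up in the index.
 */
static inline uint64_t l2_entry_to_cluster_index(uint64_t l2_entry,
                                                 unsigned cluster_bits)
{
    return (l2_entry & L2E_OFFSET_MASK) >> cluster_bits;
}
```

With 64k clusters (cluster_bits = 16), the entry QCOW_OFLAG_COPIED | 0x30000
maps to cluster index 3; shifting the unmasked entry would give an index with
bit 47 set.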
block/qcow2-cluster.c | 26 +++++++++++++-------------
block/qcow2-refcount.c | 12 ++++++------
2 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b26028c..157a156 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -195,7 +195,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
l2_table = *table;
- if (old_l2_offset == 0) {
+ if ((old_l2_offset & L1E_OFFSET_MASK) == 0) {
/* if there was no old l2 table, clear the new table */
memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
} else {
@@ -203,7 +203,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
/* if there was an old l2 table, read it from the disk */
BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
- ret = qcow2_cache_get(bs, s->l2_table_cache, old_l2_offset,
+ ret = qcow2_cache_get(bs, s->l2_table_cache,
+ old_l2_offset & L1E_OFFSET_MASK,
(void**) &old_table);
if (ret < 0) {
goto fail;
@@ -508,13 +509,13 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
return ret;
}
}
- l2_offset = s->l1_table[l1_index];
+
+ l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
/* seek the l2 table of the given l2 offset */
- if (l2_offset & QCOW_OFLAG_COPIED) {
+ if (s->l1_table[l1_index] & QCOW_OFLAG_COPIED) {
/* load the l2 table in memory */
- l2_offset &= ~QCOW_OFLAG_COPIED;
ret = l2_load(bs, l2_offset, &l2_table);
if (ret < 0) {
return ret;
@@ -530,7 +531,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
if (l2_offset) {
qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
}
- l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
+ l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
}
/* find the cluster offset for the given disk offset */
@@ -687,8 +688,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
*/
if (j != 0) {
for (i = 0; i < j; i++) {
- qcow2_free_any_clusters(bs,
- be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
+ qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
}
}
@@ -867,7 +867,9 @@ again:
* Check how many clusters are already allocated and don't need COW, and how
* many need a new allocation.
*/
- if (cluster_offset & QCOW_OFLAG_COPIED) {
+ if (qcow2_get_cluster_type(cluster_offset) == QCOW2_CLUSTER_NORMAL
+ && (cluster_offset & QCOW_OFLAG_COPIED))
+ {
/* We keep all QCOW_OFLAG_COPIED clusters */
keep_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
@@ -886,7 +888,7 @@ again:
cluster_offset = 0;
}
- cluster_offset &= ~QCOW_OFLAG_COPIED;
+ cluster_offset &= L2E_OFFSET_MASK;
/* If there is something left to allocate, do that now */
*m = (QCowL2Meta) {
@@ -1041,9 +1043,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
uint64_t old_offset;
old_offset = be64_to_cpu(l2_table[l2_index + i]);
- old_offset &= ~QCOW_OFLAG_COPIED;
-
- if (old_offset == 0) {
+ if ((old_offset & L2E_OFFSET_MASK) == 0) {
continue;
}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index f39928a..7c95fe3 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -696,9 +696,8 @@ void qcow2_free_any_clusters(BlockDriverState *bs,
return;
}
- qcow2_free_clusters(bs, cluster_offset, nb_clusters << s->cluster_bits);
-
- return;
+ qcow2_free_clusters(bs, cluster_offset & L2E_OFFSET_MASK,
+ nb_clusters << s->cluster_bits);
}
@@ -758,7 +757,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
l2_offset = l1_table[i];
if (l2_offset) {
old_l2_offset = l2_offset;
- l2_offset &= ~QCOW_OFLAG_COPIED;
+ l2_offset &= L1E_OFFSET_MASK;
ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
(void**) &l2_table);
@@ -790,10 +789,11 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
/* compressed clusters are never modified */
refcount = 2;
} else {
+ uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
if (addend != 0) {
- refcount = update_cluster_refcount(bs, offset >> s->cluster_bits, addend);
+ refcount = update_cluster_refcount(bs, cluster_index, addend);
} else {
- refcount = get_refcount(bs, offset >> s->cluster_bits);
+ refcount = get_refcount(bs, cluster_index);
}
if (refcount < 0) {
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 06/16] qcow2: Refactor qcow2_free_any_clusters
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (4 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 05/16] qcow2: Ignore reserved bits in L1/L2 entries Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 07/16] qcow2: Simplify count_cow_clusters Kevin Wolf
` (9 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Zero clusters will add another cluster type. Refactor the open-coded
cluster type detection into a switch of QCOW2_CLUSTER_* options so that
the detection is in a single place. This makes it easier to add new
cluster types.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
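The compressed case of the new switch computes a byte range from the L2 entry.
A standalone sketch of just that computation follows. The expressions are the
ones from the patch, but csize_shift and csize_mask are passed in as
parameters, and deriving cluster_offset_mask as (1 << csize_shift) - 1 is an
assumption of this example, as is the struct.

```c
#include <stdint.h>

/* Hypothetical return type: the byte range occupied in the image file */
struct byte_range {
    uint64_t offset;
    uint64_t length;
};

static inline struct byte_range compressed_extent(uint64_t l2_entry,
                                                  unsigned csize_shift,
                                                  uint64_t csize_mask)
{
    /* Assumption: the offset occupies all bits below csize_shift */
    uint64_t cluster_offset_mask = (1ULL << csize_shift) - 1;

    /* The stored sector count is one less than the real count */
    uint64_t nb_csectors = ((l2_entry >> csize_shift) & csize_mask) + 1;

    struct byte_range r = {
        /* Round the start down to a 512 byte sector boundary */
        .offset = (l2_entry & cluster_offset_mask) & ~511ULL,
        .length = nb_csectors * 512,
    };

    return r;
}
```

As an example with hypothetical values csize_shift = 54 and csize_mask = 0xff,
an entry with offset 0x12345 and a stored size of 3 describes four sectors
(2048 bytes) starting at 0x12200.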
block/qcow2-refcount.c | 41 ++++++++++++++++++++++-------------------
1 files changed, 22 insertions(+), 19 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7c95fe3..2ec3aa7 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -673,31 +673,34 @@ void qcow2_free_clusters(BlockDriverState *bs,
}
/*
- * free_any_clusters
- *
- * free clusters according to its type: compressed or not
- *
+ * Free a cluster using its L2 entry (handles clusters of all types, e.g.
+ * normal cluster, compressed cluster, etc.)
*/
-
void qcow2_free_any_clusters(BlockDriverState *bs,
- uint64_t cluster_offset, int nb_clusters)
+ uint64_t l2_entry, int nb_clusters)
{
BDRVQcowState *s = bs->opaque;
- /* free the cluster */
-
- if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
- int nb_csectors;
- nb_csectors = ((cluster_offset >> s->csize_shift) &
- s->csize_mask) + 1;
- qcow2_free_clusters(bs,
- (cluster_offset & s->cluster_offset_mask) & ~511,
- nb_csectors * 512);
- return;
+ switch (qcow2_get_cluster_type(l2_entry)) {
+ case QCOW2_CLUSTER_COMPRESSED:
+ {
+ int nb_csectors;
+ nb_csectors = ((l2_entry >> s->csize_shift) &
+ s->csize_mask) + 1;
+ qcow2_free_clusters(bs,
+ (l2_entry & s->cluster_offset_mask) & ~511,
+ nb_csectors * 512);
+ }
+ break;
+ case QCOW2_CLUSTER_NORMAL:
+ qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
+ nb_clusters << s->cluster_bits);
+ break;
+ case QCOW2_CLUSTER_UNALLOCATED:
+ break;
+ default:
+ abort();
}
-
- qcow2_free_clusters(bs, cluster_offset & L2E_OFFSET_MASK,
- nb_clusters << s->cluster_bits);
}
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 07/16] qcow2: Simplify count_cow_clusters
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (5 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 06/16] qcow2: Refactor qcow2_free_any_clusters Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 08/16] qcow2: Ignore reserved bits in refcount table entries Kevin Wolf
` (8 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
count_cow_clusters() tries to reuse existing functions, and all it
achieves is to make things much more complicated than they really are:
Everything needs COW, unless it's a normal cluster with refcount 1.
This patch implements the obvious way of doing this, and by using
qcow2_get_cluster_type() it gets rid of all flag magic.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
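The rule "everything needs COW, unless it's a normal cluster with refcount 1"
can be exercised with a standalone sketch. It follows the new loop but is not
the qemu function: count_cow() is a hypothetical name, the entries are in host
byte order, and the cluster type helper is repeated from qcow2.h so that the
example compiles on its own.

```c
#include <stdint.h>

#define QCOW_OFLAG_COPIED     (1ULL << 63)
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)
#define L2E_OFFSET_MASK       0x00ffffffffffff00ULL

enum {
    QCOW2_CLUSTER_UNALLOCATED,
    QCOW2_CLUSTER_NORMAL,
    QCOW2_CLUSTER_COMPRESSED,
};

static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
    if (l2_entry & QCOW_OFLAG_COMPRESSED) {
        return QCOW2_CLUSTER_COMPRESSED;
    } else if (!(l2_entry & L2E_OFFSET_MASK)) {
        return QCOW2_CLUSTER_UNALLOCATED;
    } else {
        return QCOW2_CLUSTER_NORMAL;
    }
}

/*
 * Returns the number of leading entries that need a new allocation, i.e.
 * the index of the first normal cluster that has the copied flag set.
 */
static inline int count_cow(const uint64_t *l2_entries, int nb_clusters)
{
    int i;

    for (i = 0; i < nb_clusters; i++) {
        uint64_t l2_entry = l2_entries[i];

        if (qcow2_get_cluster_type(l2_entry) == QCOW2_CLUSTER_NORMAL
            && (l2_entry & QCOW_OFLAG_COPIED))
        {
            /* Refcount is exactly one: can be overwritten in place */
            break;
        }
    }

    return i;
}
```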
block/qcow2-cluster.c | 35 ++++++++++++++++-------------------
1 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 157a156..b120b92 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -706,30 +706,27 @@ err:
static int count_cow_clusters(BDRVQcowState *s, int nb_clusters,
uint64_t *l2_table, int l2_index)
{
- int i = 0;
- uint64_t cluster_offset;
+ int i;
- while (i < nb_clusters) {
- i += count_contiguous_clusters(nb_clusters - i, s->cluster_size,
- &l2_table[l2_index], i,
- QCOW_OFLAG_COPIED | QCOW_OFLAG_COMPRESSED);
- if ((i >= nb_clusters) || be64_to_cpu(l2_table[l2_index + i])) {
- break;
- }
+ for (i = 0; i < nb_clusters; i++) {
+ uint64_t l2_entry = be64_to_cpu(l2_table[l2_index + i]);
+ int cluster_type = qcow2_get_cluster_type(l2_entry);
- i += count_contiguous_free_clusters(nb_clusters - i,
- &l2_table[l2_index + i]);
- if (i >= nb_clusters) {
+ switch (cluster_type) {
+ case QCOW2_CLUSTER_NORMAL:
+ if (l2_entry & QCOW_OFLAG_COPIED) {
+ goto out;
+ }
break;
- }
-
- cluster_offset = be64_to_cpu(l2_table[l2_index + i]);
-
- if ((cluster_offset & QCOW_OFLAG_COPIED) ||
- (cluster_offset & QCOW_OFLAG_COMPRESSED))
+ case QCOW2_CLUSTER_UNALLOCATED:
+ case QCOW2_CLUSTER_COMPRESSED:
break;
+ default:
+ abort();
+ }
}
+out:
assert(i <= nb_clusters);
return i;
}
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
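The rule stated in patch 07 ("everything needs COW, unless it's a normal cluster with refcount 1") can be sketched outside of QEMU as follows. This is an illustration only. The names are shortened, the L2 entries are assumed to be in host byte order already (the real code converts with be64_to_cpu()), and the flag and mask values are copied from the series as posted.

```c
#include <assert.h>
#include <stdint.h>

#define OFLAG_COPIED     (1ULL << 63)
#define OFLAG_COMPRESSED (1ULL << 62)
#define L2E_OFFSET_MASK  0x00ffffffffffff00ULL

enum { CLUSTER_UNALLOCATED, CLUSTER_NORMAL, CLUSTER_COMPRESSED };

/* Same classification order as qcow2_get_cluster_type() before patch 11. */
static int cluster_type(uint64_t l2_entry)
{
    if (l2_entry & OFLAG_COMPRESSED) {
        return CLUSTER_COMPRESSED;
    } else if (!(l2_entry & L2E_OFFSET_MASK)) {
        return CLUSTER_UNALLOCATED;
    }
    return CLUSTER_NORMAL;
}

/*
 * Count leading entries that need copy-on-write. The scan stops at the
 * first normal cluster with OFLAG_COPIED set, i.e. refcount 1.
 */
static int count_cow(const uint64_t *l2, int nb_clusters)
{
    int i;

    for (i = 0; i < nb_clusters; i++) {
        if (cluster_type(l2[i]) == CLUSTER_NORMAL &&
            (l2[i] & OFLAG_COPIED)) {
            break;
        }
    }
    return i;
}
```

For example, the run { unallocated, compressed, normal without COPIED, normal with COPIED } yields 3: the first three entries need COW and the fourth can be written in place.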
* [Qemu-devel] [RFC PATCH 08/16] qcow2: Ignore reserved bits in refcount table entries
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (6 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 07/16] qcow2: Simplify count_cow_clusters Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 09/16] qcow2: Ignore reserved bits in check_refcounts Kevin Wolf
` (7 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-refcount.c | 2 +-
block/qcow2.h | 2 ++
2 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2ec3aa7..50bf44e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -167,7 +167,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
if (refcount_table_index < s->refcount_table_size) {
uint64_t refcount_block_offset =
- s->refcount_table[refcount_table_index];
+ s->refcount_table[refcount_table_index] & REFT_OFFSET_MASK;
/* If it's already there, we're done */
if (refcount_block_offset) {
diff --git a/block/qcow2.h b/block/qcow2.h
index bad5448..102578e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -174,6 +174,8 @@ enum {
#define L2E_OFFSET_MASK 0x00ffffffffffff00ULL
#define L2E_COMPRESSED_OFFSET_SIZE_MASK 0x3fffffffffffffffULL
+#define REFT_OFFSET_MASK 0xffffffffffffff00ULL
+
static inline int size_to_clusters(BDRVQcowState *s, int64_t size)
{
return (size + (s->cluster_size - 1)) >> s->cluster_bits;
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
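Patch 08 has no description, so here is a small sketch of what the masking changes. The helper name refblock_offset() is hypothetical, and the mask value is copied from the patch as posted.

```c
#include <assert.h>
#include <stdint.h>

/* Value as posted in this RFC. */
#define REFT_OFFSET_MASK 0xffffffffffffff00ULL

/* Extract the refcount block offset, ignoring reserved low bits. */
static uint64_t refblock_offset(uint64_t reftable_entry)
{
    return reftable_entry & REFT_OFFSET_MASK;
}
```

With the mask applied, an entry that has only reserved bits set yields offset 0, so the "if it's already there" check in alloc_refcount_block() treats it as absent.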
* [Qemu-devel] [RFC PATCH 09/16] qcow2: Ignore reserved bits in check_refcounts
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (7 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 08/16] qcow2: Ignore reserved bits in refcount table entries Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 10/16] qcow2: Version 3 images Kevin Wolf
` (6 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Also don't infer the cluster type directly from the L2 entries, but use
qcow2_get_cluster_type() to keep everything in a single place.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-refcount.c | 98 ++++++++++++++++++++++++++---------------------
1 files changed, 54 insertions(+), 44 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 50bf44e..c5d3171 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -934,7 +934,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
int check_copied)
{
BDRVQcowState *s = bs->opaque;
- uint64_t *l2_table, offset;
+ uint64_t *l2_table, l2_entry;
int i, l2_size, nb_csectors, refcount;
/* Read L2 table from disk */
@@ -946,54 +946,64 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
/* Do the actual checks */
for(i = 0; i < s->l2_size; i++) {
- offset = be64_to_cpu(l2_table[i]);
- if (offset != 0) {
- if (offset & QCOW_OFLAG_COMPRESSED) {
- /* Compressed clusters don't have QCOW_OFLAG_COPIED */
- if (offset & QCOW_OFLAG_COPIED) {
- fprintf(stderr, "ERROR: cluster %" PRId64 ": "
- "copied flag must never be set for compressed "
- "clusters\n", offset >> s->cluster_bits);
- offset &= ~QCOW_OFLAG_COPIED;
- res->corruptions++;
- }
+ l2_entry = be64_to_cpu(l2_table[i]);
+
+ switch (qcow2_get_cluster_type(l2_entry)) {
+ case QCOW2_CLUSTER_COMPRESSED:
+ /* Compressed clusters don't have QCOW_OFLAG_COPIED */
+ if (l2_entry & QCOW_OFLAG_COPIED) {
+ fprintf(stderr, "ERROR: cluster %" PRId64 ": "
+ "copied flag must never be set for compressed "
+ "clusters\n", l2_entry >> s->cluster_bits);
+ l2_entry &= ~QCOW_OFLAG_COPIED;
+ res->corruptions++;
+ }
- /* Mark cluster as used */
- nb_csectors = ((offset >> s->csize_shift) &
- s->csize_mask) + 1;
- offset &= s->cluster_offset_mask;
- inc_refcounts(bs, res, refcount_table, refcount_table_size,
- offset & ~511, nb_csectors * 512);
- } else {
- /* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
- if (check_copied) {
- uint64_t entry = offset;
- offset &= ~QCOW_OFLAG_COPIED;
- refcount = get_refcount(bs, offset >> s->cluster_bits);
- if (refcount < 0) {
- fprintf(stderr, "Can't get refcount for offset %"
- PRIx64 ": %s\n", entry, strerror(-refcount));
- goto fail;
- }
- if ((refcount == 1) != ((entry & QCOW_OFLAG_COPIED) != 0)) {
- fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
- PRIx64 " refcount=%d\n", entry, refcount);
- res->corruptions++;
- }
- }
+ /* Mark cluster as used */
+ nb_csectors = ((l2_entry >> s->csize_shift) &
+ s->csize_mask) + 1;
+ l2_entry &= s->cluster_offset_mask;
+ inc_refcounts(bs, res, refcount_table, refcount_table_size,
+ l2_entry & ~511, nb_csectors * 512);
+ break;
- /* Mark cluster as used */
- offset &= ~QCOW_OFLAG_COPIED;
- inc_refcounts(bs, res, refcount_table,refcount_table_size,
- offset, s->cluster_size);
+ case QCOW2_CLUSTER_NORMAL:
+ {
+ /* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
+ uint64_t offset = l2_entry & L2E_OFFSET_MASK;
- /* Correct offsets are cluster aligned */
- if (offset & (s->cluster_size - 1)) {
- fprintf(stderr, "ERROR offset=%" PRIx64 ": Cluster is not "
- "properly aligned; L2 entry corrupted.\n", offset);
+ if (check_copied) {
+ refcount = get_refcount(bs, offset >> s->cluster_bits);
+ if (refcount < 0) {
+ fprintf(stderr, "Can't get refcount for offset %"
+ PRIx64 ": %s\n", l2_entry, strerror(-refcount));
+ goto fail;
+ }
+ if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
+ fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
+ PRIx64 " refcount=%d\n", l2_entry, refcount);
res->corruptions++;
}
}
+
+ /* Mark cluster as used */
+ inc_refcounts(bs, res, refcount_table, refcount_table_size,
+ offset, s->cluster_size);
+
+ /* Correct offsets are cluster aligned */
+ if (offset & (s->cluster_size - 1)) {
+ fprintf(stderr, "ERROR offset=%" PRIx64 ": Cluster is not "
+ "properly aligned; L2 entry corrupted.\n", offset);
+ res->corruptions++;
+ }
+ break;
+ }
+
+ case QCOW2_CLUSTER_UNALLOCATED:
+ break;
+
+ default:
+ abort();
}
}
@@ -1064,7 +1074,7 @@ static int check_refcounts_l1(BlockDriverState *bs,
}
/* Mark L2 table as used */
- l2_offset &= ~QCOW_OFLAG_COPIED;
+ l2_offset &= L1E_OFFSET_MASK;
inc_refcounts(bs, res, refcount_table, refcount_table_size,
l2_offset, s->cluster_size);
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
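The two checks that check_refcounts_l2() applies to a normal cluster in patch 09 can be isolated like this. It is a sketch under assumptions: check_normal_entry() is a hypothetical helper, the refcount is passed in instead of being looked up with get_refcount(), and the return value stands in for res->corruptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OFLAG_COPIED    (1ULL << 63)
#define L2E_OFFSET_MASK 0x00ffffffffffff00ULL

/* Return the number of problems found in one normal-cluster L2 entry. */
static int check_normal_entry(uint64_t l2_entry, int refcount,
                              uint64_t cluster_size)
{
    int corruptions = 0;
    uint64_t offset = l2_entry & L2E_OFFSET_MASK;
    bool copied = (l2_entry & OFLAG_COPIED) != 0;

    /* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
    if ((refcount == 1) != copied) {
        corruptions++;
    }

    /* Correct offsets are cluster aligned */
    if (offset & (cluster_size - 1)) {
        corruptions++;
    }
    return corruptions;
}
```

Masking with L2E_OFFSET_MASK first means reserved bits no longer leak into the refcount lookup or the alignment check, which is the point of the patch.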
* [Qemu-devel] [RFC PATCH 10/16] qcow2: Version 3 images
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (8 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 09/16] qcow2: Ignore reserved bits in check_refcounts Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 11/16] qcow2: Support reading zero clusters Kevin Wolf
` (5 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
This adds the basic infrastructure to qcow2 to handle version 3 images.
It includes code to create v3 images, to allow header updates for v3 images
and to check feature bits.
Support for zero clusters is still missing, so this is not a fully
compliant implementation of v3 yet.
The default for creating new images stays at v2 for now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2.c | 146 +++++++++++++++++++++++++++++++++++++++++++++++++++------
block/qcow2.h | 17 ++++++-
block_int.h | 1 +
3 files changed, 149 insertions(+), 15 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 4c82adc..c1f113d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -61,7 +61,7 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
if (buf_size >= sizeof(QCowHeader) &&
be32_to_cpu(cow_header->magic) == QCOW_MAGIC &&
- be32_to_cpu(cow_header->version) >= QCOW_VERSION)
+ be32_to_cpu(cow_header->version) >= 2)
return 100;
else
return 0;
@@ -169,6 +169,19 @@ static void cleanup_unknown_header_ext(BlockDriverState *bs)
}
}
+static void report_unsupported(BlockDriverState *bs, const char *fmt, ...)
+{
+ char msg[64];
+ va_list ap;
+
+ va_start(ap, fmt);
+ vsnprintf(msg, sizeof(msg), fmt, ap);
+ va_end(ap);
+
+ qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+ bs->device_name, "qcow2", msg);
+}
+
static int qcow2_open(BlockDriverState *bs, int flags)
{
BDRVQcowState *s = bs->opaque;
@@ -199,14 +212,64 @@ static int qcow2_open(BlockDriverState *bs, int flags)
ret = -EINVAL;
goto fail;
}
- if (header.version != QCOW_VERSION) {
- char version[64];
- snprintf(version, sizeof(version), "QCOW version %d", header.version);
- qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
- bs->device_name, "qcow2", version);
+ if (header.version < 2 || header.version > 3) {
+ report_unsupported(bs, "QCOW version %d", header.version);
+ ret = -ENOTSUP;
+ goto fail;
+ }
+
+ s->qcow_version = header.version;
+
+ /* Initialise version 3 header fields */
+ if (header.version == 2) {
+ header.incompatible_features = 0;
+ header.compatible_features = 0;
+ header.autoclear_features = 0;
+ header.refcount_bits = 16;
+ header.header_length = 72;
+ } else {
+ be64_to_cpus(&header.incompatible_features);
+ be64_to_cpus(&header.compatible_features);
+ be64_to_cpus(&header.autoclear_features);
+ be32_to_cpus(&header.refcount_bits);
+ be32_to_cpus(&header.header_length);
+ }
+
+ if (header.header_length > sizeof(header)) {
+ s->unknown_header_fields_size = header.header_length - sizeof(header);
+ s->unknown_header_fields = g_malloc(s->unknown_header_fields_size);
+ ret = bdrv_pread(bs->file, sizeof(header), s->unknown_header_fields,
+ s->unknown_header_fields_size);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
+
+ /* Handle feature bits */
+ s->incompatible_features = header.incompatible_features;
+ s->compatible_features = header.compatible_features;
+ s->autoclear_features = header.autoclear_features;
+
+ if (s->incompatible_features != 0) {
+ report_unsupported(bs, "incompatible features mask %" PRIx64,
+ header.incompatible_features);
+ ret = -ENOTSUP;
+ goto fail;
+ }
+
+ if (!bs->read_only && s->autoclear_features != 0) {
+ s->autoclear_features = 0;
+ qcow2_update_header(bs);
+ }
+
+ /* Check support for various header values */
+ if (header.refcount_bits != 16) {
+ report_unsupported(bs, "%d bit reference counts",
+ header.refcount_bits);
ret = -ENOTSUP;
goto fail;
}
+
if (header.cluster_bits < MIN_CLUSTER_BITS ||
header.cluster_bits > MAX_CLUSTER_BITS) {
ret = -EINVAL;
@@ -285,7 +348,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
} else {
ext_end = s->cluster_size;
}
- if (qcow2_read_extensions(bs, sizeof(header), ext_end)) {
+ if (qcow2_read_extensions(bs, header.header_length, ext_end)) {
ret = -EINVAL;
goto fail;
}
@@ -321,6 +384,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
return ret;
fail:
+ g_free(s->unknown_header_fields);
cleanup_unknown_header_ext(bs);
qcow2_free_snapshots(bs);
qcow2_refcount_close(bs);
@@ -682,7 +746,9 @@ static void qcow2_close(BlockDriverState *bs)
qcow2_cache_destroy(bs, s->l2_table_cache);
qcow2_cache_destroy(bs, s->refcount_block_cache);
+ g_free(s->unknown_header_fields);
cleanup_unknown_header_ext(bs);
+
g_free(s->cluster_cache);
qemu_vfree(s->cluster_data);
qcow2_refcount_close(bs);
@@ -756,10 +822,10 @@ int qcow2_update_header(BlockDriverState *bs)
int ret;
uint64_t total_size;
uint32_t refcount_table_clusters;
+ size_t header_length;
Qcow2UnknownHeaderExtension *uext;
buf = qemu_blockalign(bs, buflen);
- memset(buf, 0, s->cluster_size);
/* Header structure */
header = (QCowHeader*) buf;
@@ -769,12 +835,14 @@ int qcow2_update_header(BlockDriverState *bs)
goto fail;
}
+ header_length = sizeof(*header) + s->unknown_header_fields_size;
total_size = bs->total_sectors * BDRV_SECTOR_SIZE;
refcount_table_clusters = s->refcount_table_size >> (s->cluster_bits - 3);
*header = (QCowHeader) {
+ /* Version 2 fields */
.magic = cpu_to_be32(QCOW_MAGIC),
- .version = cpu_to_be32(QCOW_VERSION),
+ .version = cpu_to_be32(s->qcow_version),
.backing_file_offset = 0,
.backing_file_size = 0,
.cluster_bits = cpu_to_be32(s->cluster_bits),
@@ -786,10 +854,42 @@ int qcow2_update_header(BlockDriverState *bs)
.refcount_table_clusters = cpu_to_be32(refcount_table_clusters),
.nb_snapshots = cpu_to_be32(s->nb_snapshots),
.snapshots_offset = cpu_to_be64(s->snapshots_offset),
+
+ /* Version 3 fields */
+ .incompatible_features = cpu_to_be64(s->incompatible_features),
+ .compatible_features = cpu_to_be64(s->compatible_features),
+ .autoclear_features = cpu_to_be64(s->autoclear_features),
+ .refcount_bits = cpu_to_be32(8 << REFCOUNT_SHIFT),
+ .header_length = cpu_to_be32(header_length),
};
- buf += sizeof(*header);
- buflen -= sizeof(*header);
+ /* For older versions, write a shorter header */
+ switch (s->qcow_version) {
+ case 2:
+ ret = offsetof(QCowHeader, incompatible_features);
+ break;
+ case 3:
+ ret = sizeof(*header);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ buf += ret;
+ buflen -= ret;
+ memset(buf, 0, buflen);
+
+ /* Preserve any unknown field in the header */
+ if (s->unknown_header_fields_size) {
+ if (buflen < s->unknown_header_fields_size) {
+ ret = -ENOSPC;
+ goto fail;
+ }
+
+ memcpy(buf, s->unknown_header_fields, s->unknown_header_fields_size);
+ buf += s->unknown_header_fields_size;
+ buflen -= s->unknown_header_fields_size;
+ }
/* Backing file format header extension */
if (*bs->backing_format) {
@@ -921,7 +1021,7 @@ static int preallocate(BlockDriverState *bs)
static int qcow2_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, int prealloc,
- QEMUOptionParameter *options)
+ QEMUOptionParameter *options, int version)
{
/* Calculate cluster_bits */
int cluster_bits;
@@ -965,13 +1065,15 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* Write the header */
memset(&header, 0, sizeof(header));
header.magic = cpu_to_be32(QCOW_MAGIC);
- header.version = cpu_to_be32(QCOW_VERSION);
+ header.version = cpu_to_be32(version);
header.cluster_bits = cpu_to_be32(cluster_bits);
header.size = cpu_to_be64(0);
header.l1_table_offset = cpu_to_be64(0);
header.l1_size = cpu_to_be32(0);
header.refcount_table_offset = cpu_to_be64(cluster_size);
header.refcount_table_clusters = cpu_to_be32(1);
+ header.refcount_bits = cpu_to_be32(8 << REFCOUNT_SHIFT);
+ header.header_length = cpu_to_be32(sizeof(header));
if (flags & BLOCK_FLAG_ENCRYPT) {
header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
@@ -1053,6 +1155,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
int flags = 0;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
int prealloc = 0;
+ int version = 2;
/* Read out options */
while (options && options->name) {
@@ -1078,6 +1181,16 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
options->value.s);
return -EINVAL;
}
+ } else if (!strcmp(options->name, BLOCK_OPT_COMPAT_LEVEL)) {
+ if (!options->value.s || !strcmp(options->value.s, "0.10")) {
+ version = 2;
+ } else if (!strcmp(options->value.s, "1.1")) {
+ version = 3;
+ } else {
+ fprintf(stderr, "Invalid compatibility level: '%s'\n",
+ options->value.s);
+ return -EINVAL;
+ }
}
options++;
}
@@ -1089,7 +1202,7 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options)
}
return qcow2_create2(filename, sectors, backing_file, backing_fmt, flags,
- cluster_size, prealloc, options);
+ cluster_size, prealloc, options, version);
}
static int qcow2_make_empty(BlockDriverState *bs)
@@ -1341,6 +1454,11 @@ static QEMUOptionParameter qcow2_create_options[] = {
.help = "Virtual disk size"
},
{
+ .name = BLOCK_OPT_COMPAT_LEVEL,
+ .type = OPT_STRING,
+ .help = "Compatibility level (0.10 or 1.1)"
+ },
+ {
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
diff --git a/block/qcow2.h b/block/qcow2.h
index 102578e..71b8d88 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -33,7 +33,6 @@
//#define DEBUG_EXT
#define QCOW_MAGIC (('Q' << 24) | ('F' << 16) | ('I' << 8) | 0xfb)
-#define QCOW_VERSION 2
#define QCOW_CRYPT_NONE 0
#define QCOW_CRYPT_AES 1
@@ -71,6 +70,14 @@ typedef struct QCowHeader {
uint32_t refcount_table_clusters;
uint32_t nb_snapshots;
uint64_t snapshots_offset;
+
+ /* The following fields are only valid for version >= 3 */
+ uint64_t incompatible_features;
+ uint64_t compatible_features;
+ uint64_t autoclear_features;
+
+ uint32_t refcount_bits;
+ uint32_t header_length;
} QCowHeader;
typedef struct QCowSnapshot {
@@ -134,6 +141,14 @@ typedef struct BDRVQcowState {
QCowSnapshot *snapshots;
int flags;
+ int qcow_version;
+
+ uint64_t incompatible_features;
+ uint64_t compatible_features;
+ uint64_t autoclear_features;
+
+ size_t unknown_header_fields_size;
+ void* unknown_header_fields;
QLIST_HEAD(, Qcow2UnknownHeaderExtension) unknown_header_ext;
} BDRVQcowState;
diff --git a/block_int.h b/block_int.h
index 22c86a5..909cb2b 100644
--- a/block_int.h
+++ b/block_int.h
@@ -50,6 +50,7 @@
#define BLOCK_OPT_TABLE_SIZE "table_size"
#define BLOCK_OPT_PREALLOC "preallocation"
#define BLOCK_OPT_SUBFMT "subformat"
+#define BLOCK_OPT_COMPAT_LEVEL "compat"
typedef struct BdrvTrackedRequest BdrvTrackedRequest;
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
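The open-time decisions in patch 10 can be summarised in a standalone sketch. The HeaderInfo type and both function names are hypothetical. Byte swapping, unknown header fields and the header rewrite that follows clearing the autoclear bits are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map the "compat" creation option to an on-disk version number. */
static int compat_to_version(const char *value)
{
    if (value == NULL || strcmp(value, "0.10") == 0) {
        return 2;
    } else if (strcmp(value, "1.1") == 0) {
        return 3;
    }
    return -EINVAL;
}

/* Hypothetical subset of the header fields that matter here. */
typedef struct {
    uint32_t version;
    uint64_t incompatible_features;
    uint64_t autoclear_features;
    uint32_t refcount_bits;
} HeaderInfo;

/* Mirror the order of checks in qcow2_open(): version, features, refcounts. */
static int check_header(HeaderInfo *h, int read_only)
{
    if (h->version < 2 || h->version > 3) {
        return -ENOTSUP;
    }
    if (h->version == 2) {
        /* v2 images carry no v3 fields, so fill in the implied values. */
        h->incompatible_features = 0;
        h->autoclear_features = 0;
        h->refcount_bits = 16;
    }
    if (h->incompatible_features != 0) {
        return -ENOTSUP;    /* no incompatible feature is understood yet */
    }
    if (!read_only) {
        h->autoclear_features = 0;  /* the real code also rewrites the header */
    }
    if (h->refcount_bits != 16) {
        return -ENOTSUP;
    }
    return 0;
}
```

Compatible feature bits do not appear in the sketch because an implementation may ignore them safely. Only incompatible bits block opening the image.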
* [Qemu-devel] [RFC PATCH 11/16] qcow2: Support reading zero clusters
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (9 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 10/16] qcow2: Version 3 images Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 12/16] qcow2: Support for feature table header extension Kevin Wolf
` (4 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
This adds support for reading zero clusters in version 3 images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2-cluster.c | 17 +++++++++++++----
block/qcow2-refcount.c | 7 +++++++
block/qcow2.c | 8 ++++++++
block/qcow2.h | 5 +++++
4 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b120b92..4853f1f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -453,6 +453,12 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
c = 1;
*cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
break;
+ case QCOW2_CLUSTER_ZERO:
+ c = count_contiguous_clusters(nb_clusters, s->cluster_size,
+ &l2_table[l2_index], 0,
+ QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
+ *cluster_offset = 0;
+ break;
case QCOW2_CLUSTER_UNALLOCATED:
/* how many empty clusters ? */
c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]);
@@ -461,7 +467,8 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
case QCOW2_CLUSTER_NORMAL:
/* how many allocated clusters ? */
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
- &l2_table[l2_index], 0, QCOW_OFLAG_COMPRESSED);
+ &l2_table[l2_index], 0,
+ QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
break;
}
@@ -720,6 +727,7 @@ static int count_cow_clusters(BDRVQcowState *s, int nb_clusters,
break;
case QCOW2_CLUSTER_UNALLOCATED:
case QCOW2_CLUSTER_COMPRESSED:
+ case QCOW2_CLUSTER_ZERO:
break;
default:
abort();
@@ -868,9 +876,10 @@ again:
&& (cluster_offset & QCOW_OFLAG_COPIED))
{
/* We keep all QCOW_OFLAG_COPIED clusters */
- keep_clusters = count_contiguous_clusters(nb_clusters, s->cluster_size,
- &l2_table[l2_index], 0,
- QCOW_OFLAG_COPIED);
+ keep_clusters =
+ count_contiguous_clusters(nb_clusters, s->cluster_size,
+ &l2_table[l2_index], 0,
+ QCOW_OFLAG_COPIED | QCOW_OFLAG_ZERO);
assert(keep_clusters <= nb_clusters);
nb_clusters -= keep_clusters;
} else {
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index c5d3171..60e2fb5 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -697,6 +697,7 @@ void qcow2_free_any_clusters(BlockDriverState *bs,
nb_clusters << s->cluster_bits);
break;
case QCOW2_CLUSTER_UNALLOCATED:
+ case QCOW2_CLUSTER_ZERO:
break;
default:
abort();
@@ -967,6 +968,12 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
l2_entry & ~511, nb_csectors * 512);
break;
+ case QCOW2_CLUSTER_ZERO:
+ if ((l2_entry & L2E_OFFSET_MASK) == 0) {
+ break;
+ }
+ /* fall through */
+
case QCOW2_CLUSTER_NORMAL:
{
/* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
diff --git a/block/qcow2.c b/block/qcow2.c
index c1f113d..aef8282 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -536,6 +536,14 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
break;
+ case QCOW2_CLUSTER_ZERO:
+ if (s->qcow_version < 3) {
+ ret = -EIO;
+ goto fail;
+ }
+ qemu_iovec_memset(&hd_qiov, 0, 512 * cur_nr_sectors);
+ break;
+
case QCOW2_CLUSTER_COMPRESSED:
/* add AIO support for compressed blocks ? */
ret = qcow2_decompress_cluster(bs, cluster_offset);
diff --git a/block/qcow2.h b/block/qcow2.h
index 71b8d88..64cfa69 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -43,6 +43,8 @@
#define QCOW_OFLAG_COPIED (1LL << 63)
/* indicate that the cluster is compressed (they never have the copied flag) */
#define QCOW_OFLAG_COMPRESSED (1LL << 62)
+/* The cluster reads as all zeros */
+#define QCOW_OFLAG_ZERO (1LL << 0)
#define REFCOUNT_SHIFT 1 /* refcount size is 2 bytes */
@@ -183,6 +185,7 @@ enum {
QCOW2_CLUSTER_UNALLOCATED,
QCOW2_CLUSTER_NORMAL,
QCOW2_CLUSTER_COMPRESSED,
+ QCOW2_CLUSTER_ZERO
};
#define L1E_OFFSET_MASK 0x00ffffffffffff00ULL
@@ -212,6 +215,8 @@ static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
return QCOW2_CLUSTER_COMPRESSED;
+ } else if (l2_entry & QCOW_OFLAG_ZERO) {
+ return QCOW2_CLUSTER_ZERO;
} else if (!(l2_entry & L2E_OFFSET_MASK)) {
return QCOW2_CLUSTER_UNALLOCATED;
} else {
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
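After patch 11 the cluster classification has four outcomes. The sketch below reproduces the order of the tests in qcow2_get_cluster_type() and the read path for zero clusters. Names are shortened, and read_zero_cluster() is a hypothetical stand-in for the qemu_iovec_memset() call in qcow2_co_readv().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OFLAG_COMPRESSED (1ULL << 62)
#define OFLAG_ZERO       (1ULL << 0)
#define L2E_OFFSET_MASK  0x00ffffffffffff00ULL

enum { UNALLOCATED, NORMAL, COMPRESSED, ZERO };

/* The order matters: the zero flag wins over a zero or nonzero offset. */
static int cluster_type(uint64_t l2_entry)
{
    if (l2_entry & OFLAG_COMPRESSED) {
        return COMPRESSED;
    } else if (l2_entry & OFLAG_ZERO) {
        return ZERO;
    } else if (!(l2_entry & L2E_OFFSET_MASK)) {
        return UNALLOCATED;
    }
    return NORMAL;
}

/* A zero cluster reads as zeros, but only version 3 images may have one. */
static int read_zero_cluster(int qcow_version, uint8_t *buf, size_t len)
{
    if (qcow_version < 3) {
        return -EIO;
    }
    memset(buf, 0, len);
    return 0;
}
```

A zero cluster may still carry a host offset, which is why check_refcounts_l2() falls through to the NORMAL case when the offset is nonzero.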
* [Qemu-devel] [RFC PATCH 12/16] qcow2: Support for feature table header extension
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (10 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 11/16] qcow2: Support reading zero clusters Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 13/16] qemu-iotests: add a simple test for write_zeroes Kevin Wolf
` (3 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Instead of printing an ugly bitmask, qemu can now print a more helpful
string even for features it does not know yet.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/qcow2.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
block/qcow2.h | 12 +++++++++
2 files changed, 77 insertions(+), 9 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index aef8282..002e138 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -54,6 +54,7 @@ typedef struct {
} QCowExtension;
#define QCOW2_EXT_MAGIC_END 0
#define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
+#define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
{
@@ -76,7 +77,7 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
* return 0 upon success, non-0 otherwise
*/
static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
- uint64_t end_offset)
+ uint64_t end_offset, void **p_feature_table)
{
BDRVQcowState *s = bs->opaque;
QCowExtension ext;
@@ -134,6 +135,18 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
#endif
break;
+ case QCOW2_EXT_MAGIC_FEATURE_TABLE:
+ if (p_feature_table != NULL) {
+ void* feature_table = g_malloc0(ext.len + 2 * sizeof(Qcow2Feature));
+ ret = bdrv_pread(bs->file, offset, feature_table, ext.len);
+ if (ret < 0) {
+ return ret;
+ }
+
+ *p_feature_table = feature_table;
+ }
+ break;
+
default:
/* unknown magic - save it in case we need to rewrite the header */
{
@@ -182,6 +195,24 @@ static void report_unsupported(BlockDriverState *bs, const char *fmt, ...)
bs->device_name, "qcow2", msg);
}
+static void report_unsupported_feature(BlockDriverState *bs,
+ Qcow2Feature *table, uint64_t mask)
+{
+ while (table && table->name[0] != '\0') {
+ if (table->type == QCOW2_FEAT_TYPE_INCOMPATIBLE) {
+ if (mask & (1ULL << table->bit)) {
+ report_unsupported(bs, "%.46s", table->name);
+ mask &= ~(1ULL << table->bit);
+ }
+ }
+ table++;
+ }
+
+ if (mask) {
+ report_unsupported(bs, "Unknown incompatible feature: %" PRIx64, mask);
+ }
+}
+
static int qcow2_open(BlockDriverState *bs, int flags)
{
BDRVQcowState *s = bs->opaque;
@@ -245,14 +276,23 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
}
+ if (header.backing_file_offset) {
+ ext_end = header.backing_file_offset;
+ } else {
+ ext_end = 1 << header.cluster_bits;
+ }
+
/* Handle feature bits */
s->incompatible_features = header.incompatible_features;
s->compatible_features = header.compatible_features;
s->autoclear_features = header.autoclear_features;
if (s->incompatible_features != 0) {
- report_unsupported(bs, "incompatible features mask %" PRIx64,
- header.incompatible_features);
+ void *feature_table = NULL;
+ qcow2_read_extensions(bs, header.header_length, ext_end,
+ &feature_table);
+ report_unsupported_feature(bs, feature_table,
+ s->incompatible_features);
ret = -ENOTSUP;
goto fail;
}
@@ -343,12 +383,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
QLIST_INIT(&s->cluster_allocs);
/* read qcow2 extensions */
- if (header.backing_file_offset) {
- ext_end = header.backing_file_offset;
- } else {
- ext_end = s->cluster_size;
- }
- if (qcow2_read_extensions(bs, header.header_length, ext_end)) {
+ if (qcow2_read_extensions(bs, header.header_length, ext_end, NULL)) {
ret = -EINVAL;
goto fail;
}
@@ -912,6 +947,27 @@ int qcow2_update_header(BlockDriverState *bs)
buflen -= ret;
}
+ /* Feature table */
+ Qcow2Feature features[] = {
+ {
+ .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+ .bit = 0,
+ .name = "Reference count recovery",
+ }, {
+ .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+ .bit = 1,
+ .name = "Subclusters",
+ }
+ };
+
+ ret = header_ext_add(buf, QCOW2_EXT_MAGIC_FEATURE_TABLE,
+ features, sizeof(features), buflen);
+ if (ret < 0) {
+ goto fail;
+ }
+ buf += ret;
+ buflen -= ret;
+
/* Keep unknown header extensions */
QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
diff --git a/block/qcow2.h b/block/qcow2.h
index 64cfa69..1adf01d 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -103,6 +103,18 @@ typedef struct Qcow2UnknownHeaderExtension {
uint8_t data[];
} Qcow2UnknownHeaderExtension;
+enum {
+ QCOW2_FEAT_TYPE_INCOMPATIBLE = 0,
+ QCOW2_FEAT_TYPE_COMPATIBLE = 1,
+ QCOW2_FEAT_TYPE_AUTOCLEAR = 2,
+};
+
+typedef struct Qcow2Feature {
+ uint8_t type;
+ uint8_t bit;
+ char name[46];
+} QEMU_PACKED Qcow2Feature;
+
typedef struct BDRVQcowState {
int cluster_bits;
int cluster_size;
--
1.7.6.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
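The lookup in report_unsupported_feature() can be tried in isolation with the sketch below. It makes two assumptions. First, name_incompatible() is a hypothetical helper that collects names into a caller-supplied buffer (outlen must be at least 1) instead of calling qerror_report(). Second, the table ends with an entry whose name is empty, which the g_malloc0() padding in qcow2_read_extensions() provides. The shift uses a 64-bit constant so that bits 32 to 63 of the mask are handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { FEAT_INCOMPATIBLE = 0, FEAT_COMPATIBLE = 1, FEAT_AUTOCLEAR = 2 };

/* 48 bytes; every member has alignment 1, so no padding is inserted. */
typedef struct {
    uint8_t type;
    uint8_t bit;
    char name[46];
} Feature;

/*
 * Append the names of all incompatible features present in mask to out,
 * and return the bits of mask for which the table has no name.
 */
static uint64_t name_incompatible(const Feature *table, uint64_t mask,
                                  char *out, size_t outlen)
{
    out[0] = '\0';
    for (; table && table->name[0] != '\0'; table++) {
        uint64_t bit = 1ULL << (table->bit & 63);

        if (table->type == FEAT_INCOMPATIBLE && (mask & bit)) {
            size_t used = strlen(out);

            /* %.46s bounds the read in case name is not NUL-terminated */
            snprintf(out + used, outlen - used, "%s%.46s",
                     used ? ", " : "", table->name);
            mask &= ~bit;
        }
    }
    return mask;
}
```

Any bits left in the returned mask correspond to the "Unknown incompatible feature" message in the patch.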
* [Qemu-devel] [RFC PATCH 13/16] qemu-iotests: add a simple test for write_zeroes
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (11 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 12/16] qcow2: Support for feature table header extension Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 14/16] qemu-iotests: Test COW with zero clusters Kevin Wolf
` (2 subsequent siblings)
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
From: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/qemu-iotests/031 | 73 ++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/031.out | 29 +++++++++++++++++
tests/qemu-iotests/group | 1 +
3 files changed, 103 insertions(+), 0 deletions(-)
create mode 100755 tests/qemu-iotests/031
create mode 100644 tests/qemu-iotests/031.out
diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
new file mode 100755
index 0000000..9aee078
--- /dev/null
+++ b/tests/qemu-iotests/031
@@ -0,0 +1,73 @@
+#!/bin/bash
+#
+# Test aligned and misaligned write zeroes operations.
+#
+# Copyright (C) 2012 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=pbonzini@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt generic
+_supported_proto generic
+_supported_os Linux
+
+
+size=128M
+_make_test_img $size
+
+echo
+echo "== preparing image =="
+$QEMU_IO -c "write -P 0xa 0x200 0x400" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -P 0xa 0x20000 0x600" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -z 0x400 0x20000" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== verifying patterns (1) =="
+$QEMU_IO -c "read -P 0xa 0x200 0x200" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 0x400 0x20000" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xa 0x20400 0x200" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== rewriting zeroes =="
+$QEMU_IO -c "write -P 0xb 0x10000 0x10000" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -z 0x10000 0x10000" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== verifying patterns (2) =="
+$QEMU_IO -c "read -P 0x0 0x400 0x20000" $TEST_IMG | _filter_qemu_io
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
new file mode 100644
index 0000000..3c990dd
--- /dev/null
+++ b/tests/qemu-iotests/031.out
@@ -0,0 +1,29 @@
+QA output created by 031
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+
+== preparing image ==
+wrote 1024/1024 bytes at offset 512
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1536/1536 bytes at offset 131072
+2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 1024
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verifying patterns (1) ==
+read 512/512 bytes at offset 512
+512.000000 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 131072/131072 bytes at offset 1024
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 132096
+512.000000 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== rewriting zeroes ==
+wrote 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verifying patterns (2) ==
+read 131072/131072 bytes at offset 1024
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index b549f10..f7ee37b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -37,3 +37,4 @@
028 rw backing auto
029 rw auto quick
030 rw auto
+031 rw auto
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 14/16] qemu-iotests: Test COW with zero clusters
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (12 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 13/16] qemu-iotests: add a simple test for write_zeroes Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 15/16] qcow2: Zero write support Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 16/16] qemu-iotests: use qcow3 Kevin Wolf
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/qemu-iotests/031 | 66 ++++++++++++++++++++++++++++++++++-
tests/qemu-iotests/031.out | 82 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 146 insertions(+), 2 deletions(-)
diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
index 9aee078..6458e82 100755
--- a/tests/qemu-iotests/031
+++ b/tests/qemu-iotests/031
@@ -42,7 +42,7 @@ _supported_fmt generic
_supported_proto generic
_supported_os Linux
-
+CLUSTER_SIZE=4k
size=128M
_make_test_img $size
@@ -67,6 +67,70 @@ echo
echo "== verifying patterns (2) =="
$QEMU_IO -c "read -P 0x0 0x400 0x20000" $TEST_IMG | _filter_qemu_io
+_check_test_img
+
+echo
+echo "== creating backing file for COW tests =="
+
+_make_test_img $size
+$QEMU_IO -c "write -P 0x55 0 1M" $TEST_IMG | _filter_qemu_io
+mv $TEST_IMG $TEST_IMG.base
+
+_make_test_img -b $TEST_IMG.base 6G
+
+echo
+echo "== zero write with backing file =="
+$QEMU_IO -c "write -z 64k 192k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -z 513k 13k" $TEST_IMG | _filter_qemu_io
+
+_check_test_img
+
+echo
+echo "== verifying patterns (3) =="
+$QEMU_IO -c "read -P 0x55 0 64k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 64k 192k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x55 256k 257k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 513k 13k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x55 526k 498k" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== overwriting zero cluster =="
+$QEMU_IO -c "write -P 0xa 60k 8k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -P 0xb 64k 8k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -P 0xc 76k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -P 0xd 252k 8k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "write -P 0xe 248k 8k" $TEST_IMG | _filter_qemu_io
+
+_check_test_img
+
+echo
+echo "== verifying patterns (4) =="
+$QEMU_IO -c "read -P 0x55 0 60k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xa 60k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xb 64k 8k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 72k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xc 76k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 80k 168k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xe 248k 8k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xd 256k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x55 260k 64k" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== re-zeroing overwritten area =="
+$QEMU_IO -c "write -z 64k 192k" $TEST_IMG | _filter_qemu_io
+
+_check_test_img
+
+echo
+echo "== verifying patterns (5) =="
+$QEMU_IO -c "read -P 0x55 0 60k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xa 60k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 64k 192k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0xd 256k 4k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x55 260k 253k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x0 513k 13k" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0x55 526k 498k" $TEST_IMG | _filter_qemu_io
+
# success, all done
echo "*** done"
rm -f $seq.full
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
index 3c990dd..7a6e51a 100644
--- a/tests/qemu-iotests/031.out
+++ b/tests/qemu-iotests/031.out
@@ -1,5 +1,5 @@
QA output created by 031
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 cluster_size=4096
== preparing image ==
wrote 1024/1024 bytes at offset 512
@@ -26,4 +26,84 @@ wrote 65536/65536 bytes at offset 65536
== verifying patterns (2) ==
read 131072/131072 bytes at offset 1024
128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+== creating backing file for COW tests ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 cluster_size=4096
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=6442450944 backing_file='TEST_DIR/t.IMGFMT.base' cluster_size=4096
+
+== zero write with backing file ==
+wrote 196608/196608 bytes at offset 65536
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 13312/13312 bytes at offset 525312
+13 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+== verifying patterns (3) ==
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 196608/196608 bytes at offset 65536
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 263168/263168 bytes at offset 262144
+257 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 13312/13312 bytes at offset 525312
+13 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 509952/509952 bytes at offset 538624
+498 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== overwriting zero cluster ==
+wrote 8192/8192 bytes at offset 61440
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 8192/8192 bytes at offset 65536
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 4096/4096 bytes at offset 77824
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 8192/8192 bytes at offset 258048
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 8192/8192 bytes at offset 253952
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+== verifying patterns (4) ==
+read 61440/61440 bytes at offset 0
+60 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 61440
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 8192/8192 bytes at offset 65536
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 73728
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 77824
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 172032/172032 bytes at offset 81920
+168 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 8192/8192 bytes at offset 253952
+8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 262144
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 266240
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== re-zeroing overwritten area ==
+wrote 196608/196608 bytes at offset 65536
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+== verifying patterns (5) ==
+read 61440/61440 bytes at offset 0
+60 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 61440
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 196608/196608 bytes at offset 65536
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 4096/4096 bytes at offset 262144
+4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 259072/259072 bytes at offset 266240
+253 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 13312/13312 bytes at offset 525312
+13 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 509952/509952 bytes at offset 538624
+498 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
*** done
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 15/16] qcow2: Zero write support
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (13 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 14/16] qemu-iotests: Test COW with zero clusters Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 16/16] qemu-iotests: use qcow3 Kevin Wolf
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 14 +++++++--
block/qcow2-cluster.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++
block/qcow2.c | 21 ++++++++++++++
block/qcow2.h | 1 +
4 files changed, 105 insertions(+), 3 deletions(-)
diff --git a/block.c b/block.c
index 8858be0..0b99ebc 100644
--- a/block.c
+++ b/block.c
@@ -80,6 +80,8 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
void *opaque,
bool is_write);
static void coroutine_fn bdrv_co_do_rw(void *opaque);
+static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
+ int64_t sector_num, int nb_sectors);
static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
bool is_write, double elapsed_time, uint64_t *wait);
@@ -1680,8 +1682,8 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs,
if (drv->bdrv_co_write_zeroes &&
buffer_is_zero(bounce_buffer, iov.iov_len)) {
- ret = drv->bdrv_co_write_zeroes(bs, cluster_sector_num,
- cluster_nb_sectors);
+ ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num,
+ cluster_nb_sectors);
} else {
ret = drv->bdrv_co_writev(bs, cluster_sector_num, cluster_nb_sectors,
&bounce_qiov);
@@ -1791,9 +1793,15 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
struct iovec iov;
int ret;
+ /* TODO Emulate only part of misaligned requests instead of letting block
+ * drivers return -ENOTSUP and emulate everything */
+
/* First try the efficient write zeroes operation */
if (drv->bdrv_co_write_zeroes) {
- return drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
+ ret = drv->bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
+ if (ret != -ENOTSUP) {
+ return ret;
+ }
}
/* Fall back to bounce buffer if write zeroes is unsupported */
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4853f1f..f820ac3 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1102,3 +1102,75 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
return 0;
}
+
+/*
+ * This zeroes as many clusters of nb_clusters as possible at once (i.e.
+ * all clusters in the same L2 table) and returns the number of zeroed
+ * clusters.
+ */
+static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
+ unsigned int nb_clusters)
+{
+ BDRVQcowState *s = bs->opaque;
+ uint64_t *l2_table;
+ int l2_index;
+ int ret;
+ int i;
+
+ ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
+ if (ret < 0) {
+ return ret;
+ }
+
+ /* Limit nb_clusters to one L2 table */
+ nb_clusters = MIN(nb_clusters, s->l2_size - l2_index);
+
+ for (i = 0; i < nb_clusters; i++) {
+ uint64_t old_offset;
+
+ old_offset = be64_to_cpu(l2_table[l2_index + i]);
+
+ /* Update L2 entries */
+ qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+ if (old_offset & QCOW_OFLAG_COMPRESSED) {
+ l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
+ qcow2_free_any_clusters(bs, old_offset, 1);
+ } else {
+ l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
+ }
+ }
+
+ ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+ if (ret < 0) {
+ return ret;
+ }
+
+ return nb_clusters;
+}
+
+int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors)
+{
+ BDRVQcowState *s = bs->opaque;
+ unsigned int nb_clusters;
+ int ret;
+
+ /* The zero flag is only supported by version 3 and newer */
+ if (s->qcow_version < 3) {
+ return -ENOTSUP;
+ }
+
+ /* Each L2 table is handled by its own loop iteration */
+ nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS);
+
+ while (nb_clusters > 0) {
+ ret = zero_single_l2(bs, offset, nb_clusters);
+ if (ret < 0) {
+ return ret;
+ }
+
+ nb_clusters -= ret;
+ offset += (ret * s->cluster_size);
+ }
+
+ return 0;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 002e138..6f8228f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1289,6 +1289,26 @@ static int qcow2_make_empty(BlockDriverState *bs)
return 0;
}
+static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
+ int64_t sector_num, int nb_sectors)
+{
+ int ret;
+ BDRVQcowState *s = bs->opaque;
+
+ /* Emulate misaligned zero writes */
+ if (sector_num % s->cluster_sectors || nb_sectors % s->cluster_sectors) {
+ return -ENOTSUP;
+ }
+
+ /* Whatever is left can use real zero clusters */
+ qemu_co_mutex_lock(&s->lock);
+ ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS,
+ nb_sectors);
+ qemu_co_mutex_unlock(&s->lock);
+
+ return ret;
+}
+
static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
@@ -1566,6 +1586,7 @@ static BlockDriver bdrv_qcow2 = {
.bdrv_co_writev = qcow2_co_writev,
.bdrv_co_flush_to_os = qcow2_co_flush_to_os,
+ .bdrv_co_write_zeroes = qcow2_co_write_zeroes,
.bdrv_co_discard = qcow2_co_discard,
.bdrv_truncate = qcow2_truncate,
.bdrv_write_compressed = qcow2_write_compressed,
diff --git a/block/qcow2.h b/block/qcow2.h
index 1adf01d..1143245 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -282,6 +282,7 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors);
+int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
/* qcow2-snapshot.c functions */
int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
--
1.7.6.5
* [Qemu-devel] [RFC PATCH 16/16] qemu-iotests: use qcow3
2012-03-27 15:03 [Qemu-devel] [RFC PATCH 00/16] qcow2: Basic version 3 support Kevin Wolf
` (14 preceding siblings ...)
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 15/16] qcow2: Zero write support Kevin Wolf
@ 2012-03-27 15:03 ` Kevin Wolf
15 siblings, 0 replies; 20+ messages in thread
From: Kevin Wolf @ 2012-03-27 15:03 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Not supposed to be committed like this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
tests/qemu-iotests/common.rc | 21 ++++++++++++++++++++-
1 files changed, 20 insertions(+), 1 deletions(-)
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 26811ca..0e66fcc 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -53,18 +53,36 @@ else
TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
fi
+_optstr_add()
+{
+ if [ -n "$1" ]; then
+ echo "$1,$2"
+ else
+ echo "$2"
+ fi
+}
+
_make_test_img()
{
# extra qemu-img options can be added by tests
# at least one argument (the image size) needs to be added
local extra_img_options=$*
local cluster_size_filter="s# cluster_size=[0-9]\\+##g"
+ local optstr=""
+
+ if [ "$IMGFMT" = "qcow2" ]; then
+ optstr=$(_optstr_add "$optstr" "compat=1.1")
+ fi
if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
- extra_img_options="-o cluster_size=$CLUSTER_SIZE $extra_img_options"
+ optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
cluster_size_filter=""
fi
+ if [ -n "$optstr" ]; then
+ extra_img_options="-o $optstr $extra_img_options"
+ fi
+
# XXX(hch): have global image options?
$QEMU_IMG create -f $IMGFMT $TEST_IMG $extra_img_options | \
sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" | \
@@ -73,6 +91,7 @@ _make_test_img()
sed -e "s# encryption=off##g" | \
sed -e "$cluster_size_filter" | \
sed -e "s# table_size=0##g" | \
+ sed -e "s# compat='[^']*'##g" | \
sed -e "s# compat6=off##g" | \
sed -e "s# static=off##g"
}
--
1.7.6.5
* Re: [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3
2012-03-27 15:03 ` [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3 Kevin Wolf
@ 2012-03-27 16:25 ` Eric Blake
2012-04-02 10:00 ` Kevin Wolf
0 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2012-03-27 16:25 UTC (permalink / raw)
To: Kevin Wolf; +Cc: stefanha, qemu-devel
On 03/27/2012 09:03 AM, Kevin Wolf wrote:
> This is the second draft for what I think could be added when we increase qcow2's
> version number to 3. This includes points that have been made by several people
> over the past few months. We're probably not going to implement this next week,
> but I think it's important to get discussions started early, so here it is.
>
> +If the version is 3 or higher, the header has the following additional fields.
> +For version 2, the values are assumed to be zero, unless specified otherwise
> +in the description of a field.
> +
> + 72 - 79: incompatible_features
> + Bitmask of incompatible features. An implementation must
> + fail to open an image if an unknown bit is set.
> +
> + Bit 0: The reference counts in the image file may be
> + inaccurate. Implementations must check/rebuild
> + them if they rely on them.
> +
> + Bit 1: Enable subclusters. This affects the L2 table
> + format.
> +
> + Bits 2-31: Reserved (set to 0)
Offsets 72-79 form 8 bytes, so this should say that bits 2-63 are reserved.
> +
> + 80 - 87: compatible_features
> + Bitmask of compatible features. An implementation can
> + safely ignore any unknown bits that are set.
> +
> + Bits 0-31: Reserved (set to 0)
Again, bits 0-63, based on offsets.
> +
> + 88 - 95: autoclear_features
> + Bitmask of auto-clear features. An implementation may only
> + write to an image with unknown auto-clear features if it
> + clears the respective bits from this field first.
> +
> + Bits 0-31: Reserved (set to 0)
And again.
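To make the point about 64-bit masks concrete, here is a minimal sketch of how a reader might apply the three feature fields quoted above. All function names are hypothetical and not taken from the patch series, and the set of "known" bits is only an assumption for illustration. Note that a bit such as 40 is only well defined if the reserved range covers bits up to 63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask for feature bit n of a 64-bit feature field (n must be 0..63). */
static uint64_t feature_bit(unsigned n)
{
    assert(n < 64);
    return (uint64_t)1 << n;
}

/* Hypothetical reader that understands incompatible bits 0 and 1 only. */
static uint64_t known_incompatible(void)
{
    return feature_bit(0) | feature_bit(1);
}

/* Opening must fail if any unknown incompatible bit is set. */
static bool can_open(uint64_t incompatible_features)
{
    return (incompatible_features & ~known_incompatible()) == 0;
}

/*
 * Unknown compatible bits are simply ignored. Unknown auto-clear bits must
 * be cleared before the image is written; known_autoclear is the mask of
 * auto-clear bits this (hypothetical) reader understands.
 */
static uint64_t autoclear_before_write(uint64_t autoclear_features,
                                       uint64_t known_autoclear)
{
    return autoclear_features & known_autoclear;
}
```

With these helpers, `can_open(feature_bit(40))` is false because bit 40 is unknown to this reader, while `can_open(feature_bit(0) | feature_bit(1))` is true.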
> +
> + 96 - 99: refcount_bits
> + Size of a reference count block entry in bits. For version 2
> + images, the size is always assumed to be 16 bits. The size
> + must be a power of two.
> + [ TODO: Define order in sub-byte sizes ]
> +
> + 100 - 103: header_length
> + Length of the header structure in bytes. For version 2
> + images, the length is always assumed to be 72 bytes.
Might be a good idea to require this to be a multiple of 8, since both
72 and 104 qualify, and since header extensions are also required to be
padded out to multiples of 8.
> +== Feature name table ==
> +
> +A feature name table is an optional header extension that contains the name for
> +features used by the image. It can be used by applications that don't know
> +the respective feature (e.g. because the feature was introduced only later) to
> +display a useful error message.
> +
> +The number of entries in the feature name table is determined by the length of
> +the header extension data. Its entries look like this:
> +
> + Byte 0: Type of feature (select feature bitmap)
> + 0: Incompatible feature
> + 1: Compatible feature
> + 2: Autoclear feature
> +
> + 1: Bit number within the selected feature bitmap
> +
> + 2 - 47: Feature name (padded with zeros, but not necessarily null
> + terminated if it has full length)
Semantic nit: The NUL character is all zeros; it is one byte in all
unibyte and multi-byte encodings, and the NUL wide character is the
all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
Saying a string is null terminated is wrong, because you don't have a 4-
or 8-byte NULL pointer at the end of the string, just a one-byte NUL
character. Therefore, strings are nul-terminated, not null-terminated.
Is this extension capped at 48 bytes, or is it a repeating table of as
many 48-byte multiples as necessary to represent each feature name?
--
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
* Re: [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3
2012-03-27 16:25 ` Eric Blake
@ 2012-04-02 10:00 ` Kevin Wolf
2012-04-02 16:14 ` Eric Blake
0 siblings, 1 reply; 20+ messages in thread
From: Kevin Wolf @ 2012-04-02 10:00 UTC (permalink / raw)
To: Eric Blake; +Cc: stefanha, qemu-devel
On 27.03.2012 18:25, Eric Blake wrote:
> On 03/27/2012 09:03 AM, Kevin Wolf wrote:
>> This is the second draft for what I think could be added when we increase qcow2's
>> version number to 3. This includes points that have been made by several people
>> over the past few months. We're probably not going to implement this next week,
>> but I think it's important to get discussions started early, so here it is.
>>
>
>> +If the version is 3 or higher, the header has the following additional fields.
>> +For version 2, the values are assumed to be zero, unless specified otherwise
>> +in the description of a field.
>> +
>> + 72 - 79: incompatible_features
>> + Bitmask of incompatible features. An implementation must
>> + fail to open an image if an unknown bit is set.
>> +
>> + Bit 0: The reference counts in the image file may be
>> + inaccurate. Implementations must check/rebuild
>> + them if they rely on them.
>> +
>> + Bit 1: Enable subclusters. This affects the L2 table
>> + format.
>> +
>> + Bits 2-31: Reserved (set to 0)
>
> Offsets 72-79 form 8 bytes, so this should say that bits 2-63 are reserved.
Thanks, good catch! This was initially a 32-bit field, and I forgot to
update the reserved bit range when I widened it.
>> +
>> + 96 - 99: refcount_bits
>> + Size of a reference count block entry in bits. For version 2
>> + images, the size is always assumed to be 16 bits. The size
>> + must be a power of two.
>> + [ TODO: Define order in sub-byte sizes ]
>> +
>> + 100 - 103: header_length
>> + Length of the header structure in bytes. For version 2
>> + images, the length is always assumed to be 72 bytes.
>
> Might be a good idea to require this to be a multiple of 8, since both
> 72 and 104 qualify, and since header extensions are also required to be
> padded out to multiples of 8.
Do you see any arguments for padding to multiples of 8 besides
consistency? If I did the format from scratch, without having to pay
attention to compatibility, I would drop the requirement even for header
extensions as I don't see what it buys us.
Consistency is important and certainly good enough to make me unsure
about this, but I don't like artificial restrictions either. If we had
another good reason, it would be easier for me to decide.
>> +== Feature name table ==
>> +
>> +A feature name table is an optional header extension that contains the name for
>> +features used by the image. It can be used by applications that don't know
>> +the respective feature (e.g. because the feature was introduced only later) to
>> +display a useful error message.
>> +
>> +The number of entries in the feature name table is determined by the length of
>> +the header extension data. Its entries look like this:
>> +
>> + Byte 0: Type of feature (select feature bitmap)
>> + 0: Incompatible feature
>> + 1: Compatible feature
>> + 2: Autoclear feature
>> +
>> + 1: Bit number within the selected feature bitmap
>> +
>> + 2 - 47: Feature name (padded with zeros, but not necessarily null
>> + terminated if it has full length)
>
> Semantic nit: The NUL character is all zeros; it is one byte in all
> unibyte and multi-byte encodings, and the NUL wide character is the
> all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
> Saying a string is null terminated is wrong, because you don't have a 4-
> or 8-byte NULL pointer at the end of the string, just a one-byte NUL
> character. Therefore, strings are nul-terminated, not null-terminated.
"null-terminated" is much more common. Google and Wikipedia are the
proof. ;-)
> Is this extension capped at 48 bytes, or is it a repeating table of as
> many 48-byte multiples as necessary to represent each feature name?
The latter. All feature names are in a single table in a single header
extension. Any suggestion how to clarify this? Would something like
"There shall be at most one feature name table header extension in an
image" be clear enough?
Kevin
* Re: [Qemu-devel] [RFC PATCH 01/16] Specification for qcow2 version 3
2012-04-02 10:00 ` Kevin Wolf
@ 2012-04-02 16:14 ` Eric Blake
0 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2012-04-02 16:14 UTC (permalink / raw)
To: Kevin Wolf; +Cc: stefanha, qemu-devel
On 04/02/2012 04:00 AM, Kevin Wolf wrote:
> On 27.03.2012 18:25, Eric Blake wrote:
>> On 03/27/2012 09:03 AM, Kevin Wolf wrote:
>>> This is the second draft for what I think could be added when we increase qcow2's
>>> version number to 3. This includes points that have been made by several people
>>> over the past few months. We're probably not going to implement this next week,
>>> but I think it's important to get discussions started early, so here it is.
>>>
>>
>>> +
>>> + 100 - 103: header_length
>>> + Length of the header structure in bytes. For version 2
>>> + images, the length is always assumed to be 72 bytes.
>>
>> Might be a good idea to require this to be a multiple of 8, since both
>> 72 and 104 qualify, and since header extensions are also required to be
>> padded out to multiples of 8.
>
> Do you see any arguments for padding to multiples of 8 besides
> consistency?
Yes: void* is 8 bytes on some platforms, and guaranteeing 8-byte
alignment throughout lets a parser read header fields at their natural
alignment, which can make processing of headers more efficient.
Furthermore, guaranteeing 8-byte alignment buys us three bits that are
always 0 but which can later be converted to bit flags for future
extensions; by requiring 8-byte alignment, older parsers will reject the
new bit flags (because it looks like a non-multiple-of-8 length), while
newer parsers will know that they are bit flags and what those flags
mean, as well as know to mask out those bits when computing aligned size
of the header.
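The compatibility argument above can be sketched in a few lines. This is only an illustration under the assumption that a future format revision reuses the low three bits of a length field as flags; the macro and function names are hypothetical and nothing here is part of the draft spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LEN_ALIGN      8u
#define LEN_FLAG_MASK  (LEN_ALIGN - 1)   /* low three bits: 0x7 */

/* Round a length up to the next multiple of 8, as a writer would. */
static uint32_t align_up(uint32_t len)
{
    assert(len <= UINT32_MAX - LEN_FLAG_MASK);
    return (len + LEN_FLAG_MASK) & ~LEN_FLAG_MASK;
}

/* Older parser: any length that is not a multiple of 8 is rejected. */
static bool old_parser_accepts(uint32_t length_field)
{
    return (length_field & LEN_FLAG_MASK) == 0;
}

/* Hypothetical newer parser: the low bits are flags, the rest is length. */
static uint32_t new_parser_length(uint32_t length_field)
{
    return length_field & ~LEN_FLAG_MASK;
}

static uint32_t new_parser_flags(uint32_t length_field)
{
    return length_field & LEN_FLAG_MASK;
}
```

Both 72 and 104 pass `old_parser_accepts()`. A field value of 105 (length 104 with flag bit 0 set) is rejected by the older parser, while the newer one recovers length 104 and flags 0x1.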
> If I did the format from scratch, without having to pay
> attention to compatibility, I would drop the requirement even for header
> extensions as I don't see what it buys us.
It's always hard to predict what future extensions will look like, but I
argue in return that it is easier to start out strict and relax things
in the future than it is to start relaxed and then wish we could tighten
it up.
>
> Consistency is important and certainly good enough to make me unsure
> about this, but I don't like artificial restrictions either. If we had
> another good reason, it would be easier for me to decide.
If sizeof(void*) for natural alignment and the possibility of extension
to 3 bit flags per extension header don't convince you, then I won't insist.
>> Semantic nit: The NUL character is all zeros; it is one byte in all
>> unibyte and multi-byte encodings, and the NUL wide character is the
>> all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
>> Saying a string is null terminated is wrong, because you don't have a 4-
>> or 8-byte NULL pointer at the end of the string, just a one-byte NUL
>> character. Therefore, strings are nul-terminated, not null-terminated.
>
> "null-terminated" is much more common. Google and Wikipedia are the
> proof. ;-)
Unfortunately true :) I'll quit bothering you about it, as I'm
swimming against the current here.
>
>> Is this extension capped at 48 bytes, or is it a repeating table of as
>> many 48-byte multiples as necessary to represent each feature name?
>
> The latter. All feature names are in a single table in a single header
> extension. Any suggestion how to clarify this? Would something like
> "There shall be at most one feature name table header extension in an
> image" be clear enough?
Maybe:
A feature name table is an optional header extension that contains the
name for features used by the image. It can be used by applications
that don't know the respective feature (e.g. because the feature was
introduced only later) to display a useful error message.
There can be at most one feature name table, and within that table, each
feature name may only appear once. The number of entries in the feature
name table is determined by the length of the header extension data.
Entry n (counting from 0) looks like this:
    Byte 48*n + 0:   Type of feature (select feature bitmap)
                         0: Incompatible feature
                         1: Compatible feature
                         2: Autoclear feature
         48*n + 1:   Bit number within the selected feature bitmap
         48*n + 2 to 48*n + 47:
                     Feature name (padded with zeros, but not
                     necessarily null terminated if it has full
                     length)
Do we also need to clarify that at offsets 48*n + 1, the bit number must
be 0-63 (and thus the upper two bits must be 0)? Do we also want to
enforce that the table is sorted (that is, given the tuple <feature,bit>
in bytes 0 and 1 of each entry, we want to require that entry <0,0>
appears before <0,1> appears before <1,0>)?
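As a rough illustration of what those extra rules would mean for a parser, the sketch below validates a table of 48-byte entries. The function names are hypothetical, and the bit-range and sort-order checks implement the two questions raised above, which are proposals rather than part of the draft. Uniqueness is enforced per <feature,bit> tuple (a side effect of strict ordering), not per name string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_SIZE 48
#define NAME_SIZE  46   /* bytes 2..47 of each entry */

/*
 * Validate a feature name table of len bytes. Returns the number of
 * entries, or -1 if the table is malformed.
 */
static int check_feature_table(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len % ENTRY_SIZE != 0) {
        return -1;                      /* not a whole number of entries */
    }
    size_t n = len / ENTRY_SIZE;
    int prev_key = -1;
    for (size_t i = 0; i < n; i++) {
        const uint8_t *e = buf + i * ENTRY_SIZE;
        uint8_t type = e[0], bit = e[1];
        if (type > 2 || bit > 63) {
            return -1;                  /* unknown bitmap or bit out of range */
        }
        int key = type * 64 + bit;      /* orders <type,bit> tuples */
        if (key <= prev_key) {
            return -1;                  /* unsorted or duplicate tuple */
        }
        prev_key = key;
    }
    return (int)n;
}

/* Copy the name of entry i into out, which is always NUL-terminated. */
static void feature_name(const uint8_t *buf, size_t i, char out[NAME_SIZE + 1])
{
    memcpy(out, buf + i * ENTRY_SIZE + 2, NAME_SIZE);
    out[NAME_SIZE] = '\0';
}
```

Because a full-length name is not terminated inside the entry, `feature_name()` copies into a buffer one byte larger than the name field and terminates it explicitly.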
--
Eric Blake eblake@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org