* [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init()
2019-07-15 10:45 [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Max Reitz
@ 2019-07-15 10:45 ` Max Reitz
2019-07-16 16:54 ` Kevin Wolf
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 2/4] vdi: " Max Reitz
` (3 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Max Reitz @ 2019-07-15 10:45 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Jeff Cody, Stefan Weil, qemu-devel, Max Reitz,
Stefano Garzarella
If a qcow2 file is preallocated, it can no longer guarantee that it
initially appears as filled with zeroes.
So implement .bdrv_has_zero_init() by checking whether the file is
preallocated; if so, forward the call to the underlying storage node,
except for when it is encrypted: Encrypted preallocated images always
return effectively random data, so .bdrv_has_zero_init() must always
return 0 for them.
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/qcow2.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 89 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..730fd53890 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4631,6 +4631,94 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs,
return spec_info;
}
+/*
+ * Return 1 if the file only contains zero and unallocated clusters.
+ * Return 0 if it contains compressed or normal clusters.
+ * Return -errno on error.
+ */
+static int qcow2_is_zero(BlockDriverState *bs)
+{
+ BDRVQcow2State *s = bs->opaque;
+ int l1_i;
+ int ret;
+
+ if (bs->backing) {
+ return 0;
+ }
+
+ for (l1_i = 0; l1_i < s->l1_size; l1_i++) {
+ uint64_t l2_offset = s->l1_table[l1_i] & L1E_OFFSET_MASK;
+ int slice_start_i;
+
+ if (!l2_offset) {
+ continue;
+ }
+
+ for (slice_start_i = 0; slice_start_i < s->l2_size;
+ slice_start_i += s->l2_slice_size)
+ {
+ uint64_t *l2_slice;
+ int l2_slice_i;
+
+ ret = qcow2_cache_get(bs, s->l2_table_cache,
+ l2_offset + slice_start_i * sizeof(uint64_t),
+ (void **)&l2_slice);
+ if (ret < 0) {
+ return ret;
+ }
+
+ for (l2_slice_i = 0; l2_slice_i < s->l2_slice_size; l2_slice_i++) {
+ uint64_t l2_entry = be64_to_cpu(l2_slice[l2_slice_i]);
+
+ switch (qcow2_get_cluster_type(bs, l2_entry)) {
+ case QCOW2_CLUSTER_UNALLOCATED:
+ case QCOW2_CLUSTER_ZERO_PLAIN:
+ case QCOW2_CLUSTER_ZERO_ALLOC:
+ break;
+
+ case QCOW2_CLUSTER_NORMAL:
+ case QCOW2_CLUSTER_COMPRESSED:
+ qcow2_cache_put(s->l2_table_cache, (void **)&l2_slice);
+ return 0;
+
+ default:
+ abort();
+ }
+ }
+
+ qcow2_cache_put(s->l2_table_cache, (void **)&l2_slice);
+ }
+ }
+
+ return 1;
+}
+
+static int qcow2_has_zero_init(BlockDriverState *bs)
+{
+ BDRVQcow2State *s = bs->opaque;
+ int ret;
+
+ if (qemu_in_coroutine()) {
+ qemu_co_mutex_lock(&s->lock);
+ }
+ /* Check preallocation status */
+ ret = qcow2_is_zero(bs);
+ if (qemu_in_coroutine()) {
+ qemu_co_mutex_unlock(&s->lock);
+ }
+ if (ret < 0) {
+ return 0;
+ }
+
+ if (ret == 1) {
+ return 1;
+ } else if (bs->encrypted) {
+ return 0;
+ } else {
+ return bdrv_has_zero_init(s->data_file->bs);
+ }
+}
+
static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t pos)
{
@@ -5186,7 +5274,7 @@ BlockDriver bdrv_qcow2 = {
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create_opts = qcow2_co_create_opts,
.bdrv_co_create = qcow2_co_create,
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
+ .bdrv_has_zero_init = qcow2_has_zero_init,
.bdrv_co_block_status = qcow2_co_block_status,
.bdrv_co_preadv = qcow2_co_preadv,
--
2.21.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init()
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init() Max Reitz
@ 2019-07-16 16:54 ` Kevin Wolf
2019-07-17 8:37 ` Max Reitz
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2019-07-16 16:54 UTC (permalink / raw)
To: Max Reitz
Cc: Jeff Cody, Stefan Weil, qemu-devel, qemu-block,
Stefano Garzarella
Am 15.07.2019 um 12:45 hat Max Reitz geschrieben:
> If a qcow2 file is preallocated, it can no longer guarantee that it
> initially appears as filled with zeroes.
>
> So implement .bdrv_has_zero_init() by checking whether the file is
> preallocated; if so, forward the call to the underlying storage node,
> except for when it is encrypted: Encrypted preallocated images always
> return effectively random data, so .bdrv_has_zero_init() must always
> return 0 for them.
>
> Reported-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
Hm... This patch only really works directly after image creation (which
is indeed where .bdrv_has_zero_init is used). Why do we have to have a
full qcow2_is_zero() that loops over the whole image just to find out
whether it's preallocated? Wouldn't looking at a single data cluster be
enough?
Kevin
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init()
2019-07-16 16:54 ` Kevin Wolf
@ 2019-07-17 8:37 ` Max Reitz
0 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2019-07-17 8:37 UTC (permalink / raw)
To: Kevin Wolf
Cc: Jeff Cody, Stefan Weil, qemu-devel, qemu-block,
Stefano Garzarella
[-- Attachment #1.1: Type: text/plain, Size: 2516 bytes --]
On 16.07.19 18:54, Kevin Wolf wrote:
> Am 15.07.2019 um 12:45 hat Max Reitz geschrieben:
>> If a qcow2 file is preallocated, it can no longer guarantee that it
>> initially appears as filled with zeroes.
>>
>> So implement .bdrv_has_zero_init() by checking whether the file is
>> preallocated; if so, forward the call to the underlying storage node,
>> except for when it is encrypted: Encrypted preallocated images always
>> return effectively random data, so .bdrv_has_zero_init() must always
>> return 0 for them.
>>
>> Reported-by: Stefano Garzarella <sgarzare@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>
> Hm... This patch only really works directly after image creation (which
> is indeed where .bdrv_has_zero_init is used). Why do we have to have a
> full qcow2_is_zero() that loops over the whole image just to find out
> whether it's preallocated? Wouldn't looking at a single data cluster be
> enough?
Hm. I would like to agree (because you’re right), but now I see that
the callers of bdrv_has_zero_init() don’t necessarily hold to that
convention.
For example, qemu-img convert has the -n flag, but that doesn’t stop it
from invoking bdrv_has_zero_init().
Which is a bug, of course.
$ ./qemu-img create -f qcow2 src.qcow2 64M
$ ./qemu-img create -f qcow2 dest.qcow2 64M
$ ./qemu-io -c 'write -P 42 0 64M' dest.qcow2
$ ./qemu-img convert -n src.qcow2 dest.qcow2
$ ./qemu-img compare src.qcow2 dest.qcow2
Content mismatch at offset 0!
Aw, man, why does this keep happening... :-/
OK, so qemu-img convert -n is easy to fix.
But there are more callers:
mirror: Uses this function to inquire whether it needs to zero the
target before actually doing something useful. There is no guarantee
that the target is a new image. Well, it just isn’t with mode=existing
or blockdev-mirror.
parallels: Whether to write zeroes to newly added image areas. That
actually sounds correct, because those new areas cannot point to any
data yet.
Well, maybe not correct, because bdrv_has_zero_init() is not the same as
“when this image grows, new areas will be zero”, but at least
bdrv_hsa_zero_init() will return false if the the latter is false.
vhdx: Similarly to parallels, it uses this information to check whether
it needs to zero new areas when growing an image file.
raw/vmdk/vpc: Just passing through info from their storage child.
Hm, OK. So mirror and qemu-img need fixing. That sounds possible.
Max
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH for-4.1? 2/4] vdi: Fix .bdrv_has_zero_init()
2019-07-15 10:45 [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Max Reitz
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init() Max Reitz
@ 2019-07-15 10:45 ` Max Reitz
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 3/4] vhdx: " Max Reitz
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2019-07-15 10:45 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Jeff Cody, Stefan Weil, qemu-devel, Max Reitz,
Stefano Garzarella
Static VDI images cannot guarantee to be zero-initialized. If the image
has been statically allocated, forward the call to the underlying
storage node.
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/vdi.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/block/vdi.c b/block/vdi.c
index b9845a4cbd..0caa3f281d 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -988,6 +988,17 @@ static void vdi_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
+static int vdi_has_zero_init(BlockDriverState *bs)
+{
+ BDRVVdiState *s = bs->opaque;
+
+ if (s->header.image_type == VDI_TYPE_STATIC) {
+ return bdrv_has_zero_init(bs->file->bs);
+ } else {
+ return 1;
+ }
+}
+
static QemuOptsList vdi_create_opts = {
.name = "vdi-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vdi_create_opts.head),
@@ -1028,7 +1039,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create = vdi_co_create,
.bdrv_co_create_opts = vdi_co_create_opts,
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
+ .bdrv_has_zero_init = vdi_has_zero_init,
.bdrv_co_block_status = vdi_co_block_status,
.bdrv_make_empty = vdi_make_empty,
--
2.21.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH for-4.1? 3/4] vhdx: Fix .bdrv_has_zero_init()
2019-07-15 10:45 [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Max Reitz
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 1/4] qcow2: Fix .bdrv_has_zero_init() Max Reitz
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 2/4] vdi: " Max Reitz
@ 2019-07-15 10:45 ` Max Reitz
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 4/4] iotests: Convert to preallocated encrypted qcow2 Max Reitz
2019-07-15 14:49 ` [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Stefano Garzarella
4 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2019-07-15 10:45 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Jeff Cody, Stefan Weil, qemu-devel, Max Reitz,
Stefano Garzarella
Fixed VHDX images cannot guarantee to be zero-initialized. If the image
has the "fixed" subformat, forward the call to the underlying storage
node.
Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/vhdx.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/block/vhdx.c b/block/vhdx.c
index d6070b6fa8..cc9c25d1c4 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -2075,6 +2075,25 @@ static int coroutine_fn vhdx_co_check(BlockDriverState *bs,
return 0;
}
+static int vhdx_has_zero_init(BlockDriverState *bs)
+{
+ BDRVVHDXState *s = bs->opaque;
+ uint32_t bat_i;
+
+ /* Check the subformat (fixed or dynamic) */
+ for (bat_i = 0; bat_i < s->bat_entries; bat_i++) {
+ int state = s->bat[bat_i] & VHDX_BAT_STATE_BIT_MASK;
+
+ if (state == PAYLOAD_BLOCK_FULLY_PRESENT) {
+ /* Fixed subformat */
+ return bdrv_has_zero_init(bs->file->bs);
+ }
+ }
+
+ /* Dynamic subformat */
+ return 1;
+}
+
static QemuOptsList vhdx_create_opts = {
.name = "vhdx-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vhdx_create_opts.head),
@@ -2128,7 +2147,7 @@ static BlockDriver bdrv_vhdx = {
.bdrv_co_create_opts = vhdx_co_create_opts,
.bdrv_get_info = vhdx_get_info,
.bdrv_co_check = vhdx_co_check,
- .bdrv_has_zero_init = bdrv_has_zero_init_1,
+ .bdrv_has_zero_init = vhdx_has_zero_init,
.create_opts = &vhdx_create_opts,
};
--
2.21.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Qemu-devel] [PATCH for-4.1? 4/4] iotests: Convert to preallocated encrypted qcow2
2019-07-15 10:45 [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Max Reitz
` (2 preceding siblings ...)
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 3/4] vhdx: " Max Reitz
@ 2019-07-15 10:45 ` Max Reitz
2019-07-15 14:49 ` [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Stefano Garzarella
4 siblings, 0 replies; 8+ messages in thread
From: Max Reitz @ 2019-07-15 10:45 UTC (permalink / raw)
To: qemu-block
Cc: Kevin Wolf, Jeff Cody, Stefan Weil, qemu-devel, Max Reitz,
Stefano Garzarella
Add a test case for converting an empty image (which only returns zeroes
when read) to a preallocated encrypted qcow2 image.
qcow2_has_zero_init() should return 0 then, thus forcing qemu-img
convert to create zero clusters.
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
tests/qemu-iotests/188 | 20 +++++++++++++++++++-
tests/qemu-iotests/188.out | 4 ++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/tests/qemu-iotests/188 b/tests/qemu-iotests/188
index be7278aa65..afca44df54 100755
--- a/tests/qemu-iotests/188
+++ b/tests/qemu-iotests/188
@@ -48,7 +48,7 @@ SECRETALT="secret,id=sec0,data=platypus"
_make_test_img --object $SECRET -o "encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10" $size
-IMGSPEC="driver=$IMGFMT,file.filename=$TEST_IMG,encrypt.key-secret=sec0"
+IMGSPEC="driver=$IMGFMT,encrypt.key-secret=sec0,file.filename=$TEST_IMG"
QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
@@ -68,6 +68,24 @@ echo
echo "== verify open failure with wrong password =="
$QEMU_IO --object $SECRETALT -c "read -P 0xa 0 $size" --image-opts $IMGSPEC | _filter_qemu_io | _filter_testdir
+_cleanup_test_img
+
+echo
+echo "== verify that has_zero_init returns false when preallocating =="
+
+# Empty source file
+if [ -n "$TEST_IMG_FILE" ]; then
+ TEST_IMG_FILE="${TEST_IMG_FILE}.orig" _make_test_img $size
+else
+ TEST_IMG="${TEST_IMG}.orig" _make_test_img $size
+fi
+
+$QEMU_IMG convert -O "$IMGFMT" --object $SECRET \
+ -o "encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10,preallocation=metadata" \
+ "${TEST_IMG}.orig" "$TEST_IMG"
+
+$QEMU_IMG compare --object $SECRET --image-opts "${IMGSPEC}.orig" "$IMGSPEC"
+
# success, all done
echo "*** done"
diff --git a/tests/qemu-iotests/188.out b/tests/qemu-iotests/188.out
index 97b1402671..c568ef3701 100644
--- a/tests/qemu-iotests/188.out
+++ b/tests/qemu-iotests/188.out
@@ -15,4 +15,8 @@ read 16777216/16777216 bytes at offset 0
== verify open failure with wrong password ==
qemu-io: can't open: Invalid password, cannot unlock any keyslot
+
+== verify that has_zero_init returns false when preallocating ==
+Formatting 'TEST_DIR/t.IMGFMT.orig', fmt=IMGFMT size=16777216
+Images are identical.
*** done
--
2.21.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s
2019-07-15 10:45 [Qemu-devel] [PATCH for-4.1? 0/4] block: Fix three .bdrv_has_zero_init()s Max Reitz
` (3 preceding siblings ...)
2019-07-15 10:45 ` [Qemu-devel] [PATCH for-4.1? 4/4] iotests: Convert to preallocated encrypted qcow2 Max Reitz
@ 2019-07-15 14:49 ` Stefano Garzarella
4 siblings, 0 replies; 8+ messages in thread
From: Stefano Garzarella @ 2019-07-15 14:49 UTC (permalink / raw)
To: Max Reitz; +Cc: Kevin Wolf, Stefan Weil, Jeff Cody, qemu-devel, qemu-block
On Mon, Jul 15, 2019 at 12:45:04PM +0200, Max Reitz wrote:
> Hi,
>
> .bdrv_has_zero_init() must not return 1 if the (newly created[1]) image
> may not return zeroes when read.
>
> [1] This is guaranteed by the caller.
>
> If the image is preallocated, this generally depends on the protocol
> layer, because the format layer will just allocate the necessary
> metadata, make it point to data blocks and leave their initialization to
> the protocol layer. For example, qcow2:
>
> - leaves the blocks uninitialized with preallocation=metadata,
> - and passes preallocation=falloc and =full on to the protocol node.
>
> In either case, the data then stored in these blocks fully depends on
> the protocol level.
>
> Therefore, format drivers have to pass through .bdrv_has_zero_init() to
> the data storage node when dealing with preallocated images.
>
> Protocol drivers OTOH have to be accurate in what they return from
> .bdrv_has_zero_init(). They are free to return 0 even for preallocated
> images.
>
>
> So let’s look at the existing .bdrv_has_zero_init() implementations:
>
> - file-posix: Always returns 1 (for regular files). Correct, because it
> makes sure the image always reads as 0, preallocated or not.
>
> - file-win32: Same. (But doesn’t even support preallocation.)
>
> - gluster: Always returns 0. Safe.
>
> - nfs: Only returns 1 for regular files, similarly to file-posix. Seems
> reasonable.
>
> - parallels: Always returns 1. This format does not support
> preallocation, but apparently ensures that it always writes out data
> that reads back as 0 (where necessary), because if the protocol node
> does not have_zero_init, it explicitly writes zeroes to it instead of
> just truncating it.
> So this driver seems OK, too.
>
> - qcow2: Always returns 1. This is wrong for preallocated images, and
> really wrong for preallocated encrypted images. Addressed by patch 1.
>
> - qcow: Always returns 1. Has no preallocation support, so that seems
> OK.
>
> - qed: Same as qcow.
>
> - raw: Always forwards the value from the filtered node. OK.
>
> - rbd: Always returns 1. This is a protocol driver, so I’ll just trust
> it knows what it’s doing.
>
> - sheepdog: Always returns 1. From the fact that its preallocation code
> simply reads the image and writes it back, this seems correct to me.
>
> - ssh: Same as nfs.
>
> - vdi: Always returns 1. It does support preallocation=metadata, in
> which case this may be wrong. Addressed by patch 2.
>
> - vhdx: Similar to vdi, except it doesn’t support @preallocation, but
> has its own option “subformat=fixed”. Addressed by patch 3.
>
> - vmdk: Hey, this one is already exactly what we want. If any of the
> extents is flat, it goes to the respective protocol node, and if that
> does not have_zero_init, it returns 0. Great.
> (Added in commit da7a50f9385.)
>
> - vpc: Hey, this one, too. With subformat=fixed, it returns what the
> protocol node has to say about has_zero_init.
> (Added in commit 72c6cc94daa.)
>
> So that leaves three cases to fix, which are the first three patches in
> this series. The final patch adds a test case for qcow2. (It’s
> difficult to test the other drivers, because that would require a
> protocol driver with image creation support and has_zero_init=0, which
> is not so easily available.)
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Thanks,
Stefano
>
>
> Max Reitz (4):
> qcow2: Fix .bdrv_has_zero_init()
> vdi: Fix .bdrv_has_zero_init()
> vhdx: Fix .bdrv_has_zero_init()
> iotests: Convert to preallocated encrypted qcow2
>
> block/qcow2.c | 90 +++++++++++++++++++++++++++++++++++++-
> block/vdi.c | 13 +++++-
> block/vhdx.c | 21 ++++++++-
> tests/qemu-iotests/188 | 20 ++++++++-
> tests/qemu-iotests/188.out | 4 ++
> 5 files changed, 144 insertions(+), 4 deletions(-)
>
^ permalink raw reply [flat|nested] 8+ messages in thread