* [Qemu-devel] [PATCH 0/2] qcow2: Repair OFLAG_COPIED when fixing leaks
@ 2018-04-28 16:34 Max Reitz
2018-04-28 16:34 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
2018-04-28 16:34 ` [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion Max Reitz
0 siblings, 2 replies; 6+ messages in thread
From: Max Reitz @ 2018-04-28 16:34 UTC (permalink / raw)
To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf
Suppose you have an image with consistent OFLAG_COPIED and refcounts.
Now further suppose that image has leaked clusters (single reference,
but refcount 2). When checking such an image with qemu-img check, it
will notify you of the leakage, and that's it.
Now when trying to repair that image, you naturally use -r leaks because
that should be sufficient and you want to give qemu-img check just as
many "permissions" as it needs so it doesn't destroy your image by
accident. But while the qcow2 driver will fix the refcounts, alas! it
will now complain about OFLAG_COPIED because that should be set now.
However, it will not set that without -r all. So, congratulations, now
you have an image that qemu-img check tells you is really corrupted.
This series makes -r leaks fix OFLAG_COPIED as well, but only if
repairing the refcounts was completely successful.
There is a Red Hat Bugzilla entry here:
https://bugzilla.redhat.com/show_bug.cgi?id=1527085
...but I'm afraid the bug description is set to confidential, although I
don't know why. Sorry. :-/
For everyone who cannot read it, let's say the test case in this series
may very well be a good indication of what you don't see.
Max Reitz (2):
qcow2: Repair OFLAG_COPIED when fixing leaks
iotests: Repairing error during snapshot deletion
block/qcow2-refcount.c | 25 ++++++++-----
tests/qemu-iotests/217 | 89 ++++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/217.out | 42 ++++++++++++++++++++++
tests/qemu-iotests/group | 1 +
4 files changed, 149 insertions(+), 8 deletions(-)
create mode 100755 tests/qemu-iotests/217
create mode 100644 tests/qemu-iotests/217.out
--
2.14.3
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 1/2] qcow2: Repair OFLAG_COPIED when fixing leaks
2018-04-28 16:34 [Qemu-devel] [PATCH 0/2] qcow2: Repair OFLAG_COPIED when fixing leaks Max Reitz
@ 2018-04-28 16:34 ` Max Reitz
2018-04-30 15:55 ` Eric Blake
2018-04-28 16:34 ` [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion Max Reitz
1 sibling, 1 reply; 6+ messages in thread
From: Max Reitz @ 2018-04-28 16:34 UTC (permalink / raw)
To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf
Repairing OFLAG_COPIED is usually safe because it is done after the
refcounts have been repaired. Therefore, it we did not find anyone else
referencing a data or L2 cluster, it makes no sense to not set
OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is
always safe, anyway, it may just induce leaks.
Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky)
refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED
will then be wrong. qemu-img check should not produce images that are
more corrupted afterwards then they were before.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
block/qcow2-refcount.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2dc23005b7..e1ab4d0bfd 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1799,6 +1799,19 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
int ret;
uint64_t refcount;
int i, j;
+ bool repair;
+
+ if (fix & BDRV_FIX_ERRORS) {
+ /* Always repair */
+ repair = true;
+ } else if (fix & BDRV_FIX_LEAKS) {
+ /* Repair only if that seems safe: This function is always
+ * called after the refcounts have been fixed, so the refcount
+ * is accurate if that repair was successful */
+ repair = !res->check_errors && !res->corruptions && !res->leaks;
+ } else {
+ repair = false;
+ }
for (i = 0; i < s->l1_size; i++) {
uint64_t l1_entry = s->l1_table[i];
@@ -1818,10 +1831,8 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
if ((refcount == 1) != ((l1_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "%s OFLAG_COPIED L2 cluster: l1_index=%d "
"l1_entry=%" PRIx64 " refcount=%" PRIu64 "\n",
- fix & BDRV_FIX_ERRORS ? "Repairing" :
- "ERROR",
- i, l1_entry, refcount);
- if (fix & BDRV_FIX_ERRORS) {
+ repair ? "Repairing" : "ERROR", i, l1_entry, refcount);
+ if (repair) {
s->l1_table[i] = refcount == 1
? l1_entry | QCOW_OFLAG_COPIED
: l1_entry & ~QCOW_OFLAG_COPIED;
@@ -1862,10 +1873,8 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "%s OFLAG_COPIED data cluster: "
"l2_entry=%" PRIx64 " refcount=%" PRIu64 "\n",
- fix & BDRV_FIX_ERRORS ? "Repairing" :
- "ERROR",
- l2_entry, refcount);
- if (fix & BDRV_FIX_ERRORS) {
+ repair ? "Repairing" : "ERROR", l2_entry, refcount);
+ if (repair) {
l2_table[j] = cpu_to_be64(refcount == 1
? l2_entry | QCOW_OFLAG_COPIED
: l2_entry & ~QCOW_OFLAG_COPIED);
--
2.14.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion
2018-04-28 16:34 [Qemu-devel] [PATCH 0/2] qcow2: Repair OFLAG_COPIED when fixing leaks Max Reitz
2018-04-28 16:34 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
@ 2018-04-28 16:34 ` Max Reitz
2018-04-30 16:01 ` Eric Blake
1 sibling, 1 reply; 6+ messages in thread
From: Max Reitz @ 2018-04-28 16:34 UTC (permalink / raw)
To: qemu-block; +Cc: qemu-devel, Max Reitz, Kevin Wolf
This adds a test for an I/O error during snapshot deletion, and maybe
more importantly, for how to repair the resulting image. If the
snapshot has been deleted before the error occurs, the only negative
result will be leaked clusters -- and those should be repairable with
qemu-img check -r leaks.
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
tests/qemu-iotests/217 | 89 ++++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/217.out | 42 ++++++++++++++++++++++
tests/qemu-iotests/group | 1 +
3 files changed, 132 insertions(+)
create mode 100755 tests/qemu-iotests/217
create mode 100644 tests/qemu-iotests/217.out
diff --git a/tests/qemu-iotests/217 b/tests/qemu-iotests/217
new file mode 100755
index 0000000000..3b42138760
--- /dev/null
+++ b/tests/qemu-iotests/217
@@ -0,0 +1,89 @@
+#!/bin/bash
+#
+# I/O errors when working with internal qcow2 snapshots, and repairing
+# the result
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+seq=$(basename $0)
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+ rm -f "$TEST_DIR/blkdebug.conf"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This test is specific to qcow2
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+# Internal snapshots are (currently) impossible with refcount_bits=1
+_unsupported_imgopts 'refcount_bits=1[^0-9]'
+
+echo
+echo '=== Simulating an I/O error during snapshot deletion ==='
+echo
+
+_make_test_img 64M
+$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+# Create the snapshot
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+
+# Verify the snapshot is there
+echo
+_img_info | grep 'Snapshot list'
+echo '(Snapshot filtered)'
+echo
+
+# Try to delete the snapshot (with an error happening when freeing the
+# then leaked clusters)
+cat > "$TEST_DIR/blkdebug.conf" <<EOF
+[inject-error]
+event = "cluster_free"
+errno = "5"
+EOF
+$QEMU_IMG snapshot -d foo "blkdebug:$TEST_DIR/blkdebug.conf:$TEST_IMG"
+
+# Verify the snapshot is gone
+echo
+_img_info | grep 'Snapshot list'
+
+# Should only show leaks
+echo '--- Checking test image ---'
+_check_test_img
+
+echo
+
+# As there are only leaks, this should be able to fully repair the
+# image
+echo '--- Repairing test image ---'
+_check_test_img -r leaks
+
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/217.out b/tests/qemu-iotests/217.out
new file mode 100644
index 0000000000..e3fc40a8c7
--- /dev/null
+++ b/tests/qemu-iotests/217.out
@@ -0,0 +1,42 @@
+QA output created by 217
+
+=== Simulating an I/O error during snapshot deletion ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+Snapshot list:
+(Snapshot filtered)
+
+qcow2_free_clusters failed: Input/output error
+qemu-img: Could not delete snapshot 'foo': Failed to free the cluster and L1 table: Input/output error
+
+--- Checking test image ---
+Leaked cluster 4 refcount=2 reference=1
+Leaked cluster 5 refcount=2 reference=1
+Leaked cluster 6 refcount=1 reference=0
+Leaked cluster 7 refcount=1 reference=0
+
+4 leaked clusters were found on the image.
+This means waste of disk space, but no harm to data.
+
+--- Repairing test image ---
+Leaked cluster 4 refcount=2 reference=1
+Leaked cluster 5 refcount=2 reference=1
+Leaked cluster 6 refcount=1 reference=0
+Leaked cluster 7 refcount=1 reference=0
+Repairing cluster 4 refcount=2 reference=1
+Repairing cluster 5 refcount=2 reference=1
+Repairing cluster 6 refcount=1 reference=0
+Repairing cluster 7 refcount=1 reference=0
+Repairing OFLAG_COPIED L2 cluster: l1_index=0 l1_entry=40000 refcount=1
+Repairing OFLAG_COPIED data cluster: l2_entry=50000 refcount=1
+The following inconsistencies were found and repaired:
+
+ 4 leaked clusters
+ 2 corruptions
+
+Double checking the fixed image now...
+No errors were found on the image.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 2edc377370..b20afb940b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -215,3 +215,4 @@
214 rw auto
215 rw auto quick
216 rw auto quick
+217 rw auto quick
--
2.14.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qcow2: Repair OFLAG_COPIED when fixing leaks
2018-04-28 16:34 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
@ 2018-04-30 15:55 ` Eric Blake
0 siblings, 0 replies; 6+ messages in thread
From: Eric Blake @ 2018-04-30 15:55 UTC (permalink / raw)
To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel
On 04/28/2018 11:34 AM, Max Reitz wrote:
> Repairing OFLAG_COPIED is usually safe because it is done after the
> refcounts have been repaired. Therefore, it we did not find anyone else
> referencing a data or L2 cluster, it makes no sense to not set
> OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is
> always safe, anyway, it may just induce leaks.
>
> Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky)
> refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED
> will then be wrong. qemu-img check should not produce images that are
> more corrupted afterwards then they were before.
>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> block/qcow2-refcount.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion
2018-04-28 16:34 ` [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion Max Reitz
@ 2018-04-30 16:01 ` Eric Blake
2018-05-01 15:40 ` Max Reitz
0 siblings, 1 reply; 6+ messages in thread
From: Eric Blake @ 2018-04-30 16:01 UTC (permalink / raw)
To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel
On 04/28/2018 11:34 AM, Max Reitz wrote:
> This adds a test for an I/O error during snapshot deletion, and maybe
> more importantly, for how to repair the resulting image. If the
> snapshot has been deleted before the error occurs, the only negative
> result will be leaked clusters -- and those should be repairable with
> qemu-img check -r leaks.
>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> tests/qemu-iotests/217 | 89 ++++++++++++++++++++++++++++++++++++++++++++++
> tests/qemu-iotests/217.out | 42 ++++++++++++++++++++++
> tests/qemu-iotests/group | 1 +
> 3 files changed, 132 insertions(+)
> create mode 100755 tests/qemu-iotests/217
> create mode 100644 tests/qemu-iotests/217.out
>
> +
> +# Internal snapshots are (currently) impossible with refcount_bits=1
> +_unsupported_imgopts 'refcount_bits=1[^0-9]'
> +
And even if they are made possible (by cloning clusters instead of
incrementing refcounts), you STILL need a refcount > 1 to test the
particular repair functionality just fixed.
Up to you if you want to reword that comment a bit.
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion
2018-04-30 16:01 ` Eric Blake
@ 2018-05-01 15:40 ` Max Reitz
0 siblings, 0 replies; 6+ messages in thread
From: Max Reitz @ 2018-05-01 15:40 UTC (permalink / raw)
To: Eric Blake, qemu-block; +Cc: Kevin Wolf, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]
On 2018-04-30 18:01, Eric Blake wrote:
> On 04/28/2018 11:34 AM, Max Reitz wrote:
>> This adds a test for an I/O error during snapshot deletion, and maybe
>> more importantly, for how to repair the resulting image. If the
>> snapshot has been deleted before the error occurs, the only negative
>> result will be leaked clusters -- and those should be repairable with
>> qemu-img check -r leaks.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> tests/qemu-iotests/217 | 89
>> ++++++++++++++++++++++++++++++++++++++++++++++
>> tests/qemu-iotests/217.out | 42 ++++++++++++++++++++++
>> tests/qemu-iotests/group | 1 +
>> 3 files changed, 132 insertions(+)
>> create mode 100755 tests/qemu-iotests/217
>> create mode 100644 tests/qemu-iotests/217.out
>>
>
>> +
>> +# Internal snapshots are (currently) impossible with refcount_bits=1
>> +_unsupported_imgopts 'refcount_bits=1[^0-9]'
>> +
>
> And even if they are made possible (by cloning clusters instead of
> incrementing refcounts), you STILL need a refcount > 1 to test the
> particular repair functionality just fixed.
Ah, right...
> Up to you if you want to reword that comment a bit.
Yes, will do. ("This test needs clusters with at least a refcount of 2
so that OFLAG_COPIED is not set. refcount_bits=1 is therefore
unsupported.")
> Reviewed-by: Eric Blake <eblake@redhat.com>
(As always,) thanks!
Max
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-05-01 15:40 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-04-28 16:34 [Qemu-devel] [PATCH 0/2] qcow2: Repair OFLAG_COPIED when fixing leaks Max Reitz
2018-04-28 16:34 ` [Qemu-devel] [PATCH 1/2] " Max Reitz
2018-04-30 15:55 ` Eric Blake
2018-04-28 16:34 ` [Qemu-devel] [PATCH 2/2] iotests: Repairing error during snapshot deletion Max Reitz
2018-04-30 16:01 ` Eric Blake
2018-05-01 15:40 ` Max Reitz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).