Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor

From: Eric Blake <eblake@redhat.com>
To: Alberto Garcia <berto@igalia.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor
Date: Tue, 20 Feb 2018 16:03:00 -0600
Message-ID: <6064e2b5-c7aa-7ec1-b917-5431f0dc57a8@redhat.com>
In-Reply-To: <2d9713f2-87c1-12ba-3f46-a9a476279cda@redhat.com>

On 02/20/2018 01:40 PM, Eric Blake wrote:
> On 02/20/2018 11:01 AM, Alberto Garcia wrote:
> 
> tl;dr: I think we need a v3 with even more clarification.
> 

> 
> I'm also making an additional observation: Due to the pigeonhole 
> principle and the fact that the compression stream adds metadata, we 
> KNOW that there are some (rare) cases where attempting to compress data 
> will actually result in an INCREASE in size ('man gzip' backs up this 
> claim, calling out a worst case -0.015% compression ratio, or about 15 
> bytes added for every 100,000 bytes of input, on incompressible data).  So 
> presumably, we should state that a cluster can only be written in 
> compressed form IF it occupies less space than the uncompressed cluster 
> (we could also allow a compressed form that occupies the same length as 
> the uncompressed cluster, but that's a waste of CPU cycles).
> 
> Once we have that restriction stated, then it becomes obvious that a 
> compressed cluster should never REQUIRE using more than one host cluster 
> (and this is backed up by qcow2_alloc_bytes() asserting that size <= 
> s->cluster_size).  Where things get interesting, though, is whether we 
> PERMIT a compressed cluster to overlap a host cluster boundary. 
> Technically, it might be possible, but qemu does NOT do that (again, 
> looking at qcow2_alloc_bytes() - we loop if free_in_cluster < size) - so 
> we may want to be explicit about this point to prevent OTHER 
> implementations from creating a compressed cluster that crosses host 
> cluster boundaries (right now, I can't see qcow2_decompress_cluster() 
> validating it, though - YIKES).

That said, a simple patch to try this:

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8c4b26ceaf2..85b5dbd9c16 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1598,6 +1598,15 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
          sector_offset = coffset & 511;
          csize = nb_csectors * 512 - sector_offset;

+        /* We never write a compressed cluster that crosses host
+         * cluster boundaries; reject images that do that.  */
+        if (csize + (coffset % s->cluster_size) > s->cluster_size) {
+            qcow2_signal_corruption(bs, true, coffset, csize,
+                                    "Compressed cluster at %#" PRIx64
+                                    " crosses host cluster boundary", coffset);
+            return -EIO;
+        }
+
          /* Allocate buffers on first decompress operation, most images are
           * uncompressed and the memory overhead can be avoided.  The 
buffers
           * are freed in .bdrv_close().

triggers failures in iotests 122:

--- /home/eblake/qemu/tests/qemu-iotests/122.out	2017-10-06 13:45:25.559279136 -0500
+++ /home/eblake/qemu/tests/qemu-iotests/122.out.bad	2018-02-20 15:54:29.890221575 -0600
@@ -117,8 +117,8 @@
  convert -c -S 0:
  read 3145728/3145728 bytes at offset 0
  3 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-read 63963136/63963136 bytes at offset 3145728
-61 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Marking image as corrupt: Compressed cluster at 0x5ffd2 crosses host cluster boundary; further corruption events will be suppressed
+read failed: Input/output error
  [{ "start": 0, "length": 67108864, "depth": 0, "zero": false, "data": true}]

so it looks like I'm reading qcow2_alloc_bytes() wrong and that we CAN 
have a compressed cluster that crosses host cluster boundaries?
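
The arithmetic of the check in the patch can be reproduced outside qemu. 
The following is only a minimal sketch: crosses_host_cluster() is a 
hypothetical helper, not a qemu function, and the 64 KiB cluster size 
used in the note below is an assumption rather than something taken from 
iotest 122.  Note that csize here is the upper bound derived from the 
sector count, as in the patch context, not the true length of the 
compressed stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of qemu: mirrors the check in the
 * patch above.  coffset is the host byte offset of the compressed
 * data; nb_csectors is the number of 512-byte sectors it spans. */
static bool crosses_host_cluster(uint64_t coffset, uint64_t nb_csectors,
                                 uint64_t cluster_size)
{
    uint64_t sector_offset = coffset & 511;
    /* Upper bound on the compressed size, as in the patch context. */
    uint64_t csize = nb_csectors * 512 - sector_offset;

    /* Assumed preconditions: at least one sector, and a power-of-two
     * cluster size of at least one sector. */
    assert(nb_csectors > 0);
    assert(cluster_size >= 512 &&
           (cluster_size & (cluster_size - 1)) == 0);

    return csize + (coffset % cluster_size) > cluster_size;
}
```

With an assumed 64 KiB cluster size, offset 0x5ffd2 sits 46 bytes before 
the end of its host cluster, so crosses_host_cluster(0x5ffd2, 1, 65536) 
is false while any nb_csectors >= 2 makes it true.  In other words, the 
check fires for that offset as soon as the descriptor claims more than 
one sector, whether or not the compressed stream really extends that far.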


> So if I may suggest:
> 
>     x+1 - 61:    Number of additional 512-byte sectors used for the
>                  compressed data, beyond the sector containing the
>                  offset in the previous field.  These sectors must fit
>                  within the same host cluster.

This sentence needs tweaking to match reality, given that my simple 
patch to flag compressed clusters that cross host cluster boundaries 
triggered (or I need to figure out what was wrong with my patch).

>  Note that the compressed
>                  data does not necessarily occupy all of the bytes in
>                  the final sector; rather, decompression stops when it
>                  has produced a cluster of data.  Another compressed
>                  cluster may map to the tail of the final sector used
>                  by this compressed cluster.
> 
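
The sector arithmetic implied by the wording above can be sketched as 
follows.  This is an illustration only: struct csector_span and 
compressed_span() are hypothetical names, and the field is interpreted 
as "additional sectors beyond the one containing the offset", per the 
proposed text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not qemu code. */
struct csector_span {
    uint64_t first_sector; /* 512-byte sector containing the offset */
    uint64_t last_sector;  /* final sector that may hold compressed data */
    uint64_t max_bytes;    /* upper bound on the compressed bytes */
};

static struct csector_span compressed_span(uint64_t coffset,
                                           uint64_t additional_sectors)
{
    struct csector_span s;

    s.first_sector = coffset / 512;
    s.last_sector = s.first_sector + additional_sectors;
    /* The data starts partway into the first sector, so the usable
     * bytes are the whole span minus the offset within that sector. */
    s.max_bytes = (additional_sectors + 1) * 512 - (coffset & 511);
    return s;
}
```

For example, offset 0x5ffd2 with one additional sector covers sectors 
767 and 768 and allows at most 558 bytes of compressed data.  A second 
compressed cluster whose offset falls inside sector 768 would then share 
that final sector, which is the case the last sentence of the proposed 
text describes.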

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Thread overview: 5+ messages
2018-02-20 17:01 [Qemu-devel] [PATCH v2] specs/qcow2: Fix documentation of the compressed cluster descriptor Alberto Garcia
2018-02-20 19:40 ` Eric Blake
2018-02-20 22:03   ` Eric Blake [this message]
2018-02-20 22:13     ` Eric Blake
2018-02-21 13:23   ` Alberto Garcia