From: <mgreger@cinci.rr.com>
To: Eric Blake <eblake@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] QCow2 compression
Date: Sat, 5 Mar 2016 0:11:44 +0000 [thread overview]
Message-ID: <20160305001144.6JX6U.312524.root@dnvrco-web07> (raw)
---- Eric Blake <eblake@redhat.com> wrote:
> [any way you could convince your mailer to not break threading?]
>
> On 03/03/2016 09:24 PM, mgreger@cinci.rr.com wrote:
> >>
> >> The zeros are not part of the compressed data, though, that's why the
> >> Compressed Cluster Descriptor indicates a shorter size. Had another
> >> compressed cluster been written to the same image, it might have ended
> >> up where you are seeing the zero padding now. (The trick with
> >> compression is that multiple guest clusters can end up in a single host
> >> cluster.)
> >>
> >
> > Thanks, but the given length of 0x5600 is still short by 160(decimal) bytes
> > compared to the
> > non-zero data (which occupies an additional 133 bytes beyond the expected end at
> > 0x3DED50) and zero
> > padding (an additional 27 bytes beyond that). Could there be an off-by-one error
> > somewhere?
> > The file doesn't even end on a sector boundary let alone a cluster boundary.
>
> Based on an IRC conversation I had when you first asked the question, I
> think the spec is indeed weak, and that we DO have some fishy code.
>
> Look at what the code does:
>
> uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
> uint64_t offset,
> int compressed_size)
> ...
> nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) -
> (cluster_offset >> 9);
>
> cluster_offset |= QCOW_OFLAG_COMPRESSED |
> ((uint64_t)nb_csectors << s->csize_shift);
>
> That sure does sound like an off-by-one. cluster_offset does indeed
> look like a byte offset (from qcow2_alloc_bytes); so let's consider what
> happens if we've already allocated one compressed cluster in the past,
> going from 65536 to 66999. So on this call, cluster_offset would be
> 67000, and if compressed size is 1025 (just round numbers to make
> discussion easy; no idea if gzip would really do this on any particular
> data), we are computing ((67000+1025-1)>>9) - (67000>>9) == 2, but 1025
> bytes occupies 3 sectors, not 2.
>
> But we offset it by another off-by-one:
>
> int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
> {
> ...
> nb_csectors = ((cluster_offset >> s->csize_shift) &
> s->csize_mask) + 1;
>
> Yuck. That is horribly ugly.
>
> We need to fix the documentation to make it obvious that the sector
> count is a _LOWER BOUND_ on the number of sectors occupied, and that you
> need to read at least one more cluster's worth of data before decompressing.
>
> It would also be nice to fix qcow2 code to not have quite so many
> off-by-one computations, but be more precise about what data is going where.
>
> --
> Eric Blake eblake redhat com +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Thanks for verifying the behavior for me. This will
allow me to add compression support to the code I've
already written.
I've implemented QCow2 support for a fork of the popular
DOSBox emulator.
The project (not mine), is called DOSBox-X, and is geared
towards emulating Windows9x era hardware with a specific
focus on games: http://dosbox-x.com/
If you'd like to give me any feedback on the QCow2 code I've written
please feel free to e-mail me directly. The relevant files are
qcow2_disk.cpp and qcow_disk.h. If you do, please be kind as
my C++ skills are very rusty. Overall I think it is pretty readable.
It took me about 2 and a half weeks working in my spare time to go
from reading the spec to the final code integrated into DOSBox-X.
Thanks,
Michael Greger
next reply other threads:[~2016-03-05 0:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-05 0:11 mgreger [this message]
-- strict thread matches above, loose matches on Subject: below --
2016-03-04 4:24 [Qemu-devel] QCow2 compression mgreger
2016-03-04 22:19 ` Eric Blake
2016-02-27 5:00 mgreger
2016-02-29 14:01 ` Kevin Wolf
2016-02-29 15:58 ` Eric Blake
2016-02-29 14:59 ` Eric Blake
2016-02-29 15:54 ` Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160305001144.6JX6U.312524.root@dnvrco-web07 \
--to=mgreger@cinci.rr.com \
--cc=eblake@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).