From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35574) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aby4o-000686-1Y for qemu-devel@nongnu.org; Fri, 04 Mar 2016 17:20:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aby4k-0006mr-1E for qemu-devel@nongnu.org; Fri, 04 Mar 2016 17:20:02 -0500 Received: from mx1.redhat.com ([209.132.183.28]:48699) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aby4j-0006mn-QX for qemu-devel@nongnu.org; Fri, 04 Mar 2016 17:19:57 -0500 References: <20160304042411.UUWFM.307100.root@dnvrco-web23> From: Eric Blake Message-ID: <56DA0A0B.20509@redhat.com> Date: Fri, 4 Mar 2016 15:19:55 -0700 MIME-Version: 1.0 In-Reply-To: <20160304042411.UUWFM.307100.root@dnvrco-web23> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="wJdXrtrT4e6QjKO3PqwO3wcjGkvAPVIbO" Subject: Re: [Qemu-devel] QCow2 compression List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: mgreger@cinci.rr.com, qemu-devel@nongnu.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --wJdXrtrT4e6QjKO3PqwO3wcjGkvAPVIbO Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable [any way you could convince your mailer to not break threading?] On 03/03/2016 09:24 PM, mgreger@cinci.rr.com wrote: >> >> The zeros are not part of the compressed data, though, that's why the = >> Compressed Cluster Descriptor indicates a shorter size. Had another=20 >> compressed cluster been written to the same image, it might have ended= =20 >> up where you are seeing the zero padding now. (The trick with=20 >> compression is that multiple guest clusters can end up in a single hos= t=20 >> cluster.)=20 >> > =20 > Thanks, but the given length of 0x5600 is still short by 160(decimal) b= ytes=20 > compared to the=20 > non-zero data (which occupies an additional 133 bytes beyond the expect= ed end at=20 > 0x3DED50) and zero=20 > padding (an additional 27 bytes beyond that). Could there be an off-by-= one error=20 > somewhere?=20 > The file doesn't even end on a sector boundary let alone a cluster boun= dary.=20 Based on an IRC conversation I had when you first asked the question, I think the spec is indeed weak, and that we DO have some fishy code. Look at what the code does: uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, uint64_t offset, int compressed_size) =2E.. nb_csectors =3D ((cluster_offset + compressed_size - 1) >> 9) - (cluster_offset >> 9); cluster_offset |=3D QCOW_OFLAG_COMPRESSED | ((uint64_t)nb_csectors << s->csize_shift); That sure does sound like an off-by-one. cluster_offset does indeed look like a byte offset (from qcow2_alloc_bytes); so let's consider what happens if we've already allocated one compressed cluster in the past, going from 65536 to 66999. So on this call, cluster_offset would be 67000, and if compressed size is 1025 (just round numbers to make discussion easy; no idea if gzip would really do this on any particular data), we are computing ((67000+1025-1)>>9) - (67000>>9) =3D=3D 2, but 10= 25 bytes occupies 3 sectors, not 2. But we offset it by another off-by-one: int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offse= t) { =2E.. nb_csectors =3D ((cluster_offset >> s->csize_shift) & s->csize_mask) + 1; Yuck. That is horribly ugly. We need to fix the documentation to make it obvious that the sector count is a _LOWER BOUND_ on the number of sectors occupied, and that you need to read at least one more cluster's worth of data before decompressi= ng. It would also be nice to fix qcow2 code to not have quite so many off-by-one computations, but be more precise about what data is going whe= re. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --wJdXrtrT4e6QjKO3PqwO3wcjGkvAPVIbO Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJW2goMAAoJEKeha0olJ0NqVzkH/iMHwSD/0g5gUZpwOBKS20WF vftFXCZ3Lwp7sdqkfBxwp+2129IcF8I95+J6tRdYoStfAaKLTsglROlnh0Ap4/GA 2qE0n2T+y+RRKRAoWtItwjPSkvN5vgY/tZv/pK3SNE8JLSE+QR+0v2EbBpKmAvDR mwL5BZrqi0LssJlWMML9Wjl0a5DMWN99kLsog698X751lCOr/P13OM2f5Wt6fdup 3T0bj635FZNpqeFvPjqGeuVpDRK68idw6vvJ1S+eU4pdm2BNF/XlZs7jC7IBmF9R Spbw5x1HceLpMnMv5PFcRbFmCuZzQqhDIR8zxdqFOOhy8fBH5mlyF/dKELUyGLo= =0cZA -----END PGP SIGNATURE----- --wJdXrtrT4e6QjKO3PqwO3wcjGkvAPVIbO--