Message-ID: <5462359D.4040503@redhat.com>
Date: Tue, 11 Nov 2014 09:13:17 -0700
From: Eric Blake
References: <1415627159-15941-1-git-send-email-mreitz@redhat.com> <1415627159-15941-6-git-send-email-mreitz@redhat.com> <54612A27.7000801@redhat.com> <5461C751.3080607@redhat.com>
In-Reply-To: <5461C751.3080607@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 05/21] qcow2: Refcount overflow and qcow2_alloc_bytes()
To: Max Reitz, qemu-devel@nongnu.org
Cc: Kevin Wolf, Peter Lieven, Stefan Hajnoczi

On 11/11/2014 01:22 AM, Max Reitz wrote:
> On 2014-11-10 at 22:12, Eric Blake wrote:
>> On 11/10/2014 06:45 AM, Max Reitz wrote:
>>> qcow2_alloc_bytes() may reuse a cluster multiple times, in which case
>>> the refcount is increased accordingly. However, if this would lead to an
>>> overflow the function should instead just not reuse this cluster and
>>> allocate a new one.
>>
>> So if refcount_order is 1 (2 bits per refcount, max refcount of 4
>
> *max refcount of 3 (0b11)

Oh right, because 0 is special.  Although I think I figured that out...

>
>> ), and
>> we encounter the same cluster 6 times (say by 5 back-to-back internal
>> snapshots), does this code optimize to only 2 clusters (both with
>> refcount 3) or does it result in each of the last 3 clusters spilling to

...when talking about 3 shares of a cluster.

>> its own 1-ref cluster for a total of 4 clusters?  Short of Benoit's work
>> on deduplication, is there even a way to avoid inefficient use of
>> spilled clusters?
>
> I'm not sure what you're referring to; maybe I should add that
> qcow2_alloc_bytes() is used for allocating compressed clusters (which
> ideally don't take up a full host cluster), so "reuse" in this context
> just means that several compressed clusters share one host cluster.

No, I was thinking about internal snapshots rather than compressed
clusters (although there's probably some overlap on what happens).

>
> Maybe you're referring to the following situation: We have the default
> cluster size of 64k. Now we're trying to allocate 16k for each of the
> compressed clusters A, B, C and D. D won't fit into that cluster because
> the maximum refcount is three, so it will be put into a newly allocated
> host cluster. Finally, we're trying to allocate 32k for a compressed
> cluster E, which will then be put into the same cluster as D.
> We therefore have the following allocation (each sub-box representing 16k):
>
> +---+---+---+---+  +---+---+---+---+
> | A | B | C |   |  | D |   E   |   |
> +---+---+---+---+  +---+---+---+---+
>
> whereas the ideal allocation would be:
>
> +---+---+---+---+  +---+---+---+---+
> | A | B |   E   |  | C | D |   |   |
> +---+---+---+---+  +---+---+---+---+
>
> This is a problem, but I think first it's a minor one (just use a
> sufficiently large refcount width if you're going to use compressed
> clusters) and second it's about compressed clusters, whose performance I
> could hardly care less about, frankly.

No, I was envisioning that we have a brand new image with one cluster
allocated (cluster 1 has refcount 1), then 5 times in a row we do
'savevm' to take an internal snapshot.  If I understand your code
correctly, the first two snapshots increase the refcount, so cluster 1
has a refcount of 3.  Then the next snapshot can't increase the
refcount, so it instead copies the contents to cluster 2.  The fourth
and fifth snapshots also see that cluster 1 is full, and allocate
cluster 3 and 4; whereas a more efficient usage would increase the
refcount of cluster 2 instead of allocating.
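
To make the arithmetic concrete, here is a toy model of the scenario
above.  It is only a sketch of the behaviour as I described it, not
qcow2 code: max_refcount(), share() and simulate() are hypothetical
names, and the "spill to a fresh copy when the refcount is full" policy
is my assumption about what happens, not something the patch is
confirmed to do for snapshots.

```python
def max_refcount(refcount_order):
    """Largest value that fits in a refcount of 2**refcount_order bits."""
    bits = 1 << refcount_order
    return (1 << bits) - 1


def share(refcounts, limit, reuse_spilled):
    """Record one more reference to the same data (mutates refcounts).

    refcounts[i] is the refcount of the i-th copy of the data; index 0 is
    the original cluster.  If reuse_spilled is False, only the original
    cluster is considered before a new copy is allocated.
    """
    candidates = range(len(refcounts)) if reuse_spilled else [0]
    for i in candidates:
        if refcounts[i] < limit:
            refcounts[i] += 1
            return
    # Every candidate is at the limit: spill into a newly allocated copy.
    refcounts.append(1)


def simulate(snapshots, refcount_order, reuse_spilled):
    """Return the per-copy refcounts after taking the given snapshots."""
    limit = max_refcount(refcount_order)
    refcounts = [1]  # brand new image: one cluster with refcount 1
    for _ in range(snapshots):
        share(refcounts, limit, reuse_spilled)
    return refcounts


if __name__ == "__main__":
    print(max_refcount(1))                       # 2-bit refcounts
    print(simulate(5, 1, reuse_spilled=False))   # one copy per overflow
    print(simulate(5, 1, reuse_spilled=True))    # refill the spilled copy
```

With refcount_order 1 and five snapshots, the first policy ends with
four clusters and the second with two, which is the difference I am
asking about.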
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org