qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Max Reitz <mreitz@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, qemu-block@nongnu.org, berto@igalia.com
Subject: Re: [Qemu-devel] [PATCH v5 5/5] qcow2: Avoid memory over-allocation on compressed images
Date: Wed, 25 Apr 2018 13:37:06 -0500	[thread overview]
Message-ID: <99a0fa78-0d55-9b93-b29d-fd28537aeadb@redhat.com> (raw)
In-Reply-To: <5cb0271b-6b03-7520-cd3d-46deff0ee682@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2235 bytes --]

On 04/25/2018 10:00 AM, Max Reitz wrote:
> On 2018-04-24 00:33, Eric Blake wrote:
>> When reading a compressed image, we were allocating s->cluster_data
>> to 32*cluster_size + 512 (possibly over 64 megabytes, for an image
>> with 2M clusters).  Let's check out the history:
>>

>>
>> So, it's time to cut back on the waste.  A compressed cluster
>> written by qemu will NEVER occupy more than an uncompressed
>> cluster, but based on mid-sector alignment, we may still need
>> to read 1 cluster + 1 sector in order to recover enough bytes
>> for the decompression.  But third-party producers of qcow2 may
>> not be as smart, and gzip DOES document that because the
>> compression stream adds metadata, and because of the pigeonhole
>> principle, there are worst case scenarios where attempts to
>> compress will actually inflate an image, by up to 0.015% (or 62
>> sectors larger for an unfortunate 2M compression).
> 
> Hm?  2M * 0.00015 < 315 (bytes!), so where does that number come from?

/me puts on the cone of shame...

Umm, it comes from me forgetting that percents need to be scaled before
multiplying.  31.5k bytes is 62 sectors, but you're right that 315 bytes
is only 1 sector.

> 
>>                                                     In fact,
>> the qcow2 spec permits an all-ones sector count, plus a full
>> sector containing the initial offset, for a maximum read of
>> 2 full clusters; and thanks to the way decompression works,
>> even though such a value is probably too large for the actual
>> compressed data, it really doesn't matter if we read too much
>> (gzip ignores slop, once it has decoded a full cluster).  So
>> it's easier to just allocate cluster_data to be as large as we
>> can ever possibly see; even if it still wastes up to 2M on any
>> image created by qcow2, that's still an improvment of 60M less
>> waste than pre-patch.
> 
> OK, so from the technical perspective it's irrelevant anyway, but I
> suppose the number should still be fixed in the commit message.

Hey, at least it's a math error instead of a typo ;)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

      reply	other threads:[~2018-04-25 18:37 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 22:33 [Qemu-devel] [PATCH v5 0/5] minor qcow2 compression improvements Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 1/5] qcow2: Prefer byte-based calls into bs->file Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 2/5] qcow2: Document some maximum size constraints Eric Blake
2018-04-24  9:13   ` Alberto Garcia
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 3/5] qcow2: Reduce REFT_OFFSET_MASK Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 4/5] qcow2: Don't allow overflow during cluster allocation Eric Blake
2018-04-25 14:44   ` Max Reitz
2018-04-25 18:26     ` Eric Blake
2018-04-25 20:31     ` Eric Blake
2018-04-23 22:33 ` [Qemu-devel] [PATCH v5 5/5] qcow2: Avoid memory over-allocation on compressed images Eric Blake
2018-04-25 15:00   ` Max Reitz
2018-04-25 18:37     ` Eric Blake [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99a0fa78-0d55-9b93-b29d-fd28537aeadb@redhat.com \
    --to=eblake@redhat.com \
    --cc=berto@igalia.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).