Re: [Qemu-devel] [PATCH] qcow2: do not allocate extra memory

qemu-devel.nongnu.org archive mirror

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: mreitz@redhat.com, kwolf@redhat.com, stefanha@redhat.com,
	den@openvz.org, fabrice@bellard.org
Subject: Re: [Qemu-devel] [PATCH] qcow2: do not allocate extra memory
Date: Tue, 12 Jul 2016 22:11:48 +0300
Message-ID: <578540F4.2070309@virtuozzo.com>
In-Reply-To: <57853A35.4030501@redhat.com>

On 12.07.2016 21:43, Eric Blake wrote:
> On 07/12/2016 11:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> There is no need to allocate more than one cluster, as we set
>> avail_out for deflate to one cluster.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> Hi all!
>>
>> Can anybody please tell me what I'm missing?
>>
> https://tools.ietf.org/html/rfc1951 states a simple fact about compression:
>
>        A simple counting argument shows that no lossless compression
>        algorithm can compress every possible input data set.  For the
>        format defined here, the worst case expansion is 5 bytes per 32K-
>        byte block, i.e., a size increase of 0.015% for large data sets.
>
> So overallocating the output buffer guarantees that you will get a valid
> compression result via a single function call, even when the data is
> incompressible (the zlib format specifically documents that if the
> normal algorithm on the data does not reduce its size, then you merely
> add a fixed-length marker that documents that fact, so you at least
> avoid unbounded expansion when trying to compress pathological data).
>
> But since the qcow2 format already has a way of documenting whether a
> cluster is compressed or not, we probably don't have to rely on zlib's
> marker for uncompressible data, and could instead tweak the code to
> specifically refuse to compress any cluster whose output would result in
> more than a cluster's worth of bytes.  I'm not familiar enough with
> zlib's interface to know how easy or hard this is, and whether merely
> checking error codes is sufficient, nor whether qemu's use of zlib would
> behave correctly in the face of such an error when the output buffer is
> undersized because the data was incompressible.

The zlib documentation says:

"deflate compresses as much data as possible, and stops when the input
buffer becomes empty or the output buffer becomes full"

So it will not write more than avail_out bytes to the output buffer.


>
>
>> ...
>> strm.avail_out = s->cluster_size;
>> strm.next_out = out_buf;
>>
>> ret = deflate(&strm, Z_FINISH);
>> ...
>> out_len = strm.next_out - out_buf;
> You've skipped what is done with ret, which will be different according
> to whether the entire compressed stream fit in the buffer described by
> strm, and that would have to be audited as part of your proposed patch.

ret would be Z_STREAM_END if the compressed stream fit and Z_OK if not
(assuming there are no errors, of course). What have I skipped? I am just
saying that nothing knows about this extra allocation: neither zlib nor
any other code in this function (except g_free()).

>
>
>> -    out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
>> +    out_buf = g_malloc(s->cluster_size);
> Is avoiding the fudge factor really worth it? I don't know that we'll
> get a noticeable performance gain with this patch, and it may be easier
> to leave things alone than to audit that we are correctly handling cases
> where the attempt at compression results in a zlib buffer larger than
> the original data, even when the output buffer size is now constrained
> differently.
>

I'm not insisting on applying this patch; I am just trying to understand this.
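
For reference, the sizes under discussion can be put side by side with
a little arithmetic. This is only a sketch: the function names are
hypothetical, and the worst-case figure is just the raw deflate number
quoted from RFC 1951 above (5 bytes per 32K block). It does not include
any wrapper bytes, and zlib's own deflateBound() may report a larger
value.

```c
#include <assert.h>
#include <stddef.h>

/* Output buffer size before the patch (cluster plus fudge factor). */
static size_t old_out_buf_size(size_t cluster_size)
{
    return cluster_size + (cluster_size / 1000) + 128;
}

/* Output buffer size after the patch. */
static size_t new_out_buf_size(size_t cluster_size)
{
    return cluster_size;
}

/* Worst case per the RFC 1951 quote: 5 extra bytes per 32K block. */
static size_t rfc1951_worst_case(size_t len)
{
    size_t blocks = (len + 32768 - 1) / 32768;

    if (blocks == 0) {
        blocks = 1;
    }
    return len + 5 * blocks;
}

/* Bytes of allocation the patch removes for one cluster. */
static size_t bytes_saved(size_t cluster_size)
{
    size_t before = old_out_buf_size(cluster_size);
    size_t after = new_out_buf_size(cluster_size);

    assert(before >= after);
    return before - after;
}
```

Assuming a 64 KiB cluster, the old buffer is 65729 bytes, the RFC
worst case is 65546 bytes, and the patch saves 193 bytes per call.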

-- 
Best regards,
Vladimir

Thread overview: 6+ messages
2016-07-12 17:43 [Qemu-devel] [PATCH] qcow2: do not allocate extra memory Vladimir Sementsov-Ogievskiy
2016-07-12 18:43 ` Eric Blake
2016-07-12 19:11   ` Vladimir Sementsov-Ogievskiy [this message]
2016-07-12 20:30     ` Eric Blake
2016-07-12 19:03 ` [Qemu-devel] [Qemu-block] " John Snow
2016-07-12 19:19 ` [Qemu-devel] " Vladimir Sementsov-Ogievskiy
