From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: mreitz@redhat.com, kwolf@redhat.com, stefanha@redhat.com,
den@openvz.org, fabrice@bellard.org
Subject: Re: [Qemu-devel] [PATCH] qcow2: do not allocate extra memory
Date: Tue, 12 Jul 2016 22:11:48 +0300 [thread overview]
Message-ID: <578540F4.2070309@virtuozzo.com> (raw)
In-Reply-To: <57853A35.4030501@redhat.com>
On 12.07.2016 21:43, Eric Blake wrote:
> On 07/12/2016 11:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> There are no needs to allocate more than one cluster, as we set
>> avail_out for deflate to one cluster.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> Hi all!
>>
>> Please, can anybody say me what I'm missing?
>>
> https://tools.ietf.org/html/rfc1951 states a simple fact about compression:
>
> A simple counting argument shows that no lossless compression
> algorithm can compress every possible input data set. For the
> format defined here, the worst case expansion is 5 bytes per 32K-
> byte block, i.e., a size increase of 0.015% for large data sets.
>
> So overallocating the output buffer guarantees that you will get a valid
> compression result via a single function call, even when the data is
> incompressible (the zlib format specifically documents that if the
> normal algorithm on the data does not reduce its size, then you merely
> add a fixed-length marker that documents that fact, so you at least
> avoid unbounded expansion when trying to compress pathological data).
>
> But since the qcow2 format already has a way of documenting whether a
> cluster is compressed or not, we probably don't have to rely on zlib's
> marker for uncompressible data, and could instead tweak the code to
> specifically refuse to compress any cluster whose output would result in
> more than a cluster's worth of bytes. I'm not familiar enough with
> zlib's interface to know how easy or hard this is, and whether merely
> checking error codes is sufficient, nor whether qemu's use of zlib would
> behave correctly in the face of such an error when the output buffer is
> undersized because the data was incompressible.
zlib doc says:
"deflate compresses as much data as possible, and stops when the input
buffer becomes empty or the output buffer becomes full"
It will not write more then avail_out bytes to out buffer.
>
>
>> ...
>> strm.avail_out = s->cluster_size;
>> strm.next_out = out_buf;
>>
>> ret = deflate(&strm, Z_FINISH);
>> ...
>> out_len = strm.next_out - out_buf;
> You've skipped what is done with ret, which will be different according
> to whether the entire compressed stream fit in the buffer described by
> strm, and that would have to be audited as part of your proposed patch.
ret would be Z_STREAM_END if it fit in and Z_OK if not. (if there are no
errors ofcourse). What I've skipped? I just say that nobody knows about
this extra allocation - neither zlib nor other code in this function
(except g_free=).
>
>
>> - out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
>> + out_buf = g_malloc(s->cluster_size);
> Is avoiding the fudge factor really worth it? I don't know that we'll
> get a noticeable performance gain with this patch, and it may be easier
> to leave things alone than to audit that we are correctly handling cases
> where the attempt at compression results in a zlib buffer larger than
> the original data, even when the output buffer size is now constrained
> differently.
>
I'm not insist on applying this patch, I am just trying to understand this.
--
Best regards,
Vladimir
next prev parent reply other threads:[~2016-07-12 19:12 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-12 17:43 [Qemu-devel] [PATCH] qcow2: do not allocate extra memory Vladimir Sementsov-Ogievskiy
2016-07-12 18:43 ` Eric Blake
2016-07-12 19:11 ` Vladimir Sementsov-Ogievskiy [this message]
2016-07-12 20:30 ` Eric Blake
2016-07-12 19:03 ` [Qemu-devel] [Qemu-block] " John Snow
2016-07-12 19:19 ` [Qemu-devel] " Vladimir Sementsov-Ogievskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=578540F4.2070309@virtuozzo.com \
--to=vsementsov@virtuozzo.com \
--cc=den@openvz.org \
--cc=eblake@redhat.com \
--cc=fabrice@bellard.org \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).