qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Denis Plotnikov <dplotnikov@virtuozzo.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org
Cc: kwolf@redhat.com, berto@igalia.com, qemu-block@nongnu.org,
	armbru@redhat.com, den@openvz.org
Subject: Re: [PATCH v22 3/4] qcow2: add zstd cluster compression
Date: Thu, 30 Apr 2020 13:47:06 +0200	[thread overview]
Message-ID: <22b2918b-3bdd-6627-a378-baddb342dacc@redhat.com> (raw)
In-Reply-To: <477e76fd-ff73-ac34-c636-c8d9ae6be30d@virtuozzo.com>


[-- Attachment #1.1: Type: text/plain, Size: 5961 bytes --]

On 30.04.20 11:48, Denis Plotnikov wrote:
> 
> 
> On 30.04.2020 11:26, Max Reitz wrote:
>> On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.04.2020 15:17, Max Reitz wrote:
>>>> On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 29.04.2020 13:24, Max Reitz wrote:
>>>>>> On 28.04.20 22:00, Denis Plotnikov wrote:
>>>>>>> zstd significantly reduces cluster compression time.
>>>>>>> It provides better compression performance maintaining
>>>>>>> the same level of the compression ratio in comparison with
>>>>>>> zlib, which, at the moment, is the only compression
>>>>>>> method available.
>>>>>>>
>>>>>>> The performance test results:
>>>>>>> Test compresses and decompresses qemu qcow2 image with just
>>>>>>> installed rhel-7.6 guest.
>>>>>>> Image cluster size: 64K. Image on disk size: 2.2G
>>>>>>>
>>>>>>> The test was conducted with brd disk to reduce the influence
>>>>>>> of disk subsystem to the test results.
>>>>>>> The results is given in seconds.
>>>>>>>
>>>>>>> compress cmd:
>>>>>>>      time ./qemu-img convert -O qcow2 -c -o
>>>>>>> compression_type=[zlib|zstd]
>>>>>>>                      src.img [zlib|zstd]_compressed.img
>>>>>>> decompress cmd
>>>>>>>      time ./qemu-img convert -O qcow2
>>>>>>>                      [zlib|zstd]_compressed.img uncompressed.img
>>>>>>>
>>>>>>>               compression               decompression
>>>>>>>             zlib       zstd           zlib         zstd
>>>>>>> ------------------------------------------------------------
>>>>>>> real     65.5       16.3 (-75 %)    1.9          1.6 (-16 %)
>>>>>>> user     65.0       15.8            5.3          2.5
>>>>>>> sys       3.3        0.2            2.0          2.0
>>>>>>>
>>>>>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57
>>>>>>> compressed image size in both cases: 1.4G
>>>>>>>
>>>>>>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>>>>>>> QAPI part:
>>>>>>> Acked-by: Markus Armbruster <armbru@redhat.com>
>>>>>>> ---
>>>>>>>     docs/interop/qcow2.txt |   1 +
>>>>>>>     configure              |   2 +-
>>>>>>>     qapi/block-core.json   |   3 +-
>>>>>>>     block/qcow2-threads.c  | 169
>>>>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>>>>     block/qcow2.c          |   7 ++
>>>>>>>     slirp                  |   2 +-
>>>>>>>     6 files changed, 181 insertions(+), 3 deletions(-)
>>>>>> [...]
>>>>>>
>>>>>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>>>>>> index 7dbaf53489..a0b12e1b15 100644
>>>>>>> --- a/block/qcow2-threads.c
>>>>>>> +++ b/block/qcow2-threads.c
>>>>>> [...]
>>>>>>
>>>>>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
>>>>>>> +                                     const void *src, size_t
>>>>>>> src_size)
>>>>>>> +{
>>>>>> [...]
>>>>>>
>>>>>>> +    /*
>>>>>>> +     * The compressed stream from the input buffer may consist of
>>>>>>> more
>>>>>>> +     * than one zstd frame.
>>>>>> Can it?
>>>>> If not, we must require it in the specification.
>>>> Actually, now that you mention it, it would make sense anyway to add
>>>> some note to the specification on what exactly compressed with zstd
>>>> means.
>>>>
>>>>> Hmm. If at some point
>>>>> we'll want multi-threaded compression of one big (2M) cluster.. Could
>>>>> this be implemented with zstd lib, if multiple frames are allowed,
>>>>> will
>>>>> allowing multiple frames help? I don't know actually, but I think
>>>>> better
>>>>> not to forbid it. On the other hand, I don't see any benefit in large
>>>>> compressed clusters. At least, in our scenarios (for compressed
>>>>> backups)
>>>>> we use 64k compressed clusters, for good granularity of incremental
>>>>> backups (when for running vm we use 1M clusters).
>>>> Is it really that important?  Naïvely, it sounds rather complicated to
>>>> introduce multithreading into block drivers.
>>> It is already here: compression and encryption already multithreaded.
>>> But of course, one cluster is handled in one thread.
>> Ah, good.  I forgot.
>>
>>>> (Also, as for compression, it can only be used in backup scenarios
>>>> anyway, where you write many clusters at once.  So parallelism on the
>>>> cluster level should sufficient to get high usage, and it would benefit
>>>> all compression types and cluster sizes.)
>>>>
>>> Yes it works in this way already :)
>> Well, OK then.
>>
>>> So, we don't know do we want one frame restriction or not. Do you have a
>>> preference?
>> *shrug*
>>
>> Seems like it would be preferential to allow multiple frames still.  A
>> note in the spec would be nice (i.e., streaming format, multiple frames
>> per cluster possible).
> 
> We don't mention anything about zlib compressing details in the spec.

Yep.  Which is bad enough.

> If we mention anything about ZSTD compressing details we'll have to do
> it for
> zlib as well.

I personally don’t like “If you can’t make it perfect, you shouldn’t do
it at all”.  Mentioning it for zstd would still be an improvement.

Also, I believe that “zlib compression” is pretty much unambiguous,
considering all the code does is call deflate()/inflate().

I’m not saying we need to amend the spec in this series, but I don’t see
a good argument against doing so at some point.

> So, I think we have two possibilities for the spec:
> 1. mention for both
> 2. don't mention at all
> 
> I think the 2nd is better. It gives more degree of freedom for the
> future improvements.

No, it absolutely doesn’t.  There is a de-facto format, namely what qemu
accepts.  Changing that would be an incompatible change.  Just because
we don’t write what’s allowed into the spec doesn’t change that fact.

> and we've already covered both cases (one frame, may frames) in the code.
There are still different zstd formats, namely streaming or not
streaming.  That’s what should be mentioned.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2020-04-30 11:48 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-28 20:00 [PATCH v22 0/4] implement zstd cluster compression method Denis Plotnikov
2020-04-28 20:00 ` [PATCH v22 1/4] qcow2: introduce compression type feature Denis Plotnikov
2020-04-28 20:00 ` [PATCH v22 2/4] qcow2: rework the cluster compression routine Denis Plotnikov
2020-04-28 20:00 ` [PATCH v22 3/4] qcow2: add zstd cluster compression Denis Plotnikov
2020-04-28 21:05   ` Eric Blake
2020-04-29 10:24   ` Max Reitz
2020-04-29 10:37     ` Vladimir Sementsov-Ogievskiy
2020-04-29 12:17       ` Max Reitz
2020-04-29 13:02         ` Vladimir Sementsov-Ogievskiy
2020-04-29 13:49           ` Eric Blake
2020-04-30  8:26           ` Max Reitz
2020-04-30  9:48             ` Denis Plotnikov
2020-04-30 11:47               ` Max Reitz [this message]
2020-04-30 13:56                 ` Denis Plotnikov
2020-05-04  7:53                   ` Max Reitz
2020-04-29 10:38     ` Denis Plotnikov
2020-04-29 12:24       ` Max Reitz
2020-04-28 20:00 ` [PATCH v22 4/4] iotests: 287: add qcow2 compression type test Denis Plotnikov
2020-04-28 21:08   ` Eric Blake
2020-04-29 10:26   ` Max Reitz
2020-04-29 10:40     ` Denis Plotnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=22b2918b-3bdd-6627-a378-baddb342dacc@redhat.com \
    --to=mreitz@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berto@igalia.com \
    --cc=den@openvz.org \
    --cc=dplotnikov@virtuozzo.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).