From: Stefan Hajnoczi <stefanha@gmail.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 0/3] block: zero write detection
Date: Wed, 12 Oct 2011 12:59:52 +0100
Message-ID: <CAJSP0QVoLEnL6gPjNGWKApA05KUqCyaEBLN1v6NBLFpyX6nekA@mail.gmail.com>
In-Reply-To: <4E957408.3010204@redhat.com>

On Wed, Oct 12, 2011 at 12:03 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 12.10.2011 12:39, Stefan Hajnoczi wrote:
>> On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
>>> On 07.10.2011 17:49, Stefan Hajnoczi wrote:
>>>> Image streaming copies data from the backing file into the image file.  It is
>>>> important to represent zero regions from the backing file efficiently during
>>>> streaming, otherwise the image file grows to the full virtual disk size and
>>>> loses sparseness.
>>>>
>>>> There are two ways to implement zero write detection, and they are subtly different:
>>>>
>>>> 1. Allow image formats to provide efficient representations for zero regions.
>>>>    QED does this with "zero clusters" and it has been discussed for qcow2v3.
>>>>
>>>> 2. During streaming, check for zeroes and skip writing to the image file when
>>>>    zeroes are detected.
>>>>
>>>> However, there are some disadvantages to #2 because it leaves unallocated holes
>>>> in the image file.  If image streaming is aborted before it completes then it
>>>> will be necessary to reread all unallocated clusters from the backing file upon
>>>> resuming image streaming.  Potentially worse is that a backing file over a
>>>> slow remote connection will have the zero regions fetched again and again if
>>>> the guest accesses them.  #1 avoids these problems because the image file
>>>> contains information on which regions are zeroes and do not need to be
>>>> refetched.
>>>>
>>>> This patch series implements #1 with the existing QED zero cluster feature.  In
>>>> the future we can add qcow2v3 zero clusters too.  We can also implement #2
>>>> directly in the image streaming code as a fallback when the BlockDriver does
>>>> not support zero detection #1 itself.  That way we get the best possible zero
>>>> write detection, depending on the image format.
>>>>
>>>> Here is a qemu-iotest to verify that zero write detection is working:
>>>> http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37
>>>>
>>>> Stefan Hajnoczi (3):
>>>>   block: add zero write detection interface
>>>>   qed: add zero write detection support
>>>>   qemu-io: add zero write detection option
>>>>
>>>>  block.c     |   16 +++++++++++
>>>>  block.h     |    2 +
>>>>  block/qed.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++++------
>>>>  block_int.h |   13 +++++++++
>>>>  qemu-io.c   |   35 ++++++++++++++++++++-----
>>>>  5 files changed, 132 insertions(+), 15 deletions(-)
>>>
>>> It's good to have an option to detect zero writes and turn them into
>>> zero clusters, but it's something that introduces some overhead and
>>> probably won't be suitable as a default.
>>
>> Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
>> at runtime.  By default it is off to save CPU cycles.
>>
>>> I think what we really want to have for image streaming is an API that
>>> explicitly writes zeros and doesn't have to look at the whole buffer (or
>>> actually doesn't even get a buffer).
>>
>> I avoided this approach so that block drivers would not have to handle
>> the zero buffers that need to be allocated when the region does not
>> cover entire clusters.  It can certainly be done, but I'm not sure how
>> to do it nicely yet.
>
> If I understand your QED code right, in such cases it ignores that there
> are some zeros that could be turned into a zero cluster. Considering
> this and that you always fill a buffer just to be able to check it
> (which is known to take considerable time from qemu-img convert
> experience) - how could any solution that works consistently, but
> requires an allocation in the block driver be less nice?

The fallback is easy when you already have a buffer - just do the write :).

My point is that this patch is the simplest approach.  Other
approaches can optimize better; the question is whether they are
worth doing.
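
To make the fallback concrete, here is a minimal sketch of the idea, not
the code from this series.  Everything in it (struct toy_image,
toy_is_zero(), toy_write(), the 4 KB cluster size) is a hypothetical
in-memory model, and it has no backing file, so unallocated clusters are
simply zero-filled before a partial write.  It only shows the control
flow: an aligned all-zero buffer is recorded as zero clusters, and
anything else falls back to an ordinary write of the buffer that is
already in hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLUSTER_SIZE 4096u  /* assumption: toy value, not a real format's size */
#define NUM_CLUSTERS 8u

enum cluster_state { CLUSTER_UNALLOCATED, CLUSTER_ZERO, CLUSTER_DATA };

/* Hypothetical in-memory image: per-cluster state plus payload. */
struct toy_image {
    enum cluster_state state[NUM_CLUSTERS];
    uint8_t data[NUM_CLUSTERS][CLUSTER_SIZE];
    bool detect_zeroes;     /* runtime toggle, off by default to save CPU */
};

/* Return true if every byte in buf is zero (scans the whole buffer). */
static bool toy_is_zero(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Write len bytes at byte offset.  Returns 0 on success, -1 if out of range. */
static int toy_write(struct toy_image *img, size_t offset,
                     const uint8_t *buf, size_t len)
{
    const size_t total = (size_t)NUM_CLUSTERS * CLUSTER_SIZE;

    assert(img != NULL && buf != NULL);
    if (offset > total || len > total - offset) {
        return -1;
    }

    /* Zero detection only applies when the request covers whole clusters. */
    bool aligned = offset % CLUSTER_SIZE == 0 && len % CLUSTER_SIZE == 0;
    if (img->detect_zeroes && aligned && toy_is_zero(buf, len)) {
        for (size_t c = offset / CLUSTER_SIZE;
             c < (offset + len) / CLUSTER_SIZE; c++) {
            img->state[c] = CLUSTER_ZERO;   /* metadata only, no data written */
        }
        return 0;
    }

    /* Fallback: the buffer already exists, so just do the write. */
    while (len > 0) {
        size_t c = offset / CLUSTER_SIZE;
        size_t in_cluster = offset % CLUSTER_SIZE;
        size_t n = CLUSTER_SIZE - in_cluster;
        if (n > len) {
            n = len;
        }
        if (img->state[c] != CLUSTER_DATA) {
            /* Toy model: no backing file, so start from a zeroed cluster. */
            memset(img->data[c], 0, CLUSTER_SIZE);
            img->state[c] = CLUSTER_DATA;
        }
        memcpy(&img->data[c][in_cluster], buf, n);
        offset += n;
        buf += n;
        len -= n;
    }
    return 0;
}
```

In this model a two-cluster zero write with detect_zeroes set marks both
clusters CLUSTER_ZERO, while a 512-byte zero write in the middle of a
cluster takes the fallback path and allocates the cluster as data.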

Stefan
