qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Max Reitz <mreitz@redhat.com>, qemu-devel@nongnu.org
Cc: david.edmondson@oracle.com, qemu-block@nongnu.org
Subject: Re: [PATCH 00/17] Improve qcow2 all-zero detection
Date: Wed, 5 Feb 2020 13:21:40 -0600	[thread overview]
Message-ID: <59ba95c5-4c8b-a059-2332-3bafdc90dd2e@redhat.com> (raw)
In-Reply-To: <5141ea4b-a7c2-e9a3-045e-91dc088785c7@redhat.com>

On 2/5/20 11:04 AM, Max Reitz wrote:
> OK, I expected users to come in a separate patch.

I can refactor that better in v2.

> 
>> That's the use case: when copying into a destination file, it's useful
>> to know if the destination already reads as all zeroes, before
>> attempting a fallback to bdrv_make_zero(BDRV_REQ_NO_FALLBACK) or calls
>> to block status checking for holes.
> 
> But that was my point on IRC.  Is it really more useful if
> bdrv_make_zero() is just as quick?  (And the fact that NBD doesn’t have
> an implementation looks more like a problem with NBD to me.)

That is indeed a thought - why should qemu-img even TRY to call 
bdrv_has_init_zero; it should instead call bdrv_make_zero(), which 
should be as fast as possible (see my latest reply on 9/17 exploring 
that idea more).  Under the hood, we can then make bdrv_make_zero() use 
whatever tricks it needs, whether keeping the driver's 
.bdrv_has_zero_init/_truncate callbacks, adding another callback, making 
write_zeroes faster, or whatever, but instead of making qemu-img sort 
through multiple ideas, the burden would now be hidden in the block layer.

> 
> (Considering that at least the code we discussed on IRC didn’t work for
> preallocated images, which was the one point where we actually have a
> problem in practice.)

And this series DOES improve the case for preallocated qcow2 images, by 
virtue of a new qcow2 autoclear bit.  But again, that may be something 
we want to hide in the driver callback interfaces, while qemu-img just 
blindly calls bdrv_make_zero() and gets success (the image now reads as 
zeroes, either because it was already that way or we did something 
quick) or failure (it is a waste of time to prezero, whether by 
write_zeroes or by trim or by truncate, so just manually write zeroes as 
part of your image copying).

> 
>>> (We have a use case with convert -n to freshly created image files, but
>>> my position on this on IRC was that we want the --target-is-zero flag
>>> for that anyway: Auto-detection may always break, our preferred default
>>> behavior may always change, so if you want convert -n not to touch the
>>> target image except to write non-zero data from the source, we need a
>>> --target-is-zero flag and users need to use it.  Well, management
>>> layers, because I don’t think users would use convert -n anyway.
>>>
>>> And with --target-is-zero and users effectively having to use it, I
>>> don’t think that’s a good example of a use case.)
>>
>> Yes, there will still be cases where you have to use --target-is-zero
>> because the image itself couldn't report that it already reads as
>> zeroes, but there are also enough cases where the destination is already
>> known to read zeroes and it's a shame to tell the user that 'you have to
>> add --target-is-zero to get faster copying even though we could have
>> inferred it on your behalf'.
> 
> How is it a shame?  I think only management tools would use convert -n.
>   Management tools want reliable behavior.  If you want reliable
> behavior, you have to use --target-is-zero anyway.  So I don’t see the
> actual benefit for qemu-img convert.

qemu-img convert to an NBD destination cannot create the destination, so 
it ALWAYS has to use -n.  I don't know how often users are likely to 
wire up a command line for qemu-img convert with NBD as the destination, 
or whether you are correct that it will be a management app able to 
supply -n (and thus able to supply --target-is-zero).  But at the same 
time, can a management app learn whether it is safe to supply 
--target-is-zero?  With my upcoming NBD patches, 'qemu-img --list' will 
show whether the NBD target is known to start life all zero, and a 
management app could use mechanism to decide when to pass 
--target-is-zero (whether a management app would actually fork qemu-img 
--list, or just be an NBD client itself to do the same thing qemu-img 
would do, is beside the point).

Similarly, this series includes enhancements to 'qemu-img info' on qcow2 
images known to currently read as zero; again, that sort of information 
is necessary somewhere in the chain, whether it be because qemu-img 
consumes the information itself, or because the management app consumes 
the information in order to pass the --target-is-zero option to 
qemu-img, either way, the information needs to be available for consumption.

> 
>>> I suppose there is the point of blockdev-create + blockdev-mirror: This
>>> has exactly the same problem as convert -n.  But again, if you really
>>> want blockdev-mirror not just to force-zero the image, you probably need
>>> to tell it so explicitly (i.e., with a --target-is-zero flag for
>>> blockdev-mirror).
>>>
>>> (Well, I suppose we could save us a target-is-zero for mirror if we took
>>> this series and had a filter driver that force-reports BDRV_ZERO_OPEN.
>>> But, well, please no.)
>>>
>>> But maybe I’m just an idiot and there is no reason not to take this
>>> series and make blockdev-create + blockdev-mirror do the sensible thing
>>> by default in most cases. *shrug*
>>
>> My argument for taking the series _is_ that the common case can be made
>> more efficient without user effort.
> 
> The thing is, I don’t see the user effort.  I don’t think users use
> convert -n or backup manually.  And for management tools, it isn’t
> really effort to add another switch.

Maybe, but it is just shifting the burden between who consumes the 
information that an image is all zero, and how the consumption of that 
information is passed to qemu-img.

> 
>> Yes, we still need the knob for
>> when the common case isn't already smart enough,
> 
> But the user can’t know when qemu isn’t smart enough.  So users who care
> have to always give the flag.
> 
>> but the difference in
>> avoiding a pre-zeroing pass is noticeable when copying images around
> 
> I’m sure it is, but the question I ask is whether in practice we
> wouldn’t get --target-is-zero in all of these cases anyway.
> 
> 
> So I’m not sold on “it works most of the time”, because if it’s just
> most of the time, then we’ll likely see --target-is-zero all of the time.
> 
> OTOH, I suppose that with the new qcow2 extension, it would always work
> for the following case:
> (1) Create a qcow2 file,
> (2) Immediately (with the next qemu-img/QMP invocation) use it as a
> target of convert -n or mirror or anything similar.

Yes, that is one of the immediately obvious fallouts from this series - 
you can now create a preallocated qcow2 image in one process, and the 
next process using that image can readily tell that it is still 
just-created.

> 
> If so, that means it works reliably all of the time for a common case.
> I guess that’d be enough for me.
> 
> Max
> 
>> (and more than just for qcow2 - my followup series to improve NBD is
>> similarly useful given how much work has already been invested in
>> mapping NBD into storage access over https in the upper layers like ovirt).
>>
> 
> 

At any rate, I think you've convinced me to rethink how I present v2 
(maybe not by refactoring bdrv_known_zeroes usage, but instead 
refactoring bdrv_make_zero), but that the qcow2 autoclear bit is still a 
useful feature to have.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



  reply	other threads:[~2020-02-05 19:23 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-31 17:44 [PATCH 00/17] Improve qcow2 all-zero detection Eric Blake
2020-01-31 17:44 ` [PATCH 01/17] qcow2: Comment typo fixes Eric Blake
2020-02-04 14:12   ` Vladimir Sementsov-Ogievskiy
2020-02-09 19:34   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 02/17] qcow2: List autoclear bit names in header Eric Blake
2020-02-04 14:26   ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 03/17] qcow2: Avoid feature name extension on small cluster size Eric Blake
2020-02-04 14:39   ` Vladimir Sementsov-Ogievskiy
2020-02-09 19:28   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 04/17] block: Improve documentation of .bdrv_has_zero_init Eric Blake
2020-02-04 15:03   ` Vladimir Sementsov-Ogievskiy
2020-02-04 15:16     ` Eric Blake
2020-01-31 17:44 ` [PATCH 05/17] block: Don't advertise zero_init_truncate with encryption Eric Blake
2020-02-10 18:12   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 06/17] block: Improve bdrv_has_zero_init_truncate with backing file Eric Blake
2020-02-10 18:13   ` Alberto Garcia
2020-01-31 17:44 ` [PATCH 07/17] gluster: Drop useless has_zero_init callback Eric Blake
2020-02-04 15:06   ` Vladimir Sementsov-Ogievskiy
2020-02-10 18:21   ` Alberto Garcia
2020-02-17  8:06   ` [GEDI] " Niels de Vos
2020-02-17 12:03     ` Eric Blake
2020-02-17 12:22       ` Eric Blake
2020-02-17 14:01       ` Niels de Vos
2020-01-31 17:44 ` [PATCH 08/17] sheepdog: Consistently set bdrv_has_zero_init_truncate Eric Blake
2020-02-04 15:09   ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 09/17] block: Refactor bdrv_has_zero_init{,_truncate} Eric Blake
2020-02-04 15:35   ` Vladimir Sementsov-Ogievskiy
2020-02-04 15:49     ` Eric Blake
2020-02-04 16:07       ` Vladimir Sementsov-Ogievskiy
2020-02-04 17:42     ` Max Reitz
2020-02-04 17:51       ` Eric Blake
2020-02-05 16:43         ` Max Reitz
2020-02-05  7:51       ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:07         ` Eric Blake
2020-02-05 14:25           ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:36             ` Eric Blake
2020-02-05 17:55           ` Max Reitz
2020-02-04 17:53   ` Max Reitz
2020-02-04 19:03     ` Eric Blake
2020-02-05 17:22       ` Max Reitz
2020-02-05 18:39         ` Eric Blake
2020-02-06  9:18           ` Max Reitz
2020-01-31 17:44 ` [PATCH 10/17] block: Add new BDRV_ZERO_OPEN flag Eric Blake
2020-01-31 18:03   ` Eric Blake
2020-02-04 17:34   ` Max Reitz
2020-02-04 17:50     ` Eric Blake
2020-02-05  8:39       ` Vladimir Sementsov-Ogievskiy
2020-02-05 17:26       ` Max Reitz
2020-01-31 17:44 ` [PATCH 11/17] file-posix: Support BDRV_ZERO_OPEN Eric Blake
2020-01-31 17:44 ` [PATCH 12/17] gluster: " Eric Blake
2020-02-17  8:16   ` [GEDI] " Niels de Vos
2020-01-31 17:44 ` [PATCH 13/17] qcow2: Add new autoclear feature for all zero image Eric Blake
2020-02-03 17:45   ` Vladimir Sementsov-Ogievskiy
2020-02-04 13:12     ` Eric Blake
2020-02-04 13:29       ` Vladimir Sementsov-Ogievskiy
2020-01-31 17:44 ` [PATCH 14/17] qcow2: Expose all zero bit through .bdrv_known_zeroes Eric Blake
2020-01-31 17:44 ` [PATCH 15/17] qcow2: Implement all-zero autoclear bit Eric Blake
2020-01-31 17:44 ` [PATCH 16/17] iotests: Add new test for qcow2 all-zero bit Eric Blake
2020-01-31 17:44 ` [PATCH 17/17] qcow2: Let qemu-img check cover " Eric Blake
2020-02-04 17:32 ` [PATCH 00/17] Improve qcow2 all-zero detection Max Reitz
2020-02-04 18:53   ` Eric Blake
2020-02-05 17:04     ` Max Reitz
2020-02-05 19:21       ` Eric Blake [this message]
2020-02-06  9:12         ` Max Reitz
2020-02-05  9:04 ` Vladimir Sementsov-Ogievskiy
2020-02-05  9:25   ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:26     ` Eric Blake
2020-02-05 14:47       ` Vladimir Sementsov-Ogievskiy
2020-02-05 15:14         ` Vladimir Sementsov-Ogievskiy
2020-02-05 17:58           ` Max Reitz
2020-02-05 14:22   ` Eric Blake
2020-02-05 14:43     ` Vladimir Sementsov-Ogievskiy
2020-02-05 14:58       ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59ba95c5-4c8b-a059-2332-3bafdc90dd2e@redhat.com \
    --to=eblake@redhat.com \
    --cc=david.edmondson@oracle.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).