Re: [Qemu-devel] [PATCH 1/4] qemu-img: Fix preallocation with -S 0 for convert

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/4] qemu-img: Fix preallocation with -S 0 for convert
Date: Mon, 22 Feb 2016 18:24:14 +0100	[thread overview]
Message-ID: <56CB443E.8060301@redhat.com> (raw)
In-Reply-To: <56CB3AB2.4090809@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 5680 bytes --]

On 22.02.2016 17:43, Max Reitz wrote:
> On 22.02.2016 13:50, Kevin Wolf wrote:
>> Am 20.02.2016 um 18:39 hat Max Reitz geschrieben:
>>> When passing -S 0 to qemu-img convert, the target image is supposed to
>>> be fully allocated. Right now, this is not the case if the source image
>>> contains areas which bdrv_get_block_status() reports as being zero.
>>>
>>> This patch introduces a new ImgConvertBlockStatus named BLK_ZERO_DATA
>>> which is set by convert_iteration_sectors() if an area is detected to be
>>> zero and min_sparse is 0. Simply using BLK_DATA is not optimal, because
>>> knowing an area to be zero allows us to memset() the buffer to be
>>> written directly rather than having to use blk_read().
>>>
>>> Other than that, BLK_ZERO_DATA is handled exactly like BLK_DATA.
>>>
>>> This patch changes the reference output for iotest 122; contrary to what
>>> it assumed, -S 0 really should allocate everything in the output, not
>>> just areas that are filled with zeros (as opposed to being zeroed).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>
>> I don't like how you touch the conversion code all over the place.
>> Specifically, convert_iteration_sectors() and convert_read() (and
>> consequently s->status) are supposed to be only about the source file.
>> -S 0 doesn't make a difference for what the source file looks like, so
>> we shouldn't need to change anything there.
>>
>> The following change should do the same thing (it passes your test case
>> anyway) and is more contained to the actual writeout of image data.
> 
> Well, I briefly considered making @buf non-const in convert_write(), but
> I discarded that idea, and I'm still not comfortable with that. If you
> argue that convert_read() should only deal with the source image, I'll
> argue that convert_write() should only deal with the target image. I
> know that you're making @buf non-const because we need some scratch
> buffer and, well, @buf is available, so why not use that? But it still
> doesn't sit right with me.
> 
> So I'd like to pull that memset() out of convert_write() and just tell
> convert_write() to write that buffer as data. In fact, that is basically
> what my patch does. But why does it then not just use BLK_DATA but this
> new status?
> 
> Because of two reasons: First, another issue with your patch is that
> zeroed areas are not counted in the progress update. If we are writing
> them as zeros, we should count them, however. Therefore, we need some
> special-casing in convert_do_copy(). Effectively, we end up with stuff like:
> 
>> if (s->status == BLK_DATA ||
>>     (!s->min_sparse && s->status == BLK_ZERO))
> 
> I found that combination of (min_sparse && BLK_ZERO) to be ugly, and
> liked it much better if we could do that test a single time in
> convert_read and be done with it. This is why I added the new status.*
> 
> And if you pull the memset() out of convert_write() and add a new
> status, what you end up with is basically my patch.
> 
> *Note that I initially thought we'd have this test in many more places
> than we actually would, as it turned out. I'll take a look at how much
> simpler this patch becomes if I drop the BLK_ZERO_DATA status.

Said patch would look like this:

diff --git a/qemu-img.c b/qemu-img.c
index 2edb139..b696ba4 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1509,10 +1509,6 @@ static int convert_read(ImgConvertState *s,
int64_t sector_num, int nb_sectors,
     int n;
     int ret;

-    if (s->status == BLK_ZERO || s->status == BLK_BACKING_FILE) {
-        return 0;
-    }
-
     assert(nb_sectors <= s->buf_sectors);
     while (nb_sectors > 0) {
         BlockBackend *blk;
@@ -1650,7 +1646,8 @@ static int convert_do_copy(ImgConvertState *s)
             ret = n;
             goto fail;
         }
-        if (s->status == BLK_DATA) {
+        if (s->status == BLK_DATA || (!s->min_sparse && s->status ==
BLK_ZERO))
+        {
             s->allocated_sectors += n;
         }
         sector_num += n;
@@ -1670,17 +1667,24 @@ static int convert_do_copy(ImgConvertState *s)
             ret = n;
             goto fail;
         }
-        if (s->status == BLK_DATA) {
+        if (s->status == BLK_DATA || (!s->min_sparse && s->status ==
BLK_ZERO))
+        {
             allocated_done += n;
             qemu_progress_print(100.0 * allocated_done /
s->allocated_sectors,
                                 0);
         }

-        ret = convert_read(s, sector_num, n, buf);
-        if (ret < 0) {
-            error_report("error while reading sector %" PRId64
-                         ": %s", sector_num, strerror(-ret));
-            goto fail;
+        if (s->status == BLK_DATA) {
+            ret = convert_read(s, sector_num, n, buf);
+            if (ret < 0) {
+                error_report("error while reading sector %" PRId64
+                             ": %s", sector_num, strerror(-ret));
+                goto fail;
+            }
+        } else if (!s->min_sparse && s->status == BLK_ZERO) {
+            n = MIN(n, s->buf_sectors);
+            memset(buf, 0, n * BDRV_SECTOR_SIZE);
+            s->status = BLK_DATA;
         }

         ret = convert_write(s, sector_num, n, buf);



I'd get the diffcount even smaller by keeping convert_read() unchanged
and continuing to call it unconditionally, but I like that change in
itself because I find it makes the logic clearer. I can be persuaded to
split this patch into two, however (one pulling the condition out of
convert_read(), and another one doing the rest of this patch).

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

next prev parent reply	other threads:[~2016-02-22 17:24 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-20 17:39 [Qemu-devel] [PATCH 0/4] qemu-img: Fix preallocation with -S 0 for convert Max Reitz
2016-02-20 17:39 ` [Qemu-devel] [PATCH 1/4] " Max Reitz
2016-02-22 12:50   ` Kevin Wolf
2016-02-22 16:43     ` Max Reitz
2016-02-22 17:24       ` Max Reitz [this message]
2016-02-20 17:39 ` [Qemu-devel] [PATCH 2/4] block/null-{co, aio}: Return zeros when read Max Reitz
2016-02-22 16:31   ` Eric Blake
2016-02-22 16:45     ` Max Reitz
2016-02-20 17:39 ` [Qemu-devel] [PATCH 3/4] block/null-{co, aio}: Implement get_block_status() Max Reitz
2016-02-20 17:39 ` [Qemu-devel] [PATCH 4/4] iotests: Test qemu-img convert -S 0 behavior Max Reitz

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:2edb139 dfblob:b696ba4 )
 OR (
bs:"Re: [Qemu-devel] [PATCH 1/4] qemu-img: Fix preallocation with -S 0 for convert" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56CB443E.8060301@redhat.com \
    --to=mreitz@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.