From: Stefan Hajnoczi <stefanha@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, sunnyzhyy@qq.com,
vsementsov@yandex-team.ru, John Snow <jsnow@redhat.com>,
Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>
Subject: Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
Date: Mon, 12 May 2025 14:43:56 -0400 [thread overview]
Message-ID: <20250512184356.GG141177@fedora> (raw)
In-Reply-To: <20250509204341.3553601-26-eblake@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 4186 bytes --]
On Fri, May 09, 2025 at 03:40:28PM -0500, Eric Blake wrote:
> When mirroring, the goal is to ensure that the destination reads the
> same as the source; this goal is met whether the destination is sparse
> or fully-allocated (except when explicitly punching holes, then merely
> reading zero is not enough to know if it is sparse, so we still want
> to punch the hole). Avoiding a redundant write to zero (whether in
> the background because the zero cluster was marked in the dirty
> bitmap, or in the foreground because the guest is writing zeroes) when
> the destination already reads as zero makes mirroring faster, and
> avoids allocating the destination merely because the source reports as
> allocated.
>
> The effect is especially pronounced when the source is a raw file.
> That's because when the source is a qcow2 file, the dirty bitmap only
> visits the portions of the source that are allocated, which tend to be
> non-zero. But when the source is a raw file,
> bdrv_co_is_allocated_above() reports the entire file as allocated so
> mirror_dirty_init sets the entire dirty bitmap, and it is only later
> during mirror_iteration that we change to consulting the more precise
> bdrv_co_block_status_above() to learn where the source reads as zero.
>
> Remember that since a mirror operation can write a cluster more than
> once (every time the guest changes the source, the destination is also
> changed to keep up), and the guest can change whether a given cluster
> reads as zero, is discarded, or has non-zero data over the course of
> the mirror operation, we can't take the shortcut of relying on
> s->target_is_zero (which is static for the life of the job) in
> mirror_co_zero() to see if the destination is already zero, because
> that information may be stale. Any solution we use must be dynamic in
> the face of the guest writing or discarding a cluster while the mirror
> has been ongoing.
>
> We could just teach mirror_co_zero() to do a block_status() probe of
> the destination, and skip the zeroes if the destination already reads
> as zero, but we know from past experience that extra block_status()
> calls are not always cheap (tmpfs, anyone?), especially when they are
> random access rather than linear. Use of block_status() of the source
> by the background task in a linear fashion is not our bottleneck (it's
> a background task, after all); but since mirroring can be done while
> the source is actively being changed, we don't want a slow
> block_status() of the destination to occur on the hot path of the
> guest trying to do random-access writes to the source.
>
> So this patch takes a slightly different approach: any time we have to
> track dirty clusters, we can also track which clusters are known to
> read as zero. For sync=TOP or when we are punching holes from
> "detect-zeroes":"unmap", the zero bitmap starts out empty, but
> prevents a second write zero to a cluster that was already zero by an
> earlier pass; for sync=FULL when we are not punching holes, the zero
> bitmap starts out full if the destination reads as zero during
> initialization. Either way, I/O to the destination can now avoid
> redundant write zero to a cluster that already reads as zero, all
> without having to do a block_status() per write on the destination.
>
> With this patch, if I create a raw sparse destination file, connect it
> with QMP 'blockdev-add' while leaving it at the default "discard":
> "ignore", then run QMP 'blockdev-mirror' with "sync": "full", the
> destination remains sparse rather than fully allocated. Meanwhile, a
> destination image that is already fully allocated remains so unless it
> was opened with "detect-zeroes": "unmap". And any time writing zeroes
> is skipped, the job counters are not incremented.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: also skip counters when skipping I/O [Sunny]
> v4: rebase to earlier changes
> ---
> block/mirror.c | 107 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 93 insertions(+), 14 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2025-05-12 18:44 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
2025-05-09 20:40 ` [PATCH v4 01/13] block: Expand block status mode from bool to flags Eric Blake
2025-05-09 20:40 ` [PATCH v4 02/13] file-posix, gluster: Handle zero block status hint better Eric Blake
2025-05-09 20:40 ` [PATCH v4 03/13] block: Let bdrv_co_is_zero_fast consolidate adjacent extents Eric Blake
2025-05-09 20:40 ` [PATCH v4 04/13] block: Add new bdrv_co_is_all_zeroes() function Eric Blake
2025-05-09 20:40 ` [PATCH v4 05/13] iotests: Improve iotest 194 to mirror data Eric Blake
2025-05-09 20:40 ` [PATCH v4 06/13] mirror: Minor refactoring Eric Blake
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
2025-05-02 23:17 ` Sunny Zhu
2025-05-12 15:19 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
2025-05-02 23:57 ` Sunny Zhu
2025-05-12 17:30 ` Eric Blake
2025-05-03 0:40 ` Sunny Zhu
2025-05-12 15:23 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
2025-05-03 0:47 ` Sunny Zhu
2025-05-12 15:24 ` Stefan Hajnoczi
2025-05-14 22:09 ` Eric Blake
2025-05-15 1:11 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
2025-05-03 0:51 ` Sunny Zhu
2025-05-12 18:40 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
2025-05-12 18:43 ` Stefan Hajnoczi [this message]
2025-05-13 21:37 ` Eric Blake
2025-05-14 15:37 ` Sunny Zhu
2025-05-14 16:30 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 12/13] iotests/common.rc: add disk_usage function Eric Blake
2025-05-09 20:40 ` [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches Eric Blake
2025-05-12 18:44 ` Stefan Hajnoczi
2025-05-12 18:44 ` [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Stefan Hajnoczi
2025-05-13 22:00 ` [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap Eric Blake
2025-05-14 13:20 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250512184356.GG141177@fedora \
--to=stefanha@redhat.com \
--cc=eblake@redhat.com \
--cc=hreitz@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=sunnyzhyy@qq.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).