All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
	vsementsov@yandex-team.ru, John Snow <jsnow@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>
Subject: Re: [PATCH v2 08/11] mirror: Skip writing zeroes when target is already zero
Date: Thu, 17 Apr 2025 16:54:14 -0400	[thread overview]
Message-ID: <20250417205414.GG85491@fedora> (raw)
In-Reply-To: <20250417184133.105746-21-eblake@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3383 bytes --]

On Thu, Apr 17, 2025 at 01:39:13PM -0500, Eric Blake wrote:
> When mirroring, the goal is to ensure that the destination reads the
> same as the source; this goal is met whether the destination is sparse
> or fully-allocated.  However, if the destination cannot efficiently
> write zeroes, then any time the mirror operation wants to copy zeroes
> from the source to the destination (either during the background over
> sparse regions when doing a full mirror, or in the foreground when the
> guest actively writes zeroes), we were causing the destination to
> fully allocate that portion of the disk, even if it already read as
> zeroes.
> 
> The effect is especially pronounced when the source is a raw file.
> That's because when the source is a qcow2 file, the dirty bitmap only
> visits the portions of the source that are allocated, which tend to be
> non-zero.  But when the source is a raw file,
> bdrv_co_is_allocated_above() reports the entire file as allocated so
> mirror_dirty_init sets the entire dirty bitmap, and it is only later
> during mirror_iteration that we change to consulting the more precise
> bdrv_co_block_status_above() to learn where the source reads as zero.
> 
> Remember that since a mirror operation can write a cluster more than
> once (every time the guest changes the source, the destination is also
> changed to keep up), we can't take the shortcut of relying on
> s->zero_target (which is static for the life of the job) in
> mirror_co_zero() to see if the destination is already zero, because
> that information may be stale.  Any solution we use must be dynamic in
> the face of the guest writing or discarding a cluster while the mirror
> has been ongoing.
> 
> We could just teach mirror_co_zero() to do a block_status() probe of
> the destination, and skip the zeroes if the destination already reads
> as zero, but we know from past experience that extra block_status()
> calls are not always cheap (tmpfs, anyone?), especially when they are
> random access rather than linear.  Use of block_status() of the source
> by the background task in a linear fashion is not our bottleneck (it's
> a background task, after all); but since mirroring can be done while
> the source is actively being changed, we don't want a slow
> block_status() of the destination to occur on the hot path of the
> guest trying to do random-access writes to the source.
> 
> So this patch takes a slightly different approach: any time we have to
> transfer the full image, we know that mirror_dirty_init() is _already_
> doing a pre-zero pass over the entire destination.  Therefore, if we
> track which clusters of the destination are zero at any given moment,
> we don't have to do a block_status() call on the destination, but can
> instead just refer to the zero bitmap associated with the job.
> 
> With this patch, if I create a raw sparse destination file, connect it
> with QMP 'blockdev-add' while leaving it at the default "discard":
> "ignore", then run QMP 'blockdev-mirror' with "sync": "full", the
> destination remains sparse rather than fully allocated.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/mirror.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 65 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2025-04-17 20:55 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-17 18:39 [PATCH v2 00/11] Make blockdev-mirror dest sparse in more cases Eric Blake
2025-04-17 18:39 ` [PATCH v2 01/11] block: Expand block status mode from bool to enum Eric Blake
2025-04-17 20:17   ` Stefan Hajnoczi
2025-04-18 19:02     ` Eric Blake
2025-04-18 21:55       ` Eric Blake
2025-04-17 18:39 ` [PATCH v2 02/11] file-posix: Handle zero block status hint better Eric Blake
2025-04-17 20:58   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 03/11] block: Let bdrv_co_is_zero_fast consolidate adjacent extents Eric Blake
2025-04-17 20:21   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 04/11] block: Add new bdrv_co_is_all_zeroes() function Eric Blake
2025-04-17 20:35   ` Stefan Hajnoczi
2025-04-18 19:07     ` Eric Blake
2025-04-17 18:39 ` [PATCH v2 05/11] iotests: Improve iotest 194 to mirror data Eric Blake
2025-04-17 20:39   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 06/11] mirror: Minor refactoring Eric Blake
2025-04-17 20:42   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 07/11] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
2025-04-17 20:46   ` Stefan Hajnoczi
2025-04-24 17:10   ` Eric Blake
2025-04-17 18:39 ` [PATCH v2 08/11] mirror: Skip writing zeroes when target " Eric Blake
2025-04-17 20:54   ` Stefan Hajnoczi [this message]
2025-04-23 16:42   ` Sunny Zhu
2025-04-23 19:12     ` Eric Blake
2025-04-17 18:39 ` [PATCH v2 09/11] iotests/common.rc: add disk_usage function Eric Blake
2025-04-17 20:54   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 10/11] tests: Add iotest mirror-sparse for recent patches Eric Blake
2025-04-17 20:55   ` Stefan Hajnoczi
2025-04-17 18:39 ` [PATCH v2 11/11] mirror: Allow QMP override to declare target already zero Eric Blake
2025-04-17 20:57   ` Stefan Hajnoczi
2025-04-18  4:47   ` Markus Armbruster
2025-04-17 20:59 ` [PATCH v2 00/11] Make blockdev-mirror dest sparse in more cases Stefan Hajnoczi
2025-04-18 21:52 ` [PATCH v2.5 01/11] block: Expand block status mode from bool to flags Eric Blake
2025-04-18 21:52   ` [PATCH v2.5 02/11] file-posix, gluster: Handle zero block status hint better Eric Blake
2025-04-22 14:43     ` Stefan Hajnoczi
2025-04-22 14:43   ` [PATCH v2.5 01/11] block: Expand block status mode from bool to flags Stefan Hajnoczi
2025-04-24 18:08   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250417205414.GG85491@fedora \
    --to=stefanha@redhat.com \
    --cc=eblake@redhat.com \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.