* Re: [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
@ 2025-05-02 23:17 ` Sunny Zhu
2025-05-12 15:19 ` Stefan Hajnoczi
1 sibling, 0 replies; 34+ messages in thread
From: Sunny Zhu @ 2025-05-02 23:17 UTC (permalink / raw)
To: eblake
Cc: hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha, sunnyzhyy,
vsementsov
On Fri, 9 May 2025 15:40:24 -0500, Eric Blake wrote:
> Out of the five possible values for MirrorSyncMode, INCREMENTAL and
> BITMAP are already rejected up front in mirror_start, leaving NONE,
> TOP, and FULL as the remaining values that the code was collapsing
> into a single bool is_none_mode. Furthermore, mirror_dirty_init() is
> only reachable for modes TOP and FULL, as further guided by
> s->zero_target. However, upcoming patches want to further optimize
> the pre-zeroing pass of a sync=full mirror in mirror_dirty_init(),
> while avoiding that pass on a sync=top action. Instead of throwing
> away context by collapsing these two values into
> s->is_none_mode=false, it is better to pass s->sync_mode throughout
> the entire operation. For active commit, the desired semantics match
> sync mode TOP.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v4: new patch
> ---
> block/mirror.c | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
@ 2025-05-02 23:57 ` Sunny Zhu
2025-05-12 17:30 ` Eric Blake
2025-05-03 0:40 ` Sunny Zhu
2025-05-12 15:23 ` Stefan Hajnoczi
2 siblings, 1 reply; 34+ messages in thread
From: Sunny Zhu @ 2025-05-02 23:57 UTC (permalink / raw)
To: eblake
Cc: armbru, hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha,
sunnyzhyy, vsementsov
On Fri, May 09, 2025 at 03:40:25PM -0500, Eric Blake wrote:
> QEMU has an optimization for a just-created drive-mirror destination
> that is not possible for blockdev-mirror (which can't create the
> destination) - any time we know the destination starts life as all
> zeroes, we can skip a pre-zeroing pass on the destination. Recent
> patches have added an improved heuristic for detecting if a file
> contains all zeroes, and we plan to use that heuristic in upcoming
> patches. But since a heuristic cannot quickly detect all scenarios,
> and there may be cases where the caller is aware of information that
> QEMU cannot learn quickly, it makes sense to have a way to tell QEMU
> to assume facts about the destination that can make the mirror
> operation faster. Given our existing example of "qemu-img convert
> --target-is-zero", it is time to expose this override in QMP for
> blockdev-mirror as well.
>
> This patch results in some slight redundancy between the older
> s->zero_target (set any time mode==FULL and the destination image was
> not just created - ie. clear if drive-mirror is asking to skip the
> pre-zero pass) and the newly-introduced s->target_is_zero (in addition
> to the QMP override, it is set when drive-mirror creates the
> destination image); this will be cleaned up in the next patch.
>
> There is also a subtlety that we must consider. When drive-mirror is
> passing target_is_zero on behalf of a just-created image, we know the
> image is sparse (skipping the pre-zeroing keeps it that way), so it
> doesn't matter whether the destination also has "discard":"unmap" and
> "detect-zeroes":"unmap". But now that we are letting the user set the
> knob for target-is-zero, if the user passes a pre-existing file that
> is fully allocated, it is fine to leave the file fully allocated under
> "detect-zeroes":"on", but if the file is open with
> "detect-zeroes":"unmap", we should really be trying harder to punch
> holes in the destination for every region of zeroes copied from the
> source. The easiest way to do this is to still run the pre-zeroing
> pass (turning the entire destination file sparse before populating
> just the allocated portions of the source), even though that currently
> results in double I/O to the portions of the file that are allocated.
> A later patch will add further optimizations to reduce redundant
> zeroing I/O during the mirror operation.
>
> Since "target-is-zero":true is designed for optimizations, it is okay
> to silently ignore the parameter rather than erroring if the user ever
> sets the parameter in a scenario where the mirror job can't exploit it
> (for example, when doing "sync":"top" instead of "sync":"full", we
> can't pre-zero, so setting the parameter won't make a speed
> difference).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Acked-by: Markus Armbruster <armbru@redhat.com>
>
> ---
>
> v4: hoist earlier in series. QMP design is unchanged, but logic in
> mirror_dirty_init is different (in many aspects simpler, but now
> catering to "detect-zeroes":"unmap") so Acked-by on QMP kept, but
> Reviewed-by dropped.
> ---
> qapi/block-core.json | 8 +++++++-
> include/block/block_int-global-state.h | 3 ++-
> block/mirror.c | 27 ++++++++++++++++++++++----
> blockdev.c | 18 ++++++++++-------
> tests/unit/test-block-iothread.c | 2 +-
> 5 files changed, 44 insertions(+), 14 deletions(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index b1937780e19..7f70ec6d3cb 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2538,6 +2538,11 @@
> # disappear from the query list without user intervention.
> # Defaults to true. (Since 3.1)
> #
> +# @target-is-zero: Assume the destination reads as all zeroes before
> +# the mirror started. Setting this to true can speed up the
> +# mirror. Setting this to true when the destination is not
> +# actually all zero can corrupt the destination. (Since 10.1)
It appears feasible to add target-is-zero to the drive-mirror qmp as well,
provided this is done under the NEW_IMAGE_MODE_EXISTING scenario. This would
allow users to pass target-is-zero for optimizations whether using blockdev-mirror
or drive-mirror when they have pre-prepared destination image.
> +#
> # Since: 2.6
> #
> # .. qmp-example::
> @@ -2557,7 +2562,8 @@
> '*on-target-error': 'BlockdevOnError',
> '*filter-node-name': 'str',
> '*copy-mode': 'MirrorCopyMode',
> - '*auto-finalize': 'bool', '*auto-dismiss': 'bool' },
> + '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
> + '*target-is-zero': 'bool'},
> 'allow-preconfig': true }
>
> ##
> diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
> index eb2d92a2261..8cf0003ce78 100644
> --- a/include/block/block_int-global-state.h
> +++ b/include/block/block_int-global-state.h
> @@ -140,6 +140,7 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
> * @mode: Whether to collapse all images in the chain to the target.
> * @backing_mode: How to establish the target's backing chain after completion.
> * @zero_target: Whether the target should be explicitly zero-initialized
> + * @target_is_zero: Whether the target already is zero-initialized.
> * @on_source_error: The action to take upon error reading from the source.
> * @on_target_error: The action to take upon error writing to the target.
> * @unmap: Whether to unmap target where source sectors only contain zeroes.
> @@ -159,7 +160,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
> int creation_flags, int64_t speed,
> uint32_t granularity, int64_t buf_size,
> MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
> - bool zero_target,
> + bool zero_target, bool target_is_zero,
> BlockdevOnError on_source_error,
> BlockdevOnError on_target_error,
> bool unmap, const char *filter_node_name,
> diff --git a/block/mirror.c b/block/mirror.c
> index 2599b75d092..4dcb50c81ad 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -55,6 +55,8 @@ typedef struct MirrorBlockJob {
> BlockMirrorBackingMode backing_mode;
> /* Whether the target image requires explicit zero-initialization */
> bool zero_target;
> + /* Whether the target should be assumed to be already zero initialized */
> + bool target_is_zero;
> /*
> * To be accesssed with atomics. Written only under the BQL (required by the
> * current implementation of mirror_change()).
> @@ -844,12 +846,26 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> BlockDriverState *target_bs = blk_bs(s->target);
> int ret = -EIO;
> int64_t count;
> + bool punch_holes =
> + target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
> + bdrv_can_write_zeroes_with_unmap(target_bs);
>
> bdrv_graph_co_rdlock();
> bs = s->mirror_top_bs->backing->bs;
> bdrv_graph_co_rdunlock();
>
> - if (s->zero_target) {
> + if (s->zero_target && (!s->target_is_zero || punch_holes)) {
> + /*
> + * Here, we are in FULL mode; our goal is to avoid writing
> + * zeroes if the destination already reads as zero, except
> + * when we are trying to punch holes. This is possible if
> + * zeroing happened externally (s->target_is_zero) or if we
> + * have a fast way to pre-zero the image (the dirty bitmap
> + * will be populated later by the non-zero portions, the same
> + * as for TOP mode). If pre-zeroing is not fast, or we need
> + * to punch holes, then our only recourse is to write the
> + * entire image.
> + */
> if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
> bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
> return 0;
> @@ -1714,7 +1730,7 @@ static BlockJob *mirror_start_job(
> uint32_t granularity, int64_t buf_size,
> MirrorSyncMode sync_mode,
> BlockMirrorBackingMode backing_mode,
> - bool zero_target,
> + bool zero_target, bool target_is_zero,
> BlockdevOnError on_source_error,
> BlockdevOnError on_target_error,
> bool unmap,
> @@ -1883,6 +1899,7 @@ static BlockJob *mirror_start_job(
> s->sync_mode = sync_mode;
> s->backing_mode = backing_mode;
> s->zero_target = zero_target;
> + s->target_is_zero = target_is_zero;
> qatomic_set(&s->copy_mode, copy_mode);
> s->base = base;
> s->base_overlay = bdrv_find_overlay(bs, base);
> @@ -2011,7 +2028,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
> int creation_flags, int64_t speed,
> uint32_t granularity, int64_t buf_size,
> MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
> - bool zero_target,
> + bool zero_target, bool target_is_zero,
> BlockdevOnError on_source_error,
> BlockdevOnError on_target_error,
> bool unmap, const char *filter_node_name,
> @@ -2034,7 +2051,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
>
> mirror_start_job(job_id, bs, creation_flags, target, replaces,
> speed, granularity, buf_size, mode, backing_mode,
> - zero_target, on_source_error, on_target_error, unmap,
> + zero_target,
> + target_is_zero, on_source_error, on_target_error, unmap,
> NULL, NULL, &mirror_job_driver, base, false,
> filter_node_name, true, copy_mode, false, errp);
> }
> @@ -2062,6 +2080,7 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
> job = mirror_start_job(
> job_id, bs, creation_flags, base, NULL, speed, 0, 0,
> MIRROR_SYNC_MODE_TOP, MIRROR_LEAVE_BACKING_CHAIN, false,
> + false,
> on_error, on_error, true, cb, opaque,
> &commit_active_job_driver, base, auto_complete,
> filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
> diff --git a/blockdev.c b/blockdev.c
> index 1d1f27cfff6..2e2fed539ee 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2798,7 +2798,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
> const char *replaces,
> enum MirrorSyncMode sync,
> BlockMirrorBackingMode backing_mode,
> - bool zero_target,
> + bool zero_target, bool target_is_zero,
> bool has_speed, int64_t speed,
> bool has_granularity, uint32_t granularity,
> bool has_buf_size, int64_t buf_size,
> @@ -2909,11 +2909,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
> /* pass the node name to replace to mirror start since it's loose coupling
> * and will allow to check whether the node still exist at mirror completion
> */
> - mirror_start(job_id, bs, target,
> - replaces, job_flags,
> + mirror_start(job_id, bs, target, replaces, job_flags,
> speed, granularity, buf_size, sync, backing_mode, zero_target,
> - on_source_error, on_target_error, unmap, filter_node_name,
> - copy_mode, errp);
> + target_is_zero, on_source_error, on_target_error, unmap,
> + filter_node_name, copy_mode, errp);
> }
>
> void qmp_drive_mirror(DriveMirror *arg, Error **errp)
> @@ -2928,6 +2927,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
> int64_t size;
> const char *format = arg->format;
> bool zero_target;
> + bool target_is_zero;
> int ret;
>
> bs = qmp_get_root_bs(arg->device, errp);
> @@ -3044,6 +3044,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
> zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
> (arg->mode == NEW_IMAGE_MODE_EXISTING ||
> !bdrv_has_zero_init(target_bs)));
> + target_is_zero = (arg->mode != NEW_IMAGE_MODE_EXISTING &&
> + bdrv_has_zero_init(target_bs));
Similarly, if the qmp_drive_mirror supports the target-is-zero parameter, we would then
be able to determine whether the destination image is fully zero-initialized when
arg->mode is NEW_IMAGE_MODE_EXISTING.
> bdrv_graph_rdunlock_main_loop();
>
>
> @@ -3055,7 +3057,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>
thanks.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
2025-05-02 23:57 ` Sunny Zhu
@ 2025-05-03 0:40 ` Sunny Zhu
2025-05-12 15:23 ` Stefan Hajnoczi
2 siblings, 0 replies; 34+ messages in thread
From: Sunny Zhu @ 2025-05-03 0:40 UTC (permalink / raw)
To: eblake
Cc: armbru, hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha,
sunnyzhyy, vsementsov
On Fri, 9 May 2025 15:40:25 -0500, Eric Blake wrote:
> QEMU has an optimization for a just-created drive-mirror destination
> that is not possible for blockdev-mirror (which can't create the
> destination) - any time we know the destination starts life as all
> zeroes, we can skip a pre-zeroing pass on the destination. Recent
> patches have added an improved heuristic for detecting if a file
> contains all zeroes, and we plan to use that heuristic in upcoming
> patches. But since a heuristic cannot quickly detect all scenarios,
> and there may be cases where the caller is aware of information that
> QEMU cannot learn quickly, it makes sense to have a way to tell QEMU
> to assume facts about the destination that can make the mirror
> operation faster. Given our existing example of "qemu-img convert
> --target-is-zero", it is time to expose this override in QMP for
> blockdev-mirror as well.
>
> This patch results in some slight redundancy between the older
> s->zero_target (set any time mode==FULL and the destination image was
> not just created - ie. clear if drive-mirror is asking to skip the
> pre-zero pass) and the newly-introduced s->target_is_zero (in addition
> to the QMP override, it is set when drive-mirror creates the
> destination image); this will be cleaned up in the next patch.
>
> There is also a subtlety that we must consider. When drive-mirror is
> passing target_is_zero on behalf of a just-created image, we know the
> image is sparse (skipping the pre-zeroing keeps it that way), so it
> doesn't matter whether the destination also has "discard":"unmap" and
> "detect-zeroes":"unmap". But now that we are letting the user set the
> knob for target-is-zero, if the user passes a pre-existing file that
> is fully allocated, it is fine to leave the file fully allocated under
> "detect-zeroes":"on", but if the file is open with
> "detect-zeroes":"unmap", we should really be trying harder to punch
> holes in the destination for every region of zeroes copied from the
> source. The easiest way to do this is to still run the pre-zeroing
> pass (turning the entire destination file sparse before populating
> just the allocated portions of the source), even though that currently
> results in double I/O to the portions of the file that are allocated.
> A later patch will add further optimizations to reduce redundant
> zeroing I/O during the mirror operation.
>
> Since "target-is-zero":true is designed for optimizations, it is okay
> to silently ignore the parameter rather than erroring if the user ever
> sets the parameter in a scenario where the mirror job can't exploit it
> (for example, when doing "sync":"top" instead of "sync":"full", we
> can't pre-zero, so setting the parameter won't make a speed
> difference).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Acked-by: Markus Armbruster <armbru@redhat.com>
>
> ---
>
> v4: hoist earlier in series. QMP design is unchanged, but logic in
> mirror_dirty_init is different (in many aspects simpler, but now
> catering to "detect-zeroes":"unmap") so Acked-by on QMP kept, but
> Reviewed-by dropped.
> ---
> qapi/block-core.json | 8 +++++++-
> include/block/block_int-global-state.h | 3 ++-
> block/mirror.c | 27 ++++++++++++++++++++++----
> blockdev.c | 18 ++++++++++-------
> tests/unit/test-block-iothread.c | 2 +-
> 5 files changed, 44 insertions(+), 14 deletions(-)
Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 09/13] mirror: Drop redundant zero_target parameter
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
@ 2025-05-03 0:47 ` Sunny Zhu
2025-05-12 15:24 ` Stefan Hajnoczi
2025-05-14 22:09 ` Eric Blake
2 siblings, 0 replies; 34+ messages in thread
From: Sunny Zhu @ 2025-05-03 0:47 UTC (permalink / raw)
To: eblake
Cc: hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha, sunnyzhyy,
vsementsov
On Fri, May 09, 2025 at 03:40:26PM -0500, Eric Blake wrote:
> The two callers to a mirror job (drive-mirror and blockdev-mirror) set
> zero_target precisely when sync mode == FULL, with the one exception
> that drive-mirror skips zeroing the target if it was newly created and
> reads as zero. But given the previous patch, that exception is
> equally captured by target_is_zero. And since we recently updated
> things to pass the sync mode all the way through to
> mirror_dirty_init(), we can now reconstruct the same conditionals
> without the redundant parameter.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v4: new patch
> ---
> include/block/block_int-global-state.h | 3 +--
> block/mirror.c | 13 +++++--------
> blockdev.c | 15 ++++-----------
> tests/unit/test-block-iothread.c | 2 +-
> 4 files changed, 11 insertions(+), 22 deletions(-)
Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
@ 2025-05-03 0:51 ` Sunny Zhu
2025-05-12 18:40 ` Stefan Hajnoczi
1 sibling, 0 replies; 34+ messages in thread
From: Sunny Zhu @ 2025-05-03 0:51 UTC (permalink / raw)
To: eblake
Cc: hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha, sunnyzhyy,
vsementsov
On Fri, May 09, 2025 at 03:40:27PM -0500, Eric Blake wrote:
> When doing a sync=full mirroring, we can skip pre-zeroing the
> destination if it already reads as zeroes and we are not also trying
> to punch holes due to detect-zeroes. With this patch, there are fewer
> scenarios that have to pass in an explicit target-is-zero, while still
> resulting in a sparse destination remaining sparse.
>
> A later patch will then further improve things to skip writing to the
> destination for parts of the image where the source is zero; but even
> with just this patch, it is possible to see a difference for any
> source that does not report itself as fully allocated, coupled with a
> destination BDS that can quickly report that it already reads as zero.
> (For a source that reports as fully allocated, such as a file, the
> rest of mirror_dirty_init() still sets the entire dirty bitmap to
> true, so even though we avoided the pre-zeroing, we are not yet
> avoiding all redundant I/O).
>
> Iotest 194 detects the difference made by this patch: for a file
> source (where block status reports the entire image as allocated, and
> therefore we end up writing zeroes everywhere in the destination
> anyways), the job length remains the same. But for a qcow2 source and
> a destination that reads as all zeroes, the dirty bitmap changes to
> just tracking the allocated portions of the source, which results in
> faster completion and smaller job statistics. For the test to pass
> with both ./check -file and -qcow2, a new python filter is needed to
> mask out the now-varying job amounts (this matches the shell filters
> _filter_block_job_{offset,len} in common.filter). A later test will
> also be added which further validates expected sparseness, so it does
> not matter that 194 is no longer explicitly looking at how many bytes
> were copied.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: add exemption for "detect-zeroes":"unmap" on destination
> v4: Rebase later in series, revise logic for pre-zeroing [Sunny], add
> in python filter
> ---
> block/mirror.c | 24 ++++++++++++++++--------
> tests/qemu-iotests/194 | 6 ++++--
> tests/qemu-iotests/194.out | 4 ++--
> tests/qemu-iotests/iotests.py | 12 +++++++++++-
> 4 files changed, 33 insertions(+), 13 deletions(-)
Reviewed-by: Sunny Zhu <sunnyzhyy@qq.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases
@ 2025-05-09 20:40 Eric Blake
2025-05-09 20:40 ` [PATCH v4 01/13] block: Expand block status mode from bool to flags Eric Blake
` (14 more replies)
0 siblings, 15 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, sunnyzhyy, stefanha, vsementsov
v3 was here:
https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg04525.html
In v4:
- Rearrange series a bit to try and simplify logic for how zero
bitmap is populated, when pre-zeroing is attempted, and how
detect-zeroes=unmap interacts [Sunny]
- Add a couple new patches to make this easier to follow
- iotest tweaks [Stefan]
001/13:[----] [--] 'block: Expand block status mode from bool to flags'
002/13:[----] [--] 'file-posix, gluster: Handle zero block status hint better'
003/13:[----] [--] 'block: Let bdrv_co_is_zero_fast consolidate adjacent extents'
004/13:[----] [--] 'block: Add new bdrv_co_is_all_zeroes() function'
005/13:[----] [--] 'iotests: Improve iotest 194 to mirror data'
006/13:[----] [--] 'mirror: Minor refactoring'
007/13:[down] 'mirror: Pass full sync mode rather than bool to internals'
008/13:[0038] [FC] 'mirror: Allow QMP override to declare target already zero'
009/13:[down] 'mirror: Drop redundant zero_target parameter'
010/13:[0063] [FC] 'mirror: Skip pre-zeroing destination if it is already zero'
011/13:[0021] [FC] 'mirror: Skip writing zeroes when target is already zero'
012/13:[----] [--] 'iotests/common.rc: add disk_usage function'
013/13:[0013] [FC] 'tests: Add iotest mirror-sparse for recent patches'
Andrey Drobyshev (1):
iotests/common.rc: add disk_usage function
Eric Blake (12):
block: Expand block status mode from bool to flags
file-posix, gluster: Handle zero block status hint better
block: Let bdrv_co_is_zero_fast consolidate adjacent extents
block: Add new bdrv_co_is_all_zeroes() function
iotests: Improve iotest 194 to mirror data
mirror: Minor refactoring
mirror: Pass full sync mode rather than bool to internals
mirror: Allow QMP override to declare target already zero
mirror: Drop redundant zero_target parameter
mirror: Skip pre-zeroing destination if it is already zero
mirror: Skip writing zeroes when target is already zero
tests: Add iotest mirror-sparse for recent patches
qapi/block-core.json | 8 +-
block/coroutines.h | 4 +-
include/block/block-common.h | 11 +
include/block/block-io.h | 2 +
include/block/block_int-common.h | 27 +-
include/block/block_int-global-state.h | 4 +-
include/block/block_int-io.h | 4 +-
block/io.c | 126 +++++--
block/blkdebug.c | 6 +-
block/copy-before-write.c | 4 +-
block/file-posix.c | 5 +-
block/gluster.c | 4 +-
block/iscsi.c | 6 +-
block/mirror.c | 183 ++++++++---
block/nbd.c | 4 +-
block/null.c | 6 +-
block/parallels.c | 6 +-
block/qcow.c | 2 +-
block/qcow2.c | 6 +-
block/qed.c | 6 +-
block/quorum.c | 4 +-
block/raw-format.c | 4 +-
block/rbd.c | 6 +-
block/snapshot-access.c | 4 +-
block/vdi.c | 4 +-
block/vmdk.c | 2 +-
block/vpc.c | 2 +-
block/vvfat.c | 6 +-
blockdev.c | 27 +-
tests/unit/test-block-iothread.c | 2 +-
tests/qemu-iotests/common.rc | 6 +
tests/qemu-iotests/194 | 7 +-
tests/qemu-iotests/194.out | 4 +-
tests/qemu-iotests/250 | 5 -
tests/qemu-iotests/iotests.py | 12 +-
tests/qemu-iotests/tests/mirror-sparse | 125 +++++++
tests/qemu-iotests/tests/mirror-sparse.out | 365 +++++++++++++++++++++
37 files changed, 850 insertions(+), 159 deletions(-)
create mode 100755 tests/qemu-iotests/tests/mirror-sparse
create mode 100644 tests/qemu-iotests/tests/mirror-sparse.out
--
2.49.0
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 01/13] block: Expand block status mode from bool to flags
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 02/13] file-posix, gluster: Handle zero block status hint better Eric Blake
` (13 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Kevin Wolf,
Hanna Reitz, John Snow, Fam Zheng, Ronnie Sahlberg, Paolo Bonzini,
Peter Lieven, Denis V. Lunev, Alberto Garcia, Ilya Dryomov,
Stefan Weil, open list:GLUSTER
This patch is purely mechanical, changing bool want_zero into an
unsigned int for bitwise-or of flags. As of this patch, all
implementations are unchanged (the old want_zero==true is now
mode==BDRV_WANT_PRECISE which is a superset of BDRV_WANT_ZERO); but
the callers in io.c that used to pass want_zero==false are now
prepared for future driver changes that can now distinguish bewteen
BDRV_WANT_ZERO vs. BDRV_WANT_ALLOCATED. The next patch will actually
change the file-posix driver along those lines, now that we have
more-specific hints.
As for the background why this patch is useful: right now, the
file-posix driver recognizes that if allocation is being queried, the
entire image can be reported as allocated (there is no backing file to
refer to) - but this throws away information on whether the entire
image reads as zero (trivially true if lseek(SEEK_HOLE) at offset 0
returns -ENXIO, a bit more complicated to prove if the raw file was
created with 'qemu-img create' since we intentionally allocate a small
chunk of all-zero data to help with alignment probing). Later patches
will add a generic algorithm for seeing if an entire file reads as
zeroes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/coroutines.h | 4 +--
include/block/block-common.h | 11 +++++++
include/block/block_int-common.h | 27 +++++++++--------
include/block/block_int-io.h | 4 +--
block/io.c | 51 ++++++++++++++++----------------
block/blkdebug.c | 6 ++--
block/copy-before-write.c | 4 +--
block/file-posix.c | 4 +--
block/gluster.c | 4 +--
block/iscsi.c | 6 ++--
block/nbd.c | 4 +--
block/null.c | 6 ++--
block/parallels.c | 6 ++--
block/qcow.c | 2 +-
block/qcow2.c | 6 ++--
block/qed.c | 6 ++--
block/quorum.c | 4 +--
block/raw-format.c | 4 +--
block/rbd.c | 6 ++--
block/snapshot-access.c | 4 +--
block/vdi.c | 4 +--
block/vmdk.c | 2 +-
block/vpc.c | 2 +-
block/vvfat.c | 6 ++--
tests/unit/test-block-iothread.c | 2 +-
25 files changed, 99 insertions(+), 86 deletions(-)
diff --git a/block/coroutines.h b/block/coroutines.h
index 79e5efbf752..892646bb7aa 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -47,7 +47,7 @@ int coroutine_fn GRAPH_RDLOCK
bdrv_co_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool include_base,
- bool want_zero,
+ unsigned int mode,
int64_t offset,
int64_t bytes,
int64_t *pnum,
@@ -78,7 +78,7 @@ int co_wrapper_mixed_bdrv_rdlock
bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool include_base,
- bool want_zero,
+ unsigned int mode,
int64_t offset,
int64_t bytes,
int64_t *pnum,
diff --git a/include/block/block-common.h b/include/block/block-common.h
index 0b831ef87b1..c8c626daeaa 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -333,6 +333,17 @@ typedef enum {
#define BDRV_BLOCK_RECURSE 0x40
#define BDRV_BLOCK_COMPRESSED 0x80
+/*
+ * Block status hints: the bitwise-or of these flags emphasize what
+ * the caller hopes to learn, and some drivers may be able to give
+ * faster answers by doing less work when the hint permits.
+ */
+#define BDRV_WANT_ZERO BDRV_BLOCK_ZERO
+#define BDRV_WANT_OFFSET_VALID BDRV_BLOCK_OFFSET_VALID
+#define BDRV_WANT_ALLOCATED BDRV_BLOCK_ALLOCATED
+#define BDRV_WANT_PRECISE (BDRV_WANT_ZERO | BDRV_WANT_OFFSET_VALID | \
+ BDRV_WANT_OFFSET_VALID)
+
typedef QTAILQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
typedef struct BDRVReopenState {
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 0d8187f6567..2982dd31180 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -604,15 +604,16 @@ struct BlockDriver {
* according to the current layer, and should only need to set
* BDRV_BLOCK_DATA, BDRV_BLOCK_ZERO, BDRV_BLOCK_OFFSET_VALID,
* and/or BDRV_BLOCK_RAW; if the current layer defers to a backing
- * layer, the result should be 0 (and not BDRV_BLOCK_ZERO). See
- * block.h for the overall meaning of the bits. As a hint, the
- * flag want_zero is true if the caller cares more about precise
- * mappings (favor accurate _OFFSET_VALID/_ZERO) or false for
- * overall allocation (favor larger *pnum, perhaps by reporting
- * _DATA instead of _ZERO). The block layer guarantees input
- * clamped to bdrv_getlength() and aligned to request_alignment,
- * as well as non-NULL pnum, map, and file; in turn, the driver
- * must return an error or set pnum to an aligned non-zero value.
+ * layer, the result should be 0 (and not BDRV_BLOCK_ZERO). The
+ * caller will synthesize BDRV_BLOCK_ALLOCATED based on the
+ * non-zero results. See block.h for the overall meaning of the
+ * bits. As a hint, the flags in @mode may include a bitwise-or
+ * of BDRV_WANT_ALLOCATED, BDRV_WANT_OFFSET_VALID, or
+ * BDRV_WANT_ZERO based on what the caller is looking for in the
+ * results. The block layer guarantees input clamped to
+ * bdrv_getlength() and aligned to request_alignment, as well as
+ * non-NULL pnum, map, and file; in turn, the driver must return
+ * an error or set pnum to an aligned non-zero value.
*
* Note that @bytes is just a hint on how big of a region the
* caller wants to inspect. It is not a limit on *pnum.
@@ -624,8 +625,8 @@ struct BlockDriver {
* to clamping *pnum for return to its caller.
*/
int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_block_status)(
- BlockDriverState *bs,
- bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum,
+ BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes, int64_t *pnum,
int64_t *map, BlockDriverState **file);
/*
@@ -649,8 +650,8 @@ struct BlockDriver {
QEMUIOVector *qiov, size_t qiov_offset);
int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_snapshot_block_status)(
- BlockDriverState *bs, bool want_zero, int64_t offset, int64_t bytes,
- int64_t *pnum, int64_t *map, BlockDriverState **file);
+ BlockDriverState *bs, unsigned int mode, int64_t offset,
+ int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file);
int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_pdiscard_snapshot)(
BlockDriverState *bs, int64_t offset, int64_t bytes);
diff --git a/include/block/block_int-io.h b/include/block/block_int-io.h
index 4a7cf2b4fdc..4f94eb3c5a2 100644
--- a/include/block/block_int-io.h
+++ b/include/block/block_int-io.h
@@ -38,8 +38,8 @@
int coroutine_fn GRAPH_RDLOCK bdrv_co_preadv_snapshot(BdrvChild *child,
int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset);
int coroutine_fn GRAPH_RDLOCK bdrv_co_snapshot_block_status(
- BlockDriverState *bs, bool want_zero, int64_t offset, int64_t bytes,
- int64_t *pnum, int64_t *map, BlockDriverState **file);
+ BlockDriverState *bs, unsigned int mode, int64_t offset,
+ int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file);
int coroutine_fn GRAPH_RDLOCK bdrv_co_pdiscard_snapshot(BlockDriverState *bs,
int64_t offset, int64_t bytes);
diff --git a/block/io.c b/block/io.c
index 6d98b0abb98..b5b143cd1b0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2364,10 +2364,8 @@ int bdrv_flush_all(void)
* Drivers not implementing the functionality are assumed to not support
* backing files, hence all their sectors are reported as allocated.
*
- * If 'want_zero' is true, the caller is querying for mapping
- * purposes, with a focus on valid BDRV_BLOCK_OFFSET_VALID, _DATA, and
- * _ZERO where possible; otherwise, the result favors larger 'pnum',
- * with a focus on accurate BDRV_BLOCK_ALLOCATED.
+ * 'mode' serves as a hint as to which results are favored; see the
+ * BDRV_WANT_* macros for details.
*
* If 'offset' is beyond the end of the disk image the return value is
* BDRV_BLOCK_EOF and 'pnum' is set to 0.
@@ -2387,7 +2385,7 @@ int bdrv_flush_all(void)
* set to the host mapping and BDS corresponding to the guest offset.
*/
static int coroutine_fn GRAPH_RDLOCK
-bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
+bdrv_co_do_block_status(BlockDriverState *bs, unsigned int mode,
int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map, BlockDriverState **file)
{
@@ -2476,7 +2474,7 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
local_file = bs;
local_map = aligned_offset;
} else {
- ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
+ ret = bs->drv->bdrv_co_block_status(bs, mode, aligned_offset,
aligned_bytes, pnum, &local_map,
&local_file);
@@ -2488,10 +2486,10 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
* the cache requires an RCU update, so double check here to avoid
* such an update if possible.
*
- * Check want_zero, because we only want to update the cache when we
+ * Check mode, because we only want to update the cache when we
* have accurate information about what is zero and what is data.
*/
- if (want_zero &&
+ if (mode == BDRV_WANT_PRECISE &&
ret == (BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID) &&
QLIST_EMPTY(&bs->children))
{
@@ -2548,7 +2546,7 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
if (ret & BDRV_BLOCK_RAW) {
assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file);
- ret = bdrv_co_do_block_status(local_file, want_zero, local_map,
+ ret = bdrv_co_do_block_status(local_file, mode, local_map,
*pnum, pnum, &local_map, &local_file);
goto out;
}
@@ -2560,7 +2558,7 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
if (!cow_bs) {
ret |= BDRV_BLOCK_ZERO;
- } else if (want_zero) {
+ } else if (mode == BDRV_WANT_PRECISE) {
int64_t size2 = bdrv_co_getlength(cow_bs);
if (size2 >= 0 && offset >= size2) {
@@ -2569,14 +2567,14 @@ bdrv_co_do_block_status(BlockDriverState *bs, bool want_zero,
}
}
- if (want_zero && ret & BDRV_BLOCK_RECURSE &&
+ if (mode == BDRV_WANT_PRECISE && ret & BDRV_BLOCK_RECURSE &&
local_file && local_file != bs &&
(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
(ret & BDRV_BLOCK_OFFSET_VALID)) {
int64_t file_pnum;
int ret2;
- ret2 = bdrv_co_do_block_status(local_file, want_zero, local_map,
+ ret2 = bdrv_co_do_block_status(local_file, mode, local_map,
*pnum, &file_pnum, NULL, NULL);
if (ret2 >= 0) {
/* Ignore errors. This is just providing extra information, it
@@ -2627,7 +2625,7 @@ int coroutine_fn
bdrv_co_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
bool include_base,
- bool want_zero,
+ unsigned int mode,
int64_t offset,
int64_t bytes,
int64_t *pnum,
@@ -2654,7 +2652,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
return 0;
}
- ret = bdrv_co_do_block_status(bs, want_zero, offset, bytes, pnum,
+ ret = bdrv_co_do_block_status(bs, mode, offset, bytes, pnum,
map, file);
++*depth;
if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
@@ -2671,7 +2669,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
p = bdrv_filter_or_cow_bs(p))
{
- ret = bdrv_co_do_block_status(p, want_zero, offset, bytes, pnum,
+ ret = bdrv_co_do_block_status(p, mode, offset, bytes, pnum,
map, file);
++*depth;
if (ret < 0) {
@@ -2734,7 +2732,8 @@ int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
BlockDriverState **file)
{
IO_CODE();
- return bdrv_co_common_block_status_above(bs, base, false, true, offset,
+ return bdrv_co_common_block_status_above(bs, base, false,
+ BDRV_WANT_PRECISE, offset,
bytes, pnum, map, file, NULL);
}
@@ -2765,8 +2764,9 @@ int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
return 1;
}
- ret = bdrv_co_common_block_status_above(bs, NULL, false, false, offset,
- bytes, &pnum, NULL, NULL, NULL);
+ ret = bdrv_co_common_block_status_above(bs, NULL, false, BDRV_WANT_ZERO,
+ offset, bytes, &pnum, NULL, NULL,
+ NULL);
if (ret < 0) {
return ret;
@@ -2782,9 +2782,9 @@ int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t offset,
int64_t dummy;
IO_CODE();
- ret = bdrv_co_common_block_status_above(bs, bs, true, false, offset,
- bytes, pnum ? pnum : &dummy, NULL,
- NULL, NULL);
+ ret = bdrv_co_common_block_status_above(bs, bs, true, BDRV_WANT_ALLOCATED,
+ offset, bytes, pnum ? pnum : &dummy,
+ NULL, NULL, NULL);
if (ret < 0) {
return ret;
}
@@ -2817,7 +2817,8 @@ int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *bs,
int ret;
IO_CODE();
- ret = bdrv_co_common_block_status_above(bs, base, include_base, false,
+ ret = bdrv_co_common_block_status_above(bs, base, include_base,
+ BDRV_WANT_ALLOCATED,
offset, bytes, pnum, NULL, NULL,
&depth);
if (ret < 0) {
@@ -3698,8 +3699,8 @@ bdrv_co_preadv_snapshot(BdrvChild *child, int64_t offset, int64_t bytes,
}
int coroutine_fn
-bdrv_co_snapshot_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset, int64_t bytes,
+bdrv_co_snapshot_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
@@ -3717,7 +3718,7 @@ bdrv_co_snapshot_block_status(BlockDriverState *bs,
}
bdrv_inc_in_flight(bs);
- ret = drv->bdrv_co_snapshot_block_status(bs, want_zero, offset, bytes,
+ ret = drv->bdrv_co_snapshot_block_status(bs, mode, offset, bytes,
pnum, map, file);
bdrv_dec_in_flight(bs);
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 1c1967f8e0a..c54aee0c84b 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -751,9 +751,9 @@ blkdebug_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
}
static int coroutine_fn GRAPH_RDLOCK
-blkdebug_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum, int64_t *map,
- BlockDriverState **file)
+blkdebug_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes, int64_t *pnum,
+ int64_t *map, BlockDriverState **file)
{
int err;
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index fd470f5f926..2badb3a8856 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -291,8 +291,8 @@ cbw_co_preadv_snapshot(BlockDriverState *bs, int64_t offset, int64_t bytes,
}
static int coroutine_fn GRAPH_RDLOCK
-cbw_co_snapshot_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset, int64_t bytes,
+cbw_co_snapshot_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
diff --git a/block/file-posix.c b/block/file-posix.c
index ef52ed9169e..805a1a2949b 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3273,7 +3273,7 @@ static int find_allocation(BlockDriverState *bs, off_t start,
* well exceed it.
*/
static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
- bool want_zero,
+ unsigned int mode,
int64_t offset,
int64_t bytes, int64_t *pnum,
int64_t *map,
@@ -3289,7 +3289,7 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
return ret;
}
- if (!want_zero) {
+ if (mode != BDRV_WANT_PRECISE) {
*pnum = bytes;
*map = offset;
*file = bs;
diff --git a/block/gluster.c b/block/gluster.c
index 8712aa606a4..1a2ef53e9b0 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1461,7 +1461,7 @@ exit:
* (Based on raw_co_block_status() from file-posix.c.)
*/
static int coroutine_fn qemu_gluster_co_block_status(BlockDriverState *bs,
- bool want_zero,
+ unsigned int mode,
int64_t offset,
int64_t bytes,
int64_t *pnum,
@@ -1478,7 +1478,7 @@ static int coroutine_fn qemu_gluster_co_block_status(BlockDriverState *bs,
return ret;
}
- if (!want_zero) {
+ if (mode != BDRV_WANT_PRECISE) {
*pnum = bytes;
*map = offset;
*file = bs;
diff --git a/block/iscsi.c b/block/iscsi.c
index 2f0f4dac097..15b96ee8800 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -694,9 +694,9 @@ out_unlock:
static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum,
- int64_t *map,
+ unsigned int mode,
+ int64_t offset, int64_t bytes,
+ int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
IscsiLun *iscsilun = bs->opaque;
diff --git a/block/nbd.c b/block/nbd.c
index 887841bc813..d5a2b21c6d1 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1397,8 +1397,8 @@ nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
}
static int coroutine_fn GRAPH_RDLOCK nbd_client_co_block_status(
- BlockDriverState *bs, bool want_zero, int64_t offset, int64_t bytes,
- int64_t *pnum, int64_t *map, BlockDriverState **file)
+ BlockDriverState *bs, unsigned int mode, int64_t offset,
+ int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file)
{
int ret, request_ret;
NBDExtent64 extent = { 0 };
diff --git a/block/null.c b/block/null.c
index dc0b1fdbd9b..4e448d593d7 100644
--- a/block/null.c
+++ b/block/null.c
@@ -227,9 +227,9 @@ static int null_reopen_prepare(BDRVReopenState *reopen_state,
}
static int coroutine_fn null_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum,
- int64_t *map,
+ unsigned int mode,
+ int64_t offset, int64_t bytes,
+ int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
BDRVNullState *s = bs->opaque;
diff --git a/block/parallels.c b/block/parallels.c
index 347ca127f34..3a375e2a8ab 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -416,9 +416,9 @@ parallels_co_flush_to_os(BlockDriverState *bs)
}
static int coroutine_fn GRAPH_RDLOCK
-parallels_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum, int64_t *map,
- BlockDriverState **file)
+parallels_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes, int64_t *pnum,
+ int64_t *map, BlockDriverState **file)
{
BDRVParallelsState *s = bs->opaque;
int count;
diff --git a/block/qcow.c b/block/qcow.c
index da8ad4d2430..8a3e7591a92 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -530,7 +530,7 @@ get_cluster_offset(BlockDriverState *bs, uint64_t offset, int allocate,
}
static int coroutine_fn GRAPH_RDLOCK
-qcow_co_block_status(BlockDriverState *bs, bool want_zero,
+qcow_co_block_status(BlockDriverState *bs, unsigned int mode,
int64_t offset, int64_t bytes, int64_t *pnum,
int64_t *map, BlockDriverState **file)
{
diff --git a/block/qcow2.c b/block/qcow2.c
index 7774e7f0909..66fba89b414 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2141,9 +2141,9 @@ static void qcow2_join_options(QDict *options, QDict *old_options)
}
static int coroutine_fn GRAPH_RDLOCK
-qcow2_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
- int64_t count, int64_t *pnum, int64_t *map,
- BlockDriverState **file)
+qcow2_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t count, int64_t *pnum,
+ int64_t *map, BlockDriverState **file)
{
BDRVQcow2State *s = bs->opaque;
uint64_t host_offset;
diff --git a/block/qed.c b/block/qed.c
index ac24449ffb3..4a36fb39294 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -833,9 +833,9 @@ fail:
}
static int coroutine_fn GRAPH_RDLOCK
-bdrv_qed_co_block_status(BlockDriverState *bs, bool want_zero, int64_t pos,
- int64_t bytes, int64_t *pnum, int64_t *map,
- BlockDriverState **file)
+bdrv_qed_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t pos, int64_t bytes, int64_t *pnum,
+ int64_t *map, BlockDriverState **file)
{
BDRVQEDState *s = bs->opaque;
size_t len = MIN(bytes, SIZE_MAX);
diff --git a/block/quorum.c b/block/quorum.c
index 30747a6df93..ed8ce801ee3 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1226,7 +1226,7 @@ static void quorum_child_perm(BlockDriverState *bs, BdrvChild *c,
* region contains zeroes, and BDRV_BLOCK_DATA otherwise.
*/
static int coroutine_fn GRAPH_RDLOCK
-quorum_co_block_status(BlockDriverState *bs, bool want_zero,
+quorum_co_block_status(BlockDriverState *bs, unsigned int mode,
int64_t offset, int64_t count,
int64_t *pnum, int64_t *map, BlockDriverState **file)
{
@@ -1238,7 +1238,7 @@ quorum_co_block_status(BlockDriverState *bs, bool want_zero,
for (i = 0; i < s->num_children; i++) {
int64_t bytes;
ret = bdrv_co_common_block_status_above(s->children[i]->bs, NULL, false,
- want_zero, offset, count,
+ mode, offset, count,
&bytes, NULL, NULL, NULL);
if (ret < 0) {
quorum_report_bad(QUORUM_OP_TYPE_READ, offset, count,
diff --git a/block/raw-format.c b/block/raw-format.c
index e08526e2eca..df16ac1ea25 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -283,8 +283,8 @@ fail:
}
static int coroutine_fn GRAPH_RDLOCK
-raw_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum, int64_t *map,
+raw_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
BDRVRawState *s = bs->opaque;
diff --git a/block/rbd.c b/block/rbd.c
index 7446e66659e..951cd63f9ae 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1503,9 +1503,9 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
}
static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum,
- int64_t *map,
+ unsigned int mode,
+ int64_t offset, int64_t bytes,
+ int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
BDRVRBDState *s = bs->opaque;
diff --git a/block/snapshot-access.c b/block/snapshot-access.c
index 71ac83c01f0..17ed2402db8 100644
--- a/block/snapshot-access.c
+++ b/block/snapshot-access.c
@@ -41,11 +41,11 @@ snapshot_access_co_preadv_part(BlockDriverState *bs,
static int coroutine_fn GRAPH_RDLOCK
snapshot_access_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
+ unsigned int mode, int64_t offset,
int64_t bytes, int64_t *pnum,
int64_t *map, BlockDriverState **file)
{
- return bdrv_co_snapshot_block_status(bs->file->bs, want_zero, offset,
+ return bdrv_co_snapshot_block_status(bs->file->bs, mode, offset,
bytes, pnum, map, file);
}
diff --git a/block/vdi.c b/block/vdi.c
index a2da6ecab01..3ddc62a5690 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -523,8 +523,8 @@ static int vdi_reopen_prepare(BDRVReopenState *state,
}
static int coroutine_fn GRAPH_RDLOCK
-vdi_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum, int64_t *map,
+vdi_co_block_status(BlockDriverState *bs, unsigned int mode,
+ int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
BDRVVdiState *s = (BDRVVdiState *)bs->opaque;
diff --git a/block/vmdk.c b/block/vmdk.c
index 2adec499122..9c7ab037e14 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1777,7 +1777,7 @@ static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
}
static int coroutine_fn GRAPH_RDLOCK
-vmdk_co_block_status(BlockDriverState *bs, bool want_zero,
+vmdk_co_block_status(BlockDriverState *bs, unsigned int mode,
int64_t offset, int64_t bytes, int64_t *pnum,
int64_t *map, BlockDriverState **file)
{
diff --git a/block/vpc.c b/block/vpc.c
index 0309e319f60..801ff5793f8 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -726,7 +726,7 @@ fail:
}
static int coroutine_fn GRAPH_RDLOCK
-vpc_co_block_status(BlockDriverState *bs, bool want_zero,
+vpc_co_block_status(BlockDriverState *bs, unsigned int mode,
int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map,
BlockDriverState **file)
diff --git a/block/vvfat.c b/block/vvfat.c
index 91d69b3cc83..814796d9185 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3134,9 +3134,9 @@ vvfat_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
}
static int coroutine_fn vvfat_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
- int64_t bytes, int64_t *n,
- int64_t *map,
+ unsigned int mode,
+ int64_t offset, int64_t bytes,
+ int64_t *n, int64_t *map,
BlockDriverState **file)
{
*n = bytes;
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 2b358eaaa82..e26b3be5939 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -63,7 +63,7 @@ bdrv_test_co_truncate(BlockDriverState *bs, int64_t offset, bool exact,
}
static int coroutine_fn bdrv_test_co_block_status(BlockDriverState *bs,
- bool want_zero,
+ unsigned int mode,
int64_t offset, int64_t count,
int64_t *pnum, int64_t *map,
BlockDriverState **file)
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 02/13] file-posix, gluster: Handle zero block status hint better
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
2025-05-09 20:40 ` [PATCH v4 01/13] block: Expand block status mode from bool to flags Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 03/13] block: Let bdrv_co_is_zero_fast consolidate adjacent extents Eric Blake
` (12 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Kevin Wolf,
Hanna Reitz, open list:GLUSTER
Although the previous patch to change 'bool want_zero' into a bitmask
made no semantic change, it is now time to differentiate. When the
caller specifically wants to know what parts of the file read as zero,
we need to use lseek and actually reporting holes, rather than
short-circuiting and advertising full allocation.
This change will be utilized in later patches to let mirroring
optimize for the case when the destination already reads as zeroes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/file-posix.c | 3 ++-
block/gluster.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 805a1a2949b..ec95b748696 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3289,7 +3289,8 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
return ret;
}
- if (mode != BDRV_WANT_PRECISE) {
+ if (!(mode & BDRV_WANT_ZERO)) {
+ /* There is no backing file - all bytes are allocated in this file. */
*pnum = bytes;
*map = offset;
*file = bs;
diff --git a/block/gluster.c b/block/gluster.c
index 1a2ef53e9b0..89abd40f313 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1478,7 +1478,7 @@ static int coroutine_fn qemu_gluster_co_block_status(BlockDriverState *bs,
return ret;
}
- if (mode != BDRV_WANT_PRECISE) {
+ if (!(mode & BDRV_WANT_ZERO)) {
*pnum = bytes;
*map = offset;
*file = bs;
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 03/13] block: Let bdrv_co_is_zero_fast consolidate adjacent extents
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
2025-05-09 20:40 ` [PATCH v4 01/13] block: Expand block status mode from bool to flags Eric Blake
2025-05-09 20:40 ` [PATCH v4 02/13] file-posix, gluster: Handle zero block status hint better Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 04/13] block: Add new bdrv_co_is_all_zeroes() function Eric Blake
` (11 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Fam Zheng,
Kevin Wolf, Hanna Reitz
Some BDS drivers have a cap on how much block status they can supply
in one query (for example, NBD talking to an older server cannot
inspect more than 4G per query; and qcow2 tends to cap its answers
rather than cross a cluster boundary of an L1 table). Although the
existing callers of bdrv_co_is_zero_fast are not passing in that large
of a 'bytes' parameter, an upcoming caller wants to query the entire
image at once, and will thus benefit from being able to treat adjacent
zero regions in a coalesced manner, rather than claiming the region is
non-zero merely because pnum was truncated and didn't match the
incoming bytes.
While refactoring this into a loop, note that there is no need to
assign pnum prior to calling bdrv_co_common_block_status_above() (it
is guaranteed to be assigned deeper in the callstack).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3: also tweak function comment
---
block/io.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
diff --git a/block/io.c b/block/io.c
index b5b143cd1b0..50dc0e193f0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2751,28 +2751,31 @@ int coroutine_fn bdrv_co_block_status(BlockDriverState *bs, int64_t offset,
* by @offset and @bytes is known to read as zeroes.
* Return 1 if that is the case, 0 otherwise and -errno on error.
* This test is meant to be fast rather than accurate so returning 0
- * does not guarantee non-zero data.
+ * does not guarantee non-zero data; but a return of 1 is reliable.
*/
int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
int64_t bytes)
{
int ret;
- int64_t pnum = bytes;
+ int64_t pnum;
IO_CODE();
- if (!bytes) {
- return 1;
+ while (bytes) {
+ ret = bdrv_co_common_block_status_above(bs, NULL, false,
+ BDRV_WANT_ZERO, offset, bytes,
+ &pnum, NULL, NULL, NULL);
+
+ if (ret < 0) {
+ return ret;
+ }
+ if (!(ret & BDRV_BLOCK_ZERO)) {
+ return 0;
+ }
+ offset += pnum;
+ bytes -= pnum;
}
- ret = bdrv_co_common_block_status_above(bs, NULL, false, BDRV_WANT_ZERO,
- offset, bytes, &pnum, NULL, NULL,
- NULL);
-
- if (ret < 0) {
- return ret;
- }
-
- return (pnum == bytes) && (ret & BDRV_BLOCK_ZERO);
+ return 1;
}
int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t offset,
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 04/13] block: Add new bdrv_co_is_all_zeroes() function
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (2 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 03/13] block: Let bdrv_co_is_zero_fast consolidate adjacent extents Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 05/13] iotests: Improve iotest 194 to mirror data Eric Blake
` (10 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Fam Zheng,
Kevin Wolf, Hanna Reitz
There are some optimizations that require knowing if an image starts
out as reading all zeroes, such as making blockdev-mirror faster by
skipping the copying of source zeroes to the destination. The
existing bdrv_co_is_zero_fast() is a good building block for answering
this question, but it tends to give an answer of 0 for a file we just
created via QMP 'blockdev-create' or similar (such as 'qemu-img create
-f raw'). Why? Because file-posix.c insists on allocating a tiny
header to any file rather than leaving it 100% sparse, due to some
filesystems that are unable to answer alignment probes on a hole. But
teaching file-posix.c to read the tiny header doesn't scale - the
problem of a small header is also visible when libvirt sets up an NBD
client to a just-created file on a migration destination host.
So, we need a wrapper function that handles a bit more complexity in a
common manner for all block devices - when the BDS is mostly a hole,
but has a small non-hole header, it is still worth the time to read
that header and check if it reads as all zeroes before giving up and
returning a pessimistic answer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v3: Use constant 128k as maximum data header size to read [Stefan]
---
include/block/block-io.h | 2 ++
block/io.c | 62 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/include/block/block-io.h b/include/block/block-io.h
index b49e0537dd4..b99cc98d265 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -161,6 +161,8 @@ bdrv_is_allocated_above(BlockDriverState *bs, BlockDriverState *base,
int coroutine_fn GRAPH_RDLOCK
bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset, int64_t bytes);
+int coroutine_fn GRAPH_RDLOCK
+bdrv_co_is_all_zeroes(BlockDriverState *bs);
int GRAPH_RDLOCK
bdrv_apply_auto_read_only(BlockDriverState *bs, const char *errmsg,
diff --git a/block/io.c b/block/io.c
index 50dc0e193f0..4fd7768f9cd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -38,10 +38,14 @@
#include "qemu/error-report.h"
#include "qemu/main-loop.h"
#include "system/replay.h"
+#include "qemu/units.h"
/* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
#define MAX_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
+/* Maximum read size for checking if data reads as zero, in bytes */
+#define MAX_ZERO_CHECK_BUFFER (128 * KiB)
+
static void coroutine_fn GRAPH_RDLOCK
bdrv_parent_cb_resize(BlockDriverState *bs);
@@ -2778,6 +2782,64 @@ int coroutine_fn bdrv_co_is_zero_fast(BlockDriverState *bs, int64_t offset,
return 1;
}
+/*
+ * Check @bs (and its backing chain) to see if the entire image is known
+ * to read as zeroes.
+ * Return 1 if that is the case, 0 otherwise and -errno on error.
+ * This test is meant to be fast rather than accurate so returning 0
+ * does not guarantee non-zero data; however, a return of 1 is reliable,
+ * and this function can report 1 in more cases than bdrv_co_is_zero_fast.
+ */
+int coroutine_fn bdrv_co_is_all_zeroes(BlockDriverState *bs)
+{
+ int ret;
+ int64_t pnum, bytes;
+ char *buf;
+ QEMUIOVector local_qiov;
+ IO_CODE();
+
+ bytes = bdrv_co_getlength(bs);
+ if (bytes < 0) {
+ return bytes;
+ }
+
+ /* First probe - see if the entire image reads as zero */
+ ret = bdrv_co_common_block_status_above(bs, NULL, false, BDRV_WANT_ZERO,
+ 0, bytes, &pnum, NULL, NULL,
+ NULL);
+ if (ret < 0) {
+ return ret;
+ }
+ if (ret & BDRV_BLOCK_ZERO) {
+ return bdrv_co_is_zero_fast(bs, pnum, bytes - pnum);
+ }
+
+ /*
+ * Because of the way 'blockdev-create' works, raw files tend to
+ * be created with a non-sparse region at the front to make
+ * alignment probing easier. If the block starts with only a
+ * small allocated region, it is still worth the effort to see if
+ * the rest of the image is still sparse, coupled with manually
+ * reading the first region to see if it reads zero after all.
+ */
+ if (pnum > MAX_ZERO_CHECK_BUFFER) {
+ return 0;
+ }
+ ret = bdrv_co_is_zero_fast(bs, pnum, bytes - pnum);
+ if (ret <= 0) {
+ return ret;
+ }
+ /* Only the head of the image is unknown, and it's small. Read it. */
+ buf = qemu_blockalign(bs, pnum);
+ qemu_iovec_init_buf(&local_qiov, buf, pnum);
+ ret = bdrv_driver_preadv(bs, 0, pnum, &local_qiov, 0, 0);
+ if (ret >= 0) {
+ ret = buffer_is_zero(buf, pnum);
+ }
+ qemu_vfree(buf);
+ return ret;
+}
+
int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t offset,
int64_t bytes, int64_t *pnum)
{
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 05/13] iotests: Improve iotest 194 to mirror data
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (3 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 04/13] block: Add new bdrv_co_is_all_zeroes() function Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 06/13] mirror: Minor refactoring Eric Blake
` (9 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Kevin Wolf,
Hanna Reitz
Mirroring a completely sparse image to a sparse destination should be
practically instantaneous. It isn't yet, but the test will be more
realistic if it has some non-zero to mirror as well as the holes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
tests/qemu-iotests/194 | 1 +
1 file changed, 1 insertion(+)
diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index c0ce82dd257..d0b9c084f5f 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -34,6 +34,7 @@ with iotests.FilePath('source.img') as source_img_path, \
img_size = '1G'
iotests.qemu_img_create('-f', iotests.imgfmt, source_img_path, img_size)
+ iotests.qemu_io('-f', iotests.imgfmt, '-c', 'write 512M 1M', source_img_path)
iotests.qemu_img_create('-f', iotests.imgfmt, dest_img_path, img_size)
iotests.log('Launching VMs...')
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 06/13] mirror: Minor refactoring
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (4 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 05/13] iotests: Improve iotest 194 to mirror data Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
` (8 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
Commit 5791ba52 (v9.2) pre-initialized ret in mirror_dirty_init to
silence a false positive compiler warning, even though in all code
paths where ret is used, it was guaranteed to be reassigned
beforehand. But since the function returns -errno, and -1 is not
always the right errno, it's better to initialize to -EIO.
An upcoming patch wants to track two bitmaps in
do_sync_target_write(); this will be easier if the current variables
related to the dirty bitmap are renamed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/mirror.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index a53582f17bb..34c6c5252e1 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -841,7 +841,7 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
int64_t offset;
BlockDriverState *bs;
BlockDriverState *target_bs = blk_bs(s->target);
- int ret = -1;
+ int ret = -EIO;
int64_t count;
bdrv_graph_co_rdlock();
@@ -1341,7 +1341,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
{
int ret;
size_t qiov_offset = 0;
- int64_t bitmap_offset, bitmap_end;
+ int64_t dirty_bitmap_offset, dirty_bitmap_end;
if (!QEMU_IS_ALIGNED(offset, job->granularity) &&
bdrv_dirty_bitmap_get(job->dirty_bitmap, offset))
@@ -1388,11 +1388,11 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
* Tails are either clean or shrunk, so for bitmap resetting
* we safely align the range down.
*/
- bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
- bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
- if (bitmap_offset < bitmap_end) {
- bdrv_reset_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
- bitmap_end - bitmap_offset);
+ dirty_bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
+ dirty_bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
+ if (dirty_bitmap_offset < dirty_bitmap_end) {
+ bdrv_reset_dirty_bitmap(job->dirty_bitmap, dirty_bitmap_offset,
+ dirty_bitmap_end - dirty_bitmap_offset);
}
job_progress_increase_remaining(&job->common.job, bytes);
@@ -1430,10 +1430,10 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
* at function start, and they must be still dirty, as we've locked
* the region for in-flight op.
*/
- bitmap_offset = QEMU_ALIGN_DOWN(offset, job->granularity);
- bitmap_end = QEMU_ALIGN_UP(offset + bytes, job->granularity);
- bdrv_set_dirty_bitmap(job->dirty_bitmap, bitmap_offset,
- bitmap_end - bitmap_offset);
+ dirty_bitmap_offset = QEMU_ALIGN_DOWN(offset, job->granularity);
+ dirty_bitmap_end = QEMU_ALIGN_UP(offset + bytes, job->granularity);
+ bdrv_set_dirty_bitmap(job->dirty_bitmap, dirty_bitmap_offset,
+ dirty_bitmap_end - dirty_bitmap_offset);
qatomic_set(&job->actively_synced, false);
action = mirror_error_action(job, false, -ret);
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (5 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 06/13] mirror: Minor refactoring Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-02 23:17 ` Sunny Zhu
2025-05-12 15:19 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
` (7 subsequent siblings)
14 siblings, 2 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
Out of the five possible values for MirrorSyncMode, INCREMENTAL and
BITMAP are already rejected up front in mirror_start, leaving NONE,
TOP, and FULL as the remaining values that the code was collapsing
into a single bool is_none_mode. Furthermore, mirror_dirty_init() is
only reachable for modes TOP and FULL, as further guided by
s->zero_target. However, upcoming patches want to further optimize
the pre-zeroing pass of a sync=full mirror in mirror_dirty_init(),
while avoiding that pass on a sync=top action. Instead of throwing
away context by collapsing these two values into
s->is_none_mode=false, it is better to pass s->sync_mode throughout
the entire operation. For active commit, the desired semantics match
sync mode TOP.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v4: new patch
---
block/mirror.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 34c6c5252e1..2599b75d092 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
- bool is_none_mode;
+ MirrorSyncMode sync_mode;
BlockMirrorBackingMode backing_mode;
/* Whether the target image requires explicit zero-initialization */
bool zero_target;
@@ -723,9 +723,10 @@ static int mirror_exit_common(Job *job)
&error_abort);
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
- BlockDriverState *backing = s->is_none_mode ? src : s->base;
+ BlockDriverState *backing;
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
+ backing = s->sync_mode == MIRROR_SYNC_MODE_NONE ? src : s->base;
if (bdrv_cow_bs(unfiltered_target) != backing) {
bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
if (local_err) {
@@ -1020,7 +1021,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
mirror_free_init(s);
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
- if (!s->is_none_mode) {
+ if (s->sync_mode != MIRROR_SYNC_MODE_NONE) {
ret = mirror_dirty_init(s);
if (ret < 0 || job_is_cancelled(&s->common.job)) {
goto immediate_exit;
@@ -1711,6 +1712,7 @@ static BlockJob *mirror_start_job(
int creation_flags, BlockDriverState *target,
const char *replaces, int64_t speed,
uint32_t granularity, int64_t buf_size,
+ MirrorSyncMode sync_mode,
BlockMirrorBackingMode backing_mode,
bool zero_target,
BlockdevOnError on_source_error,
@@ -1719,7 +1721,7 @@ static BlockJob *mirror_start_job(
BlockCompletionFunc *cb,
void *opaque,
const BlockJobDriver *driver,
- bool is_none_mode, BlockDriverState *base,
+ BlockDriverState *base,
bool auto_complete, const char *filter_node_name,
bool is_mirror, MirrorCopyMode copy_mode,
bool base_ro,
@@ -1878,7 +1880,7 @@ static BlockJob *mirror_start_job(
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
- s->is_none_mode = is_none_mode;
+ s->sync_mode = sync_mode;
s->backing_mode = backing_mode;
s->zero_target = zero_target;
qatomic_set(&s->copy_mode, copy_mode);
@@ -2015,7 +2017,6 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
bool unmap, const char *filter_node_name,
MirrorCopyMode copy_mode, Error **errp)
{
- bool is_none_mode;
BlockDriverState *base;
GLOBAL_STATE_CODE();
@@ -2028,14 +2029,13 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
}
bdrv_graph_rdlock_main_loop();
- is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
bdrv_graph_rdunlock_main_loop();
mirror_start_job(job_id, bs, creation_flags, target, replaces,
- speed, granularity, buf_size, backing_mode, zero_target,
- on_source_error, on_target_error, unmap, NULL, NULL,
- &mirror_job_driver, is_none_mode, base, false,
+ speed, granularity, buf_size, mode, backing_mode,
+ zero_target, on_source_error, on_target_error, unmap,
+ NULL, NULL, &mirror_job_driver, base, false,
filter_node_name, true, copy_mode, false, errp);
}
@@ -2061,9 +2061,9 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job = mirror_start_job(
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
- MIRROR_LEAVE_BACKING_CHAIN, false,
+ MIRROR_SYNC_MODE_TOP, MIRROR_LEAVE_BACKING_CHAIN, false,
on_error, on_error, true, cb, opaque,
- &commit_active_job_driver, false, base, auto_complete,
+ &commit_active_job_driver, base, auto_complete,
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
base_read_only, errp);
if (!job) {
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (6 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-02 23:57 ` Sunny Zhu
` (2 more replies)
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
` (6 subsequent siblings)
14 siblings, 3 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Markus Armbruster,
John Snow, Kevin Wolf, Hanna Reitz
QEMU has an optimization for a just-created drive-mirror destination
that is not possible for blockdev-mirror (which can't create the
destination) - any time we know the destination starts life as all
zeroes, we can skip a pre-zeroing pass on the destination. Recent
patches have added an improved heuristic for detecting if a file
contains all zeroes, and we plan to use that heuristic in upcoming
patches. But since a heuristic cannot quickly detect all scenarios,
and there may be cases where the caller is aware of information that
QEMU cannot learn quickly, it makes sense to have a way to tell QEMU
to assume facts about the destination that can make the mirror
operation faster. Given our existing example of "qemu-img convert
--target-is-zero", it is time to expose this override in QMP for
blockdev-mirror as well.
This patch results in some slight redundancy between the older
s->zero_target (set any time mode==FULL and the destination image was
not just created - ie. clear if drive-mirror is asking to skip the
pre-zero pass) and the newly-introduced s->target_is_zero (in addition
to the QMP override, it is set when drive-mirror creates the
destination image); this will be cleaned up in the next patch.
There is also a subtlety that we must consider. When drive-mirror is
passing target_is_zero on behalf of a just-created image, we know the
image is sparse (skipping the pre-zeroing keeps it that way), so it
doesn't matter whether the destination also has "discard":"unmap" and
"detect-zeroes":"unmap". But now that we are letting the user set the
knob for target-is-zero, if the user passes a pre-existing file that
is fully allocated, it is fine to leave the file fully allocated under
"detect-zeroes":"on", but if the file is open with
"detect-zeroes":"unmap", we should really be trying harder to punch
holes in the destination for every region of zeroes copied from the
source. The easiest way to do this is to still run the pre-zeroing
pass (turning the entire destination file sparse before populating
just the allocated portions of the source), even though that currently
results in double I/O to the portions of the file that are allocated.
A later patch will add further optimizations to reduce redundant
zeroing I/O during the mirror operation.
Since "target-is-zero":true is designed for optimizations, it is okay
to silently ignore the parameter rather than erroring if the user ever
sets the parameter in a scenario where the mirror job can't exploit it
(for example, when doing "sync":"top" instead of "sync":"full", we
can't pre-zero, so setting the parameter won't make a speed
difference).
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
---
v4: hoist earlier in series. QMP design is unchanged, but logic in
mirror_dirty_init is different (in many aspects simpler, but now
catering to "detect-zeroes":"unmap") so Acked-by on QMP kept, but
Reviewed-by dropped.
---
qapi/block-core.json | 8 +++++++-
include/block/block_int-global-state.h | 3 ++-
block/mirror.c | 27 ++++++++++++++++++++++----
blockdev.c | 18 ++++++++++-------
tests/unit/test-block-iothread.c | 2 +-
5 files changed, 44 insertions(+), 14 deletions(-)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b1937780e19..7f70ec6d3cb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2538,6 +2538,11 @@
# disappear from the query list without user intervention.
# Defaults to true. (Since 3.1)
#
+# @target-is-zero: Assume the destination reads as all zeroes before
+# the mirror started. Setting this to true can speed up the
+# mirror. Setting this to true when the destination is not
+# actually all zero can corrupt the destination. (Since 10.1)
+#
# Since: 2.6
#
# .. qmp-example::
@@ -2557,7 +2562,8 @@
'*on-target-error': 'BlockdevOnError',
'*filter-node-name': 'str',
'*copy-mode': 'MirrorCopyMode',
- '*auto-finalize': 'bool', '*auto-dismiss': 'bool' },
+ '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
+ '*target-is-zero': 'bool'},
'allow-preconfig': true }
##
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index eb2d92a2261..8cf0003ce78 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -140,6 +140,7 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
* @mode: Whether to collapse all images in the chain to the target.
* @backing_mode: How to establish the target's backing chain after completion.
* @zero_target: Whether the target should be explicitly zero-initialized
+ * @target_is_zero: Whether the target already is zero-initialized.
* @on_source_error: The action to take upon error reading from the source.
* @on_target_error: The action to take upon error writing to the target.
* @unmap: Whether to unmap target where source sectors only contain zeroes.
@@ -159,7 +160,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
- bool zero_target,
+ bool zero_target, bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap, const char *filter_node_name,
diff --git a/block/mirror.c b/block/mirror.c
index 2599b75d092..4dcb50c81ad 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -55,6 +55,8 @@ typedef struct MirrorBlockJob {
BlockMirrorBackingMode backing_mode;
/* Whether the target image requires explicit zero-initialization */
bool zero_target;
+ /* Whether the target should be assumed to be already zero initialized */
+ bool target_is_zero;
/*
* To be accesssed with atomics. Written only under the BQL (required by the
* current implementation of mirror_change()).
@@ -844,12 +846,26 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
BlockDriverState *target_bs = blk_bs(s->target);
int ret = -EIO;
int64_t count;
+ bool punch_holes =
+ target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
+ bdrv_can_write_zeroes_with_unmap(target_bs);
bdrv_graph_co_rdlock();
bs = s->mirror_top_bs->backing->bs;
bdrv_graph_co_rdunlock();
- if (s->zero_target) {
+ if (s->zero_target && (!s->target_is_zero || punch_holes)) {
+ /*
+ * Here, we are in FULL mode; our goal is to avoid writing
+ * zeroes if the destination already reads as zero, except
+ * when we are trying to punch holes. This is possible if
+ * zeroing happened externally (s->target_is_zero) or if we
+ * have a fast way to pre-zero the image (the dirty bitmap
+ * will be populated later by the non-zero portions, the same
+ * as for TOP mode). If pre-zeroing is not fast, or we need
+ * to punch holes, then our only recourse is to write the
+ * entire image.
+ */
if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
return 0;
@@ -1714,7 +1730,7 @@ static BlockJob *mirror_start_job(
uint32_t granularity, int64_t buf_size,
MirrorSyncMode sync_mode,
BlockMirrorBackingMode backing_mode,
- bool zero_target,
+ bool zero_target, bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
@@ -1883,6 +1899,7 @@ static BlockJob *mirror_start_job(
s->sync_mode = sync_mode;
s->backing_mode = backing_mode;
s->zero_target = zero_target;
+ s->target_is_zero = target_is_zero;
qatomic_set(&s->copy_mode, copy_mode);
s->base = base;
s->base_overlay = bdrv_find_overlay(bs, base);
@@ -2011,7 +2028,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
- bool zero_target,
+ bool zero_target, bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap, const char *filter_node_name,
@@ -2034,7 +2051,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
mirror_start_job(job_id, bs, creation_flags, target, replaces,
speed, granularity, buf_size, mode, backing_mode,
- zero_target, on_source_error, on_target_error, unmap,
+ zero_target,
+ target_is_zero, on_source_error, on_target_error, unmap,
NULL, NULL, &mirror_job_driver, base, false,
filter_node_name, true, copy_mode, false, errp);
}
@@ -2062,6 +2080,7 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job = mirror_start_job(
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_SYNC_MODE_TOP, MIRROR_LEAVE_BACKING_CHAIN, false,
+ false,
on_error, on_error, true, cb, opaque,
&commit_active_job_driver, base, auto_complete,
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
diff --git a/blockdev.c b/blockdev.c
index 1d1f27cfff6..2e2fed539ee 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2798,7 +2798,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
const char *replaces,
enum MirrorSyncMode sync,
BlockMirrorBackingMode backing_mode,
- bool zero_target,
+ bool zero_target, bool target_is_zero,
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -2909,11 +2909,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
/* pass the node name to replace to mirror start since it's loose coupling
* and will allow to check whether the node still exist at mirror completion
*/
- mirror_start(job_id, bs, target,
- replaces, job_flags,
+ mirror_start(job_id, bs, target, replaces, job_flags,
speed, granularity, buf_size, sync, backing_mode, zero_target,
- on_source_error, on_target_error, unmap, filter_node_name,
- copy_mode, errp);
+ target_is_zero, on_source_error, on_target_error, unmap,
+ filter_node_name, copy_mode, errp);
}
void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -2928,6 +2927,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
int64_t size;
const char *format = arg->format;
bool zero_target;
+ bool target_is_zero;
int ret;
bs = qmp_get_root_bs(arg->device, errp);
@@ -3044,6 +3044,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
(arg->mode == NEW_IMAGE_MODE_EXISTING ||
!bdrv_has_zero_init(target_bs)));
+ target_is_zero = (arg->mode != NEW_IMAGE_MODE_EXISTING &&
+ bdrv_has_zero_init(target_bs));
bdrv_graph_rdunlock_main_loop();
@@ -3055,7 +3057,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->job_id, bs, target_bs,
arg->replaces, arg->sync,
- backing_mode, zero_target,
+ backing_mode, zero_target, target_is_zero,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
arg->has_buf_size, arg->buf_size,
@@ -3085,6 +3087,7 @@ void qmp_blockdev_mirror(const char *job_id,
bool has_copy_mode, MirrorCopyMode copy_mode,
bool has_auto_finalize, bool auto_finalize,
bool has_auto_dismiss, bool auto_dismiss,
+ bool has_target_is_zero, bool target_is_zero,
Error **errp)
{
BlockDriverState *bs;
@@ -3115,7 +3118,8 @@ void qmp_blockdev_mirror(const char *job_id,
blockdev_mirror_common(job_id, bs, target_bs,
replaces, sync, backing_mode,
- zero_target, has_speed, speed,
+ zero_target, has_target_is_zero && target_is_zero,
+ has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
has_on_source_error, on_source_error,
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index e26b3be5939..54aed8252c0 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -755,7 +755,7 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
- MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false,
+ MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false, false,
BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
&error_abort);
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 09/13] mirror: Drop redundant zero_target parameter
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (7 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-03 0:47 ` Sunny Zhu
` (2 more replies)
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
` (5 subsequent siblings)
14 siblings, 3 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
The two callers to a mirror job (drive-mirror and blockdev-mirror) set
zero_target precisely when sync mode == FULL, with the one exception
that drive-mirror skips zeroing the target if it was newly created and
reads as zero. But given the previous patch, that exception is
equally captured by target_is_zero. And since we recently updated
things to pass the sync mode all the way through to
mirror_dirty_init(), we can now reconstruct the same conditionals
without the redundant parameter.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v4: new patch
---
include/block/block_int-global-state.h | 3 +--
block/mirror.c | 13 +++++--------
blockdev.c | 15 ++++-----------
tests/unit/test-block-iothread.c | 2 +-
4 files changed, 11 insertions(+), 22 deletions(-)
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index 8cf0003ce78..d21bd7fd2f5 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -139,7 +139,6 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
* @buf_size: The amount of data that can be in flight at one time.
* @mode: Whether to collapse all images in the chain to the target.
* @backing_mode: How to establish the target's backing chain after completion.
- * @zero_target: Whether the target should be explicitly zero-initialized
* @target_is_zero: Whether the target already is zero-initialized.
* @on_source_error: The action to take upon error reading from the source.
* @on_target_error: The action to take upon error writing to the target.
@@ -160,7 +159,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
- bool zero_target, bool target_is_zero,
+ bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap, const char *filter_node_name,
diff --git a/block/mirror.c b/block/mirror.c
index 4dcb50c81ad..d04db85883d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -53,8 +53,6 @@ typedef struct MirrorBlockJob {
Error *replace_blocker;
MirrorSyncMode sync_mode;
BlockMirrorBackingMode backing_mode;
- /* Whether the target image requires explicit zero-initialization */
- bool zero_target;
/* Whether the target should be assumed to be already zero initialized */
bool target_is_zero;
/*
@@ -854,7 +852,9 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
bs = s->mirror_top_bs->backing->bs;
bdrv_graph_co_rdunlock();
- if (s->zero_target && (!s->target_is_zero || punch_holes)) {
+ if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
+ /* In TOP mode, there is no benefit to a pre-zeroing pass. */
+ } else if (!s->target_is_zero || punch_holes) {
/*
* Here, we are in FULL mode; our goal is to avoid writing
* zeroes if the destination already reads as zero, except
@@ -1730,7 +1730,7 @@ static BlockJob *mirror_start_job(
uint32_t granularity, int64_t buf_size,
MirrorSyncMode sync_mode,
BlockMirrorBackingMode backing_mode,
- bool zero_target, bool target_is_zero,
+ bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
@@ -1898,7 +1898,6 @@ static BlockJob *mirror_start_job(
s->on_target_error = on_target_error;
s->sync_mode = sync_mode;
s->backing_mode = backing_mode;
- s->zero_target = zero_target;
s->target_is_zero = target_is_zero;
qatomic_set(&s->copy_mode, copy_mode);
s->base = base;
@@ -2028,7 +2027,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
- bool zero_target, bool target_is_zero,
+ bool target_is_zero,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap, const char *filter_node_name,
@@ -2051,7 +2050,6 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
mirror_start_job(job_id, bs, creation_flags, target, replaces,
speed, granularity, buf_size, mode, backing_mode,
- zero_target,
target_is_zero, on_source_error, on_target_error, unmap,
NULL, NULL, &mirror_job_driver, base, false,
filter_node_name, true, copy_mode, false, errp);
@@ -2080,7 +2078,6 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job = mirror_start_job(
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_SYNC_MODE_TOP, MIRROR_LEAVE_BACKING_CHAIN, false,
- false,
on_error, on_error, true, cb, opaque,
&commit_active_job_driver, base, auto_complete,
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
diff --git a/blockdev.c b/blockdev.c
index 2e2fed539ee..7eb1ffb6396 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2798,7 +2798,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
const char *replaces,
enum MirrorSyncMode sync,
BlockMirrorBackingMode backing_mode,
- bool zero_target, bool target_is_zero,
+ bool target_is_zero,
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -2910,7 +2910,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target, replaces, job_flags,
- speed, granularity, buf_size, sync, backing_mode, zero_target,
+ speed, granularity, buf_size, sync, backing_mode,
target_is_zero, on_source_error, on_target_error, unmap,
filter_node_name, copy_mode, errp);
}
@@ -2926,7 +2926,6 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
int flags;
int64_t size;
const char *format = arg->format;
- bool zero_target;
bool target_is_zero;
int ret;
@@ -3041,9 +3040,6 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
}
bdrv_graph_rdlock_main_loop();
- zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
- (arg->mode == NEW_IMAGE_MODE_EXISTING ||
- !bdrv_has_zero_init(target_bs)));
target_is_zero = (arg->mode != NEW_IMAGE_MODE_EXISTING &&
bdrv_has_zero_init(target_bs));
bdrv_graph_rdunlock_main_loop();
@@ -3057,7 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->job_id, bs, target_bs,
arg->replaces, arg->sync,
- backing_mode, zero_target, target_is_zero,
+ backing_mode, target_is_zero,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
arg->has_buf_size, arg->buf_size,
@@ -3094,7 +3090,6 @@ void qmp_blockdev_mirror(const char *job_id,
BlockDriverState *target_bs;
AioContext *aio_context;
BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
- bool zero_target;
int ret;
bs = qmp_get_root_bs(device, errp);
@@ -3107,8 +3102,6 @@ void qmp_blockdev_mirror(const char *job_id,
return;
}
- zero_target = (sync == MIRROR_SYNC_MODE_FULL);
-
aio_context = bdrv_get_aio_context(bs);
ret = bdrv_try_change_aio_context(target_bs, aio_context, NULL, errp);
@@ -3118,7 +3111,7 @@ void qmp_blockdev_mirror(const char *job_id,
blockdev_mirror_common(job_id, bs, target_bs,
replaces, sync, backing_mode,
- zero_target, has_target_is_zero && target_is_zero,
+ has_target_is_zero && target_is_zero,
has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 54aed8252c0..e26b3be5939 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -755,7 +755,7 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
- MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false, false,
+ MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false,
BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
&error_abort);
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (8 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-03 0:51 ` Sunny Zhu
2025-05-12 18:40 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
` (4 subsequent siblings)
14 siblings, 2 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
When doing a sync=full mirroring, we can skip pre-zeroing the
destination if it already reads as zeroes and we are not also trying
to punch holes due to detect-zeroes. With this patch, there are fewer
scenarios that have to pass in an explicit target-is-zero, while still
resulting in a sparse destination remaining sparse.
A later patch will then further improve things to skip writing to the
destination for parts of the image where the source is zero; but even
with just this patch, it is possible to see a difference for any
source that does not report itself as fully allocated, coupled with a
destination BDS that can quickly report that it already reads as zero.
(For a source that reports as fully allocated, such as a file, the
rest of mirror_dirty_init() still sets the entire dirty bitmap to
true, so even though we avoided the pre-zeroing, we are not yet
avoiding all redundant I/O).
Iotest 194 detects the difference made by this patch: for a file
source (where block status reports the entire image as allocated, and
therefore we end up writing zeroes everywhere in the destination
anyways), the job length remains the same. But for a qcow2 source and
a destination that reads as all zeroes, the dirty bitmap changes to
just tracking the allocated portions of the source, which results in
faster completion and smaller job statistics. For the test to pass
with both ./check -file and -qcow2, a new python filter is needed to
mask out the now-varying job amounts (this matches the shell filters
_filter_block_job_{offset,len} in common.filter). A later test will
also be added which further validates expected sparseness, so it does
not matter that 194 is no longer explicitly looking at how many bytes
were copied.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v3: add exemption for "detect-zeroes":"unmap" on destination
v4: Rebase later in series, revise logic for pre-zeroing [Sunny], add
in python filter
---
block/mirror.c | 24 ++++++++++++++++--------
tests/qemu-iotests/194 | 6 ++++--
tests/qemu-iotests/194.out | 4 ++--
tests/qemu-iotests/iotests.py | 12 +++++++++++-
4 files changed, 33 insertions(+), 13 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index d04db85883d..bca99ec206b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -848,23 +848,31 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
bdrv_can_write_zeroes_with_unmap(target_bs);
+ /* Determine if the image is already zero, regardless of sync mode. */
bdrv_graph_co_rdlock();
bs = s->mirror_top_bs->backing->bs;
+ if (s->target_is_zero) {
+ ret = 1;
+ } else {
+ ret = bdrv_co_is_all_zeroes(target_bs);
+ }
bdrv_graph_co_rdunlock();
- if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
+ /* Determine if a pre-zeroing pass is necessary. */
+ if (ret < 0) {
+ return ret;
+ } else if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
/* In TOP mode, there is no benefit to a pre-zeroing pass. */
- } else if (!s->target_is_zero || punch_holes) {
+ } else if (ret == 0 || punch_holes) {
/*
* Here, we are in FULL mode; our goal is to avoid writing
* zeroes if the destination already reads as zero, except
* when we are trying to punch holes. This is possible if
- * zeroing happened externally (s->target_is_zero) or if we
- * have a fast way to pre-zero the image (the dirty bitmap
- * will be populated later by the non-zero portions, the same
- * as for TOP mode). If pre-zeroing is not fast, or we need
- * to punch holes, then our only recourse is to write the
- * entire image.
+ * zeroing happened externally (ret > 0) or if we have a fast
+ * way to pre-zero the image (the dirty bitmap will be
+ * populated later by the non-zero portions, the same as for
+ * TOP mode). If pre-zeroing is not fast, or we need to punch
+ * holes, then our only recourse is to write the entire image.
*/
if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index d0b9c084f5f..e114c0b2695 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -62,7 +62,8 @@ with iotests.FilePath('source.img') as source_img_path, \
iotests.log('Waiting for `drive-mirror` to complete...')
iotests.log(source_vm.event_wait('BLOCK_JOB_READY'),
- filters=[iotests.filter_qmp_event])
+ filters=[iotests.filter_qmp_event,
+ iotests.filter_block_job])
iotests.log('Starting migration...')
capabilities = [{'capability': 'events', 'state': True},
@@ -88,7 +89,8 @@ with iotests.FilePath('source.img') as source_img_path, \
while True:
event2 = source_vm.event_wait('BLOCK_JOB_COMPLETED')
- iotests.log(event2, filters=[iotests.filter_qmp_event])
+ iotests.log(event2, filters=[iotests.filter_qmp_event,
+ iotests.filter_block_job])
if event2['event'] == 'BLOCK_JOB_COMPLETED':
iotests.log('Stopping the NBD server on destination...')
iotests.log(dest_vm.qmp('nbd-server-stop'))
diff --git a/tests/qemu-iotests/194.out b/tests/qemu-iotests/194.out
index 6940e809cde..d02655a5147 100644
--- a/tests/qemu-iotests/194.out
+++ b/tests/qemu-iotests/194.out
@@ -7,7 +7,7 @@ Launching NBD server on destination...
Starting `drive-mirror` on source...
{"return": {}}
Waiting for `drive-mirror` to complete...
-{"data": {"device": "mirror-job0", "len": 1073741824, "offset": 1073741824, "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"device": "mirror-job0", "len": "LEN", "offset": "OFFSET", "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
Starting migration...
{"return": {}}
{"execute": "migrate-start-postcopy", "arguments": {}}
@@ -18,7 +18,7 @@ Starting migration...
{"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
Gracefully ending the `drive-mirror` job on source...
{"return": {}}
-{"data": {"device": "mirror-job0", "len": 1073741824, "offset": 1073741824, "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"device": "mirror-job0", "len": "LEN", "offset": "OFFSET", "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
Stopping the NBD server on destination...
{"return": {}}
Wait for migration completion on target...
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 7292c8b342a..05274772ce4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -601,13 +601,23 @@ def filter_chown(msg):
return chown_re.sub("chown UID:GID", msg)
def filter_qmp_event(event):
- '''Filter a QMP event dict'''
+ '''Filter the timestamp of a QMP event dict'''
event = dict(event)
if 'timestamp' in event:
event['timestamp']['seconds'] = 'SECS'
event['timestamp']['microseconds'] = 'USECS'
return event
+def filter_block_job(event):
+ '''Filter the offset and length of a QMP block job event dict'''
+ event = dict(event)
+ if 'data' in event:
+ if 'offset' in event['data']:
+ event['data']['offset'] = 'OFFSET'
+ if 'len' in event['data']:
+ event['data']['len'] = 'LEN'
+ return event
+
def filter_qmp(qmsg, filter_fn):
'''Given a string filter, filter a QMP object's values.
filter_fn takes a (key, value) pair.'''
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (9 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-12 18:43 ` Stefan Hajnoczi
` (2 more replies)
2025-05-09 20:40 ` [PATCH v4 12/13] iotests/common.rc: add disk_usage function Eric Blake
` (3 subsequent siblings)
14 siblings, 3 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
When mirroring, the goal is to ensure that the destination reads the
same as the source; this goal is met whether the destination is sparse
or fully-allocated (except when explicitly punching holes, then merely
reading zero is not enough to know if it is sparse, so we still want
to punch the hole). Avoiding a redundant write to zero (whether in
the background because the zero cluster was marked in the dirty
bitmap, or in the foreground because the guest is writing zeroes) when
the destination already reads as zero makes mirroring faster, and
avoids allocating the destination merely because the source reports as
allocated.
The effect is especially pronounced when the source is a raw file.
That's because when the source is a qcow2 file, the dirty bitmap only
visits the portions of the source that are allocated, which tend to be
non-zero. But when the source is a raw file,
bdrv_co_is_allocated_above() reports the entire file as allocated so
mirror_dirty_init sets the entire dirty bitmap, and it is only later
during mirror_iteration that we change to consulting the more precise
bdrv_co_block_status_above() to learn where the source reads as zero.
Remember that since a mirror operation can write a cluster more than
once (every time the guest changes the source, the destination is also
changed to keep up), and the guest can change whether a given cluster
reads as zero, is discarded, or has non-zero data over the course of
the mirror operation, we can't take the shortcut of relying on
s->target_is_zero (which is static for the life of the job) in
mirror_co_zero() to see if the destination is already zero, because
that information may be stale. Any solution we use must be dynamic in
the face of the guest writing or discarding a cluster while the mirror
has been ongoing.
We could just teach mirror_co_zero() to do a block_status() probe of
the destination, and skip the zeroes if the destination already reads
as zero, but we know from past experience that extra block_status()
calls are not always cheap (tmpfs, anyone?), especially when they are
random access rather than linear. Use of block_status() of the source
by the background task in a linear fashion is not our bottleneck (it's
a background task, after all); but since mirroring can be done while
the source is actively being changed, we don't want a slow
block_status() of the destination to occur on the hot path of the
guest trying to do random-access writes to the source.
So this patch takes a slightly different approach: any time we have to
track dirty clusters, we can also track which clusters are known to
read as zero. For sync=TOP or when we are punching holes from
"detect-zeroes":"unmap", the zero bitmap starts out empty, but
prevents a second write zero to a cluster that was already zero by an
earlier pass; for sync=FULL when we are not punching holes, the zero
bitmap starts out full if the destination reads as zero during
initialization. Either way, I/O to the destination can now avoid
redundant write zero to a cluster that already reads as zero, all
without having to do a block_status() per write on the destination.
With this patch, if I create a raw sparse destination file, connect it
with QMP 'blockdev-add' while leaving it at the default "discard":
"ignore", then run QMP 'blockdev-mirror' with "sync": "full", the
destination remains sparse rather than fully allocated. Meanwhile, a
destination image that is already fully allocated remains so unless it
was opened with "detect-zeroes": "unmap". And any time writing zeroes
is skipped, the job counters are not incremented.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v3: also skip counters when skipping I/O [Sunny]
v4: rebase to earlier changes
---
block/mirror.c | 107 ++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 93 insertions(+), 14 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index bca99ec206b..724318f0371 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -73,6 +73,7 @@ typedef struct MirrorBlockJob {
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
+ unsigned long *zero_bitmap;
BdrvDirtyBitmap *dirty_bitmap;
BdrvDirtyBitmapIter *dbi;
uint8_t *buf;
@@ -108,9 +109,12 @@ struct MirrorOp {
int64_t offset;
uint64_t bytes;
- /* The pointee is set by mirror_co_read(), mirror_co_zero(), and
- * mirror_co_discard() before yielding for the first time */
+ /*
+ * These pointers are set by mirror_co_read(), mirror_co_zero(), and
+ * mirror_co_discard() before yielding for the first time
+ */
int64_t *bytes_handled;
+ bool *io_skipped;
bool is_pseudo_op;
bool is_active_write;
@@ -408,15 +412,34 @@ static void coroutine_fn mirror_co_read(void *opaque)
static void coroutine_fn mirror_co_zero(void *opaque)
{
MirrorOp *op = opaque;
- int ret;
+ bool write_needed = true;
+ int ret = 0;
op->s->in_flight++;
op->s->bytes_in_flight += op->bytes;
*op->bytes_handled = op->bytes;
op->is_in_flight = true;
- ret = blk_co_pwrite_zeroes(op->s->target, op->offset, op->bytes,
- op->s->unmap ? BDRV_REQ_MAY_UNMAP : 0);
+ if (op->s->zero_bitmap) {
+ unsigned long end = DIV_ROUND_UP(op->offset + op->bytes,
+ op->s->granularity);
+ assert(QEMU_IS_ALIGNED(op->offset, op->s->granularity));
+ assert(QEMU_IS_ALIGNED(op->bytes, op->s->granularity) ||
+ op->offset + op->bytes == op->s->bdev_length);
+ if (find_next_zero_bit(op->s->zero_bitmap, end,
+ op->offset / op->s->granularity) == end) {
+ write_needed = false;
+ *op->io_skipped = true;
+ }
+ }
+ if (write_needed) {
+ ret = blk_co_pwrite_zeroes(op->s->target, op->offset, op->bytes,
+ op->s->unmap ? BDRV_REQ_MAY_UNMAP : 0);
+ }
+ if (ret >= 0 && op->s->zero_bitmap) {
+ bitmap_set(op->s->zero_bitmap, op->offset / op->s->granularity,
+ DIV_ROUND_UP(op->bytes, op->s->granularity));
+ }
mirror_write_complete(op, ret);
}
@@ -435,29 +458,43 @@ static void coroutine_fn mirror_co_discard(void *opaque)
}
static unsigned mirror_perform(MirrorBlockJob *s, int64_t offset,
- unsigned bytes, MirrorMethod mirror_method)
+ unsigned bytes, MirrorMethod mirror_method,
+ bool *io_skipped)
{
MirrorOp *op;
Coroutine *co;
int64_t bytes_handled = -1;
+ assert(QEMU_IS_ALIGNED(offset, s->granularity));
+ assert(QEMU_IS_ALIGNED(bytes, s->granularity) ||
+ offset + bytes == s->bdev_length);
op = g_new(MirrorOp, 1);
*op = (MirrorOp){
.s = s,
.offset = offset,
.bytes = bytes,
.bytes_handled = &bytes_handled,
+ .io_skipped = io_skipped,
};
qemu_co_queue_init(&op->waiting_requests);
switch (mirror_method) {
case MIRROR_METHOD_COPY:
+ if (s->zero_bitmap) {
+ bitmap_clear(s->zero_bitmap, offset / s->granularity,
+ DIV_ROUND_UP(bytes, s->granularity));
+ }
co = qemu_coroutine_create(mirror_co_read, op);
break;
case MIRROR_METHOD_ZERO:
+ /* s->zero_bitmap handled in mirror_co_zero */
co = qemu_coroutine_create(mirror_co_zero, op);
break;
case MIRROR_METHOD_DISCARD:
+ if (s->zero_bitmap) {
+ bitmap_clear(s->zero_bitmap, offset / s->granularity,
+ DIV_ROUND_UP(bytes, s->granularity));
+ }
co = qemu_coroutine_create(mirror_co_discard, op);
break;
default:
@@ -568,6 +605,7 @@ static void coroutine_fn GRAPH_UNLOCKED mirror_iteration(MirrorBlockJob *s)
int ret = -1;
int64_t io_bytes;
int64_t io_bytes_acct;
+ bool io_skipped = false;
MirrorMethod mirror_method = MIRROR_METHOD_COPY;
assert(!(offset % s->granularity));
@@ -611,8 +649,10 @@ static void coroutine_fn GRAPH_UNLOCKED mirror_iteration(MirrorBlockJob *s)
}
io_bytes = mirror_clip_bytes(s, offset, io_bytes);
- io_bytes = mirror_perform(s, offset, io_bytes, mirror_method);
- if (mirror_method != MIRROR_METHOD_COPY && write_zeroes_ok) {
+ io_bytes = mirror_perform(s, offset, io_bytes, mirror_method,
+ &io_skipped);
+ if (io_skipped ||
+ (mirror_method != MIRROR_METHOD_COPY && write_zeroes_ok)) {
io_bytes_acct = 0;
} else {
io_bytes_acct = io_bytes;
@@ -847,8 +887,10 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
bool punch_holes =
target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
bdrv_can_write_zeroes_with_unmap(target_bs);
+ int64_t bitmap_length = DIV_ROUND_UP(s->bdev_length, s->granularity);
/* Determine if the image is already zero, regardless of sync mode. */
+ s->zero_bitmap = bitmap_new(bitmap_length);
bdrv_graph_co_rdlock();
bs = s->mirror_top_bs->backing->bs;
if (s->target_is_zero) {
@@ -862,7 +904,14 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
if (ret < 0) {
return ret;
} else if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
- /* In TOP mode, there is no benefit to a pre-zeroing pass. */
+ /*
+ * In TOP mode, there is no benefit to a pre-zeroing pass, but
+ * the zero bitmap can be set if the destination already reads
+ * as zero and we are not punching holes.
+ */
+ if (ret > 0 && !punch_holes) {
+ bitmap_set(s->zero_bitmap, 0, bitmap_length);
+ }
} else if (ret == 0 || punch_holes) {
/*
* Here, we are in FULL mode; our goal is to avoid writing
@@ -871,8 +920,9 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
* zeroing happened externally (ret > 0) or if we have a fast
* way to pre-zero the image (the dirty bitmap will be
* populated later by the non-zero portions, the same as for
- * TOP mode). If pre-zeroing is not fast, or we need to punch
- * holes, then our only recourse is to write the entire image.
+ * TOP mode). If pre-zeroing is not fast, then our only
+ * recourse is to mark the entire image dirty. The act of
+ * pre-zeroing will populate the zero bitmap.
*/
if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
@@ -883,6 +933,7 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
for (offset = 0; offset < s->bdev_length; ) {
int bytes = MIN(s->bdev_length - offset,
QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
+ bool ignored;
mirror_throttle(s);
@@ -898,12 +949,15 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
continue;
}
- mirror_perform(s, offset, bytes, MIRROR_METHOD_ZERO);
+ mirror_perform(s, offset, bytes, MIRROR_METHOD_ZERO, &ignored);
offset += bytes;
}
mirror_wait_for_all_io(s);
s->initial_zeroing_ongoing = false;
+ } else {
+ /* In FULL mode, and image already reads as zero. */
+ bitmap_set(s->zero_bitmap, 0, bitmap_length);
}
/* First part, loop on the sectors and initialize the dirty bitmap. */
@@ -1188,6 +1242,7 @@ immediate_exit:
assert(s->in_flight == 0);
qemu_vfree(s->buf);
g_free(s->cow_bitmap);
+ g_free(s->zero_bitmap);
g_free(s->in_flight_bitmap);
bdrv_dirty_iter_free(s->dbi);
@@ -1367,6 +1422,7 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
int ret;
size_t qiov_offset = 0;
int64_t dirty_bitmap_offset, dirty_bitmap_end;
+ int64_t zero_bitmap_offset, zero_bitmap_end;
if (!QEMU_IS_ALIGNED(offset, job->granularity) &&
bdrv_dirty_bitmap_get(job->dirty_bitmap, offset))
@@ -1410,8 +1466,9 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
}
/*
- * Tails are either clean or shrunk, so for bitmap resetting
- * we safely align the range down.
+ * Tails are either clean or shrunk, so for dirty bitmap resetting
+ * we safely align the range narrower. But for zero bitmap, round
+ * range wider for checking or clearing, and narrower for setting.
*/
dirty_bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
dirty_bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
@@ -1419,22 +1476,44 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
bdrv_reset_dirty_bitmap(job->dirty_bitmap, dirty_bitmap_offset,
dirty_bitmap_end - dirty_bitmap_offset);
}
+ zero_bitmap_offset = offset / job->granularity;
+ zero_bitmap_end = DIV_ROUND_UP(offset + bytes, job->granularity);
job_progress_increase_remaining(&job->common.job, bytes);
job->active_write_bytes_in_flight += bytes;
switch (method) {
case MIRROR_METHOD_COPY:
+ if (job->zero_bitmap) {
+ bitmap_clear(job->zero_bitmap, zero_bitmap_offset,
+ zero_bitmap_end - zero_bitmap_offset);
+ }
ret = blk_co_pwritev_part(job->target, offset, bytes,
qiov, qiov_offset, flags);
break;
case MIRROR_METHOD_ZERO:
+ if (job->zero_bitmap) {
+ if (find_next_zero_bit(job->zero_bitmap, zero_bitmap_end,
+ zero_bitmap_offset) == zero_bitmap_end) {
+ ret = 0;
+ break;
+ }
+ }
assert(!qiov);
ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
+ if (job->zero_bitmap && ret >= 0) {
+ bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
+ (dirty_bitmap_end - dirty_bitmap_offset) /
+ job->granularity);
+ }
break;
case MIRROR_METHOD_DISCARD:
+ if (job->zero_bitmap) {
+ bitmap_clear(job->zero_bitmap, zero_bitmap_offset,
+ zero_bitmap_end - zero_bitmap_offset);
+ }
assert(!qiov);
ret = blk_co_pdiscard(job->target, offset, bytes);
break;
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 12/13] iotests/common.rc: add disk_usage function
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (10 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches Eric Blake
` (2 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Andrey Drobyshev,
Alexander Ivanov, Alberto Garcia, Kevin Wolf, Hanna Reitz
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Move the definition from iotests/250 to common.rc. This is used to
detect real disk usage of sparse files. In particular, we want to use
it for checking subclusters-based discards.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-ID: <20240913163942.423050-6-andrey.drobyshev@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
tests/qemu-iotests/common.rc | 6 ++++++
tests/qemu-iotests/250 | 5 -----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 95c12577dd4..237f746af88 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -140,6 +140,12 @@ _optstr_add()
fi
}
+# report real disk usage for sparse files
+disk_usage()
+{
+ du --block-size=1 "$1" | awk '{print $1}'
+}
+
# Set the variables to the empty string to turn Valgrind off
# for specific processes, e.g.
# $ VALGRIND_QEMU_IO= ./check -qcow2 -valgrind 015
diff --git a/tests/qemu-iotests/250 b/tests/qemu-iotests/250
index af48f83abac..c0a0dbc0ff1 100755
--- a/tests/qemu-iotests/250
+++ b/tests/qemu-iotests/250
@@ -52,11 +52,6 @@ _unsupported_imgopts data_file
# bdrv_co_truncate(bs->file) call in qcow2_co_truncate(), which might succeed
# anyway.
-disk_usage()
-{
- du --block-size=1 $1 | awk '{print $1}'
-}
-
size=2100M
_make_test_img -o "cluster_size=1M,preallocation=metadata" $size
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (11 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 12/13] iotests/common.rc: add disk_usage function Eric Blake
@ 2025-05-09 20:40 ` Eric Blake
2025-05-12 18:44 ` Stefan Hajnoczi
2025-05-12 18:44 ` [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Stefan Hajnoczi
2025-05-13 22:00 ` [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap Eric Blake
14 siblings, 1 reply; 34+ messages in thread
From: Eric Blake @ 2025-05-09 20:40 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, Kevin Wolf,
Hanna Reitz
Prove that blockdev-mirror can now result in sparse raw destination
files, regardless of whether the source is raw or qcow2. By making
this a separate test, it was possible to test effects of individual
patches for the various pieces that all have to work together for a
sparse mirror to be successful.
Note that ./check -file produces different job lengths than ./check
-qcow2 (the test uses a filter to normalize); that's because when
deciding how much of the image to be mirrored, the code looks at how
much of the source image was allocated (for qcow2, this is only the
written clusters; for raw, it is the entire file). But the important
part is that the destination file ends up smaller than 3M, rather than
the 20M it used to be before this patch series.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v3: Improve test with more cases
v4: drop -machine q35 from command line [Stefan], update to
match rework in rest of series
---
tests/qemu-iotests/tests/mirror-sparse | 125 +++++++
tests/qemu-iotests/tests/mirror-sparse.out | 365 +++++++++++++++++++++
2 files changed, 490 insertions(+)
create mode 100755 tests/qemu-iotests/tests/mirror-sparse
create mode 100644 tests/qemu-iotests/tests/mirror-sparse.out
diff --git a/tests/qemu-iotests/tests/mirror-sparse b/tests/qemu-iotests/tests/mirror-sparse
new file mode 100755
index 00000000000..8c52a4e2448
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-sparse
@@ -0,0 +1,125 @@
+#!/usr/bin/env bash
+# group: rw auto quick
+#
+# Test blockdev-mirror with raw sparse destination
+#
+# Copyright (C) 2025 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_test_img
+ _cleanup_qemu
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2 raw # Format of the source. dst is always raw file
+_supported_proto file
+_supported_os Linux
+
+echo
+echo "=== Initial image setup ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 20M
+$QEMU_IO -c 'w 8M 2M' -f $IMGFMT "$TEST_IMG.base" | _filter_qemu_io
+
+_launch_qemu \
+ -blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+ "filename":"'"$TEST_IMG.base"'", "node-name":"src-file"}' \
+ -blockdev '{"driver":"'$IMGFMT'", "node-name":"src", "file":"src-file"}'
+h1=$QEMU_HANDLE
+_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return'
+
+# Check several combinations; most should result in a sparse destination;
+# the destination should only be fully allocated if pre-allocated
+# and not punching holes due to detect-zeroes
+# do_test creation discard zeroes result
+do_test() {
+ creation=$1
+ discard=$2
+ zeroes=$3
+ expected=$4
+
+echo
+echo "=== Testing creation=$creation discard=$discard zeroes=$zeroes ==="
+echo
+
+rm -f $TEST_IMG
+if test $creation = external; then
+ truncate --size=20M $TEST_IMG
+else
+ _send_qemu_cmd $h1 '{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"'$TEST_IMG'",
+ "size":'$((20*1024*1024))', "preallocation":"'$creation'"},
+ "job-id":"job1"}}' 'concluded'
+ _send_qemu_cmd $h1 '{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}' 'return'
+fi
+_send_qemu_cmd $h1 '{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"'$TEST_IMG'", "aio":"threads",
+ "auto-read-only":true, "discard":"'$discard'",
+ "detect-zeroes":"'$zeroes'"}}' 'return'
+_send_qemu_cmd $h1 '{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}' 'return'
+_timed_wait_for $h1 '"ready"'
+_send_qemu_cmd $h1 '{"execute": "job-complete", "arguments":
+ {"id":"job2"}}' 'return' \
+ | _filter_block_job_offset | _filter_block_job_len
+_send_qemu_cmd $h1 '{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}' 'return' \
+ | _filter_block_job_offset | _filter_block_job_len
+$QEMU_IMG compare -U -f $IMGFMT -F raw $TEST_IMG.base $TEST_IMG
+result=$(disk_usage $TEST_IMG)
+if test $result -lt $((3*1024*1024)); then
+ actual=sparse
+elif test $result = $((20*1024*1024)); then
+ actual=full
+else
+ actual=unknown
+fi
+echo "Destination is $actual; expected $expected"
+}
+
+do_test external ignore off sparse
+do_test external unmap off sparse
+do_test external unmap unmap sparse
+do_test off ignore off sparse
+do_test off unmap off sparse
+do_test off unmap unmap sparse
+do_test full ignore off full
+do_test full unmap off sparse
+do_test full unmap unmap sparse
+
+_send_qemu_cmd $h1 '{"execute":"quit"}' ''
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/tests/mirror-sparse.out b/tests/qemu-iotests/tests/mirror-sparse.out
new file mode 100644
index 00000000000..2103b891c3f
--- /dev/null
+++ b/tests/qemu-iotests/tests/mirror-sparse.out
@@ -0,0 +1,365 @@
+QA output created by mirror-sparse
+
+=== Initial image setup ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=20971520
+wrote 2097152/2097152 bytes at offset 8388608
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"execute": "qmp_capabilities"}
+{"return": {}}
+
+=== Testing creation=external discard=ignore zeroes=off ===
+
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"ignore",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=external discard=unmap zeroes=off ===
+
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=external discard=unmap zeroes=unmap ===
+
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"unmap"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=off discard=ignore zeroes=off ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"off"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"ignore",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=off discard=unmap zeroes=off ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"off"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=off discard=unmap zeroes=unmap ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"off"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"unmap"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=full discard=ignore zeroes=off ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"full"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"ignore",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is full; expected full
+
+=== Testing creation=full discard=unmap zeroes=off ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"full"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"off"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+
+=== Testing creation=full discard=unmap zeroes=unmap ===
+
+{"execute": "blockdev-create", "arguments":
+ {"options": {"driver":"file", "filename":"TEST_DIR/t.IMGFMT",
+ "size":20971520, "preallocation":"full"},
+ "job-id":"job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job1"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job1"}}
+{"execute": "job-dismiss", "arguments":
+ {"id": "job1"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job1"}}
+{"return": {}}
+{"execute": "blockdev-add", "arguments":
+ {"node-name": "dst", "driver":"file",
+ "filename":"TEST_DIR/t.IMGFMT", "aio":"threads",
+ "auto-read-only":true, "discard":"unmap",
+ "detect-zeroes":"unmap"}}
+{"return": {}}
+{"execute":"blockdev-mirror", "arguments":
+ {"sync":"full", "device":"src", "target":"dst",
+ "job-id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job2"}}
+{"return": {}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job2"}}
+{"execute": "job-complete", "arguments":
+ {"id":"job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"return": {}}
+{"execute": "blockdev-del", "arguments":
+ {"node-name": "dst"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "job2", "len": LEN, "offset": OFFSET, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "job2"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "job2"}}
+{"return": {}}
+Images are identical.
+Destination is sparse; expected sparse
+{"execute":"quit"}
+*** done
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
2025-05-02 23:17 ` Sunny Zhu
@ 2025-05-12 15:19 ` Stefan Hajnoczi
1 sibling, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 15:19 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 1084 bytes --]
On Fri, May 09, 2025 at 03:40:24PM -0500, Eric Blake wrote:
> Out of the five possible values for MirrorSyncMode, INCREMENTAL and
> BITMAP are already rejected up front in mirror_start, leaving NONE,
> TOP, and FULL as the remaining values that the code was collapsing
> into a single bool is_none_mode. Furthermore, mirror_dirty_init() is
> only reachable for modes TOP and FULL, as further guided by
> s->zero_target. However, upcoming patches want to further optimize
> the pre-zeroing pass of a sync=full mirror in mirror_dirty_init(),
> while avoiding that pass on a sync=top action. Instead of throwing
> away context by collapsing these two values into
> s->is_none_mode=false, it is better to pass s->sync_mode throughout
> the entire operation. For active commit, the desired semantics match
> sync mode TOP.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v4: new patch
> ---
> block/mirror.c | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
2025-05-02 23:57 ` Sunny Zhu
2025-05-03 0:40 ` Sunny Zhu
@ 2025-05-12 15:23 ` Stefan Hajnoczi
2 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 15:23 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, Markus Armbruster,
John Snow, Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 3595 bytes --]
On Fri, May 09, 2025 at 03:40:25PM -0500, Eric Blake wrote:
> QEMU has an optimization for a just-created drive-mirror destination
> that is not possible for blockdev-mirror (which can't create the
> destination) - any time we know the destination starts life as all
> zeroes, we can skip a pre-zeroing pass on the destination. Recent
> patches have added an improved heuristic for detecting if a file
> contains all zeroes, and we plan to use that heuristic in upcoming
> patches. But since a heuristic cannot quickly detect all scenarios,
> and there may be cases where the caller is aware of information that
> QEMU cannot learn quickly, it makes sense to have a way to tell QEMU
> to assume facts about the destination that can make the mirror
> operation faster. Given our existing example of "qemu-img convert
> --target-is-zero", it is time to expose this override in QMP for
> blockdev-mirror as well.
>
> This patch results in some slight redundancy between the older
> s->zero_target (set any time mode==FULL and the destination image was
> not just created - ie. clear if drive-mirror is asking to skip the
> pre-zero pass) and the newly-introduced s->target_is_zero (in addition
> to the QMP override, it is set when drive-mirror creates the
> destination image); this will be cleaned up in the next patch.
>
> There is also a subtlety that we must consider. When drive-mirror is
> passing target_is_zero on behalf of a just-created image, we know the
> image is sparse (skipping the pre-zeroing keeps it that way), so it
> doesn't matter whether the destination also has "discard":"unmap" and
> "detect-zeroes":"unmap". But now that we are letting the user set the
> knob for target-is-zero, if the user passes a pre-existing file that
> is fully allocated, it is fine to leave the file fully allocated under
> "detect-zeroes":"on", but if the file is open with
> "detect-zeroes":"unmap", we should really be trying harder to punch
> holes in the destination for every region of zeroes copied from the
> source. The easiest way to do this is to still run the pre-zeroing
> pass (turning the entire destination file sparse before populating
> just the allocated portions of the source), even though that currently
> results in double I/O to the portions of the file that are allocated.
> A later patch will add further optimizations to reduce redundant
> zeroing I/O during the mirror operation.
>
> Since "target-is-zero":true is designed for optimizations, it is okay
> to silently ignore the parameter rather than erroring if the user ever
> sets the parameter in a scenario where the mirror job can't exploit it
> (for example, when doing "sync":"top" instead of "sync":"full", we
> can't pre-zero, so setting the parameter won't make a speed
> difference).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Acked-by: Markus Armbruster <armbru@redhat.com>
>
> ---
>
> v4: hoist earlier in series. QMP design is unchanged, but logic in
> mirror_dirty_init is different (in many aspects simpler, but now
> catering to "detect-zeroes":"unmap") so Acked-by on QMP kept, but
> Reviewed-by dropped.
> ---
> qapi/block-core.json | 8 +++++++-
> include/block/block_int-global-state.h | 3 ++-
> block/mirror.c | 27 ++++++++++++++++++++++----
> blockdev.c | 18 ++++++++++-------
> tests/unit/test-block-iothread.c | 2 +-
> 5 files changed, 44 insertions(+), 14 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 09/13] mirror: Drop redundant zero_target parameter
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
2025-05-03 0:47 ` Sunny Zhu
@ 2025-05-12 15:24 ` Stefan Hajnoczi
2025-05-14 22:09 ` Eric Blake
2 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 15:24 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 1008 bytes --]
On Fri, May 09, 2025 at 03:40:26PM -0500, Eric Blake wrote:
> The two callers to a mirror job (drive-mirror and blockdev-mirror) set
> zero_target precisely when sync mode == FULL, with the one exception
> that drive-mirror skips zeroing the target if it was newly created and
> reads as zero. But given the previous patch, that exception is
> equally captured by target_is_zero. And since we recently updated
> things to pass the sync mode all the way through to
> mirror_dirty_init(), we can now reconstruct the same conditionals
> without the redundant parameter.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v4: new patch
> ---
> include/block/block_int-global-state.h | 3 +--
> block/mirror.c | 13 +++++--------
> blockdev.c | 15 ++++-----------
> tests/unit/test-block-iothread.c | 2 +-
> 4 files changed, 11 insertions(+), 22 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero
2025-05-02 23:57 ` Sunny Zhu
@ 2025-05-12 17:30 ` Eric Blake
0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-12 17:30 UTC (permalink / raw)
To: Sunny Zhu
Cc: armbru, hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha,
vsementsov
On Sat, May 03, 2025 at 07:57:24AM +0800, Sunny Zhu wrote:
> On Fri, May 09, 2025 at 03:40:25PM -0500, Eric Blake wrote:
> > QEMU has an optimization for a just-created drive-mirror destination
> > that is not possible for blockdev-mirror (which can't create the
> > destination) - any time we know the destination starts life as all
> > zeroes, we can skip a pre-zeroing pass on the destination. Recent
> > patches have added an improved heuristic for detecting if a file
> > contains all zeroes, and we plan to use that heuristic in upcoming
> > patches. But since a heuristic cannot quickly detect all scenarios,
> > and there may be cases where the caller is aware of information that
> > QEMU cannot learn quickly, it makes sense to have a way to tell QEMU
> > to assume facts about the destination that can make the mirror
> > operation faster. Given our existing example of "qemu-img convert
> > --target-is-zero", it is time to expose this override in QMP for
> > blockdev-mirror as well.
> >
> > +++ b/qapi/block-core.json
> > @@ -2538,6 +2538,11 @@
> > # disappear from the query list without user intervention.
> > # Defaults to true. (Since 3.1)
> > #
> > +# @target-is-zero: Assume the destination reads as all zeroes before
> > +# the mirror started. Setting this to true can speed up the
> > +# mirror. Setting this to true when the destination is not
> > +# actually all zero can corrupt the destination. (Since 10.1)
>
> It appears feasible to add target-is-zero to the drive-mirror qmp as well,
> provided this is done under the NEW_IMAGE_MODE_EXISTING scenario. This would
> allow users to pass target-is-zero for optimizations whether using blockdev-mirror
> or drive-mirror when they have pre-prepared destination image.
I didn't want to touch drive-mirror, since it is an older command that
has some awkward semantics that were improved on when we added
blockdev-mirror (that's why libvirt only uses blockdev-mirror). If we
still think it is worth bolting features onto the old command, instead
of encouraging users to migrate to the new one, then that should be a
separate patch.
> > @@ -3044,6 +3044,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
> > zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
> > (arg->mode == NEW_IMAGE_MODE_EXISTING ||
> > !bdrv_has_zero_init(target_bs)));
> > + target_is_zero = (arg->mode != NEW_IMAGE_MODE_EXISTING &&
> > + bdrv_has_zero_init(target_bs));
>
> Similarly, if the qmp_drive_mirror supports the target-is-zero parameter, we would then
> be able to determine whether the destination image is fully zero-initialized when
> arg->mode is NEW_IMAGE_MODE_EXISTING.
Determination of whether a pre-existing image is fully
zero-initialized for drive-mirror comes for free later in the patch
series (at the same time that blockdev-mirror is able to infer that
information without an explicit target-is-zero parameter).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
2025-05-03 0:51 ` Sunny Zhu
@ 2025-05-12 18:40 ` Stefan Hajnoczi
1 sibling, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 18:40 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 2430 bytes --]
On Fri, May 09, 2025 at 03:40:27PM -0500, Eric Blake wrote:
> When doing a sync=full mirroring, we can skip pre-zeroing the
> destination if it already reads as zeroes and we are not also trying
> to punch holes due to detect-zeroes. With this patch, there are fewer
> scenarios that have to pass in an explicit target-is-zero, while still
> resulting in a sparse destination remaining sparse.
>
> A later patch will then further improve things to skip writing to the
> destination for parts of the image where the source is zero; but even
> with just this patch, it is possible to see a difference for any
> source that does not report itself as fully allocated, coupled with a
> destination BDS that can quickly report that it already reads as zero.
> (For a source that reports as fully allocated, such as a file, the
> rest of mirror_dirty_init() still sets the entire dirty bitmap to
> true, so even though we avoided the pre-zeroing, we are not yet
> avoiding all redundant I/O).
>
> Iotest 194 detects the difference made by this patch: for a file
> source (where block status reports the entire image as allocated, and
> therefore we end up writing zeroes everywhere in the destination
> anyways), the job length remains the same. But for a qcow2 source and
> a destination that reads as all zeroes, the dirty bitmap changes to
> just tracking the allocated portions of the source, which results in
> faster completion and smaller job statistics. For the test to pass
> with both ./check -file and -qcow2, a new python filter is needed to
> mask out the now-varying job amounts (this matches the shell filters
> _filter_block_job_{offset,len} in common.filter). A later test will
> also be added which further validates expected sparseness, so it does
> not matter that 194 is no longer explicitly looking at how many bytes
> were copied.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: add exemption for "detect-zeroes":"unmap" on destination
> v4: Rebase later in series, revise logic for pre-zeroing [Sunny], add
> in python filter
> ---
> block/mirror.c | 24 ++++++++++++++++--------
> tests/qemu-iotests/194 | 6 ++++--
> tests/qemu-iotests/194.out | 4 ++--
> tests/qemu-iotests/iotests.py | 12 +++++++++++-
> 4 files changed, 33 insertions(+), 13 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
@ 2025-05-12 18:43 ` Stefan Hajnoczi
2025-05-13 21:37 ` Eric Blake
2025-05-14 15:37 ` Sunny Zhu
2 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 18:43 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 4186 bytes --]
On Fri, May 09, 2025 at 03:40:28PM -0500, Eric Blake wrote:
> When mirroring, the goal is to ensure that the destination reads the
> same as the source; this goal is met whether the destination is sparse
> or fully-allocated (except when explicitly punching holes, then merely
> reading zero is not enough to know if it is sparse, so we still want
> to punch the hole). Avoiding a redundant write to zero (whether in
> the background because the zero cluster was marked in the dirty
> bitmap, or in the foreground because the guest is writing zeroes) when
> the destination already reads as zero makes mirroring faster, and
> avoids allocating the destination merely because the source reports as
> allocated.
>
> The effect is especially pronounced when the source is a raw file.
> That's because when the source is a qcow2 file, the dirty bitmap only
> visits the portions of the source that are allocated, which tend to be
> non-zero. But when the source is a raw file,
> bdrv_co_is_allocated_above() reports the entire file as allocated so
> mirror_dirty_init sets the entire dirty bitmap, and it is only later
> during mirror_iteration that we change to consulting the more precise
> bdrv_co_block_status_above() to learn where the source reads as zero.
>
> Remember that since a mirror operation can write a cluster more than
> once (every time the guest changes the source, the destination is also
> changed to keep up), and the guest can change whether a given cluster
> reads as zero, is discarded, or has non-zero data over the course of
> the mirror operation, we can't take the shortcut of relying on
> s->target_is_zero (which is static for the life of the job) in
> mirror_co_zero() to see if the destination is already zero, because
> that information may be stale. Any solution we use must be dynamic in
> the face of the guest writing or discarding a cluster while the mirror
> has been ongoing.
>
> We could just teach mirror_co_zero() to do a block_status() probe of
> the destination, and skip the zeroes if the destination already reads
> as zero, but we know from past experience that extra block_status()
> calls are not always cheap (tmpfs, anyone?), especially when they are
> random access rather than linear. Use of block_status() of the source
> by the background task in a linear fashion is not our bottleneck (it's
> a background task, after all); but since mirroring can be done while
> the source is actively being changed, we don't want a slow
> block_status() of the destination to occur on the hot path of the
> guest trying to do random-access writes to the source.
>
> So this patch takes a slightly different approach: any time we have to
> track dirty clusters, we can also track which clusters are known to
> read as zero. For sync=TOP or when we are punching holes from
> "detect-zeroes":"unmap", the zero bitmap starts out empty, but
> prevents a second write zero to a cluster that was already zero by an
> earlier pass; for sync=FULL when we are not punching holes, the zero
> bitmap starts out full if the destination reads as zero during
> initialization. Either way, I/O to the destination can now avoid
> redundant write zero to a cluster that already reads as zero, all
> without having to do a block_status() per write on the destination.
>
> With this patch, if I create a raw sparse destination file, connect it
> with QMP 'blockdev-add' while leaving it at the default "discard":
> "ignore", then run QMP 'blockdev-mirror' with "sync": "full", the
> destination remains sparse rather than fully allocated. Meanwhile, a
> destination image that is already fully allocated remains so unless it
> was opened with "detect-zeroes": "unmap". And any time writing zeroes
> is skipped, the job counters are not incremented.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: also skip counters when skipping I/O [Sunny]
> v4: rebase to earlier changes
> ---
> block/mirror.c | 107 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 93 insertions(+), 14 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches
2025-05-09 20:40 ` [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches Eric Blake
@ 2025-05-12 18:44 ` Stefan Hajnoczi
0 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 18:44 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, Kevin Wolf,
Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 1430 bytes --]
On Fri, May 09, 2025 at 03:40:30PM -0500, Eric Blake wrote:
> Prove that blockdev-mirror can now result in sparse raw destination
> files, regardless of whether the source is raw or qcow2. By making
> this a separate test, it was possible to test effects of individual
> patches for the various pieces that all have to work together for a
> sparse mirror to be successful.
>
> Note that ./check -file produces different job lengths than ./check
> -qcow2 (the test uses a filter to normalize); that's because when
> deciding how much of the image to be mirrored, the code looks at how
> much of the source image was allocated (for qcow2, this is only the
> written clusters; for raw, it is the entire file). But the important
> part is that the destination file ends up smaller than 3M, rather than
> the 20M it used to be before this patch series.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: Improve test with more cases
> v4: drop -machine q35 from command line [Stefan], update to
> match rework in rest of series
> ---
> tests/qemu-iotests/tests/mirror-sparse | 125 +++++++
> tests/qemu-iotests/tests/mirror-sparse.out | 365 +++++++++++++++++++++
> 2 files changed, 490 insertions(+)
> create mode 100755 tests/qemu-iotests/tests/mirror-sparse
> create mode 100644 tests/qemu-iotests/tests/mirror-sparse.out
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (12 preceding siblings ...)
2025-05-09 20:40 ` [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches Eric Blake
@ 2025-05-12 18:44 ` Stefan Hajnoczi
2025-05-13 22:00 ` [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap Eric Blake
14 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-12 18:44 UTC (permalink / raw)
To: Eric Blake; +Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov
[-- Attachment #1: Type: text/plain, Size: 4543 bytes --]
On Fri, May 09, 2025 at 03:40:17PM -0500, Eric Blake wrote:
> v3 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg04525.html
>
> In v4:
> - Rearrange series a bit to try and simplify logic for how zero
> bitmap is populated, when pre-zeroing is attempted, and how
> detect-zeroes=unmap interacts [Sunny]
> - Add a couple new patches to make this easier to follow
> - iotest tweaks [Stefan]
>
> 001/13:[----] [--] 'block: Expand block status mode from bool to flags'
> 002/13:[----] [--] 'file-posix, gluster: Handle zero block status hint better'
> 003/13:[----] [--] 'block: Let bdrv_co_is_zero_fast consolidate adjacent extents'
> 004/13:[----] [--] 'block: Add new bdrv_co_is_all_zeroes() function'
> 005/13:[----] [--] 'iotests: Improve iotest 194 to mirror data'
> 006/13:[----] [--] 'mirror: Minor refactoring'
> 007/13:[down] 'mirror: Pass full sync mode rather than bool to internals'
> 008/13:[0038] [FC] 'mirror: Allow QMP override to declare target already zero'
> 009/13:[down] 'mirror: Drop redundant zero_target parameter'
> 010/13:[0063] [FC] 'mirror: Skip pre-zeroing destination if it is already zero'
> 011/13:[0021] [FC] 'mirror: Skip writing zeroes when target is already zero'
> 012/13:[----] [--] 'iotests/common.rc: add disk_usage function'
> 013/13:[0013] [FC] 'tests: Add iotest mirror-sparse for recent patches'
>
> Andrey Drobyshev (1):
> iotests/common.rc: add disk_usage function
>
> Eric Blake (12):
> block: Expand block status mode from bool to flags
> file-posix, gluster: Handle zero block status hint better
> block: Let bdrv_co_is_zero_fast consolidate adjacent extents
> block: Add new bdrv_co_is_all_zeroes() function
> iotests: Improve iotest 194 to mirror data
> mirror: Minor refactoring
> mirror: Pass full sync mode rather than bool to internals
> mirror: Allow QMP override to declare target already zero
> mirror: Drop redundant zero_target parameter
> mirror: Skip pre-zeroing destination if it is already zero
> mirror: Skip writing zeroes when target is already zero
> tests: Add iotest mirror-sparse for recent patches
>
> qapi/block-core.json | 8 +-
> block/coroutines.h | 4 +-
> include/block/block-common.h | 11 +
> include/block/block-io.h | 2 +
> include/block/block_int-common.h | 27 +-
> include/block/block_int-global-state.h | 4 +-
> include/block/block_int-io.h | 4 +-
> block/io.c | 126 +++++--
> block/blkdebug.c | 6 +-
> block/copy-before-write.c | 4 +-
> block/file-posix.c | 5 +-
> block/gluster.c | 4 +-
> block/iscsi.c | 6 +-
> block/mirror.c | 183 ++++++++---
> block/nbd.c | 4 +-
> block/null.c | 6 +-
> block/parallels.c | 6 +-
> block/qcow.c | 2 +-
> block/qcow2.c | 6 +-
> block/qed.c | 6 +-
> block/quorum.c | 4 +-
> block/raw-format.c | 4 +-
> block/rbd.c | 6 +-
> block/snapshot-access.c | 4 +-
> block/vdi.c | 4 +-
> block/vmdk.c | 2 +-
> block/vpc.c | 2 +-
> block/vvfat.c | 6 +-
> blockdev.c | 27 +-
> tests/unit/test-block-iothread.c | 2 +-
> tests/qemu-iotests/common.rc | 6 +
> tests/qemu-iotests/194 | 7 +-
> tests/qemu-iotests/194.out | 4 +-
> tests/qemu-iotests/250 | 5 -
> tests/qemu-iotests/iotests.py | 12 +-
> tests/qemu-iotests/tests/mirror-sparse | 125 +++++++
> tests/qemu-iotests/tests/mirror-sparse.out | 365 +++++++++++++++++++++
> 37 files changed, 850 insertions(+), 159 deletions(-)
> create mode 100755 tests/qemu-iotests/tests/mirror-sparse
> create mode 100644 tests/qemu-iotests/tests/mirror-sparse.out
>
> --
> 2.49.0
>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
2025-05-12 18:43 ` Stefan Hajnoczi
@ 2025-05-13 21:37 ` Eric Blake
2025-05-14 15:37 ` Sunny Zhu
2 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-13 21:37 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
On Fri, May 09, 2025 at 03:40:28PM -0500, Eric Blake wrote:
> When mirroring, the goal is to ensure that the destination reads the
> same as the source; this goal is met whether the destination is sparse
> or fully-allocated (except when explicitly punching holes, then merely
> reading zero is not enough to know if it is sparse, so we still want
> to punch the hole). Avoiding a redundant write to zero (whether in
> the background because the zero cluster was marked in the dirty
> bitmap, or in the foreground because the guest is writing zeroes) when
> the destination already reads as zero makes mirroring faster, and
> avoids allocating the destination merely because the source reports as
> allocated.
I just realized: when we are trying to punch holes, a pre-zeroing pass
still does double I/O on portions of the source that are allocated
(once to punch the hole, another to write it back to non-zero). That
was the best earlier patches could do, without fine-grained zero
tracking. But with this patch, we can do better:
> @@ -847,8 +887,10 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> bool punch_holes =
> target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
> bdrv_can_write_zeroes_with_unmap(target_bs);
> + int64_t bitmap_length = DIV_ROUND_UP(s->bdev_length, s->granularity);
>
> /* Determine if the image is already zero, regardless of sync mode. */
> + s->zero_bitmap = bitmap_new(bitmap_length);
> bdrv_graph_co_rdlock();
> bs = s->mirror_top_bs->backing->bs;
> if (s->target_is_zero) {
> @@ -862,7 +904,14 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> if (ret < 0) {
> return ret;
> } else if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
> - /* In TOP mode, there is no benefit to a pre-zeroing pass. */
> + /*
> + * In TOP mode, there is no benefit to a pre-zeroing pass, but
> + * the zero bitmap can be set if the destination already reads
> + * as zero and we are not punching holes.
> + */
> + if (ret > 0 && !punch_holes) {
> + bitmap_set(s->zero_bitmap, 0, bitmap_length);
> + }
> } else if (ret == 0 || punch_holes) {
> /*
> * Here, we are in FULL mode; our goal is to avoid writing
> @@ -871,8 +920,9 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> * zeroing happened externally (ret > 0) or if we have a fast
> * way to pre-zero the image (the dirty bitmap will be
> * populated later by the non-zero portions, the same as for
> - * TOP mode). If pre-zeroing is not fast, or we need to punch
> - * holes, then our only recourse is to write the entire image.
> + * TOP mode). If pre-zeroing is not fast, then our only
> + * recourse is to mark the entire image dirty. The act of
> + * pre-zeroing will populate the zero bitmap.
> */
> if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
> bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
If we are punching holes, leaving the zero bitmap clear (at this
stage) and setting the entire dirty bitmap will cause the entire image
to be visited, and punch holes only where they are needed, without a
pre-zeroing pass.
I will add that as a followup patch (no need to invalidate the reviews
on this one).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
` (13 preceding siblings ...)
2025-05-12 18:44 ` [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Stefan Hajnoczi
@ 2025-05-13 22:00 ` Eric Blake
2025-05-14 13:20 ` Stefan Hajnoczi
14 siblings, 1 reply; 34+ messages in thread
From: Eric Blake @ 2025-05-13 22:00 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
If we are going to punch holes in the mirror destination even for the
portions where the source image is unallocated, it is nicer to treat
the entire image as dirty and punch as we go, rather than pre-zeroing
the entire image just to re-do I/O to the allocated portions of the
image.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
block/mirror.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 724318f0371..c2c5099c951 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -920,11 +920,16 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
* zeroing happened externally (ret > 0) or if we have a fast
* way to pre-zero the image (the dirty bitmap will be
* populated later by the non-zero portions, the same as for
- * TOP mode). If pre-zeroing is not fast, then our only
- * recourse is to mark the entire image dirty. The act of
- * pre-zeroing will populate the zero bitmap.
+ * TOP mode). If pre-zeroing is not fast, or we need to visit
+ * the entire image in order to punch holes even in the
+ * non-allocated regions of the source, then just mark the
+ * entire image dirty and leave the zero bitmap clear at this
+ * point in time. Otherwise, it can be faster to pre-zero the
+ * image now, even if we re-write the allocated portions of
+ * the disk later, and the pre-zero pass will populate the
+ * zero bitmap.
*/
- if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
+ if (!bdrv_can_write_zeroes_with_unmap(target_bs) || punch_holes) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
return 0;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap
2025-05-13 22:00 ` [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap Eric Blake
@ 2025-05-14 13:20 ` Stefan Hajnoczi
0 siblings, 0 replies; 34+ messages in thread
From: Stefan Hajnoczi @ 2025-05-14 13:20 UTC (permalink / raw)
To: Eric Blake
Cc: qemu-devel, qemu-block, sunnyzhyy, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
[-- Attachment #1: Type: text/plain, Size: 564 bytes --]
On Tue, May 13, 2025 at 05:00:45PM -0500, Eric Blake wrote:
> If we are going to punch holes in the mirror destination even for the
> portions where the source image is unallocated, it is nicer to treat
> the entire image as dirty and punch as we go, rather than pre-zeroing
> the entire image just to re-do I/O to the allocated portions of the
> image.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> block/mirror.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
2025-05-12 18:43 ` Stefan Hajnoczi
2025-05-13 21:37 ` Eric Blake
@ 2025-05-14 15:37 ` Sunny Zhu
2025-05-14 16:30 ` Eric Blake
2 siblings, 1 reply; 34+ messages in thread
From: Sunny Zhu @ 2025-05-14 15:37 UTC (permalink / raw)
To: eblake
Cc: hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha, sunnyzhyy,
vsementsov
On Fri, 9 May 2025 15:40:28 -0500, Eric Blake wrote:
> @@ -847,8 +887,10 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> bool punch_holes =
> target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
> bdrv_can_write_zeroes_with_unmap(target_bs);
> + int64_t bitmap_length = DIV_ROUND_UP(s->bdev_length, s->granularity);
>
> /* Determine if the image is already zero, regardless of sync mode. */
> + s->zero_bitmap = bitmap_new(bitmap_length);
> bdrv_graph_co_rdlock();
> bs = s->mirror_top_bs->backing->bs;
> if (s->target_is_zero) {
> @@ -862,7 +904,14 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> if (ret < 0) {
> return ret;
> } else if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
> - /* In TOP mode, there is no benefit to a pre-zeroing pass. */
> + /*
> + * In TOP mode, there is no benefit to a pre-zeroing pass, but
> + * the zero bitmap can be set if the destination already reads
> + * as zero and we are not punching holes.
> + */
> + if (ret > 0 && !punch_holes) {
> + bitmap_set(s->zero_bitmap, 0, bitmap_length);
It's ok when ret > 0 is obtained through bdrv_co_is_all_zeroes(target_bs). However,
if ret = 1 originates by target-is-zero == true from qmp_blockdev_mirror, this means
that target-is-zero also takes effect under sync=TOP. I am uncertain whether this
aligns with our intended behavior.
Under sync=TOP, target-is-zero could carry two distinct meanings:
[1] The top snapshot reads as fully zero.
[2] The entire destination snapshot chain reads as fully zero.
Currently, target-is-zero is designed to represent scenario [2] on sync=TOP.
> + }
> } else if (ret == 0 || punch_holes) {
> /*
> * Here, we are in FULL mode; our goal is to avoid writing
> @@ -871,8 +920,9 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> * zeroing happened externally (ret > 0) or if we have a fast
> * way to pre-zero the image (the dirty bitmap will be
> * populated later by the non-zero portions, the same as for
> - * TOP mode). If pre-zeroing is not fast, or we need to punch
> - * holes, then our only recourse is to write the entire image.
> + * TOP mode). If pre-zeroing is not fast, then our only
> + * recourse is to mark the entire image dirty. The act of
> + * pre-zeroing will populate the zero bitmap.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero
2025-05-14 15:37 ` Sunny Zhu
@ 2025-05-14 16:30 ` Eric Blake
0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-14 16:30 UTC (permalink / raw)
To: Sunny Zhu
Cc: hreitz, jsnow, kwolf, qemu-block, qemu-devel, stefanha,
vsementsov
On Wed, May 14, 2025 at 11:37:54PM +0800, Sunny Zhu wrote:
> On Fri, 9 May 2025 15:40:28 -0500, Eric Blake wrote:
> > @@ -847,8 +887,10 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> > bool punch_holes =
> > target_bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP &&
> > bdrv_can_write_zeroes_with_unmap(target_bs);
> > + int64_t bitmap_length = DIV_ROUND_UP(s->bdev_length, s->granularity);
> >
> > /* Determine if the image is already zero, regardless of sync mode. */
> > + s->zero_bitmap = bitmap_new(bitmap_length);
> > bdrv_graph_co_rdlock();
> > bs = s->mirror_top_bs->backing->bs;
> > if (s->target_is_zero) {
> > @@ -862,7 +904,14 @@ static int coroutine_fn GRAPH_UNLOCKED mirror_dirty_init(MirrorBlockJob *s)
> > if (ret < 0) {
> > return ret;
> > } else if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
> > - /* In TOP mode, there is no benefit to a pre-zeroing pass. */
> > + /*
> > + * In TOP mode, there is no benefit to a pre-zeroing pass, but
> > + * the zero bitmap can be set if the destination already reads
> > + * as zero and we are not punching holes.
> > + */
> > + if (ret > 0 && !punch_holes) {
> > + bitmap_set(s->zero_bitmap, 0, bitmap_length);
>
> It's ok when ret > 0 is obtained through bdrv_co_is_all_zeroes(target_bs). However,
> if ret = 1 originates by target-is-zero == true from qmp_blockdev_mirror, this means
> that target-is-zero also takes effect under sync=TOP. I am uncertain whether this
> aligns with our intended behavior.
>
> Under sync=TOP, target-is-zero could carry two distinct meanings:
> [1] The top snapshot reads as fully zero.
> [2] The entire destination snapshot chain reads as fully zero.
>
> Currently, target-is-zero is designed to represent scenario [2] on sync=TOP.
As written, we are telling the user they should not set target-is-zero
unless the entire destination (including through any snapshot chain)
reads as zero. Offhand (but untested) I think you can get by with
passing target-is-zero:true even when the destination is not fully
zero, _IF_ all portions of the image that will be mirrored read as
zero (and by doing that, the portions that will not be mirrored are
not cleared as a result). If my assumption is plausible, that would
be a slick way to do a sync=TOP mirror to copy just the dirty portions
of an image into a destination that already contains all contents from
the clean portions of the image prior to the point in time of the
dirty bitmap, in order to rehydrate a full image from an incremental
mirror. But given that it is not tested, I wouldn't recommend trying
it in a production environment.
I doubt that sync=TOP users will ever want to manually set
target-is-zero:true, but don't see any reason to change the series as
reviewed, because the documentation for target-is-zero was clear that
that manually setting it when the image is not actually zero can
possibly cause corruption (even if it ends up working without
corruption in some corner cases).
Meanwhile, the only effect of having the zero bitmap set when
target-is-zero was passed to sync=TOP (even if that is not what most
users will do) is that _if_ the portion of the image that is dirty and
needs to be mirrored reads as zero on the source, it will not be
written to the destination. It doesn't punch holes in the portions of
the destionation that correspond to unallocated bits of the source.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 09/13] mirror: Drop redundant zero_target parameter
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
2025-05-03 0:47 ` Sunny Zhu
2025-05-12 15:24 ` Stefan Hajnoczi
@ 2025-05-14 22:09 ` Eric Blake
2025-05-15 1:11 ` Eric Blake
2 siblings, 1 reply; 34+ messages in thread
From: Eric Blake @ 2025-05-14 22:09 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
On Fri, May 09, 2025 at 03:40:26PM -0500, Eric Blake wrote:
> The two callers to a mirror job (drive-mirror and blockdev-mirror) set
> zero_target precisely when sync mode == FULL, with the one exception
> that drive-mirror skips zeroing the target if it was newly created and
> reads as zero. But given the previous patch, that exception is
> equally captured by target_is_zero. And since we recently updated
> things to pass the sync mode all the way through to
> mirror_dirty_init(), we can now reconstruct the same conditionals
> without the redundant parameter.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v4: new patch
I was about to send the pull request, and noticed that this patch
reliably makes iotest 185 fail:
{"return": {}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "mirror", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
---- Writing data to the virtio-blk device ---
...
-*** done
+Timeout waiting for event BLOCK_JOB_READY on handle 5
Failures: 185
Investigating now...
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v4 09/13] mirror: Drop redundant zero_target parameter
2025-05-14 22:09 ` Eric Blake
@ 2025-05-15 1:11 ` Eric Blake
0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2025-05-15 1:11 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, sunnyzhyy, stefanha, vsementsov, John Snow,
Kevin Wolf, Hanna Reitz
On Wed, May 14, 2025 at 05:09:14PM -0500, Eric Blake wrote:
> On Fri, May 09, 2025 at 03:40:26PM -0500, Eric Blake wrote:
> > The two callers to a mirror job (drive-mirror and blockdev-mirror) set
> > zero_target precisely when sync mode == FULL, with the one exception
> > that drive-mirror skips zeroing the target if it was newly created and
> > reads as zero. But given the previous patch, that exception is
> > equally captured by target_is_zero. And since we recently updated
> > things to pass the sync mode all the way through to
> > mirror_dirty_init(), we can now reconstruct the same conditionals
> > without the redundant parameter.
> >
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> >
> > ---
> >
> > v4: new patch
>
> I was about to send the pull request, and noticed that this patch
> reliably makes iotest 185 fail:
>
> {"return": {}}
> -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "mirror", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
> ---- Writing data to the virtio-blk device ---
> ...
> -*** done
> +Timeout waiting for event BLOCK_JOB_READY on handle 5
> Failures: 185
>
>
> Investigating now...
Oh, that was extremely subtle.
Pre-patch, zero_target is set to false while sync == SYNC_TOP prior to
calling blockdev.c:blockdev_mirror_common(). But in that function, we
were forcefully slamming sync to MIRROR_SYNC_MODE_FULL if
bdrv_backing_chain_next(bs) == NULL. Once zero_target is no longer
available, that means any mirror job using "sync":"top" but with no
backing image (which is effectively syncing the entire chain) now
defaults to pre-zeroing - and since 185 intentionally throttles
things, the newly-triggered pre-zeroing takes so long that the test
times out.
Fortunately, an examination of all places in mirror.c that care about
sync_mode TOP vs FULL shows that prior to this series, they were
mostly treated identically everywhere (only zero_target mattered)
thanks to the is_none_mode bool. The lone exception was in
mirror_start, with this statement:
base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
but we know that the caller slammed mode to MODE_FULL precisely if it
used to be TOP and bdrv_backing_chain_next(bs) == NULL. So if the
caller doesn't slam sync to FULL, this ternary still sets base to the
same value. And we are back to skipping the pre-zeroing pass for
"sync":"top", the way test 185 wants.
Hence, I'm squashing this in, and preparing the pull request.
diff --git i/blockdev.c w/blockdev.c
index 93d403d8210..21443b45144 100644
--- i/blockdev.c
+++ w/blockdev.c
@@ -2871,10 +2871,6 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
return;
}
- if (!bdrv_backing_chain_next(bs) && sync == MIRROR_SYNC_MODE_TOP) {
- sync = MIRROR_SYNC_MODE_FULL;
- }
-
if (!replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
^ permalink raw reply related [flat|nested] 34+ messages in thread
end of thread, other threads:[~2025-05-15 1:12 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-09 20:40 [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Eric Blake
2025-05-09 20:40 ` [PATCH v4 01/13] block: Expand block status mode from bool to flags Eric Blake
2025-05-09 20:40 ` [PATCH v4 02/13] file-posix, gluster: Handle zero block status hint better Eric Blake
2025-05-09 20:40 ` [PATCH v4 03/13] block: Let bdrv_co_is_zero_fast consolidate adjacent extents Eric Blake
2025-05-09 20:40 ` [PATCH v4 04/13] block: Add new bdrv_co_is_all_zeroes() function Eric Blake
2025-05-09 20:40 ` [PATCH v4 05/13] iotests: Improve iotest 194 to mirror data Eric Blake
2025-05-09 20:40 ` [PATCH v4 06/13] mirror: Minor refactoring Eric Blake
2025-05-09 20:40 ` [PATCH v4 07/13] mirror: Pass full sync mode rather than bool to internals Eric Blake
2025-05-02 23:17 ` Sunny Zhu
2025-05-12 15:19 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero Eric Blake
2025-05-02 23:57 ` Sunny Zhu
2025-05-12 17:30 ` Eric Blake
2025-05-03 0:40 ` Sunny Zhu
2025-05-12 15:23 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 09/13] mirror: Drop redundant zero_target parameter Eric Blake
2025-05-03 0:47 ` Sunny Zhu
2025-05-12 15:24 ` Stefan Hajnoczi
2025-05-14 22:09 ` Eric Blake
2025-05-15 1:11 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero Eric Blake
2025-05-03 0:51 ` Sunny Zhu
2025-05-12 18:40 ` Stefan Hajnoczi
2025-05-09 20:40 ` [PATCH v4 11/13] mirror: Skip writing zeroes when target " Eric Blake
2025-05-12 18:43 ` Stefan Hajnoczi
2025-05-13 21:37 ` Eric Blake
2025-05-14 15:37 ` Sunny Zhu
2025-05-14 16:30 ` Eric Blake
2025-05-09 20:40 ` [PATCH v4 12/13] iotests/common.rc: add disk_usage function Eric Blake
2025-05-09 20:40 ` [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches Eric Blake
2025-05-12 18:44 ` Stefan Hajnoczi
2025-05-12 18:44 ` [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases Stefan Hajnoczi
2025-05-13 22:00 ` [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap Eric Blake
2025-05-14 13:20 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).