* [PATCH 0/2] block/null: Add read-pattern @ 2025-04-27 22:58 Nir Soffer 2025-04-27 22:58 ` [PATCH 1/2] block/null: Report DATA if not reading zeroes Nir Soffer 2025-04-27 22:59 ` [PATCH 2/2] block/null: Add read-pattern option Nir Soffer 0 siblings, 2 replies; 9+ messages in thread From: Nir Soffer @ 2025-04-27 22:58 UTC (permalink / raw) To: qemu-devel Cc: Hanna Reitz, Eric Blake, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf, Nir Soffer The null driver is very usefull for testing and benchmarking, but it can not emulate an image full of zeroes or an image full of non-zero bytes. Such images are needed for testing computing a blkhash via qemu-nbd or qemu-storage-daemon. This change adds `read-pattern` option allowing emulution of image full of zeroes and image for of non-zero bytes. I used this for testing https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html Nir Soffer (2): block/null: Report DATA if not reading zeroes block/null: Add read-pattern option block/null.c | 21 ++++++++++++++++++--- qapi/block-core.json | 9 ++++++++- 2 files changed, 26 insertions(+), 4 deletions(-) -- 2.39.5 (Apple Git-154) ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] block/null: Report DATA if not reading zeroes 2025-04-27 22:58 [PATCH 0/2] block/null: Add read-pattern Nir Soffer @ 2025-04-27 22:58 ` Nir Soffer 2025-04-28 21:49 ` Eric Blake 2025-04-27 22:59 ` [PATCH 2/2] block/null: Add read-pattern option Nir Soffer 1 sibling, 1 reply; 9+ messages in thread From: Nir Soffer @ 2025-04-27 22:58 UTC (permalink / raw) To: qemu-devel Cc: Hanna Reitz, Eric Blake, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf, Nir Soffer If read-zeroes is not set, we did not report BDRV_BLOCK_DATA or BDRV_BLOCK_ZERO. This is not consistent with other drivers and can confuse users or other programs: % qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" [{ "start": 0, "length": 1073741824, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}] % qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & % nbdinfo --map nbd://127.0.0.1 0 1073741824 1 hole With this change we report DATA in this case: % ./qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" [{ "start": 0, "length": 1073741824, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": 0}] % ./qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & % nbdinfo --map nbd://127.0.0.1 0 1073741824 0 data Signed-off-by: Nir Soffer <nirsof@gmail.com> --- block/null.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/block/null.c b/block/null.c index dc0b1fdbd9..7ba87bd9a9 100644 --- a/block/null.c +++ b/block/null.c @@ -239,9 +239,7 @@ static int coroutine_fn null_co_block_status(BlockDriverState *bs, *map = offset; *file = bs; - if (s->read_zeroes) { - ret |= BDRV_BLOCK_ZERO; - } + ret |= s->read_zeroes ? BDRV_BLOCK_ZERO : BDRV_BLOCK_DATA; return ret; } -- 2.39.5 (Apple Git-154) ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block/null: Report DATA if not reading zeroes 2025-04-27 22:58 ` [PATCH 1/2] block/null: Report DATA if not reading zeroes Nir Soffer @ 2025-04-28 21:49 ` Eric Blake 2025-04-30 19:46 ` Nir Soffer 0 siblings, 1 reply; 9+ messages in thread From: Eric Blake @ 2025-04-28 21:49 UTC (permalink / raw) To: Nir Soffer Cc: qemu-devel, Hanna Reitz, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf On Mon, Apr 28, 2025 at 01:58:59AM +0300, Nir Soffer wrote: > If read-zeroes is not set, we did not report BDRV_BLOCK_DATA or > BDRV_BLOCK_ZERO. This is not consistent with other drivers and can > confuse users or other programs: > > % qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" > [{ "start": 0, "length": 1073741824, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}] > > % qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & > > % nbdinfo --map nbd://127.0.0.1 > 0 1073741824 1 hole > > With this change we report DATA in this case: > > % ./qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" > [{ "start": 0, "length": 1073741824, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": 0}] Do we really want to represent the region as present? I think we're okay (we guarantee that reads from the region will return sensible contents), and your next patch to allow control over what data is seen makes it even better. > > % ./qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & > > % nbdinfo --map nbd://127.0.0.1 > 0 1073741824 0 data > > Signed-off-by: Nir Soffer <nirsof@gmail.com> > --- > block/null.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/block/null.c b/block/null.c > index dc0b1fdbd9..7ba87bd9a9 100644 > --- a/block/null.c > +++ b/block/null.c > @@ -239,9 +239,7 @@ static int coroutine_fn null_co_block_status(BlockDriverState *bs, > *map = offset; > *file = bs; > > - if (s->read_zeroes) { > - ret |= BDRV_BLOCK_ZERO; > - } > + ret |= s->read_zeroes ? BDRV_BLOCK_ZERO : BDRV_BLOCK_DATA; > return ret; According to include/block/block-common.h, * BDRV_BLOCK_DATA: allocation for data at offset is tied to this layer * BDRV_BLOCK_ZERO: offset reads as zero * BDRV_BLOCK_OFFSET_VALID: an associated offset exists for accessing raw data Or read differently, BDRV_BLOCK_DATA says an area is not sparse. This patch is changing the null driver to report the entire image as either sparse and zero, or not sparse and not zero. For testing purposes, I think that is reasonable enough; it matches that file-posix.c always returns one or the other, when using only lseek() to probe things. NBD has a bit finer-grained control (it has two bits, one for sparseness which is effectively !BDRV_BLOCK_DATA, and one for zero; where you can have something that is sparse but does not read as zero, such as SCSI that lacks coherent uninitialized read). In fact, you could even have DATA|ZERO if we know something is allocated AND that it reads as zero (which will be the case in your next patch if the user explicitly asks for a read pattern of 0). But for all of that thinking, in the end your patch makes sense to me. Reviewed-by: Eric Blake <eblake@redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block/null: Report DATA if not reading zeroes 2025-04-28 21:49 ` Eric Blake @ 2025-04-30 19:46 ` Nir Soffer 0 siblings, 0 replies; 9+ messages in thread From: Nir Soffer @ 2025-04-30 19:46 UTC (permalink / raw) To: Eric Blake Cc: QEMU Developers, Hanna Reitz, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf > On 29 Apr 2025, at 0:49, Eric Blake <eblake@redhat.com> wrote: > > On Mon, Apr 28, 2025 at 01:58:59AM +0300, Nir Soffer wrote: >> If read-zeroes is not set, we did not report BDRV_BLOCK_DATA or >> BDRV_BLOCK_ZERO. This is not consistent with other drivers and can >> confuse users or other programs: >> >> % qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" >> [{ "start": 0, "length": 1073741824, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}] >> >> % qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & >> >> % nbdinfo --map nbd://127.0.0.1 >> 0 1073741824 1 hole >> >> With this change we report DATA in this case: >> >> % ./qemu-img map --output json "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" >> [{ "start": 0, "length": 1073741824, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": 0}] > > Do we really want to represent the region as present? I think we're > okay (we guarantee that reads from the region will return sensible > contents), and your next patch to allow control over what data is seen > makes it even better. I kept the existing behavior, but I think present=false is relevant only if you have a backing file (like qcow2), so for null-co it makes sense to behave like file driver. > > >> >> % ./qemu-nbd "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '1g'}}" & >> >> % nbdinfo --map nbd://127.0.0.1 >> 0 1073741824 0 data >> >> Signed-off-by: Nir Soffer <nirsof@gmail.com> >> --- >> block/null.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) >> >> diff --git a/block/null.c b/block/null.c >> index dc0b1fdbd9..7ba87bd9a9 100644 >> --- a/block/null.c >> +++ b/block/null.c >> @@ -239,9 +239,7 @@ static int coroutine_fn null_co_block_status(BlockDriverState *bs, >> *map = offset; >> *file = bs; >> >> - if (s->read_zeroes) { >> - ret |= BDRV_BLOCK_ZERO; >> - } >> + ret |= s->read_zeroes ? BDRV_BLOCK_ZERO : BDRV_BLOCK_DATA; >> return ret; > > According to include/block/block-common.h, > > * BDRV_BLOCK_DATA: allocation for data at offset is tied to this layer > * BDRV_BLOCK_ZERO: offset reads as zero > * BDRV_BLOCK_OFFSET_VALID: an associated offset exists for accessing raw data > > Or read differently, BDRV_BLOCK_DATA says an area is not sparse. This > patch is changing the null driver to report the entire image as either > sparse and zero, or not sparse and not zero. > > For testing purposes, I think that is reasonable enough; it matches > that file-posix.c always returns one or the other, when using only > lseek() to probe things. NBD has a bit finer-grained control (it has > two bits, one for sparseness which is effectively !BDRV_BLOCK_DATA, > and one for zero; where you can have something that is sparse but does > not read as zero, such as SCSI that lacks coherent uninitialized > read). In fact, you could even have DATA|ZERO if we know something is > allocated AND that it reads as zero (which will be the case in your > next patch if the user explicitly asks for a read pattern of 0). If we want this we can remove read_zeros, since read_pattern support reading zeros, and have: *read_pattern: uint8 (default unset) *block_status: “data"|”zero” (default data?) *allocated: bool (default true) We can replace reads-zeros=on with read-pattern=0 in tests and docs, but it may break other users using null-co. Maybe keep read_zeros for backward compatibility, and make it conflict with the new options? > > But for all of that thinking, in the end your patch makes sense to me. > > Reviewed-by: Eric Blake <eblake@redhat.com> > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. > Virtualization: qemu.org | libguestfs.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] block/null: Add read-pattern option 2025-04-27 22:58 [PATCH 0/2] block/null: Add read-pattern Nir Soffer 2025-04-27 22:58 ` [PATCH 1/2] block/null: Report DATA if not reading zeroes Nir Soffer @ 2025-04-27 22:59 ` Nir Soffer 2025-04-28 21:55 ` Eric Blake 1 sibling, 1 reply; 9+ messages in thread From: Nir Soffer @ 2025-04-27 22:59 UTC (permalink / raw) To: qemu-devel Cc: Hanna Reitz, Eric Blake, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf, Nir Soffer When the `read-zeroes` is set, reads produce zeroes, and block status return BDRV_BLOCK_ZERO, emulating a sparse image. If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do not promise to zero allocated memory. When computing a blkhash of an image via qemu-nbd, we want to test 3 cases: 1. Sparse image: skip reading the entire image based on block status result, and use a pre-computed zero block hash. 2. Image full of zeroes: read the entire image, detect block full of zeroes and skip block hash computation. 3. Image full of data: read the entire image and compute a hash of all blocks. This change adds `read-pattern` option. If the option is set, reads produce the specified pattern. With this option we can emulate an image full of zeroes or full of non-zeroes. Specifying both `read-zeroes` and `read-pattern != 0` is not useful since `read-zeroes` implies a sparse image. In this case `read-zeroes` wins and we ignore the pattern. Maybe we need to make the options mutual exclusive. The following examples shows how the new option can be used with blksum (or nbdcopy --blkhash) to compute a blkhash of an image using the null-co driver. Sparse image - the very fast path: % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" & % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/sparse.sock blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system 92% cpu 0.061 total Image full of zeros - same hash, 268 times slower: % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" & % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/zero.sock blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system 183% cpu 16.347 total Image full of data - difference hash, heavy cpu usage: % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': -1}}" & % time blksum 'nbd+unix:///?socket=/tmp/data.sock' 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 nbd+unix:///?socket=/tmp/data.sock blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system 448% cpu 13.414 total Tested on top of https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. Signed-off-by: Nir Soffer <nirsof@gmail.com> --- block/null.c | 17 +++++++++++++++++ qapi/block-core.json | 9 ++++++++- 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/block/null.c b/block/null.c index 7ba87bd9a9..cbd7d1fdbd 100644 --- a/block/null.c +++ b/block/null.c @@ -22,11 +22,14 @@ #define NULL_OPT_LATENCY "latency-ns" #define NULL_OPT_ZEROES "read-zeroes" +#define NULL_OPT_PATTERN "read-pattern" typedef struct { int64_t length; int64_t latency_ns; bool read_zeroes; + bool has_read_pattern; + int read_pattern; } BDRVNullState; static QemuOptsList runtime_opts = { @@ -49,6 +52,11 @@ static QemuOptsList runtime_opts = { .type = QEMU_OPT_BOOL, .help = "return zeroes when read", }, + { + .name = NULL_OPT_PATTERN, + .type = QEMU_OPT_NUMBER, + .help = "return pattern when read", + }, { /* end of list */ } }, }; @@ -95,6 +103,10 @@ static int null_open(BlockDriverState *bs, QDict *options, int flags, ret = -EINVAL; } s->read_zeroes = qemu_opt_get_bool(opts, NULL_OPT_ZEROES, false); + s->has_read_pattern = qemu_opt_find(opts, NULL_OPT_PATTERN) != NULL; + if (s->has_read_pattern) { + s->read_pattern = qemu_opt_get_number(opts, NULL_OPT_PATTERN, 0); + } qemu_opts_del(opts); bs->supported_write_flags = BDRV_REQ_FUA; return ret; @@ -125,6 +137,8 @@ static coroutine_fn int null_co_preadv(BlockDriverState *bs, if (s->read_zeroes) { qemu_iovec_memset(qiov, 0, 0, bytes); + } else if (s->has_read_pattern) { + qemu_iovec_memset(qiov, 0, s->read_pattern, bytes); } return null_co_common(bs); @@ -199,6 +213,8 @@ static BlockAIOCB *null_aio_preadv(BlockDriverState *bs, if (s->read_zeroes) { qemu_iovec_memset(qiov, 0, 0, bytes); + } else if (s->has_read_pattern) { + qemu_iovec_memset(qiov, 0, s->read_pattern, bytes); } return null_aio_common(bs, cb, opaque); @@ -272,6 +288,7 @@ null_co_get_allocated_file_size(BlockDriverState *bs) static const char *const null_strong_runtime_opts[] = { BLOCK_OPT_SIZE, NULL_OPT_ZEROES, + NULL_OPT_PATTERN, NULL }; diff --git a/qapi/block-core.json b/qapi/block-core.json index b1937780e1..7d576cccbb 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -3297,10 +3297,17 @@ # false, the buffer is left unchanged. # (default: false; since: 4.1) # +# @read-pattern: if set, reads from the device produce the specified +# pattern; if unset, the buffer is left unchanged. +# (since: 10.1) +# # Since: 2.9 ## { 'struct': 'BlockdevOptionsNull', - 'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } } + 'data': { '*size': 'int', + '*latency-ns': 'uint64', + '*read-zeroes': 'bool', + '*read-pattern': 'int' } } ## # @BlockdevOptionsNVMe: -- 2.39.5 (Apple Git-154) ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] block/null: Add read-pattern option 2025-04-27 22:59 ` [PATCH 2/2] block/null: Add read-pattern option Nir Soffer @ 2025-04-28 21:55 ` Eric Blake 2025-04-29 6:00 ` Markus Armbruster 2025-04-30 19:54 ` Nir Soffer 0 siblings, 2 replies; 9+ messages in thread From: Eric Blake @ 2025-04-28 21:55 UTC (permalink / raw) To: Nir Soffer Cc: qemu-devel, Hanna Reitz, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote: > When the `read-zeroes` is set, reads produce zeroes, and block status > return BDRV_BLOCK_ZERO, emulating a sparse image. > > If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data > is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do > not promise to zero allocated memory. > > When computing a blkhash of an image via qemu-nbd, we want to test 3 > cases: > > 1. Sparse image: skip reading the entire image based on block status > result, and use a pre-computed zero block hash. > 2. Image full of zeroes: read the entire image, detect block full of > zeroes and skip block hash computation. > 3. Image full of data: read the entire image and compute a hash of all > blocks. > > This change adds `read-pattern` option. If the option is set, reads > produce the specified pattern. With this option we can emulate an image > full of zeroes or full of non-zeroes. > > Specifying both `read-zeroes` and `read-pattern != 0` is not useful > since `read-zeroes` implies a sparse image. In this case `read-zeroes` > wins and we ignore the pattern. Maybe we need to make the options mutual > exclusive. I would lean towards an error. It's easier to remove an error later if we find it was too strict, than to be lax up front and then regret it down the road when we wish we had been more strict. > > The following examples shows how the new option can be used with blksum > (or nbdcopy --blkhash) to compute a blkhash of an image using the > null-co driver. > > Sparse image - the very fast path: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' > 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/sparse.sock > blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system 92% cpu 0.061 total > > Image full of zeros - same hash, 268 times slower: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' > 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/zero.sock > blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system 183% cpu 16.347 total > > Image full of data - difference hash, heavy cpu usage: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': -1}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/data.sock' > 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 nbd+unix:///?socket=/tmp/data.sock > blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system 448% cpu 13.414 total > > Tested on top of > https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. > > Signed-off-by: Nir Soffer <nirsof@gmail.com> > --- > block/null.c | 17 +++++++++++++++++ > qapi/block-core.json | 9 ++++++++- > 2 files changed, 25 insertions(+), 1 deletion(-) Should block status return ZERO|DATA when read-pattern=0 is present? > +# @read-pattern: if set, reads from the device produce the specified > +# pattern; if unset, the buffer is left unchanged. > +# (since: 10.1) > +# > # Since: 2.9 > ## > { 'struct': 'BlockdevOptionsNull', > - 'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } } > + 'data': { '*size': 'int', > + '*latency-ns': 'uint64', > + '*read-zeroes': 'bool', > + '*read-pattern': 'int' } } Should this be 'uint8' instead of 'int', so that we aren't silently truncating spurious upper bits when passing it to memset()? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] block/null: Add read-pattern option 2025-04-28 21:55 ` Eric Blake @ 2025-04-29 6:00 ` Markus Armbruster 2025-04-30 19:58 ` Nir Soffer 2025-04-30 19:54 ` Nir Soffer 1 sibling, 1 reply; 9+ messages in thread From: Markus Armbruster @ 2025-04-29 6:00 UTC (permalink / raw) To: Eric Blake Cc: Nir Soffer, qemu-devel, Hanna Reitz, qemu-block, Fam Zheng, Kevin Wolf Eric Blake <eblake@redhat.com> writes: > On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote: >> When the `read-zeroes` is set, reads produce zeroes, and block status >> return BDRV_BLOCK_ZERO, emulating a sparse image. >> >> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data >> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do >> not promise to zero allocated memory. >> >> When computing a blkhash of an image via qemu-nbd, we want to test 3 >> cases: >> >> 1. Sparse image: skip reading the entire image based on block status >> result, and use a pre-computed zero block hash. >> 2. Image full of zeroes: read the entire image, detect block full of >> zeroes and skip block hash computation. >> 3. Image full of data: read the entire image and compute a hash of all >> blocks. >> >> This change adds `read-pattern` option. If the option is set, reads >> produce the specified pattern. With this option we can emulate an image >> full of zeroes or full of non-zeroes. >> >> Specifying both `read-zeroes` and `read-pattern != 0` is not useful >> since `read-zeroes` implies a sparse image. In this case `read-zeroes` >> wins and we ignore the pattern. Maybe we need to make the options mutual >> exclusive. > > I would lean towards an error. It's easier to remove an error later > if we find it was too strict, than to be lax up front and then regret > it down the road when we wish we had been more strict. Seconded. Silently "fixing" the user's nonsensical instructions is commonly a bad idea. >> The following examples shows how the new option can be used with blksum >> (or nbdcopy --blkhash) to compute a blkhash of an image using the >> null-co driver. >> >> Sparse image - the very fast path: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' >> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/sparse.sock >> blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system 92% cpu 0.061 total >> >> Image full of zeros - same hash, 268 times slower: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' >> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/zero.sock >> blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system 183% cpu 16.347 total >> >> Image full of data - difference hash, heavy cpu usage: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': -1}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/data.sock' >> 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 nbd+unix:///?socket=/tmp/data.sock >> blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system 448% cpu 13.414 total >> >> Tested on top of >> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. >> >> Signed-off-by: Nir Soffer <nirsof@gmail.com> >> --- >> block/null.c | 17 +++++++++++++++++ >> qapi/block-core.json | 9 ++++++++- >> 2 files changed, 25 insertions(+), 1 deletion(-) > > Should block status return ZERO|DATA when read-pattern=0 is present? > diff --git a/qapi/block-core.json b/qapi/block-core.json index b1937780e1..7d576cccbb 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -3297,10 +3297,17 @@ # @read-zeroes: if true, reads from the device produce zeroes; if # false, the buffer is left unchanged. # (default: false; since: 4.1) The commit message explains "read-zeroes": true behaves like a sparse image. The existing doc comment does not. I suspect it should. # >> +# @read-pattern: if set, reads from the device produce the specified >> +# pattern; if unset, the buffer is left unchanged. >> +# (since: 10.1) >> +# >> # Since: 2.9 >> ## >> { 'struct': 'BlockdevOptionsNull', >> - 'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } } >> + 'data': { '*size': 'int', >> + '*latency-ns': 'uint64', >> + '*read-zeroes': 'bool', >> + '*read-pattern': 'int' } } > > Should this be 'uint8' instead of 'int', so that we aren't silently > truncating spurious upper bits when passing it to memset()? Yes, please. Without this, the doc comment does not sufficiently specify the contents of the image. "The specified pattern" could be read as * Four bytes given by the 32 bit value of @read-pattern in big endian * or in little endian * or in host endian In fact, it's none of the above, it's the least significant byte. Please try to clarify the doc comment in addition to narrowing the type. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] block/null: Add read-pattern option 2025-04-29 6:00 ` Markus Armbruster @ 2025-04-30 19:58 ` Nir Soffer 0 siblings, 0 replies; 9+ messages in thread From: Nir Soffer @ 2025-04-30 19:58 UTC (permalink / raw) To: Markus Armbruster Cc: Eric Blake, QEMU Developers, Hanna Reitz, qemu-block, Fam Zheng, Kevin Wolf > On 29 Apr 2025, at 9:00, Markus Armbruster <armbru@redhat.com> wrote: > > Eric Blake <eblake@redhat.com> writes: > >> On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote: >>> When the `read-zeroes` is set, reads produce zeroes, and block status >>> return BDRV_BLOCK_ZERO, emulating a sparse image. >>> >>> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data >>> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do >>> not promise to zero allocated memory. >>> >>> When computing a blkhash of an image via qemu-nbd, we want to test 3 >>> cases: >>> >>> 1. Sparse image: skip reading the entire image based on block status >>> result, and use a pre-computed zero block hash. >>> 2. Image full of zeroes: read the entire image, detect block full of >>> zeroes and skip block hash computation. >>> 3. Image full of data: read the entire image and compute a hash of all >>> blocks. >>> >>> This change adds `read-pattern` option. If the option is set, reads >>> produce the specified pattern. With this option we can emulate an image >>> full of zeroes or full of non-zeroes. >>> >>> Specifying both `read-zeroes` and `read-pattern != 0` is not useful >>> since `read-zeroes` implies a sparse image. In this case `read-zeroes` >>> wins and we ignore the pattern. Maybe we need to make the options mutual >>> exclusive. >> >> I would lean towards an error. It's easier to remove an error later >> if we find it was too strict, than to be lax up front and then regret >> it down the road when we wish we had been more strict. > > Seconded. Silently "fixing" the user's nonsensical instructions is > commonly a bad idea. Thanks, making the options mutually exclusive. > >>> The following examples shows how the new option can be used with blksum >>> (or nbdcopy --blkhash) to compute a blkhash of an image using the >>> null-co driver. >>> >>> Sparse image - the very fast path: >>> >>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ >>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" & >>> >>> % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' >>> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/sparse.sock >>> blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system 92% cpu 0.061 total >>> >>> Image full of zeros - same hash, 268 times slower: >>> >>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ >>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" & >>> >>> % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' >>> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/zero.sock >>> blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system 183% cpu 16.347 total >>> >>> Image full of data - difference hash, heavy cpu usage: >>> >>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ >>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': -1}}" & >>> >>> % time blksum 'nbd+unix:///?socket=/tmp/data.sock' >>> 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 nbd+unix:///?socket=/tmp/data.sock >>> blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system 448% cpu 13.414 total >>> >>> Tested on top of >>> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. >>> >>> Signed-off-by: Nir Soffer <nirsof@gmail.com> >>> --- >>> block/null.c | 17 +++++++++++++++++ >>> qapi/block-core.json | 9 ++++++++- >>> 2 files changed, 25 insertions(+), 1 deletion(-) >> >> Should block status return ZERO|DATA when read-pattern=0 is present? >> > diff --git a/qapi/block-core.json b/qapi/block-core.json > index b1937780e1..7d576cccbb 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -3297,10 +3297,17 @@ > # @read-zeroes: if true, reads from the device produce zeroes; if > # false, the buffer is left unchanged. > # (default: false; since: 4.1) > > The commit message explains "read-zeroes": true behaves like a sparse > image. The existing doc comment does not. I suspect it should. > > # >>> +# @read-pattern: if set, reads from the device produce the specified >>> +# pattern; if unset, the buffer is left unchanged. >>> +# (since: 10.1) >>> +# >>> # Since: 2.9 >>> ## >>> { 'struct': 'BlockdevOptionsNull', >>> - 'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } } >>> + 'data': { '*size': 'int', >>> + '*latency-ns': 'uint64', >>> + '*read-zeroes': 'bool', >>> + '*read-pattern': 'int' } } >> >> Should this be 'uint8' instead of 'int', so that we aren't silently >> truncating spurious upper bits when passing it to memset()? > > Yes, please. > > Without this, the doc comment does not sufficiently specify the contents > of the image. "The specified pattern" could be read as > > * Four bytes given by the 32 bit value of @read-pattern in big endian > > * or in little endian > > * or in host endian > > In fact, it's none of the above, it's the least significant byte. > > Please try to clarify the doc comment in addition to narrowing the type. Fixing in v2. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] block/null: Add read-pattern option 2025-04-28 21:55 ` Eric Blake 2025-04-29 6:00 ` Markus Armbruster @ 2025-04-30 19:54 ` Nir Soffer 1 sibling, 0 replies; 9+ messages in thread From: Nir Soffer @ 2025-04-30 19:54 UTC (permalink / raw) To: Eric Blake Cc: QEMU Developers, Hanna Reitz, Markus Armbruster, qemu-block, Fam Zheng, Kevin Wolf > On 29 Apr 2025, at 0:55, Eric Blake <eblake@redhat.com> wrote: > > On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote: >> When the `read-zeroes` is set, reads produce zeroes, and block status >> return BDRV_BLOCK_ZERO, emulating a sparse image. >> >> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data >> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do >> not promise to zero allocated memory. >> >> When computing a blkhash of an image via qemu-nbd, we want to test 3 >> cases: >> >> 1. Sparse image: skip reading the entire image based on block status >> result, and use a pre-computed zero block hash. >> 2. Image full of zeroes: read the entire image, detect block full of >> zeroes and skip block hash computation. >> 3. Image full of data: read the entire image and compute a hash of all >> blocks. >> >> This change adds `read-pattern` option. If the option is set, reads >> produce the specified pattern. With this option we can emulate an image >> full of zeroes or full of non-zeroes. >> >> Specifying both `read-zeroes` and `read-pattern != 0` is not useful >> since `read-zeroes` implies a sparse image. In this case `read-zeroes` >> wins and we ignore the pattern. Maybe we need to make the options mutual >> exclusive. > > I would lean towards an error. It's easier to remove an error later > if we find it was too strict, than to be lax up front and then regret > it down the road when we wish we had been more strict. Good point > >> >> The following examples shows how the new option can be used with blksum >> (or nbdcopy --blkhash) to compute a blkhash of an image using the >> null-co driver. >> >> Sparse image - the very fast path: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' >> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/sparse.sock >> blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system 92% cpu 0.061 total >> >> Image full of zeros - same hash, 268 times slower: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' >> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 nbd+unix:///?socket=/tmp/zero.sock >> blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system 183% cpu 16.347 total >> >> Image full of data - difference hash, heavy cpu usage: >> >> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ >> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': -1}}" & >> >> % time blksum 'nbd+unix:///?socket=/tmp/data.sock' >> 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 nbd+unix:///?socket=/tmp/data.sock >> blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system 448% cpu 13.414 total >> >> Tested on top of >> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. >> >> Signed-off-by: Nir Soffer <nirsof@gmail.com> >> --- >> block/null.c | 17 +++++++++++++++++ >> qapi/block-core.json | 9 ++++++++- >> 2 files changed, 25 insertions(+), 1 deletion(-) > > Should block status return ZERO|DATA when read-pattern=0 is present? This will make it useless for testing image full of zeros which is the reason I added it. qemu-img convert, nbdcopy, and blksum look only at the zero bit and skip reading the extent. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-04-30 19:58 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-27 22:58 [PATCH 0/2] block/null: Add read-pattern Nir Soffer 2025-04-27 22:58 ` [PATCH 1/2] block/null: Report DATA if not reading zeroes Nir Soffer 2025-04-28 21:49 ` Eric Blake 2025-04-30 19:46 ` Nir Soffer 2025-04-27 22:59 ` [PATCH 2/2] block/null: Add read-pattern option Nir Soffer 2025-04-28 21:55 ` Eric Blake 2025-04-29 6:00 ` Markus Armbruster 2025-04-30 19:58 ` Nir Soffer 2025-04-30 19:54 ` Nir Soffer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).