Re: [PATCH v2 2/2] block/null: Add read-pattern option

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Markus Armbruster <armbru@redhat.com>
To: Nir Soffer <nirsof@gmail.com>
Cc: qemu-devel@nongnu.org,  Eric Blake <eblake@redhat.com>,
	qemu-block@nongnu.org,  Kevin Wolf <kwolf@redhat.com>,
	 Fam Zheng <fam@euphon.net>,  Hanna Reitz <hreitz@redhat.com>
Subject: Re: [PATCH v2 2/2] block/null: Add read-pattern option
Date: Thu, 08 May 2025 07:28:26 +0200	[thread overview]
Message-ID: <87h61vekn9.fsf@pond.sub.org> (raw)
In-Reply-To: <20250430203717.16359-3-nirsof@gmail.com> (Nir Soffer's message of "Wed, 30 Apr 2025 23:37:17 +0300")

Nir Soffer <nirsof@gmail.com> writes:

> When the `read-zeroes` is set, reads produce zeroes, and block status
> return BDRV_BLOCK_ZERO, emulating a sparse image.
>
> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data
> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do
> not promise to zero allocated memory.
>
> When computing a blkhash of an image via qemu-nbd, we want to test 3
> cases:
>
> 1. Sparse image: skip reading the entire image based on block status
>    result, and use a pre-computed zero block hash.
> 2. Image full of zeroes: read the entire image, detect block full of
>    zeroes and skip block hash computation.
> 3. Image full of data: read the entire image and compute a hash of all
>    blocks.
>
> This change adds `read-pattern` option. If the option is set, reads
> produce the specified pattern. With this option we can emulate an image
> full of zeroes or full of non-zeroes.
>
> The following examples shows how the new option can be used with blksum
> (or nbdcopy --blkhash) to compute a blkhash of an image using the
> null-co driver.
>
> Sparse image - the very fast path:
>
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-zeroes': true}}" &
>
>     % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock'
>     300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6  nbd+unix:///?socket=/tmp/sparse.sock
>     blksum 'nbd+unix:///?socket=/tmp/sparse.sock'  0.05s user 0.01s system 92% cpu 0.061 total
>
> Image full of zeros - same hash, 268 times slower:
>
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 0}}" &
>
>     % time blksum 'nbd+unix:///?socket=/tmp/zero.sock'
>     300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6  nbd+unix:///?socket=/tmp/zero.sock
>     blksum 'nbd+unix:///?socket=/tmp/zero.sock'  7.45s user 22.57s system 183% cpu 16.347 total
>
> Image full of data - difference hash, heavy cpu usage:
>
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 'read-pattern': 255}}" &
>
>     % time blksum 'nbd+unix:///?socket=/tmp/data.sock'
>     2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432  nbd+unix:///?socket=/tmp/data.sock
>     blksum 'nbd+unix:///?socket=/tmp/data.sock'  46.05s user 14.15s system 448% cpu 13.414 total
>
> Specifying both `read-zeroes` and `read-pattern` is an error since
> `read-zeroes` implies a sparse image. Example errors:
>
>     % ./qemu-img map --output json \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'read-pattern': -1}}"
>     qemu-img: Could not open 'json:{...}': read_pattern is out of range (0-255)
>
>     % ./qemu-img map --output json \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'read-pattern': 0, 'read-zeroes': true}}"
>     qemu-img: Could not open 'json:{...}': The parameters read-zeroes and read-pattern are in conflict
>
> Tested on top of
> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html.
>
> Signed-off-by: Nir Soffer <nirsof@gmail.com>

[...]

> diff --git a/docs/devel/secure-coding-practices.rst b/docs/devel/secure-coding-practices.rst
> index 0454cc527e..73830684ea 100644
> --- a/docs/devel/secure-coding-practices.rst
> +++ b/docs/devel/secure-coding-practices.rst
> @@ -111,5 +111,6 @@ Use of null-co block drivers
>  The ``null-co`` block driver is designed for performance: its read accesses are
>  not initialized by default. In case this driver has to be used for security
>  research, it must be used with the ``read-zeroes=on`` option which fills read
> -buffers with zeroes. Security issues reported with the default
> +buffers with zeroes, or with the ``read-pattern=N`` option which fills read
> +buffers with pattern. Security issues reported with the default
>  (``read-zeroes=off``) will be discarded.
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 7c95c9e36a..2205ac9758 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3295,13 +3295,23 @@
>  #
>  # @read-zeroes: if true, emulate a sparse image, and reads from the
>  #     device produce zeroes; if false, emulate an allocated image but
> -#     reads from the device leave the buffer unchanged.
> +#     reads from the device leave the buffer unchanged. Mutually
> +#     exclusive with @read-pattern.
>  #     (default: false; since: 4.1)

Correct before the patch: absent behaves just like present and false.

That's no longer the case afterwards.  Hmm.

>  #
> +# @read-pattern: if set, emulate an allocated image, and reads from the
> +#     device produce the specified byte value; if unset, reads from the
> +#     device leave the buffer unchanged. Mutually exclusive with
> +#     @read-zeroes.
> +#     (since: 10.1)

Default?  The usual way to document it would be something like (default:
false), but that's again problematic.

> +#
>  # Since: 2.9
>  ##
>  { 'struct': 'BlockdevOptionsNull',
> -  'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
> +  'data': { '*size': 'int',
> +            '*latency-ns': 'uint64',
> +            '*read-zeroes': 'bool',
> +            '*read-pattern': 'uint8' } }
>  
>  ##
>  # @BlockdevOptionsNVMe:

Let's take a step back from the concrete interface and ponder what we're
trying to do.  We want three cases:

* Allocated, reads return unspecified crap (security hazard)

* Allocated, reads return specified data

* Sparse, reads return zeroes

How would we do this if we started on a green field?

Here's my try:

    bool sparse
    uint8 contents

so that

* Allocated, reads return unspecified crap (security hazard)

  @sparse is false, and @contents is absent

* Allocated, reads return specified data

  @sparse is false, and @contents is present

* Sparse, reads return zeroes

  @sparse is true, and @contents must absent or zero

I'd make @sparse either mandatory or default to true, so that we don't
default to security hazard.

Now compare this to your patch: same configuration data (bool × uint8),
better names, cleaner semantics, better defaults.

Unless we want to break compatibility, we're stuck with the name
@read-zeroes for the bool, and its trap-for-the-unwary default value,
but cleaner semantics seem possible.

Thoughts?

next prev parent reply	other threads:[~2025-05-08  5:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-30 20:37 [PATCH v2 0/2] block/null: Add read-pattern option Nir Soffer
2025-04-30 20:37 ` [PATCH v2 1/2] block/null: Report DATA if not reading zeroes Nir Soffer
2025-05-08  4:37   ` Markus Armbruster
2025-05-08  5:03     ` Markus Armbruster
2025-05-08  5:20       ` Markus Armbruster
2025-05-16 21:35         ` Nir Soffer
2025-05-16 21:32       ` Nir Soffer
2025-05-16 21:31     ` Nir Soffer
2025-04-30 20:37 ` [PATCH v2 2/2] block/null: Add read-pattern option Nir Soffer
2025-05-08  5:28   ` Markus Armbruster [this message]
2025-05-08 14:30     ` Eric Blake
2025-05-09  7:19       ` Markus Armbruster
2025-05-16 21:12     ` Nir Soffer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h61vekn9.fsf@pond.sub.org \
    --to=armbru@redhat.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=nirsof@gmail.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.