qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: yadong.qi@intel.com
Cc: fam@euphon.net, qemu-block@nongnu.org, mst@redhat.com,
	luhai.chen@intel.com, qemu-devel@nongnu.org,
	kai.z.wang@intel.com, hreitz@redhat.com, stefanha@redhat.com
Subject: Re: [PATCH 1/2] block:hdev: support BLKSECDISCARD
Date: Mon, 15 Nov 2021 13:40:55 +0100	[thread overview]
Message-ID: <YZJVVzott+zsoLqN@redhat.com> (raw)
In-Reply-To: <20211115045200.3567293-2-yadong.qi@intel.com>

Am 15.11.2021 um 05:51 hat yadong.qi@intel.com geschrieben:
> From: Yadong Qi <yadong.qi@intel.com>
> 
> Add a new option "secdiscard" for block drive. Parse it and
> record it in bs->open_flags as bit(BDRV_O_SECDISCARD).
> 
> Add a new BDRV_REQ_SECDISCARD bit for secure discard request
> from virtual device.
> 
> For host_device backend: implement by ioctl(BLKSECDISCARD) on
> real host block device.
> For other backend, no implementation.
> 
> E.g.:
>     qemu-system-x86_64 \
>     ... \
>     -drive file=/dev/mmcblk0p2,if=none,format=raw,discard=on,secdiscard=on,id=sd0 \
>     -device virtio-blk-pci,drive=sd0,id=sd0_vblk \
>     ...
> 
> Signed-off-by: Yadong Qi <yadong.qi@intel.com>
> ---
>  block.c                          | 46 +++++++++++++++++++++++++++++
>  block/blkdebug.c                 |  5 ++--
>  block/blklogwrites.c             |  6 ++--
>  block/blkreplay.c                |  5 ++--
>  block/block-backend.c            | 15 ++++++----
>  block/copy-before-write.c        |  5 ++--
>  block/copy-on-read.c             |  5 ++--
>  block/coroutines.h               |  6 ++--
>  block/file-posix.c               | 50 ++++++++++++++++++++++++++++----
>  block/filter-compress.c          |  5 ++--
>  block/io.c                       |  5 ++--
>  block/mirror.c                   |  5 ++--
>  block/nbd.c                      |  3 +-
>  block/nvme.c                     |  3 +-
>  block/preallocate.c              |  5 ++--
>  block/qcow2-refcount.c           |  4 +--
>  block/qcow2.c                    |  3 +-
>  block/raw-format.c               |  5 ++--
>  block/throttle.c                 |  5 ++--
>  hw/block/virtio-blk.c            |  2 +-
>  hw/ide/core.c                    |  1 +
>  hw/nvme/ctrl.c                   |  3 +-
>  hw/scsi/scsi-disk.c              |  2 +-
>  include/block/block.h            | 13 +++++++--
>  include/block/block_int.h        |  2 +-
>  include/block/raw-aio.h          |  4 ++-
>  include/sysemu/block-backend.h   |  1 +
>  tests/unit/test-block-iothread.c |  9 +++---
>  28 files changed, 171 insertions(+), 52 deletions(-)

Notably absent: qapi/block-core.json. Without changing this, the option
can't be available in -blockdev, which is the primary option to configure
block device backends.

This patch seems to contain multiple logical changes that should be
split into separate patches:

* Adding a flags parameter to .bdrv_co_pdiscard

* Support for the new feature in the core block layer (should be done
  with -blockdev)

* Convenience magic for -drive (BDRV_O_SECDISCARD). It's not clear that
  this should be done at all because the option is really specific to
  one single block driver (file-posix). I think in your patch, all
  other block drivers silently ignore the option, which is not what we
  want.

> diff --git a/block.c b/block.c
> index 580cb77a70..4f05e96d12 100644
> --- a/block.c
> +++ b/block.c
> @@ -1128,6 +1128,32 @@ int bdrv_parse_discard_flags(const char *mode, int *flags)
>      return 0;
>  }
>  
> +/**
> + * Set open flags for a given secdiscard mode
> + *
> + * Return 0 on success, -1 if the secdiscard mode was invalid.
> + */
> +int bdrv_parse_secdiscard_flags(const char *mode, int *flags, Error **errp)
> +{
> +    *flags &= ~BDRV_O_SECDISCARD;
> +
> +    if (!strcmp(mode, "off")) {
> +        /* do nothing */
> +    } else if (!strcmp(mode, "on")) {
> +        if (!(*flags & BDRV_O_UNMAP)) {
> +            error_setg(errp, "cannot enable secdiscard when discard is "
> +                             "disabled!");
> +            return -1;
> +        }

This check has nothing to do with parsing the option, it's validating
its value.

You don't even need a new function to parse it, because there is already
qemu_opt_get_bool(). Duplicating it means only that you're inconsistent
with other boolean options, which alternatively accept "yes"/"no",
"true"/"false", "y/n".

> +
> +        *flags |= BDRV_O_SECDISCARD;
> +    } else {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>  /**
>   * Set open flags for a given cache mode
>   *
> @@ -1695,6 +1721,11 @@ QemuOptsList bdrv_runtime_opts = {
>              .type = QEMU_OPT_STRING,
>              .help = "discard operation (ignore/off, unmap/on)",
>          },
> +        {
> +            .name = BDRV_OPT_SECDISCARD,
> +            .type = QEMU_OPT_STRING,
> +            .help = "secure discard operation (off, on)",
> +        },
>          {
>              .name = BDRV_OPT_FORCE_SHARE,
>              .type = QEMU_OPT_BOOL,
> @@ -1735,6 +1766,7 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
>      const char *driver_name = NULL;
>      const char *node_name = NULL;
>      const char *discard;
> +    const char *secdiscard;
>      QemuOpts *opts;
>      BlockDriver *drv;
>      Error *local_err = NULL;
> @@ -1829,6 +1861,16 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
>          }
>      }
>  
> +
> +    secdiscard = qemu_opt_get(opts, BDRV_OPT_SECDISCARD);
> +    if (secdiscard != NULL) {
> +        if (bdrv_parse_secdiscard_flags(secdiscard, &bs->open_flags,
> +                                        errp) != 0) {
> +            ret = -EINVAL;
> +            goto fail_opts;
> +        }
> +    }
> +
>      bs->detect_zeroes =
>          bdrv_parse_detect_zeroes(opts, bs->open_flags, &local_err);
>      if (local_err) {
> @@ -3685,6 +3727,10 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
>                                 &flags, options, flags, options);
>      }
>  
> +    if (g_strcmp0(qdict_get_try_str(options, BDRV_OPT_SECDISCARD), "on")) {
> +            flags |= BDRV_O_SECDISCARD;

Indentation is off.

> +    }
> +
>      bs->open_flags = flags;
>      bs->options = options;
>      options = qdict_clone_shallow(options);
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index bbf2948703..b49bb6a3e9 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -717,7 +717,8 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
>  }
>  
>  static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
> -                                             int64_t offset, int64_t bytes)
> +                                             int64_t offset, int64_t bytes,
> +                                             BdrvRequestFlags flags)
>  {
>      uint32_t align = bs->bl.pdiscard_alignment;
>      int err;
> @@ -747,7 +748,7 @@ static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
>          return err;
>      }
>  
> -    return bdrv_co_pdiscard(bs->file, offset, bytes);
> +    return bdrv_co_pdiscard(bs->file, offset, bytes, 0);
>  }
>  
>  static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
> diff --git a/block/blklogwrites.c b/block/blklogwrites.c
> index f7a251e91f..d8d81a40ae 100644
> --- a/block/blklogwrites.c
> +++ b/block/blklogwrites.c
> @@ -456,7 +456,8 @@ static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
>  static int coroutine_fn
>  blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
>  {
> -    return bdrv_co_pdiscard(fr->bs->file, fr->offset, fr->bytes);
> +    return bdrv_co_pdiscard(fr->bs->file, fr->offset, fr->bytes,
> +                            fr->file_flags);
>  }
>  
>  static int coroutine_fn
> @@ -484,7 +485,8 @@ static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs)
>  }
>  
>  static int coroutine_fn
> -blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
> +blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes,
> +                                       BdrvRequestFlags flags)
>  {
>      return blk_log_writes_co_log(bs, offset, bytes, NULL, 0,
>                                   blk_log_writes_co_do_file_pdiscard,
> diff --git a/block/blkreplay.c b/block/blkreplay.c
> index dcbe780ddb..65e66d0766 100644
> --- a/block/blkreplay.c
> +++ b/block/blkreplay.c
> @@ -105,10 +105,11 @@ static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
>  }
>  
>  static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
> -                                              int64_t offset, int64_t bytes)
> +                                              int64_t offset, int64_t bytes,
> +                                              BdrvRequestFlags flags)
>  {
>      uint64_t reqid = blkreplay_next_id();
> -    int ret = bdrv_co_pdiscard(bs->file, offset, bytes);
> +    int ret = bdrv_co_pdiscard(bs->file, offset, bytes, flags);
>      block_request_create(reqid, bs, qemu_coroutine_self());
>      qemu_coroutine_yield();
>  
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 12ef80ea17..f2c5776172 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1597,7 +1597,8 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
>  
>  /* To be called between exactly one pair of blk_inc/dec_in_flight() */
>  int coroutine_fn
> -blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes)
> +blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
> +                   BdrvRequestFlags flags)
>  {
>      int ret;
>  
> @@ -1608,7 +1609,7 @@ blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes)
>          return ret;
>      }
>  
> -    return bdrv_co_pdiscard(blk->root, offset, bytes);
> +    return bdrv_co_pdiscard(blk->root, offset, bytes, flags);
>  }
>  
>  static void blk_aio_pdiscard_entry(void *opaque)
> @@ -1616,15 +1617,17 @@ static void blk_aio_pdiscard_entry(void *opaque)
>      BlkAioEmAIOCB *acb = opaque;
>      BlkRwCo *rwco = &acb->rwco;
>  
> -    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes);
> +    rwco->ret = blk_co_do_pdiscard(rwco->blk, rwco->offset, acb->bytes,
> +                                   rwco->flags);
>      blk_aio_complete(acb);
>  }
>  
>  BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
>                               int64_t offset, int64_t bytes,
> +                             BdrvRequestFlags flags,
>                               BlockCompletionFunc *cb, void *opaque)
>  {
> -    return blk_aio_prwv(blk, offset, bytes, NULL, blk_aio_pdiscard_entry, 0,
> +    return blk_aio_prwv(blk, offset, bytes, NULL, blk_aio_pdiscard_entry, flags,
>                          cb, opaque);
>  }
>  
> @@ -1634,7 +1637,7 @@ int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset,
>      int ret;
>  
>      blk_inc_in_flight(blk);
> -    ret = blk_co_do_pdiscard(blk, offset, bytes);
> +    ret = blk_co_do_pdiscard(blk, offset, bytes, 0);
>      blk_dec_in_flight(blk);
>  
>      return ret;
> @@ -1645,7 +1648,7 @@ int blk_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes)
>      int ret;
>  
>      blk_inc_in_flight(blk);
> -    ret = blk_do_pdiscard(blk, offset, bytes);
> +    ret = blk_do_pdiscard(blk, offset, bytes, 0);
>      blk_dec_in_flight(blk);
>  
>      return ret;
> diff --git a/block/copy-before-write.c b/block/copy-before-write.c
> index c30a5ff8de..8d60a3028f 100644
> --- a/block/copy-before-write.c
> +++ b/block/copy-before-write.c
> @@ -64,14 +64,15 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
>  }
>  
>  static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs,
> -                                        int64_t offset, int64_t bytes)
> +                                        int64_t offset, int64_t bytes,
> +                                       BdrvRequestFlags flags)
>  {
>      int ret = cbw_do_copy_before_write(bs, offset, bytes, 0);
>      if (ret < 0) {
>          return ret;
>      }
>  
> -    return bdrv_co_pdiscard(bs->file, offset, bytes);
> +    return bdrv_co_pdiscard(bs->file, offset, bytes, 0);
>  }
>  
>  static int coroutine_fn cbw_co_pwrite_zeroes(BlockDriverState *bs,
> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
> index 1fc7fb3333..52183cc9a2 100644
> --- a/block/copy-on-read.c
> +++ b/block/copy-on-read.c
> @@ -201,9 +201,10 @@ static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs,
>  
>  
>  static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
> -                                        int64_t offset, int64_t bytes)
> +                                        int64_t offset, int64_t bytes,
> +                                       BdrvRequestFlags flags)
>  {
> -    return bdrv_co_pdiscard(bs->file, offset, bytes);
> +    return bdrv_co_pdiscard(bs->file, offset, bytes, 0);
>  }
>  
>  
> diff --git a/block/coroutines.h b/block/coroutines.h
> index c8c14a29c8..b0ba771bef 100644
> --- a/block/coroutines.h
> +++ b/block/coroutines.h
> @@ -98,9 +98,11 @@ int coroutine_fn
>  blk_co_do_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
>  
>  int generated_co_wrapper
> -blk_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes);
> +blk_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
> +                BdrvRequestFlags flags);
>  int coroutine_fn
> -blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes);
> +blk_co_do_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
> +                   BdrvRequestFlags flags);
>  
>  int generated_co_wrapper blk_do_flush(BlockBackend *blk);
>  int coroutine_fn blk_co_do_flush(BlockBackend *blk);
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 7a27c83060..caa406e429 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -160,6 +160,7 @@ typedef struct BDRVRawState {
>      bool is_xfs:1;
>  #endif
>      bool has_discard:1;
> +    bool has_secdiscard:1;
>      bool has_write_zeroes:1;
>      bool discard_zeroes:1;
>      bool use_linux_aio:1;

has_secdiscard is only set to false in raw_open_common() and never
changed or used.

> @@ -727,6 +728,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>  #endif /* !defined(CONFIG_LINUX_IO_URING) */
>  
>      s->has_discard = true;
> +    s->has_secdiscard = false;
>      s->has_write_zeroes = true;
>      if ((bs->open_flags & BDRV_O_NOCACHE) != 0 && !dio_byte_aligned(s->fd)) {
>          s->needs_alignment = true;
> @@ -765,6 +767,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>              s->discard_zeroes = true;
>          }
>  #endif
> +
>  #ifdef __linux__
>          /* On Linux 3.10, BLKDISCARD leaves stale data in the page cache.  Do
>           * not rely on the contents of discarded blocks unless using O_DIRECT.

Unrelated hunk.

> @@ -1859,6 +1862,35 @@ static int handle_aiocb_discard(void *opaque)
>      return ret;
>  }
>  
> +static int handle_aiocb_secdiscard(void *opaque)
> +{
> +    RawPosixAIOData *aiocb = opaque;
> +    int ret = -ENOTSUP;
> +    BlockDriverState *bs = aiocb->bs;
> +
> +    if (!(bs->open_flags & BDRV_O_SECDISCARD)) {
> +        return -ENOTSUP;
> +    }
> +
> +    if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
> +#ifdef BLKSECDISCARD
> +        do {
> +            uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
> +            if (ioctl(aiocb->aio_fildes, BLKSECDISCARD, range) == 0) {
> +                return 0;
> +            }
> +        } while (errno == EINTR);
> +
> +        ret = translate_err(-errno);
> +#endif
> +    }
> +
> +    if (ret == -ENOTSUP) {
> +        bs->open_flags &= ~BDRV_O_SECDISCARD;

I'd rather avoid changing bs->open_flags. This is user input and I would
preserve it in its original state.

We already know when opening the image whether it is a block device. Why
do we even open the image instead of erroring out there?

> +    }
> +    return ret;
> +}
> +
>  /*
>   * Help alignment probing by allocating the first block.
>   *

Kevin



  reply	other threads:[~2021-11-15 12:49 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-15  4:51 [PATCH 0/2] support BLKSECDISCARD yadong.qi
2021-11-15  4:51 ` [PATCH 1/2] block:hdev: " yadong.qi
2021-11-15 12:40   ` Kevin Wolf [this message]
2021-11-16  1:54     ` Qi, Yadong
2021-11-15 14:12   ` Stefan Hajnoczi
2021-11-16  2:03     ` Qi, Yadong
2021-11-16 10:58       ` Stefan Hajnoczi
2021-11-17  5:53         ` Christoph Hellwig
2021-11-17 10:32           ` Stefan Hajnoczi
2021-11-18  1:13           ` Qi, Yadong
2021-11-15  4:52 ` [PATCH 2/2] virtio-blk: " yadong.qi
2021-11-15 10:00   ` Stefano Garzarella
2021-11-16  1:26     ` Qi, Yadong
2021-11-15 10:57   ` Michael S. Tsirkin
2021-11-16  1:33     ` Qi, Yadong
2021-11-15 14:26   ` Stefan Hajnoczi
2021-11-16  2:13     ` Qi, Yadong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YZJVVzott+zsoLqN@redhat.com \
    --to=kwolf@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=kai.z.wang@intel.com \
    --cc=luhai.chen@intel.com \
    --cc=mst@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=yadong.qi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).