All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Lieven <pl@kamp.de>
To: "Denis V. Lunev" <den@openvz.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/7] block: fix maximum length sent to bdrv_co_do_write_zeroes callback in bs
Date: Fri, 26 Dec 2014 14:13:58 +0100	[thread overview]
Message-ID: <549D5F16.2070007@kamp.de> (raw)
In-Reply-To: <1419597313-20514-2-git-send-email-den@openvz.org>

Am 26.12.2014 um 13:35 schrieb Denis V. Lunev:
> The check for maximum length was added by
>   commit c31cb70728d2c0c8900b35a66784baa446fd5147
>   Author: Peter Lieven <pl@kamp.de>
>   Date:   Thu Oct 24 12:06:58 2013 +0200
>     block: honour BlockLimits in bdrv_co_do_write_zeroes
>
> but actually if driver provides .bdrv_co_write_zeroes callback, there is
> no need to limit the size to 32 MB. Callback should provide effective
> implementation which normally should not write any zeroes in comparable
> amount.

NACK.

First there is no guarantee that bdrv_co_do_write_zeroes is a fast operation.
This heaviliy depends on several circumstances that the block layer is not aware of.
If a specific protocol knows it is very fast in writing zeroes under any circumstance
it should provide INT_MAX in bs->bl.max_write_zeroes. It is then still allowed to
return -ENOTSUP if the request size or alignment doesn't fit.

There are known backends e.g. anything that deals with SCSI that have a known
limitation of the maximum number of zeroes they can write fast in a single request.
This number MUST NOT be exceeded. The below patch would break all those backends.

What issue are you trying to fix with this patch? Maybe there is a better way to fix
it at another point in the code.

Peter

>
> Correct check should be as follows to honour BlockLimits for slow writes:
> - if bs->bl.max_write_zeroes is specified, it should be applied to all
>   paths (fast and slow)
> - MAX_WRITE_ZEROES_DEFAULT should be applied for slow path only
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/block.c b/block.c
> index 4165d42..0ec8b15 100644
> --- a/block.c
> +++ b/block.c
> @@ -3182,8 +3182,12 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>      struct iovec iov = {0};
>      int ret = 0;
>  
> -    int max_write_zeroes = bs->bl.max_write_zeroes ?
> -                           bs->bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
> +    int max_fast_write_size = nb_sectors;
> +    int max_slow_write_size = MAX_WRITE_ZEROES_DEFAULT;
> +    if (bs->bl.max_write_zeroes != 0) {
> +        max_fast_write_size = bs->bl.max_write_zeroes;
> +        max_slow_write_size = bs->bl.max_write_zeroes;
> +    }
>  
>      while (nb_sectors > 0 && !ret) {
>          int num = nb_sectors;
> @@ -3205,9 +3209,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>              }
>          }
>  
> -        /* limit request size */
> -        if (num > max_write_zeroes) {
> -            num = max_write_zeroes;
> +        /* limit request size for fast path */
> +        if (num > max_fast_write_size) {
> +            num = max_fast_write_size;
>          }
>  
>          ret = -ENOTSUP;
> @@ -3217,6 +3221,11 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>          }
>  
>          if (ret == -ENOTSUP) {
> +            /* limit request size further for slow path */
> +            if (num > max_slow_write_size) {
> +                num = max_slow_write_size;
> +            }
> +
>              /* Fall back to bounce buffer if write zeroes is unsupported */
>              iov.iov_len = num * BDRV_SECTOR_SIZE;
>              if (iov.iov_base == NULL) {
> @@ -3234,7 +3243,7 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
>              /* Keep bounce buffer around if it is big enough for all
>               * all future requests.
>               */
> -            if (num < max_write_zeroes) {
> +            if (num < max_slow_write_size) {
>                  qemu_vfree(iov.iov_base);
>                  iov.iov_base = NULL;
>              }

  reply	other threads:[~2014-12-26 13:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-26 12:35 [Qemu-devel] [PATCH v2 0/7] eliminate data write in bdrv_write_zeroes on Linux Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 1/7] block: fix maximum length sent to bdrv_co_do_write_zeroes callback in bs Denis V. Lunev
2014-12-26 13:13   ` Peter Lieven [this message]
2014-12-26 13:32     ` Denis V. Lunev
2014-12-26 19:15       ` Denis V. Lunev
2014-12-27 14:52         ` Peter Lieven
2014-12-27 17:42           ` Denis V. Lunev
2014-12-27 20:01             ` Peter Lieven
2014-12-27 20:14               ` Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 2/7] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 3/7] block/raw-posix: create do_fallocate helper Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 4/7] block/raw-posix: create translate_err helper to merge errno values Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 5/7] block/raw-posix: refactor handle_aiocb_write_zeroes a bit Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 6/7] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes Denis V. Lunev
2014-12-26 12:35 ` [Qemu-devel] [PATCH 7/7] block/raw-posix: call plain fallocate in handle_aiocb_write_zeroes Denis V. Lunev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=549D5F16.2070007@kamp.de \
    --to=pl@kamp.de \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.