qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jean-Louis Dupond <jean-louis@dupond.be>
To: qemu-devel@nongnu.org, stefanha@redhat.com, fam@euphon.net,
	kwolf@redhat.com, hreitz@redhat.com
Subject: Re: [PATCH] block: Add zeroes discard option
Date: Mon, 15 May 2023 09:40:53 +0200	[thread overview]
Message-ID: <8614ebb6-5c03-be09-4778-977dd68920ed@dupond.be> (raw)
In-Reply-To: <20230510142720.71894-1-jean-louis@dupond.be>

Uploaded a new patch which might be better/cleaner :)
"[PATCH] qcow2: add discard-no-unref option"

That patch only applies to qcow2 and also passes unmap further down the 
storage stack.

On 10/05/2023 16:27, Jean-Louis Dupond wrote:
> When we for example have a sparse qcow2 image and discard: unmap is enabled,
> there can be a lot of fragmentation in the image after some time. Surely on VM's
> that do a lot of writes/deletes.
> This causes the qcow2 image to grow even over 110% of its virtual size,
> because the free gaps in the image get to small to allocate new
> continuous clusters. So it allocates new space as the end of the image.
>
> Disabling discard is not an option, as discard is needed to keep the
> incremental backup size as low as possible. Without discard, the
> incremental backups would become large, as qemu thinks it's just dirty
> blocks but it doesn't know the blocks are empty/useless.
> So we need to avoid fragmentation but also 'empty' the useless blocks in
> the image to have a small incremental backup.
>
> There are multiple ways to properly resolve this. One way is to not
> discard the blocks on a discard request, but just zero them. This causes
> the allocation the still exist, and results in no gaps.
> This should also cause a perfectly continuous image when using full
> preallocation.
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
> ---
>   block.c                              | 2 ++
>   block/io.c                           | 7 ++++++-
>   include/block/block-common.h         | 1 +
>   qapi/block-core.json                 | 3 ++-
>   qemu-nbd.c                           | 2 +-
>   qemu-options.hx                      | 2 +-
>   storage-daemon/qemu-storage-daemon.c | 2 +-
>   7 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/block.c b/block.c
> index 5ec1a3897e..ed21d115dd 100644
> --- a/block.c
> +++ b/block.c
> @@ -1146,6 +1146,8 @@ int bdrv_parse_discard_flags(const char *mode, int *flags)
>           /* do nothing */
>       } else if (!strcmp(mode, "on") || !strcmp(mode, "unmap")) {
>           *flags |= BDRV_O_UNMAP;
> +    } else if (!strcmp(mode, "zeroes")) {
> +        *flags |= BDRV_O_UNMAP_ZERO;
>       } else {
>           return -1;
>       }
> diff --git a/block/io.c b/block/io.c
> index 6fa1993374..dc7592e938 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2988,10 +2988,15 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
>       }
>   
>       /* Do nothing if disabled.  */
> -    if (!(bs->open_flags & BDRV_O_UNMAP)) {
> +    if (!(bs->open_flags & BDRV_O_UNMAP) &&
> +        !(bs->open_flags & BDRV_O_UNMAP_ZERO)) {
>           return 0;
>       }
>   
> +    if (bs->open_flags & BDRV_O_UNMAP_ZERO) {
> +        return bdrv_co_pwrite_zeroes(child, offset, bytes, 0);
> +    }
> +
>       if (!bs->drv->bdrv_co_pdiscard && !bs->drv->bdrv_aio_pdiscard) {
>           return 0;
>       }
> diff --git a/include/block/block-common.h b/include/block/block-common.h
> index b5122ef8ab..079ee390c3 100644
> --- a/include/block/block-common.h
> +++ b/include/block/block-common.h
> @@ -179,6 +179,7 @@ typedef enum {
>   #define BDRV_O_AUTO_RDONLY 0x20000 /* degrade to read-only if opening
>                                         read-write fails */
>   #define BDRV_O_IO_URING    0x40000 /* use io_uring instead of the thread pool */
> +#define BDRV_O_UNMAP_ZERO  0x80000 /* rewrite guest unmap to zero */
>   
>   #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
>   
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index c05ad0c07e..0f91d1a6b6 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2971,11 +2971,12 @@
>   #
>   # @ignore: Ignore the request
>   # @unmap: Forward as an unmap request
> +# @zeroes: Zero the clusters instead of unmapping (since 8.1)
>   #
>   # Since: 2.9
>   ##
>   { 'enum': 'BlockdevDiscardOptions',
> -  'data': [ 'ignore', 'unmap' ] }
> +  'data': [ 'ignore', 'unmap', 'zeroes' ] }
>   
>   ##
>   # @BlockdevDetectZeroesOptions:
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 6ff45308a9..6c0b326db4 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -148,7 +148,7 @@ static void usage(const char *name)
>   "                            valid options are: 'none', 'writeback' (default),\n"
>   "                            'writethrough', 'directsync' and 'unsafe'\n"
>   "      --aio=MODE            set AIO mode (native, io_uring or threads)\n"
> -"      --discard=MODE        set discard mode (ignore, unmap)\n"
> +"      --discard=MODE        set discard mode (ignore, unmap, zeroes)\n"
>   "      --detect-zeroes=MODE  set detect-zeroes mode (off, on, unmap)\n"
>   "      --image-opts          treat FILE as a full set of image options\n"
>   "\n"
> diff --git a/qemu-options.hx b/qemu-options.hx
> index b5efa648ba..7e9d383499 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1209,7 +1209,7 @@ SRST
>   ERST
>   
>   DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev,
> -    "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap]\n"
> +    "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap|zeroes]\n"
>       "          [,cache.direct=on|off][,cache.no-flush=on|off]\n"
>       "          [,read-only=on|off][,auto-read-only=on|off]\n"
>       "          [,force-share=on|off][,detect-zeroes=on|off|unmap]\n"
> diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
> index 0e9354faa6..08e8b1b3d9 100644
> --- a/storage-daemon/qemu-storage-daemon.c
> +++ b/storage-daemon/qemu-storage-daemon.c
> @@ -86,7 +86,7 @@ static void help(void)
>   "                         specify tracing options\n"
>   "  -V, --version          output version information and exit\n"
>   "\n"
> -"  --blockdev [driver=]<driver>[,node-name=<N>][,discard=ignore|unmap]\n"
> +"  --blockdev [driver=]<driver>[,node-name=<N>][,discard=ignore|unmap|zeroes]\n"
>   "             [,cache.direct=on|off][,cache.no-flush=on|off]\n"
>   "             [,read-only=on|off][,auto-read-only=on|off]\n"
>   "             [,force-share=on|off][,detect-zeroes=on|off|unmap]\n"


      reply	other threads:[~2023-05-15  7:41 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-10 14:27 [PATCH] block: Add zeroes discard option Jean-Louis Dupond
2023-05-15  7:40 ` Jean-Louis Dupond [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8614ebb6-5c03-be09-4778-977dd68920ed@dupond.be \
    --to=jean-louis@dupond.be \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).