From: Jean-Louis Dupond <jean-louis@dupond.be>
To: qemu-devel@nongnu.org, stefanha@redhat.com, fam@euphon.net,
kwolf@redhat.com, hreitz@redhat.com
Subject: Re: [PATCH] block: Add zeroes discard option
Date: Mon, 15 May 2023 09:40:53 +0200 [thread overview]
Message-ID: <8614ebb6-5c03-be09-4778-977dd68920ed@dupond.be> (raw)
In-Reply-To: <20230510142720.71894-1-jean-louis@dupond.be>
Uploaded a new patch which might be better/cleaner :)
"[PATCH] qcow2: add discard-no-unref option"
That patch only applies to qcow2 and also passes unmap further down the
storage stack.
On 10/05/2023 16:27, Jean-Louis Dupond wrote:
> When we for example have a sparse qcow2 image and discard: unmap is enabled,
> there can be a lot of fragmentation in the image after some time. Surely on VM's
> that do a lot of writes/deletes.
> This causes the qcow2 image to grow even over 110% of its virtual size,
> because the free gaps in the image get to small to allocate new
> continuous clusters. So it allocates new space as the end of the image.
>
> Disabling discard is not an option, as discard is needed to keep the
> incremental backup size as low as possible. Without discard, the
> incremental backups would become large, as qemu thinks it's just dirty
> blocks but it doesn't know the blocks are empty/useless.
> So we need to avoid fragmentation but also 'empty' the useless blocks in
> the image to have a small incremental backup.
>
> There are multiple ways to properly resolve this. One way is to not
> discard the blocks on a discard request, but just zero them. This causes
> the allocation the still exist, and results in no gaps.
> This should also cause a perfectly continuous image when using full
> preallocation.
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
> ---
> block.c | 2 ++
> block/io.c | 7 ++++++-
> include/block/block-common.h | 1 +
> qapi/block-core.json | 3 ++-
> qemu-nbd.c | 2 +-
> qemu-options.hx | 2 +-
> storage-daemon/qemu-storage-daemon.c | 2 +-
> 7 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/block.c b/block.c
> index 5ec1a3897e..ed21d115dd 100644
> --- a/block.c
> +++ b/block.c
> @@ -1146,6 +1146,8 @@ int bdrv_parse_discard_flags(const char *mode, int *flags)
> /* do nothing */
> } else if (!strcmp(mode, "on") || !strcmp(mode, "unmap")) {
> *flags |= BDRV_O_UNMAP;
> + } else if (!strcmp(mode, "zeroes")) {
> + *flags |= BDRV_O_UNMAP_ZERO;
> } else {
> return -1;
> }
> diff --git a/block/io.c b/block/io.c
> index 6fa1993374..dc7592e938 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2988,10 +2988,15 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset,
> }
>
> /* Do nothing if disabled. */
> - if (!(bs->open_flags & BDRV_O_UNMAP)) {
> + if (!(bs->open_flags & BDRV_O_UNMAP) &&
> + !(bs->open_flags & BDRV_O_UNMAP_ZERO)) {
> return 0;
> }
>
> + if (bs->open_flags & BDRV_O_UNMAP_ZERO) {
> + return bdrv_co_pwrite_zeroes(child, offset, bytes, 0);
> + }
> +
> if (!bs->drv->bdrv_co_pdiscard && !bs->drv->bdrv_aio_pdiscard) {
> return 0;
> }
> diff --git a/include/block/block-common.h b/include/block/block-common.h
> index b5122ef8ab..079ee390c3 100644
> --- a/include/block/block-common.h
> +++ b/include/block/block-common.h
> @@ -179,6 +179,7 @@ typedef enum {
> #define BDRV_O_AUTO_RDONLY 0x20000 /* degrade to read-only if opening
> read-write fails */
> #define BDRV_O_IO_URING 0x40000 /* use io_uring instead of the thread pool */
> +#define BDRV_O_UNMAP_ZERO 0x80000 /* rewrite guest unmap to zero */
>
> #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index c05ad0c07e..0f91d1a6b6 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2971,11 +2971,12 @@
> #
> # @ignore: Ignore the request
> # @unmap: Forward as an unmap request
> +# @zeroes: Zero the clusters instead of unmapping (since 8.1)
> #
> # Since: 2.9
> ##
> { 'enum': 'BlockdevDiscardOptions',
> - 'data': [ 'ignore', 'unmap' ] }
> + 'data': [ 'ignore', 'unmap', 'zeroes' ] }
>
> ##
> # @BlockdevDetectZeroesOptions:
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 6ff45308a9..6c0b326db4 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -148,7 +148,7 @@ static void usage(const char *name)
> " valid options are: 'none', 'writeback' (default),\n"
> " 'writethrough', 'directsync' and 'unsafe'\n"
> " --aio=MODE set AIO mode (native, io_uring or threads)\n"
> -" --discard=MODE set discard mode (ignore, unmap)\n"
> +" --discard=MODE set discard mode (ignore, unmap, zeroes)\n"
> " --detect-zeroes=MODE set detect-zeroes mode (off, on, unmap)\n"
> " --image-opts treat FILE as a full set of image options\n"
> "\n"
> diff --git a/qemu-options.hx b/qemu-options.hx
> index b5efa648ba..7e9d383499 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1209,7 +1209,7 @@ SRST
> ERST
>
> DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev,
> - "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap]\n"
> + "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap|zeroes]\n"
> " [,cache.direct=on|off][,cache.no-flush=on|off]\n"
> " [,read-only=on|off][,auto-read-only=on|off]\n"
> " [,force-share=on|off][,detect-zeroes=on|off|unmap]\n"
> diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
> index 0e9354faa6..08e8b1b3d9 100644
> --- a/storage-daemon/qemu-storage-daemon.c
> +++ b/storage-daemon/qemu-storage-daemon.c
> @@ -86,7 +86,7 @@ static void help(void)
> " specify tracing options\n"
> " -V, --version output version information and exit\n"
> "\n"
> -" --blockdev [driver=]<driver>[,node-name=<N>][,discard=ignore|unmap]\n"
> +" --blockdev [driver=]<driver>[,node-name=<N>][,discard=ignore|unmap|zeroes]\n"
> " [,cache.direct=on|off][,cache.no-flush=on|off]\n"
> " [,read-only=on|off][,auto-read-only=on|off]\n"
> " [,force-share=on|off][,detect-zeroes=on|off|unmap]\n"
prev parent reply other threads:[~2023-05-15 7:41 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-10 14:27 [PATCH] block: Add zeroes discard option Jean-Louis Dupond
2023-05-15 7:40 ` Jean-Louis Dupond [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8614ebb6-5c03-be09-4778-977dd68920ed@dupond.be \
--to=jean-louis@dupond.be \
--cc=fam@euphon.net \
--cc=hreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).