From: Stefan Hajnoczi <stefanha@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, pbonzini@redhat.com, afaria@redhat.com,
hreitz@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH 1/5] file-posix: Support FUA writes
Date: Mon, 10 Mar 2025 18:41:58 +0800
Message-ID: <20250310104158.GA359802@fedora>
In-Reply-To: <20250307221634.71951-2-kwolf@redhat.com>
On Fri, Mar 07, 2025 at 11:16:30PM +0100, Kevin Wolf wrote:
> Until now, FUA was always emulated with a separate flush after the write
> for file-posix. The overhead of processing a second request can reduce
> performance significantly for a guest disk that has disabled the write
> cache, especially if the host disk is already write through, too, and
> the flush isn't actually doing anything.
>
> Advertise support for REQ_FUA in write requests and implement it for
> Linux AIO and io_uring using the RWF_DSYNC flag for write requests. The
> thread pool still performs a separate fdatasync() call. This can be
> improved later by using the pwritev2() syscall if available.
>
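
To illustrate the pwritev2() follow-up mentioned above, here is a minimal
sketch of how a thread-pool style write could request FUA semantics in one
syscall. The helper name write_fua() is hypothetical and not part of this
patch; the fallback to pwritev() plus fdatasync() and the errno values
checked are assumptions about a reasonable approach, not what the series
implements. Short writes are not retried here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write one buffer so that the
 * data is durable when the call returns. Prefers a single pwritev2() with
 * RWF_DSYNC and falls back to pwritev() followed by fdatasync().
 */
static ssize_t write_fua(int fd, const void *buf, size_t len, off_t offset)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    ssize_t ret;

#ifdef RWF_DSYNC
    /* One syscall: the write itself carries the data-sync requirement. */
    ret = pwritev2(fd, &iov, 1, offset, RWF_DSYNC);
    if (ret >= 0 || (errno != ENOSYS && errno != EOPNOTSUPP)) {
        return ret;
    }
    /* Syscall or flag unavailable: fall through to the emulation. */
#endif

    /* Emulated FUA: plain write, then a separate flush. */
    ret = pwritev(fd, &iov, 1, offset);
    if (ret >= 0 && fdatasync(fd) < 0) {
        return -1;
    }
    return ret;
}
```

The #ifdef only covers libc headers that lack RWF_DSYNC; the errno check
covers a libc or kernel that rejects the call at run time.
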
> As an example, this is how fio numbers can be improved in some scenarios
> with this patch (all using virtio-blk with cache=directsync on an nvme
> block device for the VM, fio with ioengine=libaio,direct=1,sync=1):
>
>                               | old           | with FUA support
> ------------------------------+---------------+-------------------
> bs=4k, iodepth=1, numjobs=1   |  45.6k iops   |  56.1k iops
> bs=4k, iodepth=1, numjobs=16  | 183.3k iops   | 236.0k iops
> bs=4k, iodepth=16, numjobs=1  | 258.4k iops   | 311.1k iops
>
> However, not all scenarios are clear wins. On another slower disk I saw
> little to no improvement. In fact, in two corner-case scenarios I even
> observed a regression, which I nevertheless consider acceptable:
>
> 1. On slow host disks in a write through cache mode, when the guest is
>    using virtio-blk in a separate iothread so that polling can be
>    enabled, and each completion is quickly followed up with a new
>    request (so that polling gets it), it can happen that enabling FUA
>    makes things slower - the additional very fast no-op flush we used to
>    have gave the adaptive polling algorithm a success so that it kept
>    polling. Without it, we only have the slow write request, which
>    disables polling. This is a problem in the polling algorithm that
>    will be fixed later in this series.
>
> 2. With a high queue depth, it can be beneficial to have flush requests
>    for another reason: The optimisation in bdrv_co_flush() that flushes
>    only once per write generation acts as a synchronisation mechanism
>    that lets all requests complete at the same time. This can result in
>    better batching and if the disk is very fast (I only saw this with a
>    null_blk backend), this can make up for the overhead of the flush and
>    improve throughput. In theory, we could optionally introduce a
>    similar artificial latency in the normal completion path to achieve
>    the same kind of completion batching. This is not implemented in this
>    series.
>
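
The polling effect described in case 1 can be modelled with a small sketch.
This is a simplified, hypothetical model of an adaptive polling window, not
the code in aio-posix.c: the function names, the constants (32768 ns
maximum, 4000 ns starting window, growth factor 2) and the "reset to zero
on a slow event" rule are assumptions chosen to mirror the behaviour the
commit message describes.

```c
#include <stdint.h>

/* Assumed tunables for this sketch only. */
enum { POLL_MAX_NS = 32768, POLL_START_NS = 4000, POLL_GROW = 2 };

/*
 * Return the next polling window given the current one and how long the
 * event loop had to wait (block_ns) for the last event.
 */
static int64_t adjust_poll_ns(int64_t poll_ns, int64_t block_ns)
{
    if (block_ns <= poll_ns) {
        return poll_ns;             /* polling caught the event: keep it */
    }
    if (block_ns > POLL_MAX_NS) {
        return 0;                   /* too slow to ever poll for: disable */
    }
    /* Event was just outside the window: grow it, capped at the maximum. */
    poll_ns = poll_ns ? poll_ns * POLL_GROW : POLL_START_NS;
    return poll_ns > POLL_MAX_NS ? POLL_MAX_NS : poll_ns;
}

/* Replay a sequence of wait times, starting with polling disabled. */
static int64_t replay(const int64_t *block_ns, int n)
{
    int64_t poll_ns = 0;

    for (int i = 0; i < n; i++) {
        poll_ns = adjust_poll_ns(poll_ns, block_ns[i]);
    }
    return poll_ns;
}
```

In this model, replaying slow writes that are each followed by a fast no-op
flush (for example 100000 ns, 1000 ns, repeated) leaves a non-zero window
when the next guest request arrives, while replaying only slow writes
leaves the window at zero. That matches the regression described above,
under the stated assumptions.
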
> Compatibility is not a concern for io_uring, which has supported RWF_DSYNC
> from the start. Linux AIO started supporting it in Linux 4.13 and libaio
> 0.3.111. The kernel is not a problem for any supported build platform,
> so it's not necessary to add runtime checks. However, openSUSE is still
> stuck with an older libaio version that would break the build. We must
> detect this at build time to avoid build failures.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/raw-aio.h |  8 ++++++--
>  block/file-posix.c      | 26 ++++++++++++++++++--------
>  block/io_uring.c        | 13 ++++++++-----
>  block/linux-aio.c       | 24 +++++++++++++++++++++---
>  meson.build             |  4 ++++
>  5 files changed, 57 insertions(+), 18 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>