From: Stefan Hajnoczi <stefanha@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, pbonzini@redhat.com, afaria@redhat.com,
hreitz@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH 1/5] file-posix: Support FUA writes
Date: Mon, 10 Mar 2025 18:41:58 +0800
Message-ID: <20250310104158.GA359802@fedora>
In-Reply-To: <20250307221634.71951-2-kwolf@redhat.com>
On Fri, Mar 07, 2025 at 11:16:30PM +0100, Kevin Wolf wrote:
> Until now, FUA was always emulated with a separate flush after the write
> for file-posix. The overhead of processing a second request can reduce
> performance significantly for a guest disk that has disabled the write
> cache, especially if the host disk is already write through, too, and
> the flush isn't actually doing anything.
>
> Advertise support for REQ_FUA in write requests and implement it for
> Linux AIO and io_uring using the RWF_DSYNC flag for write requests. The
> thread pool still performs a separate fdatasync() call. This can be
> improved later by using the pwritev2() syscall if available.
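As a rough illustration of the pwritev2() idea for the thread pool path, here is a minimal sketch. It is not code from this patch: fua_pwrite() is a hypothetical helper name, and the fallback condition (ENOSYS or EOPNOTSUPP) is an assumption about which errors indicate missing support. It tries a single write with RWF_DSYNC and only falls back to the write plus fdatasync() emulation when the flag is unavailable.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): write one buffer at `offset`
 * with FUA semantics. Returns the number of bytes written, or -1 with
 * errno set.
 */
ssize_t fua_pwrite(int fd, const void *buf, size_t len, off_t offset)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    ssize_t ret;

#ifdef RWF_DSYNC
    /* One syscall: the write only completes once the data is durable. */
    ret = pwritev2(fd, &iov, 1, offset, RWF_DSYNC);
    if (ret >= 0 || (errno != ENOSYS && errno != EOPNOTSUPP)) {
        return ret;
    }
#endif
    /* Emulation: a plain write followed by a separate data flush. */
    ret = pwritev(fd, &iov, 1, offset);
    if (ret >= 0 && fdatasync(fd) < 0) {
        return -1;
    }
    return ret;
}
```

The #ifdef keeps the sketch building on systems whose headers do not define RWF_DSYNC, in which case only the emulation path is compiled.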
>
> As an example, this is how fio numbers can be improved in some scenarios
> with this patch (all using virtio-blk with cache=directsync on an nvme
> block device for the VM, fio with ioengine=libaio,direct=1,sync=1):
>
>                               | old           | with FUA support
> ------------------------------+---------------+-------------------
> bs=4k, iodepth=1, numjobs=1   | 45.6k iops    | 56.1k iops
> bs=4k, iodepth=1, numjobs=16  | 183.3k iops   | 236.0k iops
> bs=4k, iodepth=16, numjobs=1  | 258.4k iops   | 311.1k iops
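For reference, the relative gains in the table above work out to roughly 20% to 29%. A small sketch of the arithmetic follows; the function names speedup_percent() and print_speedups() are made up for this illustration, and the numbers are copied from the table (in thousands of IOPS).

```c
#include <stdio.h>

/* Relative improvement in percent when going from old_iops to new_iops. */
double speedup_percent(double old_iops, double new_iops)
{
    return (new_iops - old_iops) / old_iops * 100.0;
}

/* Print the relative gain for each row of the table (values in k IOPS). */
void print_speedups(void)
{
    static const struct {
        const char *config;
        double old_k;
        double new_k;
    } rows[] = {
        { "bs=4k, iodepth=1, numjobs=1",   45.6,  56.1 },
        { "bs=4k, iodepth=1, numjobs=16", 183.3, 236.0 },
        { "bs=4k, iodepth=16, numjobs=1", 258.4, 311.1 },
    };

    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
        printf("%-30s +%.1f%%\n", rows[i].config,
               speedup_percent(rows[i].old_k, rows[i].new_k));
    }
}
```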
>
> However, not all scenarios are clear wins. On another slower disk I saw
> little to no improvement. In fact, in two corner case scenarios, I even
> observed a regression, which I nevertheless consider acceptable:
>
> 1. On slow host disks in a write through cache mode, when the guest is
>    using virtio-blk in a separate iothread so that polling can be
>    enabled, and each completion is quickly followed up with a new
>    request (so that polling gets it), it can happen that enabling FUA
>    makes things slower - the additional very fast no-op flush we used to
>    have gave the adaptive polling algorithm a success so that it kept
>    polling. Without it, we only have the slow write request, which
>    disables polling. This is a problem in the polling algorithm that
>    will be fixed later in this series.
>
> 2. With a high queue depth, it can be beneficial to have flush requests
>    for another reason: The optimisation in bdrv_co_flush() that flushes
>    only once per write generation acts as a synchronisation mechanism
>    that lets all requests complete at the same time. This can result in
>    better batching and if the disk is very fast (I only saw this with a
>    null_blk backend), this can make up for the overhead of the flush and
>    improve throughput. In theory, we could optionally introduce a
>    similar artificial latency in the normal completion path to achieve
>    the same kind of completion batching. This is not implemented in this
>    series.
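To make scenario 1 more concrete, here is a deliberately simplified model of an adaptive polling window shared by all event sources. It is a sketch, not QEMU's implementation: adjust_poll_ns() and the three constants are hypothetical names and values chosen only for this illustration. The window is kept when polling caught the event, reset to zero when the wait exceeded the maximum, and grown otherwise.

```c
#include <stdint.h>

/* Hypothetical tunables, chosen only for this illustration. */
enum {
    POLL_MAX_NS   = 32000, /* upper bound for the polling window */
    POLL_START_NS = 4000,  /* window used when polling is (re)enabled */
    POLL_GROW     = 2      /* growth factor after a near miss */
};

/*
 * Given the current polling window and how long the event loop had to
 * wait for the last event, return the next polling window.
 * A result of 0 means polling is disabled.
 */
int64_t adjust_poll_ns(int64_t poll_ns, int64_t wait_ns)
{
    if (wait_ns <= poll_ns) {
        return poll_ns; /* polling caught the event: keep the window */
    }
    if (wait_ns > POLL_MAX_NS) {
        return 0; /* far too slow to catch by polling: stop polling */
    }
    /* The event was just out of reach: widen the window up to the cap. */
    int64_t next = poll_ns ? poll_ns * POLL_GROW : POLL_START_NS;
    return next > POLL_MAX_NS ? POLL_MAX_NS : next;
}
```

In this model, a stream consisting only of slow completions (say 100000 ns each) leaves the window at 0, whereas a fast event in between (say 1000 ns) re-opens it at POLL_START_NS. That mirrors how the extra no-op flush used to keep polling alive with a single shared window.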
>
> Compatibility is not a concern for io_uring, it has supported RWF_DSYNC
> from the start. Linux AIO started supporting it in Linux 4.13 and libaio
> 0.3.111. The kernel is not a problem for any supported build platform,
> so it's not necessary to add runtime checks. However, openSUSE is still
> stuck with an older libaio version that would break the build. We must
> detect this at build time to avoid build failures.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/raw-aio.h |  8 ++++++--
>  block/file-posix.c      | 26 ++++++++++++++++++--------
>  block/io_uring.c        | 13 ++++++++-----
>  block/linux-aio.c       | 24 +++++++++++++++++++++---
>  meson.build             |  4 ++++
>  5 files changed, 57 insertions(+), 18 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Thread overview: 14+ messages
2025-03-07 22:16 [PATCH 0/5] block: Improve writethrough performance Kevin Wolf
2025-03-07 22:16 ` [PATCH 1/5] file-posix: Support FUA writes Kevin Wolf
2025-03-10 10:41 ` Stefan Hajnoczi [this message]
2025-03-07 22:16 ` [PATCH 2/5] block/io: Ignore FUA with cache.no-flush=on Kevin Wolf
2025-03-10 10:42 ` Stefan Hajnoczi
2025-03-07 22:16 ` [PATCH 3/5] aio: Create AioPolledEvent Kevin Wolf
2025-03-10 10:55 ` Stefan Hajnoczi
2025-03-07 22:16 ` [PATCH 4/5] aio-posix: Factor out adjust_polling_time() Kevin Wolf
2025-03-10 10:55 ` Stefan Hajnoczi
2025-03-07 22:16 ` [PATCH 5/5] aio-posix: Separate AioPolledEvent per AioHandler Kevin Wolf
2025-03-10 10:55 ` Stefan Hajnoczi
2025-03-10 11:11 ` Kevin Wolf
2025-03-11 2:18 ` Stefan Hajnoczi
2025-03-10 10:55 ` [PATCH 0/5] block: Improve writethrough performance Stefan Hajnoczi