qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Kevin Wolf <kwolf@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Brian Song <hibriansong@gmail.com>
Subject: Re: [PATCH v2 13/21] fuse: Manually process requests (without libfuse)
Date: Mon, 9 Jun 2025 12:54:56 -0400	[thread overview]
Message-ID: <20250609165456.GG29452@fedora> (raw)
In-Reply-To: <20250604132813.359438-14-hreitz@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3675 bytes --]

On Wed, Jun 04, 2025 at 03:28:05PM +0200, Hanna Czenczek wrote:
> Manually read requests from the /dev/fuse FD and process them, without
> using libfuse.  This allows us to safely add parallel request processing
> in coroutines later, without having to worry about libfuse internals.
> (Technically, we already have exactly that problem with
> read_from_fuse_export()/read_from_fuse_fd() nesting.)
> 
> We will continue to use libfuse for mounting the filesystem; fusermount3
> is a effectively a helper program of libfuse, so it should know best how
> to interact with it.  (Doing it manually without libfuse, while doable,
> is a bit of a pain, and it is not clear to me how stable the "protocol"
> actually is.)
> 
> Take this opportunity of quite a major rewrite to update the Copyright
> line with corrected information that has surfaced in the meantime.
> 
> Here are some benchmarks from before this patch (4k, iodepth=16, libaio;
> except 'sync', which are iodepth=1 and pvsync2):
> 
> file:
>   read:
>     seq aio:   78.6k ±1.3k IOPS
>     rand aio:  39.3k ±2.9k
>     seq sync:  32.5k ±0.7k
>     rand sync:  9.9k ±0.1k
>   write:
>     seq aio:   61.9k ±0.5k
>     rand aio:  61.2k ±0.6k
>     seq sync:  27.9k ±0.2k
>     rand sync: 27.6k ±0.4k
> null:
>   read:
>     seq aio:   214.0k ±5.9k
>     rand aio:  212.7k ±4.5k
>     seq sync:   90.3k ±6.5k
>     rand sync:  89.7k ±5.1k
>   write:
>     seq aio:   203.9k ±1.5k
>     rand aio:  201.4k ±3.6k
>     seq sync:   86.1k ±6.2k
>     rand sync:  84.9k ±5.3k
> 
> And with this patch applied:
> 
> file:
>   read:
>     seq aio:   76.6k ±1.8k (- 3 %)
>     rand aio:  26.7k ±0.4k (-32 %)
>     seq sync:  47.7k ±1.2k (+47 %)
>     rand sync: 10.1k ±0.2k (+ 2 %)
>   write:
>     seq aio:   58.1k ±0.5k (- 6 %)
>     rand aio:  58.1k ±0.5k (- 5 %)
>     seq sync:  36.3k ±0.3k (+30 %)
>     rand sync: 36.1k ±0.4k (+31 %)
> null:
>   read:
>     seq aio:   268.4k ±3.4k (+25 %)
>     rand aio:  265.3k ±2.1k (+25 %)
>     seq sync:  134.3k ±2.7k (+49 %)
>     rand sync: 132.4k ±1.4k (+48 %)
>   write:
>     seq aio:   275.3k ±1.7k (+35 %)
>     rand aio:  272.3k ±1.9k (+35 %)
>     seq sync:  130.7k ±1.6k (+52 %)
>     rand sync: 127.4k ±2.4k (+50 %)
> 
> So clearly the AIO file results are actually not good, and random reads
> are indeed quite terrible.  On the other hand, we can see from the sync
> and null results that request handling should in theory be quicker.  How
> does this fit together?
> 
> I believe the bad AIO results are an artifact of the accidental parallel
> request processing we have due to nested polling: Depending on how the
> actual request processing is structured and how long request processing
> takes, more or less requests will be submitted in parallel.  So because
> of the restructuring, I think this patch accidentally changes how many
> requests end up being submitted in parallel, which decreases
> performance.
> 
> (I have seen something like this before: In RSD, without having
> implemented a polling mode, the debug build tended to have better
> performance than the more optimized release build, because the debug
> build, taking longer to submit requests, ended up processing more
> requests in parallel.)
> 
> In any case, once we use coroutines throughout the code, performance
> will improve again across the board.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  block/export/fuse.c | 754 +++++++++++++++++++++++++++++++-------------
>  1 file changed, 535 insertions(+), 219 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2025-06-09 16:55 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-04 13:27 [PATCH v2 00/21] export/fuse: Use coroutines and multi-threading Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 01/21] fuse: Copy write buffer content before polling Hanna Czenczek
2025-06-09 14:45   ` Stefan Hajnoczi
2025-06-04 13:27 ` [PATCH v2 02/21] fuse: Ensure init clean-up even with error_fatal Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 03/21] fuse: Remove superfluous empty line Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 04/21] fuse: Explicitly set inode ID to 1 Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 05/21] fuse: Change setup_... to mount_fuse_export() Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 06/21] fuse: Fix mount options Hanna Czenczek
2025-06-04 13:27 ` [PATCH v2 07/21] fuse: Set direct_io and parallel_direct_writes Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 08/21] fuse: Introduce fuse_{at,de}tach_handlers() Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 09/21] fuse: Introduce fuse_{inc,dec}_in_flight() Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 10/21] fuse: Add halted flag Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 11/21] fuse: Rename length to blk_len in fuse_write() Hanna Czenczek
2025-06-09 14:48   ` Stefan Hajnoczi
2025-06-04 13:28 ` [PATCH v2 12/21] block: Move qemu_fcntl_addfl() into osdep.c Hanna Czenczek
2025-06-04 15:18   ` Eric Blake
2025-06-09 15:03   ` Stefan Hajnoczi
2025-07-01  7:24     ` Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 13/21] fuse: Manually process requests (without libfuse) Hanna Czenczek
2025-06-09 16:54   ` Stefan Hajnoczi [this message]
2025-06-04 13:28 ` [PATCH v2 14/21] fuse: Reduce max read size Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 15/21] fuse: Process requests in coroutines Hanna Czenczek
2025-06-05  8:12   ` Hanna Czenczek
2025-06-09 16:57   ` Stefan Hajnoczi
2025-06-04 13:28 ` [PATCH v2 16/21] block/export: Add multi-threading interface Hanna Czenczek
2025-06-04 13:58   ` Markus Armbruster
2025-06-09 17:00   ` Stefan Hajnoczi
2025-06-04 13:28 ` [PATCH v2 17/21] iotests/307: Test multi-thread export interface Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 18/21] fuse: Implement multi-threading Hanna Czenczek
2025-06-09 18:10   ` Stefan Hajnoczi
2025-06-27  1:08   ` Brian
2025-07-01  7:31     ` Hanna Czenczek
2025-06-04 13:28 ` [PATCH v2 19/21] qapi/block-export: Document FUSE's multi-threading Hanna Czenczek
2025-06-04 13:58   ` Markus Armbruster
2025-06-04 13:28 ` [PATCH v2 20/21] iotests/308: Add multi-threading sanity test Hanna Czenczek
2025-06-09 18:12   ` Stefan Hajnoczi
2025-06-04 13:28 ` [PATCH v2 21/21] fuse: Increase MAX_WRITE_SIZE with a second buffer Hanna Czenczek
2025-06-10 23:37   ` Brian
2025-06-11 13:46     ` Stefan Hajnoczi
2025-06-09 18:14 ` [PATCH v2 00/21] export/fuse: Use coroutines and multi-threading Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250609165456.GG29452@fedora \
    --to=stefanha@redhat.com \
    --cc=armbru@redhat.com \
    --cc=hibriansong@gmail.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).