From: Hanna Czenczek <hreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Brian Song <hibriansong@gmail.com>
Subject: Re: [PATCH v3 21/21] fuse: Increase MAX_WRITE_SIZE with a second buffer
Date: Fri, 31 Oct 2025 13:13:56 +0100 [thread overview]
Message-ID: <c9339232-476d-4074-9150-ea7c154658b7@redhat.com> (raw)
In-Reply-To: <aPpFogd44fuISVKd@redhat.com>
On 23.10.25 17:11, Kevin Wolf wrote:
> Am 01.07.2025 um 13:44 hat Hanna Czenczek geschrieben:
>> We probably want to support larger write sizes than just 4k; 64k seems
>> nice. However, we cannot read partial requests from the FUSE FD, we
>> always have to read requests in full; so our read buffer must be large
>> enough to accommodate potential 64k writes if we want to support that.
>>
>> Always allocating FuseRequest objects with 64k buffers in them seems
>> wasteful, though. But we can get around the issue by splitting the
>> buffer into two and using readv(): One part will hold all normal (up to
>> 4k) write requests and all other requests, and a second part (the
>> "spill-over buffer") will be used only for larger write requests. Each
>> FuseQueue has its own spill-over buffer, and only if we find it used
>> when reading a request will we move its ownership into the FuseRequest
>> object and allocate a new spill-over buffer for the queue.
>>
>> This way, we get to support "large" write sizes without having to
>> allocate big buffers when they aren't used.
>>
>> Also, this even reduces the size of the FuseRequest objects because the
>> read buffer has to have at least FUSE_MIN_READ_BUFFER (8192) bytes; but
>> the requests we support are not quite so large (except for >4k writes),
>> so until now, we basically had to have useless padding in there.
>>
>> With the spill-over buffer added, the FUSE_MIN_READ_BUFFER requirement
>> is easily met and we can decrease the size of the buffer portion that is
>> right inside of FuseRequest.
>>
>> As for benchmarks, the benefit of this patch can be shown easily by
>> writing a 4G image (with qemu-img convert) to a FUSE export:
>> - Before this patch: Takes 25.6 s (14.4 s with -t none)
>> - After this patch: Takes 4.5 s (5.5 s with -t none)
>>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> The commit message seems outdated, there is no such thing as a
> FuseRequest object.
>
> I agree with the idea of allocating a separate buffer for the data to be
> written. I'm not so sure that the approach taken here with combining an
> in-place and a spillover buffer does actually do much for us in exchange
> for the additional complexity.
>
> The allocation + memcpy for in_place buf in fuse_co_write() bothers me a
> bit. I'd rather have a buffer for the data to write that can be directly
> used. And let's be real, we already allocate a 1 MB stack per request. I
> don't think 64k more or less make a big difference, but it would allow
> us to save the memcpy() for 4k requests and additionally an allocation
> for larger requests.
>
> The tradeoff when you use an iov for the buffer in FuseQueue that is
> only big enough for the header and fuse_write_in and then directly the
> per-request buffer that is owned by the coroutine is that for requests
> that are larger than fuse_write_in, you'll have to copy the rest back
> from the data buffer first. This seems to be only fuse_setattr_in, which
> shouldn't be a hot path at all, and only a few bytes.
So I understand that first, you disagree with “Always allocating
FuseRequest objects with 64k buffers in them seems wasteful, though.”
I.e. to just use a 64k buffer per request. OK, fair.
Second, you suggest to improve performance by having an aligned 64k data
buffer separate from the request metadata buffer to save on memcpy(). I
did consider this, but discarded it because of I was afraid of the
complexity. Actually probably too afraid.
I’ll take a look, it actually can’t be too hard. (Just have two
buffers; make the metadata buffer long enough to capture all request
headers, but only pass the sizeof(fuse_write_in)-long head into readv(),
then check the request opcode. If metadata spilled over to the data
buffer, copy it back into the “shadowed” metadata buffer tail.)
Hanna
> All that said, what you have is an obvious improvement over limiting
> write requests to 4k.
>
> Kevin
>
next prev parent reply other threads:[~2025-10-31 12:15 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 11:44 [PATCH v3 00/21] export/fuse: Use coroutines and multi-threading Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 01/21] fuse: Copy write buffer content before polling Hanna Czenczek
2025-10-13 13:48 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 02/21] fuse: Ensure init clean-up even with error_fatal Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 03/21] fuse: Remove superfluous empty line Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 04/21] fuse: Explicitly set inode ID to 1 Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 05/21] fuse: Change setup_... to mount_fuse_export() Hanna Czenczek
2025-10-13 13:50 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 06/21] fuse: Fix mount options Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 07/21] fuse: Set direct_io and parallel_direct_writes Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 08/21] fuse: Introduce fuse_{at,de}tach_handlers() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 09/21] fuse: Introduce fuse_{inc,dec}_in_flight() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 10/21] fuse: Add halted flag Hanna Czenczek
2025-10-13 16:18 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 11/21] fuse: Rename length to blk_len in fuse_write() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 12/21] block: Move qemu_fcntl_addfl() into osdep.c Hanna Czenczek
2025-07-30 17:10 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 13/21] fuse: Manually process requests (without libfuse) Hanna Czenczek
2025-10-22 12:19 ` Kevin Wolf
2025-10-31 11:55 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 14/21] fuse: Reduce max read size Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 15/21] fuse: Process requests in coroutines Hanna Czenczek
2025-10-22 12:53 ` Kevin Wolf
2025-10-31 11:55 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 16/21] block/export: Add multi-threading interface Hanna Czenczek
2025-10-22 12:57 ` Kevin Wolf
2025-10-31 11:56 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 17/21] iotests/307: Test multi-thread export interface Hanna Czenczek
2025-07-30 17:12 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 18/21] fuse: Implement multi-threading Hanna Czenczek
2025-07-30 17:18 ` Stefan Hajnoczi
2025-10-23 11:47 ` Kevin Wolf
2025-10-31 12:05 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 19/21] qapi/block-export: Document FUSE's multi-threading Hanna Czenczek
2025-07-30 17:19 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 20/21] iotests/308: Add multi-threading sanity test Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 21/21] fuse: Increase MAX_WRITE_SIZE with a second buffer Hanna Czenczek
2025-10-23 15:11 ` Kevin Wolf
2025-10-31 12:13 ` Hanna Czenczek [this message]
2025-10-31 12:33 ` Kevin Wolf
2025-10-31 15:03 ` Hanna Czenczek
2025-10-31 15:50 ` Kevin Wolf
2025-07-30 17:19 ` [PATCH v3 00/21] export/fuse: Use coroutines and multi-threading Stefan Hajnoczi
2025-10-23 15:15 ` Kevin Wolf
2025-10-31 12:14 ` Hanna Czenczek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c9339232-476d-4074-9150-ea7c154658b7@redhat.com \
--to=hreitz@redhat.com \
--cc=armbru@redhat.com \
--cc=hibriansong@gmail.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).