From: Kevin Wolf <kwolf@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Brian Song <hibriansong@gmail.com>
Subject: Re: [PATCH v3 21/21] fuse: Increase MAX_WRITE_SIZE with a second buffer
Date: Fri, 31 Oct 2025 13:33:28 +0100 [thread overview]
Message-ID: <aQSsmAOnj2tAeNxx@redhat.com> (raw)
In-Reply-To: <c9339232-476d-4074-9150-ea7c154658b7@redhat.com>
Am 31.10.2025 um 13:13 hat Hanna Czenczek geschrieben:
> On 23.10.25 17:11, Kevin Wolf wrote:
> > Am 01.07.2025 um 13:44 hat Hanna Czenczek geschrieben:
> > > We probably want to support larger write sizes than just 4k; 64k seems
> > > nice. However, we cannot read partial requests from the FUSE FD, we
> > > always have to read requests in full; so our read buffer must be large
> > > enough to accommodate potential 64k writes if we want to support that.
> > >
> > > Always allocating FuseRequest objects with 64k buffers in them seems
> > > wasteful, though. But we can get around the issue by splitting the
> > > buffer into two and using readv(): One part will hold all normal (up to
> > > 4k) write requests and all other requests, and a second part (the
> > > "spill-over buffer") will be used only for larger write requests. Each
> > > FuseQueue has its own spill-over buffer, and only if we find it used
> > > when reading a request will we move its ownership into the FuseRequest
> > > object and allocate a new spill-over buffer for the queue.
> > >
> > > This way, we get to support "large" write sizes without having to
> > > allocate big buffers when they aren't used.
> > >
> > > Also, this even reduces the size of the FuseRequest objects because the
> > > read buffer has to have at least FUSE_MIN_READ_BUFFER (8192) bytes; but
> > > the requests we support are not quite so large (except for >4k writes),
> > > so until now, we basically had to have useless padding in there.
> > >
> > > With the spill-over buffer added, the FUSE_MIN_READ_BUFFER requirement
> > > is easily met and we can decrease the size of the buffer portion that is
> > > right inside of FuseRequest.
> > >
> > > As for benchmarks, the benefit of this patch can be shown easily by
> > > writing a 4G image (with qemu-img convert) to a FUSE export:
> > > - Before this patch: Takes 25.6 s (14.4 s with -t none)
> > > - After this patch: Takes 4.5 s (5.5 s with -t none)
> > >
> > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > The commit message seems outdated, there is no such thing as a
> > FuseRequest object.
> >
> > I agree with the idea of allocating a separate buffer for the data to be
> > written. I'm not so sure that the approach taken here with combining an
> > in-place and a spillover buffer does actually do much for us in exchange
> > for the additional complexity.
> >
> > The allocation + memcpy for in_place buf in fuse_co_write() bothers me a
> > bit. I'd rather have a buffer for the data to write that can be directly
> > used. And let's be real, we already allocate a 1 MB stack per request. I
> > don't think 64k more or less make a big difference, but it would allow
> > us to save the memcpy() for 4k requests and additionally an allocation
> > for larger requests.
> >
> > The tradeoff when you use an iov for the buffer in FuseQueue that is
> > only big enough for the header and fuse_write_in and then directly the
> > per-request buffer that is owned by the coroutine is that for requests
> > that are larger than fuse_write_in, you'll have to copy the rest back
> > from the data buffer first. This seems to be only fuse_setattr_in, which
> > shouldn't be a hot path at all, and only a few bytes.
>
> So I understand that first, you disagree with “Always allocating FuseRequest
> objects with 64k buffers in them seems wasteful, though.” I.e. to just use a
> 64k buffer per request. OK, fair.
I think in practice most write requests will exceed the 4k anyway, so
we'd already use the spillover buffer. Maybe the most common exception
is fio, if we want to optimise for that one. :-)
> Second, you suggest to improve performance by having an aligned 64k data
> buffer separate from the request metadata buffer to save on memcpy(). I did
> consider this, but discarded it because of I was afraid of the complexity.
> Actually probably too afraid.
>
> I’ll take a look, it actually can’t be too hard. (Just have two buffers;
> make the metadata buffer long enough to capture all request headers, but
> only pass the sizeof(fuse_write_in)-long head into readv(), then check the
> request opcode. If metadata spilled over to the data buffer, copy it back
> into the “shadowed” metadata buffer tail.)
Yes, that's what I had in mind. Not sure if it's premature
optimisation...
Kevin
next prev parent reply other threads:[~2025-10-31 12:35 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 11:44 [PATCH v3 00/21] export/fuse: Use coroutines and multi-threading Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 01/21] fuse: Copy write buffer content before polling Hanna Czenczek
2025-10-13 13:48 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 02/21] fuse: Ensure init clean-up even with error_fatal Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 03/21] fuse: Remove superfluous empty line Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 04/21] fuse: Explicitly set inode ID to 1 Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 05/21] fuse: Change setup_... to mount_fuse_export() Hanna Czenczek
2025-10-13 13:50 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 06/21] fuse: Fix mount options Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 07/21] fuse: Set direct_io and parallel_direct_writes Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 08/21] fuse: Introduce fuse_{at,de}tach_handlers() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 09/21] fuse: Introduce fuse_{inc,dec}_in_flight() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 10/21] fuse: Add halted flag Hanna Czenczek
2025-10-13 16:18 ` Kevin Wolf
2025-07-01 11:44 ` [PATCH v3 11/21] fuse: Rename length to blk_len in fuse_write() Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 12/21] block: Move qemu_fcntl_addfl() into osdep.c Hanna Czenczek
2025-07-30 17:10 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 13/21] fuse: Manually process requests (without libfuse) Hanna Czenczek
2025-10-22 12:19 ` Kevin Wolf
2025-10-31 11:55 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 14/21] fuse: Reduce max read size Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 15/21] fuse: Process requests in coroutines Hanna Czenczek
2025-10-22 12:53 ` Kevin Wolf
2025-10-31 11:55 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 16/21] block/export: Add multi-threading interface Hanna Czenczek
2025-10-22 12:57 ` Kevin Wolf
2025-10-31 11:56 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 17/21] iotests/307: Test multi-thread export interface Hanna Czenczek
2025-07-30 17:12 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 18/21] fuse: Implement multi-threading Hanna Czenczek
2025-07-30 17:18 ` Stefan Hajnoczi
2025-10-23 11:47 ` Kevin Wolf
2025-10-31 12:05 ` Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 19/21] qapi/block-export: Document FUSE's multi-threading Hanna Czenczek
2025-07-30 17:19 ` Stefan Hajnoczi
2025-07-01 11:44 ` [PATCH v3 20/21] iotests/308: Add multi-threading sanity test Hanna Czenczek
2025-07-01 11:44 ` [PATCH v3 21/21] fuse: Increase MAX_WRITE_SIZE with a second buffer Hanna Czenczek
2025-10-23 15:11 ` Kevin Wolf
2025-10-31 12:13 ` Hanna Czenczek
2025-10-31 12:33 ` Kevin Wolf [this message]
2025-10-31 15:03 ` Hanna Czenczek
2025-10-31 15:50 ` Kevin Wolf
2025-07-30 17:19 ` [PATCH v3 00/21] export/fuse: Use coroutines and multi-threading Stefan Hajnoczi
2025-10-23 15:15 ` Kevin Wolf
2025-10-31 12:14 ` Hanna Czenczek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aQSsmAOnj2tAeNxx@redhat.com \
--to=kwolf@redhat.com \
--cc=armbru@redhat.com \
--cc=hibriansong@gmail.com \
--cc=hreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).