From: Josef Bacik <josef@toxicpanda.com>
To: Bernd Schubert <bernd.schubert@fastmail.fm>
Cc: Bernd Schubert <bschubert@ddn.com>,
Miklos Szeredi <miklos@szeredi.hu>,
Amir Goldstein <amir73il@gmail.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH RFC v2 04/19] fuse: Add fuse-io-uring design documentation
Date: Thu, 30 May 2024 10:59:53 -0400 [thread overview]
Message-ID: <20240530145953.GB2205585@perftesting> (raw)
In-Reply-To: <8e756ed6-3b12-4afa-ad6a-94e9a56fd4be@fastmail.fm>
On Thu, May 30, 2024 at 02:50:30PM +0200, Bernd Schubert wrote:
>
>
> On 5/29/24 23:17, Josef Bacik wrote:
> > On Wed, May 29, 2024 at 08:00:39PM +0200, Bernd Schubert wrote:
> >> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> >> ---
> >> Documentation/filesystems/fuse-io-uring.rst | 167 ++++++++++++++++++++++++++++
> >> 1 file changed, 167 insertions(+)
> >>
> >> diff --git a/Documentation/filesystems/fuse-io-uring.rst b/Documentation/filesystems/fuse-io-uring.rst
> >> new file mode 100644
> >> index 000000000000..4aa168e3b229
> >> --- /dev/null
> >> +++ b/Documentation/filesystems/fuse-io-uring.rst
> >> @@ -0,0 +1,167 @@
> >> +.. SPDX-License-Identifier: GPL-2.0
> >> +
> >> +===============================
> >> +FUSE Uring design documentation
> >> +==============================
> >> +
> >> +This documentation covers basic details how the fuse
> >> +kernel/userspace communication through uring is configured
> >> +and works. For generic details about FUSE see fuse.rst.
> >> +
> >> +This document also covers the current interface, which is
> >> +still in development and might change.
> >> +
> >> +Limitations
> >> +===========
> >> +As of now not all requests types are supported through uring, userspace
> >
> > s/userspace side/userspace/
> >
> >> +side is required to also handle requests through /dev/fuse after
> >> +uring setup is complete. These are especially notifications (initiated
> >
> > especially is an awkward word choice here, I'm not quite sure what you're trying
> > say here, perhaps
> >
> > "Specifically notifications (initiated from the daemon side), interrupts and
> > forgets"
>
> Yep, thanks a lot! I removed forgets", these should be working over the ring
> in the mean time.
>
> >
> > ?
> >
> >> +from daemon side), interrupts and forgets.
> >> +Interrupts are probably not working at all when uring is used. At least
> >> +current state of libfuse will not be able to handle those for requests
> >> +on ring queues.
> >> +All these limitation will be addressed later.
> >> +
> >> +Fuse uring configuration
> >> +========================
> >> +
> >> +Fuse kernel requests are queued through the classical /dev/fuse
> >> +read/write interface - until uring setup is complete.
> >> +
> >> +In order to set up fuse-over-io-uring userspace has to send ioctls,
> >> +mmap requests in the right order
> >> +
> >> +1) FUSE_DEV_IOC_URING ioctl with FUSE_URING_IOCTL_CMD_RING_CFG
> >> +
> >> +First the basic kernel data structure has to be set up, using
> >> +FUSE_DEV_IOC_URING with subcommand FUSE_URING_IOCTL_CMD_RING_CFG.
> >> +
> >> +Example (from libfuse)
> >> +
> >> +static int fuse_uring_setup_kernel_ring(int session_fd,
> >> + int nr_queues, int sync_qdepth,
> >> + int async_qdepth, int req_arg_len,
> >> + int req_alloc_sz)
> >> +{
> >> + int rc;
> >> +
> >> + struct fuse_ring_config rconf = {
> >> + .nr_queues = nr_queues,
> >> + .sync_queue_depth = sync_qdepth,
> >> + .async_queue_depth = async_qdepth,
> >> + .req_arg_len = req_arg_len,
> >> + .user_req_buf_sz = req_alloc_sz,
> >> + .numa_aware = nr_queues > 1,
> >> + };
> >> +
> >> + struct fuse_uring_cfg ioc_cfg = {
> >> + .flags = 0,
> >> + .cmd = FUSE_URING_IOCTL_CMD_RING_CFG,
> >> + .rconf = rconf,
> >> + };
> >> +
> >> + rc = ioctl(session_fd, FUSE_DEV_IOC_URING, &ioc_cfg);
> >> + if (rc)
> >> + rc = -errno;
> >> +
> >> + return rc;
> >> +}
> >> +
> >> +2) MMAP
> >> +
> >> +For shared memory communication between kernel and userspace
> >> +each queue has to allocate and map memory buffer.
> >> +For numa awares kernel side verifies if the allocating thread
> >
> > This bit is awkwardly worded and there's some spelling mistakes. Perhaps
> > something like this?
> >
> > "For numa aware kernels, the kernel verifies that the allocating thread is bound
> > to a single core, as the kernel has the expectation that only a single thread
> > accesses a queue, and for numa aware memory allocation the core of the thread
> > sending the mmap request is used to identify the numa node"
>
> Thank you, updated. I actually consider to reduce this to a warning (will try
> to add an async FUSE_WARN request type for this and others). Issue is that
> systems cannot set up fuse-uring when a core is disabled.
>
> >
> >> +is bound to a single core - in general kernel side has expectations
> >> +that only a single thread accesses a queue and for numa aware
> >> +memory alloation the core of the thread sending the mmap request
> >> +is used to identify the numa node.
> >> +
> >> +The offsset parameter has to be FUSE_URING_MMAP_OFF to identify
> > ^^^^ "offset"
>
>
> Fixed.
>
> >
> >> +it is a request concerning fuse-over-io-uring.
> >> +
> >> +3) FUSE_DEV_IOC_URING ioctl with FUSE_URING_IOCTL_CMD_QUEUE_CFG
> >> +
> >> +This ioctl has to be send for every queue and takes the queue-id (qid)
> > ^^^^ "sent"
> >
> >> +and memory address obtained by mmap to set up queue data structures.
> >> +
> >> +Kernel - userspace interface using uring
> >> +========================================
> >> +
> >> +After queue ioctl setup and memory mapping userspace submits
> >
> > This needs a comma, so
> >
> > "After queue ioctl setup and memory mapping, userspace submites"
> >
> >> +SQEs (opcode = IORING_OP_URING_CMD) in order to fetch
> >> +fuse requests. Initial submit is with the sub command
> >> +FUSE_URING_REQ_FETCH, which will just register entries
> >> +to be available on the kernel side - it sets the according
> >
> > s/according/associated/ maybe?
> >
> >> +entry state and marks the entry as available in the queue bitmap.
>
> Or maybe like this?
>
> Initial submit is with the sub command FUSE_URING_REQ_FETCH, which
> will just register entries to be available in the kernel.
>
>
> >> +
> >> +Once all entries for all queues are submitted kernel side starts
> >> +to enqueue to ring queue(s). The request is copied into the shared
> >> +memory queue entry buffer and submitted as CQE to the userspace
> >> +side.
> >> +Userspace side handles the CQE and submits the result as subcommand
> >> +FUSE_URING_REQ_COMMIT_AND_FETCH - kernel side does completes the requests
> >
> > "the kernel completes the request"
>
> Yeah, now I see the bad grammar myself. Updated to
>
>
> Once all entries for all queues are submitted, kernel starts
> to enqueue to ring queues. The request is copied into the shared
> memory buffer and submitted as CQE to the daemon.
> Userspace handles the CQE/fuse-request and submits the result as
> subcommand FUSE_URING_REQ_COMMIT_AND_FETCH - kernel completes
> the requests and also marks the entry available again. If there are
> pending requests waiting the request will be immediately submitted
> to the daemon again.
>
>
>
> Thank you very much for your help to phrase this better!
>
This all looks great, thanks!
Josef
next prev parent reply other threads:[~2024-05-30 14:59 UTC|newest]
Thread overview: 113+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 18:00 [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 01/19] fuse: rename to fuse_dev_end_requests and make non-static Bernd Schubert
2024-05-29 21:09 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 02/19] fuse: Move fuse_get_dev to header file Bernd Schubert
2024-05-29 21:09 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 03/19] fuse: Move request bits Bernd Schubert
2024-05-29 21:10 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 04/19] fuse: Add fuse-io-uring design documentation Bernd Schubert
2024-05-29 21:17 ` Josef Bacik
2024-05-30 12:50 ` Bernd Schubert
2024-05-30 14:59 ` Josef Bacik [this message]
2024-05-29 18:00 ` [PATCH RFC v2 05/19] fuse: Add a uring config ioctl Bernd Schubert
2024-05-29 21:24 ` Josef Bacik
2024-05-30 12:51 ` Bernd Schubert
2024-06-03 13:03 ` Miklos Szeredi
2024-06-03 13:48 ` Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 06/19] Add a vmalloc_node_user function Bernd Schubert
2024-05-30 15:10 ` Josef Bacik
2024-05-30 16:13 ` Bernd Schubert
2024-05-31 13:56 ` Christoph Hellwig
2024-06-03 15:59 ` Kent Overstreet
2024-06-03 19:24 ` Bernd Schubert
2024-06-04 4:20 ` Christoph Hellwig
2024-06-07 2:30 ` Dave Chinner
2024-06-07 4:49 ` Christoph Hellwig
2024-06-04 4:08 ` Christoph Hellwig
2024-05-29 18:00 ` [PATCH RFC v2 07/19] fuse uring: Add an mmap method Bernd Schubert
2024-05-30 15:37 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 08/19] fuse: Add the queue configuration ioctl Bernd Schubert
2024-05-30 15:54 ` Josef Bacik
2024-05-30 17:49 ` Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 09/19] fuse: {uring} Add a dev_release exception for fuse-over-io-uring Bernd Schubert
2024-05-30 19:00 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 10/19] fuse: {uring} Handle SQEs - register commands Bernd Schubert
2024-05-30 19:55 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 11/19] fuse: Add support to copy from/to the ring buffer Bernd Schubert
2024-05-30 19:59 ` Josef Bacik
2024-09-01 11:56 ` Bernd Schubert
2024-09-01 11:56 ` Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 12/19] fuse: {uring} Add uring sqe commit and fetch support Bernd Schubert
2024-05-30 20:08 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 13/19] fuse: {uring} Handle uring shutdown Bernd Schubert
2024-05-30 20:21 ` Josef Bacik
2024-05-29 18:00 ` [PATCH RFC v2 14/19] fuse: {uring} Allow to queue to the ring Bernd Schubert
2024-05-30 20:32 ` Josef Bacik
2024-05-30 21:26 ` Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 15/19] export __wake_on_current_cpu Bernd Schubert
2024-05-30 20:37 ` Josef Bacik
2024-06-04 9:26 ` Peter Zijlstra
2024-06-04 9:36 ` Bernd Schubert
2024-06-04 19:27 ` Peter Zijlstra
2024-09-01 12:07 ` Bernd Schubert
2024-05-31 13:51 ` Christoph Hellwig
2024-05-29 18:00 ` [PATCH RFC v2 16/19] fuse: {uring} Wake requests on the the current cpu Bernd Schubert
2024-05-30 16:44 ` Shachar Sharon
2024-05-30 16:59 ` Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 17/19] fuse: {uring} Send async requests to qid of core + 1 Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 18/19] fuse: {uring} Set a min cpu offset io-size for reads/writes Bernd Schubert
2024-05-29 18:00 ` [PATCH RFC v2 19/19] fuse: {uring} Optimize async sends Bernd Schubert
2024-05-31 16:24 ` Jens Axboe
2024-05-31 17:36 ` Bernd Schubert
2024-05-31 19:10 ` Jens Axboe
2024-06-01 16:37 ` Bernd Schubert
2024-05-30 7:07 ` [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Amir Goldstein
2024-05-30 12:09 ` Bernd Schubert
2024-05-30 15:36 ` Kent Overstreet
2024-05-30 16:02 ` Bernd Schubert
2024-05-30 16:10 ` Kent Overstreet
2024-05-30 16:17 ` Bernd Schubert
2024-05-30 17:30 ` Kent Overstreet
2024-05-30 19:09 ` Josef Bacik
2024-05-30 20:05 ` Kent Overstreet
2024-05-31 3:53 ` [PATCH] fs: sys_ringbuffer() (WIP) Kent Overstreet
2024-05-31 13:11 ` kernel test robot
2024-05-31 15:49 ` kernel test robot
2024-05-30 16:21 ` [PATCH RFC v2 00/19] fuse: fuse-over-io-uring Jens Axboe
2024-05-30 16:32 ` Bernd Schubert
2024-05-30 17:26 ` Jens Axboe
2024-05-30 17:16 ` Kent Overstreet
2024-05-30 17:28 ` Jens Axboe
2024-05-30 17:58 ` Kent Overstreet
2024-05-30 18:48 ` Jens Axboe
2024-05-30 19:35 ` Kent Overstreet
2024-05-31 0:11 ` Jens Axboe
2024-06-04 23:45 ` Ming Lei
2024-05-30 20:47 ` Josef Bacik
2024-06-11 8:20 ` Miklos Szeredi
2024-06-11 10:26 ` Bernd Schubert
2024-06-11 15:35 ` Miklos Szeredi
2024-06-11 17:37 ` Bernd Schubert
2024-06-11 23:35 ` Kent Overstreet
2024-06-12 13:53 ` Bernd Schubert
2024-06-12 14:19 ` Kent Overstreet
2024-06-12 15:40 ` Bernd Schubert
2024-06-12 15:55 ` Kent Overstreet
2024-06-12 16:15 ` Bernd Schubert
2024-06-12 16:24 ` Kent Overstreet
2024-06-12 16:44 ` Bernd Schubert
2024-06-12 7:39 ` Miklos Szeredi
2024-06-12 13:32 ` Bernd Schubert
2024-06-12 13:46 ` Bernd Schubert
2024-06-12 14:07 ` Miklos Szeredi
2024-06-12 14:56 ` Bernd Schubert
2024-08-02 23:03 ` Bernd Schubert
2024-08-29 22:32 ` Bernd Schubert
2024-08-30 13:12 ` Jens Axboe
2024-08-30 13:28 ` Bernd Schubert
2024-08-30 13:33 ` Jens Axboe
2024-08-30 14:55 ` Pavel Begunkov
2024-08-30 15:10 ` Bernd Schubert
2024-08-30 20:08 ` Jens Axboe
2024-08-31 0:02 ` Bernd Schubert
2024-08-31 0:49 ` Bernd Schubert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240530145953.GB2205585@perftesting \
--to=josef@toxicpanda.com \
--cc=amir73il@gmail.com \
--cc=bernd.schubert@fastmail.fm \
--cc=bschubert@ddn.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).