From: Jeff Layton <jlayton@kernel.org>
To: Joanne Koong <joannelkoong@gmail.com>, miklos@szeredi.hu
Cc: bernd@bsbernd.com, axboe@kernel.dk, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 10/14] fuse: add io-uring buffer rings
Date: Thu, 30 Apr 2026 12:08:55 +0100
Message-ID: <c6977f02cd8c341b80ec5e7cc1bf67e0e455d859.camel@kernel.org>
In-Reply-To: <20260402162840.2989717-11-joannelkoong@gmail.com>

On Thu, 2026-04-02 at 09:28 -0700, Joanne Koong wrote:
> Add fuse buffer rings for servers communicating through the io-uring
> interface. To use this, the server must set the FUSE_URING_BUFRING
> flag and provide header and payload buffers via an iovec array in the
> sqe during registration. The payload buffers are used to back the buffer
> ring. The kernel manages buffer selection and recycling through a simple
> internal ring.
> 
> This has the following advantages over the non-bufring (iovec) path:
> - Reduced memory usage: in the iovec path, each entry has its own
>   dedicated payload buffer, requiring N buffers for N entries where each
>   buffer must be large enough to accommodate the maximum possible
>   payload size. With buffer rings, payload buffers are pooled and
>   selected on demand. Entries only hold a buffer while actively
>   processing a request with payload data. When incremental buffer
>   consumption is added, this will allow non-overlapping regions of a
>   single buffer to be used simultaneously across multiple requests,
>   further reducing memory requirements.
> - Foundation for pinned buffers: the buffer ring headers and payloads
>   are now each passed in as a contiguous memory allocation, which allows
>   fuse to easily pin and vmap the entire region in one operation during
>   queue setup. This will eliminate the per-request overhead of having to
>   pin/unpin user pages and translate virtual addresses and is a
>   prerequisite for future optimizations like performing data copies
>   outside of the server's task context.
> 
> Each ring entry gets a fixed ID (sqe->buf_index) that maps to a specific
> header slot in the headers buffer. Payload buffers are selected from
> the ring on demand and recycled after each request. Buffer ring usage is
> set on a per-queue basis. All subsequent registration SQEs for the same
> queue must use consistent flags.
> 
> The headers are laid out contiguously and provided via iov[0]. Each slot
> maps to ent->id:
> 
> |<- headers_size (>= queue_depth * sizeof(fuse_uring_req_header)) ->|
> +------------------------------+------------------------------+-----+
> | struct fuse_uring_req_header | struct fuse_uring_req_header | ... |
> |        [ent id=0]            |        [ent id=1]            |     |
> +------------------------------+------------------------------+-----+
> 
> On the server side, the ent id is used to determine where in the headers
> buffer the header data for the ent resides. This is done by
> calculating ent_id * sizeof(struct fuse_uring_req_header) as the offset
> into the headers buffer.
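
Side note for anyone wiring a server up against this: the slot lookup
is plain pointer arithmetic. A minimal sketch ("headers_base" is my
placeholder for the start of the iov[0] registration buffer, not a
name the patch defines; needs <linux/fuse.h>):

	static struct fuse_uring_req_header *
	ent_header_slot(void *headers_base, unsigned int ent_id)
	{
		/* each ent owns one fixed slot in the headers area */
		return (struct fuse_uring_req_header *)
			((char *)headers_base +
			 (size_t)ent_id * sizeof(struct fuse_uring_req_header));
	}
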
> 
> The buffer ring is backed by the payload buffer, which is contiguous but
> partitioned into individual bufs according to the buf_size passed in at
> registration.
> 
>   PAYLOAD BUFFER POOL (contiguous, provided via iov[1]):
>     |<-------------- payload_size ------------>|
>     +------------+-----------+-----------+-----+
>     |  buf [0]   |  buf [1]  |  buf [2]  | ... |
>     |  buf_size  |  buf_size |  buf_size | ... |
>     +------------+-----------+-----------+-----+
> 
> buffer ring state (struct fuse_bufring, kernel-internal):
> bufs[]: [ used | used | FREE | FREE | FREE ]
>                         ^^^^^^^^^^^^^^^^^^
>                         available for selection
> 
> The buffer ring logic is as follows:
> select:  buf = bufs[head % nbufs]; head++
> recycle: bufs[tail % nbufs] = buf; tail++
> empty:   tail == head (no buffers available)
> full:    tail - head >= nbufs
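
That is, the classic free-running counter scheme: head and tail only
ever increase and are reduced mod nbufs when indexing, so "tail -
head" is the number of free buffers. In miniature (my sketch of the
same logic, not code lifted from the patch; C with <stdbool.h>):

	struct ring {
		unsigned int head, tail, nbufs;
		unsigned int bufs[];	/* free buffer ids */
	};

	/* select: take the oldest free buffer; empty when tail == head */
	static bool ring_select(struct ring *r, unsigned int *id)
	{
		if (r->tail == r->head)
			return false;	/* the patch returns -ENOBUFS here */
		*id = r->bufs[r->head % r->nbufs];
		r->head++;
		return true;
	}

	/* recycle: the caller guarantees tail - head < nbufs */
	static void ring_recycle(struct ring *r, unsigned int id)
	{
		r->bufs[r->tail % r->nbufs] = id;
		r->tail++;
	}
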
> 
> Buffer ring request flow
> ------------------------
> |  Kernel                                  |  FUSE daemon
> |                                          |
> |  [client request arrives]                |
> |  >fuse_uring_send()                      |
> |    [select payload buf from ring]        |
> |    >fuse_uring_select_buffer()           |
> |    [copy headers to ent's header slot]   |
> |    >copy_header_to_ring()                |
> |    [copy payload to selected buf]        |
> |    >fuse_uring_copy_to_ring()            |
> |    [set buf_id in ent_in_out header]     |
> |    >io_uring_cmd_done()                  |
> |                                          |  [CQE received]
> |                                          |  [read headers from header
> |                                          |    slot]
> |                                          |  [read payload from buf_id]
> |                                          |  [process request]
> |                                          |  [write reply to header
> |                                          |    slot]
> |                                          |  [write reply payload to
> |                                          |    buf]
> |                                          |  >io_uring_submit()
> |                                          |   COMMIT_AND_FETCH
> |  >fuse_uring_commit_fetch()              |
> |    >fuse_uring_commit()                  |
> |     [copy reply from ring]               |
> |     >fuse_uring_recycle_buffer()         |
> |    >fuse_uring_get_next_fuse_req()       |
> 
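
Tying the two layouts together: on the daemon side, locating the
reply buffers after a CQE is two address computations. A sketch,
reusing ent_header_slot() from above; payload_base and buf_size are
my placeholders for the iov[1] buffer and the size registered with
it, and ring_ent_in_out is the existing trailer of struct
fuse_uring_req_header:

	static void *ent_payload(void *payload_base, size_t buf_size,
				 uint16_t buf_id)
	{
		/* buffers are laid out back to back in the iov[1] pool */
		return (char *)payload_base + (size_t)buf_id * buf_size;
	}

	/* on CQE receipt for the entry registered with buf_index == ent_id */
	struct fuse_uring_req_header *hdr =
		ent_header_slot(headers_base, ent_id);
	void *payload = ent_payload(payload_base, buf_size,
				    hdr->ring_ent_in_out.buf_id);
	/*
	 * payload is only meaningful if the request carries payload data;
	 * process the request, write the reply into hdr/payload, then
	 * resubmit with FUSE_IO_URING_CMD_COMMIT_AND_FETCH.
	 */
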
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fuse/dev_uring.c       | 363 +++++++++++++++++++++++++++++++++-----
>  fs/fuse/dev_uring_i.h     |  45 ++++-
>  include/uapi/linux/fuse.h |  27 ++-
>  3 files changed, 381 insertions(+), 54 deletions(-)
> 
> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> index a061f175b3fd..9f14a2bcde3f 100644
> --- a/fs/fuse/dev_uring.c
> +++ b/fs/fuse/dev_uring.c
> @@ -41,6 +41,11 @@ enum fuse_uring_header_type {
>  	FUSE_URING_HEADER_RING_ENT,
>  };
>  
> +static inline bool bufring_enabled(struct fuse_ring_queue *queue)
> +{
> +	return queue->bufring != NULL;
> +}
> +
>  static void uring_cmd_set_ring_ent(struct io_uring_cmd *cmd,
>  				   struct fuse_ring_ent *ring_ent)
>  {
> @@ -222,6 +227,7 @@ void fuse_uring_destruct(struct fuse_conn *fc)
>  		}
>  
>  		kfree(queue->fpq.processing);
> +		kfree(queue->bufring);
>  		kfree(queue);
>  		ring->queues[qid] = NULL;
>  	}
> @@ -303,20 +309,102 @@ static int fuse_uring_get_iovec_from_sqe(const struct io_uring_sqe *sqe,
>  	return 0;
>  }
>  
> -static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
> -						       int qid)
> +static int fuse_uring_bufring_setup(struct io_uring_cmd *cmd,
> +				     struct fuse_ring_queue *queue)
> +{
> +	const struct fuse_uring_cmd_req *cmd_req =
> +		io_uring_sqe128_cmd(cmd->sqe, struct fuse_uring_cmd_req);
> +	u16 queue_depth = READ_ONCE(cmd_req->init.queue_depth);
> +	unsigned int buf_size = READ_ONCE(cmd_req->init.buf_size);
> +	struct iovec iov[FUSE_URING_IOV_SEGS];
> +	void __user *payload, *headers;
> +	size_t headers_size, payload_size, ring_size;
> +	struct fuse_bufring *br;
> +	unsigned int nr_bufs, i;
> +	uintptr_t payload_addr;
> +	int err;
> +
> +	if (!queue_depth || !buf_size)
> +		return -EINVAL;
> +
> +	err = fuse_uring_get_iovec_from_sqe(cmd->sqe, iov);
> +	if (err)
> +		return err;
> +
> +	headers = iov[FUSE_URING_IOV_HEADERS].iov_base;
> +	headers_size = iov[FUSE_URING_IOV_HEADERS].iov_len;
> +	payload = iov[FUSE_URING_IOV_PAYLOAD].iov_base;
> +	payload_size = iov[FUSE_URING_IOV_PAYLOAD].iov_len;
> +
> +	/* check if there's enough space for all the headers */
> +	if (headers_size < queue_depth * sizeof(struct fuse_uring_req_header))
> +		return -EINVAL;
> +
> +	if (buf_size < queue->ring->max_payload_sz)
> +		return -EINVAL;
> +
> +	nr_bufs = payload_size / buf_size;
> +	if (!nr_bufs || nr_bufs > U16_MAX)

What's the significance of U16_MAX here? It looks like the br->nbufs
field is an unsigned int. Is it because struct fuse_uring_ent_in_out
has buf_id as a u16?

Not that I think you'll ever need more than 2^16 buffers, just curious
about the limitation.

> +		return -EINVAL;
> +
> +	/* create the ring buffer */
> +	ring_size = struct_size(br, bufs, nr_bufs);
> +	br = kzalloc(ring_size, GFP_KERNEL_ACCOUNT);
> +	if (!br)
> +		return -ENOMEM;
> +
> +	br->queue_depth = queue_depth;
> +	br->headers = headers;
> +
> +	payload_addr = (uintptr_t)payload;
> +
> +	/* populate the ring buffer */
> +	for (i = 0; i < nr_bufs; i++, payload_addr += buf_size) {
> +		struct fuse_bufring_buf *buf = &br->bufs[i];
> +
> +		buf->addr = payload_addr;
> +		buf->len = buf_size;
> +		buf->id = i;
> +	}
> +
> +	br->nbufs = nr_bufs;
> +	br->tail = nr_bufs;
> +
> +	queue->bufring = br;
> +
> +	return 0;
> +}
> +
> +/*
> + * if the queue is already registered, check that the queue was initialized with
> + * the same init flags set for this FUSE_IO_URING_CMD_REGISTER cmd. all
> + * FUSE_IO_URING_CMD_REGISTER cmds should have the same init fields set on a
> + * per-queue basis.
> + */
> +static bool queue_init_flags_consistent(struct fuse_ring_queue *queue,
> +					u64 init_flags)
>  {
> +	bool bufring = init_flags & FUSE_URING_BUFRING;
> +
> +	return bufring_enabled(queue) == bufring;
> +}
> +
> +static struct fuse_ring_queue *
> +fuse_uring_create_queue(struct io_uring_cmd *cmd, struct fuse_ring *ring,
> +			int qid, u64 init_flags)
> +{
> +	bool use_bufring = init_flags & FUSE_URING_BUFRING;
>  	struct fuse_conn *fc = ring->fc;
>  	struct fuse_ring_queue *queue;
>  	struct list_head *pq;
>  
>  	queue = kzalloc_obj(*queue, GFP_KERNEL_ACCOUNT);
>  	if (!queue)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  	pq = kzalloc_objs(struct list_head, FUSE_PQ_HASH_SIZE);
>  	if (!pq) {
>  		kfree(queue);
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  	}
>  
>  	queue->qid = qid;
> @@ -334,12 +422,29 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
>  	queue->fpq.processing = pq;
>  	fuse_pqueue_init(&queue->fpq);
>  
> +	if (use_bufring) {
> +		int err = fuse_uring_bufring_setup(cmd, queue);
> +
> +		if (err) {
> +			kfree(pq);
> +			kfree(queue);
> +			return ERR_PTR(err);
> +		}
> +	}
> +
>  	spin_lock(&fc->lock);
> +	/* check if the queue creation raced with another thread */
>  	if (ring->queues[qid]) {
>  		spin_unlock(&fc->lock);
>  		kfree(queue->fpq.processing);
> +		if (use_bufring)
> +			kfree(queue->bufring);

nit: presumably you could skip the if here. If use_bufring is false,
then queue->bufring _should_ be NULL.

>  		kfree(queue);
> -		return ring->queues[qid];
> +
> +		queue = ring->queues[qid];
> +		if (!queue_init_flags_consistent(queue, init_flags))
> +			return ERR_PTR(-EINVAL);
> +		return queue;
>  	}
>  
>  	/*
> @@ -649,7 +754,14 @@ static int copy_header_to_ring(struct fuse_ring_ent *ent,
>  	if (offset < 0)
>  		return offset;
>  
> -	ring = (void __user *)ent->headers + offset;
> +	if (bufring_enabled(ent->queue)) {
> +		int buf_offset = offset +
> +			sizeof(struct fuse_uring_req_header) * ent->id;
> +
> +		ring = ent->queue->bufring->headers + buf_offset;
> +	} else {
> +		ring = (void __user *)ent->headers + offset;
> +	}
>  
>  	if (copy_to_user(ring, header, header_size)) {
>  		pr_info_ratelimited("Copying header to ring failed.\n");
> @@ -669,7 +781,14 @@ static int copy_header_from_ring(struct fuse_ring_ent *ent,
>  	if (offset < 0)
>  		return offset;
>  
> -	ring = (void __user *)ent->headers + offset;
> +	if (bufring_enabled(ent->queue)) {
> +		int buf_offset = offset +
> +			sizeof(struct fuse_uring_req_header) * ent->id;
> +
> +		ring = ent->queue->bufring->headers + buf_offset;
> +	} else {
> +		ring = (void __user *)ent->headers + offset;
> +	}
>  
>  	if (copy_from_user(header, ring, header_size)) {
>  		pr_info_ratelimited("Copying header from ring failed.\n");
> @@ -684,12 +803,20 @@ static int setup_fuse_copy_state(struct fuse_copy_state *cs,
>  				 struct fuse_ring_ent *ent, int dir,
>  				 struct iov_iter *iter)
>  {
> +	void __user *payload;
>  	int err;
>  
> -	err = import_ubuf(dir, ent->payload, ring->max_payload_sz, iter);
> -	if (err) {
> -		pr_info_ratelimited("fuse: Import of user buffer failed\n");
> -		return err;
> +	if (bufring_enabled(ent->queue))
> +		payload = (void __user *)ent->payload_buf.addr;
> +	else
> +		payload = ent->payload;
> +
> +	if (payload) {
> +		err = import_ubuf(dir, payload, ring->max_payload_sz, iter);
> +		if (err) {
> +			pr_info_ratelimited("fuse: Import of user buffer failed\n");
> +			return err;
> +		}
>  	}
>  
>  	fuse_copy_init(cs, dir == ITER_DEST, iter);
> @@ -741,6 +868,9 @@ static int fuse_uring_args_to_ring(struct fuse_ring *ring, struct fuse_req *req,
>  		.commit_id = req->in.h.unique,
>  	};
>  
> +	if (bufring_enabled(ent->queue))
> +		ent_in_out.buf_id = ent->payload_buf.id;
> +
>  	err = setup_fuse_copy_state(&cs, ring, req, ent, ITER_DEST, &iter);
>  	if (err)
>  		return err;
> @@ -805,6 +935,96 @@ static int fuse_uring_copy_to_ring(struct fuse_ring_ent *ent,
>  				   sizeof(req->in.h));
>  }
>  
> +static bool fuse_uring_req_has_payload(struct fuse_req *req)
> +{
> +	struct fuse_args *args = req->args;
> +
> +	return args->in_numargs > 1 || args->out_numargs;
> +}
> +
> +static int fuse_uring_select_buffer(struct fuse_ring_ent *ent)
> +	__must_hold(&ent->queue->lock)
> +{
> +	struct fuse_ring_queue *queue = ent->queue;
> +	struct fuse_bufring *br = queue->bufring;
> +	struct fuse_bufring_buf *buf;
> +	unsigned int tail = br->tail, head = br->head;
> +
> +	lockdep_assert_held(&queue->lock);
> +
> +	/* Get a buffer to use for the payload */
> +	if (tail == head)
> +		return -ENOBUFS;
> +
> +	buf = &br->bufs[head % br->nbufs];
> +	br->head++;
> +
> +	ent->payload_buf = *buf;
> +
> +	return 0;
> +}
> +
> +static void fuse_uring_recycle_buffer(struct fuse_ring_ent *ent)
> +	__must_hold(&ent->queue->lock)
> +{
> +	struct fuse_bufring_buf *ent_payload = &ent->payload_buf;
> +	struct fuse_ring_queue *queue = ent->queue;
> +	struct fuse_bufring_buf *buf;
> +	struct fuse_bufring *br;
> +
> +	lockdep_assert_held(&queue->lock);
> +
> +	if (!bufring_enabled(queue) || !ent_payload->addr)
> +		return;
> +
> +	br = queue->bufring;
> +
> +	/* ring should never be full */
> +	WARN_ON_ONCE(br->tail - br->head >= br->nbufs);
> +
> +	buf = &br->bufs[(br->tail) % br->nbufs];
> +
> +	*buf = *ent_payload;
> +
> +	br->tail++;
> +
> +	memset(ent_payload, 0, sizeof(*ent_payload));
> +}
> +
> +static int fuse_uring_next_req_update_buffer(struct fuse_ring_ent *ent,
> +					     struct fuse_req *req)
> +{
> +	bool buffer_selected;
> +	bool has_payload;
> +
> +	if (!bufring_enabled(ent->queue))
> +		return 0;
> +
> +	buffer_selected = !!ent->payload_buf.addr;
> +	has_payload = fuse_uring_req_has_payload(req);
> +
> +	if (has_payload && !buffer_selected)
> +		return fuse_uring_select_buffer(ent);
> +
> +	if (!has_payload && buffer_selected)
> +		fuse_uring_recycle_buffer(ent);
> +
> +	return 0;
> +}
> +
> +static int fuse_uring_prep_buffer(struct fuse_ring_ent *ent,
> +				  struct fuse_req *req)
> +{
> +	if (!bufring_enabled(ent->queue))
> +		return 0;
> +
> +	/* no payload to copy, can skip selecting a buffer */
> +	if (!fuse_uring_req_has_payload(req))
> +		return 0;
> +
> +	return fuse_uring_select_buffer(ent);
> +}
> +
>  static int fuse_uring_prepare_send(struct fuse_ring_ent *ent,
>  				   struct fuse_req *req)
>  {
> @@ -878,10 +1098,21 @@ static struct fuse_req *fuse_uring_ent_assign_req(struct fuse_ring_ent *ent)
>  
>  	/* get and assign the next entry while it is still holding the lock */
>  	req = list_first_entry_or_null(req_queue, struct fuse_req, list);
> -	if (req)
> -		fuse_uring_add_req_to_ring_ent(ent, req);
> +	if (req) {
> +		int err = fuse_uring_next_req_update_buffer(ent, req);
>  
> -	return req;
> +		if (!err) {
> +			fuse_uring_add_req_to_ring_ent(ent, req);
> +			return req;
> +		}
> +	}
> +
> +	/*
> +	 * Buffer selection may fail if all the buffers are currently saturated.
> +	 * The request will be serviced when a buffer is freed up.
> +	 */
> +	fuse_uring_recycle_buffer(ent);
> +	return NULL;
>  }
>  
>  /*
> @@ -1041,6 +1272,12 @@ static int fuse_uring_commit_fetch(struct io_uring_cmd *cmd, int issue_flags,
>  	 * fuse requests would otherwise not get processed - committing
>  	 * and fetching is done in one step vs legacy fuse, which has separated
>  	 * read (fetch request) and write (commit result).
> +	 *
> +	 * If the server is using bufrings and has populated the ring with fewer
> +	 * payload buffers than ents, it is possible that there may not be an
> +	 * available buffer for the next request. If so, then the fetch is a
> +	 * no-op and the next request will be serviced when a buffer becomes
> +	 * available.
>  	 */
>  	if (fuse_uring_get_next_fuse_req(ent, queue))
>  		fuse_uring_send(ent, cmd, 0, issue_flags);
> @@ -1120,30 +1357,38 @@ fuse_uring_create_ring_ent(struct io_uring_cmd *cmd,
>  
>  	ent->queue = queue;
>  
> -	err = fuse_uring_get_iovec_from_sqe(cmd->sqe, iov);
> -	if (err) {
> -		pr_info_ratelimited("Failed to get iovec from sqe, err=%d\n",
> -				    err);
> -		goto error;
> -	}
> +	if (bufring_enabled(queue)) {
> +		ent->id = READ_ONCE(cmd->sqe->buf_index);
> +		if (ent->id >= queue->bufring->queue_depth) {
> +			err = -EINVAL;
> +			goto error;
> +		}
> +	} else {
> +		err = fuse_uring_get_iovec_from_sqe(cmd->sqe, iov);
> +		if (err) {
> +			pr_info_ratelimited("Failed to get iovec from sqe, err=%d\n",
> +					    err);
> +			goto error;
> +		}
>  
> -	err = -EINVAL;
> -	headers = &iov[FUSE_URING_IOV_HEADERS];
> -	if (headers->iov_len < sizeof(struct fuse_uring_req_header)) {
> -		pr_info_ratelimited("Invalid header len %zu\n", headers->iov_len);
> -		goto error;
> -	}
> +		err = -EINVAL;
> +		headers = &iov[FUSE_URING_IOV_HEADERS];
> +		if (headers->iov_len < sizeof(struct fuse_uring_req_header)) {
> +			pr_info_ratelimited("Invalid header len %zu\n",
> +					    headers->iov_len);
> +			goto error;
> +		}
>  
> -	payload = &iov[FUSE_URING_IOV_PAYLOAD];
> -	if (payload->iov_len < ring->max_payload_sz) {
> -		pr_info_ratelimited("Invalid req payload len %zu\n",
> -				    payload->iov_len);
> -		goto error;
> +		payload = &iov[FUSE_URING_IOV_PAYLOAD];
> +		if (payload->iov_len < ring->max_payload_sz) {
> +			pr_info_ratelimited("Invalid req payload len %zu\n",
> +					    payload->iov_len);
> +			goto error;
> +		}
> +		ent->headers = headers->iov_base;
> +		ent->payload = payload->iov_base;
>  	}
>  
> -	ent->headers = headers->iov_base;
> -	ent->payload = payload->iov_base;
> -
>  	atomic_inc(&ring->queue_refs);
>  	return ent;
>  
> @@ -1152,6 +1397,13 @@ fuse_uring_create_ring_ent(struct io_uring_cmd *cmd,
>  	return ERR_PTR(err);
>  }
>  
> +static bool init_flags_valid(u64 init_flags)
> +{
> +	u64 valid_flags = FUSE_URING_BUFRING;
> +
> +	return !(init_flags & ~valid_flags);
> +}
> +
>  /*
>   * Register header and payload buffer with the kernel and puts the
>   * entry as "ready to get fuse requests" on the queue
> @@ -1161,6 +1413,7 @@ static int fuse_uring_register(struct io_uring_cmd *cmd,
>  {
>  	const struct fuse_uring_cmd_req *cmd_req = io_uring_sqe128_cmd(cmd->sqe,
>  								       struct fuse_uring_cmd_req);
> +	u64 init_flags = READ_ONCE(cmd_req->flags);
>  	struct fuse_ring *ring = smp_load_acquire(&fc->ring);
>  	struct fuse_ring_queue *queue;
>  	struct fuse_ring_ent *ent;
> @@ -1179,11 +1432,16 @@ static int fuse_uring_register(struct io_uring_cmd *cmd,
>  		return -EINVAL;
>  	}
>  
> +	if (!init_flags_valid(init_flags))
> +		return -EINVAL;
> +
>  	queue = ring->queues[qid];
>  	if (!queue) {
> -		queue = fuse_uring_create_queue(ring, qid);
> -		if (!queue)
> -			return err;
> +		queue = fuse_uring_create_queue(cmd, ring, qid, init_flags);
> +		if (IS_ERR(queue))
> +			return PTR_ERR(queue);
> +	} else if (!queue_init_flags_consistent(queue, init_flags)) {
> +		return -EINVAL;
>  	}
>  
>  	/*
> @@ -1349,14 +1607,18 @@ void fuse_uring_queue_fuse_req(struct fuse_iqueue *fiq, struct fuse_req *req)
>  	req->ring_queue = queue;
>  	ent = list_first_entry_or_null(&queue->ent_avail_queue,
>  				       struct fuse_ring_ent, list);
> -	if (ent)
> -		fuse_uring_add_req_to_ring_ent(ent, req);
> -	else
> -		list_add_tail(&req->list, &queue->fuse_req_queue);
> -	spin_unlock(&queue->lock);
> +	if (ent) {
> +		err = fuse_uring_prep_buffer(ent, req);
> +		if (!err) {
> +			fuse_uring_add_req_to_ring_ent(ent, req);
> +			spin_unlock(&queue->lock);
> +			fuse_uring_dispatch_ent(ent);
> +			return;
> +		}
> +	}
>  
> -	if (ent)
> -		fuse_uring_dispatch_ent(ent);
> +	list_add_tail(&req->list, &queue->fuse_req_queue);
> +	spin_unlock(&queue->lock);
>  
>  	return;
>  
> @@ -1406,14 +1668,17 @@ bool fuse_uring_queue_bq_req(struct fuse_req *req)
>  	req = list_first_entry_or_null(&queue->fuse_req_queue, struct fuse_req,
>  				       list);
>  	if (ent && req) {
> -		fuse_uring_add_req_to_ring_ent(ent, req);
> -		spin_unlock(&queue->lock);
> +		int err = fuse_uring_prep_buffer(ent, req);
>  
> -		fuse_uring_dispatch_ent(ent);
> -	} else {
> -		spin_unlock(&queue->lock);
> +		if (!err) {
> +			fuse_uring_add_req_to_ring_ent(ent, req);
> +			spin_unlock(&queue->lock);
> +			fuse_uring_dispatch_ent(ent);
> +			return true;
> +		}
>  	}
>  
> +	spin_unlock(&queue->lock);
>  	return true;
>  }
>  
> diff --git a/fs/fuse/dev_uring_i.h b/fs/fuse/dev_uring_i.h
> index 349418db3374..66d5d5f8dc3f 100644
> --- a/fs/fuse/dev_uring_i.h
> +++ b/fs/fuse/dev_uring_i.h
> @@ -36,11 +36,47 @@ enum fuse_ring_req_state {
>  	FRRS_RELEASED,
>  };
>  
> +struct fuse_bufring_buf {
> +	uintptr_t addr;
> +	unsigned int len;
> +	unsigned int id;
> +};
> +
> +struct fuse_bufring {
> +	/* pointer to the headers buffer */
> +	void __user *headers;
> +
> +	unsigned int queue_depth;
> +
> +	/* metadata tracking state of the bufring */
> +	unsigned int nbufs;
> +	unsigned int head;
> +	unsigned int tail;
> +
> +	/* the buffers backing the ring */
> +	__DECLARE_FLEX_ARRAY(struct fuse_bufring_buf, bufs);
> +};
> +
>  /** A fuse ring entry, part of the ring queue */
>  struct fuse_ring_ent {
> -	/* userspace buffer */
> -	struct fuse_uring_req_header __user *headers;
> -	void __user *payload;
> +	union {
> +		/* if bufrings are not used */
> +		struct {
> +			/* userspace buffers */
> +			struct fuse_uring_req_header __user *headers;
> +			void __user *payload;
> +		};
> +		/* if bufrings are used */
> +		struct {
> +			/*
> +			 * unique fixed id for the ent. used by kernel/server to
> +			 * locate where in the headers buffer the data for this
> +			 * ent resides
> +			 */
> +			unsigned int id;
> +			struct fuse_bufring_buf payload_buf;
> +		};
> +	};
>  
>  	/* the ring queue that owns the request */
>  	struct fuse_ring_queue *queue;
> @@ -99,6 +135,9 @@ struct fuse_ring_queue {
>  	unsigned int active_background;
>  
>  	bool stopped;
> +
> +	/* only allocated if the server uses bufrings */
> +	struct fuse_bufring *bufring;
>  };
>  
>  /**
> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> index c13e1f9a2f12..8753de7eb189 100644
> --- a/include/uapi/linux/fuse.h
> +++ b/include/uapi/linux/fuse.h
> @@ -240,6 +240,10 @@
>   *  - add FUSE_COPY_FILE_RANGE_64
>   *  - add struct fuse_copy_file_range_out
>   *  - add FUSE_NOTIFY_PRUNE
> + *
> + *  7.46
> + *  - add FUSE_URING_BUFRING flag
> + *  - add fuse_uring_cmd_req init struct
>   */
>  
>  #ifndef _LINUX_FUSE_H
> @@ -1263,7 +1267,13 @@ struct fuse_uring_ent_in_out {
>  
>  	/* size of user payload buffer */
>  	uint32_t payload_sz;
> -	uint32_t padding;
> +
> +	/*
> +	 * if using bufrings, this is the id of the selected buffer.
> +	 * the selected buffer holds the request payload
> +	 */
> +	uint16_t buf_id;
> +	uint16_t padding;
>  
>  	uint64_t reserved;
>  };
> @@ -1294,6 +1304,9 @@ enum fuse_uring_cmd {
>  	FUSE_IO_URING_CMD_COMMIT_AND_FETCH = 2,
>  };
>  
> +/* fuse_uring_cmd_req flags */
> +#define FUSE_URING_BUFRING		(1 << 0)
> +
>  /**
>   * In the 80B command area of the SQE.
>   */
> @@ -1305,7 +1318,17 @@ struct fuse_uring_cmd_req {
>  
>  	/* queue the command is for (queue index) */
>  	uint16_t qid;
> -	uint8_t padding[6];
> +	uint16_t padding;
> +
> +	union {
> +		struct {
> +			/* size of the bufring's backing buffers */
> +			uint32_t buf_size;
> +			/* number of entries in the queue */
> +			uint16_t queue_depth;
> +			uint16_t padding;
> +		} init;
> +	};
>  };
>  
>  #endif /* _LINUX_FUSE_H */

Overall, this looks good though.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
