From: Jens Axboe <axboe@kernel.dk>
To: Gabriel Krisman Bertazi <krisman@suse.de>
Cc: io-uring@vger.kernel.org, Martin Michaelis <code@mgjm.de>,
stable@vger.kernel.org
Subject: Re: [PATCH 2/2] io_uring/kbuf: support min length left for incremental buffers
Date: Tue, 28 Apr 2026 12:02:34 -0600
Message-ID: <7645db80-8a8a-4ed6-9a3a-f2406cf93322@kernel.dk>
In-Reply-To: <87ik9bj7jt.fsf@mailhost.krisman.be>
On 4/28/26 11:53 AM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>
>> From: Martin Michaelis <code@mgjm.de>
>>
>> Incrementally consumed buffer rings are generally fully consumed, but
>> it's quite possible that the application has a minimum size it needs to
>> meet to avoid truncation. Currently that minimum limit is 1 byte, but
>> this should be a setting that is in the hands of the application. For
>> recvmsg multishot, a prime use case for incrementally consumed buffers,
>> the application may get spurious -EFAULT returned at the end of an
>> incrementally consumed buffer, as less space is available than the
>> headers need.
>>
>> Grab a u32 field in struct io_uring_buf_reg, which the application can
>> use to inform the kernel of the minimum size that should be available
>> in an incrementally consumed buffer. If less than that is available,
>> the current buffer is fully processed and the next one will be picked.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
>> Link: https://github.com/axboe/liburing/issues/1433
>> Signed-off-by: Martin Michaelis <code@mgjm.de>
>> [axboe: write commit message, change io_buffer_list member name]
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>> include/uapi/linux/io_uring.h | 3 ++-
>> io_uring/kbuf.c | 8 +++++++-
>> io_uring/kbuf.h | 7 +++++++
>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
>> index 17ac1b785440..909fb7aea638 100644
>> --- a/include/uapi/linux/io_uring.h
>> +++ b/include/uapi/linux/io_uring.h
>> @@ -905,7 +905,8 @@ struct io_uring_buf_reg {
>> __u32 ring_entries;
>> __u16 bgid;
>> __u16 flags;
>> - __u64 resv[3];
>> + __u32 min_left;
>> + __u32 resv[5];
>
> Honest question, isn't this a property of the specific operation and/or
> fd being operated, instead of the buffer_reg?

It kind of is, in that some users may not care. But it's not currently
possible to pass this in on a per-op basis, and while I did hack that
up initially, it's almost impossible as you end up with layering
violations. In practice, this is really mostly a recvmsg multishot
issue, because we need to store the headers. Hence the solution to
stuff it in the io_uring_buf_reg instead, and make it a fixed property
of the buffer group. In practice, you may even want a larger min_left
than what the recvmsg requires, as you don't want a tiny truncated
transfer at the end, regardless of what type of recv or read operation
this is. Hence it works generically as well.

Also see the linked GH issue, that's where most of the discussion
around this has happened already.

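
To make the "tiny truncated transfer at the end" point concrete, here is a rough userspace model of one incrementally consumed buffer with a per-group min_left. This is only a sketch: the model_* names are hypothetical and not kernel code, it handles a single buffer rather than walking the ring, and it assumes min_left == 0 keeps the old 1-byte minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one buffer in an incrementally consumed ring. */
struct model_buf {
	uint32_t len;		/* bytes still unused in this buffer */
};

/* Summary of how one buffer was used; illustrative only. */
struct model_result {
	uint32_t transfers;	/* completions served from the buffer */
	uint32_t last_len;	/* size of the final transfer */
};

/*
 * Consume up to 'len' bytes and report whether the buffer should be
 * retired. Assumption: min_left == 0 means the old 1-byte minimum.
 */
bool model_commit(struct model_buf *buf, uint32_t len, uint32_t min_left,
		  uint32_t *done)
{
	uint32_t this_len = len < buf->len ? len : buf->len;

	buf->len -= this_len;
	*done = this_len;
	if (min_left == 0)
		min_left = 1;
	/* Retire once fewer than min_left bytes remain. */
	return buf->len < min_left;
}

/* Fill one buffer with fixed-size receives until it is retired. */
struct model_result model_fill(uint32_t buf_size, uint32_t recv_len,
			       uint32_t min_left)
{
	struct model_buf buf = { .len = buf_size };
	struct model_result res = { 0, 0 };
	bool retired = false;

	/* A zero-length receive or buffer would never make progress. */
	if (recv_len == 0 || buf_size == 0)
		return res;

	while (!retired) {
		retired = model_commit(&buf, recv_len, min_left, &res.last_len);
		res.transfers++;
		/* Every transfer before retirement consumes at least one byte. */
		assert(res.transfers <= buf_size);
	}
	return res;
}
```

With a 4096-byte buffer and 1000-byte receives, model_fill(4096, 1000, 0) ends on a 96-byte transfer, while model_fill(4096, 1000, 256) retires the buffer after the fourth full-size transfer and leaves the short tail unused.
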
>> /* argument for IORING_REGISTER_PBUF_STATUS */
>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>> index 43e4f8615fe8..63061aa1cab9 100644
>> --- a/io_uring/kbuf.c
>> +++ b/io_uring/kbuf.c
>> @@ -47,7 +47,7 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>> this_len = min_t(u32, len, buf_len);
>> buf_len -= this_len;
>> /* Stop looping for invalid buffer length of 0 */
>> - if (buf_len || !this_len) {
>> + if (buf_len > bl->min_left_sub_one || !this_len) {
>
> Cosmetic, but perhaps store min_left itself instead of
> min_left_sub_one? buf_len must be >= min_left, and that is easier to
> read: (buf_len && buf_len >= min_left || !this_len)
Also see GH issue.
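
For anyone comparing the two spellings of the condition, a small brute-force check follows. It is a sketch under one assumption: the kbuf.h hunk is not quoted in this mail, so how min_left_sub_one is derived from min_left is guessed here (0 is treated like 1). All function names are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed derivation of the stored field; the kbuf.h hunk is not quoted. */
uint32_t sub_one_from_min_left(uint32_t min_left)
{
	return min_left ? min_left - 1 : 0;
}

/* Condition as written in the quoted kbuf.c hunk. */
bool keep_patch(uint32_t buf_len, uint32_t this_len, uint32_t min_left_sub_one)
{
	return buf_len > min_left_sub_one || !this_len;
}

/* Condition as suggested in review, storing min_left directly. */
bool keep_review(uint32_t buf_len, uint32_t this_len, uint32_t min_left)
{
	return (buf_len && buf_len >= min_left) || !this_len;
}

/* Compare both forms for all values up to 'limit'; returns mismatches. */
unsigned int count_mismatches(uint32_t limit)
{
	unsigned int bad = 0;

	for (uint32_t min_left = 0; min_left <= limit; min_left++) {
		uint32_t sub_one = sub_one_from_min_left(min_left);

		for (uint32_t buf_len = 0; buf_len <= limit; buf_len++) {
			for (uint32_t this_len = 0; this_len <= 1; this_len++) {
				if (keep_patch(buf_len, this_len, sub_one) !=
				    keep_review(buf_len, this_len, min_left))
					bad++;
			}
		}
	}
	return bad;
}

/* Usage: both spellings agree over a small exhaustive range. */
void check_equivalence(void)
{
	assert(count_mismatches(64) == 0);
}
```

Under that assumption the two forms make the same decision, so the choice between them is about readability versus saving a subtraction when the value is registered.
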
--
Jens Axboe