From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Martin Michaelis, stable@vger.kernel.org, Jens Axboe
Subject: [PATCH 2/2] io_uring/kbuf: support min length left for incremental buffers
Date: Tue, 28 Apr 2026 09:44:50 -0600
Message-ID: <20260428154557.2150818-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260428154557.2150818-1-axboe@kernel.dk>
References: <20260428154557.2150818-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Martin Michaelis

Incrementally consumed buffer rings are generally fully consumed, but
it's quite possible that the application has a minimum size it needs to
meet to avoid truncation. Currently that minimum limit is 1 byte, but
this should be a setting that is in the hands of the application. For
recvmsg multishot, a prime use case for incrementally consumed buffers,
the application may get spurious -EFAULT returned at the end of an
incrementally consumed buffer, as less space is available than the
headers need.

Grab a u32 field in struct io_uring_buf_reg, which the application can
use to inform the kernel of the minimum size that should be available in
an incrementally consumed buffer. If less than that is available, the
current buffer is fully processed and the next one will be picked.

Cc: stable@vger.kernel.org
Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://github.com/axboe/liburing/issues/1433
Signed-off-by: Martin Michaelis
[axboe: write commit message, change io_buffer_list member name]
Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h | 3 ++-
 io_uring/kbuf.c               | 8 +++++++-
 io_uring/kbuf.h               | 7 +++++++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 17ac1b785440..909fb7aea638 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -905,7 +905,8 @@ struct io_uring_buf_reg {
 	__u32	ring_entries;
 	__u16	bgid;
 	__u16	flags;
-	__u64	resv[3];
+	__u32	min_left;
+	__u32	resv[5];
 };
 
 /* argument for IORING_REGISTER_PBUF_STATUS */
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 43e4f8615fe8..63061aa1cab9 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -47,7 +47,7 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 		this_len = min_t(u32, len, buf_len);
 		buf_len -= this_len;
 		/* Stop looping for invalid buffer length of 0 */
-		if (buf_len || !this_len) {
+		if (buf_len > bl->min_left_sub_one || !this_len) {
 			WRITE_ONCE(buf->addr, READ_ONCE(buf->addr) + this_len);
 			WRITE_ONCE(buf->len, buf_len);
 			return false;
@@ -637,6 +637,10 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	if (reg.ring_entries >= 65536)
 		return -EINVAL;
 
+	/* minimum left byte count is a property of incremental buffers */
+	if (!(reg.flags & IOU_PBUF_RING_INC) && reg.min_left)
+		return -EINVAL;
+
 	bl = io_buffer_get_list(ctx, reg.bgid);
 	if (bl) {
 		/* if mapped buffer ring OR classic exists, don't allow */
@@ -683,6 +687,8 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	bl->mask = reg.ring_entries - 1;
 	bl->flags |= IOBL_BUF_RING;
 	bl->buf_ring = br;
+	if (reg.min_left)
+		bl->min_left_sub_one = reg.min_left - 1;
 	if (reg.flags & IOU_PBUF_RING_INC)
 		bl->flags |= IOBL_INC;
 	ret = io_buffer_add_list(ctx, bl, reg.bgid);
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index abf7052b556e..401773e1ef80 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -32,6 +32,13 @@ struct io_buffer_list {
 
 	__u16 flags;
 
+	/*
+	 * minimum required amount to be left to reuse an incrementally
+	 * consumed buffer. If less than this is left at consumption time,
+	 * buffer is done and head is incremented to the next buffer.
+	 */
+	__u32 min_left_sub_one;
+
 	struct io_mapped_region region;
 };
 
-- 
2.53.0
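
For illustration only, and not part of the patch: a minimal userspace
sketch of the two things the change relies on. The struct names
(buf_reg_old, buf_reg_new, model_buf) and the helper model_buf_retired()
are hypothetical; they only mirror the fields and the comparison shown
in the diff above, plus the leading ring_addr member that struct
io_uring_buf_reg already has in the uapi header. The static assertions
check that min_left is carved out of the old reserved space without
changing the struct size, and the helper models the per-buffer decision
in io_kbuf_inc_commit(), assuming a single buffer and a single
completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of struct io_uring_buf_reg before/after the patch. */
struct buf_reg_old {
	uint64_t ring_addr;
	uint32_t ring_entries;
	uint16_t bgid;
	uint16_t flags;
	uint64_t resv[3];
};

struct buf_reg_new {
	uint64_t ring_addr;
	uint32_t ring_entries;
	uint16_t bgid;
	uint16_t flags;
	uint32_t min_left;
	uint32_t resv[5];
};

/* min_left reuses the first 4 bytes of the old reserved area. */
_Static_assert(sizeof(struct buf_reg_old) == sizeof(struct buf_reg_new),
	       "registration struct size must not change");
_Static_assert(offsetof(struct buf_reg_new, min_left) ==
	       offsetof(struct buf_reg_old, resv),
	       "min_left must start where resv[] used to start");

/* Hypothetical model of one provided-buffer ring entry. */
struct model_buf {
	uint64_t addr;
	uint32_t len;
};

/*
 * Model of the patched check: consume up to 'len' bytes from 'buf'.
 * Returns false if the buffer stays current (addr/len are advanced),
 * true if it is retired because fewer than 'min_left' bytes would
 * remain. In the kernel, retiring means the ring head moves on to the
 * next buffer; this model leaves 'buf' untouched in that case.
 */
static inline bool model_buf_retired(struct model_buf *buf, uint32_t len,
				     uint32_t min_left)
{
	/* Same encoding as the patch: 0 and 1 both mean "1 byte minimum". */
	uint32_t min_left_sub_one = min_left ? min_left - 1 : 0;
	uint32_t this_len = len < buf->len ? len : buf->len;
	uint32_t buf_len = buf->len - this_len;

	if (buf_len > min_left_sub_one || !this_len) {
		buf->addr += this_len;
		buf->len = buf_len;
		return false;
	}
	return true;
}
```

With a 4096 byte buffer and a 4000 byte completion, 96 bytes remain. In
this model the buffer stays current for min_left values of 0 through 96
and is retired once min_left is 97 or larger, which is the behavior the
commit message describes for recvmsg multishot headers that need more
room than is left.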