From: Pavel Begunkov
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Alexander Viro, Christian Brauner, Andrew Morton, Sumit Semwal,
	Christian König, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Cc: asml.silence@gmail.com, Nitesh Shetty, Kanchan Joshi, Anuj Gupta,
	Tushar Gohad, William Power, Phil Cayton, Jason Gunthorpe
Subject: [PATCH v3 09/10] io_uring/rsrc: extend buffer update
Date: Wed, 29 Apr 2026 16:25:55 +0100
X-Mailer: git-send-email 2.53.0
X-Mailing-List: io-uring@vger.kernel.org

We need to pass more information to buffer registration than fits into a
single struct iovec. This patch allows users to optionally pass
struct io_uring_regbuf_desc instead. Apart from leaving room for future
use cases, it also introduces registration types. Currently, the type can
be either IO_REGBUF_TYPE_UADDR, which mirrors the iovec path, or
IO_REGBUF_TYPE_EMPTY, which leaves a buffer table slot empty. The next
patch introduces a dmabuf-backed type, and the type field can also be
useful for other extensions like splicing a list of user addresses (i.e.
iovec[]), interoperability with zcrx, or kernel-allocated memory, as was
brought up by Christoph. Note, the type only represents a registration
option, which is distinct from how io_uring internally stores it.

The flags field is not used yet but is always useful to have, e.g. we can
encode read-only / write-only restrictions using it.

Signed-off-by: Pavel Begunkov
---
 include/uapi/linux/io_uring.h | 27 +++++++++++++-
 io_uring/rsrc.c               | 69 ++++++++++++++++++++++-------------
 2 files changed, 69 insertions(+), 27 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 17ac1b785440..05c3fd078767 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -790,13 +790,38 @@ struct io_uring_rsrc_update {
 
 struct io_uring_rsrc_update2 {
 	__u32 offset;
-	__u32 resv;
+	__u32 flags;
 	__aligned_u64 data;
 	__aligned_u64 tags;
 	__u32 nr;
 	__u32 resv2;
 };
 
+/* struct io_uring_rsrc_update2::flags */
+enum io_uring_rsrc_reg_flags {
+	/*
+	 * Use the extended descriptor format for buffer updates,
+	 * see struct io_uring_regbuf_desc
+	 */
+	IORING_RSRC_UPDATE_EXTENDED = 1U << 1,
+};
+
+/* Buffer registration type, passed in struct io_uring_regbuf_desc::type */
+enum io_uring_regbuf_type {
+	IO_REGBUF_TYPE_EMPTY,
+	IO_REGBUF_TYPE_UADDR,
+
+	__IO_REGBUF_TYPE_MAX,
+};
+
+struct io_uring_regbuf_desc {
+	__u32 type; /* enum io_uring_regbuf_type */
+	__u32 flags;
+	__u64 size;
+	__u64 uaddr;
+	__u64 __resv[7];
+};
+
 /* Skip updating fd indexes set to this value in the fd table */
 #define IORING_REGISTER_FILES_SKIP (-2)
 
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index ba00238941ed..f8696b01cb54 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -27,11 +27,6 @@ struct io_rsrc_update {
 	u32 offset;
 };
 
-struct io_uring_regbuf_desc {
-	__u64 uaddr;
-	__u64 size;
-};
-
 static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 			struct io_uring_regbuf_desc *desc,
 			struct page **last_hpage);
@@ -46,9 +41,12 @@ static void io_iov_to_regbuf_desc(const struct iovec *iov,
 				  struct io_uring_regbuf_desc *desc)
 {
 	*desc = (struct io_uring_regbuf_desc) {
+		.type = IO_REGBUF_TYPE_UADDR,
 		.uaddr = (u64)iov->iov_base,
 		.size = iov->iov_len,
 	};
+	if (!desc->uaddr)
+		desc->type = IO_REGBUF_TYPE_EMPTY;
 }
 
 int __io_account_mem(struct user_struct *user, unsigned long nr_pages)
@@ -236,6 +234,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
 		return -ENXIO;
 	if (up->offset + nr_args > ctx->file_table.data.nr)
 		return -EINVAL;
+	if (up->flags)
+		return -EINVAL;
 
 	for (done = 0; done < nr_args; done++) {
 		u64 tag = 0;
@@ -292,10 +292,9 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
 				   struct io_uring_rsrc_update2 *up,
 				   unsigned int nr_args)
 {
+	bool extended = up->flags & IORING_RSRC_UPDATE_EXTENDED;
 	u64 __user *tags = u64_to_user_ptr(up->tags);
-	struct iovec fast_iov, *iov;
 	struct page *last_hpage = NULL;
-	struct iovec __user *uvec;
 	u64 user_data = up->data;
 	__u32 done;
 	int i, err;
@@ -304,29 +303,49 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
 		return -ENXIO;
 	if (up->offset + nr_args > ctx->buf_table.nr)
 		return -EINVAL;
+	if (up->flags & ~IORING_RSRC_UPDATE_EXTENDED)
+		return -EINVAL;
 
 	for (done = 0; done < nr_args; done++) {
 		struct io_uring_regbuf_desc desc;
 		struct io_rsrc_node *node;
 		u64 tag = 0;
 
-		uvec = u64_to_user_ptr(user_data);
-		iov = iovec_from_user(uvec, 1, 1, &fast_iov, io_is_compat(ctx));
-		if (IS_ERR(iov)) {
-			err = PTR_ERR(iov);
-			break;
-		}
 		if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) {
 			err = -EFAULT;
 			break;
 		}
 
-		io_iov_to_regbuf_desc(iov, &desc);
+		if (extended) {
+			if (copy_from_user(&desc, u64_to_user_ptr(user_data),
+					   sizeof(desc))) {
+				err = -EFAULT;
+				break;
+			}
+			user_data += sizeof(desc);
+		} else {
+			struct iovec __user *uvec = u64_to_user_ptr(user_data);
+			struct iovec fast_iov, *iov;
+
+			if (io_is_compat(ctx))
+				user_data += sizeof(struct compat_iovec);
+			else
+				user_data += sizeof(struct iovec);
+
+			iov = iovec_from_user(uvec, 1, 1, &fast_iov, io_is_compat(ctx));
+			if (IS_ERR(iov)) {
+				err = PTR_ERR(iov);
+				break;
+			}
+			io_iov_to_regbuf_desc(iov, &desc);
+		}
+
 		node = io_sqe_buffer_register(ctx, &desc, &last_hpage);
 		if (IS_ERR(node)) {
 			err = PTR_ERR(node);
 			break;
 		}
+
 		if (tag) {
 			if (!node) {
 				err = -EINVAL;
@@ -337,10 +356,6 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
 		i = array_index_nospec(up->offset + done, ctx->buf_table.nr);
 		io_reset_rsrc_node(ctx, &ctx->buf_table, i);
 		ctx->buf_table.nodes[i] = node;
-		if (io_is_compat(ctx))
-			user_data += sizeof(struct compat_iovec);
-		else
-			user_data += sizeof(struct iovec);
 	}
 	return done ? done : err;
 }
@@ -375,7 +390,7 @@ int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg,
 	memset(&up, 0, sizeof(up));
 	if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update)))
 		return -EFAULT;
-	if (up.resv || up.resv2)
+	if (up.resv2)
 		return -EINVAL;
 	return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args);
 }
@@ -389,7 +404,7 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
 		return -EINVAL;
 	if (copy_from_user(&up, arg, sizeof(up)))
 		return -EFAULT;
-	if (!up.nr || up.resv || up.resv2)
+	if (!up.nr || up.resv2)
 		return -EINVAL;
 	return __io_register_rsrc_update(ctx, type, &up, up.nr);
 }
@@ -489,12 +504,9 @@ int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_uring_rsrc_update2 up2;
 	int ret;
 
+	memset(&up2, 0, sizeof(up2));
 	up2.offset = up->offset;
 	up2.data = up->arg;
-	up2.nr = 0;
-	up2.tags = 0;
-	up2.resv = 0;
-	up2.resv2 = 0;
 
 	if (up->offset == IORING_FILE_INDEX_ALLOC) {
 		ret = io_files_update_with_index_alloc(req, issue_flags);
@@ -791,8 +803,13 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 	struct io_imu_folio_data data;
 	bool coalesced = false;
 
-	if (!uaddr) {
-		if (size)
+	if (desc->type >= __IO_REGBUF_TYPE_MAX)
+		return ERR_PTR(-EINVAL);
+	if (!mem_is_zero(&desc->__resv, sizeof(desc->__resv)))
+		return ERR_PTR(-EINVAL);
+
+	if (desc->type == IO_REGBUF_TYPE_EMPTY) {
+		if (uaddr || size)
 			return ERR_PTR(-EFAULT);
 		/* remove the buffer without installing a new one */
 		return NULL;
-- 
2.53.0
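
For reference, a rough userspace-side sketch of how a caller might fill the
extended descriptor. It is not part of the patch. The struct and enum below
are local mirrors of the uAPI additions above, and every sketch_/SKETCH_
name is hypothetical. sketch_desc_check() only repeats the checks visible in
this patch at the top of io_sqe_buffer_register(); the remaining validation
on the IO_REGBUF_TYPE_UADDR path is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirrors of the uAPI additions in this patch. Everything prefixed
 * with sketch_/SKETCH_ is hypothetical and exists only for this example.
 */
enum sketch_regbuf_type {
	SKETCH_REGBUF_TYPE_EMPTY,
	SKETCH_REGBUF_TYPE_UADDR,

	SKETCH_REGBUF_TYPE_MAX,
};

struct sketch_regbuf_desc {
	uint32_t type;		/* enum sketch_regbuf_type */
	uint32_t flags;
	uint64_t size;
	uint64_t uaddr;
	uint64_t resv[7];
};

/* No implicit padding is expected: 4 + 4 + 8 + 8 + 7 * 8 bytes. */
static_assert(sizeof(struct sketch_regbuf_desc) == 80, "unexpected padding");

/* Describe a range of user memory, the equivalent of the iovec path. */
static inline void sketch_desc_uaddr(struct sketch_regbuf_desc *d,
				     void *base, size_t len)
{
	memset(d, 0, sizeof(*d));	/* also clears flags and resv[] */
	d->type = SKETCH_REGBUF_TYPE_UADDR;
	d->uaddr = (uint64_t)(uintptr_t)base;
	d->size = len;
}

/* Describe an empty slot: uaddr and size both stay zero. */
static inline void sketch_desc_empty(struct sketch_regbuf_desc *d)
{
	memset(d, 0, sizeof(*d));
	d->type = SKETCH_REGBUF_TYPE_EMPTY;
}

/* Repeat the descriptor checks added by this patch, in the same order. */
static inline int sketch_desc_check(const struct sketch_regbuf_desc *d)
{
	size_t i;

	if (d->type >= SKETCH_REGBUF_TYPE_MAX)
		return -EINVAL;
	for (i = 0; i < sizeof(d->resv) / sizeof(d->resv[0]); i++)
		if (d->resv[i])
			return -EINVAL;
	if (d->type == SKETCH_REGBUF_TYPE_EMPTY && (d->uaddr || d->size))
		return -EFAULT;
	return 0;
}
```

With the real headers, a caller would presumably point
io_uring_rsrc_update2::data at an array of such descriptors, set
IORING_RSRC_UPDATE_EXTENDED in ::flags and the count in ::nr, and pass it
through IORING_REGISTER_BUFFERS_UPDATE. Zeroing the whole descriptor first
matters because non-zero reserved words are rejected with -EINVAL.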