linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	 linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
	 Amir Goldstein <amir73il@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	 Christian Brauner <brauner@kernel.org>
Subject: [PATCH v4 01/47] file: add FD_{ADD,PREPARE}()
Date: Sun, 23 Nov 2025 17:33:19 +0100	[thread overview]
Message-ID: <20251123-work-fd-prepare-v4-1-b6efa1706cfd@kernel.org> (raw)
In-Reply-To: <20251123-work-fd-prepare-v4-0-b6efa1706cfd@kernel.org>

I've been playing with this to allow for moderately flexible usage of
the get_unused_fd_flags() + create file + fd_install() pattern that's
used quite extensively.

How callers allocate files is really heterogenous so it's not really
convenient to fold them into a single class. It's possibe to split them
into subclasses like for anon inodes. I think that's not necessarily
nice as well.

My take is to add two primites:
(1) FD_ADD() the simple cases a file is installed:

    fd = FD_ADD(O_CLOEXEC, open_file(some, args)));
    if (fd >= 0)
            kvm_get_kvm(vcpu->kvm);
    return fd;

(2) FD_PREPARE() that captures all the cases where access to fd or file
    or additional work before publishing the fd is needed:

    FD_PREPARE(fdf, open_flag, file_open_handle(&path, open_flag));
    if (fdf.err)
            return fdf.err;

    if (copy_to_user(/* something something */))
            return -EFAULT;

    return fd_publish(fdf);

I've converted all of the easy cases over to it and it gets rid of an
aweful lot of convoluted cleanup logic.

It's centered around struct fd_prepare. FD_PREPARE() encapsulates all of
allocation and cleanup logic and must be followed by a call to
fd_publish() which associates the fd with the file and installs it into
the callers fdtable. If fd_publish() isn't called both are deallocated.

It mandates a specific order namely that first we allocate the fd and
then instantiate the file. But that shouldn't be a problem nearly
everyone I've converted uses this exact pattern anyway.

There's a bunch of additional cases where it would be easy to convert
them to this pattern. For example, the whole sync file stuff in dma
currently retains the containing structure of the file instead of the
file itself even though it's only used to allocate files. Changing that
would make it fall into the FD_PREPARE() pattern easily. I've not done
that work yet.

There's room for extending this in a way that wed'd have subclasses for
some particularly often use patterns but as I said I'm not even sure
that's worth it.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 include/linux/cleanup.h |  7 ++++
 include/linux/file.h    | 91 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index 19c7e475d3a4..b8bd2f15f91f 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -261,6 +261,10 @@ const volatile void * __must_check_fn(const volatile void *val)
  * CLASS(name, var)(args...):
  *	declare the variable @var as an instance of the named class
  *
+ * CLASS_INIT(name, var, init_expr):
+ *	declare the variable @var as an instance of the named class with
+ *	custom initialization expression.
+ *
  * Ex.
  *
  * DEFINE_CLASS(fdget, struct fd, fdput(_T), fdget(fd), int fd)
@@ -290,6 +294,9 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
 	class_##_name##_t var __cleanup(class_##_name##_destructor) =	\
 		class_##_name##_constructor
 
+#define CLASS_INIT(_name, _var, _init_expr)                             \
+        class_##_name##_t _var __cleanup(class_##_name##_destructor) = (_init_expr)
+
 #define __scoped_class(_name, var, _label, args...)        \
 	for (CLASS(_name, var)(args); ; ({ goto _label; })) \
 		if (0) {                                   \
diff --git a/include/linux/file.h b/include/linux/file.h
index af1768d934a0..2f1853612b56 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -127,4 +127,95 @@ extern void __fput_sync(struct file *);
 
 extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
 
+/*
+ * class_fd_prepare_t: Combined fd + file allocation cleanup class.
+ *
+ * Allocates an fd and a file together. On error paths, automatically cleans
+ * up whichever resource was successfully allocated. Allows flexible file
+ * allocation with different functions per usage.
+ */
+typedef struct {
+	s32 err;
+	s32 __fd;
+	struct file *__file;
+} class_fd_prepare_t;
+
+#define fd_prepare_fd(_T) ((_T).__fd)
+#define fd_prepare_file(_T) ((_T).__file)
+
+static inline void class_fd_prepare_destructor(class_fd_prepare_t *_T)
+{
+	if (unlikely(_T->err)) {
+		if (likely(_T->__fd >= 0))
+			put_unused_fd(_T->__fd);
+		if (unlikely(!IS_ERR_OR_NULL(_T->__file)))
+			fput(_T->__file);
+	}
+}
+
+static inline int class_fd_prepare_lock_err(class_fd_prepare_t *_T)
+{
+	if (unlikely(_T->__fd < 0))
+		return _T->__fd;
+	if (unlikely(IS_ERR(_T->__file)))
+		return PTR_ERR(_T->__file);
+	if (unlikely(!_T->__file))
+		return -ENOMEM;
+	return 0;
+}
+
+/*
+ * __FD_PREPARE_INIT(fd_flags, file_init_expr):
+ *     Helper to initialize fd_prepare class.
+ * @fd_flags: flags for get_unused_fd_flags()
+ * @file_init_expr: expression that returns struct file *
+ *
+ * Returns a struct fd_prepare with fd, file, and err set.
+ * If fd allocation fails, fd will be negative and err will be set.
+ * If fd succeeds but file_init_expr fails, file will be ERR_PTR and err will be set.
+ * The err field is the single source of truth for error checking.
+ */
+#define __FD_PREPARE_INIT(_fd_flags, _file_init_owned)                   \
+	({                                                               \
+		class_fd_prepare_t _fd_prepare = {                       \
+			.__fd = get_unused_fd_flags((_fd_flags)),        \
+		};                                                       \
+		if (likely(_fd_prepare.__fd >= 0))                       \
+			_fd_prepare.__file = (_file_init_owned);         \
+		_fd_prepare.err = ACQUIRE_ERR(fd_prepare, &_fd_prepare); \
+		_fd_prepare;                                             \
+	})
+
+/*
+ * FD_PREPARE(var, fd_flags, file_init_owned):
+ *     Declares and initializes an fd_prepare variable with automatic cleanup.
+ *     No separate scope required - cleanup happens when variable goes out of scope.
+ *
+ * @_var: name of struct fd_prepare variable to define
+ * @_fd_flags: flags for get_unused_fd_flags()
+ * @_file_init_owned: struct file to take ownership of (can be expression)
+ */
+#define FD_PREPARE(_var, _fd_flags, _file_init_owned) \
+	CLASS_INIT(fd_prepare, _var, __FD_PREPARE_INIT(_fd_flags, _file_init_owned))
+
+#define fd_publish(_fd_prepare)                                \
+	({                                                     \
+		class_fd_prepare_t *__p = &(_fd_prepare);      \
+		VFS_WARN_ON_ONCE(__p->err);                    \
+		VFS_WARN_ON_ONCE(__p->__fd < 0);               \
+		VFS_WARN_ON_ONCE(IS_ERR_OR_NULL(__p->__file)); \
+		fd_install(__p->__fd, __p->__file);            \
+		retain_and_null_ptr(__p->__file);              \
+		take_fd(__p->__fd);                            \
+	})
+
+#define FD_ADD(_fd_flags, _file_init_owned)                    \
+	({                                                     \
+		FD_PREPARE(_var, _fd_flags, _file_init_owned); \
+		s32 ret = _var.err;                            \
+		if (likely(!ret))                              \
+			ret = fd_publish(_var);                \
+		ret;                                           \
+	})
+
 #endif /* __LINUX_FILE_H */

-- 
2.47.3


  reply	other threads:[~2025-11-23 16:33 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-23 16:33 [PATCH v4 00/47] file: FD_{ADD,PREPARE}() Christian Brauner
2025-11-23 16:33 ` Christian Brauner [this message]
2025-11-24 15:03   ` [PATCH v4 01/47] file: add FD_{ADD,PREPARE}() Jason Gunthorpe
2025-11-23 16:33 ` [PATCH v4 02/47] anon_inodes: convert to FD_PREPARE() Christian Brauner
2025-11-25 12:16   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 03/47] eventfd: convert do_eventfd() " Christian Brauner
2025-11-25 12:18   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 04/47] fhandle: convert do_handle_open() " Christian Brauner
2025-11-25 12:15   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 05/47] namespace: convert open_tree() " Christian Brauner
2025-11-25 12:32   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 06/47] namespace: convert open_tree_attr() " Christian Brauner
2025-11-25 12:33   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 07/47] namespace: convert fsmount() " Christian Brauner
2025-11-25 12:42   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 08/47] fanotify: convert fanotify_init() " Christian Brauner
2025-11-24 17:13   ` Amir Goldstein
2025-11-25 12:13   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 09/47] nsfs: convert open_namespace() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 10/47] nsfs: convert ns_ioctl() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 11/47] autofs: convert autofs_dev_ioctl_open_mountpoint() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 12/47] eventpoll: convert do_epoll_create() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 13/47] open: convert do_sys_openat2() " Christian Brauner
2025-11-25 12:20   ` Jan Kara
2025-11-23 16:33 ` [PATCH v4 14/47] signalfd: convert do_signalfd4() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 15/47] timerfd: convert timerfd_create() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 16/47] userfaultfd: convert new_userfaultfd() " Christian Brauner
2025-11-25 11:24   ` Mike Rapoport
2025-11-23 16:33 ` [PATCH v4 17/47] xfs: convert xfs_open_by_handle() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 18/47] dma: convert dma_buf_fd() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 19/47] af_unix: convert unix_file_open() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 20/47] dma: convert sync_file_ioctl_merge() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 21/47] exec: convert begin_new_exec() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 22/47] ipc: convert do_mq_open() " Christian Brauner
2025-11-25 22:30   ` Mark Brown
2025-11-26  0:05     ` Linus Torvalds
2025-11-26 13:20       ` Christian Brauner
2025-11-26 16:10         ` Linus Torvalds
2025-11-23 16:33 ` [PATCH v4 23/47] bpf: convert bpf_iter_new_fd() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 24/47] bpf: convert bpf_token_create() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 25/47] memfd: convert memfd_create() " Christian Brauner
2025-11-25 11:30   ` Mike Rapoport
2025-11-23 16:33 ` [PATCH v4 26/47] secretmem: convert memfd_secret() " Christian Brauner
2025-11-25 11:21   ` Mike Rapoport
2025-11-23 16:33 ` [PATCH v4 27/47] net/handshake: convert handshake_nl_accept_doit() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 28/47] net/kcm: convert kcm_ioctl() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 29/47] net/sctp: convert sctp_getsockopt_peeloff_common() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 30/47] net/socket: convert sock_map_fd() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 31/47] net/socket: convert __sys_accept4_file() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 32/47] spufs: convert spufs_context_open() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 33/47] papr-hvpipe: convert papr_hvpipe_dev_create_handle() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 34/47] spufs: convert spufs_gang_open() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 35/47] pseries: convert papr_platform_dump_create_handle() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 36/47] pseries: port papr_rtas_setup_file_interface() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 37/47] dma: port sw_sync_ioctl_create_fence() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 38/47] gpio: convert linehandle_create() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 39/47] hv: convert mshv_ioctl_create_partition() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 40/47] media: convert media_request_alloc() " Christian Brauner
2025-11-23 16:33 ` [PATCH v4 41/47] ntsync: convert ntsync_obj_get_fd() " Christian Brauner
2025-11-23 16:34 ` [PATCH v4 42/47] tty: convert ptm_open_peer() " Christian Brauner
2025-11-25 22:39   ` Mark Brown
2025-11-26 13:35     ` Christian Brauner
2025-11-26 14:22       ` Mark Brown
2025-11-23 16:34 ` [PATCH v4 43/47] vfio: convert vfio_group_ioctl_get_device_fd() " Christian Brauner
2025-11-24 14:52   ` Jason Gunthorpe
2025-11-23 16:34 ` [PATCH v4 44/47] file: convert replace_fd() " Christian Brauner
2025-11-25 12:29   ` Jan Kara
2025-11-23 16:34 ` [PATCH v4 45/47] io_uring: convert io_create_mock_file() " Christian Brauner
2025-11-25 16:38   ` Jens Axboe
2025-11-23 16:34 ` [PATCH v4 46/47] kvm: convert kvm_arch_supports_gmem_init_shared() " Christian Brauner
2025-11-23 16:34 ` [PATCH v4 47/47] kvm: convert kvm_vcpu_ioctl_get_stats_fd() " Christian Brauner
2025-11-23 16:43 ` [PATCH v4 00/47] file: FD_{ADD,PREPARE}() Linus Torvalds
2025-11-23 16:46   ` Linus Torvalds
2025-11-23 22:53   ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251123-work-fd-prepare-v4-1-b6efa1706cfd@kernel.org \
    --to=brauner@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).