* [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
@ 2025-06-24 8:29 Christian Brauner
2025-06-24 8:29 ` [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type Christian Brauner
` (12 more replies)
0 siblings, 13 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner, stable
Various filesystems such as pidfs and drm support opening file handles
without having to require a file descriptor to identify the filesystem.
The filesystem are global single instances and can be trivially
identified solely on the information encoded in the file handle.
This makes it possible to not have to keep or acquire a sentinal file
descriptor just to pass it to open_by_handle_at() to identify the
filesystem. That's especially useful when such sentinel file descriptor
cannot or should not be acquired.
For pidfs this means a file handle can function as full replacement for
storing a pid in a file. Instead a file handle can be stored and
reopened purely based on the file handle.
Such autonomous file handles can be opened with or without specifying a
a file descriptor. If no proper file descriptor is used the FD_INVALID
sentinel must be passed. This allows us to define further special
negative fd sentinels in the future.
Userspace can trivially test for support by trying to open the file
handle with an invalid file descriptor.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Changes in v2:
- Simplify the FILEID_PIDFS enum.
- Introduce FD_INVALID.
- Require FD_INVALID for autonomous file handles.
- Link to v1: https://lore.kernel.org/20250623-work-pidfs-fhandle-v1-0-75899d67555f@kernel.org
---
Christian Brauner (11):
fhandle: raise FILEID_IS_DIR in handle_type
fhandle: hoist copy_from_user() above get_path_from_fd()
fhandle: rename to get_path_anchor()
pidfs: add pidfs_root_path() helper
fhandle: reflow get_path_anchor()
uapi/fcntl: mark range as reserved
uapi/fcntl: add FD_INVALID
exportfs: add FILEID_PIDFS
fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
fhandle, pidfs: support open_by_handle_at() purely based on file handle
selftests/pidfd: decode pidfd file handles withou having to specify an fd
fs/fhandle.c | 82 ++++++++++++++--------
fs/internal.h | 1 +
fs/pidfs.c | 16 ++++-
include/linux/exportfs.h | 9 ++-
include/uapi/linux/fcntl.h | 17 +++++
include/uapi/linux/pidfd.h | 15 ----
tools/testing/selftests/pidfd/Makefile | 2 +-
tools/testing/selftests/pidfd/pidfd.h | 6 +-
.../selftests/pidfd/pidfd_file_handle_test.c | 60 ++++++++++++++++
9 files changed, 158 insertions(+), 50 deletions(-)
---
base-commit: 4e3d1e6e1b2d9df9650be14380c534b3c5081ddd
change-id: 20250619-work-pidfs-fhandle-b63ff35c4924
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd() Christian Brauner
` (11 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner, stable
Currently FILEID_IS_DIR is raised in fh_flags which is wrong.
Raise it in handle->handle_type were it's supposed to be.
Fixes: c374196b2b9f ("fs: name_to_handle_at() support for "explicit connectable" file handles")
Cc: <stable@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 3e092ae6d142..66ff60591d17 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -88,7 +88,7 @@ static long do_sys_name_to_handle(const struct path *path,
if (fh_flags & EXPORT_FH_CONNECTABLE) {
handle->handle_type |= FILEID_IS_CONNECTABLE;
if (d_is_dir(path->dentry))
- fh_flags |= FILEID_IS_DIR;
+ handle->handle_type |= FILEID_IS_DIR;
}
retval = 0;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd()
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
2025-06-24 8:29 ` [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 03/11] fhandle: rename to get_path_anchor() Christian Brauner
` (10 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
In follow-up patches we need access to @file_handle->handle_type
before we start caring about get_path_from_fd().
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 35 ++++++++++++++---------------------
1 file changed, 14 insertions(+), 21 deletions(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 66ff60591d17..73f56f8e7d5d 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -323,13 +323,24 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
{
int retval = 0;
struct file_handle f_handle;
- struct file_handle *handle = NULL;
+ struct file_handle *handle __free(kfree) = NULL;
struct handle_to_path_ctx ctx = {};
const struct export_operations *eops;
+ if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle)))
+ return -EFAULT;
+
+ if ((f_handle.handle_bytes > MAX_HANDLE_SZ) ||
+ (f_handle.handle_bytes == 0))
+ return -EINVAL;
+
+ if (f_handle.handle_type < 0 ||
+ FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
+ return -EINVAL;
+
retval = get_path_from_fd(mountdirfd, &ctx.root);
if (retval)
- goto out_err;
+ return retval;
eops = ctx.root.mnt->mnt_sb->s_export_op;
if (eops && eops->permission)
@@ -339,21 +350,6 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
if (retval)
goto out_path;
- if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
- retval = -EFAULT;
- goto out_path;
- }
- if ((f_handle.handle_bytes > MAX_HANDLE_SZ) ||
- (f_handle.handle_bytes == 0)) {
- retval = -EINVAL;
- goto out_path;
- }
- if (f_handle.handle_type < 0 ||
- FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS) {
- retval = -EINVAL;
- goto out_path;
- }
-
handle = kmalloc(struct_size(handle, f_handle, f_handle.handle_bytes),
GFP_KERNEL);
if (!handle) {
@@ -366,7 +362,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
&ufh->f_handle,
f_handle.handle_bytes)) {
retval = -EFAULT;
- goto out_handle;
+ goto out_path;
}
/*
@@ -384,11 +380,8 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
handle->handle_type &= ~FILEID_USER_FLAGS_MASK;
retval = do_handle_to_path(handle, path, &ctx);
-out_handle:
- kfree(handle);
out_path:
path_put(&ctx.root);
-out_err:
return retval;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 03/11] fhandle: rename to get_path_anchor()
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
2025-06-24 8:29 ` [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type Christian Brauner
2025-06-24 8:29 ` [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd() Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 04/11] pidfs: add pidfs_root_path() helper Christian Brauner
` (9 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Rename as we're going to expand the function in the next step. The path
just serves as the anchor tying the decoding to the filesystem.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 73f56f8e7d5d..d8d32208c621 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -168,7 +168,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
return err;
}
-static int get_path_from_fd(int fd, struct path *root)
+static int get_path_anchor(int fd, struct path *root)
{
if (fd == AT_FDCWD) {
struct fs_struct *fs = current->fs;
@@ -338,7 +338,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
return -EINVAL;
- retval = get_path_from_fd(mountdirfd, &ctx.root);
+ retval = get_path_anchor(mountdirfd, &ctx.root);
if (retval)
return retval;
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 04/11] pidfs: add pidfs_root_path() helper
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (2 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 03/11] fhandle: rename to get_path_anchor() Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 05/11] fhandle: reflow get_path_anchor() Christian Brauner
` (8 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Allow to return the root of the global pidfs filesystem.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/internal.h | 1 +
fs/pidfs.c | 11 +++++++++++
2 files changed, 12 insertions(+)
diff --git a/fs/internal.h b/fs/internal.h
index 22ba066d1dba..ad256bccdc85 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -353,3 +353,4 @@ int anon_inode_getattr(struct mnt_idmap *idmap, const struct path *path,
unsigned int query_flags);
int anon_inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr);
+void pidfs_get_root(struct path *path);
diff --git a/fs/pidfs.c b/fs/pidfs.c
index ba526fdd4c4d..69b0541042b5 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -31,6 +31,14 @@
static struct kmem_cache *pidfs_attr_cachep __ro_after_init;
static struct kmem_cache *pidfs_xattr_cachep __ro_after_init;
+static struct path pidfs_root_path = {};
+
+void pidfs_get_root(struct path *path)
+{
+ *path = pidfs_root_path;
+ path_get(path);
+}
+
/*
* Stashes information that userspace needs to access even after the
* process has been reaped.
@@ -1066,4 +1074,7 @@ void __init pidfs_init(void)
pidfs_mnt = kern_mount(&pidfs_type);
if (IS_ERR(pidfs_mnt))
panic("Failed to mount pidfs pseudo filesystem");
+
+ pidfs_root_path.mnt = pidfs_mnt;
+ pidfs_root_path.dentry = pidfs_mnt->mnt_root;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 05/11] fhandle: reflow get_path_anchor()
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (3 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 04/11] pidfs: add pidfs_root_path() helper Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:16 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 06/11] uapi/fcntl: mark range as reserved Christian Brauner
` (7 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Switch to a more common coding style.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index d8d32208c621..22edced83e4c 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -170,18 +170,22 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
static int get_path_anchor(int fd, struct path *root)
{
+ if (fd >= 0) {
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+ *root = fd_file(f)->f_path;
+ path_get(root);
+ return 0;
+ }
+
if (fd == AT_FDCWD) {
struct fs_struct *fs = current->fs;
spin_lock(&fs->lock);
*root = fs->pwd;
path_get(root);
spin_unlock(&fs->lock);
- } else {
- CLASS(fd, f)(fd);
- if (fd_empty(f))
- return -EBADF;
- *root = fd_file(f)->f_path;
- path_get(root);
+ return 0;
}
return 0;
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 06/11] uapi/fcntl: mark range as reserved
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (4 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 05/11] fhandle: reflow get_path_anchor() Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:16 ` Jan Kara
2025-06-24 10:57 ` Amir Goldstein
2025-06-24 8:29 ` [PATCH v2 07/11] uapi/fcntl: add FD_INVALID Christian Brauner
` (6 subsequent siblings)
12 siblings, 2 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Mark the range from -10000 to -40000 as a range reserved for special
in-kernel values. Move the PIDFD_SELF_*/PIDFD_THREAD_* sentinels over so
all the special values are in one place.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
include/uapi/linux/fcntl.h | 16 ++++++++++++++++
include/uapi/linux/pidfd.h | 15 ---------------
tools/testing/selftests/pidfd/pidfd.h | 2 +-
3 files changed, 17 insertions(+), 16 deletions(-)
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index a15ac2fa4b20..ba4a698d2f33 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -90,10 +90,26 @@
#define DN_ATTRIB 0x00000020 /* File changed attibutes */
#define DN_MULTISHOT 0x80000000 /* Don't remove notifier */
+/* Reserved kernel ranges [-100], [-10000, -40000]. */
#define AT_FDCWD -100 /* Special value for dirfd used to
indicate openat should use the
current working directory. */
+/*
+ * The concept of process and threads in userland and the kernel is a confusing
+ * one - within the kernel every thread is a 'task' with its own individual PID,
+ * however from userland's point of view threads are grouped by a single PID,
+ * which is that of the 'thread group leader', typically the first thread
+ * spawned.
+ *
+ * To cut the Gideon knot, for internal kernel usage, we refer to
+ * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
+ * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
+ * group leader...
+ */
+#define PIDFD_SELF_THREAD -10000 /* Current thread. */
+#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
+
/* Generic flags for the *at(2) family of syscalls. */
diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
index c27a4e238e4b..957db425d459 100644
--- a/include/uapi/linux/pidfd.h
+++ b/include/uapi/linux/pidfd.h
@@ -42,21 +42,6 @@
#define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
#define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
-/*
- * The concept of process and threads in userland and the kernel is a confusing
- * one - within the kernel every thread is a 'task' with its own individual PID,
- * however from userland's point of view threads are grouped by a single PID,
- * which is that of the 'thread group leader', typically the first thread
- * spawned.
- *
- * To cut the Gideon knot, for internal kernel usage, we refer to
- * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
- * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
- * group leader...
- */
-#define PIDFD_SELF_THREAD -10000 /* Current thread. */
-#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
-
/*
* ...and for userland we make life simpler - PIDFD_SELF refers to the current
* thread, PIDFD_SELF_PROCESS refers to the process thread group leader.
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index efd74063126e..5dfeb1bdf399 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -56,7 +56,7 @@
#endif
#ifndef PIDFD_SELF_THREAD_GROUP
-#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
+#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
#endif
#ifndef PIDFD_SELF
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 07/11] uapi/fcntl: add FD_INVALID
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (5 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 06/11] uapi/fcntl: mark range as reserved Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:17 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 08/11] exportfs: add FILEID_PIDFS Christian Brauner
` (5 subsequent siblings)
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Add a marker for an invalid file descriptor.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
include/uapi/linux/fcntl.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index ba4a698d2f33..a5bebe7c4400 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -110,6 +110,7 @@
#define PIDFD_SELF_THREAD -10000 /* Current thread. */
#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
+#define FD_INVALID -10009 /* Invalid file descriptor: -10000 - EBADF = -10009 */
/* Generic flags for the *at(2) family of syscalls. */
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 08/11] exportfs: add FILEID_PIDFS
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (6 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 07/11] uapi/fcntl: add FD_INVALID Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:17 ` Jan Kara
2025-06-24 13:15 ` Amir Goldstein
2025-06-24 8:29 ` [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker Christian Brauner
` (4 subsequent siblings)
12 siblings, 2 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Introduce new pidfs file handle values.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
include/linux/exportfs.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 25c4a5afbd44..5bb757b51f5c 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -131,6 +131,11 @@ enum fid_type {
* Filesystems must not use 0xff file ID.
*/
FILEID_INVALID = 0xff,
+
+ /* Internal kernel fid types */
+
+ /* pidfs fid type */
+ FILEID_PIDFS = 0x100,
};
struct fid {
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (7 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 08/11] exportfs: add FILEID_PIDFS Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:18 ` Jan Kara
2025-06-24 9:20 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle Christian Brauner
` (3 subsequent siblings)
12 siblings, 2 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Allow a filesystem to indicate that it supports encoding autonomous file
handles that can be decoded without having to pass a filesystem for the
filesystem. In other words, the file handle uniquely identifies the
filesystem.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 7 ++++++-
include/linux/exportfs.h | 4 +++-
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 22edced83e4c..ab4891925b52 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -23,12 +23,13 @@ static long do_sys_name_to_handle(const struct path *path,
struct file_handle f_handle;
int handle_dwords, handle_bytes;
struct file_handle *handle = NULL;
+ const struct export_operations *eops = path->dentry->d_sb->s_export_op;
/*
* We need to make sure whether the file system support decoding of
* the file handle if decodeable file handle was requested.
*/
- if (!exportfs_can_encode_fh(path->dentry->d_sb->s_export_op, fh_flags))
+ if (!exportfs_can_encode_fh(eops, fh_flags))
return -EOPNOTSUPP;
/*
@@ -90,6 +91,10 @@ static long do_sys_name_to_handle(const struct path *path,
if (d_is_dir(path->dentry))
handle->handle_type |= FILEID_IS_DIR;
}
+
+ /* Filesystems supports autonomous file handles. */
+ if (eops->flags & EXPORT_OP_AUTONOMOUS_HANDLES)
+ handle->handle_type |= FILEID_IS_AUTONOMOUS;
retval = 0;
}
/* copy the mount id */
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 5bb757b51f5c..f7f9038b285e 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -188,7 +188,8 @@ struct handle_to_path_ctx {
/* Flags supported in encoded handle_type that is exported to user */
#define FILEID_IS_CONNECTABLE 0x10000
#define FILEID_IS_DIR 0x20000
-#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR)
+#define FILEID_IS_AUTONOMOUS 0x40000
+#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR | FILEID_IS_AUTONOMOUS)
/**
* struct export_operations - for nfsd to communicate with file systems
@@ -285,6 +286,7 @@ struct export_operations {
*/
#define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
#define EXPORT_OP_NOLOCKS (0x40) /* no file locking support */
+#define EXPORT_OP_AUTONOMOUS_HANDLES (0x80) /* filesystem supports autonomous file handles */
unsigned long flags;
};
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (8 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:30 ` Jan Kara
2025-06-24 23:07 ` Al Viro
2025-06-24 8:29 ` [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd Christian Brauner
` (2 subsequent siblings)
12 siblings, 2 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Various filesystems such as pidfs (and likely drm in the future) have a
use-case to support opening files purely based on the handle without
having to require a file descriptor to another object. That's especially
the case for filesystems that don't do any lookup whatsoever and there's
zero relationship between the objects. Such filesystems are also
singletons that stay around for the lifetime of the system meaning that
they can be uniquely identified and accessed purely based on the file
handle type. Enable that so that userspace doesn't have to allocate an
object needlessly especially if they can't do that for whatever reason.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 22 ++++++++++++++++++++--
fs/pidfs.c | 5 ++++-
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index ab4891925b52..54081e19f594 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
return err;
}
-static int get_path_anchor(int fd, struct path *root)
+static int get_path_anchor(int fd, struct path *root, int handle_type)
{
if (fd >= 0) {
CLASS(fd, f)(fd);
@@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
return 0;
}
+ /*
+ * Only autonomous handles can be decoded without a file
+ * descriptor.
+ */
+ if (!(handle_type & FILEID_IS_AUTONOMOUS))
+ return -EOPNOTSUPP;
+
+ if (fd != FD_INVALID)
+ return -EINVAL;
+
+ switch (handle_type & ~FILEID_USER_FLAGS_MASK) {
+ case FILEID_PIDFS:
+ pidfs_get_root(root);
+ break;
+ default:
+ return -EINVAL;
+ }
+
return 0;
}
@@ -347,7 +365,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
return -EINVAL;
- retval = get_path_anchor(mountdirfd, &ctx.root);
+ retval = get_path_anchor(mountdirfd, &ctx.root, f_handle.handle_type);
if (retval)
return retval;
diff --git a/fs/pidfs.c b/fs/pidfs.c
index 69b0541042b5..2ab9b47fbfae 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -747,7 +747,7 @@ static int pidfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
*max_len = 2;
*(u64 *)fh = pid->ino;
- return FILEID_KERNFS;
+ return FILEID_PIDFS;
}
static int pidfs_ino_find(const void *key, const struct rb_node *node)
@@ -802,6 +802,8 @@ static struct dentry *pidfs_fh_to_dentry(struct super_block *sb,
return NULL;
switch (fh_type) {
+ case FILEID_PIDFS:
+ fallthrough;
case FILEID_KERNFS:
pid_ino = *(u64 *)fid;
break;
@@ -860,6 +862,7 @@ static const struct export_operations pidfs_export_operations = {
.fh_to_dentry = pidfs_fh_to_dentry,
.open = pidfs_export_open,
.permission = pidfs_export_permission,
+ .flags = EXPORT_OP_AUTONOMOUS_HANDLES,
};
static int pidfs_init_inode(struct inode *inode, void *data)
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (9 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle Christian Brauner
@ 2025-06-24 8:29 ` Christian Brauner
2025-06-24 9:39 ` Jan Kara
2025-06-24 10:58 ` [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Amir Goldstein
2025-06-24 10:59 ` Christian Brauner
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 8:29 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, Christian Brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
tools/testing/selftests/pidfd/Makefile | 2 +-
tools/testing/selftests/pidfd/pidfd.h | 4 ++
.../selftests/pidfd/pidfd_file_handle_test.c | 60 ++++++++++++++++++++++
3 files changed, 65 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index 03a6eede9c9e..764a8f9ecefa 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-CFLAGS += -g $(KHDR_INCLUDES) -pthread -Wall
+CFLAGS += -g $(KHDR_INCLUDES) $(TOOLS_INCLUDES) -pthread -Wall
TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test \
pidfd_poll_test pidfd_wait pidfd_getfd_test pidfd_setns_test \
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
index 5dfeb1bdf399..b427a2636402 100644
--- a/tools/testing/selftests/pidfd/pidfd.h
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -19,6 +19,10 @@
#include "../kselftest.h"
#include "../clone3/clone3_selftests.h"
+#ifndef FD_INVALID
+#define FD_INVALID -10009 /* Invalid file descriptor. */
+#endif
+
#ifndef P_PIDFD
#define P_PIDFD 3
#endif
diff --git a/tools/testing/selftests/pidfd/pidfd_file_handle_test.c b/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
index 439b9c6c0457..ff1bf51bca5e 100644
--- a/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
@@ -500,4 +500,64 @@ TEST_F(file_handle, valid_name_to_handle_at_flags)
ASSERT_EQ(close(pidfd), 0);
}
+/*
+ * That we decode a file handle without having to pass a pidfd.
+ */
+TEST_F(file_handle, decode_purely_based_on_file_handle)
+{
+ int mnt_id;
+ struct file_handle *fh;
+ int pidfd = -EBADF;
+ struct stat st1, st2;
+
+ fh = malloc(sizeof(struct file_handle) + MAX_HANDLE_SZ);
+ ASSERT_NE(fh, NULL);
+ memset(fh, 0, sizeof(struct file_handle) + MAX_HANDLE_SZ);
+ fh->handle_bytes = MAX_HANDLE_SZ;
+
+ ASSERT_EQ(name_to_handle_at(self->child_pidfd1, "", fh, &mnt_id, AT_EMPTY_PATH), 0);
+
+ ASSERT_EQ(fstat(self->child_pidfd1, &st1), 0);
+
+ pidfd = open_by_handle_at(FD_INVALID, fh, 0);
+ ASSERT_GE(pidfd, 0);
+
+ ASSERT_EQ(fstat(pidfd, &st2), 0);
+ ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
+
+ ASSERT_EQ(close(pidfd), 0);
+
+ pidfd = open_by_handle_at(FD_INVALID, fh, O_CLOEXEC);
+ ASSERT_GE(pidfd, 0);
+
+ ASSERT_EQ(fstat(pidfd, &st2), 0);
+ ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
+
+ ASSERT_EQ(close(pidfd), 0);
+
+ pidfd = open_by_handle_at(FD_INVALID, fh, O_NONBLOCK);
+ ASSERT_GE(pidfd, 0);
+
+ ASSERT_EQ(fstat(pidfd, &st2), 0);
+ ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
+
+ ASSERT_EQ(close(pidfd), 0);
+
+ pidfd = open_by_handle_at(self->pidfd, fh, 0);
+ ASSERT_GE(pidfd, 0);
+
+ ASSERT_EQ(fstat(pidfd, &st2), 0);
+ ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
+
+ ASSERT_EQ(close(pidfd), 0);
+
+ pidfd = open_by_handle_at(-EBADF, fh, 0);
+ ASSERT_LT(pidfd, 0);
+
+ pidfd = open_by_handle_at(AT_FDCWD, fh, 0);
+ ASSERT_LT(pidfd, 0);
+
+ free(fh);
+}
+
TEST_HARNESS_MAIN
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH v2 05/11] fhandle: reflow get_path_anchor()
2025-06-24 8:29 ` [PATCH v2 05/11] fhandle: reflow get_path_anchor() Christian Brauner
@ 2025-06-24 9:16 ` Jan Kara
2025-06-24 10:16 ` Christian Brauner
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:16 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:08, Christian Brauner wrote:
> Switch to a more common coding style.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> fs/fhandle.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index d8d32208c621..22edced83e4c 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -170,18 +170,22 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
>
> static int get_path_anchor(int fd, struct path *root)
> {
> + if (fd >= 0) {
> + CLASS(fd, f)(fd);
> + if (fd_empty(f))
> + return -EBADF;
> + *root = fd_file(f)->f_path;
> + path_get(root);
> + return 0;
> + }
> +
> if (fd == AT_FDCWD) {
> struct fs_struct *fs = current->fs;
> spin_lock(&fs->lock);
> *root = fs->pwd;
> path_get(root);
> spin_unlock(&fs->lock);
> - } else {
> - CLASS(fd, f)(fd);
> - if (fd_empty(f))
> - return -EBADF;
> - *root = fd_file(f)->f_path;
> - path_get(root);
> + return 0;
> }
This actually introduces a regression that when userspace passes invalid fd
< 0, we'd be returning 0 whereas previously we were returning -EBADF. I
think the return below should be switched to -EBADF to fix that.
> return 0;
>
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 06/11] uapi/fcntl: mark range as reserved
2025-06-24 8:29 ` [PATCH v2 06/11] uapi/fcntl: mark range as reserved Christian Brauner
@ 2025-06-24 9:16 ` Jan Kara
2025-06-24 10:57 ` Amir Goldstein
1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:16 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:09, Christian Brauner wrote:
> Mark the range from -10000 to -40000 as a range reserved for special
> in-kernel values. Move the PIDFD_SELF_*/PIDFD_THREAD_* sentinels over so
> all the special values are in one place.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> include/uapi/linux/fcntl.h | 16 ++++++++++++++++
> include/uapi/linux/pidfd.h | 15 ---------------
> tools/testing/selftests/pidfd/pidfd.h | 2 +-
> 3 files changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index a15ac2fa4b20..ba4a698d2f33 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -90,10 +90,26 @@
> #define DN_ATTRIB 0x00000020 /* File changed attibutes */
> #define DN_MULTISHOT 0x80000000 /* Don't remove notifier */
>
> +/* Reserved kernel ranges [-100], [-10000, -40000]. */
> #define AT_FDCWD -100 /* Special value for dirfd used to
> indicate openat should use the
> current working directory. */
>
> +/*
> + * The concept of process and threads in userland and the kernel is a confusing
> + * one - within the kernel every thread is a 'task' with its own individual PID,
> + * however from userland's point of view threads are grouped by a single PID,
> + * which is that of the 'thread group leader', typically the first thread
> + * spawned.
> + *
> + * To cut the Gideon knot, for internal kernel usage, we refer to
> + * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> + * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> + * group leader...
> + */
> +#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
> +
>
> /* Generic flags for the *at(2) family of syscalls. */
>
> diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
> index c27a4e238e4b..957db425d459 100644
> --- a/include/uapi/linux/pidfd.h
> +++ b/include/uapi/linux/pidfd.h
> @@ -42,21 +42,6 @@
> #define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
> #define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
>
> -/*
> - * The concept of process and threads in userland and the kernel is a confusing
> - * one - within the kernel every thread is a 'task' with its own individual PID,
> - * however from userland's point of view threads are grouped by a single PID,
> - * which is that of the 'thread group leader', typically the first thread
> - * spawned.
> - *
> - * To cut the Gideon knot, for internal kernel usage, we refer to
> - * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> - * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> - * group leader...
> - */
> -#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> -
> /*
> * ...and for userland we make life simpler - PIDFD_SELF refers to the current
> * thread, PIDFD_SELF_PROCESS refers to the process thread group leader.
> diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
> index efd74063126e..5dfeb1bdf399 100644
> --- a/tools/testing/selftests/pidfd/pidfd.h
> +++ b/tools/testing/selftests/pidfd/pidfd.h
> @@ -56,7 +56,7 @@
> #endif
>
> #ifndef PIDFD_SELF_THREAD_GROUP
> -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
> #endif
>
> #ifndef PIDFD_SELF
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 07/11] uapi/fcntl: add FD_INVALID
2025-06-24 8:29 ` [PATCH v2 07/11] uapi/fcntl: add FD_INVALID Christian Brauner
@ 2025-06-24 9:17 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:17 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:10, Christian Brauner wrote:
> Add a marker for an invalid file descriptor.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> include/uapi/linux/fcntl.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index ba4a698d2f33..a5bebe7c4400 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -110,6 +110,7 @@
> #define PIDFD_SELF_THREAD -10000 /* Current thread. */
> #define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
>
> +#define FD_INVALID -10009 /* Invalid file descriptor: -10000 - EBADF = -10009 */
>
> /* Generic flags for the *at(2) family of syscalls. */
>
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 08/11] exportfs: add FILEID_PIDFS
2025-06-24 8:29 ` [PATCH v2 08/11] exportfs: add FILEID_PIDFS Christian Brauner
@ 2025-06-24 9:17 ` Jan Kara
2025-06-24 13:15 ` Amir Goldstein
1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:17 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:11, Christian Brauner wrote:
> Introduce new pidfs file handle values.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> include/linux/exportfs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index 25c4a5afbd44..5bb757b51f5c 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -131,6 +131,11 @@ enum fid_type {
> * Filesystems must not use 0xff file ID.
> */
> FILEID_INVALID = 0xff,
> +
> + /* Internal kernel fid types */
> +
> + /* pidfs fid type */
> + FILEID_PIDFS = 0x100,
> };
>
> struct fid {
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
2025-06-24 8:29 ` [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker Christian Brauner
@ 2025-06-24 9:18 ` Jan Kara
2025-06-24 9:20 ` Jan Kara
1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:18 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:12, Christian Brauner wrote:
> Allow a filesystem to indicate that it supports encoding autonomous file
> handles that can be decoded without having to pass a filesystem for the
> filesystem. In other words, the file handle uniquely identifies the
> filesystem.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/fhandle.c | 7 ++++++-
> include/linux/exportfs.h | 4 +++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 22edced83e4c..ab4891925b52 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -23,12 +23,13 @@ static long do_sys_name_to_handle(const struct path *path,
> struct file_handle f_handle;
> int handle_dwords, handle_bytes;
> struct file_handle *handle = NULL;
> + const struct export_operations *eops = path->dentry->d_sb->s_export_op;
>
> /*
> * We need to make sure whether the file system support decoding of
> * the file handle if decodeable file handle was requested.
> */
> - if (!exportfs_can_encode_fh(path->dentry->d_sb->s_export_op, fh_flags))
> + if (!exportfs_can_encode_fh(eops, fh_flags))
> return -EOPNOTSUPP;
>
> /*
> @@ -90,6 +91,10 @@ static long do_sys_name_to_handle(const struct path *path,
> if (d_is_dir(path->dentry))
> handle->handle_type |= FILEID_IS_DIR;
> }
> +
> + /* Filesystems supports autonomous file handles. */
> + if (eops->flags & EXPORT_OP_AUTONOMOUS_HANDLES)
> + handle->handle_type |= FILEID_IS_AUTONOMOUS;
> retval = 0;
> }
> /* copy the mount id */
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index 5bb757b51f5c..f7f9038b285e 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -188,7 +188,8 @@ struct handle_to_path_ctx {
> /* Flags supported in encoded handle_type that is exported to user */
> #define FILEID_IS_CONNECTABLE 0x10000
> #define FILEID_IS_DIR 0x20000
> -#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR)
> +#define FILEID_IS_AUTONOMOUS 0x40000
> +#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR | FILEID_IS_AUTONOMOUS)
>
> /**
> * struct export_operations - for nfsd to communicate with file systems
> @@ -285,6 +286,7 @@ struct export_operations {
> */
> #define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
> #define EXPORT_OP_NOLOCKS (0x40) /* no file locking support */
> +#define EXPORT_OP_AUTONOMOUS_HANDLES (0x80) /* filesystem supports autonomous file handles */
> unsigned long flags;
> };
>
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
2025-06-24 8:29 ` [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker Christian Brauner
2025-06-24 9:18 ` Jan Kara
@ 2025-06-24 9:20 ` Jan Kara
2025-06-24 10:16 ` Christian Brauner
1 sibling, 1 reply; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:20 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:12, Christian Brauner wrote:
> Allow a filesystem to indicate that it supports encoding autonomous file
> handles that can be decoded without having to pass a filesystem for the
> filesystem.
Forgot to mention the above phrase "to pass a filesystem for the
filesystem" doesn't make sense :) But my reviewed-by holds.
Honza
> In other words, the file handle uniquely identifies the > filesystem.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> fs/fhandle.c | 7 ++++++-
> include/linux/exportfs.h | 4 +++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 22edced83e4c..ab4891925b52 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -23,12 +23,13 @@ static long do_sys_name_to_handle(const struct path *path,
> struct file_handle f_handle;
> int handle_dwords, handle_bytes;
> struct file_handle *handle = NULL;
> + const struct export_operations *eops = path->dentry->d_sb->s_export_op;
>
> /*
> * We need to make sure whether the file system support decoding of
> * the file handle if decodeable file handle was requested.
> */
> - if (!exportfs_can_encode_fh(path->dentry->d_sb->s_export_op, fh_flags))
> + if (!exportfs_can_encode_fh(eops, fh_flags))
> return -EOPNOTSUPP;
>
> /*
> @@ -90,6 +91,10 @@ static long do_sys_name_to_handle(const struct path *path,
> if (d_is_dir(path->dentry))
> handle->handle_type |= FILEID_IS_DIR;
> }
> +
> + /* Filesystems supports autonomous file handles. */
> + if (eops->flags & EXPORT_OP_AUTONOMOUS_HANDLES)
> + handle->handle_type |= FILEID_IS_AUTONOMOUS;
> retval = 0;
> }
> /* copy the mount id */
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index 5bb757b51f5c..f7f9038b285e 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -188,7 +188,8 @@ struct handle_to_path_ctx {
> /* Flags supported in encoded handle_type that is exported to user */
> #define FILEID_IS_CONNECTABLE 0x10000
> #define FILEID_IS_DIR 0x20000
> -#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR)
> +#define FILEID_IS_AUTONOMOUS 0x40000
> +#define FILEID_VALID_USER_FLAGS (FILEID_IS_CONNECTABLE | FILEID_IS_DIR | FILEID_IS_AUTONOMOUS)
>
> /**
> * struct export_operations - for nfsd to communicate with file systems
> @@ -285,6 +286,7 @@ struct export_operations {
> */
> #define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
> #define EXPORT_OP_NOLOCKS (0x40) /* no file locking support */
> +#define EXPORT_OP_AUTONOMOUS_HANDLES (0x80) /* filesystem supports autonomous file handles */
> unsigned long flags;
> };
>
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 8:29 ` [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle Christian Brauner
@ 2025-06-24 9:30 ` Jan Kara
2025-06-24 10:15 ` Christian Brauner
2025-06-24 10:53 ` Amir Goldstein
2025-06-24 23:07 ` Al Viro
1 sibling, 2 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:30 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> Various filesystems such as pidfs (and likely drm in the future) have a
> use-case to support opening files purely based on the handle without
> having to require a file descriptor to another object. That's especially
> the case for filesystems that don't do any lookup whatsoever and there's
> zero relationship between the objects. Such filesystems are also
> singletons that stay around for the lifetime of the system meaning that
> they can be uniquely identified and accessed purely based on the file
> handle type. Enable that so that userspace doesn't have to allocate an
> object needlessly especially if they can't do that for whatever reason.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> fs/fhandle.c | 22 ++++++++++++++++++++--
> fs/pidfs.c | 5 ++++-
> 2 files changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index ab4891925b52..54081e19f594 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> return err;
> }
>
> -static int get_path_anchor(int fd, struct path *root)
> +static int get_path_anchor(int fd, struct path *root, int handle_type)
> {
> if (fd >= 0) {
> CLASS(fd, f)(fd);
> @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> return 0;
> }
>
> + /*
> + * Only autonomous handles can be decoded without a file
> + * descriptor.
> + */
> + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> + return -EOPNOTSUPP;
This somewhat ties to my comment to patch 5 that if someone passed invalid
fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
much about it so feel free to ignore me but I think the following might be
more sensible error codes:
if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
if (fd == FD_INVALID)
return -EOPNOTSUPP;
return -EBADF;
}
if (fd != FD_INVALID)
return -EBADF; (or -EINVAL no strong preference here)
Since I don't care that much feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> +
> + if (fd != FD_INVALID)
> + return -EINVAL;
> +
> + switch (handle_type & ~FILEID_USER_FLAGS_MASK) {
> + case FILEID_PIDFS:
> + pidfs_get_root(root);
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> return 0;
> }
>
> @@ -347,7 +365,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
> return -EINVAL;
>
> - retval = get_path_anchor(mountdirfd, &ctx.root);
> + retval = get_path_anchor(mountdirfd, &ctx.root, f_handle.handle_type);
> if (retval)
> return retval;
>
> diff --git a/fs/pidfs.c b/fs/pidfs.c
> index 69b0541042b5..2ab9b47fbfae 100644
> --- a/fs/pidfs.c
> +++ b/fs/pidfs.c
> @@ -747,7 +747,7 @@ static int pidfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
>
> *max_len = 2;
> *(u64 *)fh = pid->ino;
> - return FILEID_KERNFS;
> + return FILEID_PIDFS;
> }
>
> static int pidfs_ino_find(const void *key, const struct rb_node *node)
> @@ -802,6 +802,8 @@ static struct dentry *pidfs_fh_to_dentry(struct super_block *sb,
> return NULL;
>
> switch (fh_type) {
> + case FILEID_PIDFS:
> + fallthrough;
> case FILEID_KERNFS:
> pid_ino = *(u64 *)fid;
> break;
> @@ -860,6 +862,7 @@ static const struct export_operations pidfs_export_operations = {
> .fh_to_dentry = pidfs_fh_to_dentry,
> .open = pidfs_export_open,
> .permission = pidfs_export_permission,
> + .flags = EXPORT_OP_AUTONOMOUS_HANDLES,
> };
>
> static int pidfs_init_inode(struct inode *inode, void *data)
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type
2025-06-24 8:29 ` [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type Christian Brauner
@ 2025-06-24 9:31 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:31 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs, stable
On Tue 24-06-25 10:29:04, Christian Brauner wrote:
> Currently FILEID_IS_DIR is raised in fh_flags which is wrong.
> Raise it in handle->handle_type were it's supposed to be.
>
> Fixes: c374196b2b9f ("fs: name_to_handle_at() support for "explicit connectable" file handles")
> Cc: <stable@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Still looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/fhandle.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 3e092ae6d142..66ff60591d17 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -88,7 +88,7 @@ static long do_sys_name_to_handle(const struct path *path,
> if (fh_flags & EXPORT_FH_CONNECTABLE) {
> handle->handle_type |= FILEID_IS_CONNECTABLE;
> if (d_is_dir(path->dentry))
> - fh_flags |= FILEID_IS_DIR;
> + handle->handle_type |= FILEID_IS_DIR;
> }
> retval = 0;
> }
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd()
2025-06-24 8:29 ` [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd() Christian Brauner
@ 2025-06-24 9:31 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:31 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:05, Christian Brauner wrote:
> In follow-up patches we need access to @file_handle->handle_type
> before we start caring about get_path_from_fd().
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Still looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/fhandle.c | 35 ++++++++++++++---------------------
> 1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 66ff60591d17..73f56f8e7d5d 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -323,13 +323,24 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> {
> int retval = 0;
> struct file_handle f_handle;
> - struct file_handle *handle = NULL;
> + struct file_handle *handle __free(kfree) = NULL;
> struct handle_to_path_ctx ctx = {};
> const struct export_operations *eops;
>
> + if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle)))
> + return -EFAULT;
> +
> + if ((f_handle.handle_bytes > MAX_HANDLE_SZ) ||
> + (f_handle.handle_bytes == 0))
> + return -EINVAL;
> +
> + if (f_handle.handle_type < 0 ||
> + FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
> + return -EINVAL;
> +
> retval = get_path_from_fd(mountdirfd, &ctx.root);
> if (retval)
> - goto out_err;
> + return retval;
>
> eops = ctx.root.mnt->mnt_sb->s_export_op;
> if (eops && eops->permission)
> @@ -339,21 +350,6 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> if (retval)
> goto out_path;
>
> - if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> - retval = -EFAULT;
> - goto out_path;
> - }
> - if ((f_handle.handle_bytes > MAX_HANDLE_SZ) ||
> - (f_handle.handle_bytes == 0)) {
> - retval = -EINVAL;
> - goto out_path;
> - }
> - if (f_handle.handle_type < 0 ||
> - FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS) {
> - retval = -EINVAL;
> - goto out_path;
> - }
> -
> handle = kmalloc(struct_size(handle, f_handle, f_handle.handle_bytes),
> GFP_KERNEL);
> if (!handle) {
> @@ -366,7 +362,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> &ufh->f_handle,
> f_handle.handle_bytes)) {
> retval = -EFAULT;
> - goto out_handle;
> + goto out_path;
> }
>
> /*
> @@ -384,11 +380,8 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> handle->handle_type &= ~FILEID_USER_FLAGS_MASK;
> retval = do_handle_to_path(handle, path, &ctx);
>
> -out_handle:
> - kfree(handle);
> out_path:
> path_put(&ctx.root);
> -out_err:
> return retval;
> }
>
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 03/11] fhandle: rename to get_path_anchor()
2025-06-24 8:29 ` [PATCH v2 03/11] fhandle: rename to get_path_anchor() Christian Brauner
@ 2025-06-24 9:31 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:31 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:06, Christian Brauner wrote:
> Rename as we're going to expand the function in the next step. The path
> just serves as the anchor tying the decoding to the filesystem.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/fhandle.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fhandle.c b/fs/fhandle.c
> index 73f56f8e7d5d..d8d32208c621 100644
> --- a/fs/fhandle.c
> +++ b/fs/fhandle.c
> @@ -168,7 +168,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> return err;
> }
>
> -static int get_path_from_fd(int fd, struct path *root)
> +static int get_path_anchor(int fd, struct path *root)
> {
> if (fd == AT_FDCWD) {
> struct fs_struct *fs = current->fs;
> @@ -338,7 +338,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
> return -EINVAL;
>
> - retval = get_path_from_fd(mountdirfd, &ctx.root);
> + retval = get_path_anchor(mountdirfd, &ctx.root);
> if (retval)
> return retval;
>
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 04/11] pidfs: add pidfs_root_path() helper
2025-06-24 8:29 ` [PATCH v2 04/11] pidfs: add pidfs_root_path() helper Christian Brauner
@ 2025-06-24 9:31 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:31 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:07, Christian Brauner wrote:
> Allow to return the root of the global pidfs filesystem.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/internal.h | 1 +
> fs/pidfs.c | 11 +++++++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/fs/internal.h b/fs/internal.h
> index 22ba066d1dba..ad256bccdc85 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -353,3 +353,4 @@ int anon_inode_getattr(struct mnt_idmap *idmap, const struct path *path,
> unsigned int query_flags);
> int anon_inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> struct iattr *attr);
> +void pidfs_get_root(struct path *path);
> diff --git a/fs/pidfs.c b/fs/pidfs.c
> index ba526fdd4c4d..69b0541042b5 100644
> --- a/fs/pidfs.c
> +++ b/fs/pidfs.c
> @@ -31,6 +31,14 @@
> static struct kmem_cache *pidfs_attr_cachep __ro_after_init;
> static struct kmem_cache *pidfs_xattr_cachep __ro_after_init;
>
> +static struct path pidfs_root_path = {};
> +
> +void pidfs_get_root(struct path *path)
> +{
> + *path = pidfs_root_path;
> + path_get(path);
> +}
> +
> /*
> * Stashes information that userspace needs to access even after the
> * process has been reaped.
> @@ -1066,4 +1074,7 @@ void __init pidfs_init(void)
> pidfs_mnt = kern_mount(&pidfs_type);
> if (IS_ERR(pidfs_mnt))
> panic("Failed to mount pidfs pseudo filesystem");
> +
> + pidfs_root_path.mnt = pidfs_mnt;
> + pidfs_root_path.dentry = pidfs_mnt->mnt_root;
> }
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd
2025-06-24 8:29 ` [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd Christian Brauner
@ 2025-06-24 9:39 ` Jan Kara
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 9:39 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 10:29:14, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> tools/testing/selftests/pidfd/Makefile | 2 +-
> tools/testing/selftests/pidfd/pidfd.h | 4 ++
> .../selftests/pidfd/pidfd_file_handle_test.c | 60 ++++++++++++++++++++++
> 3 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
> index 03a6eede9c9e..764a8f9ecefa 100644
> --- a/tools/testing/selftests/pidfd/Makefile
> +++ b/tools/testing/selftests/pidfd/Makefile
> @@ -1,5 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -CFLAGS += -g $(KHDR_INCLUDES) -pthread -Wall
> +CFLAGS += -g $(KHDR_INCLUDES) $(TOOLS_INCLUDES) -pthread -Wall
>
> TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test \
> pidfd_poll_test pidfd_wait pidfd_getfd_test pidfd_setns_test \
> diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
> index 5dfeb1bdf399..b427a2636402 100644
> --- a/tools/testing/selftests/pidfd/pidfd.h
> +++ b/tools/testing/selftests/pidfd/pidfd.h
> @@ -19,6 +19,10 @@
> #include "../kselftest.h"
> #include "../clone3/clone3_selftests.h"
>
> +#ifndef FD_INVALID
> +#define FD_INVALID -10009 /* Invalid file descriptor. */
> +#endif
> +
> #ifndef P_PIDFD
> #define P_PIDFD 3
> #endif
> diff --git a/tools/testing/selftests/pidfd/pidfd_file_handle_test.c b/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
> index 439b9c6c0457..ff1bf51bca5e 100644
> --- a/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
> +++ b/tools/testing/selftests/pidfd/pidfd_file_handle_test.c
> @@ -500,4 +500,64 @@ TEST_F(file_handle, valid_name_to_handle_at_flags)
> ASSERT_EQ(close(pidfd), 0);
> }
>
> +/*
> + * That we decode a file handle without having to pass a pidfd.
> + */
> +TEST_F(file_handle, decode_purely_based_on_file_handle)
> +{
> + int mnt_id;
> + struct file_handle *fh;
> + int pidfd = -EBADF;
> + struct stat st1, st2;
> +
> + fh = malloc(sizeof(struct file_handle) + MAX_HANDLE_SZ);
> + ASSERT_NE(fh, NULL);
> + memset(fh, 0, sizeof(struct file_handle) + MAX_HANDLE_SZ);
> + fh->handle_bytes = MAX_HANDLE_SZ;
> +
> + ASSERT_EQ(name_to_handle_at(self->child_pidfd1, "", fh, &mnt_id, AT_EMPTY_PATH), 0);
> +
> + ASSERT_EQ(fstat(self->child_pidfd1, &st1), 0);
> +
> + pidfd = open_by_handle_at(FD_INVALID, fh, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + ASSERT_EQ(fstat(pidfd, &st2), 0);
> + ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
> +
> + ASSERT_EQ(close(pidfd), 0);
> +
> + pidfd = open_by_handle_at(FD_INVALID, fh, O_CLOEXEC);
> + ASSERT_GE(pidfd, 0);
> +
> + ASSERT_EQ(fstat(pidfd, &st2), 0);
> + ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
> +
> + ASSERT_EQ(close(pidfd), 0);
> +
> + pidfd = open_by_handle_at(FD_INVALID, fh, O_NONBLOCK);
> + ASSERT_GE(pidfd, 0);
> +
> + ASSERT_EQ(fstat(pidfd, &st2), 0);
> + ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
> +
> + ASSERT_EQ(close(pidfd), 0);
> +
> + pidfd = open_by_handle_at(self->pidfd, fh, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + ASSERT_EQ(fstat(pidfd, &st2), 0);
> + ASSERT_TRUE(st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
> +
> + ASSERT_EQ(close(pidfd), 0);
> +
> + pidfd = open_by_handle_at(-EBADF, fh, 0);
> + ASSERT_LT(pidfd, 0);
> +
> + pidfd = open_by_handle_at(AT_FDCWD, fh, 0);
> + ASSERT_LT(pidfd, 0);
> +
> + free(fh);
> +}
> +
> TEST_HARNESS_MAIN
>
> --
> 2.47.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 9:30 ` Jan Kara
@ 2025-06-24 10:15 ` Christian Brauner
2025-06-24 10:53 ` Amir Goldstein
1 sibling, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 10:15 UTC (permalink / raw)
To: Jan Kara
Cc: Jeff Layton, Chuck Lever, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 11:30:42AM +0200, Jan Kara wrote:
> On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > Various filesystems such as pidfs (and likely drm in the future) have a
> > use-case to support opening files purely based on the handle without
> > having to require a file descriptor to another object. That's especially
> > the case for filesystems that don't do any lookup whatsoever and there's
> > zero relationship between the objects. Such filesystems are also
> > singletons that stay around for the lifetime of the system meaning that
> > they can be uniquely identified and accessed purely based on the file
> > handle type. Enable that so that userspace doesn't have to allocate an
> > object needlessly especially if they can't do that for whatever reason.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > fs/fhandle.c | 22 ++++++++++++++++++++--
> > fs/pidfs.c | 5 ++++-
> > 2 files changed, 24 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > index ab4891925b52..54081e19f594 100644
> > --- a/fs/fhandle.c
> > +++ b/fs/fhandle.c
> > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > return err;
> > }
> >
> > -static int get_path_anchor(int fd, struct path *root)
> > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > {
> > if (fd >= 0) {
> > CLASS(fd, f)(fd);
> > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > return 0;
> > }
> >
> > + /*
> > + * Only autonomous handles can be decoded without a file
> > + * descriptor.
> > + */
> > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > + return -EOPNOTSUPP;
>
> This somewhat ties to my comment to patch 5 that if someone passed invalid
> fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> much about it so feel free to ignore me but I think the following might be
> more sensible error codes:
>
> if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> if (fd == FD_INVALID)
> return -EOPNOTSUPP;
> return -EBADF;
> }
Yes, that makes sense. I'll take the suggestion.
>
> if (fd != FD_INVALID)
> return -EBADF; (or -EINVAL no strong preference here)
>
> Since I don't care that much feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Honza
>
> > +
> > + if (fd != FD_INVALID)
> > + return -EINVAL;
> > +
> > + switch (handle_type & ~FILEID_USER_FLAGS_MASK) {
> > + case FILEID_PIDFS:
> > + pidfs_get_root(root);
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > return 0;
> > }
> >
> > @@ -347,7 +365,7 @@ static int handle_to_path(int mountdirfd, struct file_handle __user *ufh,
> > FILEID_USER_FLAGS(f_handle.handle_type) & ~FILEID_VALID_USER_FLAGS)
> > return -EINVAL;
> >
> > - retval = get_path_anchor(mountdirfd, &ctx.root);
> > + retval = get_path_anchor(mountdirfd, &ctx.root, f_handle.handle_type);
> > if (retval)
> > return retval;
> >
> > diff --git a/fs/pidfs.c b/fs/pidfs.c
> > index 69b0541042b5..2ab9b47fbfae 100644
> > --- a/fs/pidfs.c
> > +++ b/fs/pidfs.c
> > @@ -747,7 +747,7 @@ static int pidfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
> >
> > *max_len = 2;
> > *(u64 *)fh = pid->ino;
> > - return FILEID_KERNFS;
> > + return FILEID_PIDFS;
> > }
> >
> > static int pidfs_ino_find(const void *key, const struct rb_node *node)
> > @@ -802,6 +802,8 @@ static struct dentry *pidfs_fh_to_dentry(struct super_block *sb,
> > return NULL;
> >
> > switch (fh_type) {
> > + case FILEID_PIDFS:
> > + fallthrough;
> > case FILEID_KERNFS:
> > pid_ino = *(u64 *)fid;
> > break;
> > @@ -860,6 +862,7 @@ static const struct export_operations pidfs_export_operations = {
> > .fh_to_dentry = pidfs_fh_to_dentry,
> > .open = pidfs_export_open,
> > .permission = pidfs_export_permission,
> > + .flags = EXPORT_OP_AUTONOMOUS_HANDLES,
> > };
> >
> > static int pidfs_init_inode(struct inode *inode, void *data)
> >
> > --
> > 2.47.2
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
2025-06-24 9:20 ` Jan Kara
@ 2025-06-24 10:16 ` Christian Brauner
0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 10:16 UTC (permalink / raw)
To: Jan Kara
Cc: Jeff Layton, Chuck Lever, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 11:20:02AM +0200, Jan Kara wrote:
> On Tue 24-06-25 10:29:12, Christian Brauner wrote:
> > Allow a filesystem to indicate that it supports encoding autonomous file
> > handles that can be decoded without having to pass a filesystem for the
> > filesystem.
>
> Forgot to mention the above phrase "to pass a filesystem for the
> filesystem" doesn't make sense :) But my reviewed-by holds.
Ah thanks! I'll fix.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 05/11] fhandle: reflow get_path_anchor()
2025-06-24 9:16 ` Jan Kara
@ 2025-06-24 10:16 ` Christian Brauner
0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 10:16 UTC (permalink / raw)
To: Jan Kara
Cc: Jeff Layton, Chuck Lever, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 11:16:39AM +0200, Jan Kara wrote:
> On Tue 24-06-25 10:29:08, Christian Brauner wrote:
> > Switch to a more common coding style.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > fs/fhandle.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > index d8d32208c621..22edced83e4c 100644
> > --- a/fs/fhandle.c
> > +++ b/fs/fhandle.c
> > @@ -170,18 +170,22 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> >
> > static int get_path_anchor(int fd, struct path *root)
> > {
> > + if (fd >= 0) {
> > + CLASS(fd, f)(fd);
> > + if (fd_empty(f))
> > + return -EBADF;
> > + *root = fd_file(f)->f_path;
> > + path_get(root);
> > + return 0;
> > + }
> > +
> > if (fd == AT_FDCWD) {
> > struct fs_struct *fs = current->fs;
> > spin_lock(&fs->lock);
> > *root = fs->pwd;
> > path_get(root);
> > spin_unlock(&fs->lock);
> > - } else {
> > - CLASS(fd, f)(fd);
> > - if (fd_empty(f))
> > - return -EBADF;
> > - *root = fd_file(f)->f_path;
> > - path_get(root);
> > + return 0;
> > }
>
> This actually introduces a regression that when userspace passes invalid fd
> < 0, we'd be returning 0 whereas previously we were returning -EBADF. I
> think the return below should be switched to -EBADF to fix that.
Whoops. Thanks!
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 9:30 ` Jan Kara
2025-06-24 10:15 ` Christian Brauner
@ 2025-06-24 10:53 ` Amir Goldstein
2025-06-24 14:28 ` Amir Goldstein
1 sibling, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 10:53 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, Jeff Layton, Chuck Lever, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > Various filesystems such as pidfs (and likely drm in the future) have a
> > use-case to support opening files purely based on the handle without
> > having to require a file descriptor to another object. That's especially
> > the case for filesystems that don't do any lookup whatsoever and there's
> > zero relationship between the objects. Such filesystems are also
> > singletons that stay around for the lifetime of the system meaning that
> > they can be uniquely identified and accessed purely based on the file
> > handle type. Enable that so that userspace doesn't have to allocate an
> > object needlessly especially if they can't do that for whatever reason.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > fs/fhandle.c | 22 ++++++++++++++++++++--
> > fs/pidfs.c | 5 ++++-
> > 2 files changed, 24 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > index ab4891925b52..54081e19f594 100644
> > --- a/fs/fhandle.c
> > +++ b/fs/fhandle.c
> > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > return err;
> > }
> >
> > -static int get_path_anchor(int fd, struct path *root)
> > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > {
> > if (fd >= 0) {
> > CLASS(fd, f)(fd);
> > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > return 0;
> > }
> >
> > + /*
> > + * Only autonomous handles can be decoded without a file
> > + * descriptor.
> > + */
> > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > + return -EOPNOTSUPP;
>
> This somewhat ties to my comment to patch 5 that if someone passed invalid
> fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> much about it so feel free to ignore me but I think the following might be
> more sensible error codes:
>
> if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> if (fd == FD_INVALID)
> return -EOPNOTSUPP;
> return -EBADF;
> }
>
> if (fd != FD_INVALID)
> return -EBADF; (or -EINVAL no strong preference here)
FWIW, I like -EBADF better.
it makes the error more descriptive and keeps the flow simple:
+ /*
+ * Only autonomous handles can be decoded without a file
+ * descriptor and only when FD_INVALID is provided.
+ */
+ if (fd != FD_INVALID)
+ return -EBADF;
+
+ if (!(handle_type & FILEID_IS_AUTONOMOUS))
+ return -EOPNOTSUPP;
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 06/11] uapi/fcntl: mark range as reserved
2025-06-24 8:29 ` [PATCH v2 06/11] uapi/fcntl: mark range as reserved Christian Brauner
2025-06-24 9:16 ` Jan Kara
@ 2025-06-24 10:57 ` Amir Goldstein
2025-06-24 13:47 ` Christian Brauner
1 sibling, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 10:57 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
>
> Mark the range from -10000 to -40000 as a range reserved for special
> in-kernel values. Move the PIDFD_SELF_*/PIDFD_THREAD_* sentinels over so
> all the special values are in one place.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> include/uapi/linux/fcntl.h | 16 ++++++++++++++++
> include/uapi/linux/pidfd.h | 15 ---------------
> tools/testing/selftests/pidfd/pidfd.h | 2 +-
> 3 files changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index a15ac2fa4b20..ba4a698d2f33 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -90,10 +90,26 @@
> #define DN_ATTRIB 0x00000020 /* File changed attibutes */
> #define DN_MULTISHOT 0x80000000 /* Don't remove notifier */
>
> +/* Reserved kernel ranges [-100], [-10000, -40000]. */
> #define AT_FDCWD -100 /* Special value for dirfd used to
> indicate openat should use the
> current working directory. */
>
> +/*
> + * The concept of process and threads in userland and the kernel is a confusing
> + * one - within the kernel every thread is a 'task' with its own individual PID,
> + * however from userland's point of view threads are grouped by a single PID,
> + * which is that of the 'thread group leader', typically the first thread
> + * spawned.
> + *
> + * To cut the Gideon knot, for internal kernel usage, we refer to
> + * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> + * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> + * group leader...
> + */
> +#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
> +
>
> /* Generic flags for the *at(2) family of syscalls. */
>
> diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
> index c27a4e238e4b..957db425d459 100644
> --- a/include/uapi/linux/pidfd.h
> +++ b/include/uapi/linux/pidfd.h
> @@ -42,21 +42,6 @@
> #define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
> #define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
>
> -/*
> - * The concept of process and threads in userland and the kernel is a confusing
> - * one - within the kernel every thread is a 'task' with its own individual PID,
> - * however from userland's point of view threads are grouped by a single PID,
> - * which is that of the 'thread group leader', typically the first thread
> - * spawned.
> - *
> - * To cut the Gideon knot, for internal kernel usage, we refer to
> - * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> - * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> - * group leader...
> - */
> -#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> -
> /*
> * ...and for userland we make life simpler - PIDFD_SELF refers to the current
> * thread, PIDFD_SELF_PROCESS refers to the process thread group leader.
> diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
> index efd74063126e..5dfeb1bdf399 100644
> --- a/tools/testing/selftests/pidfd/pidfd.h
> +++ b/tools/testing/selftests/pidfd/pidfd.h
> @@ -56,7 +56,7 @@
> #endif
>
> #ifndef PIDFD_SELF_THREAD_GROUP
> -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
The commit message claims to move definions between header files,
but the value of PIDFD_SELF_THREAD_GROUP was changed.
What am I missing?
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (10 preceding siblings ...)
2025-06-24 8:29 ` [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd Christian Brauner
@ 2025-06-24 10:58 ` Amir Goldstein
2025-06-24 10:59 ` Christian Brauner
12 siblings, 0 replies; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 10:58 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Simona Vetter, linux-fsdevel,
linux-nfs, stable
On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
>
> Various filesystems such as pidfs and drm support opening file handles
> without having to require a file descriptor to identify the filesystem.
> The filesystem are global single instances and can be trivially
> identified solely on the information encoded in the file handle.
>
> This makes it possible to not have to keep or acquire a sentinal file
> descriptor just to pass it to open_by_handle_at() to identify the
> filesystem. That's especially useful when such sentinel file descriptor
> cannot or should not be acquired.
>
> For pidfs this means a file handle can function as full replacement for
> storing a pid in a file. Instead a file handle can be stored and
> reopened purely based on the file handle.
>
> Such autonomous file handles can be opened with or without specifying a
> a file descriptor. If no proper file descriptor is used the FD_INVALID
> sentinel must be passed. This allows us to define further special
> negative fd sentinels in the future.
>
> Userspace can trivially test for support by trying to open the file
> handle with an invalid file descriptor.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> Changes in v2:
> - Simplify the FILEID_PIDFS enum.
> - Introduce FD_INVALID.
> - Require FD_INVALID for autonomous file handles.
> - Link to v1: https://lore.kernel.org/20250623-work-pidfs-fhandle-v1-0-75899d67555f@kernel.org
>
After fixing the minor nits, feel free to add for the series
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> ---
> Christian Brauner (11):
> fhandle: raise FILEID_IS_DIR in handle_type
> fhandle: hoist copy_from_user() above get_path_from_fd()
> fhandle: rename to get_path_anchor()
> pidfs: add pidfs_root_path() helper
> fhandle: reflow get_path_anchor()
> uapi/fcntl: mark range as reserved
> uapi/fcntl: add FD_INVALID
> exportfs: add FILEID_PIDFS
> fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker
> fhandle, pidfs: support open_by_handle_at() purely based on file handle
> selftests/pidfd: decode pidfd file handles withou having to specify an fd
>
> fs/fhandle.c | 82 ++++++++++++++--------
> fs/internal.h | 1 +
> fs/pidfs.c | 16 ++++-
> include/linux/exportfs.h | 9 ++-
> include/uapi/linux/fcntl.h | 17 +++++
> include/uapi/linux/pidfd.h | 15 ----
> tools/testing/selftests/pidfd/Makefile | 2 +-
> tools/testing/selftests/pidfd/pidfd.h | 6 +-
> .../selftests/pidfd/pidfd_file_handle_test.c | 60 ++++++++++++++++
> 9 files changed, 158 insertions(+), 50 deletions(-)
> ---
> base-commit: 4e3d1e6e1b2d9df9650be14380c534b3c5081ddd
> change-id: 20250619-work-pidfs-fhandle-b63ff35c4924
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
` (11 preceding siblings ...)
2025-06-24 10:58 ` [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Amir Goldstein
@ 2025-06-24 10:59 ` Christian Brauner
2025-06-24 14:15 ` Jan Kara
12 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 10:59 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter
Cc: linux-fsdevel, linux-nfs, stable
> For pidfs this means a file handle can function as full replacement for
> storing a pid in a file. Instead a file handle can be stored and
> reopened purely based on the file handle.
One thing I want to comment on generally. I know we document that a file
handle is an opaque thing and userspace shouldn't rely on the layout or
format (Propably so that we're free to redefine it.).
Realistically though that's just not what's happening. I've linked Amir
to that code already a few times but I'm doing it here for all of you
again:
[1]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/cgroup-util.c#L62-L77
Specifically:
/* The structure to pass to name_to_handle_at() on cgroupfs2 */
typedef union {
struct file_handle file_handle;
uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
} cg_file_handle;
#define CG_FILE_HANDLE_INIT \
(cg_file_handle) { \
.file_handle.handle_bytes = sizeof(uint64_t), \
.file_handle.handle_type = FILEID_KERNFS, \
}
#define CG_FILE_HANDLE_CGROUPID(fh) (*CAST_ALIGN_PTR(uint64_t, (fh).file_handle.f_handle))
cg_file_handle fh = CG_FILE_HANDLE_INIT;
CG_FILE_HANDLE_CGROUPID(fh) = id;
return RET_NERRNO(open_by_handle_at(cgroupfs_fd, &fh.file_handle, O_DIRECTORY|O_CLOEXEC));
Another example where the layout is assumed to be uapi/uabi is:
[2]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/pidfd-util.c#L232-L259
int pidfd_get_inode_id_impl(int fd, uint64_t *ret) {
<snip>
union {
struct file_handle file_handle;
uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
} fh = {
.file_handle.handle_bytes = sizeof(uint64_t),
.file_handle.handle_type = FILEID_KERNFS,
};
int mnt_id;
r = RET_NERRNO(name_to_handle_at(fd, "", &fh.file_handle, &mnt_id, AT_EMPTY_PATH));
if (r >= 0) {
if (ret)
*ret = *CAST_ALIGN_PTR(uint64_t, fh.file_handle.f_handle);
return 0;
}
In (1) you can see that the layout is assumed to be uabi by
reconstructing the handle. In (2) you can see that the layout is assumed
to be uabi by extrating the inode number.
So both points mean that the "don't rely on it"-ship has already sailed.
If we get regressions reports for this (and we surely would) because we
changed it we're bound by the no-regression rule.
So, for pidfs I'm very tempted to explicitly give the guarantee that
systemd just assumes currently.
The reason is that it will allow userspace to just store the 64-bit
pidfs inode number in a file or wherever they want and then reconstruct
the file handle without ever having to involve name_to_handle_at(). That
means you can just read the pid file and see the inode number you're
dealing with and not some binary gunk.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 08/11] exportfs: add FILEID_PIDFS
2025-06-24 8:29 ` [PATCH v2 08/11] exportfs: add FILEID_PIDFS Christian Brauner
2025-06-24 9:17 ` Jan Kara
@ 2025-06-24 13:15 ` Amir Goldstein
2025-06-24 13:43 ` Christian Brauner
1 sibling, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 13:15 UTC (permalink / raw)
To: Christian Brauner, Jan Kara
Cc: Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
>
> Introduce new pidfs file handle values.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> include/linux/exportfs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> index 25c4a5afbd44..5bb757b51f5c 100644
> --- a/include/linux/exportfs.h
> +++ b/include/linux/exportfs.h
> @@ -131,6 +131,11 @@ enum fid_type {
> * Filesystems must not use 0xff file ID.
> */
> FILEID_INVALID = 0xff,
> +
> + /* Internal kernel fid types */
> +
> + /* pidfs fid type */
> + FILEID_PIDFS = 0x100,
> };
>
Jan,
I just noticed that we have a fh_type <= 0xff assumption
built into fanotify:
/* Fixed size struct for file handle */
struct fanotify_fh {
u8 type;
u8 len;
and we do not enforce it.
there is only check of type range 1..0xffff
in exportfs_encode_inode_fh()
We should probably do either:
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -454,7 +454,7 @@ static int fanotify_encode_fh(struct fanotify_fh
*fh, struct inode *inode,
dwords = fh_len >> 2;
type = exportfs_encode_fid(inode, buf, &dwords);
err = -EINVAL;
- if (type <= 0 || type == FILEID_INVALID || fh_len != dwords << 2)
+ if (type <= 0 || type >= FILEID_INVALID || fh_len != dwords << 2)
goto out_err;
fh->type = type;
OR
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -29,11 +29,10 @@ enum {
/* Fixed size struct for file handle */
struct fanotify_fh {
- u8 type;
+ u16 type;
u8 len;
#define FANOTIFY_FH_FLAG_EXT_BUF 1
u8 flags;
- u8 pad;
} __aligned(4);
Christian,
Do you know if pidfs supports (or should support) fanotify with FAN_REPORT_FID?
If it does not need to be supported we can block it in fanotify_test_fid(),
but if it does need fanotify support, we need to think about this.
Especially, if we want fanotify to report autonomous file handles.
In that case, the design that adds the flag in the open_by_handle_at()
syscall won't do.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 08/11] exportfs: add FILEID_PIDFS
2025-06-24 13:15 ` Amir Goldstein
@ 2025-06-24 13:43 ` Christian Brauner
2025-06-24 14:20 ` Amir Goldstein
0 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 13:43 UTC (permalink / raw)
To: Amir Goldstein
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 03:15:18PM +0200, Amir Goldstein wrote:
> On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > Introduce new pidfs file handle values.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > include/linux/exportfs.h | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > index 25c4a5afbd44..5bb757b51f5c 100644
> > --- a/include/linux/exportfs.h
> > +++ b/include/linux/exportfs.h
> > @@ -131,6 +131,11 @@ enum fid_type {
> > * Filesystems must not use 0xff file ID.
> > */
> > FILEID_INVALID = 0xff,
> > +
> > + /* Internal kernel fid types */
> > +
> > + /* pidfs fid type */
> > + FILEID_PIDFS = 0x100,
> > };
> >
>
> Jan,
>
> I just noticed that we have a fh_type <= 0xff assumption
> built into fanotify:
>
> /* Fixed size struct for file handle */
> struct fanotify_fh {
> u8 type;
> u8 len;
>
> and we do not enforce it.
> there is only check of type range 1..0xffff
> in exportfs_encode_inode_fh()
>
> We should probably do either:
>
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -454,7 +454,7 @@ static int fanotify_encode_fh(struct fanotify_fh
> *fh, struct inode *inode,
> dwords = fh_len >> 2;
> type = exportfs_encode_fid(inode, buf, &dwords);
> err = -EINVAL;
> - if (type <= 0 || type == FILEID_INVALID || fh_len != dwords << 2)
> + if (type <= 0 || type >= FILEID_INVALID || fh_len != dwords << 2)
> goto out_err;
>
> fh->type = type;
>
> OR
>
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -29,11 +29,10 @@ enum {
>
> /* Fixed size struct for file handle */
> struct fanotify_fh {
> - u8 type;
> + u16 type;
> u8 len;
> #define FANOTIFY_FH_FLAG_EXT_BUF 1
> u8 flags;
> - u8 pad;
> } __aligned(4);
>
>
> Christian,
>
> Do you know if pidfs supports (or should support) fanotify with FAN_REPORT_FID?
I think it's at least supported by fanotify in that FAN_REPORT_FID and
FAN_REPORT_PIDFD aren't mutually exclusive options afaict. I don't know
if it's used though.
> If it does not need to be supported we can block it in fanotify_test_fid(),
> but if it does need fanotify support, we need to think about this.
Sure, block it.
>
> Especially, if we want fanotify to report autonomous file handles.
> In that case, the design that adds the flag in the open_by_handle_at()
> syscall won't do.
Sorry, the design that adds the flag? You mean the FD_INVALID?
Why is that?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 06/11] uapi/fcntl: mark range as reserved
2025-06-24 10:57 ` Amir Goldstein
@ 2025-06-24 13:47 ` Christian Brauner
0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 13:47 UTC (permalink / raw)
To: Amir Goldstein
Cc: Jeff Layton, Chuck Lever, Jan Kara, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 12:57:06PM +0200, Amir Goldstein wrote:
> On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > Mark the range from -10000 to -40000 as a range reserved for special
> > in-kernel values. Move the PIDFD_SELF_*/PIDFD_THREAD_* sentinels over so
> > all the special values are in one place.
> >
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> > include/uapi/linux/fcntl.h | 16 ++++++++++++++++
> > include/uapi/linux/pidfd.h | 15 ---------------
> > tools/testing/selftests/pidfd/pidfd.h | 2 +-
> > 3 files changed, 17 insertions(+), 16 deletions(-)
> >
> > diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> > index a15ac2fa4b20..ba4a698d2f33 100644
> > --- a/include/uapi/linux/fcntl.h
> > +++ b/include/uapi/linux/fcntl.h
> > @@ -90,10 +90,26 @@
> > #define DN_ATTRIB 0x00000020 /* File changed attibutes */
> > #define DN_MULTISHOT 0x80000000 /* Don't remove notifier */
> >
> > +/* Reserved kernel ranges [-100], [-10000, -40000]. */
> > #define AT_FDCWD -100 /* Special value for dirfd used to
> > indicate openat should use the
> > current working directory. */
> >
> > +/*
> > + * The concept of process and threads in userland and the kernel is a confusing
> > + * one - within the kernel every thread is a 'task' with its own individual PID,
> > + * however from userland's point of view threads are grouped by a single PID,
> > + * which is that of the 'thread group leader', typically the first thread
> > + * spawned.
> > + *
> > + * To cut the Gideon knot, for internal kernel usage, we refer to
> > + * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> > + * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> > + * group leader...
> > + */
> > +#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> > +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
> > +
> >
> > /* Generic flags for the *at(2) family of syscalls. */
> >
> > diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
> > index c27a4e238e4b..957db425d459 100644
> > --- a/include/uapi/linux/pidfd.h
> > +++ b/include/uapi/linux/pidfd.h
> > @@ -42,21 +42,6 @@
> > #define PIDFD_COREDUMP_USER (1U << 2) /* coredump was done as the user. */
> > #define PIDFD_COREDUMP_ROOT (1U << 3) /* coredump was done as root. */
> >
> > -/*
> > - * The concept of process and threads in userland and the kernel is a confusing
> > - * one - within the kernel every thread is a 'task' with its own individual PID,
> > - * however from userland's point of view threads are grouped by a single PID,
> > - * which is that of the 'thread group leader', typically the first thread
> > - * spawned.
> > - *
> > - * To cut the Gideon knot, for internal kernel usage, we refer to
> > - * PIDFD_SELF_THREAD to refer to the current thread (or task from a kernel
> > - * perspective), and PIDFD_SELF_THREAD_GROUP to refer to the current thread
> > - * group leader...
> > - */
> > -#define PIDFD_SELF_THREAD -10000 /* Current thread. */
> > -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> > -
> > /*
> > * ...and for userland we make life simpler - PIDFD_SELF refers to the current
> > * thread, PIDFD_SELF_PROCESS refers to the process thread group leader.
> > diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
> > index efd74063126e..5dfeb1bdf399 100644
> > --- a/tools/testing/selftests/pidfd/pidfd.h
> > +++ b/tools/testing/selftests/pidfd/pidfd.h
> > @@ -56,7 +56,7 @@
> > #endif
> >
> > #ifndef PIDFD_SELF_THREAD_GROUP
> > -#define PIDFD_SELF_THREAD_GROUP -20000 /* Current thread group leader. */
> > +#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
>
> The commit message claims to move definions between header files,
> but the value of PIDFD_SELF_THREAD_GROUP was changed.
>
> What am I missing?
I've split that into two patches.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
2025-06-24 10:59 ` Christian Brauner
@ 2025-06-24 14:15 ` Jan Kara
2025-06-24 14:34 ` Amir Goldstein
2025-06-24 14:39 ` Christian Brauner
0 siblings, 2 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 14:15 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs, stable
On Tue 24-06-25 12:59:26, Christian Brauner wrote:
> > For pidfs this means a file handle can function as full replacement for
> > storing a pid in a file. Instead a file handle can be stored and
> > reopened purely based on the file handle.
>
> One thing I want to comment on generally. I know we document that a file
> handle is an opaque thing and userspace shouldn't rely on the layout or
> format (Propably so that we're free to redefine it.).
>
> Realistically though that's just not what's happening. I've linked Amir
> to that code already a few times but I'm doing it here for all of you
> again:
>
> [1]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/cgroup-util.c#L62-L77
>
> Specifically:
>
> /* The structure to pass to name_to_handle_at() on cgroupfs2 */
> typedef union {
> struct file_handle file_handle;
> uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> } cg_file_handle;
>
> #define CG_FILE_HANDLE_INIT \
> (cg_file_handle) { \
> .file_handle.handle_bytes = sizeof(uint64_t), \
> .file_handle.handle_type = FILEID_KERNFS, \
> }
>
> #define CG_FILE_HANDLE_CGROUPID(fh) (*CAST_ALIGN_PTR(uint64_t, (fh).file_handle.f_handle))
>
> cg_file_handle fh = CG_FILE_HANDLE_INIT;
> CG_FILE_HANDLE_CGROUPID(fh) = id;
>
> return RET_NERRNO(open_by_handle_at(cgroupfs_fd, &fh.file_handle, O_DIRECTORY|O_CLOEXEC));
>
> Another example where the layout is assumed to be uapi/uabi is:
>
> [2]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/pidfd-util.c#L232-L259
>
> int pidfd_get_inode_id_impl(int fd, uint64_t *ret) {
> <snip>
> union {
> struct file_handle file_handle;
> uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> } fh = {
> .file_handle.handle_bytes = sizeof(uint64_t),
> .file_handle.handle_type = FILEID_KERNFS,
> };
> int mnt_id;
>
> r = RET_NERRNO(name_to_handle_at(fd, "", &fh.file_handle, &mnt_id, AT_EMPTY_PATH));
> if (r >= 0) {
> if (ret)
> *ret = *CAST_ALIGN_PTR(uint64_t, fh.file_handle.f_handle);
> return 0;
> }
Thanks for sharing. Sigh... Personal note for the future: If something
should be opaque blob for userspace, don't forget to encrypt the data
before handing it over to userspace. :-P
> In (1) you can see that the layout is assumed to be uabi by
> reconstructing the handle. In (2) you can see that the layout is assumed
> to be uabi by extrating the inode number.
>
> So both points mean that the "don't rely on it"-ship has already sailed.
> If we get regressions reports for this (and we surely would) because we
> changed it we're bound by the no-regression rule.
Yep, FILEID_KERNFS is pretty much set in stone. OTOH I don't expect these
kinds of hacks to be very widespread (I guess I'm naive ;) so if we really
need to change it we could talk to systemd folks.
> So, for pidfs I'm very tempted to explicitly give the guarantee that
> systemd just assumes currently.
>
> The reason is that it will allow userspace to just store the 64-bit
> pidfs inode number in a file or wherever they want and then reconstruct
> the file handle without ever having to involve name_to_handle_at(). That
> means you can just read the pid file and see the inode number you're
> dealing with and not some binary gunk.
Well, you could just fprintf() the fhandle into the pid file if you don't
like binary gunk. Those numbers would be telling about as much as the pidfs
inode number tells you, don't they? I mean I'm still not at the point where
I would *encourage* userspace to decode what's supposed to be opaque blob
;). But maybe I'll get used to that idea...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 08/11] exportfs: add FILEID_PIDFS
2025-06-24 13:43 ` Christian Brauner
@ 2025-06-24 14:20 ` Amir Goldstein
0 siblings, 0 replies; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 14:20 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 3:43 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Jun 24, 2025 at 03:15:18PM +0200, Amir Goldstein wrote:
> > On Tue, Jun 24, 2025 at 10:29 AM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > Introduce new pidfs file handle values.
> > >
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > > include/linux/exportfs.h | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
> > > index 25c4a5afbd44..5bb757b51f5c 100644
> > > --- a/include/linux/exportfs.h
> > > +++ b/include/linux/exportfs.h
> > > @@ -131,6 +131,11 @@ enum fid_type {
> > > * Filesystems must not use 0xff file ID.
> > > */
> > > FILEID_INVALID = 0xff,
> > > +
> > > + /* Internal kernel fid types */
> > > +
> > > + /* pidfs fid type */
> > > + FILEID_PIDFS = 0x100,
> > > };
> > >
> >
> > Jan,
> >
> > I just noticed that we have a fh_type <= 0xff assumption
> > built into fanotify:
> >
> > /* Fixed size struct for file handle */
> > struct fanotify_fh {
> > u8 type;
> > u8 len;
> >
> > and we do not enforce it.
> > there is only check of type range 1..0xffff
> > in exportfs_encode_inode_fh()
> >
> > We should probably do either:
> >
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -454,7 +454,7 @@ static int fanotify_encode_fh(struct fanotify_fh
> > *fh, struct inode *inode,
> > dwords = fh_len >> 2;
> > type = exportfs_encode_fid(inode, buf, &dwords);
> > err = -EINVAL;
> > - if (type <= 0 || type == FILEID_INVALID || fh_len != dwords << 2)
> > + if (type <= 0 || type >= FILEID_INVALID || fh_len != dwords << 2)
> > goto out_err;
> >
> > fh->type = type;
> >
> > OR
> >
> > --- a/fs/notify/fanotify/fanotify.h
> > +++ b/fs/notify/fanotify/fanotify.h
> > @@ -29,11 +29,10 @@ enum {
> >
> > /* Fixed size struct for file handle */
> > struct fanotify_fh {
> > - u8 type;
> > + u16 type;
> > u8 len;
> > #define FANOTIFY_FH_FLAG_EXT_BUF 1
> > u8 flags;
> > - u8 pad;
> > } __aligned(4);
> >
> >
> > Christian,
> >
> > Do you know if pidfs supports (or should support) fanotify with FAN_REPORT_FID?
>
> I think it's at least supported by fanotify in that FAN_REPORT_FID and
> FAN_REPORT_PIDFD aren't mutually exclusive options afaict. I don't know
> if it's used though.
>
> > If it does not need to be supported we can block it in fanotify_test_fid(),
> > but if it does need fanotify support, we need to think about this.
>
> Sure, block it.
>
hmm, we are already denying sb/mount marks from SB_NOUSER,
but inode marks are allowed.
I don't want to special case pidfs.
I guess we can do:
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1736,6 +1736,10 @@ static int fanotify_test_fid(struct dentry
*dentry, unsigned int flags)
if (mark_type != FAN_MARK_INODE && !exportfs_can_decode_fh(nop))
return -EOPNOTSUPP;
+ /* We do not support reporting autonomous file handles. */
+ if (nop && nop->flags & EXPORT_OP_AUTONOMOUS_HANDLES)
+ return -EOPNOTSUPP;
+
return 0;
}
if needed
> >
> > Especially, if we want fanotify to report autonomous file handles.
> > In that case, the design that adds the flag in the open_by_handle_at()
> > syscall won't do.
>
> Sorry, the design that adds the flag? You mean the FD_INVALID?
> Why is that?
I mean the design that adds the flag FILEID_IS_AUTONOMOUS
in do_sys_name_to_handle(), because fanotify encodes the file handle
with the vfs helper exportfs_encode_fid(), so the resulting fid it reports
is not autonomous.
But I think I have a suggestion to untangle all this mess.
I will explain in another email.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 10:53 ` Amir Goldstein
@ 2025-06-24 14:28 ` Amir Goldstein
2025-06-24 14:51 ` Christian Brauner
0 siblings, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 14:28 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, Jeff Layton, Chuck Lever, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> >
> > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > use-case to support opening files purely based on the handle without
> > > having to require a file descriptor to another object. That's especially
> > > the case for filesystems that don't do any lookup whatsoever and there's
> > > zero relationship between the objects. Such filesystems are also
> > > singletons that stay around for the lifetime of the system meaning that
> > > they can be uniquely identified and accessed purely based on the file
> > > handle type. Enable that so that userspace doesn't have to allocate an
> > > object needlessly especially if they can't do that for whatever reason.
> > >
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > fs/pidfs.c | 5 ++++-
> > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > index ab4891925b52..54081e19f594 100644
> > > --- a/fs/fhandle.c
> > > +++ b/fs/fhandle.c
> > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > return err;
> > > }
> > >
> > > -static int get_path_anchor(int fd, struct path *root)
> > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > {
> > > if (fd >= 0) {
> > > CLASS(fd, f)(fd);
> > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > return 0;
> > > }
> > >
> > > + /*
> > > + * Only autonomous handles can be decoded without a file
> > > + * descriptor.
> > > + */
> > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > + return -EOPNOTSUPP;
> >
> > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > much about it so feel free to ignore me but I think the following might be
> > more sensible error codes:
> >
> > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > if (fd == FD_INVALID)
> > return -EOPNOTSUPP;
> > return -EBADF;
> > }
> >
> > if (fd != FD_INVALID)
> > return -EBADF; (or -EINVAL no strong preference here)
>
> FWIW, I like -EBADF better.
> it makes the error more descriptive and keeps the flow simple:
>
> + /*
> + * Only autonomous handles can be decoded without a file
> + * descriptor and only when FD_INVALID is provided.
> + */
> + if (fd != FD_INVALID)
> + return -EBADF;
> +
> + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> + return -EOPNOTSUPP;
>
Thinking about it some more, as I am trying to address your concerns
about crafting autonomous file handles by systemd, as you already
decided to define a range for kernel reserved values for fd, why not,
instead of requiring FD_INVALID for autonomous file handle, that we
actually define a kernel fd value that translates to "the root of pidfs":
+ /*
+ * Autonomous handles can be decoded with a special file
+ * descriptor value that describes the filesystem.
+ */
+ switch (fd) {
+ case FD_PIDFS_ROOT:
+ pidfs_get_root(root);
+ break;
+ default:
+ return -EBADF;
+ }
+
Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
as before (you can also keep the existing systemd code) and when you want
to open file by handle you just go
open_by_handle_at(FD_PIDFS, &handle, 0)
and that's it.
In the end, my one and only concern with autonomous file handles is that
there should be a user opt-in to request them.
Sorry for taking the long road to get to this simpler design.
WDYT?
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
2025-06-24 14:15 ` Jan Kara
@ 2025-06-24 14:34 ` Amir Goldstein
2025-06-24 14:39 ` Christian Brauner
1 sibling, 0 replies; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 14:34 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, Jeff Layton, Chuck Lever, Simona Vetter,
linux-fsdevel, linux-nfs, stable
On Tue, Jun 24, 2025 at 4:15 PM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 24-06-25 12:59:26, Christian Brauner wrote:
> > > For pidfs this means a file handle can function as full replacement for
> > > storing a pid in a file. Instead a file handle can be stored and
> > > reopened purely based on the file handle.
> >
> > One thing I want to comment on generally. I know we document that a file
> > handle is an opaque thing and userspace shouldn't rely on the layout or
> > format (Propably so that we're free to redefine it.).
> >
> > Realistically though that's just not what's happening. I've linked Amir
> > to that code already a few times but I'm doing it here for all of you
> > again:
> >
> > [1]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/cgroup-util.c#L62-L77
> >
> > Specifically:
> >
> > /* The structure to pass to name_to_handle_at() on cgroupfs2 */
> > typedef union {
> > struct file_handle file_handle;
> > uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> > } cg_file_handle;
> >
> > #define CG_FILE_HANDLE_INIT \
> > (cg_file_handle) { \
> > .file_handle.handle_bytes = sizeof(uint64_t), \
> > .file_handle.handle_type = FILEID_KERNFS, \
> > }
> >
> > #define CG_FILE_HANDLE_CGROUPID(fh) (*CAST_ALIGN_PTR(uint64_t, (fh).file_handle.f_handle))
> >
> > cg_file_handle fh = CG_FILE_HANDLE_INIT;
> > CG_FILE_HANDLE_CGROUPID(fh) = id;
> >
> > return RET_NERRNO(open_by_handle_at(cgroupfs_fd, &fh.file_handle, O_DIRECTORY|O_CLOEXEC));
> >
> > Another example where the layout is assumed to be uapi/uabi is:
> >
> > [2]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/pidfd-util.c#L232-L259
> >
> > int pidfd_get_inode_id_impl(int fd, uint64_t *ret) {
> > <snip>
> > union {
> > struct file_handle file_handle;
> > uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> > } fh = {
> > .file_handle.handle_bytes = sizeof(uint64_t),
> > .file_handle.handle_type = FILEID_KERNFS,
> > };
> > int mnt_id;
> >
> > r = RET_NERRNO(name_to_handle_at(fd, "", &fh.file_handle, &mnt_id, AT_EMPTY_PATH));
> > if (r >= 0) {
> > if (ret)
> > *ret = *CAST_ALIGN_PTR(uint64_t, fh.file_handle.f_handle);
> > return 0;
> > }
>
> Thanks for sharing. Sigh... Personal note for the future: If something
> should be opaque blob for userspace, don't forget to encrypt the data
> before handing it over to userspace. :-P
>
> > In (1) you can see that the layout is assumed to be uabi by
> > reconstructing the handle. In (2) you can see that the layout is assumed
> > to be uabi by extrating the inode number.
> >
> > So both points mean that the "don't rely on it"-ship has already sailed.
> > If we get regressions reports for this (and we surely would) because we
> > changed it we're bound by the no-regression rule.
>
> Yep, FILEID_KERNFS is pretty much set in stone. OTOH I don't expect these
> kinds of hacks to be very widespread (I guess I'm naive ;) so if we really
> need to change it we could talk to systemd folks.
>
> > So, for pidfs I'm very tempted to explicitly give the guarantee that
> > systemd just assumes currently.
> >
> > The reason is that it will allow userspace to just store the 64-bit
> > pidfs inode number in a file or wherever they want and then reconstruct
> > the file handle without ever having to involve name_to_handle_at(). That
> > means you can just read the pid file and see the inode number you're
> > dealing with and not some binary gunk.
>
> Well, you could just fprintf() the fhandle into the pid file if you don't
> like binary gunk. Those numbers would be telling about as much as the pidfs
> inode number tells you, don't they? I mean I'm still not at the point where
> I would *encourage* userspace to decode what's supposed to be opaque blob
> ;). But maybe I'll get used to that idea...
As I was also writing a similar response, I will share my somewhat
duplicate feedback ;)
I still hold the option that as a rule, we should try not to encourage
crafting file handles in userspace.
Exceptions to the rule are easier to accept for non-persistent pseudo
filesystems and especially in the case that the handle format
is simply the u64 inode number.
But if you want to hand craft the pidfs file handles, I get that
the addition of FILEID_IS_AUTONOMOUS flag is a nuisance,
because you'd want something as simple as the examples you quoted
and not more complicated encodings.
Therefore, see my suggestion to toss the FILEID_IS_AUTONOMOUS flag
as well as the new FILEID_PIDFS type and opt-in to opening fd
without a path fd with FD_PIDFS_ROOT instead, so systemd can continue
to use the same simple hand crafted file handles.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle
2025-06-24 14:15 ` Jan Kara
2025-06-24 14:34 ` Amir Goldstein
@ 2025-06-24 14:39 ` Christian Brauner
1 sibling, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 14:39 UTC (permalink / raw)
To: Jan Kara
Cc: Jeff Layton, Chuck Lever, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs, stable
On Tue, Jun 24, 2025 at 04:15:37PM +0200, Jan Kara wrote:
> On Tue 24-06-25 12:59:26, Christian Brauner wrote:
> > > For pidfs this means a file handle can function as full replacement for
> > > storing a pid in a file. Instead a file handle can be stored and
> > > reopened purely based on the file handle.
> >
> > One thing I want to comment on generally. I know we document that a file
> > handle is an opaque thing and userspace shouldn't rely on the layout or
> > format (Propably so that we're free to redefine it.).
> >
> > Realistically though that's just not what's happening. I've linked Amir
> > to that code already a few times but I'm doing it here for all of you
> > again:
> >
> > [1]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/cgroup-util.c#L62-L77
> >
> > Specifically:
> >
> > /* The structure to pass to name_to_handle_at() on cgroupfs2 */
> > typedef union {
> > struct file_handle file_handle;
> > uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> > } cg_file_handle;
> >
> > #define CG_FILE_HANDLE_INIT \
> > (cg_file_handle) { \
> > .file_handle.handle_bytes = sizeof(uint64_t), \
> > .file_handle.handle_type = FILEID_KERNFS, \
> > }
> >
> > #define CG_FILE_HANDLE_CGROUPID(fh) (*CAST_ALIGN_PTR(uint64_t, (fh).file_handle.f_handle))
> >
> > cg_file_handle fh = CG_FILE_HANDLE_INIT;
> > CG_FILE_HANDLE_CGROUPID(fh) = id;
> >
> > return RET_NERRNO(open_by_handle_at(cgroupfs_fd, &fh.file_handle, O_DIRECTORY|O_CLOEXEC));
> >
> > Another example where the layout is assumed to be uapi/uabi is:
> >
> > [2]: https://github.com/systemd/systemd/blob/7e1647ae4e33dd8354bd311a7f7f5eb701be2391/src/basic/pidfd-util.c#L232-L259
> >
> > int pidfd_get_inode_id_impl(int fd, uint64_t *ret) {
> > <snip>
> > union {
> > struct file_handle file_handle;
> > uint8_t space[offsetof(struct file_handle, f_handle) + sizeof(uint64_t)];
> > } fh = {
> > .file_handle.handle_bytes = sizeof(uint64_t),
> > .file_handle.handle_type = FILEID_KERNFS,
> > };
> > int mnt_id;
> >
> > r = RET_NERRNO(name_to_handle_at(fd, "", &fh.file_handle, &mnt_id, AT_EMPTY_PATH));
> > if (r >= 0) {
> > if (ret)
> > *ret = *CAST_ALIGN_PTR(uint64_t, fh.file_handle.f_handle);
> > return 0;
> > }
>
> Thanks for sharing. Sigh... Personal note for the future: If something
> should be opaque blob for userspace, don't forget to encrypt the data
> before handing it over to userspace. :-P
Yeah, honestly, that's what we should probably do. Use some hash
function or something.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 14:28 ` Amir Goldstein
@ 2025-06-24 14:51 ` Christian Brauner
2025-06-24 15:07 ` Amir Goldstein
0 siblings, 1 reply; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 14:51 UTC (permalink / raw)
To: Amir Goldstein
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > use-case to support opening files purely based on the handle without
> > > > having to require a file descriptor to another object. That's especially
> > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > zero relationship between the objects. Such filesystems are also
> > > > singletons that stay around for the lifetime of the system meaning that
> > > > they can be uniquely identified and accessed purely based on the file
> > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > object needlessly especially if they can't do that for whatever reason.
> > > >
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > ---
> > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > fs/pidfs.c | 5 ++++-
> > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > index ab4891925b52..54081e19f594 100644
> > > > --- a/fs/fhandle.c
> > > > +++ b/fs/fhandle.c
> > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > return err;
> > > > }
> > > >
> > > > -static int get_path_anchor(int fd, struct path *root)
> > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > {
> > > > if (fd >= 0) {
> > > > CLASS(fd, f)(fd);
> > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > return 0;
> > > > }
> > > >
> > > > + /*
> > > > + * Only autonomous handles can be decoded without a file
> > > > + * descriptor.
> > > > + */
> > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > + return -EOPNOTSUPP;
> > >
> > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > much about it so feel free to ignore me but I think the following might be
> > > more sensible error codes:
> > >
> > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > if (fd == FD_INVALID)
> > > return -EOPNOTSUPP;
> > > return -EBADF;
> > > }
> > >
> > > if (fd != FD_INVALID)
> > > return -EBADF; (or -EINVAL no strong preference here)
> >
> > FWIW, I like -EBADF better.
> > it makes the error more descriptive and keeps the flow simple:
> >
> > + /*
> > + * Only autonomous handles can be decoded without a file
> > + * descriptor and only when FD_INVALID is provided.
> > + */
> > + if (fd != FD_INVALID)
> > + return -EBADF;
> > +
> > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > + return -EOPNOTSUPP;
> >
>
> Thinking about it some more, as I am trying to address your concerns
> about crafting autonomous file handles by systemd, as you already
> decided to define a range for kernel reserved values for fd, why not,
> instead of requiring FD_INVALID for autonomous file handle, that we
> actually define a kernel fd value that translates to "the root of pidfs":
>
> + /*
> + * Autonomous handles can be decoded with a special file
> + * descriptor value that describes the filesystem.
> + */
> + switch (fd) {
> + case FD_PIDFS_ROOT:
> + pidfs_get_root(root);
> + break;
> + default:
> + return -EBADF;
> + }
> +
>
> Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> as before (you can also keep the existing systemd code) and when you want
> to open file by handle you just go
> open_by_handle_at(FD_PIDFS, &handle, 0)
> and that's it.
>
> In the end, my one and only concern with autonomous file handles is that
> there should be a user opt-in to request them.
>
> Sorry for taking the long road to get to this simpler design.
> WDYT?
And simply place FD_PIDFS_ROOT into the -10000 range?
Sounds good to me.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 14:51 ` Christian Brauner
@ 2025-06-24 15:07 ` Amir Goldstein
2025-06-24 15:23 ` Christian Brauner
0 siblings, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 15:07 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 4:51 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> > On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > > use-case to support opening files purely based on the handle without
> > > > > having to require a file descriptor to another object. That's especially
> > > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > > zero relationship between the objects. Such filesystems are also
> > > > > singletons that stay around for the lifetime of the system meaning that
> > > > > they can be uniquely identified and accessed purely based on the file
> > > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > > object needlessly especially if they can't do that for whatever reason.
> > > > >
> > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > ---
> > > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > > fs/pidfs.c | 5 ++++-
> > > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > > index ab4891925b52..54081e19f594 100644
> > > > > --- a/fs/fhandle.c
> > > > > +++ b/fs/fhandle.c
> > > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > > return err;
> > > > > }
> > > > >
> > > > > -static int get_path_anchor(int fd, struct path *root)
> > > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > > {
> > > > > if (fd >= 0) {
> > > > > CLASS(fd, f)(fd);
> > > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > > return 0;
> > > > > }
> > > > >
> > > > > + /*
> > > > > + * Only autonomous handles can be decoded without a file
> > > > > + * descriptor.
> > > > > + */
> > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > + return -EOPNOTSUPP;
> > > >
> > > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > > much about it so feel free to ignore me but I think the following might be
> > > > more sensible error codes:
> > > >
> > > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > > if (fd == FD_INVALID)
> > > > return -EOPNOTSUPP;
> > > > return -EBADF;
> > > > }
> > > >
> > > > if (fd != FD_INVALID)
> > > > return -EBADF; (or -EINVAL no strong preference here)
> > >
> > > FWIW, I like -EBADF better.
> > > it makes the error more descriptive and keeps the flow simple:
> > >
> > > + /*
> > > + * Only autonomous handles can be decoded without a file
> > > + * descriptor and only when FD_INVALID is provided.
> > > + */
> > > + if (fd != FD_INVALID)
> > > + return -EBADF;
> > > +
> > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > + return -EOPNOTSUPP;
> > >
> >
> > Thinking about it some more, as I am trying to address your concerns
> > about crafting autonomous file handles by systemd, as you already
> > decided to define a range for kernel reserved values for fd, why not,
> > instead of requiring FD_INVALID for autonomous file handle, that we
> > actually define a kernel fd value that translates to "the root of pidfs":
> >
> > + /*
> > + * Autonomous handles can be decoded with a special file
> > + * descriptor value that describes the filesystem.
> > + */
> > + switch (fd) {
> > + case FD_PIDFS_ROOT:
> > + pidfs_get_root(root);
> > + break;
> > + default:
> > + return -EBADF;
> > + }
> > +
> >
> > Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> > and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> > to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> > as before (you can also keep the existing systemd code) and when you want
> > to open file by handle you just go
> > open_by_handle_at(FD_PIDFS, &handle, 0)
> > and that's it.
> >
> > In the end, my one and only concern with autonomous file handles is that
> > there should be a user opt-in to request them.
> >
> > Sorry for taking the long road to get to this simpler design.
> > WDYT?
>
> And simply place FD_PIDFS_ROOT into the -10000 range?
> Sounds good to me.
Yes.
I mean I don't expect there will be a flood of those singleton
filesystems, but generally speaking, a unique fd constant
to describe the root of a singleton filesystem makes sense IMO.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 15:07 ` Amir Goldstein
@ 2025-06-24 15:23 ` Christian Brauner
2025-06-24 17:45 ` Jan Kara
2025-06-24 19:23 ` Amir Goldstein
0 siblings, 2 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-24 15:23 UTC (permalink / raw)
To: Amir Goldstein
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
[-- Attachment #1: Type: text/plain, Size: 5493 bytes --]
On Tue, Jun 24, 2025 at 05:07:59PM +0200, Amir Goldstein wrote:
> On Tue, Jun 24, 2025 at 4:51 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> > > On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > > > use-case to support opening files purely based on the handle without
> > > > > > having to require a file descriptor to another object. That's especially
> > > > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > > > zero relationship between the objects. Such filesystems are also
> > > > > > singletons that stay around for the lifetime of the system meaning that
> > > > > > they can be uniquely identified and accessed purely based on the file
> > > > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > > > object needlessly especially if they can't do that for whatever reason.
> > > > > >
> > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > ---
> > > > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > > > fs/pidfs.c | 5 ++++-
> > > > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > > > index ab4891925b52..54081e19f594 100644
> > > > > > --- a/fs/fhandle.c
> > > > > > +++ b/fs/fhandle.c
> > > > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > > > return err;
> > > > > > }
> > > > > >
> > > > > > -static int get_path_anchor(int fd, struct path *root)
> > > > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > > > {
> > > > > > if (fd >= 0) {
> > > > > > CLASS(fd, f)(fd);
> > > > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > + /*
> > > > > > + * Only autonomous handles can be decoded without a file
> > > > > > + * descriptor.
> > > > > > + */
> > > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > > + return -EOPNOTSUPP;
> > > > >
> > > > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > > > much about it so feel free to ignore me but I think the following might be
> > > > > more sensible error codes:
> > > > >
> > > > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > > > if (fd == FD_INVALID)
> > > > > return -EOPNOTSUPP;
> > > > > return -EBADF;
> > > > > }
> > > > >
> > > > > if (fd != FD_INVALID)
> > > > > return -EBADF; (or -EINVAL no strong preference here)
> > > >
> > > > FWIW, I like -EBADF better.
> > > > it makes the error more descriptive and keeps the flow simple:
> > > >
> > > > + /*
> > > > + * Only autonomous handles can be decoded without a file
> > > > + * descriptor and only when FD_INVALID is provided.
> > > > + */
> > > > + if (fd != FD_INVALID)
> > > > + return -EBADF;
> > > > +
> > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > + return -EOPNOTSUPP;
> > > >
> > >
> > > Thinking about it some more, as I am trying to address your concerns
> > > about crafting autonomous file handles by systemd, as you already
> > > decided to define a range for kernel reserved values for fd, why not,
> > > instead of requiring FD_INVALID for autonomous file handle, that we
> > > actually define a kernel fd value that translates to "the root of pidfs":
> > >
> > > + /*
> > > + * Autonomous handles can be decoded with a special file
> > > + * descriptor value that describes the filesystem.
> > > + */
> > > + switch (fd) {
> > > + case FD_PIDFS_ROOT:
> > > + pidfs_get_root(root);
> > > + break;
> > > + default:
> > > + return -EBADF;
> > > + }
> > > +
> > >
> > > Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> > > and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> > > to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> > > as before (you can also keep the existing systemd code) and when you want
> > > to open file by handle you just go
> > > open_by_handle_at(FD_PIDFS, &handle, 0)
> > > and that's it.
> > >
> > > In the end, my one and only concern with autonomous file handles is that
> > > there should be a user opt-in to request them.
> > >
> > > Sorry for taking the long road to get to this simpler design.
> > > WDYT?
> >
> > And simply place FD_PIDFS_ROOT into the -10000 range?
> > Sounds good to me.
>
> Yes.
>
> I mean I don't expect there will be a flood of those singleton
> filesystems, but generally speaking, a unique fd constant
> to describe the root of a singleton filesystem makes sense IMO.
Agreed. See the appended updated patches. I'm not resending completely.
I just dropped other patches.
[-- Attachment #2: 0001-uapi-fcntl-add-FD_PIDFS_ROOT.patch --]
[-- Type: text/x-diff, Size: 958 bytes --]
From 3941e37f62fe2c3c8b8675c12183185f20450539 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 24 Jun 2025 16:57:51 +0200
Subject: [PATCH 1/2] uapi/fcntl: add FD_PIDFS_ROOT
Add a special file descriptor indicating the root of the pidfs
filesystem.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
include/uapi/linux/fcntl.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index a5bebe7c4400..f291ab4f94eb 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -110,6 +110,7 @@
#define PIDFD_SELF_THREAD -10000 /* Current thread. */
#define PIDFD_SELF_THREAD_GROUP -10001 /* Current thread group leader. */
+#define FD_PIDFS_ROOT -10002 /* Root of the pidfs filesystem */
#define FD_INVALID -10009 /* Invalid file descriptor: -10000 - EBADF = -10009 */
/* Generic flags for the *at(2) family of syscalls. */
--
2.47.2
[-- Attachment #3: 0002-fhandle-pidfs-support-open_by_handle_at-purely-based.patch --]
[-- Type: text/x-diff, Size: 1411 bytes --]
From b95361481b1e5bd3627835b7e4b921d5a09e68a4 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 24 Jun 2025 10:29:13 +0200
Subject: [PATCH 2/2] fhandle, pidfs: support open_by_handle_at() purely based
on file handle
Various filesystems such as pidfs (and likely drm in the future) have a
use-case to support opening files purely based on the handle without
having to require a file descriptor to another object. That's especially
the case for filesystems that don't do any lookup whatsoever and there's
zero relationship between the objects. Such filesystems are also
singletons that stay around for the lifetime of the system meaning that
they can be uniquely identified and accessed purely based on the file
handle type. Enable that so that userspace doesn't have to allocate an
object needlessly especially if they can't do that for whatever reason.
Link: https://lore.kernel.org/20250624-work-pidfs-fhandle-v2-10-d02a04858fe3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/fhandle.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/fhandle.c b/fs/fhandle.c
index 9ef35f8e8989..b1363ead6c5e 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -188,6 +188,11 @@ static int get_path_anchor(int fd, struct path *root)
return 0;
}
+ if (fd == FD_PIDFS_ROOT) {
+ pidfs_get_root(root);
+ return 0;
+ }
+
return -EBADF;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 15:23 ` Christian Brauner
@ 2025-06-24 17:45 ` Jan Kara
2025-06-24 19:23 ` Amir Goldstein
1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2025-06-24 17:45 UTC (permalink / raw)
To: Christian Brauner
Cc: Amir Goldstein, Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue 24-06-25 17:23:30, Christian Brauner wrote:
> On Tue, Jun 24, 2025 at 05:07:59PM +0200, Amir Goldstein wrote:
> > On Tue, Jun 24, 2025 at 4:51 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> > > > On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > > > > use-case to support opening files purely based on the handle without
> > > > > > > having to require a file descriptor to another object. That's especially
> > > > > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > > > > zero relationship between the objects. Such filesystems are also
> > > > > > > singletons that stay around for the lifetime of the system meaning that
> > > > > > > they can be uniquely identified and accessed purely based on the file
> > > > > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > > > > object needlessly especially if they can't do that for whatever reason.
> > > > > > >
> > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > > ---
> > > > > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > > > > fs/pidfs.c | 5 ++++-
> > > > > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > > > > >
> > > > > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > > > > index ab4891925b52..54081e19f594 100644
> > > > > > > --- a/fs/fhandle.c
> > > > > > > +++ b/fs/fhandle.c
> > > > > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > > > > return err;
> > > > > > > }
> > > > > > >
> > > > > > > -static int get_path_anchor(int fd, struct path *root)
> > > > > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > > > > {
> > > > > > > if (fd >= 0) {
> > > > > > > CLASS(fd, f)(fd);
> > > > > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > > > > return 0;
> > > > > > > }
> > > > > > >
> > > > > > > + /*
> > > > > > > + * Only autonomous handles can be decoded without a file
> > > > > > > + * descriptor.
> > > > > > > + */
> > > > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > > > + return -EOPNOTSUPP;
> > > > > >
> > > > > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > > > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > > > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > > > > much about it so feel free to ignore me but I think the following might be
> > > > > > more sensible error codes:
> > > > > >
> > > > > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > > > > if (fd == FD_INVALID)
> > > > > > return -EOPNOTSUPP;
> > > > > > return -EBADF;
> > > > > > }
> > > > > >
> > > > > > if (fd != FD_INVALID)
> > > > > > return -EBADF; (or -EINVAL no strong preference here)
> > > > >
> > > > > FWIW, I like -EBADF better.
> > > > > it makes the error more descriptive and keeps the flow simple:
> > > > >
> > > > > + /*
> > > > > + * Only autonomous handles can be decoded without a file
> > > > > + * descriptor and only when FD_INVALID is provided.
> > > > > + */
> > > > > + if (fd != FD_INVALID)
> > > > > + return -EBADF;
> > > > > +
> > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > + return -EOPNOTSUPP;
> > > > >
> > > >
> > > > Thinking about it some more, as I am trying to address your concerns
> > > > about crafting autonomous file handles by systemd, as you already
> > > > decided to define a range for kernel reserved values for fd, why not,
> > > > instead of requiring FD_INVALID for autonomous file handle, that we
> > > > actually define a kernel fd value that translates to "the root of pidfs":
> > > >
> > > > + /*
> > > > + * Autonomous handles can be decoded with a special file
> > > > + * descriptor value that describes the filesystem.
> > > > + */
> > > > + switch (fd) {
> > > > + case FD_PIDFS_ROOT:
> > > > + pidfs_get_root(root);
> > > > + break;
> > > > + default:
> > > > + return -EBADF;
> > > > + }
> > > > +
> > > >
> > > > Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> > > > and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> > > > to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> > > > as before (you can also keep the existing systemd code) and when you want
> > > > to open file by handle you just go
> > > > open_by_handle_at(FD_PIDFS, &handle, 0)
> > > > and that's it.
> > > >
> > > > In the end, my one and only concern with autonomous file handles is that
> > > > there should be a user opt-in to request them.
> > > >
> > > > Sorry for taking the long road to get to this simpler design.
> > > > WDYT?
> > >
> > > And simply place FD_PIDFS_ROOT into the -10000 range?
> > > Sounds good to me.
> >
> > Yes.
> >
> > I mean I don't expect there will be a flood of those singleton
> > filesystems, but generally speaking, a unique fd constant
> > to describe the root of a singleton filesystem makes sense IMO.
>
> Agreed. See the appended updated patches. I'm not resending completely.
> I just dropped other patches.
I like this! Much simpler.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 15:23 ` Christian Brauner
2025-06-24 17:45 ` Jan Kara
@ 2025-06-24 19:23 ` Amir Goldstein
2025-06-25 7:52 ` Christian Brauner
1 sibling, 1 reply; 47+ messages in thread
From: Amir Goldstein @ 2025-06-24 19:23 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 5:23 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Jun 24, 2025 at 05:07:59PM +0200, Amir Goldstein wrote:
> > On Tue, Jun 24, 2025 at 4:51 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> > > > On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > > > > use-case to support opening files purely based on the handle without
> > > > > > > having to require a file descriptor to another object. That's especially
> > > > > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > > > > zero relationship between the objects. Such filesystems are also
> > > > > > > singletons that stay around for the lifetime of the system meaning that
> > > > > > > they can be uniquely identified and accessed purely based on the file
> > > > > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > > > > object needlessly especially if they can't do that for whatever reason.
> > > > > > >
> > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > > ---
> > > > > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > > > > fs/pidfs.c | 5 ++++-
> > > > > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > > > > >
> > > > > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > > > > index ab4891925b52..54081e19f594 100644
> > > > > > > --- a/fs/fhandle.c
> > > > > > > +++ b/fs/fhandle.c
> > > > > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > > > > return err;
> > > > > > > }
> > > > > > >
> > > > > > > -static int get_path_anchor(int fd, struct path *root)
> > > > > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > > > > {
> > > > > > > if (fd >= 0) {
> > > > > > > CLASS(fd, f)(fd);
> > > > > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > > > > return 0;
> > > > > > > }
> > > > > > >
> > > > > > > + /*
> > > > > > > + * Only autonomous handles can be decoded without a file
> > > > > > > + * descriptor.
> > > > > > > + */
> > > > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > > > + return -EOPNOTSUPP;
> > > > > >
> > > > > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > > > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > > > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > > > > much about it so feel free to ignore me but I think the following might be
> > > > > > more sensible error codes:
> > > > > >
> > > > > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > > > > if (fd == FD_INVALID)
> > > > > > return -EOPNOTSUPP;
> > > > > > return -EBADF;
> > > > > > }
> > > > > >
> > > > > > if (fd != FD_INVALID)
> > > > > > return -EBADF; (or -EINVAL no strong preference here)
> > > > >
> > > > > FWIW, I like -EBADF better.
> > > > > it makes the error more descriptive and keeps the flow simple:
> > > > >
> > > > > + /*
> > > > > + * Only autonomous handles can be decoded without a file
> > > > > + * descriptor and only when FD_INVALID is provided.
> > > > > + */
> > > > > + if (fd != FD_INVALID)
> > > > > + return -EBADF;
> > > > > +
> > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > + return -EOPNOTSUPP;
> > > > >
> > > >
> > > > Thinking about it some more, as I am trying to address your concerns
> > > > about crafting autonomous file handles by systemd, as you already
> > > > decided to define a range for kernel reserved values for fd, why not,
> > > > instead of requiring FD_INVALID for autonomous file handle, that we
> > > > actually define a kernel fd value that translates to "the root of pidfs":
> > > >
> > > > + /*
> > > > + * Autonomous handles can be decoded with a special file
> > > > + * descriptor value that describes the filesystem.
> > > > + */
> > > > + switch (fd) {
> > > > + case FD_PIDFS_ROOT:
> > > > + pidfs_get_root(root);
> > > > + break;
> > > > + default:
> > > > + return -EBADF;
> > > > + }
> > > > +
> > > >
> > > > Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> > > > and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> > > > to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> > > > as before (you can also keep the existing systemd code) and when you want
> > > > to open file by handle you just go
> > > > open_by_handle_at(FD_PIDFS, &handle, 0)
> > > > and that's it.
> > > >
> > > > In the end, my one and only concern with autonomous file handles is that
> > > > there should be a user opt-in to request them.
> > > >
> > > > Sorry for taking the long road to get to this simpler design.
> > > > WDYT?
> > >
> > > And simply place FD_PIDFS_ROOT into the -10000 range?
> > > Sounds good to me.
> >
> > Yes.
> >
> > I mean I don't expect there will be a flood of those singleton
> > filesystems, but generally speaking, a unique fd constant
> > to describe the root of a singleton filesystem makes sense IMO.
>
> Agreed. See the appended updated patches. I'm not resending completely.
> I just dropped other patches.
For those can also add:
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
But I see that FD_INVALID is still there.
Are you planning to keep the FD_INVALID patch without any
users for FD_INVALID?
Thanks,
Amir.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 8:29 ` [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle Christian Brauner
2025-06-24 9:30 ` Jan Kara
@ 2025-06-24 23:07 ` Al Viro
2025-06-25 7:52 ` Christian Brauner
1 sibling, 1 reply; 47+ messages in thread
From: Al Viro @ 2025-06-24 23:07 UTC (permalink / raw)
To: Christian Brauner
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Tue, Jun 24, 2025 at 10:29:13AM +0200, Christian Brauner wrote:
> Various filesystems such as pidfs (and likely drm in the future) have a
> use-case to support opening files purely based on the handle without
> having to require a file descriptor to another object. That's especially
> the case for filesystems that don't do any lookup whatsoever and there's
> zero relationship between the objects. Such filesystems are also
> singletons that stay around for the lifetime of the system meaning that
> they can be uniquely identified and accessed purely based on the file
> handle type. Enable that so that userspace doesn't have to allocate an
> object needlessly especially if they can't do that for whatever reason.
Whoa... Two notes:
1) you really want to make sure that no _directories_ on those
filesystems are decodable.
2) do you want to get the damn things bound somewhere?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 23:07 ` Al Viro
@ 2025-06-25 7:52 ` Christian Brauner
0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-25 7:52 UTC (permalink / raw)
To: Al Viro
Cc: Jeff Layton, Chuck Lever, Jan Kara, Amir Goldstein, Simona Vetter,
linux-fsdevel, linux-nfs
On Wed, Jun 25, 2025 at 12:07:45AM +0100, Al Viro wrote:
> On Tue, Jun 24, 2025 at 10:29:13AM +0200, Christian Brauner wrote:
> > Various filesystems such as pidfs (and likely drm in the future) have a
> > use-case to support opening files purely based on the handle without
> > having to require a file descriptor to another object. That's especially
> > the case for filesystems that don't do any lookup whatsoever and there's
> > zero relationship between the objects. Such filesystems are also
> > singletons that stay around for the lifetime of the system meaning that
> > they can be uniquely identified and accessed purely based on the file
> > handle type. Enable that so that userspace doesn't have to allocate an
> > object needlessly especially if they can't do that for whatever reason.
>
> Whoa... Two notes:
> 1) you really want to make sure that no _directories_ on those
> filesystems are decodable.
No directories exist on any of the obvious candidates (certainly not
pidfs) and it's easy to add an assertion there. It's obvioulsy only
going to work for specific filesystems.
> 2) do you want to get the damn things bound somewhere?
For the obvious cases either bind-mounts aren't supported or having a
bind-mount would not constitute a problem. For starter pidfds can be
created purely based on pidfd_open() which means you can always open a
pidfd for process even if the bind-mount in question would not be
accessible to you.
Are there other concerns about bind-mounts that I'm missing?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle
2025-06-24 19:23 ` Amir Goldstein
@ 2025-06-25 7:52 ` Christian Brauner
0 siblings, 0 replies; 47+ messages in thread
From: Christian Brauner @ 2025-06-25 7:52 UTC (permalink / raw)
To: Amir Goldstein
Cc: Jan Kara, Jeff Layton, Chuck Lever, Simona Vetter, linux-fsdevel,
linux-nfs
On Tue, Jun 24, 2025 at 09:23:36PM +0200, Amir Goldstein wrote:
> On Tue, Jun 24, 2025 at 5:23 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Tue, Jun 24, 2025 at 05:07:59PM +0200, Amir Goldstein wrote:
> > > On Tue, Jun 24, 2025 at 4:51 PM Christian Brauner <brauner@kernel.org> wrote:
> > > >
> > > > On Tue, Jun 24, 2025 at 04:28:50PM +0200, Amir Goldstein wrote:
> > > > > On Tue, Jun 24, 2025 at 12:53 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jun 24, 2025 at 11:30 AM Jan Kara <jack@suse.cz> wrote:
> > > > > > >
> > > > > > > On Tue 24-06-25 10:29:13, Christian Brauner wrote:
> > > > > > > > Various filesystems such as pidfs (and likely drm in the future) have a
> > > > > > > > use-case to support opening files purely based on the handle without
> > > > > > > > having to require a file descriptor to another object. That's especially
> > > > > > > > the case for filesystems that don't do any lookup whatsoever and there's
> > > > > > > > zero relationship between the objects. Such filesystems are also
> > > > > > > > singletons that stay around for the lifetime of the system meaning that
> > > > > > > > they can be uniquely identified and accessed purely based on the file
> > > > > > > > handle type. Enable that so that userspace doesn't have to allocate an
> > > > > > > > object needlessly especially if they can't do that for whatever reason.
> > > > > > > >
> > > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > > > ---
> > > > > > > > fs/fhandle.c | 22 ++++++++++++++++++++--
> > > > > > > > fs/pidfs.c | 5 ++++-
> > > > > > > > 2 files changed, 24 insertions(+), 3 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/fs/fhandle.c b/fs/fhandle.c
> > > > > > > > index ab4891925b52..54081e19f594 100644
> > > > > > > > --- a/fs/fhandle.c
> > > > > > > > +++ b/fs/fhandle.c
> > > > > > > > @@ -173,7 +173,7 @@ SYSCALL_DEFINE5(name_to_handle_at, int, dfd, const char __user *, name,
> > > > > > > > return err;
> > > > > > > > }
> > > > > > > >
> > > > > > > > -static int get_path_anchor(int fd, struct path *root)
> > > > > > > > +static int get_path_anchor(int fd, struct path *root, int handle_type)
> > > > > > > > {
> > > > > > > > if (fd >= 0) {
> > > > > > > > CLASS(fd, f)(fd);
> > > > > > > > @@ -193,6 +193,24 @@ static int get_path_anchor(int fd, struct path *root)
> > > > > > > > return 0;
> > > > > > > > }
> > > > > > > >
> > > > > > > > + /*
> > > > > > > > + * Only autonomous handles can be decoded without a file
> > > > > > > > + * descriptor.
> > > > > > > > + */
> > > > > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > > > > + return -EOPNOTSUPP;
> > > > > > >
> > > > > > > This somewhat ties to my comment to patch 5 that if someone passed invalid
> > > > > > > fd < 0 before, we'd be returning -EBADF and now we'd be returning -EINVAL
> > > > > > > or -EOPNOTSUPP based on FILEID_IS_AUTONOMOUS setting. I don't care that
> > > > > > > much about it so feel free to ignore me but I think the following might be
> > > > > > > more sensible error codes:
> > > > > > >
> > > > > > > if (!(handle_type & FILEID_IS_AUTONOMOUS)) {
> > > > > > > if (fd == FD_INVALID)
> > > > > > > return -EOPNOTSUPP;
> > > > > > > return -EBADF;
> > > > > > > }
> > > > > > >
> > > > > > > if (fd != FD_INVALID)
> > > > > > > return -EBADF; (or -EINVAL no strong preference here)
> > > > > >
> > > > > > FWIW, I like -EBADF better.
> > > > > > it makes the error more descriptive and keeps the flow simple:
> > > > > >
> > > > > > + /*
> > > > > > + * Only autonomous handles can be decoded without a file
> > > > > > + * descriptor and only when FD_INVALID is provided.
> > > > > > + */
> > > > > > + if (fd != FD_INVALID)
> > > > > > + return -EBADF;
> > > > > > +
> > > > > > + if (!(handle_type & FILEID_IS_AUTONOMOUS))
> > > > > > + return -EOPNOTSUPP;
> > > > > >
> > > > >
> > > > > Thinking about it some more, as I am trying to address your concerns
> > > > > about crafting autonomous file handles by systemd, as you already
> > > > > decided to define a range for kernel reserved values for fd, why not,
> > > > > instead of requiring FD_INVALID for autonomous file handle, that we
> > > > > actually define a kernel fd value that translates to "the root of pidfs":
> > > > >
> > > > > + /*
> > > > > + * Autonomous handles can be decoded with a special file
> > > > > + * descriptor value that describes the filesystem.
> > > > > + */
> > > > > + switch (fd) {
> > > > > + case FD_PIDFS_ROOT:
> > > > > + pidfs_get_root(root);
> > > > > + break;
> > > > > + default:
> > > > > + return -EBADF;
> > > > > + }
> > > > > +
> > > > >
> > > > > Then you can toss all my old ideas, including FILEID_IS_AUTONOMOUS,
> > > > > and EXPORT_OP_AUTONOMOUS_HANDLES and you do not even need
> > > > > to define FILEID_PIDFS anymore, just keep exporting FILEID_KERNFS
> > > > > as before (you can also keep the existing systemd code) and when you want
> > > > > to open file by handle you just go
> > > > > open_by_handle_at(FD_PIDFS, &handle, 0)
> > > > > and that's it.
> > > > >
> > > > > In the end, my one and only concern with autonomous file handles is that
> > > > > there should be a user opt-in to request them.
> > > > >
> > > > > Sorry for taking the long road to get to this simpler design.
> > > > > WDYT?
> > > >
> > > > And simply place FD_PIDFS_ROOT into the -10000 range?
> > > > Sounds good to me.
> > >
> > > Yes.
> > >
> > > I mean I don't expect there will be a flood of those singleton
> > > filesystems, but generally speaking, a unique fd constant
> > > to describe the root of a singleton filesystem makes sense IMO.
> >
> > Agreed. See the appended updated patches. I'm not resending completely.
> > I just dropped other patches.
>
> For those can also add:
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>
> But I see that FD_INVALID is still there.
> Are you planning to keep the FD_INVALID patch without any
> users for FD_INVALID?
Yes, even if just for userspace to get used to.
^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread, other threads:[~2025-06-25 7:53 UTC | newest]
Thread overview: 47+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-24 8:29 [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Christian Brauner
2025-06-24 8:29 ` [PATCH v2 01/11] fhandle: raise FILEID_IS_DIR in handle_type Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 02/11] fhandle: hoist copy_from_user() above get_path_from_fd() Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 03/11] fhandle: rename to get_path_anchor() Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 04/11] pidfs: add pidfs_root_path() helper Christian Brauner
2025-06-24 9:31 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 05/11] fhandle: reflow get_path_anchor() Christian Brauner
2025-06-24 9:16 ` Jan Kara
2025-06-24 10:16 ` Christian Brauner
2025-06-24 8:29 ` [PATCH v2 06/11] uapi/fcntl: mark range as reserved Christian Brauner
2025-06-24 9:16 ` Jan Kara
2025-06-24 10:57 ` Amir Goldstein
2025-06-24 13:47 ` Christian Brauner
2025-06-24 8:29 ` [PATCH v2 07/11] uapi/fcntl: add FD_INVALID Christian Brauner
2025-06-24 9:17 ` Jan Kara
2025-06-24 8:29 ` [PATCH v2 08/11] exportfs: add FILEID_PIDFS Christian Brauner
2025-06-24 9:17 ` Jan Kara
2025-06-24 13:15 ` Amir Goldstein
2025-06-24 13:43 ` Christian Brauner
2025-06-24 14:20 ` Amir Goldstein
2025-06-24 8:29 ` [PATCH v2 09/11] fhandle: add EXPORT_OP_AUTONOMOUS_HANDLES marker Christian Brauner
2025-06-24 9:18 ` Jan Kara
2025-06-24 9:20 ` Jan Kara
2025-06-24 10:16 ` Christian Brauner
2025-06-24 8:29 ` [PATCH v2 10/11] fhandle, pidfs: support open_by_handle_at() purely based on file handle Christian Brauner
2025-06-24 9:30 ` Jan Kara
2025-06-24 10:15 ` Christian Brauner
2025-06-24 10:53 ` Amir Goldstein
2025-06-24 14:28 ` Amir Goldstein
2025-06-24 14:51 ` Christian Brauner
2025-06-24 15:07 ` Amir Goldstein
2025-06-24 15:23 ` Christian Brauner
2025-06-24 17:45 ` Jan Kara
2025-06-24 19:23 ` Amir Goldstein
2025-06-25 7:52 ` Christian Brauner
2025-06-24 23:07 ` Al Viro
2025-06-25 7:52 ` Christian Brauner
2025-06-24 8:29 ` [PATCH v2 11/11] selftests/pidfd: decode pidfd file handles withou having to specify an fd Christian Brauner
2025-06-24 9:39 ` Jan Kara
2025-06-24 10:58 ` [PATCH v2 00/11] fhandle, pidfs: allow open_by_handle_at() purely based on file handle Amir Goldstein
2025-06-24 10:59 ` Christian Brauner
2025-06-24 14:15 ` Jan Kara
2025-06-24 14:34 ` Amir Goldstein
2025-06-24 14:39 ` Christian Brauner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).