linux-sh.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
@ 2025-03-21 19:48 Andrey Albershteyn
  2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-21 19:48 UTC (permalink / raw)
  To: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn
  Cc: linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, selinux, Andrey Albershteyn, Andrey Albershteyn,
	linux-xfs

This patchset introduced two new syscalls getfsxattrat() and
setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
except they use *at() semantics. Therefore, there's no need to open the
file to get an fd.

These syscalls allow userspace to set filesystem inode attributes on
special files. One of the usage examples is XFS quota projects.

XFS has project quotas which could be attached to a directory. All
new inodes in these directories inherit project ID set on parent
directory.

The project is created from userspace by opening and calling
FS_IOC_FSSETXATTR on each inode. This is not possible for special
files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
with empty project ID. Those inodes then are not shown in the quota
accounting but still exist in the directory. This is not critical but in
the case when special files are created in the directory with already
existing project quota, these new inodes inherit extended attributes.
This creates a mix of special files with and without attributes.
Moreover, special files with attributes don't have a possibility to
become clear or change the attributes. This, in turn, prevents userspace
from re-creating quota project on these existing files.

Christian, if this get in some mergeable state, please don't merge it
yet. Amir suggested these syscalls better to use updated struct fsxattr
with masking from Pali Rohár patchset, so, let's see how it goes.

NAME

	getfsxattrat/setfsxattrat - get/set filesystem inode attributes

SYNOPSIS

	#include <sys/syscall.h>    /* Definition of SYS_* constants */
	#include <unistd.h>

	long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
		struct fsxattr *fsx, size_t size,
		unsigned int at_flags);
	long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
		struct fsxattr *fsx, size_t size,
		unsigned int at_flags);

	Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
	use syscall(2) instead.

DESCRIPTION

	The syscalls take fd and path to the child together with struct
	fsxattr. If path is absolute, fd is not used. If path is empty,
	inode under fd is used to get/set attributes on.

	This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
	ioctl with a difference that file don't need to be open as we
	can reference it with a path instead of fd. By having this we
	can manipulated filesystem inode attributes not only on regular
	files but also on special ones. This is not possible with
	FS_IOC_FSSETXATTR ioctl as with special files we can not call
	ioctl() directly on the filesystem inode using file descriptor.

RETURN VALUE

	On success, 0 is returned.  On error, -1 is returned, and errno
	is set to indicate the error.

ERRORS

	EINVAL		Invalid at_flag specified (only
			AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
			supported).

	EINVAL		Size was smaller than any known version of
			struct fsxattr.

	EINVAL		Invalid combination of parameters provided in
			fsxattr for this type of file.

	E2BIG		Size of input argument **struct fsxattr** is too
			big.

	EBADF		Invalid file descriptor was provided.

	EPERM		No permission to change this file.

	EOPNOTSUPP	Filesystem does not support setting attributes
			on this type of inode

HISTORY

	Added in Linux 6.14.

EXAMPLE

Create directory and file "mkdir ./dir && touch ./dir/foo" and then
execute the following program:

	#include <fcntl.h>
	#include <errno.h>
	#include <string.h>
	#include <linux/fs.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int
	main(int argc, char **argv) {
		int dfd;
		int error;
		struct fsxattr fsx;

		dfd = open("./dir", O_RDONLY);
		if (dfd == -1) {
			printf("can not open ./dir");
			return dfd;
		}

		error = syscall(467, dfd, "./foo", &fsx, 0);
		if (error) {
			printf("can not call 467: %s", strerror(errno));
			return error;
		}

		printf("dir/foo flags: %d\n", fsx.fsx_xflags);

		fsx.fsx_xflags |= FS_XFLAG_NODUMP;
		error = syscall(468, dfd, "./foo", &fsx, 0);
		if (error) {
			printf("can not call 468: %s", strerror(errno));
			return error;
		}

		printf("dir/foo flags: %d\n", fsx.fsx_xflags);

		return error;
	}

SEE ALSO

	ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)

---
Changes in v4:
- Use getname_maybe_null() for correct handling of dfd + path semantic
- Remove restriction for special files on which flags are allowed
- Utilize copy_struct_from_user() for better future compatibility
- Add draft man page to cover letter
- Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
- Add missing __user to header declaration of syscalls
- Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org

Changes in v3:
- Remove unnecessary "dfd is dir" check as it checked in user_path_at()
- Remove unnecessary "same filesystem" check
- Use CLASS() instead of directly calling fdget/fdput
- Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org

v1:
https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/

Previous discussion:
https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/

---
Andrey Albershteyn (3):
      lsm: introduce new hooks for setting/getting inode fsxattr
      fs: split fileattr/fsxattr converters into helpers
      fs: introduce getfsxattrat and setfsxattrat syscalls

 arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
 arch/arm/tools/syscall.tbl                  |   2 +
 arch/arm64/tools/syscall_32.tbl             |   2 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
 arch/s390/kernel/syscalls/syscall.tbl       |   2 +
 arch/sh/kernel/syscalls/syscall.tbl         |   2 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
 fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
 fs/ioctl.c                                  |  39 ++++++---
 include/linux/fileattr.h                    |   2 +
 include/linux/lsm_hook_defs.h               |   4 +
 include/linux/security.h                    |  16 ++++
 include/linux/syscalls.h                    |   6 ++
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/fs.h                     |   3 +
 security/security.c                         |  32 +++++++
 25 files changed, 259 insertions(+), 13 deletions(-)
---
base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
change-id: 20250114-xattrat-syscall-6a1136d2db59

Best regards,
-- 
Andrey Albershteyn <aalbersh@kernel.org>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr
  2025-03-21 19:48 [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
@ 2025-03-21 19:48 ` Andrey Albershteyn
  2025-03-21 21:32   ` Paul Moore
  2025-03-24 19:21   ` Mickaël Salaün
  2025-03-21 19:48 ` [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers Andrey Albershteyn
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-21 19:48 UTC (permalink / raw)
  To: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn
  Cc: linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, selinux, Andrey Albershteyn

Introduce new hooks for setting and getting filesystem extended
attributes on inode (FS_IOC_FSGETXATTR).

Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
---
 fs/ioctl.c                    |  7 ++++++-
 include/linux/lsm_hook_defs.h |  4 ++++
 include/linux/security.h      | 16 ++++++++++++++++
 security/security.c           | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 638a36be31c14afc66a7fd6eb237d9545e8ad997..4434c97bc5dff5a3e8635e28745cd99404ff353e 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -525,10 +525,15 @@ EXPORT_SYMBOL(fileattr_fill_flags);
 int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
 {
 	struct inode *inode = d_inode(dentry);
+	int error;
 
 	if (!inode->i_op->fileattr_get)
 		return -ENOIOCTLCMD;
 
+	error = security_inode_getfsxattr(inode, fa);
+	if (error)
+		return error;
+
 	return inode->i_op->fileattr_get(dentry, fa);
 }
 EXPORT_SYMBOL(vfs_fileattr_get);
@@ -692,7 +697,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
 		}
 		err = fileattr_set_prepare(inode, &old_ma, fa);
-		if (!err)
+		if (!err && !security_inode_setfsxattr(inode, fa))
 			err = inode->i_op->fileattr_set(idmap, dentry, fa);
 	}
 	inode_unlock(inode);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index eb2937599cb029004f491012b3bf5a3d6d2731df..49e64d23e9049568af133bf3f30ca719c9ec5f25 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -157,6 +157,10 @@ LSM_HOOK(int, 0, inode_removexattr, struct mnt_idmap *idmap,
 	 struct dentry *dentry, const char *name)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_removexattr, struct dentry *dentry,
 	 const char *name)
+LSM_HOOK(int, 0, inode_setfsxattr, const struct inode *inode,
+	 const struct fileattr *fa)
+LSM_HOOK(int, 0, inode_getfsxattr, const struct inode *inode,
+	 const struct fileattr *fa)
 LSM_HOOK(int, 0, inode_set_acl, struct mnt_idmap *idmap,
 	 struct dentry *dentry, const char *acl_name, struct posix_acl *kacl)
 LSM_HOOK(void, LSM_RET_VOID, inode_post_set_acl, struct dentry *dentry,
diff --git a/include/linux/security.h b/include/linux/security.h
index cbdba435b798660130779d6919388779edd41d54..dd58ace29c6e325ee49470596d0abb6ecc38ba07 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -439,6 +439,10 @@ int security_inode_listxattr(struct dentry *dentry);
 int security_inode_removexattr(struct mnt_idmap *idmap,
 			       struct dentry *dentry, const char *name);
 void security_inode_post_removexattr(struct dentry *dentry, const char *name);
+int security_inode_setfsxattr(const struct inode *inode,
+			      const struct fileattr *fa);
+int security_inode_getfsxattr(const struct inode *inode,
+			      const struct fileattr *fa);
 int security_inode_need_killpriv(struct dentry *dentry);
 int security_inode_killpriv(struct mnt_idmap *idmap, struct dentry *dentry);
 int security_inode_getsecurity(struct mnt_idmap *idmap,
@@ -1042,6 +1046,18 @@ static inline void security_inode_post_removexattr(struct dentry *dentry,
 						   const char *name)
 { }
 
+static inline int security_inode_setfsxattr(const struct inode *inode,
+					    const struct fileattr *fa)
+{
+	return 0;
+}
+
+static inline int security_inode_getfsxattr(const struct inode *inode,
+					    const struct fileattr *fa)
+{
+	return 0;
+}
+
 static inline int security_inode_need_killpriv(struct dentry *dentry)
 {
 	return cap_inode_need_killpriv(dentry);
diff --git a/security/security.c b/security/security.c
index 09664e09fec9a1d502a23847aa2e87a6d19837db..d3b527f55ed52209d8e22c05adf278b164374d35 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2617,6 +2617,38 @@ void security_inode_post_removexattr(struct dentry *dentry, const char *name)
 	call_void_hook(inode_post_removexattr, dentry, name);
 }
 
+/**
+ * security_inode_setfsxattr() - check if setting fsxattr is allowed
+ * @inode: inode to set filesystem extended attributes on
+ * @fa: extended attributes to set on the inode
+ *
+ * Called when setfsxattrat() syscall or FS_IOC_FSSETXATTR ioctl() is called on
+ * inode
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_inode_setfsxattr(const struct inode *inode,
+			      const struct fileattr *fa)
+{
+	return call_int_hook(inode_setfsxattr, inode, fa);
+}
+
+/**
+ * security_inode_getfsxattr() - check if retrieving fsxattr is allowed
+ * @inode: inode to retrieve filesystem extended attributes from
+ * @fa: extended attributes to get
+ *
+ * Called when getfsxattrat() syscall or FS_IOC_FSGETXATTR ioctl() is called on
+ * inode
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_inode_getfsxattr(const struct inode *inode,
+			      const struct fileattr *fa)
+{
+	return call_int_hook(inode_getfsxattr, inode, fa);
+}
+
 /**
  * security_inode_need_killpriv() - Check if security_inode_killpriv() required
  * @dentry: associated dentry

-- 
2.47.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers
  2025-03-21 19:48 [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
  2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
@ 2025-03-21 19:48 ` Andrey Albershteyn
  2025-03-27 12:32   ` Jan Kara
  2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
  2025-03-23  8:45 ` [PATCH v4 0/3] " Amir Goldstein
  3 siblings, 1 reply; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-21 19:48 UTC (permalink / raw)
  To: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn
  Cc: linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, Andrey Albershteyn

This will be helpful for get/setfsxattrat syscalls to convert
between fileattr and fsxattr.

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
---
 fs/ioctl.c               | 32 +++++++++++++++++++++-----------
 include/linux/fileattr.h |  2 ++
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4434c97bc5dff5a3e8635e28745cd99404ff353e..840283d8c406623d8d26790f89b62ebcbd39e2de 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -538,6 +538,16 @@ int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
 }
 EXPORT_SYMBOL(vfs_fileattr_get);
 
+void fileattr_to_fsxattr(const struct fileattr *fa, struct fsxattr *fsx)
+{
+	memset(fsx, 0, sizeof(struct fsxattr));
+	fsx->fsx_xflags = fa->fsx_xflags;
+	fsx->fsx_extsize = fa->fsx_extsize;
+	fsx->fsx_nextents = fa->fsx_nextents;
+	fsx->fsx_projid = fa->fsx_projid;
+	fsx->fsx_cowextsize = fa->fsx_cowextsize;
+}
+
 /**
  * copy_fsxattr_to_user - copy fsxattr to userspace.
  * @fa:		fileattr pointer
@@ -549,12 +559,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa)
 {
 	struct fsxattr xfa;
 
-	memset(&xfa, 0, sizeof(xfa));
-	xfa.fsx_xflags = fa->fsx_xflags;
-	xfa.fsx_extsize = fa->fsx_extsize;
-	xfa.fsx_nextents = fa->fsx_nextents;
-	xfa.fsx_projid = fa->fsx_projid;
-	xfa.fsx_cowextsize = fa->fsx_cowextsize;
+	fileattr_to_fsxattr(fa, &xfa);
 
 	if (copy_to_user(ufa, &xfa, sizeof(xfa)))
 		return -EFAULT;
@@ -563,6 +568,15 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa)
 }
 EXPORT_SYMBOL(copy_fsxattr_to_user);
 
+void fsxattr_to_fileattr(const struct fsxattr *fsx, struct fileattr *fa)
+{
+	fileattr_fill_xflags(fa, fsx->fsx_xflags);
+	fa->fsx_extsize = fsx->fsx_extsize;
+	fa->fsx_nextents = fsx->fsx_nextents;
+	fa->fsx_projid = fsx->fsx_projid;
+	fa->fsx_cowextsize = fsx->fsx_cowextsize;
+}
+
 static int copy_fsxattr_from_user(struct fileattr *fa,
 				  struct fsxattr __user *ufa)
 {
@@ -571,11 +585,7 @@ static int copy_fsxattr_from_user(struct fileattr *fa,
 	if (copy_from_user(&xfa, ufa, sizeof(xfa)))
 		return -EFAULT;
 
-	fileattr_fill_xflags(fa, xfa.fsx_xflags);
-	fa->fsx_extsize = xfa.fsx_extsize;
-	fa->fsx_nextents = xfa.fsx_nextents;
-	fa->fsx_projid = xfa.fsx_projid;
-	fa->fsx_cowextsize = xfa.fsx_cowextsize;
+	fsxattr_to_fileattr(&xfa, fa);
 
 	return 0;
 }
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 47c05a9851d0600964b644c9c7218faacfd865f8..31888fa2edf10050be134f587299256088344365 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -33,7 +33,9 @@ struct fileattr {
 	bool	fsx_valid:1;
 };
 
+void fileattr_to_fsxattr(const struct fileattr *fa, struct fsxattr *fsx);
 int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa);
+void fsxattr_to_fileattr(const struct fsxattr *fsx, struct fileattr *fa);
 
 void fileattr_fill_xflags(struct fileattr *fa, u32 xflags);
 void fileattr_fill_flags(struct fileattr *fa, u32 flags);

-- 
2.47.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-21 19:48 [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
  2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
  2025-03-21 19:48 ` [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers Andrey Albershteyn
@ 2025-03-21 19:48 ` Andrey Albershteyn
  2025-03-23  8:56   ` Amir Goldstein
                     ` (2 more replies)
  2025-03-23  8:45 ` [PATCH v4 0/3] " Amir Goldstein
  3 siblings, 3 replies; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-21 19:48 UTC (permalink / raw)
  To: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn
  Cc: linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, Andrey Albershteyn, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
extended attributes/flags. The syscalls take parent directory fd and
path to the child together with struct fsxattr.

This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
that file don't need to be open as we can reference it with a path
instead of fd. By having this we can manipulated inode extended
attributes not only on regular files but also on special ones. This
is not possible with FS_IOC_FSSETXATTR ioctl as with special files
we can not call ioctl() directly on the filesystem inode using fd.

This patch adds two new syscalls which allows userspace to get/set
extended inode attributes on special files by using parent directory
and a path - *at() like syscall.

CC: linux-api@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
CC: linux-xfs@vger.kernel.org
Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
 arch/arm/tools/syscall.tbl                  |   2 +
 arch/arm64/tools/syscall_32.tbl             |   2 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
 arch/s390/kernel/syscalls/syscall.tbl       |   2 +
 arch/sh/kernel/syscalls/syscall.tbl         |   2 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
 fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
 include/linux/syscalls.h                    |   6 ++
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/fs.h                     |   3 +
 20 files changed, 178 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index c59d53d6d3f3490f976ca179ddfe02e69265ae4d..4b9e687494c16b60c6fd6ca1dc4d6564706a7e25 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -506,3 +506,5 @@
 574	common	getxattrat			sys_getxattrat
 575	common	listxattrat			sys_listxattrat
 576	common	removexattrat			sys_removexattrat
+577	common	getfsxattrat			sys_getfsxattrat
+578	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 49eeb2ad8dbd8e074c6240417693f23fb328afa8..66466257f3c2debb3e2299f0b608c6740c98cab2 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -481,3 +481,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl
index 69a829912a05eb8a3e21ed701d1030e31c0148bc..9c516118b154811d8d11d5696f32817430320dbf 100644
--- a/arch/arm64/tools/syscall_32.tbl
+++ b/arch/arm64/tools/syscall_32.tbl
@@ -478,3 +478,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index f5ed71f1910d09769c845c2d062d99ee0449437c..159476387f394a92ee5e29db89b118c630372db2 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -466,3 +466,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 680f568b77f2cbefc3eacb2517f276041f229b1e..a6d59ee740b58cacf823702003cf9bad17c0d3b7 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -472,3 +472,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 0b9b7e25b69ad592642f8533bee9ccfe95ce9626..cfe38fcebe1a0279e11751378d3e71c5ec6b6569 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -405,3 +405,5 @@
 464	n32	getxattrat			sys_getxattrat
 465	n32	listxattrat			sys_listxattrat
 466	n32	removexattrat			sys_removexattrat
+467	n32	getfsxattrat			sys_getfsxattrat
+468	n32	setfsxattrat			sys_setfsxattrat
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index c844cd5cda620b2809a397cdd6f4315ab6a1bfe2..29a0c5974d1aa2f01e33edc0252d75fb97abe230 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -381,3 +381,5 @@
 464	n64	getxattrat			sys_getxattrat
 465	n64	listxattrat			sys_listxattrat
 466	n64	removexattrat			sys_removexattrat
+467	n64	getfsxattrat			sys_getfsxattrat
+468	n64	setfsxattrat			sys_setfsxattrat
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 349b8aad1159f404103bd2057a1e64e9bf309f18..6c00436807c57c492ba957fcd59af1202231cf80 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -454,3 +454,5 @@
 464	o32	getxattrat			sys_getxattrat
 465	o32	listxattrat			sys_listxattrat
 466	o32	removexattrat			sys_removexattrat
+467	o32	getfsxattrat			sys_getfsxattrat
+468	o32	setfsxattrat			sys_setfsxattrat
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index d9fc94c869657fcfbd7aca1d5f5abc9fae2fb9d8..b3578fac43d6b65167787fcc97d2d09f5a9828e7 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -465,3 +465,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index d8b4ab78bef076bd50d49b87dea5060fd8c1686a..808045d82c9465c3bfa96b15947546efe5851e9a 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -557,3 +557,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index e9115b4d8b635b846e5c9ad6ce229605323723a5..78dfc2c184d4815baf8a9e61c546c9936d58a47c 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -469,3 +469,5 @@
 464  common	getxattrat		sys_getxattrat			sys_getxattrat
 465  common	listxattrat		sys_listxattrat			sys_listxattrat
 466  common	removexattrat		sys_removexattrat		sys_removexattrat
+467  common	getfsxattrat		sys_getfsxattrat		sys_getfsxattrat
+468  common	setfsxattrat		sys_setfsxattrat		sys_setfsxattrat
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index c8cad33bf250ea110de37bd1407f5a43ec5e38f2..d5a5c8339f0ed25ea07c4aba90351d352033c8a0 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -470,3 +470,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 727f99d333b304b3db0711953a3d91ece18a28eb..817dcd8603bcbffc47f3f59aa3b74b16486453d0 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -512,3 +512,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4d0fb2fba7e208ae9455459afe11e277321d9f74..b4842c027c5d00c0236b2ba89387c5e2267447bd 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -472,3 +472,5 @@
 464	i386	getxattrat		sys_getxattrat
 465	i386	listxattrat		sys_listxattrat
 466	i386	removexattrat		sys_removexattrat
+467	i386	getfsxattrat		sys_getfsxattrat
+468	i386	setfsxattrat		sys_setfsxattrat
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5eb708bff1c791debd6cfc5322583b2ae53f6437..b6f0a7236aaee624cf9b484239a1068085a8ffe1 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -390,6 +390,8 @@
 464	common	getxattrat		sys_getxattrat
 465	common	listxattrat		sys_listxattrat
 466	common	removexattrat		sys_removexattrat
+467	common	getfsxattrat		sys_getfsxattrat
+468	common	setfsxattrat		sys_setfsxattrat
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 37effc1b134eea061f2c350c1d68b4436b65a4dd..425d56be337d1de22f205ac503df61ff86224fee 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -437,3 +437,5 @@
 464	common	getxattrat			sys_getxattrat
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
+467	common	getfsxattrat			sys_getfsxattrat
+468	common	setfsxattrat			sys_setfsxattrat
diff --git a/fs/inode.c b/fs/inode.c
index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..811debf379ab299f287ed90863277cfda27db30c 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -23,6 +23,9 @@
 #include <linux/rw_hint.h>
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
+#include <linux/syscalls.h>
+#include <linux/fileattr.h>
+#include <linux/namei.h>
 #include <trace/events/writeback.h>
 #define CREATE_TRACE_POINTS
 #include <trace/events/timestamp.h>
@@ -2953,3 +2956,130 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap,
 	return mode & ~S_ISGID;
 }
 EXPORT_SYMBOL(mode_strip_sgid);
+
+SYSCALL_DEFINE5(getfsxattrat, int, dfd, const char __user *, filename,
+		struct fsxattr __user *, ufsx, size_t, usize,
+		unsigned int, at_flags)
+{
+	struct fileattr fa = {};
+	struct path filepath;
+	int error;
+	unsigned int lookup_flags = 0;
+	struct filename *name;
+	struct fsxattr fsx = {};
+
+	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
+	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
+
+	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+		return -EINVAL;
+
+	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
+		lookup_flags |= LOOKUP_FOLLOW;
+
+	if (at_flags & AT_EMPTY_PATH)
+		lookup_flags |= LOOKUP_EMPTY;
+
+	if (usize > PAGE_SIZE)
+		return -E2BIG;
+
+	if (usize < FSXATTR_SIZE_VER0)
+		return -EINVAL;
+
+	name = getname_maybe_null(filename, at_flags);
+	if (!name) {
+		CLASS(fd, f)(dfd);
+
+		if (fd_empty(f))
+			return -EBADF;
+		error = vfs_fileattr_get(file_dentry(fd_file(f)), &fa);
+	} else {
+		error = filename_lookup(dfd, name, lookup_flags, &filepath,
+					NULL);
+		if (error)
+			goto out;
+		error = vfs_fileattr_get(filepath.dentry, &fa);
+		path_put(&filepath);
+	}
+	if (error == -ENOIOCTLCMD)
+		error = -EOPNOTSUPP;
+	if (!error) {
+		fileattr_to_fsxattr(&fa, &fsx);
+		error = copy_struct_to_user(ufsx, usize, &fsx,
+					    sizeof(struct fsxattr), NULL);
+	}
+out:
+	putname(name);
+	return error;
+}
+
+SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
+		struct fsxattr __user *, ufsx, size_t, usize,
+		unsigned int, at_flags)
+{
+	struct fileattr fa;
+	struct path filepath;
+	int error;
+	unsigned int lookup_flags = 0;
+	struct filename *name;
+	struct mnt_idmap *idmap;
+	struct dentry *dentry;
+	struct vfsmount *mnt;
+	struct fsxattr fsx = {};
+
+	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
+	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
+
+	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+		return -EINVAL;
+
+	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
+		lookup_flags |= LOOKUP_FOLLOW;
+
+	if (at_flags & AT_EMPTY_PATH)
+		lookup_flags |= LOOKUP_EMPTY;
+
+	if (usize > PAGE_SIZE)
+		return -E2BIG;
+
+	if (usize < FSXATTR_SIZE_VER0)
+		return -EINVAL;
+
+	error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
+	if (error)
+		return error;
+
+	fsxattr_to_fileattr(&fsx, &fa);
+
+	name = getname_maybe_null(filename, at_flags);
+	if (!name) {
+		CLASS(fd, f)(dfd);
+
+		if (fd_empty(f))
+			return -EBADF;
+
+		idmap = file_mnt_idmap(fd_file(f));
+		dentry = file_dentry(fd_file(f));
+		mnt = fd_file(f)->f_path.mnt;
+	} else {
+		error = filename_lookup(dfd, name, lookup_flags, &filepath,
+					NULL);
+		if (error)
+			return error;
+
+		idmap = mnt_idmap(filepath.mnt);
+		dentry = filepath.dentry;
+		mnt = filepath.mnt;
+	}
+
+	error = mnt_want_write(mnt);
+	if (!error) {
+		error = vfs_fileattr_set(idmap, dentry, &fa);
+		if (error == -ENOIOCTLCMD)
+			error = -EOPNOTSUPP;
+		mnt_drop_write(mnt);
+	}
+
+	path_put(&filepath);
+	return error;
+}
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index c6333204d45130eb022f6db460eea34a1f6e91db..e242ea39b3e63a8008bc777764b616fd63bd40c4 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -371,6 +371,12 @@ asmlinkage long sys_removexattrat(int dfd, const char __user *path,
 asmlinkage long sys_lremovexattr(const char __user *path,
 				 const char __user *name);
 asmlinkage long sys_fremovexattr(int fd, const char __user *name);
+asmlinkage long sys_getfsxattrat(int dfd, const char __user *filename,
+				 struct fsxattr __user *ufsx, size_t usize,
+				 unsigned int at_flags);
+asmlinkage long sys_setfsxattrat(int dfd, const char __user *filename,
+				 struct fsxattr __user *ufsx, size_t usize,
+				 unsigned int at_flags);
 asmlinkage long sys_getcwd(char __user *buf, unsigned long size);
 asmlinkage long sys_eventfd2(unsigned int count, int flags);
 asmlinkage long sys_epoll_create1(int flags);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 88dc393c2bca38c0fa1b3fae579f7cfe4931223c..50be2e1007bc2779120d05c6e9512a689f86779c 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -850,8 +850,14 @@ __SYSCALL(__NR_listxattrat, sys_listxattrat)
 #define __NR_removexattrat 466
 __SYSCALL(__NR_removexattrat, sys_removexattrat)
 
+/* fs/inode.c */
+#define __NR_getfsxattrat 467
+__SYSCALL(__NR_getfsxattrat, sys_getfsxattrat)
+#define __NR_setfsxattrat 468
+__SYSCALL(__NR_setfsxattrat, sys_setfsxattrat)
+
 #undef __NR_syscalls
-#define __NR_syscalls 467
+#define __NR_syscalls 469
 
 /*
  * 32 bit systems traditionally used different
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 7539717707337a8cb22396a869baba3bafa08371..aed753e5d50c97da9b895a187fdaecf0477db74b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -139,6 +139,9 @@ struct fsxattr {
 	unsigned char	fsx_pad[8];
 };
 
+#define FSXATTR_SIZE_VER0 28
+#define FSXATTR_SIZE_LATEST FSXATTR_SIZE_VER0
+
 /*
  * Flags for the fsx_xflags field
  */

-- 
2.47.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode  fsxattr
  2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
@ 2025-03-21 21:32   ` Paul Moore
  2025-03-24 19:27     ` Mickaël Salaün
  2025-03-24 19:21   ` Mickaël Salaün
  1 sibling, 1 reply; 28+ messages in thread
From: Paul Moore @ 2025-03-21 21:32 UTC (permalink / raw)
  To: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, James Morris,
	Serge E. Hallyn
  Cc: linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, selinux, Andrey Albershteyn

On Mar 21, 2025 Andrey Albershteyn <aalbersh@redhat.com> wrote:
> 
> Introduce new hooks for setting and getting filesystem extended
> attributes on inode (FS_IOC_FSGETXATTR).
> 
> Cc: selinux@vger.kernel.org
> Cc: Paul Moore <paul@paul-moore.com>
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> ---
>  fs/ioctl.c                    |  7 ++++++-
>  include/linux/lsm_hook_defs.h |  4 ++++
>  include/linux/security.h      | 16 ++++++++++++++++
>  security/security.c           | 32 ++++++++++++++++++++++++++++++++
>  4 files changed, 58 insertions(+), 1 deletion(-)

Thanks Andrey, one small change below, but otherwise this looks pretty
good.  If you feel like trying to work up the SELinux implementation but
need some assitance please let me know, I'll be happy to help :)

> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 638a36be31c14afc66a7fd6eb237d9545e8ad997..4434c97bc5dff5a3e8635e28745cd99404ff353e 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -525,10 +525,15 @@ EXPORT_SYMBOL(fileattr_fill_flags);
>  int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
>  {
>  	struct inode *inode = d_inode(dentry);
> +	int error;
>  
>  	if (!inode->i_op->fileattr_get)
>  		return -ENOIOCTLCMD;
>  
> +	error = security_inode_getfsxattr(inode, fa);
> +	if (error)
> +		return error;
> +
>  	return inode->i_op->fileattr_get(dentry, fa);
>  }
>  EXPORT_SYMBOL(vfs_fileattr_get);
> @@ -692,7 +697,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
>  			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
>  		}
>  		err = fileattr_set_prepare(inode, &old_ma, fa);
> -		if (!err)
> +		if (!err && !security_inode_setfsxattr(inode, fa))
>  			err = inode->i_op->fileattr_set(idmap, dentry, fa);
>  	}
>  	inode_unlock(inode);

I don't believe we want to hide or otherwise drop the LSM return code as
that could lead to odd behavior, e.g. returning 0/success despite not
having executed the fileattr_set operation.

--
paul-moore.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-21 19:48 [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
                   ` (2 preceding siblings ...)
  2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
@ 2025-03-23  8:45 ` Amir Goldstein
  2025-03-23 10:32   ` Pali Rohár
  3 siblings, 1 reply; 28+ messages in thread
From: Amir Goldstein @ 2025-03-23  8:45 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, selinux,
	Andrey Albershteyn, linux-xfs

On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
>
> This patchset introduced two new syscalls getfsxattrat() and
> setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> except they use *at() semantics. Therefore, there's no need to open the
> file to get an fd.
>
> These syscalls allow userspace to set filesystem inode attributes on
> special files. One of the usage examples is XFS quota projects.
>
> XFS has project quotas which could be attached to a directory. All
> new inodes in these directories inherit project ID set on parent
> directory.
>
> The project is created from userspace by opening and calling
> FS_IOC_FSSETXATTR on each inode. This is not possible for special
> files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> with empty project ID. Those inodes then are not shown in the quota
> accounting but still exist in the directory. This is not critical but in
> the case when special files are created in the directory with already
> existing project quota, these new inodes inherit extended attributes.
> This creates a mix of special files with and without attributes.
> Moreover, special files with attributes don't have a possibility to
> become clear or change the attributes. This, in turn, prevents userspace
> from re-creating quota project on these existing files.
>
> Christian, if this get in some mergeable state, please don't merge it
> yet. Amir suggested these syscalls better to use updated struct fsxattr
> with masking from Pali Rohár patchset, so, let's see how it goes.

Andrey,

To be honest I don't think it would be fair to delay your syscalls more
than needed.

If Pali can follow through and post patches on top of your syscalls for
next merge window that would be great, but otherwise, I think the
minimum requirement is that the syscalls return EINVAL if fsx_pad
is not zero. we can take it from there later.

We can always also increase the size of struct fsxattr, but let's first
use the padding space already available.

Thanks,
Amir.

>
> NAME
>
>         getfsxattrat/setfsxattrat - get/set filesystem inode attributes
>
> SYNOPSIS
>
>         #include <sys/syscall.h>    /* Definition of SYS_* constants */
>         #include <unistd.h>
>
>         long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
>                 struct fsxattr *fsx, size_t size,
>                 unsigned int at_flags);
>         long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
>                 struct fsxattr *fsx, size_t size,
>                 unsigned int at_flags);
>
>         Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
>         use syscall(2) instead.
>
> DESCRIPTION
>
>         The syscalls take fd and path to the child together with struct
>         fsxattr. If path is absolute, fd is not used. If path is empty,
>         inode under fd is used to get/set attributes on.
>
>         This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
>         ioctl with a difference that file don't need to be open as we
>         can reference it with a path instead of fd. By having this we
>         can manipulated filesystem inode attributes not only on regular
>         files but also on special ones. This is not possible with
>         FS_IOC_FSSETXATTR ioctl as with special files we can not call
>         ioctl() directly on the filesystem inode using file descriptor.
>
> RETURN VALUE
>
>         On success, 0 is returned.  On error, -1 is returned, and errno
>         is set to indicate the error.
>
> ERRORS
>
>         EINVAL          Invalid at_flag specified (only
>                         AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
>                         supported).
>
>         EINVAL          Size was smaller than any known version of
>                         struct fsxattr.
>
>         EINVAL          Invalid combination of parameters provided in
>                         fsxattr for this type of file.
>
>         E2BIG           Size of input argument **struct fsxattr** is too
>                         big.
>
>         EBADF           Invalid file descriptor was provided.
>
>         EPERM           No permission to change this file.
>
>         EOPNOTSUPP      Filesystem does not support setting attributes
>                         on this type of inode
>
> HISTORY
>
>         Added in Linux 6.14.
>
> EXAMPLE
>
> Create directory and file "mkdir ./dir && touch ./dir/foo" and then
> execute the following program:
>
>         #include <fcntl.h>
>         #include <errno.h>
>         #include <string.h>
>         #include <linux/fs.h>
>         #include <stdio.h>
>         #include <sys/syscall.h>
>         #include <unistd.h>
>
>         int
>         main(int argc, char **argv) {
>                 int dfd;
>                 int error;
>                 struct fsxattr fsx;
>
>                 dfd = open("./dir", O_RDONLY);
>                 if (dfd == -1) {
>                         printf("can not open ./dir");
>                         return dfd;
>                 }
>
>                 error = syscall(467, dfd, "./foo", &fsx, 0);
>                 if (error) {
>                         printf("can not call 467: %s", strerror(errno));
>                         return error;
>                 }
>
>                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
>
>                 fsx.fsx_xflags |= FS_XFLAG_NODUMP;
>                 error = syscall(468, dfd, "./foo", &fsx, 0);
>                 if (error) {
>                         printf("can not call 468: %s", strerror(errno));
>                         return error;
>                 }
>
>                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
>
>                 return error;
>         }
>
> SEE ALSO
>
>         ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)
>
> ---
> Changes in v4:
> - Use getname_maybe_null() for correct handling of dfd + path semantic
> - Remove restriction for special files on which flags are allowed
> - Utilize copy_struct_from_user() for better future compatibility
> - Add draft man page to cover letter
> - Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
> - Add missing __user to header declaration of syscalls
> - Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org
>
> Changes in v3:
> - Remove unnecessary "dfd is dir" check as it checked in user_path_at()
> - Remove unnecessary "same filesystem" check
> - Use CLASS() instead of directly calling fdget/fdput
> - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org
>
> v1:
> https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/
>
> Previous discussion:
> https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/
>
> ---
> Andrey Albershteyn (3):
>       lsm: introduce new hooks for setting/getting inode fsxattr
>       fs: split fileattr/fsxattr converters into helpers
>       fs: introduce getfsxattrat and setfsxattrat syscalls
>
>  arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
>  arch/arm/tools/syscall.tbl                  |   2 +
>  arch/arm64/tools/syscall_32.tbl             |   2 +
>  arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
>  arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
>  arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
>  arch/s390/kernel/syscalls/syscall.tbl       |   2 +
>  arch/sh/kernel/syscalls/syscall.tbl         |   2 +
>  arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
>  fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
>  fs/ioctl.c                                  |  39 ++++++---
>  include/linux/fileattr.h                    |   2 +
>  include/linux/lsm_hook_defs.h               |   4 +
>  include/linux/security.h                    |  16 ++++
>  include/linux/syscalls.h                    |   6 ++
>  include/uapi/asm-generic/unistd.h           |   8 +-
>  include/uapi/linux/fs.h                     |   3 +
>  security/security.c                         |  32 +++++++
>  25 files changed, 259 insertions(+), 13 deletions(-)
> ---
> base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
> change-id: 20250114-xattrat-syscall-6a1136d2db59
>
> Best regards,
> --
> Andrey Albershteyn <aalbersh@kernel.org>
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
@ 2025-03-23  8:56   ` Amir Goldstein
  2025-03-27  9:33     ` Andrey Albershteyn
  2025-03-27 12:31   ` Jan Kara
  2025-04-22 14:59   ` Christian Brauner
  2 siblings, 1 reply; 28+ messages in thread
From: Amir Goldstein @ 2025-03-23  8:56 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, linux-xfs

On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
>
> From: Andrey Albershteyn <aalbersh@redhat.com>
>
> Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> extended attributes/flags. The syscalls take parent directory fd and
> path to the child together with struct fsxattr.
>
> This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> that file don't need to be open as we can reference it with a path
> instead of fd. By having this we can manipulated inode extended
> attributes not only on regular files but also on special ones. This
> is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> we can not call ioctl() directly on the filesystem inode using fd.
>
> This patch adds two new syscalls which allows userspace to get/set
> extended inode attributes on special files by using parent directory
> and a path - *at() like syscall.
>
> CC: linux-api@vger.kernel.org
> CC: linux-fsdevel@vger.kernel.org
> CC: linux-xfs@vger.kernel.org
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
...
> +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> +               struct fsxattr __user *, ufsx, size_t, usize,
> +               unsigned int, at_flags)
> +{
> +       struct fileattr fa;
> +       struct path filepath;
> +       int error;
> +       unsigned int lookup_flags = 0;
> +       struct filename *name;
> +       struct mnt_idmap *idmap;.

> +       struct dentry *dentry;
> +       struct vfsmount *mnt;
> +       struct fsxattr fsx = {};
> +
> +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> +
> +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> +               return -EINVAL;
> +
> +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> +               lookup_flags |= LOOKUP_FOLLOW;
> +
> +       if (at_flags & AT_EMPTY_PATH)
> +               lookup_flags |= LOOKUP_EMPTY;
> +
> +       if (usize > PAGE_SIZE)
> +               return -E2BIG;
> +
> +       if (usize < FSXATTR_SIZE_VER0)
> +               return -EINVAL;
> +
> +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> +       if (error)
> +               return error;
> +
> +       fsxattr_to_fileattr(&fsx, &fa);
> +
> +       name = getname_maybe_null(filename, at_flags);
> +       if (!name) {
> +               CLASS(fd, f)(dfd);
> +
> +               if (fd_empty(f))
> +                       return -EBADF;
> +
> +               idmap = file_mnt_idmap(fd_file(f));
> +               dentry = file_dentry(fd_file(f));
> +               mnt = fd_file(f)->f_path.mnt;
> +       } else {
> +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> +                                       NULL);
> +               if (error)
> +                       return error;
> +
> +               idmap = mnt_idmap(filepath.mnt);
> +               dentry = filepath.dentry;
> +               mnt = filepath.mnt;
> +       }
> +
> +       error = mnt_want_write(mnt);
> +       if (!error) {
> +               error = vfs_fileattr_set(idmap, dentry, &fa);
> +               if (error == -ENOIOCTLCMD)
> +                       error = -EOPNOTSUPP;

This is awkward.
vfs_fileattr_set() should return -EOPNOTSUPP.
ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
ioctl returns -EOPNOTSUPP.

I don't think it is necessarily a bad idea to start returning
 -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
because that really reflects the fact that the ioctl is now implemented
in vfs and not in the specific fs.

and I think it would not be a bad idea at all to make that change
together with the merge of the syscalls as a sort of hint to userspace
that uses the ioctl, that the sycalls API exists.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-23  8:45 ` [PATCH v4 0/3] " Amir Goldstein
@ 2025-03-23 10:32   ` Pali Rohár
  2025-03-27 11:47     ` Amir Goldstein
  0 siblings, 1 reply; 28+ messages in thread
From: Pali Rohár @ 2025-03-23 10:32 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> >
> > This patchset introduced two new syscalls getfsxattrat() and
> > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > except they use *at() semantics. Therefore, there's no need to open the
> > file to get an fd.
> >
> > These syscalls allow userspace to set filesystem inode attributes on
> > special files. One of the usage examples is XFS quota projects.
> >
> > XFS has project quotas which could be attached to a directory. All
> > new inodes in these directories inherit project ID set on parent
> > directory.
> >
> > The project is created from userspace by opening and calling
> > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > with empty project ID. Those inodes then are not shown in the quota
> > accounting but still exist in the directory. This is not critical but in
> > the case when special files are created in the directory with already
> > existing project quota, these new inodes inherit extended attributes.
> > This creates a mix of special files with and without attributes.
> > Moreover, special files with attributes don't have a possibility to
> > become clear or change the attributes. This, in turn, prevents userspace
> > from re-creating quota project on these existing files.
> >
> > Christian, if this get in some mergeable state, please don't merge it
> > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > with masking from Pali Rohár patchset, so, let's see how it goes.
> 
> Andrey,
> 
> To be honest I don't think it would be fair to delay your syscalls more
> than needed.

I agree.

> If Pali can follow through and post patches on top of your syscalls for
> next merge window that would be great, but otherwise, I think the
> minimum requirement is that the syscalls return EINVAL if fsx_pad
> is not zero. we can take it from there later.

IMHO SYS_getfsxattrat is fine in this form.

For SYS_setfsxattrat I think there are needed some modifications
otherwise we would have problem again with backward compatibility as
is with ioctl if the syscall wants to be extended in future.

I would suggest for following modifications for SYS_setfsxattrat:

- return EINVAL if fsx_xflags contains some reserved or unsupported flag

- add some flag to completely ignore fsx_extsize, fsx_projid, and
  fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
  change fsx_xflags, and so could be used without the preceding
  SYS_getfsxattrat call.

What do you think about it?

Use cases for future without breaking backward compatibility:
- atomically / race-free do set or clear just one flag in fsx_xflags
  (so avoid getfsxattrat - modify buffer - setfsxattrat roundtrip)
- use fsx_pad[] for some new purposes 

> We can always also increase the size of struct fsxattr, but let's first
> use the padding space already available.
> 
> Thanks,
> Amir.
> 
> >
> > NAME
> >
> >         getfsxattrat/setfsxattrat - get/set filesystem inode attributes
> >
> > SYNOPSIS
> >
> >         #include <sys/syscall.h>    /* Definition of SYS_* constants */
> >         #include <unistd.h>
> >
> >         long syscall(SYS_getfsxattrat, int dirfd, const char *pathname,
> >                 struct fsxattr *fsx, size_t size,
> >                 unsigned int at_flags);
> >         long syscall(SYS_setfsxattrat, int dirfd, const char *pathname,
> >                 struct fsxattr *fsx, size_t size,
> >                 unsigned int at_flags);
> >
> >         Note: glibc doesn't provide for getfsxattrat()/setfsxattrat(),
> >         use syscall(2) instead.
> >
> > DESCRIPTION
> >
> >         The syscalls take fd and path to the child together with struct
> >         fsxattr. If path is absolute, fd is not used. If path is empty,
> >         inode under fd is used to get/set attributes on.
> >
> >         This is an alternative to FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR
> >         ioctl with a difference that file don't need to be open as we
> >         can reference it with a path instead of fd. By having this we
> >         can manipulated filesystem inode attributes not only on regular
> >         files but also on special ones. This is not possible with
> >         FS_IOC_FSSETXATTR ioctl as with special files we can not call
> >         ioctl() directly on the filesystem inode using file descriptor.
> >
> > RETURN VALUE
> >
> >         On success, 0 is returned.  On error, -1 is returned, and errno
> >         is set to indicate the error.
> >
> > ERRORS
> >
> >         EINVAL          Invalid at_flag specified (only
> >                         AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is
> >                         supported).
> >
> >         EINVAL          Size was smaller than any known version of
> >                         struct fsxattr.
> >
> >         EINVAL          Invalid combination of parameters provided in
> >                         fsxattr for this type of file.
> >
> >         E2BIG           Size of input argument **struct fsxattr** is too
> >                         big.
> >
> >         EBADF           Invalid file descriptor was provided.
> >
> >         EPERM           No permission to change this file.
> >
> >         EOPNOTSUPP      Filesystem does not support setting attributes
> >                         on this type of inode
> >
> > HISTORY
> >
> >         Added in Linux 6.14.
> >
> > EXAMPLE
> >
> > Create directory and file "mkdir ./dir && touch ./dir/foo" and then
> > execute the following program:
> >
> >         #include <fcntl.h>
> >         #include <errno.h>
> >         #include <string.h>
> >         #include <linux/fs.h>
> >         #include <stdio.h>
> >         #include <sys/syscall.h>
> >         #include <unistd.h>
> >
> >         int
> >         main(int argc, char **argv) {
> >                 int dfd;
> >                 int error;
> >                 struct fsxattr fsx;
> >
> >                 dfd = open("./dir", O_RDONLY);
> >                 if (dfd == -1) {
> >                         printf("can not open ./dir");
> >                         return dfd;
> >                 }
> >
> >                 error = syscall(467, dfd, "./foo", &fsx, 0);
> >                 if (error) {
> >                         printf("can not call 467: %s", strerror(errno));
> >                         return error;
> >                 }
> >
> >                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
> >
> >                 fsx.fsx_xflags |= FS_XFLAG_NODUMP;
> >                 error = syscall(468, dfd, "./foo", &fsx, 0);
> >                 if (error) {
> >                         printf("can not call 468: %s", strerror(errno));
> >                         return error;
> >                 }
> >
> >                 printf("dir/foo flags: %d\n", fsx.fsx_xflags);
> >
> >                 return error;
> >         }
> >
> > SEE ALSO
> >
> >         ioctl(2), ioctl_iflags(2), ioctl_xfs_fsgetxattr(2)
> >
> > ---
> > Changes in v4:
> > - Use getname_maybe_null() for correct handling of dfd + path semantic
> > - Remove restriction for special files on which flags are allowed
> > - Utilize copy_struct_from_user() for better future compatibility
> > - Add draft man page to cover letter
> > - Convert -ENOIOCTLCMD to -EOPNOSUPP as more appropriate for syscall
> > - Add missing __user to header declaration of syscalls
> > - Link to v3: https://lore.kernel.org/r/20250211-xattrat-syscall-v3-1-a07d15f898b2@kernel.org
> >
> > Changes in v3:
> > - Remove unnecessary "dfd is dir" check as it checked in user_path_at()
> > - Remove unnecessary "same filesystem" check
> > - Use CLASS() instead of directly calling fdget/fdput
> > - Link to v2: https://lore.kernel.org/r/20250122-xattrat-syscall-v2-1-5b360d4fbcb2@kernel.org
> >
> > v1:
> > https://lore.kernel.org/linuxppc-dev/20250109174540.893098-1-aalbersh@kernel.org/
> >
> > Previous discussion:
> > https://lore.kernel.org/linux-xfs/20240520164624.665269-2-aalbersh@redhat.com/
> >
> > ---
> > Andrey Albershteyn (3):
> >       lsm: introduce new hooks for setting/getting inode fsxattr
> >       fs: split fileattr/fsxattr converters into helpers
> >       fs: introduce getfsxattrat and setfsxattrat syscalls
> >
> >  arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
> >  arch/arm/tools/syscall.tbl                  |   2 +
> >  arch/arm64/tools/syscall_32.tbl             |   2 +
> >  arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
> >  arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
> >  arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
> >  arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
> >  arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
> >  arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
> >  arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
> >  arch/s390/kernel/syscalls/syscall.tbl       |   2 +
> >  arch/sh/kernel/syscalls/syscall.tbl         |   2 +
> >  arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
> >  arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
> >  arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
> >  arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
> >  fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
> >  fs/ioctl.c                                  |  39 ++++++---
> >  include/linux/fileattr.h                    |   2 +
> >  include/linux/lsm_hook_defs.h               |   4 +
> >  include/linux/security.h                    |  16 ++++
> >  include/linux/syscalls.h                    |   6 ++
> >  include/uapi/asm-generic/unistd.h           |   8 +-
> >  include/uapi/linux/fs.h                     |   3 +
> >  security/security.c                         |  32 +++++++
> >  25 files changed, 259 insertions(+), 13 deletions(-)
> > ---
> > base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
> > change-id: 20250114-xattrat-syscall-6a1136d2db59
> >
> > Best regards,
> > --
> > Andrey Albershteyn <aalbersh@kernel.org>
> >
> >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr
  2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
  2025-03-21 21:32   ` Paul Moore
@ 2025-03-24 19:21   ` Mickaël Salaün
  1 sibling, 0 replies; 28+ messages in thread
From: Mickaël Salaün @ 2025-03-24 19:21 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Günther Noack, Arnd Bergmann,
	Pali Rohár, Paul Moore, James Morris, Serge E. Hallyn,
	linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, selinux, Andrey Albershteyn

On Fri, Mar 21, 2025 at 08:48:40PM +0100, Andrey Albershteyn wrote:
> Introduce new hooks for setting and getting filesystem extended
> attributes on inode (FS_IOC_FSGETXATTR).
> 
> Cc: selinux@vger.kernel.org
> Cc: Paul Moore <paul@paul-moore.com>
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> ---
>  fs/ioctl.c                    |  7 ++++++-
>  include/linux/lsm_hook_defs.h |  4 ++++
>  include/linux/security.h      | 16 ++++++++++++++++
>  security/security.c           | 32 ++++++++++++++++++++++++++++++++
>  4 files changed, 58 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 638a36be31c14afc66a7fd6eb237d9545e8ad997..4434c97bc5dff5a3e8635e28745cd99404ff353e 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -525,10 +525,15 @@ EXPORT_SYMBOL(fileattr_fill_flags);
>  int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
>  {
>  	struct inode *inode = d_inode(dentry);
> +	int error;
>  
>  	if (!inode->i_op->fileattr_get)
>  		return -ENOIOCTLCMD;
>  
> +	error = security_inode_getfsxattr(inode, fa);

It would help for both of these hooks to pass the dentry instead of the
inode.

> +	if (error)
> +		return error;
> +
>  	return inode->i_op->fileattr_get(dentry, fa);
>  }
>  EXPORT_SYMBOL(vfs_fileattr_get);
> @@ -692,7 +697,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
>  			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
>  		}
>  		err = fileattr_set_prepare(inode, &old_ma, fa);
> -		if (!err)
> +		if (!err && !security_inode_setfsxattr(inode, fa))
>  			err = inode->i_op->fileattr_set(idmap, dentry, fa);
>  	}
>  	inode_unlock(inode);


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode  fsxattr
  2025-03-21 21:32   ` Paul Moore
@ 2025-03-24 19:27     ` Mickaël Salaün
  2025-03-27  9:19       ` Andrey Albershteyn
  0 siblings, 1 reply; 28+ messages in thread
From: Mickaël Salaün @ 2025-03-24 19:27 UTC (permalink / raw)
  To: Paul Moore
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Günther Noack, Arnd Bergmann,
	Pali Rohár, James Morris, Serge E. Hallyn, linux-alpha,
	linux-kernel, linux-arm-kernel, linux-m68k, linux-mips,
	linux-parisc, linuxppc-dev, linux-s390, linux-sh, sparclinux,
	linux-fsdevel, linux-security-module, linux-api, linux-arch,
	selinux, Andrey Albershteyn

On Fri, Mar 21, 2025 at 05:32:25PM -0400, Paul Moore wrote:
> On Mar 21, 2025 Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > 
> > Introduce new hooks for setting and getting filesystem extended
> > attributes on inode (FS_IOC_FSGETXATTR).
> > 
> > Cc: selinux@vger.kernel.org
> > Cc: Paul Moore <paul@paul-moore.com>
> > 
> > Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> > ---
> >  fs/ioctl.c                    |  7 ++++++-
> >  include/linux/lsm_hook_defs.h |  4 ++++
> >  include/linux/security.h      | 16 ++++++++++++++++
> >  security/security.c           | 32 ++++++++++++++++++++++++++++++++
> >  4 files changed, 58 insertions(+), 1 deletion(-)
> 
> Thanks Andrey, one small change below, but otherwise this looks pretty
> good.  If you feel like trying to work up the SELinux implementation but
> need some assitance please let me know, I'll be happy to help :)
> 
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 638a36be31c14afc66a7fd6eb237d9545e8ad997..4434c97bc5dff5a3e8635e28745cd99404ff353e 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -525,10 +525,15 @@ EXPORT_SYMBOL(fileattr_fill_flags);
> >  int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> >  {
> >  	struct inode *inode = d_inode(dentry);
> > +	int error;
> >  
> >  	if (!inode->i_op->fileattr_get)
> >  		return -ENOIOCTLCMD;
> >  
> > +	error = security_inode_getfsxattr(inode, fa);
> > +	if (error)
> > +		return error;
> > +
> >  	return inode->i_op->fileattr_get(dentry, fa);
> >  }
> >  EXPORT_SYMBOL(vfs_fileattr_get);
> > @@ -692,7 +697,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> >  			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
> >  		}
> >  		err = fileattr_set_prepare(inode, &old_ma, fa);
> > -		if (!err)
> > +		if (!err && !security_inode_setfsxattr(inode, fa))
> >  			err = inode->i_op->fileattr_set(idmap, dentry, fa);
> >  	}
> >  	inode_unlock(inode);
> 
> I don't believe we want to hide or otherwise drop the LSM return code as
> that could lead to odd behavior, e.g. returning 0/success despite not
> having executed the fileattr_set operation.

Yes, this should look something like this:

 		err = fileattr_set_prepare(inode, &old_ma, fa);
 		if (err)
 			goto out;
 		err = security_inode_setfsxattr(dentry, fa);
 		if (err)
 			goto out;
 		err = inode->i_op->fileattr_set(idmap, dentry, fa);
 		if (err)
 			goto out;

> 
> --
> paul-moore.com
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode  fsxattr
  2025-03-24 19:27     ` Mickaël Salaün
@ 2025-03-27  9:19       ` Andrey Albershteyn
  0 siblings, 0 replies; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-27  9:19 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Paul Moore, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Günther Noack, Arnd Bergmann,
	Pali Rohár, James Morris, Serge E. Hallyn, linux-alpha,
	linux-kernel, linux-arm-kernel, linux-m68k, linux-mips,
	linux-parisc, linuxppc-dev, linux-s390, linux-sh, sparclinux,
	linux-fsdevel, linux-security-module, linux-api, linux-arch,
	selinux, Andrey Albershteyn

On 2025-03-24 20:27:02, Mickaël Salaün wrote:
> On Fri, Mar 21, 2025 at 05:32:25PM -0400, Paul Moore wrote:
> > On Mar 21, 2025 Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > 
> > > Introduce new hooks for setting and getting filesystem extended
> > > attributes on inode (FS_IOC_FSGETXATTR).
> > > 
> > > Cc: selinux@vger.kernel.org
> > > Cc: Paul Moore <paul@paul-moore.com>
> > > 
> > > Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> > > ---
> > >  fs/ioctl.c                    |  7 ++++++-
> > >  include/linux/lsm_hook_defs.h |  4 ++++
> > >  include/linux/security.h      | 16 ++++++++++++++++
> > >  security/security.c           | 32 ++++++++++++++++++++++++++++++++
> > >  4 files changed, 58 insertions(+), 1 deletion(-)
> > 
> > Thanks Andrey, one small change below, but otherwise this looks pretty
> > good.  If you feel like trying to work up the SELinux implementation but
> > need some assitance please let me know, I'll be happy to help :)
> > 
> > > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > > index 638a36be31c14afc66a7fd6eb237d9545e8ad997..4434c97bc5dff5a3e8635e28745cd99404ff353e 100644
> > > --- a/fs/ioctl.c
> > > +++ b/fs/ioctl.c
> > > @@ -525,10 +525,15 @@ EXPORT_SYMBOL(fileattr_fill_flags);
> > >  int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> > >  {
> > >  	struct inode *inode = d_inode(dentry);
> > > +	int error;
> > >  
> > >  	if (!inode->i_op->fileattr_get)
> > >  		return -ENOIOCTLCMD;
> > >  
> > > +	error = security_inode_getfsxattr(inode, fa);
> > > +	if (error)
> > > +		return error;
> > > +
> > >  	return inode->i_op->fileattr_get(dentry, fa);
> > >  }
> > >  EXPORT_SYMBOL(vfs_fileattr_get);
> > > @@ -692,7 +697,7 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> > >  			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
> > >  		}
> > >  		err = fileattr_set_prepare(inode, &old_ma, fa);
> > > -		if (!err)
> > > +		if (!err && !security_inode_setfsxattr(inode, fa))
> > >  			err = inode->i_op->fileattr_set(idmap, dentry, fa);
> > >  	}
> > >  	inode_unlock(inode);
> > 
> > I don't believe we want to hide or otherwise drop the LSM return code as
> > that could lead to odd behavior, e.g. returning 0/success despite not
> > having executed the fileattr_set operation.
> 
> Yes, this should look something like this:
> 
>  		err = fileattr_set_prepare(inode, &old_ma, fa);
>  		if (err)
>  			goto out;
>  		err = security_inode_setfsxattr(dentry, fa);
>  		if (err)
>  			goto out;
>  		err = inode->i_op->fileattr_set(idmap, dentry, fa);
>  		if (err)
>  			goto out;
> 
> > 
> > --
> > paul-moore.com
> > 
> 

Sure, thanks for noticing, will switch to dentries and handle error
code it in v5

-- 
- Andrey


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-23  8:56   ` Amir Goldstein
@ 2025-03-27  9:33     ` Andrey Albershteyn
  2025-03-27 11:39       ` Amir Goldstein
  0 siblings, 1 reply; 28+ messages in thread
From: Andrey Albershteyn @ 2025-03-27  9:33 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, linux-xfs

On 2025-03-23 09:56:25, Amir Goldstein wrote:
> On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> >
> > From: Andrey Albershteyn <aalbersh@redhat.com>
> >
> > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > extended attributes/flags. The syscalls take parent directory fd and
> > path to the child together with struct fsxattr.
> >
> > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > that file don't need to be open as we can reference it with a path
> > instead of fd. By having this we can manipulated inode extended
> > attributes not only on regular files but also on special ones. This
> > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > we can not call ioctl() directly on the filesystem inode using fd.
> >
> > This patch adds two new syscalls which allows userspace to get/set
> > extended inode attributes on special files by using parent directory
> > and a path - *at() like syscall.
> >
> > CC: linux-api@vger.kernel.org
> > CC: linux-fsdevel@vger.kernel.org
> > CC: linux-xfs@vger.kernel.org
> > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> ...
> > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > +               struct fsxattr __user *, ufsx, size_t, usize,
> > +               unsigned int, at_flags)
> > +{
> > +       struct fileattr fa;
> > +       struct path filepath;
> > +       int error;
> > +       unsigned int lookup_flags = 0;
> > +       struct filename *name;
> > +       struct mnt_idmap *idmap;.
> 
> > +       struct dentry *dentry;
> > +       struct vfsmount *mnt;
> > +       struct fsxattr fsx = {};
> > +
> > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > +
> > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > +               return -EINVAL;
> > +
> > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > +               lookup_flags |= LOOKUP_FOLLOW;
> > +
> > +       if (at_flags & AT_EMPTY_PATH)
> > +               lookup_flags |= LOOKUP_EMPTY;
> > +
> > +       if (usize > PAGE_SIZE)
> > +               return -E2BIG;
> > +
> > +       if (usize < FSXATTR_SIZE_VER0)
> > +               return -EINVAL;
> > +
> > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > +       if (error)
> > +               return error;
> > +
> > +       fsxattr_to_fileattr(&fsx, &fa);
> > +
> > +       name = getname_maybe_null(filename, at_flags);
> > +       if (!name) {
> > +               CLASS(fd, f)(dfd);
> > +
> > +               if (fd_empty(f))
> > +                       return -EBADF;
> > +
> > +               idmap = file_mnt_idmap(fd_file(f));
> > +               dentry = file_dentry(fd_file(f));
> > +               mnt = fd_file(f)->f_path.mnt;
> > +       } else {
> > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > +                                       NULL);
> > +               if (error)
> > +                       return error;
> > +
> > +               idmap = mnt_idmap(filepath.mnt);
> > +               dentry = filepath.dentry;
> > +               mnt = filepath.mnt;
> > +       }
> > +
> > +       error = mnt_want_write(mnt);
> > +       if (!error) {
> > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > +               if (error == -ENOIOCTLCMD)
> > +                       error = -EOPNOTSUPP;
> 
> This is awkward.
> vfs_fileattr_set() should return -EOPNOTSUPP.
> ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> ioctl returns -EOPNOTSUPP.
> 
> I don't think it is necessarily a bad idea to start returning
>  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> because that really reflects the fact that the ioctl is now implemented
> in vfs and not in the specific fs.
> 
> and I think it would not be a bad idea at all to make that change
> together with the merge of the syscalls as a sort of hint to userspace
> that uses the ioctl, that the sycalls API exists.
> 
> Thanks,
> Amir.
> 

Hmm, not sure what you're suggesting here. I see it as:
- get/setfsxattrat should return EOPNOTSUPP as it make more sense
  than ENOIOCTLCMD
- ioctl_setflags returns ENOIOCTLCMD which also expected

Don't really see a reason to change what vfs_fileattr_set() returns
and then copying this if() to other places or start returning
EOPNOTSUPP.

-- 
- Andrey


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27  9:33     ` Andrey Albershteyn
@ 2025-03-27 11:39       ` Amir Goldstein
  2025-04-22 14:31         ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Amir Goldstein @ 2025-03-27 11:39 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, linux-xfs

On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@redhat.com> wrote:
>
> On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > >
> > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > >
> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > extended attributes/flags. The syscalls take parent directory fd and
> > > path to the child together with struct fsxattr.
> > >
> > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > that file don't need to be open as we can reference it with a path
> > > instead of fd. By having this we can manipulated inode extended
> > > attributes not only on regular files but also on special ones. This
> > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > we can not call ioctl() directly on the filesystem inode using fd.
> > >
> > > This patch adds two new syscalls which allows userspace to get/set
> > > extended inode attributes on special files by using parent directory
> > > and a path - *at() like syscall.
> > >
> > > CC: linux-api@vger.kernel.org
> > > CC: linux-fsdevel@vger.kernel.org
> > > CC: linux-xfs@vger.kernel.org
> > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > ...
> > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > +               struct fsxattr __user *, ufsx, size_t, usize,
> > > +               unsigned int, at_flags)
> > > +{
> > > +       struct fileattr fa;
> > > +       struct path filepath;
> > > +       int error;
> > > +       unsigned int lookup_flags = 0;
> > > +       struct filename *name;
> > > +       struct mnt_idmap *idmap;.
> >
> > > +       struct dentry *dentry;
> > > +       struct vfsmount *mnt;
> > > +       struct fsxattr fsx = {};
> > > +
> > > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > +
> > > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > +               return -EINVAL;
> > > +
> > > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > +               lookup_flags |= LOOKUP_FOLLOW;
> > > +
> > > +       if (at_flags & AT_EMPTY_PATH)
> > > +               lookup_flags |= LOOKUP_EMPTY;
> > > +
> > > +       if (usize > PAGE_SIZE)
> > > +               return -E2BIG;
> > > +
> > > +       if (usize < FSXATTR_SIZE_VER0)
> > > +               return -EINVAL;
> > > +
> > > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > +       if (error)
> > > +               return error;
> > > +
> > > +       fsxattr_to_fileattr(&fsx, &fa);
> > > +
> > > +       name = getname_maybe_null(filename, at_flags);
> > > +       if (!name) {
> > > +               CLASS(fd, f)(dfd);
> > > +
> > > +               if (fd_empty(f))
> > > +                       return -EBADF;
> > > +
> > > +               idmap = file_mnt_idmap(fd_file(f));
> > > +               dentry = file_dentry(fd_file(f));
> > > +               mnt = fd_file(f)->f_path.mnt;
> > > +       } else {
> > > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > +                                       NULL);
> > > +               if (error)
> > > +                       return error;
> > > +
> > > +               idmap = mnt_idmap(filepath.mnt);
> > > +               dentry = filepath.dentry;
> > > +               mnt = filepath.mnt;
> > > +       }
> > > +
> > > +       error = mnt_want_write(mnt);
> > > +       if (!error) {
> > > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > > +               if (error == -ENOIOCTLCMD)
> > > +                       error = -EOPNOTSUPP;
> >
> > This is awkward.
> > vfs_fileattr_set() should return -EOPNOTSUPP.
> > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > ioctl returns -EOPNOTSUPP.
> >
> > I don't think it is necessarily a bad idea to start returning
> >  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > because that really reflects the fact that the ioctl is now implemented
> > in vfs and not in the specific fs.
> >
> > and I think it would not be a bad idea at all to make that change
> > together with the merge of the syscalls as a sort of hint to userspace
> > that uses the ioctl, that the sycalls API exists.
> >
> > Thanks,
> > Amir.
> >
>
> Hmm, not sure what you're suggesting here. I see it as:
> - get/setfsxattrat should return EOPNOTSUPP as it make more sense
>   than ENOIOCTLCMD
> - ioctl_setflags returns ENOIOCTLCMD which also expected
>
> Don't really see a reason to change what vfs_fileattr_set() returns
> and then copying this if() to other places or start returning
> EOPNOTSUPP.

ENOIOCTLCMD conceptually means that the ioctl command is unknown
This is not the case since ->fileattr_[gs]et() became a vfs API
the ioctl command is handled by vfs and it is known, but individual
filesystems may not support it, so conceptually, returning EOPNOTSUPP
from ioctl() is more correct these days, exactly as is done with the ioctls
FS_IOC_FIEMAP and FIFREEZE which were also historically per fs
ioctls and made into a vfs API.

The fact that bcachefs does not implement ->fileattr_[gs]et() and does
implement FS_IOC_FS[GS]ETXATTR is an oversight IMO, since it
was probably merged after the vfs conversion patch.

This mistake means that bcachefs fileattr cannot be copied up by
ovl_copy_fileattr() which uses the vfs API and NOT the ioctl.

However, if you would made the internal vfs API change that I suggested,
it will have broken ovl_real_fileattr_get() and ovl_copy_fileattr(),
so leave it for now - if I care enough I can do it later together with
fixing the overlayfs and fuse code.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-23 10:32   ` Pali Rohár
@ 2025-03-27 11:47     ` Amir Goldstein
  2025-03-27 19:26       ` Pali Rohár
  0 siblings, 1 reply; 28+ messages in thread
From: Amir Goldstein @ 2025-03-27 11:47 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
>
> On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > >
> > > This patchset introduced two new syscalls getfsxattrat() and
> > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > except they use *at() semantics. Therefore, there's no need to open the
> > > file to get an fd.
> > >
> > > These syscalls allow userspace to set filesystem inode attributes on
> > > special files. One of the usage examples is XFS quota projects.
> > >
> > > XFS has project quotas which could be attached to a directory. All
> > > new inodes in these directories inherit project ID set on parent
> > > directory.
> > >
> > > The project is created from userspace by opening and calling
> > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > with empty project ID. Those inodes then are not shown in the quota
> > > accounting but still exist in the directory. This is not critical but in
> > > the case when special files are created in the directory with already
> > > existing project quota, these new inodes inherit extended attributes.
> > > This creates a mix of special files with and without attributes.
> > > Moreover, special files with attributes don't have a possibility to
> > > become clear or change the attributes. This, in turn, prevents userspace
> > > from re-creating quota project on these existing files.
> > >
> > > Christian, if this get in some mergeable state, please don't merge it
> > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > with masking from Pali Rohár patchset, so, let's see how it goes.
> >
> > Andrey,
> >
> > To be honest I don't think it would be fair to delay your syscalls more
> > than needed.
>
> I agree.
>
> > If Pali can follow through and post patches on top of your syscalls for
> > next merge window that would be great, but otherwise, I think the
> > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > is not zero. we can take it from there later.
>
> IMHO SYS_getfsxattrat is fine in this form.
>
> For SYS_setfsxattrat I think there are needed some modifications
> otherwise we would have problem again with backward compatibility as
> is with ioctl if the syscall wants to be extended in future.
>
> I would suggest for following modifications for SYS_setfsxattrat:
>
> - return EINVAL if fsx_xflags contains some reserved or unsupported flag
>
> - add some flag to completely ignore fsx_extsize, fsx_projid, and
>   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
>   change fsx_xflags, and so could be used without the preceding
>   SYS_getfsxattrat call.
>
> What do you think about it?

I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.

You can use this later to extend for the semantics of flags/fields mask
and we can have a long discussion later on what this semantics should be.

Right?

Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
  2025-03-23  8:56   ` Amir Goldstein
@ 2025-03-27 12:31   ` Jan Kara
  2025-04-22 14:59   ` Christian Brauner
  2 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2025-03-27 12:31 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, linux-xfs

On Fri 21-03-25 20:48:42, Andrey Albershteyn wrote:
> From: Andrey Albershteyn <aalbersh@redhat.com>
> 
> Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> extended attributes/flags. The syscalls take parent directory fd and
> path to the child together with struct fsxattr.
> 
> This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> that file don't need to be open as we can reference it with a path
> instead of fd. By having this we can manipulated inode extended
> attributes not only on regular files but also on special ones. This
> is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> we can not call ioctl() directly on the filesystem inode using fd.
> 
> This patch adds two new syscalls which allows userspace to get/set
> extended inode attributes on special files by using parent directory
> and a path - *at() like syscall.
> 
> CC: linux-api@vger.kernel.org
> CC: linux-fsdevel@vger.kernel.org
> CC: linux-xfs@vger.kernel.org
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> Acked-by: Arnd Bergmann <arnd@arndb.de>

Looks good. Just two nits below:

> +SYSCALL_DEFINE5(getfsxattrat, int, dfd, const char __user *, filename,
> +		struct fsxattr __user *, ufsx, size_t, usize,
> +		unsigned int, at_flags)
> +{
> +	struct fileattr fa = {};
> +	struct path filepath;
> +	int error;
> +	unsigned int lookup_flags = 0;
> +	struct filename *name;
> +	struct fsxattr fsx = {};
> +
> +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> +
> +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> +		return -EINVAL;
> +
> +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> +		lookup_flags |= LOOKUP_FOLLOW;
> +
> +	if (at_flags & AT_EMPTY_PATH)
> +		lookup_flags |= LOOKUP_EMPTY;

Strictly speaking setting LOOKUP_EMPTY does not have any effect because
empty names are already handled by getname_maybe_null(). But it does not
hurt either so I don't really care...

> +
> +	if (usize > PAGE_SIZE)
> +		return -E2BIG;
> +
> +	if (usize < FSXATTR_SIZE_VER0)
> +		return -EINVAL;
> +
> +	name = getname_maybe_null(filename, at_flags);
> +	if (!name) {
> +		CLASS(fd, f)(dfd);
> +
> +		if (fd_empty(f))
> +			return -EBADF;
> +		error = vfs_fileattr_get(file_dentry(fd_file(f)), &fa);
> +	} else {
> +		error = filename_lookup(dfd, name, lookup_flags, &filepath,
> +					NULL);
> +		if (error)
> +			goto out;
> +		error = vfs_fileattr_get(filepath.dentry, &fa);
> +		path_put(&filepath);
> +	}
> +	if (error == -ENOIOCTLCMD)
> +		error = -EOPNOTSUPP;
> +	if (!error) {
> +		fileattr_to_fsxattr(&fa, &fsx);
> +		error = copy_struct_to_user(ufsx, usize, &fsx,
> +					    sizeof(struct fsxattr), NULL);
> +	}
> +out:
> +	putname(name);
> +	return error;
> +}
> +
> +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> +		struct fsxattr __user *, ufsx, size_t, usize,
> +		unsigned int, at_flags)
> +{
> +	struct fileattr fa;
> +	struct path filepath;
> +	int error;
> +	unsigned int lookup_flags = 0;
> +	struct filename *name;
> +	struct mnt_idmap *idmap;
> +	struct dentry *dentry;
> +	struct vfsmount *mnt;
> +	struct fsxattr fsx = {};
> +
> +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> +
> +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> +		return -EINVAL;
> +
> +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> +		lookup_flags |= LOOKUP_FOLLOW;
> +
> +	if (at_flags & AT_EMPTY_PATH)
> +		lookup_flags |= LOOKUP_EMPTY;

Same comment regarding LOOKUP_EMPTY here.

> +
> +	if (usize > PAGE_SIZE)
> +		return -E2BIG;
> +
> +	if (usize < FSXATTR_SIZE_VER0)
> +		return -EINVAL;
> +
> +	error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> +	if (error)
> +		return error;
> +
> +	fsxattr_to_fileattr(&fsx, &fa);
> +
> +	name = getname_maybe_null(filename, at_flags);
> +	if (!name) {
> +		CLASS(fd, f)(dfd);
> +
> +		if (fd_empty(f))
> +			return -EBADF;
> +
> +		idmap = file_mnt_idmap(fd_file(f));
> +		dentry = file_dentry(fd_file(f));
> +		mnt = fd_file(f)->f_path.mnt;
> +	} else {
> +		error = filename_lookup(dfd, name, lookup_flags, &filepath,
> +					NULL);
> +		if (error)
> +			return error;
> +
> +		idmap = mnt_idmap(filepath.mnt);
> +		dentry = filepath.dentry;
> +		mnt = filepath.mnt;
> +	}
> +
> +	error = mnt_want_write(mnt);
> +	if (!error) {
> +		error = vfs_fileattr_set(idmap, dentry, &fa);
> +		if (error == -ENOIOCTLCMD)
> +			error = -EOPNOTSUPP;
> +		mnt_drop_write(mnt);
> +	}
> +
> +	path_put(&filepath);

filepath will not be initialized here in case of name == NULL. 

> +	return error;
> +}

With this fixed feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers
  2025-03-21 19:48 ` [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers Andrey Albershteyn
@ 2025-03-27 12:32   ` Jan Kara
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Kara @ 2025-03-27 12:32 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Pali Rohár, Paul Moore,
	James Morris, Serge E. Hallyn, linux-alpha, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-mips, linux-parisc,
	linuxppc-dev, linux-s390, linux-sh, sparclinux, linux-fsdevel,
	linux-security-module, linux-api, linux-arch, Andrey Albershteyn

On Fri 21-03-25 20:48:41, Andrey Albershteyn wrote:
> This will be helpful for get/setfsxattrat syscalls to convert
> between fileattr and fsxattr.
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ioctl.c               | 32 +++++++++++++++++++++-----------
>  include/linux/fileattr.h |  2 ++
>  2 files changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 4434c97bc5dff5a3e8635e28745cd99404ff353e..840283d8c406623d8d26790f89b62ebcbd39e2de 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -538,6 +538,16 @@ int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa)
>  }
>  EXPORT_SYMBOL(vfs_fileattr_get);
>  
> +void fileattr_to_fsxattr(const struct fileattr *fa, struct fsxattr *fsx)
> +{
> +	memset(fsx, 0, sizeof(struct fsxattr));
> +	fsx->fsx_xflags = fa->fsx_xflags;
> +	fsx->fsx_extsize = fa->fsx_extsize;
> +	fsx->fsx_nextents = fa->fsx_nextents;
> +	fsx->fsx_projid = fa->fsx_projid;
> +	fsx->fsx_cowextsize = fa->fsx_cowextsize;
> +}
> +
>  /**
>   * copy_fsxattr_to_user - copy fsxattr to userspace.
>   * @fa:		fileattr pointer
> @@ -549,12 +559,7 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa)
>  {
>  	struct fsxattr xfa;
>  
> -	memset(&xfa, 0, sizeof(xfa));
> -	xfa.fsx_xflags = fa->fsx_xflags;
> -	xfa.fsx_extsize = fa->fsx_extsize;
> -	xfa.fsx_nextents = fa->fsx_nextents;
> -	xfa.fsx_projid = fa->fsx_projid;
> -	xfa.fsx_cowextsize = fa->fsx_cowextsize;
> +	fileattr_to_fsxattr(fa, &xfa);
>  
>  	if (copy_to_user(ufa, &xfa, sizeof(xfa)))
>  		return -EFAULT;
> @@ -563,6 +568,15 @@ int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa)
>  }
>  EXPORT_SYMBOL(copy_fsxattr_to_user);
>  
> +void fsxattr_to_fileattr(const struct fsxattr *fsx, struct fileattr *fa)
> +{
> +	fileattr_fill_xflags(fa, fsx->fsx_xflags);
> +	fa->fsx_extsize = fsx->fsx_extsize;
> +	fa->fsx_nextents = fsx->fsx_nextents;
> +	fa->fsx_projid = fsx->fsx_projid;
> +	fa->fsx_cowextsize = fsx->fsx_cowextsize;
> +}
> +
>  static int copy_fsxattr_from_user(struct fileattr *fa,
>  				  struct fsxattr __user *ufa)
>  {
> @@ -571,11 +585,7 @@ static int copy_fsxattr_from_user(struct fileattr *fa,
>  	if (copy_from_user(&xfa, ufa, sizeof(xfa)))
>  		return -EFAULT;
>  
> -	fileattr_fill_xflags(fa, xfa.fsx_xflags);
> -	fa->fsx_extsize = xfa.fsx_extsize;
> -	fa->fsx_nextents = xfa.fsx_nextents;
> -	fa->fsx_projid = xfa.fsx_projid;
> -	fa->fsx_cowextsize = xfa.fsx_cowextsize;
> +	fsxattr_to_fileattr(&xfa, fa);
>  
>  	return 0;
>  }
> diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
> index 47c05a9851d0600964b644c9c7218faacfd865f8..31888fa2edf10050be134f587299256088344365 100644
> --- a/include/linux/fileattr.h
> +++ b/include/linux/fileattr.h
> @@ -33,7 +33,9 @@ struct fileattr {
>  	bool	fsx_valid:1;
>  };
>  
> +void fileattr_to_fsxattr(const struct fileattr *fa, struct fsxattr *fsx);
>  int copy_fsxattr_to_user(const struct fileattr *fa, struct fsxattr __user *ufa);
> +void fsxattr_to_fileattr(const struct fsxattr *fsx, struct fileattr *fa);
>  
>  void fileattr_fill_xflags(struct fileattr *fa, u32 xflags);
>  void fileattr_fill_flags(struct fileattr *fa, u32 flags);
> 
> -- 
> 2.47.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27 11:47     ` Amir Goldstein
@ 2025-03-27 19:26       ` Pali Rohár
  2025-03-27 20:57         ` Amir Goldstein
  0 siblings, 1 reply; 28+ messages in thread
From: Pali Rohár @ 2025-03-27 19:26 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > >
> > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > file to get an fd.
> > > >
> > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > special files. One of the usage examples is XFS quota projects.
> > > >
> > > > XFS has project quotas which could be attached to a directory. All
> > > > new inodes in these directories inherit project ID set on parent
> > > > directory.
> > > >
> > > > The project is created from userspace by opening and calling
> > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > with empty project ID. Those inodes then are not shown in the quota
> > > > accounting but still exist in the directory. This is not critical but in
> > > > the case when special files are created in the directory with already
> > > > existing project quota, these new inodes inherit extended attributes.
> > > > This creates a mix of special files with and without attributes.
> > > > Moreover, special files with attributes don't have a possibility to
> > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > from re-creating quota project on these existing files.
> > > >
> > > > Christian, if this get in some mergeable state, please don't merge it
> > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > >
> > > Andrey,
> > >
> > > To be honest I don't think it would be fair to delay your syscalls more
> > > than needed.
> >
> > I agree.
> >
> > > If Pali can follow through and post patches on top of your syscalls for
> > > next merge window that would be great, but otherwise, I think the
> > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > is not zero. we can take it from there later.
> >
> > IMHO SYS_getfsxattrat is fine in this form.
> >
> > For SYS_setfsxattrat I think there are needed some modifications
> > otherwise we would have problem again with backward compatibility as
> > is with ioctl if the syscall wants to be extended in future.
> >
> > I would suggest for following modifications for SYS_setfsxattrat:
> >
> > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> >
> > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> >   change fsx_xflags, and so could be used without the preceding
> >   SYS_getfsxattrat call.
> >
> > What do you think about it?
> 
> I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> 
> You can use this later to extend for the semantics of flags/fields mask
> and we can have a long discussion later on what this semantics should be.
> 
> Right?
> 
> Amir.

It is really enough? All new extensions later would have to be added
into fsx_pad fields, and currently unused bits in fsx_xflags would be
unusable for extensions.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27 19:26       ` Pali Rohár
@ 2025-03-27 20:57         ` Amir Goldstein
  2025-03-27 21:13           ` Pali Rohár
  0 siblings, 1 reply; 28+ messages in thread
From: Amir Goldstein @ 2025-03-27 20:57 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
>
> On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > >
> > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > file to get an fd.
> > > > >
> > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > special files. One of the usage examples is XFS quota projects.
> > > > >
> > > > > XFS has project quotas which could be attached to a directory. All
> > > > > new inodes in these directories inherit project ID set on parent
> > > > > directory.
> > > > >
> > > > > The project is created from userspace by opening and calling
> > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > accounting but still exist in the directory. This is not critical but in
> > > > > the case when special files are created in the directory with already
> > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > This creates a mix of special files with and without attributes.
> > > > > Moreover, special files with attributes don't have a possibility to
> > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > from re-creating quota project on these existing files.
> > > > >
> > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > >
> > > > Andrey,
> > > >
> > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > than needed.
> > >
> > > I agree.
> > >
> > > > If Pali can follow through and post patches on top of your syscalls for
> > > > next merge window that would be great, but otherwise, I think the
> > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > is not zero. we can take it from there later.
> > >
> > > IMHO SYS_getfsxattrat is fine in this form.
> > >
> > > For SYS_setfsxattrat I think there are needed some modifications
> > > otherwise we would have problem again with backward compatibility as
> > > is with ioctl if the syscall wants to be extended in future.
> > >
> > > I would suggest for following modifications for SYS_setfsxattrat:
> > >
> > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > >
> > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > >   change fsx_xflags, and so could be used without the preceding
> > >   SYS_getfsxattrat call.
> > >
> > > What do you think about it?
> >
> > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> >
> > You can use this later to extend for the semantics of flags/fields mask
> > and we can have a long discussion later on what this semantics should be.
> >
> > Right?
> >
> > Amir.
>
> It is really enough?

I don't know. Let's see...

> All new extensions later would have to be added
> into fsx_pad fields, and currently unused bits in fsx_xflags would be
> unusable for extensions.

I am working under the assumption that the first extension would be
to support fsx_xflags_mask and from there, you could add filesystem
flags support checks and then new flags. Am I wrong?

Obviously, fsx_xflags_mask would be taken from fsx_pad space.
After that extension is implemented, calling SYS_setfsxattrat() with
a zero fsx_xflags_mask would be silly for programs that do not do
the legacy get+set.

So when we introduce  fsx_xflags_mask, we could say that a value
of zero means that the mask is not being checked at all and unknown
flags in set syscall are ignored (a.k.a legacy ioctl behavior).

Programs that actually want to try and set without get will have to set
a non zero fsx_xflags_mask to do something useful.

I don't think this is great.
I would rather that the first version of syscalls will require the mask
and will always enforce filesystems supported flags.

If you can get those patches (on top of current series) posted and
reviewed in time for the next merge window, including consensus
on the actual semantics, that would be the best IMO.

But I am just preparing a plan B in case you do not have time to
work on the patches or if consensus on the API extensions is not
reached on time.

I think that for plan B, the minimum is to verify zero pad field and
that is something that this syscall has to do anyway, because this
is the way that backward compact APIs work.

If you want the syscall to always return -EINVAL for setting xflags
that are currently undefined I agree that would be nice as well.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27 20:57         ` Amir Goldstein
@ 2025-03-27 21:13           ` Pali Rohár
  2025-03-28 14:09             ` Amir Goldstein
  0 siblings, 1 reply; 28+ messages in thread
From: Pali Rohár @ 2025-03-27 21:13 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Thursday 27 March 2025 21:57:34 Amir Goldstein wrote:
> On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > > >
> > > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > >
> > > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > > file to get an fd.
> > > > > >
> > > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > > special files. One of the usage examples is XFS quota projects.
> > > > > >
> > > > > > XFS has project quotas which could be attached to a directory. All
> > > > > > new inodes in these directories inherit project ID set on parent
> > > > > > directory.
> > > > > >
> > > > > > The project is created from userspace by opening and calling
> > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > > accounting but still exist in the directory. This is not critical but in
> > > > > > the case when special files are created in the directory with already
> > > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > > This creates a mix of special files with and without attributes.
> > > > > > Moreover, special files with attributes don't have a possibility to
> > > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > > from re-creating quota project on these existing files.
> > > > > >
> > > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > > >
> > > > > Andrey,
> > > > >
> > > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > > than needed.
> > > >
> > > > I agree.
> > > >
> > > > > If Pali can follow through and post patches on top of your syscalls for
> > > > > next merge window that would be great, but otherwise, I think the
> > > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > > is not zero. we can take it from there later.
> > > >
> > > > IMHO SYS_getfsxattrat is fine in this form.
> > > >
> > > > For SYS_setfsxattrat I think there are needed some modifications
> > > > otherwise we would have problem again with backward compatibility as
> > > > is with ioctl if the syscall wants to be extended in future.
> > > >
> > > > I would suggest for following modifications for SYS_setfsxattrat:
> > > >
> > > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > > >
> > > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > > >   change fsx_xflags, and so could be used without the preceding
> > > >   SYS_getfsxattrat call.
> > > >
> > > > What do you think about it?
> > >
> > > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> > >
> > > You can use this later to extend for the semantics of flags/fields mask
> > > and we can have a long discussion later on what this semantics should be.
> > >
> > > Right?
> > >
> > > Amir.
> >
> > It is really enough?
> 
> I don't know. Let's see...
> 
> > All new extensions later would have to be added
> > into fsx_pad fields, and currently unused bits in fsx_xflags would be
> > unusable for extensions.
> 
> I am working under the assumption that the first extension would be
> to support fsx_xflags_mask and from there, you could add filesystem
> flags support checks and then new flags. Am I wrong?
> 
> Obviously, fsx_xflags_mask would be taken from fsx_pad space.
> After that extension is implemented, calling SYS_setfsxattrat() with
> a zero fsx_xflags_mask would be silly for programs that do not do
> the legacy get+set.
> 
> So when we introduce  fsx_xflags_mask, we could say that a value
> of zero means that the mask is not being checked at all and unknown
> flags in set syscall are ignored (a.k.a legacy ioctl behavior).
> 
> Programs that actually want to try and set without get will have to set
> a non zero fsx_xflags_mask to do something useful.

Here we need to also solve the problem that without GET call we do not
have valid values for fsx_extsize, fsx_projid, and fsx_cowextsize. So
maybe we would need some flag in fsx_pad that fsx_extsize, fsx_projid,
or fsx_cowextsize are ignored/masked.

> I don't think this is great.
> I would rather that the first version of syscalls will require the mask
> and will always enforce filesystems supported flags.

It is not great... But what about this? In a first step (part of this
syscall patch series) would be just a check that fsx_pad is zero.
Non-zero will return -EINVAL.

In next changes would added fsx_filter bit field, which for each
fsx_xflags and also for fsx_extsize, fsx_projid, and fsx_cowextsize
fields would add a new bit flag which would say (when SET) that the
particular thing has to be ignored.

So when fsx_pad is all-zeros then fsx_filter (first field in fsx_pad)
would say that nothing in fsx_xflags, fsx_extsize, fsx_projid, and
fsx_cowextsize is ignored, and hence behave like before.

And when something in fsx_pad/fsx_filter is set then it says which
fields are ignored/filtered-out.

> If you can get those patches (on top of current series) posted and
> reviewed in time for the next merge window, including consensus
> on the actual semantics, that would be the best IMO.

I think that this starting to be more complicated to rebase my patches
in a way that they do not affect IOCTL path but implement it properly
for new syscall path. It does not sounds like a trivial thing which I
would finish in merge window time and having proper review and consensus
on this.

> But I am just preparing a plan B in case you do not have time to
> work on the patches or if consensus on the API extensions is not
> reached on time.
> 
> I think that for plan B, the minimum is to verify zero pad field and
> that is something that this syscall has to do anyway, because this
> is the way that backward compact APIs work.
> 
> If you want the syscall to always return -EINVAL for setting xflags
> that are currently undefined I agree that would be nice as well.
> 
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27 21:13           ` Pali Rohár
@ 2025-03-28 14:09             ` Amir Goldstein
  0 siblings, 0 replies; 28+ messages in thread
From: Amir Goldstein @ 2025-03-28 14:09 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Christian Brauner, Jan Kara, Mickaël Salaün,
	Günther Noack, Arnd Bergmann, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, selinux, Andrey Albershteyn, linux-xfs

On Thu, Mar 27, 2025 at 10:13 PM Pali Rohár <pali@kernel.org> wrote:
>
> On Thursday 27 March 2025 21:57:34 Amir Goldstein wrote:
> > On Thu, Mar 27, 2025 at 8:26 PM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > On Thursday 27 March 2025 12:47:02 Amir Goldstein wrote:
> > > > On Sun, Mar 23, 2025 at 11:32 AM Pali Rohár <pali@kernel.org> wrote:
> > > > >
> > > > > On Sunday 23 March 2025 09:45:06 Amir Goldstein wrote:
> > > > > > On Fri, Mar 21, 2025 at 8:50 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > > >
> > > > > > > This patchset introduced two new syscalls getfsxattrat() and
> > > > > > > setfsxattrat(). These syscalls are similar to FS_IOC_FSSETXATTR ioctl()
> > > > > > > except they use *at() semantics. Therefore, there's no need to open the
> > > > > > > file to get an fd.
> > > > > > >
> > > > > > > These syscalls allow userspace to set filesystem inode attributes on
> > > > > > > special files. One of the usage examples is XFS quota projects.
> > > > > > >
> > > > > > > XFS has project quotas which could be attached to a directory. All
> > > > > > > new inodes in these directories inherit project ID set on parent
> > > > > > > directory.
> > > > > > >
> > > > > > > The project is created from userspace by opening and calling
> > > > > > > FS_IOC_FSSETXATTR on each inode. This is not possible for special
> > > > > > > files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
> > > > > > > with empty project ID. Those inodes then are not shown in the quota
> > > > > > > accounting but still exist in the directory. This is not critical but in
> > > > > > > the case when special files are created in the directory with already
> > > > > > > existing project quota, these new inodes inherit extended attributes.
> > > > > > > This creates a mix of special files with and without attributes.
> > > > > > > Moreover, special files with attributes don't have a possibility to
> > > > > > > become clear or change the attributes. This, in turn, prevents userspace
> > > > > > > from re-creating quota project on these existing files.
> > > > > > >
> > > > > > > Christian, if this get in some mergeable state, please don't merge it
> > > > > > > yet. Amir suggested these syscalls better to use updated struct fsxattr
> > > > > > > with masking from Pali Rohár patchset, so, let's see how it goes.
> > > > > >
> > > > > > Andrey,
> > > > > >
> > > > > > To be honest I don't think it would be fair to delay your syscalls more
> > > > > > than needed.
> > > > >
> > > > > I agree.
> > > > >
> > > > > > If Pali can follow through and post patches on top of your syscalls for
> > > > > > next merge window that would be great, but otherwise, I think the
> > > > > > minimum requirement is that the syscalls return EINVAL if fsx_pad
> > > > > > is not zero. we can take it from there later.
> > > > >
> > > > > IMHO SYS_getfsxattrat is fine in this form.
> > > > >
> > > > > For SYS_setfsxattrat I think there are needed some modifications
> > > > > otherwise we would have problem again with backward compatibility as
> > > > > is with ioctl if the syscall wants to be extended in future.
> > > > >
> > > > > I would suggest for following modifications for SYS_setfsxattrat:
> > > > >
> > > > > - return EINVAL if fsx_xflags contains some reserved or unsupported flag
> > > > >
> > > > > - add some flag to completely ignore fsx_extsize, fsx_projid, and
> > > > >   fsx_cowextsize fields, so SYS_setfsxattrat could be used just to
> > > > >   change fsx_xflags, and so could be used without the preceding
> > > > >   SYS_getfsxattrat call.
> > > > >
> > > > > What do you think about it?
> > > >
> > > > I think all Andrey needs to do now is return -EINVAL if fsx_pad is not zero.
> > > >
> > > > You can use this later to extend for the semantics of flags/fields mask
> > > > and we can have a long discussion later on what this semantics should be.
> > > >
> > > > Right?
> > > >
> > > > Amir.
> > >
> > > It is really enough?
> >
> > I don't know. Let's see...
> >
> > > All new extensions later would have to be added
> > > into fsx_pad fields, and currently unused bits in fsx_xflags would be
> > > unusable for extensions.
> >
> > I am working under the assumption that the first extension would be
> > to support fsx_xflags_mask and from there, you could add filesystem
> > flags support checks and then new flags. Am I wrong?
> >
> > Obviously, fsx_xflags_mask would be taken from fsx_pad space.
> > After that extension is implemented, calling SYS_setfsxattrat() with
> > a zero fsx_xflags_mask would be silly for programs that do not do
> > the legacy get+set.
> >
> > So when we introduce  fsx_xflags_mask, we could say that a value
> > of zero means that the mask is not being checked at all and unknown
> > flags in set syscall are ignored (a.k.a legacy ioctl behavior).
> >
> > Programs that actually want to try and set without get will have to set
> > a non zero fsx_xflags_mask to do something useful.
>
> Here we need to also solve the problem that without GET call we do not
> have valid values for fsx_extsize, fsx_projid, and fsx_cowextsize. So
> maybe we would need some flag in fsx_pad that fsx_extsize, fsx_projid,
> or fsx_cowextsize are ignored/masked.
>
> > I don't think this is great.
> > I would rather that the first version of syscalls will require the mask
> > and will always enforce filesystems supported flags.
>
> It is not great... But what about this? In a first step (part of this
> syscall patch series) would be just a check that fsx_pad is zero.
> Non-zero will return -EINVAL.
>
> In next changes would added fsx_filter bit field, which for each
> fsx_xflags and also for fsx_extsize, fsx_projid, and fsx_cowextsize
> fields would add a new bit flag which would say (when SET) that the
> particular thing has to be ignored.

1. I don't like the inverse mask. statx already has the stx_mask
    and stx_attributes_mask, so I rather stick to same semantics
    because some of those attributes are exposed via statx as well
2. fsx_*extsize already have a bit that says if that the particular
    attribute is valid or not, so setting a zero fsx_cowextsize with the
    flag FS_XFLAG_COWEXTSIZE has no effect in xfs:

        /*
         * Only set the extent size hint if we've already determined that the
         * extent size hint should be set on the inode. If no extent size flags
         * are set on the inode then unconditionally clear the extent size hint.
         */
        if (ip->i_diflags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT))
                ip->i_extsize = XFS_B_TO_FSB(mp, fa->fsx_extsize);
        else
                ip->i_extsize = 0;

        if (xfs_has_v3inodes(mp)) {
                if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
                        ip->i_cowextsize = XFS_B_TO_FSB(mp, fa->fsx_cowextsize);
                else
                        ip->i_cowextsize = 0;
        }

I think we need to enforce this logic in fileattr_set_prepare()
and I think we need to add a flag FS_XFLAG_PROJID
that will be set in GET when fsx_projid != 0 and similarly
required when setting fsx_projid != 0.

Probably will need to add some backward compat glue for this
flag in GET ioctl to avoid breaking out of tree fs and fuse.

>
> So when fsx_pad is all-zeros then fsx_filter (first field in fsx_pad)
> would say that nothing in fsx_xflags, fsx_extsize, fsx_projid, and
> fsx_cowextsize is ignored, and hence behave like before.
>
> And when something in fsx_pad/fsx_filter is set then it says which
> fields are ignored/filtered-out.
>
> > If you can get those patches (on top of current series) posted and
> > reviewed in time for the next merge window, including consensus
> > on the actual semantics, that would be the best IMO.
>
> I think that this starting to be more complicated to rebase my patches
> in a way that they do not affect IOCTL path but implement it properly
> for new syscall path. It does not sounds like a trivial thing which I
> would finish in merge window time and having proper review and consensus
> on this.
>

Yes, it is better to separate the two efforts.

wrt erroring on unsupported SET flags, all fs other than xfs already
have some variant of fileattr_has_fsx(), so xfs is the only filesystem
that requires special care with the new syscalls.
It's easier to write a patch than it is to explain what I mean, so
I'll try to write a patch.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-27 11:39       ` Amir Goldstein
@ 2025-04-22 14:31         ` Christian Brauner
  2025-04-22 15:14           ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2025-04-22 14:31 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On Thu, Mar 27, 2025 at 12:39:28PM +0100, Amir Goldstein wrote:
> On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> >
> > On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > >
> > > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > >
> > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > > extended attributes/flags. The syscalls take parent directory fd and
> > > > path to the child together with struct fsxattr.
> > > >
> > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > > that file don't need to be open as we can reference it with a path
> > > > instead of fd. By having this we can manipulated inode extended
> > > > attributes not only on regular files but also on special ones. This
> > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > > we can not call ioctl() directly on the filesystem inode using fd.
> > > >
> > > > This patch adds two new syscalls which allows userspace to get/set
> > > > extended inode attributes on special files by using parent directory
> > > > and a path - *at() like syscall.
> > > >
> > > > CC: linux-api@vger.kernel.org
> > > > CC: linux-fsdevel@vger.kernel.org
> > > > CC: linux-xfs@vger.kernel.org
> > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > ---
> > > ...
> > > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > > +               struct fsxattr __user *, ufsx, size_t, usize,
> > > > +               unsigned int, at_flags)
> > > > +{
> > > > +       struct fileattr fa;
> > > > +       struct path filepath;
> > > > +       int error;
> > > > +       unsigned int lookup_flags = 0;
> > > > +       struct filename *name;
> > > > +       struct mnt_idmap *idmap;.
> > >
> > > > +       struct dentry *dentry;
> > > > +       struct vfsmount *mnt;
> > > > +       struct fsxattr fsx = {};
> > > > +
> > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > > +
> > > > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > > +               return -EINVAL;
> > > > +
> > > > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > > +               lookup_flags |= LOOKUP_FOLLOW;
> > > > +
> > > > +       if (at_flags & AT_EMPTY_PATH)
> > > > +               lookup_flags |= LOOKUP_EMPTY;
> > > > +
> > > > +       if (usize > PAGE_SIZE)
> > > > +               return -E2BIG;
> > > > +
> > > > +       if (usize < FSXATTR_SIZE_VER0)
> > > > +               return -EINVAL;
> > > > +
> > > > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > > +       if (error)
> > > > +               return error;
> > > > +
> > > > +       fsxattr_to_fileattr(&fsx, &fa);
> > > > +
> > > > +       name = getname_maybe_null(filename, at_flags);
> > > > +       if (!name) {
> > > > +               CLASS(fd, f)(dfd);
> > > > +
> > > > +               if (fd_empty(f))
> > > > +                       return -EBADF;
> > > > +
> > > > +               idmap = file_mnt_idmap(fd_file(f));
> > > > +               dentry = file_dentry(fd_file(f));
> > > > +               mnt = fd_file(f)->f_path.mnt;
> > > > +       } else {
> > > > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > > +                                       NULL);
> > > > +               if (error)
> > > > +                       return error;
> > > > +
> > > > +               idmap = mnt_idmap(filepath.mnt);
> > > > +               dentry = filepath.dentry;
> > > > +               mnt = filepath.mnt;
> > > > +       }
> > > > +
> > > > +       error = mnt_want_write(mnt);
> > > > +       if (!error) {
> > > > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > > > +               if (error == -ENOIOCTLCMD)
> > > > +                       error = -EOPNOTSUPP;
> > >
> > > This is awkward.
> > > vfs_fileattr_set() should return -EOPNOTSUPP.
> > > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > > ioctl returns -EOPNOTSUPP.
> > >
> > > I don't think it is necessarily a bad idea to start returning
> > >  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > > because that really reflects the fact that the ioctl is now implemented
> > > in vfs and not in the specific fs.
> > >
> > > and I think it would not be a bad idea at all to make that change
> > > together with the merge of the syscalls as a sort of hint to userspace
> > > that uses the ioctl, that the sycalls API exists.
> > >
> > > Thanks,
> > > Amir.
> > >
> >
> > Hmm, not sure what you're suggesting here. I see it as:
> > - get/setfsxattrat should return EOPNOTSUPP as it make more sense
> >   than ENOIOCTLCMD
> > - ioctl_setflags returns ENOIOCTLCMD which also expected
> >
> > Don't really see a reason to change what vfs_fileattr_set() returns
> > and then copying this if() to other places or start returning
> > EOPNOTSUPP.
> 
> ENOIOCTLCMD conceptually means that the ioctl command is unknown
> This is not the case since ->fileattr_[gs]et() became a vfs API

vfs_fileattr_{g,s}et() should not return ENOIOCTLCMD. Change the return
code to EOPNOTSUPP and then make EOPNOTSUPP be translated to ENOTTY on
on overlayfs and to ENOIOCTLCMD in ecryptfs and in fs/ioctl.c. This way
we get a clean VFS api while retaining current behavior. Amir can do his
cleanup based on that.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
  2025-03-23  8:56   ` Amir Goldstein
  2025-03-27 12:31   ` Jan Kara
@ 2025-04-22 14:59   ` Christian Brauner
  2025-04-23  9:53     ` Jan Kara
  2 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2025-04-22 14:59 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Richard Henderson, Matt Turner, Russell King, Catalin Marinas,
	Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On Fri, Mar 21, 2025 at 08:48:42PM +0100, Andrey Albershteyn wrote:
> From: Andrey Albershteyn <aalbersh@redhat.com>
> 
> Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> extended attributes/flags. The syscalls take parent directory fd and
> path to the child together with struct fsxattr.
> 
> This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> that file don't need to be open as we can reference it with a path
> instead of fd. By having this we can manipulated inode extended
> attributes not only on regular files but also on special ones. This
> is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> we can not call ioctl() directly on the filesystem inode using fd.
> 
> This patch adds two new syscalls which allows userspace to get/set
> extended inode attributes on special files by using parent directory
> and a path - *at() like syscall.
> 
> CC: linux-api@vger.kernel.org
> CC: linux-fsdevel@vger.kernel.org
> CC: linux-xfs@vger.kernel.org
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
>  arch/arm/tools/syscall.tbl                  |   2 +
>  arch/arm64/tools/syscall_32.tbl             |   2 +
>  arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
>  arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
>  arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
>  arch/s390/kernel/syscalls/syscall.tbl       |   2 +
>  arch/sh/kernel/syscalls/syscall.tbl         |   2 +
>  arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
>  arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
>  fs/inode.c                                  | 130 ++++++++++++++++++++++++++++
>  include/linux/syscalls.h                    |   6 ++
>  include/uapi/asm-generic/unistd.h           |   8 +-
>  include/uapi/linux/fs.h                     |   3 +
>  20 files changed, 178 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
> index c59d53d6d3f3490f976ca179ddfe02e69265ae4d..4b9e687494c16b60c6fd6ca1dc4d6564706a7e25 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -506,3 +506,5 @@
>  574	common	getxattrat			sys_getxattrat
>  575	common	listxattrat			sys_listxattrat
>  576	common	removexattrat			sys_removexattrat
> +577	common	getfsxattrat			sys_getfsxattrat
> +578	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> index 49eeb2ad8dbd8e074c6240417693f23fb328afa8..66466257f3c2debb3e2299f0b608c6740c98cab2 100644
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -481,3 +481,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/arm64/tools/syscall_32.tbl b/arch/arm64/tools/syscall_32.tbl
> index 69a829912a05eb8a3e21ed701d1030e31c0148bc..9c516118b154811d8d11d5696f32817430320dbf 100644
> --- a/arch/arm64/tools/syscall_32.tbl
> +++ b/arch/arm64/tools/syscall_32.tbl
> @@ -478,3 +478,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
> index f5ed71f1910d09769c845c2d062d99ee0449437c..159476387f394a92ee5e29db89b118c630372db2 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -466,3 +466,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
> index 680f568b77f2cbefc3eacb2517f276041f229b1e..a6d59ee740b58cacf823702003cf9bad17c0d3b7 100644
> --- a/arch/microblaze/kernel/syscalls/syscall.tbl
> +++ b/arch/microblaze/kernel/syscalls/syscall.tbl
> @@ -472,3 +472,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index 0b9b7e25b69ad592642f8533bee9ccfe95ce9626..cfe38fcebe1a0279e11751378d3e71c5ec6b6569 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -405,3 +405,5 @@
>  464	n32	getxattrat			sys_getxattrat
>  465	n32	listxattrat			sys_listxattrat
>  466	n32	removexattrat			sys_removexattrat
> +467	n32	getfsxattrat			sys_getfsxattrat
> +468	n32	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index c844cd5cda620b2809a397cdd6f4315ab6a1bfe2..29a0c5974d1aa2f01e33edc0252d75fb97abe230 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -381,3 +381,5 @@
>  464	n64	getxattrat			sys_getxattrat
>  465	n64	listxattrat			sys_listxattrat
>  466	n64	removexattrat			sys_removexattrat
> +467	n64	getfsxattrat			sys_getfsxattrat
> +468	n64	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 349b8aad1159f404103bd2057a1e64e9bf309f18..6c00436807c57c492ba957fcd59af1202231cf80 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -454,3 +454,5 @@
>  464	o32	getxattrat			sys_getxattrat
>  465	o32	listxattrat			sys_listxattrat
>  466	o32	removexattrat			sys_removexattrat
> +467	o32	getfsxattrat			sys_getfsxattrat
> +468	o32	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index d9fc94c869657fcfbd7aca1d5f5abc9fae2fb9d8..b3578fac43d6b65167787fcc97d2d09f5a9828e7 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -465,3 +465,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index d8b4ab78bef076bd50d49b87dea5060fd8c1686a..808045d82c9465c3bfa96b15947546efe5851e9a 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -557,3 +557,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index e9115b4d8b635b846e5c9ad6ce229605323723a5..78dfc2c184d4815baf8a9e61c546c9936d58a47c 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -469,3 +469,5 @@
>  464  common	getxattrat		sys_getxattrat			sys_getxattrat
>  465  common	listxattrat		sys_listxattrat			sys_listxattrat
>  466  common	removexattrat		sys_removexattrat		sys_removexattrat
> +467  common	getfsxattrat		sys_getfsxattrat		sys_getfsxattrat
> +468  common	setfsxattrat		sys_setfsxattrat		sys_setfsxattrat
> diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
> index c8cad33bf250ea110de37bd1407f5a43ec5e38f2..d5a5c8339f0ed25ea07c4aba90351d352033c8a0 100644
> --- a/arch/sh/kernel/syscalls/syscall.tbl
> +++ b/arch/sh/kernel/syscalls/syscall.tbl
> @@ -470,3 +470,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index 727f99d333b304b3db0711953a3d91ece18a28eb..817dcd8603bcbffc47f3f59aa3b74b16486453d0 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -512,3 +512,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 4d0fb2fba7e208ae9455459afe11e277321d9f74..b4842c027c5d00c0236b2ba89387c5e2267447bd 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -472,3 +472,5 @@
>  464	i386	getxattrat		sys_getxattrat
>  465	i386	listxattrat		sys_listxattrat
>  466	i386	removexattrat		sys_removexattrat
> +467	i386	getfsxattrat		sys_getfsxattrat
> +468	i386	setfsxattrat		sys_setfsxattrat
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 5eb708bff1c791debd6cfc5322583b2ae53f6437..b6f0a7236aaee624cf9b484239a1068085a8ffe1 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -390,6 +390,8 @@
>  464	common	getxattrat		sys_getxattrat
>  465	common	listxattrat		sys_listxattrat
>  466	common	removexattrat		sys_removexattrat
> +467	common	getfsxattrat		sys_getfsxattrat
> +468	common	setfsxattrat		sys_setfsxattrat
>  
>  #
>  # Due to a historical design error, certain syscalls are numbered differently
> diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
> index 37effc1b134eea061f2c350c1d68b4436b65a4dd..425d56be337d1de22f205ac503df61ff86224fee 100644
> --- a/arch/xtensa/kernel/syscalls/syscall.tbl
> +++ b/arch/xtensa/kernel/syscalls/syscall.tbl
> @@ -437,3 +437,5 @@
>  464	common	getxattrat			sys_getxattrat
>  465	common	listxattrat			sys_listxattrat
>  466	common	removexattrat			sys_removexattrat
> +467	common	getfsxattrat			sys_getfsxattrat
> +468	common	setfsxattrat			sys_setfsxattrat
> diff --git a/fs/inode.c b/fs/inode.c
> index 6b4c77268fc0ecace4ac78a9ca777fbffc277f4a..811debf379ab299f287ed90863277cfda27db30c 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c

I really dislike the name fsxattr for a lot of reasons but it definitely
shouldn't go in inode.c. Just add a new fs/file_attr.c file and move all
the relevant helpers from fs/ioctl.c in there and then add the system
calls. Otherwise it's just all very confusing.

> @@ -23,6 +23,9 @@
>  #include <linux/rw_hint.h>
>  #include <linux/seq_file.h>
>  #include <linux/debugfs.h>
> +#include <linux/syscalls.h>
> +#include <linux/fileattr.h>
> +#include <linux/namei.h>
>  #include <trace/events/writeback.h>
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/timestamp.h>
> @@ -2953,3 +2956,130 @@ umode_t mode_strip_sgid(struct mnt_idmap *idmap,
>  	return mode & ~S_ISGID;
>  }
>  EXPORT_SYMBOL(mode_strip_sgid);
> +
> +SYSCALL_DEFINE5(getfsxattrat, int, dfd, const char __user *, filename,

This is really misnamed. It will end up confusing userspace to no end:

getxattr()
getxattrat()
getfsxattrat()

Please name this file_setattr() and file_getattr(). There's also no need
for the *at() prefix. We've never been consistent with that. We have
plent of system calls that to fd+path without having an *at() prefix.
And here it's especially unneeded because there's no pre-existing system
calls that would even force the use of that prefix.

> +		struct fsxattr __user *, ufsx, size_t, usize,
> +		unsigned int, at_flags)
> +{
> +	struct fileattr fa = {};
> +	struct path filepath;
> +	int error;
> +	unsigned int lookup_flags = 0;
> +	struct filename *name;
> +	struct fsxattr fsx = {};
> +
> +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> +
> +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> +		return -EINVAL;
> +
> +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> +		lookup_flags |= LOOKUP_FOLLOW;
> +
> +	if (at_flags & AT_EMPTY_PATH)
> +		lookup_flags |= LOOKUP_EMPTY;
> +
> +	if (usize > PAGE_SIZE)
> +		return -E2BIG;
> +
> +	if (usize < FSXATTR_SIZE_VER0)
> +		return -EINVAL;
> +
> +	name = getname_maybe_null(filename, at_flags);
> +	if (!name) {

This is broken as it doesn't handle AT_FDCWD correctly. You need:

        name = getname_maybe_null(filename, at_flags);
        if (IS_ERR(name))
                return PTR_ERR(name);

        if (!name && dfd >= 0) {
		CLASS(fd, f)(dfd);


> +		CLASS(fd, f)(dfd);
> +
> +		if (fd_empty(f))
> +			return -EBADF;

I'm pretty sure you can just do a:

                path = fd_file(f_to)->f_path;
                path_get(&path);

and then the vfs_fileattr_get() call and the path_put() call can be
shared between the two branches. Note that you can also use:

struct path path __free(path_put) = NULL;

and then the cleanup infrastructure will handle the path_put() for you.

> +		error = vfs_fileattr_get(file_dentry(fd_file(f)), &fa);
> +	} else {
> +		error = filename_lookup(dfd, name, lookup_flags, &filepath,
> +					NULL);
> +		if (error)
> +			goto out;
> +		error = vfs_fileattr_get(filepath.dentry, &fa);
> +		path_put(&filepath);
> +	}
> +	if (error == -ENOIOCTLCMD)
> +		error = -EOPNOTSUPP;
> +	if (!error) {
> +		fileattr_to_fsxattr(&fa, &fsx);
> +		error = copy_struct_to_user(ufsx, usize, &fsx,
> +					    sizeof(struct fsxattr), NULL);
> +	}
> +out:
> +	putname(name);
> +	return error;
> +}
> +
> +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> +		struct fsxattr __user *, ufsx, size_t, usize,
> +		unsigned int, at_flags)
> +{
> +	struct fileattr fa;
> +	struct path filepath;
> +	int error;
> +	unsigned int lookup_flags = 0;
> +	struct filename *name;
> +	struct mnt_idmap *idmap;
> +	struct dentry *dentry;
> +	struct vfsmount *mnt;
> +	struct fsxattr fsx = {};
> +
> +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> +
> +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> +		return -EINVAL;
> +
> +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> +		lookup_flags |= LOOKUP_FOLLOW;
> +
> +	if (at_flags & AT_EMPTY_PATH)
> +		lookup_flags |= LOOKUP_EMPTY;
> +
> +	if (usize > PAGE_SIZE)
> +		return -E2BIG;
> +
> +	if (usize < FSXATTR_SIZE_VER0)
> +		return -EINVAL;
> +
> +	error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> +	if (error)
> +		return error;
> +
> +	fsxattr_to_fileattr(&fsx, &fa);
> +
> +	name = getname_maybe_null(filename, at_flags);
> +	if (!name) {

Same comment as above.

> +		CLASS(fd, f)(dfd);
> +
> +		if (fd_empty(f))
> +			return -EBADF;
> +
> +		idmap = file_mnt_idmap(fd_file(f));
> +		dentry = file_dentry(fd_file(f));
> +		mnt = fd_file(f)->f_path.mnt;

This is a UAF. fd_file(f)->f_path.mnt and file_dentry(fd_file(f)) will
get auto cleaned at the end of the scope. By the time you call
vfs_fileattr_set() nothing pins them anymore...

In general, same comment about unifying the branches as for the get case
via path_get() as above. And just keep the path around don't store mount
and dentry separately.

> +	} else {
> +		error = filename_lookup(dfd, name, lookup_flags, &filepath,
> +					NULL);
> +		if (error)
> +			return error;
> +
> +		idmap = mnt_idmap(filepath.mnt);
> +		dentry = filepath.dentry;
> +		mnt = filepath.mnt;
> +	}
> +
> +	error = mnt_want_write(mnt);
> +	if (!error) {
> +		error = vfs_fileattr_set(idmap, dentry, &fa);
> +		if (error == -ENOIOCTLCMD)
> +			error = -EOPNOTSUPP;
> +		mnt_drop_write(mnt);
> +	}
> +
> +	path_put(&filepath);
> +	return error;
> +}
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index c6333204d45130eb022f6db460eea34a1f6e91db..e242ea39b3e63a8008bc777764b616fd63bd40c4 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -371,6 +371,12 @@ asmlinkage long sys_removexattrat(int dfd, const char __user *path,
>  asmlinkage long sys_lremovexattr(const char __user *path,
>  				 const char __user *name);
>  asmlinkage long sys_fremovexattr(int fd, const char __user *name);
> +asmlinkage long sys_getfsxattrat(int dfd, const char __user *filename,
> +				 struct fsxattr __user *ufsx, size_t usize,
> +				 unsigned int at_flags);
> +asmlinkage long sys_setfsxattrat(int dfd, const char __user *filename,
> +				 struct fsxattr __user *ufsx, size_t usize,
> +				 unsigned int at_flags);
>  asmlinkage long sys_getcwd(char __user *buf, unsigned long size);
>  asmlinkage long sys_eventfd2(unsigned int count, int flags);
>  asmlinkage long sys_epoll_create1(int flags);
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 88dc393c2bca38c0fa1b3fae579f7cfe4931223c..50be2e1007bc2779120d05c6e9512a689f86779c 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -850,8 +850,14 @@ __SYSCALL(__NR_listxattrat, sys_listxattrat)
>  #define __NR_removexattrat 466
>  __SYSCALL(__NR_removexattrat, sys_removexattrat)
>  
> +/* fs/inode.c */
> +#define __NR_getfsxattrat 467
> +__SYSCALL(__NR_getfsxattrat, sys_getfsxattrat)
> +#define __NR_setfsxattrat 468
> +__SYSCALL(__NR_setfsxattrat, sys_setfsxattrat)
> +
>  #undef __NR_syscalls
> -#define __NR_syscalls 467
> +#define __NR_syscalls 469
>  
>  /*
>   * 32 bit systems traditionally used different
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 7539717707337a8cb22396a869baba3bafa08371..aed753e5d50c97da9b895a187fdaecf0477db74b 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -139,6 +139,9 @@ struct fsxattr {
>  	unsigned char	fsx_pad[8];
>  };
>  
> +#define FSXATTR_SIZE_VER0 28
> +#define FSXATTR_SIZE_LATEST FSXATTR_SIZE_VER0
> +
>  /*
>   * Flags for the fsx_xflags field
>   */
> 
> -- 
> 2.47.2
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-22 14:31         ` Christian Brauner
@ 2025-04-22 15:14           ` Christian Brauner
  2025-04-25 18:16             ` Andrey Albershteyn
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2025-04-22 15:14 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On Tue, Apr 22, 2025 at 04:31:29PM +0200, Christian Brauner wrote:
> On Thu, Mar 27, 2025 at 12:39:28PM +0100, Amir Goldstein wrote:
> > On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > >
> > > On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > > > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > >
> > > > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > > >
> > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > > > extended attributes/flags. The syscalls take parent directory fd and
> > > > > path to the child together with struct fsxattr.
> > > > >
> > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > > > that file don't need to be open as we can reference it with a path
> > > > > instead of fd. By having this we can manipulated inode extended
> > > > > attributes not only on regular files but also on special ones. This
> > > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > > > we can not call ioctl() directly on the filesystem inode using fd.
> > > > >
> > > > > This patch adds two new syscalls which allows userspace to get/set
> > > > > extended inode attributes on special files by using parent directory
> > > > > and a path - *at() like syscall.
> > > > >
> > > > > CC: linux-api@vger.kernel.org
> > > > > CC: linux-fsdevel@vger.kernel.org
> > > > > CC: linux-xfs@vger.kernel.org
> > > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > > ---
> > > > ...
> > > > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > > > +               struct fsxattr __user *, ufsx, size_t, usize,
> > > > > +               unsigned int, at_flags)
> > > > > +{
> > > > > +       struct fileattr fa;
> > > > > +       struct path filepath;
> > > > > +       int error;
> > > > > +       unsigned int lookup_flags = 0;
> > > > > +       struct filename *name;
> > > > > +       struct mnt_idmap *idmap;.
> > > >
> > > > > +       struct dentry *dentry;
> > > > > +       struct vfsmount *mnt;
> > > > > +       struct fsxattr fsx = {};
> > > > > +
> > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > > > +
> > > > > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > > > +               lookup_flags |= LOOKUP_FOLLOW;
> > > > > +
> > > > > +       if (at_flags & AT_EMPTY_PATH)
> > > > > +               lookup_flags |= LOOKUP_EMPTY;
> > > > > +
> > > > > +       if (usize > PAGE_SIZE)
> > > > > +               return -E2BIG;
> > > > > +
> > > > > +       if (usize < FSXATTR_SIZE_VER0)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > > > +       if (error)
> > > > > +               return error;
> > > > > +
> > > > > +       fsxattr_to_fileattr(&fsx, &fa);
> > > > > +
> > > > > +       name = getname_maybe_null(filename, at_flags);
> > > > > +       if (!name) {
> > > > > +               CLASS(fd, f)(dfd);
> > > > > +
> > > > > +               if (fd_empty(f))
> > > > > +                       return -EBADF;
> > > > > +
> > > > > +               idmap = file_mnt_idmap(fd_file(f));
> > > > > +               dentry = file_dentry(fd_file(f));
> > > > > +               mnt = fd_file(f)->f_path.mnt;
> > > > > +       } else {
> > > > > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > > > +                                       NULL);
> > > > > +               if (error)
> > > > > +                       return error;
> > > > > +
> > > > > +               idmap = mnt_idmap(filepath.mnt);
> > > > > +               dentry = filepath.dentry;
> > > > > +               mnt = filepath.mnt;
> > > > > +       }
> > > > > +
> > > > > +       error = mnt_want_write(mnt);
> > > > > +       if (!error) {
> > > > > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > > > > +               if (error == -ENOIOCTLCMD)
> > > > > +                       error = -EOPNOTSUPP;
> > > >
> > > > This is awkward.
> > > > vfs_fileattr_set() should return -EOPNOTSUPP.
> > > > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > > > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > > > ioctl returns -EOPNOTSUPP.
> > > >
> > > > I don't think it is necessarily a bad idea to start returning
> > > >  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > > > because that really reflects the fact that the ioctl is now implemented
> > > > in vfs and not in the specific fs.
> > > >
> > > > and I think it would not be a bad idea at all to make that change
> > > > together with the merge of the syscalls as a sort of hint to userspace
> > > > that uses the ioctl, that the sycalls API exists.
> > > >
> > > > Thanks,
> > > > Amir.
> > > >
> > >
> > > Hmm, not sure what you're suggesting here. I see it as:
> > > - get/setfsxattrat should return EOPNOTSUPP as it make more sense
> > >   than ENOIOCTLCMD
> > > - ioctl_setflags returns ENOIOCTLCMD which also expected
> > >
> > > Don't really see a reason to change what vfs_fileattr_set() returns
> > > and then copying this if() to other places or start returning
> > > EOPNOTSUPP.
> > 
> > ENOIOCTLCMD conceptually means that the ioctl command is unknown
> > This is not the case since ->fileattr_[gs]et() became a vfs API
> 
> vfs_fileattr_{g,s}et() should not return ENOIOCTLCMD. Change the return
> code to EOPNOTSUPP and then make EOPNOTSUPP be translated to ENOTTY on
> on overlayfs and to ENOIOCTLCMD in ecryptfs and in fs/ioctl.c. This way
> we get a clean VFS api while retaining current behavior. Amir can do his
> cleanup based on that.

Also this get/set dance is not something new apis should do. It should
be handled like setattr_prepare() or generic_fillattr() where the
filesystem calls a VFS helper and that does all of this based on the
current state of the inode instead of calling into the filesystem twice:

int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
		     struct fileattr *fa)
{
<snip>
	inode_lock(inode);
	err = vfs_fileattr_get(dentry, &old_ma);
	if (!err) {
		/* initialize missing bits from old_ma */
		if (fa->flags_valid) {
<snip>
		err = fileattr_set_prepare(inode, &old_ma, fa);
		if (!err && !security_inode_setfsxattr(inode, fa))
			err = inode->i_op->fileattr_set(idmap, dentry, fa);

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-22 14:59   ` Christian Brauner
@ 2025-04-23  9:53     ` Jan Kara
  2025-04-24  9:06       ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Kara @ 2025-04-23  9:53 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On Tue 22-04-25 16:59:02, Christian Brauner wrote:
> On Fri, Mar 21, 2025 at 08:48:42PM +0100, Andrey Albershteyn wrote:
> > From: Andrey Albershteyn <aalbersh@redhat.com>
> > 
> > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > extended attributes/flags. The syscalls take parent directory fd and
> > path to the child together with struct fsxattr.
> > 
> > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > that file don't need to be open as we can reference it with a path
> > instead of fd. By having this we can manipulated inode extended
> > attributes not only on regular files but also on special ones. This
> > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > we can not call ioctl() directly on the filesystem inode using fd.
> > 
> > This patch adds two new syscalls which allows userspace to get/set
> > extended inode attributes on special files by using parent directory
> > and a path - *at() like syscall.
> > 
> > CC: linux-api@vger.kernel.org
> > CC: linux-fsdevel@vger.kernel.org
> > CC: linux-xfs@vger.kernel.org
> > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
...
> > +		struct fsxattr __user *, ufsx, size_t, usize,
> > +		unsigned int, at_flags)
> > +{
> > +	struct fileattr fa = {};
> > +	struct path filepath;
> > +	int error;
> > +	unsigned int lookup_flags = 0;
> > +	struct filename *name;
> > +	struct fsxattr fsx = {};
> > +
> > +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > +
> > +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > +		return -EINVAL;
> > +
> > +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > +		lookup_flags |= LOOKUP_FOLLOW;
> > +
> > +	if (at_flags & AT_EMPTY_PATH)
> > +		lookup_flags |= LOOKUP_EMPTY;
> > +
> > +	if (usize > PAGE_SIZE)
> > +		return -E2BIG;
> > +
> > +	if (usize < FSXATTR_SIZE_VER0)
> > +		return -EINVAL;
> > +
> > +	name = getname_maybe_null(filename, at_flags);
> > +	if (!name) {
> 
> This is broken as it doesn't handle AT_FDCWD correctly. You need:
> 
>         name = getname_maybe_null(filename, at_flags);
>         if (IS_ERR(name))
>                 return PTR_ERR(name);
> 
>         if (!name && dfd >= 0) {
> 		CLASS(fd, f)(dfd);

Ah, you're indeed right that if dfd == AT_FDCWD and filename == NULL, the
we should operate on cwd but we'd bail with error here. I've missed that
during my review. But as far as I've checked the same bug is there in
path_setxattrat() and path_getxattrat() so we should fix this there as
well?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-23  9:53     ` Jan Kara
@ 2025-04-24  9:06       ` Christian Brauner
  2025-04-24 17:45         ` Andrey Albershteyn
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2025-04-24  9:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrey Albershteyn, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Mickaël Salaün, Günther Noack, Arnd Bergmann,
	Pali Rohár, Paul Moore, James Morris, Serge E. Hallyn,
	linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, linux-xfs

On Wed, Apr 23, 2025 at 11:53:25AM +0200, Jan Kara wrote:
> On Tue 22-04-25 16:59:02, Christian Brauner wrote:
> > On Fri, Mar 21, 2025 at 08:48:42PM +0100, Andrey Albershteyn wrote:
> > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > 
> > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > extended attributes/flags. The syscalls take parent directory fd and
> > > path to the child together with struct fsxattr.
> > > 
> > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > that file don't need to be open as we can reference it with a path
> > > instead of fd. By having this we can manipulated inode extended
> > > attributes not only on regular files but also on special ones. This
> > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > we can not call ioctl() directly on the filesystem inode using fd.
> > > 
> > > This patch adds two new syscalls which allows userspace to get/set
> > > extended inode attributes on special files by using parent directory
> > > and a path - *at() like syscall.
> > > 
> > > CC: linux-api@vger.kernel.org
> > > CC: linux-fsdevel@vger.kernel.org
> > > CC: linux-xfs@vger.kernel.org
> > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> ...
> > > +		struct fsxattr __user *, ufsx, size_t, usize,
> > > +		unsigned int, at_flags)
> > > +{
> > > +	struct fileattr fa = {};
> > > +	struct path filepath;
> > > +	int error;
> > > +	unsigned int lookup_flags = 0;
> > > +	struct filename *name;
> > > +	struct fsxattr fsx = {};
> > > +
> > > +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > +
> > > +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > +		return -EINVAL;
> > > +
> > > +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > +		lookup_flags |= LOOKUP_FOLLOW;
> > > +
> > > +	if (at_flags & AT_EMPTY_PATH)
> > > +		lookup_flags |= LOOKUP_EMPTY;
> > > +
> > > +	if (usize > PAGE_SIZE)
> > > +		return -E2BIG;
> > > +
> > > +	if (usize < FSXATTR_SIZE_VER0)
> > > +		return -EINVAL;
> > > +
> > > +	name = getname_maybe_null(filename, at_flags);
> > > +	if (!name) {
> > 
> > This is broken as it doesn't handle AT_FDCWD correctly. You need:
> > 
> >         name = getname_maybe_null(filename, at_flags);
> >         if (IS_ERR(name))
> >                 return PTR_ERR(name);
> > 
> >         if (!name && dfd >= 0) {
> > 		CLASS(fd, f)(dfd);
> 
> Ah, you're indeed right that if dfd == AT_FDCWD and filename == NULL, the
> we should operate on cwd but we'd bail with error here. I've missed that
> during my review. But as far as I've checked the same bug is there in
> path_setxattrat() and path_getxattrat() so we should fix this there as
> well?

Yes, please!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-24  9:06       ` Christian Brauner
@ 2025-04-24 17:45         ` Andrey Albershteyn
  0 siblings, 0 replies; 28+ messages in thread
From: Andrey Albershteyn @ 2025-04-24 17:45 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Mickaël Salaün, Günther Noack, Arnd Bergmann,
	Pali Rohár, Paul Moore, James Morris, Serge E. Hallyn,
	linux-alpha, linux-kernel, linux-arm-kernel, linux-m68k,
	linux-mips, linux-parisc, linuxppc-dev, linux-s390, linux-sh,
	sparclinux, linux-fsdevel, linux-security-module, linux-api,
	linux-arch, linux-xfs

On 2025-04-24 11:06:07, Christian Brauner wrote:
> On Wed, Apr 23, 2025 at 11:53:25AM +0200, Jan Kara wrote:
> > On Tue 22-04-25 16:59:02, Christian Brauner wrote:
> > > On Fri, Mar 21, 2025 at 08:48:42PM +0100, Andrey Albershteyn wrote:
> > > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > > 
> > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > > extended attributes/flags. The syscalls take parent directory fd and
> > > > path to the child together with struct fsxattr.
> > > > 
> > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > > that file don't need to be open as we can reference it with a path
> > > > instead of fd. By having this we can manipulated inode extended
> > > > attributes not only on regular files but also on special ones. This
> > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > > we can not call ioctl() directly on the filesystem inode using fd.
> > > > 
> > > > This patch adds two new syscalls which allows userspace to get/set
> > > > extended inode attributes on special files by using parent directory
> > > > and a path - *at() like syscall.
> > > > 
> > > > CC: linux-api@vger.kernel.org
> > > > CC: linux-fsdevel@vger.kernel.org
> > > > CC: linux-xfs@vger.kernel.org
> > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > ...
> > > > +		struct fsxattr __user *, ufsx, size_t, usize,
> > > > +		unsigned int, at_flags)
> > > > +{
> > > > +	struct fileattr fa = {};
> > > > +	struct path filepath;
> > > > +	int error;
> > > > +	unsigned int lookup_flags = 0;
> > > > +	struct filename *name;
> > > > +	struct fsxattr fsx = {};
> > > > +
> > > > +	BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > > +	BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > > +
> > > > +	if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > > +		lookup_flags |= LOOKUP_FOLLOW;
> > > > +
> > > > +	if (at_flags & AT_EMPTY_PATH)
> > > > +		lookup_flags |= LOOKUP_EMPTY;
> > > > +
> > > > +	if (usize > PAGE_SIZE)
> > > > +		return -E2BIG;
> > > > +
> > > > +	if (usize < FSXATTR_SIZE_VER0)
> > > > +		return -EINVAL;
> > > > +
> > > > +	name = getname_maybe_null(filename, at_flags);
> > > > +	if (!name) {
> > > 
> > > This is broken as it doesn't handle AT_FDCWD correctly. You need:
> > > 
> > >         name = getname_maybe_null(filename, at_flags);
> > >         if (IS_ERR(name))
> > >                 return PTR_ERR(name);
> > > 
> > >         if (!name && dfd >= 0) {
> > > 		CLASS(fd, f)(dfd);
> > 
> > Ah, you're indeed right that if dfd == AT_FDCWD and filename == NULL, the
> > we should operate on cwd but we'd bail with error here. I've missed that
> > during my review. But as far as I've checked the same bug is there in
> > path_setxattrat() and path_getxattrat() so we should fix this there as
> > well?
> 
> Yes, please!
> 

Thanks for the review, Christian. I will fix issues you noticed as
suggested. I see that Jan already sent fix for path_[s|g]etxattrat()
so won't do anything here.

-- 
- Andrey


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-22 15:14           ` Christian Brauner
@ 2025-04-25 18:16             ` Andrey Albershteyn
  2025-04-28  9:17               ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Andrey Albershteyn @ 2025-04-25 18:16 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Amir Goldstein, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On 2025-04-22 17:14:10, Christian Brauner wrote:
> On Tue, Apr 22, 2025 at 04:31:29PM +0200, Christian Brauner wrote:
> > On Thu, Mar 27, 2025 at 12:39:28PM +0100, Amir Goldstein wrote:
> > > On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > >
> > > > On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > > > > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > >
> > > > > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > > > >
> > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > > > > extended attributes/flags. The syscalls take parent directory fd and
> > > > > > path to the child together with struct fsxattr.
> > > > > >
> > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > > > > that file don't need to be open as we can reference it with a path
> > > > > > instead of fd. By having this we can manipulated inode extended
> > > > > > attributes not only on regular files but also on special ones. This
> > > > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > > > > we can not call ioctl() directly on the filesystem inode using fd.
> > > > > >
> > > > > > This patch adds two new syscalls which allows userspace to get/set
> > > > > > extended inode attributes on special files by using parent directory
> > > > > > and a path - *at() like syscall.
> > > > > >
> > > > > > CC: linux-api@vger.kernel.org
> > > > > > CC: linux-fsdevel@vger.kernel.org
> > > > > > CC: linux-xfs@vger.kernel.org
> > > > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > > > ---
> > > > > ...
> > > > > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > > > > +               struct fsxattr __user *, ufsx, size_t, usize,
> > > > > > +               unsigned int, at_flags)
> > > > > > +{
> > > > > > +       struct fileattr fa;
> > > > > > +       struct path filepath;
> > > > > > +       int error;
> > > > > > +       unsigned int lookup_flags = 0;
> > > > > > +       struct filename *name;
> > > > > > +       struct mnt_idmap *idmap;.
> > > > >
> > > > > > +       struct dentry *dentry;
> > > > > > +       struct vfsmount *mnt;
> > > > > > +       struct fsxattr fsx = {};
> > > > > > +
> > > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > > > > +
> > > > > > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > > > > +               lookup_flags |= LOOKUP_FOLLOW;
> > > > > > +
> > > > > > +       if (at_flags & AT_EMPTY_PATH)
> > > > > > +               lookup_flags |= LOOKUP_EMPTY;
> > > > > > +
> > > > > > +       if (usize > PAGE_SIZE)
> > > > > > +               return -E2BIG;
> > > > > > +
> > > > > > +       if (usize < FSXATTR_SIZE_VER0)
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > > > > +       if (error)
> > > > > > +               return error;
> > > > > > +
> > > > > > +       fsxattr_to_fileattr(&fsx, &fa);
> > > > > > +
> > > > > > +       name = getname_maybe_null(filename, at_flags);
> > > > > > +       if (!name) {
> > > > > > +               CLASS(fd, f)(dfd);
> > > > > > +
> > > > > > +               if (fd_empty(f))
> > > > > > +                       return -EBADF;
> > > > > > +
> > > > > > +               idmap = file_mnt_idmap(fd_file(f));
> > > > > > +               dentry = file_dentry(fd_file(f));
> > > > > > +               mnt = fd_file(f)->f_path.mnt;
> > > > > > +       } else {
> > > > > > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > > > > +                                       NULL);
> > > > > > +               if (error)
> > > > > > +                       return error;
> > > > > > +
> > > > > > +               idmap = mnt_idmap(filepath.mnt);
> > > > > > +               dentry = filepath.dentry;
> > > > > > +               mnt = filepath.mnt;
> > > > > > +       }
> > > > > > +
> > > > > > +       error = mnt_want_write(mnt);
> > > > > > +       if (!error) {
> > > > > > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > > > > > +               if (error == -ENOIOCTLCMD)
> > > > > > +                       error = -EOPNOTSUPP;
> > > > >
> > > > > This is awkward.
> > > > > vfs_fileattr_set() should return -EOPNOTSUPP.
> > > > > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > > > > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > > > > ioctl returns -EOPNOTSUPP.
> > > > >
> > > > > I don't think it is necessarily a bad idea to start returning
> > > > >  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > > > > because that really reflects the fact that the ioctl is now implemented
> > > > > in vfs and not in the specific fs.
> > > > >
> > > > > and I think it would not be a bad idea at all to make that change
> > > > > together with the merge of the syscalls as a sort of hint to userspace
> > > > > that uses the ioctl, that the sycalls API exists.
> > > > >
> > > > > Thanks,
> > > > > Amir.
> > > > >
> > > >
> > > > Hmm, not sure what you're suggesting here. I see it as:
> > > > - get/setfsxattrat should return EOPNOTSUPP as it make more sense
> > > >   than ENOIOCTLCMD
> > > > - ioctl_setflags returns ENOIOCTLCMD which also expected
> > > >
> > > > Don't really see a reason to change what vfs_fileattr_set() returns
> > > > and then copying this if() to other places or start returning
> > > > EOPNOTSUPP.
> > > 
> > > ENOIOCTLCMD conceptually means that the ioctl command is unknown
> > > This is not the case since ->fileattr_[gs]et() became a vfs API
> > 
> > vfs_fileattr_{g,s}et() should not return ENOIOCTLCMD. Change the return
> > code to EOPNOTSUPP and then make EOPNOTSUPP be translated to ENOTTY on
> > on overlayfs and to ENOIOCTLCMD in ecryptfs and in fs/ioctl.c. This way
> > we get a clean VFS api while retaining current behavior. Amir can do his
> > cleanup based on that.
> 
> Also this get/set dance is not something new apis should do. It should
> be handled like setattr_prepare() or generic_fillattr() where the
> filesystem calls a VFS helper and that does all of this based on the
> current state of the inode instead of calling into the filesystem twice:
> 
> int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> 		     struct fileattr *fa)
> {
> <snip>
> 	inode_lock(inode);
> 	err = vfs_fileattr_get(dentry, &old_ma);
> 	if (!err) {
> 		/* initialize missing bits from old_ma */
> 		if (fa->flags_valid) {
> <snip>
> 		err = fileattr_set_prepare(inode, &old_ma, fa);
> 		if (!err && !security_inode_setfsxattr(inode, fa))
> 			err = inode->i_op->fileattr_set(idmap, dentry, fa);
> 

You mean something like this? (not all fs are done)

-- 

From 421445f054ccad3116d55ae22c8995a48bb753fd Mon Sep 17 00:00:00 2001
From: Andrey Albershteyn <aalbersh@kernel.org>
Date: Fri, 25 Apr 2025 17:20:42 +0200
Subject: [PATCH] fs: push retrieval of fileattr down to filesystems

Currently, vfs_fileattr_set() calls twice to the file system. Firstly,
to retrieve current state of the inode extended attributes and secondly
to set the new ones.

This patch refactors this in a way that filesystem firstly gets current
inode attribute state and then calls VFS helper to verify them. This way
vfs_fileattr_set() will call filesystem just once.

Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
---
 fs/ext2/ioctl.c          |  9 ++++++
 fs/ext4/ioctl.c          |  9 ++++++
 fs/f2fs/file.c           | 12 +++++++-
 fs/file_attr.c           | 62 ++++++++++++++++++++++++----------------
 fs/gfs2/file.c           |  9 ++++++
 fs/hfsplus/inode.c       |  9 ++++++
 fs/jfs/ioctl.c           |  9 +++++-
 fs/ntfs3/file.c          | 12 +++++++-
 fs/orangefs/inode.c      |  9 ++++++
 fs/ubifs/ioctl.c         | 12 +++++++-
 fs/xfs/xfs_ioctl.c       |  6 ++++
 include/linux/fileattr.h |  2 ++
 mm/shmem.c               |  8 ++++++
 13 files changed, 140 insertions(+), 28 deletions(-)

diff --git a/fs/ext2/ioctl.c b/fs/ext2/ioctl.c
index 44e04484e570..3a45ed9c12b7 100644
--- a/fs/ext2/ioctl.c
+++ b/fs/ext2/ioctl.c
@@ -32,6 +32,15 @@ int ext2_fileattr_set(struct mnt_idmap *idmap,
 {
 	struct inode *inode = d_inode(dentry);
 	struct ext2_inode_info *ei = EXT2_I(inode);
+	struct fileattr cfa;
+	int err;
+
+	err = ext2_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
 
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index d17207386ead..f988ff4d7256 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -1002,6 +1002,15 @@ int ext4_fileattr_set(struct mnt_idmap *idmap,
 	struct inode *inode = d_inode(dentry);
 	u32 flags = fa->flags;
 	int err = -EOPNOTSUPP;
+	struct fileattr cfa;
+
+	err = ext4_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
 
 	if (flags & ~EXT4_FL_USER_VISIBLE)
 		goto out;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index abbcbb5865a3..f196a07f1f17 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3371,14 +3371,24 @@ int f2fs_fileattr_set(struct mnt_idmap *idmap,
 		      struct dentry *dentry, struct fileattr *fa)
 {
 	struct inode *inode = d_inode(dentry);
-	u32 fsflags = fa->flags, mask = F2FS_SETTABLE_FS_FL;
+	u32 fsflags, mask = F2FS_SETTABLE_FS_FL;
 	u32 iflags;
+	struct fileattr cfa;
 	int err;
 
 	if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
 		return -EIO;
 	if (!f2fs_is_checkpoint_ready(F2FS_I_SB(inode)))
 		return -ENOSPC;
+
+	err = f2fs_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
+	fsflags = fa->flags;
+
 	if (fsflags & ~F2FS_GETTABLE_FS_FL)
 		return -EOPNOTSUPP;
 	fsflags &= F2FS_SETTABLE_FS_FL;
diff --git a/fs/file_attr.c b/fs/file_attr.c
index 5e51c5b851ef..d0a01377bca8 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -7,6 +7,8 @@
 #include <linux/fileattr.h>
 #include <linux/namei.h>
 
+#include "internal.h"
+
 /**
  * fileattr_fill_xflags - initialize fileattr with xflags
  * @fa:		fileattr pointer
@@ -225,6 +227,36 @@ static int fileattr_set_prepare(struct inode *inode,
 	return 0;
 }
 
+/**
+ * vfs_fileattr_set_prepare - merge new filettr state and check for validity
+ * @idmap:	idmap of the mount
+ * @dentry:	the object to change
+ * @cfa:	current fileattr state
+ * @fa:		fileattr pointer with new values
+ *
+ * Return: 0 on success, or a negative error on failure.
+ */
+int vfs_fileattr_set_prepare(struct mnt_idmap *idmap, struct dentry *dentry,
+			     struct fileattr *cfa, struct fileattr *fa)
+{
+	int err;
+
+	/* initialize missing bits from cfa */
+	if (fa->flags_valid) {
+		fa->fsx_xflags |= cfa->fsx_xflags & ~FS_XFLAG_COMMON;
+		fa->fsx_extsize = cfa->fsx_extsize;
+		fa->fsx_nextents = cfa->fsx_nextents;
+		fa->fsx_projid = cfa->fsx_projid;
+		fa->fsx_cowextsize = cfa->fsx_cowextsize;
+	} else {
+		fa->flags |= cfa->flags & ~FS_COMMON_FL;
+	}
+
+	err = fileattr_set_prepare(d_inode(dentry), cfa, fa);
+	return err;
+}
+EXPORT_SYMBOL(vfs_fileattr_set_prepare);
+
 /**
  * vfs_fileattr_set - change miscellaneous file attributes
  * @idmap:	idmap of the mount
@@ -245,7 +277,6 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 		     struct fileattr *fa)
 {
 	struct inode *inode = d_inode(dentry);
-	struct fileattr old_ma = {};
 	int err;
 
 	if (!inode->i_op->fileattr_set)
@@ -255,29 +286,12 @@ int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 		return -EPERM;
 
 	inode_lock(inode);
-	err = vfs_fileattr_get(dentry, &old_ma);
-	if (!err) {
-		/* initialize missing bits from old_ma */
-		if (fa->flags_valid) {
-			fa->fsx_xflags |= old_ma.fsx_xflags & ~FS_XFLAG_COMMON;
-			fa->fsx_extsize = old_ma.fsx_extsize;
-			fa->fsx_nextents = old_ma.fsx_nextents;
-			fa->fsx_projid = old_ma.fsx_projid;
-			fa->fsx_cowextsize = old_ma.fsx_cowextsize;
-		} else {
-			fa->flags |= old_ma.flags & ~FS_COMMON_FL;
-		}
-
-		err = fileattr_set_prepare(inode, &old_ma, fa);
-		if (err)
-			goto out;
-		err = security_inode_file_setattr(dentry, fa);
-		if (err)
-			goto out;
-		err = inode->i_op->fileattr_set(idmap, dentry, fa);
-		if (err)
-			goto out;
-	}
+	err = security_inode_file_setattr(dentry, fa);
+	if (err)
+		goto out;
+	err = inode->i_op->fileattr_set(idmap, dentry, fa);
+	if (err)
+		goto out;
 
 out:
 	inode_unlock(inode);
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index fd1147aa3891..cf796fa73af2 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -282,10 +282,19 @@ int gfs2_fileattr_set(struct mnt_idmap *idmap,
 	u32 fsflags = fa->flags, gfsflags = 0;
 	u32 mask;
 	int i;
+	struct fileattr cfa;
+	int error;
 
 	if (d_is_special(dentry))
 		return -ENOTTY;
 
+	error = gfs2_fileattr_get(dentry, &cfa);
+	if (error)
+		return error;
+	error = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (error)
+		return error;
+
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
 
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index f331e9574217..cdb11d00faea 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -678,6 +678,15 @@ int hfsplus_fileattr_set(struct mnt_idmap *idmap,
 	struct inode *inode = d_inode(dentry);
 	struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
 	unsigned int new_fl = 0;
+	struct fileattr cfa;
+	int err;
+
+	err = hfsplus_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
 
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
diff --git a/fs/jfs/ioctl.c b/fs/jfs/ioctl.c
index f7bd7e8f5be4..4c62c14d15b0 100644
--- a/fs/jfs/ioctl.c
+++ b/fs/jfs/ioctl.c
@@ -75,11 +75,18 @@ int jfs_fileattr_set(struct mnt_idmap *idmap,
 {
 	struct inode *inode = d_inode(dentry);
 	struct jfs_inode_info *jfs_inode = JFS_IP(inode);
-	unsigned int flags;
+	unsigned int flags = jfs_inode->mode2 & JFS_FL_USER_VISIBLE;
+	struct fileattr cfa;
+	int err;
 
 	if (d_is_special(dentry))
 		return -ENOTTY;
 
+	fileattr_fill_flags(&cfa, jfs_map_ext2(flags, 0));
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
+
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
 
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 9b6a3f8d2e7c..bc7ee7595b70 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -83,12 +83,22 @@ int ntfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 {
 	struct inode *inode = d_inode(dentry);
 	struct ntfs_inode *ni = ntfs_i(inode);
-	u32 flags = fa->flags;
+	u32 flags;
 	unsigned int new_fl = 0;
+	struct fileattr cfa;
+	int err;
+
+	err = ntfs_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
 
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
 
+	flags = fa->flags;
 	if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | FS_COMPR_FL))
 		return -EOPNOTSUPP;
 
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 5ac743c6bc2e..aecb61146443 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -910,6 +910,15 @@ static int orangefs_fileattr_set(struct mnt_idmap *idmap,
 				 struct dentry *dentry, struct fileattr *fa)
 {
 	u64 val = 0;
+	struct fileattr cfa;
+	int error = 0;
+
+	error = orangefs_fileattr_get(dentry, &cfa);
+	if (error)
+		return error;
+	error = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (error)
+		return error;
 
 	gossip_debug(GOSSIP_FILE_DEBUG, "%s: called on %pd\n", __func__,
 		     dentry);
diff --git a/fs/ubifs/ioctl.c b/fs/ubifs/ioctl.c
index 2c99349cf537..e71e362c786b 100644
--- a/fs/ubifs/ioctl.c
+++ b/fs/ubifs/ioctl.c
@@ -148,14 +148,24 @@ int ubifs_fileattr_set(struct mnt_idmap *idmap,
 		       struct dentry *dentry, struct fileattr *fa)
 {
 	struct inode *inode = d_inode(dentry);
-	int flags = fa->flags;
+	int flags;
+	struct fileattr cfa;
+	int err;
 
 	if (d_is_special(dentry))
 		return -ENOTTY;
 
+	err = ubifs_fileattr_get(dentry, &cfa);
+	if (err)
+		return err;
+	err = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (err)
+		return err;
+
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
 
+	flags = fa->flags;
 	if (flags & ~UBIFS_GETTABLE_IOCTL_FLAGS)
 		return -EOPNOTSUPP;
 
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index d250f7f74e3b..c861dc1c3cf0 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -733,12 +733,18 @@ xfs_fileattr_set(
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_dquot	*olddquot = NULL;
 	int			error;
+	struct fileattr		cfa;
 
 	trace_xfs_ioctl_setattr(ip);
 
 	if (d_is_special(dentry))
 		return -ENOTTY;
 
+	xfs_fill_fsxattr(ip, XFS_DATA_FORK, &cfa);
+	error = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (error)
+		return error;
+
 	if (!fa->fsx_valid) {
 		if (fa->flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL |
 				  FS_NOATIME_FL | FS_NODUMP_FL |
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index f62a5143eb2d..aba76d897533 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -75,6 +75,8 @@ static inline bool fileattr_has_fsx(const struct fileattr *fa)
 }
 
 int vfs_fileattr_get(struct dentry *dentry, struct fileattr *fa);
+int vfs_fileattr_set_prepare(struct mnt_idmap *idmap, struct dentry *dentry,
+			     struct fileattr *cfa, struct fileattr *fa);
 int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
 		     struct fileattr *fa);
 int ioctl_getflags(struct file *file, unsigned int __user *argp);
diff --git a/mm/shmem.c b/mm/shmem.c
index 99327c30507c..c2a5991f944f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -4199,6 +4199,14 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap,
 	struct inode *inode = d_inode(dentry);
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	int ret, flags;
+	struct fileattr cfa;
+
+	ret = shmem_fileattr_get(dentry, &cfa);
+	if (ret)
+		return ret;
+	ret = vfs_fileattr_set_prepare(idmap, dentry, &cfa, fa);
+	if (ret)
+		return ret;
 
 	if (fileattr_has_fsx(fa))
 		return -EOPNOTSUPP;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls
  2025-04-25 18:16             ` Andrey Albershteyn
@ 2025-04-28  9:17               ` Christian Brauner
  0 siblings, 0 replies; 28+ messages in thread
From: Christian Brauner @ 2025-04-28  9:17 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Amir Goldstein, Richard Henderson, Matt Turner, Russell King,
	Catalin Marinas, Will Deacon, Geert Uytterhoeven, Michal Simek,
	Thomas Bogendoerfer, James E.J. Bottomley, Helge Deller,
	Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
	David S. Miller, Andreas Larsson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Chris Zankel, Max Filippov, Alexander Viro,
	Jan Kara, Mickaël Salaün, Günther Noack,
	Arnd Bergmann, Pali Rohár, Paul Moore, James Morris,
	Serge E. Hallyn, linux-alpha, linux-kernel, linux-arm-kernel,
	linux-m68k, linux-mips, linux-parisc, linuxppc-dev, linux-s390,
	linux-sh, sparclinux, linux-fsdevel, linux-security-module,
	linux-api, linux-arch, linux-xfs

On Fri, Apr 25, 2025 at 08:16:48PM +0200, Andrey Albershteyn wrote:
> On 2025-04-22 17:14:10, Christian Brauner wrote:
> > On Tue, Apr 22, 2025 at 04:31:29PM +0200, Christian Brauner wrote:
> > > On Thu, Mar 27, 2025 at 12:39:28PM +0100, Amir Goldstein wrote:
> > > > On Thu, Mar 27, 2025 at 10:33 AM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > >
> > > > > On 2025-03-23 09:56:25, Amir Goldstein wrote:
> > > > > > On Fri, Mar 21, 2025 at 8:49 PM Andrey Albershteyn <aalbersh@redhat.com> wrote:
> > > > > > >
> > > > > > > From: Andrey Albershteyn <aalbersh@redhat.com>
> > > > > > >
> > > > > > > Introduce getfsxattrat and setfsxattrat syscalls to manipulate inode
> > > > > > > extended attributes/flags. The syscalls take parent directory fd and
> > > > > > > path to the child together with struct fsxattr.
> > > > > > >
> > > > > > > This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
> > > > > > > that file don't need to be open as we can reference it with a path
> > > > > > > instead of fd. By having this we can manipulated inode extended
> > > > > > > attributes not only on regular files but also on special ones. This
> > > > > > > is not possible with FS_IOC_FSSETXATTR ioctl as with special files
> > > > > > > we can not call ioctl() directly on the filesystem inode using fd.
> > > > > > >
> > > > > > > This patch adds two new syscalls which allows userspace to get/set
> > > > > > > extended inode attributes on special files by using parent directory
> > > > > > > and a path - *at() like syscall.
> > > > > > >
> > > > > > > CC: linux-api@vger.kernel.org
> > > > > > > CC: linux-fsdevel@vger.kernel.org
> > > > > > > CC: linux-xfs@vger.kernel.org
> > > > > > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > > > > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > > > > ---
> > > > > > ...
> > > > > > > +SYSCALL_DEFINE5(setfsxattrat, int, dfd, const char __user *, filename,
> > > > > > > +               struct fsxattr __user *, ufsx, size_t, usize,
> > > > > > > +               unsigned int, at_flags)
> > > > > > > +{
> > > > > > > +       struct fileattr fa;
> > > > > > > +       struct path filepath;
> > > > > > > +       int error;
> > > > > > > +       unsigned int lookup_flags = 0;
> > > > > > > +       struct filename *name;
> > > > > > > +       struct mnt_idmap *idmap;.
> > > > > >
> > > > > > > +       struct dentry *dentry;
> > > > > > > +       struct vfsmount *mnt;
> > > > > > > +       struct fsxattr fsx = {};
> > > > > > > +
> > > > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) < FSXATTR_SIZE_VER0);
> > > > > > > +       BUILD_BUG_ON(sizeof(struct fsxattr) != FSXATTR_SIZE_LATEST);
> > > > > > > +
> > > > > > > +       if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
> > > > > > > +               return -EINVAL;
> > > > > > > +
> > > > > > > +       if (!(at_flags & AT_SYMLINK_NOFOLLOW))
> > > > > > > +               lookup_flags |= LOOKUP_FOLLOW;
> > > > > > > +
> > > > > > > +       if (at_flags & AT_EMPTY_PATH)
> > > > > > > +               lookup_flags |= LOOKUP_EMPTY;
> > > > > > > +
> > > > > > > +       if (usize > PAGE_SIZE)
> > > > > > > +               return -E2BIG;
> > > > > > > +
> > > > > > > +       if (usize < FSXATTR_SIZE_VER0)
> > > > > > > +               return -EINVAL;
> > > > > > > +
> > > > > > > +       error = copy_struct_from_user(&fsx, sizeof(struct fsxattr), ufsx, usize);
> > > > > > > +       if (error)
> > > > > > > +               return error;
> > > > > > > +
> > > > > > > +       fsxattr_to_fileattr(&fsx, &fa);
> > > > > > > +
> > > > > > > +       name = getname_maybe_null(filename, at_flags);
> > > > > > > +       if (!name) {
> > > > > > > +               CLASS(fd, f)(dfd);
> > > > > > > +
> > > > > > > +               if (fd_empty(f))
> > > > > > > +                       return -EBADF;
> > > > > > > +
> > > > > > > +               idmap = file_mnt_idmap(fd_file(f));
> > > > > > > +               dentry = file_dentry(fd_file(f));
> > > > > > > +               mnt = fd_file(f)->f_path.mnt;
> > > > > > > +       } else {
> > > > > > > +               error = filename_lookup(dfd, name, lookup_flags, &filepath,
> > > > > > > +                                       NULL);
> > > > > > > +               if (error)
> > > > > > > +                       return error;
> > > > > > > +
> > > > > > > +               idmap = mnt_idmap(filepath.mnt);
> > > > > > > +               dentry = filepath.dentry;
> > > > > > > +               mnt = filepath.mnt;
> > > > > > > +       }
> > > > > > > +
> > > > > > > +       error = mnt_want_write(mnt);
> > > > > > > +       if (!error) {
> > > > > > > +               error = vfs_fileattr_set(idmap, dentry, &fa);
> > > > > > > +               if (error == -ENOIOCTLCMD)
> > > > > > > +                       error = -EOPNOTSUPP;
> > > > > >
> > > > > > This is awkward.
> > > > > > vfs_fileattr_set() should return -EOPNOTSUPP.
> > > > > > ioctl_setflags() could maybe convert it to -ENOIOCTLCMD,
> > > > > > but looking at similar cases ioctl_fiemap(), ioctl_fsfreeze() the
> > > > > > ioctl returns -EOPNOTSUPP.
> > > > > >
> > > > > > I don't think it is necessarily a bad idea to start returning
> > > > > >  -EOPNOTSUPP instead of -ENOIOCTLCMD for the ioctl
> > > > > > because that really reflects the fact that the ioctl is now implemented
> > > > > > in vfs and not in the specific fs.
> > > > > >
> > > > > > and I think it would not be a bad idea at all to make that change
> > > > > > together with the merge of the syscalls as a sort of hint to userspace
> > > > > > that uses the ioctl, that the sycalls API exists.
> > > > > >
> > > > > > Thanks,
> > > > > > Amir.
> > > > > >
> > > > >
> > > > > Hmm, not sure what you're suggesting here. I see it as:
> > > > > - get/setfsxattrat should return EOPNOTSUPP as it make more sense
> > > > >   than ENOIOCTLCMD
> > > > > - ioctl_setflags returns ENOIOCTLCMD which also expected
> > > > >
> > > > > Don't really see a reason to change what vfs_fileattr_set() returns
> > > > > and then copying this if() to other places or start returning
> > > > > EOPNOTSUPP.
> > > > 
> > > > ENOIOCTLCMD conceptually means that the ioctl command is unknown
> > > > This is not the case since ->fileattr_[gs]et() became a vfs API
> > > 
> > > vfs_fileattr_{g,s}et() should not return ENOIOCTLCMD. Change the return
> > > code to EOPNOTSUPP and then make EOPNOTSUPP be translated to ENOTTY on
> > > on overlayfs and to ENOIOCTLCMD in ecryptfs and in fs/ioctl.c. This way
> > > we get a clean VFS api while retaining current behavior. Amir can do his
> > > cleanup based on that.
> > 
> > Also this get/set dance is not something new apis should do. It should
> > be handled like setattr_prepare() or generic_fillattr() where the
> > filesystem calls a VFS helper and that does all of this based on the
> > current state of the inode instead of calling into the filesystem twice:
> > 
> > int vfs_fileattr_set(struct mnt_idmap *idmap, struct dentry *dentry,
> > 		     struct fileattr *fa)
> > {
> > <snip>
> > 	inode_lock(inode);
> > 	err = vfs_fileattr_get(dentry, &old_ma);
> > 	if (!err) {
> > 		/* initialize missing bits from old_ma */
> > 		if (fa->flags_valid) {
> > <snip>
> > 		err = fileattr_set_prepare(inode, &old_ma, fa);
> > 		if (!err && !security_inode_setfsxattr(inode, fa))
> > 			err = inode->i_op->fileattr_set(idmap, dentry, fa);
> > 
> 
> You mean something like this? (not all fs are done)

Yes, possibly. But don't bother with this now as that'll need some more
thinking and it'll just stall your work for no good reason. Let's just
get the syscalls in mergable shape now.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-04-28  9:17 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-21 19:48 [PATCH v4 0/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
2025-03-21 19:48 ` [PATCH v4 1/3] lsm: introduce new hooks for setting/getting inode fsxattr Andrey Albershteyn
2025-03-21 21:32   ` Paul Moore
2025-03-24 19:27     ` Mickaël Salaün
2025-03-27  9:19       ` Andrey Albershteyn
2025-03-24 19:21   ` Mickaël Salaün
2025-03-21 19:48 ` [PATCH v4 2/3] fs: split fileattr/fsxattr converters into helpers Andrey Albershteyn
2025-03-27 12:32   ` Jan Kara
2025-03-21 19:48 ` [PATCH v4 3/3] fs: introduce getfsxattrat and setfsxattrat syscalls Andrey Albershteyn
2025-03-23  8:56   ` Amir Goldstein
2025-03-27  9:33     ` Andrey Albershteyn
2025-03-27 11:39       ` Amir Goldstein
2025-04-22 14:31         ` Christian Brauner
2025-04-22 15:14           ` Christian Brauner
2025-04-25 18:16             ` Andrey Albershteyn
2025-04-28  9:17               ` Christian Brauner
2025-03-27 12:31   ` Jan Kara
2025-04-22 14:59   ` Christian Brauner
2025-04-23  9:53     ` Jan Kara
2025-04-24  9:06       ` Christian Brauner
2025-04-24 17:45         ` Andrey Albershteyn
2025-03-23  8:45 ` [PATCH v4 0/3] " Amir Goldstein
2025-03-23 10:32   ` Pali Rohár
2025-03-27 11:47     ` Amir Goldstein
2025-03-27 19:26       ` Pali Rohár
2025-03-27 20:57         ` Amir Goldstein
2025-03-27 21:13           ` Pali Rohár
2025-03-28 14:09             ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).