[PATCH RFC 0/13] fs: generic filesystem shutdown handling

linux-fsdevel.vger.kernel.org archive mirror

* [PATCH RFC 0/13] fs: generic filesystem shutdown handling
@ 2024-08-07 18:29 Jan Kara
  2024-08-07 18:29 ` [PATCH 01/13] fs: Define bit numbers for SB_I_ flags Jan Kara
                   ` (13 more replies)
  0 siblings, 14 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Hello,

this patch series implements generic handling of filesystem shutdown. The idea
is very simple: have a superblock flag which, when set, makes the VFS refuse
modifications to the filesystem. The patch series consists of several parts.
Patches 1-6 clean up the handling of SB_I_ flags, which is currently messy
(different flags seem to be protected by different locks although they are
modified by plain stores). Patches 7-12 gradually convert code to be able to
handle errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows
how filesystems can use this generic flag. Additionally, we could remove some
shutdown checks from within ext4 code and rely on the checks in the VFS, but I
didn't want to complicate the series with ext4-specific things.
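
A minimal userspace sketch of the idea described above, not code from this
series: once a shutdown bit is set, the "start write" step fails and the
caller backs out before modifying anything. All demo_* names, the
DEMO_I_SHUTDOWN bit and the choice of -EIO as the error are assumptions made
for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag bit; the real flag is introduced later in the series. */
enum { DEMO_I_SHUTDOWN };

struct demo_sb {
	atomic_ulong s_iflags;	/* stands in for sb->s_iflags */
	int writers;		/* stands in for write-access accounting */
};

/* Mark the filesystem as shut down. */
static void demo_shutdown(struct demo_sb *sb)
{
	atomic_fetch_or(&sb->s_iflags, 1UL << DEMO_I_SHUTDOWN);
}

/* Model of a start-write helper that is allowed to fail. */
static int demo_start_write(struct demo_sb *sb)
{
	if (atomic_load(&sb->s_iflags) & (1UL << DEMO_I_SHUTDOWN))
		return -EIO;	/* assumed error code */
	sb->writers++;
	return 0;
}

static void demo_end_write(struct demo_sb *sb)
{
	sb->writers--;
}

/* A caller converted to handle the error from demo_start_write(). */
static int demo_modify(struct demo_sb *sb)
{
	int err = demo_start_write(sb);

	if (err)
		return err;
	/* ... the filesystem modification would happen here ... */
	demo_end_write(sb);
	return 0;
}
```

The point of the sketch is only the calling convention: every caller of the
start-write step has to check its return value, which is what the conversion
patches in the middle of the series are about.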

Also, as Dave suggested, we can lift *_IOC_{SHUTDOWN|GOINGDOWN} ioctl handling
to the VFS (it is currently in 5 filesystems) and just call a new ->shutdown op
for the filesystem abort handling itself. But that is a fairly independent
thing and this series is long enough as is.

So what do people think?

								Honza


* [PATCH 01/13] fs: Define bit numbers for SB_I_ flags
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 02/13] fs: Convert fs_context use of SB_I_ flags to new constants Jan Kara
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

sb->s_iflags has unclear locking rules. Some users modify it under
sb_lock, some under sb->s_umount rwsem, some without any lock.  Readers
are generally not holding any locks either. The flags are rarely
modified so this does not lead to any practical problems but it is a
mess. To reconcile the situation, handle sb->s_iflags without any locks
by using atomic bit operations. Since the flags are rarely modified,
this does not introduce any noticeable performance overhead and resolves
all possible issues where different users of sb->s_iflags could step on
each other's toes. Define new constants using bit numbers and
functions to test, set, and clear the flags.
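
For readers outside the kernel tree, a rough userspace analogue of the
helpers added below, using C11 atomics in place of the kernel's set_bit(),
clear_bit() and test_bit(). This is a sketch under that substitution, not
the kernel implementation; struct demo_sb and the DEMO_I_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flags are identified by bit number, not by mask. */
enum {
	DEMO_I_CGROUPWB,	/* bit 0 */
	DEMO_I_NOEXEC,		/* bit 1 */
	DEMO_I_NODEV,		/* bit 2 */
};

struct demo_sb {
	atomic_ulong s_iflags;
};

static inline bool demo_test_iflag(struct demo_sb *sb, int flag)
{
	return (atomic_load(&sb->s_iflags) & (1UL << flag)) != 0;
}

static inline void demo_set_iflag(struct demo_sb *sb, int flag)
{
	/* Atomic read-modify-write: a concurrent update of a different
	 * bit in the same word cannot be lost, unlike a plain "|=". */
	atomic_fetch_or(&sb->s_iflags, 1UL << flag);
}

static inline void demo_clear_iflag(struct demo_sb *sb, int flag)
{
	atomic_fetch_and(&sb->s_iflags, ~(1UL << flag));
}
```

With plain stores, two CPUs doing "flags |= A" and "flags |= B" at the same
time can each read the old word and one of the updates is overwritten; the
atomic variants avoid that without requiring callers to agree on a lock.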

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/fs.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd34b5755c0b..ff45a97b39cb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1190,6 +1190,25 @@ extern int send_sigurg(struct fown_struct *fown);
 #define SB_I_RETIRED	0x00000800	/* superblock shouldn't be reused */
 #define SB_I_NOUMASK	0x00001000	/* VFS does not apply umask */
 
+enum {
+	_SB_I_CGROUPWB,		/* cgroup-aware writeback enabled */
+	_SB_I_NOEXEC,		/* Ignore executables on this fs */
+	_SB_I_NODEV,		/* Ignore devices on this fs */
+	_SB_I_STABLE_WRITES,	/* don't modify blks until WB is done */
+
+	/* sb->s_iflags to limit user namespace mounts */
+	_SB_I_USERNS_VISIBLE,	/* fstype already mounted */
+	_SB_I_IMA_UNVERIFIABLE_SIGNATURE,
+	_SB_I_UNTRUSTED_MOUNTER,
+	_SB_I_EVM_HMAC_UNSUPPORTED,
+
+	_SB_I_SKIP_SYNC,	/* Skip superblock at global sync */
+	_SB_I_PERSB_BDI,	/* has a per-sb bdi */
+	_SB_I_TS_EXPIRY_WARNED,	/* warned about timestamp range expiry */
+	_SB_I_RETIRED,		/* superblock shouldn't be reused */
+	_SB_I_NOUMASK,		/* VFS does not apply umask */
+};
+
 /* Possible states of 'frozen' field */
 enum {
 	SB_UNFROZEN = 0,		/* FS is unfrozen */
@@ -1351,6 +1370,21 @@ struct super_block {
 	struct list_head	s_inodes_wb;	/* writeback inodes */
 } __randomize_layout;
 
+static inline bool sb_test_iflag(const struct super_block *sb, int flag)
+{
+	return test_bit(flag, &sb->s_iflags);
+}
+
+static inline void sb_set_iflag(struct super_block *sb, int flag)
+{
+	set_bit(flag, &sb->s_iflags);
+}
+
+static inline void sb_clear_iflag(struct super_block *sb, int flag)
+{
+	clear_bit(flag, &sb->s_iflags);
+}
+
 static inline struct user_namespace *i_user_ns(const struct inode *inode)
 {
 	return inode->i_sb->s_user_ns;
-- 
2.35.3



* [PATCH 02/13] fs: Convert fs_context use of SB_I_ flags to new constants
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
  2024-08-07 18:29 ` [PATCH 01/13] fs: Define bit numbers for SB_I_ flags Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 03/13] fs: Convert mount_too_revealing() to new s_iflags handling functions Jan Kara
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Convert uses of SB_I_ constants in fc->s_iflags to the new bit
constants. No functional changes.
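
Since fc->s_iflags keeps holding a mask while the new constants are bit
numbers, each use shifts the bit number into mask form. A small sketch of
that relationship follows; the DEMO_* names and the helper are hypothetical.
It uses 1UL so the shift stays well-defined for bit numbers of 31 and above,
whereas the plain "1 <<" in the patch is sufficient for the small bit numbers
defined in this series.

```c
#include <assert.h>

/* Bit numbers in the same order as the first entries of the new enum. */
enum {
	DEMO_I_CGROUPWB,	/* bit 0, old mask 0x00000001 */
	DEMO_I_NOEXEC,		/* bit 1, old mask 0x00000002 */
	DEMO_I_NODEV,		/* bit 2, old mask 0x00000004 */
	DEMO_I_STABLE_WRITES,	/* bit 3, old mask 0x00000008 */
};

/* Convert a bit number to the mask stored in a flags word. */
#define DEMO_MASK(bit)	(1UL << (bit))

/* Model of "fc->s_iflags |= 1 << bit" for a plain mask field. */
static unsigned long demo_fc_add_flag(unsigned long s_iflags, int bit)
{
	return s_iflags | DEMO_MASK(bit);
}
```

Because the enum lists the flags in the same order as the old defines, the
shifted values are identical to the old masks, which is why the conversion
can claim no functional change.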

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bdev.c        | 2 +-
 fs/aio.c            | 2 +-
 fs/nfs/fs_context.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index c5507b6f63b8..c1ea2aeb93dd 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -373,7 +373,7 @@ static int bd_init_fs_context(struct fs_context *fc)
 	struct pseudo_fs_context *ctx = init_pseudo(fc, BDEVFS_MAGIC);
 	if (!ctx)
 		return -ENOMEM;
-	fc->s_iflags |= SB_I_CGROUPWB;
+	fc->s_iflags |= 1 << _SB_I_CGROUPWB;
 	ctx->ops = &bdev_sops;
 	return 0;
 }
diff --git a/fs/aio.c b/fs/aio.c
index 6066f64967b3..63ce0736c3a3 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -279,7 +279,7 @@ static int aio_init_fs_context(struct fs_context *fc)
 {
 	if (!init_pseudo(fc, AIO_RING_MAGIC))
 		return -ENOMEM;
-	fc->s_iflags |= SB_I_NOEXEC;
+	fc->s_iflags |= 1 << _SB_I_NOEXEC;
 	return 0;
 }
 
diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 6c9f3f6645dd..2fbae7e2b6ce 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -1643,7 +1643,7 @@ static int nfs_init_fs_context(struct fs_context *fc)
 		ctx->xprtsec.cert_serial	= TLS_NO_CERT;
 		ctx->xprtsec.privkey_serial	= TLS_NO_PRIVKEY;
 
-		fc->s_iflags		|= SB_I_STABLE_WRITES;
+		fc->s_iflags		|= 1 << _SB_I_STABLE_WRITES;
 	}
 	fc->fs_private = ctx;
 	fc->ops = &nfs_fs_context_ops;
-- 
2.35.3



* [PATCH 03/13] fs: Convert mount_too_revealing() to new s_iflags handling functions
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
  2024-08-07 18:29 ` [PATCH 01/13] fs: Define bit numbers for SB_I_ flags Jan Kara
  2024-08-07 18:29 ` [PATCH 02/13] fs: Convert fs_context use of SB_I_ flags to new constants Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 04/13] fs: Convert remaining usage of SB_I_ flags Jan Kara
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Convert mount_too_revealing() to use the new functions for handling
sb->s_iflags and new bit constants.
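
The conversion replaces one combined mask comparison with two single-bit
tests. A small sketch showing that the two forms agree for every value of
the flags word; the demo_* names are hypothetical and the functions operate
on a plain value rather than on a superblock.

```c
#include <assert.h>
#include <stdbool.h>

enum { DEMO_I_NOEXEC = 1, DEMO_I_NODEV = 2 };	/* bit numbers */

static bool demo_test(unsigned long flags, int bit)
{
	return (flags & (1UL << bit)) != 0;
}

/* Old form: are any of the required mask bits missing? */
static bool demo_missing_old(unsigned long flags)
{
	const unsigned long required =
		(1UL << DEMO_I_NOEXEC) | (1UL << DEMO_I_NODEV);

	return (flags & required) != required;
}

/* New form: test each required bit on its own. */
static bool demo_missing_new(unsigned long flags)
{
	return !demo_test(flags, DEMO_I_NOEXEC) ||
	       !demo_test(flags, DEMO_I_NODEV);
}
```

One difference the sketch does not model: the old code tested a single
snapshot of s_iflags, while each sb_test_iflag() call reads the word again.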

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/namespace.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 328087a4df8a..75153f61a908 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -5623,21 +5623,18 @@ static bool mnt_already_visible(struct mnt_namespace *ns,
 
 static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags)
 {
-	const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
 	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
-	unsigned long s_iflags;
 
 	if (ns->user_ns == &init_user_ns)
 		return false;
 
 	/* Can this filesystem be too revealing? */
-	s_iflags = sb->s_iflags;
-	if (!(s_iflags & SB_I_USERNS_VISIBLE))
+	if (!sb_test_iflag(sb, _SB_I_USERNS_VISIBLE))
 		return false;
 
-	if ((s_iflags & required_iflags) != required_iflags) {
-		WARN_ONCE(1, "Expected s_iflags to contain 0x%lx\n",
-			  required_iflags);
+	if (!sb_test_iflag(sb, _SB_I_NOEXEC) || !sb_test_iflag(sb, _SB_I_NODEV)) {
+		WARN_ONCE(1, "Expected s_iflags to contain SB_I_NOEXEC and "
+			  "SB_I_NODEV\n");
 		return true;
 	}
 
-- 
2.35.3



* [PATCH 04/13] fs: Convert remaining usage of SB_I_ flags
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (2 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 03/13] fs: Convert mount_too_revealing() to new s_iflags handling functions Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 05/13] fs: Drop old SB_I_ constants Jan Kara
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Convert remaining handling of sb->s_iflags to use the new helper
functions and new bit constants. The patch was generated using
coccinelle with a few manual fixups to improve code style.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/android/binderfs.c            |  4 ++--
 fs/btrfs/super.c                      |  2 +-
 fs/devpts/inode.c                     |  2 +-
 fs/exec.c                             |  2 +-
 fs/ext2/super.c                       |  2 +-
 fs/ext4/super.c                       |  2 +-
 fs/f2fs/super.c                       |  2 +-
 fs/fuse/inode.c                       |  4 ++--
 fs/inode.c                            |  2 +-
 fs/kernfs/mount.c                     |  3 ++-
 fs/namei.c                            |  2 +-
 fs/namespace.c                        |  4 ++--
 fs/nfs/super.c                        |  2 +-
 fs/overlayfs/super.c                  |  6 +++---
 fs/proc/root.c                        |  4 +++-
 fs/super.c                            | 18 +++++++++---------
 fs/sync.c                             |  2 +-
 fs/sysfs/mount.c                      |  2 +-
 fs/xfs/xfs_super.c                    |  2 +-
 include/linux/backing-dev.h           |  2 +-
 include/linux/namei.h                 |  2 +-
 ipc/mqueue.c                          |  3 ++-
 security/integrity/evm/evm_main.c     |  2 +-
 security/integrity/ima/ima_appraise.c |  4 ++--
 security/integrity/ima/ima_main.c     |  4 ++--
 25 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
index 3001d754ac36..f9454b93c2f7 100644
--- a/drivers/android/binderfs.c
+++ b/drivers/android/binderfs.c
@@ -672,8 +672,8 @@ static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
 	 * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both
 	 * necessary and safe.
 	 */
-	sb->s_iflags &= ~SB_I_NODEV;
-	sb->s_iflags |= SB_I_NOEXEC;
+	sb_clear_iflag(sb, _SB_I_NODEV);
+	sb_set_iflag(sb, _SB_I_NOEXEC);
 	sb->s_magic = BINDERFS_SUPER_MAGIC;
 	sb->s_op = &binderfs_super_ops;
 	sb->s_time_gran = 1;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 08d33cb372fb..321696697279 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -950,7 +950,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #endif
 	sb->s_xattr = btrfs_xattr_handlers;
 	sb->s_time_gran = 1;
-	sb->s_iflags |= SB_I_CGROUPWB;
+	sb_set_iflag(sb, _SB_I_CGROUPWB);
 
 	err = super_setup_bdi(sb);
 	if (err) {
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index b20e565b9c5e..d473156d2791 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -428,7 +428,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
 	struct inode *inode;
 	int error;
 
-	s->s_iflags &= ~SB_I_NODEV;
+	sb_clear_iflag(s, _SB_I_NODEV);
 	s->s_blocksize = 1024;
 	s->s_blocksize_bits = 10;
 	s->s_magic = DEVPTS_SUPER_MAGIC;
diff --git a/fs/exec.c b/fs/exec.c
index a126e3d1cacb..b62b67bea10b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -112,7 +112,7 @@ static inline void put_binfmt(struct linux_binfmt * fmt)
 bool path_noexec(const struct path *path)
 {
 	return (path->mnt->mnt_flags & MNT_NOEXEC) ||
-	       (path->mnt->mnt_sb->s_iflags & SB_I_NOEXEC);
+	       sb_test_iflag(path->mnt->mnt_sb, _SB_I_NOEXEC);
 }
 
 #ifdef CONFIG_USELIB
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 37f7ce56adce..9da8652c10c5 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -916,7 +916,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
 
 	sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
 		(test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);
-	sb->s_iflags |= SB_I_CGROUPWB;
+	sb_set_iflag(sb, _SB_I_CGROUPWB);
 
 	if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV &&
 	    (EXT2_HAS_COMPAT_FEATURE(sb, ~0U) ||
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 93c016b186c0..a776d4e7ec66 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4972,7 +4972,7 @@ static int ext4_check_journal_data_mode(struct super_block *sb)
 		if (test_opt(sb, DELALLOC))
 			clear_opt(sb, DELALLOC);
 	} else {
-		sb->s_iflags |= SB_I_CGROUPWB;
+		sb_set_iflag(sb, _SB_I_CGROUPWB);
 	}
 
 	return 0;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 3959fd137cc9..041b7b7b0810 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4472,7 +4472,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
 	super_set_uuid(sb, (void *) raw_super->uuid, sizeof(raw_super->uuid));
 	super_set_sysfs_name_bdev(sb);
-	sb->s_iflags |= SB_I_CGROUPWB;
+	sb_set_iflag(sb, _SB_I_CGROUPWB);
 
 	/* init f2fs-specific super block info */
 	sbi->valid_super_block = valid_super_block;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d8ab4e93916f..3602a578b7b3 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1566,9 +1566,9 @@ static void fuse_sb_defaults(struct super_block *sb)
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
 	sb->s_time_gran = 1;
 	sb->s_export_op = &fuse_export_operations;
-	sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE;
+	sb_set_iflag(sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE);
 	if (sb->s_user_ns != &init_user_ns)
-		sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER;
+		sb_set_iflag(sb, _SB_I_UNTRUSTED_MOUNTER);
 	sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
 }
 
diff --git a/fs/inode.c b/fs/inode.c
index 86670941884b..a8598a968940 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -216,7 +216,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	lockdep_set_class_and_name(&mapping->invalidate_lock,
 				   &sb->s_type->invalidate_lock_key,
 				   "mapping.invalidate_lock");
-	if (sb->s_iflags & SB_I_STABLE_WRITES)
+	if (sb_test_iflag(sb, _SB_I_STABLE_WRITES))
 		mapping_set_stable_writes(mapping);
 	inode->i_private = NULL;
 	inode->i_mapping = mapping;
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 1358c21837f1..762edcf5387e 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -252,7 +252,8 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 
 	info->sb = sb;
 	/* Userspace would break if executables or devices appear on sysfs */
-	sb->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
+	sb_set_iflag(sb, _SB_I_NOEXEC);
+	sb_set_iflag(sb, _SB_I_NODEV);
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = kfc->magic;
diff --git a/fs/namei.c b/fs/namei.c
index 5512cb10fa89..de6936564298 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3308,7 +3308,7 @@ EXPORT_SYMBOL(vfs_mkobj);
 bool may_open_dev(const struct path *path)
 {
 	return !(path->mnt->mnt_flags & MNT_NODEV) &&
-		!(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
+		!sb_test_iflag(path->mnt->mnt_sb, _SB_I_NODEV);
 }
 
 static int may_open(struct mnt_idmap *idmap, const struct path *path,
diff --git a/fs/namespace.c b/fs/namespace.c
index 75153f61a908..17126569b3c4 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2919,7 +2919,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
 	struct super_block *sb = mnt->mnt_sb;
 
 	if (!__mnt_is_readonly(mnt) &&
-	   (!(sb->s_iflags & SB_I_TS_EXPIRY_WARNED)) &&
+	   !sb_test_iflag(sb, _SB_I_TS_EXPIRY_WARNED) &&
 	   (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
 		char *buf = (char *)__get_free_page(GFP_KERNEL);
 		char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM);
@@ -2931,7 +2931,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
 			(unsigned long long)sb->s_time_max);
 
 		free_page((unsigned long)buf);
-		sb->s_iflags |= SB_I_TS_EXPIRY_WARNED;
+		sb_set_iflag(sb, _SB_I_TS_EXPIRY_WARNED);
 	}
 }
 
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index cbbd4866b0b7..b6b806fb6286 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1094,7 +1094,7 @@ static void nfs_fill_super(struct super_block *sb, struct nfs_fs_context *ctx)
 		sb->s_export_op = &nfs_export_ops;
 		break;
 	case 4:
-		sb->s_iflags |= SB_I_NOUMASK;
+		sb_set_iflag(sb, _SB_I_NOUMASK);
 		sb->s_time_gran = 1;
 		sb->s_time_min = S64_MIN;
 		sb->s_time_max = S64_MAX;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 06a231970cb5..afa5263ff016 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1453,14 +1453,14 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
 #ifdef CONFIG_FS_POSIX_ACL
 	sb->s_flags |= SB_POSIXACL;
 #endif
-	sb->s_iflags |= SB_I_SKIP_SYNC;
+	sb_set_iflag(sb, _SB_I_SKIP_SYNC);
 	/*
 	 * Ensure that umask handling is done by the filesystems used
 	 * for the the upper layer instead of overlayfs as that would
 	 * lead to unexpected results.
 	 */
-	sb->s_iflags |= SB_I_NOUMASK;
-	sb->s_iflags |= SB_I_EVM_HMAC_UNSUPPORTED;
+	sb_set_iflag(sb, _SB_I_NOUMASK);
+	sb_set_iflag(sb, _SB_I_EVM_HMAC_UNSUPPORTED);
 
 	err = -ENOMEM;
 	root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe);
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 06a297a27ba3..ac78ec69dde9 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -171,7 +171,9 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
 	proc_apply_options(fs_info, fc, current_user_ns());
 
 	/* User space would break if executables or devices appear on proc */
-	s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV;
+	sb_set_iflag(s, _SB_I_USERNS_VISIBLE);
+	sb_set_iflag(s, _SB_I_NOEXEC);
+	sb_set_iflag(s, _SB_I_NODEV);
 	s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC;
 	s->s_blocksize = 1024;
 	s->s_blocksize_bits = 10;
diff --git a/fs/super.c b/fs/super.c
index 38d72a3cf6fc..e3020b3db4f0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -355,7 +355,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	s->s_bdi = &noop_backing_dev_info;
 	s->s_flags = flags;
 	if (s->s_user_ns != &init_user_ns)
-		s->s_iflags |= SB_I_NODEV;
+		sb_set_iflag(s, _SB_I_NODEV);
 	INIT_HLIST_NODE(&s->s_instances);
 	INIT_HLIST_BL_HEAD(&s->s_roots);
 	mutex_init(&s->s_sync_lock);
@@ -589,11 +589,11 @@ void retire_super(struct super_block *sb)
 {
 	WARN_ON(!sb->s_bdev);
 	__super_lock_excl(sb);
-	if (sb->s_iflags & SB_I_PERSB_BDI) {
+	if (sb_test_iflag(sb, _SB_I_PERSB_BDI)) {
 		bdi_unregister(sb->s_bdi);
-		sb->s_iflags &= ~SB_I_PERSB_BDI;
+		sb_clear_iflag(sb, _SB_I_PERSB_BDI);
 	}
-	sb->s_iflags |= SB_I_RETIRED;
+	sb_set_iflag(sb, _SB_I_RETIRED);
 	super_unlock_excl(sb);
 }
 EXPORT_SYMBOL(retire_super);
@@ -678,7 +678,7 @@ void generic_shutdown_super(struct super_block *sb)
 	super_wake(sb, SB_DYING);
 	super_unlock_excl(sb);
 	if (sb->s_bdi != &noop_backing_dev_info) {
-		if (sb->s_iflags & SB_I_PERSB_BDI)
+		if (sb_test_iflag(sb, _SB_I_PERSB_BDI))
 			bdi_unregister(sb->s_bdi);
 		bdi_put(sb->s_bdi);
 		sb->s_bdi = &noop_backing_dev_info;
@@ -1331,7 +1331,7 @@ static int super_s_dev_set(struct super_block *s, struct fs_context *fc)
 
 static int super_s_dev_test(struct super_block *s, struct fs_context *fc)
 {
-	return !(s->s_iflags & SB_I_RETIRED) &&
+	return !sb_test_iflag(s, _SB_I_RETIRED) &&
 		s->s_dev == *(dev_t *)fc->sget_key;
 }
 
@@ -1584,7 +1584,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 	sb->s_bdev = bdev;
 	sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
 	if (bdev_stable_writes(bdev))
-		sb->s_iflags |= SB_I_STABLE_WRITES;
+		sb_set_iflag(sb, _SB_I_STABLE_WRITES);
 	spin_unlock(&sb_lock);
 
 	snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
@@ -1648,7 +1648,7 @@ EXPORT_SYMBOL(get_tree_bdev);
 
 static int test_bdev_super(struct super_block *s, void *data)
 {
-	return !(s->s_iflags & SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
+	return !sb_test_iflag(s, _SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
 }
 
 struct dentry *mount_bdev(struct file_system_type *fs_type,
@@ -1864,7 +1864,7 @@ int super_setup_bdi_name(struct super_block *sb, char *fmt, ...)
 	}
 	WARN_ON(sb->s_bdi != &noop_backing_dev_info);
 	sb->s_bdi = bdi;
-	sb->s_iflags |= SB_I_PERSB_BDI;
+	sb_set_iflag(sb, _SB_I_PERSB_BDI);
 
 	return 0;
 }
diff --git a/fs/sync.c b/fs/sync.c
index dc725914e1ed..4e5ad48316be 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -79,7 +79,7 @@ static void sync_inodes_one_sb(struct super_block *sb, void *arg)
 
 static void sync_fs_one_sb(struct super_block *sb, void *arg)
 {
-	if (!sb_rdonly(sb) && !(sb->s_iflags & SB_I_SKIP_SYNC) &&
+	if (!sb_rdonly(sb) && !sb_test_iflag(sb, _SB_I_SKIP_SYNC) &&
 	    sb->s_op->sync_fs)
 		sb->s_op->sync_fs(sb, *(int *)arg);
 }
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 98467bb76737..124385961da7 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -33,7 +33,7 @@ static int sysfs_get_tree(struct fs_context *fc)
 		return ret;
 
 	if (kfc->new_sb_created)
-		fc->root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
+		sb_set_iflag(fc->root->d_sb, _SB_I_USERNS_VISIBLE);
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 27e9f749c4c7..7707f2a1a836 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1701,7 +1701,7 @@ xfs_fs_fill_super(
 		sb->s_time_max = XFS_LEGACY_TIME_MAX;
 	}
 	trace_xfs_inode_timestamp_range(mp, sb->s_time_min, sb->s_time_max);
-	sb->s_iflags |= SB_I_CGROUPWB;
+	sb_set_iflag(sb, _SB_I_CGROUPWB);
 
 	set_posix_acl_flag(sb);
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 8e7af9a03b41..54fdae7b1be4 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -176,7 +176,7 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
 	return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
 		cgroup_subsys_on_dfl(io_cgrp_subsys) &&
 		(bdi->capabilities & BDI_CAP_WRITEBACK) &&
-		(inode->i_sb->s_iflags & SB_I_CGROUPWB);
+		sb_test_iflag(inode->i_sb, _SB_I_CGROUPWB);
 }
 
 /**
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 8ec8fed3bce8..3fbf340dac1a 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -107,7 +107,7 @@ extern void unlock_rename(struct dentry *, struct dentry *);
  */
 static inline umode_t __must_check mode_strip_umask(const struct inode *dir, umode_t mode)
 {
-	if (!IS_POSIXACL(dir) && !(dir->i_sb->s_iflags & SB_I_NOUMASK))
+	if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, _SB_I_NOUMASK))
 		mode &= ~current_umask();
 	return mode;
 }
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index a7cbd69efbef..e73fff4c2f12 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -406,7 +406,8 @@ static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc)
 	struct inode *inode;
 	struct ipc_namespace *ns = sb->s_fs_info;
 
-	sb->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
+	sb_set_iflag(sb, _SB_I_NOEXEC);
+	sb_set_iflag(sb, _SB_I_NODEV);
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = MQUEUE_MAGIC;
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index 62fe66dd53ce..3ff29bf73f04 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -155,7 +155,7 @@ static int is_unsupported_hmac_fs(struct dentry *dentry)
 {
 	struct inode *inode = d_backing_inode(dentry);
 
-	if (inode->i_sb->s_iflags & SB_I_EVM_HMAC_UNSUPPORTED) {
+	if (sb_test_iflag(inode->i_sb, _SB_I_EVM_HMAC_UNSUPPORTED)) {
 		pr_info_once("%s not supported\n", inode->i_sb->s_type->name);
 		return 1;
 	}
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index 656c709b974f..9c290dd8a4ac 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -564,8 +564,8 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
 	 * system not willing to accept such a risk, fail the file signature
 	 * verification.
 	 */
-	if ((inode->i_sb->s_iflags & SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
-	    ((inode->i_sb->s_iflags & SB_I_UNTRUSTED_MOUNTER) ||
+	if (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
+	    (sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) ||
 	     (iint->flags & IMA_FAIL_UNVERIFIABLE_SIGS))) {
 		status = INTEGRITY_FAIL;
 		cause = "unverifiable-signature";
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index f04f43af651c..b04eaa33eca4 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -280,8 +280,8 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	 * (Limited to privileged mounted filesystems.)
 	 */
 	if (test_and_clear_bit(IMA_CHANGE_XATTR, &iint->atomic_flags) ||
-	    ((inode->i_sb->s_iflags & SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
-	     !(inode->i_sb->s_iflags & SB_I_UNTRUSTED_MOUNTER) &&
+	    (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
+	     !sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) &&
 	     !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
 		iint->flags &= ~IMA_DONE_MASK;
 		iint->measured_pcrs = 0;
-- 
2.35.3



* [PATCH 05/13] fs: Drop old SB_I_ constants
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (3 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 04/13] fs: Convert remaining usage of SB_I_ flags Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants Jan Kara
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Nothing is using the old SB_I_ constants anymore. Remove them.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/fs.h | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index ff45a97b39cb..65e70ceb335e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1173,23 +1173,6 @@ extern int send_sigurg(struct fown_struct *fown);
 #define UMOUNT_UNUSED	0x80000000	/* Flag guaranteed to be unused */
 
 /* sb->s_iflags */
-#define SB_I_CGROUPWB	0x00000001	/* cgroup-aware writeback enabled */
-#define SB_I_NOEXEC	0x00000002	/* Ignore executables on this fs */
-#define SB_I_NODEV	0x00000004	/* Ignore devices on this fs */
-#define SB_I_STABLE_WRITES 0x00000008	/* don't modify blks until WB is done */
-
-/* sb->s_iflags to limit user namespace mounts */
-#define SB_I_USERNS_VISIBLE		0x00000010 /* fstype already mounted */
-#define SB_I_IMA_UNVERIFIABLE_SIGNATURE	0x00000020
-#define SB_I_UNTRUSTED_MOUNTER		0x00000040
-#define SB_I_EVM_HMAC_UNSUPPORTED	0x00000080
-
-#define SB_I_SKIP_SYNC	0x00000100	/* Skip superblock at global sync */
-#define SB_I_PERSB_BDI	0x00000200	/* has a per-sb bdi */
-#define SB_I_TS_EXPIRY_WARNED 0x00000400 /* warned about timestamp range expiry */
-#define SB_I_RETIRED	0x00000800	/* superblock shouldn't be reused */
-#define SB_I_NOUMASK	0x00001000	/* VFS does not apply umask */
-
 enum {
 	_SB_I_CGROUPWB,		/* cgroup-aware writeback enabled */
 	_SB_I_NOEXEC,		/* Ignore executables on this fs */
-- 
2.35.3



* [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (4 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 05/13] fs: Drop old SB_I_ constants Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-08 11:47   ` Amir Goldstein
  2024-08-07 18:29 ` [PATCH 07/13] overlayfs: Make ovl_start_write() return error Jan Kara
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Now that the old constants are gone, remove the unnecessary underscore
from the new _SB_I_ constants. Purely mechanical replacement, no
functional changes.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bdev.c                          |  2 +-
 drivers/android/binderfs.c            |  4 ++--
 fs/aio.c                              |  2 +-
 fs/btrfs/super.c                      |  2 +-
 fs/devpts/inode.c                     |  2 +-
 fs/exec.c                             |  2 +-
 fs/ext2/super.c                       |  2 +-
 fs/ext4/super.c                       |  2 +-
 fs/f2fs/super.c                       |  2 +-
 fs/fuse/inode.c                       |  4 ++--
 fs/inode.c                            |  2 +-
 fs/kernfs/mount.c                     |  4 ++--
 fs/namei.c                            |  2 +-
 fs/namespace.c                        |  8 ++++----
 fs/nfs/fs_context.c                   |  2 +-
 fs/nfs/super.c                        |  2 +-
 fs/overlayfs/super.c                  |  6 +++---
 fs/proc/root.c                        |  6 +++---
 fs/super.c                            | 18 ++++++++---------
 fs/sync.c                             |  2 +-
 fs/sysfs/mount.c                      |  2 +-
 fs/xfs/xfs_super.c                    |  2 +-
 include/linux/backing-dev.h           |  2 +-
 include/linux/fs.h                    | 28 +++++++++++++--------------
 include/linux/namei.h                 |  2 +-
 ipc/mqueue.c                          |  4 ++--
 security/integrity/evm/evm_main.c     |  2 +-
 security/integrity/ima/ima_appraise.c |  4 ++--
 security/integrity/ima/ima_main.c     |  4 ++--
 29 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index c1ea2aeb93dd..6c13ba60c0b1 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -373,7 +373,7 @@ static int bd_init_fs_context(struct fs_context *fc)
 	struct pseudo_fs_context *ctx = init_pseudo(fc, BDEVFS_MAGIC);
 	if (!ctx)
 		return -ENOMEM;
-	fc->s_iflags |= 1 << _SB_I_CGROUPWB;
+	fc->s_iflags |= 1 << SB_I_CGROUPWB;
 	ctx->ops = &bdev_sops;
 	return 0;
 }
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
index f9454b93c2f7..6070923fbfbd 100644
--- a/drivers/android/binderfs.c
+++ b/drivers/android/binderfs.c
@@ -672,8 +672,8 @@ static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
 	 * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both
 	 * necessary and safe.
 	 */
-	sb_clear_iflag(sb, _SB_I_NODEV);
-	sb_set_iflag(sb, _SB_I_NOEXEC);
+	sb_clear_iflag(sb, SB_I_NODEV);
+	sb_set_iflag(sb, SB_I_NOEXEC);
 	sb->s_magic = BINDERFS_SUPER_MAGIC;
 	sb->s_op = &binderfs_super_ops;
 	sb->s_time_gran = 1;
diff --git a/fs/aio.c b/fs/aio.c
index 63ce0736c3a3..48d99221ff57 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -279,7 +279,7 @@ static int aio_init_fs_context(struct fs_context *fc)
 {
 	if (!init_pseudo(fc, AIO_RING_MAGIC))
 		return -ENOMEM;
-	fc->s_iflags |= 1 << _SB_I_NOEXEC;
+	fc->s_iflags |= 1 << SB_I_NOEXEC;
 	return 0;
 }
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 321696697279..fb3938ec127c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -950,7 +950,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #endif
 	sb->s_xattr = btrfs_xattr_handlers;
 	sb->s_time_gran = 1;
-	sb_set_iflag(sb, _SB_I_CGROUPWB);
+	sb_set_iflag(sb, SB_I_CGROUPWB);
 
 	err = super_setup_bdi(sb);
 	if (err) {
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index d473156d2791..6094cb7e1a16 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -428,7 +428,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
 	struct inode *inode;
 	int error;
 
-	sb_clear_iflag(s, _SB_I_NODEV);
+	sb_clear_iflag(s, SB_I_NODEV);
 	s->s_blocksize = 1024;
 	s->s_blocksize_bits = 10;
 	s->s_magic = DEVPTS_SUPER_MAGIC;
diff --git a/fs/exec.c b/fs/exec.c
index b62b67bea10b..a8bd15aa6bd8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -112,7 +112,7 @@ static inline void put_binfmt(struct linux_binfmt * fmt)
 bool path_noexec(const struct path *path)
 {
 	return (path->mnt->mnt_flags & MNT_NOEXEC) ||
-	       sb_test_iflag(path->mnt->mnt_sb, _SB_I_NOEXEC);
+	       sb_test_iflag(path->mnt->mnt_sb, SB_I_NOEXEC);
 }
 
 #ifdef CONFIG_USELIB
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 9da8652c10c5..cbe79fb7ac35 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -916,7 +916,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
 
 	sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
 		(test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);
-	sb_set_iflag(sb, _SB_I_CGROUPWB);
+	sb_set_iflag(sb, SB_I_CGROUPWB);
 
 	if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV &&
 	    (EXT2_HAS_COMPAT_FEATURE(sb, ~0U) ||
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a776d4e7ec66..b5b2f17f1b65 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4972,7 +4972,7 @@ static int ext4_check_journal_data_mode(struct super_block *sb)
 		if (test_opt(sb, DELALLOC))
 			clear_opt(sb, DELALLOC);
 	} else {
-		sb_set_iflag(sb, _SB_I_CGROUPWB);
+		sb_set_iflag(sb, SB_I_CGROUPWB);
 	}
 
 	return 0;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 041b7b7b0810..8437612bf64b 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4472,7 +4472,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		(test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
 	super_set_uuid(sb, (void *) raw_super->uuid, sizeof(raw_super->uuid));
 	super_set_sysfs_name_bdev(sb);
-	sb_set_iflag(sb, _SB_I_CGROUPWB);
+	sb_set_iflag(sb, SB_I_CGROUPWB);
 
 	/* init f2fs-specific super block info */
 	sbi->valid_super_block = valid_super_block;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 3602a578b7b3..5b6254481d5c 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1566,9 +1566,9 @@ static void fuse_sb_defaults(struct super_block *sb)
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
 	sb->s_time_gran = 1;
 	sb->s_export_op = &fuse_export_operations;
-	sb_set_iflag(sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE);
+	sb_set_iflag(sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE);
 	if (sb->s_user_ns != &init_user_ns)
-		sb_set_iflag(sb, _SB_I_UNTRUSTED_MOUNTER);
+		sb_set_iflag(sb, SB_I_UNTRUSTED_MOUNTER);
 	sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
 }
 
diff --git a/fs/inode.c b/fs/inode.c
index a8598a968940..3091385a4de1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -216,7 +216,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	lockdep_set_class_and_name(&mapping->invalidate_lock,
 				   &sb->s_type->invalidate_lock_key,
 				   "mapping.invalidate_lock");
-	if (sb_test_iflag(sb, _SB_I_STABLE_WRITES))
+	if (sb_test_iflag(sb, SB_I_STABLE_WRITES))
 		mapping_set_stable_writes(mapping);
 	inode->i_private = NULL;
 	inode->i_mapping = mapping;
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 762edcf5387e..f5331f2e0b2d 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -252,8 +252,8 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 
 	info->sb = sb;
 	/* Userspace would break if executables or devices appear on sysfs */
-	sb_set_iflag(sb, _SB_I_NOEXEC);
-	sb_set_iflag(sb, _SB_I_NODEV);
+	sb_set_iflag(sb, SB_I_NOEXEC);
+	sb_set_iflag(sb, SB_I_NODEV);
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = kfc->magic;
diff --git a/fs/namei.c b/fs/namei.c
index de6936564298..9e9bca0566e9 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3308,7 +3308,7 @@ EXPORT_SYMBOL(vfs_mkobj);
 bool may_open_dev(const struct path *path)
 {
 	return !(path->mnt->mnt_flags & MNT_NODEV) &&
-		!sb_test_iflag(path->mnt->mnt_sb, _SB_I_NODEV);
+		!sb_test_iflag(path->mnt->mnt_sb, SB_I_NODEV);
 }
 
 static int may_open(struct mnt_idmap *idmap, const struct path *path,
diff --git a/fs/namespace.c b/fs/namespace.c
index 17126569b3c4..1c5591673f96 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2919,7 +2919,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
 	struct super_block *sb = mnt->mnt_sb;
 
 	if (!__mnt_is_readonly(mnt) &&
-	   !sb_test_iflag(sb, _SB_I_TS_EXPIRY_WARNED) &&
+	   !sb_test_iflag(sb, SB_I_TS_EXPIRY_WARNED) &&
 	   (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
 		char *buf = (char *)__get_free_page(GFP_KERNEL);
 		char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM);
@@ -2931,7 +2931,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
 			(unsigned long long)sb->s_time_max);
 
 		free_page((unsigned long)buf);
-		sb_set_iflag(sb, _SB_I_TS_EXPIRY_WARNED);
+		sb_set_iflag(sb, SB_I_TS_EXPIRY_WARNED);
 	}
 }
 
@@ -5629,10 +5629,10 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
 		return false;
 
 	/* Can this filesystem be too revealing? */
-	if (!sb_test_iflag(sb, _SB_I_USERNS_VISIBLE))
+	if (!sb_test_iflag(sb, SB_I_USERNS_VISIBLE))
 		return false;
 
-	if (!sb_test_iflag(sb, _SB_I_NOEXEC) || !sb_test_iflag(sb, _SB_I_NODEV)) {
+	if (!sb_test_iflag(sb, SB_I_NOEXEC) || !sb_test_iflag(sb, SB_I_NODEV)) {
 		WARN_ONCE(1, "Expected s_iflags to contain SB_I_NOEXEC and "
 			  "SB_I_NODEV\n");
 		return true;
diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 2fbae7e2b6ce..52fc52b6350f 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -1643,7 +1643,7 @@ static int nfs_init_fs_context(struct fs_context *fc)
 		ctx->xprtsec.cert_serial	= TLS_NO_CERT;
 		ctx->xprtsec.privkey_serial	= TLS_NO_PRIVKEY;
 
-		fc->s_iflags		|= 1 << _SB_I_STABLE_WRITES;
+		fc->s_iflags		|= 1 << SB_I_STABLE_WRITES;
 	}
 	fc->fs_private = ctx;
 	fc->ops = &nfs_fs_context_ops;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index b6b806fb6286..246ecceda7c8 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -1094,7 +1094,7 @@ static void nfs_fill_super(struct super_block *sb, struct nfs_fs_context *ctx)
 		sb->s_export_op = &nfs_export_ops;
 		break;
 	case 4:
-		sb_set_iflag(sb, _SB_I_NOUMASK);
+		sb_set_iflag(sb, SB_I_NOUMASK);
 		sb->s_time_gran = 1;
 		sb->s_time_min = S64_MIN;
 		sb->s_time_max = S64_MAX;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index afa5263ff016..f5a60d0bcb1c 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1453,14 +1453,14 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
 #ifdef CONFIG_FS_POSIX_ACL
 	sb->s_flags |= SB_POSIXACL;
 #endif
-	sb_set_iflag(sb, _SB_I_SKIP_SYNC);
+	sb_set_iflag(sb, SB_I_SKIP_SYNC);
 	/*
 	 * Ensure that umask handling is done by the filesystems used
 	 * for the the upper layer instead of overlayfs as that would
 	 * lead to unexpected results.
 	 */
-	sb_set_iflag(sb, _SB_I_NOUMASK);
-	sb_set_iflag(sb, _SB_I_EVM_HMAC_UNSUPPORTED);
+	sb_set_iflag(sb, SB_I_NOUMASK);
+	sb_set_iflag(sb, SB_I_EVM_HMAC_UNSUPPORTED);
 
 	err = -ENOMEM;
 	root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe);
diff --git a/fs/proc/root.c b/fs/proc/root.c
index ac78ec69dde9..7acfa535b925 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -171,9 +171,9 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
 	proc_apply_options(fs_info, fc, current_user_ns());
 
 	/* User space would break if executables or devices appear on proc */
-	sb_set_iflag(s, _SB_I_USERNS_VISIBLE);
-	sb_set_iflag(s, _SB_I_NOEXEC);
-	sb_set_iflag(s, _SB_I_NODEV);
+	sb_set_iflag(s, SB_I_USERNS_VISIBLE);
+	sb_set_iflag(s, SB_I_NOEXEC);
+	sb_set_iflag(s, SB_I_NODEV);
 	s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC;
 	s->s_blocksize = 1024;
 	s->s_blocksize_bits = 10;
diff --git a/fs/super.c b/fs/super.c
index e3020b3db4f0..873808245d54 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -355,7 +355,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	s->s_bdi = &noop_backing_dev_info;
 	s->s_flags = flags;
 	if (s->s_user_ns != &init_user_ns)
-		sb_set_iflag(s, _SB_I_NODEV);
+		sb_set_iflag(s, SB_I_NODEV);
 	INIT_HLIST_NODE(&s->s_instances);
 	INIT_HLIST_BL_HEAD(&s->s_roots);
 	mutex_init(&s->s_sync_lock);
@@ -589,11 +589,11 @@ void retire_super(struct super_block *sb)
 {
 	WARN_ON(!sb->s_bdev);
 	__super_lock_excl(sb);
-	if (sb_test_iflag(sb, _SB_I_PERSB_BDI)) {
+	if (sb_test_iflag(sb, SB_I_PERSB_BDI)) {
 		bdi_unregister(sb->s_bdi);
-		sb_clear_iflag(sb, _SB_I_PERSB_BDI);
+		sb_clear_iflag(sb, SB_I_PERSB_BDI);
 	}
-	sb_set_iflag(sb, _SB_I_RETIRED);
+	sb_set_iflag(sb, SB_I_RETIRED);
 	super_unlock_excl(sb);
 }
 EXPORT_SYMBOL(retire_super);
@@ -678,7 +678,7 @@ void generic_shutdown_super(struct super_block *sb)
 	super_wake(sb, SB_DYING);
 	super_unlock_excl(sb);
 	if (sb->s_bdi != &noop_backing_dev_info) {
-		if (sb_test_iflag(sb, _SB_I_PERSB_BDI))
+		if (sb_test_iflag(sb, SB_I_PERSB_BDI))
 			bdi_unregister(sb->s_bdi);
 		bdi_put(sb->s_bdi);
 		sb->s_bdi = &noop_backing_dev_info;
@@ -1331,7 +1331,7 @@ static int super_s_dev_set(struct super_block *s, struct fs_context *fc)
 
 static int super_s_dev_test(struct super_block *s, struct fs_context *fc)
 {
-	return !sb_test_iflag(s, _SB_I_RETIRED) &&
+	return !sb_test_iflag(s, SB_I_RETIRED) &&
 		s->s_dev == *(dev_t *)fc->sget_key;
 }
 
@@ -1584,7 +1584,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 	sb->s_bdev = bdev;
 	sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
 	if (bdev_stable_writes(bdev))
-		sb_set_iflag(sb, _SB_I_STABLE_WRITES);
+		sb_set_iflag(sb, SB_I_STABLE_WRITES);
 	spin_unlock(&sb_lock);
 
 	snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
@@ -1648,7 +1648,7 @@ EXPORT_SYMBOL(get_tree_bdev);
 
 static int test_bdev_super(struct super_block *s, void *data)
 {
-	return !sb_test_iflag(s, _SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
+	return !sb_test_iflag(s, SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
 }
 
 struct dentry *mount_bdev(struct file_system_type *fs_type,
@@ -1864,7 +1864,7 @@ int super_setup_bdi_name(struct super_block *sb, char *fmt, ...)
 	}
 	WARN_ON(sb->s_bdi != &noop_backing_dev_info);
 	sb->s_bdi = bdi;
-	sb_set_iflag(sb, _SB_I_PERSB_BDI);
+	sb_set_iflag(sb, SB_I_PERSB_BDI);
 
 	return 0;
 }
diff --git a/fs/sync.c b/fs/sync.c
index 4e5ad48316be..a7c0645aa9dc 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -79,7 +79,7 @@ static void sync_inodes_one_sb(struct super_block *sb, void *arg)
 
 static void sync_fs_one_sb(struct super_block *sb, void *arg)
 {
-	if (!sb_rdonly(sb) && !sb_test_iflag(sb, _SB_I_SKIP_SYNC) &&
+	if (!sb_rdonly(sb) && !sb_test_iflag(sb, SB_I_SKIP_SYNC) &&
 	    sb->s_op->sync_fs)
 		sb->s_op->sync_fs(sb, *(int *)arg);
 }
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 124385961da7..b461c216731a 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -33,7 +33,7 @@ static int sysfs_get_tree(struct fs_context *fc)
 		return ret;
 
 	if (kfc->new_sb_created)
-		sb_set_iflag(fc->root->d_sb, _SB_I_USERNS_VISIBLE);
+		sb_set_iflag(fc->root->d_sb, SB_I_USERNS_VISIBLE);
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 7707f2a1a836..0020724e3b0a 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1701,7 +1701,7 @@ xfs_fs_fill_super(
 		sb->s_time_max = XFS_LEGACY_TIME_MAX;
 	}
 	trace_xfs_inode_timestamp_range(mp, sb->s_time_min, sb->s_time_max);
-	sb_set_iflag(sb, _SB_I_CGROUPWB);
+	sb_set_iflag(sb, SB_I_CGROUPWB);
 
 	set_posix_acl_flag(sb);
 
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 54fdae7b1be4..bc5f96ba499e 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -176,7 +176,7 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
 	return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
 		cgroup_subsys_on_dfl(io_cgrp_subsys) &&
 		(bdi->capabilities & BDI_CAP_WRITEBACK) &&
-		sb_test_iflag(inode->i_sb, _SB_I_CGROUPWB);
+		sb_test_iflag(inode->i_sb, SB_I_CGROUPWB);
 }
 
 /**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 65e70ceb335e..52841aab13fb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1174,22 +1174,22 @@ extern int send_sigurg(struct fown_struct *fown);
 
 /* sb->s_iflags */
 enum {
-	_SB_I_CGROUPWB,		/* cgroup-aware writeback enabled */
-	_SB_I_NOEXEC,		/* Ignore executables on this fs */
-	_SB_I_NODEV,		/* Ignore devices on this fs */
-	_SB_I_STABLE_WRITES,	/* don't modify blks until WB is done */
+	SB_I_CGROUPWB,		/* cgroup-aware writeback enabled */
+	SB_I_NOEXEC,		/* Ignore executables on this fs */
+	SB_I_NODEV,		/* Ignore devices on this fs */
+	SB_I_STABLE_WRITES,	/* don't modify blks until WB is done */
 
 	/* sb->s_iflags to limit user namespace mounts */
-	_SB_I_USERNS_VISIBLE,	/* fstype already mounted */
-	_SB_I_IMA_UNVERIFIABLE_SIGNATURE,
-	_SB_I_UNTRUSTED_MOUNTER,
-	_SB_I_EVM_HMAC_UNSUPPORTED,
-
-	_SB_I_SKIP_SYNC,	/* Skip superblock at global sync */
-	_SB_I_PERSB_BDI,	/* has a per-sb bdi */
-	_SB_I_TS_EXPIRY_WARNED,	/* warned about timestamp range expiry */
-	_SB_I_RETIRED,		/* superblock shouldn't be reused */
-	_SB_I_NOUMASK,		/* VFS does not apply umask */
+	SB_I_USERNS_VISIBLE,	/* fstype already mounted */
+	SB_I_IMA_UNVERIFIABLE_SIGNATURE,
+	SB_I_UNTRUSTED_MOUNTER,
+	SB_I_EVM_HMAC_UNSUPPORTED,
+
+	SB_I_SKIP_SYNC,	/* Skip superblock at global sync */
+	SB_I_PERSB_BDI,	/* has a per-sb bdi */
+	SB_I_TS_EXPIRY_WARNED,	/* warned about timestamp range expiry */
+	SB_I_RETIRED,		/* superblock shouldn't be reused */
+	SB_I_NOUMASK,		/* VFS does not apply umask */
 };
 
 /* Possible states of 'frozen' field */
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 3fbf340dac1a..0bd6db9adb7f 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -107,7 +107,7 @@ extern void unlock_rename(struct dentry *, struct dentry *);
  */
 static inline umode_t __must_check mode_strip_umask(const struct inode *dir, umode_t mode)
 {
-	if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, _SB_I_NOUMASK))
+	if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, SB_I_NOUMASK))
 		mode &= ~current_umask();
 	return mode;
 }
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index e73fff4c2f12..abe4dfe4374c 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -406,8 +406,8 @@ static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc)
 	struct inode *inode;
 	struct ipc_namespace *ns = sb->s_fs_info;
 
-	sb_set_iflag(sb, _SB_I_NOEXEC);
-	sb_set_iflag(sb, _SB_I_NODEV);
+	sb_set_iflag(sb, SB_I_NOEXEC);
+	sb_set_iflag(sb, SB_I_NODEV);
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = MQUEUE_MAGIC;
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index 3ff29bf73f04..a15a87250d55 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -155,7 +155,7 @@ static int is_unsupported_hmac_fs(struct dentry *dentry)
 {
 	struct inode *inode = d_backing_inode(dentry);
 
-	if (sb_test_iflag(inode->i_sb, _SB_I_EVM_HMAC_UNSUPPORTED)) {
+	if (sb_test_iflag(inode->i_sb, SB_I_EVM_HMAC_UNSUPPORTED)) {
 		pr_info_once("%s not supported\n", inode->i_sb->s_type->name);
 		return 1;
 	}
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index 9c290dd8a4ac..dfa16dba5d89 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -564,8 +564,8 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
 	 * system not willing to accept such a risk, fail the file signature
 	 * verification.
 	 */
-	if (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
-	    (sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) ||
+	if (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
+	    (sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) ||
 	     (iint->flags & IMA_FAIL_UNVERIFIABLE_SIGS))) {
 		status = INTEGRITY_FAIL;
 		cause = "unverifiable-signature";
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index b04eaa33eca4..27d446136c4f 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -280,8 +280,8 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	 * (Limited to privileged mounted filesystems.)
 	 */
 	if (test_and_clear_bit(IMA_CHANGE_XATTR, &iint->atomic_flags) ||
-	    (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
-	     !sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) &&
+	    (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
+	     !sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) &&
 	     !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
 		iint->flags &= ~IMA_DONE_MASK;
 		iint->measured_pcrs = 0;
-- 
2.35.3



* Re: [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants
  2024-08-07 18:29 ` [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants Jan Kara
@ 2024-08-08 11:47   ` Amir Goldstein
  2024-08-08 14:35     ` Darrick J. Wong
  0 siblings, 1 reply; 25+ messages in thread
From: Amir Goldstein @ 2024-08-08 11:47 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Dave Chinner, Christian Brauner

On Wed, Aug 7, 2024 at 8:31 PM Jan Kara <jack@suse.cz> wrote:
>
> Now that old constants are gone, remove the unnecessary underscore from
> the new _SB_I_ constants. Pure mechanical replacement, no functional
> changes.
>

This is a potential backporting bomb.
It is true that code using the old constant names with new macros
will not build on stable kernels, but I think this is still asking for trouble.
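
One way the rename could bite, sketched in user-space C rather than kernel
code. It assumes the pre-series SB_I_* names were bit masks (SB_I_NOEXEC as
0x02, SB_I_NODEV as 0x04). Under that assumption, a mask-style line such as
"s_iflags |= SB_I_NODEV" carried over from an older tree still compiles once
SB_I_NODEV is a bit number, but it ORs in the value 2. The OLD_-prefixed names
and all three functions are made up for this illustration.

```c
#include <assert.h>

/* Assumed pre-series style: the SB_I_* names are bit masks. */
#define OLD_SB_I_NOEXEC	0x00000002UL
#define OLD_SB_I_NODEV	0x00000004UL

/* Post-rename style, following the enum order in this patch: bit numbers. */
enum { SB_I_CGROUPWB, SB_I_NOEXEC, SB_I_NODEV };

/* A mask-style update written for an older tree, compiled against the enum. */
unsigned long stale_mask_style_set(unsigned long iflags)
{
	return iflags | SB_I_NODEV;		/* ORs in 2, not 1UL << 2 */
}

/* The update the new scheme expects: shift by the bit number. */
unsigned long bit_number_style_set(unsigned long iflags)
{
	return iflags | (1UL << SB_I_NODEV);
}

/* Hypothetical self-check showing which bit each variant ends up setting. */
void demo_backport_hazard(void)
{
	/* The stale line silently sets what used to be the NOEXEC mask. */
	assert(stale_mask_style_set(0) == OLD_SB_I_NOEXEC);
	/* The shifted form sets the bit that used to be the NODEV mask. */
	assert(bit_number_style_set(0) == OLD_SB_I_NODEV);
}
```

With the underscore kept on the bit numbers, the stale line would instead fail
to build against a tree that no longer defines SB_I_NODEV.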

Also, it is a bit strange that SB_* flags are bit masks and SB_I_*
flags are bit numbers.
How about leaving the underscore and using sb_*_iflag() macros to add
the underscore?
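
The suggestion above could look roughly like the following user-space sketch.
It assumes the enum keeps its _SB_I_ names and that the sb_*_iflag() helpers
become macros that paste the underscore on. The sb_iflag_*_bit() functions and
the cut-down struct super_block are hypothetical stand-ins, and plain
(non-atomic) bit operations are used for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers keep the leading underscore, as they had before this patch. */
enum {
	_SB_I_CGROUPWB,
	_SB_I_NOEXEC,
	_SB_I_NODEV,
};

/* Minimal stand-in for struct super_block; only s_iflags is modeled. */
struct super_block {
	unsigned long s_iflags;
};

/* Hypothetical helpers that take a bit number (non-atomic in this sketch). */
static inline void sb_iflag_set_bit(struct super_block *sb, int bit)
{
	sb->s_iflags |= 1UL << bit;
}

static inline void sb_iflag_clear_bit(struct super_block *sb, int bit)
{
	sb->s_iflags &= ~(1UL << bit);
}

static inline bool sb_iflag_test_bit(const struct super_block *sb, int bit)
{
	return (sb->s_iflags >> bit) & 1UL;
}

/* Callers write SB_I_FOO; the macros paste the underscore back on. */
#define sb_set_iflag(sb, flag)		sb_iflag_set_bit((sb), _##flag)
#define sb_clear_iflag(sb, flag)	sb_iflag_clear_bit((sb), _##flag)
#define sb_test_iflag(sb, flag)		sb_iflag_test_bit((sb), _##flag)

/* Hypothetical usage: call sites read the same as in the patch. */
void demo_iflag_macros(void)
{
	struct super_block sb = { 0 };

	sb_set_iflag(&sb, SB_I_NODEV);
	assert(sb_test_iflag(&sb, SB_I_NODEV));
	assert(!sb_test_iflag(&sb, SB_I_NOEXEC));

	sb_clear_iflag(&sb, SB_I_NODEV);
	assert(sb.s_iflags == 0);
}
```

Because ## pastes its operand without expanding it first, sb_set_iflag(sb,
SB_I_NODEV) would resolve to _SB_I_NODEV even in a tree where SB_I_NODEV is
still defined as a mask macro.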

Thanks,
Amir.

> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  block/bdev.c                          |  2 +-
>  drivers/android/binderfs.c            |  4 ++--
>  fs/aio.c                              |  2 +-
>  fs/btrfs/super.c                      |  2 +-
>  fs/devpts/inode.c                     |  2 +-
>  fs/exec.c                             |  2 +-
>  fs/ext2/super.c                       |  2 +-
>  fs/ext4/super.c                       |  2 +-
>  fs/f2fs/super.c                       |  2 +-
>  fs/fuse/inode.c                       |  4 ++--
>  fs/inode.c                            |  2 +-
>  fs/kernfs/mount.c                     |  4 ++--
>  fs/namei.c                            |  2 +-
>  fs/namespace.c                        |  8 ++++----
>  fs/nfs/fs_context.c                   |  2 +-
>  fs/nfs/super.c                        |  2 +-
>  fs/overlayfs/super.c                  |  6 +++---
>  fs/proc/root.c                        |  6 +++---
>  fs/super.c                            | 18 ++++++++---------
>  fs/sync.c                             |  2 +-
>  fs/sysfs/mount.c                      |  2 +-
>  fs/xfs/xfs_super.c                    |  2 +-
>  include/linux/backing-dev.h           |  2 +-
>  include/linux/fs.h                    | 28 +++++++++++++--------------
>  include/linux/namei.h                 |  2 +-
>  ipc/mqueue.c                          |  4 ++--
>  security/integrity/evm/evm_main.c     |  2 +-
>  security/integrity/ima/ima_appraise.c |  4 ++--
>  security/integrity/ima/ima_main.c     |  4 ++--
>  29 files changed, 63 insertions(+), 63 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index c1ea2aeb93dd..6c13ba60c0b1 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -373,7 +373,7 @@ static int bd_init_fs_context(struct fs_context *fc)
>         struct pseudo_fs_context *ctx = init_pseudo(fc, BDEVFS_MAGIC);
>         if (!ctx)
>                 return -ENOMEM;
> -       fc->s_iflags |= 1 << _SB_I_CGROUPWB;
> +       fc->s_iflags |= 1 << SB_I_CGROUPWB;
>         ctx->ops = &bdev_sops;
>         return 0;
>  }
> diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
> index f9454b93c2f7..6070923fbfbd 100644
> --- a/drivers/android/binderfs.c
> +++ b/drivers/android/binderfs.c
> @@ -672,8 +672,8 @@ static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
>          * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both
>          * necessary and safe.
>          */
> -       sb_clear_iflag(sb, _SB_I_NODEV);
> -       sb_set_iflag(sb, _SB_I_NOEXEC);
> +       sb_clear_iflag(sb, SB_I_NODEV);
> +       sb_set_iflag(sb, SB_I_NOEXEC);
>         sb->s_magic = BINDERFS_SUPER_MAGIC;
>         sb->s_op = &binderfs_super_ops;
>         sb->s_time_gran = 1;
> diff --git a/fs/aio.c b/fs/aio.c
> index 63ce0736c3a3..48d99221ff57 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -279,7 +279,7 @@ static int aio_init_fs_context(struct fs_context *fc)
>  {
>         if (!init_pseudo(fc, AIO_RING_MAGIC))
>                 return -ENOMEM;
> -       fc->s_iflags |= 1 << _SB_I_NOEXEC;
> +       fc->s_iflags |= 1 << SB_I_NOEXEC;
>         return 0;
>  }
>
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 321696697279..fb3938ec127c 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -950,7 +950,7 @@ static int btrfs_fill_super(struct super_block *sb,
>  #endif
>         sb->s_xattr = btrfs_xattr_handlers;
>         sb->s_time_gran = 1;
> -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> +       sb_set_iflag(sb, SB_I_CGROUPWB);
>
>         err = super_setup_bdi(sb);
>         if (err) {
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index d473156d2791..6094cb7e1a16 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -428,7 +428,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
>         struct inode *inode;
>         int error;
>
> -       sb_clear_iflag(s, _SB_I_NODEV);
> +       sb_clear_iflag(s, SB_I_NODEV);
>         s->s_blocksize = 1024;
>         s->s_blocksize_bits = 10;
>         s->s_magic = DEVPTS_SUPER_MAGIC;
> diff --git a/fs/exec.c b/fs/exec.c
> index b62b67bea10b..a8bd15aa6bd8 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -112,7 +112,7 @@ static inline void put_binfmt(struct linux_binfmt * fmt)
>  bool path_noexec(const struct path *path)
>  {
>         return (path->mnt->mnt_flags & MNT_NOEXEC) ||
> -              sb_test_iflag(path->mnt->mnt_sb, _SB_I_NOEXEC);
> +              sb_test_iflag(path->mnt->mnt_sb, SB_I_NOEXEC);
>  }
>
>  #ifdef CONFIG_USELIB
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 9da8652c10c5..cbe79fb7ac35 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -916,7 +916,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
>
>         sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
>                 (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);
> -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> +       sb_set_iflag(sb, SB_I_CGROUPWB);
>
>         if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV &&
>             (EXT2_HAS_COMPAT_FEATURE(sb, ~0U) ||
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index a776d4e7ec66..b5b2f17f1b65 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4972,7 +4972,7 @@ static int ext4_check_journal_data_mode(struct super_block *sb)
>                 if (test_opt(sb, DELALLOC))
>                         clear_opt(sb, DELALLOC);
>         } else {
> -               sb_set_iflag(sb, _SB_I_CGROUPWB);
> +               sb_set_iflag(sb, SB_I_CGROUPWB);
>         }
>
>         return 0;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 041b7b7b0810..8437612bf64b 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -4472,7 +4472,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>                 (test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
>         super_set_uuid(sb, (void *) raw_super->uuid, sizeof(raw_super->uuid));
>         super_set_sysfs_name_bdev(sb);
> -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> +       sb_set_iflag(sb, SB_I_CGROUPWB);
>
>         /* init f2fs-specific super block info */
>         sbi->valid_super_block = valid_super_block;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 3602a578b7b3..5b6254481d5c 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1566,9 +1566,9 @@ static void fuse_sb_defaults(struct super_block *sb)
>         sb->s_maxbytes = MAX_LFS_FILESIZE;
>         sb->s_time_gran = 1;
>         sb->s_export_op = &fuse_export_operations;
> -       sb_set_iflag(sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE);
> +       sb_set_iflag(sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE);
>         if (sb->s_user_ns != &init_user_ns)
> -               sb_set_iflag(sb, _SB_I_UNTRUSTED_MOUNTER);
> +               sb_set_iflag(sb, SB_I_UNTRUSTED_MOUNTER);
>         sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
>  }
>
> diff --git a/fs/inode.c b/fs/inode.c
> index a8598a968940..3091385a4de1 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -216,7 +216,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>         lockdep_set_class_and_name(&mapping->invalidate_lock,
>                                    &sb->s_type->invalidate_lock_key,
>                                    "mapping.invalidate_lock");
> -       if (sb_test_iflag(sb, _SB_I_STABLE_WRITES))
> +       if (sb_test_iflag(sb, SB_I_STABLE_WRITES))
>                 mapping_set_stable_writes(mapping);
>         inode->i_private = NULL;
>         inode->i_mapping = mapping;
> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> index 762edcf5387e..f5331f2e0b2d 100644
> --- a/fs/kernfs/mount.c
> +++ b/fs/kernfs/mount.c
> @@ -252,8 +252,8 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
>
>         info->sb = sb;
>         /* Userspace would break if executables or devices appear on sysfs */
> -       sb_set_iflag(sb, _SB_I_NOEXEC);
> -       sb_set_iflag(sb, _SB_I_NODEV);
> +       sb_set_iflag(sb, SB_I_NOEXEC);
> +       sb_set_iflag(sb, SB_I_NODEV);
>         sb->s_blocksize = PAGE_SIZE;
>         sb->s_blocksize_bits = PAGE_SHIFT;
>         sb->s_magic = kfc->magic;
> diff --git a/fs/namei.c b/fs/namei.c
> index de6936564298..9e9bca0566e9 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3308,7 +3308,7 @@ EXPORT_SYMBOL(vfs_mkobj);
>  bool may_open_dev(const struct path *path)
>  {
>         return !(path->mnt->mnt_flags & MNT_NODEV) &&
> -               !sb_test_iflag(path->mnt->mnt_sb, _SB_I_NODEV);
> +               !sb_test_iflag(path->mnt->mnt_sb, SB_I_NODEV);
>  }
>
>  static int may_open(struct mnt_idmap *idmap, const struct path *path,
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 17126569b3c4..1c5591673f96 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2919,7 +2919,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
>         struct super_block *sb = mnt->mnt_sb;
>
>         if (!__mnt_is_readonly(mnt) &&
> -          !sb_test_iflag(sb, _SB_I_TS_EXPIRY_WARNED) &&
> +          !sb_test_iflag(sb, SB_I_TS_EXPIRY_WARNED) &&
>            (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
>                 char *buf = (char *)__get_free_page(GFP_KERNEL);
>                 char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM);
> @@ -2931,7 +2931,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
>                         (unsigned long long)sb->s_time_max);
>
>                 free_page((unsigned long)buf);
> -               sb_set_iflag(sb, _SB_I_TS_EXPIRY_WARNED);
> +               sb_set_iflag(sb, SB_I_TS_EXPIRY_WARNED);
>         }
>  }
>
> @@ -5629,10 +5629,10 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
>                 return false;
>
>         /* Can this filesystem be too revealing? */
> -       if (!sb_test_iflag(sb, _SB_I_USERNS_VISIBLE))
> +       if (!sb_test_iflag(sb, SB_I_USERNS_VISIBLE))
>                 return false;
>
> -       if (!sb_test_iflag(sb, _SB_I_NOEXEC) || !sb_test_iflag(sb, _SB_I_NODEV)) {
> +       if (!sb_test_iflag(sb, SB_I_NOEXEC) || !sb_test_iflag(sb, SB_I_NODEV)) {
>                 WARN_ONCE(1, "Expected s_iflags to contain SB_I_NOEXEC and "
>                           "SB_I_NODEV\n");
>                 return true;
> diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
> index 2fbae7e2b6ce..52fc52b6350f 100644
> --- a/fs/nfs/fs_context.c
> +++ b/fs/nfs/fs_context.c
> @@ -1643,7 +1643,7 @@ static int nfs_init_fs_context(struct fs_context *fc)
>                 ctx->xprtsec.cert_serial        = TLS_NO_CERT;
>                 ctx->xprtsec.privkey_serial     = TLS_NO_PRIVKEY;
>
> -               fc->s_iflags            |= 1 << _SB_I_STABLE_WRITES;
> +               fc->s_iflags            |= 1 << SB_I_STABLE_WRITES;
>         }
>         fc->fs_private = ctx;
>         fc->ops = &nfs_fs_context_ops;
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index b6b806fb6286..246ecceda7c8 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -1094,7 +1094,7 @@ static void nfs_fill_super(struct super_block *sb, struct nfs_fs_context *ctx)
>                 sb->s_export_op = &nfs_export_ops;
>                 break;
>         case 4:
> -               sb_set_iflag(sb, _SB_I_NOUMASK);
> +               sb_set_iflag(sb, SB_I_NOUMASK);
>                 sb->s_time_gran = 1;
>                 sb->s_time_min = S64_MIN;
>                 sb->s_time_max = S64_MAX;
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index afa5263ff016..f5a60d0bcb1c 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -1453,14 +1453,14 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
>  #ifdef CONFIG_FS_POSIX_ACL
>         sb->s_flags |= SB_POSIXACL;
>  #endif
> -       sb_set_iflag(sb, _SB_I_SKIP_SYNC);
> +       sb_set_iflag(sb, SB_I_SKIP_SYNC);
>         /*
>          * Ensure that umask handling is done by the filesystems used
>          * for the the upper layer instead of overlayfs as that would
>          * lead to unexpected results.
>          */
> -       sb_set_iflag(sb, _SB_I_NOUMASK);
> -       sb_set_iflag(sb, _SB_I_EVM_HMAC_UNSUPPORTED);
> +       sb_set_iflag(sb, SB_I_NOUMASK);
> +       sb_set_iflag(sb, SB_I_EVM_HMAC_UNSUPPORTED);
>
>         err = -ENOMEM;
>         root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe);
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index ac78ec69dde9..7acfa535b925 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -171,9 +171,9 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
>         proc_apply_options(fs_info, fc, current_user_ns());
>
>         /* User space would break if executables or devices appear on proc */
> -       sb_set_iflag(s, _SB_I_USERNS_VISIBLE);
> -       sb_set_iflag(s, _SB_I_NOEXEC);
> -       sb_set_iflag(s, _SB_I_NODEV);
> +       sb_set_iflag(s, SB_I_USERNS_VISIBLE);
> +       sb_set_iflag(s, SB_I_NOEXEC);
> +       sb_set_iflag(s, SB_I_NODEV);
>         s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC;
>         s->s_blocksize = 1024;
>         s->s_blocksize_bits = 10;
> diff --git a/fs/super.c b/fs/super.c
> index e3020b3db4f0..873808245d54 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -355,7 +355,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>         s->s_bdi = &noop_backing_dev_info;
>         s->s_flags = flags;
>         if (s->s_user_ns != &init_user_ns)
> -               sb_set_iflag(s, _SB_I_NODEV);
> +               sb_set_iflag(s, SB_I_NODEV);
>         INIT_HLIST_NODE(&s->s_instances);
>         INIT_HLIST_BL_HEAD(&s->s_roots);
>         mutex_init(&s->s_sync_lock);
> @@ -589,11 +589,11 @@ void retire_super(struct super_block *sb)
>  {
>         WARN_ON(!sb->s_bdev);
>         __super_lock_excl(sb);
> -       if (sb_test_iflag(sb, _SB_I_PERSB_BDI)) {
> +       if (sb_test_iflag(sb, SB_I_PERSB_BDI)) {
>                 bdi_unregister(sb->s_bdi);
> -               sb_clear_iflag(sb, _SB_I_PERSB_BDI);
> +               sb_clear_iflag(sb, SB_I_PERSB_BDI);
>         }
> -       sb_set_iflag(sb, _SB_I_RETIRED);
> +       sb_set_iflag(sb, SB_I_RETIRED);
>         super_unlock_excl(sb);
>  }
>  EXPORT_SYMBOL(retire_super);
> @@ -678,7 +678,7 @@ void generic_shutdown_super(struct super_block *sb)
>         super_wake(sb, SB_DYING);
>         super_unlock_excl(sb);
>         if (sb->s_bdi != &noop_backing_dev_info) {
> -               if (sb_test_iflag(sb, _SB_I_PERSB_BDI))
> +               if (sb_test_iflag(sb, SB_I_PERSB_BDI))
>                         bdi_unregister(sb->s_bdi);
>                 bdi_put(sb->s_bdi);
>                 sb->s_bdi = &noop_backing_dev_info;
> @@ -1331,7 +1331,7 @@ static int super_s_dev_set(struct super_block *s, struct fs_context *fc)
>
>  static int super_s_dev_test(struct super_block *s, struct fs_context *fc)
>  {
> -       return !sb_test_iflag(s, _SB_I_RETIRED) &&
> +       return !sb_test_iflag(s, SB_I_RETIRED) &&
>                 s->s_dev == *(dev_t *)fc->sget_key;
>  }
>
> @@ -1584,7 +1584,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
>         sb->s_bdev = bdev;
>         sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
>         if (bdev_stable_writes(bdev))
> -               sb_set_iflag(sb, _SB_I_STABLE_WRITES);
> +               sb_set_iflag(sb, SB_I_STABLE_WRITES);
>         spin_unlock(&sb_lock);
>
>         snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
> @@ -1648,7 +1648,7 @@ EXPORT_SYMBOL(get_tree_bdev);
>
>  static int test_bdev_super(struct super_block *s, void *data)
>  {
> -       return !sb_test_iflag(s, _SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
> +       return !sb_test_iflag(s, SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
>  }
>
>  struct dentry *mount_bdev(struct file_system_type *fs_type,
> @@ -1864,7 +1864,7 @@ int super_setup_bdi_name(struct super_block *sb, char *fmt, ...)
>         }
>         WARN_ON(sb->s_bdi != &noop_backing_dev_info);
>         sb->s_bdi = bdi;
> -       sb_set_iflag(sb, _SB_I_PERSB_BDI);
> +       sb_set_iflag(sb, SB_I_PERSB_BDI);
>
>         return 0;
>  }
> diff --git a/fs/sync.c b/fs/sync.c
> index 4e5ad48316be..a7c0645aa9dc 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -79,7 +79,7 @@ static void sync_inodes_one_sb(struct super_block *sb, void *arg)
>
>  static void sync_fs_one_sb(struct super_block *sb, void *arg)
>  {
> -       if (!sb_rdonly(sb) && !sb_test_iflag(sb, _SB_I_SKIP_SYNC) &&
> +       if (!sb_rdonly(sb) && !sb_test_iflag(sb, SB_I_SKIP_SYNC) &&
>             sb->s_op->sync_fs)
>                 sb->s_op->sync_fs(sb, *(int *)arg);
>  }
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index 124385961da7..b461c216731a 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -33,7 +33,7 @@ static int sysfs_get_tree(struct fs_context *fc)
>                 return ret;
>
>         if (kfc->new_sb_created)
> -               sb_set_iflag(fc->root->d_sb, _SB_I_USERNS_VISIBLE);
> +               sb_set_iflag(fc->root->d_sb, SB_I_USERNS_VISIBLE);
>         return 0;
>  }
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 7707f2a1a836..0020724e3b0a 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1701,7 +1701,7 @@ xfs_fs_fill_super(
>                 sb->s_time_max = XFS_LEGACY_TIME_MAX;
>         }
>         trace_xfs_inode_timestamp_range(mp, sb->s_time_min, sb->s_time_max);
> -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> +       sb_set_iflag(sb, SB_I_CGROUPWB);
>
>         set_posix_acl_flag(sb);
>
> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index 54fdae7b1be4..bc5f96ba499e 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -176,7 +176,7 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
>         return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
>                 cgroup_subsys_on_dfl(io_cgrp_subsys) &&
>                 (bdi->capabilities & BDI_CAP_WRITEBACK) &&
> -               sb_test_iflag(inode->i_sb, _SB_I_CGROUPWB);
> +               sb_test_iflag(inode->i_sb, SB_I_CGROUPWB);
>  }
>
>  /**
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 65e70ceb335e..52841aab13fb 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1174,22 +1174,22 @@ extern int send_sigurg(struct fown_struct *fown);
>
>  /* sb->s_iflags */
>  enum {
> -       _SB_I_CGROUPWB,         /* cgroup-aware writeback enabled */
> -       _SB_I_NOEXEC,           /* Ignore executables on this fs */
> -       _SB_I_NODEV,            /* Ignore devices on this fs */
> -       _SB_I_STABLE_WRITES,    /* don't modify blks until WB is done */
> +       SB_I_CGROUPWB,          /* cgroup-aware writeback enabled */
> +       SB_I_NOEXEC,            /* Ignore executables on this fs */
> +       SB_I_NODEV,             /* Ignore devices on this fs */
> +       SB_I_STABLE_WRITES,     /* don't modify blks until WB is done */
>
>         /* sb->s_iflags to limit user namespace mounts */
> -       _SB_I_USERNS_VISIBLE,   /* fstype already mounted */
> -       _SB_I_IMA_UNVERIFIABLE_SIGNATURE,
> -       _SB_I_UNTRUSTED_MOUNTER,
> -       _SB_I_EVM_HMAC_UNSUPPORTED,
> -
> -       _SB_I_SKIP_SYNC,        /* Skip superblock at global sync */
> -       _SB_I_PERSB_BDI,        /* has a per-sb bdi */
> -       _SB_I_TS_EXPIRY_WARNED, /* warned about timestamp range expiry */
> -       _SB_I_RETIRED,          /* superblock shouldn't be reused */
> -       _SB_I_NOUMASK,          /* VFS does not apply umask */
> +       SB_I_USERNS_VISIBLE,    /* fstype already mounted */
> +       SB_I_IMA_UNVERIFIABLE_SIGNATURE,
> +       SB_I_UNTRUSTED_MOUNTER,
> +       SB_I_EVM_HMAC_UNSUPPORTED,
> +
> +       SB_I_SKIP_SYNC, /* Skip superblock at global sync */
> +       SB_I_PERSB_BDI, /* has a per-sb bdi */
> +       SB_I_TS_EXPIRY_WARNED,  /* warned about timestamp range expiry */
> +       SB_I_RETIRED,           /* superblock shouldn't be reused */
> +       SB_I_NOUMASK,           /* VFS does not apply umask */
>  };
>
>  /* Possible states of 'frozen' field */
> diff --git a/include/linux/namei.h b/include/linux/namei.h
> index 3fbf340dac1a..0bd6db9adb7f 100644
> --- a/include/linux/namei.h
> +++ b/include/linux/namei.h
> @@ -107,7 +107,7 @@ extern void unlock_rename(struct dentry *, struct dentry *);
>   */
>  static inline umode_t __must_check mode_strip_umask(const struct inode *dir, umode_t mode)
>  {
> -       if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, _SB_I_NOUMASK))
> +       if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, SB_I_NOUMASK))
>                 mode &= ~current_umask();
>         return mode;
>  }
> diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> index e73fff4c2f12..abe4dfe4374c 100644
> --- a/ipc/mqueue.c
> +++ b/ipc/mqueue.c
> @@ -406,8 +406,8 @@ static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc)
>         struct inode *inode;
>         struct ipc_namespace *ns = sb->s_fs_info;
>
> -       sb_set_iflag(sb, _SB_I_NOEXEC);
> -       sb_set_iflag(sb, _SB_I_NODEV);
> +       sb_set_iflag(sb, SB_I_NOEXEC);
> +       sb_set_iflag(sb, SB_I_NODEV);
>         sb->s_blocksize = PAGE_SIZE;
>         sb->s_blocksize_bits = PAGE_SHIFT;
>         sb->s_magic = MQUEUE_MAGIC;
> diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
> index 3ff29bf73f04..a15a87250d55 100644
> --- a/security/integrity/evm/evm_main.c
> +++ b/security/integrity/evm/evm_main.c
> @@ -155,7 +155,7 @@ static int is_unsupported_hmac_fs(struct dentry *dentry)
>  {
>         struct inode *inode = d_backing_inode(dentry);
>
> -       if (sb_test_iflag(inode->i_sb, _SB_I_EVM_HMAC_UNSUPPORTED)) {
> +       if (sb_test_iflag(inode->i_sb, SB_I_EVM_HMAC_UNSUPPORTED)) {
>                 pr_info_once("%s not supported\n", inode->i_sb->s_type->name);
>                 return 1;
>         }
> diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
> index 9c290dd8a4ac..dfa16dba5d89 100644
> --- a/security/integrity/ima/ima_appraise.c
> +++ b/security/integrity/ima/ima_appraise.c
> @@ -564,8 +564,8 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
>          * system not willing to accept such a risk, fail the file signature
>          * verification.
>          */
> -       if (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> -           (sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) ||
> +       if (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> +           (sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) ||
>              (iint->flags & IMA_FAIL_UNVERIFIABLE_SIGS))) {
>                 status = INTEGRITY_FAIL;
>                 cause = "unverifiable-signature";
> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index b04eaa33eca4..27d446136c4f 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -280,8 +280,8 @@ static int process_measurement(struct file *file, const struct cred *cred,
>          * (Limited to privileged mounted filesystems.)
>          */
>         if (test_and_clear_bit(IMA_CHANGE_XATTR, &iint->atomic_flags) ||
> -           (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> -            !sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) &&
> +           (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> +            !sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) &&
>              !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
>                 iint->flags &= ~IMA_DONE_MASK;
>                 iint->measured_pcrs = 0;
> --
> 2.35.3
>
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants
  2024-08-08 11:47   ` Amir Goldstein
@ 2024-08-08 14:35     ` Darrick J. Wong
  2024-08-08 14:50       ` Christian Brauner
  0 siblings, 1 reply; 25+ messages in thread
From: Darrick J. Wong @ 2024-08-08 14:35 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, linux-fsdevel, Dave Chinner, Christian Brauner

On Thu, Aug 08, 2024 at 01:47:20PM +0200, Amir Goldstein wrote:
> On Wed, Aug 7, 2024 at 8:31 PM Jan Kara <jack@suse.cz> wrote:
> >
> > Now that old constants are gone, remove the unnecessary underscore from
> > the new _SB_I_ constants. Pure mechanical replacement, no functional
> > changes.
> >
> 
> This is a potential backporting bomb.
> It is true that code using the old constant names with new macros
> will not build on stable kernels, but I think this is still asking for trouble.
> 
> Also, it is a bit strange that SB_* flags are bit masks and SB_I_*
> flags are bit numbers.
> How about leaving the underscore and using sb_*_iflag() macros to add
> the underscore?

Or append _BIT to the new names, as is sometimes done elsewhere in the
kernel?

#define SB_I_VERSION_BIT	23

etc.
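
For illustration only, here is a minimal userspace sketch of what the _BIT naming could look like. It is not taken from the series: the SB_I_*_BIT names are hypothetical, and struct super_block and the sb_*_iflag() helpers are simplified, non-atomic stand-ins for the kernel ones.

```c
#include <stdbool.h>

/* Bit numbers carry a _BIT suffix (hypothetical names, not from the patch). */
enum {
	SB_I_CGROUPWB_BIT,	/* bit 0 */
	SB_I_NOEXEC_BIT,	/* bit 1 */
	SB_I_NODEV_BIT,		/* bit 2 */
};

/* Stand-in holding only the field this sketch needs. */
struct super_block {
	unsigned long s_iflags;
};

/* Simplified, non-atomic stand-ins for the sb_*_iflag() helpers. */
static inline void sb_set_iflag(struct super_block *sb, int bit)
{
	sb->s_iflags |= 1UL << bit;
}

static inline void sb_clear_iflag(struct super_block *sb, int bit)
{
	sb->s_iflags &= ~(1UL << bit);
}

static inline bool sb_test_iflag(const struct super_block *sb, int bit)
{
	return (sb->s_iflags & (1UL << bit)) != 0;
}

/* Set two flags, clear one, and return the resulting s_iflags word. */
static unsigned long iflag_demo(void)
{
	struct super_block sb = { .s_iflags = 0 };

	sb_set_iflag(&sb, SB_I_NOEXEC_BIT);
	sb_set_iflag(&sb, SB_I_NODEV_BIT);
	sb_clear_iflag(&sb, SB_I_NODEV_BIT);
	return sb.s_iflags;
}
```

With this naming, a stale backport that still writes "sb->s_iflags |= SB_I_NODEV" fails to build because the plain name no longer exists, and the suffix makes it obvious at the call site that a bit number, not a mask, is being passed.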

--D

> Thanks,
> Amir.
> 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  block/bdev.c                          |  2 +-
> >  drivers/android/binderfs.c            |  4 ++--
> >  fs/aio.c                              |  2 +-
> >  fs/btrfs/super.c                      |  2 +-
> >  fs/devpts/inode.c                     |  2 +-
> >  fs/exec.c                             |  2 +-
> >  fs/ext2/super.c                       |  2 +-
> >  fs/ext4/super.c                       |  2 +-
> >  fs/f2fs/super.c                       |  2 +-
> >  fs/fuse/inode.c                       |  4 ++--
> >  fs/inode.c                            |  2 +-
> >  fs/kernfs/mount.c                     |  4 ++--
> >  fs/namei.c                            |  2 +-
> >  fs/namespace.c                        |  8 ++++----
> >  fs/nfs/fs_context.c                   |  2 +-
> >  fs/nfs/super.c                        |  2 +-
> >  fs/overlayfs/super.c                  |  6 +++---
> >  fs/proc/root.c                        |  6 +++---
> >  fs/super.c                            | 18 ++++++++---------
> >  fs/sync.c                             |  2 +-
> >  fs/sysfs/mount.c                      |  2 +-
> >  fs/xfs/xfs_super.c                    |  2 +-
> >  include/linux/backing-dev.h           |  2 +-
> >  include/linux/fs.h                    | 28 +++++++++++++--------------
> >  include/linux/namei.h                 |  2 +-
> >  ipc/mqueue.c                          |  4 ++--
> >  security/integrity/evm/evm_main.c     |  2 +-
> >  security/integrity/ima/ima_appraise.c |  4 ++--
> >  security/integrity/ima/ima_main.c     |  4 ++--
> >  29 files changed, 63 insertions(+), 63 deletions(-)
> >
> > diff --git a/block/bdev.c b/block/bdev.c
> > index c1ea2aeb93dd..6c13ba60c0b1 100644
> > --- a/block/bdev.c
> > +++ b/block/bdev.c
> > @@ -373,7 +373,7 @@ static int bd_init_fs_context(struct fs_context *fc)
> >         struct pseudo_fs_context *ctx = init_pseudo(fc, BDEVFS_MAGIC);
> >         if (!ctx)
> >                 return -ENOMEM;
> > -       fc->s_iflags |= 1 << _SB_I_CGROUPWB;
> > +       fc->s_iflags |= 1 << SB_I_CGROUPWB;
> >         ctx->ops = &bdev_sops;
> >         return 0;
> >  }
> > diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
> > index f9454b93c2f7..6070923fbfbd 100644
> > --- a/drivers/android/binderfs.c
> > +++ b/drivers/android/binderfs.c
> > @@ -672,8 +672,8 @@ static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc)
> >          * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both
> >          * necessary and safe.
> >          */
> > -       sb_clear_iflag(sb, _SB_I_NODEV);
> > -       sb_set_iflag(sb, _SB_I_NOEXEC);
> > +       sb_clear_iflag(sb, SB_I_NODEV);
> > +       sb_set_iflag(sb, SB_I_NOEXEC);
> >         sb->s_magic = BINDERFS_SUPER_MAGIC;
> >         sb->s_op = &binderfs_super_ops;
> >         sb->s_time_gran = 1;
> > diff --git a/fs/aio.c b/fs/aio.c
> > index 63ce0736c3a3..48d99221ff57 100644
> > --- a/fs/aio.c
> > +++ b/fs/aio.c
> > @@ -279,7 +279,7 @@ static int aio_init_fs_context(struct fs_context *fc)
> >  {
> >         if (!init_pseudo(fc, AIO_RING_MAGIC))
> >                 return -ENOMEM;
> > -       fc->s_iflags |= 1 << _SB_I_NOEXEC;
> > +       fc->s_iflags |= 1 << SB_I_NOEXEC;
> >         return 0;
> >  }
> >
> > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> > index 321696697279..fb3938ec127c 100644
> > --- a/fs/btrfs/super.c
> > +++ b/fs/btrfs/super.c
> > @@ -950,7 +950,7 @@ static int btrfs_fill_super(struct super_block *sb,
> >  #endif
> >         sb->s_xattr = btrfs_xattr_handlers;
> >         sb->s_time_gran = 1;
> > -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> > +       sb_set_iflag(sb, SB_I_CGROUPWB);
> >
> >         err = super_setup_bdi(sb);
> >         if (err) {
> > diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> > index d473156d2791..6094cb7e1a16 100644
> > --- a/fs/devpts/inode.c
> > +++ b/fs/devpts/inode.c
> > @@ -428,7 +428,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
> >         struct inode *inode;
> >         int error;
> >
> > -       sb_clear_iflag(s, _SB_I_NODEV);
> > +       sb_clear_iflag(s, SB_I_NODEV);
> >         s->s_blocksize = 1024;
> >         s->s_blocksize_bits = 10;
> >         s->s_magic = DEVPTS_SUPER_MAGIC;
> > diff --git a/fs/exec.c b/fs/exec.c
> > index b62b67bea10b..a8bd15aa6bd8 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -112,7 +112,7 @@ static inline void put_binfmt(struct linux_binfmt * fmt)
> >  bool path_noexec(const struct path *path)
> >  {
> >         return (path->mnt->mnt_flags & MNT_NOEXEC) ||
> > -              sb_test_iflag(path->mnt->mnt_sb, _SB_I_NOEXEC);
> > +              sb_test_iflag(path->mnt->mnt_sb, SB_I_NOEXEC);
> >  }
> >
> >  #ifdef CONFIG_USELIB
> > diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> > index 9da8652c10c5..cbe79fb7ac35 100644
> > --- a/fs/ext2/super.c
> > +++ b/fs/ext2/super.c
> > @@ -916,7 +916,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
> >
> >         sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
> >                 (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);
> > -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> > +       sb_set_iflag(sb, SB_I_CGROUPWB);
> >
> >         if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV &&
> >             (EXT2_HAS_COMPAT_FEATURE(sb, ~0U) ||
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index a776d4e7ec66..b5b2f17f1b65 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -4972,7 +4972,7 @@ static int ext4_check_journal_data_mode(struct super_block *sb)
> >                 if (test_opt(sb, DELALLOC))
> >                         clear_opt(sb, DELALLOC);
> >         } else {
> > -               sb_set_iflag(sb, _SB_I_CGROUPWB);
> > +               sb_set_iflag(sb, SB_I_CGROUPWB);
> >         }
> >
> >         return 0;
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index 041b7b7b0810..8437612bf64b 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -4472,7 +4472,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >                 (test_opt(sbi, POSIX_ACL) ? SB_POSIXACL : 0);
> >         super_set_uuid(sb, (void *) raw_super->uuid, sizeof(raw_super->uuid));
> >         super_set_sysfs_name_bdev(sb);
> > -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> > +       sb_set_iflag(sb, SB_I_CGROUPWB);
> >
> >         /* init f2fs-specific super block info */
> >         sbi->valid_super_block = valid_super_block;
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index 3602a578b7b3..5b6254481d5c 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -1566,9 +1566,9 @@ static void fuse_sb_defaults(struct super_block *sb)
> >         sb->s_maxbytes = MAX_LFS_FILESIZE;
> >         sb->s_time_gran = 1;
> >         sb->s_export_op = &fuse_export_operations;
> > -       sb_set_iflag(sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE);
> > +       sb_set_iflag(sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE);
> >         if (sb->s_user_ns != &init_user_ns)
> > -               sb_set_iflag(sb, _SB_I_UNTRUSTED_MOUNTER);
> > +               sb_set_iflag(sb, SB_I_UNTRUSTED_MOUNTER);
> >         sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
> >  }
> >
> > diff --git a/fs/inode.c b/fs/inode.c
> > index a8598a968940..3091385a4de1 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -216,7 +216,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
> >         lockdep_set_class_and_name(&mapping->invalidate_lock,
> >                                    &sb->s_type->invalidate_lock_key,
> >                                    "mapping.invalidate_lock");
> > -       if (sb_test_iflag(sb, _SB_I_STABLE_WRITES))
> > +       if (sb_test_iflag(sb, SB_I_STABLE_WRITES))
> >                 mapping_set_stable_writes(mapping);
> >         inode->i_private = NULL;
> >         inode->i_mapping = mapping;
> > diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> > index 762edcf5387e..f5331f2e0b2d 100644
> > --- a/fs/kernfs/mount.c
> > +++ b/fs/kernfs/mount.c
> > @@ -252,8 +252,8 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
> >
> >         info->sb = sb;
> >         /* Userspace would break if executables or devices appear on sysfs */
> > -       sb_set_iflag(sb, _SB_I_NOEXEC);
> > -       sb_set_iflag(sb, _SB_I_NODEV);
> > +       sb_set_iflag(sb, SB_I_NOEXEC);
> > +       sb_set_iflag(sb, SB_I_NODEV);
> >         sb->s_blocksize = PAGE_SIZE;
> >         sb->s_blocksize_bits = PAGE_SHIFT;
> >         sb->s_magic = kfc->magic;
> > diff --git a/fs/namei.c b/fs/namei.c
> > index de6936564298..9e9bca0566e9 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -3308,7 +3308,7 @@ EXPORT_SYMBOL(vfs_mkobj);
> >  bool may_open_dev(const struct path *path)
> >  {
> >         return !(path->mnt->mnt_flags & MNT_NODEV) &&
> > -               !sb_test_iflag(path->mnt->mnt_sb, _SB_I_NODEV);
> > +               !sb_test_iflag(path->mnt->mnt_sb, SB_I_NODEV);
> >  }
> >
> >  static int may_open(struct mnt_idmap *idmap, const struct path *path,
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index 17126569b3c4..1c5591673f96 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -2919,7 +2919,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
> >         struct super_block *sb = mnt->mnt_sb;
> >
> >         if (!__mnt_is_readonly(mnt) &&
> > -          !sb_test_iflag(sb, _SB_I_TS_EXPIRY_WARNED) &&
> > +          !sb_test_iflag(sb, SB_I_TS_EXPIRY_WARNED) &&
> >            (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
> >                 char *buf = (char *)__get_free_page(GFP_KERNEL);
> >                 char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM);
> > @@ -2931,7 +2931,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
> >                         (unsigned long long)sb->s_time_max);
> >
> >                 free_page((unsigned long)buf);
> > -               sb_set_iflag(sb, _SB_I_TS_EXPIRY_WARNED);
> > +               sb_set_iflag(sb, SB_I_TS_EXPIRY_WARNED);
> >         }
> >  }
> >
> > @@ -5629,10 +5629,10 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
> >                 return false;
> >
> >         /* Can this filesystem be too revealing? */
> > -       if (!sb_test_iflag(sb, _SB_I_USERNS_VISIBLE))
> > +       if (!sb_test_iflag(sb, SB_I_USERNS_VISIBLE))
> >                 return false;
> >
> > -       if (!sb_test_iflag(sb, _SB_I_NOEXEC) || !sb_test_iflag(sb, _SB_I_NODEV)) {
> > +       if (!sb_test_iflag(sb, SB_I_NOEXEC) || !sb_test_iflag(sb, SB_I_NODEV)) {
> >                 WARN_ONCE(1, "Expected s_iflags to contain SB_I_NOEXEC and "
> >                           "SB_I_NODEV\n");
> >                 return true;
> > diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
> > index 2fbae7e2b6ce..52fc52b6350f 100644
> > --- a/fs/nfs/fs_context.c
> > +++ b/fs/nfs/fs_context.c
> > @@ -1643,7 +1643,7 @@ static int nfs_init_fs_context(struct fs_context *fc)
> >                 ctx->xprtsec.cert_serial        = TLS_NO_CERT;
> >                 ctx->xprtsec.privkey_serial     = TLS_NO_PRIVKEY;
> >
> > -               fc->s_iflags            |= 1 << _SB_I_STABLE_WRITES;
> > +               fc->s_iflags            |= 1 << SB_I_STABLE_WRITES;
> >         }
> >         fc->fs_private = ctx;
> >         fc->ops = &nfs_fs_context_ops;
> > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > index b6b806fb6286..246ecceda7c8 100644
> > --- a/fs/nfs/super.c
> > +++ b/fs/nfs/super.c
> > @@ -1094,7 +1094,7 @@ static void nfs_fill_super(struct super_block *sb, struct nfs_fs_context *ctx)
> >                 sb->s_export_op = &nfs_export_ops;
> >                 break;
> >         case 4:
> > -               sb_set_iflag(sb, _SB_I_NOUMASK);
> > +               sb_set_iflag(sb, SB_I_NOUMASK);
> >                 sb->s_time_gran = 1;
> >                 sb->s_time_min = S64_MIN;
> >                 sb->s_time_max = S64_MAX;
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index afa5263ff016..f5a60d0bcb1c 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -1453,14 +1453,14 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> >  #ifdef CONFIG_FS_POSIX_ACL
> >         sb->s_flags |= SB_POSIXACL;
> >  #endif
> > -       sb_set_iflag(sb, _SB_I_SKIP_SYNC);
> > +       sb_set_iflag(sb, SB_I_SKIP_SYNC);
> >         /*
> >          * Ensure that umask handling is done by the filesystems used
> >          * for the the upper layer instead of overlayfs as that would
> >          * lead to unexpected results.
> >          */
> > -       sb_set_iflag(sb, _SB_I_NOUMASK);
> > -       sb_set_iflag(sb, _SB_I_EVM_HMAC_UNSUPPORTED);
> > +       sb_set_iflag(sb, SB_I_NOUMASK);
> > +       sb_set_iflag(sb, SB_I_EVM_HMAC_UNSUPPORTED);
> >
> >         err = -ENOMEM;
> >         root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe);
> > diff --git a/fs/proc/root.c b/fs/proc/root.c
> > index ac78ec69dde9..7acfa535b925 100644
> > --- a/fs/proc/root.c
> > +++ b/fs/proc/root.c
> > @@ -171,9 +171,9 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
> >         proc_apply_options(fs_info, fc, current_user_ns());
> >
> >         /* User space would break if executables or devices appear on proc */
> > -       sb_set_iflag(s, _SB_I_USERNS_VISIBLE);
> > -       sb_set_iflag(s, _SB_I_NOEXEC);
> > -       sb_set_iflag(s, _SB_I_NODEV);
> > +       sb_set_iflag(s, SB_I_USERNS_VISIBLE);
> > +       sb_set_iflag(s, SB_I_NOEXEC);
> > +       sb_set_iflag(s, SB_I_NODEV);
> >         s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC;
> >         s->s_blocksize = 1024;
> >         s->s_blocksize_bits = 10;
> > diff --git a/fs/super.c b/fs/super.c
> > index e3020b3db4f0..873808245d54 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -355,7 +355,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
> >         s->s_bdi = &noop_backing_dev_info;
> >         s->s_flags = flags;
> >         if (s->s_user_ns != &init_user_ns)
> > -               sb_set_iflag(s, _SB_I_NODEV);
> > +               sb_set_iflag(s, SB_I_NODEV);
> >         INIT_HLIST_NODE(&s->s_instances);
> >         INIT_HLIST_BL_HEAD(&s->s_roots);
> >         mutex_init(&s->s_sync_lock);
> > @@ -589,11 +589,11 @@ void retire_super(struct super_block *sb)
> >  {
> >         WARN_ON(!sb->s_bdev);
> >         __super_lock_excl(sb);
> > -       if (sb_test_iflag(sb, _SB_I_PERSB_BDI)) {
> > +       if (sb_test_iflag(sb, SB_I_PERSB_BDI)) {
> >                 bdi_unregister(sb->s_bdi);
> > -               sb_clear_iflag(sb, _SB_I_PERSB_BDI);
> > +               sb_clear_iflag(sb, SB_I_PERSB_BDI);
> >         }
> > -       sb_set_iflag(sb, _SB_I_RETIRED);
> > +       sb_set_iflag(sb, SB_I_RETIRED);
> >         super_unlock_excl(sb);
> >  }
> >  EXPORT_SYMBOL(retire_super);
> > @@ -678,7 +678,7 @@ void generic_shutdown_super(struct super_block *sb)
> >         super_wake(sb, SB_DYING);
> >         super_unlock_excl(sb);
> >         if (sb->s_bdi != &noop_backing_dev_info) {
> > -               if (sb_test_iflag(sb, _SB_I_PERSB_BDI))
> > +               if (sb_test_iflag(sb, SB_I_PERSB_BDI))
> >                         bdi_unregister(sb->s_bdi);
> >                 bdi_put(sb->s_bdi);
> >                 sb->s_bdi = &noop_backing_dev_info;
> > @@ -1331,7 +1331,7 @@ static int super_s_dev_set(struct super_block *s, struct fs_context *fc)
> >
> >  static int super_s_dev_test(struct super_block *s, struct fs_context *fc)
> >  {
> > -       return !sb_test_iflag(s, _SB_I_RETIRED) &&
> > +       return !sb_test_iflag(s, SB_I_RETIRED) &&
> >                 s->s_dev == *(dev_t *)fc->sget_key;
> >  }
> >
> > @@ -1584,7 +1584,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
> >         sb->s_bdev = bdev;
> >         sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
> >         if (bdev_stable_writes(bdev))
> > -               sb_set_iflag(sb, _SB_I_STABLE_WRITES);
> > +               sb_set_iflag(sb, SB_I_STABLE_WRITES);
> >         spin_unlock(&sb_lock);
> >
> >         snprintf(sb->s_id, sizeof(sb->s_id), "%pg", bdev);
> > @@ -1648,7 +1648,7 @@ EXPORT_SYMBOL(get_tree_bdev);
> >
> >  static int test_bdev_super(struct super_block *s, void *data)
> >  {
> > -       return !sb_test_iflag(s, _SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
> > +       return !sb_test_iflag(s, SB_I_RETIRED) && s->s_dev == *(dev_t *)data;
> >  }
> >
> >  struct dentry *mount_bdev(struct file_system_type *fs_type,
> > @@ -1864,7 +1864,7 @@ int super_setup_bdi_name(struct super_block *sb, char *fmt, ...)
> >         }
> >         WARN_ON(sb->s_bdi != &noop_backing_dev_info);
> >         sb->s_bdi = bdi;
> > -       sb_set_iflag(sb, _SB_I_PERSB_BDI);
> > +       sb_set_iflag(sb, SB_I_PERSB_BDI);
> >
> >         return 0;
> >  }
> > diff --git a/fs/sync.c b/fs/sync.c
> > index 4e5ad48316be..a7c0645aa9dc 100644
> > --- a/fs/sync.c
> > +++ b/fs/sync.c
> > @@ -79,7 +79,7 @@ static void sync_inodes_one_sb(struct super_block *sb, void *arg)
> >
> >  static void sync_fs_one_sb(struct super_block *sb, void *arg)
> >  {
> > -       if (!sb_rdonly(sb) && !sb_test_iflag(sb, _SB_I_SKIP_SYNC) &&
> > +       if (!sb_rdonly(sb) && !sb_test_iflag(sb, SB_I_SKIP_SYNC) &&
> >             sb->s_op->sync_fs)
> >                 sb->s_op->sync_fs(sb, *(int *)arg);
> >  }
> > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> > index 124385961da7..b461c216731a 100644
> > --- a/fs/sysfs/mount.c
> > +++ b/fs/sysfs/mount.c
> > @@ -33,7 +33,7 @@ static int sysfs_get_tree(struct fs_context *fc)
> >                 return ret;
> >
> >         if (kfc->new_sb_created)
> > -               sb_set_iflag(fc->root->d_sb, _SB_I_USERNS_VISIBLE);
> > +               sb_set_iflag(fc->root->d_sb, SB_I_USERNS_VISIBLE);
> >         return 0;
> >  }
> >
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 7707f2a1a836..0020724e3b0a 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1701,7 +1701,7 @@ xfs_fs_fill_super(
> >                 sb->s_time_max = XFS_LEGACY_TIME_MAX;
> >         }
> >         trace_xfs_inode_timestamp_range(mp, sb->s_time_min, sb->s_time_max);
> > -       sb_set_iflag(sb, _SB_I_CGROUPWB);
> > +       sb_set_iflag(sb, SB_I_CGROUPWB);
> >
> >         set_posix_acl_flag(sb);
> >
> > diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> > index 54fdae7b1be4..bc5f96ba499e 100644
> > --- a/include/linux/backing-dev.h
> > +++ b/include/linux/backing-dev.h
> > @@ -176,7 +176,7 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
> >         return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> >                 cgroup_subsys_on_dfl(io_cgrp_subsys) &&
> >                 (bdi->capabilities & BDI_CAP_WRITEBACK) &&
> > -               sb_test_iflag(inode->i_sb, _SB_I_CGROUPWB);
> > +               sb_test_iflag(inode->i_sb, SB_I_CGROUPWB);
> >  }
> >
> >  /**
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 65e70ceb335e..52841aab13fb 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1174,22 +1174,22 @@ extern int send_sigurg(struct fown_struct *fown);
> >
> >  /* sb->s_iflags */
> >  enum {
> > -       _SB_I_CGROUPWB,         /* cgroup-aware writeback enabled */
> > -       _SB_I_NOEXEC,           /* Ignore executables on this fs */
> > -       _SB_I_NODEV,            /* Ignore devices on this fs */
> > -       _SB_I_STABLE_WRITES,    /* don't modify blks until WB is done */
> > +       SB_I_CGROUPWB,          /* cgroup-aware writeback enabled */
> > +       SB_I_NOEXEC,            /* Ignore executables on this fs */
> > +       SB_I_NODEV,             /* Ignore devices on this fs */
> > +       SB_I_STABLE_WRITES,     /* don't modify blks until WB is done */
> >
> >         /* sb->s_iflags to limit user namespace mounts */
> > -       _SB_I_USERNS_VISIBLE,   /* fstype already mounted */
> > -       _SB_I_IMA_UNVERIFIABLE_SIGNATURE,
> > -       _SB_I_UNTRUSTED_MOUNTER,
> > -       _SB_I_EVM_HMAC_UNSUPPORTED,
> > -
> > -       _SB_I_SKIP_SYNC,        /* Skip superblock at global sync */
> > -       _SB_I_PERSB_BDI,        /* has a per-sb bdi */
> > -       _SB_I_TS_EXPIRY_WARNED, /* warned about timestamp range expiry */
> > -       _SB_I_RETIRED,          /* superblock shouldn't be reused */
> > -       _SB_I_NOUMASK,          /* VFS does not apply umask */
> > +       SB_I_USERNS_VISIBLE,    /* fstype already mounted */
> > +       SB_I_IMA_UNVERIFIABLE_SIGNATURE,
> > +       SB_I_UNTRUSTED_MOUNTER,
> > +       SB_I_EVM_HMAC_UNSUPPORTED,
> > +
> > +       SB_I_SKIP_SYNC, /* Skip superblock at global sync */
> > +       SB_I_PERSB_BDI, /* has a per-sb bdi */
> > +       SB_I_TS_EXPIRY_WARNED,  /* warned about timestamp range expiry */
> > +       SB_I_RETIRED,           /* superblock shouldn't be reused */
> > +       SB_I_NOUMASK,           /* VFS does not apply umask */
> >  };
> >
> >  /* Possible states of 'frozen' field */
> > diff --git a/include/linux/namei.h b/include/linux/namei.h
> > index 3fbf340dac1a..0bd6db9adb7f 100644
> > --- a/include/linux/namei.h
> > +++ b/include/linux/namei.h
> > @@ -107,7 +107,7 @@ extern void unlock_rename(struct dentry *, struct dentry *);
> >   */
> >  static inline umode_t __must_check mode_strip_umask(const struct inode *dir, umode_t mode)
> >  {
> > -       if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, _SB_I_NOUMASK))
> > +       if (!IS_POSIXACL(dir) && !sb_test_iflag(dir->i_sb, SB_I_NOUMASK))
> >                 mode &= ~current_umask();
> >         return mode;
> >  }
> > diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> > index e73fff4c2f12..abe4dfe4374c 100644
> > --- a/ipc/mqueue.c
> > +++ b/ipc/mqueue.c
> > @@ -406,8 +406,8 @@ static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc)
> >         struct inode *inode;
> >         struct ipc_namespace *ns = sb->s_fs_info;
> >
> > -       sb_set_iflag(sb, _SB_I_NOEXEC);
> > -       sb_set_iflag(sb, _SB_I_NODEV);
> > +       sb_set_iflag(sb, SB_I_NOEXEC);
> > +       sb_set_iflag(sb, SB_I_NODEV);
> >         sb->s_blocksize = PAGE_SIZE;
> >         sb->s_blocksize_bits = PAGE_SHIFT;
> >         sb->s_magic = MQUEUE_MAGIC;
> > diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
> > index 3ff29bf73f04..a15a87250d55 100644
> > --- a/security/integrity/evm/evm_main.c
> > +++ b/security/integrity/evm/evm_main.c
> > @@ -155,7 +155,7 @@ static int is_unsupported_hmac_fs(struct dentry *dentry)
> >  {
> >         struct inode *inode = d_backing_inode(dentry);
> >
> > -       if (sb_test_iflag(inode->i_sb, _SB_I_EVM_HMAC_UNSUPPORTED)) {
> > +       if (sb_test_iflag(inode->i_sb, SB_I_EVM_HMAC_UNSUPPORTED)) {
> >                 pr_info_once("%s not supported\n", inode->i_sb->s_type->name);
> >                 return 1;
> >         }
> > diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
> > index 9c290dd8a4ac..dfa16dba5d89 100644
> > --- a/security/integrity/ima/ima_appraise.c
> > +++ b/security/integrity/ima/ima_appraise.c
> > @@ -564,8 +564,8 @@ int ima_appraise_measurement(enum ima_hooks func, struct ima_iint_cache *iint,
> >          * system not willing to accept such a risk, fail the file signature
> >          * verification.
> >          */
> > -       if (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> > -           (sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) ||
> > +       if (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> > +           (sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) ||
> >              (iint->flags & IMA_FAIL_UNVERIFIABLE_SIGS))) {
> >                 status = INTEGRITY_FAIL;
> >                 cause = "unverifiable-signature";
> > diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> > index b04eaa33eca4..27d446136c4f 100644
> > --- a/security/integrity/ima/ima_main.c
> > +++ b/security/integrity/ima/ima_main.c
> > @@ -280,8 +280,8 @@ static int process_measurement(struct file *file, const struct cred *cred,
> >          * (Limited to privileged mounted filesystems.)
> >          */
> >         if (test_and_clear_bit(IMA_CHANGE_XATTR, &iint->atomic_flags) ||
> > -           (sb_test_iflag(inode->i_sb, _SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> > -            !sb_test_iflag(inode->i_sb, _SB_I_UNTRUSTED_MOUNTER) &&
> > +           (sb_test_iflag(inode->i_sb, SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
> > +            !sb_test_iflag(inode->i_sb, SB_I_UNTRUSTED_MOUNTER) &&
> >              !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
> >                 iint->flags &= ~IMA_DONE_MASK;
> >                 iint->measured_pcrs = 0;
> > --
> > 2.35.3
> >
> >
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants
  2024-08-08 14:35     ` Darrick J. Wong
@ 2024-08-08 14:50       ` Christian Brauner
  2024-08-08 17:34         ` Jan Kara
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2024-08-08 14:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Amir Goldstein, Jan Kara, linux-fsdevel, Dave Chinner

On Thu, Aug 08, 2024 at 07:35:05AM GMT, Darrick J. Wong wrote:
> On Thu, Aug 08, 2024 at 01:47:20PM +0200, Amir Goldstein wrote:
> > On Wed, Aug 7, 2024 at 8:31 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > Now that old constants are gone, remove the unnecessary underscore from
> > > the new _SB_I_ constants. Pure mechanical replacement, no functional
> > > changes.
> > >
> > 
> > This is a potential backporting bomb.
> > It is true that code using the old constant names with new macros
> > will not build on stable kernels, but I think this is still asking for trouble.
> > 
> > Also, it is a bit strange that SB_* flags are bit masks and SB_I_*
> > flags are bit numbers.
> > How about leaving the underscore and using  sb_*_iflag() macros to add
> > the underscore?
> 
> Or append _BIT to the new names, as is sometimes done elsewhere in the
> kernel?
> 
> #define SB_I_VERSION_BIT	23

Yeah, that's better (Fwiw, SB_I_VERSION is confusingly not an
sb->i_flags. I complained about this when it was added.).

I don't want to end up with the same confusion that we have for
__I_NEW/I_NEW and __I_SYNC/I_SYNC which trips me up every so often when
I read that code.

So it probably wouldn't be the worst if we had:

#define SB_I_NODEV_BIT 3
#define SB_I_NODEV BIT(SB_I_NODEV_BIT)

so filesystems that raise that flag when they're initialized can do:

sb->i_flags |= SB_I_NODEV;

and not pointlessly make them do:

sb->i_flags |= 1 << SB_I_NODEV_BIT;
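
For illustration only, the pairing proposed above can be sketched in
userspace C. Nothing below is kernel code: BIT() is a local stand-in for the
kernel macro, struct mock_sb and the mock_* functions are hypothetical, and
the bit number 3 is simply copied from the example above (the enum in the
patch would place NODEV at a different position). The field is spelled
s_iflags here, following the "sb->s_iflags" comment in include/linux/fs.h.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/* Bit number and mask defined side by side, as proposed above.
 * The value 3 is taken from the example and is an assumption. */
#define SB_I_NODEV_BIT 3
#define SB_I_NODEV     BIT(SB_I_NODEV_BIT)

/* Hypothetical mock holding only the field this sketch needs. */
struct mock_sb {
	unsigned long s_iflags;
};

/* A filesystem raising the flag at init time can use the mask directly. */
static void mock_fill_super(struct mock_sb *sb)
{
	sb->s_iflags |= SB_I_NODEV;
}

/* Testing the flag also uses the mask, with no shift at the call site. */
static int mock_sb_nodev(const struct mock_sb *sb)
{
	return (sb->s_iflags & SB_I_NODEV) != 0;
}
```

With both names available, code that needs a bit number (for test_bit()-style
helpers) uses the _BIT name, and code that needs a mask uses the plain name.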

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants
  2024-08-08 14:50       ` Christian Brauner
@ 2024-08-08 17:34         ` Jan Kara
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-08 17:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Darrick J. Wong, Amir Goldstein, Jan Kara, linux-fsdevel,
	Dave Chinner

On Thu 08-08-24 16:50:42, Christian Brauner wrote:
> On Thu, Aug 08, 2024 at 07:35:05AM GMT, Darrick J. Wong wrote:
> > On Thu, Aug 08, 2024 at 01:47:20PM +0200, Amir Goldstein wrote:
> > > On Wed, Aug 7, 2024 at 8:31 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > Now that old constants are gone, remove the unnecessary underscore from
> > > > the new _SB_I_ constants. Pure mechanical replacement, no functional
> > > > changes.
> > > >
> > > 
> > > This is a potential backporting bomb.
> > > It is true that code using the old constant names with new macros
> > > will not build on stable kernels, but I think this is still asking for trouble.
> > > 
> > > Also, it is a bit strange that SB_* flags are bit masks and SB_I_*
> > > flags are bit numbers.
> > > How about leaving the underscore and using  sb_*_iflag() macros to add
> > > the underscore?
> > 
> > Or append _BIT to the new names, as is sometimes done elsewhere in the
> > kernel?
> > 
> > #define SB_I_VERSION_BIT	23
> 
> Yeah, that's better (Fwiw, SB_I_VERSION is confusingly not an
> sb->i_flags. I complained about this when it was added.).
> 
> I don't want to end up with the same confusion that we have for
> __I_NEW/I_NEW and __I_SYNC/I_SYNC which trips me up every so often when
> I read that code.

OK, _BIT suffix sounds nice.

> So it probably wouldn't be the worst if we had:
> 
> #define SB_I_NODEV_BIT 3
> #define SB_I_NODEV BIT(SB_I_NODEV_BIT)
> 
> so filesystems that raise that flag when they're initialized can do:
> 
> sb->i_flags |= SB_I_NODEV;
> 
> and not pointlessly make them do:
> 
> sb->i_flags |= 1 << SB_I_NODEV_BIT;

Well, all sb->i_flags modifications should be using sb_set_iflag() /
sb_clear_iflag(). I know it is unnecessarily more expensive in some cases
but none of those paths is really that performance sensitive. The only
(three) places where we have an expression like 1 << SB_I_<foo>_BIT are there
because the flags are also used for fc->s_iflags.
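
As a rough userspace illustration of why helpers taking a bit number are
safer than plain stores: the bodies of sb_set_iflag() / sb_clear_iflag() /
sb_test_iflag() are not shown in this part of the thread, so the sketch below
only assumes they behave like atomic bit operations. All mock_* names and the
MOCK_* bit numbers are hypothetical, and C11 atomics stand in for the
kernel's set_bit() / clear_bit() / test_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mock of a superblock with an atomically updated flag word. */
struct mock_sb {
	atomic_ulong s_iflags;
};

/* Assumed bit numbers, for illustration only. */
enum {
	MOCK_SB_I_NOEXEC_BIT = 1,
	MOCK_SB_I_NODEV_BIT = 2,
};

/* Atomically set one flag; concurrent updates of other bits are not lost. */
static void mock_sb_set_iflag(struct mock_sb *sb, unsigned int bit)
{
	atomic_fetch_or(&sb->s_iflags, 1UL << bit);
}

/* Atomically clear one flag. */
static void mock_sb_clear_iflag(struct mock_sb *sb, unsigned int bit)
{
	atomic_fetch_and(&sb->s_iflags, ~(1UL << bit));
}

/* Read the flag word once and test a single bit. */
static bool mock_sb_test_iflag(struct mock_sb *sb, unsigned int bit)
{
	return (atomic_load(&sb->s_iflags) & (1UL << bit)) != 0;
}
```

A plain "flags |= val" is a separate load and store, so two CPUs updating
different bits at the same time can overwrite each other's change. The
read-modify-write helpers avoid that at the cost of an atomic operation.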

I think that keeping SB_I_NODEV around together with SB_I_NODEV_BIT makes
it easier to write code like sb->i_flags |= val without thinking twice and
the three callsites that would be simplified are not really worth it. But
if someone feels strongly about this, I can live with it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 07/13] overlayfs: Make ovl_start_write() return error
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (5 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-08 12:01   ` Amir Goldstein
  2024-08-07 18:29 ` [PATCH 08/13] fs: Teach callers of kiocb_start_write() to handle errors Jan Kara
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

sb_start_write() will be returning an error for a shutdown filesystem.
Teach all callers of ovl_start_write() to handle the error and bail out.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/overlayfs/copy_up.c   | 42 +++++++++++++++++++++++++++++++---------
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/util.c      |  3 ++-
 3 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index a5ef2005a2cc..6ebfd9c7b260 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -584,7 +584,9 @@ static int ovl_link_up(struct ovl_copy_up_ctx *c)
 	struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
 	struct inode *udir = d_inode(upperdir);
 
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err)
+		return err;
 
 	/* Mark parent "impure" because it may now contain non-pure upper */
 	err = ovl_set_impure(c->parent, upperdir);
@@ -744,6 +746,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	struct path path = { .mnt = ovl_upper_mnt(ofs) };
 	struct dentry *temp, *upper, *trap;
 	struct ovl_cu_creds cc;
+	bool frozen = false;
 	int err;
 	struct ovl_cattr cattr = {
 		/* Can't properly set mode on creation because of the umask */
@@ -756,7 +759,11 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	if (err)
 		return err;
 
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err) {
+		ovl_revert_cu_creds(&cc);
+		return err;
+	}
 	inode_lock(wdir);
 	temp = ovl_create_temp(ofs, c->workdir, &cattr);
 	inode_unlock(wdir);
@@ -778,7 +785,10 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 	 * ovl_copy_up_data(), so lock workdir and destdir and make sure that
 	 * temp wasn't moved before copy up completion or cleanup.
 	 */
-	ovl_start_write(c->dentry);
+	if (!err) {
+		err = ovl_start_write(c->dentry);
+		frozen = !err;
+	}
 	trap = lock_rename(c->workdir, c->destdir);
 	if (trap || temp->d_parent != c->workdir) {
 		/* temp or workdir moved underneath us? abort without cleanup */
@@ -827,7 +837,8 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
 unlock:
 	unlock_rename(c->workdir, c->destdir);
 out:
-	ovl_end_write(c->dentry);
+	if (frozen)
+		ovl_end_write(c->dentry);
 
 	return err;
 
@@ -851,7 +862,11 @@ static int ovl_copy_up_tmpfile(struct ovl_copy_up_ctx *c)
 	if (err)
 		return err;
 
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err) {
+		ovl_revert_cu_creds(&cc);
+		return err;
+	}
 	tmpfile = ovl_do_tmpfile(ofs, c->workdir, c->stat.mode);
 	ovl_end_write(c->dentry);
 	ovl_revert_cu_creds(&cc);
@@ -865,7 +880,9 @@ static int ovl_copy_up_tmpfile(struct ovl_copy_up_ctx *c)
 			goto out_fput;
 	}
 
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err)
+		goto out_fput;
 
 	err = ovl_copy_up_metadata(c, temp);
 	if (err)
@@ -964,7 +981,9 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 		 * Mark parent "impure" because it may now contain non-pure
 		 * upper
 		 */
-		ovl_start_write(c->dentry);
+		err = ovl_start_write(c->dentry);
+		if (err)
+			goto out_free_fh;
 		err = ovl_set_impure(c->parent, c->destdir);
 		ovl_end_write(c->dentry);
 		if (err)
@@ -982,7 +1001,9 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 	if (c->indexed)
 		ovl_set_flag(OVL_INDEX, d_inode(c->dentry));
 
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err)
+		goto out;
 	if (to_index) {
 		/* Initialize nlink for copy up of disconnected dentry */
 		err = ovl_set_nlink_upper(c->dentry);
@@ -1088,7 +1109,10 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
 	 * Writing to upper file will clear security.capability xattr. We
 	 * don't want that to happen for normal copy-up operation.
 	 */
-	ovl_start_write(c->dentry);
+	err = ovl_start_write(c->dentry);
+	if (err)
+		goto out_free;
+
 	if (capability) {
 		err = ovl_do_setxattr(ofs, upperpath.dentry, XATTR_NAME_CAPS,
 				      capability, cap_size, 0);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0bfe35da4b7b..ee8f2b28159a 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -423,7 +423,7 @@ static inline int ovl_do_getattr(const struct path *path, struct kstat *stat,
 /* util.c */
 int ovl_get_write_access(struct dentry *dentry);
 void ovl_put_write_access(struct dentry *dentry);
-void ovl_start_write(struct dentry *dentry);
+int __must_check ovl_start_write(struct dentry *dentry);
 void ovl_end_write(struct dentry *dentry);
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index edc9216f6e27..b53fa14506a9 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -25,10 +25,11 @@ int ovl_get_write_access(struct dentry *dentry)
 }
 
 /* Get write access to upper sb - may block if upper sb is frozen */
-void ovl_start_write(struct dentry *dentry)
+int __must_check ovl_start_write(struct dentry *dentry)
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
 	sb_start_write(ovl_upper_mnt(ofs)->mnt_sb);
+	return 0;
 }
 
 int ovl_want_write(struct dentry *dentry)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 07/13] overlayfs: Make ovl_start_write() return error
  2024-08-07 18:29 ` [PATCH 07/13] overlayfs: Make ovl_start_write() return error Jan Kara
@ 2024-08-08 12:01   ` Amir Goldstein
  0 siblings, 0 replies; 25+ messages in thread
From: Amir Goldstein @ 2024-08-08 12:01 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Dave Chinner, Christian Brauner

On Wed, Aug 7, 2024 at 8:30 PM Jan Kara <jack@suse.cz> wrote:
>
> sb_start_write() will be returning an error for a shutdown filesystem.
> Teach all callers of ovl_start_write() to handle the error and bail out.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/overlayfs/copy_up.c   | 42 +++++++++++++++++++++++++++++++---------
>  fs/overlayfs/overlayfs.h |  2 +-
>  fs/overlayfs/util.c      |  3 ++-
>  3 files changed, 36 insertions(+), 11 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index a5ef2005a2cc..6ebfd9c7b260 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -584,7 +584,9 @@ static int ovl_link_up(struct ovl_copy_up_ctx *c)
>         struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
>         struct inode *udir = d_inode(upperdir);
>
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err)
> +               return err;
>
>         /* Mark parent "impure" because it may now contain non-pure upper */
>         err = ovl_set_impure(c->parent, upperdir);
> @@ -744,6 +746,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         struct path path = { .mnt = ovl_upper_mnt(ofs) };
>         struct dentry *temp, *upper, *trap;
>         struct ovl_cu_creds cc;
> +       bool frozen = false;

frozen is not a descriptive name for taking sb_writers?

>         int err;
>         struct ovl_cattr cattr = {
>                 /* Can't properly set mode on creation because of the umask */
> @@ -756,7 +759,11 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>         if (err)
>                 return err;
>
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err) {
> +               ovl_revert_cu_creds(&cc);
> +               return err;
> +       }
>         inode_lock(wdir);
>         temp = ovl_create_temp(ofs, c->workdir, &cattr);
>         inode_unlock(wdir);
> @@ -778,7 +785,10 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>          * ovl_copy_up_data(), so lock workdir and destdir and make sure that
>          * temp wasn't moved before copy up completion or cleanup.
>          */
> -       ovl_start_write(c->dentry);
> +       if (!err) {

This is confusing, I admit, but !err is not checked because ovl_cleanup()
needs sb_writers held.

Suggest something like:

err2 = ovl_start_write(c->dentry);
if (err2) {
     dput(temp);
     return err ?: err2;
}
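
To spell out the error selection in that suggestion: "err ?: err2" is the GNU
C shorthand for "err ? err : err2", so an earlier failure takes precedence
and the new failure is reported only if nothing went wrong before. A minimal
userspace sketch follows. pick_error() is a hypothetical helper written with
the portable form of the operator, and the errno values in the usage example
are arbitrary.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: choose which error to return when a later step
 * (err2) failed after an earlier step (err) may already have failed.
 * Kernel code usually writes this as "err ?: err2".
 */
static int pick_error(int err, int err2)
{
	return err ? err : err2;
}
```

For example, pick_error(-EIO, -EROFS) yields -EIO, while
pick_error(0, -EROFS) yields -EROFS.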

> +               err = ovl_start_write(c->dentry);
> +               frozen = !err;
> +       }
>         trap = lock_rename(c->workdir, c->destdir);
>         if (trap || temp->d_parent != c->workdir) {
>                 /* temp or workdir moved underneath us? abort without cleanup */
> @@ -827,7 +837,8 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c)
>  unlock:
>         unlock_rename(c->workdir, c->destdir);
>  out:
> -       ovl_end_write(c->dentry);
> +       if (frozen)
> +               ovl_end_write(c->dentry);
>
>         return err;
>
> @@ -851,7 +862,11 @@ static int ovl_copy_up_tmpfile(struct ovl_copy_up_ctx *c)
>         if (err)
>                 return err;
>
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err) {
> +               ovl_revert_cu_creds(&cc);
> +               return err;
> +       }
>         tmpfile = ovl_do_tmpfile(ofs, c->workdir, c->stat.mode);
>         ovl_end_write(c->dentry);
>         ovl_revert_cu_creds(&cc);
> @@ -865,7 +880,9 @@ static int ovl_copy_up_tmpfile(struct ovl_copy_up_ctx *c)
>                         goto out_fput;
>         }
>
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err)
> +               goto out_fput;
>
>         err = ovl_copy_up_metadata(c, temp);
>         if (err)
> @@ -964,7 +981,9 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
>                  * Mark parent "impure" because it may now contain non-pure
>                  * upper
>                  */
> -               ovl_start_write(c->dentry);
> +               err = ovl_start_write(c->dentry);
> +               if (err)
> +                       goto out_free_fh;
>                 err = ovl_set_impure(c->parent, c->destdir);
>                 ovl_end_write(c->dentry);
>                 if (err)
> @@ -982,7 +1001,9 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
>         if (c->indexed)
>                 ovl_set_flag(OVL_INDEX, d_inode(c->dentry));
>
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err)
> +               goto out;
>         if (to_index) {
>                 /* Initialize nlink for copy up of disconnected dentry */
>                 err = ovl_set_nlink_upper(c->dentry);
> @@ -1088,7 +1109,10 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
>          * Writing to upper file will clear security.capability xattr. We
>          * don't want that to happen for normal copy-up operation.
>          */
> -       ovl_start_write(c->dentry);
> +       err = ovl_start_write(c->dentry);
> +       if (err)
> +               goto out_free;
> +
>         if (capability) {
>                 err = ovl_do_setxattr(ofs, upperpath.dentry, XATTR_NAME_CAPS,
>                                       capability, cap_size, 0);
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 0bfe35da4b7b..ee8f2b28159a 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -423,7 +423,7 @@ static inline int ovl_do_getattr(const struct path *path, struct kstat *stat,
>  /* util.c */
>  int ovl_get_write_access(struct dentry *dentry);
>  void ovl_put_write_access(struct dentry *dentry);
> -void ovl_start_write(struct dentry *dentry);
> +int __must_check ovl_start_write(struct dentry *dentry);
>  void ovl_end_write(struct dentry *dentry);
>  int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index edc9216f6e27..b53fa14506a9 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -25,10 +25,11 @@ int ovl_get_write_access(struct dentry *dentry)
>  }
>
>  /* Get write access to upper sb - may block if upper sb is frozen */
> -void ovl_start_write(struct dentry *dentry)
> +int __must_check ovl_start_write(struct dentry *dentry)
>  {
>         struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>         sb_start_write(ovl_upper_mnt(ofs)->mnt_sb);
> +       return 0;
>  }

Is this an unintentional omission of sb_start_write() return value or
fixed later on?
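
For reference, this is roughly the shape I would expect once sb_start_write()
can fail. It is a userspace mock and not the actual later patch: the mock_*
names are hypothetical, and -EIO is only a placeholder for whatever error the
series ends up returning on a shutdown filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mock of the upper superblock state this sketch needs. */
struct mock_sb {
	bool shutdown;	/* filesystem has been shut down */
	int writers;	/* number of active write sections */
};

/* Assumed behaviour: refuse new write sections after shutdown. */
static int mock_sb_start_write(struct mock_sb *sb)
{
	if (sb->shutdown)
		return -EIO;	/* placeholder error code */
	sb->writers++;
	return 0;
}

static void mock_sb_end_write(struct mock_sb *sb)
{
	sb->writers--;
}

/* The wrapper passes the result through instead of returning 0. */
static int mock_ovl_start_write(struct mock_sb *upper_sb)
{
	return mock_sb_start_write(upper_sb);
}
```

The point is only that the wrapper must not swallow the result. On failure no
write section was entered, so the caller must not call the matching
end-write function.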

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 08/13] fs: Teach callers of kiocb_start_write() to handle errors
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (6 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 07/13] overlayfs: Make ovl_start_write() return error Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 09/13] fs: Teach callers of file_start_write() " Jan Kara
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

sb_start_write() will be returning an error on a shutdown filesystem. Teach
callers of kiocb_start_write() to handle the error.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/aio.c           | 14 +++++++++-----
 fs/read_write.c    |  4 +++-
 include/linux/fs.h |  3 ++-
 io_uring/rw.c      |  7 +++++--
 4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 48d99221ff57..3b94081118bc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1626,12 +1626,16 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
 	if (ret < 0)
 		return ret;
 	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
-	if (!ret) {
-		if (S_ISREG(file_inode(file)->i_mode))
-			kiocb_start_write(req);
-		req->ki_flags |= IOCB_WRITE;
-		aio_rw_done(req, file->f_op->write_iter(req, &iter));
+	if (unlikely(ret))
+		goto out;
+	if (S_ISREG(file_inode(file)->i_mode)) {
+		ret = kiocb_start_write(req);
+		if (unlikely(ret))
+			goto out;
 	}
+	req->ki_flags |= IOCB_WRITE;
+	aio_rw_done(req, file->f_op->write_iter(req, &iter));
+out:
 	kfree(iovec);
 	return ret;
 }
diff --git a/fs/read_write.c b/fs/read_write.c
index 90e283b31ca1..12996892bb1d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -859,7 +859,9 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
 	if (ret < 0)
 		return ret;
 
-	kiocb_start_write(iocb);
+	ret = kiocb_start_write(iocb);
+	if (ret < 0)
+		return ret;
 	ret = file->f_op->write_iter(iocb, iter);
 	if (ret != -EIOCBQUEUED)
 		kiocb_end_write(iocb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 52841aab13fb..3ac37d9884f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2920,7 +2920,7 @@ static inline void file_end_write(struct file *file)
  * This is a variant of sb_start_write() for async io submission.
  * Should be matched with a call to kiocb_end_write().
  */
-static inline void kiocb_start_write(struct kiocb *iocb)
+static inline int __must_check kiocb_start_write(struct kiocb *iocb)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 
@@ -2930,6 +2930,7 @@ static inline void kiocb_start_write(struct kiocb *iocb)
 	 * doesn't complain about the held lock when we return to userspace.
 	 */
 	__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);
+	return 0;
 }
 
 /**
diff --git a/io_uring/rw.c b/io_uring/rw.c
index c004d21e2f12..a9dc9f84fb60 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -1040,8 +1040,11 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		return ret;
 
-	if (req->flags & REQ_F_ISREG)
-		kiocb_start_write(kiocb);
+	if (req->flags & REQ_F_ISREG) {
+		ret = kiocb_start_write(kiocb);
+		if (unlikely(ret))
+			return ret;
+	}
 	kiocb->ki_flags |= IOCB_WRITE;
 
 	if (likely(req->file->f_op->write_iter))
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/13] fs: Teach callers of file_start_write() to handle errors
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (7 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 08/13] fs: Teach callers of kiocb_start_write() to handle errors Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 10/13] fs: Add __must_check annotations to sb_start_write_trylock() and similar Jan Kara
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

sb_start_write() will be returning an error on a shutdown filesystem. Teach
callers of file_start_write() to handle the error.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/backing-file.c      |  4 +++-
 fs/btrfs/ioctl.c       |  4 +++-
 fs/coredump.c          |  4 +++-
 fs/open.c              |  4 +++-
 fs/read_write.c        | 20 +++++++++++++++-----
 fs/remap_range.c       |  4 +++-
 fs/splice.c            |  8 ++++++--
 fs/xfs/xfs_exchrange.c |  4 +++-
 include/linux/fs.h     |  5 +++--
 9 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/fs/backing-file.c b/fs/backing-file.c
index afb557446c27..3df3fb48cb42 100644
--- a/fs/backing-file.c
+++ b/fs/backing-file.c
@@ -308,7 +308,9 @@ ssize_t backing_file_splice_write(struct pipe_inode_info *pipe,
 		return ret;
 
 	old_cred = override_creds(ctx->cred);
-	file_start_write(out);
+	ret = file_start_write(out);
+	if (ret)
+		return ret;
 	ret = iter_file_splice_write(pipe, out, ppos, len, flags);
 	file_end_write(out);
 	revert_creds(old_cred);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e0a664b8a46a..4cadba17a5a3 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4676,7 +4676,9 @@ static int btrfs_ioctl_encoded_write(struct file *file, void __user *argp, bool
 		goto out_iov;
 	kiocb.ki_pos = pos;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret < 0)
+		goto out_iov;
 
 	ret = btrfs_do_write_iter(&kiocb, &iter, &args);
 	if (ret > 0)
diff --git a/fs/coredump.c b/fs/coredump.c
index 7f12ff6ad1d3..aa24964dc1e4 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -763,7 +763,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
 		if (!dump_vma_snapshot(&cprm))
 			goto close_fail;
 
-		file_start_write(cprm.file);
+		retval = file_start_write(cprm.file);
+		if (retval)
+			goto close_fail;
 		core_dumped = binfmt->core_dump(&cprm);
 		/*
 		 * Ensures that file size is big enough to contain the current
diff --git a/fs/open.c b/fs/open.c
index 22adbef7ecc2..4bce4ba776ab 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -330,7 +330,9 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	if (!file->f_op->fallocate)
 		return -EOPNOTSUPP;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret)
+		return ret;
 	ret = file->f_op->fallocate(file, mode, offset, len);
 
 	/*
diff --git a/fs/read_write.c b/fs/read_write.c
index 12996892bb1d..4d2831891e84 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -560,7 +560,9 @@ ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 	if (ret)
 		return ret;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret)
+		return ret;
 	ret =  __kernel_write(file, buf, count, pos);
 	file_end_write(file);
 	return ret;
@@ -583,7 +585,9 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		return ret;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret)
+		return ret;
 	if (file->f_op->write)
 		ret = file->f_op->write(file, buf, count, pos);
 	else if (file->f_op->write_iter)
@@ -893,7 +897,9 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 	if (ret < 0)
 		return ret;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret)
+		return ret;
 	ret = do_iter_readv_writev(file, iter, ppos, WRITE, flags);
 	if (ret > 0)
 		fsnotify_modify(file);
@@ -968,7 +974,9 @@ static ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
 	if (ret < 0)
 		goto out;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret < 0)
+		goto out;
 	if (file->f_op->write_iter)
 		ret = do_iter_readv_writev(file, &iter, pos, WRITE, flags);
 	else
@@ -1509,7 +1517,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	if (len == 0)
 		return 0;
 
-	file_start_write(file_out);
+	ret = file_start_write(file_out);
+	if (unlikely(ret))
+		return ret;
 
 	/*
 	 * Cloning is supported by more file systems, so we implement copy on
diff --git a/fs/remap_range.c b/fs/remap_range.c
index 28246dfc8485..d1aa23c16f15 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -399,7 +399,9 @@ loff_t vfs_clone_file_range(struct file *file_in, loff_t pos_in,
 	if (ret)
 		return ret;
 
-	file_start_write(file_out);
+	ret = file_start_write(file_out);
+	if (ret)
+		return ret;
 	ret = file_in->f_op->remap_file_range(file_in, pos_in,
 			file_out, pos_out, len, remap_flags);
 	file_end_write(file_out);
diff --git a/fs/splice.c b/fs/splice.c
index 60aed8de21f8..5d66a4dc93fc 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1160,7 +1160,9 @@ static int direct_splice_actor(struct pipe_inode_info *pipe,
 	struct file *file = sd->u.file;
 	long ret;
 
-	file_start_write(file);
+	ret = file_start_write(file);
+	if (ret)
+		return ret;
 	ret = do_splice_from(pipe, file, sd->opos, sd->total_len, sd->flags);
 	file_end_write(file);
 	return ret;
@@ -1350,7 +1352,9 @@ ssize_t do_splice(struct file *in, loff_t *off_in, struct file *out,
 		if (in->f_flags & O_NONBLOCK)
 			flags |= SPLICE_F_NONBLOCK;
 
-		file_start_write(out);
+		ret = file_start_write(out);
+		if (unlikely(ret))
+			return ret;
 		ret = do_splice_from(ipipe, out, &offset, len, flags);
 		file_end_write(out);
 
diff --git a/fs/xfs/xfs_exchrange.c b/fs/xfs/xfs_exchrange.c
index c8a655c92c92..e4840cbbe276 100644
--- a/fs/xfs/xfs_exchrange.c
+++ b/fs/xfs/xfs_exchrange.c
@@ -756,7 +756,9 @@ xfs_exchange_range(
 	if (!(fxr->file2->f_mode & FMODE_NOCMTIME) && !IS_NOCMTIME(inode2))
 		fxr->flags |= __XFS_EXCHANGE_RANGE_UPD_CMTIME2;
 
-	file_start_write(fxr->file2);
+	ret = file_start_write(fxr->file2);
+	if (ret)
+		return ret;
 	ret = xfs_exchrange_contents(fxr);
 	file_end_write(fxr->file2);
 	if (ret)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3ac37d9884f5..952f11170296 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2886,11 +2886,12 @@ static inline bool inode_wrong_type(const struct inode *inode, umode_t mode)
  * This is a variant of sb_start_write() which is a noop on non-regualr file.
  * Should be matched with a call to file_end_write().
  */
-static inline void file_start_write(struct file *file)
+static inline int __must_check file_start_write(struct file *file)
 {
 	if (!S_ISREG(file_inode(file)->i_mode))
-		return;
+		return 0;
 	sb_start_write(file_inode(file)->i_sb);
+	return 0;
 }
 
 static inline bool file_start_write_trylock(struct file *file)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/13] fs: Add __must_check annotations to sb_start_write_trylock() and similar
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (8 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 09/13] fs: Teach callers of file_start_write() " Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 11/13] fs: Make sb_start_write() return error on shutdown filesystem Jan Kara
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

The callers of sb_start_write_trylock(), sb_start_intwrite_trylock(),
file_start_write_trylock() must check the return value to figure out
whether the protection has been acquired (and thus must be dropped) or
not. Add __must_check annotation to these functions to catch forgotten
checks early.

Signed-off-by: Jan Kara <jack@suse.cz>
---
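A minimal userspace sketch of the pattern this annotation enforces, in
case it helps review. All toy_* names are hypothetical and only model
the kernel helpers; the attribute is the GCC/Clang one that the kernel's
__must_check expands to. The point is that the caller drops the
protection only on the path where the trylock succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's __must_check. */
#define toy_must_check __attribute__((warn_unused_result))

/* Toy model of a superblock; not the kernel's struct super_block. */
struct toy_sb {
	bool frozen;	/* models a frozen filesystem */
	int writers;	/* models held freeze protection */
};

/* Returns true only if protection was acquired. */
static toy_must_check bool toy_start_write_trylock(struct toy_sb *sb)
{
	if (sb->frozen)
		return false;
	sb->writers++;
	return true;
}

static void toy_end_write(struct toy_sb *sb)
{
	assert(sb->writers > 0);	/* must pair with a successful start */
	sb->writers--;
}

/* Caller pattern: end_write only when the trylock succeeded. */
static bool toy_try_update(struct toy_sb *sb, int *counter)
{
	if (!toy_start_write_trylock(sb))
		return false;
	(*counter)++;
	toy_end_write(sb);
	return true;
}
```

Ignoring the return value of toy_start_write_trylock() would produce a
compiler warning, which is the class of bug the annotation is meant to
catch early.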
 include/linux/fs.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 952f11170296..755a4c83a2bf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1828,7 +1828,7 @@ static inline void sb_start_write(struct super_block *sb)
 	__sb_start_write(sb, SB_FREEZE_WRITE);
 }
 
-static inline bool sb_start_write_trylock(struct super_block *sb)
+static inline bool __must_check sb_start_write_trylock(struct super_block *sb)
 {
 	return __sb_start_write_trylock(sb, SB_FREEZE_WRITE);
 }
@@ -1875,7 +1875,8 @@ static inline void sb_start_intwrite(struct super_block *sb)
 	__sb_start_write(sb, SB_FREEZE_FS);
 }
 
-static inline bool sb_start_intwrite_trylock(struct super_block *sb)
+static inline bool __must_check sb_start_intwrite_trylock(
+						struct super_block *sb)
 {
 	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
 }
@@ -2894,7 +2895,7 @@ static inline int __must_check file_start_write(struct file *file)
 	return 0;
 }
 
-static inline bool file_start_write_trylock(struct file *file)
+static inline bool __must_check file_start_write_trylock(struct file *file)
 {
 	if (!S_ISREG(file_inode(file)->i_mode))
 		return true;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/13] fs: Make sb_start_write() return error on shutdown filesystem
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (9 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 10/13] fs: Add __must_check annotations to sb_start_write_trylock() and similar Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 12/13] fs: Make sb_start_pagefault() " Jan Kara
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Introduce a new SB_I_SHUTDOWN flag that a filesystem can set when it is
forcefully shutting down (usually due to errors). Make sb_start_write()
return an error for such superblocks to avoid modifications to them, which
reduces noise in the error logs and generally makes life somewhat easier
for filesystems. We teach all sb_start_write() callers to handle the
error.

Signed-off-by: Jan Kara <jack@suse.cz>
---
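For illustration only, a userspace sketch of the caller conversion this
patch applies throughout. The toy_* names are hypothetical models of
sb_start_write() / sb_end_write(), and the "shutdown" field stands in
for SB_I_SHUTDOWN. A failed start means no protection is held, so the
caller returns without calling the end helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model of a superblock; not the kernel's struct super_block. */
struct toy_sb {
	bool shutdown;	/* models SB_I_SHUTDOWN */
	int writers;	/* models held freeze protection */
};

/* Models the new sb_start_write(): fail early on a shutdown fs. */
static int toy_sb_start_write(struct toy_sb *sb)
{
	if (sb->shutdown)
		return -EROFS;
	sb->writers++;
	return 0;
}

static void toy_sb_end_write(struct toy_sb *sb)
{
	assert(sb->writers > 0);	/* must pair with a successful start */
	sb->writers--;
}

/* Converted caller: propagate the error, skip end_write on failure. */
static int toy_modify(struct toy_sb *sb, int *data, int value)
{
	int ret;

	ret = toy_sb_start_write(sb);
	if (ret)
		return ret;
	*data = value;
	toy_sb_end_write(sb);
	return 0;
}
```
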
 fs/btrfs/block-group.c |  3 ++-
 fs/btrfs/defrag.c      |  6 +++++-
 fs/btrfs/volumes.c     | 13 +++++++++----
 fs/ext4/mmp.c          |  4 +++-
 fs/namespace.c         |  8 ++++++--
 fs/open.c              |  4 +++-
 fs/overlayfs/util.c    |  3 +--
 fs/quota/quota.c       |  4 ++--
 fs/xfs/xfs_ioctl.c     |  4 +++-
 include/linux/fs.h     | 16 ++++++++++++----
 10 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 498442d0c216..fdd833f1f7df 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1800,7 +1800,8 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
 	if (!btrfs_should_reclaim(fs_info))
 		return;
 
-	sb_start_write(fs_info->sb);
+	if (sb_start_write(fs_info->sb) < 0)
+		return;
 
 	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
 		sb_end_write(fs_info->sb);
diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
index f6dbda37a361..6d14c9be4060 100644
--- a/fs/btrfs/defrag.c
+++ b/fs/btrfs/defrag.c
@@ -274,7 +274,11 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
 	range.start = cur;
 	range.extent_thresh = defrag->extent_thresh;
 
-	sb_start_write(fs_info->sb);
+	ret = sb_start_write(fs_info->sb);
+	if (ret < 0) {
+		iput(inode);
+		goto cleanup;
+	}
 	ret = btrfs_defrag_file(inode, NULL, &range, defrag->transid,
 				       BTRFS_DEFRAG_BATCH);
 	sb_end_write(fs_info->sb);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fcedc43ef291..f7b6b307c4bf 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4589,9 +4589,11 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
 static int balance_kthread(void *data)
 {
 	struct btrfs_fs_info *fs_info = data;
-	int ret = 0;
+	int ret;
 
-	sb_start_write(fs_info->sb);
+	ret = sb_start_write(fs_info->sb);
+	if (ret < 0)
+		return ret;
 	mutex_lock(&fs_info->balance_mutex);
 	if (fs_info->balance_ctl)
 		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
@@ -8231,12 +8233,15 @@ static int relocating_repair_kthread(void *data)
 	struct btrfs_block_group *cache = data;
 	struct btrfs_fs_info *fs_info = cache->fs_info;
 	u64 target;
-	int ret = 0;
+	int ret;
 
 	target = cache->start;
 	btrfs_put_block_group(cache);
 
-	sb_start_write(fs_info->sb);
+	ret = sb_start_write(fs_info->sb);
+	if (ret < 0)
+		return ret;
+
 	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
 		btrfs_info(fs_info,
 			   "zoned: skip relocating block group %llu to repair: EBUSY",
diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index bd946d0c71b7..96f69b6835f9 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -63,7 +63,9 @@ static int write_mmp_block(struct super_block *sb, struct buffer_head *bh)
 	 * We protect against freezing so that we don't create dirty buffers
 	 * on frozen filesystem.
 	 */
-	sb_start_write(sb);
+	err = sb_start_write(sb);
+	if (err < 0)
+		return err;
 	err = write_mmp_block_thawed(sb, bh);
 	sb_end_write(sb);
 	return err;
diff --git a/fs/namespace.c b/fs/namespace.c
index 1c5591673f96..43fad685531e 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -512,7 +512,9 @@ int mnt_want_write(struct vfsmount *m)
 {
 	int ret;
 
-	sb_start_write(m->mnt_sb);
+	ret = sb_start_write(m->mnt_sb);
+	if (ret)
+		return ret;
 	ret = mnt_get_write_access(m);
 	if (ret)
 		sb_end_write(m->mnt_sb);
@@ -556,7 +558,9 @@ int mnt_want_write_file(struct file *file)
 {
 	int ret;
 
-	sb_start_write(file_inode(file)->i_sb);
+	ret = sb_start_write(file_inode(file)->i_sb);
+	if (ret)
+		return ret;
 	ret = mnt_get_write_access_file(file);
 	if (ret)
 		sb_end_write(file_inode(file)->i_sb);
diff --git a/fs/open.c b/fs/open.c
index 4bce4ba776ab..8fe9f4968969 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -175,7 +175,9 @@ long do_ftruncate(struct file *file, loff_t length, int small)
 	/* Check IS_APPEND on real upper inode */
 	if (IS_APPEND(file_inode(file)))
 		return -EPERM;
-	sb_start_write(inode->i_sb);
+	error = sb_start_write(inode->i_sb);
+	if (error)
+		return error;
 	error = security_file_truncate(file);
 	if (!error)
 		error = do_truncate(file_mnt_idmap(file), dentry, length,
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index b53fa14506a9..f97bf2458c66 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -28,8 +28,7 @@ int ovl_get_write_access(struct dentry *dentry)
 int __must_check ovl_start_write(struct dentry *dentry)
 {
 	struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
-	sb_start_write(ovl_upper_mnt(ofs)->mnt_sb);
-	return 0;
+	return sb_start_write(ovl_upper_mnt(ofs)->mnt_sb);
 }
 
 int ovl_want_write(struct dentry *dentry)
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 0e41fb84060f..df9c4d08f135 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -896,8 +896,8 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 		else
 			up_read(&sb->s_umount);
 		/* Wait for sb to unfreeze */
-		sb_start_write(sb);
-		sb_end_write(sb);
+		if (sb_start_write(sb) == 0)
+			sb_end_write(sb);
 		put_super(sb);
 		goto retry;
 	}
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 4e933db75b12..5cf9e568324a 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1499,7 +1499,9 @@ xfs_file_ioctl(
 
 		trace_xfs_ioc_free_eofblocks(mp, &icw, _RET_IP_);
 
-		sb_start_write(mp->m_super);
+		error = sb_start_write(mp->m_super);
+		if (error)
+			return error;
 		error = xfs_blockgc_free_space(mp, &icw);
 		sb_end_write(mp->m_super);
 		return error;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 755a4c83a2bf..44ae86f46b12 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1190,6 +1190,9 @@ enum {
 	SB_I_TS_EXPIRY_WARNED,	/* warned about timestamp range expiry */
 	SB_I_RETIRED,		/* superblock shouldn't be reused */
 	SB_I_NOUMASK,		/* VFS does not apply umask */
+	SB_I_SHUTDOWN,		/* The filesystem has shutdown. Refuse
+				 * modification attempts with error as they are
+				 * a futile exercise. */
 };
 
 /* Possible states of 'frozen' field */
@@ -1823,9 +1826,12 @@ static inline void sb_end_intwrite(struct super_block *sb)
  *   -> i_mutex			(write path, truncate, directory ops, ...)
  *   -> s_umount		(freeze_super, thaw_super)
  */
-static inline void sb_start_write(struct super_block *sb)
+static inline int __must_check sb_start_write(struct super_block *sb)
 {
+	if (sb_test_iflag(sb, SB_I_SHUTDOWN))
+		return -EROFS;
 	__sb_start_write(sb, SB_FREEZE_WRITE);
+	return 0;
 }
 
 static inline bool __must_check sb_start_write_trylock(struct super_block *sb)
@@ -2891,8 +2897,7 @@ static inline int __must_check file_start_write(struct file *file)
 {
 	if (!S_ISREG(file_inode(file)->i_mode))
 		return 0;
-	sb_start_write(file_inode(file)->i_sb);
-	return 0;
+	return sb_start_write(file_inode(file)->i_sb);
 }
 
 static inline bool __must_check file_start_write_trylock(struct file *file)
@@ -2925,8 +2930,11 @@ static inline void file_end_write(struct file *file)
 static inline int __must_check kiocb_start_write(struct kiocb *iocb)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
+	int err;
 
-	sb_start_write(inode->i_sb);
+	err = sb_start_write(inode->i_sb);
+	if (err)
+		return err;
 	/*
 	 * Fool lockdep by telling it the lock got released so that it
 	 * doesn't complain about the held lock when we return to userspace.
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 12/13] fs: Make sb_start_pagefault() return error on shutdown filesystem
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (10 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 11/13] fs: Make sb_start_write() return error on shutdown filesystem Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 18:29 ` [PATCH 13/13] ext4: Replace EXT4_FLAGS_SHUTDOWN flag with a generic SB_I_SHUTDOWN Jan Kara
  2024-08-07 23:18 ` [PATCH RFC 0/13] fs: generic filesystem shutdown handling Dave Chinner
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Similarly to sb_start_write(), make sb_start_pagefault() return an error
for superblocks which are marked as shutdown to avoid modifications to
them, which reduces noise in the error logs and generally makes life
somewhat easier for filesystems. We teach all sb_start_pagefault()
callers to handle the error.

Signed-off-by: Jan Kara <jack@suse.cz>
---
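A rough userspace sketch of how a ->page_mkwrite style handler ends up
reporting the failure. The toy_* names and fault codes are hypothetical
(the kernel's VM_FAULT_* values differ), and toy_fault_from_errno() is
only loosely modelled on vmf_error(). Most callers below return
VM_FAULT_SIGBUS directly; ceph and f2fs keep the errno and convert it on
their existing error paths, which this sketch imitates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical fault codes for this sketch only. */
enum toy_fault { TOY_FAULT_OK = 0, TOY_FAULT_OOM, TOY_FAULT_SIGBUS };

/* Toy model of a superblock; not the kernel's struct super_block. */
struct toy_sb {
	bool shutdown;	/* models SB_I_SHUTDOWN */
	int faulters;	/* models held pagefault freeze protection */
};

/* Models the new sb_start_pagefault(). */
static int toy_sb_start_pagefault(struct toy_sb *sb)
{
	if (sb->shutdown)
		return -EROFS;
	sb->faulters++;
	return 0;
}

static void toy_sb_end_pagefault(struct toy_sb *sb)
{
	assert(sb->faulters > 0);	/* must pair with a successful start */
	sb->faulters--;
}

/* Everything except out-of-memory is reported as SIGBUS. */
static enum toy_fault toy_fault_from_errno(int err)
{
	if (err == -ENOMEM)
		return TOY_FAULT_OOM;
	return TOY_FAULT_SIGBUS;
}

static enum toy_fault toy_page_mkwrite(struct toy_sb *sb)
{
	int err;

	err = toy_sb_start_pagefault(sb);
	if (err)
		return toy_fault_from_errno(err);
	/* ... the page would be dirtied here ... */
	toy_sb_end_pagefault(sb);
	return TOY_FAULT_OK;
}
```
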
 fs/bcachefs/fs-io-pagecache.c | 3 ++-
 fs/btrfs/file.c               | 3 ++-
 fs/ceph/addr.c                | 9 ++++++---
 fs/ext2/file.c                | 3 ++-
 fs/ext4/file.c                | 3 ++-
 fs/ext4/inode.c               | 3 ++-
 fs/f2fs/file.c                | 4 +++-
 fs/fuse/dax.c                 | 3 ++-
 fs/gfs2/file.c                | 3 ++-
 fs/netfs/buffered_write.c     | 3 ++-
 fs/nfs/file.c                 | 3 ++-
 fs/nilfs2/file.c              | 3 ++-
 fs/ocfs2/mmap.c               | 3 ++-
 fs/orangefs/inode.c           | 3 ++-
 fs/udf/file.c                 | 3 ++-
 fs/xfs/xfs_file.c             | 3 ++-
 fs/zonefs/file.c              | 3 ++-
 include/linux/fs.h            | 5 ++++-
 mm/filemap.c                  | 3 ++-
 19 files changed, 45 insertions(+), 21 deletions(-)

diff --git a/fs/bcachefs/fs-io-pagecache.c b/fs/bcachefs/fs-io-pagecache.c
index a9cc5cad9cc9..a4efa1b76035 100644
--- a/fs/bcachefs/fs-io-pagecache.c
+++ b/fs/bcachefs/fs-io-pagecache.c
@@ -611,7 +611,8 @@ vm_fault_t bch2_page_mkwrite(struct vm_fault *vmf)
 
 	bch2_folio_reservation_init(c, inode, &res);
 
-	sb_start_pagefault(inode->v.i_sb);
+	if (sb_start_pagefault(inode->v.i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	file_update_time(file);
 
 	/*
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 21381de906f6..481d355c66ee 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1900,7 +1900,8 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 
 	reserved_space = PAGE_SIZE;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	page_start = page_offset(page);
 	page_end = page_start + PAGE_SIZE - 1;
 	end = page_end;
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 8c16bc5250ef..60ddddce4ec1 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1686,7 +1686,9 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf)
 	if (!prealloc_cf)
 		return VM_FAULT_OOM;
 
-	sb_start_pagefault(inode->i_sb);
+	err = sb_start_pagefault(inode->i_sb);
+	if (err)
+		goto out_free;
 	ceph_block_sigs(&oldset);
 
 	if (off + thp_size(page) <= size)
@@ -1704,7 +1706,7 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf)
 	got = 0;
 	err = ceph_get_caps(vma->vm_file, CEPH_CAP_FILE_WR, want, off + len, &got);
 	if (err < 0)
-		goto out_free;
+		goto out_sigs;
 
 	doutc(cl, "%llx.%llx %llu~%zd got cap refs on %s\n", ceph_vinop(inode),
 	      off, len, ceph_cap_string(got));
@@ -1758,9 +1760,10 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf)
 	doutc(cl, "%llx.%llx %llu~%zd dropping cap refs on %s ret %x\n",
 	      ceph_vinop(inode), off, len, ceph_cap_string(got), ret);
 	ceph_put_cap_refs_async(ci, got);
-out_free:
+out_sigs:
 	ceph_restore_sigs(&oldset);
 	sb_end_pagefault(inode->i_sb);
+out_free:
 	ceph_free_cap_flush(prealloc_cf);
 	if (err < 0)
 		ret = vmf_error(err);
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 10b061ac5bc0..b57197007b28 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -98,7 +98,8 @@ static vm_fault_t ext2_dax_fault(struct vm_fault *vmf)
 		(vmf->vma->vm_flags & VM_SHARED);
 
 	if (write) {
-		sb_start_pagefault(inode->i_sb);
+		if (sb_start_pagefault(inode->i_sb) < 0)
+			return VM_FAULT_SIGBUS;
 		file_update_time(vmf->vma->vm_file);
 	}
 	filemap_invalidate_lock_shared(inode->i_mapping);
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c89e434db6b7..37623d4624c0 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -725,7 +725,8 @@ static vm_fault_t ext4_dax_huge_fault(struct vm_fault *vmf, unsigned int order)
 	pfn_t pfn;
 
 	if (write) {
-		sb_start_pagefault(sb);
+		if (sb_start_pagefault(sb) < 0)
+			return VM_FAULT_SIGBUS;
 		file_update_time(vmf->vma->vm_file);
 		filemap_invalidate_lock_shared(mapping);
 retry:
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 941c1c0d5c6e..195fcbb5c083 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -6128,7 +6128,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 	if (unlikely(IS_IMMUTABLE(inode)))
 		return VM_FAULT_SIGBUS;
 
-	sb_start_pagefault(inode->i_sb);
+	if (unlikely(sb_start_pagefault(inode->i_sb) < 0))
+		return VM_FAULT_SIGBUS;
 	file_update_time(vma->vm_file);
 
 	filemap_invalidate_lock_shared(mapping);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 168f08507004..67ee66ff75a7 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -100,7 +100,9 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf)
 	if (need_alloc)
 		f2fs_balance_fs(sbi, true);
 
-	sb_start_pagefault(inode->i_sb);
+	err = sb_start_pagefault(inode->i_sb);
+	if (err)
+		goto out;
 
 	f2fs_bug_on(sbi, f2fs_has_inline_data(inode));
 
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 12ef91d170bb..d42d0aaa1bd9 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -797,7 +797,8 @@ static vm_fault_t __fuse_dax_fault(struct vm_fault *vmf, unsigned int order,
 	bool retry = false;
 
 	if (write)
-		sb_start_pagefault(sb);
+		if (sb_start_pagefault(sb) < 0)
+			return VM_FAULT_SIGBUS;
 retry:
 	if (retry && !(fcd->nr_free_ranges > 0))
 		wait_event(fcd->range_waitq, (fcd->nr_free_ranges > 0));
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 08982937b5df..774203da5262 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -427,7 +427,8 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
 	loff_t size;
 	int err;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 
 	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
 	err = gfs2_glock_nq(&gh);
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 4726c315453c..e46fcd387be6 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -531,7 +531,8 @@ vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_gr
 
 	_enter("%lx", folio->index);
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 
 	if (folio_lock_killable(folio) < 0)
 		goto out;
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 61a8cdb9f1e1..ae0791c67ab8 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -593,7 +593,8 @@ static vm_fault_t nfs_vm_page_mkwrite(struct vm_fault *vmf)
 		 filp, filp->f_mapping->host->i_ino,
 		 (long long)folio_pos(folio));
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 
 	/* make sure the cache has finished storing the page */
 	if (folio_test_private_2(folio) && /* [DEPRECATED] */
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index 0e3fc5ba33c7..6e80377250d1 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -54,7 +54,8 @@ static vm_fault_t nilfs_page_mkwrite(struct vm_fault *vmf)
 	if (unlikely(nilfs_near_disk_full(inode->i_sb->s_fs_info)))
 		return VM_FAULT_SIGBUS; /* -ENOSPC */
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	folio_lock(folio);
 	if (folio->mapping != inode->i_mapping ||
 	    folio_pos(folio) >= i_size_read(inode) ||
diff --git a/fs/ocfs2/mmap.c b/fs/ocfs2/mmap.c
index 1834f26522ed..a56465a3a515 100644
--- a/fs/ocfs2/mmap.c
+++ b/fs/ocfs2/mmap.c
@@ -119,7 +119,8 @@ static vm_fault_t ocfs2_page_mkwrite(struct vm_fault *vmf)
 	int err;
 	vm_fault_t ret;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	ocfs2_block_signals(&oldset);
 
 	/*
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index fdb9b65db1de..170ef9456ff1 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -632,7 +632,8 @@ vm_fault_t orangefs_page_mkwrite(struct vm_fault *vmf)
 	vm_fault_t ret;
 	struct orangefs_write_range *wr;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 
 	if (wait_on_bit(bitlock, 1, TASK_KILLABLE)) {
 		ret = VM_FAULT_RETRY;
diff --git a/fs/udf/file.c b/fs/udf/file.c
index 3a4179de316b..d97ba972f1f3 100644
--- a/fs/udf/file.c
+++ b/fs/udf/file.c
@@ -45,7 +45,8 @@ static vm_fault_t udf_page_mkwrite(struct vm_fault *vmf)
 	vm_fault_t ret = VM_FAULT_LOCKED;
 	int err;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	file_update_time(vma->vm_file);
 	filemap_invalidate_lock_shared(mapping);
 	folio_lock(folio);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4cdc54dc9686..7e2c9bd70bc2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1283,7 +1283,8 @@ xfs_write_fault(
 	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
 	vm_fault_t		ret;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	file_update_time(vmf->vma->vm_file);
 
 	/*
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 3b103715acc9..0b100d48056a 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -294,7 +294,8 @@ static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf)
 	if (zonefs_inode_is_seq(inode))
 		return VM_FAULT_NOPAGE;
 
-	sb_start_pagefault(inode->i_sb);
+	if (sb_start_pagefault(inode->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	file_update_time(vmf->vma->vm_file);
 
 	/* Serialize against truncates */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 44ae86f46b12..a082777eac6a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1858,9 +1858,12 @@ static inline bool __must_check sb_start_write_trylock(struct super_block *sb)
  * mmap_lock
  *   -> sb_start_pagefault
  */
-static inline void sb_start_pagefault(struct super_block *sb)
+static inline int __must_check sb_start_pagefault(struct super_block *sb)
 {
+	if (sb_test_iflag(sb, SB_I_SHUTDOWN))
+		return -EROFS;
 	__sb_start_write(sb, SB_FREEZE_PAGEFAULT);
+	return 0;
 }
 
 /**
diff --git a/mm/filemap.c b/mm/filemap.c
index d62150418b91..97efc8a62c21 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3672,7 +3672,8 @@ vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf)
 	struct folio *folio = page_folio(vmf->page);
 	vm_fault_t ret = VM_FAULT_LOCKED;
 
-	sb_start_pagefault(mapping->host->i_sb);
+	if (sb_start_pagefault(mapping->host->i_sb) < 0)
+		return VM_FAULT_SIGBUS;
 	file_update_time(vmf->vma->vm_file);
 	folio_lock(folio);
 	if (folio->mapping != mapping) {
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 13/13] ext4: Replace EXT4_FLAGS_SHUTDOWN flag with a generic SB_I_SHUTDOWN
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (11 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 12/13] fs: Make sb_start_pagefault() " Jan Kara
@ 2024-08-07 18:29 ` Jan Kara
  2024-08-07 23:18 ` [PATCH RFC 0/13] fs: generic filesystem shutdown handling Dave Chinner
  13 siblings, 0 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-07 18:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Dave Chinner, Christian Brauner, Jan Kara

Instead of using private ext4 EXT4_FLAGS_SHUTDOWN flag, use a generic
variant SB_I_SHUTDOWN. As a bonus VFS will now refuse modification
attempts for the filesystem when the flag is set.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/ext4.h  |  3 +--
 fs/ext4/ioctl.c |  6 +++---
 fs/ext4/super.c | 11 +++++------
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 08acd152261e..7a3ea125ec86 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2240,12 +2240,11 @@ extern int ext4_feature_set_ok(struct super_block *sb, int readonly);
  * Superblock flags
  */
 #define EXT4_FLAGS_RESIZING	0
-#define EXT4_FLAGS_SHUTDOWN	1
 #define EXT4_FLAGS_BDEV_IS_DAX	2
 
 static inline int ext4_forced_shutdown(struct super_block *sb)
 {
-	return test_bit(EXT4_FLAGS_SHUTDOWN, &EXT4_SB(sb)->s_ext4_flags);
+	return sb_test_iflag(sb, SB_I_SHUTDOWN);
 }
 
 /*
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index e8bf5972dd47..086bc239ff33 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -822,18 +822,18 @@ int ext4_force_shutdown(struct super_block *sb, u32 flags)
 		ret = bdev_freeze(sb->s_bdev);
 		if (ret)
 			return ret;
-		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
+		sb_set_iflag(sb, SB_I_SHUTDOWN);
 		bdev_thaw(sb->s_bdev);
 		break;
 	case EXT4_GOING_FLAGS_LOGFLUSH:
-		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
+		sb_set_iflag(sb, SB_I_SHUTDOWN);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
 			(void) ext4_force_commit(sb);
 			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		}
 		break;
 	case EXT4_GOING_FLAGS_NOLOGFLUSH:
-		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
+		sb_set_iflag(sb, SB_I_SHUTDOWN);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
 			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		break;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b5b2f17f1b65..928d8eb266f0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -701,7 +701,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
 		WARN_ON_ONCE(1);
 
 	if (!continue_fs && !sb_rdonly(sb)) {
-		set_bit(EXT4_FLAGS_SHUTDOWN, &EXT4_SB(sb)->s_ext4_flags);
+		sb_set_iflag(sb, SB_I_SHUTDOWN);
 		if (journal)
 			jbd2_journal_abort(journal, -EIO);
 	}
@@ -735,11 +735,10 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
 
 	ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only");
 	/*
-	 * EXT4_FLAGS_SHUTDOWN was set which stops all filesystem
-	 * modifications. We don't set SB_RDONLY because that requires
-	 * sb->s_umount semaphore and setting it without proper remount
-	 * procedure is confusing code such as freeze_super() leading to
-	 * deadlocks and other problems.
+	 * SB_I_SHUTDOWN was set which stops all filesystem modifications. We
+	 * don't set SB_RDONLY because that requires sb->s_umount semaphore and
+	 * setting it without proper remount procedure is confusing code such
+	 * as freeze_super() leading to deadlocks and other problems.
 	 */
 }
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
                   ` (12 preceding siblings ...)
  2024-08-07 18:29 ` [PATCH 13/13] ext4: Replace EXT4_FLAGS_SHUTDOWN flag with a generic SB_I_SHUTDOWN Jan Kara
@ 2024-08-07 23:18 ` Dave Chinner
  2024-08-08 14:32   ` Jan Kara
  2024-08-08 14:51   ` Darrick J. Wong
  13 siblings, 2 replies; 25+ messages in thread
From: Dave Chinner @ 2024-08-07 23:18 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Christian Brauner

On Wed, Aug 07, 2024 at 08:29:45PM +0200, Jan Kara wrote:
> Hello,
> 
> this patch series implements generic handling of filesystem shutdown. The idea
> is very simple: Have a superblock flag, which when set, will make VFS refuse
> modifications to the filesystem. The patch series consists of several parts.
> Patches 1-6 cleanup handling of SB_I_ flags which is currently messy (different
> flags seem to have different locks protecting them although they are modified
> by plain stores). Patches 7-12 gradually convert code to be able to handle
> errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows how
> filesystems can use this generic flag. Additionally, we could remove some
> shutdown checks from within ext4 code and rely on checks in VFS but I didn't
> want to complicate the series with ext4 specific things.

Overall this looks good. Two things that I noticed that we should
nail down before anything else:

1. The original definition of a 'shutdown filesystem' (i.e. from the
XFS origins) is that a shutdown filesystem must *never* do -physical
IO- after the shutdown is initiated. This is a protection mechanism
for the underlying storage to prevent potential propagation of
problems in the storage media once a serious issue has been
detected. (e.g. suspect physical media can be made worse by
continually trying to read it.) It also allows the block device to
go away and we won't try to issue new IO to it once the
->shutdown call has completed.

IOWs, XFS implements a "no new IO after shutdown" architecture, and
this is also largely what ext4 implements as well.

However, this isn't what this generic shutdown infrastructure
implements. It only prevents new user modifications from being
started - it is effectively an "instant RO" mechanism rather than an
"instant no more IO" architecture.

Hence we have an impedance mismatch between existing shutdown
implementations that currently return -EIO on shutdown for all
operations (both read and write) and this generic implementation
which returns -EROFS only for write operations.

Hence the proposed generic shutdown model doesn't really solve the
inconsistent shutdown behaviour problem across filesystems - it just
adds a new inconsistency between existing filesystem shutdown
implementations and the generic infrastructure.

2. On shutdown, this patchset returns -EROFS.

As per #1, returning -EROFS on shutdown will be a significant change
of behaviour for some filesystems as they currently return -EIO when
the filesystem is shut down.

I don't think -EROFS is right, because existing shutdown behaviour
also impacts read-only operations and will return -EIO for them,
too.

I think the error returned by a shutdown filesystem should always be
consistent and that really means -EIO needs to be returned rather
than -EROFS.

However, given this is new generic infrastructure, we can define a
new error like -ESHUTDOWN (to reuse an existing errno) or even a
new errno like -EFSSHUTDOWN for this, document it in man pages and then
convert all the existing filesystem shutdown checks to return this
error instead of -EIO...

> Also, as Dave suggested, we can lift *_IOC_{SHUTDOWN|GOINGDOWN} ioctl handling
> to VFS (currently in 5 filesystems) and just call new ->shutdown op for
> the filesystem abort handling itself. But that is kind of independent thing
> and this series is long enough as is.

Agreed - that can be done separately once we've sorted out the
little details of what a shutdown filesystem actually means and how
that gets reported consistently to userspace...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-07 23:18 ` [PATCH RFC 0/13] fs: generic filesystem shutdown handling Dave Chinner
@ 2024-08-08 14:32   ` Jan Kara
  2024-08-13 12:46     ` Christian Brauner
  2024-08-14  0:09     ` Dave Chinner
  2024-08-08 14:51   ` Darrick J. Wong
  1 sibling, 2 replies; 25+ messages in thread
From: Jan Kara @ 2024-08-08 14:32 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Jan Kara, linux-fsdevel, Christian Brauner

On Thu 08-08-24 09:18:51, Dave Chinner wrote:
> On Wed, Aug 07, 2024 at 08:29:45PM +0200, Jan Kara wrote:
> > Hello,
> > 
> > this patch series implements generic handling of filesystem shutdown. The idea
> > is very simple: Have a superblock flag, which when set, will make VFS refuse
> > modifications to the filesystem. The patch series consists of several parts.
> > Patches 1-6 cleanup handling of SB_I_ flags which is currently messy (different
> > flags seem to have different locks protecting them although they are modified
> > by plain stores). Patches 7-12 gradually convert code to be able to handle
> > errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows how
> > filesystems can use this generic flag. Additionally, we could remove some
> > shutdown checks from within ext4 code and rely on checks in VFS but I didn't
> > want to complicate the series with ext4 specific things.
> 
> Overall this looks good. Two things that I noticed that we should
> nail down before anything else:
> 
> 1. The original definition of a 'shutdown filesystem' (i.e. from the
> XFS origins) is that a shutdown filesystem must *never* do -physical
> IO- after the shutdown is initiated. This is a protection mechanism
> for the underlying storage to prevent potential propagation of
> problems in the storage media once a serious issue has been
> detected. (e.g. suspect physical media can be made worse by
> continually trying to read it.) It also allows the block device to
> go away and we won't try to access issue new IO to it once the
> ->shutdown call has been complete.
> 
> IOWs, XFS implements a "no new IO after shutdown" architecture, and
> this is also largely what ext4 implements as well.

Thanks for sharing this. I wasn't aware that "no new IO after shutdown" is
the goal. I knew this is required for modifications but I wasn't sure how
strict this was for reads.

> However, this isn't what this generic shutdown infrastructure
> implements. It only prevents new user modifications from being
> started - it is effectively a "instant RO" mechanism rather than an
> "instant no more IO" architecture.
> 
> Hence we have an impedence mismatch between existing shutdown
> implementations that currently return -EIO on shutdown for all
> operations (both read and write) and this generic implementation
> which returns -EROFS only for write operations.
> 
> Hence the proposed generic shutdown model doesn't really solve the
> inconsistent shutdown behaviour problem across filesystems - it just
> adds a new inconsistency between existing filesystem shutdown
> implementations and the generic infrastructure.

OK, understood. I also agree it would be good to keep this no-IO semantics
when implementing the generic solution. I'm just pondering how to achieve
that in a maintainable way. For the write path what I've done looks like
the least painful way. For the read path the simplest is probably to still
return whatever is in cache and just do the check + error return somewhere
down in the call stack just before calling into filesystem. It is easy
enough to stop things like ->read_folio, ->readahead, or ->lookup. But how
about things like ->evict_inode or ->release?  They can trigger IO but
allowing inode reclaim on shutdown fs is desirable I'd say. Similarly for
things like ->remount_fs or ->put_super. So avoiding IO from operations
like these would rely on fs implementation anyway.
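
A minimal userspace sketch of the write-path gate discussed above, in plain
C. The names model_sb, model_shutdown(), model_start_write() and
model_end_write() are hypothetical stand-ins, not the kernel's struct
super_block or sb_start_write(). The -EIO return is an assumption taken from
the error code favoured in this thread; the posted series returns -EROFS.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a superblock. */
struct model_sb {
	atomic_bool shutdown;	/* models a SB_I_SHUTDOWN-style flag */
	atomic_int writers;	/* number of active write-access holders */
};

/* Mark the filesystem as shut down; the flag is never cleared again. */
static void model_shutdown(struct model_sb *sb)
{
	assert(sb != NULL);
	atomic_store(&sb->shutdown, true);
}

/*
 * Take write access. Returns 0 on success or -EIO once the filesystem is
 * shut down. The flag is checked after the writer count is raised so that
 * a shutdown racing with this call is still noticed.
 */
static int model_start_write(struct model_sb *sb)
{
	assert(sb != NULL);
	atomic_fetch_add(&sb->writers, 1);
	if (atomic_load(&sb->shutdown)) {
		atomic_fetch_sub(&sb->writers, 1);
		return -EIO;
	}
	return 0;
}

/* Drop write access taken by a successful model_start_write(). */
static void model_end_write(struct model_sb *sb)
{
	assert(sb != NULL);
	atomic_fetch_sub(&sb->writers, 1);
}
```

Callers have to handle the error return and must not call model_end_write()
after a failed model_start_write(), which mirrors why patches 7-12 convert
callers before the check itself is introduced.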

> 2. On shutdown, this patchset returns -EROFS.
> 
> As per #1, returning -EROFS on shutdown will be a significant change
> of behaviour for some filesystems as they currently return -EIO when
> the filesystem is shut down.
> 
> I don't think -EROFS is right, because existing shutdown behaviour
> also impacts read-only operations and will return -EIO for them,
> too.
> 
> I think the error returned by a shutdown filesystem should always be
> consistent and that really means -EIO needs to be returned rather
> than -EROFS.
> 
> However, given this is new generic infrastructure, we can define a
> new error like -ESHUTDOWN (to reuse an existing errno) or even a
> new errno like -EFSSHUTDOWN for this, document it in the man pages and then
> convert all the existing filesystem shutdown checks to return this
> error instead of -EIO...

Right, -EROFS isn't really a good return value when we also refuse reads. I
think -EIO is fine. -ESHUTDOWN would be ok but the standard message ("Cannot
send after transport endpoint shutdown") would IMO be confusing to users.
I was also thinking about -EFSCORRUPTED (alias -EUCLEAN) which already has
some precedent in the filesystem space but -EIO is probably better.
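
To see what users would actually be shown for each candidate, a small
userspace sketch can print the C library's message for every errno named
above. The table and the helper names candidate_message() and
print_candidates() are hypothetical. EFSCORRUPTED is kernel-internal, so its
alias EUCLEAN is used, and the proposed EFSSHUTDOWN is left out because no
such errno exists today. The exact strings depend on the C library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One candidate error code from the discussion. */
struct errno_candidate {
	const char *name;
	int code;
};

static const struct errno_candidate candidates[] = {
	{ "EIO", EIO },
	{ "EROFS", EROFS },
	{ "ESHUTDOWN", ESHUTDOWN },
	{ "EUCLEAN", EUCLEAN },	/* Linux-specific; kernel alias EFSCORRUPTED */
};

#define N_CANDIDATES (sizeof(candidates) / sizeof(candidates[0]))

/* Return the libc message for a candidate, or NULL if it is not in the table. */
static const char *candidate_message(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < N_CANDIDATES; i++)
		if (strcmp(candidates[i].name, name) == 0)
			return strerror(candidates[i].code);
	return NULL;
}

/* Print one "NAME message" line per candidate. */
static void print_candidates(FILE *out)
{
	for (size_t i = 0; i < N_CANDIDATES; i++)
		fprintf(out, "%-10s %s\n", candidates[i].name,
			strerror(candidates[i].code));
}
```

Calling print_candidates(stdout) makes the usability argument concrete: the
ESHUTDOWN text talks about sending on a transport endpoint, which has nothing
to do with a filesystem.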

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-08 14:32   ` Jan Kara
@ 2024-08-13 12:46     ` Christian Brauner
  2024-08-14  0:09     ` Dave Chinner
  1 sibling, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2024-08-13 12:46 UTC (permalink / raw)
  To: Jan Kara; +Cc: Dave Chinner, linux-fsdevel

> things like ->remount_fs or ->put_super. So avoiding IO from operations
> like these would rely on fs implementation anyway.

I think that needs to remain filesystem specific and I don't think this
needs to hold up the patchset.

> Right, -EROFS isn't really a good return value when we also refuse reads. I
> think -EIO is fine. -ESHUTDOWN would be ok but the standard message ("Cannot
> send after transport endpoint shutdown") would IMO be confusing to users.
> I was also thinking about -EFSCORRUPTED (alias -EUCLEAN) which already has
> some precedent in the filesystem space but -EIO is probably better.

EIO isn't great but I agree that it's the best we probably have for now.


* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-08 14:32   ` Jan Kara
  2024-08-13 12:46     ` Christian Brauner
@ 2024-08-14  0:09     ` Dave Chinner
  1 sibling, 0 replies; 25+ messages in thread
From: Dave Chinner @ 2024-08-14  0:09 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, Christian Brauner

On Thu, Aug 08, 2024 at 04:32:22PM +0200, Jan Kara wrote:
> On Thu 08-08-24 09:18:51, Dave Chinner wrote:
> > On Wed, Aug 07, 2024 at 08:29:45PM +0200, Jan Kara wrote:
> > > Hello,
> > > 
> > > this patch series implements generic handling of filesystem shutdown. The idea
> > > is very simple: Have a superblock flag, which when set, will make VFS refuse
> > > modifications to the filesystem. The patch series consists of several parts.
> > > Patches 1-6 cleanup handling of SB_I_ flags which is currently messy (different
> > > flags seem to have different locks protecting them although they are modified
> > > by plain stores). Patches 7-12 gradually convert code to be able to handle
> > > errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows how
> > > filesystems can use this generic flag. Additionally, we could remove some
> > > shutdown checks from within ext4 code and rely on checks in VFS but I didn't
> > > want to complicate the series with ext4 specific things.
> > 
> > Overall this looks good. Two things that I noticed that we should
> > nail down before anything else:
> > 
> > 1. The original definition of a 'shutdown filesystem' (i.e. from the
> > XFS origins) is that a shutdown filesystem must *never* do -physical
> > IO- after the shutdown is initiated. This is a protection mechanism
> > for the underlying storage to prevent potential propagation of
> > problems in the storage media once a serious issue has been
> > detected. (e.g. suspect physical media can be made worse by
> > continually trying to read it.) It also allows the block device to
> > go away and we won't try to access issue new IO to it once the
> > ->shutdown call has been complete.
> > 
> > IOWs, XFS implements a "no new IO after shutdown" architecture, and
> > this is also largely what ext4 implements as well.
> 
> Thanks for sharing this. I wasn't aware that "no new IO after shutdown" is
> the goal. I knew this is required for modifications but I wasn't sure how
> strict this was for reads.
> 
> > However, this isn't what this generic shutdown infrastructure
> > implements. It only prevents new user modifications from being
> > started - it is effectively a "instant RO" mechanism rather than an
> > "instant no more IO" architecture.
> > 
> > Hence we have an impedence mismatch between existing shutdown
> > implementations that currently return -EIO on shutdown for all
> > operations (both read and write) and this generic implementation
> > which returns -EROFS only for write operations.
> > 
> > Hence the proposed generic shutdown model doesn't really solve the
> > inconsistent shutdown behaviour problem across filesystems - it just
> > adds a new inconsistency between existing filesystem shutdown
> > implementations and the generic infrastructure.
> 
> OK, understood. I also agree it would be good to keep this no-IO semantics
> when implementing the generic solution. I'm just pondering how to achieve
> that in a maintainable way. For the write path what I've done looks like
> the least painful way. For the read path the simplest is probably to still
> return whatever is in cache and just do the check + error return somewhere
> down in the call stack just before calling into filesystem. It is easy
> enough to stop things like ->read_folio, ->readahead, or ->lookup. But how
> about things like ->evict_inode or ->release?

If the filesystem is shut down, inode eviction or releasing a FD
should not be doing any IO at all - they should just be releasing
in-memory resources and freeing the objects being released. e.g. we
don't process unlinked inodes when the filesystem is shut down; they
remain as unlinked on disk and recovery gets to clean up the mess.
i.e.  we process all inodes as if they were clean, linked inodes and
just tear down the in-memory structures attached to the inode.

i.e. shutdown isn't concerned about keeping anything consistent
either in memory or on disk - it's concerned only about releasing
in-memory resources such that the filesystem can be unmounted
without doing any IO at all.  e.g. ext4_evict_inode() needs to treat
all unlinked inodes as if they are bad when the filesystem is shut
down. XFS does this (see the shutdown check in
xfs_inode_needs_inactive()) and every filesystem that does unlinked
inode processing in inode eviction will need similar modifications.

Yes, this means a "shutdown means no IO" model cannot be exclusively
implemented at the VFS - it will need things like filesystems with
customised inode eviction callouts to handle these cases themselves.
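
The eviction rule above can be sketched as a small decision function in
userspace C. The struct model_inode, the enum and model_evict_action() are
hypothetical names for illustration only; they are not ext4 or XFS code, and
the three actions are a simplification of what real eviction paths do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory inode state relevant to eviction. */
struct model_inode {
	unsigned int nlink;	/* 0 means the inode is unlinked */
	bool dirty;		/* has state not yet written to disk */
};

enum evict_action {
	EVICT_FREE_ONLY,	/* tear down in-memory structures, no IO */
	EVICT_WRITEBACK,	/* write dirty state back, then free */
	EVICT_INACTIVATE,	/* process the unlinked inode on disk, then free */
};

/*
 * Decide what eviction may do. On a shut down filesystem every inode is
 * treated as a clean, linked inode, so eviction never issues IO; unlinked
 * inodes stay unlinked on disk and are left for recovery to clean up.
 */
static enum evict_action model_evict_action(bool fs_shutdown,
					    const struct model_inode *inode)
{
	assert(inode != NULL);
	if (fs_shutdown)
		return EVICT_FREE_ONLY;
	if (inode->nlink == 0)
		return EVICT_INACTIVATE;
	if (inode->dirty)
		return EVICT_WRITEBACK;
	return EVICT_FREE_ONLY;
}
```

The shutdown test comes first on purpose: dirty and unlinked state must not
influence the outcome once the filesystem is shut down.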

> They can trigger IO but
> allowing inode reclaim on shutdown fs is desirable I'd say. Similarly for
> things like ->remount_fs or ->put_super. So avoiding IO from operations
> like these would rely on fs implementation anyway.

remounts need to follow the fundamental rule of shutdowns: you can't
change the state of a shutdown filesystem -at all- because any
operation on a shutdown filesystem should be immediately failed. The
only thing you can reliably do once a filesystem is shut down is
unmount it.  IOWs, the VFS should return -EIO when a remount is
requested on a shutdown filesystem, and the filesystem code then
doesn't have to care.
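
As a rough illustration of that rule, here is a userspace sketch of a generic
check done before calling into the filesystem. The names model_sb, model_op
and model_check_op() are hypothetical, and only the two operations mentioned
above are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock carrying the shutdown state. */
struct model_sb {
	bool shutdown;
};

enum model_op {
	MODEL_OP_REMOUNT,	/* would change filesystem state */
	MODEL_OP_UMOUNT,	/* the one thing that must keep working */
};

/*
 * Generic gate: once the filesystem is shut down, every state-changing
 * operation fails with -EIO before the filesystem is called at all.
 * Unmount is always let through.
 */
static int model_check_op(const struct model_sb *sb, enum model_op op)
{
	assert(sb != NULL);
	if (!sb->shutdown)
		return 0;
	return op == MODEL_OP_UMOUNT ? 0 : -EIO;
}
```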

As for ->put_super(), this should act as if the filesystem is clean
when the filesystem is shut down as everything that is dirty in
memory will never get cleaned. IOWs, once shutdown has been set,
dirty state should be completely ignored by everything and so object
release/eviction should tear everything down regardless of its
state.

Supporting a "no-IO shutdown" model properly will require filesystem
specific changes to handle, but that's really implementation details
more than anything else. What we need to do first is define
and document exactly what shutdown means and how the VFS and
filesystems need to operate when that bit is set. Then we have a
clear framework from which we can consistently answer "what should
filesystem X do in this situation" issues that arise...

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-07 23:18 ` [PATCH RFC 0/13] fs: generic filesystem shutdown handling Dave Chinner
  2024-08-08 14:32   ` Jan Kara
@ 2024-08-08 14:51   ` Darrick J. Wong
  2024-08-09  2:30     ` Dave Chinner
  1 sibling, 1 reply; 25+ messages in thread
From: Darrick J. Wong @ 2024-08-08 14:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Jan Kara, linux-fsdevel, Christian Brauner

On Thu, Aug 08, 2024 at 09:18:51AM +1000, Dave Chinner wrote:
> On Wed, Aug 07, 2024 at 08:29:45PM +0200, Jan Kara wrote:
> > Hello,
> > 
> > this patch series implements generic handling of filesystem shutdown. The idea
> > is very simple: Have a superblock flag, which when set, will make VFS refuse
> > modifications to the filesystem. The patch series consists of several parts.
> > Patches 1-6 cleanup handling of SB_I_ flags which is currently messy (different
> > flags seem to have different locks protecting them although they are modified
> > by plain stores). Patches 7-12 gradually convert code to be able to handle
> > errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows how
> > filesystems can use this generic flag. Additionally, we could remove some
> > shutdown checks from within ext4 code and rely on checks in VFS but I didn't
> > want to complicate the series with ext4 specific things.
> 
> Overall this looks good. Two things that I noticed that we should
> nail down before anything else:
> 
> 1. The original definition of a 'shutdown filesystem' (i.e. from the
> XFS origins) is that a shutdown filesystem must *never* do -physical
> IO- after the shutdown is initiated. This is a protection mechanism
> for the underlying storage to prevent potential propagation of
> problems in the storage media once a serious issue has been
> detected. (e.g. suspect physical media can be made worse by
> continually trying to read it.) It also allows the block device to
> go away and we won't try to access issue new IO to it once the
> ->shutdown call has been complete.
> 
> IOWs, XFS implements a "no new IO after shutdown" architecture, and
> this is also largely what ext4 implements as well.

I don't think it quite does -- for EXT4_GOING_FLAGS_DEFAULT, it sets the
shutdown flag, but it doesn't actually abort the journal.  I think
that's an implementation bug since XFS /does/ shut down the log.

But looking at XFS_FSOP_GOING_FLAGS_DEFAULT, I also notice that if the
bdev_freeze fails, it returns 0 and the fs isn't shut down.  ext4, otoh,
actually does pass bdev_freeze's errno along.  I think ext4's behavior
is the correct one, right?

> However, this isn't what this generic shutdown infrastructure
> implements. It only prevents new user modifications from being
> started - it is effectively a "instant RO" mechanism rather than an
> "instant no more IO" architecture.

I thought pagefaults are still allowed on a shutdown xfs?  Curiously I
don't see a prohibition on write faults, but iirc we still allowed read
faults so that a shutdown on the rootfs doesn't immediately crash the
whole machine?

> Hence we have an impedence mismatch between existing shutdown
> implementations that currently return -EIO on shutdown for all
> operations (both read and write) and this generic implementation
> which returns -EROFS only for write operations.
> 
> Hence the proposed generic shutdown model doesn't really solve the
> inconsistent shutdown behaviour problem across filesystems - it just
> adds a new inconsistency between existing filesystem shutdown
> implementations and the generic infrastructure.
> 
> 2. On shutdown, this patchset returns -EROFS.
> 
> As per #1, returning -EROFS on shutdown will be a significant change
> of behaviour for some filesystems as they currently return -EIO when
> the filesystem is shut down.
> 
> I don't think -EROFS is right, because existing shutdown behaviour
> also impacts read-only operations and will return -EIO for them,
> too.
> 
> I think the error returned by a shutdown filesystem should always be
> consistent and that really means -EIO needs to be returned rather
> than -EROFS.
> 
> However, given this is new generic infrastructure, we can define a
> new error like -ESHUTDOWN (to reuse an existing errno) or even a
> new errno like -EFSSHUTDOWN for this, document it in the man pages and then
> convert all the existing filesystem shutdown checks to return this
> error instead of -EIO...

Agree.

> > Also, as Dave suggested, we can lift *_IOC_{SHUTDOWN|GOINGDOWN} ioctl handling
> > to VFS (currently in 5 filesystems) and just call new ->shutdown op for
> > the filesystem abort handling itself. But that is kind of independent thing
> > and this series is long enough as is.
> 
> Agreed - that can be done separately once we've sorted out the
> little details of what a shutdown filesystem actually means and how
> that gets reported consistently to userspace...

I would define it as:

No more writes to the filesystem or its underlying storage; file IO
and stat* calls return ESHUTDOWN; read faults still allowed.

I like the idea of hoisting this to the vfs and defining how one figures
out if the fs is shut down; thank you for working on this, Jan.

--D

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


* Re: [PATCH RFC 0/13] fs: generic filesystem shutdown handling
  2024-08-08 14:51   ` Darrick J. Wong
@ 2024-08-09  2:30     ` Dave Chinner
  0 siblings, 0 replies; 25+ messages in thread
From: Dave Chinner @ 2024-08-09  2:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Jan Kara, linux-fsdevel, Christian Brauner

On Thu, Aug 08, 2024 at 07:51:41AM -0700, Darrick J. Wong wrote:
> On Thu, Aug 08, 2024 at 09:18:51AM +1000, Dave Chinner wrote:
> > On Wed, Aug 07, 2024 at 08:29:45PM +0200, Jan Kara wrote:
> > > Hello,
> > > 
> > > this patch series implements generic handling of filesystem shutdown. The idea
> > > is very simple: Have a superblock flag, which when set, will make VFS refuse
> > > modifications to the filesystem. The patch series consists of several parts.
> > > Patches 1-6 cleanup handling of SB_I_ flags which is currently messy (different
> > > flags seem to have different locks protecting them although they are modified
> > > by plain stores). Patches 7-12 gradually convert code to be able to handle
> > > errors from sb_start_write() / sb_start_pagefault(). Patch 13 then shows how
> > > filesystems can use this generic flag. Additionally, we could remove some
> > > shutdown checks from within ext4 code and rely on checks in VFS but I didn't
> > > want to complicate the series with ext4 specific things.
> > 
> > Overall this looks good. Two things that I noticed that we should
> > nail down before anything else:
> > 
> > 1. The original definition of a 'shutdown filesystem' (i.e. from the
> > XFS origins) is that a shutdown filesystem must *never* do -physical
> > IO- after the shutdown is initiated. This is a protection mechanism
> > for the underlying storage to prevent potential propagation of
> > problems in the storage media once a serious issue has been
> > detected. (e.g. suspect physical media can be made worse by
> > continually trying to read it.) It also allows the block device to
> > go away and we won't try to access issue new IO to it once the
> > ->shutdown call has been complete.
> > 
> > IOWs, XFS implements a "no new IO after shutdown" architecture, and
> > this is also largely what ext4 implements as well.
> 
> I don't think it quite does -- for EXT4_GOING_FLAGS_DEFAULT, it sets the
> shutdown flag, but it doesn't actually abort the journal. I think
> that's an implementation bug since XFS /does/ shut down the log.
>
> But looking at XFS_FSOP_GOING_FLAGS_DEFAULT, I also notice that if the
> bdev_freeze fails, it returns 0 and the fs isn't shut down.  ext4, otoh,
> actually does pass bdev_freeze's errno along.  I think ext4's behavior
> is the correct one, right?

Yes, there are inconsistencies in how different filesystems
implement user-driven shutdown operations, but Jan has specifically
left addressing those sorts of inconsistencies in ioctl/->shutdown
implementations for a later patch set. I agree with that approach -
let's first focus on defining a generic model for how shutdown
filesystems should behave once they are shut down. Once we have the
model defined, then we can worry about making filesystem shutdown
mechanisms behave consistently within that model.

> > However, this isn't what this generic shutdown infrastructure
> > implements. It only prevents new user modifications from being
> > started - it is effectively a "instant RO" mechanism rather than an
> > "instant no more IO" architecture.
> 
> I thought pagefaults are still allowed on a shutdown xfs?  Curiously I
> don't see a prohibition on write faults, but iirc we still allowed read
> faults so that a shutdown on the rootfs doesn't immediately crash the
> whole machine?

Yes, page faults are allowed on a shutdown XFS filesystem right up
to the point where they need to do IO on a page cache miss.  Then
the IO request hits the block mapping code (xfs_bmapi_read()), sees
the shutdown state and the read IO fails. The result of this is
SIGBUS for your executable.

IOWs, if the executable is cached, it will keep running after a
shutdown. If it's not cached, then it's game over already.
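
The cached versus uncached distinction can be sketched in userspace C as
follows. The types model_fs and model_page and the function
model_read_fault() are hypothetical. Real XFS detects the shutdown in its
block mapping code rather than in one flag test, so this is a simplified
model of the outcome, not of the code path.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filesystem state: only the shutdown flag matters here. */
struct model_fs {
	bool shutdown;
};

/* Hypothetical page cache entry for one page of a mapped file. */
struct model_page {
	bool cached;
};

/*
 * Handle a read fault. Returns 0 when the fault is satisfied, or SIGBUS
 * when it would need physical IO on a shut down filesystem.
 */
static int model_read_fault(const struct model_fs *fs, struct model_page *page)
{
	assert(fs != NULL && page != NULL);
	if (page->cached)
		return 0;		/* cache hit: no IO needed */
	if (fs->shutdown)
		return SIGBUS;		/* miss needs IO, which is refused */
	page->cached = true;		/* pretend the read IO succeeded */
	return 0;
}
```

A process whose text pages are all cached keeps running after shutdown in
this model, and the first fault on an uncached page ends it.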

> > > Also, as Dave suggested, we can lift *_IOC_{SHUTDOWN|GOINGDOWN} ioctl handling
> > > to VFS (currently in 5 filesystems) and just call new ->shutdown op for
> > > the filesystem abort handling itself. But that is kind of independent thing
> > > and this series is long enough as is.
> > 
> > Agreed - that can be done separately once we've sorted out the
> > little details of what a shutdown filesystem actually means and how
> > that gets reported consistently to userspace...
> 
> I would define it as:
> 
> No more writes to the filesystem or its underlying storage; file IO
> and stat* calls return ESHUTDOWN; read faults still allowed.

We need to define operations in terms of whether we allow physical IO or
not. We currently don't allow physical IO from read faults on XFS,
so...

-Dave.
-- 
Dave Chinner
david@fromorbit.com


end of thread

Thread overview: 25+ messages
2024-08-07 18:29 [PATCH RFC 0/13] fs: generic filesystem shutdown handling Jan Kara
2024-08-07 18:29 ` [PATCH 01/13] fs: Define bit numbers for SB_I_ flags Jan Kara
2024-08-07 18:29 ` [PATCH 02/13] fs: Convert fs_context use of SB_I_ flags to new constants Jan Kara
2024-08-07 18:29 ` [PATCH 03/13] fs: Convert mount_too_revealing() to new s_iflags handling functions Jan Kara
2024-08-07 18:29 ` [PATCH 04/13] fs: Convert remaining usage of SB_I_ flags Jan Kara
2024-08-07 18:29 ` [PATCH 05/13] fs: Drop old SB_I_ constants Jan Kara
2024-08-07 18:29 ` [PATCH 06/13] fs: Drop unnecessary underscore from _SB_I_ constants Jan Kara
2024-08-08 11:47   ` Amir Goldstein
2024-08-08 14:35     ` Darrick J. Wong
2024-08-08 14:50       ` Christian Brauner
2024-08-08 17:34         ` Jan Kara
2024-08-07 18:29 ` [PATCH 07/13] overlayfs: Make ovl_start_write() return error Jan Kara
2024-08-08 12:01   ` Amir Goldstein
2024-08-07 18:29 ` [PATCH 08/13] fs: Teach callers of kiocb_start_write() to handle errors Jan Kara
2024-08-07 18:29 ` [PATCH 09/13] fs: Teach callers of file_start_write() " Jan Kara
2024-08-07 18:29 ` [PATCH 10/13] fs: Add __must_check annotations to sb_start_write_trylock() and similar Jan Kara
2024-08-07 18:29 ` [PATCH 11/13] fs: Make sb_start_write() return error on shutdown filesystem Jan Kara
2024-08-07 18:29 ` [PATCH 12/13] fs: Make sb_start_pagefault() " Jan Kara
2024-08-07 18:29 ` [PATCH 13/13] ext4: Replace EXT4_FLAGS_SHUTDOWN flag with a generic SB_I_SHUTDOWN Jan Kara
2024-08-07 23:18 ` [PATCH RFC 0/13] fs: generic filesystem shutdown handling Dave Chinner
2024-08-08 14:32   ` Jan Kara
2024-08-13 12:46     ` Christian Brauner
2024-08-14  0:09     ` Dave Chinner
2024-08-08 14:51   ` Darrick J. Wong
2024-08-09  2:30     ` Dave Chinner
