[PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems

public inbox for linux-xfs@vger.kernel.org

* [PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems
@ 2026-03-21 11:55 Anand Jain
  2026-03-21 11:55 ` [PATCH v2 1/3] btrfs: use on-disk uuid for s_uuid in temp_fsid mounts Anand Jain
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Anand Jain @ 2026-03-21 11:55 UTC (permalink / raw)
  To: linux-ext4, linux-btrfs; +Cc: linux-xfs, hch

v2:
  Change the statfs::f_fsid derivation only when the new 'nouuid' mount
  option is set; the option name matches the one in xfs.
 
v1:
 btrfs:
  https://lore.kernel.org/linux-btrfs/cover.1772095546.git.asj@kernel.org/
 ext4:
  https://lore.kernel.org/linux-ext4/e269a49eed2de23eb9f9bd7f506f0fe47696a023.1772095546.git.asj@kernel.org/

Anand Jain (3):
  btrfs: use on-disk uuid for s_uuid in temp_fsid mounts
  btrfs: derive f_fsid from on-disk fsuuid and dev_t
  ext4: derive f_fsid from block device to avoid collisions

 fs/btrfs/disk-io.c |  3 ++-
 fs/btrfs/fs.h      |  1 +
 fs/btrfs/super.c   | 35 +++++++++++++++++++++++++++++++----
 fs/ext4/ext4.h     |  1 +
 fs/ext4/super.c    | 12 ++++++++++--
 5 files changed, 45 insertions(+), 7 deletions(-)

-- 
2.43.0



* [PATCH v2 1/3] btrfs: use on-disk uuid for s_uuid in temp_fsid mounts
  2026-03-21 11:55 [PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems Anand Jain
@ 2026-03-21 11:55 ` Anand Jain
  2026-03-21 11:55 ` [PATCH v2 2/3] btrfs: derive f_fsid from on-disk fsuuid and dev_t Anand Jain
  2026-03-21 11:55 ` [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions Anand Jain
  2 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2026-03-21 11:55 UTC (permalink / raw)
  To: linux-ext4, linux-btrfs; +Cc: linux-xfs, hch

When mounting a cloned filesystem with a temporary fsuuid (temp_fsid),
layered modules like overlayfs require a persistent identifier.

While the internal in-memory fs_devices->fsid must remain unique, to
distinguish each instance of the mounted filesystem, let s_uuid carry
the original on-disk UUID.

Signed-off-by: Anand Jain <asj@kernel.org>
---
 fs/btrfs/disk-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c835141ee384..90e0369bf682 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3451,7 +3451,8 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	/* Update the values for the current filesystem. */
 	sb->s_blocksize = sectorsize;
 	sb->s_blocksize_bits = blksize_bits(sectorsize);
-	memcpy(&sb->s_uuid, fs_info->fs_devices->fsid, BTRFS_FSID_SIZE);
+	/* Copy on-disk uuid, even for temp_fsid mounts */
+	memcpy(&sb->s_uuid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
 	mutex_lock(&fs_info->chunk_mutex);
 	ret = btrfs_read_sys_array(fs_info);
-- 
2.43.0



* [PATCH v2 2/3] btrfs: derive f_fsid from on-disk fsuuid and dev_t
  2026-03-21 11:55 [PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems Anand Jain
  2026-03-21 11:55 ` [PATCH v2 1/3] btrfs: use on-disk uuid for s_uuid in temp_fsid mounts Anand Jain
@ 2026-03-21 11:55 ` Anand Jain
  2026-03-21 11:55 ` [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions Anand Jain
  2 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2026-03-21 11:55 UTC (permalink / raw)
  To: linux-ext4, linux-btrfs; +Cc: linux-xfs, hch

f_fsid depends on fs_devices->fsid and subvol root id.

For cloned devices, an f_fsid that is either the same as the source's
or dynamically generated at mount time won't suit, because tools like
fanotify and IMA depend on it.

Switch to a stable derivation using the persistent on-disk fsuuid +
root id + dev_t of the block device for single-device filesystems.
This is consistent as long as the device remains unchanged and is not
replaced (this excludes the btrfs device replace scenario for now).

This change applies only to single-device configurations and is gated
behind the -o nouuid mount option to preserve ABI compatibility.

Signed-off-by: Anand Jain <asj@kernel.org>
---
 fs/btrfs/fs.h    |  1 +
 fs/btrfs/super.c | 35 +++++++++++++++++++++++++++++++----
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index a4758d94b32e..6e2a5c2bd03c 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -270,6 +270,7 @@ enum {
 	BTRFS_MOUNT_IGNOREMETACSUMS		= (1ULL << 31),
 	BTRFS_MOUNT_IGNORESUPERFLAGS		= (1ULL << 32),
 	BTRFS_MOUNT_REF_TRACKER			= (1ULL << 33),
+	BTRFS_MOUNT_NOUUID			= (1ULL << 34),
 };
 
 /* These mount options require a full read-only fs, no new transaction is allowed. */
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 125fca57c164..2fb82032f1e1 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -102,6 +102,7 @@ enum {
 	Opt_compress_type,
 	Opt_degraded,
 	Opt_device,
+	Opt_nouuid,
 	Opt_fatal_errors,
 	Opt_flushoncommit,
 	Opt_max_inline,
@@ -227,6 +228,7 @@ static const struct fs_parameter_spec btrfs_fs_parameters[] = {
 	fsparam_flag_no("datasum", Opt_datasum),
 	fsparam_flag("degraded", Opt_degraded),
 	fsparam_string("device", Opt_device),
+	fsparam_flag("nouuid", Opt_nouuid),
 	fsparam_flag_no("discard", Opt_discard),
 	fsparam_enum("discard", Opt_discard_mode, btrfs_parameter_discard),
 	fsparam_enum("fatal_errors", Opt_fatal_errors, btrfs_parameter_fatal_errors),
@@ -382,6 +384,9 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
 			return PTR_ERR(device);
 		break;
 	}
+	case Opt_nouuid:
+		btrfs_set_opt(ctx->mount_opt, NOUUID);
+		break;
 	case Opt_datasum:
 		if (result.negated) {
 			btrfs_set_opt(ctx->mount_opt, NODATASUM);
@@ -1113,6 +1118,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",discard");
 	if (btrfs_test_opt(info, DISCARD_ASYNC))
 		seq_puts(seq, ",discard=async");
+	if (btrfs_test_opt(info, NOUUID))
+		seq_puts(seq, ",nouuid");
 	if (!(info->sb->s_flags & SB_POSIXACL))
 		seq_puts(seq, ",noacl");
 	if (btrfs_free_space_cache_v1_active(info))
@@ -1733,7 +1740,7 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	u64 total_free_data = 0;
 	u64 total_free_meta = 0;
 	u32 bits = fs_info->sectorsize_bits;
-	__be32 *fsid = (__be32 *)fs_info->fs_devices->fsid;
+	__be32 *fsid;
 	unsigned factor = 1;
 	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
 	int ret;
@@ -1819,15 +1826,35 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	buf->f_bsize = fs_info->sectorsize;
 	buf->f_namelen = BTRFS_NAME_LEN;
 
-	/* We treat it as constant endianness (it doesn't matter _which_)
-	   because we want the fsid to come out the same whether mounted
-	   on a big-endian or little-endian host */
+	/*
+	 * fs_devices->fsid is dynamically generated when temp_fsid is active
+	 * to support cloned devices. Use the original on-disk fsid instead,
+	 * as it remains consistent across mount cycles.
+	 */
+	fsid = (__be32 *)fs_info->super_copy->fsid;
+	/*
+	 * We treat it as constant endianness (it doesn't matter _which_)
+	 * because we want the fsid to come out the same whether mounted
+	 * on a big-endian or little-endian host.
+	 */
 	buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
 	buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
 	/* Mask in the root object ID too, to disambiguate subvols */
 	buf->f_fsid.val[0] ^= btrfs_root_id(BTRFS_I(d_inode(dentry))->root) >> 32;
 	buf->f_fsid.val[1] ^= btrfs_root_id(BTRFS_I(d_inode(dentry))->root);
 
+	/*
+	 * dev_t provides a way to differentiate mounted cloned devices and
+	 * keeps the statfs f_fsid consistent and unique.
+	 */
+	if (btrfs_test_opt(fs_info, NOUUID) &&
+	    fs_info->fs_devices->total_devices == 1) {
+		__kernel_fsid_t dev_fsid = u64_to_fsid(huge_encode_dev(
+			fs_info->fs_devices->latest_dev->bdev->bd_dev));
+		buf->f_fsid.val[0] ^= dev_fsid.val[1];
+		buf->f_fsid.val[1] ^= dev_fsid.val[0];
+	}
+
 	return 0;
 }
 
-- 
2.43.0
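
For readers following the statfs hunk above, here is a minimal userspace sketch of the derivation it describes. It is an illustration under stated assumptions, not the kernel code: `struct fsid`, `encode_dev()`, `load_be32()` and `btrfs_fsid_sketch()` are hypothetical stand-ins, and `encode_dev()` is assumed to mirror the bit layout of the kernel's huge_encode_dev().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's __kernel_fsid_t. */
struct fsid { uint32_t val[2]; };

/* Assumed to mirror the kernel's huge_encode_dev() bit layout. */
static uint32_t encode_dev(uint32_t major, uint32_t minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Read one big-endian 32-bit word from a byte array. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static struct fsid btrfs_fsid_sketch(const uint8_t uuid[16], uint64_t root_id,
				     uint32_t major, uint32_t minor, int nouuid)
{
	struct fsid f;

	/* Fold the 128-bit on-disk fsid into two 32-bit words. */
	f.val[0] = load_be32(uuid) ^ load_be32(uuid + 8);
	f.val[1] = load_be32(uuid + 4) ^ load_be32(uuid + 12);

	/* Mix in the subvolume root id to disambiguate subvolumes. */
	f.val[0] ^= (uint32_t)(root_id >> 32);
	f.val[1] ^= (uint32_t)root_id;

	if (nouuid) {
		/* The patch XORs the high dev word into val[0], the low into val[1]. */
		uint64_t dev = encode_dev(major, minor);

		f.val[0] ^= (uint32_t)(dev >> 32);
		f.val[1] ^= (uint32_t)dev;
	}
	return f;
}
```

Because the encoded device number fits in 32 bits, it lands entirely in val[1] in this sketch, the same word that receives the low half of the root id.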



* [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-21 11:55 [PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems Anand Jain
  2026-03-21 11:55 ` [PATCH v2 1/3] btrfs: use on-disk uuid for s_uuid in temp_fsid mounts Anand Jain
  2026-03-21 11:55 ` [PATCH v2 2/3] btrfs: derive f_fsid from on-disk fsuuid and dev_t Anand Jain
@ 2026-03-21 11:55 ` Anand Jain
  2026-03-23  4:16   ` Theodore Tso
  2 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2026-03-21 11:55 UTC (permalink / raw)
  To: linux-ext4, linux-btrfs; +Cc: linux-xfs, hch

statfs() currently reports f_fsid derived from the on-disk UUID.
Cloned block devices share the same UUID, so distinct ext4 instances
can return identical f_fsid values. This leads to collisions in
fanotify.

Encode sb->s_dev into f_fsid instead of using the superblock UUID.
This provides a per-device identifier and avoids conflicts when a
filesystem is cloned, matching the behavior of xfs.

Place this change behind the new mount option "-o nouuid" for ABI
compatibility.

Signed-off-by: Anand Jain <asj@kernel.org>
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/super.c | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 293f698b7042..64d98ab64c11 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1282,6 +1282,7 @@ struct ext4_inode_info {
 						    * scanning in mballoc
 						    */
 #define EXT4_MOUNT2_ABORT		0x00000100 /* Abort filesystem */
+#define EXT4_MOUNT2_NOUUID		0x00000200 /* No duplicate f_fsid for cloned filesystem */
 
 #define clear_opt(sb, opt)		EXT4_SB(sb)->s_mount_opt &= \
 						~EXT4_MOUNT_##opt
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 43f680c750ae..65b712af4ad7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1664,7 +1664,7 @@ static const struct export_operations ext4_export_ops = {
 enum {
 	Opt_bsd_df, Opt_minix_df, Opt_grpid, Opt_nogrpid,
 	Opt_resgid, Opt_resuid, Opt_sb,
-	Opt_nouid32, Opt_debug, Opt_removed,
+	Opt_nouid32, Opt_debug, Opt_removed, Opt_nouuid,
 	Opt_user_xattr, Opt_acl,
 	Opt_auto_da_alloc, Opt_noauto_da_alloc, Opt_noload,
 	Opt_commit, Opt_min_batch_time, Opt_max_batch_time, Opt_journal_dev,
@@ -1743,6 +1743,7 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
 	fsparam_u32	("sb",			Opt_sb),
 	fsparam_enum	("errors",		Opt_errors, ext4_param_errors),
 	fsparam_flag	("nouid32",		Opt_nouid32),
+	fsparam_flag	("nouuid",		Opt_nouuid),
 	fsparam_flag	("debug",		Opt_debug),
 	fsparam_flag	("oldalloc",		Opt_removed),
 	fsparam_flag	("orlov",		Opt_removed),
@@ -1898,6 +1899,7 @@ static const struct mount_opts {
 	{Opt_acl, 0, MOPT_NOSUPPORT},
 #endif
 	{Opt_nouid32, EXT4_MOUNT_NO_UID32, MOPT_SET},
+	{Opt_nouuid, EXT4_MOUNT2_NOUUID, MOPT_SET | MOPT_2},
 	{Opt_debug, EXT4_MOUNT_DEBUG, MOPT_SET},
 	{Opt_quota, EXT4_MOUNT_QUOTA | EXT4_MOUNT_USRQUOTA, MOPT_SET | MOPT_Q},
 	{Opt_usrquota, EXT4_MOUNT_QUOTA | EXT4_MOUNT_USRQUOTA,
@@ -2389,6 +2391,9 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param)
 			return -EINVAL;
 		}
 		return 0;
+	case Opt_nouuid:
+		ctx_set_mount_opt2(ctx, EXT4_MOUNT2_NOUUID);
+		return 0;
 	}
 
 	/*
@@ -6941,7 +6946,10 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
 	buf->f_files = le32_to_cpu(es->s_inodes_count);
 	buf->f_ffree = percpu_counter_sum_positive(&sbi->s_freeinodes_counter);
 	buf->f_namelen = EXT4_NAME_LEN;
-	buf->f_fsid = uuid_to_fsid(es->s_uuid);
+	if (test_opt2(sb, NOUUID))
+		buf->f_fsid = u64_to_fsid(huge_encode_dev(sb->s_bdev->bd_dev));
+	else
+		buf->f_fsid = uuid_to_fsid(es->s_uuid);
 
 #ifdef CONFIG_QUOTA
 	if (ext4_test_inode_flag(dentry->d_inode, EXT4_INODE_PROJINHERIT) &&
-- 
2.43.0
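
The two branches of the ext4_statfs() hunk above can be compared in userspace with the sketch below. It assumes that uuid_to_fsid() XORs the two little-endian 64-bit halves of the UUID and that huge_encode_dev() uses the layout shown; `struct fsid` and every function name here are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's __kernel_fsid_t. */
struct fsid { uint32_t val[2]; };

/* Read one little-endian 64-bit word from a byte array. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Low 32 bits go to val[0], high 32 bits to val[1]. */
static struct fsid u64_to_fsid_sketch(uint64_t id)
{
	struct fsid f = { { (uint32_t)id, (uint32_t)(id >> 32) } };

	return f;
}

/* Default branch: fold the 128-bit UUID into 64 bits. */
static struct fsid fsid_from_uuid(const uint8_t uuid[16])
{
	return u64_to_fsid_sketch(load_le64(uuid) ^ load_le64(uuid + 8));
}

/* "nouuid" branch: the fsid depends only on the device number. */
static struct fsid fsid_from_dev(uint32_t major, uint32_t minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return u64_to_fsid_sketch((minor & 0xff) | (major << 8) |
				  ((minor & ~0xffu) << 12));
}
```

Two clones that share a UUID get the same result from `fsid_from_uuid()` but different results from `fsid_from_dev()` when they sit on different devices. Conversely, two unrelated filesystems that take turns on the same device number get the same `fsid_from_dev()` value.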



* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-21 11:55 ` [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions Anand Jain
@ 2026-03-23  4:16   ` Theodore Tso
  2026-03-23 15:29     ` Darrick J. Wong
  2026-03-23 15:41     ` Anand Jain
  0 siblings, 2 replies; 15+ messages in thread
From: Theodore Tso @ 2026-03-23  4:16 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-ext4, linux-btrfs, linux-xfs, hch

On Sat, Mar 21, 2026 at 07:55:19PM +0800, Anand Jain wrote:
> statfs() currently reports f_fsid derived from the on-disk UUID.
> Cloned block devices share the same UUID, so distinct ext4 instances
> can return identical f_fsid values. This leads to collisions in
> fanotify.
> 
> Encode sb->s_dev into f_fsid instead of using the superblock UUID.
> This provides a per-device identifier and avoids conflicts when
> filesystem is cloned, matching the behavior with xfs.

As I observed in [1], this leads to collisions for removable block
devices, which can be used to mount different file systems.

[1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/

> Place this change behind the new mount option "-o nouuid" for ABI
> compatibility.

I *really* hate this mount option.  It's not at all obvious what it
means for a system administrator who hasn't had the context of reading
the e-mail discussion on this subject.

As I stated in [1], I think the f_fsid is a terrible interface that
was promulgated by history, and future usage should be strongly
discouraged, and the wise programmer won't use it because it has
significant compatibility issues.

As such, my personal preference is that we not try to condition it on
a mount option, which in all likelihood almost no one will use, and
instead just change it so that we hash the file system's UUID and
block device number together and use that for ext4's f_fsid.

Thoughts, comments?

						- Ted
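
One way to picture the "hash the UUID and block device number together" idea is the sketch below. The message does not name a hash, so the choice of 64-bit FNV-1a is purely an assumption for illustration, as are the names `fnv1a64()` and `fsid_hash_sketch()` and the choice to feed the encoded device number as four little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Run FNV-1a over a buffer, continuing from state h. */
static uint64_t fnv1a64(uint64_t h, const uint8_t *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;	/* 64-bit FNV prime */
	}
	return h;
}

/* Hypothetical f(uuid, dev_t): hash the UUID bytes, then the device bytes. */
static uint64_t fsid_hash_sketch(const uint8_t *uuid, uint32_t encoded_dev)
{
	uint8_t dev_bytes[4] = {
		(uint8_t)encoded_dev, (uint8_t)(encoded_dev >> 8),
		(uint8_t)(encoded_dev >> 16), (uint8_t)(encoded_dev >> 24),
	};
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */

	assert(uuid != NULL);
	h = fnv1a64(h, uuid, 16);
	return fnv1a64(h, dev_bytes, sizeof(dev_bytes));
}
```

With a mix like this, a cloned image on a second device and a different filesystem on the same device number both produce a value that differs from the original's, although a 64-bit result can never rule out collisions entirely.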


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-23  4:16   ` Theodore Tso
@ 2026-03-23 15:29     ` Darrick J. Wong
  2026-03-23 16:44       ` Darrick J. Wong
  2026-03-25 10:02       ` Andreas Dilger
  2026-03-23 15:41     ` Anand Jain
  1 sibling, 2 replies; 15+ messages in thread
From: Darrick J. Wong @ 2026-03-23 15:29 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Anand Jain, linux-ext4, linux-btrfs, linux-xfs, hch

On Sun, Mar 22, 2026 at 11:16:24PM -0500, Theodore Tso wrote:
> On Sat, Mar 21, 2026 at 07:55:19PM +0800, Anand Jain wrote:
> > statfs() currently reports f_fsid derived from the on-disk UUID.
> > Cloned block devices share the same UUID, so distinct ext4 instances
> > can return identical f_fsid values. This leads to collisions in
> > fanotify.
> > 
> > Encode sb->s_dev into f_fsid instead of using the superblock UUID.
> > This provides a per-device identifier and avoids conflicts when
> > filesystem is cloned, matching the behavior with xfs.
> 
> As I observed in [1] this leads to collisions when for removable block
> devices which can be used to mount different file systems.
> 
> [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/
> 
> > Place this change behind the new mount option "-o nouuid" for ABI
> > compatibility.
> 
> I *really* hate this mount option.  It's not at all obvious what it
> means for a system administrator who hasn't had the context of reading
> the e-mail discussion on this subject.

I don't love 'nouuid' either, because it means something completely
different in XFS.  'fsid_from_dev' or something would at least be
clearer about what it's doing...

> As I stated in [1], I think the f_fsid is a terrible interface that
> was promulgated by history, and future usage should be strongly
> discouraged, and the wise programmer won't use it because it has
> significant compatibility issues.
> 
> As such, my personal preference is that we not try to condition it on
> a mount option, which in all likelihood almost no one will use, and
> instead just change it so that we hash the file system's UUID and
> block device number together and use that for ext4's f_fsid.

...but why not just set fsid to some approximation of the dev_t like
XFS and be done with it?

	st->f_fsid = u64_to_fsid(huge_encode_dev(mp->m_ddev_targp->bt_dev))

There are a few other single-bdev filesystems that do this.

--D

> Thoughts, comments?
> 
> 						- Ted
> 


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-23  4:16   ` Theodore Tso
  2026-03-23 15:29     ` Darrick J. Wong
@ 2026-03-23 15:41     ` Anand Jain
  2026-04-04  8:59       ` Anand Jain
  1 sibling, 1 reply; 15+ messages in thread
From: Anand Jain @ 2026-03-23 15:41 UTC (permalink / raw)
  To: Theodore Tso
  Cc: linux-ext4, linux-btrfs, linux-xfs, Christoph Hellwig, Anand Jain

Thanks for the feedback. I'll try to address the points raised
here and in your earlier email [1].

  [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/

This work originally came out of a Btrfs issue where a cloned
filesystem ended up using a dynamically generated, mount-time
UUID for sb->s_uuid instead of the on-disk UUID. As a result,
OverlayFS (with index enabled) started failing mount-recycle
tests [2] for the cloned filesystem.

 [2]
https://lore.kernel.org/lkml/20251014015707.129013-1-andrealmeid@igalia.com/

While looking into that problem, I also noticed that different
filesystems derive f_fsid in inconsistent ways, and in practice
many of them base it on dev_t.

On the question of the 64-bit limit: although a 64-bit value
is not globally unique in the way a 128-bit UUID is, f_fsid
has historically been derived from dev_t. Since dev_t must be
unique within a running kernel instance, 64 bits are enough to
safely encode its effective ~32-bit dev_t without collisions.
The number of concurrently addressable block devices is also
bounded by the 12-bit major / 20-bit minor limits and
/proc/sys/fs/mount-max. IMO, within a single boot, 64 bits
should provide a collision-free identifier.
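
The size argument above can be checked with a small userspace sketch. It assumes the 12-bit major / 20-bit minor split mentioned in the paragraph and the bit layout of the kernel's new_encode_dev()/new_decode_dev(); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 12-bit major and a 20-bit minor into one 32-bit value. */
static uint32_t encode_dev(uint32_t major, uint32_t minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Recover the major number from an encoded value. */
static uint32_t decode_major(uint32_t dev)
{
	return (dev & 0xfff00) >> 8;
}

/* Recover the minor number from an encoded value. */
static uint32_t decode_minor(uint32_t dev)
{
	return (dev & 0xff) | ((dev >> 12) & 0xfff00);
}
```

Since every (major, minor) pair decodes back to itself, the encoding is injective, and the largest pair (4095, 1048575) maps to 0xffffffff, so the value never needs more than 32 of the 64 f_fsid bits.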

I've also submitted new test cases that validate expectations
around both sb->s_uuid and statfs::f_fsid here [3].

  [3] https://lore.kernel.org/fstests/cover.1774090817.git.asj@kernel.org/

> As I observed in [1] this leads to collisions when for removable block
> devices which can be used to mount different file systems.
>
> [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/

I agree. A straightforward f_fsid = f(dev_t) will collide if a
removable device is swapped but ends up reusing the same dev_t.
In theory, this can be reproduced with XFS, and with my current
patchset on Ext4. That’s clearly a blocker, and I plan to revise.
Btrfs, by the way, handles this test scenario well.

> And even as you've proposed to change
> things, it's not consistent across file systems.  In particular, your
> proposed solution mixes s_uuid into btrfs-patched, but not
> ext4-patched.  Why?

The discrepancy exists because Btrfs must distinguish subvolume
mounts as separate logical entities. For Btrfs, the derivation
requires f(s_uuid, root_id, dev_t) to ensure that two different
subvolumes on the same device report distinct f_fsid values.
For Ext4, a simpler f(s_uuid, dev_t) should suffice to ensure
both cross-device uniqueness and persistence across media swaps.

>> Place this change behind the new mount option "-o nouuid" for ABI
>> compatibility.
> 
> I *really* hate this mount option.  It's not at all obvious what it
> means for a system administrator who hasn't had the context of reading
> the e-mail discussion on this subject.
> 
> As I stated in [1], I think the f_fsid is a terrible interface that
> was promulgated by history, and future usage should be strongly
> discouraged, and the wise programmer won't use it because it has
> significant compatibility issues.
> 
> As such, my personal preference is that we not try to condition it on
> a mount option, which in all likelihood almost no one will use, and
> instead just change it so that we hash the file system's UUID and
> block device number together and use that for ext4's f_fsid.

The decision to gate this behind a mount option followed feedback
from Christoph Hellwig. The concern is binary compatibility:
applications that manually derive an ID based on existing behavior
might break if the kernel changes its derivation logic.

I agree that -o nouuid is a poor name. If we must keep the mount
option for ABI stability, I am open to better nomenclature.

If we agree that f_fsid is already a problematic interface and
should simply be fixed without any special options, for example by
always hashing the filesystem UUID together with the block device
number for Ext4, that would be my preference.

Thanks, Anand


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-23 15:29     ` Darrick J. Wong
@ 2026-03-23 16:44       ` Darrick J. Wong
  2026-03-25 10:02       ` Andreas Dilger
  1 sibling, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2026-03-23 16:44 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Anand Jain, linux-ext4, linux-btrfs, linux-xfs, hch

On Mon, Mar 23, 2026 at 08:29:43AM -0700, Darrick J. Wong wrote:
> On Sun, Mar 22, 2026 at 11:16:24PM -0500, Theodore Tso wrote:
> > On Sat, Mar 21, 2026 at 07:55:19PM +0800, Anand Jain wrote:
> > > statfs() currently reports f_fsid derived from the on-disk UUID.
> > > Cloned block devices share the same UUID, so distinct ext4 instances
> > > can return identical f_fsid values. This leads to collisions in
> > > fanotify.
> > > 
> > > Encode sb->s_dev into f_fsid instead of using the superblock UUID.
> > > This provides a per-device identifier and avoids conflicts when
> > > filesystem is cloned, matching the behavior with xfs.
> > 
> > As I observed in [1] this leads to collisions when for removable block
> > devices which can be used to mount different file systems.
> > 
> > [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/
> > 
> > > Place this change behind the new mount option "-o nouuid" for ABI
> > > compatibility.
> > 
> > I *really* hate this mount option.  It's not at all obvious what it
> > means for a system administrator who hasn't had the context of reading
> > the e-mail discussion on this subject.
> 
> I don't love 'nouuid' either, because it means something completely
> different in XFS.  'fsid_from_dev' or something would at least be
> clearer about what it's doing...
> 
> > As I stated in [1], I think the f_fsid is a terrible interface that
> > was promulgated by history, and future usage should be strongly
> > discouraged, and the wise programmer won't use it because it has
> > significant compatibility issues.
> > 
> > As such, my personal preference is that we not try to condition it on
> > a mount option, which in all likelihood almost no one will use, and
> > instead just change it so that we hash the file system's UUID and
> > block device number together and use that for ext4's f_fsid.
> 
> ...but why not just set fsid to some approximation of the dev_t like
> XFS and be done with it?
> 
> 	st->f_fsid = u64_to_fsid(huge_encode_dev(mp->m_ddev_targp->bt_dev))
> 
> There are a few other single-bdev filesystems that do this.

/me reads the rest of both threads, realizes the answer is "because you
can eject media, mount another media with the same fs, and bizarrely
they produce the same fsid."

I'll point out that for the filesystems that base fsid off of a block
device, perhaps we should mix in the diskseq to prevent that problem?

But then someone said the magic words "IMA and fanotify also use fsid"
and /me backs away from yet another opportunity to argue about
open-coded file handles from legacy Unix.

--D


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-23 15:29     ` Darrick J. Wong
  2026-03-23 16:44       ` Darrick J. Wong
@ 2026-03-25 10:02       ` Andreas Dilger
  2026-03-25 10:59         ` Anand Jain
  1 sibling, 1 reply; 15+ messages in thread
From: Andreas Dilger @ 2026-03-25 10:02 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Theodore Tso, Anand Jain, linux-ext4, linux-btrfs, linux-xfs, hch

On Mar 23, 2026, at 09:29, Darrick J. Wong <djwong@kernel.org> wrote:
> 
> On Sun, Mar 22, 2026 at 11:16:24PM -0500, Theodore Tso wrote:
>> On Sat, Mar 21, 2026 at 07:55:19PM +0800, Anand Jain wrote:
>>> statfs() currently reports f_fsid derived from the on-disk UUID.
>>> Cloned block devices share the same UUID, so distinct ext4 instances
>>> can return identical f_fsid values. This leads to collisions in
>>> fanotify.
>>> 
>>> Encode sb->s_dev into f_fsid instead of using the superblock UUID.
>>> This provides a per-device identifier and avoids conflicts when
>>> filesystem is cloned, matching the behavior with xfs.
>> 
>> As I observed in [1] this leads to collisions when for removable block
>> devices which can be used to mount different file systems.
>> 
>> [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/
>> 
>>> Place this change behind the new mount option "-o nouuid" for ABI
>>> compatibility.
>> 
>> I *really* hate this mount option.  It's not at all obvious what it
>> means for a system administrator who hasn't had the context of reading
>> the e-mail discussion on this subject.
> 
> I don't love 'nouuid' either, because it means something completely
> different in XFS.  'fsid_from_dev' or something would at least be
> clearer about what it's doing...
> 
>> As I stated in [1], I think the f_fsid is a terrible interface that
>> was promulgated by history, and future usage should be strongly
>> discouraged, and the wise programmer won't use it because it has
>> significant compatibility issues.
>> 
>> As such, my personal preference is that we not try to condition it on
>> a mount option, which in all likelihood almost no one will use, and
>> instead just change it so that we hash the file system's UUID and
>> block device number together and use that for ext4's f_fsid.
> 
> ...but why not just set fsid to some approximation of the dev_t like
> XFS and be done with it?
> 
> st->f_fsid = u64_to_fsid(huge_encode_dev(mp->m_ddev_targp->bt_dev))
> 
> There are a few other single-bdev filesystems that do this.

On the flip side, if the fsid of a filesystem changes because a new disk
was installed on a server and the old disk gets a new device number or an
upgraded kernel with different device driver load ordering, that would
also be a big problem, as it would break NFS file handles over a reboot.

The whole point of generating the fsid from the persistent storage is that
it is persistent across reboots.  It seems like the real issue here is
cloning filesystem images and not assigning a new UUID to the cloned image.

Cheers, Andreas


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-25 10:02       ` Andreas Dilger
@ 2026-03-25 10:59         ` Anand Jain
  2026-03-25 12:59           ` Theodore Tso
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2026-03-25 10:59 UTC (permalink / raw)
  To: Andreas Dilger, Darrick J. Wong
  Cc: Theodore Tso, Anand Jain, linux-ext4, linux-btrfs, linux-xfs, hch



On 25/3/26 18:02, Andreas Dilger wrote:
> On Mar 23, 2026, at 09:29, Darrick J. Wong <djwong@kernel.org> wrote:
>>
>> On Sun, Mar 22, 2026 at 11:16:24PM -0500, Theodore Tso wrote:
>>> On Sat, Mar 21, 2026 at 07:55:19PM +0800, Anand Jain wrote:
>>>> statfs() currently reports f_fsid derived from the on-disk UUID.
>>>> Cloned block devices share the same UUID, so distinct ext4 instances
>>>> can return identical f_fsid values. This leads to collisions in
>>>> fanotify.
>>>>
>>>> Encode sb->s_dev into f_fsid instead of using the superblock UUID.
>>>> This provides a per-device identifier and avoids conflicts when
>>>> filesystem is cloned, matching the behavior with xfs.
>>>
>>> As I observed in [1] this leads to collisions when for removable block
>>> devices which can be used to mount different file systems.
>>>
>>> [1] https://lore.kernel.org/all/20260322203151.GA98947@mac.lan/
>>>
>>>> Place this change behind the new mount option "-o nouuid" for ABI
>>>> compatibility.
>>>
>>> I *really* hate this mount option.  It's not at all obvious what it
>>> means for a system administrator who hasn't had the context of reading
>>> the e-mail discussion on this subject.
>>
>> I don't love 'nouuid' either, because it means something completely
>> different in XFS.  'fsid_from_dev' or something would at least be
>> clearer about what it's doing...
>>
>>> As I stated in [1], I think the f_fsid is a terrible interface that
>>> was promulgated by history, and future usage should be strongly
>>> discouraged, and the wise programmer won't use it because it has
>>> significant compatibility issues.
>>>
>>> As such, my personal preference is that we not try to condition it on
>>> a mount option, which in all likelihood almost no one will use, and
>>> instead just change it so that we hash the file system's UUID and
>>> block device number together and use that for ext4's f_fsid.
>>
>> ...but why not just set fsid to some approximation of the dev_t like
>> XFS and be done with it?
>>
>> st->f_fsid = u64_to_fsid(huge_encode_dev(mp->m_ddev_targp->bt_dev))
>>
>> There are a few other single-bdev filesystems that do this.
> 
> On the flip side, if the fsid of a filesystem changes because a new disk
> was installed on a server and the old disk gets a new device number or an
> upgraded kernel with different device driver load ordering, that would
> also be a big problem, as it would break NFS file handles over a reboot.
> 
> The whole point of generating the fsid from the persistent storage is that
> it is persistent across reboots.  It seems like the real issue here is
> cloning filesystem images and not assigning a new UUID to the cloned image.
> 

IMO, sb->s_uuid (as used by overlayfs) represents a filesystem UUID
that is persistent. It is derived from on-disk metadata.

statfs()->f_fsid is a kind of runtime filesystem identifier used to
distinguish mounted filesystems within a running system. It may be
stable across reboots or device removal and reinsertion, but this is
not guaranteed. It may change if the device's dev_t changes.

I have posted a set of five test cases to the mailing list
to help verify these behaviors, for your review. Another
test case to verify device reinsertion with a different
dev_t is a WIP; it will be submitted along with v3.

Thanks.

> Cheers, Andreas



* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-25 10:59         ` Anand Jain
@ 2026-03-25 12:59           ` Theodore Tso
  2026-04-02  7:33             ` Anand Jain
  0 siblings, 1 reply; 15+ messages in thread
From: Theodore Tso @ 2026-03-25 12:59 UTC (permalink / raw)
  To: Anand Jain
  Cc: Andreas Dilger, Darrick J. Wong, Anand Jain, linux-ext4,
	linux-btrfs, linux-xfs, hch

On Wed, Mar 25, 2026 at 06:59:32PM +0800, Anand Jain wrote:
> 
> IMO, sb->s_uuid (as used by overlayfs)
> Represents a filesystem UUID that is persistent.
> It is derived from on-disk metadata.
> 
> statfs()->f_fsid is..
> A kind of runtime filesystem identifier used to distinguish mounted
> filesystems within a running system.
> It may be stable across reboots or device removal and reinsertion,
> but this is not guaranteed. It may change if the device dev_t changes.

I always worry about "it might be stable, but it might not; ¯\_(ツ)_/¯"

The problem with that is that people might start using this
kind-of-guarantee-but-maybe-not in scripts or in programs, and then
when people try to run that script or program on a different system,
or on a different file system, things go *boom*.

So if we want to say that it is stable so long as dev_t and the file
system are the same, that's a well defined semantic.

If it's that it has no guarantees whatsoever; could change across
reboots; could change across remounts, then maybe it should just be a
global mount sequence number that starts with a random number at boot.
So you can use it to distinguish between different mounted file
systems, but that's *all* you can do with the thing.  That would also
be a well defined semantic.
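
To make the second option concrete, here is a minimal sketch of a
sequence counter seeded with a random value. It is an illustration
only: the class name MountIdAllocator and the 64-bit width are
assumptions, and creating the object stands in for "at boot".

```python
import itertools
import secrets

MASK64 = 0xFFFFFFFFFFFFFFFF


class MountIdAllocator:
    """Hands out identifiers that are only meaningful within one boot."""

    def __init__(self):
        # Random 64-bit starting point, chosen once per "boot".
        self._counter = itertools.count(secrets.randbits(64))

    def allocate(self):
        # Wrap to 64 bits so the value fits an fsid-sized field.
        return next(self._counter) & MASK64


if __name__ == "__main__":
    allocator = MountIdAllocator()
    print(hex(allocator.allocate()), hex(allocator.allocate()))
```

Two allocations from the same object always differ, but nothing
relates the values handed out by one object to those of another,
which is exactly the "that's *all* you can do with it" property.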

Cheers,

					- Ted

* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-25 12:59           ` Theodore Tso
@ 2026-04-02  7:33             ` Anand Jain
  0 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2026-04-02  7:33 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Andreas Dilger, Darrick J. Wong, Anand Jain, linux-ext4,
	linux-btrfs, linux-xfs, hch



On 25/3/26 20:59, Theodore Tso wrote:
> On Wed, Mar 25, 2026 at 06:59:32PM +0800, Anand Jain wrote:
>>
>> IMO, sb->s_uuid (as used by overlayfs)
>> Represents a filesystem UUID that is persistent.
>> It is derived from on-disk metadata.
>>
>> statfs()->f_fsid is..
>> A kind of runtime filesystem identifier used to distinguish mounted
>> filesystems within a running system.
>> It may be stable across reboots or device removal and reinsertion,
>> but this is not guaranteed. It may change if the device dev_t changes.
> 
> I always worry about "it might be stable, but it might not; ¯\_(ツ)_/¯"
> 
> The problem with that is that people might start using this
> kind-of-guarantee-but-maybe-not in scripts or in programs, and then
> when people try to run that script or program on a different system,
> or on a different file system, things go *boom*.
> 
> So if we want to say that it is stable so long as dev_t and the file
> system are the same, that's a well-defined semantic.
> 

Yeah, agreed. Avoid misuse. Document that f_fsid is stable as long
as dev_t and the underlying filesystem identity don't change.

> If it's that it has no guarantees whatsoever; could change across
> reboots; could change across remounts, then maybe it should just be a
> global mount sequence number that starts with a random number at boot.
> So you can use it to distinguish between different mounted file
> systems, but that's *all* you can do with the thing.  That would also
> be a well defined semantic.
> 

Per-mount random value (or global mount sequence) is also a
well-defined semantic, but it comes with trade-offs: we lose
consistency across mount cycles and need to carry per-mount
state.

IMHO, it's better to stick with a deterministic id:

  f_fsid = f(dev_t, fsid)

It is predictable, aligned with XFS/btrfs, and avoids additional state.
Bottom line, it fixes the cloned filesystem case without regressing
the existing semantics.
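
A minimal sketch of what such an f() could look like, written in
Python for illustration rather than as the code in the patches. It
assumes the UUID is folded to 64 bits by XORing its two halves and
that dev_t is then XORed in. The names mkdev, fold_uuid and
derive_fsid are hypothetical, and the XOR mixing is only one possible
choice.

```python
import uuid

MASK64 = 0xFFFFFFFFFFFFFFFF
MINORBITS = 20  # Linux's in-kernel dev_t keeps the minor in the low 20 bits


def mkdev(major, minor):
    """Build a dev_t value the way the in-kernel MKDEV() macro does."""
    return (major << MINORBITS) | minor


def fold_uuid(fs_uuid):
    """Fold a 128-bit UUID into 64 bits by XORing its two halves."""
    lo = int.from_bytes(fs_uuid.bytes[:8], "little")
    hi = int.from_bytes(fs_uuid.bytes[8:], "little")
    return lo ^ hi


def derive_fsid(fs_uuid, dev):
    """Hypothetical f(dev_t, fsid): mix the device number into the fold."""
    return (fold_uuid(fs_uuid) ^ dev) & MASK64


if __name__ == "__main__":
    shared = uuid.UUID("12345678-1234-5678-1234-567812345678")
    original = derive_fsid(shared, mkdev(8, 1))   # e.g. sda1
    clone = derive_fsid(shared, mkdev(8, 17))     # e.g. sdb1, same UUID
    print(f"original={original:#018x} clone={clone:#018x}")
```

With the same UUID on two different devices the results differ, and
the result for a given (UUID, dev_t) pair never changes, which is the
"stable as long as dev_t and the filesystem identity don't change"
semantic discussed above.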



* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-03-23 15:41     ` Anand Jain
@ 2026-04-04  8:59       ` Anand Jain
  2026-04-07  5:22         ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2026-04-04  8:59 UTC (permalink / raw)
  To: Theodore Tso, Christoph Hellwig, Darrick J. Wong
  Cc: linux-ext4, linux-btrfs, linux-xfs, Anand Jain

Hi Ted, Christoph, Darrick,

As I prepare v3, I'd appreciate your final thoughts on the mount option
naming and its necessity for ext4.

For the new option, I am considering:

  -o nodup_f_fsid

  -o unique_f_fsid

Context:
Currently, ext4's f_fsid is consistent across reboots but fails to be
unique when dealing with cloned filesystems (sharing the same UUID). Per
statfs(2) [1], the primary requirement is that the (f_fsid, ino) pair
uniquely identifies a file. The man page makes no explicit guarantee
regarding consistency across mount cycles or reboots.

Proposal:
With this fix, f_fsid becomes f(uuid, dev_t). This ensures OS-wide
uniqueness and maintains consistency as long as the underlying dev_t
remains stable.

Dilemma:
While statfs(2) [1] suggests f_fsid is "some random stuff," we know
userspace (NFS, systemd) often treats it as a persistent handle.

Do you prefer one of the names above, or is there a more idiomatic ext4
naming convention I should follow?

Given the ambiguity in the man page, is gating this behind an -o option
necessary, or should we consider making uniqueness the default behavior?

[1]
----------
statfs(2)

<snap>
       Nobody knows what f_fsid is supposed to contain (but see below).

<snap>
   The f_fsid field
       Solaris, Irix, and POSIX have a system call statvfs(2) that
       returns a struct statvfs (defined in <sys/statvfs.h>) containing
       an unsigned long f_fsid. Linux, SunOS, HP-UX, 4.4BSD have a
       system call statfs() that returns a struct statfs (defined in
       <sys/vfs.h>) containing a fsid_t f_fsid, where fsid_t is defined
       as struct { int val[2]; }. The same holds for FreeBSD, except
       that it uses the include file <sys/mount.h>.

       The general idea is that f_fsid contains some random stuff such
       that the pair (f_fsid,ino) uniquely determines a file. Some
       operating systems use (a variation on) the device number, or the
       device number combined with the filesystem type. Several
       operating systems restrict giving out the f_fsid field to the
       superuser only (and zero it for unprivileged users), because this
       field is used in the filehandle of the filesystem when
       NFS-exported, and giving it out is a security concern.

       Under some operating systems, the fsid can be used as the second
       argument to the sysfs(2) system call.
----------

Thanks, Anand

* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-04-04  8:59       ` Anand Jain
@ 2026-04-07  5:22         ` Christoph Hellwig
  2026-04-07 14:47           ` Theodore Tso
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2026-04-07  5:22 UTC (permalink / raw)
  To: Anand Jain
  Cc: Theodore Tso, Christoph Hellwig, Darrick J. Wong, linux-ext4,
	linux-btrfs, linux-xfs, Anand Jain

On Sat, Apr 04, 2026 at 04:59:08PM +0800, Anand Jain wrote:
> Context:
> Currently, ext4's f_fsid is consistent across reboots but fails to be
> unique when dealing with cloned filesystems (sharing the same UUID). Per
> statfs(2) [1], the primary requirement is that the (f_fsid, ino) pair
> uniquely identifies a file. The man page makes no explicit guarantee
> regarding consistency across mount cycles or reboots.
> 
> Proposal:
> With this fix, f_fsid becomes f(uuid, dev_t). This ensures OS-wide
> uniqueness and maintains consistency as long as the underlying dev_t
> remains stable.
> 
> Dilemma:
> While statfs(2) [1] suggests f_fsid is "some random stuff," we know
> userspace (NFS, systemd) often treats it as a persistent handle.
> 
> Do you prefer one of the names above, or is there a more idiomatic ext4
> naming convention I should follow?
> 
> Given the ambiguity in the man page, is gating this behind an -o option
> necessary, or should we consider making uniqueness the default behavior?
> 

My take is that anything that should persist should be an on-disk
feature flag, not a mount option.  But I'm not in charge of ext4.


* Re: [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions
  2026-04-07  5:22         ` Christoph Hellwig
@ 2026-04-07 14:47           ` Theodore Tso
  0 siblings, 0 replies; 15+ messages in thread
From: Theodore Tso @ 2026-04-07 14:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Anand Jain, Darrick J. Wong, linux-ext4, linux-btrfs, linux-xfs,
	Anand Jain

On Mon, Apr 06, 2026 at 10:22:16PM -0700, Christoph Hellwig wrote:
> > Dilemma:
> > While statfs(2) [1] suggests f_fsid is "some random stuff," we know
> > userspace (NFS, systemd) often treats it as a persistent handle.
> > 
> > Do you prefer one of the names above, or is there a more idiomatic ext4
> > naming convention I should follow?
> > 
>
> My take is that anything that should persist should be an on-disk
> feature flag, not a mount option.  But I'm not in charge of ext4.

My take is that f_fsid is random stuff, as documented by the
specification, so anyone who tries to depend on it needs to be kept in
a padded room where they can't hurt themselves or their users.

And as far as NFS is concerned, file handles should be based on
the super block UUID, not statfs's f_fsid, and anyone who wants to
mount a snapshot as an NFS exported file system at the same time that
the original file system is mounted _also_ should be gently coaxed
into a padded room where they can't hurt themselves or their users.

The solution that we've used for *years* for people who are cloning
block devices for things like cloud images has been to use
"tune2fs -U random /dev/sda1".  This works on a mounted file system,
and is (for example) built into various cloud images for Google Cloud
Engine.

If we want to change statfs's f_fsid, from one set of "Random stuff"
to another set of "Random stuff", I don't really mind, but I don't
think it's worth *either* a mount option, *or* a feature flag, as
either would be confusing for system administrators when some file
systems behave one way, and other file systems behave another.

	       	   	    	  - Ted

Thread overview: 15+ messages
2026-03-21 11:55 [PATCH v2 0/3] fix s_uuid and f_fsid consistency for cloned filesystems Anand Jain
2026-03-21 11:55 ` [PATCH v2 1/3] btrfs: use on-disk uuid for s_uuid in temp_fsid mounts Anand Jain
2026-03-21 11:55 ` [PATCH v2 2/3] btrfs: derive f_fsid from on-disk fsuuid and dev_t Anand Jain
2026-03-21 11:55 ` [PATCH v2 3/3] ext4: derive f_fsid from block device to avoid collisions Anand Jain
2026-03-23  4:16   ` Theodore Tso
2026-03-23 15:29     ` Darrick J. Wong
2026-03-23 16:44       ` Darrick J. Wong
2026-03-25 10:02       ` Andreas Dilger
2026-03-25 10:59         ` Anand Jain
2026-03-25 12:59           ` Theodore Tso
2026-04-02  7:33             ` Anand Jain
2026-03-23 15:41     ` Anand Jain
2026-04-04  8:59       ` Anand Jain
2026-04-07  5:22         ` Christoph Hellwig
2026-04-07 14:47           ` Theodore Tso
