public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/2] prototype: improve timestamp handling
@ 2025-04-16 14:43 Luca Di Maio
  2025-04-16 14:43 ` [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks Luca Di Maio
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Luca Di Maio @ 2025-04-16 14:43 UTC (permalink / raw)
  To: linux-xfs; +Cc: Luca Di Maio, dimitri.ledkov, smoser

Hi all,

This is an initial prototype to improve XFS's prototype file
functionality in scenarios where FS reproducibility is important.

Currently, when populating a filesystem with a prototype file, all generated inodes
receive timestamps set to the creation time rather than preserving timestamps from
their source files.

This patchset extends the protofile handling to preserve original timestamps (atime,
mtime, ctime) across all inode types. The implementation is split into two parts:

- First patch extends xfs_protofile.in to track origin path references for directories,
character devices and symlinks, similar to what's already implemented for regular files.

- Second patch leverages these references to read timestamp metadata from source files
and populate it into the newly created inodes during filesystem creation.

At the moment, the new `xfs_protofile` generates a file that results
invalid for older `mkfs.xfs` implementations. Also this new implementation
is not compatible with older prototype files.

I can imagine that new protofiles not working with older `mkfs.xfs`
might not be a problem, but what about backward compatibility?
I didn't find references on prototype file compatibility, is a change
like this unwanted?

If so, what do you think of a versioned support for prototype files?
I was thinking something on the lines of:

- xfs_protofile
  - if the new flag:
    - set the first comment accordingly
    - add the additional information
  - else act as old one

- proto.c
  - check if the doc starts with the comment `:origin-files enabled`
	(for example)
  - if so, this is the new format
  - else old format

Eager to know your thoughts and ideas
Thanks
L.

Luca Di Maio (2):
  xfs_proto: add origin also for directories, chardevs and symlinks
  proto: read origin also for directories, chardevs and symlinks. copy
    timestamps from origin.

 mkfs/proto.c          | 49 +++++++++++++++++++++++++++++++++++++++++++
 mkfs/xfs_protofile.in | 12 +++++------
 2 files changed, 55 insertions(+), 6 deletions(-)

--
2.49.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks
  2025-04-16 14:43 [PATCH RFC 0/2] prototype: improve timestamp handling Luca Di Maio
@ 2025-04-16 14:43 ` Luca Di Maio
  2025-04-16 16:07   ` Darrick J. Wong
  2025-04-16 14:43 ` [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin Luca Di Maio
  2025-04-22  3:10 ` [PATCH RFC 0/2] prototype: improve timestamp handling Darrick J. Wong
  2 siblings, 1 reply; 9+ messages in thread
From: Luca Di Maio @ 2025-04-16 14:43 UTC (permalink / raw)
  To: linux-xfs; +Cc: Luca Di Maio, dimitri.ledkov, smoser

In order to preserve timestamps when populating target filesystem, we
need to have a reference to the original file.

This is already done with regular files, we extend this to dirs,
symlinks and chardevices.

Excerpt of old protofile:

```
/
0 0
d--755 0 0
: Descending path rootfs
 bin   l--777 0 0 usr/bin
 lib64 l--777 0 0 lib
 sbin  l--777 0 0 usr/bin
 dev d--755 0 0
  console c--620 0 0 5 1
  null    c--666 0 0 1 3
  random  c--666 0 0 1 8
  urandom c--666 0 0 1 9
  zero    c--666 0 0 1 5
 $
 lib d--755 0 0
  ld-linux-x86-64.so.2   ---755 0 0 rootfs/lib/ld-linux-x86-64.so.2
```

Excerpt of new protofile:

```
/
0 0
d--755 65534 65534 rootfs
: Descending path rootfs
 bin   l--777 65534 65534 usr/bin rootfs/bin
 lib64 l--777 65534 65534 lib rootfs/lib64
 sbin  l--777 65534 65534 usr/bin rootfs/sbin
 $
 dev d--755 65534 65534 rootfs/dev
  console c--620 65534 65534 5 1 rootfs/dev/console
  null    c--666 65534 65534 1 3 rootfs/dev/null
  random  c--666 65534 65534 1 8 rootfs/dev/random
  urandom c--666 65534 65534 1 9 rootfs/dev/urandom
  zero    c--666 65534 65534 1 5 rootfs/dev/zero
 $
 lib d--755 0 0 rootfs/lib
  ld-linux-x86-64.so.2   ---755 0 0 rootfs/lib/ld-linux-x86-64.so.2
```

Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
---
 mkfs/xfs_protofile.in | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mkfs/xfs_protofile.in b/mkfs/xfs_protofile.in
index e83c39f..066265b 100644
--- a/mkfs/xfs_protofile.in
+++ b/mkfs/xfs_protofile.in
@@ -51,12 +51,12 @@ def stat_to_str(statbuf):
 def stat_to_extra(statbuf, fullpath):
 	'''Compute the extras column for a protofile.'''

-	if stat.S_ISREG(statbuf.st_mode):
+	if stat.S_ISREG(statbuf.st_mode) or stat.S_ISDIR(statbuf.st_mode):
 		return ' %s' % fullpath
 	elif stat.S_ISCHR(statbuf.st_mode) or stat.S_ISBLK(statbuf.st_mode):
-		return ' %d %d' % (os.major(statbuf.st_rdev), os.minor(statbuf.st_rdev))
+		return ' %d %d %s' % (os.major(statbuf.st_rdev), os.minor(statbuf.st_rdev), fullpath)
 	elif stat.S_ISLNK(statbuf.st_mode):
-		return ' %s' % os.readlink(fullpath)
+		return ' %s %s' % (os.readlink(fullpath), fullpath)
 	return ''

 def max_fname_len(s1):
@@ -105,8 +105,8 @@ def walk_tree(path, depth):
 		fullpath = os.path.join(path, fname)
 		sb = os.lstat(fullpath)
 		extra = stat_to_extra(sb, fullpath)
-		print('%*s%s %s' % (depth, ' ', fname, \
-				stat_to_str(sb)))
+		print('%*s%s %s%s' % (depth, ' ', fname, \
+				stat_to_str(sb), extra))
 		walk_tree(fullpath, depth + 1)

 	if depth > 1:
@@ -134,7 +134,7 @@ def main():
 		statbuf = os.stat(args.paths[0])
 		if not stat.S_ISDIR(statbuf.st_mode):
 			raise NotADirectoryError(path)
-		print(stat_to_str(statbuf))
+		print(stat_to_str(statbuf), args.paths[0])

 		# All files under each path go in the root dir, recursively
 		for path in args.paths:
--
2.49.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin.
  2025-04-16 14:43 [PATCH RFC 0/2] prototype: improve timestamp handling Luca Di Maio
  2025-04-16 14:43 ` [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks Luca Di Maio
@ 2025-04-16 14:43 ` Luca Di Maio
  2025-04-16 16:07   ` Darrick J. Wong
  2025-04-22  3:10 ` [PATCH RFC 0/2] prototype: improve timestamp handling Darrick J. Wong
  2 siblings, 1 reply; 9+ messages in thread
From: Luca Di Maio @ 2025-04-16 14:43 UTC (permalink / raw)
  To: linux-xfs; +Cc: Luca Di Maio, dimitri.ledkov, smoser

Right now, when populating a filesystem with the prototype file,
generated inodes will have timestamps set at the creation time.

This change enables more accurate filesystem initialization by preserving
original file timestamps during inode creation rather than defaulting to
the current time.

This patch leverages the xfs_protofile changes in order to carry the
reference to the original files for files other than regular ones.

Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
---
 mkfs/proto.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/mkfs/proto.c b/mkfs/proto.c
index 6dd3a20..ed76155 100644
--- a/mkfs/proto.c
+++ b/mkfs/proto.c
@@ -352,6 +352,15 @@ writefile(

 	libxfs_trans_ijoin(tp, ip, 0);
 	ip->i_disk_size = statbuf.st_size;
+
+	/* Copy timestamps from source file to destination inode */
+	VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
+	VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
+	VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
+	VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
+	VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
+	VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
+
 	libxfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 	error = -libxfs_trans_commit(tp);
 	if (error)
@@ -689,6 +698,7 @@ parseproto(
 	char		*fname = NULL;
 	struct xfs_name	xname;
 	struct xfs_parent_args *ppargs = NULL;
+	struct stat		statbuf;

 	memset(&creds, 0, sizeof(creds));
 	mstr = getstr(pp);
@@ -823,10 +833,23 @@ parseproto(
 		ppargs = newpptr(mp);
 		majdev = getnum(getstr(pp), 0, 0, false);
 		mindev = getnum(getstr(pp), 0, 0, false);
+		fd = newregfile(pp, &fname);
 		error = creatproto(&tp, pip, mode | S_IFCHR,
 				IRIX_MKDEV(majdev, mindev), &creds, fsxp, &ip);
 		if (error)
 			fail(_("Inode allocation failed"), error);
+
+		/* Copy timestamps from source file to destination inode */
+		error = fstat(fd, &statbuf);
+		if (error < 0)
+			fail(_("unable to stat file to copyin"), errno);
+		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
+		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
+		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
+		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
+		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
+		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
+
 		libxfs_trans_ijoin(tp, pip, 0);
 		xname.type = XFS_DIR3_FT_CHRDEV;
 		newdirent(mp, tp, pip, &xname, ip, ppargs);
@@ -846,6 +869,7 @@ parseproto(
 		break;
 	case IF_SYMLINK:
 		buf = getstr(pp);
+		char* orig = getstr(pp);
 		len = (int)strlen(buf);
 		tp = getres(mp, XFS_B_TO_FSB(mp, len));
 		ppargs = newpptr(mp);
@@ -854,11 +878,24 @@ parseproto(
 		if (error)
 			fail(_("Inode allocation failed"), error);
 		writesymlink(tp, ip, buf, len);
+
+		/* Copy timestamps from source file to destination inode */
+		error = lstat(orig, &statbuf);
+		if (error < 0)
+			fail(_("unable to stat file to copyin"), errno);
+		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
+		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
+		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
+		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
+		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
+		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
+
 		libxfs_trans_ijoin(tp, pip, 0);
 		xname.type = XFS_DIR3_FT_SYMLINK;
 		newdirent(mp, tp, pip, &xname, ip, ppargs);
 		break;
 	case IF_DIRECTORY:
+		fd = newregfile(pp, &fname);
 		tp = getres(mp, 0);
 		error = creatproto(&tp, pip, mode | S_IFDIR, 0, &creds, fsxp,
 				&ip);
@@ -878,6 +915,18 @@ parseproto(
 			libxfs_trans_log_inode(tp, pip, XFS_ILOG_CORE);
 		}
 		newdirectory(mp, tp, ip, pip);
+
+		/* Copy timestamps from source file to destination inode */
+		error = stat(fname, &statbuf);
+		if (error < 0)
+			fail(_("unable to stat file to copyin"), errno);
+		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
+		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
+		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
+		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
+		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
+		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
+
 		libxfs_trans_log_inode(tp, ip, flags);
 		error = -libxfs_trans_commit(tp);
 		if (error)
--
2.49.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin.
  2025-04-16 14:43 ` [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin Luca Di Maio
@ 2025-04-16 16:07   ` Darrick J. Wong
  2025-04-17 14:14     ` Luca Di Maio
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2025-04-16 16:07 UTC (permalink / raw)
  To: Luca Di Maio; +Cc: linux-xfs, dimitri.ledkov, smoser

On Wed, Apr 16, 2025 at 04:43:33PM +0200, Luca Di Maio wrote:
> Right now, when populating a filesystem with the prototype file,
> generated inodes will have timestamps set at the creation time.
> 
> This change enables more accurate filesystem initialization by preserving
> original file timestamps during inode creation rather than defaulting to
> the current time.
> 
> This patch leverages the xfs_protofile changes in order to carry the
> reference to the original files for files other than regular ones.
> 
> Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
> ---
>  mkfs/proto.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/mkfs/proto.c b/mkfs/proto.c
> index 6dd3a20..ed76155 100644
> --- a/mkfs/proto.c
> +++ b/mkfs/proto.c
> @@ -352,6 +352,15 @@ writefile(
> 
>  	libxfs_trans_ijoin(tp, ip, 0);
>  	ip->i_disk_size = statbuf.st_size;
> +
> +	/* Copy timestamps from source file to destination inode */
> +	VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
> +	VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
> +	VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
> +	VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
> +	VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
> +	VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;

	inode_set_[acm]time()?

I don't have a particular problem with copying in timestamps for regular
files...

> +
>  	libxfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  	error = -libxfs_trans_commit(tp);
>  	if (error)
> @@ -689,6 +698,7 @@ parseproto(
>  	char		*fname = NULL;
>  	struct xfs_name	xname;
>  	struct xfs_parent_args *ppargs = NULL;
> +	struct stat		statbuf;
> 
>  	memset(&creds, 0, sizeof(creds));
>  	mstr = getstr(pp);
> @@ -823,10 +833,23 @@ parseproto(
>  		ppargs = newpptr(mp);
>  		majdev = getnum(getstr(pp), 0, 0, false);
>  		mindev = getnum(getstr(pp), 0, 0, false);
> +		fd = newregfile(pp, &fname);

...however, newregfile calls getstr, which advances *pp.  This breaks
parsing of all existing protofiles.  Take a file like this:

msr c--600 0 0 202 3
ptmx c--000 0 0 5 2

At the point where we assign mindev, the parser is sitting at the end of
the "msr" line.  The new newregfile() call reads the next word from the
file ("ptmx") and tries to open it to copy timestamps, but that's not
the correct thing to do.

Similarly, a new protofile from patch 1:

msr c--600 0 0 202 3 /dev/cpu/0/msr
ptmx c--000 0 0 5 2 /dev/pts/ptmx

Will break parsing on an old version of mkfs because the getstr at the
top of parseproto() will read the "/dev/cpu/0/msr" and think that's the
name of a new file.

In effect, you're revving the protofile format in an incompatible way.
If you really want this, then the new parsing logic should be gated on
some sort of version number specification, either through CLI options or
by overloading the "boot image name" on the first line or the two
numbers on the second line.

Though if you're going to all that trouble, why not just amend the CLI
to take a -p directory=$PATH and walk $PATH with nftw like mke2fs does?

The protofile format that mkfs.xfs uses has been static for 52 years,
I don't know that expending a large amount of effort on it is worth the
time.  If you really must have reproducible filesystem images, would it
be easier to allow overriding current_time() via environment vars?

(I also don't really get why anyone cares about bit-reproducible
filesystem images; the only behavioral guarantees are the userspace
interface contract.  Filesystem drivers have wide latitude to do
whatever they want under the covers.)

((Also not sure why you left out block device special files?))

--D

>  		error = creatproto(&tp, pip, mode | S_IFCHR,
>  				IRIX_MKDEV(majdev, mindev), &creds, fsxp, &ip);
>  		if (error)
>  			fail(_("Inode allocation failed"), error);
> +
> +		/* Copy timestamps from source file to destination inode */
> +		error = fstat(fd, &statbuf);
> +		if (error < 0)
> +			fail(_("unable to stat file to copyin"), errno);
> +		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
> +		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
> +		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
> +		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
> +		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
> +		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
> +
>  		libxfs_trans_ijoin(tp, pip, 0);
>  		xname.type = XFS_DIR3_FT_CHRDEV;
>  		newdirent(mp, tp, pip, &xname, ip, ppargs);
> @@ -846,6 +869,7 @@ parseproto(
>  		break;
>  	case IF_SYMLINK:
>  		buf = getstr(pp);
> +		char* orig = getstr(pp);
>  		len = (int)strlen(buf);
>  		tp = getres(mp, XFS_B_TO_FSB(mp, len));
>  		ppargs = newpptr(mp);
> @@ -854,11 +878,24 @@ parseproto(
>  		if (error)
>  			fail(_("Inode allocation failed"), error);
>  		writesymlink(tp, ip, buf, len);
> +
> +		/* Copy timestamps from source file to destination inode */
> +		error = lstat(orig, &statbuf);
> +		if (error < 0)
> +			fail(_("unable to stat file to copyin"), errno);
> +		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
> +		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
> +		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
> +		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
> +		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
> +		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
> +
>  		libxfs_trans_ijoin(tp, pip, 0);
>  		xname.type = XFS_DIR3_FT_SYMLINK;
>  		newdirent(mp, tp, pip, &xname, ip, ppargs);
>  		break;
>  	case IF_DIRECTORY:
> +		fd = newregfile(pp, &fname);
>  		tp = getres(mp, 0);
>  		error = creatproto(&tp, pip, mode | S_IFDIR, 0, &creds, fsxp,
>  				&ip);
> @@ -878,6 +915,18 @@ parseproto(
>  			libxfs_trans_log_inode(tp, pip, XFS_ILOG_CORE);
>  		}
>  		newdirectory(mp, tp, ip, pip);
> +
> +		/* Copy timestamps from source file to destination inode */
> +		error = stat(fname, &statbuf);
> +		if (error < 0)
> +			fail(_("unable to stat file to copyin"), errno);
> +		VFS_I(ip)->__i_atime.tv_sec = statbuf.st_atime;
> +		VFS_I(ip)->__i_mtime.tv_sec = statbuf.st_mtime;
> +		VFS_I(ip)->__i_ctime.tv_sec = statbuf.st_ctime;
> +		VFS_I(ip)->__i_atime.tv_nsec = statbuf.st_atim.tv_nsec;
> +		VFS_I(ip)->__i_mtime.tv_nsec = statbuf.st_mtim.tv_nsec;
> +		VFS_I(ip)->__i_ctime.tv_nsec = statbuf.st_ctim.tv_nsec;
> +
>  		libxfs_trans_log_inode(tp, ip, flags);
>  		error = -libxfs_trans_commit(tp);
>  		if (error)
> --
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks
  2025-04-16 14:43 ` [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks Luca Di Maio
@ 2025-04-16 16:07   ` Darrick J. Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Darrick J. Wong @ 2025-04-16 16:07 UTC (permalink / raw)
  To: Luca Di Maio; +Cc: linux-xfs, dimitri.ledkov, smoser

On Wed, Apr 16, 2025 at 04:43:32PM +0200, Luca Di Maio wrote:
> In order to preserve timestamps when populating target filesystem, we
> need to have a reference to the original file.
> 
> This is already done with regular files, we extend this to dirs,
> symlinks and chardevices.
> 
> Excerpt of old protofile:
> 
> ```
> /
> 0 0
> d--755 0 0
> : Descending path rootfs
>  bin   l--777 0 0 usr/bin
>  lib64 l--777 0 0 lib
>  sbin  l--777 0 0 usr/bin
>  dev d--755 0 0
>   console c--620 0 0 5 1
>   null    c--666 0 0 1 3
>   random  c--666 0 0 1 8
>   urandom c--666 0 0 1 9
>   zero    c--666 0 0 1 5
>  $
>  lib d--755 0 0
>   ld-linux-x86-64.so.2   ---755 0 0 rootfs/lib/ld-linux-x86-64.so.2
> ```
> 
> Excerpt of new protofile:
> 
> ```
> /
> 0 0
> d--755 65534 65534 rootfs
> : Descending path rootfs
>  bin   l--777 65534 65534 usr/bin rootfs/bin
>  lib64 l--777 65534 65534 lib rootfs/lib64
>  sbin  l--777 65534 65534 usr/bin rootfs/sbin
>  $
>  dev d--755 65534 65534 rootfs/dev
>   console c--620 65534 65534 5 1 rootfs/dev/console
>   null    c--666 65534 65534 1 3 rootfs/dev/null
>   random  c--666 65534 65534 1 8 rootfs/dev/random
>   urandom c--666 65534 65534 1 9 rootfs/dev/urandom
>   zero    c--666 65534 65534 1 5 rootfs/dev/zero

The trouble is, this new field     ^^^^^^^^^^^^^^^ will break parsers,
which I'll talk about in the next email.

--D

>  $
>  lib d--755 0 0 rootfs/lib
>   ld-linux-x86-64.so.2   ---755 0 0 rootfs/lib/ld-linux-x86-64.so.2
> ```
> 
> Signed-off-by: Luca Di Maio <luca.dimaio1@gmail.com>
> ---
>  mkfs/xfs_protofile.in | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mkfs/xfs_protofile.in b/mkfs/xfs_protofile.in
> index e83c39f..066265b 100644
> --- a/mkfs/xfs_protofile.in
> +++ b/mkfs/xfs_protofile.in
> @@ -51,12 +51,12 @@ def stat_to_str(statbuf):
>  def stat_to_extra(statbuf, fullpath):
>  	'''Compute the extras column for a protofile.'''
> 
> -	if stat.S_ISREG(statbuf.st_mode):
> +	if stat.S_ISREG(statbuf.st_mode) or stat.S_ISDIR(statbuf.st_mode):
>  		return ' %s' % fullpath
>  	elif stat.S_ISCHR(statbuf.st_mode) or stat.S_ISBLK(statbuf.st_mode):
> -		return ' %d %d' % (os.major(statbuf.st_rdev), os.minor(statbuf.st_rdev))
> +		return ' %d %d %s' % (os.major(statbuf.st_rdev), os.minor(statbuf.st_rdev), fullpath)
>  	elif stat.S_ISLNK(statbuf.st_mode):
> -		return ' %s' % os.readlink(fullpath)
> +		return ' %s %s' % (os.readlink(fullpath), fullpath)
>  	return ''
> 
>  def max_fname_len(s1):
> @@ -105,8 +105,8 @@ def walk_tree(path, depth):
>  		fullpath = os.path.join(path, fname)
>  		sb = os.lstat(fullpath)
>  		extra = stat_to_extra(sb, fullpath)
> -		print('%*s%s %s' % (depth, ' ', fname, \
> -				stat_to_str(sb)))
> +		print('%*s%s %s%s' % (depth, ' ', fname, \
> +				stat_to_str(sb), extra))
>  		walk_tree(fullpath, depth + 1)
> 
>  	if depth > 1:
> @@ -134,7 +134,7 @@ def main():
>  		statbuf = os.stat(args.paths[0])
>  		if not stat.S_ISDIR(statbuf.st_mode):
>  			raise NotADirectoryError(path)
> -		print(stat_to_str(statbuf))
> +		print(stat_to_str(statbuf), args.paths[0])
> 
>  		# All files under each path go in the root dir, recursively
>  		for path in args.paths:
> --
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin.
  2025-04-16 16:07   ` Darrick J. Wong
@ 2025-04-17 14:14     ` Luca Di Maio
  0 siblings, 0 replies; 9+ messages in thread
From: Luca Di Maio @ 2025-04-17 14:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, dimitri.ledkov, smoser

Thanks Darrick for the review

> On 4/16/25 18:07, Darrick J. Wong wrote: The protofile format that
> mkfs.xfs uses has been static for 52 years, I don't know that
> expending a large amount of effort on it is worth the time.  If you
> really must have reproducible filesystem images, would it be easier
> to allow overriding current_time() via environment vars?

It's nice to preserve the source timestamps, for example, in case we
have also reproducible builds for packages and rootfs (in tar.gz format)
Fixing it to a set time would lose that information in the "translation"

> On 4/16/25 18:07, Darrick J. Wong wrote: (I also don't really get
> why anyone cares about bit-reproducible filesystem images; the only
> behavioral guarantees are the userspace interface contract.
> Filesystem drivers have wide latitude to do whatever they want under
> the covers.)

For distribution of said disks and avoiding re-doing work (eg: if the
checksum matches, we know there were no changes, thus we can skip the 
publication CI)

I've reviewed the patch, in order to not change the prototype file
specification whatsoever.
The new patch will leverage a comment that `xfs_protofile` inserts since
its first iteration:

": Descending path [foo]"

This way, if the comment is not found, like for older files, the new
behavior will simply be ignored.

This should address the compatibility concerns.
I've tested this incoming patch with a matrix of 
"old-protofiles/new-mkfs/new-protofile/old-mkfs" and it seems there are 
no compatibility issues.

Thanks
L.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/2] prototype: improve timestamp handling
  2025-04-16 14:43 [PATCH RFC 0/2] prototype: improve timestamp handling Luca Di Maio
  2025-04-16 14:43 ` [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks Luca Di Maio
  2025-04-16 14:43 ` [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin Luca Di Maio
@ 2025-04-22  3:10 ` Darrick J. Wong
  2025-04-22  6:16   ` Luca Di Maio
  2 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2025-04-22  3:10 UTC (permalink / raw)
  To: Luca Di Maio; +Cc: linux-xfs, dimitri.ledkov, smoser

Crumbs, apparently I forgot ever to send this message. :(

On Wed, Apr 16, 2025 at 04:43:31PM +0200, Luca Di Maio wrote:
> Hi all,
> 
> This is an initial prototype to improve XFS's prototype file
> functionality in scenarios where FS reproducibility is important.
> 
> Currently, when populating a filesystem with a prototype file, all generated inodes
> receive timestamps set to the creation time rather than preserving timestamps from
> their source files.
> 
> This patchset extends the protofile handling to preserve original timestamps (atime,
> mtime, ctime) across all inode types. The implementation is split into two parts:
> 
> - First patch extends xfs_protofile.in to track origin path references for directories,
> character devices and symlinks, similar to what's already implemented for regular files.
> 
> - Second patch leverages these references to read timestamp metadata from source files
> and populate it into the newly created inodes during filesystem creation.
> 
> At the moment, the new `xfs_protofile` generates a file that results
> invalid for older `mkfs.xfs` implementations. Also this new implementation
> is not compatible with older prototype files.
> 
> I can imagine that new protofiles not working with older `mkfs.xfs`
> might not be a problem, but what about backward compatibility?
> I didn't find references on prototype file compatibility, is a change
> like this unwanted?

I think it'd be more ergonomic for mkfs users to introduce an alternate
implementation that uses nftw() to copy whole directory trees (like
mke2fs -d does) instead of revising a 52-year old file format to support
copying attrs of non-regular files.  Then we can move people to a
mechanism that doesn't require cli options for supporting spaces in
filenames and whatnot.

--D

> If so, what do you think of a versioned support for prototype files?
> I was thinking something on the lines of:
> 
> - xfs_protofile
>   - if the new flag:
>     - set the first comment accordingly
>     - add the additional information
>   - else act as old one
> 
> - proto.c
>   - check if the doc starts with the comment `:origin-files enabled`
> 	(for example)
>   - if so, this is the new format
>   - else old format
> 
> Eager to know your thoughts and ideas
> Thanks
> L.
> 
> Luca Di Maio (2):
>   xfs_proto: add origin also for directories, chardevs and symlinks
>   proto: read origin also for directories, chardevs and symlinks. copy
>     timestamps from origin.
> 
>  mkfs/proto.c          | 49 +++++++++++++++++++++++++++++++++++++++++++
>  mkfs/xfs_protofile.in | 12 +++++------
>  2 files changed, 55 insertions(+), 6 deletions(-)
> 
> --
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/2] prototype: improve timestamp handling
  2025-04-22  3:10 ` [PATCH RFC 0/2] prototype: improve timestamp handling Darrick J. Wong
@ 2025-04-22  6:16   ` Luca Di Maio
  2025-04-23 14:46     ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Luca Di Maio @ 2025-04-22  6:16 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, dimitri.ledkov, smoser, luca.dimaio1

Thanks Darrick for the feedback,
I've also sent a v3 patch for this that still uses the prototype file, without changing the file specification at all.
Let me know what you think of that.

I'm also a bit more aligned on the "walk and copy" functionality, I'll try to implement that too. In the meantime if the prototype file implementation works, it could also be an improvement what do you think?

Thanks for your review
L.

On April 22, 2025 5:10:19 AM GMT+02:00, "Darrick J. Wong" <djwong@kernel.org> wrote:
>Crumbs, apparently I forgot ever to send this message. :(
>
>On Wed, Apr 16, 2025 at 04:43:31PM +0200, Luca Di Maio wrote:
>> Hi all,
>> 
>> This is an initial prototype to improve XFS's prototype file
>> functionality in scenarios where FS reproducibility is important.
>> 
>> Currently, when populating a filesystem with a prototype file, all generated inodes
>> receive timestamps set to the creation time rather than preserving timestamps from
>> their source files.
>> 
>> This patchset extends the protofile handling to preserve original timestamps (atime,
>> mtime, ctime) across all inode types. The implementation is split into two parts:
>> 
>> - First patch extends xfs_protofile.in to track origin path references for directories,
>> character devices and symlinks, similar to what's already implemented for regular files.
>> 
>> - Second patch leverages these references to read timestamp metadata from source files
>> and populate it into the newly created inodes during filesystem creation.
>> 
>> At the moment, the new `xfs_protofile` generates a file that results
>> invalid for older `mkfs.xfs` implementations. Also this new implementation
>> is not compatible with older prototype files.
>> 
>> I can imagine that new protofiles not working with older `mkfs.xfs`
>> might not be a problem, but what about backward compatibility?
>> I didn't find references on prototype file compatibility, is a change
>> like this unwanted?
>
>I think it'd be more ergonomic for mkfs users to introduce an alternate
>implementation that uses nftw() to copy whole directory trees (like
>mke2fs -d does) instead of revising a 52-year old file format to support
>copying attrs of non-regular files.  Then we can move people to a
>mechanism that doesn't require cli options for supporting spaces in
>filenames and whatnot.
>
>--D
>
>> If so, what do you think of a versioned support for prototype files?
>> I was thinking something on the lines of:
>> 
>> - xfs_protofile
>>   - if the new flag:
>>     - set the first comment accordingly
>>     - add the additional information
>>   - else act as old one
>> 
>> - proto.c
>>   - check if the doc starts with the comment `:origin-files enabled`
>> 	(for example)
>>   - if so, this is the new format
>>   - else old format
>> 
>> Eager to know your thoughts and ideas
>> Thanks
>> L.
>> 
>> Luca Di Maio (2):
>>   xfs_proto: add origin also for directories, chardevs and symlinks
>>   proto: read origin also for directories, chardevs and symlinks. copy
>>     timestamps from origin.
>> 
>>  mkfs/proto.c          | 49 +++++++++++++++++++++++++++++++++++++++++++
>>  mkfs/xfs_protofile.in | 12 +++++------
>>  2 files changed, 55 insertions(+), 6 deletions(-)
>> 
>> --
>> 2.49.0
>> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH RFC 0/2] prototype: improve timestamp handling
  2025-04-22  6:16   ` Luca Di Maio
@ 2025-04-23 14:46     ` Darrick J. Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Darrick J. Wong @ 2025-04-23 14:46 UTC (permalink / raw)
  To: Luca Di Maio; +Cc: linux-xfs, dimitri.ledkov, smoser

On Tue, Apr 22, 2025 at 08:16:17AM +0200, Luca Di Maio wrote:
> Thanks Darrick for the feedback,
> I've also sent a v3 patch for this that still uses the prototype file,
> without changing the file specification at all.
> Let me know what you think of that.

TBH I'm not enthusiastic about changing the protofile parsing at all.
The file format is creaky but stable; let's not spend engineer time on
propping up old designs...

> I'm also a bit more aligned on the "walk and copy" functionality, I'll
> try to implement that too. In the meantime if the prototype file
> implementation works, it could also be an improvement what do you
> think?

...because I like this idea a lot better.

--D

> Thanks for your review
> L.
> 
> On April 22, 2025 5:10:19 AM GMT+02:00, "Darrick J. Wong" <djwong@kernel.org> wrote:
> >Crumbs, apparently I forgot ever to send this message. :(
> >
> >On Wed, Apr 16, 2025 at 04:43:31PM +0200, Luca Di Maio wrote:
> >> Hi all,
> >> 
> >> This is an initial prototype to improve XFS's prototype file
> >> functionality in scenarios where FS reproducibility is important.
> >> 
> >> Currently, when populating a filesystem with a prototype file, all generated inodes
> >> receive timestamps set to the creation time rather than preserving timestamps from
> >> their source files.
> >> 
> >> This patchset extends the protofile handling to preserve original timestamps (atime,
> >> mtime, ctime) across all inode types. The implementation is split into two parts:
> >> 
> >> - First patch extends xfs_protofile.in to track origin path references for directories,
> >> character devices and symlinks, similar to what's already implemented for regular files.
> >> 
> >> - Second patch leverages these references to read timestamp metadata from source files
> >> and populate it into the newly created inodes during filesystem creation.
> >> 
> >> At the moment, the new `xfs_protofile` generates a file that results
> >> invalid for older `mkfs.xfs` implementations. Also this new implementation
> >> is not compatible with older prototype files.
> >> 
> >> I can imagine that new protofiles not working with older `mkfs.xfs`
> >> might not be a problem, but what about backward compatibility?
> >> I didn't find references on prototype file compatibility, is a change
> >> like this unwanted?
> >
> >I think it'd be more ergonomic for mkfs users to introduce an alternate
> >implementation that uses nftw() to copy whole directory trees (like
> >mke2fs -d does) instead of revising a 52-year old file format to support
> >copying attrs of non-regular files.  Then we can move people to a
> >mechanism that doesn't require cli options for supporting spaces in
> >filenames and whatnot.
> >
> >--D
> >
> >> If so, what do you think of a versioned support for prototype files?
> >> I was thinking something on the lines of:
> >> 
> >> - xfs_protofile
> >>   - if the new flag:
> >>     - set the first comment accordingly
> >>     - add the additional information
> >>   - else act as old one
> >> 
> >> - proto.c
> >>   - check if the doc starts with the comment `:origin-files enabled`
> >> 	(for example)
> >>   - if so, this is the new format
> >>   - else old format
> >> 
> >> Eager to know your thoughts and ideas
> >> Thanks
> >> L.
> >> 
> >> Luca Di Maio (2):
> >>   xfs_proto: add origin also for directories, chardevs and symlinks
> >>   proto: read origin also for directories, chardevs and symlinks. copy
> >>     timestamps from origin.
> >> 
> >>  mkfs/proto.c          | 49 +++++++++++++++++++++++++++++++++++++++++++
> >>  mkfs/xfs_protofile.in | 12 +++++------
> >>  2 files changed, 55 insertions(+), 6 deletions(-)
> >> 
> >> --
> >> 2.49.0
> >> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-04-23 14:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-16 14:43 [PATCH RFC 0/2] prototype: improve timestamp handling Luca Di Maio
2025-04-16 14:43 ` [PATCH RFC 1/2] xfs_proto: add origin also for directories, chardevs and symlinks Luca Di Maio
2025-04-16 16:07   ` Darrick J. Wong
2025-04-16 14:43 ` [PATCH RFC 2/2] proto: read origin also for directories, chardevs and symlinks. copy timestamps from origin Luca Di Maio
2025-04-16 16:07   ` Darrick J. Wong
2025-04-17 14:14     ` Luca Di Maio
2025-04-22  3:10 ` [PATCH RFC 0/2] prototype: improve timestamp handling Darrick J. Wong
2025-04-22  6:16   ` Luca Di Maio
2025-04-23 14:46     ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox