* Reproducible XFS Filesystems Builds for VMs
From: Luca DiMaio @ 2025-04-11 14:38 UTC
To: linux-xfs; +Cc: Scott Moser, Dimitri Ledkov
Subject: Reproducible XFS Filesystems Builds for VMs
linux-xfs@vger.kernel.org
Dear XFS Maintainers and Community,
I am a Software Engineer at Chainguard working on reproducible builds for VMs.
While we have successfully implemented reproducible disk images with
EFI+EXT4 partitions, I’ve been unable to replicate this for XFS
filesystems.
Current Approach:
We have successfully implemented reproducible disk images with
EFI+EXT4 partitions using the following methods:
- For FAT32 partitions: `mkfs.vfat --invariant -i $EFI_UUID` with
`$SOURCE_DATE_EPOCH` and populating via mtools
- For EXT4 partitions: `mkfs.ext4 -E hash_seed=$EXT4_HASH_SEED -U
$ROOTFS_UUID` with `$SOURCE_DATE_EPOCH` plus the `-d
/path/to/rootfs.tar.gz` to populate it
XFS Challenges:
For XFS, I've attempted to create reproducible filesystems using
extensive parameters:
```
mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 $root_partition
```
I've tried to specify as many options as possible in order to avoid
nondeterministic decisions at runtime.
Unfortunately, this does not produce reproducible results across
different disk images.
I've made progress with empty filesystems by using a combination of
`libfaketime` to enforce `$SOURCE_DATE_EPOCH` and a custom library that
overrides libc's `getrandom()`:
```
~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 disk1.img
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-n version=2 disk2.img
~$ md5sum disk*
c68c202163dcb862762fc01970f6c8b4 disk1.img
c68c202163dcb862762fc01970f6c8b4 disk2.img
```
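The build-twice-and-compare check above can be wrapped in a small helper. This is only a sketch: `check_reproducible` and `build_image` are hypothetical names, not part of xfsprogs, and the helper assumes the build command recreates its output file from scratch on every run.

```shell
# Hypothetical helper: run a build command twice and compare the hashes
# of the file it produces.
# usage: check_reproducible <output-file> <command> [args...]
check_reproducible() {
    local out first second
    out=$1; shift
    "$@" || return 2                          # first build
    first=$(sha256sum < "$out") || return 2
    rm -f -- "$out"                           # force a rebuild from scratch
    "$@" || return 2                          # second build
    second=$(sha256sum < "$out") || return 2
    if [ "$first" = "$second" ]; then
        echo "reproducible: ${first%% *}"
    else
        echo "NOT reproducible: ${first%% *} != ${second%% *}" >&2
        return 1
    fi
}

# Example (assumes build_image is your own function that creates the image
# file and then runs the mkfs.xfs invocation shown above):
#   check_reproducible disk.img build_image disk.img
```

The helper returns 0 when both runs hash identically, 1 when they differ, and 2 when the build itself fails.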
This approach works for empty filesystems, but when populating the filesystem by
mounting and untarring an archive, different metadata is generated
even after using
`xfs_repair -L` to reset most metadata.
The primary difference appears to be in the allocation group metadata,
which is optimized at runtime.
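One way to check where two images diverge is to bucket the differing byte offsets by allocation group. The sketch below assumes both images have the same size and approximates the AG size as image size divided by `agcount`; the real AG size is recorded in the superblock and may be rounded, so offsets near an AG boundary can be misattributed. `diff_by_ag` is a hypothetical name.

```shell
# Hypothetical helper: count differing bytes per (approximate) allocation group.
# usage: diff_by_ag <image1> <image2> <agcount>
diff_by_ag() {
    local size
    size=$(wc -c < "$1")
    # cmp -l prints one line per differing byte: <1-based offset> <octal> <octal>
    cmp -l -- "$1" "$2" | awk -v size="$size" -v agcount="$3" '
        BEGIN { agsize = size / agcount }   # assumption: equally sized AGs
        { ag = int(($1 - 1) / agsize); n[ag]++ }
        END {
            for (ag = 0; ag < agcount; ag++)
                printf "AG %d: %d differing bytes\n", ag, n[ag] + 0
        }'
}

# Example, for images created with -d agcount=4:
#   diff_by_ag disk1.img disk2.img 4
```

Differences concentrated near the start of each AG would be consistent with the AG headers and free-space metadata being the source of the mismatch.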
Question:
EXT4 addresses this issue with the -d flag, which allows populating
from an archive or directory without mounting.
Is there similar functionality available for XFS, or is there interest
in developing a method for generating reproducible XFS root
filesystems?
I'm asking this because we'd be interested in using XFS as a filesystem for the
final product.
Thank you for your time and expertise. Any guidance would be greatly
appreciated.
Regards,
L.
* Re: Reproducible XFS Filesystems Builds for VMs
From: Christoph Hellwig @ 2025-04-14 5:39 UTC
To: Luca DiMaio; +Cc: linux-xfs, Scott Moser, Dimitri Ledkov
Hi Luca,
On Fri, Apr 11, 2025 at 04:38:10PM +0200, Luca DiMaio wrote:
> EXT4 addresses this issue with the -d flag, which allows populating
> from an archive or directory without mounting.
> Is there similar functionality available for XFS, or is there interest
> in developing a method for generating reproducible XFS root
> filesystems?
>
> I'm asking this because we'd be interested in using XFS as a filesystem for the
> final product.
mkfs.xfs supports the -p protofile option which allows populating the
file system with existing files and directories at mkfs time. Can you
try that and report if it helps? If not we might be able to look into
fixing issues with it. Note that the protofile is a little arcane
so read the documentation carefully.
* Re: Reproducible XFS Filesystems Builds for VMs
From: Luca DiMaio @ 2025-04-14 16:53 UTC
To: Christoph Hellwig; +Cc: linux-xfs, Scott Moser, Dimitri Ledkov
Thanks Christoph for the protofile pointer,
I've experimented with it and it does let us create a reproducible
XFS filesystem, as follows (still using the LD_PRELOAD trick):
```
~$ tar --sort=name --warning=no-timestamp --xattrs \
--xattrs-include='*' -xpf rootfs.tar.gz --numeric-owner -C rootfs/
~$ xfs_protofile rootfs > rootfs.protofile
~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-p rootfs.protofile \
-n version=2 disk1.img
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-p rootfs.protofile \
-n version=2 disk2.img
~$ md5sum disk*
dd06b8c8fe79e979d961291a4f78b72e disk1.img
dd06b8c8fe79e979d961291a4f78b72e disk2.img
```
This is a huge step forward, but we are still facing some missing features/bugs:
- we lose the extended attributes of the files
- we lose the original timestamps of files and directories
I see that the protofile specification does not include anything about
those. Are there plans to support xattrs and timestamps?
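To see exactly which metadata is lost, the source tree can be compared against the populated filesystem with a sorted manifest. This is a rough sketch: `tree_manifest` is a hypothetical helper, it relies on GNU find's `-printf`, and it covers type, mode, ownership and mtime only. Extended attributes would need a separate tool such as `getfattr`.

```shell
# Hypothetical helper: print one sorted line per entry with
# path, type, octal mode, numeric uid:gid and mtime (seconds since epoch).
# usage: tree_manifest <dir>
tree_manifest() {
    ( cd "$1" && find . -printf '%p\t%y\t%m\t%U:%G\t%T@\n' | LC_ALL=C sort )
}

# Example (/mnt/image is an assumed mount point for the generated image):
#   tree_manifest rootfs     > expected.txt
#   tree_manifest /mnt/image > actual.txt
#   diff expected.txt actual.txt
```

Any line that shows up in the diff points at an entry whose mode, ownership or timestamp did not survive the protofile round trip.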
Thanks a lot for the help
L.
On Mon, Apr 14, 2025 at 7:39 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> Hi Luca,
>
> On Fri, Apr 11, 2025 at 04:38:10PM +0200, Luca DiMaio wrote:
> > EXT4 addresses this issue with the -d flag, which allows populating
> > from an archive or directory without mounting.
> > Is there similar functionality available for XFS, or is there interest
> > in developing a method for generating reproducible XFS root
> > filesystems?
> >
> > I'm asking this because we'd be interested in using XFS as a filesystem for the
> > final product.
>
> mkfs.xfs supports the -p protofile option which allows populating the
> file system with existing files and directories at mkfs time. Can you
> try that and report if it helps? If not we might be able to look into
> fixing issues with it. Note that the protofile is a little arcane
> so read the documentation carefully.
>
* Re: Reproducible XFS Filesystems Builds for VMs
From: Christoph Hellwig @ 2025-04-16 5:34 UTC
To: Luca DiMaio; +Cc: Christoph Hellwig, linux-xfs, Scott Moser, Dimitri Ledkov
On Mon, Apr 14, 2025 at 06:53:35PM +0200, Luca DiMaio wrote:
> This is a huge step ahead, but we still are facing some missing features/bugs:
>
> - we lose the extended attributes of the files
> - we lose the original timestamps of files and directories
>
> I see that the prototype specification does not include anything about
> those, are there plans to
> support xattrs and timestamps?
I don't think anyone has concrete plans to write this. But patches
would be happily accepted.
* Re: Reproducible XFS Filesystems Builds for VMs
From: Darrick J. Wong @ 2025-04-16 5:55 UTC
To: Christoph Hellwig; +Cc: Luca DiMaio, linux-xfs, Scott Moser, Dimitri Ledkov
On Tue, Apr 15, 2025 at 10:34:34PM -0700, Christoph Hellwig wrote:
> On Mon, Apr 14, 2025 at 06:53:35PM +0200, Luca DiMaio wrote:
> > This is a huge step ahead, but we still are facing some missing features/bugs:
> >
> > - we lose the extended attributes of the files
> > - we lose the original timestamps of files and directories
> >
> > I see that the prototype specification does not include anything about
> > those, are there plans to
> > support xattrs and timestamps?
>
> I don't think anyone has concrete plans to write this. But patches
> would be happily accepted.
xattrs mostly work as of mkfs.xfs in 6.13. If you have more than 64k
worth of attr names (aka enough to break listxattr) then there will be
problems.
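A rough way to check whether a file gets near that limit is to add up the attribute name lengths. The sketch below assumes the listxattr buffer holds each name followed by a NUL byte, so a name costs its length plus one; `xattr_names_bytes` is a hypothetical helper, and the `getfattr` pipeline in the comment assumes its default output of one name per line after a `# file:` header.

```shell
# Hypothetical helper: sum the bytes the given attribute names would occupy
# in a listxattr buffer (each name plus its terminating NUL).
# usage: <one attribute name per line> | xattr_names_bytes
xattr_names_bytes() {
    LC_ALL=C awk '{ total += length($0) + 1 } END { print total + 0 }'
}

# Example (assumed getfattr usage; header and blank lines are filtered out):
#   getfattr --absolute-names -m - FILE | grep -v -e '^#' -e '^$' | xattr_names_bytes
```

A result approaching 65536 would suggest the file falls into the problematic case described above.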
--D
* Re: Reproducible XFS Filesystems Builds for VMs
From: Luca DiMaio @ 2025-04-16 14:50 UTC
To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs, Scott Moser, Dimitri Ledkov
On Wed, Apr 16, 2025 at 7:55 AM Darrick J. Wong <djwong@kernel.org> wrote:
> xattrs mostly work as of mkfs.xfs in 6.13. If you have more than 64k
> worth of attr names (aka enough to break listxattr) then there will be
> problems.
>
> --D
Thanks Darrick, I saw that this was recently fixed.
I've sent a patch to fix a small bug in the Python generator here:
https://lore.kernel.org/linux-xfs/20250416123508.900340-1-luca.dimaio1@gmail.com/T/#t
Together with this, I've prepared an RFC to carry over timestamps when
we create XFS filesystems. This will greatly improve the ability to
create reproducible XFS disks:
https://lore.kernel.org/linux-xfs/20250416123508.900340-1-luca.dimaio1@gmail.com/T/#t
Eager to hear everyone's thoughts on this.
L.