From: Dave Chinner <david@fromorbit.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Philipp Schrader <philipp@peloton-tech.com>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
linux-xfs <linux-xfs@vger.kernel.org>,
Austin Schuh <austin@peloton-tech.com>,
Alison Chaiken <alison@peloton-tech.com>,
Theodore Tso <tytso@mit.edu>
Subject: Re: Reproducible XFS filesystem artifacts
Date: Wed, 17 Jan 2018 17:15:33 +1100
Message-ID: <20180117061533.GJ6304@dastard>
In-Reply-To: <CAOQ4uxgKonrPQLSaxhL5ok=NEVT9XAatTab4BsQt9tA7gNBJ2g@mail.gmail.com>
On Wed, Jan 17, 2018 at 06:05:17AM +0200, Amir Goldstein wrote:
> On Wed, Jan 17, 2018 at 2:52 AM, Philipp Schrader
> <philipp@peloton-tech.com> wrote:
> >> > I've not had much luck digging into the XFS spec to see prove that the
> >> > ctime is different, but I'm pretty certain. When I mount the images, I
> >> > can see that ctime is different:
> >> > $ stat -c %x,%y,%z,%n /mnt/{a,b}/log/syslog
> >> > 2017-12-28 11:26:53.552000096 -0800,1969-12-31 16:00:00.000000000
> >> > -0800,2017-12-28 11:28:50.524000060 -0800,/mnt/a/log/syslog
> >> > 2017-12-28 10:46:38.739999913 -0800,1969-12-31 16:00:00.000000000
> >> > -0800,2017-12-28 10:48:17.180000049 -0800,/mnt/b/log/syslog
> >> >
> >> > As far as I can tell, there are no mount options to null out the ctime
> >> > fields. (As an aside I'm curious as to the reason for this).
> >>
> >> Correct, there's (afaict) no userspace interface to change ctime, since
> >> it reflects the last time the inode metadata was updated by the kernel.
> >>
> >> > Is there a tool that lets me null out ctime fields on a XFS filesystem
> >> > image
> >>
> >> None that I know of.
> >>
> >> > Or maybe is there a library that lets me traverse the file
> >> > system and set the fields to zero manually?
> >>
> >> Not really, other than messing up the image with the debugger.
> >
> > Which debugger are you talking about? Do you mean xfs_db? I was really
> > hoping to avoid that :)
Yup, xfs_db is the only way you can write custom timestamps in XFS
inodes in an OOB manner. But it's not scalable in any way :/
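A minimal sketch, in Python, of the stat comparison quoted above extended to a whole tree: it lists the paths whose ctime differs between two mounted images, which is the set of inodes an xfs_db pass would have to touch. The function name and the mount points are hypothetical; it only reads st_ctime_ns through os.lstat and modifies nothing.

```python
import os


def ctime_mismatches(root_a, root_b):
    """Return sorted relative paths whose ctime differs between two trees.

    Paths that exist in only one tree are ignored; only ctime is compared.
    """
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root_a):
        for name in dirnames + filenames:
            path_a = os.path.join(dirpath, name)
            rel = os.path.relpath(path_a, root_a)
            path_b = os.path.join(root_b, rel)
            if not os.path.lexists(path_b):
                continue  # nothing to compare against
            # lstat so that symlinks are compared themselves, not their targets
            if os.lstat(path_a).st_ctime_ns != os.lstat(path_b).st_ctime_ns:
                mismatches.append(rel)
    return sorted(mismatches)
```

For the two images above, ctime_mismatches("/mnt/a", "/mnt/b") would be expected to include log/syslog.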
> >> > Does what I'm asking make sense? I feel like I'm not the first person
> >> > to tackle this, but I haven't been lucky with finding anything to
> >> > address this.
> >>
> >> I'm not sure I understand the use case for exactly reproducible filesystem
> >> images (as opposed to the stuff inside said fs), can you tell us more?
> >
> > For some background, these images serve as read-only root file system
> > images on vehicles. During the initial install or during a system
> > update, new images get written to the disks. This uses a process
> > equivalent to using dd(1).
[....]
> That is not to suggest that you should not use xfs. You probably
> have your reasons for it, but whatever was already done by
> others for other fs (e.g. e2image -Qa) may be the way to go
> for xfs. xfs_copy would be the first tool I would look into extending
> for your use case.
Let's make sure we're all on the same page here.
xfs_copy was written for efficient installation of XFS filesystem
images. It doesn't store or copy unused space in its packed
filesystem image....
What xfs_copy cannot do is modify filesystem metadata. IOWs, it can't
solve the timestamp problem the reproducible build process needs
fixed - it can only be used to optimise deployment of images, and
that's not the problem that is being discussed here.
Philipp, if you need bulk modification of all inodes in the
filesystem, then the only tool we have that has the capability of
doing this in an automated fashion is xfs_repair. It wouldn't take
much modification to set all inode timestamps to a fixed timestamp
(including the hidden ones that users can't see like crtime) and
zero out other variable things like change counters.
*However*
Even with timestamp normalisation, there's still no absolute
guarantee that two filesystems produced by different builds will be
identical. The kernel can decide to write back two files in a
different order (e.g. due to differences in memory pressure on the
machine during the build), and that means they'll be allocated
differently on disk. Or there could be races with background
filesystem operations, resulting in an AG being locked when an
allocation is attempted and so the data extent is allocated in the
next available AG rather than the one local to the inode. And so on.
Even minor kernel version differences can result in the filesystem
images having different layouts.
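One way to see how much of this variance a given pair of builds actually has is to compare the two images block by block. A minimal Python sketch, assuming both images are plain files; the function name and the 4096-byte block size are illustrative choices, not anything XFS-specific:

```python
def differing_blocks(image_a, image_b, block_size=4096):
    """Return the indices of the blocks that differ between two image files.

    If one image is shorter, its missing tail blocks count as differing.
    """
    differing = []
    with open(image_a, "rb") as fa, open(image_b, "rb") as fb:
        index = 0
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                differing.append(index)
            index += 1
    return differing
```

A non-empty result once timestamps have been normalised points at differences other than timestamps, such as the allocation layout described above.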
This is not an XFS specific issue, either. All kernel filesystems are
susceptible to physical layout variance for one reason or another.
The fundamental problem is that the build system /cannot control the
filesystem layout/, so even if the contents and user visible
metadata are the same the filesystem images will still not be 100%
identical on every build.
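What a build system can check is the part it does control: file contents and user-visible metadata. A rough Python sketch that hashes those for a mounted image while ignoring atime, ctime and physical layout; the function name and the choice of fields (mode, uid, gid, mtime, size) are assumptions made for illustration:

```python
import hashlib
import os
import stat


def tree_digest(root):
    """Hash file contents and selected user-visible metadata under root.

    atime, ctime and on-disk layout are deliberately left out.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort keeps the traversal order deterministic
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            rel = os.path.relpath(path, root)
            fields = f"{st.st_mode:o} {st.st_uid} {st.st_gid} {st.st_mtime_ns}"
            digest.update(os.fsencode(rel) + b"\0" + fields.encode() + b"\0")
            if stat.S_ISLNK(st.st_mode):
                # hash the link target, not whatever it points at
                digest.update(os.fsencode(os.readlink(path)) + b"\0")
            elif stat.S_ISREG(st.st_mode):
                digest.update(f"{st.st_size}\0".encode())
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
    return digest.hexdigest()
```

Two builds whose mounted images give the same digest agree on contents and on the hashed metadata, even when the image files themselves differ. mtime is included here because the stat output quoted earlier shows it already pinned to the epoch.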
IOWs, you're chasing a goal (100% reproducible filesystem images)
that simply cannot be achieved via writing files through a
kernel-based filesystem....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com