From: Sean Smith <defendthedisabled@gmail.com>
To: tytso@mit.edu
Cc: defendthedisabled@gmail.com, linux-fsdevel@vger.kernel.org,
linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
dsterba@suse.com, david@fromorbit.com, brauner@kernel.org,
osandov@osandov.com, almaz@kernel.org,
hirofumi@mail.parknet.co.jp, linkinjeon@kernel.org
Subject: Re: [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance
Date: Mon, 6 Apr 2026 19:05:55 -0500
Message-ID: <20260407000558.417-1-DefendTheDisabled@gmail.com>
In-Reply-To: <20260405225442.GA1763@macsyma-wired.lan>
[written with AI assistance]
On Sun, Apr 05, 2026 at 06:54:42PM -0400, Theodore Tso wrote:
Thanks for the substantive engagement — it helps clarify where
the proposal needs to justify itself.
> On Sun, Apr 05, 2026 at 02:49:56PM -0500, Sean Smith wrote:
> >
> > 1. Application atomic saves destroy xattrs. Programs that save
> > via write-to-temp + rename() replace the inode, permanently
> > destroying all extended attributes. Only the VFS sees both
> > inodes during rename -- no userspace mechanism can intercept
> > this and copy metadata across.
>
> The VFS could potentially copy the xattr on a rename, no?
It could, but even scoping to user.* means adding conditional
xattr-copy logic into every filesystem's rename handler — with
dynamic allocation and xattr tree lookups on a hot path. ptime
avoids this: one inline inode field, clear semantics, same VFS
patterns as atime/mtime/btime.
> > 2. Every tool in the copy chain must explicitly opt in to xattr
> > preservation. cp requires --preserve=xattr, rsync requires -X,
> > tar requires --xattrs. Each missing flag causes silent data
> > loss. Transparent preservation through arbitrary tool flows
> > is not achievable in userspace.
>
> But this is true for your proposed ptime as well. You have to change
> every single tool to copy over the ptime. Worse, you have to change
> the format of tar in a non-standard on-disk format change to support
> this new ptime timestamp. And rsync will require a non-standard
> protocol change to support the new timestamp.
You are right that copy tools require patches. If ptime only
improved the copy-tool situation, I would agree it does not
justify new kernel surface over xattrs.
The structural difference is in the default adoption path.
xattr preservation is permanently per-invocation opt-in: each
tool call needs the correct flag, and the default is to drop
them. A kernel timestamp exposed through statx/utimensat
follows the same API pattern as mtime — standard libraries
and tools naturally evolve to preserve all standard timestamps
by default. ptime has a path to default-preservation that
xattrs structurally cannot reach.
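To illustrate the API pattern being referred to, here is a minimal
sketch of how a copy tool carries atime and mtime across today using
stat(2) and utimensat(2). It uses only existing POSIX calls; the
function name demo_copy_times is made up for this example, and no
ptime interface is shown because its exact userspace shape is defined
by the patches, not by this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* stat, utimensat, struct stat */
#include <time.h>       /* struct timespec */

/*
 * Copy atime and mtime from src to dst.
 * Returns 0 on success, -1 on error (errno set by the failing call).
 * demo_copy_times is a hypothetical helper name, not from any tool.
 */
int demo_copy_times(const char *src, const char *dst)
{
    struct stat st;

    if (stat(src, &st) != 0)
        return -1;

    /* times[0] is the access time, times[1] the modification time. */
    struct timespec times[2] = { st.st_atim, st.st_mtim };

    return utimensat(AT_FDCWD, dst, times, 0);
}
```

A tool that already does this for mtime has one obvious place to add
another timestamp, which is the adoption argument made above.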
On the formats: the tar patch uses a vendor-prefixed PAX
header (SCHILY.ptime), backward-compatible — old readers
ignore it cleanly. The rsync patch plugs into the existing
--crtimes machinery that already supports macOS and Cygwin.
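For readers unfamiliar with the pax format, the sketch below shows how
one extended header record is laid out: "<len> <keyword>=<value>\n",
where <len> counts the whole record including its own digits. The
keyword SCHILY.ptime is the one named above; the function name
pax_format_record and the "seconds.nanoseconds" value format are
assumptions for illustration and may not match the actual tar patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one pax extended header record into buf:
 *     "<len> <keyword>=<value>\n"
 * <len> is the decimal byte length of the entire record, including the
 * digits of <len> itself. Returns the record length, or -1 if buf is
 * too small. pax_format_record is a hypothetical helper name.
 */
int pax_format_record(char *buf, size_t bufsize,
                      const char *keyword, const char *value)
{
    /* Bytes other than the length digits: ' ' keyword '=' value '\n' */
    size_t body = 1 + strlen(keyword) + 1 + strlen(value) + 1;
    size_t total = body + 1;    /* first guess: a one-digit length */

    /* Adding the digits can itself add a digit; iterate until stable. */
    for (;;) {
        int digits = snprintf(NULL, 0, "%zu", total);

        if (body + (size_t)digits == total)
            break;
        total = body + (size_t)digits;
    }

    if (total + 1 > bufsize)    /* +1 for the terminating NUL */
        return -1;

    snprintf(buf, bufsize, "%zu %s=%s\n", total, keyword, value);
    return (int)total;
}
```

A reader that does not recognize the keyword can skip the record using
the length prefix alone, which is why unknown vendor keywords are
backward-compatible.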
> > Atomic saves are the default behavior of mainstream applications
> > (LibreOffice, Vim, Kate, etc.).
>
> You will also have to change mainstream applications to copy ptime
> from the original file to the file.new before the atomic rename.
> Using ptime doesn't change this. So you will need to make this
> non-standard, Linux-specific change to all of these mainstream
> applications.
This is where the cover letter was not clear enough, and it
is the core reason ptime must be a kernel timestamp.
The patches implement rename-over preservation in the rename
handlers of all 5 filesystems in the series. When rename(source, target)
replaces an existing file, and the source has ptime=0 (the
default for any newly-created temp file) while the target
has ptime != 0, the filesystem copies the target's ptime to
the source before destroying the target's inode. This runs
inside the rename transaction, atomic with the rename itself.
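The rule can be stated as a small userspace model. This is a sketch of
the decision described above, not the patch code: struct demo_inode
and both function names are hypothetical, and treating "seconds and
nanoseconds both zero" as "ptime not set" is an assumption of this
example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-inode state; not a kernel structure. */
struct demo_inode {
    int64_t  ptime_sec;     /* 0/0 is taken to mean "ptime not set" */
    uint32_t ptime_nsec;
};

bool demo_ptime_is_set(const struct demo_inode *inode)
{
    return inode->ptime_sec != 0 || inode->ptime_nsec != 0;
}

/*
 * Apply the rename-over rule to source before target is dropped.
 * target is NULL when the rename does not replace an existing file.
 * Returns true if source inherited target's ptime.
 */
bool demo_rename_over_inherit(struct demo_inode *source,
                              const struct demo_inode *target)
{
    if (target == NULL)                 /* plain rename: nothing replaced */
        return false;
    if (demo_ptime_is_set(source))      /* an explicit ptime on source wins */
        return false;
    if (!demo_ptime_is_set(target))     /* nothing to carry over */
        return false;

    source->ptime_sec = target->ptime_sec;
    source->ptime_nsec = target->ptime_nsec;
    return true;
}
```

In the real handlers this check would sit inside the filesystem's
rename transaction, so the inherited value and the rename commit
together.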
Most GUI applications — LibreOffice, Kate, Qt and GNOME
apps — save via write-to-temp + rename-over-original. For
these, ptime survives automatically with no application
changes:
1. App writes to temp file (ptime = 0)
2. rename(temp, document.odt)
3. Kernel: source ptime == 0, target ptime != 0 -> copies target's ptime to source
4. ptime preserved. No app change.
This is not universal: editors that use rename-away +
create-new (Vim with default backupcopy=no, Emacs) do not
trigger rename-over, and the spec documents this as a known
limitation. But the write-to-temp + rename-over pattern is
the dominant GUI save path, and the kernel handles it
transparently — something no xattr mechanism can provide
without application cooperation.
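The write-to-temp + rename-over pattern, and the fact that it swaps the
inode under the path, can be reproduced with a short userspace sketch.
The helper names demo_inode_of and demo_atomic_save are made up for this
example, and real applications add details (temp name selection,
directory fsync, permission copying) that are omitted here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>      /* rename */
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the inode number behind path, or 0 on error. */
ino_t demo_inode_of(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_ino : 0;
}

/*
 * Save data to path the way many GUI applications do: write a temp
 * file, flush it, then rename it over the original.
 * Returns 0 on success, -1 on error.
 */
int demo_atomic_save(const char *path, const char *tmp_path,
                     const char *data)
{
    size_t len = strlen(data);
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Calling demo_inode_of() on the document before and after a second
demo_atomic_save() returns two different inode numbers: the path now
refers to the former temp file, which is why per-inode metadata on the
old file is lost unless something copies it during the rename.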
> Is it worth it? It's a huge amount of cost being spread across a very
> large part of the open source ecosystem just this fairly narrow use
> case. Personally, I'm not convinced it's worth the effort.
I think the use case is broader than I conveyed. Any workflow
that copies files from NTFS, APFS, or HFS+ onto native Linux
filesystems loses user-visible creation time unless carried
out-of-band. This affects personal migrations, enterprise
backups, dual-boot users, and professional workflows in
photography, legal, scientific data, and media production.
Windows, macOS, and SMB have supported a settable creation
timestamp for decades — Linux is the outlier.
Users already expend significant resources working around
this gap — metadata manifests, scripts to stamp creation
dates into filenames or xattrs, side-channel databases —
or simply accept the data loss. The cost is already being
paid, continuously and redundantly across the ecosystem.
One upstream investment in ptime converts that distributed
ongoing cost into a bounded effort.
ptime is separate from btime by design: it preserves btime's
value as immutable forensic metadata while providing a
settable timestamp that travels with file content across
filesystem boundaries.
On ecosystem cost: the kernel surface is ~240 lines across
28 files. For context, I am a disabled Medicaid recipient
who came to this from a disability rights litigation
workflow — I need file provenance preserved across an
NTFS-to-Btrfs migration for legal work. The complete
implementation — kernel patches across 5 filesystems,
tool patches, and xfstests — was produced in a few days using
agentic development tools, which suggests the adoption cost may
be meaningfully lower than traditional estimates as these
tools become available across the ecosystem.
I understand a new timestamp is permanent API surface and
the bar should be high. My claim is that rename-over
preservation — automatic ptime survival through application
saves, without application changes — makes this materially
different from an xattr workaround, and justifies that cost.
Sean
Thread overview:
2026-04-05 19:49 [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance Sean Smith
2026-04-05 19:49 ` [PATCH 1/6] vfs: add provenance_time (ptime) infrastructure Sean Smith
2026-04-05 19:49 ` [PATCH 2/6] btrfs: add provenance time (ptime) support Sean Smith
2026-04-05 19:49 ` [PATCH 3/6] ntfs3: map ptime to NTFS creation time with rename-over Sean Smith
2026-04-05 19:50 ` [PATCH 4/6] ext4: add dedicated ptime field alongside i_crtime Sean Smith
2026-04-05 19:50 ` [PATCH 5/6] fat: map ptime to FAT creation time with rename-over Sean Smith
2026-04-05 19:50 ` [PATCH 6/6] exfat: map ptime to exFAT " Sean Smith
2026-04-05 22:54 ` [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance Theodore Tso
2026-04-07 0:05 ` Sean Smith [this message]
2026-04-07 1:42 ` Darrick J. Wong