From: Sean Smith <defendthedisabled@gmail.com>
To: tytso@mit.edu
Cc: defendthedisabled@gmail.com, linux-fsdevel@vger.kernel.org,
linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
dsterba@suse.com, david@fromorbit.com, brauner@kernel.org,
osandov@osandov.com, almaz@kernel.org,
hirofumi@mail.parknet.co.jp, linkinjeon@kernel.org
Subject: Re: [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance
Date: Mon, 6 Apr 2026 19:05:55 -0500 [thread overview]
Message-ID: <20260407000558.417-1-DefendTheDisabled@gmail.com> (raw)
In-Reply-To: <20260405225442.GA1763@macsyma-wired.lan>
[written with AI assistance]
On Sun, Apr 05, 2026 at 06:54:42PM -0400, Theodore Tso wrote:
Thanks for the substantive engagement — it helps clarify where
the proposal needs to justify itself.
> On Sun, Apr 05, 2026 at 02:49:56PM -0500, Sean Smith wrote:
> >
> > 1. Application atomic saves destroy xattrs. Programs that save
> > via write-to-temp + rename() replace the inode, permanently
> > destroying all extended attributes. Only the VFS sees both
> > inodes during rename -- no userspace mechanism can intercept
> > this and copy metadata across.
>
> The VFS could potentially copy the xattr on a rename, no?
It could, but even scoping to user.* means adding conditional
xattr-copy logic into every filesystem's rename handler — with
dynamic allocation and xattr tree lookups on a hot path. ptime
avoids this: one inline inode field, clear semantics, same VFS
patterns as atime/mtime/btime.
> > 2. Every tool in the copy chain must explicitly opt in to xattr
> > preservation. cp requires --preserve=xattr, rsync requires -X,
> > tar requires --xattrs. Each missing flag causes silent data
> > loss. Transparent preservation through arbitrary tool flows
> > is not achievable in userspace.
>
> But this is true for your proposed ptime as well. You have to change
> every single tool to copy over the ptime. Worse, you have to change
> the format of tar in a non-standard on-disk format change to support
> this new ptime timestamp. And rsync will require a non-standard
> protocol change to support the new timestamp.
You are right that copy tools require patches. If ptime only
improved the copy-tool situation, I would agree it does not
justify new kernel surface over xattrs.
The structural difference is in the default adoption path.
xattr preservation is permanently per-invocation opt-in: each
tool call needs the correct flag, and the default is to drop
them. A kernel timestamp exposed through statx/utimensat
follows the same API pattern as mtime — standard libraries
and tools naturally evolve to preserve all standard timestamps
by default. ptime has a path to default-preservation that
xattrs structurally cannot reach.
On the formats: the tar patch uses a vendor-prefixed PAX
header (SCHILY.ptime), backward-compatible — old readers
ignore it cleanly. The rsync patch plugs into the existing
--crtimes machinery that already supports macOS and Cygwin.
> > Atomic saves are the default behavior of mainstream applications
> > (LibreOffice, Vim, Kate, etc.).
>
> You will also have to change mainstream applications to copy ptime
> from the original file to the file.new before the atomic rename.
> Using ptime doesn't change this. So you will need to make this
> non-standard, Linux-specific change to all of these mainstream
> applications.
This is where the cover letter was not clear enough, and it
is the core reason ptime must be a kernel timestamp.
The patches implement rename-over preservation in all 5
filesystem rename handlers. When rename(source, target)
replaces an existing file, and the source has ptime=0 (the
default for any newly-created temp file) while the target
has ptime != 0, the filesystem copies the target's ptime to
the source before destroying the target's inode. This runs
inside the rename transaction, atomic with the rename itself.
Most GUI applications — LibreOffice, Kate, Qt and GNOME
apps — save via write-to-temp + rename-over-original. For
these, ptime survives automatically with no application
changes:
1. App writes to temp file (ptime = 0)
2. rename(temp, document.odt)
3. Kernel: source ptime=0, target!=0 -> copies ptime
4. ptime preserved. No app change.
This is not universal: editors that use rename-away +
create-new (Vim with default backupcopy=no, Emacs) do not
trigger rename-over, and the spec documents this as a known
limitation. But the write-to-temp + rename-over pattern is
the dominant GUI save path, and the kernel handles it
transparently — something no xattr mechanism can provide
without application cooperation.
> Is it worth it? It's a huge amount of cost being spread across a very
> large part of the open source ecosystem just this fairly narrow use
> case. Personally, I'm not convinced it's worth the effort.
I think the use case is broader than I conveyed. Any workflow
that copies files from NTFS, APFS, or HFS+ onto native Linux
filesystems loses user-visible creation time unless carried
out-of-band. This affects personal migrations, enterprise
backups, dual-boot users, and professional workflows in
photography, legal, scientific data, and media production.
Windows, macOS, and SMB have supported a settable creation
timestamp for decades — Linux is the outlier.
Users already expend significant resources working around
this gap — metadata manifests, scripts to stamp creation
dates into filenames or xattrs, side-channel databases —
or simply accept the data loss. The cost is already being
paid, continuously and redundantly across the ecosystem.
One upstream investment in ptime converts that distributed
ongoing cost into a bounded effort.
ptime is separate from btime by design: it preserves btime's
value as immutable forensic metadata while providing a
settable timestamp that travels with file content across
filesystem boundaries.
On ecosystem cost: the kernel surface is ~240 lines across
28 files. For context, I am a disabled Medicaid recipient
who came to this from a disability rights litigation
workflow — I need file provenance preserved across an
NTFS-to-Btrfs migration for legal work. The complete
implementation — kernel patches across 5 filesystems,
tool patches, and xfstests — was produced in a few days using
agentic development tools, which suggests the adoption cost may
be meaningfully lower than traditional estimates as these
tools become available across the ecosystem.
I understand a new timestamp is permanent API surface and
the bar should be high. My claim is that rename-over
preservation — automatic ptime survival through application
saves, without application changes — makes this materially
different from an xattr workaround, and justifies that cost.
Sean
next prev parent reply other threads:[~2026-04-07 0:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-05 19:49 [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance Sean Smith
2026-04-05 19:49 ` [PATCH 1/6] vfs: add provenance_time (ptime) infrastructure Sean Smith
2026-04-05 19:49 ` [PATCH 2/6] btrfs: add provenance time (ptime) support Sean Smith
2026-04-05 19:49 ` [PATCH 3/6] ntfs3: map ptime to NTFS creation time with rename-over Sean Smith
2026-04-05 19:50 ` [PATCH 4/6] ext4: add dedicated ptime field alongside i_crtime Sean Smith
2026-04-05 19:50 ` [PATCH 5/6] fat: map ptime to FAT creation time with rename-over Sean Smith
2026-04-05 19:50 ` [PATCH 6/6] exfat: map ptime to exFAT " Sean Smith
2026-04-05 22:54 ` [RFC PATCH v1 0/6] provenance_time (ptime): a new settable timestamp for cross-filesystem provenance Theodore Tso
2026-04-07 0:05 ` Sean Smith [this message]
2026-04-07 1:42 ` Darrick J. Wong
2026-04-07 6:06 ` Sean Smith
2026-04-07 15:17 ` Darrick J. Wong
2026-04-07 23:36 ` Theodore Tso
2026-04-08 2:54 ` Sean Smith
2026-04-08 13:33 ` Theodore Tso
2026-04-09 0:15 ` Sean Smith
2026-04-09 13:38 ` Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260407000558.417-1-DefendTheDisabled@gmail.com \
--to=defendthedisabled@gmail.com \
--cc=almaz@kernel.org \
--cc=brauner@kernel.org \
--cc=david@fromorbit.com \
--cc=dsterba@suse.com \
--cc=hirofumi@mail.parknet.co.jp \
--cc=linkinjeon@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=osandov@osandov.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.