From: Dave Chinner <david@fromorbit.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Christoph Hellwig <hch@infradead.org>,
Pavel Emelyanov <xemul@scylladb.com>,
linux-fsdevel@vger.kernel.org,
"Raphael S . Carvalho" <raphaelsc@scylladb.com>,
linux-api@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] fs: Propagate FMODE_NOCMTIME flag to user-facing O_NOCMTIME
Date: Thu, 9 Oct 2025 08:27:44 +1100
Message-ID: <aObXUBCtp4p83QzS@dread.disaster.area>
In-Reply-To: <CALCETrW3iQWQTdMbB52R4=GztfuFYvN_8p52H1fopdS8uExQWg@mail.gmail.com>
On Wed, Oct 08, 2025 at 08:22:35AM -0700, Andy Lutomirski wrote:
> On Mon, Oct 6, 2025 at 10:08 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Sat, Oct 04, 2025 at 09:08:05AM -0700, Andy Lutomirski wrote:
> > > > Well, we'll need to look into that, including maybe non-blocking
> > > > timestamp updates.
> > > >
> > >
> > > It's been 12 years (!), but maybe it's time to reconsider this:
> > >
> > > https://lore.kernel.org/all/cover.1377193658.git.luto@amacapital.net/
> >
> > I don't see how that is relevant here. Also writes through shared
> > mmaps are problematic for so many reasons that I'm not sure we want
> > to encourage people to use that more.
> >
>
> Because the same exact issue exists in the normal non-mmap write path,
> and I can even quote you upthread :)
>
> > Well, we'll need to look into that, including maybe non-blocking
> > timestamp updates.
>
> I assume the code path that inspired this thread in the first place is:
>
> ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> {
> 	struct file *file = iocb->ki_filp;
> 	struct address_space *mapping = file->f_mapping;
> 	struct inode *inode = mapping->host;
> 	ssize_t ret;
>
> 	ret = file_remove_privs(file);
> 	if (ret)
> 		return ret;
>
> 	ret = file_update_time(file);
>
> and this has *exactly* the same problem as the shared-mmap write path:
> it synchronously updates the time (well, synchronously enough that it
> sometimes blocks),

You are conflating "synchronous update" with "blocking".

Avoiding the need for synchronous timestamp updates is exactly what
the lazytime mount option provides. i.e. lazytime degrades immediate
consistency requirements to eventual consistency similar to how the
default relatime behaviour defers atime updates for eventual
writeback.

IOWs, we've already largely addressed the synchronous c/mtime update
problem but what we haven't done is made timestamp updates
fully support non-blocking caller semantics. That's a separate
problem...

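
To illustrate the distinction with a toy userspace model (not kernel
code): the update below is always applied to the in-memory structure
before the call returns, and the nowait argument only changes whether
the caller is willing to wait for a contended inode. All names here
(toy_nb_inode, toy_update_time_nowait) are hypothetical, and the
atomic flag merely stands in for whatever lock or transaction a real
filesystem would have to take.

```c
/* Toy model only: none of these names exist in the kernel. */
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

struct toy_nb_inode {
	atomic_flag busy;	/* set while someone else holds the inode */
	time_t mtime;		/* in-memory timestamp */
};

/*
 * Update the in-memory timestamp.
 * Returns 0 on success, or -EAGAIN if nowait is set and the update
 * would have had to wait for the current holder.
 */
int toy_update_time_nowait(struct toy_nb_inode *inode, time_t now,
			   bool nowait)
{
	while (atomic_flag_test_and_set(&inode->busy)) {
		if (nowait)
			return -EAGAIN;	/* caller retries from a context that may wait */
		/* blocking caller: keep spinning until the holder is done */
	}
	inode->mtime = now;
	atomic_flag_clear(&inode->busy);
	return 0;
}
```

A caller that must not wait passes nowait = true and handles -EAGAIN
itself. The timestamp semantics are the same in both modes, which is
why this is a separate problem from deferring the persistent update.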
> and it does so before updating the file contents
> (although the window during which the timestamp is updated and the
> contents are not is not as absurdly long as it is in the mmap case).
>
> Now my series does not change any of this, but I'm thinking more of
> the concept: instead of doing file/inode_update_time when a file is
> logically written (in write_iter, page_mkwrite, etc), set a flag so
> that the writeback code knows that the timestamp needs updating.

This is exactly what lazytime implements with the I_DIRTY_TIME flag.

During writeback, if the filesystem has to modify other metadata in
the inode (e.g. block allocation), the filesystem will piggyback the
persistent update of the dirty timestamps on that modification and
clear the I_DIRTY_TIME flag.

However, if the writeback operation is a pure overwrite, then there
is no metadata modification occurring and so we leave the inode
I_DIRTY_TIME dirty for a future metadata persistence operation to
clean it.

IOWs, with lazytime, writeback already persists timestamp updates
when appropriate for best performance.

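
As a rough illustration of the behaviour described above, here is a
minimal userspace sketch. It is a toy model under stated assumptions,
not the kernel implementation: toy_inode, TOY_DIRTY_TIME and the
toy_* helpers are hypothetical names, with TOY_DIRTY_TIME standing in
for I_DIRTY_TIME and the mtime_disk field standing in for the
on-disk inode.

```c
/* Toy model only: none of these names exist in the kernel. */
#include <stdbool.h>
#include <time.h>

#define TOY_DIRTY_TIME 0x1u	/* stands in for I_DIRTY_TIME */

struct toy_inode {
	time_t mtime_mem;	/* timestamp in the in-memory inode */
	time_t mtime_disk;	/* timestamp last persisted to "disk" */
	unsigned int state;	/* dirty flags */
};

/* Write path: update only the in-memory timestamp and mark it dirty. */
void toy_update_time(struct toy_inode *inode, time_t now)
{
	inode->mtime_mem = now;
	inode->state |= TOY_DIRTY_TIME;
}

/*
 * Writeback: persist the timestamp only when other inode metadata
 * (e.g. a block allocation) is being modified anyway.
 */
void toy_writeback(struct toy_inode *inode, bool metadata_changed)
{
	if (!metadata_changed)
		return;		/* pure overwrite: timestamp stays dirty */
	inode->mtime_disk = inode->mtime_mem;
	inode->state &= ~TOY_DIRTY_TIME;
}
```

Calling toy_update_time() and then toy_writeback(inode, false) leaves
TOY_DIRTY_TIME set and mtime_disk unchanged; a later
toy_writeback(inode, true) copies the timestamp out and clears the
flag.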
> Thinking out loud, to handle both write_iter and mmap, there might
> need to be two bits: one saying "the timestamp needs to be updated"
> and another saying "the timestamp has been updated in the in-memory
> inode, but the inode hasn't been dirtied yet".

The flag that implements the latter is called I_DIRTY_TIME. We have
not implemented the former as that's a userspace visible change of
behaviour.

> And maybe the latter
> is doable entirely within fs-specific code without any help from the
> generic code, but it might still be nice to keep generic_update_time
> usable for filesystems that want to do this.

generic_update_time() already supports I_DIRTY_TIME semantics.

-Dave.
--
Dave Chinner
david@fromorbit.com