From: "Theodore Tso" <tytso@mit.edu>
To: Alejandro Colomar <alx@kernel.org>
Cc: Andreas Dilger <adilger@dilger.ca>,
	Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-man@vger.kernel.org
Subject: Re: Writing more than 4096 bytes with O_SYNC flag does not persist all previously written data if system crashes
Date: Mon, 23 Feb 2026 14:32:38 -0500
Message-ID: <20260223193238.GA63263@macsyma-wired.lan>
In-Reply-To: <aZxLxum4WFYKbx2O@devuan>

On Mon, Feb 23, 2026 at 01:46:54PM +0100, Alejandro Colomar wrote:
> Hi Ted, Andreas,
> 
> > The parenthetical comment in the second paragraph needs to be removed,
> > since fsync specifies that all dirty information in the page cache
> > will be flushed out.
> 
> Would you mind checking the text in VERSIONS (since there's a reference
> to it right next to the text you're proposing to remove)?  I suspect it
> will also need to be updated accordingly.  I don't feel qualified to
> touch that text by myself.

The text in VERSIONS is not incorrect, in that it is talking about the
distinction between O_SYNC and O_DSYNC in terms of which kinds of
metadata will be persisted.

However, the reason all of this information regarding Synchronized
I/O is in VERSIONS is that it describes the historic behaviour of
Linux before version 2.6.33 versus more modern versions of Linux.
But 2.6.33 dates from February 24, 2010, which is 16 years ago.  So it
might be simpler to drop this kind of historical information.  But if
you do want to keep it, we should move the bulk of that information
into the descriptions of O_SYNC and O_DSYNC.

So maybe:

       O_DSYNC
              Write  operations  on the file will complete according to the re‐
              quirements of synchronized I/O data integrity completion.

              By the time write(2) (and similar) return, the  output  data  has
              been  transferred to the underlying hardware, along with any file
              metadata that would be required to retrieve that data.

              See VERSIONS for a description of how historical versions
              of the Linux kernel from 2010 behaved.

       O_SYNC Write  operations  on the file will complete according to the re‐
              quirements of synchronized I/O file integrity completion (by con‐
              trast with the synchronized I/O data  integrity  completion  pro‐
              vided by O_DSYNC.)

              By the time write(2) (or similar) returns, the output
              data and all file metadata associated with the inode for
              the opened file have been transferred to the underlying
              hardware.
	      
              See VERSIONS for a description of how historical versions
              of the Linux kernel from 2010 behaved.

    VERSIONS
       Before Linux 2.6.33, Linux implemented only the O_SYNC flag for
       open().  However, when that flag was specified, most
       filesystems actually provided the equivalent of synchronized
       I/O data integrity completion (i.e., O_SYNC was actually
       implemented as the equivalent of O_DSYNC).

I'd suggest dropping everything else in VERSIONS, including the
discussion of O_RSYNC.  All of that is much more appropriate for a
tutorial.

If you really want to keep all of that VERSIONS text, perhaps it could
be moved into a synchronized-io man page in section 7.  There we can
talk about the difference between fsync() and fdatasync(), which is
interesting as a conceptual model, and which is conceptually similar
to the difference between O_SYNC and O_DSYNC.  What differs is which
data will be written back: only the data that was written via the
file descriptor where the O_SYNC/O_DSYNC flag was set (either via open
or fcntl), versus all buffered data in the buffer cache.  The
synchronized-io man page could also have more of the information
around O_DIRECT in one place.
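
The scope difference described above can be sketched in C.  This is
only an illustration under the semantics stated in this message; the
function name demo_sync_scopes() and the offsets are hypothetical.
One descriptor is opened without a sync flag and one with O_SYNC, and
a final fsync() covers the buffered write that the O_SYNC write is not
being relied on to persist.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: write to the same file through a plain descriptor
 * and through an O_SYNC descriptor, then fsync().
 * Returns 0 on success, -1 on any failure.
 */
int demo_sync_scopes(const char *path)
{
    static const char a[] = "buffered";
    static const char b[] = "synced";
    int rc = -1;

    int fd_plain = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd_plain < 0)
        return -1;

    int fd_sync = open(path, O_WRONLY | O_SYNC);
    if (fd_sync < 0) {
        close(fd_plain);
        return -1;
    }

    /* (1) Buffered write: may still be only in memory on return. */
    if (pwrite(fd_plain, a, sizeof(a) - 1, 0) != (ssize_t)(sizeof(a) - 1))
        goto out;

    /*
     * (2) O_SYNC write: this data is on stable storage on return.
     * Do not rely on it to also persist the earlier write (1).
     */
    if (pwrite(fd_sync, b, sizeof(b) - 1, 4096) != (ssize_t)(sizeof(b) - 1))
        goto out;

    /* (3) fsync(): flushes all dirty data for the file, including (1). */
    if (fsync(fd_plain) != 0)
        goto out;

    rc = 0;
out:
    close(fd_sync);
    close(fd_plain);
    return rc;
}
```

An application that needs write (1) to survive a crash should
therefore call fsync() or fdatasync(), as in step (3), rather than
depend on a later O_SYNC write to a different part of the file.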

> If you'd write a patch, I'd appreciate that.

Well, there's the question of what minimal change is needed to fix
out-and-out inaccuracies; for that, we can just delete some
parenthetical comments.

BTW, if we want to delete inaccurate information, I'd also suggest
deleting the following text in the O_DIRECT section of the man page:

      A semantically similar (but deprecated) interface for block
      devices is described in raw(8).

----

Then there's trying to rearrange the tutorial-style information for
people who want to implement code which needs data persistence
guarantees.  That's quite a lot more work, and while I'm happy to
review or assist someone to write that more expansive tutorial
material, it's not something I'm willing to sign up to do.

----

Finally, there are some philosophical questions about what the goals
of the Linux kernel man pages are: how important is it to have
historical information (for example, O_DIRECT has a "since 2.4.10",
which is 25 years ago; really?), how important is it to have tutorial
information, and where should that information be organized in the
man page?

My personal opinion is that the primary priority of the Linux man pages
is to document the specification of the kernel interfaces that we
expose to user space.  Things like tutorial material and descriptions
of historical versions are of secondary importance.

I'd also advocate dropping historical information for kernel versions
which are older than, say, 7 years.  Currently the oldest LTS kernel
which is supported upstream is 5.10, which was originally released in
2020, and will EOL by end of 2026.  The Linux kernel 5.0 was released
on March 3, 2019, so using a 7 year lookback means that explanations
of how the Linux kernel behaved in 2.4.x, 2.6.y, 3.x, 4.x, etc. can be
dropped from the man pages, since IMHO that will reduce a lot of noise
that is likely to confuse readers.

But that's a call for Alex and the man pages project to make.

Cheers,

					- Ted
