From: Amir Goldstein <amir73il@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org,
Fei Lv <feilv@asrmicro.com>,
Chenglong Tang <chenglongtang@google.com>,
stable@vger.kernel.org
Subject: Re: [PATCH] ovl: make fsync after metadata copy-up opt-in mount option
Date: Wed, 25 Mar 2026 14:11:31 +0100
Message-ID: <CAOQ4uxge9QDMwnLr1+W0xF2GocnFWVrbhRdriaf5Qe+4KkrG4Q@mail.gmail.com>
In-Reply-To: <CAOQ4uxh5NFvXGop6ne-zfRbH5p6BPT2kCt7dUkP__-TtpeJjJQ@mail.gmail.com>
On Wed, Mar 25, 2026 at 1:03 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Mar 25, 2026 at 6:55 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Tue, Mar 24, 2026 at 03:57:50PM +0100, Amir Goldstein wrote:
> > > From: Fei Lv <feilv@asrmicro.com>
> > >
> > > Commit 7d6899fb69d25 ("ovl: fsync after metadata copy-up") was done to
> > > fix durability of overlayfs copy up on an upper filesystem which does
> > > not enforce ordering on storing of metadata changes (e.g. ubifs).
> >
> > I'm trying to understand this previous commit more than this one,
> > but what 'enforce ordering on storing of metadata changes' does
> > overlayfs encode right now?
>
> On copy up of a directory:
> 1. create a directory in tmpdir
> 2. copy attributes and xattrs from the lower directory to this staged
> directory
> 3. move it into place in overlayfs upperdir
>
> Until commit 7d6899fb69d25, there was no fsync between steps 2 and 3.
> Only when copying up a regular file was there an fsync after data copy up.
>
> This of course provides no guarantee about the state of the copied up dir
> after a crash, i.e. whether the directory is observed in upperdir with or
> without the attributes, but in reality this is how it has been since 2014,
> and for many local filesystems (e.g. xfs) there is little risk in this practice.
>
> It should be noted that overlayfs is quite picky about which filesystems
> are allowed as upper filesystems and specifically network filesystems
> are not allowed.
>
> > There are no real ordering requirements
> > anywhere in the Linux file system API, so it does sound like ovl
> > is making some assumptions by default?
>
> Correct. I would not say "making assumptions"; I would just say that
> overlayfs has never taken this aspect into account.
>
> > Are those documented somewhere?
>
> I guess not, but now that this commit introduces fsync=ordered,strict
> and a documentation section about them, it is a good opportunity
> to expand on this point. I will add that.
>
See modified documentation below:
Thanks,
Amir.
Durability and copy up
----------------------
The fsync(2) system call ensures that the data and metadata of a file
are safely written to the backing storage, so that the information is
expected to survive a system crash.
Without an fsync(2) call, there is no guarantee that the data observed
after a system crash will be either the old or the new data, but
in practice, it is often the old data, the new data, or a mix of both.
When an overlayfs file is modified for the first time, copy up will
create a copy of the lower file and its parent directories in the upper
layer. Since the Linux filesystem API does not enforce any particular
ordering on storing changes without explicit fsync(2) calls, in case
of a system crash, the upper file could end up with no data at all
(i.e. zeros), which would be an unusual outcome. To avoid it, overlayfs
calls fsync(2) on the upper file before completing data copy up with
rename(2), which makes the copy up "atomic".
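
As a rough user-space analogy of the data copy up sequence described above
(stage a copy, fsync it, then rename it into place), here is a minimal Python
sketch. It is not the overlayfs implementation; the function name
copy_up_file and its parameters are hypothetical, and it assumes that workdir
and the destination are on the same filesystem so that rename(2) is atomic.

```python
import os
import shutil
import tempfile


def copy_up_file(lower_path, upper_path, workdir):
    """Stage a copy of lower_path in workdir, fsync it, then rename it.

    Hypothetical helper: a user-space analogy of data copy up, not kernel code.
    """
    fd, tmp_path = tempfile.mkstemp(dir=workdir)
    try:
        with os.fdopen(fd, "wb") as tmp, open(lower_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
            tmp.flush()              # push Python's buffer to the kernel
            os.fsync(tmp.fileno())   # ask the kernel to persist the staged data
        # Only after the data is persisted is the file moved into place,
        # so the destination name never refers to a half-written file.
        os.rename(tmp_path, upper_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the fsync were skipped, the rename could reach stable storage before the
file contents do, which is the "file with no data" outcome described above.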
By default, overlayfs does not call fsync(2) on copied up directories,
so after a crash, a copied up directory could be observed in the upper
layer without some of its attributes. This has been the overlayfs
behavior since its introduction and it poses little risk in practice
for common local filesystems (e.g. ext4, xfs). This risk is further
mitigated by overlayfs restricting the upper layer to local filesystems
only (i.e. network filesystems are not allowed).
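
The directory case can be sketched in the same user-space style. The sketch
below is an assumption-laden illustration, not overlayfs code: copy_up_dir and
its strict flag are hypothetical names, only the mode and timestamps are
copied (xattrs are omitted for brevity), and it relies on Linux allowing
fsync(2) on a directory file descriptor.

```python
import os
import stat
import tempfile


def copy_up_dir(lower_dir, upper_dir, workdir, strict=False):
    """Stage a directory in workdir, copy attributes, then rename it.

    Hypothetical helper: with strict=False the staged directory is moved
    into place without fsync (the default behavior described above); with
    strict=True it is fsynced first.
    """
    st = os.stat(lower_dir)
    staged = tempfile.mkdtemp(dir=workdir)          # step 1: create in tmpdir
    try:
        os.chmod(staged, stat.S_IMODE(st.st_mode))  # step 2: copy attributes
        os.utime(staged, ns=(st.st_atime_ns, st.st_mtime_ns))
        if strict:
            fd = os.open(staged, os.O_RDONLY | os.O_DIRECTORY)
            try:
                os.fsync(fd)                        # persist before the move
            finally:
                os.close(fd)
        os.rename(staged, upper_dir)                # step 3: move into place
    except BaseException:
        if os.path.isdir(staged):
            os.rmdir(staged)
        raise
```

Without the fsync, nothing orders the attribute updates before the rename on
stable storage, so a crash may expose the directory without its attributes.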
Overlayfs can be tuned to prefer performance or durability when storing
to the underlying upper layer. This is controlled by the "fsync" mount
option, which supports these values:

- "ordered": (default)
    Call fsync(2) on upper file before completion of data copy up.
    No fsync(2) is called on directory or metadata-only copy up.
- "strict":
    Call fsync(2) on upper file and directories before completion of any
    copy up.
- "volatile": [*]
    Prefer performance over durability (see `Volatile mount`_)

[*] The mount option "volatile" is an alias for "fsync=volatile".
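
The three values can be read as a small decision table. The Python sketch
below only restates the list above; FsyncMode and fsync_before_rename are
hypothetical names, and the "volatile" row assumes, per the Volatile mount
section, that sync calls to the upper layer are omitted.

```python
from enum import Enum


class FsyncMode(Enum):
    ORDERED = "ordered"    # default
    STRICT = "strict"
    VOLATILE = "volatile"


def fsync_before_rename(mode, has_data):
    """Return True if the staged object is fsynced before it is moved into place.

    has_data is True for copy up of regular file data and False for
    directory or metadata-only copy up.
    """
    if mode is FsyncMode.STRICT:
        return True        # fsync before completion of any copy up
    if mode is FsyncMode.ORDERED:
        return has_data    # fsync only before completion of data copy up
    return False           # volatile: assumed to skip fsync entirely
```

For example, fsync_before_rename(FsyncMode.ORDERED, has_data=False) is False,
which corresponds to the default behavior for copied up directories.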