From: Chenglong Tang <chenglongtang@google.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
    Christian Brauner <brauner@kernel.org>,
    linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org,
    Fei Lv <feilv@asrmicro.com>,
    stable@vger.kernel.org
Subject: Re: [PATCH] ovl: make fsync after metadata copy-up opt-in mount option
Date: Tue, 24 Mar 2026 11:04:12 -0700
Message-ID: <CAOdxtTY0jsqrJVXH=eQzYcowEpkDxwrk1DMgg8QD4ojygWJQ_Q@mail.gmail.com>
In-Reply-To: <20260324145750.90719-1-amir73il@gmail.com>

Hi,

Regarding the patch: because we are currently locked to the 6.12 LTS
kernel, this patch doesn't apply cleanly to our tree (due to missing
mainline dependencies like the str_on_off helper).

Since we are actively tracking this for a Google COS customer
escalation, do you have a rough timeline for when you expect this to
be merged into mainline and subsequently picked up by the 6.12 stable
queue?

We will officially pull it into the COS tree as soon as it lands in
linux-stable.

Thanks again for the excellent support,
Chenglong

On Tue, Mar 24, 2026 at 7:57 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> From: Fei Lv <feilv@asrmicro.com>
>
> Commit 7d6899fb69d25 ("ovl: fsync after metadata copy-up") was done to
> fix durability of overlayfs copy up on an upper filesystem which does
> not enforce ordering on storing of metadata changes (e.g. ubifs).
>
> In an earlier revision of the regressing commit by Fei Lv, the metadata
> fsync behavior was opt-in via a new "fsync=strict" mount option.
> We were hoping that the opt-in mount option could be avoided, so the
> change was only made to depend on metacopy=off, in the hope of not
> hurting performance of metadata heavy workloads, which are more likely
> to be using metacopy=on.
>
> This hope was proven wrong by a performance regression report from Google
> COS workload after upgrade to kernel 6.12.
>
> This is an adaptation of Fei's original "fsync=strict" mount option
> to the existing upstream code.
>
> The new mount option is mutually exclusive with the "volatile" mount
> option, so the latter is now an alias to the "fsync=volatile" mount
> option.
>
> Reported-by: Chenglong Tang <chenglongtang@google.com>
> Closes: https://lore.kernel.org/linux-unionfs/CAOdxtTadAFH01Vui1FvWfcmQ8jH1O45owTzUcpYbNvBxnLeM7Q@mail.gmail.com/
> Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxgKC1SgjMWre=fUb00v8rxtd6sQi-S+dxR8oDzAuiGu8g@mail.gmail.com/
> Fixes: 7d6899fb69d25 ("ovl: fsync after metadata copy-up")
> Depends: 50e638beb67e0 ("ovl: Use str_on_off() helper in ovl_show_options()")
> Cc: stable@vger.kernel.org # v6.12
> Signed-off-by: Fei Lv <feilv@asrmicro.com>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>
> Miklos,
>
> The linked conversation was concluded with:
> "Now we just need to hope that users won't come shouting about
> performance regressions."
>
> Well, users came shouting.
>
> I am going to queue this up for an explicit opt-in to strict
> metadata fsync.
>
> Your review comments on the original fsync=strict patch are
> already addressed by the upstream commit (no double fsync).
>
> Thanks,
> Amir.
>
>
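
For readers following the option semantics described in the commit
message above: the sketch below is a userspace model, not the kernel
code. It assumes a hypothetical apply_option() helper and FSYNC_* names
standing in for ovl_parse_param() and the OVL_FSYNC_* enum, and only
illustrates that "volatile" and "fsync=volatile" end up in the same
state because both write the single fsync_mode field.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-ins for the enum and config field added by the patch. */
enum { FSYNC_ORDERED, FSYNC_STRICT, FSYNC_VOLATILE };

struct config {
	int fsync_mode;	/* takes the place of the old "bool ovl_volatile" */
};

/* Indexed by the enum above, like the constant table in params.c. */
static const char *const fsync_names[] = { "ordered", "strict", "volatile" };

/*
 * Hypothetical helper (not a kernel function): apply one mount option
 * string to the config. Returns 0 on success, -1 if the option or its
 * value is not recognized.
 */
static int apply_option(struct config *c, const char *opt)
{
	if (strcmp(opt, "volatile") == 0) {
		c->fsync_mode = FSYNC_VOLATILE;	/* legacy alias */
		return 0;
	}
	if (strncmp(opt, "fsync=", 6) != 0)
		return -1;
	for (int i = 0; i < 3; i++) {
		if (strcmp(opt + 6, fsync_names[i]) == 0) {
			c->fsync_mode = i;
			return 0;
		}
	}
	return -1;
}

/* Name of the current mode, as it would appear after ",fsync=". */
static const char *fsync_mode_name(const struct config *c)
{
	return fsync_names[c->fsync_mode];
}
```

In this model a later option simply overwrites an earlier one, so
applying "volatile" and then "fsync=strict" leaves the mode at "strict".
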
> Documentation/filesystems/overlayfs.rst | 39 +++++++++++++++++++++++++
> fs/overlayfs/copy_up.c | 6 ++--
> fs/overlayfs/ovl_entry.h | 20 +++++++++++--
> fs/overlayfs/params.c | 32 ++++++++++++++++----
> fs/overlayfs/super.c | 2 +-
> 5 files changed, 88 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
> index af5a69f87da42..f9ef3d101c172 100644
> --- a/Documentation/filesystems/overlayfs.rst
> +++ b/Documentation/filesystems/overlayfs.rst
> @@ -783,6 +783,45 @@ controlled by the "uuid" mount option, which supports these values:
> mounted with "uuid=on".
>
>
> +Durability and copy up
> +----------------------
> +
> +The fsync(2) system call ensures that the data and metadata of a file
> +are safely written to the backing storage; fdatasync(2) does the same
> +for the data and only the metadata needed to read it back. Both are
> +expected to guarantee the existence of the information post system crash.
> +
> +Without the fdatasync(2) call, there is no guarantee that the observed
> +data after a system crash will be either the old or the new data, but
> +in practice, the observed data after crash is often the old or new data or a
> +mix of both.
> +
> +When an overlayfs file is modified for the first time, copy up will create
> +a copy of the lower file and its parent directories in the upper layer.
> +In case of a system crash, if fdatasync(2) was not called after the
> +modification, the upper file could end up with no data at all (i.e.
> +zeros), which would be an unusual outcome. To avoid this experience,
> +overlayfs calls fsync(2) on the upper file before completing the copy up with
> +rename(2) to make the copy up "atomic".
> +
> +Depending on the backing filesystem (e.g. ubifs), fsync(2) before
> +rename(2) may not be enough to provide the "atomic" copy up behavior
> +and fsync(2) on the copied up parent directories is required as well.
> +
> +Overlayfs can be tuned to prefer performance or durability when storing
> +to the underlying upper layer. This is controlled by the "fsync" mount
> +option, which supports these values:
> +
> +- "ordered": (default)
> + Call fsync(2) on upper file before completion of copy up.
> +- "strict":
> + Call fsync(2) on upper file and directories before completion of copy up.
> +- "volatile": [*]
> + Prefer performance over durability (see `Volatile mount`_)
> +
> +[*] The mount option "volatile" is an alias to "fsync=volatile".
> +
> +
> Volatile mount
> --------------
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 758611ee4475f..eca285a2d0c5b 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -1146,15 +1146,15 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> return -EOVERFLOW;
>
> /*
> - * With metacopy disabled, we fsync after final metadata copyup, for
> + * With "fsync=strict", we fsync after final metadata copyup, for
> * both regular files and directories to get atomic copyup semantics
> * on filesystems that do not use strict metadata ordering (e.g. ubifs).
> *
> - * With metacopy enabled we want to avoid fsync on all meta copyup
> + * By default, we want to avoid fsync on all meta copyup, because
> * that will hurt performance of workloads such as chown -R, so we
> * only fsync on data copyup as legacy behavior.
> */
> - ctx.metadata_fsync = !OVL_FS(dentry->d_sb)->config.metacopy &&
> + ctx.metadata_fsync = ovl_should_sync_strict(OVL_FS(dentry->d_sb)) &&
> (S_ISREG(ctx.stat.mode) || S_ISDIR(ctx.stat.mode));
> ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);
>
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 1d4828dbcf7ac..dbb2242647ce4 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -5,6 +5,12 @@
> * Copyright (C) 2016 Red Hat, Inc.
> */
>
> +enum {
> + OVL_FSYNC_ORDERED,
> + OVL_FSYNC_STRICT,
> + OVL_FSYNC_VOLATILE,
> +};
> +
> struct ovl_config {
> char *upperdir;
> char *workdir;
> @@ -18,7 +24,7 @@ struct ovl_config {
> int xino;
> bool metacopy;
> bool userxattr;
> - bool ovl_volatile;
> + int fsync_mode;
> };
>
> struct ovl_sb {
> @@ -122,7 +128,17 @@ static inline struct ovl_fs *OVL_FS(struct super_block *sb)
>
> static inline bool ovl_should_sync(struct ovl_fs *ofs)
> {
> - return !ofs->config.ovl_volatile;
> + return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
> +}
> +
> +static inline bool ovl_should_sync_strict(struct ovl_fs *ofs)
> +{
> + return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
> +}
> +
> +static inline bool ovl_is_volatile(struct ovl_config *config)
> +{
> + return config->fsync_mode == OVL_FSYNC_VOLATILE;
> }
>
> static inline unsigned int ovl_numlower(struct ovl_entry *oe)
> diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
> index 8111b437ae5d9..ba860bb92439a 100644
> --- a/fs/overlayfs/params.c
> +++ b/fs/overlayfs/params.c
> @@ -58,6 +58,7 @@ enum ovl_opt {
> Opt_xino,
> Opt_metacopy,
> Opt_verity,
> + Opt_fsync,
> Opt_volatile,
> Opt_override_creds,
> };
> @@ -140,6 +141,23 @@ static int ovl_verity_mode_def(void)
> return OVL_VERITY_OFF;
> }
>
> +static const struct constant_table ovl_parameter_fsync[] = {
> + { "ordered", OVL_FSYNC_ORDERED },
> + { "strict", OVL_FSYNC_STRICT },
> + { "volatile", OVL_FSYNC_VOLATILE },
> + {}
> +};
> +
> +static const char *ovl_fsync_mode(struct ovl_config *config)
> +{
> + return ovl_parameter_fsync[config->fsync_mode].name;
> +}
> +
> +static int ovl_fsync_mode_def(void)
> +{
> + return OVL_FSYNC_ORDERED;
> +}
> +
> const struct fs_parameter_spec ovl_parameter_spec[] = {
> fsparam_string_empty("lowerdir", Opt_lowerdir),
> fsparam_file_or_string("lowerdir+", Opt_lowerdir_add),
> @@ -155,6 +173,7 @@ const struct fs_parameter_spec ovl_parameter_spec[] = {
> fsparam_enum("xino", Opt_xino, ovl_parameter_xino),
> fsparam_enum("metacopy", Opt_metacopy, ovl_parameter_bool),
> fsparam_enum("verity", Opt_verity, ovl_parameter_verity),
> + fsparam_enum("fsync", Opt_fsync, ovl_parameter_fsync),
> fsparam_flag("volatile", Opt_volatile),
> fsparam_flag_no("override_creds", Opt_override_creds),
> {}
> @@ -665,8 +684,11 @@ static int ovl_parse_param(struct fs_context *fc, struct fs_parameter *param)
> case Opt_verity:
> config->verity_mode = result.uint_32;
> break;
> + case Opt_fsync:
> + config->fsync_mode = result.uint_32;
> + break;
> case Opt_volatile:
> - config->ovl_volatile = true;
> + config->fsync_mode = OVL_FSYNC_VOLATILE;
> break;
> case Opt_userxattr:
> config->userxattr = true;
> @@ -870,9 +892,9 @@ int ovl_fs_params_verify(const struct ovl_fs_context *ctx,
> config->index = false;
> }
>
> - if (!config->upperdir && config->ovl_volatile) {
> + if (!config->upperdir && ovl_is_volatile(config)) {
> pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
> - config->ovl_volatile = false;
> + config->fsync_mode = ovl_fsync_mode_def();
> }
>
> if (!config->upperdir && config->uuid == OVL_UUID_ON) {
> @@ -1070,8 +1092,8 @@ int ovl_show_options(struct seq_file *m, struct dentry *dentry)
> seq_printf(m, ",xino=%s", ovl_xino_mode(&ofs->config));
> if (ofs->config.metacopy != ovl_metacopy_def)
> seq_printf(m, ",metacopy=%s", str_on_off(ofs->config.metacopy));
> - if (ofs->config.ovl_volatile)
> - seq_puts(m, ",volatile");
> + if (ofs->config.fsync_mode != ovl_fsync_mode_def())
> + seq_printf(m, ",fsync=%s", ovl_fsync_mode(&ofs->config));
> if (ofs->config.userxattr)
> seq_puts(m, ",userxattr");
> if (ofs->config.verity_mode != ovl_verity_mode_def())
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index d4c12feec0392..0822987cfb51c 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -776,7 +776,7 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs,
> * For volatile mount, create a incompat/volatile/dirty file to keep
> * track of it.
> */
> - if (ofs->config.ovl_volatile) {
> + if (ovl_is_volatile(&ofs->config)) {
> err = ovl_create_volatile_dirty(ofs);
> if (err < 0) {
> pr_err("Failed to create volatile/dirty file.\n");
> --
> 2.53.0
>