From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: read only mounts with fsopen mount API are busted
Date: Thu, 24 Aug 2023 08:46:44 +1000 [thread overview]
Message-ID: <ZOaMVBGFLQ83Mrpc@dread.disaster.area> (raw)
In-Reply-To: <20230823221808.GF11263@frogsfrogsfrogs>
On Wed, Aug 23, 2023 at 03:18:08PM -0700, Darrick J. Wong wrote:
> On Thu, Aug 24, 2023 at 08:02:25AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > Recently xfs/513 started failing on my test machines testing "-o
> > ro,norecovery" mount options. This was being emitted in dmesg:
> >
> > [ 9906.932724] XFS (pmem0): no-recovery mounts must be read-only.
> >
> > Turns out, readonly mounts with the fsopen()/fsconfig() mount API
> > have been busted since day zero. It's only taken 5 years for debian
> > unstable to start using this "new" mount API, and shortly after this
> > I noticed xfs/513 had started to fail as per above.
> >
> > The syscall trace is:
> >
> > fsopen("xfs", FSOPEN_CLOEXEC) = 3
> > mount_setattr(-1, NULL, 0, NULL, 0) = -1 EINVAL (Invalid argument)
> > .....
> > fsconfig(3, FSCONFIG_SET_STRING, "source", "/dev/pmem0", 0) = 0
> > fsconfig(3, FSCONFIG_SET_FLAG, "ro", NULL, 0) = 0
> > fsconfig(3, FSCONFIG_SET_FLAG, "norecovery", NULL, 0) = 0
> > fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = -1 EINVAL (Invalid argument)
> > close(3) = 0
> >
> > Showing that the actual mount instantiation (FSCONFIG_CMD_CREATE) is
> > what threw out the error.
> >
> > During mount instantiation, we call xfs_fs_validate_params() which
> > does:
> >
> > /* No recovery flag requires a read-only mount */
> > if (xfs_has_norecovery(mp) && !xfs_is_readonly(mp)) {
> > xfs_warn(mp, "no-recovery mounts must be read-only.");
> > return -EINVAL;
> > }
> >
> > and xfs_is_readonly() checks internal mount flags for read only
> > state. This state is set in xfs_init_fs_context() from the
> > context superblock flag state:
> >
> > /*
> > * Copy binary VFS mount flags we are interested in.
> > */
> > if (fc->sb_flags & SB_RDONLY)
> > set_bit(XFS_OPSTATE_READONLY, &mp->m_opstate);
> >
> > With the old mount API, all of the VFS specific superblock flags
> > had already been parsed and set before xfs_init_fs_context() is
> > called, so this all works fine.
> >
> > However, in the brave new fsopen/fsconfig world,
> > xfs_init_fs_context() is called from fsopen() context, before any
> > VFS superblock have been set or parsed. Hence if we use fsopen(),
> > the internal XFS readonly state is *never set*. Hence anything that
> > depends on xfs_is_readonly() actually returning true for read only
> > mounts is broken if fsopen() has been used to mount the filesystem.
> >
> > Fix this by moving this internal state initialisation to
> > xfs_fs_fill_super() before we attempt to validate the parameters
> > that have been set prior to the FSCONFIG_CMD_CREATE call being made.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
>
> Huh. Wow. I would have expected to find /anything/ in fstests that
> exercises the new mount api, but:
>
> lax:~/cdev/work/fstests(0)> git grep -E '(fsconfig|fspick|fsopen)'
> lax:~/cdev/work/fstests(1)>
>
> What other weird things are lurking here?
For everyone not on #xfs on oftc, here's a summary of the last day
or so since I discovered the above problem.
New xfs/270 failures with the fsconfig/fsopen mount API tell us that
unknown RO-compat bit mounts have also been completely broken since
~2018.
xfs/270 has been shutting down with superblock corruption after
mounting with an unknown RO-compat bit since 2018, but the attempt
to remount rw detects the unknown RO-compat bit before the code
checks for a shutdown situation, and hence the test passes even on a
shut down filesystem. Hence the test passes, and nobody has noticed
that the RO feature compat functionality is completely broken....
Which then lead us to the fact we are using RO-compat bits for
reflink and rmapbt, which introduce new log items and so actually
have log incompat feature bit requirements. RO-compat bits
don't cover recovery of unknown log features, just prevent new
modifications of the filesystem post-mount. So allowing log recovery
when RO-compat feature bits are set is also broken because we
screwed up reflink/rmapbt feature bit definitions, and that's
effective zero-day bugs for those features.
And, well, we modelled the XFS RO-compat functionality on the ext4
feature bit behaviour, so.....
> Anyhow, I guess that's a side effect of xfs_mount mirroring some of the
> vfs state flags, so....
Yup. But I don't see a way to easily avoid that...
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Thanks!
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2023-08-23 22:47 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-23 22:02 [PATCH] xfs: read only mounts with fsopen mount API are busted Dave Chinner
2023-08-23 22:18 ` Darrick J. Wong
2023-08-23 22:46 ` Dave Chinner [this message]
-- strict thread matches above, loose matches on Subject: below --
2024-01-16 4:33 Dave Chinner
2024-01-16 5:21 ` Christoph Hellwig
2024-01-17 2:33 ` Ian Kent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZOaMVBGFLQ83Mrpc@dread.disaster.area \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox