From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>,
chandan.babu@oracle.com, dchinner@redhat.com, hch@lst.de,
viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, catherine.hoang@oracle.com,
martin.petersen@oracle.com
Subject: Re: [PATCH v2 07/13] xfs: Introduce FORCEALIGN inode flag
Date: Fri, 12 Jul 2024 09:20:26 +1000
Message-ID: <ZpBouoiUpMgZtqMk@dread.disaster.area>
In-Reply-To: <20240711025958.GJ612460@frogsfrogsfrogs>
On Wed, Jul 10, 2024 at 07:59:58PM -0700, Darrick J. Wong wrote:
> On Fri, Jul 05, 2024 at 04:24:44PM +0000, John Garry wrote:
> > +/* Validate the forcealign inode flag */
> > +xfs_failaddr_t
> > +xfs_inode_validate_forcealign(
> > + struct xfs_mount *mp,
> > + uint32_t extsize,
> > + uint32_t cowextsize,
> > + uint16_t mode,
> > + uint16_t flags,
> > + uint64_t flags2)
> > +{
> > + bool rt = flags & XFS_DIFLAG_REALTIME;
> > +
> > + /* superblock rocompat feature flag */
> > + if (!xfs_has_forcealign(mp))
> > + return __this_address;
> > +
> > + /* Only regular files and directories */
> > + if (!S_ISDIR(mode) && !S_ISREG(mode))
> > + return __this_address;
> > +
> > + /* We require EXTSIZE or EXTSZINHERIT */
> > + if (!(flags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT)))
> > + return __this_address;
> > +
> > + /* We require a non-zero extsize */
> > + if (!extsize)
> > + return __this_address;
> > +
> > + /* Reflink'ed disallowed */
> > + if (flags2 & XFS_DIFLAG2_REFLINK)
> > + return __this_address;
>
> Hmm. If we don't support reflink + forcealign ATM, then shouldn't the
> superblock verifier or xfs_fs_fill_super fail the mount so that old
> kernels won't abruptly emit EFSCORRUPTED errors if a future kernel adds
> support for forcealign'd cow and starts writing out files with both
> iflags set?
I don't think we should error out the mount because reflink and
forcealign are enabled - that's going to be the common configuration
for every user of forcealign, right? I also don't think we should
throw a corruption error if both flags are set, either.
We're making an initial *implementation choice* not to implement the
two features on the same inode at the same time. We are not making
an on-disk format design decision that says "these two on-disk flags
are incompatible".
IOWs, if both are set on a current kernel, it's not corruption; it
means a more recent kernel that supports both flags has modified this
inode. Put simply, we have detected a ro-compat situation for this
specific inode.
Looking at it as a ro-compat situation rather than corruption,
what I would suggest we do is this:
1. Warn at mount that reflink+forcealign inodes will be treated
   as ro-compat inodes, i.e. read-only.
2. Prevent forcealign from being set if the shared extent flag is
   set on the inode.
3. Prevent shared extents from being created if the forcealign flag
   is set (i.e. ->remap_file_range() and anything else that relies on
   shared extents will fail on forcealign inodes).
4. If we read an inode with both set, we emit a warning and force
   the inode to be read-only so we don't screw up the forced alignment
   of the file (i.e. that inode operates in ro-compat mode).
#1 is the mount time warning of potential ro-compat behaviour.
#2 and #3 prevent both from getting set on existing kernels.
#4 is the ro-compat behaviour that would occur from taking a
filesystem that ran on a newer kernel that supports force-align+COW.
This avoids corruption shutdowns and modifications that would screw
up the alignment of the shared and COW'd extents.
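As a rough illustration of #2, #3 and #4, here is a minimal userspace
sketch. It is not XFS code: the demo_* names, the flag bit values and
the error codes are all assumptions made up for the example, and the
"inode" is just a toy struct. It only shows the shape of the checks,
i.e. that the two-flag combination forces read-only behaviour instead
of returning a corruption error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit values for illustration; not the real on-disk encoding. */
#define DEMO_DIFLAG2_REFLINK    (1ULL << 0)
#define DEMO_DIFLAG2_FORCEALIGN (1ULL << 1)
#define DEMO_BOTH (DEMO_DIFLAG2_REFLINK | DEMO_DIFLAG2_FORCEALIGN)

/* Toy stand-in for an in-core inode. */
struct demo_inode {
	uint64_t flags2;   /* flags as read from disk */
	bool     readonly; /* in-memory only: inode is in ro-compat mode */
};

/* #4: both flags set is not corruption; warn and force the inode read-only. */
static int demo_inode_read(struct demo_inode *ip)
{
	if ((ip->flags2 & DEMO_BOTH) == DEMO_BOTH) {
		fprintf(stderr,
			"demo: reflink+forcealign inode, treating as read-only\n");
		ip->readonly = true;
	}
	return 0; /* never a corruption error for this combination */
}

/* #2: refuse to set forcealign if the shared extent flag is already set. */
static int demo_set_forcealign(struct demo_inode *ip)
{
	if (ip->readonly)
		return -EROFS;  /* error code is an assumption */
	if (ip->flags2 & DEMO_DIFLAG2_REFLINK)
		return -EINVAL; /* error code is an assumption */
	ip->flags2 |= DEMO_DIFLAG2_FORCEALIGN;
	return 0;
}

/* #3: refuse to create shared extents on a forcealign inode. */
static int demo_share_extents(struct demo_inode *ip)
{
	if (ip->readonly)
		return -EROFS;      /* error code is an assumption */
	if (ip->flags2 & DEMO_DIFLAG2_FORCEALIGN)
		return -EOPNOTSUPP; /* error code is an assumption */
	ip->flags2 |= DEMO_DIFLAG2_REFLINK;
	return 0;
}

/* Usage: an inode written by a newer kernel with both flags set. */
static void demo_walkthrough(void)
{
	struct demo_inode newer = { .flags2 = DEMO_BOTH, .readonly = false };

	assert(demo_inode_read(&newer) == 0 && newer.readonly);
	assert(demo_share_extents(&newer) == -EROFS);
}
```

The point of keeping the read path free of errors is that an inode
with both flags stays readable; only modifications are refused.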
> That said, if the bs>ps patchset lands, then I think forcealign cow is
> a simple matter of setting the min folio order to the forcealign size
> and making sure that we always write out entire folios if any of the
> blocks cached by the folio is shared. Direct writes to forcealigned
> shared files probably has to be aligned to the forcealign size or fall
> back to buffered writes for cow.
Right, I think all the pieces we will need should fall into place in
the near future, so it doesn't seem right to me to refuse to mount
filesystems with reflink and forcealign both enabled right now and
then end up with a weird filesystem config needed to use forcealign
just for a couple of kernel releases...
-Dave.
--
Dave Chinner
david@fromorbit.com