From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, cem@kernel.org
Subject: Re: [PATCH 0/3] xfs: miscellaneous bug fixes
Date: Wed, 13 Nov 2024 12:09:58 +1100
Message-ID: <ZzP8ZkVpa3S3G8v8@dread.disaster.area>
In-Reply-To: <20241112235946.GJ9438@frogsfrogsfrogs>

On Tue, Nov 12, 2024 at 03:59:46PM -0800, Darrick J. Wong wrote:
> On Wed, Nov 13, 2024 at 09:05:13AM +1100, Dave Chinner wrote:
> > These are three bug fixes for recent issues.
> >
> > The first is a repost of the original patch to prevent allocation of
> > sparse inode clusters at the end of an unaligned runt AG. There
> > was plenty of discussion over that fix here:
> >
> > https://lore.kernel.org/linux-xfs/20241024025142.4082218-1-david@fromorbit.com/
> >
> > And the outcome of that discussion is that we can't allow sparse
> > inode clusters overlapping the end of the runt AG without an on disk
> > format definition change. Hence this patch to ensure the check is
> > done correctly is the only change we need to make to the kernel to
> > avoid this problem in the future.
> >
> > Filesystems that have this problem on disk will need to run
> > xfs_repair to remove the bad cluster, but no data loss is possible
> > from this because the kernel currently disallows inode allocation
> > from the bad cluster and so none of the inodes in the sparse cluster
> > can actually be used. Hence there is no possible data loss or other
> > metadata corruption possible from this situation, all we need to do
> > is ensure that it doesn't happen again once repair has done its
> > work.
>
> <shrug> How many systems are in this state?

Some. Maybe many. Unfortunately the number is largely
unquantifiable. However, when it happens it dumps corruption reports
in the log, so I'd say that there aren't that many of them
out there because we aren't getting swamped with corruption reports.

> Would those users rather we
> fix the validation code in repair/scrub/wherever to allow ichunks that
> overrun the end of a runt AG?

Uh, the previous discussion ended at "considering inode chunks
overlapping the end of the runt AG as valid requires an incompat
feature flag as older kernels cannot access inodes in that
location". i.e. older kernels will flag those inodes as corrupt if
we don't add an incompat feature flag to indicate they are valid.

At that point, users are forced to upgrade their userspace tools to
do anything with a filesystem that the kernel added the new incompat
feature flag to on upgrade.

That's a much worse situation, because they might not realise they
need to upgrade all the userspace tools and disaster recovery
utilities to handle this new format that the kernel upgrade
introduced....

The repair/scrub/whatever code already detects and fixes the issue by
removing the bad cluster from the runt AG. We just need to stop the
kernel creating the bad clusters again.

IOWs, it's just simpler for everyone to fix the bug like this and
continue to consider the sparse inode cluster at the end of the AG
as invalid.

Alternatively, if users can grow the block device, then they can
simply round up the size of the block device to a whole inode
chunk. They don't need to run repair to fix the issue; the cluster
is now valid because a whole chunk will fit at the end of the runt AG.
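
As a rough illustration of that rounding (a sketch only, not taken from
the patch), the arithmetic looks something like the following. It
assumes 64 inodes per chunk, an inode size no larger than the block
size, and that inode chunks must start on a chunk-sized boundary within
the AG. The function names and the example geometry are hypothetical.

```python
INODES_PER_CHUNK = 64  # XFS allocates inodes in chunks of 64


def chunk_blocks(block_size: int, inode_size: int) -> int:
    """Filesystem blocks occupied by one full inode chunk."""
    inodes_per_block = block_size // inode_size
    # Ceiling division: a chunk always covers at least one block.
    return -(-INODES_PER_CHUNK // inodes_per_block)


def rounded_device_blocks(dblocks: int, agblocks: int,
                          block_size: int, inode_size: int) -> int:
    """Smallest block count >= dblocks where the runt AG ends on a
    whole-chunk boundary (assumes chunk-sized alignment in the AG)."""
    per_chunk = chunk_blocks(block_size, inode_size)
    runt = dblocks % agblocks
    if runt == 0:
        return dblocks  # last AG is full sized, there is no runt AG
    full_ags = dblocks - runt
    # Round the runt AG length up to a multiple of the chunk size,
    # but never beyond the size of a full AG.
    rounded_runt = -(-runt // per_chunk) * per_chunk
    return full_ags + min(rounded_runt, agblocks)


# Hypothetical geometry: 4096 byte blocks, 512 byte inodes,
# 250000 block AGs, and a device of 1100003 blocks.
new_size = rounded_device_blocks(1_100_003, 250_000, 4096, 512)
print(new_size)
```

With that geometry a chunk is 8 blocks, the runt AG is 100003 blocks
long, and rounding it up to 100008 blocks gives a device size of
1100008 blocks.
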
-Dave.
--
Dave Chinner
david@fromorbit.com