From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Zorro Lang <zlang@redhat.com>,
linux-xfs@vger.kernel.org, fstests@vger.kernel.org,
Carlos Maiolino <carlos@maiolino.me>
Subject: Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
Date: Thu, 9 Nov 2023 20:32:56 -0800
Message-ID: <20231110043256.GI1205143@frogsfrogsfrogs>
In-Reply-To: <ZU2PhTKqwNEbjK13@dread.disaster.area>

On Fri, Nov 10, 2023 at 01:03:49PM +1100, Dave Chinner wrote:
> On Fri, Nov 10, 2023 at 09:36:51AM +0800, Zorro Lang wrote:
> > The g/047 still fails with this 2nd patch. So I did below steps [1],
> > and get the trace output as [2], those dump_inodes() messages you
> > added have been printed, please check.
>
> And that points me at the bug.
>
> dump_inodes: disk ino 0x83: init nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
> dump_inodes: log ino 0x83: init nblocks 0x8 nextents 0x0/0x1 anextents 0x0/0x0 v3_pad 0x1 nrext64_pad 0x0 di_flags2 0x18 big
> ^^^^^^^
> The initial log inode is correct.
>
> dump_inodes: disk ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
> dump_inodes: log ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 big
> ^^^^^^^
>
> .... but on the second sample, it's been modified and the extent
> count has been zeroed? Huh, that is unexpected - what did that?
>
> Oh.
>
> Can you test the patch below and see if it fixes the issue? Keep
> the first verifier patch I sent, then apply the patch below. You can
> drop the debug traceprintk patch - the patch below should fix it.
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> xfs: recovery should not clear di_flushiter unconditionally
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Because on v3 inodes, di_flushiter doesn't exist. It overlaps with
> zero padding in the inode, except when NREXT64=1 configurations are
> in use and the zero padding is no longer padding but holds the 64
> bit extent counter.
>
> This manifests obviously on big endian platforms (e.g. s390) because
> the log dinode is in host order and the overlap is the LSBs of the
> extent count field. It is not noticed on little endian machines
> because the overlap is at the MSB end of the extent count field and
> we need to get more than 2^48 extents in the inode before it
> manifests. i.e. the heat death of the universe will occur before we
> see the problem in little endian machines.
>
> This is a zero-day issue for NREXT64=1 configurations on big endian
> machines. Fix it by only clearing di_flushiter on v2 inodes during
> recovery.
>
> Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
> cc: stable@kernel.org # 5.19+
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++---------------
> 1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index f4c31c2b60d5..dbdab4ce7c44 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -371,24 +371,26 @@ xlog_recover_inode_commit_pass2(
> * superblock flag to determine whether we need to look at di_flushiter
> * to skip replay when the on disk inode is newer than the log one
> */
> - if (!xfs_has_v3inodes(mp) &&
> - ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> - /*
> - * Deal with the wrap case, DI_MAX_FLUSH is less
> - * than smaller numbers
> - */
> - if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> - ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> - /* do nothing */
> - } else {
> - trace_xfs_log_recover_inode_skip(log, in_f);
> - error = 0;
> - goto out_release;
> + if (!xfs_has_v3inodes(mp)) {
> + if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> + /*
> + * Deal with the wrap case, DI_MAX_FLUSH is less
> + * than smaller numbers
> + */
> + if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> + ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> + /* do nothing */
> + } else {
> + trace_xfs_log_recover_inode_skip(log, in_f);
> + error = 0;
> + goto out_release;
> + }
> }
> +
> + /* Take the opportunity to reset the flush iteration count */
> + ldip->di_flushiter = 0;
> }
>
> - /* Take the opportunity to reset the flush iteration count */
> - ldip->di_flushiter = 0;

That's an unfortunate logic bomb left over from the V5 introduction.  I
guess it was benign until we finally reused that part of the xfs_dinode.

If this fixes Zorro's machine, then:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

I wonder, if we made XFS_SUPPORT_V4=n turn xfs_has_v3inodes and
xfs_has_crc into #define'd true symbols, could we then rename all the
V4-only fields to see what stops compiling?

(Probably not, gcc will still want to parse it all even if the source
code itself is dead...)

--D
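
For reference, the control flow after the patch can be sketched in
isolation roughly as follows.  This is a simplified userspace model, not
the kernel function: the names replay_is_stale() and check_flushiter()
and the has_v3inodes parameter are hypothetical, both counters are taken
in host order (the kernel converts the on-disk value with
be16_to_cpu()), and DI_MAX_FLUSH is assumed to be 0xffff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the kernel's DI_MAX_FLUSH constant. */
#define DI_MAX_FLUSH 0xffff

/*
 * Return true when the on-disk inode looks newer than the logged one,
 * treating "disk at DI_MAX_FLUSH, log in the lower half" as a wrap.
 */
static bool replay_is_stale(uint16_t log_flushiter, uint16_t disk_flushiter)
{
	if (log_flushiter >= disk_flushiter)
		return false;
	if (disk_flushiter == DI_MAX_FLUSH &&
	    log_flushiter < (DI_MAX_FLUSH >> 1))
		return false;	/* counter wrapped, log copy is newer */
	return true;
}

/*
 * Return true if replay of this inode should be skipped.  The counter
 * is only examined and reset on the V2 path; on V3 inodes the bytes are
 * left untouched because they may hold part of the extent count.
 */
bool check_flushiter(bool has_v3inodes, uint16_t *log_flushiter,
		     uint16_t disk_flushiter)
{
	if (has_v3inodes)
		return false;
	if (replay_is_stale(*log_flushiter, disk_flushiter))
		return true;
	*log_flushiter = 0;	/* reset only where the field exists */
	return false;
}
```

The point of the restructuring is visible in check_flushiter(): the
reset sits inside the V2-only branch instead of running for every inode.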
>
> if (unlikely(S_ISREG(ldip->di_mode))) {
> if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
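
To make the overlap described in the commit message concrete, here is a
minimal, host-independent sketch.  It assumes a simplified model in
which the 16-bit di_flushiter occupies the last two bytes of the same
8-byte slot that holds the 64-bit extent counter.  FLUSHITER_OFFSET and
all helper names (store_be64(), nextents_after_clear_be(), and so on)
are made up for illustration and are not kernel code.

```c
#include <stdint.h>

/* Assumed layout: six pad bytes, then the 16-bit flush counter. */
#define FLUSHITER_OFFSET 6

/* Lay out v most significant byte first, as a big endian host would. */
static void store_be64(uint8_t buf[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		buf[i] = (uint8_t)(v >> (56 - 8 * i));
}

static uint64_t load_be64(const uint8_t buf[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | buf[i];
	return v;
}

/* Lay out v least significant byte first, as a little endian host would. */
static void store_le64(uint8_t buf[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		buf[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t load_le64(const uint8_t buf[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | buf[i];
	return v;
}

/* Model of "ldip->di_flushiter = 0" acting on the shared 8 bytes. */
static void clear_flushiter(uint8_t buf[8])
{
	buf[FLUSHITER_OFFSET] = 0;
	buf[FLUSHITER_OFFSET + 1] = 0;
}

/* Extent count seen after the clear when the slot is in big endian order. */
uint64_t nextents_after_clear_be(uint64_t nextents)
{
	uint8_t buf[8];

	store_be64(buf, nextents);
	clear_flushiter(buf);
	return load_be64(buf);
}

/* Same, but with the slot in little endian order. */
uint64_t nextents_after_clear_le(uint64_t nextents)
{
	uint8_t buf[8];

	store_le64(buf, nextents);
	clear_flushiter(buf);
	return load_le64(buf);
}
```

Under these assumptions an extent count of 1 comes back as 0 in the big
endian layout, which matches the 0x0/0x1 -> 0x0/0x0 change in the
dump_inodes output, while the little endian layout only loses bits 48
and above.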