* [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
From: Zorro Lang @ 2023-10-29  4:11 UTC
To: linux-xfs; +Cc: fstests
Hi xfs list,
Recently I have been consistently hitting XFS corruption when running fstests
generic/047 [1], and it shows more failures in dmesg [2], e.g.:
XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
The .full output contains lots of corruption messages [3].
So far I have only hit this issue on s390x; I haven't hit it on x86_64,
ppc64le or aarch64. The mkfs options aren't a factor, and even overlayfs
(based on XFS) still hits this issue.
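
By the way, decoding the corruption report below: br_startblock -2 is
HOLESTARTBLOCK, and, assuming the default 4k directory block size used here,
bno 8388608 is exactly the first directory leaf block (the leaf space starts
at a 32GiB logical offset). So the lookup is tripping over a hole right where
the directory's leaf/node blocks should start. A rough sketch of that
arithmetic (constants from my reading of the libxfs headers; the 4096-byte
block size is an assumption):

#include <stdio.h>

#define HOLESTARTBLOCK          (-2LL)          /* br_startblock -2 == a hole */
#define XFS_DIR2_LEAF_OFFSET    (32ULL << 30)   /* leaf space starts at 32 GiB */

int main(void)
{
        unsigned long long blksz = 4096;        /* assumed dir block size */

        /* first leaf block in the directory's logical address space */
        printf("first leaf bno = %llu\n", XFS_DIR2_LEAF_OFFSET / blksz);
        /* prints 8388608, matching "xfs_dabuf_map: bno 8388608" below */
        return 0;
}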
I tested on mainline Linux 6.6.0-rc7+; the HEAD commit is:
commit 750b95887e567848ac2c851dae47922cac6db946
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Oct 26 20:42:02 2023 -1000
Merge tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm
Thanks,
Zorro
[1]
FSTYP -- xfs (debug)
PLATFORM -- Linux/s390x s390x-kvm-091 6.6.0-rc7+ #1 SMP Fri Oct 27 08:58:03 EDT 2023
MKFS_OPTIONS -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR
generic/047 _check_xfs_filesystem: filesystem on /dev/loop1 failed scrub
(see /var/lib/xfstests/results//generic/047.full for details)
_check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
(see /var/lib/xfstests/results//generic/047.full for details)
_check_dmesg: something found in dmesg (see /var/lib/xfstests/results//generic/047.dmesg)
- output mismatch (see /var/lib/xfstests/results//generic/047.out.bad)
--- tests/generic/047.out 2023-10-27 09:04:38.145351816 -0400
+++ /var/lib/xfstests/results//generic/047.out.bad 2023-10-27 09:08:12.475381462 -0400
@@ -1 +1,1000 @@
QA output created by 047
+file /mnt/fstests/SCRATCH_DIR/1 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/2 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/3 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/4 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/5 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/6 missing - fsync failed
...
(Run 'diff -u /var/lib/xfstests/tests/generic/047.out /var/lib/xfstests/results//generic/047.out.bad' to see the entire diff)
Ran: generic/047
Failures: generic/047
Failed 1 of 1 tests
[2]
[ 376.468885] run fstests generic/047 at 2023-10-27 09:08:07
[ 376.675751] XFS (loop1): Mounting V5 Filesystem 716c9687-ee74-4c12-b6ad-a0b513194f2b
[ 376.677088] XFS (loop1): Ending clean mount
[ 376.678189] XFS (loop1): User initiated shutdown received.
[ 376.678194] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem.
[ 376.678409] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
[ 376.679423] XFS (loop1): Unmounting Filesystem 716c9687-ee74-4c12-b6ad-a0b513194f2b
[ 376.714910] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[ 376.716353] XFS (loop1): Ending clean mount
[ 380.375878] XFS (loop1): User initiated shutdown received.
[ 380.375888] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem.
[ 380.376101] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
[ 380.380373] XFS (loop1): Unmounting Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[ 380.383835] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[ 380.397086] XFS (loop1): Starting recovery (logdev: internal)
[ 380.465934] XFS (loop1): Ending recovery (logdev: internal)
[ 380.467409] XFS (loop1): Unmounting Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[ 380.475431] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[ 380.477235] XFS (loop1): Ending clean mount
[ 380.477500] XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
[ 380.477636] CPU: 0 PID: 337362 Comm: 047 Kdump: loaded Tainted: G W 6.6.0-rc7+ #1
[ 380.477639] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[ 380.477641] Call Trace:
[ 380.477642] [<0000000032d71372>] dump_stack_lvl+0x62/0x80
[ 380.477648] [<000003ff7ff96c00>] xfs_corruption_error+0x70/0xa0 [xfs]
[ 380.477762] [<000003ff7ff551ce>] xfs_dabuf_map.constprop.0+0x2a6/0x368 [xfs]
[ 380.477871] [<000003ff7ff5773e>] xfs_da_read_buf+0x6e/0x128 [xfs]
[ 380.477977] [<000003ff7ff57838>] xfs_da3_node_read+0x40/0x78 [xfs]
[ 380.478085] [<000003ff7ff58c7a>] xfs_da3_node_lookup_int+0x82/0x558 [xfs]
[ 380.478193] [<000003ff7ff68d6e>] xfs_dir2_node_lookup+0x3e/0x140 [xfs]
[ 380.478302] [<000003ff7ff5cfd2>] xfs_dir_lookup+0x1ea/0x218 [xfs]
[ 380.478410] [<000003ff7fface40>] xfs_lookup+0x60/0x108 [xfs]
[ 380.478525] [<000003ff7ffa9d42>] xfs_vn_lookup+0x62/0x98 [xfs]
[ 380.478641] [<0000000032738cfa>] __lookup_slow+0x9a/0x148
[ 380.478647] [<000000003273dc86>] walk_component+0x126/0x1b8
[ 380.478650] [<000000003273e948>] path_lookupat+0x88/0x1e8
[ 380.478653] [<000000003273f66a>] filename_lookup+0xaa/0x198
[ 380.478656] [<00000000327306d0>] vfs_statx+0x90/0x160
[ 380.478659] [<0000000032730a1e>] vfs_fstatat+0x86/0xe8
[ 380.478661] [<0000000032730c98>] __do_sys_newfstatat+0x28/0x48
[ 380.478663] [<0000000032d97060>] __do_syscall+0x1d0/0x1f8
[ 380.478668] [<0000000032da7040>] system_call+0x70/0x98
[ 380.478672] XFS (loop1): Corruption detected. Unmount and run xfs_repair
[ 380.478674] XFS (loop1): xfs_dabuf_map: bno 8388608 inode 128
[ 380.478676] XFS (loop1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
... (the same internal error, corruption report and call trace repeat many more times for PID 337362) ...
[ 381.349174] XFS (loop1): Corruption detected. Unmount and run xfs_repair
[ 381.349175] XFS (loop1): xfs_dabuf_map: bno 8388608 inode 128
[ 381.349177] XFS (loop1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
[ 381.350972] XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
[ 381.351085] CPU: 1 PID: 339644 Comm: xfs_scrub Kdump: loaded Tainted: G W 6.6.0-rc7+ #1
[ 381.351088] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[ 381.351090] Call Trace:
[ 381.351091] [<0000000032d71372>] dump_stack_lvl+0x62/0x80
[ 381.351093] [<000003ff7ff96c00>] xfs_corruption_error+0x70/0xa0 [xfs]
[ 381.351191] [<000003ff7ff551ce>] xfs_dabuf_map.constprop.0+0x2a6/0x368 [xfs]
[ 381.351283] [<000003ff7ff5773e>] xfs_da_read_buf+0x6e/0x128 [xfs]
[ 381.351375] [<000003ff7ff57838>] xfs_da3_node_read+0x40/0x78 [xfs]
[ 381.351467] [<000003ff7ff58c7a>] xfs_da3_node_lookup_int+0x82/0x558 [xfs]
[ 381.351558] [<000003ff7ff68d6e>] xfs_dir2_node_lookup+0x3e/0x140 [xfs]
[ 381.351651] [<000003ff80008436>] xchk_dir_lookup+0x13e/0x1e0 [xfs]
[ 381.351745] [<000003ff800078c2>] xchk_parent+0x82/0x168 [xfs]
[ 381.351840] [<000003ff8000aa1a>] xfs_scrub_metadata+0x1c2/0x420 [xfs]
[ 381.351938] [<000003ff7ffa3876>] xfs_ioc_scrub_metadata+0x5e/0xb0 [xfs]
[ 381.352035] [<000003ff7ffa5d92>] xfs_file_ioctl+0x672/0x9f8 [xfs]
[ 381.352133] [<0000000032744c9e>] __s390x_sys_ioctl+0xbe/0x100
[ 381.352135] [<0000000032d97060>] __do_syscall+0x1d0/0x1f8
[ 381.352138] [<0000000032da7040>] system_call+0x70/0x98
[ 381.352141] XFS (loop1): Corruption detected. Unmount and run xfs_repair
[ 381.352143] XFS (loop1): xfs_dabuf_map: bno 8388608 inode 128
[ 381.352174] XFS (loop1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
[ 381.418153] XFS (loop1): Unmounting Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
[3]
_check_xfs_filesystem: filesystem on /dev/loop1 failed scrub
*** xfs_scrub -v -d -n output ***
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt/fstests/SCRATCH_DIR: using 2 threads to scrub.
Phase 2: Check internal metadata.
Info: AG 1 superblock: Optimization is possible. (scrub.c line 212)
Info: AG 2 superblock: Optimization is possible. (scrub.c line 212)
Info: AG 3 superblock: Optimization is possible. (scrub.c line 212)
Phase 3: Scan all inodes.
Corruption: inode 128 (0/128) inode record: Repairs are required. (scrub.c line 196)
Corruption: inode 128 (0/128) parent pointer: Repairs are required. (scrub.c line 196)
Corruption: inode 131 (0/131) inode record: Repairs are required. (scrub.c line 196)
...
Corruption: inode 58300 (0/58300) inode record: Repairs are required. (scrub.c line 196)
Corruption: inode 58301 (0/58301) inode record: Repairs are required. (scrub.c line 196)
Corruption: inode 58302 (0/58302) inode record: Repairs are required. (scrub.c line 196)
Corruption: inode 58303 (0/58303) inode record: Repairs are required. (scrub.c line 196)
Phase 5: Check directory tree.
Info: /mnt/fstests/SCRATCH_DIR: Filesystem has errors, skipping connectivity checks. (phase5.c line 393)
Phase 7: Check summary counters.
198.9MiB data used; 1.0K inodes used.
95.9MiB data found; 1.0K inodes found.
1.0K inodes counted; 1.0K inodes checked.
/mnt/fstests/SCRATCH_DIR: corruptions found: 1001
/mnt/fstests/SCRATCH_DIR: Re-run xfs_scrub without -n.
*** end xfs_scrub output
_check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
bad nblocks 9 for inode 128, would reset to 0
no . entry for directory 128
no .. entry for root directory 128
problem with directory contents in inode 128
would clear root inode 128
bad nblocks 8 for inode 131, would reset to 0
bad nblocks 8 for inode 132, would reset to 0
bad nblocks 8 for inode 133, would reset to 0
...
bad nblocks 8 for inode 62438, would reset to 0
bad nblocks 8 for inode 62439, would reset to 0
bad nblocks 8 for inode 62440, would reset to 0
bad nblocks 8 for inode 62441, would reset to 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
root inode would be lost
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
bad nblocks 9 for inode 128, would reset to 0
- agno = 2
- agno = 3
no . entry for directory 128
no .. entry for root directory 128
problem with directory contents in inode 128
would clear root inode 128
bad nblocks 8 for inode 131, would reset to 0
bad nblocks 8 for inode 132, would reset to 0
bad nblocks 8 for inode 133, would reset to 0
...
bad nblocks 8 for inode 62439, would reset to 0
bad nblocks 8 for inode 62440, would reset to 0
bad nblocks 8 for inode 62441, would reset to 0
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
would reinitialize root directory
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
disconnected inode 131, would move to lost+found
disconnected inode 132, would move to lost+found
disconnected inode 133, would move to lost+found
...
disconnected inode 62439, would move to lost+found
disconnected inode 62440, would move to lost+found
disconnected inode 62441, would move to lost+found
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output
*** mount output ***
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=4096k,nr_inodes=986186,mode=755,inode64)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,size=1592372k,nr_inodes=819200,mode=755,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
/dev/mapper/rhel_s390x--kvm--091-root on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,nosuid,noexec,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=29,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=3937)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime,seclabel)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime,seclabel)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime,seclabel)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=1M)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
/dev/vda1 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/mapper/rhel_s390x--kvm--091-home on /home type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
ramfs on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=796184k,nr_inodes=199046,mode=700,inode64)
*** end mount output
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Dave Chinner @ 2023-11-06  6:13 UTC
To: Zorro Lang; +Cc: linux-xfs, fstests

On Sun, Oct 29, 2023 at 12:11:22PM +0800, Zorro Lang wrote:
> Hi xfs list,
>
> Recently I have been consistently hitting XFS corruption when running fstests
> generic/047 [1], and it shows more failures in dmesg [2], e.g.:

OK, g/047 is an fsync test.

>
> XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

Ok, a directory block index translated to a hole in the file
mapping. That's bad...

> [2]
> [ 376.468885] run fstests generic/047 at 2023-10-27 09:08:07
> [ 376.675751] XFS (loop1): Mounting V5 Filesystem 716c9687-ee74-4c12-b6ad-a0b513194f2b
> [ 376.677088] XFS (loop1): Ending clean mount
> [ 376.678189] XFS (loop1): User initiated shutdown received.
> [ 376.678194] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem.
> [ 376.678409] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
> [ 376.679423] XFS (loop1): Unmounting Filesystem 716c9687-ee74-4c12-b6ad-a0b513194f2b
> [ 376.714910] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
> [ 376.716353] XFS (loop1): Ending clean mount

Files are created and fsync'd here.

> [ 380.375878] XFS (loop1): User initiated shutdown received.
> [ 380.375888] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem.

Then the fs is shut down.

> [ 380.376101] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
> [ 380.380373] XFS (loop1): Unmounting Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
> [ 380.383835] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
> [ 380.397086] XFS (loop1): Starting recovery (logdev: internal)
> [ 380.465934] XFS (loop1): Ending recovery (logdev: internal)

Then it is recovered....

> [ 380.467409] XFS (loop1): Unmounting Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
> [ 380.475431] XFS (loop1): Mounting V5 Filesystem 40196bb2-39f4-4c32-83ef-567f42216699
> [ 380.477235] XFS (loop1): Ending clean mount
> [ 380.477500] XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

.... and now the directory is bad.

> [ 380.477636] CPU: 0 PID: 337362 Comm: 047 Kdump: loaded Tainted: G W 6.6.0-rc7+ #1
> [ 380.477639] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
> [ 380.477641] Call Trace:
> [ 380.477642] [<0000000032d71372>] dump_stack_lvl+0x62/0x80
> [ 380.477648] [<000003ff7ff96c00>] xfs_corruption_error+0x70/0xa0 [xfs]
> [ 380.477762] [<000003ff7ff551ce>] xfs_dabuf_map.constprop.0+0x2a6/0x368 [xfs]
> [ 380.477871] [<000003ff7ff5773e>] xfs_da_read_buf+0x6e/0x128 [xfs]
> [ 380.477977] [<000003ff7ff57838>] xfs_da3_node_read+0x40/0x78 [xfs]
> [ 380.478085] [<000003ff7ff58c7a>] xfs_da3_node_lookup_int+0x82/0x558 [xfs]
> [ 380.478193] [<000003ff7ff68d6e>] xfs_dir2_node_lookup+0x3e/0x140 [xfs]

So it's supposed to be in node format, which means enough blocks to
have an external free list. I guess a thousand dirents is enough to
do that.

Yet fsync is run after every file is created and written, so the
dirents and directory blocks should all be there....

.....

> _check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> bad nblocks 9 for inode 128, would reset to 0
> no . entry for directory 128
> no .. entry for root directory 128
> problem with directory contents in inode 128
> would clear root inode 128
> bad nblocks 8 for inode 131, would reset to 0
> bad nblocks 8 for inode 132, would reset to 0
> bad nblocks 8 for inode 133, would reset to 0
> ...
> bad nblocks 8 for inode 62438, would reset to 0
> bad nblocks 8 for inode 62439, would reset to 0
> bad nblocks 8 for inode 62440, would reset to 0
> bad nblocks 8 for inode 62441, would reset to 0

Yet all the files - including the data files that were fsync'd - are
all bad.

Apparently the journal has been recovered, but lots of metadata
updates that should have been in the journal are missing after
recovery has completed? That doesn't make a whole lot of sense -
when did these tests start failing? Can you run a bisect?

-Dave.
--
Dave Chinner
david@fromorbit.com
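
To make the "hole in the file mapping" remark concrete: xfs_dabuf_map()
translates a directory/attr block number into an extent mapping (via
xfs_bmapi_read()) and only tolerates a hole if the caller passed
XFS_DABUF_MAP_HOLE_OK. A standalone, simplified sketch of just that
decision follows; it is not the kernel code itself, and the flag's bit
value below is an assumption:

#include <stdio.h>

#define HOLESTARTBLOCK          (-2LL)          /* from xfs_types.h */
#define XFS_DABUF_MAP_HOLE_OK   (1u << 0)       /* flag bit value assumed */

struct bmbt_irec_sketch {
        unsigned long long br_startoff;
        long long          br_startblock;       /* -2 encodes a hole */
};

/* The decision xfs_dabuf_map() makes once it has the mapping: a hole
 * is corruption unless the caller explicitly opted in. */
static int dabuf_map_check(const struct bmbt_irec_sketch *irec,
                           unsigned int flags)
{
        if (irec->br_startblock == HOLESTARTBLOCK &&
            !(flags & XFS_DABUF_MAP_HOLE_OK))
                return -1;      /* kernel raises the corruption error here */
        return 0;
}

int main(void)
{
        /* the mapping from the report: [00] br_startoff 8388608
         * br_startblock -2 br_blockcount 1 */
        struct bmbt_irec_sketch irec = {
                .br_startoff = 8388608,
                .br_startblock = HOLESTARTBLOCK,
        };

        if (dabuf_map_check(&irec, 0))
                printf("leaf block %llu maps to a hole -> EFSCORRUPTED\n",
                       irec.br_startoff);
        return 0;
}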
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Zorro Lang @ 2023-11-06 19:26 UTC
To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Mon, Nov 06, 2023 at 05:13:30PM +1100, Dave Chinner wrote:
> On Sun, Oct 29, 2023 at 12:11:22PM +0800, Zorro Lang wrote:
> > Hi xfs list,
> >
> > Recently I have been consistently hitting XFS corruption when running fstests
> > generic/047 [1], and it shows more failures in dmesg [2], e.g.:
>
> OK, g/047 is an fsync test.
>
> > XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
>
> Ok, a directory block index translated to a hole in the file
> mapping. That's bad...
>
.....
>
> Yet all the files - including the data files that were fsync'd - are
> all bad.
>
> Apparently the journal has been recovered, but lots of metadata
> updates that should have been in the journal are missing after
> recovery has completed? That doesn't make a whole lot of sense -
> when did these tests start failing? Can you run a bisect?

Hi Dave,

Thanks for your reply :) I tried a kernel bisect for a long time, but
found nothing ... Then suddenly, I found that it started failing from an
xfsprogs change [1].

Although that's not the root cause of this bug (on s390x), that change
just enabled "nrext64" by default, which I had never tested on s390x
before. For now, we know this is an issue with that feature, and so far
only on s390x.

Thanks,
Zorro

[1]
commit e5b18d7d1d962e942ce3b0a9ccdb5872074e24df
Author: Darrick J. Wong <djwong@kernel.org>
Date:   Mon Sep 25 14:59:25 2023 -0700

    mkfs: enable large extent counts by default

> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
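
For readers unfamiliar with the feature: "nrext64" is the large extent
counters feature (incompat bit 0x20, which matches the v5.18 mount failure
quoted later in this thread). What it changes in the on-disk inode can be
read off the pahole output further down: the 64-bit data fork extent count
moves into the old v2 pad/flushiter space at bytes 24-31, and the 32-bit
attr fork count overlays the old 32-bit data fork count at byte 76. A
simplified sketch of just those two unions (plain uintN_t types stand in
for the big-endian __beNN on-disk types; not the real headers):

#include <stdint.h>

struct dinode_extcnt_sketch {
        /* bytes 24-31 of struct xfs_dinode */
        union {
                uint64_t di_big_nextents;       /* nrext64: 64-bit data fork count */
                uint64_t di_v3_pad;             /* v3, !nrext64: zero padding */
                struct {
                        uint8_t  di_v2_pad[6];
                        uint16_t di_flushiter;  /* bytes 30-31: v2 flush counter */
                };
        };
        /* bytes 76-81 of struct xfs_dinode */
        union {
                struct {
                        uint32_t di_nextents;   /* !nrext64: 32-bit data fork count */
                        uint16_t di_anextents;  /* !nrext64: 16-bit attr fork count */
                };
                struct {
                        uint32_t di_big_anextents; /* nrext64: 32-bit attr fork count */
                        uint16_t di_nrext64_pad;
                };
        };
};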
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Dave Chinner @ 2023-11-06 20:33 UTC
To: Zorro Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
> On Mon, Nov 06, 2023 at 05:13:30PM +1100, Dave Chinner wrote:
.....
> > Apparently the journal has been recovered, but lots of metadata
> > updates that should have been in the journal are missing after
> > recovery has completed? That doesn't make a whole lot of sense -
> > when did these tests start failing? Can you run a bisect?
>
> Hi Dave,
>
> Thanks for your reply :) I tried a kernel bisect for a long time, but
> found nothing ... Then suddenly, I found that it started failing from an
> xfsprogs change [1].
>
> Although that's not the root cause of this bug (on s390x), that change
> just enabled "nrext64" by default, which I had never tested on s390x
> before. For now, we know this is an issue with that feature, and so far
> only on s390x.

That's not good. Can you please determine if this is a zero-day bug
with the nrext64 feature? I think it was merged in 5.19, so if you
could try to reproduce it on 5.18 and 5.19 kernels first, that
would be handy.

Also, from your s390 kernel build, can you get the pahole output
for the struct xfs_dinode both for a good kernel and a bad kernel?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Darrick J. Wong @ 2023-11-06 22:20 UTC
To: Dave Chinner; +Cc: Zorro Lang, linux-xfs, fstests, Carlos Maiolino

On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote:
> On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
.....
> > Although that's not the root cause of this bug (on s390x), that change
> > just enabled "nrext64" by default, which I had never tested on s390x
> > before. For now, we know this is an issue with that feature, and so far
> > only on s390x.
>
> That's not good. Can you please determine if this is a zero-day bug
> with the nrext64 feature? I think it was merged in 5.19, so if you
> could try to reproduce it on 5.18 and 5.19 kernels first, that
> would be handy.
>
> Also, from your s390 kernel build, can you get the pahole output
> for the struct xfs_dinode both for a good kernel and a bad kernel?

And could you get the pahole output of xfs_log_dinode as well?

I'm assuming that these problems only come up after *log recovery* and
that generic/476 is fine.

--D

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
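
Some context for the *log recovery* angle: the inode core is logged as a
host-endian struct xfs_log_dinode, and recovery copies it field by field
into the big-endian on-disk struct xfs_dinode (xfs_log_dinode_to_disk() in
fs/xfs/xfs_inode_item_recover.c), so a mistake in how one of the overlapping
union fields is copied would only surface after a shutdown-and-recover
cycle, which is exactly what generic/047 does. A rough, userspace-shaped
sketch of the copy's structure (illustrative only, not the kernel function;
htobe*() stands in for the kernel's cpu_to_be*()):

#include <endian.h>
#include <stdint.h>

/* Host-endian image of the inode core as it sits in the log
 * (simplified; see the xfs_log_dinode pahole output below). */
struct log_dinode_sketch {
        uint16_t di_magic;
        uint64_t di_big_nextents;       /* in the real struct, di_flushiter
                                         * aliases bytes 30-31 of this */
};

/* Big-endian on-disk image (simplified struct xfs_dinode). */
struct dinode_sketch {
        uint16_t di_magic;              /* __be16 on disk */
        uint64_t di_big_nextents;       /* __be64 on disk */
};

/* Illustrative shape of what recovery does: copy host-order log fields
 * into the big-endian disk inode. Writing the wrong union member (or
 * the wrong width) here is invisible until a recovery pass runs. */
static void log_dinode_to_disk_sketch(const struct log_dinode_sketch *from,
                                      struct dinode_sketch *to)
{
        to->di_magic = htobe16(from->di_magic);
        to->di_big_nextents = htobe64(from->di_big_nextents);
        /* ... every other core field, converted host -> big-endian ... */
}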
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Zorro Lang @ 2023-11-07  8:05 UTC
To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote:
> On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
.....
> > Although that's not the root cause of this bug (on s390x), that change
> > just enabled "nrext64" by default, which I had never tested on s390x
> > before. For now, we know this is an issue with that feature, and so far
> > only on s390x.
>
> That's not good. Can you please determine if this is a zero-day bug
> with the nrext64 feature? I think it was merged in 5.19, so if you
> could try to reproduce it on 5.18 and 5.19 kernels first, that
> would be handy.

Unfortunately, it looks like a bug that has been there nearly from the
beginning. Linux v5.19 can trigger this bug (with the latest xfsprogs
for-next branch):

FSTYP -- xfs (debug)
PLATFORM -- Linux/s390x s390x-kvm-026 5.19.0 #1 SMP Tue Nov 7 00:44:49 EST 2023
MKFS_OPTIONS -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR

generic/047 _check_xfs_filesystem: filesystem on /dev/loop1 failed scrub
(see /var/lib/xfstests/results//generic/047.full for details)
_check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
(see /var/lib/xfstests/results//generic/047.full for details)
_check_dmesg: something found in dmesg (see /var/lib/xfstests/results//generic/047.dmesg)
- output mismatch (see /var/lib/xfstests/results//generic/047.out.bad)
--- tests/generic/047.out 2023-11-07 01:18:41.687291490 -0500
+++ /var/lib/xfstests/results//generic/047.out.bad 2023-11-07 01:19:32.237271471 -0500
@@ -1 +1,1000 @@
QA output created by 047
+file /mnt/fstests/SCRATCH_DIR/1 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/2 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/3 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/4 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/5 missing - fsync failed
+file /mnt/fstests/SCRATCH_DIR/6 missing - fsync failed
...
(Run 'diff -u /var/lib/xfstests/tests/generic/047.out /var/lib/xfstests/results//generic/047.out.bad' to see the entire diff)
Ran: generic/047
Failures: generic/047
Failed 1 of 1 tests

[ 252.399348] XFS (loop1): Corruption detected. Unmount and run xfs_repair
[ 252.399349] XFS (loop1): xfs_dabuf_map: bno 8388608 inode 128
[ 252.399351] XFS (loop1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
[ 252.399367] XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x370 [xfs]
[ 252.399452] CPU: 1 PID: 129486 Comm: 047 Kdump: loaded Not tainted 5.19.0 #1
[ 252.399454] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
[ 252.399455] Call Trace:
[ 252.399469] [<000000010fe8b17a>] dump_stack_lvl+0x62/0x80
[ 252.399471] [<000003ff8021e3c0>] xfs_corruption_error+0x70/0xa0 [xfs]
[ 252.399561] [<000003ff801dcbd6>] xfs_dabuf_map.constprop.0+0x2a6/0x370 [xfs]
[ 252.399646] [<000003ff801df1ce>] xfs_da_read_buf+0x6e/0x130 [xfs]
[ 252.399731] [<000003ff801df2d0>] xfs_da3_node_read+0x40/0x80 [xfs]
[ 252.399816] [<000003ff801e0732>] xfs_da3_node_lookup_int+0x82/0x560 [xfs]
[ 252.399900] [<000003ff801f08de>] xfs_dir2_node_lookup+0x3e/0x150 [xfs]
[ 252.399986] [<000003ff801e49de>] xfs_dir_lookup+0x1ee/0x220 [xfs]
[ 252.400071] [<000003ff80235da0>] xfs_lookup+0x60/0x110 [xfs]
[ 252.400161] [<000003ff802310b2>] xfs_vn_lookup+0x62/0xa0 [xfs]
[ 252.400251] [<000000010f86eb7a>] __lookup_slow+0x9a/0x170
[ 252.400253] [<000000010f87381a>] walk_component+0x14a/0x1e0
[ 252.400255] [<000000010f874098>] path_lookupat+0x88/0x1f0
[ 252.400257] [<000000010f87569a>] filename_lookup+0xaa/0x1a0
[ 252.400259] [<000000010f866a66>] vfs_statx+0x86/0x120
[ 252.400261] [<000000010f866da4>] vfs_fstatat+0x74/0xa0
[ 252.400263] [<000000010f867038>] __do_sys_newfstatat+0x28/0x50
[ 252.400264] [<000000010fe8eeb0>] __do_syscall+0x1d0/0x200
[ 252.400266] [<000000010fe9d6c2>] system_call+0x82/0xb0

And v5.18 doesn't support this feature, so ...

[ 95.265522] XFS (loop0): Superblock has unknown incompatible features (0x20) enabled.
[ 95.265533] XFS (loop0): Filesystem cannot be safely mounted by this kernel.
[ 95.265535] XFS (loop0): SB validate failed with error -22.

If needed, tell me which commit (or commit range) first introduced this
feature, or which commit you suspect, and I can give it specific
testing.

Thanks,
Zorro

> Also, from your s390 kernel build, can you get the pahole output
> for the struct xfs_dinode both for a good kernel and a bad kernel?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]

From: Dave Chinner @ 2023-11-07  8:13 UTC
To: Zorro Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Tue, Nov 07, 2023 at 04:05:22PM +0800, Zorro Lang wrote:
> On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote:
> > On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
> > > Thanks for your reply :) I tried a kernel bisect for a long time, but
> > > found nothing ... Then suddenly, I found that it started failing from an
> > > xfsprogs change [1].
> > >
> > > Although that's not the root cause of this bug (on s390x), that change
> > > just enabled "nrext64" by default, which I had never tested on s390x
> > > before. For now, we know this is an issue with that feature, and so far
> > > only on s390x.
> >
> > That's not good. Can you please determine if this is a zero-day bug
> > with the nrext64 feature? I think it was merged in 5.19, so if you
> > could try to reproduce it on 5.18 and 5.19 kernels first, that
> > would be handy.
>
> Unfortunately, it looks like a bug that has been there nearly from the
> beginning. Linux v5.19 can trigger this bug (with the latest xfsprogs
> for-next branch):

Ok. Can you grab the pahole output for the xfs_dinode and
xfs_log_dinode for s390 from both 5.18 and 5.19 kernel builds?
(i.e. 'pahole fs/xfs/xfs_inode.o |less' and search for the two
structures).

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs] 2023-11-07 8:13 ` Dave Chinner @ 2023-11-07 15:13 ` Zorro Lang 2023-11-08 6:38 ` Dave Chinner 0 siblings, 1 reply; 19+ messages in thread From: Zorro Lang @ 2023-11-07 15:13 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino On Tue, Nov 07, 2023 at 07:13:39PM +1100, Dave Chinner wrote: > On Tue, Nov 07, 2023 at 04:05:22PM +0800, Zorro Lang wrote: > > On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote: > > > On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote: > > > > Thanks for your reply :) I tried to do a kernel bisect long time, but > > > > find nothing ... Then suddently, I found it's failed from a xfsprogs > > > > change [1]. > > > > > > > > Although that's not the root cause of this bug (on s390x), it just > > > > enabled "nrext64" by default, which I never tested on s390x before. > > > > For now, we know this's an issue about this feature, and only on > > > > s390x for now. > > > > > > That's not good. Can you please determine if this is a zero-day bug > > > with the nrext64 feature? I think it was merged in 5.19, so if you > > > could try to reproduce it on a 5.18 and 5.19 kernels first, that > > > would be handy. > > > > Unfortunately, it's a bug be there nearly from beginning. The linux v5.19 > > can trigger this bug (with latest xfsprogs for-next branch): > > Ok. Can you grab the pahole output for the xfs_dinode and > xfs_log_dinode for s390 from both 5.18 and 5.19 kernel builds? > (i.e. 'pahole fs/xfs/xfs_inode.o |less' and search for the two > structures). I can't find xfs_log_dinode in fs/xfs/xfs_inode.o, but I can find both structures in fs/xfs/xfs_inode_item.o, so below output base on: # pahole fs/xfs/xfs_inode_item.o The output on v5.19 is [1], v5.18 output is [2], the diff of 5.18 and 5.19 is [3]. 
Thanks,
Zorro

[1] v5.19:

struct xfs_dinode {
	__be16 di_magic;	/* 0 2 */
	__be16 di_mode;	/* 2 2 */
	__u8 di_version;	/* 4 1 */
	__u8 di_format;	/* 5 1 */
	__be16 di_onlink;	/* 6 2 */
	__be32 di_uid;	/* 8 4 */
	__be32 di_gid;	/* 12 4 */
	__be32 di_nlink;	/* 16 4 */
	__be16 di_projid_lo;	/* 20 2 */
	__be16 di_projid_hi;	/* 22 2 */
	union {
		__be64 di_big_nextents;	/* 24 8 */
		__be64 di_v3_pad;	/* 24 8 */
		struct {
			__u8 di_v2_pad[6];	/* 24 6 */
			__be16 di_flushiter;	/* 30 2 */
		};	/* 24 8 */
	};	/* 24 8 */
	xfs_timestamp_t di_atime;	/* 32 8 */
	xfs_timestamp_t di_mtime;	/* 40 8 */
	xfs_timestamp_t di_ctime;	/* 48 8 */
	__be64 di_size;	/* 56 8 */
	__be64 di_nblocks;	/* 64 8 */
	__be32 di_extsize;	/* 72 4 */
	union {
		struct {
			__be32 di_nextents;	/* 76 4 */
			__be16 di_anextents;	/* 80 2 */
		} __attribute__((__packed__));	/* 76 6 */
		struct {
			__be32 di_big_anextents;	/* 76 4 */
			__be16 di_nrext64_pad;	/* 80 2 */
		} __attribute__((__packed__));	/* 76 6 */
	};	/* 76 6 */
	__u8 di_forkoff;	/* 82 1 */
	__s8 di_aformat;	/* 83 1 */
	__be32 di_dmevmask;	/* 84 4 */
	__be16 di_dmstate;	/* 88 2 */
	__be16 di_flags;	/* 90 2 */
	__be32 di_gen;	/* 92 4 */
	__be32 di_next_unlinked;	/* 96 4 */
	__le32 di_crc;	/* 100 4 */
	__be64 di_changecount;	/* 104 8 */
	__be64 di_lsn;	/* 112 8 */
	__be64 di_flags2;	/* 120 8 */
	__be32 di_cowextsize;	/* 128 4 */
	__u8 di_pad2[12];	/* 132 12 */
	xfs_timestamp_t di_crtime;	/* 144 8 */
	__be64 di_ino;	/* 152 8 */
	uuid_t di_uuid;	/* 160 16 */

	/* size: 176, cachelines: 1, members: 34 */
	/* last cacheline: 176 bytes */
};

struct xfs_log_dinode {
	uint16_t di_magic;	/* 0 2 */
	uint16_t di_mode;	/* 2 2 */
	int8_t di_version;	/* 4 1 */
	int8_t di_format;	/* 5 1 */
	uint8_t di_pad3[2];	/* 6 2 */
	uint32_t di_uid;	/* 8 4 */
	uint32_t di_gid;	/* 12 4 */
	uint32_t di_nlink;	/* 16 4 */
	uint16_t di_projid_lo;	/* 20 2 */
	uint16_t di_projid_hi;	/* 22 2 */
	union {
		uint64_t di_big_nextents;	/* 24 8 */
		uint64_t di_v3_pad;	/* 24 8 */
		struct {
			uint8_t di_v2_pad[6];	/* 24 6 */
			uint16_t di_flushiter;	/* 30 2 */
		};	/* 24 8 */
	};	/* 24 8 */
	xfs_log_timestamp_t di_atime;	/* 32 8 */
	xfs_log_timestamp_t di_mtime;	/* 40 8 */
	xfs_log_timestamp_t di_ctime;	/* 48 8 */
	xfs_fsize_t di_size;	/* 56 8 */
	xfs_rfsblock_t di_nblocks;	/* 64 8 */
	xfs_extlen_t di_extsize;	/* 72 4 */
	union {
		struct {
			uint32_t di_nextents;	/* 76 4 */
			uint16_t di_anextents;	/* 80 2 */
		} __attribute__((__packed__));	/* 76 6 */
		struct {
			uint32_t di_big_anextents;	/* 76 4 */
			uint16_t di_nrext64_pad;	/* 80 2 */
		} __attribute__((__packed__));	/* 76 6 */
	};	/* 76 6 */
	uint8_t di_forkoff;	/* 82 1 */
	int8_t di_aformat;	/* 83 1 */
	uint32_t di_dmevmask;	/* 84 4 */
	uint16_t di_dmstate;	/* 88 2 */
	uint16_t di_flags;	/* 90 2 */
	uint32_t di_gen;	/* 92 4 */
	xfs_agino_t di_next_unlinked;	/* 96 4 */
	uint32_t di_crc;	/* 100 4 */
	uint64_t di_changecount;	/* 104 8 */
	xfs_lsn_t di_lsn;	/* 112 8 */
	uint64_t di_flags2;	/* 120 8 */
	uint32_t di_cowextsize;	/* 128 4 */
	uint8_t di_pad2[12];	/* 132 12 */
	xfs_log_timestamp_t di_crtime;	/* 144 8 */
	xfs_ino_t di_ino;	/* 152 8 */
	uuid_t di_uuid;	/* 160 16 */

	/* size: 176, cachelines: 1, members: 34 */
	/* last cacheline: 176 bytes */
};

[2] v5.18:

struct xfs_dinode {
	__be16 di_magic;	/* 0 2 */
	__be16 di_mode;	/* 2 2 */
	__u8 di_version;	/* 4 1 */
	__u8 di_format;	/* 5 1 */
	__be16 di_onlink;	/* 6 2 */
	__be32 di_uid;	/* 8 4 */
	__be32 di_gid;	/* 12 4 */
	__be32 di_nlink;	/* 16 4 */
	__be16 di_projid_lo;	/* 20 2 */
	__be16 di_projid_hi;	/* 22 2 */
	__u8 di_pad[6];	/* 24 6 */
	__be16 di_flushiter;	/* 30 2 */
	xfs_timestamp_t di_atime;	/* 32 8 */
	xfs_timestamp_t di_mtime;	/* 40 8 */
	xfs_timestamp_t di_ctime;	/* 48 8 */
	__be64 di_size;	/* 56 8 */
	__be64 di_nblocks;	/* 64 8 */
	__be32 di_extsize;	/* 72 4 */
	__be32 di_nextents;	/* 76 4 */
	__be16 di_anextents;	/* 80 2 */
	__u8 di_forkoff;	/* 82 1 */
	__s8 di_aformat;	/* 83 1 */
	__be32 di_dmevmask;	/* 84 4 */
	__be16 di_dmstate;	/* 88 2 */
	__be16 di_flags;	/* 90 2 */
	__be32 di_gen;	/* 92 4 */
	__be32 di_next_unlinked;	/* 96 4 */
	__le32 di_crc;	/* 100 4 */
	__be64 di_changecount;	/* 104 8 */
	__be64 di_lsn;	/* 112 8 */
	__be64 di_flags2;	/* 120 8 */
	__be32 di_cowextsize;	/* 128 4 */
	__u8 di_pad2[12];	/* 132 12 */
	xfs_timestamp_t di_crtime;	/* 144 8 */
	__be64 di_ino;	/* 152 8 */
	uuid_t di_uuid;	/* 160 16 */

	/* size: 176, cachelines: 1, members: 36 */
	/* last cacheline: 176 bytes */
};

struct xfs_log_dinode {
	uint16_t di_magic;	/* 0 2 */
	uint16_t di_mode;	/* 2 2 */
	int8_t di_version;	/* 4 1 */
	int8_t di_format;	/* 5 1 */
	uint8_t di_pad3[2];	/* 6 2 */
	uint32_t di_uid;	/* 8 4 */
	uint32_t di_gid;	/* 12 4 */
	uint32_t di_nlink;	/* 16 4 */
	uint16_t di_projid_lo;	/* 20 2 */
	uint16_t di_projid_hi;	/* 22 2 */
	uint8_t di_pad[6];	/* 24 6 */
	uint16_t di_flushiter;	/* 30 2 */
	xfs_log_timestamp_t di_atime;	/* 32 8 */
	xfs_log_timestamp_t di_mtime;	/* 40 8 */
	xfs_log_timestamp_t di_ctime;	/* 48 8 */
	xfs_fsize_t di_size;	/* 56 8 */
	xfs_rfsblock_t di_nblocks;	/* 64 8 */
	xfs_extlen_t di_extsize;	/* 72 4 */
	xfs_extnum_t di_nextents;	/* 76 4 */
	xfs_aextnum_t di_anextents;	/* 80 2 */
	uint8_t di_forkoff;	/* 82 1 */
	int8_t di_aformat;	/* 83 1 */
	uint32_t di_dmevmask;	/* 84 4 */
	uint16_t di_dmstate;	/* 88 2 */
	uint16_t di_flags;	/* 90 2 */
	uint32_t di_gen;	/* 92 4 */
	xfs_agino_t di_next_unlinked;	/* 96 4 */
	uint32_t di_crc;	/* 100 4 */
	uint64_t di_changecount;	/* 104 8 */
	xfs_lsn_t di_lsn;	/* 112 8 */
	uint64_t di_flags2;	/* 120 8 */
	uint32_t di_cowextsize;	/* 128 4 */
	uint8_t di_pad2[12];	/* 132 12 */
	xfs_log_timestamp_t di_crtime;	/* 144 8 */
	xfs_ino_t di_ino;	/* 152 8 */
	uuid_t di_uuid;	/* 160 16 */

	/* size: 176, cachelines: 1, members: 36 */
	/* last cacheline: 176 bytes */
};

[3] # diff -Nup xfs_inode_item.pahole.518 xfs_inode_item.pahole.519

@@ -6763,16 +6799,30 @@ struct xfs_dinode {
 	__be32 di_nlink;	/* 16 4 */
 	__be16 di_projid_lo;	/* 20 2 */
 	__be16 di_projid_hi;	/* 22 2 */
-	__u8 di_pad[6];	/* 24 6 */
-	__be16 di_flushiter;	/* 30 2 */
+	union {
+		__be64 di_big_nextents;	/* 24 8 */
+		__be64 di_v3_pad;	/* 24 8 */
+		struct {
+			__u8 di_v2_pad[6];	/* 24 6 */
+			__be16 di_flushiter;	/* 30 2 */
+		};	/* 24 8 */
+	};	/* 24 8 */
 	xfs_timestamp_t di_atime;	/* 32 8 */
 	xfs_timestamp_t di_mtime;	/* 40 8 */
 	xfs_timestamp_t di_ctime;	/* 48 8 */
 	__be64 di_size;	/* 56 8 */
 	__be64 di_nblocks;	/* 64 8 */
 	__be32 di_extsize;	/* 72 4 */
-	__be32 di_nextents;	/* 76 4 */
-	__be16 di_anextents;	/* 80 2 */
+	union {
+		struct {
+			__be32 di_nextents;	/* 76 4 */
+			__be16 di_anextents;	/* 80 2 */
+		} __attribute__((__packed__));	/* 76 6 */
+		struct {
+			__be32 di_big_anextents;	/* 76 4 */
+			__be16 di_nrext64_pad;	/* 80 2 */
+		} __attribute__((__packed__));	/* 76 6 */
+	};	/* 76 6 */
 	__u8 di_forkoff;	/* 82 1 */
 	__s8 di_aformat;	/* 83 1 */
 	__be32 di_dmevmask;	/* 84 4 */
@@ -6790,7 +6840,7 @@ struct xfs_dinode {
 	__be64 di_ino;	/* 152 8 */
 	uuid_t di_uuid;	/* 160 16 */
 
-	/* size: 176, cachelines: 1, members: 36 */
+	/* size: 176, cachelines: 1, members: 34 */
 	/* last cacheline: 176 bytes */
 };
... ...
@@ -6932,16 +6992,30 @@ struct xfs_log_dinode {
 	uint32_t di_nlink;	/* 16 4 */
 	uint16_t di_projid_lo;	/* 20 2 */
 	uint16_t di_projid_hi;	/* 22 2 */
-	uint8_t di_pad[6];	/* 24 6 */
-	uint16_t di_flushiter;	/* 30 2 */
+	union {
+		uint64_t di_big_nextents;	/* 24 8 */
+		uint64_t di_v3_pad;	/* 24 8 */
+		struct {
+			uint8_t di_v2_pad[6];	/* 24 6 */
+			uint16_t di_flushiter;	/* 30 2 */
+		};	/* 24 8 */
+	};	/* 24 8 */
 	xfs_log_timestamp_t di_atime;	/* 32 8 */
 	xfs_log_timestamp_t di_mtime;	/* 40 8 */
 	xfs_log_timestamp_t di_ctime;	/* 48 8 */
 	xfs_fsize_t di_size;	/* 56 8 */
 	xfs_rfsblock_t di_nblocks;	/* 64 8 */
 	xfs_extlen_t di_extsize;	/* 72 4 */
-	xfs_extnum_t di_nextents;	/* 76 4 */
-	xfs_aextnum_t di_anextents;	/* 80 2 */
+	union {
+		struct {
+			uint32_t di_nextents;	/* 76 4 */
+			uint16_t di_anextents;	/* 80 2 */
+		} __attribute__((__packed__));	/* 76 6 */
+		struct {
+			uint32_t di_big_anextents;	/* 76 4 */
+			uint16_t di_nrext64_pad;	/* 80 2 */
+		} __attribute__((__packed__));	/* 76 6 */
+	};	/* 76 6 */
 	uint8_t di_forkoff;	/* 82 1 */
 	int8_t di_aformat;	/* 83 1 */
 	uint32_t di_dmevmask;	/* 84 4 */
@@ -6959,7 +7033,7 @@ struct xfs_log_dinode {
 	xfs_ino_t di_ino;	/* 152 8 */
 	uuid_t di_uuid;	/* 160 16 */
 
-	/* size: 176, cachelines: 1, members: 36 */
+	/* size: 176, cachelines: 1, members: 34 */
 	/* last cacheline: 176 bytes */
 };

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
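[The overlap that matters in the v5.19 layout above is that di_flushiter sits at offset 30, inside the same eight bytes (starting at offset 24) as di_big_nextents. A minimal stand-alone sketch can verify those offsets independently of pahole; note this is a mock with plain uintN_t stand-ins for the kernel's __beNN types, not kernel code:]

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the start of the v5.19 xfs_dinode, using host-order types
 * purely so the offsets can be checked with offsetof(). */
struct dinode_prefix_mock {
	uint16_t di_magic;	/*  0 */
	uint16_t di_mode;	/*  2 */
	uint8_t  di_version;	/*  4 */
	uint8_t  di_format;	/*  5 */
	uint16_t di_onlink;	/*  6 */
	uint32_t di_uid;	/*  8 */
	uint32_t di_gid;	/* 12 */
	uint32_t di_nlink;	/* 16 */
	uint16_t di_projid_lo;	/* 20 */
	uint16_t di_projid_hi;	/* 22 */
	union {
		uint64_t di_big_nextents;	/* 24, 8 bytes */
		struct {
			uint8_t  di_v2_pad[6];	/* 24 */
			uint16_t di_flushiter;	/* 30 */
		};
	};
};

int main(void)
{
	/* the same offsets pahole reports above */
	static_assert(offsetof(struct dinode_prefix_mock, di_big_nextents) == 24,
		      "union starts at offset 24");
	static_assert(offsetof(struct dinode_prefix_mock, di_flushiter) == 30,
		      "di_flushiter overlaps the last 2 bytes of di_big_nextents");
	return 0;
}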
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-07 15:13 ` Zorro Lang
@ 2023-11-08  6:38   ` Dave Chinner
  [not found]     ` <CAN=2_H+CdEK_rEUmYbmkCjSRqhX2cwi5yRHQcKAmKDPF16vqOw@mail.gmail.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-11-08 6:38 UTC (permalink / raw)
To: Zorro Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Tue, Nov 07, 2023 at 11:13:14PM +0800, Zorro Lang wrote:
> On Tue, Nov 07, 2023 at 07:13:39PM +1100, Dave Chinner wrote:
> > On Tue, Nov 07, 2023 at 04:05:22PM +0800, Zorro Lang wrote:
> > > On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote:
> > > > On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
> > > > > Thanks for your reply :) I tried to do a kernel bisect long time, but
> > > > > find nothing ... Then suddently, I found it's failed from a xfsprogs
> > > > > change [1].
> > > >
> > > > > Although that's not the root cause of this bug (on s390x), it just
> > > > > enabled "nrext64" by default, which I never tested on s390x before.
> > > > > For now, we know this's an issue about this feature, and only on
> > > > > s390x for now.
> > > >
> > > > That's not good. Can you please determine if this is a zero-day bug
> > > > with the nrext64 feature? I think it was merged in 5.19, so if you
> > > > could try to reproduce it on a 5.18 and 5.19 kernels first, that
> > > > would be handy.
> > >
> > > Unfortunately, it's a bug be there nearly from beginning. The linux v5.19
> > > can trigger this bug (with latest xfsprogs for-next branch):
> >
> > Ok. Can you grab the pahole output for the xfs_dinode and
> > xfs_log_dinode for s390 from both 5.18 and 5.19 kernel builds?
> > (i.e. 'pahole fs/xfs/xfs_inode.o |less' and search for the two
> > structures).
>
> I can't find xfs_log_dinode in fs/xfs/xfs_inode.o, but I can find both
> structures in fs/xfs/xfs_inode_item.o, so below output base on:
>
> # pahole fs/xfs/xfs_inode_item.o
>
> The output on v5.19 is [1], v5.18 output is [2], the diff of 5.18 and 5.19 is [3].

Ok, so there's nothing wrong with the on-disk format definition or
the journal format - they both lay out in exactly the right shape,
so I think at this point we need metadumps from the broken
filesystems.

Can you pick one of the failing tests and grab metadumps from the
shutdown filesystem (i.e. before it is recovered) and then another
from after it is recovered and the problem tripped over? I know I
won't be able to replay the log on x86-64, but knowing what is in
the journal vs what ends up being recovered will tell us where to
look next.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
[parent not found: <CAN=2_H+CdEK_rEUmYbmkCjSRqhX2cwi5yRHQcKAmKDPF16vqOw@mail.gmail.com>]
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
       [not found] ` <CAN=2_H+CdEK_rEUmYbmkCjSRqhX2cwi5yRHQcKAmKDPF16vqOw@mail.gmail.com>
@ 2023-11-09  6:14   ` Dave Chinner
  2023-11-09 14:09     ` Zorro Lang
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-11-09 6:14 UTC (permalink / raw)
To: Zirong Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Thu, Nov 09, 2023 at 10:43:58AM +0800, Zirong Lang wrote:
> By changing the generic/047 as below, I got 2 dump files and 2 log files.
> Please check the attachment,
> and feel free to tell me if you need more.

Well, we've narrowed it down to some weird recovery issue - the journal
is intact, recovery recovers the inode from the correct log item in
the journal, but the inode written to disk by recovery is corrupt.

What this points out is that we don't actually verify that the inode we
recover is valid before writeback is queued, nor do we detect the
specific corruption being encountered in the verifier (i.e. a non-zero
nblocks count when the extent count is zero).

Can you add the patch below and see if/when the verifier fires on
the reproducer? I'm particularly interested to know where it fires -
in recovery before writeback, or when the root inode is re-read from
disk. It doesn't fire on x86-64....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: inode recovery does not validate the recovered inode

From: Dave Chinner <dchinner@redhat.com>

Discovered when trying to track down a weird recovery corruption
issue that wasn't detected at recovery time.

The specific corruption was a zero extent count field when big
extent counts are in use, and it turns out the dinode verifier
doesn't detect that specific corruption case, either. So fix it too.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_inode_buf.c   |  3 +++
 fs/xfs/xfs_inode_item_recover.c | 14 +++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index a35781577cad..0f970a0b3382 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -508,6 +508,9 @@ xfs_dinode_verify(
 	if (mode && nextents + naextents > nblocks)
 		return __this_address;
 
+	if (nextents + naextents == 0 && nblocks != 0)
+		return __this_address;
+
 	if (S_ISDIR(mode) && nextents > mp->m_dir_geo->max_extents)
 		return __this_address;
 
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index 6b09e2bf2d74..f4c31c2b60d5 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -286,6 +286,7 @@ xlog_recover_inode_commit_pass2(
 	struct xfs_log_dinode	*ldip;
 	uint			isize;
 	int			need_free = 0;
+	xfs_failaddr_t		fa;
 
 	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
 		in_f = item->ri_buf[0].i_addr;
@@ -529,8 +530,19 @@ xlog_recover_inode_commit_pass2(
 	    (dip->di_mode != 0))
 		error = xfs_recover_inode_owner_change(mp, dip, in_f,
 						buffer_list);
-	/* re-generate the checksum. */
+	/* re-generate the checksum and validate the recovered inode. */
 	xfs_dinode_calc_crc(log->l_mp, dip);
+	fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip);
+	if (fa) {
+		XFS_CORRUPTION_ERROR(
+				"Bad dinode after recovery",
+				XFS_ERRLEVEL_LOW, mp, dip, sizeof(*dip));
+		xfs_alert(mp,
+			"Metadata corruption detected at %pS, inode 0x%llx",
+			fa, in_f->ilf_ino);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
 
 	ASSERT(bp->b_mount == mp);
 	bp->b_flags |= _XBF_LOGRECOVERY;
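[The effect of the added verifier check can be illustrated outside the kernel with the field values that turn up later in this thread. This is a minimal stand-alone sketch; the helper name is made up for illustration and is not the kernel's:]

#include <stdint.h>
#include <stdio.h>

/* Mirrors the new check: an inode that owns blocks must map them
 * with at least one data or attr fork extent. */
static int zero_extents_but_owns_blocks(uint64_t nextents,
					uint64_t naextents,
					uint64_t nblocks)
{
	return nextents + naextents == 0 && nblocks != 0;
}

int main(void)
{
	/* inode 0x83 as recovered on s390x: 8 blocks, zero extents */
	printf("recovered inode 0x83: %s\n",
	       zero_extents_but_owns_blocks(0, 0, 8) ? "corrupt" : "ok");
	/* a freshly allocated empty inode: no blocks, no extents */
	printf("empty inode:          %s\n",
	       zero_extents_but_owns_blocks(0, 0, 0) ? "corrupt" : "ok");
	return 0;
}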
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs] 2023-11-09 6:14 ` Dave Chinner @ 2023-11-09 14:09 ` Zorro Lang 2023-11-09 23:13 ` Dave Chinner 0 siblings, 1 reply; 19+ messages in thread From: Zorro Lang @ 2023-11-09 14:09 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino On Thu, Nov 09, 2023 at 05:14:51PM +1100, Dave Chinner wrote: > On Thu, Nov 09, 2023 at 10:43:58AM +0800, Zirong Lang wrote: > > By changing the generic/047 as below, I got 2 dump files and 2 log files. > > Please check the attachment, > > and feel free to tell me if you need more. > > Well, we've narrowed down to some weird recovery issue - the journal > is intact, recovery recovers the inode from the correct log item in > the journal, but the inode written to disk by recovery is corrupt. > > What this points out is that we don't actually verify the inode we > recover is valid before writeback is queued, nor do we detect the > specific corruption being encountered in the verifier (i.e. non-zero > nblocks count when extent count is zero). > > Can you add the patch below and see if/when the verifier fires on > the reproducer? I'm particularly interested to know where it fires - > in recovery before writeback, or when the root inode is re-read from > disk. It doesn't fire on x64-64.... Hi Dave, I just re-tested with your patch, the steps as [1]. The trace-cmd output can be downloaded from [2]. Please tell me if I misunderstood anything. Thanks, Zorro [1] # ./check generic/047 FSTYP -- xfs (debug) PLATFORM -- Linux/s390x ibm-z-507 6.6.0+ #1 SMP Wed Nov 8 12:51:20 EST 2023 MKFS_OPTIONS -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1 MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR generic/047 [failed, exit status 1]_check_dmesg: something found in dmesg (see /var/lib/xfstests/results//generic/047.dmesg) - output mismatch (see /var/lib/xfstests/results//generic/047.out.bad) --- tests/generic/047.out 2023-11-08 13:02:38.424036829 -0500 +++ /var/lib/xfstests/results//generic/047.out.bad 2023-11-09 08:41:50.026163756 -0500 @@ -1 +1,4 @@ QA output created by 047 +mount: /mnt/fstests/SCRATCH_DIR: mount(2) system call failed: Structure needs cleaning. +mount -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR failed +(see /var/lib/xfstests/results//generic/047.full for details) ... (Run 'diff -u /var/lib/xfstests/tests/generic/047.out /var/lib/xfstests/results//generic/047.out.bad' to see the entire diff) Ran: generic/047 Failures: generic/047 Failed 1 of 1 tests # mount /dev/loop1 /mnt/fstests/SCRATCH_DIR mount: /mnt/fstests/SCRATCH_DIR: mount(2) system call failed: Structure needs cleaning. # dmesg [ 8272.820617] run fstests generic/047 at 2023-11-09 08:36:39 [ 8273.231657] XFS (loop1): Mounting V5 Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a [ 8273.233769] XFS (loop1): Ending clean mount [ 8273.235510] XFS (loop1): User initiated shutdown received. [ 8273.235520] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem. 
[ 8273.235772] XFS (loop1): Please unmount the filesystem and rectify the problem(s) [ 8273.236810] XFS (loop1): Unmounting Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a [ 8273.284005] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce [ 8273.285668] XFS (loop1): Ending clean mount [ 8275.789831] XFS (loop1): User initiated shutdown received. [ 8275.789846] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem. [ 8275.790170] XFS (loop1): Please unmount the filesystem and rectify the problem(s) [ 8275.795171] XFS (loop1): Unmounting Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce [ 8282.714748] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce [ 8282.728494] XFS (loop1): Starting recovery (logdev: internal) [ 8282.741350] 00000000: 49 4e 81 80 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. [ 8282.741357] 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 8282.741360] 00000020: 35 63 5c e5 7f d2 65 f2 35 63 5c e5 7f d2 65 f2 5c\...e.5c\...e. [ 8282.741363] 00000030: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 80 00 5c\...e......... [ 8282.741366] 00000040: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 ................ [ 8282.741368] 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 5d b8 45 2b ............].E+ [ 8282.741371] 00000060: ff ff ff ff 77 bb a7 2f 00 00 00 00 00 00 00 04 ....w../........ [ 8282.741375] 00000070: 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 18 ................ [ 8282.741378] 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 8282.741381] 00000090: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 00 83 5c\...e......... [ 8282.741397] 000000a0: 89 0f 60 68 8e b1 4a fa b1 fb 5d eb 3d 93 7f ce ..`h..J...].=... [ 8282.741399] XFS (loop1): Internal error Bad dinode after recovery at line 536 of file fs/xfs/xfs_inode_item_recover.c. 
Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 8282.741739] CPU: 1 PID: 7645 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1 [ 8282.741743] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0) [ 8282.741746] Call Trace: [ 8282.741748] [<000000004e7af67a>] dump_stack_lvl+0x62/0x80 [ 8282.741756] [<000003ff800b1cb0>] xfs_corruption_error+0x70/0xa0 [xfs] [ 8282.741863] [<000003ff800f0032>] xlog_recover_inode_commit_pass2+0x63a/0xb10 [xfs] [ 8282.741973] [<000003ff800f400a>] xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 8282.742082] [<000003ff800f4cc6>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] [ 8282.742192] [<000003ff800f4e28>] xlog_recovery_process_trans+0x140/0x148 [xfs] [ 8282.742302] [<000003ff800f4ef0>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] [ 8282.742413] [<000003ff800f59b6>] xlog_recover_process_data+0xb6/0x168 [xfs] [ 8282.742528] [<000003ff800f5b6c>] xlog_recover_process+0x104/0x150 [xfs] [ 8282.742638] [<000003ff800f5f6a>] xlog_do_recovery_pass+0x3b2/0x748 [xfs] [ 8282.742747] [<000003ff800f6758>] xlog_do_log_recovery+0x88/0xd8 [xfs] [ 8282.742856] [<000003ff800f67f4>] xlog_do_recover+0x4c/0x218 [xfs] [ 8282.742965] [<000003ff800f7cfa>] xlog_recover+0xda/0x1a0 [xfs] [ 8282.743073] [<000003ff800dde96>] xfs_log_mount+0x11e/0x280 [xfs] [ 8282.743181] [<000003ff800cfcc6>] xfs_mountfs+0x3e6/0x928 [xfs] [ 8282.743289] [<000003ff800d7554>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] [ 8282.743397] [<000000004e160fec>] get_tree_bdev+0x144/0x1d0 [ 8282.743403] [<000000004e15ea20>] vfs_get_tree+0x38/0x110 [ 8282.743406] [<000000004e19140a>] do_new_mount+0x17a/0x2d0 [ 8282.743411] [<000000004e192084>] path_mount+0x1ac/0x818 [ 8282.743414] [<000000004e1927f4>] __s390x_sys_mount+0x104/0x148 [ 8282.743417] [<000000004e7d5910>] __do_syscall+0x1d0/0x1f8 [ 8282.743420] [<000000004e7e5bc0>] system_call+0x70/0x98 [ 8282.743423] XFS (loop1): Corruption detected. Unmount and run xfs_repair [ 8282.743425] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83 [ 8282.743582] XFS (loop1): log mount/recovery failed: error -117 [ 8282.743846] XFS (loop1): log mount failed [ 8282.778368] XFS (loop0): Unmounting Filesystem a488de9e-d346-4098-aff4-84e6ca6936c7 [ 8458.453977] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce [ 8458.467682] XFS (loop1): Starting recovery (logdev: internal) ... ... [ 8611.351331] XFS (loop1): Internal error Bad dinode after recovery at line 536 of file fs/xfs/xfs_inode_item_recover.c. 
Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 8611.351612] CPU: 1 PID: 10548 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1 [ 8611.351614] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0) [ 8611.351615] Call Trace: [ 8611.351617] [<000000004e7af67a>] dump_stack_lvl+0x62/0x80 [ 8611.351624] [<000003ff800b1cb0>] xfs_corruption_error+0x70/0xa0 [xfs] [ 8611.351707] [<000003ff800f0032>] xlog_recover_inode_commit_pass2+0x63a/0xb10 [xfs] [ 8611.351792] [<000003ff800f400a>] xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 8611.351886] [<000003ff800f4cc6>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] [ 8611.351989] [<000003ff800f4e28>] xlog_recovery_process_trans+0x140/0x148 [xfs] [ 8611.352073] [<000003ff800f4ef0>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] [ 8611.352157] [<000003ff800f59b6>] xlog_recover_process_data+0xb6/0x168 [xfs] [ 8611.352243] [<000003ff800f5b6c>] xlog_recover_process+0x104/0x150 [xfs] [ 8611.352326] [<000003ff800f5f6a>] xlog_do_recovery_pass+0x3b2/0x748 [xfs] [ 8611.352409] [<000003ff800f6758>] xlog_do_log_recovery+0x88/0xd8 [xfs] [ 8611.352497] [<000003ff800f67f4>] xlog_do_recover+0x4c/0x218 [xfs] [ 8611.352582] [<000003ff800f7cfa>] xlog_recover+0xda/0x1a0 [xfs] [ 8611.352668] [<000003ff800dde96>] xfs_log_mount+0x11e/0x280 [xfs] [ 8611.352767] [<000003ff800cfcc6>] xfs_mountfs+0x3e6/0x928 [xfs] [ 8611.352851] [<000003ff800d7554>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] [ 8611.352935] [<000000004e160fec>] get_tree_bdev+0x144/0x1d0 [ 8611.352941] [<000000004e15ea20>] vfs_get_tree+0x38/0x110 [ 8611.352944] [<000000004e19140a>] do_new_mount+0x17a/0x2d0 [ 8611.352948] [<000000004e192084>] path_mount+0x1ac/0x818 [ 8611.352950] [<000000004e1927f4>] __s390x_sys_mount+0x104/0x148 [ 8611.352953] [<000000004e7d5910>] __do_syscall+0x1d0/0x1f8 [ 8611.352955] [<000000004e7e5bc0>] system_call+0x70/0x98 [ 8611.352958] XFS (loop1): Corruption detected. Unmount and run xfs_repair [ 8611.352960] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83 [ 8611.353061] XFS (loop1): log mount/recovery failed: error -117 [ 8611.353227] XFS (loop1): log mount failed # trace-cmd record -e xfs\* -e xlog\* -e printk mount /dev/loop1 /mnt/test # trace-cmd report > events-with-dave-patch.txt [2] https://drive.google.com/file/d/1EeWhmhS9yETegPM4oPcAe5Nuj5fUBu3V/view?usp=sharing > > -Dave. > -- > Dave Chinner > david@fromorbit.com > > xfs: inode recovery does not validate the recovered inode > > From: Dave Chinner <dchinner@redhat.com> > > Discovered when trying to track down a weird recovery corruption > issue that wasn't detected at recovery time. > > The specific corruption was a zero extent count field when big > extent counts are in use, and it turns out the dinode verifier > doesn't detect that specific corruption case, either. So fix it too. 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com> > --- > fs/xfs/libxfs/xfs_inode_buf.c | 3 +++ > fs/xfs/xfs_inode_item_recover.c | 14 +++++++++++++- > 2 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c > index a35781577cad..0f970a0b3382 100644 > --- a/fs/xfs/libxfs/xfs_inode_buf.c > +++ b/fs/xfs/libxfs/xfs_inode_buf.c > @@ -508,6 +508,9 @@ xfs_dinode_verify( > if (mode && nextents + naextents > nblocks) > return __this_address; > > + if (nextents + naextents == 0 && nblocks != 0) > + return __this_address; > + > if (S_ISDIR(mode) && nextents > mp->m_dir_geo->max_extents) > return __this_address; > > diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c > index 6b09e2bf2d74..f4c31c2b60d5 100644 > --- a/fs/xfs/xfs_inode_item_recover.c > +++ b/fs/xfs/xfs_inode_item_recover.c > @@ -286,6 +286,7 @@ xlog_recover_inode_commit_pass2( > struct xfs_log_dinode *ldip; > uint isize; > int need_free = 0; > + xfs_failaddr_t fa; > > if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) { > in_f = item->ri_buf[0].i_addr; > @@ -529,8 +530,19 @@ xlog_recover_inode_commit_pass2( > (dip->di_mode != 0)) > error = xfs_recover_inode_owner_change(mp, dip, in_f, > buffer_list); > - /* re-generate the checksum. */ > + /* re-generate the checksum and validate the recovered inode. */ > xfs_dinode_calc_crc(log->l_mp, dip); > + fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip); > + if (fa) { > + XFS_CORRUPTION_ERROR( > + "Bad dinode after recovery", > + XFS_ERRLEVEL_LOW, mp, dip, sizeof(*dip)); > + xfs_alert(mp, > + "Metadata corruption detected at %pS, inode 0x%llx", > + fa, in_f->ilf_ino); > + error = -EFSCORRUPTED; > + goto out_release; > + } > > ASSERT(bp->b_mount == mp); > bp->b_flags |= _XBF_LOGRECOVERY; > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-09 14:09 ` Zorro Lang
@ 2023-11-09 23:13   ` Dave Chinner
  2023-11-10  1:36     ` Zorro Lang
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2023-11-09 23:13 UTC (permalink / raw)
To: Zorro Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Thu, Nov 09, 2023 at 10:09:29PM +0800, Zorro Lang wrote:
> On Thu, Nov 09, 2023 at 05:14:51PM +1100, Dave Chinner wrote:
> > On Thu, Nov 09, 2023 at 10:43:58AM +0800, Zirong Lang wrote:
> > > By changing the generic/047 as below, I got 2 dump files and 2 log files.
> > > Please check the attachment,
> > > and feel free to tell me if you need more.
> >
> > Well, we've narrowed it down to some weird recovery issue - the journal
> > is intact, recovery recovers the inode from the correct log item in
> > the journal, but the inode written to disk by recovery is corrupt.
> >
> > What this points out is that we don't actually verify that the inode we
> > recover is valid before writeback is queued, nor do we detect the
> > specific corruption being encountered in the verifier (i.e. a non-zero
> > nblocks count when the extent count is zero).
> >
> > Can you add the patch below and see if/when the verifier fires on
> > the reproducer? I'm particularly interested to know where it fires -
> > in recovery before writeback, or when the root inode is re-read from
> > disk. It doesn't fire on x86-64....
>
> Hi Dave,
>
> I just re-tested with your patch, the steps as [1]. The trace-cmd output
> can be downloaded from [2].
>
> Please tell me if I misunderstood anything.

You got it right, that's exactly what I wanted to see. :)

....

> [ 8272.820617] run fstests generic/047 at 2023-11-09 08:36:39
> [ 8273.231657] XFS (loop1): Mounting V5 Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a
> [ 8273.233769] XFS (loop1): Ending clean mount
> [ 8273.235510] XFS (loop1): User initiated shutdown received.
> [ 8273.235520] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem.
> [ 8273.235772] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
> [ 8273.236810] XFS (loop1): Unmounting Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a
> [ 8273.284005] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce
> [ 8273.285668] XFS (loop1): Ending clean mount
> [ 8275.789831] XFS (loop1): User initiated shutdown received.
> [ 8275.789846] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem.
> [ 8275.790170] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
> [ 8275.795171] XFS (loop1): Unmounting Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce
> [ 8282.714748] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce
> [ 8282.728494] XFS (loop1): Starting recovery (logdev: internal)
> [ 8282.741350] 00000000: 49 4e 81 80 03 02 00 00 00 00 00 00 00 00 00 00  IN..............
> [ 8282.741357] 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [ 8282.741360] 00000020: 35 63 5c e5 7f d2 65 f2 35 63 5c e5 7f d2 65 f2  5c\...e.5c\...e.
> [ 8282.741363] 00000030: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 80 00  5c\...e.........
> [ 8282.741366] 00000040: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00  ................
> [ 8282.741368] 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 5d b8 45 2b  ............].E+
> [ 8282.741371] 00000060: ff ff ff ff 77 bb a7 2f 00 00 00 00 00 00 00 04  ....w../........
> [ 8282.741375] 00000070: 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 18  ................
> [ 8282.741378] 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> [ 8282.741381] 00000090: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 00 83  5c\...e.........
> [ 8282.741397] 000000a0: 89 0f 60 68 8e b1 4a fa b1 fb 5d eb 3d 93 7f ce  ..`h..J...].=...

Ok, so that's the inode core that was recovered and it's in memory
before being written to disk. It's clearly not been recovered
correctly - as it's triggered the new extent vs block count check
I added to the verifier.

> [ 8282.741399] XFS (loop1): Internal error Bad dinode after recovery at line 536 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x72/0xf0 [xfs]
> [ 8282.741739] CPU: 1 PID: 7645 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1
> [ 8282.741743] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0)
> [ 8282.741746] Call Trace:
> [ 8282.741748]  [<000000004e7af67a>] dump_stack_lvl+0x62/0x80
> [ 8282.741756]  [<000003ff800b1cb0>] xfs_corruption_error+0x70/0xa0 [xfs]
> [ 8282.741863]  [<000003ff800f0032>] xlog_recover_inode_commit_pass2+0x63a/0xb10 [xfs]
> [ 8282.741973]  [<000003ff800f400a>] xlog_recover_items_pass2+0x72/0xf0 [xfs]
> [ 8282.742082]  [<000003ff800f4cc6>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs]
> [ 8282.742192]  [<000003ff800f4e28>] xlog_recovery_process_trans+0x140/0x148 [xfs]
> [ 8282.742302]  [<000003ff800f4ef0>] xlog_recover_process_ophdr+0xc0/0x180 [xfs]
> [ 8282.742413]  [<000003ff800f59b6>] xlog_recover_process_data+0xb6/0x168 [xfs]
> [ 8282.742528]  [<000003ff800f5b6c>] xlog_recover_process+0x104/0x150 [xfs]
> [ 8282.742638]  [<000003ff800f5f6a>] xlog_do_recovery_pass+0x3b2/0x748 [xfs]
> [ 8282.742747]  [<000003ff800f6758>] xlog_do_log_recovery+0x88/0xd8 [xfs]
> [ 8282.742856]  [<000003ff800f67f4>] xlog_do_recover+0x4c/0x218 [xfs]
> [ 8282.742965]  [<000003ff800f7cfa>] xlog_recover+0xda/0x1a0 [xfs]
> [ 8282.743073]  [<000003ff800dde96>] xfs_log_mount+0x11e/0x280 [xfs]
> [ 8282.743181]  [<000003ff800cfcc6>] xfs_mountfs+0x3e6/0x928 [xfs]
> [ 8282.743289]  [<000003ff800d7554>] xfs_fs_fill_super+0x40c/0x7d8 [xfs]
> [ 8282.743397]  [<000000004e160fec>] get_tree_bdev+0x144/0x1d0
> [ 8282.743403]  [<000000004e15ea20>] vfs_get_tree+0x38/0x110
> [ 8282.743406]  [<000000004e19140a>] do_new_mount+0x17a/0x2d0
> [ 8282.743411]  [<000000004e192084>] path_mount+0x1ac/0x818
> [ 8282.743414]  [<000000004e1927f4>] __s390x_sys_mount+0x104/0x148
> [ 8282.743417]  [<000000004e7d5910>] __do_syscall+0x1d0/0x1f8
> [ 8282.743420]  [<000000004e7e5bc0>] system_call+0x70/0x98
> [ 8282.743423] XFS (loop1): Corruption detected. Unmount and run xfs_repair
> [ 8282.743425] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83

Ok inode 0x83 - that's not the root inode, so slightly unexpected.
The trace, however, tells me that this is the first inode recovered
from the checkpoint (the root inode is the other in the checkpoint)
and we knew that the regular file inodes were bad, too.

Essentially, the inode dump tells us that everything but the extent
counts were recovered correctly from the inode log item. I guess we
now need to do specific trace_printk debugging to find out what,
exactly, is being processed incorrectly on s390.

Can you add the debug patch below on top of the corruption detection
patch? If it fails, run the same trace-cmd on the failed mount as
you've done here?

If it does not fail, it would still be good to get a trace of the
mount that performs recovery successfully. To do that you'll have to
modify the test to trace the specific mount rather than do it
separately after the test fails...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: s390 inode recovery debug

---
 fs/xfs/xfs_inode_item_recover.c | 44 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index f4c31c2b60d5..2a0166702192 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -165,6 +165,41 @@ xfs_log_dinode_to_disk_iext_counters(
 
 }
 
+static void dump_inodes(
+	struct xfs_log_dinode	*ldip,
+	struct xfs_dinode	*dip,
+	char			*ctx)
+{
+	smp_mb();
+
+	trace_printk(
+"disk ino 0x%llx: %s nblocks 0x%llx nextents 0x%x/0x%llx anextents 0x%x/0x%llx v3_pad 0x%llx nrext64_pad 0x%x di_flags2 0x%llx",
+		be64_to_cpu(dip->di_ino),
+		ctx,
+		be64_to_cpu(dip->di_nblocks),
+		be32_to_cpu(dip->di_nextents),
+		be64_to_cpu(dip->di_big_nextents),
+		be32_to_cpu(dip->di_anextents),
+		be64_to_cpu(dip->di_big_anextents),
+		be64_to_cpu(dip->di_v3_pad),
+		be16_to_cpu(dip->di_nrext64_pad),
+		be64_to_cpu(dip->di_flags2));
+	trace_printk(
+"log ino 0x%llx: %s nblocks 0x%llx nextents 0x%x/0x%llx anextents 0x%x/0x%x v3_pad 0x%llx nrext64_pad 0x%x di_flags2 0x%llx %s",
+		ldip->di_ino,
+		ctx,
+		ldip->di_nblocks,
+		ldip->di_nextents,
+		ldip->di_big_nextents,
+		ldip->di_anextents,
+		ldip->di_big_anextents,
+		ldip->di_v3_pad,
+		ldip->di_nrext64_pad,
+		ldip->di_flags2,
+		xfs_log_dinode_has_large_extent_counts(ldip) ? "big" : "small");
+	smp_mb();
+}
+
 STATIC void
 xfs_log_dinode_to_disk(
 	struct xfs_log_dinode	*from,
@@ -196,6 +231,8 @@ xfs_log_dinode_to_disk(
 	to->di_flags = cpu_to_be16(from->di_flags);
 	to->di_gen = cpu_to_be32(from->di_gen);
 
+	dump_inodes(from, to, "before v3");
+
 	if (from->di_version == 3) {
 		to->di_changecount = cpu_to_be64(from->di_changecount);
 		to->di_crtime = xfs_log_dinode_to_disk_ts(from,
@@ -212,6 +249,8 @@ xfs_log_dinode_to_disk(
 		memset(to->di_v2_pad, 0, sizeof(to->di_v2_pad));
 	}
 
+	dump_inodes(from, to, "before iext");
+
 	xfs_log_dinode_to_disk_iext_counters(from, to);
 }
 
@@ -337,6 +376,7 @@ xlog_recover_inode_commit_pass2(
 		goto out_release;
 	}
 
+	dump_inodes(ldip, dip, "init");
 	/*
 	 * If the inode has an LSN in it, recover the inode only if the on-disk
 	 * inode's LSN is older than the lsn of the transaction we are
@@ -441,6 +481,7 @@ xlog_recover_inode_commit_pass2(
 		goto out_release;
 	}
 
+	dump_inodes(ldip, dip, "pre");
 	/*
 	 * Recover the log dinode inode into the on disk inode.
 	 *
@@ -453,6 +494,8 @@ xlog_recover_inode_commit_pass2(
 	 */
 	xfs_log_dinode_to_disk(ldip, dip, current_lsn);
 
+	dump_inodes(ldip, dip, "post");
+
 	fields = in_f->ilf_fields;
 	if (fields & XFS_ILOG_DEV)
 		xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev);
@@ -530,6 +573,7 @@ xlog_recover_inode_commit_pass2(
 	    (dip->di_mode != 0))
 		error = xfs_recover_inode_owner_change(mp, dip, in_f,
 						buffer_list);
+	dump_inodes(ldip, dip, "done");
 	/* re-generate the checksum and validate the recovered inode. */
 	xfs_dinode_calc_crc(log->l_mp, dip);
 	fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip);
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs] 2023-11-09 23:13 ` Dave Chinner @ 2023-11-10 1:36 ` Zorro Lang 2023-11-10 2:03 ` Dave Chinner 0 siblings, 1 reply; 19+ messages in thread From: Zorro Lang @ 2023-11-10 1:36 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino On Fri, Nov 10, 2023 at 10:13:26AM +1100, Dave Chinner wrote: > On Thu, Nov 09, 2023 at 10:09:29PM +0800, Zorro Lang wrote: > > On Thu, Nov 09, 2023 at 05:14:51PM +1100, Dave Chinner wrote: > > > On Thu, Nov 09, 2023 at 10:43:58AM +0800, Zirong Lang wrote: > > > > By changing the generic/047 as below, I got 2 dump files and 2 log files. > > > > Please check the attachment, > > > > and feel free to tell me if you need more. > > > > > > Well, we've narrowed down to some weird recovery issue - the journal > > > is intact, recovery recovers the inode from the correct log item in > > > the journal, but the inode written to disk by recovery is corrupt. > > > > > > What this points out is that we don't actually verify the inode we > > > recover is valid before writeback is queued, nor do we detect the > > > specific corruption being encountered in the verifier (i.e. non-zero > > > nblocks count when extent count is zero). > > > > > > Can you add the patch below and see if/when the verifier fires on > > > the reproducer? I'm particularly interested to know where it fires - > > > in recovery before writeback, or when the root inode is re-read from > > > disk. It doesn't fire on x64-64.... > > > > Hi Dave, > > > > I just re-tested with your patch, the steps as [1]. The trace-cmd output > > can be downloaded from [2]. > > > > Please tell me if I misunderstood anything. > > You got it right, that's exactly what I wanted to see. :) > > .... > > [ 8272.820617] run fstests generic/047 at 2023-11-09 08:36:39 > > [ 8273.231657] XFS (loop1): Mounting V5 Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a > > [ 8273.233769] XFS (loop1): Ending clean mount > > [ 8273.235510] XFS (loop1): User initiated shutdown received. > > [ 8273.235520] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem. > > [ 8273.235772] XFS (loop1): Please unmount the filesystem and rectify the problem(s) > > [ 8273.236810] XFS (loop1): Unmounting Filesystem 9faf1858-4d74-4b58-b0e7-68fd1d90762a > > [ 8273.284005] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce > > [ 8273.285668] XFS (loop1): Ending clean mount > > [ 8275.789831] XFS (loop1): User initiated shutdown received. > > [ 8275.789846] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem. > > [ 8275.790170] XFS (loop1): Please unmount the filesystem and rectify the problem(s) > > [ 8275.795171] XFS (loop1): Unmounting Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce > > [ 8282.714748] XFS (loop1): Mounting V5 Filesystem 890f6068-8eb1-4afa-b1fb-5deb3d937fce > > [ 8282.728494] XFS (loop1): Starting recovery (logdev: internal) > > [ 8282.741350] 00000000: 49 4e 81 80 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. > > [ 8282.741357] 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > [ 8282.741360] 00000020: 35 63 5c e5 7f d2 65 f2 35 63 5c e5 7f d2 65 f2 5c\...e.5c\...e. 
> > [ 8282.741363] 00000030: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 80 00 5c\...e......... > > [ 8282.741366] 00000040: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 ................ > > [ 8282.741368] 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 5d b8 45 2b ............].E+ > > [ 8282.741371] 00000060: ff ff ff ff 77 bb a7 2f 00 00 00 00 00 00 00 04 ....w../........ > > [ 8282.741375] 00000070: 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 18 ................ > > [ 8282.741378] 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > [ 8282.741381] 00000090: 35 63 5c e5 7f d2 65 f2 00 00 00 00 00 00 00 83 5c\...e......... > > [ 8282.741397] 000000a0: 89 0f 60 68 8e b1 4a fa b1 fb 5d eb 3d 93 7f ce ..`h..J...].=... > > Ok, so that's the inode core that was recovered and it's in memory > before being written to disk. It's clearly not been recovered > correctly - as it's triggered the new extent vs block count check > I added to the verifier. > > > [ 8282.741399] XFS (loop1): Internal error Bad dinode after recovery at line 536 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] > > [ 8282.741739] CPU: 1 PID: 7645 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1 > > [ 8282.741743] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0) > > [ 8282.741746] Call Trace: > > [ 8282.741748] [<000000004e7af67a>] dump_stack_lvl+0x62/0x80 > > [ 8282.741756] [<000003ff800b1cb0>] xfs_corruption_error+0x70/0xa0 [xfs] > > [ 8282.741863] [<000003ff800f0032>] xlog_recover_inode_commit_pass2+0x63a/0xb10 [xfs] > > [ 8282.741973] [<000003ff800f400a>] xlog_recover_items_pass2+0x72/0xf0 [xfs] > > [ 8282.742082] [<000003ff800f4cc6>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] > > [ 8282.742192] [<000003ff800f4e28>] xlog_recovery_process_trans+0x140/0x148 [xfs] > > [ 8282.742302] [<000003ff800f4ef0>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] > > [ 8282.742413] [<000003ff800f59b6>] xlog_recover_process_data+0xb6/0x168 [xfs] > > [ 8282.742528] [<000003ff800f5b6c>] xlog_recover_process+0x104/0x150 [xfs] > > [ 8282.742638] [<000003ff800f5f6a>] xlog_do_recovery_pass+0x3b2/0x748 [xfs] > > [ 8282.742747] [<000003ff800f6758>] xlog_do_log_recovery+0x88/0xd8 [xfs] > > [ 8282.742856] [<000003ff800f67f4>] xlog_do_recover+0x4c/0x218 [xfs] > > [ 8282.742965] [<000003ff800f7cfa>] xlog_recover+0xda/0x1a0 [xfs] > > [ 8282.743073] [<000003ff800dde96>] xfs_log_mount+0x11e/0x280 [xfs] > > [ 8282.743181] [<000003ff800cfcc6>] xfs_mountfs+0x3e6/0x928 [xfs] > > [ 8282.743289] [<000003ff800d7554>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] > > [ 8282.743397] [<000000004e160fec>] get_tree_bdev+0x144/0x1d0 > > [ 8282.743403] [<000000004e15ea20>] vfs_get_tree+0x38/0x110 > > [ 8282.743406] [<000000004e19140a>] do_new_mount+0x17a/0x2d0 > > [ 8282.743411] [<000000004e192084>] path_mount+0x1ac/0x818 > > [ 8282.743414] [<000000004e1927f4>] __s390x_sys_mount+0x104/0x148 > > [ 8282.743417] [<000000004e7d5910>] __do_syscall+0x1d0/0x1f8 > > [ 8282.743420] [<000000004e7e5bc0>] system_call+0x70/0x98 > > [ 8282.743423] XFS (loop1): Corruption detected. Unmount and run xfs_repair > > [ 8282.743425] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83 > > Ok inode 0x83 - that's not the root inode, so slightly unexpected. > The trace, however, tells me that this is the first inode recovered > from the checkpoint (the root inode is the other in the checkpoint) > and we knew that the regular file inodes were bad, too. 
> > Essentially, the inode dump tells use that everything but the extent > counts were recovered correctly from the inode log item. I guess we > now need to do specific trace_printk debugging to find out what, > exactly, is being processed incorrectly on s390. > > Can you add the debug patch below on top of the corruption detection > patch? If it fails, run the same trace-cmd on the failed mount as > you've done here? > > If it does not fail, it would still be good to get a trace of the > mount that performs recovery successfully. To do that you'll > have to modify the test to trace the specific mount rather than do > it separately after the test fails... The g/047 still fails with this 2nd patch. So I did below steps [1], and get the trace output as [2], those dump_inodes() messages you added have been printed, please check. Thanks, Zorro [1] # ./check generic/047 FSTYP -- xfs (debug) PLATFORM -- Linux/s390x ibm-z-507 6.6.0+ #1 SMP Wed Nov 8 12:51:20 EST 2023 MKFS_OPTIONS -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1 MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR generic/047 [failed, exit status 1]_check_dmesg: something found in dmesg (see /var/lib/xfstests/results//generic/047.dmesg) - output mismatch (see /var/lib/xfstests/results//generic/047.out.bad) --- tests/generic/047.out 2023-11-08 13:02:38.424036829 -0500 +++ /var/lib/xfstests/results//generic/047.out.bad 2023-11-09 20:19:37.313375849 -0500 @@ -1 +1,4 @@ QA output created by 047 +mount: /mnt/fstests/SCRATCH_DIR: mount(2) system call failed: Structure needs cleaning. +mount -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR failed +(see /var/lib/xfstests/results//generic/047.full for details) ... (Run 'diff -u /var/lib/xfstests/tests/generic/047.out /var/lib/xfstests/results//generic/047.out.bad' to see the entire diff) Ran: generic/047 Failures: generic/047 Failed 1 of 1 tests # mount /dev/loop1 /mnt/fstests/SCRATCH_DIR mount: /mnt/fstests/SCRATCH_DIR: mount(2) system call failed: Structure needs cleaning. # dmesg [ 220.067517] run fstests generic/047 at 2023-11-09 20:19:33 [ 220.509393] XFS (loop1): Mounting V5 Filesystem 83788739-5d56-4018-a6ae-f808f4d9ac0b [ 220.511726] XFS (loop1): Ending clean mount [ 220.513584] XFS (loop1): User initiated shutdown received. [ 220.513591] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem. [ 220.513909] XFS (loop1): Please unmount the filesystem and rectify the problem(s) [ 220.515071] XFS (loop1): Unmounting Filesystem 83788739-5d56-4018-a6ae-f808f4d9ac0b [ 220.566039] XFS (loop1): Mounting V5 Filesystem d53516b1-cb48-4c9b-b7bf-0d522a3ed6a0 [ 220.568044] XFS (loop1): Ending clean mount [ 223.389303] XFS (loop1): User initiated shutdown received. [ 223.389320] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem. [ 223.389661] XFS (loop1): Please unmount the filesystem and rectify the problem(s) [ 223.396485] XFS (loop1): Unmounting Filesystem d53516b1-cb48-4c9b-b7bf-0d522a3ed6a0 [ 223.405668] XFS (loop1): Mounting V5 Filesystem d53516b1-cb48-4c9b-b7bf-0d522a3ed6a0 [ 223.422333] XFS (loop1): Starting recovery (logdev: internal) [ 223.436377] 00000000: 49 4e 81 80 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. [ 223.436387] 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ 
[ 223.436390] 00000020: 35 63 83 41 18 01 21 e9 35 63 83 41 18 01 21 e9 5c.A..!.5c.A..!. [ 223.436394] 00000030: 35 63 83 41 18 01 21 e9 00 00 00 00 00 00 80 00 5c.A..!......... [ 223.436397] 00000040: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 ................ [ 223.436400] 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 5a be 98 f1 ............Z... [ 223.436417] 00000060: ff ff ff ff 5e c5 ed 0a 00 00 00 00 00 00 00 04 ....^........... [ 223.436418] 00000070: 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 18 ................ [ 223.436420] 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 223.436422] 00000090: 35 63 83 41 18 01 21 e9 00 00 00 00 00 00 00 83 5c.A..!......... [ 223.436424] 000000a0: d5 35 16 b1 cb 48 4c 9b b7 bf 0d 52 2a 3e d6 a0 .5...HL....R*>.. [ 223.436426] XFS (loop1): Internal error Bad dinode after recovery at line 580 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 223.436696] CPU: 1 PID: 7579 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1 [ 223.436700] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0) [ 223.436701] Call Trace: [ 223.436703] [<00000000e1f5f67a>] dump_stack_lvl+0x62/0x80 [ 223.436711] [<000003ff7ff8bcd8>] xfs_corruption_error+0x70/0xa0 [xfs] [ 223.436800] [<000003ff7ffca1f4>] xlog_recover_inode_commit_pass2+0x674/0xb40 [xfs] [ 223.436894] [<000003ff7ffce1c2>] xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 223.436985] [<000003ff7ffcee7e>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] [ 223.437067] [<000003ff7ffcefe0>] xlog_recovery_process_trans+0x140/0x148 [xfs] [ 223.437164] [<000003ff7ffcf0a8>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] [ 223.437272] [<000003ff7ffcfb6e>] xlog_recover_process_data+0xb6/0x168 [xfs] [ 223.437381] [<000003ff7ffcfd24>] xlog_recover_process+0x104/0x150 [xfs] [ 223.437490] [<000003ff7ffd0122>] xlog_do_recovery_pass+0x3b2/0x748 [xfs] [ 223.437594] [<000003ff7ffd0910>] xlog_do_log_recovery+0x88/0xd8 [xfs] [ 223.437697] [<000003ff7ffd09ac>] xlog_do_recover+0x4c/0x218 [xfs] [ 223.437799] [<000003ff7ffd1eb2>] xlog_recover+0xda/0x1a0 [xfs] [ 223.437905] [<000003ff7ffb7ebe>] xfs_log_mount+0x11e/0x280 [xfs] [ 223.438013] [<000003ff7ffa9cee>] xfs_mountfs+0x3e6/0x928 [xfs] [ 223.438120] [<000003ff7ffb157c>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] [ 223.438228] [<00000000e1910fec>] get_tree_bdev+0x144/0x1d0 [ 223.438233] [<00000000e190ea20>] vfs_get_tree+0x38/0x110 [ 223.438236] [<00000000e194140a>] do_new_mount+0x17a/0x2d0 [ 223.438240] [<00000000e1942084>] path_mount+0x1ac/0x818 [ 223.438243] [<00000000e19427f4>] __s390x_sys_mount+0x104/0x148 [ 223.438246] [<00000000e1f85910>] __do_syscall+0x1d0/0x1f8 [ 223.438249] [<00000000e1f95bc0>] system_call+0x70/0x98 [ 223.438252] XFS (loop1): Corruption detected. Unmount and run xfs_repair [ 223.438255] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83 [ 223.438415] XFS (loop1): log mount/recovery failed: error -117 [ 223.438903] XFS (loop1): log mount failed [ 223.486754] XFS (loop0): Unmounting Filesystem a488de9e-d346-4098-aff4-84e6ca6936c7 [ 345.343101] XFS (loop1): Mounting V5 Filesystem d53516b1-cb48-4c9b-b7bf-0d522a3ed6a0 [ 345.354273] XFS (loop1): Starting recovery (logdev: internal) [ 345.362837] 00000000: 49 4e 81 80 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. [ 345.362840] 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 345.362842] 00000020: 35 63 83 41 18 01 21 e9 35 63 83 41 18 01 21 e9 5c.A..!.5c.A..!. 
[ 345.362845] 00000030: 35 63 83 41 18 01 21 e9 00 00 00 00 00 00 80 00 5c.A..!......... [ 345.362847] 00000040: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 ................ [ 345.362849] 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 5a be 98 f1 ............Z... [ 345.362851] 00000060: ff ff ff ff 5e c5 ed 0a 00 00 00 00 00 00 00 04 ....^........... [ 345.362853] 00000070: 00 00 00 01 00 00 00 02 00 00 00 00 00 00 00 18 ................ [ 345.362855] 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 345.362858] 00000090: 35 63 83 41 18 01 21 e9 00 00 00 00 00 00 00 83 5c.A..!......... [ 345.362919] 000000a0: d5 35 16 b1 cb 48 4c 9b b7 bf 0d 52 2a 3e d6 a0 .5...HL....R*>.. [ 345.362928] XFS (loop1): Internal error Bad dinode after recovery at line 580 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] ... ... [ 345.362928] XFS (loop1): Internal error Bad dinode after recovery at line 580 of file fs/xfs/xfs_inode_item_recover.c. Caller xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 345.363185] CPU: 0 PID: 7657 Comm: mount Kdump: loaded Not tainted 6.6.0+ #1 [ 345.363188] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0) [ 345.363190] Call Trace: [ 345.363192] [<00000000e1f5f67a>] dump_stack_lvl+0x62/0x80 [ 345.363199] [<000003ff7ff8bcd8>] xfs_corruption_error+0x70/0xa0 [xfs] [ 345.363305] [<000003ff7ffca1f4>] xlog_recover_inode_commit_pass2+0x674/0xb40 [xfs] [ 345.363387] [<000003ff7ffce1c2>] xlog_recover_items_pass2+0x72/0xf0 [xfs] [ 345.363469] [<000003ff7ffcee7e>] xlog_recover_commit_trans+0x39e/0x3c0 [xfs] [ 345.363550] [<000003ff7ffcefe0>] xlog_recovery_process_trans+0x140/0x148 [xfs] [ 345.363631] [<000003ff7ffcf0a8>] xlog_recover_process_ophdr+0xc0/0x180 [xfs] [ 345.363713] [<000003ff7ffcfb6e>] xlog_recover_process_data+0xb6/0x168 [xfs] [ 345.363824] [<000003ff7ffcfd24>] xlog_recover_process+0x104/0x150 [xfs] [ 345.363906] [<000003ff7ffd0122>] xlog_do_recovery_pass+0x3b2/0x748 [xfs] [ 345.363987] [<000003ff7ffd0910>] xlog_do_log_recovery+0x88/0xd8 [xfs] [ 345.364067] [<000003ff7ffd09ac>] xlog_do_recover+0x4c/0x218 [xfs] [ 345.364146] [<000003ff7ffd1eb2>] xlog_recover+0xda/0x1a0 [xfs] [ 345.364228] [<000003ff7ffb7ebe>] xfs_log_mount+0x11e/0x280 [xfs] [ 345.364307] [<000003ff7ffa9cee>] xfs_mountfs+0x3e6/0x928 [xfs] [ 345.364386] [<000003ff7ffb157c>] xfs_fs_fill_super+0x40c/0x7d8 [xfs] [ 345.364465] [<00000000e1910fec>] get_tree_bdev+0x144/0x1d0 [ 345.364470] [<00000000e190ea20>] vfs_get_tree+0x38/0x110 [ 345.364472] [<00000000e194140a>] do_new_mount+0x17a/0x2d0 [ 345.364476] [<00000000e1942084>] path_mount+0x1ac/0x818 [ 345.364478] [<00000000e19427f4>] __s390x_sys_mount+0x104/0x148 [ 345.364480] [<00000000e1f85910>] __do_syscall+0x1d0/0x1f8 [ 345.364483] [<00000000e1f95bc0>] system_call+0x70/0x98 [ 345.364485] XFS (loop1): Corruption detected. Unmount and run xfs_repair [ 345.364486] XFS (loop1): Metadata corruption detected at xfs_dinode_verify.part.0+0x2c6/0x580 [xfs], inode 0x83 [ 345.364596] XFS (loop1): log mount/recovery failed: error -117 [ 345.364877] XFS (loop1): log mount failed # trace-cmd record -e xfs\* -e xlog\* -e printk mount /dev/loop1 /mnt/test mount: /mnt/test: mount(2) system call failed: Structure needs cleaning. 
CPU0 data recorded at offset=0x634000 6385664 bytes in size CPU1 data recorded at offset=0xc4b000 0 bytes in size # trace-cmd report > events-with-dave-patch2.txt # bzip2 events-with-dave-patch2.txt [2] https://drive.google.com/file/d/1Ut1EUmlkxqxuyIhrLcLfhwEhLfeCABs4/view?usp=sharing > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > > xfs: s390 inode recovery debug > > --- > fs/xfs/xfs_inode_item_recover.c | 44 +++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c > index f4c31c2b60d5..2a0166702192 100644 > --- a/fs/xfs/xfs_inode_item_recover.c > +++ b/fs/xfs/xfs_inode_item_recover.c > @@ -165,6 +165,41 @@ xfs_log_dinode_to_disk_iext_counters( > > } > > +static void dump_inodes( > + struct xfs_log_dinode *ldip, > + struct xfs_dinode *dip, > + char *ctx) > +{ > + smp_mb(); > + > + trace_printk( > +"disk ino 0x%llx: %s nblocks 0x%llx nextents 0x%x/0x%llx anextents 0x%x/0x%llx v3_pad 0x%llx nrext64_pad 0x%x di_flags2 0x%llx", > + be64_to_cpu(dip->di_ino), > + ctx, > + be64_to_cpu(dip->di_nblocks), > + be32_to_cpu(dip->di_nextents), > + be64_to_cpu(dip->di_big_nextents), > + be32_to_cpu(dip->di_anextents), > + be64_to_cpu(dip->di_big_anextents), > + be64_to_cpu(dip->di_v3_pad), > + be16_to_cpu(dip->di_nrext64_pad), > + be64_to_cpu(dip->di_flags2)); > + trace_printk( > +"log ino 0x%llx: %s nblocks 0x%llx nextents 0x%x/0x%llx anextents 0x%x/0x%x v3_pad 0x%llx nrext64_pad 0x%x di_flags2 0x%llx %s", > + ldip->di_ino, > + ctx, > + ldip->di_nblocks, > + ldip->di_nextents, > + ldip->di_big_nextents, > + ldip->di_anextents, > + ldip->di_big_anextents, > + ldip->di_v3_pad, > + ldip->di_nrext64_pad, > + ldip->di_flags2, > + xfs_log_dinode_has_large_extent_counts(ldip) ? "big" : "small"); > + smp_mb(); > +} > + > STATIC void > xfs_log_dinode_to_disk( > struct xfs_log_dinode *from, > @@ -196,6 +231,8 @@ xfs_log_dinode_to_disk( > to->di_flags = cpu_to_be16(from->di_flags); > to->di_gen = cpu_to_be32(from->di_gen); > > + dump_inodes(from, to, "before v3"); > + > if (from->di_version == 3) { > to->di_changecount = cpu_to_be64(from->di_changecount); > to->di_crtime = xfs_log_dinode_to_disk_ts(from, > @@ -212,6 +249,8 @@ xfs_log_dinode_to_disk( > memset(to->di_v2_pad, 0, sizeof(to->di_v2_pad)); > } > > + dump_inodes(from, to, "before iext"); > + > xfs_log_dinode_to_disk_iext_counters(from, to); > } > > @@ -337,6 +376,7 @@ xlog_recover_inode_commit_pass2( > goto out_release; > } > > + dump_inodes(ldip, dip, "init"); > /* > * If the inode has an LSN in it, recover the inode only if the on-disk > * inode's LSN is older than the lsn of the transaction we are > @@ -441,6 +481,7 @@ xlog_recover_inode_commit_pass2( > goto out_release; > } > > + dump_inodes(ldip, dip, "pre"); > /* > * Recover the log dinode inode into the on disk inode. > * > @@ -453,6 +494,8 @@ xlog_recover_inode_commit_pass2( > */ > xfs_log_dinode_to_disk(ldip, dip, current_lsn); > > + dump_inodes(ldip, dip, "post"); > + > fields = in_f->ilf_fields; > if (fields & XFS_ILOG_DEV) > xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev); > @@ -530,6 +573,7 @@ xlog_recover_inode_commit_pass2( > (dip->di_mode != 0)) > error = xfs_recover_inode_owner_change(mp, dip, in_f, > buffer_list); > + dump_inodes(ldip, dip, "done"); > /* re-generate the checksum and validate the recovered inode. */ > xfs_dinode_calc_crc(log->l_mp, dip); > fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip); > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-10  1:36 ` Zorro Lang
@ 2023-11-10  2:03   ` Dave Chinner
  2023-11-10  4:32     ` Darrick J. Wong
                        ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Dave Chinner @ 2023-11-10 2:03 UTC (permalink / raw)
To: Zorro Lang; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Fri, Nov 10, 2023 at 09:36:51AM +0800, Zorro Lang wrote:
> The g/047 still fails with this 2nd patch. So I did below steps [1],
> and get the trace output as [2], those dump_inodes() messages you
> added have been printed, please check.

And that points me at the bug.

dump_inodes: disk ino 0x83: init nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
dump_inodes: log ino 0x83: init nblocks 0x8 nextents 0x0/0x1 anextents 0x0/0x0 v3_pad 0x1 nrext64_pad 0x0 di_flags2 0x18 big
                                                      ^^^^^^^
The initial log inode is correct.

dump_inodes: disk ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
dump_inodes: log ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 big
                                                     ^^^^^^^
.... but on the second sample, it's been modified and the extent
count has been zeroed? Huh, that is unexpected - what did that?

Oh.

Can you test the patch below and see if it fixes the issue? Keep
the first verifier patch I sent, then apply the patch below. You can
drop the debug trace_printk patch - the patch below should fix it.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: recovery should not clear di_flushiter unconditionally

From: Dave Chinner <dchinner@redhat.com>

Because on v3 inodes, di_flushiter doesn't exist. It overlaps with
zero padding in the inode, except when NREXT64=1 configurations are
in use and the zero padding is no longer padding but holds the 64
bit extent counter.

This manifests obviously on big endian platforms (e.g. s390) because
the log dinode is in host order and the overlap is the LSBs of the
extent count field. It is not noticed on little endian machines
because the overlap is at the MSB end of the extent count field and
we need to get more than 2^48 extents in the inode before it
manifests. i.e. the heat death of the universe will occur before we
see the problem on little endian machines.

This is a zero-day issue for NREXT64=1 configurations on big endian
machines. Fix it by only clearing di_flushiter on v2 inodes during
recovery.
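[The endian asymmetry the commit message describes can be reproduced with a stand-alone mock of the log dinode's host-order union. This is a sketch under the assumption that plain uintN_t fields model the kernel layout faithfully; it is not kernel code:]

#include <stdint.h>
#include <stdio.h>

/* Mock of the host-order extent-count area of the log dinode: the
 * 16-bit di_flushiter overlaps bytes 6-7 of di_big_nextents. */
union extent_area_mock {
	uint64_t di_big_nextents;
	struct {
		uint8_t  di_v2_pad[6];
		uint16_t di_flushiter;
	};
};

int main(void)
{
	union extent_area_mock a;

	a.di_big_nextents = 1;	/* one data extent, as on inode 0x83 */
	a.di_flushiter = 0;	/* the unconditional reset in recovery */

	/*
	 * On a big-endian host bytes 6-7 are the least significant
	 * bytes of the 64-bit count, so it collapses to 0; on a
	 * little-endian host they are bits 48-63, so the clobber is
	 * invisible for any realistic extent count.
	 */
	printf("nextents after flushiter reset: %llu\n",
	       (unsigned long long)a.di_big_nextents);
	return 0;
}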
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs] 2023-11-10 2:03 ` Dave Chinner @ 2023-11-10 4:32 ` Darrick J. Wong 2023-11-10 7:34 ` Christoph Hellwig ` (2 subsequent siblings) 3 siblings, 0 replies; 19+ messages in thread From: Darrick J. Wong @ 2023-11-10 4:32 UTC (permalink / raw) To: Dave Chinner; +Cc: Zorro Lang, linux-xfs, fstests, Carlos Maiolino On Fri, Nov 10, 2023 at 01:03:49PM +1100, Dave Chinner wrote: > On Fri, Nov 10, 2023 at 09:36:51AM +0800, Zorro Lang wrote: > > The g/047 still fails with this 2nd patch. So I did below steps [1], > > and get the trace output as [2], those dump_inodes() messages you > > added have been printed, please check. > > And that points me at the bug. > > dump_inodes: disk ino 0x83: init nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 > dump_inodes: log ino 0x83: init nblocks 0x8 nextents 0x0/0x1 anextents 0x0/0x0 v3_pad 0x1 nrext64_pad 0x0 di_flags2 0x18 big > ^^^^^^^ > The initial log inode is correct. > > dump_inodes: disk ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 > dump_inodes: log ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 big > ^^^^^^^ > > .... but on the second sample, it's been modified and the extent > count has been zeroed? Huh, that is unexpected - what did that? > > Oh. > > Can you test the patch below and see if it fixes the issue? Keep > the first verifier patch I sent, then apply the patch below. You can > drop the debug traceprintk patch - the patch below should fix it. > > -Dave. > -- > Dave Chinner > david@fromorbit.com > > xfs: recovery should not clear di_flushiter unconditionally > > From: Dave Chinner <dchinner@redhat.com> > > Because on v3 inodes, di_flushiter doesn't exist. It overlaps with > zero padding in the inode, except when NREXT64=1 configurations are > in use and the zero padding is no longer padding but holds the 64 > bit extent counter. > > This manifests obviously on big endian platforms (e.g. s390) because > the log dinode is in host order and the overlap is the LSBs of the > extent count field. It is not noticed on little endian machines > because the overlap is at the MSB end of the extent count field and > we need to get more than 2^^48 extents in the inode before it > manifests. i.e. the heat death of the universe will occur before we > see the problem in little endian machines. > > This is a zero-day issue for NREXT64=1 configuraitons on big endian > machines. Fix it by only clearing di_flushiter on v2 inodes during > recovery. 
>
> Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
> cc: stable@kernel.org # 5.19+
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++---------------
>  1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index f4c31c2b60d5..dbdab4ce7c44 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -371,24 +371,26 @@ xlog_recover_inode_commit_pass2(
>  	 * superblock flag to determine whether we need to look at di_flushiter
>  	 * to skip replay when the on disk inode is newer than the log one
>  	 */
> -	if (!xfs_has_v3inodes(mp) &&
> -	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> -		/*
> -		 * Deal with the wrap case, DI_MAX_FLUSH is less
> -		 * than smaller numbers
> -		 */
> -		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> -		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> -			/* do nothing */
> -		} else {
> -			trace_xfs_log_recover_inode_skip(log, in_f);
> -			error = 0;
> -			goto out_release;
> +	if (!xfs_has_v3inodes(mp)) {
> +		if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> +			/*
> +			 * Deal with the wrap case, DI_MAX_FLUSH is less
> +			 * than smaller numbers
> +			 */
> +			if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> +			    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> +				/* do nothing */
> +			} else {
> +				trace_xfs_log_recover_inode_skip(log, in_f);
> +				error = 0;
> +				goto out_release;
> +			}
>  		}
> +
> +		/* Take the opportunity to reset the flush iteration count */
> +		ldip->di_flushiter = 0;
>  	}
>
> -	/* Take the opportunity to reset the flush iteration count */
> -	ldip->di_flushiter = 0;

That's an unfortunate logic bomb left over from the V5 introduction.
I guess it was benign until we finally reused that part of the
xfs_dinode.

If this fixes Zorro's machine, then:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

I wonder, if we made XFS_SUPPORT_V4=n turn xfs_has_v3inodes and
xfs_has_crc into #define'd true symbols, could we then rename all the
V4-only fields to see what stops compiling?

(Probably not, gcc will still want to parse it all even if the source
code itself is dead...)

--D

>
>  	if (unlikely(S_ISREG(ldip->di_mode))) {
>  		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
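The XFS_SUPPORT_V4=n idea Darrick floats would look roughly like the
sketch below. This is purely hypothetical: CONFIG_XFS_SUPPORT_V4 is a real
Kconfig option, but the in-tree predicates are generated by the
__XFS_HAS_FEAT() macro rather than spelled out like this, and the feature
flag names here are illustrative only.

/* Hypothetical sketch; not how fs/xfs/xfs_mount.h defines these. */
#ifdef CONFIG_XFS_SUPPORT_V4
# define xfs_has_v3inodes(mp)	((mp)->m_features & XFS_FEAT_V3INODES)
# define xfs_has_crc(mp)	((mp)->m_features & XFS_FEAT_CRC)
#else
/* V4 support compiled out: every supported filesystem is V5. */
# define xfs_has_v3inodes(mp)	(true)
# define xfs_has_crc(mp)	(true)
#endif

As the parenthetical concedes, the compiler still parses branches that
constant folding would discard, so renaming the V4-only fields would break
the build even where xfs_has_v3inodes() is compile-time true.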
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-10  2:03   ` Dave Chinner
  2023-11-10  4:32     ` Darrick J. Wong
@ 2023-11-10  7:34     ` Christoph Hellwig
  2023-11-10 13:56     ` Zorro Lang
  2023-11-14 11:17     ` edward6
  3 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-11-10  7:34 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Zorro Lang, linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

> +	if (!xfs_has_v3inodes(mp)) {
> +		if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> +			/*
> +			 * Deal with the wrap case, DI_MAX_FLUSH is less
> +			 * than smaller numbers
> +			 */
> +			if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> +			    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> +				/* do nothing */
> +			} else {
> +				trace_xfs_log_recover_inode_skip(log, in_f);
> +				error = 0;
> +				goto out_release;
> +			}

This just moves the existing code, but the conditional style is really
weird vs a simple:

	if (be16_to_cpu(dip->di_flushiter) != DI_MAX_FLUSH ||
	    ldip->di_flushiter >= (DI_MAX_FLUSH >> 1)) {
		trace_xfs_log_recover_inode_skip(log, in_f);
		error = 0;
		goto out_release;
	}

Nitpicking aside, this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
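For reference, folding that flattened condition into the restructured hunk
would read roughly as follows; this is a sketch of the combined logic, not
the code that was committed:

	if (!xfs_has_v3inodes(mp)) {
		if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
			/*
			 * Skip the replay unless this is the wrap case,
			 * where DI_MAX_FLUSH on disk counts as older than
			 * a small counter in the log.
			 */
			if (be16_to_cpu(dip->di_flushiter) != DI_MAX_FLUSH ||
			    ldip->di_flushiter >= (DI_MAX_FLUSH >> 1)) {
				trace_xfs_log_recover_inode_skip(log, in_f);
				error = 0;
				goto out_release;
			}
		}

		/* Take the opportunity to reset the flush iteration count */
		ldip->di_flushiter = 0;
	}

The inverted test is De Morgan applied to the original wrap check, which
lets the empty "do nothing" branch disappear.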
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-10  2:03   ` Dave Chinner
  2023-11-10  4:32     ` Darrick J. Wong
  2023-11-10  7:34     ` Christoph Hellwig
@ 2023-11-10 13:56     ` Zorro Lang
  2023-11-14 11:17     ` edward6
  3 siblings, 0 replies; 19+ messages in thread
From: Zorro Lang @ 2023-11-10 13:56 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Fri, Nov 10, 2023 at 01:03:49PM +1100, Dave Chinner wrote:
> On Fri, Nov 10, 2023 at 09:36:51AM +0800, Zorro Lang wrote:
> > The g/047 still fails with this 2nd patch. So I did below steps [1],
> > and get the trace output as [2], those dump_inodes() messages you
> > added have been printed, please check.
>
> And that points me at the bug.
>
> dump_inodes: disk ino 0x83: init nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
> dump_inodes: log ino 0x83: init nblocks 0x8 nextents 0x0/0x1 anextents 0x0/0x0 v3_pad 0x1 nrext64_pad 0x0 di_flags2 0x18 big
>                                                      ^^^^^^^
> The initial log inode is correct.
>
> dump_inodes: disk ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18
> dump_inodes: log ino 0x83: pre nblocks 0x8 nextents 0x0/0x0 anextents 0x0/0x0 v3_pad 0x0 nrext64_pad 0x0 di_flags2 0x18 big
>                                                     ^^^^^^^
> .... but on the second sample, it's been modified and the extent
> count has been zeroed? Huh, that is unexpected - what did that?
>
> Oh.
>
> Can you test the patch below and see if it fixes the issue? Keep
> the first verifier patch I sent, then apply the patch below. You can
> drop the debug traceprintk patch - the patch below should fix it.

Great, Dave! It works; the testing below [1] passes on my s390x now.
Maybe I should give a Tested-by, but as the reporter, Reported-by
might be more proper, so...

Reported-by: Zorro Lang <zlang@redhat.com>

Anyway, I'm doing a full round of fstests "auto" group testing on
s390x with this patch, to check whether there are more issues on big
endian machines. If there's any specific concern (testing condition)
you'd like covered, please tell me and I'll enhance the testing of
that part.

Thanks,
Zorro

[1]
# ./check generic/047 generic/039 generic/065
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/s390x ibm-z-507 6.6.0+ #1 SMP Wed Nov 8 12:51:20 EST 2023
MKFS_OPTIONS  -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR

generic/039	 0s
generic/047	 10s ...  10s
generic/065	 1s
Ran: generic/039 generic/047 generic/065
Passed all 3 tests

>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> xfs: recovery should not clear di_flushiter unconditionally
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Because on v3 inodes, di_flushiter doesn't exist. It overlaps with
> zero padding in the inode, except when NREXT64=1 configurations are
> in use and the zero padding is no longer padding but holds the 64
> bit extent counter.
>
> This manifests obviously on big endian platforms (e.g. s390) because
> the log dinode is in host order and the overlap is the LSBs of the
> extent count field. It is not noticed on little endian machines
> because the overlap is at the MSB end of the extent count field and
> we need to get more than 2^48 extents in the inode before it
> manifests. i.e. the heat death of the universe will occur before we
> see the problem on little endian machines.
>
> This is a zero-day issue for NREXT64=1 configurations on big endian
> machines. Fix it by only clearing di_flushiter on v2 inodes during
> recovery.
>
> Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers")
> cc: stable@kernel.org # 5.19+
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++---------------
>  1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index f4c31c2b60d5..dbdab4ce7c44 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -371,24 +371,26 @@ xlog_recover_inode_commit_pass2(
>  	 * superblock flag to determine whether we need to look at di_flushiter
>  	 * to skip replay when the on disk inode is newer than the log one
>  	 */
> -	if (!xfs_has_v3inodes(mp) &&
> -	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> -		/*
> -		 * Deal with the wrap case, DI_MAX_FLUSH is less
> -		 * than smaller numbers
> -		 */
> -		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> -		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> -			/* do nothing */
> -		} else {
> -			trace_xfs_log_recover_inode_skip(log, in_f);
> -			error = 0;
> -			goto out_release;
> +	if (!xfs_has_v3inodes(mp)) {
> +		if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> +			/*
> +			 * Deal with the wrap case, DI_MAX_FLUSH is less
> +			 * than smaller numbers
> +			 */
> +			if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> +			    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> +				/* do nothing */
> +			} else {
> +				trace_xfs_log_recover_inode_skip(log, in_f);
> +				error = 0;
> +				goto out_release;
> +			}
>  		}
> +
> +		/* Take the opportunity to reset the flush iteration count */
> +		ldip->di_flushiter = 0;
>  	}
>
> -	/* Take the opportunity to reset the flush iteration count */
> -	ldip->di_flushiter = 0;
>
>  	if (unlikely(S_ISREG(ldip->di_mode))) {
>  		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-10  2:03   ` Dave Chinner
                       ` (2 preceding siblings ...)
  2023-11-10 13:56     ` Zorro Lang
@ 2023-11-14 11:17     ` edward6
  3 siblings, 0 replies; 19+ messages in thread
From: edward6 @ 2023-11-14 11:17 UTC (permalink / raw)
  To: david
  Cc: carlos, djwong, fstests, linux-xfs, zlang, linux-s390,
	Eduard Shishkin

From: Eduard Shishkin <edward6@linux.ibm.com>

[...]

> Can you test the patch below and see if it fixes the issue? Keep
> the first verifier patch I sent, then apply the patch below. You can
> drop the debug traceprintk patch - the patch below should fix it.

Thanks for fixing it!

Tested-by: Eduard Shishkin <edward6@linux.ibm.com>

[...]

> This manifests obviously on big endian platforms (e.g. s390) because
> the log dinode is in host order and the overlap is the LSBs of the
> extent count field. It is not noticed on little endian machines
> because the overlap is at the MSB end of the extent count field and
> we need to get more than 2^48 extents in the inode before it
> manifests. i.e. the heat death of the universe will occur before we
> see the problem on little endian machines.

This sounds too bold. I can easily imagine mountable images similar
to filesystem metadata dumps that are not generated by scanning real
partitions but created by a special tool, e.g. for debugging purposes.
My point is that on little-endian architectures the manifestation of
such a "sleeping" bug is a much more realistic event than it may seem
at first glance.
* Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
  2023-11-06 20:33 ` Dave Chinner
  2023-11-06 22:20   ` Darrick J. Wong
  2023-11-07  8:05   ` Zorro Lang
@ 2023-11-07  8:29   ` Christoph Hellwig
  2 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2023-11-07  8:29 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Zorro Lang, linux-xfs, fstests, Darrick J. Wong, Carlos Maiolino

On Tue, Nov 07, 2023 at 07:33:50AM +1100, Dave Chinner wrote:
> That's not good. Can you please determine if this is a zero-day bug
> with the nrext64 feature? I think it was merged in 5.19, so if you
> could try to reproduce it on 5.18 and 5.19 kernels first, that
> would be handy.
>
> Also, from your s390 kernel build, can you get the pahole output
> for the struct xfs_dinode both for a good kernel and a bad kernel?

Another interesting angle is that s390 is the only mainstream big
endian platform left. It might be interesting to also try to
reproduce this on a ppc64be or sparc64 system (if someone has one,
or a VM).
Thread overview: 19+ messages
2023-10-29  4:11 [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs] Zorro Lang
2023-11-06  6:13 ` Dave Chinner
2023-11-06 19:26   ` Zorro Lang
2023-11-06 20:33     ` Dave Chinner
2023-11-06 22:20       ` Darrick J. Wong
2023-11-07  8:05       ` Zorro Lang
2023-11-07  8:13         ` Dave Chinner
2023-11-07 15:13           ` Zorro Lang
2023-11-08  6:38             ` Dave Chinner
     [not found]               ` <CAN=2_H+CdEK_rEUmYbmkCjSRqhX2cwi5yRHQcKAmKDPF16vqOw@mail.gmail.com>
2023-11-09  6:14                 ` Dave Chinner
2023-11-09 14:09                   ` Zorro Lang
2023-11-09 23:13                     ` Dave Chinner
2023-11-10  1:36                       ` Zorro Lang
2023-11-10  2:03                         ` Dave Chinner
2023-11-10  4:32                           ` Darrick J. Wong
2023-11-10  7:34                           ` Christoph Hellwig
2023-11-10 13:56                           ` Zorro Lang
2023-11-14 11:17                           ` edward6
2023-11-07  8:29       ` Christoph Hellwig