* [Bug report] More xfs corruption issues found on s390x
@ 2023-10-29 4:33 Zorro Lang
2023-11-06 6:26 ` Dave Chinner
2023-11-08 16:38 ` edward6
0 siblings, 2 replies; 6+ messages in thread
From: Zorro Lang @ 2023-10-29 4:33 UTC (permalink / raw)
To: linux-xfs; +Cc: linux-s390, fstests
Hi xfs folks,
Besides https://lore.kernel.org/linux-xfs/20231029041122.bx2k7wwm7otebjd5@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/T/#u ,
I always hit another failure on s390x too, by running generic/039 or generic/065 [1]:
XFS: Assertion failed: ip->i_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 2359
More details are in the dmesg output [2] and the .full output [3].
And there are other kinds of failures as well, like [3]:
XFS: Assertion failed: error != -ENOENT, file: fs/xfs/xfs_trans.c, line: 1310
All these failures occur on s390x only, along with more xfs corruption failures when
running fstests on s390x. I don't know if it's an s390x issue or an xfs issue on a
big-endian machine (s390x), so I'm cc'ing the s390x list.
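For reference, a run like the one above can be reproduced with fstests roughly as follows. This is a hypothetical sketch, not the reporter's exact script; the fstests path, loop devices and mount point are assumptions reconstructed from the FSTYP/MKFS_OPTIONS lines in [1], and the invocation is guarded so it only runs where an fstests checkout actually exists.

```shell
#!/bin/sh
# Hypothetical fstests invocation approximating the reported setup.
# Path, devices and mount point are assumptions; adjust for your system.
FSTESTS_DIR=/var/lib/xfstests
export SCRATCH_DEV=/dev/loop1
export SCRATCH_MNT=/mnt/fstests/SCRATCH_DIR
export MKFS_OPTIONS="-m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1"

# Guarded: only runs the failing tests where fstests is installed.
if [ -x "$FSTESTS_DIR/check" ]; then
    cd "$FSTESTS_DIR" && ./check generic/039 generic/065
fi
```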
I tested on mainline linux 6.6.0-rc7+, the HEAD commit is:
commit 750b95887e567848ac2c851dae47922cac6db946
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Oct 26 20:42:02 2023 -1000
Merge tag 'drm-fixes-2023-10-27' of git://anongit.freedesktop.org/drm/drm
Thanks,
Zorro
[1]
FSTYP -- xfs (debug)
PLATFORM -- Linux/s390x s390x-kvm-091 6.6.0-rc7+ #1 SMP Fri Oct 27 08:58:03 EDT 2023
MKFS_OPTIONS -- -f -m crc=1,finobt=1,rmapbt=0,reflink=1,bigtime=1,inobtcount=1 /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/fstests/SCRATCH_DIR
generic/065 _check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
(see /var/lib/xfstests/results//generic/065.full for details)
_check_dmesg: something found in dmesg (see /var/lib/xfstests/results//generic/065.dmesg)
- output mismatch (see /var/lib/xfstests/results//generic/065.out.bad)
--- tests/generic/065.out 2023-10-27 09:04:38.185351816 -0400
+++ /var/lib/xfstests/results//generic/065.out.bad 2023-10-27 09:09:12.145421405 -0400
@@ -4,11 +4,11 @@
wrote 65536/65536 bytes at offset 0
XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
File 'foo' content after log replay:
-0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
+0000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0020000
...
(Run 'diff -u /var/lib/xfstests/tests/generic/065.out /var/lib/xfstests/results//generic/065.out.bad' to see the entire diff)
Ran: generic/065
Failures: generic/065
Failed 1 of 1 tests
[2]
[ 440.713107] run fstests generic/065 at 2023-10-27 09:09:11
[ 440.971883] XFS (dm-3): Mounting V5 Filesystem 6d36b06f-6fbd-4a9f-925c-c49582f14078
[ 440.972968] XFS (dm-3): Ending clean mount
[ 441.037265] XFS (dm-3): Unmounting Filesystem 6d36b06f-6fbd-4a9f-925c-c49582f14078
[ 441.045980] XFS (dm-3): Mounting V5 Filesystem 6d36b06f-6fbd-4a9f-925c-c49582f14078
[ 441.046947] XFS (dm-3): Starting recovery (logdev: internal)
[ 441.047283] XFS (dm-3): Ending recovery (logdev: internal)
[ 441.055894] XFS: Assertion failed: ip->i_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 2359
[ 441.055920] ------------[ cut here ]------------
[ 441.055921] WARNING: CPU: 3 PID: 212189 at fs/xfs/xfs_message.c:104 assfail+0x4e/0x68 [xfs]
[ 441.056180] Modules linked in: dm_flakey tls loop lcs ctcm fsm qeth ccwgroup zfcp qdio scsi_transport_fc dasd_fba_mod dasd_eckd_mod dasd_mod rfkill sunrpc vfio_ccw mdev zcrypt_cex4 vfio_iommu_type1 vfio drm fuse i2c_core drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 virtio_net des_s390 sha3_512_s390 net_failover failover sha3_256_s390 virtio_blk dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt
[ 441.056210] CPU: 3 PID: 212189 Comm: kworker/3:5 Kdump: loaded Tainted: G W 6.6.0-rc7+ #1
[ 441.056213] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[ 441.056215] Workqueue: xfs-inodegc/dm-3 xfs_inodegc_worker [xfs]
[ 441.056312] Krnl PSW : 0704c00180000000 000003ff7ffb36d2 (assfail+0x52/0x68 [xfs])
[ 441.056410] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 441.056413] Krnl GPRS: c000000100000021 000003ff8001d9a0 ffffffffffffffea 000000000000000a
[ 441.056415] 00000380048aba90 0000000000000000 000003ff801272ba 0000000088f81488
[ 441.056417] 00000000d6301708 000000008cb5d000 00000001c459c800 00000001c459c800
[ 441.056419] 0000000081de3600 0000000000000000 000003ff7ffb36c0 00000380048abb38
[ 441.056426] Krnl Code: 000003ff7ffb36c6: 95001010 cli 16(%r1),0
000003ff7ffb36ca: a774000a brc 7,000003ff7ffb36de
#000003ff7ffb36ce: af000000 mc 0,0
>000003ff7ffb36d2: eb6ff0a80004 lmg %r6,%r15,168(%r15)
000003ff7ffb36d8: 07fe bcr 15,%r14
000003ff7ffb36da: 47000700 bc 0,1792
000003ff7ffb36de: af000000 mc 0,0
000003ff7ffb36e2: 0707 bcr 0,%r7
[ 441.056439] Call Trace:
[ 441.056440] [<000003ff7ffb36d2>] assfail+0x52/0x68 [xfs]
[ 441.056537] ([<000003ff7ffb36c0>] assfail+0x40/0x68 [xfs])
[ 441.056635] [<000003ff7ffaf3ca>] xfs_ifree+0x42a/0x478 [xfs]
[ 441.056737] [<000003ff7ffaf4c8>] xfs_inactive_ifree+0xb0/0x210 [xfs]
[ 441.056815] [<000003ff7ffaf7aa>] xfs_inactive+0x182/0x2e0 [xfs]
[ 441.056899] [<000003ff7ffa1f0c>] xfs_inodegc_worker+0xf4/0x1c8 [xfs]
[ 441.056983] [<000000003242f8f0>] process_one_work+0x1b8/0x408
[ 441.056989] [<000000003242fe40>] worker_thread+0x300/0x460
[ 441.056992] [<000000003243a758>] kthread+0x120/0x128
[ 441.056995] [<00000000323b6fcc>] __ret_from_fork+0x3c/0x58
[ 441.056997] [<0000000032da7072>] ret_from_fork+0xa/0x30
[ 441.057003] Last Breaking-Event-Address:
[ 441.057004] [<000003ff7ffb351c>] xfs_printk_level+0xac/0xd8 [xfs]
[ 441.057095] ---[ end trace 0000000000000000 ]---
[ 441.060319] XFS (dm-3): Unmounting Filesystem 6d36b06f-6fbd-4a9f-925c-c49582f14078
[ 441.109229] XFS (loop0): Unmounting Filesystem 1eef4fa0-3548-491e-a877-16ca42a411b6
[3]
meta-data=/dev/loop1 isize=512 agcount=4, agsize=655360 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
data = bsize=4096 blocks=2621440, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=16384, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
_check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan (but don't clear) agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
bad nblocks 16 for inode 133, would reset to 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 2
- agno = 1
bad nblocks 16 for inode 133, would reset to 0
- agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output
*** mount output ***
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=4096k,nr_inodes=986186,mode=755,inode64)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,size=1592372k,nr_inodes=819200,mode=755,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
/dev/mapper/rhel_s390x--kvm--091-root on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,nosuid,noexec,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=29,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=3937)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime,seclabel)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime,seclabel)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime,seclabel)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=1M)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
/dev/vda1 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
/dev/mapper/rhel_s390x--kvm--091-home on /home type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
ramfs on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=796184k,nr_inodes=199046,mode=700,inode64)
*** end mount output
[3]
[ 3078.795314] run fstests generic/506 at 2023-10-27 09:53:09
[ 3079.005423] XFS (loop1): Mounting V5 Filesystem c94f4c60-e4fd-4fec-b130-e245b3880061
[ 3079.006596] XFS (loop1): Ending clean mount
[ 3079.007556] XFS (loop1): User initiated shutdown received.
[ 3079.007565] XFS (loop1): Metadata I/O Error (0x4) detected at xfs_fs_goingdown+0x5a/0xf8 [xfs] (fs/xfs/xfs_fsops.c:492). Shutting down filesystem.
[ 3079.007769] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
[ 3079.008944] XFS (loop1): Unmounting Filesystem c94f4c60-e4fd-4fec-b130-e245b3880061
[ 3079.031168] XFS (loop1): Mounting V5 Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.032478] XFS (loop1): Ending clean mount
[ 3079.032515] XFS (loop1): Quotacheck needed: Please wait.
[ 3079.036144] XFS (loop1): Quotacheck: Done.
[ 3079.040399] XFS (loop1): Unmounting Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.044961] XFS (loop1): Mounting V5 Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.045808] XFS (loop1): Ending clean mount
[ 3079.055991] XFS (loop1): User initiated shutdown received.
[ 3079.055994] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem.
[ 3079.056076] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
[ 3079.099382] XFS (loop1): Unmounting Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.102254] XFS (loop1): Mounting V5 Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.103055] XFS (loop1): Starting recovery (logdev: internal)
[ 3079.103217] XFS (loop1): Ending recovery (logdev: internal)
[ 3079.106775] XFS (loop1): Unmounting Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.113727] XFS (loop1): Mounting V5 Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.114471] XFS (loop1): Ending clean mount
[ 3079.115334] XFS: Assertion failed: error != -ENOENT, file: fs/xfs/xfs_trans.c, line: 1310
[ 3079.115372] ------------[ cut here ]------------
[ 3079.115374] WARNING: CPU: 1 PID: 2035839 at fs/xfs/xfs_message.c:104 assfail+0x4e/0x68 [xfs]
[ 3079.115495] Modules linked in: dm_log_writes dm_thin_pool dm_persistent_data dm_bio_prison sd_mod t10_pi sg dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey tls loop lcs ctcm fsm qeth ccwgroup zfcp qdio scsi_transport_fc dasd_fba_mod dasd_eckd_mod dasd_mod rfkill sunrpc vfio_ccw mdev zcrypt_cex4 vfio_iommu_type1 vfio drm fuse i2c_core drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 virtio_net des_s390 sha3_512_s390 net_failover failover sha3_256_s390 virtio_blk dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: scsi_debug]
[ 3079.115532] CPU: 1 PID: 2035839 Comm: touch Kdump: loaded Tainted: G W 6.6.0-rc7+ #1
[ 3079.115534] Hardware name: IBM 3931 LA1 400 (KVM/Linux)
[ 3079.115536] Krnl PSW : 0704c00180000000 000003ff7ffb36d2 (assfail+0x52/0x68 [xfs])
[ 3079.115625] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 3079.115627] Krnl GPRS: c000000100000021 000003ff8001d9a0 ffffffffffffffea 000000000000000a
[ 3079.115629] 0000038006d0f928 0000000000000000 000003ff80129d40 000000008f254398
[ 3079.115630] 0000038006d0fb20 000000008f254000 00000380fffffffe 00000000a12a8400
[ 3079.115632] 000003ff8762ef68 0000000000000000 000003ff7ffb36c0 0000038006d0f9d0
[ 3079.115638] Krnl Code: 000003ff7ffb36c6: 95001010 cli 16(%r1),0
000003ff7ffb36ca: a774000a brc 7,000003ff7ffb36de
#000003ff7ffb36ce: af000000 mc 0,0
>000003ff7ffb36d2: eb6ff0a80004 lmg %r6,%r15,168(%r15)
000003ff7ffb36d8: 07fe bcr 15,%r14
000003ff7ffb36da: 47000700 bc 0,1792
000003ff7ffb36de: af000000 mc 0,0
000003ff7ffb36e2: 0707 bcr 0,%r7
[ 3079.115648] Call Trace:
[ 3079.115649] [<000003ff7ffb36d2>] assfail+0x52/0x68 [xfs]
[ 3079.115727] ([<000003ff7ffb36c0>] assfail+0x40/0x68 [xfs])
[ 3079.115803] [<000003ff7ffc0c34>] xfs_trans_alloc_ichange+0x274/0x318 [xfs]
[ 3079.115880] [<000003ff7ffa9058>] xfs_setattr_nonsize+0xa0/0x478 [xfs]
[ 3079.115966] [<00000000327530ce>] notify_change+0x38e/0x530
[ 3079.115972] [<0000000032776fee>] vfs_utimes+0x12e/0x268
[ 3079.115976] [<000000003277740e>] do_utimes+0x6e/0xb0
[ 3079.115980] [<00000000327779de>] __s390x_sys_utimensat+0x86/0xc0
[ 3079.115983] [<0000000032d97060>] __do_syscall+0x1d0/0x1f8
[ 3079.115988] [<0000000032da7040>] system_call+0x70/0x98
[ 3079.115992] Last Breaking-Event-Address:
[ 3079.115993] [<000003ff7ffb351c>] xfs_printk_level+0xac/0xd8 [xfs]
[ 3079.116070] ---[ end trace 0000000000000000 ]---
[ 3079.127976] XFS (loop1): User initiated shutdown received.
[ 3079.127983] XFS (loop1): Log I/O Error (0x6) detected at xfs_fs_goingdown+0xb4/0xf8 [xfs] (fs/xfs/xfs_fsops.c:495). Shutting down filesystem.
[ 3079.128127] XFS (loop1): Please unmount the filesystem and rectify the problem(s)
[ 3079.179362] XFS (loop1): Unmounting Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.182441] XFS (loop1): Mounting V5 Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.183327] XFS (loop1): Starting recovery (logdev: internal)
[ 3079.183464] XFS (loop1): Ending recovery (logdev: internal)
[ 3079.186928] XFS (loop1): Unmounting Filesystem ba7c075d-3018-49c9-a17d-0689c02892e2
[ 3079.198937] XFS (loop0): Unmounting Filesystem 1eef4fa0-3548-491e-a877-16ca42a411b6
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug report] More xfs corruption issues found on s390x
2023-10-29 4:33 [Bug report] More xfs corruption issues found on s390x Zorro Lang
@ 2023-11-06 6:26 ` Dave Chinner
2023-12-09 0:57 ` Luis Chamberlain
2023-11-08 16:38 ` edward6
1 sibling, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2023-11-06 6:26 UTC (permalink / raw)
To: Zorro Lang; +Cc: linux-xfs, linux-s390, fstests
On Sun, Oct 29, 2023 at 12:33:33PM +0800, Zorro Lang wrote:
> Hi xfs folks,
>
> Besides https://lore.kernel.org/linux-xfs/20231029041122.bx2k7wwm7otebjd5@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/T/#u ,
>
> I always hit another failure on s390x too, by running generic/039 or generic/065 [1]:
g/039 is a sync + fsync + flakey remount+recovery test.
g/065 is a similar but more complex test.
So that's 3 fsync + shutdown/recovery tests on s390 that are now
failing with ip->i_nblocks not matching the extent tree state after
fsync-shutdown-recover.
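The fsync + flakey remount + recovery pattern Dave describes can be sketched like this. This is a hypothetical outline, not the actual fstests code (the real tests use the _init_flakey/_load_flakey helpers and more complex workloads); the device, mount point and dm-flakey tables here are assumptions, and the whole thing is wrapped in a function since it needs root and a disposable scratch device.

```shell
#!/bin/sh
# Rough outline of the pattern exercised by generic/039 and generic/065.
flakey_repro() {
    dev=/dev/loop1; mnt=/mnt/scratch       # assumptions
    size=$(blockdev --getsz "$dev")        # size in 512-byte sectors

    mkfs.xfs -f "$dev"
    # "Up" table: all I/O passes through to the backing device.
    dmsetup create flakey --table "0 $size flakey $dev 0 180 0"
    mount /dev/mapper/flakey "$mnt"

    # fsync some data so it must survive a simulated crash.
    xfs_io -f -c "pwrite -S 0xaa 0 64k" -c fsync "$mnt/foo"

    # Switch to a table that silently drops writes: simulates a crash
    # with the fsync'ed data already on stable storage.
    dmsetup suspend flakey
    dmsetup load flakey --table "0 $size flakey $dev 0 0 180 1 drop_writes"
    dmsetup resume flakey
    umount "$mnt"

    # Restore the "up" table and remount; XFS replays its log here.
    dmsetup suspend flakey
    dmsetup load flakey --table "0 $size flakey $dev 0 180 0"
    dmsetup resume flakey
    mount /dev/mapper/flakey "$mnt"

    # After log replay the fsync'ed 0xaa data must be intact.
    od -A x -t x1 "$mnt/foo" | head -n 2
}
# Invoke manually (as root) on a disposable device: flakey_repro
```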
>
> XFS: Assertion failed: ip->i_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 2359
>
> More details are in the dmesg output [2] and the .full output [3].
>
> And there are other kinds of failures as well, like [3]:
> XFS: Assertion failed: error != -ENOENT, file: fs/xfs/xfs_trans.c, line: 1310
And that one is from g/506 after a quotacheck is completed and then
the fs is shutdown, resulting in dquots not being found after the
journal has been recovered and unlinked list recovery is being run.
i.e. unlinked list processing is tripping over missing file data
extents after recovery.
> All these failures occur on s390x only, along with more xfs corruption failures
> when running fstests on s390x. I don't know if it's an s390x issue or an xfs
> issue on a big-endian machine (s390x), so I'm cc'ing the s390x list.
IOWs, all four of these issues are the same problem - journal
recovery is not resulting in a correct and consistent filesystem
after the journal has been flushed at runtime, so please discuss and
consolidate them all in the initial bug report thread....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [Bug report] More xfs corruption issues found on s390x
2023-10-29 4:33 [Bug report] More xfs corruption issues found on s390x Zorro Lang
2023-11-06 6:26 ` Dave Chinner
@ 2023-11-08 16:38 ` edward6
2023-11-09 1:50 ` Zorro Lang
1 sibling, 1 reply; 6+ messages in thread
From: edward6 @ 2023-11-08 16:38 UTC (permalink / raw)
To: zlang; +Cc: fstests, linux-s390, linux-xfs, edward6
From: Eduard Shishkin <edward6@linux.ibm.com>
On Sun, Oct 29, 2023 at 12:33:33PM +0800, Zorro Lang wrote:
[...]
> All these failures occur on s390x only, along with more xfs corruption
> failures when running fstests on s390x. I don't know if it's an s390x
> issue or an xfs issue on a big-endian machine (s390x), so I'm cc'ing
> the s390x list.
Dear all,
This is a real nuisance. We are looking into it from our (s390x) side
in turn. Unfortunately, I don't see any approach other than code review
for now. If you have any ideas on how to speed up the process, please
let us know. Right now, I think we can help with additional hardware
resources.
Thanks,
Eduard.
* Re: [Bug report] More xfs corruption issues found on s390x
2023-11-08 16:38 ` edward6
@ 2023-11-09 1:50 ` Zorro Lang
0 siblings, 0 replies; 6+ messages in thread
From: Zorro Lang @ 2023-11-09 1:50 UTC (permalink / raw)
To: edward6; +Cc: fstests, linux-s390, linux-xfs
On Wed, Nov 08, 2023 at 05:38:12PM +0100, edward6@linux.ibm.com wrote:
> From: Eduard Shishkin <edward6@linux.ibm.com>
>
> On Sun, Oct 29, 2023 at 12:33:33PM +0800, Zorro Lang wrote:
>
> [...]
>
> > All these failures occur on s390x only, along with more xfs corruption
> > failures when running fstests on s390x. I don't know if it's an s390x
> > issue or an xfs issue on a big-endian machine (s390x), so I'm cc'ing
> > the s390x list.
>
> Dear all,
>
> This is a real nuisance. We are looking into it from our (s390x) side
> in turn. Unfortunately, I don't see any approach other than code review
> for now. If you have any ideas on how to speed up the process, please
> let us know. Right now, I think we can help with additional hardware
> resources.
Thanks for the reply from the s390x side. We had more discussion in another
mail thread:
https://lore.kernel.org/linux-xfs/20231029041122.bx2k7wwm7otebjd5@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/T/#m0888b2314d0dc866bdd7e1bd8927e599d57319bd
This is an issue with an xfs feature, but it is only reproducible on s390x. An
expert on the xfs side is looking into it, and I'll provide more information
for debugging. I'll cc the s390 list later, if you'd like to follow it :)
Thanks,
Zorro
>
> Thanks,
> Eduard.
>
* Re: [Bug report] More xfs corruption issues found on s390x
2023-11-06 6:26 ` Dave Chinner
@ 2023-12-09 0:57 ` Luis Chamberlain
2023-12-09 4:53 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Luis Chamberlain @ 2023-12-09 0:57 UTC (permalink / raw)
To: Dave Chinner
Cc: Zorro Lang, linux-xfs, linux-s390, fstests, Pankaj Raghav,
Daniel Gomez
On Mon, Nov 06, 2023 at 05:26:14PM +1100, Dave Chinner wrote:
> > XFS: Assertion failed: ip->i_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 2359
>
> IOWs, all four of these issues are the same problem - journal
> recovery is not resulting in a correct and consistent filesystem
> after the journal has been flushed at runtime, so please discuss and
> consolidate them all in the initial bug report thread....
As recently reported, fortunately we now have reproducers for x86_64 too:
https://bugzilla.kernel.org/show_bug.cgi?id=218224
This fails on the following test sections as defined by kdevops [1]:
* xfs_nocrc_2k
* xfs_reflink
* xfs_reflink_1024
* xfs_reflink_2k
* xfs_reflink_4k
* xfs_reflink_dir_bsize_8k
* xfs_reflink_logdev
* xfs_reflink_normapbt
* xfs_reflink_nrext64
[1] https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config
Example failures:
* generic/475: https://gist.github.com/mcgrof/5d6f504f4695ba27cea7df5d63f35197
* generic/388: https://gist.github.com/mcgrof/c1c8b1dc76fdc1032a5f0aab6c2a14bf
* generic/648: https://gist.github.com/mcgrof/1e506ecbe898b45428d6e7febfc02db1
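For anyone wanting to replay one of the kdevops sections above by hand, an fstests local.config sketch might look like this. The section contents below are placeholders, not the exact kdevops values; the authoritative definitions live in the xfs.config linked at [1].

```shell
#!/bin/sh
# Hypothetical fstests local.config fragment with two sections;
# devices and MKFS_OPTIONS are assumptions, not the kdevops values.
cat > local.config <<'EOF'
[xfs_reflink_1024]
FSTYP=xfs
TEST_DEV=/dev/loop0
TEST_DIR=/mnt/test
SCRATCH_DEV=/dev/loop1
SCRATCH_MNT=/mnt/scratch
MKFS_OPTIONS="-f -m reflink=1 -b size=1024"

[xfs_reflink_4k]
FSTYP=xfs
TEST_DEV=/dev/loop0
TEST_DIR=/mnt/test
SCRATCH_DEV=/dev/loop1
SCRATCH_MNT=/mnt/scratch
MKFS_OPTIONS="-f -m reflink=1 -b size=4096"
EOF

# fstests then runs one section at a time, e.g.:
#   ./check -s xfs_reflink_1024 generic/475
grep -c '^\[' local.config   # prints the number of sections: 2
```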
Luis
* Re: [Bug report] More xfs corruption issues found on s390x
2023-12-09 0:57 ` Luis Chamberlain
@ 2023-12-09 4:53 ` Dave Chinner
0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2023-12-09 4:53 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Zorro Lang, linux-xfs, linux-s390, fstests, Pankaj Raghav,
Daniel Gomez
On Fri, Dec 08, 2023 at 04:57:37PM -0800, Luis Chamberlain wrote:
> On Mon, Nov 06, 2023 at 05:26:14PM +1100, Dave Chinner wrote:
> > > XFS: Assertion failed: ip->i_nblocks == 0, file: fs/xfs/xfs_inode.c, line: 2359
> >
> > IOWs, all four of these issues are the same problem - journal
> > recovery is not resulting in a correct and consistent filesystem
> > after the journal has been flushed at runtime, so please discuss and
> > consolidate them all in the initial bug report thread....
>
> As recently reported, fortunately we now have reproducers for x86_64 too:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=218224
>
> This fails on the following test sections as defined by kdevops [1]:
>
> * xfs_nocrc_2k
> * xfs_reflink
> * xfs_reflink_1024
> * xfs_reflink_2k
> * xfs_reflink_4k
> * xfs_reflink_dir_bsize_8k
> * xfs_reflink_logdev
> * xfs_reflink_normapbt
> * xfs_reflink_nrext64
>
> [1] https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config
>
> Example failures:
>
> * generic/475: https://gist.github.com/mcgrof/5d6f504f4695ba27cea7df5d63f35197
> * generic/388: https://gist.github.com/mcgrof/c1c8b1dc76fdc1032a5f0aab6c2a14bf
> * generic/648: https://gist.github.com/mcgrof/1e506ecbe898b45428d6e7febfc02db1
If this is the same problem, have you tested whether the fix for the
s390 issue makes all the problems you are seeing on x86-64 go away?
i.e. commit 7930d9e10370 ("xfs: recovery should not clear di_flushiter
unconditionally")
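A quick way to answer Dave's question for a given kernel tree is to test ancestry of the referenced commit id. This is a generic sketch; the tree path is an assumption passed in by the caller.

```shell
#!/bin/sh
# Check whether a kernel tree already contains the s390 fix
# (commit id copied from the message above).
has_fix() {
    tree=${1:-"$HOME/linux"}   # assumption: a local kernel clone
    if git -C "$tree" merge-base --is-ancestor 7930d9e10370 HEAD 2>/dev/null; then
        echo "fix present"
    else
        echo "fix missing (or tree/commit not found)"
    fi
}
# Example: has_fix ~/src/linux
```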
-Dave.
--
Dave Chinner
david@fromorbit.com
end of thread, other threads: [~2023-12-09 4:53 UTC | newest]
Thread overview: 6+ messages
2023-10-29 4:33 [Bug report] More xfs corruption issues found on s390x Zorro Lang
2023-11-06 6:26 ` Dave Chinner
2023-12-09 0:57 ` Luis Chamberlain
2023-12-09 4:53 ` Dave Chinner
2023-11-08 16:38 ` edward6
2023-11-09 1:50 ` Zorro Lang