[fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails

From: Zorro Lang @ 2023-09-03 12:00 UTC
  To: linux-ext4; +Cc: fstests

Hi ext4 folks,

Recently I found that lots of fstests cases which belong to "recoveryloop"
(e.g. g/388 [1], g/455 [2], g/475 [3] and g/482 [4]) or which do fs
shutdown/resize tests (e.g. ext4/059 [5], g/530 [6]) fail on ext4 with 1k
blocksize. The kernel is linux v6.6-rc0+ (HEAD=b84acc11b1c9).

I tested with MKFS_OPTIONS="-b 1024" and no specific MOUNT_OPTIONS. I hit these
failures several times, and I didn't hit them in my last regression test on
v6.5-rc7+, so I think this might be a regression. I didn't hit these failures
on xfs either. If this is a known issue that will be fixed soon, feel free to
tell me.
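
For anyone trying to reproduce this, here is a minimal sketch that builds the
fstests command line for the tests named above. It assumes TEST_DEV,
SCRATCH_DEV and the mount points are already set in local.config, and that
MKFS_OPTIONS is picked up from the environment. The helper names (expand,
check_command) are made up for illustration and are not part of fstests.

```python
import shlex

# Tests named in the report; "g/" is the report's shorthand for "generic/".
REPORTED = ["g/388", "g/455", "g/475", "g/482", "ext4/059", "g/530"]


def expand(name: str) -> str:
    """Turn the shorthand 'g/NNN' into the full fstests name 'generic/NNN'."""
    return "generic/" + name[2:] if name.startswith("g/") else name


def check_command(tests, mkfs_options="-b 1024"):
    """Build a shell line that runs ./check with the reported MKFS_OPTIONS.

    Assumption: devices and mount points come from local.config, so only
    MKFS_OPTIONS is passed on the command line here.
    """
    prefix = "MKFS_OPTIONS=" + shlex.quote(mkfs_options)
    return " ".join([prefix, "./check"] + [expand(t) for t in tests])


print(check_command(REPORTED))
```

Running a single test in a loop (for example only generic/475) may be a
quicker way to see the failure, since it was hit several times but not on
every run.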

There's no .dmesg file, but I got part of the related dmesg output as [7].
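
Since the dmesg excerpt in [7] is long, a small script can help with triage.
The sketch below counts "EXT4-fs error" lines per reporting function and
decodes the negative error codes that show up in the log. The errno table is
hard-coded with the x86_64 Linux values (an assumption about the platform,
though it matches the PLATFORM line in the test output), and the function name
summarize is only illustrative.

```python
import re
from collections import Counter

# x86_64 Linux errno values seen in the excerpt. EUCLEAN is what mount(8)
# prints as "Structure needs cleaning".
ERRNO_NAMES = {5: "EIO", 30: "EROFS", 117: "EUCLEAN"}

# Matches both "EXT4-fs error (device X): func:line:" and
# "EXT4-fs error (device X) in func:line:".
ERROR_RE = re.compile(r"EXT4-fs error \(device [^)]+\)(?: in|:) (\w+):\d+")
# Matches "(-117)" and "err -5" style error codes.
ERRNO_RE = re.compile(r"\((-\d+)\)|err (-\d+)")


def summarize(lines):
    """Count EXT4-fs errors per function and collect decoded errno names."""
    by_func = Counter()
    errnos = set()
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            by_func[m.group(1)] += 1
        for a, b in ERRNO_RE.findall(line):
            code = -int(a or b)
            errnos.add(ERRNO_NAMES.get(code, str(code)))
    return by_func, errnos


# A few lines copied from [7] as sample input.
sample = [
    "[ 3884.695217] EXT4-fs (sda4): failed to initialize system zone (-117)",
    "[35750.774984] EXT4-fs error (device dm-0) in ext4_iomap_alloc:3320: IO failure",
    "[35750.776103] EXT4-fs error (device dm-0): ext4_journal_check_start:84: "
    "comm fsstress: Detected aborted journal",
    "[35772.088222] EXT4-fs (dm-0): ext4_do_writepages: jbd2_start: "
    "9223372036854775807 pages, ino 278631; err -30",
]
by_func, errnos = summarize(sample)
print(dict(by_func), sorted(errnos))
```

With this decoding, the -117 from "failed to initialize system zone" lines up
with the "Structure needs cleaning" mount error in [5].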

Thanks,
Zorro

[1]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

generic/388       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//generic/388.out.bad)
    --- tests/generic/388.out	2023-09-01 18:42:54.987584713 -0400
    +++ /var/lib/xfstests/results//generic/388.out.bad	2023-09-02 03:01:15.475746583 -0400
    @@ -1,2 +1,5 @@
     QA output created by 388
     Silence is golden.
    +mount: /mnt/xfstests/scratch: mount(2) system call failed: Structure needs cleaning.
    +cycle mount failed
    +(see /var/lib/xfstests/results//generic/388.full for details)
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/388.out /var/lib/xfstests/results//generic/388.out.bad'  to see the entire diff)
Ran: generic/388
Failures: generic/388
Failed 1 of 1 tests

[2]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

generic/455       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//generic/455.out.bad)
    --- tests/generic/455.out	2023-09-01 18:42:58.775558885 -0400
    +++ /var/lib/xfstests/results//generic/455.out.bad	2023-09-02 04:05:54.867731920 -0400
    @@ -1,2 +1,4 @@
     QA output created by 455
    -Silence is golden
    +md5sum: /mnt/xfstests/scratch/testfile1: Structure needs cleaning
    +testfile1.mark8 md5sum mismatched
    +(see /var/lib/xfstests/results//generic/455.full for details)
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/455.out /var/lib/xfstests/results//generic/455.out.bad'  to see the entire diff)
Ran: generic/455
Failures: generic/455
Failed 1 of 1 tests

[3]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

generic/475       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//generic/475.out.bad)
    --- tests/generic/475.out	2023-09-01 18:42:59.847941016 -0400
    +++ /var/lib/xfstests/results//generic/475.out.bad	2023-09-02 04:19:05.105706247 -0400
    @@ -1,2 +1,6 @@
     QA output created by 475
     Silence is golden.
    +mount: /mnt/xfstests/scratch: cannot mount; probably corrupted filesystem on /dev/mapper/error-test.
    +mount failed
    +(see /var/lib/xfstests/results//generic/475.full for details)
    +umount: /mnt/xfstests/scratch: not mounted.
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/475.out /var/lib/xfstests/results//generic/475.out.bad'  to see the entire diff)
Ran: generic/475
Failures: generic/475
Failed 1 of 1 tests

[4]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

generic/482       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//generic/482.out.bad)
    --- tests/generic/482.out	2023-09-01 18:43:00.246338844 -0400
    +++ /var/lib/xfstests/results//generic/482.out.bad	2023-09-02 05:23:00.179371438 -0400
    @@ -1,2 +1,3 @@
     QA output created by 482
    -Silence is golden
    +_check_generic_filesystem: filesystem on /dev/mapper/thin-vol is inconsistent
    +(see /var/lib/xfstests/results//generic/482.full for details)
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/482.out /var/lib/xfstests/results//generic/482.out.bad'  to see the entire diff)
Ran: generic/482
Failures: generic/482
Failed 1 of 1 tests

[5]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

ext4/059       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//ext4/059.out.bad)
    --- tests/ext4/059.out	2023-09-01 18:41:56.462732307 -0400
    +++ /var/lib/xfstests/results//ext4/059.out.bad	2023-09-01 19:27:23.435239204 -0400
    @@ -1,2 +1,5 @@
     QA output created by 059
     Reserved GDT blocks:      100
    +mount: /mnt/xfstests/scratch: mount(2) system call failed: Structure needs cleaning.
    +mount -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch failed
    +(see /var/lib/xfstests/results//ext4/059.full for details)
    ...
    (Run 'diff -u /var/lib/xfstests/tests/ext4/059.out /var/lib/xfstests/results//ext4/059.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      b55c3cd102a6 ext4: add reserved GDT blocks check

Ran: ext4/059
Failures: ext4/059
Failed 1 of 1 tests

[6]
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 hp-dl385pg8-09 6.5.0+ #1 SMP PREEMPT_DYNAMIC Fri Sep  1 17:48:42 EDT 2023
MKFS_OPTIONS  -- -F -b 1024 /dev/sda4
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

generic/530       [failed, exit status 1]- output mismatch (see /var/lib/xfstests/results//generic/530.out.bad)
    --- tests/generic/530.out	2023-09-01 18:43:02.968101577 -0400
    +++ /var/lib/xfstests/results//generic/530.out.bad	2023-09-02 05:40:51.047479015 -0400
    @@ -1,2 +1,5 @@
     QA output created by 530
     silence is golden
    +mount: /mnt/xfstests/scratch: mount(2) system call failed: Structure needs cleaning.
    +mount -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch failed
    +(see /var/lib/xfstests/results//generic/530.full for details)
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/530.out /var/lib/xfstests/results//generic/530.out.bad'  to see the entire diff)
Ran: generic/530
Failures: generic/530
Failed 1 of 1 tests

[7]
# dmesg
[ 3881.516118] run fstests ext4/059 at 2023-09-01 19:27:19 
[ 3884.695217] EXT4-fs (sda4): failed to initialize system zone (-117) 
[ 3884.696103] EXT4-fs (sda4): mount failed 
[ 3884.888820] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[ 3885.757259] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none. 
[ 3907.019968] EXT4-fs (sda4): mounted filesystem 0336869b-572b-4e59-b73e-ac93af30fe2b r/w with ordered data mode. Quota mode: none. 
[ 3907.048644] EXT4-fs (sda4): unmounting filesystem 0336869b-572b-4e59-b73e-ac93af30fe2b. 
[ 3907.221590] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[ 3907.864579] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none.
...
...
[35741.177294] run fstests generic/475 at 2023-09-02 04:18:20 
[35748.598756] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35750.773819] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 880, block_bitmap = 7208961 
[35750.774700] Buffer I/O error on dev dm-0, logical block 36, lost async page write 
[35750.774812] Aborting journal on device dm-0-8. 
[35750.774984] EXT4-fs error (device dm-0) in ext4_iomap_alloc:3320: IO failure 
[35750.775304] Buffer I/O error on dev dm-0, logical block 7741441, lost sync page write 
[35750.775411] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35750.776103] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35750.776387] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 880, block_bitmap = 7208961 
[35750.777034] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35750.777075] EXT4-fs (dm-0): I/O error while writing superblock 
[35750.777241] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35750.777256] EXT4-fs (dm-0): I/O error while writing superblock 
[35750.777728] Buffer I/O error on dev dm-0, logical block 106, lost async page write 
[35750.777770] Buffer I/O error on dev dm-0, logical block 386, lost async page write 
[35750.777813] Buffer I/O error on dev dm-0, logical block 4456450, lost async page write 
[35750.777856] Buffer I/O error on dev dm-0, logical block 4456465, lost async page write 
[35750.777936] Buffer I/O error on dev dm-0, logical block 4456481, lost async page write 
[35750.777966] Buffer I/O error on dev dm-0, logical block 4456482, lost async page write 
[35750.778006] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35750.780652] EXT4-fs (dm-0): I/O error while writing superblock 
[35750.781201] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35750.781507] EXT4-fs (dm-0): Remounting filesystem read-only 
[35750.793993] EXT4-fs (dm-0): I/O error while writing superblock 
[35750.794092] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35750.795498] EXT4-fs (dm-0): I/O error while writing superblock 
[35751.812380] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35752.695235] EXT4-fs (dm-0): 1 truncate cleaned up 
[35752.696080] EXT4-fs (dm-0): recovery complete 
[35752.735519] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35752.867597] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278557 starting block 4472833) 
[35752.867605] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35752.868078] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35752.868296] buffer_io_error: 640 callbacks suppressed 
[35752.868301] Buffer I/O error on device dm-0, logical block 4472833 
[35752.868634] Aborting journal on device dm-0-8. 
[35752.870097] Buffer I/O error on device dm-0, logical block 4472834 
[35752.870105] Buffer I/O error on device dm-0, logical block 4472835 
[35752.870935] Buffer I/O error on device dm-0, logical block 4472836 
[35752.871229] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35752.871783] Buffer I/O error on device dm-0, logical block 4472837 
[35752.875031] Buffer I/O error on device dm-0, logical block 4472838 
[35752.876023] Buffer I/O error on device dm-0, logical block 4472839 
[35752.876851] Buffer I/O error on device dm-0, logical block 4472840 
[35752.877731] Buffer I/O error on device dm-0, logical block 4472841 
[35752.878442] Buffer I/O error on device dm-0, logical block 4472842 
[35752.879453] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245782 starting block 3940356) 
[35752.880460] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245781 starting block 67688) 
[35752.881433] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245785 starting block 71774) 
[35752.882247] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245785 starting block 71834) 
[35752.882991] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245785 starting block 72821) 
[35752.882999] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35752.884013] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245784 starting block 75471) 
[35752.884651] EXT4-fs (dm-0): I/O error while writing superblock 
[35752.885399] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35752.885839] EXT4-fs (dm-0): Remounting filesystem read-only 
[35753.601844] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35754.100398] EXT4-fs (dm-0): recovery complete 
[35754.101096] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35756.265146] Aborting journal on device dm-0-8. 
[35756.265859] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35756.265878] buffer_io_error: 24 callbacks suppressed 
[35756.265884] Buffer I/O error on dev dm-0, logical block 7741441, lost sync page write 
[35756.266024] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35756.267187] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35756.268590] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35756.272124] EXT4-fs (dm-0): I/O error while writing superblock 
[35756.272185] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35756.272913] EXT4-fs (dm-0): Remounting filesystem read-only 
[35756.273425] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35756.274712] EXT4-fs (dm-0): I/O error while writing superblock 
[35756.332216] Buffer I/O error on dev dm-0, logical block 15728576, async page read 
[35756.333139] Buffer I/O error on dev dm-0, logical block 15728577, async page read 
[35756.334151] Buffer I/O error on dev dm-0, logical block 15728578, async page read 
[35756.335046] Buffer I/O error on dev dm-0, logical block 15728579, async page read 
[35757.122800] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35757.947682] EXT4-fs (dm-0): recovery complete 
[35757.957726] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35760.241303] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278589 starting block 1346794) 
[35760.242054] buffer_io_error: 432 callbacks suppressed 
[35760.242059] Buffer I/O error on device dm-0, logical block 1346794 
[35760.243052] Buffer I/O error on device dm-0, logical block 1346795 
[35760.243764] Buffer I/O error on device dm-0, logical block 1346796 
[35760.244551] Buffer I/O error on device dm-0, logical block 1346797 
[35760.245262] Buffer I/O error on device dm-0, logical block 1346798 
[35760.246193] Buffer I/O error on device dm-0, logical block 1346799 
[35760.246912] Buffer I/O error on device dm-0, logical block 1346800 
[35760.247661] Buffer I/O error on device dm-0, logical block 1346801 
[35760.248645] Buffer I/O error on device dm-0, logical block 1346802 
[35760.249395] Buffer I/O error on device dm-0, logical block 1346803 
[35760.250335] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278597 starting block 1433857) 
[35760.251113] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278598 starting block 13692220) 
[35760.251986] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278599 starting block 4465588) 
[35760.252883] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278597 starting block 1349627) 
[35760.253656] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278597 starting block 1436065) 
[35760.254404] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278597 starting block 1436609) 
[35760.255264] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278595 starting block 1451398) 
[35760.256777] Buffer I/O error on dev dm-0, logical block 7743076, lost sync page write 
[35760.259850] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35760.260914] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35760.262186] Aborting journal on device dm-0-8. 
[35760.262918] Buffer I/O error on dev dm-0, logical block 7741441, lost sync page write 
[35760.262920] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35760.263669] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35760.265606] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35760.266729] EXT4-fs (dm-0): I/O error while writing superblock 
[35760.267488] EXT4-fs (dm-0): Remounting filesystem read-only 
[35761.130037] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35762.001088] EXT4-fs (dm-0): recovery complete 
[35762.011888] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35764.250404] buffer_io_error: 4 callbacks suppressed 
[35764.250414] Buffer I/O error on dev dm-0, logical block 13631542, lost async page write 
[35764.251619] Buffer I/O error on dev dm-0, logical block 13631545, lost async page write 
[35764.252131] Buffer I/O error on dev dm-0, logical block 13633590, lost async page write 
[35764.253503] EXT4-fs error (device dm-0): __ext4_find_entry:1678: inode #852043: comm fsstress: reading directory lblock 0 
[35764.253527] Buffer I/O error on dev dm-0, logical block 13633594, lost async page write 
[35764.254920] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35764.255800] Aborting journal on device dm-0-8. 
[35764.256566] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852052 starting block 110737) 
[35764.257236] Buffer I/O error on dev dm-0, logical block 7741441, lost sync page write 
[35764.258505] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35764.258511] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35764.259064] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35764.259288] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35764.259360] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35764.259580] EXT4-fs (dm-0): ext4_do_writepages: jbd2_start: 2017 pages, ino 852052; err -5 
[35764.260286] EXT4-fs (dm-0): I/O error while writing superblock 
[35764.261228] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35764.264422] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35764.265664] EXT4-fs (dm-0): I/O error while writing superblock 
[35764.265675] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35764.266523] EXT4-fs (dm-0): Remounting filesystem read-only 
[35764.267096] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35764.268115] EXT4-fs (dm-0): I/O error while writing superblock 
[35764.321584] Buffer I/O error on dev dm-0, logical block 15728576, async page read 
[35764.322502] Buffer I/O error on dev dm-0, logical block 15728577, async page read 
[35765.112893] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35765.955813] EXT4-fs (dm-0): recovery complete 
[35765.961733] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35767.098045] Aborting journal on device dm-0-8. 
[35767.098835] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35767.099371] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35767.101491] EXT4-fs (dm-0): I/O error while writing superblock 
[35767.102259] EXT4-fs (dm-0): Remounting filesystem read-only 
[35767.855929] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35768.570841] EXT4-fs (dm-0): recovery complete 
[35768.576985] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35768.725066] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 174, block_bitmap = 1310735 
[35768.725934] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 175, block_bitmap = 1310736 
[35768.726066] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35768.726822] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35768.727049] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm kworker/u50:1: Cannot read block bitmap - block_group = 174, block_bitmap = 1310735 
[35768.727104] Aborting journal on device dm-0-8. 
[35768.727203] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm kworker/u50:1: Cannot read block bitmap - block_group = 175, block_bitmap = 1310736 
[35768.727665] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852010 starting block 1332745) 
[35768.727749] buffer_io_error: 522 callbacks suppressed 
[35768.727760] Buffer I/O error on device dm-0, logical block 1332745 
[35768.727776] Buffer I/O error on device dm-0, logical block 1332746 
[35768.727819] Buffer I/O error on device dm-0, logical block 1332747 
[35768.727832] Buffer I/O error on device dm-0, logical block 1332748 
[35768.727976] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 174, block_bitmap = 1310735 
[35768.729159] Buffer I/O error on device dm-0, logical block 1332749 
[35768.729169] Buffer I/O error on device dm-0, logical block 1332750 
[35768.729176] Buffer I/O error on device dm-0, logical block 1332751 
[35768.729183] Buffer I/O error on device dm-0, logical block 1332752 
[35768.729201] Buffer I/O error on device dm-0, logical block 1332753 
[35768.729210] Buffer I/O error on device dm-0, logical block 1332754 
[35768.729307] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35768.729343] EXT4-fs (dm-0): I/O error while writing superblock 
[35768.729446] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35768.730981] EXT4-fs (dm-0): I/O error while writing superblock 
[35768.732162] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35768.732429] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852010 starting block 1360623) 
[35768.734186] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35768.734807] EXT4-fs (dm-0): I/O error while writing superblock 
[35768.746665] EXT4-fs (dm-0): Remounting filesystem read-only 
[35769.449359] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35769.949999] EXT4-fs (dm-0): recovery complete 
[35769.950713] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35772.084650] Aborting journal on device dm-0-8. 
[35772.085397] buffer_io_error: 25 callbacks suppressed 
[35772.085401] Buffer I/O error on dev dm-0, logical block 7741441, lost sync page write 
[35772.085653] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35772.086186] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35772.086677] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35772.086693] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35772.086935] Buffer I/O error on dev dm-0, logical block 4498087, async page read 
[35772.087180] Buffer I/O error on dev dm-0, logical block 4498088, async page read 
[35772.087307] Buffer I/O error on dev dm-0, logical block 4498087, async page read 
[35772.087350] Buffer I/O error on dev dm-0, logical block 4498088, async page read 
[35772.087416] Buffer I/O error on dev dm-0, logical block 4498087, async page read 
[35772.087458] Buffer I/O error on dev dm-0, logical block 4498088, async page read 
[35772.088077] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35772.088208] EXT4-fs (dm-0): I/O error while writing superblock 
[35772.088216] EXT4-fs (dm-0): Remounting filesystem read-only 
[35772.088222] EXT4-fs (dm-0): ext4_do_writepages: jbd2_start: 9223372036854775807 pages, ino 278631; err -30 
[35772.088611] Buffer I/O error on dev dm-0, logical block 1, lost sync page write 
[35772.088755] Buffer I/O error on dev dm-0, logical block 13762562, lost async page write 
[35772.090413] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35772.100864] EXT4-fs (dm-0): I/O error while writing superblock 
[35772.100885] EXT4-fs (dm-0): I/O error while writing superblock 
[35772.956603] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35773.824386] EXT4-fs (dm-0): 3 truncates cleaned up 
[35773.825283] EXT4-fs (dm-0): recovery complete 
[35773.915764] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35775.102896] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278633 starting block 13779933) 
[35775.104737] buffer_io_error: 179 callbacks suppressed 
[35775.104741] Buffer I/O error on device dm-0, logical block 13779933 
[35775.105788] Buffer I/O error on device dm-0, logical block 13779934 
[35775.106532] Buffer I/O error on device dm-0, logical block 13779935 
[35775.107435] Buffer I/O error on device dm-0, logical block 13779936 
[35775.108894] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35775.108995] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 112, block_bitmap = 917505 
[35775.109139] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 112, block_bitmap = 917505 
[35775.114548] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278633 starting block 13779844) 
[35775.115472] Buffer I/O error on device dm-0, logical block 13779844 
[35775.115991] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 240, block_bitmap = 1966081 
[35775.116622] Buffer I/O error on device dm-0, logical block 13779845 
[35775.117877] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 240, block_bitmap = 1966081 
[35775.118554] Buffer I/O error on device dm-0, logical block 13779846 
[35775.118564] Buffer I/O error on device dm-0, logical block 13779847 
[35775.118610] Buffer I/O error on device dm-0, logical block 13779848 
[35775.121950] Buffer I/O error on device dm-0, logical block 13779849 
[35775.123055] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35775.123888] Aborting journal on device dm-0-8. 
[35775.124056] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278642 starting block 13813397) 
[35775.124597] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm ext4lazyinit: Cannot read block bitmap - block_group = 368, block_bitmap = 3014657 
[35775.124660] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35775.124889] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35775.125051] EXT4-fs (dm-0): I/O error while writing superblock 
[35775.125062] EXT4-fs (dm-0): Remounting filesystem read-only 
[35775.125374] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35775.125427] EXT4-fs (dm-0): ext4_do_writepages: jbd2_start: 1014 pages, ino 278642; err -5 
[35775.126753] EXT4-fs (dm-0): I/O error while writing superblock 
[35775.904912] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35776.701465] EXT4-fs (dm-0): 1 truncate cleaned up 
[35776.702282] EXT4-fs (dm-0): recovery complete 
[35776.755936] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35776.884339] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 482, block_bitmap = 3932163 
[35776.884347] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 13748970) 
[35776.886429] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 4345) 
[35776.887849] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 4790) 
[35776.888106] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 482, block_bitmap = 3932163 
[35776.890080] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 482, block_bitmap = 3932163 
[35776.890315] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278650 starting block 10873) 
[35776.893040] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278650 starting block 10957) 
[35776.893793] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278650 starting block 11107) 
[35776.895056] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35776.895080] EXT4-fs error (device dm-0): ext4_check_bdev_write_error:224: comm fsstress: Error while async write back metadata 
[35776.895475] Aborting journal on device dm-0-8. 
[35776.897553] EXT4-fs error (device dm-0) in ext4_evict_inode:226: Journal has aborted 
[35776.897586] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35776.897625] EXT4-fs (dm-0): I/O error while writing superblock 
[35776.897811] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35776.898653] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35776.901252] EXT4-fs (dm-0): I/O error while writing superblock 
[35776.901319] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35776.902063] EXT4-fs (dm-0): Remounting filesystem read-only 
[35776.902587] EXT4-fs (dm-0): I/O error while writing superblock 
[35777.627679] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35778.121113] EXT4-fs (dm-0): recovery complete 
[35778.121802] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35779.238712] buffer_io_error: 49 callbacks suppressed 
[35779.238723] Buffer I/O error on dev dm-0, logical block 1313739, async page read 
[35779.238733] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 1686, block_bitmap = 13762567 
[35779.239110] Buffer I/O error on dev dm-0, logical block 1313740, async page read 
[35779.240108] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 1686, block_bitmap = 13762567 
[35779.240590] Buffer I/O error on dev dm-0, logical block 1313741, async page read 
[35779.241712] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 1686, block_bitmap = 13762567 
[35779.242346] Buffer I/O error on dev dm-0, logical block 1313739, async page read 
[35779.243136] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 92365) 
[35779.243453] EXT4-fs error (device dm-0): ext4_wait_block_bitmap:574: comm fsstress: Cannot read block bitmap - block_group = 1686, block_bitmap = 13762567 
[35779.244019] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 13824120) 
[35779.244469] Buffer I/O error on dev dm-0, logical block 1313740, async page read 
[35779.244595] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 95858) 
[35779.245083] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 13822300) 
[35779.245604] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245878 starting block 13822412) 
[35779.245953] Buffer I/O error on dev dm-0, logical block 1313741, async page read 
[35779.247758] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 82042 starting block 90126) 
[35779.248602] Buffer I/O error on dev dm-0, logical block 67279, async page read 
[35779.249249] Buffer I/O error on dev dm-0, logical block 1313739, async page read 
[35779.249295] Buffer I/O error on dev dm-0, logical block 1313740, async page read 
[35779.249337] Buffer I/O error on dev dm-0, logical block 1313741, async page read 
[35779.249636] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 82042 starting block 13828097) 
[35779.254841] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 278652 starting block 13828235) 
[35779.256038] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 82042 starting block 1352691) 
[35779.257868] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 82042 starting block 1352797) 
[35779.285144] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35779.286248] Aborting journal on device dm-0-8. 
[35779.286976] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35779.28741 
[35779.316758] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35779.316763] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35779.316920] EXT4-fs (dm-0): I/O error while writing superblock 
[35779.316929] EXT4-fs (dm-0): Remounting filesystem read-only 
[35779.317280] EXT4-fs (dm-0): I/O error while writing superblock 
[35779.566358] EXT4-fs (dm-0): I/O error while writing superblock 
[35779.788181] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35779.793029] EXT4-fs (dm-0): I/O error while writing superblock 
[35780.027305] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35780.766503] EXT4-fs (dm-0): recovery complete 
[35780.778784] EXT4-fs (dm-0): mounted filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a r/w with ordered data mode. Quota mode: none. 
[35776.608309] restraintd[1442]: *** Current Time: Sat Sep 02 04:19:01 2023  Localwatchdog at: Sun Sep 03 18:30:59 2023 
[35782.983326] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245908 starting block 4480114) 
[35782.983983] buffer_io_error: 2689 callbacks suppressed 
[35782.983987] Buffer I/O error on device dm-0, logical block 4480114 
[35782.985012] Buffer I/O error on device dm-0, logical block 4480115 
[35782.985743] Buffer I/O error on device dm-0, logical block 4480116 
[35782.986500] Buffer I/O error on device dm-0, logical block 4480117 
[35782.987204] Buffer I/O error on device dm-0, logical block 4480118 
[35782.988169] Buffer I/O error on device dm-0, logical block 4480119 
[35782.988920] Buffer I/O error on device dm-0, logical block 4480120 
[35782.991085] Buffer I/O error on device dm-0, logical block 4480121 
[35782.991848] Buffer I/O error on device dm-0, logical block 4480122 
[35782.992589] Buffer I/O error on device dm-0, logical block 4480123 
[35782.993532] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245921 starting block 3940559) 
[35782.994339] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 245786 starting block 4487680) 
[35782.995696] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852108 starting block 4741) 
[35782.996400] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852108 starting block 18945) 
[35782.998202] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852108 starting block 18946) 
[35782.999189] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852116 starting block 13640449) 
[35783.000337] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852101 starting block 4764) 
[35783.001289] EXT4-fs warning (device dm-0): ext4_end_bio:343: I/O error 10 writing to inode 852101 starting block 2472) 
[35783.002165] JBD2: Detected IO errors while flushing file data on dm-0-8 
[35783.003955] Aborting journal on device dm-0-8. 
[35783.004891] JBD2: I/O error when updating journal superblock for dm-0-8. 
[35783.005135] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35783.006191] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35783.006241] EXT4-fs error (device dm-0): ext4_journal_check_start:84: comm fsstress: Detected aborted journal 
[35783.007698] EXT4-fs (dm-0): I/O error while writing superblock 
[35783.007724] EXT4-fs (dm-0): previous I/O error to superblock detected 
[35783.007787] EXT4-fs (dm-0): I/O error while writing superblock 
[35783.007794] EXT4-fs (dm-0): Remounting filesystem read-only 
[35783.008496] EXT4-fs (dm-0): I/O error while writing superblock 
[35783.818178] EXT4-fs (dm-0): unmounting filesystem 51750a3d-e318-429d-b9c1-c22ff9df118a. 
[35784.346930] JBD2: Invalid checksum recovering data block 3932226 in log 
[35784.347399] JBD2: Invalid checksum recovering data block 106 in log 
[35784.585695] JBD2: journal recovery failed 
[35784.585948] EXT4-fs (dm-0): error loading journal 
[35785.236143] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[35786.132607] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none. 
[35808.615912] EXT4-fs (sda4): mounted filesystem 430aa3b6-d8c9-4f40-bf82-ca4eb95d3fa4 r/w with ordered data mode. Quota mode: none. 
[35808.646741] EXT4-fs (sda4): unmounting filesystem 430aa3b6-d8c9-4f40-bf82-ca4eb95d3fa4. 
[35808.828231] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[35810.181984] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none.
...
...
[39556.971012] run fstests generic/482 at 2023-09-02 05:21:56 
[39559.921158] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[39556.453564] restraintd[1442]: *** Current Time: Sat Sep 02 05:22:01 2023  Localwatchdog at: Sun Sep 03 18:30:59 2023 
[39561.860720] device-mapper: thin: Data device (dm-1) discard unsupported: Disabling discard passdown. 
[39563.789077] EXT4-fs (dm-4): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39603.557128] EXT4-fs (dm-4): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39606.319914] EXT4-fs (dm-3): recovery complete 
[39606.320624] EXT4-fs (dm-3): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39606.350151] EXT4-fs (dm-3): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39609.086581] EXT4-fs (dm-3): recovery complete 
[39609.091906] EXT4-fs (dm-3): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39609.119662] EXT4-fs (dm-3): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39611.692319] EXT4-fs (dm-3): 3 truncates cleaned up 
[39611.693012] EXT4-fs (dm-3): recovery complete 
[39611.764112] EXT4-fs (dm-3): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39611.792213] EXT4-fs (dm-3): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39614.643226] EXT4-fs (dm-3): 3 truncates cleaned up 
[39614.644018] EXT4-fs (dm-3): recovery complete 
[39614.712002] EXT4-fs (dm-3): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39614.740515] EXT4-fs (dm-3): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39618.096068] EXT4-fs (dm-3): 1 truncate cleaned up 
[39618.096797] EXT4-fs (dm-3): recovery complete 
[39618.144804] EXT4-fs (dm-3): mounted filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc r/w with ordered data mode. Quota mode: none. 
[39618.172712] EXT4-fs (dm-3): unmounting filesystem d62704f7-00be-4efd-aeb0-d2b7dd203ccc. 
[39621.077674] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none. 
[39616.192949] restraintd[1442]: *** Current Time: Sat Sep 02 05:23:01 2023  Localwatchdog at: Sun Sep 03 18:30:59 2023 
[39643.279212] EXT4-fs (sda4): mounted filesystem 1918324a-858b-4e7a-b0cd-1c5c046c1cf8 r/w with ordered data mode. Quota mode: none. 
[39643.310140] EXT4-fs (sda4): unmounting filesystem 1918324a-858b-4e7a-b0cd-1c5c046c1cf8. 
[39643.502308] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[39645.142999] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none.
...
...
[40656.609079] run fstests generic/530 at 2023-09-02 05:40:16 
[40663.063496] EXT4-fs (sda4): mounted filesystem 3da3ed7c-69ef-490b-963e-4538c3fe6901 r/w with ordered data mode. Quota mode: none. 
[40663.091672] EXT4-fs (sda4): shut down requested (1) 
[40663.092370] Aborting journal on device sda4-8. 
[40663.129520] EXT4-fs (sda4): unmounting filesystem 3da3ed7c-69ef-490b-963e-4538c3fe6901. 
[40667.471901] EXT4-fs (sda4): mounted filesystem 645d073f-24ff-487a-9594-c7d5d0549857 r/w with ordered data mode. Quota mode: none. 
[40681.753309] EXT4-fs (sda4): shut down requested (1) 
[40682.024481] Aborting journal on device sda4-8. 
[40688.769049] EXT4-fs (sda4): unmounting filesystem 645d073f-24ff-487a-9594-c7d5d0549857. 
[40690.222805] EXT4-fs error (device sda4): ext4_map_blocks:577: inode #8: block 7746285: comm mount: lblock 4844 mapped to illegal pblock 7746285 (length 1) 
[40690.232307] EXT4-fs (sda4): journal bmap failed: block 4844 ret -117 
[40690.232307]  
[40690.233018] JBD2: bad block at offset 4844 
[40690.513437] JBD2: journal recovery failed 
[40690.532275] EXT4-fs (sda4): error loading journal 
[40690.953437] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[40691.888572] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none. 
[40696.498115] restraintd[1442]: *** Current Time: Sat Sep 02 05:41:01 2023  Localwatchdog at: Sun Sep 03 18:30:59 2023 
[40711.632595] EXT4-fs (sda4): mounted filesystem d56c4efe-3755-47f0-abaf-e76215955957 r/w with ordered data mode. Quota mode: none. 
[40711.663482] EXT4-fs (sda4): unmounting filesystem d56c4efe-3755-47f0-abaf-e76215955957. 
[40711.845424] EXT4-fs (sda5): unmounting filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97. 
[40713.200627] EXT4-fs (sda5): mounted filesystem 8c07d08e-4e8d-4551-94f7-1fbf38d84d97 r/w with ordered data mode. Quota mode: none.
...


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-03 12:00 [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails Zorro Lang
@ 2023-09-03 20:40 ` Theodore Ts'o
  2023-09-04  6:08   ` Theodore Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2023-09-03 20:40 UTC (permalink / raw)
  To: Zorro Lang; +Cc: linux-ext4, fstests, regressions

On Sun, Sep 03, 2023 at 08:00:01PM +0800, Zorro Lang wrote:
> Hi ext4 folks,
> 
> Recently I found that lots of fstests cases which belong to "recoveryloop"
> (e.g. g/388 [1], g/455 [2], g/475 [3] and g/482 [4]) or which do fs
> shutdown/resize tests (e.g. ext4/059 [5], g/530 [6]) fail on ext4 with 1k
> blocksize; the kernel is linux v6.6-rc0+ (HEAD=b84acc11b1c9).
> 
> I tested with MKFS_OPTIONS="-b 1024", no specific MOUNT_OPTIONS. I hit these
> failures several times, and I didn't hit them in my last regression test on
> v6.5-rc7+. So I think this might be a regression. And I didn't hit
> these failures on xfs. If this is a known issue that will be fixed soon,
> feel free to tell me.

TL;DR: there definitely seems to be something going on with g/455 and
g/482 with the ext4/1k blocksize case in Linus's latest upstream tree,
although it wasn't there in the ext4 branch which I sent to Linus to
pull.

Unfortunately, generic/475 is a known failure, especially in the 1k
block size case.  The rate seems to change a bit over time.  For
example from 6.2:

ext4/1k: 522 tests, 2 failures, 45 skipped, 6153 seconds
  Flaky: generic/051: 40% (2/5)   generic/475: 60% (3/5)

and from 6.1.0-rc4:

ext4/1k: 522 tests, 2 failures, 45 skipped, 5660 seconds
  Flaky: generic/051: 60% (3/5)   generic/475: 40% (2/5)

In 6.5-rc3, it looks like the rate has gotten worse:

ext4/1k: 30 tests, 29 failures, 2402 seconds
  Flaky: generic/475: 97% (29/30)

Alas, finding a root cause for generic/475 has been challenging.  I
suspect that it happens when we crash while doing a large truncate on
a highly fragmented file system, such that the truncate has to span
multiple transactions, with the inode on the orphan list so the kernel
can complete the truncate when we clean up the orphan list if we crash
mid-truncate.  However, that's just a theory, and I don't yet have
hard evidence.

The generic/388 test is very different.  It uses the shutdown ioctl,
and that's something that ext4 has never completely handled correctly.
Doing it right would require adding some locks in hot paths, so it's
one which I've suppressed for all of my ext4 tests[1].

[1] https://github.com/tytso/xfstests-bld/blob/master/test-appliance/files/root/fs/ext4/exclude

The generic/455 and generic/482 tests work by using dm-log-writes, and
that was *not* failing on the branch (v6.5.0-rc3-60-g768d612f7982) for
which I sent a pull request to Linus:

ext4/1k: 10 tests, 63 seconds
  generic/455  Pass     4s
  generic/482  Pass     8s
  generic/455  Pass     5s
  generic/482  Pass     8s
  generic/455  Pass     5s
  generic/482  Pass     7s
  generic/455  Pass     5s
  generic/482  Pass     8s
  generic/455  Pass     5s
  generic/482  Pass     8s
Totals: 10 tests, 0 skipped, 0 failures, 0 errors, 63s

... but I can confirm that it's failing on Linus's upstream (I tested
commit 708283abf896):

ext4/1k: 2 tests, 2 failures, 31 seconds
  generic/455  Failed   4s
  generic/455  Failed   2s
  generic/455  Pass     5s
  generic/455  Failed   3s
  generic/455  Failed   2s
  generic/482  Failed   2s
  generic/482  Failed   3s
  generic/482  Failed   1s
  generic/482  Failed   3s
  generic/482  Failed   4s
Totals: 10 tests, 0 skipped, 9 failures, 0 errors, 29s

						- Ted

P.S.  After doing some digging, it appears generic/455 does have some
failures on 6.4 (20%) and 6.5-rc3 (5%) on the ext4/1k blocksize test
config.  But *something* is definitely causing a much greater failure
rate in Linus's upstream.  The good news is that should make it
relatively easy to bisect.  I'll look into it.  Thanks for flagging
this.


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-03 20:40 ` Theodore Ts'o
@ 2023-09-04  6:08   ` Theodore Ts'o
  2023-09-05 22:11     ` Matthew Wilcox
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2023-09-04  6:08 UTC (permalink / raw)
  To: Zorro Lang
  Cc: linux-ext4, fstests, regressions, Matthew Wilcox, Andrew Morton

#regzbot introduced: 8147c4c4546f9f05ef03bb839b741473b28bb560 ^

OK, I've isolated the regression of generic/455 failing with ext4/1k
to this commit, which came in via the mm tree.  Nothing seems
*obviously* wrong, but I'm not sure if there are any differences in
the semantics of the new folio functions such as kmap_local_folio,
offset_in_folio, folio_set_bh() which might be making a difference.

Using kvm-xfstests[1] I bisected this via the command:

% install-kconfig ; kbuild ; kvm-xfstests -c ext4/1k -C 10 generic/455

[1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md

And the bisection pointed me at this commit:

    commit 8147c4c4546f9f05ef03bb839b741473b28bb560 (refs/bisect/bad)
    Author: Matthew Wilcox (Oracle) <willy@infradead.org>
    AuthorDate: Thu Jul 13 04:55:11 2023 +0100
    Commit: Andrew Morton <akpm@linux-foundation.org>
    CommitDate: Fri Aug 18 10:12:30 2023 -0700

        jbd2: use a folio in jbd2_journal_write_metadata_buffer()

During the bisection, I treated a commit with 3+ failures as "bad",
and one with 0-2 failures as "good".  Running generic/455 50 times to
get a sense of the failure rate, with the first bad commit
(8147c4c4546f), I got:

    ext4/1k: 50 tests, 21 failures, 223 seconds
      Flaky: generic/455: 42% (21/50)
    Totals: 50 tests, 0 skipped, 21 failures, 0 errors, 223s

While with the immediately preceding commit (07811230c3cd), I got:

    ext4/1k: 50 tests, 4 failures, 235 seconds
      Flaky: generic/455:  8% (4/50)
    Totals: 50 tests, 0 skipped, 4 failures, 0 errors, 235s
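
The good/bad rule used during the bisection, and the percentages in the
two summaries above, can be sketched in a few lines of C.  This is only
an illustration and not part of kvm-xfstests: the helper names
commit_is_bad() and failure_percent() and the BAD_FAILURE_THRESHOLD
constant are made up here, and the threshold assumes the 10 runs per
commit requested with "-C 10".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed cut-off, taken from the description above: with 10 runs per
 * commit, 3 or more failures count as "bad", 0-2 as "good". */
enum { BAD_FAILURE_THRESHOLD = 3 };

/* Hypothetical helper: classify one bisection step from its failure count. */
static bool commit_is_bad(int failures, int runs)
{
	assert(runs > 0 && failures >= 0 && failures <= runs);
	return failures >= BAD_FAILURE_THRESHOLD;
}

/* Failure rate in percent, rounded to the nearest integer. */
static int failure_percent(int failures, int runs)
{
	assert(runs > 0 && failures >= 0 && failures <= runs);
	return (failures * 100 + runs / 2) / runs;
}
```

With these definitions, failure_percent(21, 50) is 42 and
failure_percent(4, 50) is 8, which matches the two "Flaky:" lines.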

Comparing these two commits (8147c4c4546f vs 07811230c3cd) using
ext4 with a 4k block size, I get:

    ext4/4k: 50 tests, 2 failures, 365 seconds
      Flaky: generic/455:  4% (2/50)
    Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 365s

vs

    ext4/4k: 50 tests, 2 failures, 349 seconds
      Flaky: generic/455:  4% (2/50)
    Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 349s

So the issue seems to be specifically with a sub-page block size,
since ext4/4k doesn't show any difference between the two commits,
while ext4/1k does.

   	       	     		       	 - Ted


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-04  6:08   ` Theodore Ts'o
@ 2023-09-05 22:11     ` Matthew Wilcox
  2023-09-06 11:03       ` Ritesh Harjani
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2023-09-05 22:11 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Zorro Lang, linux-ext4, fstests, regressions, Andrew Morton

On Mon, Sep 04, 2023 at 02:08:19AM -0400, Theodore Ts'o wrote:
> #regzbot introduced: 8147c4c4546f9f05ef03bb839b741473b28bb560 ^
> 
> OK, I've isolated the regression of generic/455 failing with ext4/1k
> to this commit, which came in via the mm tree.  Nothing seems
> *obviously* wrong, but I'm not sure if there are any differences in
> the semantics of the new folio functions such as kmap_local_folio,
> offset_in_folio, folio_set_bh() which might be making a difference.

Thanks for the cc.  Let's see what we can do ...

virt_to_folio() - For an order-0 page, there is no difference.
offset_in_folio() - Ditto
bh->b_page vs bh->b_folio - Ditto
virt_to_folio() - Ditto
folio_set_bh() - Ditto

kmap_local_folio() vs kmap_atomic - Here, we have a difference.
memcpy_from_folio() - Same difference as above.

I suppose it must be this, and yet I cannot understand how it would
make a difference.  Perhaps you can help me?

static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
{
        if (IS_ENABLED(CONFIG_PREEMPT_RT))
                migrate_disable();
        else
                preempt_disable();

        pagefault_disable();
        return __kmap_local_page_prot(page, prot);
}

vs

static inline void *kmap_local_folio(struct folio *folio, size_t offset)
{
        struct page *page = folio_page(folio, offset / PAGE_SIZE);
        return __kmap_local_page_prot(page, kmap_prot) + offset % PAGE_SIZE;
}

I don't believe that returning the address with the offset included
is the problem here.  It must be disabling preemption / migration.
There's no chance this function accesses userspace (... is there?) so
it can't be the pagefault_disable().
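
To make the "offset included" part concrete, the arithmetic in
kmap_local_folio() can be modelled in userspace.  This is a rough
sketch, not kernel code: MODEL_PAGE_SIZE, struct folio_pos and
split_folio_offset() are names invented for the illustration, a 4 KiB
page size is assumed, and nothing is actually mapped.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical type: which page of the folio is mapped, and where the
 * returned pointer lands inside that page. */
struct folio_pos {
	size_t page_index;
	size_t offset_in_page;
};

/* Mirrors the "offset / PAGE_SIZE" and "offset % PAGE_SIZE" split in
 * kmap_local_folio(); no mapping is performed here. */
static struct folio_pos split_folio_offset(size_t folio_size, size_t offset)
{
	struct folio_pos pos;

	assert(offset < folio_size);
	pos.page_index = offset / MODEL_PAGE_SIZE;
	pos.offset_in_page = offset % MODEL_PAGE_SIZE;
	return pos;
}
```

For an order-0 folio the page index is always 0, so only the in-page
part matters; for a hypothetical 16 KiB folio, an offset of 10240 would
select page 2 at in-page offset 2048.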

We can try splitting this up into tiny commits and figuring out which
of them is the problem.  I'll be back at work tomorrow and can look
more deeply then.

> Using kvm-xfstests[1] I bisected this via the command:
> 
> % install-kconfig ; kbuild ; kvm-xfstests -c ext4/1k -C 10 generic/455
> 
> [1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
> 
> 
> And the bisection pointed me at this commit:
> 
>     commit 8147c4c4546f9f05ef03bb839b741473b28bb560 (refs/bisect/bad)
>     Author: Matthew Wilcox (Oracle) <willy@infradead.org>
>     AuthorDate: Thu Jul 13 04:55:11 2023 +0100
>     Commit: Andrew Morton <akpm@linux-foundation.org>
>     CommitDate: Fri Aug 18 10:12:30 2023 -0700
> 
>         jbd2: use a folio in jbd2_journal_write_metadata_buffer()
>     
> During the bisection, I treated a commit with 3+ failures as "bad",
> and one with 0-2 failures as "good".  Running generic/455 50 times to
> get a sense of the failure rate, with the first bad commit
> (8147c4c4546f), I got:
> 
>     ext4/1k: 50 tests, 21 failures, 223 seconds
>       Flaky: generic/455: 42% (21/50)
>     Totals: 50 tests, 0 skipped, 21 failures, 0 errors, 223s
> 
> While with the immediately preceding commit (07811230c3cd), I got:
> 
>     ext4/1k: 50 tests, 4 failures, 235 seconds
>       Flaky: generic/455:  8% (4/50)
>     Totals: 50 tests, 0 skipped, 4 failures, 0 errors, 235s
> 
> 
> 
> Comparing these two commits (8147c4c4546f vs 07811230c3cd) using
> ext4 with a 4k block size, I get:
> 
>     ext4/4k: 50 tests, 2 failures, 365 seconds
>       Flaky: generic/455:  4% (2/50)
>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 365s
> 
> vs
> 
>     ext4/4k: 50 tests, 2 failures, 349 seconds
>       Flaky: generic/455:  4% (2/50)
>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 349s
> 
> So the issue seems to be specifically with a sub-page block size,
> since ext4/4k doesn't show any difference between the two commits,
> while ext4/1k does.

I doubt I tried it with a 1kB block size, so I'll focus on that too.


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-05 22:11     ` Matthew Wilcox
@ 2023-09-06 11:03       ` Ritesh Harjani
  2023-09-06 12:38         ` Matthew Wilcox
  0 siblings, 1 reply; 14+ messages in thread
From: Ritesh Harjani @ 2023-09-06 11:03 UTC (permalink / raw)
  To: Matthew Wilcox, Theodore Ts'o
  Cc: Zorro Lang, linux-ext4, fstests, regressions, Andrew Morton,
	Jan Kara

Matthew Wilcox <willy@infradead.org> writes:

> On Mon, Sep 04, 2023 at 02:08:19AM -0400, Theodore Ts'o wrote:
>> #regzbot introduced: 8147c4c4546f9f05ef03bb839b741473b28bb560 ^
>> 
>> OK, I've isolated the regression of generic/455 failing with ext4/1k
>> to this commit, which came in via the mm tree.  Nothing seems
>> *obviously* wrong, but I'm not sure if there are any differences in
>> the semantics of the new folio functions such as kmap_local_folio,
>> offset_in_folio, folio_set_bh() which might be making a difference.
>
> Thanks for the cc.  Let's see what we can do ...
>
> virt_to_folio() - For an order-0 page, there is no difference.
> offset_in_folio() - Ditto
> bh->b_page vs bh->b_folio - Ditto
> virt_to_folio() - Ditto
> folio_set_bh() - Ditto
>
> kmap_local_folio() vs kmap_atomic - Here, we have a difference.
> memcpy_from_folio() - Same difference as above.
>
> I suppose it must be this, and yet I cannot understand how it would
> make a difference.  Perhaps you can help me?
>
> static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
> {
>         if (IS_ENABLED(CONFIG_PREEMPT_RT))
>                 migrate_disable();
>         else
>                 preempt_disable();
>
>         pagefault_disable();
>         return __kmap_local_page_prot(page, prot);
> }
>
> vs
>
> static inline void *kmap_local_folio(struct folio *folio, size_t offset)
> {
>         struct page *page = folio_page(folio, offset / PAGE_SIZE);
>         return __kmap_local_page_prot(page, kmap_prot) + offset % PAGE_SIZE;
> }
>
> I don't believe that returning the address with the offset included
> is the problem here.  It must be disabling preemption / migration.
> There's no chance this function accesses userspace (... is there?) so
> it can't be the pagefault_disable().
>
> We can try splitting this up into tiny commits and figuring out which
> of them is the problem.  I'll be back at work tomorrow and can look
> more deeply then.
>
>> Using kvm-xfstests[1] I bisected this via the command:
>> 
>> % install-kconfig ; kbuild ; kvm-xfstests -c ext4/1k -C 10 generic/455
>> 
>> [1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
>> 
>> 
>> And the bisection pointed me at this commit:
>> 
>>     commit 8147c4c4546f9f05ef03bb839b741473b28bb560 (refs/bisect/bad)
>>     Author: Matthew Wilcox (Oracle) <willy@infradead.org>
>>     AuthorDate: Thu Jul 13 04:55:11 2023 +0100
>>     Commit: Andrew Morton <akpm@linux-foundation.org>
>>     CommitDate: Fri Aug 18 10:12:30 2023 -0700
>> 
>>         jbd2: use a folio in jbd2_journal_write_metadata_buffer()
>>     

This is in line with my observation too.

However, is this log expected with the diff below when running with ext4/1k?
I am finding a folio with order > 0 here.

<diff>
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 768fa05bcbed..152c08e83fa2 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -369,6 +369,12 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
                new_offset = offset_in_folio(new_folio, jh2bh(jh_in)->b_data);
        }

+       if (folio_size(new_folio) > PAGE_SIZE) {
+               pr_crit("%s: folio_size=%lu, folio_order=%d, new_offset=%u bh_size=%lu folio_test_large=%d\n",
+                       __func__, folio_size(new_folio), folio_order(new_folio), new_offset,
+                       bh_in->b_size, folio_test_large(new_folio));
+       }
+
        mapped_data = kmap_local_folio(new_folio, new_offset);
        /*
         * Fire data frozen trigger if data already wasn't frozen.  Do this

<dmesg log>
[   40.419772] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=0 bh_size=1024 folio_test_large=1
[   40.444737] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=2048 bh_size=1024 folio_test_large=1
[   40.472385] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=3072 bh_size=1024 folio_test_large=1
[   40.560581] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=8192 bh_size=1024 folio_test_large=1
[   40.588512] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=10240 bh_size=1024 folio_test_large=1
[   40.612103] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=7168 bh_size=1024 folio_test_large=1
[   40.636800] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=9216 bh_size=1024 folio_test_large=1
[   40.661166] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=10240 bh_size=1024 folio_test_large=1


Is the following code path a possibility, which could cause the above logs?

   ptr = jbd2_alloc() -> kmem_cache_alloc()
   <..>
   new_folio = virt_to_folio(ptr)
   new_offset = offset_in_folio(new_folio, ptr)
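
The offsets in the log above are what one would expect if a 1k buffer
sits somewhere inside a 16k (order-2) folio.  A small userspace model of
the two offset helpers shows the difference.  This is only a sketch:
model_offset_in_page() and model_offset_in_folio() are stand-ins written
for this illustration, a 4 KiB page size is assumed, and the folio is
assumed to be naturally aligned to its size.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Models offset_in_page(): distance from the start of the containing page. */
static uintptr_t model_offset_in_page(uintptr_t addr)
{
	return addr & (MODEL_PAGE_SIZE - 1);
}

/* Models offset_in_folio() for a naturally aligned folio of 2^order pages:
 * distance from the start of the whole folio, not of a single page. */
static uintptr_t model_offset_in_folio(uintptr_t addr, unsigned int order)
{
	uintptr_t folio_size;

	assert(order < 16);	/* keep the shift well within range */
	folio_size = (uintptr_t)MODEL_PAGE_SIZE << order;
	return addr & (folio_size - 1);
}
```

For a buffer 10240 bytes into an order-2 folio, the folio-relative
offset is 10240 (as in the new_offset=10240 lines), while the
page-relative offset is 2048; for an order-0 folio the two are identical.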

And then I am still not sure what the problem really is.
Is it because, at the time of checkpointing, the path is still not fully
converted to folios?

I am still missing a lot of pieces here, sorry. 

-ritesh

>> During the bisection, I treated a commit with 3+ failures as "bad",
>> and one with 0-2 failures as "good".  Running generic/455 50 times to
>> get a sense of the failure rate, with the first bad commit
>> (8147c4c4546f), I got:
>> 
>>     ext4/1k: 50 tests, 21 failures, 223 seconds
>>       Flaky: generic/455: 42% (21/50)
>>     Totals: 50 tests, 0 skipped, 21 failures, 0 errors, 223s
>> 
>> While with the immediately preceding commit (07811230c3cd), I got:
>> 
>>     ext4/1k: 50 tests, 4 failures, 235 seconds
>>       Flaky: generic/455:  8% (4/50)
>>     Totals: 50 tests, 0 skipped, 4 failures, 0 errors, 235s
>> 
>> 
>> 
>> Comparing these two commits (8147c4c4546f vs 07811230c3cd) using
>> ext4 with a 4k block size, I get:
>> 
>>     ext4/4k: 50 tests, 2 failures, 365 seconds
>>       Flaky: generic/455:  4% (2/50)
>>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 365s
>> 
>> vs
>> 
>>     ext4/4k: 50 tests, 2 failures, 349 seconds
>>       Flaky: generic/455:  4% (2/50)
>>     Totals: 50 tests, 0 skipped, 2 failures, 0 errors, 349s
>> 
>> So the issue seems to be specifically with a sub-page block size,
>> since ext4/4k doesn't show any difference between the two commits,
>> while ext4/1k does.
>
> I doubt I tried it with a 1kB block size, so I'll focus on that too.


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-06 11:03       ` Ritesh Harjani
@ 2023-09-06 12:38         ` Matthew Wilcox
  2023-09-06 19:51           ` Matthew Wilcox
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2023-09-06 12:38 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

On Wed, Sep 06, 2023 at 04:33:35PM +0530, Ritesh Harjani wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
> > On Mon, Sep 04, 2023 at 02:08:19AM -0400, Theodore Ts'o wrote:
> >> #regzbot introduced: 8147c4c4546f9f05ef03bb839b741473b28bb560 ^
> >> 
> >> OK, I've isolated the regression of generic/455 failing with ext4/1k
> >> to this commit, which came in via the mm tree.  Nothing seems
> >> *obviously* wrong, but I'm not sure if there are any differences in
> >> the semantics of the new folio functions such as kmap_local_folio,
> >> offset_in_folio, folio_set_bh() which might be making a difference.
> >
> > Thanks for the cc.  Let's see what we can do ...
> >
> > virt_to_folio() - For an order-0 page, there is no difference.
> > offset_in_folio() - Ditto
> > bh->b_page vs bh->b_folio - Ditto
> > virt_to_folio() - Ditto
> > folio_set_bh() - Ditto
> >
> > kmap_local_folio() vs kmap_atomic - Here, we have a difference.
> > memcpy_from_folio() - Same difference as above.
> >
> > I suppose it must be this, and yet I cannot understand how it would
> > make a difference.  Perhaps you can help me?
> >
> > static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
> > {
> >         if (IS_ENABLED(CONFIG_PREEMPT_RT))
> >                 migrate_disable();
> >         else
> >                 preempt_disable();
> >
> >         pagefault_disable();
> >         return __kmap_local_page_prot(page, prot);
> > }
> >
> > vs
> >
> > static inline void *kmap_local_folio(struct folio *folio, size_t offset)
> > {
> >         struct page *page = folio_page(folio, offset / PAGE_SIZE);
> >         return __kmap_local_page_prot(page, kmap_prot) + offset % PAGE_SIZE;
> > }
> >
> > I don't believe that returning the address with the offset included
> > is the problem here.  It must be disabling preemption / migration.
> > There's no chance this function accesses userspace (... is there?) so
> > it can't be the pagefault_disable().
> >
> > We can try splitting this up into tiny commits and figuring out which
> > of them is the problem.  I'll be back at work tomorrow and can look
> > more deeply then.
> >
> >> Using kvm-xfstests[1] I bisected this via the command:
> >> 
> >> % install-kconfig ; kbuild ; kvm-xfstests -c ext4/1k -C 10 generic/455
> >> 
> >> [1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
> >> 
> >> 
> >> And the bisection pointed me at this commit:
> >> 
> >>     commit 8147c4c4546f9f05ef03bb839b741473b28bb560 (refs/bisect/bad)
> >>     Author: Matthew Wilcox (Oracle) <willy@infradead.org>
> >>     AuthorDate: Thu Jul 13 04:55:11 2023 +0100
> >>     Commit: Andrew Morton <akpm@linux-foundation.org>
> >>     CommitDate: Fri Aug 18 10:12:30 2023 -0700
> >> 
> >>         jbd2: use a folio in jbd2_journal_write_metadata_buffer()
> >>     
> 
> This is in line with my observation too.
> 
> However, is this log expected with below diff when running with ext4/1k?
> I am finding a folio with order > 0 here.
> 
> <diff>
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 768fa05bcbed..152c08e83fa2 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -369,6 +369,12 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
>                 new_offset = offset_in_folio(new_folio, jh2bh(jh_in)->b_data);
>         }
> 
> +       if (folio_size(new_folio) > PAGE_SIZE) {
> +               pr_crit("%s: folio_size=%lu, folio_order=%d, new_offset=%u bh_size=%lu folio_test_large=%d\n",
> +                       __func__, folio_size(new_folio), folio_order(new_folio), new_offset,
> +                       bh_in->b_size, folio_test_large(new_folio));
> +       }
> +
>         mapped_data = kmap_local_folio(new_folio, new_offset);
>         /*
>          * Fire data frozen trigger if data already wasn't frozen.  Do this
> 
> <dmesg log>
> [   40.419772] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=0 bh_size=1024 folio_test_large=1
> [   40.444737] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=2048 bh_size=1024 folio_test_large=1
> [   40.472385] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=3072 bh_size=1024 folio_test_large=1
> [   40.560581] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=8192 bh_size=1024 folio_test_large=1
> [   40.588512] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=10240 bh_size=1024 folio_test_large=1
> [   40.612103] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=7168 bh_size=1024 folio_test_large=1
> [   40.636800] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=9216 bh_size=1024 folio_test_large=1
> [   40.661166] jbd2_journal_write_metadata_buffer: folio_size=16384, folio_order=2, new_offset=10240 bh_size=1024 folio_test_large=1
> 
> 
> Is this code path a possibility, which can cause above logs?
> 
>    ptr = jbd2_alloc() -> kmem_cache_alloc()
>    <..>
>    new_folio = virt_to_folio(ptr)
>    new_offset = offset_in_folio(new_folio, ptr)
> 
> And then I am still not sure what the problem really is? 
> Is it because at the time of checkpointing, the path is still not fully
> converted to folio?

Oh yikes!  I didn't know that the allocation might come from kmalloc!
Yes, slab might use high-order allocations.  I'll have to look through
this and figure out what the problem might be.

Thanks for debugging this.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-06 12:38         ` Matthew Wilcox
@ 2023-09-06 19:51           ` Matthew Wilcox
  2023-09-07  2:56             ` Ritesh Harjani
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2023-09-06 19:51 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

On Wed, Sep 06, 2023 at 01:38:23PM +0100, Matthew Wilcox wrote:
> > Is this code path a possibility, which can cause above logs?
> > 
> >    ptr = jbd2_alloc() -> kmem_cache_alloc()
> >    <..>
> >    new_folio = virt_to_folio(ptr)
> >    new_offset = offset_in_folio(new_folio, ptr)
> > 
> > And then I am still not sure what the problem really is? 
> > Is it because at the time of checkpointing, the path is still not fully
> > converted to folio?
> 
> Oh yikes!  I didn't know that the allocation might come from kmalloc!
> Yes, slab might use high-order allocations.  I'll have to look through
> this and figure out what the problem might be.

I think the probable cause is bh_offset().  Before these patches, if
we allocated a buffer at offset 9kB into an order-2 slab, we'd fill in
b_page with the third page of the slab and calculate bh_offset as 1kB.
With these patches, we set b_page to the first page of the slab, and
bh_offset still comes back as 1kB so we read from / write to entirely
the wrong place.

With this redefinition of bh_offset(), we calculate the offset relative
to the base page if it's a tail page, and relative to the folio if it's
a folio.  Works out nicely ;-)
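
For anyone who wants to check the arithmetic, here is a userspace sketch of
the 9kB example. It assumes 4kB pages; struct model_bh, b_page_addr, order
and resolve() are made up for illustration and are not kernel names. Only
the two masks are taken from the old macro and the proposed definition.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical stand-in for a buffer_head: only what the example needs. */
struct model_bh {
	uintptr_t b_data;      /* address of the buffer's data */
	uintptr_t b_page_addr; /* address of the page that b_page refers to */
	unsigned int order;    /* order of the allocation headed by b_page,
				* 0 for a single page or a tail page */
};

/* Old macro: offset within the 4kB page that contains b_data. */
static unsigned long bh_offset_old(const struct model_bh *bh)
{
	return bh->b_data & ~PAGE_MASK;
}

/* Proposed definition: offset within whatever b_page spans. */
static unsigned long bh_offset_new(const struct model_bh *bh)
{
	unsigned long size = PAGE_SIZE << bh->order; /* models page_size() */

	/* b_page must be aligned to the size it spans */
	assert((bh->b_page_addr & (size - 1)) == 0);
	return bh->b_data & (size - 1);
}

/* Address a caller reaches by adding an offset to b_page's address. */
static uintptr_t resolve(const struct model_bh *bh, unsigned long offset)
{
	return bh->b_page_addr + offset;
}
```

With an order-2 slab at 0x100000 and a buffer 9kB into it (0x102400),
pointing b_page at the third page gives offset 1kB and resolves correctly.
Pointing b_page at the first page with the old macro still gives 1kB and
resolves to 0x100400, which is the wrong place. The new definition gives
9kB and resolves to 0x102400 again.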

I have three other things I'm trying to debug right now, so this isn't
tested, but if you have time you might want to give it a run.

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 6cb3e9af78c9..dc8fcdc40e95 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -173,7 +173,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
 	return test_bit_acquire(BH_Uptodate, &bh->b_state);
 }
 
-#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
+static inline unsigned long bh_offset(struct buffer_head *bh)
+{
+	return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
+}
 
 /* If we *know* page->private refers to buffer_heads */
 #define page_buffers(page)					\

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-06 19:51           ` Matthew Wilcox
@ 2023-09-07  2:56             ` Ritesh Harjani
  2023-09-07  3:47               ` Matthew Wilcox
  0 siblings, 1 reply; 14+ messages in thread
From: Ritesh Harjani @ 2023-09-07  2:56 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

Matthew Wilcox <willy@infradead.org> writes:

> On Wed, Sep 06, 2023 at 01:38:23PM +0100, Matthew Wilcox wrote:
>> > Is this code path a possibility, which can cause above logs?
>> > 
>> >    ptr = jbd2_alloc() -> kmem_cache_alloc()
>> >    <..>
>> >    new_folio = virt_to_folio(ptr)
>> >    new_offset = offset_in_folio(new_folio, ptr)
>> > 
>> > And then I am still not sure what the problem really is? 
>> > Is it because at the time of checkpointing, the path is still not fully
>> > converted to folio?
>> 
>> Oh yikes!  I didn't know that the allocation might come from kmalloc!
>> Yes, slab might use high-order allocations.  I'll have to look through
>> this and figure out what the problem might be.
>
> I think the probable cause is bh_offset().  Before these patches, if
> we allocated a buffer at offset 9kB into an order-2 slab, we'd fill in
> b_page with the third page of the slab and calculate bh_offset as 1kB.
> With these patches, we set b_page to the first page of the slab, and
> bh_offset still comes back as 1kB so we read from / write to entirely
> the wrong place.
>
> With this redefinition of bh_offset(), we calculate the offset relative
> to the base page if it's a tail page, and relative to the folio if it's
> a folio.  Works out nicely ;-)

Thanks Matthew for explaining the problem clearly.


>
> I have three other things I'm trying to debug right now, so this isn't
> tested, but if you have time you might want to give it a run.

sure, I gave it a try.

>
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index 6cb3e9af78c9..dc8fcdc40e95 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -173,7 +173,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
>  	return test_bit_acquire(BH_Uptodate, &bh->b_state);
>  }
>  
> -#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
> +static inline unsigned long bh_offset(struct buffer_head *bh)
> +{
> +	return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
> +}
>  
>  /* If we *know* page->private refers to buffer_heads */
>  #define page_buffers(page)					\


I used "const" for bh to avoid warnings from fs/nilfs/alloc.c

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 4ede47649a81..b61fa79cb7f5 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -171,7 +171,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
        return test_bit_acquire(BH_Uptodate, &bh->b_state);
 }

-#define bh_offset(bh)          ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+static inline unsigned long bh_offset(const struct buffer_head *bh)
+{
+       return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
+}

 /* If we *know* page->private refers to buffer_heads */
 #define page_buffers(page)                                     \


But this change alone was still giving me failures. On looking into
usage of b_data, I found we use offset_in_page() instead of bh_offset()
in jbd2. So I added the changes below in fs/jbd2 to replace offset_in_page()
with bh_offset()...

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 1073259902a6..0c25640714ac 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -304,7 +304,7 @@ static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)

        addr = kmap_atomic(page);
        checksum = crc32_be(crc32_sum,
-               (void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
+               (void *)(addr + bh_offset(bh)), bh->b_size);
        kunmap_atomic(addr);

        return checksum;
@@ -333,7 +333,7 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
        seq = cpu_to_be32(sequence);
        addr = kmap_atomic(page);
        csum32 = jbd2_chksum(j, j->j_csum_seed, (__u8 *)&seq, sizeof(seq));
-       csum32 = jbd2_chksum(j, csum32, addr + offset_in_page(bh->b_data),
+       csum32 = jbd2_chksum(j, csum32, addr + bh_offset(bh),
                             bh->b_size);
        kunmap_atomic(addr);

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 4d1fda1f7143..2ac57f7a242d 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -942,7 +942,7 @@ static void jbd2_freeze_jh_data(struct journal_head *jh)

        J_EXPECT_JH(jh, buffer_uptodate(bh), "Possible IO failure.\n");
        page = bh->b_page;
-       offset = offset_in_page(bh->b_data);
+       offset = bh_offset(bh);
        source = kmap_atomic(page);
        /* Fire data frozen trigger just before we copy the data */
        jbd2_buffer_frozen_trigger(jh, source + offset, jh->b_triggers);


With all of above diffs, here are the results.

ext4/1k: 15 tests, 1 failures, 1709 seconds
  generic/455  Pass     43s
  generic/475  Pass     128s
  generic/482  Pass     183s
  generic/455  Pass     43s
  generic/475  Pass     134s
  generic/482  Pass     191s
  generic/455  Pass     41s
  generic/475  Pass     139s
  generic/482  Pass     135s
  generic/455  Pass     46s
  generic/475  Pass     132s
  generic/482  Pass     146s
  generic/455  Pass     47s
  generic/475  Failed   145s
  generic/482  Pass     156s
Totals: 15 tests, 0 skipped, 1 failures, 0 errors, 1709s

I guess the above failure (generic/475) could be due to its flaky
behaviour, which Ted was mentioning.


Now, while we are at it, I think we should also change reiserfs from
offset_in_page() to bh_offset():

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 015bfe4e4524..23411ec163d4 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -4217,7 +4217,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
                        page = cn->bh->b_page;
                        addr = kmap(page);
                        memcpy(tmp_bh->b_data,
-                              addr + offset_in_page(cn->bh->b_data),
+                              addr + bh_offset(cn->bh),
                               cn->bh->b_size);
                        kunmap(page);
                        mark_buffer_dirty(tmp_bh);


I will also run the "auto" group with ext4/1k with all of the above changes.
Will update the results once it is done.


-ritesh

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-07  2:56             ` Ritesh Harjani
@ 2023-09-07  3:47               ` Matthew Wilcox
  2023-09-07 13:35                 ` Ritesh Harjani
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2023-09-07  3:47 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

On Thu, Sep 07, 2023 at 08:26:35AM +0530, Ritesh Harjani wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
> > On Wed, Sep 06, 2023 at 01:38:23PM +0100, Matthew Wilcox wrote:
> >> > Is this code path a possibility, which can cause above logs?
> >> > 
> >> >    ptr = jbd2_alloc() -> kmem_cache_alloc()
> >> >    <..>
> >> >    new_folio = virt_to_folio(ptr)
> >> >    new_offset = offset_in_folio(new_folio, ptr)
> >> > 
> >> > And then I am still not sure what the problem really is? 
> >> > Is it because at the time of checkpointing, the path is still not fully
> >> > converted to folio?
> >> 
> >> Oh yikes!  I didn't know that the allocation might come from kmalloc!
> >> Yes, slab might use high-order allocations.  I'll have to look through
> >> this and figure out what the problem might be.
> >
> > I think the probable cause is bh_offset().  Before these patches, if
> > we allocated a buffer at offset 9kB into an order-2 slab, we'd fill in
> > b_page with the third page of the slab and calculate bh_offset as 1kB.
> > With these patches, we set b_page to the first page of the slab, and
> > bh_offset still comes back as 1kB so we read from / write to entirely
> > the wrong place.
> >
> > With this redefinition of bh_offset(), we calculate the offset relative
> > to the base page if it's a tail page, and relative to the folio if it's
> > a folio.  Works out nicely ;-)
> 
> Thanks Matthew for explaining the problem clearly.
> 
> 
> >
> > I have three other things I'm trying to debug right now, so this isn't
> > tested, but if you have time you might want to give it a run.
> 
> sure, I gave it a try.
> 
> >
> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> > index 6cb3e9af78c9..dc8fcdc40e95 100644
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -173,7 +173,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
> >  	return test_bit_acquire(BH_Uptodate, &bh->b_state);
> >  }
> >  
> > -#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
> > +static inline unsigned long bh_offset(struct buffer_head *bh)
> > +{
> > +	return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
> > +}
> >  
> >  /* If we *know* page->private refers to buffer_heads */
> >  #define page_buffers(page)					\
> 
> 
> I used "const" for bh to avoid warnings from fs/nilfs/alloc.c

Excellent.  I didn't try compiling nilfs ;-)

> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index 4ede47649a81..b61fa79cb7f5 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -171,7 +171,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
>         return test_bit_acquire(BH_Uptodate, &bh->b_state);
>  }
> 
> -#define bh_offset(bh)          ((unsigned long)(bh)->b_data & ~PAGE_MASK)
> +static inline unsigned long bh_offset(const struct buffer_head *bh)
> +{
> +       return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
> +}
> 
>  /* If we *know* page->private refers to buffer_heads */
>  #define page_buffers(page)                                     \
> 
> 
> But this change alone was still giving me failures. On looking into
> usage of b_data, I found we use offset_in_page() instead of bh_offset()
> in jbd2. So I added the changes below in fs/jbd2 to replace offset_in_page()
> with bh_offset()...
> 
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 1073259902a6..0c25640714ac 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -304,7 +304,7 @@ static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)
> 
>         addr = kmap_atomic(page);
>         checksum = crc32_be(crc32_sum,
> -               (void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
> +               (void *)(addr + bh_offset(bh)), bh->b_size);
>         kunmap_atomic(addr);

Hm, that's not going to work on a highmem machine.  It'll work on 64-bit!
Actually, no, it'll work on a highmem machine because slab doesn't
allocate from highmem.  Still, it's a bit unclean.  Let's go full folio
on this one:

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 1073259902a6..8d6f934c3d95 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -298,14 +298,12 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
 
 static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)
 {
-	struct page *page = bh->b_page;
 	char *addr;
 	__u32 checksum;
 
-	addr = kmap_atomic(page);
-	checksum = crc32_be(crc32_sum,
-		(void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
-	kunmap_atomic(addr);
+	addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
+	checksum = crc32_be(crc32_sum, addr, bh->b_size);
+	kunmap_local(addr);
 
 	return checksum;
 }
@@ -322,7 +320,6 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
 				    struct buffer_head *bh, __u32 sequence)
 {
 	journal_block_tag3_t *tag3 = (journal_block_tag3_t *)tag;
-	struct page *page = bh->b_page;
 	__u8 *addr;
 	__u32 csum32;
 	__be32 seq;
@@ -331,11 +328,10 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
 		return;
 
 	seq = cpu_to_be32(sequence);
-	addr = kmap_atomic(page);
+	addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
 	csum32 = jbd2_chksum(j, j->j_csum_seed, (__u8 *)&seq, sizeof(seq));
-	csum32 = jbd2_chksum(j, csum32, addr + offset_in_page(bh->b_data),
-			     bh->b_size);
-	kunmap_atomic(addr);
+	csum32 = jbd2_chksum(j, csum32, addr, bh->b_size);
+	kunmap_local(addr);
 
 	if (jbd2_has_feature_csum3(j))
 		tag3->t_checksum = cpu_to_be32(csum32);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 4d1fda1f7143..5f08b5fd105a 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -935,19 +935,15 @@ static void warn_dirty_buffer(struct buffer_head *bh)
 /* Call t_frozen trigger and copy buffer data into jh->b_frozen_data. */
 static void jbd2_freeze_jh_data(struct journal_head *jh)
 {
-	struct page *page;
-	int offset;
 	char *source;
 	struct buffer_head *bh = jh2bh(jh);
 
 	J_EXPECT_JH(jh, buffer_uptodate(bh), "Possible IO failure.\n");
-	page = bh->b_page;
-	offset = offset_in_page(bh->b_data);
-	source = kmap_atomic(page);
+	source = kmap_local_folio(bh->b_folio, bh_offset(bh));
 	/* Fire data frozen trigger just before we copy the data */
-	jbd2_buffer_frozen_trigger(jh, source + offset, jh->b_triggers);
-	memcpy(jh->b_frozen_data, source + offset, bh->b_size);
-	kunmap_atomic(source);
+	jbd2_buffer_frozen_trigger(jh, source, jh->b_triggers);
+	memcpy(jh->b_frozen_data, source, bh->b_size);
+	kunmap_local(source);
 
 	/*
 	 * Now that the frozen data is saved off, we need to store any matching

(I've been thinking about adding a kmap_local_bh(bh))
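
To illustrate why the folio-relative offset matters when it is handed to
kmap_local_folio(), here is a userspace sketch of the page selection that
function does (offset / PAGE_SIZE picks the page, offset % PAGE_SIZE is the
position inside it). struct model_folio and model_kmap_local_folio() are
hypothetical names, and plain arrays stand in for mapped pages.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Hypothetical model: a folio is a set of separately "mapped" pages. */
struct model_folio {
	unsigned int nr_pages;
	char *pages[4]; /* mapping of each page; need not be contiguous */
};

/* Mirrors kmap_local_folio(): choose the page, then add the in-page offset. */
static char *model_kmap_local_folio(const struct model_folio *folio,
				    size_t offset)
{
	size_t idx = offset / PAGE_SIZE;

	assert(idx < folio->nr_pages);
	return folio->pages[idx] + offset % PAGE_SIZE;
}
```

In this model an offset of 9216 lands 1024 bytes into the third page, while
an in-page offset of 1024 would land in the first page. That is why the
callers above pass bh_offset(bh) and not offset_in_page(bh->b_data).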

> ext4/1k: 15 tests, 1 failures, 1709 seconds
>   generic/455  Pass     43s
>   generic/475  Pass     128s
>   generic/482  Pass     183s
>   generic/455  Pass     43s
>   generic/475  Pass     134s
>   generic/482  Pass     191s
>   generic/455  Pass     41s
>   generic/475  Pass     139s
>   generic/482  Pass     135s
>   generic/455  Pass     46s
>   generic/475  Pass     132s
>   generic/482  Pass     146s
>   generic/455  Pass     47s
>   generic/475  Failed   145s
>   generic/482  Pass     156s
> Totals: 15 tests, 0 skipped, 1 failures, 0 errors, 1709s
> 
> I guess the above failure (generic/475) could be due to its flaky
> behaviour, which Ted was mentioning.
> 
> 
> Now, while we are at it, I think we should also change reiserfs from
> offset_in_page() to bh_offset():
> 
> diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
> index 015bfe4e4524..23411ec163d4 100644
> --- a/fs/reiserfs/journal.c
> +++ b/fs/reiserfs/journal.c
> @@ -4217,7 +4217,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
>                         page = cn->bh->b_page;
>                         addr = kmap(page);
>                         memcpy(tmp_bh->b_data,
> -                              addr + offset_in_page(cn->bh->b_data),
> +                              addr + bh_offset(cn->bh),
>                                cn->bh->b_size);
>                         kunmap(page);

This one should probably be:

-			addr = kmap(page);
-			memcpy(tmp_bh->b_data,
-				addr + offset_in_page(cn->bh->b_data),
-				cn->bh->b_size);
-			kunmap(page);
+			memcpy_from_folio(tmp_bh->b_data, cn->bh->b_folio,
+					bh_offset(cn->bh), cn->bh->b_size);
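
A rough userspace sketch of what a folio copy helper of this kind has to do
when the source range may span pages: copy at most up to the end of the
current page, then move to the next one. This is a simplified model and not
the kernel implementation; struct model_folio and model_memcpy_from_folio()
are hypothetical names, and it always copies page by page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/* Hypothetical model: a folio is a set of separately "mapped" pages. */
struct model_folio {
	unsigned int nr_pages;
	const char *pages[4];
};

/* Copy len bytes starting at a folio-relative offset, one page at a time. */
static void model_memcpy_from_folio(char *to, const struct model_folio *folio,
				    size_t offset, size_t len)
{
	assert(offset + len <= folio->nr_pages * PAGE_SIZE);

	while (len > 0) {
		size_t in_page = offset % PAGE_SIZE;
		size_t chunk = PAGE_SIZE - in_page; /* room left in this page */

		if (chunk > len)
			chunk = len;
		memcpy(to, folio->pages[offset / PAGE_SIZE] + in_page, chunk);
		to += chunk;
		offset += chunk;
		len -= chunk;
	}
}
```

A single journal block never crosses a page, so the loop body would run once
for the reiserfs case; the point is only that the offset passed in is
relative to the folio, as with bh_offset(cn->bh) above.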

> I will also run the "auto" group with ext4/1k with all of the above changes.
> Will update the results once it is done.

Appreciate it!  I don't think you'll see a significant difference with
the patches above; you've nailed the actual problems and I'm just
using slightly nicer APIs.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-07  3:47               ` Matthew Wilcox
@ 2023-09-07 13:35                 ` Ritesh Harjani
  2023-09-07 14:15                   ` Matthew Wilcox
  0 siblings, 1 reply; 14+ messages in thread
From: Ritesh Harjani @ 2023-09-07 13:35 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

Matthew Wilcox <willy@infradead.org> writes:

> On Thu, Sep 07, 2023 at 08:26:35AM +0530, Ritesh Harjani wrote:
>> Matthew Wilcox <willy@infradead.org> writes:
>> 
>> > On Wed, Sep 06, 2023 at 01:38:23PM +0100, Matthew Wilcox wrote:
>> >> > Is this code path a possibility, which can cause above logs?
>> >> > 
>> >> >    ptr = jbd2_alloc() -> kmem_cache_alloc()
>> >> >    <..>
>> >> >    new_folio = virt_to_folio(ptr)
>> >> >    new_offset = offset_in_folio(new_folio, ptr)
>> >> > 
>> >> > And then I am still not sure what the problem really is? 
>> >> > Is it because at the time of checkpointing, the path is still not fully
>> >> > converted to folio?
>> >> 
>> >> Oh yikes!  I didn't know that the allocation might come from kmalloc!
>> >> Yes, slab might use high-order allocations.  I'll have to look through
>> >> this and figure out what the problem might be.
>> >
>> > I think the probable cause is bh_offset().  Before these patches, if
>> > we allocated a buffer at offset 9kB into an order-2 slab, we'd fill in
>> > b_page with the third page of the slab and calculate bh_offset as 1kB.
>> > With these patches, we set b_page to the first page of the slab, and
>> > bh_offset still comes back as 1kB so we read from / write to entirely
>> > the wrong place.
>> >
>> > With this redefinition of bh_offset(), we calculate the offset relative
>> > to the base page if it's a tail page, and relative to the folio if it's
>> > a folio.  Works out nicely ;-)
>> 
>> Thanks Matthew for explaining the problem clearly.
>> 
>> 
>> >
>> > I have three other things I'm trying to debug right now, so this isn't
>> > tested, but if you have time you might want to give it a run.
>> 
>> sure, I gave it a try.
>> 
>> >
>> > diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
>> > index 6cb3e9af78c9..dc8fcdc40e95 100644
>> > --- a/include/linux/buffer_head.h
>> > +++ b/include/linux/buffer_head.h
>> > @@ -173,7 +173,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
>> >  	return test_bit_acquire(BH_Uptodate, &bh->b_state);
>> >  }
>> >  
>> > -#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
>> > +static inline unsigned long bh_offset(struct buffer_head *bh)
>> > +{
>> > +	return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
>> > +}
>> >  
>> >  /* If we *know* page->private refers to buffer_heads */
>> >  #define page_buffers(page)					\
>> 
>> 
>> I used "const" for bh to avoid warnings from fs/nilfs/alloc.c
>
> Excellent.  I didn't try compiling nilfs ;-)
>
>> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
>> index 4ede47649a81..b61fa79cb7f5 100644
>> --- a/include/linux/buffer_head.h
>> +++ b/include/linux/buffer_head.h
>> @@ -171,7 +171,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
>>         return test_bit_acquire(BH_Uptodate, &bh->b_state);
>>  }
>> 
>> -#define bh_offset(bh)          ((unsigned long)(bh)->b_data & ~PAGE_MASK)
>> +static inline unsigned long bh_offset(const struct buffer_head *bh)
>> +{
>> +       return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
>> +}
>> 
>>  /* If we *know* page->private refers to buffer_heads */
>>  #define page_buffers(page)                                     \
>> 
>> 
>> But this change alone was still giving me failures. On looking into
>> usage of b_data, I found we use offset_in_page() instead of bh_offset()
>> in jbd2. So I added the changes below in fs/jbd2 to replace offset_in_page()
>> with bh_offset()...
>> 
>> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
>> index 1073259902a6..0c25640714ac 100644
>> --- a/fs/jbd2/commit.c
>> +++ b/fs/jbd2/commit.c
>> @@ -304,7 +304,7 @@ static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)
>> 
>>         addr = kmap_atomic(page);
>>         checksum = crc32_be(crc32_sum,
>> -               (void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
>> +               (void *)(addr + bh_offset(bh)), bh->b_size);
>>         kunmap_atomic(addr);
>
> Hm, that's not going to work on a highmem machine.  It'll work on 64-bit!
> Actually, no, it'll work on a highmem machine because slab doesn't
> allocate from highmem.  Still, it's a bit unclean.  Let's go full folio
> on this one:
>
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 1073259902a6..8d6f934c3d95 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -298,14 +298,12 @@ static int journal_finish_inode_data_buffers(journal_t *journal,
>  
>  static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)
>  {
> -	struct page *page = bh->b_page;
>  	char *addr;
>  	__u32 checksum;
>  
> -	addr = kmap_atomic(page);
> -	checksum = crc32_be(crc32_sum,
> -		(void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
> -	kunmap_atomic(addr);
> +	addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
> +	checksum = crc32_be(crc32_sum, addr, bh->b_size);
> +	kunmap_local(addr);
>  
>  	return checksum;
>  }
> @@ -322,7 +320,6 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
>  				    struct buffer_head *bh, __u32 sequence)
>  {
>  	journal_block_tag3_t *tag3 = (journal_block_tag3_t *)tag;
> -	struct page *page = bh->b_page;
>  	__u8 *addr;
>  	__u32 csum32;
>  	__be32 seq;
> @@ -331,11 +328,10 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
>  		return;
>  
>  	seq = cpu_to_be32(sequence);
> -	addr = kmap_atomic(page);
> +	addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
>  	csum32 = jbd2_chksum(j, j->j_csum_seed, (__u8 *)&seq, sizeof(seq));
> -	csum32 = jbd2_chksum(j, csum32, addr + offset_in_page(bh->b_data),
> -			     bh->b_size);
> -	kunmap_atomic(addr);
> +	csum32 = jbd2_chksum(j, csum32, addr, bh->b_size);
> +	kunmap_local(addr);
>  
>  	if (jbd2_has_feature_csum3(j))
>  		tag3->t_checksum = cpu_to_be32(csum32);
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 4d1fda1f7143..5f08b5fd105a 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -935,19 +935,15 @@ static void warn_dirty_buffer(struct buffer_head *bh)
>  /* Call t_frozen trigger and copy buffer data into jh->b_frozen_data. */
>  static void jbd2_freeze_jh_data(struct journal_head *jh)
>  {
> -	struct page *page;
> -	int offset;
>  	char *source;
>  	struct buffer_head *bh = jh2bh(jh);
>  
>  	J_EXPECT_JH(jh, buffer_uptodate(bh), "Possible IO failure.\n");
> -	page = bh->b_page;
> -	offset = offset_in_page(bh->b_data);
> -	source = kmap_atomic(page);
> +	source = kmap_local_folio(bh->b_folio, bh_offset(bh));
>  	/* Fire data frozen trigger just before we copy the data */
> -	jbd2_buffer_frozen_trigger(jh, source + offset, jh->b_triggers);
> -	memcpy(jh->b_frozen_data, source + offset, bh->b_size);
> -	kunmap_atomic(source);
> +	jbd2_buffer_frozen_trigger(jh, source, jh->b_triggers);
> +	memcpy(jh->b_frozen_data, source, bh->b_size);
> +	kunmap_local(source);
>  
>  	/*
>  	 * Now that the frozen data is saved off, we need to store any matching
>
> (I've been thinking about adding a kmap_local_bh(bh))
>
>> ext4/1k: 15 tests, 1 failures, 1709 seconds
>>   generic/455  Pass     43s
>>   generic/475  Pass     128s
>>   generic/482  Pass     183s
>>   generic/455  Pass     43s
>>   generic/475  Pass     134s
>>   generic/482  Pass     191s
>>   generic/455  Pass     41s
>>   generic/475  Pass     139s
>>   generic/482  Pass     135s
>>   generic/455  Pass     46s
>>   generic/475  Pass     132s
>>   generic/482  Pass     146s
>>   generic/455  Pass     47s
>>   generic/475  Failed   145s
>>   generic/482  Pass     156s
>> Totals: 15 tests, 0 skipped, 1 failures, 0 errors, 1709s
>> 
>> I guess the above failure (generic/475) could be due to its flaky
>> behaviour, which Ted was mentioning.
>> 
>> 
>> Now, while we are at it, I think we should also change reiserfs from
>> offset_in_page() to bh_offset():
>> 
>> diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
>> index 015bfe4e4524..23411ec163d4 100644
>> --- a/fs/reiserfs/journal.c
>> +++ b/fs/reiserfs/journal.c
>> @@ -4217,7 +4217,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
>>                         page = cn->bh->b_page;
>>                         addr = kmap(page);
>>                         memcpy(tmp_bh->b_data,
>> -                              addr + offset_in_page(cn->bh->b_data),
>> +                              addr + bh_offset(cn->bh),
>>                                cn->bh->b_size);
>>                         kunmap(page);
>
> This one should probably be:
>
> -			addr = kmap(page);
> -			memcpy(tmp_bh->b_data,
> -				addr + offset_in_page(cn->bh->b_data),
> -				cn->bh->b_size);
> -			kunmap(page);
> +			memcpy_from_folio(tmp_bh->b_data, cn->bh->b_folio,
> +					bh_offset(cn->bh), cn->bh->b_size);
>
>> I will also run "auto" group with ext4/1k with all of above change. Will
>> update the results once it is done.
>
> Appreciate it!  I don't think you'll see a significant difference with
> the patches above; you've nailed the actual problems and I'm just
> using slightly nicer APIs.

Thanks, Matthew, for proposing the final changes using folios.
(Only some minor changes were required in fs/reiserfs/ for unused variables.)
I am pasting the final patch below (with you as the author, plus my
Signed-off-by & Tested-by), which I have tested on my system with
"ext4/1k -g auto".

-------------------- Summary report
KERNEL:    kernel 6.5.0-xfstests-11705-ge1ee6db7734e #62 SMP PREEMPT_DYNAMIC Thu Sep  7 10:39:34 IST 2023 x86_64
CMDLINE:   -c ext4/1k -g auto
CPUS:      4
MEM:       7943.72

ext4/1k: 527 tests, 1 failures, 39 skipped, 9182 seconds
  Failures: ext4/059
Totals: 531 tests, 39 skipped, 5 failures, 0 errors, 9123s

You also mentioned that you would like to add kmap_local_bh(), so I am not
sending this as a separate patch, in case you would like to do it differently.
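
For readers following along, here is a rough userspace sketch of what a
kmap_local_bh()-style helper might boil down to. Everything in it is a
hypothetical stand-in (the model_* names, the structs, the 16k allocation);
it is not kernel code and not necessarily how such a helper would be
written. It only illustrates that mapping the folio at bh_offset(bh) hands
back a pointer that already points at the buffer's data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct model_folio {
	char  *base;	/* start of the allocation, aligned to its size */
	size_t size;	/* size in bytes, assumed to be a power of two */
};

struct model_bh {
	struct model_folio *b_folio;
	char  *b_data;
	size_t b_size;
};

/* Models the patched bh_offset(): offset within the whole folio. */
size_t model_bh_offset(const struct model_bh *bh)
{
	assert((bh->b_folio->size & (bh->b_folio->size - 1)) == 0);
	return (uintptr_t)bh->b_data & (bh->b_folio->size - 1);
}

/* Models kmap_local_folio(): the result is already advanced by offset. */
void *model_kmap_local_folio(struct model_folio *folio, size_t offset)
{
	return folio->base + offset;
}

/* Hypothetical helper in the spirit of the proposed kmap_local_bh(). */
void *model_kmap_local_bh(struct model_bh *bh)
{
	return model_kmap_local_folio(bh->b_folio, model_bh_offset(bh));
}

/* Returns 0 if the mapped pointer lands exactly on the buffer's data. */
int model_demo(void)
{
	struct model_folio folio = { aligned_alloc(16384, 16384), 16384 };

	if (!folio.base)
		return -1;
	memset(folio.base, 0, folio.size);

	/* A 1k buffer that starts 5k into a 16k allocation. */
	struct model_bh bh = { &folio, folio.base + 5 * 1024, 1024 };
	bh.b_data[0] = 'J';

	char *mapped = model_kmap_local_bh(&bh);
	int ok = (mapped == bh.b_data) && (mapped[0] == 'J');

	free(folio.base);
	return ok ? 0 : 1;
}
```

With a helper like this, callers would no longer add the offset by hand
after mapping, which is where the page-size assumption crept in.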

Thanks again for helping with the fix! 

---

From baeedb714497ae8f3809cc6e7cffa8884af43fac Mon Sep 17 00:00:00 2001
Message-Id: <baeedb714497ae8f3809cc6e7cffa8884af43fac.1694092539.git.ritesh.list@gmail.com>
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Date: Thu, 7 Sep 2023 10:01:54 +0530
Subject: [PATCH] buffer: Fix definition of bh_offset() for struct buffer_head

Note that the buffer head infrastructure is being transitioned from
page-based to folio-based; see commit d685c668b069 ("buffer: add b_folio
as an alias of b_page").

Now, jbd2_alloc() allocates a buffer from a kmem cache when the buffer
size is < PAGE_SIZE (e.g. 1k blocksize on 4k pagesize), and we may then
record this buffer in a buffer_head using folio_set_bh():
        bh->b_folio = folio;
        if (!highmem)
          bh->b_data = folio_address(folio) + offset;

So far so good. However, when accessing this buffer's b_data, we use
bh_offset() or offset_in_page(), both of which assume that the memory
backing the buffer is a single PAGE_SIZE page. This is no longer true
with b_folio, as slab might use high-order allocations.

This patch fixes the definition of bh_offset() and makes use of
bh_offset() instead of offset_in_page() in several places in fs/jbd2
and fs/reiserfs.
While we are at it, this patch also converts these places to use the
folio APIs.

Fixes: 8147c4c4546f ("jbd2: use a folio in jbd2_journal_write_metadata_buffer()")
Reported-by: Zorro Lang <zlang@kernel.org>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/jbd2/commit.c            | 16 ++++++----------
 fs/jbd2/transaction.c       | 12 ++++--------
 fs/reiserfs/journal.c       | 11 +++--------
 include/linux/buffer_head.h |  5 ++++-
 4 files changed, 17 insertions(+), 27 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 1073259902a6..8d6f934c3d95 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -298,14 +298,12 @@ static int journal_finish_inode_data_buffers(journal_t *journal,

 static __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh)
 {
-       struct page *page = bh->b_page;
        char *addr;
        __u32 checksum;

-       addr = kmap_atomic(page);
-       checksum = crc32_be(crc32_sum,
-               (void *)(addr + offset_in_page(bh->b_data)), bh->b_size);
-       kunmap_atomic(addr);
+       addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
+       checksum = crc32_be(crc32_sum, addr, bh->b_size);
+       kunmap_local(addr);

        return checksum;
 }
@@ -322,7 +320,6 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
                                    struct buffer_head *bh, __u32 sequence)
 {
        journal_block_tag3_t *tag3 = (journal_block_tag3_t *)tag;
-       struct page *page = bh->b_page;
        __u8 *addr;
        __u32 csum32;
        __be32 seq;
@@ -331,11 +328,10 @@ static void jbd2_block_tag_csum_set(journal_t *j, journal_block_tag_t *tag,
                return;

        seq = cpu_to_be32(sequence);
-       addr = kmap_atomic(page);
+       addr = kmap_local_folio(bh->b_folio, bh_offset(bh));
        csum32 = jbd2_chksum(j, j->j_csum_seed, (__u8 *)&seq, sizeof(seq));
-       csum32 = jbd2_chksum(j, csum32, addr + offset_in_page(bh->b_data),
-                            bh->b_size);
-       kunmap_atomic(addr);
+       csum32 = jbd2_chksum(j, csum32, addr, bh->b_size);
+       kunmap_local(addr);

        if (jbd2_has_feature_csum3(j))
                tag3->t_checksum = cpu_to_be32(csum32);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 4d1fda1f7143..5f08b5fd105a 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -935,19 +935,15 @@ static void warn_dirty_buffer(struct buffer_head *bh)
 /* Call t_frozen trigger and copy buffer data into jh->b_frozen_data. */
 static void jbd2_freeze_jh_data(struct journal_head *jh)
 {
-       struct page *page;
-       int offset;
        char *source;
        struct buffer_head *bh = jh2bh(jh);

        J_EXPECT_JH(jh, buffer_uptodate(bh), "Possible IO failure.\n");
-       page = bh->b_page;
-       offset = offset_in_page(bh->b_data);
-       source = kmap_atomic(page);
+       source = kmap_local_folio(bh->b_folio, bh_offset(bh));
        /* Fire data frozen trigger just before we copy the data */
-       jbd2_buffer_frozen_trigger(jh, source + offset, jh->b_triggers);
-       memcpy(jh->b_frozen_data, source + offset, bh->b_size);
-       kunmap_atomic(source);
+       jbd2_buffer_frozen_trigger(jh, source, jh->b_triggers);
+       memcpy(jh->b_frozen_data, source, bh->b_size);
+       kunmap_local(source);

        /*
         * Now that the frozen data is saved off, we need to store any matching
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 015bfe4e4524..541ee1c5d2b3 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -4205,8 +4205,6 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
                /* copy all the real blocks into log area.  dirty log blocks */
                if (buffer_journaled(cn->bh)) {
                        struct buffer_head *tmp_bh;
-                       char *addr;
-                       struct page *page;
                        tmp_bh =
                            journal_getblk(sb,
                                           SB_ONDISK_JOURNAL_1st_BLOCK(sb) +
@@ -4214,12 +4212,9 @@ static int do_journal_end(struct reiserfs_transaction_handle *th, int flags)
                                             jindex) %
                                            SB_ONDISK_JOURNAL_SIZE(sb)));
                        set_buffer_uptodate(tmp_bh);
-                       page = cn->bh->b_page;
-                       addr = kmap(page);
-                       memcpy(tmp_bh->b_data,
-                              addr + offset_in_page(cn->bh->b_data),
-                              cn->bh->b_size);
-                       kunmap(page);
+                       memcpy_from_folio(tmp_bh->b_data, cn->bh->b_folio,
+                                       bh_offset(cn->bh), cn->bh->b_size);
+
                        mark_buffer_dirty(tmp_bh);
                        jindex++;
                        set_buffer_journal_dirty(cn->bh);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 4ede47649a81..b61fa79cb7f5 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -171,7 +171,10 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
        return test_bit_acquire(BH_Uptodate, &bh->b_state);
 }

-#define bh_offset(bh)          ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+static inline unsigned long bh_offset(const struct buffer_head *bh)
+{
+       return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
+}

 /* If we *know* page->private refers to buffer_heads */
 #define page_buffers(page)                                     \
--
2.30.2
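
The commit message above argues that masking b_data with the base page
size gives the wrong offset once the backing allocation is larger than one
page. The arithmetic can be checked with a small userspace sketch. The
address, the 4k base page and the 16k allocation size are made-up
assumptions, and the function names are hypothetical; this only models the
two mask expressions from the buffer_head.h hunk.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed 4k base page */

/* Old macro behaviour: offset of b_data within its 4k base page. */
unsigned long offset_in_base_page(uintptr_t b_data)
{
	return b_data & (MODEL_PAGE_SIZE - 1);
}

/*
 * Patched behaviour: offset within the whole (possibly high-order)
 * allocation. alloc_size must be a power of two.
 */
unsigned long offset_in_alloc(uintptr_t b_data, unsigned long alloc_size)
{
	assert((alloc_size & (alloc_size - 1)) == 0);
	return b_data & (alloc_size - 1);
}

/* Worked example with a made-up, 16k-aligned allocation at 0x100000. */
void model_check(void)
{
	uintptr_t base   = 0x100000;		/* start of a 16k allocation */
	uintptr_t b_data = base + 5 * 1024;	/* 1k buffer, 5k into it */

	/* Masking with the base page size loses the upper 4k of offset. */
	assert(offset_in_base_page(b_data) == 1024);

	/* Masking with the allocation size recovers the real offset. */
	assert(offset_in_alloc(b_data, 16384) == 5120);

	/* For a plain 4k allocation both expressions agree. */
	assert(offset_in_alloc(b_data, MODEL_PAGE_SIZE) ==
	       offset_in_base_page(b_data));
}
```

In this model, adding the old result (1024) to the start of the allocation
would land on a different 1k block than the one b_data points at, which is
consistent with the failures showing up only with 1k blocksize.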


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-07 13:35                 ` Ritesh Harjani
@ 2023-09-07 14:15                   ` Matthew Wilcox
  2023-09-07 14:59                     ` Ritesh Harjani
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2023-09-07 14:15 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

On Thu, Sep 07, 2023 at 07:05:38PM +0530, Ritesh Harjani wrote:
> Thanks, Matthew, for proposing the final changes using folios.
> (Only some minor changes were required in fs/reiserfs/ for unused variables.)
> I am pasting the final patch below (with you as the author, plus my
> Signed-off-by & Tested-by), which I have tested on my system with
> "ext4/1k -g auto".

I'd rather split that patch up a bit -- I don't think the reiserfs
part fixes any actual problem.  I've pushed out
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/bh-fixes

or git clone git://git.infradead.org/users/willy/pagecache.git

I credited you as the author on the second two since I just tidied up
your proposed fixes.

I've also checked ocfs2 as the other user of JBD2 and I don't see any
problems there.


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-07 14:15                   ` Matthew Wilcox
@ 2023-09-07 14:59                     ` Ritesh Harjani
  2023-09-10  9:26                       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 14+ messages in thread
From: Ritesh Harjani @ 2023-09-07 14:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

Matthew Wilcox <willy@infradead.org> writes:

> On Thu, Sep 07, 2023 at 07:05:38PM +0530, Ritesh Harjani wrote:
>> Thanks, Matthew, for proposing the final changes using folios.
>> (Only some minor changes were required in fs/reiserfs/ for unused variables.)
>> I am pasting the final patch below (with you as the author, plus my
>> Signed-off-by & Tested-by), which I have tested on my system with
>> "ext4/1k -g auto".
>
> I'd rather split that patch up a bit -- I don't think the reiserfs
> part fixes any actual problem.  I've pushed out
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/bh-fixes
>
> or git clone git://git.infradead.org/users/willy/pagecache.git
>
> I credited you as the author on the second two since I just tidied up
> your proposed fixes.
>
> I've also checked ocfs2 as the other user of JBD2 and I don't see any
> problems there.

Thanks Matthew! :) 

-ritesh 


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-07 14:59                     ` Ritesh Harjani
@ 2023-09-10  9:26                       ` Linux regression tracking (Thorsten Leemhuis)
  2023-09-11  3:43                         ` Theodore Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-09-10  9:26 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Matthew Wilcox
  Cc: Theodore Ts'o, Zorro Lang, linux-ext4, fstests, regressions,
	Andrew Morton, Jan Kara

[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 07.09.23 16:59, Ritesh Harjani (IBM) wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
>> On Thu, Sep 07, 2023 at 07:05:38PM +0530, Ritesh Harjani wrote:
>>> Thanks, Matthew, for proposing the final changes using folios.
>>> (Only some minor changes were required in fs/reiserfs/ for unused variables.)
>>> I am pasting the final patch below (with you as the author, plus my
>>> Signed-off-by & Tested-by), which I have tested on my system with
>>> "ext4/1k -g auto".
>>
>> I'd rather split that patch up a bit -- I don't think the reiserfs
>> part fixes any actual problem.  I've pushed out
>> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/bh-fixes
>>
>> or git clone git://git.infradead.org/users/willy/pagecache.git
>>
>> I credited you as the author on the second two since I just tidied up
>> your proposed fixes.
>>
>> I've also checked ocfs2 as the other user of JBD2 and I don't see any
>> problems there.
> 
> Thanks Matthew! :) 

#regzbot fix: jbd2: Remove page size assumptions
#regzbot ignore-activity

(fix can currently be found in
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/bh-fixes
as
https://git.infradead.org/users/willy/pagecache.git/commit/fc0a6fa4a2c7b434665f087801a06c544b16f085
)

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


* Re: [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails
  2023-09-10  9:26                       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-09-11  3:43                         ` Theodore Ts'o
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2023-09-11  3:43 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Ritesh Harjani (IBM), Matthew Wilcox, Zorro Lang, linux-ext4,
	fstests, Andrew Morton, Jan Kara

On Sun, Sep 10, 2023 at 11:26:00AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> 
> #regzbot fix: jbd2: Remove page size assumptions
> #regzbot ignore-activity
> 
> (fix can currently be found in
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/bh-fixes
> as
> https://git.infradead.org/users/willy/pagecache.git/commit/fc0a6fa4a2c7b434665f087801a06c544b16f085
> )

Per our discussion at last week's ext4 video chat, I've cherry-picked
the two fixes on the ext4 dev tree:

147d4a092e9a - jbd2: Remove page size assumptions (3 days ago)
f94cf2206b06 - buffer: Make bh_offset() work for compound pages (3 days ago)

I didn't take the reiserfs change, since this is for the ext4 git
tree, and as near as I can tell, it's more of a code cleanup rather
than an immediate fix.

Cheers,

						- Ted


Thread overview: 14+ messages
2023-09-03 12:00 [fstests generic/388, 455, 475, 482 ...] Ext4 journal recovery test fails Zorro Lang
2023-09-03 20:40 ` Theodore Ts'o
2023-09-04  6:08   ` Theodore Ts'o
2023-09-05 22:11     ` Matthew Wilcox
2023-09-06 11:03       ` Ritesh Harjani
2023-09-06 12:38         ` Matthew Wilcox
2023-09-06 19:51           ` Matthew Wilcox
2023-09-07  2:56             ` Ritesh Harjani
2023-09-07  3:47               ` Matthew Wilcox
2023-09-07 13:35                 ` Ritesh Harjani
2023-09-07 14:15                   ` Matthew Wilcox
2023-09-07 14:59                     ` Ritesh Harjani
2023-09-10  9:26                       ` Linux regression tracking (Thorsten Leemhuis)
2023-09-11  3:43                         ` Theodore Ts'o
