Re: [syzbot] [iomap?] kernel BUG in folio_end

linux-xfs.vger.kernel.org archive mirror

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
       [not found] <68cc0578.050a0220.28a605.0006.GAE@google.com>
@ 2025-11-01  2:11 ` syzbot
  2025-11-03 16:58   ` Joanne Koong
  2025-11-02  5:39 ` syzbot
  1 sibling, 1 reply; 15+ messages in thread
From: syzbot @ 2025-11-01  2:11 UTC (permalink / raw)
  To: brauner, chao, djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel,
	linux-kernel, linux-xfs, syzkaller-bugs

syzbot has found a reproducer for the following issue on:

HEAD commit:    98bd8b16ae57 Add linux-next specific files for 20251031
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=163b2bcd980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=63d09725c93bcc1c
dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=176fc342580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10403f34580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/975261746f29/disk-98bd8b16.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/ad565c6cf272/vmlinux-98bd8b16.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1816a55a8d5f/bzImage-98bd8b16.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/d6d9eee31fdb/mount_0.gz
  fsck result: failed (log: https://syzkaller.appspot.com/x/fsck.log?x=17803f34580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3686758660f980b402dc@syzkaller.appspotmail.com

 vms_complete_munmap_vmas+0x206/0x8a0 mm/vma.c:1279
 do_vmi_align_munmap+0x364/0x440 mm/vma.c:1538
 do_vmi_munmap+0x253/0x2e0 mm/vma.c:1586
 __vm_munmap+0x207/0x380 mm/vma.c:3196
 __do_sys_munmap mm/mmap.c:1077 [inline]
 __se_sys_munmap mm/mmap.c:1074 [inline]
 __x64_sys_munmap+0x60/0x70 mm/mmap.c:1074
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
------------[ cut here ]------------
kernel BUG at mm/filemap.c:1530!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 5989 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
RIP: 0010:folio_end_read+0x1e9/0x230 mm/filemap.c:1530
Code: 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 9f df 2e ff 90 0f 0b e8 d7 79 c7 ff 48 89 df 48 c7 c6 40 63 74 8b e8 88 df 2e ff 90 <0f> 0b e8 c0 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 71 df 2e ff
RSP: 0018:ffffc90003f8e268 EFLAGS: 00010246
RAX: c6904ff3387db700 RBX: ffffea0001b5ef00 RCX: 0000000000000000
RDX: 0000000000000007 RSI: ffffffff8d780a1b RDI: 00000000ffffffff
RBP: 0000000000000000 R08: ffffffff8f7d7477 R09: 1ffffffff1efae8e
R10: dffffc0000000000 R11: fffffbfff1efae8f R12: 1ffffd400036bde1
R13: 1ffffd400036bde0 R14: ffffea0001b5ef08 R15: 00fff20000004060
FS:  0000555572333500(0000) GS:ffff888125fe2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f57d6844000 CR3: 0000000075586000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 iomap_readahead+0x96a/0xbc0 fs/iomap/buffered-io.c:547
 iomap_bio_readahead include/linux/iomap.h:608 [inline]
 erofs_readahead+0x1c3/0x3c0 fs/erofs/data.c:383
 read_pages+0x17a/0x580 mm/readahead.c:163
 page_cache_ra_order+0x924/0xe70 mm/readahead.c:518
 filemap_readahead mm/filemap.c:2658 [inline]
 filemap_get_pages+0x7ff/0x1df0 mm/filemap.c:2704
 filemap_read+0x3f6/0x11a0 mm/filemap.c:2800
 __kernel_read+0x4cf/0x960 fs/read_write.c:530
 integrity_kernel_read+0x89/0xd0 security/integrity/iint.c:28
 ima_calc_file_hash_tfm security/integrity/ima/ima_crypto.c:480 [inline]
 ima_calc_file_shash security/integrity/ima/ima_crypto.c:511 [inline]
 ima_calc_file_hash+0x85e/0x16f0 security/integrity/ima/ima_crypto.c:568
 ima_collect_measurement+0x428/0x8f0 security/integrity/ima/ima_api.c:293
 process_measurement+0x1121/0x1a40 security/integrity/ima/ima_main.c:405
 ima_file_check+0xd7/0x120 security/integrity/ima/ima_main.c:656
 security_file_post_open+0xbb/0x290 security/security.c:2652
 do_open fs/namei.c:3977 [inline]
 path_openat+0x2f26/0x3830 fs/namei.c:4134
 do_filp_open+0x1fa/0x410 fs/namei.c:4161
 do_sys_openat2+0x121/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1463
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0b08d8efc9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffec6a5d268 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 00007f0b08fe5fa0 RCX: 00007f0b08d8efc9
RDX: 0000000000121140 RSI: 0000200000000000 RDI: ffffffffffffff9c
RBP: 00007f0b08e11f91 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000013d R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0b08fe5fa0 R14: 00007f0b08fe5fa0 R15: 0000000000000004
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:folio_end_read+0x1e9/0x230 mm/filemap.c:1530
Code: 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 9f df 2e ff 90 0f 0b e8 d7 79 c7 ff 48 89 df 48 c7 c6 40 63 74 8b e8 88 df 2e ff 90 <0f> 0b e8 c0 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 71 df 2e ff
RSP: 0018:ffffc90003f8e268 EFLAGS: 00010246
RAX: c6904ff3387db700 RBX: ffffea0001b5ef00 RCX: 0000000000000000
RDX: 0000000000000007 RSI: ffffffff8d780a1b RDI: 00000000ffffffff
RBP: 0000000000000000 R08: ffffffff8f7d7477 R09: 1ffffffff1efae8e
R10: dffffc0000000000 R11: fffffbfff1efae8f R12: 1ffffd400036bde1
R13: 1ffffd400036bde0 R14: ffffea0001b5ef08 R15: 00fff20000004060
FS:  0000555572333500(0000) GS:ffff888125ee2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30063fff CR3: 0000000075586000 CR4: 00000000003526f0


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-01  2:11 ` [syzbot] [iomap?] kernel BUG in folio_end_read (2) syzbot
@ 2025-11-03 16:58   ` Joanne Koong
  2025-11-04  2:43     ` syzbot
  0 siblings, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-03 16:58 UTC (permalink / raw)
  To: syzbot
  Cc: brauner, chao, djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel,
	linux-kernel, linux-xfs, syzkaller-bugs

On Sat, Nov 1, 2025 at 1:26 PM syzbot
<syzbot+3686758660f980b402dc@syzkaller.appspotmail.com> wrote:
>
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit:    98bd8b16ae57 Add linux-next specific files for 20251031
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=163b2bcd980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=63d09725c93bcc1c
> dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=176fc342580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10403f34580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/975261746f29/disk-98bd8b16.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/ad565c6cf272/vmlinux-98bd8b16.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1816a55a8d5f/bzImage-98bd8b16.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/d6d9eee31fdb/mount_0.gz
>   fsck result: failed (log: https://syzkaller.appspot.com/x/fsck.log?x=17803f34580000)
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3686758660f980b402dc@syzkaller.appspotmail.com
>
>  vms_complete_munmap_vmas+0x206/0x8a0 mm/vma.c:1279
>  do_vmi_align_munmap+0x364/0x440 mm/vma.c:1538
>  do_vmi_munmap+0x253/0x2e0 mm/vma.c:1586
>  __vm_munmap+0x207/0x380 mm/vma.c:3196
>  __do_sys_munmap mm/mmap.c:1077 [inline]
>  __se_sys_munmap mm/mmap.c:1074 [inline]
>  __x64_sys_munmap+0x60/0x70 mm/mmap.c:1074
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> ------------[ cut here ]------------
> kernel BUG at mm/filemap.c:1530!

I think this is the same bug that was fixed by [1].

[1] https://lore.kernel.org/linux-fsdevel/20251031211309.1774819-2-joannelkoong@gmail.com/

> Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
> CPU: 1 UID: 0 PID: 5989 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
> RIP: 0010:folio_end_read+0x1e9/0x230 mm/filemap.c:1530
> Code: 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 9f df 2e ff 90 0f 0b e8 d7 79 c7 ff 48 89 df 48 c7 c6 40 63 74 8b e8 88 df 2e ff 90 <0f> 0b e8 c0 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 71 df 2e ff
> RSP: 0018:ffffc90003f8e268 EFLAGS: 00010246
> RAX: c6904ff3387db700 RBX: ffffea0001b5ef00 RCX: 0000000000000000
> RDX: 0000000000000007 RSI: ffffffff8d780a1b RDI: 00000000ffffffff
> RBP: 0000000000000000 R08: ffffffff8f7d7477 R09: 1ffffffff1efae8e
> R10: dffffc0000000000 R11: fffffbfff1efae8f R12: 1ffffd400036bde1
> R13: 1ffffd400036bde0 R14: ffffea0001b5ef08 R15: 00fff20000004060
> FS:  0000555572333500(0000) GS:ffff888125fe2000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f57d6844000 CR3: 0000000075586000 CR4: 00000000003526f0
> Call Trace:
>  <TASK>
>  iomap_readahead+0x96a/0xbc0 fs/iomap/buffered-io.c:547
>  iomap_bio_readahead include/linux/iomap.h:608 [inline]
>  erofs_readahead+0x1c3/0x3c0 fs/erofs/data.c:383
>  read_pages+0x17a/0x580 mm/readahead.c:163
>  page_cache_ra_order+0x924/0xe70 mm/readahead.c:518
>  filemap_readahead mm/filemap.c:2658 [inline]
>  filemap_get_pages+0x7ff/0x1df0 mm/filemap.c:2704
>  filemap_read+0x3f6/0x11a0 mm/filemap.c:2800
>  __kernel_read+0x4cf/0x960 fs/read_write.c:530
>  integrity_kernel_read+0x89/0xd0 security/integrity/iint.c:28
>  ima_calc_file_hash_tfm security/integrity/ima/ima_crypto.c:480 [inline]
>  ima_calc_file_shash security/integrity/ima/ima_crypto.c:511 [inline]
>  ima_calc_file_hash+0x85e/0x16f0 security/integrity/ima/ima_crypto.c:568
>  ima_collect_measurement+0x428/0x8f0 security/integrity/ima/ima_api.c:293
>  process_measurement+0x1121/0x1a40 security/integrity/ima/ima_main.c:405
>  ima_file_check+0xd7/0x120 security/integrity/ima/ima_main.c:656
>  security_file_post_open+0xbb/0x290 security/security.c:2652
>  do_open fs/namei.c:3977 [inline]
>  path_openat+0x2f26/0x3830 fs/namei.c:4134
>  do_filp_open+0x1fa/0x410 fs/namei.c:4161
>  do_sys_openat2+0x121/0x1c0 fs/open.c:1437
>  do_sys_open fs/open.c:1452 [inline]
>  __do_sys_openat fs/open.c:1468 [inline]
>  __se_sys_openat fs/open.c:1463 [inline]
>  __x64_sys_openat+0x138/0x170 fs/open.c:1463
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f0b08d8efc9
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007ffec6a5d268 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
> RAX: ffffffffffffffda RBX: 00007f0b08fe5fa0 RCX: 00007f0b08d8efc9
> RDX: 0000000000121140 RSI: 0000200000000000 RDI: ffffffffffffff9c
> RBP: 00007f0b08e11f91 R08: 0000000000000000 R09: 0000000000000000
> R10: 000000000000013d R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f0b08fe5fa0 R14: 00007f0b08fe5fa0 R15: 0000000000000004
>  </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:folio_end_read+0x1e9/0x230 mm/filemap.c:1530
> Code: 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 9f df 2e ff 90 0f 0b e8 d7 79 c7 ff 48 89 df 48 c7 c6 40 63 74 8b e8 88 df 2e ff 90 <0f> 0b e8 c0 79 c7 ff 48 89 df 48 c7 c6 20 6d 74 8b e8 71 df 2e ff
> RSP: 0018:ffffc90003f8e268 EFLAGS: 00010246
> RAX: c6904ff3387db700 RBX: ffffea0001b5ef00 RCX: 0000000000000000
> RDX: 0000000000000007 RSI: ffffffff8d780a1b RDI: 00000000ffffffff
> RBP: 0000000000000000 R08: ffffffff8f7d7477 R09: 1ffffffff1efae8e
> R10: dffffc0000000000 R11: fffffbfff1efae8f R12: 1ffffd400036bde1
> R13: 1ffffd400036bde0 R14: ffffea0001b5ef08 R15: 00fff20000004060
> FS:  0000555572333500(0000) GS:ffff888125ee2000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b30063fff CR3: 0000000075586000 CR4: 00000000003526f0
>
>
> ---
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
master


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-03 16:58   ` Joanne Koong
@ 2025-11-04  2:43     ` syzbot
  2025-11-04 17:45       ` Joanne Koong
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2025-11-04  2:43 UTC (permalink / raw)
  To: brauner, chao, djwong, jaegeuk, joannelkoong, linux-f2fs-devel,
	linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in get_data

loop0: detected capacity change from 0 to 16
------------[ cut here ]------------
WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278, CPU#1: syz.0.585/7652
Modules linked in:
CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
RIP: 0010:get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278
Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 copy_data kernel/printk/printk_ringbuffer.c:1857 [inline]
 prb_read kernel/printk/printk_ringbuffer.c:1966 [inline]
 _prb_read_valid+0x672/0xa90 kernel/printk/printk_ringbuffer.c:2143
 prb_read_valid+0x3c/0x60 kernel/printk/printk_ringbuffer.c:2215
 printk_get_next_message+0x15c/0x7b0 kernel/printk/printk.c:2978
 console_emit_next_record kernel/printk/printk.c:3062 [inline]
 console_flush_one_record kernel/printk/printk.c:3194 [inline]
 console_flush_all+0x4cc/0xb10 kernel/printk/printk.c:3268
 __console_flush_and_unlock kernel/printk/printk.c:3298 [inline]
 console_unlock+0xbb/0x190 kernel/printk/printk.c:3338
 vprintk_emit+0x4c5/0x590 kernel/printk/printk.c:2423
 _printk+0xcf/0x120 kernel/printk/printk.c:2448
 _erofs_printk+0x349/0x410 fs/erofs/super.c:33
 erofs_fc_fill_super+0x1591/0x1b20 fs/erofs/super.c:746
 get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1692
 vfs_get_tree+0x92/0x2b0 fs/super.c:1752
 fc_mount fs/namespace.c:1198 [inline]
 do_new_mount_fc fs/namespace.c:3641 [inline]
 do_new_mount+0x302/0xa10 fs/namespace.c:3717
 do_mount fs/namespace.c:4040 [inline]
 __do_sys_mount fs/namespace.c:4228 [inline]
 __se_sys_mount+0x313/0x410 fs/namespace.c:4205
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f44ea99076a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
 </TASK>


Tested on:

commit:         98231209 Add linux-next specific files for 20251103
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1370a292580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=43cc0e31558cb527
dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8

Note: no patches were applied.


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-04  2:43     ` syzbot
@ 2025-11-04 17:45       ` Joanne Koong
  2025-11-04 18:25         ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-04 17:45 UTC (permalink / raw)
  To: syzbot, pmladek@suse.com, amurray@thegoodpenguin.co.uk
  Cc: brauner, chao, djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel,
	linux-kernel, linux-xfs, syzkaller-bugs

On Mon, Nov 3, 2025 at 6:43 PM syzbot
<syzbot+3686758660f980b402dc@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> WARNING in get_data
>
> loop0: detected capacity change from 0 to 16
> ------------[ cut here ]------------
> WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278, CPU#1: syz.0.585/7652
> Modules linked in:
> CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
> RIP: 0010:get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278
> Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
> RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
> RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
> RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
> RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
> R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
> R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
> FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
> Call Trace:
>  <TASK>
>  copy_data kernel/printk/printk_ringbuffer.c:1857 [inline]
>  prb_read kernel/printk/printk_ringbuffer.c:1966 [inline]
>  _prb_read_valid+0x672/0xa90 kernel/printk/printk_ringbuffer.c:2143
>  prb_read_valid+0x3c/0x60 kernel/printk/printk_ringbuffer.c:2215
>  printk_get_next_message+0x15c/0x7b0 kernel/printk/printk.c:2978
>  console_emit_next_record kernel/printk/printk.c:3062 [inline]
>  console_flush_one_record kernel/printk/printk.c:3194 [inline]
>  console_flush_all+0x4cc/0xb10 kernel/printk/printk.c:3268
>  __console_flush_and_unlock kernel/printk/printk.c:3298 [inline]
>  console_unlock+0xbb/0x190 kernel/printk/printk.c:3338
>  vprintk_emit+0x4c5/0x590 kernel/printk/printk.c:2423
>  _printk+0xcf/0x120 kernel/printk/printk.c:2448
>  _erofs_printk+0x349/0x410 fs/erofs/super.c:33
>  erofs_fc_fill_super+0x1591/0x1b20 fs/erofs/super.c:746
>  get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1692
>  vfs_get_tree+0x92/0x2b0 fs/super.c:1752
>  fc_mount fs/namespace.c:1198 [inline]
>  do_new_mount_fc fs/namespace.c:3641 [inline]
>  do_new_mount+0x302/0xa10 fs/namespace.c:3717
>  do_mount fs/namespace.c:4040 [inline]
>  __do_sys_mount fs/namespace.c:4228 [inline]
>  __se_sys_mount+0x313/0x410 fs/namespace.c:4205
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f44ea99076a
> Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
> RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
> RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
> R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
>  </TASK>
>

This looks unrelated to the iomap changes and seems tied to the recent
printk console flushing changes. Hmm, maybe one of these changes
[1,2,3]?

ccing Andrew and Petr, who would know more

[1] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-1-00f1f0ac055a@thegoodpenguin.co.uk/
[2] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-2-00f1f0ac055a@thegoodpenguin.co.uk/
[3] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-3-00f1f0ac055a@thegoodpenguin.co.uk/

Thanks,
Joanne

>
> Tested on:
>
> commit:         98231209 Add linux-next specific files for 20251103
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1370a292580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=43cc0e31558cb527
> dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>
> Note: no patches were applied.


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-04 17:45       ` Joanne Koong
@ 2025-11-04 18:25         ` Petr Mladek
  2025-11-05 14:54           ` John Ogness
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2025-11-04 18:25 UTC (permalink / raw)
  To: Joanne Koong
  Cc: syzbot, amurray@thegoodpenguin.co.uk, brauner, chao, djwong,
	jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel, linux-xfs,
	syzkaller-bugs, John Ogness

Adding John into Cc.

On Tue 2025-11-04 09:45:27, Joanne Koong wrote:
> On Mon, Nov 3, 2025 at 6:43 PM syzbot
> <syzbot+3686758660f980b402dc@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > WARNING in get_data
> >
> > loop0: detected capacity change from 0 to 16
> > ------------[ cut here ]------------
> > WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278, CPU#1: syz.0.585/7652

It seems to trigger the "Illegal block description" warning, see:

   1263         /* Regular data block: @begin less than @next and in same wrap. */
   1264         if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
   1265             blk_lpos->begin < blk_lpos->next) {
   1266                 db = to_block(data_ring, blk_lpos->begin);
   1267                 *data_size = blk_lpos->next - blk_lpos->begin;
   1268 
   1269         /* Wrapping data block: @begin is one wrap behind @next. */
   1270         } else if (!is_blk_wrapped(data_ring,
   1271                                    blk_lpos->begin + DATA_SIZE(data_ring),
   1272                                    blk_lpos->next)) {
   1273                 db = to_block(data_ring, 0);
   1274                 *data_size = DATA_INDEX(data_ring, blk_lpos->next);
   1275 
   1276         /* Illegal block description. */
   1277         } else {
   1278                 WARN_ON_ONCE(1);		<-----------
   1279                 return NULL;
   1280         }
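
The three branches above can be tried out in user space with a toy
model. This is only a sketch under stated assumptions: the 16-byte ring,
every toy_* name, and in particular toy_is_blk_wrapped() are
hypothetical stand-ins (the real is_blk_wrapped() may well behave
differently). It merely mirrors the shape of the excerpt: a regular
block, a block whose data starts at index 0 after a wrap, and everything
else being rejected.

```c
#include <stdbool.h>

/* Toy ring of 16 bytes; the size is an assumption made for illustration. */
#define TOY_SIZE_BITS   4UL
#define TOY_DATA_SIZE   (1UL << TOY_SIZE_BITS)
#define TOY_INDEX(lpos) ((lpos) & (TOY_DATA_SIZE - 1))	/* offset inside the ring */
#define TOY_WRAPS(lpos) ((lpos) >> TOY_SIZE_BITS)	/* number of wraps of lpos */

enum toy_blk_kind { TOY_BLK_REGULAR, TOY_BLK_WRAPPING, TOY_BLK_ILLEGAL };

/*
 * Hypothetical stand-in for is_blk_wrapped(): it only compares the wrap
 * counts of the two logical positions. This is NOT claimed to match the
 * kernel helper.
 */
static bool toy_is_blk_wrapped(unsigned long begin, unsigned long next)
{
	return TOY_WRAPS(begin) != TOY_WRAPS(next);
}

/* Mirrors the if / else if / else structure of the excerpt above. */
static enum toy_blk_kind toy_classify(unsigned long begin, unsigned long next,
				      unsigned long *data_size)
{
	/* Regular data block: begin less than next and in the same wrap. */
	if (!toy_is_blk_wrapped(begin, next) && begin < next) {
		*data_size = next - begin;
		return TOY_BLK_REGULAR;
	}

	/* Wrapping data block: begin is one wrap behind next. */
	if (!toy_is_blk_wrapped(begin + TOY_DATA_SIZE, next)) {
		*data_size = TOY_INDEX(next);
		return TOY_BLK_WRAPPING;
	}

	/* Illegal block description: where the real code hits the WARN. */
	*data_size = 0;
	return TOY_BLK_ILLEGAL;
}
```

In this toy model, begin=2/next=10 is a regular block of 8 bytes,
begin=12/next=20 is a wrapping block with 4 bytes at the start of the
ring, and begin=2/next=40 (two wraps ahead) ends up in the illegal
branch.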


> > Modules linked in:
> > CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
> > RIP: 0010:get_data+0x48a/0x840 kernel/printk/printk_ringbuffer.c:1278
> > Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
> > RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
> > RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
> > RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
> > RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
> > R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
> > R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
> > FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
> > Call Trace:
> >  <TASK>
> >  copy_data kernel/printk/printk_ringbuffer.c:1857 [inline]
> >  prb_read kernel/printk/printk_ringbuffer.c:1966 [inline]
> >  _prb_read_valid+0x672/0xa90 kernel/printk/printk_ringbuffer.c:2143
> >  prb_read_valid+0x3c/0x60 kernel/printk/printk_ringbuffer.c:2215
> >  printk_get_next_message+0x15c/0x7b0 kernel/printk/printk.c:2978
> >  console_emit_next_record kernel/printk/printk.c:3062 [inline]
> >  console_flush_one_record kernel/printk/printk.c:3194 [inline]
> >  console_flush_all+0x4cc/0xb10 kernel/printk/printk.c:3268
> >  __console_flush_and_unlock kernel/printk/printk.c:3298 [inline]
> >  console_unlock+0xbb/0x190 kernel/printk/printk.c:3338
> >  vprintk_emit+0x4c5/0x590 kernel/printk/printk.c:2423
> >  _printk+0xcf/0x120 kernel/printk/printk.c:2448
> >  _erofs_printk+0x349/0x410 fs/erofs/super.c:33
> >  erofs_fc_fill_super+0x1591/0x1b20 fs/erofs/super.c:746
> >  get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1692
> >  vfs_get_tree+0x92/0x2b0 fs/super.c:1752
> >  fc_mount fs/namespace.c:1198 [inline]
> >  do_new_mount_fc fs/namespace.c:3641 [inline]
> >  do_new_mount+0x302/0xa10 fs/namespace.c:3717
> >  do_mount fs/namespace.c:4040 [inline]
> >  __do_sys_mount fs/namespace.c:4228 [inline]
> >  __se_sys_mount+0x313/0x410 fs/namespace.c:4205
> >  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> >  do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
> >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > RIP: 0033:0x7f44ea99076a
> > Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> > RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
> > RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
> > RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
> > R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
> >  </TASK>
> >
> 
> This looks unrelated to the iomap changes and seems tied to the recent
> printk console flushing changes. Hmm, maybe one of these changes
> [1,2,3]?
>
> [1] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-1-00f1f0ac055a@thegoodpenguin.co.uk/
> [2] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-2-00f1f0ac055a@thegoodpenguin.co.uk/
> [3] https://lore.kernel.org/all/20251020-printk_legacy_thread_console_lock-v3-3-00f1f0ac055a@thegoodpenguin.co.uk/

These patches only modified the callers of the printk_ringbuffer API.
I doubt that they cause the problem.

It rather looks like an internal bug in the printk_ringbuffer code.
And there is only one recent patch:

   https://patch.msgid.link/20250905144152.9137-2-d-tatianin@yandex-team.ru

The scenario leading to the WARN() is not obvious to me. But the patch
touched this code path. So it is a likely culprit. I have to think
more about it.

Anyway, I wonder if the WARNING is reproducible and if it happens even after
reverting the commit 67e1b0052f6bb82be84e3 ("printk_ringbuffer: don't
needlessly wrap data blocks around")

Best Regards,
Petr

> Thanks,
> Joanne
> 
> >
> > Tested on:
> >
> > commit:         98231209 Add linux-next specific files for 20251103
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1370a292580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=43cc0e31558cb527
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
> > compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> >
> > Note: no patches were applied.


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-04 18:25         ` Petr Mladek
@ 2025-11-05 14:54           ` John Ogness
  2025-11-05 16:49             ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: John Ogness @ 2025-11-05 14:54 UTC (permalink / raw)
  To: Petr Mladek, Joanne Koong
  Cc: syzbot, amurray@thegoodpenguin.co.uk, brauner, chao, djwong,
	jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel, linux-xfs,
	syzkaller-bugs

On 2025-11-04, Petr Mladek <pmladek@suse.com> wrote:
> Adding John into Cc.

Thanks.

> It rather looks like an internal bug in the printk_ringbuffer code.
> And there is only one recent patch:
>
>    https://patch.msgid.link/20250905144152.9137-2-d-tatianin@yandex-team.ru
>
> The scenario leading to the WARN() is not obvious to me. But the patch
> touched this code path. So it is a likely culprit. I have to think
> more about it.

I have been digging into this all day and I can find no explanation.

The patch you refer to brings a minor semantic change: is_blk_wrapped()
returns false if begin_lpos and next_lpos are the same, whereas before
it would have returned true. However, these values are not allowed to be
the same (except for the data-less special case values).

> Anyway, I wonder if the WARNING is reproducible and if it happens even after
> reverting the commit 67e1b0052f6bb82be84e3 ("printk_ringbuffer: don't
> needlessly wrap data blocks around")

Note that a quick search on lore shows another similar report:

https://lore.kernel.org/all/69078fb6.050a0220.29fc44.0029.GAE@google.com/

We may want to revert the commit until we can take a closer look at
this.

I will divert my energies from code-reading to trying to reproduce this.

John


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-05 14:54           ` John Ogness
@ 2025-11-05 16:49             ` Petr Mladek
  2025-11-05 19:58               ` John Ogness
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2025-11-05 16:49 UTC (permalink / raw)
  To: John Ogness
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On Wed 2025-11-05 16:00:28, John Ogness wrote:
> On 2025-11-04, Petr Mladek <pmladek@suse.com> wrote:
> > Adding John into Cc.
> 
> Thanks.
> 
> > It rather looks like an internal bug in the printk_ringbuffer code.
> > And there is only one recent patch:
> >
> >    https://patch.msgid.link/20250905144152.9137-2-d-tatianin@yandex-team.ru
> >
> > The scenario leading to the WARN() is not obvious to me. But the patch
> > touched this code path. So it is a likely culprit. I have to think
> > more about it.
> 
> I have been digging into this all day and I can find no explanation.
> 
> The patch you refer to brings a minor semantic change: is_blk_wrapped()
> returns false if begin_lpos and next_lpos are the same, whereas before
> we would have true. However, these values are not allowed to be the same
> (except for the data-less special case values).
> 
> > Anyway, I wonder if the WARNING is reproducible and if it happens even after
> > reverting the commit 67e1b0052f6bb82be84e3 ("printk_ringbuffer: don't
> > needlessly wrap data blocks around")
> 
> Note that a quick search on lore shows another similar report:
> 
> https://lore.kernel.org/all/69078fb6.050a0220.29fc44.0029.GAE@google.com/

Great catch!

There is a common pattern. There is always one dropped message before
the WARNING() triggers.

This is from
https://syzkaller.appspot.com/x/log.txt?x=1653a342580000

[  179.188108][ T7136] ntfs3(loop0): Different NTFS sector size (4096) and media sector size (512).
** 1 printk messages dropped **
[  179.211874][ T7136] ------------[ cut here ]------------
[  179.211911][ T7136] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.359/7136


And this is from
https://syzkaller.appspot.com/x/log.txt?x=1370a292580000

[  216.317316][ T7652] loop0: detected capacity change from 0 to 16
** 1 printk messages dropped **
[  216.327750][ T7652] ------------[ cut here ]------------
[  216.327789][ T7652] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.585/7652


I wonder whether it is related to blk_lpos->begin or blk_lpos->next
overflow. They are supposed to overflow at the end of the 1st wrap,
see kernel/printk/printk_ringbuffer.h:

<paste>
 *   BLK0_LPOS
 *     The initial @head_lpos and @tail_lpos for data rings. It is at index
 *     0 and the lpos value is such that it will overflow on the first wrap.
[...]
*/
#define BLK0_LPOS(sz_bits)	(-(_DATA_SIZE(sz_bits)))
</paste>
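
The overflow described by that comment can be checked with a tiny
user-space model. This is only a sketch: the MODEL_* macros and the
model_*() helpers are hypothetical names, and the index/wrap helpers
assume that the data ring size is a power of two (1 << sz_bits), with
the index taken from the low bits of an lpos and the wrap count from
the remaining high bits.

```c
/* Hypothetical user-space model; not the kernel implementation. */
#define MODEL_SZ_BITS	4UL				/* tiny 16-byte data ring */
#define MODEL_DATA_SIZE	(1UL << MODEL_SZ_BITS)
#define MODEL_BLK0_LPOS	(-(MODEL_DATA_SIZE))		/* same form as BLK0_LPOS() */

/* Assumed: the index into the data array is the low bits of the lpos. */
static unsigned long model_index(unsigned long lpos)
{
	return lpos & (MODEL_DATA_SIZE - 1);
}

/* Assumed: the wrap count is what remains above the index bits. */
static unsigned long model_wraps(unsigned long lpos)
{
	return lpos >> MODEL_SZ_BITS;
}
```

In this model the initial lpos sits at index 0 of the highest possible
wrap, so advancing it by exactly one ring size overflows the unsigned
long to 0.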


Now, the question is why the following check ends up in the WARN():

static const char *get_data(struct prb_data_ring *data_ring,
			    struct prb_data_blk_lpos *blk_lpos,
			    unsigned int *data_size)
{
[...]
	/* Regular data block: @begin less than @next and in same wrap. */
	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
	    blk_lpos->begin < blk_lpos->next) {
		db = to_block(data_ring, blk_lpos->begin);
		*data_size = blk_lpos->next - blk_lpos->begin;

	/* Wrapping data block: @begin is one wrap behind @next. */
	} else if (!is_blk_wrapped(data_ring,
				   blk_lpos->begin + DATA_SIZE(data_ring),
				   blk_lpos->next)) {
		db = to_block(data_ring, 0);
		*data_size = DATA_INDEX(data_ring, blk_lpos->next);

	/* Illegal block description. */
	} else {
		WARN_ON_ONCE(1);
		return NULL;
	}
[...]

The new is_blk_wrapped() check makes sense on its own.

But what happens when blk_lpos->next overflows to "0"?
is_blk_wrapped() returns false because it checks (blk_lpos->next - 1).
But the extra check "blk_lpos->begin < blk_lpos->next" fails because
it checks the overflown "blk_lpos->next".
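
The failing case can be replayed in user space. The following is a
minimal sketch, not the kernel code: blk_wrapped() assumes that
is_blk_wrapped() compares the wrap of @begin with the wrap of
(@next - 1), as described above, and classify(), SZ_BITS, SZ and the
enum are names made up for this illustration. The "fixed" flag makes
the range check compare against (next - 1) instead of next.

```c
#include <stdbool.h>

#define SZ_BITS	4UL			/* hypothetical 16-byte data ring */
#define SZ	(1UL << SZ_BITS)

static unsigned long wraps(unsigned long lpos)
{
	return lpos >> SZ_BITS;
}

/* Assumed form of is_blk_wrapped(): @next itself is not part of the block. */
static bool blk_wrapped(unsigned long begin, unsigned long next)
{
	return wraps(begin) != wraps(next - 1);
}

enum blk_kind { BLK_REGULAR, BLK_WRAPPING, BLK_ILLEGAL };

/* Mirrors the three branches of the get_data() excerpt above. */
static enum blk_kind classify(unsigned long begin, unsigned long next,
			      bool fixed)
{
	unsigned long limit = fixed ? next - 1 : next;

	/* Regular data block: @begin and @next in the same wrap. */
	if (!blk_wrapped(begin, next) && begin < limit)
		return BLK_REGULAR;

	/* Wrapping data block: @begin is one wrap behind @next. */
	if (!blk_wrapped(begin + SZ, next))
		return BLK_WRAPPING;

	/* Illegal block description: this is where the WARN would fire. */
	return BLK_ILLEGAL;
}
```

With begin = -8 and next = 0 (a block ending exactly at the end of the
first wrap), the unmodified comparison falls through to BLK_ILLEGAL,
while comparing against (next - 1) classifies it as a regular block.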

I guess that we should do:

From f9cae42b4a910127fb7694aebe2e46247dbb0fcb Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@suse.com>
Date: Wed, 5 Nov 2025 17:14:57 +0100
Subject: [PATCH] printk_ringbuffer: Fix check of valid data size when blk_lpos
 overflows

The commit 67e1b0052f6bb8 ("printk_ringbuffer: don't needlessly wrap
data blocks around") allows the last 4 bytes of the ring buffer to be used.

But the check for the data_size was not properly updated. It fails
when blk_lpos->next overflows to "0". In this case:

  + is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)
    returns false because it checks "blk_lpos->next - 1"

  + but "blk_lpos->begin < blk_lpos->next" fails because
    blk_lpos->next is already 0.

  + is_blk_wrapped(data_ring, blk_lpos->begin + DATA_SIZE(data_ring),
    blk_lpos->next) returns true because "begin_lpos" is from the
    next wrap but "next_lpos - 1" is from the previous one, so the
    negated check fails as well

As a result, get_data() triggers the WARN_ON_ONCE() for "Illegal
block description", for example:

[  216.317316][ T7652] loop0: detected capacity change from 0 to 16
** 1 printk messages dropped **
[  216.327750][ T7652] ------------[ cut here ]------------
[  216.327789][ T7652] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.585/7652
[  216.327848][ T7652] Modules linked in:
[  216.327907][ T7652] CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
[  216.327933][ T7652] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
[  216.327953][ T7652] RIP: 0010:get_data+0x48a/0x840
[  216.327986][ T7652] Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
[  216.328007][ T7652] RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
[  216.328029][ T7652] RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
[  216.328048][ T7652] RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
[  216.328063][ T7652] RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
[  216.328079][ T7652] R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
[  216.328095][ T7652] R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
[  216.328111][ T7652] FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
[  216.328131][ T7652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  216.328147][ T7652] CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
[  216.328168][ T7652] Call Trace:
[  216.328178][ T7652]  <TASK>
[  216.328199][ T7652]  _prb_read_valid+0x672/0xa90
[  216.328328][ T7652]  ? desc_read+0x1b8/0x3f0
[  216.328381][ T7652]  ? __pfx__prb_read_valid+0x10/0x10
[  216.328422][ T7652]  ? panic_on_this_cpu+0x32/0x40
[  216.328450][ T7652]  prb_read_valid+0x3c/0x60
[  216.328482][ T7652]  printk_get_next_message+0x15c/0x7b0
[  216.328526][ T7652]  ? __pfx_printk_get_next_message+0x10/0x10
[  216.328561][ T7652]  ? __lock_acquire+0xab9/0xd20
[  216.328595][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328621][ T7652]  ? console_flush_all+0x478/0xb10
[  216.328648][ T7652]  console_flush_all+0x4cc/0xb10
[  216.328673][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328704][ T7652]  ? __pfx_console_flush_all+0x10/0x10
[  216.328748][ T7652]  ? is_printk_cpu_sync_owner+0x32/0x40
[  216.328781][ T7652]  console_unlock+0xbb/0x190
[  216.328815][ T7652]  ? __pfx___down_trylock_console_sem+0x10/0x10
[  216.328853][ T7652]  ? __pfx_console_unlock+0x10/0x10
[  216.328899][ T7652]  vprintk_emit+0x4c5/0x590
[  216.328935][ T7652]  ? __pfx_vprintk_emit+0x10/0x10
[  216.328993][ T7652]  _printk+0xcf/0x120
[  216.329028][ T7652]  ? __pfx__printk+0x10/0x10
[  216.329051][ T7652]  ? kernfs_get+0x5a/0x90
[  216.329090][ T7652]  _erofs_printk+0x349/0x410
[  216.329130][ T7652]  ? __pfx__erofs_printk+0x10/0x10
[  216.329161][ T7652]  ? __raw_spin_lock_init+0x45/0x100
[  216.329186][ T7652]  ? __init_swait_queue_head+0xa9/0x150
[  216.329231][ T7652]  erofs_fc_fill_super+0x1591/0x1b20
[  216.329285][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329324][ T7652]  ? sb_set_blocksize+0x104/0x180
[  216.329356][ T7652]  ? setup_bdev_super+0x4c1/0x5b0
[  216.329385][ T7652]  get_tree_bdev_flags+0x40e/0x4d0
[  216.329410][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329444][ T7652]  ? __pfx_get_tree_bdev_flags+0x10/0x10
[  216.329483][ T7652]  vfs_get_tree+0x92/0x2b0
[  216.329512][ T7652]  do_new_mount+0x302/0xa10
[  216.329537][ T7652]  ? apparmor_capable+0x137/0x1b0
[  216.329576][ T7652]  ? __pfx_do_new_mount+0x10/0x10
[  216.329605][ T7652]  ? ns_capable+0x8a/0xf0
[  216.329637][ T7652]  ? kmem_cache_free+0x19b/0x690
[  216.329682][ T7652]  __se_sys_mount+0x313/0x410
[  216.329717][ T7652]  ? __pfx___se_sys_mount+0x10/0x10
[  216.329836][ T7652]  ? do_syscall_64+0xbe/0xfa0
[  216.329869][ T7652]  ? __x64_sys_mount+0x20/0xc0
[  216.329901][ T7652]  do_syscall_64+0xfa/0xfa0
[  216.329932][ T7652]  ? lockdep_hardirqs_on+0x9c/0x150
[  216.329964][ T7652]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.329988][ T7652]  ? clear_bhb_loop+0x60/0xb0
[  216.330017][ T7652]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.330040][ T7652] RIP: 0033:0x7f44ea99076a
[  216.330080][ T7652] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
[  216.330100][ T7652] RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[  216.330128][ T7652] RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
[  216.330146][ T7652] RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
[  216.330164][ T7652] RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
[  216.330181][ T7652] R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
[  216.330196][ T7652] R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
[  216.330233][ T7652]  </TASK>

The check comparing "blk_lpos->next" must subtract 1 as well.

Alternative:

The check can be removed. Instead we might add a check for invalid
*data_size, something like:

	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
		return NULL;

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 839f504db6d3..1272c220c8b4 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1262,7 +1262,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
 
 	/* Regular data block: @begin less than @next and in same wrap. */
 	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
+	    blk_lpos->begin < blk_lpos->next - 1) {
 		db = to_block(data_ring, blk_lpos->begin);
 		*data_size = blk_lpos->next - blk_lpos->begin;
 
-- 
2.51.1

Another question is whether this is the only problem caused by the patch.

> We may want to revert the commit until we can take a closer look at
> this.
> 
> I will divert my energies from code-reading to trying to reproduce this.

It might help to fill messages with a fixed size which might trigger
blk_lpos->next == 0 in the 1st wrap.

I could try this tomorrow. It is getting late here. But I wanted
to send my thoughts ASAP.

Best Regards,
Petr

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-05 16:49             ` Petr Mladek
@ 2025-11-05 19:58               ` John Ogness
  2025-11-06 11:36                 ` John Ogness
  0 siblings, 1 reply; 15+ messages in thread
From: John Ogness @ 2025-11-05 19:58 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On 2025-11-05, Petr Mladek <pmladek@suse.com> wrote:
> I guess that we should do:
>
> From f9cae42b4a910127fb7694aebe2e46247dbb0fcb Mon Sep 17 00:00:00 2001
> From: Petr Mladek <pmladek@suse.com>
> Date: Wed, 5 Nov 2025 17:14:57 +0100
> Subject: [PATCH] printk_ringbuffer: Fix check of valid data size when blk_lpos
>  overflows
>
> The commit 67e1b0052f6bb8 ("printk_ringbuffer: don't needlessly wrap
> data blocks around") allows the last 4 bytes of the ring buffer to be used.
>
> But the check for the data_size was not properly updated. It fails
> when blk_lpos->next overflows to "0". In this case:
>
>   + is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)
>     returns false because it checks "blk_lpos->next - 1"
>
>   + but "blk_lpos->begin < blk_lpos->next" fails because
>     blk_lpos->next is already 0.
>
>   + is_blk_wrapped(data_ring, blk_lpos->begin + DATA_SIZE(data_ring),
>     blk_lpos->next) returns true because "begin_lpos" is from the
>     next wrap but "next_lpos - 1" is from the previous one, so the
>     negated check fails as well
>
> As a result, get_data() triggers the WARN_ON_ONCE() for "Illegal
> block description", for example:

Beautiful catch!

> Another question is whether this is the only problem caused by the patch.

This comparison is quite special. It caught my attention while combing
through the code. Sadly, I missed this fix despite staring at the
problem. I was more concerned about making sure it could handle wraps
correctly without realizing it was an incorrect range check.

Tomorrow I will comb through the code again, this time verifying all the
range checks.

> It might help to fill messages with a fixed size which might trigger
> blk_lpos->next == 0 in the 1st wrap.

I did this and indeed it reproduces the WARN_ON_ONCE() when next==0. And
with your patch applied, the warning is gone.

John

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-05 19:58               ` John Ogness
@ 2025-11-06 11:36                 ` John Ogness
  2025-11-06 16:22                   ` Petr Mladek
  0 siblings, 1 reply; 15+ messages in thread
From: John Ogness @ 2025-11-06 11:36 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On 2025-11-05, John Ogness <john.ogness@linutronix.de> wrote:
>> Another question is whether this is the only problem caused by the patch.
>
> This comparison is quite special. It caught my attention while combing
> through the code.

The reason that this comparison is special is that it is the only one
that does not take wrapping into account. I did it that way originally
because it is ANDed with a wrap check. But this is an ugly special
case. It should use the same wrap check as the other 3 cases in
printk_ringbuffer.c. If it had, the bug would not have happened.

I always considered these wrap checks to be non-obvious and
error-prone. So what if we create a nice helper function to simplify and
unify the wrap checks? Something like this:

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 839f504db6d30..8499ee642c31d 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -390,6 +390,17 @@ static unsigned int to_blk_size(unsigned int size)
 	return size;
 }
 
+/*
+ * Check if @lpos1 is before @lpos2. This takes ringbuffer wrapping
+ * into account. If @lpos1 is more than a full wrap before @lpos2,
+ * it is considered to be after @lpos2.
+ */
+static bool lpos1_before_lpos2(struct prb_data_ring *data_ring,
+			       unsigned long lpos1, unsigned long lpos2)
+{
+	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
+}
+
 /*
  * Sanity checker for reserve size. The ringbuffer code assumes that a data
  * block does not exceed the maximum possible size that could fit within the
@@ -577,7 +588,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
 	unsigned long id;
 
 	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
-	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
+	while (lpos1_before_lpos2(data_ring, lpos_begin, lpos_end)) {
 		blk = to_block(data_ring, lpos_begin);
 
 		/*
@@ -668,7 +679,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
 	 * sees the new tail lpos, any descriptor states that transitioned to
 	 * the reusable state must already be visible.
 	 */
-	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
+	while (lpos1_before_lpos2(data_ring, tail_lpos, lpos)) {
 		/*
 		 * Make all descriptors reusable that are associated with
 		 * data blocks before @lpos.
@@ -1149,7 +1160,7 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
 	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
 
 	/* If the data block does not increase, there is nothing to do. */
-	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
+	if (!lpos1_before_lpos2(data_ring, head_lpos, next_lpos)) {
 		if (wrapped)
 			blk = to_block(data_ring, 0);
 		else
@@ -1262,7 +1273,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
 
 	/* Regular data block: @begin less than @next and in same wrap. */
 	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
+	    lpos1_before_lpos2(data_ring, blk_lpos->begin, blk_lpos->next)) {
 		db = to_block(data_ring, blk_lpos->begin);
 		*data_size = blk_lpos->next - blk_lpos->begin;
 
This change also fixes the issue. Thoughts?
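
The behavior of the helper is easy to probe outside the kernel. The
following is a stand-alone sketch that keeps the same expression but
drops the data_ring argument in favor of a hypothetical fixed
RING_SIZE of 16, so it is an illustration rather than the proposed
kernel function.

```c
#include <stdbool.h>

#define RING_SIZE	16UL	/* hypothetical stand-in for DATA_SIZE(data_ring) */

/*
 * True when @lpos1 is at least 1 and at most RING_SIZE positions
 * before @lpos2. The unsigned subtraction makes the result independent
 * of an overflow of the lpos values themselves.
 */
static bool lpos1_before_lpos2(unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 - 1 < RING_SIZE;
}
```

Equal positions and positions more than one full ring apart both yield
false, while a pair that straddles the overflow to 0 (for example
lpos1 = -8, lpos2 = 0) yields true.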

John

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-06 11:36                 ` John Ogness
@ 2025-11-06 16:22                   ` Petr Mladek
  2025-11-06 18:58                     ` John Ogness
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2025-11-06 16:22 UTC (permalink / raw)
  To: John Ogness
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On Thu 2025-11-06 12:42:21, John Ogness wrote:
> On 2025-11-05, John Ogness <john.ogness@linutronix.de> wrote:
> >> Another question is whether this is the only problem caused by the patch.
> >
> > This comparison is quite special. It caught my attention while combing
> > through the code.
> 
> The reason that this comparison is special is that it is the only one
> that does not take wrapping into account. I did it that way originally
> because it is ANDed with a wrap check. But this is an ugly special
> case. It should use the same wrap check as the other 3 cases in
> printk_ringbuffer.c. If it had, the bug would not have happened.

I think that there are actually some differences between the
comparisons, see below.

> I always considered these wrap checks to be non-obvious and
> error-prone. So what if we create a nice helper function to simplify and
> unify the wrap checks? Something like this:

But I agree that some wrappers with a good description
would be helpful.

> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 839f504db6d30..8499ee642c31d 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -390,6 +390,17 @@ static unsigned int to_blk_size(unsigned int size)
>  	return size;
>  }
>  
> +/*
> + * Check if @lpos1 is before @lpos2. This takes ringbuffer wrapping
> + * into account. If @lpos1 is more than a full wrap before @lpos2,
> + * it is considered to be after @lpos2.

The 2nd sentence is a brain teaser ;-)

> + */
> +static bool lpos1_before_lpos2(struct prb_data_ring *data_ring,
> +			       unsigned long lpos1, unsigned long lpos2)
> +{
> +	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> +}

It would be nice to describe the semantics in a cleaner way. Sigh,
it is not easy. I tried several variants and ended up with:

   + using "lt" instead of "before" because "lower than" is
     a well known mathematical term.

   + adding "_safe" suffix to make it clear that it is not
     a simple mathematical comparison. It takes the wrap
     into account.

Something like:

/*
 * Returns true when @lpos1 is lower than @lpos2 and both values
 * are comparable.
 *
 * It is safe when the compared values are read in a lockless way.
 * One of them must already be overwritten when the difference
 * is bigger than the data ring buffer size.
 */
static bool lpos1_lt_lpos2_safe(struct prb_data_ring *data_ring,
				unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
}

>  /*
>   * Sanity checker for reserve size. The ringbuffer code assumes that a data
>   * block does not exceed the maximum possible size that could fit within the
> @@ -577,7 +588,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
>  	unsigned long id;
>  
>  	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
> -	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
> +	while (lpos1_before_lpos2(data_ring, lpos_begin, lpos_end)) {

lpos1_lt_lpos2_safe() fits here.

>  		blk = to_block(data_ring, lpos_begin);
>  		/*
> @@ -668,7 +679,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
>  	 * sees the new tail lpos, any descriptor states that transitioned to
>  	 * the reusable state must already be visible.
>  	 */
> -	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
> +	while (lpos1_before_lpos2(data_ring, tail_lpos, lpos)) {
>  		/*
>  		 * Make all descriptors reusable that are associated with
>  		 * data blocks before @lpos.

Same here.

> @@ -1149,7 +1160,7 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
>  	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
>  
>  	/* If the data block does not increase, there is nothing to do. */
> -	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
> +	if (!lpos1_before_lpos2(data_ring, head_lpos, next_lpos)) {

I think that the original code was correct. And using the "-1" is
wrong here.

Both data_make_reusable() and data_push_tail() had to use "-1"
because they needed the "lower than" semantic. But in this case,
we do not need to do anything even when "head_lpos == next_lpos".

In other words, both data_make_reusable() and data_push_tail()
needed to free space when the position was "lower than".
There was enough space when the values were "equal".

It means that "equal" should be OK in data_realloc(). In other
words, data_realloc() should use the "le" aka "less or equal"
semantic.

The helper function might be:

/*
 * Returns true when @lpos1 is lower than or equal to @lpos2 and both
 * values are comparable.
 *
 * It is safe when the compared values are read in a lockless way.
 * One of them must already be overwritten when the difference
 * is bigger than the data ring buffer size.
 */
static bool lpos1_le_lpos2_safe(struct prb_data_ring *data_ring,
				unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 < DATA_SIZE(data_ring);
}


>  		if (wrapped)
>  			blk = to_block(data_ring, 0);
>  		else
> @@ -1262,7 +1273,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
>  
>  	/* Regular data block: @begin less than @next and in same wrap. */
>  	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
> -	    blk_lpos->begin < blk_lpos->next) {
> +	    lpos1_before_lpos2(data_ring, blk_lpos->begin, blk_lpos->next)) {

Hmm, I think that it is more complicated here.

The "lower than" semantic is weird here. One would expect that "equal"
values, aka "zero size" is perfectly fine.

It does not hurt because the "zero size" case is already handled
earlier. But still, the "lower than" semantic does not fit here.

IMHO, the main motivation for this fix is to make sure that
blk_lpos->begin and blk_lpos->next will produce a valid
*data_size.

From this POV, even lpos1_le_lpos2_safe() does not fit here
because the data_size must be lower than half of the size
of the ring buffer.

> 		db = to_block(data_ring, blk_lpos->begin);
>  		*data_size = blk_lpos->next - blk_lpos->begin;

I think that we should do the following:

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 839f504db6d3..78e02711872e 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1260,9 +1260,8 @@ static const char *get_data(struct prb_data_ring *data_ring,
 		return NULL;
 	}
 
-	/* Regular data block: @begin less than @next and in same wrap. */
-	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
+	/* Regular data block: @begin and @next in same wrap. */
+	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
 		db = to_block(data_ring, blk_lpos->begin);
 		*data_size = blk_lpos->next - blk_lpos->begin;
 
@@ -1279,6 +1278,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
 		return NULL;
 	}
 
+	/* Double check that the data_size is reasonable. */
+	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
+		return NULL;
+
 	/* A valid data block will always be aligned to the ID size. */
 	if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
 	    WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {

1. Just distinguish regular vs. wrapped vs. invalid blocks.

2. Add sanity check for the "data_size". It might catch some wrong values
   in both code paths for "regular" and "wrapped" blocks. So, win win.

How does that sound?

Best Regards,
Petr

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-06 16:22                   ` Petr Mladek
@ 2025-11-06 18:58                     ` John Ogness
  2025-11-06 19:36                       ` John Ogness
  2025-11-07 11:48                       ` Petr Mladek
  0 siblings, 2 replies; 15+ messages in thread
From: John Ogness @ 2025-11-06 18:58 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On 2025-11-06, Petr Mladek <pmladek@suse.com> wrote:
>> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
>> index 839f504db6d30..8499ee642c31d 100644
>> --- a/kernel/printk/printk_ringbuffer.c
>> +++ b/kernel/printk/printk_ringbuffer.c
>> @@ -390,6 +390,17 @@ static unsigned int to_blk_size(unsigned int size)
>>  	return size;
>>  }
>>  
>> +/*
>> + * Check if @lpos1 is before @lpos2. This takes ringbuffer wrapping
>> + * into account. If @lpos1 is more than a full wrap before @lpos2,
>> + * it is considered to be after @lpos2.
>
> The 2nd sentence is a brain teaser ;-)
>
>> + */
>> +static bool lpos1_before_lpos2(struct prb_data_ring *data_ring,
>> +			       unsigned long lpos1, unsigned long lpos2)
>> +{
>> +	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
>> +}
>
> It would be nice to describe the semantics in a cleaner way. Sigh,
> it is not easy. I tried several variants and ended up with:
>
>    + using "lt" instead of "before" because "lower than" is
>      a well known mathematical term.

I explicitly chose a word other than "less" or "lower" because I was
concerned people might visualize values. The lpos does not necessarily
have a lesser or lower value. "Precedes" would also be a choice of mine.

When I see "lt" I immediately think "less than" and "<". But I will not
fight it. I can handle "lt".

>    + adding "_safe" suffix to make it clear that it is not
>      a simple mathematical comparison. It takes the wrap
>      into account.

I find "_safe" confusing. Especially when you look at the implementation
you wonder, "what is safe about this?". Especially when comparing it to
all the complexity of the rest of the code. But I can handle "_safe" if
it is important for you.

> Something like:
>
> /*
>  * Returns true when @lpos1 is lower than @lpos2 and both values
>  * are comparable.
>  *
>  * It is safe when the compared values are read in a lockless way.
>  * One of them must already be overwritten when the difference
>  * is bigger than the data ring buffer size.

This makes quite a few assumptions about the context and intention of
the call. I preferred my brain teaser version. But to me it is not worth
bike-shedding. If this explanation helps you, I am fine with it.

>  */
> static bool lpos1_lt_lpos2_safe(struct prb_data_ring *data_ring,
> 				unsigned long lpos1, unsigned long lpos2)
> {
> 	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> }
>
>>  /*
>>   * Sanity checker for reserve size. The ringbuffer code assumes that a data
>>   * block does not exceed the maximum possible size that could fit within the
>> @@ -577,7 +588,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
>>  	unsigned long id;
>>  
>>  	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
>> -	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
>> +	while (lpos1_before_lpos2(data_ring, lpos_begin, lpos_end)) {
>
> lpos1_lt_lpos2_safe() fits here.
>
>>  		blk = to_block(data_ring, lpos_begin);
>>  		/*
>> @@ -668,7 +679,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
>>  	 * sees the new tail lpos, any descriptor states that transitioned to
>>  	 * the reusable state must already be visible.
>>  	 */
>> -	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
>> +	while (lpos1_before_lpos2(data_ring, tail_lpos, lpos)) {
>>  		/*
>>  		 * Make all descriptors reusable that are associated with
>>  		 * data blocks before @lpos.
>
> Same here.
>
>> @@ -1149,7 +1160,7 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
>>  	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
>>  
>>  	/* If the data block does not increase, there is nothing to do. */
>> -	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
>> +	if (!lpos1_before_lpos2(data_ring, head_lpos, next_lpos)) {
>
> I think that the original code was correct. And using the "-1" is
> wrong here.

You have overlooked that I inverted the check. It is no longer checking:

    next_lpos <= head_lpos

but is instead checking:

    !(head_lpos < next_lpos)

IOW, if "next has not overtaken head".

> Both data_make_reusable() and data_push_tail() had to use "-1"
> because they needed the "lower than" semantic. But in this case,
> we do not need to do anything even when "head_lpos == next_lpos".
>
> In other words, both data_make_reusable() and data_push_tail()
> needed to free space when the position was "lower than".
> There was enough space when the values were "equal".
>
> It means that "equal" should be OK in data_realloc(). In other
> words, data_realloc() should use the "le" aka "less or equal"
> semantic.
>
> The helper function might be:
>
> /*
>  * Returns true when @lpos1 is lower than or equal to @lpos2 and both
>  * values are comparable.
>  *
>  * It is safe when the compared values are read in a lockless way.
>  * One of them must already be overwritten when the difference
>  * is bigger than the data ring buffer size.
>  */
> static bool lpos1_le_lpos2_safe(struct prb_data_ring *data_ring,
> 				unsigned long lpos1, unsigned long lpos2)
> {
> 	return lpos2 - lpos1 < DATA_SIZE(data_ring);
> }

If you negate lpos1_lt_lpos2_safe() and swap the parameters, there is no
need for a second helper. That is what I did.
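
How far the two forms coincide can be checked with a small user-space
model. This is a sketch under stated assumptions: RING_SIZE is a
hypothetical 16-byte stand-in for DATA_SIZE(data_ring), and lpos_lt(),
lpos_le() and forms_agree() are illustrative names that only reuse the
two expressions discussed above.

```c
#include <stdbool.h>

#define RING_SIZE	16UL	/* hypothetical stand-in for DATA_SIZE(data_ring) */

/* "Lower than" form: a is 1..RING_SIZE positions before b. */
static bool lpos_lt(unsigned long a, unsigned long b)
{
	return b - a - 1 < RING_SIZE;
}

/* "Lower or equal" form: a is 0..RING_SIZE-1 positions before b. */
static bool lpos_le(unsigned long a, unsigned long b)
{
	return b - a < RING_SIZE;
}

/* True when le(next, head) and the negated, swapped lt() agree. */
static bool forms_agree(unsigned long head, unsigned long next)
{
	return lpos_le(next, head) == !lpos_lt(head, next);
}
```

In this model the two forms give the same answer whenever head and
next are less than one ring size apart, including across the overflow
to 0. They differ once head is a full RING_SIZE or more ahead of next,
so the substitution relies on the positions staying within one ring
size of each other.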

>> @@ -1262,7 +1273,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
>>  
>>  	/* Regular data block: @begin less than @next and in same wrap. */
>>  	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
>> -	    blk_lpos->begin < blk_lpos->next) {
>> +	    lpos1_before_lpos2(data_ring, blk_lpos->begin, blk_lpos->next)) {
>
> Hmm, I think that it is more complicated here.
>
> The "lower than" semantic is weird here. One would expect that "equal"
> values, aka "zero size" is perfectly fine.

No, we would _not_ expect that zero size is OK, because we are detecting
"Regular data blocks", in which case they must _not_ be equal.

> It does not hurt because the "zero size" case is already handled
> earlier. But still, the "lower than" semantic does not fit here.

Currently we have 3 explicit checks:

1. data-less

2. regular

3. wrapping

But I agree the checks are "relaxed" because we are doing only minimal
sanity checks on the positions, rather than size validation.

> IMHO, the main motivation for this fix is to make sure that
> blk_lpos->begin and blk_lpos->next will produce a valid
> *data_size.
>
> From this POV, even lpos1_le_lpos2_safe() does not fit here
> because the data_size must be lower than half of the size
> of the ring buffer.

Currently we do not do size validation for reading, only for writing. If
you are arguing that we _should_ perform better size validation on read,
then I agree this is the place for it.

>> 		db = to_block(data_ring, blk_lpos->begin);
>>  		*data_size = blk_lpos->next - blk_lpos->begin;
>
> I think that we should do the following:
>
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 839f504db6d3..78e02711872e 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -1260,9 +1260,8 @@ static const char *get_data(struct prb_data_ring *data_ring,
>  		return NULL;
>  	}
>  
> -	/* Regular data block: @begin less than @next and in same wrap. */
> -	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
> -	    blk_lpos->begin < blk_lpos->next) {
> +	/* Regular data block: @begin and @next in same wrap. */
> +	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
>  		db = to_block(data_ring, blk_lpos->begin);
>  		*data_size = blk_lpos->next - blk_lpos->begin;
>  
> @@ -1279,6 +1278,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
>  		return NULL;
>  	}
>  
> +	/* Double check that the data_size is reasonable. */
> +	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
> +		return NULL;
> +
>  	/* A valid data block will always be aligned to the ID size. */
>  	if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
>  	    WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {
>
> 1. Just distinguish regular vs. wrapped. vs. invalid block.
>
> 2. Add sanity check for the "data_size". It might catch some wrong values
>    in both code paths for "regular" and "wrapped" blocks. So, win win.
>
> How does that sound?

I think it can be made even more simple since we are adding size
validation:

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index b7ab4e75917f0..04bc863eae411 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1271,23 +1271,15 @@ static const char *get_data(struct prb_data_ring *data_ring,
 		return NULL;
 	}
 
-	/* Regular data block: @begin less than @next and in same wrap. */
-	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
-		db = to_block(data_ring, blk_lpos->begin);
-		*data_size = blk_lpos->next - blk_lpos->begin;
-
-	/* Wrapping data block: @begin is one wrap behind @next. */
-	} else if (!is_blk_wrapped(data_ring,
-				   blk_lpos->begin + DATA_SIZE(data_ring),
-				   blk_lpos->next)) {
+	/* Wrapping data block description. */
+	if (is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
 		db = to_block(data_ring, 0);
 		*data_size = DATA_INDEX(data_ring, blk_lpos->next);
 
-	/* Illegal block description. */
+	/* Regular data block description. */
 	} else {
-		WARN_ON_ONCE(1);
-		return NULL;
+		db = to_block(data_ring, blk_lpos->begin);
+		*data_size = blk_lpos->next - blk_lpos->begin;
 	}
 
 	/* A valid data block will always be aligned to the ID size. */
@@ -1300,6 +1292,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
 	if (WARN_ON_ONCE(*data_size < sizeof(db->id)))
 		return NULL;
 
+	/* Check if the data size is at least legal. */
+	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
+		return NULL;
+
 	/* Subtract block ID space from size to reflect data size. */
 	*data_size -= sizeof(db->id);
 
So it ends up looking like this:

	/* Wrapping data block description. */
	if (is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
		db = to_block(data_ring, 0);
		*data_size = DATA_INDEX(data_ring, blk_lpos->next);

	/* Regular data block description. */
	} else {
		db = to_block(data_ring, blk_lpos->begin);
		*data_size = blk_lpos->next - blk_lpos->begin;
	}
...
	/* Ensure the data size is at least legal. */
	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
		return NULL;

(Note that there are already WARN_ON_ONCE() checks for misaligned lpos
values and sizes less than sizeof(id).)

How does this sound?

John


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-06 18:58                     ` John Ogness
@ 2025-11-06 19:36                       ` John Ogness
  2025-11-07 11:48                       ` Petr Mladek
  1 sibling, 0 replies; 15+ messages in thread
From: John Ogness @ 2025-11-06 19:36 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On 2025-11-06, John Ogness <john.ogness@linutronix.de> wrote:
>> I think that we should do the following:
>>
>> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
>> index 839f504db6d3..78e02711872e 100644
>> --- a/kernel/printk/printk_ringbuffer.c
>> +++ b/kernel/printk/printk_ringbuffer.c
>> @@ -1260,9 +1260,8 @@ static const char *get_data(struct prb_data_ring *data_ring,
>>  		return NULL;
>>  	}
>>  
>> -	/* Regular data block: @begin less than @next and in same wrap. */
>> -	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
>> -	    blk_lpos->begin < blk_lpos->next) {
>> +	/* Regular data block: @begin and @next in same wrap. */
>> +	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
>>  		db = to_block(data_ring, blk_lpos->begin);
>>  		*data_size = blk_lpos->next - blk_lpos->begin;

Upon further consideration, your suggestion here is better. The wrapping
data block detection should continue to make sure there is exactly one
wrap. The size check will not catch the case where there are multiple
wraps.
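
A minimal userspace sketch of why the size check alone cannot catch
multiple wraps. It assumes a hypothetical 16-byte data ring and assumes
that is_blk_wrapped() compares the wrap counts of its two lpos values;
all demo_* names are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 16-byte data ring (assumption for illustration only). */
#define DEMO_SIZE_BITS	4
#define DEMO_DATA_SIZE	(1UL << DEMO_SIZE_BITS)

/* How many times the data array was filled, derived from @lpos. */
static unsigned long demo_wraps(unsigned long lpos)
{
	return lpos >> DEMO_SIZE_BITS;
}

/* Offset of @lpos inside the data array. */
static unsigned long demo_index(unsigned long lpos)
{
	return lpos & (DEMO_DATA_SIZE - 1);
}

/* Assumed semantics: @begin and @next are in different wraps. */
static bool demo_is_blk_wrapped(unsigned long begin, unsigned long next)
{
	return demo_wraps(begin) != demo_wraps(next);
}

/* Wrapping data block: @begin is exactly one wrap behind @next. */
static bool demo_is_single_wrap(unsigned long begin, unsigned long next)
{
	return demo_is_blk_wrapped(begin, next) &&
	       !demo_is_blk_wrapped(begin + DEMO_DATA_SIZE, next);
}
```

With begin = 12, both next = 20 (one wrap ahead) and next = 52 (three
wraps ahead) yield the same wrapped data size of 4, so only the explicit
one-wrap check can tell them apart.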

John


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-06 18:58                     ` John Ogness
  2025-11-06 19:36                       ` John Ogness
@ 2025-11-07 11:48                       ` Petr Mladek
  2025-11-07 13:41                         ` John Ogness
  1 sibling, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2025-11-07 11:48 UTC (permalink / raw)
  To: John Ogness
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On Thu 2025-11-06 20:04:03, John Ogness wrote:
> On 2025-11-06, Petr Mladek <pmladek@suse.com> wrote:
> >> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> >> index 839f504db6d30..8499ee642c31d 100644
> >> --- a/kernel/printk/printk_ringbuffer.c
> >> +++ b/kernel/printk/printk_ringbuffer.c
> >> @@ -390,6 +390,17 @@ static unsigned int to_blk_size(unsigned int size)
> >>  	return size;
> >>  }
> >>  
> >> +/*
> >> + * Check if @lpos1 is before @lpos2. This takes ringbuffer wrapping
> >> + * into account. If @lpos1 is more than a full wrap before @lpos2,
> >> + * it is considered to be after @lpos2.
> >
> > The 2nd sentence is a brain teaser ;-)
> >
> >> + */
> >> +static bool lpos1_before_lpos2(struct prb_data_ring *data_ring,
> >> +			       unsigned long lpos1, unsigned long lpos2)
> >> +{
> >> +	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> >> +}
> >
> > It would be nice to describe the semantic a more clean way. Sigh,
> > it is not easy. I tried several variants and ended up with:
> >
> >    + using "lt" instead of "before" because "lower than" is
> >      a well known mathematical therm.
> 
> I explicitly chose a word other than "less" or "lower" because I was
> concerned people might visualize values. The lpos does not necessarily
> have a lesser or lower value. "Preceeds" would also be a choice of mine.

The word "before" was fine. I proposed "lt" because it was shorter and
I wanted to add "le" variant. I wanted to keep it short also because I
wanted to add another suffix to make it obvious that there was
the twist with wrapping.


> When I see "lt" I immediately think "less than" and "<". But I will not
> fight it. I can handle "lt".
> 
> >    + adding "_safe" suffix to make it clear that it is not
> >      a simple mathematical comparsion. It takes the wrap
> >      into account.
> 
> I find "_safe" confusing. Especially when you look at the implementation
> you wonder, "what is safe about this?". Especially when comparing it to
> all the complexity of the rest of the code. But I can handle "_safe" if
> it is important for you.

OK, forget "_safe". The helper function should make the code more
clear. And that won't work if even you or I are confused.

I thought about "_wrap" but it was confusing as well. The code uses
the word "wrap" many times and it is always about wrapping over
the end of the data ring, for example, DATA_WRAPS() computes how
many times the data array was filled [*].

But in this case, in data_make_reusable() and data_push_tail(),
the edge for wrapping is a moving target. It is defined by
data_ring->head_lpos and data_ring->tail_lpos.

[*] It is not the exact number because it is computed from lpos
    which is not initialized to zero and might overflow.

> > Something like:
> >
> > /*
> >  * Returns true when @lpos1 is lower than @lpos2 and both values
> >  * are comparable.
> >  *
> >  * It is safe when the compared values are read a lock less way.
> >  * One of them must be already overwritten when the difference
> >  * is bigger then the data ring buffer size.
> 
> This makes quite a bit of assumptions about the context and intention of
> the call. I preferred my brain teaser version. But to me it is not worth
> bike-shedding. If this explanation helps you, I am fine with it.

My problem with the "brain teaser" version is the sentence:

  "If @lpos1 is more than a full wrap before @lpos2,
   it is considered to be after @lpos2."

It says what it does but it does not explain why. And the "why"
is very important here.

I actually think that the sentence is misleading. If @lpos1 is more
than a full wrap before @lpos2 it is still _before_ @lpos2!

Why do we want to return "false" in this case? My understanding is
that it is because we want to break the "while" cycle where
the function is used, because we are clearly working with
outdated lpos values.

What about?

/*
 * Return true when @lpos1 is lower than @lpos2 and both values
 * look sane.
 *
 * They are considered insane when the difference is bigger than
 * the data buffer size. It happens when the values are read
 * without locking and another CPU already moved the ring buffer
 * head and/or tail.
 *
 * The caller must behave carefully. The changes based on this
 * check must be done using cmpxchg() to confirm that the check
 * worked with valid values.
 */
static bool lpos1_before_lpos2_sane(struct prb_data_ring *data_ring,
				    unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
}
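
To make the semantic concrete, here is a minimal userspace sketch of the
same arithmetic. It assumes a hypothetical 16-byte data ring in place of
DATA_SIZE(data_ring); the demo_* names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for DATA_SIZE(data_ring). */
#define DEMO_DATA_SIZE	16UL

/*
 * Same expression as the proposed helper, without the data_ring
 * argument. The unsigned subtraction is modulo 2^BITS_PER_LONG, so
 * "equal" and "lpos1 ahead of lpos2" both turn into a huge value
 * that fails the comparison.
 */
static bool demo_lpos1_before_lpos2(unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 - 1 < DEMO_DATA_SIZE;
}
```

For example, (10, 11) and (10, 26) return true, while (10, 10), (11, 10)
and (10, 27) return false: the last pair is more than one data buffer
size apart, so at least one value must be outdated. The check also
survives lpos overflow: ((unsigned long)-2, 2) is a distance of 4 and
returns true.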

Feel free to come up with any other function name or description,
whatever you think is clearer. But I have a favor to ask:

  + explain why the function returns false when the difference is
    bigger than the data buffer size.

  + ideally avoid the word "wrap" because it has another meaning
    in the printk ring buffer code as explained earlier.


> >>  /*
> >>   * Sanity checker for reserve size. The ringbuffer code assumes that a data
> >>   * block does not exceed the maximum possible size that could fit within the
> >> @@ -577,7 +588,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
> >>  	unsigned long id;
> >>  
> >>  	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
> >> -	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
> >> +	while (lpos1_before_lpos2(data_ring, lpos_begin, lpos_end)) {
> >
> > lpos1_lt_lpos2_safe() fits here.
> >
> >>  		blk = to_block(data_ring, lpos_begin);
> >>  		/*
> >> @@ -668,7 +679,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
> >>  	 * sees the new tail lpos, any descriptor states that transitioned to
> >>  	 * the reusable state must already be visible.
> >>  	 */
> >> -	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
> >> +	while (lpos1_before_lpos2(data_ring, tail_lpos, lpos)) {
> >>  		/*
> >>  		 * Make all descriptors reusable that are associated with
> >>  		 * data blocks before @lpos.
> >
> > Same here.
> >
> >> @@ -1149,7 +1160,7 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
> >>  	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
> >>  
> >>  	/* If the data block does not increase, there is nothing to do. */
> >> -	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
> >> +	if (!lpos1_before_lpos2(data_ring, head_lpos, next_lpos)) {
> >
> > I think that the original code was correct. And using the "-1" is
> > wrong here.
> 
> You have overlooked that I inverted the check. It is no longer checking:
> 
>     next_pos <= head_pos
> 
> but is instead checking:
> 
>     !(head_pos < next_pos)
> 
> IOW, if "next has not overtaken head".

I see. I missed this. Hmm, this would be correct if the comparison were
purely mathematical (lt, le). But is it correct in our case, when we take
the ring buffer wrapping into account?

The original check returned "false" when the difference between head_lpos
and next_lpos was bigger than the data ring size.

The new check would return "true", aka "!false", in this case.
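
A minimal userspace sketch of the two checks side by side, assuming a
hypothetical 16-byte data ring in place of DATA_SIZE(data_ring). The
demo_* names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for DATA_SIZE(data_ring). */
#define DEMO_DATA_SIZE	16UL

/* Same expression as the proposed lpos1_before_lpos2() helper. */
static bool demo_before(unsigned long lpos1, unsigned long lpos2)
{
	return lpos2 - lpos1 - 1 < DEMO_DATA_SIZE;
}

/* Original data_realloc() condition: next is at or behind head. */
static bool demo_old_check(unsigned long head_lpos, unsigned long next_lpos)
{
	return head_lpos - next_lpos < DEMO_DATA_SIZE;
}

/* New data_realloc() condition: next has not overtaken head. */
static bool demo_new_check(unsigned long head_lpos, unsigned long next_lpos)
{
	return !demo_before(head_lpos, next_lpos);
}
```

Both checks agree while the two values are within one data buffer size
of each other, for example (20, 20), (25, 20) and (20, 25). They differ
for (100, 20), where head is 80 bytes ahead of next: the old check
returns false and the new one returns true.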

Hmm, it seems that the buffer wrapping is not possible because
this code is called when desc_reopen_last() succeeded. And nobody
is allowed to free reopened block.

Anyway, I consider using (!lpos1_before_lpos2()) as highly confusing
in this case.

I would rather keep the code as is. Maybe we could add a comment
explaining that

	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {

can fail only when the subtraction is negative, that is, when next_lpos
is ahead of head_lpos. It can never fail because head_lpos advanced
more than the data buffer size past next_lpos: the data block is
reopened and nobody can free it.

Maybe, we could even add a check for this.


> > Both data_make_reusable() and data_push_tail() had to use "-1"
> > because it was the "lower than" semantic. But in this case,
> > we do not need to do anything even when "head_lpos == next_lpos"
> >
> > By other words, both data_make_reusable() and data_push_tail()
> > needed to make a free space when the position was "lower than".
> > There was enough space when the values were "equal".
> >
> > It means that "equal" should be OK in data_realloc(). By other
> > words, data_realloc() should use "le" aka "less or equal"
> > semantic.
> >
> > The helper function might be:
> >
> > /*
> >  * Returns true when @lpos1 is lower or equal than @lpos2 and both
> >  * values are comparable.
> >  *
> >  * It is safe when the compared values are read a lock less way.
> >  * One of them must be already overwritten when the difference
> >  * is bigger then the data ring buffer size.
> >  */
> > static bool lpos1_le_lpos2_safe(struct prb_data_ring *data_ring,
> > 				unsigned long lpos1, unsigned long lpos2)
> > {
> > 	return lpos2 - lpos1 < DATA_SIZE(data_ring);
> > }
> 
> If you negate lpos1_lt_lpos2_safe() and swap the parameters, there is no
> need for a second helper. That is what I did.

Sigh, lpos1_le_lpos2_safe() does not tell the truth after all.
And (!lpos1_lt_lpos2_safe()) looks wrong to me.

I am going to wait for what you say about my comments above.

> >> @@ -1262,7 +1273,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
> >>  
> >>  	/* Regular data block: @begin less than @next and in same wrap. */
> >>  	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
> >> -	    blk_lpos->begin < blk_lpos->next) {
> >> +	    lpos1_before_lpos2(data_ring, blk_lpos->begin, blk_lpos->next)) {
> >
> > Hmm, I think that it is more complicated here.
> >
> > The "lower than" semantic is weird here. One would expect that "equal"
> > values, aka "zero size" is perfectly fine.
> 
> No, we would _not_ expect that zero size is OK, because we are detecting
> "Regular data blocks", in which case they must _not_ be equal.

It seems that you have more or less agreed with my proposal to
use data_check_size() in the other reply, see
https://lore.kernel.org/all/87ecqb3qd0.fsf@jogness.linutronix.de/

I am thinking about fixing this in a separate patch and pushing it
into linux-next ASAP to fix the regression.

We could improve the other comparisons later...

How does that sound?
Should I prepare the patch for get_data() or are you going to do so?

Best Regards,
Petr


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
  2025-11-07 11:48                       ` Petr Mladek
@ 2025-11-07 13:41                         ` John Ogness
  0 siblings, 0 replies; 15+ messages in thread
From: John Ogness @ 2025-11-07 13:41 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Joanne Koong, syzbot, amurray@thegoodpenguin.co.uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On 2025-11-07, Petr Mladek <pmladek@suse.com> wrote:
> What about?
>
> /*
>  * Return true when @lpos1 is lower than @lpos2 and both values
>  * look sane.
>  *
>  * They are considered insane when the difference is bigger than
>  * the data buffer size. It happens when the values are read
>  * without locking and another CPU already moved the ring buffer
>  * head and/or tail.
>  *
>  * The caller must behave carefully. The changes based on this
>  * check must be done using cmpxchg() to confirm that the check
>  * worked with valid values.
>  */
> static bool lpos1_before_lpos2_sane(struct prb_data_ring *data_ring,
> 				    unsigned long lpos1, unsigned long lpos2)
> {
> 	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> }
>
> Feel free to come up with any other function name or description.

I prefer "_bounded" to "_sane". And I really don't care if it is
"before" or "lt". I was only stating why I chose "before" instead of
something else. But I really don't care. Really.

My preferences would be:

lpos1_before_lpos2_bounded()

lpos1_lt_lpos2_bounded()

But I can live with lpos1_before_lpos2_sane() if you think "_sane" is
better.

>> You have overlooked that I inverted the check. It is no longer checking:
>> 
>>     next_pos <= head_pos
>> 
>> but is instead checking:
>> 
>>     !(head_pos < next_pos)
>> 
>> IOW, if "next has not overtaken head".
>
> I see. I missed this. Hmm, this would be correct if the comparison were
> purely mathematical (lt, le). But is it correct in our case, when we take
> the ring buffer wrapping into account?
>
> The original check returned "false" when the difference between head_lpos
> and next_lpos was bigger than the data ring size.
>
> The new check would return "true", aka "!false", in this case.

Sure, but that is not possible. Even if we assume there has been
corrupted data, the new get_data() will catch that.

> Hmm, it seems that the buffer wrapping is not possible because
> this code is called when desc_reopen_last() succeeded. And nobody
> is allowed to free reopened block.

Correct.

> Anyway, I consider using (!lpos1_before_lpos2()) as highly confusing
> in this case.

I think if you look at what the new check is checking instead of trying
to mentally map the old check to the new check, it is not confusing.

> I would either keep the code as is.

:-/ That defeats the whole purpose of the new helper, which is simply
comparing the relative position of two lpos values. That is exactly what
is being done here.

I would prefer adding an additional lpos1_le_lpos2_bounded() variant
before leaving the old code. A new variant is unnecessary, but at least
we would have all logical position comparison code together.

> Maybe we could add a comment explaining that
>
> 	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
>
> might fail only when the subtraction is negative. It should never be
> positive because head_lpos advanced more than the data buffer size
> over next_lpos because the data block is reopened and nobody could
> free it.
>
> Maybe, we could even add a check for this.

If data is being illegally manipulated underneath us, we are screwed
anyway. I see no point in sprinkling checks around in case someone is
modifying our data even though we have exclusive access to it.

> I think about fixing this in a separate patch and pushing this
> into linux-next ASAP to fix the regression.
>
> We could improve the other comparisons later...
>
> How does that sound?

Sure. Are you planning on letting 6.19 pull 2 patches or will you fold
them for the 6.19 pull?

> Should I prepare the patch for get_data() are you going to do so?

I would prefer you do it so that we do not need any more discussion for
the quick fix. ;-)

John


* Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
       [not found] <68cc0578.050a0220.28a605.0006.GAE@google.com>
  2025-11-01  2:11 ` [syzbot] [iomap?] kernel BUG in folio_end_read (2) syzbot
@ 2025-11-02  5:39 ` syzbot
  1 sibling, 0 replies; 15+ messages in thread
From: syzbot @ 2025-11-02  5:39 UTC (permalink / raw)
  To: brauner, chao, djwong, jaegeuk, joannelkoong, linux-f2fs-devel,
	linux-fsdevel, linux-kernel, linux-xfs, syzkaller-bugs

syzbot has bisected this issue to:

commit 51311f045375984dabdf8cc523e80d39a4c3dd5a
Author: Joanne Koong <joannelkoong@gmail.com>
Date:   Fri Sep 26 00:26:02 2025 +0000

    iomap: track pending read bytes more optimally

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=103a0012580000
start commit:   98bd8b16ae57 Add linux-next specific files for 20251031
git tree:       linux-next
final oops:     https://syzkaller.appspot.com/x/report.txt?x=123a0012580000
console output: https://syzkaller.appspot.com/x/log.txt?x=143a0012580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=63d09725c93bcc1c
dashboard link: https://syzkaller.appspot.com/bug?extid=3686758660f980b402dc
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=176fc342580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10403f34580000

Reported-by: syzbot+3686758660f980b402dc@syzkaller.appspotmail.com
Fixes: 51311f045375 ("iomap: track pending read bytes more optimally")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection



Thread overview: 15+ messages
     [not found] <68cc0578.050a0220.28a605.0006.GAE@google.com>
2025-11-01  2:11 ` [syzbot] [iomap?] kernel BUG in folio_end_read (2) syzbot
2025-11-03 16:58   ` Joanne Koong
2025-11-04  2:43     ` syzbot
2025-11-04 17:45       ` Joanne Koong
2025-11-04 18:25         ` Petr Mladek
2025-11-05 14:54           ` John Ogness
2025-11-05 16:49             ` Petr Mladek
2025-11-05 19:58               ` John Ogness
2025-11-06 11:36                 ` John Ogness
2025-11-06 16:22                   ` Petr Mladek
2025-11-06 18:58                     ` John Ogness
2025-11-06 19:36                       ` John Ogness
2025-11-07 11:48                       ` Petr Mladek
2025-11-07 13:41                         ` John Ogness
2025-11-02  5:39 ` syzbot
