[syzbot] [nilfs?] INFO: task hung in nilfs_segctor

public inbox for linux-nilfs@vger.kernel.org

* [syzbot] [nilfs?] INFO: task hung in nilfs_segctor_thread (6)
@ 2025-12-17  0:46 syzbot
  2025-12-17  8:43 ` [PATCH] nilfs2: fix potential block overflow that cause system hang Edward Adam Davis
  0 siblings, 1 reply; 10+ messages in thread
From: syzbot @ 2025-12-17  0:46 UTC
  To: axboe, konishi.ryusuke, kristian, linux-kernel, linux-nilfs,
	slava, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    8f0b4cce4481 Linux 6.19-rc1
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=178ac11a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1f2b6fe1fdf1a00b
dashboard link: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12efdb90580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f9cd92580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ea3b19e4d883/disk-8f0b4cce.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/bd7c115820ba/vmlinux-8f0b4cce.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e5813cc1963f/bzImage-8f0b4cce.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/80eb4ac785e9/mount_0.gz

The issue was bisected to:

commit 2b9ac22b12a266eb4fec246a07b504dd4983b16b
Author: Kristian Klausen <kristian@klausen.dk>
Date:   Fri Jun 18 11:51:57 2021 +0000

    loop: Fix missing discard support when using LOOP_CONFIGURE

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17e3b11a580000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=1413b11a580000
console output: https://syzkaller.appspot.com/x/log.txt?x=1013b11a580000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Fixes: 2b9ac22b12a2 ("loop: Fix missing discard support when using LOOP_CONFIGURE")

INFO: task segctord:6093 blocked for more than 143 seconds.
      Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:segctord        state:D stack:28968 pid:6093  tgid:6093  ppid:2      task_flags:0x200040 flags:0x00080000
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5256 [inline]
 __schedule+0x1480/0x50a0 kernel/sched/core.c:6863
 __schedule_loop kernel/sched/core.c:6945 [inline]
 rt_mutex_schedule+0x77/0xf0 kernel/sched/core.c:7241
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
 kthread+0x711/0x8a0 kernel/kthread.c:463
 ret_from_fork+0x599/0xb30 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/38:
 #0: ffffffff8d5ae880 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
 #0: ffffffff8d5ae880 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:867 [inline]
 #0: ffffffff8d5ae880 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
3 locks held by kworker/u8:14/1555:
2 locks held by getty/5561:
 #0: ffff8880354c00a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
 #1: ffffc90003e8b2e0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x44f/0x1460 drivers/tty/n_tty.c:2211
1 lock held by syz-executor/5830:
5 locks held by syz.0.17/6090:
1 lock held by segctord/6093:
 #0: ffff88803672b2b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
2 locks held by syz.1.18/6168:
1 lock held by segctord/6169:
 #0: ffff88805c1e12b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
2 locks held by syz.2.19/6194:
1 lock held by segctord/6195:
 #0: ffff88801fadf2b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
3 locks held by syz.3.20/6222:
1 lock held by segctord/6223:
 #0: ffff88801b7aa2b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
4 locks held by syz.4.21/6261:
1 lock held by segctord/6263:
 #0: ffff8880308212b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
3 locks held by syz.5.22/6295:
1 lock held by segctord/6296:
 #0: ffff888033ee82b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
2 locks held by syz.6.23/6334:
1 lock held by segctord/6335:
 #0: ffff888038fc92b0 (&nilfs->ns_segctor_sem){++++}-{4:4}, at: nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357

=============================================

NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 38 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT_{RT,(full)} 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 nmi_cpu_backtrace+0x39e/0x3d0 lib/nmi_backtrace.c:113
 nmi_trigger_cpumask_backtrace+0x17a/0x300 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:160 [inline]
 __sys_info lib/sys_info.c:157 [inline]
 sys_info+0x135/0x170 lib/sys_info.c:165
 check_hung_uninterruptible_tasks kernel/hung_task.c:346 [inline]
 watchdog+0xf95/0xfe0 kernel/hung_task.c:515
 kthread+0x711/0x8a0 kernel/kthread.c:463
 ret_from_fork+0x599/0xb30 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
 </TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 6295 Comm: syz.5.22 Not tainted syzkaller #0 PREEMPT_{RT,(full)} 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:io_apic_sync arch/x86/kernel/apic/io_apic.c:398 [inline]
RIP: 0010:io_apic_modify_irq arch/x86/kernel/apic/io_apic.c:386 [inline]
RIP: 0010:mask_ioapic_irq+0x187/0x380 arch/x86/kernel/apic/io_apic.c:407
Code: 10 00 00 00 c1 e5 0c 81 c5 00 40 20 00 48 63 cd 48 c7 c2 00 f0 7f ff 48 29 ca 8b 0b 81 e1 ff 0f 00 00 89 04 0a 44 89 74 0a 10 <41> 0f b6 44 35 00 84 c0 0f 85 2c 01 00 00 44 8b 2f 49 81 fd 80 00
RSP: 0018:ffffc90000007f00 EFLAGS: 00000046
RAX: 0000000000000024 RBX: ffffffff91b73694 RCX: 0000000000000000
RDX: ffffffffff5fb000 RSI: dffffc0000000000 RDI: ffff88813ff7d850
RBP: 0000000000204000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffff52000000fbc R12: ffff88813ff7d840
R13: 1ffff11027fefb0a R14: 0000000000018020 R15: 000000000000000a
FS:  00007fe5d0c066c0(0000) GS:ffff888126d03000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f04873d608 CR3: 000000003567c000 CR4: 00000000003526f0
Call Trace:
 <IRQ>
 mask_irq kernel/irq/chip.c:434 [inline]
 handle_fasteoi_irq+0x33f/0xa00 kernel/irq/chip.c:762
 generic_handle_irq_desc include/linux/irqdesc.h:172 [inline]
 handle_irq arch/x86/kernel/irq.c:255 [inline]
 call_irq_handler arch/x86/kernel/irq.c:-1 [inline]
 __common_interrupt+0x141/0x1f0 arch/x86/kernel/irq.c:326
 common_interrupt+0xb6/0xe0 arch/x86/kernel/irq.c:319
 </IRQ>
 <TASK>
 asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:688
RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:23 [inline]
RIP: 0010:raw_atomic_read include/linux/atomic/atomic-arch-fallback.h:457 [inline]
RIP: 0010:rcu_is_watching_curr_cpu include/linux/context_tracking.h:128 [inline]
RIP: 0010:rcu_is_watching+0x4e/0xb0 kernel/rcu/tree.c:751
Code: ff df 4c 8d 34 dd d0 ad 01 8d 4c 89 f0 48 c1 e8 03 42 80 3c 38 00 74 08 4c 89 f7 e8 ac 92 7c 00 48 c7 c3 58 0c b3 91 49 03 1e <48> 89 d8 48 c1 e8 03 42 0f b6 04 38 84 c0 75 34 8b 03 65 ff 0d 79
RSP: 0018:ffffc90003f074c8 EFLAGS: 00000283
RAX: 1ffffffff1a035ba RBX: ffff8880b8833c58 RCX: f99f3176bf70e500
RDX: 0000000000000000 RSI: ffffffff8b3f5640 RDI: ffffffff8b3f5600
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88803df1b2b8 R11: ffffed1007be365b R12: dffffc0000000000
R13: ffff88803df1b280 R14: ffffffff8d01add0 R15: dffffc0000000000
 rcu_read_lock include/linux/rcupdate.h:868 [inline]
 bio_associate_blkg+0xa6/0x230 block/blk-cgroup.c:2154
 bio_init block/bio.c:267 [inline]
 bio_alloc_percpu_cache block/bio.c:473 [inline]
 bio_alloc_bioset+0x46a/0x12d0 block/bio.c:526
 bio_alloc include/linux/bio.h:374 [inline]
 blk_alloc_discard_bio+0x194/0x2c0 block/blk-lib.c:47
 __blkdev_issue_discard block/blk-lib.c:68 [inline]
 blkdev_issue_discard+0xf2/0x1b0 block/blk-lib.c:93
 nilfs_sufile_trim_fs+0xc31/0xf90 fs/nilfs2/sufile.c:1182
 nilfs_ioctl_trim_fs fs/nilfs2/ioctl.c:1041 [inline]
 nilfs_ioctl+0x1411/0x25a0 fs/nilfs2/ioctl.c:1354
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl+0xff/0x170 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe5d159f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe5d0c06038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fe5d17f5fa0 RCX: 00007fe5d159f749
RDX: 00002000000004c0 RSI: 00000000c0185879 RDI: 0000000000000005
RBP: 00007fe5d1623f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fe5d17f6038 R14: 00007fe5d17f5fa0 R15: 00007ffff490ff28
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* [PATCH] nilfs2: fix potential block overflow that cause system hang
  2025-12-17  0:46 [syzbot] [nilfs?] INFO: task hung in nilfs_segctor_thread (6) syzbot
@ 2025-12-17  8:43 ` Edward Adam Davis
  2025-12-17 13:38   ` Ryusuke Konishi
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Adam Davis @ 2025-12-17  8:43 UTC
  To: syzbot+7eedce5eb281acd832f0
  Cc: axboe, konishi.ryusuke, kristian, linux-kernel, linux-nilfs,
	slava, syzkaller-bugs

When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].

The fix involves adding a check for the end block: if it equals the
start block, the trim operation is exited and -EINVAL is returned.

[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684

Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/nilfs2/sufile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 83f93337c01b..63a1f0b29066 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
 	else
 		end_block = start_block + len - 1;
 
+	if (start_block == end_block)
+		return -EINVAL;
+
 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
 	segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
 
-- 
2.43.0



* Re: [PATCH] nilfs2: fix potential block overflow that cause system hang
  2025-12-17  8:43 ` [PATCH] nilfs2: fix potential block overflow that cause system hang Edward Adam Davis
@ 2025-12-17 13:38   ` Ryusuke Konishi
  2025-12-18  2:28     ` [PATCH v2] nilfs2: Fix " Edward Adam Davis
  0 siblings, 1 reply; 10+ messages in thread
From: Ryusuke Konishi @ 2025-12-17 13:38 UTC
  To: Edward Adam Davis
  Cc: syzbot+7eedce5eb281acd832f0, axboe, kristian, linux-kernel,
	linux-nilfs, slava, syzkaller-bugs

On Wed, Dec 17, 2025 at 5:43 PM Edward Adam Davis wrote:
>
> When a user executes the FITRIM command, an underflow can occur when
> calculating nblocks if end_block is too small. Since nblocks is of
> type sector_t, which is u64, a negative nblocks value will become a
> very large positive integer. This ultimately leads to the block layer
> function __blkdev_issue_discard() taking an excessively long time to
> process the bio chain, and the ns_segctor_sem lock remains held for a
> long period. This prevents other tasks from acquiring the ns_segctor_sem
> lock, resulting in the hang reported by syzbot in [1].
>
> The fix involves adding a check for the end block: if it equals the
> start block, the trim operation is exited and -EINVAL is returned.
>
> [1]
> task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
> Call Trace:
>  rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
>  nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
>  nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
>
> Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
> Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  fs/nilfs2/sufile.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
> index 83f93337c01b..63a1f0b29066 100644
> --- a/fs/nilfs2/sufile.c
> +++ b/fs/nilfs2/sufile.c
> @@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>         else
>                 end_block = start_block + len - 1;
>
> +       if (start_block == end_block)
> +               return -EINVAL;
> +
>         segnum = nilfs_get_segnum_of_block(nilfs, start_block);
>         segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
>
> --
> 2.43.0

Hi Edward,

Thanks for the patch.

And, sorry for the noise on the block layer. As Edward's patch points
out, this looks like a defect in the NILFS2 fstrim implementation.

However, I would like to discuss the approach to the fix with Edward.
Since the FITRIM request size is larger than the block size (which is
1KiB in the syzbot reproducer), the request itself looks valid. I
believe we need to fix the logic that causes the loop overrun instead
of rejecting the request.

I attempted to reproduce the issue using the exact same ioctl
parameters, but it completed successfully. Therefore, I suspect that
specific disk usage or metadata corruption might be a prerequisite for
triggering this bug.

I will follow up with more detailed feedback later.

Thanks,
Ryusuke Konishi


* [PATCH v2] nilfs2: Fix potential block overflow that cause system hang
  2025-12-17 13:38   ` Ryusuke Konishi
@ 2025-12-18  2:28     ` Edward Adam Davis
  2025-12-18  4:22       ` [PATCH v3] " Edward Adam Davis
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Adam Davis @ 2025-12-18  2:28 UTC
  To: konishi.ryusuke
  Cc: axboe, eadavis, kristian, linux-kernel, linux-nilfs, slava,
	syzkaller-bugs

When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].

Before recalculating nblocks, add checks for the end and start block.

[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684

Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
v1 -> v2: continue do discard and comments

 fs/nilfs2/sufile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 83f93337c01b..75ca318b5763 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -1175,7 +1175,7 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
 			nblocks -= start_block - start;
 			start = start_block;
 		}
-		if (start + nblocks > end_block + 1)
+		if (start + nblocks > end_block + 1 && end_block > start)
 			nblocks = end_block - start + 1;
 
 		if (nblocks >= minlen) {
-- 
2.43.0



* [PATCH v3] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18  2:28     ` [PATCH v2] nilfs2: Fix " Edward Adam Davis
@ 2025-12-18  4:22       ` Edward Adam Davis
  2025-12-18 11:32         ` Ryusuke Konishi
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Adam Davis @ 2025-12-18  4:22 UTC
  To: eadavis
  Cc: axboe, konishi.ryusuke, kristian, linux-kernel, linux-nilfs,
	slava, syzkaller-bugs

When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].

If the ending block is too small, for example, smaller than first data
block, this poses a risk of corrupting the filesystem's superblock.
Here, I check if the segment's ending block number is 0 to determine
if the previously calculated ending block is too small.

Although the start and len values in the user input range are too small,
a conservative strategy is adopted here to safely ignore them, which is
equivalent to a no-op; it will not perform any trimming and will not
throw an error.

[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684

Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
v2 -> v3: change to segment end check and update comments
v1 -> v2: continue do discard and comments

 fs/nilfs2/sufile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 83f93337c01b..fa612d5ec726 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -1095,6 +1095,8 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)

 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
 	segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
+	if (!segnum_end)
+		return 0;

 	down_read(&NILFS_MDT(sufile)->mi_sem);

-- 
2.43.0


* Re: [PATCH v3] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18  4:22       ` [PATCH v3] " Edward Adam Davis
@ 2025-12-18 11:32         ` Ryusuke Konishi
  2025-12-18 11:48           ` [PATCH v4] " Edward Adam Davis
  0 siblings, 1 reply; 10+ messages in thread
From: Ryusuke Konishi @ 2025-12-18 11:32 UTC
  To: Edward Adam Davis
  Cc: slava, linux-nilfs, syzkaller-bugs, axboe, kristian, linux-kernel

On Thu, Dec 18, 2025 at 1:22 PM Edward Adam Davis wrote:
>
> When a user executes the FITRIM command, an underflow can occur when
> calculating nblocks if end_block is too small. Since nblocks is of
> type sector_t, which is u64, a negative nblocks value will become a
> very large positive integer. This ultimately leads to the block layer
> function __blkdev_issue_discard() taking an excessively long time to
> process the bio chain, and the ns_segctor_sem lock remains held for a
> long period. This prevents other tasks from acquiring the ns_segctor_sem
> lock, resulting in the hang reported by syzbot in [1].
>
> If the ending block is too small, for example, smaller than first data
> block, this poses a risk of corrupting the filesystem's superblock.
> Here, I check if the segment's ending block number is 0 to determine
> if the previously calculated ending block is too small.
>
> Although the start and len values in the user input range are too small,
> a conservative strategy is adopted here to safely ignore them, which is
> equivalent to a no-op; it will not perform any trimming and will not
> throw an error.
>
> [1]
> task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
> Call Trace:
>  rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
>  nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
>  nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
>
> Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
> Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
> v2 -> v3: change to segment end check and update comments
> v1 -> v2: continue do discard and comments
>
>  fs/nilfs2/sufile.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
> index 83f93337c01b..fa612d5ec726 100644
> --- a/fs/nilfs2/sufile.c
> +++ b/fs/nilfs2/sufile.c
> @@ -1095,6 +1095,8 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>
>         segnum = nilfs_get_segnum_of_block(nilfs, start_block);
>         segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
> +       if (!segnum_end)
> +               return 0;
>
>         down_read(&NILFS_MDT(sufile)->mi_sem);
>
> --
> 2.43.0

Hi Edward,

Thanks for submitting the patches.

However, I see two issues with the v3 patch:

First, this patch results in ignoring discard requests that are
limited to the region within segment number 0.
This is not the desired behavior.

Also, the final processing step that sets the actual discarded byte
size to range->len gets skipped.
When exiting normally, the code needs to jump to a (new) label placed
just before the following assignment:

range->len = ndiscarded << nilfs->ns_blocksize_bits;

The root cause lies in the logic that clips the last free extent to
fit within the range specified by the ioctl.
As you noticed in your v2 patch, the issue is that end_block (the end
of the clipping region) is located before start (the start position of
the free extent), which causes an underflow in the following nblocks
calculation:

if (start + nblocks > end_block + 1)
        nblocks = end_block - start + 1;

The reason this happens is that the beginning of segment 0 is
truncated to reserve space for the primary superblock, so its starting
block effectively becomes the block number defined by
nilfs->ns_first_data_block.

While the segment range obtained by nilfs_get_segment_range() reflects
this adjustment, nilfs_get_segnum_of_block() does not (it returns 0
even for blocks preceding the first block in segment 0).

So, if we want to add a check beforehand, I think it would be better
to skip the operation if end_block is less than
nilfs->ns_first_data_block.
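
To make the arithmetic concrete, here is a minimal user-space sketch of
the mismatch described above. It is a model only: the helper names,
BLOCKS_PER_SEGMENT = 2048 and FIRST_DATA_BLOCK = 4 are assumptions
chosen for illustration, not values taken from the reproducer or from
the kernel source.

```c
#include <stdint.h>

/* Hypothetical geometry, for illustration only. */
#define BLOCKS_PER_SEGMENT 2048ULL
#define FIRST_DATA_BLOCK   4ULL	/* stands in for nilfs->ns_first_data_block */

/* Models nilfs_get_segnum_of_block(): no adjustment for segment 0. */
static uint64_t segnum_of_block(uint64_t blocknr)
{
	return blocknr / BLOCKS_PER_SEGMENT;
}

/* Models nilfs_get_segment_range(): segment 0 starts at the first data block. */
static void segment_range(uint64_t segnum, uint64_t *start, uint64_t *end)
{
	*start = segnum * BLOCKS_PER_SEGMENT;
	*end = *start + BLOCKS_PER_SEGMENT - 1;
	if (segnum == 0)
		*start = FIRST_DATA_BLOCK;
}

/*
 * Treats the whole segment containing end_block as one free extent and
 * applies the clipping step quoted above. Returns the clipped nblocks.
 */
static uint64_t clip_last_extent(uint64_t end_block)
{
	uint64_t start, end, nblocks;

	segment_range(segnum_of_block(end_block), &start, &end);
	nblocks = end - start + 1;

	if (start + nblocks > end_block + 1)
		nblocks = end_block - start + 1; /* wraps when end_block + 1 < start */

	return nblocks;
}
```

With these assumed values, clip_last_extent(1) maps block 1 to segment
0, whose range starts at block 4, so the clipped length wraps to
2^64 - 2. A request ending at block 100 is clipped to 97 blocks as
intended.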

Thanks,
Ryusuke Konishi


* [PATCH v4] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18 11:32         ` Ryusuke Konishi
@ 2025-12-18 11:48           ` Edward Adam Davis
  2025-12-18 12:01             ` Ryusuke Konishi
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Adam Davis @ 2025-12-18 11:48 UTC
  To: konishi.ryusuke
  Cc: axboe, eadavis, kristian, linux-kernel, linux-nilfs, slava,
	syzkaller-bugs

When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].

If the ending block is too small, for example, smaller than first data
block, this poses a risk of corrupting the filesystem's superblock.

Although the start and len values in the user input range are too small,
a conservative strategy is adopted here to safely ignore them, which is
equivalent to a no-op; it will not perform any trimming and will not
throw an error.

[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684

Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
v3 -> v4: check end block and first data block
v2 -> v3: change to segment end check and update comments
v1 -> v2: continue do discard and comments

 fs/nilfs2/sufile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 83f93337c01b..5d7cbd26a910 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
 	else
 		end_block = start_block + len - 1;

+	if (end_block < nilfs->ns_first_data_block)
+		return 0;
+
 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
 	segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);

-- 
2.43.0


* Re: [PATCH v4] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18 11:48           ` [PATCH v4] " Edward Adam Davis
@ 2025-12-18 12:01             ` Ryusuke Konishi
  2025-12-18 12:10               ` [PATCH v5] " Edward Adam Davis
  0 siblings, 1 reply; 10+ messages in thread
From: Ryusuke Konishi @ 2025-12-18 12:01 UTC
  To: Edward Adam Davis
  Cc: axboe, kristian, linux-kernel, linux-nilfs, slava, syzkaller-bugs

On Thu, Dec 18, 2025 at 8:48 PM Edward Adam Davis wrote:
>
> When a user executes the FITRIM command, an underflow can occur when
> calculating nblocks if end_block is too small. Since nblocks is of
> type sector_t, which is u64, a negative nblocks value will become a
> very large positive integer. This ultimately leads to the block layer
> function __blkdev_issue_discard() taking an excessively long time to
> process the bio chain, and the ns_segctor_sem lock remains held for a
> long period. This prevents other tasks from acquiring the ns_segctor_sem
> lock, resulting in the hang reported by syzbot in [1].
>
> If the ending block is too small, for example, smaller than first data
> block, this poses a risk of corrupting the filesystem's superblock.
>
> Although the start and len values in the user input range are too small,
> a conservative strategy is adopted here to safely ignore them, which is
> equivalent to a no-op; it will not perform any trimming and will not
> throw an error.
>
> [1]
> task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
> Call Trace:
>  rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
>  nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
>  nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
>
> Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
> Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
> v3 -> v4: check end block and first data block
> v2 -> v3: change to segment end check and update comments
> v1 -> v2: continue do discard and comments
>
>  fs/nilfs2/sufile.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
> index 83f93337c01b..5d7cbd26a910 100644
> --- a/fs/nilfs2/sufile.c
> +++ b/fs/nilfs2/sufile.c
> @@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>         else
>                 end_block = start_block + len - 1;
>
> +       if (end_block < nilfs->ns_first_data_block)
> +               return 0;
> +
>         segnum = nilfs_get_segnum_of_block(nilfs, start_block);
>         segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
>
> --
> 2.43.0

Hi,

One of my comments on the v3 patch has not been addressed.
When exiting successfully, the discarded size (0 in this case) must
at least be assigned to range->len.

For future optimization, I recommend leaving ret = 0 (the initial
value) and jumping before the final assignment statement (since
ndiscarded is also initialized to 0):

        range->len = ndiscarded << nilfs->ns_blocksize_bits;
        return ret;
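
As a rough illustration of that control flow (a simplified userspace
sketch, not the actual nilfs2 code; struct trim_range, trim_sketch() and
the simulated discard count are hypothetical and exist only for this
example):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fstrim_range; not the kernel type. */
struct trim_range {
	uint64_t start;	/* in: first byte to trim */
	uint64_t len;	/* in: bytes requested; out: bytes discarded */
};

/*
 * Sketch of the "jump before the final assignment" pattern. ret and
 * ndiscarded keep their initial value of 0 on the early exit, so the
 * caller still gets range->len = 0 and a successful return.
 * Assumes blocksize_bits >= 1 so the block arithmetic cannot overflow.
 */
static int trim_sketch(struct trim_range *range, uint64_t first_data_block,
		       unsigned int blocksize_bits)
{
	uint64_t start_block = range->start >> blocksize_bits;
	uint64_t len = range->len >> blocksize_bits;
	uint64_t end_block, first, ndiscarded = 0;
	int ret = 0;

	if (len == 0)
		return -EINVAL;

	end_block = start_block + len - 1;

	/* Range ends before the first data block: nothing to trim. */
	if (end_block < first_data_block)
		goto out;

	/*
	 * The real function walks the segment usage file here. This
	 * sketch simply pretends every in-range data block was discarded.
	 */
	first = start_block > first_data_block ? start_block : first_data_block;
	ndiscarded = end_block - first + 1;

out:
	range->len = ndiscarded << blocksize_bits;
	return ret;
}
```

With a 4 KiB block size and a first data block of 8, a request covering
only block 0 returns 0 and reports range->len == 0 instead of leaving the
caller's requested length in place.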

Regards,
Ryusuke Konishi

* [PATCH v5] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18 12:01             ` Ryusuke Konishi
@ 2025-12-18 12:10               ` Edward Adam Davis
  2025-12-18 12:42                 ` Ryusuke Konishi
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Adam Davis @ 2025-12-18 12:10 UTC (permalink / raw)
  To: konishi.ryusuke
  Cc: axboe, eadavis, kristian, linux-kernel, linux-nilfs, slava,
	syzkaller-bugs

When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].
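
As an illustration only (a userspace sketch, not the nilfs2 code; blk_t,
unchecked_nblocks() and guarded_nblocks() are hypothetical names, and the
expression is assumed to mirror how nblocks gets clamped to end_block),
the wraparound can be reproduced like this:

```c
#include <stdint.h>

/* Stand-in for the kernel's sector_t (u64); not the real typedef. */
typedef uint64_t blk_t;

/*
 * Hypothetical helper: count the blocks from seg_start through
 * end_block with no range check, modelling the unchecked arithmetic.
 */
static blk_t unchecked_nblocks(blk_t seg_start, blk_t end_block)
{
	/* Wraps modulo 2^64 when end_block + 1 < seg_start. */
	return end_block - seg_start + 1;
}

/*
 * Hypothetical guarded variant: report zero blocks when the range
 * ends before seg_start.
 */
static blk_t guarded_nblocks(blk_t seg_start, blk_t end_block)
{
	if (end_block < seg_start)
		return 0;
	return end_block - seg_start + 1;
}
```

With seg_start = 8 and end_block = 2, the unchecked helper yields
2^64 - 5 rather than -5, while the guarded one yields 0.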

If the ending block is too small, for example smaller than the first
data block, this poses a risk of corrupting the filesystem's superblock.
In that case, exit successfully and assign the discarded size (0 in this
case) to range->len.

Although the start and len values in the user input range are too small,
a conservative strategy is adopted here to safely ignore them, which is
equivalent to a no-op; it will not perform any trimming and will not
return an error.

[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
 nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684

Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
v4 -> v5: assign discarded size to range->len
v3 -> v4: check end block and first data block
v2 -> v3: change to segment end check and update comments
v1 -> v2: continue do discard and comments

 fs/nilfs2/sufile.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
index 83f93337c01b..eceedca02697 100644
--- a/fs/nilfs2/sufile.c
+++ b/fs/nilfs2/sufile.c
@@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
 	else
 		end_block = start_block + len - 1;

+	if (end_block < nilfs->ns_first_data_block)
+		goto out;
+
 	segnum = nilfs_get_segnum_of_block(nilfs, start_block);
 	segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);

@@ -1191,6 +1194,7 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
 out_sem:
 	up_read(&NILFS_MDT(sufile)->mi_sem);

+out:
 	range->len = ndiscarded << nilfs->ns_blocksize_bits;
 	return ret;
 }
-- 
2.43.0

* Re: [PATCH v5] nilfs2: Fix potential block overflow that cause system hang
  2025-12-18 12:10               ` [PATCH v5] " Edward Adam Davis
@ 2025-12-18 12:42                 ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2025-12-18 12:42 UTC (permalink / raw)
  To: Edward Adam Davis
  Cc: axboe, kristian, linux-kernel, linux-nilfs, slava, syzkaller-bugs

On Thu, Dec 18, 2025 at 9:11 PM Edward Adam Davis <eadavis@qq.com> wrote:
>
> When a user executes the FITRIM command, an underflow can occur when
> calculating nblocks if end_block is too small. Since nblocks is of
> type sector_t, which is u64, a negative nblocks value will become a
> very large positive integer. This ultimately leads to the block layer
> function __blkdev_issue_discard() taking an excessively long time to
> process the bio chain, and the ns_segctor_sem lock remains held for a
> long period. This prevents other tasks from acquiring the ns_segctor_sem
> lock, resulting in the hang reported by syzbot in [1].
>
> If the ending block is too small, for example smaller than the first
> data block, this poses a risk of corrupting the filesystem's superblock.
> In that case, exit successfully and assign the discarded size (0 in this
> case) to range->len.
>
> Although the start and len values in the user input range are too small,
> a conservative strategy is adopted here to safely ignore them, which is
> equivalent to a no-op; it will not perform any trimming and will not
> return an error.
>
> [1]
> task:segctord state:D stack:28968 pid:6093 tgid:6093  ppid:2 task_flags:0x200040 flags:0x00080000
> Call Trace:
>  rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
>  nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
>  nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
>
> Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs")
> Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
> v4 -> v5: assign discarded size to range->len
> v3 -> v4: check end block and first data block
> v2 -> v3: change to segment end check and update comments
> v1 -> v2: continue do discard and comments
>
>  fs/nilfs2/sufile.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
> index 83f93337c01b..eceedca02697 100644
> --- a/fs/nilfs2/sufile.c
> +++ b/fs/nilfs2/sufile.c
> @@ -1093,6 +1093,9 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>         else
>                 end_block = start_block + len - 1;
>
> +       if (end_block < nilfs->ns_first_data_block)
> +               goto out;
> +
>         segnum = nilfs_get_segnum_of_block(nilfs, start_block);
>         segnum_end = nilfs_get_segnum_of_block(nilfs, end_block);
>
> @@ -1191,6 +1194,7 @@ int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>  out_sem:
>         up_read(&NILFS_MDT(sufile)->mi_sem);
>
> +out:
>         range->len = ndiscarded << nilfs->ns_blocksize_bits;
>         return ret;
>  }
> --
> 2.43.0

Thanks Edward!

I'll send this v5 patch upstream once I've tested it.

Ryusuke Konishi

Thread overview: 10+ messages
2025-12-17  0:46 [syzbot] [nilfs?] INFO: task hung in nilfs_segctor_thread (6) syzbot
2025-12-17  8:43 ` [PATCH] nilfs2: fix potential block overflow that cause system hang Edward Adam Davis
2025-12-17 13:38   ` Ryusuke Konishi
2025-12-18  2:28     ` [PATCH v2] nilfs2: Fix " Edward Adam Davis
2025-12-18  4:22       ` [PATCH v3] " Edward Adam Davis
2025-12-18 11:32         ` Ryusuke Konishi
2025-12-18 11:48           ` [PATCH v4] " Edward Adam Davis
2025-12-18 12:01             ` Ryusuke Konishi
2025-12-18 12:10               ` [PATCH v5] " Edward Adam Davis
2025-12-18 12:42                 ` Ryusuke Konishi
