* [PATCH] btrfs: fix hung task when cloning inline extent races with writeback
@ 2026-03-26 1:49 Deepanshu Kartikey
From: Deepanshu Kartikey @ 2026-03-26 1:49 UTC
To: syzbot+63056bf627663701bbbf
Cc: Deepanshu Kartikey, stable
From: Deepanshu Kartikey <Kartikey406@gmail.com>
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

When cloning an inline extent, clone_copy_inline_extent() calls
copy_inline_to_page() which locks an extent range in the destination
inode's io_tree, dirties a page with the inline data, and sets
BTRFS_INODE_NO_DELALLOC_FLUSH on the inode. At this point i_size is
still 0 since clone_finish_inode_update() has not been called yet.

Then clone_copy_inline_extent() calls start_transaction() which may
block waiting for the current transaction to commit. While blocked,
the transaction commit calls btrfs_start_delalloc_flush() which calls
try_to_writeback_inodes_sb(), queuing a kworker to flush the clone
destination inode.

The kworker calls btrfs_writepages() -> extent_writepage() and since
i_size is still 0, the dirty page appears to be beyond EOF. This
causes extent_writepage() to call folio_invalidate() ->
btrfs_invalidate_folio() -> btrfs_lock_extent() which blocks forever
because the clone operation holds that lock, creating a circular
deadlock:

  clone   -> waits for transaction commit to finish
  commit  -> waits for kworker writeback to finish
  kworker -> waits for extent lock held by clone

Additionally, any periodic background writeback that races with the
clone operation before i_size is updated will also block on the same
extent lock, causing a hung task warning.
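
As a rough illustration only (not part of the patch), the three-way wait
described above can be modelled in user space as a wait-for graph in which
each task is blocked on at most one other task. The enum labels and
has_wait_cycle() are hypothetical names invented for this sketch; nothing
here corresponds to a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three tasks in the report. */
enum task { TASK_CLONE, TASK_COMMIT, TASK_KWORKER, TASK_COUNT };

/*
 * waits_for[t] is the task that t is blocked on, or -1 if t can make
 * progress. Returns true only if following the chain from 'start'
 * leads back to 'start', i.e. 'start' is part of a circular wait.
 */
static bool has_wait_cycle(const int waits_for[TASK_COUNT], enum task start)
{
	int cur = (int)start;

	assert(start < TASK_COUNT);

	/* A chain longer than TASK_COUNT hops cannot reach 'start' anew. */
	for (size_t steps = 0; steps < TASK_COUNT; steps++) {
		cur = waits_for[cur];
		if (cur < 0)
			return false;	/* chain ends at a runnable task */
		if (cur == (int)start)
			return true;	/* came back around: deadlock */
	}
	return false;
}
```

With waits_for = { TASK_COMMIT, TASK_KWORKER, TASK_CLONE } the function
reports a cycle for TASK_CLONE. If the kworker entry is -1 instead (the
writeback does not wait on the extent lock), the chain ends and no cycle
is reported.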

The flag BTRFS_INODE_NO_DELALLOC_FLUSH was introduced by commit
3d45f221ce62 ("btrfs: fix deadlock when cloning inline extent and low
on free metadata space") to prevent this deadlock, but it is only
checked inside start_delalloc_inodes(), which is only reached through
the btrfs metadata reclaim path. The transaction commit path goes
through try_to_writeback_inodes_sb(), a VFS function that bypasses
start_delalloc_inodes() entirely, so the flag is never checked there.

Fix this by checking BTRFS_INODE_NO_DELALLOC_FLUSH at the top of
btrfs_writepages() and returning early if set. This catches all
writeback paths since every writeback on a btrfs inode eventually
calls btrfs_writepages(). The inode will be safely written after the
clone operation finishes and clears the flag, at which point all
locks are released and i_size is properly updated.
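
The intended effect of the early return can be sketched with a small
user-space model. Everything below is hypothetical: struct model_inode,
MODEL_NO_DELALLOC_FLUSH, MODEL_WOULD_BLOCK and model_writepages() are
stand-ins made up for this example, and the model only captures the intent
described above, not whether the kernel change is sufficient in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the BTRFS_INODE_NO_DELALLOC_FLUSH runtime flag. */
#define MODEL_NO_DELALLOC_FLUSH (1UL << 0)

/* Returned where the real writeback path would sleep on the extent lock. */
enum { MODEL_WOULD_BLOCK = -1 };

/* Hypothetical, heavily simplified view of the clone destination inode. */
struct model_inode {
	unsigned long runtime_flags;
	bool extent_locked_by_clone;
	int dirty_pages;
};

static int model_writepages(struct model_inode *inode)
{
	assert(inode != NULL);

	/* Early return: leave the pages dirty for a later writeback pass. */
	if (inode->runtime_flags & MODEL_NO_DELALLOC_FLUSH)
		return 0;

	/* Without the check, writeback would wait for the clone's lock. */
	if (inode->extent_locked_by_clone)
		return MODEL_WOULD_BLOCK;

	inode->dirty_pages = 0;	/* pages written back */
	return 0;
}
```

In this model, a call made while the flag is set returns 0 and leaves
dirty_pages untouched. A later call, made once the flag is cleared and the
lock is dropped, writes the pages out.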

Also change the local variable type from 'struct inode *' to
'struct btrfs_inode *' to avoid the double BTRFS_I() conversion.

Fixes: 3d45f221ce62 ("btrfs: fix deadlock when cloning inline extent and low on free metadata space")
Cc: stable@vger.kernel.org
Reported-by: syzbot+63056bf627663701bbbf@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
fs/btrfs/extent_io.c | 39 ++++++++++++++++++++++++++++++++++++---
1 file changed, 36 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 5f97a3d2a8d7..f7df7c0c8955 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2698,21 +2698,54 @@ void extent_write_locked_range(struct inode *inode, const struct folio *locked_f
int btrfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
{
- struct inode *inode = mapping->host;
+ struct btrfs_inode *inode = BTRFS_I(mapping->host);
int ret = 0;
struct btrfs_bio_ctrl bio_ctrl = {
.wbc = wbc,
.opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
};
+ /*
+ * If this inode is being used for a clone/reflink operation that
+ * copied an inline extent into a page of the destination inode, skip
+ * writeback to avoid a deadlock or a long blocked task.
+ *
+ * The clone operation holds the extent range locked in the inode's
+ * io_tree for its entire duration. Any writeback attempt on this
+ * inode will block trying to lock that same extent range inside
+ * writepage_delalloc() or btrfs_invalidate_folio(), causing a
+ * hung task.
+ *
+ * When writeback is triggered from the transaction commit path via
+ * btrfs_start_delalloc_flush() -> try_to_writeback_inodes_sb(),
+ * this becomes a true circular deadlock:
+ *
+ * clone -> waits for transaction commit to finish
+ * commit -> waits for kworker writeback to finish
+ * kworker -> waits for extent lock held by clone
+ *
+ * The flag BTRFS_INODE_NO_DELALLOC_FLUSH was already checked in
+ * start_delalloc_inodes() but only for the btrfs metadata reclaim
+ * path. The transaction commit path goes through
+ * try_to_writeback_inodes_sb() which bypasses that check entirely
+ * and calls btrfs_writepages() directly.
+ *
+ * By checking the flag here we catch all writeback paths. The inode
+ * will be safely written after the clone operation finishes and
+ * clears BTRFS_INODE_NO_DELALLOC_FLUSH, at which point all locks
+ * are released and writeback can proceed normally.
+ */
+ if (test_bit(BTRFS_INODE_NO_DELALLOC_FLUSH, &inode->runtime_flags))
+ return 0;
+
/*
* Allow only a single thread to do the reloc work in zoned mode to
* protect the write pointer updates.
*/
- btrfs_zoned_data_reloc_lock(BTRFS_I(inode));
+ btrfs_zoned_data_reloc_lock(inode);
ret = extent_write_cache_pages(mapping, &bio_ctrl);
submit_write_bio(&bio_ctrl, ret);
- btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
+ btrfs_zoned_data_reloc_unlock(inode);
return ret;
}
--
2.43.0
* Re: [syzbot] [btrfs?] INFO: task hung in btrfs_invalidate_folio (3)
From: syzbot @ 2026-03-26 2:46 UTC
To: kartikey406, linux-kernel, stable, syzkaller-bugs
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: task hung in btrfs_invalidate_folio
INFO: task kworker/u8:7:151 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u8:7 state:D stack:21504 pid:151 tgid:151 ppid:2 task_flags:0x4208060 flags:0x00080000
Workqueue: writeback wb_workfn (flush-btrfs-6)
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wait_extent_bit fs/btrfs/extent-io-tree.c:811 [inline]
btrfs_lock_extent_bits+0x59c/0x700 fs/btrfs/extent-io-tree.c:1914
btrfs_lock_extent fs/btrfs/extent-io-tree.h:152 [inline]
btrfs_invalidate_folio+0x43d/0xc40 fs/btrfs/inode.c:7718
extent_writepage fs/btrfs/extent_io.c:1852 [inline]
extent_write_cache_pages fs/btrfs/extent_io.c:2580 [inline]
btrfs_writepages+0x1369/0x24a0 fs/btrfs/extent_io.c:2746
do_writepages+0x32e/0x550 mm/page-writeback.c:2554
__writeback_single_inode+0x133/0x11a0 fs/fs-writeback.c:1750
writeback_sb_inodes+0x995/0x19d0 fs/fs-writeback.c:2042
wb_writeback+0x456/0xb70 fs/fs-writeback.c:2227
wb_do_writeback fs/fs-writeback.c:2374 [inline]
wb_workfn+0x41a/0xf60 fs/fs-writeback.c:2414
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
INFO: task syz.0.22:6562 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.22 state:D stack:22752 pid:6562 tgid:6561 ppid:6245 task_flags:0x400140 flags:0x00080002
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wait_current_trans+0x39f/0x590 fs/btrfs/transaction.c:535
start_transaction+0x6a7/0x1650 fs/btrfs/transaction.c:705
clone_copy_inline_extent fs/btrfs/reflink.c:299 [inline]
btrfs_clone+0x128a/0x24d0 fs/btrfs/reflink.c:529
btrfs_clone_files+0x271/0x3f0 fs/btrfs/reflink.c:750
btrfs_remap_file_range+0x76b/0x1320 fs/btrfs/reflink.c:903
vfs_copy_file_range+0xda7/0x1390 fs/read_write.c:1600
__do_sys_copy_file_range fs/read_write.c:1683 [inline]
__se_sys_copy_file_range+0x2fb/0x480 fs/read_write.c:1650
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7faf436fc799
RSP: 002b:00007faf42d56028 EFLAGS: 00000246 ORIG_RAX: 0000000000000146
RAX: ffffffffffffffda RBX: 00007faf43975fa0 RCX: 00007faf436fc799
RDX: 0000000000000005 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007faf43792c99 R08: 0000000000000863 R09: 0000000000000000
R10: 00002000000000c0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007faf43976038 R14: 00007faf43975fa0 R15: 00007fffc650d9b8
</TASK>
INFO: task syz.0.22:6632 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.22 state:D stack:24736 pid:6632 tgid:6561 ppid:6245 task_flags:0x400040 flags:0x00080002
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wb_wait_for_completion+0x3e8/0x790 fs/fs-writeback.c:227
__writeback_inodes_sb_nr+0x24c/0x2d0 fs/fs-writeback.c:2838
try_to_writeback_inodes_sb+0x9a/0xc0 fs/fs-writeback.c:2886
btrfs_start_delalloc_flush fs/btrfs/transaction.c:2175 [inline]
btrfs_commit_transaction+0x82e/0x31a0 fs/btrfs/transaction.c:2364
btrfs_ioctl+0xca7/0xd00 fs/btrfs/ioctl.c:5212
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xff/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7faf436fc799
RSP: 002b:00007faf42d35028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007faf43976090 RCX: 00007faf436fc799
RDX: 0000000000000000 RSI: 0000000000009408 RDI: 0000000000000004
RBP: 00007faf43792c99 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007faf43976128 R14: 00007faf43976090 R15: 00007fffc650d9b8
</TASK>
Showing all locks held in the system:
1 lock held by khungtaskd/38:
#0: ffffffff8ddcba00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
#0: ffffffff8ddcba00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:850 [inline]
#0: ffffffff8ddcba00 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
2 locks held by kworker/u8:7/151:
#0: ffff88801aac4138 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3251 [inline]
#0: ffff88801aac4138 ((wq_completion)writeback){+.+.}-{0:0}, at: process_scheduled_works+0xa52/0x18c0 kernel/workqueue.c:3359
#1: ffffc90003a97c40 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3252 [inline]
#1: ffffc90003a97c40 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa8d/0x18c0 kernel/workqueue.c:3359
2 locks held by getty/5555:
#0: ffff8880377e50a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc90003e7e2e0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x462/0x13c0 drivers/tty/n_tty.c:2211
3 locks held by syz-executor/6249:
4 locks held by syz.0.22/6562:
#0: ffff8880625fe480 (sb_writers#12){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2710 [inline]
#0: ffff8880625fe480 (sb_writers#12){.+.+}-{0:0}, at: vfs_copy_file_range+0x9bb/0x1390 fs/read_write.c:1588
#1: ffff888058eed8c8 (&sb->s_type->i_mutex_key#24){+.+.}-{4:4}, at: inode_lock include/linux/fs.h:1028 [inline]
#1: ffff888058eed8c8 (&sb->s_type->i_mutex_key#24){+.+.}-{4:4}, at: btrfs_inode_lock+0x51/0xe0 fs/btrfs/inode.c:369
#2: ffff888058eed728 (&ei->i_mmap_lock){++++}-{4:4}, at: btrfs_inode_lock+0xcb/0xe0 fs/btrfs/inode.c:372
#3: ffff8880625fe770 (sb_internal#2){.+.+}-{0:0}, at: clone_copy_inline_extent fs/btrfs/reflink.c:299 [inline]
#3: ffff8880625fe770 (sb_internal#2){.+.+}-{0:0}, at: btrfs_clone+0x128a/0x24d0 fs/btrfs/reflink.c:529
3 locks held by syz.0.22/6632:
#0: ffff888045007118 (btrfs_trans_num_writers){++++}-{0:0}, at: join_transaction+0x41b/0xc90 fs/btrfs/transaction.c:298
#1: ffff888045007140 (btrfs_trans_num_extwriters){++++}-{0:0}, at: join_transaction+0x41b/0xc90 fs/btrfs/transaction.c:298
#2: ffff8880625fe0d0 (&type->s_umount_key#56){++++}-{4:4}, at: try_to_writeback_inodes_sb+0x22/0xc0 fs/fs-writeback.c:2883
1 lock held by udevd/6608:
#0: ffff8880226e58b0 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: inode_lock_shared include/linux/fs.h:1043 [inline]
#0: ffff8880226e58b0 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: blkdev_read_iter+0x2ff/0x440 block/fops.c:854
1 lock held by btrfs-transacti/6627:
#0: ffff888045004d98 (&fs_info->transaction_kthread_mutex){+.+.}-{4:4}, at: transaction_kthread+0xe4/0x450 fs/btrfs/disk-io.c:1515
2 locks held by syz.3.215/10113:
3 locks held by syz.5.214/10126:
3 locks held by syz.4.216/10132:
2 locks held by udevadm/10146:
=============================================
NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 38 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
nmi_cpu_backtrace+0x274/0x2d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x17a/0x300 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:161 [inline]
__sys_info lib/sys_info.c:157 [inline]
sys_info+0x135/0x170 lib/sys_info.c:165
check_hung_uninterruptible_tasks kernel/hung_task.c:346 [inline]
watchdog+0xfd9/0x1030 kernel/hung_task.c:515
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 10146 Comm: udevadm Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
RIP: 0010:__lock_acquire+0xa9d/0x2cf0 kernel/locking/lockdep.c:5234
Code: fa 02 85 c0 74 1c 83 3d d4 21 ca 0d 00 75 13 48 8d 3d a7 32 cd 0d 48 c7 c6 65 71 67 8d 67 48 0f b9 3a 90 31 c0 48 83 78 40 00 <0f> 84 5a 1b 00 00 48 09 dd 41 8b 45 20 89 c1 81 e1 00 80 04 00 81
RSP: 0018:ffffc90006b9f6f8 EFLAGS: 00000082
RAX: ffffffff92f73c88 RBX: 00000000e60eadd5 RCX: 0000000010530efd
RDX: 000000005f479b93 RSI: 0000000099d18f2c RDI: ffff88802864db80
RBP: c57865e900000000 R08: ffffffff81767e65 R09: ffffffff8ddcba00
R10: ffffc90006b9f9d8 R11: ffffffff81af90c0 R12: ffff88802864e738
R13: ffff88802864e738 R14: ffff88802864db80 R15: 0000000000000000
FS: 00007f1e8ff33880(0000) GS:ffff888126339000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555d4ad6a5f8 CR3: 0000000064d0e000 CR4: 00000000003526f0
Call Trace:
<TASK>
lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868
rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
rcu_read_lock include/linux/rcupdate.h:850 [inline]
class_rcu_constructor include/linux/rcupdate.h:1193 [inline]
unwind_next_frame+0xc2/0x23c0 arch/x86/kernel/unwind_orc.c:495
arch_stack_walk+0x11b/0x150 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0xa9/0x100 kernel/stacktrace.c:122
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:253 [inline]
__kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:2685 [inline]
slab_free mm/slub.c:6165 [inline]
kmem_cache_free+0x185/0x6b0 mm/slub.c:6295
file_free fs/file_table.c:71 [inline]
__fput+0x6d7/0xa90 fs/file_table.c:482
fput_close_sync+0x11f/0x240 fs/file_table.c:574
__do_sys_close fs/open.c:1509 [inline]
__se_sys_close fs/open.c:1494 [inline]
__x64_sys_close+0x7e/0x110 fs/open.c:1494
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1e9008fa67
Code: 44 00 00 48 83 ec 10 48 63 ff 45 31 c9 45 31 c0 6a 01 31 c9 e8 ca 19 f9 ff 48 83 c4 18 c3 0f 1f 44 00 00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 61 b3 0d 00 f7 d8 64 89 02 b8
RSP: 002b:00007fff339659e8 EFLAGS: 00000297 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000555d4ad572a0 RCX: 00007f1e9008fa67
RDX: 00007f1e90169ea0 RSI: 0000555d4ad68be0 RDI: 0000000000000005
RBP: 00007f1e90169ff0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000000
R13: 3d45505954564544 R14: 3d5845444e494649 R15: 3d454d414e564544
</TASK>
Tested on:
commit: 0138af24 Merge tag 'erofs-for-7.0-rc6-fixes' of git://..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1049a1d6580000
kernel config: https://syzkaller.appspot.com/x/.config?x=45cb3c58fd963c27
dashboard link: https://syzkaller.appspot.com/bug?extid=63056bf627663701bbbf
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=15b16a06580000