linux-kernel.vger.kernel.org archive mirror
* [syzbot] [block?] BUG: sleeping function called from invalid context in __xas_nomem (2)
From: syzbot @ 2025-06-27 16:20 UTC
  To: axboe, linux-block, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    86731a2a651e Linux 6.16-rc3
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1630bb0c580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4ad206eb0100c6a2
dashboard link: https://syzkaller.appspot.com/bug?extid=ea4c8fd177a47338881a
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-86731a2a.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9e7ff33d1e1f/vmlinux-86731a2a.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1bb9a09c88bb/bzImage-86731a2a.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ea4c8fd177a47338881a@syzkaller.appspotmail.com

BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 6843, name: syz.1.211
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
1 lock held by syz.1.211/6843:
 #0: ffffffff8e5c47c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_release include/linux/rcupdate.h:341 [inline]
 #0: ffffffff8e5c47c0 (rcu_read_lock){....}-{1:3}, at: rcu_read_unlock include/linux/rcupdate.h:871 [inline]
 #0: ffffffff8e5c47c0 (rcu_read_lock){....}-{1:3}, at: brd_insert_page drivers/block/brd.c:65 [inline]
 #0: ffffffff8e5c47c0 (rcu_read_lock){....}-{1:3}, at: brd_rw_bvec drivers/block/brd.c:121 [inline]
 #0: ffffffff8e5c47c0 (rcu_read_lock){....}-{1:3}, at: brd_submit_bio+0x935/0x10a0 drivers/block/brd.c:191
CPU: 1 UID: 0 PID: 6843 Comm: syz.1.211 Not tainted 6.16.0-rc3-syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
 __might_resched+0x3c0/0x5e0 kernel/sched/core.c:8800
 might_alloc include/linux/sched/mm.h:321 [inline]
 might_alloc include/linux/sched/mm.h:316 [inline]
 slab_pre_alloc_hook mm/slub.c:4099 [inline]
 slab_alloc_node mm/slub.c:4177 [inline]
 kmem_cache_alloc_lru_noprof+0x2d2/0x3b0 mm/slub.c:4216
 __xas_nomem+0x266/0x670 lib/xarray.c:341
 __xa_cmpxchg_raw lib/xarray.c:1786 [inline]
 __xa_cmpxchg+0x119/0x290 lib/xarray.c:1766
 brd_insert_page drivers/block/brd.c:72 [inline]
 brd_rw_bvec drivers/block/brd.c:121 [inline]
 brd_submit_bio+0x9ce/0x10a0 drivers/block/brd.c:191
 __submit_bio+0x301/0x690 block/blk-core.c:644
 __submit_bio_noacct block/blk-core.c:690 [inline]
 submit_bio_noacct_nocheck+0x852/0xd30 block/blk-core.c:753
 submit_bio_noacct+0x50d/0x1eb0 block/blk-core.c:874
 __blkdev_direct_IO block/fops.c:257 [inline]
 blkdev_direct_IO+0x1647/0x1ff0 block/fops.c:433
 blkdev_direct_write block/fops.c:701 [inline]
 blkdev_write_iter+0x6fd/0xdf0 block/fops.c:768
 do_iter_readv_writev+0x654/0x950 fs/read_write.c:827
 vfs_writev+0x35f/0xde0 fs/read_write.c:1057
 do_writev+0x132/0x340 fs/read_write.c:1103
 do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
 __do_fast_syscall_32+0x7c/0x3a0 arch/x86/entry/syscall_32.c:306
 do_fast_syscall_32+0x32/0x80 arch/x86/entry/syscall_32.c:331
 entry_SYSENTER_compat_after_hwframe+0x84/0x8e
RIP: 0023:0xf7f11579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f503655c EFLAGS: 00000296 ORIG_RAX: 0000000000000092
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000080000a40
RDX: 0000000000000021 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
----------------
Code disassembly (best guess), 2 bytes skipped:
   0:	10 06                	adc    %al,(%rsi)
   2:	03 74 b4 01          	add    0x1(%rsp,%rsi,4),%esi
   6:	10 07                	adc    %al,(%rdi)
   8:	03 74 b0 01          	add    0x1(%rax,%rsi,4),%esi
   c:	10 08                	adc    %cl,(%rax)
   e:	03 74 d8 01          	add    0x1(%rax,%rbx,8),%esi
  1e:	00 51 52             	add    %dl,0x52(%rcx)
  21:	55                   	push   %rbp
  22:	89 e5                	mov    %esp,%ebp
  24:	0f 34                	sysenter
  26:	cd 80                	int    $0x80
* 28:	5d                   	pop    %rbp <-- trapping instruction
  29:	5a                   	pop    %rdx
  2a:	59                   	pop    %rcx
  2b:	c3                   	ret
  2c:	90                   	nop
  2d:	90                   	nop
  2e:	90                   	nop
  2f:	90                   	nop
  30:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  37:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* [PATCH] brd: fix sleeping memory allocation in brd_insert_page()
From: Tetsuo Handa @ 2025-06-28  8:36 UTC
  To: Jens Axboe, Yu Kuai, Christoph Hellwig, LKML

syzbot is reporting that brd_insert_page() is calling
__xa_cmpxchg(__GFP_DIRECT_RECLAIM) with spinlock and RCU lock held.
Change __xa_cmpxchg() to use GFP_NOWAIT | __GFP_NOWARN, for it is likely
that __xa_cmpxchg() succeeds because of preceding alloc_page().

Fixes: bbcacab2e8ee ("brd: avoid extra xarray lookups on first write")
Reported-by: syzbot+ea4c8fd177a47338881a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ea4c8fd177a47338881a
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/block/brd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index b1be6c510372..ed3eb931750c 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -70,7 +70,7 @@ static struct page *brd_insert_page(struct brd_device *brd, sector_t sector,
 
 	xa_lock(&brd->brd_pages);
 	ret = __xa_cmpxchg(&brd->brd_pages, sector >> PAGE_SECTORS_SHIFT, NULL,
-			page, gfp);
+			page, GFP_NOWAIT | __GFP_NOWARN);
 	if (ret) {
 		xa_unlock(&brd->brd_pages);
 		__free_page(page);
-- 
2.50.0




* Re: [PATCH] brd: fix sleeping memory allocation in brd_insert_page()
From: Tetsuo Handa @ 2025-06-28  9:39 UTC
  To: Jens Axboe, Yu Kuai, Christoph Hellwig, LKML

On 2025/06/28 17:36, Tetsuo Handa wrote:
> syzbot is reporting that brd_insert_page() is calling
> __xa_cmpxchg(__GFP_DIRECT_RECLAIM) with spinlock and RCU lock held.

Hmm. Holding spinlock itself is OK because xa_lock() is a requirement.

> Change __xa_cmpxchg() to use GFP_NOWAIT | __GFP_NOWARN, for it is likely
> that __xa_cmpxchg() succeeds because of preceding alloc_page().

Since this gfp flag is for allocating index array, it should use
__GFP_DIRECT_RECLAIM if possible. Then, deferring RCU lock if possible
makes sense. Then, I wonder what this RCU lock is protecting...
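
To make the flag discussion concrete, here is a minimal user-space sketch of the check that fired. It assumes the allocator decides whether it may sleep purely from the __GFP_DIRECT_RECLAIM bit, and that the debug check complains when such a mask is used with a non-zero RCU nest depth (the report shows "RCU nest depth: 1, expected: 0"). The type, the MODEL_* bit values and both function names are hypothetical; the real definitions live in include/linux/gfp_types.h, and the real check also looks at preemption and IRQ state, which this model ignores.

```c
#include <stdbool.h>

/*
 * Illustrative bit values only. The kernel's real values are defined in
 * include/linux/gfp_types.h and differ from these.
 */
typedef unsigned int gfp_model_t;

#define MODEL_DIRECT_RECLAIM 0x1u /* caller may sleep to reclaim memory */
#define MODEL_KSWAPD_RECLAIM 0x2u /* may wake background reclaim, no sleep */
#define MODEL_NOWARN         0x4u /* suppress allocation-failure warnings */

/* Hypothetical stand-ins for GFP_NOIO and GFP_NOWAIT. */
#define MODEL_GFP_NOIO   (MODEL_DIRECT_RECLAIM | MODEL_KSWAPD_RECLAIM)
#define MODEL_GFP_NOWAIT (MODEL_KSWAPD_RECLAIM)

/* Models the "may this allocation block?" decision. */
static bool model_allows_blocking(gfp_model_t gfp)
{
	return (gfp & MODEL_DIRECT_RECLAIM) != 0;
}

/*
 * Returns true when the "sleeping function called from invalid context"
 * warning would fire in this simplified model: a blocking-capable mask
 * used inside an RCU read-side critical section.
 */
static bool model_would_warn(gfp_model_t gfp, int rcu_nest_depth)
{
	return model_allows_blocking(gfp) && rcu_nest_depth > 0;
}
```

In this model, a GFP_NOIO-like mask used at RCU nest depth 1 warns, while GFP_NOWAIT | __GFP_NOWARN does not. The price of the latter is that the index-node allocation cannot reclaim memory directly, so it is more likely to fail under memory pressure.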



* Re: [PATCH] brd: fix sleeping memory allocation in brd_insert_page()
From: Tetsuo Handa @ 2025-06-28 11:03 UTC
  To: Jens Axboe, Yu Kuai, Christoph Hellwig, LKML

On 2025/06/28 18:39, Tetsuo Handa wrote:
> On 2025/06/28 17:36, Tetsuo Handa wrote:
>> syzbot is reporting that brd_insert_page() is calling
>> __xa_cmpxchg(__GFP_DIRECT_RECLAIM) with spinlock and RCU lock held.
> 
> Hmm. Holding spinlock itself is OK because xa_lock() is a requirement.
> 
>> Change __xa_cmpxchg() to use GFP_NOWAIT | __GFP_NOWARN, for it is likely
>> that __xa_cmpxchg() succeeds because of preceding alloc_page().
> 
> Since this gfp flag is for allocating index array, it should use
> __GFP_DIRECT_RECLAIM if possible. Then, deferring RCU lock if possible
> makes sense. Then, I wonder what this RCU lock is protecting...
> 

OK. I assume that the "concurrent discard" in
https://lkml.kernel.org/20250628011459.832760-1-yukuai1@huaweicloud.com means
brd_do_discard().

Taking rcu_read_lock() in brd_insert_page() before xa_unlock() is called keeps the
__free_page() in brd_free_one_page(), which brd_do_discard() queues via call_rcu(),
from running even if the page allocated by alloc_page() and stored into
brd->brd_pages by __xa_cmpxchg() is removed by __xa_erase() before brd_rw_bvec()
calls memcpy_{to,from}_page()/memset(). That lets brd_rw_bvec() keep using the
page returned by brd_insert_page().
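
The lifetime rule described here can be sketched with a toy user-space model. This is not real RCU: it assumes a single global reader count and a single queued callback, and it defers the free until no reader is active at all, which is stricter than a real grace period. All toy_* names are hypothetical stand-ins for rcu_read_lock(), rcu_read_unlock(), call_rcu() and the end of a grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	bool freed;
};

struct toy_rcu {
	int readers;              /* active read-side critical sections */
	struct toy_page *pending; /* page queued by the call_rcu() stand-in */
};

static void toy_read_lock(struct toy_rcu *rcu)
{
	rcu->readers++;
}

static void toy_read_unlock(struct toy_rcu *rcu)
{
	assert(rcu->readers > 0);
	rcu->readers--;
}

/* Stand-in for call_rcu(): queue the page, never free it right away. */
static void toy_call_rcu(struct toy_rcu *rcu, struct toy_page *page)
{
	assert(rcu->pending == NULL); /* the model tracks one callback only */
	rcu->pending = page;
}

/* Stand-in for a grace period ending; returns true if the queued free ran. */
static bool toy_grace_period(struct toy_rcu *rcu)
{
	if (rcu->readers > 0 || rcu->pending == NULL)
		return false;
	rcu->pending->freed = true;
	rcu->pending = NULL;
	return true;
}
```

A writer that calls toy_read_lock() before the discard path reaches toy_call_rcu() can keep touching the page: toy_grace_period() returns false until toy_read_unlock() has run, and only then marks the page as freed.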

One thing about the above expectation worries me, since I don't know the
details of xarray.

__xa_cmpxchg() calls __xa_cmpxchg_raw() with xa_lock already held.
__xa_cmpxchg_raw() always calls __xas_nomem() with xa_lock already held.
__xas_nomem() might temporarily release xa_lock for allocating memory if
__GFP_DIRECT_RECLAIM is specified.
__xa_cmpxchg_raw() might store "entry" at xas_store() before calling __xas_nomem().

Then, is there a possibility that __xas_nomem() temporarily releases xa_lock for
allocating memory after __xa_cmpxchg_raw() has already called xas_store()?
Unless there is a guarantee that __xas_nomem() never releases xa_lock once
__xa_cmpxchg_raw() has called xas_store(), there will be a race window in which
the page allocated by alloc_page() and stored into brd->brd_pages by __xa_cmpxchg() is
removed by __xa_erase() from brd_do_discard(), and the __free_page() in
brd_free_one_page(), queued via call_rcu() by brd_do_discard(), fires before
brd_insert_page() calls rcu_read_lock() immediately after returning from __xa_cmpxchg().

Also, what serializes concurrent brd_insert_page() calls? While __xas_nomem() has
temporarily released xa_lock for allocating memory, two threads might concurrently
call kmem_cache_alloc_lru() from __xas_nomem().
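
The two questions above can be explored with a toy user-space model of the retry loop. It rests on an assumption taken from a reading of lib/xarray.c, not on anything stated in this thread: __xas_nomem() appears to allocate (and so to drop xa_lock) only when the preceding xas_store() failed with -ENOMEM, in which case the entry has not been stored, and the loop then reloads the slot before trying again. Every toy_* name, the single-slot layout and the "racer" hook are hypothetical, and allocation is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_xa {
	bool locked;              /* models xa_lock being held */
	bool have_node;           /* models an index node being available */
	void *slot;               /* the single slot this model tracks */
	void *racer;              /* entry another thread stores while unlocked */
	int lock_drops;           /* times the lock was released to allocate */
	bool dropped_after_store; /* set if the lock was dropped with entry stored */
};

/* Store succeeds only if a node is available; on failure nothing is written. */
static bool toy_store(struct toy_xa *xa, void *entry)
{
	if (!xa->have_node)
		return false;
	xa->slot = entry;
	return true;
}

/* Models __xas_nomem(): allocate only after a failed store; true means retry. */
static bool toy_nomem(struct toy_xa *xa, bool store_failed, bool may_block,
		      void *entry)
{
	if (!store_failed)
		return false;
	if (may_block) {
		xa->locked = false;
		xa->lock_drops++;
		if (entry && xa->slot == entry)
			xa->dropped_after_store = true;
		if (xa->racer && !xa->slot)
			xa->slot = xa->racer; /* concurrent insert wins the slot */
		xa->locked = true;
	}
	xa->have_node = true; /* allocation assumed to succeed */
	return true;
}

/* Models the loop in __xa_cmpxchg_raw(); returns the value found in the slot. */
static void *toy_cmpxchg(struct toy_xa *xa, void *old, void *entry,
			 bool may_block)
{
	void *curr;
	bool failed;

	assert(xa->locked);
	do {
		failed = false;
		curr = xa->slot; /* reloaded on every retry */
		if (curr == old)
			failed = !toy_store(xa, entry);
	} while (toy_nomem(xa, failed, may_block, entry));
	return curr;
}
```

Under these assumptions the lock is never dropped while the new entry is already visible, and a thread that loses the slot while unlocked gets the other thread's entry back from toy_cmpxchg(), which corresponds to the "if (ret)" branch in brd_insert_page() that frees the freshly allocated page. Whether the real xarray code gives the same guarantees should be checked against lib/xarray.c.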



* Re: [PATCH] brd: fix sleeping memory allocation in brd_insert_page()
From: Christoph Hellwig @ 2025-06-30  5:36 UTC
  To: Tetsuo Handa; +Cc: Jens Axboe, Yu Kuai, Christoph Hellwig, LKML

I think the correct fix is "brd: fix sleeping function called from invalid
context in brd_insert_page()" from Yu Kuai.  Please take a look at that
and double check it, though.


