* [BUG] -next lockdep invalid wait context
From: Paul E. McKenney @ 2024-10-30 21:05 UTC (permalink / raw)
  To: linux-next, linux-kernel, kasan-dev, linux-mm
  Cc: sfr, bigeasy, longman, boqun.feng, elver, cl, penberg, rientjes,
	iamjoonsoo.kim, akpm, vbabka

Hello!

The next-20241030 release gets the splat shown below when running
scftorture in a preemptible kernel.  This bisects to this commit:

560af5dc839e ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING")

However, all that commit does is enable lockdep to find a pre-existing problem.

The obvious way to fix this is to make the kmem_cache structure's
cpu_slab field's ->lock be a raw spinlock, but this might not be what
we want for real-time response.

This can be reproduced deterministically as follows:

tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration 2 --configs PREEMPT --kconfig CONFIG_NR_CPUS=64 --memory 7G --trust-make --kasan --bootargs "scftorture.nthreads=64 torture.disable_onoff_at_boot csdlock_debug=1"

I doubt that the number of CPUs or amount of memory makes any difference,
but that is what I used.

Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

[   35.659746] =============================
[   35.659746] [ BUG: Invalid wait context ]
[   35.659746] 6.12.0-rc5-next-20241029 #57233 Not tainted
[   35.659746] -----------------------------
[   35.659746] swapper/37/0 is trying to lock:
[   35.659746] ffff8881ff4bf2f0 (&c->lock){....}-{3:3}, at: put_cpu_partial+0x49/0x1b0
[   35.659746] other info that might help us debug this:
[   35.659746] context-{2:2}
[   35.659746] no locks held by swapper/37/0.
[   35.659746] stack backtrace:
[   35.659746] CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Not tainted 6.12.0-rc5-next-20241029 #57233
[   35.659746] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   35.659746] Call Trace:
[   35.659746]  <IRQ>
[   35.659746]  dump_stack_lvl+0x68/0xa0
[   35.659746]  __lock_acquire+0x8fd/0x3b90
[   35.659746]  ? start_secondary+0x113/0x210
[   35.659746]  ? __pfx___lock_acquire+0x10/0x10
[   35.659746]  ? __pfx___lock_acquire+0x10/0x10
[   35.659746]  ? __pfx___lock_acquire+0x10/0x10
[   35.659746]  ? __pfx___lock_acquire+0x10/0x10
[   35.659746]  lock_acquire+0x19b/0x520
[   35.659746]  ? put_cpu_partial+0x49/0x1b0
[   35.659746]  ? __pfx_lock_acquire+0x10/0x10
[   35.659746]  ? __pfx_lock_release+0x10/0x10
[   35.659746]  ? lock_release+0x20f/0x6f0
[   35.659746]  ? __pfx_lock_release+0x10/0x10
[   35.659746]  ? lock_release+0x20f/0x6f0
[   35.659746]  ? kasan_save_track+0x14/0x30
[   35.659746]  put_cpu_partial+0x52/0x1b0
[   35.659746]  ? put_cpu_partial+0x49/0x1b0
[   35.659746]  ? __pfx_scf_handler_1+0x10/0x10
[   35.659746]  __flush_smp_call_function_queue+0x2d2/0x600
[   35.659746]  __sysvec_call_function_single+0x50/0x280
[   35.659746]  sysvec_call_function_single+0x6b/0x80
[   35.659746]  </IRQ>
[   35.659746]  <TASK>
[   35.659746]  asm_sysvec_call_function_single+0x1a/0x20
[   35.659746] RIP: 0010:default_idle+0xf/0x20
[   35.659746] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 33 80 3e 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
[   35.659746] RSP: 0018:ffff888100a9fe68 EFLAGS: 00000202
[   35.659746] RAX: 0000000000040d75 RBX: 0000000000000025 RCX: ffffffffab83df45
[   35.659746] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa8a5f7ba
[   35.659746] RBP: dffffc0000000000 R08: 0000000000000001 R09: ffffed103fe96c3c
[   35.659746] R10: ffff8881ff4b61e3 R11: 0000000000000000 R12: ffffffffad13f1d0
[   35.659746] R13: 1ffff11020153fd2 R14: 0000000000000000 R15: 0000000000000000
[   35.659746]  ? ct_kernel_exit.constprop.0+0xc5/0xf0
[   35.659746]  ? do_idle+0x2fa/0x3b0
[   35.659746]  default_idle_call+0x6d/0xb0
[   35.659746]  do_idle+0x2fa/0x3b0
[   35.659746]  ? __pfx_do_idle+0x10/0x10
[   35.659746]  cpu_startup_entry+0x4f/0x60
[   35.659746]  start_secondary+0x1bc/0x210
[   35.659746]  common_startup_64+0x12c/0x138
[   35.659746]  </TASK>
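
For readers decoding the splat above, the following is a minimal Python sketch (not kernel code) of how the `{outer:inner}` annotation on the lock and the `context-{2:2}` line can be read. The numeric wait types are assumed to mirror the kernel's `enum lockdep_wait_type` as numbered when CONFIG_PROVE_RAW_LOCK_NESTING is enabled; the `WaitType` class, the regular expressions, and the helper names are hypothetical and exist only for illustration.

```python
import re
from enum import IntEnum


class WaitType(IntEnum):
    """Wait types as lockdep prints them (assumed numbering, see lead-in)."""
    INV = 0     # not checked
    FREE = 1    # wait-free, e.g. RCU
    SPIN = 2    # raw_spinlock_t and other spinning locks
    CONFIG = 3  # spinlock_t / rwlock_t: may sleep on PREEMPT_RT
    SLEEP = 4   # mutexes and other sleeping locks
    MAX = 5     # plain task context


# Matches e.g. "(&c->lock){....}-{3:3}" in a lockdep report line.
LOCK_RE = re.compile(
    r"\((?P<name>[^()]+)\)\{[^}]*\}-\{(?P<outer>\d+):(?P<inner>\d+)\}"
)
# Matches e.g. "context-{2:2}".
CONTEXT_RE = re.compile(r"context-\{(?P<outer>\d+):(?P<inner>\d+)\}")


def parse_lock(line):
    """Return (name, outer wait type, inner wait type) from a lock line."""
    m = LOCK_RE.search(line)
    if m is None:
        raise ValueError("no lock annotation found")
    return m["name"], WaitType(int(m["outer"])), WaitType(int(m["inner"]))


def parse_context(line):
    """Return the wait type of the context the task was running in."""
    m = CONTEXT_RE.search(line)
    if m is None:
        raise ValueError("no context annotation found")
    return WaitType(int(m["inner"]))


def is_invalid_wait_context(lock_outer, context_inner):
    """A lock may not demand a looser wait type than the context allows."""
    return lock_outer > context_inner


name, outer, inner = parse_lock(
    "ffff8881ff4bf2f0 (&c->lock){....}-{3:3}, at: put_cpu_partial+0x49/0x1b0"
)
ctx = parse_context("context-{2:2}")
print(name, outer.name, ctx.name, is_invalid_wait_context(outer, ctx))
# A raw spinlock would carry wait type SPIN, which this context permits.
print(is_invalid_wait_context(WaitType.SPIN, ctx))
```

Under these assumptions the report reads as: a `spinlock_t`-class lock (wait type 3) was requested from hard-interrupt context (wait type 2), which is why converting the lock to a raw spinlock would silence the warning.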

* [syzbot] [kernfs?] WARNING: locking bug in kernfs_path_from_node
From: syzbot @ 2024-11-01 18:28 UTC (permalink / raw)
  To: gregkh, linux-kernel, syzkaller-bugs, tj

Hello,

syzbot found the following issue on:

HEAD commit:    f9f24ca362a4 Add linux-next specific files for 20241031
git tree:       linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=17a0a187980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=328572ed4d152be9
dashboard link: https://syzkaller.appspot.com/bug?extid=6ea37e2e6ffccf41a7e6
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13119340580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d56630580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/eb84549dd6b3/disk-f9f24ca3.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/beb29bdfa297/vmlinux-f9f24ca3.xz
kernel image: https://storage.googleapis.com/syzbot-assets/8881fe3245ad/bzImage-f9f24ca3.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com

=============================
[ BUG: Invalid wait context ]
6.12.0-rc5-next-20241031-syzkaller #0 Not tainted
-----------------------------
strace-static-x/5846 is trying to lock:
ffffffff8eac8698 (kernfs_rename_lock){....}-{3:3}, at: kernfs_path_from_node+0x92/0xb00 fs/kernfs/dir.c:229
other info that might help us debug this:
context-{5:5}
3 locks held by strace-static-x/5846:
 #0: ffff8880b873e598 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:598
 #1: ffffffff8e939f20 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #1: ffffffff8e939f20 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #1: ffffffff8e939f20 (rcu_read_lock){....}-{1:3}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2339 [inline]
 #1: ffffffff8e939f20 (rcu_read_lock){....}-{1:3}, at: bpf_trace_run2+0x1fc/0x540 kernel/trace/bpf_trace.c:2381
 #2: ffff88802f7129e0 (&mm->mmap_lock){++++}-{4:4}, at: mmap_read_trylock include/linux/mmap_lock.h:208 [inline]
 #2: ffff88802f7129e0 (&mm->mmap_lock){++++}-{4:4}, at: stack_map_get_build_id_offset+0x431/0x870 kernel/bpf/stackmap.c:157
stack backtrace:
CPU: 1 UID: 0 PID: 5846 Comm: strace-static-x Not tainted 6.12.0-rc5-next-20241031-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_lock_invalid_wait_context kernel/locking/lockdep.c:4826 [inline]
 check_wait_context kernel/locking/lockdep.c:4898 [inline]
 __lock_acquire+0x15a8/0x2100 kernel/locking/lockdep.c:5176
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
 __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:160 [inline]
 _raw_read_lock_irqsave+0xdd/0x130 kernel/locking/spinlock.c:236
 kernfs_path_from_node+0x92/0xb00 fs/kernfs/dir.c:229
 kernfs_path include/linux/kernfs.h:598 [inline]
 cgroup_path include/linux/cgroup.h:599 [inline]
 get_mm_memcg_path+0xb9/0x380 mm/mmap_lock.c:82
 __mmap_lock_do_trace_acquire_returned+0x9f/0x2f0 mm/mmap_lock.c:102
 __mmap_lock_trace_acquire_returned include/linux/mmap_lock.h:36 [inline]
 mmap_read_trylock include/linux/mmap_lock.h:209 [inline]
 stack_map_get_build_id_offset+0x84d/0x870 kernel/bpf/stackmap.c:157
 __bpf_get_stack+0x8da/0xad0 kernel/bpf/stackmap.c:483
 ____bpf_get_stack kernel/bpf/stackmap.c:499 [inline]
 bpf_get_stack+0x33/0x50 kernel/bpf/stackmap.c:496
 ____bpf_get_stack_raw_tp kernel/trace/bpf_trace.c:1933 [inline]
 bpf_get_stack_raw_tp+0x1a3/0x240 kernel/trace/bpf_trace.c:1923
 bpf_prog_ec3b2eefa702d8d3+0x43/0x47
 bpf_dispatcher_nop_func include/linux/bpf.h:1290 [inline]
 __bpf_prog_run include/linux/filter.h:701 [inline]
 bpf_prog_run include/linux/filter.h:708 [inline]
 __bpf_trace_run kernel/trace/bpf_trace.c:2340 [inline]
 bpf_trace_run2+0x2ec/0x540 kernel/trace/bpf_trace.c:2381
 trace_tlb_flush+0x118/0x140 include/trace/events/tlb.h:38
 switch_mm_irqs_off+0x77a/0xa70
 context_switch kernel/sched/core.c:5311 [inline]
 __schedule+0x10c7/0x4c30 kernel/sched/core.c:6707
 __schedule_loop kernel/sched/core.c:6784 [inline]
 schedule+0x14b/0x320 kernel/sched/core.c:6799
 do_wait+0x2a5/0x560 kernel/exit.c:1696
 kernel_wait4+0x2a7/0x3e0 kernel/exit.c:1850
 __do_sys_wait4 kernel/exit.c:1878 [inline]
 __se_sys_wait4 kernel/exit.c:1874 [inline]
 __x64_sys_wait4+0x134/0x1e0 kernel/exit.c:1874
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4d6ad6
Code: Unable to access opcode bytes at 0x4d6aac.
RSP: 002b:00007ffc879cddd8 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004d6ad6
RDX: 0000000040000000 RSI: 00007ffc879cddfc RDI: 00000000ffffffff
RBP: 0000000000000000 R08: 0000000000000017 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000001b15d400
R13: 00007ffc879cddfc R14: 000000001b157ce0 R15: 000000000063f160
 </TASK>
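
The report above prints `context-{5:5}` (plain task context), yet the acquisition of a `{3:3}` lock is still flagged. The following Python sketch, written under the assumption that lockdep narrows the allowed wait type to the minimum inner type of the locks already held, shows how the three held locks lead to that result. It is a simplified model, not the kernel's implementation: it ignores irq-context boundaries, trylock handling of the lock being acquired, and wait-type overrides, and all names in it (`Lock`, `effective_context`, `check_wait_context`) are hypothetical.

```python
from dataclasses import dataclass

# Assumed numbering of lockdep wait types with
# CONFIG_PROVE_RAW_LOCK_NESTING enabled.
LD_WAIT_FREE, LD_WAIT_SPIN, LD_WAIT_CONFIG, LD_WAIT_SLEEP, LD_WAIT_MAX = 1, 2, 3, 4, 5


@dataclass(frozen=True)
class Lock:
    name: str
    outer: int  # wait type required to acquire the lock
    inner: int  # wait type permitted while holding the lock


def effective_context(task_context, held):
    """Tightest inner wait type among the task context and all held locks."""
    curr_inner = task_context
    for lock in held:
        curr_inner = min(curr_inner, lock.inner)
    return curr_inner


def check_wait_context(task_context, held, nxt):
    """Return True when acquiring `nxt` is allowed in this simplified model."""
    return nxt.outer <= effective_context(task_context, held)


# Held locks and their {outer:inner} values as printed in the report.
held = [
    Lock("&rq->__lock", 2, 2),
    Lock("rcu_read_lock", 1, 3),
    Lock("&mm->mmap_lock", 4, 4),
]
kernfs_rename_lock = Lock("kernfs_rename_lock", 3, 3)

print(effective_context(LD_WAIT_MAX, held))                        # 2
print(check_wait_context(LD_WAIT_MAX, held, kernfs_rename_lock))   # False
```

In this model the runqueue lock, a raw spinlock with inner type 2, caps what may be acquired beneath it, so the `{3:3}` `kernfs_rename_lock` is rejected even though the task itself is in ordinary task context.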


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


Thread overview: 40+ messages
2024-10-30 21:05 [BUG] -next lockdep invalid wait context Paul E. McKenney
2024-10-30 21:48 ` Vlastimil Babka
2024-10-30 22:34   ` Marco Elver
2024-10-30 23:04     ` Boqun Feng
2024-10-30 23:10     ` Paul E. McKenney
2024-10-31  7:21       ` Sebastian Andrzej Siewior
2024-10-31  7:35         ` Vlastimil Babka
2024-10-31  7:55           ` Sebastian Andrzej Siewior
2024-10-31  8:18             ` Vlastimil Babka
2024-11-01 17:14               ` Paul E. McKenney
2024-10-31 17:50             ` Paul E. McKenney
2024-11-01 19:50               ` Boqun Feng
2024-11-01 19:54                 ` [PATCH] scftorture: Use workqueue to free scf_check Boqun Feng
2024-11-01 23:35                   ` Paul E. McKenney
2024-11-03  3:35                     ` Boqun Feng
2024-11-03 15:03                       ` Paul E. McKenney
2024-11-04 10:50                         ` [PATCH 1/2] scftorture: Move memory allocation outside of preempt_disable region Sebastian Andrzej Siewior
2024-11-04 10:50                           ` [PATCH 2/2] scftorture: Use a lock-less list to free memory Sebastian Andrzej Siewior
2024-11-05  1:00                             ` Boqun Feng
2024-11-07 11:21                               ` Sebastian Andrzej Siewior
2024-11-07 14:08                                 ` Paul E. McKenney
2024-11-07 14:43                                   ` Sebastian Andrzej Siewior
2024-11-07 14:59                                     ` Paul E. McKenney
2024-11-02  0:12         ` [BUG] -next lockdep invalid wait context Hillf Danton
2024-11-02  0:45           ` Boqun Feng
2024-11-04 18:08             ` Tejun Heo
2024-11-05  9:37               ` Vlastimil Babka
2024-11-08 10:05               ` Sebastian Andrzej Siewior
2024-11-08 17:02                 ` Tejun Heo
2024-11-08 17:12                   ` Sebastian Andrzej Siewior
2024-11-08 22:24                   ` [PATCH] kernfs: Use RCU for kernfs_node::name lookup Sebastian Andrzej Siewior
2024-11-08 22:31                     ` Tejun Heo
2024-11-11 17:04                       ` Sebastian Andrzej Siewior
2024-11-12 19:02                         ` Tejun Heo
2024-11-13  7:58                           ` Sebastian Andrzej Siewior
2024-11-08 23:16                     ` Hillf Danton
2024-11-08 23:48                       ` [syzbot] [kernfs?] WARNING: locking bug in kernfs_path_from_node syzbot
2024-11-11  4:49                     ` [PATCH] kernfs: Use RCU for kernfs_node::name lookup kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2024-11-01 18:28 [syzbot] [kernfs?] WARNING: locking bug in kernfs_path_from_node syzbot
2024-11-05  7:58 ` syzbot
