From: syzbot <syzbot+628f63ef3b071e16463e@syzkaller.appspotmail.com>
To: eadavis@qq.com, linux-kernel@vger.kernel.org,
syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [bpf?] [net?] possible deadlock in scheduler_tick (3)
Date: Thu, 21 Mar 2024 07:51:02 -0700 [thread overview]
Message-ID: <0000000000001388ec06142cd682@google.com> (raw)
In-Reply-To: <tencent_897431776767BA3D0B361296C236024BFC09@qq.com>
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in scheduler_tick
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
6.8.0-syzkaller-05231-ga51cd6bf8e10-dirty #0 Not tainted
-----------------------------------------------------
kworker/u8:5/1087 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
ffff88807d83c020 (&htab->buckets[i].lock){+...}-{2:2}, at: sock_hash_delete_elem+0xb1/0x2f0 net/core/sock_map.c:940
and this task is already holding:
ffff8880b943e158 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
which would create a new lock dependency:
(&rq->__lock){-.-.}-{2:2} -> (&htab->buckets[i].lock){+...}-{2:2}
but this new dependency connects a HARDIRQ-irq-safe lock:
(&rq->__lock){-.-.}-{2:2}
... which became HARDIRQ-irq-safe at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
rq_lock kernel/sched/sched.h:1699 [inline]
scheduler_tick+0xa1/0x6e0 kernel/sched/core.c:5679
update_process_times+0x202/0x230 kernel/time/timer.c:2481
tick_periodic+0x190/0x220 kernel/time/tick-common.c:100
tick_handle_periodic+0x4a/0x160 kernel/time/tick-common.c:112
timer_interrupt+0x5c/0x70 arch/x86/kernel/time.c:57
__handle_irq_event_percpu+0x28c/0xa30 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0x89/0x1f0 kernel/irq/handle.c:210
handle_level_irq+0x3c5/0x6e0 kernel/irq/chip.c:648
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0x13a/0x230 arch/x86/kernel/irq.c:257
common_interrupt+0xa5/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
__setup_irq+0x1277/0x1cf0 kernel/irq/manage.c:1818
request_threaded_irq+0x2ab/0x380 kernel/irq/manage.c:2202
request_irq include/linux/interrupt.h:168 [inline]
setup_default_timer_irq+0x25/0x60 arch/x86/kernel/time.c:70
x86_late_time_init+0x66/0xc0 arch/x86/kernel/time.c:94
start_kernel+0x3f3/0x500 init/main.c:1039
x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:509
x86_64_start_kernel+0x99/0xa0 arch/x86/kernel/head64.c:490
common_startup_64+0x13e/0x147
to a HARDIRQ-irq-unsafe lock:
(&htab->buckets[i].lock){+...}-{2:2}
... which became HARDIRQ-irq-unsafe at:
...
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
sock_hash_free+0x164/0x820 net/core/sock_map.c:1155
bpf_map_free_deferred+0xe6/0x110 kernel/bpf/syscall.c:734
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&htab->buckets[i].lock);
local_irq_disable();
lock(&rq->__lock);
lock(&htab->buckets[i].lock);
<Interrupt>
lock(&rq->__lock);
*** DEADLOCK ***
2 locks held by kworker/u8:5/1087:
#0: ffff8880b943e158 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
#1: ffffffff8e131920 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
#1: ffffffff8e131920 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
#1: ffffffff8e131920 (rcu_read_lock){....}-{1:2}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2380 [inline]
#1: ffffffff8e131920 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0x114/0x420 kernel/trace/bpf_trace.c:2420
the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (&rq->__lock){-.-.}-{2:2} {
IN-HARDIRQ-W at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
rq_lock kernel/sched/sched.h:1699 [inline]
scheduler_tick+0xa1/0x6e0 kernel/sched/core.c:5679
update_process_times+0x202/0x230 kernel/time/timer.c:2481
tick_periodic+0x190/0x220 kernel/time/tick-common.c:100
tick_handle_periodic+0x4a/0x160 kernel/time/tick-common.c:112
timer_interrupt+0x5c/0x70 arch/x86/kernel/time.c:57
__handle_irq_event_percpu+0x28c/0xa30 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0x89/0x1f0 kernel/irq/handle.c:210
handle_level_irq+0x3c5/0x6e0 kernel/irq/chip.c:648
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0x13a/0x230 arch/x86/kernel/irq.c:257
common_interrupt+0xa5/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
__setup_irq+0x1277/0x1cf0 kernel/irq/manage.c:1818
request_threaded_irq+0x2ab/0x380 kernel/irq/manage.c:2202
request_irq include/linux/interrupt.h:168 [inline]
setup_default_timer_irq+0x25/0x60 arch/x86/kernel/time.c:70
x86_late_time_init+0x66/0xc0 arch/x86/kernel/time.c:94
start_kernel+0x3f3/0x500 init/main.c:1039
x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:509
x86_64_start_kernel+0x99/0xa0 arch/x86/kernel/head64.c:490
common_startup_64+0x13e/0x147
IN-SOFTIRQ-W at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
rq_lock kernel/sched/sched.h:1699 [inline]
ttwu_queue kernel/sched/core.c:4055 [inline]
try_to_wake_up+0x7d3/0x1470 kernel/sched/core.c:4378
kick_pool+0x41b/0x5c0 kernel/workqueue.c:1284
__queue_work+0xc20/0xec0 kernel/workqueue.c:2401
call_timer_fn+0x17e/0x600 kernel/time/timer.c:1792
expire_timers kernel/time/timer.c:1838 [inline]
__run_timers kernel/time/timer.c:2408 [inline]
__run_timer_base+0x695/0x8e0 kernel/time/timer.c:2419
run_timer_base kernel/time/timer.c:2428 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2438
__do_softirq+0x2bc/0x943 kernel/softirq.c:554
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
default_idle+0x13/0x20 arch/x86/kernel/process.c:742
default_idle_call+0x74/0xb0 kernel/sched/idle.c:117
cpuidle_idle_call kernel/sched/idle.c:191 [inline]
do_idle+0x22f/0x5d0 kernel/sched/idle.c:332
cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:430
rest_init+0x2e0/0x300 init/main.c:730
arch_call_rest_init+0xe/0x10 init/main.c:831
start_kernel+0x47a/0x500 init/main.c:1077
x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:509
x86_64_start_kernel+0x99/0xa0 arch/x86/kernel/head64.c:490
common_startup_64+0x13e/0x147
INITIAL USE at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
_raw_spin_rq_lock_irqsave kernel/sched/sched.h:1404 [inline]
rq_lock_irqsave kernel/sched/sched.h:1683 [inline]
rq_attach_root+0xee/0x540 kernel/sched/topology.c:494
sched_init+0x64e/0xc30 kernel/sched/core.c:10031
start_kernel+0x1ab/0x500 init/main.c:948
x86_64_start_reservations+0x2a/0x30 arch/x86/kernel/head64.c:509
x86_64_start_kernel+0x99/0xa0 arch/x86/kernel/head64.c:490
common_startup_64+0x13e/0x147
}
... key at: [<ffffffff926c4080>] sched_init.__key+0x0/0x20
the dependencies between the lock to be acquired
and HARDIRQ-irq-unsafe lock:
-> (&htab->buckets[i].lock){+...}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
sock_hash_free+0x164/0x820 net/core/sock_map.c:1155
bpf_map_free_deferred+0xe6/0x110 kernel/bpf/syscall.c:734
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
INITIAL USE at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
sock_hash_delete_elem+0xb1/0x2f0 net/core/sock_map.c:940
0xffffffffa00060f6
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
bpf_trace_run2+0x204/0x420 kernel/trace/bpf_trace.c:2420
trace_contention_end+0xd7/0x100 include/trace/events/lock.h:122
__mutex_lock_common kernel/locking/mutex.c:617 [inline]
__mutex_lock+0x2e5/0xd70 kernel/locking/mutex.c:752
srcu_advance_state kernel/rcu/srcutree.c:1651 [inline]
process_srcu+0x6e/0x13a0 kernel/rcu/srcutree.c:1811
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
}
... key at: [<ffffffff94882300>] sock_hash_alloc.__key+0x0/0x20
... acquired at:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
sock_hash_delete_elem+0xb1/0x2f0 net/core/sock_map.c:940
bpf_prog_43221478a22f23b5+0x42/0x46
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
bpf_trace_run2+0x204/0x420 kernel/trace/bpf_trace.c:2420
trace_contention_end+0xf6/0x120 include/trace/events/lock.h:122
__pv_queued_spin_lock_slowpath+0x939/0xc60 kernel/locking/qspinlock.c:560
pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:584 [inline]
queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
do_raw_spin_lock+0x272/0x370 kernel/locking/spinlock_debug.c:116
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
rq_lock kernel/sched/sched.h:1699 [inline]
__schedule+0x354/0x4a20 kernel/sched/core.c:6652
__schedule_loop kernel/sched/core.c:6813 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6828
worker_thread+0xa2c/0xd70 kernel/workqueue.c:3431
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
stack backtrace:
CPU: 0 PID: 1087 Comm: kworker/u8:5 Not tainted 6.8.0-syzkaller-05231-ga51cd6bf8e10-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Workqueue: 0x0 (ipv6_addrconf)
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_bad_irq_dependency kernel/locking/lockdep.c:2626 [inline]
check_irq_usage kernel/locking/lockdep.c:2865 [inline]
check_prev_add kernel/locking/lockdep.c:3138 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x4dc7/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
sock_hash_delete_elem+0xb1/0x2f0 net/core/sock_map.c:940
bpf_prog_43221478a22f23b5+0x42/0x46
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
bpf_trace_run2+0x204/0x420 kernel/trace/bpf_trace.c:2420
trace_contention_end+0xf6/0x120 include/trace/events/lock.h:122
__pv_queued_spin_lock_slowpath+0x939/0xc60 kernel/locking/qspinlock.c:560
pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:584 [inline]
queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
do_raw_spin_lock+0x272/0x370 kernel/locking/spinlock_debug.c:116
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:559
raw_spin_rq_lock kernel/sched/sched.h:1385 [inline]
rq_lock kernel/sched/sched.h:1699 [inline]
__schedule+0x354/0x4a20 kernel/sched/core.c:6652
__schedule_loop kernel/sched/core.c:6813 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6828
worker_thread+0xa2c/0xd70 kernel/workqueue.c:3431
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>
Tested on:
commit: a51cd6bf arm64: bpf: fix 32bit unconditional bswap
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=127edef1180000
kernel config: https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=628f63ef3b071e16463e
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=151c1531180000
next prev parent reply other threads:[~2024-03-21 14:51 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-20 18:46 [syzbot] [bpf?] [net?] possible deadlock in scheduler_tick (3) syzbot
2024-03-20 23:51 ` Hillf Danton
2024-03-21 14:28 ` syzbot
2024-03-21 0:18 ` Edward Adam Davis
2024-03-21 14:51 ` syzbot [this message]
2024-04-20 14:31 ` Tetsuo Handa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0000000000001388ec06142cd682@google.com \
--to=syzbot+628f63ef3b071e16463e@syzkaller.appspotmail.com \
--cc=eadavis@qq.com \
--cc=linux-kernel@vger.kernel.org \
--cc=syzkaller-bugs@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.