* INFO: rcu detected stall in __hrtimer_run_queues
@ 2021-02-20 21:05 syzbot
  2021-11-16 15:41 ` [syzbot] " syzbot
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2021-02-20 21:05 UTC
To: fweisbec, linux-kernel, mingo, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    f40ddce8 Linux 5.11
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1397f498d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e53d04227c52a0df
dashboard link: https://syzkaller.appspot.com/bug?extid=de9526ade17c659d8336
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17a81012d00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1282b6d2d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+de9526ade17c659d8336@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-....: (10153 ticks this GP) idle=f1a/1/0x4000000000000000 softirq=10867/10868 fqs=925
	(t=10502 jiffies g=10029 q=19103)
NMI backtrace for cpu 1
CPU: 1 PID: 10530 Comm: syz-executor248 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0x1f4/0x230 kernel/rcu/tree_stall.h:337
 print_cpu_stall kernel/rcu/tree_stall.h:569 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:643 [inline]
 rcu_pending kernel/rcu/tree.c:3751 [inline]
 rcu_sched_clock_irq.cold+0x48e/0xedf kernel/rcu/tree.c:2580
 update_process_times+0x16d/0x200 kernel/time/timer.c:1782
 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1369
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x1c0/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1106
 run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:91 [inline]
 sysvec_apic_timer_interrupt+0x48/0x100 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:629
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x25/0x50 kernel/locking/spinlock.c:191
Code: f8 5d c3 66 90 55 48 89 fd 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 9a d6 5b f8 48 89 ef e8 42 8b 5c f8 f6 c7 02 75 1a 53 9d <bf> 01 00 00 00 e8 81 92 50 f8 65 8b 05 7a f8 04 77 85 c0 74 0a 5b
RSP: 0018:ffffc90000db0e48 EFLAGS: 00000286
RAX: 0000000000ce281a RBX: 0000000000000286 RCX: 1ffffffff1b46a19
RDX: 0000000000000000 RSI: 0000000000000102 RDI: 0000000000000000
RBP: ffff8880b9d26a00 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff8178a8b8 R11: 0000000000000000 R12: 000000a80fcc296a
R13: ffff8880b9d26c80 R14: ffff8880b9d26a00 R15: ffffffff851589e0
 __run_hrtimer kernel/time/hrtimer.c:1515 [inline]
 __hrtimer_run_queues+0x51a/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:343
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:226 [inline]
 __irq_exit_rcu kernel/softirq.c:420 [inline]
 irq_exit_rcu+0x134/0x200 kernel/softirq.c:432
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:629
RIP: 0010:queue_work_on+0x83/0xd0 kernel/workqueue.c:1530
Code: 31 ff 89 ee e8 6e 02 29 00 40 84 ed 74 46 e8 e4 fb 28 00 31 ff 48 89 de e8 ca 03 29 00 48 85 db 75 26 e8 d0 fb 28 00 41 56 9d <48> 83 c4 08 44 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 e8 b6 fb 28
RSP: 0018:ffffc90002e5fc80 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000000
RDX: ffff8880265f5340 RSI: ffffffff8149da20 RDI: 0000000000000000
RBP: ffffc90002e68000 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff8178a8b8 R11: 0000000000000000 R12: ffff8880b9d31568
R13: ffff888010c64c00 R14: 0000000000000293 R15: 0000000000000001
 queue_work include/linux/workqueue.h:507 [inline]
 schedule_work include/linux/workqueue.h:568 [inline]
 __vfree_deferred mm/vmalloc.c:2307 [inline]
 vfree_atomic+0xac/0xe0 mm/vmalloc.c:2325
 free_thread_stack kernel/fork.c:291 [inline]
 release_task_stack kernel/fork.c:428 [inline]
 put_task_stack+0x29c/0x480 kernel/fork.c:439
 finish_task_switch.isra.0+0x557/0x7e0 kernel/sched/core.c:4236
 context_switch kernel/sched/core.c:4330 [inline]
 __schedule+0x914/0x21a0 kernel/sched/core.c:5078
 preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:5238
 preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:35
 __raw_spin_unlock include/linux/spinlock_api_smp.h:152 [inline]
 _raw_spin_unlock+0x36/0x40 kernel/locking/spinlock.c:183
 spin_unlock include/linux/spinlock.h:394 [inline]
 pick_file+0x129/0x1e0 fs/file.c:613
 close_fd+0x44/0x80 fs/file.c:622
 __do_sys_close fs/open.c:1299 [inline]
 __se_sys_close fs/open.c:1297 [inline]
 __x64_sys_close+0x2f/0xa0 fs/open.c:1297
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x403353
Code: c7 c2 c0 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 45 c3 0f 1f 40 00 48 83 ec 18 89 7c 24 0c e8
RSP: 002b:00007ffdb38739a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000403353
RDX: 0000000000042000 RSI: 0000000000000004 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000031 R09: 0000000000000031
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000aa47b
R13: 00007ffdb3873a00 R14: 00007ffdb38739f0 R15: 00007ffdb38739c4

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

* Re: [syzbot] INFO: rcu detected stall in __hrtimer_run_queues
  2021-02-20 21:05 INFO: rcu detected stall in __hrtimer_run_queues syzbot
@ 2021-11-16 15:41 ` syzbot
  2021-11-16 15:42   ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2021-11-16 15:41 UTC
To: axboe, fweisbec, hch, hdanton, linux-block, linux-kernel, mingo,
	paulmck, syzkaller-bugs, tglx

syzbot suspects this issue was fixed by commit:

commit b60876296847e6cd7f1da4b8b7f0f31399d59aa1
Author: Jens Axboe <axboe@kernel.dk>
Date:   Fri Oct 15 21:03:52 2021 +0000

    block: improve layout of struct request

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=137f2d01b00000
start commit:   f40ddce88593 Linux 5.11
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=e53d04227c52a0df
dashboard link: https://syzkaller.appspot.com/bug?extid=de9526ade17c659d8336
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17a81012d00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1282b6d2d00000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: block: improve layout of struct request

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

* Re: [syzbot] INFO: rcu detected stall in __hrtimer_run_queues
  2021-11-16 15:41 ` [syzbot] " syzbot
@ 2021-11-16 15:42   ` Jens Axboe
  2021-11-17 11:59     ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2021-11-16 15:42 UTC
To: syzbot, fweisbec, hch, hdanton, linux-block, linux-kernel, mingo,
	paulmck, syzkaller-bugs, tglx

On 11/16/21 8:41 AM, syzbot wrote:
> syzbot suspects this issue was fixed by commit:
>
> commit b60876296847e6cd7f1da4b8b7f0f31399d59aa1
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Fri Oct 15 21:03:52 2021 +0000
>
>     block: improve layout of struct request

No functional changes in that patch, so looks like a fluky bisection.

-- 
Jens Axboe

* Re: [syzbot] INFO: rcu detected stall in __hrtimer_run_queues
  2021-11-16 15:42   ` Jens Axboe
@ 2021-11-17 11:59     ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2021-11-17 11:59 UTC
To: Jens Axboe
Cc: syzbot, fweisbec, hch, hdanton, linux-block, linux-kernel, mingo,
	syzkaller-bugs, tglx, x86

On Tue, Nov 16, 2021 at 08:42:39AM -0700, Jens Axboe wrote:
> On 11/16/21 8:41 AM, syzbot wrote:
> > syzbot suspects this issue was fixed by commit:
> >
> > commit b60876296847e6cd7f1da4b8b7f0f31399d59aa1
> > Author: Jens Axboe <axboe@kernel.dk>
> > Date:   Fri Oct 15 21:03:52 2021 +0000
> >
> >     block: improve layout of struct request
>
> No functional changes in that patch, so looks like a fluky bisection.

I am seeing an intermittent (and rather strange) stall warning on
v5.16-rc1. This is a self-detected stall from the idle loop. The reason
that this is strange is that the usual reason that a CPU stalls in the
idle loop is due to a long-running interrupt, in which case you would
expect other CPUs to detect the stall.

Reproduce using RCU's TREE07 scenario, except that the MTBF looks to be
several hundred hours. But I ran this scenario long enough on v5.15-rc*
to be confident that this stall warning is a regression introduced
recently.

And the reason is that the CPU, despite being in the idle loop, is not
marked as idle from an RCU perspective (see the "idle=d59/0/0x1"):

rcu: 	0-...!: (13 ticks this GP) idle=d59/0/0x1 softirq=281261/281261 fqs=1
	(t=2199037 jiffies g=249449 q=5)
NMI backtrace for cpu 0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1 #4571
Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0x33/0x42
 nmi_cpu_backtrace.cold.6+0x30/0x70
 ? lapic_can_unplug_cpu+0x70/0x70
 nmi_trigger_cpumask_backtrace+0xbf/0xd0
 rcu_dump_cpu_stacks+0xc0/0x120
 rcu_sched_clock_irq.cold.110+0x15a/0x312
 ? get_nohz_timer_target+0x60/0x190
 ? lock_timer_base+0x62/0x80
 ? account_process_tick+0xd4/0x160
 ? tick_sched_handle.isra.24+0x40/0x40
 update_process_times+0x8e/0xc0
 tick_sched_handle.isra.24+0x30/0x40
 tick_sched_timer+0x6a/0x80
 __hrtimer_run_queues+0xfc/0x2a0
 hrtimer_interrupt+0x105/0x220
 ? resched_curr+0x1e/0xc0
 __sysvec_apic_timer_interrupt+0x7a/0x160
 sysvec_apic_timer_interrupt+0x85/0xb0
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:default_idle+0xb/0x10
Code: ff 48 89 df e8 16 5c 90 ff eb d7 e8 bf 82 ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc eb 07 0f 00 2d df 92 41 00 fb f4 <c3> 0f 1f 40 00 65 48 8b 04 25 00 ad 01 00 f0 80 48 02 20 48 8b 10
RSP: 0018:ffffffff9dc03e98 EFLAGS: 00000202
RAX: ffffffff9d3ed200 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff9da3e0a1 RDI: ffffffff9da683ae
RBP: ffffffff9de86050 R08: 00000000e141ad57 R09: ffffa03c5f229d40
R10: 0000000000002400 R11: 0000000000002400 R12: 0000000000000000
R13: 0000000000000000 R14: ffffffffffffffff R15: ffffffff9dc14940
 ? __cpuidle_text_start+0x8/0x8
 ? __cpuidle_text_start+0x8/0x8
 default_idle_call+0x28/0xd0
 do_idle+0x1fb/0x290
 cpu_startup_entry+0x14/0x20
 start_kernel+0x659/0x680
 secondary_startup_64_no_verify+0xc2/0xcb
 </TASK>

The usual reason for this odd situation is that someone forgot an
irq_enter() or added an extra irq_exit(). Or likewise for a number of
similar functions that tell RCU to start/stop ignoring the current CPU:
nmi_enter(), nmi_exit(), rcu_*_enter(), rcu_*_exit(), and so on.

Adding the x86 list on CC.

							Thanx, Paul
