* [Qemu-devel] [Bug 1750229] [NEW] virtio-blk-pci regression: softlock in guest kernel at module loading
@ 2018-02-18 11:39 Matwey V. Kornilov
From: Matwey V. Kornilov @ 2018-02-18 11:39 UTC
To: qemu-devel
Public bug reported:
Hello,
I am running qemu from the master git branch on an x86_64 host with kernel
4.4.114. I've found that commit
9a4c0e220d8a "hw/virtio-pci: fix virtio behaviour"
introduces a regression with the following command:
qemu-system-x86_64 -enable-kvm -nodefaults -no-reboot -nographic \
    -vga none -runas qemu \
    -kernel .build.kernel.kvm -initrd .build.initrd.kvm \
    -append 'panic=1 softlockup_panic=1 no-kvmclock nmi_watchdog=0 console=ttyS0 root=/dev/disk/by-id/virtio-0' \
    -m 2048 \
    -drive file=./root,format=raw,if=none,id=disk,serial=0,cache=unsafe \
    -device virtio-blk-pci,drive=disk \
    -serial stdio -smp 2
Starting from this commit up to current master, the following happens with a
wide variety of guest kernels (4.4 to 4.15):
[ 62.428107] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=-20 stuck for 59s!
[ 62.437426] Showing busy workqueues and worker pools:
[ 62.443117] workqueue events: flags=0x0
[ 62.447512] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[ 62.448161] pending: check_corruption
[ 62.458570] workqueue kblockd: flags=0x18
[ 62.463082] pwq 1: cpus=0 node=0 flags=0x0 nice=-20 active=3/256
[ 62.463082] in-flight: 4:blk_mq_run_work_fn
[ 62.463082] pending: blk_mq_run_work_fn, blk_mq_timeout_work
[ 62.474831] pool 1: cpus=0 node=0 flags=0x0 nice=-20 hung=59s workers=2 idle: 214
[ 62.492121] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 62.492121] Tasks blocked on level-0 rcu_node (CPUs 0-1): P4
[ 62.492121] (detected by 0, t=15002 jiffies, g=-130, c=-131, q=32)
[ 62.492121] kworker/0:0H R running task 0 4 2 0x80000000
[ 62.492121] Workqueue: kblockd blk_mq_run_work_fn
[ 62.492121] Call Trace:
[ 62.492121] <IRQ>
[ 62.492121] sched_show_task+0xdf/0x100
[ 62.492121] rcu_print_detail_task_stall_rnp+0x48/0x69
[ 62.492121] rcu_check_callbacks+0x93d/0x9d0
[ 62.492121] ? tick_sched_do_timer+0x40/0x40
[ 62.492121] update_process_times+0x28/0x50
[ 62.492121] tick_sched_handle+0x22/0x70
[ 62.492121] tick_sched_timer+0x34/0x70
[ 62.492121] __hrtimer_run_queues+0xcc/0x250
[ 62.492121] hrtimer_interrupt+0xab/0x1f0
[ 62.492121] smp_apic_timer_interrupt+0x62/0x150
[ 62.492121] apic_timer_interrupt+0xa2/0xb0
[ 62.492121] </IRQ>
[ 62.492121] RIP: 0010:iowrite16+0x1d/0x30
[ 62.492121] RSP: 0018:ffffa477c034fcc8 EFLAGS: 00010292 ORIG_RAX: ffffffffffffff11
[ 62.492121] RAX: ffffffffa24fbdb0 RBX: ffff92a1f8f82000 RCX: 0000000000000001
[ 62.492121] RDX: ffffa477c0371000 RSI: ffffa477c0371000 RDI: 0000000000000000
[ 62.492121] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000001080020
[ 62.492121] R10: ffffdc7cc1e4fc00 R11: 0000000000000000 R12: 0000000000000000
[ 62.492121] R13: 0000000000000000 R14: ffff92a1f93f0000 R15: ffff92a1f8e1aa80
[ 62.492121] ? vp_synchronize_vectors+0x60/0x60
[ 62.492121] vp_notify+0x12/0x20
[ 62.492121] virtqueue_notify+0x18/0x30
[ 62.492121] virtio_queue_rq+0x2f5/0x300 [virtio_blk]
[ 62.492121] blk_mq_dispatch_rq_list+0x7e/0x4a0
[ 62.492121] blk_mq_do_dispatch_sched+0x4a/0xd0
[ 62.492121] blk_mq_sched_dispatch_requests+0x106/0x170
[ 62.492121] __blk_mq_run_hw_queue+0x80/0x90
[ 62.492121] process_one_work+0x1e3/0x420
[ 62.492121] worker_thread+0x2b/0x3d0
[ 62.492121] ? process_one_work+0x420/0x420
[ 62.492121] kthread+0x113/0x130
[ 62.492121] ? kthread_create_worker_on_cpu+0x50/0x50
[ 62.492121] ret_from_fork+0x3a/0x50
[ 62.492121] kworker/0:0H R running task 0 4 2 0x80000000
[ 62.492121] Workqueue: kblockd blk_mq_run_work_fn
[ 62.492121] Call Trace:
[ 62.492121] <IRQ>
[ 62.492121] sched_show_task+0xdf/0x100
[ 62.492121] rcu_print_detail_task_stall_rnp+0x48/0x69
[ 62.492121] rcu_check_callbacks+0x972/0x9d0
[ 62.492121] ? tick_sched_do_timer+0x40/0x40
[ 62.492121] update_process_times+0x28/0x50
[ 62.492121] tick_sched_handle+0x22/0x70
[ 62.492121] tick_sched_timer+0x34/0x70
[ 62.492121] __hrtimer_run_queues+0xcc/0x250
[ 62.492121] hrtimer_interrupt+0xab/0x1f0
[ 62.492121] smp_apic_timer_interrupt+0x62/0x150
[ 62.492121] apic_timer_interrupt+0xa2/0xb0
[ 62.492121] </IRQ>
[ 62.492121] RIP: 0010:iowrite16+0x1d/0x30
[ 62.492121] RSP: 0018:ffffa477c034fcc8 EFLAGS: 00010292 ORIG_RAX: ffffffffffffff11
[ 62.492121] RAX: ffffffffa24fbdb0 RBX: ffff92a1f8f82000 RCX: 0000000000000001
[ 62.492121] RDX: ffffa477c0371000 RSI: ffffa477c0371000 RDI: 0000000000000000
[ 62.492121] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000001080020
[ 62.492121] R10: ffffdc7cc1e4fc00 R11: 0000000000000000 R12: 0000000000000000
[ 62.492121] R13: 0000000000000000 R14: ffff92a1f93f0000 R15: ffff92a1f8e1aa80
[ 62.492121] ? vp_synchronize_vectors+0x60/0x60
[ 62.492121] vp_notify+0x12/0x20
[ 62.492121] virtqueue_notify+0x18/0x30
[ 62.492121] virtio_queue_rq+0x2f5/0x300 [virtio_blk]
[ 62.492121] blk_mq_dispatch_rq_list+0x7e/0x4a0
[ 62.492121] blk_mq_do_dispatch_sched+0x4a/0xd0
[ 62.492121] blk_mq_sched_dispatch_requests+0x106/0x170
[ 62.492121] __blk_mq_run_hw_queue+0x80/0x90
[ 62.492121] process_one_work+0x1e3/0x420
[ 62.492121] worker_thread+0x2b/0x3d0
[ 62.492121] ? process_one_work+0x420/0x420
[ 62.492121] kthread+0x113/0x130
[ 62.492121] ? kthread_create_worker_on_cpu+0x50/0x50
[ 62.492121] ret_from_fork+0x3a/0x50
Another important observation is that the same commit works fine on other
hardware with an identical setup (same host kernel, same qemu command line,
and the same kernel binaries). How could I go about finding the root cause
of this issue?
** Affects: qemu
Importance: Undecided
Status: New
** Attachment added: "qemu trace:*virt* output"
https://bugs.launchpad.net/bugs/1750229/+attachment/5057639/+files/qemu-debug-log.xz
* [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
From: Matwey V. Kornilov @ 2018-02-18 12:26 UTC
To: qemu-devel
** Attachment added: ".build.kernel.kvm"
https://bugs.launchpad.net/qemu/+bug/1750229/+attachment/5057653/+files/.build.kernel.kvm
* [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
From: Matwey V. Kornilov @ 2018-02-18 12:26 UTC
To: qemu-devel
** Attachment added: ".build.initrd.kvm"
https://bugs.launchpad.net/qemu/+bug/1750229/+attachment/5057654/+files/.build.initrd.kvm
* Re: [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
From: Matwey V. Kornilov @ 2018-02-20 11:47 UTC
To: qemu-devel
Well, I've found that on the qemu side the VirtQueue for the virtio_blk
device calls virtio_blk_handle_vq() in an infinite loop, where
virtio_blk_get_request() (virtqueue_pop(), in essence) always returns NULL.
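To illustrate the shape of the loop I am seeing, here is a simplified,
self-contained model in the spirit of hw/block/virtio-blk.c (my paraphrase,
not the literal qemu code; the stub behaviour just encodes what I observe):

    /* Simplified model of the device-side handler, not literal qemu code. */
    #include <stddef.h>

    struct VirtQueue { int dummy; };
    struct VirtIOBlockReq { int dummy; };

    /* Stand-in for virtio_blk_get_request(), which is essentially
     * virtqueue_pop(); in the failing case it returns NULL every time. */
    static struct VirtIOBlockReq *virtio_blk_get_request(struct VirtQueue *vq)
    {
        (void)vq;
        return NULL;
    }

    static void virtio_blk_handle_request(struct VirtIOBlockReq *req)
    {
        (void)req;  /* request processing elided */
    }

    static void virtio_blk_handle_vq(struct VirtQueue *vq)
    {
        struct VirtIOBlockReq *req;

        while ((req = virtio_blk_get_request(vq)) != NULL) {
            virtio_blk_handle_request(req);
        }
        /* The pop never yields a request, so nothing is ever completed,
         * yet the handler keeps being re-invoked over and over. */
    }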
2018-02-18 15:26 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
> ** Attachment added: ".build.kernel.kvm"
> https://bugs.launchpad.net/qemu/+bug/1750229/+attachment/5057653/+files/.build.kernel.kvm
--
With best regards,
Matwey V. Kornilov
* Re: [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
From: Matwey V. Kornilov @ 2018-02-20 12:49 UTC
To: qemu-devel
virtqueue_pop() returns NULL because virtio_queue_empty_rcu() returns true
all the time.
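For context, the emptiness test boils down to comparing the device's
consumed index with the avail index published by the guest. A minimal
sketch of that check, paraphrasing hw/virtio/virtio.c (field and function
names simplified, not the literal code):

    /* Paraphrased sketch of the emptiness check, not literal qemu code. */
    #include <stdbool.h>
    #include <stdint.h>

    struct vring_avail {
        uint16_t flags;
        uint16_t idx;       /* written by the guest driver */
        uint16_t ring[];
    };

    struct VirtQueue {
        uint16_t last_avail_idx;    /* next entry the device will consume */
        uint16_t shadow_avail_idx;  /* device-side cache of avail->idx */
        struct vring_avail *avail;  /* mapped guest memory */
    };

    static bool virtio_queue_empty(struct VirtQueue *vq)
    {
        if (vq->shadow_avail_idx != vq->last_avail_idx) {
            return false;  /* cached index already shows pending work */
        }
        /* Re-read avail->idx from guest memory and refresh the cache. */
        vq->shadow_avail_idx = vq->avail->idx;
        return vq->shadow_avail_idx == vq->last_avail_idx;
    }

If the re-read keeps returning a stale value, the queue looks empty to qemu
forever, even though the guest has queued a request.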
2018-02-20 14:47 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
> Well, I've found that on the qemu side the VirtQueue for the virtio_blk
> device calls virtio_blk_handle_vq() in an infinite loop, where
> virtio_blk_get_request() (virtqueue_pop(), in essence) always returns NULL.
--
With best regards,
Matwey V. Kornilov
* Re: [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
From: Matwey V. Kornilov @ 2018-02-21 8:58 UTC
To: qemu-devel
Well, last_avail_idx equals shadow_avail_idx and both of them are 1 on the
qemu side, so only one request has been transferred. I wonder why; probably
something is being cached incorrectly, but the new avail_idx (which is
supposed to become 2) never shows up.
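To spell out what the guest side should be doing, here is a sketch of the
split-ring publish sequence from the virtio spec, which matches the
vp_notify()/iowrite16() frames in the trace above (simplified, not the
literal driver code):

    /* Sketch of the driver-side publish step (virtio split ring). */
    #include <stdint.h>

    struct vring_avail {
        uint16_t flags;
        uint16_t idx;
        uint16_t ring[];
    };

    static void publish_request(struct vring_avail *avail, uint16_t queue_size,
                                uint16_t desc_head,
                                volatile uint16_t *notify_addr,
                                uint16_t queue_index)
    {
        avail->ring[avail->idx % queue_size] = desc_head;
        __sync_synchronize();       /* descriptors visible before idx bump */
        avail->idx++;               /* would go 1 -> 2 in the failing case */
        __sync_synchronize();       /* idx visible before the notify */
        *notify_addr = queue_index; /* the iowrite16() seen in the trace */
    }

qemu, however, keeps reading avail->idx == 1, as if the index update (or its
visibility to the device) were lost, so the queue appears empty and the
request is never consumed.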
2018-02-20 15:49 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
> virtqueue_pop() returns NULL because virtio_queue_empty_rcu() returns
> true all the time.
>
> 2018-02-20 14:47 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
>> Well, I've found that on qemu side VirtQueue for virtio_blk device
>> infinitely calls virtio_blk_handle_vq() where virtio_blk_get_request()
>> (virtqueue_pop() in essens) always returns NULL.
>>
>> 2018-02-18 15:26 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
>>> ** Attachment added: ".build.kernel.kvm"
>>> https://bugs.launchpad.net/qemu/+bug/1750229/+attachment/5057653/+files/.build.kernel.kvm
>>>
>>> --
>>> You received this bug notification because you are subscribed to the bug
>>> report.
>>> https://bugs.launchpad.net/bugs/1750229
>>>
>>> Title:
>>> virtio-blk-pci regression: softlock in guest kernel at module loading
>>>
>>> Status in QEMU:
>>> New
>>>
>>> Bug description:
>>> Hello,
>>>
>>> I am running qemu from master git branch on x86_64 host with kernel is
>>> 4.4.114. I've found that commit
>>>
>>> 9a4c0e220d8a "hw/virtio-pci: fix virtio behaviour"
>>>
>>> introduces an regression with the following command:
>>>
>>> qemu-system-x86_64 -enable-kvm -nodefaults -no-reboot -nographic
>>> -vga none -runas qemu -kernel .build.kernel.kvm -initrd
>>> .build.initrd.kvm -append 'panic=1 softlockup_panic=1 no-kvmclock
>>> nmi_watchdog=0 console=ttyS0 root=/dev/disk/by-id/virtio-0' -m 2048
>>> -drive file=./root,format=raw,if=none,id=disk,serial=0,cache=unsafe
>>> -device virtio-blk-pci,drive=disk -serial stdio -smp 2
>>>
>>> Starting from this commit to master the following happens with a wide
>>> variety of guest kernels (4.4 to 4.15):
>>>
>>> [ 62.428107] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=-20 stuck for 59s!
>>> [ 62.437426] Showing busy workqueues and worker pools:
>>> [ 62.443117] workqueue events: flags=0x0
>>> [ 62.447512] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>>> [ 62.448161] pending: check_corruption
>>> [ 62.458570] workqueue kblockd: flags=0x18
>>> [ 62.463082] pwq 1: cpus=0 node=0 flags=0x0 nice=-20 active=3/256
>>> [ 62.463082] in-flight: 4:blk_mq_run_work_fn
>>> [ 62.463082] pending: blk_mq_run_work_fn, blk_mq_timeout_work
>>> [ 62.474831] pool 1: cpus=0 node=0 flags=0x0 nice=-20 hung=59s workers=2 idle: 214
>>> [ 62.492121] INFO: rcu_preempt detected stalls on CPUs/tasks:
>>> [ 62.492121] Tasks blocked on level-0 rcu_node (CPUs 0-1): P4
>>> [ 62.492121] (detected by 0, t=15002 jiffies, g=-130, c=-131, q=32)
>>> [ 62.492121] kworker/0:0H R running task 0 4 2 0x80000000
>>> [ 62.492121] Workqueue: kblockd blk_mq_run_work_fn
>>> [ 62.492121] Call Trace:
>>> [ 62.492121] <IRQ>
>>> [ 62.492121] sched_show_task+0xdf/0x100
>>> [ 62.492121] rcu_print_detail_task_stall_rnp+0x48/0x69
>>> [ 62.492121] rcu_check_callbacks+0x93d/0x9d0
>>> [ 62.492121] ? tick_sched_do_timer+0x40/0x40
>>> [ 62.492121] update_process_times+0x28/0x50
>>> [ 62.492121] tick_sched_handle+0x22/0x70
>>> [ 62.492121] tick_sched_timer+0x34/0x70
>>> [ 62.492121] __hrtimer_run_queues+0xcc/0x250
>>> [ 62.492121] hrtimer_interrupt+0xab/0x1f0
>>> [ 62.492121] smp_apic_timer_interrupt+0x62/0x150
>>> [ 62.492121] apic_timer_interrupt+0xa2/0xb0
>>> [ 62.492121] </IRQ>
>>> [ 62.492121] RIP: 0010:iowrite16+0x1d/0x30
>>> [ 62.492121] RSP: 0018:ffffa477c034fcc8 EFLAGS: 00010292 ORIG_RAX: ffffffffffffff11
>>> [ 62.492121] RAX: ffffffffa24fbdb0 RBX: ffff92a1f8f82000 RCX: 0000000000000001
>>> [ 62.492121] RDX: ffffa477c0371000 RSI: ffffa477c0371000 RDI: 0000000000000000
>>> [ 62.492121] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000001080020
>>> [ 62.492121] R10: ffffdc7cc1e4fc00 R11: 0000000000000000 R12: 0000000000000000
>>> [ 62.492121] R13: 0000000000000000 R14: ffff92a1f93f0000 R15: ffff92a1f8e1aa80
>>> [ 62.492121] ? vp_synchronize_vectors+0x60/0x60
>>> [ 62.492121] vp_notify+0x12/0x20
>>> [ 62.492121] virtqueue_notify+0x18/0x30
>>> [ 62.492121] virtio_queue_rq+0x2f5/0x300 [virtio_blk]
>>> [ 62.492121] blk_mq_dispatch_rq_list+0x7e/0x4a0
>>> [ 62.492121] blk_mq_do_dispatch_sched+0x4a/0xd0
>>> [ 62.492121] blk_mq_sched_dispatch_requests+0x106/0x170
>>> [ 62.492121] __blk_mq_run_hw_queue+0x80/0x90
>>> [ 62.492121] process_one_work+0x1e3/0x420
>>> [ 62.492121] worker_thread+0x2b/0x3d0
>>> [ 62.492121] ? process_one_work+0x420/0x420
>>> [ 62.492121] kthread+0x113/0x130
>>> [ 62.492121] ? kthread_create_worker_on_cpu+0x50/0x50
>>> [ 62.492121] ret_from_fork+0x3a/0x50
>>> [ 62.492121] kworker/0:0H R running task 0 4 2 0x80000000
>>> [ 62.492121] Workqueue: kblockd blk_mq_run_work_fn
>>> [ 62.492121] Call Trace:
>>> [ 62.492121] <IRQ>
>>> [ 62.492121] sched_show_task+0xdf/0x100
>>> [ 62.492121] rcu_print_detail_task_stall_rnp+0x48/0x69
>>> [ 62.492121] rcu_check_callbacks+0x972/0x9d0
>>> [ 62.492121] ? tick_sched_do_timer+0x40/0x40
>>> [ 62.492121] update_process_times+0x28/0x50
>>> [ 62.492121] tick_sched_handle+0x22/0x70
>>> [ 62.492121] tick_sched_timer+0x34/0x70
>>> [ 62.492121] __hrtimer_run_queues+0xcc/0x250
>>> [ 62.492121] hrtimer_interrupt+0xab/0x1f0
>>> [ 62.492121] smp_apic_timer_interrupt+0x62/0x150
>>> [ 62.492121] apic_timer_interrupt+0xa2/0xb0
>>> [ 62.492121] </IRQ>
>>> [ 62.492121] RIP: 0010:iowrite16+0x1d/0x30
>>> [ 62.492121] RSP: 0018:ffffa477c034fcc8 EFLAGS: 00010292 ORIG_RAX: ffffffffffffff11
>>> [ 62.492121] RAX: ffffffffa24fbdb0 RBX: ffff92a1f8f82000 RCX: 0000000000000001
>>> [ 62.492121] RDX: ffffa477c0371000 RSI: ffffa477c0371000 RDI: 0000000000000000
>>> [ 62.492121] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000001080020
>>> [ 62.492121] R10: ffffdc7cc1e4fc00 R11: 0000000000000000 R12: 0000000000000000
>>> [ 62.492121] R13: 0000000000000000 R14: ffff92a1f93f0000 R15: ffff92a1f8e1aa80
>>> [ 62.492121] ? vp_synchronize_vectors+0x60/0x60
>>> [ 62.492121] vp_notify+0x12/0x20
>>> [ 62.492121] virtqueue_notify+0x18/0x30
>>> [ 62.492121] virtio_queue_rq+0x2f5/0x300 [virtio_blk]
>>> [ 62.492121] blk_mq_dispatch_rq_list+0x7e/0x4a0
>>> [ 62.492121] blk_mq_do_dispatch_sched+0x4a/0xd0
>>> [ 62.492121] blk_mq_sched_dispatch_requests+0x106/0x170
>>> [ 62.492121] __blk_mq_run_hw_queue+0x80/0x90
>>> [ 62.492121] process_one_work+0x1e3/0x420
>>> [ 62.492121] worker_thread+0x2b/0x3d0
>>> [ 62.492121] ? process_one_work+0x420/0x420
>>> [ 62.492121] kthread+0x113/0x130
>>> [ 62.492121] ? kthread_create_worker_on_cpu+0x50/0x50
>>> [ 62.492121] ret_from_fork+0x3a/0x50
>>>
>>> Another important thing is that the commit works well on other
>>> hardware with the same setup (same host kernel, same qemu command line
>>> and host kernel binaries). How could I try to find the issue reason?
>>>
>>> To manage notifications about this bug go to:
>>> https://bugs.launchpad.net/qemu/+bug/1750229/+subscriptions
--
With best regards,
Matwey V. Kornilov
* Re: [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
2018-02-21 8:58 ` Matwey V. Kornilov
@ 2018-02-22 11:17 ` Matwey V. Kornilov
0 siblings, 0 replies; 8+ messages in thread
From: Matwey V. Kornilov @ 2018-02-22 11:17 UTC (permalink / raw)
To: qemu-devel
The same story with the 4.15.4 host kernel.
2018-02-21 11:58 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
> Well, last_avail_idx equals shadow_avail_idx and both of them are 1
> on the qemu side, so only one request is transferred.
> I wonder why; probably something is being cached badly, but the new
> avail_idx (which is supposed to become 2) never shows up.
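The handshake being probed here is the virtio split-ring avail index: the guest writes a descriptor head into the avail ring and then bumps the shared avail->idx, and the device compares that counter against its private last_avail_idx. A minimal self-contained C sketch of the mechanism, using spec-style names rather than code from qemu or the guest kernel:

  /* Illustration of the split-ring avail-index handshake; not code
   * from qemu or Linux. */
  #include <stdint.h>

  #define QUEUE_SIZE 128                /* any power of two */

  struct vring_avail {
      uint16_t flags;
      uint16_t idx;                     /* written by guest, read by device */
      uint16_t ring[QUEUE_SIZE];        /* head indices of posted chains */
  };

  /* Guest side: post one descriptor chain. */
  static void guest_publish(struct vring_avail *avail, uint16_t desc_head)
  {
      avail->ring[avail->idx % QUEUE_SIZE] = desc_head;
      /* a write barrier belongs here, so the device sees the ring
       * entry before the index bump */
      avail->idx++;                     /* should go 1 -> 2 in this bug */
  }

  /* Device side: is anything new pending? */
  static int device_has_work(const struct vring_avail *avail,
                             uint16_t last_avail_idx)
  {
      return avail->idx != last_avail_idx;   /* stays 1 == 1 here */
  }

In the failure above, the idx value visible to the device never advances past 1, so the device-side comparison keeps reporting no work even though the guest has posted a second request.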
>
> 2018-02-20 15:49 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
>> virtqueue_pop() returns NULL because virtio_queue_empty_rcu() returns
>> true all the time.
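The emptiness test named here boils down to comparing the device's private last_avail_idx against the avail->idx it reads from guest memory, with shadow_avail_idx caching the last value read. A simplified, self-contained paraphrase of that logic with stand-in types (the real function in qemu's hw/virtio/virtio.c reads the index through an RCU-protected mapping of the guest ring, hence the _rcu suffix; this is not the verbatim source):

  #include <stdbool.h>
  #include <stdint.h>

  /* Stand-in for qemu's VirtQueue; only the two fields named in this
   * thread are kept. */
  struct vq_state {
      uint16_t last_avail_idx;      /* entries the device has consumed */
      uint16_t shadow_avail_idx;    /* last avail->idx read from the guest */
  };

  /* Stand-in for re-reading avail->idx from guest memory. */
  static uint16_t read_guest_avail_idx(struct vq_state *vq,
                                       const volatile uint16_t *guest_idx)
  {
      vq->shadow_avail_idx = *guest_idx;
      return vq->shadow_avail_idx;
  }

  static bool queue_empty(struct vq_state *vq,
                          const volatile uint16_t *guest_idx)
  {
      if (vq->shadow_avail_idx != vq->last_avail_idx) {
          return false;             /* pending work already known */
      }
      /* With both counters stuck at 1, as reported above, this stays
       * true forever and virtqueue_pop() keeps returning NULL. */
      return read_guest_avail_idx(vq, guest_idx) == vq->last_avail_idx;
  }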
>>
>> 2018-02-20 14:47 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
>>> Well, I've found that on the qemu side the VirtQueue for the virtio_blk
>>> device infinitely calls virtio_blk_handle_vq(), where virtio_blk_get_request()
>>> (virtqueue_pop() in essence) always returns NULL.
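The dispatch loop being described has roughly this shape (a hedged sketch with trivial stubs, not qemu's actual virtio_blk_handle_vq(); the stub get_request models the observed failure by always returning NULL):

  #include <stdbool.h>
  #include <stddef.h>

  struct blk_req;   /* opaque request; details do not matter for the shape */

  /* Trivial stubs for the qemu calls named above, wired to reproduce
   * the failure: the queue always looks empty. */
  static struct blk_req *virtio_blk_get_request(void) { return NULL; }
  static void virtio_blk_handle_request(struct blk_req *req) { (void)req; }
  static bool virtio_queue_empty(void) { return true; }

  static void handle_vq(void)
  {
      struct blk_req *req;

      do {
          /* drain everything currently visible on the ring */
          while ((req = virtio_blk_get_request()) != NULL) {
              virtio_blk_handle_request(req);
          }
          /* go around again if requests arrived in the meantime */
      } while (!virtio_queue_empty());
  }

  int main(void) { handle_vq(); return 0; }

Each guest notify re-runs this handler; since the queue always appears empty, it pops nothing and returns, so the request the guest believes it posted is never completed, which matches the guest-side lockup in the traces.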
>>>
>>> 2018-02-18 15:26 GMT+03:00 Matwey V. Kornilov <matwey.kornilov@gmail.com>:
>>>> ** Attachment added: ".build.kernel.kvm"
>>>> https://bugs.launchpad.net/qemu/+bug/1750229/+attachment/5057653/+files/.build.kernel.kvm
--
With best regards,
Matwey V. Kornilov
* [Qemu-devel] [Bug 1750229] Re: virtio-blk-pci regression: softlock in guest kernel at module loading
2018-02-18 11:39 [Qemu-devel] [Bug 1750229] [NEW] virtio-blk-pci regression: softlock in guest kernel at module loading Matwey V. Kornilov
2018-02-18 12:26 ` [Qemu-devel] [Bug 1750229] " Matwey V. Kornilov
2018-02-18 12:26 ` Matwey V. Kornilov
@ 2019-01-03 18:13 ` Matwey V. Kornilov
2 siblings, 0 replies; 8+ messages in thread
From: Matwey V. Kornilov @ 2019-01-03 18:13 UTC (permalink / raw)
To: qemu-devel
This has been fixed in the kernel.
** Changed in: qemu
Status: New => Fix Released
Thread overview: 8+ messages (thread ends 2019-01-03 18:20 UTC):
2018-02-18 11:39 [Qemu-devel] [Bug 1750229] [NEW] virtio-blk-pci regression: softlock in guest kernel at module loading Matwey V. Kornilov
2018-02-18 12:26 ` [Qemu-devel] [Bug 1750229] " Matwey V. Kornilov
2018-02-20 11:47 ` Matwey V. Kornilov
2018-02-20 12:49 ` Matwey V. Kornilov
2018-02-21 8:58 ` Matwey V. Kornilov
2018-02-22 11:17 ` Matwey V. Kornilov
2018-02-18 12:26 ` Matwey V. Kornilov
2019-01-03 18:13 ` Matwey V. Kornilov