* blk-mq and CPU hotplug error
@ 2018-03-26 21:25 joserz
2018-03-27 2:56 ` jianchao.wang
0 siblings, 1 reply; 2+ messages in thread
From: joserz @ 2018-03-26 21:25 UTC (permalink / raw)
To: linux-block; +Cc: hch
Hello everyone!
I'm running Ubuntu 18.04 (4.15.0-12-generic) in KVM/QEMU (powerpc64le).
Everything looks good until I try to hotplug CPUs in my VM. As soon as I
do that I get the following error in my VM dmesg:
[ 763.629425] WARNING: CPU: 34 PID: 2276 at /build/linux-zBVy54/linux-4.15.0/block/blk-mq.c:1211 __blk_mq_run_hw_queue+0xf0/0x140
...
[ 763.629497] CPU: 34 PID: 2276 Comm: kworker/34:1H Not tainted 4.15.0-12-generic #13-Ubuntu
[ 763.629501] Workqueue: kblockd blk_mq_run_work_fn
See that blk-mq complains about CPU 34. This CPU is one of the
hotplugged CPUs.
Looking at mq/0/cpu_list, I see the my new CPUs aren't in that list:
# cat /sys/block/vda/mq/0/cpu_list
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
But the real problem comes after it, if I try to run any command that
performs intensive IO operations, like:
# dd if=/dev/random of=test.img bs=1M count=100
I start to get errors like:
[ 967.661543] INFO: task kworker/u128:3:454 blocked for more than 120 seconds.
[ 967.661656] Tainted: G W 4.15.0-12-generic #13-Ubuntu
[ 967.661705] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 967.661762] kworker/u128:3 D 0 454 2 0x00000800
[ 967.661802] Workqueue: writeback wb_workfn (flush-252:0)
[ 967.661805] Call Trace:
[ 967.661808] [c000000267e5f1d0] [0000002200000000] 0x2200000000 (unreliable)
[ 967.661813] [c000000267e5f3a0] [c00000000001c220] __switch_to+0x2a0/0x4d0
[ 967.661823] [c000000267e5f400] [c000000000d035e4] __schedule+0x2a4/0xaf0
[ 967.661825] [c000000267e5f4d0] [c000000000d03e70] schedule+0x40/0xc0
[ 967.661834] [c000000267e5f4f0] [c00000000014f77c] io_schedule+0x2c/0x50
[ 967.661838] [c000000267e5f520] [c0000000002d810c] __lock_page+0x14c/0x1c0
[ 967.661842] [c000000267e5f5c0] [c0000000004c9288] mpage_prepare_extent_to_map+0x2c8/0x3d0
[ 967.661844] [c000000267e5f6c0] [c0000000004d16bc] ext4_writepages+0x49c/0xfd0
[ 967.661847] [c000000267e5f870] [c0000000002f1a9c] do_writepages+0x4c/0x130
[ 967.661849] [c000000267e5f8e0] [c00000000041a370] __writeback_single_inode+0x70/0x570
[ 967.661852] [c000000267e5f940] [c00000000041af0c] writeback_sb_inodes+0x26c/0x5d0
[ 967.661854] [c000000267e5fa50] [c00000000041b308] __writeback_inodes_wb+0x98/0x110
[ 967.661856] [c000000267e5fab0] [c00000000041b72c] wb_writeback+0x29c/0x460
[ 967.661859] [c000000267e5fb80] [c00000000041c38c] wb_workfn+0x1ec/0x600
[ 967.661862] [c000000267e5fc90] [c000000000131538] process_one_work+0x298/0x5a0
[ 967.661864] [c000000267e5fd20] [c0000000001318d8] worker_thread+0x98/0x630
[ 967.661866] [c000000267e5fdc0] [c00000000013a508] kthread+0x1a8/0x1b0
[ 967.661869] [c000000267e5fe30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 967.661872] INFO: task jbd2/vda2-8:579 blocked for more than 120 seconds.
[ 967.661924] Tainted: G W 4.15.0-12-generic #13-Ubuntu
[ 967.661973] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 967.662035] jbd2/vda2-8 D 0 579 2 0x00000800
[ 967.662037] Call Trace:
...snip
Which leads to an unresponsive VM that forces me to reboot it. After
searching the internet I changed the IO scheduler:
# cat /sys/block/vda/queue/scheduler
[none]
# modprobe bfq
# echo bfq > /sys/block/vda/queue/scheduler
# cat /sys/block/vda/queue/scheduler
[bfq] none
which seems to be enough (it still throws that WARNING, but no
hung_task_timeout_secs so far).
Now I'm not sure on how to attack the problem. At first I thought that
CPU hotplug IRQ should also update the cpu_list but now I think that the
problem could be in the queue, maybe it should give up scheduling tasks
on the failing CPU and try another one. What do you people think?
Any suggestion on how to debug it is welcome as well.
Thank you!
Jose R. Ziviani
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: blk-mq and CPU hotplug error
2018-03-26 21:25 blk-mq and CPU hotplug error joserz
@ 2018-03-27 2:56 ` jianchao.wang
0 siblings, 0 replies; 2+ messages in thread
From: jianchao.wang @ 2018-03-27 2:56 UTC (permalink / raw)
To: joserz, linux-block; +Cc: hch
Hi Jose
On 03/27/2018 05:25 AM, joserz@linux.vnet.ibm.com wrote:
> Hello everyone!
>
> I'm running Ubuntu 18.04 (4.15.0-12-generic) in KVM/QEMU (powerpc64le).
> Everything looks good until I try to hotplug CPUs in my VM. As soon as I
> do that I get the following error in my VM dmesg:
Please refer to the following commit:
20e4d813 blk-mq: simplify queue mapping & schedule with each possisble CPU
84676c1 genirq/affinity: assign vectors to all possible CPUs
Thanks
Jianchao
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2018-03-27 2:56 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-26 21:25 blk-mq and CPU hotplug error joserz
2018-03-27 2:56 ` jianchao.wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).