Re: [PATCH V5 0/6] blk-mq: improvement CPU hotplug

public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed

From: John Garry <john.garry@huawei.com>
To: Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>
Cc: <linux-block@vger.kernel.org>,
	Bart Van Assche <bvanassche@acm.org>,
	Hannes Reinecke <hare@suse.com>, Christoph Hellwig <hch@lst.de>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	Keith Busch <keith.busch@intel.com>,
	chenxiang <chenxiang66@hisilicon.com>
Subject: Re: [PATCH V5 0/6] blk-mq: improvement CPU hotplug
Date: Mon, 20 Jan 2020 13:23:35 +0000	[thread overview]
Message-ID: <6dbe8c9f-af4e-3157-b6e9-6bbf43efb1e1@huawei.com> (raw)
In-Reply-To: <929dbfac-de46-a947-6a2c-f4d8d504c631@huawei.com>

On 15/01/2020 17:00, John Garry wrote:
> On 15/01/2020 11:44, Ming Lei wrote:
>> Hi,
>>
>> Thomas mentioned:
>>      "
>>       That was the constraint of managed interrupts from the very 
>> beginning:
>>        The driver/subsystem has to quiesce the interrupt line and the 
>> associated
>>        queue _before_ it gets shutdown in CPU unplug and not fiddle 
>> with it
>>        until it's restarted by the core when the CPU is plugged in again.
>>      "
>>
>> But no drivers or blk-mq do that before one hctx becomes inactive(all
>> CPUs for one hctx are offline), and even it is worse, blk-mq stills tries
>> to run hw queue after hctx is dead, see blk_mq_hctx_notify_dead().
>>
>> This patchset tries to address the issue by two stages:
>>
>> 1) add one new cpuhp state of CPUHP_AP_BLK_MQ_ONLINE
>>
>> - mark the hctx as internal stopped, and drain all in-flight requests
>> if the hctx is going to be dead.
>>
>> 2) re-submit IO in the state of CPUHP_BLK_MQ_DEAD after the hctx 
>> becomes dead
>>
>> - steal bios from the request, and resubmit them via 
>> generic_make_request(),
>> then these IO will be mapped to other live hctx for dispatch
>>
>> Please comment & review, thanks!
>>
>> V5:
>>     - rename BLK_MQ_S_INTERNAL_STOPPED as BLK_MQ_S_INACTIVE
>>     - re-factor code for re-submit requests in cpu dead hotplug handler
>>     - address requeue corner case
>>
>> V4:
>>     - resubmit IOs in dispatch list in case that this hctx is dead
>>
>> V3:
>>     - re-organize patch 2 & 3 a bit for addressing Hannes's comment
>>     - fix patch 4 for avoiding potential deadlock, as found by Hannes
>>
>> V2:
>>     - patch4 & patch 5 in V1 have been merged to block tree, so remove
>>       them
>>     - address comments from John Garry and Minwoo
>>
>>

Hi Ming,

Colleague Xiang Chen found this:

Starting 40 processes
Jobs: 40 (f=40)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [8.3% done] 
[2899MB/0KB/0KB /s] [742K/0/0 iops] [eta 00m:33s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [11.1% done] 
[2892MB/0KB/0KB /s] [740K/0/0 iops] [eta 00m:32s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [13.9% done] 
[2887MB/0KB/0KB /s] [739K/0/0 iops] [eta 00m:31s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [16.7% done] 
[2850MB/0KB/0KB /s] [730K/0/0 iops] [eta 00m:30s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [19.4% done] 
[2834MB/0KB/0KB /s] [726K/0/0 iops] [eta 00m:29s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [22.2% done] 
[2827MB/0KB/0KB /s] [724K/0/0 iops] [eta 00m:28s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [25.0% done] 
[2804MB/0KB/0KB /s] [718K/0/0 iops] [eta 00m:27s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [27.8% done] 
[2783MB/0KB/0KB /s] [713K/0/0 iops] [eta 00m:26s]
[  112.990292] CPU16: shutdown
[  112.993113] psci: CPU16 killed (polled 0 ms)
[  113.029429] CPU15: shutdown
[  113.032245] psci: CPU15 killed (polled 0 ms)
[  113.073649] CPU14: shutdown
[  113.076461] psci: CPU14 killed (polled 0 ms)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRR[  113.121258] CPU13: 
shutdown
RRRRRRRRRR] [30.[  113.124247] psci: CPU13 killed (polled 0 ms)
6% done] [2772MB/0KB/0KB /s] [710K/0/0 iops] [eta 00m:25s]
[  113.177108] IRQ 600: no longer affine to CPU12
[  113.183185] CPU12: shutdown
[  113.186001] psci: CPU12 killed (polled 0 ms)
[  113.242555] CPU11: shutdown
[  113.245368] psci: CPU11 killed (polled 0 ms)
[  113.302598] CPU10: shutdown
[  113.305420] psci: CPU10 killed (polled 0 ms)
[  113.365757] CPU9: shutdown
[  113.368489] psci: CPU9 killed (polled 0 ms)
[  113.611904] IRQ 599: no longer affine to CPU8
[  113.620611] CPU8: shutdown
[  113.623340] psci: CPU8 killed (polled 0 ms)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [33.3% done] 
[2765MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:24s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [36.1% done] 
[2767MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:23s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [38.9% done] 
[2771MB/0KB/0KB /s] [709K/0/0 iops] [eta 00m:22s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [41.7% done] 
[2780MB/0KB/0KB /s] [712K/0/0 iops] [eta 00m:21s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [44.4% done] 
[2731MB/0KB/0KB /s] [699K/0/0 iops] [eta 00m:20s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [47.2% done] 
[2757MB/0KB/0KB /s] [706K/0/0 iops] [eta 00m:19s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [50.0% done] 
[2764MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:18s]
[  120.762141] ------------[ cut here ]------------
[  120.766760] WARNING: CPU: 0 PID: 10 at block/blk.h:299 
generic_make_request_checks+0x480/0x4e8
[  120.775348] Modules linked in:
[  120.778397] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.5.0-

Then this:

1158048319d:18h:40m:34s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049605d:23h:03m:30s]
[  141.915198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  141.921280] rcu: 	96-...0: (4 GPs behind) idle=af6/0/0x1 
softirq=90/90 fqs=2626
[  141.928658] rcu: 	99-...0: (2 GPs behind) 
idle=b82/1/0x4000000000000000 softirq=82/82 fqs=2627
[  141.937339] 	(detected by 17, t=5254 jiffies, g=3925, q=545)
[  141.942985] Task dump for CPU 96:
[  141.946292] swapper/96      R  running task        0     0      1 
0x0000002a
[  141.953321] Call trace:
[  141.955763]  __switch_to+0xbc/0x218
[  141.959244]  0x0
[  141.961079] Task dump for CPU 99:
[  141.964385] kworker/99:1H   R  running task        0  3473      2 
0x0000022a
[  141.971417] Workqueue: kblockd blk_mq_run_work_fn
[  141.976109] Call trace:
[  141.978550]  __switch_to+0xbc/0x218
[  141.982029]  blk_mq_run_work_fn+0x1c/0x28
[  141.986027]  process_one_work+0x1e0/0x358
[  141.990025]  worker_thread+0x40/0x488
[  141.993678]  kthread+0x118/0x120
[  141.996897]  ret_from_fork+0x10/0x18


Fuller log at bottom.

We're basing the patchset on v5.5-rc6:
https://github.com/hisilicon/kernel-dev/tree/private-topic-sas-5.5-mq-debug-improve-cpu-hotplug-v5 
(we have not tested my latest patch there, but I doubt that this is the 
problem).

Not sure if something missing which is in -next.

Cheers,
John



  uptime:	0 days, 0 hours
Usage on /:	8%		Swap usage:	0.0%
IP Addresses:	192.168.1.164

Euler:~ # export TMOUT=0
Euler:~ # fdisk -l | grep sd | wc -l
7
Euler:~ # ps -aux | grep irqbalance
root       3307  0.0  0.0 106076  1796 ttyS0    S+   05:58   0:00 grep 
--color=auto irqbalance
Euler:~ # echo 15 > /proc/sys/kernel/printk
Euler:~ #
Euler:~ #
Euler:~ # ./hotplug_sas_d06.sh
Looping ... number 0
Creat 4k_read_depth2048_fiotest file sucessfully
cp: cannot stat â€˜liba*â€™: No such file or directory
chmod: cannot access â€˜fioâ€™: No such file or directory
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=2048
...
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=2048
...
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=2048
...
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=2048
...
fio-2.1.10
Starting 40 processes
Jobs: 40 (f=40)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [8.3% done] 
[2899MB/0KB/0KB /s] [742K/0/0 iops] [eta 00m:33s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [11.1% done] 
[2892MB/0KB/0KB /s] [740K/0/0 iops] [eta 00m:32s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [13.9% done] 
[2887MB/0KB/0KB /s] [739K/0/0 iops] [eta 00m:31s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [16.7% done] 
[2850MB/0KB/0KB /s] [730K/0/0 iops] [eta 00m:30s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [19.4% done] 
[2834MB/0KB/0KB /s] [726K/0/0 iops] [eta 00m:29s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [22.2% done] 
[2827MB/0KB/0KB /s] [724K/0/0 iops] [eta 00m:28s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [25.0% done] 
[2804MB/0KB/0KB /s] [718K/0/0 iops] [eta 00m:27s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [27.8% done] 
[2783MB/0KB/0KB /s] [713K/0/0 iops] [eta 00m:26s]
[  112.990292] CPU16: shutdown
[  112.993113] psci: CPU16 killed (polled 0 ms)
[  113.029429] CPU15: shutdown
[  113.032245] psci: CPU15 killed (polled 0 ms)
[  113.073649] CPU14: shutdown
[  113.076461] psci: CPU14 killed (polled 0 ms)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRR[  113.121258] CPU13: 
shutdown
RRRRRRRRRR] [30.[  113.124247] psci: CPU13 killed (polled 0 ms)
6% done] [2772MB/0KB/0KB /s] [710K/0/0 iops] [eta 00m:25s]
[  113.177108] IRQ 600: no longer affine to CPU12
[  113.183185] CPU12: shutdown
[  113.186001] psci: CPU12 killed (polled 0 ms)
[  113.242555] CPU11: shutdown
[  113.245368] psci: CPU11 killed (polled 0 ms)
[  113.302598] CPU10: shutdown
[  113.305420] psci: CPU10 killed (polled 0 ms)
[  113.365757] CPU9: shutdown
[  113.368489] psci: CPU9 killed (polled 0 ms)
[  113.611904] IRQ 599: no longer affine to CPU8
[  113.620611] CPU8: shutdown
[  113.623340] psci: CPU8 killed (polled 0 ms)
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [33.3% done] 
[2765MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:24s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [36.1% done] 
[2767MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:23s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [38.9% done] 
[2771MB/0KB/0KB /s] [709K/0/0 iops] [eta 00m:22s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [41.7% done] 
[2780MB/0KB/0KB /s] [712K/0/0 iops] [eta 00m:21s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [44.4% done] 
[2731MB/0KB/0KB /s] [699K/0/0 iops] [eta 00m:20s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [47.2% done] 
[2757MB/0KB/0KB /s] [706K/0/0 iops] [eta 00m:19s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [50.0% done] 
[2764MB/0KB/0KB /s] [708K/0/0 iops] [eta 00m:18s]
[  120.762141] ------------[ cut here ]------------
[  120.766760] WARNING: CPU: 0 PID: 10 at block/blk.h:299 
generic_make_request_checks+0x480/0x4e8
[  120.775348] Modules linked in:
[  120.778397] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 
5.5.0-rc6-31756-gdddc331 #333
[  120.786282] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 
2280-V2 CS V3.B160.02 01/15/2020
[  120.795129] pstate: 80c00089 (Nzcv daIf +PAN +UAO)
[  120.799907] pc : generic_make_request_checks+0x480/0x4e8
[  120.805205] lr : generic_make_request_checks+0x48/0x4e8
[  120.810415] sp : ffff80001052b700
[  120.813720] x29: ffff80001052b700 x28: 0000000000000dd0
[  120.819017] x27: ffff00277b6c0d00 x26: ffff00277863ea18
[  120.824313] x25: 0000000000000000 x24: ffff00277b6c0d6c
[  120.829612] x23: 0000000000000008 x22: ffff0027d66c7800
[  120.834909] x21: ffff2027c7133790 x20: ffffc8fcee7c9000
[  120.840205] x19: ffff00277863ea18 x18: 000000000000000e
[  120.845502] x17: 0000000000000007 x16: 0000000000000001
[  120.850801] x15: 0000000000000019 x14: 0000000000000033
[  120.856098] x13: 0000000000000001 x12: 00000000015cd30a
[  120.861394] x11: 0000000000000001 x10: 0000000000000040
[  120.866692] x9 : ffffc8fcee7df550 x8 : ffffc8fcee3fb018
[  120.871990] x7 : ffff0027d2010010 x6 : 00000006711b402b
[  120.877287] x5 : 00000000000000c0 x4 : 0000000000000006
[  120.882585] x3 : ffffffffffffffff x2 : 00000000ffffffff
[  120.887882] x1 : 0000000000000080 x0 : 0000000000000080
[  120.893181] Call trace:
[  120.895621]  generic_make_request_checks+0x480/0x4e8
[  120.900571]  generic_make_request+0x2c/0x320
[  120.904832]  blk_mq_hctx_deactivate+0x1a0/0x2e8
[  120.909352]  __blk_mq_delay_run_hw_queue+0x220/0x230
[  120.914303]  blk_mq_run_hw_queue+0xa0/0x110
[  120.918475]  blk_mq_run_hw_queues+0x48/0x70
[  120.922649]  scsi_end_request+0x1d8/0x230
[  120.926648]  scsi_io_completion+0x70/0x520
[  120.930735]  scsi_finish_command+0xe0/0xf0
[  120.934822]  scsi_softirq_done+0x80/0x190
[  120.938821]  blk_mq_complete_request+0xc4/0x160
[  120.943339]  scsi_mq_done+0x70/0xc8
[  120.946821]  ata_scsi_qc_complete+0xa0/0x3f0
[  120.951081]  __ata_qc_complete+0xa4/0x150
[  120.955079]  ata_qc_complete+0xe8/0x200
[  120.958906]  sas_ata_task_done+0x194/0x2d0
[  120.962992]  cq_tasklet_v3_hw+0x12c/0x5e8
[  120.966991]  tasklet_action_common.isra.15+0x11c/0x190
[  120.972115]  tasklet_action+0x24/0x30
[  120.975767]  efi_header_end+0x114/0x234
[  120.979592]  run_ksoftirqd+0x38/0x48
[  120.983159]  smpboot_thread_fn+0x16c/0x270
[  120.987245]  kthread+0x118/0x120
[  120.990465]  ret_from_fork+0x10/0x18
[  120.994031] ---[ end trace 0125724e2f3f2476 ]---
[  120.998643] ------------[ cut here ]------------
[  121.003250] WARNING: CPU: 0 PID: 10 at block/blk-mq.c:1365 
__blk_mq_run_hw_queue+0xdc/0x128
[  121.011576] Modules linked in:
[  121.014623] CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G        W 
    5.5.0-rc6-31756-gdddc331 #333
[  121.023902] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 
2280-V2 CS V3.B160.02 01/15/2020
[  121.032749] pstate: 00c00089 (nzcv daIf +PAN +UAO)
[  121.037526] pc : __blk_mq_run_hw_queue+0xdc/0x128
[  121.042217] lr : __blk_mq_delay_run_hw_queue+0x1ac/0x230
[  121.047515] sp : ffff80001052b540
[  121.050820] x29: ffff80001052b540 x28: 0000000000000dd0
[  121.056118] x27: ffff00277b6c0d00 x26: ffff00277863ea18
[  121.061416] x25: 000000000000000c x24: 000000000000000a
[  121.066714] x23: 0000000000000001 x22: 0000000000000000
[  121.072012] x21: ffffc8fcee7c9000 x20: 0000000000000000
[  121.077310] x19: ffff0027cb7f8000 x18: 000000000000000e
[  121.082608] x17: 0000000000000007 x16: 0000000000000001
[  121.087905] x15: 0000000000000019 x14: 0000000000000033
[  121.093203] x13: 0000000000000001 x12: 00000000015cd30a
[  121.098501] x11: 0000000000000001 x10: 0000000000000040
[  121.103799] x9 : 00000000ffff4840 x8 : ffffc8fcee3fb018
[  121.109097] x7 : ffffc8fcee7c98e8 x6 : 0000000000000000
[  121.114395] x5 : ffffc8fcee7c98c8 x4 : 0000000000000000
[  121.119692] x3 : 0000000000000000 x2 : ffffc8fcee7c98c8
[  121.124990] x1 : 00000000000000ff x0 : 0000000000000103
[  121.130288] Call trace:
[  121.132729]  __blk_mq_run_hw_queue+0xdc/0x128
[  121.137073]  __blk_mq_delay_run_hw_queue+0x1ac/0x230
[  121.142025]  blk_mq_run_hw_queue+0xa0/0x110
[  121.146196]  blk_mq_request_bypass_insert+0x54/0x68
[  121.151060]  __blk_mq_try_issue_directly+0x60/0x208
[  121.155926]  blk_mq_try_issue_directly+0x50/0xb8
[  121.160529]  blk_mq_make_request+0x3d0/0x420
[  121.164787]  generic_make_request+0xf4/0x320
[  121.169046]  blk_mq_hctx_deactivate+0x1a0/0x2e8
[  121.173566]  __blk_mq_delay_run_hw_queue+0x220/0x230
[  121.178517]  blk_mq_run_hw_queue+0xa0/0x110
[  121.182688]  blk_mq_run_hw_queues+0x48/0x70
[  121.186860]  scsi_end_request+0x1d8/0x230
[  121.190858]  scsi_io_completion+0x70/0x520
[  121.194944]  scsi_finish_command+0xe0/0xf0
[  121.199029]  scsi_softirq_done+0x80/0x190
[  121.203027]  blk_mq_complete_request+0xc4/0x160
[  121.207545]  scsi_mq_done+0x70/0xc8
[  121.211024]  ata_scsi_qc_complete+0xa0/0x3f0
[  121.215283]  __ata_qc_complete+0xa4/0x150
[  121.219281]  ata_qc_complete+0xe8/0x200
[  121.223106]  sas_ata_task_done+0x194/0x2d0
[  121.227191]  cq_tasklet_v3_hw+0x12c/0x5e8
[  121.231190]  tasklet_action_common.isra.15+0x11c/0x190
[  121.236312]  tasklet_action+0x24/0x30
[  121.239965]  efi_header_end+0x114/0x234
[  121.243790]  run_ksoftirqd+0x38/0x48
[  121.247356]  smpboot_thread_fn+0x16c/0x270
[  121.251441]  kthread+0x118/0x120
[  121.254661]  ret_from_fork+0x10/0x18
[  121.258226] ---[ end trace 0125724e2f3f2477 ]---
[  121.267999] BUG: scheduling while atomic: ksoftirqd/0/10/0x00000103
[  121.274248] Modules linked in:
[  121.277295] CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G        W 
    5.5.0-rc6-31756-gdddc331 #333
[  121.286574] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 
2280-V2 CS V3.B160.02 01/15/2020
[  121.295420] Call trace:
[  121.297861]  dump_backtrace+0x0/0x1a0
[  121.301513]  show_stack+0x14/0x20
[  121.304822]  dump_stack+0xbc/0x104
[  121.308216]  __schedule_bug+0x50/0x70
[  121.311869]  __schedule+0x534/0x598
[  121.315349]  schedule+0x74/0x100
[  121.318569]  io_schedule+0x18/0x98
[  121.321963]  blk_mq_get_tag+0x150/0x300
[  121.325789]  blk_mq_get_request+0x120/0x3a8
[  121.329961]  blk_mq_make_request+0xf8/0x420
[  121.334134]  generic_make_request+0xf4/0x320
[  121.338392]  blk_mq_hctx_deactivate+0x1a0/0x2e8
[  121.342910]  __blk_mq_delay_run_hw_queue+0x220/0x230
[  121.347861]  blk_mq_run_hw_queue+0xa0/0x110
[  121.352034]  blk_mq_run_hw_queues+0x48/0x70
[  121.356205]  scsi_end_request+0x1d8/0x230
[  121.360204]  scsi_io_completion+0x70/0x520
[  121.364289]  scsi_finish_command+0xe0/0xf0
[  121.368373]  scsi_softirq_done+0x80/0x190
[  121.372372]  blk_mq_complete_request+0xc4/0x160
[  121.376891]  scsi_mq_done+0x70/0xc8
[  121.380371]  ata_scsi_qc_complete+0xa0/0x3f0
[  121.384629]  __ata_qc_complete+0xa4/0x150
[  121.388629]  ata_qc_complete+0xe8/0x200
[  121.392453]  sas_ata_task_done+0x194/0x2d0
[  121.396540]  cq_tasklet_v3_hw+0x12c/0x5e8
[  121.400540]  tasklet_action_common.isra.15+0x11c/0x190
[  121.405664]  tasklet_action+0x24/0x30
[  121.409316]  efi_header_end+0x114/0x234
[  121.413141]  run_ksoftirqd+0x38/0x48
[  121.416706]  smpboot_thread_fn+0x16c/0x270
[  121.420792]  kthread+0x118/0x120
[  121.424013]  ret_from_fork+0x10/0x18
Jobs: 40 (f=40):[  121.427627] NOHZ: local_softirq_pending 2c2
  [RRRRRRRRRRRRRR[  121.433311] NOHZ: local_softirq_pending 40
RRRRRRRRRRRRRRRR[  121.438658] NOHZ: local_softirq_pending 40
RRRRRRRRRR] [52.[  121.444103] NOHZ: local_softirq_pending 40
8% done] [1676MB[  121.449564] NOHZ: local_softirq_pending 40
/0KB/0KB /s] [42[  121.455028] NOHZ: local_softirq_pending 40
9K/0/0 iops] [et[  121.460491] NOHZ: local_softirq_pending 40
a 00m:17s]
[  121.465948] NOHZ: local_softirq_pending 40
[  121.470963] NOHZ: local_softirq_pending 40
[  121.475055] NOHZ: local_softirq_pending 40
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [55.6% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:16s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [58.3% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:15s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [61.1% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:14s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [63.9% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:13s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [66.7% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:12s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [69.4% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:11s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [72.2% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:10s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [75.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:09s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [77.8% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:08s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [80.6% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:07s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [83.3% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:06s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [86.1% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:05s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [88.9% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:04s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [91.7% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:03s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [94.4% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:02s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [97.2% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:01s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [100.0% 
done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:00s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047033d:14h:17m:38s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048319d:18h:40m:34s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049605d:23h:03m:30s]
[  141.915198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  141.921280] rcu: 	96-...0: (4 GPs behind) idle=af6/0/0x1 
softirq=90/90 fqs=2626
[  141.928658] rcu: 	99-...0: (2 GPs behind) 
idle=b82/1/0x4000000000000000 softirq=82/82 fqs=2627
[  141.937339] 	(detected by 17, t=5254 jiffies, g=3925, q=545)
[  141.942985] Task dump for CPU 96:
[  141.946292] swapper/96      R  running task        0     0      1 
0x0000002a
[  141.953321] Call trace:
[  141.955763]  __switch_to+0xbc/0x218
[  141.959244]  0x0
[  141.961079] Task dump for CPU 99:
[  141.964385] kworker/99:1H   R  running task        0  3473      2 
0x0000022a
[  141.971417] Workqueue: kblockd blk_mq_run_work_fn
[  141.976109] Call trace:
[  141.978550]  __switch_to+0xbc/0x218
[  141.982029]  blk_mq_run_work_fn+0x1c/0x28
[  141.986027]  process_one_work+0x1e0/0x358
[  141.990025]  worker_thread+0x40/0x488
[  141.993678]  kthread+0x118/0x120
[  141.996897]  ret_from_fork+0x10/0x18
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158042478d:14h:26m:49s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158043549d:01h:15m:55s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158044619d:12h:05m:01s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158045689d:22h:54m:06s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158046760d:09h:43m:12s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047830d:20h:32m:18s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048901d:07h:21m:24s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049971d:18h:10m:30s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048794d:00h:27m:52s]
[  151.263502] softirq: huh, entered softirq 6 TASKLET 000000002aed33dd 
with preempt_count 00000100, exited with fffffffe?
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049816d:15h:16m:18s]
[  151.312427] sas: Enter sas_scsi_recover_host busy: 192 failed: 192
[  151.318613] sas: trying to find task 0x00000000294f58ee
[  151.323830] sas: sas_scsi_find_task: aborting task 0x00000000294f58ee
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158045974d:14h:49m:07s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158046897d:22h:56m:49s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047821d:07h:04m:31s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048744d:15h:12m:13s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158046537d:10h:31m:29s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050141d:20h:12m:10s]
[  157.407215] hisi_sas_v3_hw 0000:74:02.0: internal task abort: timeout 
and not done.
[  157.414855] hisi_sas_v3_hw 0000:74:02.0: abort task: internal abort 
failed
[  157.421720] hisi_sas_v3_hw 0000:74:02.0: abort task: rc=-5
[  157.427202] sas: sas_scsi_find_task: querying task 0x00000000294f58ee
[  157.433634] sas: sas_scsi_find_task: task 0x00000000294f58ee failed 
to abort
[  157.440670] sas: task 0x00000000294f58ee is not at LU: I_T recover
[  157.446841] sas: I_T nexus reset for dev 500e004aaaaaaa02
[  157.452239] hisi_sas_v3_hw 0000:74:02.0: internal task abort: 
executing internal task failed: -22
[  157.461095] hisi_sas_v3_hw 0000:74:02.0: I_T nexus reset: internal 
abort (-22)
[  157.468310] hisi_sas_v3_hw 0000:74:02.0: internal task abort: 
executing internal task failed: -22
[  157.477170] hisi_sas_v3_hw 0000:74:02.0: lu_reset: internal abort failed
[  157.483866] hisi_sas_v3_hw 0000:74:02.0: lu_reset: for device[3]:rc= -22
[  157.490558] hisi_sas_v3_hw 0000:74:02.0: internal task abort: 
executing internal task failed: -22
[  157.499417] hisi_sas_v3_hw 0000:74:02.0: I_T nexus reset: internal 
abort (-22)
[  157.506627] sas: clear nexus ha
[  157.516706] hisi_sas_v3_hw 0000:74:02.0: controller resetting...
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047356d:20h:42m:24s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048189d:11h:12m:35s]
[  159.318030] hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=11
[  159.324114] hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=11
[  159.330195] hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=11
[  159.336274] hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=11
[  159.342354] hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11
[  159.348433] hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11
[  159.354513] hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=11
[  159.360593] hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=11
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049022d:01h:42m:46s]
[  160.349201] hisi_sas_v3_hw 0000:74:02.0: controller reset complete
[  160.349208] sas: broadcast received: 0
[  160.355378] hisi_sas_v3_hw 0000:74:02.0: dump count exceeded!
[  160.359138] sas: REVALIDATING DOMAIN on port 0, pid:937
[  160.370508] sas: ex 500e004aaaaaaa1f phy01 change count has changed
[  160.376993] sas: ex 500e004aaaaaaa1f phy01 originated BROADCAST(CHANGE)
[  160.383603] sas: ex 500e004aaaaaaa1f rediscovering phy01
[  160.389117] sas: ex 500e004aaaaaaa1f phy01:U:5 attached: 
0000000000000000 (no device)
[  160.398068] sas: ex 500e004aaaaaaa1f phy16 originated BROADCAST(CHANGE)
[  160.404676] sas: ex 500e004aaaaaaa1f rediscovering phy16, part of a 
wide port with phy17
[  160.412978] sas: ex 500e004aaaaaaa1f phy16 broadcast flutter
[  160.418751] sas: ex 500e004aaaaaaa1f phy17 originated BROADCAST(CHANGE)
[  160.425358] sas: ex 500e004aaaaaaa1f rediscovering phy17, part of a 
wide port with phy16
[  160.433638] sas: ex 500e004aaaaaaa1f phy17 broadcast flutter
[  160.439421] sas: ex 500e004aaaaaaa1f phy18 originated BROADCAST(CHANGE)
[  160.446030] sas: ex 500e004aaaaaaa1f rediscovering phy18, part of a 
wide port with phy16
[  160.454316] sas: ex 500e004aaaaaaa1f phy18 broadcast flutter
[  160.460105] sas: ex 500e004aaaaaaa1f phy19 originated BROADCAST(CHANGE)
[  160.466712] sas: ex 500e004aaaaaaa1f rediscovering phy19, part of a 
wide port with phy16
[  160.475022] sas: ex 500e004aaaaaaa1f phy19 broadcast flutter
[  160.480801] sas: ex 500e004aaaaaaa1f phy20 originated BROADCAST(CHANGE)
[  160.487410] sas: ex 500e004aaaaaaa1f rediscovering phy20, part of a 
wide port with phy16
[  160.495692] sas: ex 500e004aaaaaaa1f phy20 broadcast flutter
[  160.501421] sas: ex 500e004aaaaaaa1f phy21 originated BROADCAST(CHANGE)
[  160.508029] sas: ex 500e004aaaaaaa1f rediscovering phy21, part of a 
wide port with phy16
[  160.516309] sas: ex 500e004aaaaaaa1f phy21 broadcast flutter
[  160.522048] sas: ex 500e004aaaaaaa1f phy22 originated BROADCAST(CHANGE)
[  160.528653] sas: ex 500e004aaaaaaa1f rediscovering phy22, part of a 
wide port with phy16
[  160.536975] sas: ex 500e004aaaaaaa1f phy22 broadcast flutter
[  160.542699] sas: ex 500e004aaaaaaa1f phy23 originated BROADCAST(CHANGE)
[  160.549307] sas: ex 500e004aaaaaaa1f rediscovering phy23, part of a 
wide port with phy16
[  160.557591] sas: ex 500e004aaaaaaa1f phy23 broadcast flutter
[  160.563318] sas: done REVALIDATING DOMAIN on port 0, pid:937, res 0x0
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049854d:16h:12m:57s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048255d:18h:40m:08s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050345d:15h:15m:03s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049839d:22h:28m:25s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158046025d:15h:04m:58s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158046744d:14h:10m:15s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047614d:11h:05m:16s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049100d:12h:27m:26s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048901d:11h:26m:07s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049620d:10h:31m:24s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050339d:09h:36m:41s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049314d:16h:14m:45s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047401d:22h:42m:56s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048059d:06h:48m:09s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158048716d:14h:53m:22s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049373d:22h:58m:36s]
[  176.479216] sas: clear nexus ha succeeded
[  176.483220] sas: --- Exit sas_eh_handle_sas_errors -- clear_q
[  176.488980] sas: ata2: end_device-0:0:4: cmd error handler
[  176.494464] sas: ata1: end_device-0:0:2: cmd error handler
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050031d:07h:03m:49s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049507d:04h:34m:56s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050077d:05h:12m:47s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049104d:11h:12m:35s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049724d:15h:20m:48s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050344d:19h:29m:01s]
[  182.751214] hisi_sas_v3_hw 0000:74:02.0: internal task abort: timeout 
and not done.
[  182.751217] hisi_sas_v3_hw 0000:74:02.0: dump count exceeded!
[  182.758857] hisi_sas_v3_hw 0000:74:02.0: lu_reset: internal abort failed
[  182.758858] hisi_sas_v3_hw 0000:74:02.0: lu_reset: for device[4]:rc= -5
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158047857d:06h:57m:44s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049091d:00h:22m:57s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] 
[0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158049020d:21h:38m:22s]
Jobs: 40 (f=40): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR

> 
>>
>> Ming Lei (6):
>>    blk-mq: add new state of BLK_MQ_S_INACTIVE
>>    blk-mq: prepare for draining IO when hctx's all CPUs are offline
>>    blk-mq: stop to handle IO and drain IO before hctx becomes inactive
>>    blk-mq: re-submit IO in case that hctx is inactive
>>    blk-mq: handle requests dispatched from IO scheduler in case of
>>      inactive hctx
>>    block: deactivate hctx when all its CPUs are offline when running
>>      queue
>>
>>   block/blk-mq-debugfs.c     |   2 +
>>   block/blk-mq-tag.c         |   2 +-
>>   block/blk-mq-tag.h         |   2 +
>>   block/blk-mq.c             | 172 +++++++++++++++++++++++++++++++------
>>   block/blk-mq.h             |   3 +-
>>   drivers/block/loop.c       |   2 +-
>>   drivers/md/dm-rq.c         |   2 +-
>>   include/linux/blk-mq.h     |   6 ++
>>   include/linux/cpuhotplug.h |   1 +
>>   9 files changed, 163 insertions(+), 29 deletions(-)
>>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Bart Van Assche <bvanassche@acm.org>
>> Cc: Hannes Reinecke <hare@suse.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Keith Busch <keith.busch@intel.com>
>>
> 
> .

next prev parent reply	other threads:[~2020-01-20 13:23 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-15 11:44 [PATCH V5 0/6] blk-mq: improvement CPU hotplug Ming Lei
2020-01-15 11:44 ` [PATCH 1/6] blk-mq: add new state of BLK_MQ_S_INACTIVE Ming Lei
2020-01-15 11:44 ` [PATCH 2/6] blk-mq: prepare for draining IO when hctx's all CPUs are offline Ming Lei
2020-01-15 11:44 ` [PATCH 3/6] blk-mq: stop to handle IO and drain IO before hctx becomes inactive Ming Lei
2020-01-15 11:44 ` [PATCH 4/6] blk-mq: re-submit IO in case that hctx is inactive Ming Lei
2020-01-15 11:44 ` [PATCH 5/6] blk-mq: handle requests dispatched from IO scheduler in case of inactive hctx Ming Lei
2020-01-15 11:44 ` [PATCH 6/6] block: deactivate hctx when all its CPUs are offline when running queue Ming Lei
2020-01-15 17:00 ` [PATCH V5 0/6] blk-mq: improvement CPU hotplug John Garry
2020-01-20 13:23   ` John Garry [this message]
2020-01-31 10:04     ` Ming Lei
2020-01-31 10:24       ` John Garry
2020-01-31 10:58         ` Ming Lei
2020-01-31 17:51           ` John Garry
2020-01-31 18:02             ` John Garry
2020-02-01  1:31               ` Ming Lei
2020-02-01 11:05                 ` Marc Zyngier
2020-02-01 11:31                   ` Thomas Gleixner
2020-02-03 10:30                     ` John Garry
2020-02-03 10:49                       ` John Garry
2020-02-03 10:59                         ` Ming Lei
2020-02-03 12:56                           ` John Garry
2020-02-03 15:43                             ` Marc Zyngier
2020-02-03 18:16                               ` John Garry
2020-02-05 14:08                                 ` John Garry
2020-02-05 14:23                                   ` Marc Zyngier
2020-02-07 10:56           ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6dbe8c9f-af4e-3157-b6e9-6bbf43efb1e1@huawei.com \
    --to=john.garry@huawei.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=chenxiang66@hisilicon.com \
    --cc=hare@suse.com \
    --cc=hch@lst.de \
    --cc=keith.busch@intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox