From: <peter.wang@mediatek.com>
To: <stanley.chu@mediatek.com>, <linux-scsi@vger.kernel.org>,
<martin.petersen@oracle.com>, <avri.altman@wdc.com>,
<alim.akhtar@samsung.com>, <jejb@linux.ibm.com>
Cc: <wsd_upstream@mediatek.com>, <linux-mediatek@lists.infradead.org>,
<peter.wang@mediatek.com>, <chun-hung.wu@mediatek.com>,
<alice.chao@mediatek.com>, <cc.chou@mediatek.com>,
<chaotian.jing@mediatek.com>, <jiajie.hao@mediatek.com>,
<powen.kao@mediatek.com>, <qilin.tan@mediatek.com>,
<lin.gui@mediatek.com>, <tun-yu.yu@mediatek.com>,
<eddie.huang@mediatek.com>, <naomi.chu@mediatek.com>
Subject: [PATCH v5] ufs: core: wlun send SSU timeout recovery
Date: Wed, 27 Sep 2023 11:35:57 +0800 [thread overview]
Message-ID: <20230927033557.13801-1-peter.wang@mediatek.com> (raw)
From: Peter Wang <peter.wang@mediatek.com>
When runtime pm send SSU times out, the SCSI core invokes
eh_host_reset_handler, which hooks function ufshcd_eh_host_reset_handler
schedule eh_work and stuck at wait flush_work(&hba->eh_work).
However, ufshcd_err_handler hangs in wait rpm resume.
Do link recovery only in this case.
Below is IO hang stack dump in kernel-6.1
kworker/4:0 D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d31fefbc> schedule_timeout+0xb0/0x15c
<ffffffd7d31f8120> io_schedule_timeout+0x48/0x70
<ffffffd7d31f8e40> do_wait_for_common+0x108/0x19c
<ffffffd7d31f837c> wait_for_completion_io_timeout+0x50/0x78
<ffffffd7d2876bc0> blk_execute_rq+0x1b8/0x218
<ffffffd7d2b4297c> scsi_execute_cmd+0x148/0x238
<ffffffd7d2da7358> ufshcd_set_dev_pwr_mode+0xe8/0x244
<ffffffd7d2da7e40> __ufshcd_wl_resume+0x1e0/0x45c
<ffffffd7d2da7b28> ufshcd_wl_runtime_resume+0x3c/0x174
<ffffffd7d2b4f290> scsi_runtime_resume+0x7c/0xc8
<ffffffd7d2ae1d48> __rpm_callback+0xa0/0x410
<ffffffd7d2ae0128> rpm_resume+0x43c/0x67c
<ffffffd7d2ae1e98> __rpm_callback+0x1f0/0x410
<ffffffd7d2ae014c> rpm_resume+0x460/0x67c
<ffffffd7d2ae1450> pm_runtime_work+0xa4/0xac
<ffffffd7d22e39ac> process_one_work+0x208/0x598
<ffffffd7d22e3fc0> worker_thread+0x228/0x438
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20
scsi_eh_0 D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d31fef50> schedule_timeout+0x44/0x15c
<ffffffd7d31f8e40> do_wait_for_common+0x108/0x19c
<ffffffd7d31f8234> wait_for_completion+0x48/0x64
<ffffffd7d22deb88> __flush_work+0x260/0x2d0
<ffffffd7d22de918> flush_work+0x10/0x20
<ffffffd7d2da4728> ufshcd_eh_host_reset_handler+0x88/0xcc
<ffffffd7d2b41da4> scsi_try_host_reset+0x48/0xe0
<ffffffd7d2b410fc> scsi_eh_ready_devs+0x934/0xa40
<ffffffd7d2b41618> scsi_error_handler+0x168/0x374
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20
kworker/u16:5 D
<ffffffd7d31f6fb4> __switch_to+0x180/0x344
<ffffffd7d31f779c> __schedule+0x5ec/0xa14
<ffffffd7d31f7c3c> schedule+0x78/0xe0
<ffffffd7d2adfe00> rpm_resume+0x114/0x67c
<ffffffd7d2adfca8> __pm_runtime_resume+0x70/0xb4
<ffffffd7d2d9cf48> ufshcd_err_handler+0x1a0/0xe68
<ffffffd7d22e39ac> process_one_work+0x208/0x598
<ffffffd7d22e3fc0> worker_thread+0x228/0x438
<ffffffd7d22eb038> kthread+0x104/0x1d4
<ffffffd7d22171a0> ret_from_fork+0x10/0x20
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c2df07545f96..0619cefa092e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -7716,6 +7716,20 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd)
hba = shost_priv(cmd->device->host);
+ /*
+ * If runtime pm send SSU and got timeout, scsi_error_handler
+ * stuck at this function to wait for flush_work(&hba->eh_work).
+ * And ufshcd_err_handler(eh_work) stuck at wait for runtime pm active.
+ * Do ufshcd_link_recovery instead schedule eh_work can prevent
+ * dead lock to happen.
+ */
+ if (hba->pm_op_in_progress) {
+ if (ufshcd_link_recovery(hba))
+ err = FAILED;
+
+ return err;
+ }
+
spin_lock_irqsave(hba->host->host_lock, flags);
hba->force_reset = true;
ufshcd_schedule_eh_work(hba);
--
2.18.0
next reply other threads:[~2023-09-27 4:10 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-27 3:35 peter.wang [this message]
2023-09-27 18:31 ` [PATCH v5] ufs: core: wlun send SSU timeout recovery Bart Van Assche
2023-09-28 1:58 ` Peter Wang (王信友)
2023-09-28 16:30 ` Bart Van Assche
2023-09-29 14:14 ` Stanley Chu
2023-10-10 1:23 ` Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230927033557.13801-1-peter.wang@mediatek.com \
--to=peter.wang@mediatek.com \
--cc=alice.chao@mediatek.com \
--cc=alim.akhtar@samsung.com \
--cc=avri.altman@wdc.com \
--cc=cc.chou@mediatek.com \
--cc=chaotian.jing@mediatek.com \
--cc=chun-hung.wu@mediatek.com \
--cc=eddie.huang@mediatek.com \
--cc=jejb@linux.ibm.com \
--cc=jiajie.hao@mediatek.com \
--cc=lin.gui@mediatek.com \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=naomi.chu@mediatek.com \
--cc=powen.kao@mediatek.com \
--cc=qilin.tan@mediatek.com \
--cc=stanley.chu@mediatek.com \
--cc=tun-yu.yu@mediatek.com \
--cc=wsd_upstream@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox