From: Bart Van Assche <bvanassche@acm.org>
To: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-scsi@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bart Van Assche <bvanassche@acm.org>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Bean Huo <beanhuo@micron.com>, Avri Altman <avri.altman@wdc.com>,
	Jinyoung Choi <j-young.choi@samsung.com>
Subject: [PATCH v4 10/10] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
Date: Tue, 18 Oct 2022 13:29:58 -0700
Message-ID: <20221018202958.1902564-11-bvanassche@acm.org>
In-Reply-To: <20221018202958.1902564-1-bvanassche@acm.org>

The following deadlock has been observed on multiple test setups:
* ufshcd_wl_suspend() holds host_sem while waiting for
  blk_execute_rq(START STOP UNIT) to complete.
* The SCSI error handler is activated and changes the host state to
  SHOST_RECOVERY. ufshcd_eh_host_reset_handler() and ufshcd_err_handler()
  are then called, and the latter function tries to obtain host_sem.

This is a deadlock because blk_execute_rq() cannot execute SCSI commands
while the host is in the SHOST_RECOVERY state, and the error handler
cannot make progress while host_sem is held by another thread.
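
As a reading aid only (not kernel code), the circular wait described above can be modelled in user space as a small wait-for graph. The actor names, the NO_WAIT marker and has_cycle() are hypothetical and exist only in this sketch; the second table assumes that a START STOP UNIT command that fails immediately no longer waits on the error handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative actors; these names do not exist in the kernel. */
enum actor { SUSPEND_THREAD, START_STOP_CMD, ERROR_HANDLER, NUM_ACTORS };

#define NO_WAIT (-1)	/* the actor is not blocked on anyone */

/* Scenario from the commit message: every actor waits on the next one. */
static const int observed[NUM_ACTORS] = {
	[SUSPEND_THREAD] = START_STOP_CMD,	/* blk_execute_rq() waits for the command */
	[START_STOP_CMD] = ERROR_HANDLER,	/* not executed while SHOST_RECOVERY */
	[ERROR_HANDLER]  = SUSPEND_THREAD,	/* wants host_sem, held by suspend */
};

/* Assumed effect of failing the command instead of letting it wait. */
static const int with_fix[NUM_ACTORS] = {
	[SUSPEND_THREAD] = START_STOP_CMD,
	[START_STOP_CMD] = NO_WAIT,		/* fails right away, waits on nobody */
	[ERROR_HANDLER]  = SUSPEND_THREAD,
};

/* Return true if following the wait-for edges from 'start' leads back to it. */
static bool has_cycle(const int waits_for[NUM_ACTORS], int start)
{
	int cur = start;

	assert(start >= 0 && start < NUM_ACTORS);
	for (int steps = 0; steps < NUM_ACTORS; steps++) {
		cur = waits_for[cur];
		if (cur == NO_WAIT)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}
```

In this model has_cycle(observed, SUSPEND_THREAD) is true, while has_cycle(with_fix, SUSPEND_THREAD) is false because the chain ends at the command.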

Fix this deadlock as follows:
* Fail attempts to suspend the system while the SCSI error handler is in
  progress by setting the SCMD_FAIL_IF_RECOVERING flag for START STOP
  UNIT commands.
* If the system is suspending and a START STOP UNIT command times out,
  handle the SCSI command timeout from inside the context of the SCSI
  timeout handler instead of activating the SCSI error handler.

The runtime power management code is not affected by this deadlock since
hba->host_sem is not touched by the runtime power management functions
in the UFS driver.

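
The two-part fix above can be sketched as a user-space model. This is a simplification under stated assumptions and not the kernel implementation: struct model_host, model_submit() and model_timed_out() are hypothetical names, and link recovery is reduced to a caller-supplied count of requests that remain outstanding afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; not the kernel's types. */
enum submit_result { SUBMIT_DISPATCHED, SUBMIT_WAIT, SUBMIT_FAILED };
enum timeout_action { ACTIVATE_ERROR_HANDLER, RESET_TIMER, DONE };

struct model_host {
	bool recovering;		/* models the SHOST_RECOVERY state */
	bool system_suspending;		/* models hba->system_suspending */
	unsigned long outstanding_reqs;	/* models hba->outstanding_reqs */
};

/*
 * Part 1: while the host is recovering, a command marked
 * fail-if-recovering is failed instead of being left waiting.
 */
static enum submit_result model_submit(const struct model_host *h,
				       bool fail_if_recovering)
{
	assert(h);
	if (!h->recovering)
		return SUBMIT_DISPATCHED;
	return fail_if_recovering ? SUBMIT_FAILED : SUBMIT_WAIT;
}

/*
 * Part 2: a timeout during system suspend is resolved in the timeout
 * handler itself. 'left_after_recovery' is an assumption of this model:
 * the requests still outstanding once link recovery has run.
 */
static enum timeout_action model_timed_out(struct model_host *h,
					   unsigned long left_after_recovery)
{
	assert(h);
	if (!h->system_suspending)
		return ACTIVATE_ERROR_HANDLER;

	h->outstanding_reqs = left_after_recovery;	/* stand-in for link recovery */
	return h->outstanding_reqs ? RESET_TIMER : DONE;
}
```

With recovering set, model_submit(&h, true) yields SUBMIT_FAILED, so the suspend attempt fails rather than blocking while it holds the semaphore. With system_suspending set, model_timed_out() never asks for the error handler.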
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufshcd.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c5ccc7ba583b..b2203dd79e8c 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8292,6 +8292,28 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
 	}
 }
 
+static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
+{
+	struct ufs_hba *hba = shost_priv(scmd->device->host);
+
+	if (!hba->system_suspending) {
+		/* Activate the error handler in the SCSI core. */
+		return SCSI_EH_NOT_HANDLED;
+	}
+
+	/*
+	 * If we get here we know that no TMFs are outstanding and also that
+	 * the only pending command is a START STOP UNIT command. Handle the
+	 * timeout of that command directly to prevent a deadlock between
+	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
+	 */
+	ufshcd_link_recovery(hba);
+	dev_info(hba->dev, "%s() finished; outstanding_tasks = %#lx.\n",
+		 __func__, hba->outstanding_tasks);
+
+	return hba->outstanding_reqs ? SCSI_EH_RESET_TIMER : SCSI_EH_DONE;
+}
+
 static const struct attribute_group *ufshcd_driver_groups[] = {
 	&ufs_sysfs_unit_descriptor_group,
 	&ufs_sysfs_lun_attributes_group,
@@ -8326,6 +8348,7 @@ static struct scsi_host_template ufshcd_driver_template = {
 	.eh_abort_handler = ufshcd_abort,
 	.eh_device_reset_handler = ufshcd_eh_device_reset_handler,
 	.eh_host_reset_handler = ufshcd_eh_host_reset_handler,
+	.eh_timed_out = ufshcd_eh_timed_out,
 	.this_id = -1,
 	.sg_tablesize = SG_ALL,
 	.cmd_per_lun = UFSHCD_CMD_PER_LUN,
@@ -8747,6 +8770,7 @@ static int ufshcd_execute_start_stop(struct scsi_device *sdev,
 	scmd->cmd_len = COMMAND_SIZE(cdb[0]);
 	memcpy(scmd->cmnd, cdb, scmd->cmd_len);
 	scmd->allowed = 0/*retries*/;
+	scmd->flags |= SCMD_FAIL_IF_RECOVERING;
 	req->timeout = 1 * HZ;
 	req->rq_flags |= RQF_PM | RQF_QUIET;

Thread overview (22+ messages):
2022-10-18 20:29 [PATCH v4 00/10] Fix a deadlock in the UFS driver Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 01/10] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 02/10] scsi: core: Change the return type of .eh_timed_out() Bart Van Assche
2022-10-20 17:22 ` Mike Christie
2022-10-18 20:29 ` [PATCH v4 03/10] scsi: core: Support failing requests while recovering Bart Van Assche
2022-10-19 19:52 ` Mike Christie
2022-10-19 21:11 ` Bart Van Assche
2022-10-20 17:20 ` Mike Christie
2022-10-18 20:29 ` [PATCH v4 04/10] scsi: ufs: Remove an outdated comment Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 05/10] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 06/10] scsi: ufs: Reduce the START STOP UNIT timeout Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 07/10] scsi: ufs: Try harder to change the power mode Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 08/10] scsi: ufs: Track system suspend / resume activity Bart Van Assche
2022-10-18 20:29 ` [PATCH v4 09/10] scsi: ufs: Introduce the function ufshcd_execute_start_stop() Bart Van Assche
2022-10-19 19:57 ` Mike Christie
2022-10-19 21:13 ` Bart Van Assche
2022-10-20 17:21 ` Mike Christie
2022-10-18 20:29 ` Bart Van Assche [this message]
2022-10-22 3:27 ` [PATCH v4 00/10] Fix a deadlock in the UFS driver Martin K. Petersen
2022-10-23 6:16 ` Avri Altman
2022-10-24 23:27 ` Bart Van Assche
2022-10-27 2:58 ` Martin K. Petersen