From: Wenchao Hao <haowenchao22@gmail.com>
To: Bart Van Assche <bvanassche@acm.org>,
"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
Avri Altman <avri.altman@wdc.com>,
Peter Wang <peter.wang@mediatek.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
Bean Huo <beanhuo@micron.com>,
Andrew Halaney <ahalaney@redhat.com>
Subject: Re: [PATCH 8/8] scsi: ufs: Check for completion from the timeout handler
Date: Thu, 27 Jun 2024 11:50:36 +0800 [thread overview]
Message-ID: <a4ad73c7-0a8e-4f87-b9fc-f3a2d58d5206@gmail.com> (raw)
In-Reply-To: <20240617210844.337476-9-bvanassche@acm.org>
On 2024/6/18 5:07, Bart Van Assche wrote:
> If ufshcd_abort() returns SUCCESS for an already completed command then
> that command is completed twice. This results in a crash. Prevent this by
> checking whether a command has completed without completion interrupt from
> the timeout handler. This CL fixes the following kernel crash:
>
Hi Bart,
Could you describe in more detail about how command completed twice happened?
I think this would not happen, below is my analysis, if it is wrong, please
correct me.
In your description, ufshcd_abort() returns SUCCESS is condition which trigger
the issue.
There are 2 paths would call ufshcd_abort(). The first one is triggered by
block layer's timeout, which is:
scsi_timeout()
hostt->eh_timed_out() [ufshcd_eh_timed_out]
// return SCSI_EH_NOT_HANDLED
test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state)
// return true
scsi_abort_command()
queue_delayed_work(shost->tmf_work_q, &scmd->abort_work, ...)
The above flow would trigger ufshcd_abort() be called in workqueue:
scmd_eh_abort_handler()
scsi_try_to_abort_cmd()
hostt->eh_abort_handler() [ufshcd_abort]
While when ufshcd_abort() is called by above flow, the command has been marked
as SCMD_STATE_COMPLETE, and it means scsi_timeout() win the scsi_done(), so
normal context(usually triggered by driver's irq handler which call scsi_done())
would not handle this command any more.
The only path which would handle this command is scmd_eh_abort_handler() if
ufshcd_abort() return SUCCESS.
Is any other context which would finish this command?
Another path is:
scsi_send_eh_cmnd()
shost->hostt->queuecommand() // queued command did not finish in time
scsi_abort_eh_cmnd()
This is error handler path when host is set to RECOVERY state, no new normal
command would be send any more. What's more, if ufshcd_abort() is called now,
the aborted command is also command send in scsi_send_eh_cmnd().
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Call trace:
> dma_direct_map_sg+0x70/0x274
> scsi_dma_map+0x84/0x124
> ufshcd_queuecommand+0x3fc/0x880
> scsi_queue_rq+0x7d0/0x111c
> blk_mq_dispatch_rq_list+0x440/0xebc
> blk_mq_do_dispatch_sched+0x5a4/0x6b8
> __blk_mq_sched_dispatch_requests+0x150/0x220
> __blk_mq_run_hw_queue+0xf0/0x218
> __blk_mq_delay_run_hw_queue+0x8c/0x18c
> blk_mq_run_hw_queue+0x1a4/0x360
> blk_mq_sched_insert_requests+0x130/0x334
> blk_mq_flush_plug_list+0x138/0x234
> blk_flush_plug_list+0x118/0x164
> blk_finish_plug()
> read_pages+0x38c/0x408
> page_cache_ra_unbounded+0x230/0x2f8
> do_sync_mmap_readahead+0x1a4/0x208
> filemap_fault+0x27c/0x8f4
> f2fs_filemap_fault+0x28/0xfc
> __do_fault+0xc4/0x208
> handle_pte_fault+0x290/0xe04
> do_handle_mm_fault+0x52c/0x858
> do_page_fault+0x5dc/0x798
> do_translation_fault+0x40/0x54
> do_mem_abort+0x60/0x134
> el0_da+0x40/0xb8
> el0t_64_sync_handler+0xc4/0xe4
> el0t_64_sync+0x1b4/0x1b8
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/core/ufshcd.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index e3835e61e4b1..47cc0802c4f4 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8922,7 +8922,28 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
>
> static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
> {
> - struct ufs_hba *hba = shost_priv(scmd->device->host);
> + struct scsi_device *sdev = scmd->device;
> + struct ufs_hba *hba = shost_priv(sdev->host);
> + struct scsi_cmnd *cmd2 = scmd;
> + const u32 unique_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
> +
> + WARN_ON_ONCE(!scmd);
> +
> + if (is_mcq_enabled(hba)) {
> + struct request *rq = scsi_cmd_to_rq(scmd);
> + struct ufs_hw_queue *hwq = ufshcd_mcq_req_to_hwq(hba, rq);
> +
> + ufshcd_mcq_poll_cqe_lock(hba, hwq, &cmd2);
> + } else {
> + __ufshcd_poll(hba->host, UFSHCD_POLL_FROM_INTERRUPT_CONTEXT,
> + &cmd2);
> + }
> + if (cmd2 == NULL) {
> + sdev_printk(KERN_INFO, sdev,
> + "%s: cmd with tag %#x has already been completed\n",
> + __func__, unique_tag);
> + return SCSI_EH_DONE;
> + }
>
> if (!hba->system_suspending) {
> /* Activate the error handler in the SCSI core. */
>
prev parent reply other threads:[~2024-06-27 3:50 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-17 21:07 [PATCH 0/8] UFS patches for kernel 6.11 Bart Van Assche
2024-06-17 21:07 ` [PATCH 1/8] scsi: ufs: Initialize struct uic_command once Bart Van Assche
2024-06-18 1:25 ` Daejun Park
2024-06-18 16:15 ` Bart Van Assche
2024-06-18 6:18 ` Avri Altman
2024-06-19 6:55 ` Manivannan Sadhasivam
2024-06-17 21:07 ` [PATCH 2/8] scsi: ufs: Remove two constants Bart Van Assche
2024-06-19 6:58 ` Manivannan Sadhasivam
2024-06-17 21:07 ` [PATCH 3/8] scsi: ufs: Inline ufshcd_mcq_vops_get_hba_mac() Bart Van Assche
2024-06-18 6:23 ` Avri Altman
2024-06-18 16:14 ` Bart Van Assche
2024-06-17 21:07 ` [PATCH 4/8] scsi: ufs: Make .get_hba_mac() optional Bart Van Assche
2024-06-18 1:28 ` Daejun Park
2024-06-18 16:17 ` Bart Van Assche
2024-06-19 7:13 ` Manivannan Sadhasivam
2024-06-19 7:57 ` Manivannan Sadhasivam
2024-06-21 3:32 ` Peter Wang (王信友)
2024-06-23 13:33 ` manivannan.sadhasivam
2024-06-24 8:39 ` Peter Wang (王信友)
2024-06-24 17:30 ` Bart Van Assche
2024-06-17 21:07 ` [PATCH 5/8] scsi: ufs: Declare ufshcd_mcq_poll_cqe_lock() once Bart Van Assche
2024-06-18 11:01 ` Avri Altman
2024-06-17 21:07 ` [PATCH 6/8] scsi: ufs: Make ufshcd_poll() complain about unsupported arguments Bart Van Assche
2024-06-19 7:32 ` Manivannan Sadhasivam
2024-06-20 20:13 ` Bart Van Assche
2024-06-23 13:39 ` Manivannan Sadhasivam
2024-06-17 21:07 ` [PATCH 7/8] scsi: ufs: Make the polling code report which command has been completed Bart Van Assche
2024-06-17 21:07 ` [PATCH 8/8] scsi: ufs: Check for completion from the timeout handler Bart Van Assche
2024-06-21 6:54 ` Peter Wang (王信友)
2024-06-21 17:23 ` Bart Van Assche
2024-06-24 8:54 ` Peter Wang (王信友)
2024-06-24 18:12 ` Bart Van Assche
2024-06-25 10:04 ` Peter Wang (王信友)
2024-06-25 16:33 ` Bart Van Assche
2024-06-26 3:54 ` Peter Wang (王信友)
2024-06-26 21:54 ` Bart Van Assche
2024-06-27 10:56 ` Peter Wang (王信友)
2024-06-27 16:33 ` Bart Van Assche
2024-06-27 3:50 ` Wenchao Hao [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a4ad73c7-0a8e-4f87-b9fc-f3a2d58d5206@gmail.com \
--to=haowenchao22@gmail.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=ahalaney@redhat.com \
--cc=avri.altman@wdc.com \
--cc=beanhuo@micron.com \
--cc=bvanassche@acm.org \
--cc=linux-scsi@vger.kernel.org \
--cc=manivannan.sadhasivam@linaro.org \
--cc=martin.petersen@oracle.com \
--cc=peter.wang@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox