* Re: [PATCH] ufs: core: Fix an error handler crash
2025-12-04 17:04 [PATCH] ufs: core: Fix an error handler crash Bart Van Assche
@ 2025-12-04 17:38 ` Nitin Rawat
2025-12-04 18:26 ` Bart Van Assche
2025-12-05 8:31 ` Peter Wang (王信友)
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Nitin Rawat @ 2025-12-04 17:38 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: linux-scsi, Peter Wang, James E.J. Bottomley, Matthias Brugger,
AngeloGioacchino Del Regno, Avri Altman, Bean Huo, Adrian Hunter,
Bao D. Nguyen
On 12/4/2025 10:34 PM, Bart Van Assche wrote:
> The UFS error handler may be activated before SCSI scanning has started and
> hence before hba->ufs_device_wlun has been set. Check the
> hba->ufs_device_wlun pointer before using it.
>
> Cc: Peter Wang <peter.wang@mediatek.com>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Fixes: e23ef4f22db3 ("scsi: ufs: core: Fix error handler host_sem issue")
> Fixes: f966e02ae521 ("scsi: ufs: core: Fix runtime suspend error deadlock")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/core/ufshcd.c | 25 ++++++++++++++-----------
> 1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index b834b9635062..80c0b49f30b0 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -6698,19 +6698,22 @@ static void ufshcd_err_handler(struct work_struct *work)
> hba->saved_uic_err, hba->force_reset,
> ufshcd_is_link_broken(hba) ? "; link is broken" : "");
>
> - /*
> - * Use ufshcd_rpm_get_noresume() here to safely perform link recovery
> - * even if an error occurs during runtime suspend or runtime resume.
> - * This avoids potential deadlocks that could happen if we tried to
> - * resume the device while a PM operation is already in progress.
> - */
> - ufshcd_rpm_get_noresume(hba);
> - if (hba->pm_op_in_progress) {
> - ufshcd_link_recovery(hba);
> + if (hba->ufs_device_wlun) {
> + /*
> + * Use ufshcd_rpm_get_noresume() here to safely perform link
> + * recovery even if an error occurs during runtime suspend or
> + * runtime resume. This avoids potential deadlocks that could
> + * happen if we tried to resume the device while a PM operation
> + * is already in progress.
> + */
> + ufshcd_rpm_get_noresume(hba);
> + if (hba->pm_op_in_progress) {
> + ufshcd_link_recovery(hba);
> + ufshcd_rpm_put(hba);
> + return;
> + }
> ufshcd_rpm_put(hba);
> - return;
> }
> - ufshcd_rpm_put(hba);
>
> down(&hba->host_sem);
> spin_lock_irqsave(hba->host->host_lock, flags);
Hi Bart,
It seems you missed sending the below patch. Both patches are required
to address the issue (hang and clock scaling errors), except for the UIC
error, which still needs to be root-caused.
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 1b3fe1d8655e..fd0b6b620b53 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -1455,15 +1455,14 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
> static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err)
> {
> up_write(&hba->clk_scaling_lock);
> -
> + mutex_unlock(&hba->wb_mutex);
> + blk_mq_unquiesce_tagset(&hba->host->tag_set);
> + mutex_unlock(&hba->host->scan_mutex);
> +
> /* Enable Write Booster if current gear requires it else disable it */
> if (ufshcd_enable_wb_if_scaling_up(hba) && !err)
> ufshcd_wb_toggle(hba, hba->pwr_info.gear_rx >=
> hba->clk_scaling.wb_gear);
>
> - mutex_unlock(&hba->wb_mutex);
> -
> - blk_mq_unquiesce_tagset(&hba->host->tag_set);
> - mutex_unlock(&hba->host->scan_mutex);
> ufshcd_release(hba);
> }
Thanks,
Nitin
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ufs: core: Fix an error handler crash
2025-12-04 17:38 ` Nitin Rawat
@ 2025-12-04 18:26 ` Bart Van Assche
0 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2025-12-04 18:26 UTC (permalink / raw)
To: Nitin Rawat, Martin K . Petersen; +Cc: linux-scsi, James E.J. Bottomley
On 12/4/25 7:38 AM, Nitin Rawat wrote:
> It seems you missed sending the below patch. Both patches are required
> to address the issue (hang and clock scaling errors), except for the UIC
> error, which still needs to be root-caused
My shell history tells me that I tried to send the fix for the clock
scaling hang to the linux-scsi mailing list yesterday. Apparently the
patch didn't reach the linux-scsi mailing list. Maybe this was caused by
the hotel Wi-Fi. Anyway, thanks for having pointed this out. I have
resent the fix for the clock scaling hang.
Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ufs: core: Fix an error handler crash
2025-12-04 17:04 [PATCH] ufs: core: Fix an error handler crash Bart Van Assche
2025-12-04 17:38 ` Nitin Rawat
@ 2025-12-05 8:31 ` Peter Wang (王信友)
2025-12-06 15:06 ` Nitin Rawat
2025-12-09 2:58 ` Martin K. Petersen
3 siblings, 0 replies; 6+ messages in thread
From: Peter Wang (王信友) @ 2025-12-05 8:31 UTC (permalink / raw)
To: bvanassche@acm.org, martin.petersen@oracle.com
Cc: beanhuo@micron.com, quic_nguyenb@quicinc.com,
linux-scsi@vger.kernel.org, AngeloGioacchino Del Regno,
adrian.hunter@intel.com, avri.altman@sandisk.com,
matthias.bgg@gmail.com, nitin.rawat@oss.qualcomm.com,
James.Bottomley@HansenPartnership.com
On Thu, 2025-12-04 at 07:04 -1000, Bart Van Assche wrote:
> The UFS error handler may be activated before SCSI scanning has
> started and
> hence before hba->ufs_device_wlun has been set. Check the
> hba->ufs_device_wlun pointer before using it.
>
> Cc: Peter Wang <peter.wang@mediatek.com>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Fixes: e23ef4f22db3 ("scsi: ufs: core: Fix error handler host_sem
> issue")
> Fixes: f966e02ae521 ("scsi: ufs: core: Fix runtime suspend error
> deadlock")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Thanks for fixing this bug.
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ufs: core: Fix an error handler crash
2025-12-04 17:04 [PATCH] ufs: core: Fix an error handler crash Bart Van Assche
2025-12-04 17:38 ` Nitin Rawat
2025-12-05 8:31 ` Peter Wang (王信友)
@ 2025-12-06 15:06 ` Nitin Rawat
2025-12-09 2:58 ` Martin K. Petersen
3 siblings, 0 replies; 6+ messages in thread
From: Nitin Rawat @ 2025-12-06 15:06 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: linux-scsi, Peter Wang, James E.J. Bottomley, Matthias Brugger,
AngeloGioacchino Del Regno, Avri Altman, Bean Huo, Adrian Hunter,
Bao D. Nguyen
On 12/4/2025 10:34 PM, Bart Van Assche wrote:
> The UFS error handler may be activated before SCSI scanning has started and
> hence before hba->ufs_device_wlun has been set. Check the
> hba->ufs_device_wlun pointer before using it.
>
> Cc: Peter Wang <peter.wang@mediatek.com>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Fixes: e23ef4f22db3 ("scsi: ufs: core: Fix error handler host_sem issue")
> Fixes: f966e02ae521 ("scsi: ufs: core: Fix runtime suspend error deadlock")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
Tested-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com> #SM8750
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] ufs: core: Fix an error handler crash
2025-12-04 17:04 [PATCH] ufs: core: Fix an error handler crash Bart Van Assche
` (2 preceding siblings ...)
2025-12-06 15:06 ` Nitin Rawat
@ 2025-12-09 2:58 ` Martin K. Petersen
3 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2025-12-09 2:58 UTC (permalink / raw)
To: Bart Van Assche
Cc: Martin K . Petersen, linux-scsi, Peter Wang, Nitin Rawat,
James E.J. Bottomley, Matthias Brugger,
AngeloGioacchino Del Regno, Avri Altman, Bean Huo, Adrian Hunter,
Bao D. Nguyen
Bart,
> The UFS error handler may be activated before SCSI scanning has
> started and hence before hba->ufs_device_wlun has been set. Check the
> hba->ufs_device_wlun pointer before using it.
Applied to 6.19/scsi-staging, thanks!
--
Martin K. Petersen
^ permalink raw reply [flat|nested] 6+ messages in thread
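
For readers following the thread, here is a minimal userspace sketch of the
control flow that the applied patch gives ufshcd_err_handler(). It is not the
kernel code. Every model_* name and both structs are hypothetical stand-ins
invented for this illustration, and it assumes (as the patch description
implies) that the runtime-PM helpers dereference hba->ufs_device_wlun. It shows
only that the runtime-PM get/put pair and the early link-recovery return are
skipped while the WLUN pointer is still NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device WLUN; not the kernel's definition. */
struct model_wlun {
	int rpm_usage;		/* models the runtime-PM usage count */
};

/* Hypothetical stand-in for struct ufs_hba, reduced to what the sketch needs. */
struct model_hba {
	struct model_wlun *ufs_device_wlun;	/* NULL until SCSI scanning sets it */
	bool pm_op_in_progress;
	int link_recoveries;	/* counts the early link-recovery exits */
	int full_recoveries;	/* counts runs that reach the main handler body */
};

/* Models ufshcd_rpm_get_noresume(): assumed to dereference the WLUN pointer. */
static void model_rpm_get_noresume(struct model_hba *hba)
{
	hba->ufs_device_wlun->rpm_usage++;
}

/* Models ufshcd_rpm_put(): also assumed to dereference the WLUN pointer. */
static void model_rpm_put(struct model_hba *hba)
{
	hba->ufs_device_wlun->rpm_usage--;
}

/*
 * Models the patched flow. Returns true if the handler took the early
 * link-recovery exit, false if it continued to the main recovery path.
 */
static bool model_err_handler(struct model_hba *hba)
{
	assert(hba != NULL);

	if (hba->ufs_device_wlun) {
		model_rpm_get_noresume(hba);
		if (hba->pm_op_in_progress) {
			hba->link_recoveries++;	/* stands in for ufshcd_link_recovery() */
			model_rpm_put(hba);
			return true;
		}
		model_rpm_put(hba);
	}

	hba->full_recoveries++;	/* stands in for the rest of the handler */
	return false;
}
```

Calling model_err_handler() on a model_hba whose ufs_device_wlun is NULL goes
straight to the main recovery path without touching the pointer, even when
pm_op_in_progress is set. With a non-NULL WLUN and pm_op_in_progress set, it
takes the early exit and leaves rpm_usage balanced.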