Linux SCSI subsystem development
 help / color / mirror / Atom feed
* [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
@ 2025-12-04 18:15 Bart Van Assche
  2025-12-05  8:32 ` Peter Wang (王信友)
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Bart Van Assche @ 2025-12-04 18:15 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Manivannan Sadhasivam, Roger Shimizu,
	Nitin Rawat, James E.J. Bottomley, Peter Wang, Avri Altman,
	Bean Huo, Adrian Hunter, Bao D. Nguyen

Commit 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
accidentally introduced a deadlock in the frequency scaling code.
ufshcd_clock_scaling_unprepare() may submit a device management command
while SCSI command processing is blocked. The deadlock was introduced by
using the SCSI core for submitting device management commands
(scsi_get_internal_cmd() + blk_execute_rq()). Fix this deadlock by calling
blk_mq_unquiesce_tagset() before any device management commands are
submitted by ufshcd_clock_scaling_unprepare().

Fixes: 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
Reported-by: Manivannan Sadhasivam <mani@kernel.org>
Reported-by: Roger Shimizu <rosh@debian.org>
Closes: https://lore.kernel.org/linux-scsi/ehorjaflathzab5oekx2nae2zss5vi2r36yqkqsfjb2fgsifz2@yk3us5g3igow/
Tested-by: Roger Shimizu <rosh@debian.org>
Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 1837ae204d5e..80c0b49f30b0 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1455,15 +1455,14 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
 static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err)
 {
 	up_write(&hba->clk_scaling_lock);
+	mutex_unlock(&hba->wb_mutex);
+	blk_mq_unquiesce_tagset(&hba->host->tag_set);
+	mutex_unlock(&hba->host->scan_mutex);
 
 	/* Enable Write Booster if current gear requires it else disable it */
 	if (ufshcd_enable_wb_if_scaling_up(hba) && !err)
 		ufshcd_wb_toggle(hba, hba->pwr_info.gear_rx >= hba->clk_scaling.wb_gear);
 
-	mutex_unlock(&hba->wb_mutex);
-
-	blk_mq_unquiesce_tagset(&hba->host->tag_set);
-	mutex_unlock(&hba->host->scan_mutex);
 	ufshcd_release(hba);
 }
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
@ 2025-12-05  8:32 ` Peter Wang (王信友)
  2025-12-06 15:07 ` Nitin Rawat
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Peter Wang (王信友) @ 2025-12-05  8:32 UTC (permalink / raw)
  To: bvanassche@acm.org, martin.petersen@oracle.com
  Cc: beanhuo@micron.com, rosh@debian.org, quic_nguyenb@quicinc.com,
	linux-scsi@vger.kernel.org, adrian.hunter@intel.com,
	avri.altman@sandisk.com, nitin.rawat@oss.qualcomm.com,
	mani@kernel.org, James.Bottomley@HansenPartnership.com

On Thu, 2025-12-04 at 08:15 -1000, Bart Van Assche wrote:
> Commit 08b12cda6c44 ("scsi: ufs: core: Switch to
> scsi_get_internal_cmd()")
> accidentally introduced a deadlock in the frequency scaling code.
> ufshcd_clock_scaling_unprepare() may submit a device management
> command
> while SCSI command processing is blocked. The deadlock was introduced
> by
> using the SCSI core for submitting device management commands
> (scsi_get_internal_cmd() + blk_execute_rq()). Fix this deadlock by
> calling
> blk_mq_unquiesce_tagset() before any device management commands are
> submitted by ufshcd_clock_scaling_unprepare().
> 
> Fixes: 08b12cda6c44 ("scsi: ufs: core: Switch to
> scsi_get_internal_cmd()")
> Reported-by: Manivannan Sadhasivam <mani@kernel.org>
> Reported-by: Roger Shimizu <rosh@debian.org>
> Closes:
> https://lore.kernel.org/linux-scsi/ehorjaflathzab5oekx2nae2zss5vi2r36yqkqsfjb2fgsifz2@yk3us5g3igow/
> Tested-by: Roger Shimizu <rosh@debian.org>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---

Reviewed-by: Peter Wang <peter.wang@mediatek.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
  2025-12-05  8:32 ` Peter Wang (王信友)
@ 2025-12-06 15:07 ` Nitin Rawat
  2025-12-07 21:48 ` Alexey Klimov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Nitin Rawat @ 2025-12-06 15:07 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Manivannan Sadhasivam, Roger Shimizu,
	James E.J. Bottomley, Peter Wang, Avri Altman, Bean Huo,
	Adrian Hunter, Bao D. Nguyen



On 12/4/2025 11:45 PM, Bart Van Assche wrote:
> Commit 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> accidentally introduced a deadlock in the frequency scaling code.
> ufshcd_clock_scaling_unprepare() may submit a device management command
> while SCSI command processing is blocked. The deadlock was introduced by
> using the SCSI core for submitting device management commands
> (scsi_get_internal_cmd() + blk_execute_rq()). Fix this deadlock by calling
> blk_mq_unquiesce_tagset() before any device management commands are
> submitted by ufshcd_clock_scaling_unprepare().
> 
> Fixes: 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> Reported-by: Manivannan Sadhasivam <mani@kernel.org>
> Reported-by: Roger Shimizu <rosh@debian.org>
> Closes: https://lore.kernel.org/linux-scsi/ehorjaflathzab5oekx2nae2zss5vi2r36yqkqsfjb2fgsifz2@yk3us5g3igow/
> Tested-by: Roger Shimizu <rosh@debian.org>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
  2025-12-05  8:32 ` Peter Wang (王信友)
  2025-12-06 15:07 ` Nitin Rawat
@ 2025-12-07 21:48 ` Alexey Klimov
  2025-12-09  2:59 ` Martin K. Petersen
  2025-12-09 11:33 ` Manivannan Sadhasivam
  4 siblings, 0 replies; 7+ messages in thread
From: Alexey Klimov @ 2025-12-07 21:48 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Manivannan Sadhasivam, Roger Shimizu, Nitin Rawat,
	James E.J. Bottomley, Peter Wang, Avri Altman, Bean Huo,
	Adrian Hunter, Bao D. Nguyen

On Thu Dec 4, 2025 at 6:15 PM GMT, Bart Van Assche wrote:
> Commit 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> accidentally introduced a deadlock in the frequency scaling code.
> ufshcd_clock_scaling_unprepare() may submit a device management command
> while SCSI command processing is blocked. The deadlock was introduced by
> using the SCSI core for submitting device management commands
> (scsi_get_internal_cmd() + blk_execute_rq()). Fix this deadlock by calling
> blk_mq_unquiesce_tagset() before any device management commands are
> submitted by ufshcd_clock_scaling_unprepare().
>
> Fixes: 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> Reported-by: Manivannan Sadhasivam <mani@kernel.org>
> Reported-by: Roger Shimizu <rosh@debian.org>
> Closes: https://lore.kernel.org/linux-scsi/ehorjaflathzab5oekx2nae2zss5vi2r36yqkqsfjb2fgsifz2@yk3us5g3igow/
> Tested-by: Roger Shimizu <rosh@debian.org>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

This fixes the regression of freezing/hanging boot in initramfs
on Qcom RB5 board with recent linux-nexts. Thank you!

Tested-by: Alexey Klimov <alexey.klimov@linaro.org> # RB5 board

Best regards,
Alexey


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
                   ` (2 preceding siblings ...)
  2025-12-07 21:48 ` Alexey Klimov
@ 2025-12-09  2:59 ` Martin K. Petersen
  2025-12-09 11:33 ` Manivannan Sadhasivam
  4 siblings, 0 replies; 7+ messages in thread
From: Martin K. Petersen @ 2025-12-09  2:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Manivannan Sadhasivam,
	Roger Shimizu, Nitin Rawat, James E.J. Bottomley, Peter Wang,
	Avri Altman, Bean Huo, Adrian Hunter, Bao D. Nguyen


Bart,

> Commit 08b12cda6c44 ("scsi: ufs: core: Switch to
> scsi_get_internal_cmd()") accidentally introduced a deadlock in the
> frequency scaling code. ufshcd_clock_scaling_unprepare() may submit a
> device management command while SCSI command processing is blocked.
> The deadlock was introduced by using the SCSI core for submitting
> device management commands (scsi_get_internal_cmd() +
> blk_execute_rq()). Fix this deadlock by calling
> blk_mq_unquiesce_tagset() before any device management commands are
> submitted by ufshcd_clock_scaling_unprepare().

Applied to 6.19/scsi-staging, thanks!

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
                   ` (3 preceding siblings ...)
  2025-12-09  2:59 ` Martin K. Petersen
@ 2025-12-09 11:33 ` Manivannan Sadhasivam
  2025-12-09 17:51   ` Bart Van Assche
  4 siblings, 1 reply; 7+ messages in thread
From: Manivannan Sadhasivam @ 2025-12-09 11:33 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Roger Shimizu, Nitin Rawat,
	James E.J. Bottomley, Peter Wang, Avri Altman, Bean Huo,
	Adrian Hunter, Bao D. Nguyen

On Thu, Dec 04, 2025 at 08:15:43AM -1000, Bart Van Assche wrote:
> Commit 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> accidentally introduced a deadlock in the frequency scaling code.
> ufshcd_clock_scaling_unprepare() may submit a device management command
> while SCSI command processing is blocked. The deadlock was introduced by
> using the SCSI core for submitting device management commands
> (scsi_get_internal_cmd() + blk_execute_rq()). Fix this deadlock by calling
> blk_mq_unquiesce_tagset() before any device management commands are
> submitted by ufshcd_clock_scaling_unprepare().
> 
> Fixes: 08b12cda6c44 ("scsi: ufs: core: Switch to scsi_get_internal_cmd()")
> Reported-by: Manivannan Sadhasivam <mani@kernel.org>
> Reported-by: Roger Shimizu <rosh@debian.org>
> Closes: https://lore.kernel.org/linux-scsi/ehorjaflathzab5oekx2nae2zss5vi2r36yqkqsfjb2fgsifz2@yk3us5g3igow/
> Tested-by: Roger Shimizu <rosh@debian.org>
> Cc: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Thanks for the quirk fix. While it seems to have fixed the boot hang, I'm
quite skeptical about the fix. What is the guarantee that another device
management command won't be submitted before blk_mq_unquiesce_tagset() in the
future? IOW, the developer would have no idea that doing such will result in a
deadlock as nothing in the UFS code makes it clear, like using the same lock and
such.

- Mani

> ---
>  drivers/ufs/core/ufshcd.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 1837ae204d5e..80c0b49f30b0 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -1455,15 +1455,14 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
>  static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err)
>  {
>  	up_write(&hba->clk_scaling_lock);
> +	mutex_unlock(&hba->wb_mutex);
> +	blk_mq_unquiesce_tagset(&hba->host->tag_set);
> +	mutex_unlock(&hba->host->scan_mutex);
>  
>  	/* Enable Write Booster if current gear requires it else disable it */
>  	if (ufshcd_enable_wb_if_scaling_up(hba) && !err)
>  		ufshcd_wb_toggle(hba, hba->pwr_info.gear_rx >= hba->clk_scaling.wb_gear);
>  
> -	mutex_unlock(&hba->wb_mutex);
> -
> -	blk_mq_unquiesce_tagset(&hba->host->tag_set);
> -	mutex_unlock(&hba->host->scan_mutex);
>  	ufshcd_release(hba);
>  }
>  

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix a deadlock in the frequency scaling code
  2025-12-09 11:33 ` Manivannan Sadhasivam
@ 2025-12-09 17:51   ` Bart Van Assche
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2025-12-09 17:51 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: Martin K . Petersen, linux-scsi, Roger Shimizu, Nitin Rawat,
	James E.J. Bottomley, Peter Wang, Avri Altman, Bean Huo,
	Adrian Hunter, Bao D. Nguyen

On 12/9/25 3:33 AM, Manivannan Sadhasivam wrote:
> Thanks for the quirk fix. While it seems to have fixed the boot hang, I'm
> quite skeptical about the fix. What is the guarantee that another device
> management command won't be submitted before blk_mq_unquiesce_tagset() in the
> future? IOW, the developer would have no idea that doing such will result in a
> deadlock as nothing in the UFS code makes it clear, like using the same lock and
> such.
I think that the recent changes can be considered as a bug fix. With 
previous kernel versions, only SCSI commands were paused during clock
frequency transitions but device management commands not. With the
approach that landed in Linus' master branch, device management commands
are also paused during clock frequency transitions. Device management
commands should also be paused because these also use the communication
link between host controller and UFS device.

Regarding debugging potential future deadlocks caused by submitting a
device management command while the tag set is quiesced, a call trace
is sufficient to discover the root cause (echo w > /proc/sysrq-trigger).

Regarding future changes in ufshcd_clock_scaling_unprepare(), I think
there are two cases: changes made by developers who have access to a
test setup that supports frequency scaling and changes made by
developers who don't have access to a test setup that supports frequency
scaling. Has it already been considered to insert calls to
ufshcd_clock_scaling_prepare() and ufshcd_clock_scaling_unprepare() near
the end of ufshcd_async_scan() to make sure that this code gets executed
on test setups that do not support frequency scaling?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-12-09 17:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-04 18:15 [PATCH] ufs: core: Fix a deadlock in the frequency scaling code Bart Van Assche
2025-12-05  8:32 ` Peter Wang (王信友)
2025-12-06 15:07 ` Nitin Rawat
2025-12-07 21:48 ` Alexey Klimov
2025-12-09  2:59 ` Martin K. Petersen
2025-12-09 11:33 ` Manivannan Sadhasivam
2025-12-09 17:51   ` Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox