[PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
       [not found] <CGME20250717081220epcas1p224952b344389e4967beb893297f1ae02@epcas1p2.samsung.com>
@ 2025-07-17  8:12 ` Seunghui Lee
  2025-07-17 14:54   ` Bean Huo
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Seunghui Lee @ 2025-07-17  8:12 UTC (permalink / raw)
  To: alim.akhtar, avri.altman, bvanassche, James.Bottomley,
	martin.petersen, linux-scsi, sdriver.sec
  Cc: Seunghui Lee

If the h8 exit fails during runtime resume process,
the runtime thread enters runtime suspend immediately
and the error handler operates at the same time.
It becomes stuck and cannot be recovered through the error handler.
To fix this, use link recovery instead of the error handler.

Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
---
Changes from v1:
 * Add the Fixes tag as Beanhuo's requested.
---
 drivers/ufs/core/ufshcd.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 50adfb8b335b..dc2845c32d72 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -4340,7 +4340,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd)
 	hba->uic_async_done = NULL;
 	if (reenable_intr)
 		ufshcd_enable_intr(hba, UIC_COMMAND_COMPL);
-	if (ret) {
+	if (ret && !hba->pm_op_in_progress) {
 		ufshcd_set_link_broken(hba);
 		ufshcd_schedule_eh_work(hba);
 	}
@@ -4348,6 +4348,14 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd)
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	mutex_unlock(&hba->uic_cmd_mutex);
 
+	/*
+	 * If the h8 exit fails during the runtime resume process,
+	 * it becomes stuck and cannot be recovered through the error handler.
+	 * To fix this, use link recovery instead of the error handler.
+	 */
+	if (ret && hba->pm_op_in_progress)
+		ret = ufshcd_link_recovery(hba);
+
 	return ret;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
  2025-07-17  8:12 ` [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume Seunghui Lee
@ 2025-07-17 14:54   ` Bean Huo
  2025-07-17 17:47   ` Bart Van Assche
  2025-07-25  1:25   ` Martin K. Petersen
  2 siblings, 0 replies; 6+ messages in thread
From: Bean Huo @ 2025-07-17 14:54 UTC (permalink / raw)
  To: Seunghui Lee, alim.akhtar, avri.altman, bvanassche,
	James.Bottomley, martin.petersen, linux-scsi, sdriver.sec

On Thu, 2025-07-17 at 17:12 +0900, Seunghui Lee wrote:
> If the h8 exit fails during runtime resume process,
> the runtime thread enters runtime suspend immediately
> and the error handler operates at the same time.
> It becomes stuck and cannot be recovered through the error handler.
> To fix this, use link recovery instead of the error handler.
> 
> Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
> Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>

Based on my investigation, if child device's runtime-resume fails,the upper layer will set the status back to suspended. It assumes that
the driver or error handler will communicate with the device and handle
recovery. If there is no better solution to fix this issue than this
patch, let's go with this.

Reviewed-By: Bean Huo <beanhuo@micron.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
  2025-07-17  8:12 ` [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume Seunghui Lee
  2025-07-17 14:54   ` Bean Huo
@ 2025-07-17 17:47   ` Bart Van Assche
  2025-07-18  6:59     ` Seunghui Lee
  2025-07-25  1:25   ` Martin K. Petersen
  2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2025-07-17 17:47 UTC (permalink / raw)
  To: Seunghui Lee, alim.akhtar, avri.altman, James.Bottomley,
	martin.petersen, linux-scsi, sdriver.sec

On 7/17/25 1:12 AM, Seunghui Lee wrote:
> If the h8 exit fails during runtime resume process,
> the runtime thread enters runtime suspend immediately
> and the error handler operates at the same time.
> It becomes stuck and cannot be recovered through the error handler.
> To fix this, use link recovery instead of the error handler.
> 
> Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
> Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
> ---
> Changes from v1:
>   * Add the Fixes tag as Beanhuo's requested.
> ---
>   drivers/ufs/core/ufshcd.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 50adfb8b335b..dc2845c32d72 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -4340,7 +4340,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd)
>   	hba->uic_async_done = NULL;
>   	if (reenable_intr)
>   		ufshcd_enable_intr(hba, UIC_COMMAND_COMPL);
> -	if (ret) {
> +	if (ret && !hba->pm_op_in_progress) {
>   		ufshcd_set_link_broken(hba);
>   		ufshcd_schedule_eh_work(hba);
>   	}
> @@ -4348,6 +4348,14 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd)
>   	spin_unlock_irqrestore(hba->host->host_lock, flags);
>   	mutex_unlock(&hba->uic_cmd_mutex);
>   
> +	/*
> +	 * If the h8 exit fails during the runtime resume process,
> +	 * it becomes stuck and cannot be recovered through the error handler.
> +	 * To fix this, use link recovery instead of the error handler.
> +	 */
> +	if (ret && hba->pm_op_in_progress)
> +		ret = ufshcd_link_recovery(hba);
> +
>   	return ret;
>   }

There are multiple calls to ufshcd_uic_pwr_ctrl() from outside the
runtime power management callbacks. Hence, hba->pm_op_in_progress may
be changed from another thread while ufshcd_uic_pwr_ctrl() is in
progress. Hence, ufshcd_uic_pwr_ctrl() calls from outside runtime power
management callbacks should be serialized against these callbacks, e.g.
by surrounding these calls with pm_runtime_get_sync() and
pm_runtime_put().

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
  2025-07-17 17:47   ` Bart Van Assche
@ 2025-07-18  6:59     ` Seunghui Lee
  2025-07-18 20:48       ` Bart Van Assche
  0 siblings, 1 reply; 6+ messages in thread
From: Seunghui Lee @ 2025-07-18  6:59 UTC (permalink / raw)
  To: 'Bart Van Assche', alim.akhtar, avri.altman,
	James.Bottomley, martin.petersen, linux-scsi, sdriver.sec

> On 7/17/25 1:12 AM, Seunghui Lee wrote:
> > If the h8 exit fails during runtime resume process, the runtime thread
> > enters runtime suspend immediately and the error handler operates at
> > the same time.
> > It becomes stuck and cannot be recovered through the error handler.
> > To fix this, use link recovery instead of the error handler.
> >
> > Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and
> > other error recovery paths")
> > Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
> > ---
> > Changes from v1:
> >   * Add the Fixes tag as Beanhuo's requested.
> > ---
> >   drivers/ufs/core/ufshcd.c | 10 +++++++++-
> >   1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> > index 50adfb8b335b..dc2845c32d72 100644
> > --- a/drivers/ufs/core/ufshcd.c
> > +++ b/drivers/ufs/core/ufshcd.c
> > @@ -4340,7 +4340,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba,
> struct uic_command *cmd)
> >   	hba->uic_async_done = NULL;
> >   	if (reenable_intr)
> >   		ufshcd_enable_intr(hba, UIC_COMMAND_COMPL);
> > -	if (ret) {
> > +	if (ret && !hba->pm_op_in_progress) {
> >   		ufshcd_set_link_broken(hba);
> >   		ufshcd_schedule_eh_work(hba);
> >   	}
> > @@ -4348,6 +4348,14 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba
> *hba, struct uic_command *cmd)
> >   	spin_unlock_irqrestore(hba->host->host_lock, flags);
> >   	mutex_unlock(&hba->uic_cmd_mutex);
> >
> > +	/*
> > +	 * If the h8 exit fails during the runtime resume process,
> > +	 * it becomes stuck and cannot be recovered through the error
> handler.
> > +	 * To fix this, use link recovery instead of the error handler.
> > +	 */
> > +	if (ret && hba->pm_op_in_progress)
> > +		ret = ufshcd_link_recovery(hba);
> > +
> >   	return ret;
> >   }
> 
> There are multiple calls to ufshcd_uic_pwr_ctrl() from outside the runtime
> power management callbacks. Hence, hba->pm_op_in_progress may be changed
> from another thread while ufshcd_uic_pwr_ctrl() is in progress. Hence,
> ufshcd_uic_pwr_ctrl() calls from outside runtime power management
> callbacks should be serialized against these callbacks, e.g.
> by surrounding these calls with pm_runtime_get_sync() and pm_runtime_put().
> 
> Thanks,
> 
> Bart.

Hi Bart,

Thank you for your opinion.
I understand what you're concerned about.

I've checked multiple calls to ufshcd_uic_pwr_ctrl().
I think that follow cases looks okay.
(using rpm get/put, init sequence, synchronize with PM operation sequence)
Do you have any better idea to cover this issue?

1. ufshcd_send_bsg_uic_cmd()
ufs_bsg_request()	// using rpm get sync, put

2. __ufshcd_wl_suspend() / __ufshcd_wl_resume()	// runtime sequence

3. ufshcd_change_pwr_mode()
ufshcd_dme_get_attr()
	ufshcd_get_max_pwr_mode() from ufshcd_device_init()	// init
	ufshcd_disable_tx_lcc() from ufshcd_link_startup()	// init
	ufshcd_is_pwr_mode_restore_needed() from ufshcd_err_handler()	// using rpm get sync, put
	ufshcd_quirk_tune_host_pa_tactivate()	// init
	ufshcd_quirk_override_pa_h8time()		// init
ufshcd_config_pwr_mode()
	ufschd_scale_gear()	// synchronize with __ufshcd_wl_resume() / __ufshcd_wl_suspend()
	ufschd_err_handler()	// using rpm get sync, put
	ufshcd_post_device_init()	// init


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
  2025-07-18  6:59     ` Seunghui Lee
@ 2025-07-18 20:48       ` Bart Van Assche
  0 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2025-07-18 20:48 UTC (permalink / raw)
  To: Seunghui Lee, alim.akhtar, avri.altman, James.Bottomley,
	martin.petersen, linux-scsi, sdriver.sec

On 7/17/25 11:59 PM, Seunghui Lee wrote:
> I've checked multiple calls to ufshcd_uic_pwr_ctrl().
> I think that follow cases looks okay.
> (using rpm get/put, init sequence, synchronize with PM operation sequence)
> Do you have any better idea to cover this issue?
> 
> 1. ufshcd_send_bsg_uic_cmd()
> ufs_bsg_request()	// using rpm get sync, put
> 
> 2. __ufshcd_wl_suspend() / __ufshcd_wl_resume()	// runtime sequence
> 
> 3. ufshcd_change_pwr_mode()
> ufshcd_dme_get_attr()
> 	ufshcd_get_max_pwr_mode() from ufshcd_device_init()	// init
> 	ufshcd_disable_tx_lcc() from ufshcd_link_startup()	// init
> 	ufshcd_is_pwr_mode_restore_needed() from ufshcd_err_handler()	// using rpm get sync, put
> 	ufshcd_quirk_tune_host_pa_tactivate()	// init
> 	ufshcd_quirk_override_pa_h8time()		// init
> ufshcd_config_pwr_mode()
> 	ufschd_scale_gear()	// synchronize with __ufshcd_wl_resume() / __ufshcd_wl_suspend()
> 	ufschd_err_handler()	// using rpm get sync, put
> 	ufshcd_post_device_init()	// init
> 

Hi Seunghui,

Thank you for this summary. The call I was most concerned about is the
call from ufschd_scale_gear(). You are right that these calls are
already serialized against runtime power management. In other words, I'm
now fine with this patch.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume
  2025-07-17  8:12 ` [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume Seunghui Lee
  2025-07-17 14:54   ` Bean Huo
  2025-07-17 17:47   ` Bart Van Assche
@ 2025-07-25  1:25   ` Martin K. Petersen
  2 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2025-07-25  1:25 UTC (permalink / raw)
  To: Seunghui Lee
  Cc: alim.akhtar, avri.altman, bvanassche, James.Bottomley,
	martin.petersen, linux-scsi, sdriver.sec


Seunghui,

> If the h8 exit fails during runtime resume process, the runtime thread
> enters runtime suspend immediately and the error handler operates at
> the same time. It becomes stuck and cannot be recovered through the
> error handler. To fix this, use link recovery instead of the error
> handler.

Applied to 6.17/scsi-staging, thanks!

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-07-25  1:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20250717081220epcas1p224952b344389e4967beb893297f1ae02@epcas1p2.samsung.com>
2025-07-17  8:12 ` [PATCH v2] ufs: core: Use link recovery when the h8 exit failure during runtime resume Seunghui Lee
2025-07-17 14:54   ` Bean Huo
2025-07-17 17:47   ` Bart Van Assche
2025-07-18  6:59     ` Seunghui Lee
2025-07-18 20:48       ` Bart Van Assche
2025-07-25  1:25   ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).