* [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
@ 2022-07-21 6:58 peter.wang
2022-07-22 21:12 ` Bart Van Assche
0 siblings, 1 reply; 6+ messages in thread
From: peter.wang @ 2022-07-21 6:58 UTC (permalink / raw)
To: stanley.chu, linux-scsi, martin.petersen, avri.altman,
alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui
From: Peter Wang <peter.wang@mediatek.com>
Both ufshcd_shtdown and ufshcd_wl_shutdown could run concurrently.
And it have race condition when ufshcd_wl_shutdown on-going and
clock/power turn off by ufshcd_shutdown.
The normal case:
ufshcd_wl_shutdown -> ufshcd_shtdown
ufshcd_shtdown should turn off clock/power.
The abnormal case:
ufshcd_shtdown -> ufshcd_wl_shutdown
Wait ufshcd_wl_shutdown set device to power off and turn off clock/power.
If timeout happen, means device still in active mode, cannot turn off
clock/power directly. Skip and keep clock/power on in this case.
Also remove pm_runtime_get_sync because shutdown is focus on
turn off clock/power. We don't need turn on(resume) and turn off.
The second reason is ufshcd_wl_shutdown call ufshcd_rpm_get_sync
already, if ufshcd_shtdown wait ufshcd_wl_shutdown finish,
hba->dev is already resume and no need pm_runtime_get_sync.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c7b337480e3e..47b639fd28b9 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -58,6 +58,9 @@
/* Task management command timeout */
#define TM_CMD_TIMEOUT 100 /* msecs */
+/* Shutdown wait devcie into power off timeout */
+#define UFS_SHUTDOWN_TIMEOUT 500 /* msecs */
+
/* maximum number of retries for a general UIC command */
#define UFS_UIC_COMMAND_RETRIES 3
@@ -9461,10 +9464,15 @@ EXPORT_SYMBOL(ufshcd_runtime_resume);
*/
int ufshcd_shutdown(struct ufs_hba *hba)
{
- if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
- goto out;
+ unsigned long timeout;
- pm_runtime_get_sync(hba->dev);
+ /* Wait ufshcd_wl_shutdown clear ufs state */
+ timeout = jiffies + msecs_to_jiffies(UFS_SHUTDOWN_TIMEOUT);
+ while (!ufshcd_is_ufs_dev_poweroff(hba) || !ufshcd_is_link_off(hba)) {
+ if (time_after(jiffies, timeout))
+ goto out;
+ msleep(1);
+ }
ufshcd_suspend(hba);
out:
--
2.18.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
2022-07-21 6:58 [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow peter.wang
@ 2022-07-22 21:12 ` Bart Van Assche
2022-07-25 2:11 ` Stanley Chu
2022-07-25 3:47 ` Peter Wang
0 siblings, 2 replies; 6+ messages in thread
From: Bart Van Assche @ 2022-07-22 21:12 UTC (permalink / raw)
To: peter.wang, stanley.chu, linux-scsi, martin.petersen, avri.altman,
alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui
On 7/20/22 23:58, peter.wang@mediatek.com wrote:
> Both ufshcd_shtdown and ufshcd_wl_shutdown could run concurrently.
> And it have race condition when ufshcd_wl_shutdown on-going and
> clock/power turn off by ufshcd_shutdown.
>
> The normal case:
> ufshcd_wl_shutdown -> ufshcd_shtdown
> ufshcd_shtdown should turn off clock/power.
>
> The abnormal case:
> ufshcd_shtdown -> ufshcd_wl_shutdown
How can this happen since device_shutdown() iterates over devices in the
opposite order in which these have been created?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
2022-07-22 21:12 ` Bart Van Assche
@ 2022-07-25 2:11 ` Stanley Chu
2022-07-25 3:47 ` Peter Wang
1 sibling, 0 replies; 6+ messages in thread
From: Stanley Chu @ 2022-07-25 2:11 UTC (permalink / raw)
To: Bart Van Assche
Cc: peter.wang, Stanley Chu, linux-scsi, Martin K . Petersen,
Avri Altman, alim.akhtar, James E.J. Bottomley, wsd_upstream,
linux-mediatek, chun-hung.wu, alice.chao, cc.chou, chaotian.jing,
jiajie.hao, powen.kao, qilin.tan, lin.gui
On Sat, Jul 23, 2022 at 5:15 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 7/20/22 23:58, peter.wang@mediatek.com wrote:
> > Both ufshcd_shtdown and ufshcd_wl_shutdown could run concurrently.
> > And it have race condition when ufshcd_wl_shutdown on-going and
> > clock/power turn off by ufshcd_shutdown.
> >
> > The normal case:
> > ufshcd_wl_shutdown -> ufshcd_shtdown
> > ufshcd_shtdown should turn off clock/power.
> >
> > The abnormal case:
> > ufshcd_shtdown -> ufshcd_wl_shutdown
>
> How can this happen since device_shutdown() iterates over devices in the
> opposite order in which these have been created?
>
Is it possible for more than one initiator to invoke
kernel_power_off() at the same time?
In this case, the order of device shutdown is not promised anymore
because devices_kset is manipulated simultaneously by multiple
initiators in device_shutdown().
> Thanks,
>
> Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
2022-07-22 21:12 ` Bart Van Assche
2022-07-25 2:11 ` Stanley Chu
@ 2022-07-25 3:47 ` Peter Wang
2022-07-25 17:07 ` Bart Van Assche
1 sibling, 1 reply; 6+ messages in thread
From: Peter Wang @ 2022-07-25 3:47 UTC (permalink / raw)
To: Bart Van Assche, stanley.chu, linux-scsi, martin.petersen,
avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui
On 7/23/22 5:12 AM, Bart Van Assche wrote:
> On 7/20/22 23:58, peter.wang@mediatek.com wrote:
>> Both ufshcd_shtdown and ufshcd_wl_shutdown could run concurrently.
>> And it have race condition when ufshcd_wl_shutdown on-going and
>> clock/power turn off by ufshcd_shutdown.
>>
>> The normal case:
>> ufshcd_wl_shutdown -> ufshcd_shtdown
>> ufshcd_shtdown should turn off clock/power.
>>
>> The abnormal case:
>> ufshcd_shtdown -> ufshcd_wl_shutdown
>
> How can this happen since device_shutdown() iterates over devices in
> the opposite order in which these have been created?
>
> Thanks,
>
> Bart.
Hi Bart,
Because kernel_restart is export, and mediatek may call kernel_restart
while shutdown is on going.
EXPORT_SYMBOL_GPL(kernel_restart);
kernel_restart -> kernel_restart_prepare -> device_shutdown
So, there may have two thread execute device_shutdown concurrently.
And the list_lock in device_shutdown function is not protect shutdown
callback function,
caused two callback function(ufshcd_shutdown/ufshcd_wl_shutdown) could
run concurrently.
Here is the error log that two thread in device_shutdown.
[37684.002227] [T1500641] platform +platform:112b0000.ufshci
kpoc_charger: ufshcd-mtk 112b0000.ufshci: [name:core&]shutdown
[37684.002264] [T1600339] scsi +scsi:0:0:0:49488 charger_thread:
ufs_device_wlun 0:0:0:49488: [name:core&]shutdown
Thanks.
BR
Peter
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
2022-07-25 3:47 ` Peter Wang
@ 2022-07-25 17:07 ` Bart Van Assche
2022-07-26 9:16 ` Peter Wang
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2022-07-25 17:07 UTC (permalink / raw)
To: Peter Wang, stanley.chu, linux-scsi, martin.petersen, avri.altman,
alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui
On 7/24/22 20:47, Peter Wang wrote:
> Because kernel_restart is export, and mediatek may call kernel_restart
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Is this code upstream?
> while shutdown is on going.
> EXPORT_SYMBOL_GPL(kernel_restart);
> kernel_restart -> kernel_restart_prepare -> device_shutdown
>
> So, there may have two thread execute device_shutdown concurrently.
> And the list_lock in device_shutdown function is not protect shutdown
> callback function,
> caused two callback function(ufshcd_shutdown/ufshcd_wl_shutdown) could
> run concurrently.
>
> Here is the error log that two thread in device_shutdown.
> [37684.002227] [T1500641] platform +platform:112b0000.ufshci
> kpoc_charger: ufshcd-mtk 112b0000.ufshci: [name:core&]shutdown
> [37684.002264] [T1600339] scsi +scsi:0:0:0:49488 charger_thread:
> ufs_device_wlun 0:0:0:49488: [name:core&]shutdown
Hi Peter,
I had not yet taken a look at the kernel_restart() function. Now that I
have taken a look, it seems to me that kernel_restart() calls must be
serialized via the system_transition_mutex. Please make sure that the
kernel_restart() calls are serialized.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow
2022-07-25 17:07 ` Bart Van Assche
@ 2022-07-26 9:16 ` Peter Wang
0 siblings, 0 replies; 6+ messages in thread
From: Peter Wang @ 2022-07-26 9:16 UTC (permalink / raw)
To: Bart Van Assche, stanley.chu, linux-scsi, martin.petersen,
avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui
Hi Bart,
On 7/26/22 1:07 AM, Bart Van Assche wrote:
> On 7/24/22 20:47, Peter Wang wrote:
>> Because kernel_restart is export, and mediatek may call kernel_restart
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Is this code upstream?
>
No, it not upstream.
And sorry that I correct mediatek usage, it is kernel_power_off, not
kernel_restart.
>> while shutdown is on going.
>> EXPORT_SYMBOL_GPL(kernel_restart);
>> kernel_restart -> kernel_restart_prepare -> device_shutdown
>>
>> So, there may have two thread execute device_shutdown concurrently.
>> And the list_lock in device_shutdown function is not protect shutdown
>> callback function,
>> caused two callback function(ufshcd_shutdown/ufshcd_wl_shutdown)
>> could run concurrently.
>>
>> Here is the error log that two thread in device_shutdown.
>> [37684.002227] [T1500641] platform +platform:112b0000.ufshci
>> kpoc_charger: ufshcd-mtk 112b0000.ufshci: [name:core&]shutdown
>> [37684.002264] [T1600339] scsi +scsi:0:0:0:49488 charger_thread:
>> ufs_device_wlun 0:0:0:49488: [name:core&]shutdown
>
> Hi Peter,
>
> I had not yet taken a look at the kernel_restart() function. Now that
> I have taken a look, it seems to me that kernel_restart() calls must
> be serialized via the system_transition_mutex. Please make sure that
> the kernel_restart() calls are serialized.
>
> Thanks,
>
> Bart.
I think kernel_power_off could use system_transition_mutex to protect
shutdown racing.
We will try it, Thanks for the suggestion.
And if no need wait, it could more simple in this patch. I will upload
next version again.
Thanks.
Peter
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-07-26 9:16 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-07-21 6:58 [PATCH v3] scsi: ufs: correct ufshcd_shutdown flow peter.wang
2022-07-22 21:12 ` Bart Van Assche
2022-07-25 2:11 ` Stanley Chu
2022-07-25 3:47 ` Peter Wang
2022-07-25 17:07 ` Bart Van Assche
2022-07-26 9:16 ` Peter Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox