* [PATH v3 0/2] Two UFS bug fixes @ 2021-01-06 21:41 Jaegeuk Kim 2021-01-06 21:41 ` [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns Jaegeuk Kim 2021-01-06 21:41 ` [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd Jaegeuk Kim 0 siblings, 2 replies; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-06 21:41 UTC (permalink / raw) To: linux-kernel, linux-scsi, kernel-team Cc: cang, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu Change log from v2: - fix build warning ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns 2021-01-06 21:41 [PATH v3 0/2] Two UFS bug fixes Jaegeuk Kim @ 2021-01-06 21:41 ` Jaegeuk Kim 2021-01-07 6:10 ` Can Guo 2021-01-06 21:41 ` [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd Jaegeuk Kim 1 sibling, 1 reply; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-06 21:41 UTC (permalink / raw) To: linux-kernel, linux-scsi, kernel-team Cc: cang, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu, Jaegeuk Kim When gate_work/ungate_work gets an error during hibern8_enter or exit, ufshcd_err_handler() ufshcd_scsi_block_requests() ufshcd_reset_and_restore() ufshcd_clear_ua_wluns() -> stuck ufshcd_scsi_unblock_requests() In order to avoid it, ufshcd_clear_ua_wluns() can be called per recovery flows such as suspend/resume, link_recovery, and error_handler. Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> --- drivers/scsi/ufs/ufshcd.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index bedb822a40a3..1678cec08b51 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba) if (ret) dev_err(hba->dev, "%s: link recovery failed, err %d", __func__, ret); + else + ufshcd_clear_ua_wluns(hba); return ret; } @@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct *work) ufshcd_scsi_unblock_requests(hba); ufshcd_err_handling_unprepare(hba); up(&hba->eh_sem); + + if (!err && needs_reset) + ufshcd_clear_ua_wluns(hba); } /** @@ -6940,14 +6945,11 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba) ufshcd_set_clk_freq(hba, true); err = ufshcd_hba_enable(hba); - if (err) - goto out; /* Establish the link again and restore the device */ - err = ufshcd_probe_hba(hba, false); if (!err) - ufshcd_clear_ua_wluns(hba); -out: + err = ufshcd_probe_hba(hba, false); + if (err) dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err); ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err); @@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op) ufshcd_resume_clkscaling(hba); hba->clk_gating.is_suspended = false; hba->dev_info.b_rpm_dev_flush_capable = false; + ufshcd_clear_ua_wluns(hba); ufshcd_release(hba); out: if (hba->dev_info.b_rpm_dev_flush_capable) { @@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op) cancel_delayed_work(&hba->rpm_dev_flush_recheck_work); } + ufshcd_clear_ua_wluns(hba); + /* Schedule clock gating in case of no access to UFS device yet */ ufshcd_release(hba); -- 2.29.2.729.g45daf8777d-goog ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns 2021-01-06 21:41 ` [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns Jaegeuk Kim @ 2021-01-07 6:10 ` Can Guo 2021-01-07 6:57 ` Jaegeuk Kim 0 siblings, 1 reply; 14+ messages in thread From: Can Guo @ 2021-01-07 6:10 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 2021-01-07 05:41, Jaegeuk Kim wrote: > When gate_work/ungate_work gets an error during hibern8_enter or exit, > ufshcd_err_handler() > ufshcd_scsi_block_requests() > ufshcd_reset_and_restore() > ufshcd_clear_ua_wluns() -> stuck > ufshcd_scsi_unblock_requests() > > In order to avoid it, ufshcd_clear_ua_wluns() can be called per > recovery flows > such as suspend/resume, link_recovery, and error_handler. > > Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd > resets") > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > --- > drivers/scsi/ufs/ufshcd.c | 15 ++++++++++----- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index bedb822a40a3..1678cec08b51 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba) > if (ret) > dev_err(hba->dev, "%s: link recovery failed, err %d", > __func__, ret); > + else > + ufshcd_clear_ua_wluns(hba); Can we put it right after ufshcd_scsi_add_wlus() in ufshcd_add_lus()? Thanks, Can Guo. > > return ret; > } > @@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct > *work) > ufshcd_scsi_unblock_requests(hba); > ufshcd_err_handling_unprepare(hba); > up(&hba->eh_sem); > + > + if (!err && needs_reset) > + ufshcd_clear_ua_wluns(hba); > } > > /** > @@ -6940,14 +6945,11 @@ static int > ufshcd_host_reset_and_restore(struct ufs_hba *hba) > ufshcd_set_clk_freq(hba, true); > > err = ufshcd_hba_enable(hba); > - if (err) > - goto out; > > /* Establish the link again and restore the device */ > - err = ufshcd_probe_hba(hba, false); > if (!err) > - ufshcd_clear_ua_wluns(hba); > -out: > + err = ufshcd_probe_hba(hba, false); > + > if (err) > dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err); > ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err); > @@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, > enum ufs_pm_op pm_op) > ufshcd_resume_clkscaling(hba); > hba->clk_gating.is_suspended = false; > hba->dev_info.b_rpm_dev_flush_capable = false; > + ufshcd_clear_ua_wluns(hba); > ufshcd_release(hba); > out: > if (hba->dev_info.b_rpm_dev_flush_capable) { > @@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, > enum ufs_pm_op pm_op) > cancel_delayed_work(&hba->rpm_dev_flush_recheck_work); > } > > + ufshcd_clear_ua_wluns(hba); > + > /* Schedule clock gating in case of no access to UFS device yet */ > ufshcd_release(hba); ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns 2021-01-07 6:10 ` Can Guo @ 2021-01-07 6:57 ` Jaegeuk Kim 2021-01-07 7:09 ` Can Guo 0 siblings, 1 reply; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-07 6:57 UTC (permalink / raw) To: Can Guo Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 01/07, Can Guo wrote: > On 2021-01-07 05:41, Jaegeuk Kim wrote: > > When gate_work/ungate_work gets an error during hibern8_enter or exit, > > ufshcd_err_handler() > > ufshcd_scsi_block_requests() > > ufshcd_reset_and_restore() > > ufshcd_clear_ua_wluns() -> stuck > > ufshcd_scsi_unblock_requests() > > > > In order to avoid it, ufshcd_clear_ua_wluns() can be called per recovery > > flows > > such as suspend/resume, link_recovery, and error_handler. > > > > Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd > > resets") > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > --- > > drivers/scsi/ufs/ufshcd.c | 15 ++++++++++----- > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > index bedb822a40a3..1678cec08b51 100644 > > --- a/drivers/scsi/ufs/ufshcd.c > > +++ b/drivers/scsi/ufs/ufshcd.c > > @@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba) > > if (ret) > > dev_err(hba->dev, "%s: link recovery failed, err %d", > > __func__, ret); > > + else > > + ufshcd_clear_ua_wluns(hba); > > Can we put it right after ufshcd_scsi_add_wlus() in ufshcd_add_lus()? May I ask the reason? We'll call it after ufshcd_add_lus() later tho. > > Thanks, > Can Guo. > > > > > return ret; > > } > > @@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct > > *work) > > ufshcd_scsi_unblock_requests(hba); > > ufshcd_err_handling_unprepare(hba); > > up(&hba->eh_sem); > > + > > + if (!err && needs_reset) > > + ufshcd_clear_ua_wluns(hba); > > } > > > > /** > > @@ -6940,14 +6945,11 @@ static int > > ufshcd_host_reset_and_restore(struct ufs_hba *hba) > > ufshcd_set_clk_freq(hba, true); > > > > err = ufshcd_hba_enable(hba); > > - if (err) > > - goto out; > > > > /* Establish the link again and restore the device */ > > - err = ufshcd_probe_hba(hba, false); > > if (!err) > > - ufshcd_clear_ua_wluns(hba); > > -out: > > + err = ufshcd_probe_hba(hba, false); > > + > > if (err) > > dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err); > > ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err); > > @@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, > > enum ufs_pm_op pm_op) > > ufshcd_resume_clkscaling(hba); > > hba->clk_gating.is_suspended = false; > > hba->dev_info.b_rpm_dev_flush_capable = false; > > + ufshcd_clear_ua_wluns(hba); > > ufshcd_release(hba); > > out: > > if (hba->dev_info.b_rpm_dev_flush_capable) { > > @@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, > > enum ufs_pm_op pm_op) > > cancel_delayed_work(&hba->rpm_dev_flush_recheck_work); > > } > > > > + ufshcd_clear_ua_wluns(hba); > > + > > /* Schedule clock gating in case of no access to UFS device yet */ > > ufshcd_release(hba); ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns 2021-01-07 6:57 ` Jaegeuk Kim @ 2021-01-07 7:09 ` Can Guo 2021-01-07 7:40 ` Jaegeuk Kim 0 siblings, 1 reply; 14+ messages in thread From: Can Guo @ 2021-01-07 7:09 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 2021-01-07 14:57, Jaegeuk Kim wrote: > On 01/07, Can Guo wrote: >> On 2021-01-07 05:41, Jaegeuk Kim wrote: >> > When gate_work/ungate_work gets an error during hibern8_enter or exit, >> > ufshcd_err_handler() >> > ufshcd_scsi_block_requests() >> > ufshcd_reset_and_restore() >> > ufshcd_clear_ua_wluns() -> stuck >> > ufshcd_scsi_unblock_requests() >> > >> > In order to avoid it, ufshcd_clear_ua_wluns() can be called per recovery >> > flows >> > such as suspend/resume, link_recovery, and error_handler. >> > >> > Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd >> > resets") >> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >> > --- >> > drivers/scsi/ufs/ufshcd.c | 15 ++++++++++----- >> > 1 file changed, 10 insertions(+), 5 deletions(-) >> > >> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> > index bedb822a40a3..1678cec08b51 100644 >> > --- a/drivers/scsi/ufs/ufshcd.c >> > +++ b/drivers/scsi/ufs/ufshcd.c >> > @@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba) >> > if (ret) >> > dev_err(hba->dev, "%s: link recovery failed, err %d", >> > __func__, ret); >> > + else >> > + ufshcd_clear_ua_wluns(hba); >> >> Can we put it right after ufshcd_scsi_add_wlus() in ufshcd_add_lus()? > > May I ask the reason? We'll call it after ufshcd_add_lus() later tho. > I think the code will be more readable - we do all the LU related stuffs in one func, just nit-picking though. I found this because I am planning to move the devfreq init codes out of ufshcd_add_lus() due to it is inappropriate to init devfreq in there by its naming, but it might be a good place for ufshcd_clear_ua_wluns(). Thanks, Can Guo. >> >> Thanks, >> Can Guo. >> >> > >> > return ret; >> > } >> > @@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct >> > *work) >> > ufshcd_scsi_unblock_requests(hba); >> > ufshcd_err_handling_unprepare(hba); >> > up(&hba->eh_sem); >> > + >> > + if (!err && needs_reset) >> > + ufshcd_clear_ua_wluns(hba); >> > } >> > >> > /** >> > @@ -6940,14 +6945,11 @@ static int >> > ufshcd_host_reset_and_restore(struct ufs_hba *hba) >> > ufshcd_set_clk_freq(hba, true); >> > >> > err = ufshcd_hba_enable(hba); >> > - if (err) >> > - goto out; >> > >> > /* Establish the link again and restore the device */ >> > - err = ufshcd_probe_hba(hba, false); >> > if (!err) >> > - ufshcd_clear_ua_wluns(hba); >> > -out: >> > + err = ufshcd_probe_hba(hba, false); >> > + >> > if (err) >> > dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err); >> > ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err); >> > @@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, >> > enum ufs_pm_op pm_op) >> > ufshcd_resume_clkscaling(hba); >> > hba->clk_gating.is_suspended = false; >> > hba->dev_info.b_rpm_dev_flush_capable = false; >> > + ufshcd_clear_ua_wluns(hba); >> > ufshcd_release(hba); >> > out: >> > if (hba->dev_info.b_rpm_dev_flush_capable) { >> > @@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, >> > enum ufs_pm_op pm_op) >> > cancel_delayed_work(&hba->rpm_dev_flush_recheck_work); >> > } >> > >> > + ufshcd_clear_ua_wluns(hba); >> > + >> > /* Schedule clock gating in case of no access to UFS device yet */ >> > ufshcd_release(hba); ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns 2021-01-07 7:09 ` Can Guo @ 2021-01-07 7:40 ` Jaegeuk Kim 0 siblings, 0 replies; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-07 7:40 UTC (permalink / raw) To: Can Guo Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 01/07, Can Guo wrote: > On 2021-01-07 14:57, Jaegeuk Kim wrote: > > On 01/07, Can Guo wrote: > > > On 2021-01-07 05:41, Jaegeuk Kim wrote: > > > > When gate_work/ungate_work gets an error during hibern8_enter or exit, > > > > ufshcd_err_handler() > > > > ufshcd_scsi_block_requests() > > > > ufshcd_reset_and_restore() > > > > ufshcd_clear_ua_wluns() -> stuck > > > > ufshcd_scsi_unblock_requests() > > > > > > > > In order to avoid it, ufshcd_clear_ua_wluns() can be called per recovery > > > > flows > > > > such as suspend/resume, link_recovery, and error_handler. > > > > > > > > Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd > > > > resets") > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > --- > > > > drivers/scsi/ufs/ufshcd.c | 15 ++++++++++----- > > > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > > > index bedb822a40a3..1678cec08b51 100644 > > > > --- a/drivers/scsi/ufs/ufshcd.c > > > > +++ b/drivers/scsi/ufs/ufshcd.c > > > > @@ -3996,6 +3996,8 @@ int ufshcd_link_recovery(struct ufs_hba *hba) > > > > if (ret) > > > > dev_err(hba->dev, "%s: link recovery failed, err %d", > > > > __func__, ret); > > > > + else > > > > + ufshcd_clear_ua_wluns(hba); > > > > > > Can we put it right after ufshcd_scsi_add_wlus() in ufshcd_add_lus()? > > > > May I ask the reason? We'll call it after ufshcd_add_lus() later tho. > > > > I think the code will be more readable - we do all the LU related > stuffs in one func, just nit-picking though. I found this because > I am planning to move the devfreq init codes out of ufshcd_add_lus() > due to it is inappropriate to init devfreq in there by its naming, > but it might be a good place for ufshcd_clear_ua_wluns(). Ok, that looks good to me. Thanks. > > Thanks, > Can Guo. > > > > > > > Thanks, > > > Can Guo. > > > > > > > > > > > return ret; > > > > } > > > > @@ -6003,6 +6005,9 @@ static void ufshcd_err_handler(struct work_struct > > > > *work) > > > > ufshcd_scsi_unblock_requests(hba); > > > > ufshcd_err_handling_unprepare(hba); > > > > up(&hba->eh_sem); > > > > + > > > > + if (!err && needs_reset) > > > > + ufshcd_clear_ua_wluns(hba); > > > > } > > > > > > > > /** > > > > @@ -6940,14 +6945,11 @@ static int > > > > ufshcd_host_reset_and_restore(struct ufs_hba *hba) > > > > ufshcd_set_clk_freq(hba, true); > > > > > > > > err = ufshcd_hba_enable(hba); > > > > - if (err) > > > > - goto out; > > > > > > > > /* Establish the link again and restore the device */ > > > > - err = ufshcd_probe_hba(hba, false); > > > > if (!err) > > > > - ufshcd_clear_ua_wluns(hba); > > > > -out: > > > > + err = ufshcd_probe_hba(hba, false); > > > > + > > > > if (err) > > > > dev_err(hba->dev, "%s: Host init failed %d\n", __func__, err); > > > > ufshcd_update_evt_hist(hba, UFS_EVT_HOST_RESET, (u32)err); > > > > @@ -8777,6 +8779,7 @@ static int ufshcd_suspend(struct ufs_hba *hba, > > > > enum ufs_pm_op pm_op) > > > > ufshcd_resume_clkscaling(hba); > > > > hba->clk_gating.is_suspended = false; > > > > hba->dev_info.b_rpm_dev_flush_capable = false; > > > > + ufshcd_clear_ua_wluns(hba); > > > > ufshcd_release(hba); > > > > out: > > > > if (hba->dev_info.b_rpm_dev_flush_capable) { > > > > @@ -8887,6 +8890,8 @@ static int ufshcd_resume(struct ufs_hba *hba, > > > > enum ufs_pm_op pm_op) > > > > cancel_delayed_work(&hba->rpm_dev_flush_recheck_work); > > > > } > > > > > > > > + ufshcd_clear_ua_wluns(hba); > > > > + > > > > /* Schedule clock gating in case of no access to UFS device yet */ > > > > ufshcd_release(hba); ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-06 21:41 [PATH v3 0/2] Two UFS bug fixes Jaegeuk Kim 2021-01-06 21:41 ` [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns Jaegeuk Kim @ 2021-01-06 21:41 ` Jaegeuk Kim 2021-01-07 6:07 ` Can Guo 2021-01-07 6:38 ` Can Guo 1 sibling, 2 replies; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-06 21:41 UTC (permalink / raw) To: linux-kernel, linux-scsi, kernel-team Cc: cang, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu, Jaegeuk Kim, Jaegeuk Kim From: Jaegeuk Kim <jaegeuk@google.com> This fixes a warning caused by wrong reserve tag usage in __ufshcd_issue_tm_cmd. WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c And, in ufshcd_err_handler(), we can avoid to send tm_cmd before aborting outstanding commands by waiting a bit for IO completion like this. __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> --- drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 1678cec08b51..47fc8da3cbf9 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -44,6 +44,9 @@ /* Query request timeout */ #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ +/* LINERESET TIME OUT */ +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ + /* Task management command timeout */ #define TM_CMD_TIMEOUT 100 /* msecs */ @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct *work) * check if power mode restore is needed. */ if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { + ktime_t start = ktime_get(); + hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; if (!hba->saved_uic_err) hba->saved_err &= ~UIC_ERROR; @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct *work) if (ufshcd_is_pwr_mode_restore_needed(hba)) needs_restore = true; spin_lock_irqsave(hba->host->host_lock, flags); + /* Wait for IO completion to avoid aborting IOs */ + while (hba->outstanding_reqs) { + ufshcd_complete_requests(hba); + spin_unlock_irqrestore(hba->host->host_lock, flags); + schedule(); + spin_lock_irqsave(hba->host->host_lock, flags); + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > + LINERESET_IO_TIMEOUT_MS) { + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", + __func__, hba->outstanding_reqs); + break; + } + } + if (!hba->saved_err && !needs_restore) goto skip_err_handling; } @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba) intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); } - if (enabled_intr_status && retval == IRQ_NONE) { - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", - __func__, intr_status); + if (enabled_intr_status && retval == IRQ_NONE && + !ufshcd_eh_in_progress(hba)) { + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 0x%08x)\n", + __func__, + intr_status, + hba->ufs_stats.last_intr_status, + enabled_intr_status); ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); } @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, * Even though we use wait_event() which sleeps indefinitely, * the maximum wait time is bounded by %TM_CMD_TIMEOUT. */ - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | + BLK_MQ_REQ_NOWAIT); + if (IS_ERR(req)) + return PTR_ERR(req); + req->end_io_data = &wait; free_slot = req->tag; WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) hba->tmf_tag_set = (struct blk_mq_tag_set) { .nr_hw_queues = 1, + .reserved_tags = 1, .queue_depth = hba->nutmrs, .ops = &ufshcd_tmf_ops, .flags = BLK_MQ_F_NO_SCHED, -- 2.29.2.729.g45daf8777d-goog ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-06 21:41 ` [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd Jaegeuk Kim @ 2021-01-07 6:07 ` Can Guo 2021-01-07 6:51 ` Jaegeuk Kim 2021-01-07 6:38 ` Can Guo 1 sibling, 1 reply; 14+ messages in thread From: Can Guo @ 2021-01-07 6:07 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu, Jaegeuk Kim Hi Jaegeuk, On 2021-01-07 05:41, Jaegeuk Kim wrote: > From: Jaegeuk Kim <jaegeuk@google.com> > > This fixes a warning caused by wrong reserve tag usage in > __ufshcd_issue_tm_cmd. > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 > blk_get_request+0x68/0x70 > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > blk_mq_get_tag+0x438/0x46c > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before > aborting > outstanding commands by waiting a bit for IO completion like this. > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > --- > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- > 1 file changed, 32 insertions(+), 4 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 1678cec08b51..47fc8da3cbf9 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -44,6 +44,9 @@ > /* Query request timeout */ > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ > > +/* LINERESET TIME OUT */ > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ > + > /* Task management command timeout */ > #define TM_CMD_TIMEOUT 100 /* msecs */ > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct > *work) > * check if power mode restore is needed. > */ > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { > + ktime_t start = ktime_get(); I don't see the connection btw line-reset and following tmf cmd. My point is that line-reset is not the only non-fatal error which leads us to the following tmf cmd. So the wait should be outside of this check - just put it right before clearing outstanding reqs. Thanks, Can Guo. > + > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; > if (!hba->saved_uic_err) > hba->saved_err &= ~UIC_ERROR; > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct > work_struct *work) > if (ufshcd_is_pwr_mode_restore_needed(hba)) > needs_restore = true; > spin_lock_irqsave(hba->host->host_lock, flags); > + /* Wait for IO completion to avoid aborting IOs */ > + while (hba->outstanding_reqs) { > + ufshcd_complete_requests(hba); > + spin_unlock_irqrestore(hba->host->host_lock, flags); > + schedule(); > + spin_lock_irqsave(hba->host->host_lock, flags); > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > > + LINERESET_IO_TIMEOUT_MS) { > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", > + __func__, hba->outstanding_reqs); > + break; > + } > + } > + > if (!hba->saved_err && !needs_restore) > goto skip_err_handling; > } > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void > *__hba) > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > } > > - if (enabled_intr_status && retval == IRQ_NONE) { > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > - __func__, intr_status); > + if (enabled_intr_status && retval == IRQ_NONE && > + !ufshcd_eh_in_progress(hba)) { > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > 0x%08x)\n", > + __func__, > + intr_status, > + hba->ufs_stats.last_intr_status, > + enabled_intr_status); > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > } > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > *hba, > * Even though we use wait_event() which sleeps indefinitely, > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > */ > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | > + BLK_MQ_REQ_NOWAIT); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > req->end_io_data = &wait; > free_slot = req->tag; > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void > __iomem *mmio_base, unsigned int irq) > > hba->tmf_tag_set = (struct blk_mq_tag_set) { > .nr_hw_queues = 1, > + .reserved_tags = 1, > .queue_depth = hba->nutmrs, > .ops = &ufshcd_tmf_ops, > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-07 6:07 ` Can Guo @ 2021-01-07 6:51 ` Jaegeuk Kim 0 siblings, 0 replies; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-07 6:51 UTC (permalink / raw) To: Can Guo Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 01/07, Can Guo wrote: > Hi Jaegeuk, > > On 2021-01-07 05:41, Jaegeuk Kim wrote: > > From: Jaegeuk Kim <jaegeuk@google.com> > > > > This fixes a warning caused by wrong reserve tag usage in > > __ufshcd_issue_tm_cmd. > > > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 > > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > > blk_mq_get_tag+0x438/0x46c > > > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before > > aborting > > outstanding commands by waiting a bit for IO completion like this. > > > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > --- > > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- > > 1 file changed, 32 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > index 1678cec08b51..47fc8da3cbf9 100644 > > --- a/drivers/scsi/ufs/ufshcd.c > > +++ b/drivers/scsi/ufs/ufshcd.c > > @@ -44,6 +44,9 @@ > > /* Query request timeout */ > > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ > > > > +/* LINERESET TIME OUT */ > > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ > > + > > /* Task management command timeout */ > > #define TM_CMD_TIMEOUT 100 /* msecs */ > > > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct > > *work) > > * check if power mode restore is needed. > > */ > > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { > > + ktime_t start = ktime_get(); > > I don't see the connection btw line-reset and following tmf cmd. > My point is that line-reset is not the only non-fatal error which > leads us to the following tmf cmd. So the wait should be outside > of this check - just put it right before clearing outstanding reqs. Ok. Let me move it in v4. > > Thanks, > Can Guo. > > > + > > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; > > if (!hba->saved_uic_err) > > hba->saved_err &= ~UIC_ERROR; > > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct > > *work) > > if (ufshcd_is_pwr_mode_restore_needed(hba)) > > needs_restore = true; > > spin_lock_irqsave(hba->host->host_lock, flags); > > + /* Wait for IO completion to avoid aborting IOs */ > > + while (hba->outstanding_reqs) { > > + ufshcd_complete_requests(hba); > > + spin_unlock_irqrestore(hba->host->host_lock, flags); > > + schedule(); > > + spin_lock_irqsave(hba->host->host_lock, flags); > > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > > > + LINERESET_IO_TIMEOUT_MS) { > > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", > > + __func__, hba->outstanding_reqs); > > + break; > > + } > > + } > > + > > if (!hba->saved_err && !needs_restore) > > goto skip_err_handling; > > } > > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void > > *__hba) > > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > > } > > > > - if (enabled_intr_status && retval == IRQ_NONE) { > > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > > - __func__, intr_status); > > + if (enabled_intr_status && retval == IRQ_NONE && > > + !ufshcd_eh_in_progress(hba)) { > > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > > 0x%08x)\n", > > + __func__, > > + intr_status, > > + hba->ufs_stats.last_intr_status, > > + enabled_intr_status); > > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > > } > > > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > > *hba, > > * Even though we use wait_event() which sleeps indefinitely, > > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > > */ > > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | > > + BLK_MQ_REQ_NOWAIT); > > + if (IS_ERR(req)) > > + return PTR_ERR(req); > > + > > req->end_io_data = &wait; > > free_slot = req->tag; > > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); > > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void > > __iomem *mmio_base, unsigned int irq) > > > > hba->tmf_tag_set = (struct blk_mq_tag_set) { > > .nr_hw_queues = 1, > > + .reserved_tags = 1, > > .queue_depth = hba->nutmrs, > > .ops = &ufshcd_tmf_ops, > > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-06 21:41 ` [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd Jaegeuk Kim 2021-01-07 6:07 ` Can Guo @ 2021-01-07 6:38 ` Can Guo 2021-01-07 6:51 ` Jaegeuk Kim 1 sibling, 1 reply; 14+ messages in thread From: Can Guo @ 2021-01-07 6:38 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu, Jaegeuk Kim On 2021-01-07 05:41, Jaegeuk Kim wrote: > From: Jaegeuk Kim <jaegeuk@google.com> > > This fixes a warning caused by wrong reserve tag usage in > __ufshcd_issue_tm_cmd. > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 > blk_get_request+0x68/0x70 > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > blk_mq_get_tag+0x438/0x46c > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before > aborting > outstanding commands by waiting a bit for IO completion like this. > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > Would you mind add a Fixes tag? > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > --- > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- > 1 file changed, 32 insertions(+), 4 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 1678cec08b51..47fc8da3cbf9 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -44,6 +44,9 @@ > /* Query request timeout */ > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ > > +/* LINERESET TIME OUT */ > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ > + > /* Task management command timeout */ > #define TM_CMD_TIMEOUT 100 /* msecs */ > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct > *work) > * check if power mode restore is needed. > */ > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { > + ktime_t start = ktime_get(); > + > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; > if (!hba->saved_uic_err) > hba->saved_err &= ~UIC_ERROR; > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct > work_struct *work) > if (ufshcd_is_pwr_mode_restore_needed(hba)) > needs_restore = true; > spin_lock_irqsave(hba->host->host_lock, flags); > + /* Wait for IO completion to avoid aborting IOs */ > + while (hba->outstanding_reqs) { > + ufshcd_complete_requests(hba); > + spin_unlock_irqrestore(hba->host->host_lock, flags); > + schedule(); > + spin_lock_irqsave(hba->host->host_lock, flags); > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > > + LINERESET_IO_TIMEOUT_MS) { > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", > + __func__, hba->outstanding_reqs); > + break; > + } > + } > + > if (!hba->saved_err && !needs_restore) > goto skip_err_handling; > } > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void > *__hba) > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > } > > - if (enabled_intr_status && retval == IRQ_NONE) { > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > - __func__, intr_status); > + if (enabled_intr_status && retval == IRQ_NONE && > + !ufshcd_eh_in_progress(hba)) { > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > 0x%08x)\n", > + __func__, > + intr_status, > + hba->ufs_stats.last_intr_status, > + enabled_intr_status); > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > } > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > *hba, > * Even though we use wait_event() which sleeps indefinitely, > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > */ > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | > + BLK_MQ_REQ_NOWAIT); Sorry that I didn't pay much attention to this part of code before. May I know why must we use the BLK_MQ_REQ_RESERVED flag? Thanks, Can Guo. > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > req->end_io_data = &wait; > free_slot = req->tag; > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void > __iomem *mmio_base, unsigned int irq) > > hba->tmf_tag_set = (struct blk_mq_tag_set) { > .nr_hw_queues = 1, > + .reserved_tags = 1, If we give reserved_tags as 1 and always ask for a tm requst with BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated from the reserved sbitmap_queue, whose depth is set to 1 here. UFS supports tm queue depth as 8, but here is allowing only one tm req at a time. Why? Please correct me if my understanding is wrong. Thanks, Can Guo. > .queue_depth = hba->nutmrs, > .ops = &ufshcd_tmf_ops, > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-07 6:38 ` Can Guo @ 2021-01-07 6:51 ` Jaegeuk Kim 2021-01-07 7:03 ` Can Guo 0 siblings, 1 reply; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-07 6:51 UTC (permalink / raw) To: Can Guo Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 01/07, Can Guo wrote: > On 2021-01-07 05:41, Jaegeuk Kim wrote: > > From: Jaegeuk Kim <jaegeuk@google.com> > > > > This fixes a warning caused by wrong reserve tag usage in > > __ufshcd_issue_tm_cmd. > > > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 > > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > > blk_mq_get_tag+0x438/0x46c > > > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before > > aborting > > outstanding commands by waiting a bit for IO completion like this. > > > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > > > > Would you mind add a Fixes tag? Ok. > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > --- > > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- > > 1 file changed, 32 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > index 1678cec08b51..47fc8da3cbf9 100644 > > --- a/drivers/scsi/ufs/ufshcd.c > > +++ b/drivers/scsi/ufs/ufshcd.c > > @@ -44,6 +44,9 @@ > > /* Query request timeout */ > > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ > > > > +/* LINERESET TIME OUT */ > > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ > > + > > /* Task management command timeout */ > > #define TM_CMD_TIMEOUT 100 /* msecs */ > > > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct > > *work) > > * check if power mode restore is needed. > > */ > > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { > > + ktime_t start = ktime_get(); > > + > > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; > > if (!hba->saved_uic_err) > > hba->saved_err &= ~UIC_ERROR; > > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct > > *work) > > if (ufshcd_is_pwr_mode_restore_needed(hba)) > > needs_restore = true; > > spin_lock_irqsave(hba->host->host_lock, flags); > > + /* Wait for IO completion to avoid aborting IOs */ > > + while (hba->outstanding_reqs) { > > + ufshcd_complete_requests(hba); > > + spin_unlock_irqrestore(hba->host->host_lock, flags); > > + schedule(); > > + spin_lock_irqsave(hba->host->host_lock, flags); > > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > > > + LINERESET_IO_TIMEOUT_MS) { > > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", > > + __func__, hba->outstanding_reqs); > > + break; > > + } > > + } > > + > > if (!hba->saved_err && !needs_restore) > > goto skip_err_handling; > > } > > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void > > *__hba) > > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > > } > > > > - if (enabled_intr_status && retval == IRQ_NONE) { > > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > > - __func__, intr_status); > > + if (enabled_intr_status && retval == IRQ_NONE && > > + !ufshcd_eh_in_progress(hba)) { > > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > > 0x%08x)\n", > > + __func__, > > + intr_status, > > + hba->ufs_stats.last_intr_status, > > + enabled_intr_status); > > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > > } > > > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > > *hba, > > * Even though we use wait_event() which sleeps indefinitely, > > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > > */ > > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | > > + BLK_MQ_REQ_NOWAIT); > > Sorry that I didn't pay much attention to this part of code before. > May I know why must we use the BLK_MQ_REQ_RESERVED flag? What I understood is the reserved tag is used when aborting outstanding IOs when all the 32 tags were used. > > Thanks, > Can Guo. > > > + if (IS_ERR(req)) > > + return PTR_ERR(req); > > + > > req->end_io_data = &wait; > > free_slot = req->tag; > > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); > > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void > > __iomem *mmio_base, unsigned int irq) > > > > hba->tmf_tag_set = (struct blk_mq_tag_set) { > > .nr_hw_queues = 1, > > + .reserved_tags = 1, > > If we give reserved_tags as 1 and always ask for a tm requst with > BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated > from the reserved sbitmap_queue, whose depth is set to 1 here. > UFS supports tm queue depth as 8, but here is allowing only one tm > req at a time. Why? Please correct me if my understanding is wrong. I couldn't find tm can be issued in parallel, so thought it was issued one at a time. If we set 8, then we can use 24 for IOs, IIUC. Please correct me as well. I'm still trying to understand the flow. > > Thanks, > Can Guo. > > > .queue_depth = hba->nutmrs, > > .ops = &ufshcd_tmf_ops, > > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-07 6:51 ` Jaegeuk Kim @ 2021-01-07 7:03 ` Can Guo 2021-01-07 7:20 ` Can Guo 2021-01-07 7:23 ` Jaegeuk Kim 0 siblings, 2 replies; 14+ messages in thread From: Can Guo @ 2021-01-07 7:03 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 2021-01-07 14:51, Jaegeuk Kim wrote: > On 01/07, Can Guo wrote: >> On 2021-01-07 05:41, Jaegeuk Kim wrote: >> > From: Jaegeuk Kim <jaegeuk@google.com> >> > >> > This fixes a warning caused by wrong reserve tag usage in >> > __ufshcd_issue_tm_cmd. >> > >> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 >> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 >> > blk_mq_get_tag+0x438/0x46c >> > >> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before >> > aborting >> > outstanding commands by waiting a bit for IO completion like this. >> > >> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out >> > >> >> Would you mind add a Fixes tag? > > Ok. > >> >> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >> > --- >> > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- >> > 1 file changed, 32 insertions(+), 4 deletions(-) >> > >> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> > index 1678cec08b51..47fc8da3cbf9 100644 >> > --- a/drivers/scsi/ufs/ufshcd.c >> > +++ b/drivers/scsi/ufs/ufshcd.c >> > @@ -44,6 +44,9 @@ >> > /* Query request timeout */ >> > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ >> > >> > +/* LINERESET TIME OUT */ >> > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ >> > + >> > /* Task management command timeout */ >> > #define TM_CMD_TIMEOUT 100 /* msecs */ >> > >> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct >> > *work) >> > * check if power mode restore is needed. >> > */ >> > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { >> > + ktime_t start = ktime_get(); >> > + >> > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; >> > if (!hba->saved_uic_err) >> > hba->saved_err &= ~UIC_ERROR; >> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct >> > *work) >> > if (ufshcd_is_pwr_mode_restore_needed(hba)) >> > needs_restore = true; >> > spin_lock_irqsave(hba->host->host_lock, flags); >> > + /* Wait for IO completion to avoid aborting IOs */ >> > + while (hba->outstanding_reqs) { >> > + ufshcd_complete_requests(hba); >> > + spin_unlock_irqrestore(hba->host->host_lock, flags); >> > + schedule(); >> > + spin_lock_irqsave(hba->host->host_lock, flags); >> > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > >> > + LINERESET_IO_TIMEOUT_MS) { >> > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", >> > + __func__, hba->outstanding_reqs); >> > + break; >> > + } >> > + } >> > + >> > if (!hba->saved_err && !needs_restore) >> > goto skip_err_handling; >> > } >> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void >> > *__hba) >> > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); >> > } >> > >> > - if (enabled_intr_status && retval == IRQ_NONE) { >> > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", >> > - __func__, intr_status); >> > + if (enabled_intr_status && retval == IRQ_NONE && >> > + !ufshcd_eh_in_progress(hba)) { >> > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, >> > 0x%08x)\n", >> > + __func__, >> > + intr_status, >> > + hba->ufs_stats.last_intr_status, >> > + enabled_intr_status); >> > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); >> > } >> > >> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba >> > *hba, >> > * Even though we use wait_event() which sleeps indefinitely, >> > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. >> > */ >> > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); >> > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | >> > + BLK_MQ_REQ_NOWAIT); >> >> Sorry that I didn't pay much attention to this part of code before. >> May I know why must we use the BLK_MQ_REQ_RESERVED flag? > > What I understood is the reserved tag is used when aborting outstanding > IOs when all the 32 tags were used. > No, the tm requests and I/O requests are on two different tag sets: tm requests come from hba->tmf_tag_set, while I/O requests come from hba->shost->tag_set. Meaning they don't share tags with each other. >> >> Thanks, >> Can Guo. >> >> > + if (IS_ERR(req)) >> > + return PTR_ERR(req); >> > + >> > req->end_io_data = &wait; >> > free_slot = req->tag; >> > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); >> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void >> > __iomem *mmio_base, unsigned int irq) >> > >> > hba->tmf_tag_set = (struct blk_mq_tag_set) { >> > .nr_hw_queues = 1, >> > + .reserved_tags = 1, >> >> If we give reserved_tags as 1 and always ask for a tm requst with >> BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated >> from the reserved sbitmap_queue, whose depth is set to 1 here. >> UFS supports tm queue depth as 8, but here is allowing only one tm >> req at a time. Why? Please correct me if my understanding is wrong. > > I couldn't find tm can be issued in parallel, so thought it was issued > one at a time. If we set 8, then we can use 24 for IOs, IIUC. > > Please correct me as well. I'm still trying to understand the flow. > UFS allows a queue depth as 8, which means it support sending multiple tm requests at the same time. You can check commit 69a6c269c097d780a2 - before this change, we used to use below func to allocate tags for tm reqs, which can tell you the true story. So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED flag? Removing it can also fix the warning I suppose. What do you think? -static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int *free_slot) -{ - int tag; - bool ret = false; - - if (!free_slot) - goto out; - - do { - tag = find_first_zero_bit(&hba->tm_slots_in_use, hba->nutmrs); - if (tag >= hba->nutmrs) - goto out; - } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use)); - - *free_slot = tag; - ret = true; -out: - return ret; -} Thanks, Can Guo. >> >> Thanks, >> Can Guo. >> >> > .queue_depth = hba->nutmrs, >> > .ops = &ufshcd_tmf_ops, >> > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-07 7:03 ` Can Guo @ 2021-01-07 7:20 ` Can Guo 2021-01-07 7:23 ` Jaegeuk Kim 1 sibling, 0 replies; 14+ messages in thread From: Can Guo @ 2021-01-07 7:20 UTC (permalink / raw) To: Jaegeuk Kim Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 2021-01-07 15:03, Can Guo wrote: > On 2021-01-07 14:51, Jaegeuk Kim wrote: >> On 01/07, Can Guo wrote: >>> On 2021-01-07 05:41, Jaegeuk Kim wrote: >>> > From: Jaegeuk Kim <jaegeuk@google.com> >>> > >>> > This fixes a warning caused by wrong reserve tag usage in >>> > __ufshcd_issue_tm_cmd. >>> > >>> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 >>> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 >>> > blk_mq_get_tag+0x438/0x46c >>> > >>> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before >>> > aborting >>> > outstanding commands by waiting a bit for IO completion like this. >>> > >>> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out >>> > >>> >>> Would you mind add a Fixes tag? >> >> Ok. >> >>> >>> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>> > --- >>> > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- >>> > 1 file changed, 32 insertions(+), 4 deletions(-) >>> > >>> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>> > index 1678cec08b51..47fc8da3cbf9 100644 >>> > --- a/drivers/scsi/ufs/ufshcd.c >>> > +++ b/drivers/scsi/ufs/ufshcd.c >>> > @@ -44,6 +44,9 @@ >>> > /* Query request timeout */ >>> > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ >>> > >>> > +/* LINERESET TIME OUT */ >>> > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ >>> > + >>> > /* Task management command timeout */ >>> > #define TM_CMD_TIMEOUT 100 /* msecs */ >>> > >>> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct >>> > *work) >>> > * check if power mode restore is needed. >>> > */ >>> > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { >>> > + ktime_t start = ktime_get(); >>> > + >>> > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; >>> > if (!hba->saved_uic_err) >>> > hba->saved_err &= ~UIC_ERROR; >>> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct >>> > *work) >>> > if (ufshcd_is_pwr_mode_restore_needed(hba)) >>> > needs_restore = true; >>> > spin_lock_irqsave(hba->host->host_lock, flags); >>> > + /* Wait for IO completion to avoid aborting IOs */ >>> > + while (hba->outstanding_reqs) { >>> > + ufshcd_complete_requests(hba); >>> > + spin_unlock_irqrestore(hba->host->host_lock, flags); >>> > + schedule(); >>> > + spin_lock_irqsave(hba->host->host_lock, flags); >>> > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > >>> > + LINERESET_IO_TIMEOUT_MS) { >>> > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", >>> > + __func__, hba->outstanding_reqs); >>> > + break; >>> > + } >>> > + } >>> > + >>> > if (!hba->saved_err && !needs_restore) >>> > goto skip_err_handling; >>> > } >>> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void >>> > *__hba) >>> > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); >>> > } >>> > >>> > - if (enabled_intr_status && retval == IRQ_NONE) { >>> > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", >>> > - __func__, intr_status); >>> > + if (enabled_intr_status && retval == IRQ_NONE && >>> > + !ufshcd_eh_in_progress(hba)) { >>> > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, >>> > 0x%08x)\n", >>> > + __func__, >>> > + intr_status, >>> > + hba->ufs_stats.last_intr_status, >>> > + enabled_intr_status); >>> > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); >>> > } >>> > >>> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba >>> > *hba, >>> > * Even though we use wait_event() which sleeps indefinitely, >>> > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. >>> > */ >>> > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); >>> > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | >>> > + BLK_MQ_REQ_NOWAIT); >>> >>> Sorry that I didn't pay much attention to this part of code before. >>> May I know why must we use the BLK_MQ_REQ_RESERVED flag? >> >> What I understood is the reserved tag is used when aborting >> outstanding >> IOs when all the 32 tags were used. >> > > No, the tm requests and I/O requests are on two different tag sets: > tm requests come from hba->tmf_tag_set, while I/O requests come from > hba->shost->tag_set. Meaning they don't share tags with each other. Add they are issued on two different HW queues - one for tm reqs, one for I/O reqs, which is why two different tag sets are created. > >>> >>> Thanks, >>> Can Guo. >>> >>> > + if (IS_ERR(req)) >>> > + return PTR_ERR(req); >>> > + >>> > req->end_io_data = &wait; >>> > free_slot = req->tag; >>> > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); >>> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void >>> > __iomem *mmio_base, unsigned int irq) >>> > >>> > hba->tmf_tag_set = (struct blk_mq_tag_set) { >>> > .nr_hw_queues = 1, >>> > + .reserved_tags = 1, >>> >>> If we give reserved_tags as 1 and always ask for a tm requst with >>> BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated >>> from the reserved sbitmap_queue, whose depth is set to 1 here. >>> UFS supports tm queue depth as 8, but here is allowing only one tm >>> req at a time. Why? Please correct me if my understanding is wrong. >> >> I couldn't find tm can be issued in parallel, so thought it was issued >> one at a time. If we set 8, then we can use 24 for IOs, IIUC. >> >> Please correct me as well. I'm still trying to understand the flow. >> > > UFS allows a queue depth as 8, which means it support sending multiple > tm requests at the same time. You can check commit 69a6c269c097d780a2 - > before this change, we used to use below func to allocate tags for > tm reqs, which can tell you the true story. > > So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED > flag? > Removing it can also fix the warning I suppose. What do you think? > > -static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int > *free_slot) > -{ > - int tag; > - bool ret = false; > - > - if (!free_slot) > - goto out; > - > - do { > - tag = find_first_zero_bit(&hba->tm_slots_in_use, > hba->nutmrs); > - if (tag >= hba->nutmrs) > - goto out; > - } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use)); > - > - *free_slot = tag; > - ret = true; > -out: > - return ret; > -} > > Thanks, > Can Guo. > >>> >>> Thanks, >>> Can Guo. >>> >>> > .queue_depth = hba->nutmrs, >>> > .ops = &ufshcd_tmf_ops, >>> > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd 2021-01-07 7:03 ` Can Guo 2021-01-07 7:20 ` Can Guo @ 2021-01-07 7:23 ` Jaegeuk Kim 1 sibling, 0 replies; 14+ messages in thread From: Jaegeuk Kim @ 2021-01-07 7:23 UTC (permalink / raw) To: Can Guo Cc: linux-kernel, linux-scsi, kernel-team, alim.akhtar, avri.altman, bvanassche, martin.petersen, stanley.chu On 01/07, Can Guo wrote: > On 2021-01-07 14:51, Jaegeuk Kim wrote: > > On 01/07, Can Guo wrote: > > > On 2021-01-07 05:41, Jaegeuk Kim wrote: > > > > From: Jaegeuk Kim <jaegeuk@google.com> > > > > > > > > This fixes a warning caused by wrong reserve tag usage in > > > > __ufshcd_issue_tm_cmd. > > > > > > > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 > > > > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 > > > > blk_mq_get_tag+0x438/0x46c > > > > > > > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before > > > > aborting > > > > outstanding commands by waiting a bit for IO completion like this. > > > > > > > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out > > > > > > > > > > Would you mind add a Fixes tag? > > > > Ok. > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > --- > > > > drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++---- > > > > 1 file changed, 32 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > > > index 1678cec08b51..47fc8da3cbf9 100644 > > > > --- a/drivers/scsi/ufs/ufshcd.c > > > > +++ b/drivers/scsi/ufs/ufshcd.c > > > > @@ -44,6 +44,9 @@ > > > > /* Query request timeout */ > > > > #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */ > > > > > > > > +/* LINERESET TIME OUT */ > > > > +#define LINERESET_IO_TIMEOUT_MS (30000) /* 30 sec */ > > > > + > > > > /* Task management command timeout */ > > > > #define TM_CMD_TIMEOUT 100 /* msecs */ > > > > > > > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct > > > > *work) > > > > * check if power mode restore is needed. > > > > */ > > > > if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) { > > > > + ktime_t start = ktime_get(); > > > > + > > > > hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR; > > > > if (!hba->saved_uic_err) > > > > hba->saved_err &= ~UIC_ERROR; > > > > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct > > > > *work) > > > > if (ufshcd_is_pwr_mode_restore_needed(hba)) > > > > needs_restore = true; > > > > spin_lock_irqsave(hba->host->host_lock, flags); > > > > + /* Wait for IO completion to avoid aborting IOs */ > > > > + while (hba->outstanding_reqs) { > > > > + ufshcd_complete_requests(hba); > > > > + spin_unlock_irqrestore(hba->host->host_lock, flags); > > > > + schedule(); > > > > + spin_lock_irqsave(hba->host->host_lock, flags); > > > > + if (ktime_to_ms(ktime_sub(ktime_get(), start)) > > > > > + LINERESET_IO_TIMEOUT_MS) { > > > > + dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n", > > > > + __func__, hba->outstanding_reqs); > > > > + break; > > > > + } > > > > + } > > > > + > > > > if (!hba->saved_err && !needs_restore) > > > > goto skip_err_handling; > > > > } > > > > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void > > > > *__hba) > > > > intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS); > > > > } > > > > > > > > - if (enabled_intr_status && retval == IRQ_NONE) { > > > > - dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n", > > > > - __func__, intr_status); > > > > + if (enabled_intr_status && retval == IRQ_NONE && > > > > + !ufshcd_eh_in_progress(hba)) { > > > > + dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, > > > > 0x%08x)\n", > > > > + __func__, > > > > + intr_status, > > > > + hba->ufs_stats.last_intr_status, > > > > + enabled_intr_status); > > > > ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: "); > > > > } > > > > > > > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba > > > > *hba, > > > > * Even though we use wait_event() which sleeps indefinitely, > > > > * the maximum wait time is bounded by %TM_CMD_TIMEOUT. > > > > */ > > > > - req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); > > > > + req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED | > > > > + BLK_MQ_REQ_NOWAIT); > > > > > > Sorry that I didn't pay much attention to this part of code before. > > > May I know why must we use the BLK_MQ_REQ_RESERVED flag? > > > > What I understood is the reserved tag is used when aborting outstanding > > IOs when all the 32 tags were used. > > > > No, the tm requests and I/O requests are on two different tag sets: > tm requests come from hba->tmf_tag_set, while I/O requests come from > hba->shost->tag_set. Meaning they don't share tags with each other. I see. :) > > > > > > > Thanks, > > > Can Guo. > > > > > > > + if (IS_ERR(req)) > > > > + return PTR_ERR(req); > > > > + > > > > req->end_io_data = &wait; > > > > free_slot = req->tag; > > > > WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs); > > > > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void > > > > __iomem *mmio_base, unsigned int irq) > > > > > > > > hba->tmf_tag_set = (struct blk_mq_tag_set) { > > > > .nr_hw_queues = 1, > > > > + .reserved_tags = 1, > > > > > > If we give reserved_tags as 1 and always ask for a tm requst with > > > BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated > > > from the reserved sbitmap_queue, whose depth is set to 1 here. > > > UFS supports tm queue depth as 8, but here is allowing only one tm > > > req at a time. Why? Please correct me if my understanding is wrong. > > > > I couldn't find tm can be issued in parallel, so thought it was issued > > one at a time. If we set 8, then we can use 24 for IOs, IIUC. > > > > Please correct me as well. I'm still trying to understand the flow. > > > > UFS allows a queue depth as 8, which means it support sending multiple > tm requests at the same time. You can check commit 69a6c269c097d780a2 - > before this change, we used to use below func to allocate tags for > tm reqs, which can tell you the true story. > > So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED flag? > Removing it can also fix the warning I suppose. What do you think? Yeah, I believe it won't give a warning. Okay, let me check it out. > > -static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int *free_slot) > -{ > - int tag; > - bool ret = false; > - > - if (!free_slot) > - goto out; > - > - do { > - tag = find_first_zero_bit(&hba->tm_slots_in_use, > hba->nutmrs); > - if (tag >= hba->nutmrs) > - goto out; > - } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use)); > - > - *free_slot = tag; > - ret = true; > -out: > - return ret; > -} > > Thanks, > Can Guo. > > > > > > > Thanks, > > > Can Guo. > > > > > > > .queue_depth = hba->nutmrs, > > > > .ops = &ufshcd_tmf_ops, > > > > .flags = BLK_MQ_F_NO_SCHED, ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2021-01-07 7:41 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-01-06 21:41 [PATH v3 0/2] Two UFS bug fixes Jaegeuk Kim 2021-01-06 21:41 ` [PATCH v3 1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns Jaegeuk Kim 2021-01-07 6:10 ` Can Guo 2021-01-07 6:57 ` Jaegeuk Kim 2021-01-07 7:09 ` Can Guo 2021-01-07 7:40 ` Jaegeuk Kim 2021-01-06 21:41 ` [PATCH v3 2/2] scsi: ufs: handle LINERESET with correct tm_cmd Jaegeuk Kim 2021-01-07 6:07 ` Can Guo 2021-01-07 6:51 ` Jaegeuk Kim 2021-01-07 6:38 ` Can Guo 2021-01-07 6:51 ` Jaegeuk Kim 2021-01-07 7:03 ` Can Guo 2021-01-07 7:20 ` Can Guo 2021-01-07 7:23 ` Jaegeuk Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox