* [PATCH v8 0/3] fix abort defect
@ 2024-09-23 8:03 peter.wang
2024-09-23 8:03 ` [PATCH v8 1/3] ufs: core: fix the issue of ICU failure peter.wang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: peter.wang @ 2024-09-23 8:03 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb
From: Peter Wang <peter.wang@mediatek.com>
V8:
- Remove the abort variable to simplify the abort process.
- Correct error handler successfully aborts release flow.
- Ingore MCQ OCS: ABORTED.
V7:
- Use a variable instead of a flag.
- Add a check for MCQ mode when setting this variable to UFS_ERR_HANDLER.
- Print OCS information for OCS_ABORTED and OCS_INVALID_COMMAND_STATUS.
- Add a MediaTek quirk for handling OCS_ABORTED in SDB mode.
- Skip notifying SCSI from ISR during SCSI abort (ufshcd_abort()).
V6:
- Add err handler check before set flag true.
V5:
- Change flag name.
- Amend comment and patch description.
V4:
- Remove nullify SQ entry abort requeue.
- Add more comment for flag usage and set description.
- Fix build warning.
V3:
- Change comment and use variable(rtc) for error print
- Change flag name and move flag set before ufshcd_clear_cmd
- Add SDB mode clear UTRLCLR tag receive OCS_ABORTED requeue
V2:
- Fix mcq_enabled build error.
Peter Wang (3):
ufs: core: fix the issue of ICU failure
ufs: core: fix error handler process for MCQ abort
ufs: core: add a quirk for MediaTek SDB mode aborted
drivers/ufs/core/ufs-mcq.c | 15 ++++++++-------
drivers/ufs/core/ufshcd.c | 28 ++++++++++++++++++++++++++--
drivers/ufs/host/ufs-mediatek.c | 1 +
include/ufs/ufshcd.h | 6 ++++++
4 files changed, 41 insertions(+), 9 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v8 1/3] ufs: core: fix the issue of ICU failure
2024-09-23 8:03 [PATCH v8 0/3] fix abort defect peter.wang
@ 2024-09-23 8:03 ` peter.wang
2024-09-23 8:03 ` [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
2024-09-23 8:03 ` [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
2 siblings, 0 replies; 10+ messages in thread
From: peter.wang @ 2024-09-23 8:03 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb, stable
From: Peter Wang <peter.wang@mediatek.com>
When setting the ICU bit without using read-modify-write,
SQRTCy will restart SQ again and receive an RTC return
error code 2 (Failure - SQ not stopped).
Additionally, the error log has been modified so that
this type of error can be observed.
Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufs-mcq.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 5891cdacd0b3..3903947dbed1 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -539,7 +539,7 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
struct scsi_cmnd *cmd = lrbp->cmd;
struct ufs_hw_queue *hwq;
void __iomem *reg, *opr_sqd_base;
- u32 nexus, id, val;
+ u32 nexus, id, val, rtc;
int err;
if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
@@ -569,17 +569,18 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
opr_sqd_base = mcq_opr_base(hba, OPR_SQD, id);
writel(nexus, opr_sqd_base + REG_SQCTI);
- /* SQRTCy.ICU = 1 */
- writel(SQ_ICU, opr_sqd_base + REG_SQRTC);
+ /* Initiate Cleanup */
+ writel(readl(opr_sqd_base + REG_SQRTC) | SQ_ICU,
+ opr_sqd_base + REG_SQRTC);
/* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */
reg = opr_sqd_base + REG_SQRTS;
err = read_poll_timeout(readl, val, val & SQ_CUS, 20,
MCQ_POLL_US, false, reg);
- if (err)
- dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%ld\n",
- __func__, id, task_tag,
- FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg)));
+ rtc = FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg));
+ if (err || rtc)
+ dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%d RTC=%d\n",
+ __func__, id, task_tag, err, rtc);
if (ufshcd_mcq_sq_start(hba, hwq))
err = -ETIMEDOUT;
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-23 8:03 [PATCH v8 0/3] fix abort defect peter.wang
2024-09-23 8:03 ` [PATCH v8 1/3] ufs: core: fix the issue of ICU failure peter.wang
@ 2024-09-23 8:03 ` peter.wang
2024-09-23 18:19 ` Bart Van Assche
2024-09-23 8:03 ` [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
2 siblings, 1 reply; 10+ messages in thread
From: peter.wang @ 2024-09-23 8:03 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb
From: Peter Wang <peter.wang@mediatek.com>
When the error handler successfully aborts a MCQ request,
it only releases the command and does not notify the SCSI layer.
This may cause another abort after 30 seconds timeout.
This patch notifies the SCSI layer to requeue the request.
Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode
SQ cleanup. This makes the behavior of MCQ mode consistent with
that of legacy SDB mode.
Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS
for debugging purposes.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index a6f818cdef0e..b5c7bc50a27e 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
break;
case OCS_ABORTED:
result |= DID_ABORT << 16;
+ dev_warn(hba->dev,
+ "OCS aborted from controller = %x for tag %d\n",
+ ocs, lrbp->task_tag);
break;
case OCS_INVALID_COMMAND_STATUS:
result |= DID_REQUEUE << 16;
+ dev_warn(hba->dev,
+ "OCS invaild from controller = %x for tag %d\n",
+ ocs, lrbp->task_tag);
break;
case OCS_INVALID_CMD_TABLE_ATTR:
case OCS_INVALID_PRDT_ATTR:
@@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag,
ufshcd_update_monitor(hba, lrbp);
ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP);
cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe);
+
+ /*
+ * Ignore MCQ OCS: ABORTED posted by the host controller.
+ * This makes the behavior of MCQ mode consistent with that
+ * of legacy SDB mode.
+ */
+ if (hba->mcq_enabled) {
+ ocs = ufshcd_get_tr_ocs(lrbp, cqe);
+ if (ocs == OCS_ABORTED)
+ return;
+ }
+
ufshcd_release_scsi_cmd(hba, lrbp);
/* Do not touch lrbp after scsi done */
scsi_done(cmd);
@@ -6486,8 +6504,11 @@ static bool ufshcd_abort_one(struct request *rq, void *priv)
if (!hwq)
return 0;
spin_lock_irqsave(&hwq->cq_lock, flags);
- if (ufshcd_cmd_inflight(lrbp->cmd))
+ if (ufshcd_cmd_inflight(lrbp->cmd)) {
+ set_host_byte(lrbp->cmd, DID_REQUEUE);
ufshcd_release_scsi_cmd(hba, lrbp);
+ scsi_done(lrbp->cmd);
+ }
spin_unlock_irqrestore(&hwq->cq_lock, flags);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-23 8:03 [PATCH v8 0/3] fix abort defect peter.wang
2024-09-23 8:03 ` [PATCH v8 1/3] ufs: core: fix the issue of ICU failure peter.wang
2024-09-23 8:03 ` [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
@ 2024-09-23 8:03 ` peter.wang
2024-09-23 18:15 ` Bart Van Assche
2 siblings, 1 reply; 10+ messages in thread
From: peter.wang @ 2024-09-23 8:03 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb
From: Peter Wang <peter.wang@mediatek.com>
Because the MediaTek UFS controller uses UTRLCLR to clear commands
and fills the OCS with ABORTED, this patch introduces a quirk to
treat ABORTED as INVALID_OCS_VALUE.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 5 ++++-
drivers/ufs/host/ufs-mediatek.c | 1 +
include/ufs/ufshcd.h | 6 ++++++
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index b5c7bc50a27e..b42079c3d634 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5404,7 +5404,10 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
}
break;
case OCS_ABORTED:
- result |= DID_ABORT << 16;
+ if (hba->quirks & UFSHCD_QUIRK_OCS_ABORTED)
+ result |= DID_REQUEUE << 16;
+ else
+ result |= DID_ABORT << 16;
dev_warn(hba->dev,
"OCS aborted from controller = %x for tag %d\n",
ocs, lrbp->task_tag);
diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c
index 02c9064284e1..8a4c1b8f5a26 100644
--- a/drivers/ufs/host/ufs-mediatek.c
+++ b/drivers/ufs/host/ufs-mediatek.c
@@ -1021,6 +1021,7 @@ static int ufs_mtk_init(struct ufs_hba *hba)
hba->quirks |= UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL;
hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_INTR;
hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_RTC;
+ hba->quirks |= UFSHCD_QUIRK_OCS_ABORTED;
hba->vps->wb_flush_threshold = UFS_WB_BUF_REMAIN_PERCENT(80);
if (host->caps & UFS_MTK_CAP_DISABLE_AH8)
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 0fd2aebac728..8f156803d703 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -684,6 +684,12 @@ enum ufshcd_quirks {
* single doorbell mode.
*/
UFSHCD_QUIRK_BROKEN_LSDBS_CAP = 1 << 25,
+
+ /*
+ * Some host controllers set OCS_ABORTED after UTRLCLR (SDB mode),
+ * this quirk is set to treat OCS: ABORTED as INVALID_OCS_VALUE
+ */
+ UFSHCD_QUIRK_OCS_ABORTED = 1 << 26,
};
enum ufshcd_caps {
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-23 8:03 ` [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
@ 2024-09-23 18:15 ` Bart Van Assche
2024-09-24 8:51 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-23 18:15 UTC (permalink / raw)
To: peter.wang, linux-scsi, martin.petersen, avri.altman, alim.akhtar,
jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui,
tun-yu.yu, eddie.huang, naomi.chu, ed.tsai, quic_nguyenb
On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote:
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index b5c7bc50a27e..b42079c3d634 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -5404,7 +5404,10 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
> }
> break;
> case OCS_ABORTED:
> - result |= DID_ABORT << 16;
> + if (hba->quirks & UFSHCD_QUIRK_OCS_ABORTED)
> + result |= DID_REQUEUE << 16;
> + else
> + result |= DID_ABORT << 16;
> dev_warn(hba->dev,
> "OCS aborted from controller = %x for tag %d\n",
> ocs, lrbp->task_tag);
I think the approach of this patch is racy: the cmd->result assignment
by ufshcd_transfer_rsp_status() races with the cmd->result assignment by
ufshcd_abort_one(). How about addressing this race as follows?
* In ufshcd_compl_one_cqe(), if the OCS_ABORTED status is encountered,
set a completion.
* In the code that aborts SCSI commands, for MediaTek controllers only,
wait for that completion (based on a quirk).
* Instead of introducing an if-statement in
ufshcd_transfer_rsp_status(), rely on the cmd->result value assigned
by ufshcd_abort_one().
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-23 8:03 ` [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
@ 2024-09-23 18:19 ` Bart Van Assche
2024-09-24 8:48 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-23 18:19 UTC (permalink / raw)
To: peter.wang, linux-scsi, martin.petersen, avri.altman, alim.akhtar,
jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui,
tun-yu.yu, eddie.huang, naomi.chu, ed.tsai, quic_nguyenb
On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote:
> When the error handler successfully aborts a MCQ request,
> it only releases the command and does not notify the SCSI layer.
> This may cause another abort after 30 seconds timeout.
> This patch notifies the SCSI layer to requeue the request.
>
> Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode
> SQ cleanup. This makes the behavior of MCQ mode consistent with
> that of legacy SDB mode.
>
> Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS
> for debugging purposes.
Although I like the approach of this patch, two comments below.
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index a6f818cdef0e..b5c7bc50a27e 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
> break;
> case OCS_ABORTED:
> result |= DID_ABORT << 16;
> + dev_warn(hba->dev,
> + "OCS aborted from controller = %x for tag %d\n",
> + ocs, lrbp->task_tag);
> break;
Including the OCS status in this message seems redundant to me.
> case OCS_INVALID_COMMAND_STATUS:
> result |= DID_REQUEUE << 16;
> + dev_warn(hba->dev,
> + "OCS invaild from controller = %x for tag %d\n",
> + ocs, lrbp->task_tag);
Also here, including the OCS status in this message seems redundant to me.
Please change "invaild" into "invalid".
> @@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag,
> ufshcd_update_monitor(hba, lrbp);
> ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP);
> cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe);
> +
> + /*
> + * Ignore MCQ OCS: ABORTED posted by the host controller.
> + * This makes the behavior of MCQ mode consistent with that
> + * of legacy SDB mode.
> + */
> + if (hba->mcq_enabled) {
> + ocs = ufshcd_get_tr_ocs(lrbp, cqe);
> + if (ocs == OCS_ABORTED)
> + return;
> + }
Why only ignore the OCS_ABORTED status in MCQ mode? Is my understanding
correct that MediaTek controllers can also report this status in legacy
mode?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-23 18:19 ` Bart Van Assche
@ 2024-09-24 8:48 ` Peter Wang (王信友)
0 siblings, 0 replies; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-24 8:48 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Mon, 2024-09-23 at 11:19 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote:
> > When the error handler successfully aborts a MCQ request,
> > it only releases the command and does not notify the SCSI layer.
> > This may cause another abort after 30 seconds timeout.
> > This patch notifies the SCSI layer to requeue the request.
> >
> > Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode
> > SQ cleanup. This makes the behavior of MCQ mode consistent with
> > that of legacy SDB mode.
> >
> > Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS
> > for debugging purposes.
>
> Although I like the approach of this patch, two comments below.
>
> > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> > index a6f818cdef0e..b5c7bc50a27e 100644
> > --- a/drivers/ufs/core/ufshcd.c
> > +++ b/drivers/ufs/core/ufshcd.c
> > @@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba
> *hba, struct ufshcd_lrb *lrbp,
> > break;
> > case OCS_ABORTED:
> > result |= DID_ABORT << 16;
> > +dev_warn(hba->dev,
> > +"OCS aborted from controller = %x for tag %d\n",
> > +ocs, lrbp->task_tag);
> > break;
>
> Including the OCS status in this message seems redundant to me.
>
> > case OCS_INVALID_COMMAND_STATUS:
> > result |= DID_REQUEUE << 16;
> > +dev_warn(hba->dev,
> > +"OCS invaild from controller = %x for tag %d\n",
> > +ocs, lrbp->task_tag);
>
> Also here, including the OCS status in this message seems redundant
> to me.
>
> Please change "invaild" into "invalid".
Hi Bart,
Okay, I will revise the content printed here.
>
> > @@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba
> *hba, int task_tag,
> > ufshcd_update_monitor(hba, lrbp);
> > ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP);
> > cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe);
> > +
> > +/*
> > + * Ignore MCQ OCS: ABORTED posted by the host controller.
> > + * This makes the behavior of MCQ mode consistent with that
> > + * of legacy SDB mode.
> > + */
> > +if (hba->mcq_enabled) {
> > +ocs = ufshcd_get_tr_ocs(lrbp, cqe);
> > +if (ocs == OCS_ABORTED)
> > +return;
> > +}
>
> Why only ignore the OCS_ABORTED status in MCQ mode? Is my
> understanding
> correct that MediaTek controllers can also report this status in
> legacy
> mode?
>
> Thanks,
Because according to the specification, only MCQ will have
OCS: ABORTED. MediaTek, even with a quirk handling SDB OCS ABORTED,
cannot simply ignore it here and not deal with it.
Otherwise, it would need to release directly like MCQ ufshcd_abort_one.
Thanks
Peter
>
> Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-23 18:15 ` Bart Van Assche
@ 2024-09-24 8:51 ` Peter Wang (王信友)
2024-09-24 19:36 ` Bart Van Assche
0 siblings, 1 reply; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-24 8:51 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Mon, 2024-09-23 at 11:15 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/23/24 1:03 AM, peter.wang@mediatek.com wrote:
> > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> > index b5c7bc50a27e..b42079c3d634 100644
> > --- a/drivers/ufs/core/ufshcd.c
> > +++ b/drivers/ufs/core/ufshcd.c
> > @@ -5404,7 +5404,10 @@ ufshcd_transfer_rsp_status(struct ufs_hba
> *hba, struct ufshcd_lrb *lrbp,
> > }
> > break;
> > case OCS_ABORTED:
> > -result |= DID_ABORT << 16;
> > +if (hba->quirks & UFSHCD_QUIRK_OCS_ABORTED)
> > +result |= DID_REQUEUE << 16;
> > +else
> > +result |= DID_ABORT << 16;
> > dev_warn(hba->dev,
> > "OCS aborted from controller = %x for tag %d\n",
> > ocs, lrbp->task_tag);
>
> I think the approach of this patch is racy: the cmd->result
> assignment
> by ufshcd_transfer_rsp_status() races with the cmd->result assignment
> by
> ufshcd_abort_one(). How about addressing this race as follows?
> * In ufshcd_compl_one_cqe(), if the OCS_ABORTED status is
> encountered,
> set a completion.
> * In the code that aborts SCSI commands, for MediaTek controllers
> only,
> wait for that completion (based on a quirk).
> * Instead of introducing an if-statement in
> ufshcd_transfer_rsp_status(), rely on the cmd->result value
> assigned
> by ufshcd_abort_one().
>
> Thanks,
>
> Bart.
Hi Bart,
Sorry, I might not have understood the potential racing issue here.
The SCSI abort command doesn't need to wait for completion because
SCSI doesn't care about cmd->result, right?
The error handler abort also doesn't need to wait for completion,
because it should have a guaranteed order?
Firstly, ufshcd_abort_one is only likely to be filled
back with OCS: ABORTED by MediaTek UFS controller after
ufshcd_try_to_abort_task, and the sequence is as follows.
ufshcd_err_handler()
ufshcd_abort_all()
ufshcd_abort_one()
ufshcd_try_to_abort_task() // trigger mediatek controller
fill OCS: ABORTED
ufshcd_complete_requests()
ufshcd_transfer_req_compl()
ufshcd_poll()
get outstanding_lock
clear outstanding_reqs tag
release outstanding_lock
__ufshcd_transfer_req_compl()
ufshcd_compl_one_cqe()
cmd->result = DID_REQUEUE // mediatek may need quirk
change DID_ABORT to DID_REQUEUE
In addition, the ISR will use the outstanding_lock with
ufshcd_err_handler to provide protection, so there won't be any
racing that causes the command to be released repeatedly.
The only possible issue might be that after ufshcd_abort_one,
the MediaTek UFS controller has not yet filled in OCS: ABORTED
and has entered ufshcd_transfer_rsp_status for checking.
But this doesn't matter, because it will just be treated
as OCS_INVALID_COMMAND_STATUS.
Is there any corner case that I might have overlooked?
Thanks.
Peter
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-24 8:51 ` Peter Wang (王信友)
@ 2024-09-24 19:36 ` Bart Van Assche
2024-09-25 5:19 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-24 19:36 UTC (permalink / raw)
To: Peter Wang (王信友), linux-scsi@vger.kernel.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On 9/24/24 1:51 AM, Peter Wang (王信友) wrote:
> Is there any corner case that I might have overlooked?
Maybe I misunderstood how MediaTek controllers work. Anyway, if a
MediaTek controller reports the completion status OCS_ABORTED, is
there any chance that this status will be reported after
ufshcd_release_scsi_cmd() has been called? Is there any chance that
the controller writing OCS_ABORTED into the status field will race
with the host software overwriting that status field while submitting
a new command?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-24 19:36 ` Bart Van Assche
@ 2024-09-25 5:19 ` Peter Wang (王信友)
0 siblings, 0 replies; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-25 5:19 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均), quic_nguyenb@quicinc.com,
wsd_upstream, Ed Tsai (蔡宗軒),
Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Tue, 2024-09-24 at 12:36 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/24/24 1:51 AM, Peter Wang (王信友) wrote:
> > Is there any corner case that I might have overlooked?
>
> Maybe I misunderstood how MediaTek controllers work. Anyway, if a
> MediaTek controller reports the completion status OCS_ABORTED, is
> there any chance that this status will be reported after
> ufshcd_release_scsi_cmd() has been called? Is there any chance that
> the controller writing OCS_ABORTED into the status field will race
> with the host software overwriting that status field while submitting
> a new command?
>
> Thanks,
>
> Bart.
>
Hi Bart,
Okay, I may not have clearly explained the behavior of the
MediaTek UFS controller.
In the UFSHCI specification, under the description of UTRLCLR.
After writing to UTRLCLR, when UTRLDBR is cleared to 0,
the MediaTek UFS controller will simultaneously change the
OCS to ABORTED. So, OCS: ABORTED is filled at the moment
when ufshcd_clear_cmd successfully waits for DBR to clear.
That's why I wrote it like this:
ufshcd_try_to_abort_task() // triggers MediaTek controller to fill OCS:
ABORTED
Therefore, the status should already be ABORTED before
ufshcd_release_scsi_cmd, so there shouldn't be a racing issue.
Thanks.
Peter
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-09-25 5:19 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-23 8:03 [PATCH v8 0/3] fix abort defect peter.wang
2024-09-23 8:03 ` [PATCH v8 1/3] ufs: core: fix the issue of ICU failure peter.wang
2024-09-23 8:03 ` [PATCH v8 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
2024-09-23 18:19 ` Bart Van Assche
2024-09-24 8:48 ` Peter Wang (王信友)
2024-09-23 8:03 ` [PATCH v8 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
2024-09-23 18:15 ` Bart Van Assche
2024-09-24 8:51 ` Peter Wang (王信友)
2024-09-24 19:36 ` Bart Van Assche
2024-09-25 5:19 ` Peter Wang (王信友)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox