* [PATCH v9 1/3] ufs: core: fix the issue of ICU failure
2024-09-25 9:55 [PATCH v9 0/3] fix abort defect peter.wang
@ 2024-09-25 9:55 ` peter.wang
2024-09-25 9:55 ` [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
2024-09-25 9:55 ` [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
2 siblings, 0 replies; 10+ messages in thread
From: peter.wang @ 2024-09-25 9:55 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb, stable
From: Peter Wang <peter.wang@mediatek.com>
When setting the ICU bit without using read-modify-write,
SQRTCy will restart SQ again and receive an RTC return
error code 2 (Failure - SQ not stopped).
Additionally, the error log has been modified so that
this type of error can be observed.
Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/core/ufs-mcq.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 5891cdacd0b3..3903947dbed1 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -539,7 +539,7 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
struct scsi_cmnd *cmd = lrbp->cmd;
struct ufs_hw_queue *hwq;
void __iomem *reg, *opr_sqd_base;
- u32 nexus, id, val;
+ u32 nexus, id, val, rtc;
int err;
if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
@@ -569,17 +569,18 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
opr_sqd_base = mcq_opr_base(hba, OPR_SQD, id);
writel(nexus, opr_sqd_base + REG_SQCTI);
- /* SQRTCy.ICU = 1 */
- writel(SQ_ICU, opr_sqd_base + REG_SQRTC);
+ /* Initiate Cleanup */
+ writel(readl(opr_sqd_base + REG_SQRTC) | SQ_ICU,
+ opr_sqd_base + REG_SQRTC);
/* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */
reg = opr_sqd_base + REG_SQRTS;
err = read_poll_timeout(readl, val, val & SQ_CUS, 20,
MCQ_POLL_US, false, reg);
- if (err)
- dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%ld\n",
- __func__, id, task_tag,
- FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg)));
+ rtc = FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg));
+ if (err || rtc)
+ dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%d RTC=%d\n",
+ __func__, id, task_tag, err, rtc);
if (ufshcd_mcq_sq_start(hba, hwq))
err = -ETIMEDOUT;
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-25 9:55 [PATCH v9 0/3] fix abort defect peter.wang
2024-09-25 9:55 ` [PATCH v9 1/3] ufs: core: fix the issue of ICU failure peter.wang
@ 2024-09-25 9:55 ` peter.wang
2024-09-25 16:49 ` Bart Van Assche
2024-09-25 9:55 ` [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
2 siblings, 1 reply; 10+ messages in thread
From: peter.wang @ 2024-09-25 9:55 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb
From: Peter Wang <peter.wang@mediatek.com>
When the error handler successfully aborts a MCQ request,
it only releases the command and does not notify the SCSI layer.
This may cause another abort after 30 seconds timeout.
This patch notifies the SCSI layer to requeue the request.
Additionally, ignore the OCS: ABORTED CQ slot after MCQ mode
SQ cleanup. This makes the behavior of MCQ mode consistent with
that of legacy SDB mode.
Also, print logs for OCS: ABORTED and OCS_INVALID_COMMAND_STATUS
for debugging purposes.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index a6f818cdef0e..4fff929b70d6 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5405,9 +5405,15 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
break;
case OCS_ABORTED:
result |= DID_ABORT << 16;
+ dev_warn(hba->dev,
+ "OCS aborted from controller for tag %d\n",
+ lrbp->task_tag);
break;
case OCS_INVALID_COMMAND_STATUS:
result |= DID_REQUEUE << 16;
+ dev_warn(hba->dev,
+ "OCS invaild from controller for tag %d\n",
+ lrbp->task_tag);
break;
case OCS_INVALID_CMD_TABLE_ATTR:
case OCS_INVALID_PRDT_ATTR:
@@ -5526,6 +5532,18 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag,
ufshcd_update_monitor(hba, lrbp);
ufshcd_add_command_trace(hba, task_tag, UFS_CMD_COMP);
cmd->result = ufshcd_transfer_rsp_status(hba, lrbp, cqe);
+
+ /*
+ * Ignore MCQ OCS: ABORTED posted by the host controller.
+ * This makes the behavior of MCQ mode consistent with that
+ * of legacy SDB mode.
+ */
+ if (hba->mcq_enabled) {
+ ocs = ufshcd_get_tr_ocs(lrbp, cqe);
+ if (ocs == OCS_ABORTED)
+ return;
+ }
+
ufshcd_release_scsi_cmd(hba, lrbp);
/* Do not touch lrbp after scsi done */
scsi_done(cmd);
@@ -6486,8 +6504,11 @@ static bool ufshcd_abort_one(struct request *rq, void *priv)
if (!hwq)
return 0;
spin_lock_irqsave(&hwq->cq_lock, flags);
- if (ufshcd_cmd_inflight(lrbp->cmd))
+ if (ufshcd_cmd_inflight(lrbp->cmd)) {
+ set_host_byte(lrbp->cmd, DID_REQUEUE);
ufshcd_release_scsi_cmd(hba, lrbp);
+ scsi_done(lrbp->cmd);
+ }
spin_unlock_irqrestore(&hwq->cq_lock, flags);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-25 9:55 ` [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
@ 2024-09-25 16:49 ` Bart Van Assche
2024-09-26 3:45 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-25 16:49 UTC (permalink / raw)
To: peter.wang, linux-scsi, martin.petersen, avri.altman, alim.akhtar,
jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui,
tun-yu.yu, eddie.huang, naomi.chu, ed.tsai, quic_nguyenb
On 9/25/24 2:55 AM, peter.wang@mediatek.com wrote:
> case OCS_INVALID_COMMAND_STATUS:
> result |= DID_REQUEUE << 16;
> + dev_warn(hba->dev,
> + "OCS invaild from controller for tag %d\n",
> + lrbp->task_tag);
Please change "invaild" into "invalid". Once that change has been made,
feel free to add:
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-25 16:49 ` Bart Van Assche
@ 2024-09-26 3:45 ` Peter Wang (王信友)
2024-09-26 18:26 ` Bart Van Assche
0 siblings, 1 reply; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-26 3:45 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Wed, 2024-09-25 at 09:49 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/25/24 2:55 AM, peter.wang@mediatek.com wrote:
> > case OCS_INVALID_COMMAND_STATUS:
> > result |= DID_REQUEUE << 16;
> > +dev_warn(hba->dev,
> > +"OCS invaild from controller for tag %d\n",
> > +lrbp->task_tag);
>
> Please change "invaild" into "invalid". Once that change has been
> made,
> feel free to add:
>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Hi Bart,
Sorry, this typo wasn't corrected.
However, I still feel that this patch is not quite reasonable.
The reason is that in the first patch,
"ufs: core: fix the issue of ICU failure"
we corrected the ICU problem, allowing the SQ to clean up
correctly, while the CQ will have a corresponding response.
But the second patch directly ignores the CQ's response and
continues to use a workaround to release the command right
after ufshcd_try_to_abort_task.
The Legacy SDB mode's approach would not release the
command after ufshcd_try_to_abort_task.
Instead, it releases after the DBR is cleared.
Therefore, as I previously suggested, it would probably
be more reasonable to directly requeue the ABORTED commands
as shown in the patch below.
---------------------------------------------------------------------
---
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 24a32e2fd75e..06aa4ed1a9e6 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5417,10 +5417,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba,
struct ufshcd_lrb *lrbp,
}
break;
case OCS_ABORTED:
- result |= DID_ABORT << 16;
- break;
case OCS_INVALID_COMMAND_STATUS:
result |= DID_REQUEUE << 16;
+ dev_warn(hba->dev,
+ "OCS %s from controller for tag %d\n",
+ (ocs == OCS_ABORTED? "aborted" :
"invalid"),
+ lrbp->task_tag);
break;
case OCS_INVALID_CMD_TABLE_ATTR:
case OCS_INVALID_PRDT_ATTR:
@@ -6466,26 +6468,12 @@ static bool ufshcd_abort_one(struct request
*rq, void *priv)
struct scsi_device *sdev = cmd->device;
struct Scsi_Host *shost = sdev->host;
struct ufs_hba *hba = shost_priv(shost);
- struct ufshcd_lrb *lrbp = &hba->lrb[tag];
- struct ufs_hw_queue *hwq;
- unsigned long flags;
*ret = ufshcd_try_to_abort_task(hba, tag);
dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag,
hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1,
*ret ? "failed" : "succeeded");
- /* Release cmd in MCQ mode if abort succeeds */
- if (hba->mcq_enabled && (*ret == 0)) {
- hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp-
>cmd));
- if (!hwq)
- return 0;
- spin_lock_irqsave(&hwq->cq_lock, flags);
- if (ufshcd_cmd_inflight(lrbp->cmd))
- ufshcd_release_scsi_cmd(hba, lrbp);
- spin_unlock_irqrestore(&hwq->cq_lock, flags);
- }
-
return *ret == 0;
}
---------------------------------------------------------------------
---
This patch has several advantages:
1. It makes the patch 'ufs: core: fix the issue of ICU failure'
seem valuable.
2. The patch is more concise.
3. There is no need to fetch OCS to determine OCS: ABORTED
on every CQ completion, which increases ISR time.
4. The err_handler flow for SDB and MCQ would be consistent.
5. There is no need for the MediaTek SDB quirk.
What do you think?"
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-26 3:45 ` Peter Wang (王信友)
@ 2024-09-26 18:26 ` Bart Van Assche
2024-09-27 7:51 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-26 18:26 UTC (permalink / raw)
To: Peter Wang (王信友), linux-scsi@vger.kernel.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On 9/25/24 8:45 PM, Peter Wang (王信友) wrote:
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 24a32e2fd75e..06aa4ed1a9e6 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -5417,10 +5417,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba,
> struct ufshcd_lrb *lrbp,
> }
> break;
> case OCS_ABORTED:
> - result |= DID_ABORT << 16;
> - break;
> case OCS_INVALID_COMMAND_STATUS:
> result |= DID_REQUEUE << 16;
> + dev_warn(hba->dev,
> + "OCS %s from controller for tag %d\n",
> + (ocs == OCS_ABORTED? "aborted" :
> "invalid"),
> + lrbp->task_tag);
> break;
> case OCS_INVALID_CMD_TABLE_ATTR:
> case OCS_INVALID_PRDT_ATTR:
> @@ -6466,26 +6468,12 @@ static bool ufshcd_abort_one(struct request
> *rq, void *priv)
> struct scsi_device *sdev = cmd->device;
> struct Scsi_Host *shost = sdev->host;
> struct ufs_hba *hba = shost_priv(shost);
> - struct ufshcd_lrb *lrbp = &hba->lrb[tag];
> - struct ufs_hw_queue *hwq;
> - unsigned long flags;
>
> *ret = ufshcd_try_to_abort_task(hba, tag);
> dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag,
> hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1,
> *ret ? "failed" : "succeeded");
>
> - /* Release cmd in MCQ mode if abort succeeds */
> - if (hba->mcq_enabled && (*ret == 0)) {
> - hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp-
>> cmd));
> - if (!hwq)
> - return 0;
> - spin_lock_irqsave(&hwq->cq_lock, flags);
> - if (ufshcd_cmd_inflight(lrbp->cmd))
> - ufshcd_release_scsi_cmd(hba, lrbp);
> - spin_unlock_irqrestore(&hwq->cq_lock, flags);
> - }
> -
> return *ret == 0;
> }
>
> ---------------------------------------------------------------------
>
>
> This patch has several advantages:
>
> 1. It makes the patch 'ufs: core: fix the issue of ICU failure'
> seem valuable.
> 2. The patch is more concise.
> 3. There is no need to fetch OCS to determine OCS: ABORTED
> on every CQ completion, which increases ISR time.
> 4. The err_handler flow for SDB and MCQ would be consistent.
> 5. There is no need for the MediaTek SDB quirk.
>
>
> What do you think?"
Hi Peter,
Is the above patch sufficient? In MCQ mode, aborting a command happens
as follows (simplified):
(1) Send the ABORT TASK TMF. If this TMF succeeds, no SQE will be
generated. If this TMF succeeds it means that the SCSI command has
reached the UFS device and hence is no longer present in any
submission queue (SQ).
(2) If the command is still in a submission queue, nullify the SQE. In
this case a CQE will be generated with status ABORTED.
It seems to me that the above patch handles (2) but not (1)?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort
2024-09-26 18:26 ` Bart Van Assche
@ 2024-09-27 7:51 ` Peter Wang (王信友)
0 siblings, 0 replies; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-27 7:51 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均), quic_nguyenb@quicinc.com,
wsd_upstream, Ed Tsai (蔡宗軒),
Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Thu, 2024-09-26 at 11:26 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/25/24 8:45 PM, Peter Wang (王信友) wrote:
> > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> > index 24a32e2fd75e..06aa4ed1a9e6 100644
> > --- a/drivers/ufs/core/ufshcd.c
> > +++ b/drivers/ufs/core/ufshcd.c
> > @@ -5417,10 +5417,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba
> *hba,
> > struct ufshcd_lrb *lrbp,
> > }
> > break;
> > case OCS_ABORTED:
> > -result |= DID_ABORT << 16;
> > -break;
> > case OCS_INVALID_COMMAND_STATUS:
> > result |= DID_REQUEUE << 16;
> > +dev_warn(hba->dev,
> > +"OCS %s from controller for tag %d\n",
> > +(ocs == OCS_ABORTED? "aborted" :
> > "invalid"),
> > +lrbp->task_tag);
> > break;
> > case OCS_INVALID_CMD_TABLE_ATTR:
> > case OCS_INVALID_PRDT_ATTR:
> > @@ -6466,26 +6468,12 @@ static bool ufshcd_abort_one(struct request
> > *rq, void *priv)
> > struct scsi_device *sdev = cmd->device;
> > struct Scsi_Host *shost = sdev->host;
> > struct ufs_hba *hba = shost_priv(shost);
> > -struct ufshcd_lrb *lrbp = &hba->lrb[tag];
> > -struct ufs_hw_queue *hwq;
> > -unsigned long flags;
> >
> > *ret = ufshcd_try_to_abort_task(hba, tag);
> > dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag,
> > hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1,
> > *ret ? "failed" : "succeeded");
> >
> > -/* Release cmd in MCQ mode if abort succeeds */
> > -if (hba->mcq_enabled && (*ret == 0)) {
> > -hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp-
> >> cmd));
> > -if (!hwq)
> > -return 0;
> > -spin_lock_irqsave(&hwq->cq_lock, flags);
> > -if (ufshcd_cmd_inflight(lrbp->cmd))
> > -ufshcd_release_scsi_cmd(hba, lrbp);
> > -spin_unlock_irqrestore(&hwq->cq_lock, flags);
> > -}
> > -
> > return *ret == 0;
> > }
> >
> > -----------------------------------------------------------------
> ----
> >
> >
> > This patch has several advantages:
> >
> > 1. It makes the patch 'ufs: core: fix the issue of ICU failure'
> > seem valuable.
> > 2. The patch is more concise.
> > 3. There is no need to fetch OCS to determine OCS: ABORTED
> > on every CQ completion, which increases ISR time.
> > 4. The err_handler flow for SDB and MCQ would be consistent.
> > 5. There is no need for the MediaTek SDB quirk.
> >
> >
> > What do you think?"
>
> Hi Peter,
>
> Is the above patch sufficient? In MCQ mode, aborting a command
> happens
> as follows (simplified):
> (1) Send the ABORT TASK TMF. If this TMF succeeds, no SQE will be
> generated. If this TMF succeeds it means that the SCSI command
> has
> reached the UFS device and hence is no longer present in any
> submission queue (SQ).
> (2) If the command is still in a submission queue, nullify the SQE.
> In
> this case a CQE will be generated with status ABORTED.
>
> It seems to me that the above patch handles (2) but not (1)?
>
> Thanks,
>
> Bart.
Hi Bart,
Regarding your description of 'ABORT TMF success' and 'SQE'
I think there might be some misunderstanding.
In this section of the UFSHCI 4.0 specification.
4.4.6 (Informative) Processing Abort in MCQ mode: An Implementation
Example
There are three case for MCQ abort:
1. When the host controller has already sent out the SQE
and the UFS device has already responded with the
corresponding response, the CQ Entry will automatically
increment by 1. This case is the simplest, the SQE
will have a corresponding CQE for the host to cleanup
resources.
2. When the host controller has not yet sent out this SQE
(SQ is not empty), the software can fill in 'nullify' to
notify the host controller that there is no need to send
it, and directly fill the corresponding response into the
CQ with OCS: ABORTED. This scenario is also straightforward,
the UFS device won't be aware, and only the host controller
needs to clean up the related resources.
3. When the host controller has already sent out the SQE
and is waiting for the response from the UFS device (CQE),
the software can initiate cleanup to notify the host
controller that there is no need to wait, and directly fill
the corresponding response into the CQ with OCS: ABORTED.
Therefore, when a TMF is successfully executed, for example,
tag 0 has been aborted, the software will know that the UFS
device will no longer return a response for tag 0.
Thus, the host controller needs to initiate cleanup,
allowing the CQE corresponding to this SQE to return,
while also cleaning the resources for tag 0.
In both the second and third cases, no matter which one occurs,
they will result in the CQE corresponding to the SQE being
filled with OCS: ABORTED. So, as long as we directly
requeue when OCS: ABORTED, it will satisfy both of these
situations.
Thanks
Peter
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-25 9:55 [PATCH v9 0/3] fix abort defect peter.wang
2024-09-25 9:55 ` [PATCH v9 1/3] ufs: core: fix the issue of ICU failure peter.wang
2024-09-25 9:55 ` [PATCH v9 2/3] ufs: core: fix error handler process for MCQ abort peter.wang
@ 2024-09-25 9:55 ` peter.wang
2024-09-25 16:56 ` Bart Van Assche
2 siblings, 1 reply; 10+ messages in thread
From: peter.wang @ 2024-09-25 9:55 UTC (permalink / raw)
To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb
Cc: wsd_upstream, linux-mediatek, peter.wang, chun-hung.wu,
alice.chao, cc.chou, chaotian.jing, jiajie.hao, powen.kao,
qilin.tan, lin.gui, tun-yu.yu, eddie.huang, naomi.chu, ed.tsai,
bvanassche, quic_nguyenb
From: Peter Wang <peter.wang@mediatek.com>
Because the MediaTek UFS controller uses UTRLCLR to clear commands
and fills the OCS with ABORTED, this patch introduces a quirk to
treat ABORTED as INVALID_OCS_VALUE.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
drivers/ufs/core/ufshcd.c | 5 ++++-
drivers/ufs/host/ufs-mediatek.c | 1 +
include/ufs/ufshcd.h | 6 ++++++
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 4fff929b70d6..d429817fca94 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5404,7 +5404,10 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
}
break;
case OCS_ABORTED:
- result |= DID_ABORT << 16;
+ if (hba->quirks & UFSHCD_QUIRK_OCS_ABORTED)
+ result |= DID_REQUEUE << 16;
+ else
+ result |= DID_ABORT << 16;
dev_warn(hba->dev,
"OCS aborted from controller for tag %d\n",
lrbp->task_tag);
diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c
index 02c9064284e1..8a4c1b8f5a26 100644
--- a/drivers/ufs/host/ufs-mediatek.c
+++ b/drivers/ufs/host/ufs-mediatek.c
@@ -1021,6 +1021,7 @@ static int ufs_mtk_init(struct ufs_hba *hba)
hba->quirks |= UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL;
hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_INTR;
hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_RTC;
+ hba->quirks |= UFSHCD_QUIRK_OCS_ABORTED;
hba->vps->wb_flush_threshold = UFS_WB_BUF_REMAIN_PERCENT(80);
if (host->caps & UFS_MTK_CAP_DISABLE_AH8)
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 0fd2aebac728..8f156803d703 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -684,6 +684,12 @@ enum ufshcd_quirks {
* single doorbell mode.
*/
UFSHCD_QUIRK_BROKEN_LSDBS_CAP = 1 << 25,
+
+ /*
+ * Some host controllers set OCS_ABORTED after UTRLCLR (SDB mode),
+ * this quirk is set to treat OCS: ABORTED as INVALID_OCS_VALUE
+ */
+ UFSHCD_QUIRK_OCS_ABORTED = 1 << 26,
};
enum ufshcd_caps {
--
2.45.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-25 9:55 ` [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted peter.wang
@ 2024-09-25 16:56 ` Bart Van Assche
2024-09-26 3:42 ` Peter Wang (王信友)
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2024-09-25 16:56 UTC (permalink / raw)
To: peter.wang, linux-scsi, martin.petersen, avri.altman, alim.akhtar,
jejb
Cc: wsd_upstream, linux-mediatek, chun-hung.wu, alice.chao, cc.chou,
chaotian.jing, jiajie.hao, powen.kao, qilin.tan, lin.gui,
tun-yu.yu, eddie.huang, naomi.chu, ed.tsai, quic_nguyenb
On 9/25/24 2:55 AM, peter.wang@mediatek.com wrote:
> From: Peter Wang <peter.wang@mediatek.com>
>
> Because the MediaTek UFS controller uses UTRLCLR to clear commands
> and fills the OCS with ABORTED, this patch introduces a quirk to
> treat ABORTED as INVALID_OCS_VALUE.
>
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
> ---
> drivers/ufs/core/ufshcd.c | 5 ++++-
> drivers/ufs/host/ufs-mediatek.c | 1 +
> include/ufs/ufshcd.h | 6 ++++++
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 4fff929b70d6..d429817fca94 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -5404,7 +5404,10 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
> }
> break;
> case OCS_ABORTED:
> - result |= DID_ABORT << 16;
> + if (hba->quirks & UFSHCD_QUIRK_OCS_ABORTED)
> + result |= DID_REQUEUE << 16;
> + else
> + result |= DID_ABORT << 16;
> dev_warn(hba->dev,
> "OCS aborted from controller for tag %d\n",
> lrbp->task_tag);
> diff --git a/drivers/ufs/host/ufs-mediatek.c b/drivers/ufs/host/ufs-mediatek.c
> index 02c9064284e1..8a4c1b8f5a26 100644
> --- a/drivers/ufs/host/ufs-mediatek.c
> +++ b/drivers/ufs/host/ufs-mediatek.c
> @@ -1021,6 +1021,7 @@ static int ufs_mtk_init(struct ufs_hba *hba)
> hba->quirks |= UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL;
> hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_INTR;
> hba->quirks |= UFSHCD_QUIRK_MCQ_BROKEN_RTC;
> + hba->quirks |= UFSHCD_QUIRK_OCS_ABORTED;
> hba->vps->wb_flush_threshold = UFS_WB_BUF_REMAIN_PERCENT(80);
>
> if (host->caps & UFS_MTK_CAP_DISABLE_AH8)
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index 0fd2aebac728..8f156803d703 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -684,6 +684,12 @@ enum ufshcd_quirks {
> * single doorbell mode.
> */
> UFSHCD_QUIRK_BROKEN_LSDBS_CAP = 1 << 25,
> +
> + /*
> + * Some host controllers set OCS_ABORTED after UTRLCLR (SDB mode),
> + * this quirk is set to treat OCS: ABORTED as INVALID_OCS_VALUE
> + */
> + UFSHCD_QUIRK_OCS_ABORTED = 1 << 26,
> };
>
> enum ufshcd_caps {
ufshcd_transfer_rsp_status() only has one caller, namely
ufshcd_compl_one_cqe(). The previous patch makes sure that that
ufshcd_compl_one_cqe() is not called if a SCSI command is aborted. So
why does this patch modify how OCS_ABORTED is processed? Is this patch
necessary or can it perhaps be dropped?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v9 3/3] ufs: core: add a quirk for MediaTek SDB mode aborted
2024-09-25 16:56 ` Bart Van Assche
@ 2024-09-26 3:42 ` Peter Wang (王信友)
0 siblings, 0 replies; 10+ messages in thread
From: Peter Wang (王信友) @ 2024-09-26 3:42 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org, bvanassche@acm.org,
avri.altman@wdc.com, jejb@linux.ibm.com, alim.akhtar@samsung.com,
martin.petersen@oracle.com
Cc: linux-mediatek@lists.infradead.org,
Jiajie Hao (郝加节),
CC Chou (周志杰),
Eddie Huang (黃智傑),
Alice Chao (趙珮均),
Ed Tsai (蔡宗軒), wsd_upstream,
quic_nguyenb@quicinc.com, Lin Gui (桂林),
Chun-Hung Wu (巫駿宏),
Tun-yu Yu (游敦聿),
Chaotian Jing (井朝天),
Powen Kao (高伯文),
Naomi Chu (朱詠田),
Qilin Tan (谭麒麟)
On Wed, 2024-09-25 at 09:56 -0700, Bart Van Assche wrote:
>
>
> ufshcd_transfer_rsp_status() only has one caller, namely
> ufshcd_compl_one_cqe(). The previous patch makes sure that that
> ufshcd_compl_one_cqe() is not called if a SCSI command is aborted. So
> why does this patch modify how OCS_ABORTED is processed? Is this
> patch
> necessary or can it perhaps be dropped?
>
> Thanks,
>
> Bart.
Hi Bart,
Because in Legacy SDB mode, the error handler still has the
chance to call ufshcd_compl_one_cqe();
The call flow is as follows:
ufshcd_err_handler()
ufshcd_abort_all()
ufshcd_abort_one()
ufshcd_try_to_abort_task() // mediatek controller fill
OCS: ABORTED
ufshcd_complete_requests()
ufshcd_transfer_req_compl()
ufshcd_poll()
get outstanding_lock
clear outstanding_reqs tag
release outstanding_lock
__ufshcd_transfer_req_compl()
ufshcd_compl_one_cqe()
cmd->result = DID_REQUEUE // mediatek need quirk change
DID_ABORT to DID_REQUEUE
ufshcd_release_scsi_cmd()
scsi_done()
Thanks.
Peter
^ permalink raw reply [flat|nested] 10+ messages in thread