Linux SCSI subsystem development
 help / color / mirror / Atom feed
* [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
@ 2025-10-27 15:44 Bart Van Assche
  2025-10-28 13:05 ` Peter Wang (王信友)
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2025-10-27 15:44 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, James E.J. Bottomley, Peter Wang,
	Manivannan Sadhasivam, Chenyuan Yang, ping.gao, Alok Tiwari,
	Po-Wen Kao, Stanley Jhu, Alim Akhtar

ufshcd_mcq_sq_cleanup() must return 0 if the command with tag 'task_tag'
is no longer in a submission queue. Check whether or not a command is
still pending by calling ufshcd_mcq_sqe_search().

Fixes: aa9d5d0015a8 ("scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufs-mcq.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index d04662b57cd1..cd47b7e438f4 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -36,6 +36,9 @@
 /* Max mcq register polling time in microseconds */
 #define MCQ_POLL_US 500000
 
+static bool ufshcd_mcq_sqe_search(struct ufs_hba *hba, struct ufs_hw_queue *hwq,
+				  int task_tag);
+
 static int rw_queue_count_set(const char *val, const struct kernel_param *kp)
 {
 	return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_RW_QUEUES,
@@ -553,9 +556,6 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
 	u32 nexus, id, val;
 	int err;
 
-	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
-		return -ETIMEDOUT;
-
 	if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) {
 		if (!cmd)
 			return -EINVAL;
@@ -566,6 +566,10 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
 		hwq = hba->dev_cmd_queue;
 	}
 
+	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
+		return ufshcd_mcq_sqe_search(hba, hwq, task_tag) ? -ETIMEDOUT :
+			0;
+
 	id = hwq->id;
 
 	guard(mutex)(&hwq->sq_mutex);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-27 15:44 [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation Bart Van Assche
@ 2025-10-28 13:05 ` Peter Wang (王信友)
  2025-10-28 13:46   ` Bart Van Assche
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Wang (王信友) @ 2025-10-28 13:05 UTC (permalink / raw)
  To: bvanassche@acm.org, martin.petersen@oracle.com
  Cc: Powen Kao (高伯文), chu.stanley@gmail.com,
	alok.a.tiwari@oracle.com, linux-scsi@vger.kernel.org,
	chenyuan0y@gmail.com, alim.akhtar@samsung.com,
	ping.gao@samsung.com, mani@kernel.org,
	James.Bottomley@HansenPartnership.com

On Mon, 2025-10-27 at 08:44 -0700, Bart Van Assche wrote:
> ufshcd_mcq_sq_cleanup() must return 0 if the command with tag
> 'task_tag'
> is no longer in a submission queue. Check whether or not a command is
> still pending by calling ufshcd_mcq_sqe_search().

Hi Bart,

What if the tag is not in the submission queue, but the 
completion queue is still waiting for the tag's response? 
If we return 0, it may cause ufshcd_abort to think it 
succeeded, even though the tag is still in an error state?

Thanks.
Peter

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-28 13:05 ` Peter Wang (王信友)
@ 2025-10-28 13:46   ` Bart Van Assche
  2025-10-29  7:18     ` Peter Wang (王信友)
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2025-10-28 13:46 UTC (permalink / raw)
  To: Peter Wang (王信友), martin.petersen@oracle.com
  Cc: Powen Kao (高伯文), chu.stanley@gmail.com,
	alok.a.tiwari@oracle.com, linux-scsi@vger.kernel.org,
	chenyuan0y@gmail.com, alim.akhtar@samsung.com,
	ping.gao@samsung.com, mani@kernel.org,
	James.Bottomley@HansenPartnership.com

On 10/28/25 6:05 AM, Peter Wang (王信友) wrote:
> On Mon, 2025-10-27 at 08:44 -0700, Bart Van Assche wrote:
>> ufshcd_mcq_sq_cleanup() must return 0 if the command with tag
>> 'task_tag'
>> is no longer in a submission queue. Check whether or not a command is
>> still pending by calling ufshcd_mcq_sqe_search().
> 
> Hi Bart,
> 
> What if the tag is not in the submission queue, but the
> completion queue is still waiting for the tag's response?
> If we return 0, it may cause ufshcd_abort to think it
> succeeded, even though the tag is still in an error state?

ufshcd_mcq_sqe_search() only searches the submission queue. Examining
completion queues is not something ufshcd_mcq_sqe_search() must do.

The UFSHCI driver is based on the assumption that the UFSHCI controller
works correctly and hence passes completions quickly to the host. The
SCSI core only tries to abort a SCSI command after the SCSI timeout has
expired. The smallest SCSI timeout that can be configured is one second.
This is several orders of magnitude larger than the typical time for
passing a completion from the UFSHCI to the host. Hence, I think it is
very unlikely that a completion is present in the host controller and
has not yet reached the host by the time the SCSI core aborts a SCSI
command.

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-28 13:46   ` Bart Van Assche
@ 2025-10-29  7:18     ` Peter Wang (王信友)
  2025-10-29 15:50       ` Bart Van Assche
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Wang (王信友) @ 2025-10-29  7:18 UTC (permalink / raw)
  To: bvanassche@acm.org, martin.petersen@oracle.com
  Cc: Powen Kao (高伯文), alok.a.tiwari@oracle.com,
	chu.stanley@gmail.com, linux-scsi@vger.kernel.org,
	chenyuan0y@gmail.com, alim.akhtar@samsung.com,
	ping.gao@samsung.com, mani@kernel.org,
	James.Bottomley@HansenPartnership.com

On Tue, 2025-10-28 at 06:46 -0700, Bart Van Assche wrote:
> ufshcd_mcq_sqe_search() only searches the submission queue. Examining
> completion queues is not something ufshcd_mcq_sqe_search() must do.
> 
> The UFSHCI driver is based on the assumption that the UFSHCI
> controller
> works correctly and hence passes completions quickly to the host. The
> SCSI core only tries to abort a SCSI command after the SCSI timeout
> has
> expired. The smallest SCSI timeout that can be configured is one
> second.
> This is several orders of magnitude larger than the typical time for
> passing a completion from the UFSHCI to the host. Hence, I think it
> is
> very unlikely that a completion is present in the host controller and
> has not yet reached the host by the time the SCSI core aborts a SCSI
> command.
> 
> Bart.

Hi Bart,

The probability is indeed quite low, but it can still happen,
especially when the timeout is set low. For example, the dev 
command only has 1.5 seconds. Some device dev commands may be
delayed by normal read/write commands and eventually reach 
this timeout.

By the way,
ufshcd_mcq_sqe_search will return true if UFSHCD_QUIRK_MCQ_BROKEN_RTC.
Therefore, this code:

+	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
+		return ufshcd_mcq_sqe_search(hba, hwq, task_tag) ? -
ETIMEDOUT :
+			0;
+

should be equivalent to:

-	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
-		return -ETIMEDOUT;
-

am I right?

Thanks
Peter

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-29  7:18     ` Peter Wang (王信友)
@ 2025-10-29 15:50       ` Bart Van Assche
  2025-10-30  6:38         ` Peter Wang (王信友)
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Van Assche @ 2025-10-29 15:50 UTC (permalink / raw)
  To: Peter Wang (王信友), martin.petersen@oracle.com
  Cc: alok.a.tiwari@oracle.com, chu.stanley@gmail.com,
	linux-scsi@vger.kernel.org, chenyuan0y@gmail.com,
	alim.akhtar@samsung.com, ping.gao@samsung.com, mani@kernel.org,
	James.Bottomley@HansenPartnership.com

On 10/29/25 12:18 AM, Peter Wang (王信友) wrote:
> The probability is indeed quite low, but it can still happen,
> especially when the timeout is set low. For example, the dev
> command only has 1.5 seconds. Some device dev commands may be
> delayed by normal read/write commands and eventually reach
> this timeout.

As mentioned before, whether or not a command completion is still in
the completion queue and has not yet reached the host falls outside the
scope of ufshcd_mcq_sqe_search(). That function should search one
specific submission queue and should not search the completion queues.

> By the way,
> ufshcd_mcq_sqe_search will return true if UFSHCD_QUIRK_MCQ_BROKEN_RTC.
> Therefore, this code:
> 
> +if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
> +return ufshcd_mcq_sqe_search(hba, hwq, task_tag) ? -
> ETIMEDOUT :
> +0;
> +
> 
> should be equivalent to:
> 
> -if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
> -return -ETIMEDOUT;
> -
> 
> am I right?

No. ufshcd_mcq_sqe_search() is typically called after a timeout has
occurred. After a timeout has occurred it is very likely that a command
is no longer present in the submission queue it was submitted to. Hence,
the typical case is that ufshcd_mcq_sqe_search() will not find the
command and hence that my patch will change the return value from
-ETIMEDOUT into 0.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-29 15:50       ` Bart Van Assche
@ 2025-10-30  6:38         ` Peter Wang (王信友)
  2025-10-30 16:11           ` Bart Van Assche
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Wang (王信友) @ 2025-10-30  6:38 UTC (permalink / raw)
  To: bvanassche@acm.org, martin.petersen@oracle.com
  Cc: alok.a.tiwari@oracle.com, chu.stanley@gmail.com,
	linux-scsi@vger.kernel.org, chenyuan0y@gmail.com,
	alim.akhtar@samsung.com, ping.gao@samsung.com,
	James.Bottomley@HansenPartnership.com, mani@kernel.org

On Wed, 2025-10-29 at 08:50 -0700, Bart Van Assche wrote:
> 
> 
> No. ufshcd_mcq_sqe_search() is typically called after a timeout has
> occurred. After a timeout has occurred it is very likely that a
> command
> is no longer present in the submission queue it was submitted to.
> Hence,
> the typical case is that ufshcd_mcq_sqe_search() will not find the
> command and hence that my patch will change the return value from
> -ETIMEDOUT into 0.
> 
> Thanks,
> 
> Bart.

Hi Bart,

It’s possible that you misunderstood my point.
ufshcd_mcq_sqe_search will alwasy return true if
UFSHCD_QUIRK_MCQ_BROKEN_RTC flag is set by below flow.

static bool ufshcd_mcq_sqe_search(...)
{
	...
	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
		return true;
	...
}

Hence, the code below will always return -ETIMEDOUT.

if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
	return ufshcd_mcq_sqe_search(hba, hwq, task_tag) ? -ETIMEDOUT
: 0;


Thanks.
Peter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation
  2025-10-30  6:38         ` Peter Wang (王信友)
@ 2025-10-30 16:11           ` Bart Van Assche
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Van Assche @ 2025-10-30 16:11 UTC (permalink / raw)
  To: Peter Wang (王信友), martin.petersen@oracle.com
  Cc: alok.a.tiwari@oracle.com, chu.stanley@gmail.com,
	linux-scsi@vger.kernel.org, chenyuan0y@gmail.com,
	alim.akhtar@samsung.com, ping.gao@samsung.com,
	James.Bottomley@HansenPartnership.com, mani@kernel.org

On 10/29/25 11:38 PM, Peter Wang (王信友) wrote:
> It’s possible that you misunderstood my point.
> ufshcd_mcq_sqe_search will alwasy return true if
> UFSHCD_QUIRK_MCQ_BROKEN_RTC flag is set by below flow.
> 
> static bool ufshcd_mcq_sqe_search(...)
> {
> ...
> if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
> return true;
> ...
> }
> 
> Hence, the code below will always return -ETIMEDOUT.
> 
> if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
> return ufshcd_mcq_sqe_search(hba, hwq, task_tag) ? -ETIMEDOUT
> : 0;

Hi Peter,

Thank you for having clarified this. This is something I had overlooked.

I took another look at ufshcd_mcq_sqe_search() and it seems to me that
that function can't operate safely without stopping the submission
queue. And the submission queue can't be stopped if the quirk
UFSHCD_QUIRK_MCQ_BROKEN_RTC has been set.

I will drop this patch.

Bart.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-10-30 16:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-27 15:44 [PATCH] ufs: core: Fix the UFSHCD_QUIRK_MCQ_BROKEN_RTC implementation Bart Van Assche
2025-10-28 13:05 ` Peter Wang (王信友)
2025-10-28 13:46   ` Bart Van Assche
2025-10-29  7:18     ` Peter Wang (王信友)
2025-10-29 15:50       ` Bart Van Assche
2025-10-30  6:38         ` Peter Wang (王信友)
2025-10-30 16:11           ` Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox