public inbox for linux-scsi@vger.kernel.org
From: Bart Van Assche <bvanassche@acm.org>
To: "Peter Wang (王信友)" <peter.wang@mediatek.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>
Cc: wsd_upstream <wsd_upstream@mediatek.com>,
	"linux-mediatek@lists.infradead.org"
	<linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v10 2/2] ufs: core: requeue aborted request
Date: Tue, 8 Oct 2024 11:29:47 -0700
Message-ID: <a02c83eb-d057-48cc-9735-770928a2a0a1@acm.org>
In-Reply-To: <8c463196860b71f26bddad0e7e8be6aacd470109.camel@mediatek.com>

On 10/7/24 12:20 AM, Peter Wang (王信友) wrote:
> On Thu, 2024-10-03 at 13:02 -0700, Bart Van Assche wrote:
>> On 10/2/24 5:42 AM, Peter Wang (王信友) wrote:
>>> This patch merely aligns with the approach of SDB mode
>>> and does not involve the flow of scsi_done. Besides,
>>> I don't see any issue with concurrency between
>>> ufshcd_abort_one() calling ufshcd_try_to_abort_task()
>>> and scsi_done(). Can you point out the specific flow where
>>> the problem occurs? If there is one, shouldn't SDB mode
>>> have the same issue?
>>
>> Hi Peter,
>>
>> Correct, my comment applies to both legacy mode and MCQ mode. From the
>> section in the UFS standard about ABORT TASK: "A response of FUNCTION
>> COMPLETE shall indicate that the command was aborted or was not in the
>> task set." In other words, if a command completes just before
>> ufshcd_try_to_abort_task() calls ufshcd_issue_tm_cmd(), then
>> ufshcd_try_to_abort_task() will call ufshcd_clear_cmd() for a command
>> that has already completed. In legacy mode, this call will succeed.
>>
> 
> Hi Bart,
> 
> Yes, the legacy SDB mode is protected by the outstanding_lock.
> 
> 
>> Hence, both ufshcd_compl_one_cqe() and ufshcd_abort_all() will call
>> ufshcd_release(hba). This will cause hba->clk_gating.active_reqs to be
>> decremented twice instead of only once. Do you agree that this can
>> happen and that it should be prevented?
>>
>> Thanks,
>>
>> Bart.
> 
> Sorry, I still don't understand why both ufshcd_compl_one_cqe()
> and ufshcd_abort_all() will call ufshcd_release(hba)?
> Because I have already removed the ufshcd_release_scsi_cmd from
> ufshcd_abort_one, so the command won't be released immediately
> after ufshcd_try_to_abort_task succeeds. Instead, it will wait
> until the CQ Entry comes in before releasing. And since it is
> protected by the cq_lock, it should only release once, right?

Hi Peter,

I think what you wrote applies to MCQ mode only. In my previous email
I explicitly referred to "legacy mode" (SDB mode). To summarize that
email: in legacy mode, ufshcd_release() can be called twice for the
same command although it should be called only once. Here are the
possible solutions I see:
* Add a function to the SCSI core for setting SCMD_STATE_COMPLETE. This
   may be controversial since no other SCSI LLD needs this functionality.
* Change the error handling approach in the UFS driver to the approach
   that other SCSI LLDs use: instead of using queue_work() to activate
   the error handler, call scsi_schedule_eh(). This will cause the
   error handler to be activated later, namely after all pending
   commands have timed out, instead of aborting any pending commands
   first.
* Add a variant of scsi_schedule_eh() to the SCSI core that accelerates
   error handling by calling scsi_timeout() on all pending commands.
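
To make the double release and the idea behind the first option concrete,
here is a minimal user-space sketch. It is an illustration under stated
assumptions, not UFS driver or SCSI core code: struct model_hba, struct
model_cmd, and all model_* functions are hypothetical stand-ins, and the
"completed" flag only plays the role that SCMD_STATE_COMPLETE would play.
The two code paths are run back to back rather than on separate threads,
which is enough to show the counter going wrong.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the part of the HBA state that matters here. */
struct model_hba {
	int active_reqs;	/* models hba->clk_gating.active_reqs */
};

/* Hypothetical stand-in for a command with a "completed" state bit. */
struct model_cmd {
	atomic_bool completed;	/* plays the role of SCMD_STATE_COMPLETE */
};

static void model_hold(struct model_hba *hba)
{
	hba->active_reqs++;
}

static void model_release(struct model_hba *hba)
{
	hba->active_reqs--;
}

/*
 * Unguarded case: the completion path releases the command, and the
 * abort path releases it again because ABORT TASK reported FUNCTION
 * COMPLETE for a command that was no longer in the task set.
 */
static int unguarded_race(void)
{
	struct model_hba hba = { 0 };

	model_hold(&hba);	/* command issued */
	model_release(&hba);	/* completion path */
	model_release(&hba);	/* abort path releases a second time */
	return hba.active_reqs;
}

/* Guarded case: only the path that sets the flag first may release. */
static bool model_finish_once(struct model_hba *hba, struct model_cmd *cmd)
{
	if (atomic_exchange(&cmd->completed, true))
		return false;	/* the other path already finished it */
	assert(hba->active_reqs > 0);
	model_release(hba);
	return true;
}

static int guarded_race(void)
{
	struct model_hba hba = { 0 };
	struct model_cmd cmd = { false };

	model_hold(&hba);			/* command issued */
	model_finish_once(&hba, &cmd);		/* completion path wins */
	model_finish_once(&hba, &cmd);		/* abort path is a no-op */
	return hba.active_reqs;
}
```

In this model unguarded_race() leaves the counter at -1, while
guarded_race() leaves it at 0 because the second caller sees the flag
already set and skips the release.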

Thanks,

Bart.

