From: Can Guo <cang@codeaurora.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org,
hongwus@codeaurora.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Stanley Chu <stanley.chu@mediatek.com>,
Alim Akhtar <alim.akhtar@samsung.com>,
Avri Altman <avri.altman@wdc.com>,
"James E.J. Bottomley" <jejb@linux.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Matthias Brugger <matthias.bgg@gmail.com>,
Bean Huo <beanhuo@micron.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Kiwoong Kim <kwmad.kim@samsung.com>,
Satya Tangirala <satyat@google.com>,
open list <linux-kernel@vger.kernel.org>,
"moderated list:ARM/Mediatek SoC support"
<linux-arm-kernel@lists.infradead.org>,
"moderated list:ARM/Mediatek SoC support"
<linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v1 2/3] scsi: ufs: Optimize host lock on transfer requests send/compl paths
Date: Wed, 23 Jun 2021 10:04:39 +0800 [thread overview]
Message-ID: <e8de57b920c9246f5a41aa44c5a4fc36@codeaurora.org> (raw)
In-Reply-To: <a29164e1-ab7d-6dbc-0fb9-029f203de735@acm.org>
Hi Bart,
On 2021-06-17 10:49, Bart Van Assche wrote:
> On 5/24/21 1:36 AM, Can Guo wrote:
>> @@ -2688,6 +2705,43 @@ static int ufshcd_queuecommand(struct Scsi_Host
>> *host, struct scsi_cmnd *cmd)
>> + case UFSHCD_STATE_EH_SCHEDULED_FATAL:
>> + /*
>> + * pm_runtime_get_sync() is used at error handling preparation
>> + * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
>> + * PM ops, it can never be finished if we let SCSI layer keep
>> + * retrying it, which gets err handler stuck forever. Neither
>> + * can we let the scsi cmd pass through, because UFS is in bad
>> + * state, the scsi cmd may eventually time out, which will get
>> + * err handler blocked for too long. So, just fail the scsi cmd
>> + * sent from PM ops, err handler can recover PM error anyways.
>> + */
>> + if (hba->pm_op_in_progress) {
>> + hba->force_reset = true;
>> + set_host_byte(cmd, DID_BAD_TARGET);
>> + cmd->scsi_done(cmd);
>> + goto out;
>> + }
>> + fallthrough;
>
> Hi Can,
>
> I know that this patch only moves the above code and that the above
> code
> has not been introduced by this patch. Anyway, is my understanding
> correct that ufshcd_err_handler() can change the host controller state
> from UFSHCD_STATE_EH_SCHEDULED_FATAL into UFSHCD_STATE_RESET and next
> into UFSHCD_STATE_OPERATIONAL? If so, if the above code completes a
> READ
> with status DID_BAD_TARGET and if recovery by the error handler
> succeeds, will that cause the filesystem above the UFS driver to change
> into read-only mode? If the above code completes a WRITE with status
> DID_BAD_TARGET, will that cause data corruption? Is there any other
> solution to prevent data corruption than merging the
> UFSHCD_STATE_EH_SCHEDULED_FATAL and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL
> back into a single state and changing the ufshcd_rpm_get_sync(hba) call
> in ufshcd_err_handling_prepare() into a pm_runtime_get_noresume() call?
>
Here, when hba->pm_op_in_progress is true, there cannot be READ or WRITE
command since hba is resuming or suspending. When fatal erorr happens,
the
DID_BAD_TARGET above is intend to let the SSU (or whatever PM requests
blocking suspend/resume) fail fast (neither returning HOST_BUSY nor
letting
the cmd pass through can achieve such purpose), so that error handling
prepare
won't get stuck [1] when it calls
lock_system_sleep()
runtime_pm_get_sync()
The reason why I split UFSHCD_STATE_EH_SCHEDULED to
UFSHCD_STATE_EH_SCHEDULED_FATAL
and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL is that
1. For non-fatal errors, HW can recover by itself, so when host state is
UFSHCD_STATE_EH_SCHEDULED_NON_FATAL, cmd can still passthrough.
2. When non-fatal error (LINE-RESET for example) happens, error handler
only
needs to do a power mode transition without a full reset. If we only
have one
state, returning HOST_BUSY will get error handling prepare stuck [1],
while
fast failing SSU cmds shall make error handler do a full reset (which
goes
too far for non-fatal errors).
Thanks,
Can Guo.
> Thanks,
>
> Bart.
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek
WARNING: multiple messages have this Message-ID (diff)
From: Can Guo <cang@codeaurora.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: asutoshd@codeaurora.org, nguyenb@codeaurora.org,
hongwus@codeaurora.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Stanley Chu <stanley.chu@mediatek.com>,
Alim Akhtar <alim.akhtar@samsung.com>,
Avri Altman <avri.altman@wdc.com>,
"James E.J. Bottomley" <jejb@linux.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Matthias Brugger <matthias.bgg@gmail.com>,
Bean Huo <beanhuo@micron.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Kiwoong Kim <kwmad.kim@samsung.com>,
Satya Tangirala <satyat@google.com>,
open list <linux-kernel@vger.kernel.org>,
"moderated list:ARM/Mediatek SoC support"
<linux-arm-kernel@lists.infradead.org>,
"moderated list:ARM/Mediatek SoC support"
<linux-mediatek@lists.infradead.org>
Subject: Re: [PATCH v1 2/3] scsi: ufs: Optimize host lock on transfer requests send/compl paths
Date: Wed, 23 Jun 2021 10:04:39 +0800 [thread overview]
Message-ID: <e8de57b920c9246f5a41aa44c5a4fc36@codeaurora.org> (raw)
In-Reply-To: <a29164e1-ab7d-6dbc-0fb9-029f203de735@acm.org>
Hi Bart,
On 2021-06-17 10:49, Bart Van Assche wrote:
> On 5/24/21 1:36 AM, Can Guo wrote:
>> @@ -2688,6 +2705,43 @@ static int ufshcd_queuecommand(struct Scsi_Host
>> *host, struct scsi_cmnd *cmd)
>> + case UFSHCD_STATE_EH_SCHEDULED_FATAL:
>> + /*
>> + * pm_runtime_get_sync() is used at error handling preparation
>> + * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
>> + * PM ops, it can never be finished if we let SCSI layer keep
>> + * retrying it, which gets err handler stuck forever. Neither
>> + * can we let the scsi cmd pass through, because UFS is in bad
>> + * state, the scsi cmd may eventually time out, which will get
>> + * err handler blocked for too long. So, just fail the scsi cmd
>> + * sent from PM ops, err handler can recover PM error anyways.
>> + */
>> + if (hba->pm_op_in_progress) {
>> + hba->force_reset = true;
>> + set_host_byte(cmd, DID_BAD_TARGET);
>> + cmd->scsi_done(cmd);
>> + goto out;
>> + }
>> + fallthrough;
>
> Hi Can,
>
> I know that this patch only moves the above code and that the above
> code
> has not been introduced by this patch. Anyway, is my understanding
> correct that ufshcd_err_handler() can change the host controller state
> from UFSHCD_STATE_EH_SCHEDULED_FATAL into UFSHCD_STATE_RESET and next
> into UFSHCD_STATE_OPERATIONAL? If so, if the above code completes a
> READ
> with status DID_BAD_TARGET and if recovery by the error handler
> succeeds, will that cause the filesystem above the UFS driver to change
> into read-only mode? If the above code completes a WRITE with status
> DID_BAD_TARGET, will that cause data corruption? Is there any other
> solution to prevent data corruption than merging the
> UFSHCD_STATE_EH_SCHEDULED_FATAL and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL
> back into a single state and changing the ufshcd_rpm_get_sync(hba) call
> in ufshcd_err_handling_prepare() into a pm_runtime_get_noresume() call?
>
Here, when hba->pm_op_in_progress is true, there cannot be READ or WRITE
command since hba is resuming or suspending. When fatal erorr happens,
the
DID_BAD_TARGET above is intend to let the SSU (or whatever PM requests
blocking suspend/resume) fail fast (neither returning HOST_BUSY nor
letting
the cmd pass through can achieve such purpose), so that error handling
prepare
won't get stuck [1] when it calls
lock_system_sleep()
runtime_pm_get_sync()
The reason why I split UFSHCD_STATE_EH_SCHEDULED to
UFSHCD_STATE_EH_SCHEDULED_FATAL
and UFSHCD_STATE_EH_SCHEDULED_NON_FATAL is that
1. For non-fatal errors, HW can recover by itself, so when host state is
UFSHCD_STATE_EH_SCHEDULED_NON_FATAL, cmd can still passthrough.
2. When non-fatal error (LINE-RESET for example) happens, error handler
only
needs to do a power mode transition without a full reset. If we only
have one
state, returning HOST_BUSY will get error handling prepare stuck [1],
while
fast failing SSU cmds shall make error handler do a full reset (which
goes
too far for non-fatal errors).
Thanks,
Can Guo.
> Thanks,
>
> Bart.
next prev parent reply other threads:[~2021-06-23 2:04 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-24 8:36 [PATCH v1 0/3] Optimize host lock on TR send/compl paths and utilize UTRLCNR Can Guo
2021-05-24 8:36 ` [PATCH v1 1/3] scsi: ufs: Remove a redundant command completion logic in error handler Can Guo
2021-05-24 16:43 ` Bart Van Assche
2021-05-25 4:15 ` Stanley Chu
2021-05-31 7:14 ` Bean Huo
2021-05-24 8:36 ` [PATCH v1 2/3] scsi: ufs: Optimize host lock on transfer requests send/compl paths Can Guo
2021-05-24 8:36 ` Can Guo
2021-05-24 8:36 ` Can Guo
2021-05-24 11:25 ` kernel test robot
2021-05-24 11:25 ` kernel test robot
2021-06-08 17:53 ` Nathan Chancellor
2021-06-08 17:53 ` Nathan Chancellor
2021-06-09 1:01 ` Can Guo
2021-06-09 1:01 ` Can Guo
2021-05-24 20:10 ` Bart Van Assche
2021-05-24 20:10 ` Bart Van Assche
2021-05-24 20:10 ` Bart Van Assche
2021-05-25 1:34 ` Asutosh Das (asd)
2021-05-25 1:34 ` Asutosh Das (asd)
2021-05-25 8:24 ` Avri Altman
2021-05-25 8:24 ` Avri Altman
2021-05-25 8:24 ` Avri Altman
2021-05-28 7:30 ` Avri Altman
2021-05-28 7:30 ` Avri Altman
2021-05-28 7:30 ` Avri Altman
2021-06-02 21:18 ` Asutosh Das (asd)
2021-06-02 21:18 ` Asutosh Das (asd)
2021-05-25 1:40 ` Can Guo
2021-05-25 1:40 ` Can Guo
2021-05-25 16:40 ` Bart Van Assche
2021-05-25 16:40 ` Bart Van Assche
2021-05-25 16:40 ` Bart Van Assche
2021-05-31 16:04 ` Bean Huo
2021-05-31 16:04 ` Bean Huo
2021-05-31 16:04 ` Bean Huo
2021-06-02 2:14 ` Can Guo
2021-06-02 2:14 ` Can Guo
2021-06-03 0:18 ` Bart Van Assche
2021-06-03 0:18 ` Bart Van Assche
2021-06-03 0:18 ` Bart Van Assche
2021-06-03 2:54 ` Stanley Chu
2021-06-03 2:54 ` Stanley Chu
2021-06-03 2:54 ` Stanley Chu
2021-06-04 1:49 ` Can Guo
2021-06-04 1:49 ` Can Guo
2021-06-17 2:49 ` Bart Van Assche
2021-06-17 2:49 ` Bart Van Assche
2021-06-17 2:49 ` Bart Van Assche
2021-06-23 2:04 ` Can Guo [this message]
2021-06-23 2:04 ` Can Guo
2021-06-28 22:58 ` Bart Van Assche
2021-06-28 22:58 ` Bart Van Assche
2021-06-28 22:58 ` Bart Van Assche
2021-06-29 5:41 ` Can Guo
2021-06-29 5:41 ` Can Guo
2021-07-01 15:57 ` Bart Van Assche
2021-07-01 15:57 ` Bart Van Assche
2021-07-01 15:57 ` Bart Van Assche
2021-05-24 8:36 ` [PATCH v1 3/3] scsi: ufs: Utilize Transfer Request List Completion Notification Register Can Guo
2021-05-24 8:36 ` Can Guo
2021-05-24 8:36 ` Can Guo
2021-05-31 16:05 ` Bean Huo
2021-05-31 16:05 ` Bean Huo
2021-05-31 16:05 ` Bean Huo
2021-06-03 2:54 ` Stanley Chu
2021-06-03 2:54 ` Stanley Chu
2021-06-03 2:54 ` Stanley Chu
2021-06-16 3:48 ` [PATCH v1 0/3] Optimize host lock on TR send/compl paths and utilize UTRLCNR Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e8de57b920c9246f5a41aa44c5a4fc36@codeaurora.org \
--to=cang@codeaurora.org \
--cc=adrian.hunter@intel.com \
--cc=alim.akhtar@samsung.com \
--cc=asutoshd@codeaurora.org \
--cc=avri.altman@wdc.com \
--cc=beanhuo@micron.com \
--cc=bvanassche@acm.org \
--cc=hongwus@codeaurora.org \
--cc=jaegeuk@kernel.org \
--cc=jejb@linux.ibm.com \
--cc=kernel-team@android.com \
--cc=kwmad.kim@samsung.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=matthias.bgg@gmail.com \
--cc=nguyenb@codeaurora.org \
--cc=satyat@google.com \
--cc=stanley.chu@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.