public inbox for linux-scsi@vger.kernel.org
From: Bart Van Assche <bvanassche@acm.org>
To: Mike Christie <michael.christie@oracle.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-scsi@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Christoph Hellwig <hch@lst.de>, Ming Lei <ming.lei@redhat.com>,
	John Garry <john.garry@huawei.com>,
	Hannes Reinecke <hare@suse.de>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>
Subject: Re: [PATCH v3 3/8] scsi: core: Support failing requests while recovering
Date: Mon, 3 Oct 2022 12:20:10 -0700
Message-ID: <d734148d-ca9f-70e7-e180-77097c2fc0b2@acm.org>
In-Reply-To: <9160ed4f-95a8-42ff-366e-f52d7816d8c3@oracle.com>

On 10/3/22 10:27, Mike Christie wrote:
> On 9/29/22 5:00 PM, Bart Van Assche wrote:
>> The current behavior for SCSI commands submitted while error recovery
>> is ongoing is to retry command submission after error recovery has
>> finished. See also the scsi_host_in_recovery() check in
>> scsi_host_queue_ready(). Add support for failing SCSI commands while
>> host recovery is in progress. This functionality will be used to fix a
>> deadlock in the UFS driver.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Mike Christie <michael.christie@oracle.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   drivers/scsi/scsi_lib.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 473d9403f0c1..ecd2ce3815df 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>>   				   struct scsi_device *sdev,
>>   				   struct scsi_cmnd *cmd)
>>   {
>> -	if (scsi_host_in_recovery(shost))
>> -		return 0;
>> -
>>   	if (atomic_read(&shost->host_blocked) > 0) {
>>   		if (scsi_host_busy(shost) > 0)
>>   			goto starved;
>> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>   	ret = BLK_STS_RESOURCE;
>>   	if (!scsi_target_queue_ready(shost, sdev))
>>   		goto out_put_budget;
>> +	if (unlikely(scsi_host_in_recovery(shost))) {
>> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
>> +			ret = BLK_STS_OFFLINE;
>> +		goto out_dec_target_busy;
>> +	}
>>   	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>>   		goto out_dec_target_busy;
>>   
> 
> This might add a regression to dm-multipath or it at least makes
> the behavior difficult to understand for users and support people.
> 
> If there is a transport issue, the cmd times out, and the abort also
> times out or fails, then we start SCSI EH recovery. When the
> driver/transport class figures out it's a transport issue, we block
> the scsi_device. So before this patch we would requeue the cmd and then
> wait for the fast_io_fail (FC) or replacement_timeout (iSCSI) timer to
> fire before failing commands upwards and failing paths.
> 
> With this patch we can end up doing one of three things, depending on timing:
> 1. If the cmd hits the code above, we fail the command and cause a
> path failure.
> 
> 2. If the driver/transport class blocks the scsi_device first, we hit the
> scsi_device state check in scsi_queue_rq and, as before, do not fail the path.
> 
> 3. With or without your patch, if dm_mq_queue_rq calls the busy callback
> first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery),
> it might see all the paths in recovery. It considers this a temporary
> condition and requeues cmds. So in this case we do not fail the path.
> 
> I'm not sure what the correct fix is. Maybe don't fast-fail
> REQ_FAILFAST_TRANSPORT requests above, or maybe add a new FAILFAST type
> that you could use?

Hi Mike,

I will look into introducing a new request flag for this purpose.

Thanks,

Bart.
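
For readers following the thread, the timing dependence described in the review can be sketched as a small userspace model. This is not kernel code: the MODEL_* names, the struct, and model_queue_rq() are hypothetical stand-ins that only echo the kernel's BLK_STS_* and REQ_FAILFAST_* identifiers, and the ordering of the checks follows the review's description rather than a reading of any particular kernel tree.

```c
#include <stdbool.h>

/*
 * Userspace model only. The enum and flag names echo the kernel's
 * BLK_STS_* and REQ_FAILFAST_* identifiers, but the numeric values
 * here are arbitrary and nothing below is kernel code.
 */
enum model_status {
	MODEL_STS_OK,        /* command would be dispatched */
	MODEL_STS_RESOURCE,  /* request is requeued and retried later */
	MODEL_STS_OFFLINE,   /* request fails upward (a path failure, per the review) */
};

#define MODEL_FAILFAST_DEV        (1u << 0)
#define MODEL_FAILFAST_TRANSPORT  (1u << 1)
#define MODEL_FAILFAST_DRIVER     (1u << 2)
#define MODEL_FAILFAST_MASK \
	(MODEL_FAILFAST_DEV | MODEL_FAILFAST_TRANSPORT | MODEL_FAILFAST_DRIVER)

struct model_state {
	bool sdev_blocked;      /* transport class has already blocked the device */
	bool host_in_recovery;  /* SCSI EH is running for this host */
};

/* Decision for one request, in the order the review describes. */
static enum model_status model_queue_rq(const struct model_state *s,
					unsigned int cmd_flags)
{
	/* Case 2: the device state check wins and the request is requeued. */
	if (s->sdev_blocked)
		return MODEL_STS_RESOURCE;

	/* Case 1: the hunk added by the patch. */
	if (s->host_in_recovery) {
		if (cmd_flags & MODEL_FAILFAST_MASK)
			return MODEL_STS_OFFLINE;
		return MODEL_STS_RESOURCE;
	}

	return MODEL_STS_OK;
}
```

With host_in_recovery set and sdev_blocked clear, a request carrying MODEL_FAILFAST_TRANSPORT comes back as MODEL_STS_OFFLINE. The same request submitted after the device has been blocked comes back as MODEL_STS_RESOURCE. Which of the two a multipath user sees therefore depends on whether the request or the transport class gets there first, which is the concern raised above.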
