From: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
To: Libo Chen <clbchenlibo.chen@huawei.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	yrl.pp-manager.tt@hitachi.com
Subject: Re: Re: [PATCH v2] scsi: Add 'retry_timeout' to avoid infinite command retry
Date: Tue, 11 Feb 2014 10:33:44 +0900
Message-ID: <52F97DF8.2040602@hitachi.com>
In-Reply-To: <52F47A03.2010805@huawei.com>


(2014/02/07 15:15), Libo Chen wrote:
> On 2014/2/7 13:46, James Bottomley wrote:
>> On Fri, 2014-02-07 at 09:22 +0900, Eiichi Tsukata wrote:
>>> Currently, scsi error handling in scsi_io_completion() tries to
>>> unconditionally requeue scsi command when device keeps some error state.
>>> For example, UNIT_ATTENTION causes infinite retry with
>>> action == ACTION_RETRY.
>>> This is because retryable errors are thought to be temporary and the scsi
>>> device will soon recover from those errors. Normally, such retry policy is
>>> appropriate because the device will soon recover from temporary error state.
>>
>>
>>> But there is no guarantee that device is able to recover from error state
>>> immediately. Actually, we've experienced an infinite retry on some hardware.
>>> Therefore a hardware error can result in an infinite command retry loop.
>> Could you please add an analysis of the actual failure; which devices
>> and what conditions.
>>
>
> same question, can you explain?


I'm afraid I can't disclose the device name, because I have not yet
confirmed with the device manufacturer that the device is responsible
for the problem. However, from the limited evidence, it seems that the
SCSI command was retried with UNIT_ATTENTION.

In the previous thread, Ewan reported that a storage array returned a
CHECK_CONDITION with invalid sense data, which caused the command to be
retried indefinitely:
https://lkml.org/lkml/2013/8/20/498

So we should avoid infinite retries in the SCSI mid-layer, rather than
in each SCSI LLDD.

>>> This patch adds 'retry_timeout' sysfs attribute which limits the retry time
>>> of each scsi command. This attribute is located in scsi sysfs directory
>>> for example "/sys/bus/scsi/devices/X:X:X:X/" and value is in seconds.
>>> Once scsi command retry time is longer than this timeout,
>>> the command is treated as failure. 'retry_timeout' is set to '0' by default
>>> which means no timeout set.
>> Don't do this ... you're mixing a feature (which you'd need to justify)
>> with an apparent bug fix.
>>
>> Once you dump all the complexity, I think the patch boils down to a
>> simple check before the action switch in scsi_io_completion():
>>
>> 	if (action !=  ACTION_FAIL &&
>> 	    time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
>> 		action = ACTION_FAIL;
>> 		description = "command timed out";
>> 	}
>>

Sounds good!
Thanks for the much simpler code. It is enough to fix the bug.
I'll resend the patch soon with the above code.
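
To illustrate the check quoted above, here is a minimal userspace sketch
of the same deadline test. It is not the kernel code: time_before_ul(),
limit_retry(), the enum values and retry_limit_example() are
hypothetical stand-ins chosen for this example, and how wait_for is
computed is left as an assumption (it is simply passed in as a number
of ticks). The point is that the comparison stays correct even when the
tick counter wraps around.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's time_before(a, b): true when
 * tick value a is earlier than tick value b. The signed difference
 * keeps the result correct across counter wraparound, as long as the
 * two values are less than half the counter range apart.
 */
static bool time_before_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical subset of the completion actions, for illustration only. */
enum action { ACTION_FAIL, ACTION_RETRY };

/*
 * Turn any non-failing action into ACTION_FAIL once the deadline
 * (allocated_at + wait_for) lies strictly before the current tick.
 */
static enum action limit_retry(enum action action,
			       unsigned long allocated_at,
			       unsigned long wait_for,
			       unsigned long now)
{
	if (action != ACTION_FAIL &&
	    time_before_ul(allocated_at + wait_for, now))
		return ACTION_FAIL;
	return action;
}

/* Usage example: call this from a test program's main(). */
void retry_limit_example(void)
{
	/* Deadline is tick 1500: still retrying at 1400 and at 1500. */
	assert(limit_retry(ACTION_RETRY, 1000, 500, 1400) == ACTION_RETRY);
	assert(limit_retry(ACTION_RETRY, 1000, 500, 1500) == ACTION_RETRY);
	/* One tick past the deadline, the retry becomes a failure. */
	assert(limit_retry(ACTION_RETRY, 1000, 500, 1501) == ACTION_FAIL);
	/* The deadline wraps past ULONG_MAX to tick 89. */
	assert(limit_retry(ACTION_RETRY, ULONG_MAX - 10, 100,
			   ULONG_MAX - 5) == ACTION_RETRY);
	assert(limit_retry(ACTION_RETRY, ULONG_MAX - 10, 100, 200) ==
	       ACTION_FAIL);
}
```

Because the deadline is measured from when the command was allocated,
the total time spent retrying is bounded no matter how many times the
command is requeued.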

Eiichi.

>> James


Thread overview: 7+ messages
     [not found] <1391579254-26204-1-git-send-email-eiichi.tsukata.xh@hitachi.com>
2014-02-05  5:47 ` [REVIEW PATCH] scsi: Add 'retry_timeout' to avoid infinite command retry Eiichi Tsukata
2014-02-05 16:55   ` James Bottomley
2014-02-06  4:11     ` Eiichi Tsukata
2014-02-07  0:22       ` [PATCH v2] " Eiichi Tsukata
2014-02-07  5:46         ` James Bottomley
2014-02-07  6:15           ` Libo Chen
2014-02-11  1:33             ` Eiichi Tsukata [this message]
