* [PATCH] scsi: ata: don't reset three times if device is offline for SAS host
From: chenxiang @ 2018-01-24 13:20 UTC (permalink / raw)
To: martin.petersen, tj; +Cc: linux-ide, linux-scsi, linuxarm, chenxiang
ata_eh_reset() retries a failed reset up to three times for a SATA disk.
For drivers that go through libsas, the reset eventually calls
sas_ata_hard_reset(), which returns -ENODEV when the device is gone.
EH still retries the reset three times for the offline device, and the
whole process takes a long time:
[11248.344323] ata13.00: status: { ERR }
[11248.344324] ata13.00: error: { ABRT }
[11248.344327] ata13: hard resetting link
[11248.503557] sas: ata: ex 500e004aaaaaaa1f phy02:U:A attached:0000000000000000 (no device)
[11249.359524] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
[11249.365692] ata13: reset failed (errno=-19), retrying in 9 secs
[11258.451402] ata13: hard resetting link
[11259.411508] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
[11259.417683] ata13: reset failed (errno=-19), retrying in 10 secs
[11268.695401] ata13: hard resetting link
[11269.699513] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
[11269.705689] ata13: reset failed (errno=-19), retrying in 34 secs
[11304.275393] ata13: hard resetting link
[11305.283516] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
[11305.289692] ata13: reset failed, giving up
[11305.293785] ata13.00: disabled
There is no need to retry the reset three times in this scenario, so add
a check that stops retrying once the device is known to be offline.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
---
drivers/ata/libata-eh.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 11c3137..23a8946 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3032,6 +3032,15 @@ int ata_eh_reset(struct ata_link *link, int classify,
goto out;
}
+ if (rc == -ENODEV && (ap->flags & ATA_FLAG_SAS_HOST)) {
+ ata_link_warn(failed_link,
+ "reset failed (errno=%d), device is offline for SAS host\n",
+ rc);
+ if (ata_is_host_link(link))
+ ata_eh_thaw_port(ap);
+ goto out;
+ }
+
now = jiffies;
if (time_before(now, deadline)) {
unsigned long delta = deadline - now;
--
1.9.1
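To put a rough number on the time saved, here is a small userspace model of the retry loop, not kernel code. The slot lengths (10, 10 and 35 seconds) are read off the log in this message and are an assumption of the sketch; `eh_blocked_secs` and `reset_gone` are hypothetical names that do not exist in libata.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed per-attempt time slots in seconds, taken from the log in this
 * thread ("retrying in 9/10/34 secs"), not copied from libata. */
static const unsigned int slot_secs[] = { 10, 10, 35 };
#define NUM_SLOTS (sizeof(slot_secs) / sizeof(slot_secs[0]))

/* Hypothetical stand-in for a reset callback whose device is gone. */
static int reset_gone(void)
{
	return -ENODEV;
}

/* Return how many seconds the model spends waiting between attempts.
 * With stop_on_enodev set, -ENODEV ends recovery immediately. */
static unsigned int eh_blocked_secs(int (*do_reset)(void), int stop_on_enodev)
{
	unsigned int waited = 0;
	size_t try;

	assert(do_reset != NULL);

	for (try = 0; try <= NUM_SLOTS; try++) {
		int rc = do_reset();

		if (rc == 0)
			break;			/* reset succeeded */
		if (rc == -ENODEV && stop_on_enodev)
			break;			/* device is gone, do not retry */
		if (try == NUM_SLOTS)
			break;			/* out of retries, give up */
		waited += slot_secs[try];	/* wait out the rest of the slot */
	}
	return waited;
}
```

In this model a device that always returns -ENODEV waits 55 seconds without the early exit and 0 seconds with it, which is in line with the roughly one minute seen in the log.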
* Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host
From: Tejun Heo @ 2018-02-12 16:51 UTC (permalink / raw)
To: chenxiang; +Cc: martin.petersen, linux-ide, linux-scsi, linuxarm
Hello,
On Wed, Jan 24, 2018 at 09:20:25PM +0800, chenxiang wrote:
> ata_eh_reset() retries a failed reset up to three times for a SATA disk.
> For drivers that go through libsas, the reset eventually calls
> sas_ata_hard_reset(), which returns -ENODEV when the device is gone.
> EH still retries the reset three times for the offline device, and the
> whole process takes a long time:
>
> [11248.344323] ata13.00: status: { ERR }
> [11248.344324] ata13.00: error: { ABRT }
> [11248.344327] ata13: hard resetting link
> [11248.503557] sas: ata: ex 500e004aaaaaaa1f phy02:U:A attached:0000000000000000 (no device)
> [11249.359524] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
> [11249.365692] ata13: reset failed (errno=-19), retrying in 9 secs
> [11258.451402] ata13: hard resetting link
> [11259.411508] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
> [11259.417683] ata13: reset failed (errno=-19), retrying in 10 secs
> [11268.695401] ata13: hard resetting link
> [11269.699513] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
> [11269.705689] ata13: reset failed (errno=-19), retrying in 34 secs
> [11304.275393] ata13: hard resetting link
> [11305.283516] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
> [11305.289692] ata13: reset failed, giving up
> [11305.293785] ata13.00: disabled
>
> There is no need to retry the reset three times in this scenario, so add
> a check that stops retrying once the device is known to be offline.
I'm a bit reluctant in changing this per-driver. Does this actually
hurt something?
Thanks.
--
tejun
* Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host
From: chenxiang (M) @ 2018-02-13 1:44 UTC (permalink / raw)
To: Tejun Heo; +Cc: martin.petersen, linux-ide, linux-scsi, linuxarm
Hi Tejun,
On 2018/2/13 0:51, Tejun Heo wrote:
> Hello,
>
> On Wed, Jan 24, 2018 at 09:20:25PM +0800, chenxiang wrote:
>> ata_eh_reset() retries a failed reset up to three times for a SATA disk.
>> For drivers that go through libsas, the reset eventually calls
>> sas_ata_hard_reset(), which returns -ENODEV when the device is gone.
>> EH still retries the reset three times for the offline device, and the
>> whole process takes a long time:
>>
>> [11248.344323] ata13.00: status: { ERR }
>> [11248.344324] ata13.00: error: { ABRT }
>> [11248.344327] ata13: hard resetting link
>> [11248.503557] sas: ata: ex 500e004aaaaaaa1f phy02:U:A attached:0000000000000000 (no device)
>> [11249.359524] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
>> [11249.365692] ata13: reset failed (errno=-19), retrying in 9 secs
>> [11258.451402] ata13: hard resetting link
>> [11259.411508] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
>> [11259.417683] ata13: reset failed (errno=-19), retrying in 10 secs
>> [11268.695401] ata13: hard resetting link
>> [11269.699513] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
>> [11269.705689] ata13: reset failed (errno=-19), retrying in 34 secs
>> [11304.275393] ata13: hard resetting link
>> [11305.283516] sas: ata13: end_device-1:0:2: reset failed (errno=-19)
>> [11305.289692] ata13: reset failed, giving up
>> [11305.293785] ata13.00: disabled
>>
>> There is no need to retry the reset three times in this scenario, so add
>> a check that stops retrying once the device is known to be offline.
> I'm a bit reluctant in changing this per-driver. Does this actually
> hurt something?
I think every driver that uses libsas has the same issue. Recovery takes
about one minute even though the device is already gone, so it is useless
in this scenario. All normal I/O is blocked while EH is running, so normal
I/O stays blocked for about one more minute, which users do not want.
sas_ata_hard_reset() returns -ENODEV, meaning the device is gone, in two
situations:
- the LLDD returns -ENODEV directly from lldd_I_T_nexus_reset;
- smp_ata_check_ready sends SMP DISCOVER to check the local phy and finds
that the device is gone.
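The two cases can be pictured with a simplified userspace model. This is only a sketch: `struct model_dev` and `model_hard_reset` are hypothetical names, not libsas API, and the real sas_ata_hard_reset() does more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device behind libsas; not a libsas structure. */
struct model_dev {
	int  lldd_reset_rc;	/* what the LLDD's I_T nexus reset returns */
	bool phy_attached;	/* what an SMP DISCOVER on the phy would report */
};

/* Return -ENODEV in either of the two "device is gone" cases, else 0. */
static int model_hard_reset(const struct model_dev *dev)
{
	assert(dev != NULL);

	/* Case 1: the LLDD itself reports that the device is gone. */
	if (dev->lldd_reset_rc == -ENODEV)
		return -ENODEV;

	/* Case 2: the readiness check finds nothing attached to the phy. */
	if (!dev->phy_attached)
		return -ENODEV;

	return 0;
}
```

Either path ends in the same errno, which is why the patch keys on rc == -ENODEV.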
> Thanks.
>
* Re: [PATCH] scsi: ata: don't reset three times if device is offline for SAS host
From: Tejun Heo @ 2018-02-13 14:27 UTC (permalink / raw)
To: chenxiang (M); +Cc: martin.petersen, linux-ide, linux-scsi, linuxarm
Hello,
On Tue, Feb 13, 2018 at 09:44:53AM +0800, chenxiang (M) wrote:
> I think every driver that uses libsas has the same issue. Recovery takes
> about one minute even though the device is already gone, so it is useless
> in this scenario. All normal I/O is blocked while EH is running, so normal
> I/O stays blocked for about one more minute, which users do not want.
Right, it'd block other devices sharing the port. Doesn't sas map
each ata device to its own port tho?
> sas_ata_hard_reset() returns -ENODEV, meaning the device is gone, in two
> situations:
> - the LLDD returns -ENODEV directly from lldd_I_T_nexus_reset;
> - smp_ata_check_ready sends SMP DISCOVER to check the local phy and finds
> that the device is gone.
So, if there are real consequences, we can definitely add a way to
short-circuit the recovery logic but let's do that by adding proper
signaling rather than testing for driver type.
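One way to picture such signaling, purely as a userspace sketch: the reset callback reports a verdict and the retry loop acts on that verdict without looking at the host type. The enum, the typedef and every function below are hypothetical and are not proposed libata interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts a reset callback could hand back to EH. */
enum reset_verdict {
	RESET_DONE,		/* link is up again */
	RESET_RETRY,		/* transient failure, trying again may help */
	RESET_DEVICE_GONE,	/* device is offline, retrying is pointless */
};

typedef enum reset_verdict (*reset_fn)(void);

/* Run up to max_tries resets and return how many were attempted.
 * The loop only looks at the verdict, never at the driver type. */
static unsigned int run_resets(reset_fn do_reset, unsigned int max_tries)
{
	unsigned int tries = 0;

	assert(do_reset != NULL);

	while (tries < max_tries) {
		enum reset_verdict v = do_reset();

		tries++;
		if (v != RESET_RETRY)
			break;	/* success or device gone: stop here */
	}
	return tries;
}

/* Stand-in callbacks for the two interesting cases. */
static enum reset_verdict always_gone(void)
{
	return RESET_DEVICE_GONE;
}

static enum reset_verdict always_retry(void)
{
	return RESET_RETRY;
}
```

In this model a callback that reports the device as gone is tried once, while a transient failure still uses every attempt; any driver could raise the same signal.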
Thanks.
--
tejun