From: AlanCui4080 <me@alancui.cc>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-ide@vger.kernel.org
Subject: Re: Default IDENTIFY timeout is 5000ms which is too short for enterprise disks
Date: Fri, 10 Apr 2026 19:24:29 +0800
Message-ID: <12870147.O9o76ZdvQC@alanarchdesktop>
In-Reply-To: <4482b737-1454-48cb-a941-165aa84fb2eb@kernel.org>

On Friday, 10 April 2026 12:19, you wrote:
> I need to check the code again, but no, that's not it. Since on resume we
> revalidate the device, it is ata_dev_reread_id() that needs to be a bit more lax
> on timeouts and repeatedly call ata_dev_read_id() with an increasing timeout as
> defined by ata_eh_identify_timeouts(). That should fix the IDENTIFY issue for drives
> that are slow to respond to that command on resume/while spinning up.
> 
> >> Or add a check/wait for "drive ready"
> >> when resuming, similar to the PUIS handling (power up in standby).
> > 
> > There is tried_spinup in ata_dev_read_id(), but it seems to require the device
> > to return at least an incomplete IDENTIFY response. With a device that never
> > responds while spinning up, is it possible to implement that?
> 
> Ah, yes, forgot about that one. So it is not an option.
> 

Hi, I've tried the following (plus an extra WARN_ONCE() at the
ata_port_is_frozen() check):

---

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 374993031895..0ac0daae33f9 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3902,7 +3902,15 @@ int ata_dev_reread_id(struct ata_device *dev, unsigned int readid_flags)
        int rc;
 
        /* read ID data */
-       rc = ata_dev_read_id(dev, &class, readid_flags, id);
+       int retry_read_id = 3;
+       do {
+               rc = ata_dev_read_id(dev, &class, readid_flags, id);
+               if (rc) {
+                       ata_dev_warn(dev, "retrying ata_dev_read_id(), %d times remainng",
+                               retry_read_id);
+               }
+               retry_read_id--;
+       } while (rc && retry_read_id > 0);
        if (rc)
                return rc;

--

But it reports:

```
[  119.260621] ata2: found unknown device (class 0)
[  119.264620] ata4: found unknown device (class 0)
[  119.415623] ata2: found unknown device (class 0)
[  119.415634] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  119.422627] ata4: found unknown device (class 0)
[  119.422636] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  124.646636] ata4.00: qc timeout after 5000 msecs (cmd 0xec)
[  124.646646] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  124.646648] ata4.00: retrying ata_dev_read_id(), 3 times remainng
[  124.646657] ------------[ cut here ]------------
[  124.646659] ata_port_is_frozen(ap)
[  124.646660] WARNING: drivers/ata/libata-core.c:1549 at ata_exec_internal+0x4e4/0x590, CPU#0: scsi_eh_3/155
...
[  124.646793] Call Trace:
[  124.646795]  <TASK>
[  124.646799]  ata_dev_read_id+0x3b2/0x560
[  124.646805]  ata_dev_reread_id+0x50/0xf0
[  124.646808]  ata_dev_revalidate+0x64/0xd0
[  124.646811]  ata_eh_recover+0xa76/0xf90
[  124.646815]  ? update_load_avg+0x7b/0x740
[  124.646819]  ? __dequeue_entity+0x4f4/0x5d0
[  124.646823]  sata_pmp_error_handler+0x387/0x660
[  124.646827]  ? __flush_work+0x2b1/0x360
[  124.646832]  ahci_error_handler+0x42/0x80
[  124.646836]  ata_scsi_port_error_handler+0x71a/0x950
[  124.646840]  ata_scsi_error+0x95/0xd0
[  124.646843]  scsi_error_handler+0xd1/0x530
[  124.646848]  ? __pfx_scsi_error_handler+0x10/0x10
[  124.646851]  kthread+0xfc/0x240
[  124.646855]  ? __pfx_kthread+0x10/0x10
[  124.646858]  ret_from_fork+0x243/0x280
[  124.646862]  ? __pfx_kthread+0x10/0x10
[  124.646865]  ret_from_fork_asm+0x1a/0x30
[  124.646873]  </TASK>
[  124.646875] ---[ end trace 0000000000000000 ]---
[  124.646877] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[  124.646879] ata4.00: retrying ata_dev_read_id(), 2 times remainng
[  124.646886] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[  124.646888] ata4.00: retrying ata_dev_read_id(), 1 times remainng
[  124.646889] ata4.00: revalidation failed (errno=-5)
[  124.646919] ata2.00: qc timeout after 5000 msecs (cmd 0xec)
[  124.646927] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[  124.646929] ata2.00: retrying ata_dev_read_id(), 3 times remainng
[  124.646937] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[  124.646939] ata2.00: retrying ata_dev_read_id(), 2 times remainng
[  124.646945] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[  124.646947] ata2.00: retrying ata_dev_read_id(), 1 times remainng
[  124.646948] ata2.00: revalidation failed (errno=-5)
[  125.110629] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  125.110649] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  125.146916] ata2.00: configured for UDMA/133
[  125.163102] ata4.00: configured for UDMA/133

```

And yes, libata freezes the port when the qc times out:

```
if (qc->flags & ATA_QCFLAG_ACTIVE) {
	qc->err_mask |= AC_ERR_TIMEOUT;
	ata_port_freeze(ap);
	ata_dev_warn(dev, "qc timeout after %u msecs (cmd 0x%x)\n",
		     timeout, command);
}
```
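
That freeze is consistent with the log above: the first attempt fails with
err_mask=0x4 after 5000 ms, and the next two fail immediately with
err_mask=0x40. A toy userspace model of that sequence follows. It is not kernel
code: toy_port, toy_exec_internal() and toy_reread_id() are made-up names, and
only the two AC_ERR_* values are taken from include/linux/libata.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Error-mask bits with the same values as in include/linux/libata.h. */
#define AC_ERR_TIMEOUT (1u << 2) /* 0x04: command timed out */
#define AC_ERR_SYSTEM  (1u << 6) /* 0x40: seen in the log once the port is frozen */

/* Hypothetical stand-in for struct ata_port: only the state the model needs. */
struct toy_port {
	bool frozen;
	unsigned int ready_after_ms; /* time the drive needs before it answers */
};

/* Models one internal command: refused while frozen, freezes the port on timeout. */
static unsigned int toy_exec_internal(struct toy_port *ap, unsigned int timeout_ms)
{
	assert(ap != NULL);
	if (ap->frozen)
		return AC_ERR_SYSTEM;  /* no internal command while frozen */
	if (ap->ready_after_ms > timeout_ms) {
		ap->frozen = true;     /* stands in for ata_port_freeze() */
		return AC_ERR_TIMEOUT;
	}
	return 0;
}

/* Models the three-try loop from the patch; records the err_mask of each try. */
static unsigned int toy_reread_id(struct toy_port *ap, unsigned int masks[3])
{
	unsigned int rc = 0;

	masks[0] = masks[1] = masks[2] = 0;
	for (int i = 0; i < 3; i++) {
		rc = toy_exec_internal(ap, 5000);
		masks[i] = rc;
		if (!rc)
			break;
	}
	return rc;
}
```

With a drive that needs 8000 ms, the model yields 0x4, 0x40, 0x40: only the
first attempt actually waits, and the other two are rejected because nothing
thaws the port inside the loop.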

So, should the retry happen in ata_exec_internal()? No: ata_exec_internal() has
no path to cancel a command that has already been issued; it can only freeze and
reset the port. All we could do is keep waiting and increase the timeout 3 times
before letting the port reset. I don't think that is a good idea.
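
To illustrate the constraint: since a timed-out command cannot be cancelled,
every attempt either waits out its full timeout or ends with a frozen port, so
a retry only makes progress if a reset thaws the port in between. A toy
userspace sketch of that, not kernel code: all sketch_* names are hypothetical,
the timeout table is only illustrative, and it assumes the drive keeps spinning
up across the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative escalation table (an assumption, not taken from the kernel). */
static const unsigned int sketch_timeouts_ms[] = { 5000, 10000, 30000 };
#define SKETCH_TRIES (sizeof(sketch_timeouts_ms) / sizeof(sketch_timeouts_ms[0]))

/* Hypothetical stand-in for the port plus the attached drive. */
struct sketch_port {
	bool frozen;
	unsigned int spinup_left_ms; /* time until the drive answers IDENTIFY */
};

/* One IDENTIFY attempt: 0 on success, -1 on failure (a timeout freezes the port). */
static int sketch_identify(struct sketch_port *ap, unsigned int timeout_ms)
{
	assert(ap != NULL);
	if (ap->frozen)
		return -1;
	if (ap->spinup_left_ms > timeout_ms) {
		ap->spinup_left_ms -= timeout_ms; /* drive kept spinning up meanwhile */
		ap->frozen = true;
		return -1;
	}
	ap->spinup_left_ms = 0;
	return 0;
}

/* Stand-in for the port reset, which also thaws the port. */
static void sketch_reset(struct sketch_port *ap)
{
	ap->frozen = false;
}

/* Returns the number of attempts used, or -1 if every attempt failed. */
static int sketch_revalidate(struct sketch_port *ap)
{
	for (size_t i = 0; i < SKETCH_TRIES; i++) {
		if (sketch_identify(ap, sketch_timeouts_ms[i]) == 0)
			return (int)i + 1;
		sketch_reset(ap); /* without this, later attempts fail at once */
	}
	return -1;
}
```

In this model a drive needing 8000 ms succeeds on the second attempt, but only
after a full 5000 ms wait and one reset, which is the cost described above.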

Alan.



