From: Niklas Cassel <cassel@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: AlanCui4080 <me@alancui.cc>, linux-ide@vger.kernel.org
Subject: Re: Default IDENTIFY timeout is 5000ms which is too short for enterprise disks
Date: Wed, 15 Apr 2026 14:40:44 +0200
Message-ID: <ad-HTImFZxdtkD2J@ryzen>
In-Reply-To: <6740c3b7-c63c-4181-b36e-962e07bb468f@kernel.org>

Hello Alan, Damien,

On Thu, Apr 09, 2026 at 02:01:03PM +0200, Damien Le Moal wrote:
> On 2026/04/09 12:21, AlanCui4080 wrote:

[...]

> And no, we should not introduce a quirk for this. Rather, we should do the same
> 3-steps timeout for revalidation after a resume from suspend in the same manner
> as a regular probe does. Or add a check/wait for "drive ready" when resuming,
> similar to the PUIS handling (power up in standby).

Just like the regular ata_dev_read_id() (called during probe),
ata_dev_reread_id() (called during revalidation) already increases the
timeout with each retry:

# echo +10 > /sys/class/rtc/rtc0/wakealarm
# echo mem > /sys/power/state

[   22.709542] PM: suspend entry (deep)
[   22.734353] Filesystems sync: 0.024 seconds
[   22.749431] Freezing user space processes
[   22.750703] Freezing user space processes completed (elapsed 0.000 seconds)
[   22.751533] OOM killer disabled.
[   22.751939] Freezing remaining freezable tasks
[   22.753553] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[   22.763375] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   22.764396] ata1.00: Entering standby power mode
[   22.775472] PM: suspend devices took 0.021 seconds

...
...
...

[   28.826052] PM: suspend exit
[   29.063513] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

QEMU: cmd_identify: IDENTIFY count: 4, intentionally not sending completion

[   29.064800] ata2: SATA link down (SStatus 0 SControl 300)
[   29.071055] ata3: SATA link down (SStatus 0 SControl 300)
[   29.072053] ata6: SATA link down (SStatus 0 SControl 300)
[   29.073009] ata5: SATA link down (SStatus 0 SControl 300)
[   29.074038] ata4: SATA link down (SStatus 0 SControl 300)
[   34.168702] ata1.00: qc timeout after 5000 msecs (cmd 0xec)
[   34.169820] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[   34.170754] ata1.00: revalidation failed (errno=-5)
[   34.481679] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

QEMU: cmd_identify: IDENTIFY count: 5, intentionally not sending completion

[   44.920692] ata1.00: qc timeout after 10000 msecs (cmd 0xec)
[   44.921814] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[   44.922647] ata1.00: revalidation failed (errno=-5)
[   45.232872] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

QEMU: cmd_identify: IDENTIFY count: 6, intentionally not sending completion

[   75.640934] ata1.00: qc timeout after 30000 msecs (cmd 0xec)
[   75.642127] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[   75.643091] ata1.00: revalidation failed (errno=-5)
[   75.643893] ata1.00: disable device
[   75.950433] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)



Alan,
could you please provide a full dmesg (one that has not been cut)
from reproducing your problem on kernel v7.0?

And please explain your problem in as much detail as you can, including
which drive/port (ataX.YY) you are having a problem with.

ZFS is not a filesystem in the kernel, so we don't really care about it.

Are you saying that you see something like:
[   75.643893] ata1.00: disable device

instead of something like:
[   75.068077] sd 0:0:0:0: [sda] Starting disk
[   75.069628] ata1.00: configured for UDMA/100



Note that if you specify an explicit probe timeout value, e.g.
libata.ata_probe_timeout=10

then that timeout value will be used for each retry:
https://github.com/torvalds/linux/blob/v7.0/drivers/ata/libata-core.c#L1612-L1617

I.e. if you specify an explicit probe timeout value, you will not
automatically get a larger timeout for each retry.
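
To make the difference concrete, here is a rough model of the two behaviours
described above. It is a sketch only and not the libata code: the function
and table names are made up for illustration, the 5000/10000/30000 ms values
are simply the ones visible in the dmesg above, and clamping to the last
table entry is an assumption of the model.

```python
# Simplified model only; NOT the kernel implementation.
# Names are hypothetical. The values are those seen in the dmesg above.
IDENTIFY_TIMEOUTS_MS = (5000, 10000, 30000)


def identify_timeout_ms(attempt, probe_timeout_s=0):
    """Return the modelled IDENTIFY timeout for a 0-based attempt number.

    probe_timeout_s stands in for libata.ata_probe_timeout (in seconds);
    0 means "not set".
    """
    if probe_timeout_s > 0:
        # Explicit probe timeout: the same value is used on every retry.
        return probe_timeout_s * 1000
    # Default: step through the table (clamping is a model assumption).
    index = min(attempt, len(IDENTIFY_TIMEOUTS_MS) - 1)
    return IDENTIFY_TIMEOUTS_MS[index]


if __name__ == "__main__":
    for attempt in range(3):
        print(attempt,
              identify_timeout_ms(attempt),
              identify_timeout_ms(attempt, probe_timeout_s=10))
```

In this model, libata.ata_probe_timeout=10 gives 10000 ms for all three
attempts, which is shorter than the 30000 ms that the third attempt gets by
default.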


Kind regards,
Niklas
