From: Damien Le Moal <dlemoal@kernel.org>
To: Henry Tseng <henrytseng@qnap.com>
Cc: Niklas Cassel <cassel@kernel.org>,
linux-ide@vger.kernel.org, Kevin Ko <kevinko@qnap.com>,
SW Chen <swchen@qnap.com>
Subject: Re: [PATCH v3] ata: libata: avoid long timeouts on hot-unplugged SATA DAS
Date: Mon, 8 Dec 2025 12:47:48 +0900
Message-ID: <0ca4b4c2-22b4-40bf-8f91-21697c4249f1@kernel.org>
In-Reply-To: <20251201094622.1475358-1-henrytseng@qnap.com>

On 12/1/25 6:46 PM, Henry Tseng wrote:
> When a SATA DAS enclosure is connected behind a Thunderbolt PCIe
> switch, hot-unplugging the whole enclosure causes pciehp to tear down
> the PCI hierarchy before the SCSI layer issues SYNCHRONIZE CACHE and
> START STOP UNIT for the disks.
>
> libata still queues these commands and the AHCI driver tries to access
> the HBA registers even though the PCI channel is already offline. This
> results in a series of timeouts and error recovery attempts, e.g.:
>
> [ 824.778346] pcieport 0000:00:07.0: pciehp: Slot(14): Link Down
> [ 891.612720] ata8.00: qc timeout after 5000 msecs (cmd 0xec)
> [ 902.876501] ata8.00: qc timeout after 10000 msecs (cmd 0xec)
> [ 934.107998] ata8.00: qc timeout after 30000 msecs (cmd 0xec)
> [ 936.206431] sd 7:0:0:0: [sda] Synchronize Cache(10) failed:
> Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> ...
> [ 1006.298356] ata1.00: qc timeout after 5000 msecs (cmd 0xec)
> [ 1017.561926] ata1.00: qc timeout after 10000 msecs (cmd 0xec)
> [ 1048.791790] ata1.00: qc timeout after 30000 msecs (cmd 0xec)
> [ 1050.890035] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed:
> Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>
> With this patch applied, the same hot-unplug looks like:
>
> [ 59.965496] pcieport 0000:00:07.0: pciehp: Slot(14): Link Down
> [ 60.002502] sd 7:0:0:0: [sda] Synchronize Cache(10) failed:
> Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> ...
> [ 60.103050] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed:
> Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>
> In this test setup with two disks, the hot-unplug sequence shrinks from
> about 226 seconds (~3.8 minutes) between the Link Down event and the
> last SYNCHRONIZE CACHE failure to under a second. Without this patch the
> total delay grows roughly with the number of disks, because each disk
> gets its own SYNCHRONIZE CACHE and qc timeout series.
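
As a sanity check on the figures quoted above, the intervals can be recomputed from the dmesg timestamps. This is only a small sketch: the timestamps are copied from the two logs in the patch description, while the constant names, elapsed_seconds() and print_intervals() are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Timestamps in seconds, copied from the quoted dmesg output. */
static const double before_link_down = 824.778346;  /* Link Down, unpatched */
static const double before_last_fail = 1050.890035; /* last sync failure, unpatched */
static const double after_link_down  = 59.965496;   /* Link Down, patched */
static const double after_last_fail  = 60.103050;   /* last sync failure, patched */

/* Hypothetical helper: seconds elapsed between two dmesg timestamps. */
static double elapsed_seconds(double start, double end)
{
	return end - start;
}

/* Print both intervals; call this from main() to reproduce the numbers. */
static void print_intervals(void)
{
	printf("without patch: %.1f s\n",
	       elapsed_seconds(before_link_down, before_last_fail));
	printf("with patch:    %.3f s\n",
	       elapsed_seconds(after_link_down, after_last_fail));
}
```

The first interval comes out at roughly 226.1 seconds and the second at about 0.14 seconds, which matches the "about 226 seconds" and "under a second" figures in the description.
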
>
> If the underlying PCI device is already gone, these commands cannot
> succeed anyway. Avoid issuing them by introducing
> ata_adapter_is_online(), which checks pci_channel_offline() for
> PCI-based hosts. It is used from ata_scsi_find_dev() to return NULL,
> causing the SCSI layer to fail new commands with DID_BAD_TARGET
> immediately, and from ata_qc_issue() to bail out before touching the
> HBA registers.
>
> Since such failures would otherwise trigger libata error handling,
> ata_adapter_is_online() is also consulted from ata_scsi_port_error_handler().
> When the adapter is offline, libata skips ap->ops->error_handler(ap) and
> completes error handling using the existing path, rather than running
> a full EH sequence against a dead adapter.
>
> With this change, SYNCHRONIZE CACHE and START STOP UNIT commands
> issued during hot-unplug fail quickly once the PCI channel is offline,
> without qc timeout spam or long libata EH delays.
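
For readers following along without the diff, the control flow described above can be modeled in user space roughly as follows. This is a sketch under stated assumptions, not the code from the patch: every mock_* type and function is a hypothetical stand-in, the channel_offline flag merely plays the role of what pci_channel_offline() reports, and the real ata_adapter_is_online(), ata_qc_issue() and ata_scsi_port_error_handler() operate on kernel structures that are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device (not the kernel's struct pci_dev). */
struct mock_pci_dev {
	bool channel_offline;	/* models the result of pci_channel_offline() */
};

/* Hypothetical stand-in for an ATA host. */
struct mock_ata_host {
	struct mock_pci_dev *pdev;	/* NULL models a non-PCI host */
};

/* Models ata_adapter_is_online(): only a PCI-based host can be offline. */
static bool mock_adapter_is_online(const struct mock_ata_host *host)
{
	if (host->pdev && host->pdev->channel_offline)
		return false;
	return true;
}

enum mock_issue_result {
	MOCK_ISSUED,		/* command reached the (mock) HBA */
	MOCK_FAILED_FAST,	/* rejected before any register access */
};

/*
 * Models the early bail-out in the issue path: when the adapter is gone,
 * fail immediately instead of touching registers and waiting for a timeout.
 */
static enum mock_issue_result mock_qc_issue(const struct mock_ata_host *host,
					    unsigned int *register_accesses)
{
	if (!mock_adapter_is_online(host))
		return MOCK_FAILED_FAST;

	(*register_accesses)++;	/* stands in for programming the HBA */
	return MOCK_ISSUED;
}

/*
 * Models the error-handling decision: the per-port error handler is only
 * worth running while the adapter is still reachable.
 */
static bool mock_should_run_error_handler(const struct mock_ata_host *host)
{
	return mock_adapter_is_online(host);
}
```

In this model, marking the device offline makes mock_qc_issue() return MOCK_FAILED_FAST without incrementing the register access counter, and a host with no PCI device is always treated as online.
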
>
> Suggested-by: Damien Le Moal <dlemoal@kernel.org>
> Signed-off-by: Henry Tseng <henrytseng@qnap.com>

Applied to for-6.20. Thanks!

-- 
Damien Le Moal
Western Digital Research