* [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
@ 2026-04-25 6:04 Xingui Yang
2026-04-25 22:53 ` Damien Le Moal
2026-04-27 13:17 ` Niklas Cassel
0 siblings, 2 replies; 7+ messages in thread
From: Xingui Yang @ 2026-04-25 6:04 UTC (permalink / raw)
To: dlemoal, cassel
Cc: linux-scsi, linux-kernel, yangxingui, liuyonglong, kangfenglong
When sata_link_hardreset() detects that the link is offline, it currently
returns immediately without distinguishing the reason. According to the SATA
specification, the SStatus register's DET field (bits 0-3) indicates:
- 0x0: No device detected, PHY not communicating
- 0x1: Device detected but PHY communication not established
- 0x3: Device detected and PHY communication established
This patch improves device detection reliability by adding a check: when the
link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
retry rather than giving up immediately.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
---
drivers/ata/libata-sata.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index b9d635088f5f..e5bb92c38e38 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
 	if (rc)
 		goto out;
 	/* if link is offline nothing more to do */
-	if (ata_phys_link_offline(link))
+	if (ata_phys_link_offline(link)) {
+		u32 sstatus;
+
+		if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
+		    (sstatus & 0xf) == 0x1) {
+			ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
+				      sstatus);
+			rc = -EAGAIN;
+		}
+
 		goto out;
+	}

 	/* Link is online. From this point, -ENODEV too is an error. */
 	if (online)
--
2.33.0
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 6:04 [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established Xingui Yang
@ 2026-04-25 22:53 ` Damien Le Moal
2026-04-27 1:51 ` yangxingui
2026-04-27 13:17 ` Niklas Cassel
1 sibling, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2026-04-25 22:53 UTC (permalink / raw)
To: Xingui Yang, cassel; +Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong
On 4/25/26 15:04, Xingui Yang wrote:
> When sata_link_hardreset() detects that the link is offline, it currently
> returns immediately without distinguishing the reason. According to the SATA
> specification, the SStatus register's DET field (bits 0-3) indicates:
> - 0x0: No device detected, PHY not communicating
> - 0x1: Device detected but PHY communication not established
> - 0x3: Device detected and PHY communication established
>
> This patch improves device detection reliability by adding a check: when the
> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
> retry rather than giving up immediately.
>
> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi list.
Also, please check your mail setup: your email was in my Junk folder.
> ---
> drivers/ata/libata-sata.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
> index b9d635088f5f..e5bb92c38e38 100644
> --- a/drivers/ata/libata-sata.c
> +++ b/drivers/ata/libata-sata.c
> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
> if (rc)
> goto out;
> /* if link is offline nothing more to do */
> - if (ata_phys_link_offline(link))
> + if (ata_phys_link_offline(link)) {
This is preceded by a call to sata_link_resume(), which calls
sata_link_debounce(), and that function makes sure that DET is stable. So if
after that DET still shows that PHY communication is not established, there is
likely a big problem with the PHY and it is super slow to be established.
In this case, I do not think that doing another hardreset is the right thing to
do. Have you tried increasing the deadline for hardreset? That deadline is used
as the limit for the link debounce too.
Do you have a specific controller/device where you see this issue? What exactly
is the hardware setup where you see this issue?
> + u32 sstatus;
> +
> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
> + (sstatus & 0xf) == 0x1) {
> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
> + sstatus);
> + rc = -EAGAIN;
> + }
> +
> goto out;
> + }
>
> /* Link is online. From this point, -ENODEV too is an error. */
> if (online)
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 22:53 ` Damien Le Moal
@ 2026-04-27 1:51 ` yangxingui
2026-04-27 4:45 ` Damien Le Moal
0 siblings, 1 reply; 7+ messages in thread
From: yangxingui @ 2026-04-27 1:51 UTC (permalink / raw)
To: Damien Le Moal, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 2026/4/26 6:53, Damien Le Moal wrote:
> On 4/25/26 15:04, Xingui Yang wrote:
>> When sata_link_hardreset() detects that the link is offline, it currently
>> returns immediately without distinguishing the reason. According to the SATA
>> specification, the SStatus register's DET field (bits 0-3) indicates:
>> - 0x0: No device detected, PHY not communicating
>> - 0x1: Device detected but PHY communication not established
>> - 0x3: Device detected and PHY communication established
>>
>> This patch improves device detection reliability by adding a check: when the
>> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
>> retry rather than giving up immediately.
>>
>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>
> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi list.
Ok.
>
> Also, please check your mail setup: your email was in my Junk folder.
Well, the patch was sent using the git send command.
>
>> ---
>> drivers/ata/libata-sata.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
>> index b9d635088f5f..e5bb92c38e38 100644
>> --- a/drivers/ata/libata-sata.c
>> +++ b/drivers/ata/libata-sata.c
>> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
>> if (rc)
>> goto out;
>> /* if link is offline nothing more to do */
>> - if (ata_phys_link_offline(link))
>> + if (ata_phys_link_offline(link)) {
>
> This is preceded by a call to sata_link_resume(), which calls
> sata_link_debounce(), and that function makes sure that DET is stable. So if
> after that DET still shows that PHY communication is not established, there is
> likely a big problem with the PHY and it is super slow to be established.
>
> In this case, I do not think that doing another hardreset is the right thing to
> do. Have you tried increasing the deadline for hardreset? That deadline is used
> as the limit for the link debounce too.
>
> Do you have a specific controller/device where you see this issue? What exactly
> is the hardware setup where you see this issue?
Our customer is introducing and verifying a new disk. Occasionally a hard
reset on the disk fails, and no exception log is generated for resume and
debounce.
[ 22.864418][ T1285] ahci 0000:76:03.0: Adding to iommu group 23
[ 22.870403][ T1285] ahci 0000:76:03.0: controller does not support SXS, disabling CAP_SXS
[ 22.878655][ T1285] ahci 0000:76:03.0: SSS flag set, parallel bus scan disabled
[ 22.885966][ T1285] ahci 0000:76:03.0: AHCI 0001.0300 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[ 22.894743][ T1285] ahci 0000:76:03.0: flags: 64bit ncq sntf stag pm led clo only pmp fbs slum part ccc ems boh
[ 22.905277][ T1285] scsi host0: ahci
[ 22.909061][ T1285] scsi host1: ahci
[ 22.966463][ T1285] ata1: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010100 irq 108
[ 22.974629][ T1285] ata2: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010180 irq 109
[ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300) <==============
[ 25.659901][ T1288] ata2: SATA link down (SStatus 0 SControl 300)
>
>
>
>> + u32 sstatus;
>> +
>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>> + (sstatus & 0xf) == 0x1) {
>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>> + sstatus);
>> + rc = -EAGAIN;
>> + }
>> +
>> goto out;
>> + }
>>
>> /* Link is online. From this point, -ENODEV too is an error. */
>> if (online)
>
>
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 1:51 ` yangxingui
@ 2026-04-27 4:45 ` Damien Le Moal
2026-04-29 1:14 ` yangxingui
0 siblings, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2026-04-27 4:45 UTC (permalink / raw)
To: yangxingui, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 4/27/26 10:51 AM, yangxingui wrote:
>
>
> On 2026/4/26 6:53, Damien Le Moal wrote:
>> On 4/25/26 15:04, Xingui Yang wrote:
>>> When sata_link_hardreset() detects that the link is offline, it currently
>>> returns immediately without distinguishing the reason. According to the SATA
>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>> - 0x0: No device detected, PHY not communicating
>>> - 0x1: Device detected but PHY communication not established
>>> - 0x3: Device detected and PHY communication established
>>>
>>> This patch improves device detection reliability by adding a check: when the
>>> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
>>> retry rather than giving up immediately.
>>>
>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>
>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>> list.
>
> Ok.
>>
>> Also, please check your mail setup: your email was in my Junk folder.
>
> Well, the patch was sent using the git send command.
I do not mean git send-email, but your SMTP server. Something is probably wrong
with its DMARC setup. All your emails end up in my junk folder.
>> This is preceded by a call to sata_link_resume(), which calls
>> sata_link_debounce(), and that function makes sure that DET is stable. So if
>> after that DET still shows that PHY communication is not established, there is
>> likely a big problem with the PHY and it is super slow to be established.
>>
>> In this case, I do not think that doing another hardreset is the right thing to
>> do. Have you tried increasing the deadline for hardreset? That deadline is used
>> as the limit for the link debounce too.
>>
>> Do you have a specific controller/device where you see this issue? What exactly
>> is the hardware setup where you see this issue?
>
> Our customer is introducing and verifying a new disk. Occasionally a hard
> reset on the disk fails, and no exception log is generated for resume and
> debounce.
Does this hold for all disks or for only one or some models?
>
> [ 22.864418][ T1285] ahci 0000:76:03.0: Adding to iommu group 23
> [ 22.870403][ T1285] ahci 0000:76:03.0: controller does not support SXS, disabling CAP_SXS
> [ 22.878655][ T1285] ahci 0000:76:03.0: SSS flag set, parallel bus scan disabled
> [ 22.885966][ T1285] ahci 0000:76:03.0: AHCI 0001.0300 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
> [ 22.894743][ T1285] ahci 0000:76:03.0: flags: 64bit ncq sntf stag pm led clo only pmp fbs slum part ccc ems boh
> [ 22.905277][ T1285] scsi host0: ahci
> [ 22.909061][ T1285] scsi host1: ahci
> [ 22.966463][ T1285] ata1: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010100 irq 108
> [ 22.974629][ T1285] ata2: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010180 irq 109
> [ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300) <==============
> [ 25.659901][ T1288] ata2: SATA link down (SStatus 0 SControl 300)
>>
>>
>>
>>> + u32 sstatus;
>>> +
>>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>>> + (sstatus & 0xf) == 0x1) {
>>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>>> + sstatus);
>>> + rc = -EAGAIN;
>>> + }
>>> +
>>> goto out;
>>> + }
>>> /* Link is online. From this point, -ENODEV too is an error. */
>>> if (online)
>>
>>
>
> Thanks,
> Xingui
>
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 6:04 [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established Xingui Yang
2026-04-25 22:53 ` Damien Le Moal
@ 2026-04-27 13:17 ` Niklas Cassel
2026-04-29 1:06 ` yangxingui
1 sibling, 1 reply; 7+ messages in thread
From: Niklas Cassel @ 2026-04-27 13:17 UTC (permalink / raw)
To: Xingui Yang; +Cc: dlemoal, linux-scsi, linux-kernel, liuyonglong, kangfenglong
On Sat, Apr 25, 2026 at 02:04:47PM +0800, Xingui Yang wrote:
> When sata_link_hardreset() detects that the link is offline, it currently
> returns immediately without distinguishing the reason. According to the SATA
> specification, the SStatus register's DET field (bits 0-3) indicates:
> - 0x0: No device detected, PHY not communicating
> - 0x1: Device detected but PHY communication not established
> - 0x3: Device detected and PHY communication established
>
> This patch improves device detection reliability by adding a check: when the
> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
> retry rather than giving up immediately.
>
> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
> ---
> drivers/ata/libata-sata.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
> index b9d635088f5f..e5bb92c38e38 100644
> --- a/drivers/ata/libata-sata.c
> +++ b/drivers/ata/libata-sata.c
> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
> if (rc)
> goto out;
> /* if link is offline nothing more to do */
> - if (ata_phys_link_offline(link))
> + if (ata_phys_link_offline(link)) {
> + u32 sstatus;
> +
> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
> + (sstatus & 0xf) == 0x1) {
> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
> + sstatus);
> + rc = -EAGAIN;
> + }
> +
This looks like you are more or less duplicating the function
ata_eh_link_established(), introduced in commit 4371fe1ba400 ("ata:
libata-eh: Avoid unnecessary resets when revalidating devices").
Could you perhaps try to reuse this function?
(It is currently private, so you would need to make it public.)
Kind regards,
Niklas
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 13:17 ` Niklas Cassel
@ 2026-04-29 1:06 ` yangxingui
0 siblings, 0 replies; 7+ messages in thread
From: yangxingui @ 2026-04-29 1:06 UTC (permalink / raw)
To: Niklas Cassel
Cc: dlemoal, linux-scsi, linux-kernel, liuyonglong, kangfenglong
On 2026/4/27 21:17, Niklas Cassel wrote:
> On Sat, Apr 25, 2026 at 02:04:47PM +0800, Xingui Yang wrote:
>> When sata_link_hardreset() detects that the link is offline, it currently
>> returns immediately without distinguishing the reason. According to the SATA
>> specification, the SStatus register's DET field (bits 0-3) indicates:
>> - 0x0: No device detected, PHY not communicating
>> - 0x1: Device detected but PHY communication not established
>> - 0x3: Device detected and PHY communication established
>>
>> This patch improves device detection reliability by adding a check: when the
>> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
>> retry rather than giving up immediately.
>>
>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>> ---
>> drivers/ata/libata-sata.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
>> index b9d635088f5f..e5bb92c38e38 100644
>> --- a/drivers/ata/libata-sata.c
>> +++ b/drivers/ata/libata-sata.c
>> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
>> if (rc)
>> goto out;
>> /* if link is offline nothing more to do */
>> - if (ata_phys_link_offline(link))
>> + if (ata_phys_link_offline(link)) {
>> + u32 sstatus;
>> +
>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>> + (sstatus & 0xf) == 0x1) {
>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>> + sstatus);
>> + rc = -EAGAIN;
>> + }
>> +
>
> This looks like you are more or less duplicating the function
> ata_eh_link_established(), introduced in commit 4371fe1ba400 ("ata:
> libata-eh: Avoid unnecessary resets when revalidating devices").
>
> Could you perhaps try to reuse this function?
>
> (It is currently private, so you would need to make it public.)
This looks like a pretty good suggestion, but according to the log, when
the failure occurs the IPM field is 0, indicating that communication has
not been established. So this interface may not be suitable here.
[ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300)
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 4:45 ` Damien Le Moal
@ 2026-04-29 1:14 ` yangxingui
0 siblings, 0 replies; 7+ messages in thread
From: yangxingui @ 2026-04-29 1:14 UTC (permalink / raw)
To: Damien Le Moal, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 2026/4/27 12:45, Damien Le Moal wrote:
> On 4/27/26 10:51 AM, yangxingui wrote:
>>
>>
>> On 2026/4/26 6:53, Damien Le Moal wrote:
>>> On 4/25/26 15:04, Xingui Yang wrote:
>>>> When sata_link_hardreset() detects that the link is offline, it currently
>>>> returns immediately without distinguishing the reason. According to the SATA
>>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>>> - 0x0: No device detected, PHY not communicating
>>>> - 0x1: Device detected but PHY communication not established
>>>> - 0x3: Device detected and PHY communication established
>>>>
>>>> This patch improves device detection reliability by adding a check: when the
>>>> link is offline but the DET field shows 0x1, return -EAGAIN to trigger a
>>>> retry rather than giving up immediately.
>>>>
>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>>
>>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>>> list.
>>
>> Ok.
>>>
>>> Also, please check your mail setup: your email was in my Junk folder.
>>
>> Well, the patch was sent using the git send command.
>
> I do not mean git send-email, but your SMTP server. Something is probably wrong
> with its DMARC setup. All your emails end up in my junk folder.
Alright, it might be related to the company's SMTP server, but this
configuration is fixed, and I'm not quite sure how to fix it yet.
>
>>> This is preceded by a call to sata_link_resume(), which calls
>>> sata_link_debounce(), and that function makes sure that DET is stable. So if
>>> after that DET still shows that PHY communication is not established, there is
>>> likely a big problem with the PHY and it is super slow to be established.
>>>
>>> In this case, I do not think that doing another hardreset is the right thing to
>>> do. Have you tried increasing the deadline for hardreset? That deadline is used
>>> as the limit for the link debounce too.
>>>
>>> Do you have a specific controller/device where you see this issue? What exactly
>>> is the hardware setup where you see this issue?
>>
>> Our customer is introducing and verifying a new disk. Occasionally a hard
>> reset on the disk fails, and no exception log is generated for resume and
>> debounce.
>
> Does this hold for all disks or for only one or some models?
It may be only some models; it is not seen on other disks.
Thanks,
Xingui