* [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
@ 2026-04-25 6:04 Xingui Yang
2026-04-25 22:53 ` Damien Le Moal
2026-04-27 13:17 ` Niklas Cassel
0 siblings, 2 replies; 11+ messages in thread
From: Xingui Yang @ 2026-04-25 6:04 UTC (permalink / raw)
To: dlemoal, cassel
Cc: linux-scsi, linux-kernel, yangxingui, liuyonglong, kangfenglong
When sata_link_hardreset() detects that the link is offline, it currently
returns immediately without distinguishing the reason. According to SATA
specification, the SStatus register's DET field (bits 0-3) indicates:
- 0x0: No device detected, PHY not communicating
- 0x1: Device detected but PHY communication not established
- 0x3: Device detected and PHY communication established
This patch improves device detection reliability by adding a check:
when the link is offline but the DET field shows 0x1, return -EAGAIN to
trigger a retry rather than giving up immediately.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
---
drivers/ata/libata-sata.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
index b9d635088f5f..e5bb92c38e38 100644
--- a/drivers/ata/libata-sata.c
+++ b/drivers/ata/libata-sata.c
@@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
if (rc)
goto out;
/* if link is offline nothing more to do */
- if (ata_phys_link_offline(link))
+ if (ata_phys_link_offline(link)) {
+ u32 sstatus;
+
+ if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
+ (sstatus & 0xf) == 0x1) {
+ ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
+ sstatus);
+ rc = -EAGAIN;
+ }
+
goto out;
+ }
/* Link is online. From this point, -ENODEV too is an error. */
if (online)
--
2.33.0
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 6:04 [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established Xingui Yang
@ 2026-04-25 22:53 ` Damien Le Moal
2026-04-27 1:51 ` yangxingui
2026-04-27 13:17 ` Niklas Cassel
1 sibling, 1 reply; 11+ messages in thread
From: Damien Le Moal @ 2026-04-25 22:53 UTC (permalink / raw)
To: Xingui Yang, cassel; +Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong
On 4/25/26 15:04, Xingui Yang wrote:
> When sata_link_hardreset() detects that the link is offline, it currently
> returns immediately without distinguishing the reason. According to SATA
> specification, the SStatus register's DET field (bits 0-3) indicates:
> - 0x0: No device detected, PHY not communicating
> - 0x1: Device detected but PHY communication not established
> - 0x3: Device detected and PHY communication established
>
> This patch improves device detection reliability by adding a check:
> when the link is offline but the DET field shows 0x1, return -EAGAIN to
> trigger a retry rather than giving up immediately.
>
> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi list.
Also, please check your mail setup: your email was in my Junk folder.
> ---
> drivers/ata/libata-sata.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
> index b9d635088f5f..e5bb92c38e38 100644
> --- a/drivers/ata/libata-sata.c
> +++ b/drivers/ata/libata-sata.c
> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
> if (rc)
> goto out;
> /* if link is offline nothing more to do */
> - if (ata_phys_link_offline(link))
> + if (ata_phys_link_offline(link)) {
This is preceded by a call to sata_link_resume(), which calls
sata_link_debounce() and that function makes sure that DET is stable. So if
after that DET still shows that there is no PHY, there is likely a big problem
with it and it is super slow to be established.
In this case, I do not think that doing another hardreset is the right thing to
do. Have you tried increasing the deadline for hardreset ? That deadline is used
as the limit for the link debounce too.
Do you have a specific controller/device where you see this issue ? What exactly
is the hardware setup where you see this issue ?
> + u32 sstatus;
> +
> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
> + (sstatus & 0xf) == 0x1) {
> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
> + sstatus);
> + rc = -EAGAIN;
> + }
> +
> goto out;
> + }
>
> /* Link is online. From this point, -ENODEV too is an error. */
> if (online)
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 22:53 ` Damien Le Moal
@ 2026-04-27 1:51 ` yangxingui
2026-04-27 4:45 ` Damien Le Moal
0 siblings, 1 reply; 11+ messages in thread
From: yangxingui @ 2026-04-27 1:51 UTC (permalink / raw)
To: Damien Le Moal, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 2026/4/26 6:53, Damien Le Moal wrote:
> On 4/25/26 15:04, Xingui Yang wrote:
>> When sata_link_hardreset() detects that the link is offline, it currently
>> returns immediately without distinguishing the reason. According to SATA
>> specification, the SStatus register's DET field (bits 0-3) indicates:
>> - 0x0: No device detected, PHY not communicating
>> - 0x1: Device detected but PHY communication not established
>> - 0x3: Device detected and PHY communication established
>>
>> This patch improves device detection reliability by adding a check:
>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>> trigger a retry rather than giving up immediately.
>>
>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>
> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi list.
Ok.
>
> Also, please check your mail setup: your email was in my Junk folder.
Well, the patch was sent using the git send-email command.
>
>> ---
>> drivers/ata/libata-sata.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
>> index b9d635088f5f..e5bb92c38e38 100644
>> --- a/drivers/ata/libata-sata.c
>> +++ b/drivers/ata/libata-sata.c
>> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
>> if (rc)
>> goto out;
>> /* if link is offline nothing more to do */
>> - if (ata_phys_link_offline(link))
>> + if (ata_phys_link_offline(link)) {
>
> This is preceded by a call to sata_link_resume(), which calls
> sata_link_debounce() and that function makes sure that DET is stable. So if
> after that DET still shows that there is no PHY, there is likely a big problem
> with it and it is super slow to be established.
>
> In this case, I do not think that doing another hardreset is the right thing to
> do. Have you tried increasing the deadline for hardreset ? That deadline is used
> as the limit for the link debounce too.
>
> Do you have a specific controller/device where you see this issue ? What exactly
> is the hardware setup where you see this issue ?
Our customer imports and verifies a new disk. Occasionally, a hard reset
on the disk fails, and no exception log is generated for resume and
debounce.
[ 22.864418][ T1285] ahci 0000:76:03.0: Adding to iommu group 23
[ 22.870403][ T1285] ahci 0000:76:03.0: controller does not support SXS, disabling CAP_SXS
[ 22.878655][ T1285] ahci 0000:76:03.0: SSS flag set, parallel bus scan disabled
[ 22.885966][ T1285] ahci 0000:76:03.0: AHCI 0001.0300 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[ 22.894743][ T1285] ahci 0000:76:03.0: flags: 64bit ncq sntf stag pm led clo only pmp fbs slum part ccc ems boh
[ 22.905277][ T1285] scsi host0: ahci
[ 22.909061][ T1285] scsi host1: ahci
[ 22.966463][ T1285] ata1: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010100 irq 108
[ 22.974629][ T1285] ata2: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010180 irq 109
[ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300) <==============
[ 25.659901][ T1288] ata2: SATA link down (SStatus 0 SControl 300)
>
>
>
>> + u32 sstatus;
>> +
>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>> + (sstatus & 0xf) == 0x1) {
>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>> + sstatus);
>> + rc = -EAGAIN;
>> + }
>> +
>> goto out;
>> + }
>>
>> /* Link is online. From this point, -ENODEV too is an error. */
>> if (online)
>
>
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 1:51 ` yangxingui
@ 2026-04-27 4:45 ` Damien Le Moal
2026-04-29 1:14 ` yangxingui
0 siblings, 1 reply; 11+ messages in thread
From: Damien Le Moal @ 2026-04-27 4:45 UTC (permalink / raw)
To: yangxingui, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 4/27/26 10:51 AM, yangxingui wrote:
>
>
> On 2026/4/26 6:53, Damien Le Moal wrote:
>> On 4/25/26 15:04, Xingui Yang wrote:
>>> When sata_link_hardreset() detects that the link is offline, it currently
>>> returns immediately without distinguishing the reason. According to SATA
>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>> - 0x0: No device detected, PHY not communicating
>>> - 0x1: Device detected but PHY communication not established
>>> - 0x3: Device detected and PHY communication established
>>>
>>> This patch improves device detection reliability by adding a check:
>>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>>> trigger a retry rather than giving up immediately.
>>>
>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>
>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>> list.
>
> Ok.
>>
>> Also, please check your mail setup: your email was in my Junk folder.
>
> Well, the patch was sent using the git send-email command.
Not git send-email, your SMTP server. It probably has something wrong with
DMARC. All your emails end up in my junk folder.
>> This is preceded by a call to sata_link_resume(), which calls
>> sata_link_debounce() and that function makes sure that DET is stable. So if
>> after that DET still shows that there is no PHY, there is likely a big problem
>> with it and it is super slow to be established.
>>
>> In this case, I do not think that doing another hardreset is the right thing to
>> do. Have you tried increasing the deadline for hardreset ? That deadline is used
>> as the limit for the link debounce too.
>>
>> Do you have a specific controller/device where you see this issue ? What exactly
>> is the hardware setup where you see this issue ?
>
> Our customer imports and verifies a new disk. Occasionally, a hard reset
> on the disk fails, and no exception log is generated for resume and
> debounce.
Does this hold for all disks or for only one or some models ?
>
> [ 22.864418][ T1285] ahci 0000:76:03.0: Adding to iommu group 23
> [ 22.870403][ T1285] ahci 0000:76:03.0: controller does not support SXS, disabling CAP_SXS
> [ 22.878655][ T1285] ahci 0000:76:03.0: SSS flag set, parallel bus scan disabled
> [ 22.885966][ T1285] ahci 0000:76:03.0: AHCI 0001.0300 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
> [ 22.894743][ T1285] ahci 0000:76:03.0: flags: 64bit ncq sntf stag pm led clo only pmp fbs slum part ccc ems boh
> [ 22.905277][ T1285] scsi host0: ahci
> [ 22.909061][ T1285] scsi host1: ahci
> [ 22.966463][ T1285] ata1: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010100 irq 108
> [ 22.974629][ T1285] ata2: SATA max UDMA/133 abar m4096@0xa3010000 port 0xa3010180 irq 109
> [ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300) <==============
> [ 25.659901][ T1288] ata2: SATA link down (SStatus 0 SControl 300)
>>
>>
>>
>>> + u32 sstatus;
>>> +
>>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>>> + (sstatus & 0xf) == 0x1) {
>>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>>> + sstatus);
>>> + rc = -EAGAIN;
>>> + }
>>> +
>>> goto out;
>>> + }
>>> /* Link is online. From this point, -ENODEV too is an error. */
>>> if (online)
>>
>>
>
> Thanks,
> Xingui
>
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-25 6:04 [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established Xingui Yang
2026-04-25 22:53 ` Damien Le Moal
@ 2026-04-27 13:17 ` Niklas Cassel
2026-04-29 1:06 ` yangxingui
1 sibling, 1 reply; 11+ messages in thread
From: Niklas Cassel @ 2026-04-27 13:17 UTC (permalink / raw)
To: Xingui Yang; +Cc: dlemoal, linux-scsi, linux-kernel, liuyonglong, kangfenglong
On Sat, Apr 25, 2026 at 02:04:47PM +0800, Xingui Yang wrote:
> When sata_link_hardreset() detects that the link is offline, it currently
> returns immediately without distinguishing the reason. According to SATA
> specification, the SStatus register's DET field (bits 0-3) indicates:
> - 0x0: No device detected, PHY not communicating
> - 0x1: Device detected but PHY communication not established
> - 0x3: Device detected and PHY communication established
>
> This patch improves device detection reliability by adding a check:
> when the link is offline but the DET field shows 0x1, return -EAGAIN to
> trigger a retry rather than giving up immediately.
>
> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
> ---
> drivers/ata/libata-sata.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
> index b9d635088f5f..e5bb92c38e38 100644
> --- a/drivers/ata/libata-sata.c
> +++ b/drivers/ata/libata-sata.c
> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
> if (rc)
> goto out;
> /* if link is offline nothing more to do */
> - if (ata_phys_link_offline(link))
> + if (ata_phys_link_offline(link)) {
> + u32 sstatus;
> +
> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
> + (sstatus & 0xf) == 0x1) {
> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
> + sstatus);
> + rc = -EAGAIN;
> + }
> +
This looks like you are more or less duplicating the function
ata_eh_link_established(), introduced in commit 4371fe1ba400 ("ata:
libata-eh: Avoid unnecessary resets when revalidating devices").
Could you perhaps try to reuse this function?
(It is currently private, so you would need to make it public.)
Kind regards,
Niklas
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 13:17 ` Niklas Cassel
@ 2026-04-29 1:06 ` yangxingui
0 siblings, 0 replies; 11+ messages in thread
From: yangxingui @ 2026-04-29 1:06 UTC (permalink / raw)
To: Niklas Cassel
Cc: dlemoal, linux-scsi, linux-kernel, liuyonglong, kangfenglong
On 2026/4/27 21:17, Niklas Cassel wrote:
> On Sat, Apr 25, 2026 at 02:04:47PM +0800, Xingui Yang wrote:
>> When sata_link_hardreset() detects that the link is offline, it currently
>> returns immediately without distinguishing the reason. According to SATA
>> specification, the SStatus register's DET field (bits 0-3) indicates:
>> - 0x0: No device detected, PHY not communicating
>> - 0x1: Device detected but PHY communication not established
>> - 0x3: Device detected and PHY communication established
>>
>> This patch improves device detection reliability by adding a check:
>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>> trigger a retry rather than giving up immediately.
>>
>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>> ---
>> drivers/ata/libata-sata.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/ata/libata-sata.c b/drivers/ata/libata-sata.c
>> index b9d635088f5f..e5bb92c38e38 100644
>> --- a/drivers/ata/libata-sata.c
>> +++ b/drivers/ata/libata-sata.c
>> @@ -667,8 +667,18 @@ int sata_link_hardreset(struct ata_link *link, const unsigned int *timing,
>> if (rc)
>> goto out;
>> /* if link is offline nothing more to do */
>> - if (ata_phys_link_offline(link))
>> + if (ata_phys_link_offline(link)) {
>> + u32 sstatus;
>> +
>> + if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
>> + (sstatus & 0xf) == 0x1) {
>> + ata_link_warn(link, "device detected but PHY not ready (SStatus %X), retrying\n",
>> + sstatus);
>> + rc = -EAGAIN;
>> + }
>> +
>
> This looks like you are more or less duplicating the function
> ata_eh_link_established(), introduced in commit 4371fe1ba400 ("ata:
> libata-eh: Avoid unnecessary resets when revalidating devices").
>
> Could you perhaps try to reuse this function?
>
> (It is currently private, so you would need to make it public.)
This looks like a pretty good suggestion, but according to the log
output, when the failure occurs the IPM field is 0, indicating that
communication has not been established, so it might not be suitable to
use this interface yet.
[ 25.242373][ T1286] ata1: SATA link down (SStatus 1 SControl 300)
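For reference, the SStatus values quoted in this thread can be split into their fields as sketched below. This is a minimal user-space illustration, not kernel code. It assumes the usual SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the struct and function names are made up for this example and do not exist in libata.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative names only; these are not libata types. */
struct sstatus_fields {
	uint8_t det;	/* bits 3:0, device detection */
	uint8_t spd;	/* bits 7:4, negotiated speed */
	uint8_t ipm;	/* bits 11:8, interface power management state */
};

/* Split a raw SStatus value into its three low fields. */
static struct sstatus_fields sstatus_decode(uint32_t sstatus)
{
	struct sstatus_fields f = {
		.det = sstatus & 0xf,
		.spd = (sstatus >> 4) & 0xf,
		.ipm = (sstatus >> 8) & 0xf,
	};
	return f;
}

/* Print one decoded value in the same hex form the kernel log uses. */
static void sstatus_print(uint32_t sstatus)
{
	struct sstatus_fields f = sstatus_decode(sstatus);

	printf("SStatus %X: DET=%u SPD=%u IPM=%u\n", (unsigned int)sstatus,
	       (unsigned int)f.det, (unsigned int)f.spd, (unsigned int)f.ipm);
}
```

With the values from the logs, 0x1 (the failing case) decodes to DET=1, SPD=0, IPM=0, while 0x133 (seen when the link comes up) decodes to DET=3, SPD=3, IPM=1.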
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-27 4:45 ` Damien Le Moal
@ 2026-04-29 1:14 ` yangxingui
2026-04-29 1:36 ` Damien Le Moal
0 siblings, 1 reply; 11+ messages in thread
From: yangxingui @ 2026-04-29 1:14 UTC (permalink / raw)
To: Damien Le Moal, cassel
Cc: linux-scsi, linux-kernel, liuyonglong, kangfenglong, linux-ide
On 2026/4/27 12:45, Damien Le Moal wrote:
> On 4/27/26 10:51 AM, yangxingui wrote:
>>
>>
>> On 2026/4/26 6:53, Damien Le Moal wrote:
>>> On 4/25/26 15:04, Xingui Yang wrote:
>>>> When sata_link_hardreset() detects that the link is offline, it currently
>>>> returns immediately without distinguishing the reason. According to SATA
>>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>>> - 0x0: No device detected, PHY not communicating
>>>> - 0x1: Device detected but PHY communication not established
>>>> - 0x3: Device detected and PHY communication established
>>>>
>>>> This patch improves device detection reliability by adding a check:
>>>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>>>> trigger a retry rather than giving up immediately.
>>>>
>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>>
>>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>>> list.
>>
>> Ok.
>>>
>>> Also, please check your mail setup: your email was in my Junk folder.
>>
>> Well, the patch was sent using the git send-email command.
>
> Not git send-email, your SMTP server. It probably has something wrong with
> DMARC. All your emails end up in my junk folder.
Alright, it might be related to the company's SMTP server, but this
configuration is fixed, and I'm not quite sure how to fix it yet.
>
>>> This is preceded by a call to sata_link_resume(), which calls
>>> sata_link_debounce() and that function makes sure that DET is stable. So if
>>> after that DET still shows that there is no PHY, there is likely a big problem
>>> with it and it is super slow to be established.
>>>
>>> In this case, I do not think that doing another hardreset is the right thing to
>>> do. Have you tried increasing the deadline for hardreset ? That deadline is used
>>> as the limit for the link debounce too.
>>>
>>> Do you have a specific controller/device where you see this issue ? What exactly
>>> is the hardware setup where you see this issue ?
>>
>> Our customer imports and verifies a new disk. Occasionally, a hard reset
>> on the disk fails, and no exception log is generated for resume and
>> debounce.
>
> Does this hold for all disks or for only one or some models ?
It may be limited to some models; it has not been seen on other disks.
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-29 1:14 ` yangxingui
@ 2026-04-29 1:36 ` Damien Le Moal
2026-04-29 7:01 ` yangxingui
0 siblings, 1 reply; 11+ messages in thread
From: Damien Le Moal @ 2026-04-29 1:36 UTC (permalink / raw)
To: yangxingui, cassel; +Cc: linux-kernel, liuyonglong, kangfenglong, linux-ide
On 4/29/26 10:14, yangxingui wrote:
>
>
> On 2026/4/27 12:45, Damien Le Moal wrote:
>> On 4/27/26 10:51 AM, yangxingui wrote:
>>>
>>>
>>> On 2026/4/26 6:53, Damien Le Moal wrote:
>>>> On 4/25/26 15:04, Xingui Yang wrote:
>>>>> When sata_link_hardreset() detects that the link is offline, it currently
>>>>> returns immediately without distinguishing the reason. According to SATA
>>>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>>>> - 0x0: No device detected, PHY not communicating
>>>>> - 0x1: Device detected but PHY communication not established
>>>>> - 0x3: Device detected and PHY communication established
>>>>>
>>>>> This patch improves device detection reliability by adding a check:
>>>>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>>>>> trigger a retry rather than giving up immediately.
>>>>>
>>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>>>
>>>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>>>> list.
>>>
>>> Ok.
>>>>
>>>> Also, please check your mail setup: your email was in my Junk folder.
>>>
>>> Well, the patch was sent using the git send-email command.
>>
>> Not git send-email, your SMTP server. It probably has something wrong with
>> DMARC. All your emails end up in my junk folder.
>
> Alright, it might be related to the company's SMTP server, but this
> configuration is fixed, and I'm not quite sure how to fix it yet.
>
>>
>>>> This is preceded by a call to sata_link_resume(), which calls
>>>> sata_link_debounce() and that function makes sure that DET is stable. So if
>>>> after that DET still shows that there is no PHY, there is likely a big problem
>>>> with it and it is super slow to be established.
>>>>
>>>> In this case, I do not think that doing another hardreset is the right thing to
>>>> do. Have you tried increasing the deadline for hardreset ? That deadline is used
>>>> as the limit for the link debounce too.
>>>>
>>>> Do you have a specific controller/device where you see this issue ? What exactly
>>>> is the hardware setup where you see this issue ?
>>>
>>> Our customer imports and verifies a new disk. Occasionally, a hard reset
>>> on the disk fails, and no exception log is generated for resume and
>>> debounce.
>>
>> Does this hold for all disks or for only one or some models ?
>
> It may be limited to some models; it has not been seen on other disks.
Some model ? Which one ?
And please remove linux-scsi from your emails for this topic.
--
Damien Le Moal
Western Digital Research
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-29 1:36 ` Damien Le Moal
@ 2026-04-29 7:01 ` yangxingui
2026-04-30 8:46 ` Niklas Cassel
0 siblings, 1 reply; 11+ messages in thread
From: yangxingui @ 2026-04-29 7:01 UTC (permalink / raw)
To: Damien Le Moal, cassel; +Cc: linux-kernel, liuyonglong, kangfenglong, linux-ide
On 2026/4/29 9:36, Damien Le Moal wrote:
> On 4/29/26 10:14, yangxingui wrote:
>>
>>
>> On 2026/4/27 12:45, Damien Le Moal wrote:
>>> On 4/27/26 10:51 AM, yangxingui wrote:
>>>>
>>>>
>>>> On 2026/4/26 6:53, Damien Le Moal wrote:
>>>>> On 4/25/26 15:04, Xingui Yang wrote:
>>>>>> When sata_link_hardreset() detects that the link is offline, it currently
>>>>>> returns immediately without distinguishing the reason. According to SATA
>>>>>> specification, the SStatus register's DET field (bits 0-3) indicates:
>>>>>> - 0x0: No device detected, PHY not communicating
>>>>>> - 0x1: Device detected but PHY communication not established
>>>>>> - 0x3: Device detected and PHY communication established
>>>>>>
>>>>>> This patch improves device detection reliability by adding a check:
>>>>>> when the link is offline but the DET field shows 0x1, return -EAGAIN to
>>>>>> trigger a retry rather than giving up immediately.
>>>>>>
>>>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com>
>>>>>
>>>>> This is a pure ATA patch so please CC the linux-ide list, not the linux-scsi
>>>>> list.
>>>>
>>>> Ok.
>>>>>
>>>>> Also, please check your mail setup: your email was in my Junk folder.
>>>>
>>>> Well, the patch was sent using the git send-email command.
>>>
>>> Not git send-email, your SMTP server. It probably has something wrong with
>>> DMARC. All your emails end up in my junk folder.
>>
>> Alright, it might be related to the company's SMTP server, but this
>> configuration is fixed, and I'm not quite sure how to fix it yet.
>>
>>>
>>>>> This is preceded by a call to sata_link_resume(), which calls
>>>>> sata_link_debounce() and that function makes sure that DET is stable. So if
>>>>> after that DET still shows that there is no PHY, there is likely a big problem
>>>>> with it and it is super slow to be established.
>>>>>
>>>>> In this case, I do not think that doing another hardreset is the right thing to
>>>>> do. Have you tried increasing the deadline for hardreset ? That deadline is used
>>>>> as the limit for the link debounce too.
>>>>>
>>>>> Do you have a specific controller/device where you see this issue ? What exactly
>>>>> is the hardware setup where you see this issue ?
>>>>
>>>> Our customer imports and verifies a new disk. Occasionally, a hard reset
>>>> on the disk fails, and no exception log is generated for resume and
>>>> debounce.
>>>
>>> Does this hold for all disks or for only one or some models ?
>>
>> It may be limited to some models; it has not been seen on other disks.
>
> Some model ? Which one ?
When the disk is properly connected, the log is as follows:
[ 22.658068][ T1297] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 22.665083][ T1297] ata1.00: ATA-10: S1XE240S3N6Y9TC1AP, F66002.0, max UDMA/100
[ 22.727017][ T806] scsi 0:0:0:0: Direct-Access ATA S1XE240S3N6Y9TC1 02.0 PQ: 0 ANSI: 5
>
> And please remove linux-scsi from your emails for this topic.
Ok.
Thanks,
Xingui
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-29 7:01 ` yangxingui
@ 2026-04-30 8:46 ` Niklas Cassel
2026-04-30 9:28 ` Niklas Cassel
0 siblings, 1 reply; 11+ messages in thread
From: Niklas Cassel @ 2026-04-30 8:46 UTC (permalink / raw)
To: yangxingui
Cc: Damien Le Moal, linux-kernel, liuyonglong, kangfenglong,
linux-ide
On Wed, Apr 29, 2026 at 03:01:48PM +0800, yangxingui wrote:
> > > > > > This is preceded by a call to sata_link_resume(), which calls
> > > > > > sata_link_debounce() and that function makes sure that DET is stable. So if
> > > > > > after that DET still shows that there is no PHY, there is likely a big problem
> > > > > > with it and it is super slow to be established.
I agree with Damien, sata_link_debounce() is supposed to make sure that
DET is stable.
sata_link_debounce() will not explicitly wait for SStatus.DET to turn 0x3.
If value is stable, and SStatus.DET == 1, and time is before "deadline",
sata_link_debounce() will continue looping.
Else, if value is stable, and has been stable for "duration" amount of time,
it will return.
Since your print shows that SStatus == 1, that most likely means that the
deadline expired in sata_link_debounce().
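The loop behaviour described above can be modelled in user space roughly as follows. This is only a sketch of the description in this mail, not the kernel implementation: time is counted in abstract ticks with one DET sample per tick, the names debounce_model and struct debounce_outcome are hypothetical, and the -1 result for a value that changes after the deadline is a placeholder chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for the model; not a libata structure. */
struct debounce_outcome {
	int rc;			/* 0: returned with a stable value, -1: placeholder error */
	uint8_t det;		/* last DET value seen */
	unsigned int ticks;	/* simulated time at which the loop returned */
};

/*
 * Model of the debounce loop: one DET sample per tick, and the final
 * sample repeats forever. "duration" and "deadline" are in ticks.
 * Requires n >= 1.
 */
static struct debounce_outcome
debounce_model(const uint8_t *samples, size_t n,
	       unsigned int duration, unsigned int deadline)
{
	uint8_t last = samples[0];
	unsigned int last_change = 0;

	for (unsigned int t = 1; ; t++) {
		uint8_t cur = samples[t < n ? t : n - 1];

		if (cur == last) {
			/* Stable at DET == 1: keep looping until the deadline. */
			if (cur == 1 && t < deadline)
				continue;
			/* Stable for longer than "duration": return what we have. */
			if (t - last_change > duration)
				return (struct debounce_outcome){ 0, cur, t };
			continue;
		}

		/* Value changed: restart the stability window. */
		last = cur;
		last_change = t;
		if (t > deadline)
			return (struct debounce_outcome){ -1, cur, t };
	}
}
```

In this model, a link stuck at DET == 1 with duration 2 and deadline 10 returns rc 0 at tick 10 with DET still 1, which matches the "SStatus 1" case in the log. A link that moves from 1 to 3 at tick 2 returns rc 0 at tick 5 with DET == 3.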
I suggest that you try to increase the deadline, perhaps start off by simply
multiplying it by some factor in sata_link_debounce().
It would also be helpful if your commit message explained why returning
-EAGAIN makes a difference, because from what I can see, if the deadline
expires, sata_link_debounce() returns 0, which should cause sata_link_resume()
to return 0, which should cause sata_link_hardreset() to
return 0, with online == false.
If that is the case ata_do_reset() would return 0, and
ata_eh_followup_srst_needed() (returns true only if -EAGAIN) would return false.
Which should eventually cause us to retry another hard reset, as long as
tries <= max_tries.
By making sata_link_hardreset() return -EAGAIN, the difference I see is that
we will do a follow-up software reset after the hardreset, but your commit
message did not mention that.
So, my question is, why is it not sufficient to retry another
hardreset/COMRESET?
Does it work to only do a hardreset (without a follow-up softreset) if you
increase the deadline?
Kind regards,
Niklas
* Re: [PATCH] ata: libata-sata: retry hardreset when device detected but PHY not established
2026-04-30 8:46 ` Niklas Cassel
@ 2026-04-30 9:28 ` Niklas Cassel
0 siblings, 0 replies; 11+ messages in thread
From: Niklas Cassel @ 2026-04-30 9:28 UTC (permalink / raw)
To: yangxingui
Cc: Damien Le Moal, linux-kernel, liuyonglong, kangfenglong,
linux-ide
On Thu, Apr 30, 2026 at 10:46:22AM +0200, Niklas Cassel wrote:
> If that is the case ata_do_reset() would return 0, and
> ata_eh_followup_srst_needed() (returns true only if -EAGAIN) would return false.
>
> Which should eventually cause us to retry another hard reset, as long as
> tries <= max_tries.
I see now that max_tries is just set to 1.
I think I would prefer another hardreset (with a larger timeout) over
a follow-up softreset after the hardreset...
If -EAGAIN is reserved for "do a follow-up SRST after the COMRESET",
because certain Port Multipliers need it, perhaps introduce another
error code, which means "device detected": override max_tries to 3 and
goto retry.
That way we will retry using COMRESET, with increasing timeouts, since:
deadline = ata_deadline(jiffies, ata_eh_reset_timeouts[try++]);
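As a rough user-space illustration of that idea (not a patch): each COMRESET attempt uses the next, longer timeout from a table, and a "device detected but no PHY" result triggers another attempt instead of a follow-up SRST. The timeout values, the result codes and all function names below are hypothetical; in particular the table is not the kernel's ata_eh_reset_timeouts[].

```c
#include <stddef.h>

/* Hypothetical timeout table in ms; not the kernel's ata_eh_reset_timeouts[]. */
static const unsigned int example_timeouts_ms[] = { 1000, 2000, 4000 };
#define EXAMPLE_MAX_TRIES \
	(sizeof(example_timeouts_ms) / sizeof(example_timeouts_ms[0]))

/* Hypothetical result codes for the model. */
enum reset_result { RESET_OK = 0, RESET_DEV_DETECTED_NO_PHY = 1 };

/* Model: the PHY comes up only if the attempt waits at least phy_ready_ms. */
static enum reset_result model_hardreset(unsigned int timeout_ms,
					 unsigned int phy_ready_ms)
{
	return timeout_ms >= phy_ready_ms ? RESET_OK : RESET_DEV_DETECTED_NO_PHY;
}

/* Returns the number of COMRESET attempts used, or -1 if all attempts failed. */
static int reset_with_retries(unsigned int phy_ready_ms)
{
	size_t attempt = 0;

	while (attempt < EXAMPLE_MAX_TRIES) {
		unsigned int timeout = example_timeouts_ms[attempt++];

		if (model_hardreset(timeout, phy_ready_ms) == RESET_OK)
			return (int)attempt;
		/* Device detected but no PHY: retry with the next, longer timeout. */
	}
	return -1;
}
```

In this model a device whose PHY needs 3000 ms succeeds on the third attempt, while one that needs more than the largest table entry exhausts all three tries.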
Kind regards,
Niklas