ata timeout exceptions

public inbox for linux-ide@vger.kernel.org

* ata timeout exceptions
@ 2025-11-03  4:13 Eyal Lebedinsky
  2025-11-09 20:40 ` Niklas Cassel
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-03  4:13 UTC (permalink / raw)
  To: list linux-ide

I have a sata disk that is probably on its last legs.
It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
It sees very little activity.

Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.

For the last few weeks it started to log timeout errors (not always) like this:

   kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
   kernel: ata2.00: failed command: WRITE FPDMA QUEUED
   kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
                    res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
   kernel: ata2.00: status: { DRDY }
   kernel: ata2.00: failed command: WRITE FPDMA QUEUED
   kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
                    res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
   kernel: ata2.00: status: { DRDY }
   kernel: ata2: hard resetting link
   kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
   kernel: ata2.00: configured for UDMA/133
   kernel: ata2: EH complete

Looking at the smart log I see that one more command_timeout was counted and no other attribute is incremented.

However, later on, this error was followed by 31 more failures; probably the full command queue was aborted.
The messages mention 'tag 0 ncq dma' through 'tag 31 ncq dma'.
Again, in the smart log, the whole burst counted as one extra command_timeout.
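
For reference, the SAct value in the exception line appears to be a bitmask of the NCQ tags that were outstanding when the port was frozen, one bit per tag. A minimal sketch of decoding it under that assumption (the helper name is illustrative, not a kernel interface):

```python
def active_tags(sact: int) -> list[int]:
    """Return the NCQ tag numbers whose bits are set in a 32-bit SAct value."""
    return [tag for tag in range(32) if sact & (1 << tag)]


# SAct from the first exception in the log above.
print(active_tags(0x2020))
# A completely full queue, as in the later 32-command bursts.
print(len(active_tags(0xFFFFFFFF)))
```

With SAct 0x2020 this yields tags 5 and 13, which matches the two failed commands in the log.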

After this had been going on for a few days, a repeated burst of errors led to:
   kernel: ata2.00: NCQ disabled due to excessive errors

From then on, only one exception is logged at a time:
   kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
   kernel: ata2.00: failed command: WRITE DMA EXT
   kernel: ata2.00: cmd 35/00:00:98:a3:4c/00:20:86:01:00/e0 tag 6 dma 4194304 out
                    res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
   kernel: ata2.00: status: { DRDY }
   kernel: ata2: hard resetting link
   kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
   kernel: ata2.00: configured for UDMA/133
   kernel: ata2: EH complete

Furthermore, the smart log shows no change. This has been going on for the last two days,
over a dozen times.

I want to understand what is going on:

1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
    Which layer absorbs the errors, hiding them from the application?

2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?

Naturally, I already copied the disk to a replacement which I will install after this disk fails completely.

--
Eyal at Home (eyal@eyal.emu.id.au)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: ata timeout exceptions
  2025-11-03  4:13 ata timeout exceptions Eyal Lebedinsky
@ 2025-11-09 20:40 ` Niklas Cassel
  2025-11-09 22:41   ` Eyal Lebedinsky
  2025-11-14  4:32 ` Eyal Lebedinsky
  2025-12-16 23:39 ` Eyal Lebedinsky
  2 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-11-09 20:40 UTC (permalink / raw)
  To: Eyal Lebedinsky; +Cc: list linux-ide

Hello Eyal,

On Mon, Nov 03, 2025 at 03:13:34PM +1100, Eyal Lebedinsky wrote:
> 
> I want to understand what is going on:
> 
> 1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
>    Which layer absorbs the errors, hiding them from the application?

SCSI layer.

For a timed out command, libata will set DID_TIME_OUT:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/ata/libata-eh.c#L652-L654

For most commands, the SCSI layer will set cmd->allowed to sdkp->max_retries:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L1411

which by default is 5:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L3962

Thus, most commands will be retried up to 5 times:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/scsi_error.c#L2225

Thus, the user will only see the I/O as an error if the command failed
6 times.

(Note that if the command returns sense data instead of timeout, depending on
the sense data returned, we might report an I/O error to the user immediately.)
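
As a rough illustration of the retry accounting described above, here is a simplified model (not the kernel code; the function name and the list-of-outcomes input are made up for the example, and only the default of 5 retries comes from the text):

```python
def submit(outcomes, max_retries=5):
    """Model one command: outcomes is a sequence of booleans (True = attempt succeeded).

    Returns ("ok", attempts) on the first success, or ("io_error", attempts)
    once the initial attempt plus max_retries retries have all failed.
    """
    attempts = 0
    for ok in outcomes:
        attempts += 1
        if ok:
            return "ok", attempts
        if attempts > max_retries:
            return "io_error", attempts
    raise ValueError("not enough outcomes supplied to reach a verdict")


# One timeout followed by a successful retry: the application never notices.
print(submit([False, True]))
# Six consecutive timeouts: only now is an I/O error reported.
print(submit([False] * 6))
```

In this model a single timeout followed by a successful retry is invisible to rsync, which is consistent with what was observed.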

> 
> 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?

You are right that even if it is only a single command that times out,
the whole queue will be drained and retried.
(Because we always do a hard reset after a command timeout.)

command_timeout is most likely increased only by one because it was
only a single command that timed out. (The other commands might have
been queued but were never executed/finished.)

I have no idea why a command timeout, when NCQ has been disabled,
does not increase the command_timeout counter. My expectation would
have been for the counter to still be increased by one.

Kind regards,
Niklas


* Re: ata timeout exceptions
  2025-11-09 20:40 ` Niklas Cassel
@ 2025-11-09 22:41   ` Eyal Lebedinsky
  2025-11-10 13:11     ` Niklas Cassel
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-09 22:41 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel

Hello Niklas,

On 10/11/25 07:40, Niklas Cassel wrote:
> Hello Eyal,
> 
> On Mon, Nov 03, 2025 at 03:13:34PM +1100, Eyal Lebedinsky wrote:
>>
>> I want to understand what is going on:
>>
>> 1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
>>     Which layer absorbs the errors, hiding them from the application?
> 
> SCSI layer.

I now understand this: the error does not originate from the disk itself, which may be unaware of it.

> For a timed out command, libata will set DID_TIME_OUT:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/ata/libata-eh.c#L652-L654
> 
> For most commands, the SCSI layer will set cmd->allowed to sdkp->max_retries:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L1411
> 
> which by default is 5:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L3962
> 
> Thus, most commands will be retried up to 5 times:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/scsi_error.c#L2225
> 
> Thus, the user will only see the I/O as an error if the command failed
> 6 times.
> 
> (Note that if the command returns sense data instead of timeout, depending on
> the sense data returned, we might report an I/O error to the user immediately.)

Initially, after a series of failures, NCQ was disabled by the kernel:
	ata2.00: NCQ disabled due to excessive errors
after which I forced it off on the boot command line:
	ata2.00: FORCE: modified (noncq)
and no Command_Timeout has been counted since.

>>
>> 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
> 
> You are right that even if it is only a single command that times out,
> the whole queue will be drained and retried.
> (Because we always do a hard reset after a command timeout.)
> 
> command_timeout is most likely increased only by one because it was
> only a single command that timed out. (The other commands might have
> been queued but were never executed/finished.)
> 
> I have no idea why a command timeout, when NCQ has been disabled,
> does not increase the command_timeout counter. My expectation would
> have been for the counter to still be increased by one.

This is an older SMR disk, and I would not be surprised if the disk was not even executing the command yet
but was doing some housekeeping when it was reset. After raising the timeout from 30s to 180s I still had one
case where a reset was invoked. I saw (iostat was running) that there was no activity on the disk that whole time.

Or maybe it is just a fw bug in the disk (ST8000AS0002-1NA17Z from 2016)?
Is it possible that a reset when a command is pending is not counted in the smart log?

Interestingly, after repeated consecutive resets the link speed was downshifted 6.0->3.0->1.5g.
Now it boots at 3.0g when it used to always boot at 6.0g.
There must be a real issue there which is why the disk will be replaced anyway.
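
For reference, the negotiated speed can be read out of the SStatus value that the kernel prints in hex on the "SATA link up" line. A minimal sketch, assuming the usual SATA field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the function and table names are illustrative:

```python
SPEEDS = {0: "no link", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_sstatus(sstatus: int) -> dict:
    """Split a SATA SStatus value into its DET, SPD and IPM fields."""
    spd = (sstatus >> 4) & 0xF
    return {
        "det": sstatus & 0xF,         # 3 = device present, phy communication established
        "spd": spd,                   # negotiated interface generation
        "ipm": (sstatus >> 8) & 0xF,  # 1 = interface in active state
        "speed": SPEEDS.get(spd, "unknown"),
    }


# "SStatus 133" from the log lines above is hexadecimal 0x133.
print(decode_sstatus(0x133))
```

Under this layout, a link that has downshifted would report 0x123 (3.0 Gbps) or 0x113 (1.5 Gbps) instead of 0x133.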

Regardless, I now have a better understanding of the i/o path.

Regards,
	Eyal

> Kind regards,
> Niklas
-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-09 22:41   ` Eyal Lebedinsky
@ 2025-11-10 13:11     ` Niklas Cassel
  0 siblings, 0 replies; 28+ messages in thread
From: Niklas Cassel @ 2025-11-10 13:11 UTC (permalink / raw)
  To: Eyal Lebedinsky; +Cc: list linux-ide

On Mon, Nov 10, 2025 at 09:41:29AM +1100, Eyal Lebedinsky wrote:
> > > 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
> > 
> > You are right that even if it is only a single command that times out,
> > the whole queue will be drained and retried.
> > (Because we always do a hard reset after a command timeout.)
> > 
> > command_timeout is most likely increased only by one because it was
> > only a single command that timed out. (The other commands might have
> > been queued but were never executed/finished.)
> > 
> > I have no idea why a command timeout, when NCQ has been disabled,
> > does not increase the command_timeout counter. My expectation would
> > have been for the counter to still be increased by one.
> This is an older SMR disk, and I would not be surprised if the disk was not even executing the command yet
> but was doing some housekeeping when it was reset. After raising the timeout from 30s to 180s I still had one
> case where a reset was invoked. I saw (iostat was running) that there was no activity on the disk that whole time.
> 
> Or maybe it is just a fw bug in the disk (ST8000AS0002-1NA17Z from 2016)?
> Is it possible that a reset when a command is pending is not counted in the smart log?
> 
> Interestingly, after repeated consecutive resets the link speed was downshifted 6.0->3.0->1.5g.
> Now it boots at 3.0g when it used to always boot at 6.0g.
> There must be a real issue there which is why the disk will be replaced anyway.
> 
> Regardless, I now have a better understanding of the i/o path.

I'm not sure how the command_timeout counter in the smart log works.

But from the Linux driver perspective, if an I/O has not completed within the
timeout, we will reset the controller, and retry the outstanding commands.

This timeout is defined by Linux, and is by default 30 seconds, like you
mentioned.

Not sure how the drive FW counts a command_timeout, but it is possible that
this internal counter has a timeout that is different from 30 seconds.

For AHCI, performing a hardreset is done by writing a register, so it is not
actually a command that is sent down to the drive. (For a softreset on the
other hand, a command is actually sent down to the drive.)


Kind regards,
Niklas


* Re: ata timeout exceptions
  2025-11-03  4:13 ata timeout exceptions Eyal Lebedinsky
  2025-11-09 20:40 ` Niklas Cassel
@ 2025-11-14  4:32 ` Eyal Lebedinsky
  2025-11-18 15:17   ` Niklas Cassel
  2025-12-16 23:39 ` Eyal Lebedinsky
  2 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-14  4:32 UTC (permalink / raw)
  To: list linux-ide

On 3/11/25 15:13, Eyal Lebedinsky wrote:
> I have a sata disk that is probably on its last legs.
> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.

	It is ST8000AS0002-1NA17Z from 2016.

> It sees very little activity.
> 
> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
> 
> For the last few weeks it started to log timeout errors (not always) like this:

For the last two weeks I was monitoring the activity on the disk and here is what I did:

- added to boot command line:	libata.force=2.00:noncq
	now smartmon sees no more Command_Timeout errors since
- added to rc.local:		echo 180 >/sys/block/sda/device/timeout	# was 30
	Drastically fewer timeouts/resets
	No counters change in smartctl report
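
The two per-device values discussed here can be read back from sysfs to confirm what is in effect. A small sketch, assuming the standard SCSI device attributes `timeout` and `eh_timeout` under /sys/block/<disk>/device/ (the helper name and the `sysfs_root` parameter are illustrative; the latter only exists so the function can be pointed at a test directory):

```python
from pathlib import Path


def read_scsi_timeouts(disk: str = "sda", sysfs_root: str = "/sys") -> dict:
    """Return the command timeout and EH timeout in seconds (None if an attribute is missing)."""
    dev = Path(sysfs_root) / "block" / disk / "device"
    result = {}
    for name in ("timeout", "eh_timeout"):
        path = dev / name
        result[name] = int(path.read_text().strip()) if path.is_file() else None
    return result
```

Calling `read_scsi_timeouts("sda")` after the rc.local line above has run should report 180 for `timeout`; `eh_timeout` is left at whatever the kernel default is unless it is set separately.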

Q1) Do I need to also set eh_timeout?

Q2) Is there any disk parameter I should set?

Then ran iostat and monitored the system log.
Every 2 hours a sync of 20-30GB is done to this disk. A smooth run takes about 4 minutes.
Mostly it completes without errors logged.
However, once or twice a day the pauses become long enough to hit the 180s timeout.

Note: I do not see pauses longer than 30s but shorter than 180s.

See logs below.

Q3) What is going on in the disk during a pause? I understand that there was no communication from the disk,
just a long wait until the system issues a reset, when it probably retries successfully.

This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).

Regards,
	Eyal

## example sync with pauses but without resets (timeout set to 180s):
14:07:46 2025-11-14
14:07:46 Device       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
14:07:56 sda         0.00         0.00         0.00         0.00          0          0          0
14:08:06 sda        29.60        10.40     55742.80         0.00        104     557428          0       start
14:08:16 sda        74.30         9.20    132401.60         0.00         92    1324016          0
14:08:26 sda        56.30         6.80    137086.80         0.00         68    1370868          0
14:08:36 sda        55.50         4.80    144625.20         0.00         48    1446252          0
14:08:46 sda        49.30         1.60     90455.20         0.00         16     904552          0
14:08:56 sda         0.00         0.00         0.00         0.00          0          0          0       pause #1
14:09:06 sda         0.00         0.00         0.00         0.00          0          0          0
14:09:16 sda         0.00         0.00         0.00         0.00          0          0          0
14:09:26 sda        95.50         3.60    135380.00         0.00         36    1353800          0
14:09:36 sda        85.20         1.60    164806.00         0.00         16    1648060          0
14:09:46 sda         4.10         0.80     10730.80         0.00          8     107308          0
14:09:56 sda         0.00         0.00         0.00         0.00          0          0          0       pause #2
14:10:06 sda         0.00         0.00         0.00         0.00          0          0          0
14:10:16 sda        16.80         0.80     33057.20         0.00          8     330572          0
14:10:26 sda         0.00         0.00         0.00         0.00          0          0          0       pause #3
14:10:36 sda         0.00         0.00         0.00         0.00          0          0          0
14:10:46 sda         0.00         0.00         0.00         0.00          0          0          0
14:10:56 sda        35.30         3.60     69011.20         0.00         36     690112          0
14:11:06 sda        75.90         4.00    145637.60         0.00         40    1456376          0
14:11:16 sda        10.00         0.80     24583.60         0.00          8     245836          0
14:11:26 sda         0.00         0.00         0.00         0.00          0          0          0       short pause #1
14:11:36 sda         9.00         0.40     14786.40         0.00          4     147864          0
14:11:46 sda        61.90         2.80    146004.80         0.00         28    1460048          0
14:11:56 sda        10.50         1.20     28356.80         0.00         12     283568          0
14:12:06 sda         0.00         0.00         0.00         0.00          0          0          0       pause #4
14:12:16 sda         0.00         0.00         0.00         0.00          0          0          0
14:12:26 sda        58.80         3.60    139051.20         0.00         36    1390512          0
14:12:36 sda        10.70         0.80     25866.40         0.00          8     258664          0
14:12:46 sda         0.00         0.00         0.00         0.00          0          0          0       short pause #2
14:12:46 sda         1.65        10.83      2721.90         0.00     142777   35897004          0
14:12:56 sda         0.00         0.00         0.00         0.00          0          0          0       short pause #3
14:13:06 sda        19.80         0.80     31588.00         0.00          8     315880          0
14:13:16 sda        34.50         2.00     84646.40         0.00         20     846464          0
14:13:26 sda         0.00         0.00         0.00         0.00          0          0          0       pause #5
14:13:36 sda         0.00         0.00         0.00         0.00          0          0          0
14:13:46 sda        68.20         2.80    118868.40         0.00         28    1188684          0
14:13:56 sda         0.20         0.00       380.80         0.00          0       3808          0
14:14:06 sda         0.00         0.00         0.00         0.00          0          0          0       pause #6
14:14:16 sda         0.00         0.00         0.00         0.00          0          0          0
14:14:26 sda        24.00         1.60     51746.40         0.00         16     517464          0
14:14:36 sda        63.80         5.20    136939.20         0.00         52    1369392          0
14:14:46 sda        59.70         4.80    125234.80         0.00         48    1252348          0
14:14:56 sda        16.70         1.20     40121.20         0.00         12     401212          0
14:15:06 sda         8.80         0.00        69.20         0.00          0        692          0
14:15:16 sda         0.00         0.00         0.00         0.00          0          0          0       end
14:15:26 sda         0.00         0.00         0.00         0.00          0          0          0
14:15:26 Device       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
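
The pauses marked by hand in the log above could also be extracted mechanically. A rough sketch, assuming timestamped iostat lines in the layout shown (time, device, tps, ...), a fixed 10 second interval, and that the since-boot summary lines have been filtered out beforehand; the function name is illustrative:

```python
def find_pauses(lines, device="sda", interval=10):
    """Return (first_idle_timestamp, seconds) for each idle run between two active samples."""
    samples = []
    for line in lines:
        fields = line.split()
        # Expect: time, device, tps, kB_read/s, kB_wrtn/s, ...
        if len(fields) >= 5 and fields[1] == device:
            samples.append((fields[0], float(fields[2]) > 0.0))

    pauses = []
    run_start, run_len, seen_active = None, 0, False
    for stamp, active in samples:
        if active:
            if seen_active and run_len:
                pauses.append((run_start, run_len * interval))
            seen_active, run_len = True, 0
        elif seen_active:
            if run_len == 0:
                run_start = stamp
            run_len += 1
    return pauses


SAMPLE = """\
14:08:46 sda        49.30         1.60     90455.20         0.00         16     904552          0
14:08:56 sda         0.00         0.00         0.00         0.00          0          0          0
14:09:06 sda         0.00         0.00         0.00         0.00          0          0          0
14:09:16 sda         0.00         0.00         0.00         0.00          0          0          0
14:09:26 sda        95.50         3.60    135380.00         0.00         36    1353800          0
""".splitlines()

print(find_pauses(SAMPLE))
```

Idle samples before the first write or after the last one are ignored, so only gaps inside a sync are reported. Since each sample covers a whole interval, the durations are only accurate to within about one interval.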

However, a few times a day the pauses become long enough to hit the 180s timeout:

## example sync with pauses and one reset (timeout set to 180s):
12:08:05 sda        29.20       194.80     45394.40         0.00       1948     453944          0       start
12:08:15 sda        36.40        14.40    104648.40         0.00        144    1046484          0
12:08:25 sda        37.10        13.20    107768.40         0.00        132    1077684          0
12:08:35 sda        28.60         8.80     99516.40         0.00         88     995164          0
12:08:45 sda        31.30         8.40    114315.60         0.00         84    1143156          0
12:08:55 sda        33.10         9.60    111729.20         0.00         96    1117292          0
12:09:05 sda        12.10         1.60     24356.00         0.00         16     243560          0
12:09:15 sda         0.00         0.00         0.00         0.00          0          0          0       pause #1
12:09:25 sda         0.00         0.00         0.00         0.00          0          0          0
12:09:35 sda         7.10         2.00     16304.00         0.00         20     163040          0
12:09:45 sda        51.40        10.00    109822.80         0.00        100    1098228          0
12:09:55 sda        58.10        33.20    111903.60         0.00        332    1119036          0
12:10:05 sda        40.20        11.20    111351.20         0.00        112    1113512          0
12:10:15 sda        45.00        15.60    101494.00         0.00        156    1014940          0
12:10:25 sda        43.80        10.80    121330.80         0.00        108    1213308          0
12:10:35 sda        41.30        17.20    112128.40         0.00        172    1121284          0
12:10:45 sda        46.30        10.00    111787.20         0.00        100    1117872          0
12:10:55 sda        43.80         9.20    108923.60         0.00         92    1089236          0
12:11:05 sda        47.70        11.60    115351.60         0.00        116    1153516          0
12:11:15 sda        68.90        11.60    122597.60         0.00        116    1225976          0
12:11:25 sda        35.60         7.20     76411.60         0.00         72     764116          0
12:11:35 sda        24.60         4.80     55798.80         0.00         48     557988          0
12:11:45 sda        11.90         0.80     24110.00         0.00          8     241100          0
12:11:55 sda         0.00         0.00         0.00         0.00          0          0          0       pause #2
12:12:05 sda         0.00         0.00         0.00         0.00          0          0          0
12:12:15 sda         9.90         0.80     10282.40         0.00          8     102824          0
12:12:25 sda         0.00         0.00         0.00         0.00          0          0          0       long pause
12:12:35 sda         0.00         0.00         0.00         0.00          0          0          0
12:12:45 sda         0.00         0.00         0.00         0.00          0          0          0
12:12:55 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:05 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:15 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:25 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:35 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:45 sda         0.00         0.00         0.00         0.00          0          0          0
12:13:55 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:05 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:15 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:25 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:35 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:45 sda         0.00         0.00         0.00         0.00          0          0          0
12:14:45 2025-11-14
12:14:45 Device       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
12:14:45 sda         2.20        23.25      3139.39         0.00     142021   19175400          0
12:14:55 sda         0.00         0.00         0.00         0.00          0          0          0
12:15:05 sda         0.00         0.00         0.00         0.00          0          0          0       timeout

12:15:14+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
                                 res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
12:15:14+11:00 kernel: ata2.00: status: { DRDY }
12:15:14+11:00 kernel: ata2: hard resetting link
12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
12:15:15+11:00 kernel: ata2: EH complete

12:15:15 sda         1.40         0.00      2985.20         0.00          0      29852          0
12:15:25 sda        50.60        11.60    121033.60         0.00        116    1210336          0
12:15:35 sda        15.00         1.60     30986.00         0.00         16     309860          0
12:15:45 sda         0.00         0.00         0.00         0.00          0          0          0       pause #3
12:15:55 sda         0.00         0.00         0.00         0.00          0          0          0
12:16:05 sda        11.00         0.40     14271.20         0.00          4     142712          0
12:16:15 sda         5.50         4.00      5271.60         0.00         40      52716          0
12:16:25 sda         0.00         0.00         0.00         0.00          0          0          0       end
12:16:35 sda         0.00         0.00         0.00         0.00          0          0          0

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-14  4:32 ` Eyal Lebedinsky
@ 2025-11-18 15:17   ` Niklas Cassel
  2025-11-18 23:05     ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-11-18 15:17 UTC (permalink / raw)
  To: Eyal Lebedinsky; +Cc: list linux-ide, dlemoal

On Fri, Nov 14, 2025 at 03:32:20PM +1100, Eyal Lebedinsky wrote:
> > For the last few weeks it started to log timeout errors (not always) like this:
> 
> For the last two weeks I was monitoring the activity on the disk and here is what I did:
> 
> - added to boot command line:	libata.force=2.00:noncq
> 	now smartmon sees no more Command_Timeout errors since

To be honest, I don't think you should need to disable NCQ.


> This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).

Which kernel version are you running?

> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>                                 res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
> 12:15:14+11:00 kernel: ata2: hard resetting link
> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
> 12:15:15+11:00 kernel: ata2: EH complete

It is a 4194304 byte write that is failing, i.e. 4 MiB write.
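
The size can be cross-checked against the taskfile bytes on the "cmd" line. A sketch of a decoder, assuming the field order libata prints (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device), 512-byte logical sectors, and that NCQ commands carry the sector count in the feature fields; the helper names are illustrative:

```python
import re

CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)
NCQ_OPCODES = {0x60, 0x61}  # READ / WRITE FPDMA QUEUED


def decode_cmd(line: str, sector_size: int = 512) -> dict:
    """Decode opcode, LBA and transfer size from a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no 'cmd' taskfile found in line")
    command = int(m.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":")
    )
    if command in NCQ_OPCODES:
        sectors = (hob_feature << 8) | feature   # NCQ: count lives in FEATURE
    else:
        sectors = (hob_nsect << 8) | nsect       # 48-bit commands: count in COUNT
    if sectors == 0:
        sectors = 65536                          # zero encodes the maximum count
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {"command": command, "lba": lba, "sectors": sectors,
            "bytes": sectors * sector_size}


print(decode_cmd("ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out"))
```

For the WRITE DMA EXT above this gives 8192 sectors, i.e. 4194304 bytes, agreeing with the "dma 4194304 out" that the kernel prints.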

This sounds very much like a recent bug report we have received:
https://bugzilla.kernel.org/show_bug.cgi?id=220693

In fact, a lot of the failing commands in that bug report is also a read
or write of size 4 MiB.

I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
size for rotational devices") and see if that improves things for you
(while keeping NCQ enabled).


Kind regards,
Niklas


* Re: ata timeout exceptions
  2025-11-18 15:17   ` Niklas Cassel
@ 2025-11-18 23:05     ` Eyal Lebedinsky
  2025-11-19  5:41       ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-18 23:05 UTC (permalink / raw)
  To: Niklas Cassel; +Cc: list linux-ide, dlemoal

Thanks Niklas,

On 19/11/25 02:17, Niklas Cassel wrote:
> On Fri, Nov 14, 2025 at 03:32:20PM +1100, Eyal Lebedinsky wrote:
>>> For the last few weeks it started to log timeout errors (not always) like this:
>>
>> For the last two weeks I was monitoring the activity on the disk and here is what I did:
>>
>> - added to boot command line:	libata.force=2.00:noncq
>> 	now smartmon sees no more Command_Timeout errors since
> 
> To be honest, I don't think you should need to disable NCQ.

Disabling NCQ caused the disk to NOT count a smart Command_Timeout; it did not stop the actual pauses/resets.

>> This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).
> 
> Which kernel version are you running?

This has been happening for a while now, using:
	6.17.6-200.fc42.x86_64
	6.17.7-200.fc42.x86_64
Before the start it was running without a problem:
	6.16.3 - 6.16.12	since Aug 23
	6.17.4-200.fc42.x86_64	since Oct 24 20:48:40	
First timeout/reset a day later	at    Oct 25 20:09:27

I did set the timeout as high as 240s and found that if a pause is longer than 30s then it will always continue and time out.
I can set it higher if there is a possibility that it WILL complete the write. Is it worth it? How long?

The system runs off nvme and includes a 7 disk raid6.

Maybe relevant:
	for a while (recently) following a reset this disk would downshift (6.0->3.0->1.5Gbps).
	For a period it would actually boot up at 3.0Gbps.
	It is back to 6.0Gbps for about 2 weeks (and many resets).

I still suspect the disk itself is at fault (I have a replacement synced and ready).

>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>                                  res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>> 12:15:14+11:00 kernel: ata2: hard resetting link
>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>> 12:15:15+11:00 kernel: ata2: EH complete
> 
> It is a 4194304 byte write that is failing, i.e. 4 MiB write.

Yes, this is the size of almost all commands. With NCQ enabled the sizes are very variable and often less than 1 MiB.

> This sounds very much like a recent bug report we have received:
> https://bugzilla.kernel.org/show_bug.cgi?id=220693
> 
> In fact, a lot of the failing commands in that bug report is also a read
> or write of size 4 MiB.
> 
> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
> size for rotational devices") and see if that improves things for you
> (while keeping NCQ enabled).

I read it. I never had I/O errors reported for this disk so it looks different to me.

Regardless, I am not set up to build a kernel (I used to), and being my main server I hesitate to fiddle with it.
I will keep this disk active and observe the situation.

Regards,
	Eyal

> Kind regards,
> Niklas
-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-18 23:05     ` Eyal Lebedinsky
@ 2025-11-19  5:41       ` Damien Le Moal
  2025-11-19 13:37         ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-19  5:41 UTC (permalink / raw)
  To: eyal, Niklas Cassel; +Cc: list linux-ide

On 11/19/25 08:05, Eyal Lebedinsky wrote:
> I still suspect the disk itself is at fault (I have a replacement synced and
> ready).
>
>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>                                  res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>> 12:15:15+11:00 kernel: ata2: EH complete
>> 
>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
> 
> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
> very variable and often less than 1 MiB.

Yes, because there will be more requests queued in the block layer, which
increases the chances of merging sequential requests. That's why the average
command size goes up.

> 
>> This sounds very much like a recent bug report we have received:
>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>
>> In fact, a lot of the failing commands in that bug report is also a read
>> or write of size 4 MiB.
>>
>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>> size for rotational devices") and see if that improves things for you
>> (while keeping NCQ enabled).
>
> I read it. I never had I/O errors reported for this disk so it looks
> different to me.
> 
> Regardless, I am not set up to build a kernel (I used to), and being my main
> server I hesitate to fiddle with it. I will keep this disk active and
> observe the situation.

No, reverting this commit will not do anything to the max command size that a
disk can see. But you could try this:

echo 1280 > /sys/block/sdX/queue/max_sectors_kb

to reduce the maximum command size that the disk will receive.
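
A minimal sketch of applying this a bit more defensively, assuming the queue
directory of the disk is passed in explicitly (for example
/sys/block/sdX/queue). The function name is hypothetical. It refuses values
above max_hw_sectors_kb, which the kernel would reject anyway, and reads the
value back after writing it.

```shell
# set_max_sectors_kb QUEUE_DIR NEW_KB
# QUEUE_DIR is the sysfs queue directory of the disk (assumed to be passed by
# the caller, e.g. /sys/block/sdX/queue). Must run as root on a real disk.
set_max_sectors_kb() {
    queue_dir=$1
    new_kb=$2
    hw_kb=$(cat "$queue_dir/max_hw_sectors_kb") || return 1
    old_kb=$(cat "$queue_dir/max_sectors_kb") || return 1
    # the soft limit cannot exceed the hardware limit
    if [ "$new_kb" -gt "$hw_kb" ]; then
        echo "refusing: $new_kb KiB exceeds max_hw_sectors_kb ($hw_kb KiB)" >&2
        return 1
    fi
    echo "$new_kb" > "$queue_dir/max_sectors_kb" || return 1
    # read the value back to show what is actually in effect
    echo "max_sectors_kb: $old_kb -> $(cat "$queue_dir/max_sectors_kb")"
}
```

For example: set_max_sectors_kb /sys/block/sdX/queue 1280. Note that a value
written to sysfs this way does not survive a reboot.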

On the other hand, if all drives in your RAID6 array are the same and only this
drive is misbehaving, then I would be tempted to say the same as you: that the
disk is turning bad and replacing it is the best solution.



-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-11-19  5:41       ` Damien Le Moal
@ 2025-11-19 13:37         ` Eyal Lebedinsky
  2025-11-20  3:34           ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-19 13:37 UTC (permalink / raw)
  To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide

Thanks Damien,

On 19/11/25 16:41, Damien Le Moal wrote:
> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>> I still suspect the disk itself it at fault (I have a replacement synced and
>> ready).
>>
>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>>                                 res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>
>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>
>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>> very variable and often less that 1 MiB.
> 
> Yes, because there will be more requests queued in the block layer, which
> increases the chances of merging sequential requests. That's why the average
> command size goes up.
> 
>>
>>> This sounds very much like a recent bug report we have received:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>
>>> In fact, a lot of the failing commands in that bug report is also a read
>>> or write of size 4 MiB.
>>>
>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>> size for rotational devices") and see if that improves things for you
>>> (while keeping NCQ enabled).
>>
>> I read it. I never had I/O errors reported for this disk so it looks
>> different to me.
>>
>> Regardless, I am not set up to build a kernel (I used to), and being my main
>> server I hesitate to fiddle with it. I will keep this disk active and
>> observe the situation.
> 
> No, reverting this commit will not do anything to the max command size that a
> disk can see. But you could try this:
> 
> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
> 
> to reduce the maximum command size that the disk will receive.

I will try this.

> On the other hand, if all drives in your RAID6 array are the same and only this
> drive is misbehaving, then I would be tempted to say the same you are: that the
> disk is turning bad and replacing it is the best solution.

"this drive" is NOT part of the RAID, it is just a scratch disk used when space is
needed or for some local backups. It is old and it will not be any drama if it fails.
This is why I am comfortable trying more options before replacing it.

My main interest is to understand what actually is happening inside the disk.
I assume that copying the data from the CMR part to the SMR part is going on.

The two-hourly job that at times triggers a timeout (1 or 2 times a day) is rsyncing 20GB
into the drive, so this much is updated. It takes just 4 minutes on a good day.

Anyway, thanks everyone,
	Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-19 13:37         ` Eyal Lebedinsky
@ 2025-11-20  3:34           ` Damien Le Moal
  2025-11-20 11:38             ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-20  3:34 UTC (permalink / raw)
  To: eyal, Niklas Cassel; +Cc: list linux-ide

On 11/19/25 10:37 PM, Eyal Lebedinsky wrote:
> Thanks Damien,
> 
> On 19/11/25 16:41, Damien Le Moal wrote:
>> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>>> I still suspect the disk itself it at fault (I have a replacement synced and
>>> ready).
>>>
>>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>>>                                 res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>>
>>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>>
>>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>>> very variable and often less that 1 MiB.
>>
>> Yes, because there will be more requests queued in the block layer, which
>> increases the chances of merging sequential requests. That's why the average
>> command size goes up.
>>
>>>
>>>> This sounds very much like a recent bug report we have received:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>>
>>>> In fact, a lot of the failing commands in that bug report is also a read
>>>> or write of size 4 MiB.
>>>>
>>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>>> size for rotational devices") and see if that improves things for you
>>>> (while keeping NCQ enabled).
>>>
>>> I read it. I never had I/O errors reported for this disk so it looks
>>> different to me.
>>>
>>> Regardless, I am not set up to build a kernel (I used to), and being my main
>>> server I hesitate to fiddle with it. I will keep this disk active and
>>> observe the situation.
>>
>> No, reverting this commit will not do anything to the max command size that a
>> disk can see. But you could try this:
>>
>> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>
>> to reduce the maximum command size that the disk will receive.
> 
> I will try this.
> 
>> On the other hand, if all drives in your RAID6 array are the same and only this
>> drive is misbehaving, then I would be tempted to say the same you are: that the
>> disk is turning bad and replacing it is the best solution.
> 
> "this drive" is NOT part of the RAID, it is just a scratch disk used when space is
> needed or for some local backups. It is old and it will not be any drama if it
> fails.
> This is why I am comfortable trying more options before replacing it.
> 
> My main interest is to understand what actually is happening inside the disk.
> I assume that copying the data from the CRM part to the SMR part is going on.

Ah! This is a drive-managed SMR drive? I missed that point. Yeah, with these
drives, the performance profile (throughput & command latency) can be all over
the place depending on the internal state of the drive. So all bets are off in
terms of timeouts... In your case, this seems extreme though, so there is likely
a head going bad and lots of internal retries going on that make latency even
worse than usual. Maybe have a look at the SMART output to see if you have lots
of bad sectors remapped?
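
A minimal sketch of such a check, assuming the attribute table printed by
"smartctl -A" is piped in on stdin. The function name is hypothetical and the
attribute names are an assumption: they are the ones commonly reported by
smartmontools, but they vary between vendors and models.

```shell
# smart_health_counters: filter `smartctl -A` output (read from stdin) down to
# the attributes usually looked at for remapped or pending sectors, plus the
# command timeout counter. The attribute names below are an assumption and
# may differ on other drives.
smart_health_counters() {
    awk '
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable"  ||
        $2 == "Command_Timeout" {
            # the raw value is column 10 and may contain spaces
            raw = $10
            for (i = 11; i <= NF; i++)
                raw = raw " " $i
            printf "%s raw=%s\n", $2, raw
        }
    '
}
```

Typical use would be something like:

    smartctl -A /dev/sdX | smart_health_counters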


-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-11-20  3:34           ` Damien Le Moal
@ 2025-11-20 11:38             ` Eyal Lebedinsky
  2025-11-20 12:18               ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-20 11:38 UTC (permalink / raw)
  To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide

Thanks again Damien,

On 20/11/25 14:34, Damien Le Moal wrote:
> On 11/19/25 10:37 PM, Eyal Lebedinsky wrote:
>> Thanks Damien,
>>
>> On 19/11/25 16:41, Damien Le Moal wrote:
>>> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>>>> I still suspect the disk itself it at fault (I have a replacement synced and
>>>> ready).
>>>>
>>>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>>>>                                 res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>>>
>>>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>>>
>>>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>>>> very variable and often less that 1 MiB.
>>>
>>> Yes, because there will be more requests queued in the block layer, which
>>> increases the chances of merging sequential requests. That's why the average
>>> command size goes up.
>>>
>>>>
>>>>> This sounds very much like a recent bug report we have received:
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>>>
>>>>> In fact, a lot of the failing commands in that bug report is also a read
>>>>> or write of size 4 MiB.
>>>>>
>>>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>>>> size for rotational devices") and see if that improves things for you
>>>>> (while keeping NCQ enabled).
>>>>
>>>> I read it. I never had I/O errors reported for this disk so it looks
>>>> different to me.
>>>>
>>>> Regardless, I am not set up to build a kernel (I used to), and being my main
>>>> server I hesitate to fiddle with it. I will keep this disk active and
>>>> observe the situation.
>>>
>>> No, reverting this commit will not do anything to the max command size that a
>>> disk can see. But you could try this:
>>>
>>> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>>
>>> to reduce the maximum command size that the disk will receive.

Done.

>> I will try this.
>>
>>> On the other hand, if all drives in your RAID6 array are the same and only this
>>> drive is misbehaving, then I would be tempted to say the same you are: that the
>>> disk is turning bad and replacing it is the best solution.
>>
>> "this drive" is NOT part of the RAID, it is just a scratch disk used when space is
>> needed or for some local backups. It is old and it will not be any drama if it
>> fails.
>> This is why I am comfortable trying more options before replacing it.
>>
>> My main interest is to understand what actually is happening inside the disk.
>> I assume that copying the data from the CRM part to the SMR part is going on.
> 
> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
> drives, the performance profile (throughpu & command latency) can be all over
> the place depending on the internal state of the drive. So all bets are off in
> terms of timeout... In your case, this seems extreme though, so there is likely
> a head going bad and lots of internal retries going on that make latency even
> worse than usual. Maybe have a look at SMART output to see if you lots of bad
> sectors remapped ?

Nothing bad in smart report.

Another positive: After setting a lower max_sectors_kb as suggested, the drive is
running smoothly. I also added --fsync to the rsync which probably also regulated
the pace a bit.

So far today there was no reset required, and also no pause at all.

Maybe after the disk was used for a long while, and as a large amount of data was
replaced regularly, the data is now distributed wildly.

Is there an equivalent to 'trim' that can be used to tell the drive what blocks
can be discarded (and reused)? If so, worth a try?

Regards,
	Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-20 11:38             ` Eyal Lebedinsky
@ 2025-11-20 12:18               ` Damien Le Moal
  2025-11-20 23:53                 ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-20 12:18 UTC (permalink / raw)
  To: eyal, Niklas Cassel; +Cc: list linux-ide

On 11/20/25 20:38, Eyal Lebedinsky wrote:
>> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
>> drives, the performance profile (throughpu & command latency) can be all over
>> the place depending on the internal state of the drive. So all bets are off in
>> terms of timeout... In your case, this seems extreme though, so there is likely
>> a head going bad and lots of internal retries going on that make latency even
>> worse than usual. Maybe have a look at SMART output to see if you lots of bad
>> sectors remapped ?
> 
> Nothing bad in smart report.
> 
> Another positive: After setting a lower max_sectors_kb as suggested, the drive is
> running smoothly. I also added --fsync to the rsync which probably also regulated
> the pace a bit.
> 
> So far today there was no reset required, and also no pause at all.
> 
> Maybe after the disk was used for a long while, and as a large amount of data was
> replaced regularly, the data is now distributed wildly.
> 
> Is there an equivalent to 'trim' that can be used to tell the drive what blocks
> can be discarded (and reused)? If so, worth a try?

If your drive shows a non-zero value for:

cat /sys/block/sdX/queue/discard_max_hw_bytes

then you can run fstrim against the FS on the drive to trim (discard) the unused
blocks. If the value is zero, then the drive does not support discard/trim.
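
A minimal sketch of that check, assuming the sysfs queue directory and the
mount point of the filesystem are passed in by the caller. The function name
is hypothetical.

```shell
# trim_if_supported QUEUE_DIR MOUNTPOINT
# QUEUE_DIR is the sysfs queue directory of the disk (e.g. /sys/block/sdX/queue),
# MOUNTPOINT is where the filesystem on that disk is mounted. Both are
# supplied by the caller; fstrim needs root.
trim_if_supported() {
    queue_dir=$1
    mountpoint=$2
    max_bytes=$(cat "$queue_dir/discard_max_hw_bytes") || return 1
    if [ "$max_bytes" -eq 0 ]; then
        echo "discard/trim not supported (discard_max_hw_bytes is 0)"
        return 1
    fi
    # -v makes fstrim report how many bytes were discarded
    fstrim -v "$mountpoint"
}
```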

-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-11-20 12:18               ` Damien Le Moal
@ 2025-11-20 23:53                 ` Eyal Lebedinsky
  0 siblings, 0 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-20 23:53 UTC (permalink / raw)
  To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide

Thanks Damien,

On 20/11/25 23:18, Damien Le Moal wrote:
> On 11/20/25 20:38, Eyal Lebedinsky wrote:
>>> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
>>> drives, the performance profile (throughpu & command latency) can be all over
>>> the place depending on the internal state of the drive. So all bets are off in
>>> terms of timeout... In your case, this seems extreme though, so there is likely
>>> a head going bad and lots of internal retries going on that make latency even
>>> worse than usual. Maybe have a look at SMART output to see if you lots of bad
>>> sectors remapped ?
>>
>> Nothing bad in smart report.
>>
>> Another positive: After setting a lower max_sectors_kb as suggested, the drive is
>> running smoothly. I also added --fsync to the rsync which probably also regulated
>> the pace a bit.
>>
>> So far today there was no reset required, and also no pause at all.
>>
>> Maybe after the disk was used for a long while, and as a large amount of data was
>> replaced regularly, the data is now distributed wildly.
>>
>> Is there an equivalent to 'trim' that can be used to tell the drive what blocks
>> can be discarded (and reused)? If so, worth a try?
> 
> If you drive shows a non-zero value for:
> 
> cat /sys/block/sdX/queue/discard_max_hw_bytes
> 
> then you can run fstrim against the FS on the drive to trim (discard) the unused
> blocks. If the value is zero, then the drive does not support discard/trim.

Not supported.

Is there a way to mark everything unused? Or was SMR not designed to handle this, way back in 2014?
I am able to copy the disk's contents elsewhere, clear the disk (if there is a way), then copy them back.

<unrelated>
Seagate shows the disk (ST8000AS0002-1NA17Z) as released in 2014 and says "No Newer Firmware Available".
The latest firmware versions I found mentioned are AR15/AR17 (mine is AR13), which still do not support trim. They are for a later disk revision.
	https://smarthdd.com/database/ST8000AS0002-1NA17Z/
</unrelated>

Regards,
	Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-11-03  4:13 ata timeout exceptions Eyal Lebedinsky
  2025-11-09 20:40 ` Niklas Cassel
  2025-11-14  4:32 ` Eyal Lebedinsky
@ 2025-12-16 23:39 ` Eyal Lebedinsky
  2025-12-17  1:35   ` Damien Le Moal
  2 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-16 23:39 UTC (permalink / raw)
  To: list linux-ide

Resolved.

Limiting the maximum command size (as suggested by Damien Le Moal <dlemoal@kernel.org>)
	# echo 1280 > /sys/block/sdX/queue/max_sectors_kb
did the trick. No pauses/resets anymore for over a month.

Setting
	# echo 180 >/sys/block/sda/device/timeout
did not help, only made the pauses longer before the reset.
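
Since a value written to sysfs is lost on reboot, one way to reapply it is a
udev rule. A minimal sketch, assuming the disk can be matched by its model
string; the function name, the KERNEL match, the model glob and the rules
file name used below are all assumptions for illustration.

```shell
# print_max_sectors_rule MODEL_GLOB KIB
# Prints a udev rule that sets queue/max_sectors_kb whenever a block device
# whose parent reports a matching model string is added or changed.
# Matching on KERNEL=="sd[a-z]" and ATTRS{model} is an assumption; adjust the
# match keys to whatever identifies the disk reliably on your system.
print_max_sectors_rule() {
    printf 'ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTRS{model}=="%s", ATTR{queue/max_sectors_kb}="%s"\n' "$1" "$2"
}
```

For example (the file name is arbitrary):

    print_max_sectors_rule 'ST8000AS0002*' 1280 > /etc/udev/rules.d/60-max-sectors.rules
    udevadm control --reload
    udevadm trigger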

Thanks everyone.
	Eyal

On 3/11/25 15:13, Eyal Lebedinsky wrote:
> I have a sata disk that is probably on its last legs.
> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMA disk.
> It sees very little activity.
> 
> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
> 
> For the last few weeks it started to log timeout errors (not always) like this:
> 
>    kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
>    kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>    kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
>                     res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>    kernel: ata2.00: status: { DRDY }
>    kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>    kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
>                     res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>    kernel: ata2.00: status: { DRDY }
>    kernel: ata2: hard resetting link
>    kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>    kernel: ata2.00: configured for UDMA/133
>    kernel: ata2: EH complete

[trimmed]

-- 
Eyal Lebedinsky	(eyal@eyal.emu.id.au)



* Re: ata timeout exceptions
  2025-12-16 23:39 ` Eyal Lebedinsky
@ 2025-12-17  1:35   ` Damien Le Moal
  2025-12-17 11:56     ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-17  1:35 UTC (permalink / raw)
  To: eyal, list linux-ide, Niklas Cassel

On 12/17/25 08:39, Eyal Lebedinsky wrote:
> Resolved.
> 
> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
> 	# echo 1280 > /sys/block/sdX/queue/max_sectors_kb
> did the trick. No pauses/resets anymore for over a month.

We now have patches queued up to limit max_sectors_kb for devices and
controllers behaving badly. If you send us your device information (hdparm -I)
and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.

Though we would need to determine if it is the device or the adapter that is
misbehaving and, ideally, the command size at which things break.
We had another case with a device breaking above 4MiB commands. A quirk setting
max hw sectors to 8191 sectors solved the issue.
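
The sizes in this thread come in three units: bytes (in the log lines), KiB
(max_sectors_kb) and sectors (the quirk). A small sketch of the conversions,
assuming 512-byte sectors; the helper names are made up for illustration.

```shell
# Unit conversions, assuming 512-byte sectors.
sectors_to_bytes() { echo $(( $1 * 512 )); }
bytes_to_sectors() { echo $(( $1 / 512 )); }
kib_to_sectors()   { echo $(( $1 * 2 )); }   # 1 KiB = 2 sectors of 512 bytes

sectors_to_bytes 8191      # prints 4193792, i.e. 512 bytes short of 4 MiB
bytes_to_sectors 4194304   # prints 8192, the 4 MiB write from the log
kib_to_sectors 1280        # prints 2560, the workaround value in sectors
```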

> 
> Setting
> 	# echo 180 >/sys/block/sda/device/timeout
> did not help, only made the pauses longer before the reset.
> 
> Thanks everyone.
> 	Eyal
> 
> On 3/11/25 15:13, Eyal Lebedinsky wrote:
>> I have a sata disk that is probably on its last legs.
>> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMA disk.
>> It sees very little activity.
>>
>> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>>
>> For the last few weeks it started to log timeout errors (not always) like this:
>>
>>    kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
>>    kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>    kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
>>                     res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>    kernel: ata2.00: status: { DRDY }
>>    kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>    kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
>>                     res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>>    kernel: ata2.00: status: { DRDY }
>>    kernel: ata2: hard resetting link
>>    kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>    kernel: ata2.00: configured for UDMA/133
>>    kernel: ata2: EH complete
> 
> [trimmed]
> 


-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-12-17  1:35   ` Damien Le Moal
@ 2025-12-17 11:56     ` Eyal Lebedinsky
  2025-12-17 12:02       ` Niklas Cassel
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-17 11:56 UTC (permalink / raw)
  To: list linux-ide; +Cc: Damien Le Moal, Niklas Cassel

On 17/12/25 12:35, Damien Le Moal wrote:
> On 12/17/25 08:39, Eyal Lebedinsky wrote:
>> Resolved.
>>
>> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
>> 	# echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>> did the trick. No pauses/resets anymore for over a month.
> 
> We now have patches queued up to limit max_sectors_kb for devices and
> controllers behaving badly. If you send us your device information (hdparm -I)
> and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.
> 
> Though we would need to determine if is is the device or the adapter that is
> mis-behaving, and also ideally, the command size at which things break.
> We had another case with a device breaking above 4MiB commands. A quirk setting
> max hw sectors to 8191 sectors solved the issue.

The machine is: Gigabyte Z390 UD, BIOS AMI F8

The disk is, according to smartctl:
	Model Family:     Seagate Archive HDD (SMR)
	Device Model:     ST8000AS0002-1NA17Z
	Firmware Version: AR13
	User Capacity:    8,001,563,222,016 bytes [8.00 TB]
	Sector Sizes:     512 bytes logical, 4096 bytes physical
There was no hw change during this period.

Here is what I think:

This disk did not exhibit the problem for the last 1.5 years when it was in constant use.
	[before that, since Jan/2016, it was used every few months as a backup disk]
Then, last month it started to show the problem.

Being an early SMR disk, is it possible that it reached a state where all block updates require a track read/write
(no more unused tracks), so that at high bandwidth it gets into trouble? It did not matter how high I set the timeout
(I tested up to 240): it always timed out if any pause was encountered.

Being a rather old disk (a 2014 model?)
	Maybe a fw bug?
	Maybe an SMR design misfeature?

Do you want me to try different max_sectors_kb values to see where it breaks?

Regards,
	Eyal

>>
>> Setting
>> 	# echo 180 >/sys/block/sda/device/timeout
>> did not help, only made the pauses longer before the reset.
>>
>> Thanks everyone.
>> 	Eyal
>>
>> On 3/11/25 15:13, Eyal Lebedinsky wrote:
>>> I have a sata disk that is probably on its last legs.
>>> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMA disk.
>>> It sees very little activity.
>>>
>>> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>>>
>>> For the last few weeks it started to log timeout errors (not always) like this:
>>>
>>>     kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
>>>     kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>>     kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
>>>                      res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>     kernel: ata2.00: status: { DRDY }
>>>     kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>>     kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
>>>                      res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>>>     kernel: ata2.00: status: { DRDY }
>>>     kernel: ata2: hard resetting link
>>>     kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>     kernel: ata2.00: configured for UDMA/133
>>>     kernel: ata2: EH complete
>>
>> [trimmed]
>>
> 
> 


-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-17 11:56     ` Eyal Lebedinsky
@ 2025-12-17 12:02       ` Niklas Cassel
  2025-12-20  4:03         ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-12-17 12:02 UTC (permalink / raw)
  To: eyal, Eyal Lebedinsky, list linux-ide; +Cc: Damien Le Moal

On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>On 17/12/25 12:35, Damien Le Moal wrote:
>> On 12/17/25 08:39, Eyal Lebedinsky wrote:
>>> Resolved.
>>> 
>>> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
>>> 	# echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>> did the trick. No pauses/resets anymore for over a month.
>> 
>> We now have patches queued up to limit max_sectors_kb for devices and
>> controllers behaving badly. If you send us your device information (hdparm -I)
>> and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.
>> 
>> Though we would need to determine if is is the device or the adapter that is
>> mis-behaving, and also ideally, the command size at which things break.
>> We had another case with a device breaking above 4MiB commands. A quirk setting
>> max hw sectors to 8191 sectors solved the issue.
>
>The machine is: Gigabyte Z390 UD, BIOS AMI F8
>
>The disk is, according to smartctl:
>	Model Family:     Seagate Archive HDD (SMR)
>	Device Model:     ST8000AS0002-1NA17Z
>	Firmware Version: AR13
>	User Capacity:    8,001,563,222,016 bytes [8.00 TB]
>	Sector Sizes:     512 bytes logical, 4096 bytes physical
>There was no hw change during this period.
>
>Here is what I think:
>
>This disk did not exhibit the problem for the last 1.5 years when it was in constant use.
>	[before that, since Jan/2016, it was used every few months as a backup disk]
>Then, last month it started to show the problem.
>
>Being an early SMR disk, is it possible that it reached a state where all block updates require a track read/write
>(no more unused tracks) and at high bandwidth it gets into trouble. It did not matter how high I set the timeout
>(I tested up to 240) it always timed out if any pause was encountered.
>
>Being a rather old disk (a 2014 model?)
>	Maybe a fw bug?
>	Maybe an SMR design misfeature?
>
>Do you want me to try different max_sectors_kb values to see where it breaks?
>

You can also try this:

https://github.com/floatious/max-sectors-quirk

It tries these max_sectors_kb values:
declare -a sizes=(128 1024 2048 3072 4095 4096)

You can simply modify the script if you want to try more intermediate sizes.


Kind regards,
Niklas


* Re: ata timeout exceptions
  2025-12-17 12:02       ` Niklas Cassel
@ 2025-12-20  4:03         ` Eyal Lebedinsky
  2025-12-21  8:34           ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-20  4:03 UTC (permalink / raw)
  To: list linux-ide; +Cc: Damien Le Moal, Niklas Cassel

On 17/12/25 23:02, Niklas Cassel wrote:
> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:

[trimmed]

>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>
> 
> You can also try this:
> 
> https://github.com/floatious/max-sectors-quirk
> 
> It tries these max_sector_kb values:
> declare -a sizes=(128 1024 2048 3072 4095 4096)
> 
> You can simply modify the script if you want to try more intermediate sizes.

After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
See comments after the test report.

------------------ test start
$ sudo sh ./find-max-sectors.sh /dev/sda
Drive model:
ST8000AS0002-1NA17Z

Drive firmware:
AR13

SATA / AHCI controller:
00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10)

Drive values before running the test:
/sys/block/sda/queue/max_hw_sectors_kb:32767
/sys/block/sda/queue/max_sectors_kb:4096
/sys/block/sda/queue/read_ahead_kb:8192

Running test with max_sectors 128 KiB
Test: PASS

Running test with max_sectors 1024 KiB
Test: PASS

Running test with max_sectors 2048 KiB
Test: PASS

Running test with max_sectors 3072 KiB
Test: PASS

Running test with max_sectors 4095 KiB
Test: PASS

Running test with max_sectors 4096 KiB
Test: PASS
------------------ test end

For the last few days I tested my usual workload with 3072, 4095 and 4096.

Up to 3072 all runs were clean, no pauses.

While I mostly failed to produce a reset, the two larger sizes (4095 and 4096) often triggered pauses, about 30s each,
after which the job continued to completion. Only with 4096 did I see a long pause triggering a reset.

Note: What I saw so far (many times) is that if a pause is longer than 30s it never recovers until a reset.

Important: my workload is writing to this disk, I never saw any problems reading from it.

My guess is that the disk has confidence in accepting large write blocks but it fails to live up to the promise.
Writing is more involved on SMR and maybe there is an issue specific to large writes.

Looking at the smart stats, the disk had only 180TB lifetime writes, and it is specced for 55TB/year (or 180TB/y with v2 and v3).
So it is far from full.

I will now leave it running for a few days with max_sectors_kb=4095 to see if it also triggers a reset.

Regards,
	Eyal

----- example of my workload, no pause. max_sectors_kb=4096, timeout=120
18:06:52 2025-12-19
18:07:02 Device   wareq-sz      w/s     kB_w/s       kB_w
18:07:02 sda          0.00     0.00       0.00       0.00
18:07:12 sda          0.00     0.00       0.00       0.00
18:07:22 sda          0.00     0.00       0.00       0.00
18:07:32 sda          0.00     0.00       0.00       0.00
18:07:42 sda          0.00     0.00       0.00       0.00
18:07:52 sda          0.00     0.00       0.00       0.00
18:08:02 sda       1027.69    14.00   14387.66  143876.60
18:08:12 sda        912.34   105.60   96343.10  963431.04
18:08:22 sda       1303.94   106.50  138869.61 1388696.10
18:08:32 sda       1844.63    72.60  133920.14 1339201.38
18:08:42 sda       2083.31    66.30  138123.45 1381234.53
18:08:52 sda       1339.79    71.30   95527.03  955270.27
18:09:02 sda       2427.31    47.70  115782.69 1157826.87
18:09:12 sda       1817.04    69.80  126829.39 1268293.92
18:09:22 sda       1805.27    54.70   98748.27  987482.69
18:09:32 sda       1186.68    90.10  106919.87 1069198.68
18:09:42 sda       1041.10   114.10  118789.51 1187895.10
18:09:52 sda        972.94   141.80  137962.89 1379628.92
18:10:02 sda       1086.14    90.70   98512.90  985128.98
18:10:12 sda       1360.86    74.50  101384.07 1013840.70
18:10:22 sda       1354.29    87.60  118635.80 1186358.04
18:10:32 sda       1712.45    63.00  107884.35 1078843.50
18:10:42 sda       1529.06    74.20  113456.25 1134562.52
18:10:52 sda       1681.66    65.80  110653.23 1106532.28
18:11:02 sda       1589.97    73.80  117339.79 1173397.86
18:11:12 sda       1623.92    35.40   57486.77  574867.68
18:11:22 sda          0.00     0.00       0.00       0.00
18:11:32 sda          0.00     0.00       0.00       0.00
18:11:42 sda          0.00     0.00       0.00       0.00
18:11:52 sda          0.00     0.00       0.00       0.00
-----

----- example of my workload, showing two pauses followed by a 3rd long pause+reset
22:06:53 2025-12-19
22:07:03 Device   wareq-sz      w/s     kB_w/s       kB_w
22:07:03 sda          0.00     0.00       0.00       0.00
22:07:13 sda          0.00     0.00       0.00       0.00
22:07:23 sda          0.00     0.00       0.00       0.00
22:07:33 sda          0.00     0.00       0.00       0.00
22:07:43 sda          0.00     0.00       0.00       0.00
22:07:53 sda          0.00     0.00       0.00       0.00
22:08:03 sda        584.10    27.70   16179.57  161795.70
22:08:13 sda        865.20   137.40  118878.48 1188784.80
22:08:23 sda       1035.57    73.50   76114.39  761143.95
22:08:33 sda        928.37   133.00  123473.21 1234732.10
22:08:43 sda        740.79   153.30  113563.11 1135631.07
22:08:53 sda        780.15   103.90   81057.59  810575.85
22:09:03 sda       1061.52   107.60  114219.55 1142195.52
22:09:13 sda        674.01   127.40   85868.87  858688.74
22:09:23 sda        959.01   137.80  132151.58 1321515.78
22:09:33 sda       1427.83    81.90  116939.28 1169392.77
22:09:43 sda       1210.29    90.50  109531.24 1095312.45
22:09:53 sda        864.15    32.90   28430.53  284305.35
22:10:03 sda          0.00     0.00       0.00       0.00
22:10:13 sda          0.00     0.00       0.00       0.00
22:10:23 sda       1787.96    11.00   19667.56  196675.60
22:10:33 sda          0.00     0.00       0.00       0.00
22:10:43 sda          0.00     0.00       0.00       0.00
22:10:53 sda          0.00     0.00       0.00       0.00
22:11:03 sda       1569.92    33.40   52435.33  524353.28
22:11:13 sda       1262.29    69.50   87729.15  877291.55
22:11:23 sda          0.00     0.00       0.00       0.00
22:11:33 sda          0.00     0.00       0.00       0.00
22:11:43 sda          0.00     0.00       0.00       0.00
22:11:53 sda          0.00     0.00       0.00       0.00
22:11:53 2025-12-19
22:12:03 Device   wareq-sz      w/s     kB_w/s       kB_w
22:12:03 sda          0.00     0.00       0.00       0.00
22:12:13 sda          0.00     0.00       0.00       0.00
22:12:23 sda          0.00     0.00       0.00       0.00
22:12:33 sda          0.00     0.00       0.00       0.00
22:12:43 sda          0.00     0.00       0.00       0.00
22:12:53 sda          0.00     0.00       0.00       0.00
22:13:03 sda          0.00     0.00       0.00       0.00
22:13:13 sda          0.00     0.00       0.00       0.00
22:13:23 sda        972.28   122.40  119007.07 1190070.72
22:13:33 sda       1303.25    92.00  119899.00 1198990.00
22:13:43 sda       2139.47    51.80  110824.55 1108245.46
22:13:53 sda       1240.71    99.60  123574.72 1235747.16
22:14:03 sda       1516.56    74.00  112225.44 1122254.40
22:14:13 sda       1832.90    56.60  103742.14 1037421.40
22:14:23 sda       1112.75   108.90  121178.48 1211784.75
22:14:33 sda        573.11    93.50   53585.79  535857.85
22:14:43 sda          0.00     0.00       0.00       0.00
22:14:53 sda          0.00     0.00       0.00       0.00
-- the system log shows the reset:
2025-12-19T22:13:12+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2025-12-19T22:13:12+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
2025-12-19T22:13:12+11:00 kernel: ata2.00: cmd 35/00:00:00:08:34/00:20:ba:01:00/e0 tag 15 dma 4194304 out
                                            res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
2025-12-19T22:13:12+11:00 kernel: ata2.00: status: { DRDY }
2025-12-19T22:13:12+11:00 kernel: ata2: hard resetting link
2025-12-19T22:13:13+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2025-12-19T22:13:13+11:00 kernel: ata2.00: configured for UDMA/133
2025-12-19T22:13:13+11:00 kernel: ata2: EH complete
-----

> Kind regards,
> Niklas


-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-20  4:03         ` Eyal Lebedinsky
@ 2025-12-21  8:34           ` Damien Le Moal
  2025-12-21 12:12             ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-21  8:34 UTC (permalink / raw)
  To: eyal, list linux-ide; +Cc: Niklas Cassel

On 12/20/25 13:03, Eyal Lebedinsky wrote:
> On 17/12/25 23:02, Niklas Cassel wrote:
>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
> 
> [trimmed]
> 
>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>
>>
>> You can also try this:
>>
>> https://github.com/floatious/max-sectors-quirk
>>
>> It tries these max_sector_kb values:
>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>
>> You can simply modify the script if you want to try more intermediate sizes.
> 
> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
> See comments after the test report.

From your test results, it seems that the drive actually correctly handles very
large write commands, but because it is a drive-managed SMR disk, such commands
can take a very long time to process (due to the drive needing internal garbage
collection first), which triggers a timeout and a reset as the ata subsystem
assumes that the drive has stopped responding.

Limiting write commands to smaller sizes seems to mostly avoid this issue, even
though I do not think that gives any guarantees that the same issue will not
happen for small writes too.

So my suggestion is that you run with something like "libata.force=[<port
ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
but that would be more of a big hammer solution.

-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-12-21  8:34           ` Damien Le Moal
@ 2025-12-21 12:12             ` Eyal Lebedinsky
  2025-12-21 22:43               ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-21 12:12 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal

On 21/12/25 19:34, Damien Le Moal wrote:
> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>> On 17/12/25 23:02, Niklas Cassel wrote:
>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>
>> [trimmed]
>>
>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>
>>>
>>> You can also try this:
>>>
>>> https://github.com/floatious/max-sectors-quirk
>>>
>>> It tries these max_sector_kb values:
>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>
>>> You can simply modify the script if you want to try more intermediate sizes.
>>
>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>> See comments after the test report.
> 
> From your test results, it seems that the drive actually correctly handles very
> large write commands, but because it is a drive-managed SMR disk, such commands
> can take a very long time to process (due to the drive needing internal garbage
> collection first), which triggers a timeout and a reset as the ata subsystem
> assumes that the drive has stopped responding.
> 
> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
> though I do not think that gives any guarantees that the same issue will not
> happen for small writes too.
> 
> So my suggestion is that you run with something like "libata.force=[<port
> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
> but that would be a really more of a big hammer solution.

I mostly agree. However, extending the timeout did not help in the past.
I found that even setting it to 240 was not enough.
If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
The pause is either up to 30s or unlimited (until timeout is reached and a reset).

ATM I have the setting of timeout=120 in rc.local.

I also have in my boot cmd:
	libata.force=2.00:noncq
This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
despite many timeout resets since.
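
As an aside on the raw value above: the (8,8,26) breakdown is consistent with reading the 48-bit raw value as three 16-bit fields. A minimal sketch of that decoding in shell arithmetic follows. The function name is made up for illustration, and what each field counts is vendor-specific and not established in this thread.

```sh
# Split a 48-bit SMART raw value into three 16-bit fields (high,middle,low).
# decode_raw48 is a hypothetical helper name, not part of smartctl.
decode_raw48() {
    raw=$1
    echo "$(( (raw >> 32) & 0xffff )),$(( (raw >> 16) & 0xffff )),$(( raw & 0xffff ))"
}

decode_raw48 34360262682    # prints 8,8,26
```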

I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.

BTW, for the last few days I have been running with max_sectors_kb=3584 and saw no pauses at all.
With 4096 and 4095 I did get a long pause/reset.
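
For anyone wanting to reproduce these settings at boot (e.g. from rc.local), here is a rough sketch that writes both sysfs values in one place. The helper name and the optional sysfs-root argument (handy for a dry run against a scratch directory) are assumptions for illustration; the two sysfs paths are the ones used elsewhere in this thread.

```sh
# Write a max_sectors_kb limit and a command timeout for one block device.
# apply_disk_limits is a hypothetical helper; must run as root on a real system.
#   $1 = device name (e.g. sda)   $2 = max_sectors_kb   $3 = timeout in seconds
#   $4 = sysfs root, defaults to /sys (override only for dry runs)
apply_disk_limits() {
    dev=$1
    max_kb=$2
    timeout_s=$3
    sysroot=${4:-/sys}
    echo "$max_kb"    > "$sysroot/block/$dev/queue/max_sectors_kb" &&
    echo "$timeout_s" > "$sysroot/block/$dev/device/timeout"
}

# Example (not executed here): apply_disk_limits sda 3584 120
```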

Regards,
	Eyal

(*) smart shows 180TB written so far in 1.6y on. I see a workload limit of 180TB/y listed but
I also saw an older spec that said 55TB/y. I do not know if I have a v1, v2 or v3 of this model.

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-21 12:12             ` Eyal Lebedinsky
@ 2025-12-21 22:43               ` Eyal Lebedinsky
  2025-12-21 23:14                 ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-21 22:43 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal

On 21/12/25 23:12, Eyal Lebedinsky wrote:
> On 21/12/25 19:34, Damien Le Moal wrote:
>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>
>>> [trimmed]
>>>
>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>
>>>>
>>>> You can also try this:
>>>>
>>>> https://github.com/floatious/max-sectors-quirk
>>>>
>>>> It tries these max_sector_kb values:
>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>
>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>
>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>> See comments after the test report.
>>
>> From your test results, it seems that the drive actually correctly handles very
>> large write commands, but because it is a drive-managed SMR disk, such commands
>> can take a very long time to process (due to the drive needing internal garbage
>> collection first), which triggers a timeout and a reset as the ata subsystem
>> assumes that the drive has stopped responding.
>>
>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>> though I do not think that gives any guarantees that the same issue will not
>> happen for small writes too.
>>
>> So my suggestion is that you run with something like "libata.force=[<port
>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>> but that would be a really more of a big hammer solution.
> 
> I mostly agree. However, extending the timeout did not help in the past.
> I found that even setting it to 240 was not enough.
> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
> 
> ATM I have the setting of timeout=120 in rc.local.
> 
> I also have in my boot cmd:
>      libata.force=2.00:noncq
> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
> despite many timeout resets since.
> 
> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.

I have just run the test; it failed quickly, see below.
It shows a 17-minute pause followed by a reset, then a resumption and completion of the test.

The important question is: do we need a quirk?
	Is this an inherent problem with this model? If so then a quirk is justified.
	Is this an inherent problem after this model had many writes? Again, a quirk is justified.
	Is this a fault with my specific disk? No quirk, I will keep my safe settings.

Let me know if further testing, or more information, is required.

Regards,
	Eyal

08:45:33 sudo sh -c 'echo 1024 >/sys/block/sda/device/timeout'
08:45:45 sudo sh -c 'echo 4096 > /sys/block/sda/queue/max_sectors_kb'
08:51:00 sudo /usr/local/bin/sync-tellerstats.sh		# start a test
         Quickly got a few shorter timeouts then an indefinite one:
08:51:10 2025-12-22
08:51:10 Device   wareq-sz      w/s     kB_w/s       kB_w
08:51:00 sda          0.00     0.00       0.00       0.00
08:51:10 sda       2187.47    59.90  131029.45 1310294.53	start of test
08:51:20 sda       2041.32    75.90  154936.19 1549361.88
08:51:30 sda       1745.31    87.50  152714.62 1527146.25
08:51:40 sda       1601.22    92.50  148112.85 1481128.50
08:51:50 sda       1444.75    97.40  140718.65 1407186.50
08:52:00 sda       2097.17    37.90   79482.74  794827.43
08:52:10 sda          0.00     0.00       0.00       0.00	short pause
08:52:20 sda          0.00     0.00       0.00       0.00
08:52:30 sda        622.74     7.30    4546.00   45460.02
08:52:40 sda          0.00     0.00       0.00       0.00	short pause
08:52:50 sda          0.00     0.00       0.00       0.00
08:53:00 sda          0.00     0.00       0.00       0.00
08:53:10 sda       2083.45    35.00   72920.75  729207.50
08:53:20 sda       1285.02   107.00  137497.14 1374971.40
08:53:30 sda       1882.78    77.30  145538.89 1455388.94
08:53:40 sda       1839.43    14.70   27039.62  270396.21
08:53:50 sda          0.00     0.00       0.00       0.00	pause
08:54:00 sda          0.00     0.00       0.00       0.00
08:54:10 sda          0.00     0.00       0.00       0.00
08:54:20 sda          0.00     0.00       0.00       0.00	longer than 30s
08:54:30 sda          0.00     0.00       0.00       0.00
...
09:10:50 sda          0.00     0.00       0.00       0.00	17 minutes later...
09:11:00 sda       1648.50    52.40   86381.40  863814.00
09:11:10 sda       1240.31   111.80  138666.66 1386666.58
09:11:20 sda       2172.02    64.50  140095.29 1400952.90
09:11:30 sda       2592.31    47.90  124171.65 1241716.49
09:11:40 sda       2143.92    57.60  123489.79 1234897.92
09:11:50 sda       1604.22    89.40  143417.27 1434172.68
09:12:00 sda       1550.97    88.60  137415.94 1374159.42
09:12:00 2025-12-22
09:12:10 Device   wareq-sz      w/s     kB_w/s       kB_w
09:12:10 sda       1215.60    47.50   57741.00  577410.00	end of test
09:12:20 sda          0.00     0.00       0.00       0.00
09:12:30 sda          0.00     0.00       0.00       0.00
-- system log:
2025-12-22T09:10:52+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2025-12-22T09:10:52+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
2025-12-22T09:10:52+11:00 kernel: ata2.00: cmd 35/00:00:00:c8:d0/00:20:af:00:00/e0 tag 1 dma 4194304 out
                                            res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2025-12-22T09:10:52+11:00 kernel: ata2.00: status: { DRDY }
2025-12-22T09:10:52+11:00 kernel: ata2: hard resetting link
2025-12-22T09:10:53+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2025-12-22T09:10:53+11:00 kernel: ata2.00: configured for UDMA/133
2025-12-22T09:10:53+11:00 kernel: ata2: EH complete
-- end of test:
09:13:37 sudo sh -c 'echo 3584 > /sys/block/sda/queue/max_sectors_kb'
09:13:56 sudo sh -c 'echo 120 >/sys/block/sda/device/timeout'

smart does NOT show increased '188 Command_Timeout'.
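
The failed-command line in the system log above can be cross-checked against the reported transfer size. A rough sketch, assuming the usual libata layout of the cmd field (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device), 512-byte logical sectors, and a non-NCQ command such as WRITE DMA EXT (0x35), where the sector count lives in nsect/hob_nsect. The name decode_ata_cmd is invented for this example.

```sh
# Decode sector count, byte count and starting LBA from a libata "cmd" field.
# Assumes a non-NCQ 48-bit command and 512-byte logical sectors.
decode_ata_cmd() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1          # split the field on '/' and ':' into 12 hex bytes
    IFS=$old_ifs
    # $3=nsect $4=lbal $5=lbam $6=lbah $8=hob_nsect $9=hob_lbal
    # ${10}=hob_lbam ${11}=hob_lbah
    sectors=$(( 0x$8$3 ))
    lba=$(( 0x${11}${10}$9$6$5$4 ))
    echo "sectors=$sectors bytes=$(( sectors * 512 )) lba=$lba"
}

decode_ata_cmd 35/00:00:00:c8:d0/00:20:af:00:00/e0
# prints sectors=8192 bytes=4194304 lba=2949695488
```

The byte count agrees with the "dma 4194304 out" part of the same log line, i.e. the command that timed out was a full 4 MiB write.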

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-21 22:43               ` Eyal Lebedinsky
@ 2025-12-21 23:14                 ` Damien Le Moal
  2025-12-22  2:10                   ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-21 23:14 UTC (permalink / raw)
  To: eyal, list linux-ide; +Cc: Niklas Cassel

On 12/22/25 07:43, Eyal Lebedinsky wrote:
> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>> On 21/12/25 19:34, Damien Le Moal wrote:
>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>
>>>> [trimmed]
>>>>
>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>
>>>>>
>>>>> You can also try this:
>>>>>
>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>
>>>>> It tries these max_sector_kb values:
>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>
>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>
>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>> See comments after the test report.
>>>
>>> From your test results, it seems that the drive actually correctly handles very
>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>> can take a very long time to process (due to the drive needing internal garbage
>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>> assumes that the drive has stopped responding.
>>>
>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>> though I do not think that gives any guarantees that the same issue will not
>>> happen for small writes too.
>>>
>>> So my suggestion is that you run with something like "libata.force=[<port
>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>> but that would be a really more of a big hammer solution.
>>
>> I mostly agree. However, extending the timeout did not help in the past.
>> I found that even setting it to 240 was not enough.
>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>
>> ATM I have the setting of timeout=120 in rc.local.
>>
>> I also have in my boot cmd:
>>      libata.force=2.00:noncq
>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>> despite many timeout resets since.
>>
>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
> 
> Have just ran the test, failed quickly, see below.
> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
> 
> The important question is: do we need a quirk?
> 	Is this an inherent problem with this model? If so then a quirk is justified.
> 	Is this an inherent problem after this model had many writes? Again, a quirk is justified.
> 	Is this a fault with my specific disk? No quirk, I will keep my safe settings.
> 
> Let me know if further testing, or more information, is required.
> 
> Regards,
> 	Eyal
> 
> 08:45:33 sudo sh -c 'echo 1024 >/sys/block/sda/device/timeout'
> 08:45:45 sudo sh -c 'echo 4096 > /sys/block/sda/queue/max_sectors_kb'
> 08:51:00 sudo /usr/local/bin/sync-tellerstats.sh		# start a test
>          Quickly got a few shorter timeouts then an indefinite one:
> 08:51:10 2025-12-22
> 08:51:10 Device   wareq-sz      w/s     kB_w/s       kB_w
> 08:51:00 sda          0.00     0.00       0.00       0.00
> 08:51:10 sda       2187.47    59.90  131029.45 1310294.53	start of test
> 08:51:20 sda       2041.32    75.90  154936.19 1549361.88
> 08:51:30 sda       1745.31    87.50  152714.62 1527146.25
> 08:51:40 sda       1601.22    92.50  148112.85 1481128.50
> 08:51:50 sda       1444.75    97.40  140718.65 1407186.50
> 08:52:00 sda       2097.17    37.90   79482.74  794827.43
> 08:52:10 sda          0.00     0.00       0.00       0.00	short pause
> 08:52:20 sda          0.00     0.00       0.00       0.00
> 08:52:30 sda        622.74     7.30    4546.00   45460.02
> 08:52:40 sda          0.00     0.00       0.00       0.00	short pause
> 08:52:50 sda          0.00     0.00       0.00       0.00
> 08:53:00 sda          0.00     0.00       0.00       0.00
> 08:53:10 sda       2083.45    35.00   72920.75  729207.50
> 08:53:20 sda       1285.02   107.00  137497.14 1374971.40
> 08:53:30 sda       1882.78    77.30  145538.89 1455388.94
> 08:53:40 sda       1839.43    14.70   27039.62  270396.21
> 08:53:50 sda          0.00     0.00       0.00       0.00	pause
> 08:54:00 sda          0.00     0.00       0.00       0.00
> 08:54:10 sda          0.00     0.00       0.00       0.00
> 08:54:20 sda          0.00     0.00       0.00       0.00	longer than 30s
> 08:54:30 sda          0.00     0.00       0.00       0.00
> ...
> 09:10:50 sda          0.00     0.00       0.00       0.00	17 minutes later...
> 09:11:00 sda       1648.50    52.40   86381.40  863814.00
> 09:11:10 sda       1240.31   111.80  138666.66 1386666.58
> 09:11:20 sda       2172.02    64.50  140095.29 1400952.90
> 09:11:30 sda       2592.31    47.90  124171.65 1241716.49
> 09:11:40 sda       2143.92    57.60  123489.79 1234897.92
> 09:11:50 sda       1604.22    89.40  143417.27 1434172.68
> 09:12:00 sda       1550.97    88.60  137415.94 1374159.42
> 09:12:00 2025-12-22
> 09:12:10 Device   wareq-sz      w/s     kB_w/s       kB_w
> 09:12:10 sda       1215.60    47.50   57741.00  577410.00	end of test
> 09:12:20 sda          0.00     0.00       0.00       0.00
> 09:12:30 sda          0.00     0.00       0.00       0.00
> -- system log:
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: cmd 35/00:00:00:c8:d0/00:20:af:00:00/e0 tag 1 dma 4194304 out
>                                             res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: status: { DRDY }
> 2025-12-22T09:10:52+11:00 kernel: ata2: hard resetting link
> 2025-12-22T09:10:53+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> 2025-12-22T09:10:53+11:00 kernel: ata2.00: configured for UDMA/133
> 2025-12-22T09:10:53+11:00 kernel: ata2: EH complete
> -- end of test:
> 09:13:37 sudo sh -c 'echo 3584 > /sys/block/sda/queue/max_sectors_kb'
> 09:13:56 sudo sh -c 'echo 120 >/sys/block/sda/device/timeout'
> 
> smart does NOT show increased '188 Command_Timeout'.

Probably because the drive is stuck doing something. So it seems best to quirk
this drive. Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
commands max) ? I suspect this should work. Otherwise, if NCQ also causes
issues, we can quirk both NCQ and max sectors for this drive.
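
Two units are in play here: the sysfs max_sectors_kb attribute is in KiB, while a max sectors limit counts sectors. A tiny conversion sketch, assuming 512-byte sectors (the helper name is hypothetical):

```sh
# Convert a max_sectors_kb value (KiB) to a count of 512-byte sectors.
kb_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

kb_to_sectors 1024    # prints 2048
kb_to_sectors 4096    # prints 8192
```

So a max sectors of 2048 is the same limit as max_sectors_kb=1024.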


-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-12-21 23:14                 ` Damien Le Moal
@ 2025-12-22  2:10                   ` Eyal Lebedinsky
  2025-12-22  3:43                     ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-22  2:10 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal

On 22/12/25 10:14, Damien Le Moal wrote:
> On 12/22/25 07:43, Eyal Lebedinsky wrote:
>> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>>> On 21/12/25 19:34, Damien Le Moal wrote:
>>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>>
>>>>> [trimmed]
>>>>>
>>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>>
>>>>>>
>>>>>> You can also try this:
>>>>>>
>>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>>
>>>>>> It tries these max_sector_kb values:
>>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>>
>>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>>
>>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>>> See comments after the test report.
>>>>
>>>> From your test results, it seems that the drive actually correctly handles very
>>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>>> can take a very long time to process (due to the drive needing internal garbage
>>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>>> assumes that the drive has stopped responding.
>>>>
>>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>>> though I do not think that gives any guarantees that the same issue will not
>>>> happen for small writes too.
>>>>
>>>> So my suggestion is that you run with something like "libata.force=[<port
>>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>>> but that would be a really more of a big hammer solution.
>>>
>>> I mostly agree. However, extending the timeout did not help in the past.
>>> I found that even setting it to 240 was not enough.
>>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>>
>>> ATM I have the setting of timeout=120 in rc.local.
>>>
>>> I also have in my boot cmd:
>>>       libata.force=2.00:noncq
>>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>>> despite many timeout resets since.
>>>
>>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
>>
>> Have just ran the test, failed quickly, see below.
>> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
>>
>> The important question is: do we need a quirk?
>> 	Is this an inherent problem with this model? If so then a quirk is justified.
>> 	Is this an inherent problem after this model had many writes? Again, a quirk is justified.
>> 	Is this a fault with my specific disk? No quirk, I will keep my safe settings.
>>
>> Let me know if further testing, or more information, is required.
>>
>> Regards,
>> 	Eyal

[trimmed]

> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
> issues, we can quirk both NCQ and max sectors for this drive.

To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
	"this should work" meaning "no trouble" or "will reproduce the problem"?

With ncq enabled, I had the same number of timeouts, with the difference being that the disk
logged many errors (one for each tag) and also registered a Command_Timeout, which it did not otherwise.

Question: will setting
	$ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
be the same as booting without
	libata.force=2.00:noncq
or do I actually need to reboot?

Thanks
	Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-22  2:10                   ` Eyal Lebedinsky
@ 2025-12-22  3:43                     ` Damien Le Moal
  2025-12-22  5:57                       ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-22  3:43 UTC (permalink / raw)
  To: eyal, list linux-ide; +Cc: Niklas Cassel

On 12/22/25 11:10, Eyal Lebedinsky wrote:
> On 22/12/25 10:14, Damien Le Moal wrote:
>> On 12/22/25 07:43, Eyal Lebedinsky wrote:
>>> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>>>> On 21/12/25 19:34, Damien Le Moal wrote:
>>>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>>>
>>>>>> [trimmed]
>>>>>>
>>>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>>>
>>>>>>>
>>>>>>> You can also try this:
>>>>>>>
>>>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>>>
>>>>>>> It tries these max_sector_kb values:
>>>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>>>
>>>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>>>
>>>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>>>> See comments after the test report.
>>>>>
>>>>> From your test results, it seems that the drive actually correctly handles very
>>>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>>>> can take a very long time to process (due to the drive needing internal garbage
>>>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>>>> assumes that the drive has stopped responding.
>>>>>
>>>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>>>> though I do not think that gives any guarantees that the same issue will not
>>>>> happen for small writes too.
>>>>>
>>>>> So my suggestion is that you run with something like "libata.force=[<port
>>>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>>>> but that would be a really more of a big hammer solution.
>>>>
>>>> I mostly agree. However, extending the timeout did not help in the past.
>>>> I found that even setting it to 240 was not enough.
>>>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>>>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>>>
>>>> ATM I have the setting of timeout=120 in rc.local.
>>>>
>>>> I also have in my boot cmd:
>>>>       libata.force=2.00:noncq
>>>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>>>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>>>> despite many timeout resets since.
>>>>
>>>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
>>>
>>> Have just ran the test, failed quickly, see below.
>>> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
>>>
>>> The important question is: do we need a quirk?
>>> 	Is this an inherent problem with this model? If so then a quirk is justified.
>>> 	Is this an inherent problem after this model had many writes? Again, a quirk is justified.
>>> 	Is this a fault with my specific disk? No quirk, I will keep my safe settings.
>>>
>>> Let me know if further testing, or more information, is required.
>>>
>>> Regards,
>>> 	Eyal
> 
> [trimmed]
> 
>> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
>> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
>> issues, we can quirk both NCQ and max sectors for this drive.
> 
> To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
> 	"this should work" meaning "no trouble" or "will reproduce the problem"?

Yes, I meant to say that you should not see any timeout/long pause with NCQ for
a max command size of 1MiB. The reason I say that is that most drive-managed SMR
implementations are based on some form of logging of random writes. Logging small
writes is fast and relatively easy to handle (even if the log is full, a small
portion of it can be recovered with just a few IOs). But if the writes are
large, things can get ugly as freeing up enough space in that log can be very
costly. That of course all depends on the vendor/model implementation of the
device-managed SMR FW...

> With ncq enabled, I had the same numbers of timeouts, with the difference being that the disk
> logged many errors (one for each tag) and also registered a Command_Timeout which it did not otherwise.

My point was to try NCQ combined with a small max sectors quirk to see if that
works well or not.

> 
> Question: will setting
> 	$ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
> be the same as booting without
> 	libata.force=2.00:noncq
> or do I actually need to reboot?

You will need to reboot without libata.force=noncq. Otherwise,
ata_dev_config_ncq() will always do nothing for the drive and NCQ will not be
seen as supported.
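
To see whether the running kernel was booted with NCQ forced off, the kernel command line can be inspected. The following is only a sketch: the function name cmdline_forces_noncq is hypothetical, and it only recognises the plain "noncq" force entry (optionally with a port.device prefix such as 2.00:), not related entries such as noncqtrim.

```shell
# Succeed if the given kernel command line has a libata.force= option
# containing a "noncq" entry, with or without an ID prefix like "2.00:".
cmdline_forces_noncq() {
    for arg in $1; do
        case "$arg" in
            libata.force=*)
                # libata.force takes a comma-separated list; check each entry.
                old_ifs=$IFS
                IFS=,
                for entry in ${arg#libata.force=}; do
                    case "$entry" in
                        noncq|*:noncq) IFS=$old_ifs; return 0 ;;
                    esac
                done
                IFS=$old_ifs
                ;;
        esac
    done
    return 1
}

# Possible use on a live system:
#   if cmdline_forces_noncq "$(cat /proc/cmdline)"; then
#       echo "NCQ is forced off; reboot without the noncq entry"
#   fi
```

If the check succeeds, writing to queue_depth will not bring NCQ back and a reboot with the option removed is needed, as described above.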

Once you confirm if we really need to maintain NCQ off or not with a small max
sectors limit, we can write a proper quirk for this drive.


-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2025-12-22  3:43                     ` Damien Le Moal
@ 2025-12-22  5:57                       ` Eyal Lebedinsky
  2025-12-30 22:43                         ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-22  5:57 UTC (permalink / raw)
  To: Damien Le Moal, list linux-ide; +Cc: Niklas Cassel

On 22/12/25 14:43, Damien Le Moal wrote:
> On 12/22/25 11:10, Eyal Lebedinsky wrote:
>> On 22/12/25 10:14, Damien Le Moal wrote:
>>> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
>>> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
>>> issues, we can quirk both NCQ and max sectors for this drive.
>>
>> To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
>> 	"this should work" meaning "no trouble" or "will reproduce the problem"?

This is how it is now set.

> Yes, I meant to say that you should not see any timeout/long pause with NCQ for
> a max command size of 1 MiB. The reason I say that is that most drive-managed SMR
> implementations are based on some form of logging of random writes. Logging small
> writes is fast and relatively easy to handle (even if the log is full, a small
> portion of it can be recovered with just a few IOs). But if the writes are
> large, things can get ugly, as freeing up enough space in that log can be very
> costly. That of course all depends on the vendor/model implementation of the
> device-managed SMR FW...
> 
>> With ncq enabled, I had the same numbers of timeouts, with the difference being that the disk
>> logged many errors (one for each tag) and also registered a Command_Timeout which it did not otherwise.
> 
> My point was to try NCQ combined with a small max sectors quirk to see if that
> works well or not.
> 
>>
>> Question: will setting
>> 	$ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
>> be the same as booting without
>> 	libata.force=2.00:noncq
>> or do I actually need to reboot?
> 
> You will need to reboot without libata.force=noncq. Otherwise,
> ata_dev_config_ncq() will always do nothing for the drive and NCQ will not be
> seen as supported.

I discovered that I need to reboot, which I did.

> Once you confirm if we really need to maintain NCQ off or not with a small max
> sectors limit, we can write a proper quirk for this drive.

I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.

The recent run with these settings was smooth:

16:07:56 sda          0.00     0.00       0.00       0.00
16:08:06 sda        921.57    55.70   51331.45  513314.49
16:08:16 sda        972.67   130.90  127322.50 1273225.03
16:08:26 sda        952.22   141.00  134263.02 1342630.20
16:08:36 sda        968.10   128.50  124400.85 1244008.50
16:08:46 sda        916.85   144.60  132576.51 1325765.10
16:08:56 sda        978.13   128.20  125396.27 1253962.66
16:09:06 sda        937.33   139.00  130288.87 1302888.70
16:09:16 sda        857.16   136.10  116659.48 1166594.76
16:09:26 sda        971.57   123.40  119891.74 1198917.38
16:09:36 sda        969.15   136.80  132579.72 1325797.20
16:09:46 sda        969.75   105.40  102211.65 1022116.50
16:09:56 sda        914.09   129.30  118191.84 1181918.37
16:10:06 sda        931.91   123.50  115090.88 1150908.85
16:10:16 sda        947.24   123.30  116794.69 1167946.92
16:10:26 sda        921.27   136.50  125753.35 1257533.55
16:10:36 sda        964.06   128.30  123688.90 1236888.98
16:10:46 sda        955.16   137.30  131143.47 1311434.68
16:10:56 sda        954.25   104.50   99719.12  997191.25
16:11:06 sda        663.10    29.20   19362.52  193625.20
16:11:16 sda          0.00     0.00       0.00       0.00
16:11:26 sda          0.00     0.00       0.00       0.00
16:11:26 2025-12-22
16:11:36 Device   wareq-sz      w/s     kB_w/s       kB_w

BTW, I did a quick try with "max_sectors_kb=4096" and a fast dd:
	$ dd if=/dev/zero of=/data2/tmp/zero.dd bs=4M count=$((21500/4)) status=progress
and it worked without an issue:

15:34:46 sda          0.00     0.00       0.00       0.00
15:34:56 sda       4096.00    15.10   61849.60  618496.00
15:35:06 sda       3959.49    26.90  106510.28 1065102.81
15:35:16 sda       4007.07    27.50  110194.43 1101944.25
15:35:26 sda       4020.03    26.80  107736.80 1077368.04
15:35:36 sda       4035.55    26.90  108556.29 1085562.95
15:35:46 sda       4032.27    25.60  103226.11 1032261.12
15:35:56 sda       4020.59    27.00  108555.93 1085559.30
15:36:06 sda       3974.66    26.90  106918.35 1069183.54
15:36:16 sda       4030.60    24.90  100361.94 1003619.40
15:36:26 sda       4004.95    26.90  107733.15 1077331.55
15:36:26 2025-12-22
15:36:36 Device   wareq-sz      w/s     kB_w/s       kB_w
15:36:36 sda       3991.85    27.40  109376.69 1093766.90
15:36:46 sda       4045.15    24.00   97083.60  970836.00
15:36:56 sda       4017.98    26.10  104869.28 1048692.78
15:37:06 sda       4017.81    26.10  104864.84 1048648.41
15:37:16 sda       4000.09    25.50  102002.29 1020022.95
15:37:26 sda       3990.31    27.00  107738.37 1077383.70
15:37:36 sda       4048.90    25.90  104866.51 1048665.10
15:37:46 sda       4003.07    26.30  105280.74 1052807.41
15:37:56 sda       4051.49    27.50  111415.97 1114159.75
15:38:06 sda       4033.31    26.00  104866.06 1048660.60
15:38:16 sda       4036.07    27.20  109781.10 1097811.04
15:38:26 sda       3340.74     5.40   18040.00  180399.96
15:38:36 sda          0.00     0.00       0.00       0.00
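
In both logs above the average write request size (wareq-sz) sits just under the max_sectors_kb in effect: around 950 kB with 1024 and around 4000 kB with 4096. A rough sketch for summarising such a log is shown here; it assumes the six-column layout above (time, device, wareq-sz, w/s, kB_w/s, kB_w), and the function name summarise_writes is invented for this example.

```shell
# Read timestamped iostat-style lines from stdin:
#   HH:MM:SS <device> wareq-sz w/s kB_w/s kB_w
# Header lines and idle samples (w/s == 0) are skipped.
# Prints the sample count, the mean request size and the total kB written.
summarise_writes() {
    awk -v dev="${1:-sda}" '
        $2 == dev && $4 > 0 {
            n++
            req += $3      # wareq-sz for this interval
            total += $6    # kB written in this interval
        }
        END {
            if (n)
                printf "samples=%d mean_wareq_kb=%.2f total_kb=%.2f\n", n, req / n, total
        }'
}

# Possible use, with the log saved to a file (file name is an assumption):
#   summarise_writes sda < iostat.log
```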

This is unlike my usual workload which writes about 250 files, some a few GB and some rather small.

Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-22  5:57                       ` Eyal Lebedinsky
@ 2025-12-30 22:43                         ` Eyal Lebedinsky
  2026-01-02  1:21                           ` Damien Le Moal
  0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-30 22:43 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal

On 22/12/25 16:57, Eyal Lebedinsky wrote:
> On 22/12/25 14:43, Damien Le Moal wrote:

[trimmed]

>> Once you confirm if we really need to maintain NCQ off or not with a small max
>> sectors limit, we can write a proper quirk for this drive.
> 
> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
It is now "a few days" later (9 days). All is well and not a single pause observed.
My job (rsync'ing 21GB into this disk every 2 hours) reports:
	max_sectors_kb=1024 timeout=120 queue_depth=32
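
A line like the one above can be produced by reading the three sysfs attributes directly. This is a sketch rather than the actual job script; report_disk_settings is a hypothetical name and /sys/block/sda is assumed to be the disk in question.

```shell
# Print the tunables of one disk in key=value form.
# $1 is the disk's sysfs directory, e.g. /sys/block/sda (assumed device name).
report_disk_settings() {
    printf 'max_sectors_kb=%s timeout=%s queue_depth=%s\n' \
        "$(cat "$1/queue/max_sectors_kb")" \
        "$(cat "$1/device/timeout")" \
        "$(cat "$1/device/queue_depth")"
}

# Possible use:
#   report_disk_settings /sys/block/sda
```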

I am keeping it with these parameters, but can try different values if it tells us anything.

Happy New Year Everyone,
	Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: ata timeout exceptions
  2025-12-30 22:43                         ` Eyal Lebedinsky
@ 2026-01-02  1:21                           ` Damien Le Moal
  2026-01-02  6:30                             ` Eyal Lebedinsky
  0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2026-01-02  1:21 UTC (permalink / raw)
  To: eyal, list linux-ide; +Cc: Niklas Cassel

On 12/31/25 07:43, Eyal Lebedinsky wrote:
> On 22/12/25 16:57, Eyal Lebedinsky wrote:
>> On 22/12/25 14:43, Damien Le Moal wrote:
> 
> [trimmed]
> 
>>> Once you confirm if we really need to maintain NCQ off or not with a small max
>>> sectors limit, we can write a proper quirk for this drive.
>>
>> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
> It is now "a few days" later (9 days). All is well and not a single pause observed.
> My job (rsync'ing 21GB into this disk every 2 hours) reports:
> 	max_sectors_kb=1024 timeout=120 queue_depth=32
> 
> I am keeping it with these parameters, but can try different values if it tells us anything.

If you are OK with keeping these as your defaults for this drive and setting
them through a udev rule or whatever else you prefer, then I think we are good.
We could make max_sectors_kb=1024 permanent as a quirk, but that would be a
little extreme since, in the end, we are dealing with a very, very slow drive
here, not really a buggy one.
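
One way to keep these defaults without a kernel quirk is a small helper run at boot or from a udev rule. The sketch below is untested on this system: apply_disk_settings is a hypothetical name, the default values are the ones reported in this thread, and EXAMPLE_MODEL in the commented rule is a placeholder, not the drive's real model string.

```shell
# Write the per-disk limits into sysfs.
# $1: the disk's sysfs directory, e.g. /sys/block/sda (assumed device name)
# $2: max_sectors_kb (defaults to 1024)
# $3: command timeout in seconds (defaults to 120)
apply_disk_settings() {
    echo "${2:-1024}" > "$1/queue/max_sectors_kb" &&
    echo "${3:-120}" > "$1/device/timeout"
}

# Possible use as root:
#   apply_disk_settings /sys/block/sda
#
# Roughly equivalent udev rule (sketch; EXAMPLE_MODEL is a placeholder):
#   ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd?", \
#     ATTRS{model}=="EXAMPLE_MODEL", \
#     ATTR{queue/max_sectors_kb}="1024", ATTR{device/timeout}="120"
```

queue_depth is left alone here, since with NCQ enabled at boot it was already 32 in the report above.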

> Happy New Year Everyone,

Thanks. Happy new year to you too !


-- 
Damien Le Moal
Western Digital Research


* Re: ata timeout exceptions
  2026-01-02  1:21                           ` Damien Le Moal
@ 2026-01-02  6:30                             ` Eyal Lebedinsky
  0 siblings, 0 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2026-01-02  6:30 UTC (permalink / raw)
  To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal

On 2/1/26 12:21, Damien Le Moal wrote:
> On 12/31/25 07:43, Eyal Lebedinsky wrote:
>> On 22/12/25 16:57, Eyal Lebedinsky wrote:
>>> On 22/12/25 14:43, Damien Le Moal wrote:
>>
>> [trimmed]
>>
>>>> Once you confirm if we really need to maintain NCQ off or not with a small max
>>>> sectors limit, we can write a proper quirk for this drive.
>>>
>>> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
>> It is now "a few days" later (9 days). All is well and not a single pause observed.
>> My job (rsync'ing 21GB into this disk every 2 hours) reports:
>> 	max_sectors_kb=1024 timeout=120 queue_depth=32
>>
>> I am keeping it with these parameters, but can try different values if it tells us anything.
> 
> If you are OK with keeping these as your defaults for this drive and setting
> them through a udev rule or whatever else you prefer, then I think we are good.
> We could make max_sectors_kb=1024 permanent as a quirk, but that would be a
> little extreme since, in the end, we are dealing with a very, very slow drive
> here, not really a buggy one.

I agree. I will keep these settings here.

Thanks again for your effort,
	Eyal

>> Happy New Year Everyone,
> 
> Thanks. Happy new year to you too !
-- 
Eyal at Home (eyal@eyal.emu.id.au)


end of thread, other threads:[~2026-01-02  6:31 UTC | newest]

Thread overview: 28+ messages
2025-11-03  4:13 ata timeout exceptions Eyal Lebedinsky
2025-11-09 20:40 ` Niklas Cassel
2025-11-09 22:41   ` Eyal Lebedinsky
2025-11-10 13:11     ` Niklas Cassel
2025-11-14  4:32 ` Eyal Lebedinsky
2025-11-18 15:17   ` Niklas Cassel
2025-11-18 23:05     ` Eyal Lebedinsky
2025-11-19  5:41       ` Damien Le Moal
2025-11-19 13:37         ` Eyal Lebedinsky
2025-11-20  3:34           ` Damien Le Moal
2025-11-20 11:38             ` Eyal Lebedinsky
2025-11-20 12:18               ` Damien Le Moal
2025-11-20 23:53                 ` Eyal Lebedinsky
2025-12-16 23:39 ` Eyal Lebedinsky
2025-12-17  1:35   ` Damien Le Moal
2025-12-17 11:56     ` Eyal Lebedinsky
2025-12-17 12:02       ` Niklas Cassel
2025-12-20  4:03         ` Eyal Lebedinsky
2025-12-21  8:34           ` Damien Le Moal
2025-12-21 12:12             ` Eyal Lebedinsky
2025-12-21 22:43               ` Eyal Lebedinsky
2025-12-21 23:14                 ` Damien Le Moal
2025-12-22  2:10                   ` Eyal Lebedinsky
2025-12-22  3:43                     ` Damien Le Moal
2025-12-22  5:57                       ` Eyal Lebedinsky
2025-12-30 22:43                         ` Eyal Lebedinsky
2026-01-02  1:21                           ` Damien Le Moal
2026-01-02  6:30                             ` Eyal Lebedinsky
