* ata timeout exceptions
@ 2025-11-03 4:13 Eyal Lebedinsky
2025-11-09 20:40 ` Niklas Cassel
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-03 4:13 UTC (permalink / raw)
To: list linux-ide
I have a sata disk that is probably on its last legs.
It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
It sees very little activity.
Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
For the last few weeks it started to log timeout errors (not always) like this:
kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
kernel: ata2.00: failed command: WRITE FPDMA QUEUED
kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
kernel: ata2.00: status: { DRDY }
kernel: ata2.00: failed command: WRITE FPDMA QUEUED
kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
kernel: ata2.00: status: { DRDY }
kernel: ata2: hard resetting link
kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
kernel: ata2.00: configured for UDMA/133
kernel: ata2: EH complete
Looking at the smart log I see that one more command_timeout was counted and no other attribute is incremented.
However, later on, this error was followed by 31 more failures, probably the full command queue was aborted.
The messages mention 'tag 0 ncq dma' through 'tag 31 ncq dma'.
Again, in the smart log, the whole burst counted as one extra command_timeout.
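As a reading aid for the log above, here is a minimal sketch that lists the tags named by an SAct value. It assumes SAct is the bitmask of NCQ tags that were outstanding when the port froze, which matches the two failed commands shown (tag 5 and tag 13). The helper name decode_sact is made up for illustration.

```sh
# Hypothetical helper: list the NCQ tags set in a SAct bitmask.
# $1 = SAct value as printed by the kernel, e.g. 0x2020
decode_sact() {
    mask=$(( $1 ))
    tag=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # append the tag number, space-separated
            out="${out:+$out }$tag"
        fi
        mask=$(( mask >> 1 ))
        tag=$(( tag + 1 ))
    done
    echo "$out"
}
```

For the first exception above, `decode_sact 0x2020` prints `5 13`. A burst that fails the whole queue would correspond to a mask with all 32 bits set.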
After this had been going on for a few days, a repeated burst of errors led to:
kernel: ata2.00: NCQ disabled due to excessive errors
From now on, only one exception is logged:
kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
kernel: ata2.00: failed command: WRITE DMA EXT
kernel: ata2.00: cmd 35/00:00:98:a3:4c/00:20:86:01:00/e0 tag 6 dma 4194304 out
res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
kernel: ata2.00: status: { DRDY }
kernel: ata2: hard resetting link
kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
kernel: ata2.00: configured for UDMA/133
kernel: ata2: EH complete
Furthermore, the smart log shows no change. This has been going on for the last two days,
over a dozen times.
I want to understand what is going on:
1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
Which layer absorbs the errors, hiding them from the application?
2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
Naturally, I already copied the disk to a replacement which I will install after this disk fails completely.
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-03 4:13 ata timeout exceptions Eyal Lebedinsky
@ 2025-11-09 20:40 ` Niklas Cassel
2025-11-09 22:41 ` Eyal Lebedinsky
2025-11-14 4:32 ` Eyal Lebedinsky
2025-12-16 23:39 ` Eyal Lebedinsky
2 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-11-09 20:40 UTC (permalink / raw)
To: Eyal Lebedinsky; +Cc: list linux-ide
Hello Eyal,
On Mon, Nov 03, 2025 at 03:13:34PM +1100, Eyal Lebedinsky wrote:
>
> I want to understand what is going on:
>
> 1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
> Which layer absorbs the errors, hiding them from the application?
SCSI layer.
For a timed out command, libata will set DID_TIME_OUT:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/ata/libata-eh.c#L652-L654
For most commands, the SCSI layer will set cmd->allowed to sdkp->max_retries:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L1411
which by default is 5:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L3962
Thus, most commands will be retried up to 5 times:
https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/scsi_error.c#L2225
Thus, the user will only see the I/O as an error if the command failed
6 times.
(Note that if the command returns sense data instead of timing out, depending on
the sense data returned, we might report an I/O error to the user immediately.)
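The retry budget described above can be pictured with a toy model. This is only a sketch of the counting, not kernel code: it assumes every attempt either times out or succeeds, and that a command is given up on once it has failed allowed + 1 times. The function name io_outcome is hypothetical.

```sh
# Toy model of the retry budget (not kernel code).
# $1 = number of consecutive timeouts the command runs into
# $2 = allowed retries (defaults to 5, the value quoted above)
io_outcome() {
    timeouts=$1
    allowed=${2:-5}
    attempts=$(( allowed + 1 ))      # first try plus the retries
    if [ "$timeouts" -ge "$attempts" ]; then
        echo "EIO after $attempts attempts"
    else
        echo "success on attempt $(( timeouts + 1 ))"
    fi
}
```

With the default budget, `io_outcome 1` reports success on the second attempt, which is the case where the application never notices the timeout; only `io_outcome 6` ends in an error.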
>
> 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
You are right that even if it is only a single command that times out,
the whole queue will be drained and retried.
(Because we always do a hard reset after a command timeout.)
command_timeout is most likely increased only by one because it was
only a single command that timed out. (The other commands might have
been queued but were never executed/finished.)
I have no idea why a command timeout, when NCQ has been disabled,
does not increase the command_timeout counter. My expectation would
have been for the counter to still be increased by one.
Kind regards,
Niklas
* Re: ata timeout exceptions
2025-11-09 20:40 ` Niklas Cassel
@ 2025-11-09 22:41 ` Eyal Lebedinsky
2025-11-10 13:11 ` Niklas Cassel
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-09 22:41 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel
Hello Niklas,
On 10/11/25 07:40, Niklas Cassel wrote:
> Hello Eyal,
>
> On Mon, Nov 03, 2025 at 03:13:34PM +1100, Eyal Lebedinsky wrote:
>>
>> I want to understand what is going on:
>>
>> 1) Why do I not see an I/O error and the writes to the disk (rsync) seem to complete?
>> Which layer absorbs the errors, hiding them from the application?
>
> SCSI layer.
I now understand this: the error does not originate from the disk itself, which may be unaware of it.
> For a timed out command, libata will set DID_TIME_OUT:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/ata/libata-eh.c#L652-L654
>
> For most commands, the SCSI layer will set cmd->allowed to sdkp->max_retries:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L1411
>
> which by default is 5:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/sd.c#L3962
>
> Thus, most commands will be retried up to 5 times:
> https://github.com/torvalds/linux/blob/v6.18-rc4/drivers/scsi/scsi_error.c#L2225
>
> Thus, the user will only see the I/O as an error if the command failed
> 6 times.
>
> (Note that if the command returns sense data instead of timing out, depending on
> the sense data returned, we might report an I/O error to the user immediately.)
Initially, after a series of failures ncq was internally disabled
ata2.00: NCQ disabled due to excessive errors
after which I forced it off, in the boot command
ata2.00: FORCE: modified (noncq)
and no Command_Timeout was counted since.
after which I set command
>>
>> 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
>
> You are right that even if it is only a single command that times out,
> the whole queue will be drained and retried.
> (Because we always do a hard reset after a command timeout.)
>
> command_timeout is most likely increased only by one because it was
> only a single command that timed out. (The other commands might have
> been queued but were never executed/finished.)
>
> I have no idea why a command timeout, when NCQ has been disabled,
> does not increase the command_timeout counter. My expectation would
> have been for the counter to still be increased by one.
This is an older SMR disk, and I would not be surprised if the disk was not even executing the command yet
but was doing some housekeeping when it was reset. After raising the timeout from 30s to 180s I still had one
case where a reset was invoked. I can see (iostat was running) that there was no activity on the disk for that whole time.
Or maybe it is just a fw bug in the disk (ST8000AS0002-1NA17Z from 2016)?
Is it possible that a reset when a command is pending is not counted in the smart log?
Interestingly, after repeated consecutive resets the link speed was downshifted 6.0->3.0->1.5g.
Now it boots at 3.0g when it used to always boot at 6.0g.
There must be a real issue there which is why the disk will be replaced anyway.
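The negotiated speed is also encoded in the SStatus value that the kernel prints next to it. A minimal sketch, assuming the printed value is the hexadecimal SATA SStatus register with the speed (SPD) field in bits 7:4. The helper name sstatus_speed is hypothetical.

```sh
# Hypothetical helper: map a printed SStatus value to the link speed.
# $1 = SStatus as printed by the kernel (hex digits without 0x), e.g. 133
sstatus_speed() {
    spd=$(( (0x$1 >> 4) & 0xf ))     # SPD field, bits 7:4
    case "$spd" in
        1) echo "1.5 Gbps" ;;
        2) echo "3.0 Gbps" ;;
        3) echo "6.0 Gbps" ;;
        *) echo "unknown" ;;
    esac
}
```

`sstatus_speed 133` gives 6.0 Gbps, so a downshifted link would be expected to show a smaller middle digit in the "SATA link up" line.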
Regardless, I now have a better understanding of the i/o path.
Regards,
Eyal
> Kind regards,
> Niklas
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-09 22:41 ` Eyal Lebedinsky
@ 2025-11-10 13:11 ` Niklas Cassel
0 siblings, 0 replies; 28+ messages in thread
From: Niklas Cassel @ 2025-11-10 13:11 UTC (permalink / raw)
To: Eyal Lebedinsky; +Cc: list linux-ide
On Mon, Nov 10, 2025 at 09:41:29AM +1100, Eyal Lebedinsky wrote:
> > > 2) Why do I get only one command_timeout counted (originally, with ncq active) and none when ncq is disabled?
> >
> > You are right that even if it is only a single command that times out,
> > the whole queue will be drained and retried.
> > (Because we always do a hard reset after a command timeout.)
> >
> > command_timeout is most likely increased only by one because it was
> > only a single command that timed out. (The other commands might have
> > been queued but were never executed/finished.)
> >
> > I have no idea why a command timeout, when NCQ has been disabled,
> > does not increase the command_timeout counter. My expectation would
> > have been for the counter to still be increased by one.
> This is an older SMR disk, and I would not be surprised if the disk was not even executing the command yet
> but was doing some housekeeping when it was reset. After raising the timeout from 30s to 180s I still had one
> case where a reset was invoked. I can see (iostat was running) that there was no activity on the disk for that whole time.
>
> Or maybe it is just a fw bug in the disk (ST8000AS0002-1NA17Z from 2016)?
> Is it possible that a reset when a command is pending is not counted in the smart log?
>
> Interestingly, after repeated consecutive resets the link speed was downshifted 6.0->3.0->1.5g.
> Now it boots at 3.0g when it used to always boot at 6.0g.
> There must be a real issue there which is why the disk will be replaced anyway.
>
> Regardless, I now have a better understanding of the i/o path.
I'm not sure how the command_timeout counter in the smart log works.
But from the Linux driver perspective, if an I/O has not completed within the
timeout, we will reset the controller, and retry the outstanding commands.
This timeout is defined by Linux, and is by default 30 seconds, like you
mentioned.
Not sure how the drive FW counts a command_timeout, but it is possible that
this internal counter has a timeout that is different from 30 seconds.
For AHCI, performing a hardreset is done by writing a register, so it is not
actually a command that is sent down to the drive. (For a softreset on the
other hand, a command is actually sent down to the drive.)
Kind regards,
Niklas
* Re: ata timeout exceptions
2025-11-03 4:13 ata timeout exceptions Eyal Lebedinsky
2025-11-09 20:40 ` Niklas Cassel
@ 2025-11-14 4:32 ` Eyal Lebedinsky
2025-11-18 15:17 ` Niklas Cassel
2025-12-16 23:39 ` Eyal Lebedinsky
2 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-14 4:32 UTC (permalink / raw)
To: list linux-ide
On 3/11/25 15:13, Eyal Lebedinsky wrote:
> I have a sata disk that is probably on its last legs.
> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
It is ST8000AS0002-1NA17Z from 2016.
> It sees very little activity.
>
> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>
> For the last few weeks it started to log timeout errors (not always) like this:
For the last two weeks I was monitoring the activity on the disk and here is what I did:
- added to boot command line: libata.force=2.00:noncq
now smartmon sees no more Command_Timeout errors since
- added to rc.local: echo 180 >/sys/block/sda/device/timeout # was 30
Drastically less timeout/resets
No counters change in smartctl report
Q1) Do I need to also set eh_timeout?
Q2) Is there any disk parameter I should set?
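For reference, the two per-device timeouts mentioned here can be inspected side by side. This is a sketch only, assuming the disk is sda and that both attributes are exposed under /sys/block/<dev>/device/. The show_timeouts helper and its optional sysfs-root argument are illustrative, not an existing tool.

```sh
# Hypothetical helper: print the command timeout and EH timeout of a disk.
# $1 = device name (e.g. sda)
# $2 = sysfs root, defaults to /sys (overridable so the helper can be tried
#      against a scratch directory)
show_timeouts() {
    dev=$1
    root=${2:-/sys}
    for knob in timeout eh_timeout; do
        f="$root/block/$dev/device/$knob"
        if [ -r "$f" ]; then
            echo "$knob=$(cat "$f")"
        else
            echo "$knob=unavailable"
        fi
    done
}
```

Running `show_timeouts sda` after the rc.local change should show timeout=180, alongside whatever eh_timeout the system currently uses.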
Then ran iostat and monitored the system log.
Every 2 hours a sync of 20-30GB is done to this disk. A smooth run takes 4 minutes.
Mostly it completes without errors logged.
However, once or twice a day the pauses become long enough to hit the 180s timeout.
Note: I do not see pauses longer than 30s but shorter than 180s.
See logs below.
Q3) What is going on in the disk during a pause? I understand that there was no communication from the disk,
just a long wait until the system issues a reset, when it probably retries successfully.
This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).
Regards,
Eyal
## example sync with pauses but without resets (timeout set to 180s):
14:07:46 2025-11-14
14:07:46 Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
14:07:56 sda 0.00 0.00 0.00 0.00 0 0 0
14:08:06 sda 29.60 10.40 55742.80 0.00 104 557428 0 start
14:08:16 sda 74.30 9.20 132401.60 0.00 92 1324016 0
14:08:26 sda 56.30 6.80 137086.80 0.00 68 1370868 0
14:08:36 sda 55.50 4.80 144625.20 0.00 48 1446252 0
14:08:46 sda 49.30 1.60 90455.20 0.00 16 904552 0
14:08:56 sda 0.00 0.00 0.00 0.00 0 0 0 pause #1
14:09:06 sda 0.00 0.00 0.00 0.00 0 0 0
14:09:16 sda 0.00 0.00 0.00 0.00 0 0 0
14:09:26 sda 95.50 3.60 135380.00 0.00 36 1353800 0
14:09:36 sda 85.20 1.60 164806.00 0.00 16 1648060 0
14:09:46 sda 4.10 0.80 10730.80 0.00 8 107308 0
14:09:56 sda 0.00 0.00 0.00 0.00 0 0 0 pause #2
14:10:06 sda 0.00 0.00 0.00 0.00 0 0 0
14:10:16 sda 16.80 0.80 33057.20 0.00 8 330572 0
14:10:26 sda 0.00 0.00 0.00 0.00 0 0 0 pause #3
14:10:36 sda 0.00 0.00 0.00 0.00 0 0 0
14:10:46 sda 0.00 0.00 0.00 0.00 0 0 0
14:10:56 sda 35.30 3.60 69011.20 0.00 36 690112 0
14:11:06 sda 75.90 4.00 145637.60 0.00 40 1456376 0
14:11:16 sda 10.00 0.80 24583.60 0.00 8 245836 0
14:11:26 sda 0.00 0.00 0.00 0.00 0 0 0 short pause #1
14:11:36 sda 9.00 0.40 14786.40 0.00 4 147864 0
14:11:46 sda 61.90 2.80 146004.80 0.00 28 1460048 0
14:11:56 sda 10.50 1.20 28356.80 0.00 12 283568 0
14:12:06 sda 0.00 0.00 0.00 0.00 0 0 0 pause #4
14:12:16 sda 0.00 0.00 0.00 0.00 0 0 0
14:12:26 sda 58.80 3.60 139051.20 0.00 36 1390512 0
14:12:36 sda 10.70 0.80 25866.40 0.00 8 258664 0
14:12:46 sda 0.00 0.00 0.00 0.00 0 0 0 short pause #2
14:12:46 sda 1.65 10.83 2721.90 0.00 142777 35897004 0
14:12:56 sda 0.00 0.00 0.00 0.00 0 0 0 short pause #3
14:13:06 sda 19.80 0.80 31588.00 0.00 8 315880 0
14:13:16 sda 34.50 2.00 84646.40 0.00 20 846464 0
14:13:26 sda 0.00 0.00 0.00 0.00 0 0 0 pause #5
14:13:36 sda 0.00 0.00 0.00 0.00 0 0 0
14:13:46 sda 68.20 2.80 118868.40 0.00 28 1188684 0
14:13:56 sda 0.20 0.00 380.80 0.00 0 3808 0
14:14:06 sda 0.00 0.00 0.00 0.00 0 0 0 pause #6
14:14:16 sda 0.00 0.00 0.00 0.00 0 0 0
14:14:26 sda 24.00 1.60 51746.40 0.00 16 517464 0
14:14:36 sda 63.80 5.20 136939.20 0.00 52 1369392 0
14:14:46 sda 59.70 4.80 125234.80 0.00 48 1252348 0
14:14:56 sda 16.70 1.20 40121.20 0.00 12 401212 0
14:15:06 sda 8.80 0.00 69.20 0.00 0 692 0
14:15:16 sda 0.00 0.00 0.00 0.00 0 0 0 end
14:15:26 sda 0.00 0.00 0.00 0.00 0 0 0
14:15:26 Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
However, a few times a day the pauses become long enough to hit the 180s timeout:
## example sync with pauses and one reset (timeout set to 180s):
12:08:05 sda 29.20 194.80 45394.40 0.00 1948 453944 0 start
12:08:15 sda 36.40 14.40 104648.40 0.00 144 1046484 0
12:08:25 sda 37.10 13.20 107768.40 0.00 132 1077684 0
12:08:35 sda 28.60 8.80 99516.40 0.00 88 995164 0
12:08:45 sda 31.30 8.40 114315.60 0.00 84 1143156 0
12:08:55 sda 33.10 9.60 111729.20 0.00 96 1117292 0
12:09:05 sda 12.10 1.60 24356.00 0.00 16 243560 0
12:09:15 sda 0.00 0.00 0.00 0.00 0 0 0 pause #1
12:09:25 sda 0.00 0.00 0.00 0.00 0 0 0
12:09:35 sda 7.10 2.00 16304.00 0.00 20 163040 0
12:09:45 sda 51.40 10.00 109822.80 0.00 100 1098228 0
12:09:55 sda 58.10 33.20 111903.60 0.00 332 1119036 0
12:10:05 sda 40.20 11.20 111351.20 0.00 112 1113512 0
12:10:15 sda 45.00 15.60 101494.00 0.00 156 1014940 0
12:10:25 sda 43.80 10.80 121330.80 0.00 108 1213308 0
12:10:35 sda 41.30 17.20 112128.40 0.00 172 1121284 0
12:10:45 sda 46.30 10.00 111787.20 0.00 100 1117872 0
12:10:55 sda 43.80 9.20 108923.60 0.00 92 1089236 0
12:11:05 sda 47.70 11.60 115351.60 0.00 116 1153516 0
12:11:15 sda 68.90 11.60 122597.60 0.00 116 1225976 0
12:11:25 sda 35.60 7.20 76411.60 0.00 72 764116 0
12:11:35 sda 24.60 4.80 55798.80 0.00 48 557988 0
12:11:45 sda 11.90 0.80 24110.00 0.00 8 241100 0
12:11:55 sda 0.00 0.00 0.00 0.00 0 0 0 pause #2
12:12:05 sda 0.00 0.00 0.00 0.00 0 0 0
12:12:15 sda 9.90 0.80 10282.40 0.00 8 102824 0
12:12:25 sda 0.00 0.00 0.00 0.00 0 0 0 long pause
12:12:35 sda 0.00 0.00 0.00 0.00 0 0 0
12:12:45 sda 0.00 0.00 0.00 0.00 0 0 0
12:12:55 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:05 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:15 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:25 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:35 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:45 sda 0.00 0.00 0.00 0.00 0 0 0
12:13:55 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:05 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:15 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:25 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:35 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:45 sda 0.00 0.00 0.00 0.00 0 0 0
12:14:45 2025-11-14
12:14:45 Device tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd
12:14:45 sda 2.20 23.25 3139.39 0.00 142021 19175400 0
12:14:55 sda 0.00 0.00 0.00 0.00 0 0 0
12:15:05 sda 0.00 0.00 0.00 0.00 0 0 0 timeout
12:15:14+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
12:15:14+11:00 kernel: ata2.00: status: { DRDY }
12:15:14+11:00 kernel: ata2: hard resetting link
12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
12:15:15+11:00 kernel: ata2: EH complete
12:15:15 sda 1.40 0.00 2985.20 0.00 0 29852 0
12:15:25 sda 50.60 11.60 121033.60 0.00 116 1210336 0
12:15:35 sda 15.00 1.60 30986.00 0.00 16 309860 0
12:15:45 sda 0.00 0.00 0.00 0.00 0 0 0 pause #3
12:15:55 sda 0.00 0.00 0.00 0.00 0 0 0
12:16:05 sda 11.00 0.40 14271.20 0.00 4 142712 0
12:16:15 sda 5.50 4.00 5271.60 0.00 40 52716 0
12:16:25 sda 0.00 0.00 0.00 0.00 0 0 0 end
12:16:35 sda 0.00 0.00 0.00 0.00 0 0 0
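The pauses annotated by hand in the two logs above could also be picked out mechanically. A rough sketch, assuming timestamped iostat lines in the layout shown (time, device, tps, ...) fed on standard input. It reports each run of zero-tps samples that is both preceded and followed by activity. The function name find_pauses and its arguments are made up for illustration. Cumulative since-boot lines, like the ones iostat prints when restarted, are not filtered out and would split a run.

```sh
# Hypothetical helper: report stalls in timestamped iostat output.
# $1 = device name (default sda)
# $2 = minimum number of consecutive zero-tps samples to report (default 2)
# Output: one line per stall, "<time of first idle sample> <number of samples>"
find_pauses() {
    awk -v dev="${1:-sda}" -v min="${2:-2}" '
        # skip date lines, column headers and kernel messages
        $2 != dev || NF < 9 { next }
        # an active sample closes any stall that was in progress
        $3 + 0 > 0 {
            if (seen && run >= min) printf "%s %d\n", start, run
            seen = 1; run = 0; next
        }
        # idle sample after the transfer has started: extend the current run
        seen { if (run == 0) start = $1; run++ }
    '
}
```

Idle time before the transfer starts and after it ends is ignored, so only stalls in the middle of the rsync are listed. With 10 second samples, a reported length of 18 or more would correspond to the 180s timeout.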
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-14 4:32 ` Eyal Lebedinsky
@ 2025-11-18 15:17 ` Niklas Cassel
2025-11-18 23:05 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-11-18 15:17 UTC (permalink / raw)
To: Eyal Lebedinsky; +Cc: list linux-ide, dlemoal
On Fri, Nov 14, 2025 at 03:32:20PM +1100, Eyal Lebedinsky wrote:
> > For the last few weeks it started to log timeout errors (not always) like this:
>
> For the last two weeks I was monitoring the activity on the disk and here is what I did:
>
> - added to boot command line: libata.force=2.00:noncq
> now smartmon sees no more Command_Timeout errors since
To be honest, I don't think you should need to disable NCQ.
> This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).
Which kernel version are you running?
> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
> 12:15:14+11:00 kernel: ata2: hard resetting link
> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
> 12:15:15+11:00 kernel: ata2: EH complete
It is a 4194304 byte write that is failing, i.e. 4 MiB write.
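The size can also be cross-checked from the taskfile bytes in the cmd line. A rough sketch, assuming 512-byte logical sectors and the print order cmd/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, with NCQ commands (60/61) carrying the sector count in the feature fields. The names tf_sectors and tf_bytes are hypothetical helpers.

```sh
# Hypothetical helper: sector count from a libata "cmd" taskfile string.
# $1 = taskfile string, e.g. 35/00:00:00:68:4e/00:20:77:00:00/e0
tf_sectors() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1                      # split into the 12 hex fields
    IFS=$old_ifs
    cmd=$1; feature=$2; nsect=$3; hob_feature=$7; hob_nsect=$8
    case "$cmd" in
        # READ/WRITE FPDMA QUEUED: count lives in the feature fields
        60|61) count=$(( (0x$hob_feature << 8) | 0x$feature )) ;;
        # assumed 48-bit command: count lives in the nsect fields
        *)     count=$(( (0x$hob_nsect << 8) | 0x$nsect )) ;;
    esac
    if [ "$count" -eq 0 ]; then
        count=65536                # a zero count means 65536 sectors
    fi
    echo "$count"
}

# Transfer size in bytes, assuming 512-byte logical sectors.
tf_bytes() {
    echo $(( $(tf_sectors "$1") * 512 ))
}
```

For the failing WRITE DMA EXT above, `tf_bytes 35/00:00:00:68:4e/00:20:77:00:00/e0` gives 4194304, i.e. 8192 sectors, which agrees with the "dma 4194304 out" part of the same line.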
This sounds very much like a recent bug report we have received:
https://bugzilla.kernel.org/show_bug.cgi?id=220693
In fact, a lot of the failing commands in that bug report are also reads
or writes of size 4 MiB.
I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
size for rotational devices") and see if that improves things for you
(while keeping NCQ enabled).
Kind regards,
Niklas
* Re: ata timeout exceptions
2025-11-18 15:17 ` Niklas Cassel
@ 2025-11-18 23:05 ` Eyal Lebedinsky
2025-11-19 5:41 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-18 23:05 UTC (permalink / raw)
To: Niklas Cassel; +Cc: list linux-ide, dlemoal
Thanks Niklas,
On 19/11/25 02:17, Niklas Cassel wrote:
> On Fri, Nov 14, 2025 at 03:32:20PM +1100, Eyal Lebedinsky wrote:
>>> For the last few weeks it started to log timeout errors (not always) like this:
>>
>> For the last two weeks I was monitoring the activity on the disk and here is what I did:
>>
>> - added to boot command line: libata.force=2.00:noncq
>> now smartmon sees no more Command_Timeout errors since
>
> To be honest, I don't think you should need to disable NCQ.
Disabling NCQ caused the disk to NOT count a smart Command_Timeout, it did not stop the actual pauses/resets.
>> This disk was used for a few years without such errors. The first report is recent (from 25/Oct this year).
>
> Which kernel version are you running?
This has been happening for a while now, using:
6.17.6-200.fc42.x86_64
6.17.7-200.fc42.x86_64
Before the start it was running without a problem:
6.16.3 - 6.16.12 since Aug 23
6.17.4-200.fc42.x86_64 since Oct 24 20:48:40
First timeout/reset a day later at Oct 25 20:09:27
I did set the timeout as high as 240s and found that if a pause is longer than 30s then it always continues until it times out.
I can set it higher if there is a possibility that it WILL complete the write. Is it worth it? How long?
The system runs off nvme and includes a 7 disk raid6.
Maybe relevant:
for a while (recently) following a reset this disk would downshift (6.0->3.0->1.5Gbps).
For a period it would actually boot up at 3.0Gbps.
It is back to 6.0Gbps for about 2 weeks (and many resets).
I still suspect the disk itself is at fault (I have a replacement synced and ready).
>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>> 12:15:14+11:00 kernel: ata2: hard resetting link
>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>> 12:15:15+11:00 kernel: ata2: EH complete
>
> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
Yes, this is the size of almost all commands. With NCQ enabled the sizes are very variable and often less than 1 MiB.
> This sounds very much like a recent bug report we have received:
> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>
> In fact, a lot of the failing commands in that bug report are also reads
> or writes of size 4 MiB.
>
> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
> size for rotational devices") and see if that improves things for you
> (while keeping NCQ enabled).
I read it. I never had I/O errors reported for this disk so it looks different to me.
Regardless, I am not set up to build a kernel (I used to), and being my main server I hesitate to fiddle with it.
I will keep this disk active and observe the situation.
Regards,
Eyal
> Kind regards,
> Niklas
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-18 23:05 ` Eyal Lebedinsky
@ 2025-11-19 5:41 ` Damien Le Moal
2025-11-19 13:37 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-19 5:41 UTC (permalink / raw)
To: eyal, Niklas Cassel; +Cc: list linux-ide
On 11/19/25 08:05, Eyal Lebedinsky wrote:
> I still suspect the disk itself is at fault (I have a replacement synced and
> ready).
>
>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>> 12:15:15+11:00 kernel: ata2: EH complete
>>
>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>
> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
> very variable and often less than 1 MiB.
Yes, because there will be more requests queued in the block layer, which
increases the chances of merging sequential requests. That's why the average
command size goes up.
>
>> This sounds very much like a recent bug report we have received:
>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>
>> In fact, a lot of the failing commands in that bug report are also reads
>> or writes of size 4 MiB.
>>
>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>> size for rotational devices") and see if that improves things for you
>> (while keeping NCQ enabled).
>
> I read it. I never had I/O errors reported for this disk so it looks
> different to me.
>
> Regardless, I am not set up to build a kernel (I used to), and being my main
> server I hesitate to fiddle with it. I will keep this disk active and
> observe the situation.
No, reverting this commit will not do anything to the max command size that a
disk can see. But you could try this:
echo 1280 > /sys/block/sdX/queue/max_sectors_kb
to reduce the maximum command size that the disk will receive.
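To put the suggested cap in numbers, here is a back-of-the-envelope sketch. It assumes max_sectors_kb is the only limit in play (real request splitting also depends on other queue limits). The name split_count is a hypothetical helper.

```sh
# Hypothetical helper: minimum number of commands needed for one request.
# $1 = request size in KiB, $2 = max_sectors_kb value in KiB
split_count() {
    req_kb=$1
    max_kb=$2
    # integer ceiling division
    echo $(( (req_kb + max_kb - 1) / max_kb ))
}
```

With a 1280 KiB cap, `split_count 4096 1280` gives 4, so a 4 MiB write like the one that timed out would reach the disk as at least four smaller commands instead of one.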
On the other hand, if all drives in your RAID6 array are the same and only this
drive is misbehaving, then I would be tempted to say the same as you: that the
disk is turning bad and replacing it is the best solution.
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-11-19 5:41 ` Damien Le Moal
@ 2025-11-19 13:37 ` Eyal Lebedinsky
2025-11-20 3:34 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-19 13:37 UTC (permalink / raw)
To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide
Thanks Damien,
On 19/11/25 16:41, Damien Le Moal wrote:
> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>> I still suspect the disk itself is at fault (I have a replacement synced and
>> ready).
>>
>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>
>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>
>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>> very variable and often less than 1 MiB.
>
> Yes, because there will be more requests queued in the block layer, which
> increases the chances of merging sequential requests. That's why the average
> command size goes up.
>
>>
>>> This sounds very much like a recent bug report we have received:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>
>>> In fact, a lot of the failing commands in that bug report are also reads
>>> or writes of size 4 MiB.
>>>
>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>> size for rotational devices") and see if that improves things for you
>>> (while keeping NCQ enabled).
>>
>> I read it. I never had I/O errors reported for this disk so it looks
>> different to me.
>>
>> Regardless, I am not set up to build a kernel (I used to), and being my main
>> server I hesitate to fiddle with it. I will keep this disk active and
>> observe the situation.
>
> No, reverting this commit will not do anything to the max command size that a
> disk can see. But you could try this:
>
> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>
> to reduce the maximum command size that the disk will receive.
I will try this.
> On the other hand, if all drives in your RAID6 array are the same and only this
> drive is misbehaving, then I would be tempted to say the same as you: that the
> disk is turning bad and replacing it is the best solution.
"this drive" is NOT part of the RAID, it is just a scratch disk used when space is
needed or for some local backups. It is old and it will not be any drama if it fails.
This is why I am comfortable trying more options before replacing it.
My main interest is to understand what actually is happening inside the disk.
I assume that copying the data from the CMR part to the SMR part is going on.
The two hourly job that at times triggers a timeout (1 or 2 times a day) is rsyncing 20GB
into the drive, so this much is updated. It takes just 4 minutes on a good day.
Anyway, thanks everyone,
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-19 13:37 ` Eyal Lebedinsky
@ 2025-11-20 3:34 ` Damien Le Moal
2025-11-20 11:38 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-20 3:34 UTC (permalink / raw)
To: eyal, Niklas Cassel; +Cc: list linux-ide
On 11/19/25 10:37 PM, Eyal Lebedinsky wrote:
> Thanks Damien,
>
> On 19/11/25 16:41, Damien Le Moal wrote:
>> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>>> I still suspect the disk itself is at fault (I have a replacement synced and
>>> ready).
>>>
>>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>>
>>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>>
>>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>>> very variable and often less than 1 MiB.
>>
>> Yes, because there will be more requests queued in the block layer, which
>> increases the chances of merging sequential requests. That's why the average
>> command size goes up.
>>
>>>
>>>> This sounds very much like a recent bug report we have received:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>>
>>>> In fact, a lot of the failing commands in that bug report are also reads
>>>> or writes of size 4 MiB.
>>>>
>>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>>> size for rotational devices") and see if that improves things for you
>>>> (while keeping NCQ enabled).
>>>
>>> I read it. I never had I/O errors reported for this disk so it looks
>>> different to me.
>>>
>>> Regardless, I am not set up to build a kernel (I used to), and being my main
>>> server I hesitate to fiddle with it. I will keep this disk active and
>>> observe the situation.
>>
>> No, reverting this commit will not do anything to the max command size that a
>> disk can see. But you could try this:
>>
>> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>
>> to reduce the maximum command size that the disk will receive.
>
> I will try this.
>
>> On the other hand, if all drives in your RAID6 array are the same and only this
>> drive is misbehaving, then I would be tempted to say the same you are: that the
>> disk is turning bad and replacing it is the best solution.
>
> "this drive" is NOT part of the RAID, it is just a scratch disk used when space is
> needed or for some local backups. It is old and it will not be any drama if it
> fails.
> This is why I am comfortable trying more options before replacing it.
>
> My main interest is to understand what actually is happening inside the disk.
> I assume that copying the data from the CMR part to the SMR part is going on.
Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
drives, the performance profile (throughput & command latency) can be all over
the place depending on the internal state of the drive. So all bets are off in
terms of timeout... In your case, this seems extreme though, so there is likely
a head going bad and lots of internal retries going on that make latency even
worse than usual. Maybe have a look at SMART output to see if you have lots of bad
sectors remapped ?
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-11-20 3:34 ` Damien Le Moal
@ 2025-11-20 11:38 ` Eyal Lebedinsky
2025-11-20 12:18 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-20 11:38 UTC (permalink / raw)
To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide
Thanks again Damien,
On 20/11/25 14:34, Damien Le Moal wrote:
> On 11/19/25 10:37 PM, Eyal Lebedinsky wrote:
>> Thanks Damien,
>>
>> On 19/11/25 16:41, Damien Le Moal wrote:
>>> On 11/19/25 08:05, Eyal Lebedinsky wrote:
>>>> I still suspect the disk itself is at fault (I have a replacement synced and
>>>> ready).
>>>>
>>>>>> 12:15:14+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
>>>>>> 12:15:14+11:00 kernel: ata2.00: cmd 35/00:00:00:68:4e/00:20:77:00:00/e0 tag 28 dma 4194304 out
>>>>>>          res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>>>>> 12:15:14+11:00 kernel: ata2.00: status: { DRDY }
>>>>>> 12:15:14+11:00 kernel: ata2: hard resetting link
>>>>>> 12:15:15+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>>> 12:15:15+11:00 kernel: ata2.00: configured for UDMA/133
>>>>>> 12:15:15+11:00 kernel: ata2: EH complete
>>>>>
>>>>> It is a 4194304 byte write that is failing, i.e. 4 MiB write.
>>>>
>>>> Yes, this is the size of almost all commands. With NCQ enabled the sizes are
>>>> very variable and often less than 1 MiB.
>>>
>>> Yes, because there will be more requests queued in the block layer, which
>>> increases the chances of merging sequential requests. That's why the average
>>> command size goes up.
>>>
>>>>
>>>>> This sounds very much like a recent bug report we have received:
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=220693
>>>>>
>>>>> In fact, a lot of the failing commands in that bug report is also a read
>>>>> or write of size 4 MiB.
>>>>>
>>>>> I guess you could try reverting 459779d04ae8 ("block: Improve read ahead
>>>>> size for rotational devices") and see if that improves things for you
>>>>> (while keeping NCQ enabled).
>>>>
>>>> I read it. I never had I/O errors reported for this disk so it looks
>>>> different to me.
>>>>
>>>> Regardless, I am not set up to build a kernel (I used to), and being my main
>>>> server I hesitate to fiddle with it. I will keep this disk active and
>>>> observe the situation.
>>>
>>> No, reverting this commit will not do anything to the max command size that a
>>> disk can see. But you could try this:
>>>
>>> echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>>
>>> to reduce the maximum command size that the disk will receive.
Done.
>> I will try this.
>>
>>> On the other hand, if all drives in your RAID6 array are the same and only this
>>> drive is misbehaving, then I would be tempted to say the same you are: that the
>>> disk is turning bad and replacing it is the best solution.
>>
>> "this drive" is NOT part of the RAID, it is just a scratch disk used when space is
>> needed or for some local backups. It is old and it will not be any drama if it
>> fails.
>> This is why I am comfortable trying more options before replacing it.
>>
>> My main interest is to understand what actually is happening inside the disk.
>> I assume that copying the data from the CMR part to the SMR part is going on.
>
> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
> drives, the performance profile (throughput & command latency) can be all over
> the place depending on the internal state of the drive. So all bets are off in
> terms of timeout... In your case, this seems extreme though, so there is likely
> a head going bad and lots of internal retries going on that make latency even
> worse than usual. Maybe have a look at SMART output to see if you have lots of bad
> sectors remapped ?
Nothing bad in smart report.
Another positive: After setting a lower max_sectors_kb as suggested, the drive is
running smoothly. I also added --fsync to the rsync which probably also regulated
the pace a bit.
So far today there was no reset required, and also no pause at all.
Maybe after the disk was used for a long while, and as a large amount of data was
replaced regularly, the data is now distributed wildly.
Is there an equivalent to 'trim' that can be used to tell the drive what blocks
can be discarded (and reused)? If so, worth a try?
Regards,
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-20 11:38 ` Eyal Lebedinsky
@ 2025-11-20 12:18 ` Damien Le Moal
2025-11-20 23:53 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-11-20 12:18 UTC (permalink / raw)
To: eyal, Niklas Cassel; +Cc: list linux-ide
On 11/20/25 20:38, Eyal Lebedinsky wrote:
>> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
>> drives, the performance profile (throughput & command latency) can be all over
>> the place depending on the internal state of the drive. So all bets are off in
>> terms of timeout... In your case, this seems extreme though, so there is likely
>> a head going bad and lots of internal retries going on that make latency even
>> worse than usual. Maybe have a look at SMART output to see if you have lots of bad
>> sectors remapped ?
>
> Nothing bad in smart report.
>
> Another positive: After setting a lower max_sectors_kb as suggested, the drive is
> running smoothly. I also added --fsync to the rsync which probably also regulated
> the pace a bit.
>
> So far today there was no reset required, and also no pause at all.
>
> Maybe after the disk was used for a long while, and as a large amount of data was
> replaced regularly, the data is now distributed wildly.
>
> Is there an equivalent to 'trim' that can be used to tell the drive what blocks
> can be discarded (and reused)? If so, worth a try?
If your drive shows a non-zero value for:
cat /sys/block/sdX/queue/discard_max_hw_bytes
then you can run fstrim against the FS on the drive to trim (discard) the unused
blocks. If the value is zero, then the drive does not support discard/trim.
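The check described above can be sketched as a small shell helper. This is only an illustration, not something posted in the thread: the function name supports_discard is hypothetical, and the sysfs file is passed in as an argument so the logic can be tried against an ordinary file first.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given discard_max_hw_bytes file
# holds a non-zero value, i.e. the device advertises discard/trim.
# $1 = path to the file, e.g. /sys/block/sdX/queue/discard_max_hw_bytes
supports_discard() {
    limit=$(cat "$1" 2>/dev/null) || return 2   # unreadable: cannot tell
    [ "${limit:-0}" -gt 0 ]                      # non-zero means supported
}

# Example usage (sdX and /mnt/scratch are placeholders):
# if supports_discard /sys/block/sdX/queue/discard_max_hw_bytes; then
#     fstrim -v /mnt/scratch
# else
#     echo "discard/trim not supported"
# fi
```

Keeping the decision separate from the fstrim call means the trim is only attempted once the drive is known to accept it.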
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-11-20 12:18 ` Damien Le Moal
@ 2025-11-20 23:53 ` Eyal Lebedinsky
0 siblings, 0 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-11-20 23:53 UTC (permalink / raw)
To: Damien Le Moal, Niklas Cassel; +Cc: list linux-ide
Thanks Damien,
On 20/11/25 23:18, Damien Le Moal wrote:
> On 11/20/25 20:38, Eyal Lebedinsky wrote:
>>> Ah ! This is a drive-managed SMR drive ? I missed that point. Yeah, with these
>>> drives, the performance profile (throughput & command latency) can be all over
>>> the place depending on the internal state of the drive. So all bets are off in
>>> terms of timeout... In your case, this seems extreme though, so there is likely
>>> a head going bad and lots of internal retries going on that make latency even
>>> worse than usual. Maybe have a look at SMART output to see if you have lots of bad
>>> sectors remapped ?
>>
>> Nothing bad in smart report.
>>
>> Another positive: After setting a lower max_sectors_kb as suggested, the drive is
>> running smoothly. I also added --fsync to the rsync which probably also regulated
>> the pace a bit.
>>
>> So far today there was no reset required, and also no pause at all.
>>
>> Maybe after the disk was used for a long while, and as a large amount of data was
>> replaced regularly, the data is now distributed wildly.
>>
>> Is there an equivalent to 'trim' that can be used to tell the drive what blocks
>> can be discarded (and reused)? If so, worth a try?
>
> If your drive shows a non-zero value for:
>
> cat /sys/block/sdX/queue/discard_max_hw_bytes
>
> then you can run fstrim against the FS on the drive to trim (discard) the unused
> blocks. If the value is zero, then the drive does not support discard/trim.
Not supported.
Is there a way to mark everything unused? Or was SMR not designed to handle this, way back in 2014?
I am able to copy the disk elsewhere, clear?, then copy back.
<unrelated>
Seagate shows the disk (ST8000AS0002-1NA17Z) as released in 2014. Says "No Newer Firmware Available".
Latest fw I found mentioned are AR15/AR17 (mine is AR13) which still do not support trim. It is for a later disk revision.
https://smarthdd.com/database/ST8000AS0002-1NA17Z/
</unrelated>
Regards,
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-11-03 4:13 ata timeout exceptions Eyal Lebedinsky
2025-11-09 20:40 ` Niklas Cassel
2025-11-14 4:32 ` Eyal Lebedinsky
@ 2025-12-16 23:39 ` Eyal Lebedinsky
2025-12-17 1:35 ` Damien Le Moal
2 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-16 23:39 UTC (permalink / raw)
To: list linux-ide
Resolved.
Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
# echo 1280 > /sys/block/sdX/queue/max_sectors_kb
did the trick. No pauses/resets anymore for over a month.
Setting
# echo 180 >/sys/block/sda/device/timeout
did not help, only made the pauses longer before the reset.
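For anyone wanting to script the max_sectors_kb setting above, here is a minimal sketch. It is an assumption-laden illustration rather than what was actually run: the helper name set_max_sectors_kb is hypothetical, and it assumes the queue directory also exposes max_hw_sectors_kb, which the requested value should not exceed.

```shell
#!/bin/sh
# Hypothetical helper: set max_sectors_kb, clamped to the hardware limit.
# $1 = queue directory (e.g. /sys/block/sdX/queue)
# $2 = desired maximum command size in KiB
set_max_sectors_kb() {
    queue=$1
    want=$2
    hw=$(cat "$queue/max_hw_sectors_kb") || return 1
    if [ "$want" -gt "$hw" ]; then
        want=$hw                      # clamp to the hardware limit
    fi
    echo "$want" > "$queue/max_sectors_kb" || return 1
    cat "$queue/max_sectors_kb"       # print the value now in effect
}

# Example usage (sdX is a placeholder, needs root):
# set_max_sectors_kb /sys/block/sdX/queue 1280
```

A value written this way does not survive a reboot, so it has to be reapplied at boot time.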
Thanks everyone.
Eyal
On 3/11/25 15:13, Eyal Lebedinsky wrote:
> I have a sata disk that is probably on its last legs.
> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
> It sees very little activity.
>
> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>
> For the last few weeks it started to log timeout errors (not always) like this:
>
> kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
> kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
> kernel: ata2.00: status: { DRDY }
> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
> kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
> res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> kernel: ata2.00: status: { DRDY }
> kernel: ata2: hard resetting link
> kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> kernel: ata2.00: configured for UDMA/133
> kernel: ata2: EH complete
[trimmed]
--
Eyal Lebedinsky (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-16 23:39 ` Eyal Lebedinsky
@ 2025-12-17 1:35 ` Damien Le Moal
2025-12-17 11:56 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-17 1:35 UTC (permalink / raw)
To: eyal, list linux-ide, Niklas Cassel
On 12/17/25 08:39, Eyal Lebedinsky wrote:
> Resolved.
>
> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
> # echo 1280 > /sys/block/sdX/queue/max_sectors_kb
> did the trick. No pauses/resets anymore for over a month.
We now have patches queued up to limit max_sectors_kb for devices and
controllers behaving badly. If you send us your device information (hdparm -I)
and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.
Though we would need to determine if it is the device or the adapter that is
mis-behaving, and also ideally, the command size at which things break.
We had another case with a device breaking above 4MiB commands. A quirk setting
max hw sectors to 8191 sectors solved the issue.
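To relate the sector counts used by quirks to the KiB values used in sysfs, the arithmetic can be sketched as below. This assumes 512-byte logical sectors; the helper names are made up for illustration.

```shell
#!/bin/sh
SECTOR_BYTES=512   # assumption: 512-byte logical sectors

# Hypothetical helpers for converting between sectors, bytes and KiB.
sectors_to_bytes() { echo $(( $1 * SECTOR_BYTES )); }
kib_to_sectors()   { echo $(( $1 * 1024 / SECTOR_BYTES )); }

sectors_to_bytes 8191   # 4193792, one sector short of 4 MiB
sectors_to_bytes 8192   # 4194304, i.e. exactly 4 MiB
kib_to_sectors 1280     # 2560
```

So a limit of 8191 sectors keeps every command just below 4 MiB, while max_sectors_kb=1280 corresponds to commands of at most 2560 sectors.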
>
> Setting
> # echo 180 >/sys/block/sda/device/timeout
> did not help, only made the pauses longer before the reset.
>
> Thanks everyone.
> Eyal
>
> On 3/11/25 15:13, Eyal Lebedinsky wrote:
>> I have a sata disk that is probably on its last legs.
>> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
>> It sees very little activity.
>>
>> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>>
>> For the last few weeks it started to log timeout errors (not always) like this:
>>
>> kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
>> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>> kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>> kernel: ata2.00: status: { DRDY }
>> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>> kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
>> res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> kernel: ata2.00: status: { DRDY }
>> kernel: ata2: hard resetting link
>> kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>> kernel: ata2.00: configured for UDMA/133
>> kernel: ata2: EH complete
>
> [trimmed]
>
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-12-17 1:35 ` Damien Le Moal
@ 2025-12-17 11:56 ` Eyal Lebedinsky
2025-12-17 12:02 ` Niklas Cassel
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-17 11:56 UTC (permalink / raw)
To: list linux-ide; +Cc: Damien Le Moal, Niklas Cassel
On 17/12/25 12:35, Damien Le Moal wrote:
> On 12/17/25 08:39, Eyal Lebedinsky wrote:
>> Resolved.
>>
>> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
>> # echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>> did the trick. No pauses/resets anymore for over a month.
>
> We now have patches queued up to limit max_sectors_kb for devices and
> controllers behaving badly. If you send us your device information (hdparm -I)
> and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.
>
> Though we would need to determine if it is the device or the adapter that is
> mis-behaving, and also ideally, the command size at which things break.
> We had another case with a device breaking above 4MiB commands. A quirk setting
> max hw sectors to 8191 sectors solved the issue.
The machine is: Gigabyte Z390 UD, BIOS AMI F8
The disk is, according to smartctl:
Model Family: Seagate Archive HDD (SMR)
Device Model: ST8000AS0002-1NA17Z
Firmware Version: AR13
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
There was no hw change during this period.
Here is what I think:
This disk did not exhibit the problem for the last 1.5 years when it was in constant use.
[before that, since Jan/2016, it was used every few months as a backup disk]
Then, last month it started to show the problem.
Being an early SMR disk, is it possible that it reached a state where all block updates require a track read/write
(no more unused tracks), so that at high bandwidth it gets into trouble? It did not matter how high I set the timeout
(I tested up to 240); it always timed out if any pause was encountered.
Being a rather old disk (a 2014 model?)
Maybe a fw bug?
Maybe an SMR design misfeature?
Do you want me to try different max_sectors_kb values to see where it breaks?
Regards,
Eyal
>>
>> Setting
>> # echo 180 >/sys/block/sda/device/timeout
>> did not help, only made the pauses longer before the reset.
>>
>> Thanks everyone.
>> Eyal
>>
>> On 3/11/25 15:13, Eyal Lebedinsky wrote:
>>> I have a sata disk that is probably on its last legs.
>>> It is a plain disk (no RAID or such). If it matters, it is an old 8TB Seagate SMR disk.
>>> It sees very little activity.
>>>
>>> Every two hours a small rsync copies a directory into this disk. A few 100s of files are copied each time, a few 10s of GB in total.
>>>
>>> For the last few weeks it started to log timeout errors (not always) like this:
>>>
>>> kernel: ata2.00: exception Emask 0x0 SAct 0x2020 SErr 0x0 action 0x6 frozen
>>> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>> kernel: ata2.00: cmd 61/80:28:a0:10:df/00:00:d1:01:00/40 tag 5 ncq dma 65536 out
>>> res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
>>> kernel: ata2.00: status: { DRDY }
>>> kernel: ata2.00: failed command: WRITE FPDMA QUEUED
>>> kernel: ata2.00: cmd 61/00:68:18:15:30/20:00:20:01:00/40 tag 13 ncq dma 4194304 out
>>> res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>>> kernel: ata2.00: status: { DRDY }
>>> kernel: ata2: hard resetting link
>>> kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>> kernel: ata2.00: configured for UDMA/133
>>> kernel: ata2: EH complete
>>
>> [trimmed]
>>
>
>
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-17 11:56 ` Eyal Lebedinsky
@ 2025-12-17 12:02 ` Niklas Cassel
2025-12-20 4:03 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Niklas Cassel @ 2025-12-17 12:02 UTC (permalink / raw)
To: eyal, Eyal Lebedinsky, list linux-ide; +Cc: Damien Le Moal
On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>On 17/12/25 12:35, Damien Le Moal wrote:
>> On 12/17/25 08:39, Eyal Lebedinsky wrote:
>>> Resolved.
>>>
>>> Limiting disk access bandwidth (as suggested by Damien Le Moal <dlemoal@kernel.org>)
>>> # echo 1280 > /sys/block/sdX/queue/max_sectors_kb
>>> did the trick. No pauses/resets anymore for over a month.
>>
>> We now have patches queued up to limit max_sectors_kb for devices and
>> controllers behaving badly. If you send us your device information (hdparm -I)
>> and controller info (PCI ID of your AHCI adapter), we can add a permanent quirk.
>>
>> Though we would need to determine if it is the device or the adapter that is
>> mis-behaving, and also ideally, the command size at which things break.
>> We had another case with a device breaking above 4MiB commands. A quirk setting
>> max hw sectors to 8191 sectors solved the issue.
>
>The machine is: Gigabyte Z390 UD, BIOS AMI F8
>
>The disk is, according to smartctl:
> Model Family: Seagate Archive HDD (SMR)
> Device Model: ST8000AS0002-1NA17Z
> Firmware Version: AR13
> User Capacity: 8,001,563,222,016 bytes [8.00 TB]
> Sector Sizes: 512 bytes logical, 4096 bytes physical
>There was no hw change during this period.
>
>Here is what I think:
>
>This disk did not exhibit the problem for the last 1.5 years when it was in constant use.
> [before that, since Jan/2016, it was used every few months as a backup disk]
>Then, last month it started to show the problem.
>
>Being an early SMR disk, is it possible that it reached a state where all block updates require a track read/write
>(no more unused tracks) and at high bandwidth it gets into trouble. It did not matter how high I set the timeout
>(I tested up to 240) it always timed out if any pause was encountered.
>
>Being a rather old disk (a 2014 model?)
> Maybe a fw bug?
> Maybe an SMR design misfeature?
>
>Do you want me to try different max_sectors_kb values to see where it breaks?
>
You can also try this:
https://github.com/floatious/max-sectors-quirk
It tries these max_sectors_kb values:
declare -a sizes=(128 1024 2048 3072 4095 4096)
You can simply modify the script if you want to try more intermediate sizes.
Kind regards,
Niklas
* Re: ata timeout exceptions
2025-12-17 12:02 ` Niklas Cassel
@ 2025-12-20 4:03 ` Eyal Lebedinsky
2025-12-21 8:34 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-20 4:03 UTC (permalink / raw)
To: list linux-ide; +Cc: Damien Le Moal, Niklas Cassel
On 17/12/25 23:02, Niklas Cassel wrote:
> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
[trimmed]
>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>
>
> You can also try this:
>
> https://github.com/floatious/max-sectors-quirk
>
> It tries these max_sectors_kb values:
> declare -a sizes=(128 1024 2048 3072 4095 4096)
>
> You can simply modify the script if you want to try more intermediate sizes.
After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
See comments after the test report.
------------------ test start
$ sudo sh ./find-max-sectors.sh /dev/sda
Drive model:
ST8000AS0002-1NA17Z
Drive firmware:
AR13
SATA / AHCI controller:
00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10)
Drive values before running the test:
/sys/block/sda/queue/max_hw_sectors_kb:32767
/sys/block/sda/queue/max_sectors_kb:4096
/sys/block/sda/queue/read_ahead_kb:8192
Running test with max_sectors 128 KiB
Test: PASS
Running test with max_sectors 1024 KiB
Test: PASS
Running test with max_sectors 2048 KiB
Test: PASS
Running test with max_sectors 3072 KiB
Test: PASS
Running test with max_sectors 4095 KiB
Test: PASS
Running test with max_sectors 4096 KiB
Test: PASS
------------------ test end
For the last few days I tested my usual workload with 3072, 4095 and 4096.
Up to 3072 all runs were clean, no pauses.
While I failed to produce a reset, the last (large) sizes often triggered pauses, about 30s each,
after which the job continued to completion. With 4096 I did see a long pause triggering a reset.
Note: What I saw so far (many times) is that if a pause is longer than 30s it never recovers until a reset.
Important: my workload is writing to this disk, I never saw any problems reading from it.
My guess is that the disk has confidence in accepting large write blocks but it fails to live up to the promise.
Writing is more involved on SMR and maybe there is an issue specific to large writes.
Looking at the smart stats, the disk had only 180TB lifetime writes, and it is specced for 55TB/year (or 180TB/y with v2 and v3).
So it is far from full.
I will now leave it running for a few days with max_sectors_kb=4095 to see if it also triggers a reset.
Regards,
Eyal
----- example of my workload, no pause. max_sectors_kb=4096, timeout=120
18:06:52 2025-12-19
18:07:02 Device wareq-sz w/s kB_w/s kB_w
18:07:02 sda 0.00 0.00 0.00 0.00
18:07:12 sda 0.00 0.00 0.00 0.00
18:07:22 sda 0.00 0.00 0.00 0.00
18:07:32 sda 0.00 0.00 0.00 0.00
18:07:42 sda 0.00 0.00 0.00 0.00
18:07:52 sda 0.00 0.00 0.00 0.00
18:08:02 sda 1027.69 14.00 14387.66 143876.60
18:08:12 sda 912.34 105.60 96343.10 963431.04
18:08:22 sda 1303.94 106.50 138869.61 1388696.10
18:08:32 sda 1844.63 72.60 133920.14 1339201.38
18:08:42 sda 2083.31 66.30 138123.45 1381234.53
18:08:52 sda 1339.79 71.30 95527.03 955270.27
18:09:02 sda 2427.31 47.70 115782.69 1157826.87
18:09:12 sda 1817.04 69.80 126829.39 1268293.92
18:09:22 sda 1805.27 54.70 98748.27 987482.69
18:09:32 sda 1186.68 90.10 106919.87 1069198.68
18:09:42 sda 1041.10 114.10 118789.51 1187895.10
18:09:52 sda 972.94 141.80 137962.89 1379628.92
18:10:02 sda 1086.14 90.70 98512.90 985128.98
18:10:12 sda 1360.86 74.50 101384.07 1013840.70
18:10:22 sda 1354.29 87.60 118635.80 1186358.04
18:10:32 sda 1712.45 63.00 107884.35 1078843.50
18:10:42 sda 1529.06 74.20 113456.25 1134562.52
18:10:52 sda 1681.66 65.80 110653.23 1106532.28
18:11:02 sda 1589.97 73.80 117339.79 1173397.86
18:11:12 sda 1623.92 35.40 57486.77 574867.68
18:11:22 sda 0.00 0.00 0.00 0.00
18:11:32 sda 0.00 0.00 0.00 0.00
18:11:42 sda 0.00 0.00 0.00 0.00
18:11:52 sda 0.00 0.00 0.00 0.00
-----
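Samples in the format shown above (time, device, wareq-sz, w/s, kB_w/s, kB_w) can be scanned for stalls with a short awk filter. The sketch below is an illustration only: the function name find_stalls is hypothetical, it assumes one sample every 10 seconds, and it treats a run of zero-throughput samples that sits between two active samples as a pause. Idle time before the job starts or after it ends is not reported.

```shell
#!/bin/sh
# Hypothetical helper: report zero-throughput gaps in the middle of a job.
# $1 = device name, $2 = sampling interval in seconds (default 10).
# Reads the samples from standard input.
find_stalls() {
    awk -v dev="$1" -v step="${2:-10}" '
        $2 != dev || NF != 6 { next }            # skip header and date lines
        $5 + 0 > 0 {                             # sample with write throughput
            if (seen && zeros > 0)
                printf "%s stall of about %d s\n", start, zeros * step
            seen = 1; zeros = 0; next
        }
        seen { if (zeros++ == 0) start = $1 }    # idle sample after activity
    '
}

# Example usage (workload.log is a placeholder):
# find_stalls sda 10 < workload.log
```

Because of the 10 second sampling, the reported durations are only accurate to within one interval.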
----- example of my workload, showing two pauses followed by a 3rd long pause+reset
22:06:53 2025-12-19
22:07:03 Device wareq-sz w/s kB_w/s kB_w
22:07:03 sda 0.00 0.00 0.00 0.00
22:07:13 sda 0.00 0.00 0.00 0.00
22:07:23 sda 0.00 0.00 0.00 0.00
22:07:33 sda 0.00 0.00 0.00 0.00
22:07:43 sda 0.00 0.00 0.00 0.00
22:07:53 sda 0.00 0.00 0.00 0.00
22:08:03 sda 584.10 27.70 16179.57 161795.70
22:08:13 sda 865.20 137.40 118878.48 1188784.80
22:08:23 sda 1035.57 73.50 76114.39 761143.95
22:08:33 sda 928.37 133.00 123473.21 1234732.10
22:08:43 sda 740.79 153.30 113563.11 1135631.07
22:08:53 sda 780.15 103.90 81057.59 810575.85
22:09:03 sda 1061.52 107.60 114219.55 1142195.52
22:09:13 sda 674.01 127.40 85868.87 858688.74
22:09:23 sda 959.01 137.80 132151.58 1321515.78
22:09:33 sda 1427.83 81.90 116939.28 1169392.77
22:09:43 sda 1210.29 90.50 109531.24 1095312.45
22:09:53 sda 864.15 32.90 28430.53 284305.35
22:10:03 sda 0.00 0.00 0.00 0.00
22:10:13 sda 0.00 0.00 0.00 0.00
22:10:23 sda 1787.96 11.00 19667.56 196675.60
22:10:33 sda 0.00 0.00 0.00 0.00
22:10:43 sda 0.00 0.00 0.00 0.00
22:10:53 sda 0.00 0.00 0.00 0.00
22:11:03 sda 1569.92 33.40 52435.33 524353.28
22:11:13 sda 1262.29 69.50 87729.15 877291.55
22:11:23 sda 0.00 0.00 0.00 0.00
22:11:33 sda 0.00 0.00 0.00 0.00
22:11:43 sda 0.00 0.00 0.00 0.00
22:11:53 sda 0.00 0.00 0.00 0.00
22:11:53 2025-12-19
22:12:03 Device wareq-sz w/s kB_w/s kB_w
22:12:03 sda 0.00 0.00 0.00 0.00
22:12:13 sda 0.00 0.00 0.00 0.00
22:12:23 sda 0.00 0.00 0.00 0.00
22:12:33 sda 0.00 0.00 0.00 0.00
22:12:43 sda 0.00 0.00 0.00 0.00
22:12:53 sda 0.00 0.00 0.00 0.00
22:13:03 sda 0.00 0.00 0.00 0.00
22:13:13 sda 0.00 0.00 0.00 0.00
22:13:23 sda 972.28 122.40 119007.07 1190070.72
22:13:33 sda 1303.25 92.00 119899.00 1198990.00
22:13:43 sda 2139.47 51.80 110824.55 1108245.46
22:13:53 sda 1240.71 99.60 123574.72 1235747.16
22:14:03 sda 1516.56 74.00 112225.44 1122254.40
22:14:13 sda 1832.90 56.60 103742.14 1037421.40
22:14:23 sda 1112.75 108.90 121178.48 1211784.75
22:14:33 sda 573.11 93.50 53585.79 535857.85
22:14:43 sda 0.00 0.00 0.00 0.00
22:14:53 sda 0.00 0.00 0.00 0.00
-- the system log shows the reset:
2025-12-19T22:13:12+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2025-12-19T22:13:12+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
2025-12-19T22:13:12+11:00 kernel: ata2.00: cmd 35/00:00:00:08:34/00:20:ba:01:00/e0 tag 15 dma 4194304 out
res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
2025-12-19T22:13:12+11:00 kernel: ata2.00: status: { DRDY }
2025-12-19T22:13:12+11:00 kernel: ata2: hard resetting link
2025-12-19T22:13:13+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2025-12-19T22:13:13+11:00 kernel: ata2.00: configured for UDMA/133
2025-12-19T22:13:13+11:00 kernel: ata2: EH complete
-----
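As a cross-check of the "dma 4194304" figure, the transfer size can be recomputed from the cmd field of the log line. The sketch below is an illustration under stated assumptions: the helper name decode_cmd_bytes is hypothetical, the field layout is assumed to be command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and logical sectors are assumed to be 512 bytes. It only covers non-NCQ 48-bit commands such as WRITE DMA EXT; NCQ commands carry the count in the feature fields, which this sketch does not handle.

```shell
#!/bin/sh
# Hypothetical helper: compute the transfer size in bytes of a non-NCQ
# 48-bit command from the "cmd" field of a libata error report.
decode_cmd_bytes() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1            # split the field on "/" and ":"
    IFS=$old_ifs
    # $3 = nsect (low byte of the count), $8 = hob_nsect (high byte)
    sectors=$(( 0x$8$3 ))
    if [ "$sectors" -eq 0 ]; then
        sectors=65536    # a count of 0 means 65536 sectors for 48-bit commands
    fi
    echo $(( sectors * 512 ))
}

decode_cmd_bytes 35/00:00:00:08:34/00:20:ba:01:00/e0   # 4194304
```

The result matches the size that the kernel prints after "dma" on the same line.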
> Kind regards,
> Niklas
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-20 4:03 ` Eyal Lebedinsky
@ 2025-12-21 8:34 ` Damien Le Moal
2025-12-21 12:12 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-21 8:34 UTC (permalink / raw)
To: eyal, list linux-ide; +Cc: Niklas Cassel
On 12/20/25 13:03, Eyal Lebedinsky wrote:
> On 17/12/25 23:02, Niklas Cassel wrote:
>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>
> [trimmed]
>
>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>
>>
>> You can also try this:
>>
>> https://github.com/floatious/max-sectors-quirk
>>
>> It tries these max_sectors_kb values:
>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>
>> You can simply modify the script if you want to try more intermediate sizes.
>
> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
> See comments after the test report.
From your test results, it seems that the drive actually correctly handles very
large write commands, but because it is a drive-managed SMR disk, such commands
can take a very long time to process (due to the drive needing internal garbage
collection first), which triggers a timeout and a reset as the ata subsystem
assumes that the drive has stopped responding.
Limiting write commands to smaller sizes seems to mostly avoid this issue, even
though I do not think that gives any guarantees that the same issue will not
happen for small writes too.
So my suggestion is that you run with something like "libata.force=[<port
ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
but that would really be more of a big-hammer solution.
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-12-21 8:34 ` Damien Le Moal
@ 2025-12-21 12:12 ` Eyal Lebedinsky
2025-12-21 22:43 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-21 12:12 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal
On 21/12/25 19:34, Damien Le Moal wrote:
> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>> On 17/12/25 23:02, Niklas Cassel wrote:
>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>
>> [trimmed]
>>
>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>
>>>
>>> You can also try this:
>>>
>>> https://github.com/floatious/max-sectors-quirk
>>>
>>> It tries these max_sectors_kb values:
>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>
>>> You can simply modify the script if you want to try more intermediate sizes.
>>
>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>> See comments after the test report.
>
> From your test results, it seems that the drive actually correctly handles very
> large write commands, but because it is a drive-managed SMR disk, such commands
> can take a very long time to process (due to the drive needing internal garbage
> collection first), which triggers a timeout and a reset as the ata subsystem
> assumes that the drive has stopped responding.
>
> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
> though I do not think that gives any guarantees that the same issue will not
> happen for small writes too.
>
> So my suggestion is that you run with something like "libata.force=[<port
> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
> but that would really be more of a big-hammer solution.
I mostly agree. However, extending the timeout did not help in the past.
I found that even setting it to 240 was not enough.
If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
The pause is either up to 30s or unlimited (until timeout is reached and a reset).
ATM I have the setting of timeout=120 in rc.local.
I also have in my boot cmd:
libata.force=2.00:noncq
This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
despite many timeout resets since.
I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
BTW, for the last few days I have been running with max_sectors_kb=3584 and have seen no pauses at all.
With 4096 and 4095 I did get a long pause/reset.
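
The two run-time settings mentioned above can be applied together. A minimal sketch, assuming the disk is sda and the values are the ones quoted in this message; apply_disk_limits and SYSFS_ROOT are hypothetical names introduced only for this example, and the writes need root on a live system:

```sh
# Apply the two run-time limits discussed here to one disk.
# Arguments: device name, max_sectors_kb, command timeout in seconds.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a scratch directory; it defaults to the real /sys.
apply_disk_limits() {
    dev=$1 kb=$2 tmo=$3
    base=${SYSFS_ROOT:-/sys}/block/$dev
    echo "$kb"  > "$base/queue/max_sectors_kb" || return 1
    echo "$tmo" > "$base/device/timeout"       || return 1
    # Read the values back so a rejected write is visible.
    echo "$dev: max_sectors_kb=$(cat "$base/queue/max_sectors_kb") timeout=$(cat "$base/device/timeout")"
}

# Example (as root): apply_disk_limits sda 3584 120
```

These values do not survive a reboot, which is why they are set from rc.local here. A max_sectors_kb above the device's max_hw_sectors_kb should be rejected by the kernel, so the read-back line is worth checking.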
Regards,
Eyal
(*) smart shows 180TB written so far in 1.6y on. I see a workload limit of 180TB/y listed but
I also saw an older spec that said 55TB/y. I do not know if I have a v1, v2 or v3 of this model.
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-21 12:12 ` Eyal Lebedinsky
@ 2025-12-21 22:43 ` Eyal Lebedinsky
2025-12-21 23:14 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-21 22:43 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal
On 21/12/25 23:12, Eyal Lebedinsky wrote:
> On 21/12/25 19:34, Damien Le Moal wrote:
>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>
>>> [trimmed]
>>>
>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>
>>>>
>>>> You can also try this:
>>>>
>>>> https://github.com/floatious/max-sectors-quirk
>>>>
>>>> It tries these max_sector_kb values:
>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>
>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>
>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>> See comments after the test report.
>>
>> From your test results, it seems that the drive actually correctly handles very
>> large write commands, but because it is a drive-managed SMR disk, such commands
>> can take a very long time to process (due to the drive needing internal garbage
>> collection first), which triggers a timeout and a reset as the ata subsystem
>> assumes that the drive has stopped responding.
>>
>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>> though I do not think that gives any guarantees that the same issue will not
>> happen for small writes too.
>>
>> So my suggestion is that you run with something like "libata.force=[<port
>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>> but that would be a really more of a big hammer solution.
>
> I mostly agree. However, extending the timeout did not help in the past.
> I found that even setting it to 240 was not enough.
> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>
> ATM I have the setting of timeout=120 in rc.local.
>
> I also have in my boot cmd:
> libata.force=2.00:noncq
> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
> despite many timeout resets since.
>
> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
I have just run the test; it failed quickly, see below.
It shows a 17-minute pause followed by a reset, after which the test resumed and completed.
The important question is: do we need a quirk?
Is this an inherent problem with this model? If so then a quirk is justified.
Is this an inherent problem after this model had many writes? Again, a quirk is justified.
Is this a fault with my specific disk? No quirk, I will keep my safe settings.
Let me know if further testing, or more information, is required.
Regards,
Eyal
08:45:33 sudo sh -c 'echo 1024 >/sys/block/sda/device/timeout'
08:45:45 sudo sh -c 'echo 4096 > /sys/block/sda/queue/max_sectors_kb'
08:51:00 sudo /usr/local/bin/sync-tellerstats.sh # start a test
Quickly got a few shorter timeouts then an indefinite one:
08:51:10 2025-12-22
08:51:10 Device wareq-sz w/s kB_w/s kB_w
08:51:00 sda 0.00 0.00 0.00 0.00
08:51:10 sda 2187.47 59.90 131029.45 1310294.53 start of test
08:51:20 sda 2041.32 75.90 154936.19 1549361.88
08:51:30 sda 1745.31 87.50 152714.62 1527146.25
08:51:40 sda 1601.22 92.50 148112.85 1481128.50
08:51:50 sda 1444.75 97.40 140718.65 1407186.50
08:52:00 sda 2097.17 37.90 79482.74 794827.43
08:52:10 sda 0.00 0.00 0.00 0.00 short pause
08:52:20 sda 0.00 0.00 0.00 0.00
08:52:30 sda 622.74 7.30 4546.00 45460.02
08:52:40 sda 0.00 0.00 0.00 0.00 short pause
08:52:50 sda 0.00 0.00 0.00 0.00
08:53:00 sda 0.00 0.00 0.00 0.00
08:53:10 sda 2083.45 35.00 72920.75 729207.50
08:53:20 sda 1285.02 107.00 137497.14 1374971.40
08:53:30 sda 1882.78 77.30 145538.89 1455388.94
08:53:40 sda 1839.43 14.70 27039.62 270396.21
08:53:50 sda 0.00 0.00 0.00 0.00 pause
08:54:00 sda 0.00 0.00 0.00 0.00
08:54:10 sda 0.00 0.00 0.00 0.00
08:54:20 sda 0.00 0.00 0.00 0.00 longer than 30s
08:54:30 sda 0.00 0.00 0.00 0.00
...
09:10:50 sda 0.00 0.00 0.00 0.00 17 minutes later...
09:11:00 sda 1648.50 52.40 86381.40 863814.00
09:11:10 sda 1240.31 111.80 138666.66 1386666.58
09:11:20 sda 2172.02 64.50 140095.29 1400952.90
09:11:30 sda 2592.31 47.90 124171.65 1241716.49
09:11:40 sda 2143.92 57.60 123489.79 1234897.92
09:11:50 sda 1604.22 89.40 143417.27 1434172.68
09:12:00 sda 1550.97 88.60 137415.94 1374159.42
09:12:00 2025-12-22
09:12:10 Device wareq-sz w/s kB_w/s kB_w
09:12:10 sda 1215.60 47.50 57741.00 577410.00 end of test
09:12:20 sda 0.00 0.00 0.00 0.00
09:12:30 sda 0.00 0.00 0.00 0.00
-- system log:
2025-12-22T09:10:52+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2025-12-22T09:10:52+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
2025-12-22T09:10:52+11:00 kernel: ata2.00: cmd 35/00:00:00:c8:d0/00:20:af:00:00/e0 tag 1 dma 4194304 out
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2025-12-22T09:10:52+11:00 kernel: ata2.00: status: { DRDY }
2025-12-22T09:10:52+11:00 kernel: ata2: hard resetting link
2025-12-22T09:10:53+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2025-12-22T09:10:53+11:00 kernel: ata2.00: configured for UDMA/133
2025-12-22T09:10:53+11:00 kernel: ata2: EH complete
-- end of test:
09:13:37 sudo sh -c 'echo 3584 > /sys/block/sda/queue/max_sectors_kb'
09:13:56 sudo sh -c 'echo 120 >/sys/block/sda/device/timeout'
smart does NOT show increased '188 Command_Timeout'.
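
Pauses like the ones annotated in the log above can be picked out mechanically. A minimal sketch, assuming log lines in the same layout as this report (time, device, wareq-sz, w/s, kB_w/s, kB_w) at a fixed sampling interval; longest_stall is a made-up name. It only counts samples that are present, so lines elided with "..." are not included in the result:

```sh
# Report the longest run of zero-write samples that sits between two
# samples with write activity, in seconds.
# Arguments: device name, sampling interval in seconds (default 10).
# Input on stdin: lines of the form "HH:MM:SS dev wareq-sz w/s kB_w/s kB_w".
longest_stall() {
    awk -v dev="$1" -v step="${2:-10}" '
        BEGIN { max = 0; run = 0; seen = 0 }
        $2 == dev {
            if ($5 + 0 == 0) {
                # Idle time before the first write is not a stall.
                if (seen) run++
            } else {
                if (run > max) max = run
                run = 0
                seen = 1
            }
        }
        END { print max * step }
    '
}

# Example: longest_stall sda 10 < iostat.log
```

Zero samples after the last write are ignored as well, so the idle tail after the end of a test does not count as a stall.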
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-21 22:43 ` Eyal Lebedinsky
@ 2025-12-21 23:14 ` Damien Le Moal
2025-12-22 2:10 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-21 23:14 UTC (permalink / raw)
To: eyal, list linux-ide; +Cc: Niklas Cassel
On 12/22/25 07:43, Eyal Lebedinsky wrote:
> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>> On 21/12/25 19:34, Damien Le Moal wrote:
>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>
>>>> [trimmed]
>>>>
>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>
>>>>>
>>>>> You can also try this:
>>>>>
>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>
>>>>> It tries these max_sector_kb values:
>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>
>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>
>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>> See comments after the test report.
>>>
>>> From your test results, it seems that the drive actually correctly handles very
>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>> can take a very long time to process (due to the drive needing internal garbage
>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>> assumes that the drive has stopped responding.
>>>
>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>> though I do not think that gives any guarantees that the same issue will not
>>> happen for small writes too.
>>>
>>> So my suggestion is that you run with something like "libata.force=[<port
>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>> but that would be a really more of a big hammer solution.
>>
>> I mostly agree. However, extending the timeout did not help in the past.
>> I found that even setting it to 240 was not enough.
>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>
>> ATM I have the setting of timeout=120 in rc.local.
>>
>> I also have in my boot cmd:
>> libata.force=2.00:noncq
>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>> despite many timeout resets since.
>>
>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
>
> Have just ran the test, failed quickly, see below.
> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
>
> The important question is: do we need a quirk?
> Is this an inherent problem with this model? If so then a quirk is justified.
> Is this an inherent problem after this model had many writes? Again, a quirk is justified.
> Is this a fault with my specific disk? No quirk, I will keep my safe settings.
>
> Let me know if further testing, or more information, is required.
>
> Regards,
> Eyal
>
> 08:45:33 sudo sh -c 'echo 1024 >/sys/block/sda/device/timeout'
> 08:45:45 sudo sh -c 'echo 4096 > /sys/block/sda/queue/max_sectors_kb'
> 08:51:00 sudo /usr/local/bin/sync-tellerstats.sh # start a test
> Quickly got a few shorter timeouts then an indefinite one:
> 08:51:10 2025-12-22
> 08:51:10 Device wareq-sz w/s kB_w/s kB_w
> 08:51:00 sda 0.00 0.00 0.00 0.00
> 08:51:10 sda 2187.47 59.90 131029.45 1310294.53 start of test
> 08:51:20 sda 2041.32 75.90 154936.19 1549361.88
> 08:51:30 sda 1745.31 87.50 152714.62 1527146.25
> 08:51:40 sda 1601.22 92.50 148112.85 1481128.50
> 08:51:50 sda 1444.75 97.40 140718.65 1407186.50
> 08:52:00 sda 2097.17 37.90 79482.74 794827.43
> 08:52:10 sda 0.00 0.00 0.00 0.00 short pause
> 08:52:20 sda 0.00 0.00 0.00 0.00
> 08:52:30 sda 622.74 7.30 4546.00 45460.02
> 08:52:40 sda 0.00 0.00 0.00 0.00 short pause
> 08:52:50 sda 0.00 0.00 0.00 0.00
> 08:53:00 sda 0.00 0.00 0.00 0.00
> 08:53:10 sda 2083.45 35.00 72920.75 729207.50
> 08:53:20 sda 1285.02 107.00 137497.14 1374971.40
> 08:53:30 sda 1882.78 77.30 145538.89 1455388.94
> 08:53:40 sda 1839.43 14.70 27039.62 270396.21
> 08:53:50 sda 0.00 0.00 0.00 0.00 pause
> 08:54:00 sda 0.00 0.00 0.00 0.00
> 08:54:10 sda 0.00 0.00 0.00 0.00
> 08:54:20 sda 0.00 0.00 0.00 0.00 longer than 30s
> 08:54:30 sda 0.00 0.00 0.00 0.00
> ...
> 09:10:50 sda 0.00 0.00 0.00 0.00 17 minutes later...
> 09:11:00 sda 1648.50 52.40 86381.40 863814.00
> 09:11:10 sda 1240.31 111.80 138666.66 1386666.58
> 09:11:20 sda 2172.02 64.50 140095.29 1400952.90
> 09:11:30 sda 2592.31 47.90 124171.65 1241716.49
> 09:11:40 sda 2143.92 57.60 123489.79 1234897.92
> 09:11:50 sda 1604.22 89.40 143417.27 1434172.68
> 09:12:00 sda 1550.97 88.60 137415.94 1374159.42
> 09:12:00 2025-12-22
> 09:12:10 Device wareq-sz w/s kB_w/s kB_w
> 09:12:10 sda 1215.60 47.50 57741.00 577410.00 end of test
> 09:12:20 sda 0.00 0.00 0.00 0.00
> 09:12:30 sda 0.00 0.00 0.00 0.00
> -- system log:
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: failed command: WRITE DMA EXT
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: cmd 35/00:00:00:c8:d0/00:20:af:00:00/e0 tag 1 dma 4194304 out
> res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> 2025-12-22T09:10:52+11:00 kernel: ata2.00: status: { DRDY }
> 2025-12-22T09:10:52+11:00 kernel: ata2: hard resetting link
> 2025-12-22T09:10:53+11:00 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> 2025-12-22T09:10:53+11:00 kernel: ata2.00: configured for UDMA/133
> 2025-12-22T09:10:53+11:00 kernel: ata2: EH complete
> -- end of test:
> 09:13:37 sudo sh -c 'echo 3584 > /sys/block/sda/queue/max_sectors_kb'
> 09:13:56 sudo sh -c 'echo 120 >/sys/block/sda/device/timeout'
>
> smart does NOT show increased '188 Command_Timeout'.
Probably because the drive is stuck doing something. So it seems best to quirk
this drive. Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
commands max)? I suspect this should work. Otherwise, if NCQ also causes
issues, we can quirk both NCQ and max sectors for this drive.
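
The two units in play are easy to mix up: a "max sectors" limit counts 512-byte sectors, while the max_sectors_kb sysfs attribute is in KiB. A minimal conversion sketch (the function names are made up for illustration):

```sh
# Convert a limit given in 512-byte sectors to the KiB unit used by
# the max_sectors_kb sysfs attribute, and back.
sectors_to_kb() { echo $(( $1 * 512 / 1024 )); }
kb_to_sectors() { echo $(( $1 * 1024 / 512 )); }

sectors_to_kb 2048    # prints 1024
kb_to_sectors 4096    # prints 8192
```

So a max sectors of 2048 corresponds to max_sectors_kb=1024, and max_sectors_kb=4096 allows 8192-sector commands, which is 4194304 bytes, the size shown as "dma 4194304 out" in the failed WRITE DMA EXT above.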
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-12-21 23:14 ` Damien Le Moal
@ 2025-12-22 2:10 ` Eyal Lebedinsky
2025-12-22 3:43 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-22 2:10 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal
On 22/12/25 10:14, Damien Le Moal wrote:
> On 12/22/25 07:43, Eyal Lebedinsky wrote:
>> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>>> On 21/12/25 19:34, Damien Le Moal wrote:
>>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>>
>>>>> [trimmed]
>>>>>
>>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>>
>>>>>>
>>>>>> You can also try this:
>>>>>>
>>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>>
>>>>>> It tries these max_sector_kb values:
>>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>>
>>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>>
>>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>>> See comments after the test report.
>>>>
>>>> From your test results, it seems that the drive actually correctly handles very
>>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>>> can take a very long time to process (due to the drive needing internal garbage
>>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>>> assumes that the drive has stopped responding.
>>>>
>>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>>> though I do not think that gives any guarantees that the same issue will not
>>>> happen for small writes too.
>>>>
>>>> So my suggestion is that you run with something like "libata.force=[<port
>>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>>> but that would be a really more of a big hammer solution.
>>>
>>> I mostly agree. However, extending the timeout did not help in the past.
>>> I found that even setting it to 240 was not enough.
>>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>>
>>> ATM I have the setting of timeout=120 in rc.local.
>>>
>>> I also have in my boot cmd:
>>> libata.force=2.00:noncq
>>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>>> despite many timeout resets since.
>>>
>>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
>>
>> Have just ran the test, failed quickly, see below.
>> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
>>
>> The important question is: do we need a quirk?
>> Is this an inherent problem with this model? If so then a quirk is justified.
>> Is this an inherent problem after this model had many writes? Again, a quirk is justified.
>> Is this a fault with my specific disk? No quirk, I will keep my safe settings.
>>
>> Let me know if further testing, or more information, is required.
>>
>> Regards,
>> Eyal
[trimmed]
> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
> issues, we can quirk both NCQ and max sectors for this drive.
To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
"this should work" meaning "no trouble" or "will reproduce the problem"?
With ncq enabled, I had the same number of timeouts, the difference being that the disk
logged many errors (one for each tag) and also registered a Command_Timeout, which it did not otherwise.
Question: will setting
$ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
be the same as booting without
libata.force=2.00:noncq
or do I actually need to reboot?
Thanks
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-22 2:10 ` Eyal Lebedinsky
@ 2025-12-22 3:43 ` Damien Le Moal
2025-12-22 5:57 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2025-12-22 3:43 UTC (permalink / raw)
To: eyal, list linux-ide; +Cc: Niklas Cassel
On 12/22/25 11:10, Eyal Lebedinsky wrote:
> On 22/12/25 10:14, Damien Le Moal wrote:
>> On 12/22/25 07:43, Eyal Lebedinsky wrote:
>>> On 21/12/25 23:12, Eyal Lebedinsky wrote:
>>>> On 21/12/25 19:34, Damien Le Moal wrote:
>>>>> On 12/20/25 13:03, Eyal Lebedinsky wrote:
>>>>>> On 17/12/25 23:02, Niklas Cassel wrote:
>>>>>>> On 17 December 2025 20:56:07 GMT+09:00, Eyal Lebedinsky <eyal@eyal.emu.id.au> wrote:
>>>>>>
>>>>>> [trimmed]
>>>>>>
>>>>>>>> Do you want me to try different max_sectors_kb values to see where it breaks?
>>>>>>>>
>>>>>>>
>>>>>>> You can also try this:
>>>>>>>
>>>>>>> https://github.com/floatious/max-sectors-quirk
>>>>>>>
>>>>>>> It tries these max_sector_kb values:
>>>>>>> declare -a sizes=(128 1024 2048 3072 4095 4096)
>>>>>>>
>>>>>>> You can simply modify the script if you want to try more intermediate sizes.
>>>>>>
>>>>>> After testing the script on a sacrificial disk, I got brave and ran it on the offending disk.
>>>>>> See comments after the test report.
>>>>>
>>>>> From your test results, it seems that the drive actually correctly handles very
>>>>> large write commands, but because it is a drive-managed SMR disk, such commands
>>>>> can take a very long time to process (due to the drive needing internal garbage
>>>>> collection first), which triggers a timeout and a reset as the ata subsystem
>>>>> assumes that the drive has stopped responding.
>>>>>
>>>>> Limiting write commands to smaller sizes seems to mostly avoid this issue, even
>>>>> though I do not think that gives any guarantees that the same issue will not
>>>>> happen for small writes too.
>>>>>
>>>>> So my suggestion is that you run with something like "libata.force=[<port
>>>>> ID>:]max_sec=1024" for that drive. We can also add a permanent quirk for it too,
>>>>> but that would be a really more of a big hammer solution.
>>>>
>>>> I mostly agree. However, extending the timeout did not help in the past.
>>>> I found that even setting it to 240 was not enough.
>>>> If there is no response in 30s then none is coming. I never saw a pause longer than 30s that came back later.
>>>> The pause is either up to 30s or unlimited (until timeout is reached and a reset).
>>>>
>>>> ATM I have the setting of timeout=120 in rc.local.
>>>>
>>>> I also have in my boot cmd:
>>>> libata.force=2.00:noncq
>>>> This is because with ncq, if a pause reaches a timeout then the disk records a Command_Timeout.
>>>> In the smart log it is now up to 34360262682 (8,8,26) but stable since ncq was disabled (at 1/Nov)
>>>> despite many timeout resets since.
>>>>
>>>> I suspect a fw bug or a design fault(*), but just for fun I will run with max_sectors_kb=4096 and timeout=1024.
>>>
>>> Have just ran the test, failed quickly, see below.
>>> It shows a 17 minutes pause followed by a reset and a resumption and completion of the test.
>>>
>>> The important question is: do we need a quirk?
>>> Is this an inherent problem with this model? If so then a quirk is justified.
>>> Is this an inherent problem after this model had many writes? Again, a quirk is justified.
>>> Is this a fault with my specific disk? No quirk, I will keep my safe settings.
>>>
>>> Let me know if further testing, or more information, is required.
>>>
>>> Regards,
>>> Eyal
>
> [trimmed]
>
>> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
>> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
>> issues, we can quirk both NCQ and max sectors for this drive.
>
> To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
> "this should work" meaning "no trouble" or "will reproduce the problem"?
Yes, I meant to say that you should not see any timeout/long pause with NCQ for
a max command size of 1 MiB. The reason I say that is that most drive-managed SMR
implementations are based on some form of logging of random writes. Logging small
writes is fast and relatively easy to handle (even if the log is full, a small
portion of it can be recovered with just a few IOs). But if the writes are
large, things can get ugly as freeing up enough space in that log can be very
costly. That of course all depends on the vendor/model implementation of the
drive-managed SMR FW...
> With ncq enabled, I had the same numbers of timeouts, with the difference being that the disk
> logged many errors (one for each tag) and also registered a Command_Timeout which it did not otherwise.
My point was to try NCQ combined with a small max sectors quirk to see if that
works well or not.
>
> Question: will setting
> $ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
> be the same as booting without
> libata.force=2.00:noncq
> or do I actually need to reboot?
You will need to reboot without libata.force=noncq. Otherwise,
ata_dev_config_ncq() will always do nothing for the drive and NCQ will not be
seen as supported.
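
Whether the running kernel was booted with NCQ forced off can be checked before touching queue_depth. A minimal sketch, assuming the documented libata.force format of a comma-separated list of [ID:]VAL entries; cmdline_forces_noncq is a made-up name, and the check ignores which port or device the entry applies to:

```sh
# Succeed (exit 0) if the given kernel command line contains a
# libata.force= parameter with a "noncq" entry, with or without an
# "ID:" prefix. Pass "$(cat /proc/cmdline)" on a live system.
cmdline_forces_noncq() {
    for word in $1; do
        case $word in
            libata.force=*)
                list=${word#libata.force=}
                old_ifs=$IFS; IFS=,
                for entry in $list; do
                    # Drop an optional "ID:" prefix, keep the value.
                    case ${entry##*:} in
                        noncq) IFS=$old_ifs; return 0 ;;
                    esac
                done
                IFS=$old_ifs
                ;;
        esac
    done
    return 1
}

# Example:
#   cmdline_forces_noncq "$(cat /proc/cmdline)" && echo "noncq is forced; reboot without it"
```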
Once you confirm whether we really need to keep NCQ off with a small max
sectors limit, we can write a proper quirk for this drive.
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2025-12-22 3:43 ` Damien Le Moal
@ 2025-12-22 5:57 ` Eyal Lebedinsky
2025-12-30 22:43 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-22 5:57 UTC (permalink / raw)
To: Damien Le Moal, list linux-ide; +Cc: Niklas Cassel
On 22/12/25 14:43, Damien Le Moal wrote:
> On 12/22/25 11:10, Eyal Lebedinsky wrote:
>> On 22/12/25 10:14, Damien Le Moal wrote:
>>> Can you try with NCQ enabled and a max sectors of 2048 (1 MiB
>>> commands max) ? I suspect this should work. Otherwise, if NCQ also causes
>>> issues, we can quirk both NCQ and max sectors for this drive.
>>
>> To be clear, set "max_sectors_kb=1024"? This always worked, no pause (or timeout).
>> "this should work" meaning "no trouble" or "will reproduce the problem"?
This is how it is now set.
> Yes, I meant to say that you should not see any timeout/long pause with NCQ for
> a max command size of 1MiB. The reason I say that is that most drive managed SMR
> implementation are based on some form of logging of random writes. Logging small
> writes is fast and relatively easy to handle (even if the log is full, a small
> portion of it can be recovered with just a few IOs). But if the writes are
> large, things can get ugly as freeing up enough space in that log can be very
> costly. That of course all depend on the vendor/model implementation of device
> managed SMR FW...
>
>> With ncq enabled, I had the same numbers of timeouts, with the difference being that the disk
>> logged many errors (one for each tag) and also registered a Command_Timeout which it did not otherwise.
>
> My point was to try NCQ combined with a small max sectors quirk to see if that
> works well or not.
>
>>
>> Question: will setting
>> $ sudo sh -c 'echo 32 >/sys/block/sda/device/queue_depth'
>> be the same as booting without
>> libata.force=2.00:noncq
>> or do I actually need to reboot?
>
> You will need to reboot without libata.force=noncq. Otherwise,
> ata_dev_config_ncq() will always do nothing for the drive and NCQ will not be
> seen as supported.
I discovered that I need to reboot, which I did.
> Once you confirm if we really need to maintain NCQ off or not with a small max
> sectors limit, we can write a proper quirk for this drive.
I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
The recent run with these settings was smooth:
16:07:56 sda 0.00 0.00 0.00 0.00
16:08:06 sda 921.57 55.70 51331.45 513314.49
16:08:16 sda 972.67 130.90 127322.50 1273225.03
16:08:26 sda 952.22 141.00 134263.02 1342630.20
16:08:36 sda 968.10 128.50 124400.85 1244008.50
16:08:46 sda 916.85 144.60 132576.51 1325765.10
16:08:56 sda 978.13 128.20 125396.27 1253962.66
16:09:06 sda 937.33 139.00 130288.87 1302888.70
16:09:16 sda 857.16 136.10 116659.48 1166594.76
16:09:26 sda 971.57 123.40 119891.74 1198917.38
16:09:36 sda 969.15 136.80 132579.72 1325797.20
16:09:46 sda 969.75 105.40 102211.65 1022116.50
16:09:56 sda 914.09 129.30 118191.84 1181918.37
16:10:06 sda 931.91 123.50 115090.88 1150908.85
16:10:16 sda 947.24 123.30 116794.69 1167946.92
16:10:26 sda 921.27 136.50 125753.35 1257533.55
16:10:36 sda 964.06 128.30 123688.90 1236888.98
16:10:46 sda 955.16 137.30 131143.47 1311434.68
16:10:56 sda 954.25 104.50 99719.12 997191.25
16:11:06 sda 663.10 29.20 19362.52 193625.20
16:11:16 sda 0.00 0.00 0.00 0.00
16:11:26 sda 0.00 0.00 0.00 0.00
16:11:26 2025-12-22
16:11:36 Device wareq-sz w/s kB_w/s kB_w
BTW, I did a quick try with "max_sectors_kb=4096" and a fast dd:
$ dd if=/dev/zero of=/data2/tmp/zero.dd bs=4M count=$((21500/4)) status=progress
and it worked without an issue:
15:34:46 sda 0.00 0.00 0.00 0.00
15:34:56 sda 4096.00 15.10 61849.60 618496.00
15:35:06 sda 3959.49 26.90 106510.28 1065102.81
15:35:16 sda 4007.07 27.50 110194.43 1101944.25
15:35:26 sda 4020.03 26.80 107736.80 1077368.04
15:35:36 sda 4035.55 26.90 108556.29 1085562.95
15:35:46 sda 4032.27 25.60 103226.11 1032261.12
15:35:56 sda 4020.59 27.00 108555.93 1085559.30
15:36:06 sda 3974.66 26.90 106918.35 1069183.54
15:36:16 sda 4030.60 24.90 100361.94 1003619.40
15:36:26 sda 4004.95 26.90 107733.15 1077331.55
15:36:26 2025-12-22
15:36:36 Device wareq-sz w/s kB_w/s kB_w
15:36:36 sda 3991.85 27.40 109376.69 1093766.90
15:36:46 sda 4045.15 24.00 97083.60 970836.00
15:36:56 sda 4017.98 26.10 104869.28 1048692.78
15:37:06 sda 4017.81 26.10 104864.84 1048648.41
15:37:16 sda 4000.09 25.50 102002.29 1020022.95
15:37:26 sda 3990.31 27.00 107738.37 1077383.70
15:37:36 sda 4048.90 25.90 104866.51 1048665.10
15:37:46 sda 4003.07 26.30 105280.74 1052807.41
15:37:56 sda 4051.49 27.50 111415.97 1114159.75
15:38:06 sda 4033.31 26.00 104866.06 1048660.60
15:38:16 sda 4036.07 27.20 109781.10 1097811.04
15:38:26 sda 3340.74 5.40 18040.00 180399.96
15:38:36 sda 0.00 0.00 0.00 0.00
This is unlike my usual workload which writes about 250 files, some a few GB and some rather small.
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-22 5:57 ` Eyal Lebedinsky
@ 2025-12-30 22:43 ` Eyal Lebedinsky
2026-01-02 1:21 ` Damien Le Moal
0 siblings, 1 reply; 28+ messages in thread
From: Eyal Lebedinsky @ 2025-12-30 22:43 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal
On 22/12/25 16:57, Eyal Lebedinsky wrote:
> On 22/12/25 14:43, Damien Le Moal wrote:
[trimmed]
>> Once you confirm if we really need to maintain NCQ off or not with a small max
>> sectors limit, we can write a proper quirk for this drive.
>
> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
It is now "a few days" later (9 days). All is well and not a single pause observed.
My job (rsync'ing 21GB into this disk every 2 hours) reports:
max_sectors_kb=1024 timeout=120 queue_depth=32
I am keeping it with these parameters, but can try different values if it tells us anything.
Happy New Year Everyone,
Eyal
--
Eyal at Home (eyal@eyal.emu.id.au)
* Re: ata timeout exceptions
2025-12-30 22:43 ` Eyal Lebedinsky
@ 2026-01-02 1:21 ` Damien Le Moal
2026-01-02 6:30 ` Eyal Lebedinsky
0 siblings, 1 reply; 28+ messages in thread
From: Damien Le Moal @ 2026-01-02 1:21 UTC (permalink / raw)
To: eyal, list linux-ide; +Cc: Niklas Cassel
On 12/31/25 07:43, Eyal Lebedinsky wrote:
> On 22/12/25 16:57, Eyal Lebedinsky wrote:
>> On 22/12/25 14:43, Damien Le Moal wrote:
>
> [trimmed]
>
>>> Once you confirm if we really need to maintain NCQ off or not with a small max
>>> sectors limit, we can write a proper quirk for this drive.
>>
>> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
> It is now "a few days" later (9 days). All is well and not a single pause observed.
> My job (rsync'ing 21GB into this disk every 2 hours) reports:
> max_sectors_kb=1024 timeout=120 queue_depth=32
>
> I am keeping it with these parameters, but can try different values if it tells us anything.
If you are OK with keeping these as your default with this drive and setting
that through a udev rule or whatever else you prefer, then I think we are good.
We can make the max_sectors_kb=1024 permanent as a quirk though, but that would
be a little extreme since in the end, we are dealing with a very very slow drive
here, not really a buggy one.
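
A udev rule for this could look roughly like the output of the sketch below. It assumes matching on the ID_MODEL property, and MODEL_PLACEHOLDER is a stand-in, since the exact model string is not given here; make_disk_rule is a made-up helper that only prints the rule text:

```sh
# Print a udev rule that applies the limits when a matching disk appears.
# Arguments: model string to match (placeholder), max_sectors_kb, timeout.
make_disk_rule() {
    printf 'ACTION=="add|change", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", ENV{ID_MODEL}=="%s", ATTR{queue/max_sectors_kb}="%s", ATTR{device/timeout}="%s"\n' "$1" "$2" "$3"
}

# MODEL_PLACEHOLDER must be replaced with the ID_MODEL value reported by
# "udevadm info --query=property --name=/dev/sdX" for the actual drive.
make_disk_rule "MODEL_PLACEHOLDER" 1024 120
```

The printed line would go into a file under /etc/udev/rules.d/ that sorts after the rules which set ID_MODEL (the file name is up to the admin). queue_depth needs no entry as long as NCQ is left enabled.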
> Happy New Year Everyone,
Thanks. Happy new year to you too !
--
Damien Le Moal
Western Digital Research
* Re: ata timeout exceptions
2026-01-02 1:21 ` Damien Le Moal
@ 2026-01-02 6:30 ` Eyal Lebedinsky
0 siblings, 0 replies; 28+ messages in thread
From: Eyal Lebedinsky @ 2026-01-02 6:30 UTC (permalink / raw)
To: list linux-ide; +Cc: Niklas Cassel, Damien Le Moal
On 2/1/26 12:21, Damien Le Moal wrote:
> On 12/31/25 07:43, Eyal Lebedinsky wrote:
>> On 22/12/25 16:57, Eyal Lebedinsky wrote:
>>> On 22/12/25 14:43, Damien Le Moal wrote:
>>
>> [trimmed]
>>
>>>> Once you confirm if we really need to maintain NCQ off or not with a small max
>>>> sectors limit, we can write a proper quirk for this drive.
>>>
>>> I am leaving it this way. It runs the usual workload (rsync in) every 2 hours. I will report back in a few days.
>> It is now "a few days" later (9 days). All is well and not a single pause observed.
>> My job (rsync'ing 21GB into this disk every 2 hours) reports:
>> max_sectors_kb=1024 timeout=120 queue_depth=32
>>
>> I am keeping it with these parameters, but can try different values if it tells us anything.
>
> If you are OK with keeping these as your default with this drive and setting
> that through a udev rule or whatever else you prefer, then I think we are good.
> We can make the max_sectors_kb=1024 permanent as a quirk though, but that would
> be a little extreme since in the end, we are dealing with a very very slow drive
> here, not really a buggy one.
I agree. I will keep these settings here.
Thanks again for your effort,
Eyal
>> Happy New Year Everyone,
>
> Thanks. Happy new year to you too !
--
Eyal at Home (eyal@eyal.emu.id.au)