Why do Pending sectors disappear without writing to them?

public inbox for linux-ide@vger.kernel.org

* Why do Pending sectors disappear without writing to them?
@ 2026-01-10  0:35 Eyal Lebedinsky
  2026-01-12 14:05 ` Damien Le Moal
  0 siblings, 1 reply; 5+ messages in thread
From: Eyal Lebedinsky @ 2026-01-10  0:35 UTC
  To: list linux-ide

This happens with some regularity. This disk (/dev/sdf1) is part of a raid6. It seems to be unhealthy (another story).
What I saw, a few times recently, is a smart report like
	197 Current_Pending_Sector  -O--C-   100   100   000    -    8
	198 Offline_Uncorrectable   ----C-   100   100   000    -    8
and at the end of the smart report
	Pending Defects log (GP Log 0x0c)
	Index                LBA    Hours
	    0        22791960168    54593
	    1        22791960169    54593
	    2        22791960170    54593
	    3        22791960171    54593
	    4        22791960172    54593
	    5        22791960173    54593
	    6        22791960174    54593
	    7        22791960175    54593
This stays for some time. For example, today it appeared just after midnight
and was still there when I checked the logs in the morning.

I reacted by running
	$ sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
BTW, this is how I convert the disk sector number to the stripe number that raid6check takes:
	2048   (sdf1 start from fdisk)
	262144 'Data Offset : 262144 sectors' from 'mdadm --examine'
	1024   'Chunk Size : 512K' [1024s]    from 'mdadm --examine'
I expected an issue to be reported, but none was.
However ... the above 197/198 smart attributes went to zero, and the 'Pending Defects log' was cleared.
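
The conversion above can be sketched as a small shell function. This is only an illustration under the assumptions stated in the message: all values are in 512-byte sectors, 2048 is the partition start, 262144 is the md data offset and 1024 is the chunk size in sectors. The name lba_to_stripe is hypothetical and is not part of mdadm or raid6check.

```shell
#!/bin/sh
# Convert a whole-disk LBA into the index of the md chunk (stripe) that
# contains it on this member device. All arguments are in 512-byte sectors.
# lba_to_stripe is a hypothetical helper name used for illustration only.
lba_to_stripe() {
    lba=$1; part_start=$2; data_offset=$3; chunk_sectors=$4
    echo $(( (lba - part_start - data_offset) / chunk_sectors ))
}

# Values taken from the report above.
stripe=$(lba_to_stripe 22791960168 2048 262144 1024)
echo "stripe containing the LBA: $stripe"
echo "start argument used above: $((stripe - 1))"
```

The raid6check command above starts one stripe earlier than the computed one and checks two stripes, so the stripe holding the reported LBA is the second one examined.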

My question is: raid6check is running in read-only mode, yet the disk cleared the pending reports. Why?
I thought that you need to write to it for that.

Maybe the disk attempts to read the block (8 sectors) anyway and decides it is actually good?
	In other words: first read failure is logged as Pending, a following good read clears it?
	A failed write counts it as Reallocated_Sector_Ct.

Interestingly, there is no indication of the reason for the initial failure (197/198 0->8).
No i/o error. No md report. No program failure.
The first log is from a regular 30m smart check.

TIA

-- 
Eyal at Home (eyal@eyal.emu.id.au)



* Re: Why do Pending sectors disappear without writing to them?
  2026-01-10  0:35 Why do Pending sectors disappear without writing to them? Eyal Lebedinsky
@ 2026-01-12 14:05 ` Damien Le Moal
  2026-01-13  1:34   ` Eyal Lebedinsky
  0 siblings, 1 reply; 5+ messages in thread
From: Damien Le Moal @ 2026-01-12 14:05 UTC
  To: Eyal Lebedinsky (emu), list linux-ide

On 1/10/26 01:35, Eyal Lebedinsky wrote:
> This happens with some regularity. This disk (/dev/sdf1) is part of a raid6. It seems to be unhealthy (another story).
> What I saw, a few times recently, is a smart report like
> 	197 Current_Pending_Sector  -O--C-   100   100   000    -    8
> 	198 Offline_Uncorrectable   ----C-   100   100   000    -    8
> and at the end of the smart report
> 	Pending Defects log (GP Log 0x0c)
> 	Index                LBA    Hours
> 	    0        22791960168    54593
> 	    1        22791960169    54593
> 	    2        22791960170    54593
> 	    3        22791960171    54593
> 	    4        22791960172    54593
> 	    5        22791960173    54593
> 	    6        22791960174    54593
> 	    7        22791960175    54593
> This stays for some time. For example this morning it started just after midnight
> until I checked the logs this morning.
> 
> I reacted by running
> 	$ sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
> BTW, I convert sector number to fs block:
> 	2048   (sdf1 start from fdisk)
> 	262144 'Data Offset : 262144 sectors' from 'mdadm --examine'
> 	1024   'Chunk Size : 512K' [1024s]    from 'mdadm --examine'
> I expected an issue to be reported, but none were.
> However ... the above 197/198 smart attributes went to zero, and the 'Pending Defects log' was cleared.
> 
> My question is: raid6check is running in read-only mode, yet the disk cleared the pending reports. Why?
> I thought that you need to write to it for that.

Most likely, the drive was slow to remap the failed sectors, so they showed up
in the smart report. Your raid6check may have rewritten these, and this time the
drive remapped the sectors and the error went away. Probably, your remapped
sector count increased. Not sure if you have the old values.
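
One way to keep the old values for later comparison is to record the relevant counters at each check. The following is a rough sketch, not an established tool: smart_counters is a hypothetical name, and it assumes that the raw value is the last field of each attribute line, which holds for the lines quoted above but is not guaranteed for every attribute or drive.

```shell
#!/bin/sh
# Print "ID NAME RAW" for attributes 5 (Reallocated_Sector_Ct), 197 and 198
# from smartctl attribute lines read on stdin.
# Assumption: the raw value is the last whitespace-separated field.
smart_counters() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $1, $2, $NF }'
}

# Sample input copied from the report quoted above. On a live system the
# input would come from something like: smartctl -A /dev/sdf
sample='197 Current_Pending_Sector  -O--C-   100   100   000    -    8
198 Offline_Uncorrectable   ----C-   100   100   000    -    8'

printf '%s\n' "$sample" | smart_counters
```

Appending this output to a file together with a timestamp at every check would show whether the reallocated count moves when the pending count clears.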

> Maybe the disk attempts to read the block (8 sectors) anyway and decides it is actually good?
> 	In other words: first read failure is logged as Pending, a following good read clears it?
> 	A failed write counts it as Reallocated_Sector_Ct.
> 
> Interestingly, there is no indication of the reason for the initial failure (197/198 0->8).
> No i/o error. No md report. No program failure.
> The first log is from a regular 30m smart check.

That is odd. I would have expected a failed write. But that said, these days, if
you get a failed write from a disk, you can pretty much consider the disk dead
since bad sector remapping inside the disk is automatic and you'll get a failed
write only if that fails.

> 
> TIA
> 

-- 
Damien Le Moal
Western Digital Research


* Re: Why do Pending sectors disappear without writing to them?
  2026-01-12 14:05 ` Damien Le Moal
@ 2026-01-13  1:34   ` Eyal Lebedinsky
  2026-01-13  7:02     ` Damien Le Moal
  0 siblings, 1 reply; 5+ messages in thread
From: Eyal Lebedinsky @ 2026-01-13  1:34 UTC
  To: list linux-ide; +Cc: Damien Le Moal

On 13/1/26 01:05, Damien Le Moal wrote:
> On 1/10/26 01:35, Eyal Lebedinsky wrote:
>> This happens with some regularity. This disk (/dev/sdf1) is part of a raid6. It seems to be unhealthy (another story).
>> What I saw, a few times recently, is a smart report like
>> 	197 Current_Pending_Sector  -O--C-   100   100   000    -    8
>> 	198 Offline_Uncorrectable   ----C-   100   100   000    -    8
>> and at the end of the smart report
>> 	Pending Defects log (GP Log 0x0c)
>> 	Index                LBA    Hours
>> 	    0        22791960168    54593
>> 	    1        22791960169    54593
>> 	    2        22791960170    54593
>> 	    3        22791960171    54593
>> 	    4        22791960172    54593
>> 	    5        22791960173    54593
>> 	    6        22791960174    54593
>> 	    7        22791960175    54593
>> This stays for some time. For example this morning it started just after midnight
>> until I checked the logs this morning.
>>
>> I reacted by running
>> 	$ sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
>> BTW, I convert sector number to fs block:
>> 	2048   (sdf1 start from fdisk)
>> 	262144 'Data Offset : 262144 sectors' from 'mdadm --examine'
>> 	1024   'Chunk Size : 512K' [1024s]    from 'mdadm --examine'
>> I expected an issue to be reported, but none were.
>> However ... the above 197/198 smart attributes went to zero, and the 'Pending Defects log' was cleared.
>>
>> My question is: raid6check is running in read-only mode, yet the disk cleared the pending reports. Why?
>> I thought that you need to write to it for that.
> 
> Most likely, the drive was slow to remap the failed sectors, so they showed up
> in the smart report. Your raid6check may have rewritten these, and this time the
> drive remapped the sectors and the error went away. Probably, your remapped
> sector count increased. Not sure if you have the old values.

Here is a recent example of the situation:

... gone to bed
Jan 10 00:44:24 smartd[1254]: Device: /dev/sdf [SAT], 8 Currently unreadable (pending) sectors
	Current_Pending_Sector	0->8
	Reallocated_Sector_Ct	648 -> 656
... reports continue every 30m until the morning
... started my day
... noticed a smart report
Jan 10 08:44:30 smartd[1254]: Device: /dev/sdf [SAT], 8 Currently unreadable (pending) sectors
        08:45:24 ran: smartctl and the errors are still showing.
        08:55:49 ran: sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
		22791960168 is first LBA in list of 8 consecutive "Pending Defects", so one fs block.
        08:56:22 ran: smartctl and all errors are gone.
Jan 10 09:14:25 smartd[1254]: Device: /dev/sdf [SAT], No more Currently unreadable (pending) sectors, warning condition reset after 1 email

BTW, there was no increase in Reallocated_Sector_Ct between the 00:44 report and the clearing.
It seems that Reallocated goes up front, at the same time that Uncorrectable goes up and the defects are listed.
Uncorrectable and the defects list are then cleared together (by raid6check) without any further Reallocated being counted.
And I did not do any writes that I know of.

So two things remain unexplained:
	1 No errors in the system log when the sectors failed and were reported by smartctl.
	2 raid6check in read-only mode clears the errors.
I guess my understanding and Seagate's are different.

>> Maybe the disk attempts to read the block (8 sectors) anyway and decides it is actually good?
>> 	In other words: first read failure is logged as Pending, a following good read clears it?
>> 	A failed write counts it as Reallocated_Sector_Ct.
>>
>> Interestingly, there is no indication of the reason for the initial failure (197/198 0->8).
>> No i/o error. No md report. No program failure.
>> The first log is from a regular 30m smart check.
> 
> That is odd. I would have expected a failed write. But that said, these days, if
> you get a failed write from a disk, you can pretty much consider the disk dead
> since bad sector remapping inside the disk is automatic and you'll get a failed
> write only if that fails.

Agreed, this disk is on its last legs (since Nov/25), a spare is available.

>> TIA
>>
> 
> 


-- 
Eyal at Home (eyal@eyal.emu.id.au)


* Re: Why do Pending sectors disappear without writing to them?
  2026-01-13  1:34   ` Eyal Lebedinsky
@ 2026-01-13  7:02     ` Damien Le Moal
  2026-01-13  7:34       ` Eyal Lebedinsky
  0 siblings, 1 reply; 5+ messages in thread
From: Damien Le Moal @ 2026-01-13  7:02 UTC
  To: eyal, list linux-ide

On 2026/01/13 2:34, Eyal Lebedinsky wrote:
> On 13/1/26 01:05, Damien Le Moal wrote:
>> On 1/10/26 01:35, Eyal Lebedinsky wrote:
>>> This happens with some regularity. This disk (/dev/sdf1) is part of a raid6. It seems to be unhealthy (another story).
>>> What I saw, a few times recently, is a smart report like
>>> 	197 Current_Pending_Sector  -O--C-   100   100   000    -    8
>>> 	198 Offline_Uncorrectable   ----C-   100   100   000    -    8
>>> and at the end of the smart report
>>> 	Pending Defects log (GP Log 0x0c)
>>> 	Index                LBA    Hours
>>> 	    0        22791960168    54593
>>> 	    1        22791960169    54593
>>> 	    2        22791960170    54593
>>> 	    3        22791960171    54593
>>> 	    4        22791960172    54593
>>> 	    5        22791960173    54593
>>> 	    6        22791960174    54593
>>> 	    7        22791960175    54593
>>> This stays for some time. For example this morning it started just after midnight
>>> until I checked the logs this morning.
>>>
>>> I reacted by running
>>> 	$ sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
>>> BTW, I convert sector number to fs block:
>>> 	2048   (sdf1 start from fdisk)
>>> 	262144 'Data Offset : 262144 sectors' from 'mdadm --examine'
>>> 	1024   'Chunk Size : 512K' [1024s]    from 'mdadm --examine'
>>> I expected an issue to be reported, but none were.
>>> However ... the above 197/198 smart attributes went to zero, and the 'Pending Defects log' was cleared.
>>>
>>> My question is: raid6check is running in read-only mode, yet the disk cleared the pending reports. Why?
>>> I thought that you need to write to it for that.
>>
>> Most likely, the drive was slow to remap the failed sectors, so they showed up
>> in the smart report. Your raid6check may have rewritten these, and this time the
>> drive remapped the sectors and the error went away. Probably, your remapped
>> sector count increased. Not sure if you have the old values.
> 
> Here is a recent example of the situation:
> 
> ... gone to bed
> Jan 10 00:44:24 smartd[1254]: Device: /dev/sdf [SAT], 8 Currently unreadable (pending) sectors
> 	Current_Pending_Sector	0->8
> 	Reallocated_Sector_Ct	648 -> 656
> ... reports continue every 30m until the morning
> ... started my day
> ... noticed a smart report
> Jan 10 08:44:30 smartd[1254]: Device: /dev/sdf [SAT], 8 Currently unreadable (pending) sectors
>         08:45:24 ran: smartctl and the errors are still showing.
>         08:55:49 ran: sudo raid6check /dev/md127 $(((22791960168-2048-262144)/1024-1)) 2
> 		22791960168 is first LBA in list of 8 consecutive "Pending Defects", so one fs block.
>         08:56:22 ran: smartctl and all errors are gone.
> Jan 10 09:14:25 smartd[1254]: Device: /dev/sdf [SAT], No more Currently unreadable (pending) sectors, warning condition reset after 1 email
> 
> BTW, there was no increase in Reallocated_Sector_Ct during this time.
> It seems that the Reallocated goes up at the same time Uncorrectable goes up and defects are listed. Upfront.
> Uncorrectable and defects list are cleared at the same time (with raid6check) without more Reallocated counted.
> And I did not do any writes that I know of.
> 
> So two unexplained:
> 	1 No errors in system log when the sectors failed, and reported by smartctl.
> 	2 raid6check in read-only mode clears the errors.
> I guess my understanding and Seagate's are different.

It may be that you do have bad sectors which take a lot of time to be read as
they need deep error correction (HDDs have different ways of reading sectors
that get slower and slower if the sector is not easy to read). So reading these
sectors is slow and smartctl reports that, but raid6check finally succeeds in
getting the data and the temporary error clears.

Guessing here. This would need a drive log analysis to fully understand this,
but only the disk vendor can do that.

You could try rewriting these bad sectors (copy them over themselves) to see if
this pattern goes away.
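
A dry-run sketch of such an in-place rewrite follows. It only prints a dd command and does not run it. The assumptions are mine and should be checked first: 512-byte logical sectors, 4096-byte physical sectors (so the 8 reported LBAs form one physical block), and a device that is safe to write to directly, which is questionable for a member of a running array. rewrite_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print (do not execute) a dd command that reads a range of
# 512-byte LBAs and writes the same data back in place, using 4096-byte
# blocks. rewrite_cmd is a hypothetical helper name.
# Assumption: the range is aligned to 8 logical sectors (one 4 KiB block).
rewrite_cmd() {
    dev=$1; first_lba=$2; nsectors=$3
    if [ $((first_lba % 8)) -ne 0 ] || [ $((nsectors % 8)) -ne 0 ]; then
        echo "range is not 4 KiB aligned" >&2
        return 1
    fi
    blk=$((first_lba / 8))
    echo "dd if=$dev of=$dev bs=4096 skip=$blk seek=$blk count=$((nsectors / 8)) iflag=direct oflag=direct conv=notrunc"
}

# The 8 consecutive LBAs from the Pending Defects log quoted above.
rewrite_cmd /dev/sdf 22791960168 8
```

If the read half of such a command fails, nothing is written, and the data would have to be rebuilt from the rest of the array instead.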

-- 
Damien Le Moal
Western Digital Research


* Re: Why do Pending sectors disappear without writing to them?
  2026-01-13  7:02     ` Damien Le Moal
@ 2026-01-13  7:34       ` Eyal Lebedinsky
  0 siblings, 0 replies; 5+ messages in thread
From: Eyal Lebedinsky @ 2026-01-13  7:34 UTC
  To: Damien Le Moal, list linux-ide

On 13/1/26 18:02, Damien Le Moal wrote:

> You could try rewriting these bad sectors (copy them over themselves) to see if
> this pattern goes away.

Damien,

Unfortunately only a very small number is ever reported (maybe 40) out of 656 counted in
Reallocated_Sector_Ct. The same sector was never reported twice.

I will let the disk limp with palliative care until the suffering is unbearable.

Thanks for your advice.

Eyal

-- 
Eyal at Home (eyal@eyal.emu.id.au)

