linux-raid.vger.kernel.org archive mirror
* Impending failure?
@ 2011-11-04 13:53 Alex
  2011-11-04 13:57 ` Mathias Burén
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Alex @ 2011-11-04 13:53 UTC (permalink / raw)
  To: linux-raid

Hi,
I have a fedora15 system with two 80GB SATA disks using RAID1 and have
the following messages in syslog:

Nov  4 07:43:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Offline
uncorrectable sectors
Nov  4 08:13:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Currently
unreadable (pending) sectors

"smartctl --all /dev/sda" does show that errors did occur at some
point in the past, but it doesn't seem to be affecting the RAID:

# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[0] sdb2[1]
      74750908 blocks super 1.1 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active raid1 sda1[0] sdb1[1]
      511988 blocks super 1.0 [2/2] [UU]

unused devices: <none>

Perhaps the bad sectors haven't been accessed, which is why the RAID
still appears to be intact?

Do I need to verify the RAID integrity in some way? Force a rebuild?

I believe the boot sector is installed on sda, which is also the bad
disk. If I remove the disk to replace it, I'm concerned the system
will no longer boot.

Can you point me to instructions on the best way to replace a disk?

Thanks,
Alex


* Re: Impending failure?
  2011-11-04 13:53 Impending failure? Alex
@ 2011-11-04 13:57 ` Mathias Burén
       [not found]   ` <CALJXSJouyQxMcV_CKGUiU-QkzDor2FTW25uGFyy-jDLnppnoAg@mail.gmail.com>
  2011-11-04 14:33 ` Mikael Abrahamsson
  2011-11-04 18:05 ` John Robinson
  2 siblings, 1 reply; 12+ messages in thread
From: Mathias Burén @ 2011-11-04 13:57 UTC (permalink / raw)
  To: Alex; +Cc: linux-raid

On 4 November 2011 13:53, Alex <mysqlstudent@gmail.com> wrote:
> Hi,
> I have a fedora15 system with two 80GB SATA disks using RAID1 and have
> the following messages in syslog:
>
> Nov  4 07:43:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Offline
> uncorrectable sectors
> Nov  4 08:13:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Currently
> unreadable (pending) sectors
>
> "smartctl --all /dev/sda" does show that errors did occur at some
> point in the past, but it doesn't seem to be affecting the RAID:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 sda2[0] sdb2[1]
>      74750908 blocks super 1.1 [2/2] [UU]
>      bitmap: 1/1 pages [4KB], 65536KB chunk
>
> md0 : active raid1 sda1[0] sdb1[1]
>      511988 blocks super 1.0 [2/2] [UU]
>
> unused devices: <none>
>
> Perhaps the bad sectors haven't been accessed, which is why the RAID
> still appears to be intact?
>
> Do I need to verify the RAID integrity in some way? Force a rebuild?
>
> I believe the boot sector is installed on sda, which is also the bad
> disk. If I remove the disk to replace it, I'm concerned the system
> will no longer boot.
>
> Can you point me to instructions on the best way to replace a disk?
>
> Thanks,
> Alex

Hi,

Basically, if uncorrectable sectors > 0 or pending sectors > 0, the drive
is failing. So replace it ASAP :)
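
If you want to keep an eye on just those counters, something like this
should do it (assuming smartctl from smartmontools is installed; attribute
names can differ slightly between drive vendors):

smartctl -A /dev/sda | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'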

Regards,
Mathias


* Re: Impending failure?
  2011-11-04 13:53 Impending failure? Alex
  2011-11-04 13:57 ` Mathias Burén
@ 2011-11-04 14:33 ` Mikael Abrahamsson
  2011-11-04 15:31   ` Alex
  2011-11-04 18:05 ` John Robinson
  2 siblings, 1 reply; 12+ messages in thread
From: Mikael Abrahamsson @ 2011-11-04 14:33 UTC (permalink / raw)
  To: Alex; +Cc: linux-raid


On Fri, 4 Nov 2011, Alex wrote:

> Can you point me to instructions on the best way to replace a disk?

First run "repair" on the array; hopefully it'll notice the unreadable
blocks and re-write them.

echo repair >> /sys/block/md0/md/sync_action

Also make sure your OS does regular scrubs of the RAID. Usually this is
done by monthly runs of checkarray; here is an example from Ubuntu:

:/etc/cron.d$ cat mdadm
#
# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
#
# Copyright © martin f. krafft <madduck@madduck.net>
# distributed under the terms of the Artistic Licence 2.0
#

# By default, run at 00:57 on every Sunday, but do nothing unless the day of
# the month is less than or equal to 7. Thus, only run on the first Sunday of
# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
# hack (see #380425).
57 0 * * 0 root if [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ]; then /usr/share/mdadm/checkarray --cron --all --idle --quiet; fi
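
If your distro doesn't ship checkarray (the original poster is on Fedora), a
rough equivalent is a cron entry that writes "check" to each array's
sync_action. A minimal sketch for /etc/cron.d, assuming the arrays are md0
and md1 as in the original post:

57 0 * * 0 root [ $(date +\%d) -le 7 ] && for md in md0 md1; do echo check > /sys/block/$md/md/sync_action; done

Recent Fedora/RHEL also ship a raid-check cron job that does this, as
mentioned later in the thread.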

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Impending failure?
       [not found]   ` <CALJXSJouyQxMcV_CKGUiU-QkzDor2FTW25uGFyy-jDLnppnoAg@mail.gmail.com>
@ 2011-11-04 14:45     ` Jérôme Poulin
  2011-11-04 14:59       ` Mikael Abrahamsson
  2011-11-04 15:09       ` Thomas Fjellstrom
  0 siblings, 2 replies; 12+ messages in thread
From: Jérôme Poulin @ 2011-11-04 14:45 UTC (permalink / raw)
  To: linux-raid

On Fri, Nov 4, 2011 at 9:57 AM, Mathias Burén <mathias.buren@gmail.com> wrote:
>
> Basically, if uncorrectable sectors >0 or pending sectors >0 the drive
> is failing. So replace ASAP :)

Exactly. Newer drives try to hide errors as much as they can, so as soon as
SMART shows anything suspect (READ DMA failures, pending sectors, offline
uncorrectable sectors, or any failed self-tests), the disk is probably going
to die soon.


* Re: Impending failure?
  2011-11-04 14:45     ` Jérôme Poulin
@ 2011-11-04 14:59       ` Mikael Abrahamsson
  2011-11-04 15:09       ` Thomas Fjellstrom
  1 sibling, 0 replies; 12+ messages in thread
From: Mikael Abrahamsson @ 2011-11-04 14:59 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: linux-raid


On Fri, 4 Nov 2011, Jérôme Poulin wrote:

> Exactly, with the new drives, they try to hide errors as much as they
> can, as soon as SMART shows anything suspect, READ DMA failures,
> pending sectors, offline uncorrectable sectors, or any tests fail, the
> disk is going to die soon.

Pending sectors (read errors) are not as bad as equivalent write errors.

A 2TB drive's specified bit error rate means it is still within spec if it
throws you a read error once for every 6 or so times you read all the data
from it.

So saying read errors and pending sectors are a strong indication of 
impending doom is an exaggeration.
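
For a rough sanity check on that figure: consumer drives are commonly
specified at one unrecoverable read error per 10^14 bits read, and 2 TB is
roughly 1.6 x 10^13 bits, so 10^14 / (1.6 x 10^13) is about 6 full-disk
reads per expected error, which is where the "6 times" number comes from.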

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: Impending failure?
  2011-11-04 14:45     ` Jérôme Poulin
  2011-11-04 14:59       ` Mikael Abrahamsson
@ 2011-11-04 15:09       ` Thomas Fjellstrom
  1 sibling, 0 replies; 12+ messages in thread
From: Thomas Fjellstrom @ 2011-11-04 15:09 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: linux-raid

On November 4, 2011, Jérôme Poulin wrote:
> On Fri, Nov 4, 2011 at 9:57 AM, Mathias Burén <mathias.buren@gmail.com> wrote:
> > Basically, if uncorrectable sectors >0 or pending sectors >0 the drive
> > is failing. So replace ASAP :)
> 
> Exactly, with the new drives, they try to hide errors as much as they
> can, as soon as SMART shows anything suspect, READ DMA failures,
> pending sectors, offline uncorrectable sectors, or any tests fail, the
> disk is going to die soon.

Maybe I was just lucky, but I had an uncorrectable sector error at one point, 
ran a dd write over the whole drive, and SMART reported the sectors were 
corrected. It wouldn't fix them with a read, but a full write worked. The 
drive kept working for quite some time after that. I might still have it, but 
I can't remember which drive it was (I've had a lot).
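
For anyone tempted to repeat that, a sketch of the same idea (this overwrites
every sector, so only do it to a drive that has already been failed out of
and removed from the array; /dev/sdX is a placeholder for the real device):

dd if=/dev/zero of=/dev/sdX bs=1M
smartctl -A /dev/sdX    # pending/uncorrectable counts should drop if the rewrites took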



-- 
Thomas Fjellstrom
thomas@fjellstrom.ca


* Re: Impending failure?
  2011-11-04 14:33 ` Mikael Abrahamsson
@ 2011-11-04 15:31   ` Alex
  2011-11-04 15:43     ` Mathias Burén
  0 siblings, 1 reply; 12+ messages in thread
From: Alex @ 2011-11-04 15:31 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

Hi,

>> Can you point me to instructions on the best way to replace a disk?
>
> First run "repair" on the array, hopefully it'll notice the unreadable
> blocks and re-write them.
>
> echo repair >> /sys/block/md0/md/sync_action
>
> Also make sure your OS does regular scrubs of the raid, usually this is done
> by monthly runs of checkarray, this is an example from Ubuntu:

Great, thanks. I recalled something like that, but couldn't remember exactly.

The system passed the repair above on both arrays, but I'm
obviously still concerned about the disk. Here are the relevant
smartctl lines:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   089   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       29
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       209739855
  9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       22816
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       37
187 Reported_Uncorrect      0x0032   095   095   000    Old_age   Always       -       5
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   064   045    Old_age   Always       -       25 (Min/Max 23/32)
194 Temperature_Celsius     0x0022   025   040   000    Old_age   Always       -       25 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   057   045   000    Old_age   Always       -       51009302
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

Pending_sector and uncorrectable are both greater than zero. Is this
drive on its way to failure?

Can someone point me to the proper mdadm commands to set the drive
faulty then rebuild it after installing the new one?

Thanks again,
Alex


* Re: Impending failure?
  2011-11-04 15:31   ` Alex
@ 2011-11-04 15:43     ` Mathias Burén
  2011-11-04 17:42       ` Peter Zieba
  0 siblings, 1 reply; 12+ messages in thread
From: Mathias Burén @ 2011-11-04 15:43 UTC (permalink / raw)
  To: Alex; +Cc: Mikael Abrahamsson, linux-raid

On 4 November 2011 15:31, Alex <mysqlstudent@gmail.com> wrote:
> Hi,
>
>>> Can you point me to instructions on the best way to replace a disk?
>>
>> First run "repair" on the array, hopefully it'll notice the unreadable
>> blocks and re-write them.
>>
>> echo repair >> /sys/block/md0/md/sync_action
>>
>> Also make sure your OS does regular scrubs of the raid, usually this is done
>> by monthly runs of checkarray, this is an example from Ubuntu:
>
> Great, thanks. I recalled something like that, but couldn't remember exactly.
>
> The system passed the above rebuild test on both arrays, but I'm
> obviously still concerned about the disk. Here are the relevant
> smartctl lines:
>
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   108   089   006    Pre-fail  Always       -       0
>   3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       29
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       209739855
>   9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       22816
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       37
> 187 Reported_Uncorrect      0x0032   095   095   000    Old_age   Always       -       5
> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0022   075   064   045    Old_age   Always       -       25 (Min/Max 23/32)
> 194 Temperature_Celsius     0x0022   025   040   000    Old_age   Always       -       25 (0 18 0 0)
> 195 Hardware_ECC_Recovered  0x001a   057   045   000    Old_age   Always       -       51009302
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
> 202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
>
> Pending_sector and uncorrectable are both greater than zero. Is this
> drive on its way to failure?
>
> Can someone point me to the proper mdadm commands to set the drive
> faulty then rebuild it after installing the new one?
>
> Thanks again,
> Alex

187 Reported_Uncorrect      0x0032   095   095   000    Old_age   Always       -       5
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2

Those numbers tell me to get rid of the drive. I don't know the mdadm
commands off the top of my head, sorry, but they're in the man page(s). If
you want, you can run a scrub and see if these numbers change. If the drive
fails hard enough then md will kick it out of the array anyway. Btw, I scrub
my RAID6 (7 HDDs) once a week.
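
A concrete way to do that, using the arrays from the original post (md0 and
md1):

echo check > /sys/block/md0/md/sync_action
echo check > /sys/block/md1/md/sync_action
watch cat /proc/mdstat     # follow the scrub's progress

then re-run smartctl afterwards and compare the Current_Pending_Sector /
Offline_Uncorrectable counts.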

/Mathias


* Re: Impending failure?
  2011-11-04 15:43     ` Mathias Burén
@ 2011-11-04 17:42       ` Peter Zieba
  2011-11-07 21:17         ` Alex
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zieba @ 2011-11-04 17:42 UTC (permalink / raw)
  To: Mathias Burén; +Cc: Mikael Abrahamsson, linux-raid, Alex

So, in my personal experience with pending sectors, it's worth mentioning the following:

If you do a "check", and you have any pending sectors that are within the partition that is used for the md device, they should be read and rewritten as needed, causing the count to go down. However, I've noticed that sometimes I have pending sector counts on drives that don't go away after a "check". These would go away, however, if I failed and then removed the drive with mdadm, and then subsequently zero filled the /entire/ drive (as opposed to just the partition on that disk that is used by the array). The reason for this is that there's a small chunk of unused space that never gets read or written to right after the partition (even though I technically partition the entire drive as one large partition (fd  Linux raid auto).

I think what actually happens in this case is that when the system reads data from near the end of the array, the drive itself will do read-ahead and cache it. So, even though the computer never requested those abandoned sectors, the drive eventually notices that it can't read them, and makes a note of the fact. So, this is harmless.

You could probably avoid the potential for false-positive on pending sectors if you used the entire disk for the array (no partitions), but I'm pretty sure that breaks the raid auto-detection.

Currently, my main array has eight 2TB Hitachi disks in a RAID6. It is scrubbed once a week, and one disk consistently has 8 pending sectors on it. I'm certain I could make those go away if I wanted, but, frankly, it's purely aesthetic as far as I'm concerned. Some of my drives also have a non-zero "196 Reallocated_Event_Count" and "5 Reallocated_Sector_Ct"; however, I have no drives with a non-zero "Offline_Uncorrectable". I haven't had any problems with the disks or array (other than a temperature-induced failure ... but that's another story, and I still run the same disks after that event). I used to have lots of issues before I started scrubbing consistently.

Peter



* Re: Impending failure?
  2011-11-04 13:53 Impending failure? Alex
  2011-11-04 13:57 ` Mathias Burén
  2011-11-04 14:33 ` Mikael Abrahamsson
@ 2011-11-04 18:05 ` John Robinson
  2 siblings, 0 replies; 12+ messages in thread
From: John Robinson @ 2011-11-04 18:05 UTC (permalink / raw)
  To: Alex; +Cc: linux-raid

On 04/11/2011 13:53, Alex wrote:
> Hi,
> I have a fedora15 system with two 80GB SATA disks using RAID1
[...]
> I believe the boot sector is installed on sda, which is also the bad
> disk. If I remove the disk to replace it, I'm concerned the system
> will no longer boot.
>
> Can you point me to instructions on the best way to replace a disk?

If you used Fedora's installer to set up the md RAIDs in the first 
place, you will be able to boot off the second drive, as the Fedora 
installer will have installed grub on it as well. You would have to tell 
your BIOS to boot from it though.
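
If you want to double-check that before pulling anything, a crude test is to
look for boot code in the second drive's MBR; GRUB legacy (which Fedora 15
still uses) embeds its name there, so something like this should print GRUB
if it was installed:

# dd if=/dev/sdb bs=512 count=1 2>/dev/null | strings | grep GRUB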

If you've room in the case and a spare SATA port for a third drive (with 
which to replace the failing drive), put it in. Assuming it appears as sdc, 
I'd do the switcheroo by growing the arrays to three drives and then 
shrinking them again, something like this (all off the top of my head, so 
check it first against the man page, which will help you understand what's 
going on anyway):

# sfdisk -d /dev/sda
Note down where sda1 starts. It will likely be either 63 or 2048.
# dd if=/dev/sda of=/dev/sdc bs=512 count=<howevermanyitwas>
This will copy the partition table and boot code.
# blockdev --rereadpt /dev/sdc
# mdadm /dev/md0 --add /dev/sdc1
# mdadm /dev/md1 --add /dev/sdc2
# mdadm --grow /dev/md0 -n3
# mdadm --grow /dev/md1 -n3
Wait for this to finish, either by looking at /proc/mdstat from time to 
time, or using mdadm --wait /dev/md1. This gives you a three-way mirror. 
It's possible this process will cause sda to fail, but that's OK. Next 
we want to remove the duff drive:
# mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
# mdadm /dev/md1 --fail /dev/sda2 --remove /dev/sda2
# mdadm --grow /dev/md0 -n2
# mdadm --grow /dev/md1 -n2

Now you can shut down again, and install the new drive in the place of 
the original failing drive.
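
If in any doubt about the boot code that dd copied over, it may also be worth
re-running the boot loader install on the replacement drive before shutting
down; with Fedora 15's GRUB legacy that would be roughly:

# grub-install /dev/sdc
(or grub-install --recheck /dev/sdc; adjust the device name to whatever the
new drive really is).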

Cheers,

John.



* Re: Impending failure?
  2011-11-04 17:42       ` Peter Zieba
@ 2011-11-07 21:17         ` Alex
  2011-11-08  0:04           ` Peter Zieba
  0 siblings, 1 reply; 12+ messages in thread
From: Alex @ 2011-11-07 21:17 UTC (permalink / raw)
  To: Peter Zieba; +Cc: Mathias Burén, Mikael Abrahamsson, linux-raid

Hi guys,

> So, in my personal experience with pending sectors, it's worth mentioning the following:
>
> If you do a "check", and you have any pending sectors that are within the partition that is used for the md device, they should be read and rewritten as needed, causing the count to go down. However, I've noticed that sometimes I have pending sector counts on drives that don't go away after a "check". These would go away, however, if I failed and then removed the drive with mdadm, and then subsequently zero filled the /entire/ drive (as opposed to just the partition on that disk that is used by the array). The reason for this is that there's a small chunk of unused space that never gets read or written to right after the partition (even though I technically partition the entire drive as one large partition (fd  Linux raid auto).
>
> I think what actually happens in this case is that when the system reads data from near the end of the array, the drive itself will do read-ahead and cache it. So, even though the computer never requested those abandoned sectors, the drive eventually notices that it can't read them, and makes a note of the fact. So, this is harmless.
>
> You could probably avoid the potential for false-positive on pending sectors if you used the entire disk for the array (no partitions), but I'm pretty sure that breaks the raid auto-detection.
>
> Currently, my main array has 8 2TB hitachi disks, in a raid 6. It is scrubbed once a week, and one disk consistently has 8 pending sectors on it. I'm certain I could make those go away if I wanted, but, frankly, it's purely aesthetic as far as I'm concerned. Some of my drives also have non-zero "196 Reallocated_Event_Count" and "5 Reallocated_Sector_Ct", however, I have no drives with non-zero "Offline_Uncorrectable". I haven't had any problems with the disks or array (other than a temperature induced failure ... but that's another story, and I still run the same disks after that event). I used to have lots of issues before I started scrubbing consistently.

I think I understand your explanation. You are basically saying that
if I recheck the drive, there's a possibility the pending defective
sectors may resolve themselves?

Given that I have an existing system, how do I check the integrity of
the partitions? What are the contents of the "check" script to which
you refer?

Is this safe to do remotely?

Is it necessary to set a disk faulty before removing it?

Thanks,
Alex


* Re: Impending failure?
  2011-11-07 21:17         ` Alex
@ 2011-11-08  0:04           ` Peter Zieba
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Zieba @ 2011-11-08  0:04 UTC (permalink / raw)
  To: Alex; +Cc: Mathias Burén, Mikael Abrahamsson, linux-raid

OK, so first I'll explain what a pending sector is. It is not a "pending defective sector". It could be defective, but at the moment it's in a sort of limbo state. That's why it's called pending.

With pending sectors, basically, the drive is unable to read the given sector at this time -- but it's not necessarily a bad sector (this is actually a normal occurrence with large disks every once in a while). In other words, if you tried to read it again, the drive might be able to read it (due to temperature fluctuations, phase of the moon, etc.). If it manages to succeed, the drive rewrites what should have been there in the first place. Imagine something written very faintly on a piece of paper. If you look at it, you might not be able to read it. If you stare at it for a while and try a few times, you might be able to figure out what was there. Or you give up, come back to it a day later, and suddenly it makes sense. Now, if the drive succeeds in figuring out what was there, it tries writing it back to the disk. If this write fails, you have something more than just a pending sector: it's actually marked bad and remapped to the spare sectors the drive has. This is a much better indicator of drive failure: "5 Reallocated_Sector_Ct" or "196 Reallocated_Event_Count" (not sure how these differ, but in any case, they're worse than pending sectors).

Now, if the drive fails to determine what was supposed to be in this sector, the machine eventually gets notified of the failure to return the sector; mdadm then reads the data from parity and writes the proper data back to the original drive that failed to read the given sector. So, if the write succeeds, the sector is no longer pending, and the count should be decremented by one. If the write fails, it should be remapped to a spare sector and again the pending sector count should be decremented. Either case is transparent to the machine/mdadm.

So, with the "check" sync_action, you are forcing a read of all data and parity data, on every drive. If there are pending sectors (and there could easily be ones you hit that the drive doesn't even know are pending yet), they will naturally be corrected (either by the drive managing to read it properly this time, or by it giving up and mdadm checking the parity for what was supposed to be there.)

So, yes, pending sectors should be resolved by a check /as long as they're in use by mdadm/. It's important to understand that there might be sectors outside of what's being used by mdadm (partition table, and wasted space at the end of a drive). A "check" will not resolve these, but they're also not an issue. The drive simply isn't sure what's written in a location that you don't care about.
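
If you're curious how big that unused tail actually is on a given disk,
comparing the whole-device size with the partition dump shows it; both of
these report in 512-byte sectors:

blockdev --getsz /dev/sda    # total sectors on the disk
sfdisk -d /dev/sda           # start= and size= of each partition, in sectors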

A check should be safe to do remotely, provided you understand that it will generate a lot of I/O. IIRC, this I/O is of lower priority than actual I/O of the running system, and so it shouldn't cause a problem other than making regular I/O of the system a little slower. I believe a minimum and maximum speed can be set for this check/repair. If you have severe problems that are lurking (in other words, they'll manifest sooner or later without changing anything), this could in theory kick out enough drives to bring the array down. This is very unlikely, however.
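
For reference, the knobs for those speeds live under /proc and sysfs; a
sketch (values are in KB/s, and these are the standard md tunables rather
than anything distro-specific):

cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
echo 50000 > /proc/sys/dev/raid/speed_limit_max    # cap all arrays at ~50 MB/s
echo 50000 > /sys/block/md0/md/sync_speed_max      # or cap just md0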

If you want to run a check on an array with redundancy (RAID-1, RAID-5, RAID-6, etc.), run the following:
echo check > /sys/block/md0/md/sync_action

This will cause the array to check all of the data against the redundancy. The actual intended purpose of this is to check that the parity data matches the actual data. In other words, this is for mdadm's housekeeping. It happens to do a good job of housekeeping for the drives themselves as well, due to forcing everything to be read. After running this command, you can check on the progress with:
cat /proc/mdstat

Once this is complete, the mismatch count should be updated with how many errors were found (if you did a "check"); it can be read from:
cat /sys/block/md0/md/mismatch_cnt

This should be zero.

If this turns up mismatches, they can be repaired with:
echo repair > /sys/block/md0/md/sync_action

Or, you can just run repair outright (and parity will be fixed as it is found to be bad).

Having bad parity data isn't normal. Something like this would happen due to abnormal conditions (power outages, etc.).

The short answer to all of this is: stop worrying about pending sectors (they are relatively normal), run a "check" once a week, and all should be well. On recent CentOS/RHEL this is a cron job, located in:
/etc/cron.weekly/99-raid-check

It is configured by:
/etc/sysconfig/raid-check

This wasn't always included in RHEL/CentOS (it has been added in the last year or two, if I'm not mistaken). No idea how other distros handle this (they simply might not, either).

To remove a disk properly:
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1

If you managed to already remove a disk without telling mdadm you were going to do so, this might help:
mdadm --manage /dev/md0 --remove detached

If all the disks are happily functioning in the array currently (as in, mdadm hasn't kicked any out of the running array), I'd recommend running a "check" before removing any disks, to clean up any pending sectors first.

Then if you still want to remove a disk to either replace it, or do something to it outside of what's healthy/sane to do to a disk in a running array, go ahead with the remove.

Disclaimers and notes:
 - All commands assume you're dealing with "/dev/md0". Where applicable, all commands involving an operation on a specific drive assume you're dealing with /dev/sda1 as the member of the array you want to act on.
 - These are my experiences and related to my particular configuration. Your situation may warrant different action, in spite of my confidence in the accuracy of what's written here.
 - I use "parity" when sometimes I'm simply referring to another copy of the data (Raid-1, Raid-10.) for the sake of brevity.
 - I've heard of someone mention mismatch counts being normal in certain weird situations somewhere on the list (something to do with swap???)
 - There are shorter ways of doing the fail/remove operations, and can also be done with one line (all in the man page).
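
As a sketch of that one-line form (same assumptions as the rest of this
mail: array /dev/md0, member /dev/sda1):

mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1

i.e. the fail and the remove in a single invocation, as John also showed
earlier in the thread.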

Cheers

Peter Zieba



