* Impending failure?
From: Alex @ 2011-11-04 13:53 UTC (permalink / raw)
To: linux-raid
Hi,
I have a fedora15 system with two 80GB SATA disks using RAID1 and have
the following messages in syslog:
Nov 4 07:43:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Offline
uncorrectable sectors
Nov 4 08:13:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Currently
unreadable (pending) sectors
"smartctl --all /dev/sda" does show that errors did occur at some
point in the past, but it doesn't seem to be affecting the RAID:
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[0] sdb2[1]
74750908 blocks super 1.1 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk
md0 : active raid1 sda1[0] sdb1[1]
511988 blocks super 1.0 [2/2] [UU]
unused devices: <none>
Perhaps the bad sectors haven't been accessed, which is why the RAID
still appears to be intact?
Do I need to verify the RAID integrity in some way? Force a rebuild?
I believe the boot sector is installed on sda, which is also the bad
disk. If I remove the disk to replace it, I'm concerned the system
will no longer boot.
Can you point me to instructions on the best way to replace a disk?
Thanks,
Alex
* Re: Impending failure?
From: Mathias Burén @ 2011-11-04 13:57 UTC (permalink / raw)
To: Alex; +Cc: linux-raid
On 4 November 2011 13:53, Alex <mysqlstudent@gmail.com> wrote:
> Hi,
> I have a fedora15 system with two 80GB SATA disks using RAID1 and have
> the following messages in syslog:
>
> Nov 4 07:43:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Offline
> uncorrectable sectors
> Nov 4 08:13:11 mail smartd[2001]: Device: /dev/sda [SAT], 2 Currently
> unreadable (pending) sectors
>
> "smartctl --all /dev/sda" does show that errors did occur at some
> point in the past, but it doesn't seem to be affecting the RAID:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 sda2[0] sdb2[1]
> 74750908 blocks super 1.1 [2/2] [UU]
> bitmap: 1/1 pages [4KB], 65536KB chunk
>
> md0 : active raid1 sda1[0] sdb1[1]
> 511988 blocks super 1.0 [2/2] [UU]
>
> unused devices: <none>
>
> Perhaps the bad sectors haven't been accessed, which is why the RAID
> still appears to be intact?
>
> Do I need to verify the RAID integrity in some way? Force a rebuild?
>
> I believe the boot sector is installed on sda, which is also the bad
> disk. If I remove the disk to replace it, I'm concerned the system
> will no longer boot.
>
> Can you point me to instructions on the best way to replace a disk?
>
> Thanks,
> Alex
Hi,
Basically, if uncorrectable sectors >0 or pending sectors >0, the drive
is failing. So replace it ASAP :)
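Something like this shows just the relevant counters (a sketch; smartctl
is from smartmontools, /dev/sda as in your log):

smartctl -A /dev/sda | egrep 'Reallocated|Pending|Uncorrectable'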
Regards,
Mathias
* Re: Impending failure?
From: Mikael Abrahamsson @ 2011-11-04 14:33 UTC (permalink / raw)
To: Alex; +Cc: linux-raid
On Fri, 4 Nov 2011, Alex wrote:
> Can you point me to instructions on the best way to replace a disk?
First run "repair" on the array, hopefully it'll notice the unreadable
blocks and re-write them.
echo repair >> /sys/block/md0/md/sync_action
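Both of your arrays need it, and you can watch the progress and the result
afterwards; a rough sketch (md names taken from your /proc/mdstat):

echo repair > /sys/block/md1/md/sync_action   # same thing for the second array
cat /proc/mdstat                              # shows the running repair
cat /sys/block/md1/md/mismatch_cnt            # mismatches found by the last check/repair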
Also make sure your OS does regular scrubs of the RAID. Usually this is
done by monthly runs of checkarray; here is an example from Ubuntu:
:/etc/cron.d$ cat mdadm
#
# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
#
# Copyright © martin f. krafft <madduck@madduck.net>
# distributed under the terms of the Artistic Licence 2.0
#
# By default, run at 00:57 on every Sunday, but do nothing unless the day of
# the month is less than or equal to 7. Thus, only run on the first Sunday of
# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
# hack (see #380425).
57 0 * * 0 root if [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ]; then /usr/share/mdadm/checkarray --cron --all --idle --quiet; fi
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Impending failure?
From: Jérôme Poulin @ 2011-11-04 14:45 UTC (permalink / raw)
To: linux-raid
On Fri, Nov 4, 2011 at 9:57 AM, Mathias Burén <mathias.buren@gmail.com> wrote:
>
> Basically, if uncorrectable sectors >0 or pending sectors >0 the drive
> is failing. So replace ASAP :)
Exactly. New drives try to hide errors as much as they can; as soon as
SMART shows anything suspect -- READ DMA failures, pending sectors,
offline uncorrectable sectors, or any failed self-test -- the disk is
going to die soon.
* Re: Impending failure?
From: Mikael Abrahamsson @ 2011-11-04 14:59 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-raid
On Fri, 4 Nov 2011, Jérôme Poulin wrote:
> Exactly. New drives try to hide errors as much as they can; as soon as
> SMART shows anything suspect -- READ DMA failures, pending sectors,
> offline uncorrectable sectors, or any failed self-test -- the disk is
> going to die soon.
Pending sectors (read errors) are not as bad as equivalent write errors.
A 2TB drive's specified bit error rate means it's still within spec if it
throws you a read error every 6 times you read all the data from it.
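To sketch the arithmetic (assuming the usual consumer-drive spec of one
unrecoverable read error per 10^14 bits, a figure not quoted above):

10^14 bits / 8 = 1.25 * 10^13 bytes ~ 12.5 TB
12.5 TB / 2 TB ~ 6 full reads of the drive per expected read error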
So saying read errors and pending sectors are a strong indication of
impending doom is an exaggeration.
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Impending failure?
From: Thomas Fjellstrom @ 2011-11-04 15:09 UTC (permalink / raw)
To: Jérôme Poulin; +Cc: linux-raid
On November 4, 2011, Jérôme Poulin wrote:
On Fri, Nov 4, 2011 at 9:57 AM, Mathias Burén <mathias.buren@gmail.com> wrote:
> > Basically, if uncorrectable sectors >0 or pending sectors >0 the drive
> > is failing. So replace ASAP :)
>
> Exactly. New drives try to hide errors as much as they can; as soon as
> SMART shows anything suspect -- READ DMA failures, pending sectors,
> offline uncorrectable sectors, or any failed self-test -- the disk is
> going to die soon.
Maybe I was just lucky, but I had an uncorrectable sector error at one point,
and ran a dd write on the drive, and SMART reported the sectors were
corrected. It wouldn't do it with a read, but a full write worked. The drive
worked for quite some time after that. I might still have it, but I can't
remember which drive that was (I've had a lot).
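For reference, the kind of full overwrite that does this is simply
something like the following -- it destroys everything on the disk, so
only do it to a drive that has already been removed from the array
(sdX is a placeholder):

dd if=/dev/zero of=/dev/sdX bs=1M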
--
Thomas Fjellstrom
thomas@fjellstrom.ca
* Re: Impending failure?
From: Alex @ 2011-11-04 15:31 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: linux-raid
Hi,
>> Can you point me to instructions on the best way to replace a disk?
>
> First run "repair" on the array, hopefully it'll notice the unreadable
> blocks and re-write them.
>
> echo repair >> /sys/block/md0/md/sync_action
>
> Also make sure your OS does regular scrubs of the raid, usually this is done
> by monthly runs of checkarray, this is an example from Ubuntu:
Great, thanks. I recalled something like that, but couldn't remember exactly.
The system passed the above rebuild test on both arrays, but I'm
obviously still concerned about the disk. Here are the relevant
smartctl lines:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   089   006    Pre-fail Always       -       0
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age  Always       -       29
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail Always       -       0
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail Always       -       209739855
  9 Power_On_Hours          0x0032   074   074   000    Old_age  Always       -       22816
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age  Always       -       37
187 Reported_Uncorrect      0x0032   095   095   000    Old_age  Always       -       5
189 High_Fly_Writes         0x003a   100   100   000    Old_age  Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   064   045    Old_age  Always       -       25 (Min/Max 23/32)
194 Temperature_Celsius     0x0022   025   040   000    Old_age  Always       -       25 (0 18 0 0)
195 Hardware_ECC_Recovered  0x001a   057   045   000    Old_age  Always       -       51009302
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline      -       2
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age  Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age  Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age  Always       -       0
Pending_sector and uncorrectable are both greater than zero. Is this
drive on its way to failure?
Can someone point me to the proper mdadm commands to set the drive
faulty then rebuild it after installing the new one?
Thanks again,
Alex
* Re: Impending failure?
From: Mathias Burén @ 2011-11-04 15:43 UTC (permalink / raw)
To: Alex; +Cc: Mikael Abrahamsson, linux-raid
On 4 November 2011 15:31, Alex <mysqlstudent@gmail.com> wrote:
> Hi,
>
> [...]
>
> Pending_sector and uncorrectable are both greater than zero. Is this
> drive on its way to failure?
>
> Can someone point me to the proper mdadm commands to set the drive
> faulty then rebuild it after installing the new one?
>
> Thanks again,
> Alex
187 Reported_Uncorrect      0x0032   095   095   000    Old_age  Always       -       5
197 Current_Pending_Sector  0x0012   100   100   000    Old_age  Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age  Offline      -       2
This tells me to get rid of the drive. I don't know the mdadm commands
off the top of my head, sorry, but they're in the man page(s). If you want,
run a scrub and see if these numbers change. If the drive fails hard
enough then md will kick it out of the array anyway. Btw, I scrub my
RAID6 (7 HDDs) once a week.
/Mathias
* Re: Impending failure?
2011-11-04 15:43 ` Mathias Burén
@ 2011-11-04 17:42 ` Peter Zieba
2011-11-07 21:17 ` Alex
0 siblings, 1 reply; 12+ messages in thread
From: Peter Zieba @ 2011-11-04 17:42 UTC (permalink / raw)
To: Mathias Burén; +Cc: Mikael Abrahamsson, linux-raid, Alex
So, in my personal experience with pending sectors, it's worth mentioning the following:
If you do a "check", and you have any pending sectors that are within the partition that is used for the md device, they should be read and rewritten as needed, causing the count to go down. However, I've noticed that sometimes I have pending sector counts on drives that don't go away after a "check". These would go away, however, if I failed and then removed the drive with mdadm, and then subsequently zero-filled the /entire/ drive (as opposed to just the partition on that disk that is used by the array). The reason for this is that there's a small chunk of unused space right after the partition that never gets read or written to (even though I technically partition the entire drive as one large partition of type fd, Linux raid autodetect).
I think what actually happens in this case is that when the system reads data from near the end of the array, the drive itself will do read-ahead and cache it. So, even though the computer never requested those abandoned sectors, the drive eventually notices that it can't read them, and makes a note of the fact. So, this is harmless.
You could probably avoid the potential for false positives on pending sectors if you used the entire disk for the array (no partitions), but I'm pretty sure that breaks the raid auto-detection.
Currently, my main array has eight 2TB Hitachi disks in a RAID6. It is scrubbed once a week, and one disk consistently has 8 pending sectors on it. I'm certain I could make those go away if I wanted, but, frankly, it's purely aesthetic as far as I'm concerned. Some of my drives also have non-zero "196 Reallocated_Event_Count" and "5 Reallocated_Sector_Ct"; however, I have no drives with non-zero "Offline_Uncorrectable". I haven't had any problems with the disks or array (other than a temperature-induced failure ... but that's another story, and I still run the same disks after that event). I used to have lots of issues before I started scrubbing consistently.
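A quick way to see whether a scrub actually cleared anything is to compare
the counters before and after; roughly (device and array names as in the
original post):

smartctl -A /dev/sda | egrep 'Pending|Uncorrectable'
echo check > /sys/block/md0/md/sync_action    # wait for it to finish (cat /proc/mdstat)
smartctl -A /dev/sda | egrep 'Pending|Uncorrectable'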
Peter
* Re: Impending failure?
From: John Robinson @ 2011-11-04 18:05 UTC (permalink / raw)
To: Alex; +Cc: linux-raid
On 04/11/2011 13:53, Alex wrote:
> Hi,
> I have a fedora15 system with two 80GB SATA disks using RAID1
[...]
> I believe the boot sector is installed on sda, which is also the bad
> disk. If I remove the disk to replace it, I'm concerned the system
> will no longer boot.
>
> Can you point me to instructions on the best way to replace a disk?
If you used Fedora's installer to set up the md RAIDs in the first
place, you will be able to boot off the second drive, as the Fedora
installer will have installed grub on it as well. You would have to tell
your BIOS to boot from it though.
If you've room in the case and spare SATA ports for a third drive (with
which to replace the failing drive), put it in and, assuming it appears
as sdc, I'd do the switcheroo by growing the array to three drives and
then shrinking it again, something like this (all off the top of my head,
so check it first against the man page, which will help you understand
what's going on anyway):
# sfdisk -d /dev/sda
Note down where sda1 starts. It will likely be either 63 or 2048.
# dd if=/dev/sda of=/dev/sdc bs=512 count=<howevermanyitwas>
This will copy the partition table and boot code.
# blockdev --rereadpt /dev/sdc
# mdadm /dev/md0 --add /dev/sdc1
# mdadm /dev/md1 --add /dev/sdc2
# mdadm --grow /dev/md0 -n3
# mdadm --grow /dev/md1 -n3
Wait for this to finish, either by looking at /proc/mdstat from time to
time, or using mdadm --wait /dev/md1. This gives you a three-way mirror.
It's possible this process will cause sda to fail, but that's OK. Next
we want to remove the duff drive:
# mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
# mdadm /dev/md1 --fail /dev/sda2 --remove /dev/sda2
# mdadm --grow /dev/md0 -n2
# mdadm --grow /dev/md1 -n2
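At this point it's worth confirming that both arrays look healthy again
(back to [2/2] [UU], now on sdb and sdc), e.g.:

# cat /proc/mdstat
# mdadm --detail /dev/md1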
Now you can shut down again, and install the new drive in the place of
the original failing drive.
Cheers,
John.
* Re: Impending failure?
From: Alex @ 2011-11-07 21:17 UTC (permalink / raw)
To: Peter Zieba; +Cc: Mathias Burén, Mikael Abrahamsson, linux-raid
Hi guys,
> So, in my personal experience with pending sectors, it's worth mentioning the following:
>
> [...]
I think I understand your explanation. You are basically saying that
if I recheck the drive, there's a possibility the pending defective
sectors may resolve themselves?
Given that I have an existing system, how do I check the integrity of
the partitions? What are the contents of the "check" script to which
you refer?
Is this safe to do remotely?
Is it necessary to set a disk faulty before removing it?
Thanks,
Alex
* Re: Impending failure?
From: Peter Zieba @ 2011-11-08 0:04 UTC (permalink / raw)
To: Alex; +Cc: Mathias Burén, Mikael Abrahamsson, linux-raid
Ok, so first I'll explain what a pending sector is. It is not "pending defective sectors". It could be defective, but at the moment it's in a sort of limbo state. That's why it's called pending.
With pending sectors, basically, the drive is unable to read the given sector at this time -- but it's not necessarily a bad sector (this is actually a normal occurrence with large disks every once in a while). In other words, if you tried to read it again, the drive might be able to read it (due to temperature fluctuations, phase of the moon, etc.). If it manages to succeed, the drive rewrites what should have been there in the first place.
Imagine something written very faintly on a piece of paper. If you look at it, you might not be able to read it. If you stare at it for a while and try a few times you might be able to figure out what was there. Or you give up, come back to it a day later, and suddenly it makes sense.
Now, if the drive succeeds in figuring out what was there, it tries writing it back to the disk. If this fails, you have something more than just a pending sector: it's actually marked bad and remapped to the drive's spare sectors. This is a much better indicator of drive failure -- "5 Reallocated_Sector_Ct" or "196 Reallocated_Event_Count" (not sure how these differ, but in any case, they're worse than pending sectors).
Now, if the drive fails to determine what was supposed to be in this sector, the machine eventually gets notified of the failure to return the sector, mdadm then reads the data from parity, and writes the proper data back to the original drive that failed to read the given sector. So, if the write succeeds, the sector is no longer pending, and the count should be decremented by one. If the write fails, it should be remapped to a spare sector and again the pending sector count should be decremented. Either case is transparent to the machine/mdadm.
So, with the "check" sync_action, you are forcing a read of all data and parity data, on every drive. If there are pending sectors (and there could easily be ones you hit that the drive doesn't even know are pending yet), they will naturally be corrected (either by the drive managing to read it properly this time, or by it giving up and mdadm checking the parity for what was supposed to be there.)
So, yes, pending sectors should be resolved by a check /as long as they're in use by mdadm/. It's important to understand that there might be sectors outside of what's being used by mdadm (partition table, and wasted space at the end of a drive). A "check" will not resolve these, but they're also not an issue. The drive simply isn't sure what's written in a location that you don't care about.
A check should be safe to do remotely, provided you understand that it will generate a lot of I/O. IIRC, this I/O is of lower priority than actual I/O of the running system, and so it shouldn't cause a problem other than making regular I/O of the system a little slower. I believe a minimum and maximum speed can be set for this check/repair. If you have severe problems that are lurking (in other words, they'll manifest sooner or later without changing anything), this could in theory kick out enough drives to bring the array down. This is very unlikely, however.
If you want to run a check on an array with redundancy (Raid-1, Raid-5, Raid-6, etc.), doing the following:
echo check > /sys/block/md0/md/sync_action
Will cause the array to check all of the data against the redundancy. The actual intended purpose of this is to check that the parity data matches the actual data. In other words, this is for mdadm's housekeeping. It happens to do a good job of housekeeping for the drives themselves due to forcing everything to be read, however. After running this command, you can check on the progresss with:
cat /proc/mdstat
Once this is complete, and if you ran a "check", the mismatch count will show how many inconsistencies were found; it can be read from:
cat /sys/block/md0/md/mismatch_cnt
This should be zero.
If this turns up mismatches, they can be repaired with:
echo repair > /sys/block/md0/md/sync_action
Or, you can just run repair outright (and parity will be fixed as it is found to be bad).
Having bad parity data isn't normal. Something like this would happen due to abnormal conditions (power outages, etc.).
The short answer to all of this: stop worrying about pending sectors (they are relatively normal), run a "check" once a week, and all should be well. On recent CentOS/RHEL this is a cron job, located in:
/etc/cron.weekly/99-raid-check
It is configured by:
/etc/sysconfig/raid-check
This wasn't always included in RHEL/CentOS (it was added in the last year or two, if I'm not mistaken). No idea how other distros handle this (they simply might not, either).
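If your distro doesn't ship such a job, a minimal hand-rolled weekly scrub
does much the same; a sketch, e.g. a one-line /etc/cron.d/md-scrub (file
name and schedule are just examples):

30 1 * * 0 root for a in /sys/block/md*/md/sync_action; do echo check > "$a"; done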
To remove a disk properly:
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1
If you managed to already remove a disk without telling mdadm you were going to do so, this might help:
mdadm --manage /dev/md0 --remove detached
If all the disks are happily functioning in the array currently (as in, mdadm hasn't kicked any out of the running array), I'd recommend running a "check" before removing any disks, to clean up any pending sectors first.
Then if you still want to remove a disk to either replace it, or do something to it outside of what's healthy/sane to do to a disk in a running array, go ahead with the remove.
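Once the replacement disk is in place and partitioned to match the old one,
the new partition goes back into the array with --add, and md rebuilds onto
it automatically; for example (assuming the new member comes up as /dev/sda1
again):

mdadm --manage /dev/md0 --add /dev/sda1
cat /proc/mdstat    # watch the rebuild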
Disclaimers and notes:
- All commands assume you're dealing with "/dev/md0". Where applicable, all commands involving an operation on a specific drive assume you're dealing with /dev/sda1 as the member of the array you want to act on.
- These are my experiences and related to my particular configuration. Your situation may warrant different action, in spite of my confidence in the accuracy of what's written here.
- I use "parity" when sometimes I'm simply referring to another copy of the data (Raid-1, Raid-10.) for the sake of brevity.
- I've heard of someone mention mismatch counts being normal in certain weird situations somewhere on the list (something to do with swap???)
- There are shorter ways of doing the fail/remove operations, and can also be done with one line (all in the man page).
Cheers
Peter Zieba