* mdadm RAID6 faulty drive
@ 2013-03-25 16:02 Paramasivam, Meenakshisundaram
2013-03-25 17:43 ` Phil Turmel
0 siblings, 1 reply; 4+ messages in thread
From: Paramasivam, Meenakshisundaram @ 2013-03-25 16:02 UTC (permalink / raw)
To: linux-raid@vger.kernel.org
Hi,
As a result of an extended power outage, the Fedora 17 machine with an mdadm RAID went down. On bringing it back up, I noticed "faulty /dev/sdf" in mdadm --detail. However, mdadm -E /dev/sdf shows "State : clean". Details are shown below. When I tried to add the drive to the array, the resync failed (I see lots of eSATA bus resets), and I get the same message in mdadm --detail.
Questions:
1. How can a clean drive be reported faulty?
2. Is there an easy way to mark the drive (/dev/sdf) as "assume-clean" and add it?
Please let me know if I should get an exact replacement drive at this stage, pull out the faulty /dev/sdf, and add the new drive to the array. Thanks.
Details:
# mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Thu Dec 20 13:08:56 2012
Raid Level : raid6
Array Size : 11720297472 (11177.35 GiB 12001.58 GB)
Used Dev Size : 1953382912 (1862.89 GiB 2000.26 GB)
Raid Devices : 8
Total Devices : 8
Persistence : Superblock is persistent
Update Time : Mon Mar 25 11:37:12 2013
State : clean, degraded
Active Devices : 7
Working Devices : 7
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : RAIDvol1
UUID : 8a9eee70:89f2639b:68f5350d:11f444fe
Events : 1494
    Number   Major   Minor   RaidDevice   State
       0       0        0        0        removed
       1       8       96        1        active sync   /dev/sdg
       2       8      112        2        active sync   /dev/sdh
       3       8      128        3        active sync   /dev/sdi
       8       8       16        4        active sync   /dev/sdb
       5       8       32        5        active sync   /dev/sdc
       6       8       48        6        active sync   /dev/sdd
       7       8       64        7        active sync   /dev/sde
       9       8       80        -        faulty        /dev/sdf
# mdadm -E /dev/sdf
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x2
Array UUID : 8a9eee70:89f2639b:68f5350d:11f444fe
Name : RAIDvol1
Creation Time : Thu Dec 20 13:08:56 2012
Raid Level : raid6
Raid Devices : 8
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 11720297472 (11177.35 GiB 12001.58 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Recovery Offset : 0 sectors
State : clean
Device UUID : 0a58fa5c:4d10f401:07dead3a:ec844676
Update Time : Fri Mar 22 14:25:13 2013
Checksum : ba455125 - correct
Events : 1043
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAAAAAA ('A' == active, '.' == missing)
#
Sundar
* Re: mdadm RAID6 faulty drive
From: Phil Turmel @ 2013-03-25 17:43 UTC (permalink / raw)
To: Paramasivam, Meenakshisundaram; +Cc: linux-raid@vger.kernel.org
On 03/25/2013 12:02 PM, Paramasivam, Meenakshisundaram wrote:
>
> Hi,
>
> As a result of an extended power outage, the Fedora 17 machine with an
> mdadm RAID went down. On bringing it back up, I noticed "faulty /dev/sdf"
> in mdadm --detail. However, mdadm -E /dev/sdf shows "State : clean".
> Details are shown below. When I tried to add the drive to the array,
> the resync failed (I see lots of eSATA bus resets), and I get the same
> message in mdadm --detail.
>
> Questions:
> 1. How can a clean drive be reported faulty?
When a drive is kicked out for I/O errors, its superblock is left as-is
(just as if you had pulled its SATA cable). The remaining devices'
superblocks are marked to show the failed drive, and *their*
superblocks' event count is bumped. The failed status of that device is
derived during assembly when its superblock is found to be stale.
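The staleness described above shows up in the numbers already posted: mdadm --detail reports "Events : 1494" for the array, while the superblock on /dev/sdf still says "Events : 1043". A minimal sketch of that comparison follows. It is not from the original thread; the helper names events_of and is_stale are hypothetical, and the parsing assumes only the "Events :" line format visible in the output quoted earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the "Events" counter found in `mdadm -E`
# (or `mdadm --detail`) output supplied on stdin.
events_of() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: succeed if the member's event count is behind the
# array's. $1 = array event count, $2 = member event count.
is_stale() {
    [ "$2" -lt "$1" ]
}

# Usage on a live system (needs root; device names are the ones in this thread):
#   array_events=$(mdadm --detail /dev/md2 | events_of)
#   member_events=$(mdadm -E /dev/sdf | events_of)
#   is_stale "$array_events" "$member_events" && echo "/dev/sdf is stale"
```

With the values from this thread, is_stale 1494 1043 succeeds, which is consistent with the member being treated as failed even though its own superblock reads "clean".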
> 2. Is there an easy way to mark the drive (/dev/sdf) as "assume-clean" and
> add it?
No. The closest thing is to use a write-intent bitmap and "re-add"
devices that are disconnected.
That's not your problem.
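For reference only, the bitmap-plus-re-add workflow mentioned above looks roughly like the sketch below. It is for a healthy drive that was briefly disconnected, not for one that is throwing I/O errors. The mdadm options --grow --bitmap=internal and --re-add are real; the run wrapper, the DRY_RUN variable and the bitmap_readd_plan function are hypothetical names added for illustration, and by default the sketch only prints what it would do.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical function. $1 = array (e.g. /dev/md2), $2 = member (e.g. /dev/sdf).
bitmap_readd_plan() {
    array=$1
    member=$2
    # Add an internal write-intent bitmap so md records which regions change.
    run mdadm --grow --bitmap=internal "$array"
    # After a member drops out and returns, re-add it; with a bitmap in
    # place only the regions written in the meantime need to be resynced.
    run mdadm "$array" --re-add "$member"
}

# Usage (prints the two commands without touching the array):
#   bitmap_readd_plan /dev/md2 /dev/sdf
```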
> Please let me know if I should get an exact replacement drive at
> this stage, pull out the faulty /dev/sdf, and add the new drive to
> the array. Thanks.
You very likely need a new drive. You might want to try plugging that
drive into a different controller, or a different port on the same
controller, just to narrow the diagnosis.
You could also show us some of the kernel error messages, or show the
output of "smartctl -x /dev/sdf".
Phil
* Re: mdadm RAID6 faulty drive
From: Roy Sigurd Karlsbakk @ 2013-03-26 20:05 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid, Meenakshisundaram Paramasivam
> You could also show us some of the kernel error messages, or show the
> output of "smartctl -x /dev/sdf".
I second that. Also, if the command above shows no errors, run self-tests with smartctl -t short and smartctl -t long. If those succeed, you may try to re-add the drive, but I wouldn't use --assume-clean. Better to go through a normal rebuild and see whether any errors occur.
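The checks suggested above could be run along the lines of the sketch below. The smartctl options shown (-x, -t short, -t long, -l selftest) are standard; the kernel_lines_for helper is a hypothetical name, and its filter is only an assumption about which kernel log lines are relevant (libata port messages and lines naming the disk).

```shell
#!/bin/sh
# Hypothetical helper: from kernel log text on stdin, keep lines that mention
# a libata port ("ataN") or the given disk name as a whole word (e.g. "sdf").
kernel_lines_for() {
    grep -E -e "ata[0-9]+" -e "(^|[^a-z0-9])$1([^a-z0-9]|\$)"
}

# Usage on a live system (needs root; /dev/sdf is the drive in this thread):
#   dmesg | kernel_lines_for sdf     # bus resets and I/O errors, if any
#   smartctl -x /dev/sdf             # full SMART report
#   smartctl -t short /dev/sdf       # start a short self-test
#   smartctl -t long /dev/sdf        # start an extended self-test
#   smartctl -l selftest /dev/sdf    # read the self-test log once it finishes
```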
Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms with xenotypic etymology. In most cases, adequate and relevant synonyms exist in Norwegian.
--
* RE: mdadm RAID6 faulty drive
From: Paramasivam, Meenakshisundaram @ 2013-03-27 20:46 UTC (permalink / raw)
To: Phil Turmel; +Cc: linux-raid@vger.kernel.org
Thanks for the smartctl suggestion. Although smartctl -t short /dev/sdf passed, I was still unable to add the drive.
Then I did this:
# mdadm /dev/md2 --remove /dev/sdf
# mdadm --stop /dev/md2    (after stopping all processes using the array)
# dd if=/dev/zero of=/dev/sdf
dd: writing to `/dev/sdf': Input/output error
8522041+0 records in
8522040+0 records out
4363284480 bytes (4.4 GB) copied, 138.782 s, 31.4 MB/s
It appears to be a bad drive; I will replace it.
Sundar
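As a cross-check of the dd figures above, the record count can be turned back into a byte offset. This is a small worked calculation, not part of the original message. It relies on dd's default block size of 512 bytes, which applies here because no bs= was given; the variable names are illustrative.

```shell
#!/bin/sh
records_out=8522040   # complete records written, from the dd output above
block_size=512        # dd's default block size when bs= is not given

bytes=$((records_out * block_size))
echo "bytes written before the error: $bytes"    # 4363284480, as dd reported
echo "whole GiB into the disk: $((bytes / 1024 / 1024 / 1024))"
```

So the write failed about 4.4 GB into a 2 TB drive, after the short self-test had passed.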