* Building new RAID5 results in removed and failed devices
@ 2010-06-24 20:26 Markus Krainz
2010-06-24 21:03 ` Dan Williams
2010-06-24 21:38 ` Robin Hill
0 siblings, 2 replies; 6+ messages in thread
From: Markus Krainz @ 2010-06-24 20:26 UTC (permalink / raw)
To: linux-raid
1. I delete my old array md0
2. Then I create a new one as md1
3. The new md1 is not ok and shows removed, faulty and spare devices.
4. I tried this several times now, but I always get the same issue.
I am at my wits end and would very much appreciate your advice. Thank
you in advance.
This is what I did:
/dev/ mdadm --stop md0
mdadm: stopped md0
/dev/ mdadm --remove md0
/dev/ mdadm --detail md0
mdadm: md device md0 does not appear to be active.
/dev/ mdadm --zero-superblock /dev/sdd1
/dev/ mdadm --zero-superblock /dev/sdc1
mdadm: Unrecognised md component device - /dev/sdc1
/dev/ mdadm --zero-superblock /dev/sdb1
/dev/ mdadm --create --verbose /dev/md1 --chunk=64 --level=5 \
    --raid-devices=2 /dev/sdd1 /dev/sdb1
mdadm: layout defaults to left-symmetric
mdadm: size set to 1953511936K
mdadm: array /dev/md1 started.
/dev/ mdadm --detail md1
md1:
Version : 00.90
Creation Time : Thu Jun 24 15:17:51 2010
Raid Level : raid5
Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Thu Jun 24 17:23:46 2010
State : clean, degraded, recovering
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 36% complete
UUID : d83aec6d:a71a2345:4cbed0ec:a797be60 (local to host d-serv)
Events : 0.12
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
2 8 17 1 spare rebuilding /dev/sdb1
/dev/ mdadm --detail md1
md1:
Version : 00.90
Creation Time : Thu Jun 24 15:17:51 2010
Raid Level : raid5
Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Thu Jun 24 18:04:28 2010
State : clean, degraded
Active Devices : 0
Working Devices : 1
Failed Devices : 1
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
0 0 0 0 removed
1 0 0 1 removed
2 8 17 - spare /dev/sdb1
3 8 49 - faulty spare /dev/sdd1
* Re: Building new RAID5 results in removed and failed devices
From: Dan Williams @ 2010-06-24 21:03 UTC (permalink / raw)
To: Markus Krainz; +Cc: linux-raid
On Thu, Jun 24, 2010 at 1:26 PM, Markus Krainz <ldm@gmx.at> wrote:
> 1. I delete my old array md0
> 2. Then I create a new one as md1
> 3. The new md1 is not ok and shows removed, faulty and spare devices.
This is expected without the --force option:
    --force
        Insist that mdadm accept the geometry and layout specified
        without question. Normally mdadm will not allow creation of
        an array with only one device, and will try to create a RAID5
        array with one missing drive (as this makes the initial resync
        work faster). With --force, mdadm will not try to be so clever.
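
As a rough sketch of what the quoted passage describes, here is the create command from the original post with --force added, so that mdadm treats both members as active from the start instead of building onto a spare. The function name create_md1 and the DRY_RUN variable are made up for illustration and are not part of mdadm. By default the function only prints the command. Note that --force changes how the array is initialised; it would not by itself cure a read error on a member disk.

```shell
#!/bin/sh
# Hypothetical wrapper: create_md1 and DRY_RUN are illustrative names only.
create_md1() {
    # Same options as the original command, plus --force.
    set -- mdadm --create --verbose /dev/md1 --chunk=64 --level=5 \
        --raid-devices=2 --force /dev/sdd1 /dev/sdb1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"    # print only; nothing is written to the disks
    else
        "$@"         # destructive: writes new md metadata to both partitions
    fi
}
```

Calling create_md1 with DRY_RUN unset or set to 1 shows the exact command line; setting DRY_RUN=0 (as root) would actually run it.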
--
Dan
* Re: Building new RAID5 results in removed and failed devices
From: Robin Hill @ 2010-06-24 21:38 UTC (permalink / raw)
To: linux-raid
On Thu Jun 24, 2010 at 10:26:33PM +0200, Markus Krainz wrote:
> 1. I delete my old array md0
> 2. Then I create a new one as md1
> 3. The new md1 is not ok and shows removed, faulty and spare devices.
> 4. I tried this several times now, but I always get the same issue.
>
> I am at my wits end and would very much appreciate your advice. Thank
> you in advance.
>
<--snip-->
> Number Major Minor RaidDevice State
> 0 8 49 0 active sync /dev/sdd1
> 2 8 17 1 spare rebuilding /dev/sdb1
>
<--snip-->
> 2 8 17 - spare /dev/sdb1
> 3 8 49 - faulty spare /dev/sdd1
>
Looks like it started doing the initial synchronisation as normal, but
hit a read error on sdd1 which caused the array to fail. sdd1 is left
as faulty, and sdb1 is spare as the rebuild didn't complete. I'd
recommend testing sdd1 (SMART tests, read tests, write tests) before
trying to use it any further.
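
One possible way to script the non-destructive part of those checks is sketched below. The helper name check_disk and the DRY_RUN variable are invented for this example. It prints the commands by default and covers only a SMART extended self-test and a read-only badblocks scan; write tests are left out on purpose because they destroy data.

```shell
#!/bin/sh
# Hypothetical helper: check_disk and DRY_RUN are illustrative names only.
check_disk() {
    dev=$1
    # smartctl -t long starts the drive's extended self-test in the background;
    # read the result later with: smartctl -l selftest "$dev"
    # badblocks without -n or -w is a read-only scan (-s progress, -v verbose).
    for cmd in "smartctl -t long $dev" "badblocks -sv $dev"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"          # print only
        else
            $cmd || return 1     # run for real (needs root)
        fi
    done
}
```

For example, check_disk /dev/sdd lists the two commands; with DRY_RUN=0 it runs them in order and stops at the first failure.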
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |
* Re: Building new RAID5 results in removed and failed devices
From: Markus Krainz @ 2010-06-24 22:10 UTC (permalink / raw)
To: linux-raid
> Looks like it started doing the initial synchronisation as normal, but
> hit a read error on sdd1 which caused the array to fail. sdd1 is left
> as faulty, and sdb1 is spare as the rebuild didn't complete. I'd
> recommend testing sdd1 (SMART tests, read tests, write tests) before
> trying to use it any further.
>
> Cheers,
> Robin
>
I think this unlikely because I did a full 4-pass badblocks
read/write-test on sdb, sdc and sdd before using them.
No bad blocks have been found.
smartctl -H shows PASSED on all 3 devices. However sdd and sdc have
Offline_Uncorrectable set to 1 while sdb has not. Is this a sign of disk
failure?
Best regards,
Markus
~/ sudo smartctl -H /dev/sdb
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
~/ sudo smartctl -H /dev/sdc
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
~/ sudo smartctl -H /dev/sdd
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
~/ sudo smartctl -A /dev/sdd
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 243   142   021    Pre-fail Always  -           4816
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           42
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           453
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           40
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           32
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           514
194 Temperature_Celsius     0x0022 100   087   000    Old_age  Always  -           52
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           1
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           1
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0
~/ sudo smartctl -A /dev/sdc
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 245   156   021    Pre-fail Always  -           4725
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           39
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           456
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           38
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           23
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           1286
194 Temperature_Celsius     0x0022 110   087   000    Old_age  Always  -           42
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           1
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           1
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0
~/ sudo smartctl -A /dev/sdb
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 245   158   021    Pre-fail Always  -           4716
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           40
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           455
 10 Spin_Retry_Count        0x0032 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           38
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           29
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           1215
194 Temperature_Celsius     0x0022 105   082   000    Old_age  Always  -           47
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0
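
The attribute dumps are long, and only a few counters bear on the question of bad sectors. A minimal filter, assuming smartctl's usual layout of one attribute per line with RAW_VALUE in the tenth column, might look like this. The function name smart_sector_flags is made up for the example, and the choice of IDs 5, 196, 197 and 198 is a common rule of thumb rather than anything stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -A` output on stdin and prints
# "NAME RAW_VALUE" for the sector-health counters whose raw value is non-zero.
# Assumes one attribute per line with RAW_VALUE in column 10.
smart_sector_flags() {
    awk '($1 == 5 || $1 == 196 || $1 == 197 || $1 == 198) && $10 + 0 > 0 {
        print $2, $10
    }'
}
```

Usage would be along the lines of `sudo smartctl -A /dev/sdd | smart_sector_flags`. With the values reported for sdd and sdc in this message it would list Current_Pending_Sector 1 and Offline_Uncorrectable 1; for sdb it would print nothing.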
* Re: Building new RAID5 results in removed and failed devices
From: Markus Krainz @ 2010-06-24 22:23 UTC (permalink / raw)
To: linux-raid
On 06/24/2010 11:03 PM, Dan Williams wrote:
> This is expected without the --force option:
>
>     --force
>         Insist that mdadm accept the geometry and layout specified
>         without question. Normally mdadm will not allow creation of
>         an array with only one device, and will try to create a RAID5
>         array with one missing drive (as this makes the initial resync
>         work faster). With --force, mdadm will not try to be so clever.
>
>
The manpage you quoted says mdadm will not create a raid5 array with
only one device.
However in my case I have --raid-devices=2 and both /dev/sdd1 and
/dev/sdb1. So I should be ok with a raid5 and 2 devices?
The reason I am not using raid1 is because I want to add more drives
later on.
Best regards,
Markus
mdadm --create --verbose /dev/md1 --chunk=64 --level=5 --raid-devices=2 \
    /dev/sdd1 /dev/sdb1
* Re: Building new RAID5 results in removed and failed devices
From: Mikael Abrahamsson @ 2010-06-25 4:06 UTC (permalink / raw)
To: Markus Krainz; +Cc: linux-raid
On Fri, 25 Jun 2010, Markus Krainz wrote:
> Offline_Uncorrectable set to 1 while sdb has not. Is this a sign of disk
> failure?
Yes, it means there is one sector that returned a read error.
Check your kernel logs ("dmesg") and it's very likely you'll see media
layer errors there.
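
A small sketch of that log check follows. The function name disk_errors is invented here, and the keyword list is an assumption about typical kernel wording, not an exhaustive set of messages.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and keeps lines that
# mention the given disk together with a typical error keyword.
disk_errors() {
    disk=$1
    grep -i "$disk" | grep -i -E 'i/o error|medium error|media error|unrecovered read'
}
```

It would be used as `dmesg | disk_errors sdd`. No output means none of the listed keywords appeared for that disk, which does not prove the disk is healthy.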
--
Mikael Abrahamsson email: swmike@swm.pp.se