* Building new RAID5 results in removed and failed devices
From: Markus Krainz @ 2010-06-24 20:26 UTC
  To: linux-raid

1. I delete my old array md0
2. Then I create a new one as md1
3. The new md1 is not ok and shows removed, faulty and spare devices.
4. I tried this several times now, but I always get the same issue.

I am at my wits' end and would very much appreciate your advice. Thank 
you in advance.

This is what I did:

/dev/ mdadm --stop md0
mdadm: stopped md0

/dev/ mdadm --remove md0

/dev/ mdadm --detail md0
mdadm: md device md0 does not appear to be active.

/dev/ mdadm --zero-superblock /dev/sdd1

/dev/ mdadm --zero-superblock /dev/sdc1
mdadm: Unrecognised md component device - /dev/sdc1

/dev/ mdadm --zero-superblock /dev/sdb1

/dev/ mdadm --create --verbose /dev/md1 --chunk=64 --level=5 --raid-devices=2 /dev/sdd1 /dev/sdb1
mdadm: layout defaults to left-symmetric
mdadm: size set to 1953511936K
mdadm: array /dev/md1 started.

/dev/ mdadm --detail md1
md1:
         Version : 00.90
   Creation Time : Thu Jun 24 15:17:51 2010
      Raid Level : raid5
      Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 1
     Persistence : Superblock is persistent

     Update Time : Thu Jun 24 17:23:46 2010
           State : clean, degraded, recovering
  Active Devices : 1
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 64K

  Rebuild Status : 36% complete

            UUID : d83aec6d:a71a2345:4cbed0ec:a797be60 (local to host d-serv)
          Events : 0.12

     Number   Major   Minor   RaidDevice State
        0       8       49        0      active sync   /dev/sdd1
        2       8       17        1      spare rebuilding   /dev/sdb1

/dev/ mdadm --detail md1
md1:
         Version : 00.90
   Creation Time : Thu Jun 24 15:17:51 2010
      Raid Level : raid5
      Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 1
     Persistence : Superblock is persistent

     Update Time : Thu Jun 24 18:04:28 2010
           State : clean, degraded
  Active Devices : 0
Working Devices : 1
  Failed Devices : 1
   Spare Devices : 1

          Layout : left-symmetric
      Chunk Size : 64K

     Number   Major   Minor   RaidDevice State
        0       0        0        0      removed
        1       0        0        1      removed

        2       8       17        -      spare   /dev/sdb1
        3       8       49        -      faulty spare   /dev/sdd1
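
To compare the two outputs above at a glance, here is a small shell sketch that condenses `mdadm --detail` output to the array state plus every member that is not `active sync`. The function name `md_detail_summary` is made up for illustration, and the parsing assumes the column layout shown above (four numeric or `-` columns followed by the state text).

```shell
# Summarise `mdadm --detail` output read from stdin.
# Prints the value of the "State :" line, then one line for every
# member row whose state does not start with "active sync".
md_detail_summary() {
    awk '
        # "   State : clean, degraded" -> keep only the text after the colon
        /^[ \t]*State :/ {
            sub(/^[^:]*:[ \t]*/, "")
            print "state: " $0
            next
        }
        # Member rows start with three numeric columns (Number, Major, Minor);
        # column 4 is the RaidDevice slot or "-", the rest is the state text.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && NF >= 5 {
            state = ""
            for (i = 5; i <= NF; i++) state = state (i > 5 ? " " : "") $i
            if (state !~ /^active sync/) print "member: " state
        }
    '
}
```

Typical use would be `mdadm --detail /dev/md1 | md_detail_summary`. On the second output above it should report the state `clean, degraded`, two `removed` slots, `spare /dev/sdb1` and `faulty spare /dev/sdd1`.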


* Re: Building new RAID5 results in removed and failed devices
From: Dan Williams @ 2010-06-24 21:03 UTC
  To: Markus Krainz; +Cc: linux-raid

On Thu, Jun 24, 2010 at 1:26 PM, Markus Krainz <ldm@gmx.at> wrote:
> 1. I delete my old array md0
> 2. Then I create a new one as md1
> 3. The new md1 is not ok and shows removed, faulty and spare devices.

This is expected without the --force option:

--force
       Insist that mdadm accept the geometry and layout specified without
       question. Normally mdadm will not allow creation of an array with
       only one device, and will try to create a RAID5 array with one
       missing drive (as this makes the initial resync work faster). With
       --force, mdadm will not try to be so clever.
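
As a sketch of where `--force` would go in the original create command, the helper below only prints the command line and does not run mdadm. The function name `raid5_create_cmd` is hypothetical; the device names and options are copied from the original post. The quoted manpage text describes the first `--detail` output (one active device plus a rebuilding spare); it does not by itself say anything about a device later being marked faulty.

```shell
# Print (do not execute) the mdadm create command from the original post.
# Pass "force" as the first argument to include --force.
raid5_create_cmd() {
    force_opt=""
    if [ "${1:-}" = "force" ]; then
        force_opt=" --force"
    fi
    echo "mdadm --create --verbose /dev/md1${force_opt} --chunk=64 --level=5 --raid-devices=2 /dev/sdd1 /dev/sdb1"
}
```

Running `raid5_create_cmd force` prints the command with `--force` inserted, so it can be reviewed before being run by hand.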

--
Dan


* Re: Building new RAID5 results in removed and failed devices
From: Robin Hill @ 2010-06-24 21:38 UTC
  To: linux-raid


On Thu Jun 24, 2010 at 10:26:33PM +0200, Markus Krainz wrote:

> 1. I delete my old array md0
> 2. Then I create a new one as md1
> 3. The new md1 is not ok and shows removed, faulty and spare devices.
> 4. I tried this several times now, but I always get the same issue.
> 
> I am at my wits' end and would very much appreciate your advice. Thank 
> you in advance.
> 
<--snip-->
>      Number   Major   Minor   RaidDevice State
>         0       8       49        0      active sync   /dev/sdd1
>         2       8       17        1      spare rebuilding   /dev/sdb1
> 
<--snip-->
>         2       8       17        -      spare   /dev/sdb1
>         3       8       49        -      faulty spare   /dev/sdd1
> 
Looks like it started doing the initial synchronisation as normal, but
hit a read error on sdd1 which caused the array to fail.  sdd1 is left
as faulty, and sdb1 is spare as the rebuild didn't complete.  I'd
recommend testing sdd1 (SMART tests, read tests, write tests) before
trying to use it any further.
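
A minimal sketch of a non-destructive read test, assuming GNU dd (for the `bs=1M` suffix). The function name `read_test` is made up for illustration. It reads the whole device once and reports only success or failure.

```shell
# Read every block of a device (or regular file) and discard the data.
# Returns dd's exit status: non-zero if dd reports an error, for example
# an unreadable sector or a path that cannot be opened.
# Read-only: nothing is written to the device.
read_test() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
}
```

For example, `read_test /dev/sdd1 || echo "read error on sdd1"`. For the SMART side, `smartctl -t long /dev/sdd` starts an extended self-test and `smartctl -l selftest /dev/sdd` shows its result once it has finished. A write test such as `badblocks -w` destroys the data on the device, so it only makes sense before the disk holds anything of value.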

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: Building new RAID5 results in removed and failed devices
From: Markus Krainz @ 2010-06-24 22:10 UTC
  To: linux-raid


> Looks like it started doing the initial synchronisation as normal, but
> hit a read error on sdd1 which caused the array to fail.  sdd1 is left
> as faulty, and sdb1 is spare as the rebuild didn't complete.  I'd
> recommend testing sdd1 (SMART tests, read tests, write tests) before
> trying to use it any further.
>
> Cheers,
>      Robin
>    

I think this is unlikely because I did a full 4-pass badblocks 
read/write test on sdb, sdc and sdd before using them.
No bad blocks were found.

smartctl -H shows PASSED on all 3 devices. However, sdd and sdc have 
Offline_Uncorrectable set to 1 while sdb does not. Is this a sign of disk 
failure?

Best regards,
Markus



~/ sudo smartctl -H /dev/sdb
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

~/ sudo smartctl -H /dev/sdc
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

~/ sudo smartctl -H /dev/sdd
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


~/ sudo smartctl -A /dev/sdd
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   243   142   021    Pre-fail  Always       -       4816
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       42
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       453
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       40
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       32
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       514
194 Temperature_Celsius     0x0022   100   087   000    Old_age   Always       -       52
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0


~/ sudo smartctl -A /dev/sdc
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   245   156   021    Pre-fail  Always       -       4725
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       39
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       456
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       38
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       23
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1286
194 Temperature_Celsius     0x0022   110   087   000    Old_age   Always       -       42
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

~/ sudo smartctl -A /dev/sdb
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   245   158   021    Pre-fail  Always       -       4716
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       40
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       455
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       38
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1215
194 Temperature_Celsius     0x0022   105   082   000    Old_age   Always       -       47
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
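
The three attribute listings can be screened with a short filter rather than read by eye. This is a sketch only: the function name `smart_sector_flags` is made up, and it assumes `smartctl -A` prints one attribute per line with RAW_VALUE as the tenth column (it must be fed smartctl's own output, not a line-wrapped copy). It looks at attributes 5, 196, 197 and 198, which are the ones usually watched for sector trouble.

```shell
# Read `smartctl -A` output on stdin and print NAME=RAW_VALUE for each
# sector-related attribute (IDs 5, 196, 197, 198) whose raw value is
# greater than zero. Prints nothing if all of them are zero.
smart_sector_flags() {
    awk '($1 == 5 || $1 == 196 || $1 == 197 || $1 == 198) && NF >= 10 && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}
```

Usage would be `sudo smartctl -A /dev/sdd | smart_sector_flags`. With the values posted above, sdd and sdc would each be reported with `Current_Pending_Sector=1` and `Offline_Uncorrectable=1`, while sdb would produce no output.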




* Re: Building new RAID5 results in removed and failed devices
From: Markus Krainz @ 2010-06-24 22:23 UTC
  To: linux-raid

On 06/24/2010 11:03 PM, Dan Williams wrote:
> This is expected without the --force option:
>
> --force
>        Insist that mdadm accept the geometry and layout specified without
>        question. Normally mdadm will not allow creation of an array with
>        only one device, and will try to create a RAID5 array with one
>        missing drive (as this makes the initial resync work faster). With
>        --force, mdadm will not try to be so clever.
>
>    

The manpage you quoted says mdadm will not create a raid5 array with 
only one device.
However in my case I have --raid-devices=2 and both /dev/sdd1 and 
/dev/sdb1. So I should be ok with a raid5 and 2 devices?
The reason I am not using raid1 is that I want to add more drives 
later on.

Best regards,
Markus




mdadm --create --verbose /dev/md1 --chunk=64 --level=5 --raid-devices=2 
/dev/sdd1 /dev/sdb1



* Re: Building new RAID5 results in removed and failed devices
From: Mikael Abrahamsson @ 2010-06-25  4:06 UTC
  To: Markus Krainz; +Cc: linux-raid

On Fri, 25 Jun 2010, Markus Krainz wrote:

> Offline_Uncorrectable set to 1 while sdb has not. Is this a sign of disk 
> failure?

Yes, it means there is one sector that returned a read error.

Check your kernel logs ("dmesg") and it's very likely you'll see media 
layer errors there.
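
One way to narrow the kernel log down is a small filter like the sketch below. The function name `disk_errors` is made up, and the list of error strings is an assumption: the exact wording of kernel messages differs between kernel versions and drivers, and some errors are logged against the ATA port (for example `ata3`) rather than the disk name, so an empty result does not prove the disk is healthy.

```shell
# Filter kernel log text on stdin: keep lines that mention the given
# disk name (e.g. "sdd") and also contain a typical error phrase.
# The exit status is grep's: non-zero when nothing matched.
disk_errors() {
    grep -F -- "$1" | grep -Ei 'I/O error|medium error|media error|unrecovered read error'
}
```

Usage would be `dmesg | disk_errors sdd`.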

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
